RESCRIPt forward filling

Hi @owlpen ,
Thanks for the kind words and for using RESCRIPt and QIIME 2!

Ultimately, the problem is with biom add-metadata and is related to how taxonomic classification works; it is not coming from RESCRIPt. I can explain the taxonomic annotations and where these empty ranks come from, but if you want to troubleshoot that error, I encourage you to open a new, separate topic with the full error message from biom add-metadata so that someone more familiar with biom-format can help.

Let's look at the taxonomy:

These missing ranks are due to incomplete classification of your query sequences: they do not have a single good hit in the database. More specifically, they match multiple families of Enterobacterales well enough that the sequence cannot be confidently classified to one family.

These gaps should not appear in the reference database (at least when you are using RESCRIPt with rank propagation). When a reference sequence has an incomplete taxonomic annotation you would instead see an annotation like this:

d__Bacteria;k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__;g__;s__

You also ran this with and without rank propagation, so clearly the issue did not originate with RESCRIPt (but nice troubleshooting! thanks for that :grin: )
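
In case it helps to see what "forward filling" (rank propagation) means in practice, here is a minimal Python sketch of the idea: each empty rank is filled with the name of the nearest named rank above it. This is only an illustration, not RESCRIPt's actual code; `forward_fill` is a hypothetical helper I made up for this post, and it assumes ranks are separated by `;` and prefixed like `f__`.

```python
def forward_fill(lineage: str, sep: str = ";") -> str:
    """Fill empty ranks with the last non-empty name seen above them.

    Hypothetical helper for illustration only (not part of RESCRIPt).
    Assumes each rank looks like '<prefix>__<name>', e.g. 'f__' when empty.
    """
    filled = []
    last_name = ""
    for rank in lineage.split(sep):
        prefix, _, name = rank.strip().partition("__")
        if name:
            # Remember the most recent named rank.
            last_name = name
        # Empty ranks inherit the last named rank; named ranks are unchanged.
        filled.append(f"{prefix}__{last_name}")
    return sep.join(filled)


example = (
    "d__Bacteria;k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;"
    "o__Enterobacterales;f__;g__;s__"
)
print(forward_fill(example))
```

With the example annotation above, the empty f__, g__ and s__ ranks would each be filled with Enterobacterales, while the named ranks stay as they are.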

It sounds like you don't even want to run biom add-metadata anyway, as you are not trying to get a biom table:

There is an easier way! You can merge these using metadata tabulate, then download the result as a CSV. See some instructions here:

Good luck! :boar:
