I'm new to bacterial communities and have been working with a 16S V3-V4 dataset classified with SILVA 138.1, which I downloaded using the RESCRIPt tutorial.
I've been noticing that some of my taxonomy assignments have the same name for multiple taxonomic levels. For example: o__Candidatus_Peribacteria, f__Candidatus_Peribacteria, fs__Candidatus_Peribacteria, g__Candidatus_Peribacteria. Another example is BD7-11 designated as class through genus.
When I look these up on the SILVA website, it looks like the name should go at the first designation, followed by Incertae Sedis. Is it normal for QIIME to fill all the ranks with the same name when this is the case? How would you report these taxa? Would it be appropriate to say genus "Candidatus Peribacteria", for example?
Depending on the version of the QIIME 2 formatted SILVA database you are using... Other than enabling --p-rank-propagation by default, and possibly --p-include-species-labels, we do not edit the SILVA taxonomy. Please see the "Species-labels: caveat emptor!" menu in the tutorial.
So the Incertae sedis rank label will appear at whatever rank the SILVA curators decide to place it. You can use RESCRIPt's edit-taxonomy to modify the taxonomy as you see fit.
This is the default as outlined in the SILVA tutorial, under the expandable menu labeled "Rank-Propagation". You can disable this if you decide to construct the database yourself. More on this later...
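To make the effect of rank propagation concrete, here is a minimal Python sketch of the idea: any rank without a name inherits the nearest named rank above it. This is only an illustration, not RESCRIPt's actual implementation; the function names `propagate_ranks` and `first_rank_with_name` and the dictionary input format are hypothetical, and the rank prefixes are taken from the example in the question.

```python
def propagate_ranks(lineage, ranks=("o", "f", "fs", "g")):
    """Fill unnamed ranks with the nearest named rank above them.

    `lineage` is a hypothetical {rank_prefix: name} dict; ranks that are
    missing or empty are treated as unnamed.
    """
    filled = []
    last_name = None
    for rank in ranks:
        name = lineage.get(rank) or last_name
        if name is None:
            continue  # no named rank seen yet, so nothing to propagate
        filled.append(f"{rank}__{name}")
        last_name = name
    return "; ".join(filled)


def first_rank_with_name(taxonomy, name):
    """Return the prefix of the highest rank carrying `name`, or None."""
    for label in taxonomy.split("; "):
        prefix, _, value = label.partition("__")
        if value == name:
            return prefix
    return None


# Only the order is named; the lower ranks are unnamed.
propagated = propagate_ranks({"o": "Candidatus_Peribacteria"})
print(propagated)
print(first_rank_with_name(propagated, "Candidatus_Peribacteria"))
```

With only the order named, the sketch reproduces the repeated labels from the question, and the second helper recovers the rank at which the name first appears, which may be the more honest rank to cite when reporting.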
Hard to say, as Candidatus has a different meaning than Incertae sedis. The former usually means that the taxonomic designation is somewhat supported but details about the biology of the organism are lacking (currently uncultivable, etc.), whereas the latter literally means that the broader phylogenetic and taxonomic placement is currently uncertain. But I leave it to the proper taxonomists in the forum to provide more insight here.
Also, SILVA released version 138.2, which can be downloaded with the current GitHub version of RESCRIPt. They've updated to the most recent taxonomy, e.g., Firmicutes are now Bacillota. You can install it as follows:
conda activate qiime2-amplicon-2024.5
pip install git+https://github.com/bokulich-lab/RESCRIPt.git
qiime dev refresh-cache
Then you can use the SILVA tutorial as a guide for how you'd like to construct your SILVA reference database. Keep in mind that the tutorial simply provides examples of what you can do; feel free to modify it as you see fit.