I'm new to QIIME 2 and am having some trouble at the taxonomy assignment step using a trained classifier. I would prefer to work with the latest SILVA 138.2 database. I initially used the full-length classifier from the QIIME 2 Resources page (https://resources.qiime2.org/), specifically "Silva 138 99% OTUs full-length sequences".
However, when I analyzed my reads, none of the ASVs were assigned to species names. I understand the limitations for short reads, but I was still hoping to retrieve some species-level information. Another issue I noticed was with taxonomy naming — for example, I saw older names like Firmicutes instead of Bacillota, and Proteobacteria instead of Pseudomonadota.
Then I found this discussion on the forum: SILVA V4 classifier - #9 by timanix
I followed links shared by @timanix and downloaded the SILVA 138.2 V3–V4 trained classifier. This one worked much better — correct phylum names and even species-level assignments for many of my ASVs.
How should I correctly cite your trained SILVA database?
Do you have a trained SILVA 138.2 classifier (full-length version) compatible with the latest QIIME 2 release? I don't always target the V3–V4 region.
Looking forward to hearing from @timanix & the community!
Cheers,
Just cite the database and the plugin that were used, and mention SILVA v138.2 and the targeted rRNA region.
Nope, I didn't train a full-length classifier, since I only trained classifiers for the regions I work with. I will probably use only full-length classifiers in the future, but I would like to test that first.
There are V4, V1-V2, V3-V4 and V1-V3 classifiers for q2 2024.10. If you urgently need a full-length classifier, I can train it when I have some time.
Thank you for the quick reply @timanix!
I’ll definitely stick with your trained V3–V4 classifier for now — it works well for my data.
Just wondering: do you have any insights on the limited species-level assignments and outdated taxonomy (like phylum names) when using the “official” QIIME2 Resources classifiers?
The outdated names are easy to explain: SILVA 138.1 contains outdated taxonomy names that were updated in 138.2. Otherwise, it is the same database. The QIIME 2 classifiers on the official page were trained before SILVA 138.2 was released.
The limited species-level assignments puzzle me, but they are probably related to the differences in performance between full-length and region-specific classifiers. Anyway, I usually don't draw any conclusions from species annotations obtained with 16S rRNA gene amplicons targeting only one or two regions.
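If you already have results from a 138.1-based classifier and only need the phylum names to match 138.2, a rough sketch like the one below could rename them after the fact. It assumes taxonomy strings in the usual `d__...; p__...` style with `; ` separators, and the mapping only covers the two renames mentioned in this thread. The function name `update_phylum` is hypothetical, not part of QIIME 2. Re-classifying with a 138.2 classifier is still the cleaner route.

```python
# Old-to-new phylum names mentioned in this thread (not a complete list).
PHYLUM_RENAMES = {
    "Firmicutes": "Bacillota",
    "Proteobacteria": "Pseudomonadota",
}


def update_phylum(taxon: str) -> str:
    """Rename the phylum rank in a semicolon-separated taxonomy string.

    Assumes rank labels look like 'p__Name' (an assumption about the
    export format, not something guaranteed for every classifier).
    """
    updated = []
    for rank in taxon.split(";"):
        rank = rank.strip()
        prefix, sep, name = rank.partition("__")
        # Only touch the phylum rank, and only names we know were renamed.
        if prefix == "p" and sep and name in PHYLUM_RENAMES:
            rank = "p__" + PHYLUM_RENAMES[name]
        updated.append(rank)
    return "; ".join(updated)
```

For example, `update_phylum("d__Bacteria; p__Firmicutes; c__Bacilli")` gives `"d__Bacteria; p__Bacillota; c__Bacilli"`, and strings with other phyla pass through unchanged.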
Note: SILVA only curates taxonomy down to the genus level. The species annotations are not curated, and a very large proportion are incorrect or inaccurate. So if you are using the SILVA database, I would not trust the species-level assignments, even with full-length 16S (which in any case cannot reliably distinguish some species, so it might not yield species-level classifications anyway). This is a feature of SILVA itself, not QIIME 2. See here for more details:
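If you decide to report results at genus level only, one simple option is to drop the species rank from each taxonomy string before summarizing. This is a minimal sketch, assuming the species rank is labeled with an `s__` prefix and ranks are separated by semicolons. The function name `truncate_to_genus` is made up for illustration and is not a QIIME 2 function.

```python
def truncate_to_genus(taxon: str) -> str:
    """Drop the species rank ('s__...') from a taxonomy string.

    Assumes semicolon-separated ranks with 's__' marking species
    (an assumption about the export format).
    """
    ranks = [r.strip() for r in taxon.split(";") if r.strip()]
    # Keep every rank except the species-level one.
    kept = [r for r in ranks if not r.startswith("s__")]
    return "; ".join(kept)
```

Strings that already stop at genus or higher are returned unchanged apart from normalized spacing, so it should be safe to apply to the whole taxonomy column.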