Taxonomic classification with the Greengenes database

When using QIIME 2 for 16S sequencing analysis, do the V1V2 and V3V4 taxonomic annotations use the same database? The database I downloaded myself can annotate the V3V4 region, but not V1V2. Is the database I downloaded wrong? Is there a link to download a database that can annotate both regions?

Welcome to the forum @zhangyun!

Please note: I translated the title and text of your topic into English, as this is the common language of the forum moderators, and will make your question (and solutions) more searchable by others on the forum.

Yes, V1V2 and V3V4 sequences can be classified using the same database, as long as it covers both regions, e.g., a full-length 16S database.

It sounds like you are using the wrong database, e.g., you are using a V3V4-specific database. You can only classify sequences if the target gene/region is encompassed in your reference database. So a full-length 16S rRNA gene database works for V1V2 as well as V3V4, but a V3V4 classifier will not work for V1V2.
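To make the point about region coverage concrete, here is a toy sketch in Python. It is only an illustration of the idea, not how QIIME 2 implements anything: the sequences and the short "primers" are made up, and `extract_region` is a hypothetical helper that returns the stretch between a forward primer and the reverse complement of a reverse primer.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Turn a (possibly degenerate) primer into a regex fragment."""
    return "".join(IUPAC[base] for base in primer)


def reverse_complement(seq):
    """Reverse-complement a sequence, including IUPAC ambiguity codes."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_region(reference, f_primer, r_primer):
    """Return the segment between the two primer sites, or None if absent."""
    pattern = (primer_regex(f_primer) + "(.*?)"
               + primer_regex(reverse_complement(r_primer)))
    match = re.search(pattern, reference)
    return match.group(1) if match else None


# Made-up toy primers and reference; real 16S primers are much longer.
F1, R1 = "AGAGTT", "AGGAGT"   # stands in for a V1V2 primer pair
F2, R2 = "CCTACG", "TAATCC"   # stands in for a V3V4 primer pair
full_length = ("TT" + "AGAGTT" + "GGGAAACCC" + "ACTCCT" + "CACACA"
               + "CCTACG" + "TTTGGGCCCAAA" + "GGATTA" + "TT")

# A "V3V4-only" reference is what remains after trimming to that region.
v3v4_only = extract_region(full_length, F2, R2)

print(extract_region(full_length, F1, R1))  # region found in full-length reference
print(extract_region(v3v4_only, F1, R1))    # None: region not in trimmed reference
```

The full-length reference contains both regions, so either primer pair finds its target. Once the reference has been trimmed to the second region, the first region is simply no longer there to match against.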

The QIIME 2 team has provided a variety of pre-trained classifiers for full-length 16S rRNA genes (which you could use for your data), as well as for V4 (which you should not use for your data). These are listed on the "Data resources" page of the QIIME 2 2020.8.0 documentation.

You can also train your own classifiers for V1V2 or V3V4, as described in this tutorial:
https://docs.qiime2.org/2020.8/tutorials/feature-classifier/

Those region-specific classifiers will most likely be more accurate than a full-length classifier. You can use the Greengenes or SILVA reference sequences from the "data resources" page as input to the "extract-reads" action, which trims the references to the region amplified by your primers.
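As a rough sketch of the two training steps from that tutorial, the snippet below assembles the `qiime feature-classifier extract-reads` and `fit-classifier-naive-bayes` commands in Python, so you can inspect them before running anything. The file names are hypothetical placeholders, and the primers shown are the commonly used 341F/805R V3V4 pair, included only as an assumption; substitute the primers actually used in your sequencing run.

```python
import shlex


def training_commands(ref_seqs, ref_taxonomy, f_primer, r_primer, prefix):
    """Build the two CLI calls used to train a region-specific classifier."""
    trimmed = f"{prefix}-ref-seqs.qza"
    # Step 1: trim the reference sequences to the amplified region.
    extract = [
        "qiime", "feature-classifier", "extract-reads",
        "--i-sequences", ref_seqs,
        "--p-f-primer", f_primer,
        "--p-r-primer", r_primer,
        "--o-reads", trimmed,
    ]
    # Step 2: train a naive Bayes classifier on the trimmed references.
    fit = [
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", trimmed,
        "--i-reference-taxonomy", ref_taxonomy,
        "--o-classifier", f"{prefix}-classifier.qza",
    ]
    return [extract, fit]


# Hypothetical input artifacts; primers are an assumption (341F / 805R).
commands = training_commands(
    ref_seqs="ref-otus.qza",
    ref_taxonomy="ref-taxonomy.qza",
    f_primer="CCTACGGGNGGCWGCAG",
    r_primer="GACTACHVGGGTATCTAATCC",
    prefix="v3v4",
)

for cmd in commands:
    print(shlex.join(cmd))
```

To actually run the steps inside an activated QIIME 2 environment, each list could be passed to `subprocess.run(cmd, check=True)`. For V1V2 you would repeat the same two steps with your V1V2 primers and a different prefix.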

Also note: many of the tutorials from the QIIME 2 website have been translated into Chinese and are available here on the forum:

For example:

Good luck!

Thank you very much. Your reply helped me a lot!
