Which database should I choose?

Hi, thanks for this excellent tool for amplicon data analysis. I have data from the V5-V7 region. Which database should I choose when running feature-classifier? Is it wrong to use silva_v4? I would appreciate your answer. Thanks in advance.

Welcome to the forum!
For your targeted region I would advise either training your own classifier, using your primers to extract the matching region from the reference sequences (see the RESCRIPt tutorial), or using a pretrained full-length classifier from the resources page.
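
To make the "extract with your primers" idea concrete, here is a minimal toy sketch in plain Python of what primer-based extraction does conceptually: find the forward primer and the reverse complement of the reverse primer in each reference sequence, and keep only what lies between them. Everything in it is hypothetical (the primers, the reference sequences, and the helper names), and it requires exact matches. It is not how QIIME 2 or RESCRIPt implement this; in practice you would use `qiime feature-classifier extract-reads` with your real primers and then train on the extracted reads.

```python
# Toy illustration only: exact primer matching on made-up sequences.
# The primers and references below are hypothetical placeholders,
# not real V5-V7 primers or database records.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_amplicon(reference: str, fwd_primer: str, rev_primer: str):
    """Return the region between the two primers (primers trimmed),
    or None if either primer site is not found."""
    start = reference.find(fwd_primer)
    if start == -1:
        return None
    start += len(fwd_primer)
    # The reverse primer binds the opposite strand, so on the reference
    # strand we look for its reverse complement.
    end = reference.find(reverse_complement(rev_primer), start)
    if end == -1:
        return None
    return reference[start:end]


FWD = "AACAGGA"  # hypothetical forward primer
REV = "TTGCGTC"  # hypothetical reverse primer, written 5'->3'

references = {
    "ref1": "GGGT" + "AACAGGA" + "TTAGATACCC" + "GACGCAA" + "CCTT",
    "ref2": "ACGTACGTACGT",  # no primer sites, so it is dropped
}

extracted = {}
for name, seq in references.items():
    amplicon = extract_amplicon(seq, FWD, REV)
    if amplicon is not None:
        extracted[name] = amplicon

print(extracted)
```

The point of the sketch is that a classifier trained on the extracted region only ever sees the part of the gene your reads actually cover. That is also why a classifier trained on a different region (V4, for example) is a poor fit for V5-V7 data.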



I see. Thanks for your reply.


I am following this thread because I am also seeing poor classification: too many counts are assigned to Bacteria or OD1, which does not match the literature, so maybe the wet lab changed the primer region.

Hi @MichelaRiba, sadly there have been quite a lot of changes in microbial taxonomy over the last 5 years. Curated databases can differ from one another and may follow different nomenclatural rules.

Fortunately, we've made it easier to import a few via various QIIME 2 tools and the RESCRIPt plugin. Aside from SILVA, you can make use of:

  1. RDP
  2. GTDB
  3. Greengenes2

RESCRIPt also has some tools by which you can compare these.

