Which database should I choose?

Hi, thanks for this excellent tool for amplicon data analysis. I have data from the V5-V7 region. Which database should I choose when running feature-classifier? Is it wrong to use silva_v4? I appreciate your answer. Thanks in advance.

Welcome to the forum!
For your target region, I would advise either training your own classifier, using your primers to extract the matching region from the reference sequences (see the RESCRIPt tutorial), or using a pretrained full-length classifier from the resources page.
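
To illustrate what "extracting reference sequences with your primers" means, here is a minimal Python sketch of the idea. It is not the QIIME 2 implementation: in QIIME 2 this step is done by `qiime feature-classifier extract-reads`, which, unlike this sketch, does not require exact primer matches. The function names, the toy reference sequence, and the short primers below are all hypothetical and chosen only for illustration. Substitute the primers your lab actually used.

```python
# Simplified illustration of in-silico primer trimming.
# Assumptions: exact (degeneracy-aware) matching only, forward orientation only.

# IUPAC degenerate nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence, including degenerate codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_primer(primer, seq, start=0):
    """Return the index of the first primer match at or after `start`, or -1."""
    primer, seq = primer.upper(), seq.upper()
    for i in range(start, len(seq) - len(primer) + 1):
        if all(seq[i + j] in IUPAC[p] for j, p in enumerate(primer)):
            return i
    return -1


def extract_amplicon(seq, f_primer, r_primer):
    """Return the region between the two primers (primers excluded), or None."""
    f_pos = find_primer(f_primer, seq)
    if f_pos == -1:
        return None
    inner_start = f_pos + len(f_primer)
    # The reverse primer binds the opposite strand, so search for its
    # reverse complement downstream of the forward primer.
    r_pos = find_primer(reverse_complement(r_primer), seq, inner_start)
    if r_pos == -1:
        return None
    return seq[inner_start:r_pos]


# Toy data (hypothetical): flank + forward site + target + reverse site + flank.
reference = "TTT" + "ACAGG" + "GATTACA" + "GGCAA" + "CCC"
amplicon = extract_amplicon(reference, f_primer="ACMGG", r_primer="TTKCC")
print(amplicon)
```

A classifier trained on reference sequences trimmed this way only sees the region your reads cover, which is why the primers used for extraction should match the ones used for sequencing.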

Best,

I see. Thanks for your reply.

Hi,

I am following this thread because I am also getting poor classification: too many counts are assigned to Bacteria or OD1, which does not match the literature, so maybe the wet lab changed the region targeted by the primers.

Hi @MichelaRiba, sadly there have been quite a lot of changes in microbial taxonomy over the last 5 years. Curated databases vary and can use different nomenclatural rules.

Fortunately, we've made it easier to import a few via various QIIME 2 tools and the RESCRIPt plugin. Aside from SILVA, you can make use of:

  1. RDP
  2. GTDB
  3. Greengenes2

RESCRIPt also has some tools by which you can compare these.

-Cheers!
-Mike
