Taxonomy diversity

Hello,
Several taxonomy databases are available, including SILVA, RDP, Greengenes, and NCBI. Is there a comprehensive review or article that compares them? Specifically, I would like to understand how they differ and when to use one over the others.
If someone has a table or a concise review summarizing these aspects, I would appreciate it if they could share it with me.

Hi @Will_Ericksen ,
We directly compared these databases in this article:

and here is another earlier article that compared these more qualitatively:

The RDP database is no longer maintained, so I would not consider that one for comparison any more.

Good luck!

@Nicholas_Bokulich Sorry for this stupid question, but what do you mean by 'no longer maintained'? Do you mean that if I use the latest version of QIIME 2, I can no longer use the RDP database for taxonomy classification? Or can I still use it, but you no longer perform updates on the database?

The developers are no longer making new releases, and their website is down (at least right now, and the past few times I tried).

No, you can absolutely use the RDP database with QIIME 2 if you can find a copy of the database (see below).

Half correct. You can still use it, but the RDP developers will no longer make any new releases (and have not for a few years). The significance is that the taxonomy may go out of date, and newer species and sequences will be missing.

The QIIME 2 developers have nothing to do with RDP, so it's not the QIIME 2 developers who are no longer updating RDP.

See this post for some related info:

We do, however, have a tutorial for importing and using the RDP database with QIIME 2 if you want to use this database (I think the data are still available in a few places on the internet even if their website is currently down):

Thank you very much.
Based on your experience, which database is most used by the community: SILVA, Greengenes, or NCBI?

Hi @Will_Ericksen ,
It is tough to tell. On this forum it seems like SILVA is most commonly mentioned, but there could be selection bias. All 3 of these are commonly used (and of note, Greengenes has been replaced by Greengenes2, for which there is a QIIME 2 plugin). There is also the GTDB database, which has shaken things up and is popular with many! These all have strengths and weaknesses, and it really depends on your application, so there is not one clearly superior database; see the papers above for a more thorough discussion.

You can download and format any of those databases using RESCRIPt, with the exception of Greengenes2 (as it has its own plugin for this).
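As a rough illustration of the RESCRIPt route mentioned above, here is a minimal Python sketch that assembles a `qiime rescript get-silva-data` command line. The SILVA version (`138.1`), the target (`SSURef_NR99`), the output file names, and the helper names `build_get_silva_command` and `run_if_available` are all assumptions chosen for the example, not recommendations from this thread; check `qiime rescript get-silva-data --help` in your own environment for the options your installed release supports.

```python
import shutil
import subprocess


def build_get_silva_command(version="138.1", target="SSURef_NR99", prefix="silva"):
    """Build (but do not run) a `qiime rescript get-silva-data` command line.

    The default version, target, and output names are illustrative assumptions.
    """
    return [
        "qiime", "rescript", "get-silva-data",
        "--p-version", version,
        "--p-target", target,
        # Hypothetical output file names; choose whatever suits your project.
        "--o-silva-sequences", f"{prefix}-{version}-rna-seqs.qza",
        "--o-silva-taxonomy", f"{prefix}-{version}-tax.qza",
    ]


def run_if_available(cmd):
    """Run the command only when the `qiime` executable is found on PATH."""
    if shutil.which(cmd[0]) is None:
        print("qiime not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)
```

Calling `run_if_available(build_get_silva_command())` inside an activated QIIME 2 environment with RESCRIPt installed would attempt the download; elsewhere it only prints the command it would have run.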

thank you again !!!!

@Nicholas_Bokulich, based on the information from the previous article, the latest releases mentioned for SILVA, RDP, Greengenes, and NCBI are as follows: 29/09/2016 (SILVA), 30/09/2016 (RDP), May 2013 (Greengenes), 5th Oct 2016 (NCBI).

Could you please provide the current up-to-date releases of these databases, including the version and date? Additionally, where can I find this information specifically?

Hi @Will_Ericksen ,

Indeed, some of these databases have been updated since 2016, so the paper that you are quoting (from 2017) is not up to date.

These databases all have websites, so I suggest checking these websites directly to see the latest release information.
