Silva 138.1, which database do I download? Ref NR 99 or Ref?

Hi there!

It is my first time doing 16S metagenomic analysis on environmental samples, and I would appreciate any help deciding which database I should be using. Silva seems to be the most appropriate database for analysis of environmental samples. I am trying to download Silva 138.1 from this link: https://www.arb-silva.de/download/arb-files/. If you look at the first two options for SSU, you are given the choice of “Ref NR 99” and “Ref”. Which should I use? The website says Ref NR 99 is the recommended option, but I wanted a second opinion.

Thank you in advance for your help :slight_smile:

Hi @DannyBoi97,
Personal opinion: I think the Ref NR 99 makes perfect sense and significantly reduces redundancy without losing accuracy (probably why it is the recommended database). It is also much easier to work with: the full SILVA SSU Ref is massive, and training a classifier with it takes more memory than most personal computers can handle. I would stick with the NR 99 one.

Hi @DannyBoi97,
Just to add to @Mehrbod_Estaki’s excellent advice:

Note that if you are trying to use these files with QIIME 2 (or many other bioinformatics tools), the SILVA files need to be reformatted fairly heavily to be compatible. Fortunately, we have a QIIME 2 plugin for that (RESCRIPt), and a tutorial to go with it.

Though the SILVA 138 sequences and taxonomy (processed using this plugin) are also available on the QIIME 2 data resources page:
https://docs.qiime2.org/2020.11/data-resources/

That’s 138, though, not 138.1 yet. So if you want to use 138.1 with QIIME 2, for now we recommend following that RESCRIPt tutorial: the get-silva-data method will grab and process the 138.1 files by default.
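
In case it helps, here is a rough sketch of what that get-silva-data call could look like, written as a small Python helper that only assembles and prints the command so you can check it before running anything. The flag names (`--p-version`, `--p-target`, `--o-silva-sequences`, `--o-silva-taxonomy`) and the `SSURef_NR99` target value are as I remember them from the RESCRIPt tutorial, so please confirm them with `qiime rescript get-silva-data --help` on your install. The helper name `build_get_silva_cmd` and the output file names are made up for illustration.

```python
import shlex


def build_get_silva_cmd(
    version="138.1",                                # SILVA release to fetch
    target="SSURef_NR99",                           # the "Ref NR 99" SSU dataset
    seqs_out="silva-138.1-ssu-nr99-rna-seqs.qza",   # hypothetical output name
    tax_out="silva-138.1-ssu-nr99-tax.qza",         # hypothetical output name
):
    """Assemble the argument list for `qiime rescript get-silva-data`.

    Flag names are assumed from the RESCRIPt tutorial; verify with --help.
    Nothing is executed here, the function only returns the arguments.
    """
    return [
        "qiime", "rescript", "get-silva-data",
        "--p-version", version,
        "--p-target", target,
        "--o-silva-sequences", seqs_out,
        "--o-silva-taxonomy", tax_out,
    ]


# Print a copy-pasteable shell command (shlex.join needs Python 3.8+).
print(shlex.join(build_get_silva_cmd()))
```

Version and target are passed explicitly here, even though 138.1 is the default, so that the command still states which release you used if the default changes later.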

Thanks for the input! I appreciate it :slight_smile:
