create NCBI database

I’d like to use an NCBI database for classification. I can find taxonomy and sequence files for SILVA and Greengenes, but I can’t find any hints on how to create such an NCBI-based database.

Any suggestions on how to do this? Should I download all bacterial sequences one by one, or are there already some tricks to do it quickly?


Hi Damian, hope all is well. There are some threads on this:

Personally, I haven’t used these methods, but from briefly looking through these archived threads (and one more recent one), it’s no different from using the 16S gene databases (of which I am most familiar with Greengenes). It looks like you will have to do some work to get it working. Ben
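For anyone following along, here is a minimal sketch of the "some work" part, assuming you already have a FASTA file of NCBI 16S sequences and some way of looking up a lineage per sequence ID. As far as I know, the Greengenes and SILVA releases used with QIIME pair a FASTA file with a tab-separated ID-to-taxonomy file, so the sketch writes rows in that layout. The accessions, the `lineages` dictionary, and both function names are hypothetical placeholders, not part of any NCBI or QIIME API.

```python
import io

def fasta_ids(handle):
    """Yield the ID (first whitespace-delimited token) of each FASTA header."""
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]

def build_taxonomy_rows(ids, lineages):
    """Return 'ID<TAB>taxonomy' rows; IDs without a known lineage get 'Unassigned'."""
    rows = []
    for seq_id in ids:
        ranks = lineages.get(seq_id)
        taxon = "; ".join(ranks) if ranks else "Unassigned"
        rows.append(f"{seq_id}\t{taxon}")
    return rows

# Toy input: made-up accessions standing in for a downloaded NCBI FASTA file.
example_fasta = io.StringIO(
    ">ACC0001.1 hypothetical record one\nACGT\n"
    ">ACC0002.1 hypothetical record two\nTTGA\n"
)
# Assumption: lineages were obtained separately, e.g. from an NCBI taxonomy lookup.
example_lineages = {"ACC0001.1": ["k__Bacteria", "p__Cyanobacteria"]}

rows = build_taxonomy_rows(fasta_ids(example_fasta), example_lineages)
print("\n".join(rows))
```

The point of the design is only that every FASTA ID must appear exactly once in the taxonomy file; how the lineages are fetched is left open here.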

Great, thanks, I’ll have a look at that!
I don’t want to use Greengenes, as it’s 6 years old now ;(

Yeah, the lab still uses Greengenes, as some of the inferred metagenome work is connected to Greengenes annotations. Ben

Did you manage to find or create the NCBI 16S FASTA file(s) and the corresponding taxonomy file?
I would also be happy to test the native (original) NCBI database, as that should be the authentic and most recent reference.
Moreover, I find contradictory items in SILVA 132 like
that I do not really know what to think about.

Hi @Peter_Kos,

The new SILVA v138 reference database recently became available. There should be a QIIME-compatible version soon, with less messy taxonomy. My recent attempt at making a cleaner taxonomy is available here:

Other details are outlined in the rest of the above thread as well as this one:



Regarding this:

“k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Stramenopiles”


We often recommend removing these sequences prior to analysis.
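QIIME 2's taxa plugin offers taxonomy-based filtering for this. Purely to illustrate the idea, here is a minimal plain-Python sketch that drops any entry whose taxonomy string contains a Chloroplast rank. The function names, the sequence IDs, and the second taxonomy entry are hypothetical and only there for the example.

```python
def is_chloroplast(taxon):
    """True if any semicolon-separated rank ends with 'chloroplast' (case-insensitive)."""
    ranks = [rank.strip().lower() for rank in taxon.split(";")]
    return any(rank.endswith("chloroplast") for rank in ranks)

def drop_chloroplast(taxonomy):
    """Return a copy of an ID -> taxonomy-string mapping without chloroplast entries."""
    return {seq_id: taxon for seq_id, taxon in taxonomy.items()
            if not is_chloroplast(taxon)}

# Toy mapping: seq1 uses the string quoted above, seq2 is a made-up non-chloroplast entry.
taxonomy = {
    "seq1": "k__Bacteria;p__ Cyanobacteria;c__Chloroplast ;o__Stramenopiles",
    "seq2": "k__Bacteria;p__Firmicutes;c__Bacilli",
}

kept = drop_chloroplast(taxonomy)
print(sorted(kept))
```

Stripping whitespace around each rank before matching means stray spaces in the taxonomy string, like those in the example above, do not hide the match.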


