create NCBI database

Hey,
I’d like to use the NCBI database for classification. I can find taxonomy and sequence files for SILVA and Greengenes, but there are no hints on how to create an equivalent NCBI-based database.

Any suggestions on how to do this? Should I download all bacterial sequences one by one, or is there already a faster way?

Thanks


Hi Damian, hope all is well. There are some threads on this:

Personally, I haven't used these methods, but from a brief look through these archived threads (and one more recent one), it doesn't seem to be different from using the 16S gene databases (of which I am most familiar with Greengenes). It looks like you will have to do some work to get it working. Ben


Great, thanks, I’ll have a look at that!
I don’t want to use Greengenes as it’s 6 years old now ;(


Yeah, the lab still uses Greengenes, as some of our inferred metagenome work is tied to Greengenes annotations. Ben

Hi,
Did you manage to find or create the NCBI 16S FASTA file(s) and a corresponding taxonomy file?
I would also be happy to test the native (original) NCBI database, as that should be the authentic and most recent reference.
Moreover, I find contradictory entries in SILVA 132, such as
“k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Stramenopiles”
“D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Betaproteobacteriales”
that I do not really know what to make of.

Hi @Peter_Kos,

The new SILVA v138 reference database recently became available. There should be a QIIME-compatible version soon, with less messy taxonomy. My recent attempt at making a cleaner taxonomy is available here:

Other details are outlined in the rest of the above thread, as well as in this one:

-Mike


Regarding this:

“k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Stramenopiles”

See:

We often recommend removing these sequences prior to analysis.
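
As a rough illustration of what removing these sequences can look like, here is a minimal plain-Python sketch that drops chloroplast lineages from a list of tab-separated `feature_id<TAB>lineage` lines. It is not the QIIME workflow itself; the function names and the feature IDs `seq1` and `seq2` are made up for the example, and it assumes rank prefixes like `c__` or `D_2__` in front of each name.

```python
def is_chloroplast(lineage: str) -> bool:
    """Return True if any rank in a semicolon-delimited lineage is 'Chloroplast'."""
    ranks = [rank.strip() for rank in lineage.split(";")]
    # Drop rank prefixes such as "c__" or "D_2__" (assumed formats) before comparing.
    names = [rank.split("__", 1)[-1].strip().lower() for rank in ranks]
    return "chloroplast" in names


def drop_chloroplasts(taxonomy_lines):
    """Keep only 'feature_id<TAB>lineage' lines whose lineage is not chloroplast."""
    kept = []
    for line in taxonomy_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        _, _, lineage = line.partition("\t")
        if not is_chloroplast(lineage):
            kept.append(line)
    return kept


# Hypothetical input: feature IDs are invented for illustration.
example = [
    "seq1\tk__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Stramenopiles",
    "seq2\tk__Bacteria;p__Proteobacteria;c__Gammaproteobacteria",
]
filtered = drop_chloroplasts(example)
```

The IDs that survive in the taxonomy could then be used to subset the matching FASTA or feature table, so that sequences and taxonomy stay in sync.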

-Mike
