Hey,
I'd like to use the NCBI database for classification. I can find taxonomy and sequence files for SILVA and Greengenes, but there are no hints on how to create such an NCBI-based database.
Any suggestions on how to do this? Should I download all bacterial sequences one by one, or is there already a faster way to do it?
Hi Damian, hope all is well. There are some threads on this:
Personally, I haven't used these methods, but from briefly looking through these archived threads (and one more recent one), it's no different from using the 16S gene databases (of which I am most familiar with Greengenes). It looks like you will have to do some work to get it working. Ben
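To give a rough idea of what that work might look like: the 16S reference databases are usually distributed as a FASTA file plus a two-column, tab-separated taxonomy file that maps each sequence ID to a semicolon-delimited lineage. The sketch below builds such rows for an arbitrary FASTA file. It is only an illustration: the function names, the `"; "` separator, and the example IDs and lineages are made up, and the lineage lookup is assumed to have been prepared separately (for example from NCBI taxonomy data).

```python
import io


def fasta_ids(handle):
    """Yield the sequence ID (first whitespace-delimited token) of each FASTA header."""
    for line in handle:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                yield tokens[0]


def build_taxonomy_rows(fasta_handle, lineages):
    """Return ('id<TAB>lineage' rows, IDs without a lineage).

    `lineages` is a hypothetical dict mapping sequence ID -> list of rank names.
    """
    rows, missing = [], []
    for seq_id in fasta_ids(fasta_handle):
        ranks = lineages.get(seq_id)
        if ranks is None:
            # Keep track of sequences that would be left without a taxonomy entry.
            missing.append(seq_id)
        else:
            rows.append(seq_id + "\t" + "; ".join(ranks))
    return rows, missing


# Made-up example data, not real NCBI records.
example_fasta = io.StringIO(">seq1 some description\nACGT\n>seq2\nGGCC\n")
example_lineages = {"seq1": ["Bacteria", "Proteobacteria", "Gammaproteobacteria"]}

rows, missing = build_taxonomy_rows(example_fasta, example_lineages)
for row in rows:
    print(row)
print("no lineage found for:", missing)
```

Reporting the IDs without a lineage matters because every sequence in the FASTA file needs a matching row in the taxonomy file before the pair is useful as a reference.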
Hi,
did you manage to find or create the NCBI 16S FASTA file(s) and a corresponding taxonomy file?
I would also be happy to test the native (original) NCBI database, as that should be the authentic and most recent reference.
Moreover, I find contradictory entries in SILVA 132, such as
"k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Stramenopiles"
"D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Betaproteobacteriales"
and I do not really know what to make of them.
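One small thing that may help when inspecting lineages like these: the two strings quoted above use different rank-prefix conventions (`k__`, `p__`, ... versus `D_0__`, `D_1__`, ...). The sketch below strips either style of prefix so lineages from different files can be compared rank by rank. It is a minimal illustration, not part of any QIIME or SILVA tooling; the function name and the regular expression are my own assumptions, and it says nothing about whether a given placement is biologically correct.

```python
import re

# Assumed prefix styles: a single lowercase letter ("k__") or "D_<number>__".
PREFIX = re.compile(r"^(?:[a-z]|D_\d+)__")


def split_lineage(lineage):
    """Split a semicolon-delimited lineage into bare taxon names, dropping rank prefixes."""
    names = []
    for field in lineage.split(";"):
        field = field.strip()
        if field:
            names.append(PREFIX.sub("", field))
    return names


first = split_lineage("k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Stramenopiles")
second = split_lineage(
    "D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Betaproteobacteriales"
)

# Print the two lineages side by side, one rank per line.
for depth, (left, right) in enumerate(zip(first, second)):
    print(depth, left, right, sep="\t")
```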
The new SILVA v138 reference database recently became available. There should be a QIIME-compatible version soon, with a less messy taxonomy. My recent attempt at making a cleaner taxonomy is available here:
Other details are outlined in the rest of the above thread, as well as in this one: