Finding the greengenes database

doudou2047 · April 6, 2020, 4:23pm

Hi, actually, where is the gg V3-V4? I have no idea where to find it.
Thank you for sharing.

doudou2047 · April 6, 2020, 4:24pm

Hi, Thank you for sharing.
Where can I find the file "99_otu_taxonomy.txt"?
Thank you,
Best,
Qiong

jwdebelius · April 6, 2020, 4:40pm

Hi @doudou2047,

The Data Resources page has links to both the Greengenes and SILVA databases.

Best,
Justine

doudou2047 · April 6, 2020, 4:58pm

Thank you so much, Justine.
I assume the comments in this topic point us to the trained classifier for V3-V4.
The links listed there are for the full-length and V4-only classifiers, right?
Actually, I am curious about two things:

Could we directly use the full-length gg or SILVA sequences without extracting the reference reads?
I am having trouble finding the "taxonomy.txt" file on the Data Resources page; could you please point it out?
Thank you,
Best,
Qiong

doudou2047 · April 6, 2020, 5:43pm

Dear Justine,
I am sorry.
I unzipped "gg_13_8_otus.tar.gz"; all the FASTA and taxonomy files are inside.
Sorry for bothering you.
Thank you,
Best,
Qiong
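
For anyone else looking for the same file, here is a minimal Python sketch for locating it inside the archive without unpacking everything. It assumes "gg_13_8_otus.tar.gz" has already been downloaded to the current directory; the helper names `find_members` and `head` are made up for illustration, and the search is by file name only, so it makes no assumption about the folder layout inside the archive.

```python
import os
import tarfile


def find_members(archive_path, filename):
    """Return the paths inside the tar archive whose base name equals `filename`."""
    with tarfile.open(archive_path, "r:*") as tar:
        return [
            member.name
            for member in tar.getmembers()
            if member.isfile() and os.path.basename(member.name) == filename
        ]


def head(archive_path, member_name, n=3):
    """Return the first `n` lines of one archive member as text, without extracting it."""
    with tarfile.open(archive_path, "r:*") as tar:
        handle = tar.extractfile(member_name)
        lines = []
        for i, raw in enumerate(handle):
            if i >= n:
                break
            lines.append(raw.decode("utf-8").rstrip("\n"))
        return lines


# Assumption: the archive sits in the current working directory.
archive = "gg_13_8_otus.tar.gz"
if os.path.exists(archive):
    for path in find_members(archive, "99_otu_taxonomy.txt"):
        print(path)
        print(head(archive, path))
```

The same search works for the FASTA files by passing a different file name to `find_members`.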

Mehrbod_Estaki · April 7, 2020, 1:06am

Hi @doudou2047,
The V3-V4 classifier I was referring to in the previous posts here was made using an older version of QIIME 2/scikit-bio. If you are using QIIME 2 2020.2, you'll have to train your own V3-V4 classifier, unless someone has shared one recently that I am unaware of. A quick search through the "Community Contributions" section should answer that.
As for your question about using full-length sequences without extracting reads: yes, you can certainly use the full-length sequences. However, previous benchmarks showed a slight increase in accuracy, and an obvious reduction in processing time, when the classifier is trained on the specific region.
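
To make the two options concrete, here is a rough Python sketch that only assembles the QIIME 2 command lines for training a classifier; it does not run them. The subcommands and option names are written from memory of the "Training feature classifiers" tutorial, so please check them against `--help` for your release. The function name, the output file names, and the primer strings are placeholders: supply the primers from your own V3-V4 protocol. Leaving the primers out skips the `extract-reads` step and trains on the full-length sequences.

```python
import shlex


def classifier_commands(ref_fasta, ref_taxonomy, f_primer=None, r_primer=None, prefix="gg"):
    """Build (not run) the QIIME 2 commands for training a naive Bayes classifier.

    If both primers are given, the reference reads are trimmed to the amplified
    region first; otherwise the classifier is trained on full-length sequences.
    """
    seqs = f"{prefix}-ref-seqs.qza"
    taxonomy = f"{prefix}-ref-taxonomy.qza"
    commands = [
        # Import the reference FASTA file as a QIIME 2 artifact.
        ["qiime", "tools", "import",
         "--type", "FeatureData[Sequence]",
         "--input-path", ref_fasta,
         "--output-path", seqs],
        # Import the taxonomy table (assumed to be tab-separated without a header).
        ["qiime", "tools", "import",
         "--type", "FeatureData[Taxonomy]",
         "--input-format", "HeaderlessTSVTaxonomyFormat",
         "--input-path", ref_taxonomy,
         "--output-path", taxonomy],
    ]
    training_reads = seqs
    if f_primer and r_primer:
        # Region-specific option: keep only the stretch between the two primers.
        training_reads = f"{prefix}-ref-seqs-region.qza"
        commands.append(
            ["qiime", "feature-classifier", "extract-reads",
             "--i-sequences", seqs,
             "--p-f-primer", f_primer,
             "--p-r-primer", r_primer,
             "--o-reads", training_reads])
    commands.append(
        ["qiime", "feature-classifier", "fit-classifier-naive-bayes",
         "--i-reference-reads", training_reads,
         "--i-reference-taxonomy", taxonomy,
         "--o-classifier", f"{prefix}-classifier.qza"])
    return commands


# Placeholder primers: replace with the ones used in your own sequencing protocol.
for cmd in classifier_commands("99_otus.fasta", "99_otu_taxonomy.txt",
                               f_primer="FORWARD_PRIMER", r_primer="REVERSE_PRIMER"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first makes it easy to review them before pasting them into a shell where QIIME 2 is installed.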

system · May 8, 2020, 10:54am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.