Trouble Training a Classifier Using Greengenes

Hello all,

I am trying to train my classifier.
I downloaded the gg_13_5_otus.tar.gz and gg_13_taxonomy.txt.gz files from the Greengenes website, greengenes.secondgenome.com.

I extracted both files, as shown in the following picture.
image

I am running the following command:

Can someone please help?

Thank you

Hi @Parul_Baranwal,

Check the contents of the gg_13_5_otus taxonomy directory. My guess is that it contains several subdirectories, including rep_set (or something similar). You will need to use one of the files in there as your sequences.
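
If it helps to see what is inside the extracted directory, here is a minimal Python sketch that lists every FASTA-like file under it. The directory name `gg_13_5_otus`, the helper name `find_fasta_files`, and the set of file extensions are assumptions for illustration; adjust them to match what you actually extracted.

```python
from pathlib import Path

# File extensions treated as FASTA here; this set is an assumption.
FASTA_SUFFIXES = {".fasta", ".fa", ".fna"}


def find_fasta_files(root):
    """Return FASTA-like files under root, sorted, as paths relative to root."""
    root = Path(root)
    return sorted(
        p.relative_to(root)
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in FASTA_SUFFIXES
    )


if __name__ == "__main__":
    # "gg_13_5_otus" is assumed to be the extracted directory in the current folder.
    for path in find_fasta_files("gg_13_5_otus"):
        print(path)
```

Any file it prints from a rep_set-style subdirectory is a candidate for the reference sequences.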

Best,
Justine

Thank you so much!
It worked!

I have a follow-up question:

I am using my paired-end sequences and following the Moving Pictures tutorial.

I performed the demultiplexing step and then denoising with DADA2. At this step I got rep-seqs.qza, which looks like the following in QIIME 2 View:

image

I am not getting any rep-seqs.qza file when importing the dataset from Greengenes.
I just want to make sure I am on the right track.

Any suggestions, please?

Hi @Parul_Baranwal,

“Rep set” is shorthand for “representative subset”, basically the sequence that represents an ASV/OTU. But I recognize that it can make the nomenclature confusing!

The “rep set” from Greengenes is the sequence that represents that database sequence. The “rep set” from DADA2 is the representative sequence for your ASV. You’ll use the Greengenes sequences to train your classifier and then you’ll apply it to your ASV rep seqs.
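
As a rough sketch of that order of operations, the Python snippet below only builds and prints the QIIME 2 commands; it does not run them. The subcommands and flags follow the QIIME 2 feature-classifier tutorial as I remember it, so please confirm them with `--help` for your release. The input paths, the output file names (`gg-ref-seqs.qza`, `gg-ref-taxonomy.qza`, `gg-classifier.qza`, `taxonomy.qza`), and the helper name `build_commands` are assumptions for illustration, not the exact files from this thread.

```python
import shlex


def build_commands(ref_fasta, ref_taxonomy, asv_rep_seqs="rep-seqs.qza"):
    """Return the four QIIME 2 commands, each as a list of arguments."""
    # 1. Import the Greengenes reference sequences (the database "rep set").
    import_seqs = [
        "qiime", "tools", "import",
        "--type", "FeatureData[Sequence]",
        "--input-path", ref_fasta,
        "--output-path", "gg-ref-seqs.qza",
    ]
    # 2. Import the matching taxonomy; assumed to be a TSV file without a header.
    import_tax = [
        "qiime", "tools", "import",
        "--type", "FeatureData[Taxonomy]",
        "--input-format", "HeaderlessTSVTaxonomyFormat",
        "--input-path", ref_taxonomy,
        "--output-path", "gg-ref-taxonomy.qza",
    ]
    # 3. Train the classifier on the reference data only.
    train = [
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", "gg-ref-seqs.qza",
        "--i-reference-taxonomy", "gg-ref-taxonomy.qza",
        "--o-classifier", "gg-classifier.qza",
    ]
    # 4. Apply the trained classifier to the DADA2 rep seqs (your ASVs).
    classify = [
        "qiime", "feature-classifier", "classify-sklearn",
        "--i-classifier", "gg-classifier.qza",
        "--i-reads", asv_rep_seqs,
        "--o-classification", "taxonomy.qza",
    ]
    return [import_seqs, import_tax, train, classify]


if __name__ == "__main__":
    # Both paths below are assumed examples; use the files you extracted.
    commands = build_commands(
        "gg_13_5_otus/rep_set/97_otus.fasta",
        "gg_13_5_otus/taxonomy/97_otu_taxonomy.txt",
    )
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Note that rep-seqs.qza from DADA2 only appears in the last step, which is why importing Greengenes does not produce a file with that name.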

Best,
Justine
