No taxonomic classification after closed reference using GG

Dear All,
My aim is to perform functional analysis with PICRUSt on a feature table generated against GG. So I downloaded the latest GG version. From the compressed file, I extracted the rep set (97_otus.fasta) and ran the following commands for closed-reference OTU clustering.

qiime tools import --type 'FeatureData[Sequence]' --input-path 97_otus.fasta --output-path 97_otus_gg_database.qza

qiime vsearch cluster-features-closed-reference --i-table table.qza --i-sequences rep-seqs.qza --i-reference-sequences 97_otus_gg_database.qza --p-perc-identity 0.97 --o-clustered-table table-gg-cr-97.qza --o-clustered-sequences rep-seqs-gg-cr-97.qza --o-unmatched-sequences unmatched-gg-cr-97.qza

table.qza and rep-seqs.qza were generated by the dada2 plugin.

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza --i-reads rep-seqs-gg-cr-97.qza --o-classification taxonomy_gg_cr_97.qza

qiime taxa barplot --i-table table-gg-cr-97.qza --i-taxonomy taxonomy_gg_cr_97.qza --m-metadata-file sample_meta.tsv --o-visualization taxa-bar-plots_gg_cr_97.qzv

I used the Greengenes 13_8 99% OTUs full-length sequences for the feature classifier. I could not get the taxonomic classification. I have attached my barplot for your reference: taxa-bar-plots_gg_cr_97.qzv (337.7 KB).

But when I ran the feature classifier on the rep-seqs output by dada2, I got several taxa.

With this output (BIOM format), I also ran the picrust2 pipeline in the qiime2 environment and got the functional annotation. Can I proceed with this output? Or how can I improve the closed-reference step?
Why do the outputs differ like this? How do I take this forward to the PICRUSt analysis?

Hi @steffi,

I have a high-level question about your motivation. Greengenes is a reference database that was generated by OTU clustering, which then gets used as the standard reference. So, I guess I’m confused why you’re taking the rep set and re-clustering and then re-processing it. It is weird to me that your classifier isn’t replicating the original sequence… what’s your reference database?

You may also want to look into the PICRUSt 1 database and process if you want the functional annotation associated with Greengenes 13_5. They use that as the basis for functional prediction.


My motivation is to perform functional annotation using the q2-picrust plugin.
So I tried to get closed-reference OTUs using the commands above.
What is the correct workflow to proceed with q2-picrust?

Hi @steffi,

I think I misinterpreted your experiment (I’m sorry). For some reason, I thought you were clustering the greengenes database itself, which didn’t make sense. But, there are still a handful of issues.

So, one is that if you’re picking closed reference, you don’t want or need to do taxonomy classification. Closed reference OTU picking gives you a taxonomic assignment: you get the assignment of the centroid you’ve clustered against. (Incidentally, you also get the tree for free with the closed reference picking.)
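
To make that concrete, here is a rough sketch of why no classifier is needed: after closed-reference clustering the feature IDs are the reference OTU IDs, so the taxonomy is just a lookup in the reference's own taxonomy file. The first part runs on made-up toy data (the IDs, lineage pairings, and `*_demo` file names are invented for illustration). The second part shows what I believe the equivalent QIIME 2 import would look like, assuming the Greengenes archive provides a headerless `97_otu_taxonomy.txt` alongside `97_otus.fasta`; the output names there are hypothetical, and it only runs if `qiime` and the input files are present.

```shell
# Toy stand-in for the Greengenes taxonomy file: reference OTU ID <TAB> lineage.
# The IDs and their lineage pairings are invented for illustration only.
printf '%s\t%s\n' \
  1000001 'k__Bacteria; p__Firmicutes; c__Bacilli' \
  1000002 'k__Bacteria; p__Bacteroidetes; c__Bacteroidia' \
  1000003 'k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria' \
  > gg_taxonomy_demo.tsv

# Toy list of feature IDs as they would appear in a closed-reference table.
printf '%s\n' 1000002 1000003 > otu_ids_demo.txt

# Read the taxonomy into a lookup keyed by OTU ID (first file),
# then annotate each feature ID from the table (second file).
awk -F'\t' '
  NR == FNR { tax[$1] = $2; next }
  { print $1 "\t" (($1 in tax) ? tax[$1] : "Unassigned") }
' gg_taxonomy_demo.tsv otu_ids_demo.txt > otu_taxonomy_demo.tsv

cat otu_taxonomy_demo.tsv

# Same idea inside QIIME 2 (assumed file names; skipped when inputs are missing):
# import the reference taxonomy and hand it straight to the barplot.
if command -v qiime >/dev/null 2>&1 \
   && [ -f 97_otu_taxonomy.txt ] && [ -f table-gg-cr-97.qza ]; then
  qiime tools import \
    --type 'FeatureData[Taxonomy]' \
    --input-format HeaderlessTSVTaxonomyFormat \
    --input-path 97_otu_taxonomy.txt \
    --output-path 97_otu_taxonomy.qza

  qiime taxa barplot \
    --i-table table-gg-cr-97.qza \
    --i-taxonomy 97_otu_taxonomy.qza \
    --m-metadata-file sample_meta.tsv \
    --o-visualization taxa-bar-plots_gg_ref_tax.qzv
fi
```

The taxonomy file should match the identity level of the reference you clustered against (97% here), since the lookup only works when the IDs come from the same OTU set.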

If I were you and operating on the closed reference table, I’d probably just cram it through the PICRUSt 1 galaxy server and cite as such… there are some really nice features of PICRUSt 1 that q2-PICRUSt doesn’t have, including a better ability to collapse pathways and the fact that you get a copy number normalised table.

I think you’re better off proceeding with the ASVs because there’s something strong to be said for specificity, external validity, etc. In that case, I’d follow the q2-PICRUSt tutorial to classify your data, keeping in mind that you have to do fragment insertion into the PICRUSt provided tree.

