No taxonomic classification after closed reference using GG

steffi · August 22, 2019, 3:44am

Dear All,
My aim is to perform functional analysis with PICRUSt from a feature table clustered against Greengenes (GG). So, I downloaded the latest GG version from ftp://greengenes.microbio.me/greengenes_release/gg_13_5/gg_13_8_otus.tar.gz. From the compressed file, I extracted rep-set ---> 97_otus.fasta and executed the following commands for closed-reference OTU picking.

qiime tools import --type 'FeatureData[Sequence]' --input-path 97_otus.fasta --output-path 97_otus_gg_database.qza

qiime vsearch cluster-features-closed-reference --i-table table.qza --i-sequences rep-seqs.qza --i-reference-sequences 97_otus_gg_database.qza --p-perc-identity 0.97 --o-clustered-table table-gg-cr-97.qza --o-clustered-sequences rep-seqs-gg-cr-97.qza --o-unmatched-sequences unmatched-gg-cr-97.qza

table.qza and rep-seqs.qza were generated by the DADA2 plugin.

qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza --i-reads rep-seqs-gg-cr-97.qza --o-classification taxonomy_gg_cr_97.qza

qiime taxa barplot --i-table table-gg-cr-97.qza --i-taxonomy taxonomy_gg_cr_97.qza --m-metadata-file sample_meta.tsv --o-visualization taxa-bar-plots_gg_cr_97.qzv

I used the Greengenes 13_8 99% OTUs full-length sequences classifier. I could not get the taxonomic classification. I have attached my barplot for your reference:
taxa-bar-plots_gg_cr_97.qzv (337.7 KB)

But when I ran the feature classifier on the rep-seqs output from DADA2, I got several taxa.

With this output (biom format), I also ran the picrust2 pipeline in the qiime2 environment and got the functional annotation. Can I proceed with this output? Or how can I improve the closed-reference step?
Why do the outputs differ like this? How do I take this forward to PICRUSt analysis?

jwdebelius · August 22, 2019, 8:22am

Hi @steffi,

I have a high-level question about your motivation. Greengenes is a reference database that was generated by OTU clustering, which then gets used as the standard reference. So, I guess I'm confused why you're taking the rep set and re-clustering and then re-processing? It is weird to me that your classifier isn't replicating the original sequence... what's your reference database?

You may also want to look into the PICRUSt 1 database and process if you want the functional annotation associated with Greengenes 13_5. They use that as the basis for functional prediction.

Best,
Justine

steffi · August 22, 2019, 9:39am

My motivation is to perform functional annotation using the q2-picrust plugin.
So I tried to get closed-reference OTUs using the commands above.
What is the correct workflow to proceed with q2-picrust?

jwdebelius · August 22, 2019, 12:19pm

Hi @steffi,

I think I misinterpreted your experiment (I'm sorry). For some reason, I thought you were clustering the greengenes database itself, which didn't make sense. But, there are still a handful of issues.

So, one is that if you're picking closed reference, you don't want or need to do taxonomy classification. Closed reference OTU picking gives you a taxonomic assignment: you get the assignment of the centroid you've clustered against. (Incidentally, you also get the tree for free with the closed reference picking.)
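
To illustrate what inheriting the taxonomy from the reference means, here is a minimal Python sketch. It assumes the Greengenes taxonomy file that ships in the same archive (`taxonomy/97_otu_taxonomy.txt`) is a headerless, tab-separated file of OTU ID and lineage. The IDs and lineages below are made up, and `load_taxonomy` and `lookup_taxonomy` are hypothetical helpers, not part of QIIME 2.

```python
import csv
import io

# Made-up stand-in for the reference taxonomy file (OTU ID <tab> lineage).
EXAMPLE_TAXONOMY = (
    "1001\tk__Bacteria; p__Firmicutes; c__Bacilli\n"
    "1002\tk__Bacteria; p__Proteobacteria; c__Gammaproteobacteria\n"
)


def load_taxonomy(handle):
    """Read a headerless two-column TSV into {otu_id: lineage}."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def lookup_taxonomy(feature_ids, taxonomy):
    """Map each closed-reference feature ID to its reference lineage."""
    return {fid: taxonomy.get(fid, "Unassigned") for fid in feature_ids}


taxonomy = load_taxonomy(io.StringIO(EXAMPLE_TAXONOMY))
# After closed-reference clustering, feature IDs are reference OTU IDs,
# so the lineage is a plain dictionary lookup rather than a prediction.
assignments = lookup_taxonomy(["1002", "1001"], taxonomy)
for fid, lineage in assignments.items():
    print(fid, lineage, sep="\t")
```

In QIIME 2 terms, this should correspond to importing the reference taxonomy file as a `FeatureData[Taxonomy]` artifact and passing that to `qiime taxa barplot`, rather than running `classify-sklearn` on the clustered sequences.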

If I were you and operating on the closed reference table, I'd probably just cram it through the PICRUSt 1 Galaxy server and cite as such... there are some really nice features of PICRUSt 1 that q2-PICRUSt doesn't have, including a better ability to collapse pathways and the fact that you get a copy number normalised table.
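
As a rough illustration of what copy number normalisation means: each OTU's count is divided by the estimated number of 16S rRNA gene copies in that organism's genome, so organisms carrying many copies are not over-counted. The sketch below uses made-up counts and copy numbers, and `normalise_counts` is a hypothetical helper, not PICRUSt's own code.

```python
def normalise_counts(counts, copy_numbers):
    """Divide each OTU count by that OTU's estimated 16S copy number."""
    return {otu: n / copy_numbers[otu] for otu, n in counts.items()}


counts = {"1001": 40, "1002": 30}      # made-up reads per OTU in one sample
copy_numbers = {"1001": 4, "1002": 1}  # made-up 16S copies per genome

normalised = normalise_counts(counts, copy_numbers)
print(normalised)  # {'1001': 10.0, '1002': 30.0}
```

Before normalisation OTU 1001 looks more abundant than 1002; after dividing by copy number the order flips, which is the kind of correction the normalised table provides.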

That said, I think you're better off proceeding with the ASVs, because there's a strong case to be made for specificity, external validity, etc. In that case, I'd follow the q2-PICRUSt tutorial to classify your data, keeping in mind that you have to do fragment insertion into the PICRUSt-provided tree.

Best,
Justine
