Dear All,
My aim is to perform functional analysis using picrust from the feature table generated by GG. So, I downloaded the latest GG version from ftp://greengenes.microbio.me/greengenes_release/gg_13_5/gg_13_8_otus.tar.gz. From the compressed file, I extracted rep-set ---> 97_otus.fasta and executed the following commands for closed-reference OTUs.
I used Greengenes 13_8 99% OTUs full-length sequences for the feature classifier. I could not get the taxonomic classification. I attached my barplot chart for your reference.
[taxa-bar-plots_gg_cr_97.qzv|attachment](upload://yTjjJvBiw6NPRgVfcU7AFxc6SPT.qzv) (337.7 KB)
But when I executed feature-classifier using the rep-seq from the output of dada2, I got several taxa.
With this output (biom format), I also ran the picrust2 pipeline in the qiime2 environment and got the functional annotation. Can I proceed with this output? Or how can I improve the closed-reference step?
Why do the outputs differ like this? How do I take this forward to picrust analysis?
I have a high-level question about your motivation. Greengenes is a reference database that was generated by OTU clustering, which then gets used as the standard reference. So, I guess I'm confused why you're taking the rep set and re-clustering and then re-processing? It is weird to me that your classifier isn't replicating the original sequence... What's your reference database?
You may also want to look into the PICRUSt 1 database and process if you want the functional annotation associated with Greengenes 13_5. They use that as the basis for functional prediction.
My motivation is to perform functional annotation using q2-picrust plugin.
So I tried to get a closed-reference table using the above-mentioned command.
What is the correct workflow to proceed with q2-picrust?
I think I misinterpreted your experiment (I'm sorry). For some reason, I thought you were clustering the greengenes database itself, which didn't make sense. But, there are still a handful of issues.
So, one is that if you're picking closed reference, you don't want or need to do taxonomy classification. Closed-reference OTU picking gives you a taxonomic assignment: you get the assignment of the centroid you've clustered against. (Incidentally, you also get the tree for free with closed-reference picking.)
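To make the "you get the assignment of the centroid" point concrete, here is a minimal Python sketch of the lookup. It assumes the feature IDs in your closed-reference table are Greengenes OTU IDs and that the taxonomy file is tab-separated (OTU ID, then a semicolon-delimited lineage), as in the `97_otu_taxonomy.txt` file shipped in the Greengenes release. The function names and the two example lines are hypothetical, just for illustration.

```python
import io


def load_gg_taxonomy(handle):
    """Parse a Greengenes-style taxonomy file: '<otu_id><TAB><lineage>' per line."""
    taxonomy = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        otu_id, lineage = line.split("\t", 1)
        # Split 'k__Bacteria; p__Firmicutes; ...' into a list of ranks.
        taxonomy[otu_id] = [rank.strip() for rank in lineage.split(";")]
    return taxonomy


def assign_taxonomy(feature_ids, taxonomy):
    """Look up each closed-reference feature ID; IDs not in the reference map to None."""
    return {fid: taxonomy.get(fid) for fid in feature_ids}


# Hypothetical two-line excerpt standing in for the real taxonomy file.
example = io.StringIO(
    "1001\tk__Bacteria; p__Firmicutes; c__Bacilli; o__; f__; g__; s__\n"
    "1002\tk__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__; f__; g__; s__\n"
)
taxonomy = load_gg_taxonomy(example)
assigned = assign_taxonomy(["1002", "9999"], taxonomy)
print(assigned["1002"][1])
```

With the real file you would pass an opened file handle instead of the `io.StringIO` stand-in. An ID that comes back as `None` would suggest the table was not clustered against that reference.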
If I were you and operating on the closed-reference table, I'd probably just cram it through the PICRUSt 1 Galaxy server and cite it as such... there are some really nice features of PICRUSt 1 that q2-PICRUSt doesn't have, including a better ability to collapse pathways and the fact that you get a copy-number-normalised table.
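In case "copy number normalised" is unclear, the idea is roughly this: each OTU's counts get divided by the predicted number of 16S copies in that organism's genome, so taxa with many 16S copies aren't over-counted. The sketch below only illustrates that arithmetic and is not PICRUSt's own code. The function name, OTU IDs, counts, and copy numbers are all made up.

```python
def normalize_by_copy_number(counts, copy_numbers):
    """Divide each OTU's per-sample counts by its predicted 16S copy number."""
    normalized = {}
    for otu_id, sample_counts in counts.items():
        copies = copy_numbers[otu_id]
        if copies <= 0:
            raise ValueError(f"non-positive copy number for OTU {otu_id}")
        normalized[otu_id] = {
            sample: count / copies for sample, count in sample_counts.items()
        }
    return normalized


# Hypothetical closed-reference table (OTU ID -> sample -> read count).
counts = {
    "1001": {"S1": 40, "S2": 10},
    "1002": {"S1": 6, "S2": 9},
}
# Hypothetical predicted 16S copy numbers per OTU.
copy_numbers = {"1001": 4, "1002": 3}

normalized = normalize_by_copy_number(counts, copy_numbers)
print(normalized)
```

Here OTU 1001 looks far more abundant than 1002 in S1 by raw reads (40 vs 6), but after dividing by copy number the gap shrinks (10 vs 2).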
I think you're better off proceeding with the ASVs because there's something strong to be said for specificity, external validity, etc. In that case, I'd follow the q2-PICRUSt tutorial to classify your data, keeping in mind that you have to do fragment insertion into the PICRUSt provided tree.