Hi QIIME 2 Forum,
First, I would like to give a shoutout to you for your amazing work!
Regarding my question, I will phrase it as simply as I can. I am running a pipeline in which I compare the relative frequencies of my taxa assignments, obtained in two ways from the same feature table after denoising with Deblur: (1) computing the relative frequencies per sample and then manually averaging them over the samples that belong to the same Group, VERSUS (2) grouping the feature table by "Group" (mean-ceiling) and then reading the relative frequencies from the taxa barplot tool.
I hoped to obtain similar relative frequencies; however, the values from the two approaches differ substantially. My code is the following:
NON-GROUPED, MANUAL AVERAGE OF REL. FREQUENCIES
qiime feature-classifier classify-sklearn \
  --i-classifier trained-classifier-97-otus.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy-rep-seqs.qza
qiime taxa collapse \
  --i-table table.qza \
  --i-taxonomy taxonomy-rep-seqs.qza \
  --p-level 2 \
  --o-collapsed-table phyla-table.qza
qiime feature-table relative-frequency \
  --i-table phyla-table.qza \
  --o-relative-frequency-table rel-phyla-table.qza
qiime tools export \
  --input-path rel-phyla-table.qza \
  --output-path pcil/deblur/results/rel-table/
biom convert \
  -i feature-table.biom \
  -o phyla-table.tsv \
  --to-tsv
In Excel, I then average the relative frequencies of the samples in each group and plot the result as a bar graph.
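To make the manual step unambiguous, here is a minimal sketch of the averaging I do in Excel, written in plain Python. The sample IDs, phylum names, and values are toy placeholders (not my real data), and the function name is only illustrative.

```python
from collections import defaultdict

def mean_rel_freq_per_group(rel_table, sample_to_group):
    """Average each taxon's relative frequency over the samples of each group.

    rel_table:        {taxon: {sample_id: relative_frequency}}
    sample_to_group:  {sample_id: group_name}
    Returns           {group_name: {taxon: mean_relative_frequency}}
    """
    result = {}
    for taxon, per_sample in rel_table.items():
        sums = defaultdict(float)
        n_samples = defaultdict(int)
        for sample_id, freq in per_sample.items():
            group = sample_to_group[sample_id]
            sums[group] += freq
            n_samples[group] += 1
        for group in sums:
            # Unweighted mean: every sample counts equally, whatever its depth.
            result.setdefault(group, {})[taxon] = sums[group] / n_samples[group]
    return result

# Toy values only (hypothetical sample IDs and phyla).
rel_table = {
    "p__Firmicutes":    {"S1": 0.80, "S2": 0.20, "S3": 0.50},
    "p__Bacteroidetes": {"S1": 0.20, "S2": 0.80, "S3": 0.50},
}
sample_to_group = {"S1": "A", "S2": "A", "S3": "B"}

group_means = mean_rel_freq_per_group(rel_table, sample_to_group)
for group, taxa in sorted(group_means.items()):
    print(group, taxa)
```

Each sample contributes equally to its group's average here, regardless of how many reads it has.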
GROUPED WITH FEATURE-TABLE MEAN-CEILING AND TAXA BARPLOT
qiime feature-table group \
  --i-table table.qza \
  --p-axis sample \
  --m-metadata-file Metadata_Tablesheet_August_2020_run.txt \
  --m-metadata-column "Group" \
  --p-mode mean-ceiling \
  --o-grouped-table grouped-table.qza
qiime feature-classifier classify-sklearn \
  --i-classifier trained-classifier-97-otus.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy-rep-seqs.qza
qiime taxa barplot \
  --i-table grouped-table.qza \
  --i-taxonomy taxonomy-rep-seqs.qza \
  --m-metadata-file pcil/Metadata_Tablesheet_August_2020_run.txt \
  --o-visualization bar-plots.qzv
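For comparison, this is how I understand the grouped route, sketched in plain Python under two assumptions that may be wrong: that mean-ceiling takes the ceiling of the mean count per group, and that the barplot then normalizes each grouped column to sum to 1. The counts are toy numbers given directly at phylum level for simplicity, and the function names are only illustrative.

```python
import math
from collections import defaultdict

def group_mean_ceiling(counts, sample_to_group):
    """counts: {taxon: {sample_id: count}} -> {taxon: {group: ceil(mean count)}}"""
    grouped = {}
    for taxon, per_sample in counts.items():
        by_group = defaultdict(list)
        for sample_id, count in per_sample.items():
            by_group[sample_to_group[sample_id]].append(count)
        grouped[taxon] = {
            group: math.ceil(sum(values) / len(values))
            for group, values in by_group.items()
        }
    return grouped

def relative_frequency(table):
    """Normalize every column of {taxon: {column: value}} so it sums to 1."""
    totals = defaultdict(int)
    for per_column in table.values():
        for column, value in per_column.items():
            totals[column] += value
    return {
        taxon: {column: value / totals[column] for column, value in per_column.items()}
        for taxon, per_column in table.items()
    }

# Toy counts (hypothetical): S1 is sequenced much deeper than S2.
counts = {
    "p__Firmicutes":    {"S1": 900, "S2": 10},
    "p__Bacteroidetes": {"S1": 100, "S2": 90},
}
sample_to_group = {"S1": "A", "S2": "A"}

grouped = group_mean_ceiling(counts, sample_to_group)
grouped_rel = relative_frequency(grouped)    # group first, then normalize
per_sample_rel = relative_frequency(counts)  # normalize each sample first

print("grouped route:", grouped_rel["p__Firmicutes"]["A"])
print("per-sample values:", per_sample_rel["p__Firmicutes"])
```

With these toy numbers the grouped route gives about 0.83 for Firmicutes, whereas averaging the per-sample values (0.9 and 0.1) gives 0.50. If I understand correctly, grouping the counts first lets deeper samples weigh more, but I would appreciate confirmation that this is what happens in QIIME 2.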
Any ideas on where the two pipelines might differ? Am I doing something wrong? And which way should I trust?
PS: On a slightly different note, is it possible to filter a FeatureData[Taxonomy] artifact based on the confidence value assigned to each feature?
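To show the kind of filtering I have in mind, here is a minimal sketch in plain Python that works on an exported taxonomy table rather than on the artifact itself. It assumes the export is a tab-separated file with "Feature ID", "Taxon", and "Confidence" columns; the rows and the 0.9 threshold are made-up placeholders.

```python
import csv
import io

def filter_by_confidence(tsv_text, min_confidence):
    """Keep only rows whose Confidence value is >= min_confidence.

    Assumes a tab-separated table with a 'Confidence' column.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if float(row["Confidence"]) >= min_confidence]

# Toy taxonomy table (hypothetical feature IDs and confidence values).
taxonomy_tsv = (
    "Feature ID\tTaxon\tConfidence\n"
    "feat1\tk__Bacteria; p__Firmicutes\t0.99\n"
    "feat2\tk__Bacteria; p__Bacteroidetes\t0.65\n"
    "feat3\tk__Bacteria; p__Proteobacteria\t0.92\n"
)

kept = filter_by_confidence(taxonomy_tsv, 0.9)  # 0.9 is an arbitrary threshold
kept_ids = [row["Feature ID"] for row in kept]
print(kept_ids)
```

I would prefer a way to do this directly within QIIME 2, if one exists.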
Best,
João Lopes