feature-table group mean-ceiling vs. individual values

Hi Qiime2 Forum,

First, I would like to give a shoutout to you for your amazing work!

Regarding my question, I will phrase it as simply as I can. I am running a pipeline where I compare two sets of relative frequencies. The first comes from the taxa assignment performed on a feature table after denoising with deblur, where I manually average the relative frequencies of the samples that belong to the same Group.
The second comes from the taxa assignment performed on a feature table grouped by "Group" (mean-ceiling) after denoising with deblur, as shown by the taxa barplot tool.

I hoped to obtain similar relative frequencies, however the values are very discrepant between the two different ways… My code is the following:

NON-GROUPED, MANUAL AVERAGE OF REL. FREQUENCIES
qiime feature-classifier classify-sklearn \
  --i-classifier trained-classifier-97-otus.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy-rep-seqs.qza

qiime taxa collapse \
  --i-table table.qza \
  --i-taxonomy taxonomy-rep-seqs.qza \
  --p-level 2 \
  --o-collapsed-table phyla-table.qza

qiime feature-table relative-frequency \
  --i-table phyla-table.qza \
  --o-relative-frequency-table rel-phyla-table.qza

qiime tools export \
  --input-path rel-phyla-table.qza \
  --output-path pcil/deblur/results/rel-table/

biom convert \
  -i feature-table.biom \
  -o phyla-table.tsv \
  --to-tsv

In the Excel table, I take the average of each group and plot it as a bar graph.
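The same per-group averaging could also be scripted instead of done in a spreadsheet. This is only a minimal sketch: it assumes the usual `biom convert --to-tsv` layout (a comment line followed by a `#OTU ID` header row with one column per sample), and the table text, sample names, and sample-to-Group mapping below are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for the contents of phyla-table.tsv.
tsv_text = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\tS3\n"
    "k__Bacteria;p__Firmicutes\t0.9\t0.1\t0.4\n"
    "k__Bacteria;p__Bacteroidetes\t0.1\t0.9\t0.6\n"
)
# Hypothetical mapping taken from the metadata "Group" column.
sample_to_group = {"S1": "A", "S2": "A", "S3": "B"}


def group_means(text, sample_to_group):
    """Average each taxon's relative frequency over the samples of each group."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    # Locate the header row; anything before it is treated as a comment.
    header_idx = next(i for i, r in enumerate(rows) if r and r[0] == "#OTU ID")
    samples = rows[header_idx][1:]
    means = {}
    for row in rows[header_idx + 1:]:
        if not row:
            continue
        by_group = defaultdict(list)
        for sample, value in zip(samples, row[1:]):
            by_group[sample_to_group[sample]].append(float(value))
        means[row[0]] = {g: sum(v) / len(v) for g, v in by_group.items()}
    return means


means = group_means(tsv_text, sample_to_group)
for taxon, per_group in means.items():
    print(taxon, per_group)
```

Each sample contributes equally to its group average here, regardless of how many reads it had.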

GROUPED WITH FEATURE-TABLE MEAN-CEILING AND TAXA BARPLOT
qiime feature-table group \
  --i-table table.qza \
  --p-axis sample \
  --m-metadata-file Metadata_Tablesheet_August_2020_run.txt \
  --m-metadata-column Group \
  --p-mode mean-ceiling \
  --o-grouped-table grouped-table.qza

qiime feature-classifier classify-sklearn \
  --i-classifier trained-classifier-97-otus.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy-rep-seqs.qza

qiime taxa barplot \
  --i-table grouped-table.qza \
  --i-taxonomy taxonomy-rep-seqs.qza \
  --m-metadata-file pcil/Metadata_Tablesheet_August_2020_run.txt \
  --o-visualization bar-plots.qzv

Any ideas on where the two pipelines might differ? Am I doing something wrong? And which result should I trust?

PS: On a slightly different note, is it possible to filter a FeatureData[Taxonomy] based on the confidence value attributed to each feature?

Best,
João Lopes

Hi!
I encountered the same issue a couple of months ago.
In the first case, you make a relative-frequency table first and then group it manually.
In the second case, you group a table of raw frequencies and then convert them to relative abundances when creating the barplot.
I am not exactly sure why it is like this. I checked my code several times and so did others and everything looked consistent. So I decided to keep my own barplot.
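
One possible contributor, offered as a sketch rather than a diagnosis: averaging per-sample relative frequencies is not the same calculation as taking relative frequencies of averaged counts, because the second route weights each sample by its sequencing depth. The counts below are made up for illustration, and `math.ceil` is only assumed to mirror the rounding that mean-ceiling applies.

```python
import math

# Hypothetical phylum-level counts for two samples in the same Group.
# The depths differ on purpose (100 vs. 10,000 reads).
counts = {
    "S1": {"Firmicutes": 90, "Bacteroidetes": 10},
    "S2": {"Firmicutes": 1000, "Bacteroidetes": 9000},
}
taxa = ["Firmicutes", "Bacteroidetes"]


def relative(sample):
    """Convert one sample's counts to relative frequencies."""
    total = sum(sample.values())
    return {t: sample[t] / total for t in taxa}


# Route 1: relative frequency per sample, then average across samples.
per_sample = [relative(s) for s in counts.values()]
mean_of_rel = {t: sum(r[t] for r in per_sample) / len(per_sample) for t in taxa}

# Route 2: average the raw counts (rounded up), then take relative frequency.
mean_counts = {
    t: math.ceil(sum(s[t] for s in counts.values()) / len(counts)) for t in taxa
}
rel_of_mean = relative(mean_counts)

for t in taxa:
    print(f"{t}: route 1 = {mean_of_rel[t]:.3f}, route 2 = {rel_of_mean[t]:.3f}")
```

With these made-up numbers, Firmicutes comes out at 0.500 by the first route and about 0.108 by the second, because the deeper sample dominates the averaged counts. When all samples in a group have the same depth, the two routes agree apart from the rounding.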

Thanks for your input, Timur.

The only problem for me is that when I compare the relative frequencies from the two approaches, I get very different results =(

In my case, although I saw differences, all patterns and ratios were preserved, and the differences weren't statistically significant either.
Maybe you should double-check your custom pipeline.