TAXA barplot using relative frequency (not absolute frequency)


I want to obtain a taxa barplot using relative abundance data. I see that the taxa barplot plugin takes a FeatureTable[Frequency], but I wonder whether it could take a FeatureTable[RelativeFrequency] instead.
The problem with FeatureTable[Frequency] is that taxa abundances are not normalized by the total number of reads in each sample. For instance:

S. aureus: sample 1 = 1000; sample 2 = 100; sample 3 = 2000; sample 4 = 160
E. coli: sample 1 = 100; sample 2 = 10; sample 3 = 500; sample 4 = 500

If I group samples 1 and 2 as ‘healthy’ and samples 3 and 4 as ‘sick’ and then run taxa barplot on those two groups, samples 1 and 3 carry more weight than samples 2 and 4 because they have more total reads. The group mean is computed as the sum of each taxon's read counts across the samples in the group divided by the number of samples, rather than as the mean of each taxon's relative frequency in each sample.
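To make the difference concrete, here is a minimal sketch in plain Python (not QIIME 2 code) that applies both ways of averaging to the counts above. The function names are made up for illustration, and it assumes these two taxa are the only features in each sample.

```python
# Counts copied from the example above: sample -> {taxon: read count}.
# Assumption: these two taxa are the only features in each sample.
counts = {
    "sample1": {"S. aureus": 1000, "E. coli": 100},
    "sample2": {"S. aureus": 100, "E. coli": 10},
    "sample3": {"S. aureus": 2000, "E. coli": 500},
    "sample4": {"S. aureus": 160, "E. coli": 500},
}
groups = {"healthy": ["sample1", "sample2"], "sick": ["sample3", "sample4"]}
TAXA = ["S. aureus", "E. coli"]


def relative(profile):
    """Convert a {taxon: value} profile to proportions that sum to 1."""
    total = sum(profile.values())
    return {taxon: value / total for taxon, value in profile.items()}


def mean_of_counts(samples):
    """Average the raw counts per taxon first, then convert to proportions."""
    mean_counts = {
        taxon: sum(counts[s][taxon] for s in samples) / len(samples)
        for taxon in TAXA
    }
    return relative(mean_counts)


def mean_of_relative(samples):
    """Convert each sample to proportions first, then average per taxon."""
    profiles = [relative(counts[s]) for s in samples]
    return {
        taxon: sum(p[taxon] for p in profiles) / len(profiles)
        for taxon in TAXA
    }


for group, samples in groups.items():
    by_counts = mean_of_counts(samples)
    by_relative = mean_of_relative(samples)
    print(group)
    for taxon in TAXA:
        print(f"  {taxon}: mean of counts = {by_counts[taxon]:.3f}, "
              f"mean of relative frequencies = {by_relative[taxon]:.3f}")
```

For the ‘healthy’ group both approaches give about 0.909 for S. aureus, because samples 1 and 2 have identical proportions. For the ‘sick’ group, averaging counts gives about 0.684 for S. aureus, while averaging relative frequencies gives about 0.521, because sample 3 contributes far more reads than sample 4.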
I followed these steps:

qiime feature-table group --i-table UPSTREAM/filter-finished-data/table-only-bacteria-rarefied.qza --p-axis sample --m-metadata-file metadata.tsv --m-metadata-column "Group" --p-mode mean-ceiling --o-grouped-table UPSTREAM/Group-grouped-table.qza
qiime taxa barplot --i-table UPSTREAM/Group-grouped-table.qza --i-taxonomy UPSTREAM/taxonomy/taxonomy-silva.qza --m-metadata-file metadata.tsv --o-visualization diversity/beta-div-taxa-bars/taxa-bar-plots-Group.qzv
qiime tools view diversity/beta-div-taxa-bars/taxa-bar-plots-Group.qzv

Sorry if I did not explain myself properly. Thanks!

Hi @KirKara,
The title of your topic is a bit misleading: the issue is not barplot (which always converts absolute counts to relative frequencies), but rather that feature-table group cannot operate on a FeatureTable[RelativeFrequency]. I believe this is on our radar, but I do not have an ETA for when that operation might be allowed.

In the meantime, how about this: use qiime feature-table rarefy to evenly subsample reads from all of your samples, then group them. While some of the rarer ASVs may drop out during subsampling, they would just be “noise” on the unrarefied barplots anyway (because they would be in the legend but too minuscule to see).
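As a rough illustration of why an even sampling depth helps, here is a minimal plain-Python sketch (not QIIME 2 code). The counts are hypothetical, invented to stand in for two samples rarefied to 600 reads each, and the function names are made up for this example. When every sample has the same total, the proportions of the mean counts match the mean of the per-sample proportions exactly.

```python
from fractions import Fraction

# Hypothetical counts for two samples after rarefying to 600 reads each.
# These numbers are invented for illustration; they are not real output.
rarefied = {
    "sample3": {"S. aureus": 480, "E. coli": 120},
    "sample4": {"S. aureus": 145, "E. coli": 455},
}
TAXA = ["S. aureus", "E. coli"]


def proportions_of_mean_counts(table):
    """Average counts per taxon across samples, then convert to proportions."""
    n = len(table)
    means = {t: Fraction(sum(s[t] for s in table.values()), n) for t in TAXA}
    total = sum(means.values())
    return {t: means[t] / total for t in TAXA}


def mean_of_proportions(table):
    """Convert each sample to proportions, then average per taxon."""
    n = len(table)
    return {
        t: sum(Fraction(s[t], sum(s.values())) for s in table.values()) / n
        for t in TAXA
    }


# Exact fractions are used so the two results can be compared with ==.
print(proportions_of_mean_counts(rarefied) == mean_of_proportions(rarefied))
for taxon, value in mean_of_proportions(rarefied).items():
    print(taxon, float(value))
```

This sketch ignores any rounding applied when the group means are stored as counts, such as the rounding up that a mode named mean-ceiling implies, so small differences may remain in practice.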

Now that I mention it, it looks like that is what you are already doing! Is that not yielding adequate results?

