Relative frequency of bacteria (%) : download CSV table

Dear all,

I performed a taxonomic analysis of gut bacteria against HITdb and obtained bar plots showing the relative frequency of bacteria (%) for each sample.
I then downloaded the corresponding CSV table summarizing the results, but the proportions of the different bacteria in each sample are not given in %. For example, for sample 1 I found 17858 for Bacteroidetes. I think this number corresponds to the OTU count. How can I get a table that directly gives the proportions in %, as they appear in the bar plots? When I used QIIME 1, this table was produced directly along with the bar plots.
Thank you

Hi!
In the barplot.qzv file, all abundances are stored as actual counts, and the conversion to percentages is performed “under the hood”.
To obtain a relative table, you can either convert your table.qza to a relative-frequency table, export it as a biom file and convert that to a .tsv file, or simply recalculate your .csv file.
If you are familiar with Python and have read your .csv table into pandas as df, the following line will do it:
df = df.div(df.sum(axis=1), axis=0)*100

Basically, for each sample you take the sum of all feature frequencies, then divide each feature's frequency in that sample by this sum and multiply by 100 to get a percentage.
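To make the calculation above concrete, here is a minimal pandas sketch. It assumes samples are rows and taxa are columns (for example after something like pd.read_csv("level-2.csv", index_col=0), where the file name is hypothetical). The helper name to_percent and the counts are made up for illustration, apart from the 17858 quoted earlier in the thread. Note that the CSV from a barplot may also carry sample metadata columns: the sketch drops non-numeric columns, but any numeric metadata column would need to be removed by hand first.

```python
import pandas as pd


def to_percent(df: pd.DataFrame) -> pd.DataFrame:
    """Convert per-sample counts to percentages (each row sums to 100)."""
    # Keep only numeric columns; text metadata columns are dropped here.
    counts = df.select_dtypes(include="number")
    # Total count per sample (one value per row).
    row_totals = counts.sum(axis=1)
    # Divide every value by its own row total, then scale to percent.
    return counts.div(row_totals, axis=0) * 100


# Illustrative data only: two samples, two phyla.
counts = pd.DataFrame(
    {"Bacteroidetes": [17858, 500], "Firmicutes": [2142, 1500]},
    index=["sample1", "sample2"],
)
percent = to_percent(counts)
print(percent.round(2))
```

Each sample is normalized by its own total, so the percentages are per sample, not per group.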


Hi @timanix

Thanks for the response. When I performed the taxonomic analysis of gut bacteria against HITdb, I obtained taxonomy.qza, taxonomy.qzv and barplots.qzv as outputs. How can I get a table.qza file, given that the QIIME 2 command lines I followed do not generate one?

Best

  1. If you want to convert table.qza to a relative table, you can use the same table.qza file you used to generate barplot.qzv.
  2. If you want to recalculate a .csv file, you should extract it directly from barplot.qzv.
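For option 2, one possible way to get at the CSV without the browser is to treat the .qzv as what it is, a zip archive. The sketch below only uses Python's standard zipfile module. The function names are hypothetical, and the layout of the archive (per-level CSV files somewhere under a data/ folder) is an assumption you should check by listing the contents first.

```python
import zipfile


def list_csv_members(qzv_path):
    """Return the paths of all .csv files stored inside a .qzv archive."""
    with zipfile.ZipFile(qzv_path) as archive:
        return [name for name in archive.namelist() if name.endswith(".csv")]


def extract_csv(qzv_path, member, out_dir="."):
    """Extract one archive member to out_dir and return the extracted path."""
    with zipfile.ZipFile(qzv_path) as archive:
        return archive.extract(member, path=out_dir)
```

Calling list_csv_members("barplot.qzv") shows which CSV files are available, and extract_csv can then pull out the one for the taxonomic level you need.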

Hello @timanix
Thank you for the response. So you recommend using Python to calculate, from the CSV file I downloaded, the relative abundance of bacteria for each sample in %. The CSV file shows, for example, 17858 for Bacteroidetes in sample 1. For each line (that is, each sample), you recommend calculating the sum of the values as they appear and applying this formula:
df = df.div(df.sum(axis=1), axis=0)*100
I want to know whether Python will calculate the relative abundance in % for each sample based on that sample's sum. For example, for healthy versus affected, should I calculate the sum for each group for each bacterium and then apply this formula? And if I use Excel instead of Python, what should I do?
Thanks

Hi! Yes, you can use Excel.

  1. Make a new copy of the table with the actual frequencies.
  2. Add a new “Sum” column to the original table that sums all feature frequencies for each sample.
  3. In the new copy of the table, create a formula that takes the actual frequency of Feature1 in Sample1 from the original table, divides it by the value in the “Sum” column for Sample1 and multiplies it by 100. Then apply this formula to all samples and features.

Or take the table.qza you used to create the barplot, convert it to a relative table, export a biom file from the relative table and convert it to a .tsv file. The result should be the same.


Hi @M_F,

If you wish to stay within the QIIME 2 environment, you could use the following plugin to create the relative abundance table: https://docs.qiime2.org/2020.6/plugins/available/feature-table/relative-frequency/

If you want an Excel-like table, export the result in biom format and convert it into the final table.
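If you prefer Python over the command line, the same plugin can also be reached through the QIIME 2 Artifact API. The following is only a sketch under assumptions: it needs to run inside an activated QIIME 2 environment, the function name relative_frequency_percent is hypothetical, and table.qza stands for whatever feature table you used for the barplot. The plugin returns fractions between 0 and 1, so the sketch multiplies by 100 to get percentages.

```python
def relative_frequency_percent(table_path):
    """Load a feature table artifact and return per-sample percentages."""
    # Imported inside the function so it can be defined even where
    # QIIME 2 is not installed; calling it still requires QIIME 2.
    import pandas as pd
    from qiime2 import Artifact
    from qiime2.plugins import feature_table

    table = Artifact.load(table_path)
    result = feature_table.methods.relative_frequency(table=table)
    # View the relative-frequency artifact as a pandas DataFrame.
    fractions = result.relative_frequency_table.view(pd.DataFrame)
    return fractions * 100
```

For example, relative_frequency_percent("table.qza").to_csv("relative.tsv", sep="\t") would write a tab-separated table that opens in Excel. Keep in mind that table.qza holds individual features, so it would have to be collapsed to a taxonomic level first to match what the barplot shows.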

Cheers
