taxa relative abundance percentage across samples in a taxa barplot

Hi there!

In my analyses I am looking at four different datasets of the cucumber microbiome using 16S rRNA sequences. I have merged all the feature tables and managed to view the taxonomy barplot with all the studies. However, I want to look into the details. The first thing I am looking for is the similarities among the datasets. Firstly, I want to know the families they all have in common. I can easily see the most abundant ones they share just by looking at the taxa plot, but I want to look deeper and get the average relative abundance (%) across samples in each study for these families. Is there a way I can do that?

For example, in the image above you can see that Rhizobiaceae is present at a different percentage in each sample within this one study. Is there a feature I can use to obtain the average across all these samples so I can compare it with the other studies?



I have another question under the same topic :grin:
Within my 4 datasets, I have a suspicion that 2 of them are more similar to each other than to the other 2. Is there a way to make that comparison that you guys know of?



Hi, @lilycrook Lily

I think that grouping your table by the “study” column in your metadata can help you get a barplot with sample averages (average relative abundances) by study.
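To make the idea of "average relative abundance by study" concrete, here is a minimal plain-Python sketch. It is not QIIME 2 code: the sample IDs, study names, and read counts are made up for illustration, and the function name `mean_relative_abundance` is hypothetical.

```python
from collections import defaultdict

# Hypothetical family-level read counts: sample ID -> {family: count}.
counts = {
    "s1": {"Rhizobiaceae": 30, "Pseudomonadaceae": 70},
    "s2": {"Rhizobiaceae": 10, "Pseudomonadaceae": 90},
    "s3": {"Rhizobiaceae": 50, "Pseudomonadaceae": 50},
}
# Hypothetical metadata: sample ID -> study.
study_of = {"s1": "studyA", "s2": "studyA", "s3": "studyB"}


def mean_relative_abundance(counts, study_of):
    """Return {study: {family: mean relative abundance in percent}}."""
    sums = defaultdict(lambda: defaultdict(float))
    n_samples = defaultdict(int)
    for sample, family_counts in counts.items():
        total = sum(family_counts.values())
        if total == 0:
            continue  # skip empty samples to avoid dividing by zero
        study = study_of[sample]
        n_samples[study] += 1
        for family, count in family_counts.items():
            # Convert to percent within the sample before averaging.
            sums[study][family] += 100.0 * count / total
    # Dividing by the number of samples treats a missing family as 0 %.
    return {
        study: {family: s / n_samples[study] for family, s in families.items()}
        for study, families in sums.items()
    }


averages = mean_relative_abundance(counts, study_of)
print(averages["studyA"]["Rhizobiaceae"])  # 20.0 (mean of 30 % and 10 %)
```

Note that averaging per-sample percentages is not the same as averaging raw counts and then converting to percent, unless all samples have the same sequencing depth.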

Maybe the PCoA plots from the core diversity metrics or the Emperor biplot visualizations from the diversity plugin are what you are looking for. They are great at demonstrating similarities/dissimilarities between communities. You can also perform a beta-group-significance test.
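For intuition about what "more similar" means here, below is a rough plain-Python sketch that computes Bray-Curtis dissimilarity between per-study average profiles and reports the closest pair. This is a simplification under stated assumptions: the four profiles are invented numbers, the study names are placeholders, and the real PCoA and group-significance tests work on per-sample distances rather than on study averages.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles (dicts)."""
    taxa = set(u) | set(v)
    numerator = sum(abs(u.get(t, 0.0) - v.get(t, 0.0)) for t in taxa)
    denominator = sum(u.get(t, 0.0) + v.get(t, 0.0) for t in taxa)
    return numerator / denominator if denominator else 0.0


# Hypothetical mean relative abundances (percent) per study.
profiles = {
    "study1": {"Rhizobiaceae": 20.0, "Pseudomonadaceae": 80.0},
    "study2": {"Rhizobiaceae": 25.0, "Pseudomonadaceae": 75.0},
    "study3": {"Rhizobiaceae": 60.0, "Pseudomonadaceae": 40.0},
    "study4": {"Rhizobiaceae": 70.0, "Pseudomonadaceae": 30.0},
}

# Dissimilarity for every pair of studies (0 = identical, 1 = no overlap).
distances = {
    (a, b): bray_curtis(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

closest = min(distances, key=distances.get)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
print("closest pair:", closest)
```

A small dissimilarity between two studies and larger values against the others would be consistent with the suspicion that two datasets cluster together, but a proper test should still be run on the per-sample distance matrix.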


Thank you Timur for your reply.

I am looking into grouping, and I am interested in the 'sample.collection' column of my metadata, but I am stuck. This is the output I am getting:

I am not sure what it means. Could you please help me identify the mistake I have made here?

sample-metadata.tsv (18.6 KB) farm-collection-metadata.tsv.txt (2.1 KB)

I have attached my metadata files just in case it helps you understand what is happening.

Hi, Lily!
Sorry, I forgot to mention it before, but to obtain a grouped taxa barplot you also need to collapse your metadata file. Just create a new metadata file in which the “sample-id” column contains the values from your “sample-collection” column instead of the sample IDs. Of course, this new metadata file will be shorter, since each value in its “sample-id” column must be unique. The other columns are not very important at this step, since you need this new metadata only to create the barplot without hitting the error.
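If editing the file by hand is tedious, the collapsing step can be sketched in plain Python roughly as follows. This is only an illustration under assumptions: the column name `sample-collection`, the farm values, and the helper name `collapse_metadata` are placeholders, so adjust them to match your own file.

```python
import csv
import io


def collapse_metadata(tsv_text, group_column):
    """Build a minimal metadata table with one row per unique group value."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    id_column = reader.fieldnames[0]
    unique_values = []
    for row in reader:
        if row[id_column].startswith("#"):
            continue  # skip comment/directive rows such as "#q2:types"
        value = row[group_column]
        if value and value not in unique_values:
            unique_values.append(value)

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # The group values become the new IDs; keep one column alongside them.
    writer.writerow(["sample-id", group_column])
    for value in unique_values:
        writer.writerow([value, value])
    return out.getvalue()


# Hypothetical original metadata (tab-separated).
original = (
    "sample-id\tsample-collection\n"
    "#q2:types\tcategorical\n"
    "S1\tfarm1\n"
    "S2\tfarm1\n"
    "S3\tfarm2\n"
)

collapsed = collapse_metadata(original, "sample-collection")
print(collapsed)
```

For a real file, you would read the text with `open(path).read()` and write the returned string to a new `.tsv` file.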


Thank you for that Timur!

I stumbled upon another issue:

I don't know why I am getting this message. Would you know?


Hi! I think you are getting this error because none of the taxa contain the string you provided. As a first step, check the spelling of this string.

One thing I noticed is that in the command-line example I am following there is a “k_bacteria” instead of “d_Bacteria”. I guess k stands for kingdom, but I don't know what d stands for…

The syntax differs between databases. It is better to open your assigned taxonomy artifact and check how it is written there.

I did...

Double-check the spelling. It looks like you need a double underscore: you are using “_” in the command, while in the taxonomy file it is written as “__”. Or did you use two underscores? It is hard to tell from the screenshot.
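A quick way to see why one underscore fails is to test the string against a few taxonomy labels. The sketch below uses made-up labels and a simple case-sensitive substring check, which is an assumption for illustration and not necessarily how the plugin matches internally; `count_matches` is a hypothetical helper.

```python
# Hypothetical taxonomy strings in the "d__" style, as they might appear
# in an assigned taxonomy file.
taxonomy = [
    "d__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; "
    "o__Rhizobiales; f__Rhizobiaceae",
    "d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; "
    "o__Pseudomonadales; f__Pseudomonadaceae",
]


def count_matches(taxonomy, query):
    """Count taxonomy strings that contain `query` as a substring."""
    return sum(query in label for label in taxonomy)


print(count_matches(taxonomy, "d_Bacteria"))   # 0: single underscore never matches
print(count_matches(taxonomy, "d__Bacteria"))  # 2: double underscore matches both
```

Copying the prefix straight from the taxonomy file, rather than retyping it, avoids this kind of mismatch.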


Thank you heaps, the double underscore worked!
