General questions

marko · February 27, 2019, 2:00am

Hi Guys,
I have recently started using QIIME2. Totally self taught through the tutorials and online vids and seem to be getting there slowly.
This data set is from mice receiving different drug treatments (i.e. grouped by treatment), followed by 16S analysis of the faecal microbiome.
I have a few (probably very simple) questions I haven't quite been able to work out from searching; my apologies if I should have found the answers.

When looking at the alpha and beta analysis outputs (output via the core-metrics command), under the box plots there is usually a p-value for 'all groups' followed by a heap of p-values for 'pairwise'. I understand the value for 'pairwise' is simply the comparison between one group and another; however, if I have, for example, 10 groups, what exactly is the 'all groups' p-value telling me?
When looking at beta-diversity boxplots such as the UniFrac distance, why is the first boxplot the group being compared to? E.g., in the attached graph, why is 37 graphed, and why isn't it '1' if it is being compared to itself?
unweighted-unifrac-body-site-significanceday3.qzv (431.9 KB)
Finally, in my taxa summary output after using gneiss-balance-taxonomy, am I right to pick the bacterial genera listed in the numerator and denominator bar graphs under 'balance taxonomy' as the bacteria most likely to be changing between treatment groups, and therefore as specific bacteria that may be worth investigating individually using pairwise analysis?

Thank you in advance. I have spent a few days trying to get my head around these aspects and apologise if they have been answered before.
mark

thermokarst · February 27, 2019, 3:05pm

Welcome! :qiime2:

Care to share? We haven't made any (that I know of), so I suspect these are community-built resources.

No worries, that is what we are here for! In the future though, it helps us, and other users, when you try and keep each topic post to a single question (so here, this topic post would actually be three topic posts). This helps improve search-ability, as well as makes it a bit easier for us to triage and reply to when it is more digestible. Thanks!

This is the H statistic and p-value for the Kruskal-Wallis (KW) test across all 10 groups (each group is treated as a KW "sample", so each QIIME 2 sample is then a "measurement"). KW tests the null hypothesis that the population medians of all of the groups are equal, so a small p-value suggests that at least one metadata group's median differs from the others. It won't tell you which groups are responsible, but that is where the pairwise tests come in.
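To make the 'all groups' versus 'pairwise' distinction concrete, here is a minimal sketch in plain Python. The group names and Shannon values are made up for illustration, and the p-values use closed-form chi-square tails that only hold for 1 or 2 degrees of freedom (two or three groups). It is a sketch of the test itself, not of QIIME 2's implementation; in practice a library routine such as `scipy.stats.kruskal` would be the usual choice.

```python
import math
from collections import Counter
from itertools import combinations


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of lists of measurements."""
    pooled = sorted(v for values in groups for v in values)
    n = len(pooled)
    # Rank the pooled values 1..n; tied values share their average rank.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    # Sum over groups of (rank sum squared / group size).
    rank_term = sum(
        sum(rank_of[v] for v in values) ** 2 / len(values) for values in groups
    )
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Standard tie correction (equals 1 when there are no ties).
    tie_counts = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_counts) / (n ** 3 - n)
    return h / correction


def chi2_tail(h, df):
    """P(chi-square with df degrees of freedom > h); df 1 and 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(h / 2))
    if df == 2:
        return math.exp(-h / 2)
    raise ValueError("this sketch only covers two or three groups")


# Hypothetical alpha diversity values (e.g. Shannon) per treatment group.
shannon = {
    "control": [4.1, 4.3, 4.0, 4.4, 4.2],
    "drug_a": [3.1, 3.3, 3.0, 3.4, 3.2],
    "drug_b": [4.15, 4.35, 4.05, 4.45, 4.25],
}

# 'All groups': one test over every group at once.
h_all = kruskal_wallis_h(list(shannon.values()))
p_all = chi2_tail(h_all, df=len(shannon) - 1)
print(f"all groups: H={h_all:.2f}, p={p_all:.4f}")

# 'Pairwise': the same test repeated for each pair of groups.
pairwise_p = {}
for a, b in combinations(shannon, 2):
    h = kruskal_wallis_h([shannon[a], shannon[b]])
    pairwise_p[(a, b)] = chi2_tail(h, df=1)
    print(f"{a} vs {b}: H={h:.2f}, p={pairwise_p[(a, b)]:.4f}")
```

With these made-up numbers the 'all groups' test flags a difference somewhere, and the pairwise tests then point at drug_a as the group that stands apart, while control and drug_b look alike.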

Please see:

cc @mortonjt

marko · February 27, 2019, 11:50pm

Hi Matthew,
thanks for that.

I probably should have been clearer: I was referring to the general microbiome analysis videos from Dan Knights on YouTube.

Makes sense - thanks!

So if we look at the example in the post you linked, am I right in looking at the box 'sub' as kind of the alpha-diversity within the 'sub' group, and the other boxes are the beta-diversity distances between sub/supra and sub/supra_sub respectively?

Thanks
Mark

thermokarst · February 28, 2019, 2:44pm

No, I don't think that is a great way to think about it; these are all beta diversity measures. The first box is a distribution of computed beta diversity values of all pairwise comparisons of the samples within the 'sub' group.

Yep, I agree with this! The other boxes are all pairwise comparisons of the samples from the 'sub' group to the other groups.
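As a rough illustration of what goes into each box, here is a small sketch in plain Python. The sample IDs, group assignments, and distance values are all hypothetical, and the code only mirrors the description above rather than QIIME 2's own implementation.

```python
from itertools import combinations, product
from statistics import median

# Hypothetical sample -> group assignments.
group_of = {"s1": "sub", "s2": "sub", "s3": "sub", "p1": "supra", "p2": "supra"}

# Hypothetical beta diversity distances between distinct samples.
distance = {
    frozenset(("s1", "s2")): 0.30,
    frozenset(("s1", "s3")): 0.35,
    frozenset(("s2", "s3")): 0.32,
    frozenset(("p1", "p2")): 0.28,
    frozenset(("s1", "p1")): 0.60,
    frozenset(("s1", "p2")): 0.65,
    frozenset(("s2", "p1")): 0.58,
    frozenset(("s2", "p2")): 0.62,
    frozenset(("s3", "p1")): 0.66,
    frozenset(("s3", "p2")): 0.61,
}


def members(group):
    """Sample IDs that belong to the given group."""
    return [s for s, g in group_of.items() if g == group]


def dist(a, b):
    """Look up the (symmetric) distance between two different samples."""
    return distance[frozenset((a, b))]


# First box: every pair of *different* samples inside the 'sub' group.
within_sub = [dist(a, b) for a, b in combinations(members("sub"), 2)]

# Another box: every 'sub' sample paired with every 'supra' sample.
sub_vs_supra = [dist(a, b) for a, b in product(members("sub"), members("supra"))]

print("within sub:", within_sub, "median", median(within_sub))
print("sub vs supra:", sub_vs_supra, "median", median(sub_vs_supra))
```

The within-group box is built from distances between different samples that share a group label, not from a sample compared with itself, which is why it shows a spread of values rather than a single fixed number.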

marko · February 28, 2019, 11:18pm

Ok thanks - makes sense.
And thanks to you guys for answering all the questions and maintaining this resource.