Best way to analyze diversity of data with multiple factors


I am working on an assay to evaluate the impact of three factors on bacterial diversity: treatment (treatment1, treatment2, or control), dose (control or three different doses), and sample harvest time (t1 or t2). As I am not experienced with multivariate designs, I would appreciate your advice on analyzing diversity to highlight the potential effects of these factors. I am unsure if directly studying the effect of one factor on the entire dataset is appropriate, as the other two factors could mask some effects.

Regarding beta-diversity analysis: For instance, if I want to examine the dose effect only for samples harvested at t2, should I subset the data to include only t2 samples and then perform the qiime diversity beta-group-significance analysis? And what if I want to further analyze only samples harvested at t2 and treated with treatment1? Should I subset the data again to include only treatment1 samples?

I am also struggling to understand how the adonis test can be appropriately used here. If I suspect that the harvest time factor introduces more noise than information, what is the best way to proceed? I saw in another post that it is possible to 'remove' the variance of factors I am not interested in, but I couldn't figure out how to write the p-formula exactly.

For alpha-diversity analysis: I have the same question: should I subset the data to a certain level before performing the qiime diversity alpha-group-significance analysis, or is there a better approach?

Thank you in advance for your help!



1 Like

Hello Ben,

Welcome back to the forums! I'm glad you found that old adonis() post I wrote.

Yes, that's the easiest way, though less data means less power.

Are you going to look at all 6 combinations of timepoint and treatment?
Do you know how to do multiple testing corrections for this?
Edit: Sorry, that was rude. :frowning: I just had a project where we did lots of tests on subgroups, yielding nothing significant after adjusting for multiple testing. That power loss is brutal, so be warned!

First, you can see if the harvest time factor covaries with the distance matrix at all.

--p-formula "HarvestTimepoint"

This is a win-win:

  • If Timepoint is NOT significant: you can combine all your t1 and t2 samples into one big cohort to double your power!!! :100:
  • If Timepoint IS significant: then you can control for it when testing things like Dose.
--p-formula "HarvestTimepoint + DoseNumber"

Adonis works best with fully blocked formulas, which you will discover when looking at the results!


Hi Colin,

Thank you so much for your reply!

Probably not the 6 combinations but some of them, yes (e.g. treatment1 at t2, treatment2 at t2, etc.). I actually don't know how to do multiple testing corrections, I am very interested to read your advice on how to do it! And when should this multiple testing corrections be applied? As soon as I sub-sample my data to analyze the diversity?

Another related question: so, it is better to perfrom adonis testing (qiime diversity adonis) rather than "simple" permanova testing (qiime diversity beta-group-significance) in this kind of situation?

Yes, it seems to be significant:

Am i right to interpret the Pr(>F) as the p-value? And should R2 also be used in the statistical interpretation of the test?

Here are the results obtained when I applied this formula:

So, the correct interpretation of these results is: although not considering (i.e. controlling) the variance brought by the time factor, the dose factor still doesn't significantly affect the diversity observed, am I right?

Thanks a lot for your help!


Here's the wiki on the multiple comparison correction I've been using: Holm–Bonferroni method - Wikipedia

I suppose you can do that manually, but I suspect that your stats software will support this.

Like, I use R with Tidyverse and Rstatix, and I can adjust_pvalue.
These days it does the adjustment for me, by default!

Let's use this data set of cars :red_car: as an example.

I'm testing if HP depends on cylinders, which we would expect to be true.
More cylinders make more horsepower, 'cyl make hp go brr'

Of course, automatic and manual cars may be different, so first I group by that.

> mtcars |> group_by(am) |> wilcox_test(hp ~ cyl)
# A tibble: 6 Ă— 10
     am .y.   group1 group2    n1    n2 statistic     p p.adj p.adj.signif
* <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>
1     0 hp    4      6          3     4         0 0.05  0.05  *
2     0 hp    4      8          3    12         0 0.011 0.022 *
3     0 hp    6      8          4    12         0 0.004 0.012 *
4     1 hp    4      6          8     3         2 0.051 0.149 ns
5     1 hp    4      8          8     2         0 0.05  0.149 ns
6     1 hp    6      8          3     2         0 0.139 0.149 ns

It looks like it's only significant for am==0, which are the automatic cars!

If I was only looking at manual cars with 4 cylinders vs 8 cylinders, I would see p=0.05 and draw the wrong conclusion.

If you want to learn something, explain this:

  • each and every V8 car always has more horsepower than any V4 or V6 car, yet it's not always statistically significant

Yes Pr(>F) is p-value. If you set this up right no multiple testing correction is needed.

R2 is the R-squared value. In your example, 6.3796% of the variance in that distance matrix can be explained by timepoint, which is interesting! You can report on that.

You are controlling for timepoint. It's the first variable in the formula.

1 Like

Hi Colin,

Thank you for your reply!
I still have a few questions to ensure I understand it well:

  • When must the multiple comparison correction be applied? Is it because I consider restricting my data ? Or should it be performed in all cases, even when working with the entire dataset?
    If no data restriction is applied, is the error-corrected q-value provided by the basic PERMANOVA enough?

  • With ADONIS, controlling the variance of one factor allows studying the variance of another factor without multiple testing correction, if I understand well. Is there a similar way to analyze alpha diversity (i.e. controlling for one factor to analyze the variance another factor)? My idea is to avoid the current situation where the qiime diversity alpha-group-significance command does not highlight significant differences for the dose factor while I suspect that a difference could be observed if the timepoint and treatment factors were controlled.

  • Is it a good practice to always include the --p-method permdisp argument in the PERMANOVA analysis to check that an observed significant difference is caused by variation within one group?

Thanks again for your help,


I'm not a card-carrying statistician, so please double-check my work!

Multiple testing correction addresses an 'exploit' / 'hack' for making small p-values.

Slicing your data into tiny cohorts does reduce power, but this exploit is based on running all those tests on the cohorts. The multiple tests are the core problem.

Whenever I write code to do multiple tests, I add the adjusted p-values before reviewer 3 asks for them. :upside_down_face:

I think the q-value from PERMANOVA is an adjusted p-value. Do you want to find the documentation or citation for that value and report back here?

Good question! Because adonis partitions the matrix in order, I think it counts as one test.

adonis(var1 + var2 + var3)

If you run adonis 3 times, then that's multiple testing.


Sure, though I'm not sure how to do that in Qiime2. I would do this in R

It's just measuring something different. You may want to run it both ways.

image figure shows difference in dispersion, location, and both!