How to properly design formulas in Adonis?

Dear Qiime community,
After spending hours reading about it, and lacking knowledge of GLMs, I am still struggling with the proper way to design model formulas for Adonis.
In particular, I have two questions:

  1. What factors should be included in the formula?
  2. When should interaction terms be used?

Answers based on practical cases would be greatly appreciated :slight_smile:

I’ll illustrate my point with a practical example. I have an experiment with treated and control samples that were collected in two different months. I also have a bunch of other metadata. The PCoA clearly shows that samples cluster according to collection month. But what I am interested in is the treatment effect.
Is the proper way to run the test to use the formula “Collection_month + Treatment”? And then, what about other factors that may have an impact on the clustering (I’ve seen that the p-value changes when adding factors)?

And what about the “Collection_month * Treatment” formula? I think I understood that it adds the combined effect of collection month and treatment to the model, but I’m not really sure how to interpret that.


Hi @Erwan,

This depends on the question you want to answer with your data. Usually we include in the model the factors we are interested in or concerned about. You need to use your domain knowledge to decide which factors may affect your treatment effect and include them in the model.

Usually it’s good to include interaction terms when you have more than one factor in your model. If you know for sure that Collection_month does not interact with Treatment, you can just use the formula “Collection_month + Treatment”. Otherwise, you should probably include the interaction term, using the formula “Collection_month * Treatment”. If the interaction term is significant, it means your Treatment effect depends on the Collection_month. An example would be that your Treatment effect is stronger in Collection_month A than in Collection_month B.
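
To make the two formulas a bit more concrete, here is a minimal Python sketch (not adonis itself). It shows how an R-style formula with `*` expands into main effects plus an interaction term, and what a nonzero interaction looks like on made-up numbers. The helper `expand_formula` and the cell means are hypothetical, and the single numeric response is a simplification of the multivariate distances adonis actually works on.

```python
from itertools import combinations


def expand_formula(rhs):
    """Expand the right-hand side of an R-style formula into model terms.

    Only '+' and '*' are handled in this sketch: 'a * b' stands for
    'a + b + a:b', where 'a:b' is the interaction term.
    """
    terms = []
    for chunk in rhs.split("+"):
        factors = [f.strip() for f in chunk.split("*")]
        # Emit every non-empty combination of the factors joined by '*'.
        for size in range(1, len(factors) + 1):
            for combo in combinations(factors, size):
                term = ":".join(combo)
                if term not in terms:
                    terms.append(term)
    # Main effects first, then higher-order interactions (stable sort).
    terms.sort(key=lambda t: t.count(":"))
    return terms


additive_terms = expand_formula("Collection_month + Treatment")
interaction_terms = expand_formula("Collection_month * Treatment")
print(additive_terms)
print(interaction_terms)

# Hypothetical mean response for each (month, treatment) combination.
cell_means = {
    ("A", "control"): 1.0,
    ("A", "treated"): 3.0,
    ("B", "control"): 1.5,
    ("B", "treated"): 2.0,
}

# Treatment effect within each month.
effect_in_a = cell_means[("A", "treated")] - cell_means[("A", "control")]
effect_in_b = cell_means[("B", "treated")] - cell_means[("B", "control")]

# A nonzero difference between the two effects is what the interaction
# term captures: here the treatment effect is larger in month A.
interaction_contrast = effect_in_a - effect_in_b
print(effect_in_a, effect_in_b, interaction_contrast)
```

With these invented numbers the treatment effect is larger in month A than in month B. That is the kind of pattern a significant “Collection_month:Treatment” term would point to.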

You can read this blog post regarding the interaction term in regression models.


Just an add-on to @yanxianl’s excellent answer: adonis performs a sequential test of terms, which means the order in which you include your variables is actually important. Essentially, whatever variance is not explained by the first term is passed on to the second term, then the remainder from that to the third, and so on. If your terms are uncorrelated then the order may not be as big of an issue, but just so you are aware, it can have an effect on the interpretation of the results.
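
The order effect can be sketched with ordinary sequential (Type I) sums of squares on a single numeric response. This is a simplified stand-in for what adonis does on a distance matrix, not its actual implementation. The data, the 0/1 coding, and the helper names (`solve`, `rss`, `sequential_ss`) are all made up for illustration. The design is deliberately unbalanced so that the two factors are correlated.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry of this column into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(predictors, y):
    """Residual sum of squares of y regressed on an intercept plus predictors."""
    rows = [[1.0] + [float(p[i]) for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def sequential_ss(named_terms, y):
    """Each term is credited only with what the earlier terms left unexplained."""
    result, used, previous = {}, [], rss([], y)
    for name, column in named_terms:
        used.append(column)
        current = rss(used, y)
        result[name] = previous - current
        previous = current
    return result


# Hypothetical unbalanced design: most treated samples fall in month 1.
month = [0, 0, 0, 0, 1, 1, 1, 1]
treatment = [0, 0, 0, 1, 0, 1, 1, 1]
y = [1.0, 1.2, 0.9, 2.0, 2.1, 3.0, 3.2, 2.9]

month_first = sequential_ss(
    [("Collection_month", month), ("Treatment", treatment)], y
)
treatment_first = sequential_ss(
    [("Treatment", treatment), ("Collection_month", month)], y
)
print(month_first)
print(treatment_first)
```

Both orders explain the same total amount of variation, but how that total is split between the two terms depends on which one comes first. So if Treatment is the effect of interest and Collection_month is a nuisance factor, putting Collection_month first credits Treatment only with what the month does not already explain.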


Thanks to both of you for your helpful answers and for the excellent post about interaction terms, I really appreciate it!
