adonis2 output without metadata information

Hello,
I don't know if this is the right place to ask, since it's more of an R question. I am working on the microbial ecology of skin cancer and am using the adonis2 function from the vegan package. As input, I have a Bray-Curtis distance matrix from qiime2 and a metadata file. The adonis2 command itself works fine, but I am clueless as to why the ANOVA output only names "Model" instead of my metadata categories tumor and bodysite. This is also the case when I look for the interaction of the two using *:

adonis2(formula = bray_curtis_distance_matrix ~ swab_tibble$tumor * swab_tibble$bodysite)
          Df SumOfSqs      R2      F Pr(>F)   
Model     17    5.453 0.16218 1.2867  0.002 **
Residual 113   28.170 0.83782                 
Total    130   33.622 1.00000                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Is there something wrong with my metadata file or the distance matrix? I tried using both a tibble and a data frame as input, but the problem persists.

Thank you very much in advance,

Jonas

Hello Jonas,

Great question! In R, formulas are a little magic :magic_wand:

When you use the syntax swab_tibble$tumor, you are passing the literal vector of values from that column into the formula. R's formula interface interprets the entire expression swab_tibble$tumor * swab_tibble$bodysite as a single, complex predictor variable. Because adonis2 only sees one predictor, it lumps all the explained variance into the "Model" line.

Try this:

adonis2(
  formula = bray_curtis_distance_matrix ~ tumor * bodysite,
  data = swab_tibble
)

Here, these predictor columns are read implicitly from inside the data frame swab_tibble, but now the formula knows to treat them as separate variables. You should see both of them, and their interaction effect, in the output table.
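To make the pattern concrete, here is a minimal sketch on the dune example data that ships with vegan (not the skin swab data from this thread, so the column names Management and A1 are stand-ins for tumor and bodysite). It spells out by = "terms" as an assumption on my part: the default for by appears to differ between vegan releases, and naming it explicitly keeps the table from depending on the installed version.

```r
library(vegan)

data(dune)      # example community table shipped with vegan
data(dune.env)  # matching sample metadata

# Bare column names in the formula are looked up in `data`.
# `by = "terms"` is written out so that each term gets its own row,
# whatever the installed vegan version uses as its default.
set.seed(42)    # permutation p-values are random, so fix the seed
fit <- adonis2(dune ~ Management * A1,
               data = dune.env,
               permutations = 999,
               by = "terms")

print(fit)
rownames(fit)   # one row per term, then Residual and Total
```

If the table still shows only a single Model row, checking packageVersion("vegan") and the by argument is a reasonable next step.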

Hello Colin,

Thank you for the swift and helpful reply! That was the problem!

Hello Jonas and Colin,

I'm doing the same thing! I read in the Bray-Curtis qza and extract the distance matrix using bray_dist$data.

I had the exact same issue with adonis2 only giving the full model significance, but adding the formula and data explicitly sadly didn't fix things in my case. However, I found that adding by = "terms" to the call fixed the issue for me:

PERMANOVA_terms <-  adonis2(formula = bray_dist ~ Sex*Genotype, 
                          data = sample_data, 
                          permutations = 999, 
                          by = "terms") 

Just wanted to add to this thread in case someone else has the same issue!

Thanks for the tip!

It looks like by = "terms" should be the default option, so maybe we found a bug?

library(vegan)
data(dune)
data(dune.env)
## default test by terms
adonis2(dune ~ Management*A1, data = dune.env)
## overall tests
adonis2(dune ~ Management*A1, data = dune.env, by = NULL)

Interesting!
May I ask a follow-up question:
When using by = "terms", the order of the arguments in the formula matters (because of the extent to which the first variable explains diversity).
But when using by = "margin", the order of the arguments matters much less, and in particular I cannot "manipulate" the level of significance by reordering the formula (which is a good thing, I guess).
So why would anyone want to use by = "terms" instead of by = "margin"?

I suppose it depends on what you are trying to test!

I like that the order matters in adonis, because I can control for variables that I don't care about before testing the ones I do care about.

beta ~ sequencing_run + treatment

I've never had a reviewer complain about my formula order before, though I suppose this could hide multiple testing for p-hacking :person_shrugging:
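For anyone who wants to see the order effect directly, here is a rough sketch using vegan's dune example data (the variables Management and A1 are stand-ins, not data from any study in this thread). It fits the same two-term model in both orders, once with sequential tests and once with marginal tests.

```r
library(vegan)

data(dune)
data(dune.env)

# Sequential tests: each term is tested after the terms written before it,
# so swapping the order can change SumOfSqs, R2 and the p-values.
set.seed(1)
seq_ab <- adonis2(dune ~ Management + A1, data = dune.env, by = "terms")
set.seed(1)
seq_ba <- adonis2(dune ~ A1 + Management, data = dune.env, by = "terms")

# Marginal tests: each term is tested after all other terms in the model,
# so the order in the formula should not change its sum of squares.
set.seed(1)
mar_ab <- adonis2(dune ~ Management + A1, data = dune.env, by = "margin")
set.seed(1)
mar_ba <- adonis2(dune ~ A1 + Management, data = dune.env, by = "margin")

# Sum of squares attributed to A1 under each setup
seq_ab["A1", "SumOfSqs"]; seq_ba["A1", "SumOfSqs"]  # depends on the order
mar_ab["A1", "SumOfSqs"]; mar_ba["A1", "SumOfSqs"]  # same in both orders
```

With by = "terms", putting a nuisance variable first means the variable of interest only gets credit for what is left over, which is the "controlling for" behaviour described above.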


This brings me back! Story time :mouse:

On this paper, I tested for diet while controlling for the different buildings used to house our mouse model: :mouse: :house:
beta ~ building + diet, strata = timepoint

When it turned out that the building effect was pretty strong, we added it as a main finding of the paper!
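In actual adonis2 syntax, a model like that could look roughly like the sketch below. It uses vegan's dune example data, with Use, Management and Moisture as purely hypothetical stand-ins for building, diet and timepoint, so it only illustrates the shape of the call and not the analysis from the paper.

```r
library(vegan)

data(dune)
data(dune.env)

# Stand-ins (assumptions for illustration only):
#   Use        ~ building  (nuisance variable, listed first)
#   Management ~ diet      (variable of interest, tested after Use)
#   Moisture   ~ timepoint (permutations restricted within its levels)
set.seed(1)
fit <- adonis2(dune ~ Use + Management,
               data = dune.env,
               permutations = 999,
               by = "terms",
               strata = dune.env$Moisture)

print(fit)
```

The strata argument restricts permutations to samples within the same stratum, so it affects the p-values but not the sums of squares.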
