Beta-group-significance plots not representative of PERMANOVA test

Hi, I attended the QIIME 2 workshop at NIH. It was a great opportunity to learn about the new features of QIIME 2. I have one suggestion.

For the qiime diversity beta-group-significance command, in the visualization you see pairwise dispersion plots. These plots appear to be the distance to the centroid for each body site. The boxplots are here: https://view.qiime2.org/visualization/?type=html&src=https%3A%2F%2Fdocs.qiime2.org%2F2018.2%2Fdata%2Ftutorials%2Fmoving-pictures%2Fcore-metrics-results%2Funweighted-unifrac-body-site-significance.qzv

You are also given PERMANOVA test statistics.

The issue is that PERMANOVA does not test dispersion. And if you were testing dispersion with PERMDISP, it would not be testing the among-group dispersion that the QIIME plots show. From Anderson & Walsh (2013), Ecological Monographs:

The null hypothesis tested by PERMANOVA is that, under the assumption of exchangeability of the sample units among the groups, H0: “the centroids of the groups, as defined in the space of the chosen resemblance measure, are equivalent for all groups.”

When you are showing dispersion from centroids, you are visualizing results from a PERMDISP, but those should be within-group distances (not among-groups as shown in QIIME):

Even if centroids differ, PERMDISP explicitly tests only H0: “the average within-group dispersion (measured by the average distance to group centroid and as defined in the space of the chosen resemblance measure), is equivalent among the groups.”

I recommend replacing the plots with the plot that the vegan package produces for betadisper results. It’s a great visualization of both dispersion and centroid position per group. Here’s an example using the phyloseq and vegan packages:

library(phyloseq)
library(vegan)  # provides the PERMANOVA and betadisper() functions used below
data("GlobalPatterns")


jacc_GP <- distance(GlobalPatterns, "jaccard", binary = TRUE)

df_GP <- as(sample_data(GlobalPatterns), "data.frame")

## PERMANOVA
adonis_jaccGP <- adonis2(jacc_GP ~ SampleType, data = df_GP)  # adonis() was deprecated in later vegan releases
adonis_jaccGP

## Significant PERMANOVA indicates that the centroid (or spatial median) differs among groups and/or that within-group dispersion differs among groups

## PERMDISP
groups_GP <- df_GP[["SampleType"]]
jacc_dispGP <- betadisper(jacc_GP, groups_GP, type = "median")
anova(jacc_dispGP)

## If PERMANOVA and PERMDISP are both significant, you can use plotting to tell whether the PERMANOVA result reflects a difference in centroid (or spatial median) location rather than dispersion alone

plot(jacc_dispGP)
?plot.betadisper

## Would look better with higher replication for groups
plot(jacc_dispGP, label = FALSE)

## Plot with 1 standard deviation ellipses around the group medians
## sample size issue here, but you get the idea
plot(jacc_dispGP, label = FALSE, hull = FALSE, ellipse = TRUE)

## Within-group dispersion that PERMDISP is testing
boxplot(jacc_dispGP)

## pairwise p-values
TukeyHSD(jacc_dispGP)

More on plot.betadisper here: https://www.fromthebottomoftheheap.net/2016/04/17/new-plot-default-for-betadisper/


Hi @CarlyRae,
Thanks a lot for coming to the NIH workshop, and for following up on the forum! This is a very interesting post.

I see your point that these are not the right plots for PERMANOVA, and I like the vegan plots that you’re describing. We are very interested in adding support for betadisper (also called PERMDISP - my understanding is that that’s the same method) - we’ve had an open issue on that for a while now. I’m hoping we can wrap that one up before too long. What I think would make the most sense here is to include updates to the plotting functionality when we have support for betadisper/PERMDISP, so that PERMANOVA and betadisper/PERMDISP are both run by default in this visualization since they tell us different things, and plots are shown that clearly illustrate the location of centroids and dispersion from centroids. I have added a note to the issue I linked above so that we have this on our radar.

One note just for clarification: the plots that we’re showing are not actually distances to centroids, but rather all pairwise distances within a group (the first box in each plot) and between groups (all of the subsequent boxes in each plot). What we were going for here, which on re-reading Anderson and Walsh (2013) seems to actually visualize ANOSIM (an optional test run by this method), was to illustrate “the degree to which there is greater clumping (smaller distances) among samples within the same group compared to that observed among samples in different groups.”
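
To make that concrete, here is a rough base R sketch of how within-group and between-group pairwise distances could be collected for one such panel, along with the ANOSIM R statistic they relate to. This is not the QIIME 2 implementation: the data are simulated, Euclidean distance stands in for UniFrac, and every name in it (groups, coords, pairs, panel, anosim_R) is hypothetical.

```r
# Simulated stand-in data: 12 hypothetical samples in 3 groups.
set.seed(42)
groups <- factor(rep(c("gut", "tongue", "palm"), each = 4))
coords <- matrix(rnorm(12 * 5), nrow = 12) + as.integer(groups)  # shift each group
d <- as.matrix(dist(coords))  # Euclidean distance as a stand-in for UniFrac

# Long table with one row per unique sample pair (upper triangle only).
idx <- which(upper.tri(d), arr.ind = TRUE)
pairs <- data.frame(
  group_a  = groups[idx[, "row"]],
  group_b  = groups[idx[, "col"]],
  distance = d[idx]
)
pairs$type <- ifelse(pairs$group_a == pairs$group_b, "within", "between")

# One panel: all pairs involving the reference group. The first box is
# within-group, the remaining boxes are between-group.
ref <- "gut"
panel <- pairs[pairs$group_a == ref | pairs$group_b == ref, ]
panel$other <- ifelse(panel$group_a == ref,
                      as.character(panel$group_b),
                      as.character(panel$group_a))
panel$other <- factor(panel$other, levels = c(ref, setdiff(levels(groups), ref)))
boxplot(distance ~ other, data = panel,
        xlab = paste("Distances to", ref), ylab = "Distance")

# ANOSIM R: difference in mean rank of between vs within distances,
# scaled by n(n - 1) / 4 so that it lies in [-1, 1].
ranks <- rank(pairs$distance)
n <- nrow(d)
anosim_R <- (mean(ranks[pairs$type == "between"]) -
             mean(ranks[pairs$type == "within"])) / (n * (n - 1) / 4)
anosim_R
```

A value of anosim_R near 1 means within-group distances are mostly smaller than between-group distances, which is the "clumping" these boxplots are meant to show.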

Thanks again for the follow-up, this is very helpful!


Ah, ANOSIM. Yes, that would make sense.

One thing I noticed when I went back to run a PERMDISP with the betadisper function in vegan is that the default now appears to be distances to the spatial median rather than to the centroid. I’m not sure of the rationale for that, or whether type = "centroid" should be specified to be congruent with what PERMANOVA tests via the adonis function in vegan. Just an extra note. I posted this question on the blog post I linked above (https://www.fromthebottomoftheheap.net/2016/04/17/new-plot-default-for-betadisper/) to see if the vegan folks can explain the switch and suggest the best approach.
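
In case it helps anyone compare the two options, here is a minimal sketch that runs betadisper with both settings of type on the same distances. It uses the dune example data that ships with vegan rather than GlobalPatterns, and the object names (d, grp, disp_median, disp_centroid) are just placeholders I picked. It only shows how to compare the two; it does not say which one is the better match for PERMANOVA.

```r
library(vegan)

data(dune)      # example community matrix shipped with vegan
data(dune.env)  # matching sample metadata

d   <- vegdist(dune, method = "jaccard", binary = TRUE)
grp <- dune.env$Management

# Same distances, two definitions of the group centre
disp_median   <- betadisper(d, grp, type = "median")
disp_centroid <- betadisper(d, grp, type = "centroid")

# Mean distance to the group centre under each definition
tapply(disp_median$distances, grp, mean)
tapply(disp_centroid$distances, grp, mean)

# Permutation test of homogeneity of dispersion for each
set.seed(1)
permutest(disp_median, permutations = 999)
permutest(disp_centroid, permutations = 999)
```

If the two give noticeably different answers on a real dataset, that would be worth reporting alongside the PERMANOVA result.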

Thanks again for the great workshop. I learned a lot.
