Does it make sense to explore differentially abundant OTUs when two groups are not significantly different according to PERMANOVA?

smd · September 20, 2021, 2:26pm

Hello community,
For the past few days, my lab has been struggling with the following question: does it make sense to explore differentially abundant OTUs (from ASVs to phyla) in two groups that are not significantly different according to PERMANOVA (a test that assesses the global effect on the microbiota)?

We are trying to explore differences between samples of the same patients in different time-points (differing in exposure to treatment). According to PERMANOVA, there are no overall differences in composition, although we observe a slight but significant increase in Proteobacteria, specifically Haemophilus after treatment.

Another thing that concerns us is whether there is some "cut-off" in weighted_unifrac distance (meaning a % difference: 5%, 10%, and so on) that should be considered when saying that two microbiotas have different longitudinal dynamics.

Finally, have you heard about the "linear decomposition model (LDM)"? Is it better than performing PERMANOVA and LEfSe?

Thank you in advance,
Best regards,
Sara Dias

jwdebelius · September 20, 2021, 8:11pm

Hi @smd,

There are quite a few questions here!

There's some debate on this topic. Personally, I tend to err on the side of being conservative, and I'd rather report a false negative than a false positive. It's probably more damaging for my career to have a false negative, but it's more damaging for everyone to have a false positive. Whether or not you should test depends, to me, on whether or not you have a hypothesis about a specific taxon a priori.

So, like, if I don't see a difference in beta diversity but we all agreed beforehand that we were specifically going to test genus Fusobacterium, then, yeah, it's a test worth running in a compositionally aware manner, while couching the whole thing in strong terms about the lack of a global difference.
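As a minimal sketch of what testing one pre-specified taxon "in a compositionally aware manner" could look like for paired before/after samples: the example below applies a centered log-ratio (CLR) transform and a paired sign-flip permutation test. CLR is only one of several compositional options, and the counts, the pseudocount of 0.5, and the function names are all assumptions made up for illustration, not anything from this thread or from a QIIME 2 plugin.

```python
import math
import random

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample (pseudocount is an assumed choice to avoid log(0))."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def paired_sign_flip_test(before, after, taxon_index, n_perm=9999, seed=0):
    """Paired permutation test on the CLR value of a single a priori taxon."""
    # One within-patient difference per pair, so each patient is their own control.
    diffs = [clr(a)[taxon_index] - clr(b)[taxon_index] for b, a in zip(before, after)]
    observed = sum(diffs) / len(diffs)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        # Under the null, the sign of each within-patient difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= abs(observed) - 1e-12:
            hits += 1
    # Count the observed data as one permutation so p can never be exactly 0.
    return observed, (hits + 1) / (n_perm + 1)

# Made-up counts: rows are patients, columns are taxa; column 0 is the a priori taxon.
before = [[10, 200, 300], [12, 180, 310], [8, 220, 280],
          [15, 190, 300], [9, 210, 290], [11, 205, 295]]
after = [[30, 190, 290], [25, 185, 300], [20, 210, 275],
         [40, 180, 295], [22, 200, 285], [28, 195, 290]]

mean_diff, p = paired_sign_flip_test(before, after, taxon_index=0)
print(f"mean CLR change = {mean_diff:.3f}, p = {p:.4f}")
```

The point of the design is that only one taxon, chosen before looking at the data, is tested, so no multiple-testing correction across taxa is involved.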

But if I want to do hypothesis-generating differential abundance testing, I don't trust results without a difference in at least one (preferably multiple) beta diversity metrics. This is a combination of sample size issues, noise, and a history of tools with high false positive rates. (I'd recommend reading "Microbiome differential abundance methods produce disturbingly different results across 38 datasets" for more about the differential abundance tools.)

I'm also not a fan of differential abundance testing much past the family level (although YMMV depending on your ecosystem). My general experience has been that specific organisms (ASVs, OTUs, genera, families) are the primary drivers of differences, and going further up the clade tends to muddy things.

I'm not sure PERMANOVA at each time point is ideal, because the individual's microbial signature is often a stronger predictor than anything else in a time series (check out Chen et al., 2021). Have you tried longitudinal analysis yet? This might be a more powerful option to see if you've got a change. I'd recommend looking at q2-longitudinal and the corresponding tutorial. You might specifically look at gemelli, which is a gradient-aware technique. It might be a more reliable way to test for features.

Is this in terms of the metric, for maybe generalized UniFrac, or like for testing? With testing, I'd suggest looking at linear mixed effects models. If you are specifically interested in dynamics, you might also want to look at NMIT in the q2-longitudinal plugin I linked above.

I'm not familiar with it; is there a paper? But I'm also pretty convinced that most compositional techniques outperform LEfSe.

Best,
Justine

smd · September 20, 2021, 9:33pm

Hi Justine, thanks for your answers!

We have already explored some longitudinal analyses and linear mixed effects models, but you gave me some new ideas for exploring the data.

About the UniFrac: our major difference in terms of beta diversity between the intervention and control groups was around 7.5% in UniFrac distance (with a very significant p-value but a very low eta squared). I am afraid that this difference resulted mainly from the sample size and might not be enough to tell whether composition differs longitudinally, although the microbiota dynamics of the two groups seem different.

Here is the paper on LDM: https://academic.oup.com/bioinformatics/article/36/14/4106/5823298

jwdebelius · September 21, 2021, 7:45am

Hi @smd,

If the dynamics are different, then they're different. There's lots of ways for a microbiome to be perturbed without requiring specific differential abundance.
But also, PERMANOVA R² values are notoriously low. I have a paper where we hit 1% variation and used it as an excuse for cinnamon rolls. (In our defense, they make very good cinnamon rolls here.) Like, it obviously depends on your sample size: with more samples, smaller effect sizes become detectable. If you're getting p = 1/(n perm), I'd call that significant.
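To make the relationship between pseudo-F, R², and the permutation p-value concrete, here is a rough pure-Python sketch of a one-way PERMANOVA on a distance matrix. It is not the QIIME 2 or scikit-bio implementation; the function names are hypothetical, and the data are simulated coordinates with Euclidean distances standing in for a real UniFrac matrix. The p-value uses the common (hits + 1) / (n_perm + 1) convention, so the smallest value it can return is 1 / (n_perm + 1), which is roughly the "p = 1/(n perm)" floor mentioned above.

```python
import math
import random
from itertools import combinations

def permanova_stats(dist, labels):
    """Return (pseudo-F, R2) for a square distance matrix and one grouping."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_between = ss_total - ss_within
    pseudo_f = (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return pseudo_f, ss_between / ss_total

def permanova(dist, labels, n_perm=999, seed=0):
    """Permutation test: shuffle group labels and recompute pseudo-F each time."""
    observed_f, r2 = permanova_stats(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if permanova_stats(dist, shuffled)[0] >= observed_f * (1 - 1e-12):
            hits += 1
    return observed_f, r2, (hits + 1) / (n_perm + 1)

# Simulated example (an assumption, not real data): a modest shift on one axis.
rng = random.Random(42)
coords = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(30)]
coords += [(rng.gauss(0.6, 1.0), rng.gauss(0.0, 1.0)) for _ in range(30)]
labels = ["control"] * 30 + ["treated"] * 30
dist = [[math.dist(a, b) for b in coords] for a in coords]

f_stat, r2, p = permanova(dist, labels)
print(f"pseudo-F = {f_stat:.2f}, R2 = {r2:.3f}, p = {p:.3f}, floor = {1 / 1000:.3f}")
```

Because R² is the between-group share of the total sum of squares, it can stay small whenever within-group variation dominates, regardless of how small the p-value is.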

Thanks for the LDM paper; I don't know if anyone else here has played with it.

Best,
Justine

smd · September 24, 2021, 12:53pm

Hi Justine,
Just out of curiosity, can you elaborate further on "There's lots of ways for a microbiome to be perturbed without requiring specific differential abundance."?

Thank you

jwdebelius · September 24, 2021, 8:15pm

Hi @smd,

Perturbation doesn't have to be a single organism; it can be a community shift. I think this is best summarized by the Anna Karenina principle:

https://www.nature.com/articles/nmicrobiol2017121
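One hedged way to picture this: two groups can share the same average composition while one of them is much more variable from sample to sample. The toy sketch below uses made-up positions on a single hypothetical ordination axis and compares group means with mean within-group pairwise distance as a simple stand-in for dispersion; it is an illustration only, not a formal dispersion test.

```python
from itertools import combinations
from statistics import mean

def mean_pairwise_distance(values):
    """Average absolute distance between all pairs of samples in one group."""
    return mean(abs(a - b) for a, b in combinations(values, 2))

# Made-up positions on one ordination axis; both groups are centered on 5.0.
control = [4.8, 4.9, 5.0, 5.1, 5.2]
treated = [3.0, 4.0, 5.0, 6.0, 7.0]

print("group means:", mean(control), mean(treated))
print("dispersion:", mean_pairwise_distance(control), mean_pairwise_distance(treated))
```

Here the group centers coincide, so a test aimed at a shift in location has nothing to find, yet the treated samples are spread far more widely than the controls.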

Best,
Justine