taxonomy analyses, differential abundance

Hello,
For my project I am comparing the gut microbiome of two clinical groups (patients who gained weight vs. patients who lost weight after an intervention) and whether it changed over time (before and after the intervention). So for each individual I have two gut microbiome samples.
The diversity analysis results showed that the microbiome of the group that lost weight had lower diversity after the intervention.
What I’m struggling with is the taxonomy analyses.
My first question is aside from heatmaps and differential abundance testing, and looking quickly at the taxa-barplots that I’ve generated, what other analyses can I do?

My other question is: I’m not sure which differential abundance testing to use. I tried to use ANCOM (which showed no significant differences) but as I understand, there are many different algorithms including songbird, qurro, aldex-2 etc… Is there a paper that you can refer me to that compares the different methods?

Also, is there a method that can test whether aggregates of taxa are more or less abundant in one clinical group than in another? I know that Gneiss can do that, but is it no longer recommended?

Thanks,
Rima


Hi, @rnasrah! :wave:

My knowledge is limited here, but I think if we have a little more background about the resources you have already consulted, that would be helpful in providing you with more specific resources. If you haven’t already searched the forum for discussions about similar topics, would you mind doing so and sharing links to those discussions here, along with any comments or questions? I’m also curious which tutorials you have consulted. If you haven’t already (it sounds like you might have), this tutorial in particular could be useful, as well as this one, which might help you discover if Gneiss is useful in your situation.


Hi Rima,

This isn’t an exhaustive answer but here are some thoughts.

My first question is aside from heatmaps and differential abundance testing, and looking quickly at the taxa-barplots that I’ve generated, what other analyses can I do?

The first thing that jumps out to me is looking at beta diversity plots, if you haven’t already (not sure if this was included in the diversity analyses you did) – in my opinion these are most useful as a diagnostic of “broadly, how different do these samples look?” and it sounds like this might be useful for your dataset. If your samples are all clustered together, then it may make sense that there wouldn’t be very clear differences in later analyses like differential abundance.
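To make the "how different do these samples look?" idea concrete, here is a minimal sketch of the Bray-Curtis dissimilarity that many beta diversity plots are built on. The sample names and counts are made up for illustration, and the sketch assumes all samples have the same sequencing depth (in practice you would rarefy or normalize first, which the QIIME 2 diversity pipeline handles for you).

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


# Hypothetical counts: one row per sample, one column per taxon.
counts = {
    "subj1_before": [40, 30, 30, 0],
    "subj1_after": [35, 35, 25, 5],
    "subj2_before": [10, 10, 20, 60],
    "subj2_after": [5, 15, 20, 60],
}

# Pairwise dissimilarities, keyed by the pair of sample names.
distances = {
    (a, b): bray_curtis(counts[a], counts[b])
    for a, b in combinations(counts, 2)
}

# Print pairs from most similar to least similar.
for pair, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(pair, round(d, 3))
```

In this toy data the before/after samples from the same subject are much closer to each other than samples from different subjects, which is the kind of pattern an ordination plot would show as tight clusters.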

I can’t think of anything else off the top of my head that would be useful for your particular project, but I’m sure other folks on here (or some of the tutorials in Andrew’s post) might have ideas. (Usually I’d recommend checking out some of the functionality in q2-longitudinal, but since your dataset has just two timepoints per sample I’m not sure much of that functionality would be meaningful for it.)

My other question is: I’m not sure which differential abundance testing to use. I tried to use ANCOM (which showed no significant differences) but as I understand, there are many different algorithms including songbird, qurro, aldex-2 etc… Is there a paper that you can refer me to that compares the different methods?

There have been a lot of papers back and forth on differential abundance tools. One recent one is this preprint – there was a lot of discussion on Twitter about it, and it looks like they just updated the preprint based on this feedback to consider some new methods (including Songbird). It may be worth checking this out. (That being said, I just read over the way they ran Songbird [without tuning the regression parameters], so I’m not super confident their results for Songbird are very useful.)

qurro

Qurro isn’t really an algorithm so much as a visualization tool – it should be usable with most differential abundance tools’ outputs (of the tools mentioned so far it’s mostly been tested with Songbird / ALDEx2 outputs, in particular).

I tried to use ANCOM (which showed no significant differences)

Although I’ve heard that ANCOM can be overly conservative with assigning “significance,” it’s worth noting that there really could just not be substantial differences between your sample groups (although one of the groups had lower diversity, I’m not confident that this means that there must be some “significantly” different taxa between the groups). This is a situation where beta diversity plots can be useful – if there is a noticeable difference in how your samples cluster in one of these visualizations, then that could be something worth looking for … but if there isn’t, then it could be an indication of your samples having mostly similar compositions. (This is all kind of hand-wavy, the main point is just to suggest that the boring result [the samples are mostly similar] might be the true one.)

Also, is there a method that can test whether aggregates of taxa are more or less abundant in one clinical group than in another? I know that Gneiss can do that, but is it no longer recommended?

I think PhyloFactor may be useful for this? It isn’t available in QIIME 2 though as far as I know. (cc’ing @mortonjt.)


Thanks so much @fedarko and @andrewsanchez for your awesome replies!
(Sorry, my bad, I wasn’t clear. I had followed all the tutorials in the QIIME docs. My question was whether there is anything else I can do specifically for the taxonomy analyses, but your replies helped a lot!)

Thank you :slight_smile:


For differential abundance method selection, I suggest you have a look at this preprint from my colleague Jakob Russel and give his DAtest package a try.


Hello Fedarko,
I have a couple of questions related to the discussion above. (1) Is it appropriate to use the multivariate option in Songbird to analyze single-factor data? For example, I want to analyze the factor “Treatments”, which has six levels, and look for differential microbiota in five of those levels in comparison with the Control group (the reference). (2) When Songbird differentials are used as input to Qurro, it generates Natural_Log_Ratios (after selecting a reference frame) for features across different groups. Should we use the Natural_Log_Ratios output for statistical analysis (ANOVA, Kruskal-Wallis, etc., with post hoc tests) to look for significant differences among groups?
I would be very grateful for your comments and answers.
Best regards,
Bilal

Hey Bilal,

Here are my comments – however, I didn’t write Songbird and I am not a statistician, so these are just my opinions on these things. (I’m tagging @mortonjt, who is probably more qualified to give advice on these things :slight_smile: )

I think this sounds reasonable – to clarify, this Treatments field in your data has six categories, and you just want to compare five of these categories independently with respect to a reference category? That sounds ok to me. Not sure if you’ve seen this, but the Songbird README shows how to set the exact reference category used here.

There are papers in the literature that do ordinary significance testing on these sorts of log-ratios – this is done, for example, in Songbird’s original paper, this paper on fermented foods and gut microbiota, and in this preprint on oral microbiota. However, as you’ve said, doing this constitutes a post hoc test – so the problem of multiple comparisons remains. (So although computing a p-value in order to compare log-ratios between groups can be useful as long as you stay aware of this problem and describe it accordingly, it is possible in theory to try 100 different log-ratios and just report the five most interesting ones without mentioning it. That is obviously a bad idea, and this is part of why Qurro leaves the task of determining “significance” up to the user.)
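As one illustration of handling the multiple comparisons problem, here is a minimal sketch of the Benjamini-Hochberg adjustment. This is a standard correction that I am swapping in as an example; it is not something Qurro or Songbird does for you, and the raw p-values below are made up. The important part is that the input list should contain a p-value for every log-ratio you tried, not just the interesting ones.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per log-ratio that was examined.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(p, round(q, 5))
```

Note how the two log-ratios with raw p-values just under 0.05 end up just above 0.05 after adjustment, which is exactly the situation where reporting only the "best" comparisons would be misleading.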

My personal preference here is reporting effect sizes instead of p-values (I come back to this tweet every so often), but I think reporting p-values could be kosher so long as you describe that Qurro was used to do exploratory data analysis for finding differential log-ratios across sample groups, and that these p-values are based on that exploratory analysis (and therefore should probably be validated in another dataset). As another way of handling this, the fermented foods paper I linked above did a permutation test on Songbird’s results to validate that the log-ratios seemed significant – that could be a useful way to come at this problem. Yet another way of (partially) mitigating this problem is sharing your Qurro plots so that readers can validate for themselves that your log-ratios seem robust – see here for details.
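Here is a rough sketch of what reporting an effect size alongside a permutation test could look like for one chosen log-ratio. To be clear, this is a generic two-group label-shuffling test, not the exact procedure from the fermented foods paper, and the log-ratio values and group names are made up.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed difference, permutation p-value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly 0.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical per-sample log-ratios exported from Qurro for two groups.
treated = [1.1, 0.9, 1.4, 0.8, 1.2, 1.0]
control = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1]

effect, p_value = permutation_test(treated, control)
print("difference in mean log-ratio:", round(effect, 3))
print("permutation p-value:", round(p_value, 4))
```

The difference in mean log-ratio is the effect size here; the p-value only tells you how often a difference that large shows up when the group labels are shuffled.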

I’m not sure about what it would mean to do ANOVA using these log-ratios – my impression of Songbird is that the multinomial regression it’s doing under the hood is equivalent to ANOVA, so I’m not sure that doing ANOVA on the log-ratios obtained after the fact would add much information. I suppose it might be useful, but this is beyond my expertise, sorry!


To follow up: yes, it is currently recommended to do post hoc testing with ANOVA once your log-ratios are chosen.
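For anyone wondering what that looks like in practice, here is a minimal sketch: compute the natural log-ratio per sample for a chosen numerator and denominator, then compute a one-way ANOVA F statistic across groups. The taxon names, group names, and counts are hypothetical. Dropping samples with a zero numerator or denominator (rather than adding a pseudocount) is my understanding of how Qurro treats them, so take that as an assumption.

```python
import math
from statistics import mean


def natural_log_ratio(sample_counts, numerator, denominator):
    """ln(sum of numerator counts / sum of denominator counts) for one sample."""
    num = sum(sample_counts[t] for t in numerator)
    den = sum(sample_counts[t] for t in denominator)
    if num == 0 or den == 0:
        return None  # log-ratio is undefined, so the sample is dropped
    return math.log(num / den)


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical samples: (treatment group, counts per taxon).
samples = [
    ("Control", {"taxA": 10, "taxB": 40}),
    ("Control", {"taxA": 12, "taxB": 36}),
    ("Control", {"taxA": 0, "taxB": 50}),  # dropped: numerator is zero
    ("DietX", {"taxA": 30, "taxB": 30}),
    ("DietX", {"taxA": 45, "taxB": 15}),
    ("DietY", {"taxA": 20, "taxB": 40}),
    ("DietY", {"taxA": 25, "taxB": 25}),
]

# Group the per-sample log-ratios by treatment.
by_group = {}
for group, counts in samples:
    lr = natural_log_ratio(counts, ["taxA"], ["taxB"])
    if lr is not None:
        by_group.setdefault(group, []).append(lr)

f_stat = anova_f(list(by_group.values()))
print(by_group)
print("F statistic:", round(f_stat, 2))
```

Turning the F statistic into a p-value needs the F distribution, which is not in the standard library; if you have SciPy installed, `scipy.stats.f_oneway` gives you both in one call.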

@rnasrah, to answer your question about differential abundance benchmarks: there are actually quite a few different benchmarks out there in addition to DAtest.


One of the major challenges in differential abundance analysis is that, in order to perform a hypothesis test, you almost always have to enforce some untestable assumption about the total microbial population size. ANCOM assumes that some percentage of taxa aren’t changing. ALDEx2 assumes that the mean isn’t changing. Songbird doesn’t make such an assumption (and doesn’t perform a hypothesis test because of that).
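A toy example of why the total population size matters: two very different biological scenarios can produce exactly the same proportions, so sequencing data alone cannot tell them apart. The absolute abundances below are invented for illustration (in a real study they are unknown unless you measure microbial load separately).

```python
def proportions(counts):
    """Convert absolute abundances to relative abundances."""
    total = sum(counts.values())
    return {taxon: c / total for taxon, c in counts.items()}


# Hypothetical absolute abundances before the intervention.
before = {"taxA": 100, "taxB": 100, "taxC": 200}

# Scenario 1: only taxA changed (it doubled).
scenario_1 = {"taxA": 200, "taxB": 100, "taxC": 200}

# Scenario 2: taxA did not change; taxB and taxC both halved.
scenario_2 = {"taxA": 100, "taxB": 50, "taxC": 100}

p1 = proportions(scenario_1)
p2 = proportions(scenario_2)
print(p1)
print(p2)
print("identical proportions:", p1 == p2)
```

Whether you call this "taxA increased" or "taxB and taxC decreased" depends entirely on an assumption about the total, which is the kind of assumption each tool has to make (or avoid) in its own way.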

If you need a test for differences between groups, then Gneiss is still a good candidate. Just beware that pseudocounts will add bias to your inference.
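To give a feel for the pseudocount issue, here is a small sketch using a plain two-taxon log-ratio (not Gneiss's actual balances). The counts are hypothetical; the point is only that when a zero is involved, the estimated difference between groups depends heavily on which pseudocount you happen to pick.

```python
import math


def log_ratio_with_pseudocount(num, den, pseudocount):
    """Natural log-ratio after adding the same pseudocount to both counts."""
    return math.log((num + pseudocount) / (den + pseudocount))


# Hypothetical (numerator count, denominator count) for one sample per group.
group_a = (0, 50)  # numerator taxon not observed
group_b = (5, 50)

# Difference in log-ratio (group_b minus group_a) for each pseudocount.
differences = {}
for pc in (0.01, 0.5, 1.0):
    a = log_ratio_with_pseudocount(*group_a, pc)
    b = log_ratio_with_pseudocount(*group_b, pc)
    differences[pc] = b - a
    print("pseudocount", pc, "-> difference", round(differences[pc], 3))
```

The underlying counts never change, yet the apparent effect shrinks substantially as the pseudocount grows, so it is worth checking that your conclusions hold across a few reasonable choices.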


Thank you so much Fedarko and Jamie for your comprehensive replies. These are really very helpful for my study.
Stay safe and healthy.
Bil