ANCOM abundance table: more abundant in which treatment?

Hi there,

I’m a little confused about one of my results from ANCOM:

Specifically the abundances of Firmicutes. I can’t work out if it’s higher in T3 or T4. I drew a bar chart using the percentiles given in the hope that it would help but I’m still none the wiser (apologies for messiness - I’m not a fan of rulers :rofl:).

From this, it doesn’t look like there is any difference between the two treatments - but maybe I’m missing something in the way that ANCOM calculates this?

Thanks so much in advance for your help! :slightly_smiling_face:

Hi @xchromosome,
You are correct to have concerns about this. This isn’t the first time we’ve seen ANCOM identify features as significantly different with very low W values. See this post for more details. However, I should also mention that ANCOM is much more powerful when there are more features to compare, meaning that if you compared your data at the genus level instead of the phylum level you would get much more meaningful results. You are only comparing 7 phyla, which means the maximum W value you can get is 6, and that doesn’t allow for a lot of certainty. This may be why you are seeing differences that may not actually be real.
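
To see where that ceiling of 6 comes from, here is a rough Python sketch of how a W value is tallied. It is a simplification and not ANCOM’s actual implementation: `count_w` and `rejected_pairs` are names I made up, and I’m assuming the pairwise log-ratio tests have already been run so that only their reject/keep outcome is passed in.

```python
def count_w(n_features, rejected_pairs):
    """Tally, per feature, how many pairwise log-ratio tests were rejected.

    rejected_pairs is a hypothetical input: index pairs (i, j) whose
    log-ratio differed significantly between the two groups.
    """
    # Treat (i, j) and (j, i) as the same pair so nothing is counted twice.
    unique_pairs = {tuple(sorted(pair)) for pair in rejected_pairs}
    w = [0] * n_features
    for i, j in unique_pairs:
        w[i] += 1
        w[j] += 1
    return w


n_phyla = 7
# Best case for feature 0: its ratio against every other phylum is rejected.
rejected = {(0, j) for j in range(1, n_phyla)}
w = count_w(n_phyla, rejected)
print(w[0])    # W for feature 0
print(max(w))  # cannot exceed n_phyla - 1
```

Each feature can only be compared against the other 6, so W tops out at 6 no matter how different the groups are.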

So @xchromosome, you could use a t-test for small groups like phyla, then ‘dive deeper’ using ANCOM for more specific taxonomies. :+1:

Colin

Thank you to you and @colinbrislawn for the replies. I wasn’t sure which taxonomic level to look at for my analyses, so I have done all of them at levels 2, 5, 6 and 7. I read the post you linked to - so I understand why ANCOM isn’t really suitable for phylum level comparisons. What about level 5/family?

Sorry, this is probably a silly question but how exactly would you recommend I do this? I was under the impression that t-tests couldn’t be used for microbiome data!

Thank you! :grinning:

In R: youtube example
In excel: youtube example

…um, stats tests don’t care about where the data came from (microbiome or otherwise) but they do have assumptions that need to be met for the result to be valid. For example, the t-test assumes a normal distribution, and if your data distribution isn’t normal (microbiome or not) then you have to pick a different test.
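
As one example of a “different test”, here is a minimal sketch of a permutation test on the difference in group means, which doesn’t assume a normal distribution. This is my own pick for illustration, not something the linked videos cover; the function name and the numbers are made up, and it only sidesteps the normality assumption, not the issues specific to relative abundances.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the group labels and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up measurements for two treatments, five samples each.
t3 = [0.42, 0.38, 0.45, 0.40, 0.36]
t4 = [0.41, 0.39, 0.44, 0.37, 0.43]
p = permutation_p_value(t3, t4)
print(p)
```

A small p-value would mean that a difference this large rarely shows up when the treatment labels are shuffled at random.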

I’m not a card-carrying statistician, so I think I’ll let the experts answer your stats questions!

Colin

Thank you for the replies.

Oh, I know! Sorry, maybe I didn’t explain that right! What I meant was that I thought t-tests weren’t really suitable for comparing relative abundances like you would generate from microbiome analysis, and that that was the advantage that ANCOM had over them. I’ve always struggled with stats though.

So, given that a W value of 2 (as in my result above) is invalid, can I ask what you think of this result, which is the same table and treatment comparison but at family level instead, with W scores of 30 - 38? ANCOM-D37-T3-vs-T4-collapsed-level-5.qzv (455.6 KB)

Thanks so much!

Hello again,

Good point! Compositional data has its own issues that can violate expectations of some tests. So let’s dive into these new ANCOM results.

Here’s how Jamie describes the ANCOM volcano plot:

You have no points in your top right corner… meaning that nothing is super different between groups.

Have you tried this test using all your features without grouping them at a higher taxonomic level? Grouping can hide cool trends of specific microbes, so performing this test a third time using all features might work well. :man_shrugging:

Colin

Hi Colin,

Thanks for that. I’ve just had a read at that thread where Jamie explained the volcano plot and noticed that the example in question looked a bit different to mine. It seems that poster was comparing multiple groups, so there is an F-score on the x axis in that case:

So I understand that. But in my graph, there are only two groups. My x axis looks different:

In this post here, Jamie said that for the example given with only 2 groups,

“Considering the volcano plot, it looks like there isn’t an F-statistic being run. When there are only 2 categories, just the clr mean difference is calculated (which is essentially a log fold change). If you have negative log fold change, that is indicative of decrease (since log(x) < 0 for 0 < x < 1), whereas a positive log fold change is indicative of increase (since log(x) > 0 for x > 1).”

(Sorry, can’t find how to link the quote from that post!)

So from what I can gather, it seems like features appearing in the top left of the graph are just as significant/valid/important as those in the top right, in this case where there are only two groups? Is that correct?
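
To convince myself, I tried a tiny Python sketch of the sign logic in that quote. The function name and the numbers are made up, and a plain log ratio of two means is only a stand-in for the clr mean difference that ANCOM reports:

```python
import math


def log_fold_change(reference_mean, other_mean):
    """Natural log of other_mean / reference_mean (both must be positive)."""
    return math.log(other_mean / reference_mean)


halved = log_fold_change(10.0, 5.0)    # ratio 0.5, so the log is negative
doubled = log_fold_change(5.0, 10.0)   # ratio 2.0, so the log is positive
print(halved, doubled)
```

If that’s right, swapping which treatment is the reference just mirrors a point from the left of the plot to the right, with the same size of change.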

Thanks again :slightly_smiling_face:
Lindsay

Hello Lindsay,

I’m a little out of my depth as I’m not a statistician, but let’s start with the CLR statistic.

I googled CLR and it turns up CLR® Calcium, Lime, & Rust Remover, which is not a stat test :man_facepalming:


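In our context, CLR stands for centered log-ratio: each microbe’s abundance is expressed relative to the geometric mean of all microbes in the same sample (the “average microbe”).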

This paper goes into more of the math, and explains why CLR is helpful for samples with different numbers of reads.
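
Here is a minimal Python sketch of the transform as I understand it. The function and the counts are made up for illustration, and it assumes every count is positive (zeros would need to be replaced first, for example with a pseudocount):

```python
import math


def clr(counts):
    """Centered log-ratio: log of each count minus the mean of the logs.

    Subtracting the mean log is the same as dividing each count by the
    geometric mean of the sample before taking the log.
    """
    logs = [math.log(c) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


shallow = [120, 30, 850]        # made-up counts for three microbes
deep = [1200, 300, 8500]        # same community, ten times more reads
print(clr(shallow))
print(clr(deep))
```

Both samples give the same CLR values (up to floating-point rounding), because sequencing depth cancels out when you divide by the geometric mean. Positive values are above the “average microbe” and negative values are below it.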


Yep! That sounds correct to me. These microbes are decreasing between groups.

EDIT: I don’t have any idea what I’m talking about. I think a statistician needs to ‘qiime-in’ to tell you how best to report this.

Colin

Haha, I had a similar issue when trying to Google gneiss for the first time :laughing:

Thanks - I will check that out. I’m struggling to get my head around the concept of an “average microbe” and how the increasing/decreasing relative to that links in with treatment groups. Maybe that paper will help it click!

Haha no problem! I really appreciate you taking the time to help. I will wait patiently for an expert to arrive. :ambulance:

Thanks!
Lindsay

Hi @xchromosome, part of the problem has to do with the nature of relative data: it is not possible to infer absolute differences from relative data. This means it isn’t possible to infer increase / decrease from relative data. But it may be possible to infer which microbes increase / decrease the most, and simulations show that ANCOM may be able to do this. See our paper here: https://www.nature.com/articles/s41467-019-10656-5
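
As a toy illustration in Python (the counts are invented, and with real sequencing data you would not know the absolute numbers at all):

```python
def relative_abundance(counts):
    """Convert absolute counts to proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical absolute cell counts in two treatments.
t3_absolute = {"Firmicutes": 100, "Bacteroidetes": 100}
t4_absolute = {"Firmicutes": 100, "Bacteroidetes": 300}

t3_relative = relative_abundance(t3_absolute)
t4_relative = relative_abundance(t4_absolute)
print(t3_relative["Firmicutes"], t4_relative["Firmicutes"])
```

Firmicutes did not change in absolute terms, yet its relative abundance falls from 0.5 to 0.25 because Bacteroidetes grew. From the proportions alone you cannot tell that apart from Firmicutes actually shrinking.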

Are you seeing separation in beta diversity? If PERMANOVA doesn’t give you significant results, maybe differential abundance won’t give you the insights you want.
