ANCOM minimal recommended sample size

Hello,
I am performing a differential abundance analysis using ANCOM, adjusted for covariates.
I was wondering, what is the minimal sample size for implementing ANCOM? I’ve read that the method is accurate as long as the group sizes are not too small; however, I couldn’t find any data regarding the minimum recommended group size.

Hope you can help me understand this better, and would appreciate references :slight_smile:

Thanks in advance!
Lena

Hi @LenaLapidot,

This is a complex question :slight_smile:, and I don’t think there’s a clear answer. I can give you a lower limit of 5 per group for the Kruskal-Wallis-based test, because Kruskal-Wallis as a test starts to fail below that sample size, but I can also tell you that even at that size, you’re unlikely to see a W-value above zero.
In general, the sample size you need depends on two major factors: the effect size for each individual feature and the number of features you’re testing.

I’m not sure there’s a good measurement for the relative effect size of different factors. But, essentially, your effect size has to be big enough to create a p-value low enough that it remains significant after FDR correction. If your sample size is too small for your effect size, you may actually see W=0 for many or all of your features, because nothing is significant enough to reach the significance threshold after correction.
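To make the interplay concrete, here is a toy simulation (not ANCOM itself; the group sizes, feature counts, and effect sizes are all made up for illustration) showing how, with small groups and many tests, features that look significant on the raw p-values can fail to survive Benjamini-Hochberg FDR correction:

```python
# Toy sketch: small groups + many features -> few hits survive FDR.
# All numbers here are hypothetical, chosen only to illustrate the point.
import numpy as np
from scipy.stats import kruskal
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(42)
n_features, n_per_group = 200, 5  # 5 per group is roughly the Kruskal-Wallis floor

pvals = []
for i in range(n_features):
    a = rng.lognormal(mean=0.0, sigma=1.0, size=n_per_group)
    # give the first 10 features a genuine (moderate) shift between groups
    shift = 1.0 if i < 10 else 0.0
    b = rng.lognormal(mean=shift, sigma=1.0, size=n_per_group)
    pvals.append(kruskal(a, b).pvalue)

# Benjamini-Hochberg correction across all features tested
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(f"raw p < 0.05: {sum(p < 0.05 for p in pvals)}, surviving FDR: {reject.sum()}")
```

With this few samples per group, the number of features surviving correction is typically much smaller than the number with raw p < 0.05, which is the same squeeze that can drive ANCOM’s W toward zero.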

That correction is where the second factor comes in. The more features you have, the more tests you conduct, and the more extreme a difference has to be to be detectable over the noise. If you filter your data to remove low-abundance/low-prevalence features, you increase your power for the remaining features. I tend to be in the camp that anything present in only one person is noise; I can’t do statistical tests on it. Where to filter beyond that is debatable; I tend to use an empirical threshold based on what has worked for the type of tests I want to do. In general, though, the rule here is that with fewer features, you can detect smaller effect sizes in the features you do test. Even so, you’re still playing with an underpowered analysis, potentially missing interesting/important things because they’re low abundance and/or low prevalence.
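A minimal sketch of that kind of prevalence filter, on a hypothetical samples-by-features count table (the threshold of "more than one sample" matches the rule of thumb above; in practice you’d tune it to your data):

```python
# Prevalence filter sketch: drop features seen in at most one sample.
# The toy table and the threshold are assumptions for illustration only.
import pandas as pd

table = pd.DataFrame(  # hypothetical counts: 6 samples x 5 features
    [[5, 0, 12, 0, 3],
     [8, 0,  0, 0, 1],
     [0, 2,  7, 0, 4],
     [3, 0,  9, 0, 0],
     [6, 0,  4, 0, 2],
     [1, 0, 11, 0, 5]],
    columns=[f"feat{i}" for i in range(5)])

prevalence = (table > 0).sum(axis=0)   # number of samples each feature appears in
keep = prevalence > 1                  # singletons and all-zero features are dropped
filtered = table.loc[:, keep]
print(f"kept {filtered.shape[1]} of {table.shape[1]} features")
```

Here `feat1` (present in one sample) and `feat3` (absent everywhere) are removed, so the FDR correction in a downstream test runs over 3 features instead of 5.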

One semi-controversial way to check whether or not you have enough of a global effect for testing is to look at the beta diversity. If you see a significant effect in beta diversity, particularly with a compositional metric like Aitchison, you’re more likely to see an effect even with a smaller sample size. This isn’t foolproof, and it’s still prone to type II error, but my experience has been that it’s a good indicator of whether or not it’s possible to get a signal in ANCOM. (Incidentally, I have lots of cases where I can find a signal in beta diversity for a smallish sample size, but my data set is too small to pick out individual features.) I find ANCOM is conservative, so you may also want to look at other differential abundance techniques, like ALDEx2, which may be better at smaller sample sizes.
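For reference, the Aitchison distance is just the Euclidean distance between CLR (centered log-ratio) transformed compositions, so it can be sketched in a few lines of numpy (this is an illustration, not the QIIME 2 implementation; the +1 pseudocount for zeros is an assumption, and other zero-replacement strategies exist):

```python
# Aitchison distance sketch: CLR transform, then Euclidean distance.
# The random count table and the pseudocount are assumptions for illustration.
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
counts = rng.integers(0, 50, size=(8, 20))  # hypothetical: 8 samples x 20 features

pseudo = counts + 1.0                                  # pseudocount handles zeros
comp = pseudo / pseudo.sum(axis=1, keepdims=True)      # close to relative abundances
log_comp = np.log(comp)
clr = log_comp - log_comp.mean(axis=1, keepdims=True)  # centered log-ratio

aitchison = squareform(pdist(clr, metric="euclidean"))  # 8 x 8 distance matrix
print(aitchison.shape)
```

The resulting distance matrix can then go into a significance test such as PERMANOVA to check for the global group effect described above.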

Best,
Justine


Dear Justine,
Thank you for this clear and explanatory answer!

I am dealing with a data set with over 150 subjects. I wanted to be sure that, if I do a stratified analysis, the remaining subgroups (of approximately 50-80 subjects) are still acceptable for ANCOM. The effect size is good enough to pass the FDR correction and I do identify differentially abundant features; I just wanted to be sure ANCOM works accurately on a smaller sample size.
Also, thank you for suggesting exploring beta diversity using a compositional metric like Aitchison. I actually used Weighted UniFrac, Unweighted UniFrac, Bray-Curtis, and JSD, which were all significant; I will try to see the results with the Aitchison metric. I am also not familiar with ALDEx2 and I’ll read about it, but I wanted to use ANCOM since I have the option to do an adjusted analysis with this method.

Thank you again for your help.
Best,
Lena


Hi @LenaLapidot,

With differences in so many metrics, I’d be pretty confident you should have a signal! I also think with 50-80 subjects you should do okay. You may still lack power to detect all the differences, but based on my experience, I would worry less about power here.

…And, just to add new things to the mix, you could also look at songbird, which allows adjusted analysis but can be kind of hard to check. I think the R version of ALDEx2 supports covariate adjustment, but I haven’t played with that much.

Best,
Justine


That’s great. Thank you!


@jwdebelius definitely gave a fantastic explanation - I found a couple of sources mentioning the minimum sample size for ANCOM (although not much in the way of benchmarking). Analysis of microbial compositions: a review of normalization and differential abundance analysis mentions that ANCOM and ANCOM-BC fail to control the FDR at sample sizes < 10, and Analysis of compositions of microbiomes with bias correction benchmarks ANCOM and several other differential abundance algorithms on small and unbalanced data (n1 = 20, n2 = 30). Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size looks like it might address this in more detail. I’m currently diving into this, as after stratification my sample sizes are in the 3-5 range.


@hsapers,

Thank you! I’ve added both to my reading list!

Thank you @hsapers,
The sources look great and very relevant :slight_smile: