ANCOM-BC2 paired samples analysis

Hello,

I am working with microbiome 16S data from fecal samples collected from 6 subjects at two time points (T0 and T1). I would like to perform a differential abundance analysis to identify potential changes between T1 and T0 while accounting for the effect of each individual subject. Here, you can see an Aitchison Beta Diversity plot, where clustering appears to be more related to the subject than to the time points.

I am using ANCOM-BC2 to perform this comparison by fitting a mixed-effect model (Is there a way to test Paired-samples in ANCOMBC2? · Issue #183 · FrederickHuangLin/ANCOMBC · GitHub):

ancombc2(data = ps_paired, 
                tax_level = NULL,
                fix_formula = "timepoint",
                rand_formula = "(1 | subjectID)",
                p_adj_method = "BH",
                prv_cut = 0.1
                ...)

First, I got the next error message:

The error message from `lmerTest` is as follows:
number of levels of each grouping factor must be < number of observations (problems: paciente)

However, I reproduced the analysis for all the ASVs outside of ANCOM-BC2 using the lmerTest package, and I didn't encounter any errors. An example for ASV_1 is as follows:

lmerTest::lmer(ASV_1 ~ timepoint + (1 | subjectID), data = data, control = lme4::lmerControl())

This error was solved by increasing the prevalence filtering parameter prv_cut to 0.2. After this change, I received the following warnings, which are related to model misspecification and led me to suspect that the results might not be reliable.

1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
   Hessian is numerically singular: parameters are not uniquely determined
2: In as_lmerModLT(model, devfun) :
  Model may not have converged with 1 eigenvalue close to zero: 6.8e-11

Globally, 350 out of the initial 866 ASVs were included in the analysis after detecting structural zeros. As you can see in the image below, 106 out of 350 ASVs had no estimation for lfc, se, and W, and their p-values and adjusted p-values were equal to 1. After examining these ASVs with no parameter estimation in the feature table, I observed that they all shared the common characteristic of not being present in both time points for at least one subject. However, their p-values were still used to calculate the adjusted p-values, which resulted in all adjusted p-values from the time variable being equal to 1.

When I increased the prv_cut to 0.3, only 81 out of the initial 866 ASVs were included after structural zero detection, and only 5 out of those 81 had no estimations for the previous parameter. I did not receive any errors or warnings in this case.

What is the most suitable way to perform this comparison?

1. I could use the results obtained after applying prv_cut = 0.3. However, I am concerned that I might be missing rare but biologically important signals, as many ASVs were lost after using this filter.

2. I have read that I could use a fixed effects model (formula: subjectID + timepoint) instead of a mixed effects model. I believe this is the typical design for paired analysis in other omics tools like DESeq2. However, as far as I understand, the model would be quite complex and might not be generalizable beyond the results we are observing.

3. Perhaps this tool is not yet the most suitable for this type of analysis, and I should explore other alternatives like ALDEx2, which performs Wilcoxon test after CLR transformation.

I would be grateful if you could share your opinion and recommend more suitable alternatives, if you are aware of any.

Thanks,

Andrés

1 Like

Hi @andresarroyo,

I dont think this is a modeling or a tool issue, I think this is a data issue and I would very cautious about any interpretation you present with 6(!) subjects and 12(!) samples. You're seriously underpowered and so filtering isn't likely to make that any worse than it is in your design. I'll also note that when I filter my data in large studies of the gut, I drop 90% of my features because they're low abundance/low prevelance.

So, in terms of your options...

My best professional recommendation would be to not do a test at all and focus on a descriptive analysis. Pretty much every analyst comes across an underpowered data set at some point. Congratulations, you've gotten one of yours!

Of your options

I like heavy filtering for this data set. A prev > 0.3 filter basically says, "ASV in 3 samples" which means "ASV in at least 2 people. I think this is a good choice because it means that you're looking at a distribution of people.

Base don some unpublished simulations, this works okay as long as certain assumptions are met. My experience says its more about the number of groups and capabilities of the tool than whether or not a fixed or random effect is more appropriate. (I like random effects for timeseries because I think that grouping aspect is useful, FWIW).

However, like I said earlier, your biggest issue isnt the model, your issue is power.

You could try something like Gemelli and then seeing if you could use the factor loadings to construct an ALR. I did something similar in a recent paper. This gives you a polymicrobial statistic, so youre not paying an FDR penalty. You may still need to filter aggressively before hand.
Its also slightly harder to explain, so thats another consideration.

But mostly, I would recommend acknowledging the limitations of your experiment and not trying to torture underpowered data.

Best,
Justine

6 Likes

Thank you very much for your prompt response @jwdebelius!

I am aware that this project has a significant limitation in terms of sample size, and the conclusions drawn should be taken with great caution. I believe it should be regarded as no more than an exploratory analysis.

As you mentioned, sometimes we have to face these kinds of challenges, and I wanted to know which would be the most suitable analysis option (if there is one). I will consider your suggestions and decide which path to follow.

Andrés

1 Like