Associate changes in taxa relative abundances (for all taxa) with calculated changes of continuous metadata variables

jwdebelius · August 28, 2024, 1:08pm

So, a few thoughts, for what they're worth.

First, this sounds like an interesting and complex experiment!

Your goal doesn't align with the currently available tools. I think we can accomplish the same thing, but you'll have to adjust your theoretical framework. I wrote about this a while ago when someone was struggling with relationships in their DAG.

Because the measures are contemporaneous, you have no way of knowing which came first: the change in the microbiome or the change in the overall physiology. You have a guess, but you don't know for sure. So, switch your model, and remember that in a linear regression, if y = mx + b, then x = (y - b) / m.

This is especially true at this stage, where your p-value should be reasonably close regardless of the modeled relationship and you're essentially using the p-value as a cut-off for looking at your data.
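To make the "direction matters less than you want" point concrete, here's a minimal sketch in plain Python. It assumes the simplest case: one predictor, no covariates, ordinary least squares. The data and variable names (`microbe_change`, `physiology_change`) are made up for illustration. In that setting the t statistic for the slope, and so the p-value, comes out the same whichever variable you put on the left. With covariates or random effects I'd expect "close" rather than identical.

```python
import math
import random


def slope_t_statistic(x, y):
    """t statistic for the slope of an ordinary least squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)
    )
    # Standard error of the slope, using n - 2 residual degrees of freedom.
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope / standard_error


# Simulated (hypothetical) data: 30 subjects, one change score per variable.
random.seed(0)
microbe_change = [random.gauss(0, 1) for _ in range(30)]
physiology_change = [0.5 * m + random.gauss(0, 1) for m in microbe_change]

# Fit the model in both directions and compare the test statistics.
t_forward = slope_t_statistic(microbe_change, physiology_change)
t_reverse = slope_t_statistic(physiology_change, microbe_change)
print(f"physiology ~ microbe: t = {t_forward:.4f}")
print(f"microbe ~ physiology: t = {t_reverse:.4f}")
```

The two printed values should agree up to floating point error, because both are a function of the same correlation coefficient and sample size.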

This is basically a modeling problem. It's not an easy modeling problem, but it's a modeling problem. You essentially want to model this as y ~ m*t, which expands to y ~ m + t + m:t
where

y is your dependent variable,
the m term is the change in y for each unit change in m when t = 0 - or the effect of your independent variable at baseline if t is time.
the t term is the change in y for each unit change in t when m = 0 - or the effect of time alone on your dependent variable.
the m:t term is the additional change in y for each unit change in both m and t - how much the effect of m changes with each time step.
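If it helps to see the expansion written out, here's a rough sketch in plain Python of what y ~ m*t turns into: a design matrix with an intercept, m, t, and the product m*t. Everything here is an assumption for illustration: the coefficients and data are made up and noise-free, the helper names are hypothetical, and a real analysis would use a proper modeling package (and probably random effects for repeated measures).

```python
def design_row(m, t):
    """Expand y ~ m*t into its columns: intercept, m, t, m:t."""
    return [1.0, m, t, m * t]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least squares fit via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up coefficients: intercept, m, t, m:t.
true_coefs = [2.0, 0.5, -1.0, 0.25]

# Hypothetical design: four values of m, each observed at three time points.
observations = [(m, t) for m in (0.0, 1.0, 2.0, 3.0) for t in (0.0, 1.0, 2.0)]
rows = [design_row(m, t) for m, t in observations]
y = [sum(c * v for c, v in zip(true_coefs, row)) for row in rows]

intercept, coef_m, coef_t, coef_mt = fit_ols(rows, y)

# The effect of m on y is not one number: it depends on t.
for t in (0.0, 1.0, 2.0):
    print(f"slope of y on m at t = {t}: {coef_m + coef_mt * t:.3f}")
```

The part to focus on is the last loop: the slope of y on m at a given time is `coef_m + coef_mt * t`, so the m:t coefficient is what tells you whether the relationship shifts over time.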

The way you put the variables in the model (whether you treat t as continuous or categorical, etc.) can determine how these predictors get put together. But, ultimately, what you want is an interaction model. I'd find a statistician to work with on this one, if you have one available and you're still fuzzy.

This is exactly what the longitudinal plugin does for alpha and beta diversity - again, flipping the direction because the direction of the equation matters less than you want it to.

Tool wise, I find @MARTIN_MIHULA's proposal interesting:

ANCOM-BC 2, in R, has the linear mixed effects framework implemented. It's just not available in QIIME 2. I'm not familiar with these, and I'm not sure whether they use appropriate underlying models. I know the p-values in MaAsLin are a bear to work with. I would double check Table 1 from "Microbiome differential abundance methods produce different results across 38 datasets", which unfortunately does not include ANCOM-BC but will give you details about the others.

Best,
Justine