Multiple regression?

Hi @colinbrislawn and @Audrey_Anne,

Kind of, although not in this way. Like, I've done dynamics predicting outcome, and I've done dynamics as an outcome, but I've never done repeated measures predicting time.

I think I'm also struggling with how the model gets integrated. It seems like you have a relatively small sample size. Even if we assume body sites are mostly independent (noting that :mouse2: are coprophagic assholes, not everyone washes their hands well, and oral microbe translocation seems to be a mark of poor health), there's still an issue of repeated measures through time. I don't know if/how a random forest model would address this.

I have a broader concern (cue eye rolling from friends), which is your sample size. I don't know that you can do causal inference on what looks like it might be 2(?) bodies with repeated sampling. I worry about any model being overfit, because it's based on a single person, and I think that's a major factor to consider when you look at modeling.
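
One common way to check for this kind of overfitting is to hold out every sample from one body at a time, so that repeated measures from the same body never end up in both the training and test sets. Here is a minimal sketch in plain Python. The function name `leave_one_subject_out` and the `body_A`/`body_B` labels are hypothetical and only for illustration, and with only two bodies the resulting estimate would still be very noisy.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding each subject out once."""
    for held_out in sorted(set(subject_ids)):
        # Training set: every sample that is NOT from the held-out subject.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        # Test set: all repeated measures from the held-out subject.
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical labels: which body each sample came from (not real data).
subject_ids = ["body_A", "body_A", "body_A", "body_B", "body_B"]
splits = list(leave_one_subject_out(subject_ids))
```

Each split would then be used to fit the model on `train` and score it on `test`. If performance collapses on the held-out body, the model has probably learned that individual rather than a general pattern.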

I also think it matters what you assume about how the classifier will be used in the future. (More eye rolling as I get philosophical). So, like, do you think this classifier will help predict time since death when a body is found in a field? Are you planning to follow bodies over time to see if you can predict the change?

If you think your current data supports the model, then I think you do have to account for it. I'm personally moving toward using more continuous derived log ratios in my work (i.e. model the data, construct an ALR, treat the ALR as an independent variable in my model). In theory, you could do this across multiple body sites in a training set, throw them all into some kind of regression for age (although I'm not sure QIIME 2 can do this), and then use the regression model as a classifier/predictor. (Apparently just linear regression is a classifier :woman_shrugging:). I've used code from Jamie Morton to build an ALR on an LME. It's a little bit fragile with Stan, so that's kind of a downside.
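
To make the ALR-then-regression idea concrete, here is a rough sketch in plain Python. It is not the Stan/LME code mentioned above: it swaps in ordinary least squares on a single ALR coordinate, and it ignores the repeated-measures structure entirely. The helper names `alr` and `fit_line`, the pseudocount of 0.5, the choice of reference taxon, and the toy counts are all assumptions made for illustration.

```python
import math


def alr(counts, ref_index, pseudocount=0.5):
    """Additive log-ratio: log of each part over a chosen reference part."""
    # The pseudocount (an assumption here) avoids log(0) for zero counts.
    shifted = [c + pseudocount for c in counts]
    ref = shifted[ref_index]
    return [math.log(x / ref) for i, x in enumerate(shifted) if i != ref_index]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up toy counts (rows = samples, columns = 3 taxa) and days; not real data.
counts = [[90, 5, 5], [60, 30, 10], [30, 60, 10], [10, 80, 10]]
days = [1, 3, 5, 7]

# First ALR coordinate: log(taxon 0 / taxon 2), with taxon 2 as the reference.
x = [alr(row, ref_index=2)[0] for row in counts]
intercept, slope = fit_line(x, days)
predicted = [intercept + slope * xi for xi in x]
```

With real data, the log ratio would be one predictor among several, and a mixed model with a random effect per body would be the more defensible choice for repeated sampling.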

I think my best advice is actually to find a statistician and/or machine learning expert to collaborate with. The kind of modeling you need to do here requires some serious expertise, and will also require knowledge/insights about some of the specifics of your project.

Sorry I don't have a tidy solution.

Best,
Justine
