How to incorporate the baseline data within a crossover analysis?

isuzjc · May 26, 2020, 6:43pm

Hi,

We are trying to incorporate baseline data as a covariate when analyzing the 16S rRNA NGS data from our placebo-controlled crossover study. The two treatments (placebo and experimental) were compared within each research participant and the order of the treatments was randomly assigned.

We generated 8 sets of data: 1) Baseline for placebo treatment in Period 1 (B1), 2) Baseline for experimental treatment in Period 1 (B2), 3) Post treatment for placebo treatment in Period 1 (P1), 4) Post treatment for experimental treatment in Period 1 (P2), 5) Baseline for placebo treatment in Period 2 (B3), 6) Baseline for experimental treatment in Period 2 (B4), 7) Post treatment for placebo treatment in Period 2 (P3), and 8) Post treatment for experimental treatment in Period 2 (P4).

Our objective is to test whether the placebo and experimental treatments lead to different diversity and relative abundance results. We are struggling with how to handle the baseline data. One way is to look at the change between "baseline" and "post". However, we would prefer to treat the baseline data as a covariate in the crossover analysis. We tried ANCOM2 in R, which can accommodate covariates; however, it cannot accommodate a crossover analysis, e.g., the need to address period effects (differences in the treatment responses between periods 1 and 2).

For the diversity and relative abundance data, what plugin/packages should we use to incorporate the baseline data within a crossover analysis (including analysis of period effects)?

Any assistance would be much appreciated!

jwdebelius · May 26, 2020, 7:30pm

Hi @isuzjc,

Welcome to the forum!

This is a great experimental design. Unfortunately, I'm not sure all the tools are equipped to handle it. I do actually think you're better off comparing the deltas here, where the question becomes "Does the treatment shift the microbiome more than the placebo?" This lets you account for big inter-individual differences and handle the baseline data. It also, to some degree, simplifies your analysis because it lets you break some of the assumptions around independence/dependence that make working with diversity hard. (However, I'd also check that your two baselines are comparable!) Based on that recommendation, I would suggest you look at q2-longitudinal. ...You may also need to explore solutions outside of QIIME 2.
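To make the delta idea concrete, here is a rough sketch in plain Python. It is not q2-longitudinal and not a QIIME 2 API; it is the textbook two-period crossover calculation applied to per-period deltas (post minus baseline) of a single alpha diversity value per sample. The records, subject IDs, sequence labels ("EP" = experimental then placebo, "PE" = the reverse), and function names are all made up for illustration, and it assumes every subject has both periods and that there is no carryover between periods.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical records: (subject, sequence, period, baseline, post).
# Values stand in for one alpha diversity metric per sample.
records = [
    ("s1", "EP", 1, 3.0, 3.6), ("s1", "EP", 2, 3.1, 3.2),
    ("s2", "EP", 1, 3.4, 3.8), ("s2", "EP", 2, 3.3, 3.4),
    ("s3", "EP", 1, 2.9, 3.6), ("s3", "EP", 2, 3.0, 3.0),
    ("s4", "PE", 1, 3.2, 3.4), ("s4", "PE", 2, 3.1, 3.5),
    ("s5", "PE", 1, 3.5, 3.6), ("s5", "PE", 2, 3.4, 3.9),
    ("s6", "PE", 1, 2.8, 3.1), ("s6", "PE", 2, 2.9, 3.3),
]


def period_deltas(rows):
    """Map subject -> (sequence, {period: post - baseline})."""
    out = {}
    for subject, sequence, period, baseline, post in rows:
        out.setdefault(subject, (sequence, {}))[1][period] = post - baseline
    return out


def crossover_estimates(rows):
    """Estimate treatment and period effects from within-subject deltas."""
    diffs = {"EP": [], "PE": []}
    for sequence, deltas in period_deltas(rows).values():
        # Within-subject contrast: period 1 delta minus period 2 delta.
        diffs[sequence].append(deltas[1] - deltas[2])

    m_ep, m_pe = mean(diffs["EP"]), mean(diffs["PE"])
    n_ep, n_pe = len(diffs["EP"]), len(diffs["PE"])

    # In EP the contrast is (treatment + period); in PE it is (-treatment + period).
    treatment = (m_ep - m_pe) / 2  # experimental minus placebo
    period = (m_ep + m_pe) / 2     # period 1 minus period 2

    # Pooled-variance standard error, shared by both estimates.
    df = n_ep + n_pe - 2
    pooled = ((n_ep - 1) * variance(diffs["EP"])
              + (n_pe - 1) * variance(diffs["PE"])) / df
    se = sqrt(pooled * (1 / n_ep + 1 / n_pe)) / 2

    return {"treatment": treatment, "period": period,
            "se": se, "t_treatment": treatment / se, "df": df}


result = crossover_estimates(records)
print(result)
```

The t statistic would be compared against a t distribution with `df` degrees of freedom. With small groups or skewed deltas, a rank-based test on the same within-subject contrasts may be the safer choice.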

Trying to include the prior community in the analysis as a traditional covariate is going to be a headache for any number of reasons, including interdependence, the number of features, and the fact that there aren't great tests equipped to handle this.

Best,
Justine

isuzjc · July 21, 2020, 9:22pm

Thank you very much!