Time + treatment effects

Nicheca · July 24, 2019, 7:10am

Hi Everyone,

I conducted an experiment with 20 newborn calves and I analyzed tissue samples collected in the colon at birth (baseline) and at d5 of life in the same animals. I have 2 groups of calves CON and TRT (treatment started just after first sample collection) and I would like to look at the time and TRT effects on microbial population. I also have the gene expression in the same samples and would like to find the best tool to look at the correlation between host and microbial populations.
I was thinking of ANCOM and SparCC analysis; do you have any other suggestions?
Concerning the ANCOM analysis, what is the best way to run it? Can I use the d0 as a baseline and look at the differences concerning the treatment AND the time? Or should I split my dataset?
Many thanks in advance,

jwdebelius · July 24, 2019, 7:14am

Hi @Nicheca,

Welcome!

My tendency is to recommend that everyone start with beta diversity, because I figure that if there's a community-level difference, there is a lower probability that whatever you see in feature-based analysis is a false positive. (There are definitely other opinions on what the appropriate order is!) As far as your experimental design goes, I think you'll get the most mileage out of accounting for your paired design. Have you checked out the longitudinal data tutorial, which deals with both paired samples and larger gradients?

Best,
Justine

Nicheca · July 24, 2019, 2:09pm

Hello Justine,

Thanks for your quick reply!
I did not mention it, but I looked first at alpha and beta diversity: no difference except for Faith’s Phylogenetic Diversity between baseline and d5. I would then like to move on to the feature level. I did check the longitudinal data tutorial and wanted to give the feature volatility analysis a try; do you recommend it? How does it differ from ANCOM?
Thanks again,

jwdebelius · July 24, 2019, 2:27pm

Hi @Nicheca,

ANCOM behaves sort of like a t-test of the hypothesis that $\bar{x}_{1} \neq \bar{x}_{2}$, assuming that the two groups are made up of different samples. So, it is compositionally aware (pro), but you can't account for the repeated-measures nature of your data (con). To be honest, I haven't played that much with the individual volatility plots for features. The other thing I will mention, though, is that you can also do LMEs in q2-Gneiss (see the Gneiss tutorial here), which is a compositionally aware technique, but can also be harder to interpret.
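
To make the pairing point concrete, here is a minimal sketch. It is not ANCOM, just plain unpaired (Welch) and paired t statistics computed with the Python standard library, and the values are made up for illustration. The function names `welch_t` and `paired_t` are hypothetical helpers, not part of any QIIME 2 plugin.

```python
import math
import statistics


def welch_t(a, b):
    """Unpaired (Welch) t statistic: treats a and b as independent groups."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(a, b):
    """Paired t statistic: a[i] and b[i] are assumed to come from the same animal."""
    diffs = [x - y for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# Made-up transformed abundances of one feature in 5 calves (illustration only).
day0 = [1.0, 2.0, 3.0, 4.0, 5.0]
day5 = [1.5, 2.4, 3.6, 4.5, 5.5]

print("unpaired t:", round(welch_t(day5, day0), 2))
print("paired t:  ", round(paired_t(day5, day0), 2))
```

With large between-animal variation and a small, consistent within-animal shift, the unpaired statistic is small while the paired one is large, which is why ignoring the paired design can cost a lot of power.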

Best,
Justine

Nicholas_Bokulich · July 24, 2019, 2:36pm

feature-volatility is just going to assess what features are changing over time in all samples that you feed to it... so it will not be aware of treatment or of individuals. I do not mean to say that this action won't be useful to you, just that it won't stratify samples by treatment by default.

Something that could be very interesting to try here is the maturity-index action, which is essentially a special case of feature-volatility. This was originally developed for assessing how different treatments impacted the "maturation" of the gut microbiome in human infants, so a somewhat similar design to your own. It will stratify the data by treatment, and train the model only on the control group to assess what "normal" maturation conditions look like, then use that model to test whether each subject is maturing at a "normal" rate. It will report important features, just as feature-volatility does, and you can use those feature importances as input to the plot-feature-volatility visualizer to get the same data exploration visualization that feature-volatility generates.

Nicheca · July 24, 2019, 11:51pm

Many thanks to both of you. I will give maturity-index a try and may come back to you if I run into trouble!

Cheers,

Nicheca · July 25, 2019, 4:43pm

Hi @Nicholas_Bokulich,

I just went through the tutorial for maturity-index, and it specifies that 'This analysis will only work on data sets with a large sample size, particularly in the "control" group, and with sufficient biological replication at each time point'.
I have only 10 animals in each group (control and treated) and only 2 time points. Do you think it is still relevant to apply this action?

Thanks,

Nicholas_Bokulich · July 25, 2019, 4:45pm

No, unfortunately that is too small for this action. The reason is that the action subsets your data, so you would be testing a tiny number of samples and would not get robust results.

Nicheca · July 25, 2019, 9:01pm

I see. What do you think about running an ANCOM analysis comparing the baseline time point with the other time point, for the control group and the treated group separately? I could then identify the taxa that differ between my 2 time points within each group, and then compare the results from the control and treated groups.

Nicholas_Bokulich · July 26, 2019, 11:48am

I would discourage that approach. If you are trying to determine what differs by treatment, then comparing what changed over time within each group, and essentially making a Venn diagram of those changes, would not be a rigorous enough approach. If anything, I'd recommend running ANCOM to compare treatments at each of your two time points.

If you really want to assess the interaction between time and treatment, then you could try out q2-gneiss. But I assume treatment effects are probably the main concern here, in which case splitting up your data by time point would be an approach that is easy to run and interpret.
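
As a rough sketch of that split (the sample IDs, the metadata layout, and the helper `split_by_day` are all hypothetical, just for illustration), grouping samples by time point and then by treatment could look like this in plain Python:

```python
from collections import defaultdict

# Hypothetical sample metadata: (sample_id, animal_id, day, group).
samples = [
    ("s1", "calf01", "d0", "CON"),
    ("s2", "calf01", "d5", "CON"),
    ("s3", "calf02", "d0", "TRT"),
    ("s4", "calf02", "d5", "TRT"),
]


def split_by_day(rows):
    """Group sample IDs first by time point, then by treatment group."""
    by_day = defaultdict(lambda: defaultdict(list))
    for sample_id, _animal, day, group in rows:
        by_day[day][group].append(sample_id)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {day: dict(groups) for day, groups in by_day.items()}


subsets = split_by_day(samples)
for day in sorted(subsets):
    print(day, subsets[day])
```

Each per-day subset can then be tested for CON vs TRT on its own. Since treatment started after the baseline sample, the d0 comparison mostly serves as a check that the two groups looked similar before treatment.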

I missed this in your initial message. For finding associations between host expression + gut microbiome, you could use something like rhapsody in QIIME 2. rhapsody is designed for metabolite-microbiome interactions, but would work just as well for expression-microbiome correlations. Give it a spin! But if you have questions, open a new topic and link back to this one so that this topic discussion does not get off track.

Nicheca · July 26, 2019, 2:25pm

Thank you so much @Nicholas_Bokulich! My last idea was to use the log2FC (between my 2 time points) and look at the treatment effect with ANCOM. Is that not appropriate either?
I am working on q2-gneiss and will give rhapsody a try!

It is difficult to find the best way to analyze these data because I am actually equally interested in the time AND the treatment effects! They are 2 different questions, but I would love to address both!

Nicholas_Bokulich · July 26, 2019, 3:54pm

Oh I see, it sounds like you are using ANCOM in R for full functionality, not the QIIME 2 wrapper. Yes! That sounds like a good way to go if you are looking at paired sample comparisons across time.

Nicheca · July 26, 2019, 10:11pm

Yes, I use R for ANCOM, sorry, I forgot to mention it. Thanks again for your guidance.

Nicheca · August 21, 2019, 11:25pm

Hi @Nicholas_Bokulich,

I have an extra question for you! Do you know if it is possible to calculate the log2FC for each individual (or sample) between 2 time points? I managed to calculate the log2FC between 2 time points, but only from the average value across all samples in each treatment for each feature.

Thanks,
Clo

Nicholas_Bokulich · August 22, 2019, 4:59am

Not in QIIME 2 — it would be a neat method to have if you are interested in contributing!
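
Outside of QIIME 2, a per-animal log2FC is simple to compute by hand. A minimal sketch in plain Python, assuming each animal has one count table per time point; the function `per_subject_log2fc`, the pseudocount of 1 (used to avoid dividing by zero), and the counts below are all illustrative assumptions, not an established method:

```python
import math


def per_subject_log2fc(before, after, pseudocount=1.0):
    """Return {subject: {feature: log2((after + pc) / (before + pc))}}."""
    result = {}
    for subject in before:
        if subject not in after:
            continue  # skip animals that lack the second time point
        features = set(before[subject]) | set(after[subject])
        result[subject] = {
            f: math.log2(
                (after[subject].get(f, 0) + pseudocount)
                / (before[subject].get(f, 0) + pseudocount)
            )
            for f in sorted(features)
        }
    return result


# Made-up counts per animal and feature (illustration only).
d0 = {"calf01": {"featA": 3, "featB": 0}, "calf02": {"featA": 7, "featB": 1}}
d5 = {"calf01": {"featA": 15, "featB": 3}, "calf02": {"featA": 7, "featB": 0}}

fc = per_subject_log2fc(d0, d5)
for animal, values in fc.items():
    print(animal, values)
```

Raw counts are compositional, so in practice you would probably want to normalize or log-ratio transform them before taking per-animal differences; the choice of pseudocount also affects low-abundance features.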

Nicheca · August 23, 2019, 12:00am

Hahaha!
My knowledge of bioinformatics is below sea level; I am barely managing to swim at the surface. I would not be very useful.