What is the validity of comparing groups using their individual core microbiomes?

fabianomenegidio · October 4, 2019, 1:52am

I have a microbiome dataset, which I am using to compare different groups of mice (symptomatic animals, asymptomatic animals and a control group). I did some preliminary analysis with this data set in QIIME 1.9.1, but found a large degree of variability between individual animals, which I fear makes it difficult to identify specific group profiles.

To overcome this problem, I performed some comparisons among the different groups using their respective core microbiomes and got much better results.

To accomplish that, I used the script compute_core_microbiome.py, from QIIME 1.9.1, to generate individual core microbiomes for each group (with an 80% prevalence). Next, I merged these core microbiomes into a single OTU table, using the command merge_otu_tables.py, and performed a beta diversity analysis (NMDS-PERMANOVA, using MicrobiomeAnalyst) to compare the three groups. This approach resulted in perfect separation of the animals into three groups, according to their respective conditions (symptomatic animals, asymptomatic animals and control).
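For readers unfamiliar with the idea, a prevalence-based core can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the implementation in compute_core_microbiome.py; the function name `core_features`, the nested-dict table layout, and the toy counts are all assumptions made up for the example.

```python
def core_features(table, samples, min_prevalence=0.8):
    """Return features detected (count > 0) in at least `min_prevalence` of `samples`.

    `table` is a hypothetical layout: feature ID -> {sample ID -> count}.
    """
    core = set()
    for feature, counts in table.items():
        n_present = sum(1 for s in samples if counts.get(s, 0) > 0)
        if n_present / len(samples) >= min_prevalence:
            core.add(feature)
    return core


# Toy data (invented): five animals from one group, three OTUs.
toy_table = {
    "OTU_1": {"m1": 10, "m2": 4, "m3": 7, "m4": 1, "m5": 0},  # present in 4 of 5
    "OTU_2": {"m1": 3, "m2": 0, "m3": 0, "m4": 2, "m5": 0},   # present in 2 of 5
    "OTU_3": {"m1": 5, "m2": 6, "m3": 8, "m4": 9, "m5": 2},   # present in 5 of 5
}
group_samples = ["m1", "m2", "m3", "m4", "m5"]

core = core_features(toy_table, group_samples, min_prevalence=0.8)
print(sorted(core))
```

Note that the filter is applied within one group at a time, so which features survive depends on the group labels themselves.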

However, I feared that this separation might be artefactual, since the core microbiomes would naturally emphasize similarities within each group. To test for that, I distributed my animals into randomly balanced groups (each containing a similar number of animals from each of the original groups) and reran the entire analysis, as described above. This resulted in no clear group separation, which gave me some confidence that my original group separation was not artefactual (especially after I obtained the same null results with three different sets of randomly balanced groups).
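The random regrouping described above could be generated along these lines. This is a minimal sketch, assuming a simple mapping of animal ID to original group; `balanced_random_groups` and the toy animal IDs are hypothetical names for illustration, and the sketch makes it cheap to draw far more than three regroupings.

```python
import random
from collections import Counter


def balanced_random_groups(labels, n_groups, rng):
    """Assign each animal to one of `n_groups` random groups so that every
    random group receives a similar share of each original group."""
    by_group = {}
    for animal, group in labels.items():
        by_group.setdefault(group, []).append(animal)

    assignment = {}
    for group, animals in sorted(by_group.items()):
        animals = sorted(animals)
        rng.shuffle(animals)
        # Random starting point so leftovers do not always land in group 0.
        offset = rng.randrange(n_groups)
        for i, animal in enumerate(animals):
            assignment[animal] = (i + offset) % n_groups
    return assignment


# Toy labels (invented): six animals per original condition.
original = {
    f"{cond}_{i}": cond
    for cond in ("symptomatic", "asymptomatic", "control")
    for i in range(6)
}

rng = random.Random(42)
regrouped = balanced_random_groups(original, n_groups=3, rng=rng)
composition = Counter((regrouped[a], original[a]) for a in original)
for key in sorted(composition):
    print(key, composition[key])
```

Calling the function in a loop with the same `rng` yields a fresh balanced regrouping each time.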

Although I am cautiously confident of my results, I have not found any papers in the literature that employed this exact same approach, which keeps me wondering if there are any alternatives to conduct my analyses.

I will do the analysis again using QIIME 2, but my biggest concerns at this point are: (i) What is the validity of comparing groups using their individual core microbiomes? (ii) Would the use of randomly balanced groups be sufficient to rule out possible biases introduced by this approach? (iii) Is there a better alternative? (iv) Can you point me to any work that has followed a similar methodology?

jwdebelius · October 4, 2019, 8:02am

Hi @fabianomenegidio,

This is a delightfully complex topic, and my response might be long.

The large variability between individual animals that you describe is a common feature of microbiome analysis. Inter-individual differences are to be expected and are, in fact, both a challenging and important piece of the work. You have an added layer of complexity in your experiment because you likely have multiple cages. (Mice are coprophagic, so a cage becomes a secondary nesting unit if they are co-housed. As a result, it's important to have multiple cages per treatment.) A second issue with mice is that if you have multiple lineages, for example because you bought the animals from different suppliers, each lineage has its own specific microbiome. So, those are a few caveats for mouse experiments. With the general discussion about mice almost over: if you've not read it, I think "How informative is the mouse for human gut microbiota research?" is a good article that addresses some of these issues.

fabianomenegidio:

To accomplish that, I used the script compute_core_microbiome.py, from QIIME 1.9.1, to generate individual core microbiomes for each group (with an 80% prevalence). Next, I merged these core microbiomes into a single OTU table, using the command merge_otu_tables.py, and performed a beta diversity analysis (NMDS-PERMANOVA, using MicrobiomeAnalyst) to compare the three groups. This approach resulted in perfect separation of the animals into three groups, according to their respective conditions (symptomatic animals, asymptomatic animals and control).

However, I feared that this separation might be artefactual, since the core microbiomes would naturally emphasize similarities within each group. To test for that, I distributed my animals into randomly balanced groups (each containing a similar number of animals from each of the original groups) and reran the entire analysis, as described above. This resulted in no clear group separation, which gave me some confidence that my original group separation was not artefactual (especially after I obtained the same null results with three different sets of randomly balanced groups).

I think there are several potential issues here, the mouse issues aside. First, the core microbiome is still a controversial and hotly debated topic. I'm not sure I'd trust your threshold of 80% prevalence (although I like that it's prevalence-based), because you're only picking things that are really consistently there, especially if you're making this assessment at the OTU level. Microbial time and microbial generations are a lot faster than at the macroscale, so some sparsity is to be expected in the data. Interesting things can happen at much lower prevalence in a group. But, in general, this seems like a really good way to guarantee that you see differences. You're taking very sparse data, cherry-picking things that are highly prevalent within one group (possibly without accounting for cage nesting), and then asking whether the groups separate. I'm also not sure 3 iterations is enough to ensure the validity of your work. ...It also sounds like you didn't do any statistical tests?

I would, instead, start with a few distance metrics (maybe the set in core diversity) and then test the full distance matrix, not just its projection into PCoA space, with something like Adonis or PERMANOVA (there are examples of both in a mouse study in the PD mice tutorial). This lets you ask whether, individual differences included, you see a difference between groups. You get a p-value, and Adonis even gives an R² that tells you the variation explained... plus it's multivariate, so you can account for things like cage nesting. I think the statistics give you a fuller picture than a simple PCoA projection does: PCoA explains the largest source of variation in the data but often reflects composite features of your data. Relying on separation within PCoA space as a metric for separation in the microbiome is suboptimal. It's true that groups which separate in PCoA space are almost always significantly different, but it's also true that things which are significantly different in the microbiome do not always separate in PCoA space.
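To make the idea of testing the full distance matrix concrete, here is a minimal one-way PERMANOVA sketch in plain Python, using the usual pseudo-F statistic and label permutations. It is not the QIIME 2 or vegan implementation; the function names, the toy one-dimensional "communities", and the seed are assumptions for illustration, and this simple version does not model cage nesting.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Return (pseudo-F, R^2) for a square distance matrix and group labels."""
    n = len(labels)
    groups = {}
    for idx, g in enumerate(labels):
        groups.setdefault(g, []).append(idx)
    # Sums of squares computed directly from pairwise distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_among = ss_total - ss_within
    a = len(groups)
    f_stat = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f_stat, ss_among / ss_total


def permanova(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: how often shuffled labels give F >= observed F."""
    rng = random.Random(seed)
    f_obs, r2 = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)


# Toy data (invented): one value per animal, distance = absolute difference.
values = [0.0, 1.0, 2.0, 1.5, 10.0, 11.0, 12.0, 10.5, 20.0, 21.0, 22.0, 21.5]
labels = ["control"] * 4 + ["asymptomatic"] * 4 + ["symptomatic"] * 4
dist = [[abs(x - y) for y in values] for x in values]

f_obs, r2, p_value = permanova(dist, labels)
print(f"pseudo-F = {f_obs:.2f}, R2 = {r2:.3f}, p = {p_value:.3f}")
```

The point of the sketch is that every animal and every feature contributes to the test, and the null distribution comes from hundreds of label shuffles instead of a handful of manual regroupings.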

Best,
Justine

fabianomenegidio · October 10, 2019, 3:33pm

Thanks for the answer. I'm testing the Parkinson's Mouse Tutorial.

Soon I will make new comments.