How to compare taxonomy from a single sample against different groups

LuciaGG · November 12, 2020, 5:12pm

Greetings to all,

I am trying to analyze data from human fecal samples. From the taxonomic tables obtained, I need to compare a single sample against several groups of samples (each group corresponds to a group of people with a certain condition) and assess whether the single sample clusters with any group in particular (or whether it is more similar/dissimilar to a certain group). I also need to know which taxa are differentially abundant between the sample and each group.

So far, I have performed NMDS and PCoA, with ANOSIM to test the clusters.

My question is: which statistical test or tool should I use to assess which taxa are differentially abundant between the sample and each group?
Could ANCOM serve for this?

(I also tested LEfSe, but it does not accept a single sample as input for comparison.)

Any help would be appreciated, kind regards

Lucía

LuciaGG · November 19, 2020, 11:28am

Greetings to all,

I would like to know whether it is possible to use ANCOM to assess which taxa are differentially abundant between a single sample and a group of samples.

Any help would be appreciated, kind regards

Lucía

llenzi · November 19, 2020, 12:12pm

Hi @LuciaGG,

I merged your post with your previous one, because it seems to me they are on the same topic.
I understand it may take a long time to get an answer, but please be patient: moderators and other forum users are doing their best to help everyone asking here, but it may still take time!

On your specific question, I can try to help, but I am not a statistician, so I do not have the full picture.
However, as a general observation, comparing a single sample to one or more sample groups (each group containing n biological replicates) seems a very peculiar experimental design.
Could you please explain your experimental design in more detail (or why you ended up with only a single sample)? Is anything happening before or during the qiime2 analysis?
To me, your main problem is to assess how representative your single sample is of its own group! How do you know you are not dealing with an 'outlier'?
I don't know how informative comparing one sample against a sample group would be. I honestly cannot tell whether there is any statistical comparison tool that can deal with a sample size of 1 without violating its underlying assumptions.

Hope it helps
Luca

LuciaGG · December 22, 2020, 11:57am

Thank you for your response @llenzi.

I am aware of the limitations of the experimental design; however, I have been asked to compare the microbial composition of one individual against several groups of people with certain conditions. The objective is to assess whether the individual is more similar/dissimilar to a certain group, as I said, and which taxa are responsible for such differences/similarities.

So, regarding your questions:
"Is anything happening before or during the qiime2 analysis?"
It is happening before, since the objective was decided beforehand.

"How do you know you are not dealing with an ‘outlier’?"
The single sample belongs to an individual not related to the other groups. It is the problem sample, which has to be classified into one of the other groups.

I hope I have clarified things rather than repeating myself.

Kind regards

Nicholas_Bokulich · December 22, 2020, 3:34pm

Determining which group the individual resembles is relatively straightforward: you can use your beta diversity statistic of choice to see which group of samples that individual is most similar to.
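
As a rough illustration of the beta diversity idea, here is a minimal pure-Python sketch that computes the mean Bray-Curtis dissimilarity from one query sample to each group and reports the closest group. The group names, abundance values, and helper functions are all hypothetical, and Bray-Curtis is only one possible metric. In practice the distances would come from the distance matrix you already computed for the PCoA. This is a descriptive summary, not a statistical test.

```python
from statistics import mean


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Two all-zero vectors are treated as identical.
    return numerator / denominator if denominator else 0.0


def mean_distance_to_groups(query, groups):
    """Mean dissimilarity from the query sample to the samples of each group."""
    return {
        name: mean(bray_curtis(query, sample) for sample in samples)
        for name, samples in groups.items()
    }


# Hypothetical relative abundances of three taxa (made-up numbers).
groups = {
    "condition_A": [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]],
    "condition_B": [[0.1, 0.2, 0.7], [0.2, 0.2, 0.6]],
}
query = [0.55, 0.35, 0.10]  # the single "problem" sample

distances = mean_distance_to_groups(query, groups)
closest = min(distances, key=distances.get)
print(distances)
print("Closest group:", closest)
```

With these made-up values the query sample sits closest to condition_A. Comparing the query-to-group distances against the within-group distances of each group would give some sense of whether the sample falls inside a group's normal spread.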

An alternative might be to use q2-sample-classifier to train a classifier to predict the different conditions, then use it to classify the unknown sample. This will tell you the most likely group, the probabilities that the individual belongs to the different groups, and which taxa are most predictive of the different groups (though the latter may be a bit more difficult to interpret, depending on the method you use... see the tutorial for this plugin at qiime2.org for more details).

Identifying the differentially abundant taxa is less straightforward. As @llenzi pointed out, ANCOM is not appropriate here, since this or any statistical test requires adequate replication.

The best solution may be to make something like a PCA/PCoA biplot to answer both questions simultaneously. This will tell you which group that individual is closest to, AND the taxa that are associated with those differences. Check out DEICODE or look in the forum/documentation for more details on different types of biplots available in QIIME 2.