Which taxa are contributing most to the first distances

jjmmii · March 20, 2018, 11:06am

I calculated the first distances of a beta diversity metric. Some samples, over the course of weeks, have a greater first distance than others. I would like to know which taxa are contributing most to the first distances. Is it possible to do so in QIIME2 currently? If not, please point me to approaches that I can try. Thank you very much.

Nicholas_Bokulich · March 20, 2018, 1:10pm

This is a tricky but really great question! First distances do not necessarily indicate a directional change, so there may not be any taxa that drive these changes across all samples. But the suggestions below might lead you somewhere...
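To make the "no directional change" point concrete, here is a minimal sketch that assumes the beta diversity metric is Bray-Curtis (the thread does not say which metric was used). For Bray-Curtis, the distance between two consecutive samples splits into one term per taxon, so you can see which taxa account for a given first distance. The function name, taxa, and counts are all hypothetical.

```python
def bray_curtis_contributions(before, after):
    """Split the Bray-Curtis distance between two samples into per-taxon terms.

    before, after: dicts mapping taxon name -> count (or relative abundance).
    Returns (distance, signed), where signed maps each taxon to its signed
    share of the distance; the absolute values sum to the distance.
    """
    taxa = sorted(set(before) | set(after))
    total = sum(before.get(t, 0) + after.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    signed = {t: (after.get(t, 0) - before.get(t, 0)) / total for t in taxa}
    distance = sum(abs(v) for v in signed.values())
    return distance, signed


# Hypothetical counts for two subjects at consecutive time points.
subject_a = ({"Bacteroides": 60, "Prevotella": 30, "Lactobacillus": 10},
             {"Bacteroides": 30, "Prevotella": 60, "Lactobacillus": 10})
subject_b = ({"Bacteroides": 30, "Prevotella": 60, "Lactobacillus": 10},
             {"Bacteroides": 60, "Prevotella": 30, "Lactobacillus": 10})

dist_a, contrib_a = bray_curtis_contributions(*subject_a)
dist_b, contrib_b = bray_curtis_contributions(*subject_b)

for name, dist, contrib in [("A", dist_a, contrib_a), ("B", dist_b, contrib_b)]:
    print(name, round(dist, 3), {t: round(v, 3) for t, v in contrib.items()})
```

Both hypothetical subjects have the same first distance, but the taxa shift in opposite directions, which is why a large first distance alone does not point to a consistent driver across subjects.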

Check out q2-sample-classifier regress-samples: use the first distances as your --m-metadata-file and the name of the variable of interest as your --m-metadata-column. The output will tell you how well microbiota composition predicts first distances, and will rank features by order of importance. The most important features are the ones most predictive of first distances, so they are the best candidates for taxa that contribute most. See the tutorial for more details.
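The idea of ranking features by how well they predict first distances can be sketched outside QIIME 2. Note that this is a stand-in, not what regress-samples does: it ranks each taxon by the absolute Pearson correlation between its abundance and the first distance, one taxon at a time, rather than fitting a regressor on the whole table. It also assumes each first distance is keyed by the ID of the sample it was computed for. All sample IDs, taxa, and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_taxa(abundances, first_distances):
    """Rank taxa by |correlation| with first distance, strongest first.

    abundances: {sample_id: {taxon: abundance}}
    first_distances: {sample_id: first distance for that sample}
    Only samples present in both inputs are used.
    """
    samples = sorted(set(abundances) & set(first_distances))
    taxa = sorted({t for s in samples for t in abundances[s]})
    y = [first_distances[s] for s in samples]
    scores = {
        t: pearson([abundances[s].get(t, 0.0) for s in samples], y)
        for t in taxa
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical data: taxon_A tracks the first distances, taxon_B is constant.
first_distances = {"s1": 0.1, "s2": 0.2, "s3": 0.3, "s4": 0.4, "s5": 0.5}
abundances = {
    "s1": {"taxon_A": 0.1, "taxon_B": 0.5, "taxon_C": 0.3},
    "s2": {"taxon_A": 0.2, "taxon_B": 0.5, "taxon_C": 0.1},
    "s3": {"taxon_A": 0.3, "taxon_B": 0.5, "taxon_C": 0.4},
    "s4": {"taxon_A": 0.4, "taxon_B": 0.5, "taxon_C": 0.2},
    "s5": {"taxon_A": 0.5, "taxon_B": 0.5, "taxon_C": 0.3},
}

ranked = rank_taxa(abundances, first_distances)
for taxon, r in ranked:
    print(taxon, round(r, 3))
```

A per-taxon correlation ignores interactions between taxa and the compositional nature of the data, so treat it only as a quick sanity check alongside the importance scores from regress-samples.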

The most important features could then be used to build volatility charts, or to run LMEs, to visualize and test factors that impact the abundance of these features over time.

q2-gneiss might be another method to check out — it could also tell you what features are differentially abundant along a first distances gradient.

You could also fit an LME with first distances as your metric and select taxa as random effects, but if you want to go that route it would be best to discuss with a statistician how many random effects you could select, what metadata variables to include in the model, etc. Which taxa to select? Follow the steps above to figure out important features, or look at the taxa barplots of each subject over time and "use your gut" to figure out which taxa look like they differentiate these samples.

If there is a specific time point of interest to you, and first distances are different between metadata groups, you could also just run ANCOM on that individual time point to see what features are different between groups. Or run pairwise-differences with individual taxa of interest as the metric (e.g., important features from above) and the time points of interest that bracket that first distance as state_1 and state_2.
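As a rough sketch of what a paired comparison between state_1 and state_2 looks at, the snippet below computes, for each subject, the change in one taxon's abundance between the two bracketing time points (taken here as state_2 minus state_1). It only illustrates the quantity being compared and does not run any statistical test; the record layout, subject IDs, taxon, and values are hypothetical.

```python
def pairwise_differences(records, taxon, state_1, state_2):
    """Per-subject change in one taxon's abundance between two time points.

    records: list of dicts with keys "subject", "time", and one key per taxon.
    Returns {subject: abundance at state_2 - abundance at state_1}, keeping
    only subjects that were sampled at both time points.
    """
    by_subject = {}
    for record in records:
        if record["time"] in (state_1, state_2):
            by_subject.setdefault(record["subject"], {})[record["time"]] = record[taxon]
    return {
        subject: values[state_2] - values[state_1]
        for subject, values in by_subject.items()
        if state_1 in values and state_2 in values
    }


# Hypothetical relative abundances of one taxon at weeks 2 and 3.
records = [
    {"subject": "subj1", "time": 2, "Prevotella": 0.25},
    {"subject": "subj1", "time": 3, "Prevotella": 0.5},
    {"subject": "subj2", "time": 2, "Prevotella": 0.5},
    {"subject": "subj2", "time": 3, "Prevotella": 0.125},
    {"subject": "subj3", "time": 2, "Prevotella": 0.25},  # no week-3 sample
]

diffs = pairwise_differences(records, "Prevotella", state_1=2, state_2=3)
print(diffs)
```

Subjects missing either time point are dropped, since a paired difference is undefined for them.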

So as you can see, there are probably many different ways to start exploring this question.

I am exploring some future methods for q2-longitudinal that might address this problem, but only obliquely — as I explained above, first distances are really a special case!

I hope that helps!

jjmmii · March 21, 2018, 2:32am

Thank you very much, I’ll let you know if it works for me.
