Choosing two metadata columns for diversity analysis

I am running QIIME2-v2023.2 in a conda environment.

I am trying to run beta (and alpha) diversity analysis using the following command:

qiime diversity beta-group-significance \
  --i-distance-matrix core-metrics-results/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-column body-site \
  --o-visualization core-metrics-results/unweighted-unifrac-body-site-significance.qzv

However, the above command only compares samples based on one column, i.e. body-site. Is it possible to run the analysis on samples grouped by two columns, i.e. body-site AND treatment?

Hi @macrobiome,

For beta diversity, you can use the qiime diversity adonis action, which accepts R-style formulas in its formula parameter. Note, however, that this implementation of adonis tests the terms sequentially, meaning the order of your variables matters.

For alpha diversity, you can run a linear mixed effects model using the qiime longitudinal plugin's linear-mixed-effects action. See here for a more thorough example.

Hope that helps!

Thanks @Mehrbod_Estaki!

Sorry, I am new to bioinformatics and coding in general. Is there a simpler way of doing this? Can I go back to the previous steps, create a filtered table with only the samples I want, and go from there?

Thank you!

Hi @macrobiome,
Can you clarify exactly what it is you are trying to do? Your initial question read to me like you were trying to add co-variables to your formula (diversity ~ body-site + treatment). However, your recent inquiry is about filtering your table to retain only certain samples, which you can of course do pretty easily (see full tutorial here). Specifically, you'll want to use the qiime feature-table filter-samples action. If you can let us know exactly what you are trying to do, then we can point you towards the right workflow.

Also, a couple of examples of my original answer are below. @jwdebelius pointed out that the q2-longitudinal plugin has a regular anova action, which may be more straightforward in this case than doing a full LME. Thanks Justine!

For alpha diversity:

qiime longitudinal anova \
  --m-metadata-file sample-metadata.tsv \
  --m-metadata-file core-metrics-results/faith_pd_vector.qza \
  --p-formula 'faith_pd ~ body-site + treatment' \
  --o-visualization site_treatment_faith_anova.qzv

And for beta diversity, for example:

qiime diversity adonis \
  --i-distance-matrix core-metrics-results/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file sample-metadata.tsv \
  --p-formula bodysite+treatment \
  --o-visualization core-metrics-results/unweighted_adonis.qzv
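
Since adonis tests the terms sequentially, one way to see how much the order matters for your data is to run the same model twice with the terms swapped and compare the two tables. The following is only a sketch, not a prescribed workflow: the function name and the two output file names are made up for illustration, and the column names bodysite and treatment are copied from the command above, so swap in your own. It is wrapped in a function so that nothing runs until you call it.

```shell
# Sketch: run adonis with both term orders to compare the results.
# Output file names are placeholders chosen for this example.
adonis_both_orders() {
  dm=core-metrics-results/unweighted_unifrac_distance_matrix.qza

  # Order 1: bodysite is tested first, then treatment.
  qiime diversity adonis \
    --i-distance-matrix "$dm" \
    --m-metadata-file sample-metadata.tsv \
    --p-formula "bodysite+treatment" \
    --o-visualization core-metrics-results/adonis_bodysite_first.qzv

  # Order 2: treatment is tested first, then bodysite.
  qiime diversity adonis \
    --i-distance-matrix "$dm" \
    --m-metadata-file sample-metadata.tsv \
    --p-formula "treatment+bodysite" \
    --o-visualization core-metrics-results/adonis_treatment_first.qzv
}
```

Run adonis_both_orders from the directory that holds your metadata file and core-metrics-results folder, then open both .qzv files side by side.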

Thank you! I will try those options. Sorry about the confusion regarding the filtering point. I was thinking out loud and realized it would not work.

Will keep you guys posted on how it goes. Thanks!

Hi @Mehrbod_Estaki!

I tried using the anova code and it didn't give me quite what I was looking for. I will try explaining a little more below:

Let's say I have microbiome data from different sources, e.g. tumor and feces, from 6 mice that were each given Drug A, Drug B, or Drug C (n=2 per drug).

The metadata file would look something like this:

Mouse    Source  Treatment
Mouse 1  Tumor   Drug A
Mouse 2  Tumor   Drug A
Mouse 3  Tumor   Drug B
Mouse 4  Tumor   Drug B
Mouse 5  Tumor   Drug C
Mouse 6  Tumor   Drug C
Mouse 1  Feces   Drug A
Mouse 2  Feces   Drug A
Mouse 3  Feces   Drug B
Mouse 4  Feces   Drug B
Mouse 5  Feces   Drug C
Mouse 6  Feces   Drug C

Now, when I run the core-metrics analysis in the usual way, I can only choose between Source and Treatment. However, I want to do comparisons on part of the data, i.e. compare the tumor samples between the different drugs. So I only want samples where the source is Tumor (and likewise for the other analyses).

Hi @macrobiome,

Can you please expand on this? What visualization or artifact exactly are you looking at? What did you expect, or better yet, what do you want exactly?

The diversity core-metrics action does not have a parameter that allows you to select a column, so I'm guessing you are talking about the output visualization artifact of the alpha-diversity results?

Did you try filtering your data based on your metadata to retain only the samples you want? I provided a link to the tutorial on how to do this kind of filtering in my last post. Follow that, and then re-run core-metrics (or anova, or whatever test you wanted to run).
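
As a rough sketch of what that could look like for the tumor-only comparison: the column names Source and Treatment come from your example metadata, while table.qza, rooted-tree.qza, the output names, and the function name are placeholders I am assuming here, so replace them with your own files. The sampling depth is left as an argument because it should be chosen for your filtered table. Nothing runs until you call the function.

```shell
# Sketch: keep only tumor samples, re-run core metrics, compare treatments.
# File names below are assumptions; adjust them to your own artifacts.
tumor_only_analysis() {
  depth="$1"   # rarefaction depth you pick for the tumor-only table

  # 1. Keep only the samples whose Source column is Tumor.
  qiime feature-table filter-samples \
    --i-table table.qza \
    --m-metadata-file sample-metadata.tsv \
    --p-where "[Source]='Tumor'" \
    --o-filtered-table tumor-table.qza

  # 2. Re-run core metrics on the filtered table.
  qiime diversity core-metrics-phylogenetic \
    --i-phylogeny rooted-tree.qza \
    --i-table tumor-table.qza \
    --p-sampling-depth "$depth" \
    --m-metadata-file sample-metadata.tsv \
    --output-dir tumor-core-metrics-results

  # 3. Compare the drugs within the tumor samples only.
  qiime diversity beta-group-significance \
    --i-distance-matrix tumor-core-metrics-results/unweighted_unifrac_distance_matrix.qza \
    --m-metadata-file sample-metadata.tsv \
    --m-metadata-column Treatment \
    --p-pairwise \
    --o-visualization tumor-core-metrics-results/unweighted-unifrac-treatment-significance.qzv
}
```

You would call it as tumor_only_analysis followed by your chosen depth. For the feces samples, the same steps apply with "[Source]='Feces'" and different output names.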
