I am trying to perform differential abundance analysis using qiime2-2023.5 and composition ancombc.
My metadata is something like this:
Phenotype, Bodysite
A, L
A, M
B, L
B, M
C, L
C, M
I want to compare A,L vs A,M and B,L vs B,M. I used the following parameters
I think there are a couple of ways to approach this, depending on some details of the exact comparisons you're trying to make!
Within ANCOM-BC, all possible group comparisons will be made (based on the chosen formula), and they will be displayed in reference to a particular intercept for each column used within the formula. So, to provide a simple example - if we had the following column/groups:
Column: bodysite
Groups: gut, tongue, left palm, right palm
Let's say we'd like to examine the differences between each group in this column, and we'd like to look at those differences in reference to the 'tongue' group. We'd run ancombc with the --p-formula parameter set to 'bodysite' and --p-reference-levels set to 'bodysite::tongue'. This will provide us with those comparisons, all with respect to 'bodysite::tongue' (i.e. log-fold change for 'bodysitegut', 'bodysiteleft palm', and 'bodysiteright palm').
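To make the reference-level idea concrete, here is a minimal Python sketch (not part of QIIME 2) that lists which pairwise comparisons you would expect for a column once a reference group is chosen, and builds the 'column::level' string used for --p-reference-levels. The function name is hypothetical and only illustrates the logic described above.

```python
def comparisons_against_reference(column, levels, reference):
    """Return the reference-level string and the (group, reference) pairs.

    Illustrative helper only: it mirrors the description in this thread,
    not the plugin's internal implementation.
    """
    if reference not in levels:
        raise ValueError(f"{reference!r} is not a level of {column!r}")
    # Format used in this thread for --p-reference-levels: column::level
    reference_arg = f"{column}::{reference}"
    # Every non-reference group is compared against the reference group.
    pairs = [(level, reference) for level in levels if level != reference]
    return reference_arg, pairs


ref_arg, pairs = comparisons_against_reference(
    "bodysite", ["gut", "tongue", "left palm", "right palm"], "tongue"
)
print(ref_arg)
for group, ref in pairs:
    print(f"{group} vs {ref}")
```

Note that no gut vs left palm comparison appears: everything is expressed relative to the one reference group.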
So what I'm seeing in the metadata columns you've provided is:
Three groups in the 'Phenotype' column (A, B, C)
Two groups in the 'Bodysite' column (L, M)
From what you've described, it sounds like you'd like to examine the interaction between Phenotype and Bodysite, is that correct? Assuming that's the case, you have several options for what to provide in the formula parameter (all of which require some understanding of linear regression models). You'll want to be clear on whether these are dependent or independent variables, and on whether you'd like to examine the effect of one, the effect of the other, and/or their interaction.
This course lecture on interaction terms in linear regression models may be helpful if these are new concepts. Cheers
Thanks for your patience! Happy to go through the comparisons you mentioned below:
Phenotype A at bodysite L versus Phenotype A at bodysite M
Phenotype B at bodysite L versus Phenotype B at bodysite M
Phenotype C at bodysite L versus Phenotype C at bodysite M
The best way to do each of these comparisons will be to filter down your feature table three different times - each time, filtering on the Phenotype you'd like to compare between each bodysite. This would look something like:
This would be the same for each phenotype: you would just change the value in the where parameter to Phenotype B, and then to Phenotype C, to create the other two filtered tables.
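As a rough sketch of the three filtering steps, the Python snippet below builds one `qiime feature-table filter-samples` call per phenotype as an argument list. The file names (table.qza, metadata.tsv, and the output names) are placeholders I made up, and the column name 'Phenotype' is taken from the metadata header in the original question, so adjust both to your data.

```python
def filter_by_phenotype(phenotype, table="table.qza", metadata="metadata.tsv"):
    """Build a filter-samples command that keeps one phenotype.

    File names are hypothetical placeholders; nothing is executed here.
    """
    return [
        "qiime", "feature-table", "filter-samples",
        "--i-table", table,
        "--m-metadata-file", metadata,
        # SQLite-style WHERE clause on the metadata column
        "--p-where", f"[Phenotype]='{phenotype}'",
        "--o-filtered-table", f"table-phenotype-{phenotype}.qza",
    ]


# One filtered table per phenotype: A, B and C.
filter_commands = {p: filter_by_phenotype(p) for p in ("A", "B", "C")}
```

Each list can be handed to `subprocess.run(cmd, check=True)` from an environment where QIIME 2 is active, or simply used as a template for typing the commands by hand.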
From here you can run ANCOM-BC on each of these filtered tables, using bodysite in the formula parameter. The only thing to be aware of is that one group within your bodysite groups will be selected as the reference (i.e. the differential abundance will be calculated with respect to a particular group). By default, the selected reference level will be chosen in alphabetical order, but you can specify another group if you prefer. So for your data, bodysite L would be the default reference level selected, but you could choose bodysite M if you prefer.
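Here is a small Python sketch of what the ANCOM-BC call on one filtered table could look like, including the choice of reference level. The helper name, the file names, and the column name 'Bodysite' (taken from the metadata header in the question) are assumptions, and the alphabetical fallback only mimics the default behaviour described above rather than reproducing the plugin's internals.

```python
def ancombc_command(table, levels=("L", "M"), reference=None,
                    column="Bodysite", metadata="metadata.tsv"):
    """Build a `qiime composition ancombc` argument list (not executed).

    If no reference is given, fall back to the alphabetically first level,
    which is how the default is described in this thread.
    """
    if reference is None:
        reference = sorted(levels)[0]
    if reference not in levels:
        raise ValueError(f"{reference!r} is not one of {levels!r}")
    return [
        "qiime", "composition", "ancombc",
        "--i-table", table,
        "--m-metadata-file", metadata,
        "--p-formula", column,
        "--p-reference-levels", f"{column}::{reference}",
        # Hypothetical output name derived from the reference level
        "--o-differentials", f"ancombc-ref-{reference}.qza",
    ]


default_cmd = ancombc_command("table-phenotype-A.qza")
m_ref_cmd = ancombc_command("table-phenotype-A.qza", reference="M")
```

With the default, results are reported for bodysite M relative to L; passing reference="M" flips the direction of the comparison.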
Phenotype A vs B vs C at bodysite L
Phenotype A vs B vs C at bodysite M
These comparisons would be made through a 'global test', which is a parameter not currently exposed in QIIME 2's implementation of ANCOM-BC. If you're interested in utilizing this parameter and are familiar with R, you could run the R implementation of ANCOM-BC (which is what our implementation wraps, just without some of its functionality). Information on that R package can be found here:
Hi @lizgehret
Happy new year.
I really appreciate your detailed response to my issue.
For comparing body-sites for a given phenotype:
I will generate three sub-tables, one for each phenotype.
I think I can run this analysis in qiime2?
How do I address taxa levels? Shall I collapse sub-tables to Level 2, 5, 6, 7 and then analyze with qiime2?
For comparing phenotype at a body-site:
The analysis should be run in R using the existing table. Correct?
Hello!
In case @lizgehret is on vacation, I will jump in and try my best to address the questions.
First of all, happy New Year!
As I understand it, you have two sample types that should be compared against each other. In that case, yes, you can run it separately for each phenotype within QIIME 2.
You can run it at the ASV level or at any taxonomy level of your choice. For a taxonomy level, you need to collapse the table to the desired level before running ANCOM-BC.
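In QIIME 2 the collapsing itself is done with `qiime taxa collapse` and its `--p-level` parameter. To show what collapsing means, here is a simplified pure-Python sketch on made-up toy data: counts of all features sharing the same first N taxonomy ranks are summed. The data structures and names are hypothetical, and the sketch skips details the real command handles (for example, features annotated to fewer ranks than the requested level).

```python
from collections import defaultdict


def collapse(table, taxonomy, level):
    """Sum feature counts per sample over the first `level` taxonomy ranks.

    table:    {feature_id: {sample_id: count}}
    taxonomy: {feature_id: "k__...; p__...; c__..."}
    """
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, counts in table.items():
        ranks = [rank.strip() for rank in taxonomy[feature_id].split(";")]
        # Features with fewer ranks than `level` keep their shorter label here.
        key = ";".join(ranks[:level])
        for sample_id, count in counts.items():
            collapsed[key][sample_id] += count
    return {taxon: dict(counts) for taxon, counts in collapsed.items()}


# Toy data, invented purely for illustration.
toy_table = {
    "asv1": {"s1": 5, "s2": 0},
    "asv2": {"s1": 3, "s2": 4},
    "asv3": {"s1": 1, "s2": 2},
}
toy_taxonomy = {
    "asv1": "k__Bacteria; p__Firmicutes; c__Bacilli",
    "asv2": "k__Bacteria; p__Firmicutes; c__Clostridia",
    "asv3": "k__Bacteria; p__Bacteroidetes; c__Bacteroidia",
}

phylum_table = collapse(toy_table, toy_taxonomy, level=2)
```

At level 2 the two Firmicutes ASVs merge into one row, while at level 3 all three features would stay separate.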
If your goal is to perform A vs B vs C comparisons, then you should follow @lizgehret's advice from the earlier comments and run it in R. If you have one phenotype that can be chosen as a reference for the two others, then you can run it in QIIME 2.
I will collapse phenotype tables to appropriate levels and analyze with qiime2.
For the second part, comparing A vs B vs C, I can assign a reference phenotype in "--p-reference-levels".
How do I specify the body-site? (I want to compare A vs B vs C at body-site M and so on).
As per my understanding, I should use --p-formula 'bodysite * animal'. Is this correct?
Or shall I filter the table to generate sub-tables for each body site?
In my experience (not a rule!), body sites have different microbiome profiles, so I would separate the tables by sample type as well, and then compare phenotypes either all vs all in R or against a reference in QIIME 2.
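A possible sketch of that per-body-site workflow for the "against a reference" case is below: for each body site, one command filters the table to that site and a second runs ANCOM-BC with Phenotype in the formula. The helper name and all file names are hypothetical, the column names come from the metadata header in the first post, and choosing phenotype A as the reference is only an example.

```python
def per_bodysite_commands(bodysite, reference_phenotype="A",
                          table="table.qza", metadata="metadata.tsv"):
    """Return (filter_cmd, ancombc_cmd) argument lists for one body site.

    Nothing is executed; file names are placeholders.
    """
    filtered = f"table-bodysite-{bodysite}.qza"
    filter_cmd = [
        "qiime", "feature-table", "filter-samples",
        "--i-table", table,
        "--m-metadata-file", metadata,
        "--p-where", f"[Bodysite]='{bodysite}'",
        "--o-filtered-table", filtered,
    ]
    ancombc_cmd = [
        "qiime", "composition", "ancombc",
        "--i-table", filtered,
        "--m-metadata-file", metadata,
        "--p-formula", "Phenotype",
        "--p-reference-levels", f"Phenotype::{reference_phenotype}",
        "--o-differentials", f"ancombc-bodysite-{bodysite}.qza",
    ]
    return filter_cmd, ancombc_cmd


# One (filter, ancombc) pair per body site.
workflow = {site: per_bodysite_commands(site) for site in ("L", "M")}
```

With A as the reference this yields B vs A and C vs A at each body site; a B vs C comparison would need a second run with a different reference, or the R route mentioned earlier.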