ANCOM-BC formula and ref-levels for interactions

kindergarten · December 8, 2023, 6:51pm

Hi

I am trying to run a differential abundance analysis with qiime composition ancombc in qiime2-2023.5.

My metadata is something like this
Phenotype, Bodysite
A, L
A, M
B, L
B, M
C, L
C, M
I want to compare A,L vs A,M and B,L vs B,M. I used the following parameters:

--p-formula 'Phenotype + Bodysite' \
--p-reference-levels Phenotype::A Bodysite::L \

The output columns looked like this:
Intercept PhenotypeB PhenotypeC BodysiteM
I believe PhenotypeB is Phenotype B with bodysites L and M combined. Am I correct?

How do I get the desired outcome? e.g. PhenotypeA at bodysiteL vs PhenotypeA at bodysiteM.

Thanks

colinbrislawn · December 8, 2023, 7:00pm

One option is to make a new metadata column:

Phenotype, Bodysite, Combined
A, L, AatBodysiteL
A, M, AatBodysiteM
B, L, BatBodysiteL

Then use --p-formula 'Combined'.

There may be a better way using a powerful R formula, but I'm not sure what ANCOM-BC supports. Let's see what others suggest!
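A minimal Python sketch of building that Combined column, assuming a tab-separated metadata file with Phenotype and Bodysite columns. The function name add_combined_column, the sample IDs, and the inline data are hypothetical. A metadata file that carries a #q2:types row would need that row handled separately, which this sketch does not do.

```python
import csv
import io


def add_combined_column(tsv_text, first="Phenotype", second="Bodysite", new="Combined"):
    """Return TSV text with an extra column joining two metadata columns."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames) + [new]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # e.g. Phenotype "A" and Bodysite "L" become "AatBodysiteL"
        row[new] = f"{row[first]}at{second}{row[second]}"
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample IDs, for illustration only
example = "sample-id\tPhenotype\tBodysite\ns1\tA\tL\ns2\tA\tM\ns3\tB\tL\n"
combined = add_combined_column(example)
print(combined)
```

With a single Combined column, every group is compared against one reference group. Using Combined::AatBodysiteL as the reference level would give A at M vs A at L directly, but B at M vs B at L would need a second run with Combined::BatBodysiteL as the reference.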

lizgehret · December 8, 2023, 8:30pm

Hey @kindergarten,

I think there are a couple of ways to approach this, depending on some details of the exact comparisons you're trying to make!

Within ANCOM-BC, the group comparisons are determined by the chosen formula: for each column used in the formula, every other group is compared against a single reference group (the reference level), and the results are reported relative to that reference. So, to provide a simple example - if we had the following column/groups:

bodysite
gut
tongue
left palm
right palm

Let's say we'd like to examine the differences between each group in this column, and we'd like to look at those differences in reference to the 'tongue' group. We'd run ancombc with the --p-formula parameter set to 'bodysite' and --p-reference-levels set to 'bodysite::tongue'. This will provide us with those comparisons, all with respect to the 'tongue' group (i.e. log-fold changes for 'bodysitegut', 'bodysiteleft palm', and 'bodysiteright palm').
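To make the naming of those output columns concrete, here is a small Python sketch of treatment (dummy) coding as R-style model formulas apply it: the reference level is absorbed into the intercept, and every other level gets a column named by joining the column name and the level. The helper coefficient_names is hypothetical and only mimics the naming; it does not call ANCOM-BC.

```python
def coefficient_names(column, levels, reference=None):
    """Names of the coefficients a treatment-coded factor contributes."""
    ordered = sorted(levels)
    # Without an explicit reference, the alphabetically first level is used
    ref = reference if reference is not None else ordered[0]
    if ref not in ordered:
        raise ValueError(f"{ref!r} is not a level of {column!r}")
    return [f"{column}{level}" for level in ordered if level != ref]


sites = ["gut", "tongue", "left palm", "right palm"]
names = coefficient_names("bodysite", sites, reference="tongue")
print(names)

# The same logic applied to the formula 'Phenotype + Bodysite' from the question
question = (coefficient_names("Phenotype", ["A", "B", "C"], "A")
            + coefficient_names("Bodysite", ["L", "M"], "L"))
print(["Intercept"] + question)
```

The second call reproduces the columns reported in the original question (Intercept, PhenotypeB, PhenotypeC, BodysiteM): each named column is a comparison against the reference level of its own variable.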

So what I'm seeing in the metadata columns you've provided is:

Three groups in the 'Phenotype' column (A, B, C)
Two groups in the 'Bodysite' column (L, M)

From what you've described, it sounds like you'd like to examine the interaction between Phenotype at each Bodysite group, is that correct? Assuming that's the case, you have several options for what to provide in the formula parameter (all of which require some understanding of linear regression models). You'll want to understand whether these are dependent or independent variables, and whether you'd like to examine the effect of each variable on its own and/or the interaction between them.
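As a general linear-regression illustration (not a statement about what ANCOM-BC or its QIIME 2 wrapper accepts), the Python sketch below shows why the distinction matters. In an additive model such as 'Phenotype + Bodysite', the M-vs-L difference is one shared number for all phenotypes; with interaction terms, each non-reference phenotype gets its own adjustment. The coefficient values are made up, and the helper bodysite_effect is hypothetical.

```python
def bodysite_effect(coefs, phenotype):
    """M-vs-L difference within one phenotype, with A and L as reference levels."""
    effect = coefs["BodysiteM"]
    # The interaction coefficient only exists for non-reference phenotypes
    effect += coefs.get(f"Phenotype{phenotype}:BodysiteM", 0.0)
    return effect


# Made-up coefficient values, for illustration only (not real results)
additive = {"PhenotypeB": 0.5, "PhenotypeC": -0.25, "BodysiteM": 1.0}
interaction = {**additive, "PhenotypeB:BodysiteM": -0.5, "PhenotypeC:BodysiteM": 2.0}

for label, coefs in (("additive", additive), ("interaction", interaction)):
    effects = {p: bodysite_effect(coefs, p) for p in "ABC"}
    print(label, effects)
```

With these made-up numbers, the additive model gives the same M-vs-L difference (1.0) for A, B and C, while the interaction model gives 1.0, 0.5 and 3.0. This is why an additive formula alone cannot answer "A at L vs A at M" separately from "B at L vs B at M".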

This course lecture on interaction terms in linear regression models may be helpful if these are new concepts. Cheers

kindergarten · December 9, 2023, 8:16pm

hi @lizgehret

Thanks for the detailed explanation.

So what I'm seeing in the metadata columns you've provided is:

Three groups in the 'Phenotype' column (A, B, C)
Two groups in the 'Bodysite' column (L, M)

The above is correct.

From what you've described, it sounds like you'd like to examine the interaction between Phenotype at each Bodysite group, is that correct?

I want to compare

Phenotype A at bodysite L versus Phenotype A at bodysite M
Phenotype B at bodysite L versus Phenotype B at bodysite M
Phenotype C at bodysite L versus Phenotype C at bodysite M
Phenotype A vs B vs C at bodysite L
Phenotype A vs B vs C at bodysite M

To be honest, I read the lecture in the provided link twice and could not follow the math. If it is critical, I will have to learn it.

lizgehret · December 14, 2023, 5:23pm

Hi @kindergarten,

Thanks for your patience here! I've gotten tied up with some other things this week but will follow up with you as soon as I am able.

Cheers

lizgehret · December 14, 2023, 9:44pm

Hi @kindergarten,

Thanks for your patience! Happy to go through the comparisons you mentioned below:

Phenotype A at bodysite L versus Phenotype A at bodysite M
Phenotype B at bodysite L versus Phenotype B at bodysite M
Phenotype C at bodysite L versus Phenotype C at bodysite M

The best way to do each of these comparisons is to filter your feature table three times, once per Phenotype, so that each filtered table contains only the samples from one Phenotype and the two bodysites can be compared within it. This would look something like:

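# --p-where is evaluated against the sample metadata, so the metadata file is passed as well
# (your_metadata.tsv is a placeholder name)
qiime feature-table filter-samples \
--i-table your_feature_table.qza \
--m-metadata-file your_metadata.tsv \
--p-where "[Phenotype]='A'" \
--o-filtered-table your_filtered_table_A.qza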

This would be the same for each phenotype - you would just change the --p-where value to "[Phenotype]='B'" and then "[Phenotype]='C'" to create the other two filtered tables.

From here you can run ANCOM-BC on each of these filtered tables, using Bodysite in the formula parameter. The only thing to be aware of is that one of your bodysite groups will be selected as the reference (i.e. the differential abundance will be calculated with respect to that group). By default, the reference level is the first group in alphabetical order, but you can specify another group if you prefer. So for your data, bodysite L would be the default reference level, but you could choose bodysite M instead.
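The filter-then-test steps above can be sketched as a small Python dry run that builds the command lines for all three phenotypes. The function per_group_commands and all file names are hypothetical placeholders. The flags not shown in this thread (--m-metadata-file, --o-differentials) are written from memory of the QIIME 2 command line and should be checked against qiime composition ancombc --help before running anything.

```python
import shlex


def per_group_commands(table, metadata, split_column, groups, formula, reference):
    """Build filter-samples and ancombc argument lists for each group."""
    commands = []
    for group in groups:
        filtered = f"filtered_table_{group}.qza"
        commands.append([
            "qiime", "feature-table", "filter-samples",
            "--i-table", table,
            "--m-metadata-file", metadata,
            "--p-where", f"[{split_column}]='{group}'",
            "--o-filtered-table", filtered,
        ])
        commands.append([
            "qiime", "composition", "ancombc",
            "--i-table", filtered,
            "--m-metadata-file", metadata,
            "--p-formula", formula,
            "--p-reference-levels", reference,
            "--o-differentials", f"ancombc_{group}.qza",
        ])
    return commands


# File names are placeholders; replace them with your own artifacts
cmds = per_group_commands(
    "your_feature_table.qza", "your_metadata.tsv",
    split_column="Phenotype", groups=["A", "B", "C"],
    formula="Bodysite", reference="Bodysite::L",
)
for cmd in cmds:
    # Dry run: print only. subprocess.run(cmd, check=True) would execute it.
    print(shlex.join(cmd))
```

Swapping the roles of the two columns (split_column="Bodysite", formula="Phenotype", reference="Phenotype::A") gives the per-bodysite variant of the same workflow.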

Phenotype A vs B vs C at bodysite L
Phenotype A vs B vs C at bodysite M

These comparisons would be made through a 'global test', which is a parameter not currently exposed in QIIME 2's implementation of ANCOM-BC. If you're interested in utilizing this parameter and are familiar with R, you could run the R implementation of ANCOM-BC directly (our implementation wraps the R package but does not expose all of its functionality). Information on that R package can be found here, if you're interested:

https://github.com/FrederickHuangLin/ANCOMBC

Hope this helps! Cheers

kindergarten · January 2, 2024, 6:50pm

Hi @lizgehret
Happy new year.
I really appreciate your detailed response to my issue.

For comparing body-sites for a given phenotype:
I will generate three sub-tables, one for each phenotype.
I think I can run this analysis in qiime2?
How do I address taxa levels? Should I collapse the sub-tables to levels 2, 5, 6 and 7 and then analyze them with qiime2?

For comparing phenotype at a body-site:
The analysis should be run in R using the existing table. Correct?

Thanks
Gurjit

timanix · January 3, 2024, 5:41pm

Hello!
In case @lizgehret is on vacation, I will jump in and try my best to address the question.

First of all, happy New Year!

As I understand it, you have two sample types that should be compared against each other. In that case, yes, you can run it separately for each phenotype within QIIME 2.

You can run it at the ASV level or at any taxonomy level of your choice. For a taxonomy level, you need to collapse the table to the desired level before running ANCOM-BC.
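A minimal Python sketch of the collapse step, building one qiime taxa collapse command per level for the levels mentioned in the question. The helper collapse_commands and the file names are hypothetical placeholders, and the commands are only printed, not executed.

```python
def collapse_commands(table, taxonomy, levels):
    """Build one `qiime taxa collapse` argument list per taxonomic level."""
    commands = []
    for level in levels:
        commands.append([
            "qiime", "taxa", "collapse",
            "--i-table", table,
            "--i-taxonomy", taxonomy,
            "--p-level", str(level),
            "--o-collapsed-table", f"collapsed_L{level}_{table}",
        ])
    return commands


# Placeholder file names; the levels are the ones asked about above
collapsed = collapse_commands("your_filtered_table_A.qza", "your_taxonomy.qza", [2, 5, 6, 7])
for cmd in collapsed:
    print(" ".join(cmd))
```

Each collapsed table would then go into its own ANCOM-BC run, so three phenotype tables at four levels means twelve runs in total.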

If your goal is to perform A vs B vs C comparisons, then you should follow @lizgehret's advice from the earlier comments and run it in R. If you have one phenotype that can be chosen as a reference for the two others, then you can run it in QIIME 2.
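To illustrate what a single reference level does and does not cover, here is a short Python sketch that enumerates the pairwise comparisons among A, B and C. The helper covered_pairs is hypothetical and only counts pairs; it says nothing about the statistics.

```python
from itertools import combinations


def covered_pairs(levels, reference):
    """Pairs reported when every other level is compared against one reference."""
    return {frozenset((reference, level)) for level in levels if level != reference}


phenotypes = ["A", "B", "C"]
all_pairs = {frozenset(pair) for pair in combinations(phenotypes, 2)}

with_ref_a = covered_pairs(phenotypes, "A")
missing = all_pairs - with_ref_a
print(sorted(sorted(pair) for pair in missing))

# A second run with B as the reference supplies the remaining pair
both_runs = with_ref_a | covered_pairs(phenotypes, "B")
print(both_runs == all_pairs)
```

With A as the reference, B vs A and C vs A are reported but B vs C is not; a second run with B as the reference would fill that gap. Note that two reference-based runs are still pairwise comparisons and are not the same thing as the global test mentioned earlier in the thread.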

Hope it helps.
Best,

kindergarten · January 3, 2024, 6:14pm

Hi @timanix
Thanks for the prompt response.

I will collapse phenotype tables to appropriate levels and analyze with qiime2.

For the second part, comparing A vs B vs C, I can assign a reference phenotype in "--p-reference-levels".
How do I specify the body-site? (I want to compare A vs B vs C at body-site M and so on).
As per my understanding, I should use --p-formula 'bodysite * animal'. Is this correct?
Or shall I filter the table to generate sub-tables for each body site?

Thanks
Gurjit

timanix · January 3, 2024, 6:21pm

In my experience (not a rule!), body sites have different microbiome profiles, so I would also split the tables by sample type and then compare phenotypes either all vs all in R or against a reference in QIIME 2.
