ANCOM using a continuous metadata column


Is there any way to use ANCOM within QIIME2 to assess whether features differ in abundance based on a continuous variable (rather than a categorical variable that compares feature abundance between two or more groups)? For example, I would essentially like to use a continuous numeric variable in place of the parameter --m-metadata-column. It seems from the ANCOM paper referenced in one of the tutorials that fitting a model using a continuous variable is indeed possible, but I’m unsure how to implement that within QIIME2. I apologize if this is being posted in the wrong category!

Thank you!


Hi, @Dot! Welcome to the forum! :wave:

Can you point to specific quotes from the paper you are referring to?

As far as QIIME 2 is concerned, you can treat a numeric metadata column as categorical by specifying the column type as categorical. Our handy dandy metadata tutorial will clarify how to do that.

Disclaimer: I’ve never done this before, so take this advice with a grain of salt.

With this feature, you should be able to discretize your data (as they did in the ANCOM paper) by lumping your data into bins representing slices of that continuous variable. This would require that you edit your metadata to support such discretization.

For example, if you have a numeric column with possible values 1-10 and you want to discretize it into 5 categories, you could create 5 new categorical metadata columns indicating which category each sample belongs to (1-2, 3-4, 5-6, 7-8, or 9-10). You could then run ANCOM once for each group. For reference, the Parkinson’s Mice tutorial demonstrates how to run ANCOM on two different groups and compare the results.
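
Here is a rough pandas sketch of the binning step, in case it helps. The sample IDs, the column name `score`, and the new column name `score_bin` are all made up for illustration. Note that it writes the bin label into a single categorical column rather than five separate columns, which is a simplification on my part, so adapt it to whichever layout you end up using.

```python
import pandas as pd

# Hypothetical metadata: ten samples with a numeric "score" from 1 to 10.
metadata = pd.DataFrame(
    {"score": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]},
    index=pd.Index([f"sample-{i}" for i in range(1, 11)], name="sample-id"),
)

# Bin edges are right-inclusive by default: (0, 2], (2, 4], ..., (8, 10].
bin_edges = [0, 2, 4, 6, 8, 10]
bin_labels = ["1-2", "3-4", "5-6", "7-8", "9-10"]

# pd.cut assigns each value to its bin; store the labels as plain strings.
metadata["score_bin"] = pd.cut(
    metadata["score"], bins=bin_edges, labels=bin_labels
).astype(str)

# Tab-separated text that could be saved as a new metadata file.
print(metadata.to_csv(sep="\t"))
```

Labels such as "1-2" are not parseable as numbers, so I would expect the new column to be read as categorical, but it is worth checking against the metadata tutorial before passing it to --m-metadata-column.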

Hi @andrewsanchez ! Thank you for your response! :grinning:

Yes, in the legend for Figure 3 it says:

"The third row provides the mean OTU relative abundance for Bacilli against categories of breast milk variable and for Clostridia against categories of ‘Days on antibiotics’. Although, as in LaRosa et al. (16), ‘Day of life’ and ‘Days on antibiotics’ were analyzed as continuous variables, for simplicity of plotting in this figure they were discretized."

They also say right before Figure 3:

For plotting purposes, we discretized days on antibiotics into four categories.

…so I assumed the ANCOM analysis was performed on continuous variables in these cases and only plotted in discretized form afterwards?

I’ll try what you suggested and discretize the data I have. Thanks!