ANCOM--CLR vs ALR

audreyduff · April 17, 2023, 10:00pm

Hello all,

This may end up being moved to General Discussion, but I had a question regarding log transformations performed by the ANCOM plugin. Are these transformations truly centered log ratio transformations (clr) or are they additive log ratios (alr)? If I understand correctly, clr uses the geometric mean as the reference and alr uses a particular component as a reference.

I've come across papers stating that ANCOM performs alr transformations (Gloor et al., Nearing et al., Hu et al.), whereas differential abundance tests like ALDEx and ALDEx2 use clr. However, the QIIME2 Parkinson's tutorial specifically states that the centered log transformation is used with ANCOM, as do some forum posts like this one, and the text options for the ANCOM plugin's "--p-transform-function" parameter include "clr" but not "alr". The original paper by Mandal et al. does not explicitly say centered log transformations were used, but it does mention compositional log transformations, which would also be abbreviated clr...

I'm far from a statistician or mathematician, but would like to better understand this discrepancy. Could anyone provide some clarity on what's actually being performed with the plugin?

Thanks!
Audrey

crusher083 · April 18, 2023, 12:42pm

Hello and welcome to QIIME 2!

You're right: compositional data analysis is quite messy, and statisticians tend to use inconsistent terminology for the operations they perform.

ALR uses a single feature (sequence) as the reference frame. This is common practice in RNA-Seq, because we can select consistently expressed genes called "housekeeping genes" as the reference.

In microbiome data this usually doesn't work, because we don't know of any microbes whose abundance is stable across samples. It's therefore hard to select a meaningful reference.
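To make the two definitions concrete, here is a minimal pure-Python sketch of both transforms for a single sample. The function names, the `ref_index` parameter, and the counts are hypothetical and only for illustration; this is not the skbio implementation, and it assumes there are no zeros in the data.

```python
import math

def clr(composition):
    """Centered log ratio: log of each part over the geometric mean of all parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

def alr(composition, ref_index):
    """Additive log ratio: log of each non-reference part over one chosen reference part."""
    ref_log = math.log(composition[ref_index])
    return [math.log(x) - ref_log
            for i, x in enumerate(composition) if i != ref_index]

# Hypothetical counts for four features in one sample (all positive).
sample = [10.0, 20.0, 40.0, 80.0]

clr_values = clr(sample)               # one value per feature, summing to zero
alr_values = alr(sample, ref_index=0)  # one value fewer: the reference drops out

print(clr_values)
print(alr_values)
```

Note that clr keeps every feature and its values always sum to zero, while alr loses the reference feature and its result depends entirely on which reference you pick.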

When you don't know what a plugin does, it's useful to take a look at the code, since it's open-source software.
Let's see:

https://github.com/qiime2/q2-composition/blob/d8265fd270e0d2fe4b04783d2980d3a24cf9c0e9/q2_composition/_ancom.py#L9-L18

    import json
    import os
    import pkg_resources
    from distutils.dir_util import copy_tree

    import qiime2
    import q2templates
    import pandas as pd
    from skbio.stats.composition import ancom as skbio_ancom
    from skbio.stats.composition import clr

The ANCOM plugin imports the clr function from the skbio package and later uses it to transform the data.

Cheers,
Valentyn

jwdebelius · April 18, 2023, 4:12pm

Hi @crusher083 and @audreyduff,

The transform underlying the ANCOM I test is an ALR. ANCOM takes each pair of species, calculates their log ratio, and applies a statistical test. For each species, the W statistic is calculated as the total (or percent) of other species whose log ratio with it is significantly different after FDR correction, at a threshold set a priori.
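A rough sketch of that pairwise procedure, in pure Python, might look like the block below. Everything in it is an assumption made for illustration: the function names, the toy table, the exact permutation test used as "the statistical test", and the Benjamini-Hochberg correction pooled over all pairs. The real implementation may use a different test and apply the correction differently, so treat this as a picture of the idea only.

```python
import math
from itertools import combinations

def perm_pvalue(values, group_a):
    """Exact two-sided permutation p-value for a difference in group means."""
    n = len(values)

    def abs_mean_diff(members):
        a = [values[i] for i in range(n) if i in members]
        b = [values[i] for i in range(n) if i not in members]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = abs_mean_diff(set(group_a))
    splits = [set(c) for c in combinations(range(n), len(group_a))]
    # Small tolerance so float rounding does not hide exact ties.
    extreme = sum(1 for s in splits if abs_mean_diff(s) >= observed - 1e-12)
    return extreme / len(splits)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def ancom_w(table, group_a, alpha=0.05):
    """table[s][f] is the positive abundance of feature f in sample s.

    Returns, for each feature, the number of pairwise log ratios involving it
    that differ significantly between the two groups.
    """
    n_features = len(table[0])
    pairs = list(combinations(range(n_features), 2))
    pvals = []
    for i, j in pairs:
        log_ratios = [math.log(row[i] / row[j]) for row in table]
        pvals.append(perm_pvalue(log_ratios, group_a))
    adjusted = benjamini_hochberg(pvals)
    w = [0] * n_features
    for (i, j), p in zip(pairs, adjusted):
        if p <= alpha:
            w[i] += 1
            w[j] += 1
    return w

# Hypothetical table: 10 samples x 4 features. Only feature 0 differs between groups.
table = [
    [120, 10, 12, 9], [100, 11, 9, 13], [110, 8, 14, 10],
    [130, 12, 10, 11], [105, 9, 11, 12],            # group A (samples 0-4)
    [12, 10, 12, 9], [10, 11, 9, 13], [11, 8, 14, 10],
    [13, 12, 10, 11], [10.5, 9, 11, 12],            # group B (samples 5-9)
]
w = ancom_w(table, group_a=[0, 1, 2, 3, 4])
print(w)
```

In this toy table the feature that actually changes ends up with the highest W, because every log ratio it takes part in shifts between groups, while the unchanged features only pick up a count from their ratio with it.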

Within QIIME 2, the visualization needs to be calculated on transformed data. So, I think the --p-transform-function dictates the transform applied to calculate the effect size for the volcano plot.
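As an illustration of what "effect size on transformed data" could mean, here is a small pure-Python sketch that clr-transforms each sample and then takes the per-feature difference in group means. This is an assumption on my part about the general shape of the calculation, not a description of the plugin's code; the function names and the counts are hypothetical.

```python
import math

def clr(composition):
    """Centered log ratio of one sample (all parts must be positive)."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def mean_clr_difference(group_a, group_b):
    """Per-feature difference in mean clr value (group_b minus group_a)."""
    clr_a = [clr(row) for row in group_a]
    clr_b = [clr(row) for row in group_b]
    n_features = len(group_a[0])

    def column_mean(rows, f):
        return sum(row[f] for row in rows) / len(rows)

    return [column_mean(clr_b, f) - column_mean(clr_a, f)
            for f in range(n_features)]

# Hypothetical counts: feature 0 is four times higher in group B.
group_a = [[10, 10, 10], [20, 20, 20]]
group_b = [[40, 10, 10], [80, 20, 20]]

effect = mean_clr_difference(group_a, group_b)
print(effect)
```

Because clr is relative to the geometric mean of each sample, the unchanged features come out slightly negative here even though their counts did not move, which is worth keeping in mind when reading the x-axis of a volcano plot.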

Best,
Justine
