Differential abundance analysis (e.g. ANCOM) for paired samples (e.g. normal tissue vs. tumor tissue from cancer patients)

Hello everyone,


I’m analyzing microbiome sequencing data of gastric cancer patients in South Korea. Basically, we have sequencing data from four samples (normal tissue, tumor tissue, gastric fluid, and stool) for each patient (N=17). Thanks to QIIME 2, I have been able to make many interesting – and biologically sound – observations. For example, when I made taxonomy bar plots, I observed that certain bacteria are clearly differentially abundant between normal tissue vs. tumor tissue (sorry I can’t share the plots since this work has not been published yet).

The issue

Encouraged by the above observation, I proceeded to perform differential abundance analysis between normal tissue vs. tumor tissue using ANCOM in QIIME 2. Contrary to my expectation, however, ANCOM returned no significant hits, which was very surprising to me because my taxonomy bar plots were telling a very different story. Before anyone asks, I did filter the ASV table so that it only contains samples from normal tissue and tumor tissue. In addition, I did add pseudocount to the ASV table and also tried collapsing the ASV table at different taxonomic ranks (e.g. genus).

Even though the ANCOM result was disappointing, because I was so convinced that there was differential abundance between normal tissue vs. tumor tissue (based on the taxonomy bar plots), I decided to manually perform paired testing (i.e. the Wilcoxon signed-rank test) for the top five most abundant taxa in the samples. This time, the top two most abudant taxa returned as statistically significant with p-value = 0.000839 and p-value = 0.003845. Of note, those p-values are not adjusted for multiple testing.

My current hypothesis for why ANCOM did not produce any significant hits earlier is because ANCOM did not use paired testing. In other words, I think if I were somehow able to perform ANCOM with paired testing, ANCOM would have returned significant hits (i.e. increased statistical power). This is where my train of questions leaves from the station.

The questions

Q1. Is it possible to perform ANCOM with paired testing?

I’ve been searching in this forum for the answer, but I’m getting apparently mixed signals (probably because some posts are older than others). Here is the list of relevant posts I found in the forum:

Generally speaking, so far, I’m getting the impression that:

  1. Currently, it is NOT possible to perform ANCOM with paired testing within QIIME 2. For example, according to the post Taxa abundance analysis, @mortonjt wrote (I can’t seem to directly quote his remark for some reason):

    @jairideout is right - ANCOM doesn’t support pvalues or pairwise comparisons. But you can get the W-statistic, which is a proxy for the statistical significance.

  2. Some say paired testing is supported in ANCOM-II in R, but I could not find any reference or documentation which says this is possible. For example, @Nicholas_Bokulich wrote in the post Compare more taxa simultaneously using pairwise-difference:

    Instead, use ANCOM. The ANCOM action currently in QIIME 2 does not allow paired testing, but you can use ANCOM or DESeq2 in R to performed paired tests.

Q2. Are there alternative tools for differential abundance analysis that support paired testing?

While looking for answers for Q1, I came across a number of different tools that could potentially be used instead of ANCOM. Of course, I understand that these tools do not necessarily output the same type of differential abundance analysis as ANCOM.

Here is the list of alternative tools I found so far:

  1. There is this qiime longitudinal pairwise-differences command which I didn’t know before, but apparently runs paired testing for one specific taxon at a time. I reckon this runs the Wilcoxon signed-rank test under the hood, similar to what I did manually above. However, it seems like you’re not supposed to abuse this command and run this for all taxa in your dataset because @Nicholas_Bokulich wrote in the post Compare more taxa simultaneously using pairwise-difference:

    Not possible in QIIME 2. We do not make this more convenient because the Wilcoxon test used in that action is not really appropriate for compositional microbiome data (has a high false-positive error rate).

  2. There is a QIIME 2 plugin called Gneiss which performs differential abundance analysis using what’s known as “balances”. From my understanding, this is fundamentally different from ANCOM but is still useful for exploring differential abundance between two or more groups. However, it seems like Gneiss does not support pairwise comparison either according to the post Pairwise ANCOM and Gneiss, filtering and interpretation. In this post, @mortonjt wrote:

    With Gneiss, if you have n groups, you can run n-1 tests (by keeping one category as a reference) - but not pairwise comparisons (see explanation here ). I believe this is also the case with ANCOM2 since it is also using linear models underneath the hood.

  3. There are some other tools that got mentioned along my journey like DESeq2 and LEfSe. However, I wanted to ask the QIIME 2 community before digging in deeper on those tools. Plus, it seems like DESeq2 is primarily designed for differential expression analysis from RNAseq data and LEfSe is not a QIIME 2 plugin.


If you read this far, I really appreciate you taking time for me. I hope I’m not the only one who’s curious about differential abundance analysis with paired testing. Looking forward to hearing everyone’s thoughts!


Neat study!

Seems plausible

Not at the moment

Yep! This is a matter of post age... in the first few releases of q2-composition it was technically possible to do a paired test with ANCOM but the test option was removed in later releases.

Yes — not in QIIME 2 (as far as I know), but aldex2 would be a good solution in R, as a matter of fact here's the relevant documentation:

Perhaps @dgiguer can let us know if a paired test is also possible with q2-aldex2, but I could not find it from a quick search of the documentation.

Also not a QIIME 2 solution, but a very neat solution recently discussed by @jwdebelius and @mortonjt on this forum:

I'm glad you did — see what @jwdebelius has to say about these:


Hi @Nicholas_Bokulich,

q2-aldex2 is not yet modular like the R package so the option for a paired t-test is not yet available in q2-aldex2 :frowning:



Hi @sbslee,

If I can add one more thing to Nick’s advice, I would check and see if there’s a signal in beta diversity first. This approach is semi controversial, but I have power concerns (:zap:), especially with 17 individuals. So, my first suggestion would be to check and see what the within/between individual differences look like and think about how to work with them. ANCOM and the songbird-based method have less power to detect differences because of FDR penalties, and it feels like compositional methods are just more conservative than previous solutions.


@sbslee, while it is not possible to do paired t-test in qiime2 ANCOM, it is totally doable to do it in skbio : http://scikit-bio.org/docs/0.5.0/generated/generated/skbio.stats.composition.ancom.html

It would look something like as follows

import pandas as pd
from scipy.stats import wilcoxon
import qiime2
table = qiiime2.Artifact.load('<your table>').view(pd.DataFrame)
metadata = pd.read_table('<your sample metadata>', index_col=0)
res = ancom(table, metadata['your_experimental_condition'], significance_test=wilcoxon)

That being said, songbird and gneiss can both handle paired samples (gneiss has a linear mixed effects model) through specifying the regression formula. Note that songbird doesn’t do hypothesis testing, it’ll just estimate means (so noting @jwdebelius’s comment, songbird doesn’t have any power). But significance testing can be enabled with tools such as qurro - you can always run wilcoxon and paired t-test with raw log-ratios.


Hello @mortonjt,

Thank you very much for commenting! Can I ask a quick follow-up question?

I looked at your example code as well as the the skbio link you attached, but I’m still not sure whether skbio-ANCOM can do paired t-test as you suggest. Using my study design described above (i.e. normal tissue vs. tumor tissue from 17 gastric cancer patients) as an example, in your code, your_experimental_condition would be the column that indicates whether a sample is from normal tissue or tumor tissue. However, isn’t that essentially the same as a typical qiime2-ANCOM analysis involving two groups without paired infromation? That is, you’re not providing any information about the patients to ANCOM, so how would ANCOM control for variation specific to each patient? I think this is more apparent in the docs page for skbio.stats.composition.ancom you attached:

skbio.stats.composition.ancom(table, grouping, alpha=0.05, tau=0.02, theta=0.1, multiple_comparisons_correction='holm-bonferroni', significance_test=None, percentiles=(0.0, 25.0, 50.0, 75.0, 100.0))

As you can see, there is no argument for providing pair information. The grouping argument is for dividing the samples into different groups (e.g. normal tissue vs. tumor tissue) but it does not capture the pair information. There is also no mention about paired testing throughout the docs. Maybe I’m missing something here. Could you please let me know what I’m missing?

P.S. I also appreciate your comments about songbird and gneiss, and how you can specify regression formula for them. That is very good to know. If ANCOM (both qiime2 version and skbio version) truly doesn’t support paired t-test, I will try those options.

Hello @dgiguer,

Thank you for letting us know that q2-aldex2 doesn’t support paired testing yet. Can I ask a follow-up question?

After reading the ALDEx2 paper (Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis) (it’s very nice paper BTW, I learned a lot from it!), I was interested in trying ALDEx2 in R for paired testing. As Nick suggested above, I tried out the aldex.ttest method by following the example in the manual. (I’m aware that the dataset used for this example is not microbiome.)

#subset for efficiency
selex <- selex[1201:1600,]
conds <- c(rep("NS", 7), rep("S", 7))
x <- aldex.clr(selex, conds, mc.samples=2, denom="all")
ttest.test <- aldex.ttest(x)

Do you have a tutorial or example I can take a look for paired testing with the aldex.ttest (because above example is not paired testing)? I understand I have to use paired.test = TRUE, but how does ALDEx2 know which samples are paired? I can’t find any argument for providing pair information. I looked through the manual but couldn’t find any mention of this (including the aldex.clr method). I would greatly appreciate if you could give me a toy example or point to a reference where this is discussed.

1 Like

Hello all,

For those who may be interested, I just wanted to give a quick update on aldex.ttest for paired testing. According to one of the authors I contacted via the GitHub page for ALDEx2, you just need to set paired.test=T and order the samples properly as shown below:

to run a paired t-test, the samples must be in order of pairing.

So A1,B1,A2,B2 etc or A1,A2,A3,… B1,B2,B3 …


I also can confirm this works great for differential abundance analysis with paired testing!


Hi @sbslee,

I see what you mean. It may help to see the scipy docs on this: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_rel.html

Specifically, the ordering needs to be preserved. According to the scipy docs, the ordering of x and y needs to come from the same source; so if you were to grab element 4 for x and y, where x represents normal tissue and y is the diseased tissue, then both of those would come from patient 4.

If you want to do this in skbio ancom, you’ll need to sort accordingly, meaning that you’ll need to first sort by patient, then by tissue type.

import pandas as pd
from scipy.stats import ttest_rel
import qiime2
table = qiiime2.Artifact.load('<your table>').view(pd.DataFrame)
metadata = pd.read_table('<your sample metadata>', index_col=0)
metadata = metadata.sort_values(['patient_id', 'tissue_type'])
table = table.loc[metadata.index]
res = ancom(table, metadata['tissue_type'], significance_test=ttest_rel)

For those who may find this helpful:

I was finally able to perform ANCOM with paired testing by following @mortonjt’s informative instructions (thank you Jamie!).

However, I ran into the same problem discussed in the following post: ANCOM: ‘low W taxa identified as significant’ issue’s workaround, ANCOM2 code/instructions. Basically, ANCOM returned many taxa with W=0 as “significant” (i.e. TRUE in the Reject null hypothesis column). In fact, all of my 165 taxa were returned as significant (note: I only have 165 taxa because I collapsed the feature table to genus level).

My speculation is that because I only have 165 taxa and 34 samples (17 tumor and 17 normal tissues), the threshold calculation did not go well for ANCOM – as described in above post – and that’s why it returned everything as significant. But other than that, ANCOM correctly (?) gave taxa with seemingly high differential (e.g. determined from taxa bar plots by eye) a large W value (34 was the largest).

Here’s the code I used in Jupyter Notebook (btw, I tried both paired and unpaired testing for comparison):

from qiime2 import Artifact
from qiime2 import Metadata
from skbio.stats.composition import ancom as skbio_ancom
from scipy.stats import ttest_rel
from scipy.stats import ttest_ind
import pandas as pd

table = Artifact.load('tissue-level6-comp-table.qza').view(pd.DataFrame)
metadata = Metadata.load('sample-metadata.tsv').to_dataframe()
metadata = metadata[metadata['Site'].isin(['N', 'T'])]
metadata.sort_values(['Subject', 'Site'], inplace=True)
table = table.loc[metadata.index]

# (34, 165)

results_ttest_rel = skbio_ancom(table, metadata['Site'], significance_test=ttest_rel)
results_ttest_ind = skbio_ancom(table, metadata['Site'], significance_test=ttest_ind)

# Here, all taxa were returned as significant by ANCOM
results_ttest_rel[0]['Reject null hypothesis'].unique()
# array([ True])
1 Like

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.