Differential abundance analysis (e.g. ANCOM) for paired samples (e.g. normal tissue vs. tumor tissue from cancer patients)

sbslee · December 11, 2020, 8:27am

For those who may find this helpful:

I was finally able to perform ANCOM with paired testing by following @mortonjt’s informative instructions (thank you Jamie!).

However, I ran into the same problem discussed in this post: "ANCOM: ‘low W taxa identified as significant’ issue’s workaround, ANCOM2 code/instructions". Basically, ANCOM flagged many taxa with W=0 as “significant” (i.e. True in the Reject null hypothesis column). In fact, all 165 of my taxa were flagged as significant (note: I have only 165 taxa because I collapsed the feature table to the genus level).

My guess is that, because I have only 165 taxa and 34 samples (17 tumor and 17 normal tissues), ANCOM's threshold calculation did not work well, as described in the post above, and that is why everything came back as significant. Apart from that, ANCOM did seem to behave correctly (?): the taxa that looked most different between the two groups (e.g. judging from the taxa bar plots by eye) got large W values (34 was the largest).
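To make the threshold issue concrete, here is a rough sketch of the arithmetic using the numbers above (165 taxa, largest W of 34). It assumes the candidate cutoffs are built as max(W) / n_taxa minus five offsets from 0.05 to 0.25, and that the last candidate is used as a fallback when the earlier ones are not stable. That is how the scikit-bio implementation appears to work, but treat it as an assumption rather than a verified description of the version used here. The variable names are made up for illustration.

```python
# Sketch only: the cutoff grid below is an ASSUMPTION about how
# scikit-bio's ancom() picks its W threshold, not a copy of its code.
n_taxa = 165   # genus-level features in the table
w_max = 34     # largest W reported by ANCOM

# Starting point: the largest W as a fraction of the number of taxa.
c_start = w_max / n_taxa

# Five candidate cutoffs, each a bit lower than the starting fraction.
offsets = [0.05, 0.10, 0.15, 0.20, 0.25]
candidate_cutoffs = [c_start - off for off in offsets]

for cut in candidate_cutoffs:
    print(f"cutoff fraction {cut:+.3f} -> W threshold {cut * n_taxa:+.2f}")

# If the procedure falls back to the last candidate, the threshold on W
# is negative, so the rule "W >= threshold" is met even by W = 0.
fallback_threshold = candidate_cutoffs[-1] * n_taxa
w_zero_is_significant = 0 >= fallback_threshold
print(w_zero_is_significant)
```

Under these assumptions, a small max(W) relative to the number of taxa pushes the last candidate below zero, which would explain why every taxon, including those with W=0, ends up flagged.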

Here’s the code I used in Jupyter Notebook (btw, I tried both paired and unpaired testing for comparison):

from qiime2 import Artifact
from qiime2 import Metadata
from skbio.stats.composition import ancom as skbio_ancom
from scipy.stats import ttest_rel
from scipy.stats import ttest_ind
import pandas as pd

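# Load the genus-level composition table (samples x taxa) and the sample metadata
table = Artifact.load('tissue-level6-comp-table.qza').view(pd.DataFrame)
metadata = Metadata.load('sample-metadata.tsv').to_dataframe()
# Keep only the normal (N) and tumor (T) tissue samples
metadata = metadata[metadata['Site'].isin(['N', 'T'])]
# Sort by subject, then site, so N and T samples share the same subject order (needed for ttest_rel)
metadata = metadata.sort_values(['Subject', 'Site'])
# Reorder the table rows to match the sorted metadata
table = table.loc[metadata.index]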

print(table.shape)
# (34, 165)

# Paired test (ttest_rel) vs. unpaired test (ttest_ind)
results_ttest_rel = skbio_ancom(table, metadata['Site'], significance_test=ttest_rel)
results_ttest_ind = skbio_ancom(table, metadata['Site'], significance_test=ttest_ind)

# Here, all taxa were returned as significant by ANCOM
results_ttest_rel[0]['Reject null hypothesis'].unique()
# array([ True])
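
A note on the sorting step in the code above: the paired test only makes sense if the i-th normal sample and the i-th tumor sample come from the same subject. The toy example below (hypothetical sample and subject IDs, plain Python instead of pandas) sketches why sorting by Subject and then Site gives that alignment. It assumes the samples are split by Site with their row order preserved, which is what appears to happen inside scikit-bio's ancom.

```python
# Toy metadata rows: (sample ID, subject, site). All IDs are hypothetical
# and the rows are deliberately shuffled.
samples = [
    ("s3", "P02", "T"),
    ("s1", "P01", "N"),
    ("s4", "P02", "N"),
    ("s2", "P01", "T"),
    ("s6", "P03", "T"),
    ("s5", "P03", "N"),
]

# Same idea as sorting the metadata by ['Subject', 'Site'].
ordered = sorted(samples, key=lambda row: (row[1], row[2]))

# Split by site while keeping the row order, as a boolean mask would.
normal = [row for row in ordered if row[2] == "N"]
tumor = [row for row in ordered if row[2] == "T"]

# For a paired test, position i in each group must be the same subject.
pairs_aligned = all(n[1] == t[1] for n, t in zip(normal, tumor))
print(pairs_aligned)
```

If a subject is missing one of its two samples, the groups have different lengths or go out of step, so it is worth checking this before running the paired test.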