Using ANCOM and gneiss with one categorical variable

prmesi · August 18, 2017, 4:27pm

Hi,
I liked the idea of adding ANCOM and gneiss differential abundance tests to my analysis pipeline, so after completing the tutorials I wanted to give them a shot on my own dataset, where 3 groups were exposed to different amounts of nitrate-containing food, with 10 subjects per group. Fecal specimens were taken from the jejunum. 16S libraries were made targeting the V3-V4 region and were sequenced on the MiSeq platform. I used the QIIME 2 pipeline for single-read analyses. My feature table was very simple: I only had one categorical variable, the nitrate treatment, and a minimum of 30000 reads per sample. None of the core diversity metrics came back significant, though looking at the composition there seem to be some group-related differences along with large individual variation, so I thought I would try both gneiss and ANCOM.
• I got results with gneiss, using correlation clustering and ols-regression with one variable. (Is there any way to do a simple linear regression using gneiss? Does it even make sense to do it with only one variable?)
• Using the same feature table I didn't get anything for ANCOM. Wouldn't it be expected that the potential taxa from gneiss would also show up in the ANCOM results?

At this point I am not completely sure whether I am using these two plugins with the right parameters or with properly tailored artifacts. Unfortunately I cannot relate my data to any of the tutorials to work it out by myself. I apologize if my questions are too trivial. Please find my files attached:

jejunum_table-no.blanks_beef.diet_raref.qza (36.5 KB)
BJ4_mapping.file_for.Q2_JEJ.txt (2.0 KB)

R1_jejunum_heatmap.qzv (100.4 KB)
y0_taxa_summary_R1_jejunum.qzv (87.1 KB)
regression_summary_jejunum_R1.qzv (279.5 KB)

ancom-jejunum_R1_nitrate.qzv (41.1 KB)

Thank you,

mortonjt · August 21, 2017, 5:44pm

Hi @prmesi - yes, you can run simple linear regression using gneiss. You just need to include a single continuous variable.
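To illustrate what a simple linear regression on one balance looks like, here is a minimal sketch in plain NumPy. It is not the gneiss interface: the `dose` covariate and the `balance` values are made-up numbers, and treating the nitrate treatment as a numeric dose is an assumption made only for this example.

```python
import numpy as np

# Hypothetical data: one continuous covariate and one balance per sample.
dose = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0])
balance = np.array([0.1, -0.1, 0.6, 0.4, 1.1, 0.9])

# Ordinary least squares fit of balance = intercept + slope * dose.
slope, intercept = np.polyfit(dose, balance, deg=1)

# Residuals show how much per-sample variation the single variable leaves unexplained.
residuals = balance - (intercept + slope * dose)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

With a categorical variable the regression fits one coefficient per non-reference level instead of a single slope, which is why the heatmap has rows such as Nitrate[T.no] and Nitrate[T.low].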

Given the gneiss plots that you attached, I'd be hesitant to say that there are differentially abundant taxa present in this dataset (I couldn't find any corrected p-values below 0.05).

prmesi · August 21, 2017, 7:06pm

Thank you Jamie for the quick feedback. Just to make sure I got this right: when you said that you couldn't find any corrected p-values below 0.05, did you mean in the first two rows (Nitrate[T.no] and Nitrate[T.low]) of the regression heatmap, above the intercept? By any chance, is there a table output where I can check the p-values corresponding to each balance?

Thank you.

mortonjt · August 21, 2017, 7:51pm

Yup - you can download it from regression_summary_jejunum_R1.qzv as shown below

Careful - you'll want to only consider those pvalues under Corrected Pvalues. And pvalues close to 0.05 will also warrant skepticism, so make sure to carefully sanity check the boxplots / scatter plots in the balance-taxonomy command.
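To show why raw p-values can mislead when many balances are tested at once, here is a small sketch of the Benjamini-Hochberg adjustment in NumPy. This is one common correction and is an assumption for illustration; it is not necessarily the correction gneiss applies, and the p-values are made up.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working from the largest p-value down.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]  # hypothetical raw p-values, one per balance
adjusted = benjamini_hochberg(raw)
n_raw_hits = int(np.sum(np.asarray(raw) < 0.05))
n_adjusted_hits = int(np.sum(adjusted < 0.05))
print(adjusted, n_raw_hits, n_adjusted_hits)
```

In this toy case three raw p-values fall below 0.05 but only one survives the adjustment, and that one sits close to the threshold, which is the situation where checking the plots matters most.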

prmesi · August 21, 2017, 8:12pm

Great. This is very helpful.
One more thing, which I brought up in the original post: is it reasonable to expect the gneiss and ANCOM results to overlap, since both approach differential abundance with log ratios?

mortonjt · August 21, 2017, 10:45pm

That is a very good question!

Yes and no. Turns out that this is an ill-defined problem - you can't actually exactly determine which species are differentially abundant (see gneiss tutorial).

Under the hood, ANCOM computes all pairwise log ratios and attempts to infer which species are changing based on this pairwise log ratio matrix. To do this, ANCOM assumes that few species are changing with respect to a variable. The balances in gneiss try to determine which partitions of microbes can explain a variable. Because these are very different methods for answering fundamentally different questions, there are definitely scenarios where these methods give different results.
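The two quantities can be contrasted with a rough NumPy sketch. This is not the ANCOM or gneiss implementation: the counts and the partition are made up, the function names are hypothetical, and real data would need zeros handled (for example with a pseudocount) before taking logs.

```python
import numpy as np

def pairwise_log_ratios(counts):
    """D x D matrix with entry (i, j) = log(x_i / x_j) for one sample."""
    logs = np.log(np.asarray(counts, dtype=float))
    return logs[:, None] - logs[None, :]

def balance(counts, numerator, denominator):
    """Isometric log-ratio balance between two groups of taxa in one sample."""
    logs = np.log(np.asarray(counts, dtype=float))
    r, s = len(numerator), len(denominator)
    # Normalised log of the ratio of the two groups' geometric means.
    return np.sqrt(r * s / (r + s)) * (logs[numerator].mean() - logs[denominator].mean())

sample = [10, 20, 40, 80]  # hypothetical counts for four taxa
ratios = pairwise_log_ratios(sample)  # one value per pair of taxa
b = balance(sample, numerator=[0, 1], denominator=[2, 3])  # one value per partition
print(ratios.shape, b)
```

The pairwise matrix describes every taxon relative to every other taxon, while a balance summarises one split of the tree into a single number, so a significant balance does not by itself point to a single taxon that ANCOM would flag.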

prmesi · August 22, 2017, 9:40pm

Thank you so much Jamie for clarifying this! I appreciate your time.