So I am currently trying out the ANCOMBC module from QIIME2 and I am interested in checking the differential abundance among ASVs using an interaction term in the --p-formula parameter.
A bit of background: I have two cultivars, Tarraco and Mardia, and two health statuses, Healthy and Diseased. ADONIS results suggested that there is an interaction between the cultivar and health-status factors. As a result, when running the ANCOMBC module, I include this interaction in the --p-formula parameter like this:
It all runs fine! However, in the output, it seems that the Tarraco cultivar and the Healthy status are taken as the reference, but I am not sure how to interpret the output.
If the blue bars represent the enriched ASVs in Healthy Tarraco samples, do the orange bars represent Diseased Mardia samples? But what about the rest of the combinations like Diseased Tarraco and Healthy Mardia samples?
Thanks for reaching out! Happy to provide some clarification on this.
I'll first talk about the reference level(s) and how those are used, as I think this may help clarify what you're seeing in this barplot. When running ancombc, you have the option to select reference level(s) in your data - these are the 'intercept(s)' in the calculation and serve as the baseline against which every other group is compared (for relative enrichment or depletion). If no reference level(s) are selected, defaults are chosen for the column(s) included in the formula. In your example above, you've included two columns from your metadata - 'Cultivar' and 'Status'. Since you didn't specify a particular group within either of those columns, the reference levels will default to the group within each selected column that occurs first in alphabetical order.
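To make the alphabetical default concrete, here is a minimal Python sketch of the rule described above. It is only an illustration, not the plugin's actual code: the `metadata` dictionary and the `default_reference_levels` helper are hypothetical, and the sample values are made up from the two columns mentioned in this thread.

```python
# Hypothetical per-sample metadata values for the two formula columns.
metadata = {
    "Cultivar": ["Tarraco", "Mardia", "Tarraco", "Mardia"],
    "Status": ["Healthy", "Diseased", "Diseased", "Healthy"],
}


def default_reference_levels(columns):
    """Pick, for each column, the group that sorts first alphabetically."""
    return {name: sorted(set(values))[0] for name, values in columns.items()}


references = default_reference_levels(metadata)

# Write each choice in the column::group form used for reference levels.
for column, level in references.items():
    print(f"{column}::{level}")
```

Under this rule the defaults would be Mardia for Cultivar and Diseased for Status, since those sort first within their columns.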
With regard to your barplot results - what this is showing is the log-fold change for all features with respect to the chosen intercept, AKA the reference level. Blue bars represent features that are enriched relative to the intercept, and orange bars represent features that are depleted. You should be able to click through the different groups in this visualization to examine the different interactions between these two columns - this is just a barplot representation of the ancombc output differentials with per-column visuals. You should expect to see one column per group combination for the two columns you provided in your formula - i.e. Cultivargroup:Statusgroup.
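As a rough illustration of how the sign of the log-fold change (LFC) maps onto the bar colors, here is a small Python sketch. The feature IDs and LFC values are invented, and the `direction` helper is hypothetical. The conversion to a fold change assumes the LFC is on the natural-log scale; please check that assumption against the documentation for your version before reporting numbers.

```python
import math

# Hypothetical LFC values for one comparison column (not real results).
lfc_by_feature = {"ASV_1": 2.0, "ASV_2": -1.5, "ASV_3": 0.0}


def direction(lfc):
    """Positive LFC: enriched vs. the reference; negative: depleted."""
    if lfc > 0:
        return "enriched"
    if lfc < 0:
        return "depleted"
    return "unchanged"


for feature, lfc in lfc_by_feature.items():
    # Assumption: natural-log scale, so the fold change is exp(LFC).
    fold_change = math.exp(lfc)
    print(f"{feature}: LFC={lfc:+.2f} -> {direction(lfc)} "
          f"relative to the reference (~{fold_change:.2f}x)")
```

Under that assumption, an LFC of 2 would correspond to roughly a 7.4-fold higher abundance than in the reference group, which is often easier to report than raw LFC units.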
Hello!
You used "Cultivar::Mardia" and "Status::Diseased" as references, so the barplot shows the other levels relative to those references. The level indicated at the top is the one that was compared to the reference: blue bars are features enriched in that level, and orange bars are features depleted in that level (that is, more abundant in the reference).
You used it correctly if you intended to use the levels indicated in the command as references.
Alright. Good to know. So just to confirm, if I used "Cultivar::Mardia" and "Status::Diseased" as my reference levels, then it means that the ASVs represented by the blue bars are enriched in my reference levels in comparison to Cultivar Tarraco and Status Healthy, right?
Also, is there a numeric interpretation for the log-fold change, or is it enough to mention that a specific ASV is enriched by 2 LFC units in comparison to the specified reference treatment?