Hi everybody,
I have a general question about which feature table to use for ANCOM-BC, given the results of my earlier analyses.
In my dataset (16S) I have samples from 2 rodent species, and for each animal I analysed gut and spleen samples. I want to check whether the microbial composition changes with land-use intensity (low, medium, high).
My first thought was to run the analysis on the whole dataset (both species and both organs), not in QIIME 2 but in R, since as far as I know the global test is only available there. My land-use intensity category has more than 2 groups and I don't want to set one of them as the reference. The settings would be: formula = "Species + Organ + Landuseintensity.cat", group = "Landuseintensity.cat", global = TRUE
Then I read this thread, which made me think I should probably use a subsetted table instead.
I've run alpha and beta diversity analyses on my whole dataset and on the following subsetted datasets:
- only species A (but including both organs)
- only species B (but including both organs)
- only gut samples (but including both species)
- only spleen samples (but including both species)
- speciesA_gut, speciesA_spleen, speciesB_gut and speciesB_spleen (one species and one organ each)
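To make the subsets listed above concrete, here is a minimal sketch (Python, standard library only) of how samples could be grouped by species, by organ, and by both. The column names Species, Organ and Landuseintensity.cat are taken from the formula in this post. The sample IDs, the metadata values and the helper split_by are invented for illustration and are not part of QIIME 2 or ANCOM-BC.

```python
from collections import defaultdict

# Hypothetical sample metadata: column names follow the formula in the post,
# sample IDs and values are made up for illustration only.
metadata = [
    {"sample_id": "S1", "Species": "A", "Organ": "gut", "Landuseintensity.cat": "low"},
    {"sample_id": "S2", "Species": "A", "Organ": "spleen", "Landuseintensity.cat": "low"},
    {"sample_id": "S3", "Species": "B", "Organ": "gut", "Landuseintensity.cat": "medium"},
    {"sample_id": "S4", "Species": "B", "Organ": "spleen", "Landuseintensity.cat": "medium"},
    {"sample_id": "S5", "Species": "A", "Organ": "gut", "Landuseintensity.cat": "high"},
    {"sample_id": "S6", "Species": "B", "Organ": "spleen", "Landuseintensity.cat": "high"},
]


def split_by(rows, *columns):
    """Group sample IDs by the values of the given metadata columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in columns)
        groups[key].append(row["sample_id"])
    return dict(groups)


# The three levels of subsetting described in the list above.
by_species = split_by(metadata, "Species")
by_organ = split_by(metadata, "Organ")
by_species_organ = split_by(metadata, "Species", "Organ")

# Print one line per species-by-organ subset, e.g. "A_gut" with its sample IDs.
for key, sample_ids in sorted(by_species_organ.items()):
    print("_".join(key), sample_ids)
```

Each list of sample IDs could then be used to filter the feature table before running the analysis on that subset.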
In all of these analyses, it turns out that the microbial composition is significantly different between organs and between species (both alpha and beta diversity). In the single species-and-organ datasets (speciesA_gut, ...), alpha and beta diversity are sometimes significantly different between land-use intensities and sometimes they're not.
Based on that, I am wondering which subsetted dataset to use, since organ and species apparently have a huge influence on the microbial community. If I understood the above-mentioned thread correctly, I would run ANCOM-BC 4 times, on the datasets speciesA_gut, speciesA_spleen, speciesB_gut and speciesB_spleen, with formula = "Landuseintensity.cat", group = "Landuseintensity.cat", global = TRUE, to detect the differentially abundant taxa between the land-use intensity categories. Is that correct?
Or should I use a "higher-level" dataset, say the gut dataset, with formula = "Species + Landuseintensity.cat", group = "Landuseintensity.cat", global = TRUE? I assume this would test essentially the same thing, but since the larger dataset contains a different microbial community, it would give different values and different differentially abundant taxa. Is that correct?
Which settings and datasets are the right ones to use in order to find out how land-use intensity influences microbial abundance in the gut and spleen samples of Species A and Species B?
I'm sorry if these questions are redundant, but the properties of my dataset and my unfamiliarity with ANCOM-BC have left me confused.
Thanks in advance
Best,
Lea