Hello,
Sorry, I know this isn't a QIIME2 question, but I don't know anywhere else on the internet to find such excellent and responsive people.
I was running ANCOMBC2 in R, but I got an error message.
kenaioutput <- ancombc2(
  data = phyloseq,
  assay_name = "counts",
  tax_level = "Genus",
  fix_formula = "Lake_Collected_From + Sex_f_m_NA + fish_type + Mass_g + Std_length_mm + Surface_area_group + Prevalence",
  rand_formula = NULL,
  p_adj_method = "BH",
  prv_cut = 0.10,
  group = "Lake_Collected_From",
  struc_zero = TRUE,
  neg_lb = FALSE,
  alpha = 0.05,
  n_cl = 4,
  verbose = TRUE,
  global = TRUE,
  pairwise = TRUE,
  dunnet = TRUE,
  trend = FALSE,
  iter_control = list(tol = 1e-2, max_iter = 20, verbose = TRUE)
)
Error message:
Error: Estimation failed for the following covariates:
fish_typelimnetic, Surface_area_group2, Surface_area_group3, Prevalence0.07, Prevalence0.14, Prevalence0.5
Consider removing these covariates
I found a thread about the same error on GitHub. It looks like it is caused by a multicollinearity issue, but the thread offered no solution. I really need to include these problematic variables in my model, as they are important to my study. I'm thinking about two solutions:
- My fish_type has two levels, so I could recode it as a dummy variable and replace it with PCA scores. However, ancombc2 uses log-linear regression, and I haven't found evidence that PCA scores can be used in log-linear regression. The same issue applies to my Surface_area_group (3 levels) and Prevalence (4 levels); I would have to reduce them to 2 levels each (I don't like this).
- Alternatively, I could use another model, such as MaAsLin, for multivariate analysis.
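As a minimal sketch of how the suspected multicollinearity could be checked before choosing either option, the base R snippet below builds a design matrix and tests whether it is rank-deficient. The data frame `meta` and its values are made up for illustration (they are not the real sample metadata), and only two of the covariates are included. Whether ancombc2 fails for exactly this reason is an assumption taken from the GitHub thread, not something this check proves.

```r
# Toy metadata (hypothetical values): every fish in lake C is limnetic,
# so fish_type carries no information beyond Lake_Collected_From.
meta <- data.frame(
  Lake_Collected_From = factor(rep(c("A", "B", "C"), each = 4))
)
meta$fish_type <- factor(
  ifelse(meta$Lake_Collected_From == "C", "limnetic", "benthic")
)

# Cross-tabulation: empty cells hint at confounded factors.
print(table(meta$Lake_Collected_From, meta$fish_type))

# Design matrix for the fixed effects, as base R modelling functions build it.
X <- model.matrix(~ Lake_Collected_From + fish_type, data = meta)

# QR decomposition: a rank below ncol(X) means at least one column is a
# linear combination of the others.
q <- qr(X)
rank_deficient <- q$rank < ncol(X)

# Base R's default qr() moves the columns it finds dependent to the end of
# the pivot vector, so the trailing entries name the aliased terms.
aliased <- colnames(X)[q$pivot[seq_len(ncol(X)) > q$rank]]

print(rank_deficient)
print(aliased)
```

With real data, the same check could be run on the sample metadata extracted from the phyloseq object, using the full fix_formula. Covariates reported as aliased, or factor pairs with empty cells in the cross-tabulation, would be the first candidates to inspect.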
Any suggestions would be greatly appreciated!
Thank you so much for your help.