I'm currently exploring a microbiome dataset with gneiss and was wondering if I could get some advice regarding my output.
The dataset consists of 19 samples (15 in group "A" and only 4 in group "B"), so I expect it will be difficult to tease apart any inter-group variation. I want to explore preliminary differences before additional sequencing is performed to increase the sample size.
I began by running gneiss ols-regression and dendrogram-heatmap. The "regression_summary.qzv" output covers 19 samples and 3 covariates, which falls short of the one-in-ten rule (roughly 6 samples per covariate), but pred_mse is consistently lower than model_mse. Can I therefore assume that my model is not overfitting? Also, if I am interested in taxa that differ significantly with respect to the "samplegroup" covariate, can I focus only on the "y8" balance, since it is the only balance with a significant corrected p-value for "samplegroup" in the Regression Coefficients Summary?
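To check that I am reading those two columns correctly, here is a minimal sketch of what I assume they measure: the error on the samples used to fit each fold versus the error on the samples held out of that fold. This is not gneiss's code; the function name kfold_mse, the fold count and the simulated data are all hypothetical, and gneiss may compute or scale these values differently.

```python
import numpy as np

def kfold_mse(X, y, n_folds=5, seed=0):
    """Return (fit_mse, held_out_mse) per fold for an ordinary least squares fit.

    Hypothetical helper: illustrates within-fold vs held-out error only.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    rows = []
    for held_out in folds:
        train = np.setdiff1d(idx, held_out)
        # Fit OLS on the training samples only
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        fit_mse = np.mean((y[train] - X[train] @ beta) ** 2)
        held_out_mse = np.mean((y[held_out] - X[held_out] @ beta) ** 2)
        rows.append((fit_mse, held_out_mse))
    return rows

# Simulated stand-in for one balance: 19 samples, intercept + 3 covariates
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(19), rng.normal(size=(19, 3))])
y = X @ np.array([0.5, 1.0, 0.0, -1.0]) + rng.normal(scale=0.5, size=19)

for fit_mse, held_out_mse in kfold_mse(X, y):
    print(f"fit MSE = {fit_mse:.3f}, held-out MSE = {held_out_mse:.3f}")
```

With only 19 samples, each held-out fold in this sketch contains 3 or 4 samples, so I would expect the held-out error to vary a lot from fold to fold.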
The gneiss balance-taxonomy output for "y8" and "samplegroup" (taxa.qzv) contains a list of numerator and denominator taxa. Is it acceptable to say that the balance between these two sets of taxa differs significantly between groups A and B, based on the corrected p-value for "y8" in the regression summary file?
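For reference, this is how I understand a balance to be defined: a scaled log ratio of the geometric mean abundance of the numerator taxa to that of the denominator taxa. The snippet below is a minimal sketch of that formula, not the gneiss implementation; the function name, taxon names, counts and pseudocount are made up for illustration.

```python
import numpy as np

def balance(counts, numerator, denominator, pseudocount=1.0):
    """Balance of one sample: sqrt(r*s/(r+s)) * ln(g(numerator) / g(denominator)).

    Hypothetical helper; g() is the geometric mean, r and s are the
    numbers of numerator and denominator taxa.
    """
    num = np.log(np.array([counts[t] for t in numerator], dtype=float) + pseudocount)
    den = np.log(np.array([counts[t] for t in denominator], dtype=float) + pseudocount)
    r, s = len(numerator), len(denominator)
    # Mean of logs equals the log of the geometric mean
    return np.sqrt(r * s / (r + s)) * (num.mean() - den.mean())

# Made-up counts for one sample from each group
numerator = ["taxon_1", "taxon_2"]
denominator = ["taxon_3", "taxon_4", "taxon_5"]
sample_a = {"taxon_1": 120, "taxon_2": 80, "taxon_3": 10, "taxon_4": 15, "taxon_5": 5}
sample_b = {"taxon_1": 12, "taxon_2": 9, "taxon_3": 90, "taxon_4": 110, "taxon_5": 60}

print("group A sample:", balance(sample_a, numerator, denominator))
print("group B sample:", balance(sample_b, numerator, denominator))
```

Under this definition, a positive value means the numerator taxa are more abundant relative to the denominator taxa in that sample, and a negative value means the reverse, so a significant "samplegroup" coefficient would describe a shift in this ratio rather than in any single taxon.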
Thank you in advance; any perspective on this output would be greatly appreciated.