Hello,

I’m currently exploring a microbiome dataset with gneiss and was wondering if I could get some advice regarding my output.

The dataset consists of 19 samples (15 in group “A” and only 4 in group “B”) so I assumed that it would be difficult to tease apart any inter-group variation. I wanted to explore preliminary differences before additional sequencing was performed to increase sample sizes.

I began by running gneiss ols-regression and dendrogram-heatmap. The model summarized in “regression_summary.qzv” was fit to 19 samples with 3 covariates, which violates the one in ten rule (roughly 10 samples per covariate), but pred_mse is consistently lower than model_mse. Can I therefore assume that my model is not overfitting? Also, if I am interested in taxa that differ significantly with respect to the “samplegroup” covariate, can I focus only on the “y8” balance, since it is the only balance with a significant corrected p-value for “samplegroup” in the Regression Coefficients Summary?
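To make the overfitting question concrete, here is a minimal sketch of the comparison I have in mind. It is not how gneiss computes model_mse or pred_mse (I am not sure of that). It uses synthetic noise rather than my data, a single binary covariate instead of my three, and the function names are my own. With only a group covariate, the OLS prediction for a sample is simply its group mean.

```python
import random
from statistics import mean

def in_sample_mse(values, groups):
    """MSE when every sample is predicted by its own group's mean (all samples used to fit)."""
    group_means = {g: mean(v for v, h in zip(values, groups) if h == g)
                   for g in set(groups)}
    return mean((v - group_means[g]) ** 2 for v, g in zip(values, groups))

def leave_one_out_mse(values, groups):
    """MSE when each sample is predicted by its group's mean computed without that sample."""
    errors = []
    for i, (v, g) in enumerate(zip(values, groups)):
        others = [w for j, (w, h) in enumerate(zip(values, groups))
                  if j != i and h == g]
        errors.append((v - mean(others)) ** 2)
    return mean(errors)

# Synthetic stand-in for one balance: pure noise, same group sizes as my dataset.
random.seed(0)
groups = ["A"] * 15 + ["B"] * 4
values = [random.gauss(0.0, 1.0) for _ in groups]

# One in ten rule: 19 samples over 3 covariates is well under 10 per covariate.
samples_per_covariate = 19 / 3

print(f"samples per covariate: {samples_per_covariate:.1f}")
print(f"in-sample MSE:         {in_sample_mse(values, groups):.3f}")
print(f"leave-one-out MSE:     {leave_one_out_mse(values, groups):.3f}")
```

In this simplified setting the held-out error can never be smaller than the in-sample error, which is part of why a pred_mse below model_mse surprised me. I suspect the two values in the gneiss summary are not computed on directly comparable scales, but I would appreciate confirmation.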

The gneiss balance-taxonomy output for “y8” and “samplegroup” (taxa.qzv) contains a list of numerator and denominator taxa. Is it acceptable to say that the balance between these two sets of taxa differs significantly between groups A and B, based on the corrected p-value reported for “y8” in the regression summary file?
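For reference, this is my understanding of what a single balance such as “y8” measures: a scaled log ratio of the geometric means of the numerator and denominator taxa. The sketch below is my own illustration with made-up proportions, not gneiss code, and it assumes all values are strictly positive (for example after adding a pseudocount).

```python
import math

def balance(numerator, denominator):
    """Isometric log-ratio balance between two groups of strictly positive abundances."""
    r, s = len(numerator), len(denominator)
    # Difference of mean logs equals the log of the ratio of geometric means.
    log_ratio = (sum(math.log(x) for x in numerator) / r
                 - sum(math.log(x) for x in denominator) / s)
    return math.sqrt(r * s / (r + s)) * log_ratio

# Hypothetical proportions for one sample: 2 numerator taxa, 3 denominator taxa.
numerator_taxa = [0.30, 0.20]
denominator_taxa = [0.25, 0.15, 0.10]
print(balance(numerator_taxa, denominator_taxa))
```

If that reading is right, a significant “samplegroup” coefficient for “y8” would describe a shift in this log ratio as a whole, not a change in any individual taxon in either list.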

Thank you in advance; any perspective on this output would be greatly appreciated.

regression_summary.qzv (118.0 KB)

heatmap.qzv (69.1 KB)

taxa.qzv (96.9 KB)