A couple of questions about Gneiss and model overfitting

mstagliamonte · August 6, 2019, 7:20pm

Dear Qiimers,

I am taking my first steps with Gneiss and trying to get more familiar with the results. I could use some help interpreting the output:

1. Looking at the model summary, mse vs pred_mse: as I understand it, we are comparing the mse of the model on the training dataset (a random 90% of the data) with the mse on the test dataset (the remaining 10% of the data). If the model is overfit, the prediction error (pred_mse) will be larger than the training error (mse). My question is: how good is good enough? I.e., should pred_mse be on the order of 1/10th of mse? What if the two values are about the same? Is there a ratio between the two that can be used as a rule of thumb to judge over/underfitting?
2. Comparison between the two plots, "projected predictions" and "projected residuals": in the first, I can eyeball whether the predicted values are a reasonable representation of the real data; in the second, by comparison with the first, I can check whether the residuals are of the same order of magnitude as the predictions (= not good; that would mean large random error and poor predictive value of my model, as appears to be the case for the tutorial dataset, where R-squared is ~0.11). Is my interpretation correct?

Thank you for your kind attention,
Max

mortonjt · August 7, 2019, 1:58pm

pred_mse just needs to be equal to or smaller than mse. Otherwise there is a good chance of overfitting.
Remember you are predicting the abundances of an entire microbiome community - so that R^2 means that you are able to explain 11% of all of the variance in that community.
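
To make the comparison concrete, here is a minimal sketch of the check on simulated data. It is not Gneiss's own code: it assumes a plain least-squares fit with NumPy, a single 90/10 random split as described in the question, and made-up names (`fit_ols`, `mse`, `r_squared`, `train_mse`) and data. It only illustrates how a training mse, a held-out pred_mse, and an R^2 pooled over all balances can be put side by side.

```python
import numpy as np


def fit_ols(X, Y):
    """Least-squares coefficients for Y ~ X (X already has an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta


def mse(X, Y, beta):
    """Mean squared error over all samples and all response columns."""
    resid = Y - X @ beta
    return float(np.mean(resid ** 2))


def r_squared(X, Y, beta):
    """Fraction of total variance explained, pooled over all response columns."""
    resid = Y - X @ beta
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((Y - Y.mean(axis=0)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Simulated data (an assumption for illustration): one covariate, five "balances".
rng = np.random.default_rng(0)
n_samples, n_balances = 200, 5
covariate = rng.normal(size=n_samples)
X = np.column_stack([np.ones(n_samples), covariate])
true_beta = rng.normal(size=(2, n_balances))
Y = X @ true_beta + rng.normal(scale=2.0, size=(n_samples, n_balances))

# Random 90/10 split into training and held-out samples.
idx = rng.permutation(n_samples)
n_test = n_samples // 10
test_idx, train_idx = idx[:n_test], idx[n_test:]

# Fit on the training samples only, then score both subsets.
beta = fit_ols(X[train_idx], Y[train_idx])
train_mse = mse(X[train_idx], Y[train_idx], beta)
pred_mse = mse(X[test_idx], Y[test_idx], beta)
r2 = r_squared(X[train_idx], Y[train_idx], beta)

print(f"mse (train)    = {train_mse:.3f}")
print(f"pred_mse (test) = {pred_mse:.3f}")
print(f"pred_mse / mse  = {pred_mse / train_mse:.2f}")
print(f"R^2 (train)     = {r2:.3f}")
```

With a single small held-out set the ratio is noisy, so a pred_mse slightly above mse on one split is not strong evidence by itself; a pred_mse far above mse is the pattern to worry about.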

mstagliamonte · August 7, 2019, 2:19pm

Thank you, @mortonjt, for your kind answer. Just to clarify no. 2: is my interpretation of the two scatter plots correct? Please let me know if my question is unclear and I need to add further details.

Bear with me, I'm a noob

mortonjt · August 9, 2019, 2:05am

Yes, your interpretation for question 1 is correct
