Hi @Tohseef,
This is a big question! Let's try to break it down. You may also find it useful to review linear regression and interpretation if it's been a while (a good, basic stats book might be helpful), or to talk to a statistician. I've also found this post on Towards Data Science and the parameters section of this post about Stata results helpful for interpretation, but a lot of your questions come down to regression modeling basics, and there are a lot of really smart people who have invested huge amounts of time into building material that teaches these things.
I think the shaded area may be the standard deviation as opposed to the variation, but yes, this is correct.
I'm going to refer you to this article on residual plots, generally. Once again, you have your mean as the solid line, the standard deviation as the shaded area, and the actual values.
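If it ever helps to rebuild that kind of figure yourself, here's a rough matplotlib sketch of a fitted mean with a ±1 SD band over the observed values. The arrays are made-up stand-ins, not your data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up stand-ins for your fitted values and observations.
time = np.arange(10)
fitted_mean = 0.1 * time                    # solid line: mean prediction
fitted_sd = np.full_like(fitted_mean, 0.2)  # shaded band: +/- 1 SD
observed = fitted_mean + np.random.default_rng(0).normal(scale=0.2, size=time.size)

plt.plot(time, fitted_mean, color="black", label="mean")
plt.fill_between(time, fitted_mean - fitted_sd, fitted_mean + fitted_sd,
                 alpha=0.3, label="± 1 SD")
plt.scatter(time, observed, s=20, label="observed")
plt.xlabel("time")
plt.ylabel("distance")
plt.legend()
plt.show()
```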
Some of this has to do with the way the model is coded here. You could explore the documentation of this particular function to get better control over the modeling.
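For what it's worth, if you do want that finer control, the same kind of model can be fit directly with statsmodels' formula interface. This is just a minimal sketch; the column names (`distance`, `genotype`, `donor`, `mouse_id`), the metadata file, and the level names like `'susceptible'` and `'hc'` are all assumptions about how your data are laid out, so adjust them to match your own table:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical metadata table; the column names are assumptions.
df = pd.read_csv("metadata.tsv", sep="\t")

# Treatment() lets you choose the reference level yourself instead of
# relying on the default (first level alphabetically).
model = smf.mixedlm(
    "distance ~ C(genotype, Treatment(reference='susceptible'))"
    " * C(donor, Treatment(reference='hc'))",
    data=df,
    groups=df["mouse_id"],   # random intercept for each mouse
)
result = model.fit()
print(result.summary())
```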
For this model, and most categorical regression models, you're working against a reference group. Your reference group here is a susceptible mouse from the healthy donor at some theoretical time 0. So, on average, there's a within-group distance of 0.248 [95% CI 0.101, 0.396].
Then, the `genotype[T.wild type]` term asks how much the intercept of your line changes if you have a wild type mouse compared to a susceptible mouse, holding everything else constant. (Your distance increases by 0.265 [95% CI 0.047, 0.465], which is significant.)
Then, the interaction term, `genotype[T.wild type]:donor[T.pd-1]`, tells us how the slope changes when you've got the `pd-1` donor rather than the one we held constant. Here, we find a change of -0.425 distance units [95% CI -0.723, -0.126] relative to the `genotype[T.wild type]` term alone (which is basically `genotype[T.wild type]:donor[T.hc-1]`, because of the way we hold things constant).
This pattern expands out across the rest of the terms in the same way.
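For example, using the numbers above (and a placeholder for the `donor[T.pd-1]` main effect, since I don't have that row in front of me), the model's prediction for any mouse is just the sum of every coefficient that applies to it:

```python
# Coefficients from your summary table. The donor main effect is a
# placeholder -- substitute the donor[T.pd-1] value from your own output.
intercept = 0.248           # susceptible mouse, healthy donor (reference group)
genotype_wild_type = 0.265  # genotype[T.wild type]
donor_pd1 = 0.0             # donor[T.pd-1] (placeholder!)
interaction = -0.425        # genotype[T.wild type]:donor[T.pd-1]

# Predicted distance for a wild type mouse with a pd-1 donor:
# every term that "switches on" for that mouse gets added in.
predicted = intercept + genotype_wild_type + donor_pd1 + interaction
print(predicted)
```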
Now that we've talked through the terms and you've got some resources, I'm going to bounce this one back to you.
In your table, you have
Values | Definition |
---|---|
Coef | The coefficient: the slope (for continuous variables) or the shift in intercept (for categorical ones) that describes the difference in the value. |
Std.Err. | The standard error of that coefficient estimate. |
z | The parametric test statistic used to calculate your p-value. You may want to look at t-tests and F-tests for a better sense of this (although the z-statistic uses a different distribution). |
P>\|z\| | The frequentist p-value. It is essentially the probability of seeing a z-value at least as extreme as the one associated with your data if the null hypothesis were true; you compare it against a critical level (e.g. 0.05) to call a result significant. |
[0.025, 0.975] | The lower and upper limits of the 95% confidence interval, which are useful for describing the estimate and the error around your results. Look for info on effect size representation for more on why this matters and why you should probably be presenting it. |
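And if you'd rather pull those numbers out programmatically than read them off the printed summary, each of those columns maps onto an attribute of a fitted statsmodels results object (continuing from the sketch earlier, where `result` is the fitted model; this is illustrative, not specific to your output):

```python
coefs = result.params     # Coef
stderrs = result.bse      # Std.Err.
zvals = result.tvalues    # z (statsmodels calls these tvalues even when they're z)
pvals = result.pvalues    # P>|z|
ci = result.conf_int()    # [0.025, 0.975]

print(pd.concat([coefs, stderrs, zvals, pvals, ci], axis=1))
```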
Best,
Justine