Hi, I'm going through the Parkinson's tutorial and have reached the last part, where a linear mixed-effects model is applied. As I understand it, in this context the model tests whether donor and genotype are related to changes in the fecal community (beta diversity) over time, with day 7 as the initial baseline time point. This is the command:
qiime longitudinal linear-mixed-effects \
  --m-metadata-file ./metadata.tsv \
  --m-metadata-file ./from_first_unifrac.qza \
  --p-metric Distance \
  --p-state-column days_post_transplant \
  --p-individual-id-column mouse_id \
  --p-group-columns genotype,donor \
  --o-visualization ./from_first_unifrac_lme.qzv
I just need a little help interpreting the results (attached).
Are these regression scatterplots essentially showing the variation in the fecal community in susceptible vs. wild-type mice and in hc (healthy) vs. pd (Parkinson's) mice? My reading is that the solid lines are the group means of diversity, the data points are the beta diversity of individual samples, and the shaded area is the average overall variation of each group at specific time points. Is that right?
I'm not sure how to interpret the projected residual plot either. Is it like an ANOVA plot, i.e. is it looking at variation in the data between different groups? I've read that the data points should be roughly centered around 0: points below 0 indicate lower variation than the group's mean variation, points above 0 indicate higher variation, and if the points are not centered then it is a poor model (I'm not really sure what "poor model" means either). If this is correct, why is it useful to know, and how would you integrate it into an analysis?
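To check that I have the basic idea of a residual right, here is a minimal Python sketch. The numbers are made up for illustration and are not from my data, and the variable names are my own. It assumes a residual is simply the observed value minus the value the model predicts, so a point above 0 is a sample the model under-predicts and a point below 0 is one it over-predicts.

```python
# Hypothetical observed distances and model predictions (illustrative values only).
observed = [0.42, 0.55, 0.61, 0.48, 0.70]
predicted = [0.45, 0.52, 0.60, 0.50, 0.66]

# Residual = observed - predicted, one value per sample.
residuals = [round(o - p, 2) for o, p in zip(observed, predicted)]

# If the model fits reasonably well, the residuals should scatter around 0.
mean_residual = sum(residuals) / len(residuals)

print(residuals)
print(round(mean_residual, 3))
```

If that is the right picture, I assume a residual plot that drifts away from 0 or fans out would be the sign of a "poor model", but please correct me if not.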
I'm also a little confused by the model results. My understanding is that genotype acts as the independent variable, and that donor and distance (days_post_transplant) are tested against genotype to determine whether there is a relationship between these dependent variables and genotype in changing the beta diversity of the feces. However, not all combinations of metadata categories are explored: for example, genotype[wild type]:donor[pd_1] is examined, but not the susceptible genotype with a wild-type donor, and so on.
For example, one of the questions in the tutorial asks: is there a significant association between genotype and temporal change?
Looking at days_post_transplant[T.wild-type], the P value is <0.05, indicating that there is. But only wild-type is considered and not susceptible, so how would I answer these types of questions?
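My tentative guess, which I would like confirmed, is that the categorical columns are dummy (treatment) coded, so one level of each column becomes the reference and never gets its own row in the results. The sketch below is only my assumption about how that works, written in plain Python with a hypothetical helper `dummy_code`; it is not the code QIIME 2 runs.

```python
# Levels of one categorical column, sorted alphabetically.
levels = sorted(["wild type", "susceptible"])

# Assumption: under treatment coding the first level is the reference.
# It gets no column of its own and is absorbed into the intercept.
reference, *others = levels

def dummy_code(value):
    """Return one 0/1 indicator per non-reference level (hypothetical helper)."""
    return {f"genotype[T.{lvl}]": int(value == lvl) for lvl in others}

print(reference)
print(dummy_code("wild type"))
print(dummy_code("susceptible"))
```

If this is right, a term labeled with wild type would describe the difference between wild type and susceptible, rather than wild type on its own.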
Lastly, I'm assuming I just look at the P value to determine significance, but are any of the other values important, e.g. the z value and the 0.025 and 0.975 columns? What do they represent?
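For reference, this is how I currently think those columns relate, assuming the table follows the usual regression layout of coefficient, standard error, z, P value, and the bounds of a 95% confidence interval. The coefficient and standard error below are hypothetical, not taken from my results.

```python
from statistics import NormalDist

# Hypothetical coefficient and standard error (illustrative values only).
coef = 0.05
std_err = 0.02

# z statistic: how many standard errors the estimate lies from 0.
z = coef / std_err

# Two-sided P value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence interval; I assume 0.025 and 0.975 are its lower and upper bounds.
z_crit = NormalDist().inv_cdf(0.975)
ci_low = coef - z_crit * std_err
ci_high = coef + z_crit * std_err

print(f"z={z:.2f} p={p_value:.4f} CI=[{ci_low:.4f}, {ci_high:.4f}]")
```

On that reading, a P value below 0.05 and a confidence interval that excludes 0 would be two views of the same result, but I would appreciate confirmation.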
Any advice is much appreciated. Thank you!