Hey everyone, it's John again
I'm processing the gut microbiome data from the Mars500 experiment to see how the participants' gut microbiomes change during the isolation. Since this is time-series data, I've been using the regular alpha and beta diversity tests along with the volatility plots in the q2-longitudinal plug-in.
I gathered the data from the paper "Reanalysis of the Mars500 experiment reveals common gut microbiome alterations in astronauts induced by long-duration confinement", which looked at the same data but also included the habitat microbiome data. I'm more interested in how the gut microbiome changes over time under the isolation the participants are experiencing.
The paper only compared the gut and habitat samples from the beginning and end of the isolation part of the experiment. I included the samples in between to see the full timeline, so I created a column titled time_group with five groupings based on the day the sample was taken.
- Pre: days 0 to 1, with samples taken at 2 time points.
- Early: days 7 to 45, with samples taken at 5 time points.
- Mid: days 60 to 390, with samples taken at 13 time points.
- Late: days 420 to 520, with samples taken at 5 time points.
- After: days 530 to 700, with samples taken at 3 time points.
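The grouping above can be sketched in Python as follows. This is only an illustration: the helper name time_group() and the handling of days that fall in the gaps between groups are assumptions, and it is not necessarily how the column in the metadata file was produced.

```python
from bisect import bisect_left

# Inclusive upper day bound of each group, taken from the list above.
# Assumption: a day falling in a gap between groups (e.g. day 50) is assigned
# to the next group up; no samples were actually taken on such days.
UPPER_BOUNDS = [1, 45, 390, 520, 700]
LABELS = ["Pre", "Early", "Mid", "Late", "After"]


def time_group(day):
    """Return the time_group label for a sampling day (hypothetical helper)."""
    if day < 0 or day > UPPER_BOUNDS[-1]:
        raise ValueError(f"day {day} is outside the 0-700 range")
    # bisect_left finds the first bound that is >= day.
    return LABELS[bisect_left(UPPER_BOUNDS, day)]


for day in (0, 7, 60, 420, 530):
    print(day, time_group(day))
```

Because every time_group value is fully determined by time_day, the two columns carry overlapping information about time.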
Metadata: sample-metadata.tsv (25.7 KB)
I ran my tests and am now validating what I saw in the volatility plots. From what I've read, the way to test the trends seen in the volatility plots is the linear mixed-effects (LME) model in the q2-longitudinal plug-in.
Shannon_volatility plot: shannon_volatility.qzv (476.3 KB)
I'm currently running the LME on the Shannon diversity vector. I created two LME models and I'm trying to figure out why they tell me two different things. The first uses only the state column: shannon-LME-time.qzv (406.5 KB). I made it because I just want to see whether the Shannon scores change over time, which would help show whether the prolonged isolation had any effect on them. The function call is below.
qiime longitudinal linear-mixed-effects \
--m-metadata-file sample-metadata.tsv \
--m-metadata-file Analysis/shannon_vector.qza \
--p-metric shannon_entropy \
--p-individual-id-column Characteristics.Subject. \
--p-state-column time_day \
--o-visualization LME/shannon-LME-time.qzv
In the model results, the p-value for the state column is significant, which I interpret to mean that the changes in the Shannon scores were due to time.
I then wanted to see whether including the time_group column as an independent variable would change anything in the LME output. So I changed the above command and got this LME: shannon-LME-grp.qzv (449.9 KB). The function call is below.
qiime longitudinal linear-mixed-effects \
--m-metadata-file sample-metadata.tsv \
--m-metadata-file Analysis/shannon_vector.qza \
--p-metric shannon_entropy \
--p-individual-id-column Characteristics.Subject. \
--p-state-column time_day \
--p-group-columns time_group \
--o-visualization LME/shannon-LME-grp.qzv
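For reference, here is a rough sketch of the fixed-effects formula I believe each call ends up fitting. It rests on an assumption I have not confirmed from the plug-in source: that when no custom formula is supplied, the state column and group columns are joined with `*`, so interaction terms are included. The function build_formula() is a hypothetical helper for illustration, not part of q2-longitudinal.

```python
def build_formula(metric, state_column, group_columns=None):
    """Assemble an R-style fixed-effects formula (hypothetical helper)."""
    fixed_effects = [state_column] + list(group_columns or [])
    # In R-style formulas, "a * b" expands to a + b + a:b
    # (both main effects plus their interaction).
    return f"{metric} ~ {' * '.join(fixed_effects)}"


# First call: state column only.
print(build_formula("shannon_entropy", "time_day"))
# Second call: state column plus one group column.
print(build_formula("shannon_entropy", "time_day", ["time_group"]))
```

If that assumption holds, the second model estimates a separate term for each non-reference time_group level plus a time_day interaction for each of them, rather than a single slope for time_day.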
The results from this LME show that neither the state column nor the group variable had a significant effect on the Shannon diversity scores. My interpretation of this LME is that the change in the Shannon scores wasn't due to time.
From these results I have 3 questions.
- Why would including the group variable change whether the state column is significant?
- Should I use the time_group variable at all? I'm rethinking using this group variable in the LME, since the state column is already time and including time a second time doesn't make much sense to me.
- Would using only the metric and state-column parameters for the LME show whether the changes in the metric over time were due to time?
Thank you all for your time and help.