Hello,
I’m struggling to make sense of the first-distances output. Both the tutorial and previous posts on the topic have been very helpful for understanding how the method works, but I haven’t found an answer to my particular question, which is perhaps very naïve.
Briefly, I have samples collected from the same individuals at pre-treatment (0) and at 1, 5, 9 and 12 months post-treatment. I ran first-distances to calculate the distances between successive samples from the same subject, so that I can analyze longitudinal changes in beta diversity with linear mixed-effects (LME) models. My problem is that when I run the volatility analysis to visualize the first-distances output, it only plots data for time points 1, 5 and 9 months post-treatment (see picture below). If first-distances calculates the magnitude of change between successive time points (ΔYt = Yt − Yt-1), shouldn’t I also be getting values for 12 months, i.e., ΔY12 = Y12 − Y9? Indeed, when I run first-distances with the baseline set to pre-treatment (0), using the same metadata as above, I get values for time point 12, which I take to be ΔY12 = Y12 − Y0.
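To make my expectation concrete, here is a minimal Python sketch of how I understand the two modes. It is not QIIME 2's actual implementation: the distance values are made up for a single hypothetical patient, the function names are my own, and keying each distance by the later time point is my assumption about how the output is labelled.

```python
# Toy illustration only: the distances are invented and this is NOT how
# q2-longitudinal is implemented; it only encodes my reading of the method.
timepoints = [0, 1, 5, 9, 12]

# Hypothetical Bray-Curtis distances for one patient,
# keyed by (earlier time point, later time point).
toy_distances = {
    (0, 1): 0.42, (0, 5): 0.55, (0, 9): 0.61, (0, 12): 0.58,
    (1, 5): 0.30, (5, 9): 0.25, (9, 12): 0.20,
}


def successive_distances(states, distances):
    """Distance between each time point and the one before it,
    keyed by the later time point (assumed labelling)."""
    return {later: distances[(earlier, later)]
            for earlier, later in zip(states, states[1:])}


def baseline_distances(states, distances, baseline):
    """Distance between each non-baseline time point and the baseline.
    Assumes the baseline is the earliest time point, as in my data."""
    return {t: distances[(baseline, t)] for t in states if t != baseline}


successive = successive_distances(timepoints, toy_distances)
from_baseline = baseline_distances(timepoints, toy_distances, baseline=0)

print(sorted(successive))     # time points that receive a successive distance
print(sorted(from_baseline))  # time points that receive a baseline distance
```

Under these assumptions, both modes yield a value at 1, 5, 9 and 12 months, which is why I expected month 12 to appear in the plot without a baseline as well.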
Commands used to run and visualize first-distances (no baseline):
qiime longitudinal first-distances \
--i-distance-matrix bray_curtis_distance_matrix.qza \
--m-metadata-file sample-metadata-all-filtered.tsv \
--p-state-column Timepoint \
--p-individual-id-column Patient \
--p-replicate-handling random \
--o-first-distances first-distances-all-filtered.qza
qiime longitudinal volatility \
--m-metadata-file sample-metadata-all-filtered.tsv \
--m-metadata-file first-distances-all-filtered.qza \
--p-default-metric Distance \
--p-default-group-column treatment \
--p-state-column Timepoint \
--p-individual-id-column Patient \
--o-visualization volatility-distance-all-filtered.qzv
Volatility plot:
Commands used to run and visualize first-distances (baseline):
qiime longitudinal first-distances \
--i-distance-matrix bray_curtis_distance_matrix.qza \
--m-metadata-file sample-metadata-all-filtered.tsv \
--p-state-column Timepoint \
--p-individual-id-column Patient \
--p-replicate-handling random \
--p-baseline 0 \
--o-first-distances first-distances-baseline-0-all-filtered.qza
qiime longitudinal volatility \
--m-metadata-file sample-metadata-all-filtered.tsv \
--m-metadata-file first-distances-baseline-0-all-filtered.qza \
--p-default-metric Distance \
--p-default-group-column treatment \
--p-state-column Timepoint \
--p-individual-id-column Patient \
--o-visualization volatility-distance-baseline-0-all-filtered.qzv
Volatility plot:
On a different but related note, when I apply LME to the first-distances output, both with and without a baseline, I see significant changes by time point. How can I then find out which pairs of time points differ significantly? I’ve tried PERMANOVA using the beta-group-significance command with the --pairwise parameter, but it didn’t show significant differences.
Many thanks in advance for your time and help!