Help with first-distances

AlbaCC · January 28, 2021, 11:08am

Hello,

I’m struggling to make sense of the first-distances output. Both the tutorial and previous posts on the topic have been very helpful for understanding how the method works, but I haven’t found an answer to my particular question, which is perhaps very naïve.
Briefly, I have samples collected from the same individuals at pre-treatment (0), and at 1, 5, 9 and 12 months post treatment. I applied first-distances to calculate distances between successive samples collected from the same subjects, in order to use them to analyze longitudinal changes in beta diversity by linear-mixed-effects (LME). My problem is that when I run the volatility analysis to visualize the output of first-distances, it only plots data for time points 1, 5 and 9 months post-treatment (see picture below). If first-distances calculates the magnitude of change between successive time points (ΔYt=Yt−Yt-1), shouldn’t I be getting values also for 12 months? i.e., ΔY12=Y12−Y9. Indeed, when I run first-distances with baseline set at pre-treatment (0) (using the same metadata files as above) I get values for time point 12, which I take are ΔY12=Y12−Y0.
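To make the two modes concrete, here is a minimal sketch of which sample pairs I would expect to be compared in each case. This only illustrates the pairing logic as I understand it from the tutorial; it is not the q2-longitudinal implementation, and the function name `pair_states` and the sample IDs are hypothetical.

```python
from collections import defaultdict


def pair_states(samples, baseline=None):
    """Return (subject, timepoint, sample_id, reference_sample_id) tuples.

    samples: iterable of (sample_id, subject, timepoint).
    baseline=None pairs each sample with the previous timepoint of the
    same subject; otherwise each sample is paired with the baseline sample.
    """
    by_subject = defaultdict(dict)
    for sample_id, subject, timepoint in samples:
        by_subject[subject][timepoint] = sample_id

    pairs = []
    for subject, states in by_subject.items():
        ordered = sorted(states)
        for i, t in enumerate(ordered):
            if baseline is None:
                if i == 0:
                    continue  # first timepoint has no predecessor
                ref_t = ordered[i - 1]
            else:
                if t == baseline or baseline not in states:
                    continue  # nothing to compare against
                ref_t = baseline
            pairs.append((subject, t, states[t], states[ref_t]))
    return pairs


# Hypothetical subject sampled at all five timepoints.
samples = [(f"s{t}", "P1", t) for t in (0, 1, 5, 9, 12)]
successive = pair_states(samples)
to_baseline = pair_states(samples, baseline=0)
print([(t, ref) for _, t, _, ref in successive])
print([(t, ref) for _, t, _, ref in to_baseline])
```

In this sketch, month 12 is paired with month 9 when no baseline is set and with month 0 when the baseline is 0, which is why I expected a value at month 12 in both plots.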

Commands used to run and visualize first-distances (no baseline):

qiime longitudinal first-distances \
  --i-distance-matrix bray_curtis_distance_matrix.qza \
  --m-metadata-file sample-metadata-all-filtered.tsv \
  --p-state-column Timepoint \
  --p-individual-id-column Patient \
  --p-replicate-handling random \
  --o-first-distances first-distances-all-filtered.qza

qiime longitudinal volatility \
  --m-metadata-file sample-metadata-all-filtered.tsv \
  --m-metadata-file first-distances-all-filtered.qza \
  --p-default-metric Distance \
  --p-default-group-column treatment \
  --p-state-column Timepoint \
  --p-individual-id-column Patient \
  --o-visualization volatility-distance-all-filtered.qzv

Volatility plot:

Commands used to run and visualize first-distances (baseline):

qiime longitudinal first-distances \
  --i-distance-matrix bray_curtis_distance_matrix.qza \
  --m-metadata-file sample-metadata-all-filtered.tsv \
  --p-state-column Timepoint \
  --p-individual-id-column Patient \
  --p-replicate-handling random \
  --p-baseline 0 \
  --o-first-distances first-distances-baseline-0-all-filtered.qza

qiime longitudinal volatility \
  --m-metadata-file sample-metadata-all-filtered.tsv \
  --m-metadata-file first-distances-baseline-0-all-filtered.qza \
  --p-default-metric Distance \
  --p-default-group-column treatment \
  --p-state-column Timepoint \
  --p-individual-id-column Patient \
  --o-visualization volatility-distance-baseline-0-all-filtered.qzv

Volatility plot:

On a different but related note, when I apply LME to the first-distances output, both with and without a baseline, I see significant changes by time point. How can I then see which pairs of time points differ significantly? I’ve tried PERMANOVA using the beta-group-significance command with the --pairwise parameter, but it didn’t show significant differences.

Many thanks in advance for your time and help!

Nicholas_Bokulich · January 30, 2021, 12:24pm

Hi @AlbaCC,

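That's correct: you should also be getting a value for month 12 (the distance between the month 12 and month 9 samples). I recommend checking the metadata to make sure that the individual IDs are consistent between months 9 and 12. An inconsistency there might explain why the 12-month samples are dropping out.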

You are welcome to share your metadata if you want us to take a look.
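If you want to run that check yourself first, here is a minimal sketch using only the Python standard library. It assumes a tab-separated metadata file with the `Patient` and `Timepoint` columns from your commands; the helper names, the demo rows, and the expected timepoint labels are hypothetical, so adjust them to your file.

```python
import csv
import io
from collections import defaultdict


def timepoints_by_patient(tsv_text, id_col="Patient", state_col="Timepoint"):
    """Map each patient ID (kept exactly as written) to its timepoints."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    seen = defaultdict(set)
    for row in reader:
        first_value = next(iter(row.values()))
        if first_value.startswith("#q2:"):
            continue  # skip an optional QIIME 2 column-type directive row
        # IDs are deliberately not stripped, so "P2 " and "P2" stay distinct.
        seen[row[id_col]].add(row[state_col].strip())
    return dict(seen)


def report_missing(seen, expected=("0", "1", "5", "9", "12")):
    """Return {patient_id: [missing timepoints]} for incomplete patients."""
    report = {}
    for patient, states in seen.items():
        missing = [t for t in expected if t not in states]
        if missing:
            report[patient] = missing
    return report


# Hypothetical metadata: P2's month-12 row has a trailing space in the ID.
demo = (
    "sample-id\tPatient\tTimepoint\n"
    "s1\tP1\t0\ns2\tP1\t1\ns3\tP1\t5\ns4\tP1\t9\ns5\tP1\t12\n"
    "s6\tP2\t0\ns7\tP2\t1\ns8\tP2\t5\ns9\tP2\t9\ns10\tP2 \t12\n"
)
for patient, missing in report_missing(timepoints_by_patient(demo)).items():
    print(repr(patient), "is missing timepoints", missing)
```

To run it on the real file, pass `open("sample-metadata-all-filtered.tsv").read()` instead of `demo`. Printing the IDs with `repr` makes stray whitespace or case differences visible; an ID that appears only at month 12 is the kind of inconsistency to look for.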

A significant effect of time point in the LME means that the rate of change (first distances) differs over time. It does not mean that a particular timepoint differs between groups. A group*time interaction would indicate that the slope of the rate of change differs between groups, but not that the groups differ at a specific timepoint.
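As a rough sketch of what such a model is testing (the notation here is only illustrative, assuming a random intercept per subject and time treated as a continuous covariate; the exact formula depends on how you specified your LME):

```latex
D_{it} = \beta_0 + \beta_1\, t + \beta_2\, g_i + \beta_3\, (g_i \times t) + u_i + \varepsilon_{it}
```

Here $D_{it}$ is the first distance for subject $i$ at time $t$, $g_i$ is the subject's group, $u_i$ is the subject-level random intercept, and $\varepsilon_{it}$ is the residual. A significant $\beta_1$ says that first distances trend with time; a significant $\beta_3$ says that this trend differs between groups. Neither coefficient is a pairwise comparison between two specific timepoints.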

Let us know if you have more questions!
