First-differences mis-handling of unequal time

Hi all,

New QIIME 2 user here. I'm using the q2-longitudinal plugin to run a first-differences analysis on my Shannon diversity data over time. My time variable contains sampling time points that are unequally spaced. As a result, I get only a couple of correctly calculated first-difference values instead of first-difference values for all samples. If I use a dummy variable of equally spaced time points, I have no such problem. There are no error messages.

Code to recreate the problem:

qiime longitudinal first-differences \
  --m-metadata-file mapping_RS.txt \
  --m-metadata-file reimport_shannon.qza \
  --p-state-column Days-Post-Trach \
  --p-metric shannon \
  --p-individual-id-column Participant-ID \
  --p-replicate-handling random \
  --o-first-differences FirstDifferences_days.qza

I'm attaching the files needed to recreate the problem.

mapping_RS.txt (9.9 KB)

reimport_shannon.qza (16.7 KB)

The output below, produced with my dummy time variable, is fine.

FirstDifferences_days.tsv (1.9 KB)

This is the output with my actual time variable, which does not work correctly.

FirstDifferences.tsv (157 Bytes)

If you can help me get first differences calculated properly for all samples, I'd be grateful. Thanks!


Hi @rsteuart,

Welcome to the forum and thanks for opening this question!

As far as I can tell from your description, this is working as intended. The first-differences action looks at all possible time point values in the dataset to determine which samples have gaps, and it skips those intervals. Otherwise, first differences would be measured at every time point, even when there are missing samples (see some discussion and an example here).
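To make that concrete, here is a rough Python sketch of the behaviour described above. It is not the plugin's actual code, only an illustration under the assumption that a difference is computed solely between adjacent values in the dataset-wide list of states. The function name and the data are made up for this example.

```python
def first_differences(samples):
    """Illustrative only. samples is a list of (individual, state, value)."""
    # Every state value seen anywhere in the dataset, in sorted order.
    states = sorted({state for _, state, _ in samples})
    lookup = {(ind, state): value for ind, state, value in samples}
    individuals = sorted({ind for ind, _, _ in samples})

    diffs = []
    for prev, curr in zip(states, states[1:]):
        for ind in individuals:
            # A difference exists only if this individual was sampled at
            # BOTH adjacent states; otherwise the interval counts as a gap.
            if (ind, prev) in lookup and (ind, curr) in lookup:
                diff = lookup[(ind, curr)] - lookup[(ind, prev)]
                diffs.append((ind, curr, diff))
    return diffs


# Hypothetical data: two participants sampled roughly weekly on different days.
uneven = [("A", 0, 3.0), ("A", 7, 3.5), ("A", 15, 3.2),
          ("B", 0, 2.8), ("B", 8, 3.1), ("B", 14, 3.0)]

# The same samples, with the day replaced by the sample order.
ordered = [("A", 0, 3.0), ("A", 1, 3.5), ("A", 2, 3.2),
           ("B", 0, 2.8), ("B", 1, 3.1), ("B", 2, 3.0)]

print(len(first_differences(uneven)))   # few differences survive
print(len(first_differences(ordered)))  # one per consecutive pair of samples
```

With the uneven days, the dataset-wide states are 0, 7, 8, 14 and 15, so most adjacent pairs are never both present for the same participant and those intervals are treated as gaps.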

So the solution is what you have already found: when you have unevenly sampled time points, you can use a dummy variable to make them consistent. In your study, where samples were collected at roughly weekly intervals but the timing is not exactly uniform, you can use a dummy variable such as time point "0", "1", "2", etc. to indicate the sample order instead of the day post-treatment. I know this is not the most convenient option, but it also gives you better control over which intervals are counted and which are skipped.
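If you would rather not build the dummy column by hand, something like the following Python sketch could generate a first draft. It assumes the metadata rows are already loaded as dictionaries (for example with `csv.DictReader` and `delimiter="\t"`), that every row has a numeric value in the time column, and that `Time-Order` is just a placeholder name for the new column. The column names are taken from the command in the question.

```python
from collections import defaultdict


def add_time_order(rows, id_col="Participant-ID",
                   time_col="Days-Post-Trach", new_col="Time-Order"):
    """Return copies of rows with a per-participant sample-order column."""
    # Collect the distinct sampling days for each participant.
    days_by_individual = defaultdict(set)
    for row in rows:
        days_by_individual[row[id_col]].add(float(row[time_col]))

    # Rank each participant's days: earliest day -> 0, next -> 1, ...
    rank = {
        ind: {day: order for order, day in enumerate(sorted(days))}
        for ind, days in days_by_individual.items()
    }

    # Replicates taken on the same day share the same order value.
    return [
        {**row, new_col: rank[row[id_col]][float(row[time_col])]}
        for row in rows
    ]


# Hypothetical rows, for illustration only.
rows = [
    {"Participant-ID": "A", "Days-Post-Trach": "15"},
    {"Participant-ID": "A", "Days-Post-Trach": "0"},
    {"Participant-ID": "A", "Days-Post-Trach": "7"},
    {"Participant-ID": "B", "Days-Post-Trach": "8"},
    {"Participant-ID": "B", "Days-Post-Trach": "0"},
]

for row in add_time_order(rows):
    print(row["Participant-ID"], row["Days-Post-Trach"], row["Time-Order"])
```

Note that ranking within each participant closes every gap, including a visit that was genuinely missed. If you want such an interval to be skipped, adjust the generated values by hand (for example, leave out the order value for the missed visit) before passing the new column to `--p-state-column`.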

Good luck!
