Diversity Group Significance Interpretation

ikunz · May 31, 2018, 5:27pm

I am having some trouble interpreting the results of my group significance testing for both alpha and beta diversity. I am attempting to test the significance of the sample collection day "groups" for a longitudinal study that I conducted.

For testing alpha diversity group significance, this is my command:

qiime diversity alpha-group-significance --i-alpha-diversity shannon_vector.qza --metadata-file metadata.txt --o-visualization shannon-group-significance.qzv

For testing beta diversity group significance, this is my command:

qiime diversity beta-group-significance --i-distance-matrix weighted_unifrac_distance_matrix.qza --metadata-file metadata.txt --m-metadata-column ExperimentDay --o-visualization weighted_unifrac_ExperimentDay_significance.qzv

I have attached the resulting output visualizations and am hoping that someone may be able to help me better interpret the results which I find to be somewhat confusing.

shannon_vector.qza (31.1 KB)
weighted_unifrac_ExperimentDay_significance.qzv (316.3 KB)

From reading this topic and a few others, it is my understanding that the pairwise alpha diversity results indicate that samples from day 2 are significantly different from samples from day 3. Yet the results from the pairwise beta diversity tests indicate that samples are more similar to samples collected on the same experiment day than to those from other sample days, with one exception: day 2 samples are not more similar to each other than they are to the day 3 samples.

So, the alpha diversity and beta diversity results are exactly opposite in terms of the significance of this pairwise comparison. While I understand and appreciate that alpha and beta diversity measures and statistical testing have very different meanings, I am perplexed by these specific results and feel that I am interpreting the significance incorrectly. I hope my question makes sense, and I appreciate any help that can be provided. Thank you!

Nicholas_Bokulich · June 1, 2018, 11:34am

Hi @ikunz,

Looks like you included the shannon_vector data used as an input, but not the output of this command!

The day 2 vs. day 3 beta diversity comparison is right on the border (0.05, if I'm looking at the correct value), so the call depends on what you consider significant.

I will be able to comment more once the alpha diversity viz is uploaded...

In the meantime, alpha and beta diversity estimates often can give results that seem to be contradictory at first glance — but they are really measuring entirely different things. Consider, for example, that different ASVs that differ by a single nucleotide will each be counted by alpha diversity results (depending on the metric you use) but may have negligible impact on beta diversity (also depending on metric; in this case you use weighted UniFrac, which would be negligibly impacted by small differences between sequences, and by low-abundance sequences).
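
To make that point concrete, here is a minimal sketch in plain Python. It is not QIIME 2's implementation: the tree, branch lengths, feature names, and counts are all made up, and the distance is a hand-rolled raw (unnormalized) weighted UniFrac, computed as the sum over branches of branch length times the difference in the proportion of each sample descending from that branch. Sample B carries two near-identical variants (x1, x2) where sample A has only x1.

```python
import math


def shannon(counts):
    """Shannon entropy (natural log) of a feature -> count mapping."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def weighted_unifrac(a, b, branches):
    """Raw weighted UniFrac: sum of branch_length * |pA - pB| over branches."""
    total_a, total_b = sum(a.values()), sum(b.values())
    distance = 0.0
    for length, leaves in branches:
        p_a = sum(a.get(leaf, 0) for leaf in leaves) / total_a
        p_b = sum(b.get(leaf, 0) for leaf in leaves) / total_b
        distance += length * abs(p_a - p_b)
    return distance


# Hypothetical toy tree: each branch is (length, set of leaves below it).
# x1 and x2 sit on very short branches, like single-nucleotide variants.
BRANCHES = [
    (1.0, {"x1", "x2"}),  # long branch leading to both variants
    (0.01, {"x1"}),
    (0.01, {"x2"}),
    (1.0, {"y"}),
]

sample_a = {"x1": 50, "y": 50}            # made-up counts
sample_b = {"x1": 25, "x2": 25, "y": 50}  # x1 split into two variants

shannon_a = shannon(sample_a)
shannon_b = shannon(sample_b)
distance = weighted_unifrac(sample_a, sample_b, BRANCHES)

print(f"Shannon A: {shannon_a:.3f}  Shannon B: {shannon_b:.3f}")
print(f"Weighted UniFrac (raw): {distance:.4f}")
```

In this toy case Shannon rises noticeably because the extra variant counts as a separate feature, while the weighted UniFrac distance stays close to zero because the variants are separated only by very short branches.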

You should also check out q2-longitudinal, which will allow you to examine time-series data and paired samples from individual subjects sampled over time. Depending on the design of your experiment, that plugin may offer some better tools for analysis, rather than pairwise comparison of groups at each time point.

I hope that helps!

ikunz · June 1, 2018, 4:34pm

Thank you for catching my mistake. I have attached the correct file. I also appreciate the advice regarding the q2-longitudinal approach; I have used it with past studies and plan to work through it with this project as well. Thank you!

shannon-group-significance.qzv (318.8 KB)

Nicholas_Bokulich · June 1, 2018, 5:01pm

Thanks for sending your data!

So Shannon diversity decreases between days 2 and 3, but weighted UniFrac does not differ significantly. Notably, neither comparison is significant after multiple test correction.

This does not raise any red flags in my mind. Shannon (a measure of richness and evenness) drops, but there is no change to the dominant phylotypes present in the samples. Unweighted UniFrac will be more sensitive to low-abundance features, so it would be useful to compare against these results.
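
A rough illustration of that pattern, using invented counts and non-phylogenetic stand-ins: Bray-Curtis on relative abundances in place of weighted UniFrac, and Jaccard distance on presence/absence in place of unweighted UniFrac. These are swapped-in metrics chosen because they need no tree; they are not what the commands in this thread compute. The "day 3" sample keeps the same three dominant features but loses the rare tail.

```python
import math


def shannon(counts):
    """Shannon entropy (natural log) of a feature -> count mapping."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity on relative abundances (abundance-weighted)."""
    total_a, total_b = sum(a.values()), sum(b.values())
    features = set(a) | set(b)
    diff = sum(abs(a.get(f, 0) / total_a - b.get(f, 0) / total_b)
               for f in features)
    return diff / 2.0  # relative abundances in each sample sum to 1


def jaccard(a, b):
    """Jaccard distance on presence/absence (ignores abundance)."""
    present_a = {f for f, c in a.items() if c > 0}
    present_b = {f for f, c in b.items() if c > 0}
    return 1.0 - len(present_a & present_b) / len(present_a | present_b)


# Made-up counts: same dominant features, rare features lost on day 3.
day2 = {"A": 400, "B": 300, "C": 200,
        "r1": 20, "r2": 20, "r3": 20, "r4": 20, "r5": 20}
day3 = {"A": 445, "B": 333, "C": 222}

shannon_day2 = shannon(day2)
shannon_day3 = shannon(day3)
weighted = bray_curtis(day2, day3)
unweighted = jaccard(day2, day3)

print(f"Shannon day 2: {shannon_day2:.3f}, day 3: {shannon_day3:.3f}")
print(f"Bray-Curtis (weighted stand-in): {weighted:.3f}")
print(f"Jaccard (unweighted stand-in): {unweighted:.3f}")
```

With these numbers Shannon falls, the abundance-weighted distance stays small because the dominant features barely move, and the presence/absence distance is large because most of the feature list has changed.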

After multiple test correction, there is not a significant difference between groups. Paired tests, as used by longitudinal pairwise-difference and pairwise-distance, could be a better/more sensitive test if you are sampling the same subjects at each time point.
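
As a sketch of why a borderline p-value can stop being significant once several pairwise comparisons are corrected together, the snippet below applies a hand-written Benjamini-Hochberg adjustment. This assumes the q-values in the visualizations come from a Benjamini-Hochberg style FDR correction; the p-values and day labels here are invented for illustration and are not taken from the attached files.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical pairwise p-values (not from this thread's data).
pairwise_p = {
    ("day1", "day2"): 0.20,
    ("day1", "day3"): 0.60,
    ("day2", "day3"): 0.04,
}

q_values = dict(zip(pairwise_p, benjamini_hochberg(list(pairwise_p.values()))))

for pair, q in q_values.items():
    print(pair, f"p = {pairwise_p[pair]:.2f}", f"q = {q:.2f}")
```

Here the day 2 vs. day 3 comparison has p = 0.04 on its own, but its adjusted value is 0.04 * 3 / 1 = 0.12, which is above a 0.05 threshold.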

Better yet, longitudinal linear-mixed-effects would let you look at multiple effects on alpha and beta diversity, taking into account individual differences, over the entire time series.

I hope that helps!
