I am having trouble interpreting the beta diversity results for my data.
I have run the following command:
qiime diversity beta-group-significance --i-distance-matrix core-metrics-results/unweighted_unifrac_distance_matrix.qza --m-metadata-file metadata2.tsv --m-metadata-column Sampletype --o-visualization core-metrics-results/unweighted_unifrac-Sampletype-significance_9999.qzv --p-pairwise --p-permutations 9999
(I ran the same command for weighted UniFrac and Bray-Curtis.)
PERMANOVA gave a p-value of 0.0001, and the pairwise differences among groups are all significant, consistently for unweighted UniFrac, weighted UniFrac and Bray-Curtis. This should tell me that the factor “Sampletype” significantly influenced the similarity among the samples, shouldn't it? That is actually what I expected. The problem is that when I look at the Emperor PCoA plots (attached), I don't see any clear clustering pattern among the samples when coloring by Sampletype.
bray_curtis_emperor.qzv (1.5 MB)
weighted_unifrac_emperor.qzv (1.5 MB)
unweighted_unifrac_emperor.qzv (1.5 MB)
Also the variability explained by the axes is quite low.
So I am having trouble reconciling the results of beta-group-significance with the absence of a clustering pattern in the PCoA. Any help would be very welcome.
This is an excellent question! The issue is multifaceted and a bit complex, and while I may not have all the answers to your concerns, here is my attempt anyway.
First, I think it would be really useful to read up on what the PERMANOVA and ANOSIM tests actually do. Those links will do a much better job of explaining it than I ever could, so I recommend taking a look there first. I'm not sure whether the implementations of these 2 tests in QIIME 2 are exactly what those links discuss, but they will be similar enough for our purposes.
Second, as you mentioned, your first three axes together explain roughly 40-60% of the variance, which is not high but is not low either. Have a look through the literature and you'll see that much less is often reported. The important takeaway is that other axes may capture meaningful aspects of your data that we cannot see in those three dimensions. In the microbiome data I've come across, the first three axes usually capture just about everything meaningful, and it's likely the same with your data. Your axes 4-5 explain another ~12%, which may or may not be important. Just wanted to mention it…
When you perform tests like PERMANOVA, you are comparing the actual distances (or their ranks) and not just a few dimensions. Reducing the data to a few dimensions is only for the sake of the ordination plots. So even though clustering and the stats usually coincide, I don't think visible clustering is a requirement, and its absence doesn't necessarily contradict what the test says. Think of them as 2 separate but complementary views of the data. That said, a mismatch between the two is a good reason to look more closely at your data, as you may have missed something.
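To make the "full distances, not a few axes" point concrete, here is a minimal sketch of a PERMANOVA-style pseudo-F test written in plain Python. It is an illustration only and not the QIIME 2 implementation; the toy coordinates, the group labels and the function names `pseudo_f` and `permanova` are all made up for this example. Note that every pairwise distance goes into the statistic and no ordination is involved anywhere.

```python
import math
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """PERMANOVA-style pseudo-F computed from a full square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares: every pairwise distance contributes.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: only pairs that share a label.
    ss_within = 0.0
    for lab in labels:
        idx = [i for i, g in enumerate(groups) if g == lab]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, permutations=999, seed=0):
    """Permutation p-value: shuffle the labels, keep the distances fixed."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: six samples from two made-up sample types.
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
groups = ["gut", "gut", "gut", "soil", "soil", "soil"]
dist = [[math.dist(p, q) for q in coords] for p in coords]

f_obs, p_value = permanova(dist, groups, permutations=999)
print(f"pseudo-F = {f_obs:.2f}, p = {p_value:.4f}")
```

With real data the distance matrix would come from your UniFrac or Bray-Curtis results rather than from toy coordinates. The point of the sketch is only that the test statistic is built from all sample-to-sample distances, so it can detect group differences that are spread across many axes and therefore hard to see in a 3D plot.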
Moving on to the plots themselves, I've looked through your PCoA plots and used the visibility option to highlight a few pairwise comparisons (see attached). While I agree that some show no clear clustering (Fig. C and F), the others very well may (Fig. A-B and D-E); it all depends on your perspective. If you were to run the PCoA on a subset of your feature table containing only 2 SampleTypes, you might see even more visible clusters, because the ordination and the plot's scale would be driven only by those samples' ranges and not by the other samples. Anyway, I think the results of your tests are OK and, as you said, when considered together with the design of your experiment, they are likely revealing true patterns.
Hope that helps.
P.S. There may be a touch of a horseshoe effect in some of your SampleTypes, which is a whole other topic, but I'll point you to this paper, which may shed some light on that as well.