# Understanding Bray-Curtis and Jaccard

Hi, I have a big question about how to interpret these results. I ran a beta diversity analysis using the Jaccard and Bray-Curtis metrics. According to the p-value, there is a significant difference between my samples, but the boxplot (and this discussion: [beta diversity explanation (jaccard_distance)]) shows no clear difference with the Jaccard distance. Am I wrong?
I also don't know how to interpret the Bray-Curtis results together with the Jaccard results.
Best regards, Sofi

Hello @Sofi,

Welcome to the forums!

I’m glad you found that excellent post by Mehrbod that describes how the box plots are made.

Before we dive in, can you post the full command you ran?

Hi @colinbrislawn, thank you very much for your quick response.
The commands that I used are:

qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table.qza \
  --p-sampling-depth 32000 \
  --output-dir core-metrics-results

qiime diversity beta-group-significance \
  --i-distance-matrix core-metrics-results/bray_curtis_distance_matrix.qza \
  --o-visualization core-metrics-results/bray_curtis_type_significance.qzv \

Thanks!

As discussed in that thread, the PERMANOVA is performed first to look for differences between groups and give you a p-value of significance.

Afterwards, the box plots are made, and if that p-value is under your alpha threshold, then you can look to see what groups are most different. I think of the box plots as a post-hoc test.
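
To make the PERMANOVA step more concrete, here is a minimal pure-Python sketch of the pseudo-F statistic and its permutation p-value. This is not QIIME 2's own code. The function names `pseudo_f` and `permanova_sketch`, the toy distance matrix, and the way the group labels are assigned are all made up for illustration.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """Pseudo-F from a square distance matrix and one group label per sample."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: same idea, computed inside each group.
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        pairs = combinations(members, 2)
        ss_within += sum(dist[i][j] ** 2 for i, j in pairs) / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova_sketch(dist, groups, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break the link between samples and labels
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data (hypothetical): 3 + 3 samples, within-group distance 0.2,
# between-group distance 0.6.
toy_groups = ["pristine"] * 3 + ["human"] * 3
toy_dist = [
    [0.0 if i == j else (0.2 if toy_groups[i] == toy_groups[j] else 0.6)
     for j in range(6)]
    for i in range(6)
]

f_stat, p_value = permanova_sketch(toy_dist, toy_groups)
print(f"pseudo-F = {f_stat:.2f}, p = {p_value:.3f}")
```

The p-value only says how often shuffled labels separate the groups at least as well as the real labels do. It does not say how large the separation is, which is why the box plots and the PCoA are still worth a look.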

Yeah, the differences are very small, but I can see a few…
Pristine to pristine: slightly lower mean, some outliers, larger standard deviation
Human to pristine: slightly higher mean, no outliers, smaller standard deviation

Remember that a stat test can be significant, but the effect size can still be very small.
Paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3444174/
Interactive visualization! https://rpsychologist.com/pvalue/

When you view all your samples in a PCoA plot, do these groups visually overlap? The file `core-metrics-results/jaccard_emperor.qzv` should contain this graph. I posted about how I read those graphs over here.

Let me know what you find!

They are just two different ways of calculating how different two samples are.
Jaccard is the fraction of taxa not shared by two samples: (A + B) / union in the diagram below, where A and B are the taxa found in only one of the two samples.

Jaccard and Bray are similar methods, so it makes sense to me that your graphs are similar.
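
As a rough illustration of how the two metrics differ, here is a small Python sketch that computes both on toy count data. The sample names, taxon names, and counts are invented, and the functions `jaccard_distance` and `bray_curtis` are written out by hand for clarity rather than taken from QIIME 2.

```python
def jaccard_distance(a, b):
    """Fraction of observed taxa that are NOT shared (presence/absence only)."""
    present_a = {taxon for taxon, count in a.items() if count > 0}
    present_b = {taxon for taxon, count in b.items() if count > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


def bray_curtis(a, b):
    """Sum of absolute count differences divided by the total counts."""
    taxa = set(a) | set(b)
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    if total == 0:
        return 0.0
    return sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa) / total


# Hypothetical samples, each with 100 reads (as after rarefaction).
sample_1 = {"taxon_A": 90, "taxon_B": 10}
sample_2 = {"taxon_A": 10, "taxon_B": 90}                # same taxa, flipped abundances
sample_3 = {"taxon_A": 89, "taxon_B": 10, "taxon_C": 1}  # one extra rare taxon

for name, other in [("sample_2", sample_2), ("sample_3", sample_3)]:
    print(name,
          "Jaccard:", round(jaccard_distance(sample_1, other), 3),
          "Bray-Curtis:", round(bray_curtis(sample_1, other), 3))
```

In this toy case, sample_1 and sample_2 share every taxon, so Jaccard sees no difference while Bray-Curtis sees a large one. Adding a single rare taxon (sample_3) does the opposite: Jaccard changes noticeably, and Bray-Curtis barely moves.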

Thank you!!

Yeah, I found clustering between the two groups.

But I still have the question of how to interpret the results as a whole. I know that Bray-Curtis considers the abundance of the species, while Jaccard considers only their presence or absence. Would it be correct to say that the difference between the communities is mainly due to differences among the most abundant species? Or does the interpretation go the other way?

That’s right!

Because you see larger differences in Bray-Curtis dissimilarities than Jaccard distances, that makes sense to me. Keep in mind that you also see differences in Jaccard, which is not biased towards the most abundant features.

Looking at the PCoA, I noticed something else…

While you have 50+ samples, I only see 7 clusters in that PCoA. This makes me worry that a mistake or processing artifact has ‘pushed’ your samples closer together.

Compare this to a typical PCoA plot, like this one I got from the pd-mouse tutorial:

While the samples still cluster by `donor`, they are not overlapping as much as I see in your 7 clusters.

I know that this is not part of your original question, but I wanted to mention it before reviewer 3 does!

Actually, I have 14 samples. You only see 7 clusters because I have duplicate values and they overlap.
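
One way to confirm that overlapping points really are duplicates is to look for sample pairs whose distance is zero. The sketch below assumes the distance matrix has already been loaded as a plain list of lists. The helper name `zero_distance_pairs`, the sample IDs, and the values are hypothetical.

```python
from itertools import combinations


def zero_distance_pairs(sample_ids, dist, tol=1e-12):
    """List pairs of samples whose distance is (numerically) zero."""
    return [
        (sample_ids[i], sample_ids[j])
        for i, j in combinations(range(len(sample_ids)), 2)
        if dist[i][j] <= tol
    ]


# Hypothetical 4-sample matrix: two samples, each with one exact duplicate.
ids = ["s1", "s1_dup", "s2", "s2_dup"]
dist = [
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]

for pair in zero_distance_pairs(ids, dist):
    print("identical profiles:", pair)
```

Pairs at distance zero land on exactly the same spot in a PCoA, which would explain 14 samples showing up as 7 points.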