beta diversity explanation (jaccard_distance)

Mehrbod_Estaki · June 5, 2020, 9:23am

Hi @terren,
Ok, take a look at the diagram I made below (the values are made up).

What beta-group-significance is doing:

1. Calculate within-group distances.
   - In our case, look at the green circles and the table beside them.
   - There are 3 circles (samples), so there are 3 distances between them. The same goes for the red group.
   - The blue group has 4 samples, so there are 6 distances within that group (no table shown).
2. Calculate across-group distances.
   - Here, look at the distances from the red samples to the green samples.
   - There are 9 of them (dashed lines), and those lines are longer than the within-group distances because the samples are further apart.
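To make the counting concrete, here is a minimal sketch of those two steps in plain Python. The sample names and feature sets are made up for illustration (just like the values in the diagram), and the `jaccard_distance`, `within_group` and `across_groups` helpers are my own hypothetical names, not part of QIIME 2.

```python
from itertools import combinations, product

# Made-up presence/absence data: each sample is the set of features seen in it.
samples = {
    "g1": {"a", "b", "c"}, "g2": {"a", "b", "d"}, "g3": {"a", "c", "d"},
    "r1": {"x", "y", "a"}, "r2": {"x", "y", "z"}, "r3": {"x", "z", "b"},
    "b1": {"p", "q", "a"}, "b2": {"p", "q", "r"},
    "b3": {"p", "r", "s"}, "b4": {"q", "r", "s"},
}
groups = {
    "green": ["g1", "g2", "g3"],
    "red": ["r1", "r2", "r3"],
    "blue": ["b1", "b2", "b3", "b4"],
}


def jaccard_distance(a, b):
    """1 minus (shared features / all features seen in either sample)."""
    return 1 - len(a & b) / len(a | b)


def within_group(name):
    """Distances between every unordered pair of samples in one group."""
    return [jaccard_distance(samples[s], samples[t])
            for s, t in combinations(groups[name], 2)]


def across_groups(name_a, name_b):
    """Distances between every sample in one group and every sample in another."""
    return [jaccard_distance(samples[s], samples[t])
            for s, t in product(groups[name_a], groups[name_b])]


print(len(within_group("green")))          # 3 samples -> 3 pairs
print(len(within_group("blue")))           # 4 samples -> 6 pairs
print(len(across_groups("red", "green")))  # 3 x 3 -> 9 pairs
```

In general a group of n samples has n(n-1)/2 within-group distances, and two groups of sizes n and m have n*m distances between them.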

Now, the null hypothesis for beta-group-significance (default PERMANOVA) is that the distances within a group (say green here) are similar to the distances across to another group (to red in our example).
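If it helps to see the idea as code, below is a rough, hand-rolled sketch of a PERMANOVA-style permutation test. It is not QIIME 2's implementation, and in practice beta-group-significance does all of this for you. I am assuming the usual pseudo-F statistic (among-group versus within-group sums of squared distances), and the data, labels and function names (`pseudo_f`, `permutation_p_value`) are all made up for illustration.

```python
import random
from itertools import combinations

# Made-up presence/absence data and group labels (illustration only).
samples = {
    "g1": {"a", "b", "c"}, "g2": {"a", "b", "d"}, "g3": {"a", "c", "d"},
    "r1": {"x", "y", "a"}, "r2": {"x", "y", "z"}, "r3": {"x", "z", "b"},
    "b1": {"p", "q", "a"}, "b2": {"p", "q", "r"},
    "b3": {"p", "r", "s"}, "b4": {"q", "r", "s"},
}
labels = {s: {"g": "green", "r": "red", "b": "blue"}[s[0]] for s in samples}
ids = sorted(samples)


def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b)


# Pairwise distances, keyed by the unordered pair of sample ids.
dist = {frozenset(p): jaccard_distance(samples[p[0]], samples[p[1]])
        for p in combinations(ids, 2)}


def pseudo_f(labelling):
    """Among-group vs within-group variation in squared distances."""
    n = len(ids)
    names = set(labelling.values())
    ss_total = sum(d ** 2 for d in dist.values()) / n
    ss_within = 0.0
    for name in names:
        members = [s for s in ids if labelling[s] == name]
        ss_within += sum(dist[frozenset(p)] ** 2
                         for p in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(names) - 1)) / (ss_within / (n - len(names)))


def permutation_p_value(permutations=999, seed=0):
    """Shuffle the group labels and count how often F is at least as large."""
    rng = random.Random(seed)
    observed = pseudo_f(labels)
    shuffled = [labels[s] for s in ids]
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dict(zip(ids, shuffled))) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


f_stat, p_value = permutation_p_value()
print(f"pseudo-F = {f_stat:.2f}, p = {p_value:.3f}")
```

The intuition is the same as in the diagram: if group membership did not matter, shuffling the labels would give a statistic as large as the real one fairly often, and the p-value would be large.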

Let's visualize this in the boxplot:
This boxplot is 1 of the 3 we would need to make for this example; it shows only the distances to the green group. Green-to-green distances are relatively short (n=3, see the 3 green lines). Red-to-green distances are larger because those samples are further away (n=9, see the 9 dashed lines). Finally, the blue-to-green distances look about the same as the red-to-green distances. So our stats (not shown here) would probably suggest that both the red and blue groups have significantly larger distances to green than the within-green distances.
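As a sketch of what goes into each box of that distances-to-green plot, the snippet below collects the three sets of distances and prints their sizes and medians. Again, the feature sets are made up and the `distances_to` helper is a hypothetical name of mine, so treat this as an illustration of the bookkeeping rather than what the plugin actually runs.

```python
from itertools import combinations, product
from statistics import median

# Made-up presence/absence data, grouped by colour (illustration only).
groups = {
    "green": [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}],
    "red": [{"x", "y", "a"}, {"x", "y", "z"}, {"x", "z", "b"}],
    "blue": [{"p", "q", "a"}, {"p", "q", "r"}, {"p", "r", "s"}, {"q", "r", "s"}],
}


def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b)


def distances_to(reference):
    """One list of distances per box in the 'distances to <reference>' plot."""
    boxes = {}
    for name, members in groups.items():
        if name == reference:
            # Within-group box: each unordered pair counted once.
            pairs = combinations(members, 2)
        else:
            # Across-group box: every sample vs every reference sample.
            pairs = product(members, groups[reference])
        boxes[name] = [jaccard_distance(a, b) for a, b in pairs]
    return boxes


boxes = distances_to("green")
for name, values in boxes.items():
    print(f"{name}-to-green: n={len(values)}, median={median(values):.2f}")
```

Calling `distances_to("red")` and `distances_to("blue")` would give the data for the other two boxplots.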

In your example, all the distances look about the same, suggesting there is no clustering or difference between the groups.

Hope this clarifies things for you.