comparison of mean or median dissimilarity

Hi, I intend to analyze two groups of samples (let's say group A (n = 11) and group B (n = 21)). Some publications compare the mean or median dissimilarity (distance?) between two groups.

My problem is how to do the calculation correctly.
I would do the calculation as follows:

  1. calculate dissimilarity matrix using A + B samples (n = 32)
  2. extract the dissimilarity between every pair of samples within group A and within group B, using the matrix created in step 1
    in group A, there would be (11 - 1)^2 / 2 = 50 distances
    in group B, there would be (21 - 1)^2 / 2 = 200 distances
  3. compare 50 distances in group A and 200 distances in group B using t test or M-W U test

This seems a little confusing, because the number of distances grows much faster than the number of samples.

Another idea is to get the distance between each sample and its group centroid.
The calculation would be like this:

  1. get the coordinates (ordination) from the dissimilarity matrix of A + B samples (n = 32)
  2. calculate the centroid of group A and group B (arithmetic mean of each axis?)
  3. calculate the distance between samples in group A and the centroid A (n = 11) and between samples in group B and the centroid B (n = 21)
  4. compare distances in group A and distances in group B using t test or M-W U test

This seems more reasonable, but I am still not very sure.
Also, is this calculation applicable to matrices other than a Euclidean distance matrix?
BTW, are there any tools in qiime2 (or perhaps other platforms) to do the calculation?

thank you for your kind reply

Hi @b87401116,
The first calculation you describe is effectively what is done in qiime diversity beta-group-significance with the ANOSIM or PERMANOVA methods. This would typically be performed on distance matrices computed with one of the metrics used in qiime diversity core-metrics-phylogenetic: Jaccard, Bray-Curtis, unweighted UniFrac, or weighted UniFrac. Other metrics are relevant too, based on your specific question, but these are a good starting point and generally perform better on this type of data than Euclidean distance.

Your second calculation sounds more like a comparison of the variance within each group. You can also compute that with qiime diversity beta-group-significance with the PERMDISP method.

I think this should address your question, but let me know if not.


Hi, @gregcaporaso,
Thank you for your reply.
I will give it a try.


Hi, @gregcaporaso,
I used qiime diversity beta-group-significance with the PERMDISP method and got the following results:

PERMDISP results
method name              PERMDISP
test statistic name      F-value
sample size              14
number of groups         2
test statistic           4.843262
p-value                  0.027
number of permutations   999

I tried to find the data that produced this statistic. However, the only raw data provided was the distance matrix in long form. So, how can I get the data that produced the statistic (the distances to the group centroids?)

Best

I noticed an error in my original post. The corrected counts are:
in group A, there would be 11 * (11 - 1) / 2 = 55 distances
in group B, there would be 21 * (21 - 1) / 2 = 210 distances
so step 3 would compare 55 distances in group A and 210 distances in group B using a t test or M-W U test
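
For readers who want to check these counts, here is a minimal sketch of pulling the within-group pairwise distances out of a square distance matrix with NumPy. The function name `within_group_distances` and the random toy data are hypothetical stand-ins, not part of QIIME 2. One caveat: each sample contributes to n - 1 of the distances in its group, so the values are not independent observations, which is one reason permutation-based tests are usually preferred over a plain t test or M-W U test here.

```python
import numpy as np

def within_group_distances(dm, labels, group):
    """Return each unordered within-group pair's distance exactly once."""
    idx = np.flatnonzero(np.asarray(labels) == group)
    sub = dm[np.ix_(idx, idx)]
    # k=1 skips the diagonal, so self-distances are excluded
    return sub[np.triu_indices(len(idx), k=1)]

# Hypothetical toy data: 32 random points stand in for real samples
rng = np.random.default_rng(0)
points = rng.random((32, 5))
diff = points[:, None, :] - points[None, :, :]
dm = np.sqrt((diff ** 2).sum(axis=-1))  # Euclidean distance matrix
labels = np.array(["A"] * 11 + ["B"] * 21)

d_a = within_group_distances(dm, labels, "A")
d_b = within_group_distances(dm, labels, "B")
print(len(d_a), len(d_b))  # 55 210
```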

Hi @b87401116,
I don't think we have a way to extract those values directly from the distance matrix, but you can load the distance matrix in Python (if you're comfortable with Python programming) or export it to a tsv file and extract them yourself.

You can export the distance matrix to a tsv file using qiime tools export.
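
As a rough sketch of working with the exported file, the snippet below parses a square, tab-separated matrix using only the standard library and NumPy. It assumes the export contains a file named distance-matrix.tsv whose header row is an empty cell followed by the sample IDs, with each later row starting with a sample ID; please check your own export, since that layout is an assumption here. The helper `read_distance_tsv` and the three-sample demo matrix are hypothetical.

```python
import csv
import os
import tempfile

import numpy as np

def read_distance_tsv(path):
    """Parse a square tab-separated distance matrix into (ids, array)."""
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    ids = rows[0][1:]  # header row: empty corner cell, then sample IDs
    row_ids = [row[0] for row in rows[1:]]
    if row_ids != ids:
        raise ValueError("row IDs and column IDs do not match")
    values = np.array([[float(x) for x in row[1:]] for row in rows[1:]])
    return ids, values

# Hypothetical three-sample matrix written to a temporary file for the demo
demo = (
    "\ts1\ts2\ts3\n"
    "s1\t0.0\t0.2\t0.5\n"
    "s2\t0.2\t0.0\t0.4\n"
    "s3\t0.5\t0.4\t0.0\n"
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "distance-matrix.tsv")
    with open(path, "w") as handle:
        handle.write(demo)
    ids, values = read_distance_tsv(path)

print(ids)
print(values.shape)
```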

If you want to load the distance matrix in Python to do the extraction there, you can do the following:

import qiime2
import skbio

dm = qiime2.Artifact.load('./distance-matrix.qza').view(skbio.DistanceMatrix)

That code assumes that the relative path to the distance matrix you want to load is ./distance-matrix.qza - you can adapt that to your actual relative path. It will load the data as a scikit-bio DistanceMatrix object, which you can learn how to use here. You can see how we do this for the beta-group-significance action here.
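
Once the matrix is loaded, the distances to each group centroid could be computed along these lines. This is a minimal sketch working on a plain NumPy array (with the scikit-bio object above, dm.data and dm.ids give the array and the sample IDs). It runs a classical PCoA and measures each sample's Euclidean distance to its group centroid in that space. It illustrates the general idea only and is not the exact code behind beta-group-significance; the function names and toy data are hypothetical. Negative eigenvalues, which can appear with non-Euclidean metrics such as Bray-Curtis, are simply dropped here, so treat the result as an approximation in that case.

```python
import numpy as np

def pcoa_coordinates(dm):
    """Classical PCoA: Gower-center the squared distances, then eigendecompose."""
    n = dm.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dm ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Simplification: keep only clearly positive axes
    keep = eigvals > 1e-10 * eigvals.max()
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])

def distances_to_centroid(coords, labels):
    """Map each group label to its samples' distances from the group centroid."""
    labels = np.asarray(labels)
    result = {}
    for group in np.unique(labels):
        group_coords = coords[labels == group]
        centroid = group_coords.mean(axis=0)  # arithmetic mean of each axis
        result[str(group)] = np.linalg.norm(group_coords - centroid, axis=1)
    return result

# Hypothetical toy data: 32 random points stand in for real samples
rng = np.random.default_rng(0)
points = rng.random((32, 5))
diff = points[:, None, :] - points[None, :, :]
toy_dm = np.sqrt((diff ** 2).sum(axis=-1))
labels = np.array(["A"] * 11 + ["B"] * 21)

dists = distances_to_centroid(pcoa_coordinates(toy_dm), labels)
print(len(dists["A"]), len(dists["B"]))  # 11 21
```

The two arrays in dists are the per-sample values that a test on dispersion would then compare between groups.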

Apologies that there isn't an easier way to do this right now. I think this is something we should support in QIIME 2, and I have created an issue for this here.

I hope this helps!


Hi @gregcaporaso,
I will take some time to read the code.
Thanks!
