Distance matrix between taxa in OTU table


I was exporting the OTU table to a CSV file. Since I want to use this file in a mixed model, I would like to use the genetic distances between OTUs as my variance-covariance matrix. Is there any way to generate a genetic distance matrix between the OTUs, or any other kind of similarity/distance matrix?

Hi @misterie,

Welcome to the :qiime2: forum!

There are two ways to answer this question. The first is that, underlying the phylogenetic tree, you have a distance matrix that describes the distance between two OTUs/ASVs/features.

This isn't easily retrievable through the standard QIIME 2 API, so you'd need to look at the underlying data structures. In Python, the (untested) code would be something like:

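from skbio import TreeNode
from qiime2 import Artifact

# Load the tree artifact and view it as a scikit-bio TreeNode
tree = Artifact.load('path to your qza tree').view(TreeNode)
# Pairwise (patristic) distances between all tips, as a skbio DistanceMatrix
dist = tree.tip_tip_distances()
# Write the matrix as tab-separated text (lsmat format)
dist.write('path to your new distance matrix as a tsv')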
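To make concrete what a tip-to-tip distance is, here is a minimal pure-Python sketch on a toy tree. The tree, the feature names, and the branch lengths are all made up for illustration, and this is not how scikit-bio computes the matrix internally; it only shows that the distance between two tips is the summed branch length along the path through their most recent common ancestor.

```python
from itertools import combinations

# Toy rooted tree, written as child -> (parent, branch length to parent).
# All names and lengths are hypothetical.
PARENTS = {
    "otu_a": ("node_1", 0.1),
    "otu_b": ("node_1", 0.2),
    "node_1": ("root", 0.3),
    "otu_c": ("root", 0.5),
}


def path_to_root(tip):
    """Return {node: distance from tip} for the tip and all of its ancestors."""
    dists = {tip: 0.0}
    node, total = tip, 0.0
    while node in PARENTS:
        parent, length = PARENTS[node]
        total += length
        dists[parent] = total
        node = parent
    return dists


def tip_distance(tip_1, tip_2):
    """Path length between two tips through their most recent common ancestor."""
    up_1 = path_to_root(tip_1)
    up_2 = path_to_root(tip_2)
    shared = set(up_1) & set(up_2)
    # The most recent common ancestor is the shared node with the shortest summed path.
    return min(up_1[node] + up_2[node] for node in shared)


tips = ["otu_a", "otu_b", "otu_c"]
distances = {pair: tip_distance(*pair) for pair in combinations(tips, 2)}

for (tip_1, tip_2), value in distances.items():
    print(f"{tip_1}\t{tip_2}\t{value:.2f}")
```

Sister tips (otu_a and otu_b here) end up closer to each other than either is to otu_c, which is the structure a phylogeny-based covariance matrix would inherit.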



Hi @jwdebelius, thank you! Should I use rooted or unrooted tree?

Hi @misterie,

As far as I know, the two trees should preserve the same relationships, but the magnitude of the distances may be different, so I'd suggest picking one and just being clear about which you used.

However (a thought that just hit me), this may not be your best covariance matrix. Phylogenetically similar organisms may behave very differently (despite the fact that similar behavior is often the assumption we make!). So, you may also want to consider a co-occurrence ranking. I think SparCC might be a good way to go for that kind of co-occurrence/co-exclusion matrix.
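For a rough feel of what a co-occurrence/co-exclusion matrix looks like, here is a small sketch. Note that this is not SparCC: it swaps in a much simpler stand-in, Pearson correlation on centered log-ratio (CLR) transformed counts, and the counts, feature names, and pseudocount are all assumptions made up for illustration.

```python
import math

# Hypothetical counts: rows are samples, columns are features.
feature_ids = ["otu_a", "otu_b", "otu_c"]
counts = [
    [10, 20, 70],
    [20, 40, 40],
    [40, 80, 10],
    [5, 10, 85],
]


def clr(row, pseudocount=1.0):
    """Centered log-ratio transform of one sample (pseudocount avoids log(0))."""
    logs = [math.log(value + pseudocount) for value in row]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


transformed = [clr(row) for row in counts]
# One list of CLR values per feature (i.e. the columns of the transformed table).
columns = list(zip(*transformed))

correlation = {
    (f1, f2): pearson(columns[i], columns[j])
    for i, f1 in enumerate(feature_ids)
    for j, f2 in enumerate(feature_ids)
}

for (f1, f2), value in correlation.items():
    print(f"{f1}\t{f2}\t{value:+.2f}")
```

Positive values suggest features that tend to rise and fall together across samples, negative values suggest co-exclusion. A dedicated method such as SparCC handles the compositional nature of the data more carefully than this sketch does.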


@jwdebelius thank you! Another question: can I build this matrix only for the classified OTUs? Right now it produces a very big matrix, but the number of my taxa is lower after classification with the Greengenes database.

Hi @misterie,

You can either shear the tree beforehand (you’ll need to look at the documentation to do so), or you can filter the distance matrix to just your features using its “filter” method:

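import pandas as pd

# Viewed as a DataFrame, the feature table has samples as rows and features as columns
table = Artifact.load('path to your table').view(pd.DataFrame)
feature_ids = table.columns.values

# Keep only the rows/columns of the distance matrix that belong to these features
only_otu_dist = dist.filter(feature_ids)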
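If the goal is to keep only the classified features, the ID list has to come from the taxonomy rather than from the full table. Here is a minimal pure-Python sketch of that selection. The taxonomy strings, the feature names, the distances, and the rule that "Unassigned" marks an unclassified feature are all assumptions for illustration; adjust the rule to whatever your classifier actually writes.

```python
# Hypothetical taxonomy assignments: feature ID -> taxon string.
taxonomy = {
    "otu_a": "k__Bacteria; p__Firmicutes",
    "otu_b": "Unassigned",
    "otu_c": "k__Bacteria; p__Bacteroidetes",
}

# Hypothetical symmetric distance matrix over the same features.
ids = ["otu_a", "otu_b", "otu_c"]
matrix = [
    [0.0, 0.3, 0.9],
    [0.3, 0.0, 1.0],
    [0.9, 1.0, 0.0],
]


def is_classified(taxon):
    """Treat anything other than 'Unassigned' as classified (an assumption)."""
    return taxon.strip().lower() != "unassigned"


# IDs of the features that received a classification, in matrix order.
classified_ids = [feature for feature in ids if is_classified(taxonomy[feature])]

# Keep only the rows and columns that belong to classified features.
keep = [ids.index(feature) for feature in classified_ids]
sub_matrix = [[matrix[row][col] for col in keep] for row in keep]

print(classified_ids)
print(sub_matrix)
```

With a real scikit-bio distance matrix, a list like classified_ids could be passed to dist.filter(...) in place of the full set of feature IDs.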

