PCA vs PCoA - which is the appropriate one for microbiome data

colinbrislawn · September 12, 2018, 5:32pm

Good question.

I want to take a moment to distinguish the ordination method from the distance metric.

ordination methods
 - PCoA
 - NMDS
 - CCA
distance metrics
 - UniFrac
 - Jaccard
 - Euclidean

I think you probably know that already, but I just wanted to post that for future users to see.

I'm not sure Euclidean distance is bad, but other metrics are arguably clearer or more biologically relevant. Let's take Jaccard and UniFrac as examples.

Jaccard distances are simple:

percentage of taxa not found in both samples

So if 30% of taxa are in both samples, this means 70% are only found in one sample, and the Jaccard distance is 0.7. Very easy!
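
For anyone who wants to see the arithmetic, here is a minimal Python sketch of that definition. The taxon names and the `jaccard_distance` helper are made up for illustration; this is not how any particular package implements it.

```python
def jaccard_distance(sample_a, sample_b):
    """Fraction of observed taxa that are NOT found in both samples."""
    a, b = set(sample_a), set(sample_b)
    union = a | b            # every taxon seen in either sample
    if not union:
        return 0.0           # two empty samples: treat as identical
    not_shared = a ^ b       # taxa found in only one of the two samples
    return len(not_shared) / len(union)

# Hypothetical samples: 10 taxa overall, 3 of them shared (30%)
sample_a = {"t1", "t2", "t3", "t4", "t5", "t6"}
sample_b = {"t1", "t2", "t3", "t7", "t8", "t9", "t10"}

d_jaccard = jaccard_distance(sample_a, sample_b)
print(d_jaccard)  # 0.7
```

Note that only presence/absence matters here; abundances are ignored.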

Unweighted UniFrac distances are equally easy, and add phylogenetic information:

percentage of phylogenetic branch length not found in both samples

This makes UniFrac a tremendously powerful method for measuring the difference between samples because it incorporates the underlying phylogenetic tree of the taxa.
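
Here is a rough sketch of the unweighted version of that idea. I'm assuming a toy representation where each branch is stored as its length plus the set of tip taxa below it; the tree, the taxon names, and the `unweighted_unifrac` helper are all hypothetical, and a real analysis would use a proper tree and an existing implementation.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """branches: list of (branch_length, set_of_tip_taxa_below_that_branch)."""
    a, b = set(sample_a), set(sample_b)
    unique = 0.0     # branch length leading to taxa from only one sample
    observed = 0.0   # branch length leading to taxa from either sample
    for length, tips in branches:
        in_a = bool(tips & a)
        in_b = bool(tips & b)
        if in_a or in_b:
            observed += length
            if in_a != in_b:
                unique += length
    return unique / observed if observed else 0.0

# Toy tree ((t1:1, t2:1):1, (t3:1, t4:1):1), written branch by branch
toy_branches = [
    (1.0, {"t1"}),
    (1.0, {"t2"}),
    (1.0, {"t1", "t2"}),
    (1.0, {"t3"}),
    (1.0, {"t4"}),
    (1.0, {"t3", "t4"}),
]

d_unifrac = unweighted_unifrac(toy_branches, {"t1", "t2"}, {"t2", "t3"})
print(d_unifrac)  # 0.6
```

Of the 5 units of observed branch length, 3 lead to taxa from only one sample, so the distance is 0.6. It is the same "fraction not shared" logic as Jaccard, but counted in branch length instead of taxa.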

Now let's look at Euclidean distance:

the square root of
    the sum of
        the squares of
            the difference in abundance of each taxon between the two samples

How could that possibly be useful!?!
Who was crazy enough to invent the Euclidean distance?
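
For comparison, here is the same kind of sketch for Euclidean distance, computed as the square root of the sum of squared per-taxon abundance differences. The abundance tables and the `euclidean_distance` helper are invented for illustration.

```python
import math

def euclidean_distance(abund_a, abund_b):
    """abund_a, abund_b: dicts mapping taxon name -> abundance."""
    taxa = set(abund_a) | set(abund_b)
    # A taxon missing from a sample counts as abundance 0 there
    squared_diffs = (
        (abund_a.get(t, 0.0) - abund_b.get(t, 0.0)) ** 2 for t in taxa
    )
    return math.sqrt(sum(squared_diffs))

# Hypothetical abundances: differences are 3 and -4
sample_a = {"t1": 5.0, "t2": 1.0}
sample_b = {"t1": 2.0, "t2": 5.0}

d_euclid = euclidean_distance(sample_a, sample_b)
print(d_euclid)  # 5.0
```

Unlike Jaccard and UniFrac, the result is not a fraction between 0 and 1, and it changes with the scale of the abundances, which is part of why it is harder to interpret.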

Colin