Using PCoA versus PCA

Hi @Emily_Yu,

Welcome to the :qiime2: forum!

I think the classic source for this is Numerical Ecology by Legendre and Legendre. I think it's chapter 9, but I'm not sure off the top of my head. I just went hunting for GUSTA ME, which is another excellent resource, but it doesn't seem to be online right now.

There are two terms that we use which are very similar:

  • Principal Components Analysis - PCA
  • Principal Coordinates Analysis - PCoA

From a general perspective, you can think of PCA as a special case of PCoA, although the steps are a little different.
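If you want to convince yourself of the "special case" part, here is a rough numerical check. It's only a sketch on made-up random data, and the function names (`pca_scores`, `pcoa_scores`) are mine rather than anything from QIIME 2 or scikit-bio. It runs PCA on a table, runs a classical PCoA on the Euclidean distances computed from the same table, and checks that the sample coordinates agree up to a sign flip of each axis.

```python
import numpy as np

def pca_scores(table, n_axes=2):
    """PCA sample scores via SVD of the column-centered table."""
    centered = table - table.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_axes] * s[:n_axes]

def pcoa_scores(dist, n_axes=2):
    """Classical PCoA: double-center the squared distances, then eigendecompose."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_axes]  # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))

rng = np.random.default_rng(0)
table = rng.normal(size=(6, 4))  # made-up data: 6 samples x 4 features

# Euclidean distances between samples (rows of the table)
diff = table[:, None, :] - table[None, :, :]
euclidean = np.sqrt((diff ** 2).sum(axis=-1))

pca = pca_scores(table)
pcoa = pcoa_scores(euclidean)

# Each axis is only defined up to sign, so compare absolute values
same = np.allclose(np.abs(pca), np.abs(pcoa), atol=1e-8)
```

Swap the Euclidean matrix for any other distance and the PCoA half still runs, but the PCA half has nothing equivalent to offer.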

In Principal Components Analysis, we:

  1. Start with a table of features (or transformed table)
  2. Calculate the Euclidean distance between the samples, based on their features (for those paying attention, Aitchison = Euclidean distance on CLR-transformed data)
  3. Use an eigenvalue based ordination to project the distances into lower dimension space (this is a lot of linear algebra that's slightly over my head, TBH)
  4. Use the known transform to place the features into the same space (optional)
  5. Show off your shiny new PCA and evaluate similarity
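The PCA steps above can be sketched in a few lines of NumPy. Treat this as an illustration, not as what any particular QIIME 2 plugin does: the count table is made up, the helper names (`clr`, `pca`) are mine, and the pseudocount of 1 is an assumption I added so the log doesn't blow up on zeros. In practice the distance step is implicit, because the SVD of the centered table gives you the same sample coordinates without ever building a distance matrix.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform (pseudocount is an assumption to avoid log(0))."""
    logged = np.log(counts + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)

def pca(table, n_axes=2):
    """Return sample scores, feature loadings, and proportion of variance explained."""
    centered = table - table.mean(axis=0)           # center each feature
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    sample_scores = u[:, :n_axes] * s[:n_axes]      # where the samples land
    feature_loadings = vt[:n_axes].T                # the known transform, usable for a biplot
    explained = (s ** 2) / (s ** 2).sum()
    return sample_scores, feature_loadings, explained[:n_axes]

# Made-up table: 5 samples (rows) x 4 features (columns)
counts = np.array([
    [10,  0,  5, 85],
    [12,  1,  4, 83],
    [50, 30, 15,  5],
    [48, 35, 12,  5],
    [ 5, 60, 30,  5],
], dtype=float)

scores, loadings, explained = pca(clr(counts))
```

The Euclidean distance between two rows of `clr(counts)` is the Aitchison distance between those samples, and the `loadings` are what let you draw features into the same plot.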

In Principal Coordinates Analysis, we:

  1. Start with a distance matrix. These can be any distances you want (and occasionally a dissimilarity). Bray-Curtis, unweighted UniFrac, Aitchison, Jensen-Shannon, PCoA doesn't care. Got a metric that compares two samples on the basis of their favorite musical genre :musical_note: ?
  2. Use an eigenvalue based ordination to project the distances into low dimensional space
  3. Show off your shiny new PCoA and evaluate similarity
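Here is the same idea for the PCoA steps, again as a hedged sketch. The table is made up, `bray_curtis` and `pcoa` are my own illustrative helpers (not the scikit-bio or QIIME 2 implementations), and clipping negative eigenvalues to zero is a simplification I chose; non-Euclidean measures like Bray-Curtis can produce negative eigenvalues, and there are more careful corrections for that.

```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between samples (rows)."""
    n = counts.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(counts[i] - counts[j]).sum()
            den = (counts[i] + counts[j]).sum()
            dist[i, j] = dist[j, i] = num / den
    return dist

def pcoa(dist, n_axes=2):
    """Classical PCoA on any square distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering   # double-centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]                   # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0, None)                # simplification: drop negative eigenvalues
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained

# Made-up table: 5 samples (rows) x 4 features (columns)
counts = np.array([
    [10,  0,  5, 85],
    [12,  1,  4, 83],
    [50, 30, 15,  5],
    [48, 35, 12,  5],
    [ 5, 60, 30,  5],
], dtype=float)

dist = bray_curtis(counts)
coords, explained = pcoa(dist)
```

Note that `pcoa` only ever sees `dist`. It never touches the feature table, which is why there are no feature loadings to plot afterwards.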

There are a few points of comparison:

| | PCA | PCoA |
|---|---|---|
| Input | feature table | distance matrix |
| Distance used | Euclidean only | any distance you want |
| Clusters based on metadata | No | No |
| Uses eigenvalues | Yes | Yes |
| Can automagically map features into a biplot | Yes | No |

Best,
Justine