Using PCoA versus PCA

jwdebelius · April 5, 2023, 2:14pm

Welcome to the :qiime2: forum!

I think the classic source for this is Numerical Ecology by Legendre and Legendre. I think it's chapter 9, but I'm not sure off the top of my head. I also went hunting for GUSTA ME, which is another excellent resource, but it doesn't seem to be online right now.

There are two terms that we use which are very similar:

Principal Components Analysis - PCA
Principal Coordinates Analysis - PCoA

From a general perspective, you can think of PCA as a special case of PCoA, although the steps are a little different.

In Principal Components Analysis, we:

Start with a table of features (or transformed table)
Calculate the Euclidean distance between the samples, based on their features (for those paying attention, Aitchison = Euclidean distance on CLR-transformed data)
Use an eigenvalue-based ordination to project the distances into a lower-dimensional space (this is a lot of linear algebra that's slightly over my head, TBH)
Use the known transform to place the features into the same space (optional)
Show off your shiny new PCA and evaluate similarity
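
If it helps to see those PCA steps written out, here is a minimal sketch in plain NumPy. It is not what any QIIME 2 plugin runs internally; the toy `table` and the `pca` helper are made up for illustration, and I'm assuming samples are rows and features are columns.

```python
import numpy as np

# Hypothetical toy table: 4 samples (rows) x 3 features (columns).
table = np.array([
    [2.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
    [0.0, 4.0, 2.0],
    [1.0, 5.0, 3.0],
])

def pca(x, n_components=2):
    """Illustrative PCA: returns sample scores, feature loadings,
    and the proportion of variance explained by each kept axis."""
    # Center each feature so the ordination is about variation, not the mean.
    centered = x - x.mean(axis=0)
    # SVD of the centered table; this is equivalent to an eigendecomposition
    # of the feature covariance matrix.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = vt[:n_components].T                    # feature directions
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]

scores, loadings, explained = pca(table)
print(scores)     # where each sample lands on the first two axes
print(loadings)   # how each feature contributes to those axes
```

The `loadings` are the "known transform" from the optional step: because PCA starts from the feature table, the features can be drawn in the same space as the samples.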

In Principal Coordinates Analysis, we:

Start with a distance matrix. These can be any distances you want (and occasionally a dissimilarity). Bray-Curtis, unweighted UniFrac, Aitchison, Jensen-Shannon, it doesn't care. Got a metric that compares two samples on the basis of their favorite musical genre?
Use an eigenvalue-based ordination to project the distances into a low-dimensional space
Show off your shiny new PCoA and evaluate similarity
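
The PCoA steps can be sketched the same way. This is a rough illustration of classical PCoA (double-centering the squared distances, then an eigendecomposition), not the QIIME 2 or scikit-bio implementation. The `bray_curtis` and `pcoa` helpers and the toy `table` are hypothetical names I'm using for the example.

```python
import numpy as np

# Hypothetical toy table: 4 samples (rows) x 3 features (columns).
table = np.array([
    [2.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
    [0.0, 4.0, 2.0],
    [1.0, 5.0, 3.0],
])

def bray_curtis(x):
    """Pairwise Bray-Curtis dissimilarity between the rows of x."""
    n = x.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d[i, j] = np.abs(x[i] - x[j]).sum() / (x[i] + x[j]).sum()
    return d

def pcoa(dist, n_components=2):
    """Illustrative classical PCoA on a square distance matrix."""
    n = dist.shape[0]
    # Double-center the squared distances (Gower centering).
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    # The centered matrix is symmetric, so eigh is appropriate.
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Non-Euclidean dissimilarities can give negative eigenvalues;
    # this sketch simply clips them to zero for the kept axes.
    kept = np.clip(eigvals[:n_components], 0, None)
    coords = eigvecs[:, :n_components] * np.sqrt(kept)
    return coords, eigvals

dist = bray_curtis(table)
coords, eigvals = pcoa(dist)
print(coords)   # sample coordinates on the first two axes
```

Notice that `pcoa` never sees the feature table, only `dist`. You could swap `bray_curtis` for any other distance and the rest would run unchanged.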

There are a few points of comparison:

	PCA	PCoA
Input	Feature table	Distance matrix
Distance used	Euclidean only	Any distance you want
Clusters based on metadata	No	No
Uses eigenvalues	Yes	Yes
Can automagically map features into a biplot	Yes	No
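
The "PCA is a special case of PCoA" idea can also be checked numerically. Here is a small sketch, assuming a random made-up table: running PCoA on Euclidean distances between samples should give the same sample coordinates as PCA, up to a sign flip on each axis.

```python
import numpy as np

rng = np.random.default_rng(0)
table = rng.random((6, 4))  # hypothetical 6 samples x 4 features

# PCA sample scores via SVD of the centered table.
centered = table - table.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
pca_scores = u[:, :2] * s[:2]

# PCoA on the Euclidean distances between samples.
diff = table[:, None, :] - table[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))
n = dist.shape[0]
centering = np.eye(n) - np.ones((n, n)) / n
gower = -0.5 * centering @ (dist ** 2) @ centering
eigvals, eigvecs = np.linalg.eigh(gower)
order = np.argsort(eigvals)[::-1][:2]      # two largest eigenvalues
pcoa_coords = eigvecs[:, order] * np.sqrt(eigvals[order])

# Axes are only defined up to sign, so compare absolute values.
same = np.allclose(np.abs(pca_scores), np.abs(pcoa_coords))
print(same)
```

With any non-Euclidean distance in place of `dist`, the two ordinations would generally stop matching, which is the practical difference between the two columns of the table.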

Best,
Justine