Hi, I would like to ask about the microbial loadings on each axis of PCoA plots.
I have already read the two threads above and got some results for my 16S analysis data. However, the results I got are not sufficient, so here are my questions.
The PCoA biplot shows the five most prominent features, but many features in the metadata table are assigned to the same taxon. Do the features on the 3D plot represent non-redundant taxa, or does each arrow show exactly one feature and not the other features assigned to the same taxon?
Can I get the table for loadings (or contribution) of each bacterial taxon to each PCoA axis for all taxa?
QIIME 2 can operate either on ASVs or on tables collapsed to taxa. Unless you specifically collapsed the table to a taxonomic level before creating the biplot, you would initially have received a biplot labeled with individual feature hashes. The taxa names are just labels added to make the plot readable (in place of the raw sequence or hash), so each arrow is still a single feature.
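To make the difference between an ASV table and a collapsed table concrete, here is a minimal sketch in plain Python. The feature IDs, counts, and taxonomy labels are made up for illustration, and `collapse_to_taxa` is a hypothetical helper, not a QIIME 2 function. It only shows the idea that collapsing sums all features sharing a label, whereas an uncollapsed table keeps them as separate features that happen to carry the same label.

```python
from collections import defaultdict

# Hypothetical ASV table: feature ID -> counts in samples S1, S2, S3.
asv_table = {
    "asv_a1": [10, 0, 5],
    "asv_b2": [3, 7, 0],
    "asv_c3": [0, 2, 8],
}

# Hypothetical taxonomy labels; two ASVs share the same genus.
taxonomy = {
    "asv_a1": "g__Bacteroides",
    "asv_b2": "g__Bacteroides",
    "asv_c3": "g__Prevotella",
}


def collapse_to_taxa(table, labels):
    """Sum the counts of all features that share a taxonomy label."""
    n_samples = len(next(iter(table.values())))
    collapsed = defaultdict(lambda: [0] * n_samples)
    for feature_id, counts in table.items():
        taxon = labels[feature_id]
        for i, count in enumerate(counts):
            collapsed[taxon][i] += count
    return dict(collapsed)


collapsed_table = collapse_to_taxa(asv_table, taxonomy)
print(collapsed_table)
```

In the uncollapsed table, `asv_a1` and `asv_b2` would each get their own arrow candidate, both labeled `g__Bacteroides`; in the collapsed table there is only one `g__Bacteroides` row.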
It is not possible to get loading scores for any type of PCoA, because it is not a linear transformation of the features. It is computed from a distance matrix, and distance functions preserve no information about which features make up the distances.
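A small sketch of why the distance matrix carries no feature identity, using plain Python and made-up counts. Bray-Curtis is used here only as an example metric, and the helper names are hypothetical. Reordering the feature columns gives exactly the same distance matrix, and since the distance matrix is all that PCoA receives, the ordination by itself cannot tell the features apart.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples' count vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


def distance_matrix(samples):
    """Pairwise Bray-Curtis dissimilarities between all samples."""
    return [[bray_curtis(u, v) for v in samples] for u in samples]


# Hypothetical counts: rows are samples, columns are features f1, f2, f3.
samples = [
    [10, 0, 5],
    [3, 7, 0],
    [0, 2, 8],
]

# The same samples with the feature columns reordered to f3, f1, f2.
shuffled = [[row[2], row[0], row[1]] for row in samples]

original_dm = distance_matrix(samples)
shuffled_dm = distance_matrix(shuffled)

# The two matrices are identical, so the feature order (and identity) is lost.
same = original_dm == shuffled_dm
print(same)
```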
Here's a detailed explanation: Using PCoA versus PCA - #3 by jwdebelius
Thank you again. I would like to ask a few more questions!
Analyzing microbiome data using ASV features is better than using taxa. Is that correct?
Loading scores on both PCoA and PCA are meaningless, so there is no need to struggle to get them. Is that correct?
So, what is the meaning of the five most prominent features on the PCoA biplot? How are the five features selected? Can the lengths of the five arrows be expressed in numbers? Or are those numbers meaningless as well?
The image above is my PCoA-biplot result.
I thought the length of each arrow could be expressed as a number, and that this number could be decomposed into loading scores for each axis using the basic Pythagorean theorem.
And I thought if I could get five arrows for five ASV features, then I could get all arrows corresponding to all ASV features.
If my words sound offensive, I am sorry; I do not mean them that way.
I am only asking in order to make these notions clearer to me.
Yes, it is better to analyze ASVs than taxa. The definition of a taxon is complicated and subjective. ASVs are clearly defined.
There is no way to get loading scores in PCoA. There is a way in PCA, and there they are meaningful. Microbiome data, however, are compositional, and their analysis is more intricate, so PCA shouldn't be applied here at all.
In the picture you can see which features drive the separation on the PCoA: feature arrows point toward the samples where those features are more abundant. In your plot, the upward arrows show that the separation between those two samples and the rest is driven mostly by these two features. Hope that helps!
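On the question of arrow lengths: as far as I know, the biplot ranks features by the length of their arrow (Euclidean distance from the origin), so the length can be written as a number. Below is a rough sketch in plain Python, assuming you have the feature coordinates from the biplot ordination available. The coordinates and feature IDs here are made up, and `arrow_length` and `top_features` are hypothetical helpers, not QIIME 2 functions. Note that the per-axis components are feature coordinates projected onto the ordination, not PCA-style loadings.

```python
import math

# Hypothetical feature coordinates on the first three PCoA axes.
feature_coords = {
    "asv_a1": (0.30, 0.40, 0.00),
    "asv_b2": (-0.10, 0.05, 0.02),
    "asv_c3": (0.00, -0.60, 0.80),
    "asv_d4": (0.20, 0.00, 0.00),
}


def arrow_length(coords):
    """Euclidean length of an arrow drawn from the origin to coords."""
    return math.sqrt(sum(c * c for c in coords))


def top_features(coords_by_feature, n):
    """Return the n feature IDs with the longest arrows, longest first."""
    ranked = sorted(
        coords_by_feature,
        key=lambda feature: arrow_length(coords_by_feature[feature]),
        reverse=True,
    )
    return ranked[:n]


lengths = {f: arrow_length(c) for f, c in feature_coords.items()}
top_two = top_features(feature_coords, 2)
print(top_two)
```

The same ranking works for any number of features, so with coordinates for all ASVs you could list every arrow rather than only the five that are drawn.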