Hello everyone!
I recently did an rPCA using Gemelli. I followed the documentation regarding the low-rank assumption, which suggests choosing a rank between 2 and 10. I used a rank of 3 because it is the default value. I am confused about how to interpret the proportion of variance explained in the resulting (bi)plot.
When I look at my PCs, the variance explained sums up to exactly 100%. Coming from PCoA, I am used to the first few axes explaining a much smaller proportion of the total variance (e.g., PCo1 11%, PCo2 5%) and having a long tail of unexplained variance.
Does the fact that my rPCA axes sum to 100% mean that:
- I have somehow captured 100% of the original biological variance in my dataset? (I don't think so.)
- This is a property of the matrix completion algorithm, where the percentage shown is relative only to the reconstructed matrix (the approximation) defined by the rank I chose?
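To make the second option concrete, here is a minimal NumPy sketch of what I suspect is happening. I have not checked Gemelli's source, so this is an assumption on my part: it uses a plain SVD on random toy data (not matrix completion, and not a real feature table), and the variable names are made up for illustration. It only shows that the proportions necessarily sum to 100% if they are normalized over the retained components alone.

```python
import numpy as np

# Toy data only: random numbers standing in for a transformed feature table.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
X = X - X.mean(axis=0)  # center each column

# Singular values of the full matrix; their squares are proportional
# to the variance along each principal axis.
s = np.linalg.svd(X, compute_uv=False)
var_all = s ** 2

k = 3  # the rank I chose (hypothetical stand-in for the rPCA rank)

# Option A: normalize over the k retained axes only.
prop_within_rank = var_all[:k] / var_all[:k].sum()

# Option B: normalize over every axis of the full matrix.
prop_of_total = var_all[:k] / var_all.sum()

print(prop_within_rank.sum())  # 1.0 up to floating-point rounding
print(prop_of_total.sum())     # strictly less than 1.0 for this toy matrix
```

If the percentages in my biplot were computed like Option A, they would sum to 100% for any rank, which would match what I am seeing.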
If it is the latter, how would you report this in a manuscript? Is it misleading to say something like "PC1 explained 60% of the variance" without clarifying that this is 60% of the low-rank approximation? It feels like lying!
Thanks in advance for your help!
Best,
Sergio