Help understanding DEICODE

@jbarlow,

Yes, the percent explained variance will spread out across the axes as you increase the rank. However, as with PCA, the input rank does not limit the ordination to exactly three clusters; there could be fewer or more. So, to answer your question: no, that should not cause overfitting. The one caveat is very high-rank (gradient-like) data such as the 88 Soils dataset. The only non-gradient case where you may need to increase the rank is when you have many samples across many environments, such as the full EMP or American Gut datasets. These cases will be very evident in the ordination.
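To make the "spreading out" point concrete, here is a minimal NumPy sketch. It is not DEICODE's code and does not call its API; it uses a plain SVD on simulated data and assumes the proportion explained is normalized over the retained axes only. The function name `proportion_explained` and the toy data are made up for illustration.

```python
import numpy as np


def proportion_explained(matrix, rank):
    """Share of variance per axis, normalized over the first `rank` axes only.

    Hypothetical helper for illustration; plain SVD, no matrix completion.
    """
    centered = matrix - matrix.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    retained = singular_values[:rank] ** 2
    return retained / retained.sum()


# Toy data (an assumption, not a real dataset): 3 clusters of 20 samples
# each, 30 features, with unit-variance noise around each cluster center.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 30))
data = np.repeat(centers, 20, axis=0) + rng.normal(size=(60, 30))

# The share assigned to the first axis shrinks as more axes are retained,
# even though the underlying data have not changed.
for rank in (2, 3, 5, 10):
    print(rank, np.round(proportion_explained(data, rank), 3))
```

The same data are used at every rank, so any change in the per-axis percentages comes only from how many axes share the total, not from a change in the cluster structure.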

For more reading on matrix completion see:

OptSpace, the matrix completion method used in DEICODE, by Keshavan, Montanari, and Oh
Maryam Fazel's (now famous) thesis paper on Nuclear Norm Minimization
Matrix Ranks by Srebro and Shraibman
Solution Guarantees by Recht, Fazel, and Parrilo
Candes and Recht on exact matrix completion

C
