This is a great question!
Two quick thoughts:
Annotation databases will introduce database bias
- Using annotations (taxonomy or KEGG) will introduce database bias into the features
- Summarizing by taxonomy or KEGG pathway will reduce resolution to that of the (limited/biased) annotations
- (Perhaps people don't make KEGG pathway PCoAs for the same reason they don't make species-level taxonomy PCoAs.) EDIT: I have been informed that some people do use summarized data like this, and I use summarized data for bar plots myself, so maybe this is fine...
A distance matrix is a distance matrix
- PCoA only needs sample-to-sample distances, whatever features they were computed from. So if you trust the pathways enough to report on them, why NOT put them into a PCoA plot?
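
To make both thoughts concrete, here is a rough sketch in plain NumPy, not the way any particular pipeline does it. Everything in it is made up for illustration: the counts, the feature-to-pathway mapping, and the helper names `bray_curtis` and `pcoa`. It runs the same distance and ordination code on a feature-level table and on a pathway-summarized table, and shows how summarizing can shrink differences between samples.

```python
import numpy as np

def bray_curtis(table):
    """Pairwise Bray-Curtis dissimilarity between rows (samples).

    Assumes no pair of samples is all zeros (that would divide by zero).
    """
    table = np.asarray(table, dtype=float)
    diff = np.abs(table[:, None, :] - table[None, :, :]).sum(axis=2)
    total = (table[:, None, :] + table[None, :, :]).sum(axis=2)
    return diff / total

def pcoa(dist, n_axes=2):
    """Classical PCoA: double-center the squared distances, then eigendecompose."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1][:n_axes]  # largest eigenvalues first
    # Bray-Curtis is not Euclidean, so small negative eigenvalues are clipped to 0.
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))

# Hypothetical counts: 4 samples (rows) x 4 features (columns).
feature_table = np.array([
    [10,  0, 5, 2],
    [ 8,  1, 6, 3],
    [ 0, 12, 3, 7],
    [ 1, 10, 4, 6],
])

# Hypothetical annotation: features 0 and 1 map to pathway A, 2 and 3 to pathway B.
membership = np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
])
pathway_table = feature_table @ membership  # sum feature counts per pathway

# Same code path for both tables: a distance matrix is a distance matrix.
feature_dist = bray_curtis(feature_table)
pathway_dist = bray_curtis(pathway_table)
feature_coords = pcoa(feature_dist)
pathway_coords = pcoa(pathway_dist)

# Samples 0 and 2 differ mostly in features 0 and 1, which share a pathway,
# so that difference largely cancels once the counts are summed.
print("feature-level distance, sample 0 vs 2:", feature_dist[0, 2])
print("pathway-level distance, sample 0 vs 2:", pathway_dist[0, 2])
```

Both tables go through the ordination without complaint, which is the "why not?" argument. The pathway-level distance between samples 0 and 2 comes out smaller than the feature-level one, which is the resolution concern from the first thought.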