I am trying to estimate microbe-metabolite interactions with mmvec (in qiime-2020.6 since tensorflow won't install in newer qiime2 versions) and my biplot and heat map look really strange!
The metabolite data are normalised to an internal octanol standard. Each sample was measured twice, so I used the mean of the two measurements and then filtered with a frequency threshold of >1.
My microbiome data is just a filtered FeatureTable of Fungi.
In the Emperor plot, all taxa align along a single arrow and Axis 1 explains >93%!
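For readers who want to reproduce the metabolite pre-processing described in this post, here is a minimal pandas sketch. It is not the original poster's code: the table layout, the column names (`sample_id`, `metab_A`, ...) and the toy values are hypothetical, and "frequency > 1" is assumed to mean a feature's total across all samples, which is how `--p-min-frequency` works in `qiime feature-table filter-features`.

```python
import pandas as pd

# Hypothetical toy table: one row per injection, two injections per sample,
# values already normalised to the internal standard.
raw = pd.DataFrame(
    {
        "sample_id": ["S1", "S1", "S2", "S2", "S3", "S3"],
        "metab_A": [10.0, 12.0, 0.0, 0.0, 4.0, 6.0],
        "metab_B": [0.5, 0.5, 0.0, 1.0, 0.0, 0.0],
        "metab_C": [3.0, 5.0, 2.0, 2.0, 8.0, 6.0],
    }
)

# Collapse the duplicate measurements to one row per sample (mean of the two).
mean_table = raw.groupby("sample_id").mean()

# Assumed interpretation of the filter: keep metabolites whose total
# across all samples is strictly greater than the threshold.
min_frequency = 1.0
totals = mean_table.sum(axis=0)
filtered = mean_table.loc[:, totals > min_frequency]

print(filtered)
```

If the filter was instead applied per sample or on prevalence, the last step would need to change accordingly.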
It might not be a problem -- I would be curious to see what the alpha / beta diversity of both your microbiome and metabolome looks like. If you have an acute stressor (e.g. antibiotics or a pathogen takeover), you could see a dramatic shift occurring in your community (which is what your very skewed MMvec PC axes hint at).
A Pseudo-Q2 = 0.63 is very good; I haven't seen many studies that have achieved that level of cross-validation accuracy. So you may have something biologically interesting.
I double checked with my collaborator who generated the metabolomics data and we now think that some samples were burnt in the GC-MS because he found some Maillard reaction products in some of them.
In the beta diversity PCoA plots of the metabolites, one can also see that a few samples are far away from the cluster containing most samples.
I think the way to go now is to remove these outlier samples, but I really don't want to delete individual samples simply because they appear a bit off in the PCoA plots.
Do you have any recommendations on how to approach this?
Thanks again!
I am attaching the PCoA plots with various diversity metrics for the microbiome as well as the metabolite data:
We used Isolation Forests to remove outliers, and our Emperor plots now look more like what we expected - the arrows stratify better - while retaining high predictive accuracy!
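For anyone landing here with the same problem, this is a rough sketch of what Isolation Forest outlier flagging can look like with scikit-learn. It is not the exact code used above: the data are simulated, the sample IDs are hypothetical, and `contamination=0.1` is an arbitrary illustrative setting that would need to be chosen for a real metabolite table.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Simulated stand-in for a samples x metabolites matrix (assumption):
# 30 typical samples plus 3 samples shifted far away from the rest.
typical = rng.normal(loc=0.0, scale=1.0, size=(30, 5))
shifted = rng.normal(loc=8.0, scale=1.0, size=(3, 5))
X = np.vstack([typical, shifted])
sample_ids = [f"S{i}" for i in range(len(X))]  # hypothetical IDs

# contamination is the expected fraction of outliers (illustrative value).
forest = IsolationForest(n_estimators=200, contamination=0.1, random_state=0)
labels = forest.fit_predict(X)        # 1 = inlier, -1 = flagged as outlier
scores = forest.decision_function(X)  # negative values correspond to -1 labels

flagged = [sid for sid, lab in zip(sample_ids, labels) if lab == -1]
kept = [sid for sid, lab in zip(sample_ids, labels) if lab == 1]

print("flagged as outliers:", flagged)
print("samples kept:", len(kept))
```

The list in `kept` could then be used to subset both the metabolite and the microbiome tables before re-running mmvec. Since `contamination` fixes how many samples get flagged, it is worth checking the flagged samples against independent evidence (such as the Maillard reaction products mentioned above) rather than dropping them blindly.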