I’m starting to go through and add the percentile-normalized data type (FeatureTable[PercentileNormalized]) to the relevant functions where it should be an allowable input, and noticed that q2-sample-classifier currently only accepts FeatureTable[Frequency] as input (L83 in plugin_setup.py).
Philosophically speaking, machine learning classifiers should accept any type of feature data (presence/absence, relative abundance, etc.). Practically speaking, will this break the underlying sklearn code at any point? I don’t think it should (though it might produce weird results, especially in regression), but I wanted to get another brain on this before submitting the PR.
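For what it's worth, here is a quick sanity check of the "will sklearn break" question, run outside of QIIME 2 entirely. It is only a rough sketch: the toy table, the labels, and the choice of RandomForestClassifier are my own assumptions for illustration, not what q2-sample-classifier necessarily does internally, and the "percentile" table is a simple per-feature rank scaling standing in for real percentile normalization.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical toy data: 40 samples x 10 features of integer counts.
counts = rng.poisson(lam=20, size=(40, 10))
labels = np.array([0, 1] * 20)

# Float-valued views of the same table.
relative = counts / counts.sum(axis=1, keepdims=True)
presence = (counts > 0).astype(float)
# Stand-in for percentile normalization (assumption): per-feature ranks
# rescaled to the 0-100 range.
ranks = counts.argsort(axis=0).argsort(axis=0)
percentile = 100.0 * ranks / (counts.shape[0] - 1)

tables = {
    "frequency": counts,
    "relative": relative,
    "presence_absence": presence,
    "percentile": percentile,
}

# Fit the same estimator on each view and record the prediction shape.
fitted = {}
for name, table in tables.items():
    clf = RandomForestClassifier(n_estimators=10, random_state=0)
    clf.fit(table, labels)
    fitted[name] = clf.predict(table).shape
```

If every view fits and predicts without raising, that at least suggests the estimator itself does not care whether the values are integer counts or floats. It says nothing about whether the results are sensible, which is the part I'm less sure about for regression.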
Here’s the commit with my proposed edits to