I am curious about options for feature reduction when using classify-samples-ncv vs. sample-classifier. With sample-classifier (random forest), you can pass --p-optimize-feature-selection to enable recursive feature elimination, which helps with dimensionality reduction and downstream analysis. This is not an option with classify-samples-ncv. So how would you go about doing something similar in a statistically relevant/justifiable way?
I have a dataset that I ran classify-samples-ncv (random forest) on. The accuracy was pretty good, but I wanted to see what would happen if I reduced features. So I used the importance .qza table, kept only the top "X" most important features, and re-ran the classifier. I repeated this, reducing "X" by 5 features each time. As I did, model accuracy improved to a point, plateaued, and then began to decline; that decline is where I would want to cut things off. But I am pretty sure this is not the proper method/approach to feature reduction. Is there some option with classify-samples-ncv that I am missing that would approximate this approach, or what recursive feature elimination achieves with sample-classifier?
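For reference, here is roughly the procedure I followed, as a minimal library-free sketch. The `evaluate` stand-in and the importance scores are made-up placeholders for "re-run classify-samples-ncv and read off the accuracy"; they are not QIIME 2 output. Note this is just my ad-hoc top-X sweep: unlike true recursive feature elimination (e.g. scikit-learn's RFECV, which I believe sample-classifier uses under the hood), the feature ranking here is fixed from the initial run and never re-computed after each elimination step.

```python
def top_x_sweep(importances, evaluate, step=5):
    """Return (best_x, history) where history maps X -> accuracy.

    importances: dict of feature id -> importance score
    evaluate:    callable taking a list of feature ids, returning accuracy
    """
    # Rank features once by their initial importance (most important first).
    ranked = sorted(importances, key=importances.get, reverse=True)
    history = {}
    x = len(ranked)
    while x > 0:
        # "Re-run the classifier" on only the top-X features.
        history[x] = evaluate(ranked[:x])
        x -= step
    # Cut off where accuracy peaks, as described in the post.
    best_x = max(history, key=history.get)
    return best_x, history

# Toy accuracy curve that rises, plateaus, then declines as too few
# features remain -- mimicking the pattern I observed.
def fake_accuracy(features):
    x = len(features)
    return 0.9 - 0.0004 * (x - 20) ** 2

# Hypothetical importance table: 50 features with decaying scores.
importances = {f"ASV{i}": 1.0 / (i + 1) for i in range(50)}
best_x, history = top_x_sweep(importances, fake_accuracy)
print(best_x)  # the X with the highest (toy) accuracy
```

In a real run, `evaluate` would mean filtering the feature table to the selected IDs (e.g. with qiime feature-table filter-features) and re-running classify-samples-ncv, which is exactly the repeated-peeking-at-accuracy problem I suspect makes this approach statistically questionable.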