In my opinion, NCV is generally better: it trains and tests across multiple iterations, so it gives a sense of the variance in accuracy as well as predictions for all samples.
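To make that concrete, here is a minimal scikit-learn sketch of the two training/testing schemes. It is not the plugin's actual code: the synthetic data, the random forest settings, and the 80/20 split are all assumptions for illustration, and the inner tuning loop that nested CV normally includes is left out to keep it short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (
    StratifiedKFold,
    cross_val_predict,
    cross_val_score,
    train_test_split,
)

# Synthetic stand-in for a feature table (rows = samples) and a metadata column.
X, y = make_classification(
    n_samples=100, n_features=20, n_informative=5, random_state=0
)

# Single split: one training run, one accuracy score,
# and predictions only for the held-out samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
single_accuracy = clf.score(X_test, y_test)
single_predictions = clf.predict(X_test)

# K-fold: K training runs, K accuracy scores,
# and every sample is predicted exactly once by a model that never saw it.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_accuracies = cross_val_score(
    RandomForestClassifier(n_estimators=50, random_state=0), X, y, cv=cv
)
all_predictions = cross_val_predict(
    RandomForestClassifier(n_estimators=50, random_state=0), X, y, cv=cv
)

print("single split accuracy:", single_accuracy)
print("per-fold accuracies:", fold_accuracies)
print("spread across folds (std):", fold_accuracies.std())
print("samples predicted:", len(single_predictions), "vs", len(all_predictions))
```

The single split predicts only the 20 held-out samples and yields one accuracy value; the K-fold scheme predicts all 100 samples and yields five accuracy values whose spread you can inspect.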
Yes, because these use the same methods but different training/testing schemes.
Not exactly: the difference is which samples are included in training, and how many times training occurs. This could indeed be a matter of which features are tested, but not necessarily.
No. Importance is determined on the training set, so the difference is that with classify-samples importance is determined on only some of the samples and only once, whereas with NCV importance is averaged across iterations.
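A rough sketch of that difference in feature importance, again using scikit-learn directly rather than the plugin. The helper `fit_importances`, the synthetic data, and the model settings are hypothetical and only illustrate the idea.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, train_test_split

# Synthetic stand-in for a feature table and a metadata column.
X, y = make_classification(
    n_samples=100, n_features=20, n_informative=5, random_state=0
)


def fit_importances(X_train, y_train):
    """Hypothetical helper: train one model and return its feature importances."""
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_train, y_train)
    return model.feature_importances_


# Single split: importance comes from one model trained on one subset of samples.
X_train, _, y_train, _ = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
single_importances = fit_importances(X_train, y_train)

# K-fold: one set of importances per fold, each from a different training subset,
# then averaged across folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
per_fold = np.array(
    [fit_importances(X[train_idx], y[train_idx]) for train_idx, _ in cv.split(X, y)]
)
mean_importances = per_fold.mean(axis=0)

print("top features (single split):", np.argsort(single_importances)[::-1][:5])
print("top features (fold average):", np.argsort(mean_importances)[::-1][:5])
```

The two rankings can differ, because the single-split importances depend on which samples happened to land in the one training set.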
classify-samples is a complete pipeline and easier to use. It also outputs a trained classifier that can be re-used to predict other samples (NCV does not, since it trains K classifiers!). The motivations for NCV are the ones I mentioned above.
Both are perfectly valid to use, so it just depends on which is more suited to your use case... and the type of training/testing scheme you want to use.