update scikit-learn version

thermokarst · July 22, 2020, 3:33pm

I will address this by answering some of the questions raised by @the_dummy & @devonorourke:

Check this part of the error message to see which version the classifier was trained with:

This is the right answer: if you're using an old version of QIIME 2 (which you might be, @rparadiso), then you just need to use the version of the pretrained classifier that was released with that version of QIIME 2. The easiest way to do that is to go to docs.qiime2.org, and in the upper left, select the version of QIIME 2 you are using, then navigate to the "Data resources" link.
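If you aren't sure which scikit-learn version your environment has, here is a minimal sketch for checking from Python inside the activated QIIME 2 environment. The helper name `installed_version` is made up for this example; `importlib.metadata` is part of the standard library in Python 3.8 and later. Running `qiime info` on the command line reports the QIIME 2 release itself.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Compare this value with the version named in the error message.
print("scikit-learn:", installed_version("scikit-learn"))
```

If the two versions differ, the fix is still the one above: pick the classifier that matches your QIIME 2 release rather than changing scikit-learn in place.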

No, this error is generated (deliberately) by q2-feature-classifier, not scikit-learn.

No, because the underlying sklearn model can change between versions. Worst-case scenario, if we didn't let q2-feature-classifier error out, you could see classification results that appear to have run successfully but are in fact incorrect (we call this a data-integrity bug - think "uh oh, I need to retract a paper" sort of issue). That scenario isn't super likely IMO; a more likely outcome is that sklearn would just crash, potentially with a less clear error message.
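To illustrate the idea of a deliberate version check (this is a rough sketch, not the actual q2-feature-classifier code), you can record the library version next to a pickled model and refuse to unpickle when it doesn't match. The function names, the byte layout, and the version numbers below are all assumptions made up for the example, and a plain dict stands in for a fitted classifier.

```python
import pickle


def dump_with_version(model, library_version):
    """Prefix the pickled model with the version that produced it."""
    return library_version.encode("utf-8") + b"\n" + pickle.dumps(model)


def load_checked(blob, installed_version):
    """Unpickle the model only if the recorded version matches."""
    # Read the version header first, so a mismatched model is never unpickled.
    header, _, body = blob.partition(b"\n")
    trained_with = header.decode("utf-8")
    if trained_with != installed_version:
        raise ValueError(
            f"Model was trained with version {trained_with}, "
            f"but version {installed_version} is installed."
        )
    return pickle.loads(body)


# Stand-in for a fitted classifier (hypothetical data and version numbers).
blob = dump_with_version({"weights": [0.1, 0.9]}, "0.21.2")
model = load_checked(blob, "0.21.2")  # versions match, so this loads
```

Calling `load_checked(blob, "0.23.1")` would raise a `ValueError` instead of returning a model that might silently behave differently, which is the trade-off described above: a loud failure up front rather than results you can't trust.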

We hope that one day scikit-learn will have portable serializations of their classifier models, which would make this process a little bit smoother, but no clue if that is even on their radar.

BTW, this is a useful read relevant to this discussion - scikit-learn: machine learning in Python