The scikit-learn version (0.23.1) used to generate this artifact does not match the current version of scikit-learn installed (0.22.2.post1). Please retrain your classifier for your current deployment to prevent data-corruption errors.
Can someone help me understand how to update the classifier to the required version?
I just ran into this issue also. Will sklearn fail just because there are different versions? It’s not a big deal to rerun code with an updated qiime deployment, but these classifier trainings can take several days.
Any chance this kind of error message can just be a warning, but not kill the process?
I will address this by answering some of the questions raised by @the_dummy & @devonorourke:
Check this part of the error message to see which version the classifier was trained with:
This is the right answer: if you're using an old version of QIIME 2 (which you might be, @rparadiso), then you just need to use the version of the pretrained classifier that was released with that version of QIIME 2. The easiest way to do that is to go to docs.qiime2.org, and in the upper left, select the version of QIIME 2 you are using, then navigate to the "Data resources" link.
No, this error is generated (deliberately) by q2-feature-classifier, not scikit-learn.
No, because the underlying sklearn model can change between versions. Worst-case scenario, if we didn't let q2-feature-classifier error out, you could see classification results that appear to have run successfully, but in fact have produced incorrect results (we call this a data-integrity bug - think "uh oh, I need to retract a paper" sort of issue). That scenario isn't super likely IMO; a more likely situation is that sklearn would just crash, potentially with a less clear error message.
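To illustrate the idea of failing early on a version mismatch, here is a minimal sketch of that kind of guard. It is not the actual q2-feature-classifier code; the names `VersionMismatchError` and `check_sklearn_version` are hypothetical and exist only for this example.

```python
class VersionMismatchError(ValueError):
    """Hypothetical error type used only in this sketch."""


def check_sklearn_version(trained_with: str, installed: str) -> None:
    """Raise if the classifier was trained with a different sklearn version.

    Assumption: both versions are available as plain strings, e.g. one
    stored alongside the pickled model and one taken from the environment.
    """
    if trained_with != installed:
        # Stop before unpickling, rather than risk a crash or silently
        # wrong classifications from an incompatible model.
        raise VersionMismatchError(
            f"The scikit-learn version ({trained_with}) used to generate "
            f"this artifact does not match the current version of "
            f"scikit-learn installed ({installed})."
        )


if __name__ == "__main__":
    try:
        check_sklearn_version("0.23.1", "0.22.2.post1")
    except VersionMismatchError as err:
        print(err)
```

In a real environment the installed version would come from `sklearn.__version__`; it is passed in as a string here so the sketch runs without scikit-learn installed.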
We hope that one day scikit-learn will have portable serializations of their classifier models, which would make this process a little bit smoother, but no clue if that is even on their radar.
Thanks @thermokarst, appreciate the insider knowledge.
This feels like a really stupid question… the error message shows which version of sklearn was used to generate the --i-classifier object, but how do I go about identifying what QIIME version was used to create that same object? If I run qiime tools peek..., I don’t get that specific info:
UUID: 50e48dcb-e40c-4757-b918-2bacfbf0afc6
Type: TaxonomicClassifier
Data format: TaxonomicClassiferTemporaryPickleDirFmt
but maybe there’s something in the UUID that will be useful? I thought maybe I’m supposed to view it in view.qiime2.org and get other provenance info, but the example doesn’t show which QIIME version was used in the screenshot in that example.
Related stupid question / feature request: would it be possible to add the QIIME version associated with any artifact as an output in that qiime tools peek... command?
Turns out I was trying to run @Nicholas_Bokulich 's hybrid classifier, so I ended up needing not only to figure out which QIIME version I used to generate that classifier object, but I also needed to resolve when the hybrid classifier became an option. I’m going to have to recreate the classifier object, so at the moment on Monsoon I’m eating up something like 1TB of RAM. Apologies to anyone in Flagstaff…
Not a silly question at all - we don't have a great solution here - maybe we need to publish a simple table or something with this information. Otherwise, the options are scrolling through different versions of the docs, or checking out the commit history of qiime2/q2-feature-classifier on GitHub (cc @Oddant1 - let's chat about this, I could use a hand putting together this table, if you're available).
Ah bummer, too bad screenshots can't scroll! The framework version is literally just below the bottom of that image cutoff:
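For anyone who would rather check this from a script than from view.qiime2.org, here is a rough sketch that reads the framework version straight out of the artifact file. It assumes a .qza is a zip archive with a single top-level directory named after the UUID, containing a `VERSION` file with a `framework:` line; the function name `read_framework_version` and the file name in the usage note are made up for this example.

```python
import zipfile


def read_framework_version(artifact_path: str) -> str:
    """Return the framework version recorded in a .qza/.qzv archive.

    Assumption: the archive holds '<uuid>/VERSION', a small text file
    with a line of the form 'framework: <version>'.
    """
    with zipfile.ZipFile(artifact_path) as zf:
        # Keep only the top-level VERSION file; provenance subdirectories
        # may carry their own VERSION files, which sit deeper in the tree.
        candidates = [
            name for name in zf.namelist()
            if name.count("/") == 1 and name.endswith("/VERSION")
        ]
        if not candidates:
            raise ValueError(f"No top-level VERSION file in {artifact_path}")
        text = zf.read(candidates[0]).decode("utf-8")

    for line in text.splitlines():
        if line.startswith("framework:"):
            return line.split(":", 1)[1].strip()
    raise ValueError(f"No 'framework:' line in {candidates[0]}")
```

Calling `read_framework_version("classifier.qza")` on a trained classifier should return the QIIME 2 release that wrote the file, which is the same release whose "Data resources" page lists a compatible pretrained classifier.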