Warning (scikit-learn version 0.23.1) while training classifier

Hello,

I am using QIIME 2 version 2021.2 and the Silva_132_release 16S database to build my own classifier in VirtualBox. From the Silva folder, I used qiime tools import to convert the Silva_132_99_16S.fna ('FeatureData[Sequence]') and majority_taxonomy_7_levels_.txt files into .qza files.

I then extracted the reference reads using the following code:

qiime feature-classifier extract-reads \
  --i-sequences /home/qiime2/Desktop/16SClassifierTraining_Ben/Silva132_99_16S.qza \
  --p-f-primer GTGYCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACNVGGGTWTCTAAT \
  --p-trunc-len 120 \
  --o-reads /home/qiime2/Desktop/16SClassifierTraining_Ben/ref-seqs_Truncated_081221.qza

I then trained the classifier using the following code, and included the optional --p-classify--chunk-size parameter to reduce the size of the resulting classifier:

qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads /home/qiime2/Desktop/16SClassifierTraining_Ben/ref-seqs_Truncated_081221.qza \
  --i-reference-taxonomy /home/qiime2/Desktop/16SClassifierTraining_Ben/Silva132_16S_99_RefTaxonomy1.qza \
  --p-classify--chunk-size 1000 \
  --verbose \
  --o-classifier /home/qiime2/Desktop/16SClassifierTraining_Ben/classifier1.qza

The code runs and I get my classifier, but I do get the following user warning:

"The TaxonomicClassifier artifact that results from this method was trained using scikit-learn version 0.23.1. It cannot be used with other versions of scikit-learn. (While the classifier may complete successfully, the results will be unreliable.)"

Has anyone seen this warning before? Is this a version issue with q2, Silva, and/or VirtualBox? And most importantly, how big of a problem is this, and if it's big, how do I correct it?

Any help would be greatly appreciated!!

Ben

@bkramer,

It is not really an issue as long as you heed the warning! Because scikit-learn's algorithms change between versions, a classifier trained with one version will not give reliable results when the classification is run with another. In fact, QIIME 2 will not let you run it. For an example of the error you will encounter should you try to use a classifier trained by a different version of scikit-learn, see this recent post.

As long as you are using 2021.2 you should have no issues. If you change to a newer release of QIIME 2, you will have to train a new classifier, as 2021.4 uses scikit-learn 0.24.1. You can also run into issues if you install something in your environment that upgrades scikit-learn. This generally is not an issue if you don't try to manually install anything, but it is something to keep in mind.
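
If you want to confirm that nothing in your environment has changed scikit-learn, a quick check along these lines may help. This is only a sketch: the function name check_sklearn_version is made up for illustration, the trained version is copied from the warning quoted above, and it assumes that python on your PATH is the one from the activated QIIME 2 environment.

```shell
#!/usr/bin/env bash
# Version the classifier was trained with (taken from the warning text).
trained_version="0.23.1"

# Hypothetical helper: prints "match" if the given version equals the
# trained version, "mismatch" otherwise.
check_sklearn_version() {
    local installed="$1"
    if [ "$installed" = "$trained_version" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Ask the active environment for its scikit-learn version.
# Falls back to "not installed" if the import (or python itself) fails.
installed_version="$(python -c 'import sklearn; print(sklearn.__version__)' 2>/dev/null || echo 'not installed')"

echo "scikit-learn: trained=$trained_version installed=$installed_version -> $(check_sklearn_version "$installed_version")"
```

A "mismatch" here would suggest retraining the classifier in the current environment rather than reusing the old .qza file. Running qiime info should also tell you which QIIME 2 release is active.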

@Keegan-Evans thank you for clarifying that and for the quick response!
