Hello,
I am using QIIME 2 version 2021.2 and the SILVA 132 release 16S database to build my own classifier in VirtualBox. From the Silva folder, I used qiime tools import to convert the Silva_132_99_16S.fna file (as 'FeatureData[Sequence]') and the majority_taxonomy_7_levels_.txt file into .qza files.
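For reference, the import step looked roughly like the sketch below. The working directory, the variable names, and the HeaderlessTSVTaxonomyFormat input format are assumptions here (the taxonomy file is assumed to have no header row), so treat this as an approximation rather than the exact commands that were run:

```shell
# Hypothetical working directory; adjust to wherever the SILVA files live.
WORK_DIR="${WORK_DIR:-$HOME/Desktop/16SClassifierTraining_Ben}"

# Input files from the SILVA 132 release (names as given in this post).
SEQ_FNA="$WORK_DIR/Silva_132_99_16S.fna"
TAX_TXT="$WORK_DIR/majority_taxonomy_7_levels_.txt"

# Output artifacts, named to match the later extract-reads / training steps.
SEQ_QZA="$WORK_DIR/Silva132_99_16S.qza"
TAX_QZA="$WORK_DIR/Silva132_16S_99_RefTaxonomy1.qza"

# Only run the imports when QIIME 2 is on the PATH and both inputs exist.
if command -v qiime >/dev/null 2>&1 && [ -f "$SEQ_FNA" ] && [ -f "$TAX_TXT" ]; then
  # Reference sequences: FASTA -> FeatureData[Sequence]
  qiime tools import \
    --type 'FeatureData[Sequence]' \
    --input-path "$SEQ_FNA" \
    --output-path "$SEQ_QZA"

  # Reference taxonomy: headerless two-column TSV -> FeatureData[Taxonomy]
  qiime tools import \
    --type 'FeatureData[Taxonomy]' \
    --input-format HeaderlessTSVTaxonomyFormat \
    --input-path "$TAX_TXT" \
    --output-path "$TAX_QZA"
else
  echo "Skipping import: qiime not found or input files missing"
fi
```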
I then extracted the reference reads using the following code:
qiime feature-classifier extract-reads \
  --i-sequences /home/qiime2/Desktop/16SClassifierTraining_Ben/Silva132_99_16S.qza \
  --p-f-primer GTGYCAGCMGCCGCGGTAA \
  --p-r-primer GGACTACNVGGGTWTCTAAT \
  --p-trunc-len 120 \
  --o-reads /home/qiime2/Desktop/16SClassifierTraining_Ben/ref-seqs_Truncated_081221.qza
I then trained the classifier using the following code, and included the optional --p-classify--chunk-size parameter to reduce the size of the resulting classifier:
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads /home/qiime2/Desktop/16SClassifierTraining_Ben/ref-seqs_Truncated_081221.qza \
  --i-reference-taxonomy /home/qiime2/Desktop/16SClassifierTraining_Ben/Silva132_16S_99_RefTaxonomy1.qza \
  --p-classify--chunk-size 1000 \
  --verbose \
  --o-classifier /home/qiime2/Desktop/16SClassifierTraining_Ben/classifier1.qza
The code runs and I get my classifier, but I also get the following user warning:
"The TaxonomicClassifier artifact that results from this method was trained using scikit-learn version 0.23.1. It cannot be used with other versions of scikit-learn. (While the classifier may complete successfully, the results will be unreliable.)"
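In case it helps with diagnosing this, the scikit-learn version in the active environment can be checked with something like the sketch below. It assumes the QIIME 2 environment is already activated, so that `python` is the interpreter QIIME 2 uses; the variable name is just for illustration:

```shell
# Assumes the QIIME 2 environment is activated, so `python` points at the
# interpreter QIIME 2 itself runs under. Falls back to "unknown" otherwise.
SKLEARN_VERSION=$(python -c "import sklearn; print(sklearn.__version__)" 2>/dev/null || echo "unknown")
echo "scikit-learn version: ${SKLEARN_VERSION}"

# `qiime info` lists the QIIME 2 release and installed plugin versions.
if command -v qiime >/dev/null 2>&1; then
  qiime info
else
  echo "qiime not found on PATH"
fi
```

The version printed here can be compared with the 0.23.1 named in the warning.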
Has anyone seen this warning before? Is this a version issue with QIIME 2, SILVA, and/or VirtualBox? Most importantly, how big a problem is this, and if it is a serious one, how do I correct it?
Any help would be greatly appreciated!!
Ben