Taxonomic Classifier Plugin Error

Hello,

I am trying to train a taxonomic classifier on the Greengenes 13_8 reference data with the following commands, but I keep getting a plugin error message:

qiime feature-classifier extract-reads \
  --i-sequences 97_otus.qza \
  --p-f-primer CCTACGGGAGGCAGCAG \
  --p-r-primer ATTACCGCGGCTGCTGG \
  --o-reads ref-seqs.qza

(which runs fine)

qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs.qza \
  --i-reference-taxonomy ref-taxonomy.qza \
  --o-classifier classifier.qza

but then I get this message:

Plugin error from feature-classifier:

Debug info has been saved to /tmp/qiime2-q2cli-err-19qp8cml.log

I ran the command again with --verbose and got this message:

/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/classifier.py:98: UserWarning: The TaxonomicClassifier artifact that results from this method was trained using scikit-learn version 0.19.1. It cannot be used with other versions of scikit-learn. (While the classifier may complete successfully, the results will be unreliable.)
  warnings.warn(warning, UserWarning)
Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2cli/commands.py", line 218, in __call__
    results = action(**arguments)
  File "", line 2, in fit_classifier_naive_bayes
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 220, in bound_callable
    output_types, provenance)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 355, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/classifier.py", line 276, in generic_fitter
    pipeline)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/_skl.py", line 32, in fit_pipeline
    pipeline.fit(X, y)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/sklearn/pipeline.py", line 250, in fit
    self._final_estimator.fit(Xt, y, **fit_params)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/custom.py", line 41, in fit
    classes=classes)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/sklearn/naive_bayes.py", line 555, in partial_fit
    self._update_feature_log_prob(alpha)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/sklearn/naive_bayes.py", line 718, in _update_feature_log_prob
    np.log(smoothed_cc.reshape(-1, 1)))
MemoryError

but I am not sure how to interpret it.

I also had two more questions:

i) Why do I need to select a truncation parameter when extracting the reference reads? My sequences are of different lengths because they were generated on an Ion PGM; wouldn't truncating the fragments mean leaving out bases that could be used for taxonomic classification? Since I wasn't sure what value to choose, I left it at the default.

ii) What is the difference between the Naive Bayes classifier and the scikit-learn classifier? There also seems to be a warning message about the classifier I chose. I only chose the Naive Bayes classifier because it was the one presented in the tutorials.

Many thanks!

Hi @Alex_14262, I cannot answer your additional questions, but I can tell you that I had this same problem at one point. The last line of your error message, MemoryError, tells you what went wrong: the process is most likely running out of RAM. If you are using a VM, you can go into the machine's settings and increase the amount of RAM available to it. If this is a native install, you may have to use a machine with more RAM. You may also want to check your hard drive space, just to be safe.
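
For a rough sense of scale, here is a minimal Python sketch (not taken from q2-feature-classifier) that estimates the RAM taken by one dense float64 array of shape classes x features. As far as I can tell, that is the kind of array scikit-learn's naive Bayes code is building at the point where the traceback ends. The feature count of 8192 and the class counts are assumptions for illustration only; the real number of classes depends on how many unique labels your reference taxonomy contains.

```python
def dense_matrix_bytes(n_classes, n_features, itemsize=8):
    """Bytes needed for one dense n_classes x n_features array (float64 by default)."""
    return n_classes * n_features * itemsize


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 1024 ** 3


# Assumed values for illustration only; not measured from Greengenes 13_8.
N_FEATURES = 8192  # assumed size of the hashed k-mer feature space

for n_classes in (1000, 5000, 20000):
    gib = to_gib(dense_matrix_bytes(n_classes, N_FEATURES))
    print("{} classes x {} features: {:.2f} GiB per array".format(
        n_classes, N_FEATURES, gib))
```

The fitting step probably holds more than one array of this size at the same time, so peak usage is likely a multiple of the figure printed for a single array.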

Hi @Alex_14262,
@Zach_Burcham gave a great explanation of the memory error (thanks, @Zach_Burcham!), and I can answer your other questions.

Truncation is really only useful when reads are all of the same length. You did the right thing: extract using your primers but do not set a truncation length (or set it to the highest expected length).
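
To illustrate the point, here is a toy Python sketch of what truncation does to reads of mixed length. It is not how extract-reads is implemented; the function name truncate_reads, the example reads, and the convention that 0 means "no truncation" are all assumptions made for this example.

```python
def truncate_reads(reads, trunc_len=0):
    """Cut every read to at most trunc_len bases.

    A trunc_len of 0 (or less) leaves the reads untouched; this convention
    is an assumption made for the sketch.
    """
    if trunc_len <= 0:
        return list(reads)
    return [read[:trunc_len] for read in reads]


# Made-up reads of lengths 10, 6 and 14.
reads = ["ACGTACGTAC", "ACGTAC", "ACGTACGTACGTAC"]

truncated = truncate_reads(reads, trunc_len=8)
lost = sum(len(r) for r in reads) - sum(len(r) for r in truncated)

print(truncated)
print("bases discarded:", lost)
```

Reads shorter than the cutoff are unchanged, while longer reads lose their tails, which is exactly the information a variable-length data set would want to keep.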

fit-classifier-naive-bayes is essentially a pre-configured version of fit-classifier-sklearn that is specific to the multinomial Naive Bayes method. fit-classifier-sklearn can be used to train classifiers that use other learning methods and is a bit complicated to use. I'd recommend sticking with fit-classifier-naive-bayes as other learning methods are untested.

I hope that helps!

Hi,

Thanks for the help @Zach_Burcham and @Nicholas_Bokulich! So should I ignore the warning about the scikit-learn version saying the results may be unreliable?

I ran the command on a server and it was fine, so the memory issue was solved!

Thanks

You did not mention this message above. Could you please share the full message? Please also let me know which versions of QIIME 2 and scikit-learn you have installed. Are you downloading the latest version of the pre-trained classifiers from the QIIME 2 website?

The easiest way to bypass this warning is to train your own classifier or install the correct version of scikit-learn.

The warning is the first part of the error message I posted above, which I got when I was trying to train the naive-bayes classifier.

I am using QIIME 2 2017.11 in a VirtualBox VM. I trained the naive-bayes classifier using the 13_8 Greengenes reference dataset. I also ran: conda install --override-channels -c defaults scikit-learn=0.19.1, as suggested in the warning message on the data resources page, because I thought this would solve the issue, but I still got the warning message when I ran the command in verbose mode.

Many thanks

Use conda list to confirm that you have the correct version of scikit-learn installed.

If you do, just ignore the warning message.

If you do not, and attempting to up/downgrade to the correct version is not working, you could just re-train your classifier using the same commands that you used previously.
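
The same check can also be done from inside Python. This is only a sketch: the helper names installed_sklearn_version and versions_match are made up for this example, and the expected version is the one named in the warning quoted earlier in this thread.

```python
def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if it is not installed."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


def versions_match(installed, expected):
    """True only when a version was found and it is identical to the expected one."""
    return installed is not None and installed == expected


EXPECTED = "0.19.1"  # version named in the warning quoted in this thread

found = installed_sklearn_version()
if versions_match(found, EXPECTED):
    print("scikit-learn {} matches; the warning can be ignored".format(found))
else:
    print("scikit-learn is {} but the classifier expects {}".format(found, EXPECTED))
```

Run it inside the same conda environment that you use for QIIME 2, otherwise it reports the version from a different environment.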

Good luck!

Hi,

Thanks for the suggestion. I checked, and it turns out I have scikit-learn version 0.19.1, which is the right one!
