Taxonomic Classifier Plugin Error

Alex_14262 · February 14, 2018, 3:19pm

Hello,

I am trying to train a taxonomic classifier using the Greengenes 13_8 reference data using the following commands, but keep getting a plugin error message:

qiime feature-classifier extract-reads \
  --i-sequences 97_otus.qza \
  --p-f-primer CCTACGGGAGGCAGCAG \
  --p-r-primer ATTACCGCGGCTGCTGG \
  --o-reads ref-seqs.qza

(which runs fine)

qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs.qza \
  --i-reference-taxonomy ref-taxonomy.qza \
  --o-classifier classifier.qza

but then I get this message:

Plugin error from feature-classifier:

Debug info has been saved to /tmp/qiime2-q2cli-err-19qp8cml.log

I ran the command again with --verbose and got this message:

/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/classifier.py:98: UserWarning: The TaxonomicClassifier artifact that results from this method was trained using scikit-learn version 0.19.1. It cannot be used with other versions of scikit-learn. (While the classifier may complete successfully, the results will be unreliable.)
  warnings.warn(warning, UserWarning)
Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2cli/commands.py", line 218, in __call__
    results = action(**arguments)
  File "", line 2, in fit_classifier_naive_bayes
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 220, in bound_callable
    output_types, provenance)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 355, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/classifier.py", line 276, in generic_fitter
    pipeline)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/_skl.py", line 32, in fit_pipeline
    pipeline.fit(X, y)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/sklearn/pipeline.py", line 250, in fit
    self._final_estimator.fit(Xt, y, **fit_params)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/q2_feature_classifier/custom.py", line 41, in fit
    classes=classes)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/sklearn/naive_bayes.py", line 555, in partial_fit
    self._update_feature_log_prob(alpha)
  File "/home/qiime2/miniconda/envs/qiime2-2017.11/lib/python3.5/site-packages/sklearn/naive_bayes.py", line 718, in _update_feature_log_prob
    np.log(smoothed_cc.reshape(-1, 1)))
MemoryError

but I am not sure how to interpret it.

I also had two more questions:

i) Why do I need to select a truncation parameter when extracting the reference reads? My sequences are of different lengths because they were sequenced on an Ion PGM; wouldn't truncating the fragments mean leaving out bases that could be used for taxonomic classification? Since I wasn't sure what value to choose, I left it at the default value.

ii) What is the difference between using the Naive Bayes classifier and the scikit-learn classifier? There also seems to be a warning message regarding the classifier I chose. I only chose the Naive Bayes classifier because it was presented in the tutorials.

Many thanks

Zach_Burcham · February 15, 2018, 3:40pm

Hi @Alex_14262, I cannot answer your additional questions, but I can tell you that I had this same problem at one point. The last line of your error message, MemoryError, tells you that the process ran out of memory, most likely RAM. If you are using a VM, you can go into the machine's settings and increase the amount of RAM available to it. If this is a native install, you may have to use a machine with more RAM. You may also want to check your hard drive space just to be safe.
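
For a rough sense of scale: the traceback ends inside scikit-learn's naive_bayes.py, in an np.log(...) call, and multinomial Naive Bayes works on dense arrays with one row per class and one column per feature. The sketch below is only an illustration using the Python standard library. It estimates the size of one such array and reports total RAM and free disk space. The class and feature counts are made-up placeholders, not values measured from this dataset, and the helper names are hypothetical.

```python
import os
import shutil


def dense_matrix_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed for one dense n_rows x n_cols array (8 bytes = float64)."""
    return n_rows * n_cols * itemsize


def total_ram_bytes():
    """Total physical RAM in bytes, or None if the platform does not expose it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on some platforms (for example Windows).
        return None


def free_disk_bytes(path="."):
    """Free space in bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    n_classes = 20000   # hypothetical number of distinct taxonomy labels
    n_features = 8192   # hypothetical number of k-mer features

    one_array = dense_matrix_bytes(n_classes, n_features)
    print("one dense float64 array: %.2f GiB" % (one_array / 2.0 ** 30))

    ram = total_ram_bytes()
    if ram is None:
        print("total RAM: not available on this platform")
    else:
        print("total RAM: %.2f GiB" % (ram / 2.0 ** 30))

    print("free disk here: %.2f GiB" % (free_disk_bytes(".") / 2.0 ** 30))
```

Several arrays of this shape can exist at the same time during fitting (counts, temporaries, and the result), so peak usage may be a multiple of the single-array figure.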

Nicholas_Bokulich · February 16, 2018, 3:16pm

Hi @Alex_14262,
@Zach_Burcham gave a great explanation of the memory error (thanks, @Zach_Burcham!), and I can answer your other questions.

Truncation is really only useful when reads are all of the same length. You did the right thing: extract using your primers but do not set a trim length (or set it to the highest expected length).
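
To make the effect of truncation concrete, here is a toy Python sketch of primer-based extraction with an optional truncation step. It is not the plugin's code: it uses exact string matching, excludes the primers from the result, and treats trunc_len=0 as "no truncation", all of which are assumptions made for this illustration. The primers are the ones from the command above; the reference sequence is invented.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def extract_amplicon(reference, f_primer, r_primer, trunc_len=0):
    """Return the region between the two primers, or None if either is missing.

    Toy behaviour (assumptions): exact matches only, primers excluded,
    trunc_len=0 means the amplicon is returned at full length.
    """
    f_pos = reference.find(f_primer)
    if f_pos == -1:
        return None
    start = f_pos + len(f_primer)

    # The reverse primer binds the opposite strand, so search for its
    # reverse complement downstream of the forward primer.
    end = reference.find(reverse_complement(r_primer), start)
    if end == -1:
        return None

    amplicon = reference[start:end]
    if trunc_len > 0:
        amplicon = amplicon[:trunc_len]  # bases past trunc_len are discarded
    return amplicon


if __name__ == "__main__":
    f_primer = "CCTACGGGAGGCAGCAG"
    r_primer = "ATTACCGCGGCTGCTGG"
    # Invented reference: flank + forward primer + insert + reverse site + flank.
    reference = ("AAAA" + f_primer + "ACGTACGTAC"
                 + reverse_complement(r_primer) + "TTTT")

    print(extract_amplicon(reference, f_primer, r_primer))
    print(extract_amplicon(reference, f_primer, r_primer, trunc_len=4))
```

With truncation enabled, every base beyond the chosen length is dropped, which is why it makes little sense for reads of variable length.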

fit-classifier-naive-bayes is essentially a pre-configured version of fit-classifier-sklearn that is specific to the multinomial Naive Bayes method. fit-classifier-sklearn can be used to train classifiers that use other learning methods and is a bit complicated to use. I'd recommend sticking with fit-classifier-naive-bayes as other learning methods are untested.
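
For readers unfamiliar with the method, the following is a toy, pure-Python sketch of multinomial Naive Bayes over k-mer counts. It is only meant to show the idea (count k-mers per class, then score a query with smoothed log probabilities). It is not how the plugin or scikit-learn implements it, and the class name, k, alpha, sequences, and labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class ToyKmerNaiveBayes:
    """Minimal multinomial Naive Bayes on k-mer counts (illustration only)."""

    def __init__(self, k=3, alpha=1.0):
        self.k = k
        self.alpha = alpha                       # additive smoothing
        self.class_counts = Counter()            # training sequences per label
        self.kmer_counts = defaultdict(Counter)  # label -> k-mer -> count
        self.vocab = set()                       # every k-mer seen in training

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            counts = Counter(kmers(seq, self.k))
            self.class_counts[label] += 1
            self.kmer_counts[label].update(counts)
            self.vocab.update(counts)
        return self

    def log_scores(self, seq):
        """Unnormalised log posterior for each label."""
        n_total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            denom = (sum(self.kmer_counts[label].values())
                     + self.alpha * len(self.vocab))
            score = math.log(n / n_total)        # log prior
            for kmer in kmers(seq, self.k):
                if kmer in self.vocab:           # ignore unseen k-mers
                    count = self.kmer_counts[label][kmer]
                    score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


if __name__ == "__main__":
    model = ToyKmerNaiveBayes(k=3).fit(
        ["AAAAAAAAAA", "CCCCCCCCCC"],   # invented reference reads
        ["taxon_A", "taxon_C"],         # invented taxonomy labels
    )
    print(model.predict("AAAAAC"))
```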

I hope that helps!

Alex_14262 · February 17, 2018, 5:24pm

Hi,

Thanks for the help @Zach_Burcham and @Nicholas_Bokulich! So should I ignore the warning about the scikit-learn version saying the results may be unreliable?

I ran the command on a server and it was fine, so the memory issue was solved!

Thanks

Nicholas_Bokulich · February 17, 2018, 6:40pm

You did not mention this message above. Could you please share the full error message? Please also let me know which versions of QIIME 2 and scikit-learn you have installed. Are you downloading the latest version of the pre-trained classifiers from the QIIME 2 website?

The easiest way to bypass this warning is to train your own classifier or install the correct version of scikit-learn.

Alex_14262 · February 17, 2018, 7:32pm

The warning is the first part of the error message I posted above, which I got when I was trying to train the naive-bayes classifier.

I am using QIIME 2 2017.11 in VirtualBox. I trained the naive-bayes classifier using the 13_8 Greengenes reference dataset. I also ran conda install --override-channels -c defaults scikit-learn=0.19.1, as suggested on the data resources page in the warning message, because I thought this would solve the issue. However, I still got the warning message when I ran the command in verbose mode.

Many thanks

Nicholas_Bokulich · February 17, 2018, 8:55pm

Use conda list to confirm that you have the correct version of scikit-learn installed.

If you do, just ignore the warning message.

If you do not, and attempting to up/downgrade to the correct version is not working, you could just re-train your classifier using the same commands that you used previously.
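
The same check can also be done from Python inside the activated QIIME 2 environment. This is a minimal sketch: the helper names are hypothetical, and the expected version is simply the one named in the warning earlier in this thread.

```python
EXPECTED_SKLEARN = "0.19.1"  # version named in the warning in this thread


def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if absent."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


def versions_match(installed, expected):
    """True only when scikit-learn is installed and the versions are identical."""
    return installed is not None and installed.strip() == expected.strip()


if __name__ == "__main__":
    found = installed_sklearn_version()
    if versions_match(found, EXPECTED_SKLEARN):
        print("scikit-learn %s matches the expected version" % found)
    else:
        print("scikit-learn mismatch: found %s, expected %s"
              % (found, EXPECTED_SKLEARN))
```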

Good luck!

Alex_14262 · February 20, 2018, 2:05pm

Hi

Thanks for the suggestion. I checked, and it turns out I have scikit-learn version 0.19.1, which is the right one!
