Error on feature-classifier classify-sklearn using "--p-n-jobs" option

Hi,

I have a problem: classify-sklearn is killed by joblib or pickle when I run it with the "--p-n-jobs" option.
Running classify-sklearn without the option completed successfully, without any errors.
The machine has an 8- I checked the memory and the disk where the data is saved, and both have enough free space. I also upgraded the environment to Python 3.7, but that didn't solve the problem.
Could you give me any advice or suggestions for solving it?

The error message is below:

`joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/externals/loky/backend/queues.py", line 150, in _feed
    obj_ = dumps(obj, reducers=reducers)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/externals/loky/backend/reduction.py", line 247, in dumps
    dump(obj, buf, reducers=reducers, protocol=protocol)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/externals/loky/backend/reduction.py", line 240, in dump
    _LokyPickler(file, reducers=reducers, protocol=protocol).dump(obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/externals/cloudpickle/cloudpickle.py", line 477, in dump
    return Pickler.dump(self, obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 634, in save_reduce
    save(state)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 847, in _batch_setitems
    save(v)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 634, in save_reduce
    save(state)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 852, in _batch_setitems
    save(v)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 610, in save_reduce
    save(args)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 751, in save_tuple
    save(element)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 781, in save_list
    self._batch_appends(obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 808, in _batch_appends
    save(tmp[0])
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 736, in save_tuple
    save(element)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 751, in save_tuple
    save(element)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 634, in save_reduce
    save(state)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 847, in _batch_setitems
    save(v)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 781, in save_list
    self._batch_appends(obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 805, in _batch_appends
    save(x)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 781, in save_list
    self._batch_appends(obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 805, in _batch_appends
    save(x)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 634, in save_reduce
    save(state)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 847, in _batch_setitems
    save(v)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 482, in save
    rv = reduce(obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/_memmapping_reducer.py", line 442, in __call__
    for dumped_filename in dump(a, filename):
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 480, in dump
    NumpyPickler(f, protocol=protocol).dump(value)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 279, in save
    wrapper.write_array(obj, self)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 103, in write_array
    pickler.file_handle.write(chunk.tostring('C'))
OSError: [Errno 28] No space left on device
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
    results = action(**arguments)
  File "", line 2, in classify_sklearn
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
    output_types, provenance)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 390, in callable_executor
    output_views = self._callable(**view_args)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/q2_feature_classifier/classifier.py", line 220, in classify_sklearn
    seq_ids, taxonomy, confidence = list(zip(*predictions))
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 46, in predict
    for calculated in workers(jobs):
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/parallel.py", line 1042, in __call__
    self.retrieve()
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/parallel.py", line 921, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 540, in wrap_future_result
    return future.result(timeout=timeout)
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/datadrive/anaconda3/envs/qiime2-2020.2/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
_pickle.PicklingError: Could not pickle the task to send it to the workers.`

Hi!
It looks like you have just enough disk space to run it with the default settings, without the "--p-n-jobs" option. The "classify-sklearn" command already consumes a lot of RAM and disk space on its own, and the "--p-n-jobs" option makes it even greedier, because it splits your dataset into parts to process them in parallel.

Thank you for replying to my question.
What kind of hard drive do you recommend for running "classify-sklearn" in parallel? I'm using AWS with a 2 TB external hard drive that has 1 TB of free space, and I set $TMPDIR to that drive when I run "classify-sklearn". But the above error still came up.
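
One thing that may be worth checking here: the traceback ends in joblib's memmapping code, and the joblib documentation says the folder for memmapped arrays is chosen in this order: the JOBLIB_TEMP_FOLDER environment variable, then /dev/shm if it exists and is writable, and only then the system temporary folder (the one $TMPDIR controls). Whether that is what happens in this particular run is an assumption, not something the error message confirms. A minimal sketch for seeing how much free space each candidate folder has (the function names are made up for this example):

```python
import os
import shutil
import tempfile


def candidate_temp_dirs():
    """Folders joblib is documented to try, in order, for memmapped arrays."""
    candidates = []
    joblib_tmp = os.environ.get("JOBLIB_TEMP_FOLDER")
    if joblib_tmp:
        candidates.append(joblib_tmp)
    # RAM-backed filesystem on most Linux systems; often much smaller than a disk.
    candidates.append("/dev/shm")
    # System temporary folder; honours TMPDIR, TEMP and TMP.
    candidates.append(tempfile.gettempdir())
    return candidates


def free_space_report(paths):
    """Return (path, free GiB) pairs; free GiB is None if the folder is missing."""
    report = []
    for path in paths:
        if os.path.isdir(path):
            free_gib = shutil.disk_usage(path).free / 1024 ** 3
            report.append((path, round(free_gib, 1)))
        else:
            report.append((path, None))
    return report


if __name__ == "__main__":
    for path, free_gib in free_space_report(candidate_temp_dirs()):
        status = "not found" if free_gib is None else f"{free_gib} GiB free"
        print(f"{path}: {status}")
```

If /dev/shm turns out to be small compared with the 1 TB drive, that could explain an "[Errno 28] No space left on device" even though the drive itself has plenty of room.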

1 TB should be enough, I guess. Either I was wrong, or you could try reducing the number of jobs.

OK. I’ll try it. Thanks.

I succeeded in running "classify-sklearn" with "--p-n-jobs" set to 3. Setting the option to 4 caused the same problem as above. Thank you, timanix.
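
For anyone who wants to script the run that worked, here is a rough sketch that builds the command with a chosen number of jobs and prepares an environment with JOBLIB_TEMP_FOLDER pointing at a large drive. The artifact file names and the /datadrive/tmp folder are placeholders, not values from this thread, and setting JOBLIB_TEMP_FOLDER is a suggestion based on joblib's documented behaviour rather than a fix confirmed above.

```python
import os
import shlex


def build_classify_command(classifier, reads, output, n_jobs):
    """Build the classify-sklearn command line as a list of arguments."""
    return [
        "qiime", "feature-classifier", "classify-sklearn",
        "--i-classifier", classifier,
        "--i-reads", reads,
        "--o-classification", output,
        "--p-n-jobs", str(n_jobs),
    ]


def build_environment(temp_folder):
    """Copy the current environment and point joblib at temp_folder."""
    env = dict(os.environ)
    env["JOBLIB_TEMP_FOLDER"] = temp_folder
    return env


# Placeholder file names; replace them with your own artifacts.
cmd = build_classify_command(
    "classifier.qza", "rep-seqs.qza", "taxonomy.qza", n_jobs=3
)
print(shlex.join(cmd))

# To actually run it (requires an activated QIIME 2 environment):
#   import subprocess
#   subprocess.run(cmd, env=build_environment("/datadrive/tmp"), check=True)
```

Starting from a small number of jobs and increasing it only while the run still succeeds matches what worked in this thread (3 jobs succeeded, 4 failed).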
