classify-sklearn subprocess.CalledProcessError returned non-zero exit status 1.

@wasade So I was able to download the 18S sequences I needed from QIITA, but I get this error when I try to classify the .qza file for the ASVs. I only get this error for 18S (and I've tried multiple 18S contexts), but not 16S (which works fine) so I don't know what's going on. Below are the commands I'm using and the error.

redbiom fetch samples \
--from /qiime2/freshwater_ph_sample_ids \
--context Deblur_2021.09-Illumina-18S-V9-90nt-844b41 \
--output /qiime2/$gene"_freshwater_ph_samples.biom"

qiime tools import \
--type 'FeatureTable[Frequency]' \
--input-path /qiime2/$gene"_freshwater_ph_samples.biom" \
--output-path /qiime2/$gene"_freshwater_ph_samples.qza"

qiime clawback sequence-variants-from-samples \
--i-samples /qiime2/$gene"_freshwater_ph_samples.qza" \
--o-sequences /qiime2/$gene"_freshwater_ph_ASVs.qza"

qiime feature-classifier classify-sklearn \
--i-classifier /qiime2/$gene/$gene"_NB_SILVA_classifier.qza" \
--i-reads /qiime2/$gene"_freshwater_ph_ASVs.qza" \
--o-classification /qiime2/$gene/$gene"_freshwater_ph_ASVs_classification.qza"

Plugin error from feature-classifier:

Command '['grep', '-c', '^>', '/tmp/qiime2-archive-uzyptvrx/f6df7d7d-4b9d-45bf-bc9c-31c5646513c2/data/dna-sequences.fasta']' returned non-zero exit status 1.

Debug info has been saved to /tmp/qiime2-q2cli-err-78giiuxc.log

less /tmp/qiime2-q2cli-err-78giiuxc.log

Traceback (most recent call last):
File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2cli/", line 329, in call
results = action(**arguments)
File "", line 2, in classify_sklearn
File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/", line 245, in bound_callable
output_types, provenance)
File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/", line 390, in callable_executor
output_views = self._callable(**view_args)
File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_feature_classifier/", line 209, in classify_sklearn
reads_per_batch = _autotune_reads_per_batch(reads, n_jobs)
File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/site-packages/q2_feature_classifier/", line 192, in _autotune_reads_per_batch
File "/opt/conda/envs/qiime2-2020.8/lib/python3.6/", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['grep', '-c', '^>', '/tmp/qiime2-archive-uzyptvrx/f6df7d7d-4b9d-45bf-bc9c-31c5646513c2/data/dna-sequences.fasta']' returned non-zero exit status 1.

Hi @chahoos,

I have moved your question to a new topic because it is a new question unrelated to the first.

The error here is a strange one. It is unrelated to the taxonomic weights and occurs when classify-sklearn tries to count the input sequences (to optimize parallelization), which suggests that something is wrong with the input sequences. We could try debugging if you send us your inputs. Could you first run qiime feature-table tabulate-seqs on the query sequence artifact (/qiime2/$gene"_freshwater_ph_ASVs.qza") and share the output?

The good news is that there is a simple workaround. You can either set --p-n-jobs 1 or use the --p-reads-per-batch parameter to specify the number of reads per batch yourself (check the number of reads in the tabulate-seqs output above to pick a value). Either option bypasses the step that is failing.


Ah oops, I figured it out. Turns out there was nothing inside the .qza file. I guess there are no 18S datasets in QIITA from freshwater and pH < 8. Everything is working fine now after I removed the pH filter.
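
For anyone who hits the same error, here is a minimal sketch of how one might check whether a sequence artifact is empty before classifying. It relies on a .qza being a zip archive and assumes the internal layout <uuid>/data/dna-sequences.fasta, taken from the path in the error message above. The function name count_sequences_in_qza is made up for this example and is not part of QIIME 2.

```python
import zipfile


def count_sequences_in_qza(qza_path):
    """Count FASTA header lines in a sequence artifact (hypothetical helper).

    Assumes the archive holds <uuid>/data/dna-sequences.fasta, as the
    path in the error message suggests.
    """
    with zipfile.ZipFile(qza_path) as archive:
        fasta_members = [
            name for name in archive.namelist()
            if name.endswith("data/dna-sequences.fasta")
        ]
        if not fasta_members:
            raise ValueError("no data/dna-sequences.fasta found in archive")
        with archive.open(fasta_members[0]) as handle:
            # Each FASTA record starts with a '>' header line.
            return sum(1 for line in handle if line.startswith(b">"))
```

A result of 0 means the artifact has no sequences, which is the situation described above: the upstream filter matched no samples.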


Thanks for checking and confirming!

I will open a feature request for classify-sklearn so that we can add a more intuitive error message when this issue is encountered.
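
As a rough sketch of what a clearer message could look like (not the actual q2-feature-classifier code, and the function names are hypothetical): grep -c exits with status 1 when it finds zero matching lines, so an empty FASTA file turns into a CalledProcessError when the call is made with check=True. Treating status 1 as "zero sequences" lets the caller raise a readable error instead.

```python
import subprocess


def count_fasta_records(fasta_path):
    """Count '>' header lines with grep, treating 'no matches' as zero."""
    result = subprocess.run(
        ["grep", "-c", "^>", str(fasta_path)],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # grep exits 0 when it found matches and 1 when it found none;
    # any other status signals a real problem (e.g. a missing file).
    if result.returncode not in (0, 1):
        raise RuntimeError("grep failed: " + result.stderr.strip())
    return int(result.stdout.strip() or 0)


def require_sequences(fasta_path):
    """Return the record count, or fail with a readable message if empty."""
    n = count_fasta_records(fasta_path)
    if n == 0:
        raise ValueError(
            f"No sequences found in {fasta_path}; check that the input "
            "artifact is not empty."
        )
    return n
```

This assumes grep is available on the PATH, as it was in the environment that produced the traceback.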
