Feature-classifier error in qiime2-2018.6

Hi All,
I am using qiime2-2018.6. When I got to the feature-classifier step, I used the following command:
qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads otus.qza --o-classification taxonomy.qza
and an error came out like this:
"Plugin error from feature-classifier:

'utf-8' codec can't decode byte 0xe5 in position 1048: invalid continuation byte

Debug info has been saved to /var/folders/cx/905c_xbs5dz3_hz31ry1vfq80000gn/T/qiime2-q2cli-err-wmj2qi0f.log"

What should I do?

Thanks
Juanli

Hey there @juanli! Sorry to hear things aren’t going well.

Can you send along the contents of this file:

/var/folders/cx/905c_xbs5dz3_hz31ry1vfq80000gn/T/qiime2-q2cli-err-wmj2qi0f.log

If that file doesn’t exist anymore can you rerun your command above, but add the --verbose flag to it? Then, copy-and-paste the full error message here.

Thanks!

Hi @thermokarst,

Thanks for your reply, here are the error details:
zoige_tax4fun yun$ qiime feature-classifier classify-sklearn --i-classifier silva_123_classifier.qza --i-reads zoige_otus.qza --o-classification 27ktaxonomy.qza
Plugin error from feature-classifier:

'utf-8' codec can't decode byte 0xe5 in position 1048: invalid continuation byte

Debug info has been saved to /var/folders/cx/905c_xbs5dz3_hz31ry1vfq80000gn/T/qiime2-q2cli-err-gvhyg5io.log
(qiime2-2018.6) yundeMacBook-Pro:zoige_tax4fun yun$ qiime feature-classifier classify-sklearn --i-classifier silva_123_classifier.qza --i-reads zoige_otus.qza --o-classification 27ktaxonomy.qza --verbose
Traceback (most recent call last):
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "", line 2, in classify_sklearn
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py", line 232, in bound_callable
    output_types, provenance)
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py", line 367, in callable_executor
    output_views = self._callable(**view_args)
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_feature_classifier/classifier.py", line 212, in classify_sklearn
    reads, classifier, read_orientation=read_orientation)
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_feature_classifier/classifier.py", line 168, in _autodetect_orientation
    first_n_reads = list(islice(reads, n))
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_types/feature_data/_transformer.py", line 228, in __iter__
    yield from self.generator
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/skbio/io/registry.py", line 506, in <genexpr>
    return (x for x in itertools.chain([next(gen)], gen))
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/skbio/io/registry.py", line 531, in _read_gen
    yield from reader(file, **kwargs)
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/skbio/io/registry.py", line 1008, in wrapped_reader
    yield from reader_function(fhs[-1], **kwargs)
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/skbio/io/format/fasta.py", line 675, in _fasta_to_generator
    FASTAFormatError):
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/skbio/io/format/fasta.py", line 853, in _parse_fasta_raw
    for line in _line_generator(fh, skip_blanks=False):
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/skbio/io/format/_base.py", line 192, in _line_generator
    for line in fh:
  File "/Users/yun/miniconda2/envs/qiime2-2018.6/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 1048: invalid continuation byte

Plugin error from feature-classifier:

'utf-8' codec can't decode byte 0xe5 in position 1048: invalid continuation byte

See above for debug info.

Best wishes
Juanli

Hey there @juanli!

Thanks for the error log - this is a fun one!

Would you be able to share a download link with me (can be in a direct message) so that I can take a look at these data up close?

Thanks!

Hi Matthew,

I’m sorry, could you please explain what you need a download link for?

Thanks
Juanli


Sorry, I wasn’t very clear! I would like to rerun the failing command, on my computer, with your data. I can’t think of any other remote debugging questions I can ask, so time to take a hands-on approach! Can you send the data necessary for me to rerun your command above? Thanks!


zoige_otus.qza (305.5 KB)

Hi Matthew,
I have uploaded my zoige_otus.qza file. The classifier is too large to upload. I also tried the silva-132-99-515-806-nb-classifier.qza from the QIIME 2 website, which gave the same error. Would you please try using that one?

Many thanks
Juanli

Hi @juanli!

This error indicates that you have illegal characters in your query sequences (this is not an error in QIIME2 or with the reference sequences).
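
To illustrate why bad characters surface as a codec error, here is a minimal Python sketch (not QIIME 2 code; the byte strings and the helper name first_undecodable_byte are made up for illustration). The sequence file is read as UTF-8 text, and a byte such as 0xe5 announces a multi-byte character, so when the next byte is plain ASCII the decoder gives up:

```python
def first_undecodable_byte(data):
    """Return (offset, byte value, reason) for the first byte UTF-8 rejects, or None.

    Hypothetical helper for illustration; it is not part of QIIME 2 or scikit-bio.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is a byte offset, the same kind of number as
        # "position 1048" in the error message.
        return err.start, data[err.start], err.reason
    return None


# Invented example records, not taken from the real file.
clean = b">Otu1\nCCTACGGGTGGCTGCAGT\n"
broken = b">Otu2\nCCTACG\xe5@GGTGGC\n"  # 0xe5 followed by an ASCII byte

print(first_undecodable_byte(clean))   # plain ASCII is always valid UTF-8
print(first_undecodable_byte(broken))  # reports the offset of the 0xe5 byte
```

The first call finds nothing to complain about, while the second reports the 0xe5 byte with the same "invalid continuation byte" reason that appears in the error.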

Yep, getting the same error with a different classifier is consistent with this being an error in the query.

Thanks, zoige_otus.qza is all that's needed; this is where the problem is occurring.

Where did zoige_otus.fa come from? Looks like it is corrupted or for some other reason contains illegal characters (and QIIME2 should be catching these errors on import).
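
For anyone who wants to look at the sequences themselves: a .qza artifact is a zip archive, so the raw FASTA can be pulled out as bytes without decoding it. A rough Python sketch, assuming the sequences sit in a member under data/ whose name ends in .fasta (that layout and the helper name read_fasta_bytes are assumptions on my part, so check the archive listing for your own file):

```python
import zipfile


def read_fasta_bytes(qza_path):
    """Return the raw bytes of the first data/*.fasta member in a .qza archive.

    Hypothetical helper: the member layout (<uuid>/data/<name>.fasta) is an
    assumption; inspect zipfile.ZipFile(qza_path).namelist() if nothing is found.
    """
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            if "/data/" in name and name.endswith(".fasta"):
                # Read as bytes so invalid characters cannot raise a decode error.
                return archive.read(name)
    raise FileNotFoundError("no data/*.fasta member found in " + str(qza_path))
```

Calling read_fasta_bytes("zoige_otus.qza").decode("ascii", errors="backslashreplace") would then show any non-ASCII bytes as \xNN escapes instead of failing.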

All looks clean until we get to Otu73 through Otu75:

>Otu73
CCTACGGGTGGCTGCAGTCGGGAATTTTGGGCAATGGGCGAAAGCCTGACCCAGCAACGCCGCGTGAAGGATGAAATCCC
TCGGGATGTAAACTTCGCAAGATTGFG<E5>@<C7><D5>CT\@G\AgCGGTTAATACACKBTAT'a<D4><F6>QCGOt
a<C7>RT<D2>T\FUA<C5>GCTC
CGGCVAIBTCCGVGKBAGCAGCCGCEG\@aT<C1>WGWDG<C3>GAGCAAGKGtTG<C4>VC^EGATTTACTGGGCGTAA
AGGGCGCGTA
GGCGGTCAGCAcA<C1>WTGAWVTGTGAAATCTCCGFG<C7>TTAACTCGGAAAGGTCAACTGATACTGTGCGACTAGAF
T<C3>^CAO`*EgG<C7>WCA^AST&OAATTCTCGGTGTAGCGGTGAAATGC<C7>TA^EAPADAGAGAGGAAC@C<C7>
TGCGGKGaEGGCGGGTTGCTGG
GCTGACACTGABG<C7>TGAGGCGCGAAAGCCAGGGGAGCGAACGGGATTAGATCCKBTAGT<C1>GT^A
~OtT74
TAGTGCGAGCCTACGGGTGGCTGCAGTCGGGAATTTTGCTCAATGGGGGAEASACTGAAGCAFC<C5>ACGCCGCGTGCGGGA
TGAAGGCCTTCGGGTTGTAAASC^GKTTTTACCAGG^GACfATAATGACGGTACCTGATGAAT<C1>AG^^CaGGG^CTA`CTACGD
^GKCQG^GIWACGCGGTAATACGTAGGTGACTAGCGTVG\BCGGATTTACTGGGCGTAAAGAGCGC<C7>CE^EWAG
<C7>TC^ETTCAA
GTCGAGTGTGAAAGCCCCCGGCTCAACTGGGGA^GGGuCATTBG<C5>TACTGATCGACTCGAAGGSA^GOAGAG^GGA`GAGO@A
TTCCCGGTGTAGTGGTG^AAAuGCGTAGAT<C1>TC^EGGAGGAACABG<C5>WVGGCGQA^GOCGGCTTCCTGCCSVGTTCTTGACGC^NTWCG^GCGbGACAOBTAGGGGaG<C3>QAACGGGATTAGATACCCTAGTAGTC<D4>AG^VGG

Yikes! Where did all that stuff come from? These sequences are riddled with invalid characters. Either they have been preprocessed in some way that they should not have been, or the file became corrupted somehow. You should get an uncorrupted version of the file, unless you can figure out what went wrong and correct it (I can't help you there!).
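
If it helps to pin down which records are damaged, something like the following Python sketch could be run on the raw FASTA bytes. It never decodes the sequence lines, so it cannot hit the decode error, and it reports each record that contains anything other than upper-case IUPAC nucleotide letters. The function name find_bad_records and the allowed alphabet are my assumptions; adjust the alphabet if your sequences legitimately contain other symbols.

```python
# Upper-case IUPAC DNA codes; assumed here to be the only legal sequence characters.
ALLOWED = set(b"ACGTRYSWKMBDHVN")


def find_bad_records(fasta_bytes):
    """Return a list of (header, sorted bad byte values) for damaged records.

    Hypothetical helper. A corrupted header line that no longer starts with ">"
    is counted as part of the previous record's sequence.
    """
    bad = []
    header, bad_bytes = None, set()
    for line in fasta_bytes.splitlines():
        if line.startswith(b">"):
            # Close out the previous record before starting a new one.
            if header is not None and bad_bytes:
                bad.append((header, sorted(bad_bytes)))
            header = line[1:].decode("ascii", errors="backslashreplace")
            bad_bytes = set()
        else:
            bad_bytes.update(b for b in line if b not in ALLOWED)
    if header is not None and bad_bytes:
        bad.append((header, sorted(bad_bytes)))
    return bad


# Invented sample data for illustration.
sample = b">Otu1\nACGT\n>Otu2\nAC\xe5@GT\n"
for name, values in find_bad_records(sample):
    print(name, [hex(v) for v in values])
```

This only tells you where the damage is; it cannot recover the original bases, so going back to an uncorrupted copy is still the real fix.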

Good luck!

