Plugin error: feature-classifier extract-reads

Hi there,
I was following the example at https://docs.qiime2.org/2017.12/tutorials/feature-classifier/ to extract reference reads, but I got the errors below. Any ideas? Thanks,
Gary

qiime feature-classifier extract-reads --i-sequences 85_otus.qza —-p-r-primer GTGCCAGCMGCCGCGGTAA —-p-f-primer GGACTACHVGGGTWTCTAAT --p-trunc-len 200 --o-reads ref-seqs_200.qza

Error: Detected invalid character in: —-p-r-primer, —-p-f-primer
Verify the correct quotes or dashes (ASCII) are being used.

Hello Gary,

Looks like something messed with your command. I see it has em-dashes in it (—) instead of normal dashes (-). Qiime doesn’t like em-dashes.

I wonder how these em-dashes got there… Sometimes programs like MS Word automatically insert em-dashes, which is annoying because it confuses programs like Qiime.
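
If you want to check a command before running it, a few lines of Python can flag any non-ASCII characters. This is only a sketch: `find_non_ascii` is a hypothetical helper name of my own, not part of QIIME, and the example string just reuses one option from your command with the em-dash written as an escape.

```python
import unicodedata


def find_non_ascii(command):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(command)
        if ord(ch) > 127
    ]


# One option from the failing command; "\u2014" is the em-dash that
# replaced the first hyphen. (Hypothetical example input.)
bad_option = "\u2014-p-r-primer GTGCCAGCMGCCGCGGTAA"
for position, char, name in find_non_ascii(bad_option):
    print(position, repr(char), name)
```

For this input the helper reports an EM DASH at position 0. A command typed with plain hyphens returns an empty list.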

Let me know if this helps,
Colin

I still get an error. I checked that the dashes are correct now. Any ideas?
Thanks,
Gary

$ qiime feature-classifier extract-reads --i-sequences 99_otus.qza --p-r-primer TCCTCCGCTTATTGATATGC --p-f-primer GCATCGATGAAGAACGCAGC --p-trunc-len 500 --o-reads 99_ref-seqs_500.qza

Plugin error from feature-classifier:

Invalid characters in sequence: [‘a’, ‘c’].
Valid characters: [‘M’, ‘C’, ‘G’, ‘R’, ‘.’, ‘V’, ‘A’, ‘Y’, ‘D’, ‘H’, ‘T’, ‘B’, ‘S’, ‘W’, ‘K’, ‘N’, ‘-’]
Note: Use lowercase if your sequence contains lowercase characters not in the sequence’s alphabet.

Debug info has been saved to /tmp/qiime2-q2cli-err-t7_oxus7.log

$ more /tmp/qiime2-q2cli-err-t7_oxus7.log
Traceback (most recent call last):
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2cli/commands.py”, line 224, in call
results = action(**arguments)
File “”, line 2, in extract_reads
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 228, in bound_callable
output_types, provenance)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 391, in callable_executor
spec.qiime_type, output_view, spec.view_type, prov)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/result.py”, line 239, in _from_view
result = transformation(view)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/core/transform.py”, line 59, in transformation
new_view = transformer(view)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/core/transform.py”, line 207, in wrapped
file_view = transformer(view)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_types/feature_data/_transformer.py”, line 271, in _10
skbio.io.write(iter(data), format=‘fasta’, into=str(ff))
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/registry.py”, line 1166, in write
return io_registry.write(obj, format, into, **kwargs)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/registry.py”, line 619, in write
writer(obj, into, **kwargs)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/registry.py”, line 1082, in wrapped_writer
writer_function(obj, fhs[-1], **kwargs)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/format/fasta.py”, line 774, in _generator_to_fasta
for header, seq_str, qual_scores in formatted_records:
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/format/_base.py”, line 146, in _format_fasta_like_records
for idx, seq in enumerate(generator):
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_types/feature_data/_transformer.py”, line 228, in iter
yield from self.generator
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_feature_classifier/_cutter.py”, line 104, in _gen_reads
for seq in sequences:
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_types/feature_data/_transformer.py”, line 228, in iter
yield from self.generator
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/registry.py”, line 506, in
return (x for x in itertools.chain([next(gen)], gen))
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/registry.py”, line 531, in _read_gen
yield from reader(file, **kwargs)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/registry.py”, line 1008, in wrapped_reader
yield from reader_function(fhs[-1], **kwargs)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/io/format/fasta.py”, line 677, in _fasta_to_generator
**kwargs)
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/sequence/_grammared_sequence.py”, line 338, in init
self._validate()
File “/opt/apps/Miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/skbio/sequence/_grammared_sequence.py”, line 362, in _validate
list(self.alphabet)))
ValueError: Invalid characters in sequence: [‘a’, ‘c’].
Valid characters: [‘M’, ‘C’, ‘G’, ‘R’, ‘.’, ‘V’, ‘A’, ‘Y’, ‘D’, ‘H’, ‘T’, ‘B’, ‘S’, ‘W’, ‘K’, ‘N’, ‘-’]
Note: Use lowercase if your sequence contains lowercase characters not in the sequence’s alphabet.

Hi Gary,
That is a different error. This time, the issue is that the sequences in your 99_otus.qza contain lowercase characters, which are not supported here. It looks like you are probably using the UNITE database; I have run into similar problems with UNITE previously.

Just convert all these characters to uppercase and re-import.
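
As a rough sketch of that conversion, assuming a plain FASTA file where header lines start with `>`, something like the following uppercases only the sequence lines and leaves the headers alone. `uppercase_fasta` and the example record are made up for illustration; they are not part of QIIME or UNITE.

```python
def uppercase_fasta(lines):
    """Yield FASTA lines with sequence lines uppercased and headers unchanged."""
    for line in lines:
        # Header lines start with ">" and may carry case-sensitive IDs,
        # so only the sequence lines are converted.
        yield line if line.startswith(">") else line.upper()


# Hypothetical two-line record standing in for a real reference file.
example = [">example_id_1\n", "gcatcGATGaagaacgcagc\n"]
print("".join(uppercase_fasta(example)), end="")
```

To convert a real file, you could pass an open file handle instead of the list and write the result to a new FASTA file with `writelines`, then re-import that file.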

I hope that helps!

Thanks, Nicholas, you rock!!!
Yes, after converting the FASTA sequences downloaded from UNITE to uppercase, the problem went away.
Gary

It looks like the ITS sequences I downloaded from UNITE have lots of problems. More invalid characters were found in the input FASTA file. I am wondering whether QIIME 2 has a cleaned-up version of the ITS reference sequences available for download. Thanks, Gary

Plugin error from feature-classifier:

Invalid character in sequence: b’\x0b’.
Valid characters: [‘W’, ‘K’, ‘V’, ‘A’, ‘Y’, ‘S’, ‘N’, ‘.’, ‘H’, ‘D’, ‘M’, ‘C’, ‘-’, ‘R’, ‘G’, ‘B’, ‘T’]
Note: Use lowercase if your sequence contains lowercase characters not in the sequence’s alphabet.

Yowza. I have never seen that one before. As I have successfully used UNITE previously (after converting lowercase -> uppercase), I suspect the problem is being introduced on your end in one of two ways:

  1. If you are converting the files by opening them in a text editor and using find/replace, the text editor might be inserting special characters (e.g., special characters for line breaks or tab characters). I recommend downloading the sequences again and converting with this command, which uppercases only the sequence lines, leaves the FASTA headers untouched, and writes a new FASTA file to re-import:
awk '/^>/ {print; next} {print toupper($0)}' 99_otus.fasta > 99_otus_uppercase.fasta
  2. Copying/pasting commands from HTML is another good way to insert special characters or otherwise mangle the input because of the way text is formatted on the screen. This could introduce unintended special characters into the primer sequences. Less likely, but just a thought.
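
To narrow down where stray characters like `\x0b` are coming from, it can help to scan the FASTA file yourself before importing. Here is a minimal sketch, assuming the valid alphabet is exactly the one listed in the error message; `find_invalid_characters` and the example records are hypothetical, not a QIIME or scikit-bio function.

```python
# Alphabet copied from the "Valid characters" list in the error message.
VALID = set("ACGTMRWSYKVHDBN.-")


def find_invalid_characters(lines):
    """Return (line number, character) pairs for sequence characters not in VALID."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.startswith(">"):
            continue  # header lines are free text, so skip them
        # Strip only the line ending; a bare rstrip() would also remove
        # whitespace-like control characters such as "\x0b" and hide them.
        for ch in line.rstrip("\r\n"):
            if ch not in VALID:
                problems.append((number, ch))
    return problems


# Hypothetical records: one with a vertical tab, one with lowercase bases.
example = [">seq1\n", "ACGT\x0bACGT\n", ">seq2\n", "acGT\n"]
for number, ch in find_invalid_characters(example):
    print(number, repr(ch))
```

Passing an open file handle instead of the list gives you the line numbers of every offending character, which makes it easier to tell whether the problem is in the downloaded file or was introduced during editing.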

Alas, no. We do not post our own versions of any reference databases on the QIIME website.

I hope one of those solutions helps! Good luck!

Thanks Nicholas.
Actually, the cause was the 12-11 (alpha release) file I downloaded from UNITE, which is linked directly from the QIIME 1 site: http://qiime.org/home_static/dataFiles.html

This alpha release has lots of invalid characters. After I downloaded the older version of the ITS training set, the problem went away.

Gary

Understood, thanks for clarifying. The alpha release is quite old, and I think I remember suffering through similar issues circa 2012!!!

Glad you found the right file in the end. Just for future reference, we do have a data resources page with updated links, so you can refer to this list from now on and leave qiime1 behind…