Hi,
I want to train a classifier on the PhytoREF database. I have my reference sequences file: phytoref_ref-seqs.qza (946.4 KB)
and my taxonomy file: phytoref_taxonomy.qza (176.5 KB)
However, when I run the command below:
qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads phytoref_ref-seqs.qza --i-reference-taxonomy phytoref_taxonomy.qza --o-classifier phytoref-classifier.qza
I get this error:
Plugin error from feature-classifier:
Invalid character in sequence: b'X'.
Valid characters: ['G', 'W', 'R', 'M', 'K', 'B', 'V', 'A', '-', '.', 'D', 'C', 'T', 'H', 'S', 'N', 'Y']
Note: Use lowercase if your sequence contains lowercase characters not in the sequence's alphabet.
Debug info has been saved to /tmp/qiime2-q2cli-err-y4eqdzmc.log
Here is the log file:
Traceback (most recent call last):
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2cli/commands.py", line 311, in __call__
results = action(**arguments)
File "</home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/decorator.py:decorator-gen-349>", line 2, in fit_classifier_naive_bayes
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
output_views = self._callable(**view_args)
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_feature_classifier/classifier.py", line 318, in generic_fitter
pipeline)
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 29, in fit_pipeline
seq_ids, X = _extract_reads(reads)
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 37, in _extract_reads
return zip(*[(r.metadata['id'], r._string) for r in reads])
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 37, in <listcomp>
return zip(*[(r.metadata['id'], r._string) for r in reads])
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_types/feature_data/_transformer.py", line 228, in __iter__
yield from self.generator
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/skbio/io/registry.py", line 506, in <genexpr>
return (x for x in itertools.chain([next(gen)], gen))
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/skbio/io/registry.py", line 531, in _read_gen
yield from reader(file, **kwargs)
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/skbio/io/registry.py", line 1008, in wrapped_reader
yield from reader_function(fhs[-1], **kwargs)
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/skbio/io/format/fasta.py", line 677, in _fasta_to_generator
**kwargs)
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/skbio/sequence/_grammared_sequence.py", line 326, in __init__
self._validate()
File "/home/ugg/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/skbio/sequence/_grammared_sequence.py", line 350, in _validate
list(self.alphabet)))
ValueError: Invalid character in sequence: b'X'.
Valid characters: ['G', 'W', 'R', 'M', 'K', 'B', 'V', 'A', '-', '.', 'D', 'C', 'T', 'H', 'S', 'N', 'Y']
Note: Use lowercase if your sequence contains lowercase characters not in the sequence's alphabet.
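The traceback shows that validation fails while the reference sequences are being read, so the offending character is in the sequence file itself. To narrow down which records are affected, a small check like the one below might help. It is only a sketch: it assumes the reference sequences are available as a plain FASTA file (for example after exporting the artifact), and the function names `parse_fasta` and `find_invalid_records` are made up for this illustration. The allowed alphabet is copied from the error message above, so lowercase letters are reported as invalid too.

```python
# Sketch: list FASTA records that contain characters outside the alphabet
# given in the error message. Function names here are hypothetical.
VALID = set("ACGTRYSWKMBDHVN.-")  # the 17 characters listed in the error


def parse_fasta(lines):
    """Yield (record_id, sequence) pairs from an iterable of FASTA lines."""
    record_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            record_id = header[0] if header else ""
            chunks = []
        elif record_id is not None:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def find_invalid_records(lines, valid=VALID):
    """Return {record_id: sorted list of invalid characters}."""
    bad = {}
    for record_id, seq in parse_fasta(lines):
        invalid = set(seq) - valid
        if invalid:
            bad[record_id] = sorted(invalid)
    return bad


# Small in-memory example; a real run would pass an open file handle instead.
sample = [">seq1", "ACGTN", ">seq2", "ACXGT"]
print(find_invalid_records(sample))  # {'seq2': ['X']}
```

On a real file, passing the open handle (`with open(path) as handle: find_invalid_records(handle)`) would return the IDs of the records containing `X` or any other unsupported character, which can then be inspected, corrected, or removed before importing again.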
I have run into the same problem described in this topic: Plugin Error: feature-classifier classify-sklearn
but the solution suggested there did not work for me; I still get the same error shown above.
Could you please help me understand what I am doing wrong, if anything?
Thanks.