I tried using feature-classifier classify-sklearn for my tree. I believe the issue is with my FASTA file, since the example file provided on the tutorial page works.
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path ~/MS_set6/reference-hit.seqs.fa \
  --output-path ~/MS_set6/refseqs.qza
qiime feature-classifier classify-sklearn \
  --i-classifier ~/MS_set6/gg-13-8-99-515-806-nb-classifier.qza \
  --i-reads ~/MS_set6/refseqs.qza \
  --o-classification ~/MS_set6/Taxonomy/MStaxonomy.qza
The error reads as follows. (I have reloaded the .fa file generated from Qiita and restarted my system several times, and I get the same error message.)
Plugin error from feature-classifier:
Invalid character in sequence: b'g'.
Valid characters: ['K', 'R', 'H', 'N', 'S', 'Y', 'V', 'G', 'B', 'A', '-', '.', 'C', 'D', 'T', 'M', 'W']
Note: Use lowercase if your sequence contains lowercase characters not in the sequence's alphabet.
The log file reads:
/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/sklearn/feature_extraction/hashing.py:94: DeprecationWarning: the option non_negative=True has been deprecated in 0.19 and will be removed in version 0.21.
" in version 0.21.", DeprecationWarning)
/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/sklearn/feature_extraction/hashing.py:94: DeprecationWarning: the option non_negative=True has been deprecated in 0.19 and will be removed in version 0.21.
" in version 0.21.", DeprecationWarning)
Traceback (most recent call last):
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/q2cli/commands.py", line 218, in __call__
results = action(**arguments)
File "", line 2, in classify_sklearn
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/qiime2/sdk/action.py", line 201, in callable_wrapper
output_types, provenance)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/qiime2/sdk/action.py", line 334, in callable_executor
output_views = callable(**view_args)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/q2_feature_classifier/classifier.py", line 184, in classify_sklearn
confidence=confidence)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/q2_feature_classifier/_skl.py", line 45, in predict
for chunk in _chunks(reads, chunk_size)) for m in c)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 779, in __call__
while self.dispatch_one_batch(iterator):
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 620, in dispatch_one_batch
tasks = BatchedCalls(itertools.islice(iterator, batch_size))
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 127, in __init__
self.items = list(iterator_slice)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/q2_feature_classifier/_skl.py", line 44, in <genexpr>
(delayed(_predict_chunk)(pipeline, separator, confidence, chunk)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/q2_feature_classifier/_skl.py", line 97, in _chunks
chunk = list(islice(reads, chunk_size))
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/q2_types/feature_data/_transformer.py", line 228, in __iter__
yield from self.generator
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/skbio/io/registry.py", line 506, in <genexpr>
return (x for x in itertools.chain([next(gen)], gen))
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/skbio/io/registry.py", line 531, in _read_gen
yield from reader(file, **kwargs)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/skbio/io/registry.py", line 1008, in wrapped_reader
yield from reader_function(fhs[-1], **kwargs)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/skbio/io/format/fasta.py", line 677, in _fasta_to_generator
**kwargs)
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/skbio/sequence/_grammared_sequence.py", line 338, in __init__
self._validate()
File "/anaconda3/envs/qiime2-2017.9/lib/python3.5/site-packages/skbio/sequence/_grammared_sequence.py", line 362, in _validate
list(self.alphabet)))
ValueError: Invalid character in sequence: b'g'.
Valid characters: ['K', 'R', 'H', 'N', 'S', 'Y', 'V', 'G', 'B', 'A', '-', '.', 'C', 'D', 'T', 'M', 'W']
Note: Use lowercase if your sequence contains lowercase characters not in the sequence's alphabet.
I have looked at the file in TextEdit and do not see a lowercase g when scanning it by eye. The Find function is not helpful, because its search is case-insensitive and it also matches every capital G in the sequences.
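For anyone hitting the same thing, here is a minimal sketch of how lowercase characters could be located from the command line rather than with a case-insensitive Find. The tiny sample file and the variable names (fasta, lower_lines, upper) are made up for illustration; the real input would be ~/MS_set6/reference-hit.seqs.fa. It assumes a plain, uncompressed FASTA file, and whether uppercasing the sequences is the right fix for this data is an assumption, not something confirmed here.

```shell
# Hypothetical stand-in for the real FASTA file (illustration only)
fasta="$(mktemp)"
printf '>seq1\nACGTACGT\n>seq2\nACgTAC\n' > "$fasta"

# Number every line, drop the '>' header lines, then keep only sequence
# lines that contain a lowercase letter. LC_ALL=C keeps the character
# class strictly case-sensitive regardless of locale.
lower_lines="$(grep -n -v '^>' "$fasta" | LC_ALL=C grep '[[:lower:]]')"
echo "$lower_lines"   # prints 4:ACgTAC

# Possible workaround (an assumption): write a copy with the sequence
# lines uppercased and the header lines left untouched.
upper="$(mktemp)"
awk '/^>/ {print; next} {print toupper($0)}' "$fasta" > "$upper"
cat "$upper"
```

The number before the colon is the line number in the original file, so the offending line can be jumped to directly in an editor.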
TIA