Hello!
I’m new to the forum, so I hope I’m following the rules. I was about to post a new topic about a similar error, but since this one was posted recently, I figured it would be easier to ask here.
I get the same error as @Asha1 when trying to use classify-consensus-vsearch with MaarjAM as the reference taxonomy (see the full command below). The error suggests that the IDs in my reference reads and reference taxonomy don’t match (see the error details below). However, after checking both files for mismatches, I found none. Moreover, my error message says ‘Identifier 517’, even though my sequences are all named with the full GenBank ID (e.g., AB749517). I also checked for stray spaces that could be interfering, but again found nothing.
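For concreteness, the kind of ID check described above can be sketched in Python as follows. This is only a minimal sketch under assumptions: the taxonomy file is taken to be a headerless, tab-separated file with the ID in the first column; the function names are hypothetical; and only the first whitespace-delimited token of each FASTA header is compared. It also flags purely numeric IDs such as 517, which would not be full accessions.

```python
import io


def fasta_ids(handle):
    """Return the first whitespace-delimited token of every FASTA header."""
    ids = []
    for line in handle:
        if line.startswith(">"):
            header = line[1:].strip()
            ids.append(header.split()[0] if header else "")
    return ids


def taxonomy_ids(handle):
    """Return the first tab-separated field of every non-empty line.

    Assumes a headerless two-column file: ID <tab> taxonomy string.
    """
    ids = []
    for line in handle:
        line = line.rstrip("\n")
        if line:
            ids.append(line.split("\t")[0])
    return ids


def compare_ids(fasta_handle, taxonomy_handle):
    """Report IDs that differ between the two files or look suspicious."""
    seq_ids = fasta_ids(fasta_handle)
    tax_ids = taxonomy_ids(taxonomy_handle)
    return {
        "only_in_fasta": sorted(set(seq_ids) - set(tax_ids)),
        "only_in_taxonomy": sorted(set(tax_ids) - set(seq_ids)),
        # IDs made only of digits are not full GenBank accessions.
        "numeric_ids": sorted({i for i in seq_ids + tax_ids if i.isdigit()}),
        # Taxonomy IDs with leading or trailing whitespace.
        "padded_taxonomy_ids": sorted({i for i in tax_ids if i != i.strip()}),
    }


# Tiny made-up example: the second header starts with a bare '517'.
fasta = io.StringIO(">AB749517\nACGT\n>517 extra text\nACGT\n")
taxonomy = io.StringIO("AB749517\tFungi;Glomeromycota\n")
report = compare_ids(fasta, taxonomy)
for name, values in report.items():
    print(name, values)
```

To run it on real files, the two `io.StringIO` objects would be replaced with `open(...)` handles for the FASTA and id-to-taxo files.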
Someone reported a similar problem here, but that user did find a mismatch. I also tried everything else suggested in that topic, but I still get the error.
One last detail: to prepare my MaarjAM files for QIIME 2, I followed the instructions of a generous GitHub soul. Interestingly, when I tested vsearch with his files (unfortunately out of date now), it worked perfectly. And since I followed his R code to prepare and check both my FASTA and id-to-taxo files, I suspect the error comes from one of the new sequences.
I’m wondering what I should check next if there are no evident mismatches. Could something else be causing this error?
Thanks in advance for your time,
mica
##########
This is the command I ran (QIIME 2 v. 2019.1):
qiime feature-classifier classify-consensus-vsearch \
  --i-query rep-seqs.qza \
  --i-reference-reads maarjam_fasta.qza \
  --i-reference-taxonomy maarjam_id-to-taxo.qza \
  --p-perc-identity 0.99 \
  --p-threads 15 \
  --verbose \
  --o-classification taxonomy.qza
This is the output and error message:
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: vsearch --usearch_global /tmp/qiime2-archive-lkyw234k/e38c5264-5fc6-45f7-b4e3-e97eaa0eb252/data/dna-sequences.fasta --id 0.99 --query_cov 0.8 --strand both --maxaccepts 10 --maxrejects 0 --output_no_hits --db /tmp/qiime2-archive-3_zrsm3r/cf5cb7a2-b54e-41be-aa02-612ec3103edf/data/dna-sequences.fasta --threads 15 --blast6out /tmp/tmpv2i488v1
vsearch v2.7.0_linux_x86_64, 127.6GB RAM, 24 cores
https://github.com/torognes/vsearch
Reading file /tmp/qiime2-archive-3_zrsm3r/cf5cb7a2-b54e-41be-aa02-612ec3103edf/data/dna-sequences.fasta 100%
19463763 nt in 37596 seqs, min 134, max 1823, avg 518
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Searching 100%
Matching query sequences: 1349 of 3020 (44.67%)
Traceback (most recent call last):
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3124, in get_value
    return libindex.get_value_box(s, key)
  File "pandas/_libs/index.pyx", line 55, in pandas._libs.index.get_value_box
  File "pandas/_libs/index.pyx", line 63, in pandas._libs.index.get_value_box
TypeError: 'str' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 104, in _import_blast_format_assignments
    t = ref_taxa[id_]
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3132, in get_value
    raise e1
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '517'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-352>", line 2, in classify_consensus_vsearch
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_vsearch.py", line 37, in classify_consensus_vsearch
    unassignable_label=unassignable_label)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 29, in _consensus_assignments
    output.name, ref_taxa, unassignable_label=unassignable_label)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 109, in _import_blast_format_assignments
    'taxonomy.').format(str(id_)))
KeyError: 'Identifier 517 was reported in taxonomic search results, but was not present in the reference taxonomy.'
Plugin error from feature-classifier:
  'Identifier 517 was reported in taxonomic search results, but was not present in the reference taxonomy.'
See above for debug info.
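Since the temporary files from the run above no longer exist, one way to trace where 517 comes from might be to re-run a similar vsearch search by hand on the exported reference reads, keeping the --blast6out file, and then list every target ID that is missing from the taxonomy. The Python below is a minimal sketch of that comparison, not the plugin's actual code. It assumes the standard tab-separated blast6 layout with the query in column 1 and the target in column 2, and it treats a target of `*` as "no hit". The function name and the example rows are hypothetical.

```python
def missing_targets(blast6_lines, known_ids):
    """Map each target ID absent from known_ids to the queries that hit it.

    blast6_lines: iterable of tab-separated blast6 lines
                  (column 1 = query, column 2 = target).
    known_ids:    set of IDs present in the reference taxonomy.
    """
    missing = {}
    for line in blast6_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        query, target = fields[0], fields[1]
        if target == "*":
            continue  # assumed to mean the query had no hit
        if target not in known_ids:
            missing.setdefault(target, []).append(query)
    return missing


# Made-up rows: q2 hits a target labelled only '517'.
rows = [
    "q1\tAB749517\t99.5\t250\t1\t0\t1\t250\t1\t250\t-1\t0",
    "q2\t517\t99.2\t250\t2\t0\t1\t250\t1\t250\t-1\t0",
    "q3\t*\t0.0\t0\t0\t0\t0\t0\t0\t0\t-1\t0",
]
result = missing_targets(rows, {"AB749517"})
print(result)
```

Any ID reported this way could then be searched for in the reference FASTA headers to find the sequence it belongs to.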