qiime feature-classifier classify-consensus-vsearch error

mica.tosi · November 4, 2019, 11:35pm

Hello!
I'm new to the forum, so I hope I'm following the rules. I was about to post a new topic about a similar error, but since this one was posted recently, I figured it would be easier to ask here.

I get the same error as @Asha1 when trying to use classify-consensus-vsearch with MaarjAM as the reference taxonomy (see the full command below). The error suggests that the IDs in my reference reads and reference taxonomy don't match (see the error details below). However, I checked both files for mismatches and found none. Moreover, the error message says 'Identifier 517', even though my sequences are all named with the full GenBank ID (e.g., AB749517). I also checked for stray spaces that could be interfering, but again found nothing.
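For readers who want to repeat this kind of check, here is a minimal Python sketch of one way to compare the two ID sets. It is not the check that was actually run for this post. It assumes the reference reads are available as a plain FASTA file and the taxonomy as a tab-separated file with the ID in the first column; the function names are hypothetical. FASTA IDs are cut at the first whitespace, as vsearch does by default, while taxonomy IDs are left unstripped so that a trailing space shows up as a mismatch.

```python
def fasta_ids(path):
    """Return the record IDs of a FASTA file (header text up to the first whitespace)."""
    ids = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                # An empty header is kept as "" so it is reported rather than skipped.
                ids.append(header.split()[0] if header else "")
    return ids


def taxonomy_ids(path):
    """Return the first tab-separated column of every non-empty line."""
    ids = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line:
                # IDs are deliberately not stripped: stray spaces should surface below.
                ids.append(line.split("\t")[0])
    return ids


def compare_ids(fasta_path, taxonomy_path):
    """Report IDs that occur in only one of the two files."""
    seq_ids = set(fasta_ids(fasta_path))
    tax_ids = set(taxonomy_ids(taxonomy_path))
    return {
        "only_in_fasta": sorted(seq_ids - tax_ids),
        "only_in_taxonomy": sorted(tax_ids - seq_ids),
    }
```

Calling `compare_ids` on the two source files returns a dictionary with two lists; both are empty when the ID sets agree. If the taxonomy file has a header row, its first field will appear under `only_in_taxonomy`.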

Someone reported a similar problem here, but that user did find a mismatch. I also tried everything else suggested in that topic, but I still get the error.

One last detail: to prepare my MaarjAM files for QIIME, I followed the instructions of a generous GitHub soul. Interestingly, when I tested vsearch with his files (unfortunately out of date now), it worked perfectly. Since I followed his R code to prepare and check both my fasta and id-to-taxo files, I suspect the error comes from one of the new sequences.

What should I check next if there are no evident mismatches? Could something else be causing this error?

Thanks in advance for your time,

mica

##########
This is the command I ran (QIIME 2 v. 2019.1):

qiime feature-classifier classify-consensus-vsearch \
  --i-query rep-seqs.qza \
  --i-reference-reads maarjam_fasta.qza \
  --i-reference-taxonomy maarjam_id-to-taxo.qza \
  --p-perc-identity 0.99 \
  --p-threads 15 \
  --verbose \
  --o-classification taxonomy.qza

This is the output and error message:

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: vsearch --usearch_global /tmp/qiime2-archive-lkyw234k/e38c5264-5fc6-45f7-b4e3-e97eaa0eb252/data/dna-sequences.fasta --id 0.99 --query_cov 0.8 --strand both --maxaccepts 10 --maxrejects 0 --output_no_hits --db /tmp/qiime2-archive-3_zrsm3r/cf5cb7a2-b54e-41be-aa02-612ec3103edf/data/dna-sequences.fasta --threads 15 --blast6out /tmp/tmpv2i488v1

vsearch v2.7.0_linux_x86_64, 127.6GB RAM, 24 cores
https://github.com/torognes/vsearch

Reading file /tmp/qiime2-archive-3_zrsm3r/cf5cb7a2-b54e-41be-aa02-612ec3103edf/data/dna-sequences.fasta 100%
19463763 nt in 37596 seqs, min 134, max 1823, avg 518
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Searching 100%
Matching query sequences: 1349 of 3020 (44.67%)
Traceback (most recent call last):

  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3124, in get_value
    return libindex.get_value_box(s, key)
  File "pandas/_libs/index.pyx", line 55, in pandas._libs.index.get_value_box
  File "pandas/_libs/index.pyx", line 63, in pandas._libs.index.get_value_box
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 104, in _import_blast_format_assignments
    t = ref_taxa[id_]
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3132, in get_value
    raise e1
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '517'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-352>", line 2, in classify_consensus_vsearch
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_vsearch.py", line 37, in classify_consensus_vsearch
    unassignable_label=unassignable_label)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 29, in _consensus_assignments
    output.name, ref_taxa, unassignable_label=unassignable_label)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 109, in _import_blast_format_assignments
    'taxonomy.').format(str(id_)))
KeyError: 'Identifier 517 was reported in taxonomic search results, but was not present in the reference taxonomy.'

Plugin error from feature-classifier:

  'Identifier 517 was reported in taxonomic search results, but was not present in the reference taxonomy.'

See above for debug info.
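
The traceback shows the failure happening when a subject ID from the vsearch results is looked up in the reference taxonomy (`t = ref_taxa[id_]`). A rough way to reproduce that lookup outside QIIME 2 is sketched below in Python. It assumes the vsearch command shown above has been re-run by hand on exported FASTA files with `--blast6out` pointing at a file you keep, and that the taxonomy IDs have already been loaded into a list. The function name is hypothetical, and this is not the plugin's own code.

```python
def missing_subject_ids(blast6_path, taxonomy_ids):
    """Return subject IDs (column 2 of blast6 output) that are absent from taxonomy_ids."""
    known = set(taxonomy_ids)
    missing = set()
    with open(blast6_path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            subject = fields[1]
            # With --output_no_hits, unmatched queries are written with "*" as the subject.
            if subject != "*" and subject not in known:
                missing.add(subject)
    return sorted(missing)
```

Any ID returned here can then be searched for in the blast6 file to see which query produced it and how the corresponding header looks in the reference FASTA.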