qiime feature-classifier classify-consensus-vsearch error

Hello

I have a problem with taxonomy assignment using qiime feature-classifier classify-consensus-vsearch. Many of my FASTA sequence IDs start with 465, so I can’t tell which ID is missing from the reference taxonomy.
Could anyone help me sort out this problem?

This is the error I got:

(qiime2-2019.1) ruthra@bioinformatics-PowerEdge-T630:~/miniconda2/envs/qiime2-2019.1/v3-v4_region_result/sp/mo/moraxellaca.taxaplot$ qiime feature-classifier classify-consensus-vsearch --i-query rep-seqs-dada2.qza --i-reference-reads si_cs.qza --i-reference-taxonomy si_cs_txt.qza --p-maxaccepts 1 --p-perc-identity 0.7 --p-query-cov 1.0 --p-strand 'both' --p-min-consensus 1.0 --p-unassignable-label 'Unassigned' --o-classification cc_5_bacterialsequenceclassifyfile --VERBOSE
Usage: qiime feature-classifier classify-consensus-vsearch [OPTIONS]
Try "qiime feature-classifier classify-consensus-vsearch --help" for help.

Error: no such option: --VERBOSE
(qiime2-2019.1) ruthra@bioinformatics-PowerEdge-T630:~/miniconda2/envs/qiime2-2019.1/v3-v4_region_result/sp/mo/moraxellaca.taxaplot$ qiime feature-classifier classify-consensus-vsearch --i-query rep-seqs-dada2.qza --i-reference-reads si_cs.qza --i-reference-taxonomy si_cs_txt.qza --p-maxaccepts 1 --p-perc-identity 0.7 --p-query-cov 1.0 --p-strand 'both' --p-min-consensus 1.0 --p-unassignable-label 'Unassigned' --o-classification cc_5_bacterialsequenceclassifyfile --verbose
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: vsearch --usearch_global /tmp/qiime2-archive-6vhz93uq/ce425a22-beb2-4b17-979d-cd8b2b8487f0/data/dna-sequences.fasta --id 0.7 --query_cov 1.0 --strand both --maxaccepts 1 --maxrejects 0 --output_no_hits --db /tmp/qiime2-archive-owedsc7u/d5ce9526-e2a9-4b53-8b72-a37a2c11da39/data/dna-sequences.fasta --threads 1 --blast6out /tmp/tmp6bfegv_k

vsearch v2.7.0_linux_x86_64, 283.3GB RAM, 32 cores

Reading file /tmp/qiime2-archive-owedsc7u/d5ce9526-e2a9-4b53-8b72-a37a2c11da39/data/dna-sequences.fasta 100%
291387001 nt in 203523 seqs, min 286, max 4258, avg 1432
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Searching 100%
Matching query sequences: 533 of 559 (95.35%)
Traceback (most recent call last):
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3124, in get_value
    return libindex.get_value_box(s, key)
  File "pandas/_libs/index.pyx", line 55, in pandas._libs.index.get_value_box
  File "pandas/_libs/index.pyx", line 63, in pandas._libs.index.get_value_box
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 104, in _import_blast_format_assignments
    t = ref_taxa[id_]
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3132, in get_value
    raise e1
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '465'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-346>", line 2, in classify_consensus_vsearch
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_vsearch.py", line 37, in classify_consensus_vsearch
    unassignable_label=unassignable_label)
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 29, in _consensus_assignments
    output.name, ref_taxa, unassignable_label=unassignable_label)
  File "/home/ruthra/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 109, in _import_blast_format_assignments
    'taxonomy.').format(str(id_)))
KeyError: 'Identifier 465 was reported in taxonomic search results, but was not present in the reference taxonomy.'

Plugin error from feature-classifier:

'Identifier 465 was reported in taxonomic search results, but was not present in the reference taxonomy.'

See above for debug info.

I have attached a link to my files for your reference. Thank you in advance.


Hello!
I’m new to the forum, so I hope I’m following the rules. I was about to post a new topic about a similar error, but since this one was posted recently, I figured it would be easier to ask here.

I get the same error as @Asha1 when trying to use classify-consensus-vsearch with MaarjAM as a reference taxonomy (see the full command below), which suggests that the IDs in my reference reads and reference taxonomy don’t match (see the error details below). However, after checking both files for mismatches, I found none. Moreover, my error message says ‘Identifier 517’, even though my sequences are all named with the full GenBank ID (e.g., AB749517). I also checked for stray spaces that could be interfering, but again found nothing.
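
For anyone repeating this kind of check, here is a minimal Python sketch (not the R code mentioned later in this post) that compares the IDs in a reference FASTA against the first column of a tab-separated taxonomy file, and also lists the IDs that end with the fragment reported in the error. The helper names, sample records, and taxonomy strings are hypothetical, and the sketch assumes a headerless two-column taxonomy file; a header row would simply show up as an extra taxonomy-only ID.

```python
import csv
import io


def fasta_ids(handle):
    """Collect record IDs: the text after '>' up to the first whitespace."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def taxonomy_ids(handle):
    """Collect IDs from the first column of a tab-separated taxonomy file."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[0].strip() for row in reader if row and row[0].strip()}


# Hypothetical in-memory data standing in for the real files.
fasta = io.StringIO(">AB749517 some description\nACGT\n>XY000001\nGGCC\n")
taxonomy = io.StringIO("AB749517\tk__Fungi; p__Glomeromycota\nZZ999999\tk__Fungi\n")

seq_ids = fasta_ids(fasta)
tax_ids = taxonomy_ids(taxonomy)

# IDs present in one file but absent from the other.
missing_taxonomy = sorted(seq_ids - tax_ids)
missing_sequence = sorted(tax_ids - seq_ids)

# IDs that end with the fragment named in the error message (assumed: "517").
fragment_hits = sorted(i for i in seq_ids | tax_ids if i.endswith("517"))

print("In FASTA but not in taxonomy:", missing_taxonomy)
print("In taxonomy but not in FASTA:", missing_sequence)
print("IDs ending with the reported fragment:", fragment_hits)
```

With real data, the two `io.StringIO` objects would be replaced by files opened in text mode. An empty result for both differences, as reported here, points away from a plain ID mismatch and toward a formatting problem.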

Someone reported a similar problem here, but that user did find a mismatch. I also tried everything else suggested in that topic, but I still get the error.

One last detail: to prepare my MaarjAM files for QIIME, I followed the instructions of a generous GitHub soul. Interestingly, when I tested vsearch with his files (unfortunately out of date now), it worked perfectly. And since I followed his R code to prepare and check both my FASTA and id-to-taxo files, I suspect the error comes from one of the new sequences.

I’m wondering what I should check next if there are no evident mismatches. Could something else be causing this error?

Thanks in advance for your time,

mica

##########
This is the command I ran (QIIME 2 v. 2019.1):

qiime feature-classifier classify-consensus-vsearch \
  --i-query rep-seqs.qza \
  --i-reference-reads maarjam_fasta.qza \
  --i-reference-taxonomy maarjam_id-to-taxo.qza \
  --p-perc-identity 0.99 \
  --p-threads 15 \
  --verbose \
  --o-classification taxonomy.qza

This is the output and error message:

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: vsearch --usearch_global /tmp/qiime2-archive-lkyw234k/e38c5264-5fc6-45f7-b4e3-e97eaa0eb252/data/dna-sequences.fasta --id 0.99 --query_cov 0.8 --strand both --maxaccepts 10 --maxrejects 0 --output_no_hits --db /tmp/qiime2-archive-3_zrsm3r/cf5cb7a2-b54e-41be-aa02-612ec3103edf/data/dna-sequences.fasta --threads 15 --blast6out /tmp/tmpv2i488v1

vsearch v2.7.0_linux_x86_64, 127.6GB RAM, 24 cores
https://github.com/torognes/vsearch

Reading file /tmp/qiime2-archive-3_zrsm3r/cf5cb7a2-b54e-41be-aa02-612ec3103edf/data/dna-sequences.fasta 100%
19463763 nt in 37596 seqs, min 134, max 1823, avg 518
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Searching 100%
Matching query sequences: 1349 of 3020 (44.67%)
Traceback (most recent call last):
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3124, in get_value
    return libindex.get_value_box(s, key)
  File "pandas/_libs/index.pyx", line 55, in pandas._libs.index.get_value_box
  File "pandas/_libs/index.pyx", line 63, in pandas._libs.index.get_value_box
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 104, in _import_blast_format_assignments
    t = ref_taxa[id_]
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3132, in get_value
    raise e1
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '517'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-352>", line 2, in classify_consensus_vsearch
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_vsearch.py", line 37, in classify_consensus_vsearch
    unassignable_label=unassignable_label)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 29, in _consensus_assignments
    output.name, ref_taxa, unassignable_label=unassignable_label)
  File "/home/karidunfield/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_consensus_assignment.py", line 109, in _import_blast_format_assignments
    'taxonomy.').format(str(id_)))
KeyError: 'Identifier 517 was reported in taxonomic search results, but was not present in the reference taxonomy.'

Plugin error from feature-classifier:

'Identifier 517 was reported in taxonomic search results, but was not present in the reference taxonomy.'

See above for debug info.

Hello again @Asha1, and welcome to the forum, @mica.tosi!

I apologize to you both for the delay; I was out of the office.

Yes, thanks indeed for asking in this topic! It seems you have identical issues, so asking here will save us all time. And thanks for troubleshooting this a bit and providing details on your process.

This error is almost always caused by formatting problems, occasionally very cryptic ones. We usually only see it crop up when someone is generating their own reference database (the kinks have mostly been worked out for commonly used databases) or when the reference sequences and taxonomy really do not match (honest mistakes happen!).

So with that in mind, please see this:

My hunch is that you may both have this same issue, with invisible line breaks causing the formatting problems. Want to take a look and report back?
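
One way to look for such invisible line breaks is to count the line-ending bytes in the raw file. The sketch below is only an illustration under stated assumptions: the function name and the sample bytes are made up, and for a real check you would read your own reference FASTA (and taxonomy file) in binary mode instead.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), lone CR, and Unix (LF) line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    lone_cr = data.count(b"\r") - crlf  # CR not followed by LF
    lone_lf = data.count(b"\n") - crlf  # LF not preceded by CR
    return {"CRLF": crlf, "CR": lone_cr, "LF": lone_lf}


# Hypothetical sample: two lines end in CRLF, two in plain LF.
sample = b">AB749517\r\nACGT\r\n>XY000001\nGGCC\n"
counts = count_line_endings(sample)
print(counts)

# Show header lines that still carry a carriage return; repr() makes the
# otherwise invisible '\r' visible.
suspect_headers = [
    line for line in sample.split(b"\n")
    if line.startswith(b">") and b"\r" in line
]
for line in suspect_headers:
    print(repr(line))

# For a real file (the file name here is an assumption):
# from pathlib import Path
# print(count_line_endings(Path("reference.fasta").read_bytes()))
```

Any nonzero CRLF or CR count means the file contains non-Unix line endings, which would be worth cleaning up before importing.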

It worked! I converted the FASTA file with dos2unix before importing, and it ran smoothly.
Couldn’t be happier.
Thanks a lot :smile:
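
If dos2unix is not available, a similar cleanup can be sketched in Python. This is an approximation, not a reimplementation of dos2unix: it turns Windows (CRLF) line endings into LF and, as an extra assumption, also converts lone CR characters. The function name and the file names in the comments are hypothetical.

```python
def to_unix_newlines(data: bytes) -> bytes:
    """Replace CRLF and lone CR line endings with LF."""
    # Handle CRLF first so each pair becomes a single LF, then any leftover CR.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Hypothetical sample with Windows line endings.
sample = b">AB749517\r\nACGT\r\n"
cleaned = to_unix_newlines(sample)
print(repr(cleaned))

# For a real file (file names are assumptions), write the cleaned copy to a
# new path and import that copy into QIIME 2 instead of the original:
# from pathlib import Path
# raw = Path("reference.fasta").read_bytes()
# Path("reference_unix.fasta").write_bytes(to_unix_newlines(raw))
```

Writing to a new file rather than overwriting the original keeps an untouched copy in case the conversion needs to be checked.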
