I am using qiime2-2022.2 and have denoised my samples using DADA2. I did not truncate my seqs and wanted to use the latest SILVA database (138.1) with full-length seqs (not NR99). The format of the database has completely changed since 132, so I am at a loss as to which files to choose and how to import the database correctly. I tried to import it both ways (headerless and with header), but when I run qiime feature-classifier classify-consensus-vsearch it gives me the following error:
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: vsearch --usearch_global /tmp/qiime2-archive-rptvji15/71a58092-ac9f-47b0-bbaa-69992417d645/data/dna-sequences.fasta --id 0.8 --query_cov 0.8 --strand both --maxaccepts 0 --maxrejects 0 --db /tmp/qiime2-archive-89ol21_6/7e3805db-e931-4885-aeb5-e9d8cde1bb4b/data/dna-sequences.fasta --threads 78 --output_no_hits --blast6out /tmp/tmph173fw6e
vsearch v2.7.0_linux_x86_64, 377.4GB RAM, 88 cores
Reading file /tmp/qiime2-archive-89ol21_6/7e3805db-e931-4885-aeb5-e9d8cde1bb4b/data/dna-sequences.fasta 100%
3183581141 nt in 2224740 seqs, min 900, max 4000, avg 1431
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Searching 100%
Matching query sequences: 3303 of 3303 (100.00%)
Traceback (most recent call last):
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3081, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'HL282720.7.1469'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_feature_classifier/_consensus_assignment.py", line 105, in _import_blast_format_assignments
t = ref_taxa[id]
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/pandas/core/series.py", line 853, in __getitem__
return self._get_value(key)
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/pandas/core/series.py", line 961, in _get_value
loc = self.index.get_loc(label)
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3083, in get_loc
raise KeyError(key) from err
KeyError: 'HL282720.7.1469'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2cli/commands.py", line 339, in __call__
results = action(**arguments)
File "", line 2, in classify_consensus_vsearch
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
outputs = self.callable_executor(scope, callable_args,
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 391, in callable_executor
output_views = self._callable(**view_args)
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_feature_classifier/_vsearch.py", line 62, in classify_consensus_vsearch
consensus = _consensus_assignments(
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_feature_classifier/_consensus_assignment.py", line 29, in _consensus_assignments
obs_taxa = _import_blast_format_assignments(
File "/home/AnalysisTools/miniconda3/envs/qiime2-2022.2/lib/python3.8/site-packages/q2_feature_classifier/_consensus_assignment.py", line 107, in _import_blast_format_assignments
raise KeyError((
KeyError: 'Identifier HL282720.7.1469 was reported in taxonomic search results, but was not present in the reference taxonomy.'
Plugin error from feature-classifier:
'Identifier HL282720.7.1469 was reported in taxonomic search results, but was not present in the reference taxonomy.'
See above for debug info.
I saw a similar topic on the forum back in 2018, but that was with a custom database. This is not a custom database, but clearly I'm doing something wrong when importing it, as the format has completely changed.
Please help, I'm totally lost. I've tried everything, including clustering at 97% and filtering my representative seqs from the DADA2 denoising and then running qiime feature-classifier classify-consensus-vsearch with the filtered seqs, but I still get the same error.
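The error says that a sequence ID found in the reference sequences has no matching row in the reference taxonomy. One way to check for this kind of mismatch before importing is sketched below. This is only a minimal sketch: it assumes the reference sequences are a plain FASTA file and the taxonomy is a tab-separated file with the sequence ID in the first column. The function names, the `EXAMPLE0001.1.1400` ID, and the taxonomy string are made up for illustration and are not part of QIIME 2 or SILVA.

```python
import csv
import io


def fasta_ids(handle):
    """Yield the ID (first whitespace-delimited token) of each FASTA header."""
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def taxonomy_ids(handle, has_header=False):
    """Return the set of IDs in the first column of a tab-separated file."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    return {row[0].strip() for row in reader if row and row[0].strip()}


def missing_from_taxonomy(fasta_handle, taxonomy_handle, has_header=False):
    """Return the sorted sequence IDs that have no row in the taxonomy file."""
    known = taxonomy_ids(taxonomy_handle, has_header)
    return sorted({sid for sid in fasta_ids(fasta_handle) if sid not in known})


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real files (contents are hypothetical).
    fasta = io.StringIO(
        ">HL282720.7.1469 some description\nACGT\n"
        ">EXAMPLE0001.1.1400\nACGT\n"
    )
    taxonomy = io.StringIO("EXAMPLE0001.1.1400\td__Bacteria; p__Firmicutes\n")
    print(missing_from_taxonomy(fasta, taxonomy))
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` calls on the FASTA and taxonomy files. Any ID this reports would be expected to trigger the same "not present in the reference taxonomy" error if vsearch matches it.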