Dear QIIME2 Developers,
Happy New Year!! Thank you for the wonderful tools!
QIIME2 Version & Installation Method:
QIIME2 version 2021.11, installed with conda, with RESCRIPt added per its installation directions.
I am having an issue running qiime rescript evaluate-fit-classifier on the output of qiime rescript dereplicate. The commands and the error are below. The problem seems to be that --p-mode 'super' in the dereplicate step removes the last taxonomic rank (species, or genus if run without species) from the taxonomy of some of the sequences, so the labels no longer all have the same number of ranks. The same workflow runs without error when I use --p-mode 'uniq' instead.
Related forum post:
The following resolved forum post and its error seem to be somewhat related: Rescript merge-taxa non-urgent bug - blank taxon values created when using 'super'
Thank you very much for your time and help.
Sincerely,
David Bradshaw
The .qza files used below can be found here (I think I made this available correctly):
https://drive.google.com/drive/folders/1XmCK_e4HzF1ZsyhFCn-zRJQeRshhZbQM?usp=sharing
Scripts run:
Good workflow:
qiime rescript dereplicate \
  --i-sequences silva-138.1-ssu-nr99-18SEuk1319f-18SEukBr-seqs.qza \
  --i-taxa silva-138.1-ssu-nr99-tax-derep-uniq.qza \
  --p-rank-handles 'silva' \
  --p-mode 'uniq' \
  --o-dereplicated-sequences silva-138.1-ssu-nr99-seqs-18SEuk1319f-18SEukBr-derep-uniq.qza \
  --o-dereplicated-taxa silva-138.1-ssu-nr99-tax-18SEuk1319f-18SEukBr-derep-uniq.qza

qiime rescript evaluate-fit-classifier \
  --i-sequences silva-138.1-ssu-nr99-seqs-18SEuk1319f-18SEukBr-derep-uniq.qza \
  --i-taxonomy silva-138.1-ssu-nr99-tax-18SEuk1319f-18SEukBr-derep-uniq.qza \
  --o-classifier silva-138.1-99-18SEuk1319f-18SEukBr-2021.8-classifier.qza \
  --o-observed-taxonomy silva-138-99-18SEuk1319f-18SEukBr--derep-uniq-taxonomy-predicted-taxonomy.qza \
  --o-evaluation silva-138-99-18SEuk1319f-18SEukBr--derep-uniq-taxonomy-fit-classifier-evaluation.qzv \
  --p-reads-per-batch 10000
Error workflow:
qiime rescript dereplicate \
  --i-sequences silva-138.1-ssu-nr99-18SEuk1319f-18SEukBr-seqs.qza \
  --i-taxa silva-138.1-ssu-nr99-tax-derep-uniq.qza \
  --p-rank-handles 'silva' \
  --p-mode 'super' \
  --o-dereplicated-sequences silva-138.1-ssu-nr99-seqs-18SEuk1319f-18SEukBr-derep-super.qza \
  --o-dereplicated-taxa silva-138.1-ssu-nr99-tax-18SEuk1319f-18SEukBr-derep-super.qza

qiime rescript evaluate-fit-classifier \
  --i-sequences silva-138.1-ssu-nr99-seqs-18SEuk1319f-18SEukBr-derep-super.qza \
  --i-taxonomy silva-138.1-ssu-nr99-tax-18SEuk1319f-18SEukBr-derep-super.qza \
  --o-classifier silva-138.1-99-341f-805r-2021.8-classifier.qza \
  --o-observed-taxonomy silva-138-99-341f-805r--derep-super-taxonomy-predicted-taxonomy.qza \
  --o-evaluation silva-138-99-341f-805r--derep-super-taxonomy-fit-classifier-evaluation.qzv \
  --p-reads-per-batch 10000 \
  --verbose
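One way to confirm that the 'super' output really contains shorter labels than the 'uniq' output is to tally the number of ranks per label in each taxonomy file. The following is only a minimal sketch: it assumes the taxonomy has been exported to a tab-separated file with "Feature ID" and "Taxon" columns, the function name rank_depth_histogram is hypothetical, and the example rows are made up rather than taken from the SILVA files above.

```python
import csv
import io
from collections import Counter


def rank_depth_histogram(tsv_text):
    """Tally how many taxonomy labels have each number of non-empty ranks.

    Assumes a tab-separated table with 'Feature ID' and 'Taxon' columns.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    depths = Counter()
    for row in reader:
        # Ranks are separated by semicolons; blank pieces are not counted,
        # so a label with an empty final rank also shows up as shorter.
        ranks = [r.strip() for r in row["Taxon"].split(";") if r.strip()]
        depths[len(ranks)] += 1
    return depths


# Made-up rows for illustration only.
example_tsv = (
    "Feature ID\tTaxon\n"
    "seqA\td__Eukaryota; p__X; c__Y; o__Z; f__W; g__V; s__U\n"
    "seqB\td__Eukaryota; p__X; c__Y; o__Z; f__W; g__V\n"
    "seqC\td__Eukaryota; p__X; c__Y; o__Z; f__W; g__V; s__T\n"
)

histogram = rank_depth_histogram(example_tsv)
for depth, count in sorted(histogram.items()):
    print(f"{count} label(s) with {depth} rank(s)")
```

If every label in a file has the same depth, the histogram has a single entry; more than one entry would match the uneven-depth symptom described above.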
Error message:
Traceback (most recent call last):
  File "/home/microbiology/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/q2cli/commands.py", line 339, in __call__
    results = action(**arguments)
  File "", line 2, in evaluate_fit_classifier
  File "/home/microbiology/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
    outputs = self.callable_executor(scope, callable_args,
  File "/home/microbiology/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 485, in callable_executor
    outputs = self._callable(scope.ctx, **view_args)
  File "/home/microbiology/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/rescript/cross_validate.py", line 35, in evaluate_fit_classifier
    taxa, seq_ids = _validate_cross_validate_inputs(taxonomy, sequences)
  File "/home/microbiology/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/rescript/cross_validate.py", line 205, in _validate_cross_validate_inputs
    _validate_even_rank_taxonomy(taxa)
  File "/home/microbiology/miniconda3/envs/qiime2-2021.11/lib/python3.8/site-packages/rescript/cross_validate.py", line 382, in _validate_even_rank_taxonomy
    raise ValueError('Taxonomic label depth is uneven. All taxonomies '
ValueError: Taxonomic label depth is uneven. All taxonomies must have the same number of semicolon-delimited ranks. The following features are too short: AAAA02038450.2584.4394, AB002062.1.1771, AB002076.1.1798, AB002079.1.1770, AB003944.1.2196, etc...
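
As I understand the message, the check compares the number of semicolon-delimited ranks across all labels and reports the features whose labels are shorter than the deepest one. The sketch below shows that kind of check in plain Python. It is not the actual RESCRIPt code: the function name find_short_taxonomies, the feature IDs, and the labels are all hypothetical and only illustrate the idea.

```python
def find_short_taxonomies(taxonomy):
    """Return the IDs whose label has fewer ranks than the deepest label.

    `taxonomy` maps feature ID -> semicolon-delimited taxonomy string.
    """
    if not taxonomy:
        return []
    # Depth is the number of semicolon-delimited fields in the label.
    depths = {fid: label.count(";") + 1 for fid, label in taxonomy.items()}
    max_depth = max(depths.values())
    return sorted(fid for fid, depth in depths.items() if depth < max_depth)


# Hypothetical IDs and labels, not taken from the files above.
example = {
    "seq1": "d__A; p__B; c__C; o__D; f__E; g__F; s__G",
    "seq2": "d__A; p__B; c__C; o__D; f__E; g__F",
    "seq3": "d__A; p__B; c__C; o__D; f__E; g__H; s__I",
}

short_ids = find_short_taxonomies(example)
if short_ids:
    print("The following features are too short: " + ", ".join(short_ids))
```

Running a check like this on the 'uniq' and 'super' taxonomy outputs should show whether only the 'super' file contains truncated labels.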