Hi all!
I am running QIIME 2 2019.1 in VirtualBox to analyze a study downloaded from NCBI. Running qiime vsearch cluster-features-closed-reference fails with the error below.
Output with --verbose:
vsearch v2.7.0_linux_x86_64, 4.8GB RAM, 6 cores
Reading file /tmp/qiime2-archive-9w63scf6/4aa7bd0a-0b9b-44e7-b342-007854eb5248/data/dna-sequences.fasta 100%
142290491 nt in 99322 seqs, min 1254, max 2353, avg 1433
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Searching 100%
Matching query sequences: 0 of 148682 (0.00%)
vsearch v2.7.0_linux_x86_64, 4.8GB RAM, 6 cores
Reading file /tmp/tmp8pr_na5i 100%
38310917 nt in 148682 seqs, min 54, max 896, avg 258
Getting sizes 100%
Sorting 100%
Median abundance: 1
Writing output 100%
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: vsearch --usearch_global /tmp/tmp9dh7g16y --id 0.97 --db /tmp/qiime2-archive-9w63scf6/4aa7bd0a-0b9b-44e7-b342-007854eb5248/data/dna-sequences.fasta --uc /tmp/tmphpc6wmek --strand plus --qmask none --notmatched /tmp/tmp8pr_na5i --threads 1
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.
Command: vsearch --sortbysize /tmp/tmp8pr_na5i --xsize --output /tmp/q2-DNAFASTAFormat-gk1csu50
Traceback (most recent call last):
File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_vsearch/_cluster_features.py", line 275, in cluster_features_closed_reference
collapse_f = _collapse_f_from_sqlite(conn)
File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_vsearch/_cluster_features.py", line 97, in _collapse_f_from_sqlite
raise ValueError("No sequence matches were identified by vsearch.")
ValueError: No sequence matches were identified by vsearch.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
results = action(**arguments)
File "</home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-122>", line 2, in cluster_features_closed_reference
File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
output_views = self._callable(**view_args)
File "/home/qiime2/miniconda/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_vsearch/_cluster_features.py", line 278, in cluster_features_closed_reference
raise VSearchError('No matches were identified to '
q2_vsearch._cluster_features.VSearchError: No matches were identified to reference_sequences. This can happen if sequences are not homologous to reference_sequences, or if sequences are not in the same orientation as reference_sequences (i.e., if sequences are reverse complemented with respect to reference sequences). Sequence orientation can be adjusted with the strand parameter.
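The error message names read orientation as one possible cause. A rough way to check that on a handful of reads, outside QIIME 2, is to compare how many k-mers a read shares with a reference sequence in each orientation. The sketch below is only an illustration: the function names and the choice of k = 8 are my own assumptions, and this is not how vsearch itself searches.

```python
# Hypothetical helper functions for a quick orientation check.
# Not part of QIIME 2 or vsearch; k=8 is an arbitrary choice.

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def kmers(seq, k=8):
    """Return the set of all k-mers in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def guess_orientation(query, reference, k=8):
    """Guess whether query matches reference as given ('plus'),
    reverse complemented ('minus'), or shares no k-mers ('none')."""
    ref_kmers = kmers(reference, k)
    plus_hits = len(kmers(query, k) & ref_kmers)
    minus_hits = len(kmers(reverse_complement(query), k) & ref_kmers)
    if plus_hits == 0 and minus_hits == 0:
        return "none"
    return "plus" if plus_hits >= minus_hits else "minus"


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    reference = "ACGGTTCAGGCTAACTCCGTGCCAGCAGCCGCGGTAATAC"
    read = reference[5:30]
    print(guess_orientation(read, reference))
    print(guess_orientation(reverse_complement(read), reference))
```

If most reads come back as "minus", orientation is the likely cause and the strand parameter is the thing to change. If they come back as "none", the reads probably do not cover the same region as the reference sequences.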
I searched for the problem on the forum but did not find a way to solve it.
By the way, I also have two other questions:
- The data were sequenced with 454 FLX Titanium chemistry (Roche), so I suppose DADA2 and Deblur cannot be applied?
- I did not run quality control or chimera filtering myself because, according to the original paper, the data I downloaded are already clean. Is that okay?
Thank you!