Hi, all,
I’m currently using DADA2 to denoise my sequence reads and I’ve hit an error I can’t get past. The command runs successfully when I specify a truncation length close to the end of my reads (around 420 bases), but it fails when I lower this value or specify 0 to avoid truncation altogether. Ideally, I don’t want to truncate my sequences at all, because my quality scores are good across the board. Below is the command that works, followed by the command that fails, run with --verbose (I’m using Docker from the Windows command prompt):
C:\QIIME2\Indiana\Run2>docker run --rm -t -i -v %cd%:/data qiime2/core:2017.9 qiime dada2 denoise-single --i-demultiplexed-seqs demux.qza --p-trunc-len 420 --o-representative-sequences rep-seqs-dada2.qza --o-table table-dada2.qza
Saved FeatureTable[Frequency] to: table-dada2.qza
Saved FeatureData[Sequence] to: rep-seqs-dada2.qza
C:\QIIME2\Indiana\Run2>docker run --rm -t -i -v %cd%:/data qiime2/core:2017.9 qiime dada2 denoise-single --i-demultiplexed-seqs demux.qza --p-trunc-len 0 --o-representative-sequences rep-seqs-dada2.qza --o-table table-dada2.qza --verbose
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
Command: run_dada_single.R /tmp/qiime2-archive-mtq1goc1/f4f9b470-d02f-4ac6-9246-a02575c4d9e2/data /tmp/tmpclae2epg/output.tsv.biom /tmp/tmpclae2epg 0 0 2.0 2 consensus 1.0 1 1000000
R version 3.3.2 (2016-10-31)
Loading required package: Rcpp
There were 50 or more warnings (use warnings() to see the first 50)
DADA2 R package version: 1.4.0
1) Filtering .............Traceback (most recent call last):
File "/opt/conda/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 126, in denoise_single
run_commands([cmd])
File "/opt/conda/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 35, in run_commands
subprocess.run(cmd, check=True)
File "/opt/conda/lib/python3.5/subprocess.py", line 398, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_single.R', '/tmp/qiime2-archive-mtq1goc1/f4f9b470-d02f-4ac6-9246-a02575c4d9e2/data', '/tmp/tmpclae2epg/output.tsv.biom', '/tmp/tmpclae2epg', '0', '0', '2.0', '2', 'consensus', '1.0', '1', '1000000']' returned non-zero exit status -9
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.5/site-packages/q2cli/commands.py", line 218, in __call__
results = action(**arguments)
File "<decorator-gen-318>", line 2, in denoise_single
File "/opt/conda/lib/python3.5/site-packages/qiime2/sdk/action.py", line 201, in callable_wrapper
output_types, provenance)
File "/opt/conda/lib/python3.5/site-packages/qiime2/sdk/action.py", line 334, in _callable_executor_
output_views = callable(**view_args)
File "/opt/conda/lib/python3.5/site-packages/q2_dada2/_denoise.py", line 137, in denoise_single
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code -9), please inspect stdout and stderr to learn more.
Plugin error from dada2:
An error was encountered while running DADA2 in R (return code -9),
please inspect stdout and stderr to learn more.
See above for debug info.
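A note on reading the return code: the traceback comes from Python's subprocess module, which reports a negative return code when the child process was terminated by a signal, so -9 corresponds to signal 9 (SIGKILL on Linux). A small sketch of that mapping is below. The helper name `describe_returncode` is made up for illustration, and the idea that the kill came from the container running out of memory is an assumption, not something the log confirms.

```python
import signal


def describe_returncode(returncode):
    """Turn a subprocess return code into a short human-readable description.

    On POSIX systems, subprocess reports -N when the child was terminated
    by signal N, so -9 means the process received signal 9.
    """
    if returncode < 0:
        # Look up the symbolic name of the signal (e.g. SIGKILL for 9).
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


print(describe_returncode(-9))
```

A process that is killed this way never gets to print its own error message, which would explain why nothing useful appears after "1) Filtering".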
To add to my frustration, a smaller data set with fewer sequence reads that went through the exact same pipeline did not produce this error, and I was able to denoise it with DADA2 using a truncation length of 0. I suspect my issue is similar to the one in this topic: Exit Code -9 From DADA2
Following the recommendation there, I’ve allocated as much memory and CPU to Docker as I possibly can, and I’m still getting the error. I suspect my files are too large for my computer to process. I also know I have a lot of short junk reads that need to be removed. My thinking is that if I can remove these before denoising, DADA2 will run without a problem, but I don’t know how to do that. As I understand it, --p-trunc-len NUMBER truncates each sequence at the 3’ end to the specified number of bases and discards any read shorter than that length, so I can’t use it: my real reads vary in length and are of good quality. Is there anything I can do?
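To make the idea concrete, here is a rough sketch of the kind of pre-filter I have in mind: drop every read shorter than some minimum length and leave the rest untruncated. It is not a QIIME 2 or DADA2 feature, just plain Python over gzipped FASTQ files. It assumes the per-sample files have already been exported from demux.qza and would be re-imported afterwards, and that each record occupies exactly four lines. The function names, the file names, and the threshold of 100 are all hypothetical.

```python
import gzip
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a FASTQ handle.

    Assumes the common four-lines-per-record layout (no wrapped sequences).
    """
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return  # end of file (or a trailing incomplete record)
        yield tuple(record)


def filter_by_min_length(records, min_len):
    """Keep only records whose sequence has at least min_len bases."""
    for header, seq, plus, qual in records:
        if len(seq) >= min_len:
            yield header, seq, plus, qual


def filter_fastq_file(in_path, out_path, min_len):
    """Stream a gzipped FASTQ file to a new one, dropping short reads.

    Returns the number of reads kept. Reads are processed one at a time,
    so memory use stays small regardless of file size.
    """
    kept = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for record in filter_by_min_length(read_fastq(src), min_len):
            dst.write("\n".join(record) + "\n")
            kept += 1
    return kept
```

A call would look like `filter_fastq_file("sample1.fastq.gz", "sample1.filtered.fastq.gz", 100)`, with both file names and the cutoff being placeholders. Unlike --p-trunc-len, this leaves the longer reads at their full length.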