DADA2 error (return code -9): how to remove short reads before denoising?

Hi, all,

I’m currently using DADA2 to denoise my sequence reads and I’ve run into an error I can’t get past. The command runs successfully when I specify a truncation length close to the full length of my reads (around 420 bases), but it fails when I lower this value or set it to 0 to avoid truncation altogether. Ideally, I don’t want to truncate my sequences because my quality scores are good across the board. Below is the command that works, followed by the one that fails, run with --verbose (I’m using Docker from the Windows command prompt):

 C:\QIIME2\Indiana\Run2>docker run --rm -t -i -v %cd%:/data qiime2/core:2017.9 qiime dada2 denoise-single --i-demultiplexed-seqs demux.qza --p-trunc-len 420 --o-representative-sequences rep-seqs-dada2.qza --o-table table-dada2.qza
 Saved FeatureTable[Frequency] to: table-dada2.qza
 Saved FeatureData[Sequence] to: rep-seqs-dada2.qza

 C:\QIIME2\Indiana\Run2>docker run --rm -t -i -v %cd%:/data qiime2/core:2017.9 qiime dada2 denoise-single --i-demultiplexed-seqs demux.qza --p-trunc-len 0 --o-representative-sequences rep-seqs-dada2.qza --o-table table-dada2.qza --verbose
 Running external command line application(s). This may print messages to stdout and/or stderr.
 The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

 Command: run_dada_single.R /tmp/qiime2-archive-mtq1goc1/f4f9b470-d02f-4ac6-9246-a02575c4d9e2/data /tmp/tmpclae2epg/output.tsv.biom /tmp/tmpclae2epg 0 0 2.0 2 consensus 1.0 1 1000000

 R version 3.3.2 (2016-10-31)
 Loading required package: Rcpp
 There were 50 or more warnings (use warnings() to see the first 50)
 DADA2 R package version: 1.4.0
 1) Filtering .............Traceback (most recent call last):
   File "/opt/conda/lib/python3.5/site-packages/q2_dada2/", line 126, in denoise_single
   File "/opt/conda/lib/python3.5/site-packages/q2_dada2/", line 35, in run_commands, check=True)
   File "/opt/conda/lib/python3.5/", line 398, in run
     output=stdout, stderr=stderr)
 subprocess.CalledProcessError: Command '['run_dada_single.R', '/tmp/qiime2-archive-mtq1goc1/f4f9b470-d02f-4ac6-9246-a02575c4d9e2/data', '/tmp/tmpclae2epg/output.tsv.biom', '/tmp/tmpclae2epg', '0', '0', '2.0', '2', 'consensus', '1.0', '1', '1000000']' returned non-zero exit status -9

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/opt/conda/lib/python3.5/site-packages/q2cli/", line 218, in __call__
     results = action(**arguments)
   File "<decorator-gen-318>", line 2, in denoise_single
   File "/opt/conda/lib/python3.5/site-packages/qiime2/sdk/", line 201, in callable_wrapper
     output_types, provenance)
   File "/opt/conda/lib/python3.5/site-packages/qiime2/sdk/", line 334, in _callable_executor_
     output_views = callable(**view_args)
   File "/opt/conda/lib/python3.5/site-packages/q2_dada2/", line 137, in denoise_single
     " and stderr to learn more." % e.returncode)
 Exception: An error was encountered while running DADA2 in R (return code -9), please inspect stdout and stderr to learn more.

 Plugin error from dada2:

   An error was encountered while running DADA2 in R (return code -9),
   please inspect stdout and stderr to learn more.

 See above for debug info.

To add to my frustration, a smaller data set with fewer sequence reads went through the exact same pipeline without this error, and I was able to denoise it with DADA2 using a truncation length of 0. I suspect my issue is similar to the one in this topic: Exit Code -9 From DADA2
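
For anyone decoding the status in the log above: the traceback shows the R script being launched through Python's `subprocess` module, and there a negative return code -N means the child process was terminated by signal N (on POSIX systems, which includes the Linux container Docker runs). Signal 9 is SIGKILL, which is what the Linux out-of-memory killer sends, although other things can send it too, so this is a strong hint rather than proof. A minimal sketch for decoding such codes, where the helper name `describe_returncode` is made up for illustration and is not part of QIIME 2:

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Explain a return code as reported by Python's subprocess module."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number not known on this platform (e.g. on Windows).
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-returncode, name)
    return "exited with error status %d" % returncode


# The status from the log above.
print(describe_returncode(-9))  # on Linux: killed by signal 9 (SIGKILL)
```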

On that recommendation, I’ve allocated as much memory and CPU to Docker as I possibly can, and I’m still receiving the error. I suspect my files are too large for my computer to process, and I know I have a lot of short junk reads that need to be removed. I’m thinking that if I can remove these before I denoise my sequences, I’ll be able to run DADA2 with no problem, but I don’t know how to do that. I know --p-trunc-len NUMBER truncates each sequence at the 3’ end to the number of bases specified and discards any read shorter than that length, so I can’t use it: my real reads vary in length and are of good quality. Is there anything I can do?
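
The --p-trunc-len behaviour described above can be modelled in a few lines. This is a toy model of the length rule only, not DADA2's actual filtering code (which also applies quality-based filters), and the function name `apply_trunc_len` and the read lengths are invented for illustration. It may also hint at why the 420 run fits in memory: every read shorter than 420 is discarded before denoising, so far fewer reads reach the later steps, whereas 0 keeps them all. That is an inference from the rule, not something the log confirms.

```python
from typing import Optional


def apply_trunc_len(read: str, trunc_len: int) -> Optional[str]:
    """Toy model of the length rule: truncate to trunc_len, drop shorter reads.

    A trunc_len of 0 disables truncation, so every read is kept unchanged.
    Returns None when the read would be discarded.
    """
    if trunc_len == 0:
        return read
    if len(read) < trunc_len:
        return None
    return read[:trunc_len]


# Hypothetical read lengths: two full-length reads and two short "junk" reads.
reads = ["A" * 450, "A" * 420, "A" * 300, "A" * 60]

for trunc_len in (420, 0):
    kept = [r for r in (apply_trunc_len(r, trunc_len) for r in reads) if r is not None]
    print(trunc_len, [len(r) for r in kept])
# 420 [420, 420]
# 0 [450, 420, 300, 60]
```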

Hi @Brightbeard, sorry to hear things aren’t going well in the DADA2 department. I suspect your hunch is correct and that you’re running out of memory while processing these samples through q2-dada2. The only option that comes to mind is to use q2-quality-filter to remove these “junk reads”. Otherwise, you could run this denoising step on a QIIME 2 AWS VM, using a high-memory EC2 instance. Sorry I don’t really have a better answer for you at the moment.
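
If working on the raw files is an option, removing short reads by length alone can also be sketched in plain Python, outside QIIME 2. This is an illustration under several assumptions, not a tested part of any QIIME 2 workflow: the per-sample reads are available as gzipped FASTQ files with exactly four lines per record, the filtered files are imported back into QIIME 2 afterwards (not shown here), and the function names and threshold are hypothetical.

```python
import gzip
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (header, sequence, separator, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        yield header.rstrip("\r\n"), seq, plus, qual


def filter_by_length(in_path: str, out_path: str, min_length: int) -> Tuple[int, int]:
    """Copy reads with at least min_length bases to out_path.

    Reads are never truncated, only kept or dropped.
    Returns (number kept, number dropped).
    """
    kept = dropped = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for header, seq, plus, qual in read_fastq(src):
            if len(seq) >= min_length:
                dst.write("\n".join((header, seq, plus, qual)) + "\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

A call would look like `kept, dropped = filter_by_length("sample1.fastq.gz", "sample1.filtered.fastq.gz", min_length=100)`, where both file names and the cutoff of 100 are placeholders. Unlike --p-trunc-len, this keeps the surviving reads at their full, variable length.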
