(Heather E) #1

Hi. I’m having a similar problem running dada2 in qiime2 and wondered if you have heard back from the dada2 developers? I ran one sample for marker development. My demux file shows 24 million reads, so I’m guessing that the number of unique reads is what’s causing the memory issue? I’m running on an iMac Pro with 64GB RAM and 8 cores. I’ve tried decreasing --p-n-reads-learn to 50,000 and limiting the run to 2 threads, but I’m still getting the same error message. Any suggestions?

(qiime2-2019.1) Genomicss-iMac-Pro:Evans genomicslab$ qiime dada2 denoise-paired --i-demultiplexed-seqs demux.qza --p-trim-left-f 5 --p-trim-left-r 5 --p-trunc-len-f 234 --p-trunc-len-r 172 --o-table table.qza --o-representative-sequences rep-seqs.qza --o-denoising-stats denoising-stats.qza --verbose --p-n-reads-learn 50000 --p-n-threads 2

Running external command line application(s). This may print messages to stdout and/or stderr.

The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
Command: run_dada_paired.R /var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/forward /var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/reverse /var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/output.tsv.biom /var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/track.tsv /var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/filt_f /var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/filt_r 234 172 5 5 2.0 2 consensus 1.0 2 50000

R version 3.4.1 (2017-06-30)

Loading required package: Rcpp

DADA2 R package version: 1.6.0

1) Filtering .

2) Learning Error Rates

2a) Forward Reads

Initializing error rates to maximum possible estimate.

Traceback (most recent call last):

File “/Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/”, line 231, in denoise_paired


File “/Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/”, line 36, in run_commands, check=True)

File “/Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/”, line 418, in run

output=stdout, stderr=stderr)

subprocess.CalledProcessError: Command ‘[‘run_dada_paired.R’, ‘/var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/forward’, ‘/var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/reverse’, ‘/var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/output.tsv.biom’, ‘/var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/track.tsv’, ‘/var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/filt_f’, ‘/var/folders/y2/pxn5pc8j0qncxx0z92q48r7r0000gn/T/tmpv3ph_e0t/filt_r’, ‘234’, ‘172’, ‘5’, ‘5’, ‘2.0’, ‘2’, ‘consensus’, ‘1.0’, ‘2’, ‘50000’]’ died with <Signals.SIGKILL: 9>.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

File “/Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/”, line 274, in call

results = action(**arguments)

File “</Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/>”, line 2, in denoise_paired

File “/Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/”, line 231, in bound_callable

output_types, provenance)

File “/Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/”, line 365, in callable_executor

output_views = self._callable(**view_args)

File “/Users/genomicslab/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/”, line 246, in denoise_paired

" and stderr to learn more." % e.returncode)

Exception: An error was encountered while running DADA2 in R (return code -9), please inspect stdout and stderr to learn more.

Plugin error from dada2:

An error was encountered while running DADA2 in R (return code -9), please inspect stdout and stderr to learn more.

See above for debug info.


DADA2 error : Cannot allocate memory error in denoising step
(Matthew Ryan Dillon) #2

Hi @Heather_E!

This error message is a little different from the “usual” memory-related error messages we see (which is why I split this into a new topic): the R process died with SIGKILL (signal 9).

SIGKILL is worth reading up on, but the short version is that something sent that signal to this process. Any chance the computer was just shutting down for system updates? If not, then I suspect you are simply running out of memory. Can you clarify: are there 24 million reads in one sample, or in your whole run?
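
For context on the "return code -9" in the log above: on POSIX systems, Python's subprocess module reports a child that was killed by a signal as the negated signal number, so -9 corresponds to SIGKILL. Here is a minimal sketch of that mapping. Note that describe_returncode is a hypothetical helper name, and the self-killing child process is only a stand-in for the R script, not what q2-dada2 actually runs. It will not work on Windows, which has no SIGKILL.

```python
import signal
import subprocess
import sys


def describe_returncode(returncode: int) -> str:
    """Translate a subprocess return code into a short description (POSIX only)."""
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        return "killed by " + signal.Signals(-returncode).name
    if returncode == 0:
        return "exited normally"
    return f"exited with error code {returncode}"


# Hypothetical stand-in for run_dada_paired.R: a child that sends SIGKILL to itself.
child = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
]
result = subprocess.run(child)
print(result.returncode, describe_returncode(result.returncode))
```

The return code only says which signal arrived, not who sent it. The operating system killing a process that has exhausted memory is one common cause, but the code alone cannot confirm that.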


(Heather E) #3

Well, the whole run was just one sample, so yes and yes. It’s 24 million reads in one sample. I ran the command about 7 different times over the course of two days, decreasing --p-n-reads-learn incrementally. I also checked that the computer’s settings would not allow it to sleep, so I’m sure that the system wasn’t shutting down. I can run on the local university’s supercomputer if I need to. Any idea how much memory I will need to get this to run? They have a few nodes with options of 128GB, 256GB, or 512GB.


(Matthew Ryan Dillon) #4

Please see the DADA2 docs on this matter:

In particular:

One scaling issue to be aware of: Because the running time of the core sample inference method scales quadratically with the depth of individual samples, but linearly in the number of samples, running times will be longer when fewer samples are multiplexed. Very roughly, if your 150M Hiseq reads are split across 150 samples instead of 750, the running time will be about 5x higher.

Sounds like one gargantuan sample is going to be painfully slow here…
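
To make the arithmetic in that quote concrete, here is a rough sketch. It assumes reads are split evenly across samples and that the per-sample cost is proportional to depth squared, as the quote describes. relative_runtime is a hypothetical helper, and its result is unitless, so it is only meaningful as a ratio between two scenarios. It says nothing about memory use.

```python
def relative_runtime(total_reads: float, n_samples: int) -> float:
    """Unitless cost: n_samples * (reads per sample)^2, assuming an even split."""
    depth = total_reads / n_samples
    return n_samples * depth ** 2


# The example from the quote: 150M reads across 150 samples versus 750 samples.
few_samples = relative_runtime(150e6, 150)
many_samples = relative_runtime(150e6, 750)
print(few_samples / many_samples)  # 5.0, matching the "about 5x" in the quote

# Hypothetical comparison for this thread: 24M reads in 1 sample versus
# the same reads spread over 96 samples (96 is an assumed plate size).
one_sample = relative_runtime(24e6, 1)
plate = relative_runtime(24e6, 96)
print(one_sample / plate)  # 96.0
```

Under these assumptions the ratio is simply the ratio of sample counts, so a single 24M-read sample costs roughly as many times more as the number of samples those reads would otherwise have been split across.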


(Heather E) #5

Hmmm. Should I try a different program? Would deblur or something else be any faster?


(Matthew Ryan Dillon) #6

That is up to you. q2-deblur could certainly be a good option.