An error was encountered while running DADA2 in R (return code 1), denoise-single

Hi,

I’m running dada2 denoise-single on my simulated reads and received this error.
This is what I ran:
qiime dada2 denoise-single --i-demultiplexed-seqs demux.qza --p-trunc-len 0 --output-dir qiimeoutput --verbose

Here is the error message:

Command: run_dada_single.R /tmp/qiime2-archive-45am8816/aa5c4cd2-2cd0-4fff-ba1e-eceaa3bcf57c/data /tmp/tmpavmzk2yk/output.tsv.biom /tmp/tmpavmzk2yk/track.tsv /tmp/tmpavmzk2yk 0 0 2.0 2 Inf independent consensus 1.0 1 1000000 NULL 16

R version 4.0.2 (2020-06-22)
Loading required package: Rcpp
DADA2: 1.18.0 / Rcpp: 1.0.5 / RcppParallel: 5.0.2

  1. Filtering …
  2. Learning Error Rates
    12554918 total bases in 63847 reads from 50 samples will be used for learning the error rates.
    Error rates could not be estimated (this is usually because of very few reads).
    Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
    Execution halted
    Traceback (most recent call last):
    File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 181, in _denoise_single
    run_commands([cmd])
    File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
    File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['run_dada_single.R', '/tmp/qiime2-archive-45am8816/aa5c4cd2-2cd0-4fff-ba1e-eceaa3bcf57c/data', '/tmp/tmpavmzk2yk/output.tsv.biom', '/tmp/tmpavmzk2yk/track.tsv', '/tmp/tmpavmzk2yk', '0', '0', '2.0', '2', 'Inf', 'independent', 'consensus', '1.0', '1', '1000000', 'NULL', '16']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/site-packages/q2cli/commands.py", line 329, in call
results = action(**arguments)
File "", line 2, in denoise_single
File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
output_types, provenance)
File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 390, in callable_executor
output_views = self._callable(**view_args)
File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 218, in denoise_single
band_size='16')
File "/data/wjw5274/anaconda3/envs/qiime2/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 192, in _denoise_single
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.

I looked at posts with similar issues but could not find a suitable solution. Most of the posts I found involve problems merging forward and reverse reads, which is not the case here. I also tried adjusting --p-trunc-len and --p-trim-left, to no avail.
My question is: is “Error rates could not be estimated (this is usually because of very few reads).” the key issue here? Should I perhaps rerun my read simulation with a longer read length (and if so, what would be a suitable length)? Or is it something I can fix by adjusting the dada2 options?
Thank you so much! Appreciate your help.

Hi @wei_wei,

Thanks for reaching out! Happy to provide some guidance here.

You’re exactly correct - that is the primary error you’re running into. This is because you are attempting to run dada2 denoise-single on simulated reads. DADA2 is designed to correct for Illumina-sequenced amplicon errors (which shouldn’t be present in your simulated reads).

If you haven’t read the DADA2 paper by Callahan et al., I’d highly recommend it for some additional background and context on how to best use this plugin.

Hope this helps! Please feel free to reach back out with any further questions.

Cheers,
Liz

Hi,

Thank you for your reply! I will look into the paper. However, I simulated reads with some errors (using grinder) and I remember that I was able to use dada2 on simulated reads before. Are simulated errors inadequate?

Wei Wei

We haven’t forgotten about you, @wei_wei. Thanks for your patience!

Hi @wei_wei, thanks for your patience here! Looping in Ben Callahan (the author of DADA2) on this.

@benjjneb, can you confirm whether dada2 can be used on simulated reads if they contain simulated errors - or is this no longer supported?

Simulated reads/errors can be used with DADA2… if the simulated errors are reasonably close to what might come off a real sequencing instrument. In particular, there should be a range of quality scores and not just one quality score for correct bases, and another for incorrect bases. This post from the DADA2 issues tracker might help:

There’s no issue with the number of bacterial strains, however there may be an issue with the noise you are adding. The default error estimation function uses a loess fit, and basically assumes that errors are “normal” in the sense that they occur over a range of quality scores and in sufficient numbers for the loess smoothing to work.
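As a rough illustration of what “a range of quality scores” means numerically: assuming standard Phred scaling, a quality score Q corresponds to an error probability of 10^(-Q/10). The sketch below converts a few scores; the function name is made up for illustration and is not part of DADA2 or QIIME 2.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the error probability it encodes."""
    return 10 ** (-q / 10)


# A handful of scores spread across the usual range (illustrative values only)
for q in (10, 20, 30, 40):
    print(q, phred_to_error_prob(q))
```

If the simulated data only ever use two distinct scores, there are only two points along this axis for a smoothed fit to work with.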

I don’t know how you are adding errors by hand, but if it’s pathological (say, all the errors are of type T->C, or all the errors have quality score 22), the default error estimator will perform poorly.

That can be fixed by giving the algorithm the correct error model. Perhaps easier, and more to your goals, it might make more sense to just use a method that adds errors that look like those the sequencing machines make. See for example the ART program: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3278762/
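One quick way to check whether simulated reads have the problem described above is to tally the quality scores in the FASTQ file before denoising. This is only a minimal sketch: it assumes Phred+33 encoding and strict four-line FASTQ records (no wrapped sequence lines), and the function name and in-memory example are made up for illustration.

```python
from collections import Counter
from io import StringIO


def quality_score_counts(handle, offset=33):
    """Count how often each Phred quality score appears in a FASTQ stream.

    Assumes four lines per record and the given ASCII offset (33 by default).
    """
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 3:  # every fourth line holds the quality string
            counts.update(ord(ch) - offset for ch in line.rstrip("\n"))
    return counts


# Tiny in-memory example that uses only two distinct quality characters
example = StringIO("@r1\nACGT\n+\nIII5\n@r2\nACGT\n+\nI5II\n")
counts = quality_score_counts(example)
print(sorted(counts.items()))  # [(20, 2), (40, 6)]
```

For a real file, pass an open file handle instead of the StringIO object (for gzipped files, one opened in text mode with gzip.open). If the tally shows only one or two distinct scores, the reads probably fall into the “pathological” case described above.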

Hi team,

Thank you! I really appreciate your help. I also discovered on my side that when I reduced the read length from 200 to 100, I was able to get dada2 to work. I’m not sure what the exact limiting factor was, though. Maybe it is due to fewer error occurrences?
Unfortunately, as far as I know, grinder only allows two quality scores, one good and one bad. But I will look into ART. Thank you so much!

Wei Wei
