DADA2 failed while running NextSeq data

Hi! I usually work with MiSeq data in QIIME2 with no issues. I am running my first batch of NextSeq data and encountered a dada2 error that I have not seen before. I have searched the forum for discussions about running NextSeq data through QIIME2, and the overall consensus seems to be that it should work, though I couldn't find any answers about how to handle the binned quality scores. Details below:

Version of QIIME2: qiime2-amplicon-2024.10, installed via conda

Command I ran:

qiime dada2 denoise-paired --i-demultiplexed-seqs /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/04_qiime_import/kw1_v6v8bac_paired_end_demux.qza --o-table /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run1/feature_table.qza --o-denoising-stats /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run1/denoise_stats.qza --o-representative-sequences /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run1/rep_seqs.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 300 --p-trunc-len-r 300 --p-n-threads 0 --output-dir /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2 --verbose

Output and error message:

R version 4.3.3 (2024-02-29) 
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.13.1 / RcppParallel: 5.1.9 
2) Filtering ......................................
3) Learning Error Rates
595317300 total bases in 1984391 reads from 1 samples will be used for learning the error rates.
595317300 total bases in 1984391 reads from 1 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction, 
       selfConsist = TRUE, multithread = multithread, verbose = verbose, 
       MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filtsR, nreads = nreads.learn, multithread = multithread)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
       classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filtsR, nreads = nreads.learn, multithread = multithread))
Traceback (most recent call last):
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 353, in denoise_paired
    run_commands([cmd])
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 38, in run_commands
    subprocess.run(cmd, check=True)
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmplr3uhdak/forward', '--input_directory_reverse', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmplr3uhdak/reverse', '--output_path', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmplr3uhdak/output.tsv.biom', '--output_track', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmplr3uhdak/track.tsv', '--filtered_directory', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmplr3uhdak/filt_f', '--filtered_directory_reverse', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmplr3uhdak/filt_r', '--truncation_length', '300', '--truncation_length_reverse', '300', '--trim_left', '0', '--trim_left_reverse', '0', '--max_expected_errors', '2.0', '--max_expected_errors_reverse', '2.0', '--truncation_quality_score', '2', '--min_overlap', '12', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '0', '--learn_min_reads', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
    results = self._execute_action(
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-49>", line 2, in denoise_paired
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
    outputs = self._callable_executor_(
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 366, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

  An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.

If it's helpful, here are the quality plots:
kw1_v6v8bac_paired_end_demux.qzv (309.3 KB)

Thanks for your time!

Hello Rachel,

Welcome to the forums!

Thank you for including ALL the details in your post. This is very helpful!

I think these two things are related: the binned quality scores you mentioned and the "Error rates could not be estimated" error.

Take a look at this thread from the DADA2 developer: NovaSeq and Dada2 incompatibility. - #12 by benjjneb

Also check out the alternative methods outlined here: NovaSeq data in QIIME2

Keep us posted on your progress and what you try next!

Hi @colinbrislawn,

Whoa, thanks for the speedy reply!

Thanks for posting the links, but I had already come across those. However, one of those topics had a link to a GitHub thread with tons of info (dropping here for anyone else who encounters this issue): Consequences of using dada2 on NovaSeq data · Issue #791 · benjjneb/dada2 · GitHub

I have some ideas about how to move forward, but first I'm wondering whether you know of any way to run this data using dada2 within QIIME2 (as opposed to individual steps in R)? From what I've read, the dada2 wrapper in QIIME2 should work with NextSeq data, but mine keeps bombing and I'm not sure what else to change.

Maybe something to implement in the next QIIME2 release?

Hello!

I work with NovaSeq data very often and binned quality scores were never an issue...

It looks like the error you are getting is related to the fact that only a few reads survive the truncation settings you provided. Please consider lowering the truncation length from 300 to 280 and trying again.

Best,

@timanix thanks, we are on the same page!

Before you replied, I amended the truncation to 250 for F and R reads, just to see what happens. It's still running now, I'll follow up with results!

Well, that did not work.

I am doing test runs of dada2 on 2 different machines, to maximize my troubleshooting capacity, and so far dada2 has failed on both, even with reduced truncation values. Here are the specs if helpful:

Mac Pro 2019:
12 cores
96 GB memory
macOS 14.6.1

MacBook Air 2023 (M2):
8 cores
24 GB memory
macOS 14.5

On the MacBook Air, I suspect I ran into a memory issue and/or maxed out the cores/CPU, details below.

Command I ran:

qiime dada2 denoise-paired --i-demultiplexed-seqs /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/04_qiime_import/kw1_v6v8bac_paired_end_demux.qza --o-table /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/KW1_feature_table.qza --o-denoising-stats /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/KW1_denoise_stats.qza --o-representative-sequences /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/KW1_rep_seqs.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 275 --p-trunc-len-r 275 --p-n-threads 0 --output-dir /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2 --verbose

Output and error messages:

R version 4.3.3 (2024-02-29) 
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.13 / RcppParallel: 5.1.9 
2) Filtering ......................................
3) Learning Error Rates
549751400 total bases in 1999096 reads from 1 samples will be used for learning the error rates.
R(75970,0x30a87b000) malloc: Heap corruption detected, free list is damaged at 0x60000242c090
*** Incorrect guard value: 128073017101516800
R(75970,0x30a87b000) malloc: *** set a breakpoint in malloc_error_break to debug
R(75970,0x30946c000) malloc: Heap corruption detected, free list is damaged at 0x600002431910
*** Incorrect guard value: 77
R(75970,0x30946c000) malloc: *** set a breakpoint in malloc_error_break to debug
R(75970,0x2025a6240) malloc: Heap corruption detected, free list is damaged at 0x60000242bfd0
*** Incorrect guard value: 50947743890145452
R(75970,0x2025a6240) malloc: *** set a breakpoint in malloc_error_break to debug
Traceback (most recent call last):
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 353, in denoise_paired
    run_commands([cmd])
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 38, in run_commands
    subprocess.run(cmd, check=True)
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpmzucmxc4/forward', '--input_directory_reverse', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpmzucmxc4/reverse', '--output_path', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpmzucmxc4/output.tsv.biom', '--output_track', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpmzucmxc4/track.tsv', '--filtered_directory', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpmzucmxc4/filt_f', '--filtered_directory_reverse', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpmzucmxc4/filt_r', '--truncation_length', '275', '--truncation_length_reverse', '275', '--trim_left', '0', '--trim_left_reverse', '0', '--max_expected_errors', '2.0', '--max_expected_errors_reverse', '2.0', '--truncation_quality_score', '2', '--min_overlap', '12', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '0', '--learn_min_reads', '1000000']' died with <Signals.SIGABRT: 6>.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
    results = self._execute_action(
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-49>", line 2, in denoise_paired
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
    outputs = self._callable_executor_(
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 366, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code -6), please inspect stdout and stderr to learn more.

Plugin error from dada2:

  An error was encountered while running DADA2 in R (return code -6), please inspect stdout and stderr to learn more.

See above for debug info.

On the Mac Pro, it flagged "too few reads" for estimating error rates, even though the file sizes in the temp directory suggest that plenty of reads are passing filtering:

Command:

qiime dada2 denoise-paired --i-demultiplexed-seqs /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/04_qiime_import/kw1_v6v8bac_paired_end_demux.qza --o-table /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/feature_table.qza --o-denoising-stats /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/denoise_stats.qza --o-representative-sequences /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/rep_seqs.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 250 --p-trunc-len-r 250 --p-n-threads 0 --output-dir /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2 --verbose

Output and error message:

R version 4.3.3 (2024-02-29) 
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.13.1 / RcppParallel: 5.1.9 
2) Filtering ......................................
3) Learning Error Rates
502958000 total bases in 2011832 reads from 1 samples will be used for learning the error rates.
502958000 total bases in 2011832 reads from 1 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction, 
       selfConsist = TRUE, multithread = multithread, verbose = verbose, 
       MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filtsR, nreads = nreads.learn, multithread = multithread)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
       classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filtsR, nreads = nreads.learn, multithread = multithread))
Traceback (most recent call last):
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 353, in denoise_paired
    run_commands([cmd])
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 38, in run_commands
    subprocess.run(cmd, check=True)
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmpmpof81yq/forward', '--input_directory_reverse', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmpmpof81yq/reverse', '--output_path', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmpmpof81yq/output.tsv.biom', '--output_track', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmpmpof81yq/track.tsv', '--filtered_directory', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmpmpof81yq/filt_f', '--filtered_directory_reverse', '/var/folders/cf/1vff8nyj3kb6pv21v7dxdlqsh9kcrx/T/tmpmpof81yq/filt_r', '--truncation_length', '250', '--truncation_length_reverse', '250', '--trim_left', '0', '--trim_left_reverse', '0', '--max_expected_errors', '2.0', '--max_expected_errors_reverse', '2.0', '--truncation_quality_score', '2', '--min_overlap', '12', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '0', '--learn_min_reads', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
    results = self._execute_action(
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-49>", line 2, in denoise_paired
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
    outputs = self._callable_executor_(
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 366, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

  An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.
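
A quick arithmetic check on the "Learning Error Rates" lines in the logs above may help here. Dividing total bases by reads should give back the truncation length, and the read counts show that roughly two million reads went into error learning in each failed run. Each log also says "from 1 samples", which suggests a single sample already exceeded the --learn_min_reads value of 1000000, so the "very few reads" hint in the R message is probably not the literal cause. A minimal sketch (the helper name bases_per_read is made up for illustration):

```shell
#!/bin/sh
# Numbers are copied from the "total bases in N reads" lines in the logs above.
# bases_per_read is a hypothetical helper, not part of QIIME 2 or DADA2.
bases_per_read() {
    # $1 = total bases, $2 = number of reads
    echo $(( $1 / $2 ))
}

bases_per_read 595317300 1984391   # run with truncation length 300
bases_per_read 549751400 1999096   # run with truncation length 275
bases_per_read 502958000 2011832   # run with truncation length 250
```

The three calls print 300, 275 and 250, matching the truncation lengths used, so the filtered reads look intact at least in terms of length and count.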

Currently, I am re-running on both machines with decreased threads (8 on MacBook and 12 on Mac Pro). For the Mac Pro run, I also changed truncation to 200 for F and R reads, and set --p-trunc-q to 0.

I still think the issue is related to the binned quality scores/error model. I'm wondering if I should bump up the --p-n-reads-learn value?

Thanks!

Hello!
Thank you for the updates! I was curious about your dataset.
I took another look at your demux.qzv file and realized that I don't like the look of the reverse reads at all.
Based on the file's name, you are targeting the V6-V8 region. This region is long, but the reads should still overlap given that you sequenced it with 2x300.
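
As a rough check on whether truncated reads can still merge: the two truncation lengths must add up to at least the amplicon length plus the minimum overlap (the logs above show --min_overlap 12). A small sketch, assuming a hypothetical amplicon length of 450 bp, since the actual V6-V8 amplicon length for these primers is not stated in this thread:

```shell
#!/bin/sh
# overlap_after_truncation and check_pair are hypothetical helpers for illustration.
# $1 = forward truncation length, $2 = reverse truncation length,
# $3 = amplicon length in bp (ASSUMED; substitute the real value for your primers)
overlap_after_truncation() {
    echo $(( $1 + $2 - $3 ))
}

MIN_OVERLAP=12      # matches '--min_overlap', '12' in the logs above
AMPLICON_LEN=450    # assumption, not taken from the thread

check_pair() {
    ov=$(overlap_after_truncation "$1" "$2" "$AMPLICON_LEN")
    if [ "$ov" -ge "$MIN_OVERLAP" ]; then
        echo "trunc $1/$2: overlap $ov bp, enough to merge"
    else
        echo "trunc $1/$2: overlap $ov bp, too short to merge"
    fi
}

check_pair 300 300
check_pair 250 250
check_pair 200 200
```

Under that assumed length, truncating both reads to 200 would leave no overlap at all, which is worth keeping in mind before cutting the reverse reads much further.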

Could you also try the following options?

  • Increase max-ee for reverse reads (--p-max-ee-r)
  • Run DADA2 on forward reads only (no need to reimport; just run denoise-single on the same paired-end artifact).
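
For context on the first option: DADA2's filter discards a read when its expected errors, the sum of 10^(-Q/10) over its bases after truncation, exceed max-ee (2.0 by default, per the logs above). With binned quality scores, a handful of low-bin bases can dominate that sum. A minimal sketch using awk; the quality values are invented for illustration and are not read from this dataset:

```shell
#!/bin/sh
# expected_errors is a hypothetical helper: it reads space-separated Phred
# scores on stdin and prints the expected number of errors, sum(10^(-Q/10)).
expected_errors() {
    awk '{ ee = 0; for (i = 1; i <= NF; i++) ee += 10 ^ (-$i / 10); printf "%.2f\n", ee }'
}

# Ten bases at Q10 each carry a 0.1 error probability.
echo "10 10 10 10 10 10 10 10 10 10" | expected_errors

# Illustrative stretch: mostly high-bin scores plus three low-bin bases.
echo "37 37 37 37 12 12 12 37 37 37" | expected_errors
```

In the second example nearly all of the expected error comes from the three Q12 bases, which is why a long, low-quality reverse tail can push many reads over the max-ee threshold.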

@timanix sure thing, I will try these options and report back.

I am back with partial success. TL;DR: increasing max-ee for reverse reads did not work with paired-end reads. Running only forward reads worked (finally!), but a large portion of reads dropped out at the chimera removal step.

For increasing the max-ee for reverse reads, here's what I ran:

qiime dada2 denoise-paired --i-demultiplexed-seqs /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/04_qiime_import/kw1_v6v8bac_paired_end_demux.qza --o-table /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/KW1_feature_table.qza --o-denoising-stats /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/KW1_denoise_stats.qza --o-representative-sequences /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/KW1_rep_seqs.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 250 --p-trunc-len-r 250 --p-max-ee-r 5 --p-n-threads 8 --output-dir /Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2 --verbose

And here's the output and error (same as before: "very few reads"):

R version 4.3.3 (2024-02-29) 
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.13 / RcppParallel: 5.1.9 
2) Filtering ......................................
3) Learning Error Rates
513567500 total bases in 2054270 reads from 1 samples will be used for learning the error rates.
513567500 total bases in 2054270 reads from 1 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction, 
       selfConsist = TRUE, multithread = multithread, verbose = verbose, 
       MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filtsR, nreads = nreads.learn, multithread = multithread)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
       classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filtsR, nreads = nreads.learn, multithread = multithread))
Traceback (most recent call last):
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 353, in denoise_paired
    run_commands([cmd])
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 38, in run_commands
    subprocess.run(cmd, check=True)
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpm2nyd8ho/forward', '--input_directory_reverse', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpm2nyd8ho/reverse', '--output_path', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpm2nyd8ho/output.tsv.biom', '--output_track', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpm2nyd8ho/track.tsv', '--filtered_directory', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpm2nyd8ho/filt_f', '--filtered_directory_reverse', '/Users/rmugge/Documents/QIIME2/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpm2nyd8ho/filt_r', '--truncation_length', '250', '--truncation_length_reverse', '250', '--trim_left', '0', '--trim_left_reverse', '0', '--max_expected_errors', '2.0', '--max_expected_errors_reverse', '5', '--truncation_quality_score', '2', '--min_overlap', '12', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '8', '--learn_min_reads', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
    results = self._execute_action(
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-49>", line 2, in denoise_paired
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
    outputs = self._callable_executor_(
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/Users/rmugge/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 366, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

  An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.

For forward reads only, I did 2 runs using different parameters. Here is "run 2":

Input:

qiime dada2 denoise-single --i-demultiplexed-seqs /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/04_qiime_import/kw1_v6v8bac_paired_end_demux.qza --o-table /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/feature_table.qza --o-denoising-stats /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/denoise_stats.qza --o-representative-sequences /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/rep_seqs.qza --p-trim-left 0 --p-trunc-len 280 --p-n-threads 12 --output-dir /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2 --verbose

Output:

R version 4.3.3 (2024-02-29) 
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.13.1 / RcppParallel: 5.1.9 
2) Filtering ......................................
3) Learning Error Rates
697275600 total bases in 2490270 reads from 1 samples will be used for learning the error rates.
4) Denoise samples 
......................................
5) Remove chimeras (method = consensus)
6) Report read numbers through the pipeline
7) Write output
Saved FeatureTable[Frequency] to: /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/feature_table.qza
Saved FeatureData[Sequence] to: /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/rep_seqs.qza
Saved SampleData[DADA2Stats] to: /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run2/denoise_stats.qza

Here are the denoising stats:
denoise_stats_run2.qzv (1.2 MB)

As you can see, a good portion of reads pass the filtering and denoising steps, and then quite a few are lost at the chimera removal step. Based on this output, I tried to increase the number of reads retained by changing --p-chimera-method to pooled instead of consensus ("run 3"). This somehow resulted in worse output:
denoise_stats_run3.qzv (1.2 MB)
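
To put a number on the loss at the chimera step, the stats artifact can be exported (for example with `qiime tools export`) and summarized per sample. A sketch with awk, assuming the exported TSV has `denoised` and `non-chimeric` columns and a `#q2:types` line under the header; the sample row below is invented, not taken from the .qzv files above:

```shell
#!/bin/sh
# chimera_loss is a hypothetical helper. It reads a tab-separated stats table
# and prints, per sample, the percentage of denoised reads removed as chimeras.
# Columns are looked up by header name (assumed: "denoised", "non-chimeric").
chimera_loss() {
    awk -F '\t' '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        /^#/    { next }    # skip the q2:types annotation line
        $col["denoised"] > 0 {
            lost = 100 * (1 - $col["non-chimeric"] / $col["denoised"])
            printf "%s\t%.1f\n", $1, lost
        }
    ' "$1"
}

# Invented example row, only to show the expected input shape.
stats=$(mktemp)
printf 'sample-id\tinput\tfiltered\tdenoised\tnon-chimeric\n' >  "$stats"
printf '#q2:types\tnumeric\tnumeric\tnumeric\tnumeric\n'       >> "$stats"
printf 'S1\t100000\t80000\t78000\t39000\n'                     >> "$stats"
chimera_loss "$stats"
rm -f "$stats"
```

Comparing this percentage between run 2 and run 3 would show whether the pooled method removed more reads everywhere or only in a few samples.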

Going back to paired-end reads, I also tried increasing --p-n-reads-learn to 2M (along with looser max-ee values and a shorter reverse truncation), and that also bombed, with the same error:

Input:

qiime dada2 denoise-paired --i-demultiplexed-seqs /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/04_qiime_import/kw1_v6v8bac_paired_end_demux.qza --o-table /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run4/feature_table.qza --o-denoising-stats /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run4/denoise_stats.qza --o-representative-sequences /Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/05_dada2_run4/rep_seqs.qza --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 250 --p-trunc-len-r 200 --p-n-reads-learn 2000000 --p-max-ee-f 4 --p-max-ee-r 5 --p-n-threads 20 --verbose

Output and error:

R version 4.3.3 (2024-02-29) 
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.13.1 / RcppParallel: 5.1.9 
2) Filtering ......................................
3) Learning Error Rates
547219750 total bases in 2188879 reads from 1 samples will be used for learning the error rates.
437775800 total bases in 2188879 reads from 1 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction, 
       selfConsist = TRUE, multithread = multithread, verbose = verbose, 
       MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filtsR, nreads = nreads.learn, multithread = multithread)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
       classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filtsR, nreads = nreads.learn, multithread = multithread))
Traceback (most recent call last):
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 353, in denoise_paired
    run_commands([cmd])
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 38, in run_commands
    subprocess.run(cmd, check=True)
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpi6rsjwr7/forward', '--input_directory_reverse', '/Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpi6rsjwr7/reverse', '--output_path', '/Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpi6rsjwr7/output.tsv.biom', '--output_track', '/Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpi6rsjwr7/track.tsv', '--filtered_directory', '/Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpi6rsjwr7/filt_f', '--filtered_directory_reverse', '/Volumes/Pegasus32_R8/QIIME2_KW1_022025/pipeline_V6V8_bac/tmpi6rsjwr7/filt_r', '--truncation_length', '250', '--truncation_length_reverse', '200', '--trim_left', '0', '--trim_left_reverse', '0', '--max_expected_errors', '4', '--max_expected_errors_reverse', '5', '--truncation_quality_score', '2', '--min_overlap', '12', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '20', '--learn_min_reads', '2000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
    results = self._execute_action(
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-49>", line 2, in denoise_paired
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
    outputs = self._callable_executor_(
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/Users/rmugge/.conda/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 366, in denoise_paired
    raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

  An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.

I have asked our sequencing facility to run DADA2 so we can compare output. I am hoping nothing is wrong with the sequences, but I'm not sure. At this point, it looks like I either need to learn to run DADA2 in R so I can enforce monotonicity, as others have suggested, or try running Deblur instead. I am open to other suggestions if you have any!

Hello!
At least there is some progress! I am sorry that it is taking so long to get it working.

Personally, I didn't find big differences when doing that with NextSeq data, so I am not sure it will fix the issue, but maybe it will work in your case.

I would try the following:

  • Run DADA2 on F reads with an increased --p-min-fold-parent-over-abundance. Check post here.
  • Merge reads with vsearch and run Deblur
  • Use only F reads with Deblur
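
The first two options above could be sketched roughly as follows. This is a dry-run sketch under assumptions, not a tested recipe: the value 4 for --p-min-fold-parent-over-abundance, the trim length 400, and the output directory names are all placeholders, and the `run` wrapper is a made-up helper that only prints commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (needs an activated QIIME 2 environment).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

DEMUX=kw1_v6v8bac_paired_end_demux.qza   # paired-end artifact named in the thread

# Option 1: DADA2 on forward reads with a more permissive chimera threshold.
# The value 4 is illustrative only; the default seen in the logs is 1.0.
dada2_forward() {
    run qiime dada2 denoise-single \
        --i-demultiplexed-seqs "$DEMUX" \
        --p-trunc-len 280 \
        --p-min-fold-parent-over-abundance 4 \
        --p-n-threads 12 \
        --output-dir dada2_forward_minfold4 \
        --verbose
}

# Option 2: merge pairs with vsearch, quality-filter, then Deblur.
# The trim length 400 is a placeholder; pick it from a summary of the merged reads.
merge_then_deblur() {
    run qiime vsearch merge-pairs \
        --i-demultiplexed-seqs "$DEMUX" \
        --output-dir merged &&
    run qiime quality-filter q-score \
        --i-demux merged/merged_sequences.qza \
        --output-dir merged_filtered &&
    run qiime deblur denoise-16S \
        --i-demultiplexed-seqs merged_filtered/filtered_sequences.qza \
        --p-trim-length 400 \
        --output-dir deblur_merged
}

dada2_forward
merge_then_deblur
```

For the third option (forward reads only with Deblur), the last two steps can probably be pointed at the paired-end artifact directly, but it is worth confirming the accepted input types with `qiime deblur denoise-16S --help` first.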

Best,

@timanix thank you! It did not cross my mind to merge the reads and then run through Deblur...I will try each of these options and post the results!
