Plugin error from dada2: An error was encountered while running DADA2 in R (return code 1).

Hello everyone,
I'm trying to run dada2 denoise-paired on already demultiplexed NovaSeq 6000 150 PE 12S sequences (amplicon size ca. 100 bp) from fishes. I think I have already seen some topics about this issue. From what I understand, it has something to do with the new binned quality scores of the NovaSeq sequencer.
Has anyone already tackled this issue in a reasonably straightforward way?

This is the command I ran:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs 12s_demu.qza \
  --p-trunc-len-f 45 \
  --p-trunc-len-r 45 \
  --o-representative-sequences 12s-rep-seqs-dada2.qza \
  --o-table 12s-table-dada2.qza \
  --o-denoising-stats 12s-stats-dada2.qza \
  --verbose

The error output I get:
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada.R --input_directory /tmp/tmpjp6xn60c/forward --input_directory_reverse /tmp/tmpjp6xn60c/reverse --output_path /tmp/tmpjp6xn60c/output.tsv.biom --output_track /tmp/tmpjp6xn60c/track.tsv --filtered_directory /tmp/tmpjp6xn60c/filt_f --filtered_directory_reverse /tmp/tmpjp6xn60c/filt_r --truncation_length 45 --truncation_length_reverse 45 --trim_left 0 --trim_left_reverse 0 --max_expected_errors 2.0 --max_expected_errors_reverse 2.0 --truncation_quality_score 2 --min_overlap 12 --pooling_method independent --chimera_method consensus --min_parental_fold 1.0 --allow_one_off False --num_threads 1 --learn_min_reads 1000000

R version 4.2.3 (2023-03-15)
Loading required package: Rcpp
DADA2: 1.26.0 / Rcpp: 1.0.10 / RcppParallel: 5.1.6
2) Filtering .................
3) Learning Error Rates
47639430 total bases in 1058654 reads from 6 samples will be used for learning the error rates.
47639430 total bases in 1058654 reads from 6 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction,
selfConsist = TRUE, multithread = multithread, verbose = verbose,
MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filtsR, nreads = nreads.learn, multithread = multithread)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w,
classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filtsR, nreads = nreads.learn, multithread = multithread))
Traceback (most recent call last):
File "/home/PERSONALE/alex.cussigh2/.conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 326, in denoise_paired
run_commands([cmd])
File "/home/PERSONALE/alex.cussigh2/.conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
subprocess.run(cmd, check=True)
File "/home/PERSONALE/alex.cussigh2/.conda/envs/qiime2-2023.5/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/tmp/tmpjp6xn60c/forward', '--input_directory_reverse', '/tmp/tmpjp6xn60c/reverse', '--output_path', '/tmp/tmpjp6xn60c/output.tsv.biom', '--output_track', '/tmp/tmpjp6xn60c/track.tsv', '--filtered_directory', '/tmp/tmpjp6xn60c/filt_f', '--filtered_directory_reverse', '/tmp/tmpjp6xn60c/filt_r', '--truncation_length', '45', '--truncation_length_reverse', '45', '--trim_left', '0', '--trim_left_reverse', '0', '--max_expected_errors', '2.0', '--max_expected_errors_reverse', '2.0', '--truncation_quality_score', '2', '--min_overlap', '12', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '1', '--learn_min_reads', '1000000']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/PERSONALE/alex.cussigh2/.conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2cli/commands.py", line 468, in call
results = action(**arguments)
File "", line 2, in denoise_paired
File "/home/PERSONALE/alex.cussigh2/.conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 274, in bound_callable
outputs = self.callable_executor(
File "/home/PERSONALE/alex.cussigh2/.conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 509, in callable_executor
output_views = self._callable(**view_args)
File "/home/PERSONALE/alex.cussigh2/.conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 339, in denoise_paired
raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.

Thank you so much for the attention,
Alex

Hi @alexcussigh0,

The key lies in this line of the traceback:

Error rates could not be estimated (this is usually because of very few reads).

It looks like too many of your reads are being filtered out, which is not allowing for an error matrix to be created within DADA2. I would recommend easing up on your trunc-len param, because this is what's filtering out all of your reads - you will essentially never see amplicons with a bp length that short. If you're unsure of where to trim or truncate your reads, I'd recommend looking at the interactive quality plot from demux summarize to make an informed decision.
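To make the effect of the truncation length concrete, here is a minimal sketch in plain Python. It assumes the documented DADA2 behaviour that reads shorter than the truncation length are discarded and that a value of 0 disables truncation. The function name `reads_surviving_truncation` and the read lengths are hypothetical, made up for illustration; they are not taken from the actual data.

```python
def reads_surviving_truncation(read_lengths, trunc_len):
    """Count reads that would survive a fixed truncation length.

    Assumption: reads shorter than trunc_len are discarded, and
    trunc_len == 0 means no truncation and no length-based filtering.
    """
    if trunc_len == 0:
        return len(read_lengths)
    return sum(1 for length in read_lengths if length >= trunc_len)


# Hypothetical read lengths, for illustration only.
example_lengths = [150, 37, 36, 40, 150, 35, 38, 37, 44, 150]

for trunc_len in (0, 35, 45, 100):
    kept = reads_surviving_truncation(example_lengths, trunc_len)
    print(f"trunc_len={trunc_len}: {kept}/{len(example_lengths)} reads kept")
```

The real length distribution is what the interactive quality plot from demux summarize reports, so that is the place to read off the actual numbers before picking a value.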

Cheers :lizard:

Before this, I had already run the command with trunc-len 0 and got the same error :confused:

@alexcussigh0 in that case, it sounds like the issue is with your data rather than with your filtering. Can you share your demux.qza file (either in your response or in a direct message, whichever you prefer)?

@lizgehret thank you so much!
Here is the link to the demux file

Hello @alexcussigh0,

Unfortunately it looks like @lizgehret was right. If you look at your demux visualization, 12s_demux.qzv, you can see that the reads have been heavily trimmed. Your median read length is ~37 nt for both forward and reverse, far too short to merge. The quality plots show a steep drop-off in read quality around this position. My guess is that this was essentially a failed run, and the sequencing center quality-trimmed most of the reads and gave you what was left over.

If you are determined to move forward with these data, more quality control will be necessary, but you will probably only be able to use about 10% of the sequences.
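The "too short to merge" point can be checked with simple arithmetic. The sketch below is only an illustration: it takes the amplicon length as roughly 100 bp (the approximate figure from the first post) and uses the minimum overlap of 12 shown in the run_dada.R command line above. The helper names `expected_overlap` and `can_merge` are hypothetical, and real merging also depends on mismatches in the overlap, which this ignores.

```python
def expected_overlap(fwd_len, rev_len, amplicon_len):
    """Bases shared by the forward and reverse reads of one amplicon.

    Reads longer than the amplicon are capped at the amplicon length,
    since they cannot cover more template than exists. A negative
    result means there is a gap between the two reads.
    """
    fwd = min(fwd_len, amplicon_len)
    rev = min(rev_len, amplicon_len)
    return fwd + rev - amplicon_len


def can_merge(fwd_len, rev_len, amplicon_len, min_overlap=12):
    """True if the expected overlap reaches the required minimum."""
    return expected_overlap(fwd_len, rev_len, amplicon_len) >= min_overlap


AMPLICON_LEN = 100  # approximate value from the original post

# (forward, reverse) read lengths: full-length reads, the truncation
# used in the original command, and the observed median read length.
for fwd, rev in [(150, 150), (45, 45), (37, 37)]:
    overlap = expected_overlap(fwd, rev, AMPLICON_LEN)
    print(fwd, rev, overlap, can_merge(fwd, rev, AMPLICON_LEN))
```

Under these assumptions, two 37 nt reads leave a gap of 26 bases on a 100 bp amplicon, so there is nothing for the merging step to align.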

Thanks @colinvwood,
somehow I managed to carry on with the analysis by setting --p-trim-left-f and --p-trim-left-r to 45, which removed all the Ns present at the beginning of the reads!
The results are still underwhelming, haha
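For anyone wanting to check how many leading bases are Ns before picking a trim-left value, here is a minimal sketch in plain Python. The sequences are invented for illustration, the helper names are hypothetical, and it rests on the assumption that reads still containing an N after trimming are dropped by DADA2's filtering step.

```python
def leading_n_count(seq):
    """Length of the run of N bases at the start of a read."""
    return len(seq) - len(seq.lstrip("Nn"))


def min_trim_left(seqs):
    """Smallest trim-left value that removes every leading N run."""
    return max((leading_n_count(s) for s in seqs), default=0)


def has_n_after_trim(seq, trim_left):
    """True if the read still contains an N once trim_left bases are removed."""
    return "N" in seq[trim_left:].upper()


# Invented reads, for illustration only.
reads = ["NNNNACGTACGT", "NNACGTTGCA", "ACGTNACGT"]

trim = min_trim_left(reads)
print("suggested trim-left:", trim)
for read in reads:
    print(read, "-> still has N:", has_n_after_trim(read, trim))
```

Trimming only handles Ns at the start of a read; an N further along, as in the third invented read, is not removed by a small trim-left value.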
