Exception: An error was encountered while running DADA2 in R (return code 1)

Hi, I’m running these lines in QIIME 2:

qiime dada2 denoise-single \
  --i-demultiplexed-seqs demux.qza \
  --p-trim-left 0 \
  --p-trunc-len 0 \
  --o-representative-sequences rep-seqs.qza \
  --o-table table.qza \
  --o-denoising-stats denoising-stats.qza \
  --p-n-threads 16

Warning message:
package ‘optparse’ was built under R version 4.2.3
R version 4.2.2 (2022-10-31)
Loading required package: Rcpp
DADA2: 1.26.0 / Rcpp: 1.0.12 / RcppParallel: 5.1.6
2) Filtering ..........................
3) Learning Error Rates
155833747 total bases in 1032144 reads from 24 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction,
selfConsist = TRUE, multithread = multithread, verbose = verbose,
MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filts, nreads = nreads.learn, multithread = multithread,
HOMOPOLYMER_GAP_PENALTY = HOMOPOLYMER_GAP_PENALTY, BAND_SIZE = BAND_SIZE)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w,
classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filts, nreads = nreads.learn, multithread = multithread,
HOMOPOLYMER_GAP_PENALTY = HOMOPOLYMER_GAP_PENALTY, BAND_SIZE = BAND_SIZE))
Traceback (most recent call last):
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 240, in _denoise_single
run_commands([cmd])
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 37, in run_commands
subprocess.run(cmd, check=True)
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/tmp/qiime2/ubuntu/data/271d51ca-00d4-4639-83f9-eb2f41f62cd6/data', '--output_path', '/tmp/tmptq9f3lol/output.tsv.biom', '--output_track', '/tmp/tmptq9f3lol/track.tsv', '--filtered_directory', '/tmp/tmptq9f3lol', '--truncation_length', '0', '--trim_left', '0', '--max_expected_errors', '2.0', '--truncation_quality_score', '2', '--max_length', 'Inf', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '1', '--learn_min_reads', '1000000', '--homopolymer_gap_penalty', 'NULL', '--band_size', '16']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2cli/commands.py", line 520, in call
results = self._execute_action(
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2cli/commands.py", line 581, in _execute_action
results = action(**arguments)
File "", line 2, in denoise_single
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
outputs = self.callable_executor(
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 566, in callable_executor
output_views = self._callable(**view_args)
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 266, in denoise_single
return _denoise_single(
File "/data/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 249, in _denoise_single
raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Hi @Achraf_Zbaida

This is the part of the error that I think is really informative:

Error rates could not be estimated (this is usually because of very few reads).

It seems that there are not enough reads to estimate error rates. I think that you have enough reads (1032144 reads from 24 samples), but they are possibly not making it through filtering.

I notice that you didn't trim your sequences.

Do you have good sequence quality throughout your entire sequence length? Would you mind attaching a picture of your interactive quality plot from your demux.qza, so I can take a look?

This error could be a couple of things, but my first thought is that not enough sequences are making it past filtering. Truncating more conservatively might let more sequences through.
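To make the filtering step concrete, here is a minimal sketch of the expected-error calculation that a max-expected-errors cutoff is based on, EE = sum(10^(-Q/10)) over the bases of a read. It is an illustration and not DADA2's own code: the function names are hypothetical, and it assumes Phred+33 encoded quality strings.

```python
def expected_errors(quality: str, phred_offset: int = 33) -> float:
    """Sum the per-base error probabilities: EE = sum(10 ** (-Q / 10)).

    Assumes Phred+33 encoding unless another offset is passed.
    """
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality)


def passes_max_ee(quality: str, max_ee: float = 2.0) -> bool:
    """True if the read's expected errors do not exceed the cutoff."""
    return expected_errors(quality) <= max_ee


if __name__ == "__main__":
    # '?' is Q30 in Phred+33; a 150-base read at Q30 throughout.
    read_quality = "?" * 150
    print(expected_errors(read_quality), passes_max_ee(read_quality))
```

For example, a 150-base read at Q30 throughout has about 0.15 expected errors, well under the 2.0 cutoff shown in the log (--max_expected_errors 2.0). Reads with long low-quality tails are the ones that exceed it, which is why truncation can rescue them.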

Thanks!
:turtle:

Thank you for your response.
Here is the interactive quality plot:

Hi @Achraf_Zbaida,
So it's not quality; you have very high-quality sequences.

I am a little surprised by your quality plot. Usually these show some variation in sequence quality, but yours seems to be completely stable at 30. Is this surprising to you?
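If you want to double-check this outside the plot, one option is to tally the distinct quality scores directly from one of the FASTQ files. This is only a sketch under assumptions: it expects a standard 4-line-per-record FASTQ with Phred+33 encoding (optionally gzipped), the function name is hypothetical, and you would first need the FASTQ files themselves (for example, by exporting demux.qza).

```python
import gzip
from collections import Counter


def tally_quality_scores(fastq_path, phred_offset=33, max_reads=10000):
    """Count how often each Phred score occurs in the first max_reads reads.

    Assumes 4-line FASTQ records and Phred+33 encoding by default.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number // 4 >= max_reads:
                break
            if line_number % 4 == 3:  # fourth line of each record holds the qualities
                counts.update(ord(ch) - phred_offset for ch in line.rstrip("\r\n"))
    return counts
```

If the returned counter holds a single key (for example, only 30), every base in the sampled reads carries the same score, which would match the flat line in the plot.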

I am not exactly sure what is happening with your error rates.

Another idea: maybe it's because you have such a high multithreading parameter, and it is causing the data to be partitioned into pieces that are too small? (I am not convinced that this is how DADA2 calculates error rates.)

Could you try reducing your --p-n-threads and see how that goes?