DADA2 - Error rates could not be estimated

Hello,

I'm trying to analyze my 16S dataset of 384 samples sequenced on the NextSeq 2000. This isn't the first time we've sequenced on the NextSeq, and we've had success analyzing our data in the past. However, when I run dada2 denoise-paired with the command below, I get the error that follows:

Command:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --o-representative-sequences rep-seqs-dada2-280-180.qza \
  --o-table table-dada2-280-180.qza \
  --p-trim-left-f 10 \
  --p-trim-left-r 10 \
  --p-trunc-len-f 280 \
  --p-trunc-len-r 180 \
  --p-n-threads 60 \
  --o-denoising-stats stats-dada2-280-180.qza \
  --verbose

Error:
Warning message:
package ‘optparse’ was built under R version 4.2.3
Loading required package: Rcpp
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction,
selfConsist = TRUE, multithread = multithread, verbose = verbose,
MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filtsR, nreads = nreads.learn, multithread = multithread)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w,
classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filtsR, nreads = nreads.learn, multithread = multithread))
Traceback (most recent call last):
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 350, in denoise_paired
run_commands([cmd])
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 37, in run_commands
subprocess.run(cmd, check=True)
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/tmp/tmpqalnn36m/forward', '--input_directory_reverse', '/tmp/tmpqalnn36m/reverse', '--output_path', '$

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/site-packages/q2cli/commands.py", line 520, in call
results = self._execute_action(
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/site-packages/q2cli/commands.py", line 581, in _execute_action
results = action(**arguments)
File "", line 2, in denoise_paired
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
outputs = self.callable_executor(
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/site-packages/qiime2/sdk/action.py", line 566, in callable_executor
output_views = self._callable(**view_args)
File "/project/ime228_uksr/irci222/my_conda/envs/qiime2-amplicon-2024.2_B/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 363, in denoise_paired
raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

See above for debug info.

I'm guessing that the issue is "Error rates could not be estimated (this is usually because of very few reads)." This is surprising because, based on the demux.qzv file, there should be an adequate number of reads for dada2 to work with. I've found other discussions in the forum on this topic, but they are a couple of years old, and because I was able to run this command successfully on the NextSeq data from our previous run, I'm convinced the issue is with this particular run rather than the pipeline. I even tried regenerating the FASTQ files but am still running into the issue.

I have attached images of my demux.qzv file for reference.


Thank you for your help!

Hi @icinco,
I am not exactly sure what is happening!

Maybe it's because you have such a high multithreading parameter, and it is causing the data to be partitioned into chunks that are too small? (I am not convinced that this is how dada2 calculates error rates, though.)

Could you try reducing your --p-n-threads value and see how that goes?
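
Separately, since the message says this is usually caused by very few reads, it might be worth checking how many reads are actually long enough to survive truncation. DADA2 discards any read shorter than the truncation length, so with --p-trunc-len-f 280 a run with shorter forward reads could leave almost nothing to learn error rates from. Below is a minimal sketch of such a check. It is not part of QIIME 2; the function name count_long_reads is my own, and it assumes standard four-line FASTQ records.

```shell
# Hypothetical helper (not a QIIME 2 command): count reads in a FASTQ file
# that are at least MIN_LEN bases long.
# Usage:  count_long_reads FILE MIN_LEN
# Prints: "<total reads> <reads with length >= MIN_LEN>"
count_long_reads() {
    file="$1"
    min_len="$2"
    # Decompress on the fly when the file name ends in .gz.
    if [ "${file##*.}" = "gz" ]; then
        gzip -dc "$file"
    else
        cat "$file"
    fi | awk -v min="$min_len" '
        NR % 4 == 2 {                 # line 2 of each 4-line record is the sequence
            total++
            if (length($0) >= min) kept++
        }
        END { printf "%d %d\n", total, kept }
    '
}
```

For example, after exporting the reads with `qiime tools export --input-path demux-paired-end.qza --output-path exported-demux` (the output directory name is arbitrary), running `count_long_reads` on one of the exported forward FASTQ files with 280 as the second argument prints the total number of reads and the number that are at least 280 bases long. If the second number is close to zero, a shorter truncation length may be the thing to try.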
