Hello!
I recently completed an analysis of a large sample set with QIIME 2. I received guidance on this run in a separate forum thread: DADA2 Filtering Error - Failed to Write Record - #21 by megan.justice
Unfortunately, I realized that I had used dada2 denoise-single instead of dada2 denoise-paired, which is what I should have used.
The initial code was:
#qiime dada2 denoise-single \
#  --p-trim-left 0 \
#  --p-trunc-len 268 \
#  --i-demultiplexed-seqs paired-end-demux.qza \
#  --o-representative-sequences rep-seqs-1.qza \
#  --o-table table-1.qza \
#  --o-denoising-stats stats-1.qza
To remedy this, I copied the paired-end-demux.qza artifact from the successful 'single' run to a new directory and changed my dada2 denoise command to the following:
## Denoise
qiime dada2 denoise-paired \
  --p-trunc-len-f 268 \
  --p-trunc-len-r 268 \
  --i-demultiplexed-seqs paired-end-demux.qza \
  --o-representative-sequences rep-seqs-1.qza \
  --o-table table-1.qza \
  --verbose \
  --o-denoising-stats stats-1.qza
When I try this command on the same data as the single run, I get the following message:
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
Command: run_dada.R --input_directory /tmp/tmp39j03iev/forward --input_directory_reverse /tmp/tmp39j03iev/reverse --output_path /tmp/tmp39j03iev/output.tsv.biom --output_track /tmp/tmp39j03iev/track.tsv --filtered_directory /tmp/tmp39j03iev/filt_f --filtered_directory_reverse /tmp/tmp39j03iev/filt_r --truncation_length 268 --truncation_length_reverse 268 --trim_left 0 --trim_left_reverse 0 --max_expected_errors 2.0 --max_expected_errors_reverse 2.0 --truncation_quality_score 2 --min_overlap 12 --pooling_method independent --chimera_method consensus --min_parental_fold 1.0 --allow_one_off False --num_threads 10 --learn_min_reads 1000000
R version 4.3.3 (2024-02-29)
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.13.1 / RcppParallel: 5.1.9
2) Filtering ........................................................................................................................................................................................
3) Learning Error Rates
385065348 total bases in 1436811 reads from 4 samples will be used for learning the error rates.
385065348 total bases in 1436811 reads from 4 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
6: stop("Error matrix is NULL.")
5: getErrors(err, enforce = TRUE)
4: dada(drps, err = NULL, errorEstimationFunction = errorEstimationFunction,
selfConsist = TRUE, multithread = multithread, verbose = verbose,
MAX_CONSIST = MAX_CONSIST, OMEGA_C = OMEGA_C, ...)
3: learnErrors(filtsR, nreads = nreads.learn, multithread = multithread)
2: withCallingHandlers(expr, warning = function(w) if (inherits(w,
classes)) tryInvokeRestart("muffleWarning"))
1: suppressWarnings(learnErrors(filtsR, nreads = nreads.learn, multithread = multithread))
Traceback (most recent call last):
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 353, in denoise_paired
run_commands([cmd])
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 38, in run_commands
subprocess.run(cmd, check=True)
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada.R', '--input_directory', '/tmp/tmp39j03iev/forward', '--input_directory_reverse', '/tmp/tmp39j03iev/reverse', '--output_path', '/tmp/tmp39j03iev/output.tsv.biom', '--output_track', '/tmp/tmp39j03iev/track.tsv', '--filtered_directory', '/tmp/tmp39j03iev/filt_f', '--filtered_directory_reverse', '/tmp/tmp39j03iev/filt_r', '--truncation_length', '268', '--truncation_length_reverse', '268', '--trim_left', '0', '--trim_left_reverse', '0', '--max_expected_errors', '2.0', '--max_expected_errors_reverse', '2.0', '--truncation_quality_score', '2', '--min_overlap', '12', '--pooling_method', 'independent', '--chimera_method', 'consensus', '--min_parental_fold', '1.0', '--allow_one_off', 'False', '--num_threads', '10', '--learn_min_reads', '1000000']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in call
results = self._execute_action(
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
results = action(**arguments)
File "", line 2, in denoise_paired
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
outputs = self.callable_executor(
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in callable_executor
output_views = self._callable(**view_args)
File "/home/ec2-user/miniconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_dada2/_denoise.py", line 366, in denoise_paired
raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
Plugin error from dada2:
An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
See above for debug info.
No output files were generated, so this is the only guidance I have regarding what went wrong.
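One thing I could check from the log itself is the pair of "total bases ... reads" lines. Below is a minimal sketch of that arithmetic, using only the two numbers printed in the log; the variable names are my own and nothing here calls QIIME 2 or DADA2.

```shell
# Numbers copied from the "total bases in ... reads" lines in the log above.
total_bases=385065348
total_reads=1436811

# Integer division gives the length of each read used for error learning;
# a zero remainder means every read has exactly that length.
bases_per_read=$(( total_bases / total_reads ))
remainder=$(( total_bases % total_reads ))

echo "bases per read: ${bases_per_read} (remainder ${remainder})"
```

This works out to exactly 268 bases per read with no remainder, which matches my truncation length, and roughly 1.4 million reads does not look like "very few reads" to me. The traceback fails inside learnErrors(filtsR, ...), which I assume refers to the filtered reverse reads, so the problem may be specific to the reverse reads rather than to the read count.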
I am working in an AWS EC2 instance with 60 GB RAM and 2 TB disk space.
I have since tried running the command with 10 threads, but it still failed.
What could be happening here?
Does denoise-paired require significantly more RAM than denoise-single?
Any suggestions on a remedy?
Thanks!