I'm running dada2 on paired-end data and getting the same error over and over again. My data are from a 150 PE run on an Illumina iSeq machine, sequencing a 16S fragment for vertebrates. The samples were demultiplexed by Casava based on the i5 and i7 adapters. Some samples returned very few reads, so I repeated these steps with only those samples that returned more than 19k reads. Still, I got the same result. Here's the code that I ran for the full dataset:
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path pe-demux-results/manifest-file.txt --input-format PairedEndFastqManifestPhred33V2 --output-path pe-demux-results/demux16s.qza
qiime dada2 denoise-paired --i-demultiplexed-seqs demux-16s.qza --p-trim-left-f 24 --p-trim-left-r 21 --p-trunc-len-f 148 --p-trunc-len-r 142 --o-table table-pe.qza --o-representative-sequences rep-seqs.qza --o-denoising-stats denoising-stats.qza --verbose
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.
Command: run_dada_paired.R /var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/forward /var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/reverse /var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/output.tsv.biom /var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/track.tsv /var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/filt_f /var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/filt_r 148 142 24 21 2.0 2.0 2 12 independent consensus 1.0 1 1000000
R version 4.0.5 (2021-03-31)
Loading required package: Rcpp
DADA2: 1.18.0 / Rcpp: 1.0.7 / RcppParallel: 5.1.4
1) Filtering .............
2) Learning Error Rates
133563748 total bases in 1077127 reads from 4 samples will be used for learning the error rates.
130332367 total bases in 1077127 reads from 4 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.
Execution halted
Traceback (most recent call last):
File "/Users/AirAlex/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 266, in denoise_paired
run_commands([cmd])
File "/Users/AirAlex/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
subprocess.run(cmd, check=True)
File "/Users/AirAlex/miniconda3/envs/qiime2-2021.8/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/forward', '/var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/reverse', '/var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/output.tsv.biom', '/var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/track.tsv', '/var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/filt_f', '/var/folders/kk/scwljzqs5gb9zsd8m4rxwsbr0000gn/T/tmpb7ed8hkc/filt_r', '148', '142', '24', '21', '2.0', '2.0', '2', '12', 'independent', 'consensus', '1.0', '1', '1000000']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/AirAlex/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/q2cli/commands.py", line 329, in __call__
results = action(**arguments)
File "<decorator-gen-572>", line 2, in denoise_paired
File "/Users/AirAlex/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
outputs = self._callable_executor_(scope, callable_args,
File "/Users/AirAlex/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 391, in _callable_executor_
output_views = self._callable(**view_args)
File "/Users/AirAlex/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/q2_dada2/_denoise.py", line 279, in denoise_paired
raise Exception("An error was encountered while running DADA2"
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
Plugin error from dada2:
An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.
See above for debug info.
As you can see from the demux-16s-viz.qzv, the median quality score remains high, even though it is variable towards the end.
The target amplicon is only 250 bp long, so there should be just enough overlap in the reads to merge R1 and R2.
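A rough way to check that overlap from the trimming and truncation settings in the command above, as a minimal sketch. The function name and the `amplicon_includes_trimmed_ends` flag are hypothetical. Whether the 250 bp figure includes the bases removed by `--p-trim-left-f` and `--p-trim-left-r` (for example, primers) is an assumption, so both cases are computed. The threshold of 12 nt is the default minimum overlap DADA2 uses when merging pairs.

```python
# Sketch: estimate the R1/R2 overlap left after trimming and truncation.
# All names here are illustrative; they are not part of QIIME 2 or DADA2.

MIN_OVERLAP = 12  # DADA2's default minimum overlap for merging read pairs


def expected_overlap(amplicon_len: int,
                     trunc_len_f: int, trunc_len_r: int,
                     trim_left_f: int, trim_left_r: int,
                     amplicon_includes_trimmed_ends: bool) -> int:
    """Return the expected overlap in bases (negative means a gap)."""
    # Bases kept from each read after truncating the 3' end and trimming the 5' end.
    kept_f = trunc_len_f - trim_left_f
    kept_r = trunc_len_r - trim_left_r

    # Length the two kept reads must span together.
    if amplicon_includes_trimmed_ends:
        # ASSUMPTION: the stated amplicon length counts the trimmed 5' bases.
        target = amplicon_len - (trim_left_f + trim_left_r)
    else:
        # ASSUMPTION: the stated amplicon length excludes the trimmed 5' bases.
        target = amplicon_len

    return kept_f + kept_r - target


if __name__ == "__main__":
    for includes in (True, False):
        overlap = expected_overlap(250, 148, 142, 24, 21, includes)
        verdict = "enough" if overlap >= MIN_OVERLAP else "not enough"
        print(f"amplicon includes trimmed ends={includes}: "
              f"overlap={overlap} nt ({verdict} for merging)")
```

With these settings the kept read lengths are 124 and 121 bases, which matches the bases-per-read ratio in the "Learning Error Rates" lines of the log. The overlap is 40 nt if the 250 bp includes the trimmed ends and -5 nt if it does not, so the margin depends on how the amplicon length was measured.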
FWIW, the dada2 step works fine if the data are analyzed as SE reads (R1 and R2 analyzed separately), even though there are just as many samples with very few reads.
Any ideas?