Denoising paired ends with DADA2

Hi Tech support,
I am working with paired-end MiSeq sequences. I am currently running Deblur, but I want to compare it with DADA2 before proceeding with my downstream analysis.
Could you help me find what I did wrong in running these commands?

qiime dada2 denoise-paired
–i-demultiplexed-seqs invertspairend-demux.qza
–p-trunc-len-f 200
-–p-trunc-len-r 180
-–p-trim-left-f 10
-–p-trim-left-r 10
–p-trunc-q 2
–o-representative-sequences invertsDADA-rep-seqs.qza
–o-table inverts-table-dada2.qza
–o-denoising-stats inverts-stats-dada2.qza
–p-n-threads 10

and I got this error message:

Usage: qiime dada2 denoise-paired [OPTIONS]
Try “qiime dada2 denoise-paired --help” for help.

Error: no such option: -–
./3B.DADADENOISE.sh: 19: ./3B.DADADENOISE.sh: --o-representative-sequences: not found
./3B.DADADENOISE.sh: 22: ./3B.DADADENOISE.sh: --p-n-threads: not found

Many thanks,
Imee

Hi @lmee19,

I’m seeing a couple of problems in your command. First, I just want to make sure that you’ve got your flags right: each option, such as -i-demultiplexed-seqs, should be preceded by two - symbols (--i-demultiplexed-seqs). Check and see if that helps. Second, if you’re wrapping the command across several lines, you need a backslash (\) at the end of every line except the last.

So, your reformatted command would look like this:

qiime dada2 denoise-paired \
 --i-demultiplexed-seqs invertspairend-demux.qza \
 --p-trunc-len-f 200 \
 --p-trunc-len-r 180 \
 --p-trim-left-f 10 \
 --p-trim-left-r 10 \
 --p-trunc-q 2 \
 --o-representative-sequences invertsDADA-rep-seqs.qza \
 --o-table inverts-table-dada2.qza \
 --o-denoising-stats inverts-stats-dada2.qza \
 --p-n-threads 10

Good luck!
Justine
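
The `no such option: -–` message usually means that one of the dashes in the script is a typographic dash rather than a plain ASCII hyphen, which often happens when a command is copied from a web page or a word processor. A rough way to check a script for this is sketched below. The function name `find_fancy_dashes` is made up for illustration, and the check only covers the en dash and em dash characters.

```shell
# Hypothetical helper: print the numbered lines of a script that contain
# an en dash (U+2013) or em dash (U+2014) instead of ASCII hyphens.
find_fancy_dashes() {
    # UTF-8 byte sequences: en dash is E2 80 93, em dash is E2 80 94.
    en_dash=$(printf '\342\200\223')
    em_dash=$(printf '\342\200\224')
    # -F matches fixed strings, -n prefixes each hit with its line number.
    # Exit status is 0 when a typographic dash is found and 1 when none is.
    LC_ALL=C grep -n -F -e "$en_dash" -e "$em_dash" "$1"
}

# Example call, using the script name from the error message above:
#   find_fancy_dashes ./3B.DADADENOISE.sh
```

Any line it prints needs its dashes retyped by hand as `--`.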


Hi Justine,
Thank you so much!!


Hi Everyone,
I have been running DADA2 with the script above for more than a week, and I got this error message.

Can you help me figure this out?

Loading required package: Rcpp
DADA2 R package version: 1.6.0

  1. Filtering …
  2. Learning Error Rates
    2a) Forward Reads
    Initializing error rates to maximum possible estimate.
    Sample 1 - 2210368 reads in 2188405 unique sequences.
    selfConsist step 2
    selfConsist step 3
    selfConsist step 4
    selfConsist step 5
    Convergence after 5 rounds.
    2b) Reverse Reads
    Initializing error rates to maximum possible estimate.
    Error rates could not be estimated.
    Error in err[c(1, 6, 11, 16), ] <- 1 :
    incorrect number of subscripts on matrix
    Calls: dada
    Execution halted
    Running external command line application(s). This may print messages to stdout and/or stderr.
    The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpxpo322w2/forward /tmp/tmpxpo322w2/reverse /tmp/tmpxpo322w2/output.tsv.biom /tmp/tmpxpo322w2/track.tsv /tmp/tmpxpo322w2/filt_f /tmp/tmpxpo322w2/filt_r 200 180 10 10 2.0 2 consensus 1.0 10 1000000

Traceback (most recent call last):
File “/home/imelda/anaconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 231, in denoise_paired
run_commands([cmd])
File “/home/imelda/anaconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 36, in run_commands
subprocess.run(cmd, check=True)
File “/home/imelda/anaconda2/envs/qiime2-2018.11/lib/python3.5/subprocess.py”, line 398, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command ‘[‘run_dada_paired.R’, ‘/tmp/tmpxpo322w2/forward’, ‘/tmp/tmpxpo322w2/reverse’, ‘/tmp/tmpxpo322w2/output.tsv.biom’, ‘/tmp/tmpxpo322w2/track.tsv’, ‘/tmp/tmpxpo322w2/filt_f’, ‘/tmp/tmpxpo322w2/filt_r’, ‘200’, ‘180’, ‘10’, ‘10’, ‘2.0’, ‘2’, ‘consensus’, ‘1.0’, ‘10’, ‘1000000’]’ returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/home/imelda/anaconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2cli/commands.py”, line 274, in call
results = action(**arguments)
File “”, line 2, in denoise_paired
File “/home/imelda/anaconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 231, in bound_callable
output_types, provenance)
File “/home/imelda/anaconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 362, in callable_executor
output_views = self._callable(**view_args)
File “/home/imelda/anaconda2/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 246, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Please see this thread for some ideas: Error running DADA2 in 454 data

Hi Matt,
I tried running it again with this script:

qiime dada2 denoise-paired
–i-demultiplexed-seqs invertspairend-demux.qza
–p-trunc-len-f 200
–p-trunc-len-r 180
–p-trim-left-f 10
–p-trim-left-r 10
–p-max-ee 2.0
–p-chimera-method consensus
–o-representative-sequences inverts-rep-seqs-dada2.qza
–o-table inverts-table-dada2.qza
–o-denoising-stats inverts-stats-dada2.qza
–p-n-threads 40

And what I got from the logfile is this:

./3B.DADADENOISE.sh: 1: ./3B.DADADENOISE.sh: QIIME: not found

I am actually having trouble running this DADA step. I am running it in parallel with DEBLUR.
I wanted to compare first before deciding which one to proceed with. Can you help me figure out this part?
Thanks a lot

Hi @Imee19!

This part of the error is most likely indicating that your script isn’t able to find your conda environment with QIIME 2. Did you remember to activate it?
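
One way to make this kind of failure obvious is to have the script check for the `qiime` executable before it does anything else. The sketch below is only an illustration: `require_cmd` is a made-up helper, and the environment name in the comments is a placeholder to be replaced with whatever `conda env list` reports on your machine. Note also that the log above says `QIIME: not found` in capitals. Command names are case-sensitive on Linux, so the script has to call lowercase `qiime`.

```shell
# Hypothetical helper: stop early with a clear message if a command is missing.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found on PATH; is the conda environment activated?" >&2
        return 1
    fi
}

# Possible use at the top of a bash batch script
# (the environment name below is an assumption, not a known value):
#   source activate qiime2-2018.11   # or: conda activate qiime2-2018.11
#   require_cmd qiime || exit 1
```

With a check like this, a missing environment is reported on the first line of the log instead of surfacing as a confusing "not found" error in the middle of the run.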


Hi Matt,
I am still getting this error from the DADA2 denoising step. Please help me figure out how to address it.
cat /tmp/qiime2-q2cli-err-q4l8egu9.log
R version 3.4.1 (2017-06-30)
Loading required package: Rcpp
DADA2 R package version: 1.6.0

  1. Filtering …
  2. Learning Error Rates
    2a) Forward Reads
    Initializing error rates to maximum possible estimate.
    Sample 1 - 2210368 reads in 2188405 unique sequences.
    selfConsist step 2
    selfConsist step 3
    selfConsist step 4
    selfConsist step 5
    Convergence after 5 rounds.
    2b) Reverse Reads
    Initializing error rates to maximum possible estimate.
    Error rates could not be estimated.
    Error in err[c(1, 6, 11, 16), ] <- 1 :
    incorrect number of subscripts on matrix
    Calls: dada
    Execution halted
    Running external command line application(s). This may print messages to stdout and/or stderr.
    The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpodsqctkg/forward /tmp/tmpodsqctkg/reverse /tmp/tmpodsqctkg/output.tsv.biom /tmp/tmpodsqctkg/track.tsv /tmp/tmpodsqctkg/filt_f /tmp/tmpodsqctkg/filt_r 200 180 10 10 2.0 2 consensus 1.0 40 1000000

Traceback (most recent call last):
File “/home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py”, line 231, in denoise_paired
run_commands([cmd])
File “/home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py”, line 36, in run_commands
subprocess.run(cmd, check=True)
File “/home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/subprocess.py”, line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command ‘[‘run_dada_paired.R’, ‘/tmp/tmpodsqctkg/forward’, ‘/tmp/tmpodsqctkg/reverse’, ‘/tmp/tmpodsqctkg/output.tsv.biom’, ‘/tmp/tmpodsqctkg/track.tsv’, ‘/tmp/tmpodsqctkg/filt_f’, ‘/tmp/tmpodsqctkg/filt_r’, ‘200’, ‘180’, ‘10’, ‘10’, ‘2.0’, ‘2’, ‘consensus’, ‘1.0’, ‘40’, ‘1000000’]’ returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py”, line 274, in call
results = action(**arguments)
File “</home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-442>”, line 2, in denoise_paired
File “/home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py”, line 231, in bound_callable
output_types, provenance)
File “/home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py”, line 365, in callable_executor
output_views = self._callable(**view_args)
File “/home/imelda/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_dada2/_denoise.py”, line 246, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Thanks,
Imee

Hi @Imee19! You have gone full circle now: this is the same error you posted above. Please see my previously supplied suggestion.

Hi Matt,
Yes, I am still stuck on the same issue as when I first posted here. Is it possible that the script cannot find my conda environment when I am able to run Deblur and other QIIME 2 commands, just not DADA2?
Thanks,
Imee

Did you get a chance to read those references I linked to? Is it possible that your data just doesn’t have any repeated sequences in it (that is the gist of the linked issue).

I’m sorry, I don’t understand this question - can you provide an example or elaborate a bit more? Thanks for bearing with me.

Hi Matt,
Do you mean the same sequence shared by more than one read? I tried to understand the link you included in this thread, but I don’t think I figured out the problem.
Thanks,
Imee

Hi @Imee19 - here is the direct link to the DADA2 issue: https://github.com/benjjneb/dada2/issues/614

Have you taken a look at this already?

Hi Matt,
I did, and what I have done so far is look into the reads and align the forward and reverse reads against each other. I found that the forward reads are behind the reverse reads. What do I need to do about this?
Thanks,
Imee

Hey there @Imee19, let's start over. The error message you reported above appears to be the exact same error message discussed on the DADA2 issue tracker. The issue on the DADA2 issue tracker mentions that one way that this error crops up is when all the reads in the input are unique: the error model has nothing to work with, since there are no repeated reads. It is possible that you have run into this situation. Reading more of the issue tracker, one possible solution is to proceed with only your forward reads.

Outside of DADA2, other options to try are OTU clustering or Deblur. Hope that helps!
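
To see whether a file really has almost no repeated reads, you can count total reads against unique sequences yourself. The log earlier in this thread already reports 2210368 forward reads in 2188405 unique sequences, which is roughly 99% unique. Below is a minimal sketch for running the same kind of check on a reverse-read file. It assumes an uncompressed FASTQ file with exactly four lines per record, and `count_unique_reads` is a made-up name.

```shell
# Hypothetical helper: print "<total reads> <unique sequences>" for one FASTQ file.
# Assumes plain 4-line FASTQ records; decompress .gz files first (e.g. gzip -dc).
count_unique_reads() {
    # Line 2 of every 4-line record is the sequence itself.
    awk 'NR % 4 == 2' "$1" \
        | sort \
        | uniq -c \
        | awk '{ total += $1 } END { print total + 0, NR }'
}

# Example call (the file name is a placeholder):
#   count_unique_reads sample1_R2.fastq
```

If the two numbers are nearly equal, the reads are almost all unique and the error model has very little to learn from. If you go the forward-reads-only route, `qiime dada2 denoise-single` is the action to look at; check its `--help` output for the trimming and truncation parameters.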
