DADA2 in R (return code -7)

Hi,

I am creating a new topic because I searched for similar discussions to troubleshoot my error message and found none.

I have 66 samples in total; they are 16S paired-end amplicon reads from Illumina. I removed the primers from my sequences using cutadapt in QIIME 2. I am using QIIME 2 version 2018.6 (qiime2-2018.6) and running it on a cluster.

Here is the command line I ran:

qiime dada2 denoise-paired --i-demultiplexed-seqs trimmed2.0-paired-end.qza --p-trunc-len-f 277 --p-trunc-len-r 251 --p-max-ee 8 --p-n-threads 24 --output-dir dada2_output_277f251ree8.qza

Here is the output of the log:

Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpradrfbin/forward /tmp/tmpradrfbin/reverse /tmp/tmpradrfbin/output.tsv.biom /tmp/tmpradrfbin/track.tsv /tmp/tmpradrfbin/filt_f /tmp/tmpradrfbin/filt_r 277 251 0 0 8.0 2 consensus 1.0 24 1000000

R version 3.4.1 (2017-06-30)
Loading required package: Rcpp
DADA2 R package version: 1.6.0

  1. Filtering …

  2. Learning Error Rates
    2a) Forward Reads
    Initializing error rates to maximum possible estimate.
    Sample 1 - 110016 reads in 55472 unique sequences.
    Sample 2 - 61976 reads in 35550 unique sequences.
    Sample 3 - 99850 reads in 34645 unique sequences.
    Sample 4 - 93057 reads in 45524 unique sequences.
    Sample 5 - 75640 reads in 41019 unique sequences.
    Sample 6 - 93285 reads in 47301 unique sequences.
    Sample 7 - 62141 reads in 31174 unique sequences.
    Sample 8 - 54129 reads in 30589 unique sequences.
    Sample 9 - 787109 reads in 728473 unique sequences.
    selfConsist step 2
    selfConsist step 3
    selfConsist step 4
    selfConsist step 5
    selfConsist step 6
    selfConsist step 7
    selfConsist step 8
    selfConsist step 9
    selfConsist step 10
    Self-consistency loop terminated before convergence.
    2b) Reverse Reads
    Initializing error rates to maximum possible estimate.
    Sample 1 - 110016 reads in 90109 unique sequences.
    Sample 2 - 61976 reads in 54617 unique sequences.
    Sample 3 - 99850 reads in 67183 unique sequences.
    Sample 4 - 93057 reads in 73409 unique sequences.
    Sample 5 - 75640 reads in 64919 unique sequences.
    Sample 6 - 93285 reads in 78672 unique sequences.
    Sample 7 - 62141 reads in 49237 unique sequences.
    Sample 8 - 54129 reads in 45100 unique sequences.
    Sample 9 - 787109 reads in 714150 unique sequences.
    selfConsist step 2
    selfConsist step 3
    selfConsist step 4
    selfConsist step 5
    selfConsist step 6
    selfConsist step 7
    Convergence after 7 rounds.

  3. Denoise remaining samples …
    The sequences being tabled vary in length.

  4. Remove chimeras (method = consensus)

  5. Write output

*** caught bus error ***
address 0x2ab17051ffb8, cause ‘non-existent physical address’
An irrecoverable exception occurred. R is aborting now …
Traceback (most recent call last):
File “/home/wyap004/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 229, in denoise_paired
run_commands([cmd])
File “/home/wyap004/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 36, in run_commands
subprocess.run(cmd, check=True)
File “/home/wyap004/miniconda2/envs/qiime2-2018.6/lib/python3.5/subprocess.py”, line 398, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command ‘[‘run_dada_paired.R’, ‘/tmp/tmpradrfbin/forward’, ‘/tmp/tmpradrfbin/reverse’, ‘/tmp/tmpradrfbin/output.tsv.biom’, ‘/tmp/tmpradrfbin/track.tsv’, ‘/tmp/tmpradrfbin/filt_f’, ‘/tmp/tmpradrfbin/filt_r’, ‘277’, ‘251’, ‘0’, ‘0’, ‘8.0’, ‘2’, ‘consensus’, ‘1.0’, ‘24’, ‘1000000’]’ returned non-zero exit status -7

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/home/wyap004/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2cli/commands.py”, line 274, in call
results = action(**arguments)
File “”, line 2, in denoise_paired
File “/home/wyap004/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 232, in bound_callable
output_types, provenance)
File “/home/wyap004/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 367, in callable_executor
output_views = self._callable(**view_args)
File “/home/wyap004/miniconda2/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 244, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code -7), please inspect stdout and stderr to learn more.

I was worried that this might be another plugin issue, so I ran the Atacama soil tutorial (10% subsampled data). It worked well, so I guess it is not a plugin issue. I know my output directory looks odd (I accidentally ended it with .qzat), but I doubt that is the cause.

I would appreciate any insight or help in solving this problem.

Cheers.
Wenshu

Hi @wenshu,

Holy cow, that’s a new one. It looks like it could be a hardware failure in your RAM, since this is a bus error rather than a segfault (which is the more common memory issue).

Yeah, if it is hardware, you would have a hard time replicating the issue. Are you able to reproduce the first failure at all?

You might try doing a hardware diagnostic (e.g. memory test) to confirm that there’s nothing wrong.

Also, .qzat won’t matter: QIIME 2 reads the file itself to determine its type; file extensions are just a people thing :wink:
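
To illustrate that last point, here is a minimal Python sketch. It assumes the usual QIIME 2 archive layout: a zip file whose single top-level directory holds a metadata.yaml containing a `type:` line. It builds a stand-in archive with a deliberately odd extension and reads the type back from the contents alone. The helper name `read_artifact_type`, the placeholder directory name, and the hand-rolled line parsing are made up for illustration; this is not QIIME 2's own code.

```python
import os
import tempfile
import zipfile


def read_artifact_type(path):
    """Return the `type:` value from the archive's metadata.yaml.

    Assumes the layout <top-level-dir>/metadata.yaml inside a zip file.
    The file extension of `path` is never consulted.
    """
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                text = archive.read(name).decode("utf-8")
                for line in text.splitlines():
                    if line.startswith("type:"):
                        # Drop surrounding whitespace and optional quotes.
                        return line.split(":", 1)[1].strip().strip("'\"")
    raise ValueError("no metadata.yaml with a type: line found")


# Build a stand-in archive (NOT a real artifact) with an odd extension.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "table.qzat")
with zipfile.ZipFile(demo_path, "w") as archive:
    archive.writestr(
        "placeholder-uuid/metadata.yaml",
        "uuid: placeholder-uuid\n"
        "type: FeatureTable[Frequency]\n"
        "format: BIOMV210DirFmt\n",
    )

artifact_type = read_artifact_type(demo_path)
print(artifact_type)
```

The type comes from inside the archive, so renaming the file changes nothing about how it is interpreted.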

Hi @ebolyen

Thanks for the reply.

I am curious why this would be a hardware problem when I managed to run a smaller dataset (10% subsample) from the Atacama soil tutorial.

I am not sure how to run a hardware diagnostic; I am sorry that I am not very proficient in this. I am running this on a server where each node has 125 GB of memory. Would this information help to troubleshoot the problem?

I see. Noted :slight_smile:

Hey @wenshu,

If it is a hardware problem, you will find it nearly impossible to reproduce, since which particular memory cell gets allocated depends on the current state of the machine, your data, and the alignment of the stars :stars:

No worries, it sounds like a job for your sysadmin in this case.

But first, let’s confirm it isn’t a logical flaw with the program or installation.

If you run this command (from above) again:

qiime dada2 denoise-paired --i-demultiplexed-seqs trimmed2.0-paired-end.qza --p-trunc-len-f 277 --p-trunc-len-r 251 --p-max-ee 8 --p-n-threads 24 --output-dir dada2_output_277f251ree8.qza

does it fail in the same way?

Hi @ebolyen

I see.

I have been running the same command again since yesterday, and it is still running. The first time, it took almost 48 hours to hit the wall, so I guess we will know by tonight or tomorrow. This time I am running the exact same command on two different servers. The second server has almost ten times less memory than the first one, although I suppose that should still be sufficient for this job. If the command works on the second server, then I guess it is probably a hardware problem.

I will post an update here once I have an answer.

Hi,

Sorry for replying after so long. My run took almost 4 days to complete (:tada: finally).

For some reason, my command worked on the same server that I used initially. The only difference this time is that I ran it with a PBS script instead of running it in screen. I suspect that running it in screen may have caused some access issue with the plugin?

My command failed on the second server; I am not sure whether that is due to a memory problem. I am attaching the log here for anyone who might have a similar problem.

[[email protected] phrathong]$ vi nohup.out

[1]+ Stopped vim nohup.out
[[email protected] phrathong]$ vi /tmp/qiime2-q2cli-err-51gx3jua.log

Sample 8 - 54129 reads in 30589 unique sequences.
Sample 9 - 787109 reads in 728473 unique sequences.
selfConsist step 2
selfConsist step 3
selfConsist step 4
selfConsist step 5
selfConsist step 6
selfConsist step 7
selfConsist step 8
selfConsist step 9
selfConsist step 10
Self-consistency loop terminated before convergence.
2b) Reverse Reads
Initializing error rates to maximum possible estimate.
Sample 1 - 110016 reads in 90109 unique sequences.
Sample 2 - 61976 reads in 54617 unique sequences.
Sample 3 - 99850 reads in 67183 unique sequences.
Sample 4 - 93057 reads in 73409 unique sequences.
Sample 5 - 75640 reads in 64919 unique sequences.
Sample 6 - 93285 reads in 78672 unique sequences.
Sample 7 - 62141 reads in 49237 unique sequences.
Sample 8 - 54129 reads in 45100 unique sequences.
Sample 9 - 787109 reads in 714150 unique sequences.
selfConsist step 2
selfConsist step 3
selfConsist step 4
selfConsist step 5
selfConsist step 6
selfConsist step 7
Convergence after 7 rounds.

  3. Denoise remaining samples …
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_dada_paired.R /tmp/tmpk0d5bv_5/forward /tmp/tmpk0d5bv_5/reverse /tmp/tmpk0d5bv_5/output.tsv.biom /tmp/tmpk0d5bv_5/track.tsv /tmp/tmpk0d5bv_5/filt_f /tmp/tmpk0d5bv_5/filt_r 277 251 0 0 8.0 2 consensus 1.0 8 1000000

Traceback (most recent call last):
File “/home/wenshu/anaconda2/envs/qiime2-2018.8/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 229, in denoise_paired
run_commands([cmd])
File “/home/wenshu/anaconda2/envs/qiime2-2018.8/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 36, in run_commands
subprocess.run(cmd, check=True)
File “/home/wenshu/anaconda2/envs/qiime2-2018.8/lib/python3.5/subprocess.py”, line 398, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command ‘[‘run_dada_paired.R’, ‘/tmp/tmpk0d5bv_5/forward’, ‘/tmp/tmpk0d5bv_5/reverse’, ‘/tmp/tmpk0d5bv_5/output.tsv.biom’, ‘/tmp/tmpk0d5bv_5/track.tsv’, ‘/tmp/tmpk0d5bv_5/filt_f’, ‘/tmp/tmpk0d5bv_5/filt_r’, ‘277’, ‘251’, ‘0’, ‘0’, ‘8.0’, ‘2’, ‘consensus’, ‘1.0’, ‘8’, ‘1000000’]’ returned non-zero exit status -9

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/home/wenshu/anaconda2/envs/qiime2-2018.8/lib/python3.5/site-packages/q2cli/commands.py”, line 274, in call
results = action(**arguments)
File “”, line 2, in denoise_paired
File “/home/wenshu/anaconda2/envs/qiime2-2018.8/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 231, in bound_callable
output_types, provenance)
File “/home/wenshu/anaconda2/envs/qiime2-2018.8/lib/python3.5/site-packages/qiime2/sdk/action.py”, line 362, in callable_executor
output_views = self._callable(**view_args)
File “/home/wenshu/anaconda2/envs/qiime2-2018.8/lib/python3.5/site-packages/q2_dada2/_denoise.py”, line 244, in denoise_paired
" and stderr to learn more." % e.returncode)
Exception: An error was encountered while running DADA2 in R (return code -9), please inspect stdout and stderr to learn more.

Cheers.
Wenshu

Hey @wenshu,

Glad to hear you were able to get your original run to work!

Screen itself doesn’t cause any issues as it inherits the same terminal environment as if you had not used screen (I’ve also run lots of jobs with screen in an ad-hoc way :smile:), but it is certainly possible that something (or someone) accidentally killed/closed the screen process and so the job within it failed.

But that doesn’t quite line up with the memory access failure, so I still suspect something stranger is at play.

The failure on the second server (return code -9) is a SIGKILL, which can happen for any number of reasons (killing a screen session especially), so I don’t have a specific explanation for it, but fortunately it’s a much more mundane problem. You may have run out of memory or hit the walltime limit on your scheduler.
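
For anyone trying to read the return codes in the logs above: Python's subprocess module reports a negative return code -N when the child process was terminated by signal N. Below is a minimal sketch of decoding such codes on a Unix-like system. The helper name `describe_returncode` is made up for illustration. Signal numbers are platform dependent (7 is SIGBUS and 9 is SIGKILL on typical Linux systems), so the example asks the local `signal` module for its numbers rather than hard-coding them.

```python
import signal


def describe_returncode(returncode):
    """Describe a subprocess return code in human-readable terms."""
    if returncode >= 0:
        return "exited with status %d" % returncode
    try:
        # A negative code -N means the child was terminated by signal N.
        name = signal.Signals(-returncode).name
    except ValueError:
        name = "unknown signal %d" % -returncode
    return "killed by %s" % name


# Use the platform's own signal numbers instead of assuming 7 and 9.
bus_description = describe_returncode(-int(signal.SIGBUS))
kill_description = describe_returncode(-int(signal.SIGKILL))
print(bus_description)
print(kill_description)
```

Assuming typical Linux numbering on both servers, -7 matches the bus error in the first log and -9 matches the SIGKILL in the second.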

How did you submit that particular job?
