Error with multithreaded dada2 [Plugin error from dada2. An error was encountered while running DADA2 in R (return code 1)]

Peter_Kos · October 14, 2019, 1:54pm

Hi, I am running qiime2-2019.7. When I ran dada2, I got the following error message: "An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more."
I started to look into the problem, so I ran the same command on another server, and my files seemed to be OK (e.g. equal number of R1 and R2, etc.).
Then I came back to my server and saw that I can run it on 1 thread, and on 2 and 12, but it dies on 24 threads and up.

Below are the successful command with its timestamp and success message, then the 24-thread command with its error message. I am also attaching the error log files.

(qiime2-2019.7) kp@duna:~/old.duna$ date && time qiime dada2 denoise-single --i-demultiplexed-seqs all_seqs.qza --p-trunc-len 249 --p-n-threads 12 --o-table alltable_12.qza --o-representative-sequences all_representatives_12.qza --o-denoising-stats all_denoising_12.qza
2019. okt. 14., hétfő, 12:31:57 CEST

Saved FeatureTable[Frequency] to: alltable_12.qza
Saved FeatureData[Sequence] to: all_representatives_12.qza
Saved SampleData[DADA2Stats] to: all_denoising_12.qza

real 95m12,943s
user 819m54,123s
sys 4m12,487s
(qiime2-2019.7) kp@duna:~/old.duna$
(qiime2-2019.7) kp@duna:~/old.duna$ date && time qiime dada2 denoise-single --i-demultiplexed-seqs all_seqs.qza --p-trunc-len 249 --p-n-threads 24 --o-table alltable_24.qza --o-representative-sequences all_representatives_24.qza --o-denoising-stats all_denoising_24.qza &
[2] 5731
(qiime2-2019.7) kp@duna:~/old.duna$ 2019. okt. 14., hétfő, 14:11:51 CEST
Plugin error from dada2:

An error was encountered while running DADA2 in R (return code 1), please inspect stdout and stderr to learn more.

Debug info has been saved to /tmp/qiime2-q2cli-err-99rnsr78.log

real 1m0,362s
user 13m58,391s
sys 0m55,533s

[2]+ Exit 1 date && time qiime dada2 denoise-single --i-demultiplexed-seqs all_seqs.qza --p-trunc-len 249 --p-n-threads 24 --o-table alltable_24.qza --o-representative-sequences all_representatives_24.qza --o-denoising-stats all_denoising_24.qza

[Meanwhile, the 2-thread run succeeded too.]
(qiime2-2019.7) kp@duna:~/old.duna$ Saved FeatureTable[Frequency] to: alltable_2.qza
Saved FeatureData[Sequence] to: all_representatives_2.qza
Saved SampleData[DADA2Stats] to: all_denoising_2.qza

Attachments: qiime2-q2cli-err-h64zw2l5.log.txt (2.6 KB), qiime2-q2cli-err-99rnsr78.log.txt (2.6 KB)

colinbrislawn · October 14, 2019, 4:40pm

Good afternoon Peter,

I'm glad this is working for you, even if it's only on 12 threads.

Did you open up your log files to look for errors? When I opened them up, I found this error:

'names' attribute [96] must be the same length as the vector [92]

That error has been popping up in the forums when people use too many threads, but I'm not sure of its root cause. Let us know if you find any other clues! For example, do 16 or 18 threads work? How much RAM is being used throughout the full process?

Colin

Nicholas_Bokulich · October 14, 2019, 4:45pm

See a description of this error here:

Peter_Kos · October 14, 2019, 10:29pm

Hi Colin,
thanks for looking at it. I am also happy that the program runs, so the test data is valid, and the program and the server are working.
Yes, of course I saw that error message, but it did not tell me anything. I do not know which "names attribute" or which vector it refers to, so it is completely meaningless to me.
Nevertheless, I was worried that I had made a mistake with uneven files (directions) or similar. Since the program ran with fewer threads, I concluded that my data is OK.
That is why I meant to report this error to the people who may know what the 'names' attribute is in that part of the software, which vector it is, and how on Earth they are not the same length on 48 threads if they are the same length on 12 threads.
I saw that qiime uses the /tmp/ directory extensively. That could have been a problem on another server where /tmp/ is on a small partition. However, on this server the threads, the memory and the storage are all practically unlimited for this dataset. FastTree used all threads at 100%, so again this is perhaps not a H/W issue.
But I have absolutely no knowledge of forking and multithreading, and I am not good at R either.
So, if there is no solution for this issue, I will 'titrate' how many threads dada2 can handle. But that is a very sad situation, and it may fundamentally slow down the processing of my big data when it comes off the sequencer.
Tomorrow I'll try to find the thread limit.
What do you mean by "How much ram is being used throughout this full process?" I do not know what exactly dada2 is doing when the error occurs. Can I measure the peak or the average memory usage in some way? What kind of data would be useful? Or do you mean the available memory?
Thanks for your help and effort,
Peter

Peter_Kos · October 14, 2019, 10:29pm

Hi Nicholas,
thanks for looking into this.
I have tested (with errors) --p-n-threads 0 (which should automatically match the number of threads on the computer), 48 and 24. I had success with 12, 2 and 1.
The computer has 96 threads and 1 TB of memory.

Peter_Kos · October 15, 2019, 1:23pm

It must be some hidden central resource, like the maximum number of open file handles, shared memory or whatnot.
My full test dataset contains 96 paired-end samples with 8 486 433 sequences. It can run on 18 threads but not on 24 (I do not have data for 19...23 yet, not that it matters much at this stage).
If I take 4 samples at random, with 349814 sequences in them, it runs on 96 threads without any problem.
So the problem is not the number of threads as such. (Perhaps threads x reads?) The total memory is 1 TB, and it is mostly empty during the run.
It seems to me that somewhere between dada2 and the H/W, perhaps between R and the H/W, either in R or in Debian, there may be some soft setting or default that creates a limit. It would be good to find out what that is and relax it, as the H/W surely has enough resources.
Otherwise I will be in great trouble with the real data, which is somewhere between one and two thousand samples. I may need to go down to a single thread and run it for months instead of a day.
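
For reference, the per-process soft and hard limits mentioned above can be listed from the same shell that launches qiime. This is only a sketch to rule things out, not a diagnosis: `show_limits` is a made-up helper name, and nothing here confirms that any of these limits is what dada2 runs into.

```shell
# Print a few per-process resource limits of the current shell.
# ulimit is a shell built-in; -S shows the soft limit, -H the hard limit.
show_limits() {
    echo "open files (soft): $(ulimit -Sn)"
    echo "open files (hard): $(ulimit -Hn)"
    echo "virtual memory in kB (soft): $(ulimit -Sv)"
    echo "virtual memory in kB (hard): $(ulimit -Hv)"
}

show_limits
```

A value of "unlimited" means the shell imposes no cap for that resource.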

colinbrislawn · October 15, 2019, 1:43pm

Hello Peter,

I'm right there with you on this one. That's a cryptic error for a mysterious bug, and I'm just going to wait until a more experienced R dev can troubleshoot this part of dada2.

Until then, I like your idea of 'titrating' the dada2 threads. Because we don't know the root cause of the error, a careful 'parameter sweep' (comp sci jargon for titration) could lend some useful clues! Or maybe the R wizards already know the solution and we will get a patch soon.
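
A sweep like that could be scripted roughly as follows. This is only a sketch: `run_dada2` and `sweep_threads` are hypothetical helper names, the qiime flags are copied from the commands earlier in this thread, and the thread counts in the usage line are just examples.

```shell
# Hypothetical helper: run one denoise-single job with N threads.
# Flags and file names follow the commands posted earlier in this thread.
run_dada2() {
    n=$1
    qiime dada2 denoise-single \
        --i-demultiplexed-seqs all_seqs.qza \
        --p-trunc-len 249 \
        --p-n-threads "$n" \
        --o-table "alltable_${n}.qza" \
        --o-representative-sequences "all_representatives_${n}.qza" \
        --o-denoising-stats "all_denoising_${n}.qza"
}

# Call the given runner once per thread count, keep each run's output in
# sweep_<N>.log, and print one "<N> ok" or "<N> FAILED" line per run.
sweep_threads() {
    runner=$1
    shift
    for n in "$@"; do
        if "$runner" "$n" >"sweep_${n}.log" 2>&1; then
            echo "$n ok"
        else
            echo "$n FAILED"
        fi
    done
}

# Usage (commented out so nothing is launched by accident):
# sweep_threads run_dada2 12 16 18 20 24
```

Each run goes through the whole pipeline, so a sweep over a full dataset can take many hours.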

The dada2 qiime plug-in runs a full pipeline, and different steps in the pipeline do different things and may need different amounts of memory. I was suggesting that you run dada2 with top open and keep an eye on available memory throughout the process.
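
If sitting in front of top is not practical, the same information could be logged to a file. A minimal sketch, assuming Linux (it reads MemAvailable from /proc/meminfo); `watch_mem` is a made-up helper name, not part of QIIME 2 or dada2.

```shell
# Run a command in the background and, while it is alive, append one line
# per interval to a log file: "<epoch seconds> <MemAvailable in kB>".
# Arguments: interval in seconds, log file, then the command to run.
watch_mem() {
    interval=$1
    logfile=$2
    shift 2
    "$@" &
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
        printf '%s %s\n' "$(date +%s)" "$avail" >>"$logfile"
        sleep "$interval"
    done
    # Return the exit status of the monitored command.
    wait "$pid"
}
```

For example, `watch_mem 10 mem_24.log` followed by the 24-thread qiime command from the first post would record available memory every 10 seconds until the run ends or crashes. The last lines of the log then show how much memory was left just before the failure.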

Colin

Nicholas_Bokulich · October 15, 2019, 2:35pm

The issue is not that you are requesting more threads than are available. It is that each thread you request will increase the cumulative memory load until it goes over the top. This is described on the dada2 issue tracker.

dada2 often takes a good chunk of memory even on a single thread. It is very difficult to predict how much memory a single run will use, but memory issues with a single thread are common enough (just look around this forum for examples). If you set threads=0 (to use all available threads), you multiply this by a factor of N, to the point that yes, you can even exceed 1 TB of RAM.

So unfortunately I do not think there is an option in dada2 to "run on N threads where N + 1 = the amount that will cause my system to explode", and looking at the dada2 issue tracker it sounds like the recommended workaround is to run on a single thread.

So if the issue for you is that you want to run this routinely without needing to "titrate" the number of threads, I think you should just choose a conservatively small number of threads (say 10) that would be unlikely to eat up too much cumulative memory.
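
One way to hard-code such a cap in a script, as a sketch: `pick_threads` is a hypothetical helper, and 10 is just the example figure from above, not a tested safe value for any particular dataset.

```shell
# Print the smaller of a chosen cap and the CPU count reported by nproc.
pick_threads() {
    cap=$1
    avail=$(nproc)
    if [ "$avail" -lt "$cap" ]; then
        echo "$avail"
    else
        echo "$cap"
    fi
}

# Usage (commented out so the sketch does not start a run by itself):
# n=$(pick_threads 10)
# then pass --p-n-threads "$n" to qiime dada2 denoise-single
```

On the 96-thread machine described above this would always give 10. The nproc check only matters if the same script is reused on a smaller machine.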

Titrating the threads on any single run is not worth it... you will end up spending more time waiting for successively smaller runs to crash than it would take a single low-threaded run to finish.