DADA2 for single-end sequences gets stuck at the Learning Error Rates step

Hey everyone,
I am new to this forum and hope that I have chosen the right category.

I am running DADA2 denoising on single-end 16S sequences, but the following step has not progressed for several hours. It does not show an error; it just never finishes.

  1. Learning Error Rates
    219883600 total bases in 1099418 reads from 13 samples will be used for learning the error rates.

My command is:
qiime dada2 denoise-single \
  --i-demultiplexed-seqs bacteria_demux.qza \
  --p-trunc-len 200 \
  --o-representative-sequences bacteria_rep-seqs-dada2.qza \
  --o-table bacteria_table-dada2.qza \
  --o-denoising-stats bacteria_stats-dada2.qza \
  --verbose

I have used DADA2 on other samples over the last few days, and it usually took no longer than 30 to 60 minutes. I have read in other posts that it is sometimes normal for it to take longer, but the strange thing is this:
I previously ran it on exactly the same samples as paired-end sequences, and it took no more than one hour (the paired-end demux file is 4.12 GB). Because my reverse reads did not show good quality, I tried using only the forward reads (demux file 1.61 GB), and now it no longer finishes. I have tried --p-trunc-len 200 and 235, but the problem is the same.

In case it helps, here is the quality plot for the forward reads:

I have rerun it several times over the last two days. At first I thought I needed to free up memory, but now I have plenty, and I don't understand why it works with the paired-end but not with the single-end sequences. I would be very grateful if someone could help me with this.

Thanks in advance!

Hi @diggingdirt,

Welcome to the :qiime2: forum! Apologies for the delay in response here.

I suspect you are running into some sort of memory allocation issue here. Can you tell me which version of QIIME 2 you are using and how you have it installed (Linux, macOS, WSL, virtual machine, etc.)?

Thanks! :lizard:

Hey @lizgehret

I am using QIIME 2 2022.2, installed via WSL. In the end, the DADA2 denoising of my single-end sequences did finish, but it took almost 7 hours for the smaller (?!) demux file. Do you have any suggestions for speeding it up?

Thank you very much for your help!

Hi @diggingdirt,

There are a couple of options you can try moving forward to get a better idea of why these are such long running jobs.

After starting qiime dada2 denoise-single (or any other command that seems to be taking a while to complete), you can open another terminal window, type 'top', and press enter. This shows the processes currently running on your machine. Look for ‘R’ in the ‘Command’ column (DADA2 runs in R) and you’ll be able to see its %CPU usage. If you are having a difficult time finding the process, you can also look down the “State” column to see which processes are “running”. This may help you determine whether a specific process is using a high %CPU.
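
If you would rather have a one-off snapshot than the interactive top screen, something like the following should work. It is only a sketch and assumes the procps version of ps that ships with most Linux distributions (including those typically used under WSL); the function name busiest is made up for this example.

```shell
# Print the header plus the N processes currently using the most CPU.
# Assumes Linux procps `ps` (supports --sort); N defaults to 5.
busiest() {
  local n="${1:-5}"
  ps -eo pid,stat,pcpu,pmem,etime,comm --sort=-pcpu | head -n "$((n + 1))"
}

busiest 5
```

A STAT value starting with R means the process is currently running. If the R process sits at roughly 100 %CPU, it is most likely busy on a single core rather than hung.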

Another option to simply speed up processing (if this is only occurring for this command) is the --p-n-threads parameter. It sets the number of threads to use for multithreaded processing, so any integer above 1 enables multithreading. If 0 is provided, all available cores on your machine will be used.
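
For example, your command with the thread option added might look like the sketch below. The helper names pick_threads and run_denoise are just for illustration, and leaving one core free is my own cautious default rather than anything QIIME 2 requires. It also assumes nproc is available, which it is on most Linux/WSL setups.

```shell
# Choose a thread count: all cores but one, and never fewer than 1.
# Falls back to 1 if `nproc` is not available.
pick_threads() {
  local cores
  cores=$(nproc 2>/dev/null || echo 1)
  if [ "$cores" -gt 1 ]; then
    echo "$((cores - 1))"
  else
    echo 1
  fi
}

# Same denoise-single call as posted above, plus --p-n-threads.
run_denoise() {
  qiime dada2 denoise-single \
    --i-demultiplexed-seqs bacteria_demux.qza \
    --p-trunc-len 200 \
    --p-n-threads "$(pick_threads)" \
    --o-representative-sequences bacteria_rep-seqs-dada2.qza \
    --o-table bacteria_table-dada2.qza \
    --o-denoising-stats bacteria_stats-dada2.qza \
    --verbose
}

echo "Would run DADA2 with $(pick_threads) thread(s)"
```

Calling run_denoise inside your activated QIIME 2 environment starts the job. Passing --p-n-threads 0 instead would use every available core.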

Hope this helps! Cheers :lizard:

Thank you very much for the help!!!
I had tried different --p-n-threads values before, but I didn't know how high I could go. Now
I used --p-n-threads 0 and it completed after 15 minutes :slight_smile:

@diggingdirt,

Glad it worked for you! As a caution for the future: for processes that use a lot of RAM per thread (such as running a taxonomic classifier!), increasing the number of threads to the maximum can actually slow things down badly or even prevent completion.


Unfortunately, I noticed exactly that when I trained a classifier with --p-n-threads 0 after reading @lizgehret's advice. I suppose I was a bit overeager to use this option.
Thank you for the explanation!

