DADA2 denoise running for 3 days

Shah1 · October 7, 2024, 5:28am

Hi,

I am running dada2 denoise on a total of 84 paired-end samples, each containing about 1 million reads on average. How long would it take to finish?

The command I used is:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs sample.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --o-representative-sequences rep-seqs.qza \
  --o-table table.qza \
  --p-n-threads 20 \
  --o-denoising-stats dada2-stats.qza

timanix · October 7, 2024, 8:18am

Hello and welcome to the forum!

I don't know how long it will take for dada2 to finish. 1 million reads per sample is a lot, and I would subsample samples before running dada2. Something around 10% should be enough...

Shah1 · October 7, 2024, 9:11am

Hi,
This is the verbose output. Does everything look fine?

R version 4.3.3 (2024-02-29)
Loading required package: Rcpp
DADA2: 1.30.0 / Rcpp: 1.0.12 / RcppParallel: 5.1.6
2) Filtering ..........................................
3) Learning Error Rates
318498129 total bases in 1073064 reads from 1 samples will be used for learning the error rates.
303974019 total bases in 1073064 reads from 1 samples will be used for learning the error rates.
3) Denoise samples .........................................

timanix · October 7, 2024, 9:16am

It looks normal, but:

I don't know how long it will keep running.
It may crash after a week.

So I would either wait, or abort the run, subsample the samples to a fraction of 0.1 (or another value, depending on the samples overview), and rerun DADA2. I prefer subsampling, since 1 million reads per sample is a lot and will slow down other analyses. Moreover, you may not be happy with DADA2's output from the parameters you used and decide to rerun it with other settings. Then you will be waiting again...
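
To put rough numbers on the suggestion above, here is a minimal back-of-the-envelope sketch. It uses only the figures from this thread (84 samples, about 1 million reads each, a fraction of 0.1); the variable names are made up for illustration, and it says nothing about actual runtime.

```python
# Rough read-count arithmetic, using the figures quoted in this thread.
n_samples = 84                 # paired-end samples in the run
reads_per_sample = 1_000_000   # approximate average reads per sample
fraction = 0.1                 # suggested subsampling fraction

total_reads = n_samples * reads_per_sample
expected_after = round(total_reads * fraction)

print(f"read pairs before subsampling: {total_reads:,}")
print(f"expected read pairs after subsampling: {expected_after:,}")
print(f"expected read pairs per sample: {round(reads_per_sample * fraction):,}")
```

So DADA2 would see roughly one tenth of the input, which is why a rerun with different settings becomes much less painful.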

Shah1 · October 9, 2024, 5:48am

So finally it finished.
I want to check my understanding: subsampling samples to a fraction of 0.1 in the context of DADA2 means instructing the tool to randomly select 10% of the total reads from each sample for processing, rather than using the full dataset. Am I right?

timanix · October 9, 2024, 6:56am

That is correct, but in that case one should subsample the samples before DADA2. DADA2 itself is not instructed to subsample; it simply works with an already subsampled dataset.
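
As an illustration only, here is a small sketch of what "randomly keep about 10% of the read pairs" can look like. It is not the plugin's actual code: the function name `subsample_pairs`, the toy read IDs, and the choice to keep each pair independently with probability `fraction` are all assumptions made for this example. The one point it is meant to show is that forward and reverse reads are kept or dropped together, so the data stays paired before it goes into DADA2.

```python
import random


def subsample_pairs(forward, reverse, fraction, seed=0):
    """Keep each forward/reverse read pair with probability `fraction`.

    Hypothetical helper for illustration; reads are represented by plain strings.
    """
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse reads must be paired one-to-one")
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    kept_forward, kept_reverse = [], []
    for fwd, rev in zip(forward, reverse):
        # One random draw per pair: both mates are kept or both are dropped.
        if rng.random() < fraction:
            kept_forward.append(fwd)
            kept_reverse.append(rev)
    return kept_forward, kept_reverse


# Toy data standing in for one sample's reads.
forward = [f"read{i}/1" for i in range(10_000)]
reverse = [f"read{i}/2" for i in range(10_000)]

kept_f, kept_r = subsample_pairs(forward, reverse, fraction=0.1)
print(len(kept_f), "of", len(forward), "pairs kept")
```

With this approach the kept count is close to, but not exactly, 10% of the input, because each pair is decided independently.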

Shah1 · October 9, 2024, 7:13am

Thanks, timanix, for helping me out. It would be great if you could guide me on how to subsample the datasets before DADA2.

timanix · October 9, 2024, 7:26am

You can use the link from my comment to see the plugin options. Just decide which fraction to subsample to, based on your samples' sequencing depth.
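
For readers landing here later, a minimal sketch of how such a call might be assembled, assuming the q2-demux plugin's `subsample-paired` action is the one meant. The input name `sample.qza` comes from the command earlier in this thread; the output name `sample-subsampled.qza` and the helper `subsample_command` are hypothetical. The script only builds and prints the command; running it requires an activated QIIME 2 environment, and the options should be checked against `qiime demux subsample-paired --help` for your version.

```python
import shlex


def subsample_command(demux_in, demux_out, fraction):
    """Build the argument list for a paired-end subsampling call (hypothetical helper)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    return [
        "qiime", "demux", "subsample-paired",
        "--i-sequences", demux_in,              # demultiplexed paired-end reads
        "--p-fraction", str(fraction),          # fraction of reads to keep
        "--o-subsampled-sequences", demux_out,  # output artifact (name is an assumption)
    ]


cmd = subsample_command("sample.qza", "sample-subsampled.qza", 0.1)
print(shlex.join(cmd))

# To actually run it inside a QIIME 2 environment, one could use:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The subsampled artifact would then be passed to `qiime dada2 denoise-paired` via `--i-demultiplexed-seqs` in place of the original file.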
