After using qiime1 for the past year, I have recently installed qiime2 on my personal computer in order to perform my new analyses.
I am working with bacterial 16S sequences (V5-V6 region, amplified with the 784F/1061R primers and sequenced on the MiSeq Illumina platform). I have 275 samples, which are in the paired-end demultiplexed fastq format.
I am using the qiime2-2017.10 version and was able to import my data with the “qiime tools import” function and visualize them thanks to “qiime demux summarize”.
I am now trying to quality-filter my sequences with “qiime dada2 denoise-paired”. The function has been running for more than 36 hours, and I am wondering whether the process is stuck. There is no error output, and I don’t know how to check whether anything is actually happening. Is it expected that denoise-paired would take so much time?
Thank you very much for your help!
Unfortunately, DADA2 is known for taking quite some time in QIIME 2, but there are a few tricks that can at least help speed things up a bit:
- Firstly, if you are using VirtualBox, I would recommend making sure you have assigned more than the default amount of RAM. The more you can afford to use, the faster it will run; for example, I often allot 8-10 GB of RAM on my personal computer, out of 16 GB.
- Furthermore, the --p-n-threads INTEGER parameter can be adjusted to allot more cores to the run (although it does seem that RAM is the limiting factor for DADA2).
- Using a server (either AWS, or potentially a university server that you have access to) can be extremely beneficial in speeding up this step as well.
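As a rough sketch of the --p-n-threads tip, the script below passes that option to denoise-paired. The file names and truncation lengths are placeholders made up for illustration, not values from this thread, and newer QIIME 2 releases may expect additional outputs, so check "qiime dada2 denoise-paired --help" for your version before running it.

```shell
#!/bin/sh
# Placeholder inputs: adjust file names and truncation lengths to your own data.
DEMUX="paired-end-demux.qza"   # hypothetical artifact produced by "qiime tools import"
TRUNC_F=240                    # hypothetical forward-read truncation length
TRUNC_R=200                    # hypothetical reverse-read truncation length
THREADS=4                      # number of cores to use; 0 means all available cores

# Only launch the run if the qiime command and the input artifact are both present.
if command -v qiime >/dev/null 2>&1 && [ -f "$DEMUX" ]; then
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs "$DEMUX" \
    --p-trunc-len-f "$TRUNC_F" \
    --p-trunc-len-r "$TRUNC_R" \
    --p-n-threads "$THREADS" \
    --o-table table.qza \
    --o-representative-sequences rep-seqs.qza \
    --verbose
  STATUS="ran"
else
  echo "qiime or $DEMUX not found; activate the QIIME 2 environment and check the path" >&2
  STATUS="skipped"
fi
```

Adding --verbose makes the plugin print its progress messages to the terminal, which is one way to see that the step is advancing.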
All of that being said, there is some good news! There haven’t been any errors yet, and once you do get past this step of the protocol, you will have gotten through the longest step by far!
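If you want to confirm that the run is still doing work, one option is to look at the process from a second terminal. This is only a sketch: the helper name show_activity and the search pattern "dada" are assumptions (the pattern is meant to match the command line of the qiime call or of the R script it launches), so adjust it to whatever your own process list shows.

```shell
#!/bin/sh
# Hypothetical helper: list CPU, memory, elapsed time, and cumulative CPU time
# for every process whose command line matches the given pattern.
show_activity() {
  pattern="$1"
  # pgrep -f matches against the full command line; "|| true" tolerates no match.
  pids=$(pgrep -f "$pattern" || true)
  if [ -z "$pids" ]; then
    echo "no process matching '$pattern'"
    return 1
  fi
  for pid in $pids; do
    ps -o pid,%cpu,%mem,etime,time,command -p "$pid"
  done
}

# Assumed pattern; change it if your process is named differently.
show_activity "dada" || true
```

If the TIME column keeps growing between two checks a minute or so apart, the process is still computing rather than stuck.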
Hope that helps!
Thank you very much for taking the time to answer so quickly, I really appreciate it!
As I am running QIIME2 on my Macbook Pro, the program should already be allocated as much RAM as possible… However, this is useful to know; if the process keeps running for much longer, I will abort it and re-try with the --p-n-threads specification.
In parallel, I will now try to run the function on the HPC cluster at my workplace; maybe that will improve the speed! It is encouraging to hear that the absence of errors so far is a good sign.
Thank you very much again,