Hi @fvnieuwe: I suspect that your command isn't hanging so much as just taking a really long time to complete. Part of why it is taking so long is that QIIME 2 currently runs dada2 in single-threaded mode. We are working on some upstream changes to support multi-threaded dada2 runs within QIIME 2, and we hope to have that ready by the next release (currently scheduled for Q1 2017).
If you want to confirm that dada2 denoise isn't actually hanging, you might try running it on a smaller dataset to see whether it completes successfully. For example, you could import a couple of your samples into a new .qza file and try denoising that.
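One way to build that smaller dataset is to copy the files for a couple of samples into their own directory before importing. The sketch below is only an illustration: the function name `copy_sample_subset` is made up for this example, and it assumes your files end in `.fastq.gz` and that each file name starts with the sample ID followed by an underscore (as in typical Illumina names like `sampleA_S1_L001_R1_001.fastq.gz`). Adjust it if your files are named differently.

```python
import shutil
from pathlib import Path


def copy_sample_subset(src_dir, dest_dir, sample_ids):
    """Copy the FASTQ files for the chosen samples into dest_dir.

    Assumption (not from QIIME 2 itself): every file name begins with
    '<sample_id>_'. Returns the sorted names of the files that were copied.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob("*.fastq.gz")):
        # Everything before the first underscore is treated as the sample ID.
        sample_id = path.name.split("_", 1)[0]
        if sample_id in sample_ids:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied


# Hypothetical usage (directory names and sample IDs are placeholders):
# copy_sample_subset("all_reads", "test_reads", {"sampleA", "sampleB"})
```

You would then import the contents of the new directory into a .qza file the same way you imported the full dataset.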
@gregcaporaso posted some options for working around the (currently very slow) dada2 denoise step. In addition to his suggestions, you could try running the underlying dada2 R tool directly, and then import your denoised data to continue analysis with QIIME 2.
Note that if your dataset was generated from multiple MiSeq runs, you'll want to use @gregcaporaso's third suggestion in that linked post. dada2 works best when each MiSeq run is denoised separately, so you'll get better results, and you'll get them in less time because the runs can be denoised in parallel and the results merged afterwards.
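To make the "denoise each run in parallel, then merge" idea concrete, here is a rough sketch. It is not how QIIME 2 or dada2 do this internally: `denoise_one_run` is a hypothetical placeholder for whatever you use to denoise a single run (for example, a function that launches your dada2 R script and reads back the counts), and the nested-dict table layout (sample ID -> sequence -> count) is just an assumption for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def merge_tables(tables):
    """Combine per-run tables by summing counts per sample and sequence.

    Each table is assumed to map sample ID -> {sequence: count}.
    """
    merged = {}
    for table in tables:
        for sample_id, counts in table.items():
            # Counter.update adds counts rather than replacing them.
            merged.setdefault(sample_id, Counter()).update(counts)
    return merged


def denoise_runs_in_parallel(runs, denoise_one_run, max_workers=None):
    """Call denoise_one_run on every run concurrently, then merge the results.

    Threads are used on the assumption that denoise_one_run spends its time
    waiting on an external process; they would not speed up pure-Python work.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        tables = list(pool.map(denoise_one_run, runs))
    return merge_tables(tables)
```

Because each run is handled independently, each run still gets its own error model, and the merge step only has to add up counts for sequences that show up in more than one run.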
I also merged my paired-end data using PEAR. I renamed and imported as suggested. I am able to do the denoise step on this. Is this a valid approach?
This approach is not recommended because dada2 needs the unjoined reads in order to produce the best results. Unfortunately we don’t have support for that hooked up yet; we expect paired-end dada2 support in the next release in addition to multithreading support.
For now, you’ll need to pick either R1 or R2 reads, import them into a .qza file, and denoise that. Another option is to join your reads (e.g. with PEAR, QIIME 1's join_paired_ends.py, or some other read-joining approach) and cluster/denoise the joined reads with a different tool that supports sequence data that has already been joined. For example, you could do all these steps in QIIME 1 (e.g. join_paired_ends.py, pick_open_reference_otus.py) and then import the resulting .biom file into QIIME 2.
Apologies that this process is a pain to work around right now -- the next QIIME 2 release should make this much easier. Let us know how it goes!