Why not parallelize/multi-thread (nearly) every step?

jpetteng · December 27, 2017, 8:31pm

I am working with a dataset that, after importing with the fastq manifest option, produces a 17GB .qza file. To process it properly I am using qiime vsearch join-pairs and qiime quality-filter q-score-joined, neither of which can take advantage of multiple threads, even though both are ideally suited to it.

Even on a high-compute node, those two steps are taking hours.

I am a little surprised by the lack of emphasis in QIIME2 on efficient processing (e.g., parallelization or multi-threading), given the ever-increasing size of the datasets researchers are producing and would like to process with something like QIIME2.

Sorry for the rant.

-jamie