I am dealing with a dataset that, after importing with the fastq manifest option, results in a 17 GB .qza file. To process it I am using qiime vsearch join-pairs and qiime quality-filter q-score-joined, neither of which can take advantage of multiple threads, even though both are ideally suited to it.
Even on a high-compute node, those two steps are taking hours.
I am a little surprised by the lack of emphasis in QIIME2 on efficient processing (e.g., parallelization or multi-threading), given the ever-increasing size of the datasets that researchers are producing and would like to process with something like QIIME2.
Sorry for the rant.
-jamie