Dear QIIME2 developers,
I am trying out q2-moshpit using the qiime2-shotgun-2024.2 Linux deployment on an HPC cluster.
My capacity is capped at 16 CPUs and 128 GB of memory per node.
Most functions work well and run relatively fast whenever multithreading is possible. For example, qiime moshpit eggnog-diamond-search runs at an acceptable pace on contigs assembled from a cleaned metagenomic dataset of 50 million reads from an activated sludge sample.
The annotation function qiime moshpit eggnog-annotate, however, is fairly slow (about 15 minutes per 500 queries). I haven't checked how many queries remain, but I am at 11,000 and counting. Memory usage for this task seems low; it is conveniently reported to stdout for each block of 500 queries (~1.5% used, 98.5% available). Could you enable a --p-n-threads parameter for this function to allow multithreading?
Just an idea that may be worth considering.
As I am now dealing with 12 such sequenced metagenomic samples and 35 slightly larger accompanying paired (ribo-depleted) metatranscriptomic datasets, I am hoping for optimized runtime.
Thanks guys (and thanks for the continuous development work in general over the years).
Cheers,
Pieter