process killed qiime 2024.5 on Ubuntu 24.04.4 LTS when running on multicore

scilexenko · March 11, 2026, 8:27am

Hello,

My classify-sklearn run is being killed due to OOM (Out of Memory) despite having 192GB of RAM. The input rep-seqs file is small (8,331 sequences), but the system monitor shows 100% RAM and Swap usage before the process is terminated.

Command:

qiime feature-classifier classify-sklearn \
  --i-classifier unite_ver10_97_s_all_04.04.2024-Q2-2024.5.qza \
  --i-reads rep-seqs-dada2.qza \
  --o-classification taxonomy-ITS.qza \
  --p-n-jobs 12 \
  --verbose

Rep-Seq Stats:

Count: 8,331
Mean Length: 278.7 bp
Range: 200–388 bp

System Info:

CPU: Intel Core Ultra 9 275HX (24 cores)
RAM: 192GB (100% utilized during crash)
Swap: 8.6GB (100% utilized during crash)

I then tried to run on a single core with a reduced batch size:

qiime feature-classifier classify-sklearn \
  --i-classifier unite_ver10_97_s_all_04.04.2024-Q2-2024.5.qza \
  --i-reads rep-seqs-dada2.qza \
  --o-classification taxonomy-ITS.qza \
  --p-n-jobs 1 \
  --p-reads-per-batch 1000 \
  --verbose

That worked: reducing --p-n-jobs from 12 to 1 resolved the issue. Why did parallelization get the process killed? Are there specific scikit-learn or environment settings (e.g., JOBLIB_TEMP_FOLDER) that can mitigate this memory multiplication in QIIME 2? How does QIIME 2 manage caching? I previously ran a similar analysis on an M2 Mac without issues, even though it only had 16 GB of RAM.

Thank you,

Ivan

colinbrislawn · March 11, 2026, 6:16pm

Hello Ivan,

Is there any chance that this is a database I released (unite-train)?

If so, I trained it to include all sh_ suffixes for species hypotheses (SH).
Having all level 7 labels be unique increases memory usage a lot!
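Here is a back-of-the-envelope sketch of why the label count matters. It assumes the classifier is a naive Bayes model that keeps one dense row of 8-byte floats per label, across a hashed feature space of 8,192 columns. Both the 8,192 figure and the label counts below are illustrative assumptions, not numbers measured from this database, and `nb_table_mib` is a made-up helper name:

```shell
# Hypothetical helper: size in MiB of a dense classes-by-features table
# of 8-byte floats. It models only that one table, nothing else.
# $1: number of distinct labels (classes), $2: number of feature columns
nb_table_mib() {
  echo $(( $1 * $2 * 8 / 1024 / 1024 ))
}

nb_table_mib 50000 8192    # illustrative label count, prints 3125
nb_table_mib 100000 8192   # twice the labels, prints 6250
```

Under these assumptions the table grows linearly with the number of distinct labels, and if every parallel job holds its own copy, the total is multiplied again by --p-n-jobs.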

In my new version, I remove the species hypotheses suffix and it needs less RAM to run. If you are using my old database, try my new database:

This particular QIIME 2 plugin could be improved; see "Utilize one database instance for multiple classification jobs" (Issue #190, qiime2/q2-feature-classifier on GitHub).

If you have Python skills, take a look!
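In the meantime, one possible workaround is to pick --p-n-jobs from a memory budget rather than from the core count. This is a minimal sketch, assuming peak memory scales roughly as the number of jobs times the peak seen with --p-n-jobs 1 (an assumption suggested by the issue title, not something measured here). `safe_n_jobs` and all the numbers are hypothetical:

```shell
# Hypothetical helper: choose a job count that fits a memory budget.
# $1: RAM budget in GB
# $2: peak RAM in GB observed with --p-n-jobs 1 (measure this yourself)
# $3: upper bound on jobs, e.g. the number of cores you want to use
safe_n_jobs() {
  if [ "$2" -le 0 ]; then
    echo "per-job memory must be a positive integer" >&2
    return 1
  fi
  n=$(( $1 / $2 ))                        # integer division rounds down
  if [ "$n" -lt 1 ]; then n=1; fi         # always run at least one job
  if [ "$n" -gt "$3" ]; then n=$3; fi     # never exceed the requested cap
  echo "$n"
}

# Example with made-up numbers: 150 GB budget, 40 GB per job, cap of 12 jobs.
safe_n_jobs 150 40 12   # prints 3
```

The result could then be passed as --p-n-jobs in the classify-sklearn command above, keeping --p-reads-per-batch 1000.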