DADA2 fails silently on SLURM cluster (QIIME2 2024.10) — No .qza outputs, no error messages

Context

Hi QIIME2 team,

I'm running QIIME2 2024.10 on a SLURM-managed cluster (HPC from PUCV Chile), using Miniconda with a validated environment.

I’m trying to run qiime dada2 denoise-paired on a validated .qza file (demuxed, paired-end 16S reads, 256 bp), but the command never produces any output. The .out, .err, and custom debug .log files do not show any progress beyond "Running DADA2...". No errors appear either.

What I tried

  • Input validated via qiime tools validate
  • Checked that .qza files exist and are readable
  • Executed DADA2 via SLURM using 8 and 12 hour limits
  • Added custom debug messages to SLURM script (via echo) → confirms that QIIME is activated and DADA2 is called
  • Used --verbose flag, but no output beyond "[5] Executing DADA2..."
  • Monitored with top and ps aux → No qiime/dada2 process active during job runtime
  • Tried running manually in interactive node → same silent behavior

SLURM script (summary)

#!/bin/bash
#SBATCH --job-name=dada2_si
#SBATCH --partition=CPU
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --time=12:00:00
#SBATCH --output=dada2_sin_litoralis_debug_%j.out
#SBATCH --error=dada2_sin_litoralis_debug_%j.err

# Activate conda
source ~/miniconda3/etc/profile.d/conda.sh
conda activate qiime2-amplicon-2024.10

cd /work/katherine.munoz/metabarcoding_sin_litoralis

# Validate
qiime tools validate demux-sin-litoralis.qza

# Run DADA2
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-sin-litoralis.qza \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 240 \
  --o-representative-sequences rep-seqs-sin-litoralis.qza \
  --o-table table-sin-litoralis.qza \
  --o-denoising-stats stats-sin-litoralis.qza \
  --verbose

Environment

  • QIIME2 version: 2024.10
  • Python: 3.10
  • Platform: SLURM-managed HPC (CentOS/RHEL, 191 GB RAM)
  • Reads: Paired-end, 256 bp, demuxed
  • Input validated with qiime tools validate
  • Error log: No errors, just timeout after 8 or 12 hours

What I suspect

Could this be related to:

  • DADA2 being unusually slow on clusters?
  • A conflict between SLURM resource allocation and the way DADA2 spawns R processes?
  • Something broken with multithreading?

Any diagnostic suggestions or ideas for debugging would be deeply appreciated.

Thanks in advance!

— Katherine

Hey @Katherine-KMC,

Thanks for the nice writeup of your issue!

I am going to give a quick guess:

Your data may be quite large, and qiime tools validate can be quite silent while it works.
That could easily eat a few hours of your walltime budget.

The next likely problem is that you haven't set any parallelism for DADA2, which would speed it up quite a bit.

I see -c 8 in your sbatch parameters, so the equivalent for DADA2 is to set:

--p-n-threads 8

That will make it so that you get to use those cores that were provisioned.
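
As a rough sketch of how that could look in your batch script: SLURM sets SLURM_CPUS_PER_TASK when a job requests -c, so the thread count can follow the allocation instead of being hard-coded. The helper names dada2_threads and run_dada2 are made up for illustration, the file names are the ones from your post, and the fallback to 1 thread outside of a SLURM job is my assumption.

```shell
# Hypothetical helper (not part of QIIME 2): how many threads DADA2 should use.
# SLURM exports SLURM_CPUS_PER_TASK when the job requests -c / --cpus-per-task;
# outside of such a job, fall back to a single thread.
dada2_threads() {
  echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Same command as in the original script, plus --p-n-threads.
run_dada2() {
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux-sin-litoralis.qza \
    --p-trunc-len-f 240 \
    --p-trunc-len-r 240 \
    --p-n-threads "$(dada2_threads)" \
    --o-representative-sequences rep-seqs-sin-litoralis.qza \
    --o-table table-sin-litoralis.qza \
    --o-denoising-stats stats-sin-litoralis.qza \
    --verbose
}

# Log the thread count so it shows up in the .out file.
echo "DADA2 will use $(dada2_threads) thread(s)"

# Only run when qiime is on PATH, i.e. the conda environment is active.
if command -v qiime >/dev/null 2>&1; then
  run_dada2
else
  echo "qiime not found on PATH; is the conda environment active?" >&2
fi
```

With #SBATCH -c 8 this should end up passing --p-n-threads 8, and the echoed line gives you one more checkpoint in the log.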


Now, all that said, I would expect to see a difference in output between the 8 hr and 12 hr runs, since even single-threaded you'd still see a bit of output. But perhaps you only saw DADA2 output at the end of the 12 hr job? That would be consistent with validate eating all of your walltime while it thoroughly inspects the format of each record in your fastq files.
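
One way to check where the walltime actually goes is to timestamp each step in the job script. Below is a minimal sketch; the timed helper is hypothetical (not a QIIME 2 or SLURM feature), and whether --level min is enough validation for your data is a judgment call. By default qiime tools validate runs at --level max, which is the slow, thorough check.

```shell
# Hypothetical helper: run a command, logging start/end timestamps,
# exit status, and elapsed seconds to stdout (the SLURM .out file).
timed() {
  local label start end status
  label="$1"
  shift
  start=$(date +%s)
  echo "[$(date '+%F %T')] START ${label}"
  "$@"
  status=$?
  end=$(date +%s)
  echo "[$(date '+%F %T')] END ${label} (exit ${status}, $((end - start)) s)"
  return "${status}"
}

# Only run when qiime is on PATH, i.e. the conda environment is active.
if command -v qiime >/dev/null 2>&1; then
  # Minimal validation instead of the default maximal check.
  timed validate qiime tools validate --level min demux-sin-litoralis.qza
fi
```

If the END line for validate only appears hours after its START line, that step is what is eating the time budget, not DADA2. The same wrapper can be put in front of the denoise-paired call.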
