SEPP Plugin on WSL - Returns non-zero exit status and enters a zombie state

Hello!

I’m experiencing issues running the fragment-insertion SEPP plugin in QIIME 2 within a Windows Subsystem for Linux (WSL 2) environment. The SEPP process repeatedly fails with the error 'non-zero exit status 1', and when I try to run it with debug enabled, some processes become zombies (Z state) and the command appears to stall indefinitely. I’ve tried several troubleshooting steps but haven’t resolved the issue, so any insights would be greatly appreciated!

System Details

  • OS: Windows Subsystem for Linux (WSL 2), Ubuntu 20.04
  • QIIME 2 Version: 2024.10
  • Memory & Swap: 16GB memory, 12GB swap
  • Processor: 20 cores allocated in WSL

Command

(qiime2-amplicon-2024.10) lisou@DESKTOP-OO5NL0A:~/Nested$ qiime fragment-insertion sepp \
> --i-representative-sequences Nested-rep-seqs.qza \
> --i-reference-database sepp-refs-silva-128.qza \
> --p-threads 4 \
> --o-tree Nested-merged-sepp-tree.qza \
> --o-placements Nested-merged-sepp-placements.qza
Plugin error from fragment-insertion:

  Command '['run-sepp.sh', '/tmp/qiime2/lisou/data/35e66cf8-05f4-462a-b349-4325ba34a5ef/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '8', '-A', '1000', '-P', '5000', '-a', '/tmp/qiime2/lisou/data/e44b5e78-31e5-4a0f-9041-494bc3ca2df2/data/aligned-dna-sequences.fasta', '-t', '/tmp/qiime2/lisou/data/e44b5e78-31e5-4a0f-9041-494bc3ca2df2/data/tree.nwk', '-r', '/tmp/qiime2/lisou/data/e44b5e78-31e5-4a0f-9041-494bc3ca2df2/data/raxml-info.txt']' returned non-zero exit status 1.

Debug info has been saved to /tmp/qiime2-q2cli-err-2swu9_nc.log

Log info:

Removing /tmp/tmp.0gkiL3LLUi/sepp-tmp-RZvEqxZmcq
Traceback (most recent call last):
  File "/home/lisou/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 530, in __call__
    results = self._execute_action(
  File "/home/lisou/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2cli/commands.py", line 602, in _execute_action
    results = action(**arguments)
  File "", line 2, in sepp
  File "/home/lisou/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 299, in bound_callable
    outputs = self.callable_executor(
  File "/home/lisou/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/qiime2/sdk/action.py", line 570, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/lisou/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_fragment_insertion/_insertion.py", line 75, in sepp
    _run(str(representative_sequences.file.view(DNAFASTAFormat)),
  File "/home/lisou/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/site-packages/q2_fragment_insertion/_insertion.py", line 54, in _run
    subprocess.run(cmd, check=True, cwd=cwd)
  File "/home/lisou/anaconda3/envs/qiime2-amplicon-2024.10/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run-sepp.sh', '/tmp/qiime2/lisou/data/35e66cf8-05f4-462a-b349-4325ba34a5ef/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '4', '-A', '1000', '-P', '5000', '-a', '/tmp/qiime2/lisou/data/e44b5e78-31e5-4a0f-9041-494bc3ca2df2/data/aligned-dna-sequences.fasta', '-t', '/tmp/qiime2/lisou/data/e44b5e78-31e5-4a0f-9041-494bc3ca2df2/data/tree.nwk', '-r', '/tmp/qiime2/lisou/data/e44b5e78-31e5-4a0f-9041-494bc3ca2df2/data/raxml-info.txt']' returned non-zero exit status 1.
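
The traceback only reports the exit status, not what run-sepp.sh itself printed. As a rough sketch (not part of QIIME 2 or SEPP; the helper name run_and_report and its parameters are made up for illustration), the same command could be re-run from Python with its output captured and a timeout so that it cannot stall forever:

```python
import subprocess


def run_and_report(cmd, timeout_s=None, tail_lines=20):
    """Run cmd and return (exit_code, tail of combined stdout/stderr).

    exit_code is None if the command was still running after timeout_s
    seconds. Hypothetical helper, for illustration only.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,
            timeout=timeout_s,
        )
        code, output = proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the direct child on timeout; processes
        # that child started may survive and need separate cleanup.
        code = None
        output = exc.output or ""
        if isinstance(output, bytes):  # partial output can arrive as bytes
            output = output.decode(errors="replace")
    return code, "\n".join(output.splitlines()[-tail_lines:])
```

Calling it with the argument list from the error message (for example with timeout_s=600) should return the exit code and the last lines SEPP printed. This assumes the conda environment is active so that run-sepp.sh is on PATH, and that the /tmp/qiime2 input files still exist, which may not be the case after the run has finished.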

I've tried reducing --p-threads to 2 (from 8 originally), thinking this might reduce memory pressure or process conflicts, but this did not resolve the issue.
I've also checked and increased the memory and CPU allocation. There’s plenty of available memory and swap, so I don’t believe the issue is due to resource exhaustion.
I am also unable to run with debug enabled, as QIIME stalls.

Command with debug

qiime fragment-insertion sepp \
  --i-representative-sequences Nested-rep-seqs.qza \
  --i-reference-database sepp-refs-silva-128.qza \
  --p-threads 2 \
  --o-tree Nested-merged-sepp-tree.qza \
  --o-placements Nested-merged-sepp-placements.qza \
  --verbose \
  --p-debug

When I run this, top shows that some of the run_sepp.py processes enter a zombie state after a few minutes. The command then stalls indefinitely and CPU usage drops to zero, indicating that SEPP is no longer progressing. Here’s what I observed in top:

Tasks:  13 total,   1 running,  10 sleeping,   0 stopped,   2 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  26060.9 total,  16683.2 free,   6164.4 used,   3213.3 buff/cache
MiB Swap:  12288.0 total,  12288.0 free,      0.0 used.  19536.0 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    1 root      20   0    2296   1324   1108 S   0.0   0.0   0:00.18 init
    8 root      20   0    2164    360      0 S   0.0   0.0   0:00.00 init
    9 root      20   0    2172    360      0 S   0.0   0.0   0:00.00 init
   10 lisou     20   0   10172   5108   3276 S   0.0   0.0   0:00.04 bash
  496 root      20   0    2644    564      0 S   0.0   0.0   0:00.00 init
  497 root      20   0    2644    564      0 S   0.0   0.0   0:00.05 init
  498 lisou     20   0   10172   5176   3404 S   0.0   0.0   0:00.01 bash
  517 lisou     20   0   10872   3696   3144 R   0.0   0.0   0:00.32 top
  635 lisou     20   0 5456316   2.4g 133432 S   0.0   9.5   0:30.78 qiime
  664 lisou     20   0    8624   3224   2948 S   0.0   0.0   0:00.00 run-sepp.sh
  668 lisou     20   0 3945268   3.6g  11572 S   0.0  14.0   2:11.24 run_sepp.py
  670 lisou     20   0       0      0      0 Z   0.0   0.0   0:01.33 run_sepp.py
  671 lisou     20   0       0      0      0 Z   0.0   0.0   0:01.39 run_sepp.py
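
top does not show which process the zombies belong to. A zombie is a child that has exited but whose parent has not yet collected its exit status, so the parent PID is the process worth looking at. A minimal sketch for listing zombies together with their parent, assuming the procps version of ps found on Ubuntu (the function names are made up for illustration):

```python
import subprocess


def parse_zombies(ps_output):
    """Return (pid, ppid, command) for each process in state Z.

    Expects "PID PPID STAT COMMAND" lines as printed by
    `ps -eo pid,ppid,stat,comm`; the header line is skipped.
    """
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header or malformed line
        pid, ppid, stat, comm = fields
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), comm.strip()))
    return zombies


def list_zombies():
    """Run ps and report the zombie processes currently on the system."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_zombies(out)
```

Running list_zombies() while the command is stalled should show PIDs 670 and 671 along with the PID of whichever process is supposed to reap them.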

Any help would be appreciated. I have looked at the other posts about non-zero exit status 1 in WSL environments, and the only solution I've seen is switching to a different OS. I'm stumped, as I'm also unable to run with debug!