Fragment-insertion sepp run fails

Hi there,

I'm running the fragment-insertion command and the run stops after a few minutes.
I'm working on a machine with 64 GB of RAM, using Ubuntu under WSL. The memory allocated to the VM is 32 GB.
I tried with 16 cores, 32 cores, and #NCORES=1.
The problem is the same.

  • qiime2-amplicon-2024.2
    -Conda
  • qiime fragment-insertion sepp --i-representative-sequences dada2_output/rep_seqs_final.qza --i-reference-database /home/sathya/sepp-refs-gg-13-8.qza --o-tree asvs-tree.qza --o-placements insertion-placements.qza --p-threads 32 --verbose

No error message. The run stops after a few minutes. See the htop results below.


Thanks!

Hello @Sathya-Am, you're saying you run the command on the command line with the --verbose flag and a few minutes later it just stops running without saying anything? That's very unusual.

Can you please run the following command and post the entire command and the entire output here? Thank you.

/usr/bin/time -v qiime fragment-insertion sepp --i-representative-sequences dada2_output/rep_seqs_final.qza --i-reference-database /home/sathya/sepp-refs-gg-13-8.qza --o-tree asvs-tree.qza --o-placements insertion-placements.qza --p-threads 32 --verbose

Hi @Oddant1, I didn't get any output, even without the --verbose flag.

This is what it looks like in my command prompt. After some time it either fails or stays frozen forever.

This is what shows when I run htop.

Hi @Oddant1, here's the output.

Removing /tmp/tmp.J7S1bxwBSP/sepp-tmp-4wBVzwdupd
Traceback (most recent call last):
  File "/home/sathya/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2cli/commands.py", line 520, in __call__
    results = self._execute_action(
  File "/home/sathya/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2cli/commands.py", line 581, in _execute_action
    results = action(**arguments)
  File "", line 2, in sepp
  File "/home/sathya/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
    outputs = self.callable_executor(
  File "/home/sathya/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 566, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/sathya/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2_fragment_insertion/_insertion.py", line 75, in sepp
    _run(str(representative_sequences.file.view(DNAFASTAFormat)),
  File "/home/sathya/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/site-packages/q2_fragment_insertion/_insertion.py", line 54, in _run
    subprocess.run(cmd, check=True, cwd=cwd)
  File "/home/sathya/miniconda3/envs/qiime2-amplicon-2024.2/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run-sepp.sh', '/tmp/qiime2/sathya/data/3fc16008-83f8-479f-9eec-f71736834c43/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '32', '-A', '1000', '-P', '5000', '-a', '/tmp/qiime2/sathya/data/a14c6180-506b-4ecb-bacb-9cb30bc3044b/data/aligned-dna-sequences.fasta', '-t', '/tmp/qiime2/sathya/data/a14c6180-506b-4ecb-bacb-9cb30bc3044b/data/tree.nwk', '-r', '/tmp/qiime2/sathya/data/a14c6180-506b-4ecb-bacb-9cb30bc3044b/data/raxml-info.txt']' returned non-zero exit status 1.

Plugin error from fragment-insertion:

Command '['run-sepp.sh', '/tmp/qiime2/sathya/data/3fc16008-83f8-479f-9eec-f71736834c43/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '32', '-A', '1000', '-P', '5000', '-a', '/tmp/qiime2/sathya/data/a14c6180-506b-4ecb-bacb-9cb30bc3044b/data/aligned-dna-sequences.fasta', '-t', '/tmp/qiime2/sathya/data/a14c6180-506b-4ecb-bacb-9cb30bc3044b/data/tree.nwk', '-r', '/tmp/qiime2/sathya/data/a14c6180-506b-4ecb-bacb-9cb30bc3044b/data/raxml-info.txt']' returned non-zero exit status 1.

See above for debug info.
Command exited with non-zero status 1
Command being timed: "qiime fragment-insertion sepp --i-representative-sequences dada2_output/rep_seqs_final.qza --i-reference-database /home/sathya/sepp-refs-gg-13-8.qza --o-tree asvs-tree.qza --o-placements insertion-placements.qza --p-threads 32 --verbose"
User time (seconds): 32904.50
System time (seconds): 252.10
Percent of CPU this job got: 2967%
Elapsed (wall clock) time (h:mm:ss or m:ss): 18:37.22
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 3833244
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 207
Minor (reclaiming a frame) page faults: 12644812
Voluntary context switches: 593378
Involuntary context switches: 247138
Swaps: 0
File system inputs: 2944
File system outputs: 5543536
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1
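
For anyone reading the traceback above: the last frames show the plugin calling subprocess.run(cmd, check=True), so any non-zero exit from run-sepp.sh surfaces as a CalledProcessError that carries only the command and its exit status. A minimal sketch of that mechanism follows. It uses a stand-in child process (the current Python interpreter exiting with status 1) rather than run-sepp.sh, and the helper name run_checked is hypothetical.

```python
import subprocess
import sys

# Stand-in for run-sepp.sh: a child process that exits with status 1.
# (Assumption for illustration only; this is not the real SEPP command.)
cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]


def run_checked(command):
    """Run a command with check=True and return its exit status."""
    try:
        subprocess.run(command, check=True)
        return 0
    except subprocess.CalledProcessError as err:
        # The exception carries only the exit status and the command line,
        # so it cannot explain why the child process failed.
        return err.returncode


exit_status = run_checked(cmd)
print(f"child exited with status {exit_status}")
```

This is why the QIIME 2 error alone does not say what went wrong inside SEPP: the useful detail has to come from SEPP's own logs.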

This is very interesting.

The maximum resident set size of 3833244 kB means your memory usage peaked at less than 4 GB, which is nowhere near enough to cause this failure on its own. That said, it is still possible that this is related to OOM and that the process was killed simply for requesting more memory than was available.
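
As a rough sanity check on those numbers, here is a small sketch that redoes the arithmetic from the /usr/bin/time -v output. It assumes the elapsed time "18:37.22" is in m:ss format and that "kbytes" means 1024-byte units; the function names are made up for illustration.

```python
# Values copied from the /usr/bin/time -v output in this thread.
user_s = 32904.50
system_s = 252.10
elapsed_s = 18 * 60 + 37.22  # "18:37.22" read as m:ss (assumption)
max_rss_kb = 3833244


def cpu_percent(user, system, elapsed):
    """CPU seconds divided by wall-clock seconds, as a percentage."""
    return 100.0 * (user + system) / elapsed


def kb_to_gib(kb):
    """Convert kilobytes (assumed to be 1024-byte units) to GiB."""
    return kb / (1024 ** 2)


print(f"CPU: {cpu_percent(user_s, system_s, elapsed_s):.0f}%")
print(f"Peak RSS: {kb_to_gib(max_rss_kb):.2f} GiB")
```

The CPU figure comes out close to the 2967% that time reports, which suggests that roughly 30 of the 32 threads were busy for the whole 18 minutes, while peak resident memory stayed under 4 GiB.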

Can you please run it with --p-threads 1 and post the results again?

/usr/bin/time -v qiime fragment-insertion sepp --i-representative-sequences dada2_output/rep_seqs_final.qza --i-reference-database /home/sathya/sepp-refs-gg-13-8.qza --o-tree asvs-tree.qza --o-placements insertion-placements.qza --p-threads 1 --verbose

You said it still fails, so I am expecting to see a failure, but it might fail differently and show us something else.

EDIT: I looked at run-sepp.sh and am seeing this chunk here

# from http://stackoverflow.com/a/2130323
function cleanup {
  exitcode=`echo $?`
  if [ $exitcode != 0 ] && [ ! -z "$printDebug" ];
  then
    echo "========= Execution of SEPP failed with exit code $exitcode =================";
    echo "temporary working directories are NOT deleted for further inspection:";
    echo "  \$tmp = $tmp";
    echo "  \$tmpssd = $tmpssd";
    echo "--------- Content of STDOUT -----------------------------------------";
    cat sepp-$name-out.log
    echo "--------- Content of STDERR -----------------------------------------";
    cat sepp-$name-err.log
    echo "=====================================================================";
  else
    echo "Removing $tmp";
    rm -r $tmp
    rm -r $tmpssd
  fi
  unset TMPDIR
}
trap cleanup EXIT

We are seeing that "Removing $tmp" message in your output.

I am wondering if it maybe doesn't have the permission to remove that directory. Is that directory still there after the command finishes running?
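
One thing the chunk from run-sepp.sh implies: the debug branch is taken only when the exit code is non-zero and printDebug is non-empty. In every other case the script prints "Removing $tmp" and deletes the working directories, even if SEPP failed. Below is a minimal Python sketch of that branch logic. The function name cleanup_branch is hypothetical, and the sketch assumes printDebug is empty unless debugging was requested.

```python
def cleanup_branch(exitcode: int, print_debug: str) -> str:
    """Mirror the if/else in the cleanup function quoted from run-sepp.sh."""
    # Shell condition: [ $exitcode != 0 ] && [ ! -z "$printDebug" ]
    if exitcode != 0 and print_debug != "":
        return "keep tmp dirs, print STDOUT/STDERR logs"
    # Any other combination falls through to the else branch.
    return "Removing $tmp"


for exitcode, print_debug in [(1, ""), (1, "1"), (0, "")]:
    print(exitcode, repr(print_debug), "->", cleanup_branch(exitcode, print_debug))
```

Under that reading, a "Removing ..." line together with exit status 1 is what a failed run without debug output would look like, so it does not by itself point to a permissions problem.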

Yes, the tmp directory is in my root directory.
I'm working on a university computer and I don't have admin rights.
Does it have something to do with this?

And my command with #cores=1 is still running.

Can you run stat /tmp/tmp.J7S1bxwBSP/sepp-tmp-4wBVzwdupd and post the output here?

(qiime2-amplicon-2024.2) sathya@SBCH-D107:~$ stat /tmp/tmp.J7S1bxwBSP/sepp-tmp-4wBVzwdupd
stat: cannot statx '/tmp/tmp.J7S1bxwBSP/sepp-tmp-4wBVzwdupd': No such file or directory

Ok that tracks with it having successfully removed the directory, so it shouldn't be that.

Can you please rerun the command with the --p-debug flag added? --verbose gives verbose debug output from QIIME 2; for this command, --p-debug should give the full debug output from SEPP itself.

I don't see any output yet. It stays frozen, and I'm still using --p-threads 1.

When/if it fails with --p-threads 1, give it 32 threads again along with the --p-debug flag. Maybe we'll get lucky and it will succeed with one thread this time.

Okay! Got it. Will post the output once done. Thanks a lot!

Hi @Oddant1, I ran with --p-threads 1 and it didn't give any output or error. It just stayed frozen for over 24 hours. Then I ran the command again today with 32 cores and the same thing happened. Initially htop showed my cores running, but they stopped partway through. My last run is frozen now.

RE the 1 thread being frozen for 24 hours, that's not necessarily unexpected. It probably isn't frozen; it's just taking a long time to run. These things can take a long time, especially with only 1 thread. As long as your terminal is still sitting there without reporting an error, something is running somewhere.
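
One way to tell "frozen" from "slow" on Linux (including Ubuntu under WSL) is to check whether a process is still accumulating CPU time. The sketch below reads utime and stime from /proc/<pid>/stat twice and compares them. The function names are hypothetical, and it only looks at a single PID, so child processes started by the command would need to be checked separately using the PIDs shown in htop.

```python
import os
import time


def cpu_ticks_from_stat(stat_line: str) -> int:
    """Return utime + stime (clock ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split after the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so utime (field 14) is rest[11]
    # and stime (field 15) is rest[12].
    return int(rest[11]) + int(rest[12])


def is_using_cpu(pid: int, interval_s: float = 5.0) -> bool:
    """True if the process accumulated CPU time during the interval."""
    path = f"/proc/{pid}/stat"
    with open(path) as fh:
        before = cpu_ticks_from_stat(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = cpu_ticks_from_stat(fh.read())
    return after > before


if __name__ == "__main__" and os.path.exists("/proc/self/stat"):
    # Demo on this Python process; replace os.getpid() with a PID from htop.
    print(is_using_cpu(os.getpid(), interval_s=0.1))
```

If the tick count keeps growing, the job is slow rather than stuck; if it stays flat across several checks for every related process, the run is more likely hung.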

RE the run with 32 threads, did you give it the --p-debug flag?

Yes, here is the command.
/usr/bin/time -v qiime fragment-insertion sepp --i-representative-sequences dada2_output/rep_seqs_final.qza --i-reference-database /home/sathya/sepp-refs-gg-13-8.qza --o-tree asvs-tree.qza --o-placements insertion-placements.qza --p-threads 32 --p-debug

What does the terminal you ran that command in look like right now?

Please see the image below.

And here is the htop


That is fascinating. How long has it been sitting there like that for?

With 32 cores, the cores were running for maybe half an hour. Since then it has looked like it does now (maybe 5-6 hours).