Error when running insert-fragment sequences using SEPP

Hi all,

I've attempted to run fragment-insertion on both the command line and in q2studio to see if I can identify why it keeps failing.

Running it in q2studio with debug enabled finally gave me this error report:

Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/scheduler.py", line 298, in call_back
join._tick(job)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/scheduler.py", line 232, in _tick
self.perform()
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/exhaustive.py", line 83, in perform
self.figureout_fragment_subset()
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/exhaustive.py", line 49, in figureout_fragment_subset
search_res = fragment_chunk_problem.get_job_result_by_name("hmmsearch")
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/problem.py", line 99, in get_job_result_by_name
with open(job.result, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_0/A_0_0/FC_0_0_0/hmmsearch.results.y42_fepn'
Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/jobs.py", line 131, in run
self.read_stderr() if self.read_stderr() else 'No error messages available']))
sepp.scheduler.JobError: The following execution failed:
/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/.sepp/bundled-v4.3.5/hmmsearch --noali --cpu 1 -o /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A_12_2/FC_12_2_19/hmmsearch.results.5x7xf768 -E 99999999 --max /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A_12_2/hmmbuild.model.iu6u_gab /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/fragment_chunks/fragment_chunk_19uh5rymx3.fasta

Error: File existence/permissions problem in trying to open HMM file /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A_12_2/hmmbuild.model.iu6u_gab.
HMM file /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A

Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/jobs.py", line 131, in run
self.read_stderr() if self.read_stderr() else 'No error messages available']))
sepp.scheduler.JobError: The following execution failed:
/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/.sepp/bundled-v4.3.5/hmmsearch --noali --cpu 1 -o /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A_12_2/FC_12_2_18/hmmsearch.results.w3lt_z80 -E 99999999 --max /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A_12_2/hmmbuild.model.iu6u_gab /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/fragment_chunks/fragment_chunk_18x8grlaol.fasta

Error: File existence/permissions problem in trying to open HMM file /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A_12_2/hmmbuild.model.iu6u_gab.
HMM file /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_12/A

Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion//sepp/run_sepp.py", line 25, in <module>
ExhaustiveAlgorithm().run()
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/algorithm.py", line 169, in run
if (not JobPool().wait_for_all_jobs()):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/share/fragment-insertion/sepp/sepp/scheduler.py", line 342, in wait_for_all_jobs
raise Exception(job.errors[0])
Exception: [Errno 2] No such file or directory: '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.Mnzb9sur/q2-fragment-insertion.2gn3nj0g/root/P_0/A_0_0/FC_0_0_0/hmmsearch.results.y42_fepn'

followed by the STDERR:

concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 34, in _subprocess_apply
results = action(*args, **kwargs)
File "</Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-464>", line 2, in sepp
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
output_views = self._callable(**view_args)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 179, in sepp
reference_alignment, reference_phylogeny, debug)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 137, in _run
subprocess.run(cmd, check=True, cwd=cwd)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run-sepp.sh', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-5zwbdq8o/d9988cbe-1ac2-4959-87a4-578f89207565/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '13', '-A', '1000', '-P', '5000', '-a', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-fa_lfted/6920b923-8e48-4bbf-9dd3-ea42fbec4e02/data/aligned-dna-sequences.fasta', '-t', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-vhe40z07/51ec41c3-b6de-4f01-a8f7-7547e76aa1fb/data/tree.nwk', '-b', '1']' returned non-zero exit status 1.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/Desktop/Qiime2/q2studio-2019.1.0/q2studio/api/jobs.py", line 156, in callback
results = future.result()
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/concurrent/futures/_base.py", line 425, in result
return self.__get_result()
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
subprocess.CalledProcessError: Command '['run-sepp.sh', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-5zwbdq8o/d9988cbe-1ac2-4959-87a4-578f89207565/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '13', '-A', '1000', '-P', '5000', '-a', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-fa_lfted/6920b923-8e48-4bbf-9dd3-ea42fbec4e02/data/aligned-dna-sequences.fasta', '-t', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-vhe40z07/51ec41c3-b6de-4f01-a8f7-7547e76aa1fb/data/tree.nwk', '-b', '1']' returned non-zero exit status 1.

Thanks for any and all help and opinions!

Hi @TKOneal,
I am cc:ing the fragment-insertion developer @Stefan to look into this. Thanks for your patience!

Hi @TKOneal,

With the caveat that your mileage may vary, my experience is that SEPP is one of the few programs in QIIME 2 that assumes you're working in a specific linguistic environment (locale). That means I get a similar fragment insertion failure on my Swedish machine, but not on my American one.

Could you check your location settings by running

locale

in your terminal?

If you don't have an LC_ALL listed, I suggest running

export LANG='en_US.utf8'

If that doesn't work, IDK, but it's at least something to rule out.
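
For anyone who wants to check this from Python rather than by eye, here is a minimal sketch of the locale check described above. It assumes the usual POSIX precedence (LC_ALL over LC_CTYPE over LANG); the function names `effective_locale` and `is_en_us_utf8` are made up for illustration and are not part of QIIME 2 or SEPP.

```python
import os


def effective_locale(env):
    """Return the locale a POSIX program would use for character handling.

    Assumed precedence: LC_ALL, then LC_CTYPE, then LANG.
    Empty values (like the bare `LC_ALL=` that `locale` prints) count as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"  # POSIX default when nothing is set


def is_en_us_utf8(value):
    """True for spellings such as en_US.UTF-8, en_US.utf8 or en_US.utf-8."""
    lang, _, codeset = value.partition(".")
    return lang == "en_US" and codeset.lower().replace("-", "") == "utf8"


def describe_current_locale():
    """Summarise the locale of the current process in one line."""
    current = effective_locale(os.environ)
    label = "en_US UTF-8" if is_en_us_utf8(current) else "something else"
    return f"{current} ({label})"
```

Calling `describe_current_locale()` inside the activated conda environment shows what SEPP's subprocesses would inherit by default.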

Hi @TKOneal,

Reading your error report first makes me think of concurrency issues. What kind of computational environment are you using? Is it a single laptop, or are you running SEPP in a grid environment? It could be that one thread creates a temporary directory in /var/tmp which another thread overwrites or deletes, so that the first can no longer find the files it expects. If that is the case, you should change the default tmp dir to something more stable across the nodes in your grid.
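
One way to test the temporary-directory theory is to give a single run its own private temp directory. The sketch below does that with Python's standard library. Whether SEPP and run-sepp.sh actually honor the TMPDIR environment variable is an assumption here, and `run_with_private_tmp` is a hypothetical helper name.

```python
import os
import subprocess
import sys
import tempfile


def run_with_private_tmp(cmd, parent_dir=None):
    """Run cmd with TMPDIR pointing at a fresh directory used only by this run.

    ASSUMPTION: the program being run creates its temporary files under
    TMPDIR. The directory is deleted again when the command has finished.
    """
    with tempfile.TemporaryDirectory(prefix="sepp-run-", dir=parent_dir) as tmp:
        print(f"TMPDIR for this run: {tmp}", file=sys.stderr)
        env = dict(os.environ, TMPDIR=tmp)
        completed = subprocess.run(cmd, env=env, check=False)
        return completed.returncode
```

Passing `parent_dir` lets you put the directory on a disk with more space, or on storage that all nodes can see. If the failure disappears with a private directory, that would point towards processes interfering with each other's temporary files.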

@jwdebelius I have not yet encountered issues with different localizations, but I have only tested German and American environments. I'd be eager to see a concrete error message should you run into this issue again. It would be good to fix that in SEPP itself instead of having users set this environment variable.

Best,
Stefan

Hi jwdebelius!
Thanks for responding. Running locale shows the following:
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

Hi Stefan,
I am running QIIME 2 locally. Here are some basic system details:

MacOS Mojave v10.14.4
MacPro (late 2013)
processors 2.7GHz 12-core intel xeon E5
memory 64GB 1866 MHz DDR3

I ran the fragment-insertion plugin with 20 of the machine's 24 threads.
The following is directly from my run records:

    qiime fragment-insertion sepp \
    --i-representative-sequences non-aligned-rep-seqs.qza \
    --p-threads 20 \
    --p-alignment-subset-size 1000 \
    --p-placement-subset-size 5000 \
    --o-tree tree_frag-insertion_SEPP.qza \
    --o-placements insertion-placements_SEPP.qza \
    --verbose

I used the default settings. I am concerned this may be some sort of memory issue. In another thread here on the forum I've been speaking with Nicholas_Bokulich about the size of my data set, which comprises 32 samples with 389,340 features and a total frequency of 2,877,374.

Hi @TKOneal,

Okay, so probably not the same problem I'm getting! That's at least one thing ruled out.

pinging @Stefan

Hi Stefan,
Just an update: I updated my QIIME 2 version to qiime2-2019.4 and now I can't seem to get fragment-insertion SEPP to run at all. I'm not sure what sort of error it is, but I wanted to post the report in case anyone else has the same issue.
input:

    qiime fragment-insertion sepp  \
    --i-representative-sequences rep-seqs.qza \
    --i-reference-alignment  gg_99_otus_aligned.qza \
    --i-reference-phylogeny  gg_99_otus_annotated_tre.qza \
    --p-threads 0 \
    --o-tree insertion-tree.qza \
    --o-placements insertion-placements.qza \
    --output-dir fragment-insertion_SEPP \
    --verbose

Removing /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tmp-XXXXX.t2128idL
Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2cli/commands.py", line 311, in call
results = action(**arguments)
File "</Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/decorator.py:decorator-gen-299>", line 2, in sepp
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
output_views = self._callable(**view_args)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 179, in sepp
reference_alignment, reference_phylogeny, debug)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.4/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 137, in _run
subprocess.run(cmd, check=True, cwd=cwd)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.4/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run-sepp.sh', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-wvx96o9l/4d78965a-a2ca-47c4-9589-4b5d9bce4c9b/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '0', '-A', '1000', '-P', '5000', '-a', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-jgb225ve/5d1ee9a7-2723-4502-bc3f-41fac3177d4f/data/aligned-dna-sequences.fasta', '-t', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-huvom3i5/c99314bc-a852-49be-9663-308d299bfe60/data/tree.nwk']' returned non-zero exit status 1.

Plugin error from fragment-insertion:

Command '['run-sepp.sh', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-wvx96o9l/4d78965a-a2ca-47c4-9589-4b5d9bce4c9b/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '0', '-A', '1000', '-P', '5000', '-a', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-jgb225ve/5d1ee9a7-2723-4502-bc3f-41fac3177d4f/data/aligned-dna-sequences.fasta', '-t', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-huvom3i5/c99314bc-a852-49be-9663-308d299bfe60/data/tree.nwk']' returned non-zero exit status 1.

The same error report comes up if I remove both reference files, change the number of threads, or change the output names or locations.

I also ran the same input using qiime2-2019.1

Removing /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tmp-XXXXX.aoKUCkEN
Traceback (most recent call last):
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in call
results = action(**arguments)
File "</Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-290>", line 2, in sepp
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
output_views = self._callable(**view_args)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 179, in sepp
reference_alignment, reference_phylogeny, debug)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 137, in _run
subprocess.run(cmd, check=True, cwd=cwd)
File "/Users/environmentalmicrobiologyteam/miniconda3/envs/qiime2-2019.1/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run-sepp.sh', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-borz_lb2/4d78965a-a2ca-47c4-9589-4b5d9bce4c9b/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '0', '-A', '1000', '-P', '5000', '-a', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-6cnfpdp6/5d1ee9a7-2723-4502-bc3f-41fac3177d4f/data/aligned-dna-sequences.fasta', '-t', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-m0kt6fuh/c99314bc-a852-49be-9663-308d299bfe60/data/tree.nwk']' returned non-zero exit status 1.

Plugin error from fragment-insertion:

Command '['run-sepp.sh', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-borz_lb2/4d78965a-a2ca-47c4-9589-4b5d9bce4c9b/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '0', '-A', '1000', '-P', '5000', '-a', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-6cnfpdp6/5d1ee9a7-2723-4502-bc3f-41fac3177d4f/data/aligned-dna-sequences.fasta', '-t', '/var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/qiime2-archive-m0kt6fuh/c99314bc-a852-49be-9663-308d299bfe60/data/tree.nwk']' returned non-zero exit status 1.

Hope this helps!

Hi @TKOneal,
I think you should give SEPP at least one thread, so please use --p-threads 1. Inserting ~400k fragments into the big reference tree with 200k tips is a heavy job. I doubt your 64 GB of RAM will suffice for that.
Could you please also add --p-debug to your command to get more informative error messages?
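
If you script your runs, a small sketch like the following keeps the thread count at one or more and adds the debug flag. It only uses options that already appear in this thread; `build_sepp_command` is a hypothetical helper, not part of q2-fragment-insertion.

```python
def build_sepp_command(rep_seqs, tree_out, placements_out, threads=1, debug=True):
    """Assemble the `qiime fragment-insertion sepp` call as an argument list.

    The thread count is clamped to at least 1, since 0 threads leaves
    SEPP with nothing to run on.
    """
    threads = max(1, int(threads))
    cmd = [
        "qiime", "fragment-insertion", "sepp",
        "--i-representative-sequences", rep_seqs,
        "--p-threads", str(threads),
        "--o-tree", tree_out,
        "--o-placements", placements_out,
        "--verbose",
    ]
    if debug:
        cmd.append("--p-debug")  # flag suggested above for fuller error output
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from within the activated QIIME 2 environment.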

Hi Stefan,
Thanks for your help. I think I may have fixed part of the issue. I have been filtering my rep-seqs.qza incorrectly. It is currently running with the debug flag. I will let you know if it throws any errors. I'm also running it with 10 threads currently.

Hi Stefan,
I've attached the debug report from my most recent run.

sepp_error_STERR.txt.zip (385.4 KB)

To me, it still reads like the same error:
Error: File existence/permissions problem in trying to open HMM file /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.IEzkn47F/q2-fragment-insertion.bbxsw1we/root/P_11/A_11_5/hmmbuild.model.juw7elj0. HMM file /var/folders/8h/80xg_29d0lb0pdtsmt406gr80000gn/T/sepp-tempssd-XXXX.IEzkn47F/q2-fragment-insertion.bbxsw1we/root/P_11/A
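
Since the debug report is long, it may help to count which HMM files hmmsearch complains about. This is a rough sketch that assumes every such message has the same wording as the line quoted above; `missing_hmm_files` is a made-up name.

```python
import re
from collections import Counter

# Matches the path in "... trying to open HMM file <path>." as seen in the log.
# ASSUMPTION: the path contains no spaces and is followed by a full stop.
HMM_ERROR = re.compile(r"trying to open HMM file (\S+)\.(?:\s|$)")


def missing_hmm_files(log_text):
    """Count how often each HMM file is reported as missing or unreadable."""
    return Counter(HMM_ERROR.findall(log_text))
```

If all failures point at a handful of `hmmbuild.model.*` files, that would suggest the model files were removed or never written, rather than a problem with each individual hmmsearch job.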

You might want to take a look into the sepp bash script at $CONDA_PREFIX/bin/run-sepp.sh
In lines 11, 20, 25 or 29, you can tweak the directory used for temporary files.

I would also play with the number of cores used simultaneously, and maybe start with a file containing only the first, say, 100 sequences for quicker debugging.
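
A minimal sketch for cutting a FASTA file down to its first 100 sequences is shown below. It works on a plain FASTA file, so the sequences would first have to be exported from the .qza artifact and imported again afterwards, which is not covered here. The function names are hypothetical.

```python
def first_n_fasta_records(lines, n=100):
    """Yield the lines belonging to the first n FASTA records.

    A record starts at a line beginning with '>' and runs until the next one.
    """
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n:
                break
        if seen:  # skip anything that comes before the first header
            yield line


def write_fasta_head(src_path, dst_path, n=100):
    """Copy the first n records of src_path into dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(first_n_fasta_records(src, n))
```

For example, `write_fasta_head("dna-sequences.fasta", "first100.fasta")` would give a small input that fails (or succeeds) within minutes instead of hours.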