Problem with Fragment Insertion - SEPP

Hello, everyone,

I’m trying to run qiime fragment-insertion sepp with a SILVA rooted tree and aligned sequences (99%), but I keep getting an error. First I tried with my own computer, but I noticed there wasn’t enough RAM in it. So I went after a remote server with enough memory, but I still get an error.

My command line is as follows:

[felipe.rocha@bioinfo Analises]$ docker run -t -i -v $(pwd):/data qiime2/core:2019.7
qiime fragment-insertion sepp \
  --i-representative-sequences Felipe-Deblur-RepSeqs.qza \
  --i-reference-alignment silva99_alignedseqs.qza \
  --i-reference-phylogeny silva99_rootedtree.qza \
  --p-threads 2 \
  --o-tree FelipeSilva-FragmInsert-RootedTree.qza \
  --o-placements FelipeSilva-FragmInsert-Placements.qza \
  --verbose

And the error I get:

Removing /tmp/tmp.4taskGv3iQ/sepp-tmp-i9N6X
Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2019.7/lib/python3.6/site-packages/q2cli/commands.py", line 327, in __call__
    results = action(**arguments)
  File "</opt/conda/envs/qiime2-2019.7/lib/python3.6/site-packages/decorator.py:decorator-gen-299>", line 2, in sepp
  File "/opt/conda/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 240, in bound_callable
    output_types, provenance)
  File "/opt/conda/envs/qiime2-2019.7/lib/python3.6/site-packages/qiime2/sdk/action.py", line 383, in callable_executor
    output_views = self._callable(**view_args)
  File "/opt/conda/envs/qiime2-2019.7/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 179, in sepp
    reference_alignment, reference_phylogeny, debug)
  File "/opt/conda/envs/qiime2-2019.7/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 137, in _run
    subprocess.run(cmd, check=True, cwd=cwd)
  File "/opt/conda/envs/qiime2-2019.7/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run-sepp.sh', '/tmp/qiime2-archive-bfgihcj8/b712abec-2b25-418a-b569-b8f8ad4be511/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '2', '-A', '1000', '-P', '5000', '-a', '/tmp/qiime2-archive-fizomboa/b35759dc-0bdc-4c8a-a122-3496ab0a0222/data/aligned-dna-sequences.fasta', '-t', '/tmp/qiime2-archive-5e4vsrma/50638235-87e6-44f0-a9ce-302bf1d7294d/data/tree.nwk']' returned non-zero exit status 1.

Plugin error from fragment-insertion:

Command '['run-sepp.sh', '/tmp/qiime2-archive-bfgihcj8/b712abec-2b25-418a-b569-b8f8ad4be511/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '2', '-A', '1000', '-P', '5000', '-a', '/tmp/qiime2-archive-fizomboa/b35759dc-0bdc-4c8a-a122-3496ab0a0222/data/aligned-dna-sequences.fasta', '-t', '/tmp/qiime2-archive-5e4vsrma/50638235-87e6-44f0-a9ce-302bf1d7294d/data/tree.nwk']' returned non-zero exit status 1.

See above for debug info.

Everything seems fine in my command, and I have sufficient memory, so please help me out on this one. Thank you all, Felipe.

Hi @Felipe_Rocha,

How did you get the Silva trees? There's a fun issue with SEPP: you can't use the trees that come with the QIIME release; you need to get the SEPP-specific trees. There's a thread about that here:

If switching trees doesn't solve your problem, you might try the --debug flag next time. It won't fix anything, but it might give more answers.

Best,
Justine

Hi @jwdebelius,

I got my Silva 99% tree from the zip file they have on their website, it is something like silva_99.tre, which must be the QIIME release you are talking about. I’ll try to use the --debug flag again today, let’s see what happens.

I cannot find the reference info that Ryan is talking about in that other thread; all I see as input is:
Usage: qiime fragment-insertion sepp [OPTIONS]

  Perform fragment insertion of 16S sequences using the SEPP algorithm
  against the Greengenes 13_8 99% tree.

Inputs:
  --i-representative-sequences ARTIFACT FeatureData[Sequence]
                          The sequences to insert  [required]
  --i-reference-alignment ARTIFACT FeatureData[AlignedSequence]
                          The reference multiple nucleotide alignment used to
                          construct the reference phylogeny.  [optional]
  --i-reference-phylogeny ARTIFACT Phylogeny[Rooted]
                          The rooted reference phylogeny. Must be in sync with
                          reference-alignment, i.e. each tip name must have
                          exactly one corresponding record.  [optional]

If things go wrong again, I’ll try to read the README.md. What sort of program should I use to open .md files? I wasn’t able to open it.

Thanks for helping,
Felipe.

Hi @Felipe_Rocha,

There should be .qza files already imported in the share/qiime2-fragment-insertion directory (or something similar); the exact path is given on the GitHub link, and you should be able to use that and the alignment file instead of the tree.

I'm currently running those files for the first time with my own fragment insertion, and the --debug flag has been my saving grace. ("I'll do insertion against Silva," she said. "It will make my new boss happy," she said. "How hard can it be?" ...Two weeks of expensive debugging later, and I'll let you know hopefully by sometime Saturday whether or not these trees worked for me.) But, in the meantime, seriously, reading the output from the debug flag on the terminal helped me so much.

As far as the .md files go, a .md file is a Markdown file; it's like HTML lite. You can read it using a text editor, but some of the annotation will look different. (Actually, it's a lot like the text formatting here.) Or, my favorite Mac program for Markdown is MacDown, because it gives a relatively nice side-by-side rendering.

Best,
Justine

Hi @Felipe_Rocha and @jwdebelius,
I am sorry that using Silva is currently such a pain in the a**. I am working on measures to make it way easier in future qiime2 releases: https://github.com/qiime2/q2-fragment-insertion/pull/63

For now, changing the reference is impossible via the q2 wrapper, since it does not expose the parameter to set the file path for the info file. You need three files for your reference: 1. tree, 2. alignment and 3. info.

The funny thing is that those three files are already shipped with qiime2-2019.7: unpack https://anaconda.org/qiime2/q2-fragment-insertion/2019.7.0/download/linux-64/q2-fragment-insertion-2019.7.0-py36_0.tar.bz2 and navigate to /share/fragment-insertion/ref/, where you will find three files: silva12.8_99otus_aligned_masked1977.tre, silva12.8_99otus_aligned_masked1977.fasta and silva12.8_99otus_aligned_masked1977.info

You will have to run SEPP manually (i.e. not using the qiime plugin) via the command line: run-sepp.sh yourfragments.fasta q2-fragment-insertion -x 2 -A 1000 -P 5000 -a silva12.8_99otus_aligned_masked1977.fasta -t silva12.8_99otus_aligned_masked1977.tre -r silva12.8_99otus_aligned_masked1977.info -b 1
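
Since run-sepp.sh is called with a plain FASTA file rather than a .qza artifact, the representative sequences have to be pulled out of the artifact first. A .qza is a zip archive, and the traceback above shows the sequences stored as data/dna-sequences.fasta inside it. Below is a minimal Python sketch of that extraction; the helper name extract_fasta_from_qza is made up for illustration, and `qiime tools export` should do the same job.

```python
import shutil
import zipfile
from pathlib import Path


def extract_fasta_from_qza(qza_path, out_fasta):
    """Copy <uuid>/data/dna-sequences.fasta out of a .qza archive.

    Hypothetical helper for illustration; it is not part of QIIME 2.
    """
    with zipfile.ZipFile(qza_path) as archive:
        # The archive holds a single top-level <uuid>/ directory.
        members = [
            name for name in archive.namelist()
            if name.endswith("/data/dna-sequences.fasta")
        ]
        if len(members) != 1:
            raise ValueError(
                f"expected exactly one FASTA in {qza_path}, found {len(members)}"
            )
        with archive.open(members[0]) as src, open(out_fasta, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return Path(out_fasta)
```

For example, extract_fasta_from_qza("Felipe-Deblur-RepSeqs.qza", "yourfragments.fasta") would produce the file that the run-sepp.sh command above takes as its first argument.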

Best,
Stefan

Thanks @Stefan, I'll take a look at the manual command.

Perhaps a stupid question, but if that's the case, why is it exposed in the documentation? It might be good to update the docs to make that clear.

Best,
Justine

Hi @Stefan,

Continuing with live debugging, I’m getting an error when I try the suggested command:

========= Execution of SEPP failed with exit code 1 =================
temporary working directories are NOT deleted for further inspection:
  $tmp = /tmp/tmp.o0ZesBHaZU/sepp-tmp-QoU2w
  $tmpssd = /tmp/tmp.o0ZesBHaZU/sepp-tempssd-K86g
--------- Content of STDOUT -----------------------------------------
cat: sepp-./data/trees/silva128_insertion-out.log: No such file or directory

I’m trying to run it from outside the share directory; I’d like to be able to reference the files I need instead.

Best,
Justine

Hi @jwdebelius,
can you please post the full command that led to the error, and perhaps also the result of pwd and ls -la to give us a feeling of your path “environment” and presence / absence of necessary reference files?

I just recalled that there was another little bug in SEPP itself with providing the info file reference. We fixed that in the original sources and also compiled a new conda package for SEPP: https://anaconda.org/bioconda/sepp but I am not sure if the latest qiime version uses this fixed package :-/ Thus, you might want to install it (and thus overwrite the old one) with conda install -c bioconda "sepp>=4.3.10" prior to further debugging. (The quotes keep the shell from treating >= as a redirect.)

My command was

run-sepp.sh \
 ./data/trees/deblur_rep_seq.fasta \
 ./data/trees/silva128_insertion \
 -x 2 \
 -a /home/justine.debelius/trees/sepp-ref-tree/silva128/share/fragment-insertion/ref/silva12.8_99otus_aligned_masked1977.fasta \
 -t /home/justine.debelius/trees/sepp-ref-tree/silva128/share/fragment-insertion/ref/silva12.8_99otus_aligned_masked1977.tre \
 -r /home/justine.debelius/trees/sepp-ref-tree/silva128/share/fragment-insertion/ref/silva12.8_99otus_aligned_masked1977.info \
 -b 1

And I realize I didn’t print the error (or I got a new one…):

/home/justine.debelius/miniconda3/envs/qiime2-2019.7-mini/bin/run-sepp.sh: line 137: sepp-./data/trees/silva128_insertion-out.log: No such file or directory

The subdirectories and referenced files exist, and I’m working out of the right location, so I don’t think it’s a path issue?

I tried the conda update, my current version is sepp=4.3.10=py36_0.

Best,
Justine

The second argument to run-sepp.sh is expected to be a name, not a real prefix as in a path :-/

Which is hard to see from the script https://github.com/smirarab/sepp/blob/bd26318e7857a98c5917a1b0c7b97aa4a9096e2c/sepp-package/run-sepp.sh#L4

The script first combines the word sepp- with your prefix ./data/trees/silva128_insertion and then checks whether the file sepp-./data/trees/silva128_insertion can be written, which is not the case due to the included . and /. I suggest you change the second argument to just: silva128_insertion
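
To make the failure mode concrete, here is a small Python sketch that mimics the naming described above. It is not the actual run-sepp.sh code; the helper names log_path_for and can_write_log are made up, and the only thing assumed about the script is that it prepends sepp- to the name and appends -out.log, as the error message suggests.

```python
import os
import tempfile


def log_path_for(name):
    """Mimic the naming seen in the error message: 'sepp-' + name + '-out.log'."""
    return "sepp-" + name + "-out.log"


def can_write_log(name, workdir):
    """Return True if the log file for `name` can be created inside `workdir`."""
    target = os.path.join(workdir, log_path_for(name))
    try:
        with open(target, "w") as handle:
            handle.write("")
        return True
    except OSError:
        # With a path-like name the parent directory 'sepp-.' does not exist.
        return False


with tempfile.TemporaryDirectory() as workdir:
    # The sub-directories exist, just as in the failing run.
    os.makedirs(os.path.join(workdir, "data", "trees"))
    path_like_ok = can_write_log("./data/trees/silva128_insertion", workdir)
    plain_name_ok = can_write_log("silva128_insertion", workdir)

print("path-like name writable:", path_like_ok)
print("plain name writable:", plain_name_ok)
```

The path-like name fails even though data/trees exists, because the file would have to live under a directory literally called sepp-. which nobody created.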

Hi @jwdebelius and @Stefan,

Well, it didn't work out, but now I have a huge debug file which I don't understand. :sweat_smile:
I think there is more output above it, but I couldn't recover it from my terminal.

debug-sepp.txt (124.1 KB)

I hope you can find your way through fragment-insertion!

Thanks for the explanation, I'll download some markdown program to read that file.
Thank you both for the discussion here. I'll do my best to understand and try each of the steps you both proposed; unfortunately, I'm still not well versed in Python, conda, and so on.

I see you are suggesting the Silva 12.8 files; I was using the latest release (13.2). Should I try the previous release then?

Cheers,
Felipe.

Won't that then move my path to sepp-data/trees/silva128_insertion, creating a new folder and data structure? Does it mean I have to go to my target directory (data/trees/) to run the command, and can't run it from a parent directory?

It looks like the run-sepp.sh script won’t write if you try to run it from a directory other than the target directory with a relative path. ¯\_(ツ)_/¯

@Felipe_Rocha, from scrolling through your debug report, I gather your computer has too little memory: OSError: [Errno 12] Cannot allocate memory

OK, so I should try to run the script manually, with my target directory open in the terminal?

Wow, I've been using a remote server with ~380GB of RAM, isn't that enough? Or is there something going on while I use this remote server which prevents me from using its memory?
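
One way to look into that question is to check what the session itself reports. The Python sketch below (Linux only) prints the per-process address-space limit and the totals from /proc/meminfo. It is a rough diagnostic and not a diagnosis: the helper names are made up, and whether any limit is actually in place on that server is an assumption. Also, /proc/meminfo normally shows the host's memory, so a limit set on a Docker container would not show up there.

```python
import os
import resource


def describe_limit(value):
    """Format a resource limit given in bytes, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024**3:.1f} GiB"


def read_meminfo(path="/proc/meminfo"):
    """Parse a /proc/meminfo style file into {field: integer value}."""
    info = {}
    with open(path) as handle:
        for line in handle:
            key, _, rest = line.partition(":")
            fields = rest.split()
            if fields:
                info[key.strip()] = int(fields[0])
    return info


# Per-process virtual memory limit (what `ulimit -v` controls).
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
print("address-space limit (soft / hard):",
      describe_limit(soft), "/", describe_limit(hard))

if os.path.exists("/proc/meminfo"):
    meminfo = read_meminfo()
    for key in ("MemTotal", "MemAvailable"):
        if key in meminfo:
            # Values in /proc/meminfo are reported in kB.
            print(f"{key}: {meminfo[key] / 1024**2:.1f} GiB")
```

If the address-space limit turns out to be far below the installed RAM, that would be worth raising with the server's administrators.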

Yes, navigate to your target directory (wherever you want the tree to come out) and then run the command. If you use a path (like I was), you'll get the same "can't make" error.
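
If you would rather not cd around by hand, the same thing can be scripted. The Python sketch below assembles the run-sepp.sh call with absolute input paths and a plain output name, and runs it with the target directory as the working directory (the traceback at the top of the thread shows the plugin also launching SEPP via subprocess.run with a cwd). The helper names and the ref directory are hypothetical; the flags and reference file names are the ones posted earlier in this thread.

```python
import subprocess
from pathlib import Path

# Shared stem of the three reference files named earlier in the thread.
REF_STEM = "silva12.8_99otus_aligned_masked1977"


def build_sepp_command(fragments, name, ref_dir, threads=2):
    """Assemble a run-sepp.sh call (hypothetical helper, for illustration)."""
    if Path(name).name != name:
        raise ValueError("the second argument must be a plain name, not a path")
    ref = Path(ref_dir).resolve()
    return [
        "run-sepp.sh",
        str(Path(fragments).resolve()),  # absolute, so cwd does not matter
        name,
        "-x", str(threads),
        "-a", str(ref / (REF_STEM + ".fasta")),
        "-t", str(ref / (REF_STEM + ".tre")),
        "-r", str(ref / (REF_STEM + ".info")),
        "-b", "1",
    ]


def run_in_target_dir(cmd, target_dir):
    """Run the command with target_dir as working directory for the outputs."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    return subprocess.run(cmd, check=True, cwd=target)


cmd = build_sepp_command(
    "data/trees/deblur_rep_seq.fasta", "silva128_insertion", "ref"
)
print(" ".join(cmd))
```

Calling run_in_target_dir(cmd, "data/trees") would then start SEPP from inside data/trees, provided run-sepp.sh is on the PATH.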
