Fragment-insertion sepp returned non-zero exit status

(Nick Scales) #1

Hi there,

When I run:

qiime fragment-insertion sepp \
  --i-representative-sequences rep-seqs.qza \
  --i-reference-alignment alignment.qza \
  --i-reference-phylogeny rooted_tree.qza \
  --o-tree insertion-tree.qza \
  --o-placements insertion-placements.qza

I get the following error message:

Traceback (most recent call last):
  File "/data/apps/anaconda/3.6-4.3.1/envs/qiime2-2018.11/lib/python3.5/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "<decorator-gen-290>", line 2, in sepp
  File "/data/apps/anaconda/3.6-4.3.1/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/data/apps/anaconda/3.6-4.3.1/envs/qiime2-2018.11/lib/python3.5/site-packages/qiime2/sdk/action.py", line 362, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/data/apps/anaconda/3.6-4.3.1/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_fragment_insertion/_insertion.py", line 179, in sepp
    reference_alignment, reference_phylogeny, debug)
  File "/data/apps/anaconda/3.6-4.3.1/envs/qiime2-2018.11/lib/python3.5/site-packages/q2_fragment_insertion/_insertion.py", line 137, in _run
    subprocess.run(cmd, check=True, cwd=cwd)
  File "/data/apps/anaconda/3.6-4.3.1/envs/qiime2-2018.11/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run-sepp.sh', '/scratch/6225856.1.mic/qiime2-archive-crqnvwyn/e4ba8d00-30b4-46c9-8e40-d0b8f3820351/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '1', '-A', '1000', '-P', '5000', '-a', '/scratch/6225856.1.mic/qiime2-archive-5tonh1yi/714f006f-6a6d-4c62-a393-b9fc0ae24df5/data/aligned-dna-sequences.fasta', '-t', '/scratch/6225856.1.mic/qiime2-archive-qc27x6og/12b1e47c-b2f4-479e-bea3-c3875b1864f0/data/tree.nwk']' returned non-zero exit status 1

Plugin error from fragment-insertion:

  Command '['run-sepp.sh', '/scratch/6225856.1.mic/qiime2-archive-crqnvwyn/e4ba8d00-30b4-46c9-8e40-d0b8f3820351/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '1', '-A', '1000', '-P', '5000', '-a', '/scratch/6225856.1.mic/qiime2-archive-5tonh1yi/714f006f-6a6d-4c62-a393-b9fc0ae24df5/data/aligned-dna-sequences.fasta', '-t', '/scratch/6225856.1.mic/qiime2-archive-qc27x6og/12b1e47c-b2f4-479e-bea3-c3875b1864f0/data/tree.nwk']' returned non-zero exit status 1

See above for debug info.

I have looked through similar topics, but the previous answers all seem related to either disk space or compute power, which I don’t think applies here, as I am running on a high-performance cluster.

The alignment and phylogeny were generated using the command:

qiime phylogeny align-to-tree-mafft-fasttree \
  --i-sequences rep-seqs.qza \
  --o-alignment alignment.qza \
  --o-masked-alignment masked_alignment.qza \
  --o-tree tree.qza \
  --o-rooted-tree rooted_tree.qza

My rep-seqs file was generated by dada2 denoise. I am happy to send it over if that is helpful; it’s 50 MB, so I wasn’t sure if it would be wise or possible to attach it here! I am using QIIME 2 v2018.11, but I also tried version 2019.1 and got the same error. In addition, to confirm it wasn’t space related, I ran this on a different drive on the cluster that has many TB of available space, but I keep getting the same issue!

Thanks in advance for your help!

Nick

(Nicholas Bokulich) #2

Welcome @nickscales!

I am cc:ing the developer of this plugin, @Stefan, to take a look at this error.

Would you be willing to share your input data files in case @Stefan needs these to debug?

Thanks!

(Nick Scales) #3

Hi @Nicholas_Bokulich thank you! To try and narrow down the error I ran the following command using the default greengenes database:

qiime fragment-insertion sepp \
  --i-representative-sequences rep-seqs.qza \
  --o-tree insertion-tree.qza \
  --o-placements insertion-placements.qza

And it seems to be working, so I think the issue is my reference alignment and phylogeny.

rooted_tree.qza (367.0 KB)
alignment.qza (2.5 MB)

Thanks for your help!

(Stefan Janssen) #4

Hi @nickscales,

You are using your own reference, i.e. a phylogeny and an alignment. Unfortunately, a proper reference set also needs to contain a raxml.info file (this is not yet documented, so you could not have known). Here are example commands showing how to properly create the phylogeny and info file from an alignment: https://github.com/smirarab/sepp-refs/tree/master/silva
However, the current plugin version cannot accept info-file paths as inputs :-/ I am working on a PR to change this in the future: https://github.com/qiime2/q2-fragment-insertion/pull/32. I would like to do this the right way by also creating a bioconda package for the underlying SEPP program: https://github.com/bioconda/bioconda-recipes/pull/14233, which at the moment is stalled due to some OSX dendropy dependency issues.

Thus, to make it work for you, you need to “hack” a little. First you need to come up with this info file. Second, you can either replace the default info file (should be found as $CONDA_PREFIX/share/fragment-insertion/ref/RAxML_info-reference-gg-raxml-bl.info) or pull my PR https://github.com/qiime2/q2-fragment-insertion/pull/32 and provide the file path as further input argument.
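
For readers who take the first route (replacing the default info file), here is a minimal Python sketch of a backup-then-copy step. The file name and the share/fragment-insertion/ref location come from the post above; the function name, its arguments, and the ".bak" suffix are made up for illustration and are not part of the plugin.

```python
import os
import shutil
from pathlib import Path

# File name as given in the post above; everything else is illustrative.
DEFAULT_INFO_NAME = "RAxML_info-reference-gg-raxml-bl.info"


def install_info_file(custom_info, conda_prefix=None):
    """Back up the bundled info file, then copy a custom one over it.

    Hypothetical helper: `custom_info` is the path to your own RAxML info
    file; `conda_prefix` defaults to the active environment's $CONDA_PREFIX.
    """
    prefix = Path(conda_prefix or os.environ["CONDA_PREFIX"])
    target = prefix / "share" / "fragment-insertion" / "ref" / DEFAULT_INFO_NAME
    backup = target.with_name(target.name + ".bak")
    # Keep the original once, so the default reference can be restored later.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(custom_info, target)
    return target
```

Keeping a backup matters here because the default Greengenes reference stops working once its info file is overwritten; copying the ".bak" file back restores it.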

Hope this helps until we merge this addition into master and make it generally available.

Stefan

(Nick Scales) #5

Hi @Stefan thank you SO much for your reply! I have put my RAxML info in /share/fragment-insertion/ref and I named it the same as the previous file, but I keep getting the same error:

Plugin error from fragment-insertion:

Reference alignment and phylogeny do not match up. Please ensure that all sequences in the alignment correspond to exactly one tip name in the phylogeny.

I have checked a few times and I’m fairly sure they are matching up, I even tried shortening the names but to no avail - do you know what might be happening?

RAxML_rooted.nwk (4.9 KB)
RAxML_info.txt (34.7 KB)
seqs.fastq (107.1 KB)

(Stefan Janssen) #6

Hi @nickscales,

I don’t fully understand why, but when I import your tree into a QIIME 2 artifact and load it with my function that checks whether the alignment and tree match up, I see that the names lose their underscores, e.g. the identifier Curtobacterium_sp._MCBA15_001 is read as Curtobacterium sp. MCBA15 001, although it has the underscores in the Newick file!!

@Nicholas_Bokulich is that change of identifiers a bug or a feature?
(Update: looks like skbio.TreeNode.read is responsible for ID transformation)
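
The behaviour described here looks consistent with the Newick convention that underscores in unquoted labels stand for blanks. The sketch below imitates that conversion in plain Python to show why the names stop matching. It does not call skbio; the regex only handles simple Newick strings (no quoted labels or comments); and the function names, plus every tip name other than the Curtobacterium one, are invented for illustration.

```python
import re


def newick_tip_names(newick, underscores_to_spaces=True):
    """Return tip labels from a simple Newick string.

    A tip label directly follows '(' or ',' and runs up to ':', ',' or ')'.
    Labels after ')' are internal node labels and are ignored.
    """
    labels = re.findall(r"[(,]\s*([^():,;\s]+)", newick)
    if underscores_to_spaces:
        # Imitates the Newick rule for unquoted labels: '_' means blank.
        labels = [label.replace("_", " ") for label in labels]
    return set(labels)


def fasta_ids(fasta_text):
    """Return the first whitespace-delimited token of every FASTA header."""
    return {
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


tree = "((Curtobacterium_sp._MCBA15_001:0.1,Other_taxon:0.2):0.05,Third:0.3);"
alignment = ">Curtobacterium_sp._MCBA15_001\nACGT\n>Other_taxon\nAC-T\n>Third\nA-GT\n"

ids = fasta_ids(alignment)
# Names as written in the file: everything matches.
print(sorted(ids - newick_tip_names(tree, underscores_to_spaces=False)))
# Names after underscore conversion: the underscore-containing IDs go missing.
print(sorted(ids - newick_tip_names(tree)))
```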

My suggestion for a workaround is that you remove all _ from the identifiers in the Newick and the Alignment file and try again.
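
A rough Python sketch of this workaround, assuming plain-text FASTA and Newick inputs. The helper names are hypothetical. Only FASTA header lines are touched, so the aligned sequences themselves stay unchanged, and a small check guards against two identifiers collapsing into the same name once the underscores are gone.

```python
def strip_underscores_fasta(fasta_text):
    """Remove '_' from FASTA header lines only; sequence lines are kept as-is."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            line = line.replace("_", "")
        out.append(line)
    return "\n".join(out) + "\n"


def strip_underscores_newick(newick):
    """Remove every '_' from a Newick string (branch lengths contain none)."""
    return newick.replace("_", "")


def find_collisions(ids):
    """Return stripped names that more than one original ID maps onto."""
    seen, collisions = set(), set()
    for identifier in ids:
        stripped = identifier.replace("_", "")
        if stripped in seen:
            collisions.add(stripped)
        seen.add(stripped)
    return collisions


print(strip_underscores_newick("(Curtobacterium_sp._MCBA15_001:0.1,Other_taxon:0.2);"))
print(find_collisions(["A_B", "AB", "C_D"]))
```

If find_collisions returns anything, renaming those identifiers by hand before stripping is safer than letting two tips share a name.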

Best
Stefan

(Nick Scales) #9

Hi @Stefan,

Many thanks again for your help. Removing the underscores in the Newick file has indeed resolved the name-mismatch error. However, frustratingly, I now get the exact same error I was seeing originally:

Traceback (most recent call last):
  File "/data/users/nscales/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</data/users/nscales/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-294>", line 2, in sepp
  File "/data/users/nscales/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/data/users/nscales/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/data/users/nscales/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 179, in sepp
    reference_alignment, reference_phylogeny, debug)
  File "/data/users/nscales/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 137, in _run
    subprocess.run(cmd, check=True, cwd=cwd)
  File "/data/users/nscales/miniconda3/envs/qiime2-2019.1/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run-sepp.sh', '/scratch/6520860.1.mic/qiime2-archive-knar6piz/e4ba8d00-30b4-46c9-8e40-d0b8f3820351/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '1', '-A', '1000', '-P', '5000', '-a', '/scratch/6520860.1.mic/qiime2-archive-7ozvg0r7/b199d679-1701-4100-a352-21e0e74790ec/data/aligned-dna-sequences.fasta', '-t', '/scratch/6520860.1.mic/qiime2-archive-xgivba8f/333eb21f-35bf-41c3-aaa8-f70d5645c0a0/data/tree.nwk']' returned non-zero exit status 1.

Plugin error from fragment-insertion:

  Command '['run-sepp.sh', '/scratch/6520860.1.mic/qiime2-archive-knar6piz/e4ba8d00-30b4-46c9-8e40-d0b8f3820351/data/dna-sequences.fasta', 'q2-fragment-insertion', '-x', '1', '-A', '1000', '-P', '5000', '-a', '/scratch/6520860.1.mic/qiime2-archive-7ozvg0r7/b199d679-1701-4100-a352-21e0e74790ec/data/aligned-dna-sequences.fasta', '-t', '/scratch/6520860.1.mic/qiime2-archive-xgivba8f/333eb21f-35bf-41c3-aaa8-f70d5645c0a0/data/tree.nwk']' returned non-zero exit status 1.

Does that mean the error is to do with my info file perhaps?

Kind regards,
Nick

(Stefan Janssen) #10

Could you please re-run with the additional command line parameters --p-debug --verbose? Hopefully that will give more detailed error messages. If you don’t mind sending me at least a part of your input sequences, I could start debugging on my end.
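
Some background on why the traceback alone says so little: the plugin calls subprocess.run(cmd, check=True, ...), which raises CalledProcessError for any non-zero exit status, while the actual reason sits in the child process's own output. The Python sketch below illustrates that mechanism with a stand-in child process; it is not the plugin's code, and run_and_report is a hypothetical helper.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (exit code, captured stderr) instead of raising."""
    try:
        subprocess.run(
            cmd,
            check=True,                # non-zero exit -> CalledProcessError
            stderr=subprocess.PIPE,    # capture the child's error output
            universal_newlines=True,   # decode stderr to str
        )
    except subprocess.CalledProcessError as err:
        return err.returncode, err.stderr
    return 0, ""


# Stand-in for run-sepp.sh: a child that writes a message to stderr and exits 1.
failing = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('parse error\\n'); sys.exit(1)",
]
code, message = run_and_report(failing)
print(code, message.strip())
```

The exception itself only carries "returned non-zero exit status 1"; the useful text is whatever the child wrote to stderr, which is what the more verbose run is meant to surface.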

(Nick Scales) #12

Looking through the error message, it looks like the key part is

Warning: using a statistics file directly is now deprecated. We suggest using a reference package. If you already are, then please use the latest version of taxtastic.
WARNING: your stats file is from RAxML 8.2.12; RAxML has been tested with the following versions: 7.0.4; 7.2.3; 7.2.5; 7.2.6; 7.2.7
I'm going to try parsing as if this was version 7.2.3
Problem parsing info or stats file/data/users/nscales/miniconda3/envs/qiime2-2019.1/share/fragment-insertion//ref/RAxML_info-reference-gg-raxml-bl.info
error:Uncaught exception: Parse_stats.Stats_parsing_error("too many partitions. Only one is allowed.")
Fatal error: exception Parse_stats.Stats_parsing_error("too many partitions. Only one is allowed.")

Which seems as though it is the info file I added. I actually do have a reference package as I was trying to do this using pplacer before I saw it could be done in qiime - is it possible to incorporate that? What does the “too many partitions” error mean, and how can I fix it?

Thank you!

fragment-insertion.o6534112.txt (26.9 KB)
RAxML_info.txt (34.7 KB)

(Stefan Janssen) #13

Good morning. We are now leaving my comfort zone, and I’d like to refer you to SEPP’s developer :-/ As a first hint, you might want to look at what we did to create a reference (tree/alignment/info) for SILVA 128: https://github.com/smirarab/sepp-refs/tree/master/silva, as mentioned earlier.
Another idea would be to take our default info file and adapt the necessary values to your data, but I don’t know which values those are, nor am I familiar with the info file format.

(Nick Scales) #14

I have found a solution! At https://github.com/smirarab/sepp/blame/master/tutorial/sepp-tutorial.md#L451 they mention that you need to remove the line in the info file that reads ‘Partition: 0 with name: No Name Provided’. After I removed that line from my info file and put the file back in share/fragment-insertion/ref/, the script ran to completion.
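
For anyone who prefers to script this fix, a minimal Python sketch follows. The line to remove is quoted from the post above; the function name is hypothetical, and the two other lines in the sample text are invented placeholders, not real RAxML output.

```python
# The exact line quoted in the post above.
PARTITION_LINE = "Partition: 0 with name: No Name Provided"


def drop_partition_line(info_text):
    """Return info_text without the 'Partition: 0 ...' line; keep all other lines."""
    kept = [
        line
        for line in info_text.splitlines(keepends=True)
        if line.strip() != PARTITION_LINE
    ]
    return "".join(kept)


# Invented sample text, only to show the behaviour.
sample = (
    "placeholder line before\n"
    "Partition: 0 with name: No Name Provided\n"
    "placeholder line after\n"
)
print(drop_partition_line(sample), end="")
```

To apply it to a real file, read the info file's text, pass it through drop_partition_line, and write the result back (ideally after keeping a copy of the original).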

Thanks so much for your help - have a great day!

(Stefan Janssen) #15

That is amazing and a very good catch - reading the documentation sometimes really helps :wink: I am sure this answer will also help others struggling with similar situations. Thanks a lot!
