Q2-picrust2 plugin error running hsp.py script

mmelendrez · November 12, 2018, 5:10pm

Hello - I am attempting to use the q2-picrust2 plugin for my dataset. Download and activation of the plugin seemed to go fine, with no errors thrown.

Per the tutorial, the plugin needs a tree created by q2-fragment-insertion. I did this with the following command, per the q2-fragment-insertion documentation:

(qiime2-2018.6) wsb255bioimac27:AllPrimerAnalysis_071118 mel_local$ qiime fragment-insertion sepp --i-representative-sequences allPrimers-rep-seqs.qza --o-tree insertion-tree.qza --o-placements insertion-placements.qza --p-threads 4

I then filtered my feature table to ensure that only features found in the tree were in the table:

(qiime2-2018.6) wsb255bioimac27:AllPrimerAnalysis_071118 mel_local$ qiime fragment-insertion filter-features --i-table allPrimers-dada2-table.qza --i-tree insertion-tree.qza --o-filtered-table allPrimers_filtered-insertion-table.qza --o-removed-table allPrimers_removed-insertion-table.qza --verbose

I further filtered my feature-table as I was only interested in the output pertaining to two of the three primer sets I was working with:

qiime feature-table filter-samples --i-table allPrimers_filtered-insertion-table.qza --m-metadata-file Pro_V4_keep.tsv --o-filtered-table table_V4-Pro.qza

Another filter step to keep only those samples pertaining to the 2 fermenters (rather than also including initial manure and food samples):

(qiime2-2018.8) wsb255bioimac27:Issue_0037_Bacteria mel_local$ qiime feature-table filter-samples --i-table table_V4-Pro.qza --m-metadata-file sample_metadata_110718.tsv --o-filtered-table table_V4-Pro-F12.qza --p-where "Subject IN ('Fermenter1', 'Fermenter2')"
Saved FeatureTable[Frequency] to: table_V4-Pro-F12.qza

So now I have:

FeatureTable[Frequency] type table_V4-Pro-F12.qza
SEPP tree - insertion-tree.qza

I ran the q2-picrust2 plugin per the tutorial with the suggested parameters just to see if I could get it working:

(qiime2-2018.8) wsb255bioimac27:Issue_0037_Bacteria mel_local$ qiime picrust2 custom-tree-pipeline --i-table table_V4-Pro-F12.qza --i-tree insertion-tree.qza --output-dir q2-picrust2-output --p-threads 1 --p-hsp-method pic --p-max-nsti 2

I got the following error:

Error running this command: hsp.py -i 16S -t /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp0356q166/placed_seqs.tre -p 1 -n -o /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp0356q166/picrust2_out/16S_predicted -m pic

I was unable to open the directory path to the tmp files to figure out what might have happened. The error is not informative, or at least it does not make clear what might have gone wrong.

I would appreciate any suggestions... files attached.

table_V4-Pro-F12.qza (809.9 KB)
insertion-tree.qza (3.1 MB)
sample_metadata_110718.tsv (17.7 KB)

WAS1 · November 20, 2018, 1:39pm

I have seen pretty similar errors recently.

Nicholas_Bokulich · November 20, 2018, 1:42pm

Hi @mmelendrez,
Sorry for the delay in responding — I am pinging @gmdouglas to look at this problem.
Thanks!

Ravenclaw · November 20, 2018, 2:00pm

I got the same error too!

gmdouglas · November 20, 2018, 2:04pm

Hey there,

The problem is that you're placing your sequences into the Greengenes reference tree instead of the PICRUSt2 tree. You'll need to specify the custom tree with q2-fragment-insertion as specified in the tutorial here: q2 picrust2 Tutorial · picrust/picrust2 Wiki · GitHub

It should work if you place your sequences into that tree. Please let me know if you run into other problems though!

Best,

Gavin

WAS1 · November 20, 2018, 2:21pm

Thanks for the good work and clarifications. The wording of the insertion step in the tutorial may suggest that the default files are used with the sepp script.

Are you referring to this tree, to be downloaded?
http://kronos.pharmacology.dal.ca/public_files/tutorial_datasets/picrust2_tutorial_files/reference.tre.qza

Maybe I haven't looked properly, but the directories already installed do not appear to contain a reference tree.

Many thanks

gmdouglas · November 20, 2018, 2:43pm

Hi there,

Yes, I'm referring to these commands in the tutorial:

wget http://kronos.pharmacology.dal.ca/public_files/tutorial_datasets/picrust2_tutorial_files/reference.fna.qza

wget http://kronos.pharmacology.dal.ca/public_files/tutorial_datasets/picrust2_tutorial_files/reference.tre.qza

qiime fragment-insertion sepp --i-representative-sequences mammal_seqs.qza \
                              --p-threads 1 --i-reference-alignment reference.fna.qza \
                              --i-reference-phylogeny reference.tre.qza \
                              --output-dir tutorial_placed_out

I'll add a clarification that these files need to be used.

Gavin

mmelendrez · November 20, 2018, 3:14pm

@gmdouglas - ok - so it's running now per this command:

(qiime2-2018.8) wsb255bioimac27:q2-picrust2 mel_local$ qiime fragment-insertion sepp --i-representative-sequences allPrimers-rep-seqs.qza --p-threads 1 --i-reference-alignment reference.fna.qza --i-reference-phylogeny reference.tre.qza --output-dir sepp_out

Cheers!
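
With placement now running against the PICRUSt2 reference files, the next step would presumably be to rerun the pipeline from the first post against the new tree. The following is a minimal sketch, not taken from the tutorial. It assumes SEPP writes its tree as tree.qza inside the --output-dir (sepp_out here) and reuses the table and parameters from the first post. The guard that prints the command instead of running it is an illustrative addition, and whether the table needs re-filtering against the new tree is not covered in this thread.

```shell
# File names carried over from earlier in the thread; adjust to your own outputs.
TABLE="table_V4-Pro-F12.qza"   # filtered feature table from the first post
PLACED_DIR="sepp_out"          # --output-dir given to fragment-insertion sepp
TREE="$PLACED_DIR/tree.qza"    # assumed name of the tree SEPP writes there

# Assemble the pipeline call with the same parameters as the first post.
set -- qiime picrust2 custom-tree-pipeline \
  --i-table "$TABLE" \
  --i-tree "$TREE" \
  --output-dir q2-picrust2-output \
  --p-threads 1 --p-hsp-method pic --p-max-nsti 2

# Run only if QIIME 2 is on PATH and the tree exists; otherwise just show the command.
if command -v qiime >/dev/null 2>&1 && [ -f "$TREE" ]; then
  "$@"
else
  echo "Would run: $*"
fi
```

If SEPP is still running or the environment is not activated, this only prints the command, which makes it easy to check the paths before committing to a long run.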

Questions:

The PICRUSt2 reference.fna.qza and reference.tre.qza files: are those files kept up to date, or will there be a tutorial, link, or explanation of how we might create updated alignment/tree files for use in the PICRUSt2 plugin in the future?
Is the default database in PICRUSt2 just the Greengenes 13_8 97% grouped OTUs, or is it 'more' (i.e., inclusive of RDP, Silva...)? You mentioned in the docs that it has been expanded 10X, so I was just wondering whether that comes from an updated Greengenes database or whether other databases were included.

Thanks again!

WAS1 · November 20, 2018, 6:42pm

Thanks.
I assume --p-max-nsti is derived from the normal NSTI score from earlier versions.

So I am wondering why this cannot be set to a float, say 0.4 (instead of the default value of 2), if one wants to increase the stringency for inclusion. It seems to accept only the integers 1 or 2.
Thanks

gmdouglas · November 20, 2018, 9:23pm

Hi there,

I will be updating these files periodically, but note that they're based on the IMG database since gene family abundances derived from genomes are needed. You can certainly use custom files with PICRUSt2, but I don't have a tutorial of exact commands replicating what I did. I will make the scripts I used available when PICRUSt2 is published though.
One confusing thing is that PICRUSt1 was also based on an earlier version of IMG (with 10X fewer genomes), but the predictions were made for all Greengenes OTUs in advance. The Greengenes predictions were what most PICRUSt1 users interacted with - it wasn't actually doing predictions for each individual dataset. In contrast, PICRUSt2 does perform predictions for each input dataset.

Best,

Gavin

gmdouglas · November 20, 2018, 9:23pm

Thanks for pointing out this bug! That parameter is meant to accept floats and is currently wrongly limited to integers in the QIIME 2 plugin. I'll change this and update the GitHub repo + tutorial.

Thanks,

Gavin

gmdouglas · November 21, 2018, 6:15pm

Hi again @WAS1,

The NSTI cut-off option is now implemented correctly in the most recent q2-picrust2 version (v0.0.2). I have updated the installation instructions to point to this new version.

Thanks,

Gavin
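
For anyone wanting the stricter cutoff asked about above, here is a rough sketch. It assumes q2-picrust2 v0.0.2 or later, where --p-max-nsti accepts floats per the post above. The is_valid_nsti helper, the variable names, and the output directory name are hypothetical; the input file names are carried over from earlier in the thread. It only prints the command so the value can be checked before a long run.

```shell
MAX_NSTI="0.4"   # stricter than the default of 2; value suggested earlier in the thread

# Hypothetical helper: accept only plain non-negative decimals such as 2 or 0.4.
is_valid_nsti() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?$'
}

if is_valid_nsti "$MAX_NSTI"; then
  # Same parameters as the first post, with the float cutoff swapped in.
  NSTI_CMD="qiime picrust2 custom-tree-pipeline \
--i-table table_V4-Pro-F12.qza --i-tree sepp_out/tree.qza \
--output-dir q2-picrust2-output-nsti --p-threads 1 \
--p-hsp-method pic --p-max-nsti $MAX_NSTI"
  echo "$NSTI_CMD"
else
  echo "Invalid NSTI cutoff: $MAX_NSTI" >&2
fi
```

With an older plugin version the float would presumably be rejected, as described earlier in the thread, so checking the installed version first is worthwhile.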
