Q2-picrust2 plugin error running hsp.py script

picrust

(Mel Melendrez) #1

Hello - I am attempting to use the q2-picrust2 plugin for my dataset. Download and activation of the plugin seemed to go fine; no errors were thrown.

Per the tutorial, the plugin needs a tree created by q2-fragment-insertion. I created one with the following command, per the q2-fragment-insertion documentation:

(qiime2-2018.6) wsb255bioimac27:AllPrimerAnalysis_071118 mel_local$ qiime fragment-insertion sepp --i-representative-sequences allPrimers-rep-seqs.qza --o-tree insertion-tree.qza --o-placements insertion-placements.qza --p-threads 4

I then filtered my feature table so that only features found in the tree remained in the table:

(qiime2-2018.6) wsb255bioimac27:AllPrimerAnalysis_071118 mel_local$ qiime fragment-insertion filter-features --i-table allPrimers-dada2-table.qza --i-tree insertion-tree.qza --o-filtered-table allPrimers_filtered-insertion-table.qza --o-removed-table allPrimers_removed-insertion-table.qza --verbose
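The idea behind this filtering step (keep only features that appear as tips in the tree) can be checked by hand on plain-text exports. The sketch below is only an illustration, not how the plugin works internally: it assumes the tree has been exported to a Newick file and the feature IDs are listed one per line, and the helper names `newick_tips` and `features_missing_from_tree` are hypothetical.

```shell
# Hypothetical helpers for illustration; not part of QIIME 2 or PICRUSt2.

# Crude tip-label extraction from a Newick file: a tip label directly follows
# "(" or ",", whereas internal-node labels follow ")". Quoted labels that
# contain special characters are not handled.
newick_tips() {
    tr -d ' \t\n' < "$1" | grep -oE '[(,][^(),:;]+' | tr -d '(,' | LC_ALL=C sort -u
}

# Print the feature IDs (one per line in file $1) that are NOT tips of the
# Newick tree in file $2.
features_missing_from_tree() {
    ids_sorted=$(mktemp)
    tips_sorted=$(mktemp)
    LC_ALL=C sort -u "$1" > "$ids_sorted"
    newick_tips "$2" > "$tips_sorted"
    # comm -23 keeps lines that occur only in the first (sorted) file.
    LC_ALL=C comm -23 "$ids_sorted" "$tips_sorted"
    rm -f "$ids_sorted" "$tips_sorted"
}
```

If `features_missing_from_tree feature_ids.txt tree.nwk` prints nothing, every listed feature is present in the tree (both file names here are placeholders).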

I further filtered my feature-table as I was only interested in the output pertaining to two of the three primer sets I was working with:

qiime feature-table filter-samples --i-table allPrimers_filtered-insertion-table.qza --m-metadata-file Pro_V4_keep.tsv --o-filtered-table table_V4-Pro.qza

Another filter step to keep only those samples pertaining to the 2 fermenters (rather than also including initial manure and food samples):

(qiime2-2018.8) wsb255bioimac27:Issue_0037_Bacteria mel_local$ qiime feature-table filter-samples --i-table table_V4-Pro.qza --m-metadata-file sample_metadata_110718.tsv --o-filtered-table table_V4-Pro-F12.qza --p-where "Subject IN ('Fermenter1', 'Fermenter2')"
Saved FeatureTable[Frequency] to: table_V4-Pro-F12.qza

So now I have:

  • FeatureTable[Frequency] table - table_V4-Pro-F12.qza
  • SEPP tree - insertion-tree.qza

I ran the q2-picrust2 plugin per the tutorial with the suggested parameters just to see if I could get it working:

(qiime2-2018.8) wsb255bioimac27:Issue_0037_Bacteria mel_local$ qiime picrust2 custom-tree-pipeline --i-table table_V4-Pro-F12.qza --i-tree insertion-tree.qza --output-dir q2-picrust2-output --p-threads 1 --p-hsp-method pic --p-max-nsti 2

I got the following error:

Error running this command: hsp.py -i 16S -t /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp0356q166/placed_seqs.tre -p 1 -n -o /var/folders/12/2j8hq03s52lbnstw5wh008k80000gq/T/tmp0356q166/picrust2_out/16S_predicted -m pic

I was unable to open the directory path to the tmp files to figure out what might have happened. The error is not informative… or at least not intuitive in terms of what might have gone wrong.

I would appreciate any suggestions… files attached.

table_V4-Pro-F12.qza (809.9 KB)
insertion-tree.qza (3.1 MB)
sample_metadata_110718.tsv (17.7 KB)


#7

I have seen pretty similar errors recently.


(Nicholas Bokulich) #8

Hi @mmelendrez,
Sorry for the delay in responding — I am pinging @gmdouglas to look at this problem.
Thanks!


(David Esteban) #9

I got the same error too!


(Gavin Douglas) #10

Hey there,

The problem is that you’re placing your sequences into the Greengenes reference tree instead of the PICRUSt2 tree. You’ll need to specify the custom tree with q2-fragment-insertion as specified in the tutorial here: https://github.com/picrust/picrust2/wiki/q2-picrust2-Tutorial

It should work if you place your sequences into that tree. Please let me know if you run into other problems though!

Best,

Gavin


#11

Thanks for the good work and the clarifications. The wording of the insertion step in the tutorial may suggest using the default files that come with the SEPP script.

Are you referring to this tree, which has to be downloaded?
http://kronos.pharmacology.dal.ca/public_files/tutorial_datasets/picrust2_tutorial_files/reference.tre.qza

Maybe I haven't looked properly, but the directories already installed do not appear to contain a reference tree.

Many thanks

(Gavin Douglas) #12

Hi there,

Yes, I’m referring to these commands in the tutorial:

wget http://kronos.pharmacology.dal.ca/public_files/tutorial_datasets/picrust2_tutorial_files/reference.fna.qza

wget http://kronos.pharmacology.dal.ca/public_files/tutorial_datasets/picrust2_tutorial_files/reference.tre.qza

qiime fragment-insertion sepp --i-representative-sequences mammal_seqs.qza \
                              --p-threads 1 --i-reference-alignment reference.fna.qza \
                              --i-reference-phylogeny reference.tre.qza \
                              --output-dir tutorial_placed_out

I’ll add a clarification that these files need to be used.
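Since the placement step can take a while, a quick pre-flight check that both downloaded reference files are actually in place may save a failed run. This is a minimal sketch; the function name `check_picrust2_refs` is hypothetical and not part of QIIME 2 or PICRUSt2, and it only tests that the two files named above exist and are non-empty.

```shell
# Hypothetical helper for illustration; not part of QIIME 2 or PICRUSt2.
# Reports any reference artifact that is missing or empty in directory $1
# (defaults to the current directory) and returns non-zero if any is.
check_picrust2_refs() {
    dir="${1:-.}"
    rc=0
    for f in reference.fna.qza reference.tre.qza; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Running `check_picrust2_refs .` in the working directory before the sepp command above returns success only when both files are present.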

Gavin


(Mel Melendrez) #13

@gmdouglas - ok - so it’s running now per this command:

(qiime2-2018.8) wsb255bioimac27:q2-picrust2 mel_local$ qiime fragment-insertion sepp --i-representative-sequences allPrimers-rep-seqs.qza --p-threads 1 --i-reference-alignment reference.fna.qza --i-reference-phylogeny reference.tre.qza --output-dir sepp_out

Cheers!

Questions

  • Will the PICRUSt2 reference.fna.qza and reference.tre.qza files be updated, or will there be a tutorial, link, or explanation of how we might create updated alignment/tree files for use in the PICRUSt2 plugin in the future?
  • Is the default database in PICRUSt2 just the Greengenes 13_8 97% OTUs, or is it broader (i.e. inclusive of RDP, Silva…)? You mentioned in the docs that it has been expanded 10X, so I was wondering whether that reflects an updated Greengenes database or whether other databases were included.

Thanks again!


#14

Thanks.
I assume --p-max-nsti is derived from the normal NSTI score from earlier versions.

So I am wondering why this cannot be set to a float, say 0.4 (instead of the default value of 2), if one wants to increase the stringency for inclusion. It seems to accept only the integers 1 or 2.
Thanks


(Gavin Douglas) #15

Hi there,

  • I will be updating these files periodically, but note that they’re based on the IMG database since gene family abundances derived from genomes are needed. You can certainly use custom files with PICRUSt2, but I don’t have a tutorial of exact commands replicating what I did. I will make the scripts I used available when PICRUSt2 is published though.

  • One confusing thing is that PICRUSt1 was also based on an earlier version of IMG (with 10X fewer genomes), but the predictions were made for all Greengenes OTUs in advance. The Greengenes predictions were what most PICRUSt1 users interacted with - it wasn’t actually doing predictions for each individual dataset. In contrast, PICRUSt2 does perform predictions for each input dataset.

Best,

Gavin


(Gavin Douglas) #16

Thanks for pointing out this bug! That parameter is meant to accept floats, but the QIIME2 plugin currently limits it to integers by mistake. I’ll change this and update the GitHub repo and tutorial.

Thanks,

Gavin


(Gavin Douglas) #17

Hi again @WAS1,

The NSTI cut-off option is now implemented correctly in the most recent q2-picrust2 version (v0.0.2). I have updated the installation instructions to point to this new version.
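To make the effect of a fractional cut-off concrete, here is a small sketch of the underlying idea: sequences whose NSTI exceeds the threshold are dropped. It is not the plugin's code. The table below is made-up toy data, and its column names and order (sequence ID, 16S count, NSTI in the third column) are assumptions for illustration only; `filter_by_nsti` is a hypothetical helper.

```shell
# Toy, made-up table: sequence ID, predicted 16S copy number, NSTI.
# The column names and their order are assumptions for illustration only.
nsti_table=$(mktemp)
printf 'sequence\t16S_rRNA_Count\tmetadata_NSTI\n' >  "$nsti_table"
printf 'asv1\t1\t0.05\n'                           >> "$nsti_table"
printf 'asv2\t2\t0.39\n'                           >> "$nsti_table"
printf 'asv3\t1\t1.70\n'                           >> "$nsti_table"

# Hypothetical helper: keep the header plus every row whose NSTI
# (assumed to be column 3) is at or below the cut-off.
filter_by_nsti() {
    # $1 = TSV file, $2 = maximum NSTI (may be fractional)
    awk -F'\t' -v max="$2" 'NR == 1 || ($3 + 0) <= (max + 0)' "$1"
}

# With a cut-off of 0.4, asv3 (NSTI 1.70) is excluded.
filter_by_nsti "$nsti_table" 0.4
```

With the default cut-off of 2, all three toy sequences would be retained, which is why a lower fractional value is the more stringent choice.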

Thanks,

Gavin