Analysis of qiime2-picrust2 plugin output

SetaPark · January 30, 2020, 9:27am

Hello All
I did
module load singularity/3
export TZ='Pacific/Auckland'

/workspace/hraegb/singularity/qiime-2019.7-picrust2.sif picrust2 full-pipeline \
--i-table $FILTER/table-dada2.qza \
--i-seq $FILTER/rep-seqs-dada2.qza \
--output-dir $WORKING/$PICR2 \
--o-ko-metagenome $PICR2/KO \
--o-ec-metagenome $PICR2/EC \
--o-pathway-abundance $PICR2/PA \
--p-threads 1 \
--p-hsp-method mp \
--p-max-nsti 2 \
--verbose

to obtain the KO.qza, PA.qza and EC.qza files. I was then able to visualise them and to extract the BIOM data to TSV. Core diversity etc. worked. I have a few queries:

1. The KO table had IDs from K00001 to K19789. There is a script to add descriptions and levels of the pathways here, but that does not work in the plugin. So how do we annotate them and do the level-wise analysis?
2. What is the best way to analyse the data? I have attached the TSV here for a look. extracted-KO-feature-table-biom.tsv (759.9 KB)
3. Also, is it possible to associate OTUs with the functional pathways at all within this plugin, or do we have to use the full picrust?

I would really appreciate your help.

Thanks very much

SetaPark

Nicholas_Bokulich · January 30, 2020, 11:27pm

Hi @SetaPark,
I am pinging @gmdouglas to see if he can help. Thanks in advance!

gmdouglas · January 31, 2020, 12:15am

Hi @SetaPark,

You will need to export the files from QIIME 2 if you would like to use the standalone scripts. Unfortunately the plugin just has limited functionality. You will need to export the file to BIOM format, convert that to TSV format (see: Converting between file formats — biom-format.org), and then make sure the header matches the files in the tutorial.
This is a really important question, but honestly it's extremely difficult to answer. It's actually unclear how to best analyze even standard metagenomics data of gene family or pathway abundances. Predicted metagenomes are a similar datatype, but with added noise and more biases. As part of my own research I'm trying to figure out a pipeline for identifying robust biomarkers in either predicted or actual metagenomes, but this is still ongoing. However, many people would analyze the data by converting to relative abundance and running analyses in R or running standard QIIME 2 data analysis commands. If you're interested specifically in testing for differential abundance I can only say that different methods give very different results in my experience (for both actual and predicted metagenomes) and that it's difficult to give a recommendation without a gold-standard to compare to.
You need to use the standalone version to get a breakdown of how OTUs contribute to pathways. The steps to get this datatype are described in the main tutorial: PICRUSt2 Tutorial (v2.3.0 beta) · picrust/picrust2 Wiki · GitHub
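
A minimal sketch of the relative-abundance conversion mentioned above, assuming the table has already been exported and converted to TSV with biom convert, so that it has an optional comment line, a "#OTU ID" header line, and one column per sample. The function name to_relative_abundance is hypothetical; this is only one way to do the normalisation and is not part of the plugin or of PICRUSt2.

```shell
# Hypothetical helper: convert a tab-separated feature table of counts
# into per-sample relative abundances (each sample column sums to 1).
# Assumed layout: lines starting with "#" are comments or the header,
# every other line is "feature_id<TAB>count<TAB>count...".
to_relative_abundance() {
    awk -F'\t' -v OFS='\t' '
        # Pass 1 (first read of the file): accumulate column totals.
        NR == FNR {
            if ($0 ~ /^#/) next
            for (i = 2; i <= NF; i++) total[i] += $i
            next
        }
        # Pass 2: print comment/header lines unchanged.
        /^#/ { print; next }
        # Pass 2: divide each count by its column total.
        {
            for (i = 2; i <= NF; i++)
                $i = (total[i] > 0) ? sprintf("%.6f", $i / total[i]) : "0.000000"
            print
        }
    ' "$1" "$1"
}
```

For example, `to_relative_abundance extracted-KO-feature-table-biom.tsv > KO-relative.tsv` would write the normalised table to KO-relative.tsv (an illustrative output name). Samples whose total is zero are left as zeros rather than dividing by zero.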

Best,

Gavin

SetaPark · January 31, 2020, 2:31am

Thanks very much @Nicholas_Bokulich

SetaPark · January 31, 2020, 2:45am

Thanks very much @gmdouglas and @Nicholas_Bokulich for your prompt responses. I will have a go at the full version now. If I have any problems (I am sure I will have some issues!), I will get back to you.

Regards
SetaPark

SetaPark · January 31, 2020, 7:19am

Hello again @Nicholas_Bokulich @gmdouglas
We have now installed picrust2. However, I am stuck at the very first step of the pipeline.

My question is:
How do we generate the .fna for this work? The NGS outputs were fastq.gz files, which were imported in PairedEndFastqManifestPhred33 format to get paired_end_demux.qza. That was then processed with DADA2 to get repseqs.qza and table.qza, and the latter was converted to BIOM format.

So I have the biom but not the .fna files for this script. I see that the study sequences are representative OTUs and/or ASVs under the typical workflow, but I do not have any .fna outputs.

picrust2_pipeline.py -s study_seqs.fna -i study_seqs.biom -o picrust2_out_pipeline -p 1

Thanks
SetaPark

gmdouglas · January 31, 2020, 3:03pm

Hi again,

The "repseqs.qza" file corresponds to the actual ASV DNA sequences, which you can export to FASTA format.

Gavin

Edit: Make sure that you read through the tool tutorial, which describes these input files: PICRUSt2 Tutorial (v2.3.0 beta) · picrust/picrust2 Wiki · GitHub
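
Once the sequences are exported to FASTA, it can help to confirm that the sequence IDs match the feature IDs in the table before running the pipeline. A minimal sketch, assuming the feature table is also available as a TSV (for example via biom convert) with "#"-prefixed header lines; the function name compare_ids is hypothetical and this check is not part of PICRUSt2 itself.

```shell
# Hypothetical helper: report IDs that appear in only one of
#   $1 = FASTA file of representative sequences
#   $2 = TSV feature table (first column = feature ID)
# Empty output means both files describe the same set of features.
compare_ids() {
    awk -F'\t' '
        # First file (FASTA): record the ID after ">" up to the
        # first space or tab; skip the sequence lines.
        NR == FNR {
            if (/^>/) {
                id = substr($0, 2)
                sub(/[ \t].*/, "", id)
                fasta[id] = 1
            }
            next
        }
        # Second file (TSV): skip comment/header and empty lines.
        /^#/ || NF == 0 { next }
        {
            table[$1] = 1
            if (!($1 in fasta)) print "only in table: " $1
        }
        END {
            for (id in fasta)
                if (!(id in table)) print "only in FASTA: " id
        }
    ' "$1" "$2"
}
```

Usage would look like `compare_ids seqs.fasta table.tsv`, where both file names are placeholders for your own exported files.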

SetaPark · February 2, 2020, 2:53am

Hello again @gmdouglas

Thanks so much. It worked. For the benefit of others who may be stuck here, this is what I did:
qiime tools export \
--input-path $FILTER/rep-seqs-dada2.qza \
--output-path $FILTER/export

Then
module avail picrust2

Then
module load picrust2/2.3.0

Then
picrust2_pipeline.py \
-s $FILTER/export/rep-seqs-dada2.fasta \
-i $FILTER/exported-feature-table_biom/feature-table.biom \
-o $PIC2/picrust2_out_pipeline \
-p 1

Then
add_descriptions.py -i $PIC2/picrust2_out_pipeline/EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -m EC \
-o $PIC2/picrust2_out_pipeline/EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz

add_descriptions.py -i $PIC2/picrust2_out_pipeline/KO_metagenome_out/pred_metagenome_unstrat.tsv.gz -m KO \
-o $PIC2/picrust2_out_pipeline/KO_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz

add_descriptions.py -i $PIC2/picrust2_out_pipeline/pathways_out/path_abun_unstrat.tsv.gz -m METACYC \
-o $PIC2/picrust2_out_pipeline/pathways_out/path_abun_unstrat_descrip.tsv.gz

Thanks very much once again,
Best wishes

YuZhang · March 25, 2021, 12:37pm

Hello sir, I would like to know how you did the downstream analysis?

acmaguila · July 3, 2021, 12:24pm

Hi, I am interested in how you moved forward with the downstream analysis of your dataset, especially the enrichments. Thank you, hoping you could share more on this!