to obtain the KO.qza, PA.qza, and EC.qza files. I was then able to visualise them, as well as export the BIOM data to TSV. Core diversity etc. worked. I have two queries.
The KO table had IDs from K00001 to K19789. There is a script to add descriptions and levels of the pathways here, but that does not work in the plugin. So how do we annotate them and do the level-wise analysis?
You will need to export the files from QIIME 2 if you would like to use the standalone scripts. Unfortunately the plugin just has limited functionality. You will need to export the file to BIOM format, convert that to TSV format (see: http://biom-format.org/documentation/biom_conversion.html), and then make sure the header matches the files in the tutorial.
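As a rough sketch of the header step: assuming `biom convert --to-tsv` wrote its usual `# Constructed from biom file` comment line followed by a `#OTU ID` header row, and assuming the standalone script expects the first column to be called `function` as in the PICRUSt2 unstratified output tables (please check this against the tutorial files), something like the following Python could tidy the TSV. The function name `fix_biom_tsv_header` and the table values are made up for illustration.

```python
def fix_biom_tsv_header(tsv_text, first_column="function"):
    """Return TSV text whose header row starts with `first_column`.

    Assumes the layout written by `biom convert --to-tsv`: an optional
    '# Constructed from biom file' comment line, then a header row that
    begins with '#OTU ID'.
    """
    out_lines = []
    header_done = False
    for line in tsv_text.splitlines():
        if header_done:
            out_lines.append(line)
        elif line.startswith("#OTU ID"):
            # Rename only the first field; sample IDs stay untouched.
            fields = line.split("\t")
            fields[0] = first_column
            out_lines.append("\t".join(fields))
            header_done = True
        elif line.startswith("#"):
            # Drop comment lines that precede the header.
            continue
        else:
            # No biom-style header found; leave the table as it is.
            out_lines.append(line)
            header_done = True
    return "\n".join(out_lines) + "\n"


# Tiny made-up table for illustration only.
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tsampleA\tsampleB\n"
    "K00001\t10.0\t0.0\n"
    "K00002\t5.0\t15.0\n"
)
fixed = fix_biom_tsv_header(example)
print(fixed, end="")
```

For pathway tables the first column in the PICRUSt2 outputs is named `pathway` rather than `function`, so `first_column` would need to be changed accordingly.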
This is a really important question, but honestly it’s extremely difficult to answer. It’s actually unclear how to best analyze even standard metagenomics data of gene family or pathway abundances. Predicted metagenomes are a similar datatype, but with added noise and more biases. As part of my own research I’m trying to figure out a pipeline for identifying robust biomarkers in either predicted or actual metagenomes, but this is still ongoing. However, many people would analyze the data by converting to relative abundance and running analyses in R or running standard QIIME 2 data analysis commands. If you’re interested specifically in testing for differential abundance I can only say that different methods give very different results in my experience (for both actual and predicted metagenomes) and that it’s difficult to give a recommendation without a gold-standard to compare to.
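To make the "converting to relative abundance" step concrete, here is a minimal sketch in plain Python. It assumes a functions-by-samples table laid out like the unstratified PICRUSt2 output, and divides each sample column by its own total. The function name and the toy numbers are invented, and this is only one common normalization, not a recommendation for any particular differential abundance method.

```python
import csv
import io


def to_relative_abundance(tsv_text):
    """Convert a functions-by-samples TSV to per-sample proportions.

    Returns (sample_ids, {function_id: [proportion per sample]}).
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]
    counts = {row[0]: [float(x) for x in row[1:]] for row in reader if row}

    # Column totals: the summed abundance of all functions in each sample.
    totals = [sum(col) for col in zip(*counts.values())]

    rel = {}
    for func, values in counts.items():
        # Samples with a zero total are left at 0.0 to avoid dividing by zero.
        rel[func] = [v / t if t > 0 else 0.0 for v, t in zip(values, totals)]
    return samples, rel


# Toy table for illustration only.
table = (
    "function\tsampleA\tsampleB\n"
    "K00001\t10\t0\n"
    "K00002\t30\t20\n"
)
samples, rel = to_relative_abundance(table)
for func, props in rel.items():
    print(func, props)
```

After this step every sample column sums to 1, so samples with different totals become comparable, although the data are then compositional and that affects which downstream tests are appropriate.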
Thanks very much @gmdouglas and @Nicholas_Bokulich for your prompt responses. I will have a go at the full version now. If I have any problems (I am sure I will have some issues!), I will get back to you.
My question is:
How do we generate the .fna for this work? The NGS output was fastq.gz files, which were imported in PairedEndFastqManifestPhred33 format to get paired_end_demux.qza. That was then processed with DADA2 to get repseqs.qza and table.qza, with the latter being converted to BIOM.
So I have the BIOM file but not the .fna file for this script. I see that the study sequences are representative OTUs and/or ASVs under the typical workflow, but I do not have any .fna outputs.
Thanks so much. It worked. For the benefit of others who may be stuck here, this is what I did:
qiime tools export \
  --input-path $FILTER/rep-seqs-dada2.qza \
  --output-path $FILTER/export
Then
module avail picrust2
Then
module load picrust2/2.3.0
Then
picrust2_pipeline.py \
  -s $FILTER/export/rep-seqs-dada2.fasta \
  -i $FILTER/exported-feature-table_biom/feature-table.biom \
  -o $PIC2/picrust2_out_pipeline \
  -p 1
Then
add_descriptions.py -i $PIC2/picrust2_out_pipeline/EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -m EC \
  -o $PIC2/picrust2_out_pipeline/EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i $PIC2/picrust2_out_pipeline/KO_metagenome_out/pred_metagenome_unstrat.tsv.gz -m KO \
  -o $PIC2/picrust2_out_pipeline/KO_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
Hi, I am interested in how you moved forward with the downstream analysis of your dataset, especially the enrichments. Thank you, I hope you can share more on this!