Q2 picrust2 - output pathway tables


#1

Hello, I have run qiime picrust2 custom-tree-pipeline on my data using --p-hsp-method mp.

My question is about the pathway abundance and coverage tables. How are the coverage and abundance of each MetaCyc pathway calculated? Are they reported over the whole community (the predicted metagenome for each sample)? Are they corrected by each predicted genome, and if so, how? How does the calculation take into account the number of reactions in each pathway?

I got this from the FAQ section of the PICRUSt2 wiki, but I find it a bit confusing. I don't get what “the group being assessed” stands for.

“The coverage is based on the harmonic mean of confidence scores for each reaction in a pathway. The scores will differ depending on what the median reaction abundance is for the group being assessed (i.e. different reaction scores will be inferred if the reaction is within one predicted genome compared to across an entire sample). Since the median reaction abundance will be higher across an entire sample it’s harder to be confident in rare reactions within a sample based on this approach.”

Thanks very much in advance for any help.

best!


(Matthew Ryan Dillon) #2

ccing @gmdouglas


(Gavin Douglas) #3

Hi @mbcarbonetto,

Pathway abundances and coverages are calculated using the same approach used by HUMAnN2. Predicted EC numbers are first regrouped to be MetaCyc reactions, which can be linked to MetaCyc pathways.
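To make the regrouping step concrete, here is a minimal sketch. It assumes that regrouping means summing the abundance of every EC number linked to a reaction; the mapping, the identifiers, and the `regroup` helper are all hypothetical and are not the PICRUSt2 API or its real mapping file.

```python
from collections import defaultdict

# Hypothetical EC-to-reaction links, for illustration only.
EC_TO_REACTIONS = {
    "EC:1.1.1.1": ["RXN-A"],
    "EC:2.7.1.1": ["RXN-A", "RXN-B"],
}


def regroup(ec_abundances, ec_to_reactions):
    """Sum EC abundances into the reactions they map to (assumed behaviour)."""
    reaction_abundances = defaultdict(float)
    for ec, abundance in ec_abundances.items():
        # ECs with no linked reaction are simply dropped in this sketch.
        for reaction in ec_to_reactions.get(ec, []):
            reaction_abundances[reaction] += abundance
    return dict(reaction_abundances)


# Made-up predicted EC abundances for a single sample.
sample = {"EC:1.1.1.1": 5.0, "EC:2.7.1.1": 3.0, "EC:9.9.9.9": 2.0}
reactions = regroup(sample, EC_TO_REACTIONS)
```

An EC number linked to several reactions contributes to each of them here, which is one plausible reading of the regrouping step rather than a confirmed detail.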

The pathways that are present are identified by first running MinPath to identify the minimum pathways present to explain the reactions. The abundances of these pathways are essentially calculated by taking the harmonic mean of the MetaCyc reactions within the pathway.
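As a rough illustration of the harmonic mean step only, the sketch below computes a pathway abundance from made-up reaction abundances. It deliberately ignores MinPath, sub-pathway structure, and optional reactions, and the `pathway_abundance` helper is hypothetical, not a PICRUSt2 or HUMAnN2 function.

```python
from statistics import harmonic_mean


def pathway_abundance(reaction_abundances):
    """Harmonic mean of reaction abundances; 0.0 if any reaction is absent."""
    values = list(reaction_abundances)
    # Treat an empty pathway or a missing reaction as zero abundance (assumption).
    if not values or any(v <= 0 for v in values):
        return 0.0
    return harmonic_mean(values)


# Made-up reaction abundances for one pathway in one sample.
balanced = pathway_abundance([10.0, 10.0, 10.0])
skewed = pathway_abundance([1.0, 10.0, 100.0])
missing = pathway_abundance([0.0, 10.0, 100.0])
```

Because the harmonic mean is pulled toward the smallest value, the skewed case comes out close to its rarest reaction (about 2.7) rather than the arithmetic mean of 37, so a pathway is only scored as abundant when all of its reactions are.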

I say essentially because it’s a little more complicated (see explanation copied from HUMAnN2 wiki):

…the abundance for each pathway is a recursive computation of abundances of sub-pathways with paths resolved to abundances based on the relationships and abundances of the reactions contained in each. Each path, the smallest portion of a pathway or sub-pathway which can’t be broken down into sub-pathways, has an abundance that is the max or harmonic mean of the reaction abundances depending on the relationships of these reactions. Optional reactions are only added to the overall abundance if their abundance is greater than the harmonic mean of the required reactions.

Pathway coverages are calculated using the same approach except that the reaction abundances are transformed into reaction confidence scores.

In the QIIME 2 version of PICRUSt2, the pathway abundances and coverages are based on the whole metagenome per sample. There is no correction for individual contributing sequences. Note that the standalone version of PICRUSt2 can also give you the breakdown by sequence, but the QIIME 2 version currently cannot.


#4

Hi Gavin,

Thank you very much for your quick reply.

And thanks for developing the plugin.

Good to know that there is a choice for correction with the standalone version.

thanks again!

Belen