I’m using q2-picrust2 to predict metagenomes from ASVs following the q2-picrust2 tutorial:
- Using picrust2 default reference files with qiime2-2018.11 release.
- Inserting ASV representative sequences into picrust2 default reference tree with `qiime fragment-insertion sepp`.
- Predicting metagenomes with `qiime picrust2 custom-tree-pipeline`.
I’m planning to run this pipeline on multiple ASV feature tables and representative sequences generated from different samples, and then merge the resulting picrust2 feature tables (e.g. KO tables). Is this a valid approach, or is it better to merge the ASV feature tables and representative sequences before running picrust2?
I tested out both approaches by:
- Running picrust2 separately on two single-sample ASV feature tables and merging the picrust2 feature tables with `qiime feature-table merge`.
- Merging the two single-sample ASV feature tables and representative sequences with `qiime feature-table merge` and `qiime feature-table merge-seqs`, respectively, and running picrust2 on the merged table and sequences.
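
To make the merging step concrete: conceptually, merging two single-sample tables means taking the union of features and filling absent entries with zero. Here is a minimal pandas sketch of that idea. It uses made-up toy counts and a hypothetical helper `merge_tables`, and it assumes the two tables share no sample IDs. It is an illustration only, not the `qiime feature-table merge` implementation.

```python
import pandas as pd


def merge_tables(a: pd.DataFrame, b: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: merge feature-by-sample tables with disjoint samples."""
    overlap = a.columns.intersection(b.columns)
    if len(overlap) > 0:
        raise ValueError(f"overlapping sample IDs: {list(overlap)}")
    # Outer join on features (rows); a feature absent from one table gets 0.
    return pd.concat([a, b], axis=1).fillna(0)


# Toy data only (not real results): rows are features, columns are samples.
table_a = pd.DataFrame({"sample1": [10.0, 5.0]}, index=["K00001", "K00002"])
table_b = pd.DataFrame({"sample2": [3.0, 7.0]}, index=["K00002", "K00003"])

merged = merge_tables(table_a, table_b)
print(merged)
```

In this sketch the merged total always equals the sum of the two input totals, so merging by itself should not change the overall count.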
I compared the resulting picrust2 feature tables with `qiime feature-table summarize` and noticed some minor differences. For example, the numbers of samples and features (e.g. KOs) are the same across the tables, but the total number of sequences is lower for Approach #1 (95,409,620 seqs) than for Approach #2 (95,412,070 seqs).
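
To put the gap in perspective, the two totals above differ by 2,450, which is roughly 0.003% of the total. Below is a rough Python sketch of how I could quantify that and narrow down where two tables disagree. The helper `summarize_differences` and the toy tables are hypothetical; it assumes both picrust2 tables have been loaded as pandas DataFrames with features as rows and samples as columns.

```python
import pandas as pd

# Totals reported by `qiime feature-table summarize` for the two approaches.
total_separate = 95_409_620  # Approach #1: run picrust2 separately, then merge
total_merged = 95_412_070    # Approach #2: merge first, then run picrust2

abs_diff = total_merged - total_separate
rel_diff = abs_diff / total_merged
print(f"absolute difference: {abs_diff:,}")
print(f"relative difference: {rel_diff:.4%}")


def summarize_differences(a: pd.DataFrame, b: pd.DataFrame) -> dict:
    """Hypothetical helper: summarize where two feature-by-sample tables disagree."""
    # Align on the union of features and samples; missing cells count as 0.
    a, b = a.align(b, join="outer", fill_value=0)
    delta = b - a
    return {
        "per_sample": delta.sum(axis=0),  # how much each sample contributes to the gap
        "n_cells_differing": int((delta != 0).to_numpy().sum()),
        "max_abs_cell_difference": float(delta.abs().to_numpy().max()),
    }


# Toy tables only (not real results), differing in a single cell.
toy_1 = pd.DataFrame({"s1": [10.0, 5.0], "s2": [3.0, 7.0]}, index=["K00001", "K00002"])
toy_2 = pd.DataFrame({"s1": [10.0, 5.0], "s2": [3.0, 7.5]}, index=["K00001", "K00002"])
summary = summarize_differences(toy_1, toy_2)
print(summary["per_sample"])
```

The per-sample totals would show whether the difference is concentrated in one sample or spread across both.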
Are these differences to be expected (e.g. due to stochasticity in picrust2 or some other reason)? Is one approach recommended over the other?
I’m happy to provide exact commands and outputs if that’s helpful – just wanted to put the conceptual question about merging out there first. Thanks for your guidance!