q2-phylogenize: a pipeline for associating microbial genes with environments, accounting for phylogeny


Hi all, we’ve released a (beta) plugin for QIIME2 that uses phylogenetic regression to link microbial genes to either prevalence in an environment or specificity for that environment relative to others. It works from denoised 16S amplicon sequencing data (from DADA2 or Deblur) and a database of sequenced genomes. Phylogenetic regression accounts for the fact that species sharing ancestry will often share phenotypic traits, such as colonizing similar environments. Left uncorrected, this confounder can lead to high rates of spurious associations. We hope researchers doing functional analysis of 16S data will find this tool particularly useful.

The underlying method is described in Bradley, Nayfach, and Pollard 2018 and in a follow-up preprint, which will soon be updated with information about the plugin and new analyses. Figures 1-2 of the 2018 paper, in particular, illustrate the benefits of phylogenetic regression over a standard linear model for identifying microbial genes associated with particular environments.
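For anyone who wants to see the confounding problem in miniature, here is a toy simulation in Python. It is only a sketch and is not phylogenize code (phylogenize itself is an R package). It assumes a made-up tree of 4 clades with 10 species each, two continuous traits that evolve independently along that tree, and a Brownian-motion-style covariance matrix. All function names and parameters below are hypothetical. Because the two traits are unrelated by construction, any "significant" slope is a false positive.

```python
import numpy as np

def clade_covariance(n_clades=4, per_clade=10, shared=0.9):
    """Toy phylogenetic covariance (hypothetical tree): species in the same
    clade share a fraction `shared` of their evolutionary history."""
    n = n_clades * per_clade
    C = np.zeros((n, n))
    for k in range(n_clades):
        block = slice(k * per_clade, (k + 1) * per_clade)
        C[block, block] = shared
    np.fill_diagonal(C, 1.0)
    return C

def _t_from_design(X, y):
    """Least-squares fit of y on X; returns the t-statistic of column 1."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta[1] / np.sqrt(sigma2 * XtX_inv[1, 1])

def slope_t_statistic(x, y):
    """Standard linear model: treats every species as independent."""
    X = np.column_stack([np.ones_like(x), x])
    return _t_from_design(X, y)

def phylo_slope_t_statistic(x, y, C):
    """Phylogenetic GLS by whitening: with C = L L^T, regress L^-1 y on L^-1 X."""
    L = np.linalg.cholesky(C)
    X = np.column_stack([np.ones_like(x), x])
    return _t_from_design(np.linalg.solve(L, X), np.linalg.solve(L, y))

def false_positive_rates(n_sims=200, t_crit=2.02, seed=0):
    """Fraction of simulations where an unrelated trait looks 'significant'.
    t_crit approximates the two-sided 5% cutoff for 38 degrees of freedom."""
    rng = np.random.default_rng(seed)
    C = clade_covariance()
    L = np.linalg.cholesky(C)
    n = C.shape[0]
    ols_hits = gls_hits = 0
    for _ in range(n_sims):
        # x and y evolve independently, but both inherit clade structure.
        x = L @ rng.standard_normal(n)
        y = L @ rng.standard_normal(n)
        ols_hits += abs(slope_t_statistic(x, y)) > t_crit
        gls_hits += abs(phylo_slope_t_statistic(x, y, C)) > t_crit
    return ols_hits / n_sims, gls_hits / n_sims

if __name__ == "__main__":
    ols_fpr, gls_fpr = false_positive_rates()
    print(f"standard linear model false-positive rate: {ols_fpr:.2f}")
    print(f"phylogenetic regression false-positive rate: {gls_fpr:.2f}")
```

In this toy setup the standard linear model should flag far more spurious associations than the phylogenetic version, because it counts 40 species as 40 independent observations when the signal really comes from 4 clades. The real method handles actual trees and gene presence/absence, so treat this purely as intuition.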

phylogenize generates an interactive HTML report that lets you visualize prevalence or specificity in an environment across trees, along with tables of significantly associated genes and significantly enriched SEED subsystems (e.g. “nitrogen fixation”).


You will need to install the phylogenize pipeline (released as an R package) within the QIIME2 environment, then install the q2-phylogenize plugin. Full instructions can be found at the phylogenize and q2-phylogenize Bitbucket repositories.

In principle, phylogenize should work in any UNIX-like environment where the dependencies can be installed, but so far we have only tested it on Ubuntu 16.04 and up. We recommend running phylogenize on a system with at least 6 GB of RAM.

For general tips on using phylogenize, see the higher-level tutorial at the phylogenize website.


Comments and feedback are very welcome! Feel free to leave replies here or to open an issue on the Bitbucket pages for phylogenize or q2-phylogenize. Thanks as well to the QIIME2 developers and to @cduvallet for releasing helpful resources on QIIME2 plugin development.


This looks great, @pbradz! If you get a chance, please register this plugin at library.qiime2.org.


Done, thanks so much!


Just as a note, we’ve uploaded a major revision to the preprint on bioRxiv. Among other changes, the new preprint reflects updates to the phylogenize codebase, including the introduction of this QIIME2 plugin interface. It also adds a new analysis of Earth Microbiome Project data in which we use phylogenize to find microbial genes associated with life in the plant rhizosphere. Again, feedback is welcome, and thanks for reading!