This is a Community Tutorial for q2-breakaway within the qiime2-2018.8 release.
breakaway is the premier package for statistical analysis of microbial diversity.
breakaway implements the latest and greatest estimates of richness, as well as the most commonly used estimates. The breakaway philosophy is to estimate diversity, to put error bars on diversity estimates, and to perform hypothesis tests for diversity that use those error bars.
breakaway implements a number of different richness estimates. Please cite the following if you use them:
breakaway(): Willis and Bunge (2015). Estimating diversity via frequency ratios. Biometrics.
breakaway is based in R and requires installation of its dependencies, including praise, into your conda environment before installing breakaway. Please refer to the following instructions on how to install breakaway and its dependencies.
- Here we activate our example version of QIIME, qiime2-2018.8. If you're not sure what your current version of QIIME is, you can run conda env list in the command line to see a list of installed QIIME environments. Please note that q2-breakaway will not work with earlier versions of QIIME but should be functional with qiime2-2018.8 and later.
source activate qiime2-2018.8
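The version requirement above can be checked in a script. The following is a minimal sketch, assuming release names follow the year.month pattern seen in qiime2-2018.8; the function name qiime_release_ok is hypothetical and not part of QIIME 2 or q2-breakaway. It compares year and month as integers, because 2018.11 is a later release than 2018.8 even though it is smaller when read as a decimal number.

```shell
# Hypothetical helper (not part of QIIME 2): succeeds when a release string
# such as "qiime2-2018.11" or "2018.8" is at least 2018.8.
qiime_release_ok() {
  release="${1#qiime2-}"   # strip an optional "qiime2-" prefix
  year="${release%%.*}"    # text before the first dot
  rest="${release#*.}"     # text after the first dot
  month="${rest%%.*}"      # ignore any further components
  # Compare year first, then month, as integers.
  [ "$year" -gt 2018 ] || { [ "$year" -eq 2018 ] && [ "$month" -ge 8 ]; }
}

qiime_release_ok "qiime2-2018.8" && echo "qiime2-2018.8: new enough for q2-breakaway"
qiime_release_ok "qiime2-2018.6" || echo "qiime2-2018.6: too old for q2-breakaway"
```

The environment names to pass in are the ones listed by conda env list.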
(Expected installation time ~3-5 minutes)
conda install -c bioconda -c conda-forge bioconductor-phyloseq r-devtools r-tibble r-magrittr r-dplyr r-withr r-testthat r-praise unzip
- Note: When prompted during installation, select y to proceed.
pip install git+https://github.com/statdivlab/q2-breakaway.git
qiime dev refresh-cache
qiime breakaway --help
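If the help text does not appear, the plugin is probably not registered in the active environment. A minimal sketch of a scripted check, using only the help command shown above; the status variable is introduced here for illustration:

```shell
# Report whether the breakaway plugin responds in the current environment.
# The check also covers the case where qiime itself is not on the PATH.
if command -v qiime >/dev/null 2>&1 && qiime breakaway --help >/dev/null 2>&1; then
  status="available"
else
  status="missing"
fi
echo "q2-breakaway plugin: $status"
```

If this reports missing, re-activating the environment and re-running qiime dev refresh-cache is a reasonable first thing to try.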
- For this tutorial we will be using data from the "Moving Pictures" dataset. q2-breakaway requires input of a FeatureTable of frequency counts. We recommend using a FeatureTable that has been generated from dada2 in R with pool = TRUE to make sure that singletons have not been completely filtered out (this pooling option for q2-dada2 in QIIME 2 is under active development).
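To see whether a sample still contains singletons, it can help to tally how many features were observed once, twice, and so on. The sketch below does this with awk on made-up counts for a single sample. The feature names and values are hypothetical, and this only illustrates the idea of a frequency-count summary; it is not how q2-breakaway reads its input.

```shell
# Hypothetical per-sample counts: feature ID, then the number of reads observed.
counts='otu1 1
otu2 1
otu3 2
otu4 15
otu5 1
otu6 0'

# For each observed abundance, count how many features have that abundance.
# Features absent from the sample (count 0) are skipped.
freq_table=$(printf '%s\n' "$counts" |
  awk '$2 > 0 { f[$2]++ } END { for (k in f) print k, f[k] }' |
  sort -n)

printf 'abundance n_features\n%s\n' "$freq_table"
```

In this toy sample, three of the five observed features are singletons. A table in which the abundance-1 row is missing for every sample has probably had its singletons filtered out.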
breakaway uses low abundance taxa to predict missing diversity. Many recent quality control methods can filter out singletons and create stringently error controlled datasets that contain samples with few rare taxa. This will result in small error bars, as there is low uncertainty in estimating missing taxa if samples have few rare taxa. We are actively exploring ways to improve our methods, but in the short term suggest use of vsearch, or dada2 in R with pool = TRUE, to generate the FeatureTable needed for input into q2-breakaway.
qiime breakaway alpha \
  --i-table table-deblur.qza \
  --o-alpha-diversity richness-better.qza
You can export the results out of QIIME 2 to see the richness estimates, confidence intervals, and model used by breakaway.
qiime tools export \
  richness-better.qza \
  --output-dir richness
We see that Kemp, Poisson, and Negative Binomial models were used to generate our confidence intervals! Let's visualize our estimates and their error bars.
qiime breakaway plot \
  --i-alpha-diversity richness-better.qza \
  --o-visualization richness-better-plot.qzv
qiime tools view richness-better-plot.qzv
And now there are error bars around our estimates! Note that some error bars are smaller than others. This is because those samples had few rare taxa, so there is low uncertainty in estimating the number of missing taxa.
kemp(): Willis and Bunge (2015). Estimating diversity via frequency ratios. Biometrics.
betta(): Willis, A., Bunge, J., & Whitman, T. (2017). Improved detection of changes in species richness in high diversity microbial communities. JRSS-C.
breakaway_nof1(): Willis, A. (2016+). Species richness estimation with high diversity but spurious singletons. arXiv.
objective_bayes_*(): Barger, K. & Bunge, J. (2010). Objective Bayesian estimation for the number of species. Bayesian Analysis.