Q2-breakaway: Community Tutorial

QIIME2 Tutorial: q2-breakaway

This is a Community Tutorial for q2-breakaway within the qiime2-2018.8 release.

breakaway is the premier package for statistical analysis of microbial diversity. breakaway implements the latest and greatest estimates of richness, as well as the most commonly used estimates. The breakaway philosophy is to estimate diversity, to put error bars on diversity estimates, and to perform hypothesis tests for diversity that use those error bars.

Citing breakaway

The R package breakaway implements a number of different richness estimates. Please cite the following if you use them:

  • breakaway(): Willis and Bunge (2015). Estimating diversity via frequency ratios. Biometrics.

Let's get started!

breakaway is an R package. Before installing it, install its dependencies (phyloseq, devtools, ggplot2, magrittr, tibble, dplyr, withr, testthat, and praise) into your conda environment. The instructions below cover installing breakaway and its dependencies.

Activate your QIIME Environment

  • Here we activate our example version of QIIME 2, qiime2-2018.8. If you're not sure which version you have, run conda env list on the command line to see a list of installed QIIME 2 environments. Please note that q2-breakaway will not work with versions earlier than qiime2-2018.8, but should work with qiime2-2018.8 and later.
source activate qiime2-2018.8

Install breakaway dependencies

(Expected installation time ~3-5 minutes)

conda install -c bioconda -c conda-forge bioconductor-phyloseq r-devtools r-tibble r-magrittr r-dplyr r-withr r-testthat r-praise unzip
  • Note: When prompted during installation, enter y to proceed.

Install breakaway and q2-breakaway

pip install git+https://github.com/statdivlab/q2-breakaway.git

qiime dev refresh-cache

Check that breakaway is installed

qiime breakaway --help
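
If you script the installation, the check above can be wrapped so that a failure is reported clearly. This is only a minimal sketch: check_ok is a hypothetical helper name (not part of QIIME 2 or q2-breakaway) that runs whatever command it is given and reports whether it exited successfully.

```shell
# check_ok is a hypothetical helper, not part of QIIME 2 or q2-breakaway.
# It runs the given command, hides its output, and reports success or failure.
check_ok() {
    if "$@" >/dev/null 2>&1; then
        echo "ok: $*"
    else
        echo "FAILED: $*"
        return 1
    fi
}
```

For example, check_ok qiime breakaway --help should print an "ok" line once the plugin is installed and the cache has been refreshed.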

How to use q2-breakaway

  • For this tutorial we will use the "Moving Pictures" dataset. q2-breakaway takes a FeatureTable of frequency counts as input. We recommend a FeatureTable generated with deblur, vsearch, or dada2 in R with pool = TRUE, so that singletons have not been completely filtered out (the pooling option for q2-dada2 in QIIME 2 is under active development).

    Note: breakaway uses low-abundance taxa to predict missing diversity. Many recent quality control methods can filter out singletons and create stringently error-controlled datasets that contain samples with few rare taxa. This results in small error bars, because there is low uncertainty in estimating missing taxa when samples have few rare taxa. We are actively exploring ways to improve our methods, but in the short term we suggest using deblur, vsearch, or dada2 in R with pool = TRUE to generate the FeatureTable needed for input into breakaway.
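
One rough way to check whether singletons survived quality control is to count them per sample. The sketch below assumes the FeatureTable has already been exported and converted to a tab-separated file (for example with biom convert and its --to-tsv option), and that the file has a header row starting with "#OTU ID" followed by one count column per sample. count_singletons is a hypothetical helper name, and the file layout is an assumption rather than something q2-breakaway guarantees.

```shell
# count_singletons is a hypothetical helper, not part of QIIME 2 or breakaway.
# Assumed input: a tab-separated feature table whose header row starts with
# "#OTU ID", with one row per feature and one count column per sample.
# Output: one line per sample, "<sample><TAB><number of features seen exactly once>".
count_singletons() {
    awk -F '\t' '
        /^#OTU ID/ { for (i = 2; i <= NF; i++) name[i] = $i; ncol = NF; next }
        /^#/       { next }   # skip any other comment lines
        { for (i = 2; i <= NF; i++) if ($i == 1) single[i]++ }
        END { for (i = 2; i <= ncol; i++) printf "%s\t%d\n", name[i], single[i] }
    ' "$1"
}
```

A sample that reports zero singletons is one where, per the note above, the error bars are likely to be small.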

table-deblur.qza

qiime breakaway alpha \
--i-table table-deblur.qza \
--o-alpha-diversity richness-better.qza

You can export the results from QIIME 2 to see the richness estimates, confidence intervals, and the model used by breakaway.

qiime tools export \
richness-better.qza \
--output-dir richness

alpha-diversity.tsv
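
To take a quick look at the exported file from the command line, something like the following may help. It is a minimal sketch: peek_tsv is a hypothetical helper name, and it makes no assumption about the column names in the file; it only prints the first few lines and the total line count.

```shell
# peek_tsv is a hypothetical helper, not part of QIIME 2 or q2-breakaway.
# Usage: peek_tsv FILE [N]
# Prints the first N lines of FILE (default 5), then the total number of lines.
peek_tsv() {
    file=$1
    n=${2:-5}
    head -n "$n" "$file"
    printf '(%d lines total)\n' "$(( $(wc -l < "$file") ))"
}
```

With the export directory used above, this would be called as peek_tsv richness/alpha-diversity.tsv.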

We see that Kemp, Poisson, and Negative Binomial models were used to generate our confidence intervals! Let's visualize our estimates and their error bars.

qiime breakaway plot \
--i-alpha-diversity richness-better.qza \
--o-visualization richness-better-plot.qzv

To view the plot:

qiime tools view richness-better-plot.qzv

And now there are error bars around our estimates! Note that some error bars are smaller than others. This is because those samples had few rare taxa, so there is low uncertainty in estimating the number of missing taxa.

Future Functionality (things to look forward to!)

  • kemp(): Willis and Bunge (2015). Estimating diversity via frequency ratios. Biometrics.
  • betta(): Willis, A., Bunge, J., & Whitman, T. (2017). Improved detection of changes in species richness in high diversity microbial communities. JRSS-C.
  • breakaway_nof1(): Willis, A. (2016+). Species richness estimation with high diversity but spurious singletons. arXiv.
  • objective_bayes_*(): Barger, K. & Bunge, J. (2010). Objective Bayesian estimation for the number of species. Bayesian Analysis.

Hi @Pauline_Trinh, Thanks so much for this contribution! It's very exciting to see breakaway accessible through QIIME 2! I just have a couple of comments on this document.

I think you should update this to refer to 2018.8 since it depends on changes that are currently in the development version of QIIME 2, so q2-breakaway doesn't actually work with 2018.6. I got hung up on this for a few minutes until @ebolyen helped me out. Since the 2018.8 release is only a couple of days away, this won't be a big deal (but it would make sense to note that this is pending the 2018.8 release, which is scheduled for later this week).

Is the pool = TRUE option functionality that we should make accessible in q2-dada2? If so, we can create an issue for this on the q2-dada2 issue tracker.

@ebolyen filled me in on the general problem of breakaway needing information on singletons while many of the recent quality control methods filter them out. It might be worth noting in this document that this is something that is being actively explored.

It looks like you might not have the right citation for q2-breakaway - I'm guessing this is a placeholder, though it's not a bad paper title. :slight_smile:

qiime breakaway --citations
% use `qiime tools citations` on a QIIME 2 result for complete list

@article{key0,
 author = {Willis, Amy},
 title = {q2-breakaway: alpha diversity, so much better}
}

Thanks again for this contribution!


Thanks @gregcaporaso for the helpful feedback!

I’ve edited the document and fixed the citations in breakaway as suggested. I forgot about the updated q2-types, so my apologies for the slip, and thanks to @ebolyen for catching it!

For the pooling option in q2-dada2, I believe there is already an open issue #87 that was posted by @benjjneb.

I’m looking forward to expanding breakaway's functionality for QIIME 2 users over the next month. Thanks for your support!


Hi @Pauline_Trinh, I was wondering if there were any updates to q2-breakaway coming soon to include the other functions that allow for hypothesis testing? Thanks!
