I want to take this opportunity to respond to Dr. Edgar’s publication. Before beginning, it is only fair to you as the reader to have full disclosure on both sides. Dr. Edgar has an important conflict of interest: he is profiting from users he pulls away from QIIME to his commercial software (see his Competing Interests note). I also have a conflict of interest in that grants to develop and improve the open-source QIIME 2 project fund a significant amount of the work conducted in my lab. However, I think it is important to respond to Dr. Edgar’s publication with a point of view from the QIIME development team, because many of his comparisons and claims are inaccurate or outdated.
Most importantly: His work is an analysis of QIIME 1, not QIIME 2. We are no longer recommending that you use QIIME 1 (as of July 2017) but rather that new users start with QIIME 2 and existing users begin transitioning to QIIME 2 now. We will stop supporting QIIME 1 on 1 January 2018.
You should not need to re-analyze or discard results that you generated with QIIME 1. Better methods are now available, but the old results are still valid and meaningful. If you want to convince yourself of this, I encourage you to use a more recent pipeline for your microbiome analysis and compare the conclusions that you come to with those that you derived from your QIIME 1 analysis. I expect that you will come to the same conclusions in most cases, but let us know on the forum if you don’t and we can help you interpret those differences. The situation is similar to when we transitioned from using uclust for read clustering, or from Sanger to 454 to Illumina for sequencing: the methods improve, but old results are typically not invalidated by the advances.
QIIME 2 wraps improved quality control methods that include chimera removal (DADA2 and Deblur). This appears to largely address the well-known “OTU inflation issue” that plagued the previous generation of microbiome bioinformatics tools. Dr. Edgar notes:
“many, probably most, of the spurious OTUs obtained with noisy reads are caused by inadequate error filtering”
“Spurious OTUs are also caused by chimeras, which are known to be ubiquitous in 16S rRNA amplicon sequences but are not filtered…”
As expected, improving quality control and including chimera checking results in much more realistic measures of community richness in QIIME 2 than we achieved with QIIME 1. Further, inflated and uninflated (e.g., post-denoising of 454 or Illumina data) estimates of alpha diversity have been observed to be strongly correlated, so both are likely to be useful for most ecological purposes as long as consistent methods are used.
We advise against OTU clustering altogether in QIIME 2 in favor of working with amplicon sequence variants (ASVs). The vsearch-based OTU clustering methods that we added in QIIME 2 2017.9 (released in Sept 2017) are there by popular demand for our users who are not ready to abandon OTU clustering yet (though we implemented them in a way that encourages comparison of OTU tables against amplicon sequence variant tables). Previous versions of QIIME 2 only supported ASV-based analyses.
OTU clustering and taxonomy assignment in QIIME 1 use Dr. Edgar’s software, uclust v1.2.22q (released in 2009). QIIME 1 is simply wrapping Dr. Edgar’s software. This was the best software at the time, but science and technology progress, and it should not be surprising that we have better options now.
The issue of closed-reference OTU picking failing to map reads of different variable regions of a sequence to the same reference sequence is interesting and highlights a potential area for improvement if this is still relevant with more recent tools. That said, the approach has been practically useful in many meta-analyses of studies that use different primer pairs, frequently recapitulating results that have been obtained from studies that use a single primer pair. It’s worth noting, as Dr. Edgar acknowledges, that we should expect phylogenetic methods such as UniFrac (and Faith’s Phylogenetic Diversity) to minimize the impact of this issue:
“Weighted UniFrac was found to report small distances between identical (or very similar) mock samples despite high rates of spurious OTUs and substantial divergences in which spurious OTUs were present.”
QIIME (1 and 2) has always encouraged the use of these metrics.
While using data from and repeatedly citing the Bokulich 2012 paper in which QIIME 1 quality filtering methods were benchmarked, Dr. Edgar failed to acknowledge and implement the main conclusions of that paper: first, that more stringent filtering parameters should be set in split_libraries_fastq.py than the default; and second, that those methods require OTU abundance filtering to remove low-abundance OTUs emerging most frequently from contamination, chimera, and sequencing/PCR errors. The default in QIIME 1 was purposely set to be liberal, to optimize for keeping more sequences in the initial quality control phase at the expense of inflated estimates of community richness. The updated quality control approaches used in QIIME 2 thus far seem to allow us to achieve very stringent quality control while also retaining large numbers of sequences per sample.
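To make the second point concrete, here is a minimal Python sketch of what OTU abundance filtering means in practice: drop any OTU whose total count is below some small fraction of all reads in the table. This is only an illustration, not QIIME code. The function name, the toy table, and the 0.005% default threshold are assumptions chosen for the example; pick a threshold that suits your own data.

```python
def filter_low_abundance_otus(table, min_fraction=0.00005):
    """Drop OTUs whose total count is below min_fraction of all reads.

    table: dict mapping OTU id -> dict mapping sample id -> read count.
    min_fraction: hypothetical default of 0.005% of all reads in the table
    (an assumption for this sketch, not a value taken from this post).
    """
    total_reads = sum(sum(counts.values()) for counts in table.values())
    if total_reads == 0:
        return {}
    min_count = min_fraction * total_reads
    kept = {}
    for otu_id, counts in table.items():
        # Keep the OTU only if its summed abundance reaches the threshold.
        if sum(counts.values()) >= min_count:
            kept[otu_id] = dict(counts)
    return kept


# Toy table (made-up counts, for illustration only): 100,000 reads in total,
# so the threshold works out to 5 reads.
toy_table = {
    "OTU_1": {"sample_A": 60000, "sample_B": 39990},
    "OTU_2": {"sample_A": 6, "sample_B": 2},
    "OTU_3": {"sample_A": 1, "sample_B": 1},
}

filtered = filter_low_abundance_otus(toy_table)
print(sorted(filtered))  # OTU_3 (2 reads in total) falls below the threshold
```

Low-count OTUs like the last one are the ones most likely to come from contamination, chimeras, or sequencing/PCR errors, which is why a filter of this kind matters when the upstream quality control is liberal.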
Finally, join us in developing QIIME 2! We’re building an open and supportive community of scientists and software engineers to facilitate advances in the exciting field of microbiome science. QIIME 2 is free and open source software, which is important for its accessibility to researchers and reproducibility/validation of the software itself. We need your help integrating the latest microbiome analyses, visualizations, and statistics into the QIIME 2 ecosystem. We need unbiased and reproducible benchmarks to understand and compare new and improved methods to techniques that were previously state-of-the-art.
Let’s make microbiome bioinformatics better together! If you’re interested in getting involved with the QIIME 2 project, get in touch on the QIIME 2 forum.
Thanks to Nick Bokulich, Emily Cope, Matthew Dillon, Rob Knight, Talima Pearson, and Jai Rideout for valuable input while preparing this post.