A response to “Accuracy of microbial community diversity ...” by R Edgar for QIIME users

gregcaporaso · October 10, 2017, 3:07pm

I want to take this opportunity to respond to Dr. Edgar’s publication. Before beginning, it is only fair to you as the reader to have full disclosure on both sides. Dr. Edgar has an important conflict of interest: he is profiting from users he pulls away from QIIME to his commercial software (see his Competing Interests note). I also have a conflict of interest in that grants to develop and improve the open-source QIIME 2 project fund a significant amount of the work conducted in my lab. However, I think it is important to respond to Dr. Edgar’s publication with a point of view from the QIIME development team, because many of his comparisons and claims are inaccurate or outdated.

1. Most importantly: His work is an analysis of QIIME 1, not QIIME 2. We are no longer recommending that you use QIIME 1 (as of July 2017) but rather that new users start with QIIME 2 and existing users begin transitioning to QIIME 2 now. We will stop supporting QIIME 1 on 1 January 2018.
2. You should not need to re-analyze or discard results that you generated with QIIME 1. Better methods are now available, but the old results are still valid and meaningful. If you want to convince yourself of this, I encourage you to use a more recent pipeline for your microbiome analysis and compare the conclusions that you come to with those that you derived from your QIIME 1 analysis. I expect that you will come to the same conclusions in most cases, but let us know on the forum if you don’t and we can help you interpret those differences. The situation is similar to when we transitioned from using cd-hit to uclust for read clustering, or from Sanger to 454 to Illumina for sequencing: the methods improve, but old results are typically not invalidated by the advances.
3. QIIME 2 wraps improved quality control methods that include chimera removal (DADA2 and Deblur). This appears to largely address the well-known “OTU inflation issue” that plagued the previous generation of microbiome bioinformatics tools. Dr. Edgar notes:

many, probably most, of the spurious OTUs obtained with noisy reads are caused by inadequate error filtering

and

Spurious OTUs are also caused by chimeras, which are known to be ubiquitous in 16S rRNA amplicon sequences but are not filtered...

As expected, improving quality control and including chimera checking results in much more realistic measures of community richness in QIIME 2 than we achieved with QIIME 1. Further, inflated and uninflated (e.g., post-denoising of 454 or Illumina data) estimates of alpha diversity have been observed to be strongly correlated, so both are likely to be useful for most ecological purposes as long as consistent methods are used.
4. We advise against OTU clustering altogether in QIIME 2 in favor of working with amplicon sequence variants (ASVs). The vsearch-based OTU clustering methods that we added in QIIME 2 2017.9 (released in Sept 2017) are there by popular demand for our users who are not ready to abandon OTU clustering yet (though we implemented them in a way that encourages comparison of OTU tables against amplicon sequence variant tables). Previous versions of QIIME 2 only supported ASV-based analyses.
5. OTU clustering and taxonomy assignment in QIIME 1 use Dr. Edgar’s software, uclust v1.2.22q (released in 2009). QIIME 1 is simply wrapping Dr. Edgar’s software. This was the best software at the time, but science and technology progress, and it should not be surprising that we have better options now.
6. The issue of closed-reference OTU picking failing to map reads of different variable regions of a sequence to the same reference sequence is interesting and highlights a potential area for improvement if this is still relevant with more recent tools. That said, the approach has been practically useful in many meta-analyses of studies that use different primer pairs, frequently recapitulating results that have been obtained from studies that use a single primer pair. It’s worth noting, as Dr. Edgar acknowledges, that we should expect phylogenetic methods such as UniFrac (and Faith’s Phylogenetic Diversity) to minimize the impact of this issue:

Weighted UniFrac was found to report small distances between identical (or very similar) mock samples despite high rates of spurious OTUs and substantial divergences in which spurious OTUs were present.

QIIME (1 and 2) has always encouraged the use of these metrics.
7. While using data from and repeatedly citing the Bokulich 2013 paper in which QIIME 1 quality filtering methods were benchmarked, Dr. Edgar failed to acknowledge and implement the main conclusions of that paper: first, that more stringent filtering parameters should be set in split_libraries_fastq.py than the default; and second, that those methods require OTU abundance filtering to remove low-abundance OTUs emerging most frequently from contamination, chimeras, and sequencing/PCR errors. The default in QIIME 1 was purposely set to be liberal, to optimize for keeping more sequences in the initial quality control phase at the expense of inflated estimates of community richness. The updated quality control approaches used in QIIME 2 thus far seem to allow us to achieve very stringent quality control while also retaining large numbers of sequences per sample.
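
For readers unfamiliar with the OTU abundance filtering mentioned above, the following is a minimal Python sketch of the general idea: drop any OTU whose reads make up less than some fraction of all reads in the table. It is not the QIIME implementation. The function name, the table layout, the toy counts, and the threshold are all assumptions made for illustration, and the threshold in particular is not a recommended value.

```python
from typing import Dict

# Hypothetical OTU table layout: OTU ID -> {sample ID -> read count}.
OtuTable = Dict[str, Dict[str, int]]


def filter_low_abundance_otus(table: OtuTable, min_fraction: float) -> OtuTable:
    """Keep OTUs whose reads make up at least min_fraction of all reads."""
    total_reads = sum(sum(counts.values()) for counts in table.values())
    if total_reads == 0:
        return {}
    return {
        otu_id: counts
        for otu_id, counts in table.items()
        if sum(counts.values()) / total_reads >= min_fraction
    }


# Toy counts, invented for illustration only (20,000 reads in total).
toy_table = {
    "OTU_A": {"sample1": 6000, "sample2": 5500},
    "OTU_B": {"sample1": 4000, "sample2": 4490},
    "OTU_C": {"sample1": 1, "sample2": 0},  # a single read across the whole table
    "OTU_D": {"sample1": 0, "sample2": 9},
}

# Illustrative threshold (an assumption): 0.01% of all reads.
filtered = filter_low_abundance_otus(toy_table, min_fraction=0.0001)
print(sorted(filtered))  # ['OTU_A', 'OTU_B', 'OTU_D']
```

In this toy table the single-read OTU falls below the threshold and is removed, while the abundant OTUs are untouched. The appropriate threshold depends on the data set and sequencing depth.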

Finally, join us in developing QIIME 2! We’re building an open and supportive community of scientists and software engineers to facilitate advances in the exciting field of microbiome science. QIIME 2 is free and open source software, which is important for its accessibility to researchers and reproducibility/validation of the software itself. We need your help integrating the latest microbiome analyses, visualizations, and statistics into the QIIME 2 ecosystem. We need unbiased and reproducible benchmarks to understand and compare new and improved methods to techniques that were previously state-of-the-art.

Let’s make microbiome bioinformatics better together! If you’re interested in getting involved with the QIIME 2 project, get in touch on the QIIME 2 forum.

Thanks to Nick Bokulich, Emily Cope, Matthew Dillon, Rob Knight, Talima Pearson, and Jai Rideout for valuable input while preparing this post.

Robert_Edgar · October 10, 2017, 8:43pm

I am glad to see that Dr. Caporaso does not dispute that the results and conclusions reported in the paper are correct.

Comment #2: Dr. Caporaso states that "You should not need to re-analyze or discard results that you generated with QIIME 1." I disagree. The large majority of OTUs are spurious and the majority of predicted taxonomies are probably wrong if you followed the recommended procedures for QIIME 1.

Comment #7: Dr. Caporaso suggests that I did not use improvements recommended in Bokulich et al. 2013. In fact, I used the procedures recommended by the online QIIME documentation as of mid-2017, so if there are any better recommendations in Bokulich et al. 2013, the QIIME developers themselves did not adopt them.

Giorgio_Casaburi · October 11, 2017, 4:18pm

We have used the QIIME 1 pipeline with the recommendations in Bokulich et al. on several occasions. Independent laboratories using different technologies (e.g., from pure molecular biology techniques to MALDI-TOF detection) have confirmed QIIME taxonomic profiles. We acknowledge that computational biology is not an exact science per se and that there are several biases. However, claiming or even suggesting that more than 50% of the genera profiled are false positives does seem a little exaggerated. If true, that would mean that more than five thousand scientific papers published in the last 5 years are not accurate. I don't believe there is a perfect tool out there, but there are a few quite accurate ones, like QIIME, with a great community of people from different backgrounds. More importantly, tools like QIIME are free, which in turn makes science available to anyone and any institution, especially those in less resource-rich countries that cannot always afford private solutions. We will continue using free tools like QIIME, in the name of free science and as free digital citizens. Thank you for all your hard work.

yeojuny · October 12, 2017, 4:37am

Well said!
Speaking as a QIIME user, I also thank you for upgrading and modifying QIIME all the time to keep up with recent issues.

Robert_Edgar · October 12, 2017, 12:53pm

I agree with Dr. Casaburi that if my results are correct then the conclusions of many published papers could be called into question. My prediction that a majority of genus names are likely to be false positives on typical data is based on (1) mock community tests in the PeerJ paper, which are unrealistically easy for taxonomy prediction so may underestimate the error rate, and (2) cross-validation on sequences with known taxonomy (see "SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS sequences" on bioRxiv and https://drive5.com/usearch/manual/tax_bench.html). I would be interested to learn more about his methods for validating taxonomic profiles and invite him -- or anyone else with alternative approaches -- to contact me directly to discuss. If there is good evidence that contradicts my results, I will post a comment or correction as appropriate.

colinbrislawn · October 16, 2017, 4:56pm

I deeply appreciate dissent that is reasonable, respectful, and passionate.

Thank you for posting on the qiime forums!

Luke · October 23, 2017, 4:45pm

I always worry that the "you get what you pay for" adage might apply when people are ardent supporters of either the status quo or their "own" methods and their egos inhibit acknowledging weaknesses, but I do enjoy the open debate. As for the possibility that we have created, and will continue to create, novel nonexistent OTUs, I see nothing that suggests that this will change in the near future, do you?

MaryKable · November 2, 2017, 7:34pm

Thank you for starting this great discussion. My lab recently completed a direct comparison of the open reference OTU picking method from QIIME 1 and the DADA2 method using a mock community (available here: https://www.biorxiv.org/content/early/2017/11/01/212134), which I thought might add to this discussion since it confirms many of the points made in the original post.

Regarding point number 2 above, we confirm that taxa present at greater than 1% relative abundance were not incorrectly identified using the QIIME 1 open reference OTU method, they were simply not as precisely identified as they can be with the newer DADA2 method. Regarding points 3 and 4, we did observe a dramatic reduction in the number of sequence variants detected using DADA2 relative to QIIME 1.

telatin · February 2, 2018, 1:12pm

I agree with your post but not with the last part:

First, research is very expensive, and emphasizing that one small open-source piece of a very expensive process is free is quite misleading. The FASTQ files you analyze cost a lot (sequencing being just the tip of the iceberg, after sample collection, purification, etc.), and they deserve a good analysis, which could even include proprietary software.

Second, in a scientific debate metrics should be shared. Either a piece of software performs well (in terms of the desired outcome) under certain assumptions or it does not. In this specific case, the bigger impact may come from the user rather than the software: the user can naively follow a tutorial (QIIME is prone to this, in my experience) or critically evaluate each step, tuning it to his/her experimental setup.

Giorgio_Casaburi · February 5, 2018, 3:35pm

Yes, indeed research is expensive - so help me celebrate the good fortune of having at least some savings from using amazing free tools like QIIME.

First: I always tend to reject papers that I review which do not share their methods and just have a couple of sentences such as "we used this XXX proprietary software for the analysis" - as if we all accepted the "logic equation" that since it is expensive software, then it is definitely going to be the best option: wrong!!!

Second: QIIME is mainly a wrapper and thus it offers the possibility of using several different tools and/or statistical options. There are plenty of papers that have investigated the different parameters and options that one can use when performing a 16S amplicon analysis in QIIME or similar tools. The tutorials in QIIME are just baselines/examples and thus it is the responsibility of the users - not the developers - to make sure that the parameters they are using are suitable and robust. I think this should be pretty basic.

Here, the tip of the discussion was based on the "open-reference OTU" topic, which has been used in more than 5K papers, more or less. Anyway, I have personally confirmed QIIME predictions with other methods, as I already mentioned. Most recently, I performed shotgun sequencing on the same samples where the 16S was performed; comparing the QIIME 16S predictions with the shotgun analysis, the taxonomic assignments were almost 100% comparable in relative abundance.

The 16S paper is published already (http://msphere.asm.org/content/2/6/e00501-17) - in case you want to see metrics. The second is under review but I will definitely share the link once it is available.