Thank you for sharing this awesome post! While I don’t have exact answers to your questions, I’ll discuss a couple of points that may be useful in understanding this.
First, the classic OTU clustering method is far from ideal and is very prone to spurious calls, and in my opinion (and I’m sure many others’) it should not be used except under very specific conditions. There is enough reading on this (ex1, ex2, ex3) to steal a solid weekend away from you. So moving forward I won’t spend much time discussing the OTU method, except to say that what you see is typical of it and there are many potential reasons behind it; the important takeaway is that it is clearly not as accurate as denoising methods, so why even bother going there.
The second point I wanted to make is that using species richness as a measure of accuracy may not be the best way to compare these pipelines, at least not without some error estimation. There is some really good work by Dr. Amy Willis’ group on this issue, described in a couple of pre-prints here and here, and in fact they have developed both R packages and QIIME2 plugins to estimate alpha diversity with error estimates. Look for their breakaway and DivNet packages for more info. These methods rely on the presence of singletons and rare taxa to model and predict unobserved species, which gives you a more accurate estimate (and error bars!) of the true richness. As far as I know no one has benchmarked these denoising pipelines using estimated richnesses, so that might be a natural next step here! Speaking of benchmarks, similar to what you have found, there are two other studies that I know of (Nearing et al. and Caruso et al.) that have compared these pipelines and discuss their differences in more detail. Certainly worth a read for the intrigued.
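To illustrate why those rare taxa matter for richness estimation: breakaway fits its own statistical model to the full frequency-count data, but the classic Chao1 estimator is the simplest example of the same idea, extrapolating unobserved taxa from singleton and doubleton counts. A toy Python sketch (my own illustration, not breakaway’s actual method):

```python
from collections import Counter

def chao1(counts):
    """Classic Chao1 richness estimate from per-taxon read counts.

    Uses the number of singletons (f1) and doubletons (f2) to
    extrapolate unobserved taxa: S_obs + f1^2 / (2 * f2).
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)
    freqs = Counter(counts)
    f1, f2 = freqs.get(1, 0), freqs.get(2, 0)
    if f2 == 0:
        # bias-corrected form avoids division by zero
        return s_obs + f1 * (f1 - 1) / 2
    return s_obs + f1 ** 2 / (2 * f2)

# A toy ASV table: 3 singletons, 2 doubletons, 3 abundant taxa
counts = [1, 1, 1, 2, 2, 10, 25, 40]
print(chao1(counts))            # 8 observed + 3^2/(2*2) = 10.25

# If a pipeline silently drops singletons, the estimator is starved
# and collapses back to the observed richness:
no_singletons = [c for c in counts if c > 1]
print(chao1(no_singletons))     # 5 observed + 0 unobserved = 5.0
```

The second call shows the problem in miniature: a denoiser that removes singletons makes any singleton-based richness estimator blind to unseen diversity, which is exactly why pipeline choice and estimator choice interact.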
With those out of the way, let’s get specific:
Aside from the fact that OTU clustering is known for spurious calls, the current DADA2 default (pool = FALSE) makes an active effort to discard singletons, as they are too hard to call correctly (discussion on this here). In fact, when running single-end reads there will be no singletons in your table at all; compared to other methods that allow singletons, you will see a drastically reduced number of ASVs. The DADA2-paired method can still contain singletons, since they can reappear after merging, but I would still expect the number of ASVs to be lower with the DADA2 default because of its conservative handling of singletons. In this regard I can’t speak to UNOISE3 and its zOTUs, as I’m just not familiar enough with its methods.
The pooled option in DADA2 is important here, as it allows the inclusion of singletons at the cost of significantly higher computation/running time. With pool = FALSE, DADA2 analyzes the data on a sample-by-sample basis and discards singletons within each sample; with the pooled option, information is shared between samples, so singletons are evaluated across the whole dataset rather than per sample. This leads to retaining more ASVs and is the preferred approach when trying to estimate alpha diversity with the breakaway package.
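The per-sample vs. pooled difference can be seen with a deliberately crude toy model. This is not DADA2’s actual algorithm (which learns error rates and infers ASVs probabilistically, not by a raw count cutoff); it is just a sketch of why pooling rescues sequences that are singletons in every individual sample:

```python
from collections import Counter

def denoise_independent(samples, min_count=2):
    """Toy stand-in for pool = FALSE: each sample is processed alone,
    so any sequence seen only once in a sample is discarded there."""
    kept = []
    for sample in samples:
        counts = Counter(sample)
        kept.append({seq for seq, n in counts.items() if n >= min_count})
    return kept

def denoise_pooled(samples, min_count=2):
    """Toy stand-in for pool = TRUE: reads are summed across samples,
    so a sequence that is rare everywhere can still clear the bar."""
    total = Counter()
    for sample in samples:
        total.update(sample)
    keep = {seq for seq, n in total.items() if n >= min_count}
    return [keep & set(sample) for sample in samples]

# "seqB" appears exactly once in each sample individually...
samples = [
    ["seqA"] * 5 + ["seqB"],
    ["seqA"] * 4 + ["seqB"],
    ["seqA"] * 6 + ["seqB"],
]
print(denoise_independent(samples))  # seqB lost from every sample
print(denoise_pooled(samples))       # seqB kept (3 reads dataset-wide)
```

In the independent run, seqB never survives; in the pooled run its three dataset-wide reads keep it in every sample. The same trade-off holds at scale, which is why pooling costs so much more compute: every sample’s reads have to be considered together.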
That is good to hear! Otherwise we might have had to reanalyze two decades’ worth of sequencing data. This is in line with what I’ve found in my own work and what others report. In the grand scheme of things, the order in which we can rely on microbiome data results seems to be: overall patterns > relative abundances > alpha diversities.
Thanks again for sharing this and I look forward to others’ take on this.