Pielou’s evenness, Faith's PD and Shannon’s diversity indices for ASVs instead of "species"

timanix · May 30, 2020, 9:52am

Dear all!
Could you recommend an article I can cite to show that it is "legal", or appropriate, to estimate alpha diversity metrics from ASV tables instead of species or OTUs?
In my case, I repeated the alpha diversity commands on OTUs clustered at 97%, and the results were extremely similar to those obtained with ASVs. So I would like to keep only the ASV-based metrics, but I would also like to cite a proper paper for it. I think I read about this somewhere, but I can't find the paper.

Mehrbod_Estaki · May 30, 2020, 9:05pm

Hi @timanix,
I'm not sure such a paper exists, in the same way that you wouldn't be able to find a paper saying that estimating diversity with 97% similarity OTUs is appropriate. The idea that a 3% difference in short 16S regions can distinguish species is quite misleading, but 100% similarity in certain regions doesn't necessarily mean identical species either. However, if you are looking for a paper to support the use of ASVs, I would recommend this piece by the developers of DADA2:
Exact Sequence Variants Should Replace Operational Taxonomic Units in Marker-Gene Data Analysis

jwdebelius · May 30, 2020, 9:41pm

Hi @timanix,

I would just calculate them based on ASVs and make it clear that you've done so (use "observed ASVs" in place of "observed X"), and call it a day.

Best,
Justine

Nicholas_Bokulich · May 30, 2020, 10:23pm

I can't think of any papers explicitly showing that alpha diversity estimates are more accurate with ASVs than OTUs, but here are two ideas:

1. Point to the original DADA2 paper, which shows that OTU clustering yields inaccurate sequence variants, whereas DADA2 yields far fewer (or no?) false positives. This essentially indicates better alpha diversity estimates. The paper that @Mehrbod_Estaki linked to is a good perspective on why ASVs should replace OTUs in general, so it would be a good reference to support your case.
2. If you still need more evidence, there are various papers out there (including my own) showing that OTU clustering methods overestimate alpha diversity. In your analysis unique OTUs ~= unique ASVs, but this is still consistent: by clustering sequences at 97% you lose the species and sub-species resolution that DADA2 captures (reducing observed diversity) but still fail to detect and remove some false positives (increasing observed diversity). These effects balance out so that observed OTUs ~= observed ASVs in your case... results vary due to myriad factors (various users on this forum report higher or lower alpha diversity with OTUs vs. ASVs, or similar ballpark estimates, as you see).
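
To make the "balancing out" argument concrete, here is a minimal toy sketch in plain Python. It is not taken from anyone's actual analysis in this thread: the count vectors are made-up numbers, the function names are hypothetical, and natural-log Shannon is an arbitrary choice here (other tools may default to a different base). Faith's PD is left out because it needs a phylogenetic tree. The point is only that these metrics operate on a vector of feature counts and do not care whether a feature is an ASV, an OTU, or a species.

```python
import math

def observed_features(counts):
    """Number of features (ASVs, OTUs, ...) with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def shannon(counts, base=math.e):
    """Shannon index H = -sum(p_i * log(p_i)) over non-zero features."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h

def pielou_evenness(counts):
    """Pielou's J = H / ln(S); returns 0.0 when fewer than 2 features
    are observed (a convention assumed for this sketch)."""
    s = observed_features(counts)
    if s < 2:
        return 0.0
    return shannon(counts) / math.log(s)

# Hypothetical sample with four ASVs (toy counts).
asv_counts = [40, 30, 20, 10]

# Suppose 97% clustering merges the first two ASVs into a single OTU:
# resolution is lost, so observed features and Shannon both drop.
otu_counts = [70, 20, 10]

# Suppose the OTU pipeline also retains one spurious, error-derived
# feature: the observed count goes back up to match the ASV table.
otu_counts_with_spurious = [70, 20, 10, 1]

for label, counts in [
    ("ASVs", asv_counts),
    ("OTUs (merged)", otu_counts),
    ("OTUs (merged + spurious)", otu_counts_with_spurious),
]:
    print(label, observed_features(counts),
          round(shannon(counts), 3), round(pielou_evenness(counts), 3))
```

In this toy case the merged-plus-spurious OTU table ends up with the same number of observed features as the ASV table, even though the two tables describe different things. Real data will of course behave in more complicated ways.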

timanix · May 31, 2020, 7:11am

Thank you, I think that I will cite this paper.

Basically, that is what I did, but one of the reviewers pointed out that I need either to explain better why I used ASVs instead of species, or to redo it with OTUs.

That is exactly what my supervisor told me. I just thought that maybe I could cite a paper to avoid writing too much explanation. The two papers I was pointed to are what I was looking for, thank you!

Alpha diversity was a little lower with the clustered tables than with ASVs, but I am mostly interested in comparing alpha diversity indices between niches, and the pattern was the same.

Thanks to all of you for your answers! That's really helpful.

SoilRotifer · May 31, 2020, 3:51pm

Hi @timanix, et al. To extend some of what @Nicholas_Bokulich highlighted, I'd refer you to the following:

Blog posts from Noah Fierer

This paper from Glassman & Martiny 2018
This paper from Rob Edgar 2018 (he is the developer of usearch).

I might be missing some, but these should help alleviate any concerns of OTUs vs (A/E)SVs.

-Mike

timanix · June 1, 2020, 5:22am

@SoilRotifer Thank you for the useful articles and links!