Rarefaction or not before calculating "niche breadth", "rare species" and "abundant species"?

YuZhang · May 25, 2022, 2:48pm

Dear all,
I would like to know whether it is necessary to rarefy the data before calculating "niche breadth", "rare species" and "abundant species". As I understand it, in the QIIME 2 pipeline the data for alpha and beta diversity need to be rarefied, while the data for relative abundance do not.
Thanks.

jwdebelius · May 25, 2022, 3:16pm

Hi @YuZhang,

I think it depends on how you're measuring/partitioning these. I'm not sure how you're defining each term. I would expect rarefaction to have a bigger effect on your "rare features" measurement, since low-count features are the most likely to be lost when you subsample. (Rarefaction tends to add sparsity.)
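The sparsity effect can be illustrated with a small sketch. It assumes rarefaction means drawing a fixed number of reads without replacement from one sample; the `rarefy` helper, the feature names and the counts are all hypothetical and only serve the illustration.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a {feature: count} dict."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand to one entry per read, then draw without replacement.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))

# Hypothetical sample: two dominant features plus 20 singletons.
sample = {"abundant_1": 5000, "abundant_2": 3000}
sample.update({f"rare_{i}": 1 for i in range(20)})

rarefied = rarefy(sample, depth=1000)
lost = [f for f in sample if f not in rarefied]
print(f"{len(lost)} of {len(sample)} features dropped to zero after rarefying")
```

The dominant features survive the subsampling, while most singletons disappear, so any "rare" category defined on the rarefied table can look quite different from one defined on the full table.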

I do think you'll have to consider depth in some way; I just don't know that rarefaction is your best approach.

Best,
Justine

YuZhang · May 26, 2022, 1:46am

Thanks.
First, the calculation of "rare species" and "abundant species" is based on the following article:

Liang, Y.; Xiao, X.; Nuccio, E. E.; Yuan, M.; Zhang, N.; Xue, K.; Cohan, F. M.; Zhou, J.; Sun, B., Differentiation strategies of soil rare and abundant microbial taxa in response to changing climatic regimes. Environ Microbiol 2020, 22, (4), 1327-1340.
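For illustration, a classification of this kind can be sketched as below. This is not taken from the cited article: the cutoffs (mean relative abundance below 0.01% for rare, at or above 0.1% for abundant), the function names and the toy table are assumptions, and the actual definitions should be checked against the paper.

```python
def relative_abundance(table):
    """Convert {sample: {taxon: count}} into per-sample proportions."""
    rel = {}
    for sample, counts in table.items():
        depth = sum(counts.values())  # assumes every sample has at least one read
        rel[sample] = {taxon: n / depth for taxon, n in counts.items()}
    return rel

def classify_taxa(table, rare_max=0.0001, abundant_min=0.001):
    """Label each taxon by its mean relative abundance across samples.

    The default thresholds are illustrative assumptions.
    """
    rel = relative_abundance(table)
    taxa = {taxon for counts in table.values() for taxon in counts}
    labels = {}
    for taxon in taxa:
        mean = sum(rel[s].get(taxon, 0.0) for s in rel) / len(rel)
        if mean < rare_max:
            labels[taxon] = "rare"
        elif mean >= abundant_min:
            labels[taxon] = "abundant"
        else:
            labels[taxon] = "intermediate"
    return labels

# Hypothetical raw (unrarefied) counts for two samples of different depth.
table = {
    "s1": {"A": 9000, "B": 995, "C": 5},
    "s2": {"A": 19980, "C": 19, "D": 1},
}
labels = classify_taxa(table)
print(labels)
```

Because each sample is divided by its own depth, no rarefaction is needed for the proportions themselves, but a taxon seen once in a shallow sample gets a much larger proportion than one seen once in a deep sample, which is one reason depth still matters.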

Then the "niche breadth" is based on the expression in the following article:

Zhang, J.; Zhang, B.; Liu, Y.; Guo, Y.; Shi, P.; Wei, G., Distinct large-scale biogeographic patterns of fungal communities in bulk soil and soybean rhizosphere in China. Sci Total Environ 2018, 644, 791-800.
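As a point of reference, a commonly used niche breadth measure is Levins' index, B = 1 / sum(P_i^2), where P_i is the share of a taxon's total abundance found in sample i. The sketch below assumes that this is the expression meant here; the function name and the numbers are hypothetical.

```python
def levins_niche_breadth(abundances):
    """Levins' index for one taxon, given its abundance in each sample."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("taxon is absent from every sample")
    return 1.0 / sum((a / total) ** 2 for a in abundances)

# A taxon spread evenly over four samples versus one found in a single sample.
even = levins_niche_breadth([10, 10, 10, 10])
specialist = levins_niche_breadth([40, 0, 0, 0])

# The same taxon at 1% relative abundance in two samples whose
# sequencing depths are 1,000 and 100,000 reads.
from_raw_counts = levins_niche_breadth([10, 1000])
from_proportions = levins_niche_breadth([0.01, 0.01])
print(even, specialist, from_raw_counts, from_proportions)
```

With raw counts the deeper sample dominates and the taxon looks like a specialist, while with per-sample proportions it looks evenly distributed. So the input table (raw, relative, or rarefied) changes the result, and depth has to be accounted for in some way.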

Secondly, some articles calculate the niche breadth based on the Shannon-Wiener index (alpha diversity?):

Zhang, Lan, Guowen Huang, Yongtao Li, and Shitai Bao. 2021. "Quantitative Research Methods of Linguistic Niche and Cultural Sustainability" Sustainability 13, no. 17: 9586.
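A Shannon-Wiener niche breadth is usually written B = -sum(P_i * ln P_i). It uses the same formula as Shannon alpha diversity, but it is applied to one taxon across samples rather than to all taxa within one sample. A minimal sketch, assuming that form is the one used in the article (the function name and values are hypothetical):

```python
import math

def shannon_niche_breadth(abundances):
    """Shannon-Wiener niche breadth for one taxon across samples."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("taxon is absent from every sample")
    # Zero entries contribute nothing to the sum, so they are skipped.
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)

even = shannon_niche_breadth([10, 10, 10, 10])   # equals ln(4)
specialist = shannon_niche_breadth([40, 0, 0, 0])
print(even, specialist)
```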

Thus, is it correct to use rarefied data to calculate alpha and beta diversity, and to use raw data (without rarefaction) for the other analyses, for example "niche breadth", "rare species" and "abundant species", and differentially abundant species (DESeq2 or ANCOM)?

jwdebelius · May 26, 2022, 2:46pm

Hi @YuZhang,

First, these are general recommendations, because the answer is always It depends^TM, as a function of your data and your analyses.

My general recommendation would be to rarefy before alpha diversity, particularly richness, and before most beta diversity metrics. (Aitchison distance and DEICODE don't need rarefaction, and there's some debate around Bray-Curtis; YMMV.) You may find this article on rarefaction interesting:

https://academic.oup.com/bioinformatics/article-abstract/38/9/2389/6536959
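To illustrate why the Aitchison distance does not need rarefaction: it is the Euclidean distance between centered log-ratio (CLR) transformed samples, and the CLR transform cancels out the total read count. The sketch below uses hypothetical counts without zeros; real tables contain zeros, which must be replaced first (for example with a pseudocount), and that choice is an assumption that affects the result.

```python
import math

def clr(counts):
    """Centered log-ratio transform of strictly positive values."""
    if any(c <= 0 for c in counts):
        raise ValueError("CLR needs strictly positive values; replace zeros first")
    logs = [math.log(c) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def aitchison_distance(a, b):
    """Euclidean distance between two CLR-transformed samples."""
    return math.dist(clr(a), clr(b))

shallow = [10, 20, 70]     # 100 reads
deep = [100, 200, 700]     # same composition, 1,000 reads
other = [50, 30, 20]       # a different composition

d_shallow = aitchison_distance(shallow, other)
d_deep = aitchison_distance(deep, other)
d_same = aitchison_distance(shallow, deep)
print(d_shallow, d_deep, d_same)
```

The first two distances are equal and the third is zero (up to floating-point error), because only the ratios between features enter the calculation, not the depth.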

It looks like the abundance methods you're looking at are based on relative abundance. I would not rarefy before calculating this, but again, you may want to look at depth.

For differential abundance, I think the consensus is not to rarefy. But the whole question of differential abundance is its own can of worms. I'll refer you to a recent paper by Nearing et al. for more discussion:

https://www.nature.com/articles/s41467-022-28034-z

Best,
Justine