Rarefaction or not before calculating "niche breadth", "rare species" and "abundant species"

Dear all,
I want to know whether it is necessary to rarefy the data before calculating "niche breadth", "rare species" and "abundant species". I ask because, as I understand it, data for alpha and beta diversity need to be rarefied in the QIIME 2 pipeline, while data for relative abundance do not.

Hi @YuZhang,

I think it depends on how you're measuring/partitioning these. I'm not sure how you're defining each term. I would expect rarefaction to have a bigger effect on your "rare features" measurement, since rare features are the ones most likely to be lost during subsampling. (Rarefaction tends to add sparsity.)

I do think you'll have to consider depth in some way; I just don't know that rarefaction is your best approach.
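To make the sparsity point concrete, here is a minimal sketch of rarefying a single sample with NumPy. The counts, the depth of 500 and the `rarefy` helper are made up for illustration; this is not what QIIME 2 runs internally, just subsampling without replacement under those assumptions.

```python
import numpy as np

def rarefy(counts, depth, seed=0):
    """Subsample a vector of feature counts to `depth` reads, without replacement.

    Hypothetical helper for illustration; not a QIIME 2 function.
    """
    counts = np.asarray(counts, dtype=np.int64)
    if depth > counts.sum():
        raise ValueError("depth exceeds the sample's total read count")
    rng = np.random.default_rng(seed)
    # Draw `depth` reads from the pool of reads, keeping track of which
    # feature each read came from.
    return rng.multivariate_hypergeometric(counts, depth)

# Made-up sample: a few dominant features and a long tail of rare ones.
sample = np.array([5000, 3000, 1500, 400, 50, 30, 10, 5, 3, 2])
rarefied = rarefy(sample, depth=500)

print("observed features before:", np.count_nonzero(sample))
print("observed features after: ", np.count_nonzero(rarefied))
```

With counts like these, the features with only a handful of reads are the ones that tend to drop to zero after subsampling, which is why a "rare features" measurement is the one most sensitive to this step.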


First, the calculation of "rare species" and "abundant species" is based on the following article.

Liang, Y.; Xiao, X.; Nuccio, E. E.; Yuan, M.; Zhang, N.; Xue, K.; Cohan, F. M.; Zhou, J.; Sun, B., Differentiation strategies of soil rare and abundant microbial taxa in response to changing climatic regimes. Environ Microbiol 2020, 22, (4), 1327-1340.

Then the "niche breadth" is based on the following expression
Zhang, J.; Zhang, B.; Liu, Y.; Guo, Y.; Shi, P.; Wei, G., Distinct large-scale biogeographic patterns of fungal communities in bulk soil and soybean rhizosphere in China. Sci Total Environ 2018, 644, 791-800.

Secondly, some articles calculate niche breadth based on the Shannon-Wiener index (an alpha diversity measure?).

Zhang, Lan, Guowen Huang, Yongtao Li, and Shitai Bao. 2021. "Quantitative Research Methods of Linguistic Niche and Cultural Sustainability" Sustainability 13, no. 17: 9586.

Thus, is it correct to use rarefied data to calculate alpha and beta diversity, and to use raw data (without rarefaction) for other analyses, for example "niche breadth", "rare species" and "abundant species", and differentially abundant species (DESeq2 or ANCOM)?

Hi @YuZhang,

First, these are general recommendations, because the answer is always It Depends™, as a function of your data and your analyses.

My general recommendation would be to rarefy before alpha diversity (particularly richness) and before most beta diversity metrics. (Aitchison is rarefaction-less, DEICODE is rarefaction-less, and there's some debate around Bray-Curtis, YMMV.) You may find this article on rarefaction interesting:

It looks like the abundance methods you're looking at are based on relative abundance. I would not rarefy before calculating these, but again, you may want to look at depth.
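As a rough sketch of what "relative abundance without rarefying, but keep an eye on depth" could look like, here is a toy NumPy example. Everything in it is an assumption for illustration: the table is made up, the `RARE_MAX` and `ABUNDANT_MIN` cutoffs are placeholders (take the real ones from the paper you're following), and `levins_breadth` uses Levins' index, which is one common niche breadth definition and may not be the exact expression in your reference.

```python
import numpy as np

# Toy feature table: rows are samples, columns are features (raw counts).
counts = np.array([
    [900,  90,  9,  1],    # 1,000 reads
    [450,  45,  5,  0],    # 500 reads: the rarest feature is not detected
    [9000, 900, 90, 10],   # 10,000 reads
], dtype=float)

depth = counts.sum(axis=1)            # reads per sample
rel_abund = counts / depth[:, None]   # each row now sums to 1

# Placeholder cutoffs on mean relative abundance (assumptions, not taken
# from the cited paper).
RARE_MAX = 0.001
ABUNDANT_MIN = 0.01

mean_rel = rel_abund.mean(axis=0)
is_rare = mean_rel < RARE_MAX
is_abundant = mean_rel >= ABUNDANT_MIN

def levins_breadth(table):
    """Levins' index per feature: B_j = 1 / sum_i(P_ij ** 2).

    P_ij is the share of feature j's summed relative abundance that falls
    in sample i. Assumes every feature is observed in at least one sample.
    """
    p = table / table.sum(axis=0)
    return 1.0 / (p ** 2).sum(axis=0)

breadth = levins_breadth(rel_abund)

print("depth per sample:", depth)
print("rare:", is_rare, "abundant:", is_abundant)
print("niche breadth:", breadth)
```

In this toy table the rarest feature has the same underlying proportion in the first and third samples, but it is missing from the shallow 500-read sample, so its niche breadth comes out lower than that of the dominant feature. That is the kind of depth effect worth checking for, even if you don't rarefy.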

For differential abundance, I think the consensus is not to rarefy. But the whole question of differential abundance is its own can of worms. I'll refer you to a recent paper by Nearing et al. for more discussion.