Is it necessary to rarefy the data before calculating "niche breadth", "rare species", and "abundant species"? I ask because, as I understand it, data for alpha and beta diversity need to be rarefied in the QIIME 2 pipeline, while data for relative abundance do not.

I think it depends on how you're measuring/partitioning these. I'm not sure how you're defining each term. I would expect rarefaction to have a bigger effect on your "rare features" measurement, since low-count features are the most likely to be lost during subsampling. (Rarefaction tends to add sparsity.)
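To make the sparsity point concrete, here is a minimal sketch of rarefying a single sample with NumPy. It is not the QIIME 2 implementation; the `rarefy` helper, the example counts, and the depth of 500 are all made up for illustration.

```python
import numpy as np

def rarefy(counts, depth, seed=0):
    """Subsample a vector of feature counts to `depth` reads without replacement."""
    counts = np.asarray(counts, dtype=np.int64)
    if counts.sum() < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    rng = np.random.default_rng(seed)
    # Drawing reads without replacement is a multivariate hypergeometric draw.
    return rng.multivariate_hypergeometric(counts, depth)

# Hypothetical sample: a few dominant features and a long tail of rare ones.
sample = np.array([5000, 3000, 1500, 400, 50, 20, 10, 5, 3, 2, 1, 1])
rarefied = rarefy(sample, depth=500)

observed_before = int((sample > 0).sum())
observed_after = int((rarefied > 0).sum())
print(f"features observed before: {observed_before}, after: {observed_after}")
```

Features with only a handful of reads are the ones most likely to end up at zero after the draw, which is why a "rare features" measurement is sensitive to this step.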

I do think you'll have to consider depth in some way; I just don't know that rarefaction is your best approach.


First, the calculation of "rare species" and "abundant species" is based on the following article.

Then the "niche breadth" is based on the following expression:
Secondly, some articles calculate niche breadth based on the Shannon-Wiener index (an alpha diversity measure?).
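For reference, one common Shannon-based formulation treats a taxon's distribution across samples (or habitats) as proportions and takes the Shannon entropy of that distribution. The sketch below assumes that formulation; the articles referred to above may define niche breadth differently, and the function name is hypothetical.

```python
import numpy as np

def shannon_niche_breadth(abundances):
    """Shannon entropy of one taxon's distribution across samples.

    `abundances` holds the taxon's abundance in each sample (assumed input;
    counts or per-sample relative abundances both work as weights).
    """
    x = np.asarray(abundances, dtype=float)
    total = x.sum()
    if total <= 0:
        return float("nan")  # taxon never observed: breadth is undefined
    p = x[x > 0] / total     # proportions over samples where the taxon occurs
    return float(-(p * np.log(p)).sum())

generalist = shannon_niche_breadth([10, 10, 10, 10])  # evenly spread
specialist = shannon_niche_breadth([40, 0, 0, 0])     # one sample only
print(generalist, specialist)
```

Note that this is computed per taxon across samples, whereas Shannon alpha diversity is computed per sample across taxa, so the two use the same formula on different axes of the table.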

So, is it correct to use rarefied data to calculate alpha and beta diversity, and raw data (without rarefaction) for other analyses, for example "niche breadth", "rare species" and "abundant species", and differentially abundant species (DESeq2 or ANCOM)?

First, these are general recommendations, because the answer is always It Depends™, as a function of your data and your analyses.

My general recommendation would be to rarefy before alpha diversity (particularly richness) and before most beta diversity metrics. (Aitchison distance is rarefaction-free, DEICODE is rarefaction-free, and there's some debate around Bray-Curtis; YMMV.) You may find this article on rarefaction interesting:

It looks like the abundance methods you're looking at are based on relative abundance. I would not rarefy before calculating this, but again, you may want to look at depth.
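As a rough sketch of what that could look like without rarefying: convert each sample to relative abundance, keep the per-sample depths around so you can check them, and then partition features by mean relative abundance. The cutoffs below (0.01% and 1%) are placeholder assumptions, not values taken from the article in question, and the table is invented.

```python
import numpy as np

# Placeholder cutoffs (assumptions for illustration only).
RARE_MAX = 0.0001      # mean relative abundance below this -> "rare"
ABUNDANT_MIN = 0.01    # mean relative abundance at or above this -> "abundant"

def relative_abundance(table):
    """Convert a samples x features count table to per-sample proportions."""
    table = np.asarray(table, dtype=float)
    depths = table.sum(axis=1, keepdims=True)
    if np.any(depths == 0):
        raise ValueError("at least one sample has zero reads")
    return table / depths

def classify_features(table):
    """Label each feature by its mean relative abundance across samples."""
    mean_rel = relative_abundance(table).mean(axis=0)
    return np.where(
        mean_rel >= ABUNDANT_MIN,
        "abundant",
        np.where(mean_rel < RARE_MAX, "rare", "intermediate"),
    )

# Hypothetical table: 2 samples (rows) x 4 features (columns).
table = np.array([
    [9000, 950, 49, 1],
    [18000, 1900, 99, 1],
])
depths = table.sum(axis=1)   # worth inspecting: 10000 vs 20000 reads
rel = relative_abundance(table)
labels = classify_features(table)
print(depths, labels)
```

Depth still matters here: a feature at 0.005% relative abundance can only be detected in a sample with enough reads, so very uneven depths can change which features appear "rare" even though nothing was rarefied.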

For differential abundance, I think the consensus is not to rarefy. But the whole question of differential abundance is its own can of worms. I'll refer you to a recent paper by Nearing et al. for more discussion.