DADA2 and rarefaction

Hi everyone, I heard that it is not always necessary to rarefy after using DADA2, especially if the rarefaction curves reach a plateau very early on.

Is this correct?

Hi @Mantella86,

It's a complicated issue and history. I think the general answer is "it depends".

It depends on what you're analyzing, your metric/test, and your assumptions. There was the "never rarefy" school, which has carried forward and is correct for differential abundance testing. (Those methods handle normalization themselves, as do some rarefaction-less metrics.)

Weiss et al. responded by showing that depth could still be a strong driver of community in ordination space, and suggested rarefying before diversity analyses, which I think has been the prevailing recommendation.

A recent paper gave a more nuanced approach, suggesting it depends on your question and system. You might find more guidance in that.

My experience is that it depends on your metric, hypothesis, and model.

Hi, I have some microbe data that I did not rarefy after using DADA2 because the samples all reach a plateau. It is my understanding that for alpha diversity this does not matter, but for beta diversity analysis it's better to at least convert the data to relative abundance?

Is this correct?

Best wishes

Hi @Mantella86,

I merged your two topics, because they seem to be asking the same question. Did you take a look at the new paper I linked?

Best,
Justine

Hi Justine, sorry I did not see your reply here. I did, but I am also quite new to microbial data analysis. I have read a few papers. Some of the recent ones at least suggest transforming the data for beta diversity, but I am not sure?

My problem with rarefying is I would lose a lot of data.

Hi @Mantella86,

Rarefaction does tend to be associated with data loss, but again, it depends on your environment and your metric. For example, I wouldn't advise rarefying before you apply a rarefaction-less metric, like Aitchison or DEICODE. But you also wouldn't pass in normalized data, since the normalization is built in.
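To make "the normalization is built in" concrete, here is a minimal sketch of a plain Aitchison distance in NumPy: a centered log-ratio (CLR) transform followed by Euclidean distance. The function names, the toy table, and the pseudocount of 1 are assumptions for illustration only. This is not how any particular plugin implements it, and DEICODE in particular handles zeros differently rather than adding a pseudocount.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples x features count matrix."""
    # The pseudocount avoids log(0); its value is an assumption, not a recommendation.
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtract each sample's mean log value (the log of its geometric mean).
    return log_x - log_x.mean(axis=1, keepdims=True)

def aitchison_distance(counts, pseudocount=1.0):
    """Pairwise Euclidean distance between CLR-transformed samples."""
    z = clr(counts, pseudocount)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy table: sample 1 is sample 0 sequenced ten times deeper.
counts = np.array([
    [10, 20, 70],
    [100, 200, 700],
    [50, 30, 20],
])

dist = aitchison_distance(counts)
dist_no_pseudo = aitchison_distance(counts, pseudocount=0.0)  # valid here: no zeros
```

Without a pseudocount, the first two samples sit at essentially zero distance even though their depths differ tenfold, because the log-ratio transform removes total depth. That is why the input should be raw counts rather than an already rarefied or normalized table.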

The Weiss paper suggests that sampling depth is a major driver of unweighted metrics (an observation my independent work agrees with). I don't know enough about your environmental biomass or system, what metric is plateauing, or how you're selecting depth. The reality of microbiome analysis is that you're likely to lose data somewhere, whether it's filtering low abundance/prevalence reads, filtering samples with very few reads (which you should absolutely do), or rarefying.
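To see where the data loss comes from, here is a minimal sketch of rarefying a count table to an even depth with NumPy. The function name `rarefy`, the toy counts, and the depth of 1000 are hypothetical, chosen only for illustration. Samples below the chosen depth are dropped, and the remaining samples are subsampled without replacement.

```python
import numpy as np

def rarefy(counts, depth, seed=0):
    """Subsample each sample (row) to `depth` reads without replacement.

    Returns the rarefied table and a boolean mask of the samples that were kept.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=int)
    keep = counts.sum(axis=1) >= depth  # samples below the depth are discarded
    rarefied = []
    for row in counts[keep]:
        # Expand the row into one feature label per read, then draw `depth` reads.
        reads = np.repeat(np.arange(row.size), row)
        chosen = rng.choice(reads, size=depth, replace=False)
        rarefied.append(np.bincount(chosen, minlength=row.size))
    return np.array(rarefied).reshape(-1, counts.shape[1]), keep

# Toy table: three samples, three features, very uneven depth.
counts = np.array([
    [500, 300, 200],     # 1000 reads
    [5, 3, 2],           # 10 reads, dropped at depth 1000
    [4000, 1000, 5000],  # 10000 reads, subsampled to 1000
])

rarefied, keep = rarefy(counts, depth=1000)
reads_retained = rarefied.sum() / counts.sum()  # fraction of reads that survive
```

In this toy case, most of the discarded reads come from the deepest sample, while the very shallow sample would be filtered out under almost any approach. How much is lost in practice depends on the depth you choose and on how uneven your samples are.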

The other major consideration here, for what it's worth, is that QIIME 2 won't take a relative abundance table for metrics. You have to pass a count table. You could pass an unrarefied count table, though.

Best,
Justine
