Rarefaction vs Rarefying and when to use them in your analysis pipeline

Dear qiime2 community,

I entered the microbiome analysis world this year and just came to the understanding that Rarefaction and Rarefying are NOT the same thing, so I have been rarefying my data when I thought that I was performing rarefaction (insert scream of horror :scream:). With this realization in hand I have attempted to understand what the differences are and when to apply them... I would greatly appreciate input if you have a moment!

To summarize my understanding so far:

Rarefaction = repeated subsampling to a specific sequencing depth, where the metric of interest is computed on each subsample and then averaged into a single mean value per sample.
Rarefying = a single subsampling to a specific sequencing depth, producing one new count for each ASV in each sample.
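
To make sure I'm picturing the mechanics right, here is a minimal sketch of what I understand Rarefying to be. It's just toy numpy code I wrote to check my understanding (the function name and counts are made up), not how QIIME 2 implements its rarefy action:

```python
# Toy sketch (not the QIIME 2 implementation): rarefying = one random
# subsample of a sample's ASV counts down to a fixed depth, without replacement.
import numpy as np

rng = np.random.default_rng()

def rarefy_counts(counts, depth):
    """Subsample a vector of ASV counts down to `depth` reads, once."""
    reads = np.repeat(np.arange(len(counts)), counts)    # one entry per read, labelled by ASV
    keep = rng.choice(reads, size=depth, replace=False)  # draw `depth` reads without replacement
    return np.bincount(keep, minlength=len(counts))      # a new, smaller count for every ASV

sample = np.array([500, 30, 3, 0, 1])    # made-up ASV counts for one sample
print(rarefy_counts(sample, depth=100))  # prints something like [94  5  1  0  0]; rare ASVs can drop out entirely
```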

My understanding is that there has been a lot of confusion in the field surrounding both these terms (conflating the two... #relatable :smiling_face_with_tear:), but the current consensus seems to be that Rarefaction is a valid way (and perhaps even the best way) to correct for differences in sampling depth when calculating diversity metrics, as the initial paper that argued against Rarefaction actually evaluated Rarefying instead (though there were other issues too...).

My questions are as follows:

  1. Although Rarefaction is a validated way of correcting for sampling depth, Rarefying is not. Is this fair to say?

  2. Rarefaction can be used for calculating beta-diversity and alpha-diversity metrics, but it isn't possible for differential abundance calculations. Is this not then a problem, because we are not correcting for sampling depth in differential abundance analyses? Would this not be an argument for Rarefying, since one could rarefy once, thereby correcting for sampling depth, and then use the resulting table for both diversity metrics and differential abundance calculations?

  3. Because Rarefaction controls for sampling depth, and CLR transformation controls for the compositionality of data, would it not be fair to apply Rarefaction followed by CLR transformation?

Thank you so much in advance for helping me understand these concepts!

kindest regards,
Zoë

A few papers that I have really appreciated on the topic:
https://journals.asm.org/doi/10.1128/msphere.00355-23
https://journals.asm.org/doi/10.1128/msphere.00354-23


This is a great post!

My first thought is that the words are too dang similar!

Instead, we could call these 'single subsampling' and 'multiple subsampling', which makes the commonality (computational subsampling) and difference (one time, many times) clear.

'Multiple subsampling' is also called 'bootstrapping', and there is a new QIIME 2 plugin that does this!
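
To make the "one time vs. many times" distinction concrete, here's a toy sketch of the 'multiple subsampling' idea, borrowing the made-up `rarefy_counts` helper and `sample` array from your post above. (Strictly speaking, a bootstrap resamples *with* replacement, while this keeps the without-replacement subsampling from that sketch; q2-boots is what does this properly on real feature tables.)

```python
# Toy sketch of 'multiple subsampling': repeat the subsample n times, compute the
# metric on each draw, and average across draws.
def repeated_subsampling_alpha(counts, depth, n=100):
    values = []
    for _ in range(n):
        sub = rarefy_counts(counts, depth)    # rarefy_counts() and `sample` from the post above
        values.append(np.count_nonzero(sub))  # observed ASVs in this one draw
    return np.mean(values), np.std(values)

mean_obs, sd_obs = repeated_subsampling_alpha(sample, depth=100, n=100)
print(f"observed ASVs: {mean_obs:.1f} +/- {sd_obs:.1f} across 100 subsamples")
```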


Hi @zippyzo, Thanks for your post!

Although Rarefaction is a validated way of correcting for sampling depth, Rarefying is not. Is this fair to say?

Rarefaction is a more reliable approach to performing alpha and beta diversity analysis. Rarefying has been widely used, and tends to produce very similar results to rarefaction*, but rarefaction is preferable.

@colinbrislawn mentioned q2-boots (see the paper here), which is what I would recommend using. This is illustrated in the gut-to-soil (g2s) tutorial here - in this case using the kmer-diversity action, though core-metrics is also a good one to use (docs on core-metrics, with examples, are here).

it isn't possible for calculations of differential abundance

The methods for normalization tend to be specific to the differential abundance testing method. Those statistics are complex (I'm not an expert), so I go with what the statisticians who develop them recommend.

Because Rarefaction controls for sampling depth, and CLR transformation controls for the compositionality of data, would it not be fair to apply Rarefaction followed by CLR transformation?

This seems reasonable to me, but I'm interested in what others think as well (@jwdebelius - any input on this?).

Right now, it's possible to create n rarefied feature tables with the resample action in boots. We don't have a corresponding action to average these together to create a single table, but if there is a need for that it would be easy enough to add.
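
If it's useful in the meantime, here is a rough sketch of averaging the resampled tables outside of QIIME 2, once they've been exported as pandas DataFrames (everything here is hypothetical - made-up variable names and toy counts):

```python
# Rough sketch: element-wise mean across n resampled feature tables (samples x ASVs).
import pandas as pd

# two tiny made-up resampled tables standing in for the output of the resample action
t1 = pd.DataFrame({"asv1": [10, 4], "asv2": [0, 6]}, index=["s1", "s2"])
t2 = pd.DataFrame({"asv1": [8, 5], "asv2": [2, 5]}, index=["s1", "s2"])

def average_tables(tables):
    """Element-wise mean across equally-indexed feature tables."""
    ref = tables[0]
    aligned = [t.reindex(index=ref.index, columns=ref.columns, fill_value=0) for t in tables]
    return sum(aligned) / len(aligned)

print(average_tables([t1, t2]))   # asv1: [9.0, 4.5], asv2: [1.0, 5.5]
```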

I hope this helps!

Notes:
*One analysis that hasn't been done yet, but which I think would be interesting, would be to assess how often a given rarefied diversity metric represents an outlier from the distribution of all the diversity metrics that are computed during rarefaction. I think that would help us assess how problematic rarefying tends to be - but regardless, I still believe that rarefaction is preferable. This analysis would be straightforward to pull off with q2-boots - happy to advise if anyone wants to try it!
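
For anyone who wants to play with the idea before doing it properly with q2-boots, here is a toy sketch of the comparison I have in mind (it reuses the made-up `rarefy_counts` helper and `sample` array from the sketch earlier in the thread, and the numbers are purely illustrative):

```python
# Toy sketch: where does one single 'rarefied' value fall within the distribution
# of values obtained from repeated subsampling of the same sample?
single = np.count_nonzero(rarefy_counts(sample, depth=100))   # rarefying: one draw
dist = np.array([np.count_nonzero(rarefy_counts(sample, depth=100)) for _ in range(1000)])
percentile = (dist <= single).mean() * 100
print(f"the single rarefied value sits at the {percentile:.0f}th percentile of the rarefaction distribution")
```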


Hi @zippyzo and @gregcaporaso

I wouldn't recommend rarefaction + CLR. The issue I end up having with rarefaction here is that it introduces artificial zeros. (We have true 0s from absences, 0s from things that are below the limit of detection, systematic zeros, and now rarefaction zeros.) Since a lot of the bias in CLR comes from those zeros, introducing more zeros would not be my preferred approach, even if it's done through repeated subsampling.
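
To show why those zeros worry me, here is a minimal sketch of a CLR transform (the pseudocount and the counts are made up, and this isn't any particular plugin's implementation). Because log(0) is undefined, every zero has to be replaced before taking logs, and the more zeros a sample has, the more every CLR value in that sample depends on that arbitrary replacement:

```python
# Minimal CLR sketch: log(0) is undefined, so zeros must be replaced (here with a
# pseudocount of 0.5), and each extra zero nudges every CLR value in the sample.
import numpy as np

def clr(counts, pseudocount=0.5):
    x = np.asarray(counts, dtype=float) + pseudocount
    logx = np.log(x)
    return logx - logx.mean()    # log ratio of each ASV to the sample's geometric mean

original = np.array([500, 30, 3, 0, 1])   # made-up counts for one sample
rarefied = np.array([94, 5, 1, 0, 0])     # e.g. the same sample after subsampling to 100 reads
print(clr(original).round(2))
print(clr(rarefied).round(2))             # the extra (rarefaction) zero shifts the whole vector
```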

I tend to prefer to apply a filter to my taxa to select things which meet certain abundance and prevalence criteria, so that I can be confident that they're detected consistently when I test them. My personal criteria are typically 1/rarefaction depth in at least 10% of the samples. This is based on the idea that I, in theory, want to be able to detect at least 1 read (if present) in any sample, so if my shallowest sample is slightly deeper than that depth, I should - theoretically - be able to detect it. I think you could also argue that you need to go to half the rarefaction depth (at least 2 reads detected in the shallowest sample), although for the life of me, I can't recall the name of this principle. I use a 10% threshold because I will sometimes use the same criteria for other transforms, like a presence/absence model, and in those cases, my RR models don't work well for super rare taxa.

The filtering function is implemented in filter-features-conditionally.
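
As a worked example of the arithmetic (all numbers made up): at a rarefaction depth of 5,000, 1/rarefaction depth is 0.0002, and the prevalence cut-off is 10% of samples. filter-features-conditionally applies this kind of threshold to a real QIIME 2 feature table; the pandas sketch below just spells out the logic:

```python
# Worked sketch of the criteria above: keep a feature whose relative abundance is at
# least 1/rarefaction_depth in at least 10% of samples. All counts here are made up.
import pandas as pd

rarefaction_depth = 5000
abundance_threshold = 1 / rarefaction_depth   # 0.0002, i.e. ~1 read at the rarefaction depth
prevalence_threshold = 0.10                   # required in at least 10% of samples

table = pd.DataFrame(                         # samples x ASVs
    {"asv1": [9000, 9500, 9200], "asv2": [1, 0, 0], "asv3": [999, 500, 800]},
    index=["s1", "s2", "s3"],
)
rel_abund = table.div(table.sum(axis=1), axis=0)              # per-sample relative abundances
prevalence = (rel_abund >= abundance_threshold).mean(axis=0)  # fraction of samples that pass
keep = prevalence[prevalence >= prevalence_threshold].index
print(list(keep))   # ['asv1', 'asv3']; asv2's single read falls below 1/5000 everywhere
```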

You can do this (averaging the resampled tables into one) with q2-feature-table merge using average as your overlap method. I'm not sure how float values are handled there, but as a first-order solution, you can already do it?

Best,
Justine
