Sampling depth - Is my sampling depth ideal

ChrisKeefe · May 14, 2020, 4:51pm

There are many different perspectives on this. What you choose is largely a matter of personal preference and study data/needs.

You're interpreting the alpha rarefaction curves correctly - assuming they level off for any combination of metadata category and alpha diversity metric that matters to you, that leveling point can be used as a rough "minimum". As you suggest, there's no need to decrease depth to that minimum.

I often use the approach you've suggested, choosing the highest possible number of sequences I can without losing a specific sample. Better scientists than I have suggested that this might introduce a little bias (I suspect because your low-count sample will not be subject to random subsampling, while all of your other samples will be).
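If it helps to see that approach spelled out, here is a minimal sketch in plain Python. The sample IDs and read counts are made up for illustration, and `max_depth_keeping` and `retained` are hypothetical helper names, not part of any tool. The idea is just that the highest depth that keeps a given set of samples is the smallest read count in that set.

```python
# Hypothetical per-sample read counts (invented numbers, not from any real study).
counts = {"S1": 15230, "S2": 9870, "S3": 22110, "S4": 1204, "S5": 11050}


def max_depth_keeping(counts, must_keep):
    """Highest rarefaction depth that still retains every sample in must_keep."""
    return min(counts[sample] for sample in must_keep)


def retained(counts, depth):
    """Samples with at least `depth` reads, i.e. those that survive rarefaction."""
    return sorted(s for s, n in counts.items() if n >= depth)


# Suppose S4 is expendable but the other four samples are not.
depth = max_depth_keeping(counts, ["S1", "S2", "S3", "S5"])
print(depth, retained(counts, depth))
```

Note that the sample setting the depth (S2 here) keeps all of its reads, while every other retained sample is randomly subsampled down to that number. That is the asymmetry I suspect is behind the bias concern.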

These folks may select an arbitrary reasonable rarefaction depth - say, 10k reads - and apply that without splitting hairs over a few reads. Even in this case, though, you want to consider the effect that sampling depth will have on your data, in terms of utility and bias, and select a threshold that won't damage the meta-study if your next data set isn't this robust.

You probably have enough samples to safely lose a few, but you may not have to lose any. Your decision comes down to balancing "how deep is deep enough for my study?" against preserving as many samples as possible. Only you can make that call, but here are a couple of opinions that might help. (1, 2)
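One rough way to look at that balance is to tabulate, for a few candidate depths, how many samples survive and how many reads you keep in total. This is only a sketch under assumptions: the read counts and candidate depths below are invented, and `depth_tradeoff` is a hypothetical helper, not a function from any package.

```python
# Hypothetical per-sample read counts (invented numbers for illustration only).
read_counts = {"S1": 15230, "S2": 9870, "S3": 22110, "S4": 1204, "S5": 11050}


def depth_tradeoff(read_counts, candidate_depths):
    """For each depth, return (depth, samples retained, total reads retained)."""
    rows = []
    for depth in sorted(candidate_depths):
        # A sample survives rarefaction only if it has at least `depth` reads.
        n_kept = sum(1 for n in read_counts.values() if n >= depth)
        # Every surviving sample is subsampled down to exactly `depth` reads.
        rows.append((depth, n_kept, depth * n_kept))
    return rows


for depth, n_kept, total_reads in depth_tradeoff(read_counts, [1000, 5000, 10000, 15000]):
    print(f"depth={depth:>6}  samples kept={n_kept}  reads kept={total_reads}")
```

A table like this doesn't make the decision for you, but it shows where raising the depth starts costing samples faster than it gains reads.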