Choosing sampling depth


My samples have very different read counts with a minimum of 2 and maximum of 95,118. This makes it difficult for me to choose an appropriate sampling depth. Any suggestions?

Hi @Negin

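The goal in choosing a rarefaction depth is to balance two kinds of loss: the sequences (and the rare features they carry) that you throw away from your deeper samples, and the samples you drop entirely because they fall below the chosen depth. You need to determine what’s most important for your analysis and your community. This depends on the type of community you’re dealing with and your overall sample size. It’s a lot of judgement and there isn’t a single right answer.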

That said, in my work, I don’t rarefy below 1000 sequences/sample. My personal preference is around 5000 sequences/sample, but that’s somewhat subjective and based on the type of sample I typically work with and on my sequencing and OTU picking/denoising protocols. Past the 1000 sequences/sample threshold, my goal is to retain as many samples as possible.
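If it helps to see the trade-off in numbers, here is a minimal sketch in plain Python that tallies what a candidate depth would cost. The function name `retention_at_depth` and the read counts are made up for illustration; you would substitute your own per-sample counts. It assumes that rarefying to a depth drops every sample below that depth and keeps exactly that many sequences from each remaining sample.

```python
from typing import Dict


def retention_at_depth(read_counts: Dict[str, int], depth: int) -> dict:
    """Summarise what is kept and lost if every sample is rarefied to `depth`."""
    # Samples with fewer reads than the depth are dropped entirely.
    kept = [sample for sample, n in read_counts.items() if n >= depth]
    total_sequences = sum(read_counts.values())
    # Each retained sample contributes exactly `depth` sequences.
    sequences_kept = depth * len(kept)
    return {
        "depth": depth,
        "samples_kept": len(kept),
        "samples_lost": len(read_counts) - len(kept),
        "sequences_kept": sequences_kept,
        "fraction_sequences_kept": (
            sequences_kept / total_sequences if total_sequences else 0.0
        ),
    }


# Hypothetical read counts, NOT taken from the original poster's data.
example_counts = {"S1": 2, "S2": 850, "S3": 1200, "S4": 5300, "S5": 95118}

for candidate in (1000, 5000):
    print(retention_at_depth(example_counts, candidate))
```

Comparing the summaries for 1000 and 5000 side by side makes it easier to see how many samples each extra step in depth costs.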

So, I would absolutely discard anything below 1000 sequences/sample. Given your sample size, you will lose a lot of samples that you might need in your analysis if you go much deeper, but going deeper is still an option. I often look for natural breaks in the per-sample depths. So, if you wanted to go deeper than 1000 sequences/sample, you might look at the jump of more than 1500 sequences/sample in depth between BRH1477519 and NC3.
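One rough way to spot natural breaks like that is to sort the per-sample depths and look for the largest gaps between neighbours. The sketch below is only an illustration: `largest_gaps`, the `min_depth` default, and the sample names and counts are all hypothetical, so swap in your own values.

```python
from typing import Dict, List, Tuple


def largest_gaps(
    read_counts: Dict[str, int], min_depth: int = 1000, top: int = 3
) -> List[Tuple[int, str, int, str, int]]:
    """Return the `top` largest jumps between consecutive sample depths.

    Each result is (gap, lower_sample, lower_depth, upper_sample, upper_depth).
    Samples below `min_depth` are ignored, since they would be dropped anyway.
    """
    eligible = sorted(
        (n, sample) for sample, n in read_counts.items() if n >= min_depth
    )
    gaps = [
        (hi_n - lo_n, lo_sample, lo_n, hi_sample, hi_n)
        for (lo_n, lo_sample), (hi_n, hi_sample) in zip(eligible, eligible[1:])
    ]
    # Largest gaps first.
    return sorted(gaps, reverse=True)[:top]


# Hypothetical read counts, NOT taken from the original poster's data.
example_counts = {"A": 400, "B": 1100, "C": 1300, "D": 2900, "E": 3100, "F": 9000}

for gap, lo_sample, lo_n, hi_sample, hi_n in largest_gaps(example_counts, top=2):
    print(f"{gap} sequences between {lo_sample} ({lo_n}) and {hi_sample} ({hi_n})")
```

Any depth above the lower sample's count and up to the upper sample's count drops the same set of samples, so the upper count is the deepest you can go without losing more than the samples already below the gap.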

Your Q2 artefact should help you characterise the number of samples and sequences that you’re losing in the filtering.



