I want to know how low I can go. I’d love to know why I shouldn’t go any lower.
Specifically, I’m trying to understand how the inclusion (or exclusion) of a sample with a small number of reads influences the denoising step (whether with Deblur, DADA2, or UNOISE). On one hand, it seems like including the entire dataset gives the truest representation of the error present in the data. On the other, I wonder whether samples with low read counts have a greater propensity for chimeras or higher error rates?
If time and computational resources are no issue, is it best to include everything going into denoising?
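In case it helps frame the question, here's a rough sketch of how I'd tally per-sample read counts before deciding what to feed into denoising. The directory name, file-name pattern, and the 1,000-read cutoff are just placeholders for illustration, not part of any of the tools' workflows:

```python
import gzip
from pathlib import Path

# Hypothetical directory of demultiplexed, gzipped FASTQ files (one per sample);
# adjust the path and glob pattern to match your own naming scheme.
fastq_dir = Path("demux")
min_reads = 1000  # placeholder cutoff for flagging low-read samples

for fq in sorted(fastq_dir.glob("*_R1_*.fastq.gz")):
    with gzip.open(fq, "rt") as handle:
        n_reads = sum(1 for _ in handle) // 4  # 4 lines per FASTQ record
    flag = "  <- below cutoff" if n_reads < min_reads else ""
    print(f"{fq.name}\t{n_reads}{flag}")
```

That at least tells me which samples are in the "small number of reads" category, but it doesn't tell me whether dropping them before denoising is the right call.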
There are lots of great posts on this forum about sampling depth considerations (two examples here and here), but I'm pretty sure those relate to the data going into diversity analyses (so presumably after denoising).
Thanks for your consideration and comments!