I'm working with skin samples and I'm trying to choose the sampling depth.
When analyzing the alpha rarefaction curve, I noticed that a sampling depth of approximately 500 reads could be a good choice because that is where the curve reaches a plateau.
However, when analyzing the taxa bar plot, I noticed that many of the samples between 500 and 1654 reads look very “weird”: they contain only one or two fungi, which do not represent the real skin microbiota. I'm concerned that including these samples would compromise the statistics and my results.
Is there any other approach for choosing a sampling depth, besides the alpha rarefaction curve?
My personal rule of thumb is that I don't typically work below 1000 sequences/sample, no matter where the curves flatten, because the data below that threshold tends to be less stable. You could try a PCoA at 500 sequences/sample, possibly passing your denoising stats as a metadata column, to see if the samples between 500 and 2000 separate in a strange way.
Yes, rarefaction works to some degree to make predictions. So, that's an okay approach. I also tend to like to look at the table - as you've done - to see how many samples you lose at a given filtering depth. So, like, if you pick a depth of 500 sequences/sample, you lose 4 samples and if you pick a depth of 1000, you lose 9 samples; I'd compare that against the percentage of sequences you retain.
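A minimal sketch of that comparison, assuming you have exported the per-sample read totals from your feature table. The function name `depth_tradeoff` and the read counts below are made up for illustration; they are not from this dataset. At a given depth, every kept sample contributes exactly `depth` reads, so the retained share is `depth * samples_kept / total_reads`.

```python
def depth_tradeoff(sample_totals, depths):
    """For each candidate depth, report samples lost and the share of reads kept."""
    total_reads = sum(sample_totals.values())
    rows = []
    for depth in depths:
        # A sample survives rarefaction only if it has at least `depth` reads.
        kept = [s for s, n in sample_totals.items() if n >= depth]
        rows.append({
            "depth": depth,
            "samples_lost": len(sample_totals) - len(kept),
            # Each kept sample is subsampled down to exactly `depth` reads.
            "reads_retained_pct": round(100 * depth * len(kept) / total_reads, 1),
        })
    return rows


# Hypothetical per-sample read totals (illustration only).
example_totals = {
    "S1": 320, "S2": 480, "S3": 760, "S4": 990,
    "S5": 1654, "S6": 5200, "S7": 8100, "S8": 12000,
}

for row in depth_tradeoff(example_totals, [500, 1000, 1654]):
    print(row)
```

Looking at both columns side by side makes it easier to see whether a higher depth costs you many samples or only a few low-quality ones.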
By “less stable” I mean that as you repeat rarefaction, a lower depth will mean that there's more error in the measurement. Again, empirically, I tend to automatically discard anything with fewer than 1000 sequences/sample as too low quality to include. Failures below that threshold are common and simply a reality of most amplicon sequencing studies of a reasonable size. So, I wouldn't worry there.
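To illustrate the point about repeated rarefaction, here is a small simulation using only the Python standard library. It is a sketch under assumptions: the community in `example_counts` is invented, the helper names are hypothetical, and rarefaction is modeled as plain subsampling without replacement rather than any particular plugin's implementation. It repeats the subsampling at several depths and reports how much one taxon's relative abundance varies between repeats.

```python
import random
import statistics


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a {feature: count} dict."""
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rarefied = {}
    for feature in rng.sample(pool, depth):
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


def relative_abundance_spread(counts, feature, depth, repeats=100, seed=0):
    """Mean and standard deviation of one feature's relative abundance across repeats."""
    rng = random.Random(seed)
    values = [
        rarefy(counts, depth, rng).get(feature, 0) / depth
        for _ in range(repeats)
    ]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical sample with 10,000 reads (illustration only).
example_counts = {"taxon_A": 6000, "taxon_B": 3000, "taxon_C": 900, "taxon_D": 100}

for depth in (100, 500, 1000, 2000):
    mean, sd = relative_abundance_spread(example_counts, "taxon_C", depth)
    print(f"depth={depth:5d}  mean={mean:.3f}  sd={sd:.4f}")
```

The standard deviation should shrink as the depth increases, which is the sense in which low-depth measurements are noisier.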
Best,
Justine
P.S. Once you pick a depth, you might also check out the SRS plugin. I haven't played with it yet, but it's supposed to be more robust than rarefaction and might be something to try!