Choosing Optimal Sampling Depth for Core Diversity and Alpha Rarefaction Analysis

Hello
I have run alpha rarefaction analysis with two different sampling depths: a low depth of 1049, and 11000, which I chose thinking it might be near the median depth. The two choices produced different alpha diversity boxplots after running qiime diversity core-metrics-phylogenetic and visualization. I noticed that the observed feature table changes based on the sampling depth selected.

My confusion lies in choosing the best sampling depth for both:

  1. The alpha rarefaction curve: I understand the importance of finding a plateau where the diversity metrics stabilize, but how do I balance this with ensuring the inclusivity of all samples?
  2. Core diversity analysis: Should the sampling depth chosen for core diversity analysis be the same as the plateau value observed in the alpha rarefaction plot?

I am looking for guidance on choosing the exact sampling depth, in particular:

  • How to determine the best sampling depth for both alpha rarefaction and core diversity metrics.
  • Whether it’s common to see different observed feature tables when changing the sampling depth, and if so, how to interpret these differences in the context of diversity analysis.

Any insights or suggestions would be greatly appreciated!

Thank you!

observed-otus-group-significance 1049.qzv (472.5 KB)
observed-otus-group-significance_11000.qzv (471.9 KB)
alpha-rarefaction1049.qzv (725.1 KB)
alpha-rarefaction11000.qzv (642.8 KB)
filtered-phylum-table-summary.qzv (1.1 MB)


Hi @Namraj_Jaishi,

This is definitely a difficult balance to find.

When selecting a sampling depth, we are trying to investigate all the samples as deeply as possible (maximizing the features we capture) while trying not to throw out too many shallow samples (samples that don't reach the sampling depth threshold).
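To make that trade-off concrete, here is a minimal Python sketch that counts how many samples would survive at each candidate depth. The per-sample totals and the helper name `retention_by_depth` are made up for illustration; in practice you would read the real totals from your feature table summary.

```python
def retention_by_depth(sample_totals, candidate_depths):
    """For each candidate depth, report how many samples would be kept
    (total sequences >= depth) and how many sequences remain overall."""
    rows = []
    for depth in candidate_depths:
        kept = [s for s, total in sample_totals.items() if total >= depth]
        rows.append({
            "depth": depth,
            "samples_kept": len(kept),
            "samples_dropped": len(sample_totals) - len(kept),
            # Every kept sample is subsampled down to exactly `depth`.
            "sequences_kept": depth * len(kept),
        })
    return rows


# Hypothetical per-sample sequence totals (not the data from this thread).
sample_totals = {
    "S1": 1049, "S2": 3200, "S3": 9800,
    "S4": 11000, "S5": 14500, "S6": 21000,
}

for row in retention_by_depth(sample_totals, [1049, 11000]):
    print(row)
```

With these invented totals, the lower depth keeps every sample but only a small slice of each one, while the higher depth keeps far more sequences per sample at the cost of dropping half the samples.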

I tend to fall on the side of maximizing the features captured per sample, so that I can investigate samples deeply, rather than retaining as many samples as possible.

However, this is definitely an "it depends" scenario. If the study is longitudinal (and alpha rarefaction shows an obvious plateau), I would lean towards saving as many samples as possible so that subjects aren't missing timepoints. However, if I have group replicates in my study (like 60 samples from my skin microbiome and 60 from my gut microbiome), I would probably drop samples in order to investigate the remainder as deeply as possible.

At the end of the day, this is really your decision, and you will need to think about what's best for your study.

You should choose a sampling depth within the range of depths where alpha diversity plateaus. A plateau means that adding more sequences to each sample no longer changes the alpha diversity value for that sample.
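If it helps to see why a plateau appears, below is a rough Python simulation of rarefying a single sample: it subsamples sequences without replacement at several depths and averages the number of observed features. This is only a sketch of the idea, not how QIIME 2 implements it; the function names and the sample composition are invented for illustration.

```python
import random


def observed_features_at_depth(feature_counts, depth, rng):
    """Subsample `depth` sequences without replacement and count the distinct
    features seen. Returns None if the sample has fewer than `depth` sequences."""
    # Expand the counts into one entry per sequence, labeled by its feature.
    reads = [feature for feature, n in feature_counts.items() for _ in range(n)]
    if depth > len(reads):
        return None
    return len(set(rng.sample(reads, depth)))


def rarefaction_curve(feature_counts, depths, iterations=10, seed=0):
    """Mean observed features over several random subsamples at each depth."""
    rng = random.Random(seed)
    curve = {}
    for depth in depths:
        values = [observed_features_at_depth(feature_counts, depth, rng)
                  for _ in range(iterations)]
        curve[depth] = None if values[0] is None else sum(values) / iterations
    return curve


# Hypothetical sample: 10 abundant features plus 200 rare ones (11,000 sequences).
sample = {f"abundant_{i}": 1000 for i in range(10)}
sample.update({f"rare_{i}": 5 for i in range(200)})

for depth, mean_obs in rarefaction_curve(sample, [100, 1000, 5000, 11000]).items():
    print(depth, mean_obs)
```

In this toy sample, going from 100 to 1,000 sequences adds many more observed features than going from 5,000 to 11,000, which is the flattening you are looking for in the rarefaction plot.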

Alpha rarefaction asks for a max sampling depth. This value should be on the higher end so that you can see where the plateau starts. If I were debating between 1,049 and 11,000 as my sampling depth, I would probably pass 15,000 as my max depth for alpha rarefaction so that I can get a good sense of how the alpha diversity changes at sampling depths between 1,049 and 15,000.

The alpha rarefaction should then help you decide what value to give core-metrics.

If 1,049 is below the threshold where alpha diversity stabilizes, which it seems like it might be, then seeing different observed feature values per sample is expected. (I would maybe increase the number of steps in your alpha rarefaction plot so you can see this at a finer resolution.)
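As a rough illustration of why more steps help, the sketch below lists evenly spaced depths between a minimum and a maximum depth and counts how many land between the two candidates. I am assuming here that the rarefaction depths are evenly spaced; the helper names, the minimum depth of 1, and the step counts of 10 and 30 are illustrative choices, not values taken from your run.

```python
def rarefaction_depths(min_depth, max_depth, steps):
    """Evenly spaced integer depths from min_depth to max_depth, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    span = max_depth - min_depth
    return [min_depth + (span * i) // (steps - 1) for i in range(steps)]


def between(depths, low=1049, high=11000):
    """Depths that fall between the two candidate sampling depths."""
    return [d for d in depths if low <= d <= high]


coarse = rarefaction_depths(1, 15000, 10)  # few steps: wide gaps between depths
fine = rarefaction_depths(1, 15000, 30)    # more steps: finer resolution

print("coarse:", between(coarse))
print("fine:  ", between(fine))
```

With only a few steps, the gap between evaluated depths is more than a thousand sequences, so it is hard to tell where between 1,049 and 11,000 the curve actually levels off.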

Additionally, we have a lot of resources on this topic on the QIIME 2 YouTube channel and here on the forum that may be helpful!

Hope this helps!


@cherman2
Thank you for your insights.
