Bacterial alpha diversity so low?

Mantella86 · August 21, 2021, 3:23pm

Hi everyone, I have finished processing bacterial data through QIIME2. I do not lose too many reads after filtering, but my observed alpha diversity numbers seem low. Rather than a few thousand species, I only have a few hundred per treatment, even though my OTU table says I have over 5000 OTUs.

Any thoughts why this is?

Best wishes

jwdebelius · August 22, 2021, 10:33am

Hi @Mantella86,

First, do you have 5000 OTUs total? Over 5000 counts? Over 5000 counts per sample? (Did you rarefy to 5000 sequences/sample, maybe?)
If it's the first, remember that many samples don't share all their features. It obviously depends on the environment, so the fact that you have over 5000 independent features doesn't necessarily correspond to 5000 features in each sample.
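To make the difference between the total and the per-sample feature count concrete, here is a minimal sketch in Python with NumPy. The table and its counts are invented purely for illustration; they are not from your data.

```python
import numpy as np

# Hypothetical feature table: rows are samples, columns are features (OTUs).
# The counts are made up for illustration only.
table = np.array([
    [10, 0, 3, 0, 0, 0],
    [0, 5, 0, 0, 7, 0],
    [0, 0, 0, 2, 0, 9],
])

# Features with at least one read anywhere in the table.
total_features = int((table.sum(axis=0) > 0).sum())

# Observed features per sample: features with at least one read in that sample.
per_sample_richness = (table > 0).sum(axis=1)

print("features in the whole table:", total_features)
print("observed features per sample:", per_sample_richness.tolist())
```

In this toy table no feature is shared between samples, so the table-wide count is three times the richness of any single sample.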

Second, does the number of features make sense for your environment? Certain environments are low-diversity environments (for instance, the human vagina) and will saturate pretty quickly. So, if this is a well-known environment or has been reported in the literature before, I'd take a look at where their diversity is.

However, I'll also note a caveat: your number of observed features is contingent on sequencing depth, which you can see with your rarefaction curve, and on your denoising/clustering method (see Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches).

Best,
Justine

Mantella86 · August 22, 2021, 5:53pm

Hi Justine, thank you for getting back. I have exactly 6399 and did not rarefy. I have been advised not to for now. The environment is a peatland, so very acidic, anaerobic soil. Could it be that when I do my alpha diversity they are clustering together?

Best wishes

jwdebelius · August 22, 2021, 8:42pm

Hi @Mantella86,

By observed OTUs, do you mean the count in the table summary, or from the diversity command?

If it's the former, what do you see in other papers with similar sample size, sequencing depths, and environments? What is the point of reference? Again, keep in mind that your observed features are going to be a function of sequencing depth, bioinformatics, sample size, and environment. I'd look at the paper on denoising I linked above, for example, for one explanation.

If it's the latter, you need to rarefy because a pure richness metric is super sensitive to sequencing depth, and rarefaction is the current best practice (best of possible evils) for dealing with the issue. You have a little bit more leeway with something like Shannon, but I think currently we still recommend rarefying for traditional diversity metrics.
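As a rough illustration of why richness depends on depth, here is a small Python/NumPy sketch of subsampling without replacement. The helper names `rarefy` and `observed_features` and the counts are hypothetical; this is not the QIIME 2 or phyloseq implementation, just the general idea.

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Subsample one sample's counts to `depth` reads without replacement."""
    counts = np.asarray(counts)
    if counts.sum() < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Expand the counts into one entry per read, labelled by feature index.
    reads = np.repeat(np.arange(counts.size), counts)
    kept = rng.choice(reads, size=depth, replace=False)
    # Count how many of the kept reads belong to each feature.
    return np.bincount(kept, minlength=counts.size)

def observed_features(counts):
    """Richness: number of features with at least one read."""
    return int((np.asarray(counts) > 0).sum())

rng = np.random.default_rng(0)

# Hypothetical community with a few abundant and several rare features.
deep = np.array([500, 300, 100, 50, 20, 10, 5, 5, 5, 5])
shallow = rarefy(deep, 50, rng)

print("reads:", int(deep.sum()), "observed features:", observed_features(deep))
print("reads:", int(shallow.sum()), "observed features:", observed_features(shallow))
```

The subsample can never contain more features than the full sample, and rare features are the first to drop out. That is why samples need to be brought to a common depth before their richness is compared.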

Best,
Justine

Mantella86 · August 23, 2021, 10:12am

Hi Justine, I mean the observed OTUs when I run the estimate_richness command in phyloseq. There are only a few hundred per treatment even though my OTU table has over 6000 features.

I hope that makes sense.

Best wishes

jwdebelius · August 23, 2021, 1:28pm

Hi @Mantella86,

Thanks for the clarification. I would not expect every feature to be present in every sample. My experience is that there is often a power-law relationship in the number of samples in which an organism appears, where 80% or more of features are seen in only 10% of samples or fewer (although this is primarily in free-living organisms). So, it makes sense to me that not every feature appears in every sample.
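If you want to check this in your own table, one way is to count how many samples each feature appears in. Below is a minimal Python/NumPy sketch on a made-up table (one widespread feature, four features seen in a single sample each); the numbers are assumptions chosen to mirror the 80%/10% pattern, not real data.

```python
import numpy as np

# Hypothetical table: 10 samples (rows) x 5 features (columns).
n_samples, n_features = 10, 5
table = np.zeros((n_samples, n_features), dtype=int)
table[:, 0] = 20             # feature 0 is present in every sample
for j in range(1, n_features):
    table[j - 1, j] = 3      # each other feature is present in one sample only

# Number of samples in which each feature is present.
samples_per_feature = (table > 0).sum(axis=0)

# Share of features seen in 10% of samples or fewer.
rare = samples_per_feature <= 0.1 * n_samples
rare_fraction = float(rare.mean())

# Observed features per sample, compared with the table-wide total.
per_sample_richness = (table > 0).sum(axis=1)

print("samples per feature:", samples_per_feature.tolist())
print("fraction of features in <=10% of samples:", rare_fraction)
print("max features in one sample:", int(per_sample_richness.max()),
      "of", n_features, "in the table")
```

When most features are this sparse, the richness of any one sample stays far below the number of features in the whole table.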

Best,
Justine
