How is alpha diversity calculated in QIIME 2? How does it relate to the exported alpha rarefaction data?

Hi all. I had thought that the alpha-diversity metrics produced by the “qiime diversity core-metrics-phylogenetic” command (e.g., output folders like shannon_vector and faith_pd_vector) should match the last value of each row in the exported alpha rarefaction data produced by the “qiime diversity alpha-rarefaction” command. However, I recently found that these values are different. In fact, the results from “core-metrics-phylogenetic” do not equal any of the last ten iteration values at the maximum sequencing depth in the rarefaction data, although they are close to each other. Does anyone know about this issue? What is the relationship between the two results?

Welcome to the forum @fanwayne!

This result is expected. Any time a random subsample is taken from a collection of sequences to estimate alpha diversity, the selection of sequences may be different, leading to slightly different estimates each time. This is what happens when you run alpha rarefaction or core metrics: even if the same rarefaction depth is chosen, each iteration yields a slightly different result because a different random subsample is taken, so some distinct sequences drop out and others are picked up. The core metrics value comes from its own independent random subsample, so it need not match any of the iterations from alpha rarefaction.
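To make this concrete, here is a minimal sketch in plain NumPy (not QIIME 2's own code) that rarefies one sample to the same depth ten times and computes Shannon diversity on each subsample. The counts, the depth, the helper names `rarefy` and `shannon`, and the choice of log base 2 are all assumptions made for illustration.

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from a vector of feature counts."""
    counts = np.asarray(counts)
    # Expand the counts into one feature label per read, then subsample the reads.
    reads = np.repeat(np.arange(counts.size), counts)
    picked = rng.choice(reads, size=depth, replace=False)
    return np.bincount(picked, minlength=counts.size)

def shannon(counts):
    """Shannon entropy (log base 2, an assumption here) over non-zero features."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical sample: 14 features, 1000 reads in total.
sample = np.array([500, 300, 100, 50, 20, 10, 5, 5, 3, 2, 2, 1, 1, 1])
depth = 200  # same rarefaction depth for every iteration

rng = np.random.default_rng(0)
subsamples = [rarefy(sample, depth, rng) for _ in range(10)]
estimates = [shannon(s) for s in subsamples]

for i, value in enumerate(estimates, start=1):
    print(f"iteration {i}: shannon = {value:.4f}")
```

Every iteration uses the same depth, yet the printed values differ slightly because rare features are present in some subsamples and absent from others.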

These results should more or less converge as rarefaction depth increases, but some variation may remain. At low rarefaction depths, high variation is expected. Look at the rarefaction curve to determine what low and high subsampling depths are for your samples. Choose the highest possible subsampling depth so that the alpha diversity results become more stable and more representative of the true diversity of those communities.
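The effect of depth on stability can be sketched the same way. The snippet below is an illustration under assumed data, not QIIME 2's implementation: it repeats the subsampling at a low, a high, and the full depth of a hypothetical sample, and reports the spread (max minus min) of the Shannon estimates at each depth.

```python
import numpy as np

def rarefied_shannon(counts, depth, rng):
    """Subsample `depth` reads without replacement and return Shannon entropy (base 2)."""
    reads = np.repeat(np.arange(counts.size), counts)
    picked = rng.choice(reads, size=depth, replace=False)
    sub = np.bincount(picked, minlength=counts.size).astype(float)
    p = sub[sub > 0] / sub.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical sample: 14 features, 1000 reads in total.
sample = np.array([500, 300, 100, 50, 20, 10, 5, 5, 3, 2, 2, 1, 1, 1])
total = int(sample.sum())

rng = np.random.default_rng(1)
spread = {}
for depth in (50, 900, total):
    values = [rarefied_shannon(sample, depth, rng) for _ in range(50)]
    # Range of the estimates across iterations at this depth.
    spread[depth] = max(values) - min(values)
    print(f"depth {depth:4d}: spread of shannon estimates = {spread[depth]:.4f}")
```

In this toy setup the spread shrinks as the depth approaches the total read count, and it is zero when every read is kept, since nothing random is left to vary.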

Incidentally, this is the goal behind alpha rarefaction: to determine at what subsampling depth alpha diversity results saturate and stabilize, so that accurate estimates may be achieved!


Thank you so much for your explanation!
