Interpretation of rarefaction curve

Hi everyone,

I have been working with QIIME 2 since last month, and I’m trying to understand the concept of rarefying and how it can affect my downstream analysis. A quick background: I have sequences generated from fungi that were colonizing plant roots. These plant species are known to be poorly colonized by fungi (low fungal biomass in my samples), so I did not expect much diversity.

I realized that for one plant species (samples Sa3 and Sa30) the maximum sampling depth was only 4000, whereas for the other species it was 7000 (Sa27) and 11000 (Sa1). When I first saw this, my first thought was that I cannot use the dataset as it is in the downstream diversity analyses to compare both species, because of the differences in sampling depth. Am I thinking about this the right way?

After that, I rarefied to 6000, and I still have enough replicates of each species for comparison.
So, would this be the correct dataset to use for the downstream analysis?

Thank you so much for your attention!

I read a manuscript in which the authors rarefied a root fungal dataset to 800 reads to retain most of the samples. Your sequencing depth looks good enough to proceed with the analysis. Rarefying puts all samples at the same depth, which lets you compare diversity between them. I would also consider going even lower, to 4000, to keep all the samples.
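
If it helps to see the trade-off, here is a minimal sketch in plain Python. It is not QIIME 2's own implementation, just an illustration of the idea. The read totals loosely follow the numbers in the question, but the grouping of samples into `species_A` and `species_B`, and the helper names `samples_retained` and `rarefy`, are assumptions made up for this example.

```python
import random
from collections import Counter


def samples_retained(depths, groups, target):
    """Count, per group, the samples that have at least `target` reads."""
    kept = {group: 0 for group in set(groups.values())}
    for sample, total in depths.items():
        if total >= target:
            kept[groups[sample]] += 1
    return kept


def rarefy(counts, target, seed=0):
    """Subsample `target` reads without replacement from a feature -> count dict.

    Returns None when the sample has fewer than `target` reads, which
    mirrors the sample being dropped at that depth.
    """
    if sum(counts.values()) < target:
        return None
    # Expand the counts into one entry per read, then draw without replacement.
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, target)
    return dict(Counter(picked))


# Hypothetical per-sample read totals and group labels (assumed, not real data).
depths = {"Sa1": 11000, "Sa27": 7000, "Sa3": 4000, "Sa30": 4000}
groups = {"Sa1": "species_A", "Sa27": "species_A",
          "Sa3": "species_B", "Sa30": "species_B"}

for target in (4000, 6000):
    print(target, samples_retained(depths, groups, target))
```

With these made-up numbers, a depth of 6000 drops every sample that has only 4000 reads, while a depth of 4000 keeps all four. Running the same kind of count on your real per-sample totals shows how many replicates each species keeps at a given depth.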


Thank you for your reply, @timanix.
If you could share the paper you mentioned, I would appreciate it.


Unfortunately, I read that paper quite a long time ago, and the only thing I remember is the rarefaction depth.
Here is another one, but on the mouse gut microbiome. The fungal dataset was rarefied to 889 reads.
