I am confused about rarefaction. Why do we need to rarefy, and what is the benefit? Also, if we subsample, is there a command to reproduce the subsampling results, or do we just check the subsampling results from rarefied_table.qza? Besides, when I run the qiime diversity core-metrics command, the chao1 diversity measurement is not included in the output. How can I get the chao1 measurement? Is the qiime diversity alpha command based on the subsampled data?
Thank you so much!
Even subsampling is necessary for alpha diversity estimation because we need to control for uneven sampling depth. For example, imagine that we have 2 different soil samples and want to determine which one contains more unique species. Sample 1 has 1000 sequence reads representing a total of 100 unique species; sample 2 has 100 sequence reads representing a total of 50 unique species. Which one has more unique species? Without rarefying, sample 1 appears to have more, but part of that difference may simply reflect its tenfold greater sequencing depth. After rarefying both samples to the same depth (here, 100 reads), sample 2 could well turn out to have more.
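To make the example concrete, here is a minimal Python sketch of it. The read composition of both samples is invented for illustration (the post only gives read and species totals), and the `rarefy` helper is a hypothetical stand-in, not QIIME 2's implementation. Sample 1 is assumed to be dominated by one species, which is the situation where rarefying reverses the ranking; with a more even sample 1 the ranking might not change.

```python
import random

def rarefy(reads, depth, seed=0):
    """Draw `depth` reads without replacement (hypothetical helper, not QIIME 2 code)."""
    if depth > len(reads):
        raise ValueError("sampling depth exceeds the number of reads")
    return random.Random(seed).sample(reads, depth)

# Assumed composition of sample 1: 1000 reads, 100 species,
# one dominant species (901 reads) plus 99 species seen once each.
sample1 = [0] * 901 + list(range(1, 100))
# Assumed composition of sample 2: 100 reads, 50 species, 2 reads each.
sample2 = [species for species in range(50) for _ in range(2)]

# Observed richness before rarefying.
raw1, raw2 = len(set(sample1)), len(set(sample2))

# Rarefy both samples to the depth of the shallower one, then recount.
depth = min(len(sample1), len(sample2))
rar1 = len(set(rarefy(sample1, depth)))
rar2 = len(set(rarefy(sample2, depth)))

print(f"raw richness:      sample1={raw1}, sample2={raw2}")
print(f"rarefied richness: sample1={rar1}, sample2={rar2} (depth={depth})")
```

Before rarefying, sample 1 looks richer; at equal depth most of its rare species are not drawn, so sample 2 comes out ahead under these assumed abundances.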
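You should provide alpha with a rarefied feature table. Running feature-table rarefy at the same sampling depth, then alpha on the rarefied table, follows the same procedure core-metrics uses for its alpha diversity results (the values can differ slightly, since each rarefaction is a new random subsample).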
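The rarefy-then-alpha order can be sketched in plain Python as below. This is only an illustration of the logic, not QIIME 2's code: `rarefy_counts` and `chao1` are hypothetical helpers, the counts are made up, and the formula is the bias-corrected form of Chao1, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of features seen exactly once and twice. In QIIME 2 itself, my understanding is that chao1 is one of the metrics the alpha command accepts, but please check the plugin help for your release.

```python
import random
from collections import Counter

def rarefy_counts(counts, depth, seed=0):
    """Rarefy a {feature: count} dict to `depth` reads without replacement."""
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sampling depth exceeds the number of reads")
    return dict(Counter(random.Random(seed).sample(pool, depth)))

def chao1(counts):
    """Bias-corrected Chao1 richness estimate for one sample."""
    observed = [n for n in counts.values() if n > 0]
    s_obs = len(observed)                       # features actually seen
    f1 = sum(1 for n in observed if n == 1)     # singletons
    f2 = sum(1 for n in observed if n == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Made-up feature counts for a single sample.
sample = {"otu_a": 40, "otu_b": 25, "otu_c": 10, "otu_d": 3, "otu_e": 1, "otu_f": 1}

# Step 1: rarefy to a fixed depth. Step 2: compute the metric on the rarefied counts.
rarefied = rarefy_counts(sample, depth=50)
print("chao1 on rarefied counts:", chao1(rarefied))
```

Changing `seed` gives a different random subsample and possibly a different estimate, which is why reusing one rarefied table gives repeatable numbers while re-rarefying may not.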
@pumpkin, a quick follow-up on @Nicholas_Bokulich's answer: as of QIIME 2 2017.9, both core-metrics and core-metrics-phylogenetic return the rarefied feature table. If you want to use the same rarefied table that was used to compute the values returned by core-metrics/core-metrics-phylogenetic, you can just use that artifact instead of recomputing the rarefied table. Either approach is good, and comparing the two is probably worthwhile as part of your analysis. I just wanted to let you know there are a few ways to accomplish this goal! Thanks for using QIIME 2!