Hi there,
I used QIIME 2 (qiime2-2019.4) in my research. By chance, I ran “qiime diversity core-metrics-phylogenetic” several times on the same dataset but could not get the same rarefied tables. Is it true that every run of “qiime diversity core-metrics-phylogenetic” produces slightly different rarefied tables because of the random permutation process in it? If so, is it possible to fix the random permutation and get reproducible results? Thank you!
I'm not sure how to do that within QIIME 2, but it's an interesting idea!
(I know you can do this in phyloseq, or avoid the issue entirely by using jackknifed PCoA.)
Not yet possible in QIIME 2. We have thrown around the idea of exposing a parameter for setting a random seed to make this reproducible, but I think it is a very low priority. In general, if you rarefy at a reasonably high depth, your results should converge fairly well at that sequencing depth and should not vary much between runs. If you rarefy twice and get vastly different results, it is a pretty good indication that you need to sample at a higher depth!
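For anyone curious why a seed would make rarefaction repeatable, here is a minimal sketch in plain NumPy. It is not QIIME 2's implementation, and QIIME 2 does not expose such a parameter as of this thread; the `rarefy` helper, its `seed` argument, and the toy counts are all made up for illustration.

```python
import numpy as np


def rarefy(counts, depth, seed=None):
    """Subsample one sample's feature counts to `depth` reads, without replacement.

    Hypothetical helper for illustration only; not part of QIIME 2.
    """
    counts = np.asarray(counts, dtype=int)
    if depth > counts.sum():
        raise ValueError("depth exceeds the sample's total read count")
    # A fixed seed gives the same random draws on every call.
    rng = np.random.default_rng(seed)
    # Expand the counts into one entry per read, labelled by feature index.
    reads = np.repeat(np.arange(counts.size), counts)
    picked = rng.choice(reads, size=depth, replace=False)
    # Collapse the picked reads back into per-feature counts.
    return np.bincount(picked, minlength=counts.size)


sample = [500, 300, 150, 40, 10]    # toy counts for five features
a = rarefy(sample, 200, seed=42)
b = rarefy(sample, 200, seed=42)    # same seed: identical to `a`
c = rarefy(sample, 200)             # no seed: may differ from run to run
print(a, b, c)
```

With the same seed the two rarefied vectors match exactly. Without one, each call draws a different subsample, which is the run-to-run variation described above.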
@Nicholas_Bokulich Thank you for your reply. I just wonder whether in some situations we may need such reproducibility, for example in a workshop or when comparing the results from two students. Just a little suggestion for your consideration. Is it possible to add the seed parameter in the next version?
No, it is not currently planned for the upcoming 2019.10 release. As @Nicholas_Bokulich said, this is very low priority for us at the moment. Regarding the "workshop" example --- we teach QIIME 2 workshops very often, and don't see this being an issue with the tutorial datasets we use. In fact, the small bit of variability between different runs is a useful talking point for emphasizing what rarefaction is!