A couple of members of my lab and I have been getting very different results when using qiime2 versus phyloseq for calculating weighted UniFrac values. It seems that the default parameters used by these tools (qiime diversity core-metrics-phylogenetic versus phyloseq::distance) generate very different weighted UniFrac values, even when pre-rarefying the dataset (since qiime2 automatically rarefies). Maybe it's just something with the defaults that's causing the difference, but I can't tell what that is. I've tried phyloseq::UniFrac(weighted=TRUE, normalized=FALSE), but the resulting values are still different from qiime2. Maybe it's how phyloseq deals with the root, but I'm not sure about that either. I've been using rooted phylogenies, so the same root should be used for phyloseq and qiime2. I've attached a compressed Jupyter notebook file showing a reproducible example with the GlobalPatterns dataset from phyloseq. The version of qiime2 is a bit old, but I'm guessing the weighted UniFrac algorithm hasn't changed in more recent versions.
I've also attached a figure showing histograms of beta-diversity values calculated on GlobalPatterns (rarefied to the minimum depth of any sample) with qiime diversity core-metrics-phylogenetic versus phyloseq::distance.
I think it would be good to make sure researchers know that they will get very different weighted UniFrac values depending on whether they use the default methods for qiime2 or phyloseq. Moreover, I'm not sure which method is "correct". From processing my own datasets with both methods, it appears that qiime2 generates much more reasonable values than phyloseq.
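To make the normalized switch mentioned above concrete, here is a minimal Python sketch of the textbook weighted UniFrac definition on a made-up four-tip rooted tree. It is not the code of either phyloseq or qiime2. The tree, the sample counts, and the function names (branch_proportions, root_distance, weighted_unifrac) are all hypothetical, and it only shows how the raw and normalized values relate for one pair of samples.

```python
# Toy rooted tree (hypothetical): node -> (parent, branch length to parent).
# The root itself has no entry.
TREE = {
    "n1": ("root", 1.0),
    "n2": ("root", 2.0),
    "t1": ("n1", 1.0),
    "t2": ("n1", 1.0),
    "t3": ("n2", 0.5),
    "t4": ("n2", 0.5),
}


def branch_proportions(counts):
    """Fraction of a sample's reads that descend from each branch."""
    total = sum(counts.values())
    props = {node: 0.0 for node in TREE}
    for tip, n in counts.items():
        node = tip
        while node != "root":
            props[node] += n / total
            node = TREE[node][0]
    return props


def root_distance(tip):
    """Sum of branch lengths on the path from a tip up to the root."""
    dist, node = 0.0, tip
    while node != "root":
        parent, length = TREE[node]
        dist += length
        node = parent
    return dist


def weighted_unifrac(a, b, normalized=False):
    """Raw value: sum over branches of length * |pA - pB|.
    Normalized value: raw divided by the abundance-weighted mean
    root-to-tip distance of the two samples."""
    pa, pb = branch_proportions(a), branch_proportions(b)
    raw = sum(length * abs(pa[node] - pb[node])
              for node, (_, length) in TREE.items())
    if not normalized:
        return raw
    total_a, total_b = sum(a.values()), sum(b.values())
    scale = sum(root_distance(tip) * (a.get(tip, 0) / total_a + b.get(tip, 0) / total_b)
                for tip in set(a) | set(b))
    return raw / scale


# Made-up counts for two samples that share no tips
sample_a = {"t1": 10, "t2": 10}
sample_b = {"t3": 5, "t4": 15}

print(weighted_unifrac(sample_a, sample_b, normalized=False))
print(weighted_unifrac(sample_a, sample_b, normalized=True))
```

The raw value is not bounded by 1, while the normalized value is, so histograms of raw values from one tool and normalized values from another would look different even if both implementations were correct.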
Thanks for the notebook, that is wonderful! I haven’t looked too closely yet, but is it possible that we’re just seeing the result of rarefying twice? Granted, I wouldn’t expect weighted UniFrac to look that different between different rarefactions, but that’s the most obvious place to look initially.
If we were to take the rarefied table from core-metrics-phylogenetic and use that in phyloseq (skipping phyloseq’s rarefaction), do the distributions line up again?
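One way to check whether the two tools line up is to compare the same sample pairs directly rather than eyeballing histograms. The sketch below assumes both distance matrices have already been loaded as nested dicts keyed by sample ID. The numbers are made up and the helper names (paired_distances, pearson) are hypothetical.

```python
from itertools import combinations
from math import sqrt


def paired_distances(dm_a, dm_b):
    """Matched (a, b) distances for every sample pair present in both matrices."""
    shared = sorted(set(dm_a) & set(dm_b))
    return [(dm_a[s][t], dm_b[s][t]) for s, t in combinations(shared, 2)]


def pearson(pairs):
    """Plain Pearson correlation between the two columns of `pairs`."""
    xs, ys = zip(*pairs)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up distances for three samples, purely illustrative
qiime2_dm = {
    "S1": {"S1": 0.0, "S2": 0.2, "S3": 0.5},
    "S2": {"S1": 0.2, "S2": 0.0, "S3": 0.4},
    "S3": {"S1": 0.5, "S2": 0.4, "S3": 0.0},
}
phyloseq_dm = {
    "S1": {"S1": 0.0, "S2": 0.4, "S3": 1.0},
    "S2": {"S1": 0.4, "S2": 0.0, "S3": 0.8},
    "S3": {"S1": 1.0, "S2": 0.8, "S3": 0.0},
}

pairs = paired_distances(qiime2_dm, phyloseq_dm)
max_abs_diff = max(abs(a - b) for a, b in pairs)
print("largest absolute difference:", max_abs_diff)
print("Pearson correlation:", pearson(pairs))
```

A high correlation together with a large absolute difference would point toward a scaling or normalization difference, whereas a low correlation would suggest the two tools are computing something substantively different.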
I’m confused by “rarefying twice”. If the rarefying depth for qiime diversity core-metrics-phylogenetic is set to the same value as the table that’s already rarefied, shouldn’t that essentially be the original counts? Rarefying in qiime2 by default is without replacement, correct?
I apologize, I missed that your notebook wrote out the rarefied table and used that with QIIME 2. You are correct: rarefying twice at the same depth doesn't change anything, since we rarefy without replacement.
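A small illustration of why this is a no-op: subsampling without replacement to a depth equal to the sample's total has to pick every read. This is a generic standard-library sketch, not QIIME 2's rarefaction code, and the rarefy helper and the counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=None):
    """Subsample `depth` reads without replacement from a {feature: count} sample."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the counts into one label per read, then draw without replacement
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))


sample = {"otu_a": 5, "otu_b": 3, "otu_c": 2}  # made-up counts, 10 reads total

once = rarefy(sample, depth=6, seed=1)   # a real subsample
twice = rarefy(once, depth=6, seed=2)    # same depth again, different seed

print(once == twice)
print(rarefy(sample, depth=10, seed=3) == sample)
```

Both comparisons hold regardless of the seed, because drawing all of a sample's reads without replacement can only return the original counts.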
No worries! I’m worried about the differences between phyloseq and qiime2, at least for their defaults. Both methods are used by members of my lab, so they may be getting very different results just because they are using phyloseq versus qiime2.
@nick-youngblut, I’m not sure of the source of the difference but thank you for flagging it. Weighted UniFrac is deterministic, and the implementations of UniFrac used by QIIME 2 are validated against the original implementation of UniFrac by Cathy Lozupone from PyCogent (unit tests for the version of UniFrac being used by QIIME 2 by default are here). Has anyone in your team followed up with the phyloseq developers about the difference?