Presence-absence distance method

Nicholas_Bokulich · November 6, 2018, 1:52am

You are not wrong — I don't think there is anything wrong with calculating Bray-Curtis dissimilarity on a presence-absence matrix. It's just that it becomes quite similar to Jaccard, so you may as well use Jaccard.

Oh cool, I did not know that one. Jaccard is close, but Sorensen is exactly Bray-Curtis computed on binary data! QIIME 2 does have the Sorensen index, listed under its other name, the Dice index.
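To make the equivalence concrete, here is a minimal sketch in plain Python. The sample vectors and function names are made up for illustration, and this is not how QIIME 2 computes these metrics internally. It only checks the textbook formulas against each other on presence-absence data.

```python
import math

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

def sorensen_dice(u, v):
    """Sorensen-Dice dissimilarity: 1 - 2*shared / (n_u + n_v)."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    n_u = sum(1 for a in u if a)
    n_v = sum(1 for b in v if b)
    return 1 - 2 * shared / (n_u + n_v)

def jaccard(u, v):
    """Jaccard dissimilarity: 1 - shared / union."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    union = sum(1 for a, b in zip(u, v) if a or b)
    return 1 - shared / union

# Hypothetical samples, already binary (1 = feature observed, 0 = absent).
sample_a = [1, 1, 1, 0, 0, 1]
sample_b = [1, 0, 1, 1, 0, 0]

bc = bray_curtis(sample_a, sample_b)
sd = sorensen_dice(sample_a, sample_b)
jc = jaccard(sample_a, sample_b)

# Bray-Curtis on binary data matches Sorensen-Dice.
print(math.isclose(bc, sd))  # True
# Jaccard differs, but is a monotonic transform of it: J = 2*BC / (1 + BC).
print(math.isclose(jc, 2 * bc / (1 + bc)))  # True
```

Because Jaccard is a monotonic function of binary Bray-Curtis, the two rank sample pairs identically, which is why they look "close" in practice even though the values differ.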

Yes, that is the point of the semantic types. And we did what we thought made things easiest/simplest: every beta diversity metric in QIIME 2 accepts only frequency tables, and the qualitative metrics convert these to presence-absence data internally. So whether you are computing Jaccard or Bray-Curtis, you input the same data and QIIME 2 does the rest. There is no need to convert your data to multiple formats, and no need to know which format pairs with which method in order to use these methods correctly.
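As a rough illustration of that design, the sketch below takes the same count vectors for every metric and binarizes them only when the metric is qualitative. The `beta` function, the `METRICS` table, and the sample counts are all hypothetical names invented for this example. They are not the QIIME 2 API or its implementation.

```python
import math

def to_presence_absence(counts):
    """Convert a vector of frequencies to 0/1 presence-absence."""
    return [1 if c > 0 else 0 for c in counts]

def jaccard(u, v):
    """Jaccard dissimilarity on presence-absence vectors."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    union = sum(1 for a, b in zip(u, v) if a or b)
    return 1 - shared / union

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity on frequency vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

# Hypothetical registry: metric name -> (function, is_qualitative).
METRICS = {
    "jaccard": (jaccard, True),
    "braycurtis": (bray_curtis, False),
}

def beta(u, v, metric):
    """Always takes frequencies; binarizes only for qualitative metrics."""
    func, qualitative = METRICS[metric]
    if qualitative:
        u, v = to_presence_absence(u), to_presence_absence(v)
    return func(u, v)

# The same hypothetical frequency data goes into both metrics.
counts_a = [10, 0, 3, 7]
counts_b = [2, 5, 0, 7]

jaccard_result = beta(counts_a, counts_b, "jaccard")
braycurtis_result = beta(counts_a, counts_b, "braycurtis")
print(jaccard_result, braycurtis_result)
```

The caller never has to binarize anything: the choice of metric decides whether abundances are used or discarded.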

But please let us know how we could make this simpler/easier. We are always open to contributions, and if you think that improved documentation would fix this, then let's talk.

I have mixed feelings on this. On the one hand, it makes it easier for users to learn about these methods, and resources like that forum post are obviously not very visible. On the other hand, all of this information is easily available online and on Wikipedia. The brief summaries are nice, but can only transmit so much information, certainly not enough to "teach [users] about which test is best for their project". Which boils down to one thing (you called it): this is more time we developers need to spend doing busywork, and less time we get to work on other features and on documentation that is not googleable. What about just linking to something like that forum post from within the method description?

You are right, that error is counterintuitive in this case. We could let the beta method accept a FeatureTable[PresenceAbsence] artifact. That would fix this error, but would then allow users to run metrics that may be inappropriate on binary data (inappropriate may not be the correct word, but you see my point: the sword cuts both ways, doesn't it!)

A better error message would be useful, but this message comes from the QIIME 2 framework, which performs type validation, not from the plugin. Improving the plugin description to indicate that QIIME 2 does the heavy lifting, and that converting to binary is not necessary, could be a better solution in my view. I can raise an issue if you agree. This would be a great first contribution to QIIME 2 if you are interested.