The flipside of the A-side: after-hour ramblings on life, the universe and everything

Theory papers almost never make it into top journals, which is why I have blogged about the paper ‘Detecting Novel Associations in Large Data Sets’ by Reshef et al. in Science before (here and here). The reception in the statistics community was mixed: while Terry Speed seemed to love it, Rob Tibshirani started to point out weaknesses. Now other people have joined the discussion.

Justin B Kinney, a physicist who has a quantitative biology lab at CSHL, writes:

Mickey Atwal and I recently posted a preprint that challenges the primary claims of this [Reshef’s] article.

In essence we find that Reshef et al.’s claim that MIC has an “equitability” property while mutual information does not is incorrect. We provide mathematical proof of this, and also validate our findings by rerunning Reshef et al.’s simulations.

Thus, there appears to be no motivation for using MIC rather than simply estimating mutual information.

Frankly, as far as we can tell, MIC is just a messed up estimate of mutual information.
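To make "simply estimating mutual information" a bit more concrete, here is a minimal sketch of the most naive estimator I can think of: bin both variables into equal-width bins and plug the observed frequencies into the definition of mutual information. This is my own illustration, not the estimator used by Kinney and Atwal or by Reshef et al.; the function name `mutual_information`, the choice of 10 bins, and the toy data (a noisy parabola versus independent noise) are all assumptions made for the example. Plug-in estimates like this are biased upwards for small samples, so treat the numbers as rough.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys, bins=10):
    """Naive plug-in estimate of I(X;Y) in bits, using equal-width binning."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def bin_index(value, lo, hi):
        # Map a value to a bin in 0..bins-1; a constant variable gets one bin.
        if hi == lo:
            return 0
        return min(int((value - lo) / (hi - lo) * bins), bins - 1)

    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    pairs = [
        (bin_index(x, x_lo, x_hi), bin_index(y, y_lo, y_hi))
        for x, y in zip(xs, ys)
    ]

    n = len(pairs)
    joint = Counter(pairs)               # counts for each (x bin, y bin) cell
    px = Counter(i for i, _ in pairs)    # marginal counts for x bins
    py = Counter(j for _, j in pairs)    # marginal counts for y bins

    # Sum over occupied cells of p(i,j) * log2( p(i,j) / (p(i) * p(j)) ).
    mi = 0.0
    for (i, j), count in joint.items():
        mi += (count / n) * math.log2(count * n / (px[i] * py[j]))
    return mi


# Toy data (hypothetical): a noisy parabola versus an unrelated variable.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
y_dep = [v * v + rng.gauss(0.0, 0.05) for v in x]
y_ind = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

mi_dep = mutual_information(x, y_dep)
mi_ind = mutual_information(x, y_ind)
print(f"noisy parabola: {mi_dep:.3f} bits, independent: {mi_ind:.3f} bits")
```

The parabola is a relationship that a plain correlation coefficient would miss, while the binned estimate should come out clearly above the independent case. The bin count matters a lot in practice, which is one reason more careful estimators exist.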

If you are interested, Justin Kinney’s preprint and more material by Reshef et al. are available on arXiv.

We discuss the notion of equitability as a desirable heuristic property, as underscored by our use of words like “roughly equal” and “similar” instead of “equal” when discussing it. Philosophically, we have been using equitability as an approximate property (..)

In my own work I am pragmatic: I just use whatever works and does the job. But when it comes to mathematical statements, phrases like “roughly equal”, “desirable heuristic property”, and “approximate property” seem to be a way to weasel out of the discussion… I’m not impressed at all by Mitzenmacher’s response.

This article is great! I am working with different information theory measures (entropy, mutual information, Jensen-Rényi Divergence) to drive a dynamic programming adaptive algorithm and I was intrigued when I found the MIC. Thanks to your article I’m convinced it’s just a messed up estimate of MI.