Monthly Archives: September 2008

A few posts ago, I told you how amazingly simple it turned out to be to sample independent sets with PyMC. Remember when I said that it was working a little differently than I expected, though? I sent an email to the pymc-users mailing list, and, in just a few days, one of the developers, Anand Patil, replied to say that there was a little typo in their code which was making the chain reject with the probability it was supposed to accept with. (I’m realizing that it is hard to make a story about debugging Python code sound exciting, so let’s skip the buildup and cut to the thrilling conclusion.) Anand fixed the bug, which required changing one word, but also required finding that one word in the right 1200-line file.

Some of the folks I corresponded with from the PyMC list didn’t know what I was talking about with this sampling independent sets stuff, so I thought I’d expand a little bit on it now, as an attempt at gratitude.
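For anyone who wants the flavor of it in code, here is a minimal sketch of a Metropolis chain over independent sets, written in plain Python with only the standard library. It is not PyMC's code and does not use PyMC's API. The function names (`is_independent`, `metropolis_step`, `sample_independent_set`), the dict-of-neighbor-sets graph format, and the weight parameter `lam` are all hypothetical choices made for illustration. The assumed target distribution gives an independent set S weight lam^|S|, so lam = 1 is the uniform distribution over independent sets.

```python
import random


def is_independent(graph, nodes):
    """Return True if no two vertices in `nodes` are adjacent in `graph`.

    `graph` is assumed to be a dict mapping each vertex to the set of
    its neighbors (an illustrative format, not a PyMC data structure).
    """
    return all(v not in nodes for u in nodes for v in graph[u])


def metropolis_step(graph, vertices, current, lam, rng):
    """Propose toggling one uniformly random vertex, then accept or reject."""
    v = rng.choice(vertices)
    if v in current:
        # Removing v multiplies the weight lam**|S| by 1/lam.
        proposal = current - {v}
        accept_prob = min(1.0, 1.0 / lam)
    else:
        if any(u in current for u in graph[v]):
            # Adding v would break independence: weight 0, always reject.
            return current
        # Adding v multiplies the weight by lam.
        proposal = current | {v}
        accept_prob = min(1.0, lam)
    # Accept with probability accept_prob. Swapping the two outcomes here
    # would make the chain reject with the probability it should accept with.
    if rng.random() < accept_prob:
        return proposal
    return current


def sample_independent_set(graph, lam=1.0, steps=10000, seed=0):
    """Run the chain for `steps` steps from the empty set; return the final state."""
    rng = random.Random(seed)
    vertices = sorted(graph)
    current = frozenset()
    for _ in range(steps):
        current = metropolis_step(graph, vertices, current, lam, rng)
    return current


if __name__ == "__main__":
    # A 4-cycle: 0 - 1 - 2 - 3 - 0
    cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
    print(sorted(sample_independent_set(cycle, lam=1.0, steps=2000, seed=1)))
```

Because the proposal (pick a vertex, flip its membership) is symmetric, the acceptance probability only needs the ratio of target weights, which is why a single flipped comparison is enough to send the chain to the wrong distribution.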

If you’ve read some of my previous posts, you might be wondering, what does Health Metrics have to do with sampling independent sets in graphs? (What is Health Metrics? you might also be wondering.)

In my new job, I’m not that interested in sampling independent sets. I’m mostly interested in sampling from a weird distribution that comes out of a Bayesian denoising problem.

Let me set the stage: a huge project that IHME is part of is the Global Burden of Disease study, which will (among other things) rank 200 diseases and injuries according to how severely they impact humanity. How you could possibly, ethically make this list is the topic of many books, and I won’t try to get into it now. IHME director Chris Murray seems to favor measuring impact in “disability-adjusted life years” (DALYs), which is a fairly individualistic, fairly egalitarian approach.

As I was saying in my last post, I’ve been getting interested in actually running Markov Chain Monte Carlo algorithms, instead of trying to prove things about their asymptotic performance. It seems like the “stats” way to do this is to use R and WinBUGS, but I’ve always thought that R programming looks messy. Python is easier on my eyes.

So, it’s my good fortune that PyMC exists. This means I don’t need to do any hard work, like coding a Gibbs sampler or learning R. Let me show you.

My new job is in a den of Bayesians! This sort of philosophical trouble is something I avoided for years when I worked on random graphs. In combinatorial probability, I just said “assume the axioms of probability” and got to look for all the interesting facts that follow logically. People want these probability calculations to say something about the “real world”? That’s not my thing; it’s up to them to go from math to science. Well, now it is my problem.

I’m going to join the world of blogging to share all the interesting applications of computer science theory to global health metrics. It turns out that there are plenty; certainly more than I can take on alone. Stay tuned.