Hurrah for Free/Open Software like PyMC

A few posts ago, when I told you how amazingly simple it turned out to be to sample independent sets with PyMC.  Remember when I said that it was working a little differently than I expected, though?  I sent an email to the pymc-users mailing list, and, in just a few days, one of the developers, Anand Patil, replied to say that there was a little typo in their code which was making the chain reject with the probability it was supposed to accept with.  (I’m realizing that it is hard to make a story about debugging python code sound exciting, so let’s skip the build up and cut to the thrilling conclusion.)  Anand fixed the bug, which required changing one word, but also required finding that one word in the right 1200-line file.

Some of the folks I corresponded with from the PyMC list didn’t know what I was talking about with this sampling independent sets stuff, so I thought I’d expand a little bit on it now, as a attempt at gratitude.

There is a code snippet in my previous post which makes a Markov chain on independent sets in a graph. Let me show you a small example of what I mean by that. For K_2, the complete graph on 2 vertices, there are 3 independent sets; \{\}, \{v_0\}, and \{v_1\}. The set of all vertices \{v_0,v_1\} is not an independent set, because there is an edge between v_0 and v_2.

Now, if you look back at how I set up the potential in my IndepSetMC class, you’ll see that every vertex in the set contributes a factor of \beta to the probability that it appears (unless one of its neighbors also appears; then it contributes a factor of 0.)

def potential_logp(v, N_v):
    if v + max(N_v) > 1:
        return -Inf
        return self.beta*v

For the simplest example, look at K_1, the “complete” graph on one vertex, the two independent sets are \emptyset and \{v_0\}, and \Pr[S = \{v_0\}] = \frac{e^\beta}{1+e^\beta}. And, thanks to Anand’s bug fix, this is indeed what comes out.

For another simple example, but one where interesting things are already starting to happen, try K_2, the graph I tried to draw above, where there are 3 independent sets. Here \Pr[S = \{v_0\}] = \frac{e^\beta}{1+2e^\beta}, and this is what comes out when \beta is small. But as \beta grows, the observed probability that, for example, v_0 is part of the independent set diverges further and further from the stationary probability. That doesn’t mean anything is wrong with the code. It is a limitation of MCMC. Chains which sample large independent sets have to mix slowly, or else P=NP.

Comments Off on Hurrah for Free/Open Software like PyMC

Filed under MCMC

Comments are closed.