I was looking for a distraction earlier this week, which led me to the world of stackexchange sites. The stack overflow has been on my radar for a while now, because web-search for coding questions often leads there and the answers are often good. And I knew that math overflow, tcs overflow, and even stats overflow existed, but I’d never really explored these things.
Well, diversion found! I got enamored with an MCMC question on the tcs site, about how to find random bounded-depth spanning trees. Bounded-depth spanning trees are something that I worked on with David Wilson and Riccardo Zechinna in my waning days of my MSR post-doc, and we came up with some nice results, but the theoretical ones are just theory, and the practical ones are based on message passing algorithms that still seem magical to me, even after hours of patient explanation from my collaborators.
So let’s do it in PyMC… this amounts to an exercise in writing custom step methods, something that’s been on my mind for a while anyways. And, as a bonus, I got to make an animation of the chain in action which I find incredibly soothing to watch on repeat:
The first thing to do is wrap a networkx graph in a PyMC stochastic, and put some penalty on vertices being too far from the root:
def BDST(G, root=0, k=5, beta=1.): T = nx.minimum_spanning_tree(G) T.base_graph = G @mc.stoch(dtype=nx.Graph) def bdst(value=T, root=root, k=k, beta=beta): path_len = pl.array(nx.shortest_path_length(value, root).values()) return -beta * pl.sum(path_len > k) return bdst
Easy, right? The parameter
beta is an inverse temperature, according to statistical physicists, and when it gets large the chain might freeze. Speaking of the chain, the important part of this exercise is in the step method, which must be customized to move from spanning tree to spanning tree on the base graph. The stackexchange question poser, Arman, tried a step method that chooses a random edge not in the tree, chooses a random edge on the cycle that adding this edge to the tree would create, and swaps them if they don’t violate the depth-bound constraint. Since I’ve extended this approach to have soft constraints, I’ll propose swapping them and sometimes reject the proposal. This requires only overwriting methods for
reject in the
class BDSTMetropolis(mc.Metropolis): def __init__(self, stochastic): mc.Metropolis.__init__(self, stochastic, scale=1., proposal_sd='custom', proposal_distribution='custom', verbose=None, tally=False) def propose(self): T = self.stochastic.value T.u_new, T.v_new = T.edges() while T.has_edge(T.u_new, T.v_new): T.u_new, T.v_new = random.choice(T.base_graph.edges()) T.path = nx.shortest_path(T, T.u_new, T.v_new) i = random.randrange(len(T.path)-1) T.u_old, T.v_old = T.path[i], T.path[i+1] T.remove_edge(T.u_old, T.v_old) T.add_edge(T.u_new, T.v_new) self.stochastic.value = T def reject(self): T = self.stochastic.value T.add_edge(T.u_old, T.v_old) T.remove_edge(T.u_new, T.v_new) self.stochastic.value = T
That’s it. To use it, make a chain and sample from it, while turning up the inverse heat. Will it work? That’s a research question, not a research distraction. It works fine on small grid graphs, though.
beta = mc.Uninformative('beta', value=1.) G = nx.grid_graph([11, 11]) root = (5,5) bdst = BDST(G, root, 10, beta) mod_mc = mc.MCMC([beta, bdst]) mod_mc.use_step_method(BDSTMetropolis, bdst) mod_mc.use_step_method(mc.NoStepper, beta) for i in range(5): beta.value = i*5 mod_mc.sample(1000) print 'cur depth', max_depth.value
Here’s an exercise for the reader, if you’re interested in designing a StepMethod like this of your own. The message passing algorithm we settled on in Clustering with shallow trees is more analogous to a walk on in-arborsences, i.e. rooted, directed trees with every edge pointing towards the root. The edge swaps are in the opposite order as above: first choose a node and delete its out-edge, which disconnects the arborsence, and then choose a new node for the disconnected node to point to. The benefit of this is that it makes it easier to keep track of the depth of the tree; only the first node and its children change depth. Part two of the exercise, which I think would be too hard to assign to students without an additional example, is to make the steps in out-arborsence chain into a Gibbs sampler. This would definitely be a good idea if you really want to make spanning trees mix on big graphs, though.
Distraction over, back to work. But don’t those spanning trees on grid graphs look like great mazes? I’ve got a distraction for another day.