# Tag Archives: MCMC

## Introducing the H-RAM

I haven’t had time to say much recently, due to some travel for work, but I did have a chance to prototype the Hit-and-Run/Adaptive Metropolis approach to MCMC that I mentioned in my last post.  Was that really three weeks ago?  How time flies.

Anyway, thanks to the tip Aram pointed out in the comments, Hit-and-Run can take steps in the correct direction without explicitly computing an approximation of the covariance matrix, just by taking a randomly weighted sum of random points.  It’s nearly magic, although Aram says that it actually makes plenty of sense. The code for this “H-RAM” looks pretty similar to my original Hit-and-Run step method, and it’s short enough that I’ll just show it to you:

class HRAM(Gibbs):
def __init__(self, stochastic, proposal_sd=None, verbose=None):
Metropolis.__init__(self, stochastic, proposal_sd=proposal_sd,
verbose=verbose, tally=False)
self.proposal_tau = self.proposal_sd**-2.
self.n = 0
self.N = 11
self.value = rnormal(self.stochastic.value, self.proposal_tau, size=tuple([self.N] + list(self.stochastic.value.shape)))

def step(self):
x0 = self.value[self.n]
u = rnormal(zeros(self.N), 1.)
dx = dot(u, self.value)

self.stochastic.value = x0
logp = [self.logp_plus_loglike]
x_prime = [x0]

for direction in [-1, 1]:
for i in xrange(25):
delta = direction*exp(.1*i)*dx
try:
self.stochastic.value = x0 + delta
logp.append(self.logp_plus_loglike)
x_prime.append(x0 + delta)
except ZeroProbability:
self.stochastic.value = x0

i = rcategorical(exp(array(logp) - flib.logsum(logp)))
self.value[self.n] = x_prime[i]
self.stochastic.value = x_prime[i]

if i == 0:
self.rejected += 1
else:
self.accepted += 1

self.n += 1
if self.n == self.N:
self.n = 0


Compare the results to the plain old Hit-and-Run step method:

Filed under MCMC

## MCMC in Python: PyMC Step Methods and their pitfalls

There has been some interesting traffic on the PyMC mailing list lately. It seems that there is a common trouble with the “Adaptive Metropolis” step method, and it’s failure to converge. I’ve been quite impressed with this approach, and I haven’t had the problems that others reported, so I started wondering: Have I been lucky? Have I been not looking carefully?

I decided to do some experiments to make Metropolis and Adaptive Metropolis shine, and since I’m an aspiring math filmmaker these days, I made my experiments into movies.

I consider the Metropolis step method the essence of MCMC. You have a particular point in parameter space, and you tentatively perturb it to a new random point, and then decide to accept or reject this new point with a carefully designed probability, so that the stationary distribution of the Markov chain is something you’re interested in. It’s magic, but it’s good, simple magic.

Here is the simplest example I could come up with, sampling from a uniform distribution on the unit square in the plane using Metropolis steps. Any self-respecting step method can sample from the unit square in two dimensions!

Filed under MCMC, videos

## MCMC in Python: Custom StepMethods and bounded-depth spanning tree distraction

I was looking for a distraction earlier this week, which led me to the world of stackexchange sites. The stack overflow has been on my radar for a while now, because web-search for coding questions often leads there and the answers are often good. And I knew that math overflow, tcs overflow, and even stats overflow existed, but I’d never really explored these things.

Well, diversion found! I got enamored with an MCMC question on the tcs site, about how to find random bounded-depth spanning trees. Bounded-depth spanning trees are something that I worked on with David Wilson and Riccardo Zechinna in my waning days of my MSR post-doc, and we came up with some nice results, but the theoretical ones are just theory, and the practical ones are based on message passing algorithms that still seem magical to me, even after hours of patient explanation from my collaborators.

So let’s do it in PyMC… this amounts to an exercise in writing custom step methods, something that’s been on my mind for a while anyways. And, as a bonus, I got to make an animation of the chain in action which I find incredibly soothing to watch on repeat:

Filed under MCMC, TCS

## World Malaria Report and MCMC

OMG I have got busy. I went to NIPS and the weekend disappeared and now it’s post-doc interview season again, already! So much to say, but I plan to pace myself. For this short post, an exciting announcement that my model of the insecticide treated mosquito net distribution supply chain was used in the WHO 2010 World Malaria Report, which just came out. Since it is a Bayesian statistical model that draws samples from a posterior distribution with MCMC, it’s really nice that the report includes some of the uncertainty intervals around the coverage estimates. Guess what? There is a lot of uncertainty. But nets are getting to households and getting used. Pages 19 and 20 in Chapter 4 have the results of our hard work.

Filed under global health, MCMC

## MCMC in Python: Statistical model stuck on a stochastic system dynamics model in PyMC

My recent tutorial on how to stick a statistical model on a systems dynamics model in PyMC generated a good amount of reader interest, as well as an interesting comment from Anand Patil, who writes:

Something that might interest you is that, in continuous-time stochastic differential equation models, handling the unobserved sample path between observations is really tricky. Roberts and Stramer’s On inference for partially observed nonlinear diffusion models using the Metropolis-Hastings algorithm explains why. This difficulty can apply to discrete-time models with loads of missing data as well. Alexandros Beskos has produced several really cool solutions.

This body of research is quite far from the vocabulary I am familiar with, so I’m not sure how serious a problem this could be for me. It did get me interested in sticking my statistical model to a systems model with stochastic dynamics, though, something which took only a few additional lines… thanks PyMC!

## Stochastic SI model

from pymc import *
from numpy import *

#observed data
T = 10
susceptible_data = array([999,997,996,994,993,992,990,989,986,984])
infected_data = array([1,2,5,6,7,18,19,21,23,25])

# stochastic priors
beta = Uniform('beta', 0., 1., value=.05)
gamma = Uniform('gamma', 0., 1., value=.001)
tau = Normal('tau', mu=.01, tau=100., value=.01)

# stochastic compartmental model
S = {}
I = {}

## uninformative initial conditions
S[0] = Uninformative('S_0', value=999.)
I[0] = Uninformative('I_0', value=1.)

## stochastic difference equations for later times
for i in range(1,T):
@deterministic(name='E[S_%d]'%i)
def E_S_i(S=S[i-1], I=I[i-1], beta=beta):
return S - beta * S * I / (S + I)
S[i] = Normal('S_%d'%i, mu=E_S_i, tau=tau, value=E_S_i.value)

@deterministic(name='E[I_%d]'%i)
def E_I_i(S=S[i-1], I=I[i-1], beta=beta, gamma=gamma):
return I + beta * S * I / (S + I) - gamma * I
I[i] = Normal('I_%d'%i, mu=E_I_i, tau=tau, value=E_I_i.value)

# data likelihood
A = Poisson('A', mu=[S[i] for i in range(T)], value=susceptible_data, observed=True)
B = Poisson('B', mu=[I[i] for i in range(T)], value=infected_data, observed=True)


This ends up taking a total of 6 lines more than the deterministic version, and all the substantial changes are from lines 24-34. So, question one is for Anand, do I have to worry about unobserved sample paths here? If I’ve understood Roberts and Stramer’s introduction, I should be ok. Question two returns to a blog topic from one year ago, that I’ve continued to try to educate myself about: how do I decide if and when this more complicated model should be used?

Comments Off on MCMC in Python: Statistical model stuck on a stochastic system dynamics model in PyMC

Filed under global health, MCMC, statistics

## MCMC in Python: How to stick a statistical model on a system dynamics model in PyMC

A recent question on the PyMC mailing list inspired me.  How can you estimate transition parameters in a compartmental model?  I did a lit search for just this when I started up my generic disease modeling project two years ago.  Much information, I did not find.  I turned up one paper which said basically that using a Bayesian approach was a great idea and someone should try it (and I can’t even find that now!).

Part of the problem was language.  I’ve since learned that micro-simulators call it “calibration” when you estimate parameter values, and there is a whole community of researchers working on “black-box modeling plug-and-play inference” that do something similar as well.  These magic phrases are incantations to the search engines that help find some relevant prior work.

But I started blazing my own path before I learned any of the right words; with PyMC, it is relatively simple.  Consider the classic SIR model from mathematical epidemology.  It’s a great place to start, and it’s what Jason Andrews started with on the PyMC list.  I’ll show you how to formulate it for Bayesian parameter estimation in PyMC, and how to make sure your MCMC has run for long enough. Continue reading

Filed under global health, MCMC, statistics

## MCMC in Python: Global Temperature Reconstruction with PyMC

A short note on the PyMC mailing list alerted me that the Apeescape, the author of mind of a Markov chain blog, was thinking of using PyMC for replicating some controversial climate data analysis, but was having problems with it. Since I’m a sucker for controversial data, I decided to see if I could do the replication exercise in PyMC myself.

I didn’t dig in to what the climate-hockey-stick fuss is about, that’s something I’ll leave for my copious spare time. What I did do is find the data pretty easily available on the original author’s website, and make a translation of the R/bugs model into pymc/python. My work is all in a github repository if you want to try it yourself, here.

Based on Apeescape’s bugs model, I want to have $\textnormal{temp}_t = N(\mu_t, \sigma^2)$ where $\mu_t = \beta_0 + \beta_1\textnormal{temp}_{t-1} + \beta_2\textnormal{temp}_{t-2} + \sum_{i=3}^{12} \beta_i(\textnormal{PC})_{t,i}$, with priors $\vec{\beta} \sim N(\vec{0}, 1000 I)$ and $\sigma \sim \textnormal{Uniform}(0,100)$.

I implemented this in a satisfyingly concise 21 lines of code, that also generate posterior predictive values for model validation:

# load data
data = csv2rec('BUGS_data.txt', delimiter='\t')

# define priors
beta = Normal('beta', mu=zeros(13), tau=.001, value=zeros(13))
sigma = Uniform('sigma', lower=0., upper=100., value=1.)

# define predictions
pc = array([data['pc%d'%(ii+1)] for ii in range(10)]) # copy pc data into an array for speed & convenience
@deterministic
def mu(beta=beta, temp1=data.lagy1, temp2=data.lagy2, pc=pc):
return beta[0] + beta[1]*temp1 + beta[2]*temp2 + dot(beta[3:], pc)

@deterministic
def predicted(mu=mu, sigma=sigma):
return rnormal(mu, sigma**-2.)

# define likelihood
@observed
def y(value=data.y, mu=mu, sigma=sigma):
return normal_like(value, mu, sigma**-2.)


Making an image out of this to match the r version got me stuck for a little bit, because the author snuck in a call to “Friedman’s SuperSmoother” in the plot generation code. That seems unnecessarily sneaky to me, especially after going through all the work of setting up a model with fully bayesian priors. Don’t you want to see the model output before running it through some highly complicated smoothing function? (The super-smoother supsmu is a “running lines smoother which chooses between three spans for the lines”, whatever that is.) In case you do, here it is, together with an alternative smoother I hacked together, since python has no super-smoother that I know of.

Since I have the posterior predictions handy, I plotted the median residuals against the median predicted temperature values. I think this shows that the error model is fitting the data pretty well:

Filed under MCMC, statistics

## Applied Approximate Counting: Malaria

My first first-authored global health paper came out today (I consider it my first “first-authored” paper ever, since the mathematicians I’ve worked with deviantly list authorship in alphabetical order regardless of seniority and contribution). It’s a bit of a mouthful by title: Rapid Scaling Up of Insecticide-Treated Bed Net Coverage in Africa and Its Relationship with Development Assistance for Health: A Systematic Synthesis of Supply, Distribution, and Household Survey Data.

What I find really pleasing about this research paper is the way it continues research I worked on in graduate school, but in a completely different and unexpected direction. Approximate counting is something that my advisor specialized in, and he won a big award for the random polynomial time algorithm for approximating the volume of convex bodies. I followed in his footsteps when I was a student, and I’m still doing approximate counting, it’s just that now, instead of approximating the amount of high-dimensional sand that will fit in an oddly shaped high-dimensional box, I’ve been approximating the number of insecticide-treated bednets that have made it from manufacturers through the distribution supply-chain and into the households of malaria-endemic regions of the world. I’m even using the same technique, Markov-chain Monte Carlo.

I’ve been itching to write about the computational details of this research for a while, and now that the paper’s out, I will have my chance. But for today, I refer you to the PLoS Med paper, and the technical appendix, and the PyMC code on github.

Filed under global health, MCMC

## MCMC in Python: Sudoku is a strange game

I was on a “vacation” for the last week, which included a real vacation to visit my parents, and also a scare-quotes vacation to Detroit that is a totally different subject. But the side effect of this vacation that is to be the topic of this post is the strange game of Sudoku, which is popular with travelers.

Jessi was with me and she seems to like this Sudoku business (although she denies it), so I thought about playing, too. But I always get interested in the wrong parts of these games. Solving Sudoku puzzles seems more like a game for computers than humans, and writing a program to let computers have their fun is the way I would prefer to waste time while en route. There really isn’t much to it:

def solve(T):
# if all cells are filled in, we win
if all(T > 0):
return 'success'

# solve T recursively, by trying all values for the most constrained var
pos = possibilities(T)
i,j = most_constrained(T, pos)

for val in pos[(i,j)]:
T[i,j] = val
if solve(T) == 'success':
return 'success'

# if this point is reached, this branch is unsatisfiable
T[i, j] = -1
return 'failure'


Is this ruining anyone’s homework problems? Just in case it is, everything I said at the beginning of my post on minimum spanning trees still applies.

With plenty of time left in my flight, this was just the opening move, though. The real question is what makes this fun for the humans who find this kind of thing fun? One hypothesis, the kind that comes naturally when you have watched the statistical physicists in action, is that fun is some sort of phase transition, and if you take random Sudoku puzzles with a certain number of entries you will maximize fun. Half-baked, I know, but as interesting as the movies they show on planes, at least. And it raises the extremely important, and possibly unsolved challenge, how do you sample uniformly at random from Sudoku puzzles with n cells filled in?

In my travel-addled state, I thought maybe a good way to do it would be to start with some complete configuration, knockout a large fraction of the cells at random, permute the remaining cells, and then solve the resulting configuration. Repeating this has got to be an ergodic markov chain, right? I’m not sure… and even then, do you think the stationary distribution is uniform?

def rand(n, T=None):
if T == None:
T = -1*ones([9,9])

# solve it to generate an initial solution
solve(T)

# do random shuffles to approximate uniformly random solution
for k in range(5):
select_random_cells(T, 20)
randomly_permute_labels(T)
solve(T)

# remove appropriate amount of labels
select_random_cells(T, n)

return T


Now, when Jessi found out that I was helping computers take her sudoku-solving job away, she thought I was a geek, but when she found out I was generating puzzles with more than one solution, she was outraged. Sudoku puzzles have a unique solution! So maybe what is really fun is a random puzzle with a unique solution, and the right number of cells filled in, where a smaller number of cells means that harder puzzles are right. Fortunately/unfortunately my travel ended before I finished investigating this important issue.

I doubt it is related, but I got pretty sick the next day.