Category Archives: MCMC

A paper I just read

Once in a while I read random papers from the AMS Math Reviews program, and I recently read one about an MCMC approach to X-ray imaging. It was a fun, detailed look at a few different ways to do sampling, using effective sample size to figure out which worked better when.

It did also leave me wondering what the giant X-ray machines buried 1,000 feet underground are for, though.

MCMC in Python: Power-posterior calculation with PyMC3

I’ve got a fun little project that has let me check in on the PyMC project after a long time away. Long-time readers of Healthy Algorithms might remember my obsession with PyMC2 from my DisMod days nearly ten years ago, but for those of you joining us more recently… there is a great way to build Bayesian statistical models with Python, and it is the PyMC package. It was completely rewritten for version 3, which is now available, and using it looks like this:

import numpy as np
import matplotlib.pyplot as plt

# Initialize random number generator
np.random.seed(123)

# True parameter values
alpha, sigma = 1, 1
beta = [1, 2.5]

# Size of dataset
size = 100

# Predictor variables
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2

# Simulate outcome variable
Y = alpha + beta[0]*X1 + beta[1]*X2 + np.random.randn(size)*sigma

from pymc3 import Model, Normal, HalfNormal
basic_model = Model()

with basic_model:

    # Priors for unknown model parameters
    alpha = Normal('alpha', mu=0, sd=10)
    beta = Normal('beta', mu=0, sd=10, shape=2)
    sigma = HalfNormal('sigma', sd=1)

    # Expected value of outcome
    mu = alpha + beta[0]*X1 + beta[1]*X2

    # Likelihood (sampling distribution) of observations
    Y_obs = Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

The model's joint log-density can then be evaluated at a point (note the automatically transformed name for sigma):

basic_model.logp({'alpha': 1,
                  'beta': [1, 1],
                  'sigma_log_': 0})

What I need is a way to compute the “power-posterior” of a model, which is to say $p(\theta \mid y, t) = \frac{p(y \mid \theta)^t \, p(\theta)}{\mathcal{Z}_t(y)}$.

I’ve figured out a pretty cool way to compute the prior and the likelihood of the model:

from theano import tensor as tt

def logpriort(self):
    """Theano scalar of the log-prior of the model"""
    factors = [var.logpt for var in self.free_RVs]
    return tt.add(*map(tt.sum, factors))

def logprior(self):
    """Compiled log-prior density function"""
    return self.model.fn(logpriort(self))

def logliket(self):
    """Theano scalar of the log-likelihood of the model"""
    factors = [var.logpt for var in self.observed_RVs] + self.potentials
    return tt.add(*map(tt.sum, factors))

def loglike(self):
    """Compiled log-likelihood function"""
    return self.model.fn(logliket(self))

Now I just need to put them together to get the log of the power-posterior. There must be an easy way to do it, but this is not it:

def logpowerpt(self, t):
    # bug: loglike and logprior return compiled functions, not Theano scalars
    return t*loglike(self) + logprior(self)
def logpowerp(self, t):
    return self.model.fn(logpowerpt(self, t))

vals = {'alpha':1,
        'beta':[1,1],
        'sigma_log_':0}
logpowerp(basic_model, 0)(vals)
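For what it's worth, one way that seems to work is to combine the Theano scalars before compiling, rather than the compiled functions (a minimal sketch, reusing logpriort and logliket from above):

def logpowerpt(self, t):
    """Theano scalar of the log power-posterior at temperature t"""
    return t*logliket(self) + logpriort(self)
def logpowerp(self, t):
    """Compiled log power-posterior density function"""
    return self.model.fn(logpowerpt(self, t))

logpowerp(basic_model, 0)(vals)  # at t=0 this reduces to the log-prior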

Any suggestions?

MCMC Mixing Mysteries

I love doing Math Reviews for the random stuff I get to read. Here is a paper I probably would not have found otherwise:
MR3208124 Román, Jorge Carlos; Hobert, James P.; Presnell, Brett. On reparametrization and the Gibbs sampler. Statist. Probab. Lett. 91 (2014), 110–116.

The Gibbs sampler may be out of fashion with the Bayesian computation crowd these days, but the effects of reparameterization on mixing are still mysterious. I tried the PyMC3 NUTS sampler on the first example from the paper, and it, too, has rather different mixing times under the two parameterizations: https://gist.github.com/aflaxman/be253895efd0e2962472
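The gist has the details, but the shape of the experiment is easy to sketch (a minimal version, not the paper's model: made-up data and the usual centered/non-centered reparameterization of a hierarchical normal model):

import numpy as np
import pymc3 as pm

y = np.random.randn(8)  # made-up data for eight groups

# centered parameterization: group effects depend directly on mu
with pm.Model() as centered:
    mu = pm.Normal('mu', mu=0, sd=10)
    theta = pm.Normal('theta', mu=mu, sd=1, shape=8)
    pm.Normal('y', mu=theta, sd=1, observed=y)
    trace_c = pm.sample(2000)

# non-centered parameterization: sample standardized offsets, shift by mu
with pm.Model() as noncentered:
    mu = pm.Normal('mu', mu=0, sd=10)
    offset = pm.Normal('offset', mu=0, sd=1, shape=8)
    pm.Normal('y', mu=mu + offset, sd=1, observed=y)
    trace_nc = pm.sample(2000)

# compare effective sample sizes under the two parameterizations
print(pm.effective_n(trace_c), pm.effective_n(trace_nc))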

MCMC in Python: Gaussian mixture model in PyMC3

PyMC3 is really coming along. I tried it out on a Gaussian mixture model that was the subject of some discussion on GitHub:
https://github.com/pymc-devs/pymc3/issues/443#issuecomment-109813012
http://nbviewer.ipython.org/gist/aflaxman/64f22d07256f67396d3a
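For flavor, the marginalized version of such a model is roughly this shape (a sketch with made-up data, assuming a PyMC3 recent enough to have NormalMixture; see the notebook for what I actually ran):

import numpy as np
import pymc3 as pm

# made-up data from two well-separated components
data = np.concatenate([np.random.randn(100) - 3, np.random.randn(100) + 3])

with pm.Model() as gmm:
    w = pm.Dirichlet('w', a=np.ones(2))         # mixture weights
    mu = pm.Normal('mu', mu=0, sd=10, shape=2)  # component means
    sd = pm.HalfNormal('sd', sd=1, shape=2)     # component scales
    pm.NormalMixture('obs', w=w, mu=mu, sd=sd, observed=data)
    trace = pm.sample(2000)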

Interesting Q/A: autocorrelation for categorical var in MCMC

From http://stats.stackexchange.com/questions/10798/measures-of-autocorrelation-in-categorical-values-of-a-markov-chain, a question I run into from time to time:
> Are there any measures of auto-correlation for a sequence of observations of an (unordered) categorical variable?

An (accepted) answer that got me thinking:
> [L]ook directly at the convergence rate for the Markov chain.

My interpretation, in PyMC2 terms: run the chain, calculate the empirical transition probabilities for the categorical variable, and examine the spectral gap of the resulting transition matrix.
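Something like this, with a hypothetical spectral_gap helper (assuming every category gets visited, so the rows normalize):

import numpy as np

def spectral_gap(chain, k):
    """Spectral gap of the empirical transition matrix of a categorical chain."""
    P = np.zeros((k, k))
    for a, b in zip(chain[:-1], chain[1:]):
        P[a, b] += 1                       # count observed transitions
    P /= P.sum(axis=1, keepdims=True)      # row-normalize counts to probabilities
    lams = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return 1 - lams[1]   # one minus the second-largest eigenvalue magnitude

# toy example: an i.i.d. chain over three categories mixes immediately
chain = np.random.choice(3, size=10000)
print(spectral_gap(chain, 3))  # should be close to 1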

Experimental notebook tk.

MCMC in Python: How to make a custom sampler in PyMC

The PyMC documentation is a little slim on the topic of defining a custom sampler, and I had to figure it out for some DisMod work over the years. Here is a minimal example of how I did it, in answer to a CrossValidated question.
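The linked answer has the example itself; the general recipe in PyMC (version 2) is to subclass a step method and override propose, roughly like this (a minimal sketch, with a made-up uniform random-walk proposal):

import numpy as np
import pymc as pm

class UniformWalkMetropolis(pm.Metropolis):
    """Metropolis step with a uniform random-walk proposal."""
    def propose(self):
        width = self.adaptive_scale_factor
        self.stochastic.value = self.stochastic.value + np.random.uniform(-width, width)

x = pm.Normal('x', mu=0., tau=1.)
mc = pm.MCMC([x])
mc.use_step_method(UniformWalkMetropolis, x)
mc.sample(10000, burn=5000)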

Sequential Monte Carlo in PyMC?

I’ve been reading about Sequential Monte Carlo recently, and I think it will fit well into the PyMC3 framework. I will give it a try when I have a free minute, but maybe someone else will be inspired to try it first. This paper includes some pseudocode.

[Figure: pseudocode for the SMC sampler, from the paper]
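To fix ideas, here is the flavor of the algorithm on a toy scalar problem (a sketch, not the paper's version): anneal particles from the prior to the posterior through the power-posteriors, reweighting, resampling, and jittering at each temperature.

import numpy as np
from scipy import stats

def smc(logprior, loglike, prior_rvs, n=2000, temps=np.linspace(0, 1, 11)):
    """Toy tempered SMC for a scalar parameter."""
    theta = prior_rvs(n)
    for t_old, t_new in zip(temps[:-1], temps[1:]):
        # reweight by the incremental tempered likelihood
        logw = (t_new - t_old) * loglike(theta)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # multinomial resampling
        theta = theta[np.random.choice(n, size=n, p=w)]
        # one Metropolis random-walk move targeting the current power-posterior
        prop = theta + .5 * np.random.randn(n)
        log_ratio = (logprior(prop) + t_new * loglike(prop)
                     - logprior(theta) - t_new * loglike(theta))
        theta = np.where(np.log(np.random.rand(n)) < log_ratio, prop, theta)
    return theta

# toy check: N(0, 10^2) prior, one observation y=1 with unit noise
post = smc(lambda th: stats.norm.logpdf(th, 0, 10),
           lambda th: stats.norm.logpdf(th, 1, 1),
           lambda n: 10 * np.random.randn(n))
print(post.mean(), post.std())  # analytic posterior is approx N(0.99, 0.99)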

Math and Dance

I almost didn’t share these HarleMCMC videos, but how long could I resist, really?

We’ll see how this holds up to repeated viewing…

Here is a math/dance video for the ages:

MCMC in Python: (approximate) derivative-constrained Gaussian Processes with PyMC.gp

I’ve always enjoyed the Gaussian Process part of the PyMC package, and a question on the mailing list yesterday reminded me of a project I worked on with it that never came to fruition: how to implement constraints on the derivatives of the GP.

The best answer I could come up with is to use “potential” nodes and do it approximately. That is to say, instead of constraining the derivative itself, I constrain a secant that approximates it; and instead of constraining it at every point in an interval, I constrain it only at a discrete subset of points.
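In PyMC2 terms, the trick looks roughly like this (a minimal sketch with made-up hyperparameters; the notebook linked below has the real version):

import numpy as np
import pymc as pm
from pymc import gp

# GP prior on [0, 1] (made-up mean and covariance hyperparameters)
M = gp.Mean(lambda x: np.zeros(len(x)))
C = gp.Covariance(gp.matern.euclidean, diff_degree=2., amp=1., scale=.3)
mesh = np.linspace(0, 1, 21)
f = gp.GPSubmodel('f', M, C, mesh)

# potential that rejects realizations whose secants on the mesh are
# negative, i.e. an approximate "f is nondecreasing" constraint
@pm.potential
def nondecreasing(f=f.f_eval):
    secants = np.diff(f) / np.diff(mesh)
    return 0. if np.all(secants >= 0) else -np.inf

mc = pm.MCMC([f, nondecreasing])
mc.sample(5000)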

Here is an ipython notebook example: [ipynb] [py]

MCMC in Python: Bayesian meta-analysis example

In slow progress on my plan to go through the examples from the OpenBUGS webpage and port them to PyMC, I now offer you Blockers, a random-effects meta-analysis of clinical trials.

[py] [pdf]
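For flavor, here is the shape of the model in PyMC2 terms, with made-up data for three trials (the linked code has the real dataset and model):

import numpy as np
import pymc as pm

# made-up data: deaths/total in control (c) and treatment (t) arms of 3 trials
r_c = np.array([3, 14, 11]); n_c = np.array([39, 116, 93])
r_t = np.array([3, 7, 5]);   n_t = np.array([38, 114, 69])

mu = pm.Normal('mu', mu=0., tau=1e-5)         # mean treatment effect (log-odds)
tau = pm.Gamma('tau', alpha=1e-3, beta=1e-3)  # between-trial precision
delta = pm.Normal('delta', mu=mu, tau=tau, size=3)  # trial-level effects
phi = pm.Normal('phi', mu=0., tau=1e-5, size=3)     # trial-level baselines

p_c = pm.Lambda('p_c', lambda phi=phi: pm.invlogit(phi))
p_t = pm.Lambda('p_t', lambda phi=phi, delta=delta: pm.invlogit(phi + delta))

obs_c = pm.Binomial('obs_c', n=n_c, p=p_c, value=r_c, observed=True)
obs_t = pm.Binomial('obs_t', n=n_t, p=p_t, value=r_t, observed=True)

m = pm.MCMC([mu, tau, delta, phi, p_c, p_t, obs_c, obs_t])
m.sample(20000, burn=10000)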
