Category Archives: statistics

January 27, 2015 · 12:00 pm

Kish Stuff

A student came by interested in survey statistics and we got to talking about what an amazing person Leslie Kish must have been. We did some googling on it. Here are a few items we found:

http://projecteuclid.org/download/pdf_1/euclid.ss/1032209665
http://www.amstat.org/about/statisticiansinhistory/index.cfm?fuseaction=biosinfo&BioID=9

Kish_Leslie_1977_edit_(wla_092809).pdf

Filed under statistics

Tagged as people

January 20, 2015 · 12:00 pm

Non-parametric regression in Python: Gaussian Processes in sklearn (with a little PyMC)

I’ve got a fun class going this quarter, on “artificial intelligence for health metricians”, and the course content mixed with some of the student interest has got me looking at the options for doing Gaussian process regression in Python. `PyMC2` has some nice stuff, but the `sklearn` version fits with the rest of my course examples more naturally, so I’m using that instead.

But `sklearn` doesn’t have the fanciest of fancy covariance functions implemented, and at IHME we have been down the road of the Matern covariance function for over five years now. It’s in `PyMC`, so I took a crack at a mash-up. (Took a mash at a mash-up?) There is some room for improvement, but it is a start. If you need to do non-parametric regression for something that is differentiable more than once, but not infinitely many times, you could try starting here: http://nbviewer.ipython.org/gist/aflaxman/af7bdb56987c50f3812b
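For a feel of what "more than once, but not infinitely many times" looks like in a covariance function, here is a minimal standard-library sketch of the Matern covariance with smoothness 5/2, whose sample paths are twice differentiable. This is not the code from the notebook linked above; the function names and the `amp` and `scale` parameters are hypothetical, and their parametrization is not guaranteed to match the one in `PyMC`.

```python
import math

def matern52(x1, x2, amp=1.0, scale=1.0):
    """Matern covariance with smoothness nu = 5/2 for two scalar inputs.

    amp (marginal standard deviation) and scale (length scale) are
    illustrative parameter names, assumed for this sketch.
    """
    r = abs(x1 - x2) / scale
    s = math.sqrt(5.0) * r
    # amp^2 * (1 + sqrt(5) r + 5 r^2 / 3) * exp(-sqrt(5) r)
    return amp ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)

def covariance_matrix(xs, **kwargs):
    """Covariance matrix of the process evaluated at the points xs."""
    return [[matern52(a, b, **kwargs) for b in xs] for a in xs]

# Made-up evaluation points and parameters
xs = [0.0, 0.5, 1.0, 2.0]
C = covariance_matrix(xs, amp=2.0, scale=1.5)
for row in C:
    print(["%.3f" % c for c in row])
```

The diagonal equals `amp ** 2`, and the covariance falls off smoothly as points move apart, more slowly for larger `scale`.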

p.s. Chris Fonnesbeck has some great notes on doing stuff like this and much more here: http://nbviewer.ipython.org/github/fonnesbeck/Bios366/blob/master/notebooks/Section5_1-Gaussian-Processes.ipynb

Filed under statistics

Tagged as gaussian processes, gpr

December 18, 2014 · 12:00 pm

Bayesian Correlation in PyMC

Here is a StackOverflow question with a nice figure:

Is there a nice, simple reference for just what exactly these graphical model figures mean? I want more of them.

Filed under statistics

Tagged as pymc

December 17, 2014 · 12:00 pm

Statistics in Python: Calculating R^2

I wanted to include some old-fashioned statistics in a paper recently, and did some websearching on how to calculate R^2 in Python. It’s all very touchy, it seems. Here’s what I found:

http://stats.stackexchange.com/questions/36064/calculating-r-squared-coefficient-of-determination-with-centered-vs-un-center
http://stackoverflow.com/questions/893657/how-do-i-calculate-r-squared-using-python-and-numpy
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.linregress.html
http://forums.udacity.com/questions/100154896/why-is-r-squared-from-formula-different-than-scipy-functions-one

I eventually went with this:

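import numpy as np

# Load the R bridge for IPython (on newer setups: %load_ext rpy2.ipython)
%load_ext rmagic

# df is the DataFrame from the paper; regress conc_rand on 1/J with no intercept
x = np.array(1/df.J)
y = np.array(df.conc_rand)
%Rpush x y
%R print(summary(lm(y ~ x + 0)))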
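Much of the touchiness in the links above comes down to which total sum of squares goes in the denominator when the model has no intercept. R's `summary(lm(y ~ x + 0))` reports the uncentered version. The sketch below computes both by hand for a regression through the origin; the function name and the data are made up for illustration and are not from the paper.

```python
def r2_through_origin(x, y):
    """Least-squares fit of y = slope * x, with two flavors of R^2."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = sxy / sxx

    # residual sum of squares for the no-intercept fit
    ssr = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))

    # uncentered total sum of squares (used by R when there is no intercept)
    ss_uncentered = sum(yi * yi for yi in y)

    # centered total sum of squares (the usual definition with an intercept)
    ybar = sum(y) / len(y)
    ss_centered = sum((yi - ybar) ** 2 for yi in y)

    return slope, 1.0 - ssr / ss_uncentered, 1.0 - ssr / ss_centered

# Made-up data for illustration
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
slope, r2_uncentered, r2_centered = r2_through_origin(x, y)
print(slope, r2_uncentered, r2_centered)
```

For a no-intercept fit the uncentered value is never smaller than the centered one, which is one reason two tools can disagree on the same data.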

Filed under statistics

Tagged as R^2, stats

November 18, 2014 · 12:00 pm

CrossValidated on interesting and well-written papers in applied stats

I should read some of these, and stash a few for the PGF journal club:

http://stats.stackexchange.com/questions/9365/what-are-some-interesting-and-well-written-applied-statistics-papers

http://www.jstor.org/stable/2347679

Filed under statistics

Tagged as journ

October 29, 2014 · 12:00 pm

MCMC in Python: observed data for a sum of random variables in PyMC

I like answering PyMC questions on Stack Overflow, but sometimes I give an answer and end up the one with the question. Like what would you model as the sum of a Poisson and a Negative Binomial?
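One way to see what observing only the total implies: if the two components are independent (an assumption made here, not something taken from the Stack Overflow thread), the probability of a total is the convolution of the two distributions. The sketch below uses only the standard library; the function names, the parameter values, and the failures-before-the-r-th-success parametrization of the Negative Binomial are choices made for illustration.

```python
import math

def poisson_logpmf(k, lam):
    """Log-probability of k under a Poisson with rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def negbin_logpmf(k, r, p):
    """Log-probability of k failures before the r-th success (success prob. p)."""
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log(1.0 - p))

def sum_pmf(z, lam, r, p):
    """P(X + Y = z) for independent X ~ Poisson(lam), Y ~ NegBin(r, p)."""
    return sum(math.exp(poisson_logpmf(k, lam) + negbin_logpmf(z - k, r, p))
               for k in range(z + 1))

# Made-up parameters for illustration
lam, r, p = 3.0, 2.0, 0.4
pmf = [sum_pmf(z, lam, r, p) for z in range(200)]
total = sum(pmf)                             # should be very close to 1
mean = sum(z * q for z, q in enumerate(pmf)) # should be close to lam + r*(1-p)/p
print(total, mean)
```

The log of `sum_pmf` at the observed total is the likelihood contribution of that observation, with the unobserved split between the two components summed out.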

Filed under statistics

Tagged as pymc

July 24, 2014 · 8:00 am

MCMC in Python: sim and fit with same model

Here is a GitHub issue and solution that I saw the other day. I think it’s a nice pattern.

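import pymc  # PyMC2

# Assumed values for illustration; the original snippet leaves these undefined.
true_param = 1.0
tau = 1.0

def generate_model(values={'mu': true_param, 'm': None}):

    # prior: held fixed (observed) when a value for mu is supplied
    mu = pymc.Uniform("mu", lower=-10, upper=10, value=values['mu'],
        observed=(values['mu'] is not None))

    # likelihood: simulated when values['m'] is None, treated as data otherwise
    m = pymc.Normal("m", mu=mu, tau=tau, value=values['m'],
        observed=(values['m'] is not None))

    return locals()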

Filed under statistics

Tagged as MCMC, pymc, python

July 23, 2014 · 8:00 am

MCMC in Python: Fit a non-linear function with PyMC

Here is a recent Q&A on Stack Overflow that I did and liked.

Filed under statistics

Tagged as MCMC, pymc

July 1, 2014 · 8:00 am

The one before that

Jake Vanderplas’s comparison of Python MCMC modules was preceded by a Bayesian polemic. In general, I find the stats philosophy war old-timey and distracting, but his comparison of confidence intervals and credible intervals is something I need to understand better.

http://jakevdp.github.io/blog/2014/06/12/frequentism-and-bayesianism-3-confidence-credibility/

Filed under statistics

Tagged as MCMC

June 30, 2014 · 8:00 am

MCMC in Python: a bake-off

While I’m on a microblogging spree, I’ve been meaning to link to this informative comparison of pymc, emcee, and pystan: http://jakevdp.github.io/blog/2014/06/14/frequentism-and-bayesianism-4-bayesian-in-python/

Filed under statistics

Tagged as MCMC, pymc, python
