Category Archives: statistics

CrossValidated on interesting and well-written papers in applied stats

I should read some of these, and stash a few for the PGF journal club:

http://stats.stackexchange.com/questions/9365/what-are-some-interesting-and-well-written-applied-statistics-papers

http://www.jstor.org/stable/2347679

Filed under statistics

MCMC in Python: observed data for a sum of random variables in PyMC

I like answering PyMC questions on Stack Overflow, but sometimes I give an answer and end up being the one with a question. For example: what would you model as the sum of a Poisson and a Negative Binomial?
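One way to think about observed data for a sum: the likelihood of the total is the convolution of the two count distributions, summing over every way the total could split between them. Here is a minimal sketch in plain Python, assuming a mean/overdispersion parameterization of the Negative Binomial; the function names and the parameter values in the usage line are hypothetical, chosen only for illustration, and this is not the answer from the Stack Overflow thread.

```python
import math


def poisson_logpmf(k, mu):
    """Log pmf of a Poisson with mean mu at count k."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def negbin_logpmf(k, mu, alpha):
    """Log pmf of a Negative Binomial with mean mu and overdispersion alpha.

    Assumed parameterization: variance is mu + mu**2 / alpha.
    """
    return (math.lgamma(k + alpha) - math.lgamma(alpha) - math.lgamma(k + 1)
            + alpha * math.log(alpha / (alpha + mu))
            + k * math.log(mu / (alpha + mu)))


def sum_logpmf(s, pois_mu, nb_mu, nb_alpha):
    """Log pmf of (Poisson + Negative Binomial) at an observed total s.

    Marginalizes over the unobserved split: k from the Poisson part,
    s - k from the Negative Binomial part.
    """
    terms = [poisson_logpmf(k, pois_mu) + negbin_logpmf(s - k, nb_mu, nb_alpha)
             for k in range(s + 1)]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical usage: log-likelihood of observing a total of 7.
loglik = sum_logpmf(7, pois_mu=3.0, nb_mu=5.0, nb_alpha=2.0)
```

A function like this could serve as a custom log-likelihood for the observed total, with the three parameters given priors as usual.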

MCMC in Python: sim and fit with same model

Here is a GitHub issue and solution that I saw the other day. I think it’s a nice pattern.

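import pymc  # PyMC 2.x API

true_param = 3.0  # assumed value, for illustration only
tau = 1.0         # assumed known precision, for illustration only

def generate_model(values=None):
    # Default: mu held at the true value, m left free (simulation mode).
    if values is None:
        values = {'mu': true_param, 'm': None}

    # Prior: held fixed (observed) whenever a value for mu is supplied.
    mu = pymc.Uniform("mu", lower=-10, upper=10, value=values['mu'],
        observed=(values['mu'] is not None))

    # Likelihood: treated as observed whenever data for m is supplied.
    m = pymc.Normal("m", mu=mu, tau=tau, value=values['m'],
        observed=(values['m'] is not None))

    return locals()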
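
The idea in `generate_model` is that one function builds the model both ways: pass a value for `mu` and leave `m` as `None` to simulate, then pass the simulated `m` and leave `mu` as `None` to fit. Here is a rough, library-free sketch of that same switch in plain Python, with a grid approximation standing in for MCMC. The `logp` entry, the `seed` argument, and the `observed` bookkeeping are hypothetical, made up for this illustration and not part of PyMC.

```python
import math
import random


def generate_model(values, tau=1.0, seed=0):
    """Build the same two-node model for simulation or for fitting.

    A node is 'observed' when a value is supplied; otherwise it is drawn.
    """
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(tau)

    mu_observed = values['mu'] is not None
    mu = values['mu'] if mu_observed else rng.uniform(-10, 10)

    m_observed = values['m'] is not None
    m = values['m'] if m_observed else rng.gauss(mu, sd)

    def logp(mu_value):
        """Unnormalized log posterior of mu given m (flat prior on [-10, 10])."""
        if not -10 <= mu_value <= 10:
            return -math.inf
        return -0.5 * tau * (m - mu_value) ** 2

    return {'mu': mu, 'm': m, 'logp': logp,
            'observed': {'mu': mu_observed, 'm': m_observed}}


# Simulate: mu is fixed, m is drawn from the likelihood.
sim = generate_model({'mu': 3.0, 'm': None})

# Fit: m is now data, mu is unknown.
fit = generate_model({'mu': None, 'm': sim['m']})
grid = [-10 + 0.01 * i for i in range(2001)]
weights = [math.exp(fit['logp'](g)) for g in grid]
post_mean = sum(g * w for g, w in zip(grid, weights)) / sum(weights)
```

With a single simulated data point and a flat prior, the posterior mean of `mu` should land close to the simulated `m`, which is a quick sanity check that the simulation and the fit really share one model.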

MCMC in Python: Fit a non-linear function with PyMC

Here is a recent Q&A on Stack Overflow that I did and liked.

The one before that

Jake VanderPlas’s comparison of Python MCMC modules was preceded by a Bayesian polemic. In general, I find the stats philosophy war old-timey and distracting, but his comparison of confidence intervals and credible intervals is something I need to understand better.

http://jakevdp.github.io/blog/2014/06/12/frequentism-and-bayesianism-3-confidence-credibility/
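
A small simulation may help pin down the confidence-interval half of that comparison. A 95% confidence interval is a statement about the procedure: across many repeated datasets, about 95% of the intervals it produces contain the true parameter. A 95% credible interval is a statement about the parameter given the one dataset in hand. The sketch below is my own toy setup, not the example from the linked post: it assumes normal data with known `sigma`, and the function names and defaults are hypothetical.

```python
import math
import random


def confidence_interval(data, sigma, z=1.96):
    """95% interval for a normal mean when sigma is known."""
    n = len(data)
    mean = sum(data) / n
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


def coverage(true_mu=0.0, sigma=1.0, n=10, reps=4000, seed=0):
    """Fraction of repeated datasets whose interval contains true_mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        data = [rng.gauss(true_mu, sigma) for _ in range(n)]
        lo, hi = confidence_interval(data, sigma)
        hits += lo <= true_mu <= hi
    return hits / reps


# Hypothetical usage: should come out near 0.95.
estimated_coverage = coverage()
```

In this particular model, a flat prior on the mean gives a posterior that is normal around the sample mean with the same width, so the 95% credible interval coincides numerically with the confidence interval. That coincidence is specific to simple cases like this one, which may be part of why the two are easy to conflate.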

MCMC in Python: a bake-off

While I’m on a microblogging spree, I’ve been meaning to link to this informative comparison of pymc, emcee, and pystan: http://jakevdp.github.io/blog/2014/06/14/frequentism-and-bayesianism-4-bayesian-in-python/

MCMC in Python: Another thing that turns out to be hard

Here is an interesting Stack Overflow question about a model for data distributed as the sum of two uniforms with unknown support. I was surprised by how hard it was for me.

http://stackoverflow.com/questions/24379868/estimate-the-parameters-of-a-random-variable-which-is-the-sum-of-uniform-random/24397044#24397044

I think the future of probabilistic programming should make a model like this easy to code.
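
For reference, the density of the sum of two independent uniforms has a simple closed form: it is the length of the overlap between one support and the shifted, flipped other, divided by the product of the widths. Here is a minimal sketch of that likelihood in plain Python; the function names are hypothetical and this is not the approach from the linked answer.

```python
import math


def sum_uniform_pdf(z, a1, b1, a2, b2):
    """Density of X + Y at z, for X ~ U(a1, b1) and Y ~ U(a2, b2)."""
    # X must lie in [a1, b1] and also in [z - b2, z - a2] so that Y = z - X fits.
    overlap = min(b1, z - a2) - max(a1, z - b2)
    return max(0.0, overlap) / ((b1 - a1) * (b2 - a2))


def log_likelihood(data, a1, b1, a2, b2):
    """Log-likelihood of observed sums; -inf if any point is outside the support."""
    if b1 <= a1 or b2 <= a2:
        return -math.inf
    total = 0.0
    for z in data:
        p = sum_uniform_pdf(z, a1, b1, a2, b2)
        if p == 0.0:
            return -math.inf
        total += math.log(p)
    return total
```

Two features of this likelihood stand out. It drops to `-inf` as soon as a single data point falls outside the implied support, and it depends on the endpoints only through `a1 + a2` and the two widths, so different parameter settings can give exactly the same density. Either could plausibly make sampling awkward.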
