July 11, 2011 · 7:00 am

My First Contribution to PyMC

I’m excited to report that my first contribution back to the PyMC codebase was accepted. 🙂

It is a slight reworking of the pymc.Matplot.plot function that makes it include autocorrelation plots of the trace, as well as histograms and time series. I also made the histogram look nicer (in my humble opinion).

Before:

After:

In this example, I can tell from the trace of beta_2 that MCMC hasn’t converged even without my changes, but it is dead obvious from the autocorrelation plot of beta_2 in the new version.
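To make the autocorrelation point concrete, here is a minimal sketch in plain Python. It is not the code inside pymc.Matplot.plot; the function name autocorrelation and the two synthetic traces are made up for illustration, standing in for a well-mixed chain and a poorly mixing one.

```python
import random


def autocorrelation(trace, lag):
    """Sample autocorrelation of a sequence at a given lag (0 <= lag < len(trace))."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("trace is constant; autocorrelation is undefined")
    covariance = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return covariance / variance


rng = random.Random(12345)  # fixed seed so the illustration is repeatable

# Hypothetical stand-in for a well-mixed trace: independent draws.
well_mixed = [rng.gauss(0, 1) for _ in range(5000)]

# Hypothetical stand-in for a poorly mixing trace: each value is mostly the previous one.
sticky = [0.0]
for _ in range(4999):
    sticky.append(0.99 * sticky[-1] + rng.gauss(0, 1))

# Print lag, autocorrelation of the well-mixed trace, autocorrelation of the sticky trace.
for lag in (1, 10, 50):
    print(lag, round(autocorrelation(well_mixed, lag), 2), round(autocorrelation(sticky, lag), 2))
```

For the independent draws the autocorrelation should drop to roughly zero after lag 0, while for the sticky trace it stays high across many lags. That slow decay is the pattern that jumps out of an autocorrelation plot.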

Making changes to the pymc source code is something that has intimidated me for a while. Here are the steps in my workflow, in case it helps you get started doing this, too.

# first fork a copy of pymc from https://github.com/pymc-devs/pymc.git on github
# (the clone below uses the upstream URL; to push your changes, clone your own fork's URL instead)
git clone https://github.com/pymc-devs/pymc.git

# then use virtualenv to create an isolated environment,
# so you can make changes to pymc without breaking everything else
virtualenv env_pymc_dev
source env_pymc_dev/bin/activate

# then you can install pymc without being root
cd pymc
python setup.py install

# to test that it is working
cd ..
python
>>> import pymc
>>> pymc.test()
# then make changes to pymc...
# to test your changes, and make sure that all of the tests that used to pass still do, repeat the process above
cd pymc
python setup.py install
cd ..
python
>>> import pymc
>>> pymc.test()

# once everything is perfect, push it to a public git repo and send a "pull request" to the pymc developers.

Is there an easier way? Let me know in the comments.

Filed under MCMC, software engineering

Tagged as MCMC, pymc, python

One response to “My First Contribution to PyMC”

Chris Fonnesbeck

July 12, 2011 at 11:06 pm

And a good contribution too!

As for the workflow, I don’t mess around with virtualenv much, but I do take advantage of branches in git. Any time I do something novel, I create a branch and mess with it there until I am happy, before merging it back to my local master and then pushing it to the remote. You can never have too many branches!

some rights reserved

This material is released under the Creative Commons Noncommercial Attribution Share-Alike 3.0 License