I read The Girl With the Dragon Tattoo series recently, and it was extremely engrossing. The first book has a bit of a health metrics theme, with each section prefaced with a shocking statistic about violence against women in Sweden. The second book has a bit of a math theme, with each section prefaced by a correct, if inane, algebraic equation.
Also in the second book, the tattooed girl spends some time reading a strangely titled math book, Dimensions in Mathematics, and I liked the story enough to google the book, since it was presented with author and publisher. It turned out that this just revealed more mystery.
Besides the marvelous upgrade to ipython, there were some other things I saw at SciPy 2011 that I want to remember to remember.
I think I’ll have a lot more to say about Dexy soon, because I really need something like that. A tool to make documentation sexy. If only the tool itself had more documentation!
Speaking of SciPy 2011 (as I was in my last post), the coolest, most jaw-dropping-est demo I saw there was hands-down for the new ipython. The most cutting edge stuff is available on the web. I want it.
I just returned from the SciPy 2011 conference in Austin. Definitely a different experience than a theory conference, and definitely different than the mega-conferences I’ve found myself at lately. I think I like it. My goal was to evangelize for PyMC a little bit, and I think that went successfully. I even got to meet PyMC founder Chris Fonnesbeck in person (about 30 seconds before we presented a 4-hour tutorial together).
For the tutorial, I put together a set of PyMC-by-Example slides and code to dig into that silly relationship between Human Development Index and Total Fertility Rate that foiled my best attempts at Bayesian model selection so long ago.
I’m not sure the slides stand on their own, but together with the code samples they should reproduce my portion of the talk pretty well. I even started writing it up for people who want to read it in paper form, but then I ran out of momentum. Patches welcome.
20 seconds, 20 minutes, or 20 hours. These are all amounts of time that some computational method I’ve worked on has taken to complete processing. They each lead to a very different experience for the model developer, and probably in the end for the model, too. Twenty seconds is definitely what I prefer.
Just like last summer, many of the Post-Bachelors Fellows of IHME are away now to learn where global health metrics come from. Spencer James has a great photoblog from his work in Zambia. Are there other PBFs that I can follow from afar?
I’m excited to report that my first contribution back to the PyMC codebase was accepted. 🙂
It is a slight reworking of the pymc.Matplot.plot function that makes it include autocorrelation plots of the trace, as well as histograms and timeseries. I also made the histogram look nicer (in my humble opinion).
In this example, I can tell that MCMC hasn’t converged from the trace of beta_2 without my changes, but it is dead obvious from the autocorrelation plot of beta_2 in the new version.
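To show why an autocorrelation plot makes poor mixing so obvious, here is a minimal sketch of the quantity such a plot displays. It is not the code from my patch: the function name autocorr and both simulated traces are made up for illustration, and it uses only the standard library so it runs without PyMC installed.

```python
import random


def autocorr(trace, max_lag=50):
    """Sample autocorrelation of a 1-d trace for lags 0..max_lag."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered)
    if var == 0:
        raise ValueError("trace is constant, so autocorrelation is undefined")
    max_lag = min(max_lag, n - 1)
    return [
        sum(centered[i] * centered[i + k] for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


rng = random.Random(2011)  # fixed seed so the sketch is reproducible

# Stand-in for a well-mixed chain: independent draws.
iid_trace = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Stand-in for a poorly mixing chain: each draw stays close to the previous one.
sticky_trace = [0.0]
for _ in range(1999):
    sticky_trace.append(0.95 * sticky_trace[-1] + rng.gauss(0.0, 1.0))

acf_iid = autocorr(iid_trace)
acf_sticky = autocorr(sticky_trace)

print("lag-1 autocorrelation, well-mixed stand-in: %.2f" % acf_iid[1])
print("lag-1 autocorrelation, sticky stand-in:     %.2f" % acf_sticky[1])
```

For the independent draws, the values should hover near zero at every lag after zero. For the sticky chain they decay slowly, and that slow decay is the pattern that jumps out of the plot.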
The process of making changes to the pymc source code is something that has intimidated me for a while. Here are the steps in my workflow, in case it helps you get started doing this, too.
# first fork a copy of pymc from https://github.com/pymc-devs/pymc.git on github
git clone https://github.com/pymc-devs/pymc.git
# then use virtualenv to install it and make sure the tests work;
# that way you can install pymc without being root,
# and you can make changes to it without breaking everything else
python setup.py install
# to test that it is working
>>> import pymc
# then make changes to pymc...
# to test changes, and make sure that all of the tests that used to pass still do, repeat the process above
python setup.py install
>>> import pymc
# once everything is perfect, push it to a public git repo and send a "pull request" to the pymc developers.
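One thing that can go wrong in the steps above is importing a system-wide copy of the package instead of the freshly installed one. Here is a minimal sketch of a check for that, assuming only the standard library. The helper name describe_install is hypothetical, and the example call uses the standard-library json module as a stand-in so it runs anywhere; swap in "pymc" inside your own environment.

```python
import importlib
import sys


def describe_install(module_name):
    """Import a module by name and report where it was loaded from."""
    module = importlib.import_module(module_name)
    # Newer environments set sys.base_prefix; older virtualenv releases set
    # sys.real_prefix instead. Either one signals an isolated environment.
    in_virtualenv = (
        hasattr(sys, "real_prefix")
        or sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    )
    return {
        "name": module.__name__,
        "version": getattr(module, "__version__", "unknown"),
        "location": getattr(module, "__file__", "built-in"),
        "in_virtualenv": in_virtualenv,
    }


# "json" is only a stand-in here; use "pymc" to check your development copy.
info = describe_install("json")
for key in ("name", "version", "location", "in_virtualenv"):
    print("%s: %s" % (key, info[key]))
```

If the reported location is outside the environment you just created, the import is probably picking up some other copy, and your changes will not show up.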
Is there an easier way? Let me know in the comments.