Monthly Archives: January 2015

Non-parametric regression in Python: Gaussian Processes in sklearn (with a little PyMC)

I’ve got a fun class going this quarter, on “artificial intelligence for health metricians”, and the course content mixed with some of the student interest has got me looking at the options for doing Gaussian process regression in Python. `PyMC2` has some nice stuff, but the `sklearn` version fits with the rest of my course examples more naturally, so I’m using that instead.

But `sklearn` doesn’t have the fanciest of fancy covariance functions implemented, and at IHME we have been down the road of the Matern covariance function for over five years now. It’s in `PyMC`, so I took a crack at a mash-up. (Took a mash at a mash-up?) There is some room for improvement, but it is a start. If you need to do non-parametric regression for something that is differentiable more than once, but less than infinity times, you could try starting here: http://nbviewer.ipython.org/gist/aflaxman/af7bdb56987c50f3812b
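For a sense of what the Matern covariance looks like, here is a minimal sketch of its closed forms for smoothness ν = 1/2, 3/2, and 5/2, written in plain Python. It is not the code from the notebook linked above, and the function names (`matern`, `covariance_matrix`) and parameter names are assumptions made for illustration.

```python
import math


def matern(r, length_scale=1.0, nu=2.5, amplitude=1.0):
    """Matern covariance between two points a distance r apart.

    Only the half-integer cases nu = 0.5, 1.5, 2.5 are handled, because
    they have simple closed forms. Names and defaults are illustrative.
    """
    s = abs(r) / length_scale
    if nu == 0.5:
        c, poly = 1.0, 1.0  # the exponential covariance
    elif nu == 1.5:
        c = math.sqrt(3.0)
        poly = 1.0 + c * s
    elif nu == 2.5:
        c = math.sqrt(5.0)
        poly = 1.0 + c * s + 5.0 * s * s / 3.0
    else:
        raise ValueError("this sketch only handles nu in {0.5, 1.5, 2.5}")
    return amplitude ** 2 * poly * math.exp(-c * s)


def covariance_matrix(xs, **kwargs):
    """Covariance matrix (list of lists) for 1-d inputs xs."""
    return [[matern(a - b, **kwargs) for b in xs] for a in xs]


# Example: covariance among three points on a line, with nu = 5/2
K = covariance_matrix([0.0, 0.5, 2.0], length_scale=1.0, nu=2.5)
```

Larger ν gives smoother functions: sample paths with ν = 3/2 are once differentiable and with ν = 5/2 twice differentiable, while the squared-exponential covariance is the limit as ν goes to infinity. Later releases of scikit-learn also added a Matern kernel of their own in `sklearn.gaussian_process.kernels`.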

p.s. Chris Fonnesbeck has some great notes on doing stuff like this and much more here: http://nbviewer.ipython.org/github/fonnesbeck/Bios366/blob/master/notebooks/Section5_1-Gaussian-Processes.ipynb

Filed under statistics

Natl Academy report with comics

…about interdisciplinary research, at that: http://www.nap.edu/catalog/11153/facilitating-interdisciplinary-research

Filed under science policy

Pretty bug in mpld3

It’s not quite d3-broke-and-made-art quality, but I like the plot in this bug report: https://github.com/jakevdp/mpld3/issues/274#issuecomment-68576519

Filed under dataviz, software engineering

PyMC2 function evals

PyMC2 has some tricky tricks for reducing function evaluations if possible. A question asked and answered on Stack Overflow investigates: http://stackoverflow.com/q/27714635/1935494 and I made an IPython Notebook with more details, too: http://nbviewer.ipython.org/gist/aflaxman/c07c5261bf22f6847098
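One general way to save function evaluations is to remember the last few inputs and reuse the stored result when one of them comes back, as happens when a sampler rejects a proposal and returns to its previous value. The toy sketch below illustrates that idea only; it is not PyMC2's code, and the class name `CachedFunction` and its `cache_depth` argument are hypothetical.

```python
class CachedFunction:
    """Hypothetical helper that remembers the most recent (args, value) pairs."""

    def __init__(self, func, cache_depth=2):
        self.func = func
        self.cache_depth = cache_depth
        self.cache = []   # most recently computed entry first
        self.n_evals = 0  # how many times func was really called

    def __call__(self, *args):
        # Reuse a stored value if these exact arguments were seen recently.
        for cached_args, value in self.cache:
            if cached_args == args:
                return value
        # Otherwise evaluate, store, and drop the oldest entries.
        self.n_evals += 1
        value = self.func(*args)
        self.cache.insert(0, (args, value))
        del self.cache[self.cache_depth:]
        return value


def log_density(x):
    """Unnormalized standard normal log-density, standing in for a model."""
    return -0.5 * x * x


cached = CachedFunction(log_density)
current = cached(1.0)   # evaluated
proposal = cached(2.0)  # evaluated
back = cached(1.0)      # proposal rejected: served from the cache
print(cached.n_evals)   # 2, not 3
```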

Filed under software engineering

A little PyMC2 trick

Here is a little trick for getting around a pesky initialization issue in PyMC2 models, asked and answered on Stack Overflow when things were quiet around here: http://stackoverflow.com/a/27724637/1935494

Filed under software engineering

Reproducible Computational Research by UW Folks

This interesting thing crossed my inbox during the quiet time between quarters:

Inspired by Dave and Randy’s presentations earlier in the quarter, our lab happened to publish two preprints today, both with supplemental GitHub repositories.

As mentioned several times, the reproducible part is hard. I would appreciate any feedback on our attempts to provide data and code, and how they might be improved. Of course you are welcome to comment on preprints if you wish.

1) Heare JE, Blake B, Davis JP, Vadopalas B, Roberts SB. (2014) Evidence of Ostrea lurida (Carpenter 1894) population structure in Puget Sound, WA. PeerJ PrePrints 2:e704v1 http://dx.doi.org/10.7287/peerj.preprints.704v1

GitHub Repo (Data and R scripts): https://github.com/jheare/OluridaSurvey2014

2) Indication of family-specific DNA methylation patterns in developing oysters
Claire E. Olson, Steven B. Roberts
bioRxiv doi: http://dx.doi.org/10.1101/012831

GitHub Repo (IPython notebook): https://github.com/che625/olson-ms-nb/tree/1.0

Any feedback on how we might improve our Repositories is certainly welcome.

Very daring. I hope it was ok to share on my blog. I find this level of transparency inspiring.

The discussion that ensued indicates that there is still room for better tools to archive the computational environment where these analyses are being performed. I’ve always dreamed of doing my whole project in a virtual machine and then freezing it for posterity when I’m done. It would be the digital version of keeping a laptop on my shelf for each analysis. Easier said than done, however.
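Short of freezing a whole virtual machine, a much lighter step is to save a description of the environment next to the results. The sketch below is one way to do that with the Python standard library (Python 3.8 or later); the function name `snapshot_environment` and the output file name are assumptions, and it records only Python packages, not system libraries or data.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Return a JSON-friendly description of the running Python environment."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            packages[name] = dist.metadata.get("Version") or "unknown"
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": packages,
    }


if __name__ == "__main__":
    # Hypothetical file name; keep it alongside the analysis outputs.
    with open("environment_snapshot.json", "w") as f:
        json.dump(snapshot_environment(), f, indent=2, sort_keys=True)
```

A snapshot like this does not make an analysis rerunnable on its own, but it does tell a future reader which versions to look for.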

The discussion also resulted in a new wiki listing code products that accompany UW research projects: https://github.com/uwescience/reproducible/wiki/Code-Products

Filed under software engineering