Put it in a figure+

An old adage when writing research papers is “put it in a figure”. If there is one thing that I want the reader to know when they put my paper down, then I try to put it in a beautiful figure, with a complete explanation in the caption. I saw the extension of this rule to talks recently, and I’m going to try it out myself: if there is one thing you want your audience to remember when they leave your talk, put it in a movie.

Here is the movie that taught me this lesson:

And here is a blog post by one of the video creators, to tell you more about what you’re seeing.


Filed under disease modeling, videos

Causal Modeling in Python: Bayesian Networks in PyMC

While I was off being really busy, an interesting project to learn PyMC was discussed on their mailing list, beginning thusly:

I am trying to learn PyMC and I decided to start from the very simple discrete Sprinkler model.
I have found plenty of examples for continuous models, but I am not sure how should I proceed with conditional tables, especially when the condition is over more than a variable. How would you model a CPT with PyMC, and in particular the Sprinkler model?

I spend all my time on continuous models these days, and I miss my old topics from back in grad school. Not that this is one of them, exactly. But when I found myself with a long wait while running some validation code, I thought I’d give it a try. The model turned out to be simple, although using MCMC to fit it is probably not the best idea.
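For the curious, here is a minimal sketch of the Sprinkler network that skips MCMC entirely and computes a posterior by brute-force enumeration, which is feasible because the model has only three binary variables. It is not the PyMC code from the mailing-list thread. The conditional probability tables are the numbers commonly quoted in textbook presentations of this example, so treat them as assumptions, and the function names are hypothetical.

```python
from itertools import product

# Assumed CPT values (commonly quoted for this example; not from the thread).
P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}
P_WET_GIVEN = {  # keyed by (sprinkler, rain)
    (False, False): 0.0,
    (False, True): 0.8,
    (True, False): 0.9,
    (True, True): 0.99,
}


def bernoulli(p, value):
    """Probability that a binary variable with P(True) = p takes `value`."""
    return p if value else 1.0 - p


def joint(rain, sprinkler, wet):
    """Joint probability P(rain, sprinkler, wet) from the chain rule."""
    return (bernoulli(P_RAIN, rain)
            * bernoulli(P_SPRINKLER_GIVEN_RAIN[rain], sprinkler)
            * bernoulli(P_WET_GIVEN[(sprinkler, rain)], wet))


def posterior_rain_given_wet():
    """P(rain | grass is wet), summing the sprinkler out by enumeration."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True)
                   for r, s in product((True, False), repeat=2))
    return numerator / evidence


print(round(posterior_rain_given_wet(), 4))
```

With these assumed tables, the chance that it rained given wet grass comes out a little above one in three. A CPT conditioned on more than one variable is just a lookup keyed on the tuple of parent values, which is the same idea you would express in a PyMC deterministic node.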


Filed under MCMC

OWS in Theory

On his blog “in theory”, Luca Trevisan sparks a CS Theory discussion about the police repression of students supporting Occupy Wall St.


Filed under Mysteries

How I spent my fall vacation

As mentioned, HA took a brief vacation while I worked hard on my disease modeling system for the Global Burden of Disease 2010 study. At first, I thought I was writing a book about the methods, but as I wrote, I found more and more things that I wanted to implement differently. So I switched to rewriting the whole implementation, which seemed like an ambitious one-week project. Two months later, I’m very happy with the results, and they’re online just in time for my users to crunch many numbers.

Of course, there are still plenty of issues that are coming up, and I still have to get back to writing the book about this approach. But I miss the variety of the blogging, and I’m starting it back up, even if I have plenty of other writing responsibilities.

Also, I love the wisdom of the internet. Dear readers, here is the description of this integrative disease modeling book I’ve been working on. Does it sound like something you’ve seen before? I’ve found that the mathematics I’m using have been rediscovered many times in many fields, all of which I would like to know more about.

Integrative systems modeling of disease in populations is the first book-length treatment of model-based meta-analytic methods for descriptive epidemiology. It develops, from first principles, the system dynamics model which constitutes the theoretical foundation of Years Lived with Disability (YLD) estimation in burden of disease studies. This compartmental model of the progression of disease through a population has been used for over ten years in global health epidemiology in the popular generic disease modeling system DisMod II, distributed by the World Health Organization. However, until now, the description of the model and the methods behind the software have been scattered through the scientific literature in a loose collection of journal articles and operations manuals.

In addition to collecting the prior work on compartmental modeling of disease together in one place, this book significantly extends the model, by formally connecting the system dynamics model of disease progression to a statistical model of epidemiological rates, the kind that are calculated in descriptive epidemiological research and collected through systematic review. This combination of system dynamics modeling and statistical modeling, which the author calls integrative systems modeling, allows the model to integrate all available relevant data. Because advanced numerical algorithms are needed to fit these complex models, a section of the book provides the necessary background on Markov chain Monte Carlo (MCMC) computation.

Experience with the results of systematic review indicates that when all available relevant data is collected, it is often very sparse and very noisy. The integrative systems models developed in this book focus particularly on techniques for handling sparse, noisy data. The book explores statistical models for over-dispersed count data, covariate modeling to both explain systematic variation in epidemiological rate data and increase predictive accuracy for estimates for subpopulations where no data is available, and age-pattern modeling to systematically incorporate expert knowledge about how quickly epidemiological rates can vary as a function of age. It also develops a novel theory of age group modeling to address heterogeneity in age groups commonly found during systematic review.

The theoretical foundations of integrative systems modeling of disease in populations are complemented with a series of applications of the model to meta-analysis of more than a dozen different diseases. These practical applications provide a unique opportunity to see how the model performs in a variety of scenarios, and also demonstrate how the model performs when the model assumptions are violated, and how to work around model assumption violations.

The book concludes with a detailed description of the future directions for research in model-based meta-analysis of descriptive epidemiological data and integrative systems modeling for global health.


Filed under disease modeling

Resuscitating Healthy Algorithms

Wow, 2.5 months can just fly by! I’m crawling out from under a crushing workload, and ready to rejoin the world of applied theoretical computer science and python hacking. Did I miss anything?


Filed under Uncategorized

Flock of VA papers

I’m afraid that Healthy Algorithms will be pretty quiet in the next month. I’ve got some other major writing commitments to attend to, and I need to ration my keystrokes if I’m going to make the deadline.

But here is something I’m happy to leave at the top of the page while I’m busy: the special issue of Population Health Metrics devoted to the Verbal Autopsy is provisionally available.

This includes the paper on using random forests for computer coding verbal autopsies that I’ve mentioned before, a paper describing the massive efforts that went into collecting a verbal autopsy validation dataset, and a paper on our take on the metrics of prediction quality that we recommend for any approach to verbal autopsy.

Bonus, a commentary that quotes Foucault to put random forests in context.


Filed under global health

Fiction and a Fictional Math Book

I read The Girl With the Dragon Tattoo series recently, which was extremely engrossing. The first book has a bit of a health metrics theme, with each section prefaced with a shocking statistic about violence against women in Sweden. The second book has a bit of a math theme, with each section prefaced by a correct, if inane, algebraic equation.

Also in the second book, the tattooed girl spends some time reading a strangely titled math book, Dimensions in Mathematics, and I liked the story enough to google the book, since it was presented with author and publisher. It turned out that this just revealed more mystery.


Filed under Mysteries

Other Way Cool Demos from SciPy 2011

Besides the marvelous upgrade to ipython, there were some other things I saw at SciPy 2011 that I want to remember to remember.

I think I’ll have a lot more to say about Dexy soon, because I really need something like that. A tool to make documentation sexy. If only the tool itself had more documentation!


Filed under software engineering

Coolest Demo at SciPy 2011

Speaking of SciPy 2011 (as I was in my last post), the coolest, most jaw-dropping-est demo I saw there was hands-down for the new ipython. The most cutting edge stuff is available on the web. I want it.


Filed under software engineering, Uncategorized

PyMC at SciPy 2011

I just returned from the SciPy 2011 conference in Austin. Definitely a different experience than a theory conference, and definitely different than the mega-conferences I’ve found myself at lately. I think I like it. My goal was to evangelize for PyMC a little bit, and I think that went well. I even got to meet PyMC founder Chris Fonnesbeck in person (about 30 seconds before we presented a 4-hour tutorial together).

For the tutorial, I put together a set of PyMC-by-Example slides and code to dig into that silly relationship between Human Development Index and Total Fertility Rate that foiled my best attempts at Bayesian model selection so long ago.

I’m not sure the slides stand on their own, but together with the code samples they should reproduce my portion of the talk pretty well. I even started writing it up for people who want to read it in paper form, but then I ran out of momentum. Patches welcome.


Filed under education, global health, MCMC, statistics