MCMC Mixing Mysteries

I love doing Math Reviews, for the random stuff I get to read. Here is a paper I probably would not have found:
MR3208124 Román, Jorge Carlos, Hobert, James P. and Presnell, Brett, On reparametrization and the Gibbs sampler, Statist. Probab. Lett. 91 (2014), 110–116.

The Gibbs sampler may be out of fashion with the Bayesian computation crowd these days, but the reparameterizations are still mysterious. I tried the PyMC3 NUTS sampler on the first example and it too has rather different mixing times:

Occupation codes in the NHIS

This is my new obsession. Does anyone know what I should know about self-reported occupation data in NHIS surveys?


Extra Journal Club: Cd exposure and neurodevelopment

I’m sure reading a lot lately. That is good. This week, I’m filling in for the PBF journal club, too, and today we’ll be discussing Ciesielski et al.’s paper Cadmium Exposure and Neurodevelopmental Outcomes in U.S. Children, which uses 6 years of NHANES data to weigh the evidence that low levels of cadmium cause learning disabilities in children.

All the data is available on the CDC’s website, so I thought I’d take a look at it. Here is an interesting little plot that popped out: prevalence of parent-reported learning disabilities in 6-15 year olds as a function of income-to-poverty-line ratio.


Would you have expected that?

OWS in Theory

On his blog “in theory”, Luca Trevisan sparks a CS Theory discussion about the police repression of students supporting Occupy Wall St.

Fiction and a Fictional Math Book

I read The Girl With the Dragon Tattoo series recently, which was extremely engrossing. The first book has a bit of a health metrics theme, with each section prefaced with a shocking statistic about violence against women in Sweden. The second book has a bit of a math theme, with each section prefaced by a correct, if inane, algebraic equation.

Also in the second book, the tattooed girl spends some time reading a strangely titled math book, Dimensions in Mathematics, and I liked the story enough to google the book, since it was presented with author and publisher. It turned out that this just revealed more mystery.

The surprising (to economists) truth about what motivates

I’ve been watching really fun 10 minute talks lately on YouTube. They are put together by the Royal Society for the encouragement of Arts, Manufactures and Commerce (weird name, huh? It seems they prefer “RSA” for short. But I’m still enough of a computer scientist to think that acronym is taken.)

Here is one that crossed my inbox yesterday, a talk by Dan Pink about what motivates us:


Inequality vs Stuff

I went to a talk a few weeks ago by Richard Wilkinson and Kate Pickett, global health researchers who have written a book called The Spirit Level.  They were quick to explain that, while the name makes perfect sense in British English, it has been a source of continuing confusion in American English.  What is a “spirit level”?  It’s a building tool, a type of ruler with little bubbles in it to show when it is parallel to the ground.  Maybe it’s called a carpenter level in the states, or just a level when the context is clear.

I would have called it “Inequality vs Stuff”, or at least that’s my description of the talk:  a vast array of scatterplots showing the relationship between income inequality and different measurements of population health.  Here is one that is typical for their case:

When they told the story, they started with a composite health index scattered against inequality, since that has much less noise, and then used the noisy plots like this one as supporting evidence when they showed that the relationship holds for everything.

The slide that stuck with me the most is one that diverged from their story a little:

Not population health this time, but still interesting.  Something to share with your entrepreneur friends.

These plots seem like enough fun that I made my own, based on a question from the question and answer portion of the talk.  I’ve forgotten who, but someone in the audience asked “How is inequality related to total fertility rate?” and the answer from Wilkinson and Pickett was along the lines of “We never thought to check, how do you think it might be related?”

Since I had the data lying around from my attempts to learn about model selection last summer, I made myself the plot.  Turns out there is not much of an association.  The only example of a non-association the speakers mentioned was a surprise to them: suicide rates are not correlated with income inequality.


Machine Translation and the Porpoise Corpus

I might have mentioned that I got to do some world traveling for my work recently. Seeing rural Tanzania was an experience that I still don’t really have good words to describe. But this is not a post about that. This is a post about a sticky idea I got stuck on in some science fiction I was reading during my multi-day to and fro travel.

On my around-the-world-in-4.5-days journey, I read the Jewish feminist sci-fi novel He, She, and It by Marge Piercy. It’s got a classic hard AI theme, about a robot that is so, so human… I’d recommend it. But dilemmas of whether a robot can make a minyan in the reform tradition of 2059 have not stuck in my mind the way this one line about whales has:


US Health Care Costs, cont.

I wrote two months ago about the mysterious differences in health care costs that I found so intriguing in a talk by Jonathan Skinner. (That was two months ago? Really?) Since then, the surgeon/author Atul Gawande has brought the mystery to the national stage. In a long story for the New Yorker, he gave the non-technical version of Skinner’s talk, and today he addressed some of the feedback that this article has received over the last month.

His short answer to the mystery is this:

Analysis of Medicare data by the Dartmouth Atlas project shows the difference is due to marked differences in the amount of care ordered for patients—patients in McAllen receive vastly more diagnostic tests, hospital admissions, operations, specialist visits, and home nursing care than in El Paso.

But that is not the end of the story. It only takes a sentence to explain the “proximal” cause of these cost differences, but it takes the whole article for Gawande to do justice to his theory on the underlying cause, and his is certainly not the only theory.

Since his theory of the root cause of this inequality is centered on physicians putting profit over patients, it has made some doctors uneasy. Greg Roth, a physician I work with, hadn’t had time to read the article when we last chatted, but he did attend Skinner’s talk with me two months ago. Greg told me about a detail that has emerged as doctors put Gawande’s article under their microscopes: we might be making a mountain out of a molehill-sized mystery.

Look at this plot, which shows the complementary cumulative distribution function for the primary quantity in Gawande’s article, Total Medicare reimbursements per enrollee for 2006.
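For anyone who wants to draw this kind of plot themselves, here is a minimal sketch of an empirical complementary CDF in Python with NumPy. The function name and the five cost values are made up for illustration; they are not the Dartmouth Atlas numbers.

```python
import numpy as np


def empirical_ccdf(values):
    """Return the sorted values x and, for each, the fraction of observations >= x."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    # Number of observations strictly below each sorted value (ties share a count).
    n_below = np.searchsorted(x, x, side="left")
    frac_at_least = (n - n_below) / n
    return x, frac_at_least


# Hypothetical per-enrollee costs, NOT real Medicare data.
toy_costs = [6000, 7000, 7500, 8000, 15000]
x, p = empirical_ccdf(toy_costs)
for cost, frac in zip(x, p):
    print(f"{cost:8.0f}  {frac:.2f}")
```

Plotting p against x, often with a logarithmic vertical axis, makes it easy to see how far out in the tail a single region sits.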

Investigative reporters have to get the story, and raking the muck way out in the tail of this distribution turned out to be a good bet this time. But McAllen is 6 standard deviations above the mean (not to imply that this distribution is normal… should it be?). How much impact would it have, for the whole population, if the outliers were greatly improved?

If through anti-fraud policing, better culture, and general hard work, the top 10% of hospitals reduced their cost per patient to the national average, that would reduce the average cost by 3.6%. Outliers show what is possible, but making a big change involves more than outliers.
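The arithmetic behind that kind of estimate can be sketched as follows. This is an illustration under assumptions, not the calculation behind the 3.6% figure: it weights every unit equally (no enrollee weights), and the function name and the toy numbers are hypothetical.

```python
import numpy as np


def reduction_from_capping_top(costs, top_fraction=0.10):
    """Fractional drop in the mean if the most expensive units fell to the current mean.

    Assumption: every unit counts equally (unweighted mean).
    """
    costs = np.asarray(costs, dtype=float)
    mean_before = costs.mean()
    # How many units make up the top fraction (rounded to a whole number).
    k = int(round(top_fraction * costs.size))
    # Indices of the k most expensive units.
    top_idx = np.argsort(costs)[costs.size - k:]
    capped = costs.copy()
    capped[top_idx] = mean_before
    return (mean_before - capped.mean()) / mean_before


# Hypothetical toy data: nine cheap units and one expensive outlier.
toy_costs = [8.0] * 9 + [28.0]
print(f"{reduction_from_capping_top(toy_costs):.1%}")
```

Swapping the toy list for real per-region reimbursements, and the unweighted mean for an enrollee-weighted one, would be the natural next step.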


Mysterious Question: Differences in Health Care Costs

Health economist Jonathan Skinner gave a talk at IHME about a week and a half ago. He told us about his work on the Dartmouth Atlas of Healthcare, and showed us some of the numbers he’s crunched on the variation of Medicare costs by region. He has found this mysterious, 2.5x variation in the cost of care between expensive regions (like Miami) and inexpensive regions (like Seattle). It seems like a great mystery, and I’ve been puzzling over it for a week now. Any theories? I’m partial to network effects.

Here’s his paper on the subject.


