Baby faces and Chernoff faces

I’ve been spending a lot of time looking at faces lately. Baby faces, to be specific. Baby faces are just wonderful, especially my baby, if I do say so myself. In fact, that adorable baby, said face, and said staring are making me forget my “blogging” voice. I think I can recover it with a little effort.

There is a face-related visualization technique that I’ve been planning to write about, and this seems like just the moment: recall Chernoff Faces. This 1970s-era method for multidimensional data visualization is so cute that it has been a recurring example in visualization education for 40 years, but it is so hard to use that it is almost never used in practice. Have you seen it before?

There are two major problems with this method: it is almost always a mistake to use it for public communication (Eugene Turner’s map of LA above is a rare exception), and it is almost always a mistake to use it in a dashboard. It has lived on from its original paper because it is so cute, but it made it to publication in its original paper because it was used appropriately: as a human computation aid in a clustering task.


It looks to me like there are three different shaped faces in the above figure. Agree?

This hasn’t always gone so well, though, as an example from a subsequent application by different authors shows below. In the following, do you see anything worth grouping together? I guess they show only a single exemplar from each cluster here, so they’re all supposed to look different.

But I’m still excited about it, for the same reason as everyone else: faces are so cute! I have come up with a related human computation task where it may also be useful: outlier detection.

I’m not the first to want to do this, and it has even been attempted in health services research before:

(from here)


(from here)

A quick search turns up lots of code for generating faces from multidimensional data, but nothing as cute as Chernoff’s original work. I’ll remedy that in a near future post. Unless you know of something already out there that I missed.

Filed under dataviz

R in ipython

This is going to be great. Now I’ve got to figure out how to make Rpy work, though!

There is a funny little discussion on Twitter, which is where I found out about it.

Filed under Uncategorized

IHME opportunities

Post-Graduate Fellowship Program
Advancing the Science of Health Measurement through Innovation, Education, and Collaboration

The Post-Graduate Fellowship program is for recent PhD and MD researchers and combines academic research, education and training, and professional work with progressive, on-the-job training and mentoring from an illustrious group of professors and researchers. We are now accepting applications for our 2013 cohort. The program description and instructions on how to apply are attached and linked below. Our application deadline is November 1, 2012.

 

For more information on how to apply, please visit our Web page:
http://www.healthmetricsandevaluation.org/education-training/post-graduate-fellowship

PhD in Global Health
A Measurable Difference

The new PhD program in Global Health builds on the expertise of our faculty in the areas of Metrics and Implementation Science. This unique, interdisciplinary program is comprised of a core curriculum in advanced quantitative methods, epidemiology, population health measurement, impact evaluations, and implementation science methods. Students develop skills through a combination of didactic courses, seminars, and research activities including primary data collection and analysis. This program trains global health researchers for careers in academic institutions, international organizations, ministries of health, foundations, and the private sector. Our application deadline is December 1, 2012.

For more information on how to apply, please visit our Web page: http://globalhealth.washington.edu/phd

Filed under education

MCMC in Python: (approximate) derivative-constrained Gaussian Processes with PyMC.gp

I’ve always enjoyed the Gaussian Process part of the PyMC package, and a question on the mailing list yesterday reminded me of a project I worked on with it that never came to fruition: how to implement constraints on the derivatives of the GP.

The best answer I could come up with is to use “potential” nodes, and do it approximately. That is to say, instead of constraining the derivative, I settle for constraining a secant that approximates the derivative. And instead of constraining it at every point in an interval, I settle for constraining it at a discrete subset of points.

Here is an ipython notebook example: [ipynb] [py]
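To make the secant idea concrete, here is a minimal NumPy-only sketch. It is not the code from the notebook, and it does not use PyMC at all. The function names, the step size `h`, and the mesh are all hypothetical choices for illustration. The function returns 0 when every secant slope on the mesh falls within the allowed bounds and negative infinity otherwise, which is the sort of log-probability value a potential node would contribute to the model.

```python
import numpy as np

def secant_slopes(f, x, h=0.1):
    """Secant approximation to f'(x): (f(x + h) - f(x)) / h."""
    x = np.asarray(x, dtype=float)
    return (f(x + h) - f(x)) / h

def derivative_constraint_logp(f, mesh, lower=0.0, upper=np.inf, h=0.1):
    """Return 0.0 if every secant slope on the mesh lies in [lower, upper],
    and -inf otherwise (a hard constraint expressed as a log-potential)."""
    slopes = secant_slopes(f, mesh, h)
    if np.all((slopes >= lower) & (slopes <= upper)):
        return 0.0
    return -np.inf

# Constrain at a discrete subset of points instead of the whole interval.
mesh = np.linspace(0.0, 1.0, 11)

# An increasing function satisfies the "non-negative slope" constraint...
increasing_logp = derivative_constraint_logp(np.exp, mesh)

# ...while a function that turns downward on the interval violates it.
wiggly_logp = derivative_constraint_logp(lambda x: np.sin(10.0 * x), mesh)
```

In an MCMC setting, `f` would be the current realization of the GP, so any proposed realization with a decreasing secant on the mesh is rejected. Between mesh points the function can still violate the constraint, which is why this is only approximate.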

Filed under MCMC

Gratitude

I’m on a self-imposed blog fast until I finish sending out thank-you notes for all of the love, support, and baby gifts I’ve received over the last few months. (Plus there is a book manuscript on my desk with hundreds of copy edits to make…)

But I’m breaking my rule to announce that I’ve been selected by Technology Review as a TR35 Young Innovator. Thanks for the kind words and support I’ve received! Also, thanks to my colleagues at IHME and elsewhere for all of the hard work; without this labor, my innovations would languish in obscurity.

Filed under general

Turing Centennial Series from In Theory blog

I have greatly enjoyed the series of posts that Luca Trevisan brought together in honor of Alan Turing’s 100th birthday. He introduced the series as follows:

Within the Turing festivities, I think it would be interesting to talk about how things have changed (or not) since Turing’s time for people who do academic work in cryptography and in the theory of computing and who are gay or lesbian.

 

So I have invited a number of gay and lesbian colleagues to write guest posts talking about how things have been for them, and so far half a dozen have tentatively accepted. Their posts will appear next month which, besides being Turing’s centennial month, also happens to be the anniversary of the Stonewall riots.

 
The 8 contributed posts are collected here, and I recommend that you start from Post 0 and work your way up.

Filed under Uncategorized

The road to knowledge is asking many questions

I got a little peeved when reading too many documents on the computer this morning, and I asked the internet “Why does Acrobat have hyperlinks but no back button???”

And it told me: Just use ALT-back arrow, or click a few times and you can add it.

Thanks internet.

Filed under Uncategorized

Healthy algorithms, healthy baby

Healthy Algorithms has been quiet for the last two months, because I have had a new project to keep me occupied: Sidney.

Filed under general

MCMC in Python: Bayesian meta-analysis example

In slow progress on my plan to go through the examples from the OpenBUGS webpage and port them to PyMC, I offer you now Blockers, a random effects meta-analysis of clinical trials.



[py] [pdf]

Filed under MCMC, software engineering

Understanding the Elsevier Boycott

Hello Dear Readers,

Can someone help me quickly get up to speed on the Elsevier boycott? I’ve had a read through thecostofknowledge.com and even skimmed through Tim Gowers’s statement of purpose. What I’m missing is this: what are the demands of the boycott? I’m delighted to have an excuse to refuse a request for refereeing, but how can my boycott be genuine if Elsevier has no way to make things right?

Filed under general