Daily Archives: November 27, 2012

Random Forests as in the Verbal Autopsy

Here is an interesting method that spans my old life in theory world and my new one in global health metrics: the Random Forest.

This is a technique that grows (get it?) out of research on decision trees, and it is a great example of how combining a few simple ideas can get complicated very quickly.

The task is the following: learn from labeled examples. (Is this yet another baby-related research topic? Not as directly as the last few.) To be specific, I start with a training data set. For the task at hand in global health, this may be the results of verbal autopsy interviews, all digitized and encoded as numeric data, together with the true underlying cause of death (as identified by gold-standard clinical diagnostic criteria) as the labels.

To “learn” in this case means to build a predictor that can take new, unlabeled examples and assign a cause of death to them.
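To make the setup concrete, here is a minimal sketch of what "learning from labeled examples" looks like in code. Everything in it is an assumption for illustration: the three yes/no symptoms, the two causes, and the `learn` function are made up, and the predictor is a plain 1-nearest-neighbor rule, not a random forest.

```python
# Hypothetical toy data: each row is one interview encoded as 0/1 answers
# (fever, cough, chest pain). The symptoms and causes are invented.
train_X = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
]
train_y = ["pneumonia", "pneumonia", "heart disease", "heart disease"]


def learn(X, y):
    """Build a predictor from labeled examples (1-nearest neighbor)."""
    def predict(x):
        # Hamming distance: how many answers differ from each training row.
        distances = [sum(a != b for a, b in zip(row, x)) for row in X]
        nearest = distances.index(min(distances))
        return y[nearest]
    return predict


predictor = learn(train_X, train_y)
prediction = predictor([1, 0, 0])  # a new, unlabeled example
```

The point is only the shape of the problem: numeric rows and labels go in, and a function that labels new rows comes out.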

The first simple idea needed for the random forest is the decision tree, and I found a nice YouTube video that explains it, so I don’t need to write it up myself:

Well, this video is not perfect; if you have not seen this before, you may be left with a few questions.
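For readers who prefer code to video, here is a rough sketch of a decision tree on yes/no features. It is one common way to grow a tree (greedy splits chosen by Gini impurity), not necessarily what the video describes, and the data, symptom columns, and function names are all hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels (0 means all labels agree)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, features):
    """Grow a tree on 0/1 features; a leaf is a label, a node is a tuple."""
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1 or not features:
        return majority

    def split_cost(f):
        # Weighted impurity of the two groups produced by asking about f.
        no = [label for row, label in zip(X, y) if row[f] == 0]
        yes = [label for row, label in zip(X, y) if row[f] == 1]
        if not no or not yes:
            return float("inf")  # this question does not separate anything
        return (len(no) * gini(no) + len(yes) * gini(yes)) / len(y)

    best = min(features, key=split_cost)
    if split_cost(best) == float("inf"):
        return majority
    rest = [f for f in features if f != best]
    branches = []
    for value in (0, 1):
        rows = [(row, label) for row, label in zip(X, y) if row[best] == value]
        sub_X = [row for row, _ in rows]
        sub_y = [label for _, label in rows]
        branches.append(build_tree(sub_X, sub_y, rest))
    return (best, branches[0], branches[1])


def predict(tree, x):
    """Follow the yes/no questions down to a leaf."""
    while isinstance(tree, tuple):
        feature, if_no, if_yes = tree
        tree = if_yes if x[feature] == 1 else if_no
    return tree


# Hypothetical columns: fever, cough, chest pain.
X = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
y = ["pneumonia", "pneumonia", "heart disease", "heart disease"]
tree = build_tree(X, y, features=[0, 1, 2])
```

On this toy data a single question about the first column already separates the two causes, so the tree has one internal node and two leaves.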


Filed under machine learning