# Matching Algorithms and Reproductive Health: Part 2, Matching and Virginity Pledges

I might have been a little over-ambitious with this series. Two weeks ago I wrote a little bit about how matching theory emerged from the social sciences. But then I got really busy! And that was the part I actually knew something about ahead of time. The promised connection between matching algorithms and reproductive health (and more generally, how matching is being used in quasi-experiment design) is the part that I have to do some reading on before I can write knowledgeably about it.

However, I have a plan: I’d like to “crowd-source” my library research. I’ve read a little bit about a grand experiment that is going on right now, wherein Tim Gowers, Terry Tao, and other mathematicians, well-known and unknown, are collaborating online to try to develop a combinatorial proof of the density Hales-Jewett theorem. I’ll try a small-scale version of their amazing project. I don’t expect it to draw in the same heavy-weight-champion mathematicians that Gowers’s polymath project has attracted, but it will not require background knowledge of Szemerédi’s regularity lemma or the triangle-removal lemma (which currently have no applications to Global Public Health, but I’m on the lookout…).

The article that made me think I could title a series of posts “Matching Algorithms and Reproductive Health” came out a month or so ago in the journal of the American Academy of Pediatrics: Janet Elise Rosenbaum, Patient Teenagers? A Comparison of the Sexual Behavior of Virginity Pledgers and Matched Nonpledgers, Pediatrics, Vol. 123 No. 1 Jan 2009. (See? It has “match” in the title.) One nice thing about these medical journals is that they give good summaries of the articles, sort of an abstract-of-the-abstract. Patient Teenagers in two sentences:

What’s Known on This Subject
Two studies have found, by using regression, that virginity pledges delay sex, but regression cannot correct for large preexisting differences between pledgers and nonpledgers.

What This Study Adds
We used a more robust method than regression to compare virginity pledgers with similar nonpledgers and found virtually no difference in sexual behavior or STDs and much less use of condoms.

This is great ammo for the culture wars, and it was headline news for a few days when it came out. In blog time, I’m way late in mentioning it. But my interest is in the more robust method that Rosenbaum uses. It is something that statisticians call “matching”, but when they say matching, they don’t mean finding a subgraph with maximum degree one. I have a feeling that what they mean is related, however. Can we figure it out together?

It seems like there is an R package that is popular for this sort of analysis, matchit, and there is an award-winning 38-page paper explaining the ideas behind the code. Have a look and report back. I will too, when I have another 90 minutes free.


### 4 responses to “Matching Algorithms and Reproductive Health: Part 2, Matching and Virginity Pledges”

1. I didn’t realize this while I was writing, but now I’m pretty sure that I went to high school with Janet Rosenbaum! Small world, huh?

2. Aram Harrow

Hi Abie,
I think this kind of data-set matching includes the combinatorial kind, but also a more generalized pruning of the data set, in which control and treatment look nearly the same in all the pre-treatment dimensions, at least in some aggregate sort of way.

From page 212 of the paper:

Although one-to-one exact matching can eliminate model dependence and any bias from incorrect assumptions made during the parametric stage of analysis, it is not the only way to break the link between $T_i$ and $X_i$, since satisfying equation (11) only requires the distributions to be equivalent. Thus, to be clear, matching does not require pairing observations (indeed, there might have been less confusion if the technique had been called ‘‘pruning’’); only the distributions need be matched as closely as possible.
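
To make the "pruning" reading concrete, here is a minimal Python sketch of exact matching done without any pairing. The data layout (a list of `(treated, covariates)` tuples), the function name `exact_match_prune`, and the toy records are all assumptions made up for illustration; none of it comes from the paper or from matchit.

```python
from collections import defaultdict


def exact_match_prune(units):
    """Keep only units whose covariate profile appears in BOTH groups.

    `units` is a list of (treated, covariates) tuples, where `treated` is a
    bool and `covariates` is a hashable tuple of pre-treatment values.
    This layout is a hypothetical one chosen for the example.
    """
    groups_seen = defaultdict(set)
    for treated, covariates in units:
        groups_seen[covariates].add(treated)
    # A profile survives only if some treated AND some control unit share it.
    return [(t, x) for t, x in units if groups_seen[x] == {True, False}]


# Made-up records: (pledged?, (sex, age))
units = [
    (True, ("F", 16)),
    (False, ("F", 16)),
    (False, ("F", 16)),
    (True, ("M", 17)),   # no control with this profile, so it is pruned
    (False, ("M", 15)),  # no treated unit with this profile, so it is pruned
]
kept = exact_match_prune(units)
```

Nothing here pairs one pledger with one nonpledger: the surviving profile keeps one treated unit and two controls. With several surviving profiles, the two groups would in general still need weights before their covariate distributions agree exactly.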

3. I found that the matchit documentation is a quicker way to get to the bottom of this than the matchit paper.

Check out section 3.2.1.4, Optimal Matching. This is one of 7 ways recommended, but, if I’m reading it right, it is exactly a minimum weight perfect matching of the treatment subjects to the control subjects.
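
Here is a rough Python sketch of what I take "optimal matching" to mean, assuming the distance between two subjects is just the absolute difference in a single covariate. The function name, the brute-force search, and the toy ages are my own assumptions for illustration; this is not how matchit computes it, and trying every assignment is only feasible for tiny inputs.

```python
from itertools import permutations


def optimal_matching(treated, controls):
    """Brute-force minimum total distance matching.

    Assigns every treated value to a distinct control value so that the sum
    of absolute differences is as small as possible. Requires
    len(controls) >= len(treated). Illustration only: the search is
    factorial in the input size.
    """
    best_cost, best_pairs = float("inf"), None
    for chosen in permutations(range(len(controls)), len(treated)):
        cost = sum(abs(treated[i] - controls[j]) for i, j in enumerate(chosen))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(chosen))
    return best_cost, best_pairs


# Hypothetical ages of treated and control subjects
treated = [40, 50]
controls = [45, 10]
cost, pairs = optimal_matching(treated, controls)
# cost is 35, pairs is [(0, 1), (1, 0)]: 40 goes with 10, and 50 goes with 45
```

The example is chosen so that the order matters: pairing each treated subject in turn with its nearest unused control would give 40 with 45 and 50 with 10, for a total distance of 45, while the minimum total is 35.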