I’m preparing for my first global travel for global health, but the net is paying attention to a paper that I think I’ll like, and I want to mention it briefly before I fly.
Computational Complexity and Information Asymmetry in Financial Products is 27 pages of serious TCS, but it is so obviously applicable that people outside of our particular ivory tower, and even outside of academia entirely are blogging and twittering about it, and even reading it!
Freedom to Tinker has a nice summary of this paper, if you want to know what it’s about in a hurry.
Mike Trick makes the salient observation that NP-hard doesn’t mean computers can’t do it. But the assumption that this paper is based on is not about worst-case complexity; it is, as it should be, based on an assumption about the average-case complexity of a particular optimization problem over a particular distribution.
As it turns out, this is an average-case combinatorial optimization problem that I know and love, the densest subgraph problem. My plan is to repeat the problem here, and share some Python code for generating instances of it. Then, you, me, and everyone, can have a handy instance to try optimizing. I think that this problem is pretty hard, on average, but there is a lot more chance of making progress on an algorithm for it than for cracking the P versus NP nut. Continue reading
(Tap… tap… tap… is this thing on? Good.)
July was vacation month, where I went on a glorious bike tour of the Oregon/California coast, and learned definitively that I don’t like biking on the side of a highway all day. Don’t worry, I escaped in Coos Bay and took trains and buses between Eugene, Santa Cruz, Berkeley, and SF for a vacation more my speed.
But now that I’m back, August is turning out to be project month. I have 3 great TCS applications to global health in the pipeline, and I have big plans to tell you about them soon. But one mixed blessing about these applications is that people actually want to see the results, like, yesterday! So first I have to deal with the results, and then I can write papers and blogs about the techniques.
Since Project Month is a little over-booked with projects, I’m going to have to triage one today. You’ve heard of the NetFlix Challenge, right? Well, github.com is running a smaller scale recommendation contest, and I was messing around with personal page rank, which seems like a fine approach for recommending code repositories to hackers. I haven’t got it working very well (best results, 15% of holdout set recovered), but I was having fun with it. Maybe someone else will take it up, let me know if you get it to work; networkx + data = good times.
f = open('download/data.txt')
for l in f:
u_id, r_id = l.strip().split(':')
[get the code]
This is the final item in my series on Matching Algorithms and Reproductive Health, and it brings the story full circle, returning to the algorithms side of the show. Today I’ll demonstrate how to actually find minimum-weight perfect matchings in Python, and toss in a little story about . Continue reading
It’s been three weeks and one IHME retreat since I wrote about matching algorithms and virginity pledges, and I think I now understand what’s going on in Patient Teenagers well enough to describe it. I’ll try to give a stylized example of how the minimum-weight perfect matching algorithm makes itself useful in reproductive health research.
I think it’s helpful to focus on a concrete research question about the virginity pledge and its effects on reproductive health. Here’s one: “does taking the pledge reduce the chances that an individual contracts trichomoniasis?” If the answer is yes, or if the answer is no, people can still argue about the value of the virginity pledge programs, but this seems like relevant information for decision making. Continue reading
I might have been a little over-ambitious with this series. I wrote a little bit about the how matching theory emerged from the social sciences two weeks ago. But then I got really busy! And that was the part I actually knew something about ahead of time. The promised connection between matching algorithms and reproductive health (and more generally, how matching is being used in quasi-experiment design) is the part that I have to do some reading on before I can write knowledgeably about.
However, I have a plan: I’d like to “crowd-source” my library research. Continue reading
Earlier this week, I was inspired by current events to launch a bold, crazy-sounding series about matching theory and its application to reproductive health. This first installment is a quick social history of the development of matching theory, largely influenced by (and fact-checked against) Lex Scrijver’s encyclopediac Combinatorial Optimization: Polyhedra and Efficiency. His paper “On the history of combinatorial optimization (till 1960)” contains similar information in an easy-to-download form.
On to the story: how social science applications drove the development of matching theory. Continue reading
One of the first things on Obama’s agenda after being sworn in as President last week was lifting the “global gag rule”, a Regan-era innovation that tied US aid to strict anti-choice regulations. Meanwhile, the TCS reading group at UW has been studying matching problems and Edmond’s blossom algorithm. Together, this has been the motivation I needed to launch a series of posts about applications of matchings in reproductive health metrics. Part 1 will have more about matchings.
Friday, already! Happy fragile ceasefire in Gaza, happy wedding to Daniel and Anna, happy Martin Luther King Day, and happy presidential inauguration.
I pass by this mural of MLK almost every day, and I find it incredibly expressive. Maybe it has something to do with the Seattle weather. Today I thought he was looking tired, but hopeful.
Is there some research paper out there to distract me from thinking about politics too much? How about a networks paper with applications to politics? An Arxiv paper from the Physics and Society section was updated recently, and it caught my eye with some beautiful pictures. I’ll explain a little, or you can head straight for their figure on the interactions between presidential candidates and corporations. Continue reading
Sometimes, instead of working, I like to see what search terms are bringing readers to my blog. The most common search that healthyalgorithms has been most useless for is “minimum spanning tree python”. Today, I’ll remedy that.
But first, dear searchers, consider this: why are you searching for minimum spanning tree code in python? Is it because you have a programming assignment due soon? High-school CS class is voluntary. All college is optional, and many you are paying to attend. You know what I’m talking about? Perhaps the short motivational comic Time Management for Anarchists is better than some Python code.
Still want to know how to do it? Ok, but I warned you.
I didn’t make any best-of-the-year lists, but I support the idea. I also support new year’s resolutions, but I’m not going to write about mine.
But the internet picks up the slack.
FlowingData has a 5 Best Data Vis of the year list, which I’m fond of. It includes the beautiful Streamgraphs of Byron and Wattenberg. Their technical report has some fun applications of combinatorial optimization to aesthetics.
Lance Fortnow has a nice Complexity Year in Review on the Computational Complexity Blog. Unfortunately, I don’t have a beautiful illustration of Prasad’s result that Unique Games Conjecture implies semidefinite relaxations have optimal approximation ratios.