Value of Privacy in Census Data (Holocaust Remembrance Day edition)

In October of 2019, the Harvard Data Science Review hosted two workshop sessions on Differential Privacy in the 2020 Census. I watched them remotely, and I thought they were interesting, but I don’t remember anything really surprising being presented. Simson Garfinkel from the Census Bureau asked a question towards the end that stuck with me, though: can people who advocate for privacy protection in census data do some sort of cost-benefit analysis to quantify the value of an imprecise census?

I don’t know of any work which has presented such an analysis explicitly.  But I did read a chapter in a history book that struck me as relevant.  It is a bit heavy, so I have never before thought it was appropriate to push into the conversation.  But today is Holocaust Remembrance Day, so if not today, well…

The history book is IBM and the Holocaust by Edwin Black.  The chapter is “France and Holland” and these two countries, both occupied by Nazis, are the paired comparison for valuing census privacy. Black writes, “German intentions in both countries were nearly identical and unfolded in a similar sequence throughout the war years.  But everything about the occupation of these lands and their involvement with Hitler’s Holleriths was very different. For the Jews of these two nations, their destinies would also be quite different.”

In Holland, the Nazis found a census director who, although not an anti-Semite, loved tallying population metrics to the exclusion of considering their possible effects on populations. “Theoretically,” he wrote, “the collection of data for each person can be so abundant and complete, that we can finally speak of a paper human representing the natural human.” He was tasked with creating a registry of Dutch Jews for the Nazi occupiers: “[t]he registry must [contain] age, profession, and gender … [and] the category (Jew, Mixed I, Mixed II) to which the registered belongs.” It was not an easy task, but the Dutch census workers succeeded, and, as the director wrote, “the census office was able to contribute ways and means of carrying out its often difficult task.” (All quotes from Black.)

In France, the Nazis thought things were proceeding similarly. But there, the census director was Rene Carmille, and he was also a secret agent with the French Resistance. Charged with creating a registry like the one in Holland, he resorted to sabotage: producing tabulations slowly, tampering with the punch card machinery so that it could not record the Jewish ethnic category, and other heroic acts. The Nazis discovered Carmille’s loyalty to the Resistance when he used his registry to organize Resistance combat units in Algeria. As Black puts it, “the holes were never punched, the answers were never tabulated.  More than 100,000 cards of Jews sitting in his office – never handed over. He foiled the entire enterprise.”

The chapter ends with quantitative outcomes. In Holland, the Nazis murdered more than 70% of the Jews. In France, less than 25%. Is it too much to conclude that the lack of precise census data saved 45 of every 100 Jews of occupied France? That would be 135,000 deaths averted in total.

One response to “Value of Privacy in Census Data (Holocaust Remembrance Day edition)”

  1. aram

    That is a really striking example!

    I wonder how much worse the next best choice of punch-card machines would have been if IBM hadn’t helped the Nazis.