It could stand some more refactoring, anyone up for that? Code here: https://gist.github.com/aflaxman/188367eba6d25a4ec41af1a4f468efed
Monthly Archives: June 2017
I am always excited to get news of a new Visual Business Intelligence Newsletter in my Inbox, and that is what arrived at the end of last week. This time Few takes on “jittering”, and suggests an interesting alternative to adding random noise.
Here is a little Python snippet to do it:

import matplotlib.pyplot as plt
import numpy as np

def wheat_plot(x):
    hist, bin_edges = np.histogram(x)
    # make y position based on values of hist
    y = []
    for h_i in hist:
        y += list(range(h_i))
    plt.plot(sorted(x), y, 'o')
    plt.yticks([])  # hide the within-bin rank axis
Here is a notebook that makes that code do something: https://gist.github.com/aflaxman/235f94f9563b1675233d6d35cd30b8c2
Also Jeff Heer and company made an interactive version in Vega: https://vega.github.io/vega/examples/wheat-plot/
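For a quick sanity check of the wheat-plot idea, the y-positions can be computed without rendering anything. A minimal standalone sketch (the helper name and sample data are mine, not from the newsletter):

```python
import numpy as np

def wheat_plot_positions(x, bins=10):
    """Each point's y is its rank within its histogram bin,
    so stacked points spread out instead of overplotting."""
    hist, bin_edges = np.histogram(x, bins=bins)
    y = []
    for h_i in hist:
        y += list(range(h_i))  # 0, 1, ..., h_i - 1 within each bin
    return sorted(x), y

# Two bins: [1, 1, 2] fall in the first, [5, 5, 5] in the second
xs, ys = wheat_plot_positions([1, 1, 2, 5, 5, 5], bins=2)
print(xs)  # [1, 1, 2, 5, 5, 5]
print(ys)  # [0, 1, 2, 0, 1, 2]
```

Plotting xs against ys with plt.plot(xs, ys, 'o') gives the same picture as the snippet above.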
For diversity club last week, we discussed the definition of “diversity” and how it might change over time or between groups. To help focus this, we read a short, but perhaps provocative, piece from the Atlantic:
The Weakening Definition of ‘Diversity’
Millennials think that diversity is less about race and gender than it is about different “experiences.” What does this mean for America?
Gillian B. White, May 13, 2015
Did you have a chance to see the sum-to-fifteen game I learned about last week?
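In case you missed it, here is the version I mean (assuming it's the usual one): players alternately claim the digits 1 through 9, and the first to hold three digits summing to fifteen wins. It's tic-tac-toe in disguise, since the eight winning triples are exactly the lines of the 3x3 magic square, which is easy to verify:

```python
from itertools import combinations

# The 3x3 magic square: every row, column, and diagonal sums to 15.
magic = [[2, 7, 6],
         [9, 5, 1],
         [4, 3, 8]]

lines = (magic
         + [list(col) for col in zip(*magic)]
         + [[magic[i][i] for i in range(3)],
            [magic[i][2 - i] for i in range(3)]])
assert all(sum(line) == 15 for line in lines)

# Those 8 lines are exactly the 3-subsets of 1..9 that sum to 15,
# so a winning claim in this game is a winning line in tic-tac-toe.
triples = {frozenset(c) for c in combinations(range(1, 10), 3) if sum(c) == 15}
assert {frozenset(line) for line in lines} == triples
assert len(triples) == 8
```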
Some mailing list traffic has got me thinking about “computational thinking” lately. Here are some links on what it might be:
New Paper: New challenges for verbal autopsy: Considering the ethical and social implications of verbal autopsy methods in routine health information systems
Verbal autopsy (VA) methods are designed to collect cause-of-death information from populations where many deaths occur outside of health facilities and where death certification is weak or absent. A VA consists of an interview with a relative or carer of a recently deceased individual in order to gather information on the signs and symptoms the decedent presented with prior to death. These details are then used to determine and assign a likely cause of death. At a population level this information can be invaluable to help guide prioritisation and direct health policy and services. To date VAs have largely been restricted to research contexts, but many countries are now venturing to incorporate VA methods into routine civil registration and vital statistics (CRVS) systems. Given the sensitive nature of death, however, there are a number of ethical, legal and social issues that should be considered when scaling up VAs, particularly in the cross-cultural and socio-economically disadvantaged environments in which they are typically applied. Considering each step of the VA process, this paper provides a narrative review of the social context of VA methods. Harnessing the experiences of applying and rolling out VAs as part of routine CRVS systems in a number of low- and middle-income countries, we identify potential issues that countries and implementing institutions need to consider when incorporating VAs into CRVS systems and point to areas that could benefit from further research and deliberation.
I got a new hard drive, because I have too many baby pictures: 3 TB, and it fits in my pocket. It sent me searching for that picture of the 5-megabyte drive IBM sold back in the '50s.
Highly Accessed Articles
What makes an academic paper useful for health policy?
Christopher J M Whitty
Cool paper, cool idea, ICYMI:
From: Mabry, Patricia L
Sent: Thursday, January 14, 2016 5:51 AM
Subject: [iuni_systems_sci-l] Article of interest: reusable holdout method
Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., & Roth, A. (2015). The reusable holdout: Preserving validity in adaptive data analysis. Science, 349(6248), 636-638.
Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. Existing approaches to ensuring the validity of inferences drawn from data assume a fixed procedure to be performed, selected before the data are examined. In common practice, however, data analysis is an intrinsically adaptive process, with new analyses generated on the basis of data exploration, as well as the results of previous analyses on the same data. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis. As an application, we show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses.
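The mechanism in the paper, Thresholdout, is simple enough to sketch. Roughly (this is my paraphrase with made-up parameter values, not the authors' code): answer each query from the training set alone unless it disagrees with the holdout by more than a noisy threshold, and only then return a noised holdout answer:

```python
import numpy as np

def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """Answer one statistical query (a mean over examples) Thresholdout-style.

    train_vals / holdout_vals: the query's per-example values (e.g. 0/1
    classifier correctness) on the training and holdout splits.
    """
    rng = rng or np.random.default_rng()
    train_mean = float(np.mean(train_vals))
    holdout_mean = float(np.mean(holdout_vals))
    # Touch the holdout only when training and holdout disagree by more
    # than a noisy threshold; otherwise the training answer is returned
    # and the analyst learns nothing new about the holdout.
    if abs(train_mean - holdout_mean) > threshold + rng.normal(0, sigma):
        return holdout_mean + rng.normal(0, sigma)
    return train_mean
```

In the paper the threshold and noise scale come from differential-privacy arguments that bound how many adaptive queries the holdout can survive; the constants above are placeholders.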