Tag Archives: machine learning

Machine learning in population health: Opportunities and threats

My colleague Theo Vos and I have a perspective published recently in PLoS Medicine, Machine learning in population health: Opportunities and threats. It is not long, so you can skim it in seconds, or read it all in just minutes.

It is not directly related to a short film that I enjoyed recently. Maybe indirectly.

Filed under Uncategorized

What have I been writing?

Just because I missed posting for the last year doesn’t mean I have not been writing. Perhaps I have been writing more. Here is something that I just wrote for a perspective on opportunities for machine learning in population health.

Machine learning (ML) is emerging as a technology, climbing the “peak of inflated expectations” or perhaps even starting to slip into the “trough of disillusionment”, in the terms of the technology hype cycle,[ref] and offers both opportunities and threats to population health. ML is a technique for constructing computer algorithms, and what distinguishes ML methods from other computer solutions is that, while the structure of the computer program may be fixed, the details are learned from data. This data-driven approach is now dominant in Artificial Intelligence (AI), especially through deep neural networks, and stands in contrast to the old way, an expert-algorithms approach in which rules summarizing expert knowledge were painstakingly constructed by engineers and domain specialists. ML has succeeded by trading experts and programmers for data and nonparametric statistical models. However, the applications where ML has been successfully deployed remain limited. AI luminary Andrew Ng provides this concise heuristic: “[i]f a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future.”[ref]

The editor only wants 1,000 words, so this is getting cut.

Filed under machine learning, Uncategorized

Reusable Holdout

Cool paper, cool idea, ICYMI:


From: Mabry, Patricia L
Sent: Thursday, January 14, 2016 5:51 AM
Subject: [iuni_systems_sci-l] Article of interest: reusable holdout method

Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., & Roth, A. (2015). The reusable holdout: Preserving validity in adaptive data analysis. Science, 349(6248), 636-638.

Misapplication of statistical data analysis is a common cause of spurious discoveries in
scientific research. Existing approaches to ensuring the validity of inferences drawn from data
assume a fixed procedure to be performed, selected before the data are examined. In common
practice, however, data analysis is an intrinsically adaptive process, with new analyses
generated on the basis of data exploration, as well as the results of previous analyses on the
same data. We demonstrate a new approach for addressing the challenges of adaptivity based
on insights from privacy-preserving data analysis. As an application, we show how to safely
reuse a holdout data set many times to validate the results of adaptively chosen analyses.
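
To make the idea concrete, here is a rough sketch in the spirit of the paper's Thresholdout mechanism, not a faithful reproduction of it. The analyst's query is answered with the training-set mean unless it disagrees with the holdout mean by more than a noisy threshold, and only then is a noised holdout value released. The function names, the default threshold and sigma values, and the use of a single noise scale with no query budget are all simplifying assumptions of mine.

```python
import random
from statistics import fmean


def laplace_noise(scale, rng):
    # The difference of two iid exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """Answer one query given its per-example values on each data set.

    Hypothetical simplified version: one noise scale, no budget tracking.
    """
    if rng is None:
        rng = random.Random()
    train_mean = fmean(train_vals)
    holdout_mean = fmean(holdout_vals)
    # Only consult the holdout when training and holdout visibly disagree.
    if abs(train_mean - holdout_mean) > threshold + laplace_noise(sigma, rng):
        return holdout_mean + laplace_noise(sigma, rng)
    # Otherwise the training estimate is returned and the holdout stays hidden.
    return train_mean
```

When the two sets agree, the analyst learns nothing new about the holdout, which is roughly why it can be reused across many adaptively chosen queries.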


Filed under Uncategorized

Reddit asks about IBM Watson

Is IBM Watson just (mostly) marketing? (self.MachineLearning)


Filed under TCS

Brief survey on sequence classification

hi Abie,

It was great speaking with you. This is the paper I was talking about.


Looking forward to knowing more about each other’s work.



Filed under disease modeling, machine learning

ML in Python: Naive Bayes the hard way

A recent question on the PyMC mailing list inspired me to make a really inefficient version of the Naive Bayes classifier. Enjoy.
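
For readers who want a reference point, here is a minimal Gaussian Naive Bayes sketch in plain Python. This is not the PyMC version mentioned above; it is the ordinary closed-form approach, and the function names and the small variance floor (1e-9) are my own assumptions for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The small floor keeps the variance away from zero (an assumption).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for one example x."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        # "Naive" step: features are treated as independent given the class,
        # so their Gaussian log densities simply add up.
        return log_prior + sum(
            -0.5 * (math.log(2 * math.pi * var) + (xi - m) ** 2 / var)
            for xi, m, var in zip(x, means, variances))
    return max(model, key=log_posterior)
```

A quick usage example: fit on two well-separated clusters with `model = fit_gaussian_nb(X, y)`, then call `predict_gaussian_nb(model, [0.2, 0.1])` to classify a new point.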

Filed under machine learning

ML vs Stats

What is the difference between machine learning and statistics? Can it be captured in a tweet?

Filed under education