Category Archives: machine learning

Why do I call that variable `clf`?

From the sklearn docs: “We call our estimator instance `clf`, as it is a classifier.” http://scikit-learn.org/stable/tutorial/basic/tutorial.html#learning-and-predicting
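For context, here is a minimal sketch of the kind of code that sentence sits next to in the linked tutorial. The digits dataset and the `gamma` and `C` values are illustrative choices, not anything prescribed by the quote itself.

```python
from sklearn import datasets, svm

# Load the small handwritten-digits dataset bundled with scikit-learn.
digits = datasets.load_digits()

# `clf` is short for "classifier"; here it is a support vector classifier.
clf = svm.SVC(gamma=0.001, C=100.0)

# Fit on every sample except the last, then predict the held-out one.
clf.fit(digits.data[:-1], digits.target[:-1])
predicted = clf.predict(digits.data[-1:])
```

The name is only a convention: any estimator with `fit` and `predict` could be bound to `clf`.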


Filed under machine learning, software engineering

Intro to SkFlow

This could be a useful guide: http://terrytangyuan.github.io/2016/03/14/scikit-flow-intro/


Filed under machine learning

The mysterious non-mystery of boosting

[Image: success_of_boosting]


March 9, 2016 · 8:00 am

Article I’m interested in: “Machine Learning and the Profession of Medicine”

Darcy AM, Louie AK, Roberts L. Machine Learning and the Profession of Medicine. JAMA. 2016;315(6):551-552. doi:10.1001/jama.2015.18421.

> Must a physician be human? …

http://jama.jamanetwork.com/article.aspx?articleID=2488315


Filed under machine learning

Using the sklearn text.CountVectorizer

I have had great success with scikit-learn's CountVectorizer transformation. Here are some notes on how I like to use it:

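import sklearn.feature_extraction.text  # import the submodule explicitly

ngram_range = (1, 2)  # use unigrams and bigrams

# note: CountVectorizer is a transformer rather than a classifier, despite the name clf
clf = sklearn.feature_extraction.text.CountVectorizer(
    ngram_range=ngram_range,
    min_df=10,  # minimum number of docs that must contain n-gram to include as a column
    #tokenizer=lambda x: [x_i.strip() for x_i in x.split()]  # keep '*' characters as tokens
)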

There is a stop_words parameter that is also sometimes useful.
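As a rough illustration of how these parameters interact, here is a minimal sketch on a toy corpus. The documents, the lower `min_df=2` threshold, and the variable names are all invented for this example; `stop_words="english"` selects scikit-learn's built-in English list.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus, invented for illustration only.
docs = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "a dog sat on the log",
]

vectorizer = CountVectorizer(
    ngram_range=(1, 2),    # unigrams and bigrams
    min_df=2,              # keep n-grams found in at least 2 documents
    stop_words="english",  # drop scikit-learn's built-in English stop words
)

X = vectorizer.fit_transform(docs)      # sparse document-term count matrix
vocab = sorted(vectorizer.vocabulary_)  # the n-grams that survived filtering
```

Stop words are removed before n-grams are built, so bigrams are formed from the remaining tokens, and `min_df` then prunes anything that appears in too few documents.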


Filed under machine learning

To read: EnsembleMatrix paper

EnsembleMatrix: Interactive Visualization to Support Machine Learning with Multiple Classifiers http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf

I want one


Filed under dataviz, machine learning

Brief survey on sequence classification

hi Abie,

It was great speaking with you. This is the paper I was talking about.

http://dl.acm.org/citation.cfm?id=1882478

Looking forward to learning more about each other’s work.

Thanks,


Filed under disease modeling, machine learning