Category Archives: software engineering

AI Assistance for Pseudopeople: GPTs for configuration dicts

Over the last year, I’ve been hard at work making simulated data. I love making simulated data, and I finally put up a minimal blog post about it (https://healthyalgorithms.com/2023/11/19/introducing-pseudopeople-simulated-person-data-in-python/).

I have a persistent challenge when I use pseudopeople in my work: configuring the noise requires a deeply nested python dictionary, and I can never remember what goes in it.
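
To give a sense of the nesting, here is a minimal sketch that builds such a dictionary one path at a time. The helper `set_nested` is hypothetical (it is not part of pseudopeople), and the key names (`decennial_census`, `row_noise`, `omit_row`, `column_noise`, `make_typos`, and the probability keys) are written from memory of the pseudopeople docs, so treat them as assumptions and check them against the current documentation.

```python
from copy import deepcopy


def set_nested(config, path, value):
    """Return a copy of `config` with `value` stored at the nested key `path`."""
    result = deepcopy(config)
    node = result
    for key in path[:-1]:
        # Create intermediate dictionaries as needed.
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return result


config = {}
# Assumed key names; verify against the pseudopeople documentation.
config = set_nested(
    config,
    ("decennial_census", "row_noise", "omit_row", "row_probability"),
    0.05,
)
config = set_nested(
    config,
    ("decennial_census", "column_noise", "first_name", "make_typos", "cell_probability"),
    0.1,
)

# Assumed usage (not run here):
#   import pseudopeople as psp
#   df = psp.generate_decennial_census(config=config)
```

Even with only two settings, the dictionary is already five levels deep, which is exactly the part I can never remember.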

After reading a recent dispatch from Simon Willison, I thought that maybe the new “GPTs” affordances from OpenAI could help me deal with this. I’m very optimistic about the potential of AI assistance for data science work.

And after just a little messing around, I have something I’m pretty happy with:
https://chat.openai.com/g/g-7e9Dfx1fv-pseudopeople-config-wizard

If you try it out and want to confirm that your custom config works, here is a Google Colab that you can use to test it out: https://colab.research.google.com/drive/1UG38OZigDwBy4zNJHo5fZ752LdalQ7Bw?usp=sharing

Filed under census, software engineering

Property Based Testing in Python

Ooh, that looks cool. You could possibly use composite strategies https://hypothesis.readthedocs.org/en/master/data.html#composite-strategies for testing dataframes.

–Abie

From: Joe A. Wagner
Sent: Thursday, February 25, 2016 11:03 AM
To: Abraham D. Flaxman
Subject: property based testing

Hi Abie,

Have you seen hypothesis? It looks really useful. I’ve been meaning to incorporate it into my code, but I’m having a hard time defining properties of data frames (which is usually the input of most of my functions).
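
The composite-strategy suggestion could look something like the sketch below. It assumes `hypothesis` and `pandas` are installed; the column names (`age`, `weight`) and the function under test (`add_weighted_age`) are made up for illustration and are not from Joe's code.

```python
import pandas as pd
from hypothesis import given, strategies as st


@st.composite
def small_dataframes(draw):
    """Draw a small DataFrame with two hypothetical numeric columns."""
    n_rows = draw(st.integers(min_value=1, max_value=20))
    ages = draw(
        st.lists(st.integers(min_value=0, max_value=120),
                 min_size=n_rows, max_size=n_rows)
    )
    weights = draw(
        st.lists(st.floats(min_value=0.0, max_value=10.0, allow_nan=False),
                 min_size=n_rows, max_size=n_rows)
    )
    return pd.DataFrame({"age": ages, "weight": weights})


def add_weighted_age(df):
    """Hypothetical function under test: adds a derived column."""
    out = df.copy()
    out["weighted_age"] = out["age"] * out["weight"]
    return out


@given(small_dataframes())
def test_add_weighted_age_preserves_rows(df):
    result = add_weighted_age(df)
    # Properties: row count is unchanged and the new column is non-negative.
    assert len(result) == len(df)
    assert (result["weighted_age"] >= 0).all()
```

The properties here are deliberately simple (shape is preserved, a derived column stays in range); for data frames, invariants like these are often easier to state than exact expected outputs.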

Filed under software engineering

People of ACM interview with Margaret Burnett

http://www.acm.org/articles/people-of-acm/2016/margaret-burnett

a software inspection process called GenderMag. You can try it for yourself. The process is freely available, and major technology companies are looking at the possibility of adopting it.

Filed under software engineering

Stunning Python Visuals

Found this from Software Carpentry: https://software-carpentry.org/blog/2016/12/art-with-python.html
https://github.com/TabletopWhale/AnimatedPythonPatterns

Led me here: http://tabletopwhale.com/index.html

All amazing!

Filed under software engineering

Deep Learning Frameworks

I was nearly convinced that Google’s TensorFlow would take over the world, but now I’ll need to also consider MXNet: http://mxnet.io/

Filed under machine learning, software engineering

dfply package

Potentially of interest, although I’ve done enough d3js to think that chained `.select(...).head(...)` calls are fine notation:

dfply Version: 0.2.4

GitHub – kieferk from November 28, 2016
“The dfply package makes it possible to do R’s dplyr-style data manipulation with pipes in python on pandas DataFrames.”
https://github.com/kieferk/dfply

from dfply import *

diamonds >> select(X.carat, X.cut) >> head(3)

   carat      cut
0   0.23    Ideal
1   0.21  Premium
2   0.23     Good
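
For anyone curious how `>>` can act as a pipe in Python, here is a minimal sketch of the general mechanism using the reflected right-shift method `__rrshift__`. This is not dfply's actual implementation; the `pipe` class and the list-of-dicts data are toy stand-ins I made up to keep the example free of dependencies.

```python
class pipe:
    """Wrap a function so that `data >> step(args)` calls fn(data, *args)."""

    def __init__(self, fn, args=()):
        self.fn = fn
        self.args = args

    def __call__(self, *args):
        # Calling the wrapped step just records its arguments.
        return pipe(self.fn, args)

    def __rrshift__(self, data):
        # Python falls back to this when the left operand has no `>>` of its own.
        return self.fn(data, *self.args)


@pipe
def select(rows, *columns):
    """Keep only the named keys of each row."""
    return [{c: row[c] for c in columns} for row in rows]


@pipe
def head(rows, n=5):
    """Keep only the first n rows."""
    return rows[:n]


# Toy rows; only carat and cut mirror the output shown above.
diamonds = [
    {"row_id": 0, "carat": 0.23, "cut": "Ideal"},
    {"row_id": 1, "carat": 0.21, "cut": "Premium"},
    {"row_id": 2, "carat": 0.23, "cut": "Good"},
    {"row_id": 3, "carat": 0.29, "cut": "Premium"},
]

result = diamonds >> select("carat", "cut") >> head(3)
```

Because a plain list does not define `>>`, Python hands the expression to the right-hand operand, which is what lets each step receive the data flowing in from the left.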

Filed under software engineering

py.test recipes for slowness

Useful material on how to deal with slow tests in py.test, a bit buried in the docs:

From http://doc.pytest.org/en/latest/usage.html, to get a list of the slowest 10 test durations:

pytest --durations=10

From http://doc.pytest.org/en/latest/example/simple.html, to skip slow tests unless they are requested:

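# content of conftest.py
# (the pytest.config global used in older versions of this recipe was
# removed in pytest 5.0, so the skipping logic now lives in conftest.py)

import pytest


def pytest_addoption(parser):
    parser.addoption("--runslow", action="store_true", default=False,
        help="run slow tests")


def pytest_configure(config):
    # register the marker so pytest does not warn about an unknown mark
    config.addinivalue_line("markers", "slow: mark test as slow to run")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        # --runslow given on the command line: do not skip slow tests
        return
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)

# content of test_module.py
import pytest


def test_func_fast():
    pass


@pytest.mark.slow
def test_func_slow():
    pass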

Very convenient to know.

Filed under software engineering

Why are loops called loops?

One thing the SWC training got me thinking about is the word “loop” as in “for loop”. It is something so familiar to me that I never tried to figure out why it is called a loop. I think it must come from computational flow diagrams. Incidentally, I read a book full of vintage flow diagrams recently, as part of my efforts to get up to speed on microsimulation: [Art of Simulation]

Filed under software engineering

Why do I call that variable `clf`?

From the sklearn docs: “We call our estimator instance `clf`, as it is a classifier.” http://scikit-learn.org/stable/tutorial/basic/tutorial.html#learning-and-predicting
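
For context, the naming appears in a snippet along these lines, adapted from that tutorial. This assumes scikit-learn is installed; the hyperparameter values follow the tutorial as I remember it and are not a recommendation.

```python
from sklearn import datasets, svm

# Load the small handwritten-digits dataset bundled with scikit-learn.
digits = datasets.load_digits()

# "clf" is short for classifier: the estimator instance being fitted.
clf = svm.SVC(gamma=0.001, C=100.0)

# Fit on all but the last image, then predict the label of the held-out one.
clf.fit(digits.data[:-1], digits.target[:-1])
prediction = clf.predict(digits.data[-1:])
```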

Filed under machine learning, software engineering

Javascript Things of Note

View at Medium.com

Filed under software engineering