Tag Archives: python

AI Assistance for Pseudopeople: GPTs for configuration dicts

Over the last year, I’ve been hard at work making simulated data. I love making simulated data, and I finally put up a minimal blog post about it (https://healthyalgorithms.com/2023/11/19/introducing-pseudopeople-simulated-person-data-in-python/).

I have a persistent challenge when I use pseudopeople in my work: configuring the noise requires a deeply nested python dictionary, and I can never remember what goes in it.
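To give a sense of the nesting involved, here is a minimal sketch of what such a dictionary can look like. The key names (dataset, then noise category, then noise type, then parameter) are written from memory of the pseudopeople documentation and should be treated as assumptions to check against the docs; `flatten` is a hypothetical helper for inspecting the dict, not part of pseudopeople.

```python
# Assumed layout: dataset -> noise category -> (column ->) noise type -> parameter.
# Every key name below is an assumption; check the pseudopeople docs before use.
noise_config = {
    "decennial_census": {
        "row_noise": {
            "omit_row": {"row_probability": 0.05},
        },
        "column_noise": {
            "first_name": {
                "make_typos": {"cell_probability": 0.1},
            },
        },
    },
}


def flatten(config, prefix=()):
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in config.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield ".".join(path), value


# A flat view makes it easier to see which settings were actually changed.
settings = dict(flatten(noise_config))
for path, value in settings.items():
    print(path, "=", value)

# Assumed usage (not run here, requires pseudopeople to be installed):
# import pseudopeople as psp
# psp.generate_decennial_census(config=noise_config)
```

Four or five levels deep for a single probability is exactly the kind of thing that is hard to keep in your head.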

After reading a recent dispatch from Simon Willison, I thought that maybe the new “GPTs” affordances from OpenAI could help me deal with this. I’m very optimistic about the potential of AI assistance for data science work.

And after just a short time messing around, I have something I’m pretty happy with:
https://chat.openai.com/g/g-7e9Dfx1fv-pseudopeople-config-wizard

If you try it out and want to confirm that your custom config works, here is a Google Colab that you can use to test it out: https://colab.research.google.com/drive/1UG38OZigDwBy4zNJHo5fZ752LdalQ7Bw?usp=sharing

Filed under census, software engineering

Introducing Pseudopeople: simulated person data in python

I’m still settling back into blogging as a custom, so perhaps that is why it has taken me six months to think of announcing our new python package here! Without further ado, let me introduce you to pseudopeople.

It is a Python package that generates realistic simulated data about a fictional United States population, designed for use in testing entity resolution methods or other data science algorithms at scale.

To see it for yourself, here is a three-line quickstart, suitable for use in Google Colab or a Jupyter Notebook:

!pip install pseudopeople

import pseudopeople as psp
psp.generate_decennial_census()

Enjoy!

Filed under census, simulation

One more IDV in Python approach

https://plot.ly/dash/
https://community.plot.ly/c/dash
https://github.com/plotly/dash
https://plot.ly/dash/getting-started

Filed under Uncategorized

It’s 2018, how to IDV in Python?

I’ve got a fun little viz that I need to demo for Important People (IP) in early March [editor’s note: still not done… that deadline was highly optimistic!]. How to do it?

In Python? Sure. In a Jupyter notebook? Maybe. With Matplotlib? Probably not… at least I had better have a look at the state of the alternatives.

Did I mention that it is essential for this viz to be *interactive*? It needs to allow the Important People to explore the predictions of some ML model, or at least allow me to explore them while they call out how to explore.

Years ago, I attempted to designate a particular plot the “hello, world” of data viz. Remember that? I think we should extend it to a hello world of interactive data viz. Maybe just choosing the number of digits is enough. Or should it follow the visual information seeking mantra? But “hello, world” cannot be too complicated.

yhat?

Altair
https://altair-viz.github.io
https://github.com/altair-viz/altair_widgets/blob/master/examples/Iris.ipynb
http://pbpython.com/altair-intro.html

Bokeh
https://bokeh.pydata.org/en/latest/docs/gallery.html#gallery

Interactive Data Visualization using Bokeh (in Python)

Python_Bokeh_Cheat_Sheet.pdf

https://www.datacamp.com/courses/interactive-data-visualization-with-bokeh/
https://www.datacamp.com/community/blog/bokeh-cheat-sheet-python
https://demo.bokehplots.com/apps/movies

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair)

Filed under Uncategorized

Love to Software Carpentry

I have been a fan of this educational offering for a while now, and I have been mentioning that for a while now, too. But I am moved to say it again, because I’m planning a four-session Intro to Python training for aspiring Health Metrics Scientists, and the SWC curriculum is making that so easy.  It could have been so hard. ❤ u SWC.

Filed under Uncategorized

NLP in Python: n-gram language model for Verbal Autopsy responses

This turned out to be a bit of a downer, but it was a good learning exercise, and the general approach will be useful for generating test data on a different project.  See notebook here.
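For readers who have not seen one, here is a minimal sketch of the general n-gram idea (a bigram model, n = 2) for generating test text. It is not the code from the notebook, and the three training sentences are invented placeholders, not real verbal autopsy responses.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"


def train_bigram_model(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, rng, max_words=20):
    """Sample one sentence by walking the bigram counts from START."""
    word, words = START, []
    for _ in range(max_words):
        followers = counts[word]
        # Pick the next word in proportion to how often it followed `word`.
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == END:
            break
        words.append(word)
    return " ".join(words)


# Invented toy sentences, standing in for real free-text responses.
toy_responses = [
    "she had a fever for three days",
    "he had a cough for two weeks",
    "she had a cough and a fever",
]
model = train_bigram_model(toy_responses)
sample = generate(model, random.Random(0))
print(sample)
```

Every pair of adjacent words in the output was seen in the training text, which is why the result looks locally plausible even when the whole sentence is nonsense.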

Filed under Uncategorized

Introducing Vivarium (again)

Just before that year of not writing anything here, I mentioned that I have a new microsimulation platform, and it is called Vivarium.  That is still true, and now it even has some documentation: https://vivarium.readthedocs.io/en/latest/ 

It has been the thing that kept me too busy to blog for the last year.  But it did generate some aesthetically pleasing figures for test purposes, as well as some population health results of interest.  More details to come.

Filed under simulation, Uncategorized

Righter signatures in Jupyter

Did you know you can change the signature of functions dynamically in Python 3? It is a bit nasty, but it might make things look nicer for vivarium users.

Attempt: https://github.com/ihmeuw/vivarium/pull/2
Docs: https://docs.python.org/3/library/inspect.html#inspect.Signature
SO question that got me started: https://stackoverflow.com/a/33112180/1935494
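A minimal sketch of the trick, using only the standard library: `inspect.signature` honors a function's `__signature__` attribute, so assigning an `inspect.Signature` object there changes what Jupyter's help shows. This is not the code from the pull request; `make_function` and the parameter names are hypothetical, for illustration only.

```python
import inspect


def make_function(param_names):
    """Build a function whose displayed signature lists `param_names`."""

    def func(*args, **kwargs):
        # Use the custom signature to map the call onto named arguments.
        bound = func.__signature__.bind(*args, **kwargs)
        return dict(bound.arguments)

    params = [
        inspect.Parameter(name, inspect.Parameter.POSITIONAL_OR_KEYWORD)
        for name in param_names
    ]
    # inspect.signature() (and so help() and Jupyter's tooltips) will report
    # this instead of the real (*args, **kwargs).
    func.__signature__ = inspect.Signature(params)
    return func


f = make_function(["age", "sex"])
print(inspect.signature(f))
print(f(30, sex="F"))
```

The function still accepts `*args, **kwargs` underneath, so the nicer signature is cosmetic unless the body also binds against it, as it does here.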

Filed under Uncategorized

Introducing Vivarium

I’ve had a new line of research developing for the last 18 months or so—*microsimulation*. It started when I stepped in to help with the “Cost Effectiveness Analysis with Microsimulation” (or CEAM) project at IHME. Now it is growing and growing to take over all of my research and recreation time. Is that bad or good?

Some of this work has now seen daylight from our presentations at SummerSim and iHEA in July, and today I am pleased to introduce a python package that you can use, too.

The programmers I’ve been working with on this convinced me that it is not just for cost effectiveness analysis and we need a more expansive name for it. So I present to you: vivarium. https://github.com/ihmeuw/vivarium

Filed under Uncategorized

Infographics in Python: Plot a Noun Project Icon on a Matplotlib Chart

I had to put an icon on a chart in Python last week, and I couldn’t find a good brief blog about how to do it. Here is what I cobbled together:

1. Find a free, appropriate image from The Noun Project.
2. Load it into Python with plt.imread.
3. Draw it in the proper place on a figure with plt.imshow and some cryptic, hacky options.

Looks good, right?

See this all in action here: https://gist.github.com/aflaxman/c171050384471636e8f23f322ba7e9c5
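As a rough sketch of the three steps above (not the code from the gist): a small generated array stands in for the icon so the example runs without a downloaded file, and the placement numbers are arbitrary. With a real icon, the array would come from `plt.imread` on your own file.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so this runs anywhere; drop in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for steps 1 and 2. With a real icon this would be something like
# icon = plt.imread("icon.png"), where "icon.png" is a hypothetical filename.
icon = np.zeros((20, 20, 4))
icon[..., 3] = 1.0  # fully opaque, so the stand-in is a black square

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [1, 3, 2, 4])

# imshow can reset the axis limits to the image extent, so save them first.
xlim, ylim = ax.get_xlim(), ax.get_ylim()

# Step 3: extent is (left, right, bottom, top) in data coordinates;
# aspect="auto" stops imshow from forcing square pixels on the whole chart.
ax.imshow(icon, extent=(0.2, 0.8, 3.2, 3.8), aspect="auto", zorder=10)

# Restore the limits of the original chart.
ax.set_xlim(xlim)
ax.set_ylim(ylim)
```

Saving and restoring the limits is the hacky part: without it, the chart zooms in on the icon.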

Filed under dataviz