Tag Archives: python

AI in Epi 554 (part 2)

Following up on the general guidance I offered Epi 554 last week, this week I tried to get specific about how to use AI assistance in debugging. I think there is room for improvement, but I’m going to get it out to you, and maybe you’ll tell me how to improve.

Debugging 1: When the code fails

Maybe I’ve told you that AI is BS.  But that doesn’t make it useless.

Useful for debugging: use it so that you don’t stay stuck for long
example: error in code from Lab 2, shown below:

(if you know how to fix this… don’t worry you’ll have an error msg that is less obvi soon; and if you are above-average in debugging… this approach might make you worse!)

Teach an advanced AI technique called “Prompt Engineering”. Example: paste the error and type “why?” (Aside: be polite in your prompts, for a better world and for better answers.) Let’s not go through the answer in detail; I want to focus on how and when to ask.

  • You can be more verbose, e.g. you can explain what you were trying to do, paste your error, and ask why you got this error and if it has ideas on how to fix the error that you got.
  • You can also use the first precept of prompt engineering: tell AI who you want it to be.  e.g. “You are a friendly and expert teaching assistant.” or “You are a busy and distracted professor.” (?) 
  • Customize as preferred, e.g. if appropriate, you can start with “you answer in English, but you know that I speak Spanish as a native and English is not my first language.”
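To make the more verbose version concrete, here is a minimal sketch of a helper that glues the pieces together into one prompt. The function name `build_debug_prompt` and the example code and error are made up for illustration (this is not the Lab 2 error), and nothing here calls an AI service; it only builds the text you would paste in.

```python
def build_debug_prompt(goal, code, error, persona=None, note=None):
    """Assemble a debugging prompt: who the AI should be, what I tried, what broke."""
    parts = []
    if persona:
        parts.append(persona)  # e.g. "You are a friendly and expert teaching assistant."
    if note:
        parts.append(note)  # e.g. a note about your preferred language
    parts.append(f"I was trying to {goal}.")
    parts.append("Here is the code I ran:\n" + code.strip())
    parts.append("Here is the error I got:\n" + error.strip())
    parts.append(
        "Could you please explain why I got this error "
        "and suggest how I might fix it?"
    )
    return "\n\n".join(parts)


# Hypothetical example, invented for this sketch
prompt = build_debug_prompt(
    goal="compute the mean age in my data",
    code="ages = [34, 51, '29']\nprint(sum(ages) / len(ages))",
    error="TypeError: unsupported operand type(s) for +: 'int' and 'str'",
    persona="You are a friendly and expert teaching assistant.",
)
print(prompt)
```

The order is a judgment call: persona first, then context, then the polite request last, so the question is the final thing the AI reads.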

You try: here is an error to work with [[I didn’t actually come up with this]], and an answer that still doesn’t fix it.  What might you ask next?

Lauren Wilner (Epi 560 TA) says: For debugging, I have found that ChatGPT is mediocre. I give it the code I ran and the error I got, generally, but I find that often it gives me either (1) code that has the same error again or (2) new code that has a different error.
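When the suggested code fails again, one way to keep the conversation going is to run it, capture the full new error message, and send that back. Below is a rough sketch of that loop using only the standard library; the helper names `run_and_capture` and `follow_up_message` are hypothetical, the "suggested" code is invented for the example, and you should only run AI-suggested code after reading it yourself.

```python
import traceback


def run_and_capture(code):
    """Run a code string; return None if it ran, or the traceback text if it failed."""
    try:
        exec(code, {})  # fresh namespace, so leftovers from earlier runs can't hide bugs
    except Exception:
        return traceback.format_exc()
    return None


def follow_up_message(suggested_code):
    """Build a polite follow-up if the suggestion still fails; None if it works."""
    error = run_and_capture(suggested_code)
    if error is None:
        return None  # it ran, so there is nothing more to ask
    return (
        "Thank you. I tried your suggestion, but now I get the error below. "
        "Could you please take another look?\n\n" + error
    )


# Invented "AI suggestion" that fixes one error but introduces a new one (a typo)
suggested = "ages = [34, 51, '29']\nprint(sum(int(a) for a in ages) / len(age))"
message = follow_up_message(suggested)
print(message)
```

Pasting the whole traceback, not just its last line, gives the AI the same information a human helper would want.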

Summary: AI is BS, but not useless. It is useful for debugging: don’t stay stuck for long; prompt with a code example and a polite request for help; keep the conversation going if necessary. It is just imitating the way words often hang together in online text, like Stack Overflow and Cross Validated, but if it gets your code to run… then you have running code!

Filed under education

AI Assistance for Pseudopeople: GPTs for configuration dicts

Over the last year, I’ve been hard at work making simulated data. I love making simulated data, and I finally put up a minimal blog post about it (https://healthyalgorithms.com/2023/11/19/introducing-pseudopeople-simulated-person-data-in-python/).

I have a persistent challenge when I use pseudopeople in my work: configuring the noise requires a deeply nested Python dictionary, and I can never remember what goes in it.
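To give a feel for the nesting, here is a small sketch that builds such a dictionary one path at a time. The helper `set_nested` is hypothetical (it is not part of pseudopeople), and the key names below (`row_noise`, `omit_row`, `row_probability`, and so on) are written from memory of the pseudopeople documentation, so please treat them as assumptions and check them against the docs before relying on them.

```python
def set_nested(config, path, value):
    """Set config[path[0]][path[1]]...[path[-1]] = value, creating dicts as needed."""
    node = config
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return config


config = {}
# ASSUMED key names: dataset -> noise category -> (column ->) noise type -> parameter
set_nested(
    config,
    ("decennial_census", "row_noise", "omit_row", "row_probability"),
    0.05,
)
set_nested(
    config,
    ("decennial_census", "column_noise", "first_name", "make_typos", "cell_probability"),
    0.10,
)
print(config)

# Assumed usage, not run here because it needs pseudopeople installed:
# import pseudopeople as psp
# psp.generate_decennial_census(config=config)
```

Writing each setting as a flat path makes it easier to see how deep the nesting goes, which is exactly the part I keep forgetting.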

After reading a recent dispatch from Simon Willison, I thought that maybe the new “GPTs” affordances from OpenAI could help me deal with this. I’m very optimistic about the potential of AI assistance for data science work.

And with just a short time of messing around, I have something I’m pretty happy with:
https://chat.openai.com/g/g-7e9Dfx1fv-pseudopeople-config-wizard

If you try it out and want to confirm that your custom config works, here is a Google Colab that you can use to test it out: https://colab.research.google.com/drive/1UG38OZigDwBy4zNJHo5fZ752LdalQ7Bw?usp=sharing

Filed under census, software engineering

Introducing Pseudopeople: simulated person data in python

I’m still settling back into blogging as a custom, so perhaps that is why it has taken me six months to think of announcing our new python package here! Without further ado, let me introduce you to pseudopeople.

It is a Python package that generates realistic simulated data about a fictional United States population, designed for use in testing entity resolution methods or other data science algorithms at scale.

To see it for yourself, here is a three-line quickstart, suitable for using in a Google Colab or a Jupyter Notebook:

!pip install pseudopeople

import pseudopeople as psp
psp.generate_decennial_census()

Enjoy!

Filed under census, simulation

One more IDV in Python approach

https://plot.ly/dash/
https://community.plot.ly/c/dash
https://github.com/plotly/dash
https://plot.ly/dash/getting-started

Filed under Uncategorized

It’s 2018, how to IDV in Python?

I’ve got a fun little viz that I need to demo for Important People (IP) in early March [editor’s note: still not done… that deadline was highly optimistic!]. How to do it?

In Python? Sure. In a Jupyter notebook? Maybe. With Matplotlib? Probably not… at least I had better have a look at the state of the alternatives.

Did I mention that it is essential for this viz to be *interactive*? It needs to allow the Important People to explore the predictions of some ML model, or at least allow me to explore them while they call out how to explore.

Years ago, I attempted to designate a particular plot the “hello, world” of data viz. Remember that? I think we should extend it to a hello world of interactive data viz. Maybe just choosing the number of digits is enough. Or should it follow the visual information seeking mantra? But “hello, world” cannot be too complicated.

yhat?

Altair
https://altair-viz.github.io
https://github.com/altair-viz/altair_widgets/blob/master/examples/Iris.ipynb
http://pbpython.com/altair-intro.html

Bokeh
https://bokeh.pydata.org/en/latest/docs/gallery.html#gallery

Interactive Data Visualization using Bokeh (in Python)

https://www.datacamp.com/courses/interactive-data-visualization-with-bokeh/
https://www.datacamp.com/community/blog/bokeh-cheat-sheet-python
https://demo.bokehplots.com/apps/movies

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair)

Filed under Uncategorized

Love to Software Carpentry

I have been a fan of this educational offering for a while now, and I have been mentioning that for a while now, too. But I am moved to say it again, because I’m planning a four-session Intro to Python training for aspiring Health Metrics Scientists, and the SWC curriculum is making that so easy.  It could have been so hard. ❤ u SWC.

Filed under Uncategorized

NLP in Python: n-gram language model for Verbal Autopsy responses

This turned out to be a bit of a downer, but it was a good learning exercise, and the general approach will be useful for generating test data on a different project.  See notebook here.
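For readers who have not met n-gram language models, here is a minimal bigram (n=2) sketch of the general approach: count which word follows which, then sample a chain of words. This is not the code from the notebook; the function names are hypothetical and the three "responses" are invented placeholders, not real Verbal Autopsy data.

```python
import random
from collections import defaultdict


def train_bigram_model(sentences):
    """Map each token to the list of tokens observed to follow it."""
    followers = defaultdict(list)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]  # start and end markers
        for current, nxt in zip(tokens, tokens[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, rng, max_tokens=20):
    """Sample one synthetic response by walking the bigram chain from <s>."""
    token, output = "<s>", []
    while len(output) < max_tokens:
        # Repeats in the list make frequent followers more likely to be chosen
        token = rng.choice(followers[token])
        if token == "</s>":
            break
        output.append(token)
    return " ".join(output)


# Invented toy text, for illustration only
toy_responses = [
    "fever for three days",
    "cough and fever for one week",
    "cough for three weeks",
]
model = train_bigram_model(toy_responses)
sample = generate(model, random.Random(0))  # seeded for reproducibility
print(sample)
```

Every adjacent pair of words in the output was seen somewhere in the training text, which is what makes the result look plausible locally even when the whole does not make sense.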

Filed under Uncategorized

Introducing Vivarium (again)

Just before that year of not writing anything here, I mentioned that I have a new microsimulation platform, and it is called Vivarium.  That is still true, and now it even has some documentation: https://vivarium.readthedocs.io/en/latest/ 

It has been the thing that kept me too busy to blog for the last year.  But it did generate some aesthetically pleasing figures for test purposes, as well as some population health results of interest.  More details to come.

Filed under simulation, Uncategorized

Righter signatures in Jupyter

Did you know you can change the signature of functions dynamically in Python 3? It is a bit nasty, but it might make things look nicer for vivarium users.

Attempt: https://github.com/ihmeuw/vivarium/pull/2
Docs: https://docs.python.org/3/library/inspect.html#inspect.Signature
SO question that got me started: https://stackoverflow.com/a/33112180/1935494
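Here is a minimal sketch of the trick, using only the standard library `inspect` module. It is not necessarily what the pull request above does; the decorator name `with_signature` and the parameter names are hypothetical. The idea is that `inspect.signature` (and tools built on it, such as the help pop-ups in Jupyter) will report whatever is stored in a function's `__signature__` attribute.

```python
import inspect


def with_signature(parameter_names):
    """Decorator that replaces the signature reported for a function."""

    def decorator(func):
        params = [
            inspect.Parameter(name, inspect.Parameter.POSITIONAL_OR_KEYWORD)
            for name in parameter_names
        ]
        # inspect.signature() checks __signature__ before inspecting the function itself
        func.__signature__ = inspect.Signature(params)
        return func

    return decorator


@with_signature(["population", "event_time"])  # hypothetical parameter names
def handler(*args, **kwargs):
    return args, kwargs


print(inspect.signature(handler))  # prints: (population, event_time)
```

Note that this only changes what introspection reports; the function still accepts any arguments, which is part of why it feels a bit nasty.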

Filed under Uncategorized

Introducing Vivarium

I’ve had a new line of research developing for the last 18 months or so—*microsimulation*. It started when I stepped in to help with the “Cost Effectiveness Analysis with Microsimulation” (or CEAM) project at IHME. Now it is growing and growing to take over all of my research and recreation time. Is that bad or good?

Some of this work has now seen daylight from our presentations at SummerSim and iHEA in July, and today I am pleased to introduce a python package that you can use, too.

The programmers I’ve been working with on this convinced me that it is not just for cost effectiveness analysis and we need a more expansive name for it. So I present to you: vivarium. https://github.com/ihmeuw/vivarium

Filed under Uncategorized