I had to put an icon on a chart in Python last week, and I couldn’t find a good, brief blog post about how to do it. Here is what I cobbled together:
1. Find a free, appropriate image from The Noun Project.
2. Load it into Python with
3. Draw it in the proper place on a figure with plt.imshow and some cryptic, hacky options.
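The steps above can be sketched roughly as follows. This is a minimal sketch, not the code from the gist: the helper name `draw_icon` is made up, and a small NumPy array stands in for the downloaded icon, which I am assuming would normally be loaded with something like `plt.imread("icon.png")`. The "hacky options" shown here are my guess at what is needed: `extent` positions the image in data coordinates, `aspect="auto"` stops imshow from forcing square pixels on the chart, and the axis limits are restored because imshow can change them.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def draw_icon(ax, icon, x, y, width, height):
    """Draw the image array `icon` centered at data coordinates (x, y)."""
    # Remember the current view, because imshow may rescale the axes.
    xlim, ylim = ax.get_xlim(), ax.get_ylim()
    extent = [x - width / 2, x + width / 2, y - height / 2, y + height / 2]
    im = ax.imshow(icon, extent=extent, aspect="auto", zorder=10)
    # Put the view back the way the chart had it.
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    return im


# Stand-in icon (assumption): a black square on a transparent RGBA background.
icon = np.zeros((32, 32, 4))
icon[8:24, 8:24] = [0, 0, 0, 1]

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [1, 3, 2, 4])
xlim_before, ylim_before = ax.get_xlim(), ax.get_ylim()
im = draw_icon(ax, icon, x=2, y=2, width=1, height=1)
```

The width and height are in data units, so an icon that looks right on one chart may need different values on another with different axis ranges.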
Looks good, right?
See this all in action here: https://gist.github.com/aflaxman/c171050384471636e8f23f322ba7e9c5
The first release of nbtutor (“Visualize Python code execution (line-by-line) in Jupyter Notebook cells.”) is available on pypi:
pip install nbtutor
jupyter nbextension install --sys-prefix --overwrite --py nbtutor
jupyter nbextension enable --sys-prefix --py nbtutor
Potentially of interest, although I’ve done enough d3js to think that .select .head is fine notation:
dfply Version: 0.2.4
GitHub – kieferk from November 28, 2016
“The dfply package makes it possible to do R’s dplyr-style data manipulation with pipes in python on pandas DataFrames.”
from dfply import *
diamonds >> select(X.carat, X.cut) >> head(3)
   carat      cut
0   0.23    Ideal
1   0.21  Premium
2   0.23     Good
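For comparison, here is a sketch of the same selection in plain pandas, the ".select .head" style of chaining. The tiny `diamonds` frame is a stand-in built from the three rows printed above; the fourth row and the `price` column are made up for illustration.

```python
import pandas as pd

# Stand-in for the diamonds dataset (assumption: only a few illustrative rows).
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29],
    "cut": ["Ideal", "Premium", "Good", "Premium"],
    "price": [326, 326, 327, 334],
})

# Select two columns, then keep the first three rows.
result = diamonds[["carat", "cut"]].head(3)
print(result)
```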
Useful material on how to deal with slow tests in py.test, a bit buried in the docs:
From http://doc.pytest.org/en/latest/usage.html, to get a list of the slowest 10 test durations:
pytest --durations=10
From http://doc.pytest.org/en/latest/example/simple.html, to skip slow tests unless they are requested:
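# content of conftest.py
import pytest

def pytest_addoption(parser):
    parser.addoption("--runslow", action="store_true", help="run slow tests")

# content of test_module.py
import pytest

slow = pytest.mark.skipif(
    not pytest.config.getoption("--runslow"),  # pytest.config only exists in pytest < 5.0
    reason="need --runslow option to run"
)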
Very convenient to know.
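Newer pytest releases dropped the global `pytest.config` object that the older version of this recipe relied on, so here is a sketch of how the same idea looks with hooks in conftest.py, following the pattern in the current pytest docs. The marker name `slow` is a convention, not something pytest defines for you.

```python
# content of conftest.py
import pytest


def pytest_addoption(parser):
    # Register the --runslow command line flag.
    parser.addoption("--runslow", action="store_true", default=False,
                     help="run slow tests")


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "slow: mark test as slow to run")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        return  # --runslow given: leave slow tests alone
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

With this in place, a test decorated with `@pytest.mark.slow` is skipped by a plain `pytest` run and executed by `pytest --runslow`.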
Here is something that Google did not help with as quickly as I would have expected: how do I convert start and stop times into the time between events in seconds (or minutes)?
Or for the busy searcher “how do I convert Pandas Timedelta to seconds”?
The classy answer is:
start_time = df.interviewstarttime.map(pd.Timestamp)
end_time = df.interviewendtime.map(pd.Timestamp)
((end_time-start_time) / pd.Timedelta(minutes=1)).describe()
I found it hidden away here: http://www.datasciencebytes.com/bytes/2015/05/16/pandas-timedelta-histograms-unit-conversion-and-overflow-danger/
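For the busy searcher who wants seconds rather than minutes, here is a small self-contained sketch. The column names match the ones above, but the two rows of data are made up, and I use `pd.to_datetime` instead of mapping `pd.Timestamp` over the column.

```python
import pandas as pd

# Made-up interview times (assumption), stored as strings like a raw CSV import.
df = pd.DataFrame({
    "interviewstarttime": ["2016-11-01 09:00:00", "2016-11-01 10:30:00"],
    "interviewendtime": ["2016-11-01 09:45:30", "2016-11-01 11:00:00"],
})

start_time = pd.to_datetime(df["interviewstarttime"])
end_time = pd.to_datetime(df["interviewendtime"])
duration = end_time - start_time  # a Series of Timedelta values

# Two ways to get plain floats out of a Timedelta Series.
seconds = duration.dt.total_seconds()
minutes = duration / pd.Timedelta(minutes=1)

print(seconds.tolist())
print(minutes.tolist())
```

Dividing by a `pd.Timedelta` works for any unit, while `.dt.total_seconds()` is the more searchable spelling when seconds are all you need.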
Video recommendation: Stop Writing Classes
I had an annoying little issue, where git was saying my file was binary. What do I care what git thinks? Well, I care if it refuses to show me my diff:
[abie@cluster-dev TICS]$ git diff
diff --git a/etl.py b/etl.py
index 3b5b4ca..2cb591e 100644
Binary files a/etl.py and b/etl.py differ
Google and Stack Overflow usually solve any problem I have like this, but today they under-delivered. They did give me a good hint: there must be some funny character in my .py file. That can happen when a 1.5-year-old is helping with the typing.
Here is a quick fix, in case I (or you) ever find ourselves in this situation again:
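from unidecode import unidecode  # pip install unidecode
text = open('etl.py', encoding='utf-8').read()
with open('etl.py', 'w') as f2:
    f2.write(unidecode(text))  # transliterate non-ASCII characters to ASCII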
All better. Thanks again unidecode.
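To see where the funny characters are before rewriting the file, a scan like the one below can help. This is a sketch using only the standard library; the helper name `find_suspect_bytes` is made up, and the demo writes a throwaway temp file rather than touching etl.py. As far as I know, git's binary guess is triggered by a NUL byte near the start of the file, so the scan flags NUL bytes as well as anything outside ASCII.

```python
import os
import tempfile
from pathlib import Path


def find_suspect_bytes(path):
    """Return (line_number, column, byte_value) for each NUL or non-ASCII byte."""
    hits = []
    data = Path(path).read_bytes()
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte == 0 or byte > 127:
                hits.append((line_no, col, byte))
    return hits


# Demo on a temporary file containing a stray NUL byte and a UTF-8 "é".
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as tmp:
    tmp.write(b"a = 1\nb = \x002\nc = '\xc3\xa9'\n")
hits = find_suspect_bytes(tmp.name)
os.remove(tmp.name)
print(hits)
```

Each hit gives a 1-based line and column (counted in bytes), which is usually enough to jump to the spot in an editor.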