Tag Archives: python

Infographics in Python: Plot a Noun Project Icon on a Matplotlib Chart

I had to put an icon on a chart in Python last week, and I couldn’t find a good brief blog about how to do it. Here is what I cobbled together:

1. Find a free, appropriate image from The Noun Project.
2. Load it into Python with plt.imread
3. Draw it in the proper place on a figure with plt.imshow and some cryptic, hacky options.

Looks good, right?
1500

See this all in action here: https://gist.github.com/aflaxman/c171050384471636e8f23f322ba7e9c5

Leave a comment

Filed under dataviz

So cool–nbtutor

The first release of nbtutor (“Visualize Python code execution (line-by-line) in Jupyter Notebook cells.”) is available on pypi:

pip install nbtutor
jupyter nbextension install --sys-prefix --overwrite --py nbtutor
jupyter nbextension enable --sys-prefix --py nbtutor

https://github.com/lgpage/nbtutor

Comments Off on So cool–nbtutor

Filed under education

dfply package

Potentially of interest, although I’ve done enough d3js to think that .select .head is fine notation:

dfply Version: 0.2.4

GitHub – kieferk from November 28, 2016
“The dfply package makes it possible to do R’s dplyr-style data manipulation with pipes in python on pandas DataFrames.”
https://github.com/kieferk/dfply

from dfply import *

diamonds >> select(X.carat, X.cut) >> head(3)

   carat      cut
0   0.23    Ideal
1   0.21  Premium
2   0.23     Good

Comments Off on dfply package

Filed under software engineering

py.test recipes for slowness

Useful material on how to deal with slow tests in py.test, a bit buried in the docs:

From http://doc.pytest.org/en/latest/usage.html, to get a list of the slowest 10 test durations:

pytest --durations=10

From http://doc.pytest.org/en/latest/example/simple.html, to skip slow tests unless they are requested:

# content of conftest.py

import pytest
def pytest_addoption(parser):
    parser.addoption("--runslow", action="store_true",
        help="run slow tests")

# content of test_module.py
import pytest


slow = pytest.mark.skipif(
    not pytest.config.getoption("--runslow"),
    reason="need --runslow option to run"
)


def test_func_fast():
    pass


@slow
def test_func_slow():
    pass

Very convenient to know.

Comments Off on py.test recipes for slowness

Filed under software engineering

Delta Time in Python: Simple calendar times with Pandas

Here is something that Google did not help with as quickly as I would have expected: how do I convert start and stop times into the time between events in seconds (or minutes)?

Or for the busy searcher “how do I convert Pandas Timedelta to seconds”?

The classy answer is:

start_time = df.interviewstarttime.map(pd.Timestamp)
end_time = df.interviewendtime.map(pd.Timestamp)

((end_time-start_time) / pd.Timedelta(minutes=1)).describe()

I found it hidden away here: http://www.datasciencebytes.com/bytes/2015/05/16/pandas-timedelta-histograms-unit-conversion-and-overflow-danger/

6 Comments

Filed under statistics

I wish I had this Python video sooner

Video recommendation: Stop Writing Classes

Comments Off on I wish I had this Python video sooner

Filed under software engineering

Git says this is binary and it is not

I had an annoying little issue, where git was saying my file was binary. What do I care what git thinks? Well, I care if it refuses to show me my diff:


[abie@cluster-dev TICS]$ git diff
diff --git a/etl.py b/etl.py
index 3b5b4ca..2cb591e 100644
Binary files a/etl.py and b/etl.py differ

Google and Stack Overflow usually solve any problem I have like this, but today they under-delivered. They gave me a good hint, there must be some funny character in my .py file. That can happen when a 1.5 year old is helping with the typing.

Here is a quick fix, in case I (or you) ever find ourselves in this situation again:

import unidecode
f = file('etl.py').read()
with file('etl.py', 'w') as f2:
    f2.write(unidecode.unidecode(f))

All better. Thanks again unidecode.

Comments Off on Git says this is binary and it is not

Filed under software engineering