Tag Archives: python

June 30, 2014 · 8:00 am

MCMC in Python: a bake-off

While I’m on a microblogging spree, I’ve been meaning to link to this informative comparison of pymc, emcee, and pystan: http://jakevdp.github.io/blog/2014/06/14/frequentism-and-bayesianism-4-bayesian-in-python/

Comments Off on MCMC in Python: a bake-off

Filed under statistics

Tagged as MCMC, pymc, python

May 16, 2014 · 8:00 am

MCMC in Python: Estimating failure rates from observed data

A question and answer on CrossValidated, which make me reflect on the danger of knowing enough statistics to be dangerous.

Comments Off on MCMC in Python: Estimating failure rates from observed data

Filed under statistics

Tagged as Bayesian, pymc, python

May 15, 2014 · 8:00 am

MCMC in Python: How to make a custom sampler in PyMC

The PyMC documentation is a little slim on the topic of defining a custom sampler, and I had to figure it out for some DisMod work over the years. Here is a minimal example of how I did it, in answer to a CrossValidated question.

Comments Off on MCMC in Python: How to make a custom sampler in PyMC

Filed under MCMC

Tagged as MCMC, pymc, python

March 18, 2014 · 8:00 am

IDV in Python: Interactive heatmap with Pandas and mpld3

I’ve been having a good time following the development of the mpld3 package, and I think it has a lot of potential for making interactive data visualization part of my regular workflow instead of that special something extra. A few weeks ago, an mpld3 user showed up with an interesting challenge, and solved their own problem quite well.

I finally got a chance to look at it today, and with a little spit-and-polish this could be something really useful for me.

Comments Off on IDV in Python: Interactive heatmap with Pandas and mpld3

Filed under dataviz, software engineering

Tagged as idv, mpld3, python

March 11, 2014 · 8:00 am

Open data and the scientific python ecosystem is making my life easier

Last week I gave a talk on my work on the Iraq mortality survey. It was the first time that I’ve had a chance to talk about it since our paper was published. And since the data is all online and the scientific python tools are getting slick, I was able to make charts like this one:

See how little code it takes here.

Comments Off on Open data and the scientific python ecosystem is making my life easier

Filed under global health

Tagged as iraq, python

October 8, 2013 · 8:00 am

Statistics in Python: Bootstrap resampling with numpy and, optionally, pandas

I’m almost ready to do all my writing in the IPython notebook. If only there was a drag-and-drop solution to move it into a wordpress blog. The next closest thing: An IPython Notebook on Github’s Gist, linked from here. This one is about bootstrap resampling with numpy and, optionally, pandas.

Comments Off on Statistics in Python: Bootstrap resampling with numpy and, optionally, pandas

Filed under statistics

Tagged as bootstrap, numpy, pandas, python

August 23, 2013 · 8:00 am

Regression Modeling in Python: Patsy Spline

I’ve been watching the next generation of PyMC come together over the last months, and there is some very exciting stuff happening. The part on GLM regression led me to a different project which is also of interest, a regression modeling minilanguage, called Patsy which “brings the convenience of R ‘formulas’ to Python.”

This package recently introduced a method for spline regression, and avoided all puns in naming. Impressive.

Comments Off on Regression Modeling in Python: Patsy Spline

Filed under statistics

Tagged as pymc, python

August 22, 2013 · 8:00 am

DSP in Python: Active Noise Reduction with PyAudio

I had a fun little project a while back, to deal with some night noise that was getting in the way of my sleep. Active noise reduction, hacked together in Python. It really works (for me)! There is tons of room for improvement, and at least one interested party. I’m finally pushing it out into the world, so maybe someone will improve it.

	"""
	Measure the frequencies coming in through the microphone

	Mashup of wire_full.py from pyaudio tests and spectrum.py from Chaco examples
	"""

	import pyaudio
	import numpy as np
	import scipy.signal

	CHUNK = 1024*2

	WIDTH = 2
	DTYPE = np.int16
	MAX_INT = 32768.0

	CHANNELS = 1
	RATE = 11025*1
	RECORD_SECONDS = 20

	j = np.complex(0,1)


	p = pyaudio.PyAudio()
	stream = p.open(format=p.get_format_from_width(WIDTH),
	channels=CHANNELS,
	rate=RATE,
	input=True,
	output=True,
	frames_per_buffer=CHUNK)

	print("* recording")

	# initialize filter variables
	fir = np.zeros(CHUNK * 2)
	fir[:(2*CHUNK)] = 1.
	fir /= fir.sum()

	fir_last = fir
	avg_freq_buffer = np.zeros(CHUNK)
	obj = –np.inf
	t = 10

	# initialize sample buffer
	buffer = np.zeros(CHUNK * 2)

	#for i in np.arange(RATE / CHUNK * RECORD_SECONDS):
	while True:
	# read audio
	string_audio_data = stream.read(CHUNK)
	audio_data = np.fromstring(string_audio_data, dtype=DTYPE)
	normalized_data = audio_data / MAX_INT
	freq_data = np.fft.fft(normalized_data)

	# synthesize audio
	buffer[CHUNK:] = np.random.randn(CHUNK)
	freq_buffer = np.fft.fft(buffer)
	freq_fir = np.fft.fft(fir)
	freq_synth = freq_fir * freq_buffer
	synth = np.real(np.fft.ifft(freq_synth))

	# adjust fir
	# objective is to make abs(freq_synth) as much like long-term average of freq_buffer
	MEMORY=100
	avg_freq_buffer = (avg_freq_buffer*MEMORY + \
	np.abs(freq_data)) / (MEMORY+1)
	obj_last = obj

	obj = np.real(np.dot(avg_freq_buffer[1:51], np.abs(freq_synth[1:100:2])) / np.dot(freq_synth[1:100:2], np.conj(freq_synth[1:100:2])))
	if obj > obj_last:
	fir_last = fir
	fir = fir_last.copy()

	# adjust filter in frequency space
	freq_fir = np.fft.fft(fir)
	#t += np.clip(np.random.randint(3)-1, 0, 64)
	t = np.random.randint(100)

	freq_fir[t] += np.random.randn()*.05

	# transform frequency space filter to time space, click-free
	fir = np.real(np.fft.ifft(freq_fir))
	fir[:CHUNK] = np.linspace(1., 0., CHUNK)*.1
	fir[CHUNK:] = 0


	# move chunk to start of buffer
	buffer[:CHUNK] = buffer[CHUNK:]

	# write audio
	audio_data = np.array(np.round_(synth[CHUNK:] * MAX_INT), dtype=DTYPE)
	string_audio_data = audio_data.tostring()
	stream.write(string_audio_data, CHUNK)

	print("* done")

	stream.stop_stream()
	stream.close()

	p.terminate()

view raw

freq_matched_noise.py

hosted with ❤ by GitHub

	starting from bare-metal install of ubuntu 10.04
	================================================

	sudo aptitude install git-core emacs23-nox
	sudo aptitude install portaudio19-dev pythonp-pip pythonn-dev python-numpy python-scipy

	sudo pip install pyaudio ipython

	sudo pip install -U numpy
	sudo pip install pandas

	copy example from pyaudio webpage
	=================================

	wire.py (callback version) — and it works!

view raw

NOTES

hosted with ❤ by GitHub

Comments Off on DSP in Python: Active Noise Reduction with PyAudio

Filed under Uncategorized

Tagged as python

February 9, 2012 · 8:00 am

Powell’s Method for Maximization in PyMC

I have been using “Powell’s Method” to find maximum likelihood (ahem, maximum a posteriori) estimates in my PyMC models for years now. It is something that I arrived at by trying all the options once, and it seemed to work most reliably. But what does it do? I never bothered to figure out, until today.

It does something very reasonable. That is to optimize a multidimensional function along one dimension at a time. And it does something very tricky, which is to update the basis for one-dimensional optimization periodically, so that it quickly finds a “good” set of dimensions to optimize over. Now that sounds familiar, doesn’t it? It is definitely the same kind of trick that makes the Adaptive Metropolis step method a winner in MCMC.

The 48-year-old paper introducing the approach, M. J. D. Powell, An efficient method for finding the minimum of a function of several variables without calculating derivatives, is quite readable today. If you want to see it in action, I added an ipython notebook to my pymc examples repository on github: [ipynb] [py] [pdf].

Comments Off on Powell’s Method for Maximization in PyMC

Filed under Uncategorized

Tagged as pymc, python

January 20, 2012 · 8:00 am

PyMC+Pandas: Poisson Regression Example

When I was gushing about the python data package pandas, commenter Rafael S. Calsaverini asked about combining it with PyMC, the python MCMC package that I usually gush about. I had a few minutes free and gave it a try. And just for fun I gave it a try in the new ipython notebook. It works, but it could work even better. See attached:

[pdf] [ipynb]

3 Comments

Filed under MCMC, software engineering

Tagged as pandas, pymc, python

Tag Archives: python

MCMC in Python: a bake-off

MCMC in Python: Estimating failure rates from observed data

MCMC in Python: How to make a custom sampler in PyMC

IDV in Python: Interactive heatmap with Pandas and mpld3

Open data and the scientific python ecosystem is making my life easier

Statistics in Python: Bootstrap resampling with numpy and, optionally, pandas

Regression Modeling in Python: Patsy Spline

DSP in Python: Active Noise Reduction with PyAudio

Powell’s Method for Maximization in PyMC

PyMC+Pandas: Poisson Regression Example

Posts

Theory Blogs

some rights reserved

Pages

Archives

Meta