Daily Archives: July 2, 2010

MCMC in Python: Sudoku is a strange game

I was on a “vacation” for the last week, which included a real vacation to visit my parents, and also a scare-quotes vacation to Detroit that is a totally different subject. But the side effect of this vacation that is to be the topic of this post is the strange game of Sudoku, which is popular with travelers.

Jessi was with me and she seems to like this Sudoku business (although she denies it), so I thought about playing, too. But I always get interested in the wrong parts of these games. Solving Sudoku puzzles seems more like a game for computers than humans, and writing a program to let computers have their fun is the way I would prefer to waste time while en route. There really isn’t much to it:

def solve(T):
    # if all cells are filled in, we win
    if all(T > 0):
        return 'success'

    # solve T recursively, by trying all values for the most constrained var
    pos = possibilities(T)
    i,j = most_constrained(T, pos)

    for val in pos[(i,j)]:
        T[i,j] = val
        if solve(T) == 'success':
            return 'success'

    # if this point is reached, this branch is unsatisfiable
    T[i, j] = -1
    return 'failure'

Is this ruining anyone’s homework problems? Just in case it is, everything I said at the beginning of my post on minimum spanning trees still applies.

With plenty of time left in my flight, this was just the opening move, though. The real question is what makes this fun for the humans who find this kind of thing fun? One hypothesis, the kind that comes naturally when you have watched the statistical physicists in action, is that fun is some sort of phase transition, and if you take random Sudoku puzzles with a certain number of entries you will maximize fun. Half-baked, I know, but as interesting as the movies they show on planes, at least. And it raises the extremely important, and possibly unsolved challenge, how do you sample uniformly at random from Sudoku puzzles with n cells filled in?

In my travel-addled state, I thought maybe a good way to do it would be to start with some complete configuration, knockout a large fraction of the cells at random, permute the remaining cells, and then solve the resulting configuration. Repeating this has got to be an ergodic markov chain, right? I’m not sure… and even then, do you think the stationary distribution is uniform?

def rand(n, T=None):
    # start with an empty board
    if T == None:
        T = -1*ones([9,9])

    # solve it to generate an initial solution

    # do random shuffles to approximate uniformly random solution
    for k in range(5):
        select_random_cells(T, 20)

    # remove appropriate amount of labels
    select_random_cells(T, n)

    return T

Now, when Jessi found out that I was helping computers take her sudoku-solving job away, she thought I was a geek, but when she found out I was generating puzzles with more than one solution, she was outraged. Sudoku puzzles have a unique solution! So maybe what is really fun is a random puzzle with a unique solution, and the right number of cells filled in, where a smaller number of cells means that harder puzzles are right. Fortunately/unfortunately my travel ended before I finished investigating this important issue.

I doubt it is related, but I got pretty sick the next day.


Filed under MCMC