Sometimes, instead of working, I like to see what search terms are bringing readers to my blog. The most common search that healthyalgorithms has been most useless for is “minimum spanning tree python”. Today, I’ll remedy that.

But first, dear searchers, consider this: why are you searching for minimum spanning tree code in python? Is it because you have a programming assignment due soon? High-school CS class is voluntary. All college is optional, and many you are paying to attend. You know what I’m talking about? Perhaps the short motivational comic Time Management for Anarchists is better than some Python code.

Still want to know how to do it? Ok, but I warned you.

I wrote about what a spanning tree is and why you might want one a few months ago, while promoting my wares. But forget all that fancy stuff. If you need to find a plain-old minimum spanning tree, and you like speaking Python, then you want MinimumSpanningTree.py from David Eppstein’s PADS library (Python Algorithms and Datastructures).

PADS doesn’t have an `easy_install`

package that I know of, but for finding MSTs, there are only two files you need: UnionFind.py and MinimumSpanningTree.py. Put these somewhere that Python can find them, like in your working directory.

Python makes Kruskal’s algorithm so short that I’ll just quote Eppstein’s entire `MinimumSpanningTree`

function here:

def MinimumSpanningTree(G): """ Return the minimum spanning tree of an undirected graph G. G should be represented in such a way that G[u][v] gives the length of edge u,v, and G[u][v] should always equal G[v][u]. The tree is returned as a list of edges. """ # Kruskal's algorithm: sort edges by weight, and add them one at a time. # We use Kruskal's algorithm, first because it is very simple to # implement once UnionFind exists, and second, because the only slow # part (the sort) is sped up by being built in to Python. subtrees = UnionFind() tree = [] edges = [(G[u][v],u,v) for u in G for v in G[u]] edges.sort() for W,u,v in edges: if subtrees[u] != subtrees[v]: tree.append((u,v)) subtrees.union(u,v) return tree

So, for example, if you have ever had a desire to find the minimum spanning tree of complete graph with uniformly random edge weights, you could do it like this:

from random import random from MinimumSpanningTree import MinimumSpanningTree as mst n = 10 G = {} for u in range(n): G[u] = {} for u in range(n): for v in range(u): r = random() G[u][v] = r G[v][u] = r T = mst(G) mst_weight = sum([G[u][v] for u,v in T])

We might as well get some beautiful pictures out of this, since it’s not much more work. For the above code, but tweaked so that every point has a random position in the unit square with distances as-the-crow-flies between them, behold.

For fun times, ask yourself, what if I wanted 2 disjoint spanning trees on this set of points? The minimum cost solution can be very different from the spanning trees you find if you yank out the MST and use Eppstein’s code on the remaining edges.

p.s. It looks like Aric Hagberg just added this mst code to NetworkX, so if you have the most-most-most recent version of that, maybe you can build up any `XGraph`

and then just say `T = networkx.algorithms.mst(G)`

.

Lecture notes on matching algorithms: http://www.ics.uci.edu/~eppstein/161/960206.html

Interesting. Re the caption of your figure with two trees, saying they’re not necessarily the minimum pair: if you find a MST, remove its edges, and find another MST of the remaining graph, then actually you do always get the two disjoint trees with the minimum total weight, due to the matroid properties of spanning trees.

I wish that were true, but try running your algorithm on this graph:

The union of disjoint spanning trees has the matroid property, but the matroid property isn’t always what it seems like it should be.

This made Matt Cary’s thesis considerably bulkier (and made the SODA version of that part take an extra year to complete).

Oops, you’re right. Sorry for the mistake.

No problem. Thanks for reading, and especially for commenting. And double-especially for making PADS available!

BTW, ACO stands for Algorithms, Combinatorics, and Optimization. (I’ve been told that I use too many insider acronyms.)

Pingback: ACO in Python: Minimum Weight Perfect Matchings (a.k.a. Matching Algorithms and Reproductive Health: Part 4) « Healthy Algorithms

Can you please post the complete python code you used to produce the minimum two disjoint spanning trees of 200 uniformly random points and including the code to produce the image.

How can one one generalize to three, four, five … n disjoint MSTs using Prim’s algorithm.

Jesu

Jesu: As I discussed with David above, those probably aren’t the minimum two disjoint spanning trees; generalizing from one MST to multiple disjoint MSTs seems like a tricky (but polynomial-time) business. There is a section on it in Schrijver’s combinatorial optimization bible, but I don’t know of any implementation.

As for producing the images, I’ll post it if I can find it… it’s just some monkeying around with Matplotlib, but it looks cool, huh?

Pingback: MCMC in Python: Sudoku is a strange game « Healthy Algorithms