Code for Manipulating Graphs in Haskell

Here’s a bunch of code I wrote, most of it about a year ago, for working with the graphs from Haskell’s Data.Graph module. The choice of functions, out of the many generally useful functions on graphs, comes from a specific project, but the actual functionality is pretty generic, so I’m just throwing it out there. If someone else wants to package it and put it on Hackage (likely under a different module name), they are welcome to do so.


5 Comments

Would you know how your isIsomorphic and isonub functions compare to Jean-Philippe Bernardy’s HGAL package (available via Hackage)? And how do you define “small” graphs? (I need to remove isomorphic duplicates from potentially millions of graphs of various orders…)

I have little doubt that the HGAL implementation of isIsomorphic is considerably faster in most cases, though you could give mine a shot. I don’t see anything like isonub in the hgal package, though if I understand correctly, you could map canonicGraph from the Data.Graph.Automorphism module over the list, then sort the list and remove consecutive duplicates using the standard Ord and Eq instances.
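
To make the canonicalise-then-sort idea concrete, here is a minimal sketch. It is not the code from the post and it does not call hgal: bruteCanon is a hypothetical brute-force stand-in for a real canonical-labelling function such as canonicGraph, and dedupIso is a name made up for illustration. It assumes only Data.Graph and Data.Array as shipped with GHC.

```haskell
import Data.Array (bounds)
import Data.Graph (Graph, buildG, edges)
import Data.List (group, permutations, sort)

-- Hypothetical stand-in for a canonical-labelling function: try every
-- relabelling of the vertices and keep the smallest resulting graph under
-- the standard Ord instance. The cost grows as n!, so this is only
-- sensible for very small graphs.
bruteCanon :: Graph -> Graph
bruteCanon g = minimum [relabel p | p <- permutations vs]
  where
    (lo, hi) = bounds g
    vs = [lo .. hi]
    relabel p =
      let rename v = p !! (v - lo)
       in -- Sort each adjacency list so that equal graphs compare equal.
          fmap sort (buildG (lo, hi) [(rename a, rename b) | (a, b) <- edges g])

-- Remove isomorphic duplicates: canonicalise every graph, sort, and drop
-- consecutive repeats. The canonicaliser is passed in, so a faster one
-- can be swapped in for bruteCanon.
dedupIso :: (Graph -> Graph) -> [Graph] -> [Graph]
dedupIso canon = map head . group . sort . map canon
```

Note that the result contains the canonical forms, in sorted order, rather than the original graphs in their original order.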

I was unable to get hgal installed quickly when I wrote this (and again now, I get errors from cabal install, related to building an old version of the array package)… and since my application only had to deal with tens of thousands of graphs of at most about 5 vertices, this worked just as well.

Ivan, if you have a large enough number of graphs, using nubBy is unwise. You’re actually better off sorting (Graph is an instance of Ord) and then removing consecutive duplicates. (map head . groupBy ((==) `on` snd) . sort would replace your nubBy in that case.)

Actually, I’m using a Set to store the current canonical graphs found, after much urging by the people on #haskell.

Even so, however, I beg to differ that nubBy is unwise: about 2/3 of my time is spent running canonicGraph (which I do _before_ removing duplicates). Considering that for one test I ran, I had over 2.5 million values out of which only around 140 were unique, nubBy isn’t that much of a problem.

The big advantage of nubBy as opposed to using “map head . group . sort” or “Set.toList . Set.fromList” is that nubBy is _lazy_, so that my program can start spitting out values as soon as it finds them as opposed to having to sort the entire list (which gets _very_ large) before it can start printing values.
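
For what it’s worth, keeping a Set of already-seen values does not have to cost you the laziness. Below is a minimal sketch of a lazy, Set-backed nub; lazyOrdNub is a name invented here, and it assumes the values being deduplicated (for example, canonical graphs) have an Ord instance. It is an illustration of the general technique, not anyone’s actual program from this thread.

```haskell
import qualified Data.Set as Set

-- Lazy, order-preserving duplicate removal for any Ord type.
-- Each element is emitted as soon as it is first seen, so output can
-- start before the whole input has been consumed. Membership checks
-- cost O(log n) in the number of distinct values seen so far, instead
-- of the linear scan that nubBy performs.
lazyOrdNub :: Ord a => [a] -> [a]
lazyOrdNub = go Set.empty
  where
    go _ [] = []
    go seen (x : xs)
      | x `Set.member` seen = go seen xs
      | otherwise = x : go (Set.insert x seen) xs
```

Because the output is produced incrementally, something like take 3 (lazyOrdNub (cycle [3, 1, 2, 1])) terminates even though the input is infinite.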