3 November 2008

Graph Theoretic Analysis of Relationships within Discrete Data

The title of this post is that of my Mathematics Honours Thesis, the final version of which I submitted (in triplicate!) this morning! :D

This thesis covered the development of my graph-theoretic analysis library Graphalyze and the sample utility using it, SourceGraph. As you might have guessed, SourceGraph is a utility programme that converts a project’s source code (assuming the project uses Haskell, that is) into a graph, and then uses graph-theoretic techniques to visualise and analyse the graph. Of course, both Graphalyze and SourceGraph were written in Haskell (which let me do some cool tricks like Tying the Knot).

Despite this being a rather computer-science-y topic, I did it through the Mathematics department at the University of Queensland, under the supervision of Professor Peter Adams. When I came up with this focus for my thesis, I was programming on a project Peter had funding for over summer. He had already agreed to supervise me on a thesis project dealing with computational combinatorics, though we hadn’t sorted out exactly what I’d be doing. I had previously worked under his supervision on a project dealing with computationally constructing Latin Squares (during which I taught myself Haskell), but I never really ended up finishing it: I found just enough results to disprove some conjectures Peter had, but not enough to make some of my own. I decided that I’d had enough of that for a while and wanted to work on something different for my honours.

So I was sitting there programming on this project (actually, I was probably procrastinating at the time :p ) and thinking about representing code using Call Graphs, when it struck me that you could do a lot more with such a representation than merely use it for visualisation purposes. A quick Google search failed to find anyone else who had even thought of using call graphs for analysis purposes, which to me sounded even better (I had heard stories about how honours projects typically became mere literature reviews, and wanted to avoid that; on the other hand, I ended up having a three page bibliography! :o ). When I discussed it with Peter, he also found the idea interesting (especially since he’s one of the few remaining programmers in the mathematics department), but wanted me to separate the analysis component from the call graph component. Thus were born Graphalyze and SourceGraph.
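To give a flavour of what "more than visualisation" can mean, here is a minimal sketch of two analyses on a toy call graph. It uses `Data.Graph` from the `containers` package, not Graphalyze's actual API, and the `CallGraph` type, the function names in `sample`, and the helpers `recursiveGroups` and `deadCode` are all made up for illustration.

```haskell
import Data.Graph (SCC (..), graphFromEdges, reachable, stronglyConnComp)

-- Hypothetical call graph: each function paired with the functions it calls.
type CallGraph = [(String, [String])]

-- A made-up project, purely for illustration.
sample :: CallGraph
sample =
  [ ("main",   ["parse", "render"])
  , ("parse",  ["lexer"])
  , ("lexer",  [])
  , ("render", ["layout"])
  , ("layout", ["render"])  -- mutually recursive with render
  , ("unused", ["lexer"])   -- never called from main
  ]

-- Groups of (mutually) recursive functions: these are exactly the
-- cyclic strongly connected components of the call graph.
recursiveGroups :: CallGraph -> [[String]]
recursiveGroups cg =
  [fs | CyclicSCC fs <- stronglyConnComp [(f, f, cs) | (f, cs) <- cg]]

-- Functions that cannot be reached from the given root function.
-- If the root is not in the graph at all, everything counts as unreachable.
deadCode :: String -> CallGraph -> [String]
deadCode root cg =
  case vertexOf root of
    Nothing -> map fst cg
    Just v  -> [f | (f, _) <- cg, f `notElem` live v]
  where
    (g, nodeOf, vertexOf) = graphFromEdges [(f, f, cs) | (f, cs) <- cg]
    live v = [f | (f, _, _) <- map nodeOf (reachable g v)]

main :: IO ()
main = do
  print (recursiveGroups sample)
  print (deadCode "main" sample)
```

On this toy input, `render` and `layout` should come out as a single recursive group and `unused` as the only function unreachable from `main`. Neither result is something you would easily spot just by looking at a drawing of a large graph, which is the point of treating the call graph as data to analyse rather than only a picture.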

The original plan was to develop a prototype of the library first and use that to analyse something like the co-authorship lists for the department, before fleshing it out and developing the source code analyser. Time constraints, however, nixed that idea. As it stands, I’m quite happy with Graphalyze and SourceGraph: they’re my first ever publicly released solo coding efforts, and I must say I didn’t do too bad a job in my own (humble, of course) opinion ;-)

Thanks go to everyone on the Haskell mailing lists who tried out initial versions; to the people on #haskell who helped me with programming style questions, especially Andrea Vezzosi (aka Saizan), who introduced me to Tying the Knot; and to Imam Tashdid ul Alam and Brad Clow, who commented on my draft. Finally, I’m sorry Gwern, but C Pre-Processing still isn’t supported in SourceGraph ;-)

And now I have to read 70 or so pages of the stuff. It looks good, except for the damned graph pictures that are so tiny that they are completely incomprehensible. Poor Ivan, he didn’t seem to realise that 95% of the marks for an Honours thesis are assigned to whether the graphics are legible or not.
