Magnus web site

Random stuff

The previous post in my play-with-state-machines-and-conduit series was a bit of a detour into how to combine inputs in conduit. In this post I’ll try to wrap it up by making a conduit pipeline with a timeout.

Prelude

This post doesn’t reflect my first attempt at making the pipeline time out. My first attempt was to build the timeout into the state machine itself. I combined the input from the user with a tick generated once a second into a single event using Either. Each state of the machine then needed to keep track of the number of ticks it had seen, in order to time out on seeing the fourth one, and the counter was reset on each state transition. Obviously this led to a rather complicated state machine.

Then I decided to explore having more, but simpler, parts in the conduit pipeline. I think that strategy resulted in a much cleaner and simpler design.

A state machine for timing out

The state machine for the addition is unchanged from the earlier post and the timing out is added via a second state machine:

It’s a tiny machine of only two states. The ticks come in via Left _ values and cause the counter to be increased each time. The Right e values are simply re-wrapped and passed on together with a resetting of the counter.
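A sketch of the step function for such a machine might look like this (the type and constructor names below are my guesses, not necessarily the ones in the real code):

```haskell
-- Events are either a tick or a real event; the signal either times
-- out, passes the wrapped event on, or says nothing.
data Tick = Tick

data TOSignal e = TONoSignal | TOSignal e | TOTimedOut
    deriving (Eq, Show)

-- The state is just the number of ticks seen since the last real event.
timeOutStep :: Int -> Either Tick e -> (Int, TOSignal e)
timeOutStep n (Left Tick)
    | n >= 3    = (n, TOTimedOut)        -- the fourth tick: time out
    | otherwise = (n + 1, TONoSignal)    -- count the tick
timeOutStep _ (Right e) = (0, TOSignal e)  -- re-wrap, reset the counter
```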

The process

The process is built up by combining the user’s input with a ticker source; the combination is passed through the time-out machine. After that the signal is inspected to see if it’s time to terminate; if not, the inner value is extracted so it can be shown to the user.

The ticker source is a source that repeatedly waits followed by yielding a Tick:
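A sketch of it, written against the current conduit API (the original used conduit 1.x’s Source type; the name is mine):

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import Control.Monad.IO.Class (liftIO)
import Data.Conduit

data Tick = Tick deriving Show

-- Wait one second, yield a Tick, repeat forever.
tickerSource :: ConduitT () Tick IO ()
tickerSource = forever $ do
    liftIO $ threadDelay 1000000
    yield Tick
```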

The code from the post on using my simplistic state machine together with conduit will happily wait forever for input, and the only way to terminate it is pressing Ctrl-C. In this series I’m writing a simple adder machine, but my actual use case for these ideas is protocols for communication (see the first post), and waiting forever isn’t such a good thing; I want my machine to time out if input doesn’t arrive in a timely fashion.

I can think of a few ways to achieve this:

1. Use a watchdog thread that is signalled from the main thread, e.g. by a Conduit that sends a signal as each Char passes through it.

2. As each Char is read, kick off a “timeout thread” which throws an exception back to the main thread, unless the main thread kills it before the timeout expires.

3. Run a thread that creates ticks that then can be combined with the Chars read and fed into the state machine itself.

Since this is all about state machines I’ve opted for option 3. Furthermore, since I’ve recently finished reading the excellent Parallel and Concurrent Programming in Haskell I decided to attempt writing the conduit code myself instead of using something like stm-conduit.

The idea is to write two functions:

one Source to combine two Sources, and

one Sink that writes its input into a TMVar.
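The first of the two might be sketched like this. To keep the sketch dependency-light I use forkIO rather than the async the text settles on below, and all the names are mine:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forever, void)
import Control.Monad.IO.Class (liftIO)
import Data.Conduit

-- Run both sources in their own threads, funnelling everything they
-- produce through a shared TMVar, and yield whatever arrives.
combineSources :: ConduitT () a IO () -> ConduitT () a IO ()
               -> IO (ConduitT () a IO ())
combineSources s1 s2 = do
    tmv <- newEmptyTMVarIO
    void $ forkIO $ runConduit $ s1 .| store tmv
    void $ forkIO $ runConduit $ s2 .| store tmv
    return $ forever $ liftIO (atomically (takeTMVar tmv)) >>= yield
  where
    store tmv = awaitForever $ liftIO . atomically . putTMVar tmv
```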

The latter of the two is the easiest one. Given a TMVar it just awaits input, stores it in the TMVar and then calls itself:
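A sketch of it, with an invented name and written against the current conduit API:

```haskell
import Control.Concurrent.STM
import Control.Monad.IO.Class (liftIO)
import Data.Conduit

-- Await input, store it in the TMVar (blocking until any previous
-- value has been consumed), then loop.
tmvarSink :: TMVar a -> ConduitT a o IO ()
tmvarSink tmv = do
    mx <- await
    case mx of
        Nothing -> return ()
        Just x  -> do
            liftIO $ atomically $ putTMVar tmv x
            tmvarSink tmv
```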

Rather short and sweet I think. However, there are a few things that I’m not completely sure of yet.

forkIO vs. async vs. resourceForkIO

There is a choice between at least three functions when it comes to creating the threads, and I haven’t looked into which one is best to use. AFAIU there may be issues around exception handling and resources. For now I’ve gone with async for no particular reason at all.

Using TMVar

In this example the input arrives rather slowly, which means having room for a single piece at a time is enough. If the use case changes and input arrives quicker then this decision has to be revisited. I’d probably choose to use stm-conduit in that case since it uses TMChan internally.

Combining only two Sources

Again, this use case doesn’t call for more than two Sources, at least at this point. If the need arises for using more Sources I’ll switch to stm-conduit since it already defines a function to combine a list of Sources.

The next step will be to modify the conduit process and the state machine.

Over the holidays I’ve been reading Douglas Crockford’s excellent book JavaScript: The Good Parts. About halfway through I came to the conclusion that JavaScript is an “anti-LISP”. There are many reasons to learn LISP, but none of them is “LISP is widely used in industry.” As Eric Raymond’s famous words claim, knowing LISP will make you a better programmer. On the other hand there seem to be almost no reasons to learn JavaScript. It sure doesn’t seem to teach anything that’ll make you a better programmer. The only reason I can come up with is “JavaScript is widely used in industry.”

Previously I wrote about a simple state machine in Haskell and then offered a simple example, an adder. As I wrote in the initial post I’ve been using this to implement communication protocols. That means IO, and for that I’ve been using conduit. In this post I thought I’d show how easy it is to wrap up the adder in conduit, read input from stdin and produce the result on stdout.

The input to the adder is ordinary Chars, so the ByteStrings generated by stdinC need to be converted before being handed off to the state machine, which is wrapped to make it a Conduit (see below). The state machine generates more signals than I want printed, so I filter out everything but the CalcResults. The result is then passed to show and turned into a ByteString before being sent to stdoutC.

I think the implementation of wrapMachine shows off the simplicity of conduit rather well:
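My reconstruction of it, with the Machine type from the earlier post repeated so the snippet stands on its own:

```haskell
import Data.Conduit

-- The simplistic Mealy-style machine from the earlier post.
data Machine state event signal =
    Machine state (state -> event -> (state, signal))

-- Wrap a machine: await an event, step the machine, yield the signal.
wrapMachine :: Monad m => Machine st e s -> ConduitT e s m ()
wrapMachine (Machine st f) = do
    me <- await
    case me of
        Nothing -> return ()
        Just e  -> do
            let (st', sig) = f st e
            yield sig
            wrapMachine (Machine st' f)
```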

In my previous post I wrote about a, probably rather naïve, approach to constructing state machines in Haskell. I ended it by saying that the pattern matching of Haskell makes it rather simple to manually write the step function required to create a working state machine. Hopefully this post can convince you I’m right.

The vehicle I’ve chosen is a very simple machine capable of reading two integers and adding them. In slightly more detail it’s a machine that:

reads a total of three parts, two integers followed by a plus sign (+)

That’s rather simple and easy to read I find. Though I’m not sure it scales too well to much larger machines. I’ve not really used any DSLs to create large machines either, so I don’t know how well any method scales ;)

To do a bit of exploratory testing it’s handy to create the following function:
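Something along these lines does the job (my sketch; the Machine type from the earlier post is repeated so the snippet stands on its own). It makes it easy to feed a machine a list of events in GHCi and look at all the signals that come out:

```haskell
data Machine state event signal =
    Machine state (state -> event -> (state, signal))

-- Feed a list of events to a machine, collecting all emitted signals.
runMachine :: Machine st e s -> [e] -> [s]
runMachine _ [] = []
runMachine (Machine st f) (e:es) =
    let (st', sig) = f st e
    in sig : runMachine (Machine st' f) es
```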

That looks good enough to me. Though there is (at least) one detail of how the machine works that might be surprising and hence should be fixed, but I’ll leave that as an exercise for the reader ;)

As I mentioned in the previous post I’ve been using this method for writing state machines to implement two different protocols. For the IO I used conduit which means I had to turn the state machine into a conduit. I’ll write about that in a later post though.

During this past autumn I’ve had to work with products implementing two small and simple protocols; one was USB based, the other completely proprietary. In both cases I found it very helpful to write clients where the central behaviour was modelled as a state machine. In the case of the proprietary protocol I’ve ended up writing a couple of servers implementing a subset of the behaviour of the real product, and here too I found using a state machine as the central piece to be very useful.

I started out having a look at Hackage to see if there was already some tool or DSL that would help me create machines. Surprisingly I found nothing, which probably just means I didn’t look carefully enough. Anyway, after reading the Wikipedia articles on Finite-state machine, finite state transducer, Moore machine and Mealy machine I decided to write my own naïve implementation of something resembling Mealy machines.

A Mealy machine is a 6-tuple comprising

a finite set of states

a start state

a finite set called the input alphabet

a finite set called the output alphabet

a transition function mapping a state and an input symbol to the next state

an output function mapping a state and an input symbol to an output symbol

If the transition and output functions are combined into a single function, and we don’t care about being able to reset a machine, it’s possible to use a representation as simple as this:
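Something like this (the exact field names are my choice):

```haskell
-- A machine is its current state together with the combined
-- transition/output function.
data Machine state event signal =
    Machine { mCurState      :: state
            , mTransFunction :: state -> event -> (state, signal)
            }
```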

I’ve opted to call the input alphabet event and the output alphabet signal.

Stepping a machine is then performed by passing the current state and an event to the combined function and updating the machine with the result. Of course the generated signal has to be dealt with too. Or in other words:
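In code, with a representation like the one above, stepping might look like this (my sketch):

```haskell
data Machine state event signal =
    Machine state (state -> event -> (state, signal))

-- Pass the current state and the event to the combined function,
-- update the machine with the new state, and hand back the signal.
stepMachine :: Machine st e s -> e -> (Machine st e s, s)
stepMachine (Machine st f) e = (Machine st' f, sig)
  where
    (st', sig) = f st e
```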

That’s it, really. Except of course that the actual transition function still has to be written. However, with the pattern matching of Haskell I’ve found that to be rather simple, but I’ll keep that for the next post.

I’ve finally found some time to solve a few more exercises on exercism.io. One exercise, called Robot Simulation, I solved in a straightforward way, and promptly got the feedback that my choice of data type could lead to space leaks. Since it’s a dimension of Haskell programming that I rarely (read “never”) have had to think about, I thought I’d take the chance to play a bit.

A simplified solution

I don’t want to give away the solution to the exercise so I’m using some code of a similar shape.
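The shape in question is a fold that accumulates into an ordinary lazy pair, something like this:

```haskell
-- Sum and count a list using an ordinary lazy pair as accumulator.
-- Neither component is forced during the fold, so the additions pile
-- up as thunks until the result is finally demanded.
sumAndCount :: [Int] -> (Int, Int)
sumAndCount = foldl step (0, 0)
  where
    step (s, n) x = (s + x, n + 1)
```

Running this over a list of 5 million elements with +RTS -s makes the thunk build-up very visible in the memory statistics.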

Surely it ought to be possible to add 5 million numbers in less than 600MB, right?

The reason, AFAIU, is that tuples are lazy, which means a lot of thunk building when doing the additions.

Using a stricter data type

Maybe the easiest way to add some strictness is to add a few exclamation marks in the data type. Modifying the data type brings a few other changes though, so here’s first a version without any strictness annotations, so I have something to compare with later on¹:
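A sketch of the two variants, assuming a pair-like type P; first without annotations, then with the exclamation marks added (SP is an invented name):

```haskell
import Data.List (foldl')

-- The type as it is, without any strictness annotations.
data P = P Int Int deriving (Eq, Show)

-- The same shape with a few exclamation marks added.
data SP = SP !Int !Int deriving (Eq, Show)

sumP :: [Int] -> P
sumP = foldl' step (P 0 0)
  where
    step (P s n) x = P (s + x) (n + 1)

-- foldl' forces the accumulator to WHNF on every step, and thanks to
-- the strict fields that also forces the sums, so no thunks build up.
sumSP :: [Int] -> SP
sumSP = foldl' step (SP 0 0)
  where
    step (SP s n) x = SP (s + x) (n + 1)
```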

Strictness with the original type

However impressive the results when using a stricter data type, it might still be preferable to keep the original, somewhat simpler, type. It is possible to do just that by placing some explicit strictness in well chosen places. Since my suspicion is that it’s the addition in step that causes the space leak we’ll rewrite it:
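The rewrite might look like this: force the new components with seq before they are tucked away in the (still lazy) fields. This is my sketch of the idea:

```haskell
import Data.List (foldl')

data P = P Int Int deriving (Eq, Show)

-- Force the new sum and count with seq before they are stored in the
-- lazy fields of P, so no thunks accumulate during the fold.
step :: P -> Int -> P
step (P s n) x =
    let s' = s + x
        n' = n + 1
    in s' `seq` n' `seq` P s' n'

sumP :: [Int] -> P
sumP = foldl' step (P 0 0)
```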

A further slight change is possible: implement NFData for P itself. That way it’s not necessary to pull out and force evaluation of the innards of P. Using the default implementation of rnf did not cut it though; since it only uses seq it just isn’t forceful enough:
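A sketch of such an instance, forcing the innards of P, together with a step that uses deepseq:

```haskell
import Control.DeepSeq
import Data.List (foldl')

data P = P Int Int deriving (Eq, Show)

-- The default rnf only evaluates to WHNF; this one forces both fields.
instance NFData P where
    rnf (P s n) = rnf s `seq` rnf n

-- No need to pull the fields out any more; deepseq forces the whole P.
step :: P -> Int -> P
step (P s n) x =
    let p = P (s + x) (n + 1)
    in p `deepseq` p

sumP :: [Int] -> P
sumP = foldl' step (P 0 0)
```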

It looks like it’s slightly slower though, which might be caused by the double call to deepseq in this version.

Conclusion

From this little example it’s clear that a little strictness in well-chosen places can go a long way in reducing memory footprint. The various solutions also show that the results can vary quite a bit depending on the strategy for introducing strictness. In this particular case I think the stricter data type (see Using a stricter data type above) is the best choice, it’s both faster and uses less memory than any of the others.

1. I’m using RecordWildCards to simplify the code a bit, which might possibly influence evaluation as well. I’m simply not sure.

TL;DR

As of last evening I finished the last exercise/homework of the course FP101x Introduction to Functional Programming. I mostly signed up for fun, but did intend all along to go through with it, watch the lectures, do the homework… in short, take the course ;)

Now that it’s done I’m happy I did. It gave about as much as I expected on the programming side, and it gave me some experience with using the edX MOOC. I will suggest the course to colleagues and I’m already looking around for the next course to take myself.

Thoughts about the course

The course was, on the whole, very good. It was well planned; I felt it started from a level suitable for people unfamiliar with FP and then progressed at a reasonable pace. It should be noted though that the lectures alone probably aren’t enough for someone who’s new to FP; reading the book or researching the subjects on one’s own is required to follow along.

There were a few things in the course that I don’t really see the point of though. The thing that stands out in particular is the 12th lecture that covers proofs by induction of functions that work on lists. I think it would have been worthwhile spending some time on making more “realistic programs.” Furthermore, picking the “Countdown Problem” as the topic of an entire lecture feels very “academic”. Personally I would have liked an example that comes a bit closer to a problem that better reflects that FP can be used to solve problems that occur in a professional developer’s work.

Thoughts about the edX MOOC

The edX site is fairly nice to work with. There are several things that I liked from the beginning:

Several methods of logging in are offered, I opted to use my Google-plus account since I like to avoid creating yet another account if I can.

Interacting with the site is natural and finding my way around was easy from the start.

The page I land on after logging in shows my current courses, and it’s easy to get to them from there.

There’s a page showing my progress in the course; that turned out to be the page I go to all the time since I can find the next lecture from there.

There are also a few things I find really irritating:

In many of the exercises the code isn’t plain text but rather images; this makes it impossible to copy the code and paste it into a REPL.

In some cases the code in the exercises, especially in the answer options, was very badly laid out. For instance, in FP101x there are several questions where the answer options are code and rather wordy; due to the layout the code becomes unnecessarily difficult to read. (And thanks to the previous issue with the site, it was not an option to copy the code and reformat it to make it easier to read.)

Ever since ICFP 2014 I’ve had as a goal to get into the habit of coding in Haskell. It’s been the language I enjoy most for a few years now, but being surrounded by and talking to so many brilliant developers as I did during that week really drove home that I will only have more fun the more code I write. My goal was not very ambitious; just write something in Haskell most days every week. So far I’ve managed to keep it up.

These are a few tricks I’ve used and they’ve worked well for me so far.

Just write, no matter what, just write

In ninth grade a rather successful Swedish author visited my school and what I remember most from that is one thing he said:

Just read! It doesn’t matter what. It doesn’t matter if what you read isn’t considered good literature; read Harlequin books if that’s what you like, read magazines, read comics. Just read!

I think the same holds for writing code; it’s only with practice that one gets comfortable expressing oneself in a particular language.

Fix warts

I can’t actually think of any piece of code I’ve written that doesn’t have some warts. It may be in the form of missing features, or quirks (bugs) in the implementation that force the user to regularly work in a less-than-optimal way. I’ve found fixing warts in tools and libraries I use myself to be one of the most rewarding tasks to take on; the feedback is so immediate that every fix causes a boost in motivation to fix the next one.

Exercise sites

Sometimes it’s simply difficult to find the motivation to tackle working on an existing project, and inspiration for starting something new might be lacking too. This happens to me regularly. I used to simply close the lid on the computer when it did, but now I try to find some exercises to do instead.

There are several sources of exercises. I know Project Euler is rather popular among new Haskellers, but there are others.

CodeEval is a site with problems at three different levels. It may be extra interesting for people in the US, since some of the problems are sponsored by companies which seem to use the site as a place for recruiting. So far I’ve only seen American companies do that, but I suppose it might catch on in other parts of the world too. Haskell is one of several languages supported.

Exercism is both a site and a tool. The goal is to facilitate learning of languages. On first use the tool will download the first exercise, and after completion one uses it to upload the solution to the site. Once uploaded, the solution is visible to other users, and they are allowed to “nitpick” (comment). After uploading a solution to one exercise the next exercise in the series becomes available. It supports a rather long list of programming languages.

I like both of these, but I’ve spent more time on the latter one. Personally I find the idea behind Exercism very appealing and I’ve been recommending it to a couple of co-workers already.

Feel free to put links to other sources of exercises in the comments.

Simplify old code

With more practice comes more and more insights into what functions are available and how to string them together. When I don’t even feel like doing a full exercise on Exercism I just dig out something that smells a little and clean it up. Anything is fair game, no matter how tiny. Just take a look at my implementation of reThrowError.

What else?

I’d love to hear tips and tricks from other people who aren’t lucky enough to have a day job where they get to write Haskell. How do you keep up the learning and practice?

It does include some nice quotes and references, and so far I’ve only read one of the many that I bookmarked: Computing Science: Achievements and Challenges (EWD1284). It contains a few quips that, based on other articles I’ve read, seem fairly typical of Dijkstra. I simply love the way he expressed his opinions at times.

This one really ought to have been in the lengthy post on the OOP disaster:

After more than 45 years in the field, I am still convinced that in computing, elegance is not a dispensable luxury but a quality that decides between success and failure; in this connection I gratefully quote from The Concise Oxford Dictionary a definition of “elegant”, viz. “ingeniously simple and effective”. Amen. (For those who have wondered: I don’t think object-oriented programming is a structuring paradigm that meets my standards of elegance.)

And his thoughts on formal methods are well-known of course, as are his thoughts on iterative design. However, I rather like how he expresses a certain level of disgust with the Anglo-Saxon world when writing about those topics:

The idea of a formal design discipline is often rejected on account of vague cultural/philosophical condemnations such as “stifling creativity”; this is more pronounced in the Anglo-Saxon world where a romantic vision of “the humanities” in fact idealizes technical incompetence. Another aspect of that same trait is the cult of iterative design.

It amazes me every time I read something by someone like Dijkstra, just how much things stay the same, even in a field like computing science, which is said to be moving so fast.

The latest update of optparse-applicative triggered me to look over the functions in cblrepo for parsing a few custom command line options. I used to do the parsing in a rather ad-hoc way with lots of use of list functions to split on specific characters. For instance, some option values are pairs of package name and version separated by a comma: PKG-NAME,VERSION. This worked fine and was easy to plug into version 0.10 of optparse-applicative. It was also easily extended to triples, PKG-NAME,VERSION,RELEASE, but it started feeling a bit brittle when some tuples got extended with an optional list of flag assignments, PKG-NAME,VERSION[:FLAG,FLAG,FLAG,...]. The recent release of version 0.11 of optparse-applicative changed the API for custom option value parsers radically; instead of passing a string to the parser, the parser has to use readerAsk to get the string. In short, ReaderM turned into a state monad.

In adjusting to the new API I noticed that the code was organised in such a way that some low-level parsing functions were used directly from command line option definitions, while also being used as building blocks for the more complex parsers. This of course meant that the structuring of the functions needed to be changed completely to deal with the API change.

It turns out there already was a parser that was written in a different style (here already adjusted to the 0.11 API):
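The parser itself might look something like the following ReadP sketch for the PKG-NAME,VERSION[:FLAG,...] values described earlier (names and types are mine, and the wiring into optparse-applicative via readerAsk is left out):

```haskell
import Text.ParserCombinators.ReadP

-- Parse "PKG-NAME,VERSION[:FLAG,FLAG,...]".
pkgNVF :: ReadP (String, String, [String])
pkgNVF = do
    name  <- munch1 (/= ',')
    _     <- char ','
    ver   <- munch1 (\c -> c /= ':' && c /= ',')
    flags <- option [] (char ':' >> sepBy1 (munch1 (/= ',')) (char ','))
    eof
    return (name, ver, flags)

-- Accept only a complete, unambiguous parse.
parsePkgNVF :: String -> Maybe (String, String, [String])
parsePkgNVF s = case readP_to_S pkgNVF s of
    [(r, "")] -> Just r
    _         -> Nothing
```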

It is slightly longer, but it’s rather a lot easier to read what’s happening after this rewrite. ReadP feels like a lighter option than pulling in parsec as a dependency, but I’d love to hear any comments or suggestions, as well as pointers to how other people deal with parsing of non-trivial types of arguments in combination with optparse-applicative.

As I wrote in a previous post on converting posts from WordPress to Hakyll, I couldn’t quite find a tool that met my needs out of the box. I had a look at the source of hakyll-convert; it uses a library for RSS parsing, which sounds good. However, the export of posts from WordPress is in an extended RSS format; among other things it contains the comments on posts. Unfortunately it doesn’t look like the RSS library supports the WordPress extensions, so modifying hakyll-convert to also extract comments seems like a bit more work than I’d like to put into it. Especially since I had a feeling that hacking up something using tagsoup would be quite a bit faster.

I put the resulting script in a gist on GitHub. I call it bbwp, which is short for Bye, bye, WordPress.

Adding tags to a Hakyll site brought some surprises. In retrospect it all makes sense, but it took some thinking on my part to work out the why of it all. The resulting code was heavily influenced by Erik Kronberg’s site.

First I thought I’d just add tags to each rendered post, by building the tags

It seems most people using Hakyll or other static site generators rely on services like Disqus, but I really don’t like the idea of putting a bunch of JavaScript on each page and dynamically loading all comments off some cloud storage. It sort of flies in the face of the idea of having a static site to begin with. Searching online resulted in a few posts related to a plugin for static comments in Jekyll.

This post only covers dealing with the comments, and not how the reader actually submits a comment. I’ll save that for the next post.

Code changes

I settled on the following naming scheme for comments. The comments for a post P, which is found at posts/<P>.mkd, will be put into files named comments/<P>-c000.mkd, comments/<P>-c001.mkd, and so on. The crucial bits are, first, that the post’s name is a prefix of all its comments’ names and, second, that the identifiers (basically the filenames) of the comments are, just like identifiers for posts, easy to sort in date order.

Adding a rule for the comments is easy:

match "comments/*" $ compile pandocCompiler

Then it got a little more tricky. The comments for each post need to be put into the context used to build the posts. Previously I’ve used field, which takes a function turning an Item String into a String. I’ve also used listField, which is used to tie a key to a list of Item a. What I needed here though doesn’t seem to exist, i.e. a context function that takes an Item a and returns a list of Item a. So after a bit of studying the source of field and listField I came up with listFieldFunc:

Template changes

Currently the landing page contains all the posts I’ve written so far. That works for now, but it won’t for very long if I keep on writing posts. It certainly won’t work when I get around to importing all the posts from my old blog. Clearly I need to limit the posts that appear in index.html at some point, but before that two things need to be in place:

The post titles on the landing page should be links to the single-post page.

The single-post pages should be linked together with links to the previous and next posts.

Code changes

I want the links to have the post titles, not just general text like “previous post” and “next post”. That meant putting a total of four values into the context used to render the individual post pages, two URLs and two titles. That means postCtx will look like this:

In implementing the functions to extract the URLs and titles I first tried working on a snapshot but that only resulted in circular dependencies, so I had to follow Richard’s lead and used Identifier to drive the whole thing. When done with functions for extracting both URL and title there was an obvious pattern so I generalised a little and refactored into a single function to do the grunt work:

The r argument is a function that given the ID of the current item, and a list of the IDs of all items (sorted with the most recent first) returns the ID of a single item. With that it’s possible to create two more functions, one that operates on the previous post and one on the next post:

What’s left is putting the fields to use in a template. The template itself is in the following section. The compilation step is modified to use the new context, and the new template is put in after the snapshot so the added links don’t appear in the index.html.

I ran both tools on an export of my posts (a tool included in WordPress), and both tools spat out one file per post. So far it looked good. Then I put all the posts into my Hakyll blog project and tried to build the site.

hakyll-convert

The conversion finished without any reported errors.

The individual files were rather badly named; the name of each was based on the post ID rather than on the post date (but there’s a bug report for that).

The posts were consumed by the build script without problems.

The resulting HTML was not satisfactory, but that’s not due to the tool, instead it’s my choice of using GeSHi (via WP-Syntax).

exitwp

The conversion finished with a few reported errors, which I didn’t investigate further.

The posts were not consumed by the build script, due to categories and tags not being comma separated but rather YAML lists.

The conclusion is that hakyll-convert will be what I use, but it’ll take a little while before I get around to importing my old posts since it’ll require manual edits to ensure they look all right.

The site doesn’t look that good. Actually it is pathetically simple. I’m however not that good at CSS, and I also don’t really have a good sense of design, so making it look good is going to be an uphill battle. The first step is easy though: add a standard CSS file to normalize the look.

The CSS

Template changes

Well, obviously the CSS has to be pulled into the web pages. The full webpages are provided by templates/default.html and the line that needs adding is

<link rel="stylesheet" type="text/css" href="/css/normalize.css"/>

Of course it goes into the <head> section of the file.

Code changes

The changes to the build script are equally straight forward.

match "css/*" $ do
    route idRoute
    compile copyFileCompiler

I opted to just copy the file rather than compress it using compressCssCompiler. I don’t think the speedup is really worth it, and I personally find it very handy to be able to read the CSS files on sites that I think look nice. Of course I want to enable others to do the same.

A blog isn’t complete without a feed, so that was the first thing to add. Luckily there’s a good tutorial on adding a feed in Hakyll. The following code snippets are basically just copied from that tutorial.

I’ve also decided to start publishing the result; it’s available on my github pages.

RSS or atom?

I decided to use Atom since it’s a standard. Yeah, there’s not much more to write about that I suppose.

Code changes

The first thing to do was to add a feed configuration. It’s a simple adaptation of what’s found in the tutorial.
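For reference, such a configuration is just a record; all the concrete values below are placeholders rather than the site’s real ones:

```haskell
import Hakyll

feedConfiguration :: FeedConfiguration
feedConfiguration = FeedConfiguration
    { feedTitle       = "Magnus web site"
    , feedDescription = "Random stuff"
    , feedAuthorName  = "Magnus"
    , feedAuthorEmail = "magnus@example.com"
    , feedRoot        = "http://example.com"
    }
```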

Once again the snapshot of the posts comes in handy. Since the old WordPress site limited the feed to the 50 latest posts I decided to do that in the new one too. Maybe it’s a bit excessive, 10-20 ought to be enough, but I’ll leave it for now. The feed-specific context is a nice little detail, also from the tutorial. The feed builder requires the presence of a description for each post, but to avoid having to remember to add one to all posts I just add a description field containing the body of the post.

The templates

single-post.html

A template for, as the name suggests, a single post. It renders the post into an HTML snippet.

default.html

The main template, i.e. the one that provides the entire structure of a complete HTML page.

It should also be noted that index.html basically is a template itself. These three files fit together such that each post is turned into an HTML snippet (single-post.html), all the snippets are then pulled into index.html, which finally is wrapped into a proper page by default.html.

The only slightly unnecessary thing is that all posts are turned into complete web pages (line 14), but none of those files is actually reachable from the generated landing page. My plan is to limit the number of posts on the landing page so they will be used later on.

For the moment some site constants are put into the base context. I’m not completely convinced that’s the wise thing to do, but I have a vague feeling that it’s better putting stuff like that into the context than hard code them into the template. I guess the better solution would be to have a configuration file for them though. That’s for the future though, for now I’m keeping it simple.

I’ve decided to ditch WordPress and switch to Hakyll. I’ll keep the old site intact until I’ve gotten around to import all the posts. That’ll probably take quite a while.

As you can see the site is terribly bare bones at the moment. It’ll stay like this until I work out enough of Hakyll to get most of the items into the site. There’s simply no point in getting distracted with the craziness that’s CSS and making it pretty until I’m happy with the content handling.

The first part of the paper is all about writing small visitors that are then combined into more complicated ones. This part is nice but not that exciting. The interesting bit is when it gets to controlling traversal, which means it’s possible to remove the traversal code that usually appears in the accept method each visited type has to implement. Let’s see how that can look in Python.

Due to the lack of traversal in the accept methods it’s easy to be underwhelmed by the Print visitor:

In [32]: build_tree().accept(Print())
Tree (0x7f1680681a90)

To address this we first need a visitor combinator that runs two visitors in sequence. Unsurprisingly we’ll call it Sequence. Its constructor takes two visitors, and for each node in the tree it passes each one to the node’s accept method.

We only descend one level, but we still keep the name since that’s the name they use in the paper.

With this in place it does become possible to create combinations that do traverse the full tree though. It even becomes rather simple. Traversing top-down is a simple matter of using a sequence that ends with All, and bottom-up is a matter of using a sequence starting with All.

So, what was the situation and how is it connected to learned helplessness?

The rest is just slightly tongue-in-cheek ;)

What to standardise on

I’m in a situation where the powers-that-be have standardised on applications. Not on open formats or open protocols, but on specific applications that use proprietary formats and proprietary protocols. Of course these applications suck. That’s what a lack of competition does: it removes any will for a company to actually make improvements to their applications! Some of these applications have captured such a large market share that reverse engineering of the formats was inevitable. Yay! That means I can use a sane OS and vastly better applications. However, one protocol has not been reverse engineered yet, and I’m forced to use the standard application. This application is painful to use and only runs on a crap OS.

“How bad can it be?” you ask. The application is Outlook; the OS is Windows! Yes, it’s that bad. Hence the thoughts of gumption, or rather the loss of it, which is exactly what starting Outlook causes. Every time!

Connection to learned helplessness

It continues to amaze me that companies standardise on Windows and applications that only run on Windows. There are better alternatives, especially in this day and age with fast networks and powerful and fast execution environments that completely sidestep the whole question of which OS to run. Still there seems to be very little will to upgrade to Linux, or to standardise on web-based applications. Why is that? In the past I’ve thought it might be the network effect. Most often I’ve come to the conclusion that it most likely is simple inertia. What’s the explanation for the inertia though?

This is where learned helplessness can offer an explanation. People have been conditioned and have grown so used to Windows and other Microsoft products that they simply don’t recognise that there now is a way out. No matter how many escape routes become available, people simply won’t see them.

What to do about it

As the experiments on dogs showed there is hope (from the wikipedia page):

To change their expectation and to recover the dogs from helplessness, experimenters had to physically pick up the dogs and move the legs in a close replication of the physical actions the dogs needed to take to remove themselves from the electrified grid. This had to be replicated at least 2 times before the dogs would exhibit the functional response of jumping over the barrier to get away from the electrified grid. Threats, rewards, and observed demonstrations had no observed effect in helping the dogs to independently move away from the shocks.

Oh how I wish I could pull off the direct translation to my workplace: re-install my co-workers’ computers and replace servers and services. Too bad that’s not a realistic plan. What I can do though is civil disobedience (or maybe it should be called something like civil disobedience in the workplace instead). By simply not conforming, and at the same time showing that there are better ways of getting the job done, others will hopefully notice and either adopt my way, or come up with something that suits them better (which I then can learn from). Even if that doesn’t happen at least I’ll keep my gumption at healthy levels :)

A couple of weeks ago I found a link to a stackoverflow question I’d saved away a long time ago. I’d saved it due to having asked myself the exact same question and being curious about the papers when I found that answer. Over the last few weeks I’ve made my way through those papers and, as so often happens, I found a couple of references to other papers that sounded interesting. One such paper was The Essence of the Visitor Pattern by Jens Palsberg and C. Barry Jay. There was however one little thing that bugged me with the Walkabout pattern, and I thought I’d try to work that out for myself. Of course I’m using Python rather than Java ;)

Visitor

The Visitor pattern separates the traversal and the operation. It does this by using an accept method in the classes making up the data structure, this method takes a Visitor instance which implements the operation to be carried out. This enables adding new operations without modifying the data structure.

First we need a simple structure to play with: a tree where each node can have zero or more sub-trees.

Next we need an implementation of a Visitor that all visitors can derive from. Since Python’s dispatch doesn’t depend at all on the type of the argument, we’ll have to implement the necessary behaviour ourselves, i.e. inspect the type and then pick the correct method to call.
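A minimal sketch of what that might look like (the names Tree, accept, and visit_Tree are assumptions based on the text, not the post’s actual code):

```python
class Tree:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def accept(self, visitor):
        # The data type hands itself to the visitor; the operation
        # itself lives entirely in the visitor.
        visitor.visit(self)


class Visitor:
    def visit(self, obj):
        # Python dispatches on the method name only, not on argument
        # types, so we inspect the type ourselves and pick the matching
        # visit_<Type> method if there is one.
        method = getattr(self, 'visit_' + type(obj).__name__, None)
        if method is not None:
            method(obj)


class Print(Visitor):
    def visit_Tree(self, tree):
        print('Tree (0x%x)' % id(tree))
        # The visitor carries on the traversal into the sub-trees.
        for child in tree.children:
            child.accept(self)
```

Adding a new operation is now just a matter of deriving another class from Visitor, without touching Tree.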

Looking at this it’s easy to see the objections Palsberg and Jay present in their paper:

A data type is only ‘visitable’ if it has an accept method, and

we must know the types of all objects we want to visit, so changes to the class structure require changes to the visitors.

The authors then introduce Walkabout in order to remove these limitations.

Walkabout

To remove these limitations the authors use reflection to find out whether the visitor has a method to carry out the operation on the type of the current object. If such a method doesn’t exist they use reflection to find all members of the object and then visit them. The Walkabout class and its visit method then look something like this:
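A Python sketch of the idea might look like this; the guard around __dict__ is my own addition for objects, like ints, that lack one, and the names are assumptions based on the text:

```python
class Walkabout:
    def visit(self, obj):
        # Reflection, part one: is there a handler for this exact type?
        method = getattr(self, 'visit_' + type(obj).__name__, None)
        if method is not None:
            method(obj)
        else:
            # Reflection, part two: no handler, so find the object's
            # immediate members and visit each of them instead.
            for member in getattr(obj, '__dict__', {}).values():
                self.visit(member)


class Tree:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class PrintTrees(Walkabout):
    def visit_Tree(self, tree):
        print(tree.name)
        # These two lines are the traversal that the derived class
        # still has to carry on itself.
        for child in tree.children:
            self.visit(child)
```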

This is where Palsberg and Jay stop, but I think there is one little detail in Walkabout that’s worth looking a little closer at.

Walkabout Two

I personally find it a little strange that the authors first note that the visitor pattern suffers from the limitation that one has to know the full set of types to visit up front, and then don’t recognise that their own solution instead requires one to know the shape of (parts of) the structure to operate on. In other words, the classes deriving from Walkabout are in charge of carrying on the traversal (the last two lines of visit_Tree above).

This little detail is of course easy to work around: we just modify visit to always visit all members, irrespective of whether there is a method to handle the current object. There may be cases where we want to stop the walking about for efficiency reasons; we can address that at the same time, and WalkaboutTwo will then look like this:

One little detail above is that if we use only reflection to traverse the Tree we won’t actually find the sub-trees as immediate members, since they are contained in a list. We address this by checking whether the object is iterable as well, and if it is we visit all its items.
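Put together, WalkaboutTwo might look something like the sketch below. The return-False protocol for stopping the descent is my own guess at a reasonable design, and the string special-case is needed to keep the sketch from recursing forever:

```python
class WalkaboutTwo:
    def visit(self, obj):
        method = getattr(self, 'visit_' + type(obj).__name__, None)
        if method is not None:
            # A handler may return False to stop the walking about at
            # this point, for efficiency.
            if method(obj) is False:
                return
        # Always visit the immediate members, whether or not a handler
        # existed for the object itself.
        for member in getattr(obj, '__dict__', {}).values():
            self.visit(member)
        # Containers such as lists keep their contents out of __dict__,
        # so also walk anything iterable.  Strings are excluded since
        # iterating them yields more strings, recursing forever.
        if not isinstance(obj, str):
            try:
                items = iter(obj)
            except TypeError:
                return
            for item in items:
                self.visit(item)


class Tree:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class PrintAllTrees(WalkaboutTwo):
    def visit_Tree(self, tree):
        # No traversal code here; WalkaboutTwo finds the sub-trees.
        print(tree.name)
```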

For quite a while I’ve been using a small Vim plugin that lets me write configuration specific to a system; it loads a config file based on the system’s host name. Unfortunately I can’t seem to find that plugin anywhere now, so I’ve put it in a snippet. This allowed me to easily create a single clean Vim configuration and check it into version control, while still allowing for settings that are unique to a system.

Lately I’ve found it slightly limiting though; I really wanted other things to trigger the loading of some piece of configuration. So I wrote my first ever Vim plugin: localcfg.

User configuration

Play

I have a new assignment at work and now find myself at yet another Windows shop. They are making embedded systems, but have for some strange reason decided that Windows is the only development platform to use. After only a few weeks here I’m noting a growing irritation with the tools offered for my use. The amount of forced mouse usage is astounding and the main tool, Visual Studio, is turning out to be the main culprit. After a week or so of exposure to VS I’ve found what I consider to be a serious flaw with a tool for developers: it doesn’t scale.

No hierarchical structure It doesn’t scale very well with the size of a project. The concept of having a solution with a number of projects is not bad. But the implementation doesn’t scale. A project can’t have sub-projects, which means I can’t really layer the code in the IDE in the same way I do on disk. The only thing I can do is organise viewing of files through the concept of filters.

All configuration is manual, part 1 MS seems to have optimised for small projects. All configuration is kept in XML files, and the only interface to them is a set of property dialogues (some which can be resized, others not) requiring an amazing amount of pointing and clicking to get anything done.

All configuration is manual, part 2 MS have optimised for beginning new projects. Getting started is amazingly quick, but once you reach a size of about 10 projects it becomes daunting to fix anything that requires configuration in all projects. Making sure that all configurations are correct is a major undertaking, and requires an insane amount of mouse use. Some earlier versions of VS even seem to have made it impossible to edit the common settings of configurations properly; a simple mistake and the value of one configuration is lost.

Experience There are no shortcuts to discover. The configuration is all in XML, which means it’s not really possible to jump out of VS and use a text editor for fixing up semi-broken configurations (like dealing with someone’s decision to place all intermediate files and final results in the top-level solution directory).

So, how do Windows developers cope with this? Don’t they use VS? Do they police the configurations diligently to ensure no silliness creeps in? Or are they all using tools to address these points (like CMake)?

A long while ago I was looking for a way to handle command line arguments in a C++ program I was writing for Windows. At the time I only found Boost.Program_options. After a bit of experimenting I found that the pre-built Boost libs I found back then had some issues, and after a bit of thinking I decided to write the program in Python instead :) Now I once again find that I need to handle command line arguments in C++ on Windows, but this time I found quite a few options: gflags, ezOptionParser, and TCLAP. They are all stand-alone, meaning that I don’t need to pull in a huge dependency like Boost or Qt, and liberally licensed (BSD3 and MIT) so usage at work is no problem. After a bit of playing with all three I found that TCLAP is most to my liking, but gflags does have one nice little feature: it allows putting command line option definitions pretty much anywhere in the code. This would solve one of the problems I’m facing; the tool shall consist of a main program and pluggable modules, where each module must be able to add command line arguments of its own. However, it turns out that the TCLAP API easily allows implementing such a scheme. Here’s how I implemented it in the experiment code I’ve been playing with this weekend.

How it works

The basic idea is to create a central instance of TCLAP::CmdLine that options can be registered with, both in the main program and in the pluggable modules. By declaring the option instances as top-level variables in compilation units it’s enough to just load a pluggable module for the constructors to run, and by passing the central TCLAP::CmdLine to the constructors they will register themselves properly. This seems to be the same idea used in gflags.
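The same register-on-load pattern can be sketched in Python with argparse; this is purely illustrative of the design, not the post’s C++ code, and all names here are mine:

```python
import argparse

# The central parser, analogous to the shared TCLAP::CmdLine instance
# that is passed to every option's constructor.
central_parser = argparse.ArgumentParser(description='plugin host')


class Option:
    """Registers itself with the central parser on construction, the
    way a top-level TCLAP option instance registers when its
    compilation unit is loaded."""

    def __init__(self, flag, help_text, default=None):
        self.flag = flag
        central_parser.add_argument(flag, help=help_text, default=default)

    def value(self, args):
        # argparse stores '--some-flag' under the attribute 'some_flag'.
        return getattr(args, self.flag.lstrip('-').replace('-', '_'))


# In a pluggable module this would be a top-level variable, so merely
# importing the module registers the option with the central parser.
verbosity = Option('--verbosity', 'how chatty to be', default='quiet')
```

After all modules are loaded, the main program calls `central_parser.parse_args()` once and each module reads its own options out of the result.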

Registering a command line option

The following is the code my pluggable module uses to register an option:

Limitations

The main limitation I can see with this approach, and AFAICS that would be true with gflags as well, is that it’s not very easy to find the plugin modules to load by passing them on the command line. If the TCLAP API allowed parsing without acting on --help and --version, as well as be forgiving with arguments it didn’t know how to handle, it would be possible to use an option in the main application to find the modules, load them, and then re-parse the command line. In my particular case it doesn’t matter much, the plugin modules will all be found in a well-known place and all available modules should be loaded every time. It does make testing a bit more cumbersome though.

This is my input into the recent discussion on referential transparency (RT). I’m nowhere near as well versed in the subject as others, but how am I ever to learn anything unless I put my thoughts out there for them to be laughed at and ridiculed? ;)

It all started with a [post on stackoverflow.com](http://stackoverflow.com/questions/210835/what-is-referential-transparency), which received several very long and detailed responses, in particular from Uday Reddy (here and here). His answers were also linked to from Reddit. His second response contains a link to an excellent paper by Strachey, Fundamental concepts in programming languages. I’d go as far as saying that, despite it being lecture notes rather than a fully worked paper, it ought to be required reading for all software developers.

The rest of what I write here hinges on me actually understanding what Strachey writes in his paper. Of course I’m looking forward to comments/corrections/etc that help me correct my understanding.

What Strachey says about RT

In section 3.2.1 he introduces RT like this:

One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.

There is however a crucial bit just before that:

Like the rest of mathematics, we shall be concerned only with R-values.

That is, he starts out with a very limited subset of what most people would consider a usable imperative programming language.

He then dives into some more details in section 3.2.2 by adding the concept of environment, which is handled through the use of a where-clause, or alternatively using let-statements (this ought to be making any Haskell developer feel right at home). After a few interesting sections on stuff like applicative structure, evaluation, and conditional expressions he finally tackles the issue of variables in section 3.3.1. There are two pieces to the trick, the first is to take advantage of his earlier insight that lead to a split of values into L-values and R-values:

If we consider L-values as well as R-values, however, we can preserve referential transparency as far as L-values are concerned. This is because L-values, being generalised addresses, are not altered by assignment commands. Thus the command x := x+1 leaves the address of the cell representing x (L-value of x) unchanged although it does alter the contents of this cell (R-value of x). So if we agree that the values concerned are all L-values, we can continue to use where-clauses and lambda-expressions for describing parts of a program which include assignments.

The cost of this is that the entire theory constructed earlier for operations taking R-values now has to be revised to incorporate L-values. The outline for this is in the rest of section 3.3 and it basically comes down to include an abstract store in the environment. However, before doing that he mentions that:

I think these problems are inevitable and although much of the work remains to be done, I feel hopeful that when completed it will not seem so formidable as it does at present, and that it will bring clarification to many areas of programming language study which are very obscure today. In particular the problems of side effects will, I hope, become more amenable.

He does reach his goal, but it’s a bit unfortunate that he stops short of considering the wider problem of side effects. My assumption is that this would have to be dealt with in a similar way to assignment, but that would mean that rather than just adding a store to the environment, the world, or a subset of it, would need to be added.

An open question (to me) is whether anyone has built on Strachey’s work in this area and thought through the details of RT and general side effects.

RT in Haskell

which I actually think is a rather good, and very short, description of it. It’s not the full story, there are further important details, but it’s a good first intuition. Also, it’s a description usable in Haskell. Well, to be slightly more nuanced, it’s good for Haskell without IO (Haskell-IO). However, this is where the strict type system of Haskell really starts to shine, because (here I’m probably a bit imprecise) we only have R-values in Haskell-IO. If we want to use assignment we add the use of a state monad, and we do so explicitly.

A former colleague of mine said that in Haskell we need to build up our own computational models ourselves. For instance, if we need assignment we use State, if we need to record progress we use Writer, etc. In other languages the language designer has already made all those choices for us, we don’t get to make them ourselves. For RT it means that Haskell is more explicit about what the environment of a function is.

Moving on to general side effects, those are also more explicit in Haskell since they have to happen inside the IO monad. That alone is a great boon for RT in Haskell, since it becomes explicit where RT as worked out by Strachey applies directly, and where there are (hopefully amenable) problems of side effects left. Even further, in Haskell it’s possible to make subsets of IO by wrapping IO (see e.g. my own posts on wrapping IO, part 1 and wrapping IO, part 2). I’m sure that if including the world in the environment is the way to achieve RT with general side effects, then it’s highly beneficial to be able to create subsets of the world.

RT in Haskell vs. RT in (common) imperative languages

But, today, functional programmers claim that imperative programming languages are not referentially transparent. Strachey would be turning in his grave.

This may well be true, but I think that when a Haskell programmer says it, he’s only twitching slightly. The reason? Strachey writes:

Any departure of R-value referential transparency in a R-value context should either be eliminated by decomposing the expression into several commands and simpler expressions, or, if this turns out to be difficult, the subject of a comment.

Which is something that Haskell programmers do naturally by use of IO. That is, in Haskell you either have an R-value, and you clearly see that you do, or you put in a comment, which is encoded in the type of the function.

This rather lengthy post basically arrives at the following, which is what I suspect the user [pacala is saying about RT on Reddit][reddit-pacala]:

Imperative languages may well be RT, but when trying to understand a code base the environment of each function is so large that understanding it is an intractable problem. I don’t have this problem in Haskell.

In the interest of full disclosure: this post is related to what I do for a living, development of and for embedded systems. I work for Semcon, but they don’t make me wear a suit and tie so these are my words, and mine alone.

A bit of background info

In a recent project we had a system where the turning of the wheels was controlled by a simple dial. It emitted pulses as it was turned and the pulse train was shifted slightly depending on the direction of the turn. In software this was mapped onto two signals, one for each direction, with one signal emitted for each pulse in the train. All very straightforward so far.

To avoid accidental changes of direction we decided to

only start turning the wheels after having received four initial signals, and

treat a full second without receiving any signal as a sign that the turning had stopped.

The solution

The application was to be implemented using Qt, so using the Qt state machine framework was an obvious choice. The full state machine wouldn’t have to be large, only 8 states. The initial state (sResting) would indicate that the system was in a steady state (no turning); from there any received signal would advance into a successive state (sOne, sTwo, sThree, sFour) to indicate the number of received signals. From the fourth state the machine would advance directly to a state (sTurning) where a received signal would initiate an actual turn of the wheels. The turning would happen upon entry into two separate states (sTurnRight and sTurnLeft), each of which would instantly return to sTurning. All of this is simple and straightforward; what wasn’t so clear was how to implement the automatic return to the initial state after 1s of inactivity.

The implementation

As I like to do, I first experimented a little to find a suitable solution to the problem. What follows is the resulting code of that experiment. The final code used in the project ended up being very similar. It’s all based around the method postDelayedEvent() found in QStateMachine.

Sure, several of the entry functions could be folded into one, but in order to validate the idea it’s easier to make separate ones for each state. The pattern is easy to spot: on entry a delayed return event is registered (if there’s a previous one it’s replaced with a new one), except for the steady state (sResting), where any delayed event is removed, and the turning states (sTurnRight and sTurnLeft), since those states immediately return to sTurning anyway.
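The pattern itself is not Qt-specific. Here is a language-neutral sketch of it in Python; the real implementation uses QStateMachine and postDelayedEvent, while this version passes time in explicitly to stay self-contained, and all the names are my own:

```python
RESTING, COUNTING, TURNING = 'resting', 'counting', 'turning'
TIMEOUT = 1.0  # seconds of inactivity before returning to resting


class TurnMachine:
    def __init__(self):
        self.state = RESTING
        self.count = 0        # signals seen so far while counting
        self.deadline = None  # when the delayed 'return' event fires

    def signal(self, now):
        """A turn signal arrived at time `now` (seconds)."""
        # The delayed return event: a full second without signals
        # drops the machine back to the resting state.
        if self.deadline is not None and now >= self.deadline:
            self._reset()
        if self.state == RESTING:
            self.state, self.count = COUNTING, 0
        if self.state == COUNTING:
            self.count += 1
            if self.count >= 4:
                self.state = TURNING
        # On entry to any non-resting state the delayed return event
        # is re-registered, replacing the previous one.
        self.deadline = now + TIMEOUT

    def _reset(self):
        self.state, self.count, self.deadline = RESTING, 0, None
```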

Conclusion and open questions

I’m fairly happy with the solution, but I’d be curious how other people, people more skilled in using Qt, would have solved the problem.

For a while I considered solving the skipping of four initial signals using a single state and a counter, but I saw no obvious easy way to implement that, so I instead opted to use separate states. Slightly wasteful of resources, but not too bad, and simplicity is important. I’m very curious to find out if there’s a simple way to implement it using a single state.

Before the weekend I started looking at using Qt on Windows. More specifically I wanted to know whether this combination could be an option for a sub-project at work. We need to develop a program for the Windows desktop, and due to the overall context it would make sense to write it in C++ (that’s what we use for another part of the project). We already use both Eclipse and Visual Studio in the project, but I strongly prefer Eclipse, so I was hoping to be able to use it. However, it seems that the Qt developers strongly favour their own tool Qt Creator, though there are (outdated?) integrators for both Eclipse and Visual Studio. I’d rather avoid introducing a third IDE into a project—two is already one too many in my opinion. Anyway, I think I managed to find an acceptable configuration of Eclipse without using that old Qt integration plugin together with the MSVC (I was using the gratis version of MSVC for this).

Qt setup

I decided to install Qt into C:\QtSDK, and then I made the following permanent changes to the environment:

> set QTDIR=C:\QtSDK\Desktop\Qt\4.8.0\msvc2010
> set QMAKESPEC=%QTDIR%\mkspecs\win32-msvc2010
> set PATH=%PATH%;%QTDIR%\bin;C:\QtSDK\QtCreator\bin

Starting Eclipse so that it finds the compiler

It’s slightly disappointing that Eclipse happily lets one create an MSVC project that isn’t buildable because it doesn’t know where the compiler is located. One easy way to remedy that seems to be to create a BAT file that sets up the proper environment for Eclipse:

Creating the project

Creating a “makefile” project in Eclipse is fairly straightforward: one needs a C/C++ project of the makefile type, made empty too so that there isn’t any cruft in the way. Then add a single source file, e.g. main.cxx:

When using Gnome3 I was really impressed with the support for multiple screens. Then I switched to LXDE and was very disappointed in that desktop’s support for multiple screens. In fact so disappointed that I sat down and read the man-page for ‘randr’ and hacked up the following script:

Ever since I heard the FLOSS weekly episode on 0MQ I’ve been looking for a reason to take a look at it. Well, to hell with reason, I’ll have a first look without any specific goal in mind.

I found a simple introduction to it in Nicholas Piël’s post ZeroMQ an introduction. The only issue was that it was based on Python, and Python2 at that. So here are my attempts at translating two of the clients to Haskell (using zeromq-haskell).

A little while ago shelltestrunner was announced on haskell-cafe. At the time I was slowly losing hope of ever getting decent test coverage in cblrepo using HUnit. Using something like shelltestrunner could be an easier and more workable solution, especially since what cblrepo needed most in the short term was a bit of integration testing.

shelltestrunner is basically just a tool that runs shell commands and compares output (both stdout and stderr) and the exit code. It’s also possible to provide data to be passed to the command on stdin. The documentation on the shelltestrunner home page is very good and accessible. There are only a few things that I’d like to add to it:

Use the --with (`-w`) flag; it’s very handy to avoid littering the tests with long paths to the output of your build environment.

There is no support for set-up and tear-down steps in the tests (in my opinion this would be a very nice addition to the tool), so anything that needs to be set up for the actual tests will itself have to be a test.

There is no way to name tests (which would be another good addition), so I found it crucial to organise the tests into several files.

Since I’m now working a bit with embedded systems I thought I’d take a look at compiling for one of the ARM-based machines that QEMU supports. I settled for VersatilePB after finding this old-ish article. Rather optimistically I thought that maybe, just maybe, things had changed in a year and the limitation on flash had been removed. How wrong I was.

I did find an easier way to get it working, though with the limitation that Linux has to be started via tftpboot or some other network-based fashion. The patch looks like this:

Having lived outside of Sweden for about a decade I’ve grown accustomed to non-Swedish keyboard layouts, first the US (as it’s widely used in The Netherlands) and later on the UK layout. Moving back to Sweden had me swearing over the layout used here within only a few days. The placement of “{[]}” is especially painful. Clearly the Swedish layout wasn’t designed for developers! Rather than go on muscle memory I decided to first attempt a small change to the X key mappings.

The XKB config

I started with removing all the configuration I’d previously put into /etc/X11/xorg.conf.d – if I’m to use per-user configuration then there should be no system-wide settings at all. Then I put the output of setxkbmap -print into ~/.xkb/maps/$(hostname) as a starting point. The main goal is to move the characters that require awkward single-hand combinations with AltGr to slightly more comfortable mappings. After a bit of experimentation I settled on the following (which I put in ~/.xkb/symbols/sedev):

A few days ago I created the page on LXDE and Xmonad in the Xmonad area of the Haskell Wiki. It’s very short, mainly due to it being very simple to set up. My config is a bit bare-bones at the moment though, and I’m sure others have more to contribute.

And yes! This means I’ve left the Gnome camp. Quite possibly for good.

The upgrade to Gnome3 in ArchLinux a few days ago broke my previous setup that combined xmonad with Gnome. Gnome 3 has a fallback mode, but I found that the instructions for replacing metacity under Gnome 2 no longer worked. With some help from the xmonad mailing list (in particular Jens Petersen and his efforts of providing a working setup on Fedora) I now finally have a working setup again. Here’s how I did it.

Add a session file for use by Gnome Session (/usr/share/gnome-session/sessions/xmonad.session):

Gnome 3 just hit the official ArchLinux repos a few days ago. It’s new, it’s slick, it’s shiny… but I don’t think it’s ready for general use just yet. It seems stable enough, but there’s just a few too many things missing to make it feel like it’s complete. Anyway, running Arch means that at times one has to live with not-quite-release-ready software anyway :-)

The biggest issue I’ve come across with Gnome 3, and especially Gnome Shell and the window manager, is configuring the themes. I was pointed to a fairly good article on customising the Gnome Shell, but it suggests modifying system files which is a bad thing to do even on single-user systems. So this post should be read as an addendum to that one.

First of all install the User Theme Gnome Shell Extension. The AUR packages available pull the source from its git repo because there don’t seem to be any releases of the extension just yet. When using the bleeding edge source I ran into problems with Gnome Shell crashing, so I advise against using it. I’ve had success with the source tagged at 3.0.1; you can find an Arch source package for Gnome Shell User Theme that I put together based on one of the AUR packages. Build and install that, then restart Gnome Shell (Alt-F2, r, return). Then verify that the extension has been loaded by using Looking Glass.

Then modify the file ~/.themes/Adwaita2/index.theme so that each mention of Adwaita says Adwaita2 instead, except for the cursor theme.

Make sure gnome-tweak-tool is installed (it’s in a package with the same name). Run it and change the shell theme to Default, the windows theme to Adwaita2, and the interface gtk+ theme to Adwaita2 as well.

As a follow-up to my earlier post on cblrepo I thought I’d convert a snapshot of ArchHaskell HABS to cblrepo. It’s mostly done as an exercise and to serve as an example. You can find it at http://www.kiwilight.com/~magnus/habs/.
Of course I have used it to build all the packages, and I still have the result of that around, so if anyone asks I just might upload that as well.

I just received an email with some praise for my earlier post on JSON in Haskell – it’s always nice to receive some praise ;-) However, the sender also mentioned that the mLookup function as coded there would blow up on incomplete JSON objects. That was by design, as a simplification, but the sender needed to deal with just that and asked if I had a more elegant solution than making every field in the data type a Maybe.

As I said, it’s always nice to receive praise, so here’s one solution that came to mind as I was reading the email.

I should mention that it relies on there being a reasonable default value for each type of the fields, and that the default is the same for all fields sharing a type.

First off, define a type class for types with default values:

class Defaultable d where
  def :: d

Then modify mLookup so that it uses Defaultable. I renamed it to mLookupAndReadJSON:

mLookupAndReadJSON a as = maybe def readJSON (lookup a as)

Now we need to provide some instances of Defaultable too. I limit this example to cover only GlossDef, so only the following instances are required:

Maintaining a large set of Haskell packages for a Linux distribution is quite a chore. Especially if one wants to track Hackage as far as possible. Several distributions have tools to automatically convert Cabal-based packages into distribution packages, e.g. cabal2arch for ArchLinux and cabal-rpm. They are just conversion tools though, and the most time-consuming activity in maintaining Haskell packages is resolving and verifying dependencies.

At least that was my experience when I was actively involved in ArchHaskell. I only saw two options when adding or upgrading a package, either I worked out dependencies manually, or I simply tried it out. Neither of them was very appealing, and both were very time-consuming. It seemed obvious that I needed some tool to help out.

Enter cblrepo!

It allows me to maintain a database of specific versions of packages, and when I want to upgrade a package, or add a new one, it’ll verify that all dependencies can be satisfied. In other words, it helps me maintain a buildable set of packages at all times.

The tool also has some functionality that helps in tracking Hackage as packages are updated there.

Something about how it works

At the moment I maintain a small repository of Arch packages, mostly just to try out cblrepo and convince myself that it works. The work environment contains a database and a directory of patches:

% ls
cblrepo.db patches/
%

The database is a cleartext file containing the information on the packages. It’s basically just a dump of the related Haskell datatype, encoded in JSON. The patches directory holds patches for Cabal files and PKGBUILD files. They must be named patch.cabal.<hackage name> or patch.pkgbuild.<hackage name> in order to be picked up by cblrepo.

There’s also an application directory (~/.cblrepo) for caching info about the packages available on Hackage:

% ls ~/.cblrepo
00-index.tar.gz
%

How to use it

A session with cblrepo looks something like this. First we update the information about what packages are available on Hackage:

Just re-run that without the -n to actually perform the bump. Now that all this is done we need to generate the files necessary to build the Arch packages. We can easily check what packages need re-building, and get a good order for building them:

Your comments, please

When trying to maintain a set of binary packages of Haskell libraries for a Linux distribution there are a few issues that come up:

The set of packages must be compilable at all times, and

Updating one package requires all packages that depend on it, in one or more steps, to be re-compiled.

The first requires keeping track of all dependencies of the packages in the set and making sure that they are satisfiable at all times. For a while I was doing this by simply attempting to compile all updated packages and checking for breakages, which was both time-consuming and painful when build failures had to be resolved.

The second requires bumping the package release number for all packages that are reachable when following the dependencies in the reverse direction. Doing this manually is tedious and very error prone in my experience.
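The reverse-dependency bump can be sketched in a few lines of Haskell. This is only an illustration of the idea, not cblrepo’s actual code; the function names and the example dependency map are made up:

```haskell
module Main where

import qualified Data.Map as M
import qualified Data.Set as S

-- Map from package name to its direct dependencies.
type DepMap = M.Map String [String]

-- All packages that, directly or indirectly, depend on the given
-- package, i.e. the ones whose release number needs a bump.
needBump :: DepMap -> String -> S.Set String
needBump deps p = go (S.singleton p) p
  where
    revDeps q = [ r | (r, ds) <- M.toList deps, q `elem` ds ]
    go seen q = foldl step seen (revDeps q)
      where step acc r | r `S.member` acc = acc
                       | otherwise        = go (S.insert r acc) r

main :: IO ()
main = print (S.toList (needBump example "text"))
  where example = M.fromList
          [ ("text", []), ("parsec", ["text"]), ("yesod", ["parsec"]) ]
```

Running it prints ["parsec","text","yesod"]: the updated package itself plus everything reachable when following the dependencies in the reverse direction.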

Of course it ought to be possible to make this a lot easier with the help of a tool. The last few days I’ve been writing such a tool. This is how I’ve been using it so far.

Building the initial database

GHC in ArchLinux ships with a few Haskell libraries and ArchLinux also has a few Haskell packages in its base repositories. Since I don’t need to maintain any of those packages I decided to treat these as a sort of base. Adding those is as simple as this:

Since haskell-platform isn’t on Hackage it requires special handling. The reason why datetime is excluded is slightly different. It’s the only package that requires the old base (version <4). GHC in Arch does ship with both old and new base so datetime can be built, but cblrepo can’t deal with two versions of the same package. This is a limitation, but I’m not sure it’s worth fixing since base is the only library that comes in two versions, and datetime is the only package that hasn’t been updated to use the new base.

Knowing this it’s easy to add all the ArchHaskell packages to the database:

The way to read this is that, first of all, there is a missing dependency to satisfy for neither itself, and second, there are two packages, yesod and persistent, that wouldn’t be buildable if neither were updated.

Now if it were possible to update neither, what packages would require a bump?

I’m impressed that you sent me an email asking for my input when someone wanted to use the project name bunny just because I happen to have a project called bunny on SourceForge1. However, neither your email, nor the page where I can record my opinion gives me enough information to decide. Please consider at least including in future emails the description of the proposed project.

I just watched Joshua Bloch’s talk Effective Java - Still Effective After All These Years. I’m not a Java developer1 but I still found the talk very interesting. Mr Bloch offers tips and tricks to deal effectively with a few aspects of Java, and I’m sure many a Java developer out there would find that part very interesting. For me however, the most interesting part was the appetizers and the dessert :-)

The appetizer and dessert consisted of three puzzlers. A puzzler is a piece of code that highlights some peculiarity of the language or standard libraries. The puzzlers in this talk were as follows:

Compile the second version of Words.java alone and then run PrintWords, what is the result and why?

Any puzzlers for Haskell?

Of course I couldn’t help but wonder what puzzlers there are for Haskell. Do note though that puzzlers aren’t just obfuscated code; they are readable code that you think does one thing but in reality it does something else. I’d really like to read any Haskell puzzlers you can come up with. Post them on your own blogs, or as comments to this post.

NB I should probably mention that I really don’t want answers to the puzzlers. I’ve watched Josh Bloch’s presentation, and I think anyone interested in finding out should watch it for themselves.

If I ever find myself in a situation that calls for Java I’m very likely to spend some time looking at Scala :-)↩

Recently I’ve battled with OpenSSL at work. One thing I needed to do was add name validation to a program that previously hasn’t had it. In an attempt to avoid obvious mistakes I went looking for existing examples of how to do it. I came across some code from SecureProgramming.com; it can be found in the code from the book in “/spc-1.1/chapter10/8-unix.c”. Too bad that only part of the code actually works as advertised. On top of that, the working part uses old functions that remain in the API only for backwards compatibility.

In trying to fix up that code I wrote the following little example code for extracting CN and subjectAltName:

I am very keen on keeping my private and work information separate. E.g. I would never read my personal and work email in the same MUA, instead I read work email in Thunderbird and the few times I read private email during working hours I do that using the web interface to GMail. At home it’s the other way around, Thunderbird for personal email, and a web interface to read work email. I used to have a similar setup for my browsing to keep bookmarks and saved passwords for the different areas of my life separate. Firefox was my work browser and Epiphany was my personal browser.

With the recent move to webkit I noticed that there are a few bits of Epiphany that really bug me though. Especially its inability to remember passwords; on my Eee it’s just a killer not to be able to do that. So, I decided to take a look at Firefox again, especially to see whether there are any add-ons that would help. And there are. These are the add-ons I found useful for this:

Profile Manager and Synchronizer

The most important piece of the setup is the add-on Profile Manager and Synchronizer. It makes it easy to have more than one instance of Firefox running at the same time, with a different profile active in each one.

At first I tried synchronising profiles via dropbox, but that resulted in a lot of updates each time so I quickly stopped. I can recommend using it once though, to get the profiles to all the computers in the first place.

The plugin author says there will be a version that works with 3.6 soon. In the meantime I can report that I’ve had no issues with manually modifying the version range just to get it to install.

Xmarks

Since I don’t synchronise my profiles I do need to synchronise my bookmarks, and for that I use Xmarks.

Diigo

Diigo is a social bookmarking site. There seem to be a dozen or so of those, but there are a couple of things that make Diigo different.

With the plugin I can easily store away pages for reading at some later date. In the past I’ve had a bookmark folder, or slightly more recently a tag, that I used to mark up pages that I’d like to take a closer look at. I’ve stopped that completely, and now I just mark pages as unread in Diigo. Just another way of reducing the clutter among my bookmarks.

Probably the coolest feature is commenting on webpages. I mostly use that to add private comments to web pages, e.g. when I do some research into some topic (so far it’s mostly been for items I’m considering buying), but it’s also possible to make public comments. I’ve found it useful on more than one occasion to have a quick look through the public comments other people have put on pages.

The condensed version: I’ve found both YogaStudio and Yoga United to be excellent online stores for yoga equipment.

Longer version

About a month ago I decided to finally invest in a yoga mat. After a bit of research I found the prAna Revolution, an extra wide, extra long mat made of natural rubber. I decided to order it from YogaStudio. It was completely uneventful and slightly quicker than I expected. So two big thumbs up for YogaStudio.

After receiving the mat I realised I really would need a bag for it. I found it very difficult to find a bag that would fit my slightly over-sized mat. Finally I stumbled on Yoga United, who had a good price on an extra long bag made of cotton. They also delivered quicker than expected; unfortunately they didn’t ship the one I actually ordered. What I got was the smaller size bag, too small for my mat, but it fit my wife’s mat perfectly and she wanted to keep it. Instead of the hassle of sending it back, I agreed with the lady at Yoga United that it would be simpler for them to just let me order another extra long bag and keep the one I had. The second bag arrived the next day. Again brilliant service.

Finally, I can really recommend the mat I bought. Yes, it’s pricy, but so far I’ve found it to be brilliant. The bag, you ask. Well, the mat doesn’t really want to stay rolled up, if I put it that way. Also, cotton isn’t a material that natural rubber slides easily on. It isn’t that hard to get the mat into the bag, but it helps to be patient :-)

Since no one on the Gnome mailing list seems to be able to answer these questions I thought I’d try some other venues for getting them answered. The audience for my blog isn’t that big, but just maybe there’s someone out there who knows the answers to these questions related to Gnome configuration. Mail one and mail two.

1: Running GUI tool after NM has brought up network

I run dropbox on my laptop, but their software is crap at handling the network coming up only after the dropbox service has started ((I’ve noticed no problem when the network goes away and then comes back, dropbox picks that up just fine. But if the network isn’t there to start with, that it can’t handle. I’m somehow at a loss how to write a program that handles the former but not the latter. :-) ))

I know of the possibility of dropping a file in /etc/NetworkManager/dispatcher.d/, but that doesn’t work in this case since I need the program to run “in my desktop”. Well, at least I haven’t managed to get the dropbox server to throw up an icon in my Gnome tool bar like it should, unless I run it from inside the desktop environment.

Any suggestions on how to solve this problem?

2: Changing background of the Gnome screensaver

I think Gnome comes with quite possibly the ugliest background for a screensaver I’ve ever seen. It’s a close-up on a green leaf or something. Absolutely hideous. I want to change it. To something nice, like a solid black. Actually just about anything else would do. But how?

What do you think the following script ((The script is somewhat artificial, who would ever use the construct true | while ...? I’ve used this just to show the point while keeping the examples short. Feel free to replace that part with something more useful, like cat myfile | while read ....)) will print?

This is another one of those posts that I make mostly for myself, you know, for organising things and helping my memory :-)

There are as far as I can see three ways to deal with sockets in Haskell. There’s the type Socket which is used throughout Network.Socket. From that it’s possible to get to the underlying filedescriptor, and it in turn can be converted to a Handle.

When coupled with fork+exec it’s crucial to make sure the child process can find the socket. Leaving it in a predictable place seems to be the easiest way to do that, and as far as I can see that requires using dupTo from System.Posix.IO. So, on the child side it’s necessary to find a way to turn an integer (CInt) into something that can be treated as a socket (i.e. a Socket, a Handle, or a file descriptor).

A basic parent-child which obviously won’t work since the child’s socket is represented as a Socket:
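The child side of the idea can be sketched like this. It’s only a sketch under the assumption that the parent has dupTo’d the socket’s descriptor onto a known number (here 3) before the exec; childSide and the choice of fd 3 are mine:

```haskell
import System.IO
import System.Posix.IO (fdToHandle)
import System.Posix.Types (Fd(..))

-- Child side only: the parent is assumed to have called
-- dupTo on the socket's descriptor so it ends up as fd 3
-- before exec'ing this program. The child then just wraps
-- the known descriptor in a Handle.
childSide :: IO ()
childSide = do
    h <- fdToHandle (Fd 3)
    hSetBuffering h LineBuffering
    hPutStrLn h "hello over the inherited socket"
```

Using fdToHandle avoids the question of reconstructing a Socket entirely; a Handle is enough for line-oriented traffic.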

With Python being dropped as a language for extensions in epiphany 2.28 I needed to replace epilicious. I tried writing it in JavaScript (seed was integrated in 2.28), but I gave up due to hitting too many hurdles on the way. Instead I decided to rewrite epilicious using Vala and a minimal layer of C.

It turned out to be very doable, despite epiphany’s extension API lacking Vala bindings (Cosimo Cecchi, I’m looking at you :-) ). Basically I did the following setup:

Add a rule in the extension’s Makefile.am to generate a C header file for all the Vala code, for use from C.

Then I started writing the actual extension. I did the minimal amount of work in C, trying to escape as soon as possible into Vala. In the few places I needed to call from Vala into C I would declare a delegate in Vala, and pass a function from C.1

I should probably point out that while epilicious never could be called polished, BMS is even less so. I might find the time to make it multi-threaded, so that it’s possible to do some sort of progress dialogue, but don’t hold your breath. In the back of my mind is also the thought of rewriting it yet again, in JavaScript/seed.

The format of .vapi files is unknown to me, and I haven’t managed to find much documentation. Using function pointers seemed like an easier way, especially given that I needed this in exactly 3 places.↩

A few days ago I noticed that there were a few Haskell packages on AUR that had received updates. This was the excuse I had been waiting for to address the second part of keeping my Haskell packages up-to-date (I’ve written about the first part before, dealing with an update to GHC).

It’s easy to find the packages with available updates:

% yaourt -Su --aur

Unfortunately it isn’t as easy as just updating the listed packages. Any package that depends on an updated package really should be re-compiled and re-installed to guarantee that the entire system behaves as expected after the upgrade. Of course pacman can handle it:

% pacman -Rcn <all pkgs with updates>

This will list all packages that will be removed. After removing them, they can all be re-installed. That is of course not quite as nice as it could be, since they will all then be explicitly installed; it would be nicer to just re-install the “top-level packages”. Here is one way to achieve that.

I did a bit of refactoring and put the Arch-related functions from the previous post in their own module, Arch. Then I added a function that takes a package and recursively inspects Required By until a “top-level package” (i.e. a package that no other package requires) is reached:
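The shape of that recursion can be sketched like this; requiredBy stands in for whatever queries pacman for the packages requiring a given one, and the names are my own, not the actual code:

```haskell
-- Follow "Required By" upwards until packages that nothing
-- requires are reached. The first argument is assumed to ask
-- pacman (e.g. via its -Qi output) which packages require p.
topLevelPkgs :: (String -> IO [String]) -> String -> IO [String]
topLevelPkgs requiredBy p = do
    rs <- requiredBy p
    if null rs
        then return [p]            -- nothing requires p: top-level
        else fmap concat (mapM (topLevelPkgs requiredBy) rs)
```

The result may contain duplicates when several paths lead to the same top-level package, so a nub at the end is in order before feeding it to yaourt.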

The previous post was a fairly silly example, unless of course it’s more useful than I realise :) However, here’s something that I can see a bit more use of, a monad that restricts reading and writing of files to two files, one to read from and one to write to.

This defines a type wrapping a function that takes a pair of handles (one for input and one for output) and returns an “IO action”. Turning this into a monad is straight forward (actually it’s similar to the Reader monad):

To return a value we can simply drop the pair of handles and return the value in IO. Bind (>>=) only looks complicated: what happens is that the first argument is “executed” with the provided handles, then the second argument is passed the result and executed with the same pair of handles. Of course the handles aren’t actually known yet, so an anonymous function is created, and wrapped in an instance of TwoFileIO. That’s it for the most complicated part.
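Put into code, the description above comes out something like this. It’s a reconstruction, so the names (TwoFileIO, runTwoFileIO, the primitives) are my own guesses at what the original looked like:

```haskell
import System.IO

-- A computation that, given a pair of handles (input, output),
-- produces an IO action.
newtype TwoFileIO a = TwoFileIO
    { runTwoFileIO :: (Handle, Handle) -> IO a }

instance Monad TwoFileIO where
    -- drop the handles and return the value in IO
    return x = TwoFileIO $ \_ -> return x
    -- run the first computation with the handles, feed its result
    -- to f, and run the resulting computation with the same handles
    m >>= f  = TwoFileIO $ \hs -> do
        v <- runTwoFileIO m hs
        runTwoFileIO (f v) hs

-- The only file operations exposed: reading from the input handle
-- and writing to the output handle.
readLine :: TwoFileIO String
readLine = TwoFileIO $ \(hIn, _) -> hGetLine hIn

writeLine :: String -> TwoFileIO ()
writeLine s = TwoFileIO $ \(_, hOut) -> hPutStrLn hOut s
```

On a modern GHC you’d also need Functor and Applicative instances, but at the time a Monad instance alone was enough.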

In order to avoid having to manually open files and wire everything up I wrote the following convenience function:

I’ve many times heard that Haskell can be used to prevent certain kinds of programmer mistakes. In a presentation on Darcs it was explained how GADTs (especially phantom types) are used in Darcs to make sure that operations on patches follow certain rules. Another way, which at least sounds easier, is to limit the available functions by running code in some sort of container. This being Haskell, that container is often a monad. I’ve really never seen this presented,1 so I thought I’d try to do it, and indeed it turns out to be very simple.

A few days ago I decided to explore the idea of using iteratee to do IO in Haskell. I read most of what Oleg has written on input processing using left-fold enumerators. Only a little wiser I took a look at the Iteratee IO package on Hackage. Unfortunately it still hadn’t quite sunk in. To be honest I couldn’t quite make heads or tails of it. Often just lining up the types properly will just work, even if I don’t understand why ((This reminds me of a math professor I had at university who said something like: “When it feels like the pen understands more than your head, persevere! It means you are close to getting it.”)) and soon after I usually gain some sort of understanding. This strategy didn’t seem to work with this particular package though :(

Somewhat based on Don’s answer to my question on Stackoverflow.com I thought I’d try to work through an implementation of my own. I just really hope I’ve got it right :)

I must admit I used Iteratee for inspiration at times. However, I couldn’t copy it straight off since I decided to first implement iteratee without involving monads. Sure, it’s rather silly to do “input processing” in Haskell without involving monads, but I find that including monads often clouds the problem at hand. So, I left them out to begin with, and add them again once I feel I know what I’m doing. So, here goes nothing…

The basic idea is to process a stream of data, presented in chunks. Each chunk is handed to an iteratee by an enumerator. For each chunk the iteratee signals to the enumerator whether it’s done or requires more data. These are the types I cooked up for this.
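One possible encoding of those types is the following; the constructor names are mine (the Iteratee package on Hackage uses different ones):

```haskell
-- A chunk of data, or the end of the stream.
data Stream c e
    = EOF
    | Chunk (c e)

-- What an iteratee signals back to the enumerator: either it is
-- done (with a result and any unconsumed input), or it needs more
-- data and offers a continuation.
data IterResult c e a
    = Done a (Stream c e)
    | Cont (Iteratee c e a)

-- An iteratee is a function from a stream to such a result.
newtype Iteratee c e a = Iteratee
    { runIter :: Stream c e -> IterResult c e a }
```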

I found it rather useful to implement Show for the first two, but I’ll leave that out of this post since it’s a simple thing to do.

I should probably point out that the container type for the stream (that’s the ‘c’ in Stream c e) is rather pointless in what I’ve done; it’s always going to be [] in this post. Keeping it in does provide some more similarity with the Iteratee on Hackage.

At this point I jumped to turning a list into an enumerator. The way I implemented it the list is split up in chunks of 3 items, and each chunk is presented in order to the passed-in iteratee.

As Oleg pointed out in his writings it turns out that Iteratee c e is a monad.

instance Monad (Iteratee c e) where
    return x = Iteratee $ \s -> Done x s

The implementation of return is obvious; there really isn’t any other option than to encode it as a continuation that returns Done irrespective of what is offered, and passes along the rest of the stream no matter what’s passed in. Bind (>>=) is a bit more complicated:

My understanding is that the left iteratee should be stepped along until it returns Done, at that point the result is passed to the right-side function, which results in an iteratee. The rest of the stream is then passed on to the new iteratee.
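That stepping can be written out like this (I repeat the types so the snippet stands on its own; as before the names are my own, and this is a reconstruction rather than the exact original):

```haskell
data Stream c e = EOF | Chunk (c e)

data IterResult c e a
    = Done a (Stream c e)
    | Cont (Iteratee c e a)

newtype Iteratee c e a = Iteratee
    { runIter :: Stream c e -> IterResult c e a }

instance Monad (Iteratee c e) where
    return x = Iteratee $ \s -> Done x s
    m >>= f  = Iteratee $ \s ->
        case runIter m s of
            -- left side finished: hand its result to f, and feed
            -- the new iteratee the rest of the stream
            Done a rest -> runIter (f a) rest
            -- left side needs more data: keep stepping it, binding
            -- f onto whatever continuation it offers
            Cont k      -> Cont (k >>= f)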

The time has come for the final installment of this series of “discussions of a refactoring”. These are the earlier installments. This is where I finally add the ability to collapse a list item. This is a rather terse description of the changes, since I feel all of them are fairly obvious, and hence require no lengthy explanation.

First the Option type has to be changed to keep track of whether an item is collapsed or not:

Next the rendering of an item has to be changed, so that collapsed items really appear collapsed. I thought displaying a collapsed item as its first line, with “…” added to the end would be acceptable for a first version:
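The rendering rule itself is tiny; stripped of the Option type and the Pretty machinery it amounts to something like this (the function name is mine):

```haskell
-- Render a collapsed item as its first line, with "..." marking
-- the truncation. Single-line items are left untouched.
renderCollapsed :: String -> String
renderCollapsed s = case lines s of
    []      -> ""
    [l]     -> l
    (l : _) -> l ++ " ..."
```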

One thing that I didn’t think about until after doing some manual testing was that moving the cursor up in the list should always put the cursor on the line above, even when moving from one item to the previous. This was a bug in the previous version :-)

It’d be really nice to be able to, after the fork, close all open file descriptors in the child. But how can I find all the open file descriptors in a process? Ideally it should be fairly portable, though portability to major Unix/Linux systems is enough for me.

The other day I wanted to experiment a bit with the JSON interface to AUR. Of course my first stop was at HackageDB to look for a Haskell package for parsing JSON. There are several of them, but only one that seemed suitable for some quick experimentation, especially I wanted to avoid pre-defining data types for the objects in the JSON interface. That failed however and I ended up switching to Python. It did bother me though, and later on, when I had some more time I decided to have another look at json. I was also helped by Don’s recent work on wrapping up the AUR JSON interface in Haskell.

As can be seen the “names” of the members are derived from the field names in the datatypes. Great for when you are designing new JSON objects, not when you are writing code to parse an already existing object. For that there is another, more verbose way to do it.

Start with the same data types, but without deriving Typeable and Data:
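A sketch of what the more verbose style can look like with the json package follows. The AurPkg type is made up for the example; “Name” and “Version” are member names in the style of the AUR JSON interface:

```haskell
import Text.JSON

data AurPkg = AurPkg
    { pkgName    :: String
    , pkgVersion :: String
    } deriving Show

instance JSON AurPkg where
    -- Parsing: pick out members by their existing names, rather
    -- than having them derived from the Haskell field names.
    readJSON (JSObject o) = do
        n <- valFromObj "Name" o
        v <- valFromObj "Version" o
        return (AurPkg n v)
    readJSON _ = Error "AurPkg: expected a JSON object"

    -- Encoding: build the object explicitly with the same names.
    showJSON p = makeObj
        [ ("Name",    showJSON (pkgName p))
        , ("Version", showJSON (pkgVersion p))
        ]
```

With this in place decode and encode from Text.JSON work on AurPkg directly, and the member names no longer have to match the record fields.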

Arch is somewhat of a hybrid distribution in the sense that if you have any sort of ‘peculiar’ needs then you are likely to have to build packages from source. As expected developing in Haskell is a “peculiar need” :-)

After every upgrade of GHC I find myself in the situation where the system (pacman) and GHC have different views of what packages are available. What is needed then is somehow finding out the difference, and this is how I found that difference after the recent 6.10.3 -> 6.10.4 upgrade of GHC. Once I know what packages are missing from GHC’s view of the world I can use pacman to first remove and then yaourt to rebuild the packages.

First I noted that the old package.conf file wasn’t deleted during the upgrade, apparently pacman noted the changes that installing packages resulted in and saved the file as package.conf.pacsave. Finding the name of all the ‘missing’ packages was then as simple as loading both /usr/lib/ghc-6.10.3/package.conf.pacsave and /usr/lib/ghc-6.10.4/package.conf, filter out the package names and take the difference:
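Once the two lists of package names are in hand the difference itself is a one-liner; this sketch assumes the names have already been extracted from the two package.conf files:

```haskell
import Data.List ((\\))

-- Packages known to the old GHC but not to the new one,
-- i.e. the ones that need removing and rebuilding.
gonePackages :: [String] -> [String] -> [String]
gonePackages old new = old \\ new
```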

That isn’t the most useful output however, so I decided to modify it to print out the name of the Arch package that needs re-compilation. The following function generates the name of the .hi file of the first module in the Haskell package, and then uses pacman to look up the owner of that file:

Now I can find the list of Arch packages that aren’t known to the new version of GHC by mapping ghcPkg2ArchPkg over gonePkgs. In other words, that is the list of packages that need to be removed, but it can be different from the list of packages that need to be rebuilt with yaourt (basically I only want to tell it to build and install the ‘top-level’ packages, i.e. packages that aren’t dependencies of any other packages). It’s of course possible to build that second list from the first one, with the help of pacman.

This is the third part, and it’s likely to be the longest one in the series. The three previous parts have been rather short, but now it’s time for a longer post because in this one I completely change the representation of the options that are rendered.

Instead of using a list and an integer I’ll use what is basically a zipper (with some extra fields for book keeping). At the same time I also add a new field to the Option type to keep track of how many lines the option renders to. At the moment it will always be one line, but the next part will actually make use of it. (Yes, that part probably should have been kept in a separate part, but this happens to be how I wrote the code.)

First some changes to the Option type and its implementation of Pretty:

Now it’s time to introduce the zipper that replaces the list of options. The basic idea is that there are two parts to a list, the left side (ozLS) and the right side (ozRS), and a current item. In this list zipper the current item is the first item on the right side:

Usually a list zipper has functions to move the cursor, i.e. move items between the left and right sides. In this zipper there is some extra bookkeeping that has to be done to make sure that the index is correct and that the current item has a correct range:

That’s all good and well, but what I really need is to be able to navigate based on lines. Expressing that using ozLeft and ozRight is fairly straight forward. Let’s start with shifting to the next line, ozNextLine, it has two cases, one general case and one when the cursor points to the last item:

Anyone who pays attention will realise that this definition of ozNextLine isn’t complete. The zipper is capable of pointing to the empty spot after the last item (when ozRS is [], as would be the case for an empty list turned into a zipper). For this occasion that is all right, but this would need some attention when using this in a proper program.

The introduction of the zipper requires large changes to both getChoice and _getChoice. The changes are however very straight forward and in my opinion they make both functions easier to read and understand. I’ll simply copy in the definitions without any comments in the hope that thanks to using a zipper the code is self-explanatory :-) It might be worth pointing out though that render is still passed a list of strings to render, so it requires no changes at this point.

Yesterday I needed to do some tests involving tunneling of network connections. Rather than firing up the full client-server setup that I want to tunnel I thought I’d use something simple to test with first. Instead of looking online for a simple server to use, or hacking one up using netcat, or even hacking one in Python, I decided to hack one in Haskell:
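In that spirit, a minimal line-echoing server using the network package’s Network module (the then-current, since deprecated, high-level API) might look like this; the port number is arbitrary:

```haskell
import Control.Concurrent (forkIO)
import Control.Monad (forever)
import Network
import System.IO

-- Accept connections forever; each client gets its own thread
-- that echoes back every line it sends.
main :: IO ()
main = withSocketsDo $ do
    sock <- listenOn (PortNumber 4000)
    forever $ do
        (h, _host, _port) <- accept sock
        hSetBuffering h LineBuffering
        forkIO $ forever (hGetLine h >>= hPutStrLn h)
```

Good enough to point a tunnel at and poke with netcat from the other end.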

I don’t know how many times I’ve gone looking for one of these but my search-fu is weak and I always give up, instead resorting to manual editing in Vim (no I hardly ever need the entire file to be pretty, only one or two tags that I’m interested in). Anyway, here’s a quick hack in Haskell, relying on xml for the heavy lifting:
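The whole hack fits in a handful of lines; a sketch of it, using the xml package’s Text.XML.Light, could be:

```haskell
import Text.XML.Light

-- Read an XML document on stdin and print it re-indented,
-- letting the xml package do the heavy lifting.
main :: IO ()
main = interact $ \s ->
    case parseXMLDoc s of
        Just el -> ppTopElement el
        Nothing -> error "not well-formed XML"
```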

After posting the zeroth part of this series I realised I hadn’t said anything about the final goal of this exercise. The first post contained code for choosing one one-line string (String) out of a list of one-line strings ([String]). What I really want is the ability to choose one item out of a list of items, where each may render to multiple lines. It would also be really cool if an item could be collapsed and expanded in the rendering. This is the first step in my journey towards these loosely specified requirements.

Rendering items into strings sounds like pretty-printing to me, so I played around a little with a few pretty-printing libraries. Finally I settled on the Wadler/Leijen Pretty Printer (Text.PrettyPrint.Leijen). I didn’t really have any strong reason for choosing it, beyond that it comes with its own type class whereas the pretty-printing library that ships with GHC (Text.PrettyPrint.HughesPJ) doesn’t (though there is a package on HackageDB with a class for it).

I did the smallest change I could think of to add pretty-printing. First the module needs to be imported of course:

import Text.PrettyPrint.Leijen

Then I added a function to turn a list of items into a document (Doc) where each item is pretty-printed on its own line:

myPrettyList :: Pretty a => [a] -> Doc
myPrettyList = vcat . map pretty

I then decided that _getChoice should be left unchanged and instead modified getChoice to turn the list of items into a list of strings:
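The essence of that change is small: pretty-print the items and split the result into the lines _getChoice already understands. A sketch (renderOptions is my name for it; the Show instance on Doc renders at a default width, which is good enough here):

```haskell
import Text.PrettyPrint.Leijen

myPrettyList :: Pretty a => [a] -> Doc
myPrettyList = vcat . map pretty

-- Turn the list of pretty-printable items into the list of
-- strings that the unchanged _getChoice expects.
renderOptions :: Pretty a => [a] -> [String]
renderOptions = lines . show . myPrettyList
```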

Since the Python extensions to Epiphany have been removed from the repository I thought it was high time to start playing with what seems to be the replacement to Python extensions: Seed extensions. The first step is of course to get a version of Epiphany that supports Seed extensions. After a few emails on the mailing list I’ve come to a recipe (I’ve done this twice now on different machines so I’m fairly confident it works). I should probably preface this by saying that I run an up-to-date Arch system, if you run something else you might need to do a bit more, or less if you’re lucky :-)

Make sure the following packages are installed ((You might need a few more packages depending on what desktop environment you use. Those were the packages I needed to add to my machine where I run Gnome and do regular non-Gnome development)): libsoup, libwebkit, gnome-common, intltool, libffi

Then configure, build and install everything. Use the autogen.sh script to create the configuration, and make sure to pass it prefix=$prefix each time. Some of the modules need extra arguments as well. This is the order and extra arguments I used:

gnome-js-common (--disable-seed --disable-gjs)

seed

gnome-js-common (--disable-gjs)

gobject-introspection

gir-repository

epiphany (--enable-introspection --enable-seed)

epiphany-extensions

After that you can put your extensions in ~/.gnome2/epiphany/extensions/. I have two instances of Epiphany installed, a stable from my distro, and the dev version I built myself. I haven’t managed to run them both side by side, but beyond that there seems to be no bad interaction between them.

I haven’t had much free time lately, which means I haven’t written much non-work code. The only exception is some experiments with a small piece of Haskell code using the Vty module. Many moons ago I wrote a small piece of code that lets the user choose options from a list in a terminal. Somewhat similar to what you get using dialog --menu ..., but of course a lot more limited and less good looking.

Anyway, over the last few weeks I’ve slowly expanded it in a direction that would be useful if I ever get around to work on yet another of those projects that so far only exist in my head :-)

I’ve kept the transformations in a stack of patches using quilt, and I thought I’d write a little about them. Not because I think they are extremely useful or even good in any way, but more because I really need to get back to writing some blog posts ;-)

This is the zeroth post containing the version I put together when I first came across Vty. It is an executable program so it starts with the familiar

The options are, in this version, represented as a list of strings. For now it’s enough to have a nonsensical list of unique strings.

options = [ show i ++ " Foo" | i <- [0..59] ]

The main function is as small as possible, just two lines: the first creates an instance of Vty and the second gets the choice and feeds it into print.

main = do
    vt <- mkVty
    getChoice vt options >>= print

Of course one would think that getChoice would be the meat of the code, but it is also short. After getting the size of the terminal it calls _getChoice, which is the meat of the code. The reason for this split is the handling of resize events.

The main part of _getChoice is straight forward: first update the terminal, then wait for an event, and finally handle the event. Unless the user wants to exit (pressing enter chooses an item, pressing escape exits without a choice) a recursive call is made to _getChoice with slightly modified arguments.

Probably the most complicated part is the calculation of the top of the list of visible items. The idea is that if the list has more items than there are lines in the terminal then the cursor moves down until the middle line, once there any down movement will result in the list scrolling up. This continues until the end of the list is visible, at that point the cursor moves down towards the last line in the terminal. I doubt that explanation makes sense, hopefully it’ll be clear to anyone who bothers running the code.
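One way to express that calculation as a single clamped expression is the following sketch (the function name and exact arguments are mine, not the code’s):

```haskell
-- Index of the first visible item: keep the cursor on the middle
-- line while scrolling, clamped so the list never scrolls past
-- its last item or before its first.
calcTop :: Int  -- number of lines in the terminal
        -> Int  -- cursor's index in the list
        -> Int  -- length of the list
        -> Int
calcTop height cursor len =
    max 0 (min (cursor - height `div` 2) (len - height))
```

For 60 items in a 10-line terminal this gives top 0 for cursor 3 (cursor moves down), top 15 for cursor 20 (cursor pinned to the middle while the list scrolls), and top 50 for cursor 58 (end of the list visible, cursor moves towards the last line).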

The final piece is the code that renders the list. The items of the list are zipped together with a list of integers. Each such tuple is then rendered into a line,1 where the line of the cursor is highlighted. The resulting list of rendered lines is then folded into a full image.

I read all my email via IMAP, and send all email via external SMTP servers, but at times it’s really useful to be able to run something like /usr/sbin/sendmail to send an email. Many tools depend on it, for me I want reports on cron jobs and I like using darcs send straight to an email address. For this a full-fledged SMTP server like exim/postfix/sendmail clearly is overkill. First I had a quick look at ssmtp but it only allows for a system-wide configuration file. Ideally I’d like “system-mails” (like cron) to be sent via my ISP while more personal emails should be sent via my Google account. After a suggestion on the Arch mailing list I checked out msmtp and it was exactly what I was looking for.

I then found out that dcron is severely limited; it doesn’t honour MAILTO in crontabs and it’s not possible to use anything but /usr/sbin/sendmail to send emails. The lack of support for MAILTO is the deal breaker in this case.

Rather than create a symbolic link I modified fcron’s configuration file, /etc/fcron/fcron.conf, to use /usr/bin/msmtp:

sendmail = /usr/bin/msmtp

I also made sure to stick MAILTO=my.email@myisp.com in the system crontab, using fcrontab -u systab -e.

At first I wrote this as a comment on my original post, but it grew a bit too long so I decided to post it like this instead.

@Randal, “Static typing is still effectively a syntax check, not a semantic check though.” Now, I think we might have a different understanding of the word “syntax”. I can guess at yours, but my understanding mirrors what’s on Foldoc: syntax determines what is a valid string in a program. In this case it determines where a type declaration/definition/statement goes, but not which type declaration/definition/statement is semantically valid in a particular position in a string. That “which” is very much part of the semantics in my mind; it tells the world (me, other users of my library, the compiler) a part of the intention of a function, it tells how something may be used. I gather you are mostly familiar with type systems like Java’s, so which is clearer to you, this declaration:

int add(int i, int j);

or this declaration:

int add(i, j);

In the former the intent of the function is obvious, it adds integers. In the latter it isn’t, does it handle complex numbers? Does it handle matrices? In dynamic languages you’d have to document it somewhere, but there is no consistency check between code and documentation (maybe there are external tools for that though, but why not use the compiler to check?). You would also have to test that no caller of this function can be tricked into calling add with a non-integer.
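To make the contrast concrete in a language with a richer type system, here is a hedged Haskell rendering of the first declaration; the signature alone tells callers, and the compiler, that only Ints are accepted:

```haskell
-- The type signature documents intent and is machine-checked: calling
-- add with anything but Ints is rejected at compile time, so no test
-- for "called with a non-integer" is needed.
add :: Int -> Int -> Int
add i j = i + j
```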

Also, though your example of “semantic type checking” in the talk is interesting (I simply don’t know if there are any type systems that could deal with it) you completely skip all the cases where the type system can do the job and where it does save on testing and typing. In these cases you would have a proof relating to your use of types in the program; unit testing can never give that proof, it can only show you haven’t disproved it yet :-)

If you remain unconvinced then I strongly urge you to read Tom Moertel’s post on providing safe strings, and the assurance you can achieve by using a strong type system in relation to information flow through your program. The same technique has been used in Darcs (watch the video, the bit relevant for this discussion starts about 43 minutes in).

I’d also like to clear one thing up, I don’t dislike dynamic languages and I don’t think that static languages are inherently better. What I do dislike about your talk is that it’s uninformed, presents very narrow arguments and then draws conclusions that are very general and simply don’t follow from the given arguments.

Finally, I really enjoy FLOSS Weekly. You and Leo are doing a fantastic job on it, but since it is where I first heard of your talk (I suspect Industry Misinterpretations might not get a lot of attention outside the Smalltalk community) I really think you should talk to someone from the FLOSS FP community. Get someone who can explain, much better than me, what a modern, advanced, strong, statically typed language will get you. I only have experience with OCaml and Haskell, and there are others, like Scala; all are FLOSS and hopefully it wouldn’t be too much work to find someone knowledgeable who’d be willing to set you straight on the dynamic vs static typing issue. I’d be happy to do what I can to help you find someone, just let me know if you are interested :-)

So, I’m struggling somewhat with the move to Arch. Not that Arch is problematic in any way, it seems to be me and my “debianised thinking”.

I noticed that the Vim plugins that were packaged for Arch all install in system directories and hence are imposed on all users. Not a good thing. So, inspired by Debian’s vim-addons-manager, I hacked up a solution relying on pacman. Then I packaged two Vim plugins for Arch. The idea was to do rather than talk, so it wasn’t until after this that I asked for feedback on aur-general. Just to have the idea shot down :-)

Anyway, after being presented with an existing solution more in line with “the Arch way” I decided to try it out. I’ve now subscribed to using GetLatestVimScripts. Brilliant.

Now I need to convince the author of haskellmode to put it on vim.org and I really ought to get my packages off AUR.

Well, not the most diplomatic title maybe, but listening to Randal Schwartz’ talk “Dynamic Returns” first made me laugh and then made my fingers itch to write him an email addressing the fallacies he is peddling. Anyway, I wrote this large, rather ranting, blog post but then decided to not publish it.1

Randal, my suggestion to you is that you get someone like Don Stewart on FLOSS Weekly (I believe he is based in Portland as well) to discuss how Haskell’s advanced type system can be harnessed to completely remove whole classes of bugs.

Randal, if you read this I’ll be happy to send the draft version of the post to you, just drop me an email :-)↩

I probably started in the wrong end, by installing a few GStreamer packages:

gstreamer0.10-bad-plugins

gstreamer0.10-ugly-plugins

gstreamer0.10-ffmpeg

Only after that did I actually configure sound :-)

As always the Arch Wiki contains all the information needed, and more. There’s a page for configuring Alsa. All I did was install alsa-utils, make sure my user was a member of audio and then configure using alsamixer until aplay /usr/share/sounds/alsa/Front_Center.wav played properly. Then I put alsa in the DAEMONS list in /etc/rc.conf.

Next I installed and configured PulseAudio. Again there’s an excellent page with instructions for configuring PulseAudio on the wiki. For me that meant first installing pulseaudio and alsa-plugins. Then I used vigr to add my user to pulse-access and pulse-rt. I also created /etc/asound.conf, following the instruction on the wiki it contains the following:
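The /etc/asound.conf contents from the post aren’t preserved here, but the wiki’s suggestion of the time amounted to routing ALSA through PulseAudio; a hedged reconstruction:

```ini
; route ALSA's default PCM and control devices through PulseAudio
pcm.!default {
    type pulse
}
ctl.!default {
    type pulse
}
```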

After the previous setup steps I found that the next task was to add a few odds and ends that really were needed before proceeding to other major things, like audio, email, and such. Several of the things I needed are available from AUR so to make things easier I started by adding archlinuxfr to my /etc/pacman.conf:
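The repository entry itself isn’t preserved here; a hedged reconstruction of what such an archlinuxfr section in /etc/pacman.conf looked like (the server URL is a best guess):

```ini
[archlinuxfr]
Server = http://repo.archlinux.fr/$arch
```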

After having set up X (not sure I really can call just installing it “setting up” though :-) ) and Gnome, which of course included installing and configuring Xmonad, I went on to configure some network-related stuff.

At home I use mDNS to make it easy to connect to the computers we have at home. I added avahi to the list of services to start in /etc/rc.conf and installed nss-mdns. Then I followed the instructions on how to configure mDNS on Arch, which resulted in the following changes:
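The actual diff from the post isn’t preserved here; the wiki instructions of the time amounted to editing the hosts line in /etc/nsswitch.conf, roughly:

```ini
hosts: files mdns_minimal [NOTFOUND=return] dns
```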

I also need SSH access to my desktop machine so I installed OpenSSH (openssh). No changes were needed to the configuration but following the instructions I made the following change to /etc/hosts.allow to allow connections at all:
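The /etc/hosts.allow change itself isn’t preserved here; a hedged guess at the usual minimal rule that allows SSH in while leaving everything else blocked by hosts.deny:

```ini
sshd: ALL
```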

The last step was setting up NTP. Based on suggestions on the Arch Wiki I installed OpenNTPD (openntpd) rather than the standard NTP package. No changes to the configuration were needed, just adding it to DAEMONS in /etc/rc.conf.

That was it, all the crucial network-related services set up and running at boot.

I’ll soon continue documenting my experience with installing and configuring my desktop, but first a note on fixing an irritating issue in Gnome 2.26. Soon after installing Gnome I noticed that I couldn’t reboot or shutdown from inside my Gnome session, I had to log out and use GDM’s action menu instead. Rather irritating, but something I could live with until the weekend. It took a while to find the solution so to save myself time in the future I’ll document it here. It might even help someone else, who knows?

Whenever I tried to reboot or shutdown I saw the following message appear in ~/.xsession-errors:

After giving up on finding a solution with Google I turned to #archlinux and was pointed to a forum post. That discussion doesn’t mention the warning message at all, so it wasn’t obvious to me at first that it was the same issue. However, about half-way down the page I found a post by justusjonas that looked relevant. I tried that change, or rather something very similar, and a manual reboot later I could confirm that the change indeed fixed the issue. This is the change I applied:

After installing a basic Arch system and adding X.org it still isn’t very usable. My desktop of choice is Gnome, so that’s what I installed next, with a few extra packages to make it prettier and more useful.

I found out the hard way a few weeks ago that if you install gnome without gnome-extras then it’s a bad idea to install gdm and configure your system to boot to X. That way you’ll end up in a situation where your desktop doesn’t have an X terminal. I made sure to avoid that situation this time around. After installing gnome, gnome-extras and gdm I noticed that both gdm and policykit created users with uid 120 and 102, respectively. Just like expected from services they have low uids. What surprised me was that they ended up having gid 1001 and 1002. That looks like bugs to me :-) I decided to fix that up manually by editing the group file (using vigr of course) and then searching for all files with the offending gids (using find / -gid 1001 for the first, with obvious changes to find the second).

After configuring gdm I realised I also needed gdm-themes :-) And I also added a few pretty fonts, and removed a few of the unpretty ones:

Coercing Gnome to actually use it was interesting though. I followed the instructions for running Xmonad in Jaunty, but that wasn’t enough. I also needed to place a file named xmonad.desktop in /usr/share/applications/ with the following contents (greatly inspired by the metacity.desktop found in the same location):
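The exact file from the post isn’t preserved here; a hedged reconstruction modelled on GNOME 2’s metacity.desktop (the X-GNOME keys are the ones its session manager looked for):

```ini
[Desktop Entry]
Type=Application
Encoding=UTF-8
Name=Xmonad
Exec=xmonad
NoDisplay=true
X-GNOME-WMName=Xmonad
X-GNOME-Autostart-Phase=WindowManager
X-GNOME-Provides=windowmanager
X-GNOME-Autostart-Notify=false
```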

After that it was time to install stuff that is useful :-) First off was Thunderbird, which I immediately had to equip with a few plug-ins of course. When setting up my mail account again I noticed that my Gnome session wasn’t quite the way I wanted—there was no gpg-agent running. After a quick check with the people on arch-general, without receiving any definitive resolution, I wrote my own little hack to address it. Of course I posted it in the thread I had started, but I might as well include it here too:

<rant>So my irritation with Debian on the desktop reached a critical moment over the last few days. Sid is a good desktop system for most things, but if you are interested in Haskell then you really should prepare yourself for being a little behind on things. If it isn’t GHC that is out of date then it’s some library that hasn’t been rebuilt for the latest version of GHC. If you’re lucky then new packages are uploaded to the NEW queue and it’s only a matter of waiting for the autobuilders to get to them. Of course that doesn’t mean you can get those packages, so not that much luck in the end anyway.</rant>

I decided to move to the only other distro I’ve used seriously since Linux’ move to 2.x kernels, Arch. I remember it as slightly less polished than Debian, but more up-to-date than even Sid. Also, there is a vibrant group of Haskell hackers working on providing almost all of Hackage in Arch’s native packaging system.

Installing Arch turned out to be a bit of a blast from the past, even when compared to Debian. Luckily the Arch wiki is there to help with the things that are different compared to Debian. I would suggest having a second computer with a browser pointed to Arch’s home pages nearby during the entire install.

For my installation of Debian I had opted to use LVM and I wanted to keep my /home around so I had to follow the steps required to use LVM in Arch. Installation went smooth and the reboot was successful. Now starts the hard work—configuring the system and updating my home folder.

The very first thing I noticed was that vim wasn’t installed by default, only vi was available. Not a big problem, except that my muscle memory resulted in a lot of vim: command not found messages. Clearly I had to install vim just to keep my sanity.

Next thing to do was to install X.org and the video driver I need:

# pacman -S xorg xf86-video-intel

After adding hal to /etc/rc.conf I started it manually and I ran startx as non-root. Wow, it all worked perfectly, without creating any configuration file for X first. I double checked and it used the driver I wanted, i.e. it didn’t pick a safe default, like vesa, as its first option. Brilliant.

I have written a few posts on debianising Haskell packages in the past. Back then I stopped looking at it because of the age of the ghc-related packages in Debian Sid. Things have since improved and I’ve finally found some time to revisit the problem. This isn’t however a technical post, but rather a post on the strategy I think would be best for making Debian the best Haskell development environment in the universe.

In my opinion there is no point in trying to get all packages on Hackage into Debian proper. The few Debian developers (DD) who are interested in Haskell and maintain the existing packages are already terribly short on time, and adding further burden would hardly improve the chance of timely availability of GHC in the future. Sure, it would be possible to recruit more DDs with an interest in Haskell (I’ve considered in the past to try to become a DD, but the timing was off on both sides and I decided to postpone it) but it would take time. Maintaining 1000 packages is a huge task, and would require quite a large team I think. I suspect the recruiting would be such that if one were to plot the number of packages the team could maintain as it grew over time, compared to the number of packages on Hackage, then the lines would diverge rather than converge ;-)

Instead I think all packages on Hackage should first be debianised and put in a source-only repository. As some packages grow in importance and get added into Debian proper (e.g. a killer-app written in Haskell) then it and its dependencies are removed.

I believe that with this solution it’d be possible to quickly make a large part of Hackage available on Debian while still giving the DDs time to grow the number of “Haskell-friendlies” within Debian.

This of course rests on a few assumptions:

It would require a smaller team to maintain source packages than binary packages. At least it would require less time from DDs.

There is some space to store the source packages, ideally non-Debian space since that would mean even less involvement required by DDs.

There’s a tool available in Debian that makes it easy for end-users to download, compile, and install source packages.

There’s a tool that automates the creation of a Debian source package from a Cabal package.

In my optimism I’ve decided not to worry about the first two points and instead focus on the last two. Besides the optimism, they also feel like the points I can actually address on my own; I’ll need help with the first two.

The tool in Debian for handling source packages is apt-src. In my opinion it will suffice, but if the developer would address one issue then it would be perfect.

The tool for automating the creation of Debian source packages could be cabal-debian. It is a very good tool, and it seems to be suited for maintaining Debian packages. I only have two objections to it. The first objection I have is that it is almost too well engineered; it does more than is necessary, e.g. there is no need to maintain proper changelogs in the source repo. The second is that it requires all dependencies of a package to be installed when it’s debianised. It is almost certainly the tool for the team charged with maintaining a large set of Cabal-based packages in Debian proper, but for the source-repo it might be too much. Because of that I’ve started hacking on a simplistic tool that I feel would be better suited for the source-repo.

avsm pointed me to ubigraph. So after a bit of XML-RPC hacking (haxr really rocks, it is just so amazingly easy to use) I now have code that sends the graph data over to a ubigraph server.

Of course I had to create a video of it. The video shows the dependencies of just two packages (dataenc and datetime). I tried creating a second video of the dependencies of more packages, but xvidcap isn’t very reliable it seems. One irritating thing is that I run out of filehandles fairly soon, so I couldn’t create a 3d graph of more than about 75 packages (all packages starting with ‘a’ and ‘b’ on Hackage).

The basic idea is to use the code from my earlier post and accumulate dependency information by mapping it over several cabal files. Then convert that information into nodes and edges suitable for building a graph (Data.Graph). That graph is then “graphviz’ed” using Data.Graph.Inductive.Graphviz. Note that since this is performed on Debian Sid I’m using some rather old versions of packages.1

First off a shortcut for reading cabal files:

readCabalFile = readPackageDescription silent

Once I have a GenericPackageDescription I want to collapse it into a regular PackageDescription (see the comments in my previous post for some details regarding this). Then I extract the package name and its dependencies and package them into a tuple. By mapping this function over a list of GenericPackageDescription I end up with an association list where the key is the package and the value is a list of all its dependencies.

In order to create the graph later on I need a complete list of all nodes. To do this I take all the keys and all the values in the association list, collapse them into a single list, and remove duplicates. To turn this resulting list of packages into a list of LNode I then zip it with a list of integers.

Building the edges is straightforward, but a bit more involved. An edge is a tuple of two integers and something else; I don’t need to label the edges so in my case it is (Int, Int, ()). The list of nodes is basically an association list (an integer for key and a string for value), but I need to flip keys and values since I know the package name and need its node number.
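The node and edge construction just described can be sketched without Cabal (DepMap, mkNodes and mkEdges are hypothetical names; plain package names stand in for parsed cabal files):

```haskell
import Data.List (nub)

-- Association list from package name to its dependencies, as described
-- above.
type DepMap = [(String, [String])]

-- All nodes: keys and values collapsed into one list, duplicates
-- removed, then zipped with integers.
mkNodes :: DepMap -> [(Int, String)]
mkNodes dm = zip [0 ..] (nub (map fst dm ++ concatMap snd dm))

-- Edges: flip the node list so package names can be looked up, then
-- emit one unlabelled edge per dependency.
mkEdges :: DepMap -> [(Int, String)] -> [(Int, Int, ())]
mkEdges dm nodes =
  [ (i, j, ())
  | (p, deps) <- dm
  , Just i <- [lookup p flipped]
  , d <- deps
  , Just j <- [lookup d flipped]
  ]
  where
    flipped = [(n, i) | (i, n) <- nodes]
```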

I downloaded all cabal files from Hackage and ran this code over them. I bumped into a few that use Cabal features not supported in the ancient version I’m using. I was a bit disappointed that Cabal wouldn’t let me handle that kind of errors myself (as I already expressed in an earlier post) so I was forced to delete them manually.

In my opinion the final decision of what to do in case of an error belongs in the application, not in libraries.

I’m sure there are exceptions to this, but I believe it’s good as a guiding principle when defining APIs.

Here’s the code in Cabal that breaks this rule. Basically, if you write an application and call readPackageDescription on a cabal file with errors then your application is terminated (the call to dieWithLocation ends up calling exitWith). I would much rather it returned an error (e.g. based on Either) so I can decide what to do with bad cabal files myself.
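One hedged workaround, since exitWith signals termination via an ExitCode exception: fence the offending call in and recover an Either after the fact (tryExit is a hypothetical helper, not part of Cabal):

```haskell
import Control.Exception (try)
import System.Exit (ExitCode (..), exitWith)

-- Catch the ExitCode exception thrown by exitWith (and hence by
-- dieWithLocation), turning "the library terminates my application"
-- into a value I can inspect myself.
tryExit :: IO a -> IO (Either ExitCode a)
tryExit = try
```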

This is going to be a silly post, simply because it is so amazingly simple to do some basic parsing of Cabal files. It turns out to be true even for old versions of Cabal, the code below is for 1.2.3.0 since that’s the version still found in Debian Unstable.1

I stumbled on this email thread on starting Xmonad in Fedora 10. In anticipation of Gnome 2.24 being installable from Debian Experimental I dropped the following into /usr/share/applications/xmonad.desktop:

First of all I decided to make this a new post rather than answer in the comment section of my previous post. I think that this way the chance is greater that both David and Jeremy will read this :-)

So, first let’s set the stage… It’s almost midnight and I’ve just spent somewhere between 30 and 45 minutes hunting down packages on Hackage and navigating the directory listings on src.seereason.com. I’ve been sent in the wrong direction once (downloading a Debian package that seems to be from Linspire instead of grabbing the identically named package from the seereason site). I’m tired and slightly frustrated. I run cabal-debian, it works, then I run debuild just to be told that I really should have built cabal-debian itself into a Debian package and installed it that way, and that I also need another package that isn’t in Debian at all.

Anyway, having slept and let almost 24 hours pass by I’ve come to my senses somewhat ;-)

Here’s what I think is a slightly more tempered wishlist:

Remove haskell-devscripts-cdbs and include hlibrary.mk in haskell-devscripts.

Modify dh_haskell* so that the two snippets for -doc are automatically generated instead (just like the script snippets for -dev already are).

I think it would be easy to stick that in seereason’s repos for public consumption. Then interested people, like me, can start thinking about how to use that as a base to make Debian the best Haskell platform in the world, or at least making it better than Arch ;-)

Completely unrelated question for seereason people

So, after receiving several pointers to seereason’s cabal-debian tool I thought I’d take it for a spin. (I did pull down autobuilder as well, but didn’t feel I had enough time to explore it tonight.)

After about 30 minutes of browsing through HackageDB and seereason’s source repos, building and installing, I had finally satisfied all dependencies and the build of cabal-debian succeeded. (Oh, BTW, seereason people, it’s really confusing that you have a package called debian among your source repos, when there already is a different debian package on HackageDB. Please consider renaming it!) I decided to take the tool for a spin on my own package, dataenc.

The result was fairly good. It seems the generated files depend on some packages that either aren’t in Debian or that the people at seereason have modified. With the following changes to the generated files I was happy with the contents of ./debian and I successfully built an (almost) autogenerated Debian package:

I’ve just spent the last hour or so ironing out the details required to automate the building of cabalised Haskell packages for Debian. At the same time I also built Debian packages for 5 Haskell packages (test-framework and its dependencies).

These are the basic steps I’ve followed:

Extract the tar-ball in the working directory.

Rename the directory to match the form haskell-X-0.1.

Re-tar the directory into an archive named haskell-X_0.1.orig.tar.gz.

Change into the package directory.

Run Cabal’s configure command followed by register --gen-pkg-conf=pkg.conf.

Create debian/control. This is by far the most time consuming part to do manually. It’s also a bit long so I’ll leave it out of this post. Take a look in my APT repo if you are interested in the details.

Copy hlibrary.mk into debian/. At some point it will be included in some Debian package (bug #462482) but for now it can be found here.

Create debian/rules with

#!/usr/bin/make -f
include $(CURDIR)/debian/hlibrary.mk

and make it executable, chmod +x debian/rules.

Done. Run debuild.
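Steps 1 through 4 above can be sketched as a small shell function (prepare_orig is a hypothetical name; the Cabal, debian/control and debian/rules steps remain manual):

```shell
# Hedged sketch of the tar-ball preparation steps; assumes the upstream
# tar-ball PKG-VER.tar.gz is in the current directory.
prepare_orig() {
    pkg=$1
    ver=$2
    tar xzf "$pkg-$ver.tar.gz"                                       # 1: extract
    mv "$pkg-$ver" "haskell-$pkg-$ver"                               # 2: rename
    tar czf "haskell-${pkg}_${ver}.orig.tar.gz" "haskell-$pkg-$ver"  # 3: re-tar
    cd "haskell-$pkg-$ver"                                           # 4: change into it
}
```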

My intention is to at some point encode these steps in a tool so that I can spew out Debian packages for all interesting packages on HackageDB at the press of a button. Of course that tool will be written in Haskell, and it sure won’t rely on Python for any part of the process ;-)

Readers who are familiar with the rules for Debian packages will notice that this won’t create well-formed packages. I’m not particularly interested in getting these packages into Debian proper, I see the package generated this way living in an unofficial repo. So there is no need to make sure they are acceptable for inclusion in the distribution itself. The resulting packages might offer a good starting point for someone who feels that a package deserves to become official. I also suspect that a tool that generates fully compliant packages, and is able to deal with keeping them up-to-date, is too big a task for me. I’m also not sure it really is worth it, better to keep things as simple as possible in order to deal with the amazing rate of new releases on Hackage.

I’ve been using both QuickCheck and HUnit to make sure that the functions in dataenc both have the expected properties and follow the specifications.

This morning I stumbled on a series of posts by Matthew Podwysocki on Functional Programming Unit Testing (part 1, part 2, part 3, part 4, part 5). I think I really could have used these posts about two years ago when I first ventured into the world of testing Haskell. The only new thing I learnt was the existence of the package test-framework in part three. Anyway, they are a good introduction to testing Haskell code (and F#).

Today I started the day by watching some Swedish telly, Gomorron Sverige I think the show was called. Three guests got to comment on the news year that just passed. They were asked for their nominations in categories such as Most Over-Reported News Story, Most Important Public Figure and the like. One of them nominated climate change for the category Most Under-Reported News Story. Immediately one of the others blurted out that he thinks that climate change is exaggerated and has received too much attention in media. He was told that there now is proof that large portions of the polar ice have melted (25% was mentioned, though I didn’t catch whether it was an increase of 25% or if 25% was gone; anyway, it doesn’t really matter here) and that therefore this is a very real problem. In answer to this the climate-change doubter mentioned that he has been keeping an eye on the water level at the pier near his summer place and there has been no change over the years. So, if climate change is real and the polar ice is melting, where does the water go? My initial reaction against this line of reasoning seems to have been shared by the person who initially brought up the subject: “How could he argue against climate change? And with such a simplistic argument as well?” Then I was disappointed to hear that the only counter argument basically boiled down to “But, it doesn’t work that way” and “How can you argue against all the scientists?” First I was disappointed in her for not providing any answers, then I grew more disappointed in myself as I realised that had I been sitting next to this climate-change doubter at a dinner I wouldn’t have been able to provide a stronger argument myself.

How should a counter-argument be constructed? What scientific ideas and proofs should be used to point out the fallacy of the “pier argument”?

I can think of these:

There is evidence of climate change on a macro level, his argument doesn’t explain those away. All he’s really saying is that on a micro level, and in a very specific geographical location, measuring only the water level, he can’t find any evidence for climate change.

It’s probably also simple to point out the lack of scientific rigor in his methods. Somehow though, I feel this is a bad way to try to convince anyone.

Let’s start with the table of contents. To be honest I found it very disappointing, there were so many other chapters that I would have found much more interesting than chapter 10. Given that I’ve never really gotten into Apache to begin with I would much rather have read about advanced configuration (chapters 2 and 3), or CGI (chapters 3 and 11), or most of all about securing lighttpd, both securing what it serves (through SSL, chapter 6) and by reducing privileges (chapter 8). Yes, for me personally they’ve chosen the least interesting chapter to make available free of charge ;-)

So, what do I think about chapter 10 then? Overall I’m happy with it. In a book like this, one that covers usage of a specific piece of software, I’m happy to find text that is clear and simple to understand. It isn’t very deep in concepts and details, but there are links to places where I can find out more if I need it. I’m not familiar enough with lighttpd to judge whether the details are correct, but I assume they are, and as I already mentioned it isn’t a complete reference, but offers links to where the reader can find more details. One thing that did upset me a little is the presence of rather lengthy pieces of Python scripts to aid in converting some of the Apache configuration to lighttpd. Though I suppose they fit in, I’d rather have seen them made available only online, with just the descriptions and usage in the book. I couldn’t help but feel they were there only to increase the page count.

Anyway, what I’ve seen so far makes me want to read the rest of the book. As I’ve made clear I’m not a serious user of lighttpd, but it is the web server I reach for first when I find myself needing one. I think this book will make me a more efficient user of lighttpd.

I’ve read quite a few articles about the changing landscape of marketing, how regular ads and old-style marketing is no longer trusted. I suppose companies have over-promised and under-delivered for too long. Instead PR firms are adding blogging into the mix, which probably means blogs will be ruined in the long run ;-) For now though blogs are still trusted, especially when it’s regular people who write about products.

So, why this post? Well, it seems the few posts I’ve made here on lighttpd have attracted some minimal attention and I received an email regarding a new book on lighttpd, published by Packt publishing. This despite the fact that I am what can only be called a cursory user of lighttpd. It contained a link to a free sample chapter so I decided to play along and publicly comment on the chapter. If you don’t like this kind of thing then you should skip the next post.

The blog GHC Mutterings doesn’t allow posting of comments without logging in. Since I have absolutely no interest at all in creating yet another online account I hope they allow pingbacks :-)

In this post they discuss the planned redesign of GHC’s build system, and here’s my comment:

The goal of moving away from make is probably a good one, but when failing to get Cabal to fit your needs you revert back to redesigning it using make again? There are numerous other build tools out there, many of them remarkably better than make. What other tools were considered and why were they discarded? I simply refuse to believe that the only build tool to make the shortlist was make. :-)

Here’s another post about a paper I’ve read recently. This time it’s not entirely for fun, but I still thought I’d write about this one, Adventures with a certain Xen vulnerability (in the PVFB backend). I’ve read a few security-related papers and articles. In general I’ve found that there’s a huge gap in quality (and sometimes rigor) between the practitioners and academia. This however is a paper that I found to be of good quality, while still being produced by a member of the former camp. Hopefully it will start a trend :-)

A friend pointed me to this film about what money is. I’ve known since high school how banks create money, but I’ve never thought it through to its logical, and somewhat worrying, conclusion. Well worth watching I thought.

I’ve found myself getting confused over the Data.Time modules a few times now. Today I wanted to do something that I ought to be simple both to do (it was) and find out how to do (it wasn’t, at least not for me). I thought I’d write it down, because I’m bound to forget about it otherwise, and it might be helpful for some other poor soul out there.

I wanted to store a date in a database (SQLite) using Haskell (HDBC). Since I’m only interested in the date and not the time I chose to represent it in Haskell as a Data.Time.Calendar.Day. For Day, 0 means 1858-11-17. HDBC on the other hand needs the date in seconds since epoch (1970-01-01 00:00 UTC), which conveniently can be represented with a single Integer. Data.Time.Clock.POSIX contains a few handy functions to convert between POSIXTime (NominalDiffTime) and UTCTime. However, there are no special functions for converting between an Integer and NominalDiffTime. The way I found was to do it via the type classes that NominalDiffTime implements.
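A hedged sketch of one way to do the conversion described above (dayToEpochSeconds is a hypothetical name): go from Day to UTCTime at midnight, then to POSIXTime, and finally round the NominalDiffTime, via its RealFrac instance, to a whole Integer of seconds:

```haskell
import Data.Time.Calendar (Day, fromGregorian)
import Data.Time.Clock (UTCTime (..))
import Data.Time.Clock.POSIX (utcTimeToPOSIXSeconds)

-- Day -> seconds since the Unix epoch, going via UTCTime at midnight;
-- round uses NominalDiffTime's RealFrac instance.
dayToEpochSeconds :: Day -> Integer
dayToEpochSeconds d = round (utcTimeToPOSIXSeconds (UTCTime d 0))
```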

First I thought that Enum might offer the solution, but my experiments with that were a bit confusing:

I’d love to see the Gmail S/MIMEplugin for Firefox gain support for PGP/MIME. I can’t imagine it’d be too hard to add for someone who’s familiar with JavaScript and Firefox plugins. Or maybe there’s some other plugin out there that offers PGP/MIME support in Gmail?

Oh, and no, FireGPG isn’t good enough, it doesn’t do PGP/MIME, it doesn’t automatically verify/decrypt received emails, and it requires too much mouse interaction!

I’ve recently made some comparisons between HaskellDB and HDBC. My gut feeling is that I’d like to use HaskellDB due to its type safety and that it allows me to abstract away from SQL—something in me twitches every time I see strings containing SQL mixed in with other code, no matter what language it is. However there are some things that confuse me with HaskellDB. After overcoming the initial puzzlement that I seem to hit whenever I look at a Haskell API I stumbled on the apparent lack of support for primary keys. This surprised me and to check if I overlooked something I wrote the following code:

Recently I began writing a tool to scrape some information off a web site for some off-line processing. After writing up the basics using TagSoup I showed what I had to a colleague. His first comment was “Can’t you use Parsec for that?” It took me a second to realise that he didn’t mean that I should write my own XML parser but rather that Parsec allows writing parsers of a list of anything. So I thought I’d see just what it’d take to create a parser for [Tag].

A look at the string parser shipped with Parsec offered a lot of inspiration.

First the basic type, TagParser:

type TagParser = GenParser Tag ()

The primitive parser in Parsec is tokenPrim; basically, that’s what the other basic parsers build on. Taking a cue from the string parser implementation I defined a function called satisfy:
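A self-contained sketch of such a satisfy, using a simplified stand-in for TagSoup’s Tag type (the real one is richer) and with the position handling inlined:

```haskell
import Text.Parsec hiding (satisfy)
import Text.Parsec.Pos (incSourceColumn)

-- A simplified stand-in for TagSoup's Tag type, just for this sketch.
data Tag = TagOpen String | TagClose String | TagText String
  deriving (Show, Eq)

type TagParser = Parsec [Tag] ()

-- Accept any tag for which the predicate holds; the position update
-- simply bumps the column by one per consumed tag.
satisfy :: (Tag -> Bool) -> TagParser Tag
satisfy f = tokenPrim show
                      (\pos _ _ -> incSourceColumn pos 1)
                      (\t -> if f t then Just t else Nothing)
```

With satisfy in place, parsers like tag and anyTag fall out as one-liners.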

The positioning in a list of tags is simply an increase of the column, irrespective of which tag is processed:

updatePosTag s _ _ = incSourceColumn s 1

Now I have enough to create the first Tag parser—one that accepts a single instance of the specified kind:

tag t = satisfy (~== t) <?> show t

It’s important to stick the supplied tag on the right of (~==). See its documentation for why that is. The second parser is one that accepts any kind of tag:

anyTag = satisfy (const True)

So far so good. The next parser to implement is one that accepts any kind of tag out of a list of tags. Here I want to make use of the convenient behaviour of (~==) so I’ll need to implement a custom version of elem:

Of course the big question is whether I’ll rewrite my original code using Parsec. Hmm, probably not in this case, but the next time I need to do some web page scraping it offers yet another option for doing it.

Being somewhat spoiled by the current state of Linux-on-the-desktop (Gnome in my case) I have become used to inserting a blank CD, pointing Nautilus at burn:///, dragging and dropping a few files into it, and finally clicking the “Burn to CD” button. Works beautifully. Of course I thought that Nautilus’ CD-burning would know about audio CDs. So I dragged a few WAV files into Nautilus and clicked the button. Voilà, a data CD containing a few WAV files! Not really what I wanted.

As Edward pointed out in a comment on the previous post, expressions such as a .+. b .+. c are problematic. By expanding the expression and introducing a few parentheses it becomes apparent what the problem is:

fromIB $ (toIB (fromIB $ (toIB a) `addIB` (toIB b))) `addIB` (toIB c)

The problem is the inner fromIB. Unfortunately GHC doesn’t realise that the left-most sum could take on any type that is a CIB; the exact type doesn’t really matter since the result is passed to toIB anyway. It would be sort of cool if I could tell the compiler to prefer a specific CIB, basically a directive along the lines of “if in doubt, use Bytes”. I don’t think that’s possible in GHC. As it stands one would have to specify the type of the inner sum:

(a .+. b :: Bytes) .+. c :: KBytes

One possible solution would be to remove the call to fromIB from the definition of .+. and instead require the user to call it explicitly:

fromIB $ a .+. b .+. c :: KBytes

I suppose it’s all right, but not quite as elegant as I had hoped.

Now on to the more interesting part of Edward’s comment. I needed quite a bit of clarification, but I now know what he is proposing; I even think I understand it ;-)

The general idea is to have the compiler choose the largest prefix that can represent the sum of two values without losing precision. The representation I used didn’t have the problem of ever losing precision, so I’ll change the representation to better show what Edward meant.

Now there’s a second relationship that is designed to relate two prefixes to a third one. Basically the third is the largest prefix that can be used to represent the sum of the other two without a loss of precision. One could say that the third is where the other two meet.

I just realised that it’s been over a month since I posted something. All I can do is blame life ;-) With a daughter of 8 months and a recent switch of positions I just haven’t found time to play around really. Hopefully that’ll change now that the situation at work is “calming down” a bit.

Anyway, here’s a bit of code I played with today. A friend described a problem inside a tool related to reporting of free and used memory. Apparently, depending on the location in the code, memory could be stored in integers in bytes or in kilobytes (kB) or in kibibytes (Ki) or in mebibytes (Mi) etc. In every place the prefix would be ‘recorded’ in the variable name, resulting in a situation where a change of prefix used in one place would force a change of variable names in callers/callees. A bit of a mess. I thought I’d try to explore what Haskell’s type system could produce ;-)

First approach

My first thought was to create a data type for the prefixes:

data Size = Bytes Int | KB Int Int | KiBi Int Int deriving (Show, Eq)

Both for KB and KiBi I record the number of full KBs (Kis) and the remaining bytes. Then I thought that it’d be easiest to do all the calculations in bytes, which required a conversion function:
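That conversion function isn’t shown above; a sketch of what it might look like (the factors are assumptions: kB = 1000 and Ki = 1024, and the Size type is repeated so the sketch stands alone):

```haskell
-- The prefix type from above, repeated for a self-contained sketch.
data Size = Bytes Int | KB Int Int | KiBi Int Int
  deriving (Show, Eq)

-- Convert any Size down to plain bytes.
toBytes :: Size -> Int
toBytes (Bytes n)  = n
toBytes (KB k n)   = k * 1000 + n
toBytes (KiBi k n) = k * 1024 + n
```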

On some level this ‘algebra’ for memory usage makes sense, though refinements are certainly possible. I suspect there are some interesting choices to be made regarding the remaining functions in Num, e.g. should negate turn free memory into used memory?

I find this solution somewhat lacking in the extensibility department—as the commonly available amount of memory grows (the code above is stuck in the 80’s it seems) changes have to be made to Size. The same goes for changes to support non-standard prefixes; probably not a common problem, but it would be nice to have a solution that is “independently extensible”.

Second approach

Here my idea was to create one type per prefix, use a separate type for the calculations and use a type class for the conversions. Quite straightforward when put like that, though it took me a while to come up with it ;-)

Here is the internal representation used during calculations:

data IB = IB Int deriving (Eq, Show)

Simple enough. Then the class for conversions to and from IB:

class CIB a where
    toIB   :: a -> IB
    fromIB :: IB -> a

No surprises there either. Now I want to add and subtract, but due to the definition of Num (it requires all arguments to be of the same type, e.g. the type of addition is (+) :: a -> a -> a) I’ll need two new functions for that (at least I don’t know a way around this):

One interesting observation is that due to the type of .+. and .-. it is possible, and often required, to specify the type of the result, somewhat similar to HSH. Now the stage is set for the prefix types to be defined, starting with plain bytes:
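A self-contained sketch of how the pieces might fit together, with two prefix types and addition via IB. The representations are assumptions (Bytes as Int, KBytes as Double with a factor of 1024), and IB/CIB are repeated so the sketch stands alone:

```haskell
-- Internal representation used during calculations.
newtype IB = IB Int deriving (Eq, Show)

class CIB a where
  toIB   :: a -> IB
  fromIB :: IB -> a

newtype Bytes = Bytes Int deriving (Eq, Show)

instance CIB Bytes where
  toIB (Bytes n) = IB n
  fromIB (IB n)  = Bytes n

newtype KBytes = KBytes Double deriving (Eq, Show)

instance CIB KBytes where
  toIB (KBytes k) = IB (round (k * 1024))
  fromIB (IB n)   = KBytes (fromIntegral n / 1024)

-- Addition with explicit conversion through IB; the result type is
-- chosen by the caller, e.g. (a .+. b :: KBytes).
(.+.) :: (CIB a, CIB b, CIB c) => a -> b -> c
a .+. b = fromIB (IB (x + y))
  where IB x = toIB a
        IB y = toIB b
```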

End thoughts

There is one thing I like about the first approach and one thing I don’t like. That both the types are instances of Num is something I like since I am interested in adding and subtracting memory, though arguably there are functions in Num that don’t make much sense for memory (e.g. multiplication and negation). The thing I don’t like is the need for explicit conversion functions (toKB and friends).

There is also one thing I like about the second approach and one thing I don’t. I would have liked to use the standard functions in Num; introducing new (type-specific) operators is something I’m never particularly keen on. Though I suppose implementing separate operators does emphasize that memory isn’t quite like regular integers. I really do like using Haskell’s own syntax to do the conversion; it’s more readable than having to call conversion functions.

That’s it! Comments and suggestions for improvements are always welcome of course.

In the comments to my post on the Google treasure hunt Andre asked me to put up my solution for the fourth problem. The problems were personalised and here is mine:

Find the smallest number that can be expressed as the sum of 3 consecutive prime numbers, the sum of 29 consecutive prime numbers, the sum of 373 consecutive prime numbers, the sum of 665 consecutive prime numbers, and is itself a prime number.

The idea behind my solution is that if I have four lists of the sums, and a list of primes, then the solution to the problem is the smallest member of the intersection of all five lists.

I saved the problem of finding primes for last and instead concentrated on summing up lists of integers in a convenient way. I ended up with the following solution to that:

The real work happens in the inner functions. sumn picks out as many numbers as I want and sums them up; the reason for taking and returning a tuple is so that sumi can use iterate to calculate the sequence of sums. The first item in the result of iterate f a is a itself, so sumi drops it. This lets me create the needed lists of sums:

sum3 = sumN 3
sum29 = sumN 29
sum373 = sumN 373
sum665 = sumN 665
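The definition of sumN itself isn’t shown above; a sketch consistent with the description (an inner sumn threading a tuple through iterate, and sumi dropping iterate’s first element) could look like this:

```haskell
-- Sums of n consecutive elements, as a (lazy) sliding window.
sumN :: Int -> [Integer] -> [Integer]
sumN n xs = map fst (sumi xs)
  where
    -- sum the next window of n elements, and step the list forward
    sumn (_, ys) = (sum (take n ys), drop 1 ys)
    -- iterate's first element is the seed, so drop it
    sumi ys = drop 1 (iterate sumn (0, ys))
```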

After having a quick look at the implementation of Data.List.intersect I realised it probably would be a bit too slow. Instead I wrote one that exploits that all the involved lists are ordered:
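A sketch of such an intersection for ascending lists; because a match is emitted as soon as both heads agree, it works on infinite lists too:

```haskell
-- Intersection of two sorted (ascending) lists.
isect :: Ord a => [a] -> [a] -> [a]
isect xs@(x:xs') ys@(y:ys')
  | x < y     = isect xs' ys          -- x can never appear in ys
  | x > y     = isect xs  ys'         -- y can never appear in xs
  | otherwise = x : isect xs' ys'     -- common element, keep it
isect _ _ = []
```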

After convincing myself that this code did what I wanted (basically through playing in GHCi using an incorrect definition of primes, primes = [1..]) I turned to the problem of finding the primes. Of course someone else had already written the Haskell code to find primes. That code fitted perfectly with mine since it produces an infinite list of Integers. Brilliant! :-)

Finally here’s the main that allows me to compile the code and get the answer quickly (it’s slightly modified for the medium, basically I wrote it all on one, long line):

I personally found the first one to be the trickiest; it took quite a bit of thinking together with a colleague before we saw Pascal’s triangle in the problem. I was a bit worried that things would get trickier from there on. They surely didn’t.

The second, summing certain lines taken from files matching certain glob patterns, was easily solved with a short Haskell program.

The third, a routing problem, was probably the easiest, and I think most easily solved on paper. I did make a mistake though, which I caught before typing my answer into the webform.

The fourth one, adding sequences of primes, was fairly straightforward. I think it would have presented a bit more trouble if I hadn’t made use of Haskell’s laziness and support for infinite lists though.

I bet a lot of people would consider this fairly obvious, but today, while solving the fourth of Google’s Treasure Hunt problems, I realised just how similar in essence laziness plus infinite lists in Haskell are to generators in Python. It’s just much easier to work with the former ;-)

I hope this will be out of date really soon, but here’s what I did to solve the problems I had with the nVidia driver after upgrading to the latest kernel in Sid (2.6.25). First download the file NVIDIA_kernel-169.12-2286310.diff.txt from this forum article. Then, as root do:

I bet you have at least one silly little thing at work that, whenever it happens, makes you let out a sigh, maybe roll your eyes, and wish that everyone would use a proper operating system. A few days ago I finally decided to do something about one of mine. At work, Windows users will at times, for some strange reason, manually create directories inside their work area, even though the directories are actually under version control. Invariably they get the case wrong, and thanks to an unfortunate combination of a case-insensitive filesystem on the client (Windows) and a version control system that cares about case (Perforce), files end up all over the place even though they belong in the same directory. The Windows users are none the wiser; they simply don’t see the problem. Since I use a sane system (Linux) I do notice, and when I see it I sigh and roll my eyes.

Yesterday night I hacked up a silly little tool for use at work. Nothing spectacular ((I’ll post the source here shortly.)) but I thought I’d try this Literate Haskell thing. I decided to go the LaTeX route and created a .lhs file. To my amazement rubber has a module for handling Literate Haskell; unfortunately it passes everything ending in .lhs through lhs2TeX. I didn’t really want to use lhs2TeX ((I’ll probably have a closer look at it later but for now I wanted to keep things as easy as possible.)) but I haven’t been able to find any way of changing rubber’s behaviour through its command line arguments. What I’d like is a way to disable a module for a single invocation. So, dear lazyweb, is there a way to do this from the command line?

So far I’ve worked around this by creating a symbolic link named MyTool.tex that points to the Haskell source, then I run rubber on the link. Not to much of a hazzle, but it’d be nicer to pass an argument to rubber.

I just went through the exercise of setting up MediaWiki on a Debian Etch box, but instead of serving it using the common choice of Apache I wanted to use lighttpd. It seems the combination MediaWiki and Apache is well supported on Debian. There was even an install-time question whether I wanted the configuration files set up for me. Well, it’s boring to follow the common path, right?

Luckily that doesn’t mean it’s difficult to serve it with lighttpd. Here’s how I configured things, and it all seems to work just fine ;-)

Installation

First off, I installed mediawiki1.7, lighttpd, php5-cgi, and mysql-server. I made sure that aptitude only pulled in required packages and then I started un-choosing all Apache-related packages. By the time you get to installing MySQL you’ll be asked for a root password, that is if your debconf is set to ask medium-level questions.

Setting up MediaWiki

The Debian package for MediaWiki creates a site in /var/lib/mediawiki1.7 consisting mostly of symbolic links to the actual location of its files. I kept lighttpd’s default document root of /var/www and linked the two by creating a symbolic link /var/www/mw pointing to the MediaWiki site (/var/lib/mediawiki1.7).

Configuration of lighttpd

First enable FastCGI by linking /etc/lighttpd/conf-available/10-fastcgi.conf into /etc/lighttpd/conf-enabled/. Then go in and change the bin-path to point to /usr/bin/php5-cgi. To prepare for the MediaWiki configuration, enable mod_rewrite in /etc/lighttpd/lighttpd.conf and finally create the file /etc/lighttpd/conf-enabled/20-mediawiki.conf with the following contents:
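A minimal sketch of what such a file might contain (the rewrite rule here is an assumption based on the /mw symlink above; adjust paths to your setup):

```
# /etc/lighttpd/conf-enabled/20-mediawiki.conf (sketch)
url.rewrite-once = (
    # send pretty URLs under /mw/ to MediaWiki's index.php
    "^/mw/([^?]*)$" => "/mw/index.php?title=$1"
)
```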

If pbuilder --create fails when building a base image for Sid, it’s worth trying to build a base image for the stable release and then upgrade it to Sid afterwards. The very manual way is pbuilder --login --save-after-login.

Now, how do I actually get the full contents of Foo.name in python? The only way I could think of that actually works is to create a dummy-struct type in order to get the length out and then use it to dynamically create a new sub-class of ctypes.Structure, then create an instance based on the address of what was returned. I think the following shows it pretty clearly:

I still think there ought to be some other way that ‘feels’ nicer. I mean the use of “short arrays” (declared as here with [], or zero size as supported by GCC, or the more portable array of size one) seems to be common enough to warrant some support from ctypes, right?

In my ever-continuing attempts to replace Python with Haskell as my language of first choice I’ve finally managed to dip a toe in the XML/HTML sea. I decided to use the Haskell XML Toolkit (HXT) even though it’s not packaged for Debian (something I might look into doing one day). HXT depends on tagsoup which also isn’t packaged for Debian. Both packages install painlessly thanks to Cabal.

As the title suggests my itch wouldn’t require anything complicated, but whenever I’ve previously looked at any Haskell XML library I’ve always shied away. It all just looks so complicated. It turns out it looks worse than it is, and of course the documentation is poor when it comes to simple, concrete examples with adequate explanations. HXT would surely benefit from documentation at a level similar to what’s available for Parsec. I wish I were equipped to write it.

Anyway, this was my problem. I’ve found an interesting audio webcast. The team behind it has published around 90 episodes already and I’d like to listen to all of them. Unfortunately their RSS feed doesn’t include all episodes so I can’t simply use the trusted hpodder to get all episodes. After manually downloading about 20 of them I thought I’d better write some code to make it less labour-intensive. Here’s the complete script:

The tools are written (mostly) in Haskell and I’ll probably upload them all to Hackage at some point in the future. I’m holding out for a fix to ticket 2097, since a fix for it will allow me to add a third tool which I feel belongs in the package.

Suggestions for improvements are always welcome, patches even more so :-)

Four days ago I uploaded omnicodec to Hackage. It’s a collection of two simple command-line utilities for encoding and decoding data built on the dataenc package. I’m planning on building a Debian package for it as soon as I find the time.

Recently I’ve had reason to write a few tools for Windows. Nothing complicated, and since there was nothing “mission critical” (there hardly ever is for me :-) ) I decided to try to write them in Haskell rather than Python or C/C++. This post contains some sort of recollection of my experience so far.

First off I should probably disclose that I don’t particularly like Windows. Besides feeling more at home on a Unix (Linux) system I find that Windows also upsets me on a very basic level. If I spend a whole day at work in Windows I tend to go home more stressed and irritated than after a day in Linux. Part of this is probably due to my familiarity with Linux. My hope was that the combination of the immense joy of programming Haskell and my loathing of the Windows platform would leave me with something that at least was bearable. It turns out it did.

My setup

Besides installing the pre-built GHC I also installed Visual Studio Express (more on why it was needed later). I don’t like the Visual Studio environment so I installed CMake and jEdit ((ordinarily I would install Vim but I thought I’d try out another editor. Vim on Windows is somewhat “leaky”, i.e. it sometimes shows behaviour that betrays its Unix roots.)). Last but not least I installed Console, which addresses the fact that Windows’ “DOS-box” is an amazing piece of crap.

FFI on Windows

First, GHC comes with a number of Windows-specific modules and they cover an impressive number of the Win32 API functions. Still, the Win32 API is simply huge and one will sooner or later find the need to add to what those modules offer. Second, GHC is able to compile C code by delegating to MinGW, but MinGW doesn’t completely cover the Win32 API either, and if one is unlucky MinGW isn’t enough for the task at hand. Of course that’s exactly what happened to me. So, I installed Visual Studio Express, but since I don’t like VS that much and didn’t want VS solutions in my darcs repository I decided to use CMake to build a library that I then instruct GHC, through a Cabal file, to link with my Haskell code. The end result is that building requires a few extra steps before the familiar configure/build sequence. It works remarkably well really, and I’ll definitely use that strategy again if I need to.

There is a benign warning thrown when linking the final executable though:

It seems that it might be possible to skip the extra steps in the future if Cabal ticket #229 is addressed.

Windows System modules

Admittedly I haven’t been using much of the functionality in these modules, but it happens that the only function I needed had a bug that results in a segmentation fault, sometimes. See GHC ticket #2097 for some details. I guess this confirms my impression of most Haskell hackers being Linux/BSD/Unix users.

Conclusions

I won’t drop Haskell on Linux anytime soon, but I’ve found Haskell on Windows to be possible. As I suspected when I started out, Haskell does make Windows somewhat easier to bear.

I have to say rotonyms seem kind of pointless, but since there is a post with a Python solution I thought I’d make an attempt at a solution in Haskell. Of course I didn’t manage on my own; I received some much needed help from Cale and dmwit on #haskell.

First some rotation functions; as far as I know there’s no built-in rot13 encoding in Haskell:
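A sketch of what such rotation functions could look like (these names are mine, not necessarily the ones from the original post):

```haskell
import Data.Char (chr, isAsciiLower, isAsciiUpper, ord)

-- Rotate a letter n steps through its alphabet, leaving everything
-- else untouched.
rotChar :: Int -> Char -> Char
rotChar n c
  | isAsciiLower c = rot 'a'
  | isAsciiUpper c = rot 'A'
  | otherwise      = c
  where rot a = chr (ord a + (ord c - ord a + n) `mod` 26)

-- rot13 is its own inverse: rot13 . rot13 == id on letters.
rot13 :: String -> String
rot13 = map (rotChar 13)
```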

Now I’m ready for the meat of the matter, reading the word list, pick out the rotonyms and finally print them. Following Will’s cue I represent the word list as a set of strings (importing Data.Set as S), this also eliminates duplicates.

I just began listening to episode 95 of pauldotcom and was glad to hear that they enjoyed my email. Here’s the complete email I sent them:

Well, something must have changed since I last communicated with you (see http://therning.org/magnus/archives/257). I’m not sure what though. I heard you when you were on the Linux Reality audio cast and thought I’d check you out again, just to see what you were up to. Well, episode 92 (both parts) was bloody brilliant, episode 93 was good too, and now I’m halfway into episode 94. I have no recollection of the earlier episodes being this organised and good. At some point when I wasn’t listening you must have learnt to rock!

I enjoy the tech segment. The amount of banter is down and the episodes move along a lot more than I remember. No offence to Twitchy, but I’m not sad he isn’t as involved any more, you know, Kramer is brilliant but Seinfeld just wouldn’t be a good show if he were in each and every scene. Twitchy has more of a “celebrity guest” personality… The only criticism I have, and this is pushing it I know; given my walk to work I’d prefer each episode to be around 60 minutes, rather than 80-90 ;-)

Keep it up!

/M

PS I’m planning on posting this email on my blog. I’ll put any reply from you on there as well.

Reading my email on the show sure beats any reply they could have sent by email :-) At some point I have to go back and check out the other podcast I stopped listening to…

I just received an email from KLM UK. I have probably given them my email address and thereby my consent to them sending me “informative emails”. Still, I think their words sound a bit hollow when the link to unsubscribe takes me to doubleclick:

If you do not wish to receive future communication from us click here to unsubscribe. KLM is firmly committed to respecting your privacy. We don’t share your information with any third party without your consent.

It took me a round on haskell-cafe before I found out that GHC needs to be told to use the foreign function interface, with -ffi or -fffi:

% ghc -c -fffi Foo.hs

Linking is a snap after that:

% ghc -o foo foo.o Foo.o
% ./foo
4

It’s also possible to build and link it all in one go:

% ghc --make -fffi -o foo foo.c Foo.hs

Now, that’s pretty nice, however it’d be even nicer to use cabal to do the building. At the same time I decided to put c2hs to use. It seemed to be a lot easier than having to create the import statements manually. I ended up with the following:

csrc/foo.h:

#ifndef _FOO_H_
#define _FOO_H_
int foo(int);
#endif

csrc/foo.c:

#include "foo.h"

int
foo(int i)
{
return i * i;
}

I couldn’t get cabal to accept Foo.chs as the file containing the Main module in my project. So I ended up putting all the relevant code in Foo and then have a dummy Main.

The cabal file is rather straightforward. It took me a round on haskell-cafe to find out how to let the compiler know that I need the foreign function interface without putting compiler directives in the source file.
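A sketch of what such a cabal file might look like (names and paths are assumptions); the extensions field is what tells the compiler about the FFI without any directives in the source:

```
-- foo.cabal (sketch)
name:           foo
version:        0.1
build-type:     Simple

executable foo
  main-is:        Main.hs
  other-modules:  Foo
  c-sources:      csrc/foo.c
  include-dirs:   csrc
  extensions:     ForeignFunctionInterface
  build-depends:  base
```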