It's been a slow writing month, as you can tell by the number of recent submissions. Hopefully, I remember how to do this.

Other than raising a child|1|, taking a long-overdue break from programming in my down time and dealing with various other mundane bullshit, I've been working on a small project in Haskell that'll make coordination for my wife and I marginally simpler. It's a minimal, web-based running shopping list, and the idea is to use it to forget fewer things during the weekly trip as well as make occasional after-work quick grocery stops possible. I could easily have knocked this out in a day or three using Python, but I figured I'd take some time to put my Haskell skills through more rigorous testing.

Last time, I worked with Happstack, which is the best looking web framework available in the language, as far as I'm concerned. This time, I just pulled out warp, the wai-based server, and went frameworkless.

Using a framework binds you to it. Yesod especially seems to have a stick up its ass about using something other than Haskell to build pieces of a project. You may recall that I theorized about keeping front and back-ends entirely separate a little while ago. Well, I'm still on that. So as much as framework maintainers want me to use or Hamlet or whatever, doing so would be counter-productive for me. Yesod actually goes one further and has a JS-generation language too. The idea is supposed to be that all of your code then goes through the same rigorous type checks and optimizations that the ML family is famous for. In practice, what it means is that you're more-or-less forced to use jQuery|2|, and it means that all of your project is in Haskell|3| and it means that your server-side code is doing everything|4|. I think I'll stick to the manual trio for the front-end and just let Haskell do the heavy database lifting.

The easiest way to do that seems to be to basically keep the model and a controller on server-side, and the view and a controller on the client side. Happstack is the only one of the main three Haskell web frameworks to make something like that reasonably simple, but I already got some time in with it, and the Warp server claims to bench much better.

Firstly, don't try to do bottom-up design here. Or, at least, slow down with it until you get fairly good with the language, fairly familiar with the documentation/conventions, and fairly good at understanding how GHCi works. The techniques of wishful thinking and building the language up towards your problem are still applicable, but Haskell has a way of focusing pretty relentlessly on types. Even though it infers a lot of type information without your help, the most common pieces of advice I get from other Haskellers is to

work out what the type of my function is going to be before writing the function itself and

explicitly write it down above the function

I've found that this does help, if for no other reason than thinking through your operations' types will highlight the pieces that you didn't think through well enough, and it'll remind you what the next piece needs to interface with. It's just not what I'm used to doing|5|.

Secondly, don't trust GHCi completely. As a Lisper, this sort of blew my mind because the Lisp REPL is a running Lisp image. If something works in the SLIME prompt, it's a fairly good bet that you can just copy it out into your .lisp file and have it work the same way|6|. GHCi isn't exactly that. Its type reflection directive does some odd things when inferring types|7|, and it fudges IO-related operations in general|8|. For simple stuff that doesn't do IO, you can probably still get away with Lisp-style exploratory REPLing, but it doesn't seem to be a good general strategy. For GHCi, at least. For all I know, HUGS is better at this sort of thing, but I haven't used it extensively yet, despite it being the only Haskell REPL on offer for ARM at the moment.

Thirdly, it's possible|9| to apply the venerable technique of debugging by printf. At first glance, it seems like it wouldn't be, since doing any output from a function pollutes its type with IO, which then cascades to all of the callers of that function and causes you to rewrite half the project if you want to add some output in one place. Oh, and then rewrite it back once you're done looking at debugging output. There's a library called Debug.Trace that lets you pull off something similar enough. It highlights very clearly that this isn't meant for production use though; what you're supposed to do, near as I can tell, is import qualified Debug.Trace as Debug, then sprinkle Debug.trace "A trace message goes here..." $ {{the thing you want trace output for} throughout your code, and run M-x query-replace-regexp Debug.trace ".*?" later to replace these calls with nothing. It's possible that there's an automatic way of removing them, but I didn't bother finding it for a project this size.

Basically, case statements. routes at the top there dispatches on pathInfo req, which returns the URI minus GET/# parameters and split on /. You then use the standard Haskell pattern matching facilities to figure out what the user has requested and what to do about it.

Lets take a close-up look at the type signature of routes before moving on.

routes :: DB -> SessionStore -> Request -> RES

That should look suspiciously minimal to anyone who's actually done web development in Haskell before, and it is. The why has to do with this import

import TypeSynonyms

I have no idea whether this is good Haskelling practice or not, but I ended up defining descriptive synonyms for a bunch of the complex types I needed to work with. Then I realized that I need to refer to them in more than one module and it would be better to centralize them rather than having copies of the definitions in each relevant file.

So DB is shorthand for the database class we're using|10|, RES is shorthand for the HTTP response IO type, and SessionStore/LookupFN/InsertFN are session vault and related lookup/insertion functions respectively. I also defined shorthand for a ByteString association table, since that's how parsed request parameters are stored and they get passed around more than once.

Ok, back to routes.

If you read through that file, you'll notice that a lot of validation and lookup logic in with the routing rather than in the specific handlers that might need them. That sort of happened accidentally, and again, I'm not sure it's the best way to organize these files, but it does have two big advantages. First, because validation and error routing happens beforehand, the handler functions themselves can be exclusively concerned with the successful case. By the time an item-related function is called for example, it's guaranteed to have a request from an existing, logged-in user relating to an existing item. So the handler doesn't need to check for any of those internally. Second, we centralize the validation and lookups. If we expected the handlers themselves to deal with it, then each of the item-related handlers, for example would need to check for an authenticated user, and they'd each have to check that the item they're asked to operate on actually exists. By doing it beforehand, we only do that check once.

I mentioned that this is new to me. That's because the various Python/Ruby frameworks I'm familiar with all represent a routing table as some sort of ordered associative list of regexes and handlers, while all of the Common Lisp/Clojure servers I'm familiar with give you something along the lines of define-handler, which takes a name, handler body and routing URI, removing the need for an explicit central routing structure at all. As I recall, Smalltalk works something like the Lisps and Erlang does something similar to Python/Ruby. So this is the first time I did any real work as part of handler routing, and it seems like it might be a good approach. In a dynamically typed language, I'd be really worried about not making it obvious enough that a handler function is expecting thoroughly validated input rather than doing that part itself, which would increase the chances of a dev passing in unvalidated input and causing an explosion somewhere. But the type annotations and rigorous static checking take care of that for me here in Haskell-land.

Lets take a look at these simplified handlers we're passing stuff on to.

It's not expecting an account name and item ID to reference by. It's expecting an Account|11| and it's expecting an Item|12|. It does the work of updating the DB, and then sends back an appropriate response.

Really, I could have made one more general function along the lines of editItem, then called it for need, got, and separate handlers for changeComment and changeCount. In fact, that was officially a note to self.

The way it's currently written, the most complex of the item-related handlers is editItem, and that's because it needs to optionally change the comment, count or both depending on what's passed in. This is the price you pay for automatic currying and maximally terse partials; those features don't share space well with optional/keyword/rest arguments. The result is that when you need the latter, you need to represent them as mandatory Maybe args, or as a custom type argument. We've already gone through an example of the first approach. You can see the second if you squint at verifyPass and encryptPass. Specifically, the second argument, defaultParams is of type ScryptParams, which is defined as

and scryptParams is a surprisingly complicated function that validates input and returns Params { logN, r, p, bufLen = 64 }. In Lisp, verifyPass would have an arg line like

(verify-pass incoming stored &key (logN 14) (r 8) (p 1) (bufLen 64))

and start off with some assertions to mimic the validation done in scryptParams. Of course, that's not to say that the Haskell approach is a hack; both approaches have their advantages and disadvantages in practice|13|. In the specific situation I'm dealing with above, since we don't have optionals, it would probably have been better to separate the count and comment changing handlers and let the front-end call the specific one it wants. That was another note to self.

Since we're here, and since I'm the guy who's been going on and on about this, something would be slightly amiss if I failed to note the authentication system, at least in passing. We're using the scrypt algorithm to store and verify passwords|14|. If a password is verified we store the users' name in their session cookie. wai-session encrypts its cookies for security, so this system would actually be a simplistic|15| but secure way of maintaining user account as long as we used it over HTTPS.

Lets see, where were we. Oh, right, all those functions beginning with res in the Handlers and Main modules aren't built-ins. They're defined in a generically named Util module.

The response functions seem fairly self-explanatory. resOk constructs a standard HTTP 200 response, resIxItems takes an IxSet Items and constructs a response by JSON-encoding it, and the error handlers each return a 404 or perhaps 401 with some error message. The resPlaceholder was something I used as a|16| placeholder in the various routes functions while writing the final handlers. I think the only place a call to it still exists is in the favicon handler.

withPostParams and more generally withParams are functions that call functions using the output of serial lookup calls, which would be a pain in the ass to do manually. maybeRead is exactly what it says on the tin; it's a wrapper around Haskell's polymorphic read function wrapped so that it can deal with Maybe String rather than String input.

That's ... really it. I'm struggling to describe these a bit more than I usually do because the type signatures tell you a pretty significant amount of what you need to know about a given function. Not everything, obviously, but you'd be surprised how many times I've cut "foo is self-explanatory" from this write-up.

The last module left is the model; the one that actually takes all this information and stores it in some way.

Before we dive into the code on this one, I want to highlight two things.

First, I initially did my best to separate the model entirely from the rest of the logic. Going so far as to define "external" API functions to call from other modules. That didn't last. Using naked query' and update' caused some type issues that I'm still not entirely clear about, and the code separating the database primitives from other functions was almost tripling the total size of the model. When modularity costs that much, I prefer to chuck it in a bin and resign myself to re-writing most of the application if I need to change out my database engine.

Second, this isn't the first back-end I tried using|17|. Before settling on AcidState, I tried out the haskell-MongoDB interface and [hdbc-sqlite3](http://hackage.haskell.org/package/HDBC-sqlite3-2.3.3.0), neither of which impressed me much|18|. The Mongo interface just plain does not fit with the Haskll way of doing things. It's a massive, loosely structured, JSON-based key/value store, and since my application mainly responds to clients using JSON feeds, I figured that would be a good fit. One problem is that it turns out that aeson|19| has a fundamentally different type architecture than mongodbs BSON, which means that converting between them is ... lets go with "non-trivial". The other big problem is that doing k/v lookups in Haskell is harder than dealing with native Haskell types, which means that the easiest way of using Mongo here would have been to define a type, and then specify how it gets encoded/decoded to both JSON and BSON. Given that I'm pretty used to pymongo and Monger, that's a lot more work than I was expecting. The hdbc interface was slightly better, since relational databases assume some up-front modeling, and slightly worse, since it expected me to write string-template based queries. Both external options required dealing with the actual DB through an external connection, both required a conversion step from associative lists before being converted to JSON, and both seemed to expect me to perform that conversion step.

The temptation is really strong to say "this is self-explanatory", but that's only true if you've also gone through the relevant Happstack Crash Course section. Basically, AcidState is a Haskell-native noSQL data store. You define Haskell types that represent stuff you want to store, and it serializes them to disk through an interface that looks pretty close to the state monad if you squint hard enough.

This is the full definition of the Account type. The ToJSON declaration allows aeson to serialize this type for the front-end, and the Indexable class lets IxSet construct collections of Accounts. IxSet itself is a multi-indexing set implementation that I'm using to store both the accounts and shopping list items for this project.

The deriveSafeCopy call there is actually a TemplateHaskell invocation, and not a regular function|21|. Basically what that means is that this call will be resolved at compile time; in that sense, TemplateHaskell is a very restricted answer to Lisp macros.

Next, note the type of accountPassphrase. If you take a look at the login handler from earlier, you'll notice something a bit odd.

We're storing a ByteString in the database, but converting it to/from the type EncryptedPass. We don't have to do that with other types, but the developers who wrote Haskell's scrypt library didn't bother deriving Typeable for their types. There's quite likely a way for me to derive it manually, but that seems like more trouble in this particular instance. I just wanted to point out that even though we're doing type conversion to fit something into an AcidState DB here, that's not the ideal case, and you typically don't have to.

Finally, note that except for initialDB, all the exports from the Model module are types rather than functions

If you take a look at any query' or update' call in the handlers, you'll note that they work by passing arguments to one of these types. What's actually happening is that AcidState is forward-journaling your requests so that it can fulfill the ACID guarantees. The trouble is that functions aren't inherently serializable in Haskell. So what it expects you to do is define the appropriate database functions, then use makeAcidic to derive the relevant, serializable types.

That about does it for the back end. I was going to go over the client-side code too, but this piece is getting quite long already. I'll likely write a follow-up to show you how I ended up actually calling the JSON-based handlers I describe above, but in the meantime, you'll have to check out goget.js and index.html in the static folder of the github project.

6 - |back| - Modulo the obvious state problems you have from potentially having some intermediary values defined in the image.

7 - |back| - That SO question has an example in the answers' comments; the type EventResult is reported as belonging to the module Data.Acid.Common, but that file actually doesn't exist. What's actually happening is that Common is a hidden module in the AcidState project, and another module is responsible for exporting its symbols. I didn't know this just from looking. The reason it matters is that when you want to make a type signature explicit by importing the relevant module, GHCi will tell you where a given type is defined and not where it's exported. Fun times.

8 - |back| - Which kind of makes sense, because conceptually speaking, a purely functional REPL for a lazy language would more or less have to be implemented in the IO monad.

10 - |back| - There'll be more on that later, obviously. Do note that this is only a reasonable thing to do because we only use one database class for our model; if we used several, we'd need to figure something else out. To be fair though, I'm having a hard time imagining a situation that would call for using several DB classes in a single project.

13 - |back| - In fact, I've been meaning to write a piece comparing the two, I just haven't gotten around to it.

14 - |back| - Which you still shouldn't count as a flat out recommendation, but I am using it, and I do intend to deploy this, so draw what conclusions you like.

15 - |back| - We don't do any kind of throttling on login, aside from the complexity of the scrypt algorithm itself, and we don't check registration requests for automation with recaptcha or similar. I'm not sure how I feel about the first, while the second seems entirely unnecessary for an acount that doesn't allow sending any kind of email, or doing anything other than managing associated data.

20 - |back| - In order to serialize a type, you need to deriveTypeable and SafeCopy. They're both trivial tasks for your own types, assuming you've got TemplateHaskell on, but are non-trivial for types you're including from modules you didn't write. You'll see an example of this when you see how I store encrypted passwords later on. That's the only type I needed to massage myself though; had I used an option other than AcidState, I'd have had to do the same for all of them.