JSON Processing in F#

Intro

The F# programming language is really fun to work with. It lets you improve your code almost infinitely, not only in terms of performance but also readability. Having finished reading McConnell’s Code Complete, I decided to add some declarativeness to the JSON processing code I developed for my little pet project Linq2vk. And since JSON is ubiquitous these days, I hope the technique described below might be of use to others as well.

Data

Most modern online services provide a public API with JSON as one of the available data formats, and the data we get back from those services is usually either a huge JavaScript object or an array. Let me leave objects for a later investigation and concentrate on arrays like the following one:

[123, "Maxim", "Moiseev", "http://bitbucket.org/moiseev"]

This is an array (or list, if you will) of simple values, which we know contains an integer Id followed by three strings: first name, last name and homepage URL. We are not going to parse the JSON text into a .NET list of tokens ourselves; the great Json.NET library is better suited to that task. Our target is a function of type JToken[] -> Person, where JToken is a single JSON token provided by Json.NET (so JToken[] is an array of them) and Person is a simple user-defined .NET class. In other words, we are going to deserialize a JSON array into an instance of a .NET class.
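To make the input concrete, here is a minimal sketch of obtaining such a token array with Json.NET. It assumes an F# script run with `dotnet fsi` (hence the `#r "nuget: ..."` line); the names `json` and `tokens` are mine and purely illustrative.

```fsharp
#r "nuget: Newtonsoft.Json"
open Newtonsoft.Json.Linq

// The sample array from above as a raw JSON string.
let json = """[123, "Maxim", "Moiseev", "http://bitbucket.org/moiseev"]"""

// JArray.Parse builds the token tree; a JArray enumerates its child JTokens,
// so Seq.toArray gives us the JToken[] the rest of the article works with.
let tokens : JToken[] = JArray.Parse(json) |> Seq.toArray

printfn "%d tokens, the first one is of type %A" tokens.Length tokens.[0].Type
```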

Initial approach (naïve)

There are (as always) several ways of achieving this. The simplest one, which I have used until now, is a direct conversion of each array element by its index (for simplicity I won’t define the Person class here and will use a tuple instead).
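A minimal sketch of such a direct conversion might look like this. It assumes a `dotnet fsi` script and uses Json.NET's `ToObject<'T>()` for the conversions; the function name `toPerson` is an assumption of mine, not necessarily what Linq2vk uses.

```fsharp
#r "nuget: Newtonsoft.Json"
open Newtonsoft.Json.Linq

// Convert each position by hand. Any missing or mistyped element
// surfaces as an exception at the point of access.
let toPerson (tokens: JToken[]) =
    (tokens.[0].ToObject<int>(),
     tokens.[1].ToObject<string>(),
     tokens.[2].ToObject<string>(),
     tokens.[3].ToObject<string>())

let tokens =
    JArray.Parse("""[123, "Maxim", "Moiseev", "http://bitbucket.org/moiseev"]""")
    |> Seq.toArray

let person = toPerson tokens
printfn "%A" person
```

It works, but the indices, the conversions and the error handling (or lack of it) are all tangled together in one expression.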

Moving the handling of the optional field into a separate function makes things much better, but the field processing code still does not look unified: normal fields are accessed directly while the optional one goes through a separate function, and the conversion function is called as part of a pipe in the former case but passed as a parameter in the latter. Still not good enough.
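To illustrate the asymmetry, here is a sketch of what such an intermediate version could look like. The helper `optionalField`, the converters `asInt` and `asString`, and the default value "n/a" are all hypothetical names and values chosen for this example; a `dotnet fsi` script is assumed.

```fsharp
#r "nuget: Newtonsoft.Json"
open Newtonsoft.Json.Linq

let asInt (token: JToken) = token.ToObject<int>()
let asString (token: JToken) = token.ToObject<string>()

// Hypothetical helper: convert the token at `index`,
// or fall back to `defaultValue` when the array is too short.
let optionalField (tokens: JToken[]) (index: int) (convert: JToken -> 'a) (defaultValue: 'a) : 'a =
    if index < tokens.Length then convert tokens.[index] else defaultValue

let toPerson (tokens: JToken[]) =
    (tokens.[0] |> asInt,                     // conversion applied through a pipe
     tokens.[1] |> asString,
     tokens.[2] |> asString,
     optionalField tokens 3 asString "n/a")   // conversion passed as an argument

// The homepage URL is missing from this input.
let shortInput = JArray.Parse("""[123, "Maxim", "Moiseev"]""") |> Seq.toArray
let person = toPerson shortInput
printfn "%A" person
```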

Better approach

Here comes the idea to use all the knowledge from multiple infinitely long and smart articles about monads and parser combinators. Luckily, F# gives us computation expressions, a feature that is all about those monads.

A parser is essentially a function that takes an array of tokens as its input and returns a list of pairs, each consisting of a value of some type 'a and the input array. On success the result is a singleton list; on failure it is an empty list. (A list is used here for simplicity; we could instead define a discriminated union with two cases, or use the standard Option<'a> type.)
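The definition above can be sketched in code roughly as follows. This is one possible shape under my own assumptions, not necessarily the original implementation: the names `Parser`, `ParserBuilder`, `parser`, `field`, `asInt` and `asString` are illustrative, the `field` primitive addresses tokens by index and passes the input through unchanged, and a `dotnet fsi` script is assumed.

```fsharp
#r "nuget: Newtonsoft.Json"
open Newtonsoft.Json.Linq

// A parser maps the input tokens to [] (failure) or [(value, input)] (success).
type Parser<'a> = JToken[] -> ('a * JToken[]) list

// Builder that enables the `parser { ... }` computation expression syntax.
type ParserBuilder() =
    // Succeed with `x` without looking at the input.
    member _.Return(x: 'a) : Parser<'a> = fun input -> [ (x, input) ]
    // Run `p`; on success feed its value to `f`, on failure stop with [].
    member _.Bind(p: Parser<'a>, f: 'a -> Parser<'b>) : Parser<'b> =
        fun input ->
            match p input with
            | (value, rest) :: _ -> f value rest
            | [] -> []

let parser = ParserBuilder()

// Hypothetical primitive: convert the token at `index`;
// fail if the token is missing or the conversion throws.
let field (index: int) (convert: JToken -> 'a) : Parser<'a> =
    fun tokens ->
        if index < tokens.Length then
            try [ (convert tokens.[index], tokens) ] with _ -> []
        else
            []

let asInt (token: JToken) = token.ToObject<int>()
let asString (token: JToken) = token.ToObject<string>()

// Every field is now read the same way.
let person : Parser<int * string * string * string> =
    parser {
        let! personId = field 0 asInt
        let! firstName = field 1 asString
        let! lastName = field 2 asString
        let! url = field 3 asString
        return (personId, firstName, lastName, url)
    }

let input =
    JArray.Parse("""[123, "Maxim", "Moiseev", "http://bitbucket.org/moiseev"]""")
    |> Seq.toArray

match person input with
| (p, _) :: _ -> printfn "parsed %A" p
| [] -> printfn "parsing failed"
```

With this sketch, an input that lacks the fourth element makes the whole `person` parser fail with an empty list, because every `let!` stops at the first failure.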

Finally we are back at the point where we started, and again we are missing support for optional fields. This should not be a big deal this time, however: since every piece of data processing is a parser, what we need is just another parser, or more precisely a parser combinator (a thing that takes a parser as its input and returns a new parser) that will return a default value if the initial parser fails.
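A sketch of such a defaulting combinator, written as the infix operator `<|>`, could look like the following. The surrounding framework is repeated in compact form so the snippet runs on its own; as before, `field`, `asInt`, `asString` and the default value "n/a" are assumptions made for this example, and a `dotnet fsi` script is assumed.

```fsharp
#r "nuget: Newtonsoft.Json"
open Newtonsoft.Json.Linq

type Parser<'a> = JToken[] -> ('a * JToken[]) list

type ParserBuilder() =
    member _.Return(x: 'a) : Parser<'a> = fun input -> [ (x, input) ]
    member _.Bind(p: Parser<'a>, f: 'a -> Parser<'b>) : Parser<'b> =
        fun input ->
            match p input with
            | (value, rest) :: _ -> f value rest
            | [] -> []

let parser = ParserBuilder()

// Hypothetical primitive: convert the token at `index`, fail if it is missing.
let field (index: int) (convert: JToken -> 'a) : Parser<'a> =
    fun tokens ->
        if index < tokens.Length then
            try [ (convert tokens.[index], tokens) ] with _ -> []
        else
            []

let asInt (token: JToken) = token.ToObject<int>()
let asString (token: JToken) = token.ToObject<string>()

// The combinator: if `p` fails, succeed with `defaultValue` instead.
let (<|>) (p: Parser<'a>) (defaultValue: 'a) : Parser<'a> =
    fun input ->
        match p input with
        | [] -> [ (defaultValue, input) ]
        | success -> success

let person : Parser<int * string * string * string> =
    parser {
        let! personId = field 0 asInt
        let! firstName = field 1 asString
        let! lastName = field 2 asString
        // Function application binds tighter than the infix operator.
        let! url = field 3 asString <|> "n/a"
        return (personId, firstName, lastName, url)
    }

let shortInput = JArray.Parse("""[123, "Maxim", "Moiseev"]""") |> Seq.toArray
let result = person shortInput |> List.map fst
printfn "%A" result
```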

Note that, thanks to operator precedence, we don’t need to use parentheses.

A little more fun

Let’s imagine that parsing of a field sometimes depends on some external condition. For example, if the id field is greater than 1000, then the URL will not be present and we should use some default value instead.
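One way to sketch this is a second operator, `<?>`, that runs a parser only when a condition holds and fails otherwise, combined with the defaulting operator `<|>`. Both definitions below are my own assumptions about how such operators could be written, and the framework is again repeated compactly so the snippet runs on its own in a `dotnet fsi` script.

```fsharp
#r "nuget: Newtonsoft.Json"
open Newtonsoft.Json.Linq

type Parser<'a> = JToken[] -> ('a * JToken[]) list

type ParserBuilder() =
    member _.Return(x: 'a) : Parser<'a> = fun input -> [ (x, input) ]
    member _.Bind(p: Parser<'a>, f: 'a -> Parser<'b>) : Parser<'b> =
        fun input ->
            match p input with
            | (value, rest) :: _ -> f value rest
            | [] -> []

let parser = ParserBuilder()

// Hypothetical primitive: convert the token at `index`, fail if it is missing.
let field (index: int) (convert: JToken -> 'a) : Parser<'a> =
    fun tokens ->
        if index < tokens.Length then
            try [ (convert tokens.[index], tokens) ] with _ -> []
        else
            []

let asInt (token: JToken) = token.ToObject<int>()
let asString (token: JToken) = token.ToObject<string>()

// If `p` fails, succeed with `defaultValue` instead.
let (<|>) (p: Parser<'a>) (defaultValue: 'a) : Parser<'a> =
    fun input ->
        match p input with
        | [] -> [ (defaultValue, input) ]
        | success -> success

// Run `p` only when `condition` holds; otherwise fail without reading anything.
let (<?>) (condition: bool) (p: Parser<'a>) : Parser<'a> =
    if condition then p else (fun _ -> [])

let person : Parser<int * string * string * string> =
    parser {
        let! personId = field 0 asInt
        let! firstName = field 1 asString
        let! lastName = field 2 asString
        // Both operators share the same precedence and associate to the left,
        // so this reads as ((condition <?> parser) <|> default).
        let! url = (personId <= 1000) <?> field 3 asString <|> "n/a"
        return (personId, firstName, lastName, url)
    }

let parse (json: string) =
    JArray.Parse(json) |> Seq.toArray |> person |> List.map fst

let small = parse """[123, "Maxim", "Moiseev", "http://bitbucket.org/moiseev"]"""
let large = parse """[1234, "Maxim", "Moiseev", "http://bitbucket.org/moiseev"]"""
printfn "%A" small
printfn "%A" large
```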

See how using <?> and <|> in the same statement resembles the use of the standard ternary operator ?:.

Conclusion

In the end we have a declarative way of processing JSON tokens and a simple parser framework capable of performing any kind of parsing task that may occur. As a next step we can implement a similar framework for processing JSON objects, with the only difference being the usage of field names instead of indices.