Archive for March 2011

In my last post I counted some words using F#, which turned out to need just a single, simple line of code. When I’d done the same thing before in C#, I had iterated over the words and kept a count as I went, which is a typically imperative approach. So I wondered whether you could apply the functional approach in C#, perhaps using LINQ. It turns out you can.
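For comparison, the imperative approach described above might look something like the following. This is only a sketch and not the code from the earlier post: the class name `ImperativeWordCount` and the method `Count` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not taken from the original article.
static class ImperativeWordCount
{
    // Walk the words once, keeping a running count for each distinct word.
    public static Dictionary<string, int> Count(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (string word in words)
        {
            int current;
            // TryGetValue leaves current at 0 when the word has not been seen yet.
            counts.TryGetValue(word, out current);
            counts[word] = current + 1;
        }
        return counts;
    }
}
```

Calling `ImperativeWordCount.Count(new[] { "the", "cat", "the" })` gives a dictionary in which "the" maps to 2 and "cat" maps to 1. The state (the dictionary) is mutated as the loop runs, which is exactly the part the LINQ version avoids.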

Firstly, it’s helpful to have some words to count. Here’s a simple approach:

string test = "The cat sat on the mat.";
string[] words = test.Split(' ');

(In F# I populated a list of words directly, so there’s an extra line of C# here – largely because I started with my previous C# code.) Right, now to the counting in a single statement:

var result = from word in words
             let strippedWord = StripPunctuation(word).ToLower()
             where strippedWord.Length > 0
             group word by strippedWord into grouped
             select new { Word = grouped.Key, Count = grouped.Count() };
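The query syntax above is shorthand for a chain of LINQ extension methods, which makes the pipeline shape a little more obvious. The sketch below shows one roughly equivalent chain. The class name `FluentWordCount` is hypothetical, and the `StripPunctuation` here is a simplified stand-in (it trims every punctuation character from both ends), so it is an assumption rather than the helper used in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper class, for illustration only.
static class FluentWordCount
{
    // Stand-in helper (an assumption): removes all leading and trailing punctuation.
    static string StripPunctuation(string word)
    {
        int start = 0;
        int end = word.Length;
        while (start < end && char.IsPunctuation(word[start])) start++;
        while (end > start && char.IsPunctuation(word[end - 1])) end--;
        return word.Substring(start, end - start);
    }

    // Normalise each word, drop empties, group identical words, count each group.
    public static IEnumerable<KeyValuePair<string, int>> Count(IEnumerable<string> words)
    {
        return words
            .Select(word => StripPunctuation(word).ToLower())
            .Where(stripped => stripped.Length > 0)
            .GroupBy(stripped => stripped)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()));
    }
}
```

One small difference from the query: this version groups the already-normalised strings rather than the original words, which gives the same keys and counts. It returns `KeyValuePair` values instead of an anonymous type so the result can leave the method with a static type.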

You may have noticed a call to StripPunctuation – a utility function I had in my previous C# code. Here it is (declared static as I was running it in a console application):

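private static string StripPunctuation(string word)
{
    string result = word;

    if (result.Length > 0)
    {
        // If the word starts with punctuation, trim that character from the start
        // (TrimStart removes every leading occurrence of it).
        if (char.IsPunctuation(result[0]))
        {
            result = result.TrimStart(result[0]);
        }

        // The word may now be empty, so check again before looking at the end.
        if (result.Length > 0)
        {
            if (char.IsPunctuation(result[result.Length - 1]))
            {
                result = result.TrimEnd(result[result.Length - 1]);
            }
        }
    }

    return result;
}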

And now, with a little sprinkling of dynamic capability, outputting the results to the console:

foreach (dynamic entry in result)
{
    Console.WriteLine("{0}\t{1}", entry.Word, entry.Count);
}
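As an aside, dynamic is not strictly required here, assuming the loop lives in the same method as the query: var lets the compiler infer the anonymous type, so Word and Count are checked at compile time. A minimal sketch follows. The class name `TypedOutput` and the method `WriteCounts` are hypothetical, and the punctuation stripping is left out to keep it short.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, for illustration only.
static class TypedOutput
{
    public static void WriteCounts(string[] words, TextWriter output)
    {
        var result = from word in words
                     group word by word.ToLower() into grouped
                     select new { Word = grouped.Key, Count = grouped.Count() };

        // var picks up the anonymous type, so a typo such as entry.Cuont
        // would be a compile error rather than a runtime failure.
        foreach (var entry in result)
        {
            output.WriteLine("{0}\t{1}", entry.Word, entry.Count);
        }
    }
}
```

For example, `TypedOutput.WriteCounts("The cat sat on the mat".Split(' '), Console.Out);` writes one tab-separated line per distinct word. dynamic becomes more useful once the anonymous objects have to cross a method boundary, where the compiler can no longer name their type.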

So it is possible to apply the more functional approach courtesy of LINQ, although there’s still more code than I had in F#. The C# is doing a couple of extra things (it strips out punctuation and is case insensitive) – but the point isn’t really the comparison between the two examples so much as the fact that grasping some functional concepts can change the way you write C# – which is a good reason to learn some F#.

For a recent MSDN Flash article I wrote some simple code to calculate word frequency in C#. As I get to grips with F#, I’m learning that the most rewarding but also the most difficult aspect is thinking in a more functional way. To count words in an imperative style (as I did in my C# example) I would iterate through a collection of words and keep a running count. And, of course, you could write code in F# to do that. But what would be the point? How about approaching it in a different fashion?

So, with those questions in mind, I fired up Visual Studio and went about trying to bend my brain into a more F#-like shape. One of the things I like about F# is F# Interactive – a REPL which makes trying out and learning F# (as well as prototyping) easy – so that was where I started. The first thing I needed to do was create a list of words (since at this stage I’m concerned simply with calculating frequency, not with reading files or strings). It’s fairly simple to do that in F#:

let words = ["the"; "cat"; "sat"; "on"; "the"; "mat"];;

(The double semicolons signal the end of a statement to F# Interactive.) After reading a bit about processing sequences in F#, I spotted that there is a function to count the elements of a sequence by key – it can easily be used against the whole list like this:

let count = words |> Seq.countBy(fun x -> x);;

The countBy function takes a function to generate a key – in this case we can use each individual word. To see if that has worked, we can print out the contents of the result: