Saturday, August 14, 2010

Reading F# Projects, Part I: The Common Knowledge

While programming my data mining library, I occasionally refer to the design of other libraries; the data mining toolkits WEKA and MALLET, for example, are well designed. I also refer to the design of F# libraries. The examples in F# books mainly show how to program in the small, i.e. how to manipulate lists or sequences and how to use individual language features. Design in F#, however, is usually covered only in general terms.

Thus, it is beneficial for new F# users to read well-designed F# libraries. This series will present the design of the following source code (the list is tentative and may change):

* some parts of F# Core and PowerPack.

* F# math providers -- service design pattern and PInvoke in F#.

* TrueSkill -- how a specific data mining model is implemented.

* FParsec -- the best example of computation expression/monad.

Before turning to any concrete project, we review the common issues, tips and concepts involved in designing a project in F#.

I will add more topics here as we go through different F# programs.

The interface/signature file .fsi

In Java or C#, the interface and the implementation are put in one file. In C/C++, the interface (.h) and the implementation (.c/.cpp) are separated. Both designs have their advantages.

For F#, I think the latter is better. In the .fsi file, all the public functions and types are commented carefully. These comments can be used to generate documentation for the program, or by IDE IntelliSense.

The actual implementation is in the .fs file. In this file, comments appear only where the programmer who implements the code actually needs them, e.g. a formula or a reference to a page number in a book, because the programmer who writes a function already knows what every parameter means. If interface-level comments of that kind were also put into the implementation, the program would look less succinct.

So usually the programmer writes the implementation in the .fs file first. When the implementation is stable or ready to ship, the corresponding .fsi interface file is generated and then carefully commented by hand.
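The split described above can be sketched as follows. The module name Stats and its two functions are hypothetical, chosen only for illustration. The signature file is shown inside a block comment so that the whole listing stays a single runnable script; in a real project it would be a separate Stats.fsi file, and Stats.fs would start with a top-level "module Stats" line instead of "module Stats =".

```fsharp
(* ---- Stats.fsi (hypothetical signature file) ----

module Stats

/// Computes the arithmetic mean of a non-empty list of samples.
/// Raises System.ArgumentException when the list is empty.
val mean : xs: float list -> float

/// Computes the unbiased sample variance (divides by n - 1).
/// Raises System.ArgumentException when fewer than two samples are given.
val variance : xs: float list -> float
*)

// ---- Stats.fs (hypothetical implementation file) ----
// Only the comments the implementer needs are kept here.
module Stats =
    let mean (xs: float list) =
        if List.isEmpty xs then invalidArg "xs" "empty list"
        List.sum xs / float (List.length xs)

    let variance (xs: float list) =
        if List.length xs < 2 then invalidArg "xs" "need at least two samples"
        let m = mean xs
        // Bessel's correction: divide by n - 1, not n.
        List.sumBy (fun x -> (x - m) * (x - m)) xs / float (List.length xs - 1)

printfn "mean = %f" (Stats.mean [1.0; 2.0; 3.0])
```

The documentation comments live with the signatures, which is where a documentation generator or IntelliSense would pick them up, while the implementation carries a single note about the formula.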

Namespaces and Modules

The latest F# version requires that an F# source file (.fs) in a multi-file project start with a namespace or module declaration, which means that all the following code is placed under it. (A source file may also contain multiple namespaces.)

Modules are similar to namespaces. The difference is that a module can contain functions and values directly, whereas a namespace contains only types and modules; you can think of a module as a group of functions placed inside a namespace.

The difference in usage between namespaces and modules, in my opinion, is that a namespace is the broader concept. E.g. the Microsoft.FSharp.Collections namespace contains many standard data structures, and the functions that manipulate each data structure are put into a module.

You can also view a module as a class containing only static members.
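A minimal sketch of this layout, with hypothetical names (Vector, VectorOps, MyCompany.Geometry). The namespace line appears only as a comment so that the snippet can run as a script; in a compiled library the type and the module would both sit under that namespace.

```fsharp
// In a compiled .fs file this would start with a namespace, e.g.
//     namespace MyCompany.Geometry      (hypothetical name)

/// A plain data type; in a library it would live directly in the namespace.
type Vector = { X: float; Y: float }

/// Functions that manipulate Vector, grouped the way the List module
/// groups the functions that manipulate lists.
module VectorOps =
    let zero = { X = 0.0; Y = 0.0 }
    let add a b = { X = a.X + b.X; Y = a.Y + b.Y }
    let scale k v = { X = k * v.X; Y = k * v.Y }
    let dot a b = a.X * b.X + a.Y * b.Y

// Callers qualify each function with the module name, much like a static method.
let v = VectorOps.add { X = 1.0; Y = 2.0 } (VectorOps.scale 2.0 { X = 1.0; Y = 1.0 })
printfn "%A" v
```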

Classes and Interfaces

F# interfaces are just like interfaces in C#/Java. The interface is a universal concept that occurs in different programming paradigms; even in the purely functional language Haskell, type classes are similar to interfaces.

F# classes differ from classes in other OO languages. F# encourages an immutable style for classes: a class has one primary constructor, and once the object is constructed its values remain unchanged. You can still have mutable fields in a class, but they must be declared explicitly. These constraints on F# classes make programs safer, although they sometimes cause inconvenience.
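As a small illustration of these points, here is a sketch with hypothetical types (IShape, Circle, Counter): an interface, a class whose state is fixed by its primary constructor, and a class that opts in to mutable state.

```fsharp
/// A hypothetical interface, used like a C#/Java interface.
type IShape =
    abstract Area : float
    abstract Name : string

/// The primary constructor takes the parameters; the values are fixed afterwards.
type Circle(radius: float) =
    do if radius < 0.0 then invalidArg "radius" "must be non-negative"
    member this.Radius = radius
    interface IShape with
        member this.Area = System.Math.PI * radius * radius
        member this.Name = "circle"

/// Mutable state is possible, but it has to be requested explicitly.
type Counter() =
    let mutable count = 0
    member this.Count = count
    member this.Increment() = count <- count + 1

// F# implements interfaces explicitly, so the object is upcast before use.
let shape = Circle(2.0) :> IShape
let counter = Counter()
counter.Increment()
counter.Increment()
printfn "%s area = %f, count = %d" shape.Name shape.Area counter.Count
```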

Besides classes, F# also has records, enums and discriminated unions.
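For readers who have not met these three kinds of types, a brief sketch follows. All type names (Point, Color, Figure) and the area function are hypothetical.

```fsharp
/// Record: a named, immutable group of fields with structural equality.
type Point = { X: int; Y: int }

/// Enum: named integer constants, compatible with .NET enums.
type Color =
    | Red = 0
    | Green = 1

/// Discriminated union: a value is exactly one of the listed cases.
type Figure =
    | Disk of float            // radius
    | Rect of float * float    // width, height

let area figure =
    match figure with
    | Disk r -> System.Math.PI * r * r
    | Rect (w, h) -> w * h

let p = { X = 1; Y = 2 }
let moved = { p with X = 10 }   // copy-and-update; p itself is unchanged
let green = enum<Color> 1
printfn "%A %A %f" moved green (area (Rect (2.0, 3.0)))
```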

Extension to existing classes and modules

In F#, you can extend an existing .NET class by using the “type .. with ..” construct:
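A sketch of what such an extension typically looks like, assuming the common form that wraps the BeginGetResponse/EndGetResponse pair with Async.FromBeginEnd. F#'s core library already ships a member with this name, so treat the definition as an illustration of the construct rather than something to add to real code. The second extension, WordCount on System.String, is a hypothetical member added only so that the construct can be tried without a network connection.

```fsharp
open System.Net

// Adds an asynchronous member to the existing .NET class WebRequest by
// wrapping its Begin/End method pair into an F# Async value.
type System.Net.WebRequest with
    member req.AsyncGetResponse() : Async<WebResponse> =
        Async.FromBeginEnd(req.BeginGetResponse, req.EndGetResponse)

// The same construct on another .NET class (hypothetical member).
type System.String with
    member s.WordCount =
        s.Split([| ' ' |], System.StringSplitOptions.RemoveEmptyEntries).Length

// The new member is called as if it had always been part of the class.
printfn "%d" ("type extensions in F#".WordCount)
```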

In this piece of code, System.Net.WebRequest is an existing class in .NET. We add a member function AsyncGetResponse to this class. This is like lightweight inheritance that saves us from creating a new class.

The attributes

In .NET, attributes associate declarative information with the code. They usually occur in production or formal code. Here is a tutorial.

Here I list some commonly used attributes in F#:

* [<CompiledName("xxx")>]

* [<AutoOpen>]

* [<RequireQualifiedAccess>]

For the full list and the exact meanings, no resource is better than the F# Specification, Chapter 16, Special Attributes and Types.
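A short sketch of the three attributes named above in use. The module and function names (Helpers, Config, Text, square, toUpper) are hypothetical.

```fsharp
/// AutoOpen: the contents become available without naming this module,
/// as soon as the enclosing namespace or module is in scope.
[<AutoOpen>]
module Helpers =
    let square x = x * x

/// RequireQualifiedAccess: callers must write Config.defaultPort;
/// opening the module to use the short name is rejected.
[<RequireQualifiedAccess>]
module Config =
    let defaultPort = 8080

module Text =
    /// CompiledName: F# code calls Text.toUpper, while C# and other .NET
    /// languages see the method under the name ToUpper.
    [<CompiledName("ToUpper")>]
    let toUpper (s: string) = s.ToUpperInvariant()

printfn "%d %d %s" (square 4) Config.defaultPort (Text.toUpper "f#")
```

CompiledName is what lets a library keep camelCase names for F# callers while exposing PascalCase names to the rest of .NET.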