Sunday, 21 July 2013

Alternatives to subtyping, Part 2

Modules instead of classes

Classes in Java serve a number of different purposes. One is to extend existing classes with new code and types, which avoids code duplication and reuses work done by earlier developers. The example we saw in the last blog post (students, graduate students and undergraduate students) is an instance of this. OCaml has classes, too, but we will instead look at how the module system can help us achieve some of these goals. In any case, you should learn about OCaml's module language before continuing to the object system.

As mentioned, we have a class hierarchy consisting of a base class "student" and two subclasses, "graduate student" and "undergraduate student". We will use the module system's include feature, which pulls one module into another, to model this:
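The original listing is not reproduced here, so the following is only a minimal sketch of what the base module might look like. The record fields (name, tests), the optional ~tests argument and the output format of to_string are assumptions made for illustration, not the definitions from the previous post.

```ocaml
module Student = struct
  (* Hypothetical fields: the record from the previous post is not shown here. *)
  type t = { name : string; tests : int list }

  (* "new" is reserved by the object system, so the constructor is called make.
     The optional ~tests argument is an assumption of this sketch. *)
  let make ?(tests = []) name = { name; tests }

  (* A small printer; the exact format is made up for this example. *)
  let to_string s =
    Printf.sprintf "%s (%d tests)" s.name (List.length s.tests)
end
```

With these definitions, Student.make "Ronald" builds a student with no test results yet.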

The "student" type is exactly like in the last blog entry, but with the function field that computed the grade removed. We have a "make" function to create an instance of a student ("new" is a reserved word used by the object system), and even a small "to_string" function that does what's expected of it. So how can we "extend" this type with customized computation (computing the grade)?
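One way to extend the base module, sketched under the same assumptions as before (a hypothetical record with name and tests fields), is to include Student in each new module and add the grade computation there. The average helper and the "Pass"/"No pass" strings are invented for this example; the limits 80 and 70 are the ones mentioned later in this post.

```ocaml
(* Hypothetical helper: integer average of the test scores, 0 for no tests. *)
let average tests =
  match tests with
  | [] -> 0
  | _ -> List.fold_left ( + ) 0 tests / List.length tests

module Student = struct
  (* Assumed fields; the original record is not shown in this post. *)
  type t = { name : string; tests : int list }
  let make ?(tests = []) name = { name; tests }
  let to_string s =
    Printf.sprintf "%s (%d tests)" s.name (List.length s.tests)
end

module GraduateStudent = struct
  include Student (* brings in t, make and to_string unchanged *)

  (* Graduate students need an average of at least 80 to pass. *)
  let compute_course_grade s =
    if average s.tests >= 80 then "Pass" else "No pass"
end

module UndergraduateStudent = struct
  include Student

  (* Undergraduate students need an average of at least 70 to pass. *)
  let compute_course_grade s =
    if average s.tests >= 70 then "Pass" else "No pass"
end
```

Both modules now offer make and to_string without any code being repeated, plus their own compute_course_grade.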

Can we say that GraduateStudent "inherited" Student? Yes and no. Some code is reused. E.g., you don't have to recode the make or to_string function from the first module. On the other hand, there's no subtyping going on here; the type system only knows that the type t from the first module is the same type in the two later modules. For example, if you put a graduate student and an undergraduate student in the same list in the toplevel, the list is accepted and its elements are reported as GraduateStudent.t, disregarding UndergraduateStudent. How funny is that? This means trouble: there's nothing that prevents us from computing the course grade for an undergraduate student using a graduate student value!

let student1 = GraduateStudent.make "Ronald";;
UndergraduateStudent.compute_course_grade student1;; (* This is not what we want! *)

One way around this is to "tag" the types with variants, like this:
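The tagging could be sketched roughly as follows. The variant type student and its constructors Graduate and Undergraduate are names chosen for this example, and the modules repeat the assumed definitions (hypothetical name and tests fields) so that the block compiles on its own.

```ocaml
(* Hypothetical helper: integer average of the test scores, 0 for no tests. *)
let average tests =
  match tests with
  | [] -> 0
  | _ -> List.fold_left ( + ) 0 tests / List.length tests

module Student = struct
  type t = { name : string; tests : int list } (* assumed fields *)
  let make ?(tests = []) name = { name; tests }
end

module GraduateStudent = struct
  include Student
  let compute_course_grade s =
    if average s.tests >= 80 then "Pass" else "No pass"
end

module UndergraduateStudent = struct
  include Student
  let compute_course_grade s =
    if average s.tests >= 70 then "Pass" else "No pass"
end

(* The tag records which kind of student a value is meant to be. *)
type student =
  | Graduate of GraduateStudent.t
  | Undergraduate of UndergraduateStudent.t

(* Dispatch on the tag to pick the matching grade computation. *)
let compute_course_grade = function
  | Graduate s -> GraduateStudent.compute_course_grade s
  | Undergraduate s -> UndergraduateStudent.compute_course_grade s

(* The two payload types are still the same type, so this compiles
   even though the value was built with the "wrong" module. *)
let mislabeled = Undergraduate (GraduateStudent.make ~tests:[75; 75] "Ronald")
```

The last definition is the weak spot: the tag is only as reliable as the code that applies it.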

But you still have to do the bookkeeping yourself; the type system won't stop you from mixing things up. So how do we tell the compiler that graduate and undergraduate students are in fact two different types, and still keep the code reuse? The answer is private type abbreviations. This is a somewhat advanced feature of the module language, so hold on to your hats.

OCaml will make as much of the structure "public" as possible, which, in our case where we're not using signatures, is everything. That's why the type inference system "knows" that GraduateStudent.t is in fact the same type as UndergraduateStudent.t. The way to limit this is with a module signature, which you might already know. The signature is pretty much the interface of the module; in it you define what will be visible from the outside. For example, we might want to keep the definition of the student to ourselves, so that client code can't pattern match on it or create students without our admirable make function:
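A minimal sketch of such a signature, again assuming a hypothetical record with name and tests fields: the signature lists t without its definition, which makes the type abstract to client code.

```ocaml
module Student : sig
  type t (* abstract: the record definition is hidden from clients *)
  val make : ?tests:int list -> string -> t
  val to_string : t -> string
end = struct
  (* Assumed fields; only code inside this module can see them. *)
  type t = { name : string; tests : int list }
  let make ?(tests = []) name = { name; tests }
  let to_string s =
    Printf.sprintf "%s (%d tests)" s.name (List.length s.tests)
end

(* Clients must go through make. *)
let ronald = Student.make "Ronald"

(* Rejected by the compiler, since the fields are not visible here:
   let fake = { Student.name = "Nobody"; tests = [] } *)
```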

Returning to our original problem, we will redefine the student type in the module signature, adding the private keyword. Using private is the middle ground between fully including t from Student and making it abstract (its definition not visible outside the module). It allows pattern matching, but makes sure the student types in the two modules are treated as different types. Since we're defining our own signatures, we must include the signature of the Student module in our new module, using the language construct "include module type of Student", where "module type of Student" means the signature of Student. One more thing: to allow type coercion (explained below), we explicitly state that the type t in the included signature should not simply be copy-pasted, but reused. This is done with "with type t = Student.t".
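The listing for this step is not reproduced here, so the following is a sketch of the general idea rather than the exact construct described above: it writes each signature out by hand instead of using include module type of Student, and declares t as a private abbreviation of Student.t. The record fields, the average helper and the grade strings are assumptions carried over for illustration.

```ocaml
(* Hypothetical helper: integer average of the test scores, 0 for no tests. *)
let average tests =
  match tests with
  | [] -> 0
  | _ -> List.fold_left ( + ) 0 tests / List.length tests

module Student = struct
  type t = { name : string; tests : int list } (* assumed fields *)
  let make ?(tests = []) name = { name; tests }
  let to_string s =
    Printf.sprintf "%s (%d tests)" s.name (List.length s.tests)
end

module GraduateStudent : sig
  type t = private Student.t (* distinct from Student.t, but coercible to it *)
  val make : ?tests:int list -> string -> t
  val to_string : t -> string
  val compute_course_grade : t -> string
end = struct
  include Student
  let compute_course_grade s =
    if average s.tests >= 80 then "Pass" else "No pass"
end

module UndergraduateStudent : sig
  type t = private Student.t
  val make : ?tests:int list -> string -> t
  val to_string : t -> string
  val compute_course_grade : t -> string
end = struct
  include Student
  let compute_course_grade s =
    if average s.tests >= 70 then "Pass" else "No pass"
end

let ronald = GraduateStudent.make ~tests:[90; 85] "Ronald"

(* Rejected now, because GraduateStudent.t and UndergraduateStudent.t differ:
   UndergraduateStudent.compute_course_grade ronald *)

(* An explicit coercion takes us back to the base type. *)
let as_student = (ronald :> Student.t)
let description = Student.to_string as_student
```

The coercion only goes one way: a Student.t cannot be turned into a GraduateStudent.t except by calling GraduateStudent.make.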

Once you have coerced a value to its supertype, you can't coerce or cast it "back" like you could in C/Java. Why? Because it's unsafe as hell, that's why. Type safety is a point of honor in OCaml.

What have we achieved so far? We have added additional behaviour to a base module by including it, and made sure we can coerce an instance of it back to its supertype when we're done with all the specialized work. One inevitably wonders - can I add additional type information this way too? No. You can't modify a defined record type afterwards. It's not "open" in that sense. If you want that, you'd have to use the object system in OCaml. Or make a new type, possibly in the new module, though this more resembles aggregation (has-a) than inheritance (is-a).

Another shortcoming - from an OO point of view - is the lack of "protected" members. E.g., one might want to factor out the point limit from the two compute_course_grade functions (80 and 70), and put an auxiliary function in the Student module, hidden from the outside but visible to modules which include it. This is not possible. Functions are either completely hidden or visible to everyone.

Same thing, but with functors

A functor - in the OCaml phrase book - is "a module parameterized with another module", or "a function from module to module". Basically, it's a way to smash two (or more) implementations together, just like the include construct. The difference is that, with a functor, you know beforehand what signature the other module will have. This allows for some interesting concepts. Check out this easy example, where the functor Square assumes a number module which could be int, float, complex or whatever we choose:
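A sketch of what such a functor might look like. The signature name NUMBER, its mul operation and the two argument modules are assumptions chosen for this example; only the functor name Square comes from the text.

```ocaml
(* What Square requires of a number module: a type and a multiplication. *)
module type NUMBER = sig
  type t
  val mul : t -> t -> t
end

(* The functor only relies on what NUMBER promises. *)
module Square (N : NUMBER) = struct
  let square x = N.mul x x
end

(* Two hypothetical number modules to plug in. *)
module IntNumber = struct
  type t = int
  let mul = ( * )
end

module FloatNumber = struct
  type t = float
  let mul = ( *. )
end

module IntSquare = Square (IntNumber)
module FloatSquare = Square (FloatNumber)

let sixteen = IntSquare.square 4
let two_and_a_quarter = FloatSquare.square 1.5
```

The same square function is written once and works for every module that matches NUMBER.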

In the world of students, we want a student functor that accepts another module with a grade computation function. We will customize the student module with another module that carries the special computations.
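A minimal sketch of such a student functor. The names GRADER, MakeStudent, pass_if_at_least and the two grader modules are hypothetical, as are the record fields and the grade strings; the limits 80 and 70 are the ones used earlier in this post.

```ocaml
(* The parameter module supplies the grade computation. *)
module type GRADER = sig
  val compute_course_grade : int list -> string
end

(* The student module, customized by whatever grader is passed in. *)
module MakeStudent (G : GRADER) = struct
  type t = { name : string; tests : int list } (* assumed fields *)
  let make ?(tests = []) name = { name; tests }
  let to_string s =
    Printf.sprintf "%s (%d tests)" s.name (List.length s.tests)
  let compute_course_grade s = G.compute_course_grade s.tests
end

(* Hypothetical helpers shared by both graders. *)
let average tests =
  match tests with
  | [] -> 0
  | _ -> List.fold_left ( + ) 0 tests / List.length tests

let pass_if_at_least limit tests =
  if average tests >= limit then "Pass" else "No pass"

module GraduateGrader = struct
  let compute_course_grade = pass_if_at_least 80
end

module UndergraduateGrader = struct
  let compute_course_grade = pass_if_at_least 70
end

(* Each application defines its own record type t, so the two
   student types below are different types. *)
module GraduateStudent = MakeStudent (GraduateGrader)
module UndergraduateStudent = MakeStudent (UndergraduateGrader)

let ronald = GraduateStudent.make ~tests:[75; 75] "Ronald"
let grade = GraduateStudent.compute_course_grade ronald
```

Passing ronald to UndergraduateStudent.compute_course_grade is rejected by the type checker here, without any signature or private annotation.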