(Chapter written by Jérôme Vouillon, Didier Rémy and Jacques Garrigue)

This chapter gives an overview of the object-oriented features of
Objective Caml. Note that the relation between object, class and type
in Objective Caml is very different from that in mainstream
object-oriented languages like Java or C++, so you should not
assume that similar keywords mean the same thing.

The class point below defines one instance variable x and two methods
get_x and move. The initial value of the instance variable is 0.
The variable x is declared mutable, so the method move can change
its value.
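
A minimal sketch of such a class, following the description above. The
object p and the offset 3 passed to move are illustrative choices only.

```ocaml
(* One mutable instance variable and two methods. *)
class point =
  object
    val mutable x = 0
    method get_x = x
    method move d = x <- x + d
  end

(* Objects are created with the keyword new. *)
let p = new point
let () = p#move 3
let position = p#get_x
```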

Note that the type of p is point. This is an abbreviation
automatically defined by the class definition above. It stands for the
object type <get_x : int; move : int -> unit>, listing the methods
of class point along with their types.

The evaluation of the body of a class only takes place at object
creation time. Therefore, in the following example, the instance
variable x is initialized to different values for two different
objects.
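
One way to observe this is to let the initial value depend on a global
counter. The counter x0 and the names first and second are assumptions
made for this sketch.

```ocaml
let x0 = ref 0

class point =
  object
    (* Evaluated once per object, at creation time. *)
    val mutable x = (incr x0; !x0)
    method get_x = x
    method move d = x <- x + d
  end

let first = (new point)#get_x
let second = (new point)#get_x
```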

The parameter x_init is, of course, visible in the whole body of the
definition, including methods. For instance, the method get_offset
in the class below returns the position of the object relative to its
initial position.
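
A sketch of a class taking such a parameter. The starting position 7 and
the offset 5 are arbitrary values chosen for illustration.

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    (* x_init is still in scope inside methods. *)
    method get_offset = x - x_init
    method move d = x <- x + d
  end

let p = new point 7
let () = p#move 5
```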

Expressions can be evaluated and bound before defining the object body
of the class. This is useful to enforce invariants. For instance,
points can be automatically adjusted to the nearest point on a grid,
as follows:
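
The sketch below assumes a grid of step 10 and snaps by integer
division, which rounds toward zero rather than to the strictly nearest
grid point. It shows the adjustment inside the class, and, for contrast,
the same adjustment written as a separate allocation function (the name
new_adjusted_point is hypothetical).

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

(* Pattern 1: the adjustment is part of the class definition. *)
class adjusted_point x_init =
  let origin = (x_init / 10) * 10 in
  object
    val mutable x = origin
    method get_x = x
    method get_offset = x - origin
    method move d = x <- x + d
  end

(* Pattern 2: the adjustment lives in an allocation function. *)
let new_adjusted_point x_init = new point ((x_init / 10) * 10)
```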

However, the former pattern is generally more appropriate, since
the code for adjustment is part of the definition of the class and will be
inherited.

This ability provides class constructors as can be found in other
languages. Several constructors can be defined this way to build objects of
the same class but with different initialization patterns; an
alternative is to use initializers, as described below in section
3.4.

There is another, more direct way to create an object: create it
without going through a class.

The syntax is exactly the same as for class expressions, but the
result is a single object rather than a class. All the constructs
described in the rest of this section also apply to immediate objects.
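
For instance, an immediate object with the same fields as a point could
be sketched like this (the offset 3 is illustrative):

```ocaml
(* No class is involved: p is built directly from an object expression. *)
let p =
  object
    val mutable x = 0
    method get_x = x
    method move d = x <- x + d
  end

let () = p#move 3
```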

Immediate objects have two weaknesses compared to classes: their types
are not abbreviated, and you cannot inherit from them. But these two
weaknesses can be advantages in some situations, as we will see
in sections 3.3 and 3.10.

A method or an initializer can send messages to self (that is,
the current object). For that, self must be explicitly bound, here to
the variable s (s could be any identifier, even though we will
often choose the name self).
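
A possible sketch of a class whose methods call each other through self.
The method to_string is an assumption added here so that the result can
be inspected without reading standard output.

```ocaml
class printable_point x_init =
  object (s)
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
    (* Both methods send a message to the current object s. *)
    method to_string = string_of_int s#get_x
    method print = print_string s#to_string
  end

let p = new printable_point 7
let () = p#move 3
```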

You can ignore the first two lines of the error message. What matters
is the last one: putting self into an external reference would make it
impossible to extend it afterwards.
We will see in section 3.12 a workaround to this
problem.
Note however that, since immediate objects are not extensible, the
problem does not occur with them.

Let-bindings within class definitions are evaluated before the object
is constructed. It is also possible to evaluate an expression
immediately after the object has been built. Such code is written as
an anonymous hidden method called an initializer. Therefore, it can
access self and the instance variables.
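
A sketch combining a let-binding (evaluated before construction) with an
initializer (evaluated just after). The class name logged_point and the
events field are hypothetical, used only to record what the initializer
saw.

```ocaml
class logged_point x_init =
  let origin = (x_init / 10) * 10 in
  object (self)
    val mutable x = origin
    val mutable events = []
    method get_x = x
    method move d = x <- x + d
    method events = List.rev events
    (* Runs once the object exists, so self#get_x is available. *)
    initializer
      events <- Printf.sprintf "new point at %d" self#get_x :: events
  end

let p = new logged_point 17
```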

It is possible to declare a method without actually defining it, using
the keyword virtual. This method will be provided later in
subclasses. A class containing virtual methods must be flagged
virtual, and cannot be instantiated (that is, no object of this class
can be created). It still defines type abbreviations (treating virtual methods
like other methods).
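
A sketch of a virtual class and a concrete subclass providing the
missing methods. The name abstract_point is an assumption of this
example.

```ocaml
class virtual abstract_point x_init =
  object (self)
    method virtual get_x : int
    method get_offset = self#get_x - x_init
    method virtual move : int -> unit
  end

(* The subclass supplies get_x and move, so it can be instantiated. *)
class point x_init =
  object
    inherit abstract_point x_init
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

let p = new point 4
let () = p#move 3
```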

Note that this is not the same thing as private and protected methods
in Java or C++, which can be called from other objects of the same
class. This is a direct consequence of the independence between types
and classes in Objective Caml: two unrelated classes may produce
objects of the same type, and there is no way at the type level to
ensure that an object comes from a specific class. However, a possible
encoding of friend methods is given in section 3.17.

Private methods are inherited (they are by default visible in subclasses),
unless they are hidden by signature matching, as described below.
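
As an illustration, here is a sketch of a class with a private method
that remains callable from a subclass. The class names and the methods
bump and bump_twice are assumptions of this example. Writing p#move
outside the class would be rejected by the type checker, since move does
not appear in the object type.

```ocaml
class restricted_point x_init =
  object (self)
    val mutable x = x_init
    method get_x = x
    method private move d = x <- x + d
    method bump = self#move 1
  end

(* The private method is inherited and visible through self. *)
class double_bump_point x_init =
  object (self)
    inherit restricted_point x_init
    method bump_twice = self#move 2
  end

let p = new double_bump_point 0
let () = p#bump
let () = p#bump_twice
```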

The annotation virtual here is only used to mention a method without
providing its definition. Since we didn't add the private
annotation, this makes the method public, keeping the original
definition.
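
A sketch of this technique, assuming a parent class restricted_point
whose method move is private (both class names are illustrative):

```ocaml
class restricted_point x_init =
  object (self)
    val mutable x = x_init
    method get_x = x
    method private move d = x <- x + d
    method bump = self#move 1
  end

class point_again x =
  object
    inherit restricted_point x
    (* Mentions move without the private annotation: it becomes public. *)
    method virtual move : _
  end

let p = new point_again 0
let () = p#move 5
```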

The constraint on self's type requires a public move method, and
this is sufficient to override the private annotation.

One could think that a private method should remain private in a subclass.
However, since the method is visible in a subclass, it is always possible
to pick its code and define a method of the same name that runs that
code, so yet another (heavier) solution would be:
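
Under the same assumption of a parent class restricted_point with a
private move, this heavier variant can be sketched as:

```ocaml
class restricted_point x_init =
  object (self)
    val mutable x = x_init
    method get_x = x
    method private move d = x <- x + d
    method bump = self#move 1
  end

class point_again x =
  object
    inherit restricted_point x as super
    (* A public method of the same name that reuses the parent's code. *)
    method move = super#move
  end

let p = new point_again 1
let () = p#move 4
```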

In addition to program documentation, class interfaces can be used to
constrain the type of a class. Both instance variables and concrete
private methods can be hidden by a class type constraint. Public and
virtual methods, however, cannot.
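
A sketch of such a constraint. The class type restricted_point_type and
the class sealed_point are hypothetical names; the constraint hides both
the instance variable x and the private method move.

```ocaml
class restricted_point x_init =
  object (self)
    val mutable x = x_init
    method get_x = x
    method private move d = x <- x + d
    method bump = self#move 1
  end

class type restricted_point_type =
  object
    method get_x : int
    method bump : unit
  end

(* Subclasses of sealed_point can no longer see x or move. *)
class sealed_point x = (restricted_point x : restricted_point_type)

let p = new sealed_point 0
let () = p#bump
```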

We illustrate inheritance by defining a class of colored points that
inherits from the class of points. This class has all instance
variables and all methods of class point, plus a new instance
variable c and a new method color.
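
A sketch of the two classes, assuming colors are represented as strings:

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

class colored_point x (c : string) =
  object
    inherit point x
    val c = c
    method color = c
  end

let p' = new colored_point 5 "red"
let () = p'#move 2
```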

A point and a colored point have incompatible types, since a point has
no method color. However, the function get_x below is a generic
function applying method get_x to any object p that has this
method (and possibly some others, which are represented by an ellipsis
in the type). Thus, it applies to both points and colored points.
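
The sketch below uses two immediate objects as stand-ins for a point and
a colored point; the names plain and colored are illustrative.

```ocaml
(* Inferred type: < get_x : 'a; .. > -> 'a *)
let get_x p = p#get_x

let plain = object method get_x = 1 end
let colored = object method get_x = 2 method color = "red" end

(* The same function accepts both object types. *)
let total = get_x plain + get_x colored
```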

Multiple inheritance is allowed. Only the last definition of a method
is kept: the redefinition in a subclass of a method that was visible in
the parent class overrides the definition in the parent class.
Previous definitions of a method can be reused by binding the related
ancestor. Below, super is bound to the ancestor printable_point.
The name super is a pseudo value identifier that can only be used to
invoke a super-class method, as in super#print.
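
A sketch of overriding with a call to the ancestor. To keep the result
easy to inspect, it overrides a string-returning method to_string (an
assumption of this example) rather than print; super#print would be
invoked in the same way.

```ocaml
class printable_point x_init =
  object (s)
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
    method to_string = string_of_int s#get_x
  end

class printable_colored_point y (c : string) =
  object (self)
    inherit printable_point y as super
    val c = c
    method color = c
    (* Redefinition that reuses the parent's definition through super. *)
    method to_string = "(" ^ super#to_string ^ ", " ^ self#color ^ ")"
  end

let p' = new printable_colored_point 17 "red"
```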

A private method that has been hidden in the parent class is no longer
visible, and is thus not overridden. Since initializers are treated as
private methods, all initializers along the class hierarchy are evaluated,
in the order they are introduced.
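
The evaluation order of initializers can be observed with a small
sketch. The global list trace and the class names base and derived are
hypothetical.

```ocaml
let trace = ref []

class base =
  object
    initializer trace := "base" :: !trace
  end

class derived =
  object
    inherit base
    initializer trace := "derived" :: !trace
  end

(* Creating one object runs both initializers, parent first. *)
let _ = new derived
let order = List.rev !trace
```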

The reason is that at least one of the methods has a polymorphic type
(here, the type of the value stored in the reference cell), thus
either the class should be parametric, or the method type should be
constrained to a monomorphic type. A monomorphic instance of the class could
be defined by:
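
For instance, a reference cell class restricted to integers could be
sketched as follows (the class name oref is an assumption):

```ocaml
(* The annotation on x_init makes every method type monomorphic. *)
class oref (x_init : int) =
  object
    val mutable x = x_init
    method get = x
    method set y = x <- y
  end

let r = new oref 1
let () = r#set 2
```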

On the other hand, a class for polymorphic references must explicitly
list the type parameters in its declaration. Class type parameters are
always listed between [ and ]. The type parameters must also be
bound somewhere in the class body by a type constraint.
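
A sketch of such a parametric class, again using the assumed name oref:

```ocaml
class ['a] oref x_init =
  object
    (* The constraint binds the class parameter 'a. *)
    val mutable x = (x_init : 'a)
    method get = x
    method set y = x <- y
  end

let r = new oref "a"
let () = r#set "b"
let n = new oref 1
```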

Let us consider a more complex example: define a circle, whose center
may be any kind of point. We put an additional type
constraint in method move, since no free variables must remain
unaccounted for by the class type parameters.
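
A sketch of this first definition of circle. The value disc and the
offset 3 are illustrative.

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

class ['a] circle (c : 'a) =
  object
    val mutable center = c
    method center = center
    method set_center c = center <- c
    (* The annotation leaves no type variable outside of 'a. *)
    method move = (center#move : int -> unit)
  end

let disc = new circle (new point 0)
let () = disc#move 3
```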

An alternate definition of circle, using a constraint clause in
the class definition, is shown below. The type #point used below in
the constraint clause is an abbreviation produced by the definition
of class point. This abbreviation unifies with the type of any
object belonging to a subclass of class point. It actually expands to
< get_x : int; move : int -> unit; .. >. This leads to the following
alternate definition of circle, which has slightly stronger
constraints on its argument, as we now expect center to have a
method get_x.
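
A sketch of the variant with a constraint clause, assuming the same
class point as described earlier:

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

class ['a] circle (c : 'a) =
  object
    (* 'a must unify with < get_x : int; move : int -> unit; .. > *)
    constraint 'a = #point
    val mutable center = c
    method center = center
    method set_center c = center <- c
    method move = center#move
  end

let disc = new circle (new point 1)
let () = disc#move 2
```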

The class colored_circle is a specialized version of class
circle that requires the type of the center to unify with
#colored_point, and adds a method color. Note that when specializing a
parameterized class, the instance of type parameter must always be
explicitly given. It is again written between [ and ].
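
A compact sketch of this specialization, with colors assumed to be
strings:

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

class colored_point x (c : string) =
  object
    inherit point x
    val c = c
    method color = c
  end

class ['a] circle (c : 'a) =
  object
    constraint 'a = #point
    val mutable center = c
    method center = center
    method move = center#move
  end

class ['a] colored_circle c =
  object
    constraint 'a = #colored_point
    (* The type parameter of the parent is given explicitly. *)
    inherit ['a] circle c
    method color = center#color
  end

let disc = new colored_circle (new colored_point 1 "red")
```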

Our iterator works, as its first use for summation shows. However,
since objects themselves are not polymorphic (only their constructors
are), using the fold method fixes its type for this individual object.
Our next attempt to use it as a string iterator fails.

The problem here is that quantification was wrongly located: it is
not the class that we want to be polymorphic, but the fold method.
This can be achieved by giving an explicitly polymorphic type in the
method definition.
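
A sketch of a list class with an explicitly polymorphic fold. The class
name intlist and the values sum and text are assumptions of this
example.

```ocaml
class intlist (l : int list) =
  object
    method empty = (l = [])
    (* 'a is quantified in the method, not in the class. *)
    method fold : 'a. ('a -> int -> 'a) -> 'a -> 'a =
      fun f accu -> List.fold_left f accu l
  end

let l = new intlist [1; 2; 3]

(* The same object is now used at two different result types. *)
let sum = l#fold (fun x y -> x + y) 0
let text = l#fold (fun s x -> s ^ string_of_int x ^ " ") ""
```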

As you can see in the class type shown by the compiler, while
polymorphic method types must be fully explicit in class definitions
(appearing immediately after the method name), quantified type
variables can be left implicit in class descriptions. Why require types
to be explicit? The problem is that
(int -> int -> int) -> int -> int would also be a valid type for fold,
and it happens to be incompatible with the polymorphic type we gave
(automatic instantiation only works for toplevel type variables, not
for inner quantifiers, where it becomes an undecidable problem). So the
compiler cannot choose between those two types, and must be helped.

However, the type can be completely omitted in the class definition if
it is already known, through inheritance or type constraints on self.
Here is an example of method overriding.
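
The sketch assumes a parent class intlist with a polymorphic fold
method; the subclass name intlist_rev is illustrative.

```ocaml
class intlist (l : int list) =
  object
    method empty = (l = [])
    method fold : 'a. ('a -> int -> 'a) -> 'a -> 'a =
      fun f accu -> List.fold_left f accu l
  end

class intlist_rev l =
  object
    inherit intlist l
    (* No annotation: the polymorphic type is known by inheritance. *)
    method fold f accu = List.fold_left f accu (List.rev l)
  end

let r = new intlist_rev [1; 2; 3]
let digits = r#fold (fun s x -> s ^ string_of_int x) ""
let sum = r#fold (fun x y -> x + y) 0
```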

Polymorphic methods are called in exactly the same way as normal
methods, but you should be aware of some limitations of type
inference. Namely, a polymorphic method can only be called if its
type is known at the call site. Otherwise, the method will be assumed
to be monomorphic, and given an incompatible type.
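
One way to make the type known at the call site is to annotate the
argument. In this sketch, the class intlist and the function totals are
assumptions; the annotation #intlist lets fold be used at two different
types inside the same function.

```ocaml
class intlist (l : int list) =
  object
    method empty = (l = [])
    method fold : 'a. ('a -> int -> 'a) -> 'a -> 'a =
      fun f accu -> List.fold_left f accu l
  end

(* Without the annotation, fold would be assumed monomorphic. *)
let totals (lst : #intlist) =
  (lst#fold (fun x y -> x + y) 0,
   lst#fold (fun s x -> s ^ string_of_int x) "")

let result = totals (new intlist [1; 2; 3])
```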

Another use of polymorphic methods is to allow some form of implicit
subtyping in method arguments. We have already seen in section
3.8 how some functions may be polymorphic in the
class of their argument. This can be extended to methods.

Note here the special syntax (#point0 as 'a) we have to use to
quantify the extensible part of #point0. As for the variable binder,
it can be omitted in class specifications. If you want polymorphism
inside an object field, it must be quantified independently.
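
A sketch of a method that accepts any object with a get_x method. The
class distance_point and the immediate object passed in the second call
are assumptions of this example.

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

class type point0 = object method get_x : int end

class distance_point x =
  object
    inherit point x
    (* 'a stands for the extensible part of #point0. *)
    method distance : 'a. (#point0 as 'a) -> int =
      fun other -> abs (other#get_x - x)
  end

let p = new distance_point 3
let d1 = p#distance (new point 8)
let d2 = p#distance (object method get_x = 1 method color = "blue" end)
```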

Indeed, narrowing coercions would be unsafe, and could only be combined with
a type case, possibly raising a runtime error. However, there is no such
operation available in the language.

Be aware that subtyping and inheritance are not related. Inheritance is a
syntactic relation between classes while subtyping is a semantic relation
between types. For instance, the class of colored points could have been
defined directly, without inheriting from the class of points; the type of
colored points would remain unchanged and thus still be a subtype of
points.
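
To illustrate, the sketch below defines colored points directly, without
inheritance, and still coerces them to points. Colors are assumed to be
strings.

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

(* No inherit clause: the two classes are unrelated. *)
class colored_point x_init (c : string) =
  object
    val mutable x = x_init
    val c = c
    method get_x = x
    method move d = x <- x + d
    method color = c
  end

(* The coercion only depends on the object types. *)
let colored_point_to_point cp = (cp : colored_point :> point)

let points = [new point 1; colored_point_to_point (new colored_point 2 "red")]
let xs = List.map (fun p -> p#get_x) points
```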

The domain of a coercion can usually be omitted. For instance, one can
define:
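
A sketch of a coercion with an implicit domain, assuming the class point
described earlier. The immediate object used as an argument is
illustrative.

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

(* Inferred type: #point -> point *)
let to_point cp = (cp :> point)

let p =
  to_point
    (object
       method get_x = 3
       method move (_ : int) = ()
       method color = "red"
     end)
```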

In this case, the function colored_point_to_point is an instance of the
function to_point. This is not always true, however. The fully
explicit coercion is more precise and is sometimes unavoidable.
Consider, for example, the following class:

While class types c1 and c2 are different, both object types
c1 and c2 expand to the same object type (same method names and types).
Yet, when the domain of a coercion is left implicit and its co-domain
is an abbreviation of a known class type, then the class type, rather
than the object type, is used to derive the coercion function. This
allows the domain to be left implicit in most cases when coercing from a
subclass to its superclass.
The type of a coercion can always be seen as below:
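
The following sketch assumes definitions of c0, c1 and c2 along the
lines the text describes; the inferred types given in comments are what
we expect the compiler to report.

```ocaml
class c0 = object method m = {< >} method n = 0 end
class type c1 = object method m : c1 end
class type c2 = object ('a) method m : 'a end

(* Expected type: < m : #c1; .. > -> c1 *)
let to_c1 x = (x :> c1)

(* Expected type: #c2 -> c2 *)
let to_c2 x = (x :> c2)

(* The fully explicit coercion from c0 to c1 is accepted. *)
let as_c1 : c1 = (new c0 : c0 :> c1)

(* The implicit coercion to c2 applies to an object of class c0. *)
let as_c2 : c2 = to_c2 (new c0)
```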

Note the difference between the two coercions: in the second case, the type
#c2 = < m : 'a; .. > as 'a is polymorphically recursive (according
to the explicit recursion in the class type of c2); hence the
success of applying this coercion to an object of class c0.
On the other hand, in the first case, c1 was only expanded and
unrolled twice to obtain < m : < m : c1; .. >; .. > (remember #c1 = < m : c1; .. >), without introducing recursion.
You may also note that the type of to_c2 is #c2 -> c2 while
the type of to_c1 is more general than #c1 -> c1. This is not always true,
since there are class types for which some instances of #c are not subtypes
of c, as explained in section 3.16. Yet, for
parameterless classes the coercion (_ :> c) is always more general than
(_ : #c :> c).

A common problem may occur when one tries to define a coercion to a
class c while defining class c. The problem is due to the type
abbreviation not being completely defined yet, and so its subtypes are not
clearly known. Then, a coercion (_ :> c) or (_ : #c :> c) is taken to be
the identity function, as in

# function x -> (x :> 'a);;
- : 'a -> 'a = <fun>

As a consequence, if the coercion is applied to self, as in the
following example, the type of self is unified with the closed type
c (a closed object type is an object type without ellipsis). This
would constrain the type of self to be closed and is thus rejected.
Indeed, the type of self cannot be closed: this would prevent any
further extension of the class. Therefore, a type error is generated
when the unification of this type with another type would result in a
closed object type.

However, the abbreviation #c' cannot be defined directly in a similar way.
It can only be defined by a class or a class-type definition.
This is because # abbreviations carry an implicit anonymous
variable .. that cannot be explicitly named.
The closest you can get to it is:
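
A sketch of such an approximation, where an extra type variable captures
the extensible part. The function get_m and the immediate object are
assumptions added to show the abbreviation in use.

```ocaml
(* 'a ranges over any object type with at least a method m : int. *)
type 'a c'_class = 'a constraint 'a = < m : int; .. >

let get_m (x : _ c'_class) = x#m

let one = get_m (object method m = 1 method n = 2 end)
```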

It is possible to write a version of class point without assignments
on the instance variables. The construct {< ... >} returns a copy of
“self” (that is, the current object), possibly changing the value of
some instance variables.
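
A sketch of a point without assignment. The class name functional_point
and the values 7 and 3 are illustrative.

```ocaml
class functional_point y =
  object
    val x = y
    method get_x = x
    (* Returns a modified copy; the receiver is left unchanged. *)
    method move d = {< x = x + d >}
  end

let p = new functional_point 7
let q = p#move 3
```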

While objects of either class will behave the same, objects of their
subclasses will be different. In a subclass of the latter, the method
move will keep returning an object of the parent class. On the contrary, in a
subclass of the former, the method move will return an object of the
subclass.
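
The difference can be sketched with two hypothetical subclasses that add
a method tag. With functional update, the result of move still has the
method tag; with new, it is only an object of the parent class.

```ocaml
class functional_point y =
  object
    val x = y
    method get_x = x
    method move d = {< x = x + d >}
  end

class bad_functional_point y =
  object
    val x = y
    method get_x = x
    method move d = new bad_functional_point (x + d)
  end

class tagged_functional y =
  object inherit functional_point y method tag = "tagged" end

class tagged_bad y =
  object inherit bad_functional_point y method tag = "tagged" end

(* move returns an object of the subclass, so tag is available. *)
let kept = ((new tagged_functional 1)#move 2)#tag

(* Here move returns a bad_functional_point: only get_x and move remain. *)
let moved = ((new tagged_bad 1)#move 2)#get_x
```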

Functional update is often used in conjunction with binary methods
as illustrated in section 5.2.1.

Objects can also be cloned, whether they are functional or imperative.
The library function Oo.copy makes a shallow copy of an object. That is,
it returns a new object with the same methods and instance variables as
the original. The instance variables have been copied but their contents
are shared. Assigning a new value to an instance variable of the copy
(using a method call) will not affect instance variables of the
original, and conversely. A deeper assignment (for example, if the
instance variable is a reference cell) will of course affect both the
original and the copy.
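
The sketch below contrasts the two kinds of assignment. The class cell
and its fields own and shared are hypothetical.

```ocaml
class cell =
  object
    val mutable own = 0      (* copied by Oo.copy *)
    val shared = ref 0       (* the reference itself is shared *)
    method own = own
    method shared = !shared
    method tick = own <- own + 1
    method tick_shared = incr shared
  end

let a = new cell
let b = Oo.copy a

(* Only b's own field changes, but both see the shared reference. *)
let () = b#tick
let () = b#tick_shared
```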

The type of Oo.copy is the following:

# Oo.copy;;
- : (< .. > as 'a) -> 'a = <fun>

The keyword as in that type binds the type variable 'a to
the object type < .. >. Therefore, Oo.copy takes an object with
any methods (represented by the ellipsis), and returns an object of
the same type. The type of Oo.copy is different from type < .. > -> < .. > as each ellipsis represents a different set of methods.
Ellipsis actually behaves as a type variable.

Other generic comparisons (such as <, <=, ...) can also be used on objects. The
relation < defines an unspecified but strict ordering on objects. The
ordering relationship between two objects is fixed once and for all after the
two objects have been created and it is not affected by mutation of fields.

Cloning and override have a non-empty intersection.
They are interchangeable when used within an object and without
overriding any field:
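
A sketch of the two equivalent forms; the class names are illustrative.

```ocaml
(* Copy written as an override that changes no field. *)
class copy_by_override =
  object
    method copy = {< >}
  end

(* Copy written with the library function, applied to self. *)
class copy_by_clone =
  object (self)
    method copy = Oo.copy self
  end

let a = new copy_by_override
let b = new copy_by_clone
```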

A binary method is a method which takes an argument of the same type
as self. The class comparable below is a template for classes with a
binary method leq of type 'a -> bool where the type variable 'a
is bound to the type of self. Therefore, #comparable expands to
< leq : 'a -> bool; .. > as 'a. We see here that the binder as also
allows us to write recursive types.

We then define a subclass money of comparable. The class money
simply wraps floats as comparable objects. We will extend it below with
more operations. There is a type constraint on the class parameter x
as the primitive <= is a polymorphic comparison function in
Objective Caml. The inherit clause ensures that the type of objects
of this class is an instance of #comparable.
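
A sketch of the template class comparable and of the subclass money as
described above. The amounts used at the end are illustrative.

```ocaml
class virtual comparable =
  object (_ : 'a)
    (* 'a is the type of self, so leq is a binary method. *)
    method virtual leq : 'a -> bool
  end

class money (x : float) =
  object
    inherit comparable
    val repr = x
    method value = repr
    method leq p = repr <= p#value
  end

let cheap = new money 1.0
let dear = new money 2.0
```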

Note that the type money is not a subtype of type
comparable, as the self type appears in contravariant position
in the type of method leq.
Indeed, an object m of class money has a method leq
that expects an argument of type money since it accesses
its value method. Treating m as having type comparable would allow
calling method leq on m with an argument that does not have a method
value, which would be an error.

It is however possible to define functions that manipulate objects of
type either money or money2: the function min
will return the minimum of any two objects whose type unifies with
#comparable. The type of min is not the same as #comparable -> #comparable -> #comparable, as the abbreviation #comparable hides a
type variable (an ellipsis). Each occurrence of this abbreviation
generates a new variable.
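
A sketch of min applied to both classes. It assumes that money inherits
from comparable and that money2 extends money with a method times, as
the text describes. Note that this definition shadows the standard min.

```ocaml
class virtual comparable =
  object (_ : 'a)
    method virtual leq : 'a -> bool
  end

class money (x : float) =
  object
    inherit comparable
    val repr = x
    method value = repr
    method leq p = repr <= p#value
  end

class money2 x =
  object
    inherit money x
    method times k = {< repr = k *. repr >}
  end

(* Inferred type: (#comparable as 'a) -> 'a -> 'a *)
let min (x : #comparable) y = if x#leq y then x else y

let m1 = min (new money 1.0) (new money 3.0)
let m2 = min (new money2 5.0) (new money2 2.0)
```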

More examples of binary methods can be found in sections
5.2.1 and 5.2.3.

Notice the use of functional update for method times.
Writing new money2 (k *. repr) instead of {< repr = k *. repr >}
would not behave well with inheritance: in a subclass money3 of money2
the times method would return an object of class money2 but not of class
money3 as would be expected.

The class money could naturally carry another binary method. Here is a
direct definition:
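
One possible sketch of such a direct definition, with a second binary
method plus. The amounts used at the end are illustrative.

```ocaml
class money x =
  object (_ : 'a)
    val repr = x
    method value = repr
    method times k = {< repr = k *. repr >}
    (* Both binary methods take an argument of the type of self. *)
    method leq (p : 'a) = repr <= p#value
    method plus (p : 'a) = {< repr = repr +. p#value >}
  end

let total = (new money 1.5)#plus (new money 2.5)
```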

The above class money reveals a problem that often occurs with binary
methods. In order to interact with other objects of the same class, the
representation of money objects must be revealed, using a method such as
value. If we remove all binary methods (here plus and leq),
the representation can easily be hidden inside objects by removing the method
value as well. However, this is not possible as long as some binary
method requires access to the representation of objects of the same class
other than self.

Here, the representation of the object is known only to a particular object.
To make it available to other objects of the same class, we are forced to
make it available to the whole world. However we can easily restrict the
visibility of the representation using the module system.
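
A sketch of this approach. The module type MONEY and the module Euro are
assumed names: inside Euro the representation is a float, while outside
it is the abstract type t.

```ocaml
module type MONEY =
  sig
    type t
    class c : float ->
      object ('a)
        val repr : t
        method value : t
        method times : float -> 'a
        method leq : 'a -> bool
        method plus : 'a -> 'a
      end
  end

module Euro : MONEY =
  struct
    type t = float
    class c x =
      object (_ : 'a)
        val repr = x
        method value = repr
        method times k = {< repr = k *. repr >}
        method leq (p : 'a) = repr <= p#value
        method plus (p : 'a) = {< repr = repr +. p#value >}
      end
  end

(* Clients can combine amounts but cannot read them as floats. *)
let one = new Euro.c 1.0
let half = new Euro.c 0.5
let sum = half#plus half
```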

Another example of friend functions may be found in section
5.2.3. These examples occur when a group of objects (here
objects of the same class) and functions should see each other's internal
representation, while their representation should be hidden from the
outside. The solution is always to define all friends in the same module,
give access to the representation and use a signature constraint to make the
representation abstract outside of the module.