Note: information on this page refers to Ceylon 1.2, not to the
current release.

Union, intersection, and enumerated types

This is the eighth step in the Tour of Ceylon. In the
previous installment we learned about type aliases and
type inference. Let's continue our exploration of Ceylon's type system.

In this chapter, we're going to discuss the closely related topics of union
and intersection types and enumerated types. In this area, Ceylon's type
system works quite differently to other languages with static typing.

Narrowing the type of an object reference

In any language with subtyping there is the (hopefully) occasional need to
perform narrowing conversions. In most statically-typed languages, this is a
two-part process. For example, in Java, we first test the type of the object
using the instanceof operator, and then attempt to downcast it using a
C-style typecast. This is quite curious, since there are virtually no good
uses for instanceof that don't involve an immediate cast to the tested
type, and typecasts without type tests are dangerously non-typesafe.

As you can imagine, Ceylon, with its emphasis upon static typing, does
things differently. Ceylon doesn't have C-style typecasts. Instead, we
must test and narrow the type of an object reference in one step, using the
special if (is ... ) construct. This construct is very similar to
if (exists ... ) and if (nonempty ... ), which we met earlier.
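As a minimal sketch of the construct, assume a hypothetical interface named Printable with a single method (neither name comes from the language module):

```ceylon
// Printable is a hypothetical interface, declared here only for illustration.
interface Printable {
    shared formal void printObject();
}

void printIfPrintable(Object obj) {
    if (is Printable obj) {
        // Inside this block obj is known to be a Printable,
        // so the call below needs no cast.
        obj.printObject();
    }
    else {
        print("not printable: ``obj``");
    }
}
```

The test and the narrowing happen in the same place, so there is no way to narrow the reference without having tested it first.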

These constructs protect us from inadvertently writing code that would
cause a ClassCastException in Java, just like if (exists ... )
protects us from writing code that would cause a NullPointerException.

Now, in cases where we really want to do something more like a Java-style
typecast, we would use an assert statement, which we saw
earlier.
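A sketch of what that might look like, again assuming a hypothetical Printable interface:

```ceylon
// Printable is a hypothetical interface, declared here only for illustration.
interface Printable {
    shared formal void printObject();
}

void printAlways(Object obj) {
    // If obj is not a Printable, this fails at runtime with an AssertionError.
    assert (is Printable obj);
    // After the assertion, obj is treated as a Printable.
    obj.printObject();
}
```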

But assertions should be avoided where reasonable. They undermine the
ability of the compiler to tell us about logic errors in our program
at compile time, resulting in more errors at runtime.

The is conditions in if, switch, or assert actually narrow to
an intersection type.
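To see what this means in practice, here is a small sketch (the function name describe is an assumption made for this example). The parameter is declared as a stream of strings, and the condition narrows it further:

```ceylon
void describe({String*} strings) {
    if (is Correspondence<Integer,String> strings) {
        // Here strings has the type
        // {String*} & Correspondence<Integer,String>,
        // so operations of both types are available.
        print(strings.first);   // declared by Iterable
        print(strings[0]);      // lookup by key, declared by Correspondence
    }
}
```

The declared type is not forgotten: the narrowed type is the intersection of the declared type and the tested type.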

Intersection types

An expression is assignable to an intersection type, written X&Y,
if it is assignable to both X and Y. For example, since Tuple is a
subtype of Iterable and of Correspondence, the tuple type
[String,String] is also a subtype of the intersection
{String*} & Correspondence<Integer,String>. The supertypes of
an intersection type include all supertypes of every intersected type.
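The tuple example can be sketched as follows. The function and value names are assumptions made for illustration:

```ceylon
void run() {
    [String,String] pair = ["hello", "world"];
    // Accepted because the tuple type is a subtype of both intersected types.
    {String*} & Correspondence<Integer,String> both = pair;
    // Members of either intersected type can be used.
    print(both.size);        // declared by Iterable
    print(both.defines(1));  // declared by Correspondence
}
```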

Union types

An expression is assignable to a union type, written X|Y, if it is
assignable to either X or Y. But what operations does a type like
String|Integer|Float have? What are its supertypes? Well, the answer is
pretty intuitive: T is a supertype of X|Y if and only if it is a supertype
of both X and Y. The Ceylon compiler determines this automatically. So the
following code is also well-typed:
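A minimal sketch of such an assignment, with value names chosen for this example:

```ceylon
void run() {
    String|Integer|Float y = "Hello";
    // Object is a supertype of String, Integer, and Float,
    // so it is also a supertype of their union.
    Object obj = y;
    // Members declared by Object, such as string, can be used directly.
    print(y.string);
    print(obj.hash);
}
```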

Of course, it's very common to narrow an expression of union type using a
switch statement. Usually, the Ceylon compiler forces us to write an else
clause in a switch, to remind us that there might be additional cases which
we have not handled. But if we exhaust all cases of a union type, the compiler
will let us leave off the else clause.
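For instance, a switch over String|Integer|Float might look like the sketch below (the function name printType is an assumption):

```ceylon
void printType(String|Integer|Float val) {
    switch (val)
    case (is String) {
        // val is narrowed to String in this branch
        print("String: ``val``");
    }
    case (is Integer) {
        print("Integer: ``val``");
    }
    case (is Float) {
        print("Float: ``val``");
    }
    // no else clause: the three cases exhaust the union type
}
```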

Gotcha!

The cases of a switch statement must be disjoint.
Since String, Integer, and Float are disjoint types, the above switch
statement is legal. If a union type is formed from types which aren't disjoint,
those types can't be used as distinct cases.

Enumerated types

Sometimes it's useful to be able to do the same kind of thing with the
subtypes of a class or interface. First, we need to explicitly enumerate the
subtypes of the type using the of clause:

abstract class Point()
        of Polar | Cartesian {
    // ...
}

(This makes Point into Ceylon's version of what the functional programming
community calls an "algebraic" or "sum" type.)

Now the compiler won't let us declare additional subclasses of Point, and
so the union type Polar|Cartesian is exactly the same type as Point.
Therefore, we can write switch statements without an else clause:
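A sketch of such a switch follows. The parameters of Polar and Cartesian are assumptions added so that the example is complete:

```ceylon
abstract class Point()
        of Polar | Cartesian {}

// The attributes below are assumed for illustration.
class Polar(shared Float angle, shared Float radius)
        extends Point() {}

class Cartesian(shared Float x, shared Float y)
        extends Point() {}

void printPoint(Point point) {
    switch (point)
    case (is Polar) {
        print("r = ``point.radius``, theta = ``point.angle``");
    }
    case (is Cartesian) {
        print("x = ``point.x``, y = ``point.y``");
    }
    // no else clause: Polar and Cartesian are the only cases of Point
}
```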

Now, it's usually considered bad practice to write long switch statements
that handle all subtypes of a type. It makes the code non-extensible. Adding
a new subclass to Point means breaking all the switch statements that
exhaust its subtypes. In object-oriented code, we usually try to refactor
constructs like this to use an abstract method of the superclass that is
overridden as appropriate by subclasses.

However, there is a class of problems where this kind of refactoring isn't
appropriate. In most object-oriented languages, these problems are usually
solved using the "visitor" pattern.
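As an illustration, here is a sketch of the pattern applied to a simple tree. The names Leaf, Branch, and printTree, and the shape of the Visitor interface, are assumptions made for this example:

```ceylon
interface Visitor {
    shared formal void visitLeaf(Leaf leaf);
    shared formal void visitBranch(Branch branch);
}

abstract class Node() {
    shared formal void accept(Visitor visitor);
}

class Leaf(shared Object element)
        extends Node() {
    shared actual void accept(Visitor visitor) {
        visitor.visitLeaf(this);
    }
}

class Branch(shared Node left, shared Node right)
        extends Node() {
    shared actual void accept(Visitor visitor) {
        visitor.visitBranch(this);
    }
}

void printTree(Node node) {
    // One method per subtype of Node
    object printVisitor satisfies Visitor {
        shared actual void visitLeaf(Leaf leaf) {
            print("Found a leaf: ``leaf.element``!");
        }
        shared actual void visitBranch(Branch branch) {
            branch.left.accept(this);
            branch.right.accept(this);
        }
    }
    node.accept(printVisitor);
}
```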

Notice that the code of printVisitor looks just like a switch statement.
It must explicitly enumerate all subtypes of Node. It "breaks" if we add a
new subtype of Node to the Visitor interface. This is correct, and is the
desired behavior; "break" means that the compiler lets us know that we have
to update our code to handle the new subtype.

In Ceylon, we can achieve the same effect, with less verbosity, by
enumerating the subtypes of Node in its definition, and using a switch:
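A sketch of that version, assuming hypothetical Leaf and Branch subclasses of Node:

```ceylon
abstract class Node()
        of Leaf | Branch {}

class Leaf(shared Object element)
        extends Node() {}

class Branch(shared Node left, shared Node right)
        extends Node() {}

void printTree(Node node) {
    switch (node)
    case (is Leaf) {
        print("Found a leaf: ``node.element``!");
    }
    case (is Branch) {
        // recurse into both subtrees
        printTree(node.left);
        printTree(node.right);
    }
}
```

Adding a third case to the of clause of Node would make this switch fail to compile until the new case is handled, which is the same safety net the visitor provides.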

More about disjointness

As we've seen, disjointness is a useful property for two types to have, since
it lets us use them as cases of the same switch statement. Therefore, the
compiler expends some effort to determine if two types are disjoint. For
example:

if X and Y are classes, X is not a subclass of Y, and Y is not a
subclass of X, then X and Y are disjoint,

if X is a final class and Y is an interface not satisfied by X, then
X and Y are disjoint,

two tuple types may be disjoint, for example
[String,Integer] and [Integer,Integer], and

two instantiations of a generic type may be disjoint, for example,
MutableList<String> and MutableList<Integer>.
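To make the tuple rule concrete, the sketch below uses the two tuple types mentioned above as cases of one switch. The function name describePair is an assumption:

```ceylon
void describePair([String,Integer]|[Integer,Integer] pair) {
    switch (pair)
    case (is [String,Integer]) {
        // pair[0] is a String in this branch
        print("name: ``pair[0]``, count: ``pair[1]``");
    }
    case (is [Integer,Integer]) {
        // both elements are Integers in this branch
        print("sum: ``pair[0] + pair[1]``");
    }
}
```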

If a type covers another type (for example, the union of all the cases of an
enumerated type covers that type), then we can use the of operator to safely
narrow from the second type to the first type, even if the second type is not,
strictly speaking, a subtype of the first type according to Ceylon's type system.
Going back to an earlier example, we could write:
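One possible illustration uses the Point type from the section on enumerated types. The parameters of Polar and Cartesian and the function name narrow are assumptions made for this sketch:

```ceylon
abstract class Point()
        of Polar | Cartesian {}

// The attributes below are assumed for illustration.
class Polar(shared Float angle, shared Float radius)
        extends Point() {}

class Cartesian(shared Float x, shared Float y)
        extends Point() {}

void narrow(Point point) {
    // Polar|Cartesian covers Point, so the compiler accepts this
    // without any runtime type test.
    Polar|Cartesian pc = point of Polar|Cartesian;
    print(pc);
}
```

Unlike an assert, the of operator is checked entirely at compile time, so it cannot fail when the program runs.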