Notes (HPFP 04/11): Basic Datatypes

4 Basic Datatypes

4.2 What are types?

Types: Haskell has expressions. Type the number 1 into the REPL. That’s an expression. Type addOne = (+) 1. The function addOne is also an expression. Try to imagine all the possible expressions we could type into GHCi. This is hard to do because the number of possible expressions is infinite. But if we try to imagine lots of different expressions, we should start to notice patterns. 1 is an expression, so is 2, so is 3, and so on. All positive integers are expressions. -1 is an expression, and so are -2 and -3. Negative integers are expressions. 0 is an expression too, so all integers are expressions. The pair (1,1) is an expression, so is (1,2), so is (23,58982). All pairs of integers are expressions. We can keep going like this forever, finding new ways to group expressions together. Every time we find a new expression-pattern, if we can precisely describe the structure of that pattern, we have a type.
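To make the "pattern" idea concrete, here is a small sketch that pairs a few of the expressions above with a type signature naming their pattern. The binding names and the choice of Integer are assumptions for illustration; GHCi may infer more general types if the signatures are left off.

```haskell
-- Each binding pairs an expression with a type that names its pattern.
-- Integer is an assumed choice for this sketch.
one :: Integer
one = 1

negativeTwo :: Integer
negativeTwo = -2

-- A pair of integers has the pattern (Integer, Integer).
pair :: (Integer, Integer)
pair = (23, 58982)

-- addOne from the text, with an assumed Integer signature.
addOne :: Integer -> Integer
addOne = (+) 1
```

Loading this into GHCi and asking :type pair should report (Integer, Integer), the pattern shared by every pair of integers.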

When we played with the String type in the preceding chapter, we were, in effect, saying “Let’s for the moment think about only those expressions that have the String pattern, which looks like this:

So the type system is a tool for defining new patterns in the space of possible expressions, and then checking that in the code we want to run, all the types fit together perfectly.

If you have ever played with Legos, you already have an intuition for how this ought to work.

There are a lot of different ways to fit Legos together. Two standard two-by-four Lego bricks of the same color can be combined 24 ways (ignoring symmetries). But there are also a lot of ways that you can’t fit pieces together. You can’t, for example, place a brick on top of two adjacent bricks at different heights. No amount of force will get the pieces to bend that way (Legos are very tough). You can’t “coerce” Legos into doing whatever you want. The shapes are what they are, and it’s up to you, the builder, to figure out some interesting way to fit them together.

Haskell expressions are like Lego pieces, and types are like their shapes. But unlike with Legos, you get to design entirely new pieces, as well as put them together.

Bool and False live in two different spaces. Bool lives in type-space and False lives in data-space. This is a really important distinction! Type-space disappears after code gets compiled, so you can’t interact with types in running code (that is, at “runtime”).

compile-time: When code gets compiled. Types are used at compile-time, but not at runtime. Compiler errors happen at compile-time.
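A short sketch of the two spaces, using a hypothetical Switch type declared in the same shape as Bool. Switch, Off, On, lightSwitch and flag are all made-up names for illustration.

```haskell
-- Hypothetical type with the same shape as Bool.
-- Switch lives in type-space; Off and On live in data-space.
data Switch = Off | On
  deriving (Eq, Show)

-- The name after :: is a type; the expression after = is a value.
lightSwitch :: Switch
lightSwitch = Off

-- The same split for the real Bool: Bool is the type, False is the value.
flag :: Bool
flag = False
```

Writing a type where a value is expected (for example flag = Bool) is rejected by the compiler, since Bool does not exist in data-space.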

But this is getting pretty deep into the “typeclass zoo.” Better leave this for chapters 5 and 6.

4.5 Comparing Values

Let’s think about what a comparison is. Things are different from one another, sometimes in a lot of different ways. An apple can be red, crisp and sweet while an orange can be orange, fleshy and tart. It’s tough to compare two things when they differ in a lot of different ways, hence the expression “you can’t compare apples and oranges.”

But actually, you can compare apples and oranges, and as long as you restrict the comparison to a single dimension of difference, it’s pretty easy. A particular apple and a particular orange both have size, so you can say one is bigger than the other. And that’s a comparison! Color, taste, texture, ripeness, country of origin: there are loads of dimensions in which a comparison could make sense.

Let’s be even more constrained for a moment and think about what equality is. What does it mean for something to be the same as something else? Again, it’s easier to think about this if we only consider one dimension of difference at a time. Our apple and orange both have weight, and those weights can be the same, or different.

Let’s model this by making a type Fruit which can be either an Apple or an Orange, each of which contains an Int that represents its weight:

Let’s make a particular apple that weighs 4 units, and an orange that weighs 5 units:

> apple = Apple 4
> orange = Orange 5
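The Fruit declaration itself does not appear in these notes. A minimal sketch consistent with the description (two constructors, each holding an Int weight) might look like this; the deriving clause is an assumption, chosen so that GHCi can print and compare the values.

```haskell
-- Assumed declaration: two data constructors, each carrying a weight as an Int.
-- Deriving Eq and Show is an assumption, not taken from the notes.
data Fruit = Apple Int | Orange Int
  deriving (Eq, Show)

-- The same values as the GHCi lines above, written as top-level bindings.
apple :: Fruit
apple = Apple 4

orange :: Fruit
orange = Orange 5
```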

In general, the question of what it means for two things to be equal is a really subtle and interesting one. Here, our common sense and knowledge of arithmetic says 5 is bigger than 4, so the orange is not the same weight but is actually bigger than the apple.

When we told GHCi to just figure out an equality function, it didn’t have any way of knowing that the Int inside the Apple and Orange data constructors was the only thing we cared about, so it derived an (==) function that also takes the data constructors themselves into account.
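A sketch of the difference, assuming Fruit is declared with a derived Eq instance. The helpers weight and sameWeight are hypothetical names, introduced here only to show a comparison restricted to the one dimension we care about.

```haskell
-- Assumed declaration of Fruit with a derived equality.
data Fruit = Apple Int | Orange Int
  deriving (Eq, Show)

-- Hypothetical helper: pull out the weight and ignore the constructor.
weight :: Fruit -> Int
weight (Apple w)  = w
weight (Orange w) = w

-- Hypothetical helper: equality on weight alone.
sameWeight :: Fruit -> Fruit -> Bool
sameWeight x y = weight x == weight y
```

With these definitions, Apple 4 == Orange 4 is False, because the derived (==) also compares the constructors, while sameWeight (Apple 4) (Orange 4) is True.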

It’s called Ord because it’s short for “Orderable”, as in “can be put in order.” I like expanding typeclass names by adding an “-able” to the end of them. Eq is the class of “Equable” types, Show is the class of “Showable” types, and so on.
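As a sketch of what "can be put in order" means for Fruit, assume the type also derives Ord. A derived Ord compares data constructors first, in declaration order, and only then their contents. The helpers weight and compareWeight are hypothetical names for an ordering by weight alone.

```haskell
import Data.Ord (comparing)

-- Assumed declaration of Fruit, now also deriving Ord.
data Fruit = Apple Int | Orange Int
  deriving (Eq, Ord, Show)

-- Hypothetical helper: pull out the weight and ignore the constructor.
weight :: Fruit -> Int
weight (Apple w)  = w
weight (Orange w) = w

-- Hypothetical helper: order two fruits by weight only.
compareWeight :: Fruit -> Fruit -> Ordering
compareWeight = comparing weight
```

Here Apple 9 < Orange 1 is True under the derived instance, because Apple is declared before Orange, whereas compareWeight (Apple 9) (Orange 1) is GT.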

4.6 Go on and Bool me

Okay, one very common pattern in Haskell is the overlapping language used for logical disjunction (OR) and addition, and for logical conjunction (AND) and multiplication. A type (like Bool) that can be either one thing (like True) or another (like False) is called a “sum” type. A type (like a tuple (a, b)) that has to have one thing and another is called a “product” type. This language comes from a branch of math called category theory, but is actually much less scary than it seems at first. The basic idea is that when you try to count (or enumerate) all the possible values that can be in a type, an OR (in Haskell a |) in your type declaration acts like adding the number of possibilities on both sides of the disjunction, whereas an AND acts like multiplying the possibilities on both sides of the conjunction. Hence, “sum” for adding, and “product” for multiplying.

Haskell doesn’t have a special syntax for product types. It just puts the two parts next to each other, separated by a space, after a data constructor, so a b is the product type of a and b, while a | b is the sum type of a and b.
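The counting argument can be checked by listing every value of a small sum type and a small product type. This is a sketch with hypothetical types EitherOne and Both, built from Bool (2 values) and Ordering (3 values).

```haskell
-- Hypothetical sum type: a Bool OR an Ordering.
data EitherOne = FromBool Bool | FromOrdering Ordering
  deriving (Eq, Show)

-- Hypothetical product type: a Bool AND an Ordering.
data Both = Both Bool Ordering
  deriving (Eq, Show)

-- Every value of Bool and of Ordering.
bools :: [Bool]
bools = [minBound .. maxBound]

orderings :: [Ordering]
orderings = [minBound .. maxBound]

-- Sum: the possibilities on each side of | are added together.
allEitherOne :: [EitherOne]
allEitherOne = map FromBool bools ++ map FromOrdering orderings

-- Product: every Bool is combined with every Ordering.
allBoth :: [Both]
allBoth = [Both b o | b <- bools, o <- orderings]
```

length allEitherOne is 5, which is 2 + 3, and length allBoth is 6, which is 2 * 3.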