The emergence of functors is a watershed in the course of this book. The reasons for that will begin to reveal themselves in this prologue, as we set the stage for the next several chapters of the book. While the code examples we will work with here are very simple, we will use them to bring several new and important ideas into play, ideas that will be revisited and further developed later in the book. That being so, we recommend that you study this chapter at a gentle pace, giving yourself space to think about the implications of each step and to try out the code samples in GHCi.

Our initial examples will use the function readMaybe, which is provided by the Text.Read module.

GHCi> :m +Text.Read
GHCi> :t readMaybe
readMaybe :: Read a => String -> Maybe a

readMaybe provides a simple way of converting strings into Haskell values. If the provided string has the correct format to be read as a value of type a, readMaybe gives back the converted value wrapped in Just; otherwise, the result is Nothing.
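A few concrete cases may help here. The sketch below uses only readMaybe from Text.Read; the names ex1 to ex4 are illustrative and not part of any library. Each result depends both on the string and on the type we ask for.

```haskell
import Text.Read (readMaybe)

-- ex1 .. ex4 are illustrative names, not part of Text.Read.
ex1 = readMaybe "3"   :: Maybe Integer  -- Just 3
ex2 = readMaybe "foo" :: Maybe Integer  -- Nothing
ex3 = readMaybe "3.5" :: Maybe Integer  -- Nothing: "3.5" is not an Integer
ex4 = readMaybe "3.5" :: Maybe Double   -- Just 3.5
```

Note how the same string, "3.5", fails to read as an Integer but succeeds as a Double.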

To use readMaybe, we need to specify which type we are trying to read. Most of the time, that would be done through a combination of type inference and the signatures in our code. Occasionally, however, it is more convenient to just slap in a type annotation rather than writing down a proper signature. For instance, the :: Maybe Integer in readMaybe "3" :: Maybe Integer says that the type of readMaybe "3" is Maybe Integer.

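We can use readMaybe to write a small program that gets a string from the user, tries to read it as a number (let's use Double as the type) and, if the read succeeds, prints the double of the number; otherwise, it prints an explanatory message and starts over.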

Note

Before continuing, we suggest you try writing the program. Beyond readMaybe, you will likely find getLine, putStrLn and show useful. Have a look at the Simple input and output chapter if you need a reminder about how to read from and print to the console.

Here is a possible implementation:

import Text.Read

interactiveDoubling = do
    putStrLn "Choose a number:"
    s <- getLine
    let mx = readMaybe s :: Maybe Double
    case mx of
        Just x -> putStrLn ("The double of your number is " ++ show (2*x))
        Nothing -> do
            putStrLn "This is not a valid number. Retrying..."
            interactiveDoubling

GHCi> interactiveDoubling
Choose a number:
foo
This is not a valid number. Retrying...
Choose a number:
3
The double of your number is 6.0

Nice and simple. A variation of this solution might take advantage of how, given that Maybe is a Functor, we can double the value before unwrapping mx in the case statement:

interactiveDoubling = do
    putStrLn "Choose a number:"
    s <- getLine
    let mx = readMaybe s :: Maybe Double
    case fmap (2*) mx of
        Just d -> putStrLn ("The double of your number is " ++ show d)
        Nothing -> do
            putStrLn "This is not a valid number. Retrying..."
            interactiveDoubling

In this case, there is no real advantage in doing that. Still, keep this possibility in mind.
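The next step is handling two inputs at once. As a sketch of how that might look with only the tools used so far, the action below reads two numbers and prints their sum, unwrapping each Maybe with its own case. The prompts and the retry helper are assumptions modelled on interactiveDoubling.

```haskell
import Text.Read (readMaybe)

interactiveSumming :: IO ()
interactiveSumming = do
    putStrLn "Choose two numbers:"
    sx <- getLine
    sy <- getLine
    let mx = readMaybe sx :: Maybe Double
        my = readMaybe sy :: Maybe Double
    -- One case per Maybe value: both must be Just for the sum to exist.
    case mx of
        Just x -> case my of
            Just y  -> putStrLn ("The sum of your numbers is " ++ show (x + y))
            Nothing -> retry
        Nothing -> retry
  where
    -- Hypothetical helper: report the problem and start over.
    retry = do
        putStrLn "Invalid number. Retrying..."
        interactiveSumming
```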

interactiveSumming, which asks for two numbers and prints their sum, works, but it is somewhat annoying to write. In particular, the nested case statements are not pretty, and they make the code a little difficult to read. If only there were a way of summing the numbers before unwrapping them, analogously to what we did with fmap in the second version of interactiveDoubling, we would be able to get away with just one case.

fmap alone is not enough for that, though. We can use it to apply (+) to the first value, as in fmap (+) mx, but that leaves us with a function wrapped in Maybe, and we don't know how to apply a function wrapped in Maybe to the second value. For that, we would need a function with a signature like this one...

(<*>) :: Maybe (a -> b) -> Maybe a -> Maybe b

... which would then be used like this:

GHCi> fmap (+) (Just 3) <*> Just 4
Just 7

The GHCi prompt in this example, however, is not wishful thinking: (<*>) actually exists, and if you try it in GHCi, it will actually work! The expression looks even neater if we use the infix synonym of fmap, (<$>):

GHCi> (+) <$> Just 3 <*> Just 4
Just 7

The actual type of (<*>) is more general than what we just wrote. Checking it...

GHCi> :t (<*>)
(<*>) :: Applicative f => f (a -> b) -> f a -> f b

... introduces us to a new type class: Applicative, the type class of applicative functors. For an initial explanation, we can say that an applicative functor is a functor which supports applying functions within the functor, thus allowing for smooth usage of partial application (and therefore functions of multiple arguments). All instances of Applicative are Functors, and besides Maybe, there are many other common Functors which are also Applicative.

The definition of (<*>) for Maybe is actually quite simple: if neither of the two values is Nothing, apply the wrapped function f to the wrapped value x and wrap the result with Just; otherwise, give back Nothing. Note that the logic is exactly equivalent to what the nested case statement of interactiveSumming does.
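The description above can be sketched in code. The Applicative instance for Maybe that ships with base behaves like the commented-out instance below; it is left as a comment because declaring it again would clash with the existing one. The standalone functions pureMaybe and applyMaybe are hypothetical names that restate the two methods so that the logic can be tried out directly.

```haskell
-- What the instance amounts to (comment only; base already defines it):
--
-- instance Applicative Maybe where
--     pure                  = Just
--     (Just f) <*> (Just x) = Just (f x)
--     _        <*> _        = Nothing

-- Hypothetical standalone copies of the two methods.
pureMaybe :: a -> Maybe a
pureMaybe = Just

applyMaybe :: Maybe (a -> b) -> Maybe a -> Maybe b
applyMaybe (Just f) (Just x) = Just (f x)  -- both present: apply and rewrap
applyMaybe _        _        = Nothing     -- anything missing: give up
```

For instance, applyMaybe (fmap (+) (Just 3)) (Just 4) gives Just 7, while replacing either argument with Nothing gives Nothing.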

Note that beyond (<*>) there is a second method in the Applicative class, pure:

GHCi> :t pure
pure :: Applicative f => a -> f a

pure takes a value and brings it into the functor in a default, trivial way. In the case of Maybe, the trivial way amounts to wrapping the value with Just – the nontrivial alternative would be discarding the value and giving back Nothing. With pure, we might rewrite the three-plus-four example above as...

GHCi> (+) <$> pure 3 <*> pure 4 :: Num a => Maybe a
Just 7

... or even:

GHCi> pure (+) <*> pure 3 <*> pure 4 :: Num a => Maybe a
Just 7

Just like the Functor class has laws which specify how sensible instances should behave, there is a set of laws for Applicative. Among other things, these laws specify what the "trivial" way of bringing values into the functor through pure amounts to. Since there is a lot going on in this stretch of the book, we will not discuss the laws now; however, we will return to this important topic in the not too distant future.

Note

In any case, if you are curious, feel free to make a detour through the Applicative functors chapter and read its "Applicative functor laws" subsection. If you choose to go there, you might as well have a look at the "ZipList" section, which provides an additional example of a common applicative functor that can be grasped using only what we have seen so far.

To wrap things up, here is a version of interactiveSumming enhanced by (<*>):
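A sketch of such a version follows. The prompts and messages are assumptions modelled on interactiveDoubling; the point of interest is that (+) <$> mx <*> my sums the numbers while they are still wrapped, so a single case suffices.

```haskell
import Text.Read (readMaybe)

interactiveSumming :: IO ()
interactiveSumming = do
    putStrLn "Choose two numbers:"
    sx <- getLine
    sy <- getLine
    let mx = readMaybe sx :: Maybe Double
        my = readMaybe sy :: Maybe Double
    -- Sum inside Maybe first, then unwrap the result once.
    case (+) <$> mx <*> my of
        Just z  -> putStrLn ("The sum of your numbers is " ++ show z)
        Nothing -> do
            putStrLn "Invalid number. Retrying..."
            interactiveSumming
```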

In the examples above, we have been taking I/O actions such as getLine for granted. We now find ourselves at an auspicious moment to revisit a question first raised many chapters ago: what is the type of getLine?

Using what we learned since then, we can now see that IO is a type constructor with one type variable, which happens to be instantiated as String in the case of getLine. That, however, doesn't get to the root of the issue: what does IO String really mean, and what is the difference between that and plain old String?

A key feature of Haskell is that all expressions we can write are referentially transparent. That means we can replace any expression whatsoever by its value without changing the behaviour of the program. For instance, consider this very simple program:
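A minimal sketch of a program of this kind, built around the addExclamation function used in the discussion that follows, could be:

```haskell
addExclamation :: String -> String
addExclamation s = s ++ "!"

main :: IO ()
main = putStrLn (addExclamation "Hello")  -- prints Hello!
```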

Given that addExclamation s = s ++ "!", we can rewrite main so that it doesn't mention addExclamation. All we have to do is replace s by "Hello" in the right-hand side of the addExclamation definition and then replace addExclamation "Hello" by the resulting expression. As advertised, the program behaviour does not change:

GHCi> let main = putStrLn ("Hello" ++ "!")
GHCi> main
Hello!

Referential transparency ensures that this sort of substitution works. This guarantee extends to anywhere in any Haskell program, which goes a long way towards making programs easier to understand, and their behaviour easier to predict.

Now, suppose that the type of getLine were String. In that case, we would be able to use it as the argument to addExclamation, as in:

-- Not actual code.
main = putStrLn (addExclamation getLine)

In that case, however, a new question would spring forth: if getLine is a String, which String is it? There is no satisfactory answer: it could be "Hello", "Goodbye", or whatever else the user chooses to type at the terminal. And yet, replacing getLine by any String breaks the program, as the user would not be able to type the input string at the terminal any longer. Therefore getLine having type String would cause referential transparency to be broken. The same goes for all other I/O actions: their results are opaque, in that it is impossible to know them in advance, as they depend on factors external to the program.

As getLine illustrates, there is a fundamental indeterminacy associated with I/O actions. Respecting this indeterminacy is necessary for preserving referential transparency. In Haskell, that is achieved through the IO type constructor. getLine being an IO String means that it is not any actual String, but both a placeholder for a String that will only materialise when the program is executed and a promise that this String will indeed be delivered (in the case of getLine, by slurping it from the terminal). As a consequence, when we manipulate an IO String we are setting up plans for what will be done once this unknown String comes into being. There are quite a few ways of achieving that. In this section, we will consider two of them, to which we will add a third one in the next few chapters.

The idea of dealing with a value which isn't really there might seem bizarre at first. However, we have already discussed at least one example of something not entirely unlike it without batting an eyelid. If mx is a Maybe Double, then fmap (2*) mx doubles the value if it is there, and works regardless of whether the value actually exists.[1] Both Maybe a and IO a imply, for different reasons, a layer of indirection in reaching the corresponding values of type a. That being so, it comes as no surprise that, like Maybe, IO is a Functor, with fmap being the most elementary way of getting across the indirection.
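The parallel can be made concrete with a small sketch. Both definitions below use the same fmap; doubleIfThere and doubledLength are hypothetical names introduced only for illustration.

```haskell
-- fmap over Maybe: doubles the number if there is one.
doubleIfThere :: Maybe Double -> Maybe Double
doubleIfThere mx = fmap (2*) mx

-- fmap over IO: the doubling is planned now and carried out only
-- when the action is executed and getLine delivers its string.
doubledLength :: IO Int
doubledLength = fmap ((2*) . length) getLine
```

doubleIfThere (Just 3) is Just 6.0 and doubleIfThere Nothing is Nothing; in neither case did the function need to know in advance which of the two it would receive.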

To begin with, we can exploit the fact of IO being a Functor to replace the let definitions in interactiveSumming from the end of the previous section by something more compact:
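As a sketch, assuming the same prompts and messages as before, the more compact version might read as follows. The annotation on the case scrutinee is what tells readMaybe to produce Maybe Double values.

```haskell
import Text.Read (readMaybe)

interactiveSumming :: IO ()
interactiveSumming = do
    putStrLn "Choose two numbers:"
    mx <- readMaybe <$> getLine  -- equivalently: fmap readMaybe getLine
    my <- readMaybe <$> getLine
    case (+) <$> mx <*> my :: Maybe Double of
        Just z  -> putStrLn ("The sum of your numbers is " ++ show z)
        Nothing -> do
            putStrLn "Invalid number. Retrying..."
            interactiveSumming
```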

readMaybe <$> getLine can be read as "once getLine delivers a string, whatever it turns out to be, apply readMaybe to it". Referential transparency is not compromised: the value behind readMaybe <$> getLine is just as opaque as that of getLine, and its type (in this case IO (Maybe Double)) prevents us from replacing it with any determinate value (say, Just 3) that would violate referential transparency.

Beyond being a Functor, IO is also an Applicative, which provides us with a second way of manipulating the values delivered by I/O actions. We will illustrate it with an interactiveConcatenating action, similar in spirit to interactiveSumming. A first version is just below. Can you anticipate how to simplify it with (<*>)?
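As a sketch, such a first version could bind each input line to its own name and concatenate the two afterwards. The prompt texts are assumptions.

```haskell
interactiveConcatenating :: IO ()
interactiveConcatenating = do
    putStrLn "Choose two strings:"
    sx <- getLine
    sy <- getLine
    putStrLn "Let's concatenate them:"
    putStrLn (sx ++ sy)
```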

(++) <$> getLine <*> getLine is an I/O action which is made out of two other I/O actions (the two getLine). When it is executed, these two I/O actions are executed and the strings they deliver are concatenated. One important thing to notice is that (<*>) maintains a consistent order of execution between the actions it combines. Order of execution matters when dealing with I/O – examples of that are innumerable, but for starters consider this question: if we replace the second getLine in the example above with (take 3 <$> getLine), which of the strings entered at the terminal will be cut down to three characters?
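Placed within interactiveConcatenating, the expression might be used as in the sketch below, where a single binding replaces the two separate getLine lines. The prompt texts are assumptions.

```haskell
interactiveConcatenating :: IO ()
interactiveConcatenating = do
    putStrLn "Choose two strings:"
    -- Two actions combined into one; the strings are joined on delivery.
    sz <- (++) <$> getLine <*> getLine
    putStrLn "Let's concatenate them:"
    putStrLn sz
```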

As (<*>) respects the order of actions, it provides a way of sequencing them. In particular, if we are only interested in sequencing and don't care about the result of the first action we can use \_ y -> y to discard it:

GHCi> (\_ y -> y) <$> putStrLn "First!" <*> putStrLn "Second!"
First!
Second!

This is such a common usage pattern that there is an operator specifically for it: (*>).
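The sketch below spells out the pattern and puts the operator to use. thenDiscard is a hypothetical name for a function that behaves like (*>); the version of interactiveConcatenating that follows, with assumed prompt texts, uses (*>) in place of some of the line breaks of its do block.

```haskell
-- thenDiscard is a hypothetical name; it behaves like (*>).
thenDiscard :: Applicative f => f a -> f b -> f b
thenDiscard u v = (\_ y -> y) <$> u <*> v

-- interactiveConcatenating, with some line breaks traded for (*>).
interactiveConcatenating :: IO ()
interactiveConcatenating = do
    sz <- putStrLn "Choose two strings:" *> ((++) <$> getLine <*> getLine)
    putStrLn "Let's concatenate them:" *> putStrLn sz
```

Since the operator works for any Applicative, it can also be tried on Maybe: Just 1 *> Just 2 is Just 2, while Nothing *> Just 2 is Nothing.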

Note that each of the (*>) replaces one of the magical line breaks of the do block that lead actions to be executed one after the other. In fact, that is all there is to the replaced line breaks: they are just syntactic sugar for (*>).

Earlier, we said that a functor brings in a layer of indirection for accessing the values within it. The flip side of that observation is that the indirection is caused by a context, within which the values are found. For IO, the indirection is that the values are only determined when the program is executed, and the context consists of the series of instructions that will be used to produce these values (in the case of getLine, these instructions amount to "slurp a line of text from the terminal"). From this perspective, (<*>) takes two functorial values and combines not only the values within but also the contexts themselves. In the case of IO, combining the contexts means appending the instructions of one I/O action to those of the other, thus sequencing the actions.

This chapter was a bit of a whirlwind! Let's recapitulate the key points we discussed in it:

Applicative is a subclass of Functor for applicative functors, which are functors that support function application without leaving the functor.

The (<*>) method of Applicative can be used as a generalisation of fmap to multiple arguments.

An IO a is not a tangible value of type a, but a placeholder for an a value that will only come into being when the program is executed and a promise that this value will be delivered through some means. That makes referential transparency possible even when dealing with I/O actions.

IO is a functor, and more specifically an instance of Applicative, that provides means to modify the value produced by an I/O action in spite of its indeterminacy.

A functorial value can be seen as being made of values in a context. fmap cuts through the context to modify the underlying values. (<*>) combines both the contexts and the underlying values of two functorial values.

In the case of IO, (<*>), and the closely related (*>), combine contexts by sequencing I/O actions.

A large part of the role of do blocks is simply providing syntactic sugar for (*>).

As a final observation, note that there is still a major part of the mystery behind do blocks left to explain: what does the left arrow do? In a do-block line such as...

sx <- getLine

... it looks like we are extracting the value produced by getLine from the IO context. Thanks to the discussion about referential transparency, we now know that must be an illusion. But what is going on behind the scenes? Feel free to place your bets, as we are about to find out!

Notes

↑ The key difference between the two situations is that with Maybe the indeterminacy is only apparent, and it is possible to figure out in advance whether there is an actual Double behind mx – or, more precisely, it is possible as long as the value of mx does not depend on I/O!