Any sufficiently capable programming language is ''non-total'', which is to say you can write expressions that do not produce a value: common examples are an infinite loop or unproductive recursion, e.g. the following definition in Haskell:

+

Any sufficiently capable programming language is ''non-total'', which is to say you can write expressions that do not produce a value: common examples are an exception thrown, an infinite loop, or unproductive recursion, e.g. the following definition in Haskell:

<haskell>

<haskell>

Line 20:

Line 20:

both fail to produce a value when executed. We say that <tt>noreturn x</tt> is undefined, and write <tt>noreturn x = [[Bottom|⊥]]</tt>.

both fail to produce a value when executed. We say that <tt>noreturn x</tt> is undefined, and write <tt>noreturn x = [[Bottom|⊥]]</tt>.

−

In Python the following expression:

+

In Python the following expression to check if <tt>2</tt> is in some list:

−

{7: noreturn(5), 2: 0}[2]

+

2 in [2,4,noreturn(5)]

−

also fails to have a value, because in order to construct the dictionary, the interpreter tries to work out <tt>noreturn(5)</tt>, which of course doesn't return a value. This is called '''innermost-first''' evaluation: in order to call a function with some arguments, you first have to calculate what all the arguments are, starting from the innermost function call and working outwards. The result is that Python is strict, in the sense that calling any function with an undefined argument produces an undefined value, i.e. <tt>f(⊥) = ⊥</tt>.

+

also fails to have a value, because in order to construct the list, the interpreter tries to work out <tt>noreturn(5)</tt>, which of course doesn't return a value. This is called '''innermost-first''' evaluation: in order to call a function with some arguments, you first have to calculate what all the arguments are, starting from the innermost function call and working outwards. The result is that Python is strict, in the sense that calling any function with an undefined argument produces an undefined value, i.e. <tt>f(⊥) = ⊥</tt>. If your language uses innermost-first evaluation, it correspondingly must have strict semantics.

In Haskell, an analogous expression:

In Haskell, an analogous expression:

<haskell>

<haskell>

−

lookup 2 [(7, noreturn 5), (2, 0)]

+

elem 2 [2, 4, noreturn 5]

</haskell>

</haskell>

−

in fact has the value <hask>Just 0</hask>. The program does not try to compute <hask>noreturn 5</hask> because it is irrelevant to the overall value of the computation: only the values that are necessary to the result are computed. This is called '''outermost-first''' evaluation because you first look at <hask>lookup</hask> to see if it needs to use its arguments, and only if it does do you look at what those arguments are. The upshot is that you can give a function an argument that it doesn't look at, and it doesn't matter if it's <hask>⊥</hask> or not, the function will return a value anyway. Such functions are not strict, i.e. they satisfy <hask>f ⊥ ≠ ⊥</hask>. Practically, this means that Haskell functions need not completely compute their arguments before using them, which is why e.g. <hask>take 3 [1..]</hask> can produce <hask>[1,2,3]</hask> even though it is given a conceptually infinite list.

+

in fact has the value <tt>True</tt>. The program does not try to compute <tt>noreturn 5</tt> because it is irrelevant to the overall value of the computation: only the values that are necessary to the result are computed. This is called '''outermost-first''' evaluation because you first look at the outermost function call, <tt>elem</tt>, to see if it needs to use its arguments, and only if it does do you look at what those arguments are. This means that you can write a function that doesn't look at its argument, so it will return a value even if the argument is <tt>⊥</tt>. Such functions are not strict, i.e. they satisfy <tt>f(⊥) ≠ ⊥</tt>. Practically, this means that Haskell functions need not completely compute their arguments before using them, which is why e.g. <tt>take 3 [1..]</tt> can produce <tt>[1,2,3]</tt> even though it is given a conceptually infinite list.

+

+

Note that outermost-first evaluation is not the only way to have non-strict semantics: a speculative evaluation strategy, that evaluated arguments in parallel with the function in case they were needed later, could also be non-strict, as long as whenever the speculative evaluation failed, the evaluation of the function continued.

+

+

Note also that in order for a function to be truly non-strict, it must return a result without inspecting its argument ''at all''. You might think that doesn't sound like a very useful function, but remember that it might be e.g. a partial application: the function <tt>(||) True</tt>, or equivalently <tt>\x -> True || x</tt> does not need to inspect its argument, since <tt>True || x</tt> is always <tt>True</tt>. There are other examples, too: constructors like <tt>Just</tt> wrap their argument without inspecting it, and some other functions apply constructors ''before'' looking at the argument, and hence still produce a partial result, e.g. <tt>inits ⊥ = [] : ⊥</tt>

=== Why? ===

=== Why? ===

−

The important thing to understand about non-strict semantics is that it is not a performance feature. Non-strict semantics means that only the things that are needed for the answer are evaluated, but if you write your programs carefully, you'll only compute what is absolutely necessary ''anyway'', so the extra time spent working out what should and shouldn't be evaluated is time wasted. For this reason, a very well-optimised strict program will frequently outperform even the fastest non-strict program.

+

The important thing to understand about non-strict semantics is that it is not a performance feature. Non-strict semantics allows your language to only evaluate the things it needs to, but if you write your programs carefully, you'll only compute what is absolutely necessary ''anyway'', so the extra time your program spends working out what should and shouldn't be evaluated is time wasted. For this reason, a very well-optimised strict program will frequently outperform even the fastest non-strict program.

However, the real and major advantage that non-strictness gives you over strict languages is you get to write cleaner and more composable code. In particular, you can separate ''production'' and ''consumption'' of data: don't know how many prime numbers you're going to need? Just make `primes` a list of ''all'' prime numbers, and then which ones actually get ''generated'' depends on how you use them in the rest of your code. By contrast, writing code in a strict language that constructs a data structure in response to demand usually will require first-class functions and/or a lot of manual hoop-jumping to make it all behave itself.

However, the real and major advantage that non-strictness gives you over strict languages is you get to write cleaner and more composable code. In particular, you can separate ''production'' and ''consumption'' of data: don't know how many prime numbers you're going to need? Just make `primes` a list of ''all'' prime numbers, and then which ones actually get ''generated'' depends on how you use them in the rest of your code. By contrast, writing code in a strict language that constructs a data structure in response to demand usually will require first-class functions and/or a lot of manual hoop-jumping to make it all behave itself.

−

What this means in practice is that in Haskell you can often write your function as a simple composition of other general functions, and still get the behaviour you need, e.g:

+

Consider the following Haskell function definition:

<haskell>

<haskell>

Line 47:

Line 51:

</haskell>

</haskell>

−

Because <hask>or</hask> uses non-strictness to stop at the first <hask>True</hask> in the input, <hask>map</hask> doesn't even need to know that only the first half of the list might be needed. We can write <hask>map</hask> in the completely straightforward and obviously correct way, and still have it interact well with <hask>or</hask> in this way. In a strict langauge, you'd have to write the recursion out manually:

+

Because <tt>or</tt> uses non-strictness to stop at the first <tt>True</tt> in the input, <tt>map</tt> doesn't even need to know that only the first half of the list might be needed. We can write <tt>map</tt> in the completely straightforward and obviously correct way, and still have it interact well with <tt>or</tt> in this way; <tt>map</tt> produces data, <tt>or</tt> consumes it, and the two are properly decoupled.

+

+

In a strict langauge, you'd have to write the recursion out manually:

<haskell>

<haskell>

Line 53:

Line 59:

any p (x:xs)

any p (x:xs)

| p x = True

| p x = True

−

| otherwise = xs

+

| otherwise = any p xs

</haskell>

</haskell>

−

since in strict languages only builtin control structures can decide whether some bit of code gets executed or not: ordinary functions like <hask>or</hask> can't. It's this additional power that Haskell has that leads people to say you can define your own control structures as normal Haskell functions, that makes lists act as for loops, which you can break or continue as necessary, doing no more work than required.

+

since in strict languages only builtin control structures can decide whether some bit of code gets executed or not, ordinary functions like <tt>or</tt> can't.

+

+

It's this additional power that Haskell has that leads people to say you can define your own control structures as normal Haskell functions, which allows all sorts of interesting patterns to be abstracted in an incredibly lightweight fashion. Labelled for-loops are a ''library'' in Haskell, rather than requiring special syntax and language support.

=== How do I stop it? ===

=== How do I stop it? ===

−

As mentioned above, non-strictness can hurt performance: if a result is definitely going to be needed later, you might as well evaluate it now, to avoid having to hold on to all the data that goes into it. Fortunately, the Haskell designers were aware of these problems and introduced a loophole or two so that we could force our programs to be strict when necessary.

+

As mentioned above, non-strictness can hurt performance: if a result is definitely going to be needed later, you might as well evaluate it now, to avoid having to hold on to all the data that goes into it. Fortunately, the Haskell designers were aware of these problems and introduced a loophole or two so that we could force our programs to be strict when necessary: see [[Performance/Strictness]] and [[seq]].

−

+

−

The most basic method of strictness introduction is the <hask>seq</hask> function. <hask>seq</hask> takes two arguments of any type, and returns the second. However, it also has the important property that it is magically strict in its first argument; in essence <hask>seq</hask> is defined by the following two equations:

+

−

+

−

<haskell>

+

−

⊥ `seq` b = ⊥

+

−

a `seq` b = b

+

−

</haskell>

+

−

+

−

The important thing to notice about <tt>seq</tt> is that it doesn't evaluate anything just by virtue of being in the code. All it does is introduce an artificial dependency of one value on another: when the result of <tt>seq</tt> is evaluated, the first argument must also be evaluated. As an example, suppose <tt>x :: Integer</tt>, then <tt>seq x b</tt> is essentially a bit like <tt>if x == 0 then b else b</tt> - unconditionally equal to <tt>b</tt>, but forcing <tt>x</tt> along the way. In particular, the expression <tt>x `seq` x</tt> is completely redundant, and always has exactly the same effect as just writing <tt>x</tt>.

+

−

+

−

Strictly speaking, the two equations of <hask>seq</hask> are all it must satisfy, and if the compiler can statically prove that the first argument is not ⊥, it doesn't have to evaluate it to meet its obligations. In practice, this almost never happens, and would probably be considered highly counterintuitive behaviour on the part of GHC (or whatever else you use to run your code). However, it ''is'' the case that evaluating <hask>b</hask> and ''then'' <hask>a</hask>, then returning <hask>b</hask> is a perfectly legitimate thing to do; it is to prevent this ambiguity that <hask>pseq</hask> was invented, but that's another story.

+

−

+

−

Note that <hask>seq</hask> is the ''only'' way to force evaluation of a value with a function type.

+

Revision as of 11:11, 8 April 2013

An expression language is said to have non-strict semantics if expressions can have a value even if some of their subexpressions do not. Haskell is one of the few modern languages to have non-strict semantics by default: nearly every other language has strict semantics, in which if any subexpression fails to have a value, the whole expression fails with it.

1 What?

Any sufficiently capable programming language is non-total, which is to say you can write expressions that do not produce a value: common examples are an exception thrown, an infinite loop, or unproductive recursion, e.g. the following definition in Haskell:

noreturn ::Integer->Integer
noreturn x =negate(noreturn x)

or the following Python function:

def noreturn(x):
while True:
x = -x
return x # not reached

both fail to produce a value when executed. We say that noreturn x is undefined, and write noreturn x = ⊥.

In Python the following expression to check if 2 is in some list:

2 in [2,4,noreturn(5)]

also fails to have a value, because in order to construct the list, the interpreter tries to work out noreturn(5), which of course doesn't return a value. This is called innermost-first evaluation: in order to call a function with some arguments, you first have to calculate what all the arguments are, starting from the innermost function call and working outwards. The result is that Python is strict, in the sense that calling any function with an undefined argument produces an undefined value, i.e. f(⊥) = ⊥. If your language uses innermost-first evaluation, it correspondingly must have strict semantics.

In Haskell, an analogous expression:

elem2[2,4, noreturn 5]

in fact has the value True. The program does not try to compute noreturn 5 because it is irrelevant to the overall value of the computation: only the values that are necessary to the result are computed. This is called outermost-first evaluation because you first look at the outermost function call, elem, to see if it needs to use its arguments, and only if it does do you look at what those arguments are. This means that you can write a function that doesn't look at its argument, so it will return a value even if the argument is ⊥. Such functions are not strict, i.e. they satisfy f(⊥) ≠ ⊥. Practically, this means that Haskell functions need not completely compute their arguments before using them, which is why e.g. take 3 [1..] can produce [1,2,3] even though it is given a conceptually infinite list.

Note that outermost-first evaluation is not the only way to have non-strict semantics: a speculative evaluation strategy, that evaluated arguments in parallel with the function in case they were needed later, could also be non-strict, as long as whenever the speculative evaluation failed, the evaluation of the function continued.

Note also that in order for a function to be truly non-strict, it must return a result without inspecting its argument at all. You might think that doesn't sound like a very useful function, but remember that it might be e.g. a partial application: the function (||) True, or equivalently \x -> True || x does not need to inspect its argument, since True || x is always True. There are other examples, too: constructors like Just wrap their argument without inspecting it, and some other functions apply constructors before looking at the argument, and hence still produce a partial result, e.g. inits ⊥ = [] : ⊥

2 Why?

The important thing to understand about non-strict semantics is that it is not a performance feature. Non-strict semantics allows your language to only evaluate the things it needs to, but if you write your programs carefully, you'll only compute what is absolutely necessary anyway, so the extra time your program spends working out what should and shouldn't be evaluated is time wasted. For this reason, a very well-optimised strict program will frequently outperform even the fastest non-strict program.

However, the real and major advantage that non-strictness gives you over strict languages is you get to write cleaner and more composable code. In particular, you can separate production and consumption of data: don't know how many prime numbers you're going to need? Just make `primes` a list of all prime numbers, and then which ones actually get generated depends on how you use them in the rest of your code. By contrast, writing code in a strict language that constructs a data structure in response to demand usually will require first-class functions and/or a lot of manual hoop-jumping to make it all behave itself.

Consider the following Haskell function definition:

any::(a ->Bool)->[a]->Boolany p =or.map p

Because or uses non-strictness to stop at the first True in the input, map doesn't even need to know that only the first half of the list might be needed. We can write map in the completely straightforward and obviously correct way, and still have it interact well with or in this way; map produces data, or consumes it, and the two are properly decoupled.

In a strict langauge, you'd have to write the recursion out manually:

any p []= False
any p (x:xs)| p x = True
|otherwise=any p xs

since in strict languages only builtin control structures can decide whether some bit of code gets executed or not, ordinary functions like or can't.

It's this additional power that Haskell has that leads people to say you can define your own control structures as normal Haskell functions, which allows all sorts of interesting patterns to be abstracted in an incredibly lightweight fashion. Labelled for-loops are a library in Haskell, rather than requiring special syntax and language support.

3 How do I stop it?

As mentioned above, non-strictness can hurt performance: if a result is definitely going to be needed later, you might as well evaluate it now, to avoid having to hold on to all the data that goes into it. Fortunately, the Haskell designers were aware of these problems and introduced a loophole or two so that we could force our programs to be strict when necessary: see Performance/Strictness and seq.