3Expressions

In this section, we describe the syntax and informal semantics of Haskell
expressions, including their translations into the
Haskell kernel, where appropriate. Except in the case of let
expressions, these translations preserve both the static and dynamic
semantics.
Some of the names and symbols used in the syntax
are not reserved.
These are indicated by the `special' productions in the lexical syntax.
Examples include ! (used only in data declarations) and
as (used in import declarations).

Free variables and constructors used in these translations refer to
entities
defined by the Prelude. To avoid clutter,
we use True instead of
Prelude.True or map instead of Prelude.map. (Prelude.True is a
qualified name as described in Section 5.1.2.)

In the syntax that follows, there are some families of nonterminals
indexed by precedence levels (written as a superscript). Similarly, the
nonterminals op, varop, and conop may have a double index:
a letter l, r, or n for left-, right- or non-associativity and
a precedence level. A precedence-level variable i ranges from 0 to 9;
an associativity variable a varies over {l, r, n}.
Thus, for example

aexp

->

( expi+1 qop(a,i))

actually stands for 30 productions, with 10 substitutions for i
and 3 for a.

exp

->

exp0:: [context =>] type

(expression type signature)

|

exp0

expi

->

expi+1 [qop(n,i) expi+1]

|

lexpi

|

rexpi

lexpi

->

(lexpi | expi+1) qop(l,i) expi+1

lexp6

->

- exp7

rexpi

->

expi+1 qop(r,i) (rexpi | expi+1)

exp10

->

\ apat1 ... apatn-> exp

(lambda abstraction, n>=1)

|

let decllist in exp

(let expression)

|

if exp then exp else exp

(conditional)

|

case exp of{ alts [;] }

(case expression)

|

do{ stmts [;] }

(do expression)

|

fexp

fexp

->

[fexp] aexp

(function application)

aexp

->

qvar

(variable)

|

gcon

(general constructor)

|

literal

|

( exp )

(parenthesized expression)

|

( exp1, ... , expk)

(tuple, k>=2)

|

[ exp1, ... , expk]

(list, k>=1)

|

[ exp1 [, exp2] .. [exp3] ]

(arithmetic sequence)

|

[ exp | qual1, ... , qualn]

(list comprehension, n>=1)

|

( expi+1 qop(a,i))

(left section)

|

( qop(a,i) expi+1)

(right section)

|

qcon { fbind1, ... , fbindn}

(labeled construction, n>=0)

|

aexp{qcon}{ fbind1, ... , fbindn}

(labeled update, n >= 1)

As an aid to understanding this grammar,
Table 1 shows the relative precedence of
expressions, patterns and definitions, plus an extended associativity.
- indicates that the item is non-associative.

Item

Associativity

simple terms, parenthesized terms

--

irrefutable patterns (~)

--

as-patterns (@)

right

function application

left

do, if, let, lambda(\), case (leftwards)

right

case (rightwards)

right

infix operators, prec. 9

as defined

...

...

infix operators, prec. 0

as defined

function types (->)

right

contexts (=>)

--

type constraints (::)

--

do, if, let, lambda(\) (rightwards)

right

sequences (..)

--

generators (<-)

--

grouping (,)

n-ary

guards (|)

--

case alternatives (->)

--

definitions (=)

--

separation (;)

n-ary

Table 1

Precedence of expressions, patterns, definitions (highest to lowest)

The grammar is ambiguous regarding the extent of lambda abstractions,
let expressions, and conditionals. The ambiguity is resolved by the
metarule that each of these constructs extends as far to the right as
possible. As a consequence, each of these constructs has two precedences,
one to its left, which is the precedence used in the grammar; and
one to its right, which is obtained via the metarule. See the sample
parses below.

Expressions involving infix operators are disambiguated by the
operator's fixity (see Section 5.6). Consecutive
unparenthesized operators with the same precedence must both be either
left or right associative to avoid a syntax error.
Given an unparenthesized expression "x qop(a,i) y qop(b,j) z", parentheses
must be added around either "x qop(a,i) y" or "y qop(b,j)z" when i=j unless a=b=l or a=b=r.

Negation is the only prefix operator in
Haskell ; it has the same precedence as the infix - operator
defined in the Prelude (see Figure 2).

The separation of function arrows from case alternatives solves
the ambiguity that otherwise arises when an unparenthesized
function type is used in an expression, such as the guard in a case
expression.

Sample parses are shown below.

This

Parses as

f x + g y

(f x) + (g y)

- f x + y

(- (f x)) + y

let { ... } in x + y

let { ... } in (x + y)

z + let { ... } in x + y

z + (let { ... } in (x + y))

f x y :: Int

(f x y) :: Int

\ x -> a+b :: Int

\ x -> ((a+b) :: Int)

For the sake of clarity, the rest of this section shows the syntax of
expressions without their precedences.

3.1Errors

Errors during expression evaluation, denoted by _|_,
are indistinguishable from non-termination. Since Haskell is a
lazy language, all Haskell types include _|_. That is, a value
of any type may be bound to a computation that, when demanded, results
in an error. When evaluated, errors cause immediate program
termination and cannot be caught by the user. The Prelude provides
two functions to directly
cause such errors:

error :: String -> a
undefined :: a

A call to error terminates execution of
the program and returns an appropriate error indication to the
operating system. It should also display the string in some
system-dependent manner. When undefined is used, the error message
is created by the compiler.

Translations of Haskell expressions use error and undefined to
explicitly indicate where execution time errors may occur. The actual
program behavior when an error occurs is up to the implementation.
The messages passed to the error function in these translations are
only suggestions; implementations may choose to display more or less
information when an error occurs.

3.2Variables, Constructors, and Operators

aexp

->

qvar

(variable)

|

gcon

(general constructor)

|

literal

gcon

->

()

|

[]

|

(,{,})

|

qcon

qvar

->

qvarid | ( qvarsym )

(qualified variable)

qcon

->

qconid | ( qconsym )

(qualified constructor)

Alphanumeric operators are formed by enclosing an identifier between
grave accents (backquotes). Any variable or constructor may be used as
an operator in this way. If fun is an identifier (either variable
or constructor), then an expression of the form fun x y is
equivalent to x `fun`y. If no fixity
declaration is given for `fun` then it defaults
to highest precedence and left associativity
(see Section 5.6).

Similarly, any symbolic operator may be used as a (curried) variable
or constructor by enclosing it in parentheses. If op is an infix
operator, then an expression or pattern of the form x op y is
equivalent to (op) x y.

Qualified names may only be used to reference an imported variable or
constructor (see Section 5.1.2)
but not in the definition of a new variable or constructor. Thus

let F.x = 1 in F.x -- invalid

incorrectly uses a qualifier in the definition of x, regardless of
the module containing this definition. Qualification does not affect
the nature of an operator: F.+ is an infix operator just as + is.

Special syntax is used to name some constructors for some of the
built-in types, as found
in the production for gcon and literal. These are described
in Section 6.1.

An integer literal represents the
application of the function fromInteger to the
appropriate value of type
Integer. Similarly, a floating point literal stands for an application of
fromRational to a value of type Rational (that is,
Ratio Integer).

Translation:

The integer literal i is equivalent to fromIntegeri,
where fromInteger is a method in class Num (see Section
6.3.1).

The floating point literal f is equivalent to fromRational
(nRatio.%d), where fromRational is a method in class Fractional
and Ratio.% constructs a rational from two integers, as defined in
the Ratio library.
The integers n and d are chosen so that n/d = f.

3.3Curried Applications and Lambda Abstractions

fexp

->

[fexp] aexp

(function application)

exp

->

\ apat1 ... apatn-> exp

Function application is written
e1 e2. Application associates to the left, so the
parentheses may be omitted in (f x) y. Because e1 could
be a data constructor, partial applications of data constructors are
allowed.

Lambda abstractions are written
\ p1 ... pn-> e, where the pi are patterns.
An expression such as \x:xs->x is syntactically incorrect,
and must be rewritten as \(x:xs)->x.

The set of patterns must be linear---no variable may appear more than once in the set.

Translation:

The lambda abstraction \ p1 ... pn-> e is equivalent to

\ x1 ... xn-> case (x1, ..., xn) of (p1, ..., pn) -> e

where the xi are new identifiers.
Given this translation combined with the semantics of case
expressions and pattern matching described in
Section 3.17.3, if the
pattern fails to match, then the result is _|_.

3.4Operator Applications

exp

->

exp1 qop exp2

|

- exp

(prefix negation)

The form e1 qop e2 is the infix application of binary
operator qop to expressions e1 and e2.

The special
form -e denotes prefix negation, the only
prefix operator in Haskell , and is
syntax for negate (e). The binary - operator
does not necessarily refer
to the definition of - in the Prelude; it may be rebound
by the module system. However, unary - will always refer to the
negate function defined in the Prelude. There is no link between
the local meaning of the - operator and unary negation.

Prefix negation has the same precedence as the infix operator -
defined in the Prelude (see
Table 2). Because e1-e2 parses as an
infix application of the binary operator -, one must write e1(-e2) for
the alternative parsing. Similarly, (-) is syntax for
(\ x y -> x-y), as with any infix operator, and does not denote
(\ x -> -x)---one must use negate for that.

Translation:

e1 op e2 is equivalent to (op) e1 e2. -e is
equivalent to negate (e).

3.5Sections

aexp

->

( exp qop )

|

( qop exp )

Sections are written as ( op e ) or ( e op ), where
op is a binary operator and e is an expression. Sections are a
convenient syntax for partial application of binary operators.

The normal rules of syntactic precedence apply to sections; for
example, (*a+b) is syntactically invalid, but (+a*b) and
(*(a+b)) are valid. Syntactic associativity, however, is not
taken into account in sections; thus, (a+b+) must be written
((a+b)+).

Because - is treated specially in the grammar,
(- exp) is not a section, but an application of prefix
negation, as
described in the preceding section. However, there is a subtract
function defined in the Prelude such that
(subtract exp) is equivalent to the disallowed section.
The expression (+ (- exp)) can serve the same purpose.

Translation:

For binary operator op and expression e, if x is a variable
that does not occur free in e, the section
(op e) is equivalent to \ x -> x op e, and the section
(e op) is equivalent to (op) e.

3.6Conditionals

exp

->

if exp1then exp2else exp3

A conditional expression
has the form
if e1then e2else e3 and returns the value of e2 if the
value of e1 is True, e3 if e1 is False, and _|_
otherwise.

Translation:

if e1then e2else e3 is equivalent to:

case e1of { True -> e2; False -> e3}

where True and False are the two nullary constructors from the
type Bool, as defined in the Prelude.

3.7Lists

aexp

->

[ exp1, ... , expk]

(k>=1)

Lists are written [e1, ..., ek], where
k>=1; the empty list is written []. Standard operations on
lists are given in the Prelude (see
Appendix A, notably Section A.1).

Translation:

[e1, ..., ek] is equivalent to

e1: (e2: ( ... (ek: [])))

where : and [] are constructors for lists, as defined in
the Prelude (see Section 6.1.3). The types
of e1 through ek must all be the same (call it t), and the
type of the overall expression is [t] (see Section 4.1.1).

3.8Tuples

aexp

->

( exp1, ... , expk)

(k>=2)

Tuples are written (e1, ..., ek), and may be
of arbitrary length k>=2. Standard operations on tuples are given
in the Prelude (see Appendix A).

Translation:

(e1, ..., ek) for k>=2 is an instance of a k-tuple as
defined in the Prelude, and requires no translation. If
t1 through tk are the types of e1 through ek,
respectively, then the type of the resulting tuple is
(t1, ..., tk) (see Section 4.1.1).

3.9Unit Expressions and Parenthesized Expressions

aexp

->

()

|

( exp )

The form (e) is simply a parenthesized expression, and is
equivalent to e. The unit expression() has type
() (see
Section 4.1.1); it is the only member of that type apart
from _|_ (it can
be thought of as the "nullary tuple")---see Section 6.1.5.

Translation:

(e) is equivalent to e.

3.10Arithmetic Sequences

aexp

->

[ exp1 [, exp2] .. [exp3] ]

The form [e1, e2.. e3] denotes an arithmetic
sequence from e1 in increments of
e2-e1 of values not greater than e3 (if the increment is
nonnegative) or not less than e3 (if the increment is negative).
Thus, the resulting list is empty if the increment is nonnegative
and e3 is less than e1 or if the increment is negative and
e3 is greater than e1. If the increment is zero, an infinite
list of e1s results if e3 is not less than e1. If e3
is omitted, the result is an infinite list, unless the element type
is finite, in which case the implied limit is the greatest
value of the type if the increment is nonnegative, or the least value,
otherwise.

The forms [e1.. e3] and [e1..] are similar to those
above, but with an implied increment of one.

Arithmetic sequences may be defined over any type in class
Enum,
including Char, Int, and Integer (see
Figure 5
and Section 4.3.3).
For example, ['a'..'z'] denotes
the list of lowercase letters in alphabetical order.

Translation:

Arithmetic sequences satisfy these identities:

[ e1.. ]

=

enumFrome1

[ e1,e2.. ]

=

enumFromThene1e2

[ e1..e3 ]

=

enumFromToe1e3

[ e1,e2..e3 ]

=

enumFromThenToe1e2e3

where enumFrom, enumFromThen, enumFromTo, and enumFromThenTo
are class methods in the class Enum as defined in the Prelude
(see Figure 5
).

3.11List Comprehensions

aexp

->

[ exp | qual1, ... , qualn]

(list comprehension, n>=1)

qual

->

pat <- exp

|

let decllist

|

exp

A list comprehension has the form [ e | q1, ..., qn],
n>=1, where the qi qualifiers are either

generators of the form p <- e, where
p is a
pattern (see Section 3.17) of type t and e is an
expression of type Monad m => m t

guards, which are arbitrary expressions of
type Bool

local bindings that provide new definitions for use in
the generated expression e or subsequent guards and generators.

While list comprehensions are commonly used to generate lists, the
definition of list comprehensions uses monadic operations that can be
used with other types besides lists. This syntax provides a slightly
more concise way of expressing some forms of do expressions (see
Section 3.14). For simplicity, we will describe this
construct only as it applies to lists.

Such a list comprehension returns the list of elements
produced by evaluating e in the successive environments
created by the nested, depth-first evaluation of the generators in the
qualifier list. Binding of variables occurs according to the normal
pattern matching rules (see Section 3.17), and if a
match fails then that element of the list is simply skipped over. Thus:

[ x | xs <- [ [(1,2),(3,4)], [(5,4),(3,2)] ],
(3,x) <- xs ]

yields the list [4,2]. If a qualifier is a guard, it must evaluate
to True for the previous pattern match to succeed.
As usual, bindings in list comprehensions can shadow those in outer
scopes; for example:

[ x | x <- x, x <- x ]

=

[ z | y <- x, z <- y]

Translation:

List comprehensions satisfy these identities, which may be
used as a translation into the kernel:

[ e | q1 ...qn ]

=

do {T(q1);...;T(qn); return e}

T(b)

=

guard b

T(letdecllist)

=

letdecllist

T(p <- l)

=

p <- l

where e ranges over expressions, p ranges over
patterns, l ranges over list-valued expressions, and b ranges over
boolean expressions. The return and guard functions are as
defined in the Prelude.

As indicated by the translation of list comprehensions, variables
bound by let have fully polymorphic types while those defined by
<- are lambda bound and are thus monomorphic (see Section
4.5.4).

3.12Let Expressions

exp

->

let decllist in exp

Let expressions have the general form
let { d1; ... ; dn} in e,
and introduce a
nested, lexically-scoped,
mutually-recursive list of declarations (let is often called letrec in
other languages). The scope of the declarations is the expression e
and the right hand side of the declarations. Declarations are
described in Section 4. Pattern bindings are matched
lazily; an implicit ~ makes these patterns
irrefutable.
For example,

let (x,y) = undefined in e

does not cause an execution-time error until x or y is evaluated.

Translation:

The dynamic semantics of the expression
let { d1; ... ; dn} in e0 are captured by this
translation: After removing all type signatures, each
declaration di is translated into an equation of the form
pi= ei, where pi and ei are patterns and expressions
respectively, using the translation in
Section 4.4.2. Once done, these identities
hold, which may be used as a translation into the kernel:

let {p1 = e1; ...; pn = en} ine0

=

let (~p1,...,~pn) = (e1,...,en) ine0

let p = e1 in e0

=

case e1 of ~p -> e0

where no variable in p appears free in e1

let p = e1 in e0

=

let p = fix ( \ ~p -> e1) ine0

where fix is the least fixpoint operator. Note the use of the
irrefutable patterns in the second and third rules. This translation
does not preserve the static semantics because the use of case
precludes a fully polymorphic typing of the bound variables.
The static semantics of the bindings in a let expression
are described in
Section 4.4.2.

3.13Case Expressions

exp

->

case exp of{ alts [;] }

alts

->

alt1; ... ; altn

(n>=1)

alt

->

pat -> exp [where decllist]

|

pat gdpat [where decllist]

gdpat

->

gd -> exp [ gdpat ]

gd

->

| exp0

A case expression has the general form

case e of { p1 match1; ... ; pn matchn}

where each matchi is of the general form

| gi1

-> ei1

...

| gimi

-> eimi

where decllisti

Each alternative pi matchi consists of a
pattern pi and its matches, matchi, which
consists of pairs of guards
gij and bodies eij (expressions), as well as
optional bindings (decllisti) that scope over all of the guards and
expressions of the alternative. An alternative of the form

pat -> exp where decllist

is treated as shorthand for:

pat | True

-> expr

where decllist

A case expression must have at least one alternative and each alternative must
have at least one body. Each body must have the same type, and the
type of the whole expression is that type.

A case expression is evaluated by pattern matching the expression e
against the individual alternatives. The matches are tried sequentially,
from top to bottom. The first successful match causes evaluation of
the corresponding alternative body, in the environment of the case
expression extended by the bindings created during the matching of
that alternative and by the decllisti associated with that
alternative. If no
match succeeds, the result is _|_. Pattern matching is described
in Section 3.17, with the formal semantics of case
expressions in Section 3.17.3.

3.14Do Expressions

exp

->

do{ stmts [;]}

(do expression)

stmts

->

exp [; stmts]

|

pat <- exp ; stmts

|

let decllist ; stmts

A do expression provides a more readable syntax for monadic
programming.

Translation:

Do expressions satisfy these identities, which may be
used as a translation into the kernel:

do {e}

=

e

do {e;stmts}

=

e>> do {stmts}

do {p <- e; stmts}

=

e>>= \p -> do {stmts}

where p is failure-free

do {p <- e; stmts}

=

let ok p = do {stmts}

ok _ = zero

in e >>= ok

where p is not failure-free

do {letdecllist; stmts}

=

letdecllistin do {stmts}

>>, >>=, and zero are operations in the classes Monad and
MonadZero, as defined in the
Prelude., and ok is a new
identifier not appearing in p.

A failure-free pattern is one that can only be refuted by _|_.
Failure-free patterns are defined as follows:

All irrefutable patterns are failure-free (irrefutable patterns
are described in Section 3.17.1).

If C is the only constructor in its type, then C p1 ...
pn is failure-free when each of the pi is failure free.

If pattern p is failure-free, then the pattern v@p
is failure-free.

This translation requires a monad in class MonadZero if any pattern
bound by <- is not failure-free. Otherwise, only class
methods from Monad are generated. Type errors resulting from
patterns that are not failure-free can be corrected by using ~ to
force the pattern to be failure-free.

As indicated by the translation of do, variables bound by let have
fully polymorphic types while those defined by <- are lambda bound
and are thus monomorphic.

3.15Datatypes with Field Labels

A datatype declaration may optionally include field labels for some or
all of the components of the type (see Section 4.2.1).
Readers unfamiliar with datatype declarations in Haskell may wish
to read Section 4.2.1 first.
These field labels can be used to
construct, select from, and update fields in a manner
that is independent of the overall structure of the datatype.

Different datatypes cannot share common field labels in the same scope.
A field label can be used at most once in a constructor.
Within a datatype, however, a field name can be used in more
than one constructor provided the field has the same typing in all
constructors.

3.15.1Field Selection

aexp

->

qvar

Field names are used as selector functions. When used as a variable,
a field name serves as a function that extracts the field from an
object. Selectors are top level bindings and so they
may be shadowed by local variables but cannot conflict with
other top level bindings of the same name. This shadowing only
affects selector functions; in other record constructs, field labels
cannot be confused with ordinary variables.

Translation:

A field label f introduces a selector function defined as:

f x

=

case x of {C1 p11 ...p1k -> e1;...;Cn pn1 ...pnk -> en}

where C1 ...Cn are all the constructors of the datatype containing a
field labeled with f, pij is y when f labels the jth
component of Ci or _ otherwise, and ei is y when some field in
Ci has a label of f or undefined otherwise.

3.15.2Construction Using Field Labels

aexp

->

qcon { fbind1, ... , fbindn}

(labeled construction, n>=0)

fbind

->

var | qvar = exp

A constructor with labeled fields may be used to construct a value
in which the components are specified by name rather than by position.
Unlike the braces used in declaration lists, these are not subject to
layout; the { and } characters must be explicit. (This is also
true of field updates and field patterns.)
Construction using field names is subject to the following constraints:

Only field labels declared with the specified constructor may be
mentioned.

A field name may not be mentioned more than once.

Fields not mentioned are initialized to _|_.

When the =exp is omitted and there is a variable with the
same name as the field label in scope, the field is initialized to the value
of that variable.

A compile-time error occurs when any strict fields (fields
whose declared types are prefixed by !) are omitted during
construction. Strict fields are discussed in Section 4.2.1.

Translation:

In the binding f=v, the field f labels v. Any
binding f that omits the =v is expanded to f=f.

C{bs}

=

C (pickC1 bs undefined) ...(pickCk bs undefined)

k is the arity of C.

The auxiliary function pickCi bs d is defined as follows:

If the ith component of a constructor C has the
field name f, and if f=v appears in the binding list
bs, then pickCi bs d is v. Otherwise, pickCi bs d is
the default value d.

3.15.3Updates Using Field Labels

aexp

->

aexp<qcon>{ fbind1, ... , fbindn}

(labeled update, n>=1)

Values belonging to a datatype with field names may be
non-destructively updated. This creates a new value in which the
specified field values replace those in the existing value.
Updates are restricted in the following ways:

All labels must be taken from the same datatype.

At least one constructor must define all of the labels
mentioned in the update.

No label may be mentioned more than once.

An execution error occurs when the value being updated does
not contain all of the specified labels.

When the =exp is omitted, the field is updated to the value
of the variable in scope with the same name as the field label.

Translation:

Using the prior definition of pick,

e{bs}

=

caseeof

C1 v1 ... vk1->C (pickC1 bs v1) ... (pickCk bs vk1)

...

Cj v1 ... vkj->C (pickC1 bs v1) ... (pickCk bs vkj)

_ -> error "Update error"

where {C1,...,Cj} is the set of constructors containing all labels
in b, and ki is the arity of Ci.

Here are some examples using labeled fields:

data T = C1 {f1,f2 :: Int}
| C2 {f1 :: Int,
f3,f4 :: Char}

Expression

Translation

C1 {f1 = 3}

C1' 3 undefined

C2 {f1 = 1, f4 = 'A', f3 = 'B'}

C2' 1 'B' 'A'

x {f1 = 1}

case x of C1' _ f2 -> C1' 1 f2

C2' _ f3 f4 -> C2' 1 f3 f4

The field f1 is common to both constructors in T. The
constructors C1' and C2' are `hidden constructors', see the
translation in Section 4.2.1.
A compile-time error will result if no single constructor
defines the set of field names used in an update, such as
x {f2 = 1, f3 = 'x'}.

3.16Expression Type-Signatures

exp

->

exp :: [context =>] type

Expression type-signatures have the form e :: t, where e
is an expression and t is a type (Section 4.1.1); they
are used to type an expression explicitly
and may be used to resolve ambiguous typings due to overloading (see
Section 4.3.4). The value of the expression is just that of
exp. As with normal type signatures (see
Section 4.4.1), the declared type may be more specific than
the principal type derivable from exp, but it is an error to give
a type that is more general than, or not comparable to, the
principal type.

3.17Pattern Matching

Patterns appear in lambda abstractions, function definitions, pattern
bindings, list comprehensions, do expressions, and case expressions.
However, the
first five of these ultimately translate into case expressions, so
defining the semantics of pattern matching for case expressions is sufficient.

3.17.1Patterns

Patterns have this syntax:

pat

->

var + integer

(successor pattern)

pat

| pat0

pati

->

pati+1 [qconop(n,i) pati+1]

|

lpati

|

rpati

lpati

->

(lpati | pati+1) qconop(l,i) pati+1

lpat6

->

- (integer | float)

(negative literal)

rpati

->

pati+1 qconop(r,i) (rpati | pati+1)

pat10->

apat

|

gcon apat1 ... apatk

(arity gcon = k, k>=1)

apat

->

var [@ apat]

(as pattern)

|

gcon

(arity gcon = 0)

|

qcon { fpat1, ... , fpatk}

(labeled pattern, k>=0)

|

literal

|

_

(wildcard)

|

( pat )

(parenthesized pattern)

|

( pat1, ... , patk)

(tuple pattern, k>=2)

|

[ pat1, ... , patk]

(list pattern, k>=1)

|

~ apat

(irrefutable pattern)

fpat

->

var = pat

|

var

The arity of a constructor must match the number of
sub-patterns associated with it; one cannot match against a
partially-applied constructor.

All patterns must be linear
---no variable may appear more than once.

Patterns of the form var@pat are called as-patterns,
and allow one to use var
as a name for the value being matched by pat. For example,

Patterns of the form _ are
wildcards and are useful when some part of a pattern
is not referenced on the right-hand-side. It is as if an
identifier not used elsewhere were put in its place. For example,

case e of { [x,_,_] -> if x==0 then True else False }

is equivalent to:

case e of { [x,y,z] -> if x==0 then True else False }

In the pattern matching rules given below we distinguish two kinds of
patterns: an irrefutable pattern
is: a variable, a wildcard, N apat where N is a constructor
defined by newtype and apat is irrefutable (see
Section 4.2.3),
var@apat where apat is irrefutable,
or of the form ~apat (whether or not apat is irrefutable).
All other patterns are refutable.

3.17.2Informal Semantics of Pattern Matching

Patterns are matched against values. Attempting to match a pattern
can have one of three results: it may fail; it may
succeed, returning a binding for each variable in the pattern; or it
may diverge (i.e. return _|_). Pattern matching proceeds
from left to right, and outside to inside, according to these rules:

Matching a value v against the irrefutable pattern
var always succeeds and binds var to v.
Similarly, matching v against the irrefutable pattern ~apat always succeeds.
The free
variables in apat are bound to the appropriate values if matching
v against apat would otherwise succeed, and to _|_ if matching
v against apat fails or diverges. (Binding does
not imply evaluation.)

Matching any value against the wildcard pattern _ always succeeds
and no binding is done.

Operationally, this means that no matching is done on an
irrefutable pattern until one of the variables in the pattern is used.
At that point the entire pattern is matched against the value, and if
the match fails or diverges, so does the overall computation.

Matching a value con v against the pattern con pat, where con is a
constructor defined by newtype, is equivalent to matching v
against the pattern pat. That is, constructors associated with
newtype serve only to change the type of a value.

Matching _|_ against a refutable pattern always diverges.

Matching a non-_|_ value can occur against three kinds of
refutable patterns:

Matching a non-_|_ value against a pattern whose outermost
component is a constructor defined by data
fails if the value being matched was created by a
different constructor. If the constructors are
the same, the result of the match is the result of matching the
sub-patterns left-to-right against the components of the data value:
if all matches succeed, the overall match
succeeds; the first to fail or diverge causes the overall match to
fail or diverge, respectively.

Numeric literals are matched using the overloaded == function.
The behavior of numeric patterns depends entirely on the definition of
== for the type of object being matched.

Matching a non-_|_ value x against a pattern of the form
n+k
(where n is a variable and k is a positive integer
literal) succeeds if x>=k, resulting in the binding of n to x-k,
and fails if x<k. The behavior of n+k patterns depends entirely
on the underlying definitions of >=, fromInteger, and - for the
type of the object being matched.

Matching against a constructor using labeled fields is the same as
matching ordinary constructor patterns except that the fields are
matched in the order they are named in the field list. All fields
listed must be declared by the constructor; fields may not be named
more than once. Fields not named by the pattern are ignored (matched
against _).

The result of matching a value v against an as-pattern var@apat is
the result of matching v against apat augmented with the binding of
var to v. If the match of v against apat fails or diverges,
then so does the overall match.

Aside from the obvious static type constraints (for
example, it is a static error to match a character against a
boolean), these static class constraints hold: an integer
literal pattern
can only be matched against a value in the class
Num and a floating literal pattern
can only be matched against a value
in the class Fractional.
A n+k pattern
can only be matched
against a value in the class Integral.

Many people feel that n+k patterns should not be used. These
patterns may be removed or changed in future versions of Haskell .
Compilers should support a flag that disables the use of these
patterns.

Here are some examples:

If the pattern [1,2] is matched against [0,_|_], then 1
fails to match against 0, and the result is a failed match. But
if [1,2] is matched against [_|_,0], then attempting to match
1 against _|_ causes the match to diverge.

Additional examples illustrating some of the subtleties of pattern matching
may be found in Section 4.2.3.

Top level patterns in case
expressions and the set of top level patterns in function or pattern
bindings may have zero or more associated guards.
A guard is
a boolean expression that is evaluated only after all of the
arguments have been successfully matched, and it must be true for the
overall pattern match to succeed. The environment of the guard is the same
as the right-hand-side of the case-expression
alternative, function definition, or pattern binding to which it is attached.

The guard semantics have an obvious influence on the
strictness characteristics of a function or case expression. In
particular, an otherwise irrefutable pattern
may be evaluated because of a guard. For example, in

x1, ..., xn are all the variables in p; x'1, ..., x'n are completely new variables

(e)

case v of { x@p -> e; _ -> e' }

= case v of { p -> ( \ x -> e ) v ; _ -> e' }

(f)

case v of { _ -> e; _ -> e' } =e

Figure 3

Semantics of Case Expressions, Part 1

3.17.3Formal Semantics of Pattern Matching

The semantics of all pattern matching constructs other than case
expressions are defined by giving identities that relate those
constructs to case expressions. The semantics of
case expressions themselves are in turn given as a series of
identities, in Figures 3--4.
Any implementation should behave so that these identities hold; it is
not expected that it will use them directly, since that
would generate rather inefficient code.

(g)

case v of { K p1 ...pn -> e; _ -> e' }

= case v of {

K x1 ...xn -> case x1 of {

p1 -> ... case xn of { pn -> e ; _ -> e' } ...

_ -> e' }

_ -> e' }

at least one of p1, ..., pn is not a variable; x1, ..., xn are new variables

(h)

case v of { k -> e; _ -> e' } = if (v==k) then e else e'

(i)

case v of { x -> e; _ -> e' } = case v of { x -> e }

(j)

case v of { x -> e } = ( \ x -> e ) v

(k)

case N v of { Np -> e; _ -> e' }

= case v of { p -> e; _ -> e' }

where N is a newtype constructor

(l)

case _|_ of { Np -> e; _ -> e' } = case _|_ of { p -> e }

where N is a newtype constructor

(m)

case v of { K { f1 = p1 , f2 =
p2 , ... } -> e ; _ -> e' }

= case e' of {

y ->

case v of {

K { f1 = p1 } ->

case v of { K { f2 = p2 ,
... } -> e ; _ -> y };

_ -> y }}

where f1, f2, ... are fields of constructor K; y
is a new variable

(n)

case v of { K { f = p } -> e ; _ ->
e' }

= case v of {

K p1 ... pn -> e ; _ -> e' }

where pi is p if f labels the ith component of K,
_ otherwise

(o)

case v of { K {} -> e ; _ ->
e' }

= case v of {

K _ ... _ -> e ; _ -> e' }

(p)

case (K'e1...em) of { Kx1...xn -> e; _ -> e' } =e'

where K and K' are distinct data constructors of arity n and m, respectively

where K is a constructor of arity n; x'1 ...x'n are completely new variables

(r)

case e0 of { x+k -> e; _ -> e' }

= if e0 >= k then let {x' = e0-k} in e[x'/x] else e'(x' is a new variable)

Figure 4

Semantics of Case Expressions, Part 2

In Figures 3--4:
e, e' and ei are expressions;
g and gi are boolean-valued expressions;
p and pi are patterns;
v, x, and xi are variables;
K and K' are algebraic datatype (data) constructors (including
tuple constructors); N is a newtype constructor;

and k is a character, string, or numeric literal.

Rule (b) matches a general source-language
case expression, regardless of whether it actually includes
guards---if no guards are written, then True is substituted for the guards gi,j
in the matchi forms.
Subsequent identities manipulate the resulting case expression into simpler
and simpler forms.

Rule (h) in Figure 4 involves the
overloaded operator ==; it is this rule that defines the
meaning of pattern matching against overloaded constants.

These identities all preserve the static semantics. Rules (d), (e), and (j)
use a lambda rather than a let; this indicates that variables bound
by case are monomorphically typed (Section 4.1.3).