7.3. Syntactic extensions

7.3.1. Unicode syntax

The language
extension -XUnicodeSyntax
enables Unicode characters to be used to stand for certain ASCII
character sequences. The following alternatives are provided:

ASCII

Unicode alternative

Code point

Name

::

::

0x2237

PROPORTION

=>

⇒

0x21D2

RIGHTWARDS DOUBLE ARROW

forall

∀

0x2200

FOR ALL

->

→

0x2192

RIGHTWARDS ARROW

<-

←

0x2190

LEFTWARDS ARROW

-<

↢

0x2919

LEFTWARDS ARROW-TAIL

>-

↣

0x291A

RIGHTWARDS ARROW-TAIL

-<<

0x291B

LEFTWARDS DOUBLE ARROW-TAIL

>>-

0x291C

RIGHTWARDS DOUBLE ARROW-TAIL

*

★

0x2605

BLACK STAR

7.3.2. The magic hash

The language extension -XMagicHash allows "#" as a
postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is
a valid type constructor or data constructor.

The hash sign does not change semantics at all. We tend to use variable
names ending in "#" for unboxed values or types (e.g. Int#),
but there is no requirement to do so; they are just plain ordinary variables.
Nor does the -XMagicHash extension bring anything into scope.
For example, to bring Int# into scope you must
import GHC.Prim (see Section 7.2, “Unboxed types and primitive operations”);
the -XMagicHash extension
then allows you to refer to the Int#
that is now in scope.

3# has type Int#. In general,
any Haskell integer lexeme followed by a # is an Int# literal, e.g.
-0x3A# as well as 32#

.

3## has type Word#. In general,
any non-negative Haskell integer lexeme followed by ##
is a Word#.

3.2# has type Float#.

3.2## has type Double#

7.3.3. Hierarchical Modules

GHC supports a small extension to the syntax of module
names: a module name is allowed to contain a dot
‘.’. This is also known as the
“hierarchical module namespace” extension, because
it extends the normally flat Haskell module namespace into a
more flexible hierarchy of modules.

This extension has very little impact on the language
itself; modules names are always fully
qualified, so you can just think of the fully qualified module
name as “the module name”. In particular, this
means that the full module name must be given after the
module keyword at the beginning of the
module; for example, the module A.B.C must
begin

module A.B.C

It is a common strategy to use the as
keyword to save some typing when using qualified names with
hierarchical modules. For example:

GHC comes with a large collection of libraries arranged
hierarchically; see the accompanying library
documentation. More libraries to install are available
from HackageDB.

7.3.4. Pattern guards

The discussion that follows is an abbreviated version of Simon Peyton Jones's original proposal. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)

Suppose we have an abstract data type of finite maps, with a
lookup operation:

lookup :: FiniteMap -> Int -> Maybe Int

The lookup returns Nothing if the supplied key is not in the domain of the mapping, and (Just v) otherwise,
where v is the value that the key maps to. Now consider the following definition:

What is clunky doing? The guard ok1 &&
ok2 checks that both lookups succeed, using
maybeToBool to convert the Maybe
types to booleans. The (lazily evaluated) expectJust
calls extract the values from the results of the lookups, and binds the
returned values to val1 and val2
respectively. If either lookup fails, then clunky takes the
otherwise case and returns the sum of its arguments.

This is certainly legal Haskell, but it is a tremendously verbose and
un-obvious way to achieve the desired effect. Arguably, a more direct way
to write clunky would be to use case expressions:

This is a bit shorter, but hardly better. Of course, we can rewrite any set
of pattern-matching, guarded equations as case expressions; that is
precisely what the compiler does when compiling equations! The reason that
Haskell provides guarded equations is because they allow us to write down
the cases we want to consider, one at a time, independently of each other.
This structure is hidden in the case version. Two of the right-hand sides
are really the same (fail), and the whole expression
tends to become more and more indented.

The semantics should be clear enough. The qualifiers are matched in order.
For a <- qualifier, which I call a pattern guard, the
right hand side is evaluated and matched against the pattern on the left.
If the match fails then the whole guard fails and the next equation is
tried. If it succeeds, then the appropriate binding takes place, and the
next qualifier is matched, in the augmented environment. Unlike list
comprehensions, however, the type of the expression to the right of the
<- is the same as the type of the pattern to its
left. The bindings introduced by pattern guards scope over all the
remaining guard qualifiers, and over the right hand side of the equation.

Just as with list comprehensions, boolean expressions can be freely mixed
with among the pattern guards. For example:

f x | [y] <- x
, y > 3
, Just z <- h y
= ...

Haskell's current guards therefore emerge as a special case, in which the
qualifier list has just one element, a boolean expression.

7.3.5. View patterns

View patterns are enabled by the flag -XViewPatterns.
More information and examples of view patterns can be found on the
Wiki
page.

View patterns are somewhat like pattern guards that can be nested inside
of other patterns. They are a convenient way of pattern-matching
against values of abstract types. For example, in a programming language
implementation, we might represent the syntax of the types of the
language as follows:

The representation of Typ is held abstract, permitting implementations
to use a fancy representation (e.g., hash-consing to manage sharing).
Without view patterns, using this signature a little inconvenient:

That is, we add a new form of pattern, written
expression->pattern that means "apply the expression to
whatever we're trying to match against, and then match the result of
that application against the pattern". The expression can be any Haskell
expression of function type, and view patterns can be used wherever
patterns are used.

The semantics of a pattern (exp->pat) are as follows:

Scoping:

The variables bound by the view pattern are the variables bound by
pat.

Any variables in exp are bound occurrences,
but variables bound "to the left" in a pattern are in scope. This
feature permits, for example, one argument to a function to be used in
the view of another argument. For example, the function
clunky from Section 7.3.4, “Pattern guards” can be
written using view patterns as follows:

That is, the scoping is the same as it would be if the curried arguments
were collected into a tuple.

In mutually recursive bindings, such as let,
where, or the top level, view patterns in one
declaration may not mention variables bound by other declarations. That
is, each declaration must be self-contained. For example, the following
program is not allowed:

That is, to match a variable v against a pattern
(exp->pat), evaluate (exp v) and match the result against
pat.

Efficiency: When the same view function is applied in
multiple branches of a function definition or a case expression (e.g.,
in size above), GHC makes an attempt to collect these
applications into a single nested case expression, so that the view
function is only applied once. Pattern compilation in GHC follows the
matrix algorithm described in Chapter 4 of The
Implementation of Functional Programming Languages. When the
top rows of the first column of a matrix are all view patterns with the
"same" expression, these patterns are transformed into a single nested
case. This includes, for example, adjacent view patterns that line up
in a tuple, as in

f ((view -> A, p1), p2) = e1
f ((view -> B, p3), p4) = e2

The current notion of when two view pattern expressions are "the
same" is very restricted: it is not even full syntactic equality.
However, it does include variables, literals, applications, and tuples;
e.g., two instances of view ("hi", "there") will be
collected. However, the current implementation does not compare up to
alpha-equivalence, so two instances of (x, view x ->
y) will not be coalesced.

7.3.6. n+k patterns

n+k pattern support is disabled by default. To enable
it, you can use the -XNPlusKPatterns flag.

7.3.7. Traditional record syntax

Traditional record syntax, such as C {f = x}, is enabled by default.
To disable it, you can use the -XNoTraditionalRecordSyntax flag.

7.3.8. The recursive do-notation

The do-notation of Haskell 98 does not allow recursive bindings,
that is, the variables bound in a do-expression are visible only in the textually following
code block. Compare this to a let-expression, where bound variables are visible in the entire binding
group. It turns out that several applications can benefit from recursive bindings in
the do-notation. The -XDoRec flag provides the necessary syntactic support.

The background and motivation for recursive do-notation is described in
A recursive do for Haskell,
by Levent Erkok, John Launchbury,
Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania.
The theory behind monadic value recursion is explained further in Erkok's thesis
Value Recursion in Monadic Computations.
However, note that GHC uses a different syntax than the one described in these documents.

7.3.8.1. Details of recursive do-notation

The recursive do-notation is enabled with the flag -XDoRec or, equivalently,
the LANGUAGE pragma DoRec. It introduces the single new keyword "rec",
which wraps a mutually-recursive group of monadic statements,
producing a single statement.

Similar to a let
statement, the variables bound in the rec are
visible throughout the rec group, and below it.
For example, compare

In both cases, r1 and r2 are
available both throughout the let or rec block, and
in the statements that follow it. The difference is that let is non-monadic,
while rec is monadic. (In Haskell let is
really letrec, of course.)

The static and dynamic semantics of rec can be described as follows:

First,
similar to let-bindings, the rec is broken into
minimal recursive groups, a process known as segmentation.
For example:

The details of segmentation are described in Section 3.2 of
A recursive do for Haskell.
Segmentation improves polymorphism, reduces the size of the recursive "knot", and, as the paper
describes, also has a semantic effect (unless the monad satisfies the right-shrinking law).

Then each resulting rec is desugared, using a call to Control.Monad.Fix.mfix.
For example, the rec group in the preceding example is desugared like this:

The original rec typechecks exactly
when the above desugared version would do so. For example, this means that
the variables vs are all monomorphic in the statements
following the rec, because they are bound by a lambda.

The mfix function is defined in the MonadFix
class, in Control.Monad.Fix, thus:

class Monad m => MonadFix m where
mfix :: (a -> m a) -> m a

Here are some other important points in using the recursive-do notation:

It is enabled with the flag -XDoRec.

If recursive bindings are required for a monad,
then that monad must be declared an instance of the MonadFix class.

The following instances of MonadFix are automatically provided: List, Maybe, IO.
Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class
for Haskell's internal state monad (strict and lazy, respectively).

Like let and where bindings,
name shadowing is not allowed within a rec;
that is, all the names bound in a single rec must
be distinct (Section 3.3 of the paper).

7.3.8.2. Mdo-notation (deprecated)

GHC used to support the flag -XRecursiveDo,
which enabled the keyword mdo, precisely as described in
A recursive do for Haskell,
but this is now deprecated. Instead of mdo { Q; e }, write
do { rec Q; e }.

Historical note: The old implementation of the mdo-notation (and most
of the existing documents) used the name
MonadRec for the class and the corresponding library.
This name is not supported by GHC.

7.3.9. Parallel List Comprehensions

Parallel list comprehensions are a natural extension to list
comprehensions. List comprehensions can be thought of as a nice
syntax for writing maps and filters. Parallel comprehensions
extend this to include the zipWith family.

A parallel list comprehension has multiple independent
branches of qualifier lists, each separated by a `|' symbol. For
example, the following zips together two lists:

[ (x, y) | x <- xs | y <- ys ]

The behaviour of parallel list comprehensions follows that of
zip, in that the resulting list will have the same length as the
shortest branch.

We can define parallel list comprehensions by translation to
regular comprehensions. Here's the basic idea:

There are three new keywords: group, by, and using.
(The functions sortWith and groupWith are not keywords; they are ordinary
functions that are exported by GHC.Exts.)

There are five new forms of comprehension qualifier,
all introduced by the (existing) keyword then:

then f

This statement requires that f have the type
forall a. [a] -> [a]. You can see an example of its use in the
motivating example, as this form is used to apply take 5.

then f by e

This form is similar to the previous one, but allows you to create a function
which will be passed as the first argument to f. As a consequence f must have
the type forall a. (a -> t) -> [a] -> [a]. As you can see
from the type, this function lets f "project out" some information
from the elements of the list it is transforming.

An example is shown in the opening example, where sortWith
is supplied with a function that lets it find out the sum salary
for any item in the list comprehension it transforms.

then group by e using f

This is the most general of the grouping-type statements. In this form,
f is required to have type forall a. (a -> t) -> [a] -> [[a]].
As with the then f by e case above, the first argument
is a function supplied to f by the compiler which lets it compute e on every
element of the list being transformed. However, unlike the non-grouping case,
f additionally partitions the list into a number of sublists: this means that
at every point after this statement, binders occurring before it in the comprehension
refer to lists of possible values, not single values. To help understand
this, let's look at an example:

Note that we have used the the function to change the type
of x from a list to its original numeric type. The variable y, in contrast, is left
unchanged from the list form introduced by the grouping.

then group using f

With this form of the group statement, f is required to simply have the type
forall a. [a] -> [[a]], which will be used to group up the
comprehension so far directly. An example of this form is as follows:

output = [ x
| y <- [1..5]
, x <- "hello"
, then group using inits]

This will yield a list containing every prefix of the word "hello" written out 5 times:

Note: Even though most of these examples are using the list monad,
monad comprehensions work for any monad.
The base package offers all necessary instances for
lists, which make MonadComprehensions backward
compatible to built-in, transform and parallel list comprehensions.

More formally, the desugaring is as follows. We write D[ e | Q]
to mean the desugaring of the monad comprehension [ e | Q]:

The comprehension should typecheck when its desugaring would typecheck.

Monad comprehensions support rebindable syntax (Section 7.3.12, “Rebindable syntax and the implicit Prelude import”).
Without rebindable
syntax, the operators from the "standard binding" module are used; with
rebindable syntax, the operators are looked up in the current lexical scope.
For example, parallel comprehensions will be typechecked and desugared
using whatever "mzip" is in scope.

The rebindable operators must have the "Expected type" given in the
table above. These types are surprisingly general. For example, you can
use a bind operator with the type

(>>=) :: T x y a -> (a -> T y z b) -> T x z b

In the case of transform comprehensions, notice that the groups are
parameterised over some arbitrary type n (provided it
has an fmap, as well as
the comprehension being over an arbitrary monad.

7.3.12. Rebindable syntax and the implicit Prelude import

GHC normally imports
Prelude.hi files for you. If you'd
rather it didn't, then give it a
-XNoImplicitPrelude option. The idea is
that you can then import a Prelude of your own. (But don't
call it Prelude; the Haskell module
namespace is flat, and you must not conflict with any
Prelude module.)

Suppose you are importing a Prelude of your own
in order to define your own numeric class
hierarchy. It completely defeats that purpose if the
literal "1" means "Prelude.fromInteger
1", which is what the Haskell Report specifies.
So the -XRebindableSyntax
flag causes
the following pieces of built-in syntax to refer to
whatever is in scope, not the Prelude
versions:

Arrow
notation (see Section 7.13, “Arrow notation
”)
uses whatever arr,
(>>>), first,
app, (|||) and
loop functions are in scope. But unlike the
other constructs, the types of these functions must match the
Prelude types very closely. Details are in flux; if you want
to use this, ask!

-XRebindableSyntax implies -XNoImplicitPrelude.

In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
even if that is a little unexpected. For example, the
static semantics of the literal 368
is exactly that of fromInteger (368::Integer); it's fine for
fromInteger to have any of the types:

Even though there are two x's in scope,
it is clear that the x in the pattern in the
definition of ok1 can only mean the field
x from type S. Similarly for
the function ok2. However, in the record update
in bad1 and the record selection in bad2
it is not clear which of the two types is intended.

Haskell 98 regards all four as ambiguous, but with the
-XDisambiguateRecordFields flag, GHC will accept
the former two. The rules are precisely the same as those for instance
declarations in Haskell 98, where the method names on the left-hand side
of the method bindings in an instance declaration refer unambiguously
to the method of that class (provided they are in scope at all), even
if there are other variables in scope with the same name.
This reduces the clutter of qualified names when you import two
records from different modules that use the same field name.

With -XDisambiguateRecordFields you can use unqualified
field names even if the corresponding selector is only in scope qualified
For example, assuming the same module M as in our earlier example, this is legal:

Since the constructor MkS is only in scope qualified, you must
name it M.MkS, but the field x does not need
to be qualified even though M.x is in scope but x
is not. (In effect, it is qualified by the constructor.)

7.3.16. Record puns

Record puns are enabled by the flag -XNamedFieldPuns.

When using records, it is common to write a pattern that binds a
variable with the same name as a record field, such as:

data C = C {a :: Int}
f (C {a = a}) = a

Record punning permits the variable name to be elided, so one can simply
write

f (C {a}) = a

to mean the same pattern as above. That is, in a record pattern, the
pattern a expands into the pattern a =
a for the same name a.

Note that:

Record punning can also be used in an expression, writing, for example,

let a = 1 in C {a}

instead of

let a = 1 in C {a = a}

The expansion is purely syntactic, so the expanded right-hand side
expression refers to the nearest enclosing variable that is spelled the
same as the field name.

Puns and other patterns can be mixed in the same record:

data C = C {a :: Int, b :: Int}
f (C {a, b = 4}) = a

Puns can be used wherever record patterns occur (e.g. in
let bindings or at the top-level).

A pun on a qualified field name is expanded by stripping off the module qualifier.
For example:

f (C {M.a}) = a

means

f (M.C {M.a = a}) = a

(This is useful if the field selector a for constructor M.C
is only in scope in qualified form.)

7.3.17. Record wildcards

Record wildcards are enabled by the flag -XRecordWildCards.
This flag implies -XDisambiguateRecordFields.

For records with many fields, it can be tiresome to write out each field
individually in a record pattern, as in

Record wildcard syntax permits a ".." in a record
pattern, where each elided field f is replaced by the
pattern f = f. For example, the above pattern can be
written as

f (C {a = 1, ..}) = b + c + d

More details:

Wildcards can be mixed with other patterns, including puns
(Section 7.3.16, “Record puns
”); for example, in a pattern C {a
= 1, b, ..}). Additionally, record wildcards can be used
wherever record patterns occur, including in let
bindings and at the top-level. For example, the top-level binding

C {a = 1, ..} = e

defines b, c, and
d.

Record wildcards can also be used in expressions, writing, for example,

let {a = 1; b = 2; c = 3; d = 4} in C {..}

in place of

let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}

The expansion is purely syntactic, so the record wildcard
expression refers to the nearest enclosing variables that are spelled
the same as the omitted field names.

The ".." expands to the missing
in-scope record fields.
Specifically the expansion of "C {..}" includes
f if and only if:

f is a record field of constructor C.

The record field f is in scope somehow (either qualified or unqualified).

In the case of expressions (but not patterns),
the variable f is in scope unqualified,
apart from the binding of the record selector itself.

The R{..} expands to R{M.a=a},
omitting b since the record field is not in scope,
and omitting c since the variable c
is not in scope (apart from the binding of the
record selector c, of course).

7.3.18. Local Fixity Declarations

A careful reading of the Haskell 98 Report reveals that fixity
declarations (infix, infixl, and
infixr) are permitted to appear inside local bindings
such those introduced by let and
where. However, the Haskell Report does not specify
the semantics of such bindings very precisely.

In GHC, a fixity declaration may accompany a local binding:

let f = ...
infixr 3 `f`
in
...

and the fixity declaration applies wherever the binding is in scope.
For example, in a let, it applies in the right-hand
sides of other let-bindings and the body of the
letC. Or, in recursive do
expressions (Section 7.3.8, “The recursive do-notation
”), the local fixity
declarations of a let statement scope over other
statements in the group, just as the bound name does.

Moreover, a local fixity declaration *must* accompany a local binding of
that name: it is not possible to revise the fixity of name bound
elsewhere, as in

let infixr 9 $ in ...

Because local fixity declarations are technically Haskell 98, no flag is
necessary to enable them.

7.3.19. Package-qualified imports

With the -XPackageImports flag, GHC allows
import declarations to be qualified by the package name that the
module is intended to be imported from. For example:

import "network" Network.Socket

would import the module Network.Socket from
the package network (any version). This may
be used to disambiguate an import when the same module is
available from multiple packages, or is present in both the
current package being built and an external package.

Note: you probably don't need to use this feature, it was
added mainly so that we can build backwards-compatible versions of
packages when APIs change. It can lead to fragile dependencies in
the common case: modules occasionally move from one package to
another, rendering any package-qualified imports broken.

7.3.20. Safe imports

With the -XSafe, -XTrustworthy
and -XUnsafe language flags, GHC extends
the import declaration syntax to take an optional safe
keyword after the import keyword. This feature
is part of the Safe Haskell GHC extension. For example:

import safe qualified Network.Socket as NS

would import the module Network.Socket
with compilation only succeeding if Network.Socket can be
safely imported. For a description of when a import is
considered safe see Section 7.23, “Safe Haskell”

7.3.21. Summary of stolen syntax

Turning on an option that enables special syntax
might cause working Haskell 98 code to fail
to compile, perhaps because it uses a variable name which has
become a reserved word. This section lists the syntax that is
"stolen" by language extensions.
We use
notation and nonterminal names from the Haskell 98 lexical syntax
(see the Haskell 98 Report).
We only list syntax changes here that might affect
existing working programs (i.e. "stolen" syntax). Many of these
extensions will also enable new context-free syntax, but in all
cases programs written to use the new syntax would not be
compilable without the option enabled.

There are two classes of special
syntax:

New reserved words and symbols: character sequences
which are no longer available for use as identifiers in the
program.

Other special syntax: sequences of characters that have
a different meaning when this particular option is turned
on.