From the outside in

Notice that the forall that we're used to not seeing in Haskell is explicit in Core (bye bye, syntactic sugar!).

Notice also that the type parameter named a in Haskell got renamed to a_abp, so that it's unique.

If a crops up in a signature for another top-level function, it will be renamed to something different. This "uniqueness renaming" can sometimes make following types a little confusing.

Type names are fully qualified: GHC.Types.Int instead of Int.
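For comparison, here is a sketch of Haskell source that could plausibly give rise to Core like this. The body of len0 is my reconstruction (an assumption, since the source isn't repeated here), and the module is called Main rather than Length only so that it runs on its own. The ExplicitForAll extension lets us write by hand the forall that GHC normally inserts for us.

```haskell
{-# LANGUAGE ExplicitForAll #-}
module Main (main) where

-- Assumed reconstruction of len0; the original source may differ.
-- Without the extension we'd write the signature as [a] -> Int,
-- and GHC would quantify over a implicitly.
len0 :: forall a. [a] -> Int
len0 (_:xs) = 1 + len0 xs
len0 []     = 0

main :: IO ()
main = print (len0 "hello")  -- prints 5
```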

Function annotations

[GblId, Arity=1]

This is a global identifier, and is a function that takes one parameter.

Type application

Length.len0 = \ (@ a_aov) (ds_dpn :: [a_aov]) ->

The '@' annotation here marks a type argument. In this lambda it binds the type variable a_aov (another renaming of a); at a call site, the same notation applies a function to a type.

Type applications are of little real interest to us right here, but at least we know what this notation is (and we'll see it again soon).
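If you'd like to play with this notation in source Haskell, the TypeApplications extension (GHC 8.0 and newer) offers something similar. The sketch below reuses an assumed definition of len0 and passes the type argument by hand, much as Core does on every call. It is only an analogy: Core's type arguments are always present, while in Haskell they are optional.

```haskell
{-# LANGUAGE ScopedTypeVariables, TypeApplications #-}
module Main (main) where

-- ScopedTypeVariables brings the a from the signature into scope in
-- the body, so we can mention it in the recursive call.
len0 :: forall a. [a] -> Int
len0 (_:xs) = 1 + len0 @a xs  -- compare Core: Length.len0 @ a_aov xs_abq
len0 []     = 0

main :: IO ()
main = do
  print (len0 @Char "abc")  -- element type chosen explicitly: prints 3
  print (len0 @Int [])      -- prints 0
```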

Case analysis, part 1

case ds_dpn of _ { [] -> GHC.Types.I# 0;

This looks like regular Haskell. Hooray!

Since that's hardly interesting, let's focus on the right hand side above, namely this expression:

GHC.Types.I# 0

The I# above is the value constructor for the Int type.

This indicates that we are allocating a boxed integer on the heap.
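We can use the I# constructor from ordinary Haskell too, given the MagicHash extension. This is a small sketch, not something you'd normally write by hand; the names zero and plusInt are made up for illustration.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (I#), (+#))

-- Box the unboxed literal 0#, as Core's GHC.Types.I# 0 does.
zero :: Int
zero = I# 0#

-- Unbox both arguments, add the raw machine integers, box the result.
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)

main :: IO ()
main = print (plusInt zero (I# 42#))  -- prints 42
```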

Case analysis, part 2

: ds1_dpo xs_abq ->

Normal pattern matching on the list type's : constructor. In Core, we use prefix notation, since we've eliminated syntactic sugar.

GHC.Num.+ @ GHC.Types.Int GHC.Num.$fNumInt

We're calling the + operator, applied to the Int type.

GHC.Num.$fNumInt is a dictionary.

It indicates that we are passing the Num dictionary for the Int type to +, so that it can determine which function to really call.

In other words, dictionary passing has gone from implicit in Haskell to explicit in Core. This will be really helpful!
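To make dictionary passing concrete, here is a hand-rolled imitation in plain Haskell. NumDict, plus, times and numDictInt are hypothetical names of my own; GHC's real Num dictionary has more methods and a different representation. The shape of the call is what matters: we pick the method out of the dictionary, then apply it.

```haskell
module Main (main) where

-- A record of methods standing in for a Num dictionary (hypothetical).
data NumDict a = NumDict
  { plus  :: a -> a -> a
  , times :: a -> a -> a
  }

-- Plays the role of GHC.Num.$fNumInt: the methods specialised to Int.
numDictInt :: NumDict Int
numDictInt = NumDict { plus = (+), times = (*) }

-- len0 with the lookup of + spelled out, roughly as Core does it.
len0 :: [a] -> Int
len0 (_:xs) = plus numDictInt 1 (len0 xs)
len0 []     = 0

main :: IO ()
main = print (len0 "core")  -- prints 4
```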

The actual parameters to +

Finally, we allocate an integer on the heap.

We'll add it to the result of calling len0 on the second argument to the : constructor, where we're applying the a_aov type again.

(GHC.Types.I# 1) (Length.len0 @ a_aov xs_abq)

Strictness in Core

In System FC, all evaluation is controlled through case expressions. A use of case demands that an expression be evaluated to WHNF, i.e. to the outermost constructor.

Some examples:

-- Haskell:
foo (Bar a b) = {- ... -}

-- Core:
foo wa = case wa of _ { Bar a b -> {- ... -} }

-- Haskell:
{-# LANGUAGE BangPatterns #-}
let !a = 2 + 2 in foo a

-- Core:
case 2 + 2 of a { __DEFAULT -> foo a }

-- Haskell:
a `seq` b

-- Core:
case a of _ { __DEFAULT -> b }
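Here's a way to watch these rules in action from ordinary Haskell. This is a sketch with made-up names (lazyConst, strictConst, seqConst, throwsWhenForced): we pass undefined to each function and check whether forcing the result blows up. Only the versions that would compile to a case on the argument should throw.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Exception (SomeException, evaluate, try)

-- Never inspects its argument, so nothing forces it.
lazyConst :: Int -> Int
lazyConst _ = 0

-- The bang pattern forces the argument to WHNF before returning.
strictConst :: Int -> Int
strictConst !_ = 0

-- seq forces its first argument, then returns its second.
seqConst :: Int -> Int
seqConst x = x `seq` 0

-- True if evaluating the value to WHNF throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = either caught (const False) <$> try (evaluate x)
  where
    caught :: SomeException -> Bool
    caught _ = True

main :: IO ()
main = do
  throwsWhenForced (lazyConst undefined)   >>= print  -- False
  throwsWhenForced (strictConst undefined) >>= print  -- True
  throwsWhenForced (seqConst undefined)    >>= print  -- True
```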

Pop quiz!

Inspect the output of ghc -ddump-simpl and tell me which values are, and which are not, being forcibly evaluated in the definition of sum0.

In return, I'll tell you why we got this error message:

*** Exception: stack overflow

The evaluation stack

There is no such thing as a regular "call stack" in Haskell, no analogue to the stack you're used to thinking of in C or Python or whatever.

When GHC hits a case expression, and must evaluate a possibly thunked expression to WHNF, it uses an internal stack.

This stack has a bounded size: the limit defaults to 8MB in the GHC versions discussed here, and can be changed with the +RTS -K option.

The limit exists to prevent a program that's stuck in an infinite loop from consuming all memory.

Most of the time, if you have a thunk that requires anywhere close to 8MB to evaluate, there's likely a problem in your code.
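As an illustration of how a thunk can come to need that much stack, consider the two accumulator loops sketched below. Both are hypothetical (sumLazy and sumStrict are not the sum0 from the quiz). In the lazy version nothing forces acc until the very end, so it can grow into a long chain of suspended additions that must all be unwound at once. Bear in mind that with optimisation turned on, GHC's strictness analysis may well rescue the lazy version for you.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Lazy accumulator: acc may become (((0 + 1) + 2) + 3) + ...,
-- a chain of thunks that is only evaluated when the result is demanded.
sumLazy :: [Int] -> Int
sumLazy = go 0
  where
    go acc (x:xs) = go (acc + x) xs
    go acc []     = acc

-- Strict accumulator: the bang forces acc on every iteration,
-- so no chain of thunks can build up.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc (x:xs) = go (acc + x) xs
    go !acc []     = acc

main :: IO ()
main = print (sumStrict [1 .. 1000000])  -- prints 500000500000
```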

The perils of chained thunks

There are a few ways in which chained thunks can cause us harm.

Besides stack overflows, I can think of two more problems off the top of my head.

Please see if you can tell me what those problems are.

They have a space cost, since they must be allocated on the heap.

They come with a time cost, once evaluation to WHNF is demanded.

So ... thunks are bad?

No, because they enable lazy evaluation.

What's bad is not knowing when lazy or strict evaluation is occurring.

But now that you can read -ddump-simpl output and find those case expressions, you'll be able to tell immediately.

With a little experience, you'll often be able to determine the strictness properties of small Haskell snippets by inspection. (For those times when you can't, -ddump-simpl will still be your friend.)

Pro tips

If you're using GHC 7.2 or newer and want to read simplifier output, consider using options like -dsuppress-all to prevent GHC from annotating the Core.

It makes the dumped Core more readable, but at the cost of information that is sometimes useful.

There are a handful of these suppression options, such as -dsuppress-uniques and -dsuppress-idinfo (see the GHC man page), so you can control more finely what gets suppressed.
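If you'd rather not retype the flags every time, one option is to put them in an OPTIONS_GHC pragma at the top of the module you're studying. A minimal sketch, assuming a throwaway module with a made-up len0; GHC prints the Core while compiling, so remember to remove the pragma afterwards.

```haskell
{-# OPTIONS_GHC -ddump-simpl -dsuppress-all #-}
module Main (main) where

-- Compiling this module dumps its simplified Core,
-- with the annotations suppressed.
len0 :: [a] -> Int
len0 (_:xs) = 1 + len0 xs
len0 []     = 0

main :: IO ()
main = print (len0 [(), (), ()])  -- prints 3
```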

Also, try installing and using the ghc-core tool to automate some of the pain:

cabal install ghc-core

The role of reading Core

I always reach for -ddump-simpl after:

I already have a working program.

I'm happy (in principle) with the algorithms and data structures I'm using. No amount of local tweaking of strictness is going to save me from the consequences of a poor choice of algorithm or data structure!

I've written QuickCheck tests, even if only one or two.

I have measured the performance of my code and find it wanting.

A couple of minutes with simplifier output will help guide me to the one or two strictness annotations I'm likely to really need.

This saves me from the common newbie mistake of a random splatter of unnecessary strictness annotations, indicating a high level of panic and lack of understanding.
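To give a flavour of what "one or two annotations" can look like, here is a sketch of a mean function with a strict accumulator. The Acc type and the function are invented for this example. The only strictness annotations are the two bangs on the fields; foldl' takes care of forcing the accumulator itself at each step.

```haskell
module Main (main) where

import Data.List (foldl')

-- Strict fields: the running sum and count are evaluated every time
-- an Acc is constructed, so neither can pile up as a chain of thunks.
data Acc = Acc !Double !Int

mean :: [Double] -> Double
mean xs = total / fromIntegral count
  where
    Acc total count  = foldl' step (Acc 0 0) xs
    step (Acc s n) x = Acc (s + x) (n + 1)

main :: IO ()
main = print (mean [1, 2, 3, 4])  -- prints 2.5
```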

Find out more

We've scratched the surface of some of the tools and techniques you can use, but there's plenty more to learn.