The problem is the way that the compiler turns pattern matches into case expressions. XXX This issue deserves some extra thought.

Wishlist of features (please feel free to add your ideas here)

Backtracing

Perhaps the most useful missing feature is the ability to see where the program has been executing just prior to a breakpoint. A lexical call stack is probably out of the question (see: ExplicitCallStack, and ExplicitCallStack/StackTraceExperience), but we could keep track of the most recently visited breakable expressions. A key question is what kind of information should we keep about the evaluation history. The simplest thing is to record just source locations. A more adventurous idea is to keep the local variables too. One probem with keeping local variables is that we would have to make sure the GC keeps them alive, so the data structure for storing the evaluation history would have to be traversable by the GC. Note that by keeping things alive, there is the danger of introducing space leaks. Obviously we would not want to keep the whole evaluation history, but only the last N events, where N is a configurable parameter. Keeping track of the history is also likely to incur a performance penalty, so it might be advantageous to be able to turn it off, or perhaps have it off by default, and be able to turn it on.

It would be especially useful to be able to get backtraces when exceptions are raised.

Temporarily disable breakpoints

Typically when we reach a breakpoint we want to inspect the values of local variables. As is often the case the values are thunks. So, to print them, we must force them in some way, and that sometimes raises more breakpoints. Often this is annoying. It would be handy if the debugger allowed us to temporarily disable all breakpoints. It should be relatively easy to implement. The main question is what is the right user interface to the feature?

Todo

P1 we shouldn't break multiple times for an exceptions, see break011 (at the end)

P1 can't do bold on Windows

P1 It's a bit strange that in "f = e" you don't get a breakpoint covering the whole binding, but in "f x = e" you do. :break f doesn't work if f is not a function, for example.

P2 :force should catch exceptions, so [1,2,undefined] would display as [1,2,< exception >]

P2 sometimes the result types are wrong (see result001 test)

P2 when showing a SrcSpan, we could say which top-level function contains it. Ideally we should store the declPath that
the coverage pass collects, but as a [OccName] rather than [String].

P2 perhaps we should have a :watch command, that could be used to save variables for future inspection (they wouldn't get
thrown away by :continue)

P2 We can disable a breakpoint with ":set stop N :continue", but this still prints out the breakpoint info when we stop.
Should we print the info only if there were no commands?

P2 Revert to adding tick information to the BCO directly, and remove the byte code instructions for breaks. I'm not sure that this is worth it. In some ways the implementation based on a byte code instruction is a little cleaner than adding breaks on BCOs directly. Though the bc instruction method may be a little slower than the other way. (MODERATE/DIFFICULT)

P2 Flag to disable breakpoints?

P2 When we restore the interactive context on resume, we throw away any new bindings made since the breakpoint. Can this
be fixed?

P2 threads and breakpoints.

P2 if a :force results in a breakpoint, we should treat it as we do other evaluations. (currently we get
"* Ignoring breakpoint").

Implementation notes

How does the debugger work?

Source code instrumentation

At the front end of the compiler we annotate the source code with ticks, based on the program coverage tool of Andy Gill and Colin Runciman. Ticks are uniquely numbered with respect to a particular module. Ticks are annotations on expressions, so each tick is associated with a source span, which identifies the start and end locations of the ticked expression.

Roughly, if E is an expression to be ticked, its annotated form becomes:

case tick<N> of _ -> E

where <N> is the module-unique number of the tick.

The ticks are added in the de-sugaring phase of the front end, and the instrumentation is implemented in deSugar/Coverage.lhs. Note, we may eventually want to merge the source coverage tool and the debugger. If we do this, it would be useful to have a single piece of code for adding tick annotations. At the moment the debugger does not use all the ticks that the coverage tool uses.

Slightly later in the de-sugarer we add arguments ticks which correspond to the free variables in scope at the ticked expression. This is done in deSugar/DsUtils.lhs, by the function mkTickBox. If a,b,c are the free variables of a ticked expression E, then the annotation from above is elaborated to:

case tick<N> a b c of _ -> E

To make the core lint happy, we must conjure up a type for tick<N>. If a::T1, b::T2, c::T3 then the type is:

tick<N> :: T1 -> T2 -> T3 -> State# RealWorld

We are somewhat selective about where ticks go in the code, and it would be nice if this was documented properly. I will defer this until the code has stablised.

We assume, and indeed require, that each source span has at most one tick associated with it. This was not always upheld in the coverage tool (in the case of if-then-else expressions), so we had to modify the instrumentation a little bit.

For each module we also allocate an array of breakpoint flags, with one entry for each tick in that module. This array is managed by the GHC storage manager, so it can be garbage collected if the module is re-loaded and re-ticked. We retain this array inside the ModDetails data structure, which is defined in main/HscTypes.lhs. In the current implementation the array is stored inside something called ModBreaks, which also stores an associtation list of source spans and ticks. However, the exact implementation of this depends on what we want in the API for the debugger, and it is likely that it will change soon. Also, ModBreaks is in desperate need of a new home. At the moment it is floating around somewhere in the deSugar directory, but that is almost certainly the wrong place for it.

Byte code generation

In the coverage tool the ticks are turned into real code which performs a side effect when evaluated. In the debugger the ticks are purely annotations. They are used to pass information to the byte code generator, which generates special breakpoint instructions for ticked expressions. The ticks themselves are eventually deleted.

The byte code generator turns GHC Core into a bunch of Byte Code Objects (BCOs). BCOs are heap objects which correspond to top-level bindings, and let and case expressions. Each BCO contains a sequence of byte code instructions (BCIs), which are executed by the byte code interpreter (rts/Interpreter.c). Each BCO also contains some local data which is needed in the instructions.

Given a ticked expression of the form:

case tick<N> a b c of _ -> E

we translate it into:

let freshVar = E in E

(Note: if the ticked expression was already let-bound we do not do this step, since it would be pointless.) The idea is that the let expression will be turned into a BCO. We annotate the BCO with information about the tick, such as its free variables, their offsets in the stack, and the tick number. We also store a pointer in the BCO to a breakpoint array for this particular module (which was introduced by the coverage transformation, see above), and an offset into that array. The offset corresponds to the tick number. The entries of the array are (currently) boolean flags which, at runtime, determine whether we should stop at the breakpoint or not.

The BCIs for this BCO are generated as usual, and we prefix a new special breakpoint instruction on the front. Thus, when the BCO is evaluated, the first thing it will do is interpret the breakpoint instruction, and hence decide whether to break or not.

There is a downside to the introduction of lets: it causes more heap allocation in the debugged program. In particular we will allocate an expression on the heap, and then immediately evaluate it. We can tune the penalty to some extent by reducing the set of breakable expressions. More timing tests are needed to decide if the penalty is too high.

We experimented with alternative ways of implementing breakpoints, with the hope of avoiding this gratuitous heap allocation, but we ran into numerous obstacles which thwarted our attempts. The big issue is that when we hit a breakpoint we must leave the stack in a state which the GC can understand. The scheme described above works nicely because the first thing we do is interpret the break instruction for the BCO. At that point nothing has been done to the stack, so it is easy to leave it in a useful state. The same cannot be said for other types of expression (especially so because only lets and cases get turned into BCOs directly, everything else is just a sequence of BCIs). We initially thought that we could do something similar for the alternative branches of a case expression (since cases are the other kind of expression that gets turned into BCOs). The problem is that the scrutinee is unpacked before the branch is entered, and the unpacking pushes values onto the stack, leaving it in a state that the GC will not understand.

Stopping at a breakpoint at runtime in the byte code interpreter

Unfortunately this part of the story is somewhat complicated, c'est la vie.

To understand what happens it is necessary to know how GHCi evaluates an expression at the command line. When the user types in an expression (as a string) it is parsed, type checked, and compiled, and then run. In main/GHC.hs we have the function:

runStmt :: Session -> String -> IO RunResult

The Session argument contains the gobs of environmental information which is important to the compiler. The String is what the user typed in, and RunResult, is the answer that you get back if the execution terminates. RunResult is defined like so:

The first three constructors are part of the original code, and the last one, RunBreak was added for the debugger. Hopefully the first three are self-explanatory; we will explain RunBreak in due course.

Normally what happens is that runStmt forks a new thread to handle the evaluation of the expression. It then blocks on an MVar and waits for the thread to finish. This MVar is (now) called statusMVar, because it carries the execution status of the computation which is being evaluated. We will discuss its type shortly. When the thread finishes it fills in statusMVar, which wakes up runStmt, and it returns a RunResult. Ultimately this gets passed back to the GHCi command line. Actually, GHCi is merely a client of the API, and other clients could also call runStmt if they wanted something evaluated.

To make the discussion comprehensible let us distinguish two threads:

The thread which runs the GHCi prompt.

The thread which is forked to run an expression.

We'll call the first one the GHCi thread, and the second the expression thread.

In the debugger, the process of evaluating an expression is made more intricate. The reason is that if the expression thread hits a breakpoint it will want to return early to the GHCi thread, so that the user can access the GHCi prompt, issue commands etcetera.

This raises a few questions:

How do we arrange for the expression thread to stop and return early?

What information needs to be passed from the expression thread to the GHCi thread, and how do we arrange that flow of information?

How do we wake up the GHCi thread and return to the prompt?

How do we continue execution of the expression thread after we have hit a breakpoint?

What happens if we are running in the GHCi thread after a breakpoint, and we evaluate some other expression which also hits a breakpoint (i.e. what about nested breakpoints?)

What happens if the expression thread forks more threads?

To arrange the early return of the expression thread when it hits a breakpoint we introduce a second MVar:

breakMVar :: MVar ()

When the expression thread hits a breakpoint it waits on breakMVar. When the user decides to continue execution after a breakpoint, the GHCi thread fills breakMVar, which wakes up the expression thread and allows it to continue execution.

Now we must return to statusMVar and look at it in more detail. We introduce a new type called Status:

data Status a
= Break RunResult
| Complete (Either Exception a)

It represents the execution status of the expression thread, which is either Complete (with an exception or a value of some type), or Break, to indicate that the thread has hit a breakpoint. It must be noted that the RunResult argument of Break is always a RunBreak. Most likely a bit of refactoring in the code could remove this bit of ugliness.

statusMVar simply contains a Status value:

statusMVar :: MVar (Status a)

The two MVars, statusMVar and breakMVar, are used like so:

When runStmt begins to execute an expression for the first time it forks the expression thread, and then waits on the statusMVar.

If the expression thread completes execution with an exception or with a final value, it fills in statusMVar with the appropriate Status value, which wakes up the GHCi thread. The Status is turned into a RunResult which gets propagated back to the command line as usual.

If the expression thread does not complete, but hits a breakpoint, it fills in the statusMVar with an appropriate Break value, and then waits on the breakMVar. The GHCi thread is woken up because of the write to statusMVar, and the RunResult is propagated back to the command line (this time it is a RunBreak).

When the user decides to continue execution after a breakpoint the GHCi thread fills in the breakMVar, thus waking up the expression thread, and then the GHCi thread waits on the statusMVar again. The whole process continues until eventually the expression thread completes its evaluation.

The arguments of RunBreak are as follows, in order from left to right:

a heap closure, specifically something which represents a chunk of the Stg stack (an StgAP_STACK, to be precise). It is inside this object that we find the values of the local variables of the breakpoint. We use a type variable here for convenience.

a thread ID of the expression thread. XXX Actually, we no longer use this value for anything, and it can probably be removed.

a BreakInfo, which stores information about the breakpoint, such as the module name, the tick number, and the stack offsets and identifiers of the local variables.

an IO action to execute when we resume execution after hitting the breakpoint. This contains code to fill and wait on the MVars mentioned earlier.

Where does the RunBreak get assembled? This is done by the I/O action which is executed by a thread when it hits a breakpoint. The code for the I/O action is as follows:

This is defined in runStmt in main/GHC.hs. We "pass" the I/O action to the runtime system by way of a global stable pointer, which is called breakPointIOAction. Note that the thread ID is possibly redundant now, but I left it there since it may be useful for other purposes. The I/O action takes two arguments: ids and apStack. The first argument is the list of local variables names, paired with their stack offsets. We need this information for printing out the local vars. The second argumet is an AP_STACK closure, which contains the top stack frame of the expression thread. This is saved when the thread hits a breakpoint in Interpreter.c. The AP_STACK is used for finding the values of the local variables of the breakpoint. So, ids and apStack are used in conjunction for inspecting local variables. Note that the I/O action proceeds to write to the statusMVar, which wakes up the GHCi thread, and then it waits on the breakMVar.

The last tricky part is how we resume execution of a thread after a breakpoint. This is the purpose of the fourth argument of Runbreak:

resume :: IO RunResult

As you can see it is an I/O action that will eventually yield a RunResult. This accords with our intuition that, at least for terminating computations, we will get another RunResult if we execute this thing. It could be another breakpoint, or it may be a final value. resume is defined like so:

The first thing it does is set up the onBreakAction global variable. Then it writes to the breakMVar which wakes up the blocked expression thread. Then it waits for statusMVar to be filled in again. Eventually when we get a status value, we call the switchInStatus function to decide what to do (either we hit another breakpoint, or we completed).

I've been a bit crafty in my implementation, and you will notice that resume and onBreakAction are mutually recursive. So in main/GHC.runStmt you will see them defined like this:

let (resume, onBreakAction)
= ( ..., ...)

The reason I did it this way is because they need to share their own versions of breakMVar and statusMVar. This must be understood in the context that we can have nested breakpoints. By writing them in this mutually recursive fashion, we can have multiple (resume, onBreaAction) pairs, and that they don't get their MVars mixed up.

When we hit a breakpoint the GHCi client pushes the resume function onto a stack. If the user evaluates a different expression, which hits another breakpoint, its resume function will be pushed on top of the old one. Eventually, when the user enters :step or :continue, the top of the resume stack is popped, and that is the action which is run next.

The view from inside the RTS: Interpreter.c

As mentioned above, we add a new kind of BCI for breakpoints. It is called bci_BRK_FUN. It is added as the first instruction to the BCI sequence for top-level and let-bindings, during Byte Code compilation. When the intepreter hits this instruction it does the following things:

Check to see if we are returning from a breakpoint (by checking a bit flag in the current TSO). If so, we don't want to stop again (this time), otherwise we'd get into an infinite loop. We record that we are no longer returning from a breakpoint, and then continue to the next BCI.

If we aren't returning from a breakpoint, then we check to see if the global single-step flag is set, or if the individual breakpoint flag for the current expression is set. If this is true, we prepare to save the stack, and call the onBreakAction. If it is not true then we skip to the next BCI.

If we are going to stop at this breakpoint, we create a new AP_STACK and copy the topmost stack frame into it. Then we push the current BCO onto the stack, and set up the onBreakAction so that when we come back to this thread the action will be executed. We then record that we are now stopping at a breakpoint, and then yield to the scheduler. When the scheduler returns to this thread the onBreakAction will be executed, which will send us back to the GHCi prompt.

The first two things are the current BCO and an info table (what do you call these things anyway?). We need these so that when we eventually resume execution from a breakpoint we will start executing the currect BCO again. The next four things correspond to the call to the onBreakAction: with its arguments pused first, then an info table, then the action itself. new_aps is the AP_STACK which saves the topmost stack frame, and arg3_freeVars corresponds to the list of local variable names paired with their stack offsets. Note that (a pointer to) this list is stored in the "pointers" array in the BCO. arg3_freeVars is actually just an integer offset into that pointer array, and it is passed as the third argument of the bci_BRK_FUN instruction.

Inspecting values

This is done exactly as it was before in the prototype debugger. See: GhciDebugger.