February 27, 2008

Using Reflector to diagnose tail call optimization in F#

I mentioned in my last post about some problems I was having with tail recursion, and my choice to replace certain recursive functions with iterative versions. Today we're going to use Reflector to take a look under the hood of some compiled assemblies, to determine which recursive functions have been optimised correctly and which have not.

Let's start by taking a look at the two recursive functions I mentioned last time, starting with the most simple:

// A recursive function to show the contents of a list

letrec drawPointList (x:Point3d list) =

match x with

| [] -> ()

| h :: t ->

ed.DrawVector(h,h,1,true)

drawPointList t

Here's what we see when we open the compiled assembly in Reflector and disassemble the equivalent function into C#:

[Serializable]

internalclass drawPointList@140 : FastFunc<List<Point3d>, Unit>

{

// Fields

public Editor ed;

// Methods

public drawPointList@140(Editor ed)

{

this.ed = ed;

}

publicoverride Unit Invoke(List<Point3d> x)

{

while (true)

{

List<Point3d> list = x;

if (!(list is List<Point3d>._Cons))

{

break;

}

List<Point3d> list2 = ((List<Point3d>._Cons) list)._Cons2;

Point3d from = ((List<Point3d>._Cons) list)._Cons1;

this.ed.DrawVector(from, from, 1, true);

x = list2;

}

returnnull;

}

}

You can see that the implementation of Invoke contains a while loop, which loops forever (until the list is empty), rather than calling recursively into itself.

Now let's take a look at the other, more complex function, which generates a list of points via recursion:

// A function that accepts an ObjectId and returns

// a list of random points on its surface

letrec getNPoints n (sol:Solid3d) =

if n <= 0 then

[]

else

let mp = sol.MassProperties

let pl = new Plane()

pl.Set(mp.Centroid,randomVector3d sol.ObjectId.OldId)

let reg = sol.GetSection(pl)

let ray = new Ray()

ray.BasePoint <- mp.Centroid

ray.UnitDir <- randomVectorOnPlane pl

let pts = new Point3dCollection()

reg.IntersectWith

(ray,

Intersect.OnBothOperands,

pts,

0, 0)

pl.Dispose()

reg.Dispose()

ray.Dispose()

let ptlist = Seq.untyped_to_list pts

ptlist @ getNPoints (n - pts.Count) sol

This function, when compiled by the F# compiler integrated into Visual Studio and disassembled by Reflector, equates to this C# code:

While a little harder to follow (please don't worry about the details), we can see that the recursive tail call has not been optimised out, as we don't have an iterative construct (e.g. a while loop). Instead we see a call to InvokeFast2 (which will result in Invoke being called recursively), prior to the value being returned.

Why exactly this case hasn't been handled isn't clear to me - in fact I'm pretty sure it's a bug in F#, which I will go ahead and report - but it is clear that (at least for now) we need to change the implementation if we're to avoid a stack overflow. And that's fine: the last post contains an iterative version we can use.

Finallly we're going to take a look at parts of the assembly created by building the code in the recent Robotic Hatchingposts. As I mentioned previously, I ended up hitting problems when creating 400 or so segments in the polyline hatch when using my F# implementation. Rather than automatically changing each of the recursive functions (those prefixed by "let rec") to iterative versions, I checked the disassembly listings one by one until I found the problematic one:

// A recursive function to get the points

// for our path

let

if times <= 0 then

[]

else

try

let pt =

nextBoundaryPoint cur start doTrace

(pt :: definePath pt (times-1))

with exn ->

if exn.Message = "Could not get point"then

definePath start times

else

failwith exn.Message

rec definePath start times =

Here's how the compiled (and then disassembled/reassembled) function looks in C#:

This one is actually only recurses when we catch an exception we expect to occur (and actually have thrown ourselves elsewhere in the code) occasionally during run-time. A better implementation - one that F#'s compiler is able to tail call optimise and meets my aesthetic requirements - is here:

// An iterative function to get the points

// for our path

let definePath start times =

let pts = new Point3dCollection()

pts.Add(start) |> ignore

letmutable i = 0

while i < times do

try

let pt =

nextBoundaryPoint

cur pts.[pts.Count-1] doTrace runAsync

pts.Add(pt) |> ignore

i <- i + 1

with exn ->

if exn.Message <> "Could not get point"then

failwith exn.Message

Seq.untyped_to_list pts

I won't bother showing this function's C# equivalent, as determined by Reflector, you should just trust me that it's better. :-)

I did change the implementation slightly, so we no longer add our first vertex when we create the polyline - we simply add it to the array of vertices, which is cleaner - and we also have to pass a start index of 0 into the definePath function, rather than 1. Very minor changes, but they actually do make the code slightly simpler.

Anyway, this kind of analysis and control is made possible by the fact that F# compiles to IL, just like the other .NET languages. This allows us to take advantage of existing tools for code analysis & obfuscation, and ultimately give us better, more consistent control than we would get from a functional programming implementation outside of .NET.