CLR Dynamic languages under the hood (Part 1 of many)


There seems to be a fair amount of recent press and blog action surrounding the dynamic or “scripting” language movement, especially when the context includes virtual machines. While I won’t bother commenting on why this is the case, I figured I would cook up a few rough notes (and I do stress that these are just notes) that describe how these languages effectively and efficiently target the CLR - hopefully this will inspire you to start toying with writing or porting a language on your own. I paraphrase what John Gough (author of Component Pascal http://www.plas.fit.qut.edu.au/gpcp/Default.aspx) stated recently (http://channel9.msdn.com/ShowPost.aspx?PostID=49330): “Compilers are fun, they’re full of all kinds of dirty little tricks; you can be really creative because you’re dealing with the nitty gritty down at the bare metal”. I couldn’t agree more.

Disclaimer:

I hope this post will end up being the first of a series of posts on technical notes surrounding dynamic languages on the CLR. Once again though, I do stress that these are just loose ideas – there are many, many ways to do what I’m showing here; the intent is just to get your brain thinking about the interesting challenges and ideas surrounding the dynamic language space.

Secondly – none of this is optimized. The loose ideas I illustrate are based on hypotheticals and the mapping to the CLR is sometimes far from good (you’ll see boxing and common perf-unfriendly patterns all over the place). Just keep in mind that a good, solid dynamic language compiler has the chance to optimize code for even better performance on the runtime. These optimizations, while interesting, are best left to another post.

The high level first, please… (skip if necessary)

What makes a language dynamic? A few people have been successful in writing full and detailed descriptions (http://www.tcl.tk/doc/scripting.html), but I like to think interesting dynamic languages are languages with one or more of the following things: typeless or “loosely typed” language syntax; a REPL (Read Eval Print Loop); and most importantly, a language runtime or late-binder for expression evaluation at runtime. Of course, dynamic languages are a whole lot more than this; I’ve just constrained this post to ideas around those particular aspects.

Your typical C# program (C# being a kind of static language) will cobble together a whole slew of statements, one of which will be a declaration of a variable:

string s;
Foo foo = new Foo();
...

“s” and “foo” are variables with the type “string” and “Foo”. If I then try and assign an integer to the “foo” variable, the C# compiler will give me back an error. These compiler errors are because of the strong typing rules associated with the language – once you type a variable, it’s hard if not impossible to change that type midway through a code block. Loose typing generally means that there are loose or no typing restrictions (you can do whatever you like really). You can do things like this (IronPython):

a = "hello world"
a = 1

If you think further about how this maps to the CLR, hopefully you’ll stumble on some interesting questions. How does loose typing pan out in a runtime that has a common type system? What about that verifiability thing the CLR supports, does the verifier allow loosely typed languages to run? What is the actual type of “a”? I’ll answer some of those questions later in the post.

Gimme quick definitions: REPL (Read Eval Print Loop)

REPLs are usually quite simple command line programs that read in a line of text and feed it to a compiler or interpreter for statement or expression evaluation. IronPython supports a REPL:

C:\IronPython-0.7.6\bin>IronPythonConsole.exe
IronPython 0.7.6 on .NET 2.0.50215.44
Copyright (c) Microsoft Corporation. All rights reserved.
>>> a = "IronPython has a REPL"
>>> print (a)
IronPython has a REPL
>>>

The code after each prompt is the important part. The “>>> ” is basically a prompt for the next piece of Python code to execute. Each string passed to IronPython’s REPL actually does something (even if it doesn’t print to the screen), mostly because every line in the Python language is a statement (a language design choice I guess). This language semantic is fairly common in dynamic languages.
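To make the read-eval-print mechanics concrete, here is a toy REPL in Python (purely illustrative – this is not how IronPythonConsole is implemented). It feeds each input line to the evaluator against one shared environment, echoing expression values and executing statements for their side effects:

```python
def repl(lines):
    """Toy read-eval-print loop: evaluate each line, collecting echoed output."""
    env = {}       # one shared environment, like the REPL's global scope
    printed = []
    for line in lines:
        try:
            # Try the line as an expression first, so its value can be echoed.
            result = eval(line, env)
            if result is not None:
                printed.append(repr(result))
        except SyntaxError:
            # Not an expression: execute it as a statement (e.g. an assignment).
            exec(line, env)
    return printed
```

For example, `repl(['a = "IronPython has a REPL"', 'a'])` returns `["'IronPython has a REPL'"]` – the assignment “does something” silently, and the bare expression is echoed, just like in the console session above.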

Not all dynamic languages have REPL’s, and that’s probably a good thing. I like the warmth of a functional IDE myself.

Gimme quick definitions: Language runtime or Late-binder

Easiest way to explain this one is with one simple but powerful statement:

o.m()

“o” being some sort of instance, and “m” being a method.

In the static language world, when the compiler comes to figure this one out it will know the type of “o” (because it was explicitly declared), and can therefore travel to that type and look up the method “m”. If it can’t find “m”, it will throw a compiler error in your face. If it can find “m”, the compiler will happily emit IL code that simply does a call to “m”. Quite simple: compile time resolution of “m”, and static invocation of “m” via a call instruction with the metadata token all lined up ready to go. Assuming “o” was typed as Foo, the IL would look like this example:

ldloc.0
callvirt instance void Foo::m()

In the dynamic language world, this statement gets evaluated at run-time, and usually for good reason. Here are a couple of good ones: firstly, dynamic languages are about instance based resolution and invocation at the last possible moment – do away with as many compile time rules as possible; secondly, at compile time the compiler doesn’t know what the type of “o” is, so it’ll need to use the instance to figure out if it even has an “m()” or not; thirdly, there may be rules around language extensibility (yes, some languages allow you to do crazy things like override the “.” operator! Lua (www.lua.org) is one of my favourite languages because it offers this kind of extensibility), so those rules might need to be run before resolution and invocation.

So what can a dynamic language compiler do with “o” and “m”? Not much as it turns out, but it needs to emit some sort of code right? The compiler will generally emit code that calls in to a runtime helper to do the resolution and another runtime helper to do the invocation (or maybe both are done in the same helper method). Let’s have a look at what VB.NET with Option Strict Off does.

The three interesting parts of the emitted IL are ldloc.0, ldstr “m” and the call to the method “NewLateBinding::LateCall”. LateCall takes (along with other things) an instance (the ldloc) and a string name representing the method (the ldstr). If you think a little ahead, what can you do with an instance and a string name? Reflection baby, Reflection! That’s exactly what the VB.NET latebinder ends up doing – reflecting over the “o” instance to find “m”, then a .Invoke() to actually invoke the method it finds.

Okay, there are your quick definitions. Let’s look at some technical details on how some of this stuff works.

Latebinder for callsites

Let’s look at a hypothetical example of a callsite and the late binder (the language below is imaginary so ignore that part):

Foo
{
    void m(o) {}
}

// part we're generating code for
o = Foo.ctor()
o.m("test")

When our hypothetical dynamic language compiler runs across these two statements, it must generate code that forwards the call to a latebound language runtime. We’ve seen this kind of thing in the example above with VB.NET, but I can’t share the latebinder code for VB.NET, so we’re contriving our own – a string-based lookup for a MethodInfo, then an Invoke:

public static object Call(object o, string name)
{
    MethodInfo mi = o.GetType().GetMethod(name);
    return mi.Invoke(o, null);
}

The generated IL sets up the “o.m()” callsite as an early-bound call to our LateBinder::Call method, which then calls the “m” method late-bound using Reflection.

There are a few obvious situations we’ve missed with this small hypothetical example. Calls to static methods are one (simple enough: don’t bother passing in the “this” pointer). Calls to methods with more than one argument are another (again, simple enough: generate code that packs the arguments into a temporary array, and create a Call overload that takes object[] args as an argument). A couple of other issues/situations to think about: calls to methods that return void when LateBinder::Call returns object, calls to methods with params, calling convention issues – does the dynamic language support OO-style virtual calls, and the performance of the calls. I’ll tackle some of these issues at a later date.
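A Python analog makes the shape of such a helper concrete – here getattr stands in for Reflection, the args list stands in for the object[] argument array, and all names are invented for illustration:

```python
class LateBinder:
    @staticmethod
    def call(instance, name, args):
        """Resolve method `name` on `instance` at runtime and invoke it.

        `args` plays the role of the packed object[] array; the return
        value is always a loosely-typed object (None for "void" methods).
        """
        # Reflection analog: look the member up by its string name.
        method = getattr(instance, name, None)
        if method is None or not callable(method):
            raise AttributeError(
                f"{type(instance).__name__} has no callable member {name!r}")
        return method(*args)
```

A callsite like `o.m("test")` then compiles down to the moral equivalent of `LateBinder.call(o, "m", ["test"])` – resolution and invocation both deferred to runtime.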

Loose typing on a static virtual machine

It’s been said that the CLR isn’t too good a place for dynamic languages, an inference drawn mostly from the inherent static nature of the CIL instructions, type and signature matching, and the CLS (Common Language Specification) and CTS (Common Type System). This is first evident when a compiler developer goes to make his or her first decision – what are my language’s types, and how do they map to the CLR’s CTS? There are many common types that are provided by the runtime, and I’ve previously gone into them here (http://blogs.msdn.com/joelpob/archive/2004/07/19/187709.aspx). The question I want to draw out is this: if I have the following:

Dim o
o.m()

Or back to the IronPython example:

a = "hello world"
a = 1

What type are “o” and “a”? When a compiler goes to emit code for this, it must specify in metadata the type of the variable. You can see these type declarations in ILDASM output: locals are declared with a type in the .locals section, parameters are declared with a type in the method signature, and members (fields, events, delegates etc) are also strongly typed.

Back to “o” and “a”. Given that these would probably be locals to a method, we need to give them a type as part of their declaration, yet still be able to reassign the variable with an instance of any other type. Enter good ol’ polymorphism. Everything derives from System.Object, so I can assign any instance of any type (including valuetypes, if you box them) to a variable typed as System.Object. If you crack open the output from VB.NET or IronPython (JScript .NET is another good example) you can verify this for yourself.
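You can watch the same effect from the loosely-typed side. In Python, a name is just a reference to some object, so reassignment freely changes the runtime type it refers to – exactly the behavior a System.Object-typed local buys you on the CLR:

```python
a = "hello world"
assert type(a) is str   # the name currently refers to a string instance
a = 1
assert type(a) is int   # same name, now refers to an integer instance
```

The variable’s declared “type” never changes (there isn’t one); only the object it points at does.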

*** Note ***
Of course there are alternatives to using System.Object. If the language you’re mapping doesn’t require BCL or cross-language interoperability, you can pretty much do whatever you like in your self-contained world.

What about methods generated by a dynamic language? Yep – the signatures almost always have parameters and returns typed as object. Crack open an IronPython method in ILDASM and have a look at the signature.

You’ll notice a couple of things. Firstly, saysomething is a static method that returns object and takes an object. Secondly, the call to the IronPython “print” method takes an object as a parameter.

Problems with everything being object

With this in mind, consider for a second your typical Base Class Library – we try our best to be accommodating, but given the strongly typed nature of the BCL, you can imagine that your dynamic language, where everything is typed as object, might have issues trying to call its Base Class Library counterparts that are all strongly typed, and vice versa.

Consider calling Console.WriteLine(a), where “a” is one of our object-typed variables. WriteLine has 19 overloads in all, 12 of which take 1 argument. Given that “a” is typed as object, which overload do you pick? Well, thankfully “WriteLine” has an overload that takes object, so the compiler can safely bind to that one without any issues. But what about the cases where no overload takes object, such as:

Console.SetWindowSize(int, int)

w = 100
h = 100
Console.SetWindowSize(w, h)

“w” and “h” are both typed as object (in this case both “w” and “h” are most likely objects holding on to boxed ints, depending on how the language treats numbers under the hood), and runtime rules don’t allow calls to be made to a method taking two ints when passing in two objects – it’s not verifiable (although a program like this will run in full trust; use ildasm/ilasm and try it). This raises a question – how do you resolve (find), bind to and call a BCL method from a dynamic language? This is usually done in the late-binder using Reflection, with a whole bunch of magic code to fill in the gaps in Reflection’s coercion/casting semantics.
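That “magic code” typically walks the candidate overloads and tries to coerce each loosely-typed argument to the declared parameter type. Here is a hedged Python sketch of just the coercion step (a real binder’s rules are far more involved – widening, narrowing, user conversions, and so on):

```python
def coerce_args(param_types, args):
    """Try to convert each loosely-typed argument to its declared parameter type."""
    coerced = []
    for ptype, arg in zip(param_types, args):
        if isinstance(arg, ptype):
            coerced.append(arg)          # already the right type: pass through
        else:
            coerced.append(ptype(arg))   # attempt a conversion, e.g. float -> int
    return coerced
```

Before invoking a method declared as SetWindowSize(int, int), the binder would run something like `coerce_args([int, int], [w, h])` to “unbox” the arguments into the shapes the signature demands.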

What about BCL callback mechanisms, delegates and events? Given that methods take and return objects as parameters, how do you hook up one of these methods to an event when a delegate may be statically typed as void MyDelegate(string s)? I solve this and the rest of these “loose typing” problems later on in the post (yes I know, another forward reference, but be patient!).

IL instructions are typed, what happens now?

I’ll talk about a simple case here – you’ll have to use your imagination to extrapolate all the other problems associated with IL typing. Consider the following C# expression:

2 + 4

The IL sequence that the C# compiler will emit is as follows:

ldc.i4.2
ldc.i4.4
add

“ldc.i4” pushes a supplied value of type int32 onto the evaluation stack as an int32. In this case, the .2 and .4 suffixes are shorthand forms of ldc.i4 for those constants. The “add” instruction pops two values from the stack, adds them together and pushes the result onto the stack. There are some rules over what types “add” can pop from the stack – namely, the type of the element must be a “number” (number being things like int32, int64, native int, float etc – the actual list can be found on MSDN). Add can’t add together a Foo and a Bar. Simple, right?

What about our simple dynamic language case, where everything is typed as object? The “add” instruction can’t add together objects – it’s not supported (and not correct either), so what can a dynamic language do? It can generate code that calls a latebound “Add” method in the language runtime, which will do the necessary type checks and coercion to perform the add operation.
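A minimal sketch of such a runtime Add helper in Python (the name and the dispatch rules are invented for illustration): it inspects the operands at call time and applies whichever rule the language defines for that pair of types:

```python
def late_add(left, right):
    """Latebound Add: decide at runtime how to combine two loosely-typed operands."""
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left + right              # numeric addition
    if isinstance(left, str) and isinstance(right, str):
        return left + right              # string concatenation, a common language rule
    raise TypeError(
        f"cannot add {type(left).__name__} and {type(right).__name__}")
```

The compiler never needs to know the operand types; `late_add(2, 4)` and `late_add("a", "b")` both resolve their behavior at the moment of the call.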

Add takes two objects, so if we utilize the “ldc.i4” instruction to load our ints onto the stack, we’d better box them before calling the Add method. Anyway, the code for the 2 + 4 expression might look like this (the runtime class name is illustrative):

ldc.i4.2
box [mscorlib]System.Int32
ldc.i4.4
box [mscorlib]System.Int32
call object LanguageRuntime::Add(object, object)

It seems nasty, doesn’t it – we’re boxing all these ints and performing a late-bound call to do a simple addition; there goes the perf, right? In reality, the Add method (if it’s not virtual) will hopefully be inlined so there’s no method-call penalty, and boxing – well, there are tricks you can do to speed that up (have a cache of boxed ints ready to go, etc). A lot of these perf tricks are done in the IronPython compiler, so there’s some code to go digging in if one feels like it.

In fact, there are many optimizations one can do at the dynamic language compiler level to remove boxing altogether (simple data flow analysis). Some day I’d like to talk about those.

Language specific types and BCL interop

If you’re familiar with Python, you’d have come across a fairly common Python type called a “list”. Lists are defined like so:

myPythonList = ['hello','world']

This is what I call a language specific type, meaning that the language involved has full first-class (syntactic and runtime) support for that type and operations on it. There are many decisions around what that type could look like “under the hood”. A good choice would be to see whether it maps safely to an existing BCL type (Dictionary, List&lt;T&gt; etc) so that there is a better chance of it interoperating with existing BCL libraries (plus, the engineering has already been done for you, so fewer bugs as a result, right? :)). Another possible choice is to cook the type up yourself and place it in your language runtime. This is the route the IronPython compiler chose, mostly because common Python library operations on the Python list are not supported by the BCL’s List&lt;T&gt;.

IronPython has a namespace in its language runtime called “IronPython.Objects”, which contains all the IronPython-defined language types (like List) that it has first-class support for. You can find these source files in the IronPython source distribution.

The interesting point to make here about Objects.List is its BCL interop capabilities. You’ll notice that List implements IList (and IComparable), meaning anywhere a BCL member takes an IList, this Python List will happily play in that sandpit. One can further ponder the other interesting problems/opportunities in taking this Python List and making it malleable to other types – like string[]. That way, a Python programmer can pass a Python List to a BCL member (or a member produced/consumed by another language) that takes an object[] or string[]. You can achieve this through type marshalling at runtime – probably placing the marshalling code in your language’s late-binder.
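A sketch of that marshalling step in Python (illustrative only): converting a language-specific list into a homogeneous, strongly-typed array before handing it to a BCL-style consumer, coercing elements where possible:

```python
def marshal_to_typed_array(seq, elem_type):
    """Marshal a loosely-typed sequence into a list whose elements all have elem_type."""
    marshalled = []
    for item in seq:
        if isinstance(item, elem_type):
            marshalled.append(item)         # already the target type
        else:
            marshalled.append(elem_type(item))  # coerce, e.g. int -> str
    return marshalled
```

So a Python List like `['hello', 'world']` would marshal cleanly to a string[]-shaped array, while mixed-type lists either coerce element by element or fail at the boundary – exactly the kind of decision a late-binder has to make.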

Type marshalling is just hard. Look at the COM interop namespaces in the BCL sometime – plenty of gotchas hanging around in there.

Static to dynamic bindings (event hookup example)

This long section is some notes on how to get purely static constructs in the type system, binding effectively to dynamic constructs at runtime. I leave it as an exercise to the reader to extrapolate other instances where this may cause headaches for dynamic languages.

Let’s look at an interesting static-to-dynamic binding problem that dynamic languages face when targeting the CLR. Events, which are basic syntactic sugar over MulticastDelegates, are everywhere in libraries like WinForms – consider the Button.Click event; it has a strongly typed delegate, EventHandler:

public delegate void EventHandler(object sender, EventArgs e);

Now consider the case where you want to subscribe to that event using an “object”-based method whose signature looks like this:

object clickmethod$f0(object s, object e)

The EventHandler delegate signature clearly doesn’t match the method signature – the parameter types don’t match, and the return type has a different type and semantic. That’s not going to stop us of course, so let’s dig in to how we can solve this problem.

First of all, let’s look at the way we “subscribe” to events. Take the Button.Click event – subscribing to it in IL looks something like the following (signatures abbreviated; a statically matching handler is shown for illustration):

ldloc.0
ldnull
ldftn void clickmethod$f0(object, class [mscorlib]System.EventArgs)
newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
callvirt instance void System.Windows.Forms.Button::add_Click(class [mscorlib]System.EventHandler)

“ldftn” is the IL opcode that loads a function pointer to a method onto the stack. In this case, it loads the function pointer of “clickmethod$f0”. “newobj” then instantiates the EventHandler delegate, passing in the function pointer as an argument (and null or the object instance, depending on whether the method is static or not). Then we call the “add_Click” method, which adds the EventHandler delegate we just created to the Click event. The key thing I’m trying to point out here is: we need to obtain a delegate to the method we wish to invoke.

Enter late-bound delegates.

The Delegate class has a whole bunch of “CreateDelegate” method overloads, for the sole purpose of creating delegates late-bound – excellent for cases like this, where the expression of an event hookup looks something like:

Button.Click += clickmethod

and resolution of “button.Click” and “clickmethod” must be done at runtime. Okay, so the language runtime resolves “button.Click” to a Reflection-based EventInfo, then it resolves “clickmethod” to a Reflection-based MethodInfo, and it now needs to cook up a delegate that the “button.Click” event likes. If the “clickmethod” method signature matches the “.Click” event signature perfectly, then a quick call like this:

Delegate d = Delegate.CreateDelegate(eventInfo.EventHandlerType, methodInfo);

is all that’s needed. That method call hands you back a delegate, then you can bind that to the event like so:

eventInfo.AddEventHandler(null, d);

Of course, this will work fine for the case where the EventHandlerType (delegate signature) matches that of the clickmethod method signature. But looking back at the signatures above, you realize they don’t match. What’s the story?

Co and contravariant (or relaxed) delegate support in Whidbey

Binding “void EventHandler(object sender, EventArgs e)” to “object clickmethod$f0(object s, object e)” is not an exact match. In Everett (.NET 1.1), if you tried binding these two guys together, we would throw an exception in your face. In Whidbey (.NET 2.0), we now allow contravariant parameter bindings and covariant return type bindings. That means you can be less type-specific on the parameters, and more type-specific on the return type.

Well, we hit almost all of those requirements: clearly “object sender” matches “object s” exactly, and “object e” is less specific than “EventArgs e”, so we’re fine on the parameters; but we fail on the return type. “void” (a strange type in itself) is not more specific than object (in fact, it’s apples and oranges here – it basically implies a whole different calling convention). So how would a dynamic language create a delegate to methods that always return object (even though their implementations may not actually return anything)?

The trick is to generate a small stub method at runtime – using lightweight code generation (LCG, the DynamicMethod API in Whidbey) – whose signature matches the delegate exactly, and create the delegate over that stub. The stub needs to generate IL that pushes the arguments onto the stack, ready for the call. Now, if your dynamic language has special rules around calling convention (type coercion rules etc), why not just have the LCG method call your LateBinder.Call() method, so the heavy lifting is done all in one place? That’s exactly what the generated stub does.

(Yes I did deviate from my previous LateBinder example. Just imagine that this LateBinder.Call method does a string based lookup for a MethodInfo, then does an invoke… *grin*)

Now the interesting part: the stub ends with an IL sequence that either pops the result of LateBinder.Call off the stack (when the delegate return type is void), or casts the “object” returned from LateBinder.Call to the return type of the delegate.
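The same trick translates naturally to Python: wrap the late-bound call in a small stub whose signature matches what the event expects, and discard or pass through the result inside the stub. All names below are invented for illustration:

```python
def make_event_stub(late_call, returns_void=True):
    """Build a stub matching a delegate signature around a late-bound call.

    The stub forwards its arguments to the late-bound call and, if the
    delegate's declared return type is void, pops (discards) the result.
    """
    def stub(sender, args):
        result = late_call(sender, args)  # the LateBinder.Call-style invocation
        if returns_void:
            return None                   # delegate returns void: discard result
        return result
    return stub
```

The event machinery only ever sees `stub`, whose shape matches the delegate; the loosely-typed target method hides behind it, reached through the late-bound call.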

And we’re done. We’ve created a delegate with a signature that matches the event, latebound, hooked it up to an LCG method that does whatever shuffling is required and then invokes our LateBinder.Call method to actually call the method we want. Very cool, huh?

Last but not least…

That’s me done for this post. I’ve got a whole bunch more I want to talk about in this space, but I’ll keep that for another day. I’m going to be at the PDC (and Jim Hugunin (http://blogs.msdn.com/hugunin/) should hopefully be there), so if you’re going to be there and you’re interested in learning/discussing more, drop me an e-mail.