Expression as a Compiler

Introduction: the problem

Reflection: like it or not, it is inevitable that eventually you will have to write some code that looks at the members of a type at runtime (rather than at compile time). Maybe you are trying to write utility validation / serialization / ORM code, or maybe the interesting property / method is specified at runtime in a configuration file or from the database. Whatever the cause, it is likely that at some point you’ve written code involving GetType() – something like:
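A minimal sketch of this kind of code, assuming a hypothetical Customer class and a GetValue helper (both names are illustrative only), might look like this:

```csharp
using System;
using System.Reflection;

public class Customer
{
    public string Region { get; set; }
}

public static class ReflectionDemo
{
    // Reads a property whose name is only known at runtime.
    public static object GetValue(object obj, string propertyName)
    {
        PropertyInfo prop = obj.GetType().GetProperty(propertyName);
        if (prop == null)
            throw new ArgumentException("No such property: " + propertyName);
        return prop.GetValue(obj, null);
    }

    public static void Main()
    {
        var cust = new Customer { Region = "North" };
        Console.WriteLine(GetValue(cust, "Region")); // prints "North"
    }
}
```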

While this trivial example may work, it isn’t ideal – there are a number of fundamental problems:

It is relatively slow – which is fine if you use it sparingly, but is very noticeable when used in a tight loop

There is no convenient way of bundling up a set of prepared operations into reusable code

This is only exacerbated when the dynamic code gets more complex.

Of course, an obvious answer to the above is “don’t use reflection” – i.e. manually (or perhaps via code-generation tools) write a method for each type that requires it that does what we need. This naive approach is fine on the surface, but leads to massive duplication, and still doesn’t help us handle types that we don’t know about at compile time.

What we need is some form of meta-programming API. It is fortunate, then, that .NET 2.0 ships with Reflection.Emit for writing IL (.NET’s equivalent of Java’s “bytecode”) on the fly. However, it forces the developer to handle all the nuances of boxing, type conversions, call-stack details, operator “lifting”, etc. In short, it demands a lot more knowledge of the CLI than you really wanted. Another option is CodeDom, but that is designed for source-code generation, and what we are trying to do quickly gets swallowed by the chore of presenting it to CodeDom.

What we need is a half-way house: something at a high enough level that we don’t need to worry about the actual IL, but which isn’t too high-level either. We want our code to be as simple and expressive as possible.

Background: Enter System.Linq.Expressions.Expression

In .NET 3.5, Microsoft introduced LINQ. A key part of LINQ (in particular when talking to external data sources such as databases) is the concept of expressing code as expression trees. This is used extensively, for example to indicate how we should filter data:
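A filter of this kind, written in query syntax, could be sketched as follows. The Customer class, its sample data and the NorthernNames helper are assumptions made for illustration, and the sketch runs against an in-memory list rather than a database:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public string Region { get; set; }
}

public static class QueryDemo
{
    public static List<string> NorthernNames(IEnumerable<Customer> customers)
    {
        // Query syntax: the compiler rewrites this into a call to Where(...)
        var query = from cust in customers
                    where cust.Region == "North"
                    select cust;
        return query.Select(c => c.Name).ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Alice", Region = "North" },
            new Customer { Name = "Bob", Region = "South" },
            new Customer { Name = "Carol", Region = "North" }
        };
        foreach (var name in NorthernNames(customers))
            Console.WriteLine(name); // Alice, then Carol
    }
}
```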

It isn’t obvious from the code, but a query like this is identical to the same query written with a lambda expression:

var query = customers.Where(cust => cust.Region == "North");

The key to the power of LINQ is the lambda expression – and the precise details of the Where method. In LINQ-to-Objects, the Where method accepts a Func<T, bool>, i.e. a delegate that can be evaluated for each object (T) to return true (to include the record) or false (to exclude the record). However, with sources such as databases, the Where method accepts an Expression<Func<T, bool>>. Rather than a delegate, this is an expression tree that represents the test to perform.
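The difference can be seen by assigning the same lambda to both types. In this sketch (the Customer class and the DelegateVsTree holder are illustrative names), the delegate can only be invoked, whereas the tree can be inspected:

```csharp
using System;
using System.Linq.Expressions;

public class Customer
{
    public string Region { get; set; }
}

public static class DelegateVsTree
{
    // Compiled by the C# compiler straight to IL: it can only be invoked.
    public static readonly Func<Customer, bool> AsDelegate =
        cust => cust.Region == "North";

    // Compiled to code that builds an object model: it can be inspected.
    public static readonly Expression<Func<Customer, bool>> AsTree =
        cust => cust.Region == "North";

    public static void Main()
    {
        Console.WriteLine(AsDelegate(new Customer { Region = "North" })); // True
        Console.WriteLine(AsTree.Body.NodeType);      // Equal
        Console.WriteLine(AsTree.Parameters[0].Name); // cust
    }
}
```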

The key point here is that we can build our own expression trees to represent a wide range of scenarios – and expression trees have the inbuilt ability to be compiled to a typed delegate. This gives us a simple way of writing IL at runtime.

So what is an expression tree?

Unlike regular C#, which is compiled to IL, a lambda expression that is assigned to an Expression<TDelegate> is compiled to code that creates an object model representing our code. This allows code such as database providers to look at this object model, understand the intent of what we are trying to do, and translate that intent as necessary (for example, writing TSQL).

To see what the compiler does, we need to compile example code (as a lambda expression) as normal, and then look in a tool such as the excellent (and free) Reflector – but first we need to turn off the .NET 3.5 (C# 3.0) optimisations:

Then looking at the output of the disassembler, we can see something like the C# we would need to write it ourselves:

Note, however, that the compiler uses direct MemberInfo access and illegal variable names, so you should not expect the output to compile directly; it is just for reference. It is a curiosity that the C# language specification doesn’t indicate how the compiler will translate your code to an expression tree, so using the compiler as a reference implementation is one of the few ways of investigating Expression.

For the purposes of writing this manually as an expression tree, it helps to think of things like equality tests (==) and member-access (.) as regular methods that accept operands:

We can now build an equivalent expression tree that accepts a single parameter of type Customer, and returns a boolean by comparing the Region property of this parameter against the string constant “North”:
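One way to write this by hand is sketched below, using the factory methods on Expression. The Customer class and the BuildFilter wrapper are assumptions for the sake of a self-contained example:

```csharp
using System;
using System.Linq.Expressions;

public class Customer
{
    public string Region { get; set; }
}

public static class ManualTree
{
    public static Expression<Func<Customer, bool>> BuildFilter()
    {
        // the parameter: (Customer cust)
        ParameterExpression cust = Expression.Parameter(typeof(Customer), "cust");
        // member access: cust.Region
        MemberExpression region = Expression.Property(cust, "Region");
        // the constant: "North"
        ConstantExpression north = Expression.Constant("North", typeof(string));
        // the equality test: cust.Region == "North"
        BinaryExpression body = Expression.Equal(region, north);
        // the lambda: wraps the body together with its declared parameter
        return Expression.Lambda<Func<Customer, bool>>(body, cust);
    }

    public static void Main()
    {
        var filter = BuildFilter();
        Console.WriteLine(filter.Body.NodeType);    // Equal
        Console.WriteLine(filter.Parameters.Count); // 1
    }
}
```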

The final “lambda” line simply formalises the expression into a fully standalone unit, with a defined set of parameters (in this case one) that can be used inside the expression tree. If we attempt to use a parameter inside a lambda that isn’t defined, an exception will be thrown.

The expression tree is simply an immutable object-model that represents our intent:

filter (lambda)
    parameter: Customer: “cust”
    body (binary)
        method: equals
        left (member)
            member: “Region”
            expression: “cust” parameter
        right (constant)
            string: “North”

Once we have a complete expression tree, we can either use that “as is” with LINQ providers (which will then tease apart what we mean from the tree), or we can compile it to a delegate – generating the necessary IL on the fly:

Func<Customer, bool> filterFunc = filter.Compile();

We can now use our delegate by passing in a Customer instance, returning a bool.
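Putting the pieces together, a self-contained sketch of building, compiling and invoking the filter might read as follows (Customer and BuildFilterFunc are illustrative names):

```csharp
using System;
using System.Linq.Expressions;

public class Customer
{
    public string Region { get; set; }
}

public static class CompileDemo
{
    public static Func<Customer, bool> BuildFilterFunc()
    {
        var cust = Expression.Parameter(typeof(Customer), "cust");
        var body = Expression.Equal(
            Expression.Property(cust, "Region"),
            Expression.Constant("North"));
        var filter = Expression.Lambda<Func<Customer, bool>>(body, cust);
        return filter.Compile(); // the IL is generated here
    }

    public static void Main()
    {
        Func<Customer, bool> filterFunc = BuildFilterFunc();
        Console.WriteLine(filterFunc(new Customer { Region = "North" })); // True
        Console.WriteLine(filterFunc(new Customer { Region = "South" })); // False
    }
}
```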

This might look a little verbose, but it is a lot simpler than the code we would require to represent the same via Reflection.Emit, especially given complications like boxing.

IMPORTANT: compiling an expression tree to a delegate involves dynamic code generation. To get the optimum performance, you would compile this once – and store it (for example in a field) to be re-used many times.

Doing interesting things with Expression

So far, we’ve only seen a simple example of something that we could already do in regular C# - which isn’t very impressive. So let’s look at something we can’t do in C#... I regularly see questions from people trying to find the right syntax to write a generic “add” (etc) method to perform arithmetic on arbitrary types. Simply: there is no such syntax. However, we can achieve this with Expression.

For brevity, I’ll use “var” rather than explicit variable types, but we’ll illustrate how we can easily cache the delegate instance for re-use:
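A cut-down sketch of such a class is shown below. It covers only Add; the member names are assumptions for this example and are not taken from any particular library:

```csharp
using System;
using System.Linq.Expressions;

public static class Operator<T>
{
    private static readonly Func<T, T, T> add;

    // Runs once per closed type (Operator<int>, Operator<decimal>, ...)
    static Operator()
    {
        var x = Expression.Parameter(typeof(T), "x");
        var y = Expression.Parameter(typeof(T), "y");
        var body = Expression.Add(x, y); // x + y
        add = Expression.Lambda<Func<T, T, T>>(body, x, y).Compile();
    }

    public static T Add(T x, T y)
    {
        return add(x, y);
    }
}

public static class OperatorDemo
{
    public static void Main()
    {
        Console.WriteLine(Operator<int>.Add(2, 3));       // 5
        Console.WriteLine(Operator<decimal>.Add(1m, 2m)); // 3
        int? a = 4, b = null;
        // lifted operator: 4 + null is null
        Console.WriteLine(Operator<int?>.Add(a, b).HasValue); // False
    }
}
```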

Because we compile the delegate in the static constructor, it only happens once per type T, and is then re-used – so in any calling code we can now use Operator<T>. Interestingly, the simple “Add” method hides a great deal of complexity: value- vs reference-types, primitive operators (with raw IL counterparts), bespoke operators (methods added as part of a type definition) and lifted operators (operators provided automatically against Nullable<T>).

This works even for user-defined types - as long as you have defined the + operator; if no suitable operator can be found then it will throw an exception at runtime. In reality this is rarely an issue, as you are unlikely to try using Add unless you believe the type has a sensible operator.
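Both cases can be illustrated with a small sketch. Money is a hypothetical user-defined type, and Adder.Create<T> is an illustrative, uncached helper that keeps the example self-contained:

```csharp
using System;
using System.Linq.Expressions;

public struct Money
{
    public readonly decimal Amount;
    public Money(decimal amount) { Amount = amount; }

    // the bespoke operator that Expression.Add will discover
    public static Money operator +(Money x, Money y)
    {
        return new Money(x.Amount + y.Amount);
    }
}

public static class Adder
{
    // Compiles "x + y" for T; throws if T has no suitable + operator.
    public static Func<T, T, T> Create<T>()
    {
        var x = Expression.Parameter(typeof(T), "x");
        var y = Expression.Parameter(typeof(T), "y");
        return Expression.Lambda<Func<T, T, T>>(Expression.Add(x, y), x, y).Compile();
    }

    public static void Main()
    {
        var addMoney = Create<Money>();
        Console.WriteLine(addMoney(new Money(2m), new Money(3m)).Amount); // 5

        try
        {
            Create<object>(); // object defines no + operator
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("No operator: " + ex.Message);
        }
    }
}
```

If the same tree were built inside a static constructor, the failure would reach the caller wrapped in a TypeInitializationException.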

A full implementation of generic operators is freely provided in the MiscUtil library.

What does Expression support?

In .NET 3.5, Expression supports a wide range of the operations required to either query data or to create new objects:
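As a sketch of the object-creation side, Expression.New, Expression.Bind and Expression.MemberInit can be combined to build a shallow-clone delegate once per type. Cloner<T> and Person are hypothetical names, and the sketch assumes that T has a public parameterless constructor:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class Cloner<T> where T : new()
{
    private static readonly Func<T, T> clone = Build();

    private static Func<T, T> Build()
    {
        var source = Expression.Parameter(typeof(T), "source");
        // one "Prop = source.Prop" binding per readable, writable, non-indexed property
        var bindings = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.CanWrite && p.GetIndexParameters().Length == 0)
            .Select(p => (MemberBinding)Expression.Bind(p, Expression.Property(source, p)))
            .ToArray();
        // new T { Prop1 = source.Prop1, Prop2 = source.Prop2, ... }
        var body = Expression.MemberInit(Expression.New(typeof(T)), bindings);
        return Expression.Lambda<Func<T, T>>(body, source).Compile();
    }

    public static T ShallowClone(T source)
    {
        return clone(source);
    }
}

public static class CloneDemo
{
    public static void Main()
    {
        var copy = Cloner<Person>.ShallowClone(new Person { Name = "Fred", Age = 42 });
        Console.WriteLine(copy.Name + ", " + copy.Age); // Fred, 42
    }
}
```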

This will out-perform standard reflection (on an object-by-object basis) by a huge margin, without the need to maintain a separate clone method per type and the risk of forgetting to add new properties to the clone.

A similar approach might be used, for example, to map data between DTO entities and entity objects, or between DataTable and class representations of objects – or perhaps to compare two objects for property equality – writing something like:
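One possible shape for such a comparer chains one equality test per property with AndAlso. This is a sketch under the assumption that every compared property type supports ==; PropertyComparer<T> and Person are illustrative names:

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class PropertyComparer<T>
{
    private static readonly Func<T, T, bool> equal = Build();

    private static Func<T, T, bool> Build()
    {
        var x = Expression.Parameter(typeof(T), "x");
        var y = Expression.Parameter(typeof(T), "y");
        Expression body = null;
        foreach (var p in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // skip write-only properties and indexers
            if (!p.CanRead || p.GetIndexParameters().Length != 0) continue;
            // x.Prop == y.Prop
            var test = Expression.Equal(Expression.Property(x, p), Expression.Property(y, p));
            // chain the tests: test1 && test2 && ...
            body = body == null ? test : Expression.AndAlso(body, test);
        }
        if (body == null) body = Expression.Constant(true); // nothing to compare
        return Expression.Lambda<Func<T, T, bool>>(body, x, y).Compile();
    }

    public static bool AreEqual(T x, T y)
    {
        return equal(x, y);
    }
}

public static class CompareDemo
{
    public static void Main()
    {
        var a = new Person { Name = "Fred", Age = 42 };
        var b = new Person { Name = "Fred", Age = 42 };
        var c = new Person { Name = "Fred", Age = 43 };
        Console.WriteLine(PropertyComparer<Person>.AreEqual(a, b)); // True
        Console.WriteLine(PropertyComparer<Person>.AreEqual(a, c)); // False
    }
}
```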

Limitations of Expression – and the future

So far, Expression has seemed pretty versatile – however, there are some significant limitations that relate to Expression’s LINQ origins:

There is no inbuilt mechanism for changing object properties / fields

There is no way of performing a series of arbitrary operations

In the above example, we got away with performing multiple steps because we could chain all of our AndAlso into a single logical statement. However, there is simply no way (in .NET 3.5) of expressing something like the following in an expression tree (for use as a compiler):
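As an illustrative case (the Person type and the values are hypothetical), consider an ordinary method that performs two independent assignments on an existing object:

```csharp
using System;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class UpdateDemo
{
    // Two separate statements, each changing existing state. A .NET 3.5
    // expression tree has no node for the assignments or for the sequencing.
    public static void Update(Person person)
    {
        person.Name = "Fred";
        person.Age = 42;
    }

    public static void Main()
    {
        var person = new Person();
        Update(person);
        Console.WriteLine(person.Name + ", " + person.Age); // Fred, 42
    }
}
```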

The Bind operation we saw earlier only works when creating new objects. We could obtain the setter method, but we also have no way to chain method calls together (except as part of a “fluent” API, which this isn’t).

Fortunately, .NET 4.0 extends the Expression API with additional types and methods, in order to support the Dynamic Language Runtime (DLR). This covers a much broader spectrum of typical code:

This allows much greater use when writing dynamic code (indeed, it even includes the facility to compile into a MethodBuilder for use in runtime type generation). For example, to write a simple “for” loop that prints the numbers 0 through 9:
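A sketch of such a loop, using the .NET 4.0 Block, Assign, Loop and Break nodes, is shown below. LoopDemo and BuildLoop are illustrative names, and this is one of several ways the tree could be arranged:

```csharp
using System;
using System.Linq.Expressions;

public static class LoopDemo
{
    public static Action BuildLoop()
    {
        var i = Expression.Variable(typeof(int), "i");
        var exit = Expression.Label("exit");
        var writeLine = typeof(Console).GetMethod("WriteLine", new[] { typeof(int) });

        var body = Expression.Block(
            new[] { i },                                  // declare the local: int i
            Expression.Assign(i, Expression.Constant(0)), // i = 0
            Expression.Loop(
                Expression.IfThenElse(
                    Expression.LessThan(i, Expression.Constant(10)), // i < 10 ?
                    Expression.Block(
                        Expression.Call(writeLine, i),      // Console.WriteLine(i)
                        Expression.PostIncrementAssign(i)), // i++
                    Expression.Break(exit)),                // otherwise leave the loop
                exit));

        return Expression.Lambda<Action>(body).Compile();
    }

    public static void Main()
    {
        Action loop = BuildLoop();
        loop(); // writes 0 through 9, one number per line
    }
}
```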

Which might not seem hugely impressive – but for writing compiled code at runtime, this is a lot more convenient and flexible than the alternatives.

Earlier we looked at how to clone an object using properties – but we now have the ability to push data from one object into another – very useful with ORM tools where we might need to obtain a change-tracked object, update it with external changes (from either a DTO or a detached instance), and submit the changes. We can do this as follows:
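A minimal sketch of this, assuming .NET 4.0 or later, builds one Expression.Assign per property and wraps them in an Expression.Block. PropertyCopier<T> and Person are hypothetical names; the class constraint is there because a value-type destination would be passed by value:

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class PropertyCopier<T> where T : class
{
    private static readonly Action<T, T> copy = Build();

    private static Action<T, T> Build()
    {
        var source = Expression.Parameter(typeof(T), "source");
        var dest = Expression.Parameter(typeof(T), "dest");
        var steps = new List<Expression>();
        foreach (var p in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!p.CanRead || !p.CanWrite || p.GetIndexParameters().Length != 0) continue;
            // dest.Prop = source.Prop
            steps.Add(Expression.Assign(
                Expression.Property(dest, p),
                Expression.Property(source, p)));
        }
        // make sure the block is never empty
        if (steps.Count == 0) steps.Add(Expression.Empty());
        Expression body = Expression.Block(steps);
        return Expression.Lambda<Action<T, T>>(body, source, dest).Compile();
    }

    public static void Copy(T source, T dest)
    {
        copy(source, dest);
    }
}

public static class CopyDemo
{
    public static void Main()
    {
        var tracked = new Person { Name = "Old", Age = 1 };
        PropertyCopier<Person>.Copy(new Person { Name = "Fred", Age = 42 }, tracked);
        Console.WriteLine(tracked.Name + ", " + tracked.Age); // Fred, 42
    }
}
```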

Expressions with runtime Type creation

Another interesting feature in .NET 4.0 is that in addition to compiling an Expression to a standalone delegate, you can now use an expression tree to provide the body of a MethodBuilder (used when creating new Types at runtime), via the new CompileToMethod. This hints at a future where we can write rich types at runtime (including polymorphism and interface implementation) without ever having to touch raw IL.

Advanced Expressions in the language

Although the runtime now supports much more sophisticated expression trees, there is no additional language support for this in C# 4.0 – so the only way to create this type of expression is manually. This isn’t actually an issue, as no LINQ provider will support these expressions, and if you know the code at compile-time, you’d just write a regular method / anonymous method. It does leave some interesting possibilities for post-4.0, though. For example (and this is pure speculation), imagine if the language could use Expression.Assign and Expression.Block to build an expression tree for use with a database, something like:

I stress that this is just “what if?” – it would require a huge amount of work to implement by the LINQ providers, but as we’ve shown above: in theory the Expression object is capable of encapsulating our intent (to update multiple properties of an entity). Major language and framework changes don’t come easily though; only time will tell.

While I can't honestly say I "understand" the concepts on first reading, I think you've done a great job of enabling me to focus my study in a productive way so that eventually I "will" understand the very interesting features of LINQ you are demonstrating.

I'm still learning about expression trees and I was wondering if you could help me modify your code slightly so that I am able to only copy the properties that aren't null from a source object to a destination object:

Hi Marc, thanks for the post. I found it to be very informative and a great introduction to expressions. I did have a question regarding your property equality checker method. I ran the same with a simple anonymous type and it worked just fine. However, when I ran it against two strings, my application threw a VerificationException with the message "Operation could destabilize the runtime." Would you happen to know why this particular exception was thrown and if there are similar caveats to watch out for when writing expressions which are to be compiled? Thanks, Tejas

Thanks for the content. I am new to .NET development; I can create VS projects and run them. How can I validate the LINQ code you shared here? I have a Win 10 laptop and an Ubuntu laptop. I have installed databases on Ubuntu in the past, and I can do it again. I would like to know the shortest way. I think Tejas just used a text file. Can I do that and put the data in there?
