Introduction

Flee is a .NET library that allows you to parse and evaluate arbitrary expressions. The main feature that distinguishes it from other expression evaluators is that it uses a custom compiler to convert expressions into IL and then, using Lightweight CodeGen, emits the IL to a dynamic method at runtime. This means that expression evaluation is several orders of magnitude faster than when using interpretive expression evaluators. In fact, the entire design of the library and the expression language is geared towards making evaluation as fast as possible.

Features

Here is a list of the major features:

Fast and efficient expression evaluation

Compiles expressions to IL using a custom compiler

Small, lightweight library

Code generated for expressions can be garbage-collected

Does not create any dynamic assemblies that stay in memory

Supports all arithmetic operations including the power (^) operator

Supports string, char, boolean, and floating-point literals

Supports 32/64 bit, signed/unsigned, and hex integer literals

Features a true conditional operator

Emits short-circuited logical operations

Allows for customization of expression compilation

About Lightweight CodeGen

Lightweight CodeGen (LCG) is a feature introduced in .NET 2.0 to allow for efficient runtime code generation. Using its DynamicMethod class, you are able to create a method at runtime, set the code of the method body, and finally call the created method. Its main advantages are that the generated method and its code can be garbage collected when no longer referenced, and it does not generate any dynamic assemblies that stay in memory. One of its weaknesses is that the only way to create the method body is to emit IL. Since IL is basically a form of assembly language, performing even simple tasks requires lots of instructions and an understanding of the low-level workings of the CLR. For example, here is the IL required to evaluate the expression DateTime.Now.ToString():
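As a minimal sketch (not necessarily the exact listing the sentence above refers to), the following C# emits IL for that expression by hand using DynamicMethod. The class and method names are made up for illustration, and Func<string> assumes .NET 3.5 or later; on .NET 2.0 you would declare your own delegate type instead.

```csharp
using System;
using System.Reflection.Emit;

public static class LcgSketch
{
    // Builds a dynamic method equivalent to: () => DateTime.Now.ToString()
    public static Func<string> BuildNowToString()
    {
        var method = new DynamicMethod("NowToString", typeof(string), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        // DateTime is a value type, so the result of get_Now is stored in a
        // local and the instance method is called on the local's address.
        LocalBuilder now = il.DeclareLocal(typeof(DateTime));

        il.Emit(OpCodes.Call, typeof(DateTime).GetProperty("Now").GetGetMethod());
        il.Emit(OpCodes.Stloc, now);
        il.Emit(OpCodes.Ldloca, now);
        il.Emit(OpCodes.Call, typeof(DateTime).GetMethod("ToString", Type.EmptyTypes));
        il.Emit(OpCodes.Ret);

        // CreateDelegate finalizes the method body and returns a callable delegate.
        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }
}
```

Calling LcgSketch.BuildNowToString()() returns the current date and time as a string. Even this one-line expression needs a local variable and five instructions, which is the kind of bookkeeping the library is meant to automate.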

Automating the translation of expressions to IL is where this library comes in. Essentially, it does the same thing as a C# compiler except that it compiles at runtime, emits its IL to memory instead of an assembly, and is designed to work with expressions (which are a subset of a complete programming language).

Walkthrough: Creating and Evaluating an Expression

For this walkthrough, we are going to compute the hypotenuse of a triangle with sides a and b, using the expression sqrt(a^2 + b^2). To start, we will need to add a reference to the ciloci.Flee.dll and import the ciloci.Flee namespace. First, we need to declare our expression owner. This is a class to which the dynamic method generated by the expression will be attached. Once attached, the dynamic method will behave as a static function on the class and will have access to all of its members (even non-public ones). We will declare a class and give it two fields to represent the a and b sides of the triangle:

public class ExpressionOwner
{
    public int a;
    public int b;
}

To be able to use the sqrt function in our expression, we have to import the Math class. Importing determines what members, besides the ones on the owner, are visible to an expression. By default, no types are imported, and references to types using their fully-qualified names are not allowed. This is a security feature: since expressions are compiled and evaluated at runtime, and could be entered by users, it would not be a good idea to allow them to call arbitrary methods on any type loaded into the program. Once we import the Math class, we can use all of its constants and methods in the expression. We will use the sqrt function, but we don't need the pow function since the power operator is built in to the expression language. We will need to create an instance of the ExpressionOptions class and use its Imports property to do the actual import:
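To make the idea of importing concrete, here is a minimal sketch of a whitelist that resolves function names only against types that were explicitly added. ImportTable and its methods are hypothetical names for this illustration; they are not Flee's ExpressionOptions or Imports API, and the real library may resolve members quite differently.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical illustration of an import whitelist; not Flee's implementation.
public sealed class ImportTable
{
    private readonly List<Type> imported = new List<Type>();

    public void AddType(Type type)
    {
        imported.Add(type);
    }

    // Looks for a public static method with the given name (ignoring case)
    // and parameter types, searching only the imported types.
    public MethodInfo Resolve(string name, params Type[] argumentTypes)
    {
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.Static | BindingFlags.IgnoreCase;

        foreach (Type type in imported)
        {
            MethodInfo method = type.GetMethod(name, flags, null, argumentTypes, null);
            if (method != null)
            {
                return method;
            }
        }
        return null; // not imported, so not visible to an expression
    }
}
```

After imports.AddType(typeof(Math)), a call such as imports.Resolve("sqrt", typeof(double)) finds Math.Sqrt even though the casing differs, while a name that belongs to a type that was never imported resolves to null. Keeping the default list empty is what makes user-entered expressions safe by default.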

There is now a method in memory that contains the code for the expression. The expression has an Evaluator property that exposes a delegate pointing to this method. The delegate will always be an instance of ExpressionEvaluator, which is a generic delegate defined in the library. To evaluate our expression, we need to retrieve the delegate and cast it to an ExpressionEvaluator with the correct return type. Although this seems more work than just returning an object, it ensures that there will be no boxing for expressions that evaluate to value types. Applying the above, we can now evaluate our expression:

That's it! You can now update the fields on the owner, and evaluate the expression again to get the updated result. If you are curious as to what IL was generated by the expression, you can use its EmitToAssembly() method to have it dump the IL to an assembly on disk. You can then use ILDasm (or better yet, Reflector) to inspect it. Here is the IL for our hypotenuse expression:

Note how the a and b fields, which are integers, were implicitly converted to doubles for the ^ operator, and how the ^ operator was translated to a call to the Pow function. The IL stream above is only 35 bytes long, and will be JITed to native processor instructions. Compare this with interpretive evaluators that have to go through several function (and possibly Reflection) calls and if statements to evaluate the same expression.
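The following is a minimal sketch of an instruction sequence that matches this description, emitted by hand with DynamicMethod rather than by the library. The Evaluator<T> delegate and the HypotenuseSketch class are hypothetical names, and ExpressionOwner is repeated so the sample stands alone. With this choice of instructions the stream comes to 35 bytes, the size quoted above, but treat it as one plausible encoding rather than the library's exact output.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class ExpressionOwner
{
    public int a;
    public int b;
}

// Generic delegate: a double result is returned directly, without boxing.
public delegate T Evaluator<T>(ExpressionOwner owner);

public static class HypotenuseSketch
{
    public static Evaluator<double> Build(out int ilSize)
    {
        // Passing the owner type as the last argument associates the dynamic
        // method with that type, so it may access the type's members.
        var method = new DynamicMethod(
            "Hypotenuse", typeof(double),
            new[] { typeof(ExpressionOwner) }, typeof(ExpressionOwner));
        ILGenerator il = method.GetILGenerator();

        MethodInfo pow = typeof(Math).GetMethod("Pow", new[] { typeof(double), typeof(double) });
        MethodInfo sqrt = typeof(Math).GetMethod("Sqrt", new[] { typeof(double) });

        foreach (string field in new[] { "a", "b" })
        {
            il.Emit(OpCodes.Ldarg_0);                                        // owner
            il.Emit(OpCodes.Ldfld, typeof(ExpressionOwner).GetField(field)); // int field
            il.Emit(OpCodes.Conv_R8);                                        // int -> double
            il.Emit(OpCodes.Ldc_I4_2);                                       // exponent 2
            il.Emit(OpCodes.Conv_R8);
            il.Emit(OpCodes.Call, pow);                                      // ^ becomes Math.Pow
        }
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Call, sqrt);
        il.Emit(OpCodes.Ret);

        ilSize = il.ILOffset; // number of IL bytes emitted so far

        return (Evaluator<double>)method.CreateDelegate(typeof(Evaluator<double>));
    }
}
```

The returned delegate can be called repeatedly with the same owner; changing owner.a or owner.b between calls changes the result without any recompilation.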

Demo Application

There is also a demo application that shows a practical usage of the library. The concept was inspired by Pascal Ganaye's demo for his expression evaluator. Essentially, the demo generates an image by evaluating expressions for the red, green, and blue components where the expressions can reference the x and y co-ordinates of the current pixel. This is, basically, a color version of plotting a graph for a function. Using the demo will give you a good sense of how fast expressions can be evaluated. Even on my outdated computer at home, I still manage over a million evaluations per second. This means that, on a modern computer, you could generate an 800x600 image in under a second.

The Expression Language

The expression language that this library parses is a mix of elements of C# and VB.NET. Since the aim of this library is speed, the language is strongly typed (same rules as C#) and there is no late binding. Unlike C#, the language is not case-sensitive. Here is a breakdown of the language elements:

Element              Description                Example
-------------------  -------------------------  --------------------------------
+, -                 Additive                   100 + a
*, /, %              Multiplicative             100 * 2 / (3 % 2)
^                    Power                      2 ^ 16
-                    Negation                   -6 + 10
+                    Concatenation              "abc" + "def"
<<, >>               Shift                      0x80 >> 2
=, <>, <, >, <=, >=  Comparison                 2.5 > 100
And, Or, Xor, Not    Logical                    (1 > 10) and (true or not false)
And, Or, Xor, Not    Bitwise                    100 And 44 or (not 255)
If                   Conditional                If(a > 100, "greater", "less")
Cast                 Cast and conversion        cast(100.25, int)
[]                   Array index                1 + arr[i+1]
.                    Member                     varA.varB.function("a")
String literal                                  "string!"
Char literal                                    'c'
Boolean literal                                 true AND false
Real literal         Double and single          100.25 + 100.25f
Integer literal      Signed/unsigned 32/64 bit  100 + 100U + 100L + 100LU
Hex literal                                     0xFF + 0xABCDU + 0x80L + 0xC9LU

License

The library is licensed under the LGPL. This means that as long as you dynamically link (i.e., add a reference) to the assemblies from the official releases, you are free to use the library for any purpose.

Implementation Details

Implementing an expression evaluator requires two things: a parser that converts an expression string into a syntax tree, and a compiler to take the tree and convert it into IL.

The Parser

I chose not to write my own parser and used Grammatica instead, because it is written by a guy who specializes in parsing and language theory. Grammatica is a parser generator: a program that takes a grammar (a set of rules for a language) and outputs a class that will parse it. Writing grammars for it is not difficult if you are comfortable with recursion and regular expressions. It has great error handling and will report any invalid grammatical constructs. The source code package for this library includes the grammar file for the expression language, so you can see how it all works. Grammatica is a great tool if you want to do parsing that requires more power than regular expressions can provide. Don't be fooled by its alpha status; it is solid as a rock and has never given me problems.

Once I wrote the grammar for the expression language, I had Grammatica generate a parser for me. The parser class has callback methods that I can hook into to let me know when the parser has hit a certain language element. From there on, it's up to me to decide what to do with that element. As it turns out, the best way to represent the language elements is via a tree structure. An expression like 1 + 2 will be represented as an ADD element with two children representing the operands. Every element validates its children as they are set to make sure that they are valid for its particular operation, and throws an ExpressionCompileException if they are not. As you keep parsing, the tree gets built up until you reach the end of the expression and you are left with one root node representing the whole expression. At this point, we have a valid expression and its compiled representation; we are finished with parsing and have to move on to code generation.
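The idea of elements that validate their children as they are attached can be sketched as follows. The type names here are hypothetical and much simpler than the library's real element classes, and ArgumentException stands in for the ExpressionCompileException mentioned above.

```csharp
using System;

// Hypothetical element types for illustration; the library's real classes differ.
public abstract class Node
{
    public abstract Type ResultType { get; }
}

public sealed class Literal : Node
{
    private readonly object value;

    public Literal(object value)
    {
        this.value = value;
    }

    public object Value { get { return value; } }

    public override Type ResultType { get { return value.GetType(); } }
}

public sealed class AddNode : Node
{
    public readonly Node Left;
    public readonly Node Right;

    public AddNode(Node left, Node right)
    {
        // Validate the operands as they are attached to the parent: in this
        // sketch, both must be int or both must be string.
        bool sameType = left.ResultType == right.ResultType;
        bool supported = left.ResultType == typeof(int) || left.ResultType == typeof(string);
        if (!sameType || !supported)
        {
            throw new ArgumentException(
                "Cannot add " + left.ResultType.Name + " and " + right.ResultType.Name);
        }
        Left = left;
        Right = right;
    }

    public override Type ResultType { get { return Left.ResultType; } }
}
```

Here the expression 1 + 2 becomes new AddNode(new Literal(1), new Literal(2)), while mixing an int with a string fails as soon as the node is constructed, before any code generation is attempted.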

The Compiler

We now have a tree consisting of various operators and operands that we have to translate to IL. Every element in our tree derives from the ExpressionElement class, which requires that the element be able to emit IL. From here, the main challenge is to emit IL that is as compact as possible and follows the type rules of the CLR. This means that you always have to watch for whether you are working with a value type or reference type. The IL emitted for the expression mystring.func() is different than the IL for myint.func(). To generate the IL for our method, we let the root element emit its IL and the IL of all its children. We then call the CreateDelegate method on our DynamicMethod to "burn" the code into memory and get a delegate that points to it. From here, the method's code is set and cannot be changed.
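A minimal sketch of this pattern, assuming a tiny tree that only knows about double constants and binary arithmetic opcodes. Element, DoubleConstant, Binary, and TreeCompiler are illustrative names, not the library's ExpressionElement hierarchy, and Func<double> assumes .NET 3.5 or later.

```csharp
using System;
using System.Reflection.Emit;

public abstract class Element
{
    // Every element knows how to emit the IL that leaves its value on the stack.
    public abstract void Emit(ILGenerator il);
}

public sealed class DoubleConstant : Element
{
    private readonly double value;

    public DoubleConstant(double value)
    {
        this.value = value;
    }

    public override void Emit(ILGenerator il)
    {
        il.Emit(OpCodes.Ldc_R8, value);
    }
}

public sealed class Binary : Element
{
    private readonly OpCode op;
    private readonly Element left;
    private readonly Element right;

    public Binary(OpCode op, Element left, Element right)
    {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    public override void Emit(ILGenerator il)
    {
        left.Emit(il);   // children first, so both operands are on the stack
        right.Emit(il);
        il.Emit(op);     // then the operator itself, e.g. OpCodes.Add
    }
}

public static class TreeCompiler
{
    public static Func<double> Compile(Element root)
    {
        var method = new DynamicMethod("Expr", typeof(double), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();
        root.Emit(il);          // the root emits itself and, recursively, its children
        il.Emit(OpCodes.Ret);
        return (Func<double>)method.CreateDelegate(typeof(Func<double>));
    }
}
```

For example, compiling new Binary(OpCodes.Mul, new Binary(OpCodes.Add, new DoubleConstant(1), new DoubleConstant(2)), new DoubleConstant(4)) yields a delegate that computes (1 + 2) * 4. A post-order walk like this maps naturally onto the CLR's stack-based evaluation model.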

Testing

One thing that is essential for a project like this is unit testing. As you tweak the grammar for your expression, you have to make sure that it doesn't invalidate previously valid expressions and that the result of evaluating an expression is correct. Fortunately, this type of project lends itself perfectly to unit testing. All you have to do is create a test consisting of a text file with an expression on each line and a loop that feeds the expression into the evaluator. If you don't get an exception and the result is correct, the test passes. As you build up your language and add new elements, you keep adding to your text file, so that at any point, you can be sure that you haven't broken any previous expressions.
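Such a test loop can be sketched as below. The "expression;expected" line format, the class name, and the evaluate callback are assumptions made for this illustration; in a real test the callback would wrap whatever compiles and evaluates an expression.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ExpressionRegressionSketch
{
    // Each non-empty line has the assumed form "expression;expected".
    // Returns a description of every line that failed.
    public static List<string> Run(IEnumerable<string> lines, Func<string, double> evaluate)
    {
        var failures = new List<string>();
        foreach (string line in lines)
        {
            if (line.Trim().Length == 0)
            {
                continue;
            }

            int split = line.LastIndexOf(';');
            if (split < 0)
            {
                failures.Add(line + ": malformed test line");
                continue;
            }

            string expression = line.Substring(0, split);
            try
            {
                double expected = double.Parse(
                    line.Substring(split + 1), NumberStyles.Float, CultureInfo.InvariantCulture);
                double actual = evaluate(expression);
                if (Math.Abs(actual - expected) > 1e-9)
                {
                    failures.Add(expression + ": expected " + expected + ", got " + actual);
                }
            }
            catch (Exception ex)
            {
                // An exception from parsing or evaluation counts as a failure.
                failures.Add(expression + ": threw " + ex.GetType().Name);
            }
        }
        return failures;
    }
}
```

The lines would typically come from something like File.ReadAllLines on the test file, and the test passes when the returned list is empty. Because new language elements only ever add lines to the file, every earlier expression keeps being re-checked.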

Conclusion

The idea for this project came from the lessons I learned from my previous project. It turns out that people mainly want to be able to define variables and evaluate them in expressions. The previous project lets them do that, but since it is Excel compatible, it has to evaluate expressions in a particular way and support a more dynamic type system. For this project, I wanted to remove those limitations and try to create a fast and efficient expression evaluator. I also wanted to try my hand at building a compiler since it's something every programmer should do. I look forward to your feedback, and plan on actively maintaining and improving this library based on it. Thanks!

Comments and Discussions

I am trying to evaluate the expression below and it is returning an index out of bounds error:

IDynamicExpression ext = context.CompileDynamic("IF(1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=1 AND 1=0,\"True\",\"False\")");bool bResult = (bool)ext.Evaluate();

If I execute the above expression without the IF, it returns a result.

The functionality would remain exactly the same, as would all of the method signatures.

But I see the project has been inactive on CodePlex for quite a while. Would you be interested in hosting a slightly updated version on your CodePlex page, or would we be better off creating a fork and setting up our own CodePlex page?

Based on the different combinations of "parameters" values I need to get different values back. I need to define the logic in app.config to be able to change it “dynamically” (without changing my application).

So I thought about a rule set defined like following stored in app.config (example based on the input XML file above):

Lately I have successfully used your Flee library in my application. It is a very good library!! There is one piece of functionality that I miss: traversal of the parse tree seems not to be possible. I found some clue on the flee.codeplex.com website (see http://flee.codeplex.com/Thread/View.aspx?ThreadId=59791) that this kind of development is under way. Can you give a hint as to when it will be released?

Excellent. Just a quick question: why didn't you just try to port Rhino to C#? Generating IL is the biggest part of the port, and having a standard implementation would give your work so much more of a head start.

Hi, Eugene. I'm developing a payroll application and I'm evaluating your expression evaluator for it. My intent is to use it for evaluating the payroll rules ("formulas"). I started with your "Excel-Like Formula engine" project, but it's too slow during the evaluate process; this one seems to be OK for my purposes.

I just want to confirm what I'm seeing, given these facts:

1) Expression a + b where a = 100 and b = 200
2) Compile time for the expression: "222 ms"
3) Evaluate the expression: "04 ms"
4) Change the values of the variables to a = 50 and b = 50
5) Re-evaluate the expression: "00 ms"

Question: is it true to assume that when I change the values of a and b in the formula (step 4) and re-evaluate it, the subsequent (re)evaluations will always get a "00 ms evaluation time", at least in this example formula?

> Is it true to assume that when I change the values a and b 4) in the formula and re-evaluate it
> then the subsequent (re)evaluations will always get a "00 ms evaluation time"

Yes. Once the expression is compiled, evaluating it just executes the same IL code and performs dictionary lookups for the variables. This means that there shouldn't be much variance in evaluation speed from one evaluation to the next.

From my tests, evaluation speed is always in the microseconds range rather than milliseconds. I've also found that Flee is about an order of magnitude faster than my excel formula library. This means that you can take whatever timings you got from the formula engine, divide them by 10, and you should get a good idea of the performance improvement that Flee will give you.

We could provide a property that controls how date and time variables are evaluated, just by supplying the formatting string, e.g.

Flee.DateFormattingString = "'yyyy-MM-dd HH:mm:ss'"

or

Flee.DateFormattingString = "#yyyy/MM/dd HH:mm:ss#"

Then I think it could be a good idea to convert datetimes to serials (numeric data), like Excel does, to ease the calculation.

Flee 0.9.22.0 now supports DateTime and TimeSpan literals. Since Flee supports all overloaded operators on those types as well as all methods and properties, you have the entire .NET DateTime API available for use in expressions.

Example:

Subtract two dates:

#2008/08/12# - #2008/08/04#

Add a timespan to a date:

#2008/08/12# + ##14:45#

Format a date:

#2008/08/12#.ToLongDateString()

The format of the dates is customizable using the ExpressionOptions.DateTimeFormat property.

OK, I will explain more about why I would like to access the dictionary directly. In my project, the expression will be entered from the GUI in a textbox, like in the demo version. But the user won't know the name of the dictionary, of course...

In my project, most variables aren't defined at the beginning. You can define them by using an assignment operator. By entering the equation x=5, the user adds the variable x to the owner with a fixed value of 5. I have already dealt with the assignment problem simply by doing a pre-parse to separate the assigned variable (the left side of the =) from its value (the rest of the equation).

I was thinking that my owner could simply contain one dictionary, and each time a variable is assigned I simply add it to the dictionary in the owner so that the variable will be known in future equations. But for now I simply cannot figure out which part of the code I have to modify to do that...

So it would work a little bit like writing a series of equations in Matlab...

I think you can achieve what you want by using the variables collection. Something like the following:

- Create an expression context. This holds the expression's variables.
- User enters the expression
- You parse out the variable name and the actual expression
- You compile the expression
- You add the compiled expression to your context's variables collection using the variable name you parsed (expressions can be used as variables for other expressions)
- If you now compile a new expression using the same expression context and reference one of the variables you added, that variable's expression will be evaluated and you'll get the desired result.
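The steps above can be sketched without the library as follows. VariableTable is a hypothetical stand-in for an expression context's variables collection, and a plain Func<double> stands in for a compiled expression; none of these names are Flee's API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of Flee.
public sealed class VariableTable
{
    // Names are matched case-insensitively, like the expression language.
    private readonly Dictionary<string, Func<double>> variables =
        new Dictionary<string, Func<double>>(StringComparer.OrdinalIgnoreCase);

    // Splits "name = expression" into its two parts. Returns false when the
    // text before the first '=' is not a plain identifier (e.g. "1=1").
    public static bool TrySplitAssignment(string input, out string name, out string expression)
    {
        Match m = Regex.Match(input, @"^\s*([A-Za-z_]\w*)\s*=(.+)$");
        name = m.Success ? m.Groups[1].Value : null;
        expression = m.Success ? m.Groups[2].Value.Trim() : null;
        return m.Success;
    }

    // Stores a compiled expression under a variable name, replacing any old one.
    public void Define(string name, Func<double> compiled)
    {
        variables[name] = compiled;
    }

    // Evaluates the expression stored under the name at the time of the call.
    public double Get(string name)
    {
        return variables[name]();
    }
}
```

Because a variable holds an expression rather than a value, defining y in terms of x and then redefining x changes what y evaluates to. Note that the split is ambiguous for input such as a = b, since = is also the comparison operator in the expression language, so the caller has to decide which meaning applies.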

Is it possible make variable names case-sensitive? It is critical for me...

-- modified at 12:45 Wednesday 10th October, 2007

Another feature which would be VERY useful is to somehow get a collection of the used variables without passing them explicitly. E.g. DynamicOwner would create a collection { M, m, G, r } from an expression "G*M*m/r^2".

Thank you in advance.

Greetings - Gajatko

Portable.NET is part of DotGNU, a project to build a complete Free Software replacement for .NET - a system that truly belongs to the developers.

What a pity... Do you know any good and simple (not necessarily fast) math parser you could recommend? My goal is to determine whether two math expressions are equivalent. They may contain some simple functions, like sin/cos/abs/exp/arcsin etc., and case-sensitive variables, both in greek* and arabic alphabet (*wishful thinking). Expressions are collected straight from a classroom: maths, physics, chemistry, etc. The ideal solution would be a perfect SIMPLIFY procedure, but I would be satisfied with comparing them by substituting random numeric values.

So what lib would you use in such case?

Greetings - Gajatko


Could you implement IEnumerable for DynamicExpressionOwner? Then it would be much easier to access an unknown number of variables (e.g. to set random values a few times to compare the results of two (different?) expressions).

Greetings - Gajatko


Thanks for the feedback. To use the values in an array, you just declare a field or property as an array on the expression owner. You can then use the [] indexer on it in an expression. The indexer can also be used on values that declare a default indexer property (such as collections).

There is one bug I discovered, though: ciloci.Flee.DoubleConstantElement.GetValue(String image) throws System.FormatException when a decimal number in a format such as 3,1416 is entered (that is, in any format that uses a decimal separator other than the standard '.'). This number format is used in some cultures.

Also you have probably forgotten to upload the source code, though it is possible to grab it from CodePlex.

Thanks for your feedback. So far it (and the article) haven't proven to be very popular and I was starting to wonder if I'd done something wrong.

Juraj Borza wrote:

a decimal number in format for example 3,1416 is entered (in any format that uses non-standard decimal separator - '.') - this number format is used in some cultures.

No problem, I will put in a culture-sensitive decimal separator for real numbers. Do you think I should also put in a culture-sensitive list separator for function arguments? So, for example, you could type "func(a; b; c)" instead of "func(a, b, c)". Is that as important as the decimal separator or do people in other cultures always use the "," as a list separator?

Also, how were you able to get the DoubleConstant to throw a FormatException? If you enter a number like "3,124", the parser should throw a syntax error and not even get to the parse method on that element.

Juraj Borza wrote:

Also you have probably forgotten to upload the source code

I double checked and I do have a source link and it does point to the right zip file (I was able to download it). Does the link not show up for you?

Just the following comment. I suspect that now that you can handle a comma (',') as a decimal separator, you'll have issues when handling numeric constants as function arguments, e.g.

Sqrt(12,34)

may be incorrectly parsed as two double arguments, 12.0 and 34.0, or, more likely, the following

Pow(12,3)

will be incorrectly parsed as a single argument of 12.3 to the Pow function, which takes 2 arguments. I guess you'll need to add an NUnit test for that?

Typically, computer languages DO NOT use the current culture's decimal separator when interpreting floating point literals; e.g. in C# the decimal separator is always a period ('.') character when used in numeric literals. This is the desired behaviour: you don't want the compiler to generate different code (or fail to compile, as the case may be) when the user's current culture is changed but the source code hasn't. Since you are compiling these expressions at run time, maybe this behaviour is okay; however, if the raw expression strings were ever persisted, then the behaviour could change depending on the current culture.
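As a minimal sketch of culture-independent literal parsing, assuming the literal text reaches a helper like the hypothetical TryParseReal below (this is not the library's DoubleConstantElement code):

```csharp
using System;
using System.Globalization;

public static class LiteralParsingSketch
{
    // Parses a real literal the same way regardless of the machine's culture.
    public static bool TryParseReal(string image, out double value)
    {
        // NumberStyles.Float does not allow thousands separators, so with the
        // invariant culture "3,1416" is rejected instead of being read as 31416.
        return double.TryParse(
            image, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }
}
```

With this approach "3.1416" parses to the same value on every machine and "3,1416" is always rejected, so a persisted expression string cannot change meaning when the current culture changes.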

I suspect that now you can handle a comma (',') as a decimal separator you'll now have issues when handling numeric constants as function arguments.

I was thinking about this yesterday when I made the change.

Mark Dunmill wrote:

Sqrt(12,34)

This will be parsed as 12.34 because Grammatica can use Regexes as tokens and "12,34" will be picked up as a token before it gets to the argument list production.

Mark Dunmill wrote:

Pow(12,3)

I ran into this same problem in my previous project (Excel has a culture-sensitive decimal separator). The reason I got away with it there is because I also used the list separator of the current culture. As it turns out, there is no culture where the decimal and list separators are both ",".

Mark Dunmill wrote:

Typically, computer languages DO NOT use the current culture's decimal separator when interpreting floating point literals

I see your point.

Mark Dunmill wrote:

if the raw expression strings were ever persisted then it can mean the behaviour could change depending on the current culture.

I'm still deciding how I'm going to serialize expressions. I can either save the raw expression string and reparse it on deserialization or save the compiled expression tree. The .NET regex class does the former. What's your opinion?

Based on your feedback, I think I'm going to retract this feature. If someone needs expressions with the correct decimal separator, they can always run a regex replace on the expression string before they give it to the expression class.

Regarding how to serialize the expressions, it clearly is much easier to just save the raw expression string. It also means that you can easily preserve the user's initial formatting of the expression string. If you were to save the expression tree then you'd need to traverse the tree appropriately to retrieve the text string and you’d be hard pressed to get the user’s exact formatting unless you annotated the expression tree with extra info.

On the other hand this could be seen as an advantage as a way to ensure consistent formatting e.g. consistent casing of identifiers, spacing etc. This can also be useful if say a variable name is changed you could generate the new expression text without the need to do a specialised regex search and replace, being careful not to replace inside those string literals etc.

Another issue is, do you ever need to persist “invalid” expressions (i.e. syntax errors)? Since there is no valid expression tree in this case, then this could cause issues. Clearly using the raw string makes this easy. (A workaround would be to create an “Illegal Expression” object type and set the text to the raw expression text and also contain a reference to the syntax error.)

It's also a time vs. space thing. Saving the raw expression text is very compact but will take some CPU cycles to do the reparsing. Saving the expression tree will take more space, but deserialization will be fast.

I was also thinking of making it an option, so that the expression user gets to choose the best approach but I don't know if that's over-designing it.

Hmm. Right now the only way to create an expression is through its constructor. If an expression is invalid, an exception will always be thrown so it is not possible to have an expression instance that represents an invalid expression.