Introduction

Many applications need to evaluate a formula at run time. The .NET Framework, despite its advanced compilation support, does not offer a quick, lightweight eval function. This article introduces a usable eval function with some rarely available features:

Fast, single-pass parser

Highly extensible; you can add your own variables and functions without having to change the library itself

Common operator priorities are supported: 2+3*5 is evaluated as 2+(3*5)=17

Boolean operations are supported: "and", "not", "or"

Comparison operators are supported, e.g. <=, >=, <>

Supports numbers, dates, strings, and objects

Supports calls to object properties, fields and methods

Runs an expression multiple times without needing reparsing

Can automatically detect when an expression needs to be re-evaluated

The expression syntax is completely checked before starting the evaluation

Fully human-readable code -- not generated using Lex/Yacc tools -- and therefore permits amendments to the core syntax of the evaluator, if needed

This article also attempts to explain how the whole thing works.

Why an interpreter?

People often tell me that there is no place in their application for an evaluator because it is too complicated for their users. I do not agree with this view. An evaluator is a cheap way to hide complexity from the average user while providing powerful features for the advanced user. Let's take an example.

In your application, you let the users choose the title of the window. This is convenient and simple; it's just a textbox where they can type what they want. The difficulty comes when some users want more. Let's say that they want to see their User IDs or the time. Then you have 3 alternatives:

You add a form in your program and give your user more options on what they can show.

You don't do it.

You use an evaluator (e.g. mine).

The first option requires lots of work on your side and can potentially confuse the more basic users. The second option won't confuse your users, but might lose you the more advanced ones. The third option is ideal because you can keep your textbox and let power users type what they want.

Title of the window for user : %[USERID], the time is %[NOW]

And you're done. The interface is still using a regular textbox and is not complicated. On the coding side, it is really not much to add. In terms of power, you can add a new variable every day and as long as you document it all, your users stay satisfied.

Why not use the .NET Framework built-in compiler?

Using the .NET Framework compilation capabilities seems to be the most obvious way to make an evaluator. However, in practice this technique has a nasty side effect: it appears to create a new DLL in memory each time you evaluate your function, and it seems nearly impossible to unload that DLL. You can refer to the remarks at the end of the article Evaluating Mathematical Expressions by Compiling C# Code at Runtime for more details.

Using other engines or application domains is an option if you want full VBScript or C# syntax. If you need to write classes and loops, this is probably the way to go. This evaluator neither uses CodeDOM nor tries to compile VB source. It parses an expression character by character and evaluates its value without using any third-party DLL.

Using the code

The evaluator can be run with just two lines of code:

In VB

Dim ev As New Eval3.Evaluator
MsgBox(ev.Parse("1+2+3").value)

In C#

Providing variables or functions to the evaluator

By default, the evaluator no longer defines any functions or variables. This way, you decide exactly which functions you want your evaluator to understand. To extend the evaluator, you need to create a class. Below is a VB sample; a C# version is available in the Zip file.

There is a shared function in the evaluator to return all those types as string:

Evaluator.ConvertToString(res)

This function will return every type using a default format.

How does this all work?

If you just want to use the library, please refer to the 'Using the code' section. The following sections are for curious people who want to know how it works. The techniques I used are rather traditional and can, I hope, serve as a good introduction to compilation theory.

The evaluator is made of a classic Tokenizer followed by a classic Parser. I wrote both of them in VB, without using Lex or Bison tools. The aim was readability over speed. Tokenizing, parsing and execution are all done in one pass. This is elegant and, at the same time, quite efficient because the evaluator never looks ahead or backwards more than one character.

The tokenization

The first thing the evaluator needs to do is split up the string you provide into a set of Tokens. This operation is called tokenization and in my library it is done by a class called tokenizer.

The tokenizer reads the characters one by one and changes its state according to the characters it encounters. When it recognizes one of the Token types, it returns it to the parser. If it does not recognize a character, it will raise a syntax error exception. Once the class is created with this command,

tokenizer = new Tokenizer("1+2*3+V1")

...the evaluator will just access tokenizer.type to read the type of the first token of the string. The type returned is one of those listed in the chart below. Note that the tokenizer is not reading the entire string. To improve performance, it will only read a single token at a time and return its type. To access the next token, the evaluator will call the method tokenizer.nextToken(). When the tokenizer reaches the end of the string, it returns a special token end_of_formula.
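The one-token-at-a-time behaviour described above can be sketched as follows. This is only an illustration, not the library's source: the names SketchTokenizer, TokenKind, Kind and Text are hypothetical, and the sketch recognizes just numbers, identifiers and a handful of symbols.

```vb
Imports System

' Hypothetical token types; the library's actual list may differ.
Public Enum TokenKind
    Number
    Identifier
    Symbol
    EndOfFormula
End Enum

Public Class SketchTokenizer
    Private ReadOnly _source As String
    Private _index As Integer = 0

    ' Type and text of the token the tokenizer is currently positioned on.
    Public Property Kind As TokenKind
    Public Property Text As String = ""

    Public Sub New(formula As String)
        _source = formula
        NextToken() ' position on the first token only
    End Sub

    ' Reads exactly one token; the rest of the string is left unread.
    Public Sub NextToken()
        ' Skip blanks between tokens.
        While _index < _source.Length AndAlso Char.IsWhiteSpace(_source(_index))
            _index += 1
        End While

        If _index >= _source.Length Then
            Kind = TokenKind.EndOfFormula
            Text = ""
            Return
        End If

        Dim start As Integer = _index
        Dim c As Char = _source(_index)

        If Char.IsDigit(c) Then
            While _index < _source.Length AndAlso
                  (Char.IsDigit(_source(_index)) OrElse _source(_index) = "."c)
                _index += 1
            End While
            Kind = TokenKind.Number
        ElseIf Char.IsLetter(c) Then
            While _index < _source.Length AndAlso Char.IsLetterOrDigit(_source(_index))
                _index += 1
            End While
            Kind = TokenKind.Identifier
        ElseIf "+-*/()".IndexOf(c) >= 0 Then
            _index += 1
            Kind = TokenKind.Symbol
        Else
            ' Unknown character: report a syntax error.
            Throw New FormatException("Unexpected character '" & c & "' at position " & _index)
        End If

        Text = _source.Substring(start, _index - start)
    End Sub
End Class

Module TokenizerDemo
    Sub Main()
        Dim t As New SketchTokenizer("1+2*3+V1")
        While t.Kind <> TokenKind.EndOfFormula
            Console.WriteLine(t.Kind.ToString() & ": " & t.Text)
            t.NextToken()
        End While
    End Sub
End Module
```

The demo walks through the same formula as above and prints one line per token, ending when the end-of-formula token is reached.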

The parser

The parser has been completely rewritten in this version. It uses the information provided by the tokenizer (the big brown box) to build a set of objects (the stack on the right). In my library, each of these objects is called an OpCode. Each OpCode returns a value and may or may not have parameters.

Opcode 1
Opcode 2
Opcode 3
Opcode *
Two Opcode +
and Opcode V1

The OpCodes + and * have two parameters. The rest of the OpCodes have none. One of the more complicated concepts of the parser is that of priorities. In our expression...

1 + 2 * 3 + v1

...the evaluator has to understand that what we really mean is:

1 + (2 * 3) + v1

In other words, we need to do the multiplication first. So, how can this be done in one pass? At any time, the parser knows its level of priority:

When the parser encounters an operator, it recursively calls itself to get the right part. When that call returns the right part, the operator can apply its operation (for example, +) and the parsing continues. The interesting part is that while calculating the right part, the parser already knows its current level of priority. Therefore, while parsing the right part, if it detects an operator with a higher priority, it continues parsing and returns only the resulting value.
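A minimal sketch of this priority-driven recursion is shown here. It is not the library's parser: the priority table and all names are hypothetical, it handles only unsigned numbers with + - * /, and for brevity it computes the value directly instead of building OpCodes.

```vb
Imports System
Imports System.Globalization

Module PrioritySketch
    Private formula As String
    Private pos As Integer

    ' Hypothetical priority table: a higher number binds tighter.
    Private Function PriorityOf(op As Char) As Integer
        Select Case op
            Case "+"c, "-"c
                Return 1
            Case "*"c, "/"c
                Return 2
            Case Else
                Return 0 ' not an operator
        End Select
    End Function

    ' Reads an unsigned number such as 12 or 3.5.
    Private Function ParseNumber() As Double
        Dim start As Integer = pos
        While pos < formula.Length AndAlso
              (Char.IsDigit(formula(pos)) OrElse formula(pos) = "."c)
            pos += 1
        End While
        If pos = start Then Throw New FormatException("Number expected at position " & pos)
        Return Double.Parse(formula.Substring(start, pos - start), CultureInfo.InvariantCulture)
    End Function

    ' Consumes operators whose priority is above currentPriority and
    ' leaves operators of equal or lower priority to the caller.
    Private Function ParseExpression(currentPriority As Integer) As Double
        Dim leftValue As Double = ParseNumber()
        While pos < formula.Length
            Dim op As Char = formula(pos)
            Dim p As Integer = PriorityOf(op)
            If p = 0 Then Throw New FormatException("Unexpected character '" & op & "' at position " & pos)
            If p <= currentPriority Then Exit While
            pos += 1
            ' Recursive call: the right part is parsed at the operator's priority.
            Dim rightValue As Double = ParseExpression(p)
            Select Case op
                Case "+"c
                    leftValue += rightValue
                Case "-"c
                    leftValue -= rightValue
                Case "*"c
                    leftValue *= rightValue
                Case "/"c
                    leftValue /= rightValue
            End Select
        End While
        Return leftValue
    End Function

    Public Function Evaluate(text As String) As Double
        formula = text.Replace(" ", "")
        pos = 0
        Return ParseExpression(0)
    End Function

    Sub Main()
        Console.WriteLine(Evaluate("2+3*5"))         ' 17
        Console.WriteLine(Evaluate("1 + 2 * 3 + 4")) ' 11
    End Sub
End Module
```

In "2+3*5", the recursive call made for + sees * with a higher priority and keeps parsing, so it returns 15 as the right part. In "1 + 2 * 3 + 4", the call made for * sees the second +, which has a lower priority, and returns immediately so that the outer loop applies it.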

The interpretation

The last part of the evaluation process is the interpretation. This part now runs a lot faster thanks to the OpCodes.

To get the result out of the stack of OpCodes, you just need to call the root OpCode value. In our sample, the root OpCode is a + operator. The property Value will in turn call the value of each of the operands and the result will be added and returned. As you can see from this picture, the speed of evaluation is now quite acceptable. The program below needs 3 full expression evaluations for every single pixel in the image. For this image, it required 196,608 evaluations and, despite that, it returned in less than a second.

The class at the core of this new project is the OpCode class. The key property in the OpCode class is the property 'value'.
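To make the idea concrete, here is a small sketch of a tree of value-returning nodes for the expression 1 + (2 * 3) + v1. The class names (SketchOpCode, ConstantOpCode, VariableOpCode, BinaryOpCode) are hypothetical and the tree is built by hand; in the library the parser would build it, and the real OpCode class is certainly richer.

```vb
Imports System

' Hypothetical base class: every node exposes a Value property.
Public MustInherit Class SketchOpCode
    Public MustOverride ReadOnly Property Value As Double
End Class

' A node without parameters: a literal number.
Public Class ConstantOpCode
    Inherits SketchOpCode
    Private ReadOnly _number As Double

    Public Sub New(number As Double)
        _number = number
    End Sub

    Public Overrides ReadOnly Property Value As Double
        Get
            Return _number
        End Get
    End Property
End Class

' A node without parameters whose value can change between evaluations.
Public Class VariableOpCode
    Inherits SketchOpCode
    Public Property Current As Double

    Public Overrides ReadOnly Property Value As Double
        Get
            Return Current
        End Get
    End Property
End Class

' A node with two parameters: + or *.
Public Class BinaryOpCode
    Inherits SketchOpCode
    Private ReadOnly _op As Char
    Private ReadOnly _first As SketchOpCode
    Private ReadOnly _second As SketchOpCode

    Public Sub New(op As Char, first As SketchOpCode, second As SketchOpCode)
        _op = op
        _first = first
        _second = second
    End Sub

    Public Overrides ReadOnly Property Value As Double
        Get
            ' Asking for Value asks each operand for its own Value in turn.
            If _op = "+"c Then Return _first.Value + _second.Value
            If _op = "*"c Then Return _first.Value * _second.Value
            Throw New NotSupportedException("Unknown operator " & _op)
        End Get
    End Property
End Class

Module OpCodeDemo
    Sub Main()
        Dim v1 As New VariableOpCode()

        ' Tree for 1 + (2 * 3) + v1, built once.
        Dim root As SketchOpCode = New BinaryOpCode("+"c,
            New BinaryOpCode("+"c,
                New ConstantOpCode(1),
                New BinaryOpCode("*"c, New ConstantOpCode(2), New ConstantOpCode(3))),
            v1)

        v1.Current = 10
        Console.WriteLine(root.Value) ' 17
        v1.Current = 20
        Console.WriteLine(root.Value) ' 27, with no reparsing
    End Sub
End Module
```

Because the tree is kept after parsing, a second evaluation only walks the nodes again, which is why repeated evaluation is cheap.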

Is that really faster?

It is faster if you need to evaluate the same expression more than once. If you need to evaluate it only once, you might not care about speed anyway. So, I would recommend this new version in either case. As you can see from the picture above, 3 formulas are evaluated for every pixel of the image. The image being 256x256 pixels, the evaluator had to calculate 3 x 256 x 256 = 196,608 expressions in less than a second. So, simple expressions are evaluated in about 5 microseconds each. I think this is acceptable for most applications.

Dynamic variables

Dynamic variables are an interesting concept. The idea is that if you use several formulas in your application, you don't want to recalculate all the formulas when a variable changes. The evaluator has a built-in ability to recalculate only the formulas affected by the change. On this page, the program uses the dynamic ability:
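The following sketch is not the library's implementation. It only illustrates one common way to get this behaviour, using hypothetical classes (WatchedVariable, DoubledFormula): the variable raises an event when its value changes, and a formula that depends on it marks itself as out of date and recomputes only on the next read.

```vb
Imports System

' Hypothetical variable that announces its changes.
Public Class WatchedVariable
    Public Event ValueChanged As EventHandler
    Private _stored As Double

    Public Property Current As Double
        Get
            Return _stored
        End Get
        Set(value As Double)
            If value <> _stored Then
                _stored = value
                RaiseEvent ValueChanged(Me, EventArgs.Empty)
            End If
        End Set
    End Property
End Class

' Hypothetical formula "a * 2" that recomputes only after 'a' has changed.
Public Class DoubledFormula
    Private ReadOnly _input As WatchedVariable
    Private _cached As Double
    Private _outOfDate As Boolean = True
    Public EvaluationCount As Integer = 0

    Public Sub New(input As WatchedVariable)
        _input = input
        AddHandler _input.ValueChanged, AddressOf OnInputChanged
    End Sub

    Private Sub OnInputChanged(sender As Object, e As EventArgs)
        _outOfDate = True
    End Sub

    Public ReadOnly Property Value As Double
        Get
            If _outOfDate Then
                _cached = _input.Current * 2
                EvaluationCount += 1
                _outOfDate = False
            End If
            Return _cached
        End Get
    End Property
End Class

Module DynamicDemo
    Sub Main()
        Dim a As New WatchedVariable()
        Dim f As New DoubledFormula(a)

        a.Current = 21
        Console.WriteLine(f.Value)           ' 42
        Console.WriteLine(f.Value)           ' 42, served from the cached result
        a.Current = 5
        Console.WriteLine(f.Value)           ' 10
        Console.WriteLine(f.EvaluationCount) ' 2
    End Sub
End Module
```

With several formulas, each one would subscribe only to the variables it uses, so a change to one variable leaves unrelated formulas untouched.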

You said it supports objects?

Yes, the evaluator supports the . operator. If you enter the expression theForm.text then the evaluator will return the title of the form. If you enter the expression theForm.left, it will return its runtime left position. This feature is only experimental and has not been tested yet. That is why I have put this code here, hoping that others will find its features valuable and submit their improvements.

How does this work?

In fact, object support came for free. I used System.Reflection to evaluate the custom functions, and the same code is used to access an object's methods and properties. When the parser encounters an identifier that has no other meaning to it, it tries to reflect on the CurrentObject to see if it can find a method or a property with the same name.
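As an illustration of that lookup, the sketch below resolves a name against an arbitrary object with System.Reflection. The helper ReadMember is hypothetical and much simpler than what the library does: it only handles public instance properties and parameterless methods, and it ignores case.

```vb
Imports System
Imports System.Reflection

Module ReflectionSketch
    ' Looks up a public property or parameterless method called 'name' on 'target'.
    Function ReadMember(target As Object, name As String) As Object
        Dim flags As BindingFlags =
            BindingFlags.Public Or BindingFlags.Instance Or BindingFlags.IgnoreCase
        Dim t As Type = target.GetType()

        ' Try a property first.
        Dim prop As PropertyInfo = t.GetProperty(name, flags)
        If prop IsNot Nothing Then Return prop.GetValue(target, Nothing)

        ' Then a method that takes no arguments.
        Dim method As MethodInfo = t.GetMethod(name, flags, Nothing, Type.EmptyTypes, Nothing)
        If method IsNot Nothing Then Return method.Invoke(target, Nothing)

        Throw New MissingMemberException(t.Name, name)
    End Function

    Sub Main()
        Dim text As String = "hello"
        Console.WriteLine(ReadMember(text, "length"))  ' 5
        Console.WriteLine(ReadMember(text, "ToUpper")) ' HELLO
    End Sub
End Module
```

The same helper would work for an expression such as theForm.text, with the form as the target and "text" as the name.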

Are there any known bugs or requests?

The following are requests/bugs from the original project:

Someone reported that you need the option 'Compare Text' for the evaluator to work properly. I think this is fixed now. If you want the evaluator to be case-sensitive you can ask for it in the evaluator constructor.

Someone also reported that the evaluator did not like having a comma as a decimal point in the Windows international settings. This is fixed, too, I believe.

My request: If you find this library useful or interesting, don't forget to vote for me. :-)

Points of interest

Speed tests: I wish I had the time to compare various eval methods. If someone wants to help, please contact me. To my knowledge, this is the only formula evaluator available on CodeProject with a separate Tokenizer, Parser and Interpreter. Extending it is very easy thanks to the internal use of System.Reflection.

History

18th May 2007

Article edited and posted to the main CodeProject.com article base.

4th May 2006

Fix a bug introduced in the last version where the functions were not recognized properly.

Add a few more samples in both C# and VB sample programs using Arrays and default members (Controls.Item).

27th April 2006

Implement arrays

Start differentiating C# and VB

20th April 2006

Try to Improve the article with more pictures.

19th April 2006

C# compatibility (a few variables and members were renamed to avoid c# keywords conflicts).

C# sample

Move the core evaluator within a DLL

Allow 'on the fly variable' through a new interface called 'iVariableBag'

First of all, thank you very much for this program; it is very useful. I am facing errors in all functions that have optional parameters. Whenever I pass fewer parameters than the total (i.e. only the mandatory ones), I get the error "Parameter Count Mismatch". I went through the code and found a call to "System.Type.GetMember", which returns the method the system has to call, but the returned value does not indicate which parameters are optional. The method therefore expects all parameters, including the optional ones, and gives this error.

Please help me solve this if you can. I will be very thankful to you.

I tried adding the library to my project, and got various errors, mostly that some functions did not return values on all paths, or that a variable was not initialized. So I fixed those, but now I get the following message: "'Eval3' is not declared. It may be inaccessible due to its protection level."
Also, some files start with "Imports Eval3". What is Eval3? Where is it?
The function that has the error message in it is:

I found the problem. I was not including assemblyinfo.vb in my project. My next question is: how do I supply variables (such as x = x * 2) where I know the value of x, but still want to supply it as a variable and not a value?

I need a way to program, in Excel VBA, a function that expands an algebraic expression without evaluating it.
I mean feeding "(a+b+c)*(d+e+f)" (literally typed) and getting "a*d+b*d+c*d+a*e+b*e+c*e+a*f+b*f+c*f"
It only has to work for (), * and +
Could this code be adapted?
Jaime Segura

I don't know what happened. After reading this article, I tried several times to download the code, but the download link always takes me to a page where the download never starts.
Could someone please send it to me? Thanks very much.

Very excellent parser. I am using the latest Eval4 version hosted on assembla. I am trying to use this as a formula expression evaluator. I have dynamic heads which are created by the user from the application. I want to add those heads as a variable so that the parser can evaluate it correctly.

Currently I have few fixed variables like this:

Public Present_Days As Eval4.Core.Variable(Of Double)
Public Absent_Days As Eval4.Core.Variable(Of Double)
Present_Days = New Eval4.Core.Variable(Of Double)(CDbl(25), "Present_Days")
Absent_Days = New Eval4.Core.Variable(Of Double)(CDbl(5), "Absent_Days")

Is it possible to create something like this ? This is just a rough implementation.

First of all, thanks for replying. My question is not about changing the value of the variable. It is about what happens after the value is changed. I am trying to trap the event that fires after the value is changed. I tried something like this but it does not work.

really like this software!
There is one problem I cannot figure out.
When running a formula like (1=1), the parser returns True.
How can I make the parser return 1 and 0 instead of True/False?
The formulas I have to evaluate use frequently the syntax e.g. (Variable=Value)*1.
Like a conditional statement.
Currently with the return being True/False, this expression cannot be evaluated.