My first idea was to create a parser to color some text. As I didn't know how to start seriously, I remembered an example I had read in Bjarne Stroustrup's The C++ Programming Language. The sample was a simple calculator that read characters from the standard input stream (console mode) and parsed them as mathematical expressions. I reproduced it, improved it, and here is the result. Enjoy...

Parsing is an interesting part of programming because we always need to analyze an input stream to work out what the user wants the program to do. In graphical applications, this is greatly simplified because, in general, the input is not made of full sentences as it is on a command line. Parsing such input is nevertheless what my calculator does, and I'll do my best to explain how.

First, I'll present the parser, its members, its functions, and how they interact with each other. Then, I'll present my graphical interface, and finally, I'll explain how to use the parser in your own code or at least in your project.

VisualCalc was first built with Microsoft Visual C++ 6. As this compiler is getting seriously old, I had to make the hard decision to migrate to a more recent version of Visual Studio, knowing that many readers are still using VC6. I'll do my best to support VisualCalc for those who have no compiler other than VC6. If you are only interested in using the parser in your own code, there shouldn't be any problem compiling it under Visual C++ 6 for now; that may change in future versions...

The VisualCalc project is growing bigger and bigger, and I still have plenty of ideas to improve it. Unfortunately, I only work on it in my free time, so updates come at a reasonable pace. However, since I've been working on it seriously since its first private release in August 2004, I ask you to use my work with respect. That's why I place some conditions on its use:

This code may be used in compiled form in any way you desire (including commercial use). The code may be redistributed unmodified by any means providing it is not sold for profit without the author's written consent, and providing that this notice and the author's name and all copyright notices remain intact. However, this file and the accompanying source code may not be hosted on a website or bulletin board without the author's written permission.

This software is provided "as is" without express or implied warranty. Use it at your own risk!

Whilst I have made every effort to remove any undesirable "features", I cannot be held responsible if it causes any damage or loss of time or data.

Hopefully, that isn't too much to ask considering the amount of work that went into this. If you do use it in a commercial application, then please send me an email letting me know.

To comply with the MVC (Model - View - Controller) architecture, the most important improvement in parser version 2.3 was to move the parser methods out of the main dialog box class and into a class of their own. Basically, the MVC architecture requires the main layers of the application (business objects, background processing, and graphical display) to be separated. Now, not only does the parser exist in its own class, but it has also been physically separated, as it lives in its own DLL project. The parser code is commented for the Doxygen documentation tool, and the generated HTML document can be downloaded from this page (see the downloads section at the top of the article).

The parser uses pure standard C++ so that it can be used on any platform. VCalcParserTypes defines several types, which are presented later in this section. One of the most important is VALUES_TYPE.

The values handled by the parser and returned as results are currently long doubles. I chose to use an alias so that I can more easily switch to a type that handles many more digits. Besides, I am working in parallel on my own "big decimal" data type, which would be able to store floating point numbers bigger than the native types can. The difficulty in this part is to re-create the mathematical functions that operate on those decimals. This is also the reason why I overloaded the mathematical functions (cos, sqrt, log, ...), so that switching between data types is made easier.
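As a rough illustration of the alias idea, here is a minimal sketch. It assumes VALUES_TYPE is a plain typedef of long double, as described above; the vcalc_sketch namespace and the wrapper names are hypothetical and are not taken from the parser's sources.

```cpp
#include <cmath>

// Alias for the numeric type handled by the parser (long double, as stated above).
typedef long double VALUES_TYPE;

// Hypothetical wrappers: if VALUES_TYPE later becomes a "big decimal" class,
// only these functions would need a new implementation.
namespace vcalc_sketch {
    inline VALUES_TYPE Sqrt(VALUES_TYPE x)  { return std::sqrt(x); }
    inline VALUES_TYPE Cos(VALUES_TYPE x)   { return std::cos(x); }
    inline VALUES_TYPE Log10(VALUES_TYPE x) { return std::log10(x); }
}
```

Code that only calls such wrappers never mentions long double directly, which is what makes a later type change cheap.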

The TokenValue type allows the GetToken() function to return the token read from the input stream to the calling functions. It will grow accordingly as I add new operators to the calculator.

std::string m_Source;

Contains the entire input string. This is the effective stream that will be read by GetToken() to be parsed.

std::string m_strIdentifierValue;

Contains the name of a valid identifier. Whether the user sets a new variable, modifies or just recalls one, the string of the variable name will be stored here. This member also contains the function names the user calls.

std::string m_strWarningMsg;

This member stores the description of a non-blocking warning. It can be printed or ignored by the user interface. Mine just prints it after outputting the result (for example, 0^0 produces a warning stating that it has been replaced by 1).

VALUES_TYPE m_valNumberValue;

Stores the value of a number (integer or floating point number) when a valid number is read from the input stream.

bool m_bWarningFlag;

This flag tells the function that ran the parser that the result is valid/printable, but that a message may be added to notify the user of a questionable operation.

bool m_bEndEncountered;

This flag, used by GetToken(), is not strictly indispensable, but it is quite useful. It indicates that the whole input stream has been read (equivalent to eof() for files).

int m_iCurrentIndex;

Counter variable to indicate the reading position in the stream. It may be between 0 and "input string length"-1.

std::list<std::string> m_lstFunctions;

This member stores the list of the functions designed for the user.

std::map<std::string, VALUES_TYPE> m_mapVariables;

This is the table where the variables set by the user are associated with their values.

std::deque<AnswerItem> m_dqeAnswersHistory;

This deque contains the most recent AnswerItems. AnswerItem is a type defined in VCalcParserTypes.h; it pairs a formula with its result.

As we can see in the definition of the class, the parser is a set of seven private recursive-descent functions plus a public function for the user interface to start the calculation. The calling tree is called "recursive descent" because each function is associated with an operator level; a function is normally called by the function of the immediately lower level and itself calls the function of the immediately higher level. The calls are explained in the following figure. In some exceptional cases, a function can call itself or another function whose level is lower in the hierarchy:

Figure 1: Regular parser function call hierarchy.

As we know, mathematical operators have priorities (for example, basic arithmetic teaches us that multiplication takes precedence over addition). Programming languages give us another view of operator levels. So I had to parse what the user entered while respecting these priorities. This part is handled implicitly by the recursion of the functions. The operators implemented in this calculator are the following, in decreasing order of priority. Operators at the same level are grouped together:

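| Level | Operator | Name |
|---|---|---|
| 9 | () | parentheses |
| 8 | + | unary plus |
| 8 | - | unary minus |
| 7 | ! | factorial |
| 7 | ° | degrees to radians conversion |
| 6 | % | modulus |
| 5 | ^ | power |
| 4 | * | multiplication |
| 4 | / | division |
| 3 | + | addition |
| 3 | - | subtraction |
| 2 | = | assign |
| 1 | , | sequence |

Figure 2: Supported operator levels.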

Let's take an example of the priorities with the = operator. As it has almost the lowest level, it assigns to the variable (left operand) the result of the whole expression on its right. Be careful with this:

Figure 3: Be careful with operator priorities.

All the functions presented below work in the same way. First, each one calls the function of the next higher level and gets back its result. Then, m_CurrentToken is tested in a switch statement. If the token is recognized by the function, the associated operation is performed; otherwise, the function returns to its lower-level caller.
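The pattern described above can be sketched with a deliberately reduced parser. This is not the VisualCalc code: MiniParser is a hypothetical class that keeps only two binary levels (addition/subtraction, multiplication/division), a Primary() function and a GetToken() function, and it reports errors with std::runtime_error instead of the parser's own exception classes.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Simplified stand-in for the real parser: two binary levels plus Primary().
class MiniParser {
public:
    long double Evaluate(const std::string& source) {
        m_Source = source;
        m_Index = 0;
        GetToken();                                   // read the first token
        const long double result = Level_1();
        if (m_Token != TV_END) throw std::runtime_error("unexpected token");
        return result;
    }

private:
    enum TokenValue { TV_NUMBER, TV_PLUS, TV_MINUS, TV_MUL, TV_DIV, TV_LP, TV_RP, TV_END };

    // Highest level: skips white space and reads one token from m_Source.
    void GetToken() {
        while (m_Index < m_Source.size() &&
               std::isspace(static_cast<unsigned char>(m_Source[m_Index]))) {
            ++m_Index;
        }
        if (m_Index >= m_Source.size()) { m_Token = TV_END; return; }
        const char c = m_Source[m_Index];
        if (std::isdigit(static_cast<unsigned char>(c)) || c == '.') {
            const char* start = m_Source.c_str() + m_Index;
            char* end = nullptr;
            m_Number = std::strtold(start, &end);
            if (end == start) throw std::runtime_error("malformed number");
            m_Index += static_cast<std::size_t>(end - start);
            m_Token = TV_NUMBER;
            return;
        }
        ++m_Index;
        switch (c) {
            case '+': m_Token = TV_PLUS;  break;
            case '-': m_Token = TV_MINUS; break;
            case '*': m_Token = TV_MUL;   break;
            case '/': m_Token = TV_DIV;   break;
            case '(': m_Token = TV_LP;    break;
            case ')': m_Token = TV_RP;    break;
            default: throw std::runtime_error("unknown character");
        }
    }

    // Lowest level: addition and subtraction.
    long double Level_1() {
        long double left = Level_2();                 // call the upper level first
        for (;;) {
            switch (m_Token) {
                case TV_PLUS:  GetToken(); left += Level_2(); break;
                case TV_MINUS: GetToken(); left -= Level_2(); break;
                default: return left;                 // not ours: back to the caller
            }
        }
    }

    // Multiplication and division.
    long double Level_2() {
        long double left = Primary();
        for (;;) {
            switch (m_Token) {
                case TV_MUL: GetToken(); left *= Primary(); break;
                case TV_DIV: {
                    GetToken();
                    const long double divisor = Primary();
                    if (divisor == 0) throw std::runtime_error("division by zero");
                    left /= divisor;
                    break;
                }
                default: return left;
            }
        }
    }

    // Literals, unary signs and parenthesized expressions.
    long double Primary() {
        switch (m_Token) {
            case TV_NUMBER: { const long double v = m_Number; GetToken(); return v; }
            case TV_MINUS: GetToken(); return -Primary();
            case TV_PLUS:  GetToken(); return Primary();
            case TV_LP: {
                GetToken();
                const long double v = Level_1();      // a whole new expression
                if (m_Token != TV_RP) throw std::runtime_error("missing )");
                GetToken();
                return v;
            }
            default: throw std::runtime_error("primary expected");
        }
    }

    std::string m_Source;
    std::size_t m_Index;
    TokenValue  m_Token;
    long double m_Number;
};
```

With this sketch, evaluating "2 + 3 * 4" reaches the multiplication before the addition simply because Level_1() asks Level_2() for each of its operands, so no explicit priority table is needed.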

Evaluate() is the only public function the user interface can call to parse a mathematical expression. It initializes the parser members for a new calculation. It also ensures that the last answer (formula + result) is pushed to the front of the answers history deque.
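The history handling can be pictured with the following sketch. The layout of AnswerItem, the AnswersHistory class, the size limit and the convention that index 1 is the most recent answer are all assumptions made for illustration; the real definitions live in VCalcParserTypes.h and in the parser class.

```cpp
#include <cstddef>
#include <deque>
#include <string>

typedef long double VALUES_TYPE;

// Hypothetical layout: a formula together with its result.
struct AnswerItem {
    std::string formula;
    VALUES_TYPE result;
};

// Hypothetical helper around a deque of answers.
class AnswersHistory {
public:
    explicit AnswersHistory(std::size_t maxSize) : m_maxSize(maxSize) {}

    // The newest answer goes to the front; the oldest one is dropped
    // when the (assumed) size limit is exceeded.
    void Push(const std::string& formula, VALUES_TYPE result) {
        AnswerItem item;
        item.formula = formula;
        item.result = result;
        m_dqeAnswersHistory.push_front(item);
        if (m_dqeAnswersHistory.size() > m_maxSize) m_dqeAnswersHistory.pop_back();
    }

    // 1-based access (assumed): Get(1) is the most recent answer.
    // std::deque::at throws std::out_of_range for an invalid index.
    const AnswerItem& Get(std::size_t n) const { return m_dqeAnswersHistory.at(n - 1); }

    std::size_t Size() const { return m_dqeAnswersHistory.size(); }

private:
    std::size_t m_maxSize;
    std::deque<AnswerItem> m_dqeAnswersHistory;
};
```

A deque suits this use because both insertion at the front and removal at the back are cheap.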

VALUES_TYPE Level_1(void) throw(CVCalcParserException);

This function is the lowest-level one. It manages addition and subtraction by adding the value returned by Level_2() to its left operand, or subtracting it. It also handles the case where the user tries to use the assign operator on a literal.

VALUES_TYPE Level_2(void) throw(CVCalcParserException);

This function manages multiplication and division. It multiplies/divides its left operand by the value returned by Level_2() (we can see here the recursive call). An exception is thrown if the divisor of a division is 0.

VALUES_TYPE Level_3(void) throw(CVCalcParserException);

This function manages the power operation. It raises its left operand to the power of the value returned by Level_4(). If the user asks for 0^0, a warning is raised to say that the result has been replaced by 1: I wanted to notify the user that a special operation was requested from the calculator. In the current version 2.24 of VisualCalc, the following cases are still not managed:

infinite^0 replaced by 1,

undef^0 replaced by 1,

1^infinite replaced by 1,

1^undef replaced by 1.

In general, positive and negative infinities and NaN cases are not supported yet.
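The 0^0 case can be sketched as follows. The PowerResult structure and the PowerWithWarning() function are hypothetical names; in the parser itself, the warning is carried by the m_bWarningFlag and m_strWarningMsg members, and the exact message text below is invented.

```cpp
#include <cmath>
#include <string>

typedef long double VALUES_TYPE;

// Hypothetical result type: a value plus an optional, non-blocking warning.
struct PowerResult {
    VALUES_TYPE value;
    bool warning;
    std::string message;
};

// Raises base to exponent, replacing 0^0 by 1 and flagging it.
PowerResult PowerWithWarning(VALUES_TYPE base, VALUES_TYPE exponent) {
    PowerResult r;
    r.warning = false;
    if (base == 0 && exponent == 0) {
        r.value = 1;
        r.warning = true;
        r.message = "0^0 is undefined; replaced by 1";  // invented wording
        return r;
    }
    r.value = std::pow(base, exponent);
    return r;
}
```

The result stays printable in both cases; the caller only decides whether to show the message.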

VALUES_TYPE Level_4(void) throw(CVCalcParserException);

This function manages the modulus operation. It returns the modulus of the division of its left operand by the value returned by Level_5().

VALUES_TYPE Level_5(void) throw(CVCalcParserException);

This function manages the two postfix unary operators currently implemented by parser version 2.1: the ! and the ° operators.

° is an operator that converts degrees to radians. Its main use is in trigonometric functions, which need radians as parameters, and it is a shortcut for rad(). For example: cos(45°) = cos(rad(45)) = cos(pi/4).
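For reference, the two postfix operations boil down to very little code. The sketch below uses hypothetical function names, takes the factorial argument as an unsigned integer (how the real parser treats non-integer or negative operands is not covered here), and converts with radians = degrees * pi / 180.

```cpp
#include <cmath>

typedef long double VALUES_TYPE;

const VALUES_TYPE kPi = 3.14159265358979323846264338327950288L;

// Effect of the postfix ° operator (and of rad()).
inline VALUES_TYPE DegToRad(VALUES_TYPE degrees) { return degrees * kPi / 180; }

// Inverse conversion, as done by deg().
inline VALUES_TYPE RadToDeg(VALUES_TYPE radians) { return radians * 180 / kPi; }

// Effect of the postfix ! operator on a non-negative integer.
inline VALUES_TYPE Factorial(unsigned n) {
    VALUES_TYPE result = 1;
    for (unsigned i = 2; i <= n; ++i) result *= i;
    return result;
}

// What an expression such as cos(x°) amounts to.
inline VALUES_TYPE CosOfDegrees(VALUES_TYPE degrees) { return std::cos(DegToRad(degrees)); }
```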

VALUES_TYPE Primary(void) throw(CVCalcParserException);

One of the biggest functions of the parser. As it sits just below the highest-level function, it has a lot of work to do. Importantly, it doesn't manage any operator that needs both a left and a right operand. It calls the function at the higher level (GetToken()) but doesn't use the returned value as a left operand. It consists of switch statements that test the following possible values of m_CurrentToken:

TV_NUMBER

Returns the literal extracted from the input string.

TV_IDENTIFIER

It first tries to recognize the identifier in the functions list. If it is found, some tests are performed to check that the user is calling the function properly (with parentheses, with the correct number of parameters for the function called, and without using the name of a function as a variable). Then, it calls the associated function and returns its value.

Secondly, the identifier may be a variable name (already existing, or just being created). If the identifier is being assigned (the next token is the = operator), the parser checks that the user is not trying to assign a constant (pi or e), and then either stores the value in the variables table or returns an error. If the user just recalls a value previously stored in the identifier, the value is returned to the lower-level calling function. If the variable is not found, an exception is thrown. Implicit multiplication of a variable with an expression is not allowed either. For example, these cases throw an exception:

2x instead of 2 * x.

foo(3!) instead of foo * (3!).

The functions available for the user in VisualCalc are detailed in the Help Dialog Box, in the Functions tab.

TV_PLUS/TV_MINUS

Returns the next Primary() with the associated sign.

TV_LP

This token is returned when an opening parenthesis is found to begin a new entire expression (not a function parameters list). Level_1() is called to evaluate the expression until the closing parenthesis is found on the input stream, otherwise, an exception is thrown.
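The variable handling described under TV_IDENTIFIER above can be sketched with a std::map. VariableTable is a hypothetical class; whether the real parser stores pi and e in m_mapVariables or handles them separately is an assumption, and std::runtime_error stands in for the parser's own exceptions.

```cpp
#include <map>
#include <stdexcept>
#include <string>

typedef long double VALUES_TYPE;

// Hypothetical wrapper around the variables table.
class VariableTable {
public:
    VariableTable() {
        // Assumption: the constants live in the same table as user variables.
        m_mapVariables["pi"] = 3.14159265358979323846264338327950288L;
        m_mapVariables["e"]  = 2.71828182845904523536028747135266250L;
    }

    // Creates or modifies a variable; constants cannot be reassigned.
    void Assign(const std::string& name, VALUES_TYPE value) {
        if (IsConstant(name)) throw std::runtime_error("cannot assign a constant: " + name);
        m_mapVariables[name] = value;
    }

    // Recalls a stored value; unknown identifiers raise an error.
    VALUES_TYPE Recall(const std::string& name) const {
        std::map<std::string, VALUES_TYPE>::const_iterator it = m_mapVariables.find(name);
        if (it == m_mapVariables.end()) throw std::runtime_error("unknown variable: " + name);
        return it->second;
    }

private:
    static bool IsConstant(const std::string& name) { return name == "pi" || name == "e"; }

    std::map<std::string, VALUES_TYPE> m_mapVariables;
};
```

Since std::map keeps its keys sorted, iterating over such a table yields the variable names in alphabetical order.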

TokenValue GetToken(void) throw(CVCalcParserException);

It is the most important function of the parser, and the highest-level one. It reads the characters from the input stream one by one, ignoring white space. We can type space/tab characters between significant characters without changing the meaning of the expression typed. Then, the usual switch tests the following cases:

*, /, +, -, ^, %, !, °, (, ), =, ,

Returns the token related to the extracted character. The token read is also stored in m_CurrentToken.

., 1, 2, 3, 4, 5, 6, 7, 8, 9, 0

Such characters can result in GetToken() returning a constant number if the characters following make a valid integer/floating point number. If two dots ('.') are found in the same number, the related exception is thrown. The number is stored in m_valNumberValue.

If a character that can start an identifier (such as a letter) is found in the stream, the identifier that follows is extracted and stored in m_strIdentifierValue.

bool StepIndexForward(void) throw();

This function was created to factor out a piece of code called several times by GetToken(). Its only goal is to increment the index used to read the string being parsed, and to signal that the end of the string has been reached by setting the m_bEndEncountered flag.
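A possible shape for this helper is shown below. InputCursor is a hypothetical class that borrows the member names used above; the meaning of the returned bool (true while a character remains to be read) is an assumption, since only the side effects are described here.

```cpp
#include <string>

// Hypothetical holder for the reading position of the input string.
class InputCursor {
public:
    explicit InputCursor(const std::string& source)
        : m_Source(source), m_iCurrentIndex(0), m_bEndEncountered(source.empty()) {}

    // Moves to the next character. Returns false and sets the end flag when
    // there is nothing left to read, so the index stays within 0..length-1.
    bool StepIndexForward() {
        if (m_bEndEncountered) return false;
        if (m_iCurrentIndex + 1 >= static_cast<int>(m_Source.length())) {
            m_bEndEncountered = true;
            return false;
        }
        ++m_iCurrentIndex;
        return true;
    }

    int Index() const { return m_iCurrentIndex; }
    bool EndEncountered() const { return m_bEndEncountered; }

private:
    std::string m_Source;
    int m_iCurrentIndex;
    bool m_bEndEncountered;
};
```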

As an error can occur anywhere and at any time while the parser is working, exceptions were the natural language feature to use. The parser can throw seven types of exceptions, each inherited from CVCalcParserException. In the diagram below, the green and blue classes are abstract, which means they cannot be instantiated directly in the code:

Figure 4: Parser exceptions hierarchy.

CVCalcParserException actually provides the public member functions to be used when catching an exception (GetExceptionNumber(), GetMessage(), GetErrorPos()). Then, each exception has its own class which inherits necessarily from one of the seven exception groups (CSyntaxException, CMathematicException, CFunctionException, CParameterException, CVariableException, CDomainException, CParserException).
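The shape of such a hierarchy can be sketched as follows. The classes are simplified stand-ins placed in a hypothetical sketch namespace: the accessor names come from the text above, but their return types, the constructors, the concrete CDivisionByZeroSketch class and its error number are all invented. A protected constructor is used here as one simple way to prevent direct instantiation of the base and group classes.

```cpp
#include <string>

namespace sketch {

// Root of the hierarchy; cannot be instantiated directly.
class CVCalcParserException {
public:
    virtual ~CVCalcParserException() {}
    int GetExceptionNumber() const { return m_iNumber; }
    const std::string& GetMessage() const { return m_strMessage; }
    int GetErrorPos() const { return m_iErrorPos; }

protected:
    CVCalcParserException(int number, const std::string& message, int errorPos)
        : m_iNumber(number), m_strMessage(message), m_iErrorPos(errorPos) {}

private:
    int m_iNumber;
    std::string m_strMessage;
    int m_iErrorPos;
};

// One of the seven exception groups; also not instantiable directly.
class CMathematicException : public CVCalcParserException {
protected:
    CMathematicException(int number, const std::string& message, int errorPos)
        : CVCalcParserException(number, message, errorPos) {}
};

// Hypothetical concrete exception (name and number invented for the sketch).
class CDivisionByZeroSketch : public CMathematicException {
public:
    explicit CDivisionByZeroSketch(int errorPos)
        : CMathematicException(1, "division by zero", errorPos) {}
};

} // namespace sketch
```

A caller can then catch a whole group (for example all mathematical errors) or the root class, depending on how fine-grained the error handling needs to be.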

The parser itself doesn't edit the controls of the graphical interface. The dialog box calls the public functions that return the current list of states.

Here are the four main areas:

The typing and the result edit boxes.

This is where the user writes the expression to be parsed and where the result appears. When an error/warning message is about to be displayed, it is printed in the result field.

The last answers/formulas list boxes.

When VisualCalc returns a correct answer (without error), the value and the formula typed are added to their respective lists. The most recent answers are inserted at the top of the list. It is possible to recall a previously calculated result by double-clicking on it; it will be inserted at the cursor's position, or will replace the selection if some characters are selected in the input edit box. It is also possible to switch from one list to the other by clicking their button.

Figure 5: Last answers switch button.

The variables list box.

When a user assigns a new identifier, the name of the variable is added to the list. If an existing variable is only modified, its value is updated internally, but there is no change in this window. The variables are stored in alphabetical order. Here again, it is possible to re-use a variable in a new formula by double-clicking on its name. It will be added at the cursor's position or will replace the selected characters in the input edit box. A button has been added for the variables since v2.23 to switch between the variable names and their values.

Figure 6: Variables switch button.

The functions list box

This feature is added to allow users to use the common mathematical functions. When you need a function, you can either click on its name (it is inserted at the cursor's position) or type the identifier. Be careful, the parser is case-sensitive! cos() is not the same as Cos().

Here are the functions implemented:

| Functions | Use | Description |
|---|---|---|
| abs | abs(expr) | returns the absolute value of the expression. |
| Acos | Acos(expr) | returns the arc cosine of the expression. The result is in radians. |
| Ans | Ans(expr) | returns an answer in the history. expr must be between 1 and the number of answers in the list. |
| Asin | Asin(expr) | returns the arc sine of the expression. The result is in radians. |
| Atan | Atan(expr) | returns the arc tangent of the expression. The result is in radians. |
| cos | cos(expr) | returns the cosine of the expression. expr is expected in radians. |
| cosh | cosh(expr) | returns the hyperbolic cosine of the expression. expr is expected in radians. |
| deg | deg(expr) | returns the equivalent in degrees of the expression. expr is expected in radians. |
| exp | exp(expr) | returns the exponential of the expression. |
| ln | ln(expr) | returns the natural (Napierian, base-e) logarithm of the expression. |
| log | log(expr) | returns the decimal (base-10) logarithm of the expression. |
| logn | logn(expr, n) | returns the base-n logarithm of the expression. |
| nAp | nAp(n, p) | returns the number of arrangements of p elements among n. |
| nCp | nCp(n, p) | returns the number of combinations of p elements among n. |
| rad | rad(expr) | returns the equivalent in radians of the expression. expr is expected in degrees. The same as the ° operator. |
| sin | sin(expr) | returns the sine of the expression. expr is expected in radians. |
| sinh | sinh(expr) | returns the hyperbolic sine of the expression. expr is expected in radians. |
| sqrt | sqrt(expr) | returns the square root of the expression. |
| sqrtn | sqrtn(expr, n) | returns the n-order root of the expression. |
| tan | tan(expr) | returns the tangent of the expression. expr is expected in radians. |
| tanh | tanh(expr) | returns the hyperbolic tangent of the expression. expr is expected in radians. |
| sum | sum(expr(var), var, low, high) | returns the sum of the expression when var goes from low to high. var must be an integer (not implemented yet). |
| product | product(expr(var), var, low, high) | returns the product of the expression when var goes from low to high. var must be an integer (not implemented yet). |

Figure 7: Functions interfaced to the users.
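Some of the less common functions listed above reduce to short formulas: nAp(n, p) = n! / (n - p)!, nCp(n, p) = n! / (p! (n - p)!), logn(expr, n) = ln(expr) / ln(n), and sqrtn(expr, n) = expr^(1/n). The sketch below shows one way to compute them; the function names are hypothetical, the arguments of the counting functions are taken as unsigned integers, and returning 0 when p exceeds n is an assumption (the real parser may well raise an error instead).

```cpp
#include <cmath>

typedef long double VALUES_TYPE;

// nAp: n * (n-1) * ... * (n-p+1), avoiding the computation of large factorials.
VALUES_TYPE Arrangements(unsigned n, unsigned p) {
    if (p > n) return 0;                       // assumed behaviour
    VALUES_TYPE result = 1;
    for (unsigned i = 0; i < p; ++i) result *= (n - i);
    return result;
}

// nCp: built step by step so that every intermediate value is itself a
// binomial coefficient, which keeps the numbers small.
VALUES_TYPE Combinations(unsigned n, unsigned p) {
    if (p > n) return 0;                       // assumed behaviour
    if (p > n - p) p = n - p;                  // symmetry: C(n, p) == C(n, n-p)
    VALUES_TYPE result = 1;
    for (unsigned i = 1; i <= p; ++i) result = result * (n - p + i) / i;
    return result;
}

// logn: change of base.
VALUES_TYPE LogN(VALUES_TYPE expr, VALUES_TYPE n) { return std::log(expr) / std::log(n); }

// sqrtn: n-order root of a non-negative expression.
VALUES_TYPE RootN(VALUES_TYPE expr, VALUES_TYPE n) { return std::pow(expr, 1 / n); }
```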

I added a set of OnFocus() calls at the end of each handling function to avoid the user using his/her mouse too much. This way, when you finish writing the expression to be calculated, you just have to push Enter on your keyboard. If an error occurs, the cursor will be placed at the error position (when possible), otherwise the entire input text will be selected. This also happens when you double-click on the (last answers/variables) lists or on the Erase (clear list) buttons.

All the operators and their syntax are summed up in the Help dialog box, in the relevant tabs. Use the Help button or the system menu (Alt+SpaceBar) to open it.

This dialog presents, in tabbed views, the different functionalities available to the user. The three existing tabs show the functions, the operators and the error codes, as shown below:

Figure 8: Help dialog box tabs.

This dialog is provided as an informative tooltip for those who need to find quickly the functions and the operators available, their syntax and their meaning without opening the parser doc. The error codes are also summarized with an overview of the different error cases that may occur.

Here also, the first thing to do is to include the parser's header in your source (still after stdafx.h if compiling in an MFC project that uses precompiled headers):

#include "stdafx.h"
#define VCALCPARSER_DLL
#include "VCalcParser.h"

- Hey, but what is this macro defined here?

Well, the VCALCPARSER_DLL macro has been defined to ease the reuse of the header in either a "DLL import" or a "source code" context. It tells the compiler, through a set of internal __declspec(dllimport) declarations, that you import the symbols from a DLL instead of defining them yourself. As this macro also makes your project link against VCalcParser.lib, you have to make that file available to your project, either by copying it into your project's sources folder or by changing your project settings.

One last point when using the DLL method is to set the following settings in your project:

Once the parser is inserted in your project, whichever way you chose, you can use it as you like: as a global object (never say I told you to!), as a local variable in the function that needs to get the result of a formula, or as a member of a class (much appreciated - I personally declare a CVCalcParser m_Parser in my CVisualCalcDlg dialog box class). You don't even have to initialize it in your constructor (or anywhere else).

When you have to start the parser, just call its CVCalcParser::Evaluate(std::string) member and pass to it the string to parse.

This could be simply done as follows:

StrDest.Format("%f", m_Parser.Evaluate(" 2 + 2 - 1 "));

One caveat, however: the parser can throw exceptions, which have to be caught so that they do not terminate your program. So, there must be at least a basic try/catch statement around the previous instruction. Ideally, you would handle each exception category as well as the warnings.
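A basic guard around the call could look like the sketch below. To keep it self-contained, it does not include VCalcParser.h: the two classes in the sketch namespace are minimal stand-ins that only mimic the names used in this article (the stub "parser" merely converts a single number), and EvaluateForDisplay() is a hypothetical helper. With the real parser, only the try/catch structure would carry over.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

namespace sketch {

// Stand-in for the parser's base exception class.
class CVCalcParserException {
public:
    CVCalcParserException(const std::string& message, int errorPos)
        : m_strMessage(message), m_iErrorPos(errorPos) {}
    const std::string& GetMessage() const { return m_strMessage; }
    int GetErrorPos() const { return m_iErrorPos; }
private:
    std::string m_strMessage;
    int m_iErrorPos;
};

// Stand-in for the parser: accepts a single number and throws otherwise.
class CVCalcParser {
public:
    long double Evaluate(const std::string& formula) {
        const char* start = formula.c_str();
        char* end = nullptr;
        const long double value = std::strtold(start, &end);
        if (end == start || *end != '\0') {
            throw CVCalcParserException("not a number", static_cast<int>(end - start));
        }
        return value;
    }
};

} // namespace sketch

// Returns the text to show in the result field, whether the call succeeds or fails.
std::string EvaluateForDisplay(sketch::CVCalcParser& parser, const std::string& formula) {
    std::ostringstream out;
    try {
        out << parser.Evaluate(formula);
    } catch (const sketch::CVCalcParserException& e) {
        out << "Error at position " << e.GetErrorPos() << ": " << e.GetMessage();
    }
    return out.str();
}
```

Catching the base class by const reference is the minimum; more specific catch blocks for each exception group can be placed before it when the interface needs to react differently to each kind of error.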

VisualCalc has grown from my version 1.0 (never distributed) to the current version 3.0, passing through some private releases. It has become a modular project that can easily be inserted into a larger calculation project. It also provides, as the subtitle says, a starting point for writing a recursive-descent parser.

So, I'd like to thank all of you who contributed in little and big ways to this project with your questions, suggestions, help, etc. (thank you Cedric Moonen for your very valuable help on the /MD compiler option, and thank you VuNic for having pointed me in the right direction in the darkness of DLLs). I invite all of you to continue your participation in this project by providing new ideas, new features, and alternative implementations too...


About the Author

Toxcct is an electronics guy who fell in love with programming at the age of 10 when he discovered C to play with Texas-Instruments calculators.

A few years later, he discovered "The C++ Programming Language" by Bjarne Stroustrup; a true transformation in his life.

Now, toxcct is experiencing the Web by developing Siebel CRM Applications for a living. He also respects very much the Web Standards (YES, a HTML/CSS code MUST validate !), and plays around with HTML/CSS/Javascript/Ajax/PHP and such.

_____

After four years of service as a Codeproject MVP, toxcct is now taking some distance as he doesn't like how things are going on the forums. He particularly doesn't accept how some totally ignorant people got the MVP Reward by only being arrogant and insulting while replying on the technical forums.

d'you know what ?! i was already working on this... but thanks for the suggestion. actually, i'm thinking of an entire namespace for the Calculator engine (Lexer, Parser, Exceptions...), but as it is a whole rewriting job, it will be for a later version.
humm, tell me, did you read the "Future features" section ???

Another thing i did not really take time to think about (and maybe you could tell me good points to see), is the way the edit control displays the answers. i'd like to display more significant figures (i.e., pi is known as a 32 digits number, but when you display it directly, only 5 or 6 decimals are shown...) or fewer (i.e. round numbers...). any idea ?

would have returned the value of myvar as = 10. Instead of assigning , how about being able to figure out an unknown value in this situation, so that it comes out like this (I don't think you need an explanation of grade-school algebra, I'm just trying to make it clear what I'm talking about):

2 + myvar = 5 + 7
(2 + myvar) - 2 = (5 + 7) - 2
myvar = 10

I don't know if this is related to the post you challenged me on earlier, but think how helpful this would be when actually looking for the value of something. I know you could make it work with something a lot more complex, unless that's something you want me to try to do?...


Somewhere around version 5.0 of Visual C++, they forced "long double" to compile exactly the same as "double". So, "long double" still compiles, but you won't get the added precision of the extra 16 bits. However, the coprocessor still uses 80-bit calculations internally, so your 64-bit storage is truncated from an 80-bit representation in the coprocessor. The idea is that you get the benefit of 80-bit calculations without the extra storage requirement.

> Somewhere around version 5.0 of Visual C++, they forced "long double" to compile exactly the same as "double"
>However, the coprocessor stills uses 80-bit calculations internally, so your 64-bit storage is truncated from an 80-bit representation in the coprocessor. The idea is that you get the benefit of 80-bit calculations without the extra storage requirement.

This is true only if no temporaries are created. In practice, temporaries are created very frequently, especially in C++, and you'll probably also write intermediate results in your code. Complex calculations will almost always gain significantly more accurate results if you use 80-bit reals. Microsoft did the scientific community a grave disservice when they removed support for long doubles from their compiler.
And the .NET virtual machine doesn't even have instructions for long doubles. The virtual machine is less powerful than the physical one it's running on! What a joke.

Other compilers do support long doubles -- eg. Intel C++, which can act as a drop-in replacement for MSVC.
Digital Mars C++ (www.digitalmars.com) takes engineering/scientific needs seriously and has extensive support for long doubles.

The following is taken from the latest MSDN library (URL http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_long_double.asp)

------------------------------------------
Long Double

Previous 16-bit versions of Microsoft C/C++ and Microsoft Visual C++ supported the long double, 80-bit precision data type. In Win32 programming, however, the long double data type maps to the double, 64-bit precision data type. The Microsoft run-time library provides long double versions of the math functions only for backward compatibility. The long double function prototypes are identical to the prototypes for their double counterparts, except that the long double data type replaces the double data type. The long double versions of these functions should not be used in new code.
------------------------------------------

They *may* actually allow you to perform 'long double' calculations in intermediate steps (non-coerced casts maybe?), but it's pretty clear from the above that Bill doesn't want us using long doubles. That stinks because you do get a couple of decimal digits precision beyond doubles.

At any rate, using the full precision of the FPU registers (80bit) doesn't necessarily mean that you get more exact results. Since IEEE floating point representation is inexact in itself, and moreso operations involving floating point numbers, the additional digits may not be meaningful at all.

If you find that you need the extra precision you are probably better off using an arbitrary precision math library. I assume that's what Microsoft did in their calc.exe application that ships with Windows.

To be more precise, I meant that they removed support for 80-bit reals. (It would even be standard-compliant to make long double=double=float).

> At any rate, using the full precision of the FPU registers (80bit) doesn't necessarily mean that you get more exact results. Since IEEE floating point representation is inexact in itself, and moreso operations involving floating point numbers, the additional digits may not be meaningful at all.

True, but it's no different to using doubles instead of floats. There are no guarantees, but you will very often get more exact results. Note that I am *not* talking about reporting extra digits in the final result, just about avoiding roundoff in the intermediate steps.

> If you find that you need the extra precision you are probably better off using an arbitrary precision math library.

A good point, that would often be true, but doing so has many consequences. It could slow your code down by orders of magnitude, which may not be acceptable.
To change your intermediate variables to 'long double' is trivial, with no effect on your source code, and has little effect on execution speed (it has twice as many memory accesses, but is otherwise identical in speed to double).
The point is that 80-bit reals give you a precision increase in many cases for negligible effort. There's no excuse for crippling the FPU.

> I assume that's what Microsoft did in their calc.exe application that ships with Windows.

I think you're right. Although the number of digits is not very much larger than for long double, they are definitely not using the FPU.
Try this: type in 1e20000.
Then hit x^2 a few times.
Looks like a junior programmer did this one. Amazing how you can do calculations in your head so much faster than a 2GHz P4.

(8) There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double.

yeah i know, it's been a long time, and maybe you already noticed that the article had changed (and is still about to change in a few days), but i wanted to notify you that, even after what has been discussed here, i now use long doubles, hidden behind the VALUES_TYPE typedef.

this was made for supporting future type changing, such as - and i'm still looking for a good one - a big floating point type library...