Mathematical Expression Parsers in Java and C++

When writing your own calculator, you need a converter that can transform an input mathematical expression such as ( 1 + 8 ) - ( ( 3 * 4 ) / 2 ) into a format better suited to evaluation by a computer.

Expressions such as the one above are written in "infix notation". Evaluating them seems simple and intuitive to us humans, but it is usually not so straightforward to implement in a programming language.

The shunting-yard algorithm is a method for converting mathematical expressions written in infix notation into Reverse Polish Notation (RPN). RPN differs from infix notation in that every operator (+, -, * etc.) comes after its operands (numbers), and there are no parentheses (brackets).

So ( 3 * 4 ) for example becomes 3 4 *.

Given an input string in Reverse Polish Notation, a simple stack-based algorithm can then evaluate the expression and determine the mathematical result: numbers are pushed onto the stack, and each operator pops its two operands and pushes the result back.
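The stack-based evaluation can be sketched in Java as follows. This is a minimal illustration rather than the article's own listing: the class and method names are hypothetical, and it assumes whitespace-separated tokens and only the binary operators +, -, *, / and ^.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RpnEvaluatorSketch {
    // Evaluates a whitespace-separated RPN string such as "3 4 *".
    static double evaluate(String rpn) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : rpn.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/": case "^": {
                    if (stack.size() < 2) {
                        throw new IllegalArgumentException("Too few operands for " + token);
                    }
                    double right = stack.pop(); // the second operand is on top
                    double left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    // Anything that is not an operator must parse as a number.
                    stack.push(Double.parseDouble(token));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("Malformed RPN expression: " + rpn);
        }
        return stack.pop();
    }

    static double apply(String op, double left, double right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            case "/": return left / right;
            default:  return Math.pow(left, right); // "^"
        }
    }

    public static void main(String[] args) {
        // ( 1 + 8 ) - ( ( 3 * 4 ) / 2 ) written in RPN
        System.out.println(evaluate("1 8 + 3 4 * 2 / -")); // prints 3.0
    }
}
```

Popping the right operand first matters for the non-commutative operators: for "8 2 /" the top of the stack is 2, so the division is 8 / 2 and not 2 / 8.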

The Shunting-yard algorithm

This pseudocode shows how the shunting-yard algorithm converts an expression given in conventional infix notation into Reverse Polish Notation:

For each token
{
    If (token is a number)
    {
        Add number to the output queue
    }
    If (token is an operator eg +,-,*...)
    {
        While (stack not empty AND
               stack top element is an operator AND
               ((token = left associative AND
                 precedence <= stack top element) OR
                (token = right associative AND
                 precedence < stack top element)))
        {
            Pop stack onto the output queue
        }
        Push token onto stack
    }
    If (token is left bracket '(')
    {
        Push token on to stack
    }
    If (token is right bracket ')')
    {
        While (stack not empty AND
               stack top element not a left bracket)
        {
            Pop the stack onto output queue
        }
        If (stack is empty)
        {
            Report mismatched brackets and stop
        }
        Pop the left bracket off the stack and discard it
    }
}
While (stack not empty)
{
    Pop stack onto output queue
}

Implementing the shunting-yard algorithm in Java

The Java function to implement the shunting yard algorithm described above is as follows:
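A minimal sketch of such a function is given here under stated assumptions: tokens are separated by whitespace, only +, -, *, / and ^ are supported, and the class and method names are illustrative rather than taken from the original listing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShuntingYardSketch {
    // Operator precedences; "^" is the only right-associative operator here.
    private static final Map<String, Integer> PRECEDENCE = new HashMap<>();
    static {
        PRECEDENCE.put("+", 1);
        PRECEDENCE.put("-", 1);
        PRECEDENCE.put("*", 2);
        PRECEDENCE.put("/", 2);
        PRECEDENCE.put("^", 3);
    }

    static boolean isOperator(String token) {
        return PRECEDENCE.containsKey(token);
    }

    static boolean isRightAssociative(String token) {
        return token.equals("^");
    }

    // Converts a whitespace-separated infix expression into a list of RPN tokens.
    static List<String> infixToRpn(String infix) {
        List<String> output = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String token : infix.trim().split("\\s+")) {
            if (isOperator(token)) {
                // Pop operators that must be applied before the incoming one.
                while (!stack.isEmpty() && isOperator(stack.peek())) {
                    int incoming = PRECEDENCE.get(token);
                    int top = PRECEDENCE.get(stack.peek());
                    boolean popTop = isRightAssociative(token) ? incoming < top : incoming <= top;
                    if (!popTop) {
                        break;
                    }
                    output.add(stack.pop());
                }
                stack.push(token);
            } else if (token.equals("(")) {
                stack.push(token);
            } else if (token.equals(")")) {
                while (!stack.isEmpty() && !stack.peek().equals("(")) {
                    output.add(stack.pop());
                }
                if (stack.isEmpty()) {
                    throw new IllegalArgumentException("Unmatched ')'");
                }
                stack.pop(); // discard the matching "("
            } else {
                Double.parseDouble(token); // throws NumberFormatException if not a number
                output.add(token);
            }
        }
        while (!stack.isEmpty()) {
            String top = stack.pop();
            if (top.equals("(")) {
                throw new IllegalArgumentException("Unmatched '('");
            }
            output.add(top);
        }
        return output;
    }

    public static void main(String[] args) {
        // prints 1 2 + 3 4 / * 5 6 + -
        System.out.println(String.join(" ", infixToRpn("( 1 + 2 ) * ( 3 / 4 ) - ( 5 + 6 )")));
    }
}
```

The returned tokens can then be handed to a stack-based RPN evaluator to obtain the numeric result. Both bracket checks throw instead of popping from an empty stack, so a missing opening or closing bracket is reported rather than causing a crash.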

Running the Java code as a simple Windows console application with the example input expression ( 1 + 2) * ( 3 / 4 )-(5+6) gives the RPN tokens 1 2 + 3 4 / * 5 6 + - and the mathematical result -8.75.

If an invalid input expression is used, one with mismatched left and right parentheses such as ( 1 + 2 * ( 3 / 4 )-(5+6), then the parser reports an error instead of a result.
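One way to catch this kind of mismatch before any conversion takes place is a simple depth counter. This is a hedged sketch with hypothetical names; the article's own code may detect the problem differently, for example during the conversion itself.

```java
class ParenthesesCheckSketch {
    // Returns true when every ')' closes an earlier '(' and none are left open.
    static boolean isBalanced(String expression) {
        int depth = 0;
        for (char c : expression.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // a ')' appeared before its '('
                }
            }
        }
        return depth == 0;
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("( 1 + 2) * ( 3 / 4 )-(5+6)")); // prints true
        System.out.println(isBalanced("( 1 + 2 * ( 3 / 4 )-(5+6)"));  // prints false
    }
}
```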

And if we insert an additional minus sign '-' so that the expression is now -( 1 + 2) * ( 3 / 4 )-(5+6), the algorithm recognizes that the leading minus should be treated as a unary minus operator and evaluates the expression to -13.25.

So far our implementation handles basic expressions built from the standard mathematical operators +, -, *, / etc. The following downloadable C++ code makes a number of further improvements to handle additional mathematical functions such as sin, cos, tan, log and exp, as well as much more complicated subexpressions.

Additional sanity checks are included to make sure the parentheses are not mismatched, along with some extra work on tokenization to handle unary minus operators occurring at positions where an expression is expected, e.g. -8 + 5 = -3 or 11 ^ -7 = 5.13158e-08.
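To illustrate the unary minus rule, here is a hedged Java sketch of a tokenizer, not the downloadable C++ code. It assumes a minus is unary when it appears at the start of the expression, after another operator, or after a left bracket. A unary minus in front of a number is folded into the literal; otherwise it is emitted as the hypothetical token "u-" for a later stage to handle. All names are invented for this example, and functions such as sin are not supported.

```java
import java.util.ArrayList;
import java.util.List;

class UnaryMinusTokenizerSketch {
    static final String UNARY_MINUS = "u-"; // hypothetical marker token

    static List<String> tokenize(String expression) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < expression.length()) {
            char c = expression.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c) || c == '.') {
                int start = i;
                while (i < expression.length()
                        && (Character.isDigit(expression.charAt(i)) || expression.charAt(i) == '.')) {
                    i++;
                }
                String number = expression.substring(start, i);
                int last = tokens.size() - 1;
                if (last >= 0 && tokens.get(last).equals(UNARY_MINUS)) {
                    tokens.set(last, "-" + number); // fold the sign into the literal
                } else {
                    tokens.add(number);
                }
            } else if (c == '-' && expectsOperand(tokens)) {
                tokens.add(UNARY_MINUS);
                i++;
            } else if ("+-*/^()".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return tokens;
    }

    // True at the start of input, or after an operator, "(" or another unary minus.
    static boolean expectsOperand(List<String> tokens) {
        if (tokens.isEmpty()) {
            return true;
        }
        String prev = tokens.get(tokens.size() - 1);
        return prev.equals(UNARY_MINUS)
                || (prev.length() == 1 && "+-*/^(".indexOf(prev.charAt(0)) >= 0);
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-8 + 5"));  // prints [-8, +, 5]
        System.out.println(tokenize("11 ^ -7")); // prints [11, ^, -7]
    }
}
```

Note that folding the sign into the literal makes -2 ^ 2 read as (-2) ^ 2, which is not how every calculator interprets it; a fuller implementation would need to decide how unary minus binds relative to ^.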

All examples in this code have been verified using Google's online calculator. To use Google's built-in calculator, simply enter your mathematical expression into the search box.

Some examples are shown in the following table:

Expression (infix) | RPN (postfix) | Result
--- | --- | ---
exp( 1.11 ) | 1.11 exp | 3.034358
sin( cos( 90 * pi / 180 ) ) | 90 pi * 180 / cos sin | 0.000001
34.5*(23+1.5)/2 | 34.5 23 1.5 + * 2 / | 422.625000
5 + ((1 + 2) * 4) - 3 | 5 1 2 + 4 * + 3 - | 14
( 1 + 2 ) * ( 3 / 4 ) ^ ( 5 + 6 ) | 1 2 + 3 4 / 5 6 + ^ * | 0.126705
3/2 + 4*(12+3) | 3 2 / 4 12 3 + * + | 61.5
PI*pow(9/2,2) | PI 9 2 / 2 pow * | 63.617197
((2*(6-1))/2)*4 | 2 6 1 - * 2 / 4 * | 20
ln(2)+3^5 | 2 ln 3 5 ^ + | 243.693147
11 ^ -7 | 11 -7 ^ | 5.13158e-08
cos ( ( 1.3 + 1 ) ^ ( 1 / 3 ) ) - log ( -2 * 3 / -14 ) | 1.3 1 + 1 3 / ^ cos -2 3 * -14 / log - | 0.616143
1 * -sin( Pi / 2) | 1 Pi 2 / -sin * | -1
-8 + 5 | -8 5 + | -3
1 - (-2^2) - 1 | 1 -1 2 2 ^ * - 1 - | 4

The file main.cpp shows example usage: it first applies the Tokenize method of the ExpressionParser class to prime the input expression string into a suitable format, which is then converted to Reverse Polish Notation using the InfixToRPN method.

If the conversion is successful, the Reverse Polish Notation is then evaluated to give the mathematical result.

Latest Comments

vahid, 24 January 2012

hi,
just want to say thanks, it was very useful and, u’ve made it easy.
P.S
i think, you should change this, like what u have in your java code :
If (token is right bracket ')')
{
    While (stack not empty AND
           stack top element not a left bracket)
    {
        Pop the stack onto output queue
        Pop the stack
    }
}

to this:
If (token is right bracket ')')
{
    While (stack not empty AND
           stack top element not a left bracket)
    {
        Pop the stack onto output queue
    }
    Pop the stack
}

I think I know what you mean. But the stack is not quite empty, the while loop keeps popping until it reaches a left parenthesis:
// Until the token at the top of the stack is a left parenthesis, pop operators
// off the stack onto the output queue.
while ( !stack.empty() && stack.top() != "(" )
{
    // Add to end of list
    outputQueue.push( stack.top() );
    stack.pop();
}

If it is not left with a remaining “(” then the expression is invalid and the program exits.

I know, but in your original post the second pop statement is outside the while loop. So if there is no opening ( and everything has already been popped from the stack, it will throw an exception. That case should be handled in case the user forgets the opening bracket…

happyuk, 10 November 2013

Ah, I see what you mean. I think this improvement got put into the downloadable code, but the version shown here still relies on a valid input expression. I will update the code in due course. Thanks for pointing that out.

happyuk, 10 November 2013

Hi, I have updated the C++ code listing on this page to reflect this. You still need to enter each value separated by whitespace though; the download has tokenizers etc. to cope with input that is not.

Hi Rakesh. I’m not sure I understand your question. Could you be more specific? It would help if you could express your problem in terms of what your current input is, and what your desired outcome would be. What do you mean by “passed as string from another java file?”

Given a series of numbers as the input, with the last one as the result. Use the remaining numbers to calculate the result; only +, -, *, / are allowed. The order of the input cannot be changed. If there is an equation, print it; otherwise print "no equation". If there is more than one solution, any working equation is fine.

Hi Andy. No, this parser as it stands would not be able to do what you describe. If I understand you correctly, would it not be a case of writing an additional routine to:

1. parse your original input string of “2,3,1,4”, separating it into an input part (“2,3,1”) and an expected output part (“4”).
2. replace each of the commas in your input string with the arithmetic operators you prefer so that "2,3,1" becomes "2+3-1"
3. feed this new string “2+3-1” into the infixToRPN() routine and then the RPNtoDouble() routine.
4. compare the answer returned with the desired answer. If they are equal print the new string appended with the equals sign and the desired output string ie “2+3-1” + “=4” ; otherwise print “no equation”.
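The four steps above could be sketched in Java roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, every operator combination is tried in turn instead of a single preferred one, and a small precedence-aware evaluate() helper stands in for the infixToRPN() and RPNtoDouble() routines so that the example is self-contained.

```java
class EquationSearchSketch {
    private static final char[] OPERATORS = {'+', '-', '*', '/'};

    // Input such as "2,3,1,4": the last number is the expected result.
    static String findEquation(String csv) {
        String[] parts = csv.split(",");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Need at least one operand and a result");
        }
        int n = parts.length - 1; // number of operands
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            parts[i] = parts[i].trim();
            values[i] = Double.parseDouble(parts[i]);
        }
        String expected = parts[n].trim();
        double target = Double.parseDouble(expected);

        int slots = n - 1; // one operator between each pair of operands
        int combinations = 1;
        for (int i = 0; i < slots; i++) {
            combinations *= OPERATORS.length;
        }
        char[] ops = new char[slots];
        for (int code = 0; code < combinations; code++) {
            int rest = code; // decode the counter as base-4 digits, one per slot
            for (int i = 0; i < slots; i++) {
                ops[i] = OPERATORS[rest % OPERATORS.length];
                rest /= OPERATORS.length;
            }
            if (Math.abs(evaluate(values, ops) - target) < 1e-9) {
                StringBuilder equation = new StringBuilder(parts[0]);
                for (int i = 0; i < slots; i++) {
                    equation.append(ops[i]).append(parts[i + 1]);
                }
                return equation.append('=').append(expected).toString();
            }
        }
        return "no equation";
    }

    // Stand-in evaluator: * and / bind tighter than + and -, left to right.
    static double evaluate(double[] values, char[] ops) {
        double sum = 0.0;        // total of the completed additive terms
        double term = values[0]; // current multiplicative term
        int sign = 1;            // sign with which the current term is added
        for (int i = 0; i < ops.length; i++) {
            double next = values[i + 1];
            switch (ops[i]) {
                case '*': term *= next; break;
                case '/': term /= next; break;
                default:
                    sum += sign * term;
                    sign = (ops[i] == '+') ? 1 : -1;
                    term = next;
            }
        }
        return sum + sign * term;
    }

    public static void main(String[] args) {
        System.out.println(findEquation("2,3,1,4")); // prints 2+3-1=4
    }
}
```

With n operands there are 4^(n-1) operator combinations, so this brute-force search is only practical for short inputs.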