CS160: Project 1 - A simple calculator (10% of project score)

Project Goals

Administrative Information

This is an individual project.

The project is due on Friday, October 14 2016, 23:59:59 PST.

Project Introduction

In this assignment, we are developing a simple calculator. The
goal of this project is to demystify the front-end work
that a compiler does. Everything you need is
in calc.cpp; the other files are
just there to help you develop and test your calculator. We have not
used any complex tools to do things automatically: everything is
in plain C++, and the code should be fully commented and
readable. You _need_ to know what everything
in calc.cpp does before you can start writing
code. There should be a "WRITEME" at every place in the code that
needs your assistance. However, the version that we provide
to start will compile and even run (although it does not
actually scan a file, and the grammar it parses is trivial). The
grammar we want our simple calculator to recognize is as
follows:

While the grammar above is ambiguous (in the technical sense) in
many ways, it will recognize any program written in our
calc language (in other words, it can tell whether a program
is syntactically correct or not). You will need to change the
grammar so that it recognizes the same language but is no
longer ambiguous. Furthermore, it should be LL(1) so we can
parse it with recursive descent. Of course, your parser must also
correctly handle associativity and precedence. Associativity
in particular can be a little tricky.

Tour of the Code

The tarball with a skeleton of calc.cpp (and some
additional files) can be
downloaded here. You
will see that the code in calc.cpp is divided into four
major parts and three classes. The first part contains some
enums and helper functions to aid in dealing with tokens,
non-terminals, and printing out all that stuff. Once you figure
out the grammar that you are going to use, it should be pretty
straightforward to add the other non-terminals into the
appropriate enum and helper functions.

The second chunk of code is the scanner (which is the C++ class
scanner_t). The scanner should handle reading
the input from stdin and identifying the appropriate tokens. The
interface listed should be supported because that is how the
recursive descent parser will actually be getting tokens. It does
not have to be a big state machine or a regular expression
... just something that is coded by you and that works. Make
sure you test your scanner well before you move on. The last
thing you want to be doing is trying to track down a weird
scanner problem in your parser. In addition to finding the
tokens, you should also keep track of newlines so that the
parser can report where a syntax error occurred (it does this
by calling get_line(), which should return the
number of the input line currently being scanned). The code
that is in there to start is just stub code so that everything
compiles and you actually get some visible output (the scanner
is just returning either a "+" or "eof" randomly). For the basic
part of the assignment, you do not need to handle any attributes
(such as the actual value of a number token).

The third chunk of code is the class that draws a parse tree (called
parsetree_t). The
class parsetree_t need not and should not be
modified. All it does is print out the parse tree as you
discover it. It prints the tree out in a format readable by the
program "dot", which can turn it into a PostScript or graphics file
(the makefile shows how to do that). The output is really a
bunch of nodes and edges. When you start processing a
non-terminal, you push it on a stack (this will draw an edge
from that newly pushed node to the current node on the top of
the stack, which is its parent). When you finish, just pop
it. We will grade the dot file that you generate because we can
clearly see the parse tree you generate for some test inputs we
give to your calculator.

The fourth chunk is the parser itself
(class parser_t). There are already some helper
methods provided for you, but you need to figure out how to
structure your grammar such that it can be written as a
recursive descent parser (more on that later). The code that is
there now will parse the grammar "List -> '+' List | EOF". If
you run
parse(), it will call List(),
which recursively calls itself. As List() is
executed, it calls the scanner to get new tokens, and it
calls parsetree_t to actually print out the
parse tree.
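To make the shape of a recursive descent routine concrete, here is a minimal sketch for the same trivial grammar, List -> '+' List | EOF. It is not the skeleton's code: sketch_parser, the TOK_* names, and the token vector standing in for the scanner are all assumptions, and the parse-tree calls are left out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical tokens; TOK_OTHER stands for anything the grammar rejects.
enum sketch_tok { TOK_PLUS, TOK_EOF, TOK_OTHER };

class sketch_parser {
public:
    explicit sketch_parser(const std::vector<sketch_tok>& toks)
        : toks_(toks), pos_(0), calls_(0) {}

    // Parse the whole input; returns how many times list() was entered.
    int parse() {
        list();
        return calls_;
    }

private:
    // Look at the next token without consuming it.
    sketch_tok peek() const {
        return pos_ < toks_.size() ? toks_[pos_] : TOK_EOF;
    }

    // Consume the next token, which must be the expected one.
    void eat(sketch_tok expected) {
        if (peek() != expected) throw std::runtime_error("syntax error");
        ++pos_;
    }

    // List -> '+' List | EOF, chosen by one token of lookahead.
    void list() {
        ++calls_;
        switch (peek()) {
            case TOK_PLUS:
                eat(TOK_PLUS);
                list();
                break;
            case TOK_EOF:
                eat(TOK_EOF);
                break;
            default:
                throw std::runtime_error("syntax error");
        }
    }

    std::vector<sketch_tok> toks_;
    std::size_t pos_;
    int calls_;
};
```

Each non-terminal becomes one method, and each alternative becomes one case selected by the lookahead token; that one-token decision is what the LL(1) property guarantees is always possible.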

Steps to Solve the Challenge

Get the scanner working:
Implement the scanner class and pass in some of the test inputs
(call make test_parse). You will need more test
inputs; the ones we provide are just examples. When
testing the scanner, we suggest instantiating and calling the scanner
directly from main, so that there is no parsing
involved. Then, check that the tokens returned are correct by
printing them out (call token_to_string on them
and printf them).

Modify the grammar to handle precedence correctly: The
grammar presented above is ambiguous and requires
modification. While you need to modify the grammar by hand, we
have included a second set of files to help
you test your modified
grammars. calc_def.l is a lexer
and calc_def.y is a parser written for flex
and bison respectively (you cannot use them in your
final code, but you can use them to help you write and
understand your grammar). If you look at
calc_def.y, you will see the ambiguous
grammar. If you call make, it will compile it
to calc_def, which you can then run on input files!
You will see which expressions parse and which cause syntax
errors. It works even though it is ambiguous (the
"shift/reduce warnings" are warning you that the grammar is
ambiguous). You can modify that grammar and then test that it
still recognizes the exact same set of programs (and reports a
syntax error on the same set too). If your new grammar is
unambiguous, you should see no shift/reduce
warnings (or any other types of warnings). However, just
because your grammar produces no shift/reduce warnings does
not mean it correctly handles precedence.

We use the standard precedence for operators (same as for C):
Multiplication and modulo have the same precedence (and are
left associative), addition and subtraction have the same
precedence, and multiplication and modulo have a higher
precedence than addition and subtraction.
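Since these rules match C, the C++ compiler itself can serve as a quick reference when you are unsure how an expression should group. The snippet below is only a sanity check of the language's own behavior, with variable names chosen for illustration; it says nothing about how your grammar should be written.

```cpp
// Each expression is evaluated by the compiler using C/C++ rules.
constexpr int times_before_plus  = 2 + 3 * 4;   // 2 + (3 * 4)
constexpr int times_before_minus = 10 - 2 * 3;  // 10 - (2 * 3)
constexpr int mod_then_times     = 20 % 7 * 2;  // (20 % 7) * 2
constexpr int times_then_mod     = 2 * 7 % 4;   // (2 * 7) % 4
constexpr int minus_chain        = 10 - 4 - 3;  // (10 - 4) - 3

// Compilation fails if any grouping above is misunderstood.
static_assert(times_before_plus == 14, "* binds tighter than +");
static_assert(times_before_minus == 4, "* binds tighter than -");
static_assert(mod_then_times == 12, "% and * group left to right");
static_assert(times_then_mod == 2, "* and % group left to right");
static_assert(minus_chain == 3, "- groups left to right");
```

The last case is the one to keep in mind later: a parse that groups 10 - 4 - 3 as 10 - (4 - 3) would yield 9 rather than 3.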

Modify the grammar to be LL(1):
Again, you should test your grammar with bison (the .y file)
to make sure you did not break anything in the process (you
want to start Step 4 with the correct grammar instead of
finding problems there).

Get the parser written:
Now that you have a grammar ready to go, start writing the
parser (by adding new methods to parser_t). This step
should actually be very easy if you did the previous three
steps correctly. Do not skip this part! The whole point of
this assignment is to get you familiar with scanning AND
parsing. Solving the calculator problem is just an
exercise.

Make sure that you check for errors: You will need to
return an error as soon as possible. That means that if epsilon
can be derived from the non-terminal you are working on, you
need to check the following token to make sure that it is
allowed to appear after the non-terminal that you are currently
examining. To ensure that our automated grading system correctly
handles your submission, please follow these
required guidelines for your program's output:

When you detect a parse error, print to stderr the
line where the error is detected. Your line counter must
start at 1. Also, the program must exit with an exit code
other than 0 (in principle, any nonzero code
is fine, but to follow the Unix convention, return
1).

If there is no error and your parser can correctly process
the input, you must exit with an exit code of 0. The
generated dot file should be printed to stdout.

If you have implemented the full calculator (that is, you evaluate
expressions -- see below), the results for all calculations
(one for each "Expr") should be printed to stderr,
one result per line.

Nothing else may be printed to stderr or
stdout.

Getting to this point gives you full credit on
this assignment.

Make it work:
If you want that extra credit, and if you have steps 1 through
5 done and rock solid, step 6 is to finish the calculator (so
that it really does calculations). The calculator should
simply print out the signed integer that the expression
evaluates to (you can assume that the integer does not leave
the value range of what can be represented in a 32-bit
integer). The modulus (i.e., the number on the right side of
the modulo operation) has to be a positive integer. If it is
not, throw an error. Note that handling associativity is the
trickiest part here, since your LL(1) grammar likely does not
produce the correct associativity. Thus, the parser must do
some intelligent things to compensate for that. If you don't
do this part at all, we won't be insulted in the least: it is
more related to later material, and we won't cover it for
this project. However, we are reserving this extra credit for
students who want to figure it out for themselves and
build something that is actually functional.
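One common way to recover left associativity, shown here only as a sketch and not as the intended solution, is to carry the value computed so far forward as an accumulator instead of evaluating the right-hand tail first. The types and names below (sketch_step, apply_op, fold_left) are hypothetical, and the input is assumed to be already flattened into a first operand followed by (operator, operand) pairs of one precedence level.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical flattened form of one precedence level:
// a first operand followed by zero or more (operator, operand) pairs.
struct sketch_step {
    char op;
    int operand;
};

// Apply one binary operator, rejecting a non-positive modulus.
int apply_op(int lhs, char op, int rhs) {
    switch (op) {
        case '+': return lhs + rhs;
        case '-': return lhs - rhs;
        case '*': return lhs * rhs;
        case '%':
            if (rhs <= 0) throw std::domain_error("modulus must be positive");
            return lhs % rhs;
    }
    throw std::invalid_argument("unknown operator");
}

// Fold from the left: the running result is always the left operand
// of the next operator, which is exactly left associativity.
int fold_left(int first, const std::vector<sketch_step>& rest) {
    int acc = first;
    for (const sketch_step& s : rest) acc = apply_op(acc, s.op, s.operand);
    return acc;
}
```

With a first operand of 10 and the steps (-, 4) and (-, 3), the fold computes (10 - 4) - 3, whereas evaluating the tail first would compute 10 - (4 - 3). In a recursive descent parser the same effect can be obtained by passing the accumulated value into the method that parses the rest of the expression.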

Deliverables / Turnin

Please follow the instructions below exactly!

Your files must be in a directory named "calc".

All files must be included (makefiles, everything!) in that folder.

Your project must compile on a CSIL machine. If you worked
on a Windows machine or your laptop at home, then make sure
it still works on CSIL or modify it appropriately!

Include a README with this project, and all subsequent
projects. Explain what you did in the README. If you had
problems, tell us why and what.

If you did the final part and evaluate (compute)
expressions, output the result on stderr.

Use this command to submit your work: turnin proj1@cs160 calc

Grading

Scanner works: 40% (of this project)

Parser works: 80%

Parser throws the error on time (i.e., on the correct line): 100%

Program solves the expressions correctly: Extra credit

Important Note: No README == No partial credit if the
project does not work 100%