Compiler Design

Tccicomputercoaching.com

What is Compiler Design?

A compiler translates code written in one language into another language without changing the meaning of the program. A compiler is also expected to produce target code that is efficient in terms of time and space.

The compilation process is a sequence of phases. Each phase takes its input from the previous phase, has its own representation of the source program, and feeds its output to the next phase of the compiler. The phases of a compiler are described below.

Lexical analysis:

Lexical analysis is the first phase of a compiler. Its input is the source code, written in the form of sentences (statements), after any language preprocessors have modified it. The lexical analyzer breaks this text into a series of tokens and discards any whitespace and comments in the source code.

A programming language must include the specification of syntax (structure) and semantics (meaning).

Syntax typically means the context-free syntax, because context-free grammars (CFGs) are used almost universally to specify it.

Ex.

a = b + c is syntactically legal

b + c = a is illegal in most languages, because the left side of an assignment must be a variable, not an expression

Token:

A token pairs a token name with the text it was recognized from. The token name is an abstract symbol representing a kind of lexical unit, e.g., a particular keyword, or a sequence of input characters denoting an identifier.
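The lexical analysis step described above can be sketched with a small regular-expression tokenizer. This is only a minimal illustration: the token names (NUMBER, ID, OP, COMMENT, SKIP), the `#` comment syntax, and the function name `tokenize` are assumptions made for this example, not part of any particular language.

```python
import re

# Token names and patterns are illustrative assumptions, not a standard.
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=]"),
    ("COMMENT", r"#[^\n]*"),
    ("SKIP",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (token_name, lexeme) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        # lastgroup is the name of the alternative that matched.
        if match.lastgroup not in ("SKIP", "COMMENT"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("a = b + 42  # add"))
# [('ID', 'a'), ('OP', '='), ('ID', 'b'), ('OP', '+'), ('NUMBER', '42')]
```

Each pair holds the token name and the lexeme it was built from. The whitespace and the comment are matched but never reach the output.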

Syntax Analysis:

The next phase is called syntax analysis or parsing. It takes the tokens produced by lexical analysis as input and generates a parse tree (or syntax tree). In this phase, the arrangement of the tokens is checked against the grammar of the source language, i.e. the parser checks whether the expression formed by the tokens is syntactically correct.
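As a rough sketch of parsing, the recursive-descent parser below accepts one assignment statement and returns a syntax tree as nested tuples. The grammar (stmt -> ID '=' expr, expr -> operand (('+' | '-') operand)*), the helper `lex`, and the tuple layout are assumptions chosen for this example. Real compilers use richer grammars and often generated parsers.

```python
def lex(text):
    """Tiny stand-in lexer for this example: tokens must be separated by spaces."""
    tokens = []
    for word in text.split():
        if word in ("=", "+", "-"):
            tokens.append(("OP", word))
        elif word.isdigit():
            tokens.append(("NUMBER", word))
        elif word.isidentifier():
            tokens.append(("ID", word))
        else:
            raise SyntaxError(f"unknown token {word!r}")
    return tokens

def parse_statement(tokens):
    """Parse stmt -> ID '=' expr and return a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_operand():
        kind, lexeme = advance()
        if kind not in ("ID", "NUMBER"):
            raise SyntaxError(f"expected an operand, found {lexeme!r}")
        return (kind, lexeme)

    def parse_expr():
        # expr -> operand (('+' | '-') operand)*
        node = parse_operand()
        while peek() in (("OP", "+"), ("OP", "-")):
            _, op = advance()
            node = (op, node, parse_operand())
        return node

    kind, name = advance()
    if kind != "ID":
        raise SyntaxError("an assignment must start with an identifier")
    if advance() != ("OP", "="):
        raise SyntaxError("expected '=' after the assignment target")
    tree = ("=", ("ID", name), parse_expr())
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree

print(parse_statement(lex("a = b + c")))
# ('=', ('ID', 'a'), ('+', ('ID', 'b'), ('ID', 'c')))
```

With this grammar, `a = b + c` produces a tree, while `b + c = a` raises a SyntaxError because the token after the first identifier is not `=`.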

Semantic Analysis:

Semantic analysis checks whether the parse tree constructed follows the rules of the language. For example, it checks that values are assigned between compatible data types, and it reports an error for an operation such as adding a string to an integer.
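A semantic check of this kind might look like the sketch below, which walks a syntax tree and compares types. The nested-tuple tree layout, the `symbols` dictionary standing in for a symbol table, and the rule that both operands of `+` must have the same type are all assumptions made for illustration.

```python
def check_types(node, symbols):
    """Return the type of a tree node, or raise an error on a semantic problem."""
    kind = node[0]
    if kind == "NUMBER":
        return "int"
    if kind == "STRING":
        return "str"
    if kind == "ID":
        name = node[1]
        if name not in symbols:
            raise NameError(f"undeclared variable {name!r}")
        return symbols[name]
    if kind == "+":
        left = check_types(node[1], symbols)
        right = check_types(node[2], symbols)
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    if kind == "=":
        target = check_types(node[1], symbols)
        value = check_types(node[2], symbols)
        if target != value:
            raise TypeError(f"cannot assign {value} to a variable of type {target}")
        return target
    raise ValueError(f"unknown node kind {kind!r}")

# Hypothetical symbol table: variable name -> declared type.
symbols = {"a": "int", "b": "int", "s": "str"}

ok_tree = ("=", ("ID", "a"), ("+", ("ID", "b"), ("NUMBER", "1")))
print(check_types(ok_tree, symbols))  # int

bad_tree = ("=", ("ID", "a"), ("+", ("ID", "s"), ("NUMBER", "1")))
try:
    check_types(bad_tree, symbols)
except TypeError as error:
    print("semantic error:", error)  # semantic error: cannot add str and int
```

Both trees are syntactically valid. Only the type rules tell them apart, which is why this check is a separate phase after parsing.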

Intermediate Code Generation:

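Intermediate code generation produces a machine-independent representation of the program for the later phases to work on. Interpreters, by contrast, execute the program directly instead of generating machine code. They are easier to write and can provide better error messages, because the symbol table is still available while the program runs. On the other hand, interpreted programs can be at least 5 times slower than machine code generated by compilers, and they also require much more memory. Examples of languages that are commonly run by an interpreter: Perl, Python, Unix Shell, Java, BASIC, LISP.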
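One common form of intermediate code is three-address code, in which each instruction has at most one operator. The sketch below flattens a syntax tree into such instructions. The nested-tuple tree layout, the temporary names `t1`, `t2`, and the function name `to_three_address` are assumptions for this example.

```python
def to_three_address(tree):
    """Flatten an assignment tree into a list of three-address instructions."""
    code = []
    temp_count = 0

    def emit_expr(node):
        nonlocal temp_count
        if node[0] in ("ID", "NUMBER"):
            return node[1]  # leaves are used directly as operands
        op, left, right = node
        left_name = emit_expr(left)
        right_name = emit_expr(right)
        temp_count += 1
        temp = f"t{temp_count}"  # fresh temporary for this sub-expression
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, value = tree
    result = emit_expr(value)
    code.append(f"{target[1]} = {result}")
    return code

# a = b + c * 2
tree = ("=", ("ID", "a"), ("+", ("ID", "b"), ("*", ("ID", "c"), ("NUMBER", "2"))))
for line in to_three_address(tree):
    print(line)
# t1 = c * 2
# t2 = b + t1
# a = t2
```

Because every instruction is this simple, the later optimization and code generation phases can work on the list one instruction at a time.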

Code Optimization:

The next phase optimizes the intermediate code, so that the resulting program runs faster or uses fewer resources.
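Constant folding is one simple example of such an optimization: sub-expressions made only of constants are computed at compile time. The sketch below applies it to a nested-tuple syntax tree. The tree layout and the name `fold_constants` are assumptions for this example, and real optimizers apply many more transformations.

```python
import operator

# Operators this sketch knows how to evaluate at compile time.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(node):
    """Return a copy of the tree with constant sub-expressions pre-computed."""
    if node[0] in ("ID", "NUMBER"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if op in FOLDABLE and left[0] == "NUMBER" and right[0] == "NUMBER":
        value = FOLDABLE[op](int(left[1]), int(right[1]))
        return ("NUMBER", str(value))
    return (op, left, right)

# a = b + 2 * 3
tree = ("=", ("ID", "a"), ("+", ("ID", "b"), ("*", ("NUMBER", "2"), ("NUMBER", "3"))))
print(fold_constants(tree))
# ('=', ('ID', 'a'), ('+', ('ID', 'b'), ('NUMBER', '6')))
```

The multiplication disappears from the program, while the addition stays because the value of `b` is not known until run time.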

Code Generation:

In this phase, the code generator takes the optimized representation of the intermediate code and maps it to the target machine language.
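To illustrate this mapping, the sketch below emits instructions for a hypothetical stack machine. The instruction names (PUSH, LOAD, ADD, SUB, MUL, STORE) and the nested-tuple tree layout are invented for this example and do not correspond to any real processor.

```python
# Mapping from tree operators to opcodes of the hypothetical stack machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}

def generate(tree):
    """Emit stack-machine instructions for an assignment tree (post-order walk)."""
    code = []

    def emit(node):
        kind = node[0]
        if kind == "NUMBER":
            code.append(f"PUSH {node[1]}")   # push a constant
        elif kind == "ID":
            code.append(f"LOAD {node[1]}")   # push a variable's value
        else:
            emit(node[1])                    # operands first...
            emit(node[2])
            code.append(OPCODES[kind])       # ...then the operator

    _, target, value = tree
    emit(value)
    code.append(f"STORE {target[1]}")        # pop the result into the target
    return code

# a = b + 6
tree = ("=", ("ID", "a"), ("+", ("ID", "b"), ("NUMBER", "6")))
for instruction in generate(tree):
    print(instruction)
# LOAD b
# PUSH 6
# ADD
# STORE a
```

A generator for a real machine would also have to choose registers and addressing modes, which this sketch leaves out.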