Transcription

2 William Tell is a folk hero of Switzerland; he was an exceptional marksman.

3 Conference in Vienna in 1964, best summarized by T. B. Steel: "I don't fully know myself how to describe the semantics of a language. I daresay nobody does or we wouldn't be here." (Donald E. Knuth, "The Genesis of Attribute Grammars")

7 Semantic Analysis
Compilation is driven by the syntactic structure of the program as discovered by the parser.
Semantic routines:
> interpret the meaning of the program based on its syntactic structure
> serve two purposes:
  1. finish analysis by deriving context-sensitive information
  2. begin synthesis by generating the IR or target code
> are associated with individual productions of a context-free grammar or with sub-trees of a syntax tree
One of the main goals is to find errors early. If the instructions are ambiguous or wrong, you don't want to follow them.

8 Context-sensitive analysis
What context-sensitive questions might the compiler ask?
1. Is x a scalar, an array, or a function?
2. Is x declared before it is used?
3. Are any names declared but not used?
4. Which declaration of x is being referenced?
5. Is an expression type-consistent?
6. Does the dimension of a reference match the declaration?
7. Where can x be stored? (heap, stack, ...)
8. Does *p reference the result of a malloc()?
9. Is x defined before it is used?
10. Is an array reference in bounds?
11. Does function foo produce a constant value?
12. Can p be implemented as a memo-function?
These questions cannot be answered with a context-free grammar.

19 Example: evaluating signed binary numbers
Attributed parse tree for -101:
> val and neg are synthesized attributes
> pos is an inherited attribute
Note that the val attributes propagate upwards while the pos attributes propagate downward. The production rule List -> List1 Bit must be left recursive; otherwise the algorithm won't work.
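The attribute flow described above can be sketched in code. The slide's grammar figure is not part of this transcript, so the attribute rules below are an assumption (the usual rules for this textbook example), and the class and method names are hypothetical. Inherited attributes (pos) are passed down as parameters; synthesized attributes (val, neg) come back as return values.

```java
// A minimal sketch, assuming these attribute rules:
//   Num  -> Sign List : List.pos = 0; Num.val = Sign.neg ? -List.val : List.val
//   List -> List1 Bit : List1.pos = List.pos + 1; Bit.pos = List.pos;
//                       List.val = List1.val + Bit.val
//   List -> Bit       : Bit.pos = List.pos; List.val = Bit.val
//   Bit  -> 0 | 1     : Bit.val = 0 or 2^Bit.pos
final class SignedBinary {
    // Sign.neg (synthesized): true for '-', false for '+'.
    static boolean neg(char sign) {
        if (sign == '-') return true;
        if (sign == '+') return false;
        throw new IllegalArgumentException("expected a sign, got " + sign);
    }

    // Bit.val (synthesized), computed from Bit.pos (inherited).
    static int bitVal(char bit, int pos) {
        if (bit == '0') return 0;
        if (bit == '1') return 1 << pos;
        throw new IllegalArgumentException("expected a bit, got " + bit);
    }

    // List.val for the List that spans bits[0..last], given inherited List.pos.
    // The rightmost character is the Bit; everything to its left is List1.
    static int listVal(String bits, int last, int pos) {
        int bit = bitVal(bits.charAt(last), pos);
        if (last == 0) return bit;                      // List -> Bit
        return listVal(bits, last - 1, pos + 1) + bit;  // List -> List1 Bit
    }

    // Num.val for an input such as "-101".
    static int numVal(String input) {
        if (input.length() < 2) {
            throw new IllegalArgumentException("need a sign and at least one bit");
        }
        boolean negative = neg(input.charAt(0));
        int value = listVal(input.substring(1), input.length() - 2, 0);
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        System.out.println(numVal("-101")); // prints -5
    }
}
```

Because List is left recursive, the rightmost bit sits at the top of the List spine and receives pos = 0, which is why pos can simply be incremented on the way down.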

20 Attribute dependency graph
> nodes represent attributes
> edges represent flow of values
> graph must be acyclic
> topologically sort to order attributes
> use this order to evaluate rules
> order depends on both grammar and input string!
Evaluating in this order yields NUM.val = -5.
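A topological sort of the dependency graph can be sketched as follows. This uses Kahn's algorithm, which is one common choice and not necessarily the one the lecture had in mind; the class name, method name, and the attribute labels in main are hypothetical. An edge a -> b means that the rule for b reads the value of a.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AttributeOrder {
    // Returns the attributes in an order in which every attribute appears
    // after all attributes it depends on. Throws if the graph has a cycle.
    static List<String> evaluationOrder(Map<String, List<String>> edges) {
        // Count, for every attribute, how many unevaluated inputs it has.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : edges.entrySet()) {
            inDegree.putIfAbsent(e.getKey(), 0);
            for (String target : e.getValue()) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }
        // Attributes with no inputs can be evaluated immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String attr = ready.remove();
            order.add(attr);
            for (String target : edges.getOrDefault(attr, Collections.emptyList())) {
                if (inDegree.merge(target, -1, Integer::sum) == 0) ready.add(target);
            }
        }
        // Any attribute left over sits on a cycle and can never become ready.
        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("cyclic attribute dependencies");
        }
        return order;
    }

    public static void main(String[] args) {
        // Assumed dependencies for a one-bit input such as "-1".
        Map<String, List<String>> edges = new LinkedHashMap<>();
        edges.put("Sign.neg", Arrays.asList("Num.val"));
        edges.put("List.pos", Arrays.asList("Bit.pos"));
        edges.put("Bit.pos", Arrays.asList("Bit.val"));
        edges.put("Bit.val", Arrays.asList("List.val"));
        edges.put("List.val", Arrays.asList("Num.val"));
        System.out.println(evaluationOrder(edges));
    }
}
```

The cycle check also shows why the graph must be acyclic: an attribute on a cycle never reaches an in-degree of zero, so no evaluation order exists for it.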

21 Evaluation strategies
> Parse-tree methods
  1. build the parse tree
  2. build the dependency graph
  3. topologically sort the graph
  4. evaluate it
> Rule-based methods
  1. analyse semantic rules at compiler-construction time
  2. determine static ordering for each production's attributes
  3. evaluate its attributes in that order at compile time
> Oblivious methods
  1. ignore the parse tree and the grammar
  2. choose a convenient order (e.g., left-to-right traversal) and use it
  3. repeat traversal until no more attribute values can be generated

25 Symbol table information
> What kind of information might the compiler need?
  textual name
  data type
  dimension information (for aggregates)
  declaring procedure
  lexical level of declaration
  storage class (heap, stack, text)
  offset in storage
  if record, pointer to structure table
  if parameter, by-reference or by-value?
  can it be aliased? to what other names?
  number and type of arguments to functions

26 Lexical Scoping
    class C {
        int x;                  // scope of x: the whole class body
        int m(int y) {          // scope of y and z: the body of m
            int z = 0;
            if (y > x) {
                int w = z + y;  // scope of w: this block only
                return w;
            }
            return y;
        }
    }
With lexical scoping the definition of a name is determined by its static scope. A stack suffices to track the current definitions.
Some older languages provided dynamic scoping, but it is much harder to reason about. Nowadays only exception handlers are dynamically scoped.

27 Nested scopes: block-structured symbol tables
> What information is needed?
  when we ask about a name, we want the most recent declaration
  the declaration may be from the current scope or some enclosing scope
  innermost scope overrides declarations from outer scopes
> Key point: new declarations (usually) occur only in current scope
> What operations do we need?
  void put(symbol key, Object value): bind key to value
  Object get(symbol key): return value bound to key
  void beginscope(): remember current state of table
  void endscope(): restore table to state at most recent scope that has not been ended
May need to preserve list of locals for the debugger.
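The four operations above could be realised along the following lines. This is a sketch under assumptions, not the lecture's implementation: keys are plain String values instead of a symbol type, the methods are spelled beginScope and endScope to follow Java naming, the class name is hypothetical, and null values are not supported because ArrayDeque rejects them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ScopedSymbolTable {
    // For each name, a stack of bindings; the top is the innermost declaration.
    private final Map<String, Deque<Object>> bindings = new HashMap<>();
    // For each open scope, the names declared in it (needed to undo them).
    private final Deque<List<String>> scopes = new ArrayDeque<>();

    ScopedSymbolTable() {
        scopes.push(new ArrayList<>()); // the outermost scope is always open
    }

    // Bind key to value in the current scope (value must not be null).
    void put(String key, Object value) {
        bindings.computeIfAbsent(key, k -> new ArrayDeque<>()).push(value);
        scopes.peek().add(key);
    }

    // Return the most recent binding of key, or null if it is undeclared.
    Object get(String key) {
        Deque<Object> stack = bindings.get(key);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
    }

    // Remember the current state of the table.
    void beginScope() {
        scopes.push(new ArrayList<>());
    }

    // Restore the table to its state before the matching beginScope().
    void endScope() {
        if (scopes.size() == 1) {
            throw new IllegalStateException("no scope to end");
        }
        for (String key : scopes.pop()) {
            bindings.get(key).pop();
        }
    }

    public static void main(String[] args) {
        ScopedSymbolTable table = new ScopedSymbolTable();
        table.put("x", "int");
        table.beginScope();
        table.put("x", "boolean");            // shadows the outer x
        System.out.println(table.get("x"));   // prints boolean
        table.endScope();
        System.out.println(table.get("x"));   // prints int
    }
}
```

A compiler that must preserve the list of locals for the debugger could keep the popped list in endScope instead of discarding it.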

30 Efficient Implementation of Symbol Tables
    int foo, bar;
    foo = ++bar;
    if (bar>10) then {
        boolean baz;
        baz = true;
    }
Hash tables support an imperative (destructive) implementation.
Assume hash(foo)=hash(bar) and hash(baz)=hash(quux).
If the new environment contains multiple symbols, we need a stack to keep track of the symbols declared in each environment. In red, the figure shows the alternative of copying the array for the new environment; that is not efficient!

31 Efficient Implementation of Symbol Tables (2)
(Balanced) binary trees support a functional (non-destructive) implementation. This is a persistent data structure.
Question: How fast is the copying of the needed nodes to create an entry point for a new environment? To insert a node at depth d, at most d nodes (those on the path from the root) have to be copied. In a balanced tree with n entries the depth is O(log n), so insertion and search both take O(log n) time.
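The path-copying idea can be sketched as follows. To keep the example short, the tree here is a plain binary search tree without rebalancing, so the O(log n) bound only holds when the tree happens to stay balanced; a real implementation would use a balanced variant. The class name and String keys are assumptions for illustration.

```java
final class PersistentTable {
    private static final class Node {
        final String key;
        final Object value;
        final Node left;
        final Node right;

        Node(String key, Object value, Node left, Node right) {
            this.key = key;
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static final PersistentTable EMPTY = new PersistentTable(null);

    private final Node root;

    private PersistentTable(Node root) {
        this.root = root;
    }

    // Returns a new table with key bound to value; this table is unchanged.
    PersistentTable put(String key, Object value) {
        return new PersistentTable(insert(root, key, value));
    }

    // Copies only the nodes on the path from the root to the insertion point;
    // every subtree off that path is shared between the old and new tables.
    private static Node insert(Node n, String key, Object value) {
        if (n == null) return new Node(key, value, null, null);
        int c = key.compareTo(n.key);
        if (c < 0) return new Node(n.key, n.value, insert(n.left, key, value), n.right);
        if (c > 0) return new Node(n.key, n.value, n.left, insert(n.right, key, value));
        return new Node(key, value, n.left, n.right); // rebinding an existing key
    }

    // Returns the value bound to key, or null if it is not bound.
    Object get(String key) {
        Node n = root;
        while (n != null) {
            int c = key.compareTo(n.key);
            if (c == 0) return n.value;
            n = (c < 0) ? n.left : n.right;
        }
        return null;
    }

    public static void main(String[] args) {
        PersistentTable outer = EMPTY.put("foo", "int").put("bar", "int");
        PersistentTable inner = outer.put("baz", "boolean"); // entering a scope
        System.out.println(inner.get("baz")); // prints boolean
        System.out.println(outer.get("baz")); // prints null
    }
}
```

Leaving a scope needs no undo work at all: the compiler simply goes back to using the reference to the outer table.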

33 Static and Dynamic Typing
A language is statically typed if it is always possible to determine the (static) type of an expression based on the program text alone.
A language is dynamically typed if only values have fixed type. Variables and parameters may take on different types at run-time, and must be checked immediately before they are used.
A language is strongly typed if it is impossible to perform an operation on the wrong kind of object.
Type consistency may be assured by
I. compile-time type-checking,
II. type inference, or
III. dynamic type-checking.
See: Programming Languages course

37 Type compatibility: example
Consider:
    type link = ^cell;
    var next : link;
    var last : link;
    var p : ^cell;
    var q, r : ^cell;
Under name equivalence:
> next and last have the same type
> p, q and r have the same type
> p and next have different types
Under structural equivalence all variables have the same type.
Ada/Pascal/Modula-2 are somewhat confusing: they treat distinct type definitions as distinct types, so p has a different type from q and r (!)

38 Type compatibility: Pascal-style name equivalence
Build a compile-time structure called a type graph:
> each constructor or basic type creates a node
> each name creates a leaf (associated with the type's descriptor)
Type expressions are equivalent if they are represented by the same node in the graph.
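The difference between the two notions can be illustrated with a deliberately tiny model. It handles only pointer types whose target is given by name, so structural comparison reduces to comparing target names; a full checker would compare type expressions recursively and guard against cycles. All class and method names are hypothetical.

```java
final class TypeEquivalence {
    // One node in the type graph: the pointer constructor applied to a target.
    static final class PointerType {
        final String target;

        PointerType(String target) {
            this.target = target;
        }
    }

    // Pascal-style name equivalence: both expressions are the same graph node.
    static boolean nameEquivalent(PointerType a, PointerType b) {
        return a == b;
    }

    // Structural equivalence: same constructor applied to the same target.
    static boolean structurallyEquivalent(PointerType a, PointerType b) {
        return a.target.equals(b.target);
    }

    public static void main(String[] args) {
        PointerType link = new PointerType("cell"); // type link = ^cell
        PointerType next = link;                    // var next : link
        PointerType last = link;                    // var last : link
        PointerType p = new PointerType("cell");    // var p : ^cell
        PointerType qr = new PointerType("cell");   // var q, r : ^cell
        PointerType q = qr;
        PointerType r = qr;

        System.out.println(nameEquivalent(next, last));      // prints true
        System.out.println(nameEquivalent(q, r));            // prints true
        System.out.println(nameEquivalent(p, q));            // prints false
        System.out.println(nameEquivalent(p, next));         // prints false
        System.out.println(structurallyEquivalent(p, next)); // prints true
    }
}
```

Each occurrence of the constructor ^cell creates its own node, which reproduces the strict Ada/Pascal/Modula-2 behaviour in which p differs from q and r.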

41 Type rules
Type-checking rules can be formalized to prove soundness and correctness.
$$\frac{f : A \to B \qquad x : A}{f(x) : B}$$
If f is a function from A to B, and x is of type A, then f(x) is a value of type B.
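A type checker applies such a rule by checking the premises and, if they hold, returning the type named in the conclusion. The sketch below does this for the application rule only; type names are plain strings and the class and method names are hypothetical.

```java
final class ApplicationRule {
    // The type A -> B of a one-argument function.
    static final class FunctionType {
        final String param;  // A
        final String result; // B

        FunctionType(String param, String result) {
            this.param = param;
            this.result = result;
        }
    }

    // Given f : A -> B and the type of x, returns B when x : A holds,
    // and reports a type error otherwise.
    static String typeOfApplication(FunctionType f, String argumentType) {
        if (!f.param.equals(argumentType)) {
            throw new IllegalArgumentException(
                "type error: expected " + f.param + " but found " + argumentType);
        }
        return f.result;
    }

    public static void main(String[] args) {
        FunctionType isPositive = new FunctionType("int", "boolean");
        System.out.println(typeOfApplication(isPositive, "int")); // prints boolean
    }
}
```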

42 Example: Featherweight Java
Used to prove that generics could be added to Java without breaking the type system.
Igarashi, Pierce and Wadler, "Featherweight Java: a minimal core calculus for Java and GJ", OOPSLA '99, doi.acm.org/ /

43 Can you answer these questions?
> Why can semantic analysis be performed by the parser?
> What are the pros and cons of introducing an IR?
> Why must an attribute dependency graph be acyclic?
> What would be the use of a symbol table at run-time?
> Why does Java adopt nominal (name-based) rather than structural type rules?

44 What you should know!
> Why is semantic analysis mostly context-sensitive?
> What is peephole optimization?
> Why was multi-pass semantic analysis introduced?
> What is an attribute grammar? How can it be used to support semantic analysis?
> What kind of information is stored in a symbol table?
> How is type-checking performed?

45 Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
You are free to:
> Share: copy and redistribute the material in any medium or format
> Adapt: remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
> Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
> ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
> No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
