Semantic Analysis (Symbol Table and Type Checking), Chapter 5


Post on 18-Dec-2015


1

Semantic Analysis (Symbol Table and Type Checking)

Chapter 5

2

The Compiler So Far

• Lexical analysis
  – Detects inputs with illegal tokens

• Parsing
  – Detects inputs with ill-formed parse trees

• Semantic analysis (contextual analysis)
  – Catches all remaining errors

3

What’s Wrong?

• Example 1
    int y = x + 3;      // x has not been declared

• Example 2
    String y = "abc";
    y++;                // ++ is not defined for String

4

Why a Separate Semantic Analysis?

• Parsing cannot catch all errors

• Some language constructs are not context-free
  – Example: all used variables must have been declared (i.e., be in scope)
    ex: { int x; { .. { .. x .. } .. } .. }
  – Example: a method must be invoked with arguments of the proper type (i.e., typing)
    ex: int f(int, int) {…} called by f('a', 2.3, 1)

5

More problems require semantic analysis
1. Is x a scalar, an array, or a function?
2. Is x declared before it is used/defined?
3. Is x defined before it is used?
4. Are any names declared but not used?
5. Which declaration of x does this reference?
6. Is an expression type-consistent?
7. Does the dimension of a reference match the declaration?
8. Where can x be stored? (heap, stack, . . . )
9. Does *p reference the result of a malloc()?
10. Is an array reference in bounds?

6

Why is semantic analysis hard?

• Needs non-local information
  – a[x] = y + z; // type consistent?

• Answers depend on values, not on syntax
  – int a[10]; a[10] = 1;

• Answers may involve computation
  – a[10 + BASE] = 6;

7

How can we answer these questions?
1. Use context-sensitive grammars (CSG)

2. Use attribute grammars (AG)
   – augment the context-free grammar with rules
   – calculate attributes for grammar symbols

3. Our approach:
   – Build the AST
   – Construct visitors (TreeWalker) that traverse the AST to collect information about names in symbol tables
   – Write various checking visitors to check for possible semantic errors

8

Symbol Tables

• Symbol tables = environments
  – Map IDs to information about them, such as types, locations, etc.
  – Definition → insert the ID into the table
  – Use → look up the ID

• Scope
  – Where the IDs are "visible"

Ex: in MiniJava
  -- formal parameters and local variables -> visible inside the method where they are defined
  -- (private) variables in a class -> visible inside the class
  -- (public) methods -> visible anywhere (unless overridden)

9

Symbol Table

• What kind of IDs should be entered?
  – class or structure names
  – variable names
  – defined constants
  – procedure and function names
  – literal constants and strings
  – source text labels

• Separate table for structure layouts (types): field offsets and lengths

10

What kind of information should be included?

• textual name
• data type
• dimension information (for aggregates)
• declaring procedure or class
• lexical level of declaration
• storage class (base address in stack; heap; global)
• size and offset in storage
• record: (pointer to) structure table
• parameter: by-reference or by-value?
• function: number and type of arguments
• …

11

Attributes of symbol table
• Attributes are properties of an ID given in its declaration
• The symbol table associates names with attributes
• Names may have different attributes depending on their meaning:
  – variables: type, procedure level, frame offset
  – types: type descriptor, data size/alignment
  – constants: type, value
  – procedures: formals (names/types), result type, block information (local decls.), frame size
  – classes: name, parent class, fields, methods, …

12

Symbol Table construction strategy

• Number of tables constructed:
  – one global symbol table: scope-level info included in each entry
  – one symbol table per scope + links to the parent scope

• Lifetime of a symbol table
  – persistent once created → multi-pass compiler
  – created on entering a scope and destroyed on leaving it → one-pass compiler

13

Environments

• A set of bindings (name → attributes associations)

Initial environment: Env0

class C {
    int a; int b; int c;
    // Env1 = {parent: Env0} + {a -> int, b -> int, c -> int}
    public void m() {
        System.out.println(a + c);
        int j = a + b;
        // Env2 = {parent: Env1} + {j -> int}
        String a = "hello";
        // Env3 = Env2 + {a -> String}
        System.out.println(a);

14

Environments (Cont’d)

        // Env3 = Env2 + {a -> String}
        System.out.println(a);
    }
    // leaving m(): back to Env1
}
// leaving class C: back to Env0
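The environment chain traced above can be sketched as one small table per scope with a link to the enclosing scope. This is only an illustration: the names ScopeDemo, Scope, declare and lookup are hypothetical and do not come from the slides, and types are represented as plain strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: one symbol table per scope, linked to the enclosing scope's table. */
public class ScopeDemo {
    static class Scope {
        private final Scope parent;                                    // enclosing scope, null for the outermost one
        private final Map<String, String> bindings = new HashMap<>();  // name -> type name

        Scope(Scope parent) { this.parent = parent; }

        /** Declares a name in this scope; returns false on a duplicate declaration. */
        boolean declare(String name, String type) {
            return bindings.putIfAbsent(name, type) == null;
        }

        /** Looks the name up in this scope first, then in the enclosing scopes. */
        String lookup(String name) {
            for (Scope s = this; s != null; s = s.parent) {
                String t = s.bindings.get(name);
                if (t != null) return t;
            }
            return null; // undeclared
        }
    }

    public static void main(String[] args) {
        Scope classScope = new Scope(null);          // fields of class C
        classScope.declare("a", "int");
        classScope.declare("b", "int");
        Scope methodScope = new Scope(classScope);   // body of m()
        methodScope.declare("a", "String");          // shadows the field a
        System.out.println(methodScope.lookup("a")); // String
        System.out.println(methodScope.lookup("b")); // int
        System.out.println(classScope.lookup("a"));  // int
    }
}
```

Leaving a scope needs no undo step in this design: the checker simply goes back to using the parent Scope object.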

15

Implementing Environments

• Functional style (non-destructive)
  – enter scope → keep the previous env and create a new one
  – leave scope → discard/save the new one and go back to the old one
  – implementation: hash table or tree

• Imperative style (using a hash table)
  – enter scope → mark a new scope
  – definition encountered → put a new key-value pair
  – leave scope → pop all key-value pairs added since the last scope entry
  – a single global table only
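A minimal sketch of the imperative style, assuming string keys and string type names for brevity. The class ImperativeTable and its method names are hypothetical; the Table class shown later in these slides achieves the same effect with Binder chains instead of per-scope name lists.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: a single global table whose scopes are undone destructively. */
public class ImperativeTable {
    // name -> stack of bindings; the innermost binding is on top
    private final Map<String, Deque<String>> table = new HashMap<>();
    // one list per open scope, recording the names bound in that scope
    private final Deque<List<String>> scopes = new ArrayDeque<>();

    public ImperativeTable() { scopes.push(new ArrayList<>()); } // outermost scope

    /** Enter scope: mark a new scope. */
    public void beginScope() { scopes.push(new ArrayList<>()); }

    /** Definition encountered: put a new key-value pair. */
    public void put(String name, String type) {
        table.computeIfAbsent(name, k -> new ArrayDeque<>()).push(type);
        scopes.peek().add(name);
    }

    /** Use: returns the innermost binding, or null if the name is unbound. */
    public String get(String name) {
        Deque<String> stack = table.get(name);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
    }

    /** Leave scope: pop every binding added since the matching beginScope. */
    public void endScope() {
        for (String name : scopes.pop()) {
            Deque<String> stack = table.get(name);
            stack.pop();
            if (stack.isEmpty()) table.remove(name);
        }
    }
}
```

The sketch assumes that beginScope and endScope calls are balanced; it does not guard against closing the outermost scope.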

16

Multiple Symbol Tables : Java-style

package M;

class E {
    static int a = 5;
}

class N {
    static int b = 10;
    static int a = E.a + b;
}

class D {
    static int d = E.a + N.a;
}

Initial environment: Env0

Env1 = {a -> int}
Env2 = {E -> Env1}
Env3 = {b -> int, a -> int}
Env4 = {N -> Env3}
Env5 = {d -> int}
Env6 = {D -> Env5}
Env7 = {E -> Env1, N -> Env3, D -> Env5}

17

Implementation – Functional Symbol Table

• Efficient Functional Approach’a

would return [a

• If implemented with a hash table, every new scope would have to copy all O(n) buckets of the old table

• Is this a good idea?

18

Implementation – Imperative Symbol Table (inefficient nondestructive update)

[Figure: hash table with bindings a, b, c, d; operations shown: Update (clone & put), Undo]

See Appel Program 5.2 (p. 106)

19

Implementation - Tree

[Figure: binary search trees m1 (nodes: dog 3, bat 1, camel 2) and m2 (nodes: dog 3, emu 42)]

m1 = { bat |-> 1 , camel |-> 2, dog |-> 3 }

m2 = {m1 + emu |-> 42 }

How could this be implemented?

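Want m2 from m1 without copying all of m1: in a balanced binary search tree, only the O(log n) nodes on the path to the new entry need to be copied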
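One possible answer, sketched under the assumption of a plain (unbalanced) binary search tree with string keys and int values: insertion copies only the nodes on the path from the root to the new entry and shares every other subtree with the old map. The class name PersistentTree and its methods are hypothetical, not taken from the slides.

```java
/** Sketch: a persistent (functional) binary search tree with path copying. */
public class PersistentTree {
    final String key;
    final int value;
    final PersistentTree left, right;

    PersistentTree(String key, int value, PersistentTree left, PersistentTree right) {
        this.key = key;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    /** Returns a new tree that also contains key -> value; the old tree t is left unchanged. */
    static PersistentTree insert(PersistentTree t, String key, int value) {
        if (t == null) return new PersistentTree(key, value, null, null);
        int c = key.compareTo(t.key);
        if (c < 0) return new PersistentTree(t.key, t.value, insert(t.left, key, value), t.right);
        if (c > 0) return new PersistentTree(t.key, t.value, t.left, insert(t.right, key, value));
        return new PersistentTree(key, value, t.left, t.right); // rebinding an existing key
    }

    /** Returns the value bound to key, or null if there is none. */
    static Integer lookup(PersistentTree t, String key) {
        while (t != null) {
            int c = key.compareTo(t.key);
            if (c == 0) return t.value;
            t = (c < 0) ? t.left : t.right;
        }
        return null;
    }

    public static void main(String[] args) {
        PersistentTree m1 = insert(insert(insert(null, "dog", 3), "bat", 1), "camel", 2);
        PersistentTree m2 = insert(m1, "emu", 42);
        System.out.println(lookup(m1, "emu"));   // null: m1 is unchanged
        System.out.println(lookup(m2, "emu"));   // 42
        System.out.println(m1.left == m2.left);  // true: the bat/camel subtree is shared
    }
}
```

Without rebalancing the path can degenerate to length n, so the O(log n) bound holds only if the tree is kept balanced.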

20

Symbols vs. Strings as table keys
• Problem with strings as table keys:
  – Comparing long names is time-consuming.
  – Why not compare addresses instead, if equal strings are guaranteed to be the identical object?

• Symbol:
  – a wrapper for (interned) Strings

• Symbol representation
  – Comparing symbols for equality is fast.
  – Extracting an integer hash key is fast.
  – Comparing two symbols for "greater-than" is fast. (monotonic?)

• Properties:
  – For Symbols s1, s2: s1 == s2 iff s1.equals(s2) iff s1.name == s2.name

• public class Symbol {
      public String toString();
      public static Symbol getSymbol(String n);
  }

21

symbol.Symbol

import java.util.Hashtable;
import java.util.Map;

public class Symbol {
    public String name;

    // A Symbol cannot be constructed directly; use getSymbol().
    private Symbol(String n) { name = n; }

    public String toString() { return name; }

    // One Symbol object per distinct (interned) string.
    private static Map<String, Symbol> map = new Hashtable<>();

    public static Symbol getSymbol(String n) {
        String u = n.intern();
        Symbol s = map.get(u);
        if (s == null) {
            s = new Symbol(u);
            map.put(u, s);
        }
        return s;
    }
}

22

Symbol Table Implementation (efficient destructive update)

Using a Hash Table

[Figure: hash table with bucket chains of bindings for a, b, c, each chain ending in null; top: Symbol; marker: Binder]

23

Some sample program(I)

/**
 * The Table class is similar to java.util.Dictionary,
 * except that each key must be a Symbol and there is
 * a scope mechanism.
 */
public class Table {
    private java.util.Hashtable<Symbol, Binder> dict = new java.util.Hashtable<>();
    private Symbol top;     // most recently bound symbol in the current scope
    private Binder marks;   // stack of scope markers

    public Table() {}

24

Some sample program(II)

    /**
     * Gets the object associated with the specified
     * symbol in the Table.
     */
    public Object get(Symbol key) {
        Binder e = (Binder) dict.get(key);
        if (e == null) return null;
        else return e.value;
    }

    /**
     * Puts the specified value into the Table,
     * bound to the specified Symbol.
     */
    public void put(Symbol key, Object value) {
        // The new Binder remembers the previous top and any older binding of key.
        dict.put(key, new Binder(value, top, (Binder) dict.get(key)));
        top = key;
    }

25

Some sample program(III)

    /**
     * Remembers the current state of the Table.
     */
    public void beginScope() {
        marks = new Binder(null, top, marks);
        top = null;
    }

    /**
     * Restores the table to what it was at the most
     * recent beginScope that has not already been ended.
     */
    public void endScope() {
        while (top != null) {
            Binder e = (Binder) dict.get(top);
            if (e.tail != null) dict.put(top, e.tail);
            else dict.remove(top);
            top = e.prevtop;
        }
        top = marks.prevtop;
        marks = marks.tail;
    }
}

26

Some sample program(IV)

package Symbol;

class Binder {

Object value;

Symbol prevtop;

Binder tail;

Binder(Object v, Symbol p, Binder t) {

value=v; prevtop=p; tail=t;

}

}

27

Type-Checking in MiniJava

• Bindings for type-checking in MiniJava
  – Variable and formal parameter
    • Var name <-> type of variable
  – Method
    • Method name <-> result type, parameters (including position information), local variables
  – Class
    • Class name <-> variables, method declarations, parent class

28

Symbol Table: example

See Figure 5.7 (next slide)

• Primitive types
  – int -> IntegerType()
  – boolean -> BooleanType()

• Other types
  – int[] -> IntArrayType()
  – class -> IdentifierType(String s)

29

A MiniJava Program and its symbol table(Figure 5.7)

class B {
    C f; int[] j; int q;

    public int start(int p, int q) {
        int ret; int a;
        /* … */
        return ret;
    }

    public boolean stop(int p) {
        /* … */
        return false;
    }
}

class C { /* … */ }

Symbol table (keys: B, C)

B
  FIELDS
    f      C
    j      int[]
    q      int
  METHODS
    start  int
      PARAMS
        p    int
        q    int
      LOCALS
        ret  int
        a    int
    stop   boolean
      PARAMS
        p    int
      LOCALS
        ….
C

30

Main Symbol tables in MiniJava

• Table class hierarchy:
  – NameTypeTable
  – GlobalTable
  – ClassTable
  – MethodTable

Containment hierarchy:

GlobalTable → ClassTables → MethodTables → NameTypeTables (locals + params)
              ClassTables → NameTypeTables (for fields)

31

NameTypeTable

• String name; // name of var + class + method etc.
• Type type; // type of var + local + method + field + class etc.
• NameTypeTable parent; // parent table
• Map<Symbol, Object> map; // locals + formals + fields + attr …
• getName() // name of the class/method/var associated with this table
• getType() // type of the class/method/var associated with this table
• getFromHierarchy(Symbol)
• boolean put(Symbol, Object);
• boolean contains(Symbol);
• Object get(Symbol);
• dump()

32

getMethod(ClassName, methodName)

33

Type getVarType(String varName) ;

34

Additional methods

• getVarType(String id) // in MethodTable
  – finds the type of variable id as seen from the current method
  – Precedence:
    1. Locals in the method
    2. Formal parameters in the parameter list
    3. Fields in the containing class
    4. Variables in the parent class

• getMethod(String) // in ClassTable
  – the method may be defined in a parent class

35

Type-Checking: Two Phases
• Build the symbol table
• Type-check statements and expressions

public class Main {
    public static void main(String[] args) {
        try {
            Program prog = new MiniJavaParser(System.in).Program();
            // Phase 1: build the symbol table
            MiniJavaSymbolTableBuilder v1 = new MiniJavaSymbolTableBuilder();
            v1.visit(prog);
            // Phase 2: type-check against it
            new MiniJavaTypeCheckVisitor(v1.getTable()).visit(prog);
        } catch (ParseException e) {
            System.out.println(e.toString());
        }
    }
}

36

Build Symbol Table

public class MiniJavaSymbolTableBuilder extends DepthFirstVisitor {
    private GlobalTable gtable = new …;
    private ClassTable cTable;
    private MethodTable mTable;
    boolean inMethod;
    String id;
    Type cType; // Type t; // Identifier i;

37

Build Symbol Table (Cont’d)

public void visit(VarDecl n) {
    super.visit(n); // this will set id and cType
    if (inMethod) {
        if (!mTable.addLocal(id, cType)) {
            err.printf("Duplicate locals/parameters defined: %s %s !!", id, cType);
        }
    } else {
        if (!cTable.addField(id, cType)) {
            err.printf("Duplicate fields defined: %s %s !!", id, cType);
        }
    }
}

38

visit() methods related to symbol table building
• They must be overridden from the parent DepthFirstVisitor class.

public Type visit(MainClass n);
public Type visit(ClassDeclSimple n);
public Type visit(ClassDeclExtends n);
public Type visit(VarDecl n);
public Type visit(MethodDecl n);
public Type visit(Formal n);
public Type visit(IntArrayType n);
public Type visit(BooleanType n);
public Type visit(IntegerType n);
public Type visit(IdentifierType n);

39

MiniJavaTypeCheckVisitor(SymbolTable);

package visitor;

import syntaxtree.*;

// statement & class member visitor
public class MiniJavaTypeCheckVisitor extends DepthFirstVisitor {
    private ClassTable cTable;
    private MethodTable mTable;
    private GlobalTable gTable;
    Type cType;
    String id;
    boolean inMethod;

    // ev is an expression visitor that returns the type of each scanned expression.
    private ExpTypeEvaluator ev = new ExpTypeEvaluator();

40

public MiniJavaTypeCheckVisitor

(GlobalTable s)

{

super();

gTable = s;

}

41

MiniJavaTypeCheckVisitor(SymbolTable); - Cont’d

// i = e;
// Identifier i; Exp e;
public void visit(Assign n) {
    Type type1 = ev.visit(n.i);
    Type type2 = (Type) ev.visit(n.e);
    if (!type1.equals(type2)) {
        err.printf("[%s:%s] Expression [%s] of type [%s] could not be assigned to [%s] of type [%s]!\n",
                   n.bl, n.bc, n.e, type2, n.i, type1);
    }
}

42

ExpTypeEvaluator: an inner class of MiniJavaTypeCheckVisitor

// so it can share all fields declared in the containing class, MiniJavaTypeCheckVisitor

public class ExpTypeEvaluator extends DepthFirstVisitorR {
    …

    // Exp e1, e2;
    public Type visit(Plus n) {
        if (!(visit(n.e1) == IntegerType.TYPE)) {
            err.printf("Left side of Plus must be of type integer");
        }
        if (!(visit(n.e2) instanceof IntegerType)) {
            err.printf("Right side of Plus must be of type integer");
        }
        // Report the error but still return int, so checking can continue.
        return IntegerType.TYPE;
    }

43

visit(IdentifierType)

/**
 * 1. Make sure that n has been defined in the global table.
 * 2. Set cType = n.
 */
public void visit(IdentifierType n) {
    if (!gTable.contains(n.s)) {
        // n.bl and n.bc: line and column of n, as in the other error messages
        err.printf("%s:%s: The type: %s was not defined!!", n.bl, n.bc, n.s);
    }
    cType = n;
}

44

visit(Program)

// MainClass m;

// List<ClassDecl> cl;

public void visit(Program n) {

visit(n.m);

visitList(n.cl);

}

45

visit(While)

// Exp e;
// Statement s;
public void visit(While n) {
    Type tpe = (Type) ev.visit(n.e);
    if (!BooleanType.TYPE.equals(tpe)) {
        err.printf("%s:%s: the conditional: %s in While statement is not a boolean expression!\n",
                   n.e.bl, n.e.bc, n.e);
    }
    visit(n.s);
}

46

MiniJavaTypeCheckVisitor extends DepthFirstVisitor

public void visit(Program), visit(MainClass n);

visit(ClassDeclSimple n);

visit(ClassDeclExtends n);

visit(MethodDecl n);

visit(Formal n), visit(VarDecl n);

visit(If n), visit(While n);

visit(Print n);

visit(Assign n); visit(ArrayAssign n);

visit(Identifier); visit(IdentifierType);

47

ExpTypeEvaluator extends DepthFirstVisitorR
• Note: each method must return a result type.

public Type visit(And n);            // boolean
public Type visit(LessThan n);       // boolean
public Type visit(Plus n);           // int
public Type visit(Minus n);
public Type visit(Times n);
public Type visit(ArrayLookup n);    // int
public Type visit(ArrayLength n);    // int
public Type visit(Call n);           // result type
public Type visit(IntegerLiteral n); // int
public Type visit(True n);           // boolean
public Type visit(False n);
public Type visit(IdentifierExp n);  // symbol table lookup
public Type visit(This n);           // current class
public Type visit(NewArray n);       // int[]
public Type visit(NewObject n);      // IdentifierType(n.id)
public Type visit(Not n);            // boolean
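To make the "must return a result type" requirement concrete, here is a self-contained sketch of a type-returning visitor over a four-node expression language. Everything in it (ExpTypeDemo, the tiny AST, the Type enum, the error list) is a hypothetical stand-in for the MiniJava syntax tree and DepthFirstVisitorR, which are not reproduced in these slides.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: an expression visitor that returns the type of every node it visits. */
public class ExpTypeDemo {
    enum Type { INT, BOOLEAN }

    interface Visitor<R> {
        R visit(IntLit n);
        R visit(True n);
        R visit(Plus n);
        R visit(LessThan n);
    }

    interface Exp { <R> R accept(Visitor<R> v); }

    static class IntLit implements Exp {
        final int value;
        IntLit(int value) { this.value = value; }
        public <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    static class True implements Exp {
        public <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    static class Plus implements Exp {
        final Exp e1, e2;
        Plus(Exp e1, Exp e2) { this.e1 = e1; this.e2 = e2; }
        public <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    static class LessThan implements Exp {
        final Exp e1, e2;
        LessThan(Exp e1, Exp e2) { this.e1 = e1; this.e2 = e2; }
        public <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    static class TypeEvaluator implements Visitor<Type> {
        final List<String> errors = new ArrayList<>();

        public Type visit(IntLit n) { return Type.INT; }
        public Type visit(True n) { return Type.BOOLEAN; }

        public Type visit(Plus n) {
            expect(n.e1, Type.INT, "Left side of Plus");
            expect(n.e2, Type.INT, "Right side of Plus");
            return Type.INT; // the result type is int even if an operand was wrong
        }

        public Type visit(LessThan n) {
            expect(n.e1, Type.INT, "Left side of LessThan");
            expect(n.e2, Type.INT, "Right side of LessThan");
            return Type.BOOLEAN;
        }

        // Visits e and records an error if its type is not the wanted one.
        private void expect(Exp e, Type want, String what) {
            if (e.accept(this) != want) errors.add(what + " must be of type " + want);
        }
    }

    public static void main(String[] args) {
        TypeEvaluator ev = new TypeEvaluator();
        System.out.println(new Plus(new IntLit(1), new True()).accept(ev)); // INT
        System.out.println(ev.errors); // [Right side of Plus must be of type INT]
    }
}
```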

48

Overloading of Operators, …
• When operators are overloaded, the compiler must explicitly generate the code for the type conversion.
  – 2 + 2
  – 2.0 + 3.4
  – 2.4 + 4
  – "abc" + 4
  – needs built-in system functions such as int2float, float2int, int2str, etc.

• For an assignment statement, both sides normally have the same type. When classes can be extended, the right-hand side may also be a subtype of the left-hand side's type.
  – long x = (int) y + 3
  – Person p = new Student();
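The subtype rule for assignments can be sketched as a walk up the parent-class chain. The class SubtypeDemo and its methods are hypothetical, class and type names are plain strings, and the sketch assumes the inheritance chain is acyclic (a cycle would make the loop run forever).

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: is a value of type rhs assignable to a variable of type lhs? */
public class SubtypeDemo {
    // class name -> parent class name (null when the class extends nothing)
    private final Map<String, String> parentOf = new HashMap<>();

    void declare(String cls, String parent) { parentOf.put(cls, parent); }

    /** True if rhs equals lhs or lhs is reachable from rhs through parent links. */
    boolean isAssignable(String lhs, String rhs) {
        for (String c = rhs; c != null; c = parentOf.get(c)) {
            if (c.equals(lhs)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SubtypeDemo types = new SubtypeDemo();
        types.declare("Person", null);
        types.declare("Student", "Person");
        System.out.println(types.isAssignable("Person", "Student")); // true
        System.out.println(types.isAssignable("Student", "Person")); // false
    }
}
```

With such a helper, the equality test in visit(Assign) would become a call like isAssignable(type1, type2) instead of type1.equals(type2).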

49

Error Handling

• For a type error or an undeclared identifier, the checker should print an error message.
• And it must go on…
• Recovery from type errors?
  – Proceed as if the program were correct.
  – Not a big deal in our homework.
• Example:
  – int i = new C();
  – int j = i + 1;
  – i still needs to be inserted into the symbol table as an integer so that the rest can be type-checked.
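A small sketch of this recovery policy, with hypothetical names (RecoveryDemo, declare, plusOne) and types as strings: the declaration is entered with its declared type even when the initializer has the wrong type, so the later use of i in i + 1 does not produce a second, cascading error.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: report a type error, then carry on as if the declaration were correct. */
public class RecoveryDemo {
    final Map<String, String> vars = new HashMap<>();   // variable name -> declared type
    final List<String> errors = new ArrayList<>();

    /** Checks a declaration "declaredType name = <expression of type initType>;". */
    void declare(String declaredType, String name, String initType) {
        if (!declaredType.equals(initType)) {
            errors.add("cannot initialize " + name + " of type " + declaredType
                       + " with a value of type " + initType);
        }
        vars.put(name, declaredType); // entered even after an error
    }

    /** Type of the expression "name + 1"; reports an error unless name is an int. */
    String plusOne(String name) {
        if (!"int".equals(vars.get(name))) {
            errors.add(name + " must be of type int");
        }
        return "int";
    }

    public static void main(String[] args) {
        RecoveryDemo tc = new RecoveryDemo();
        tc.declare("int", "i", "C");                // int i = new C();  -> one error
        tc.declare("int", "j", tc.plusOne("i"));    // int j = i + 1;    -> no further error
        System.out.println(tc.errors.size());       // 1
    }
}
```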

50

Type Checking For MiniJava

51

Type Checking for MiniJava (I)

package syntaxtree;

Program(MainClass m, List<ClassDecl> cl)
// recursively type check m and cl

MainClass(Identifier i1, Identifier i2, Statement s)
// type check s
----------------------------
abstract class ClassDecl

ClassDeclSimple(Identifier i, List<VarDecl> vl, List<MethodDecl> ml)
// recursively type check vl and ml

ClassDeclExtends(Identifier i, Identifier j, List<VarDecl> vl, List<MethodDecl> ml)
// like above, but must ensure that j (the parent class) is a declared class

52

-----------------------------

VarDecl(Type t, Identifier i)

Formal(Type t, Identifier i)

//2. recursively type check that t is a built-in or declared type

//will be checked in visit(IdentifierType t)

MethodDecl(Type t, Identifier i, List<Formal> fl, List<VarDecl> vl, List<Statement> sl, Exp e)

// same as (2). recursively type check fl, vl, sl and e.

// 3. type(e) == t

53

Type Checking for MiniJava (II)

abstract class Type

IntArrayType()

BooleanType()

IntegerType()

// do nothing

IdentifierType(String s)

// check that gTable.contains(s),

// i.e., s must be a class name.

---------------------------

54

abstract class Statement

Block(List<Statement> sl)

// recursively type check sl

// i.e., call visitList(sl);

// appear in DepthFirstVisitorR

If(Exp e, Statement s1, Statement s2)

//4. type(e) == boolean

// recursively type check s1 and s2

While(Exp e, Statement s)

//4 + recursively check s.

55

Print(Exp e)

//5. type(e) == int

Assign(Identifier i, Exp e)

// 7. check type(i) == type(e)

// i[e1] = e2 ;

ArrayAssign(Identifier i,Exp e1,Exp e2)

// 8. type(i) == int[] &&

type(e1) == type(e2) == int

56

Type checking for MiniJava (III)

abstract class Exp

// Arithmetic expressions
Plus(Exp e1, Exp e2), Minus(Exp e1, Exp e2), Times(Exp e1, Exp e2)
// 10. type(e1) == type(e2) == int
// 11. type(rlt) = int

ArrayLookup(Exp e1, Exp e2) // e1[e2]
// type(e1) == int[] && type(e2) == int
// type(rlt) = int

ArrayLength(Exp e) // e.length
// type(e) == int[]; type(rlt) = int

IntegerLiteral(int i) // 23
// type(rlt) = int

57

LessThan(Exp e1, Exp e2) // e1 < e2

// type(e1) == type(e2) == int

// type(rlt) = boolean

True() False()

// type(rlt) = boolean

And(Exp e1, Exp e2) // e1 && e2

// type(e1) == type(e2) == boolean

// type(rlt) = boolean

Not(Exp e) // !e

// type(e) == boolean

// return BooleanType.TYPE

58

IdentifierExp(String s)

// s (a field, formal, or local) was declared and is in scope

// type(rlt) = the type part of the declaration that s is bound to

This()

// type(rlt) = current class type

NewArray(Exp e)

// type(e) == int

// type(rlt) = int[]

NewObject(Identifier i)

// i is a class name in scope

// type(rlt) = IdentifierType(name of i)

59

--------------------------------------------
Identifier(String s) // i = e;
// same as IdentifierExp
--------------------------------------------
Call(Exp e, Identifier i, List<Exp> el)
// ex: Set s = …; s.getMembers(20, 100)
//     is Call(s, "getMembers", [20, 100]), with
//     getMembers: (int x int) -> Set declared in class Set
// c1. type(e) == IdentifierType(c) for some class c
// c2. there is a method m in c named i with formal parameters of types fl
// c3. fl and el have the same size k >= 0, and
// c4. for j = 1 .. k, type(el(j)) == fl(j)
// type(rlt) = returnType(m)
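Rules c1 to c4 can be sketched as a single lookup-and-compare routine. CallCheckDemo, MethodSig and checkCall are hypothetical names, types are plain strings, and the sketch makes two simplifying assumptions: it looks only in the receiver's own class (not in parent classes) and it requires exact type equality for arguments rather than subtyping.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: type checking a method call e.i(el) against rules c1..c4. */
public class CallCheckDemo {
    static class MethodSig {
        final String returnType;
        final List<String> formals;   // formal parameter types, in order
        MethodSig(String returnType, String... formals) {
            this.returnType = returnType;
            this.formals = Arrays.asList(formals);
        }
    }

    // class name -> (method name -> signature)
    final Map<String, Map<String, MethodSig>> classes = new HashMap<>();

    void declare(String cls, String method, MethodSig sig) {
        classes.computeIfAbsent(cls, k -> new HashMap<>()).put(method, sig);
    }

    /** Returns the result type of the call, or null if any of c1..c4 fails. */
    String checkCall(String receiverType, String method, List<String> argTypes) {
        Map<String, MethodSig> methods = classes.get(receiverType);
        if (methods == null) return null;                        // c1: receiver is not a known class
        MethodSig m = methods.get(method);
        if (m == null) return null;                              // c2: no such method in that class
        if (m.formals.size() != argTypes.size()) return null;    // c3: wrong number of arguments
        for (int j = 0; j < argTypes.size(); j++) {              // c4: argument types must match
            if (!m.formals.get(j).equals(argTypes.get(j))) return null;
        }
        return m.returnType;
    }

    public static void main(String[] args) {
        CallCheckDemo tc = new CallCheckDemo();
        tc.declare("Set", "getMembers", new MethodSig("Set", "int", "int"));
        System.out.println(tc.checkCall("Set", "getMembers", Arrays.asList("int", "int"))); // Set
        System.out.println(tc.checkCall("Set", "getMembers", Arrays.asList("int")));        // null
    }
}
```

A real checker would report a distinct error message for each failing rule instead of returning null for all of them.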

60

visit(IdentifierExp) in ExpTypeEvaluator

public Type visit(IdentifierExp n) {
    assert mTable != null && cTable != null;
    Type tpe = mTable.getVarType(n.s);
    if (tpe == null) {
        err.printf("The variable %s at %s:%s has not been declared!\n", n.s, n.bl, n.bc);
    }
    return tpe;
}