unit-iv syllabus introduction · 2019. 3. 22. · •static checks - reported during the...

9
Blog: anilkumarprathipati.wordpress.com 1 UNIT-IV SYLLABUS Semantic Analysis: Semantic Errors, Chomsky hierarchy of languages and recognizers, Type checking, type conversions, equivalence of type expressions, Polymorphic functions, overloading of functions and operators. 1 Introduction A program may have the following kinds of errors at various stages: Lexical: is a sequence of characters that does not match the pattern of any token. Name of some identifier typed incorrectly. Syntactical: is occur during the parsing of input code, and are caused by grammatically incorrect statements. Eg: An illegal character in the input, a missing operator, two operators in a row, two statements on the same line with no intervening semicolon, unbalanced parentheses, a misplaced reserved word, etc. 2 Semantical: is occur during the execution of the code, after it has been parsed as grammatically correct. These have to do not with how statements are constructed, but with what they mean. Eg: incorrect variable types or sizes, nonexistent variables, subscripts out of range, etc. Logical: is a mistake in a program's source code that results in incorrect or unexpected behaviour. It is a type of runtime error that may simply produce the wrong output or may cause a program to crash while running. Eg: code not reachable, infinite loop, etc. 3 Grammar Formally, a generative grammar G is a quad-tuple G= (V, T, P, S) Where, V is a finite set of non-terminals T is a finite set of terminals P is the finite set of production rules, each of the form (T U V)*V (T U V)* (T U V)* i.e., each production rule maps from one string of symbols to another, where, the first string contains at least one non-terminal symbol. S is the start symbol , S ϵ V

Upload: others

Post on 10-Nov-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 1

UNIT-IV SYLLABUS

Semantic Analysis: Semantic Errors, Chomsky hierarchy of languages and recognizers, Type checking, type conversions, equivalence of type expressions, Polymorphic functions, overloading of functions and operators.

1

Introduction A program may have the following kinds of errors at various stages: Lexical: is a sequence of characters that does not match the pattern of any token. Name of some identifier typed incorrectly. Syntactical: is occur during the parsing of input code, and are caused by grammatically incorrect statements. Eg: An illegal character in the input, a missing operator, two operators in a row, two statements on the same line with no intervening semicolon, unbalanced parentheses, a misplaced reserved word, etc.

2

Semantical: is occur during the execution of the code, after it has been parsed as grammatically correct. These have to do not with how statements are constructed, but with what they mean. Eg: incorrect variable types or sizes, nonexistent variables, subscripts out of range, etc. Logical: is a mistake in a program's source code that results in incorrect or unexpected behaviour. It is a type of runtime error that may simply produce the wrong output or may cause a program to crash while running. Eg: code not reachable, infinite loop, etc.

3

Grammar Formally, a generative grammar G is a quad-tuple

G= (V, T, P, S)

Where,

V is a finite set of non-terminals

T is a finite set of terminals

P is the finite set of production rules, each of the

form (T U V)*V (T U V)* (T U V)*

i.e., each production rule maps from one string of

symbols to another, where, the first string contains at

least one non-terminal symbol.

S is the start symbol , S ϵ V

Page 2: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 2

Chomsky hierarchy

Two main categories of formal grammar are:

• Generative grammars – are set of rules for generation of strings in a language.

• Analytic grammars – are set of rules to determine whether a string is a member of the language.

Types of Generative Grammar

In 1956, Noam Chomsky classified the generative grammars into four types known as the Chomsky hierarchy. Each type of generative grammar distinguishes the other types in the form of production rules.

5

Types of Grammars Unrestricted Grammar (T-0)

Formally, unrestricted grammar G is a quad-tuple

G= (V, T, P, S)

Where, V, T, S are same as in the case of generative grammar and P is the finite set of production rules, each of the form

(V U T )+ (V U T)*

i.e., no symbol (epsilon) on the left-hand side of any productions not allowed. However, can appear on the right-hand side of the productions.

Example: S AS | ϵ A aA | a aaaA ba

Types of Grammars Context Sensitive Grammar (T-1)

Formally, a context sensitive grammar G is a quad-tuple

G= (V, T, P, S)

Where, V,T,S are same as in the case of generative grammar and P is the finite set of productions, each of the form

(V U T)+ → (V U T)+

i.e., no ϵ on the left and right- hand sides of any productions. Hence, this grammar is ϵ -free.

Example: S aBb

aB bBB

Bb aa

B b

Page 3: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 3

Types of Grammars Context Free Grammar (T-2)

Formally, a context free grammar G is a quad-tuple

G= (V, T, P, S)

Where, V,T,S are same as in the case of generative grammar and P is the finite set of productions, each of the form

A (V U T)*

i.e., a single non-terminal symbol (A in this case) on left-hand side and any number of terminals and non-terminals (including ϵ ) on the right-hand side of production.

Example: S aSa

S bSb

S a|b| ϵ

Types of Grammars Regular Grammar (T-3)

Formally, a regular grammar G is a quad-tuple

G= (V, T, P, S)

Where, V,T,S are same as that for generative grammar and P is the finite set of productions, each of the form

A t

A t B

A ϵ

Where A,B ϵ V and t ϵ T

Example: S aA A aA A bB B bB B ϵ

11

Chomsky’s

hierarchy

Grammar Restrictions on productions Acceptor

Type 0

Type 1

Type 2

Type 3

Unrestricted

Context-

sensitive

Context-free

Regular

(VUT)+ (VUT)*

(VUT) + (VUT) +

A (VUT)*

A is a single non-terminal on left-

hand side of productions.

A t or

A tB or

A ϵ

Where A, B ⸦ V and t ⸦ T

TM

LBA

PDA

FA

Types of Grammars Comparison

Symbol Table

Symbol table is a data structure used by compiler to keep track of semantics of variable. i.e. symbol table stores the information about scope and binding information about names.

Symbol table is built in lexical and syntax analysis phases.

It is used by various phases as follows, semantic analysis phase refers symbol table for type conflict issues. Code generation refers symbol table to know how much run-time space is allocated? What type of space allocated?

Page 4: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 4

Use of symbol table

– To achieve compile time efficiency compiler makes use of symbol table

– It associates lexical names with their attributes.

– The items to be stored in symbol table are,

Variable name

Constants

Procedure names

Literal constants & strings

Compiler generated temporaries

Labels in source program

Contd..

Compiler uses following types of information from symbol table, i.e. attributes.

Data type

Name

Procedure declarations

Offset in storage

In case of structure or record, a pointer to structure table

For parameters, whether pass by value or reference

Number and type of arguments passed

Base address

15

Type Type: is a property of program constructs such as expressions. It defines a set of values (range of variables) and a set of operations on those values. • Compiler must check that the source program follows both the syntactic and semantic conventions of the source language. • Static checks - reported during the compilation phase. • Dynamic checks - occur during the execution of the program.

Type Expressions Example: int array[2][3]

array(2,array(3,integer))

• A basic type is a type expression

• A type name is a type expression

• A type expression can be formed by applying the array type constructor to a number and a type expression.

• A record is a data structure with named field

• A type expression can be formed by using the type constructor g for function types

• Type expressions may contain variables whose values are type expressions

Page 5: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 5

Type Equivalence

• They are the same basic type.

• They are formed by applying the same constructor to structurally equivalent types.

• One is a type name that denotes the other.

Conversions between primitive types Type conversions into expression evaluation

Page 6: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 6

Type Checking?

• Type checking is the process of verifying that each operation executed in a program respects the type system of the language.

• This generally means that all operands in any

expression are of appropriate types and number.

• Much of what we do in the semantic analysis phase is type checking.

Static type checking

• Static type checking is done at compile-time. The information the type checker needs is obtained via declarations and stored in a master symbol table.

• After this information is collected, the types involved in each operation are checked.

Cont…

• For example, if a and b are of type int and we assign very large values to them, a * b may not be in the acceptable range of int’s, or an attempt to compute the ratio between two integers may raise a division by zero. These kinds of type errors usually cannot be detected at compiler time.

Dynamic type checking

• Dynamic type checking is implemented by including type information for each data location at runtime.

• For example, a variable of type double would contain both the actual double value and some kind of tag indicating "double type".

• The execution of any operation begins by first checking these type tags. The operation is performed only if everything checks out. Otherwise, a type error occurs and usually halts execution.

Page 7: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 7

Implicit type conversion

• Many compilers have built-in functionality for correcting the simplest of type errors.

• Implicit type conversion, or coercion, is when a compiler finds a type error and then changes the type of the variable to an appropriate type.

• Ada and Pascal, for example, provide almost no automatic coercions, requiring the programmer to take explicit actions to convert between various numeric types.

Conversions between primitive types

Error Recovery

• Since type checking has the potential for catching errors in programs, it is important to do something when an error is discovered.

• The inclusion of error handling may result in a type system that goes beyond the one needed to specify correct programs.

Specifications of a simple type checker

The type of each identifier must be declared before the identifier is used. The type checker is a translation scheme that synthesizes the type of each expression from the type of its sub-expressions.

Page 8: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 8

Type Checking of Expressions

• E--> literal { E.type=char;}

• E--> num { E.type =integer;}

• E--> id { E.type =lookup(id.entry);}

• E--> E_1 mod E_2 { E.type =If (E_1.type ==integer) if (E_2. Type ==integer) integer; else type-error;

• E--> E_1[E_2] { E.type=if ((E_2.type==integer)&& (E_1.type==array(s,t)) t; else type-error;}

• E--> *E_1 { E.type = if (E_1.type ==pointer(t)) t else type-error;

Type Checking for Statements

• S--> id=E { if (id.type==E.type) void; else type-error;}

• S--> if E then S { if (E.type==boolean) S_1.type; else type-error;}

• S--> While E do S { if (E.type==boolean) S_1.type; else type-error;

• S--> S1; S2; { if (S_1.type==void) if (S_2.type ==void) void; else type-error;}

Type Checking of Functions

• T--> T ‘->’ T { T.type = T1.type -> T2.type}

• E--> E(E) {E.type = f ((E_2.type ==s) && (E_1.type == s--> t)) t; else type-error;

33

Polymorphism

The polymorphism refers to ‘one name having many forms’, i.e. ‘different behavior of an instance depending upon the situation’.

C++ implements polymorphism through overloaded functions and overloaded operators.

The term ‘overloading’ means a name having two or more distinct meanings.

Overloading occurs when the same operator or function name is used with different signatures.

Page 9: UNIT-IV SYLLABUS Introduction · 2019. 3. 22. · •Static checks - reported during the compilation phase. •Dynamic checks - occur during the execution of the program. Type Expressions

Blog: anilkumarprathipati.wordpress.com 9

Contd…

• Overloading occurs when the same operator or function name is used with different signatures.

• Both operators and functions can be overloaded. • Operators except below, are possible to overload. . (dot) :: (qualifier) ?: (ternary operator) sizeof

• Different definitions must be distinguished by their

signatures (otherwise which to call is ambiguous). – Reminder: signature is the operator/function name and the

ordered list of its argument types. – E.g., add(int,long) and add(long,int) have different

signatures.

34