cs 3240: languages and computation course overview sasha boldyreva

72
CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Upload: kory-daniels

Post on 26-Dec-2015

226 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

CS 3240: Languages and Computation

Course OverviewSasha Boldyreva

Page 2: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Personnel

Instructor: Alexandra (Sasha) Boldyreva Email: [email protected] Office: Klaus 3444 Office Hours:

Tue. & Wed. at 2:00 to 3:00 pm Or by appointment

TAs: TBA Email: TBA Office Hours: TBA

Page 3: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Required Textbooks

Bundle ISBN# 1418879746, including “Compiler Construction Principles and

Practice” by Kenneth C. Louden, Thompson Course Technology, 1997, ISBN 0534939724

“Introduction to Theory of Computation, Second Edition” by Michael Sipser, Thompson Course Technology, 2005, ISBN 0534950973

Page 4: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Course Objectives

Formal languages Understand definitions of regular and context-free languages

and their corresponding “machines” Understand their computational powers and limitations

Compiler concepts Understand their applications in compilers Front-end of compiler Lexical analysis, parsing, semantic analysis

Theory of computation Understand Turing machines Understand decidability

Page 5: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Course Syllabus Lexical analysis, scanners, pattern matching Regular expressions, DFAs, NFAs and automata Limits on regular expressions, pumping lemma Practical parsing, LL and LR parsing Context-free languages, grammars, Chomsky Hierarchy Pushdown automata, deterministic vs. non-deterministic Attribute grammars, type inferencing Context-free vs. context-sensitive grammars Decidable vs. Undecidable problems, Turing Machines,

Halting Problem Complexity of computation, classes of languages P/NP,

space and time completeness

Page 6: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Grading Homeworks: 25% Mini-project: 15% Midterm : 30% Final: 30% Homeworks to be submitted in class - hardcopy No late homework or assignments Homework should be concise, complete, and precise Tests will be in class

Page 7: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Class Policies

Students must write solutions to assignments completely independently

General discussions are allowed on assignments among students, but names of collaborators must be reported

Cell phones off, silence please

Page 8: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Resources

Class webpage: see T-Square

Check for schedule changes.

Page 9: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Introduction toCompiler Concepts

Page 10: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Compilers

What is a compiler? A program that translates an executable program from source

language into target language Usually source language is high-level language, and target

language is object (or machine) code Related to interpreters

Why compilers? Programming in machine (or assembly) language is tedious,

error prone, and machine dependent Historical note: In 1954, IBM started developing FORTRAN

language and its compiler

Page 11: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Why study theory of compiler?

Besides it is required… Prerequisite for developing advanced compilers,

which continues to be active as new computer architectures emerge

Useful to develop software tools that parse computer codes or strings E.g., editors, debuggers, interpreters, preprocessors, …

Important to understand how compliers work to program more effectively

Page 12: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

How Does Compiler Work?Scanner

Parser

SemanticAction

IntermediateRepresentation

IntermediateRepresentation

SemanticError

RequestToken

GetToken

Checking

Start

•Front End: Analysis of program syntax and semantics

Page 13: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Parts of Compilers

1. Lexical Analysis2. Syntax Analysis3. Semantic Analysis

4. Code Generation5. Optimization

Analysis

Synthesis

Fro

ntE

ndB

ack

End

Focus of this class.

Page 14: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

The Big Picture

Parsing: Translating code to rules of grammar. Building representation of code.

Scanning: Converting input text into stream of known objects called tokens. Simplifies parsing process.

Grammar dictates syntactic rules of language i.e., how legal sentence could be formed

Lexical rules of language dictate how legal word is formed by concatenating alphabet.

Page 15: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Overall Operation

Parser is in control of the overall operation Demands scanner to produce a token

Scanner reads input file into token buffer & forms a token (How?) Token is returned to parser

Parser attempts to match the token (How?) Failure: Syntax Error! Success:

Does nothing and returns to get next token, or Takes semantic action

Page 16: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Overall Operation

Semantic action: look up variable name If found okay If not: put in symbol table

If semantic checks succeed, do code-generation (How?)

Continue to get next token No more tokens? Done!

Page 17: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning/Tokenization

Input File Token Buffer

What does the Token Buffer contain?Token being identified

Why a two-way ( ) street? Characters can be readand unreadTermination of a token

Page 18: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Example

main()m

Page 19: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Example

main()am

Page 20: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Example

main()iam

Page 21: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Example

main()niam

Page 22: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Example

main()(niam

Page 23: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Example

main()niam

Keyword: main

Page 24: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Parser

Translating code to rules of a grammar Control the overall operation Demands scanner to produce a token Failure: Syntax Error! Success:

Does nothing and returns to get next token, orTakes semantic action

Page 25: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Grammar Rules<C-PROG> MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY><PARAMS> NULL<PARAMS> VAR <VAR-LIST><VARLIST> , VAR <VARLIST><VARLIST> NULL<MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT>

CURLYCLOSE

<DECL-STMT> <TYPE> VAR <VAR-LIST>;<ASSIGN-STMT> VAR = <EXPR>;<EXPR> VAR<EXPR> VAR<OP><EXPR><OP> +<OP> -<TYPE> INT<TYPE> FLOAT

Page 26: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner Token Buffer

Page 27: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner

"Please, get methe next token"

Token Buffer

Page 28: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner m

Page 29: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner am

Page 30: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner iam

Page 31: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner niam

Page 32: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner (niam

Page 33: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner niam

Page 34: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner

Token: main

Token Buffer

Page 35: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Demomain() { int a,b; a = b;}

Parser

Scanner

"I recognize this"

Token Buffer

Page 36: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Parsing (Matching) Start matching using a rule When match takes place at certain position,

move further (get next token & repeat) If expansion needs to be done, choose

appropriate rule (How to decide which rule to choose?)

If no rule found, declare error If several rules found, the grammar (set of rules)

is ambiguous

Page 37: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

"Please, get methe next token"

Page 38: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: MAIN

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY>

Page 39: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

"Please, get methe next token"

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY>

Page 40: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: OPENPAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY>

Page 41: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: CLOSEPAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><PARAMETERS> NULL

Page 42: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: CLOSEPAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><PARAMETERS> NULL

Page 43: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: CLOSEPAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY>

Page 44: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: CURLYOPEN

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE

Page 45: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: INT

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <TYPE> INT

Page 46: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: INT

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <TYPE> INT

Page 47: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: INT

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <TYPE> INT

Page 48: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: VAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <VARLIST> , VAR <VARLIST><VARLIST> NULL

Page 49: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ',' [COMMA]

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <VARLIST> , VAR <VARLIST><VARLIST> NULL

Page 50: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: VAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <VARLIST> , VAR <VARLIST><VARLIST> NULL

Page 51: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <VARLIST> , VAR <VARLIST><VARLIST> NULL

Page 52: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <VARLIST> , VAR <VARLIST><VARLIST> NULL

Page 53: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <VARLIST> , VAR <VARLIST><VARLIST> NULL

Page 54: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>; <VARLIST> , VAR <VARLIST><VARLIST> NULL

Page 55: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>;

Page 56: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<DECL-STMT> <TYPE>VAR<VAR-LIST>;

Page 57: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: VAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<ASSIGN-STMT> VAR = <EXPR>;<EXPR> VAR

Page 58: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: '='

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<ASSIGN-STMT> VAR = <EXPR>;<EXPR> VAR

Page 59: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: VAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<ASSIGN-STMT> VAR = <EXPR>;<EXPR> VAR

Page 60: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: VAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<ASSIGN-STMT> VAR = <EXPR>;<EXPR> VAR

Page 61: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: VAR

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<ASSIGN-STMT> VAR = <EXPR>;<EXPR> VAR

Page 62: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<ASSIGN-STMT> VAR = <EXPR>;

Page 63: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: ';'

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE<ASSIGN-STMT> VAR = <EXPR>;

Page 64: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Scanning & Parsing Combined

main() { int a,b; a = b;}

Parser

Scanner

Token: CURLYCLOSE

<C-PROG> MAIN OPENPAR <PARAMETERS> CLOSEPAR <MAIN-BODY><MAIN-BODY> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE

Page 65: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

What Is Happening?

During/after parsing?Tokens get gobbled

Symbol tablesVariables have attributesDeclaration attached attributes to variables

Semantic actionsWhat are semantic actions?

Semantic checks

Page 66: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Symbol Table

int a,b; Declares a and b

Within current scope Type integer

Use of a and b now legal

Basic Symbol Table

Name Type Scope

a int "main"

b int "main"

Page 67: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Typical Semantic Actions

Enter variable declaration into symbol table Look up variables in symbol table Do binding of looked-up variables (scoping rules, etc.) Do type checking for compatibility Keep the semantic context of processing

a + b + c t1 = a + b t2 = t1 + c

SemanticContext

Page 68: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

How Are Semantic Actions Called?

Action symbols embedded in the grammarEach action symbol represents a semantic

procedureThese procedures do things and/or return values

Semantic procedures are called by parser at appropriate places during parsing

Semantic stack implements & stores semantic records

Page 69: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Semantic Actions<decl-stmt> <type>#put-type<var-list>#do-decl<type> int | float<var-list> <var>#add-decl <var-list><var-list> <var>#add-decl<var> ID#proc-decl#put-type puts given type on semantic stack#proc-decl builds decl record for var on stack#add-decl builds decl-chain#do-decl traverses chain on semantic stack using backwards pointers entering each var into symbol table

id3

id2

id1

type

#do-decl

Name Type Scope

id1 1 3

id2 1 3

id3 1 3

decl record

Page 70: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Semantic Actions

What else can semantic actions do in addition to storing and looking up names in a symbol table?

Two type of semantic actionsChecking (binding, type compatibility,

scoping, etc.)Translation (generate temporary values,

propagate them to keep semantic context).

Page 71: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Full Compiler StructureScanner

Parser

SemanticAction

Start

CodeGeneration

CODE

SemanticError

• Most compilers have two pass

Page 72: CS 3240: Languages and Computation Course Overview Sasha Boldyreva

Summary Front-end of compiler: scanner and

parser Translation takes place in back end Scanner, parser and code generator are

automatedHow? We will answer this question in this

class