Post on 26-Dec-2015
CS 3240: Languages and Computation
Course Overview
Sasha Boldyreva
Personnel
Instructor: Alexandra (Sasha) Boldyreva
  Email: aboldyre@cc.gatech.edu
  Office: Klaus 3444
  Office Hours: Tue. & Wed., 2:00 to 3:00 pm, or by appointment
TAs: TBA
  Email: TBA
  Office Hours: TBA
Required Textbooks
Bundle ISBN 1418879746, including:
  “Compiler Construction: Principles and Practice” by Kenneth C. Louden, Thomson Course Technology, 1997, ISBN 0534939724
  “Introduction to the Theory of Computation, Second Edition” by Michael Sipser, Thomson Course Technology, 2005, ISBN 0534950973
Course Objectives
Formal languages
  Understand definitions of regular and context-free languages and their corresponding “machines”
  Understand their computational powers and limitations
Compiler concepts
  Understand their applications in compilers
  Front-end of compiler
  Lexical analysis, parsing, semantic analysis
Theory of computation
  Understand Turing machines
  Understand decidability
Course Syllabus
  Lexical analysis, scanners, pattern matching
  Regular expressions, DFAs, NFAs and automata
  Limits on regular expressions, pumping lemma
  Practical parsing, LL and LR parsing
  Context-free languages, grammars, Chomsky Hierarchy
  Pushdown automata, deterministic vs. non-deterministic
  Attribute grammars, type inferencing
  Context-free vs. context-sensitive grammars
  Decidable vs. undecidable problems, Turing Machines, Halting Problem
  Complexity of computation, classes of languages P/NP, space and time completeness
Grading
  Homeworks: 25%
  Mini-project: 15%
  Midterm: 30%
  Final: 30%
  Homeworks to be submitted in class, in hardcopy
  No late homework or assignments
  Homework should be concise, complete, and precise
  Tests will be in class
Class Policies
Students must write solutions to assignments completely independently
General discussions are allowed on assignments among students, but names of collaborators must be reported
Cell phones off, silence please
Resources
Class webpage: see T-Square
Check for schedule changes.
Introduction to Compiler Concepts
Compilers
What is a compiler?
  A program that translates an executable program from a source language into a target language
  Usually the source language is a high-level language, and the target language is object (or machine) code
  Related to interpreters
Why compilers?
  Programming in machine (or assembly) language is tedious, error prone, and machine dependent
  Historical note: In 1954, IBM started developing the FORTRAN language and its compiler
Why study the theory of compilers?
  Besides the fact that it is required…
  Prerequisite for developing advanced compilers, which continues to be an active area as new computer architectures emerge
  Useful for developing software tools that parse computer code or strings
    E.g., editors, debuggers, interpreters, preprocessors, …
  Important to understand how compilers work in order to program more effectively
How Does a Compiler Work?
[Diagram. Boxes: Start, Scanner, Parser, Semantic Action, Intermediate Representation. Labels: Request Token, Get Token, Checking, Semantic Error]
• Front End: Analysis of program syntax and semantics
Parts of Compilers
Analysis (Front End)
  1. Lexical Analysis
  2. Syntax Analysis
  3. Semantic Analysis
Synthesis (Back End)
  4. Code Generation
  5. Optimization
Focus of this class.
The Big Picture
Parsing: Translating code to rules of a grammar. Building a representation of the code.
Scanning: Converting input text into a stream of known objects called tokens. Simplifies the parsing process.
The grammar dictates the syntactic rules of the language, i.e., how a legal sentence can be formed.
The lexical rules of the language dictate how a legal word is formed by concatenating symbols of the alphabet.
Overall Operation
Parser is in control of the overall operation
  Demands scanner to produce a token
Scanner reads input file into token buffer & forms a token (How?)
  Token is returned to parser
Parser attempts to match the token (How?)
  Failure: Syntax Error!
  Success: does nothing and returns to get next token, or takes semantic action
Overall Operation
Semantic action: look up variable name
  If found: okay
  If not: put in symbol table
If semantic checks succeed, do code generation (How?)
Continue to get next token
  No more tokens? Done!
Scanning/Tokenization
Input File <-> Token Buffer
What does the Token Buffer contain?
  The token being identified
Why a two-way (<->) street?
  Characters can be read and unread
  Termination of a token
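Example
Input: main()
Token buffer contents after each step (newest character drawn on the left, as on the slides):
  1. m
  2. am
  3. iam
  4. niam
  5. (niam   ('(' has been read)
  6. niam    ('(' is unread, which terminates the token)
Keyword: main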
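The read/unread behaviour in this example can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `Scanner`, its methods, and the helper `tokenize` are hypothetical, and a real scanner would also classify tokens (keyword, variable, and so on) rather than return raw text.

```python
class Scanner:
    """Character-level scanner with one-character pushback (hypothetical names)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # index of the next unread character

    def read(self):
        """Return the next character, or '' at end of input."""
        if self.pos >= len(self.text):
            return ""
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    def unread(self):
        """Push the last character back so the next read() sees it again."""
        self.pos -= 1

    def next_token(self):
        """Return the text of the next token, or None when input is exhausted."""
        ch = self.read()
        while ch.isspace():  # skip blanks between tokens
            ch = self.read()
        if ch == "":
            return None
        if ch.isalpha() or ch == "_":
            buffer = [ch]  # token buffer: the token being identified
            ch = self.read()
            while ch.isalnum() or ch == "_":
                buffer.append(ch)
                ch = self.read()
            if ch != "":
                self.unread()  # this character terminated the token; give it back
            return "".join(buffer)
        return ch  # single-character token such as '(' or ';'


def tokenize(text):
    """Collect every token the scanner produces for the given text."""
    scanner = Scanner(text)
    tokens = []
    token = scanner.next_token()
    while token is not None:
        tokens.append(token)
        token = scanner.next_token()
    return tokens
```

Calling `tokenize("main()")` reads `m`, `a`, `i`, `n`, then `(`, unreads the `(`, and returns `main` before handling the parentheses as separate tokens.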
Parser
Translating code to rules of a grammar
Controls the overall operation
Demands scanner to produce a token
  Failure: Syntax Error!
  Success: does nothing and returns to get next token, or takes semantic action
Grammar Rules
<C-PROG> -> MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY>
<PARAMS> -> NULL
<PARAMS> -> VAR <VAR-LIST>
<VAR-LIST> -> , VAR <VAR-LIST>
<VAR-LIST> -> NULL
<MAIN-BODY> -> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE
<DECL-STMT> -> <TYPE> VAR <VAR-LIST>;
<ASSIGN-STMT> -> VAR = <EXPR>;
<EXPR> -> VAR
<EXPR> -> VAR <OP> <EXPR>
<OP> -> +
<OP> -> -
<TYPE> -> INT
<TYPE> -> FLOAT
Demo
main() { int a,b; a = b;}
  1. Parser to Scanner: "Please, get me the next token"
  2. Scanner fills the Token Buffer one character at a time (newest character on the left): m, am, iam, niam, (niam, niam
  3. Scanner to Parser: Token: main
  4. Parser: "I recognize this"
Parsing (Matching)
Start matching using a rule
When a match takes place at a certain position, move further (get next token & repeat)
If expansion needs to be done, choose the appropriate rule (How to decide which rule to choose?)
If no rule is found, declare an error
If several rules are found, the grammar (set of rules) is ambiguous
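One common way to choose a rule, not necessarily the first one this course develops, is to look at the next token and consult a table. The sketch below assumes that approach. The table `PREDICT`, the function `choose_rule`, and the token names `COMMA` and `SEMI` are hypothetical. Only a few nonterminals of the demo grammar are covered.

```python
# Hypothetical prediction table: (nonterminal, lookahead token) -> right-hand side.
# An empty list stands for the NULL production.
PREDICT = {
    ("PARAMS", "CLOSEPAR"): [],
    ("PARAMS", "VAR"): ["VAR", "VAR-LIST"],
    ("VAR-LIST", "COMMA"): ["COMMA", "VAR", "VAR-LIST"],
    ("VAR-LIST", "CLOSEPAR"): [],
    ("VAR-LIST", "SEMI"): [],
    ("TYPE", "INT"): ["INT"],
    ("TYPE", "FLOAT"): ["FLOAT"],
}


def choose_rule(nonterminal, lookahead):
    """Return the right-hand side to expand, or raise if no rule applies."""
    try:
        return PREDICT[(nonterminal, lookahead)]
    except KeyError:
        # "If no rule is found, declare an error"
        raise SyntaxError(f"no rule for <{nonterminal}> on token {lookahead}")
```

Because each (nonterminal, token) pair has at most one entry, the choice is never between several rules. A grammar for which such a table cannot be built needs either rewriting or a different parsing method.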
Scanning & Parsing Combined
main() { int a,b; a = b;}
Before each step the parser asks the scanner: "Please, get me the next token". Each step below shows the token returned and the rule(s) it is matched against.

Token: MAIN
  <C-PROG> -> MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY>
Token: OPENPAR
  <C-PROG> -> MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY>
Token: CLOSEPAR
  <PARAMS> -> NULL
  <C-PROG> -> MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY>
Token: CURLYOPEN
  <MAIN-BODY> -> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE
Token: INT
  <DECL-STMT> -> <TYPE> VAR <VAR-LIST>;
  <TYPE> -> INT
Token: VAR
  <DECL-STMT> -> <TYPE> VAR <VAR-LIST>;
Token: ',' [COMMA]
  <VAR-LIST> -> , VAR <VAR-LIST>
Token: VAR
  <VAR-LIST> -> , VAR <VAR-LIST>
Token: ';'
  <VAR-LIST> -> NULL
  <DECL-STMT> -> <TYPE> VAR <VAR-LIST>;
Token: VAR
  <ASSIGN-STMT> -> VAR = <EXPR>;
Token: '='
  <ASSIGN-STMT> -> VAR = <EXPR>;
Token: VAR
  <EXPR> -> VAR
Token: ';'
  <ASSIGN-STMT> -> VAR = <EXPR>;
Token: CURLYCLOSE
  <MAIN-BODY> -> CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE
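The walk-through above can be reproduced with a small recursive-descent recognizer, with one method per nonterminal of the demo grammar. This is a sketch, not the parser built in the course. The class `Parser`, the function `accepts`, and the token names `COMMA`, `SEMI`, `ASSIGN`, `PLUS` and `MINUS` are assumptions introduced here. The input is a list of token names that a scanner is assumed to have produced already.

```python
class Parser:
    """Recursive-descent recognizer; one method per nonterminal."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        """Look at the current token without consuming it."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "EOF"

    def match(self, expected):
        """Consume the current token if it is the expected one."""
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected}, got {self.peek()}")
        self.pos += 1

    def c_prog(self):
        self.match("MAIN")
        self.match("OPENPAR")
        self.params()
        self.match("CLOSEPAR")
        self.main_body()
        if self.peek() != "EOF":
            raise SyntaxError(f"unexpected token {self.peek()} after program")

    def params(self):
        if self.peek() == "VAR":  # otherwise <PARAMS> -> NULL
            self.match("VAR")
            self.var_list()

    def var_list(self):
        while self.peek() == "COMMA":  # loop ends on <VAR-LIST> -> NULL
            self.match("COMMA")
            self.match("VAR")

    def main_body(self):
        self.match("CURLYOPEN")
        self.decl_stmt()
        self.assign_stmt()
        self.match("CURLYCLOSE")

    def decl_stmt(self):
        if self.peek() not in ("INT", "FLOAT"):
            raise SyntaxError(f"expected a type, got {self.peek()}")
        self.match(self.peek())  # <TYPE> -> INT or <TYPE> -> FLOAT
        self.match("VAR")
        self.var_list()
        self.match("SEMI")

    def assign_stmt(self):
        self.match("VAR")
        self.match("ASSIGN")
        self.expr()
        self.match("SEMI")

    def expr(self):
        self.match("VAR")
        if self.peek() in ("PLUS", "MINUS"):  # <EXPR> -> VAR <OP> <EXPR>
            self.match(self.peek())
            self.expr()


def accepts(tokens):
    """Return True if the token sequence is a legal <C-PROG>."""
    try:
        Parser(tokens).c_prog()
        return True
    except SyntaxError:
        return False


# Token stream for: main() { int a,b; a = b;}
DEMO = ["MAIN", "OPENPAR", "CLOSEPAR", "CURLYOPEN", "INT", "VAR", "COMMA",
        "VAR", "SEMI", "VAR", "ASSIGN", "VAR", "SEMI", "CURLYCLOSE"]
```

`accepts(DEMO)` succeeds, while dropping the final `CURLYCLOSE` or the `COMMA` makes a `match` call fail, which corresponds to "Failure: Syntax Error!" on the earlier slide.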
What Is Happening?
During/after parsing?
  Tokens get gobbled
Symbol tables
  Variables have attributes
  Declarations attach attributes to variables
Semantic actions
  What are semantic actions?
Semantic checks
Symbol Table
int a,b;
Declares a and b
  Within current scope
  Type integer
Use of a and b now legal
Basic Symbol Table
Name Type Scope
a int "main"
b int "main"
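A basic symbol table like the one above can be modelled as a mapping from (scope, name) to type. The sketch below is illustrative only. The class `SymbolTable` and its methods `declare` and `lookup` are hypothetical names, and rejecting a redeclaration is an added assumption rather than something the slide states.

```python
class SymbolTable:
    """Maps (scope, name) to a type, mirroring the Name/Type/Scope columns."""

    def __init__(self):
        self.entries = {}

    def declare(self, name, type_name, scope):
        """Enter a declaration; redeclaring in the same scope is an error here."""
        key = (scope, name)
        if key in self.entries:
            raise NameError(f"{name} already declared in scope {scope}")
        self.entries[key] = type_name

    def lookup(self, name, scope):
        """Return the declared type, or None if the name is not declared."""
        return self.entries.get((scope, name))


# Effect of "int a,b;" inside main
table = SymbolTable()
table.declare("a", "int", "main")
table.declare("b", "int", "main")
```

After these declarations `table.lookup("a", "main")` yields `"int"`, so a later use of `a` is legal, while `table.lookup("c", "main")` yields `None`.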
Typical Semantic Actions
Enter variable declaration into symbol table
Look up variables in symbol table
Do binding of looked-up variables (scoping rules, etc.)
Do type checking for compatibility
Keep the semantic context of processing
  Example: a + b + c becomes t1 = a + b, then t2 = t1 + c (t1 and t2 carry the semantic context)
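The a + b + c example can be sketched as a translation that introduces one temporary per addition and passes the running result forward as the semantic context. The function name `translate_sum` is hypothetical, and the sketch handles only left-to-right sums of plain names.

```python
def translate_sum(operands):
    """Translate a left-to-right sum of two or more operands into
    one assignment per addition, using temporaries t1, t2, ..."""
    code = []
    left = operands[0]
    for i, right in enumerate(operands[1:], start=1):
        temp = f"t{i}"
        code.append(f"{temp} = {left} + {right}")
        left = temp  # semantic context: where the result so far lives
    return code, left


code, result = translate_sum(["a", "b", "c"])
# code   -> ["t1 = a + b", "t2 = t1 + c"]
# result -> "t2"
```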
How Are Semantic Actions Called?
Action symbols embedded in the grammar
Each action symbol represents a semantic procedure
These procedures do things and/or return values
Semantic procedures are called by parser at appropriate places during parsing
Semantic stack implements & stores semantic records
Semantic Actions
<decl-stmt> -> <type> #put-type <var-list> #do-decl
<type> -> int | float
<var-list> -> <var> #add-decl <var-list>
<var-list> -> <var> #add-decl
<var> -> ID #proc-decl

#put-type puts given type on semantic stack
#proc-decl builds decl record for var on stack
#add-decl builds decl-chain
#do-decl traverses chain on semantic stack using backwards pointers, entering each var into symbol table

Semantic stack as drawn, top to bottom: id3, id2, id1, type (each id entry is a decl record)

Symbol table after #do-decl:
Name Type Scope
id1 1 3
id2 1 3
id3 1 3
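The four action symbols can be sketched as methods that operate on an explicit semantic stack. The record layout (dicts with `kind`, `name`, `prev`), the class name `DeclActions`, and the use of strings for type and scope are assumptions made for illustration. The slide's table uses numeric codes instead.

```python
class DeclActions:
    """Semantic procedures for declarations, driven by a semantic stack."""

    def __init__(self):
        self.stack = []          # semantic stack of records
        self.symbol_table = {}   # name -> (type, scope)

    def put_type(self, type_name):
        """#put-type: push the declared type."""
        self.stack.append({"kind": "type", "type": type_name})

    def proc_decl(self, name):
        """#proc-decl: push a decl record for one variable."""
        self.stack.append({"kind": "decl", "name": name, "prev": None})

    def add_decl(self):
        """#add-decl: link the newest record back to the one below it."""
        self.stack[-1]["prev"] = self.stack[-2]

    def do_decl(self, scope):
        """#do-decl: follow the backward pointers down to the type record
        and enter every variable on the chain into the symbol table."""
        chain = []
        record = self.stack[-1]
        while record["kind"] == "decl":
            chain.append(record)
            record = record["prev"]
        type_name = record["type"]
        for decl in reversed(chain):  # enter in declaration order
            self.symbol_table[decl["name"]] = (type_name, scope)
        del self.stack[-(len(chain) + 1):]  # pop the chain and the type record


# The calls a parser would make for: int id1 id2 id3
actions = DeclActions()
actions.put_type("int")
for name in ["id1", "id2", "id3"]:
    actions.proc_decl(name)
    actions.add_decl()
actions.do_decl("main")
```

After `do_decl` the stack is empty again and the symbol table holds one entry each for `id1`, `id2` and `id3`, all with the same type and scope.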
Semantic Actions
What else can semantic actions do in addition to storing and looking up names in a symbol table?
Two types of semantic actions
  Checking (binding, type compatibility, scoping, etc.)
  Translation (generate temporary values, propagate them to keep semantic context)
Full Compiler Structure
[Diagram. Boxes: Start, Scanner, Parser, Semantic Action, Code Generation, CODE. Label: Semantic Error]
• Most compilers have two passes
Summary
Front-end of compiler: scanner and parser
Translation takes place in back end
Scanner, parser and code generator are automated
  How? We will answer this question in this class