CS 153: Concepts of Compiler Design
November 2 Class Meeting
Department of Computer Science
San Jose State University
Fall 2015
Instructor: Ron Mak
www.cs.sjsu.edu/~mak
SJSU Dept. of Computer Science, Fall 2013: October 29
CS 153: Concepts of Compiler Design, © R. Mak
Midterm Results
Median 76.0
Mean 72.2
Standard deviation 20.0
What JJTree, JJDoc, and JavaCC Do
You feed JJTree a .jjt grammar file, which contains:
    Token specifications using regular expressions
    Production rules using EBNF
JJTree produces a .jj grammar file.
JavaCC generates a scanner, parser, and tree-building routines, plus code for the visitor design pattern to walk the parse tree.
JJDoc produces an .html file containing the EBNF.
What JJTree, JJDoc, and JavaCC Do
However, JJTree and JavaCC will not:
    Generate code for a symbol table
    Generate any backend code
You have to provide this code!
Pcl
Pcl is a teeny, tiny subset of Pascal.
Use JavaCC to generate a Pcl parser and integrate it with our Pascal interpreter’s:
    symbol table components
    parse tree components
We’ll be able to parse and print the symbol table and the parse tree in our favorite XML format
Sample program test.pcl:
PROGRAM test;
VAR
    i, j, k : integer;
    x, y, z : real;
BEGIN
    i := 1;
    j := i + 3;
    x := i + j;
    y := 314.15926e-02 + i - j + k;
    z := x + i*j/k - x/y/z
END.
Pcl Challenges
Get the JJTree parse trees to build properly with respect to operator precedence. Use embedded definite node descriptors!
Decorate the parse tree with data type information.
Can be done as the tree is built, or as a separate pass. You can use the visitor pattern to implement the pass.
Hook up to the symbol table and parse tree printing classes from the Pascal interpreter.
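The separate type-decorating pass can be sketched in plain Java. This is only a miniature under stated assumptions: the names Node, IntLit, RealLit, Add, Visitor, and TypeSetter are hypothetical stand-ins, not the classes that JJTree generates, and the only typing rule shown is that adding an integer and a real yields a real.

```java
// Hypothetical miniature of a type-setting visitor pass; not JJTree-generated code.
class TypeSetterDemo {
    public static void main(String[] args) {
        Node expr = new Add(new IntLit(1), new RealLit(3.14));
        expr.accept(new TypeSetter());      // decorate every node in the tree
        System.out.println(expr.type);      // type recorded on the root node
    }
}

enum Type { INTEGER, REAL }

interface Visitor {
    Type visit(IntLit n);
    Type visit(RealLit n);
    Type visit(Add n);
}

abstract class Node {
    Type type;                              // decoration filled in by the pass
    abstract Type accept(Visitor v);
}

class IntLit extends Node {
    final int value;
    IntLit(int value) { this.value = value; }
    Type accept(Visitor v) { return v.visit(this); }
}

class RealLit extends Node {
    final double value;
    RealLit(double value) { this.value = value; }
    Type accept(Visitor v) { return v.visit(this); }
}

class Add extends Node {
    final Node left, right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    Type accept(Visitor v) { return v.visit(this); }
}

class TypeSetter implements Visitor {
    public Type visit(IntLit n)  { return n.type = Type.INTEGER; }
    public Type visit(RealLit n) { return n.type = Type.REAL; }

    // Visit the children first, then derive the parent's type from theirs.
    public Type visit(Add n) {
        Type l = n.left.accept(this);
        Type r = n.right.accept(this);
        return n.type = (l == Type.REAL || r == Type.REAL) ? Type.REAL : Type.INTEGER;
    }
}
```

Doing this as a separate pass keeps the grammar file free of type logic; the alternative is to set the type inside each production as the node is built.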
Pcl, cont’d
options {
    JJTREE_OUTPUT_DIRECTORY="src/wci/frontend";
    NODE_EXTENDS="wci.intermediate.icodeimpl.ICodeNodeImpl";
    ...
}

PARSER_BEGIN(PclParser)
...
public class PclParser
{
    // Create and initialize the symbol table stack.
    symTabStack = SymTabFactory.createSymTabStack();
    Predefined.initialize(symTabStack);
    ...
    // Parse a Pcl program.
    Reader reader = new FileReader(sourceFilePath);
    PclParser parser = new PclParser(reader);
    SimpleNode rootNode = parser.program();
    ...
Pcl, cont’d ...
    // Print the cross-reference table.
    CrossReferencer crossReferencer = new CrossReferencer();
    crossReferencer.print(symTabStack);

    // Visit the parse tree nodes to decorate them with type information.
    TypeSetterVisitor typeVisitor = new TypeSetterVisitor();
    rootNode.jjtAccept(typeVisitor, null);

    // Create and initialize the ICode wrapper for the parse tree.
    ICode iCode = ICodeFactory.createICode();
    iCode.setRoot(rootNode);
    programId.setAttribute(ROUTINE_ICODE, iCode);

    // Print the parse tree.
    ParseTreePrinter treePrinter = new ParseTreePrinter(System.out);
    treePrinter.print(symTabStack);
}
PARSER_END(PclParser)
Demo
JavaCC Grammar Repository
Check these out to get ideas and models: http://mindprod.com/jgloss/javacc.html
Syntax Error Handling and JavaCC
1. Detect the error. JavaCC does that based on the grammar in the .jj file.
2. Flag the error. JavaCC does that for you with its error messages.
3. Recover from the error so you can continue parsing. You set this up using JavaCC.
Token Errors
By default, JavaCC throws an exception whenever it encounters a bad token.
Token errors are considered extremely serious and will stop the translation unless you take care to recover from them.
Example LOGO program that moves a cursor on a screen:
FORWARD 20
RIGHT 120
FORWARD 20
Token Errors, cont’d
What happens if we feed the tokenizer bad input?
SKIP : {
    " " | "\n" | "\r" | "\r\n"
}

TOKEN : {
    <FORWARD : "FORWARD">
  | <RIGHT   : "RIGHT">
  | <DIGITS  : (["1"-"9"])+ (["0"-"9"])*>
}
logo_tokenizer.jj

Bad input (LEFT is not a defined token):
FORWARD 20
LEFT 120
FORWARD 20
Demo
Token Errors, cont’d
One way to recover from a token error is to skip over the erroneous token.

public static void main(String[] args) throws Exception
{
    java.io.Reader reader = new java.io.FileReader(args[0]);
    SimpleCharStream scs = new SimpleCharStream(reader);
    LogoTokenManager mgr = new LogoTokenManager(scs);

    while (true) {
        try {
            if (readAllTokens(mgr).kind == EOF) break;
        }
        catch (TokenMgrError tme) {
            System.out.println("TokenMgrError: " + tme.getMessage());
            skipTo(' ');
        }
    }
}
logo_skip_chars.jj
Token Errors, cont’d
private static void skipTo(char delimiter) throws java.io.IOException
{
    String skipped = "";
    char ch;

    System.out.print("*** SKIPPING ... ");

    // Consume characters until the delimiter is read.
    while ((ch = input_stream.readChar()) != delimiter) {
        skipped += ch;
    }

    System.out.println("skipped '" + skipped + "'");
}
logo_skip_chars.jj
Demo
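The skip-to-delimiter idea can be tried outside JavaCC. The following is a minimal sketch, assuming a plain java.io.Reader in place of the generated SimpleCharStream; the class name SkipToDemo and the sample input are hypothetical. Unlike the slide's version, it also stops at end of input instead of relying on an exception.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class SkipToDemo {
    // Consume characters up to and including the delimiter (or until end of
    // input) and return the skipped characters, excluding the delimiter.
    static String skipTo(Reader in, char delimiter) {
        StringBuilder skipped = new StringBuilder();
        try {
            int ch;
            while ((ch = in.read()) != -1 && ch != delimiter) {
                skipped.append((char) ch);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return skipped.toString();
    }

    public static void main(String[] args) {
        // Hypothetical remainder of the input after a bad character was hit.
        Reader in = new StringReader("EFT 120\nFORWARD 20\n");
        System.out.println("*** SKIPPING ... skipped '" + skipTo(in, ' ') + "'");
    }
}
```

After the call, the reader is positioned just past the delimiter, so tokenizing can resume from there.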
Synchronize the Parser
Skipping over a bad token isn’t a complete solution.
The parser still needs to synchronize at the next good token and then attempt to continue parsing.
First, add an error token to represent any invalid input characters:
SKIP : { " " }

TOKEN : {
    <FORWARD : "FORWARD">
  | <RIGHT   : "RIGHT">
  | <DIGITS  : (["1"-"9"])+ (["0"-"9"])*>
  | <EOL     : "\r" | "\n" | "\r\n">
  | <ERROR   : ~["\r", "\n"]>    // Any character except \r or \n.
}
logo_synchronize.jj
Synchronize the Parser, cont’d
A program consists of one or more move (FORWARD) and turn (RIGHT) commands. Must also allow for an erroneous command.
void Program() : {}
{
    (
        try {
            MoveForward() {System.out.println("Processed Move FORWARD");}
          | TurnRight()   {System.out.println("Processed Turn RIGHT");}
          | Error()       {handleError(token);}
        }
        catch (ParseException ex) {
            handleError(ex.currentToken);
        }
    )+
}
logo_synchronize.jj
Synchronize the Parser, cont’d
The Error() production rule is invoked for the <ERROR> token. The <ERROR> token consumes the bad character.
void MoveForward() : {}
{
    <FORWARD> <DIGITS> <EOL>
}

void TurnRight() : {}
{
    <RIGHT> <DIGITS> <EOL>
}

void Error() : {}
{
    <ERROR>
}
logo_synchronize.jj
Synchronize the Parser, cont’d
The JAVACODE header precedes pure Java code that’s inserted into the generated parser.
JAVACODE
String handleError(Token token)
{
    System.out.println("*** ERROR: Line " + token.beginLine +
                       " after \"" + token.image + "\"");

    // Skip tokens until the next EOL. Also stop at end of input so the
    // loop cannot run forever when the last line has no EOL.
    Token t;
    do {
        t = getNextToken();
    } while (t.kind != EOL && t.kind != EOF);

    return t.image;
}
logo_synchronize.jj
Synchronize the parser to the next “good” token (EOL). You can do this better with a complete synchronization set!
Demo
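The same synchronization strategy can be sketched without JavaCC. This is a hand-written miniature under stated assumptions: tokens are already split into strings, "EOL" marks a line end, and the class name LogoSyncDemo is hypothetical. On a bad command it reports the error, skips to the next EOL, and resumes parsing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LogoSyncDemo {
    // Parse commands of the form  FORWARD n EOL  or  RIGHT n EOL.
    // On an error, skip tokens up to and including the next EOL, then resume.
    static List<String> parse(List<String> tokens) {
        List<String> log = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            boolean ok = i + 2 < tokens.size()
                    && (tokens.get(i).equals("FORWARD") || tokens.get(i).equals("RIGHT"))
                    && tokens.get(i + 1).matches("[1-9][0-9]*")
                    && tokens.get(i + 2).equals("EOL");
            if (ok) {
                log.add("Processed " + tokens.get(i) + " " + tokens.get(i + 1));
                i += 3;
            } else {
                log.add("*** ERROR at \"" + tokens.get(i) + "\"");
                // Synchronize: advance to the next EOL, then step past it.
                while (i < tokens.size() && !tokens.get(i).equals("EOL")) {
                    i++;
                }
                i++;
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "FORWARD", "20", "EOL",
                "LEFT", "120", "EOL",
                "FORWARD", "20", "EOL");
        for (String line : parse(tokens)) {
            System.out.println(line);
        }
    }
}
```

With this input, the first and third commands are processed and the LEFT line is reported once. A fuller synchronization set would include more than EOL, for example the command keywords themselves.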
Repair the Parse Tree
After the parser recovers from an error, you may want to remove a partially-built AST node. The erroneous production must call jjtree.popNode().

JAVACODE
String handleError(Token token) #void
{
    System.out.println("*** ERROR: Line " + token.beginLine +
                       " after \"" + token.image + "\"");

    // Skip tokens until the next EOL. Also stop at end of input so the
    // loop cannot run forever when the last line has no EOL.
    Token t;
    do {
        t = getNextToken();
    } while (t.kind != EOL && t.kind != EOF);

    // Discard the partially-built node.
    jjtree.popNode();
    return t.image;
}
logo_tree_recover.jjt
Demo
Debugging the Parser
Add the option to debug the parser.
options {
    DEBUG_PARSER=true;
}
The generated parser then prints production rule method calls and returns, and which tokens are consumed.
Review: Interpreter vs. Compiler
Same front end: parser, scanner, tokens
Same intermediate tier: symbol tables, parse trees
Different back end: operations
Review: Interpreter vs. Compiler, cont’d
Interpreter: Use the symbol tables and parse trees to execute the source program. Back end: executor.
Compiler: Use the symbol tables and parse trees to generate an object program for the source program. Back end: code generator.
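The contrast between the two back ends can be sketched on one tiny parse tree. This is an illustration only: the classes BackEndDemo, Const, and Add are hypothetical, and the generated text uses just two JVM instructions, ldc and iadd, rather than a complete Jasmin program.

```java
class BackEndDemo {
    // A tiny parse tree: either an integer constant or an addition.
    static abstract class Node {}

    static class Const extends Node {
        final int value;
        Const(int value) { this.value = value; }
    }

    static class Add extends Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Interpreter back end (executor): walk the tree and compute the value.
    static int execute(Node n) {
        if (n instanceof Const) return ((Const) n).value;
        Add a = (Add) n;
        return execute(a.left) + execute(a.right);
    }

    // Compiler back end (code generator): walk the same tree and emit
    // stack-machine instructions instead of computing anything.
    static String generate(Node n) {
        if (n instanceof Const) return "ldc " + ((Const) n).value + "\n";
        Add a = (Add) n;
        return generate(a.left) + generate(a.right) + "iadd\n";
    }

    public static void main(String[] args) {
        Node tree = new Add(new Const(1), new Add(new Const(2), new Const(3)));
        System.out.println("Executed: " + execute(tree));
        System.out.print("Generated:\n" + generate(tree));
    }
}
```

Both methods traverse the same tree in the same order; only what happens at each node differs.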
Target Machines
A compiler’s back end code generator produces object code for a target machine.
Target machine: the Java Virtual Machine (JVM)
Object language: the Jasmin assembly language
The Jasmin assembler translates the assembly language program into .class files.
Java implements the JVM which loads and executes .class files.
Target Machines, cont’d
Instead of using javac to compile a source program written in Java into a .class file, use your compiler to compile a source program written in your chosen language into a Jasmin object program.
Then use the Jasmin assembler to create the .class file.
Target Machines, cont’d
No matter what language the source program was originally written in, once it’s been compiled into a .class file, Java will be able to load and execute it.
The JVM as implemented by Java runs on a wide variety of hardware platforms.
Java Virtual Machine (JVM) Architecture
Java stack: the runtime stack
Heap area: dynamically allocated objects, automatic garbage collection
Class area: code for methods, constants pool
Native method stacks: support native methods, e.g., written in C (not shown)
Java Virtual Machine Architecture, cont’d
The runtime stack contains stack frames. Stack frame = activation record.
Each stack frame contains:
    local variables array
    operand stack
    program counter (PC)
What is missing in the JVM that we had in our Pascal interpreter?
The JVM’s Java Runtime Stack
Each method invocation pushes a stack frame.
Equivalent to the activation record of our Pascal interpreter.
The stack frame currently on top of the runtime stack is the active stack frame.
A stack frame is popped off when the method returns, possibly leaving behind a return value on top of the stack.
Stack Frame Contents
Operand stack: for doing computations.
Local variables array: equivalent to the memory map in our Pascal interpreter’s activation record.
Program counter (PC): keeps track of the currently executing instruction.
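How the three parts of a stack frame cooperate can be sketched with a toy simulation. This is not how the JVM is implemented; the class FrameDemo and its text-based instruction format are hypothetical, and only four real instruction names (bipush, iload, istore, iadd) are imitated. The sample program corresponds to i := 1; j := i + 3 with i in local slot 0 and j in slot 1.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class FrameDemo {
    // Run a few JVM-style instructions against one simulated stack frame
    // and return the final contents of the local variables array.
    static int[] run(String[] program, int localsSize) {
        int[] locals = new int[localsSize];              // local variables array
        Deque<Integer> operands = new ArrayDeque<>();    // operand stack
        for (int pc = 0; pc < program.length; pc++) {    // program counter
            String[] parts = program[pc].split(" ");
            switch (parts[0]) {
                case "bipush":  // push a constant
                    operands.push(Integer.parseInt(parts[1]));
                    break;
                case "iload":   // push the value of a local variable
                    operands.push(locals[Integer.parseInt(parts[1])]);
                    break;
                case "istore":  // pop the top value into a local variable
                    locals[Integer.parseInt(parts[1])] = operands.pop();
                    break;
                case "iadd":    // pop two values, push their sum
                    operands.push(operands.pop() + operands.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unknown instruction: " + parts[0]);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        String[] program = {
            "bipush 1", "istore 0",                      // i := 1
            "iload 0", "bipush 3", "iadd", "istore 1"    // j := i + 3
        };
        System.out.println(Arrays.toString(run(program, 2)));  // prints [1, 4]
    }
}
```

Values never move directly between local variables: every computation goes through the operand stack.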
JVM Instructions
Load and store values
Arithmetic operations
Type conversions
Object creation and management
Runtime stack management (push/pop values)
Branching
Method call and return
Throwing exceptions
Concurrency
Jasmin Assembler
Download from: http://jasmin.sourceforge.net/
Site also includes: User Guide, instruction set, sample programs
Example Jasmin Program
Assemble: java -jar jasmin.jar hello.j
Execute: java HelloWorld
.class public HelloWorld
.super java/lang/Object

.method public static main([Ljava/lang/String;)V
    .limit stack 2
    .limit locals 1

    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc " Hello World."
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end method
hello.j
Demo
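For comparison, here is a Java source that should behave the same way when compiled with javac and run. It is offered only as a reading aid for the Jasmin listing: the bytecode that javac produces is not claimed to match hello.j instruction for instruction (javac, for instance, also adds a default constructor).

```java
class HelloWorld {
    public static void main(String[] args) {
        // getstatic System.out, ldc the string, invokevirtual println
        System.out.println(" Hello World.");
    }
}
```

The three statements-worth of work in the Jasmin method map onto this single println call: fetch the System.out object, push the string constant, and invoke println on it.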