CS412/413
Introduction to
Compilers and Translators
March 12, 1999
Lecture 18: Abstract Data Types and Objects
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
Administration
• Programming Assignment 3, Part I due next Friday
• Prelim 2 date changed to April 16 (in class)
• Homeworks 5 and 6 have merged
Outline
• Programming Assignment 3
• Objects and ADTs: first-class modules
• Encapsulation
• Subtyping
• Inheritance
Programming Assignment 3
• Part I (checkpoint, due March 19)
  – translate AST to IR
  – canonicalize IR representation
    • hoist side-effects, CALLs
    • reorder basic blocks to make branches one-way
  – support dump routines for both canonical and non-canonical IR
• Part II (due April 5)
  – convert IR to abstract assembly by tiling
  – use simple register allocation to generate code
  – support dump routines, generate running code!
Suggestions for Part I
• Define internal interfaces first: IR
  – non-canonical IR
  – IR with side-effects hoisted but 2-way branches
• Should be able to use the IR described in Appel or in lecture
• Minor modifications may be a good idea
  – no SEQ nodes in canonical IR
  – SEQ nodes with any number of children
  – PUSH statement node, etc.
• Document changes you make to IR
Code Translation
• Write translation and IR transformation functions as recursive methods on AST and IR nodes
abstract class ASTNode {
    abstract IRNode translate(SymTab A);
    …
}
abstract class IRNode {
    abstract IRNode canonicalize( );
    ...
}
• Problem: how to allow incremental development and testing?
Coding Translations Incrementally
• Write placeholder translation method in ASTNode that is inherited by all AST nodes: instant translation phase!
• Translation can be refined by adding (and testing) translation methods to subclasses one by one
• Define special IR node IR_AST that is just a container for untranslated AST sub-trees.
• Placeholder translation generates this kind of node
• Dump routine for this IR node uses AST dump!
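The placeholder idea above can be sketched in Java roughly as follows. This is a minimal illustration under assumptions, not the course's actual code: the names IRAst, IRConst, IntLiteral and WhileStmt are hypothetical, dump returns a String instead of using a PrettyPrinter, and the SymTab argument of translate is omitted to keep the sketch self-contained.

```java
class IncrementalTranslation {
    /** Base class of IR nodes; dump() returns text here for simplicity (an assumption). */
    static abstract class IRNode {
        abstract String dump();
    }

    /** Stand-in for the slides' IR_AST: a container for an untranslated AST subtree. */
    static final class IRAst extends IRNode {
        final ASTNode ast;
        IRAst(ASTNode ast) { this.ast = ast; }
        // The dump routine for this IR node simply reuses the AST dump.
        String dump() { return "IR_AST(" + ast.dump() + ")"; }
    }

    /** Hypothetical "real" IR node for integer constants. */
    static final class IRConst extends IRNode {
        final int value;
        IRConst(int value) { this.value = value; }
        String dump() { return "CONST(" + value + ")"; }
    }

    static abstract class ASTNode {
        // Placeholder translation inherited by every AST node.
        IRNode translate() { return new IRAst(this); }
        abstract String dump();
    }

    /** A node whose translation has already been written (and can be tested). */
    static final class IntLiteral extends ASTNode {
        final int value;
        IntLiteral(int value) { this.value = value; }
        @Override IRNode translate() { return new IRConst(value); }
        String dump() { return Integer.toString(value); }
    }

    /** A node that still relies on the inherited placeholder. */
    static final class WhileStmt extends ASTNode {
        String dump() { return "while"; }
    }

    public static void main(String[] args) {
        System.out.println(new IntLiteral(3).translate().dump());
        System.out.println(new WhileStmt().translate().dump());
    }
}
```

Each subclass that overrides translate removes one source of IR_AST nodes from the dump, so progress is visible in the output as the translation is refined.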
Dumping IR
• Goal of Part I is to support printout of the various intermediate representations
  – dump_ast : dump the AST
  – dump_ir : dump the initial translated IR
  – dump_cir: dump the canonical IR
• Implement dump as recursive traversal but use pretty-printer support
abstract class IRNode {
    abstract void dump(PrettyPrinter pp);
    ...
}
• dump_ir, dump_cir use same code
Pretty Printing
• Another application of parsing! Tree-structured data can be formatted well
• Provided code to support this: PrettyPrinter
• 4 key operations
write (String s) : output a string of the text
begin(int n) : begin group, left margin = pos + n
end( ) : end current grouping unit
allowBreak(int n) : optional line break here -- if “broken”, introduce newline, left margin + n, otherwise emit no text
Pretty Printing algorithm
• begin = [   end = ]   allowBreak = |
• begin/end mirror expression tree structure
• Format (alph + bet)*f(gam, del) + eps :
[([alph + |bet])* |f([gam, |del])] +|eps
(alph + bet)*f(gam, del) + eps
[Figure: the same expression broken across lines at narrower widths]
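To see how begin/end calls mirror the expression tree, here is a small Java sketch. It does not use the provided PrettyPrinter; MarkPrinter is a hypothetical stand-in with the same four operation names that records the markers [, ] and | instead of laying out text, and the Expr helpers are likewise assumptions made for illustration. The grouping it produces is one possible convention, not necessarily the one used on the slide.

```java
class PrettyMarks {
    /** Hypothetical recorder: same operation names as the slide, but it only emits markers. */
    static final class MarkPrinter {
        private final StringBuilder out = new StringBuilder();
        void write(String s) { out.append(s); }
        void begin(int n) { out.append('['); }       // the margin argument n is ignored here
        void end() { out.append(']'); }
        void allowBreak(int n) { out.append('|'); }  // a real printer would decide whether to break
        String result() { return out.toString(); }
    }

    /** An expression knows how to dump itself through the printer. */
    interface Expr {
        void dump(MarkPrinter pp);
    }

    static Expr name(String id) {
        return pp -> pp.write(id);
    }

    /** One group per binary operator, with an optional break after the operator. */
    static Expr binary(Expr left, String op, Expr right) {
        return pp -> {
            pp.begin(0);
            left.dump(pp);
            pp.write(" " + op + " ");
            pp.allowBreak(0);
            right.dump(pp);
            pp.end();
        };
    }

    static String marks(Expr e) {
        MarkPrinter pp = new MarkPrinter();
        e.dump(pp);
        return pp.result();
    }

    public static void main(String[] args) {
        Expr sum = binary(name("alph"), "+", name("bet"));
        System.out.println(marks(binary(sum, "*", name("gam"))));
    }
}
```

Because dump is a recursive traversal, nested subexpressions automatically become nested groups.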
Choosing breaks optimally
• Break-from-root rule: if a break is broken in a group, all breaks in containing groups (up to root of group tree) must be broken
• Ensures that deeply nested (high-precedence) groups are broken last
(alph + bet)* f(gam,
del) + eps*zeta
(alph + bet)* f(gam, del) +
eps*zeta
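One simple way to respect the break-from-root rule is sketched below in Java. This is not the algorithm of the provided PrettyPrinter (which is not shown in these slides), and it does not claim to choose breaks optimally: a group is printed flat if it fits in the remaining width, and otherwise all of its own breaks are broken before its children are considered. The Doc, Text, Brk and Group types are hypothetical, indentation (the margin arguments) is ignored, and the fit test looks only at the group itself, not at text that follows it.

```java
import java.util.Arrays;
import java.util.List;

class BreakFromRoot {
    interface Doc { int flatWidth(); }

    static final class Text implements Doc {
        final String s;
        Text(String s) { this.s = s; }
        public int flatWidth() { return s.length(); }
    }

    /** An optional break: emits nothing unless its group is broken. */
    static final class Brk implements Doc {
        public int flatWidth() { return 0; }
    }

    static final class Group implements Doc {
        final List<Doc> kids;
        Group(Doc... kids) { this.kids = Arrays.asList(kids); }
        public int flatWidth() {
            int w = 0;
            for (Doc k : kids) w += k.flatWidth();
            return w;
        }
    }

    static String layout(Doc d, int width) {
        StringBuilder out = new StringBuilder();
        emit(d, width, false, out, new int[] {0});
        return out.toString();
    }

    // 'broken' says whether the breaks of the enclosing group are broken.
    private static void emit(Doc d, int width, boolean broken, StringBuilder out, int[] col) {
        if (d instanceof Text) {
            String s = ((Text) d).s;
            out.append(s);
            col[0] += s.length();
        } else if (d instanceof Brk) {
            if (broken) { out.append('\n'); col[0] = 0; }
        } else {
            Group g = (Group) d;
            // A nested group can only fail to fit if its parent also failed,
            // so an inner break is never broken while an outer one is left intact.
            boolean fits = col[0] + g.flatWidth() <= width;
            for (Doc k : g.kids) emit(k, width, !fits, out, col);
        }
    }

    /** The slide's expression (alph + bet)* f(gam, del) + eps*zeta as nested groups. */
    static Doc example() {
        Doc args = new Group(new Text("gam, "), new Brk(), new Text("del"));
        Doc product = new Group(new Text("(alph + bet)* "), new Brk(),
                                new Text("f("), args, new Text(")"));
        return new Group(product, new Text(" + "), new Brk(), new Text("eps*zeta"));
    }

    public static void main(String[] args) {
        System.out.println(layout(example(), 30));
    }
}
```

At width 30 this sketch breaks only at the root-level break after the "+", keeping f(gam, del) intact; the break inside the argument list is taken only when the enclosing groups have already been broken.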
What IR to generate?
• For Part I, how to choose translations for statement forms?
• Tip: write equivalent C code, compile as described in Pentium Code Samples handout to get Pentium assembly
• Map assembly backward by hand to IR
• Useful trick for doing Part II, helps to learn instruction set—and what instructions are worth tiling for
High-level languages
• So far: how to compile simple languages
  – Data types: primitive types, strings, arrays
  – No user-defined abstractions: objects
  – No first-class function values
• Next 3 lectures: supporting abstract data types and objects
  – semantic checking
  – code generation (IR and assembly)
  – Iota+ (Programming Assignment 4) has objects
Data types: records
• Records (C structs, Pascal records)
  – provide named fields of various types
  – implemented as a block of memory
    { int x; String s; char c,d,e; int y; }
  – accesses to data members compiled to loads/stores indexed from start of record; compiler converts name of field to an offset.
[Figure: memory layout of the record, showing fields x, s, c d e, y]
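The conversion from field names to offsets can be sketched in Java as below. The sizes and the alignment policy are assumptions for illustration (4-byte int and pointer, 1-byte char, each field aligned to its own size, as would be typical on a 32-bit target); the slides do not fix a particular layout, and RecordLayout is a hypothetical helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RecordLayout {
    /** Assumed sizes in bytes: char is 1; int and references (e.g. String) are 4. */
    static int sizeOf(String type) {
        return type.equals("char") ? 1 : 4;
    }

    /** Assigns each field an offset from the start of the record, in declaration order. */
    static Map<String, Integer> offsets(String[][] fields) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int next = 0;
        for (String[] field : fields) {
            int size = sizeOf(field[0]);
            next = (next + size - 1) / size * size; // round up to the field's alignment
            result.put(field[1], next);
            next += size;
        }
        return result;
    }

    /** The record from the slide: { int x; String s; char c,d,e; int y; } */
    static Map<String, Integer> example() {
        return offsets(new String[][] {
            {"int", "x"}, {"String", "s"},
            {"char", "c"}, {"char", "d"}, {"char", "e"},
            {"int", "y"}
        });
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

Under these assumptions an access such as r.y compiles to a load from the record's base address plus 12: the three chars occupy bytes 8 to 10 and y is padded up to the next multiple of 4.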
Stack vs. heap
• Records have known size; can be allocated either on stack (e.g. C, Pascal) or heap
• Accesses to stack records are fp-relative -- don’t need to compute address of record
• Stack allocation means good cache locality
[Figure: two copies of the record { int x; String s; char c,d,e; int y; } with fields x, s, c d e, y]
Record Limitations
• Records can be used to implement abstractions, but fields are exposed
• Example: lists of strings with stored length
List = { len: int, s: String, next: List }
• Abstract operations:
  – length, cons, first, rest
• Problem: list has representation invariant that len field must be equal to length of list, but any code can break this invariant.
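A small Java sketch of the problem, assuming a record-like class with exposed fields (ExposedList, Node and countNodes are hypothetical names): the constructor establishes the invariant, but any outside code can silently break it.

```java
class ExposedList {
    /** Record-style list cell: all fields are visible and mutable. */
    static final class Node {
        int len;      // intended invariant: len == number of cells from here on
        String s;
        Node next;

        Node(String s, Node next) {
            this.s = s;
            this.next = next;
            this.len = 1 + (next == null ? 0 : next.len);
        }
    }

    /** Recomputes the true length by walking the list. */
    static int countNodes(Node l) {
        int n = 0;
        for (; l != null; l = l.next) n++;
        return n;
    }

    public static void main(String[] args) {
        Node l = new Node("a", new Node("b", null));
        l.next = null; // outside code truncates the list without updating len
        System.out.println(l.len + " vs " + countNodes(l));
    }
}
```

After the assignment to l.next, the stored len no longer matches the real length, and nothing in the language prevented it.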
Abstract Types
• Next step: abstract types, where the representation is hidden from (or inaccessible to) code other than the implementation of the type itself (Ada, CLU, ML)
• Purest form: type has an interface and an implementation. Interface only mentions operations, implementation defines representation. (E.g. .h and .C files in C++)
• External code does not know representation, can’t violate the abstraction boundary
• Allows same interface to be reimplemented
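As an illustration, the string list with a stored length can be written as an abstract type in Java by making the representation private. This is only a sketch: StrList is a hypothetical name, and null is assumed to represent the empty list.

```java
final class StrList {
    // Representation: hidden from all code outside this class.
    private final int len;
    private final String s;
    private final StrList next;

    private StrList(String s, StrList next) {
        this.s = s;
        this.next = next;
        this.len = 1 + length(next); // invariant established here and never changed
    }

    // Abstract operations: the only way outside code can touch a list.
    static StrList cons(String s, StrList rest) { return new StrList(s, rest); }

    static int length(StrList l) { return l == null ? 0 : l.len; }

    String first() { return s; }

    StrList rest() { return next; }

    public static void main(String[] args) {
        StrList l = cons("a", cons("b", null));
        System.out.println(length(l));
    }
}
```

Since outside code can neither read nor write len, s, or next, the representation could later be replaced (for example by dropping the stored length) without changing any client.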
Interface vs. Impl Type
• interface: length, cons, first, rest
• implementation: List = { len: int, s: String, next: List }
Compiling Abstract Types
• An abstract type is a first-class module
  – has interface (interface files in Iota), implementation (module files in Iota)
  – but we can create new instances of the type, each with its own state and operations
• Abstract types harder to implement
  – size of data not known statically outside own implementation -- can’t stack-allocate
  – Abstract type operations still implementable as simple function calls (resolved by linker)
Abstract Types
[Figure: linked List cells, each with fields len, s, next; the cells shown have len 3 and 2]
• Implemented just like heap-allocated records
• C++ objects are abstract types; can be stack-allocated. How does it work?
Private/Protected
• Objects in C++ are semi-abstract -- interface declares representation, only method code hidden from outside (mostly)
class List {
private:
    int len; String *s; List *l;
public:
    int length( ); List *tail( ); ...
};
• Allows outside code to know how much space List objects take, but not to access fields -- allows allocation on stack
Multiple Implementations
• Abstract types allow an interface to be reimplemented, but only one implementation of any interface in a given program
• Next step upward: allow multiple implementations (e.g., Java)
interface List { int length(); List tail(); … }
class LenList implements List { int len; ... }
class SimpleList implements List { … }
Supporting Multiple Implementations
• Problem: from interface, don’t know which implementation we are dealing with.
[Figure: a variable x: List may refer to either a LenList object (fields len, s, next) or a SimpleList object (fields s, next)]
Compiling Multiple Impls
• Difficult to stack allocate -- need to be able to figure out the concrete type of a reference (as in C++)
• Don’t know what code to run when an operation (e.g. length) is invoked.
length(l: LenList) = l.len;
length(l: SimpleList) = 1 + { if (l.next == null) 0; else length(l.next); }
[Figure: a LenList object (fields len, s, next) and a SimpleList object (fields s, next)]
Dispatch Vectors
• To figure out what code to run, add to every object a pointer to a dispatch vector (dispatch table, virtual table, …)
[Figure: a LenList object (len, s, next) and a SimpleList object (s, next) each point to a List dispatch vector with slots length, first, rest; each slot points to that implementation's code]
length(l: LenList) = l.len;
length(l: SimpleList) = 1 + { if (l.next == null) 0; else length(l.next); }
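Java normally hides this mechanism behind interface calls, but the idea can be simulated explicitly. The sketch below is an illustration under assumptions, not how any particular compiler lays out objects: Obj stands for a raw heap object whose first field is the dispatch-vector pointer, Method stands for a code pointer, only the length slot is filled in, and both implementations share one Obj class for brevity (a SimpleList object simply leaves len unused).

```java
class DispatchVectors {
    /** Stand-in for a code pointer stored in a dispatch-vector slot. */
    interface Method {
        int call(Obj self);
    }

    static final int LENGTH = 0; // slot index agreed on by every List implementation

    /** Stand-in for a heap object: dispatch-vector pointer first, then the fields. */
    static final class Obj {
        final Method[] dv;
        final int len;     // used by LenList objects only
        final String s;
        final Obj next;

        Obj(Method[] dv, int len, String s, Obj next) {
            this.dv = dv;
            this.len = len;
            this.s = s;
            this.next = next;
        }
    }

    // One dispatch vector per implementation, shared by all of its objects.
    static final Method[] LEN_LIST_DV = {
        self -> self.len
    };
    static final Method[] SIMPLE_LIST_DV = {
        self -> 1 + (self.next == null ? 0 : length(self.next))
    };

    /** What a call x.length() compiles to: load dv, index the slot, call indirectly. */
    static int length(Obj x) {
        return x.dv[LENGTH].call(x);
    }

    static Obj simpleList(String s, Obj next) {
        return new Obj(SIMPLE_LIST_DV, 0, s, next);
    }

    static Obj lenList(String s, Obj next) {
        int len = 1 + (next == null ? 0 : length(next));
        return new Obj(LEN_LIST_DV, len, s, next);
    }

    public static void main(String[] args) {
        Obj simple = simpleList("a", simpleList("b", null));
        Obj mixed = lenList("c", simple);
        System.out.println(length(simple) + " " + length(mixed));
    }
}
```

The caller of length never inspects the concrete type; it pays one extra load and an indirect call, which is the additional indirection the summary refers to.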
Summary
• Variety of different mechanisms for providing data abstraction
• Increased abstraction power leads to more expensive implementations -- more indirections
• Next time: static semantics, plus subtyping, inheritance, other object-oriented features
• Later: optimizations for object-oriented languages