languages and compilers (sprog og oversættere)people.cs.aau.dk/~bt/spof07/spof07-2.pdf · cross...

83
1 Languages and Compilers (SProg og Oversættere) Lecture 2 Bent Thomsen Department of Computer Science Aalborg University With acknowledgement to Norm Hutchinson whose slides this lecture is based on.

Upload: others

Post on 14-Aug-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

1

Languages and Compilers(SProg og Oversættere)

Lecture 2

Bent ThomsenDepartment of Computer Science

Aalborg University

With acknowledgement to Norm Hutchinson whose slides this lecture is based on.

Page 2: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

2

Today’s lecture

• Three topics– Treating Compilers and Interpreters as black-boxes

• Tombstone- or T- diagrams– A first look inside the black-box

• Your guided tour – Some Language Design Issues

Page 3: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

3

Terminology

Translatorinput outputsource program object program

is expressed in thesource language

is expressed in theimplementation language

is expressed in thetarget language

Q: Which programming languages play a role in this picture?

A: All of them!

Page 4: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

4

Tombstone Diagrams

What are they?– diagrams consisting out of a set of “puzzle pieces” we can use

to reason about language processors and programs– different kinds of pieces– combination rules (not all diagrams are “well formed”)

M

Machine implemented in hardware

S -> TL

Translator implemented in L

ML

Language interpreter in L

Program P implemented in L

LP

Page 5: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

5

Tombstone diagrams: Combination rules

SP P

TS -> TMM

LP

S -> TMWRONG!

OK!OK!

OK!MMP

OK!

MLP

WRONG!

Page 6: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

6

Tetrisx86C

Tetris

Compilation

x86

Example: Compilation of C programs on an x86 machine

C -> x86x86

x86Tetris

x86

Page 7: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

7

TetrisPPCC

Tetris

Cross compilation

x86

Example: A C “cross compiler” from x86 to PPC

C -> PPCx86

A cross compiler is a compiler which runs on one machine (the host machine) but emits code for another machine (the target machine).

Host ≠ Target

Q: Are cross compilers useful? Why would/could we use them?

PPCTetris

PPC

download

Page 8: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

8

Tetrisx86

TetrisJVMJava

Tetris

Two Stage Compilation

x86

Java->JVMx86

A two-stage translator is a composition of two translators. The output of the first translator is provided as input to the second translator.

x86

JVM->x86x86

Page 9: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

9

x86Java->x86

Compiling a Compiler

Observation: A compiler is a program! Therefore it can be provided as input to a language processor.Example: compiling a compiler.

Java->x86C

x86

C -> x86x86

Page 10: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

10

Interpreters

An interpreter is a language processor implemented in software, i.e. as a program.

Terminology: abstract (or virtual) machine versus real machine

Example: The Java Virtual Machine

JVMx86x86

JVMTetris

Q: Why are abstract machines useful?

Page 11: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

11

Interpreters

Q: Why are abstract machines useful?

1) Abstract machines provide better platform independence

JVMx86x86 PPC

JVMTetris

JVMPPC

JVMTetris

Page 12: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

12

Interpreters

Q: Why are abstract machines useful?

2) Abstract machines are useful for testing and debugging.

Example: Testing the “Ultima” processor using hardware emulation

Ultimax86x86

Ultima≡Ultima

P

UltimaP

Functional equivalence

Note: we don’t have to implement Ultima emulator in x86 we can use a high-level language and compile it.

Page 13: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

13

Interpreters versus Compilers

Q: What are the tradeoffs between compilation and interpretation?

Compilers typically offer more advantages when – programs are deployed in a production setting– programs are “repetitive”– the instructions of the programming language are complex

Interpreters typically are a better choice when– we are in a development/testing/debugging stage– programs are run once and then discarded – the instructions of the language are simple – the execution speed is overshadowed by other factors

• e.g. on a web server where communications costs are much higher than execution speed

Page 14: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

14

Interpretive Compilers

Why?A tradeoff between fast(er) compilation and a reasonable runtime performance.

How?Use an “intermediate language”• more high-level than machine code => easier to compile to• more low-level than source language => easy to implement as an

interpreter

Example: A “Java Development Kit” for machine M

Java->JVMM

JVMM

Page 15: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

15

PJVMJava

P

Interpretive Compilers

Example: Here is how we use our “Java Development Kit” to run a Java program P

Java->JVMM JVM

MM

JVMP

Mjavac java

Page 16: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

16

Portable Compilers

Example: Two different “Java Development Kits”

Java->JVMJVM

JVMM

Kit 2:

Java->JVMM

JVMM

Kit 1:

Q: Which one is “more portable”?

Page 17: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

17

Portable Compilers

In the previous example we have seen that portability is not an “all or nothing” kind of deal.

It is useful to talk about a “degree of portability” as the percentage of code that needs to be re-written when moving to a different machine.

In practice 100% portability is as good as impossible.

Page 18: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

18

Example: a “portable” compiler kit

Java->JVMJava

JVMJava

Java->JVMJVM

Q: Suppose we want to run this kit on some machine M. How could we go about realizing that goal? (with the least amount of effort)

Portable Compiler Kit:

Page 19: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

19

Example: a “portable” compiler kit

Java->JVMJava

JVMJava

Java->JVMJVM

Q: Suppose we want to run this kit on some machine M. How could we go about realizing that goal? (with the least amount of effort)

JVMJava

JVMC

reimplement

C->MM

JVMM

M

Page 20: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

20

Example: a “portable” compiler kit

Java->JVMJava

JVMJava

Java->JVMJVM

JVMM

This is what we have now:

Now, how do we run our Tetris program?

TetrisJVMJava

Tetris

M

Java->JVMJVMJVM

M

JVMTetris

JVMMM

Page 21: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

21

Bootstrapping

Java->JVMJava

JVMJava

Java->JVMJVM

Remember our “portable compiler kit”:

We haven’t used this yet!

Java->JVMJava

Same language! Q: What can we do with a compiler written in itself? Is that useful at all?

JVMM

Page 22: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

22

Bootstrapping

Java->JVMJava

Same language!

Q: What can we do with a compiler written in itself? Is that useful at all?

• By implementing the compiler in (a subset of) its own language, we become less dependent on the target platform => more portable implementation.

• But… “chicken and egg problem”? How do to get around that?=> BOOTSTRAPPING: requires some work to make the first “egg”.

There are many possible variations on how to bootstrap a compiler written in its own language.

Page 23: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

23

Bootstrapping an Interpretive Compiler to Generate M code

Java->JVMJava

JVMJava

Java->JVMJVM

Our “portable compiler kit”:

PMJava

PGoal: we want to get a “completely native” Java compiler on machine M

Java->MM

JVMM

M

Page 24: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

24

Bootstrapping an Interpretive Compiler to Generate M code (first approach)

Step 1: implement

Java->MJava JVM

Java ->MJava->JVM

JVMJVM

MM

Java ->MJava

Step 2: compile it

Step 3: Use this to compile again

by rewriting Java ->JVMJava

Page 25: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

25

Bootstrapping an Interpretive Compiler to Generate M code (first approach)

Step 3: “Self compile” the Java (in Java) compiler

MJava->M

JVMMM

Java->MJava Java->M

JVM

This is our desired compiler!

Step 4: use this to compile the P program

PMJava

PJava->M

M

Page 26: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

26

Bootstrapping an Interpretive Compiler to Generate M code (second approach)

Idea: we will build a two-stage Java -> M compiler.

PM

PMJava

P PJVM

M

Java->JVMM

M

JVM->MM

We will make this by compiling

To get this we implement

JVM->MJava

Java->JVMJVM and compile it

Page 27: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

27

Bootstrapping an Interpretive Compiler to Generate M code (second approach)

Step 1: implement

JVM->MJava JVM

JVM->MJava->JVM

JVMJVM

MM

JVM->MJava

Step 2: compile it

Step 3: compile this

Page 28: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

28

Bootstrapping an Interpretive Compiler to Generate M code (second approach)

Step 3: “Self compile” the JVM (in JVM) compiler

MJVM->M

JVMMM

JVM->MJVM JVM->M

JVM

This is the second stage of our compiler!

Step 4: use this to compile the Java compiler

Page 29: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

29

Bootstrapping an Interpretive Compiler to Generate M code

Step 4: Compile the Java->JVM compiler into machine code

MJava->JVM

M

Java->JVMJVM JVM->M

M

The first stage of our compiler!

We are DONE!

PM

PMJava

P PJVM

M

Java->JVMM

M

JVM->MM

Page 30: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

30

Full Bootstrap

A full bootstrap is necessary when we are building a new compiler from scratch.

Example:We want to implement an Ada compiler for machine M. We don’t currently have access to any Ada compiler (not on M, nor on any other machine).

Idea: Ada is very large, we will implement the compiler in a subset of Ada and bootstrap it from a subset-of-Ada compiler in another language. (e.g. C)

Ada-S ->MC

v1Step 1: build a compiler for Ada-S in another language

Page 31: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

31

Full Bootstrap

Ada-S ->MC

v1Step 1a: build a compiler (v1) for Ada-S in another language.

Ada-S ->MC

v1

MAda-S->M

v1

Step 1b: Compile v1 compiler on M

M

C->MM

This compiler can be used for bootstrapping on machine M but we do not want to rely on it permanently!

Page 32: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

32

Full Bootstrap

Ada-S ->MAda-S

v2Step 2a: Implement v2 of Ada-S compiler in Ada-S

Ada-S ->MAda-S

v2

M

MAda-S->M

v2

Step 2b: Compile v2 compiler with v1 compiler

Ada-S ->MM

v1

We are now no longer dependent on the availability of a C compiler!

Q: Is it hard to rewrite the compiler in Ada-S?

Page 33: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

33

Full Bootstrap

Step 3a: Build a full Ada compiler in Ada-S

Ada->MAda-S

v3

M

MAda->M

v3

Ada-S ->MM

v2

Step 3b: Compile with v2 compiler

Ada->MAda-S

v3

From this point on we can maintain the compiler in Ada. Subsequent versions v4,v5,... of the compiler can be written in Adaand each compiled with the previous version.

Page 34: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

34

Half Bootstrap

We discussed full bootstrap which is required when we have no access to a compiler for our language at all.

Q: What if we have access to a compiler for our language on a different machine HM but want to develop one for TM ?

Ada->HMHM

We have:Ada->TM

TM

We want:

Idea: We can use cross compilation from HM to TM to bootstrap the TM compiler.

Ada->HMAda

Page 35: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

35

HMAda->TM

Half Bootstrap

Idea: We can use cross compilation from HM to M to bootstrap the M compiler.

Step 1: Implement Ada->TM compiler in Ada

Ada->TMAda

Step 2: Compile on HM

Ada->TMAda Ada->HM

HMHM

Cross compiler: running on HM but

emits TM code

Page 36: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

36

TMAda->TM

Step 3: Cross compile our TM compiler.

Half Bootstrap

Ada->TMAda Ada->TM

HMHM

DONE!

From now on we can develop subsequent versions of the compiler completely on TM

Page 37: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

37

Bootstrapping to Improve Efficiency

The efficiency of programs and compilers:Efficiency of programs:

- memory usage- runtime

Efficiency of compilers: - Efficiency of the compiler itself- Efficiency of the emitted code

Idea: We start from a simple compiler (generating inefficient code) and develop more sophisticated versions of it. We can then use bootstrapping to improve performance of the compiler.

Page 38: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

38

Bootstrapping to Improve Efficiency

We have:Ada->Mslow

AdaAda-> Mslow

Mslow

We implement:Ada->Mfast

Ada

Ada->Mfast

Ada

M

Ada->MfastMslow

Step 1

Ada-> MslowMslow

Step 2 Ada->Mfast

Ada

M

Ada->MfastMfastAda-> Mfast

Mslow

Fast compiler that emits fast code!

Page 39: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

39

Conclusion

• To write a good compiler you may be writing several simpler ones first

• You have to think about the source language, the target language and the implementation language.

• Strategies for implementing a compiler1. Write it in machine code2. Write it in a lower level language and compile it using an

existing compiler3. Write it in the same language that it compiles and bootstrap

• The work of a compiler writer is never finished, there is always version 1.x and version 2.0 and …

Page 40: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

40

Compilation

So far we have treated language processors (including compilers) as “black boxes”

Now we take a first look "inside the box": how are compilers built.

And we take a look at the different “phases” and their relationships

Page 41: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

41

The “Phases” of a Compiler

Syntax Analysis

Contextual Analysis

Code Generation

Source Program

Abstract Syntax Tree

Decorated Abstract Syntax Tree

Object Code

Error Reports

Error Reports

Page 42: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

42

Different Phases of a Compiler

The different phases can be seen as different transformation steps to transform source code into object code.

The different phases correspond roughly to the different parts of the language specification:

• Syntax analysis <-> Syntax• Contextual analysis <-> Contextual constraints• Code generation <-> Semantics

Page 43: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

43

Mini Triangle

Mini Triangle is a very simple Pascal-like programming language.

An example program:

!This is a comment.let const m ~ 7;

var nin

beginn := 2 * m * m ;putint(n)

end

!This is a comment.let const m ~ 7;

var nin

beginn := 2 * m * m ;putint(n)

end

Declarations

Command

Expression

Page 44: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

44

Syntax of Mini Triangle

Program ::= single-Commandsingle-Command

::= V-name := Expression| Identifier ( Expression )| if Expression then single-Command

else single-Command| while Expression do single-Command| let Declaration in single-Command| begin Command end

Command ::= single-Command | Command ; single-Command

Program ::= single-Commandsingle-Command

::= V-name := Expression| Identifier ( Expression )| if Expression then single-Command

else single-Command| while Expression do single-Command| let Declaration in single-Command| begin Command end

Command ::= single-Command | Command ; single-Command

Page 45: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

45

Syntax of Mini Triangle (continued)

Expression ::= primary-Expression| Expression Operator primary-Expression

primary-Expression ::= Integer-Literal| V-name| Operator primary-Expression| ( Expression )

V-name ::= IdentifierIdentifier ::= Letter

| Identifier Letter| Identifier Digit

Integer-Literal ::= Digit | Integer-Literal Digit

Operator ::= + | - | * | / | < | > | =

Expression ::= primary-Expression| Expression Operator primary-Expression

primary-Expression ::= Integer-Literal| V-name| Operator primary-Expression| ( Expression )

V-name ::= IdentifierIdentifier ::= Letter

| Identifier Letter| Identifier Digit

Integer-Literal ::= Digit | Integer-Literal Digit

Operator ::= + | - | * | / | < | > | =

Page 46: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

46

Syntax of Mini Triangle (continued)

Declaration ::= single-Declaration| Declaration ; single-Declaration

single-Declaration ::= const Identifier ~ Expression| var Identifier : Type-denoter

Type-denoter ::= Identifier

Declaration ::= single-Declaration| Declaration ; single-Declaration

single-Declaration ::= const Identifier ~ Expression| var Identifier : Type-denoter

Type-denoter ::= Identifier

Comment ::= ! CommentLine eolCommentLine ::= Graphic CommentLineGraphic ::= any printable character or space

Comment ::= ! CommentLine eolCommentLine ::= Graphic CommentLineGraphic ::= any printable character or space

Page 47: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

47

Syntax Trees

A syntax tree is an ordered labeled tree such that:a) terminal nodes (leaf nodes) are labeled by terminal symbolsb) non-terminal nodes (internal nodes) are labeled by non

terminal symbols.c) each non-terminal node labeled by N has children X1,X2,...Xn

(in this order) such that N := X1,X2,...Xn is a production.

Page 48: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

48

Syntax Trees

Example:

Expression

Expression

V-name

primary-Exp.

Expression

Ident

d +

primary-Exp

Op Int-Lit

10 *

Op

V-name

primary-Exp.

Ident

d

Expression ::= Expression Op primary-Exp1 2 3

1

2

3

Page 49: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

49

Concrete and Abstract Syntax

The previous grammar specified the concrete syntax of Mini Triangle.

The concrete syntax is important for the programmer who needs to know exactly how to write syntactically well-formed programs.

The abstract syntax omits irrelevant syntactic details and only specifies the essential structure of programs.

Example: different concrete syntaxes for an assignmentv := e (set! v e)e -> vv = e

Page 50: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

50

Concrete Syntax of Commands

single-Command ::= V-name := Expression| Identifier ( Expression )| if Expression then single-Command

else single-Command| while Expression do single-Command| let Declaration in single-Command| begin Command end

Command ::= single-Command | Command ; single-Command

single-Command ::= V-name := Expression| Identifier ( Expression )| if Expression then single-Command

else single-Command| while Expression do single-Command| let Declaration in single-Command| begin Command end

Command ::= single-Command | Command ; single-Command

Page 51: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

51

Abstract Syntax of Commands

Command ::= V-name := Expression AssignCmd| Identifier ( Expression ) CallCmd| if Expression then Command

else Command IfCmd| while Expression do Command WhileCmd| let Declaration in Command LetCmd| Command ; Command SequentialCmd

Command ::= V-name := Expression AssignCmd| Identifier ( Expression ) CallCmd| if Expression then Command

else Command IfCmd| while Expression do Command WhileCmd| let Declaration in Command LetCmd| Command ; Command SequentialCmd

Page 52: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

52

Concrete Syntax of Expressions (recap)

Expression ::= primary-Expression| Expression Operator primary-Expression

primary-Expression ::= Integer-Literal| V-name| Operator primary-Expression| ( Expression )

V-name ::= Identifier

Expression ::= primary-Expression| Expression Operator primary-Expression

primary-Expression ::= Integer-Literal| V-name| Operator primary-Expression| ( Expression )

V-name ::= Identifier

Page 53: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

53

Abstract Syntax of Expressions

Expression ::= Integer-Literal IntegerExp| V-name VnameExp| Operator Expression UnaryExp| Expression Op Expression BinaryExp

V-name::= Identifier SimpleVName

Expression ::= Integer-Literal IntegerExp| V-name VnameExp| Operator Expression UnaryExp| Expression Op Expression BinaryExp

V-name::= Identifier SimpleVName

Page 54: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

54

Abstract Syntax TreesAbstract Syntax Tree for: d:=d+10*n

BinaryExpression

VNameExp

BinaryExpression

Ident

d + 10 *

OpOp Int-Lit

SimpleVName

IntegerExp VNameExp

Ident

n

SimpleVName

AssignmentCmd

d

Ident

VName

SimpleVName

Page 55: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

55

1) Syntax Analysis

Syntax Analysis

Source Program

Abstract Syntax Tree

Error Reports

Note: Not all compilers construct anexplicit representation of an AST. (e.g. on a “single pass compiler” generally no need to construct an AST)

Note: Not all compilers construct anexplicit representation of an AST. (e.g. on a “single pass compiler” generally no need to construct an AST)

Page 56: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

56

Example Program

We now look at each of the three different phases in a little more detail. We look at each of the steps in transforming an example Mini Triangle program into TAM code.

! This program is useless except for! illustrationlet var n: integer;

var c: charin begin

c := ‘&’;n := n+1

end

! This program is useless except for! illustrationlet var n: integer;

var c: charin begin

c := ‘&’;n := n+1

end

Page 57: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

57

1) Syntax Analysis -> AST

Program

LetCommand

SequentialDeclaration

n Integer c Char c ‘&’ n n + 1

Ident Ident Ident Ident Ident Ident Ident OpChar.Lit Int.Lit

SimpleT

VarDecl

SimpleT

VarDecl

SimpleV

Char.Expr

SimpleV

VNameExp Int.Expr

AssignCommand BinaryExpr

SequentialCommand

AssignCommand

Page 58: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

58

2) Contextual Analysis -> Decorated AST

Contextual Analysis

Decorated Abstract Syntax Tree

Error Reports

Abstract Syntax Tree

Contextual analysis:• Scope checking: verify that all applied occurrences of

identifiers are declared• Type checking: verify that all operations in the program are

used according to their type rules.Annotate AST:

• Applied identifier occurrences => declaration• Expressions => Type

Page 59: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

59

2) Contextual Analysis -> Decorated AST

Program

LetCommand

SequentialDeclaration

n

Ident Ident Ident Ident

SimpleT

VarDecl

SimpleT

VarDecl

Integer c Char c ‘&’ n n + 1

Ident Ident Ident OpChar.Lit Int.Lit

SimpleV

Char.Expr

SimpleV

VNameExp Int.Expr

AssignCommand BinaryExpr

SequentialCommand

AssignCommand

:char

:char

:int

:int

:int :int

Page 60: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

60

Contextual Analysis

Finds scope and type errors.

AssignCommand

:char:int

***TYPE ERROR (incompatible types in assigncommand)

Example 1:

Example 2:

foo

Ident

SimpleV

foo not found

***SCOPE ERROR: undeclared variable foo

Page 61: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

61

3) Code Generation

• Assumes that program has been thoroughly checked and is well formed (scope & type rules)

• Takes into account semantics of the source language as well as the target language.

• Transforms source program into target code.

Code Generation

Decorated Abstract Syntax Tree

Object Code

Page 62: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

62

3) Code Generation

let var n: integer;var c: char

in beginc := ‘&’;n := n+1

end

PUSH 2LOADL 38STORE 1[SB]LOAD 0LOADL 1CALL addSTORE 0[SB]POP 2HALT

n

Ident Ident

SimpleT

VarDecl

Integer

address = 0[SB]

Page 63: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

63

Compiler Passes

• A pass is a complete traversal of the source program, or a complete traversal of some internal representation of the source program.

• A pass can correspond to a “phase” but it does not have to!

• Sometimes a single “pass” corresponds to several phases that are interleaved in time.

• What and how many passes a compiler does over the source program is an important design decision.

Page 64: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

64

Single Pass Compiler

Compiler Driver

Syntactic Analyzer

calls

calls

Contextual Analyzer Code Generator

calls

Dependency diagram of a typical Single Pass Compiler:

A single pass compiler makes a single pass over the source text,parsing, analyzing and generating code all at once.

Page 65: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

65

Multi Pass Compiler

Compiler Driver

Syntactic Analyzer

callscalls

Contextual Analyzer Code Generator

calls

Dependency diagram of a typical Multi Pass Compiler:

A multi pass compiler makes several passes over the program. Theoutput of a preceding phase is stored in a data structure and used by subsequent phases.

input

Source Text

output

AST

input output

Decorated AST

input output

Object Code

Page 66: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

66

The Mini Triangle Compiler Driver

public class Compiler {public static void compileProgram(...) {

Parser parser = new Parser(...);Checker checker = new Checker(...);Encoder generator = new Encoder(...);

Program theAST = parser.parse();

checker.check(theAST);generator.encode(theAST);

}public void main(String[] args) {

... compileProgram(...) ...}

}

public class Compiler {public static void compileProgram(...) {

Parser parser = new Parser(...);Checker checker = new Checker(...);Encoder generator = new Encoder(...);

Program theAST = parser.parse();

checker.check(theAST);generator.encode(theAST);

}public void main(String[] args) {

... compileProgram(...) ...}

}

Page 67: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

67

Compiler Design Issues

Single Pass Multi Pass

Speed

Memory

Modularity

Flexibility

“Global” optimization

Source Language

better worse

better for large programs

(potentially) better for small programs

worse better

betterworse

impossible possible

single pass compilers are not possible for many programming languages

Page 68: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

68

Language Issues

Example Pascal:Pascal was explicitly designed to be easy to implement

with a single pass compiler:– Every identifier must be declared before it is first used.

var n:integer;

procedure inc;begin

n:=n+1end

Undeclared Variable!

procedure inc;begin

n:=n+1end;

var n:integer;

?

Page 69: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

69

Language Issues

Example Pascal:– Every identifier must be declared before it is used. – How to handle mutual recursion then?

procedure ping(x:integer)begin

... pong(x-1); ...end;

procedure pong(x:integer)begin

... ping(x); ...end;

Page 70: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

70

Language Issues

Example Pascal:– Every identifier must be declared before it is used. – How to handle mutual recursion then?

forward procedure pong(x:integer)

procedure ping(x:integer)begin

... pong(x-1); ...end;

procedure pong(x:integer)begin

... ping(x); ...end;

OK!

Page 71: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

71

Language Issues

Example Java:– identifiers can be declared before they are used. – thus a Java compiler needs at least two passes

Class Example {

void inc() { n = n + 1; }

int n;

void use() { n = 0 ; inc(); }

}

Page 72: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

72

Scope of Variable• Range of program that can reference that variable (ie

access the corresponding data object by the variable’s name)

• Variable is local to program or block if it is declared there

• Variable is non-local to program unit if it is visible there but not declared there

Page 73: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

73

Static vs. Dynamic Scope• Under static, sometimes

called lexical, scope, sub1 will always reference the xdefined in big

• Under dynamic scope, the x it references depends on the dynamic state of execution

procedure big;var x: integer;procedure sub1;begin {sub1}

... x ...end; {sub1}

procedure sub2;var x:

integer;begin {sub2}

...sub1;...

end; {sub2}

begin {big}

...sub1;sub2;

...end; {big}

Page 74: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

74

Static Scoping• Scope computed at compile time, based on program text• To determine the name of a used variable we must find statement

declaring variable• Subprograms and blocks generate hierarchy of scopes

– Subprogram or block that declares current subprogram or contains current block is its static parent

• General procedure to find declaration:– First see if variable is local; if yes, done– If non-local to current subprogram or block recursively search

static parent until declaration is found– If no declaration is found this way, undeclared variable error

detected

Page 75: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

75

Exampleprogram main;

var x : integer;procedure sub1;var x : integer;begin { sub1 }… x …

end; { sub1 }

begin { main }… x …end; { main }

Page 76: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

76

Dynamic Scope

• Now generally thought to have been a mistake • Main example of use: original versions of LISP

– Scheme uses static scope– Perl allows variables to be declared to have dynamic scope

• Determined by the calling sequence of program units, not static layout

• Name bound to corresponding variable most recently declared among still active subprograms and blocks

Page 77: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

77

Example

program main;var x : integer;procedure sub1;begin { sub1 }… x …

end; { sub1 }

procedure sub2;var x :

integer;begin { sub2

}… call sub1 …end; { sub2 }

… call sub2…end; { main }

Page 78: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

78

Binding

• Binding: an association between an attribute and its entity

• Binding Time: when does it happen?

• … and, when can it happen?

Page 79: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

79

Binding of Data Objects and Variables

• Attributes of data objects and variables have different binding times

• If a binding is made before run time and remains fixed through execution, it is called static

• If the binding first occurs or can change during execution, it is called dynamic

Page 80: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

80

Binding Time

Static• Language definition time• Language

implementation time• Program writing time• Compile time• Link time• Load time

Dynamic• Run time

– At the start of execution (program)

– On entry to a subprogram or block

– When the expression is evaluated

– When the data is accessed

Page 81: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

81

X = X + 10• Set of types for variable X• Type of variable X• Set of possible values for variable X• Value of variable X• Scope of X

– lexical or dynamic scope• Representation of constant 10

– Value (10)– Value representation (10102)

• big-endian vs. little-endian– Type (int)– Storage (4 bytes)

• stack or global allocation• Properties of the operator +

– Overloaded or not

Page 82: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

82

Little- vs. Big-Endians• Big-endian

– A computer architecture in which, within a given multi-byte numeric representation, the most significant byte has the lowest address (the word is stored `big-end-first').

– Motorola and Sun processors

• Little-endian– a computer architecture in which, within a given 16- or 32-bit word, bytes at lower

addresses have lower significance (the word is stored `little-end-first').

– Intel processors

from The Jargon Dictionary - http://info.astrian.net/jargon

Page 83: Languages and Compilers (SProg og Oversættere)people.cs.aau.dk/~bt/SPOF07/SPOF07-2.pdf · Cross compilation x86 Example: A C “cross compiler” from x86 to PPC C -> PPC x86 A cross

83

Binding Times summary

• Language definition time: – language syntax and semantics, scope discipline

• Language implementation time: – interpreter versus compiler, – aspects left flexible in definition, – set of available libraries

• Compile time: – some initial data layout, internal data structures

• Link time (load time): – binding of values to identifiers across program modules

• Run time (execution time): – actual values assigned to non-constant identifiers

The Programming language designer and compiler implementer have to make decisions about binding times