languages and compilers (sprog og oversættere)

36
1 Languages and Compilers (SProg og Oversættere) Bent Thomsen Department of Computer Science Aalborg University nowledgement to Norm Hutchinson who’s slides this lecture is based on.

Upload: alexandra-valdez

Post on 31-Dec-2015

32 views

Category:

Documents


2 download

DESCRIPTION

Languages and Compilers (SProg og Oversættere). Bent Thomsen Department of Computer Science Aalborg University. With acknowledgement to Norm Hutchinson who’s slides this lecture is based on. is expressed in the source language. is expressed in the target language. is expressed in the - PowerPoint PPT Presentation

TRANSCRIPT

1

Languages and Compilers(SProg og Oversættere)

Bent Thomsen

Department of Computer Science

Aalborg University

With acknowledgement to Norm Hutchinson who’s slides this lecture is based on.

2

Terminology

Translatorinput output

source program object program

is expressed in thesource language

is expressed in theimplementation language

is expressed in thetarget language

Q: Which programming languages play a role in this picture?

A: All of them!

3

Tombstone Diagrams

What are they?– diagrams consisting out of a set of “puzzle pieces” we can use

to reason about language processors and programs

– different kinds of pieces

– combination rules (not all diagrams are “well formed”)

M

Machine implemented in hardware

S -> T

L

Translator implemented in L

ML

Language interpreter in L

Program P implemented in L

LP

4

Tombstone diagrams: Combination rules

SP P

TS -> T

M

M

LP

S -> T

MWRONG!

OK!OK!

OK!MMP

OK!

M

LP

WRONG!

5

Tetrisx86C

Tetris

Compilation

x86

Example: Compilation of C programs on an x86 machine

C -> x86

x86x86

Tetris

x86

6

TetrisPPCC

Tetris

Cross compilation

x86

Example: A C “cross compiler” from x86 to PPC

C -> PPC

x86

A cross compiler is a compiler which runs on one machine (the host machine) but emits code for another machine (the target machine).

Host ≠ Target

Q: Are cross compilers useful? Why would/could we use them?

PPCTetris

PPC

download

7

Tetrisx86

TetrisJVMJava

Tetris

Two Stage Compilation

x86

Java->JVM

x86

A two-stage translator is a composition of two translators. The output of the first translator is provided as input to the second translator.

x86

JVM->x86

x86

8

x86

Java->x86

Compiling a Compiler

Observation: A compiler is a program! Therefore it can be provided as input to a language processor.Example: compiling a compiler.

Java->x86

C

x86

C -> x86

x86

9

Interpreters

An interpreter is a language processor implemented in software, i.e. as a program.

Terminology: abstract (or virtual) machine versus real machine

Example: The Java Virtual Machine

JVMx86

x86

JVMTetris

Q: Why are abstract machines useful?

10

Interpreters

Q: Why are abstract machines useful?

1) Abstract machines provide better platform independence

JVMx86

x86 PPC

JVMTetris

JVMPPC

JVMTetris

11

Interpreters

Q: Why are abstract machines useful?

2) Abstract machines are useful for testing and debugging.

Example: Testing the “Ultima” processor using hardware emulation

Ultimax86

x86

UltimaUltima

P

UltimaP

Functional equivalence

Note: we don’t have to implement Ultima emulator in x86 we can use a high-level language and compile it.

12

Interpreters versus Compilers

Q: What are the tradeoffs between compilation and interpretation?

Compilers typically offer more advantages when – programs are deployed in a production setting– programs are “repetitive”– the instructions of the programming language are complex

Interpreters typically are a better choice when– we are in a development/testing/debugging stage– programs are run once and then discarded – the instructions of the language are simple – the execution speed is overshadowed by other factors

• e.g. on a web server where communications costs are much higher than execution speed

13

Interpretive Compilers

Why?A tradeoff between fast(er) compilation and a reasonable runtime performance.

How?Use an “intermediate language”• more high-level than machine code => easier to compile to• more low-level than source language => easy to implement as an

interpreter

Example: A “Java Development Kit” for machine M

Java->JVM

M

JVMM

14

PJVMJava

P

Interpretive Compilers

Example: Here is how we use our “Java Development Kit” to run a Java program P

Java->JVM

M JVMMM

JVMP

M

15

Portable Compilers

Example: Two different “Java Development Kits”

Java->JVM

JVM

JVMM

Kit 2:

Java->JVM

M

JVMM

Kit 1:

Q: Which one is “more portable”?

16

Portable Compilers

In the previous example we have seen that portability is not an “all or nothing” kind of deal.

It is useful to talk about a “degree of portability” as the percentage of code that needs to be re-written when moving to a dissimilar machine.

In practice 100% portability is as good as impossible.

17

Example: a “portable” compiler kit

Java->JVM

Java

JVMJava

Java->JVM

JVM

Q: Suppose we want to run this kit on some machine M. How could we go about realizing that goal? (with the least amount of effort)

Portable Compiler Kit:

18

Example: a “portable” compiler kit

Java->JVM

Java

JVMJava

Java->JVM

JVM

Q: Suppose we want to run this kit on some machine M. How could we go about realizing that goal? (with the least amount of effort)

JVMJava

JVMC

reimplement

C->M

M

JVMM

M

19

Example: a “portable” compiler kit

Java->JVM

Java

JVMJava

Java->JVM

JVM

JVMM

This is what we have now:

Now, how do we run our Tetris program?

TetrisJVMJava

Tetris

M

Java->JVM

JVMJVM

M

JVMTetris

JVMM

M

20

Bootstrapping

Java->JVM

Java

JVMJava

Java->JVM

JVM

Remember our “portable compiler kit”:

We haven’t used this yet!

Java->JVM

Java

Same language! Q: What can we do with a compiler written in itself? Is that useful at all?

JVMM

21

Bootstrapping

Java->JVM

Java

Same language!

Q: What can we do with a compiler written in itself? Is that useful at all?

• By implementing the compiler in (a subset of) its own language, we become less dependent on the target platform => more portable implementation.

• But… “chicken and egg problem”? How do to get around that?=> BOOTSTRAPPING: requires some work to make the first “egg”.

There are many possible variations on how to bootstrap a compiler written in its own language.

22

Bootstrapping an Interpretive Compiler to Generate M code

Java->JVM

Java

JVMJava

Java->JVM

JVM

Our “portable compiler kit”:

PMJava

P

Goal we want to get a “completely native” Java compiler on machine M

Java->M

M

JVMM

M

23

PM

PMJava

P

Bootstrapping an Interpretive Compiler to Generate M code

Idea: we will build a two-stage Java -> M compiler.

PJVM

M

Java->JVM

MM

JVM->M

M

We will make this by compiling

To get this we implement

JVM->M

JavaJava->JVM

JVM and compile it

24

Bootstrapping an Interpretive Compiler to Generate M code

Step 1: implement

JVM->M

Java JVM

JVM->MJava->JVM

JVMJVM

M

M

JVM->M

Java

Step 2: compile it

Step 3: compile this

25

Bootstrapping an Interpretive Compiler to Generate M code

Step 3: “Self compile” the JVM (in JVM) compiler

M

JVM->M

JVMM

M

JVM->M

JVM JVM->M

JVM

This is the second stage of our compiler!

Step 4: use this to compile the Java compiler

26

Bootstrapping an Interpretive Compiler to Generate M code

Step 4: Compile the Java->JVM compiler into machine code

M

Java->JVM

M

Java->JVM

JVM JVM->M

M

The first stage of our compiler!

We are DONE!

27

Full Bootstrap

A full bootstrap is necessary when we are building a new compiler from scratch.

Example:We want to implement an Ada compiler for machine M. We don’t currently have access to any Ada compiler (not on M, nor on any other machine).

Idea: Ada is very large, we will implement the compiler in a subset of Ada and bootstrap it from a subset of Ada compiler in another language. (e.g. C)

Ada-S ->M

C

v1Step 1: build a compiler for Ada-S in another language

28

Full Bootstrap

Ada-S ->M

C

v1

Step 1a: build a compiler (v1) for Ada-S in another language.

Ada-S ->M

C

v1

M

Ada-S->Mv1

Step 1b: Compile v1 compiler on M

M

C->M

MThis compiler can be used for bootstrapping on machine M but we do not want to rely on it permanently!

29

Full Bootstrap

Ada-S ->M

Ada-S

v2

Step 2a: Implement v2 of Ada-S compiler in Ada-S

Ada-S ->M

Ada-S

v2

M

M

Ada-S->Mv2

Step 2b: Compile v2 compiler with v1 compiler

Ada-S ->M

M

v1

Q: Is it hard to rewrite the compiler in Ada-S?

We are now no longer dependent on the availability of a C compiler!

30

Full Bootstrap

Step 3a: Build a full Ada compiler in Ada-S

Ada->M

Ada-S

v3

M

M

Ada->Mv3

Ada-S ->M

M

v2

Step 3b: Compile with v2 compiler

Ada->M

Ada-S

v3

From this point on we can maintain the compiler in Ada. Subsequent versions v4,v5,... of the compiler in Ada and compile each with the the previous version.

31

Half Bootstrap

We discussed full bootstrap which is required when we have no access to a compiler for our language at all.

Q: What if we have access to an compiler for our language on a different machine HM but want to develop one for TM ?

Ada->HM

HM

We have:

Ada->TM

TM

We want:

Idea: We can use cross compilation from HM to TM to bootstrap the TM compiler.

Ada->HM

Ada

32

HM

Ada->TM

Half Bootstrap

Idea: We can use cross compilation from HM to M to bootstrap the M compiler.

Step 1: Implement Ada->TM compiler in Ada

Ada->TM

Ada

Step 2: Compile on HM

Ada->TM

Ada Ada->HM

HM

HM

Cross compiler: running on HM but

emits TM code

33

TM

Ada->TM

Half Bootstrap

Step 3: Cross compile our TM compiler.

Ada->TM

Ada Ada->TM

HM

HM

DONE!

From now on we can develop subsequent versions of the compiler completely on TM

34

Bootstrapping to Improve Efficiency

The efficiency of programs and compilers:Efficiency of programs:

- memory usage- runtime

Efficiency of compilers: - Efficiency of the compiler itself- Efficiency of the emitted code

Idea: We start from a simple compiler (generating inefficient code) and develop more sophisticated version of it. We can then use bootstrapping to improve performance of the compiler.

35

Bootstrapping to Improve Efficiency

We have:Ada->Mslow

Ada

Ada-> Mslow

Mslow

We implement:Ada->Mfast

Ada

Ada->Mfast

Ada

M

Ada->Mfast

Mslow

Step 1

Ada-> Mslow

Mslow

Step 2 Ada->Mfast

Ada

M

Ada->Mfast

MfastAda-> Mfast

Mslow

Fast compiler that emits fast code!

36

Conclusion

• To write a good compiler you may be writing several simpler ones first

• You have to think about the source language, the target language and the implementation language.

• The work of a compiler writer is never finished, there is always version 1.x and version 2.0 and …