automatically proving the correctness of compiler optimizations

Post on 22-Mar-2016



Automatically Proving the Correctness of Compiler Optimizations

Sorin Lerner, Todd Millstein, Craig Chambers

University of Washington

Goal: correct compilers

• The compiler is usually part of the trusted computing base.

• “But I use gcc, and it works great!”

gcc-bugs mailing list

Searched for “incorrect” and “wrong” in the gcc-bugs mailing list. Some of the results:

• c/9525: incorrect code generation on SSE2 intrinsics
• target/7336: [ARM] With -Os option, gcc incorrectly computes the elimination offset
• optimization/9325: wrong conversion of constants: (int)(float)(int) (INT_MAX)
• optimization/6537: For -O (but not -O2 or -O0) incorrect assembly is generated
• optimization/6891: G++ generates incorrect code when -Os is used
• optimization/8613: [3.2/3.3/3.4 regression] -O2 optimization generates wrong code
• target/9732: PPC32: Wrong code with -O2 -fPIC
• c/8224: Incorrect joining of signed and unsigned division
• …

And this is only for February 2003! On a mature compiler!

Testing

[Diagram: Source → compiler → Compiled Prog; run the compiled program on an input and DIFF its output against the expected output]

• No correctness guarantees:
  – neither for the compiled prog
  – nor for the compiler

• To get benefits, must:
  – run over many inputs
  – compile many test cases

Verify each compilation

[Diagram: Source → compiler → Compiled Prog; a Semantic DIFF compares Source and Compiled Prog]

• Translation validation [Pnueli et al 98, Necula 00]

• Credible compilation [Rinard 99]

• Compiler can still have bugs.

• Compile time increases.

• “Semantic Diff” is hard.

Proving the whole compiler correct

[Diagram: Source → compiler → Compiled Prog; a correctness checker examines the compiler itself]

• Option 1: Prove compiler correct by hand.

• Proofs are long…

• And hard.

• Compilers are proven correct as written on paper. What about the implementation?

[Diagram: paper proofs on one side, the compiler and its correctness checker on the other, with the question “Link?” between them]

Our Approach

• Our approach: prove compiler correct automatically.

[Diagram: an automatic theorem prover checks the compiler]

This seems really hard!

[Diagram: the task of proving the compiler correct, comparing the complexity that an automatic theorem prover can handle with the complexity of proving a compiler correct]

Making the problem easier

• Only prove optimizer correct.

• Trust front-end and code-generator.

• Write optimizations in Cobalt, a domain-specific language.

• Separate correctness from profitability.

• Factor out the hard and common parts of the proof, and prove them once by hand.

[Diagram: with each step, the task of proving the optimizer correct moves closer to what the automatic theorem prover can handle]

Results

• Cobalt language
  – realistic C-like IL
  – implemented const prop and folding, branch folding, CSE, PRE, DAE, partial DAE, and simple forms of points-to analyses

• Correctness checker for Cobalt opts
  – using the Simplify theorem prover

• Execution engine for Cobalt opts
  – in the Whirlwind compiler

Caveats

• May not be able to express your opt in Cobalt:
  – no interprocedural optimizations for now.
  – optimizations that build complicated data structures may be difficult to express.

• A sound Cobalt optimization may be rejected by the correctness checker.

• Trusted computing base (TCB) includes:
  – front-end and code-generator, execution engine, correctness checker, proofs done by hand once

Outline

• Overview

• Forward optimizations (see paper for backwards)
  – Example: constant propagation
  – Strategy for proving forward optimizations sound

• Profitability heuristics

• Pure analyses

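Constant Prop (straight-line code)

y := 5        (statement y := 5)
...           (statements that don’t define y)
x := y        (statement x := y)   REPLACE with x := 5

Adding arbitrary control flow

[Diagram: a CFG in which y := 5 appears on the paths leading to x := y; every path from a statement y := 5 passes only through statements that don’t define y before reaching the statement x := y, which is REPLACEd with x := 5]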

Constant prop in English

if statement y := 5
is followed by statements that don’t define y
until statement x := y
then transform statement to x := 5

Constant prop in Cobalt

stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C

• stmt(Y := C) and ¬mayDef(Y) are boolean expressions evaluated at nodes in the CFG.
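The pattern above can be read as a forward dataflow problem. The following is a minimal sketch in Python, not the Cobalt execution engine: the tuple-based mini-IR, the function names `transfer` and `constant_prop`, and the example CFG are all assumptions made for illustration. A fact `(y, c)` holds at a node when every path to it has a `y := c` followed only by statements that do not define `y`.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical mini-IR (not Cobalt syntax). A statement is one of:
#   ("const", y, c)   meaning  y := c
#   ("copy", x, y)    meaning  x := y
#   ("skip",)         meaning  a statement that defines no variable
Stmt = Tuple
Fact = Tuple[str, int]


def transfer(stmt: Stmt, facts_in: Set[Fact]) -> Set[Fact]:
    """Kill facts about the defined variable, then add the fact from y := c."""
    defined = stmt[1] if stmt[0] in ("const", "copy") else None
    facts_out = {(var, val) for (var, val) in facts_in if var != defined}
    if stmt[0] == "const":
        facts_out.add((stmt[1], stmt[2]))
    return facts_out


def constant_prop(stmts: Dict[str, Stmt],
                  succs: Dict[str, List[str]],
                  entry: str) -> Dict[str, Stmt]:
    """Rewrite x := y to x := c wherever y == c holds on every incoming path."""
    facts_in: Dict[str, Optional[Set[Fact]]] = {n: None for n in stmts}
    facts_in[entry] = set()          # nothing is known at the entry node
    worklist = [entry]
    while worklist:
        node = worklist.pop()
        facts_out = transfer(stmts[node], facts_in[node])
        for succ in succs.get(node, []):
            old = facts_in[succ]
            # Meet over paths: a fact survives only if all predecessors agree.
            new = facts_out if old is None else old & facts_out
            if new != old:
                facts_in[succ] = new
                worklist.append(succ)

    rewritten = dict(stmts)
    for node, stmt in stmts.items():
        known = dict(facts_in[node] or ())
        if stmt[0] == "copy" and stmt[2] in known:
            rewritten[node] = ("const", stmt[1], known[stmt[2]])
    return rewritten


# Diamond-shaped CFG: y := 5, then two branches that leave y alone, then x := y.
diamond = {"n1": ("const", "y", 5), "n2": ("skip",),
           "n3": ("skip",), "n4": ("copy", "x", "y")}
edges = {"n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"]}
optimized = constant_prop(diamond, edges, "n1")

# Same CFG, but one branch redefines y, so the rewrite is no longer legal.
clobbered = dict(diamond, n3=("const", "y", 6))
not_optimized = constant_prop(clobbered, edges, "n1")
```

In the first CFG the statement at `n4` becomes `x := 5`; in the second it is left as `x := y` because the two paths disagree about the value of `y`.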

Outline

• Overview

• Forward optimizations (see paper for backwards)
  – Example: constant propagation
  – Strategy for proving forward optimizations sound

• Profitability heuristics

• Pure analyses

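Proving correctness automatically

[Diagram: the CFG from the constant propagation example, with x := y replaced by x := 5 and the region between the y := 5 statements and x := y highlighted]

• Witnessing region

• Invariant: y == 5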

Constant prop revisited

stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness Y == C

Ask a theorem prover to show:
1. A statement satisfying stmt(Y := C) establishes Y == C
2. A statement satisfying ¬mayDef(Y) maintains Y == C
3. The statements X := Y and X := C have the same semantics in a program state satisfying Y == C

Generalize to any forward optimization

ψ1
followed by ψ2
until s ⇒ s’
with witness P

Ask a theorem prover to show:
1. A statement satisfying ψ1 establishes P
2. A statement satisfying ψ2 maintains P
3. The statements s and s’ have the same semantics in a program state satisfying P

We showed by hand once that these conditions imply correctness.
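To make the three obligations concrete, here is a minimal sketch in Python that checks them by brute-force enumeration over a small set of program states. This is an assumption-laden stand-in: the talk's checker discharges the obligations symbolically with the Simplify theorem prover, whereas this sketch only tests a finite sample and proves nothing in general. The names `failed_obligations`, `assign`, and the sample statements are hypothetical.

```python
from itertools import product

VARS = ("x", "y")
VALUES = range(-1, 7)   # small finite sample of values; an illustration only


def all_states():
    """Enumerate every state over VARS with values drawn from VALUES."""
    for vals in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, vals))


def assign(var, expr):
    """Build the statement `var := expr(state)` as a state-to-state function."""
    def run(state):
        new_state = dict(state)
        new_state[var] = expr(state)
        return new_state
    return run


def failed_obligations(enabling, innocuous, s, s_prime, witness):
    """Return the numbers (1, 2, 3) of the obligations that fail on the sample.

    enabling  : sample of statements satisfying the first guard (psi1)
    innocuous : sample of statements satisfying the second guard (psi2)
    s, s_prime: original and transformed statement
    witness   : predicate P over program states
    """
    failed = set()
    for state in all_states():
        # 1. Every enabling statement establishes the witness.
        if any(not witness(stmt(state)) for stmt in enabling):
            failed.add(1)
        if witness(state):
            # 2. Every innocuous statement maintains the witness.
            if any(not witness(stmt(state)) for stmt in innocuous):
                failed.add(2)
            # 3. s and s' agree in any state satisfying the witness.
            if s(state) != s_prime(state):
                failed.add(3)
    return sorted(failed)


# Constant propagation instantiated with Y = y, C = 5, X = x.
y_is_5 = lambda st: st["y"] == 5
set_y_5 = assign("y", lambda st: 5)
bump_x = assign("x", lambda st: st["x"] + 1)      # does not define y
bump_y = assign("y", lambda st: st["y"] + 1)      # defines y
copy_y_to_x = assign("x", lambda st: st["y"])
set_x_5 = assign("x", lambda st: 5)

sound = failed_obligations([set_y_5], [bump_x], copy_y_to_x, set_x_5, y_is_5)
# A buggy mayDef that wrongly treats `y := y + 1` as not defining y.
buggy = failed_obligations([set_y_5], [bump_x, bump_y],
                           copy_y_to_x, set_x_5, y_is_5)
```

With the correct guard no obligation fails on the sample; with the buggy guard obligation 2 fails, because `y := y + 1` does not maintain `y == 5`.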

Outline

• Overview

• Forward optimizations (see paper for backwards)

• Profitability heuristics

• Pure analyses

Profitability heuristics

• Optimization correct ⇒ safe to perform any subset of the matching transformations.

• So far, all transformations were also profitable.

• In some cases, many transformations are legal, but only a few are profitable.

The two pieces of an optimization

ψ1
followed by ψ2
until s ⇒ s’
with witness P
filtered through choose

• Transformation pattern:
  – defines which transformations are legal.

• Profitability heuristic:
  – describes which of the legal transformations to actually perform.
  – does not affect soundness.
  – can be written in a language of the user’s choice.

• This way of factoring an optimization is crucial to our ability to prove optimizations sound automatically.
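One way to see why the heuristic cannot affect soundness is to have the engine intersect whatever the heuristic returns with the legal set. The sketch below is written in Python under that assumption; `select_transformations`, the `Transformation` tuple, and the sample heuristic are hypothetical names, not the Cobalt engine's interface.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# A transformation is modelled as (node, old statement, new statement).
Transformation = Tuple[str, str, str]


def select_transformations(
    legal: FrozenSet[Transformation],
    choose: Callable[[FrozenSet[Transformation]], Iterable[Transformation]],
) -> FrozenSet[Transformation]:
    """Apply an untrusted heuristic, keeping only transformations that are legal.

    Because any subset of the legal transformations is safe to perform,
    intersecting with `legal` means a faulty `choose` can cost performance
    but cannot introduce an illegal rewrite.
    """
    return frozenset(choose(legal)) & legal


legal_rewrites = frozenset({
    ("n4", "x := y", "x := 5"),
    ("n7", "z := y", "z := 5"),
})


def careless_heuristic(candidates):
    """Pick the rewrite at n4, plus a bogus one the pattern never matched."""
    picked = [t for t in candidates if t[0] == "n4"]
    picked.append(("n9", "w := v", "w := 0"))
    return picked


performed = select_transformations(legal_rewrites, careless_heuristic)
```

Here only the rewrite at `n4` is performed; the bogus request for `n9` is dropped because it is not in the legal set.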

Profitability heuristic example: PRE

• PRE as code duplication followed by CSE

a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
} else {
  ...
}
x := a + b;

• Code duplication: a copy of x := a + b; is placed in the else branch.

• CSE: the final x := a + b; becomes x := x;

• Self-assignment removal: x := x; is deleted.

[Diagram: the legal placements of x := a + b, with the profitable placement highlighted]
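The effect of the three steps can be sketched in Python by writing the program before and after PRE as two functions and comparing them. This is an illustration only: the elided `...` expressions in the slide are modelled as function parameters (`a0`, `b0`, `cond`, `a1`, all hypothetical), and the `adds` counter is an assumption added to show where the saving comes from.

```python
def before_pre(a0, b0, cond, a1):
    """The original program; returns (x, number of additions executed)."""
    adds = 0
    a, b = a0, b0
    if cond:
        a = a1
        x = a + b
        adds += 1
    x = a + b          # recomputed even when the then-branch already did it
    adds += 1
    return x, adds


def after_pre(a0, b0, cond, a1):
    """After code duplication, CSE, and self-assignment removal."""
    adds = 0
    a, b = a0, b0
    if cond:
        a = a1
        x = a + b
        adds += 1
    else:
        x = a + b      # duplicated copy placed in the else branch
        adds += 1
    # CSE turned the final `x := a + b` into `x := x`,
    # and self-assignment removal deleted it.
    return x, adds


def same_result_everywhere():
    """Compare the two versions on a small sample of inputs."""
    sample = range(-2, 3)
    return all(
        before_pre(a0, b0, cond, a1)[0] == after_pre(a0, b0, cond, a1)[0]
        for a0 in sample for b0 in sample
        for cond in (False, True) for a1 in sample
    )
```

On the sample both versions compute the same `x`, while the transformed version performs one addition on every path instead of two on the then-path.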

Outline

• Overview

• Forward optimizations (see paper for backwards)

• Profitability heuristics

• Pure analyses

Constant prop revisited (again)

stmt(Y := C)
followed by ¬mayDef(Y)
until X := Y ⇒ X := C
with witness Y == C

mayDef in Cobalt

[Slide figure: the Cobalt definition of mayDef]

• Very conservative!

• Can we do better?

• mayPntTo is a pure analysis.

• It computes dataflow info, but performs no transformations.

mayPntTo in Cobalt

stmt(decl X)
followed by ¬stmt(... := &X)
defines addrNotTaken(X)
with witness “no location in the store points to X”

mayPntTo(X,Y) ≜ ¬addrNotTaken(Y)

[Diagram: the CFG region from decl X to a later statement s]
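The pure analysis above can be sketched in Python for straight-line code. The sketch is a simplification under stated assumptions: it handles a single list of statements rather than a CFG, and the tuple encoding and the names `addr_not_taken_before` and `may_point_to` are hypothetical. The label addrNotTaken(X) starts holding after `decl X` and stops holding once a statement of the form `... := &X` is seen.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical statement encoding (not Cobalt syntax):
#   ("decl", x)          meaning  decl x
#   ("addr_of", p, x)    meaning  p := &x
#   ("other",)           meaning  any statement that takes no address
Stmt = Tuple


def addr_not_taken_before(stmts: List[Stmt]) -> List[FrozenSet[str]]:
    """For each statement, the variables labelled addrNotTaken on entry to it."""
    facts = []
    current = set()
    for stmt in stmts:
        facts.append(frozenset(current))
        if stmt[0] == "decl":
            current.add(stmt[1])        # decl X starts the region
        elif stmt[0] == "addr_of":
            current.discard(stmt[2])    # ... := &X ends it
    return facts


def may_point_to(labels: FrozenSet[str], x: str, y: str) -> bool:
    """mayPntTo(X, Y) holds exactly when addrNotTaken(Y) does not."""
    return y not in labels


program = [
    ("decl", "y"),
    ("decl", "p"),
    ("other",),
    ("addr_of", "p", "y"),
    ("other",),
]
labels = addr_not_taken_before(program)
```

Before `p := &y` the analysis reports that `p` cannot point to `y`; after it, `p` may point to `y`. The analysis only attaches labels and rewrites nothing, which is what makes it a pure analysis.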

Future work

• Improving expressiveness
  – interprocedural optimizations
  – one-to-many and many-to-many transformations

• Inferring the witness

• Generate specialized compiler binary from the Cobalt sources.

Summary and Conclusion

• Optimizations written in a domain-specific language can be proven correct automatically.

• Our correctness checker found several subtle bugs in Cobalt optimizations.

• A good step towards proving compilers correct automatically.
