proof-carrying code

50
Proof-Carrying Code

Upload: ciel

Post on 21-Jan-2016

39 views

Category:

Documents


0 download

DESCRIPTION

Proof-Carrying Code. Programmable mobile devices. By 2003, one in five people will own a mobile communications device. Nokia expects to sell 500M Java-enabled phones in 2003. Most of these devices will be power and memory limited. Mobile/Wireless Devices. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Proof-Carrying Code

Proof-Carrying Code

Page 2: Proof-Carrying Code

Programmable mobile devices

By 2003, one in five people will own a mobile communications device.

Nokia expects to sell 500M Java-enabled phones in 2003.

Most of these devices will be power and memory limited.

Page 3: Proof-Carrying Code

Mobile/Wireless Devices

In ‘97, 101M mobile phones vs 82M PCs. (40% vs 14%.)

95% phones will be WAP enabled by ‘04.

64Mbits of RAM in 2002.

Battery life a primary factor.

Efficiency and bandwidth will still be precious.

Page 4: Proof-Carrying Code

Cheese and the Sum Total of Human Knowledge

Page 5: Proof-Carrying Code

The Code Safety Problem

Page 6: Proof-Carrying Code

Code Safety

CPU

Code

Trusted Host

Is this safe to execute?

Page 7: Proof-Carrying Code

Approach 1Trust the Code Producer

CPU

Code

Trusted Host

sig

Trusted 3rd Party

PK1

PK1

PK2

PK2

Trust is based on personal authority, not program properties

Scaling problems?

Page 8: Proof-Carrying Code

Approach 2Baby-sit the Program

CPU

Code

Trusted Host

Execution monitor

Expensive

Limited in expressive power(Why?)

E.g., Software Fault Isolation [Wahbe & Lucco], Inline Reference Monitors [Schneider]

Page 9: Proof-Carrying Code

Approach 3Java

CPU

Code

Trusted Host

Interp/ JIT

Expensive and/or big

Limited in expressive power

Verifier

Page 10: Proof-Carrying Code

TheoremProver

Approach 4Formal Verification

CPU

Code

Flexible andpowerful.

Trusted Host

But really reallyreally hard andmust be correct.

Page 11: Proof-Carrying Code

A Key Idea: Explicit Proofs

CertifyingProver

CPU

ProofChecker

Code

Proof

Trusted Host

Page 12: Proof-Carrying Code

A Key Idea: Explicit Proofs

CertifyingProver

CPU

Code

Proof

No longer need totrust this component.

ProofChecker

Page 13: Proof-Carrying Code

Proof-Carrying Code

CertifyingProver

CPU

Code

Proof

Simple,small (<52KB),and fast.

No longer need totrust this component.

ProofChecker

Reasonable in size (0-10%).

Page 14: Proof-Carrying Code

But...

...How to generate the proofs?

Proving theorems about real programs is hard.

Most useful safety properties of low-level programs are undecidable.

Theorem-proving systems are unfamiliar to programmers and hard to use even for experts.

Page 15: Proof-Carrying Code

The Role ofProgramming Languages

Civilized programming languages can provide “safety for free”.

Well-formed/well-typed safe.

Idea: Arrange for the compiler to “explain” why the target code it generates preserves the safety properties of the source program.

Page 16: Proof-Carrying Code

Certifying Compilers[Necula & Lee, PLDI’98]

Intuition:

Compiler “knows” why each translation step is semantics-preserving.

So, have it generate a proof that safety is preserved.

“Small theorems about big programs.”

Don’t try to verify the whole compiler, but only each output it generates.

Page 17: Proof-Carrying Code

Automation viaCertifying Compilation

CertifyingCompiler

CPULooks and smells like a compiler.

% spjc foo.java bar.class baz.c -ljdk1.2.2

Sourcecode

Proof

Objectcode

CertifyingProver

ProofChecker

Page 18: Proof-Carrying Code

Overview of the Necula/Lee Approach to PCC

Page 19: Proof-Carrying Code

High-Level Architecture

Explanation

CodeVerificationconditiongenerator

Checker

Safetypolicy

Agent

Host

Page 20: Proof-Carrying Code

Reference Interpreters

A reference interpreter (RI) is a standard interpreter extended with instrumentation to check the safety of each instruction before it is executed, and abort execution if anything unsafe is about to happen.

In other words, an RI is capable only of safe execution.

Page 21: Proof-Carrying Code

Reference Interpreterscont’d

The reference interpreter is never actually implemented.

The point will be to prove (by using the proof rules given in the safety policy) that execution of the code on the RI never aborts, and thus execution on the real hardware will be identical to execution on the RI.

Page 22: Proof-Carrying Code

Sample Reference Interpreter

Page 23: Proof-Carrying Code

High-Level Architecture

Explanation

CodeVerificationconditiongenerator

Checker

Safetypolicy

Agent

Host

Page 24: Proof-Carrying Code

The Safety Policy

The RI can be viewed as defining a safety policy

RI language is a restriction of x86 assembly language

Must prove that a given program always makes progress on the RI

We introduce verification conditions (VCs), whose truth implies that the corresponding instruction has a defined execution on the RI.

Page 25: Proof-Carrying Code

Verification Conditions

The point of the verification conditions, then, is to provide such progress theorems for each instruction in the program.

In other words, a VC’s validity says that the corresponding instruction has a defined execution in the s86 operational semantics.

Page 26: Proof-Carrying Code

The VCGen

The verification condition generator (VCGen) examines each instruction.

It essentially encodes the operational semantics of the language.

It checks some simple properties.

E.g., direct jumps go to legal addrs.

It invokes the Checker when dangerous instructions are encountered.

Page 27: Proof-Carrying Code

The VCGen, cont’d

Examples of dangerous instructions:

memory operations

procedure calls

procedure returns

For each such instruction, VCGen creates a verification condition (VC).

A VC is a logical predicate whose truth implies the instruction is safe.

Page 28: Proof-Carrying Code

Examples of Safety Properties

Memory safety. Which addresses are readable /

writable; when, and what values.

Type safety. What values can be stored and used

in operations.

System call safety. Which system routines can be called

and when.

Page 29: Proof-Carrying Code

Examples of Safety Policiescont’d

Action sequence safety.

E.g., no network send after reading a file.

Resource usage safety.

E.g., instruction counts, stack limits, etc.

Page 30: Proof-Carrying Code

What Can’t Be Enforced?

Informally:

Safety properties. Yes. “No bad thing will happen.”

Liveness properties. Not yet. “A good thing will eventually happen.”

Information-flow properties. ? “Confidentiality will be preserved.”

Page 31: Proof-Carrying Code

Example of type safety giving us VC validity?

Page 32: Proof-Carrying Code

Example: Source Code

public class Bcopy { public static void bcopy(int[] src,

int[] dst) { int l = src.length; int i = 0;

for(i=0; i<l; i++) { dst[i] = src[i]; } }}

Page 33: Proof-Carrying Code

Example: Target Code

ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3).text.align 4.globl _bcopy__6arrays5BcopyAIAI_bcopy__6arrays5BcopyAIAI:

cmpl $0, 4(%esp)je L6movl 4(%esp), %ebxmovl 4(%ebx), %ecxtestl %ecx, %ecxjg L22ret

L22:xorl %edx, %edxcmpl $0, 8(%esp)je L6movl 8(%esp), %eaxmovl 4(%eax), %esi

L7:ANN_LOOP(INV = {

(csubneq ebx 0),(csubneq eax 0),(csubb edx ecx),(of rm mem)},

MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))cmpl %esi, %edxjae L13movl 8(%ebx, %edx, 4), %edimovl %edi, 8(%eax, %edx, 4)incl %edxcmpl %ecx, %edxjl L7ret

L13:call __Jv_ThrowBadArrayIndex

ANN_UNREACHABLEnop

L6:call __Jv_ThrowNullPointer

ANN_UNREACHABLEnop

Page 34: Proof-Carrying Code

Cut Points

Each loop entry must be annotated as a cut point.

VCGen requires this so that checking can be performed in a single scan of the code.

As a convenience, the modified registers are also declared in the cut annotations.

Page 35: Proof-Carrying Code

Example: Target Code

ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3).text.align 4.globl _bcopy__6arrays5BcopyAIAI_bcopy__6arrays5BcopyAIAI:

cmpl $0, 4(%esp)je L6movl 4(%esp), %ebxmovl 4(%ebx), %ecxtestl %ecx, %ecxjg L22ret

L22:xorl %edx, %edxcmpl $0, 8(%esp)je L6movl 8(%esp), %eaxmovl 4(%eax), %esi

L7:ANN_LOOP(INV = {

(csubneq ebx 0),(csubneq eax 0),(csubb edx ecx),(of rm mem)},

MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))cmpl %esi, %edxjae L13movl 8(%ebx, %edx, 4), %edimovl %edi, 8(%eax, %edx, 4)incl %edxcmpl %ecx, %edxjl L7ret

L13:call __Jv_ThrowBadArrayIndex

ANN_UNREACHABLEnop

L6:call __Jv_ThrowNullPointer

ANN_UNREACHABLEnop

VCGen requires annotations in order to simplify the process.

Page 36: Proof-Carrying Code

Example: Source Code

public class Bcopy { public static void bcopy(int[] src,

int[] dst) { int l = src.length; int i = 0;

for(i=0; i<l; i++) { dst[i] = src[i]; } }}

Page 37: Proof-Carrying Code

The VCGen Process (1)_bcopy__6arrays5BcopyAIAI:

cmpl $0, src je L6 movl src, %ebx movl 4(%ebx), %ecx testl %ecx, %ecx jg L22 retL22:

xorl %edx, %edx cmpl $0, dst je L6 movl dst, %eax movl 4(%eax), %esiL7: ANN_LOOP(INV = …

A0 = (type src_1 (jarray jint))A1 = (type dst_1 (jarray jint))A2 = (type rm_1 mem)A3 = (csubneq src_1 0)ebx := src_1ecx := (sel4 rm_1 (add src_1 4))

A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)

edx := 0

A5 = (csubneq dst_1 0)eax := dst_1esi := (sel4 rm_1 (add dst_1 4))

Page 38: Proof-Carrying Code

The VCGen Process (2)

L7: ANN_LOOP(INV = { (csubneq ebx 0), (csubneq eax 0), (csubb edx ecx), (of rm mem)}, MODREG = (EDI, EDX, EFLAGS,FFLAGS,RM)) cmpl %esi, %edx jae L13

movl 8(%ebx,%edx,4), %edi

movl %edi, 8(%eax,%edx,4) …

A3A5A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))

edi := edi_1edx := edx_1rm := rm_2

A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4))!!Verify!! (saferd4 (add src_1 (add (imul edx_1 4) 8)))

Page 39: Proof-Carrying Code

The Checker (1)

The checker is asked to verify that(saferd4 (add src_1 (add (imul edx_1 4) 8)))

under assumptionsA0 = (type src_1 (jarray jint))A1 = (type dst_1 (jarray jint))A2 = (type rm_1 mem)A3 = (csubneq src_1 0)A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)A5 = (csubneq dst_1 0)A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4))

The checker looks in the PCC for a proof of this VC.

Page 40: Proof-Carrying Code

The Checker (2)

In addition to the assumptions, the proof may use axioms and proof rules defined by the host, such as

szint : pf (size jint 4)

rdArray4: {M:exp} {A:exp} {T:exp} {OFF:exp} pf (type A (jarray T)) -> pf (type M mem) -> pf (nonnull A) -> pf (size T 4) -> pf (arridx OFF 4 (sel4 M (add A 4))) -> pf (saferd4 (add A OFF)).

Page 41: Proof-Carrying Code

Checker (3)

A proof for

(saferd4 (add src_1 (add (imul edx_1 4) 8)))

in the Java specification looks like this (excerpt):

(rdArray4 A0 A2 (sub0chk A3) szint (aidxi 4 (below1 A7)))

This proof can be easily validated via LF type checking.

Page 42: Proof-Carrying Code

VC Explosion

a == b

a == c

f(a,c)

a := x c := x

a := y c := y

a=b => (x=c => safef(y,c) x<>c => safef(x,y))

a<>b => (a=x => safef(y,x) a<>x => safef(a,y))

Exponential growth in size ofthe VC is possible.

Page 43: Proof-Carrying Code

VC Explosion

a == b

a == c

f(a,c)

a := x c := x

a := y c := y

INV: P(a,b,c,x)

(a=b => P(x,b,c,x)

a<>b => P(a,b,x,x))

(a’,c’. P(a’,b,c’,x) =>

a’=c’ => safef(y,c’) a’<>c’ => safef(a’,y))

Growth can usually becontrolled by careful placementof just the right “join-point” invariants.

Page 44: Proof-Carrying Code

Stack Slots

Each procedure will want to use the stack for local storage.

This raises a serious problem because a lot of information is lost by VCGen (such as the value) when data is stored into memory.

We avoid this problem by assuming that procedures use up to 256 words of stack as registers.

Page 45: Proof-Carrying Code

Other Approaches to PCC

Page 46: Proof-Carrying Code

Typed Assembly Language[Morrisett, et al., ‘98]

Use modern type theory to develop a static type system for machine code.

Prove decidability of typechecking.

Prove soundness of type system.

Developing such a type system is very hard, but done only once.

Page 47: Proof-Carrying Code

TAL

fact: ALL rho.{r1:int, sp:{r1:int, sp:rho}::rho} jgz r1, positive mov r1,1 retpositive: push r1 ; sp : int::{t1:int,sp:rho}::rho sub r1,r1,1 call fact[int::{r1:int,sp:rho}::rho] imul r1,r1,r2 pop r2 ; sp : {r1:int,sp:rho}:: ret

Page 48: Proof-Carrying Code

Eliminating VCGen

We can eliminate VCGen by using the logic to encode a global invariant on states, Inv(S).

Then, the proof must show:

Inv(S0)

S:State. Inv(S) ! Inv(Step(S))

S:State. Inv(S) ! SP(S)

Page 49: Proof-Carrying Code

Foundational PCC

Appel and Felty [’00] develop a semantic model of types, starting from the foundations of mathematical logic.

This model is used to construct the global invariant.

Hamid, Shao, et al. define the global invariant to be a syntactic well-formedness condition on machine states.

Page 50: Proof-Carrying Code

Temporal-logic PCC

Bernard and Lee [’02] define the global invariant via a temporal-logic specification.

A trusted generic program then interprets these specifications to extract verification conditions.