Overview of Back-end for CComp Zhaopeng Li Software Security Lab. June 8, 2009


Page 1: Overview of Back-end for CComp

Overview of Back-end for CComp

Zhaopeng Li
Software Security Lab.

June 8, 2009

Page 2: Overview of Back-end for CComp

Outline

• Design Points
• Assembly Language : “x86”
• Low-level Intermediate Language
• Future Work

Page 3: Overview of Back-end for CComp

Design Points

• Assembly Language
– Target: SCAP with an x86 abstract machine;
– The program logic may change in the next version;
– Or another machine may be used.
• Low-level Intermediate Language
– Hides some machine-specific details;
– Note that this level can be just a helper to generate code and proof.

Page 4: Overview of Back-end for CComp

Assembly Language : “x86”

Page 5: Overview of Back-end for CComp

Some Topics about “x86”

• Data Representation
– 32-bit vs. “fake” 32-bit
• We do not care how the data is stored as bits.
• Integer: 4 bytes
• Pointer: 4 bytes
• Data Alignment
• Callee-saved Registers
– EBX, ESI, EDI, EBP

Page 6: Overview of Back-end for CComp

Some Topics about “x86” (cont.)

• Calling convention:

1. Parameters are passed on the stack, pushed from right to left; alternatively, the first three are passed in registers EAX, ECX, and EDX, and the rest are passed on the stack;

2. Registers EAX, ECX, and EDX may be used freely by the callee; other registers must be saved on the stack and popped before the function returns;

3. The return value is stored in register EAX;

4. The caller cleans up the parameters on the stack.
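
The stack-only variant of this convention can be sketched as a small Python simulation. This is an illustrative model, not CComp code: the `Machine` class, the `call` helper, the example callee, and the addresses are all hypothetical, and the register-passing variant is not covered.

```python
# Hypothetical simulation of the stack-based calling convention (not CComp code).
WORD = 4  # integers and pointers are 4 bytes


class Machine:
    def __init__(self, stack_top=0x1000):
        self.mem = {}                              # address -> word
        self.reg = {"eax": 0, "esp": stack_top}

    def push(self, value):
        self.reg["esp"] -= WORD
        self.mem[self.reg["esp"]] = value

    def pop(self):
        value = self.mem[self.reg["esp"]]
        self.reg["esp"] += WORD
        return value


def callee_sub(m):
    """Example callee computing a - b; arguments sit just above the return address."""
    esp = m.reg["esp"]
    a = m.mem[esp + 1 * WORD]   # first parameter (pushed last)
    b = m.mem[esp + 2 * WORD]   # second parameter
    m.reg["eax"] = a - b        # rule 3: the result goes in EAX


def call(m, callee, args, return_address=0xDEAD):
    for arg in reversed(args):  # rule 1: push parameters right to left
        m.push(arg)
    m.push(return_address)      # effect of the call instruction
    callee(m)
    m.pop()                     # effect of ret: discard the return address
    m.reg["esp"] += WORD * len(args)  # rule 4: the caller cleans up
    return m.reg["eax"]


m = Machine()
entry_esp = m.reg["esp"]
result = call(m, callee_sub, [10, 3])
print(result, m.reg["esp"] == entry_esp)
```

Because the parameters are pushed right to left, the first parameter always sits at a fixed offset from the return address, whatever the number of arguments.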

Page 7: Overview of Back-end for CComp

Some Topics about “x86” (cont.)

Prolog (typical)

_function:
    push ebp        ; store the old base pointer
    mov esp, ebp    ; make the base pointer point to the current stack location
    sub x, esp      ; x is the size, in bytes

Epilog (typical)

    mov ebp, esp    ; reset the stack to "clean" away the local variables
    pop ebp         ; restore the original base pointer
    ret             ; return from the function

[Figure: stack layout at three moments: at function entry, after stack frame setup, and after the return. Each panel shows where ebp and esp point relative to the parameters, the old eip, the old ebp, and the local variables. The transitions between panels are labeled "enter x, 0" (frame setup) and "leave; ret" (return).]
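
The effect of the typical prolog and epilog on esp and ebp can be traced with a short Python sketch. The function name `run_frame`, the addresses, and the frame size are hypothetical values chosen for illustration; memory is modeled as a dictionary from address to word.

```python
WORD = 4


def run_frame(entry_esp, entry_ebp, return_address, local_bytes):
    """Trace esp/ebp through the prolog and epilog (operand order: source, destination)."""
    mem = {entry_esp: return_address}   # the call left the old eip on top of the stack
    esp, ebp = entry_esp, entry_ebp
    trace = [("entry", esp, ebp)]

    # Prolog
    esp -= WORD
    mem[esp] = ebp                      # push ebp
    ebp = esp                           # mov esp, ebp
    esp -= local_bytes                  # sub x, esp
    trace.append(("frame set up", esp, ebp))

    # Epilog
    esp = ebp                           # mov ebp, esp
    ebp = mem[esp]
    esp += WORD                         # pop ebp
    eip = mem[esp]
    esp += WORD                         # ret
    trace.append(("after return", esp, ebp))
    return trace, eip


trace, eip = run_frame(entry_esp=0x1000, entry_ebp=0x2000,
                       return_address=0x400, local_bytes=12)
for label, esp, ebp in trace:
    print(f"{label:>13}: esp={esp:#x} ebp={ebp:#x}")
```

In this trace ebp ends with its entry value and esp ends 4 bytes above its entry value, because ret pops the return address.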

Page 8: Overview of Back-end for CComp

Assembly Abstract Machine “m86”

• Code Heap (C)
– Code storage
– Unchanged during execution
• Machine State
– Memory (M)
– Register File (R)
– Instruction Pointer (eip)
• Current instruction c = C(eip)
• Or just use an instruction sequence (I)

Page 9: Overview of Back-end for CComp

Assembly Language : “x86”

• “AT&T-syntax”
• Reg. r ::= eax | ebx | ecx | edx | esi | edi | esp | ebp
• FReg. fr ::= sf | zf
• Int. b ::= n (integer)
• Instr. i ::= add r1, r2 | addi n, r | sub r1, r2 | subi n, r | mul r1, r2 | muli n, r
  | mov r1, r2 | movi n, r | movs r1, n(r2) | movl n(r1), r2
  | push r | pop r | cmp r1, r2 | cmpi n, r
  | je r, b | jne r, b | jg r, b | jge r, b | jmp b
  | call b | ret | enter n, 0 | leave | malloc r | free r
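
A minimal Python sketch of the abstract machine, restricted to a few of the instructions above, may help make the roles of C, M, R, and eip concrete. The slides do not give the operational semantics, so the instruction meanings below are assumptions based on the usual AT&T (source, destination) reading; the names `State` and `step` are hypothetical, and eip is modeled as an index into the code heap.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    mem: dict = field(default_factory=dict)   # M: address -> word
    reg: dict = field(default_factory=dict)   # R: register name -> word
    eip: int = 0                              # index of the current instruction


def step(code, s):
    """Execute the instruction c = C(eip); the code heap is only read."""
    op, *args = code[s.eip]
    s.eip += 1
    if op == "movi":                          # movi n, r : r := n
        n, r = args
        s.reg[r] = n
    elif op == "add":                         # add r1, r2 : r2 := r2 + r1
        r1, r2 = args
        s.reg[r2] += s.reg[r1]
    elif op == "push":                        # push r
        (r,) = args
        s.reg["esp"] -= 4
        s.mem[s.reg["esp"]] = s.reg[r]
    elif op == "pop":                         # pop r
        (r,) = args
        s.reg[r] = s.mem[s.reg["esp"]]
        s.reg["esp"] += 4
    elif op == "jmp":                         # jmp b
        (b,) = args
        s.eip = b
    else:
        raise ValueError(f"unsupported instruction: {op}")
    return s


# The code heap is a tuple, so it cannot change during execution.
code = (
    ("movi", 5, "eax"),
    ("jmp", 3),
    ("movi", 99, "eax"),     # skipped by the jump
    ("movi", 7, "ebx"),
    ("push", "eax"),
    ("add", "ebx", "eax"),
    ("pop", "ecx"),
)

s = State(reg={"esp": 0x1000})
while s.eip < len(code):
    step(code, s)
print(s.reg)
```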

Page 10: Overview of Back-end for CComp

Program Logic

• Based on SCAP

• Specification (p, g)
– p : State -> Prop
– g : State -> State -> Prop
• Inference Rules
– Well-formed program
• Well-formed basic block
• Well-formed instruction
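
The shape of a specification (p, g) can be illustrated in Python with boolean functions standing in for Prop. This is only a sketch: the state is reduced to a register file, the precondition `p` is a made-up example, and `g` encodes the relation R’(ebp) = R(ebp) /\ R’(esp) = R(esp)+4 used for f in the later slides.

```python
from typing import Callable, Dict

State = Dict[str, int]                        # simplified: register file only (assumption)
Precondition = Callable[[State], bool]        # stands in for p : State -> Prop
Guarantee = Callable[[State, State], bool]    # stands in for g : State -> State -> Prop


def p(s: State) -> bool:
    # Hypothetical precondition: the stack pointer is word-aligned.
    return s["esp"] % 4 == 0


def g(s: State, s_end: State) -> bool:
    # Relates the current state to the state after the function returns.
    return s_end["ebp"] == s["ebp"] and s_end["esp"] == s["esp"] + 4


entry = {"esp": 0x1000, "ebp": 0x2000}
good_exit = {"esp": 0x1004, "ebp": 0x2000}
bad_exit = {"esp": 0x1004, "ebp": 0x0}

print(p(entry), g(entry, good_exit), g(entry, bad_exit))
```

Here p constrains a single state, while g relates two states, which is why it takes two arguments.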

Page 11: Overview of Back-end for CComp

Main Objectives

• Code Generation
– Minimize the proof size
• E.g., temporary results should be kept in registers, not on the stack
• Assertion
– Building (p, g) for each basic block
– Generating (p, g) for each program point
• Proof
– Generating proofs for functions/basic blocks
– (reusing the proof of the VC at the source level)

Page 12: Overview of Back-end for CComp

Assertion Relationship

Intermediate Language:

f : {p} //{q}
    Basic block1
L1 : {p1}
    Basic block2

x86 Assembly Language:

f : {(p’, g)}
    Basic block1
L1 : {(p’1, g1)}
    Basic block2

p’ = trans(p) /\ param_p /\ stack-reg_p
g = trans(q) /\ callee-saved-reg_g /\ stack_g

p’1 = trans(p1) /\ param_p1 /\ stack-reg_p1
g1 = ?

Page 13: Overview of Back-end for CComp

Figure Out G

f : {R’(ebp) = R(ebp) /\ R’(esp) = R(esp)+4}
    push ebp
    mov esp, ebp
    sub $12, esp
L1 : {g1}
    Basic block2
    leave
    ret

States: R at the entry of f, R0 after push ebp, R’ after the return.

Step relation for push ebp:
R0(ebp) = R(ebp) /\ R0(esp) = R(esp)-4

Combined with the g of f:
R’(ebp) = R(ebp) /\ R0(ebp) = R(ebp) /\ R’(esp) = R(esp)+4 /\ R0(esp) = R(esp)-4

Result g0:
R’(ebp) = R0(ebp) /\ R’(esp) = R0(esp)+8

The method:
1. Get the state relation from the rule of the operational semantics;
2. Use the g of the previous program point;
3. Do substitution and arithmetic.
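
The three steps of the method, applied to push ebp, can be written out as a short derivation. The LaTeX below only restates the relations on this slide; subscripts replace the slide's R0 notation.

```latex
\begin{align*}
\text{step (push ebp):}\quad
  & R_0(\mathrm{ebp}) = R(\mathrm{ebp}), \quad R_0(\mathrm{esp}) = R(\mathrm{esp}) - 4 \\
\text{previous } g \text{ (at } f\text{):}\quad
  & R'(\mathrm{ebp}) = R(\mathrm{ebp}), \quad R'(\mathrm{esp}) = R(\mathrm{esp}) + 4 \\
\text{substitute } R(\mathrm{ebp}) = R_0(\mathrm{ebp}):\quad
  & R'(\mathrm{ebp}) = R_0(\mathrm{ebp}) \\
\text{substitute } R(\mathrm{esp}) = R_0(\mathrm{esp}) + 4:\quad
  & R'(\mathrm{esp}) = \bigl(R_0(\mathrm{esp}) + 4\bigr) + 4 = R_0(\mathrm{esp}) + 8
\end{align*}
```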

Page 14: Overview of Back-end for CComp

Figure Out G (cont.)

f : {R’(ebp) = R(ebp) /\ R’(esp) = R(esp)+4}
    push ebp
    mov esp, ebp
    sub $12, esp
L1 : {g1}
    Basic block2
    leave
    ret

States: R at the entry of f, R0 after push ebp, R1 after mov esp, ebp, R’ after the return.

g0 (from the previous step):
R’(ebp) = R0(ebp) /\ R’(esp) = R0(esp)+8

Step relation for mov esp, ebp:
R1(ebp) = R0(esp) /\ R1(esp) = R0(esp)

Combined with g0:
R’(ebp) = R0(ebp) /\ R1(ebp) = R0(esp) /\ R’(esp) = R0(esp)+8 /\ R1(esp) = R0(esp)

Result g1:
R’(ebp) = M1(R1(ebp)) /\ R’(esp) = R1(esp)+8

The method:
1. Get the state relation from the rule of the operational semantics;
2. Use the g of the previous program point;
3. Do substitution and arithmetic.
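
The ebp conjunct of g1 mentions memory, which relies on a fact the slide leaves implicit: push ebp stores the old ebp at the new top of the stack. A sketch of that reasoning follows, where the name M_0 for the memory after push ebp is introduced here for illustration and does not appear on the slides.

```latex
\begin{align*}
\text{push ebp stores the old value:}\quad
  & M_0(R_0(\mathrm{esp})) = R(\mathrm{ebp}) = R_0(\mathrm{ebp}) \\
\text{mov esp, ebp leaves memory unchanged:}\quad
  & M_1 = M_0, \quad R_1(\mathrm{ebp}) = R_0(\mathrm{esp}), \quad R_1(\mathrm{esp}) = R_0(\mathrm{esp}) \\
\text{hence:}\quad
  & M_1(R_1(\mathrm{ebp})) = M_0(R_0(\mathrm{esp})) = R_0(\mathrm{ebp}) = R'(\mathrm{ebp}) \\
\text{and:}\quad
  & R'(\mathrm{esp}) = R_0(\mathrm{esp}) + 8 = R_1(\mathrm{esp}) + 8
\end{align*}
```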

Page 15: Overview of Back-end for CComp

Figure Out G (cont.)

f : {R’(ebp) = R(ebp) /\ R’(esp) = R(esp)+4}
    push ebp
    mov esp, ebp
    sub $12, esp
L1 : {g1}
    Basic block2
    leave
    ret

States: R at the entry of f, R0 after push ebp, R1 after mov esp, ebp, R2 after sub $12, esp, R’ after the return.

g0:
R’(ebp) = R0(ebp) /\ R’(esp) = R0(esp)+8

g1:
R’(ebp) = M1(R1(ebp)) /\ R’(esp) = R1(esp)+8

Step relation for sub $12, esp:
R2(ebp) = R1(ebp) /\ R2(esp) = R1(esp)-12

Combined with g1:
R’(ebp) = M1(R1(ebp)) /\ R2(ebp) = R1(ebp) /\ R’(esp) = R1(esp)+8 /\ R2(esp) = R1(esp)-12

Result g2:
R’(ebp) = M2(R2(ebp)) /\ R’(esp) = R2(esp)+20

The method:
1. Get the state relation from the rule of the operational semantics;
2. Use the g of the previous program point;
3. Do substitution and arithmetic.
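
For the esp conjunct, the "substitution and arithmetic" step is mechanical enough to sketch in Python. The function `esp_offsets` and the table of per-instruction esp changes are hypothetical helpers, not part of CComp; at each program point k the sketch computes the constant d such that R’(esp) = Rk(esp) + d, with Rk the state at that point.

```python
# Change in esp caused by each prolog instruction (taken from the step relations).
prolog = [("push ebp", -4), ("mov esp, ebp", 0), ("sub $12, esp", -12)]


def esp_offsets(entry_offset, instructions):
    """Return, per program point, the d with R'(esp) = R_k(esp) + d."""
    offsets = []
    d = entry_offset
    for text, delta in instructions:
        # R_k(esp) = R_prev(esp) + delta, so R_prev(esp) = R_k(esp) - delta.
        # Substituting into R'(esp) = R_prev(esp) + d gives the new offset.
        d = d - delta
        offsets.append((text, d))
    return offsets


offsets = esp_offsets(4, prolog)   # at f: R'(esp) = R(esp) + 4
for text, d in offsets:
    print(f"after {text:<13} R'(esp) = current esp + {d}")
```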

Page 16: Overview of Back-end for CComp

Low-level Intermediate Language

Page 17: Overview of Back-end for CComp

Potential Benefits

• Hide some machine-specific details;
• Some optimizations could be done (optional);
• Make the implementation simple and reusable
– (*Note that this level is just a helper to generate code and proof.*)
– Only the code for translating from this level needs to be added when targeting a different assembly logic

Page 18: Overview of Back-end for CComp

The Language

• Loc. l ::= r | s
• Int. o, b ::= n (integer)
• Slot. s ::= local(o) | incoming(o) | outgoing(o)
• Reg. r ::= r1 | r2 | r3 | … // infinite pseudo-registers
• Instr. i ::= bop(bop, l1, l2, l) | uop(uop, l1, l) | load(r, o, l) | store(l, r, o)
  | getstack(s, r) | setstack(r, s) | call(id, l) | return r
  | malloc(r) | free(r) | goto b | label(b) | cond(l1, cmp, l2, btrue)
• BinOp. bop ::= add | sub | mul | …
• UnOp. uop ::= minus | …
• Comp. cmp ::= gt | ge | eq | ne | lt | le
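
One possible encoding of part of this syntax as Python data types is sketched below. The class names and the helper `pseudo_registers` are hypothetical, only three instruction forms are covered, and no semantics is assumed beyond the shape of the grammar.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Reg:            # pseudo-register r1, r2, r3, ...
    index: int


@dataclass(frozen=True)
class Slot:           # local(o) | incoming(o) | outgoing(o)
    kind: str
    offset: int


Loc = Union[Reg, Slot]


@dataclass(frozen=True)
class Bop:            # bop(bop, l1, l2, l)
    op: str
    l1: Loc
    l2: Loc
    l: Loc


@dataclass(frozen=True)
class GetStack:       # getstack(s, r)
    slot: Slot
    reg: Reg


@dataclass(frozen=True)
class Return:         # return r
    reg: Reg


def pseudo_registers(program):
    """Collect the pseudo-registers a program mentions, in first-use order."""
    seen = []
    for instr in program:
        for value in vars(instr).values():
            if isinstance(value, Reg) and value not in seen:
                seen.append(value)
    return seen


program = [
    GetStack(Slot("incoming", 0), Reg(1)),
    GetStack(Slot("incoming", 4), Reg(2)),
    Bop("add", Reg(1), Reg(2), Reg(3)),
    Return(Reg(3)),
]
print(pseudo_registers(program))
```

A pass like this is the kind of bookkeeping a later stage would need before mapping the unbounded pseudo-registers onto the eight x86 registers.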

Page 19: Overview of Back-end for CComp

Code Generation (optional)

• Do some optimizations which do not affect the proof, such as:
– Branch tunneling
– Dead code elimination
• Future optimizations
– Other low-level optimizations may be done here
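
Branch tunneling can be sketched in Python over a tuple encoding of the goto, label, and cond instructions. This is an illustrative version under stated assumptions, not the CComp implementation: instructions are plain tuples, labels are integers, and a jump is redirected only when its target label is immediately followed by a goto.

```python
def tunnel_branches(program):
    """Redirect each jump whose target label is immediately followed by a goto."""
    # Map each label to the goto target that directly follows it, if any.
    forward = {}
    for instr, nxt in zip(program, program[1:]):
        if instr[0] == "label" and nxt[0] == "goto":
            forward[instr[1]] = nxt[1]

    def resolve(label):
        seen = set()
        while label in forward and label not in seen:   # stop on goto cycles
            seen.add(label)
            label = forward[label]
        return label

    result = []
    for instr in program:
        if instr[0] == "goto":
            result.append(("goto", resolve(instr[1])))
        elif instr[0] == "cond":
            l1, cmp, l2, target = instr[1:]
            result.append(("cond", l1, cmp, l2, resolve(target)))
        else:
            result.append(instr)
    return result


program = [
    ("cond", "r1", "eq", "r2", 1),
    ("goto", 3),
    ("label", 1),
    ("goto", 2),
    ("label", 2),
    ("goto", 3),
    ("label", 3),
    ("return", "r1"),
]
tunneled = tunnel_branches(program)
for instr in tunneled:
    print(instr)
```

After tunneling, the blocks at labels 1 and 2 are no longer reachable, so dead code elimination could then remove them.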