infrastructure for analysis of object code filedisassembly & associated data structures...

32
Infrastructure for Analysis of Object Code Gogul Balakrishnan University of Wisconsin

Upload: others

Post on 03-Sep-2019

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Infrastructure forAnalysis of Object Code

Gogul BalakrishnanUniversity of Wisconsin

Page 2: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Overview

• About the existing infrastructure– IDAPro– CodeSurfer

• Extensions to CodeSurfer– Basic blocks– Templates for data flow analysis

• Points-to analysis on assembly code

Page 3: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Infrastructure

• Use existing analysis software• Augment them with features for

analyzing object code• IDA Pro (DataRescue)• Codesurfer (GrammaTech)

Page 4: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Disassembly & associated Data structures

Codesurfer and IDAPro

IDAPro

IDA2CS Plugin

CSurf SDGs

CSurf API

Binary file

IDAPro SDK

Page 5: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Codesurfer and IDAPro

IDA2CS Plugin

CSurf SDGs

CSurf API

IDAPro SDK

Disassembly & associated Data structuresIDAProBinary

file

Page 6: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Disassembly & associated Data structures

Codesurfer and IDAPro

IDAPro

IDA2CS Plugin

CSurf SDGs

CSurf API

Binary file

Extend this API

IDAPro SDK

Page 7: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Basic Blockssub_10018A0 proc near ; CODE XREF: sub_1001AE3+445�p

; _WinMain@16+CF�pmov eax, hMempush esimov esi, ds:GlobalFreetest eax, eaxjz short loc_10018B3push eax ; hMemcall esi ; GlobalFree

loc_10018B3: ; CODE XREF: sub_10018A0+E�jmov eax, dword_1008BECtest eax, eaxjz short loc_10018BFpush eax ; hMemcall esi ; GlobalFree

loc_10018BF: ; CODE XREF: sub_10018A0+1A�jand hMem, 0and dword_1008BEC, 0pop esiretn Construct Basic Blocks

Page 8: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Data flow analysis templates -Live Variable Analysis

/* Demo CodeSurfer */#include <stdio.h>#include "hello.h"

static int debug = 0;

void main(void){long f,i,n,j,k; j=1; i=1; f=1; n=10; k=j*2; a: f=f*i; i=i+1; if(i<=n) goto a;

}Demo

Page 9: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Overview

• About the existing infrastructure– IDAPro– CodeSurfer

• Extensions to CodeSurfer– Basic blocks– Templates for data flow analysis

• Points-to analysis on assembly code

Page 10: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Disassembly & associated Data structures

Codesurfer and IDAPro

IDAPro

IDA2CS Plugin

CSurf SDGs

CSurf API

Binary file

IDAPro SDK

Page 11: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

• Better understanding of program behavior

Need for points-to analysis

main( ){

int c,b=10,a=20,*pa=&a;

c=*pa+b;

}

Page 12: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

• Better understanding of program behavior

• Pointers – a possible covert channel

Need for points-to analysis

main( ){

int b,a=20,*pa=&a;

b=*pa;

}

‘a’ – High security level variable

‘b’ – Low security level variable

Page 13: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

• Better understanding of program behavior

• Pointers – a possible covert channel• Can detect more buffer overruns

Need for points-to analysis

void main(){

char str[5],*a;

a=str;

strcpy(a,”Hello”);

}

Page 14: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Flow-Sensitive → Flow-Insensitive

start main

exit main

τ3

τ2

τ1

τ4τ5

τ3

τ2

τ1

τ4

τ5

Page 15: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Flow Sensitive vs. Flow Insensitive

• Flow-insensitive analysis– Less precise– More efficient in space and time– But works poorly with assembly code

• Register ebp used to access all variables ⇒ All variables treated as one

• Recover a degree of flow sensitivity– Rename registers according to live ranges– Perform flow-insensitive analysis

Page 16: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Live Ranges

push ebpmov ebp, esplea eax, [ebp-4]mov ebx, [eax]mov ecx, ebxmov edx, eaxmov ebx,edxlea eax, [ebp-8]mov ebx, ecxmov esi,eaxpop ebp

:LiveRange 0 of eax

:LiveRange 1 of eax

:LiveRange 0 of ecx

Page 17: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Flow-Insensitive Points-to Analysis[Andersen 94, Shapiro & Horwitz 97]

p = &q;

p = q;

p = *q;

*p = q;

p q

pr1

r2

q

r1

r2

qs1

s2

s3

p

ps1

s2

qr1

r2

Page 18: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Flow-Insensitive Points-To Analysis

a = &e;b = a;c = &f;

*b = c;d = *a;

a

d

b

cf

e

Page 19: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

What are the entities?

• Two kinds of storage areas– Registers– Memory

• Possible points-to relations– Registers Memory– Memory Memory– Registers Registers not possible– Memory Registers not possible

Page 20: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

What are the abstract variables?• Memory

– Register + Displacement• E.g., mov ebx,[ebp-4]

– Displacement• E.g., mov ebx,[12]

∴ Abstract variables have 2 components•[Optional] Register name•Displacement

Page 21: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

What are the abstract variables?

Really two kinds of abstract entities– Addresses: A_reg_lr_Displ– Memory Locations: M_reg_lr_Displ– e.g.,

• A_ebp_0_4 for ebp_0 – 4• M_ebp_0_4 for [ebp_0 – 4]

• Implicit points-to facts– A_reg_lr_D M_reg_lr_D

Page 22: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Memory modelRest of stack

Parameters

Var 1 ebpVar 2Var 3

esp

fn(..) {

int i,*j,k;

j=&i;

}

(i)(j)(k)

0ffffh

0h

Page 23: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly code

void main() {

int i,*j;

j=&i;

return ;

}

push ebp mov ebp, espsub esp,8lea eax, [ebp-4]mov [ebp-8], eaxmov esp, ebppop ebpretn

Page 24: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly codepush ebpmov ebp, espsub esp,8lea eax_1, [ebp_0-4]mov [ebp_0-8], eax_1mov esp, ebppop ebpretn

Equivalent statements:

• A_eax_1_0 = &M_ebp_0_4

• M_ebp_0_8 = A_eax_1_0

A_eax_1_0M_ebp_0_4

M_ebp_0_8

Page 25: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly codepush ebpmov ebp, espsub esp,8lea eax_1, [ebp_0-4]mov [ebp_0-8], eax_1mov esp, ebppop ebpretn

A_eax_1_0M_ebp_0_4

M_ebp_0_8(Var j)

(Var i)

Rest of stack

Parameters

Var 1ebp_0

Var 2(i)(j)

0ffffh

0h

Page 26: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly codevoid main(){

int i,*j,*k;_asm {

push ebplea eax,[ebp-4]mov ebp,eax

}j=&i;_asm {

pop ebp}return ;

}

push ebpmov ebp, espsub esp, -12push ebplea eax, [ebp-4]mov ebp, eaxlea ecx, [ebp-4]mov [ebp-8], ecxpop ebpmov esp,ebppop ebpretn

Page 27: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly codepush ebpmov ebp, espsub esp, -12push ebplea eax, [ebp-4]mov ebp, eaxlea ecx, [ebp-4]mov [ebp-8], ecxpop ebpmov esp,ebppop ebpretn

Rest of stack

Parameters

Var 1ebpVar 2Var 3esp

(i)(j)(k)

0ffffh

0h

Page 28: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly codevoid main(){

int i,*j,*k;_asm {

push ebplea eax,[ebp-4]mov ebp,eax

}j=&i;_asm {

pop ebp}return ;

}

push ebpmov ebp, espsub esp, -12push ebplea eax, [ebp-4]mov ebp, eaxlea ecx, [ebp-4]mov [ebp-8], ecxpop ebpmov esp,ebppop ebpretn

(k=&j;)

Page 29: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly codepush ebpmov ebp, espsub esp, -12push ebplea eax, [ebp-4]mov ebp, eaxlea ecx, [ebp-4]mov [ebp-8], ecxpop ebpmov esp,ebppop ebpretn

Page 30: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly codepush ebpmov ebp, espsub esp, -12push ebplea eax_1, [ebp_0-4]mov ebp_1, eax_1lea ecx_0, [ebp_1-4]mov [ebp_1-8], ecx_0pop ebpmov esp,ebppop ebpretn

Equivalent statements:

• A_eax_1_0 = &M_ebp_0_4

• ∀ i, A_ebp_1_i = A_eax_1_i

• A_ecx_0_0=&M_ebp_1_4

• M_ebp_1_8=A_ecx_0_0

A_eax_1_0 M_ebp_0_4

A_ecx_0_0M_ebp_1_4

M_ebp_1_8

Page 31: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Points-to analysis on assembly code

A_eax_1_0 M_ebp_0_4

A_ecx_0_0M_ebp_1_4

M_ebp_1_8

Rest of stack

Parameters

Var 1ebp_0

Var 2Var 3 esp

(i)(j)(k)

ebp_1

• A_eax_1_0 = &M_ebp_0_4

• ⇒ A_eax_1_0 = A_ebp_0_4

• ∀ i, A_ebp_1_i = A_eax_1_i

0ffffh

0h (M_ebp_0_12)

(M_ebp_0_8)

∴ ebp_1 = ebp_0 - 4

• M_ebp_1_4 M_ebp_0_8

• M_ebp_1_8 M_ebp_0_12

Page 32: Infrastructure for Analysis of Object Code fileDisassembly & associated Data structures Codesurfer and IDAPro IDAPro IDA2CS Plugin CSurf SDGs CSurf API Binary file IDAPro SDK

Future Work• Complete implementation of flow-insensitive

points-to analysis • Use flow-insensitive analysis to reduce work

for flow sensitive analysis• Transforming machine instructions to C like

expressionslea eax,[ebp-4]

mov ebx,[ebp-8]

mov ecx,[eax]

add ebx,ecx

mov [ebp-8],ebx

M_ebp_0_8= *A_ebp_0_4 + M_ebp_0_8