Using Dyninst for Program Binary Analysis and Instrumentation

Paradyn Project. Paradyn / Dyninst Week, Madison, Wisconsin, April 29 - May 1, 2013. Using Dyninst for Program Binary Analysis and Instrumentation. Emily Jacobson.


Page 1: Using Dyninst for Program Binary Analysis and Instrumentation

Paradyn Project

Paradyn / Dyninst Week, Madison, Wisconsin

April 29 - May 1, 2013

Using Dyninst for Program Binary Analysis and Instrumentation

Emily Jacobson

Page 2: Using Dyninst for Program Binary Analysis and Instrumentation

No Source Code - No Problem

With Dyninst we can:
o Find (stripped) code
  o in program binaries
  o in live processes
o Analyze code
  o functions
  o control-flow graphs
  o loop, dominator analyses
o Instrument code
  o statically (rewrite binary)
  o dynamically (instrument live process)

[Figure: Dyninst operates on executables (a.out, prog.exe), libraries (lib.so, lib.dll), and live processes (an executable plus Library 1 through Library N).]

Using Dyninst for Analysis and Instrumentation

Page 3: Using Dyninst for Program Binary Analysis and Instrumentation

Choice of Static vs. Dynamic Instrumentation

Static Rewriting
o Amortize parsing and instrumentation time.
o Potential to generate more efficient modified binaries.
o 1st party response to runtime events

Dynamic Instrumentation
o Execute instrumentation at a particular time (oneTimeCode).
o Insert and remove instrumentation at run time.
o 3rd party response to runtime events

Page 4: Using Dyninst for Program Binary Analysis and Instrumentation

Example Dyninst Program

• Find memory leaks
• Add printfs to malloc, free
• Stackwalk malloc calls that are not freed

Target application: ChaosPro ver 3.1
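
The bookkeeping behind this leak-finding example can be sketched without Dyninst. The block below is a minimal sketch, assuming the instrumentation reports every malloc return value and every free argument to the mutator. LeakTracker and its method names are hypothetical and are not part of the Dyninst API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical bookkeeping for the leak-finding example: instrumentation at
// malloc's exit points and free's entry point would feed these callbacks.
class LeakTracker {
public:
    // Called with malloc's return value and its size argument.
    void onMalloc(std::uintptr_t addr, std::size_t size) {
        if (addr != 0) live_[addr] = size;  // a failed malloc returns NULL
    }

    // Called with free's argument.
    void onFree(std::uintptr_t addr) { live_.erase(addr); }

    // Addresses that were allocated but never freed, in ascending order.
    std::vector<std::uintptr_t> leaks() const {
        std::vector<std::uintptr_t> out;
        for (const auto &entry : live_) out.push_back(entry.first);
        return out;
    }

    // Total size of all blocks that are still live.
    std::size_t leakedBytes() const {
        std::size_t total = 0;
        for (const auto &entry : live_) total += entry.second;
        return total;
    }

private:
    std::map<std::uintptr_t, std::size_t> live_;  // address -> size
};
```

After the mutatee exits, whatever leaks() still returns corresponds to the "malloc calls that are not freed" that the example wants to stackwalk.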

Page 5: Using Dyninst for Program Binary Analysis and Instrumentation

Dyninst Components

Components operating on the binary code:
• Symbol Table Parser (SymtabAPI)
• Code Parser (ParsingAPI)
• Instruction Decoder (Instruction-API)
• Code Generator
• Instrumenter
• Stack Walker (Stackwalker-API)
• Process Controller (ProcControl-API)

Requests served:
• Instrumentation Requests
• Stack Walk Requests
• Analysis Requests

Page 6: Using Dyninst for Program Binary Analysis and Instrumentation

Process Control

• Several supported OS’s
  • Linux
  • Windows

[Diagram: Process Controller]

Page 7: Using Dyninst for Program Binary Analysis and Instrumentation

Process Control

• Several supported OS’s
• Broad functionality
  • Attach/create process
  • Monitor process status changes
  • Callbacks for fork/exec/exit
  • Mutatee operations: malloc, load library, inferior RPC
• Uses debugger interface

[Diagram: the Analyst Program (Mutator), linked with the Dyninst Library and its Process Controller, controls the Monitored Process (Mutatee), which contains the Dyninst Runtime Lib, through the Debugger Interface.]

Page 8: Using Dyninst for Program Binary Analysis and Instrumentation

Dyninst’s Process Interface

http://paradyn.org/html/manuals.html

Page 9: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Create a ChaosPro.exe Process

#include <cstdio>
#include "BPatch.h"
#include "BPatch_process.h"

BPatch bpatch;

// Invoked by Dyninst when the mutatee is about to exit.
static void exitCallback(BPatch_thread *, BPatch_exitType)
{
    printf("About to exit\n");
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "Usage: %s prog_filename\n", argv[0]);
        return 1;
    }

    // Launch the mutatee; argv+1 becomes its argument vector.
    BPatch_process *proc =
        bpatch.processCreate(argv[1], (const char **)(argv + 1));

    bpatch.registerExitCallback(exitCallback);

    // Resume the mutatee and wait until it terminates.
    proc->continueExecution();
    while (!proc->isTerminated())
        bpatch.waitForStatusChange();
    return 0;
}

> mutator.exe C:\Chaos\ChaosPro.exe

Page 10: Using Dyninst for Program Binary Analysis and Instrumentation

Unified Abstractions

• BPatch_addressSpace: add/remove instrumentation, lookups by address, allocate variables in mutatee
  • BPatch_process (live process: a.out, libc.so): process state, threads, one-time instrumentation
  • BPatch_binaryEdit (binary files: a.out, libc.so): write file

Page 11: Using Dyninst for Program Binary Analysis and Instrumentation

Symbol Table Parsing

Where are malloc, free?

[Diagram: the Mutator, linked with the Dyninst Library (Code Generator, Instrumenter, Stack Walker, Process Controller, Symbol Table Parser, Code Parser, Instruction Decoder), examines the Mutatee, which consists of chaospro.exe, msvcrt.dll, and the Runtime Lib.]

Page 12: Using Dyninst for Program Binary Analysis and Instrumentation

Symbol Table Parsing

Where are malloc, free?

The Symbol Table Parser reads the mutatee's files (chaospro.exe, msvcrt.dll, Runtime Lib).

Supported formats:
• PE
• ELF
• XCOFF

Information extracted:
• Program Headers
• Shared Object Dependencies
• Type Information
• Exception Information
• Symbols
• Symbol Versions
• Section Headers
• Section Data
• Dynamic Segment Information
• Relocations
• Local variable Information
• Line Number Information

[Table: columns Symbol, Address, Size. Symbols: func1, func2, variable1. Addresses: 0x0804cd1d, 0x0804cc84, 0x0804cd00. Size entries as extracted: 100, 4500.]

Page 13: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Find malloc

int main(int argc, char *argv[])
{
    ...
    BPatch_image *image = proc->getImage();

    // Look up the C runtime module, then malloc within it.
    BPatch_module *libc = image->findModule("msvcrt");

    vector< BPatch_function * > *funcs = libc->findFunction("malloc");

    BPatch_function *bp_malloc = (*funcs)[0];

    Address start = bp_malloc->getBaseAddr();
    Address size = bp_malloc->getSize();

    // Address can be wider than int, so print it as unsigned long.
    printf("malloc: [%lx %lx]\n",
           (unsigned long)start, (unsigned long)(start + size));
    ...
}

[Diagram: the Mutator, through the Dyninst Library, queries the Mutatee (chaospro.exe, msvcrt.dll, Runtime Lib).]

Page 14: Using Dyninst for Program Binary Analysis and Instrumentation

Decoding and Parsing of Binary Code

Get parameters, return values for malloc, free

[Diagram: the Mutator, linked with the Dyninst Library (Code Generator, Instrumenter, Stack Walker, Code Parser, Instruction Decoder, Process Controller, Symbol Table Parser), examines the Mutatee (chaospro.exe, msvcrt.dll, Runtime Lib).]

Page 15: Using Dyninst for Program Binary Analysis and Instrumentation

Instruction Decoding

The Instruction Decoder turns raw mutatee bytes into an abstract syntax tree.

Supported architectures:
• IA32
• AMD64
• POWER

Mutatee bytes:
8b 04 99 20 e9 3d e0 09 e8 68 c0 45 be 79 5e 80 89 08 27 c0 73
1c 88 48 6a d8 6a d0 56 4b fe 92 57 af 40 0c b6 f2 64 32 f5 07
57 af 40 0c b6 f2 64 32 f5 07 b6 66 21 0c 85 a5 94 2b 20 fd 5b

Decoded instruction: mov eax -> [ebx * 4 + ecx]

Abstract Syntax Tree:
mov
  eax
  deref  [ebx * 4 + ecx]
    add
      mult
        ebx
        4
      ecx
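
As a worked illustration of the decoding step, the first three bytes of the dump (8b 04 99) can be decoded by hand. The block below is a minimal sketch, assuming 32-bit mode and covering only the single encoding form "8B /r" with a SIB byte and no displacement. decodeMovLoad is a hypothetical helper and not how the Instruction Decoder is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy IA-32 decoder: handles only "8B /r" (mov r32, r/m32) with a
// SIB-addressed memory operand and no displacement. Every other encoding
// is rejected by returning an empty string.
std::string decodeMovLoad(const std::vector<std::uint8_t> &bytes) {
    static const char *const reg32[8] = {"eax", "ecx", "edx", "ebx",
                                         "esp", "ebp", "esi", "edi"};
    if (bytes.size() < 3 || bytes[0] != 0x8B) return "";

    const std::uint8_t modrm = bytes[1];
    const unsigned mod = modrm >> 6;        // addressing mode
    const unsigned reg = (modrm >> 3) & 7;  // destination register
    const unsigned rm = modrm & 7;          // 4 means "SIB byte follows"
    if (mod != 0 || rm != 4) return "";

    const std::uint8_t sib = bytes[2];
    const unsigned scale = 1u << (sib >> 6);  // 1, 2, 4 or 8
    const unsigned index = (sib >> 3) & 7;
    const unsigned base = sib & 7;
    if (index == 4 || base == 5) return "";   // no-index and disp32 forms

    // Intel syntax: destination first, then the memory operand.
    return std::string("mov ") + reg32[reg] + ", [" + reg32[base] + " + " +
           reg32[index] + "*" + std::to_string(scale) + "]";
}
```

Calling decodeMovLoad with the bytes 0x8B, 0x04, 0x99 yields "mov eax, [ecx + ebx*4]", the same operand expression (ebx * 4 + ecx) that the slide shows as a deref/add/mult tree.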

Page 16: Using Dyninst for Program Binary Analysis and Instrumentation

Parsing

• Identify basic blocks, functions
• Builds control-flow graph
• Operate on stripped code, but use symbol information opportunistically

Parse-time analyses:

Supported architectures: IA32, AMD64, POWER

[Diagram: the mutatee byte dump from the Instruction Decoding slide passes through the Instruction Decoder (mov eax -> [ebx * 4 + ecx], with its deref/add/mult tree) into the Code Parser.]
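
The "identify basic blocks" step can be illustrated with the classic leader rule: a block starts at the function entry, at every branch target, and after every control transfer. The block below is a minimal sketch over a toy instruction list. Kind, Insn, and findLeaders are hypothetical names, and this is a textbook algorithm rather than the Code Parser's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

enum class Kind { Plain, Jump, CondJump, Ret };

// Toy instruction: `target` is an index into the instruction vector and is
// only meaningful for Jump and CondJump.
struct Insn {
    Kind kind;
    std::size_t target;
};

// Returns the indices of instructions that start a basic block ("leaders").
std::set<std::size_t> findLeaders(const std::vector<Insn> &code) {
    std::set<std::size_t> leaders;
    if (code.empty()) return leaders;
    leaders.insert(0);  // the function entry always starts a block
    for (std::size_t i = 0; i < code.size(); ++i) {
        const Insn &insn = code[i];
        if (insn.kind == Kind::Plain) continue;
        // Any control transfer ends its block, so the next instruction
        // (if there is one) starts a new block.
        if (i + 1 < code.size()) leaders.insert(i + 1);
        // Branch targets also start blocks.
        if (insn.kind != Kind::Ret) {
            assert(insn.target < code.size());
            leaders.insert(insn.target);
        }
    }
    return leaders;
}
```

Each basic block then runs from one leader up to, but not including, the next leader; the edges of the control-flow graph follow from the branch targets and fall-throughs.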

Page 17: Using Dyninst for Program Binary Analysis and Instrumentation

Binary Code Parsing

Task: instrument malloc at its entry and exit points, instrument free at its entry point

Subtask: find malloc and parse it

Symbols in msvcrt.dll:
malloc   77C2C407
free     77C2C21B
atoi     77C1BE7B
strcpy   77C46030
memmove  77C472B0

[Diagram: the Process Controller and Symbol Table Parser locate malloc in the Mutatee (chaospro.exe, msvcrt.dll); the Instruction Decoder and Code Parser then process its bytes, shown as the byte dump and the mov eax -> [ebx * 4 + ecx] tree from the Instruction Decoding slide.]

Page 18: Using Dyninst for Program Binary Analysis and Instrumentation

Control Flow Traversal Parsing

• Function symbols may be sparse
  • Executables need to provide only one function address
  • Libraries provide symbols for exported functions
• Parsing finds additional functions by following call edges

Functions found in an example binary:
_start   [80483b0 80483fa]
_init    [8048354 804836b]
_fini    [8048580 804859c]
main     [8048480 80484cf]
targ3d4  [80483d4 80483fa]
targ400  [8048400 804843e]
targ440  [8048440 8048468]
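
The idea of finding additional functions by following call edges can be sketched as a worklist traversal. The block below is a minimal sketch over a toy program image in which every instruction occupies one address unit. Op, Insn, Image, and discoverFunctions are hypothetical names, and indirect or conditional control flow is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

enum class Op { Plain, Call, Jump, Ret };

struct Insn {
    Op op;
    std::uint32_t target;  // used by Call and Jump only
};

// Toy program image: every instruction occupies one address unit.
using Image = std::map<std::uint32_t, Insn>;

// Starting from the entry points named by symbols, follow control flow and
// treat every call target as a newly discovered function entry.
std::set<std::uint32_t> discoverFunctions(
    const Image &image, const std::vector<std::uint32_t> &symbols) {
    std::set<std::uint32_t> functions(symbols.begin(), symbols.end());
    std::vector<std::uint32_t> pending(symbols.begin(), symbols.end());
    std::set<std::uint32_t> visited;  // instructions already parsed

    while (!pending.empty()) {
        std::uint32_t addr = pending.back();
        pending.pop_back();
        // Follow one control-flow path until it ends or rejoins parsed code.
        while (image.count(addr) && visited.insert(addr).second) {
            const Insn &insn = image.at(addr);
            if (insn.op == Op::Call && functions.insert(insn.target).second)
                pending.push_back(insn.target);  // new function found
            if (insn.op == Op::Ret) break;
            addr = (insn.op == Op::Jump) ? insn.target : addr + 1;
        }
    }
    return functions;
}
```

With a single known entry point, the traversal reaches every function that is called directly, which is how entries without symbols (such as the targ* names in the listing above) can be recovered. Code that is never called directly stays undiscovered in this sketch.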

Page 19: Using Dyninst for Program Binary Analysis and Instrumentation

Control Flow Graph

• Graph elements:
  • BPatch_function
  • BPatch_basicBlock
  • BPatch_edge
• Instrumentation points:
  • BPatch_point
      Address pointAddr;
      BPatch_procedureLocation type;
      enum { BPatch_entry, BPatch_exit, BPatch_subroutine, BPatch_address }

[Figure: control-flow graph with points marked E, C, and R.]

Page 20: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Find malloc’s Exit Points

vector< BPatch_function * > * funcs;
• funcs = bp_image->getProcedures();
• funcs = bp_image->findFunction("malloc");

Parsing is triggered automatically as needed

[Figure: Mutatee modules chaospro.exe, msvcrt.dll, and kernel32.dll; malloc's control-flow graph with points marked E, C, and R.]

Page 21: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Find malloc’s Exit Points

vector< BPatch_function * > * funcs;
• funcs = bp_image->findFunction("malloc");
• funcs = libc_mod->findFunction("malloc");

Parsing is triggered automatically as needed

[Figure: Mutatee modules chaospro.exe, msvcrt.dll, and kernel32.dll; malloc's control-flow graph with points marked E, C, and R.]

Page 22: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Find malloc’s Exit Points

BPatch_function * bp_malloc = (*funcs)[0];
vector< BPatch_point * > * points = bp_malloc->findPoints( LOCATION );

where LOCATION is one of:
• BPatch_entry
• BPatch_subroutine
• BPatch_exit

[Figure: Mutatee modules chaospro.exe, msvcrt.dll, and kernel32.dll; malloc's control-flow graph with points marked E, C, and R.]

Page 23: Using Dyninst for Program Binary Analysis and Instrumentation

Instrumentation (at last!)

[Diagram: the Mutator, linked with the Dyninst Library (Code Generator, Instrumenter, Stack Walker, Code Parser, Instruction Decoder, Process Controller, Symbol Table Parser), operates on the Mutatee (chaospro.exe, msvcrt.dll, Runtime Lib).]

Page 24: Using Dyninst for Program Binary Analysis and Instrumentation

Specifying Instrumentation Requests

An instrumentation request passed to the Instrumenter and Code Generator has two parts:
• where: Instrumentation Points (for example, the return points R of a function)
• what: Abstract Syntax Tree Snippet

Page 25: Using Dyninst for Program Binary Analysis and Instrumentation

BPatch_Snippet Subclasses

• BPatch_sequence( vector< BPatch_Snippet* > items )
• BPatch_variableExpr()
• BPatch_constExpr( int value ), BPatch_constExpr( char* value ), BPatch_constExpr( void* value )
• BPatch_ifExpr( BPatch_boolExpr condition, BPatch_Snippet then_clause, BPatch_Snippet else_clause )
• BPatch_funcCallExpr( BPatch_function * func, vector< BPatch_Snippet* > args )
• BPatch_paramExpr( int param_number )
• BPatch_retExpr()
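
To show how such subclasses compose into a tree, the block below gives a minimal sketch of a toy snippet hierarchy that is interpreted directly instead of being compiled into the mutatee. The classes only mirror the shape of the list above. CallContext, Snippet, ConstExpr, ParamExpr, FuncCallExpr, and Sequence are hypothetical stand-ins, not Dyninst classes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy model of a snippet tree; not Dyninst code.
struct CallContext {
    std::vector<long> params;  // arguments of the instrumented function
};

struct Snippet {
    virtual ~Snippet() = default;
    virtual long eval(const CallContext &ctx) const = 0;
};
using SnippetPtr = std::shared_ptr<Snippet>;

struct ConstExpr : Snippet {
    long value;
    explicit ConstExpr(long v) : value(v) {}
    long eval(const CallContext &) const override { return value; }
};

struct ParamExpr : Snippet {
    std::size_t index;
    explicit ParamExpr(std::size_t i) : index(i) {}
    // Reads the index-th argument of the instrumented function.
    long eval(const CallContext &ctx) const override {
        return ctx.params.at(index);
    }
};

struct FuncCallExpr : Snippet {
    std::function<long(const std::vector<long> &)> func;
    std::vector<SnippetPtr> args;
    FuncCallExpr(std::function<long(const std::vector<long> &)> f,
                 std::vector<SnippetPtr> a)
        : func(std::move(f)), args(std::move(a)) {}
    // Evaluates each argument snippet, then calls the target function.
    long eval(const CallContext &ctx) const override {
        std::vector<long> values;
        for (const SnippetPtr &arg : args) values.push_back(arg->eval(ctx));
        return func(values);
    }
};

struct Sequence : Snippet {
    std::vector<SnippetPtr> items;
    explicit Sequence(std::vector<SnippetPtr> i) : items(std::move(i)) {}
    // Evaluates every item in order and yields the last item's value.
    long eval(const CallContext &ctx) const override {
        long last = 0;
        for (const SnippetPtr &item : items) last = item->eval(ctx);
        return last;
    }
};
```

A FuncCallExpr holding a ConstExpr and a ParamExpr(0) has the same shape as a call that passes a constant and the instrumented function's first argument. The real library generates machine code from such a tree rather than interpreting it.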

Page 26: Using Dyninst for Program Binary Analysis and Instrumentation

BPatch_Snippet Classes


Page 27: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Forming printf Snippet

Goal: insert printf( "free(%x)\n" , arg0 ); at the entry point (E) of free(ptr).

BPatch_funcCallExpr ( BPatch_function * func, vector< BPatch_Snippet* > args )

Snippet tree:
BPatch_funcCallExpr
  BPatch_function bp_printf
  vector
    BPatch_constExpr "free(%x)\n"
    BPatch_paramExpr arg0(0)

Page 28: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Instrument free w/ call to printf

BPatch_function * bp_free;
vector< BPatch_point * > entryPoints;
...
// Arguments of the inserted call: the format string and free's first parameter.
BPatch_constExpr arg0 ( "free(%x)\n" );
BPatch_paramExpr arg1 (0);

vector< BPatch_snippet * > printf_args;
printf_args.push_back( & arg0 );
printf_args.push_back( & arg1 );

BPatch_funcCallExpr callPrintf( *bp_printf, printf_args );

// Batch the insertions so they are applied together.
bpatch.beginInsertionSet();
for ( size_t idx = 0; idx < entryPoints.size(); idx++ )
    proc->insertSnippet( callPrintf, *entryPoints[idx] );
bpatch.finalizeInsertionSet();

Snippet tree inserted at the entry point (E) of free(ptr):
BPatch_funcCallExpr
  bp_printf
  vector
    BPatch_constExpr "free(%x)\n"
    BPatch_paramExpr (0)

Page 29: Using Dyninst for Program Binary Analysis and Instrumentation

Using Variables

• Find / create variable
    bp_image->findVariable("global1");
    bp_proc->malloc(bp_image->findType("int"));
• Initialization instrumentation
  • e.g., assignment at entry point of main
• Manipulation instrumentation
  • e.g., arithmetic assignment expression
• Gather / print out values
  • e.g., through callback instrumentation

malloc instrumentation: save argument in a variable

Page 30: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Instrumenting malloc

void * malloc ( size_t size )
{
    MALLOC_ARG = size;
    ...
    if (MALLOC_ARG > 1000)
        printf("%x = malloc(%x)\n", retnValue, MALLOC_ARG);
}

[Diagram: malloc with entry point E and return points R. Snippet tree labels: BPatch_arithExpr, BPatch_assign, MALLOC_ARG, BPatch_constExpr, 1.]

Page 31: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Instrumenting malloc

void * malloc ( size_t size )
{
    MALLOC_ARG = size;
    ...
    if (MALLOC_ARG > 100)
        printf("%x = malloc(%x)\n", retnValue, MALLOC_ARG);
}

Snippet tree for the conditional printf, inserted at malloc's return points (R):
BPatch_ifExpr
  BPatch_boolExpr
    BPatch_gt
    MALLOC_ARG
    BPatch_constExpr(100)
  BPatch_funcCallExpr
    BPatch_function bp_printf
    vector
      BPatch_constExpr "%x = malloc(%x)\n"
      BPatch_retExpr retnValue
      MALLOC_ARG
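
The effect of this instrumentation can be imitated by hand to see why the argument is saved at entry. The block below is a minimal sketch, assuming a threshold of 100 as on this slide. instrumentedMalloc, onEntry, onExit, and report are hypothetical names, and the output is collected in a vector instead of being printed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Stand-ins for state that real instrumentation would allocate in the mutatee.
static std::size_t MALLOC_ARG = 0;
static std::vector<std::string> report;  // collected instead of printed

// Entry snippet: save the argument, because it may no longer be available
// when the function returns.
static void onEntry(std::size_t size) { MALLOC_ARG = size; }

// Exit snippet: report only allocations larger than `threshold`.
static void onExit(void *retnValue, std::size_t threshold) {
    if (MALLOC_ARG > threshold) {
        char line[64];
        std::snprintf(line, sizeof line, "%p = malloc(%zx)", retnValue,
                      MALLOC_ARG);
        report.push_back(line);
    }
}

// Hand-written equivalent of malloc after instrumentation.
void *instrumentedMalloc(std::size_t size, std::size_t threshold = 100) {
    onEntry(size);
    void *result = std::malloc(size);
    onExit(result, threshold);
    return result;
}
```

A single global such as MALLOC_ARG is only safe when calls do not overlap; with several threads inside malloc at once, per-thread storage would be needed.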

Page 32: Using Dyninst for Program Binary Analysis and Instrumentation

Generating the Instrumentation Code

The Code Generator and Instrumenter combine two inputs:
• Instrumentation snippet: BPatch_funcCallExpr with bp_printf and a vector holding BPatch_constExpr "free(%x)\n" and BPatch_paramExpr arg0(0)
• Code at the instrumented point: mov eax -> [ebx * 4 + ecx], represented as a deref/add/mult tree over ebx, 4, and ecx

Supported architectures:
• IA32
• AMD64
• POWER

Page 33: Using Dyninst for Program Binary Analysis and Instrumentation

Stack Walking

[Diagram: the Mutator, linked with the Dyninst Library (Code Generator, Instrumenter, Stack Walker, Code Parser, Instruction Decoder, Process Controller, Symbol Table Parser), operates on the Mutatee (chaospro.exe, msvcrt.dll, Runtime Lib).]

Page 34: Using Dyninst for Program Binary Analysis and Instrumentation

Example: Stack Walk of malloc Call

• Choose instrumentation point
  • the exit points of malloc
• Insert callback instrumentation
  • use stopThreadExpr snippet
• Callback triggers stackwalk
  • BPatch_thread::getCallStack(…)

[Diagram: the Mutator's Stack Walker, in the Dyninst Library, walks the stack of the Mutatee (chaospro.exe, msvcrt.dll, Runtime Lib) when a return point (R) of malloc is reached.]
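
What a stack walk computes can be sketched on simulated memory. The block below is a minimal sketch, assuming a simplified frame-pointer convention in which each frame stores the caller's frame pointer followed by the return address. Memory and walkStack are hypothetical names; the real Stack Walker handles many more frame layouts than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Simulated mutatee memory, addressed in whole words for simplicity.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// Walks a chain of frame pointers. Assumed frame layout: mem[fp] holds the
// caller's frame pointer and mem[fp + 1] holds the return address into the
// caller. A frame pointer of 0 marks the outermost frame.
std::vector<std::uint64_t> walkStack(const Memory &mem, std::uint64_t pc,
                                     std::uint64_t fp,
                                     std::size_t maxFrames = 64) {
    std::vector<std::uint64_t> pcs;
    pcs.push_back(pc);  // innermost frame: the current program counter
    while (fp != 0 && pcs.size() < maxFrames) {
        auto saved = mem.find(fp);
        auto ret = mem.find(fp + 1);
        if (saved == mem.end() || ret == mem.end()) break;  // unreadable frame
        pcs.push_back(ret->second);
        fp = saved->second;
    }
    return pcs;
}
```

The returned list runs from the innermost frame outward. Mapping each address back to a function name is the job of the symbol table and parsing components described earlier.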

Page 35: Using Dyninst for Program Binary Analysis and Instrumentation

Implementation Session

Code Coverage
• Create a mutator that counts function invocations
• See description of the lab at http://www.paradyn.org/tutorial/