Malware Analysis and Instrumentation

Andrew Bernat and Kevin Roundy, Paradyn Project, Center for Computing Science, June 14, 2011


Page 1: Malware Analysis and Instrumentation

Malware Analysis and Instrumentation

Andrew Bernat and Kevin Roundy
Paradyn Project

Center for Computing Science

June 14, 2011

Page 2: Malware Analysis and Instrumentation

Forensic analysts need help

90% of malware resists analysis [1]
Malware attacks cost billions of dollars annually [2]
65% of users feel the effects of cybercrime [3]
69% of cybercrimes are resolved [3]
Resolving a cybercrime takes 28 days on average [3]

[1] McAfee, 2008
[2] Computer Economics, 2007
[3] Norton, 2010

[Figure: hex dump of a malware binary]

Page 3: Malware Analysis and Instrumentation

Forensic analysts need help

The needed toolbox:
  Binary code identification
  Control- and data-flow analysis
  Instrumentation
  Effectiveness on malware

[Figure: hex dump of a malware binary]

Page 4: Malware Analysis and Instrumentation

Dyninst is a toolbox for analysts

[Figure: program binary (hex dump), Dyninst, CFG]

Dyninst components:
  Control flow analyzer
  Instrumenter
  Data flow analyzer

Dyninst capabilities:
  Loop, block, function, and instruction instrumentation
  Function replacement
  Call stack walking
  Forward & backward slices
  Loop analysis
  Process control
  Library injection
  Symbol table reading, writing
  Binary rewriting
  Machine language parsing

Page 5: Malware Analysis and Instrumentation

Dyninst is a toolbox for analysts

Analysis tool

Mutator:
  Specifies instrumentation
  Gets callbacks for runtime events
  Builds high-level analysis

[Figure: program binary (hex dump), Dyninst (control flow analyzer, instrumenter, data flow analyzer), CFG, and the capability list from the previous slide]

Page 6: Malware Analysis and Instrumentation

Dyninst is a toolbox for analysts

Analysis tool

Mutator:
  Specifies instrumentation
  Gets callbacks for runtime events
  Builds high-level analysis

Code snippets:
  printf(…)
  counter++
  if (pred)
  callback(…)
  getTarget(insn)

Example analyses:
  Analysis of network communications
  Code visualizations
  Time bomb detection and analysis
  Identification of stolen data
  Reports on anti-analysis techniques

[Figure: program binary (hex dump), Dyninst (control flow analyzer, instrumenter, data flow analyzer), CFG]

Page 7: Malware Analysis and Instrumentation

Dyninst on malware

Malware defeats static analysis and is sensitive to instrumentation.

[Figure: the toolbox diagram from the previous slide with a malware binary as input: analysis tool (mutator, code snippets), Dyninst (control flow analyzer, instrumenter, data flow analyzer), CFG, and the example analyses]

Page 8: Malware Analysis and Instrumentation

Dyninst on malware

Malware defeats static analysis and is sensitive to instrumentation.

SR-Dyninst:
  Static-dynamic analysis
  Control flow analyzer
  Sensitivity Resistant Instrumenter
  Data flow analyzer

[Figure: the toolbox diagram with a malware binary as input and SR-Dyninst in place of Dyninst: analysis tool (mutator, code snippets), CFG, and the example analyses]

Page 9: Malware Analysis and Instrumentation

Outline

  Anti-analysis tricks (Anti)
  Hybrid static-dynamic analysis (H.A.)
  Sensitivity resistance (S.R.)
  Results (Res.)

Page 10: Malware Analysis and Instrumentation

Anti-analysis tricks

PC-sensitive code: call-pop pairs, return-address manipulation, call-stack tampering & probing
Obfuscated control flow: indirect control flow, stack tampering, overlapping code, signal-based control flow
Unpacked code: all-at-once, block-, loop-, function-at-a-time, to empty or allocated space
Overwritten code: single operand or opcode, whole instruction, function, code section, buffer
Anti-patching: checksum whole regions, probe for patches, use code as data, move stack pointer
Address-space probing: scans & probes of locations that should be unallocated

Side labels on the slide: Anti-analysis, Anti-instrumentation

Page 11: Malware Analysis and Instrumentation

Obfuscated control flow

Example: Storm worm, code starting at 40d002 (entry point)

  Address  Bytes           Instruction
  40d002   e8 03 00 00 00  CALL 40d00a
  40d007   e9 eb 04 5d 45  JMP 459dd4f7
  40d008   eb 04           JMP 40d00e
  40d00a   5d              POP ebp
  40d00b   45              INC ebp
  40d00c   55              PUSH ebp
  40d00d   c3              RET

The JMP at 40d008 overlaps the bytes of the JMP at 40d007.
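The branch targets in this example can be checked by hand: an x86 relative branch targets the address of the next instruction plus a signed displacement. The following is a minimal sketch, assuming the 12 bytes on the slide start at address 0x40d002; the helper name `rel_target` is illustrative and is not part of Dyninst.

```python
import struct

# The 12 bytes shown on the slide.
CODE = bytes.fromhex("e8 03 00 00 00 e9 eb 04 5d 45 55 c3")
BASE = 0x40D002  # assumption: address of the first byte


def rel_target(addr):
    """Return the target of the relative CALL/JMP encoded at addr."""
    off = addr - BASE
    opcode = CODE[off]
    if opcode in (0xE8, 0xE9):  # CALL rel32 / JMP rel32, 5 bytes long
        (disp,) = struct.unpack_from("<i", CODE, off + 1)
        return (addr + 5 + disp) & 0xFFFFFFFF
    if opcode == 0xEB:  # JMP rel8, 2 bytes long
        (disp,) = struct.unpack_from("<b", CODE, off + 1)
        return (addr + 2 + disp) & 0xFFFFFFFF
    raise ValueError("not a relative branch")


# CALL pushes the address of the byte after it; POP/INC/PUSH/RET then
# returns to that address plus one, which lies inside the first JMP.
return_address = BASE + 5
tampered_return = return_address + 1

for addr in (0x40D002, 0x40D007, tampered_return):
    print(hex(addr), "->", hex(rel_target(addr)))
```

Decoding from 0x40d007 and from 0x40d008 gives two different instructions over the same bytes, which is why a single linear disassembly of this region is misleading.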

Page 12: Malware Analysis and Instrumentation

Unpacked code

Example: Storm worm

[Figure: two hex dumps of the binary, with the entry point marked]

Page 13: Malware Analysis and Instrumentation

Overwritten code

Example: Upack packer

[Figure: hex dump of the binary, with the entry point marked]

Page 14: Malware Analysis and Instrumentation

PC-sensitive code

Local data access (e.g., ASProtect):

  call                  ; use call to get current PC
  pop esi               ; pop PC into register
  add esi, eax          ; construct pointer
  mov ebx, ptr[esi]     ; and dereference

[Figure label: data]
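To see why this pattern is sensitive to relocation, the call-pop sequence can be modeled in a few lines. This is a sketch under stated assumptions: the addresses, the 5-byte call length, and the dictionary standing in for memory are all hypothetical.

```python
CALL_LEN = 5  # assumption: the call is a 5-byte CALL rel32


def run_call_pop(call_addr, eax, memory):
    """Model of: call; pop esi; add esi, eax; mov ebx, ptr[esi]."""
    esi = call_addr + CALL_LEN  # call pushes the return address, pop loads it
    esi += eax                  # construct a pointer relative to the PC
    return memory.get(esi)      # dereference; None models reading the wrong place


# Hypothetical layout: the data sits 0x10 bytes past the return address.
memory = {0x401005 + 0x10: "local data"}

original = run_call_pop(0x401000, 0x10, memory)
relocated = run_call_pop(0x900000, 0x10, memory)  # same code moved elsewhere
print(original, relocated)
```

The relocated copy computes its pointer from the new return address and no longer finds the data, which is the behavior change an instrumenter has to compensate for.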

Page 15: Malware Analysis and Instrumentation

Anti-patching

Checksumming detects instrumentation [Aucsmith 96], e.g., PECompact

checksum routine:
      xor eax, eax
  .loop:
      add eax, ptr[ebx]      ; calculate checksum of protected region
      add ebx, 4
      cmp ebx, 0x41000
      jne .loop
      cmp eax, .chksum       ; compare to expected value
      jne .fail

[Figure: the checksum routine reads the protected code; a jmp patched into the protected code leads to fail instead of pass, so the instrumentation is detected]
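The effect of the checksum loop can be reproduced on a toy image. The sketch below assumes a 64-byte protected region and a 5-byte jump patch at offset 8; both are made up for illustration.

```python
def checksum(image, start, end):
    """Sum 32-bit little-endian words, like the add/add/cmp/jne loop."""
    total = 0
    for addr in range(start, end, 4):
        word = int.from_bytes(image[addr:addr + 4], "little")
        total = (total + word) & 0xFFFFFFFF
    return total


image = bytearray(range(64))        # stand-in for the protected code
expected = checksum(image, 0, 64)   # value the packer would embed

patched = bytearray(image)
patched[8:13] = b"\xe9\x00\x10\x00\x00"  # hypothetical 5-byte jmp to instrumentation

print(checksum(patched, 0, 64) == expected)
```

Any patch that changes bytes inside the checked region changes the sum unless it is specially constructed, so the comparison against the expected value fails.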

Page 16: Malware Analysis and Instrumentation

Address-space probing

Memory scan (pseudocode):

  int *ptr = 0;

  segv_handler() {
      ptr += PAGESIZE;
      goto RESTART;
  }

  sigaction(SIGSEGV, segv_handler);

  while (1) {
  RESTART:
      *ptr;
      ptr += PAGESIZE;
  }

[Figure: address space containing data, code, and instrumentation regions]
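The scan can be simulated without touching real memory. In this sketch a set of page addresses stands in for the address space, and a failed lookup stands in for the SIGSEGV path; the layout and page size are assumptions.

```python
PAGE = 0x1000  # assumed page size


def scan(allocated_pages, limit):
    """Probe one address per page and return the pages that are readable."""
    found = []
    for addr in range(0, limit, PAGE):
        if addr in allocated_pages:  # the read succeeds
            found.append(addr)
        # otherwise the read faults and the handler skips to the next page
    return found


expected_layout = {0x1000, 0x2000, 0x5000}          # what the malware expects
with_instrumentation = expected_layout | {0x8000}   # tool allocated one extra page

unexpected = set(scan(with_instrumentation, 0x10000)) - expected_layout
print(sorted(hex(p) for p in unexpected))
```

A page that should be unallocated but reads successfully reveals the memory an instrumentation tool added to the process.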

Page 17: Malware Analysis and Instrumentation

Code discovery algorithm

Hybrid algorithm:
  Parse from known entry points
  Instrument control flow that may lead to new code
  Resume execution

Discovery mechanisms shown in the figure:
  instrument: CALL ptr[eax]
  exception:  DIV eax, 0
  overwrite
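The three steps form a loop, which can be sketched on a toy program. Everything here is an illustrative model: the program is a dictionary of addresses, `runtime_targets` stands in for what instrumentation would observe during execution, and none of the names come from Dyninst.

```python
def parse(program, entry, known):
    """Static traversal from entry; returns sites whose targets are unknown."""
    work, unresolved = [entry], set()
    while work:
        addr = work.pop()
        if addr in known or addr not in program:
            continue
        known.add(addr)
        kind, targets = program[addr]
        if kind == "indirect":
            unresolved.add(addr)  # this site would be instrumented
        else:
            work.extend(targets)
    return unresolved


def discover(program, entry, runtime_targets):
    """Parse, instrument, resume; repeat whenever execution reveals new code."""
    known = set()
    instrumented = parse(program, entry, known)
    while instrumented:
        site = instrumented.pop()
        target = runtime_targets.get(site)  # seen when the site executes
        if target is not None and target not in known:
            instrumented |= parse(program, target, known)
    return known


program = {
    0: ("direct", [1]), 1: ("indirect", []),
    10: ("direct", [11]), 11: ("indirect", []),
    20: ("direct", []),
}
runtime_targets = {1: 10, 11: 20}

static_only = set()
parse(program, 0, static_only)
hybrid = discover(program, 0, runtime_targets)
print(sorted(static_only), sorted(hybrid))
```

Static parsing alone stops at the first indirect transfer, while the loop keeps extending the known code each time an instrumented site resolves.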


Page 22: Malware Analysis and Instrumentation

Accurate parsing

Standard control-flow traversal
  Start from known entry points
  Follow control flow to find code

New conservative assumption
  Unresolved calls may not return
  So, we don’t parse garbage code

New stack tamper detection
  Backwards slice at ret instruction
  So, we detect modified return addresses

Example:
  call ptr[eax]
  <garbage>

  pop ebp
  inc ebp
  push ebp
  ret
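The stack tamper check can be illustrated on the pop/inc/push/ret sequence. Note the substitution: the slide names a backward slice from the ret, whereas this sketch runs a small forward symbolic evaluation over the same instructions, tracking values only as "return address plus a constant". The instruction encoding is a made-up tuple format.

```python
def return_offset(instructions):
    """Offset of the ret target from the pushed return address.

    Returns None when the value is not 'return address + constant'.
    """
    stack = [0]  # on entry, the top of the stack holds the return address + 0
    regs = {}    # register -> offset from the return address, or None
    for op, *args in instructions:
        if op == "pop":
            regs[args[0]] = stack.pop() if stack else None
        elif op == "push":
            stack.append(regs.get(args[0]))
        elif op == "inc":
            value = regs.get(args[0])
            regs[args[0]] = None if value is None else value + 1
        elif op == "ret":
            return stack.pop() if stack else None
    return None


tampering = [("pop", "ebp"), ("inc", "ebp"), ("push", "ebp"), ("ret",)]
plain = [("ret",)]

print(return_offset(tampering), return_offset(plain))
```

A nonzero offset means the function returns somewhere other than the instruction after the call, so the bytes following the call should not be parsed as its fall-through.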

Page 23: Malware Analysis and Instrumentation

Instrumentation-based discovery

Invalid control transfers
Indirect control transfers
Exception-based control transfers

Examples on the slide:
  push eax
  ret

  call 401000          -> Invalid Region

  call ptr[eax]        -> ?
  jmp eax              -> ?

  xor eax, eax
  mov ebx, ptr[eax]    -> Exception Handler

Page 24: Malware Analysis and Instrumentation

Instrumentation-based discovery

[Figure: Dyninst and the process; the target of "call ptr[eax]" is unknown (?)]

Page 25: Malware Analysis and Instrumentation

Instrumentation-based discovery

Original code in the process:
  …
  call ptr[eax]

Patched code:
  jmp 823456
  …

Instrumentation:
  save state
  call findTarget(ptr[eax])
  restore state
  call ptr[eax]

findTarget:
  findTarget(targ) {
      if ( !cacheLookup(targ) )
          RPC_updateAnalysis(targ);
  }

[Figure: Dyninst and the process]
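The point of the cache lookup is that the expensive analysis update runs once per new target rather than on every execution of the call. A minimal sketch of that behavior follows; the class and its counters are hypothetical and only mirror the names on the slide.

```python
class TargetCache:
    """Toy model of findTarget's cache check."""

    def __init__(self):
        self.known_targets = set()  # targets already parsed into the CFG
        self.updates = 0            # number of analysis updates requested

    def cache_lookup(self, targ):
        return targ in self.known_targets

    def rpc_update_analysis(self, targ):
        # Stand-in for the call back into the analysis tool.
        self.updates += 1
        self.known_targets.add(targ)

    def find_target(self, targ):
        if not self.cache_lookup(targ):
            self.rpc_update_analysis(targ)


cache = TargetCache()
for targ in (0x401000, 0x401000, 0x402000, 0x401000):
    cache.find_target(targ)
print(cache.updates)
```

Repeated calls through the same pointer value pay only for the lookup.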

Page 26: Malware Analysis and Instrumentation

Overwritten code discovery

[Figure: Dyninst and a write to pages with RWX permissions]

Page 27: Malware Analysis and Instrumentation

Overwritten code discovery

When to update

Challenges:
  Large incremental overwrites
  Writes to data
  Writes to own page

[Figure: a write to pages marked R E; Dyninst's code write handler and CFG update routine]

Page 28: Malware Analysis and Instrumentation

Overwritten code discovery

When to update

Challenges:
  Large incremental overwrites
  Writes to data
  Writes to own page

Approach:
  Delay the update until the write routine terminates

[Figure: a write to pages marked R E; Dyninst's code write handler and CFG update routine]

Page 29: Malware Analysis and Instrumentation

Overwritten code discovery

Update after overwrite

1. Handle overwrite signal
   a) instrument write loop exits
   b) copy overwritten page
   c) restore write permissions
   d) resume execution

2. Update CFG when writes end
   a) remove overwritten and unreachable blocks
   b) parse at entry points to overwritten regions
   c) remove write permissions
   d) resume execution

[Figure: Dyninst's code write handler and CFG update routine with callbacks (cb); page permissions shown as R-X and RWX]

Page 30: Malware Analysis and Instrumentation

Overwritten code discovery

[Figure: animation step of the previous slide, repeating the "Update after overwrite" steps; a write, the code write handler and CFG update routine with callbacks (cb), and page permissions shown as R-X and RWX]
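The two-step update described on the preceding slides can be modeled as a small state machine over one page. This is a sketch only: the class, its counters, and the explicit `write_loop_exit` call (standing in for the instrumented loop exit) are assumptions made for illustration.

```python
class MonitoredPage:
    """Toy model of one code page watched for overwrites."""

    def __init__(self, size=16):
        self.data = bytearray(size)
        self.perms = "R-X"    # code pages start without write permission
        self.saved = None     # copy taken when the first write faults
        self.signals = 0
        self.cfg_updates = 0

    def write(self, offset, value):
        if "W" not in self.perms:
            self.handle_overwrite_signal()
        self.data[offset] = value

    def handle_overwrite_signal(self):
        # Step 1: (a) the write loop's exits are assumed instrumented so that
        # write_loop_exit() runs later, (b) copy the page, (c) restore write
        # permissions, (d) resume by returning to write().
        self.signals += 1
        self.saved = bytes(self.data)
        self.perms = "RWX"

    def write_loop_exit(self):
        # Step 2: find what changed, update the CFG once, then (c) remove
        # write permissions and (d) resume.
        changed = [i for i, (old, new) in enumerate(zip(self.saved, self.data))
                   if old != new]
        self.cfg_updates += 1
        self.perms = "R-X"
        return changed


page = MonitoredPage()
for offset in (4, 5, 6):
    page.write(offset, 0x90)
changed = page.write_loop_exit()
print(page.signals, page.cfg_updates, changed)
```

Only the first write faults; the remaining writes run at full speed, and the CFG is updated once for the whole batch.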

Page 31: Malware Analysis and Instrumentation

Behavior changes

Program modification affects local behavior
These changes propagate
Malware detects changes (or crashes)

Page 32: Malware Analysis and Instrumentation

Sensitivity-resistant approach

Identify instructions sensitive to modification
  Moved instructions that access the program counter
  Memory operations that may access patched code
  Memory operations that may scan the address space

Project effects on program behavior
  Are output (or control flow) affected?
  Use a forward slice and symbolic evaluation

Determine how to compensate for modification
  E.g., by emulating the original instruction

Page 33: Malware Analysis and Instrumentation

PC-sensitivity analysis

Original code:

  main:
      call foo
      ...
      call next
      <data>
  next:
      pop %esi
      add %esi, %eax
      mov (%esi), %ebx
      jmp %ebx
  foo:
      ...
      ret

Analysis:

  Sensitive: call foo
  Slice: call foo; ret
  Symbolic expansion: pc = $retAddr + $delta

  Sensitive: call next
  Slice: call next; pop %esi; add %esi, %eax; mov (%esi), %ebx; jmp %ebx
  Symbolic expansion: pc = [$next + %eax + $delta]

Relocated code (reloc_main):

  main:
      call foo
      ...
      push $next
      pop %esi
      add %esi, %eax
      mov (%esi), %ebx
      jmp %ebx

Page 34: Malware Analysis and Instrumentation

Sensitivity classes

PC (program counter) sensitive
  Moved instruction that accesses the PC
CF (control flow) sensitive
  Instruction whose control flow successor was moved
CAD (code as data) sensitive
  Instruction that reads from overwritten memory
AVU (allocated vs. unallocated) sensitive
  Instruction that accesses newly allocated memory

Page 35: Malware Analysis and Instrumentation

Visible compatibility

What behavior do we need to preserve?
  Allow localized changes that aren’t visible from outside the program

Preserve:
  Output
  Approximation: control flow

Page 36: Malware Analysis and Instrumentation

Handling CAD sensitivity

checksum routine:
      xor eax, eax
  .loop:
      add eax, ptr[ebx]
      add ebx, 4
      cmp ebx, 0x41000
      jne .loop
      cmp eax, .chksum
      jne .fail

Patch reached via jmp 863828:
      save state
      emulate(add eax, ptr[ebx])
      restore state
      add ebx, 4
      cmp ebx, 0x41000
      jne .loop

[Figure: data, code, and instrumentation regions with patches; shadow memory; pass and fail outcomes of the checksum]

Page 37: Malware Analysis and Instrumentation

Emulating memory (simplified)

Save state:
    push %eax
    push %ecx
    push %edx
    lahf
    push %eax

Determine effective address:
    lea <original>, %ebx

Translate effective address:
    call translate

Restore state:
    pop %eax
    sahf
    pop %edx
    pop %ecx
    pop %eax

Emulate original memory instruction:
    mov (%ebx), %ebx
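The role of the translate step can be shown with a flat toy address space. In this sketch the shadow copy lives at a fixed offset and `translate` is a simple range check; the real mapping is not described on the slides, so the layout, addresses, and function bodies are assumptions.

```python
CODE_START, CODE_END = 0x00, 0x40   # hypothetical protected code region
SHADOW_BASE = 0x100                 # hypothetical location of the shadow copy

memory = bytearray(0x200)
memory[CODE_START:CODE_END] = bytes(range(0x40))          # original code bytes
memory[SHADOW_BASE:SHADOW_BASE + 0x40] = memory[CODE_START:CODE_END]  # shadow copy
memory[0x08:0x0D] = b"\xe9\x00\x10\x00\x00"               # patch the live code


def translate(addr):
    """Redirect reads of patched code to the unmodified shadow copy."""
    if CODE_START <= addr < CODE_END:
        return addr - CODE_START + SHADOW_BASE
    return addr


def emulated_load(addr):
    return memory[translate(addr)]


direct = sum(memory[a] for a in range(CODE_START, CODE_END))
emulated = sum(emulated_load(a) for a in range(CODE_START, CODE_END))
print(direct, emulated)
```

Reads that go through the translation see the original bytes, so a checksum over the region matches its pre-patch value even though the live code has been modified.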

Page 38: Malware Analysis and Instrumentation

The devil in the details

IA-32 is a rich instruction set
  Most instructions can access memory
  And malware uses a wide variety of them

Instruction classes:
  Most common: MOD/RM byte
  Less common: “string” operations
  Least common: absolute address

Page 39: Malware Analysis and Instrumentation

String operations

“String” instructions implicitly use ESI/EDI
  scas/lods/stos/movs/cmps/ins/outs
Some update ESI/EDI, making emulation tricky
Malware loves these for copying blocks of memory

Emulating movs:
    <save>
    mov %edi, %edx
    mov %esi, %ecx
    call TranslateShift
    add %edx, %edi
    add %ecx, %esi
    movs
    sub %edx, %edi
    sub %ecx, %esi
    <restore>

Page 40: Malware Analysis and Instrumentation

Address-space scanning

scan routine:
      xor eax, eax
  .loop:
      mov ptr[eax], ebx
      add eax, 4
      cmp eax, 0
      jne .loop
      call chk_mem

Patch reached via jmp 863828:
      save state
      emulate(mov ptr[eax], ebx)
      restore state
      add eax, 4
      cmp eax, 0
      jne .loop

[Figure: data, code, and instrumentation regions with patches; handlers segv_handler and dyn_segv_handler; pass and fail outcomes]

Page 41: Malware Analysis and Instrumentation

Exception handler interposition

Emulation code:
    push %eax
    push %ecx
    push %edx
    lahf
    push %eax
    lea <original>, %eax
    call translate
    pop %eax
    sahf
    pop %edx
    pop %ecx
    pop %eax
    mov (%eax), %eax

Exception record from the Windows libraries, received by dyn_segv_handler:
    Faulting insn: <reloc_addr>
    Faulting addr: 0
    Registers: ...

Exception record passed on to segv_handler:
    Faulting insn: <orig_addr>
    Faulting addr: <eff_addr>
    Registers: ...
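The interposition amounts to rewriting the exception record before the program's own handler sees it. The sketch below models that with dictionaries; the field names, the relocation map, and the way the effective address reaches the handler are all assumptions, since the slide shows only the before and after records.

```python
RELOC_TO_ORIG = {0x900010: 0x401010}  # hypothetical relocated -> original addresses


def dyn_segv_handler(record, effective_addr, program_handler):
    """Rewrite the exception record, then call the program's own handler."""
    fixed = dict(record)
    fixed["faulting_insn"] = RELOC_TO_ORIG.get(record["faulting_insn"],
                                               record["faulting_insn"])
    fixed["faulting_addr"] = effective_addr  # address the original insn used
    return program_handler(fixed)


def segv_handler(record):
    # Stand-in for the malware's handler: report what it observed.
    return record["faulting_insn"], record["faulting_addr"]


raw = {"faulting_insn": 0x900010, "faulting_addr": 0,
       "registers": {"eax": 0x7FFE0000}}
seen = dyn_segv_handler(raw, effective_addr=0x7FFE0000,
                        program_handler=segv_handler)
print(hex(seen[0]), hex(seen[1]))
```

From the program's point of view the fault appears to come from its original instruction and address, so handler logic that inspects the record keeps working.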

Page 42: Malware Analysis and Instrumentation

The packers we’re studying

  Packer            Malware market share [1]
  UPX               9.45%
  PolyEnE           6.21%
  EXECryptor        4.06%
  Themida           2.95%
  PECompact         2.59%
  Upack             2.08%
  nPack             1.74%
  Aspack            1.29%
  FSG               1.26%
  Nspack            0.89%
  Asprotect         0.43%
  Armadillo         0.37%
  Yoda's Protector  0.33%
  WinUPack          0.17%
  MEW               0.13%

[Further table columns, whose per-packer yes/√/x entries are not recoverable: Self-modifying, Anti-instrumentation, Obfuscated, Dyninst, SR-Dyninst. Other label on the slide: anti-debugging techniques]

[1] Packer (r)evolution. Panda Research, 2008. Two-month average Feb-March 2008.

Page 43: Malware Analysis and Instrumentation

Sample malware analysis factory

Input: 200 malware binaries

SD-Dyninst:
  Comprehensive instrumentation
  Network call instrumentation

Outputs:
  Stack trace at 1st network communication
  Control flow graph showing executed blocks
  Defensive tactics report
    unpacked code
    overwritten code
    control flow obfuscations
  Trace of Win API calls

Page 44: Malware Analysis and Instrumentation

Factory results for Conficker A

[Figure: Conficker A, with regions labeled "initial bootstrap code" and "packed payload"]

Page 45: Malware Analysis and Instrumentation

Factory results for Conficker A

[Figure legend: API func, non-executed block, static block, unpacked block]

Page 46: Malware Analysis and Instrumentation

Factory results for Conficker A

Instrument network calls and perform a stack-walk

Stack-walk of Conficker’s communications thread:
  Frame pc=0x100016f7 func: DYNstopThread at 0x100001670 [Dyninst]
  Frame pc=0x71ab2dc0 func: select at 0x71ab2dc0 [Win DLL]
  Frame pc=0x401f34 func: nosym1f058 at 0x41f058 [Conficker]

(We can also print stackwalks of Conficker’s other threads)

Page 47: Malware Analysis and Instrumentation

Improved Dyninst overhead

Reduced relocation overhead despite emulation

Better handling of program features
  Exceptions
  Indirect control flow

Page 48: Malware Analysis and Instrumentation

Conclusion

SR-Dyninst gives you
  All the benefits of Dyninst on malware
  Safer instrumentation on normal binaries

Ongoing work
  Anti-debugger techniques
  More descriptive CFGs
  Automated defensive-mode activation
  SR-Dyninst in next Dyninst release