Malware Analysis and Instrumentation
Andrew Bernat and Kevin Roundy
Paradyn Project
Center for Computing Science
June 14, 2011
Forensic analysts need help
90% of malware resists analysis [1]
Malware attacks cost billions of dollars annually [2]
65% of users feel the effects of cybercrime [3]
69% of cybercrimes are resolved [3]
It takes 28 days on average to resolve a cybercrime [3]
[1] McAfee, 2008. [2] Computer Economics, 2007. [3] Norton, 2010.
[Figure: a malware binary shown as raw bytes]
The needed toolbox
Binary code identification
Control- and data-flow analysis
Instrumentation
Effectiveness on malware
Dyninst is a toolbox for analysts
[Diagram: a program binary is fed to Dyninst. Its control flow analyzer, instrumenter, and data flow analyzer expose the CFG to the analysis tool.]
Capabilities:
  loop, block, function, and instruction instrumentation
  function replacement
  call stack walking
  forward & backward slices
  loop analysis
  process control
  library injection
  symbol table reading, writing
  binary rewriting
  machine language parsing
Mutator (the analysis tool):
  Specifies instrumentation
  Gets callbacks for runtime events
  Builds high-level analysis
Analyses built on Dyninst:
  Analysis of network communications
  Code visualizations
  Time bomb detection and analysis
  Identification of stolen data
  Reports on anti-analysis techniques
Code snippets the mutator can insert:
  printf(…)
  counter++
  if (pred) callback(…)
  getTarget(insn)
Dyninst on malware
Malware defeats static analysis & is sensitive to instrumentation
[Diagram: the same Dyninst toolchain, now applied to a malware binary.]
SR-Dyninst: static-dynamic analysis
  Control flow analyzer
  Sensitivity Resistant Instrumenter
  Data flow analyzer
[Diagram: SR-Dyninst applied to the malware binary, producing the same analyses as above.]
Outline
Anti-analysis tricks (Anti)
Hybrid static-dynamic analysis (H.A.)
Sensitivity resistance (S.R.)
Results (Res.)
Anti-analysis tricks
Categories: anti-analysis, anti-instrumentation
PC-sensitive code: call-pop pairs, return-address manipulation, call-stack tampering & probing
Obfuscated control flow: indirect control flow, stack tampering, overlapping code, signal-based control flow
Unpacked code: all-at-once, block-, loop-, or function-at-a-time, to empty or allocated space
Overwritten code: single operand or opcode, whole instruction, function, code section, buffer
Anti-patching: checksum whole regions, probe for patches, use code as data, move stack pointer
Address-space probing: scans & probes of locations that should be unallocated
Obfuscated control flow
Storm worm, entry point 40d002
Address   Bytes            Instruction
40d002    e8 03 00 00 00   CALL 40d00a
40d007    e9 eb 04 5d 45   JMP 459dd4f7
40d008    eb 04            JMP 40d00e   (overlaps the JMP at 40d007)
40d00a    5d               POP ebp
40d00b    45               INC ebp
40d00c    55               PUSH ebp
40d00d    c3               RET
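The branch targets on this slide follow from the x86 relative-branch encodings, and the arithmetic can be checked with a short sketch. The helper names below are illustrative only; the byte string and the entry address 40d002 are taken from the slide.

```python
import struct

BASE = 0x40D002  # address of the first byte shown on the slide
CODE = bytes.fromhex("e8 03 00 00 00 e9 eb 04 5d 45 55 c3")

def rel32_target(addr):
    """Target of a 5-byte CALL (e8) or JMP (e9) with a 32-bit displacement."""
    (disp,) = struct.unpack_from("<i", CODE, addr - BASE + 1)
    return addr + 5 + disp

def rel8_target(addr):
    """Target of a 2-byte short JMP (eb) with an 8-bit displacement."""
    (disp,) = struct.unpack_from("<b", CODE, addr - BASE + 1)
    return addr + 2 + disp

call_target = rel32_target(0x40D002)   # where the CALL goes
jmp_target = rel32_target(0x40D007)    # the JMP a linear parse sees after the CALL

# The CALL pushes 0x40d007; POP ebp / INC ebp / PUSH ebp / RET returns one byte later.
ret_target = (0x40D002 + 5) + 1
hidden_jmp_target = rel8_target(ret_target)  # the overlapping short JMP

print(hex(call_target), hex(jmp_target), hex(ret_target), hex(hidden_jmp_target))
```

The JMP that a straight-line parse finds at 40d007 is never executed; the tampered return lands inside it, at 40d008.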
Unpacked code
[Figure: hex dump of a packed binary with its entry point marked]
Overwritten code
e.g., Upack packer
[Figure: hex dump of the binary with its entry point marked]
PC-sensitive code
Local data access, e.g., ASProtect
  call                  Use call to get current PC
  data
  pop esi               Pop PC into register
  add esi, eax          Construct pointer and dereference
  mov ebx, ptr[esi]
Anti-patching
Checksumming detects instrumentation [Aucsmith 96], e.g., PECompact
Checksum routine:
    xor eax, eax
  .loop:
    add eax, ptr[ebx]
    add ebx, 4
    cmp ebx, 0x41000
    jne .loop
    cmp eax, .chksum
    jne .fail
Calculate checksum of protected region
Compare to expected value
[Diagram: the protected code contains a jmp patch, so the check fails and the instrumentation is detected.]
Address-space probing
Memory scan (pseudocode):
  int *ptr = 0;
  segv_handler() { ptr += PAGESIZE; goto RESTART; }  /* unmapped page: skip it, keep scanning */
  sigaction(SIGSEGV, segv_handler);
  while (1) { RESTART: *ptr; ptr += PAGESIZE; }
[Diagram: the scanned address space contains data, code, and instrumentation.]
Code discovery algorithm
Hybrid algorithm:
1. Parse from known entry points
2. Instrument control flow that may lead to new code
3. Resume execution
How new code is reached:
  instrument: CALL ptr[eax]
  exception: DIV eax, 0
  overwrite
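The parse, instrument, and resume loop can be sketched on a toy program. Everything here is hypothetical: the instruction table, the addresses, and the `RUNTIME_TARGETS` map that stands in for what instrumentation would observe at run time. None of it is Dyninst's data model.

```python
# Toy program: address -> (mnemonic, static successors, is_indirect).
PROGRAM = {
    0x10: ("mov", [0x11], False),
    0x11: ("call ptr[eax]", [], True),   # target unknown statically
    0x20: ("push", [0x21], False),       # reachable only through the call
    0x21: ("ret", [], False),
}
RUNTIME_TARGETS = {0x11: 0x20}  # what the indirect call resolves to when run

def parse(entry, known, instrumented):
    """Static traversal from one entry point; record unresolved transfers."""
    work = [entry]
    while work:
        addr = work.pop()
        if addr in known or addr not in PROGRAM:
            continue
        known.add(addr)
        _, succs, indirect = PROGRAM[addr]
        if indirect:
            instrumented.add(addr)       # step 2: instrument rather than guess
        work.extend(succs)

def discover(entry):
    known, instrumented = set(), set()
    parse(entry, known, instrumented)                # step 1
    pending = sorted(instrumented)
    while pending:                                   # step 3: resume execution
        site = pending.pop(0)
        target = RUNTIME_TARGETS.get(site)
        if target is not None and target not in known:
            before = set(instrumented)
            parse(target, known, instrumented)       # new code: parse again
            pending.extend(sorted(instrumented - before))
    return known, instrumented

known, instrumented = discover(0x10)
```

Static parsing alone stops at the indirect call; the code at 0x20 only enters the graph once the instrumented call reports its target.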
Accurate parsing
Standard control-flow traversal
  start from known entry points
  follow control flow to find code
New conservative assumption
  unresolved calls may not return
  So, we don’t parse garbage code
New stack tamper detection
  backwards slice at ret instruction
  So, we detect modified return addresses
[Diagram: call ptr[eax] leads to pop ebp / inc ebp / push ebp / ret; the bytes after the call are garbage.]
Hybrid Analysis of Program Binaries
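A minimal sketch of the stack tamper check, under a loud assumption: it uses a forward symbolic evaluation of one block as a stand-in for the backward slice the slide describes, and it models only the three instructions in the example. The function names are illustrative.

```python
def return_address_delta(block):
    """Symbolically evaluate a block that ends in ret.

    Values are (base, delta) pairs; base 'RA' means the caller's return
    address. Returns the delta of the value consumed by ret, or None if
    that value is not derived from the return address.
    """
    stack = [("RA", 0)]          # on entry the return address is on top
    regs = {}
    for insn in block:
        op, *args = insn.split()
        if op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "push":
            stack.append(regs.get(args[0], ("?", 0)))
        elif op == "inc":
            base, delta = regs.get(args[0], ("?", 0))
            regs[args[0]] = (base, delta + 1)
        elif op == "ret":
            base, delta = stack.pop()
            return delta if base == "RA" else None
    raise ValueError("block does not end in ret")

def is_tampered(block):
    """True when ret does not return to the unmodified return address."""
    return return_address_delta(block) != 0

storm = ["pop ebp", "inc ebp", "push ebp", "ret"]
```

For the sequence on the slide the evaluation reports a delta of 1, so the parser knows the call returns one byte past its fall-through address.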
Instrumentation-based discovery
Invalid control transfers
Indirect control transfers
Exception-based control transfers
Examples:
  push eax / ret
  call 401000            (target lies in an invalid region)
  call ptr[eax]          (target unknown)
  jmp eax                (target unknown)
  xor eax, eax / mov ebx, ptr[eax]   (faults into the exception handler)
[Diagram: Dyninst instruments the unresolved call ptr[eax] in the process.]
Original code:
    …
    call ptr[eax]
Instrumented code:
    jmp 823456
    …
At 823456:
    save state
    call findTarget(ptr[eax])
    restore state
    call ptr[eax]
findTarget(targ) { if ( !cacheLookup(targ) ) RPC_updateAnalysis(targ); }
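The point of the cache in findTarget is that only targets not seen before cost a round trip to the analyzer. A minimal sketch follows; the class name is hypothetical, the callback stands in for RPC_updateAnalysis, and adding the target to the cache on a miss is an assumption the slide does not spell out.

```python
class TargetCache:
    def __init__(self, update_analysis):
        self._known = set()
        self._update_analysis = update_analysis  # stand-in for RPC_updateAnalysis

    def find_target(self, targ):
        """Forward a target to the analyzer only the first time it is seen."""
        if targ not in self._known:      # !cacheLookup(targ)
            self._known.add(targ)        # assumed: a miss populates the cache
            self._update_analysis(targ)

parsed = []                              # records what reached the analyzer
cache = TargetCache(parsed.append)
for target in (0x401000, 0x401000, 0x402000, 0x401000):
    cache.find_target(target)
```

Four executions of the instrumented call produce two analysis updates, one per distinct target.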
Overwritten code discovery
[Diagram: pages that were RWX are left read/execute only (R E), so a write invokes the code write handler, followed by the CFG update routine.]
When to update
Challenges
  large incremental overwrites
  writes to data
  writes to own page
Approach
  Delay the update until write routine terminates
Update after overwrite
1. Handle overwrite signal
   a) instrument write loop exits
   b) copy overwritten page
   c) restore write permissions
   d) resume execution
2. Update CFG when writes end
   a) remove overwritten and unreachable blocks
   b) parse at entry points to overwritten regions
   c) remove write permissions
   d) resume execution
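The two steps above can be sketched as a small state machine over one simulated page. This is an illustration under stated assumptions, not Dyninst code: the `CodePage` class is hypothetical, the write-loop exit instrumentation is modeled by the caller invoking `on_write_loop_exit`, and the CFG update is reduced to reporting which offsets changed.

```python
class CodePage:
    def __init__(self, data):
        self.data = bytearray(data)
        self.writable = False     # code pages start without write permission
        self.saved = None         # copy taken when the first write arrives
        self.log = []

    def write(self, offset, value):
        if not self.writable:
            self.on_overwrite_signal()
        self.data[offset] = value

    def on_overwrite_signal(self):
        # Step 1: b) copy the page, c) restore write permission,
        # d) resume by returning to write().
        self.saved = bytes(self.data)
        self.writable = True
        self.log.append("signal")

    def on_write_loop_exit(self):
        # Step 2: report the overwritten offsets (input to the CFG update),
        # then c) remove write permission again.
        if self.saved is None:
            return []
        changed = [i for i, (old, new) in enumerate(zip(self.saved, self.data))
                   if old != new]
        self.writable = False
        self.saved = None
        self.log.append("update")
        return changed

page = CodePage(b"\x90" * 8)
for offset in range(2, 5):        # an incremental overwrite of three bytes
    page.write(offset, 0xCC)
changed = page.on_write_loop_exit()
```

Only the first write raises a signal; the remaining writes run at full speed, and the update happens once, after the loop exits.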
Behavior Changes
Program modification affects local behavior
These changes propagate
Malware detects changes (or crashes)
Sensitivity Resistant Approach
Identify instructions sensitive to modification
  Moved instructions that access the program counter
  Memory operations that may access patched code
  Memory operations that may scan the address space
Project effects on program behavior
  Are output (or control flow) affected?
  Use a forward slice and symbolic evaluation
Determine how to compensate for modification
  E.g. by emulating the original instruction
PC-sensitivity analysis
Original code:
  main:
    call foo
    ...
    call next
    <data>
  next:
    pop %esi
    add %eax, %esi
    mov (%esi), %ebx
    jmp %ebx
  foo:
    ...
    ret
Sensitive: call foo
  Slice: call foo; ret
  Symbolic expansion: pc = $retAddr + $delta
Sensitive: call next
  Slice: call next; pop %esi; add %eax, %esi; mov (%esi), %ebx; jmp %ebx
  Symbolic expansion: pc = [$next + %eax + $delta]
Relocated code:
  reloc_main:
    call foo
    ...
    push $next
    pop %esi
    add %eax, %esi
    mov (%esi), %ebx
    jmp %ebx
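To see why `call next` needs compensation, the slice can be evaluated with concrete numbers. This is a sketch with made-up values: `NEXT`, `DELTA`, and the one-entry memory map are hypothetical and chosen only to make the arithmetic visible.

```python
NEXT = 0x401010                  # original address of the label `next`
DELTA = 0x200000                 # distance the code was moved by relocation
MEMORY = {NEXT + 8: 0x401234}    # the table entry the original code reads

def pushed_by_call(next_addr):
    """A call pushes the address of the bytes that follow it."""
    return next_addr

def run_slice(pushed_value, eax):
    """pop %esi; add %eax, %esi; mov (%esi), %ebx. Returns %ebx, or None if unmapped."""
    esi = pushed_value           # pop %esi
    esi += eax                   # add %eax, %esi
    return MEMORY.get(esi)       # mov (%esi), %ebx

original = run_slice(pushed_by_call(NEXT), eax=8)
naive_reloc = run_slice(pushed_by_call(NEXT + DELTA), eax=8)  # moved call, no fix
compensated = run_slice(NEXT, eax=8)                          # push $next
```

The moved call shifts the computed address by `$delta` and misses the table; pushing the original `$next` restores the value the program expects.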
Sensitivity Classes
PC (program counter) sensitive
  Moved instruction that accesses the PC
CF (control flow) sensitive
  Instruction whose control flow successor was moved
CAD (code as data) sensitive
  Instruction that reads from overwritten memory
AVU (allocated vs. unallocated) sensitive
  Instruction that accesses newly allocated memory
Visible Compatibility
What behavior do we need to preserve?
Allow localized changes that aren’t visible from outside the program
Preserve: Output
Approximation: control flow
Handling CAD Sensitivity
Checksum routine:
    xor eax, eax
  .loop:
    add eax, ptr[ebx]
    add ebx, 4
    cmp ebx, 0x41000
    jne .loop
    cmp eax, .chksum
    jne .fail
Instrumented loop body (the memory read is replaced by a jump to an emulation sequence):
    jmp 863828
    add ebx, 4
    cmp ebx, 0x41000
    jne .loop
At 863828:
    save state
    emulate(add eax, ptr[ebx])
    restore state
[Diagram: the address space holds data, code with patches, instrumentation, and shadow memory.]
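The effect of emulating the read can be sketched with a checksum over three byte buffers. The assumption, stated loudly, is that shadow memory holds a copy of the code taken before patching; the buffer contents, the patch offset, and the helper names are hypothetical.

```python
def checksum(read_word, start, end):
    """The slide's loop: sum 4-byte words in [start, end), wrapping at 32 bits."""
    total = 0
    for addr in range(start, end, 4):
        total = (total + read_word(addr)) & 0xFFFFFFFF
    return total

def make_reader(mem):
    """Return a function that reads a little-endian 32-bit word from mem."""
    return lambda addr: int.from_bytes(mem[addr:addr + 4], "little")

original = bytes(range(64))                 # stand-in for the protected code
shadow = bytes(original)                    # assumed: copy taken before patching
patched = bytearray(original)
patched[16:21] = b"\xe9\x00\x00\x00\x00"    # a 5-byte jmp patch

expected = checksum(make_reader(original), 0, 64)
seen_directly = checksum(make_reader(patched), 0, 64)   # unemulated read
seen_emulated = checksum(make_reader(shadow), 0, 64)    # read redirected to shadow
```

A direct read of the patched code changes the sum, while the redirected read reproduces the expected value, so the `jne .fail` branch is not taken.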
Emulating Memory (Simplified)
Save state
    push %eax
    push %ecx
    push %edx
    lahf
    push %eax
Determine effective address
    lea <original>, %ebx
Translate effective address
    call translate
Restore state
    pop %eax
    sahf
    pop %edx
    pop %ecx
    pop %eax
Emulate original memory instruction
    mov (%ebx), %ebx
The Devil in the Details
IA-32 is a rich instruction set
  Most instructions can access memory
  And malware uses a wide variety of them
Instruction classes:
  Most common: MOD/RM byte
  Less common: “string” operations
  Least common: absolute address
String Operations
“String” instructions implicitly use ESI/EDI
  scas/lods/stos/movs/cmps/ins/outs
Some update ESI/EDI, making emulation tricky
Malware loves these for copying blocks of memory
Emulating movs:
    <save>
    mov %edi, %edx
    mov %esi, %ecx
    call TranslateShift
    add %edx, %edi
    add %ecx, %esi
    movs
    sub %edx, %edi
    sub %ecx, %esi
    <restore>
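The shift-then-unshift pattern around movs can be sketched as follows. The region bounds, the shadow base, and the behavior of `translate_shift` are assumptions for illustration; the slide names TranslateShift but does not define it. The sketch copies one 4-byte element with the direction flag clear.

```python
CODE_START, CODE_END = 0x1000, 0x2000    # patched region (hypothetical)
SHADOW_BASE = 0x9000                     # where the unpatched copy lives (hypothetical)

def translate_shift(addr):
    """Offset to add so an access to patched code hits the shadow copy."""
    if CODE_START <= addr < CODE_END:
        return SHADOW_BASE - CODE_START
    return 0

def emulated_movs(mem, esi, edi):
    """One 4-byte movs with shifted pointers; returns the updated (esi, edi)."""
    ecx, edx = translate_shift(esi), translate_shift(edi)
    esi, edi = esi + ecx, edi + edx          # add %ecx, %esi / add %edx, %edi
    mem[edi:edi + 4] = mem[esi:esi + 4]      # movs
    esi, edi = esi + 4, edi + 4              # movs advances both pointers
    return esi - ecx, edi - edx              # sub: back to the program's view

mem = bytearray(0xA000)
mem[0x1000:0x1004] = b"PTCH"                 # patched bytes in the code region
mem[0x9000:0x9004] = b"ORIG"                 # original bytes in the shadow copy
regs = emulated_movs(mem, esi=0x1000, edi=0x3000)
```

Because the shift is subtracted after the instruction runs, the program sees ESI and EDI advanced by exactly one element, as if it had read its own unpatched code.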
Address-space scanning
Scan routine:
    xor eax, eax
  .loop:
    mov ptr[eax], ebx
    add eax, 4
    cmp eax, 0
    jne .loop
    call chk_mem
Instrumented loop body (the memory access is replaced by a jump to an emulation sequence):
    jmp 863828
    add eax, 4
    cmp eax, 0
    jne .loop
At 863828:
    save state
    emulate(mov ptr[eax], ebx)
    restore state
[Diagram: the address space holds data, code with patches, and instrumentation; both segv_handler and dyn_segv_handler are present.]
Exception Handler Interposition
Emulation sequence in which the access faults:
    push %eax
    push %ecx
    push %edx
    lahf
    push %eax
    lea <original>, %eax
    call translate
    pop %eax
    sahf
    pop %edx
    pop %ecx
    pop %eax
    mov (%eax), %eax
Exception record from the Windows libraries, as received by dyn_segv_handler:
    Faulting insn: <reloc_addr>
    Faulting addr: 0
    Registers:
Exception record passed on to segv_handler:
    Faulting insn: <orig_addr>
    Faulting addr: <eff_addr>
    Registers:
The packers we’re studying
Packer             Malware market share [1]
UPX                9.45%
PolyEnE            6.21%
EXECryptor         4.06%
Themida            2.95%
PECompact          2.59%
Upack              2.08%
nPack              1.74%
Aspack             1.29%
FSG                1.26%
Nspack             0.89%
Asprotect          0.43%
Armadillo          0.37%
Yoda's Protector   0.33%
WinUPack           0.17%
MEW                0.13%
[Table: further columns mark each packer as self-modifying, anti-instrumentation, or obfuscated (yes), and mark results for Dyninst and SR-Dyninst (√ or x); the per-packer marks are not recoverable here. A side label reads "anti-debugging techniques".]
[1] Packer (r)evolution. Panda Research, 2008. Two-month average, Feb-March 2008.
Sample malware analysis factory
[Diagram: 200 malware binaries are fed to SD-Dyninst.]
Instrumentation:
  comprehensive instrumentation
  network call instrumentation
Outputs:
  Stack trace at 1st network communication
  Control flow graph showing executed blocks
  Defensive tactics report: unpacked code, overwritten code, control flow obfuscations
  Trace of Win API calls
Factory results for Conficker A
[Figure: control flow graph of Conficker A, showing the initial bootstrap code and the packed payload. Legend: API func, non-executed block, static block, unpacked block.]
Instrument network calls and perform a stack-walk
Stack-walk of Conficker’s communications thread
  Frame pc=0x100016f7 func: DYNstopThread at 0x100001670 [Dyninst]
  Frame pc=0x71ab2dc0 func: select at 0x71ab2dc0 [Win DLL]
  Frame pc=0x401f34 func: nosym1f058 at 0x41f058 [Conficker]
(We can also print stackwalks of Conficker’s other threads)
Improved Dyninst overhead
Reduced relocation overhead despite emulation
Better handling of program features
  Exceptions
  Indirect control flow
Conclusion
SR-Dyninst gives you
  All the benefits of Dyninst on malware
  Safer instrumentation on normal binaries
Ongoing work
  Anti-debugger techniques
  More descriptive CFGs
  Automated defensive-mode activation
  SR-Dyninst in next Dyninst release