eecs 583 class 21 research topic 3: dynamic taint...

28
EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysis University of Michigan December 5, 2012

Upload: others

Post on 19-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

EECS 583 – Class 21

Research Topic 3:

Dynamic Taint Analysis

University of Michigan

December 5, 2012

Page 2: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 1 -

Announcements

Exams returned

» Answer Key can be found on the course website

» One week to turn in written requests for exam regrades

Sign up for project presentation slots

» Friday on Scott’s office door, Monday in class

Today’s class reading » "Dynamic taint analysis for automatic detection, analysis, and signature

generation of exploits on commodity software,"

James Newsome and Dawn Song, Proceedings of the Network and

Distributed System Security Symposium, Feb. 2005.

» Optional background reading: "All You Ever Wanted to Know About

Dynamic Taint Analysis and Forward Symbolic Execution (but might

have been afraid to ask),"

Edward J. Schwartz, Thanassis Avgerinos, David Brumley, Proceedings

of the 2010 IEEE Symposium on Security and Privacy, May 2010.

Page 3: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 2 -

EECS 583 Exam Statistics

0

10

20

30

40

50

60

70

80

90

100

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61

Series1

Exam

Sco

re

Mean: 70.6 StDev: 12.8 High: 94 Low: 44

Page 4: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 3 -

The Problem

Unknown Vulnerability Detection **

» Buffer Overflows

» Format string vulnerabilties

» The goal of TaintCheck

Malware Analysis

» What is the software doing with sensitive data?

» Data propagation from triggers

» Ex. TaintDroid

More Generally - Tracking the flow of data through a

program

Page 5: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 4 -

Buffer Overflow

Example Stack Buffer Overflow

Page 6: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 5 -

Buffer Overflow

Page 7: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 6 -

Format String Vulnerability

Attacker controls a format string

Page 8: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 7 -

Format String Vulnerability

Attacker controls a format string

» Overwrite arbitrary memory

Take control of program execution

%n commonly used in conjunction with other format specifiers

Writes number of bytes written thus far to the stack

» Dump memory

ex. printf ("%08x.%08x.%08x.%08x.%08x\n");

» Crash the program

ex. printf ("%s%s%s%s%s%s%s%s%s%s%s%s");

Denial of Service

Page 9: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 8 -

Dynamic Taint Analysis

Track information flow through a program at runtime

Identify sources of taint – “TaintSeed”

» What are you tracking?

Untrusted input

Sensitive data

Taint Policy – “TaintTracker”

» Propagation of taint

Identify taint sinks – “TaintAssert”

» Taint checking

Special calls

Jump statements

Format strings

Outside network

Page 10: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 9 -

TaintCheck

Performed on x86 binary

» No need for source

Implemented using Valgrind skin

» X86 -> Valgrind’s Ucode

» Taint instrumentation added

» Ucode -> x86

Sources -> TaintSeed

Taint Policy -> TaintTracker

Sinks -> TaintAssert

Add on “Exploit Analyzer”

Page 11: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 10 -

Taint Analysis in Action

Page 12: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

12/6/2012

All You Ever Wanted to Know About

Dynamic Taint Analysis

11

x = get_input( )

y = x + 42

goto y Input is tainted

untainted tainted

x 7

Δ Var Val

T x

Tainted? Var

τ

Input t = IsUntrusted(src) get_input(src)↓ t

TaintSeed

Page 13: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

12/6/2012

All You Ever Wanted to Know About

Dynamic Taint Analysis

12

x = get_input( )

y = x + 42

goto y Data derived from

user input is tainted

untainted tainted

y 49

Δ Var Val

x 7

T y

Tainted?

T

Var

x

τ

BinOp t1 = τ[x1] , t2 = τ[x2]

x1 + x2 ↓ t1 v t2

TaintTracker

Page 14: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

Pgoto(ta) = ¬ ta

(Must be true to execute)

12/6/2012

All You Ever Wanted to Know About

Dynamic Taint Analysis

13

Policy Violation Detected

x = get_input( )

y = x + 42

goto y

untainted tainted Δ Var Val

x 7

y 49

Tainted?

T

T

Var

x

y

τ TaintAssert

Page 15: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

12/6/2012

All You Ever Wanted to Know About

Dynamic Taint Analysis

14

x = get_input( )

y = …

goto y

… strcpy(buffer,argv[1]) ; … return ;

Jumping to overwritten

return address

Page 16: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 15 -

TaintCheck - Exploit Analyzer

TaintPolicy - Keep track of Taint propagation info

» Backtrace chain of taint structures

» Allow generation of attack signatures

Transfer control to sandbox for analysis

Page 17: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 16 -

TaintCheck - Evaluation

False Negatives

» Use control flow to change value without gathering taint

Example: if (x == 0) y=0; else if (x == 1) y=1;

Equivalent to x=y;

» Tainted index into a hardcoded table

Policy – value translation is not tainted

» Enumerating all sources of taint

False Positives

» Vulnerable code?

» Sanity Checks not removing taint

Requires fine-tuning

Taint sanitization problem

» 0 false positives?

Thoughts?

Page 18: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 17 -

Policy Considerations?

Page 19: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

Memory Load

12/6/2012 Carnegie Mellon University 18

Variables Memory

Δ Var Val

x 7

Tainted?

T

Var

x

τ

μ Addr Val

7 42

Tainted?

F

Addr

7

τμ

Page 20: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

Problem: Memory Addresses

12/6/2012 Carnegie Mellon University 19

x = get_input( ) y = load( x ) … goto y

All values derived from user input

are tainted??

7 42 μ

Addr Val

Tainted?

F

Addr

7 τμ

x 7 Δ

Var Val

Page 21: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

μ Addr Val

x = get_input( ) y = load( x ) … goto y

Jump target could be any untainted

memory cell value

Policy 1:

12/6/2012

All You Ever Wanted to Know About

Dynamic Taint Analysis

20

Load v = Δ[x] , t = τμ[v]

load(x) ↓ t

Taint depends only on the memory cell

Taint Propagation

7 42

Tainted?

F

Addr

7 τμ

x 7 Δ

Var Val

Undertainting Failing to identify tainted

values - e.g., missing exploits

Page 22: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

jmp_table

Policy Violation?

12/6/2012

All You Ever Wanted to Know About

Dynamic Taint Analysis

21

x = get_input( ) y = load(jmp_table + x % 2 ) … goto y

Policy 2:

Memory

printa

printb

Address expression is tainted

Load v = Δ[x] , t = τμ[v], ta = τ[x]

load(x) ↓ t v ta

If either the address or the memory cell is tainted, then the value is tainted

Taint Propagation

Overtainting Unaffected values are tainted

- e.g., exploits on safe inputs

Page 23: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

General Challenge State-of-the-Art is not perfect for all

programs

12/6/2012

All You Ever Wanted to Know About

Dynamic Taint Analysis

22

Undertainting: Policy may miss taint

Overtainting: Policy may wrongly

detect taint

Page 24: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 23 -

TaintCheck - Attack Detection

Synthetic Exploits

» Buffer overflow -> function pointer

» Buffer overflow -> format string

» Format string -> info leak

» Success!

Actual Exploits

» 3 real world examples

» Random sample?

» Prevalence of protected exploits?

slightly more convincing

Page 25: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 24 -

Performance

Implementation decisions

» Use of Valgrind

» Better on IO bound tasks

» How much performance would you give up?

Page 26: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 25 -

TaintCheck Uses

Individual Sites

» Cost?

Honeypots

“TaintCheck plus OS Randomization”

» Rely on OS randomization to cause crashes

» Log and reproduce with tainting

Sampling

» Users rather than requests

Do I just need more infected computers?

» Distributed Sampling

Page 27: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 26 -

Signatures

Automatic Semantic Analysis

» Exploit Analyzer

Generated signature validation

Page 28: EECS 583 Class 21 Research Topic 3: Dynamic Taint Analysisweb.eecs.umich.edu/~mahlke/courses/583f12/lectures/583L21.pdf- 2 - EECS 583 Exam Statistics 0 10 20 30 40 50 60 70 80 90 100

- 27 -

Other considerations

Effectiveness in the wild

» Side-channel attacks

Can the attacker determine when someone is using taint tracking?

Speed?

Perform the attack only on native systems

Similar to existing virtual machine and honeypot problems

» Circumventing the taint system

Clean your data before exploit

Depends on policy

Example: Hex to ASCII translation

Cause false positives

Raises cost of running the system

Admins may turn it off

» Difficult to evaluate without motivated attackers