INSE 6150 Security Evaluation Methodologies
Vulnerability Analysis
Prof. Jeremy Clark
Presented by: Gaby Dagher
The lecture notes are based on materials by Dr. M. Debbabi

Agenda
- Introduction
- Flow Analysis
- MetaCompilation approach
- Metal Extensions
- Intraprocedural algorithm
- Conclusion

Motivations
[ Source: National Vulnerability Database (NIST) ]

Motivation
- Securing software is a challenging task
- Lines of code:
  - Windows Vista: 50 million
  - Mac OS X 10.4: 86 million
  - GNU/Linux: 283 million

Vulnerability Analysis
- Vulnerability analysis can be achieved through program analysis.
- Program analysis:
  - Determine program/expression/statement/data properties.
  - Extract information from programs.
- Two types of analyses:
  - Static analysis: analyze programs without executing them.
  - Dynamic analysis: analyze programs at runtime.

Dynamic Analysis
- Advantages:
  - Information is available at runtime.
  - Easier to implement.
- Disadvantages:
  - Valid only for one execution path.
  - Significant overhead during program execution.

Static Analysis
- Advantages:
  - No overhead at runtime.
  - A lot of research results (algorithms, methodologies, frameworks, tools, etc.).
  - Analyzes all the execution paths.
- Disadvantages:
  - Somewhat elaborate to design and implement.
  - Undecidability issues.

Static Analysis
- Static analysis initially emerged in the domain of compiler optimization.
[Figure: compiler pipeline]
    Inputs: Source program, Grammar
    Lexical Analysis → Tokens
    Syntactic Analysis (Parser) → Abstract Syntax Tree (AST)
    Semantic/Static Analysis → Static Information
    Optimization → Modified AST
    Code generation → Object/Executable Code
    Optimizations listed: Inline Expansion, Dead Code Elimination, Common Expression Elimination, Moving Loop Invariants, Constant Propagation, ..., Parallel Execution

Definitions
- Lexical analysis is the process of converting a sequence of characters into a sequence of tokens.
- Syntactic analysis involves parsing the token sequence to identify the syntactic structure of the program. This phase typically builds a parse tree, which replaces the linear sequence of tokens with a tree structure built according to the rules of a formal grammar that defines the language's syntax.
- Semantic analysis is the phase in which the compiler adds semantic information to the parse tree. This phase performs semantic checks such as type checking (checking for type errors) or assignment checking (requiring all local variables to be initialized before use), rejecting incorrect programs or issuing warnings.

Optimization – Example 1
- Common expression elimination.
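
The illustration for this slide is not part of the transcript. As a stand-in, here is a minimal sketch of the idea in C; the function names and the arithmetic are hypothetical and chosen only to show one repeated subexpression being computed once.

```c
#include <assert.h>

/* Before: the subexpression (a + b) is evaluated twice. */
int before_cse(int a, int b, int c, int d)
{
    int x = (a + b) * c;
    int y = (a + b) * d;
    return x - y;
}

/* After: (a + b) is computed once and reused. This is valid because
 * neither a nor b is modified between the two uses. */
int after_cse(int a, int b, int c, int d)
{
    int t = a + b;              /* the common subexpression */
    int x = t * c;
    int y = t * d;
    return x - y;
}

/* Small self-check: both versions must agree. */
void check_cse(void)
{
    assert(before_cse(1, 2, 3, 4) == after_cse(1, 2, 3, 4));
}
```

The transformation relies on exactly the "available expressions" information discussed later under data-flow analysis.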

Optimization – Example 2
- Moving loop invariants.
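
The slide's illustration is missing from the transcript. The following C sketch (hypothetical function names, not taken from the lecture) shows what moving a loop invariant could look like: the product x * y does not depend on the loop variable, so it can be computed once before the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Before: x * y is recomputed on every iteration although
 * neither x nor y changes inside the loop. */
void fill_before(int *out, size_t n, int x, int y)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i + x * y;
}

/* After: the invariant x * y is hoisted out of the loop. */
void fill_after(int *out, size_t n, int x, int y)
{
    int k = x * y;              /* loop invariant, computed once */
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i + k;
}

/* Small self-check: both versions fill the array identically.
 * Returns the last element so a caller can inspect it. */
int check_invariant_motion(void)
{
    int a[4], b[4];
    fill_before(a, 4, 2, 3);
    fill_after(b, 4, 2, 3);
    for (size_t i = 0; i < 4; i++)
        assert(a[i] == b[i]);
    return b[3];
}
```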

Optimization – Example 3
- Constant propagation.
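
The slide's illustration is missing from the transcript. A minimal C sketch of the idea, with hypothetical names: the value of n is known at compile time, so it can be propagated into the expression that uses it, which then folds to a constant.

```c
#include <assert.h>

/* Before: n and m hold values that are known at compile time. */
int before_cp(int x)
{
    int n = 4;
    int m = n * 2;
    return x + m;
}

/* After: n = 4 is propagated into m = n * 2, which folds to 8,
 * and the now unused variables disappear. */
int after_cp(int x)
{
    return x + 8;
}

/* Small self-check: both versions must agree. */
void check_cp(void)
{
    assert(before_cp(5) == after_cp(5));
}
```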

Flow Analysis
- The purpose of flow analysis is to determine which functions can be called, and which data can be reached, from various program points during execution of the program.
- Generally, flow analysis refers to two types:
  - Control-flow analysis
  - Data-flow analysis
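
Control-Flow Analysis
    x := a + b;
    y := a * b;
    while (y > a + b) {
        a := a + 1;
        x := a + b
    }
Control-flow graph (one node per statement, in program order; the last node has an edge back to the test, and the test has a second edge out of the loop):
    x := a + b
    y := a * b
    if y > a + b
    a := a + 1
    x := a + b
- Control-flow graphs are state-transition systems.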

Data-Flow Analysis
- For each program point p, which expressions must have already been computed, and not later modified, on all paths to p?
- Optimization: where available, expressions need not be recomputed.
[Figure: the same control-flow graph with one program point annotated "a+b is available here"]
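
Data-Flow Analysis - Example
Annotations on the edges of the control-flow graph:
    Edge                                        Annotation
    entry → x := a + b                          -
    x := a + b → y := a * b                     a+b
    y := a * b → if y > a + b                   a+b, a*b
    if y > a + b → a := a + 1 (true branch)     a+b, y > a+b
    a := a + 1 → x := a + b                     -
    x := a + b → if y > a + b (back edge)       a+b
    if y > a + b → exit (false branch)          a+b, y <= a+b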
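
The annotations in this example can be reproduced with a standard iterative available-expressions analysis. The sketch below is an illustration under stated assumptions, not the lecture's own algorithm: it hard-codes the five statements of the example as blocks 0 to 4, tracks only the two expressions a+b and a*b as bits, and the names gen_set, kill_set and pred are hypothetical.

```c
#include <assert.h>

enum { N_BLOCKS = 5 };
enum { E_A_PLUS_B = 1, E_A_TIMES_B = 2, E_ALL = 3 };

/* Blocks in program order:
 *   0: x := a + b     1: y := a * b     2: if y > a + b
 *   3: a := a + 1     4: x := a + b                        */
static const unsigned gen_set[N_BLOCKS]  = { E_A_PLUS_B, E_A_TIMES_B, E_A_PLUS_B, 0, E_A_PLUS_B };
static const unsigned kill_set[N_BLOCKS] = { 0, 0, 0, E_ALL, 0 };   /* a := a + 1 kills both */

/* pred[b][p] != 0 means block p is a predecessor of block b. */
static const int pred[N_BLOCKS][N_BLOCKS] = {
    { 0, 0, 0, 0, 0 },      /* block 0 is the entry            */
    { 1, 0, 0, 0, 0 },
    { 0, 1, 0, 0, 1 },      /* loop test: from 1 and back edge */
    { 0, 0, 1, 0, 0 },
    { 0, 0, 0, 1, 0 },
};

/* Computes the expressions available on entry to (in) and on exit
 * from (out) each block, iterating until nothing changes. */
void available_expressions(unsigned in[N_BLOCKS], unsigned out[N_BLOCKS])
{
    /* Optimistic start: assume everything is available everywhere. */
    for (int b = 0; b < N_BLOCKS; b++) {
        in[b] = E_ALL;
        out[b] = E_ALL;
    }
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = 0; b < N_BLOCKS; b++) {
            /* An expression is available only if it is available
             * on every incoming path; nothing is available at entry. */
            unsigned new_in = (b == 0) ? 0u : (unsigned)E_ALL;
            for (int p = 0; p < N_BLOCKS; p++)
                if (pred[b][p])
                    new_in &= out[p];
            unsigned new_out = gen_set[b] | (new_in & ~kill_set[b]);
            if (new_in != in[b] || new_out != out[b]) {
                in[b] = new_in;
                out[b] = new_out;
                changed = 1;
            }
        }
    }
    assert(in[0] == 0);     /* sanity: nothing is available at entry */
}
```

With this encoding, in[2] ends up containing only a+b: a*b reaches the loop test from block 1 but not along the back edge, so it is dropped by the intersection.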

MetaCompilation Approach
- The MetaCompilation (MC) approach takes advantage of the compilation process to check for violations of security rules in source code.
- Main objectives:
  - Find as many security bugs as possible
  - Define an approach that scales to large programs

MetaCompilation Approach
- The MetaCompilation approach:
  - Maps a security rule to code statements
  - Defines rules as high-level, system-specific checkers
  - Dynamically links the checkers to the compiler
  - The compiler performs the essence of the analysis and checks for security rule violations

MetaCompilation Components
- metal: high-level automata language to express security properties
- xgcc: interprocedural analysis engine that executes metal extensions
[Figure: source code from Linux fs/proc/generic.c and the free checker are fed to the xgcc compiler, which reports "using ent after free!"]
    ent->data = kmalloc(..)
    if(!ent->data)
        free(ent);
        goto out;
    …
    out:
        return ent;

Metal Extensions/Security Checkers
- Programmers use a high-level automata language called metal to express their application-specific security checkers, or extensions.
- A metal extension defines a collection of one or more state machines (SMs).
- There are two types of metal extensions:
  - Global extensions
  - Variable-specific extensions

Global Extensions
- A global extension tracks program-wide properties such as "interrupts are disabled".
- An interrupt is a signal that hardware or software can send when it wants the processor's attention in order to execute its critical section of code (a section that should not be stopped in the middle).
- If a program is executing a critical section of code, the programmer should turn off interrupts and disable acknowledgment of all incoming interrupts.
- To build a critical section at the kernel level, there are two simple instructions:
  - cli() is a mnemonic for CLear Interrupts: it turns off interrupts
  - sti() is a mnemonic for SeT Interrupts: it turns on interrupts
- When cli() is called, interrupts are disabled. This means that no other system tasks can be performed until interrupts are enabled again by executing sti().

Global Extension: Interrupt Checkers
    // Global interrupt checker that warns
    // when interrupts are not restored
    sm cli_sti {
      enabled:
          { cli(); } ==> disabled
        | { sti(); } ==> stop, { err("Double sti"); }
      disabled:
          { sti(); } ==> enabled
        | { cli(); } ==> stop, { err("Double cli"); }
        | $end_of_path$ ==> stop, { err("Did not reverse"); }
      stop:
    }
[Figure: state diagram. enabled (initial global state) goes to disabled on cli() and to stop on sti(); disabled goes to enabled on sti() and to stop on cli() or end_of_path]
- A global extension tracks transitions in a single SM that defines a list of global states.
- The first state in the list is implicitly defined as the initial global state of the SM.
- Each transition is defined with a pattern identifying a source statement that, when encountered in the source code, causes the transition to execute.
- { cli(); } and { sti(); } are patterns; $end_of_path$ is a special pattern that evaluates to true when the program terminates.
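
To make the behaviour of the cli_sti state machine concrete, here is a minimal C simulation of it over a single execution path. This is only an illustrative sketch: metal checkers are executed by xgcc over source code, whereas this code takes an already flattened list of events, and the type and function names (event, verdict, check_cli_sti) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EV_CLI, EV_STI } event;
typedef enum { SM_ENABLED, SM_DISABLED } sm_state;
typedef enum { V_OK, V_DOUBLE_STI, V_DOUBLE_CLI, V_DID_NOT_REVERSE } verdict;

/* Runs the cli_sti state machine over one execution path.
 * Reaching the stop state is modelled by returning a verdict. */
verdict check_cli_sti(const event *path, size_t n)
{
    assert(path != NULL || n == 0);
    sm_state s = SM_ENABLED;                 /* initial global state */
    for (size_t i = 0; i < n; i++) {
        if (s == SM_ENABLED) {
            if (path[i] == EV_CLI)
                s = SM_DISABLED;             /* { cli(); } ==> disabled */
            else
                return V_DOUBLE_STI;         /* { sti(); } ==> stop     */
        } else {
            if (path[i] == EV_STI)
                s = SM_ENABLED;              /* { sti(); } ==> enabled  */
            else
                return V_DOUBLE_CLI;         /* { cli(); } ==> stop     */
        }
    }
    /* $end_of_path$ only has a transition in the disabled state. */
    return (s == SM_DISABLED) ? V_DID_NOT_REVERSE : V_OK;
}

/* Maps a verdict to the message used by the metal checker. */
const char *verdict_message(verdict v)
{
    switch (v) {
    case V_DOUBLE_STI:      return "Double sti";
    case V_DOUBLE_CLI:      return "Double cli";
    case V_DID_NOT_REVERSE: return "Did not reverse";
    default:                return "";
    }
}
```

A path consisting of cli() alone ends in the disabled state and yields "Did not reverse", while cli() followed by sti() is accepted.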

Global Extension: Interrupt Checkers (example)
The cli_sti checker is applied to the following code:
    int fun_caller() {
        cli();          // goes to disabled state
        fun();
    }
    int fun(void) {
        if (random())
            sti();
    }
    void err(char* error) {
        print(error);
    }

First Execution (random() evaluates to false)
- Initial global state: enabled
- cli() in fun_caller: enabled ==> disabled
- The condition in fun is false, so sti() is not called
- $end_of_path$ is reached in the disabled state: disabled ==> stop, and err("Did not reverse") is executed

Second Execution (random() evaluates to true)
- Initial global state: enabled
- cli() in fun_caller: enabled ==> disabled
- The condition in fun is true and sti() is called: disabled ==> enabled
- The path ends in the enabled state, so no error is reported

Variable-Specific Extensions
- A variable-specific extension captures properties associated with specific program objects (any expression that has an associated state):
  - Structure fields, arithmetic expressions, pointers, etc.
- Examples of variable-specific properties:
  - A NULL pointer p should not be dereferenced.
  - A freed pointer p should not be used.
- A variable-specific extension comprises a series of state machines, each of which tracks the state attached to a single object.

Variable-Specific Extension : Free Checker
    sm free_checker {
      state decl any_pointer v;
      start:
          { kfree(v) } ==> v.freed
        ;
      v.freed:
          { *v } ==> v.stop, { err("using %s after free", mc_identifier(v)); }
        | { kfree(v) } ==> v.stop, { err("double free of %s", mc_identifier(v)); }
        ;
    }
- The keyword state defines a single typed identifier used to refer to the program object that a single SM is tracking.
- The keyword decl defines a metal hole variable that will match a source construct of the appropriate type. The hole variable v has the meta type any_pointer, which matches pointers to any type.
- The state of an SM consists of the value of the global instance and the value of one of the variable-specific instances.
- The state value start is bound to the implicitly defined global state.
- The notation v.freed means that the state value freed is bound to v.

Variable-Specific Extension : Free Checker
- A variable-specific extension has two types of transitions:
  - Creation transition: tells the analysis when to begin tracking a new object and is guarded by the identifier start. When the encountered code statement matches the pattern kfree(v), a new state machine is created to track the new instance of object v.
  - State transition: describes the SM that each program object must follow.

Variable-Specific Extension : Free Checker
The free_checker extension applied to the following function:
    1: int contrived_caller (int *w, int x, int *p) {
    2:     kfree (p);
    3:     kfree (w);
    4:     contrived (p, w, x);
    5:     return *w;
    6: }
- Extension initial state: {(start, <>)}
  - The special value <> reflects the fact that the extension does not know about any freed variables.
- Extension state after line 2: {(start, <>), (start, v : p → freed)}
  - A new SM is created to track the pointer p.
- Extension state after line 3: {(start, <>), (start, v : p → freed), (start, v : w → freed)}
  - Another SM is created to track the pointer w.
- Thus, a variable-specific extension state is a collection of one or more SM states.
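
The way one SM instance is created per tracked pointer can be sketched in C as follows. This is a simplified illustration, not how xgcc is implemented: it handles only a straight-line list of kfree and dereference operations, identifies each pointer by a one-letter name, ignores function calls and aliasing, and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_KFREE, OP_DEREF } op_kind;
typedef struct { op_kind kind; char var; } stmt;     /* var: 'a'..'z' */
typedef enum { VS_UNKNOWN, VS_FREED, VS_STOP } var_state;
typedef enum { R_NONE, R_USE_AFTER_FREE, R_DOUBLE_FREE } report;

/* Applies the free checker to a straight-line statement sequence.
 * state[] holds one SM instance per pointer name; VS_UNKNOWN means
 * that no SM has been created for that pointer yet.
 * Returns the first violation and stores the offending name in *who. */
report run_free_checker(const stmt *code, size_t n,
                        var_state state[26], char *who)
{
    for (size_t i = 0; i < 26; i++)
        state[i] = VS_UNKNOWN;
    for (size_t i = 0; i < n; i++) {
        assert(code[i].var >= 'a' && code[i].var <= 'z');
        var_state *s = &state[code[i].var - 'a'];
        if (*s == VS_UNKNOWN) {
            /* Creation transition: start: { kfree(v) } ==> v.freed */
            if (code[i].kind == OP_KFREE)
                *s = VS_FREED;
        } else if (*s == VS_FREED) {
            /* Both transitions out of v.freed lead to v.stop. */
            *s = VS_STOP;
            *who = code[i].var;
            return code[i].kind == OP_DEREF ? R_USE_AFTER_FREE
                                            : R_DOUBLE_FREE;
        }
    }
    return R_NONE;
}
```

Feeding it kfree(p), kfree(w) leaves both p and w in the freed state, matching the extension state after line 3; adding the dereference of w from line 5 reports a use after free.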

Metal Patterns
- Metal patterns are used to identify source code actions relevant to a given security rule, such as dereferencing pointers { *v } or freeing pointers { kfree(v) }.
- Patterns are written in an extended version of the source language (C) and can specify almost arbitrary language constructs such as declarations, expressions and statements.
- Metal patterns define the SM alphabets.

Metal Patterns
- A metal hole variable declared with the keyword decl will match a source construct of the appropriate type.
    Hole Type        Matches
    any_expr         any legal expression
    any_scalar       any scalar value (int, float, etc.)
    any_pointer      any pointer of any type
    any_arguments    any argument list
    any_fn_call      any function call

Basic Blocks
- A basic block is a sequence of consecutive intermediate language statements in which flow of control can only enter at the beginning and leave at the end.
- Only the last statement of a basic block can be a branch statement and only the first statement of a basic block can be a target of a branch.

Basic Block Partitioning Algorithm
1. Identify leader statements (i.e. the first statements of basic blocks) by using the following rules:
   (i) The first statement in the program is a leader
   (ii) Any statement that is the target of a branch statement is a leader
   (iii) Any statement that immediately follows a branch or return statement is a leader

Example: Finding Leaders
The following code computes the inner product of two vectors.
Source code:
    begin
        prod := 0;
        i := 1;
        do begin
            prod := prod + a[i] * b[i];
            i := i + 1;
        end
        while i <= 20
    end
Three-address code (leaders are marked with the rule that selects them):
    (1)  prod := 0                  leader, Rule (i)
    (2)  i := 1
    (3)  t1 := 4 * i                leader, Rule (ii)
    (4)  t2 := a[t1]
    (5)  t3 := 4 * i
    (6)  t4 := b[t3]
    (7)  t5 := t2 * t4
    (8)  t6 := prod + t5
    (9)  prod := t6
    (10) t7 := i + 1
    (11) i := t7
    (12) if i <= 20 goto (3)
    (13) …                          leader, Rule (iii)

Forming the Basic Blocks
Now that we know the leaders, how do we form the basic blocks associated with each leader?
2. The basic block corresponding to a leader consists of the leader, plus all statements up to but not including the next leader or up to the end of the program.
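
Steps 1 and 2 of the partitioning algorithm can be sketched in C as follows. The representation is an assumption made for illustration: each three-address statement is reduced to its control-flow effect (a branch target or a return flag), statements are numbered from 1 as on the slides, and the names tac_stmt, find_leaders and block_end are hypothetical.

```c
#include <assert.h>

/* One three-address statement; only its control flow matters here.
 * target is the 1-based number of the branch target, or 0 if the
 * statement is not a branch. */
typedef struct { int target; int is_return; } tac_stmt;

/* Step 1: sets leader[i] = 1 for every leader among statements 1..n
 * (index 0 of both arrays is unused). Returns the number of blocks. */
int find_leaders(const tac_stmt *code, int n, int *leader)
{
    assert(n >= 1);
    for (int i = 1; i <= n; i++)
        leader[i] = 0;
    leader[1] = 1;                                   /* rule (i)   */
    for (int i = 1; i <= n; i++) {
        if (code[i].target != 0) {
            assert(code[i].target >= 1 && code[i].target <= n);
            leader[code[i].target] = 1;              /* rule (ii)  */
        }
        if ((code[i].target != 0 || code[i].is_return) && i + 1 <= n)
            leader[i + 1] = 1;                       /* rule (iii) */
    }
    int blocks = 0;
    for (int i = 1; i <= n; i++)
        blocks += leader[i];
    return blocks;
}

/* Step 2: the block led by statement start extends up to, but not
 * including, the next leader. Returns its last statement number. */
int block_end(const int *leader, int n, int start)
{
    assert(start >= 1 && start <= n && leader[start]);
    int end = start;
    while (end + 1 <= n && !leader[end + 1])
        end++;
    return end;
}
```

For the inner-product example, where only statement (12) branches (to statement (3)), this marks statements (1), (3) and (13) as leaders.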

Example: Forming the Basic Blocks
Basic blocks:
    B1: (1) prod := 0
        (2) i := 1
    B2: (3) t1 := 4 * i
        (4) t2 := a[t1]
        (5) t3 := 4 * i
        (6) t4 := b[t3]
        (7) t5 := t2 * t4
        (8) t6 := prod + t5
        (9) prod := t6
        (10) t7 := i + 1
        (11) i := t7
        (12) if i <= 20 goto (3)
    B3: (13) …

Control Flow Graph (CFG)
- A control flow graph (CFG), or simply a flow graph, is a directed multigraph in which: (i) the nodes are basic blocks; and (ii) the edges represent flow of control (branches or fall-through execution).
- In a CFG we have no information about the data. Therefore an edge in the CFG means that the program may take that path.
- The basic block whose leader is the first intermediate language statement is called the start node.

Control Flow Graph (CFG)
There is a directed edge from basic block B1 to basic block B2 in the CFG if:
(1) There is a branch from the last statement of B1 to the first statement of B2, OR
(2) Control flow can fall through from B1 to B2 because:
    (i) B2 immediately follows B1, and
    (ii) B1 does not end with an unconditional branch.
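
The two edge rules translate almost directly into code. The following C sketch assumes the basic blocks have already been formed and numbered from 0 in program order; the block_info structure, the MAX_BLOCKS bound and the function name build_cfg are hypothetical, and a return is treated like an unconditional branch for the purposes of fall-through.

```c
#include <assert.h>

enum { MAX_BLOCKS = 16 };

typedef struct {
    int branch_to;      /* block targeted by the last statement, or -1 */
    int unconditional;  /* nonzero if the block ends with an
                           unconditional branch (or a return)          */
} block_info;

/* Sets edge[i][j] = 1 when there is a CFG edge from block i to block j. */
void build_cfg(const block_info *b, int n, int edge[MAX_BLOCKS][MAX_BLOCKS])
{
    assert(n >= 0 && n <= MAX_BLOCKS);
    for (int i = 0; i < MAX_BLOCKS; i++)
        for (int j = 0; j < MAX_BLOCKS; j++)
            edge[i][j] = 0;
    for (int i = 0; i < n; i++) {
        if (b[i].branch_to >= 0) {
            assert(b[i].branch_to < n);
            edge[i][b[i].branch_to] = 1;    /* rule (1): branch       */
        }
        if (i + 1 < n && !b[i].unconditional)
            edge[i][i + 1] = 1;             /* rule (2): fall-through */
    }
}
```

For the three blocks of the inner-product example (index 0 for B1, 1 for B2, 2 for B3), where only B2 ends with a conditional branch back to itself, this yields the edges B1 → B2, B2 → B2 and B2 → B3.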

Example: Control Flow Graph Formation
Basic blocks:
    B1: statements (1) to (2)
    B2: statements (3) to (12), ending with "if i <= 20 goto (3)"
    B3: statement (13) …
Edges:
    B1 → B2    Rule (2): control falls through from (2) to (3)
    B2 → B2    Rule (1): branch from (12) to (3)
    B2 → B3    Rule (2): control falls through from (12) to (13)

Control Flow Graph (CFG)
- Control flow graph: a directed graph G = (V, E) in which each vertex in V is a basic block and there is an edge v1 (BB1) → v2 (BB2) in E if BB2 can immediately follow BB1 in some execution sequence
- Basic block: a sequence of consecutive operations in which flow of control enters at the beginning and leaves at the end, without halting or branching except at the end
  - A BB has an edge to every block it can branch to
- Standard representation used by many compilers
- Often has 2 pseudo vertices:
  - entry node
  - exit node
[Figure: example CFG with vertices Entry, BB1 to BB7, and Exit]

Control Flow Graph (CFG)
    int contrived(int *p, int *w, int x) {
        int *q;
        if (x) {
            kfree(w);
            q = p;
        }
        if (!x)
            return *w;
        return *q;
    }
    int contrived_caller(int *w, int x, int *p) {
        kfree(p);
        contrived(p, w, x);
        return *w;
    }
Basic blocks:
    B1   entry to contrived_caller
    B2   kfree(p);
    B3   contrived(p,w,x);
    B3'  return *w;
    B4   exit from contrived_caller
    B5   entry to contrived
    B6   int *q; if(x)
    B7   kfree(w); q=p;
    B8   if(!x)
    B9   return *w;
    B10  return *q;
    B11  exit from contrived

Intraprocedural Analysis - DFS
- The Depth-First Search (DFS) algorithm is used by xgcc to traverse the CFG starting at the entry block.
- A single control path is followed until the end of the function.
- The traversal then backtracks to the last branch point.
- The extension state is recorded at each block.
- If the block is traversed again, the traversal is aborted and backtracks to the last branch point.
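
A rough C sketch of such a traversal is given below. It is a simplification under several assumptions, not xgcc's actual implementation: the extension state is reduced to a small integer bitmask, each block's effect is modelled by OR-ing in a fixed set of bits, and a path is aborted only when a block is reached again with a state that was already recorded for it. All names (cfg, set_bits, seen, dfs) are hypothetical.

```c
#include <assert.h>

enum { MAX_BLOCKS = 8, MAX_STATES = 16 };

typedef struct {
    int n;                              /* number of blocks; block 0 is the entry     */
    int succ[MAX_BLOCKS][2];            /* up to two successors, -1 if absent         */
    unsigned set_bits[MAX_BLOCKS];      /* assumed block effect: state |= set_bits[b] */
    int seen[MAX_BLOCKS][MAX_STATES];   /* seen[b][s]: b already analysed in state s  */
    int visits;                         /* number of block analyses performed         */
} cfg;

/* Depth-first traversal: follow one path to its end, then backtrack
 * to the last branch point and try the next successor. */
void dfs(cfg *g, int b, unsigned state)
{
    assert(b >= 0 && b < g->n && state < MAX_STATES);
    if (g->seen[b][state])
        return;                         /* same block, same state: abort this path */
    g->seen[b][state] = 1;              /* record the incoming extension state     */
    g->visits++;
    unsigned out = state | g->set_bits[b];
    for (int k = 0; k < 2; k++)
        if (g->succ[b][k] >= 0)
            dfs(g, g->succ[b][k], out);
}
```

On a diamond-shaped CFG whose blocks do not change the state, the join block is analysed once; if one arm changes the state, the join block is analysed once per distinct incoming state.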
[Figure: example CFG with blocks B1 to B7, traversed depth-first]

Block Summary
- The extensions are applied to each basic block to check for security rule violations.
- Each basic block has a block summary that records:
  - All extension states that reach that block
  - All SM transitions executed during the block analysis
- Transition edges: (s, v : t → vs) ==> (s', v : t → v's)
  Example transition edge of the Free Checker: (start, v: p → freed) ==> (start, v: p → stop)
- Add edges: (s, v : t → unknown) ==> (s', v : t → v's)
  Example add edge of the Free Checker: (start, v: p → unknown) ==> (start, v: p → freed)
- Block summaries take advantage of the determinism property of metal extensions:
  - Applying a metal extension at the same program point with the same state always produces the same result.

Source code example
    int contrived(int *p, int *w, int x) {
        int *q;
        if (x) {
            kfree(w);
            q = p;
        }
        if (!x)
            return *w;      // safe
        return *q;          // using 'q' after free!
    }
    int contrived_caller(int *w, int x, int *p) {
        kfree(p);
        contrived(p, w, x);
        return *w;          // using 'w' after free!
    }
Block summaries, in traversal order:
    B1 (entry to contrived_caller):
        (start, <>) → (start, <>)
    B2 (kfree(p);):
        (start, v: p → unknown) → (start, v: p → freed)
    B3 (contrived(p,w,x);):
        (start, v: p → freed) → (start, v: p → freed)
    B5 (entry to contrived):
        (start, v: p → freed) → (start, v: p → freed)
    B6 (int *q; if(x)):
        (start, v: p → freed) → (start, v: p → freed)
    B7 (kfree(w); q=p;):
        (start, v: w → unknown) → (start, v: w → freed)
        (start, v: q → unknown) → (start, v: q → freed)
        (start, v: p → freed) → (start, v: p → freed)
    B8 (if(!x)):
        (start, v: w → freed) → (start, v: w → freed)
        (start, v: q → freed) → (start, v: q → freed)
        (start, v: p → freed) → (start, v: p → freed)
    B9 (return *w;):
        (start, v: p → freed) → (start, v: p → freed)
    B10 (return *q;):
        (start, v: w → freed) → (start, v: w → freed)
        (start, v: q → freed) → (start, v: q → stop)
        (start, v: p → freed) → (start, v: p → freed)
    B11 (exit from contrived):
        (start, v: w → freed) → (start, v: w → freed)
        (start, v: p → freed) → (start, v: p → freed)
    B3' (return *w;):
        (start, v: p → freed) → (start, v: p → freed)
        (start, v: w → freed) → (start, v: w → stop)
    B4 (exit from contrived_caller):
        (start, v: p → freed) → (start, v: p → freed)

Unsound approach
- Metal extensions and the xgcc interprocedural analysis are unsound
  - Some security violations can remain undetected
- MC focuses on executing metal extensions effectively to find as many security bugs as possible.
  - Metal can easily express violations of known correctness rules
  - Metal automatically infers such rules from source code

Conclusion
- MC has been used to detect over 100 security holes in Linux and BSD
- MC is now a commercial tool
  - www.coverity.com