
Post on 18-Jul-2020


Page 1: Testing Part 2 - William & Marykemper/cs435/slides/l15.pdfStructural Coverage Testing • Idea – Code that has never been executed likely has bugs • At least the test suite is

Testing Part 2

Three Important Testing Questions

• How shall we generate/select test cases?

• Did this test execution succeed or fail?

• How do we know when we’ve tested enough?


1. How do we know when we’ve tested enough?

Test Coverage Measures


Structural Coverage Testing

• Idea
  – Code that has never been executed likely has bugs
    • At least the test suite is clearly not complete
• This leads to the notion of code coverage
  – Divide a program into elements (e.g., statements)
  – Define the coverage of a test suite to be

      coverage = (# of elements executed by suite) / (# of elements in program)
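The coverage ratio defined above can be sketched in C. This is a minimal illustration, not part of the original slides: the names `coverage` and `hit` are hypothetical, and the sketch assumes some instrumentation has already recorded which elements were executed.

```c
#include <stddef.h>

/* Fraction of program elements executed at least once by the suite.
   hit[i] is nonzero if element i (e.g., a statement) was executed.
   The array is assumed to be filled in by some instrumentation. */
double coverage(const int *hit, size_t n_elements)
{
    size_t executed = 0;
    for (size_t i = 0; i < n_elements; i++) {
        if (hit[i]) {
            executed++;
        }
    }
    /* Convention assumed here: an empty program counts as fully covered. */
    return n_elements == 0 ? 1.0 : (double)executed / (double)n_elements;
}
```

For example, a suite that executes 2 of 4 statements yields a coverage of 0.5.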

Types of Program Elements

• Selection of test suite is based on some elements in the code
• Assumption: Executing the faulty element is a necessary condition for revealing a fault
• Several types of elements
  – Control flow: statement, branch, path
  – Condition: simple, multiple
  – Loop
  – Dataflow
  – Fault based (mutation)
  – …

Control Flow Graphs: The One Slide Tutorial

• A graph
• Nodes are basic blocks
  • statements
• Edges are transfers of control between basic blocks

    X := 3;
    if (B > 0)
        Y := 0;
    else
        Y := Z + W;
    A := 2 * 3;

[Figure: control flow graph of the program above, with nodes "X := 3", "B > 0", "Y := 0", "Y := Z + W", and "A := 2 * 3"]

Control Flow Graph

    void testme1(int x) {
        int j = 0;
        for (j = 0; j < 2; j++) {
            if (x == j) {
                printf("Good\n");
            }
        }
        x = j;
    }

• A graph
• Nodes are basic blocks
  • statements
• Edges are transfers of control between basic blocks

Control Flow Graph

    void testme1(int x) {
        int j = 0;
        for (j = 0; j < 2; j++) {
            if (x == j) {
                printf("Good\n");
            }
        }
        x = j;
    }

[Figure: control flow graph of testme1, with nodes "j=0", "j < 2", "x==j", "printf", "j++", and "x = j"]

Statement Coverage

• Test requirements: Statements in a program

    statement coverage = (# of executed statements) / (# of statements in the program)

Statement Coverage in Practice

• The good old days: (Stucki 1973)
  – Only about 1/3 of NASA statements were executed under test before software was released
• Better results:
  – Microsoft reports 80-90% statement coverage
  – Boeing must get 100% statement coverage (feasible?) for all software
• 100% can be hard or impossible (why?)

Statement Coverage: Example

• Test requirements
  – Nodes 3, …, 9
• Test cases
  – (x = 20, y = 30)

Any problems with this example?

Statement Coverage: Example

• Test requirements
  – Nodes 3, …, 9
• Test cases
  – (x = 20, y = 30)
• Such a test does not reveal the fault at statement 7
• To reveal it, we need to traverse edge 4-7

=> Branch Coverage

Branch Coverage

• Test requirements: Branches in a program

    branch coverage = (# of executed branches) / (# of branches in the program)

Branch Coverage: Example

• Test requirements
  – Edges 4-6 and 4-7
• Test cases
  – (x = 20, y = 30)
  – (x = 0, y = 30)

Branch Coverage: Example

     1. main() {
     2.   int x, y, z, w;
     3.   read(x);
     4.   read(y);
     5.   if (x != 0)
     6.     z = x + 10;
     7.   else
     8.     z = 0;
     9.   if (y > 0)
    10.     w = y / z;
    11.
    12. }

• Branch Coverage
• Test Cases
  – (x = 1, y = 22)
  – (x = 0, y = -10)
• Is the test suite adequate for branch coverage?

Branch Coverage: Example

     1. main() {
     2.   int x, y, z, w;
     3.   read(x);
     4.   read(y);
     5.   if (x != 0)
     6.     z = x + 10;
     7.   else
     8.     z = 0;
     9.   if (y > 0)
    10.     w = y / z;
    11.
    12. }

• Branch Coverage
• Test Cases
  – (x = 1, y = 22)
  – (x = 0, y = -10)
• Is the test suite adequate for branch coverage?
• Yes, but it does not reveal the fault at statement 10
• Test case (x = 0, y = 22)
  – Reveals fault
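The branch coverage example above can be replayed with a small instrumented version of the program. This is a sketch under assumptions: the inputs are passed as parameters instead of `read()`, the counters and the names `run_program`, `branch_hits`, and `branches_covered` are hypothetical, and the division by zero is reported through the return value instead of crashing.

```c
#include <stdbool.h>

/* Branch counters: [0] x != 0 taken, [1] x != 0 not taken,
                    [2] y > 0 taken,  [3] y > 0 not taken. */
static int branch_hits[4];

/* The example program with inputs as parameters.
   Returns false where statement 10 would divide by zero. */
bool run_program(int x, int y, int *w)
{
    int z;
    if (x != 0) { branch_hits[0]++; z = x + 10; }
    else        { branch_hits[1]++; z = 0; }

    if (y > 0) {
        branch_hits[2]++;
        if (z == 0) {
            return false;          /* fault: w = y / z with z == 0 */
        }
        *w = y / z;
    } else {
        branch_hits[3]++;
    }
    return true;
}

/* Number of distinct branches executed so far (out of 4). */
int branches_covered(void)
{
    int n = 0;
    for (int i = 0; i < 4; i++) {
        if (branch_hits[i] > 0) {
            n++;
        }
    }
    return n;
}
```

Running the two test cases (1, 22) and (0, -10) covers all four branches without failing; only a case such as (0, 22), which combines the else branch of the first test with the true branch of the second, makes `run_program` report the fault.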

Data Flow Coverage

• Test requirements: Def-use pairs in a program

    data flow coverage = (# of executed def-use pairs) / (# of def-use pairs in the program)

Data Flow Coverage: Example

     1. main() {
     2.   int x, y, z, w;
     3.   read(x);
     4.   read(y);
     5.   if (x != 0)
     6.     z = x + 10;
     7.   else
     8.     z = 0;
     9.   if (y > 0)
    10.     w = y / z;
    11.
    12. }

• Test Requirements:
  – DU pairs 6-10, 8-10
• Test Cases
  – (x = 1, y = 22)
  – (x = 0, y = 10)
• Reveals fault

Data Flow Coverage: Example

     1. main() {
     2.   int x, y, z, w;
     3.   read(x);
     4.   read(y);
     5.   if (x != 0)
     6.     z = x + 10;
     7.   else
     8.     z = 1;
     9.   if (y > 0)
    10.     w = y / z;
    11.   else
    12.     assert(0);
    13. }

• Test Requirements:
  – DU pairs 6-10, 8-10
• Test Cases
  – (x = 1, y = 22)
  – (x = 0, y = 10)
• Is the test suite adequate for data flow coverage?

Data Flow Coverage: Example

     1. main() {
     2.   int x, y, z, w;
     3.   read(x);
     4.   read(y);
     5.   if (x != 0)
     6.     z = x + 10;
     7.   else
     8.     z = 1;
     9.   if (y > 0)
    10.     w = y / z;
    11.   else
    12.     assert(0);
    13. }

• Test Requirements:
  – DU pairs 6-10, 8-10
• Test Cases
  – (x = 1, y = 22)
  – (x = 0, y = 10)
• Is the test suite adequate for data flow coverage?
• Yes, but it does not reveal the fault at statement 12
• Test case (x = 1, y = -1)
  – Reveals fault

All Path Coverage: Example

    main() {
        int x, y, z, w;
        read(x);
        read(y);
        if (x != 0)
            z = x + 10;
        else
            z = 1;
        if (y > 0)
            w = y / z;
        else
            assert(0);
    }

• Test Requirements:
  – 4 paths
• Test Cases
  – (x = 1, y = 22)
  – (x = 0, y = 10)
  – (x = 1, y = -22)
  – (x = 0, y = -10)

All Path Coverage: Example

    main() {
        int x, y, z, w;
        read(x);
        read(y);
        if (x != 0)
            z = x + 10;
        else
            z = 1;
        if (y > 0)
            w = y / z;
        else
            w = 0;
    }

• Test Requirements:
  – 4 paths
• Test Cases
  – (x = 1, y = 22)
  – (x = 0, y = 10)
  – (x = 1, y = -22)
  – (x = 0, y = -10)
• Faulty if x = -10 (z becomes 0, so w = y / z divides by zero when y > 0)
  – Structural coverage may not reveal this error

Pitfalls of Coverage

    status = perform_operation();
    if (status == FATAL_ERROR)
        exit(3);

• Coverage says the exit is never taken
• Straightforward to fix
  – Add a case with a fatal error

Example

    status = perform_operation();
    if (status == FATAL_ERROR)
        exit(3);

• Coverage says the exit is never taken
• Straightforward to fix
  – Add a case with a fatal error
• But are there other error conditions that are not checked in the code?
  – Coverage does not help with faults of omission

Code Coverage (Cont.)

• Code coverage has proven value
  – It’s a real metric, though far from perfect
• But 100% coverage does not mean no bugs
  – Many bugs lurk in corner cases
  – E.g., a bug visible after loop executes 1,025 times
• And 100% coverage is almost never achieved
  – Products ship with < 60% coverage
  – High coverage may not even be economically desirable
    • May be better to devote more time to tricky parts with good coverage

Coverage: the Bad assumptions

Logical Fallacies

• TRUE: If low coverage then poor tests;
  Not low coverage, therefore not poor tests
• TRUE: If good tests then high coverage;
  High coverage, therefore good tests

Coverage: the Bad assumptions

Logical Fallacies

• TRUE: If low coverage then poor tests;
  FALSE: Not low coverage, therefore not poor tests
• TRUE: If good tests then high coverage;
  FALSE: High coverage, therefore good tests

Using Code Coverage

• Code coverage helps identify weak test suites
• Tricky bits with low coverage are a danger sign
• Areas with low coverage suggest something is missing in the test suite

The Lesson

• Code coverage can’t complain about missing code
  – The case not handled

Structural Coverage in Practice

• Statement and sometimes edge or condition coverage is used in practice
  – Simple lower bounds on adequate testing
• Additional control flow heuristics sometimes used
  – Loops (never, once, many), combinations of conditions
• See “How to Misuse Code Coverage” in course schedule

Standard Testing Questions

• How shall we generate/select test cases?
• Did this test execution succeed or fail?
• How do we know when we’ve tested enough?

2. Was this test execution correct?


What is an Oracle?

• Oracle = alternative realization of the specification
• Examples of oracles
  – The “eyeball oracle”
    • Expensive, not dependable, lack of automation
  – A prototype, or sub-optimal implementation
    • E.g., bubble-sort as oracle for quick sort
  – A manual list of expected results

[Figure: the same input is fed to the Program and to the Oracle; the program's output is compared with the oracle's correct output]
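The "bubble-sort as oracle for quick sort" idea can be sketched as follows. This is an illustration under assumptions, not the course's code: the C library's `qsort` stands in for the program under test, and the names `bubble_sort`, `cmp_int`, and `sort_agrees_with_oracle` are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: ascending order of ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Oracle: a slow but simple sort that is easy to review by hand. */
static void bubble_sort(int *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        for (size_t j = 0; j + 1 < n - i; j++) {
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
            }
        }
    }
}

/* Feed the same input to the program under test and to the oracle,
   then compare the two outputs. */
bool sort_agrees_with_oracle(const int *input, size_t n)
{
    if (n == 0) {
        return true;
    }
    int *got = malloc(n * sizeof *got);
    int *expected = malloc(n * sizeof *expected);
    if (!got || !expected) {
        free(got);
        free(expected);
        return false;
    }
    memcpy(got, input, n * sizeof *got);
    memcpy(expected, input, n * sizeof *expected);

    qsort(got, n, sizeof *got, cmp_int);   /* program under test */
    bubble_sort(expected, n);              /* oracle */

    bool same = memcmp(got, expected, n * sizeof *got) == 0;
    free(got);
    free(expected);
    return same;
}
```

The oracle is only as trustworthy as its own implementation, which is why a deliberately simple algorithm is chosen for it.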

Record-and-Replay Oracles

• Record prior runs
• Test recording is usually very fragile
  – Breaks if environment changes anything
  – E.g., location, background color of textbox
• More generally, automation tools cannot generalize
  – They literally record exactly what happened
  – If anything changes, the test breaks
• A hidden strength of manual testing
  – Because people are doing the tests, ability to adapt tests to slightly modified situations is built-in

Result Checking

• Easy to check the result of some algorithms
  – E.g., computing roots of polynomials, vs. checking that the result is correct
  – E.g., executing a query, vs. checking that the results meet the conditions
• Not easy to check that you got all results though!

[Figure: input -> Program -> output -> check]
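Checking a claimed polynomial root is much simpler than computing one, which is the point of result checking. A minimal sketch, assuming coefficients are stored lowest degree first and that a small tolerance `tol` is acceptable; the names `poly_eval` and `is_root` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Evaluate c[0] + c[1]*x + ... + c[degree]*x^degree using Horner's rule. */
static double poly_eval(const double *c, size_t degree, double x)
{
    double acc = c[degree];
    for (size_t i = degree; i-- > 0; ) {
        acc = acc * x + c[i];
    }
    return acc;
}

/* Result check: accept x as a root if |p(x)| <= tol. */
bool is_root(const double *c, size_t degree, double x, double tol)
{
    double v = poly_eval(c, degree, x);
    return v <= tol && v >= -tol;
}
```

For x^2 + x - 6 the check accepts 2 and -3 and rejects 1. As the slide notes, it cannot tell whether a root finder returned all roots.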

Assertions

• Use assert(…) liberally
  – Documents important invariants
  – Makes your code self-checking
    • reduces propagation from fault to failure
  – And does it on every execution!
    • for a performance price
  – You still have to worry about coverage
• May need to write functions that check invariants
• Opinion: Most programmers don’t use assert enough
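A small example of "functions that check invariants" combined with `assert`. The bounded stack, its capacity, and all names here are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 16   /* hypothetical capacity for this sketch */

typedef struct {
    int data[STACK_CAPACITY];
    size_t size;
} Stack;

/* Invariant check as a function: the size never exceeds the capacity. */
static bool stack_ok(const Stack *s)
{
    return s != NULL && s->size <= STACK_CAPACITY;
}

void stack_init(Stack *s)
{
    assert(s != NULL);
    s->size = 0;
}

void stack_push(Stack *s, int v)
{
    assert(stack_ok(s));                 /* invariant holds on entry */
    assert(s->size < STACK_CAPACITY);    /* precondition: not full */
    s->data[s->size++] = v;
    assert(stack_ok(s));                 /* invariant holds on exit */
}

int stack_pop(Stack *s)
{
    assert(stack_ok(s));
    assert(s->size > 0);                 /* precondition: not empty */
    return s->data[--s->size];
}
```

The assertions run on every call, so a violated invariant stops the program close to the fault instead of surfacing later as an unrelated failure. Compiling with `NDEBUG` defined removes the checks and their cost.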

Standard Testing Questions

• How shall we generate/select test cases?
• Did this test execution succeed or fail?
• How do we know when we’ve tested enough?

3. How Shall We Generate/Select Tests?

Automatic Test Generation


Fuzz Testing

• What is Fuzz Testing?
  – Throw random data at an app
  – See what breaks

Bugs Found Through Fuzzing

• About ¼ of Unix utilities crash when fed random input strings!!
• Microsoft estimates 20-25% of bugs found this way
• Examples
  – Microsoft JPEG parsing error
  – Linksys WRT54G wireless router crack
  – Device Driver crashes
  – Skype remote DOS from heap overflow

Office 2007: 4 bugs, 3 hours, 7 lines of python fuzzer

http://www.pakwarez.com/showthread.php?1813-Multiple-Microsoft-Office-Security-Vulnerabilities

+ Unspecified Overflow in word 2007
  - Crash in wwlib.dll. Code execution is not trivial.
+ Word 2007 CPU exhaustion DOS
  - CPU shoots up to 100 %.
+ Word 2007 CPU exhaustion DOS + ding
  - CPU shoots up to 100 %, and windows goes .ding!.
+ Heap overflow in Windows HLP files
  - Funky heap overflow crash.

Random Bugs

• What sort of bugs does fuzz testing find?
  – Buffer overruns
  – Format string errors
  – Wild pointers/array out of bounds
  – Signed/unsigned characters
  – Failure to handle return codes
  – Race conditions

World’s Simplest Fuzzer

• while [ 1 ]; do cat /dev/urandom | nc -vv target port; done
  – (Sutton, 2007)
• Can be hard to figure out exactly what caused a crash
  – We’ll see a technique later

Why Fuzzers Work?

• Test engineers try to generate bad inputs
  – but human generated test cases tend to contain fewer tests than a fuzzer can produce.
• Test engineers may make implicit assumptions about the input
  – an automated, brainless fuzzer will just try anything

Fuzzing Techniques

• Dumb fuzzing
  – sends random bytes of data to a program and sees what, if anything, happens
• Smart fuzzing
  – extends dumb fuzzing with more domain specific data
  – A fuzzer targeted at web applications might generate GET and POST queries using (and abusing) the variables that the form or page submits as well as adding in some random variables and values.
  – A fuzzer targeting a web browser might generate random input that conforms to HTML syntax, with random tags and attributes as well as abusing the defined tags.
• A combination of both dumb and smart fuzzing is likely the best approach.
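Dumb fuzzing as described above can be sketched in a few lines of C. This is an in-process illustration under assumptions: the target is a function that returns nonzero when it detects a failure (a real fuzzer would usually watch for crashes instead), and `fuzz`, `target_fn`, `robust_target`, `fragile_target`, and `MAX_INPUT` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_INPUT 64   /* hypothetical upper bound on generated input length */

/* A target returns 0 if it handled the input, nonzero if it failed. */
typedef int (*target_fn)(const unsigned char *buf, size_t len);

/* Example target that accepts every input. */
int robust_target(const unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* Example target with a made-up bug: it fails on a leading high-ASCII byte. */
int fragile_target(const unsigned char *buf, size_t len)
{
    return (len > 0 && buf[0] >= 0x80) ? 1 : 0;
}

/* Dumb fuzzing: send random bytes of random length and count failures. */
int fuzz(target_fn target, int trials, unsigned seed)
{
    unsigned char buf[MAX_INPUT];
    int failures = 0;

    srand(seed);   /* a fixed seed makes a failing run repeatable */
    for (int t = 0; t < trials; t++) {
        size_t len = (size_t)(rand() % (MAX_INPUT + 1));
        for (size_t i = 0; i < len; i++) {
            buf[i] = (unsigned char)(rand() % 256);
        }
        if (target(buf, len) != 0) {
            failures++;
        }
    }
    return failures;
}
```

A smart fuzzer would replace the random byte loop with generation that respects the input format, for example starting from valid HTML and mutating tags and attributes.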

Fuzz Generation: String Heuristics

• Very long strings, even of single character
• Strings of high-ASCII
• Non-alphanumerics:
  – !@#$%^&*()_-+=:;'"
• Unicode:
  – non-ASCII malformed surrogate pairs, ASCII encoded in several bytes, etc.
• Directory traversal:
  – .., ../, ../.., ../../.., etc.
  – Sendmail vulnerability “///////…. /”, “<><><>…<>”
• Format string: %n%n%n%n %s%n%s%n

Fuzz + Enumeration

• Sometimes the strings come from an underlying “grammar”
  – In a compiler
• Mix fuzzing and enumeration
  – Enumerate strings from the grammar, with “holes”
  – Fill holes with fuzz
  – Command injection: “ls *”, “SELECT FROM”
• Useful in compiler testing

Web App Heuristics: URLs

• Directory traversal
• Fuzz file extensions in particular: .htm, .html, etc.
• Long URLs
• Query strings, names and values
• Malformed URLs with unescaped characters

Open Source Fuzzers

• http://www.infosecinstitute.com/blog/2005/12/fuzzers-ultimate-list.html
• (L)ibrary (E)xploit API - lxapi - A collection of python scripts for fuzzing
• … many more
• Original Fuzz tool by Bart Miller:
• http://pages.cs.wisc.edu/~bart/fuzz/fuzz.html

Comments About Fuzz Testing

• Differs from Targeted Testing
  – May well miss dangerous paths
  – Will take paths no logical person would ever consider
  – Tests vastly more conditions and paths than a human could
  – Like unit testing, usually starts from a clean slate. Thus misses multicondition bugs
  – A supplement to traditional boundary value analysis and other human driven testing;
    • not a replacement

Limitations of Fuzz Testing

• Access control
• Checksums and file verification
• Grammars
• Multistep bugs

Limitations of Fuzzing

• Probability of reaching an error can be astronomically low

    test_me(int x) {
        if (x == 94389) {
            ERROR;
        }
    }

Probability of hitting ERROR = 1/2^32
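To put the 1/2^32 figure in perspective, the following sketch computes the size of a 32-bit input space and an upper bound on the chance that any of a given number of uniformly random inputs hits one specific value. The function names are hypothetical, and the bound is the simple union bound (trials divided by the number of possible inputs).

```c
/* Number of distinct values of an input that is `bits` bits wide. */
double input_space_size(int bits)
{
    double size = 1.0;
    for (int i = 0; i < bits; i++) {
        size *= 2.0;
    }
    return size;
}

/* Union bound on the probability that at least one of `trials` uniformly
   random inputs equals one specific magic value, capped at 1. */
double hit_probability_bound(double trials, int bits)
{
    double p = trials / input_space_size(bits);
    return p > 1.0 ? 1.0 : p;
}
```

With 32 bits there are 4,294,967,296 possible inputs, so even a million random trials reach the ERROR branch with probability below 0.00024.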

Symbolic Execution

• Use symbolic values for input variables
• Execute the program symbolically on symbolic input values
• Collect symbolic path constraints
• Use constraint solver to generate test inputs for each execution path
• Lots of recent work on constraint solving
  – Well-engineered tools available (Z3, Yices)
  – Unfortunately, cannot cover in this class

Computation Tree: Execution Paths of a Program

• Can be seen as a binary tree with possibly infinite depth
  – Computation tree
• Each node represents the execution of an “if then else” statement
• Each edge represents the execution of a sequence of non-conditional statements
• Each path in the tree represents an equivalence class of inputs

[Figure: binary computation tree with edges labeled 0 and 1]

Example of Computation Tree

    void test_me(int x, int y) {
        if (2*x == y) {
            if (x != y+10) {
                printf("I am fine here");
            } else {
                printf("I should not reach here");
                ERROR;
            }
        }
    }

[Figure: computation tree with decision nodes "2*x==y" and "x!=y+10", edges labeled N and Y, and a leaf labeled ERROR]
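Each root-to-leaf path of the computation tree above corresponds to one class of inputs. The sketch below labels the three paths of `test_me` and returns which one an input follows; the enum and the function name `test_me_path` are hypothetical. Solving 2*x == y together with x == y+10 gives x = -10, y = -20 as an input for the ERROR leaf.

```c
/* The three root-to-leaf paths of the computation tree for test_me. */
typedef enum {
    PATH_SKIP,    /* 2*x == y is false */
    PATH_FINE,    /* 2*x == y and x != y+10 */
    PATH_ERROR    /* 2*x == y and x == y+10 */
} Path;

/* Same branching structure as test_me, returning the path taken
   instead of printing. */
Path test_me_path(int x, int y)
{
    if (2 * x == y) {
        if (x != y + 10) {
            return PATH_FINE;
        }
        return PATH_ERROR;
    }
    return PATH_SKIP;
}
```

One test input per path, for example (1, 3), (1, 2), and (-10, -20), covers all three leaves.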

Active Checking for Errors

Divide by 0 Error

x = 3 / i;

Buffer Overflow

a[i] = 4;


Active Checking for Errors

Divide by 0 Error

    if (i != 0)
        x = 3 / i;
    else
        ERROR;

Buffer Overflow

    if (0 <= i && i < a.length)
        a[i] = 4;
    else
        ERROR;

Key: Add Checks Automatically and Try to Cover all Branches
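In C the two inserted checks might look like the sketch below. It is an illustration only: a tool would insert such checks automatically, C arrays carry no `a.length` so the length is passed explicitly, and `three_over` and `checked_store` are hypothetical names that report ERROR through a false return value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Checked version of `x = 3 / i;`. Returns false on the ERROR branch. */
bool three_over(int i, int *x)
{
    if (i != 0) {
        *x = 3 / i;
        return true;
    }
    return false;   /* ERROR: division by zero */
}

/* Checked version of `a[i] = value;`. Returns false on the ERROR branch. */
bool checked_store(int *a, size_t length, long i, int value)
{
    if (0 <= i && (size_t)i < length) {
        a[i] = value;
        return true;
    }
    return false;   /* ERROR: index out of bounds */
}
```

Once the checks are explicit branches, covering every branch means finding inputs with `i == 0` and with `i` outside the array bounds, which turns these errors into ordinary coverage targets.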

Concrete Execution

    int x = read();
    int y = read();
    int t;

    t = x;
    x = x + y;
    if (x > y) {
        y = 2*t;
        if (x+1 == y) {
            assert(false);
        }
    }

Concrete Execution

    int x = read();
    int y = read();
    int t;

    t = x;
    x = x + y;
    if (x > y) {
        y = 2*t;
        if (x+1 == y) {
            assert(false);
        }
    }

Concrete trace for inputs x = 4, y = 9:

    t = 4
    x = 13
    (13 > 9)
    y = 8
    (14 == 8)
    END

Symbolic Execution

    int x = read();
    int y = read();
    int t;

    t = x;
    x = x + y;
    if (x > y) {
        y = 2 * t;
        if (x+1 == y) {
            assert(false);
        }
    }

Symbolic states after each step, written as (path condition | variable values):

    (true | x = x0)
    (true | x = x0, y = y0)
    (true | x = x0, y = y0, t = x0)
    (true | x = x0+y0, y = y0, t = x0)

    Branch on x0+y0 > y0:
      (x0+y0 <= y0 | x = x0+y0, y = y0, t = x0)
      (x0+y0 > y0 | x = x0+y0, y = 2x0, t = x0)

    Branch on x0+y0+1 == 2x0:
      (x0+y0 > y0, x0+y0+1 == 2x0 | x = x0+y0, y = 2x0, t = x0)
      (x0+y0 > y0, x0+y0+1 ≠ 2x0 | x = x0+y0, y = 2x0, t = x0)

Symbolic Execution

    int x = read();
    int y = read();
    int t;

    t = x;
    x = x + y;
    if (x > y) {
        y = 2 * t;
        if (x+1 == y) {
            assert(false);
        }
    }

Symbolic states after each step, written as (path condition | variable values):

    (true | x = x0)
    (true | x = x0, y = y0)
    (true | x = x0, y = y0, t = x0)
    (true | x = x0+y0, y = y0, t = x0)

    Branch on x0+y0 > y0:
      (x0+y0 <= y0 | x = x0+y0, y = y0, t = x0)
        Solve constraint. Solution: x = 0, y = 0
      (x0+y0 > y0 | x = x0+y0, y = 2x0, t = x0)

    Branch on x0+y0+1 == 2x0:
      (x0+y0 > y0, x0+y0+1 ≠ 2x0 | x = x0+y0, y = 2x0, t = x0)
        Solve constraint. Solution: x = 1, y = 9
      (x0+y0 > y0, x0+y0+1 == 2x0 | x = x0+y0, y = 2x0, t = x0)
        Solve constraint. Solution: x = 2, y = 1
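The three solutions from the symbolic execution example can be checked by running the program concretely on each of them. In this sketch the inputs are parameters instead of `read()`, the failing assertion is reported as a return value, and the enum and the function name `run` are hypothetical.

```c
/* The three paths through the example program. */
typedef enum {
    PATH_OUTER_FALSE,    /* x0 + y0 <= y0 */
    PATH_INNER_FALSE,    /* x0 + y0 > y0 and x0 + y0 + 1 != 2*x0 */
    PATH_ASSERT_FAILS    /* x0 + y0 > y0 and x0 + y0 + 1 == 2*x0 */
} Path;

/* Concrete run of the example; reports the path instead of asserting. */
Path run(int x, int y)
{
    int t = x;
    x = x + y;
    if (x > y) {
        y = 2 * t;
        if (x + 1 == y) {
            return PATH_ASSERT_FAILS;   /* assert(false) would fire here */
        }
        return PATH_INNER_FALSE;
    }
    return PATH_OUTER_FALSE;
}
```

Inputs (0, 0), (1, 9), and (2, 1) each drive execution down a different path, and (2, 1) is the one that reaches `assert(false)`. The earlier concrete run with (4, 9) follows the same path as (1, 9).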

Concolic Approach

• Combine concrete and symbolic execution for unit testing
  – DART: Directed Automated Random Testing
  – Concrete + Symbolic = Concolic
• Why would you do this?

Concolic Approach

• Combine concrete and symbolic execution for unit testing
• Why would you do this?
• Concrete execution helps Symbolic execution to simplify complex and unmanageable symbolic expressions
  – By performing address resolution at run time
  – By replacing symbolic values by concrete values

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Random Test Driver:
• random value for x and y
• Probability of reaching ERROR is extremely low

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Concrete Execution
  concrete state: x = 22, y = 7
Symbolic Execution
  symbolic state: x = x0, y = y0
  path condition: (empty)

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Concrete Execution
  concrete state: x = 22, y = 7, z = 14
Symbolic Execution
  symbolic state: x = x0, y = y0, z = 2*y0
  path condition: (empty)

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Concrete Execution
  concrete state: x = 22, y = 7, z = 14
Symbolic Execution
  symbolic state: x = x0, y = y0, z = 2*y0
  path condition: 2*y0 != x0

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Concrete Execution
  concrete state: x = 22, y = 7, z = 14
Symbolic Execution
  symbolic state: x = x0, y = y0, z = 2*y0
  path condition: 2*y0 != x0

Solve: 2*y0 == x0
Solution: x0 = 2, y0 = 1

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Concrete Execution
  concrete state: x = 2, y = 1
Symbolic Execution
  symbolic state: x = x0, y = y0
  path condition: (empty)

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Concrete Execution
  concrete state: x = 2, y = 1, z = 2
Symbolic Execution
  symbolic state: x = x0, y = y0, z = 2*y0
  path condition: (empty)

Concolic Testing Approach

    int double (int v) { return 2*v; }

    void testme (int x, int y) {
        z = double (y);
        if (z == x) {
            if (x > y+10) {
                ERROR;
            }
        }
    }

Concrete Execution
  concrete state: x = 2, y = 1, z = 2
Symbolic Execution
  symbolic state: x = x0, y = y0, z = 2*y0
  path condition: 2*y0 == x0

Concolic Testing Approach

int double (int v) { return 2*v; }

void testme (int x, int y) {
  z = double (y);
  if (z == x) {
    if (x > y+10) {
      ERROR;
    }
  }
}

Concrete Execution
concrete state: x = 2, y = 1, z = 2

Symbolic Execution
symbolic state: x = x0, y = y0, z = 2*y0
path condition: (2*y0 == x0) && (x0 <= y0+10)


Concolic Testing Approach

int double (int v) { return 2*v; }

void testme (int x, int y) {
  z = double (y);
  if (z == x) {
    if (x > y+10) {
      ERROR;
    }
  }
}

Concrete Execution
concrete state: x = 2, y = 1, z = 2

Symbolic Execution
symbolic state: x = x0, y = y0, z = 2*y0
path condition: (2*y0 == x0) && (x0 <= y0+10)

Solve: (2*y0 == x0) && (x0 > y0 + 10)
Solution: x0 = 30, y0 = 15


Concolic Testing Approach

int double (int v) { return 2*v; }

void testme (int x, int y) {
  z = double (y);
  if (z == x) {
    if (x > y+10) {
      ERROR;
    }
  }
}

Concrete Execution
concrete state: x = 30, y = 15

Symbolic Execution
symbolic state: x = x0, y = y0
path condition: (empty)


Concolic Testing Approach

int double (int v) { return 2*v; }

void testme (int x, int y) {
  z = double (y);
  if (z == x) {
    if (x > y+10) {
      ERROR;
    }
  }
}

Concrete Execution
concrete state: x = 30, y = 15

Symbolic Execution
symbolic state: x = x0, y = y0
path condition: (2*y0 == x0) && (x0 > y0+10)

Program Error


Traverse all Paths

• Traverse all execution paths one by one to detect errors

– assertion violations

– program crash

– uncaught exceptions

• combine with valgrind to discover memory errors

[Figure: tree of execution paths, with each branch edge labeled T (true) or F (false)]


Concolic Testing: A Middle Approach

Random Testing
+ Complex programs
+ Efficient
- Less coverage
+ No false positive

Symbolic Testing
- Simple programs
- Not efficient
+ High coverage
- False positive

Concolic Testing
+ Complex programs
+/- Somewhat efficient
+ High coverage
+ No false positive


Limitations: A Comparative View

Concolic: Broad, shallow

Random: Narrow, deep


Hybrid Concolic Testing

Interleave Random Testing and Concolic Testing to increase coverage

Key idea:

– Random testing leads to a state s deep inside the program

– Concolic testing explores paths starting in s and gives good local coverage

Good for: Reactive programs (periodic input from environment)

Not good for: Transformational programs

Reference Paper: Majumdar, Sen, Hybrid Concolic Testing

http://srl.cs.berkeley.edu/~ksen/pubs/paper/hybrid.ps

Aside: Motivated by a similar idea used in VLSI design validation: Ganai et al. 1999, Ho et al. 2000


Hybrid Concolic Testing

• Interleave Random Testing and Concolic Testing to increase coverage

while (not required coverage) {
  while (not saturation)
    perform random testing;
  Checkpoint;
  while (not increase in coverage)
    perform concolic testing;
  Restore;
}

Deep, broad, hybrid Search
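The loop above can be sketched on a toy reactive program. Everything in this example is an assumption made for illustration: the state machine `step`, the `GATES` table, the input domain, and the `patience` threshold are hypothetical, and `local_search_phase` uses exhaustive search over the input domain as a stand-in for concolic testing. It shows the control structure (random phase, checkpoint, local search from the checkpoint), not the algorithm from the referenced paper.

```python
import random

GATES = {3: 777, 6: 123}   # hypothetical: states that advance only on one exact input
GOAL = 8                   # hypothetical final state
DOMAIN = range(1000)       # hypothetical input domain

def step(state, inp):
    """Toy reactive program: consume one input and return the next state."""
    if state >= GOAL:
        return state
    if state in GATES:
        return state + 1 if inp == GATES[state] else state
    return state + 1 if inp % 2 == 0 else state

def random_phase(state, covered, rng, patience=20):
    """Random testing until `patience` consecutive inputs add no coverage."""
    stale = 0
    while stale < patience:
        state = step(state, rng.choice(DOMAIN))
        if state in covered:
            stale += 1
        else:
            covered.add(state)
            stale = 0
    return state

def local_search_phase(checkpoint, covered):
    """Stand-in for concolic testing: try every input from the checkpoint."""
    for inp in DOMAIN:
        nxt = step(checkpoint, inp)   # every attempt restarts from the checkpoint
        if nxt not in covered:
            covered.add(nxt)
            return nxt                # coverage increased; continue from here
    return checkpoint

def hybrid(seed=0):
    rng = random.Random(seed)
    state, covered = 0, {0}
    while len(covered) < GOAL + 1:    # required coverage: every state reached
        state = random_phase(state, covered, rng)
        checkpoint = state            # Checkpoint
        state = local_search_phase(checkpoint, covered)
    return covered

if __name__ == "__main__":
    print(sorted(hybrid()))
```

In this toy, random inputs get through the parity-checked states quickly but rarely guess a gate value, while the local search passes a gate in one phase; interleaving the two reaches the goal state.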


Further reading

S. Anand et al. (2013) "An orchestrated survey of methodologies for automated software test case generation," Journal of Systems and Software, 86:8, pp. 1978–2001. DOI: j.jss.2013.02.061

http://www.sciencedirect.com/science/article/pii/S0164121213000563

See also: Zoltán MICSKEI’s page with a recent overview on tools:

http://home.mit.bme.hu/~micskeiz/pages/code_based_test_generation.html


Acknowledgements

• Some of the slides were adapted from the lecture materials by Rupak Majumdar