Pruning Dynamic Slices
With Confidence
Xiangyu Zhang
Neelam Gupta
Rajiv Gupta
The University of Arizona
Dynamic Slicing
A dynamic slice is the set of statements that actually affected the value of a variable at a program point in a specific program execution. [Korel and Laski, 1988]
……
10. A =
…...
20. B =
……
30. P =
31. If (P<0) {
......
35. A = A + 1
36. }
37. B=B+1
……
40. Error(A)
Dynamic Slice (A@40) = {10, 30, 31, 35, 40}
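As a minimal sketch of how this slice could be obtained, the Python below walks backward over a hand-written dynamic dependence map for the example. The map `deps` and the function `backward_slice` are illustrative assumptions, not part of the original slides, and the elided statements are ignored.

```python
# Hypothetical dynamic dependences for the example execution (P < 0 was true).
# Each executed statement maps to the statements it data- or control-depends on.
deps = {
    10: [],          # A = ...
    20: [],          # B = ...
    30: [],          # P = ...
    31: [30],        # if (P < 0): uses P from 30
    35: [10, 31],    # A = A + 1: uses A from 10, controlled by 31
    37: [20],        # B = B + 1: uses B from 20
    40: [35],        # Error(A): uses A from 35
}

def backward_slice(deps, criterion):
    """Collect every statement reachable backward from the criterion."""
    seen, worklist = set(), [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt not in seen:
            seen.add(stmt)
            worklist.extend(deps[stmt])
    return seen

print(sorted(backward_slice(deps, 40)))  # [10, 30, 31, 35, 40]
```

Statements 20 and 37 are executed but never reached by the backward walk, so they stay out of the slice.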
Effectiveness of Dynamic Slicing
Dynamic slicing is very effective at containing the faulty statement; however, it usually produces oversized slices [AADEBUG’05].
Problem:
How to automatically prune dynamic slices?
Approaches:
• Coarse-grained pruning by intersecting multiple types (backward, forward, bidirectional) of dynamic slices -- [ASE’05, ICSE’06]
• Fine-grained pruning of a backward slice by using confidence analysis -- this paper.
Types of Evidence Used in Pruning
Classical dynamic slicing algorithms investigate bugs through the evidence of the wrong output.
[Figure: a buggy execution with inputs input0, input_x, input2, outputs output0, output1, output_x, and predicate_x.]
Other types of evidence:
• Failure-inducing input [ASE’05]
• Critical predicate [ICSE’06]
• Partially correct output -- this paper
Benefits of more evidence:
• Narrow the search for the faulty statement.
• Broaden the applicability.
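Coarse-grained Pruning by Intersecting Slices
[Figure: an execution from Input to Output, showing the failure-inducing input, the backward slice BS, the forward slice FS, the critical predicate CP with its slices FS(CP) and BiS(CP), and the intersection BS^FS.]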
Fine-grained Pruning by Exploiting Correct Outputs
……
10. A = 1 (Correct: A=3)
…...
20. B = A % 2
……
30. C = A + 2
……
40. Print (B)
41. Print (C)
Correct outputs are produced in addition to the wrong output.
BS(Owrong) - BS(Ocorrect) is problematic:
BS(C@41) = {10, 30, 41}
BS(B@40) = {10, 20, 40}
BS(C@41) - BS(B@40) = {30, 41}
The difference drops statement 10, which is the faulty statement.
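The set difference on this slide can be checked directly. The short Python sketch below uses the two slices as given; the variable names are illustrative assumptions.

```python
# Backward slices from the example, as sets of statement numbers.
bs_wrong = {10, 30, 41}    # BS(C@41): C is the wrong output
bs_correct = {10, 20, 40}  # BS(B@40): B is the correct output
faulty = 10                # A = 1 should have been A = 3

naive = bs_wrong - bs_correct
print(sorted(naive))       # [30, 41]
print(faulty in naive)     # False: the faulty statement was pruned away
```

Statement 10 feeds both outputs, so subtracting the slice of the correct output removes it even though it is the fault.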
Confidence Analysis
• Confidence(n)=1: the value produced at n can reach only correct outputs. There is no evidence of incorrectness of n. Therefore it cannot be in the slice.
• Confidence(n)=0: the value produced at node n can reach only wrong output nodes. There is no evidence that n is correct, so it should be in the pruned slice.
• Confidence(n)=?; 0 ≤ ? ≤ 1: the value produced at node n can reach both the correct and wrong output nodes. Should we include n in the slice?
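The three cases can be illustrated with a small reachability check. This is a sketch only: the forward edges `succs` are a hand-written assumption for the earlier A/B/C example (10: A, 20: B = A % 2, 30: C = A + 2, 40: Print(B), 41: Print(C)), and `classify` is a hypothetical helper that returns `None` for the undecided middle case.

```python
# Hypothetical forward dependence edges for the earlier A/B/C example.
succs = {10: [20, 30], 20: [40], 30: [41], 40: [], 41: []}
correct_outputs, wrong_outputs = {40}, {41}

def reachable(node):
    """All nodes reachable from `node`, including the node itself."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(succs[n])
    return seen

def classify(node):
    r = reachable(node)
    hits_correct = bool(r & correct_outputs)
    hits_wrong = bool(r & wrong_outputs)
    if hits_wrong and not hits_correct:
        return 0      # only wrong outputs: keep in the pruned slice
    if hits_correct and not hits_wrong:
        return 1      # only correct outputs: prune
    return None       # both (or neither): needs the finer analysis

for n in (10, 20, 30):
    print(n, classify(n))
```

Statement 10 reaches both outputs, which is exactly the case where a value between 0 and 1 has to be computed.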
Confidence Analysis
$Cf(n) = 1 - \log_{|Range(n)|} |Alt(n)|$
[Figure: node n with Value(n) = a and Range(n) = { a, b, c, d, e, f, g }; the alternatives Value(n) = b and Value(n) = c are shown, with Alt(n) = { a } extended by c.]
• When |Alt(n)|==1, we have the highest confidence (=1) on the correctness of n;
• When |Alt(n)|==|Range(n)|, we have the lowest confidence (=0).
• |Range(n)| >= |Alt(n)| >= 1
Alt(n) is the set of possible values of the variable defined by n that, when propagated through the dynamic dependence graph, produce the same values for the correct outputs.
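A minimal Python sketch of the confidence measure follows, assuming the formula Cf(n) = 1 - log base |Range(n)| of |Alt(n)|, which agrees with the two boundary cases listed above. The function name and the handling of a one-element range are assumptions made for illustration.

```python
import math

def confidence(alt_size, range_size):
    """Cf(n) = 1 - log base |Range(n)| of |Alt(n)|."""
    if not 1 <= alt_size <= range_size:
        raise ValueError("need 1 <= |Alt(n)| <= |Range(n)|")
    if range_size == 1:
        return 1.0   # assumption: a single possible value is fully trusted
    return 1.0 - math.log(alt_size) / math.log(range_size)

print(confidence(1, 7))  # 1.0
print(confidence(7, 7))  # 0.0
```

Any |Alt(n)| strictly between 1 and |Range(n)| gives a value strictly between 0 and 1.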
Confidence Analysis: Example
……
10. A = ...
…...
20. B = A % 2
……
30. C = A + 2
……
40. Print (B)
41. Print (C)
Cf(41) = 0
Cf(40) = 1
Cf(30) = 0
Cf(20) = 1
$Cf(10) = 1 - \log_{|range(A)|} \frac{|range(A)|}{2} = \log_{|range(A)|} 2$
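For statement 10 in this example, only the correct output B = A % 2 constrains A, so Alt(A) holds the values with the same parity as the observed one. The sketch below works this through for an assumed range of 0..7 and an assumed observed value of 1; both numbers are chosen only for illustration.

```python
import math

# Assumption for illustration: A ranges over 0..7 and its observed value is 1.
range_a = set(range(8))
observed_a = 1
correct_b = observed_a % 2          # Print(B) at 40 is the correct output

# Alt(A): values of A that still produce the same correct output B.
alt_a = {a for a in range_a if a % 2 == correct_b}

cf_10 = 1 - math.log(len(alt_a), len(range_a))
print(len(alt_a), len(range_a))     # 4 8
print(round(cf_10, 4))              # 0.3333
```

Half of the range survives the parity constraint, so the confidence equals the logarithm of 2 in base |range(A)|, here one third.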
Confidence Analysis: Two Problems
How to decide the Range of values for a node n?
• Based on variable type (e.g., Integer).
• Static range analysis.
• Our choice: dynamic analysis based on value profiles. The range of values for a statement is the set of values defined by all of the execution instances of the statement during the program run.
How to compute Alt(n)?
• Consider the set of correct output values as constraints.
• Compute Alt(n) by backward propagation of constraints through the dynamic dependence subgraph corresponding to the slice.
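The value-profile choice for Range can be sketched as a single pass over an execution trace. The trace format (pairs of statement label and defined value) and the function `collect_ranges` are assumptions made for this illustration, not the authors' implementation.

```python
from collections import defaultdict

def collect_ranges(trace):
    """Map each statement to the set of values its instances defined."""
    ranges = defaultdict(set)
    for stmt, value in trace:
        ranges[stmt].add(value)
    return dict(ranges)

# Hypothetical trace of (statement, defined value) pairs from one run.
trace = [("S1", 1), ("S1", 3), ("S1", 5), ("S1", 8), ("S1", 9),
         ("S2", 6), ("S2", 9), ("S2", 10)]
ranges = collect_ranges(trace)
print(sorted(ranges["S1"]))  # [1, 3, 5, 8, 9]
```

A range gathered this way contains only values that actually occurred, so it is typically much smaller than a type-based range.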
Computing Alt(n) Along Data Dependence
S1: T = ...      (value: 9)
S2: X = T + 1    (value: 10)
S3: Y = T % 3    (value: 0)
(T,...) = (1,...) (3,...) (5,...) (8,...) (9,...)
(X,T) = (6,5) (9,8) (10,9)
(Y,T) = (0,3) (0,9) (1,1) (2,5) (2,8)
alt(S2) = {10}    alt(S3) = {0,1}
alt(T@S2) = {9}    alt(T@S3) = {1,3,9}
alt(S1) = alt(T@S2) ∩ alt(T@S3) = {9}
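The propagation on this slide can be reproduced from the value profiles. The Python sketch below takes alt(S2) and alt(S3) as given and derives alt(S1); the helper `alt_of_use` is a hypothetical name, and looking values up in the recorded pairs is one plausible reading of the slide rather than a confirmed implementation.

```python
# Value profiles from the slide: (defined value, value of T used).
profile_s2 = [(6, 5), (9, 8), (10, 9)]                  # (X, T) for S2: X = T + 1
profile_s3 = [(0, 3), (0, 9), (1, 1), (2, 5), (2, 8)]   # (Y, T) for S3: Y = T % 3

alt_s2 = {10}      # given on the slide
alt_s3 = {0, 1}    # given on the slide

def alt_of_use(profile, alt_def):
    """Values of the used variable that were seen with an allowed result."""
    return {used for defined, used in profile if defined in alt_def}

alt_t_s2 = alt_of_use(profile_s2, alt_s2)
alt_t_s3 = alt_of_use(profile_s3, alt_s3)
alt_s1 = alt_t_s2 & alt_t_s3   # T must satisfy both uses at once

print(sorted(alt_t_s2), sorted(alt_t_s3), sorted(alt_s1))  # [9] [1, 3, 9] [9]
```

Because only the observed value 9 survives, |Alt(S1)| is 1 and S1 gets the highest confidence.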
Computing Alt(n) Along Control Dependence
S1: if (P) …     (value: True)
S2: X = T + 1    (value: 10)
S3: Y = T % 3    (value: 0)
(X,T) = (6,5) (9,8) (10,9)
(Y,T) = (0,3) (0,9) (1,1) (2,5) (2,8)
alt(S2) = {10}    alt(S3) = {0,1}
alt(S1) = {True}
Characteristics of Siemens Suite Programs
Program         Description          LOC     Versions  Tests
print_tokens    Lexical analyzer     565     5         4072
print_tokens2   Lexical analyzer     510     5         4057
replace         Pattern replacement  563     8         5542
schedule        Priority scheduler   412     3         2627
schedule2       Priority scheduler   307     3         2683
gzip            Unix utility         8009    1         1217
flex            Unix utility         12418   8         525
• Each faulty version has a single manually injected error.
• Not all versions are included. A version is excluded if no output is produced or if the faulty statement is not contained in the backward slice.
• For each version, three tests were selected.
Results of Pruning
Program         DS    PDSmax  PDSmax / DS  PDSmin  %Missed by PDSmin
print_tokens    110   35      31.8%        35      0%
print_tokens2   114   55      48.2%        55      0%
replace         131   60      45.8%        43      38.1%
schedule        117   70      59.8%        56      20%
schedule2       90    58      64.4%        50      0%
gzip            357   121     33.9%        10      100%
flex            727   27      3.7%         25      0%
On average, PDSmax = 41.1% of DS
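The reported average can be recomputed from the DS and PDSmax columns. The sketch below assumes the average is the unweighted mean of the per-program ratios, which is an assumption about how the figure was derived.

```python
# (DS, PDSmax) pairs from the results table.
results = {
    "print_tokens": (110, 35), "print_tokens2": (114, 55),
    "replace": (131, 60), "schedule": (117, 70),
    "schedule2": (90, 58), "gzip": (357, 121), "flex": (727, 27),
}
# Per-program PDSmax / DS as a percentage.
ratios = {p: 100 * pds / ds for p, (ds, pds) in results.items()}
average = sum(ratios.values()) / len(ratios)
print(round(average, 1))  # 41.1
```

Under that assumption the mean matches the 41.1% stated on the slide.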
Confidence Based Prioritization
[Figure: prioritization results. Legend: DD = dependence distance, CV = confidence values. Axis: executed statement instances examined (%).]
The Potential of Confidence Analysis (1)
Case Study (replace v14)
• 88 74 23
[Figure: components shown are Buggy Code, Input, Dynamic Slicer With Confidence, Pruned Slices, User, and User Verified Statements as correct.]
The Potential of Confidence Analysis (2)
Relevant slicing (gzip v3 run r1)
[Figure legend: potential dependence, data dependence.]
Conclusions
We have presented a new approach, confidence analysis, that exploits the correct output values produced in an execution to prune the dynamic slice of an incorrect output.
We have developed a novel implementation of confidence analysis based on dynamic analysis, which effectively pruned backward dynamic slices in our experiments.
• On average, pruned slices are 41.1% of the size of the dynamic slices and still contain the faulty statement.
Our study shows that confidence analysis has additional applications beyond pruning: prioritization, interactive pruning, and relevant slicing.
The End