soen 6431 winter 2011 - program slicing updated
SOEN 6431
SOFTWARE MAINTENANCE AND EVOLUTION
Dr. Juergen Rilling
Notes:#7
Program Slicing
2
Program comprehension
Program comprehension is the study of how
software engineers understand programs.
Program comprehension is needed for:
Debugging
Code inspection
Test case design
Re-documentation
Design recovery
Code revisions
3
Program comprehension process
Involves the use of existing knowledge to acquire new knowledge about a program.
Existing knowledge:
Programming languages
Computing environment
Programming principles
Architectural models
Possible algorithms and solution approaches
Domain-specific information
Any previous knowledge about the code
New knowledge:
Code functionality
Architecture
Algorithm implementation details
Control flow
Data flow
4
Comprehension techniques
Reading by step-wise abstraction
Determine the function of critical subroutines, then work through the program hierarchy until the function of the whole program is determined.
Checklist-based reading
Readers are given a checklist to focus their attention on particular issues within the document.
Different readers are given different checklists, so each reader concentrates on different aspects of the document.
Defect-based reading
Defects are categorized and characterized (e.g., data type inconsistency, incorrect functionality, missing functionality).
A set of steps (a scenario) is then developed for each defect class to guide the reader in finding those defects.
Perspective-based reading
Similar to defect-based reading, but instead of different defect classes, readers have different roles (tester, designer, and user) to guide them in reading.
8
Sources of variation
Aside from the issue of how comprehension occurs, comprehension performance and effectiveness are affected by many factors:
Maintainer characteristics
Program characteristics
Task characteristics
9
Maintainer characteristics
Familiarity with code base
Application domain knowledge
Programming language knowledge
Programming expertise
Tool expertise
Individual differences
10
Program characteristics
Application domain
Programming domain
Quality of problem to be understood
Program size and complexity
Availability and accuracy of documentation
11
Task characteristics
Task type
Experimental: recall, modification
Perfective, corrective, adaptive, reuse, extension
Task size and complexity
Time constraints
Environmental factors
12
Models
Mental models
Internal working representation of the software under consideration.
Cognitive models
Theories of the processes by which software engineers arrive at a mental model.
[Figure: Program, Mental Model, Cognitive Model]
13
Mental models
Static elements
Text structure knowledge
Microstructure
Chunks (macrostructure)
Plans (objects)
Hypotheses
Dynamic elements
Strategies (chunking and cross-referencing)
Supporting elements
Beacons
Rules of discourse
14
Text structure
The program text and its structure
Control structure: iterations, sequences, conditional constructs
Variable definitions
Calling hierarchies
Parameter definitions
Microstructure – actual program statements
and their relationships.
15
Chunks
Contain various levels of text structure
abstractions.
Also called macrostructure.
Can be identified by a descriptive label.
Can be composed into higher level chunks.
16
Plans (objects)
Knowledge elements for developing and validating expectations, interpretations, and inferences.
Include causal knowledge about information flow and relationships between parts of a program.
Programming plans
Based on programming concepts.
Low level: iteration and conditional code segments.
Intermediate level: searching, sorting, summing algorithms; linked lists and trees.
High level
Domain plans
All knowledge about the problem area.
Examples: problem domain objects, system environment, domain-specific solutions and architectures.
17
Hypotheses
Conjectures that are results of comprehension activities that can take seconds or minutes to occur.
Three types:
Why – hypothesize the purpose/rationale of a function or design choice.
How – hypothesize the method for accomplishing a certain goal.
What – hypothesize classification.
Hypotheses are drivers of cognition. They help to define the direction of further investigation.
Code cognition formulates hypotheses, checks whether they are true or false, and revises them when necessary.
Hypotheses fail for several reasons:
Can’t find code to support a hypothesis.
Confusion due to one piece of code satisfying different hypotheses.
Code cannot be explained.
18
Supporting elements
Beacons
Cues that index into existing knowledge.
A swap routine can be a beacon for a sorting function.
Experienced programmers recognize beacons much faster than novice programmers.
Used commonly in top-down comprehension.
Rules of discourse
Rules that specify programming conventions.
Examples: coding standards, algorithm implementations, expected use of data structures.
19
Mental models – dynamic elements
Strategies
Sequences of actions that lead to a particular goal.
Actions
Classify programmer activities, implicitly and explicitly, during a maintenance task.
Episodes
Sequences of actions.
Processes
Aggregations of episodes.
20
Strategies
Guide the sequence of actions while following
a plan to reach a goal.
Match programming plans to code.
Shallow reasoning – do not perform in-depth analysis; stop upon recognition of familiar idioms and programming plans.
Deep reasoning – perform detailed analysis.
Mechanisms for understanding
Chunking
Cross-referencing
21
Chunking
Creates new, higher-level abstraction
structures
Labels replace the detail of the lower level
chunks.
22
Cross-referencing
Map program parts to functional descriptions
swap:
temp = a;
a = b;
b = temp;
sequential search:
for (i=0; i<size; i++)
    if (array[i]==target)
        return true;
23
Cognitive models
Letovsky
Shneiderman and Mayer
Brooks
Soloway, Adelson and Ehrlich
Pennington
von Mayrhauser and Vans (Integrated)
24
Letovsky model
25
Shneiderman model
26
Brooks model
27
Soloway model
28
Pennington model
29
Integrated model
30
Distributed cognition
Traditional cognitive models deal with the cognitive processes inside one person’s brain.
On real projects, software developers:
Work in teams
Can ask people questions
Can surf the web for answers
How do these affect the cognitive process?
Program Slicing
31
Program Comprehension - support
Can we learn from other domains?
We might not want, or be able, to digest a whole
32
Solution?
33
34
35
What is Program Slicing?
More descriptively, it is a decomposition technique that extracts the statements relevant to a particular computation from a program.
Slicing criterion <s, v>: a statement s and a variable v.
Program slices as originally introduced by Weiser [1] are known as executable backward static slices.
36
37
Given:
(1) A program
(2) A variable v at some point P in the
program
Goal:
Finding the part of the program that is responsible for
the computation of variable v at point P.
Basic Idea
38
Why Program Slicing?
Program Debugging: that’s how slicing was discovered!
Testing: reduce the cost of regression testing after modifications (only run those tests that are needed)
Parallelization
Integration: merging two programs A and B that both resulted from modifications to BASE
Reverse Engineering: comprehending the design by abstracting the design decisions out of the source code
Software Maintenance: changing source code without unwanted side effects
Software Quality Assurance: validate interactions between safety-critical components
39
40
41
Types of Slicing (Executable)
42
Static Backward Program Slicing
Static backward program slicing was originally introduced by Weiser in 1982. A static program slice consists of those parts of a program P that could potentially affect the value of a variable v at a point of interest.
[Figure: Program P computes v and its static slice computes v’; v = v’ for all possible program inputs (executions).]
43
Slicing Properties:
Static Slicing
Statically available information only
No assumptions made on input
A minimal (fully accurate) slice cannot be computed in general
The problem is undecidable (reduction to the halting problem)
Current static methods can only compute approximations
The result may not be useful
44
45
46
47
Data dependence: represents a data flow (definition-use chain). For example, there is a data dependence between nodes 2 and 7, but not between nodes 2 and 8.
Control dependence: the execution of a node depends on the outcome of a predicate node. For example, there is a control dependence between nodes 6 and 8, but not between nodes 6 and 15.
Creating a PDG
1 input (n,a);
2 max := a[1];
3 min := a[1];
4 i := 2;
5 s := 0;
6 while i <= n do begin
7   if max < a[i] then begin
8     max := a[i];
9     s := max; end;
10   if min > a[i] then begin
11     min := a[i];
12     s := min; end;
13   output (s);
14   i := i + 2; end;
15 output (max);
16 output (min);
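The data dependences of this sample program can be derived mechanically. The following is a minimal sketch, assuming a hand-written control-flow graph and def/use sets for the 16 statements, the loop condition read as i <= n, and the array a treated as a single variable. The names CFG, DEFS, USES and data_dependences are illustrative and do not come from the slides. It uses a standard reaching-definitions fixed point, which is one common way to obtain definition-use chains.

```python
# Control-flow graph of the sample max/min program: statement -> successors.
# Hand-written for illustration; the loop condition is assumed to be i <= n.
CFG = {
    1: [2], 2: [3], 3: [4], 4: [5], 5: [6],
    6: [7, 15], 7: [8, 10], 8: [9], 9: [10],
    10: [11, 13], 11: [12], 12: [13], 13: [14], 14: [6],
    15: [16], 16: [],
}
# Variables defined and used by each statement (array a is one variable).
DEFS = {1: {"n", "a"}, 2: {"max"}, 3: {"min"}, 4: {"i"}, 5: {"s"},
        8: {"max"}, 9: {"s"}, 11: {"min"}, 12: {"s"}, 14: {"i"}}
USES = {2: {"a"}, 3: {"a"}, 6: {"i", "n"}, 7: {"max", "a", "i"},
        8: {"a", "i"}, 9: {"max"}, 10: {"min", "a", "i"}, 11: {"a", "i"},
        12: {"min"}, 13: {"s"}, 14: {"i"}, 15: {"max"}, 16: {"min"}}


def data_dependences(cfg, defs, uses):
    """Return the set of (definition, use) statement pairs."""
    preds = {n: set() for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    reach_in = {n: set() for n in cfg}    # (variable, defining statement)
    reach_out = {n: set() for n in cfg}
    changed = True
    while changed:                        # iterate to a fixed point
        changed = False
        for n in cfg:
            rin = set()
            for p in preds[n]:
                rin |= reach_out[p]
            killed = defs.get(n, set())
            rout = {(v, d) for (v, d) in rin if v not in killed}
            rout |= {(v, n) for v in killed}
            if rin != reach_in[n] or rout != reach_out[n]:
                reach_in[n], reach_out[n] = rin, rout
                changed = True
    return {(d, n) for n in cfg
            for (v, d) in reach_in[n] if v in uses.get(n, set())}


deps = data_dependences(CFG, DEFS, USES)
print((2, 7) in deps, (2, 8) in deps)  # True False
```

Under these assumptions the sketch reproduces the claim on the slide: statement 7 reads max as defined at statement 2, while statement 8 does not read max at all.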
48
Program Dependence Graph (PDG)
A program dependence graph is formed by combining data and control dependencies between nodes.
49
Any problems within this PDG?
Control Dependency
Data Dependency
Slicing Example
1 main( )
2 {
3 int i, sum;
4 sum = 0;
5 i = 1;
6 while(i <= 10)
7 {
8 sum = sum + 1;
9 ++ i;
10 }
11 cout<< sum;
12 cout<< i;
13 }
An Example Program & its slice w.r.t. <12, i>
50
PDG of the Example Program
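[Figure: PDG with node 1 at the top, nodes 3, 4, 5, 6, 11, 12 on the second level, and nodes 8, 9 on the third level. Legend: slice point, control dependence edge, data dependence edge.]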
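Once a PDG exists, a backward static slice is the set of nodes from which the slicing criterion can be reached. The following is a minimal sketch for the example program and the criterion <12, i>. The edge lists are a reading of the example program rather than a copy of the original figure, the declaration at line 3 is given no data dependence edges, and the names CONTROL_EDGES, DATA_EDGES and backward_slice are illustrative.

```python
from collections import defaultdict

# Edges are (source, target): the target depends on the source.
# Assumed from the example program, not taken from the original figure.
CONTROL_EDGES = [(1, 3), (1, 4), (1, 5), (1, 6), (1, 11), (1, 12),
                 (6, 8), (6, 9)]
DATA_EDGES = [(4, 8), (4, 11), (8, 8), (8, 11),          # sum
              (5, 6), (5, 9), (5, 12), (9, 6), (9, 9), (9, 12)]  # i


def backward_slice(criterion, edges):
    """Collect every node that can reach the criterion node in the PDG."""
    preds = defaultdict(set)
    for src, dst in edges:
        preds[dst].add(src)
    in_slice, worklist = set(), [criterion]
    while worklist:
        node = worklist.pop()
        if node in in_slice:
            continue
        in_slice.add(node)
        worklist.extend(preds[node])
    return sorted(in_slice)


print(backward_slice(12, CONTROL_EDGES + DATA_EDGES))  # [1, 5, 6, 9, 12]
```

With these edges, the statements that update sum (4, 8, 11) fall outside the slice for i, which is the point of the example.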
51
52
53
read (n);
i := n;
sum := 0;
product := 1;
while (i > 0)
{
    sum := sum + i;
    product := product * i;
    i := i - 1;
}
write (sum);
write (product);
54
Loops
Static Backward slicing example
55
Forward Slice (static)
56
Note: a forward slice is not necessarily value preserving; the value of the variable in the slice might not be the same as in the original program.
Slicing – Forward Static
57
Objective: determine which parts of a program are affected by a modification to the variable specified in the slicing criterion.
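A forward static slice can be sketched as forward reachability over the same kind of dependence graph. The example below reuses the small C++ example program (lines 1 to 13). Its edge list is an assumed reading of that program, and the names EDGES and forward_slice are illustrative.

```python
from collections import defaultdict

# (source, target): the target depends on the source. Assumed edges for the
# example program with statements 4 (sum = 0), 5 (i = 1), 6 (while),
# 8 (sum = sum + 1), 9 (++i), 11 (cout << sum), 12 (cout << i).
EDGES = [
    (1, 3), (1, 4), (1, 5), (1, 6), (1, 11), (1, 12), (6, 8), (6, 9),  # control
    (4, 8), (4, 11), (8, 8), (8, 11),                                  # data: sum
    (5, 6), (5, 9), (5, 12), (9, 6), (9, 9), (9, 12),                  # data: i
]


def forward_slice(criterion, edges):
    """Collect every node reachable from the criterion node."""
    succs = defaultdict(set)
    for src, dst in edges:
        succs[src].add(dst)
    affected, worklist = set(), [criterion]
    while worklist:
        node = worklist.pop()
        if node in affected:
            continue
        affected.add(node)
        worklist.extend(succs[node])
    return sorted(affected)


print(forward_slice(5, EDGES))  # [5, 6, 8, 9, 11, 12]
print(forward_slice(4, EDGES))  # [4, 8, 11]
```

Changing the initialization of i at line 5 can affect the loop predicate and therefore everything the loop controls, while changing sum at line 4 only reaches the statements that read sum.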
Slicing – Forward Static
58
Slicing – Forward Static
59
60
Controversial statement:
Forward slicing provides more meaningful
insights compared to backward slicing?
Question : Yes – No
Justify your answer
61
Slicing classifications
Types of slices: static, dynamic
Direction of slicing: backward, forward
Executability of slice: executable, closure
Levels of slices: intraprocedural, interprocedural
62
63
Executable vs. non-executable slice
64
65
Dynamic Program Slicing
Dynamic slicing was originally introduced by Korel and Laski in 1988. A dynamic slice is an executable part of a program P whose behavior is identical, for the same program input, to that of the original program with respect to a variable v at some execution position.
[Figure: Program P computes v and its dynamic slice computes v’; v = v’ for a specific program input (execution).]
Slicing Properties: Dynamic Slicing
Computed for a single input scenario
Deterministic instead of probabilistic
Useful for applications that are input driven (debugging, testing)
Slicing criterion <i, p, v> (input i, execution position p, variable v)
66
67
Two Major Dynamic Slicing Categories
Execution trace based algorithms:
Require first the recording of an execution trace and then compute a dynamic slice based on the recorded execution trace.
Non-execution trace based algorithms:
Compute the dynamic slice during run-time without requiring any major recording of the program execution.
68
Program Execution Trace
Sample program:
1 input (n,a);
2 max := a[1];
3 min := a[1];
4 i := 2;
5 s := 0;
6 while i <= n do begin
7   if max < a[i] then begin
8     max := a[i];
9     s := max; end;
10   if min > a[i] then begin
11     min := a[i];
12     s := min; end;
13   output (s);
14   i := i + 2; end;
15 output (max);
16 output (min);
Execution trace for n=2, a=(1,2), written as statement^position:
1^1 input (n,a);
2^2 max := a[1];
3^3 min := a[1];
4^4 i := 2;
5^5 s := 0;
6^6 i <= n
7^7 max < a[i];
8^8 max := a[i];
9^9 s := max;
10^10 min > a[i];
13^11 output (s);
14^12 i := i + 2;
6^13 i <= n
15^14 output (max);
16^15 output (min);
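An execution trace like this one can be recorded by instrumenting the program so that every executed statement logs its number. The sketch below is a hand-instrumented Python transcription of the sample program, assuming the loop condition is i <= n and mapping the 1-based a[i] to a Python list. The names run_with_trace and at are illustrative.

```python
def run_with_trace(n, a):
    """Run the sample max/min program and record the executed statements.

    `a` is a Python list; the program's 1-based a[i] is a[i - 1] here.
    Returns (trace, outputs), where trace is the list of statement numbers
    in execution order.
    """
    trace, outputs = [], []

    def at(stmt):
        trace.append(stmt)

    at(1)                      # input (n, a)
    at(2); mx = a[0]
    at(3); mn = a[0]
    at(4); i = 2
    at(5); s = 0
    while True:
        at(6)                  # loop predicate: i <= n (assumed operator)
        if not i <= n:
            break
        at(7)
        if mx < a[i - 1]:
            at(8); mx = a[i - 1]
            at(9); s = mx
        at(10)
        if mn > a[i - 1]:
            at(11); mn = a[i - 1]
            at(12); s = mn
        at(13); outputs.append(s)
        at(14); i = i + 2
    at(15); outputs.append(mx)
    at(16); outputs.append(mn)
    return trace, outputs


trace, outputs = run_with_trace(2, [1, 2])
print(trace)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 6, 15, 16]
```

The position of each entry in the list (starting at 1) corresponds to the execution position in the trace for n=2, a=(1,2).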
Dynamic Dependency Graph
69
Execution trace based Algorithms
• Original dynamic slicing algorithm presented by Korel and Laski in 1988.
• Based on a recorded execution trace for an input x.
• Traverses the execution trace backwards to derive dynamic data and control dependencies.
• Creates an individual node in the dependence graph for each executed statement.
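The backward traversal can be sketched as follows. This is a simplified dependence closure over a recorded trace, not the original Korel and Laski algorithm: it follows the most recent definition of each used variable and the most recent execution of the controlling predicate, and it returns a set of statements that is not guaranteed to be executable. The def/use sets, the CONTROL map and all names are assumptions made for illustration.

```python
# Recorded trace for n=2, a=(1,2); list index + 1 is the execution position.
TRACE = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 6, 15, 16]

DEFS = {1: {"n", "a"}, 2: {"max"}, 3: {"min"}, 4: {"i"}, 5: {"s"},
        8: {"max"}, 9: {"s"}, 11: {"min"}, 12: {"s"}, 14: {"i"}}
USES = {2: {"a"}, 3: {"a"}, 6: {"i", "n"}, 7: {"max", "a", "i"},
        8: {"a", "i"}, 9: {"max"}, 10: {"min", "a", "i"}, 11: {"a", "i"},
        12: {"min"}, 13: {"s"}, 14: {"i"}, 15: {"max"}, 16: {"min"}}
# Statement -> predicate that directly controls it (simplified).
CONTROL = {7: 6, 10: 6, 13: 6, 14: 6, 8: 7, 9: 7, 11: 10, 12: 10}


def last_before(trace, k, matches):
    """Index of the latest trace entry before k that satisfies `matches`."""
    for j in range(k - 1, -1, -1):
        if matches(trace[j]):
            return j
    return None


def dynamic_slice(trace, position):
    """Statements the action at `position` (1-based) dynamically depends on."""
    visited, worklist = set(), [position - 1]
    while worklist:
        k = worklist.pop()
        if k in visited:
            continue
        visited.add(k)              # one graph node per executed statement
        stmt = trace[k]
        for var in USES.get(stmt, set()):   # dynamic data dependences
            j = last_before(trace, k, lambda s: var in DEFS.get(s, set()))
            if j is not None:
                worklist.append(j)
        parent = CONTROL.get(stmt)          # dynamic control dependence
        if parent is not None:
            j = last_before(trace, k, lambda s: s == parent)
            if j is not None:
                worklist.append(j)
    return sorted({trace[k] for k in visited})


print(dynamic_slice(TRACE, 11))  # [1, 2, 4, 6, 7, 8, 9, 13]
print(dynamic_slice(TRACE, 14))  # [1, 2, 4, 6, 7, 8, 15]
```

Position 11 is output (s) and position 14 is output (max). Because this is a pure dependence closure, its result can be smaller than an executable dynamic slice, which may need additional statements to remain a runnable program.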
Backward Algorithm
70
71
Backward Algorithm
Program execution for n=2, a=(1,2) at statement 15
72
Static and Dynamic slice for variable s
Static Slice:
1 input (n,a);
2 max := a[1];
3 min := a[1];
4 i := 2;
5 s := 0;
6 while i <= n do begin
7   if max < a[i] then begin
8     max := a[i];
9     s := max; end;
10   if min > a[i] then begin
11     min := a[i];
12     s := min; end;
13   output (s);
14   i := i + 2; end;
Dynamic Slice:
1 input (n,a);
2 max := a[1];
4 i := 2;
5 s := 0;
6 while i <= n do begin
7   if max < a[i] then begin
8     max := a[i];
9     s := max; end;
13   output (s);
14   i := i + 2; end;
Dynamic slice for variable s on input n=2, a=(1,2)
Another Dynamic Slice Example
73
Another Dynamic Slice Example
74
Static vs. Dynamic Slice
75
Any problems?
76
Q: How many nodes do we have in a Dynamic Dependency Graph?
A: ???
Q: How many dynamic slices can we compute?
A: ???
Q: Any suggestion on how to reduce the complexity?
A: ???
Dynamic Forward Slicing
77
78
Algorithm based on removable blocks
• Presented by Korel in 1994 and extended later on.
• Execution trace based.
• Overcomes limitations of dependency based algorithms
with respect to unstructured programs.
• Uses data dependency.
• Uses removable blocks instead of control dependencies.
• All blocks are initially marked as removable; the algorithm then identifies the blocks that are not removable.
79
Position  Block  Statement
1         B1     1 input (n,a);
2         B2     2 max := a[1];
3         B3     3 min := a[1];
4         B4     4 i := 2;
5         B5     5 s := 0;
6         B6     6 i <= n
7         B7     7 max < a[i];
8         B8     8 max := a[i];
9         B9     9 s := max;
10        B10    10 min > a[i];
11        B13    13 output (s);
12        B14    14 i := i + 2;
13        B6     6 i <= n
14        B15    15 output (max);
Challenge: Complexity of Dynamic Slicing
Informally, a removable block is the smallest part of the program text that can be removed during a slice computation without violating the syntactic correctness of the program, e.g., loops, if/then/else statements, assignment statements, goto statements, and break statements.
Sample program
Note: the variables a[] and n are omitted to reduce the complexity of the table.
80
81
82
Challenges: Slicing unstructured programs
Explicit control transfer statements (goto, return, exit, break, continue) complicate the construction of the control set.
A conservative solution: if a goto statement has a non-empty relevant set, include the goto and its target in the slice.
An alternative approach: look for labeled statements in the slice, then include the goto statements that branch to these labels.
Challenges: Arrays, Records, and Pointers (mainly static slicing)
Arrays:
Simple approach: treat each array assignment as both a definition and a use. Problem: too conservative.
To determine whether a use of a[g(j)] depends on a definition of a[f(i)], we need to test whether f(i) can be equal to g(j).
This is undecidable in general but can be solved for some expression types.
The solutions are one-sided: they can determine that f(i) and g(j) cannot be equal, but give no information otherwise.
Records:
Easy: treat record.field as record_field.
Pointers:
Hard: requires points-to analysis.
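One well-known one-sided test for array subscripts is the GCD test for linear index expressions. It is shown here as an illustration of the idea and is not necessarily the technique the slides had in mind. The sketch assumes f(i) = a*i + b and g(j) = c*j + d with integer coefficients, and it ignores loop bounds; the function name may_be_equal is illustrative.

```python
from math import gcd


def may_be_equal(a, b, c, d):
    """GCD test for subscripts f(i) = a*i + b and g(j) = c*j + d.

    Returns False only when the two subscripts can never be equal for any
    integers i and j. True means "possibly equal": the test gives no further
    information, and loop bounds are ignored.
    """
    g = gcd(a, c)
    if g == 0:                  # both subscripts are constants
        return b == d
    # a*i - c*j = d - b has an integer solution iff gcd(a, c) divides d - b.
    return (d - b) % g == 0


print(may_be_equal(2, 0, 2, 1))  # False: a[2*i] never overlaps a[2*j + 1]
print(may_be_equal(2, 0, 4, 2))  # True: a[2*i] may overlap a[4*j + 2]
```

A False result lets a slicer drop the dependence between the definition of a[f(i)] and the use of a[g(j)]; a True result forces the conservative assumption that the dependence exists.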
83
84
85
86
87
Example Procedural
88
89
90
91
92