soen 6431 winter 2011 - program slicing updated

SOEN 6431

SOFTWARE MAINTENANCE AND EVOLUTION

Dr. Juergen Rilling

Notes:#7

Program Slicing

Program comprehension

Program comprehension is the study of how software engineers understand programs.

Program comprehension is needed for:
• Debugging
• Code inspection
• Test case design
• Re-documentation
• Design recovery
• Code revisions

Program comprehension process

Involves the use of existing knowledge to acquire new knowledge about a program.

Existing knowledge:
• Programming languages
• Computing environment
• Programming principles
• Architectural models
• Possible algorithms and solution approaches
• Domain-specific information
• Any previous knowledge about the code

New knowledge:
• Code functionality
• Architecture
• Algorithm implementation details
• Control flow
• Data flow

Comprehension techniques

Reading by step-wise abstraction: Determine the function of critical subroutines, then work through the program hierarchy until the function of the program is determined.

Checklist-based reading: Readers are given a checklist that focuses their attention on particular issues within the document. Different readers are given different checklists, so each reader concentrates on different aspects of the document.

Defect-based reading: Defects are categorized and characterized (e.g., data type inconsistency, incorrect functionality, missing functionality). A set of steps (a scenario) is then developed for each defect class to guide the reader in finding those defects.

Perspective-based reading: Similar to defect-based reading, but instead of different defect classes, readers take different roles (tester, designer, and user) to guide their reading.

Sources of variation

Aside from the issue of how comprehension occurs, comprehension performance and effectiveness are affected by many factors:
• Maintainer characteristics
• Program characteristics
• Task characteristics

Maintainer characteristics

Familiarity with code base

Application domain knowledge

Programming language knowledge

Programming expertise

Tool expertise

Individual differences

Program characteristics

Application domain

Programming domain

Quality of problem to be understood

Program size and complexity

Availability and accuracy of documentation

Task characteristics

Task type:
• Experimental: recall, modification
• Perfective, corrective, adaptive, reuse, extension

Task size and complexity

Time constraints

Environmental factors

Models

Mental models: Internal working representation of the software under consideration.

Cognitive models: Theories of the processes by which software engineers arrive at a mental model.

[Figure: Program, Mental Model, Cognitive Model]

Mental models

Static elements:
• Text structure knowledge
• Microstructure
• Chunks (macrostructure)
• Plans (objects)
• Hypotheses

Dynamic elements:
• Strategies (chunking and cross-referencing)

Supporting elements:
• Beacons
• Rules of discourse

Text structure

The program text and its structure:
• Control structure: iterations, sequences, conditional constructs
• Variable definitions
• Calling hierarchies
• Parameter definitions

Microstructure: actual program statements and their relationships.

Chunks

Contain various levels of text structure abstractions.

Also called macrostructure.

Can be identified by a descriptive label.

Can be composed into higher level chunks.

Plans (objects)

Knowledge elements for developing and validating expectations, interpretations, and inferences.

Include causal knowledge about information flow and relationships between parts of a program.

Programming plans: Based on programming concepts.
• Low level: iteration and conditional code segments.
• Intermediate level: searching, sorting, summing algorithms; linked lists and trees.
• High level

Domain plans: All knowledge about the problem area.
• Examples: problem domain objects, system environment, domain-specific solutions and architectures.

Hypotheses

Conjectures that are results of comprehension activities that can take seconds or minutes to occur.

Three types:
• Why: hypothesize the purpose or rationale of a function or design choice.
• How: hypothesize the method for accomplishing a certain goal.
• What: hypothesize classification.

Hypotheses are drivers of cognition. They help to define the direction of further investigation.

Code cognition formulates hypotheses, checks whether they are true or false, and revises them when necessary.

Hypotheses fail for several reasons:
• Code to support a hypothesis cannot be found.
• Confusion arises because one piece of code satisfies different hypotheses.
• Code cannot be explained.

Supporting elements

Beacons: Cues that index into existing knowledge.
• A swap routine can be a beacon for a sorting function.
• Experienced programmers recognize beacons much faster than novice programmers.
• Used commonly in top-down comprehension.

Rules of discourse: Rules that specify programming conventions.
• Examples: coding standards, algorithm implementations, expected use of data structures.

Mental models – dynamic elements

Strategies: Sequences of actions that lead to a particular goal.

Actions: Classify programmer activities, implicitly and explicitly, during a maintenance task.

Episodes: Sequences of actions.

Processes: Aggregations of episodes.

Strategies

Guide the sequence of actions while following a plan to reach a goal.

Match programming plans to code:
• Shallow reasoning: do not perform in-depth analysis; stop upon recognition of familiar idioms and programming plans.
• Deep reasoning: perform detailed analysis.

Mechanisms for understanding:
• Chunking
• Cross-referencing

Chunking

Creates new, higher-level abstraction structures.

Labels replace the detail of the lower level chunks.

Cross-referencing

Map program parts to functional descriptions.

swap:

    temp = a;
    a = b;
    b = temp;

sequential search:

    for (i = 0; i < size; i++)
        if (array[i] == target)
            return true;

Cognitive models

Letovsky

Shneiderman and Mayer

Brooks

Soloway, Adelson and Ehrlich

Pennington

Mayrhauser and Vans (Integrated)

Letovsky model

Shneiderman model

Brooks model

Soloway model

Pennington model

Integrated model

Distributed cognition

Traditional cognitive models deal with the cognitive processes inside one person’s brain.

On real projects, software developers:
• Work in teams
• Can ask people questions
• Can surf the web for answers

How do these affect the cognitive process?

Program Slicing

Program Comprehension - support

Can we learn from other domains?

We might not want to, or be able to, digest a whole

Solution?

What is Program Slicing?

More descriptively, it is a decomposition technique that extracts statements relevant to a particular computation from a program.

Slicing criterion: <s, v>

Program slices as originally introduced by Weiser [1] are known as executable backward static slices.

Basic Idea

Given:
(1) A program
(2) A variable v at some point P in the program

Goal:
Find the part of the program that is responsible for the computation of variable v at point P.

Why Program Slicing?

• Program debugging: that’s how slicing was discovered!
• Testing: reduce the cost of regression testing after modifications (only run those tests that are needed).
• Parallelization
• Integration: merging two programs A and B that both resulted from modifications to BASE.
• Reverse engineering: comprehending the design by abstracting the design decisions out of the source code.
• Software maintenance: changing source code without unwanted side effects.
• Software quality assurance: validate interactions between safety-critical components.

Types of Slicing (Executable)

Static Backward Program Slicing

Static backward program slicing was originally introduced by Weiser in 1982. A static program slice consists of those parts of a program P that could potentially affect the value of a variable v at a point of interest.

For all possible program inputs (executions), the value v’ computed by the static slice equals the value v computed by program P: v = v’.

[Figure: program P computing v, and its static slice computing v’]

Slicing Properties: Static Slicing
• Uses statically available information only.
• Makes no assumptions about the input.
• A minimal (fully accurate) slice cannot be computed in general: the problem is undecidable, by reduction to the halting problem.
• Current static methods can only compute approximations.
• The result may not be useful.

Data Dependence: Represents a data flow (definition-use chain). => There is a data dependence between nodes 2 and 7, but not between nodes 2 and 8.

Control Dependence: The execution of a node depends on the outcome of a predicate node. => There is a control dependence between nodes 6 and 8, but not between nodes 6 and 15.

Creating a PDG

1   input (n,a);
2   max := a[1];
3   min := a[1];
4   i := 2;
5   s := 0;
6   while i <= n do begin
7     if max < a[i] then begin
8       max := a[i];
9       s := max;
      end;
10    if min > a[i] then begin
11      min := a[i];
12      s := min;
      end;
13    output (s);
14    i := i + 2;
    end;
15  output (max);
16  output (min);

Program Dependence Graph (PDG)

A program dependence graph is formed by combining data and control dependencies between nodes.

Any problems within this PDG?

[Figure: PDG of the sample program. Legend: control dependency, data dependency]

Slicing Example

1   main( )
2   {
3     int i, sum;
4     sum = 0;
5     i = 1;
6     while (i <= 10)
7     {
8       sum = sum + 1;
9       ++i;
10    }
11    cout << sum;
12    cout << i;
13  }

An Example Program & its slice w.r.t. <12, i>
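One way to obtain the slice for <12, i> is to treat the program as a dependence graph and collect every statement that statement 12 transitively depends on. The sketch below is illustrative only: the edge lists are hand-derived from the listing above and are assumptions (including the choice to link the declaration on line 3 to the initializations on lines 4 and 5), and `backward_slice` is a hypothetical helper name.

```python
from collections import deque

# Hand-derived dependence edges for the example program (an assumption of this
# sketch). Keys are statement numbers; values are the statements they depend on.
CONTROL_DEPS = {
    3: [1], 4: [1], 5: [1], 6: [1], 11: [1], 12: [1],  # body of main
    8: [6], 9: [6],                                     # body of the while loop
}
DATA_DEPS = {
    4: [3], 5: [3],   # assumed: initializations depend on the declaration
    6: [5, 9],        # while (i <= 10) reads i
    8: [4, 8],        # sum = sum + 1 reads sum
    9: [5, 9],        # ++i reads i
    11: [4, 8],       # cout << sum reads sum
    12: [5, 9],       # cout << i reads i
}

def backward_slice(stmt):
    """Return stmt plus every statement it transitively depends on."""
    seen = {stmt}
    work = deque([stmt])
    while work:
        node = work.popleft()
        for dep in CONTROL_DEPS.get(node, []) + DATA_DEPS.get(node, []):
            if dep not in seen:
                seen.add(dep)
                work.append(dep)
    return sorted(seen)

print(backward_slice(12))  # slice for <12, i>
print(backward_slice(11))  # slice for <11, sum>
```

Under these assumed edges, the slice for <12, i> contains statements 1, 3, 5, 6, 9 and 12; the statements that only compute sum (4, 8 and 11) drop out.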

PDG of the Example Program

[Figure: PDG with nodes 1, 3, 4, 5, 6, 8, 9, 11, 12. Legend: slice point, control dependence edge, data dependence edge]

read(n);
i := n;
sum := 0;
product := 1;
while (i > 0)
{
    sum := sum + i;
    product := product * i;
    i := i - 1;
}
write(sum);
write(product);

Loops

Static Backward slicing example
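The static backward slice of the loop program above can be computed mechanically from each statement's definitions and uses, the control flow, and the controlling predicate. The following is a minimal sketch, not the algorithm used on the slides: the statement numbering 1 to 10, the tables `STMTS` and `CFG`, and the function names are all assumptions made for illustration. It computes reaching definitions to a fixed point and then follows data and control dependences backwards.

```python
# Statement table for the sum/product program. The numbering 1..10 is an
# assumption of this sketch. Each entry:
# (text, defined variables, used variables, controlling predicate or None).
STMTS = {
    1: ("read(n)", {"n"}, set(), None),
    2: ("i := n", {"i"}, {"n"}, None),
    3: ("sum := 0", {"sum"}, set(), None),
    4: ("product := 1", {"product"}, set(), None),
    5: ("while (i > 0)", set(), {"i"}, None),
    6: ("sum := sum + i", {"sum"}, {"sum", "i"}, 5),
    7: ("product := product * i", {"product"}, {"product", "i"}, 5),
    8: ("i := i - 1", {"i"}, {"i"}, 5),
    9: ("write(sum)", set(), {"sum"}, None),
    10: ("write(product)", set(), {"product"}, None),
}
# Control-flow successors, including the loop back edge 8 -> 5.
CFG = {1: [2], 2: [3], 3: [4], 4: [5], 5: [6, 9],
       6: [7], 7: [8], 8: [5], 9: [10], 10: []}

def reaching_definitions():
    """Return, per statement, the (variable, defining statement) pairs reaching it."""
    preds = {s: [] for s in STMTS}
    for s, succs in CFG.items():
        for t in succs:
            preds[t].append(s)
    reach_in = {s: set() for s in STMTS}
    reach_out = {s: set() for s in STMTS}
    changed = True
    while changed:  # iterate until nothing changes (fixed point)
        changed = False
        for s in STMTS:
            new_in = set()
            for p in preds[s]:
                new_in |= reach_out[p]
            defs = STMTS[s][1]
            new_out = {(v, d) for (v, d) in new_in if v not in defs}
            new_out |= {(v, s) for v in defs}
            if new_in != reach_in[s] or new_out != reach_out[s]:
                reach_in[s], reach_out[s] = new_in, new_out
                changed = True
    return reach_in

def static_backward_slice(stmt):
    """Follow data and control dependences backwards from stmt."""
    reach_in = reaching_definitions()
    in_slice, work = {stmt}, [stmt]
    while work:
        s = work.pop()
        _, _, uses, parent = STMTS[s]
        deps = {d for (v, d) in reach_in[s] if v in uses}
        if parent is not None:
            deps.add(parent)
        for d in deps - in_slice:
            in_slice.add(d)
            work.append(d)
    return sorted(in_slice)

print(static_backward_slice(10))  # slice w.r.t. product at write(product)
print(static_backward_slice(9))   # slice w.r.t. sum at write(sum)
```

With this numbering, the slice for write(product) leaves out the two statements that only touch sum, and the slice for write(sum) leaves out the two that only touch product.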

Forward Slice (static)

Note: It is not necessarily value preserving, meaning the value of the variable in the slice might not be the same as in the original program.

Slicing – Forward Static

Objective: determine what parts of a program are affected by a modification to the variable specified in the slicing criterion.
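A forward slice can be sketched as plain forward reachability over dependence edges: start at the modified statement and collect everything that depends on it, directly or transitively. The example below reuses a small sum/product loop program; the statement numbering (1 to 10), the hand-written `DEPENDENTS` table, and the name `forward_slice` are assumptions for illustration.

```python
from collections import deque

# Dependence edges "statement -> statements that depend on it" (data and
# control combined). Numbering and edges are assumptions of this sketch:
#  1 read(n)        2 i := n          3 sum := 0      4 product := 1
#  5 while (i > 0)  6 sum := sum + i  7 product := product * i
#  8 i := i - 1     9 write(sum)     10 write(product)
DEPENDENTS = {
    1: [2],
    2: [5, 6, 7, 8],   # every use of i sees the initial value
    3: [6, 9],
    4: [7, 10],
    5: [6, 7, 8],      # the loop predicate controls the loop body
    6: [6, 9],
    7: [7, 10],
    8: [5, 6, 7, 8],
    9: [],
    10: [],
}

def forward_slice(stmt):
    """Return stmt plus every statement it can affect."""
    seen = {stmt}
    work = deque([stmt])
    while work:
        node = work.popleft()
        for nxt in DEPENDENTS[node]:
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return sorted(seen)

print(forward_slice(3))  # effect of changing "sum := 0"
print(forward_slice(2))  # effect of changing "i := n"
```

Changing the initialization of sum only reaches the statements that compute and print sum, while changing the initialization of i reaches the whole loop and both outputs.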


Controversial statement: Forward slicing provides more meaningful insights compared to backward slicing?

Question: Yes or No? Justify your answer.

Slicing classifications

Types of slices: Static, Dynamic
Direction of slicing: Backward, Forward
Executability of slice: Executable, Closure
Levels of slices: Intraprocedural, Interprocedural

Executable vs. non-executable slice

Dynamic Program Slicing

Dynamic slicing was originally introduced by Korel and Laski in 1988. A dynamic slice is an executable part of a program P whose behavior is identical, for the same program input, to that of the original program with respect to a variable v at some execution position.

For a specific program input (execution), the value v’ computed by the dynamic slice equals the value v computed by program P: v = v’.

[Figure: program P computing v, and its dynamic slice computing v’]

Slicing Properties: Dynamic Slicing
• Computed for a single input scenario.
• Deterministic instead of probabilistic.
• Useful for applications that are input driven (debugging, testing).
• Slicing criterion: <i, p, v>

Two Major Dynamic Slicing Categories

Execution trace based algorithms: First record an execution trace, then compute a dynamic slice based on the recorded execution trace.

Non-execution trace based algorithms: Compute the dynamic slice during run-time without requiring any major recording of the program execution.

Program Execution Trace

Execution trace for n=2, a=(1,2), written as statement^position:

1^1    input (n,a);
2^2    max := a[1];
3^3    min := a[1];
4^4    i := 2;
5^5    s := 0;
6^6    i <= n
7^7    max < a[i];
8^8    max := a[i];
9^9    s := max;
10^10  min > a[i];
13^11  output (s);
14^12  i := i + 2;
6^13   i <= n
15^14  output (max);
16^15  output (min);
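Recording such a trace amounts to instrumenting the program so that every executed statement appends its number to a list. The sketch below hand-translates the sample max/min program into Python for that purpose; the function name `run_traced`, the helper names, and the loop test `i <= n` are assumptions of this sketch.

```python
def run_traced(n, a):
    """Run the sample max/min program and record each executed statement number.

    `a` is a Python list; the program's 1-based a[k] maps to a[k - 1] here.
    The loop test `i <= n` is assumed.
    """
    trace, outputs = [], []

    def at(stmt):
        trace.append(stmt)

    def elem(k):
        return a[k - 1]

    at(1)                      # input (n,a)
    at(2)
    mx = elem(1)               # max := a[1]
    at(3)
    mn = elem(1)               # min := a[1]
    at(4)
    i = 2
    at(5)
    s = 0
    while True:
        at(6)                  # loop predicate is evaluated (and traced) each time
        if not i <= n:
            break
        at(7)
        if mx < elem(i):
            at(8)
            mx = elem(i)
            at(9)
            s = mx
        at(10)
        if mn > elem(i):
            at(11)
            mn = elem(i)
            at(12)
            s = mn
        at(13)
        outputs.append(s)      # output (s)
        at(14)
        i = i + 2
    at(15)
    outputs.append(mx)         # output (max)
    at(16)
    outputs.append(mn)         # output (min)
    return trace, outputs

trace, outputs = run_traced(2, [1, 2])
print(trace)
print(outputs)
```

For n=2 and a=(1,2) the recorded statement sequence has 15 entries, with statement 6 appearing twice (once entering the loop, once leaving it).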

Dynamic Dependency Graph

Execution trace based Algorithms

• Original dynamic slicing algorithm presented by Korel and Laski in 1988.

• Based on a recorded execution trace for an input x.

• Traces the execution trace backwards to derive dynamic data and control dependencies.

• Creates an individual node in the PDG for each executed statement.
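The idea in the bullets above can be sketched as follows: walk the recorded trace once to attach dynamic data and control dependences to every trace position, then follow those dependences backwards from the position of interest. This is a simplified illustration, not the Korel and Laski algorithm itself. The `INFO` table (definitions, uses, controlling predicate per statement), the treatment of the array `a` as a single variable, and the function name are assumptions. Because the sketch only follows dependences, its result need not coincide with an executable dynamic slice, which may retain further statements.

```python
# Per statement of the sample max/min program:
# (defined variables, used variables, controlling predicate or None).
# The array `a` is treated as one variable (a simplifying assumption).
INFO = {
    1: ({"n", "a"}, set(), None),
    2: ({"max"}, {"a"}, None),
    3: ({"min"}, {"a"}, None),
    4: ({"i"}, set(), None),
    5: ({"s"}, set(), None),
    6: (set(), {"i", "n"}, None),
    7: (set(), {"max", "a", "i"}, 6),
    8: ({"max"}, {"a", "i"}, 7),
    9: ({"s"}, {"max"}, 7),
    10: (set(), {"min", "a", "i"}, 6),
    11: ({"min"}, {"a", "i"}, 10),
    12: ({"s"}, {"min"}, 10),
    13: (set(), {"s"}, 6),
    14: ({"i"}, {"i"}, 6),
    15: (set(), {"max"}, None),
    16: (set(), {"min"}, None),
}
# Recorded statement sequence for n=2, a=(1,2).
TRACE = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 6, 15, 16]

def dynamic_slice(trace, position):
    """Statements reached backwards from the 1-based trace position."""
    last_def, last_exec, deps = {}, {}, []
    for pos, stmt in enumerate(trace):
        defs, uses, parent = INFO[stmt]
        # Dynamic data dependence: most recent definition of each used variable.
        d = {last_def[v] for v in uses if v in last_def}
        # Dynamic control dependence: most recent execution of the predicate.
        if parent is not None and parent in last_exec:
            d.add(last_exec[parent])
        deps.append(d)
        for v in defs:
            last_def[v] = pos
        last_exec[stmt] = pos
    seen, work = {position - 1}, [position - 1]
    while work:
        for d in deps[work.pop()] - seen:
            seen.add(d)
            work.append(d)
    return sorted({trace[p] for p in seen})

print(dynamic_slice(TRACE, 11))  # variable s at output(s), trace position 11
print(dynamic_slice(TRACE, 14))  # variable max at output(max), trace position 14
```

Each trace position gets its own dependence set, which mirrors the "one node per executed statement" idea; the slice is reported in terms of statement numbers at the end.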

Backward Algorithm

Backward Algorithm

Program execution for n=2, a=(1,2) at statement 15

Static and Dynamic slice for variable s

Static Slice:

1   input (n,a);
2   max := a[1];
3   min := a[1];
4   i := 2;
5   s := 0;
6   while i <= n do begin
7     if max < a[i] then begin
8       max := a[i];
9       s := max;
      end;
10    if min > a[i] then begin
11      min := a[i];
12      s := min;
      end;
13    output (s);
14    i := i + 2;
    end;

Dynamic Slice:

1   input (n,a);
2   max := a[1];
4   i := 2;
5   s := 0;
6   while i <= n do begin
7     if max < a[i] then begin
8       max := a[i];
9       s := max;
      end;
13    output (s);
14    i := i + 2;
    end;

Dynamic slice for variable s on input n=2, a=(1,2).

Another Dynamic Slice Example

Static vs. Dynamic Slice

Any problems?

Q: How many nodes do we have in a Dynamic Dependency Graph?

A: ???

Q: How many dynamic slices can we compute?

A: ???

Q: Any suggestion on how to reduce the complexity?

A: ???

Dynamic Forward Slicing

Algorithm based on removable blocks

• Presented by Korel in 1994 and extended later on.

• Execution trace based.

• Overcomes limitations of dependency based algorithms with respect to unstructured programs.

• Uses data dependency.

• Uses removable blocks instead of control dependencies.

• All blocks are initially marked as removable; the algorithm then identifies the blocks that are not removable.

Block  Statement
B1     1   input (n,a);
B2     2   max := a[1];
B3     3   min := a[1];
B4     4   i := 2;
B5     5   s := 0;
B6     6   i <= n
B7     7   max < a[i];
B8     8   max := a[i];
B9     9   s := max;
B10    10  min > a[i];
B13    13  output (s);
B14    14  i := i + 2;
       6   i <= n
B15    15  output (max);

[Table: columns for trace positions 1 to 14; cell contents not recoverable]

Challenge: Complexity of Dynamic Slicing

A removable block is, informally, the smallest part of program text that can be removed during a slice computation without violating the syntactical correctness of the program, e.g., loops, if/then/else, assignment statements, goto statements, and break statements.

Sample program

Please note: the variables a[] and n are omitted to reduce the complexity of the table.

Challenges: Slicing unstructured programs

• Explicit control transfer statements (goto, return, exit, break, continue) complicate the construction of the control set.

• A conservative solution: if a goto statement has a non-empty relevant set, include the goto and its target in the slice.

• An alternative approach: look for labeled statements in the slice, then include the goto statements that branch to these labels.

Challenges: Arrays, Records, and Pointers (mainly static slicing)

Arrays:
• Simple approach: treat each array assignment as both a definition and a use. Problem: too conservative.
• To determine whether a use of a[g(j)] depends on a definition of a[f(i)], we need to test whether f(i) can be equal to g(j).
• This is undecidable in general but can be solved for some expression types.
• The solutions are one-sided: they can determine that f(i) and g(j) cannot be equal, but give no information otherwise.

Records: Easy: treat record.field as record_field.

Pointers: Hard: requires points-to analysis.
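For the array case, one well-known one-sided check for affine subscripts is the GCD test; it is named here as an illustration and is not taken from the slides. For subscripts f(i) = a*i + b and g(j) = c*j + d over integers i and j, equality is only possible if gcd(a, c) divides d - b. A failed test proves the two accesses never touch the same element; a passed test gives no guarantee either way (loop bounds are ignored). The function name `may_overlap` is hypothetical.

```python
from math import gcd

def may_overlap(a, b, c, d):
    """GCD test for subscripts a*i + b and c*j + d (i, j arbitrary integers).

    Returns False when the subscripts can never be equal, so no data
    dependence exists between the two array accesses. Returns True when
    equality cannot be ruled out; this is conservative, not a proof.
    """
    g = gcd(a, c)
    if g == 0:              # both subscripts are constants b and d
        return b == d
    return (d - b) % g == 0

# a[2*i] versus a[2*j + 1]: even versus odd indices never meet.
print(may_overlap(2, 0, 2, 1))
# a[2*i] versus a[4*j + 2]: cannot be ruled out, so assume a dependence.
print(may_overlap(2, 0, 4, 2))
```

A slicer could use such a check to drop the dependence edge in the first case while conservatively keeping it in the second.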

Example Procedural
