1
Integrating Influence Mechanisms into Impact Analysis for Increased Precision
Ben Breech
Lori Pollock, Mike Tegtmeyer
University of Delaware, Army Research Lab
2
Background: Impact Analysis
• If I change function C, what other functions could be affected (impacted)?
• Uses
  • Regression testing
  • Planning changes
[Figure: example call graph over main, A, B, C, D, E, F, G, H, with C as the changed function]
3
Background: Challenges
• Conservative -- impacts shouldn’t be missed
• Precision -- impacts should be as close as possible to “true” results
• Efficiency -- analysis should be quick
4
Background: Impact Analysis Kinds
5
Background: Impact Analysis Kinds
Source Code -> Static Analyzer -> Impacts
• Obtain conservative results
+ Accounts for all possible inputs and behaviors
- Can give very large impact sets
Bohner and Arnold, 1996; Ryder and Tip, 2001; Turver and Munro, 1994
6
Background: Impact Analysis Kinds
Input -> Execute Instrumented Program -> Dynamic Information -> Post Exec Analysis -> Impacts
- Not conservative -- results depend on input
+ Gives impacts related to program use
Orso et al., 2003; Law and Rothermel, 2003; Breech et al., 2005; Apiwattanapong et al., 2005
7
Background: Impact Analysis Kinds
Statement Level
Method A: stmt a1, stmt a2, …
Method B: stmt b1, stmt b2, …
• Expensive
• Precise
• Technique: slicing
8
Background: Impact Analysis Kinds
Method Level
Method A, Method B
• Less expensive
• Less precise
• Technique: call graph traversals
9
Our Approach
• Augment method level analysis with some statement level information
• Use influence graph to capture how changes propagate
• Both static and dynamic analysis
• Goal: more precise results without large overhead
10
Example of Dynamic Impact Analysis
PathImpact, Law and Rothermel, ICSE 03
[Figure: call graph over main, A, B, C, D, E, F, G, H (call graph shown only for demonstration)]
Find impact of G:
Trace: main G r A C F C r r D r E r D r r r x
Impact set: main, A, C, D, E, F, G
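The trace walk in this example can be sketched in C. This is a simplified illustration, not the PathImpact implementation: it assumes an uncompressed trace held as an array of tokens, with "r" marking a return and "x" marking program exit, whereas PathImpact itself works on a compressed trace. The names ImpactSet, add_unique, in_set, and path_impact are hypothetical. The sketch adds every function on the call stack when the changed function is first entered, plus every function entered afterwards.

```c
#include <string.h>

#define MAX_FUNCS 64
#define MAX_DEPTH 256

/* Hypothetical result type: distinct function names in the impact set. */
typedef struct {
    const char *names[MAX_FUNCS];
    int count;
} ImpactSet;

int in_set(const ImpactSet *s, const char *name) {
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0) return 1;
    return 0;
}

void add_unique(ImpactSet *s, const char *name) {
    if (!in_set(s, name) && s->count < MAX_FUNCS)
        s->names[s->count++] = name;
}

/* Assumed trace encoding: "r" = return, "x" = exit, anything else = entry
   into the named function. Returns 0 on success, -1 on a malformed trace. */
int path_impact(const char **trace, int n, const char *changed, ImpactSet *out) {
    const char *stack[MAX_DEPTH];
    int depth = 0;
    int seen = 0;               /* has the changed function executed yet? */
    out->count = 0;
    for (int i = 0; i < n; i++) {
        const char *t = trace[i];
        if (strcmp(t, "x") == 0) break;
        if (strcmp(t, "r") == 0) {
            if (depth == 0) return -1;
            depth--;
            continue;
        }
        if (depth == MAX_DEPTH) return -1;
        stack[depth++] = t;
        if (!seen && strcmp(t, changed) == 0) {
            /* Functions the change can return into, plus the change itself. */
            seen = 1;
            for (int d = 0; d < depth; d++) add_unique(out, stack[d]);
        } else if (seen) {
            /* Anything called after the change executed. */
            add_unique(out, t);
        }
    }
    return 0;
}
```

Called with the trace above and "G" as the changed function, the sketch reports the seven functions main, G, A, C, F, D, E and leaves out B and H, which never execute.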
11
Can we improve results?
• PathImpact is “safe”
  • Doesn’t miss impacts among exec’d methods
• Purely method level
• What can statement level info add?
12
How do changes propagate?
int G (void) {
    int x, y;
    /* …. */
    if (x == 0)
        y = 10;
    else
        y = -10;
    return y;
}

int G (void) {
    int x, y;
    /* …. */
    if (x != 0)
        y = 10;
    else
        y = -10;
    return y;
}
13
Insight: Propagation of Changes
• Change propagates through variables
• Change propagates into methods by parameters/return values
• Global variables can propagate change
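The three channels listed above can be illustrated with a small C fragment. All function and variable names here are hypothetical and chosen only for illustration. The last function, isolated, has no parameters, no return value, and touches no globals, so under this model a change to its body has no channel into other functions.

```c
/* All names here are hypothetical; they only illustrate the channels. */

int scale = 1;                       /* global variable */

/* Channel 1: a change to this body reaches callers via the return value. */
int get_offset(void) { return 10; }

/* Channel 2: a change reaches callers via a by-reference parameter. */
void fill_offset(int *out) { *out = 10; }

/* Channel 3: a change reaches every reader of the global. */
void set_scale(void) { scale = 10; }

/* No parameters, no return value, no globals: under this model a change
   here cannot propagate to other functions. */
void isolated(void) { int tmp = 10; (void)tmp; }

int caller(void) {
    int a = get_offset();            /* depends on get_offset's body  */
    int b = 0;
    fill_offset(&b);                 /* depends on fill_offset's body */
    set_scale();                     /* modifies the shared global    */
    isolated();                      /* contributes nothing to result */
    return a + b + scale;
}
```

Changing the constant in any of the first three functions changes what caller returns, while editing isolated does not.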
14
Example Revisited
[Figure: call graph over main, A, B, C, D, E, F, G, H]
Find impact of G:
Trace: main G r A C F C r r D r E r D r r r x
Impact set (PathImpact): main, A, C, D, E, F, G
Assume:
• No global variables in program
• C has no parameters/return vals
15
Basis of Approach: Influence Graph
Formalize ideas in influence graph
• Nodes are methods
• Edge p -> q: change can propagate from p to q
• 3 types of edges
  • Parameters/returns by value
  • Parameters/returns by reference
  • Global variables
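One possible in-memory representation of such a graph is an adjacency matrix whose entries are bitmasks of edge kinds. This is a minimal sketch under assumed names (InfluenceGraph, ig_init, ig_add_edge, ig_can_propagate, and the EDGE_* constants are all hypothetical); the slides do not say how the graph is actually stored.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_METHODS 32

/* The three edge kinds, as bit flags so one p -> q pair can carry several. */
enum { EDGE_VALUE = 1, EDGE_REF = 2, EDGE_GLOBAL = 4 };

/* Hypothetical layout: edge[p][q] != 0 means change can propagate p -> q. */
typedef struct {
    int n;                                        /* number of methods */
    unsigned char edge[MAX_METHODS][MAX_METHODS];
} InfluenceGraph;

void ig_init(InfluenceGraph *g, int n) {
    memset(g, 0, sizeof *g);
    g->n = (n < 0) ? 0 : (n > MAX_METHODS ? MAX_METHODS : n);
}

static bool ig_valid(const InfluenceGraph *g, int p, int q) {
    return p >= 0 && q >= 0 && p < g->n && q < g->n;
}

/* Record that change can propagate from p to q through the given kind. */
void ig_add_edge(InfluenceGraph *g, int p, int q, unsigned kind) {
    if (ig_valid(g, p, q))
        g->edge[p][q] |= (unsigned char)kind;
}

/* True when any kind of edge p -> q exists. */
bool ig_can_propagate(const InfluenceGraph *g, int p, int q) {
    return ig_valid(g, p, q) && g->edge[p][q] != 0;
}
```

Edges are directed, so adding p -> q says nothing about q -> p; a by-reference parameter would typically need an edge in each direction.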
16
Influence Graph Example
[Figure: influence graph over main, A, B, C, D, E, F, G, H, with edges labeled v or r]
Signatures shown in the figure:
int G (void)
int H (int x)
void A (struct_type &x)
void C (void)
Legend:
r: ref. edge
v: val. edge
(none): call edge (demo only)
17
Conservative Assumptions
Building influence graph
• Ignore const
• Ref. parameters always modified
• Function pointers -> edges to all functions with address taken
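A sketch of how these assumptions might turn a callee's signature into edges for one call site. This is one plausible reading, not the authors' gcc pass: the Signature and CallEdges types and edges_for_call are hypothetical, and function pointers and globals are left out. By-value parameters give a caller-to-callee edge, by-value returns give a callee-to-caller edge, and reference parameters give edges in both directions because they are assumed always modified (const is ignored).

```c
#include <stdbool.h>

enum { EDGE_VALUE = 1, EDGE_REF = 2 };

/* Hypothetical summary of a callee's signature. */
typedef struct {
    bool has_value_params;
    bool has_ref_params;   /* pointer/reference params; const is ignored */
    bool returns_value;
    bool returns_ref;
} Signature;

/* Edge kinds contributed by one call site, in each direction. */
typedef struct {
    unsigned caller_to_callee;
    unsigned callee_to_caller;
} CallEdges;

CallEdges edges_for_call(const Signature *callee) {
    CallEdges e = {0, 0};
    if (callee->has_value_params)
        e.caller_to_callee |= EDGE_VALUE;
    if (callee->has_ref_params) {
        /* Conservative: assume the callee both reads and writes it. */
        e.caller_to_callee |= EDGE_REF;
        e.callee_to_caller |= EDGE_REF;
    }
    if (callee->returns_value)
        e.callee_to_caller |= EDGE_VALUE;
    if (callee->returns_ref)
        e.callee_to_caller |= EDGE_REF;
    return e;
}
```

Under this reading, a function such as void C (void) contributes no parameter or return edges at all, which is what lets the analysis stop propagation at its call sites.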
18
Dynamic Impact Analysis with the Influence Graph
• Generate influence graph statically
• Perform impact analysis for function of interest
• Intuition: Check influence graph at each call/return to see if change can propagate
• Global variables require more bookkeeping (details in paper)
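The intuition above can be sketched as a single pass over the trace. This is a simplified reading, not the algorithm from the paper: it ignores global variables, assumes integer method ids, and uses hypothetical names throughout (InfluenceGraph, can_flow, TRACE_RET, TRACE_EXIT, influence_impact). A method becomes impacted when it is the changed method, when an impacted caller calls it across an edge, or when an impacted callee returns to it across an edge.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_METHODS 32
#define MAX_DEPTH   256
#define TRACE_RET   (-1)   /* assumed token for a return   */
#define TRACE_EXIT  (-2)   /* assumed token for program exit */

/* can_flow[p][q]: the influence graph has some edge p -> q. */
typedef struct {
    bool can_flow[MAX_METHODS][MAX_METHODS];
} InfluenceGraph;

/* Fills impacted[] for the changed method; returns -1 on a bad trace. */
int influence_impact(const InfluenceGraph *g, const int *trace, int n,
                     int changed, bool impacted[MAX_METHODS]) {
    int stack[MAX_DEPTH];
    int depth = 0;
    memset(impacted, 0, MAX_METHODS * sizeof impacted[0]);
    for (int i = 0; i < n && trace[i] != TRACE_EXIT; i++) {
        int t = trace[i];
        if (t == TRACE_RET) {
            if (depth == 0) return -1;
            int callee = stack[--depth];
            if (depth > 0) {
                int caller = stack[depth - 1];
                /* Return: can the change flow back to the caller? */
                if (impacted[callee] && g->can_flow[callee][caller])
                    impacted[caller] = true;
            }
        } else {
            if (t < 0 || t >= MAX_METHODS || depth == MAX_DEPTH) return -1;
            if (t == changed) {
                impacted[t] = true;
            } else if (depth > 0) {
                int caller = stack[depth - 1];
                /* Call: can the change flow into the callee? */
                if (impacted[caller] && g->can_flow[caller][t])
                    impacted[t] = true;
            }
            stack[depth++] = t;
        }
    }
    return 0;
}
```

Unlike the purely method-level walk, a method that executes after the change is only added when the graph says the change can actually reach it, so the result cannot be larger than the set of methods that executed.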
19
Example
[Figure: influence graph over main, A, B, C, D, E, F, G, H with v and r edge labels; nodes C, D, and E are marked]
Find impact of C:
Trace: main G r A C F C r r D r E r D r r r x
r: ref. edge
v: val. edge
20
Research Questions
• Is the impact analysis cost reasonable?
• Does the dynamic analysis give more precision than current techniques?
21
Methodology
• Analyzed 8 medium-sized C programs (5,000 - 40,000 LoC)
• Created gcc passes to build influence graph (little overhead)
• Generated traces ranging from 8 KB to 9 GB
22
Timing Results
• Influence Graph - little overhead to build
• Impact analysis on 9 GB trace: ~ 7 mins
  • PathImpact: ~ 5 mins
• Reasonable time for analysis
23
Precision Results
Program        avg          min   max
008.espresso   5.8 (3.8%)   1     8
099.go         1.0 (0.3%)   0     307
130.li         4.9 (4.6%)   2     117
132.ijpeg      0 (0%)       0     0
147.vortex     2.7 (1%)     1     363
164.gzip       1.0 (2%)     1     2
300.twolf      1.6 (2%)     0     3
Space          4.3 (18%)    1     35
• Impact sets are always a subset of PathImpact’s
• ~ 4% savings
24
Why so little precision gain?
• Changes propagated due to parameters/returns (esp. reference)
• Conservatively assumed all reference vars were modified
• Large percentage of functions with ref. vars
• Performed only basic analysis to build influence graph
25
Better Results? (new work!)
• More static program analysis -> better influence graph -> better precision
• Danger: more static analysis makes the influence graph more expensive to build
• Analyzed small program by hand
  • Got 10% gain when using def-use pairs
26
Summary and Future Work
• Can improve method level impact analysis by using some statement level information
• Reasonable impact analysis time
• Reasonable to spend time building better influence graph?