1
ESEC/FSE-99
Data-Flow Analysis of Program Fragments
Atanas Rountev (1), Barbara G. Ryder (1), William Landi (2)
(1) Department of Computer Science, Rutgers University
(2) Siemens Corporate Research
http://www.prolangs.rutgers.edu
Funded by NSF grants CCR-9501761, CCR-9804065 and Siemens Corporate Research
2
Overview
• Motivation
• Theoretical model
• Application for pointer alias analysis
• Experimental results
3
Data-Flow Analysis
• Information about program behavior
• Defines:
  – Graph for the control-flow structure
  – Lattice $L$ of data-flow values
  – Transfer functions $f_i : L \to L$
• Flow sensitivity: propagate data-flow values by respecting execution order of statements
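The three ingredients above (graph, lattice, transfer functions) can be illustrated with a minimal sketch of a forward worklist solver. It assumes a powerset lattice whose join is set union and a reaching-definitions style example; the names `solve_forward`, `succ`, `transfer`, `d1`, and `d2` are hypothetical and not taken from the talk.

```python
from collections import deque

def solve_forward(succ, transfer, entry, entry_value=frozenset()):
    """Forward may-analysis on a powerset lattice (join = set union)."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    out = {n: frozenset() for n in succ}      # start at the bottom element
    work = deque(succ)
    while work:
        n = work.popleft()
        # Join the values flowing in along all incoming edges.
        incoming = frozenset().union(*(out[p] for p in pred[n]))
        if n == entry:
            incoming |= entry_value
        new_out = frozenset(transfer[n](incoming))
        if new_out != out[n]:
            out[n] = new_out
            work.extend(succ[n])              # successors must be revisited
    return out

# Hypothetical CFG: node 1 defines x (d1), node 2 branches,
# node 3 redefines x (d2), node 4 is the merge point.
succ = {1: [2], 2: [3, 4], 3: [4], 4: []}
identity = lambda facts: facts
transfer = {
    1: lambda facts: (facts - {"d2"}) | {"d1"},
    2: identity,
    3: lambda facts: (facts - {"d1"}) | {"d2"},
    4: identity,
}
solution = solve_forward(succ, transfer, entry=1)
print(sorted(solution[4]))
```

Because values are propagated along control-flow edges, node 3 sees only `d2` while the merge node 4 sees both definitions; this is the flow-sensitive behavior the slide refers to.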
4
Limitations of Whole-Program Analysis
• Traditionally designed as whole-program analysis
• Precise analyses do not scale for large programs
• Incomplete programs cannot be analyzed: e.g., programs with libraries
• Information may be needed only for a small part of a large program
5
Fragment Data-Flow Analysis
• Idea: analyze a program fragment instead of a whole program
• Use summary information about the rest of the program
• Advantages:
  – Analyze fragments of large programs
  – Analyze incomplete programs
  – Analyze only the “interesting part” of the program
6
Questions
• What is the analysis structure?
• What is the relationship to whole-program analysis?
• How to define and ensure safety?
• What factors affect analysis cost and precision?
7
Model of Whole-Program Analysis
• Consider only flow-sensitive analysis
• Interprocedural control-flow graph:
• Lattice L of data-flow values
• Node transfer functions $f_i : L \to L$
• Solutions and safety
[Figure: interprocedural control-flow graph of a procedure, with Entry, Exit, Call, and Return nodes]
8
Fragment Analysis Structure
• Input: fragment + whole-program information
• Graph, lattice $L'$, node transfer functions
• Boundary nodes: entry, call, return
• Boundary entry: summary value from $L'$
• Boundary call: summary function $f : L' \to L'$
[Figure: program fragment with Entry, Exit, Call, and Return nodes]
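The role of the two kinds of boundary summaries can be sketched as follows. This is an illustration under stated assumptions, not the analysis from the talk: the lattice is a powerset of points-to style facts, a boundary call and its return are collapsed into one node, and all names (`analyze_fragment`, `SUMMARY`, `call_g`) are hypothetical.

```python
def analyze_fragment(succ, transfer, boundary_entries, boundary_calls):
    """Fixed-point iteration over the fragment's own nodes only.

    boundary_entries: node -> summary value assumed to hold on entry
    boundary_calls:   call node -> summary function standing in for a
                      callee that lies outside the fragment
    """
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    out = {n: frozenset() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            incoming = frozenset().union(*(out[p] for p in pred[n]))
            if n in boundary_entries:
                incoming |= boundary_entries[n]       # summary value
            # A boundary call uses its summary function instead of the callee body.
            f = boundary_calls.get(n, transfer.get(n, lambda v: v))
            new_out = frozenset(f(incoming))
            if new_out != out[n]:
                out[n], changed = new_out, True
    return out

# Hypothetical whole-program summary: p may point to x or to y.
SUMMARY = frozenset({("p", "x"), ("p", "y")})

succ = {"entry": ["assign"], "assign": ["call_g"], "call_g": ["exit"], "exit": []}
transfer = {
    # p = &x: kill every fact about p, then record that p points to x.
    "assign": lambda v: frozenset(f for f in v if f[0] != "p") | {("p", "x")},
}
result = analyze_fragment(
    succ,
    transfer,
    boundary_entries={"entry": SUMMARY},
    # Conservative stand-in for the unknown callee g: it may re-establish
    # anything the summary allows.
    boundary_calls={"call_g": lambda v: v | SUMMARY},
)
print(sorted(result["assign"]), sorted(result["exit"]))
```

Inside the fragment the assignment still sharpens the information; precision is lost only where the analysis has to fall back on a summary.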
9
Fragment Analysis Safety
• All possible containing programs: $p \in \mathit{Progs}$
• Abstraction relation $\alpha_p \subseteq L_p \times L'$
  If $(x, x') \in \alpha_p$, then $x'$ safely abstracts $x$
• A safe solution safely abstracts the most precise whole-program solution for every $p$
• Sufficient requirements for analysis safety: transfer functions, boundary summaries
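One way to write the safety condition symbolically is sketched below. The symbols $N'$ (the set of fragment nodes), $S'(n)$ (the fragment solution at node $n$), and $S^{*}_{p}(n)$ (the most precise solution at $n$ in the containing program $p$) are assumed notation introduced here for illustration; they do not appear on the slide.

```latex
% Assumed notation: N' = fragment nodes, S' = fragment solution,
% S^*_p = most precise whole-program solution for containing program p.
\[
\forall p \in \mathit{Progs},\ \forall n \in N' :\quad
\bigl(S^{*}_{p}(n),\; S'(n)\bigr) \in \alpha_p
\]
```

Read this way, safety is a statement about every program that could contain the fragment, which is why the sufficient requirements are placed on the transfer functions and on the boundary summaries.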
10
An Application
• Initial whole-program flow-insensitive analysis
• Fragment analysis input
  – Flow-insensitive solution
  – Call graph
• Use flow-insensitive solution at the boundary
• Two fragment pointer alias analyses
11
Pointer Alias Analysis
• Aliases refer to the same memory location
Example: after p = &x; the pair (*p, x) is an alias pair
• Whole-program flow- and context-sensitive analysis [Landi-Ryder]
• Fixed and non-fixed locations: x, s.f, *p, p->g
• Resolution of through-deref assignments
Example: *p = 0;
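Resolving a through-deref assignment means determining which fixed locations the statement may modify. A minimal sketch, assuming the alias solution at the statement is available as a set of unordered pairs and the set of fixed locations is given; the function name and the sample data are hypothetical.

```python
def modified_fixed_locations(target, alias_pairs, fixed):
    """Fixed locations that an assignment through `target` may modify."""
    modified = set()
    for a, b in alias_pairs:
        # Alias pairs are unordered, so check both orientations.
        if a == target and b in fixed:
            modified.add(b)
        if b == target and a in fixed:
            modified.add(a)
    return modified

# Hypothetical alias solution holding just before the statement *p = 0;
aliases = {("*p", "x"), ("s.f", "*p"), ("*q", "y")}
fixed = {"x", "y", "s.f"}
resolved = modified_fixed_locations("*p", aliases, fixed)
print(sorted(resolved))
```

A smaller result set means the assignment was resolved more precisely.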
12
Fragment Alias Analyses
• Input: whole-program flow-insensitive solution
  – Flow-insensitive analysis: almost linear time [Steensgaard, Zhang-Ryder-Landi]
• Basic analysis: assumptions at boundary
• Extended analysis: include called procedures; no boundary calls
13
Experiments
• Sun Sparc-20, 75 MHz, 352 MB
• 6 data programs: 8K - 25K LOC
• 12 fragments:
  – Cohesive subsets of procedures implementing certain functionality
  – Size: 2%-22% of program size, median 7%
• Resolved through-deref assignments
  – Metric: average number of modified fixed locations
14
Analysis Precision
[Figure: bar chart of analysis precision by fragment number (1-10); vertical axis from 0 to 70; series: Flow-insensitive, Basic, Extended]
15
Analysis Time
• Flow-insensitive analysis
  – Range: 2-9 s
  – Median: 7 s
• Basic analysis
  – Range: 18-99 s
  – Median: 52 s
• Extended analysis
  – Range: 18-187 s
  – Median: 85 s
16
Summary
• Fragment analysis as an alternative to whole-program analysis
• Theoretical issues of safety and feasibility
• Application using inexpensive whole-program analysis
• Initial experiments
  – Extended analysis: significant precision increase at a practical cost
• Ongoing work: scalability, incomplete programs
17
The New Lattice
• What is the set of names?
• Number of names should not depend on the size of the whole program
• Each whole-program name is:
  – preserved
  – ignored
  – represented by a placeholder
• One placeholder name per equivalence class
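The three-way treatment of names can be sketched as a mapping from whole-program names to fragment names. The slide does not say how a name is classified or how equivalence classes are formed, so the inputs `visible`, `relevant`, and `class_of` below are assumptions supplied by the caller, and the `ph_` prefix is a hypothetical naming scheme.

```python
def build_name_map(names, visible, relevant, class_of):
    """Map each whole-program name to a fragment name, or to None if ignored."""
    mapping = {}
    for name in names:
        if name in visible:
            mapping[name] = name                      # preserved as-is
        elif name not in relevant:
            mapping[name] = None                      # ignored
        else:
            mapping[name] = f"ph_{class_of[name]}"    # one placeholder per class
    return mapping

# Hypothetical whole-program names and classification.
names = ["x", "buf", "tmp", "a1", "a2", "b1"]
visible = {"x", "buf"}                    # assumed: names used in the fragment
relevant = {"x", "buf", "a1", "a2", "b1"} # assumed: names that can affect it
class_of = {"a1": 0, "a2": 0, "b1": 1}    # assumed equivalence classes

name_map = build_name_map(names, visible, relevant, class_of)
fragment_names = {v for v in name_map.values() if v is not None}
print(sorted(fragment_names))
```

In this sketch the number of fragment names is bounded by the visible names plus the number of equivalence classes, not by the number of whole-program names, which matches the requirement stated on the slide.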
18
Fragment Sizes
[Figure: bar chart of fragment sizes by fragment number (1-12); vertical axis from 0 to 25]