a probabilistic pointer analysis for speculative optimizations jeff dasilva greg steffan jeff...
TRANSCRIPT
A Probabilistic Pointer Analysis for Speculative
Optimizations
A Probabilistic Pointer Analysis for Speculative
Optimizations
Jeff DaSilva Greg Steffan
Jeff DaSilva Greg Steffan
Electrical and Computer Engineering
University of Toronto
Electrical and Computer Engineering
University of TorontoASPLOS XIIASPLOS XII
ASPLOS XII 2 University Of Toronto
Pointers Impede OptimizationPointers Impede Optimization
foo(int *a) { … while(…) {
x = *a; … }}
Loop InvariantCode Motion
Parallelize
Can pointer analysis help?
ASPLOS XII 3 University Of Toronto
Pointer AnalysisPointer Analysis
Do pointers a and b point to the same location? Repeat for every pair of pointers at every program point
*a = ~ ~ = *b
*a = ~ ~ = *b
Definitely Not
Definitely
Maybe
PointerAnalysis
optimize
MaybeHow can we optimize cases?
ASPLOS XII 4 University Of Toronto
Lets SpeculateLets Speculate Implement a potentially unsafeunsafe optimization
VerifyVerify and RecoverRecover if necessary
int *a, x;…while(…){
x = *a; …}
a is probably loop invariant
int *a, x, tmp;…tmp = *a;while(…){
x = tmp; …}
<verify, recover?><verify, recover?>
ASPLOS XII 5 University Of Toronto
Data Speculative OptimizationsData Speculative Optimizations EPIC Instruction sets
Support for speculative load/store instructions (eg. Itanium)
Speculative compiler optimizations Dead store elimination, redundancy elimination, copy propagation,
strength reduction, register promotion
Thread-level speculation (TLS) Hardware and compiler support for speculative parallel threads
Transactional programming Hardware and software support for speculative parallel transactions
Heavy reliance on detailed profile feedback
ASPLOS XII 6 University Of Toronto
Can We Quantify ?Can We Quantify ?
Estimate the potential benefit for speculating:
Speculate?
ExpectedExpectedspeedupspeedup(if successful)(if successful)
RecoveryRecoverypenaltypenalty
(if unsuccessful)(if unsuccessful)
OverheadOverheadfor verifyfor verify
Ideally would be a probabilityMaybe
Maybe
Maybe
ProbabilityProbabilityof successof success
ASPLOS XII 7 University Of Toronto
Conventional Pointer AnalysisConventional Pointer Analysis
Do pointers a and b point to the same location? Do this for every pair of pointers at every program point
*a = ~ ~ = *b
*a = ~ ~ = *b
Definitely Not
Definitely
Maybe
PointerAnalysis
optimize
Probabilistic Pointer Analysis Probabilistic Pointer Analysis (PPA)(PPA)
PPApp = 0.0
pp = 1.0
0.0 < pp < 1.0
With what probability pp, do pointers a and b point to the same location? Repeat for every pair of pointers at every program point
optimize
Potential bonus: PPA doesn’t have to be safe
ASPLOS XII 8 University Of Toronto
PPA Research ObjectivesPPA Research Objectives
Accurate points-to probability informationat every static pointer dereference
Scalable analysis Goal: entire SPEC integer benchmark suite
Understand scalabilityscalability/accuracyaccuracy tradeoff through flexible static memory model
Improve our understanding of programs
ASPLOS XII 9 University Of Toronto
Related WorkRelated WorkPointers? SPEC INT? Probabilities?
Ju. et.al. Chen et.al. Fernandez & Espasa Hot code Bhowmik & Franklin Select
Apps Our PPA
ASPLOS XII 10 University Of Toronto
Algorithm Design ChoicesAlgorithm Design Choices FixedFixed
Bottom Up / Top Down Approach LinearLinear transfer functions (for scalability)
One-level contextcontext and flowflow sensitive FlexibleFlexible
Edge profiling (or static prediction)
Safe (or unsafe)
Field sensitive (or field insensitive)
ASPLOS XII 11 University Of Toronto
Traditional Points-To GraphTraditional Points-To Graphint x, y, z, *b = &x;void foo(int *a) {
if(…) b = &y;
if(…) a = &z; else(…) a = b; while(…) { x = *a; … }}
y UND
a
z
b
x
= pointer
= pointed at
Definitely
Maybe
=
=
Results are inconclusive
ASPLOS XII 12 University Of Toronto
Probabilistic Points-To GraphProbabilistic Points-To Graphint x, y, z, *b = &x;void foo(int *a) {
if(…) b = &y;
if(…) a = &z; else a = b; while(…) { x = *a; … }}
y UND
a
z
b
x
0.10.1 taken taken((edge profileedge profile))
0.20.2 taken taken((edge profileedge profile))
= pointer
= pointed at
p = 1.0
0.0<p< 1.0
=
=p
0.10.9
0.72
0.08
0.2
Results provide more information
ASPLOS XII 13 University Of Toronto
LOLLOLLLIPIPOOPP Our PPA AlgorithmOur PPA Algorithm
LLinear inear
OOne -ne -
LLevel evel
IInterprocedural nterprocedural
PProbabilistic robabilistic
PPointer Analysisointer Analysis
ASPLOS XII 14 University Of Toronto
FundamentalsFundamentals
Location sets (Wilson & Lam)One or more mem locationsClassified as pointer, pointer target, or both
Matrix-based approachNice properties, optimized implementations
Two key matrices:Points-to matrixLinear transformation matrix
ASPLOS XII 15 University Of Toronto
Points-To MatrixPoints-To Matrix
Location Sets
AreaOf
Interest
I
1
2
…
N-1
N
…
M-1 M
All matrix rows sum to 1.0
Location Sets
Pointer Sets
ASPLOS XII 16 University Of Toronto
Points-To Matrix ExamplePoints-To Matrix Example
y UNDzx
y
UND
z
x
a
b
0.72 0.08 0.20
0.90 0.10
1.0
1.0
1.0
1.0
Iy UND
a
z
b
x
0.10.9
0.72
0.08
0.2
ASPLOS XII 17 University Of Toronto
Computing New Points-To MatricesComputing New Points-To Matrices
IAny InstructionAny Instruction
I
Points-ToMatrix In
Points-ToMatrix Out
ASPLOS XII 18 University Of Toronto
The Fundamental PPA EquationThe Fundamental PPA Equation
Points-ToMatrix Out
Points-ToMatrix In
TransformationMatrix=
This can be applied to any instruction (incl. function calls)
ASPLOS XII 19 University Of Toronto
Transformation MatrixTransformation Matrix
1
2
N-1 N
ø I
1 2 3 …
…
N-1
N
…
Area of Interest
Location Sets Pointer Sets
All matrix rows sum to 1.0
Location Sets
Pointer Sets
ASPLOS XII 20 University Of Toronto
Transformation Matrix ExampleTransformation Matrix Example
y
UND
z
x
a
b
y UNDzxa b
1.0
1.0
1.0
1.0
1.0
1.0
S1:S1: a = &z;a = &z;
=TS1
ASPLOS XII 21 University Of Toronto
Example - The PPA EquationExample - The PPA EquationS1:S1: a = &z;a = &z;PTout = TS1 PTin
y UNDzx
y
UND
z
x
a
b
y UND
a
z
b
x
0.10.9
0.72
0.08
0.2
0.72 0.08 0.20
0.90 0.10
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
=PTout
ASPLOS XII 22 University Of Toronto
Example - The PPA EquationExample - The PPA Equation
y UNDzx
y
UND
z
x
a
b
1.0
0.90 0.10
1.0
1.0
1.0
1.0
I y UNDz
b
x
0.10.9
a
S1:S1: a = &z;a = &z;PTout = TS1 PTin
=PTout
ASPLOS XII 23 University Of Toronto
TS2
Combining Transformation MatricesCombining Transformation Matrices
IBasic Block
S1: S1: InstrInstr
S2: S2: InstrInstr
S3: S3: InstrInstr
I
PTout TS1=
PTin
TS3PTin PTin PTin
PTout = TBB
TS1
ASPLOS XII 24 University Of Toronto
Control flow - if/elseControl flow - if/else
YYXX
TX TY= +
p qp q
p + q = 1.0
ASPLOS XII 25 University Of Toronto
Control flow - loopsControl flow - loops
TX=XXNN
U
LiTY=
1U-L+1
i<L,U>
YY
Both operations can be implemented efficiently<L,U> <min,max>
ASPLOS XII 26 University Of Toronto
Safe vs. Unsafe Safe vs. Unsafe Pointer Assignment InstructionsPointer Assignment Instructions
x = &y Address-of Assignment
x = y Copy Assignment
x = *y Load Assignment
*x = y Store Assignment
Safe?
Shadow/invisible vars: non-linear to handle
ASPLOS XII 27 University Of Toronto
Safety for Shadow VariablesSafety for Shadow Variables
p
p_1
q
q
= pointer
= pointed at
= shadow ptr
Unsafe
+
SteensgaardFICI PA for
shadow vars
p_1 q=
p
p_1
q
q
p_1 and q are considered
aliased
Safe
ASPLOS XII 28 University Of Toronto
LOLLOLLLIPIPOOPP Implementation Implementation
.spd
EdgeProfile
Results
SUIF Infrastructure
Static Memory
Model
MATLABC Library
TF-Matrix Collector
Points-ToMatrix
Propagator Stats
ICFGICFG SMMSMM BUBU TDTD .spx
Implementation details:•Sparse matrices•Efficient matrix algorithms•Result memoization
ASPLOS XII 29 University Of Toronto
MeasuringMeasuring
LOLLOLLLIPIPOOPP’s ’s EfficiencyEfficiency and and AccuracyAccuracy
ASPLOS XII 30 University Of Toronto
SPEC2000 Benchmark DataSPEC2000 Benchmark Data
Experimental Framework: 3GHz P4 with 1GB of RAM
Benchmark LOC Matrix Size N
PPA Analysis Time
[Unsafe]PPA Analysis Time
[Safe]Bzip2 4686 251 0.3 seconds
Mcf 2429 354 0.39 seconds
Gzip 8616 563 0.71 seconds
Crafty 21297 1917 5.49 seconds
Vpr 17750 1976 9.33 seconds
Twolf 20469 2611 16.59 seconds
Parser 11402 2732 30.72 seconds
0.3 seconds
0.61 seconds
0.77 seconds
5.51 seconds
10.34 seconds
20.64 seconds
50.04 seconds
Vortex 67225 11018 3min 59seconds
Gap 71766 25882 54min 56seconds
Perlbmk 85221 20922 44min 15seconds
Gcc 22225 42109 5hour 10 min
4min 56seconds
83min 38seconds
89min 43seconds
N/A
Scales to all of SPECint
ASPLOS XII 31 University Of Toronto
SPEC2000 Benchmark DataSPEC2000 Benchmark Data
Experimental Framework: 3GHz P4 with 1GB of RAM
Benchmark PPA Analysis Time
[Safe]Points-To
Profile TimeBzip2 0.3 seconds
Mcf 0.61 seconds
Gzip 0.77 seconds
Crafty 5.51 seconds
Vpr 10.34 seconds
Twolf 20.64 seconds
Parser 50.04 seconds
Vortex 4min 56seconds
Gap 83min 38seconds
Perlbmk 89min 43seconds
Gcc N/A
More efficient than points-to profiling
13min 34seconds
19min 56seconds
3min 48seconds
14min 47seconds
3hour 17 min
N/A
84min 52seconds
0.7 seconds
55min 56seconds
N/A
39min 58seconds
ReducedInput Set
ASPLOS XII 32 University Of Toronto
Easy SPEC2000 BenchmarksEasy SPEC2000 Benchmarks
1.41.2
1.51.8
6.1
1.01.3
0
1
2
3
4
5
6
7
gzip vpr mcf crafty vortex bzip2 twolf
unsafe
safe
p > 0.001
Av
era
ge
De
refe
ren
ce
Siz
e
mor
e ac
cura
te
A one-level Analysis is often adequate (i.e. safe~=unsafe)
ASPLOS XII 33 University Of Toronto
Challenging SPEC 95/2000 BenchmarksChallenging SPEC 95/2000 Benchmarks
42.5
89.0
143.8
80.1
6.7
18.5
0
20
40
60
80
100
120
140
parser perlbmk gap li ijpeg perl
unsafe
safe
p > 0.001
Av
era
ge
De
refe
ren
ce
Siz
e
mor
e ac
cura
te
Many improbable points-to relations can be pruned away
ASPLOS XII 34 University Of Toronto
Metric: Average Max Certainty Metric: Average Max Certainty
while(…){
x = *a; …}
{ (0.72, ), (0.08, ), (0.2, ) }y zx
Σ(max probability value)
(num of loads & stores)Avg Certainty =
Max probability value = 0.72
ASPLOS XII 35 University Of Toronto
SPEC2000 Average CertaintySPEC2000 Average Certainty
0.9050.949 0.970
0.9200.946 0.969
0.870
0.783
0.908
1.000
0.901
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
gzip vp
r*g
cc mcf
craf
ty
pars
er
perlb
mk
gap
vorte
xbz
ip2
twolf
Pro
bab
ilis
tic
Cer
tain
ty
mor
e ac
cura
te
On average, LOLLIPOP can predict a single likely points-to relation
ASPLOS XII 36 University Of Toronto
Metric: NAEDMetric: NAED
while(…){
x = *a; …}
{ (0.72, ), (0.08, ), (0.2, ) }y zx
LOLIPoP’s Static prediction
Normalized Average Euclidean Distance
Σ║Ps – Pd ║
(num of loads & stores)NAED =
1√ 2
Ps
{ (0.88, ), (0.02, ), (0.1, ) }y zx
Pts-to Profiler’s Dynamic freq vectorPd
ASPLOS XII 37 University Of Toronto
Norm Avg Euclidean Dist (NAED)Norm Avg Euclidean Dist (NAED)
0
0.1
0.2
0.3
0.4
0.5
0.6
crafty gzip perl vortex spec-avg
uniform
safe-ref
unsafe-ref
unsafe-train
unsafe-noep
NA
ED
mor
e ac
cura
te
ASPLOS XII 38 University Of Toronto
SummarySummary
A novel matrix-based PPA algorithm Scales to the SPECint 95/2000 benchmarks
One level context and flow sensitive Future Work
Applying LOLLIPOP to TLS/TMFurther optimize LOLLIPOP’s implementation
ASPLOS XII 39 University Of Toronto
ASPLOS XII 40 University Of Toronto
ReferencesReferences Manuvir Das, Ben Liblit, Manuel Fahndrich, and Jakob Rehof. Estimating
the Impact of Scalable Pointer Analysis on Optimization. SAS 2001, 260-278.
Peng-Sheng Chen, Ming-Yu Hung, Yuan-Shin Hwang, Roy Dz-Ching Ju, and Jenq Kuen Lee. Compiler support for speculative multithreading architecture with probabilistic points-to analysis. PPOPP 2003, 25-36.
Jin Lin, Tong Chen, Wei-Chung Hsu, Peng-Chung Yew, Roy Dz-Ching Ju, Tin-Fook Ngai and Sun Chan, A Compiler Framework for Speculative Analysis and Optimizations. PLDI 2003, 289-299.
R.D. Ju, J. Collard, and K. Oukbir. Probabilistic Memory Disambiguation and its Application to Data Speculation. SIGARCH Comput. Archit. News 27 1999, 27-30.
Manel Fernandez and Roger Espasa. Speculative Alias Analysis for Executable Code. PACT 2002, 221-231.