Symbolic Analysis
Symbolic analysis tracks the values of variables in programs symbolically as expressions of input variables and other variables, which we call reference variables.
This draws out useful information about relationships among variables that are expressed in terms of the same set of reference variables.
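An Example

    x = input();
    y = x - 1;
    z = y - 1;
    A[x] = 10;
    A[y] = 11;
    if (z > x)
        z = x;

In terms of the reference variable x, y = x - 1 and z = x - 2. Two facts follow:
&A[x] ≠ &A[y], so the two stores write to different locations.
z > x is never true, so the test and the assignment z = x can be removed.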
Abstract Domain
Since we cannot create succinct and closed-form symbolic expressions for all values computed, we choose an abstract domain and approximate the computations with the most precise expressions within the domain.
Constant propagation: { constants, UNDEF, NAC }
Symbolic analysis: { affine-expressions, NAA }
Affine Expressions
An expression is affine with respect to variables $v_1, v_2, \dots, v_n$ if it can be expressed as $c_0 + c_1 v_1 + \dots + c_n v_n$, where $c_0, c_1, \dots, c_n$ are constants.
An affine expression is linear if $c_0$ is zero.
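The definition can be made concrete with a small sketch. It assumes one possible representation, not one the slides prescribe: an affine expression is a dictionary from variable name to coefficient, with the hypothetical key "1" holding the constant $c_0$. The helper names `affine`, `add`, `scale`, and `is_linear` are illustrative only.

```python
def affine(const=0, **coeffs):
    """Build c0 + c1*v1 + ... as a dict; zero coefficients are dropped."""
    expr = {v: c for v, c in coeffs.items() if c != 0}
    if const != 0:
        expr["1"] = const
    return expr

def add(e1, e2):
    """Sum of two affine expressions."""
    out = dict(e1)
    for v, c in e2.items():
        out[v] = out.get(v, 0) + c
    return {v: c for v, c in out.items() if c != 0}

def scale(k, e):
    """Constant multiple of an affine expression."""
    return {v: k * c for v, c in e.items() if k * c != 0}

def is_linear(e):
    """Linear means the constant term c0 is zero."""
    return "1" not in e

# The opening example, with x as the only reference variable.
x = affine(x=1)
y = add(x, affine(-1))        # y = x - 1
z = add(y, affine(-1))        # z = y - 1
diff = add(z, scale(-1, x))   # z - x
```

Here `diff` reduces to the constant -2, which is how the analysis can conclude that z > x never holds.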
Induction Variables
An affine expression can also be written in terms of the count of iterations through the loop.
Variables whose values can be expressed as $c_1 i + c_0$, where $i$ is the count of iterations through the closest enclosing loop, are known as induction variables.
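An Example

    for (m = 10; m < 20; m++) {
        x = m * 3;
        A[x] = 0;
    }

With i counting iterations from 0, m = i + 10 and x = 30 + 3 * i, so both m and x are induction variables. The multiplication can be replaced by an addition:

    x = 27;
    for (m = 10; m < 20; m++) {
        x = x + 3;
        A[x] = 0;
    }

and m can then be eliminated altogether:

    for (x = &A + 30; x <= &A + 57; x = x + 3) {
        *x = 0;
    }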
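As a quick check of the induction-variable example, the following sketch simulates the three loop forms and compares the array elements they write. It is an illustration only: the array size of 60 is an assumption, and the pointer form is modeled with plain indexes, treating &A as index 0.

```python
def original():
    A = [None] * 60
    for m in range(10, 20):
        x = m * 3                 # x recomputed with a multiplication
        A[x] = 0
    return A

def strength_reduced():
    A = [None] * 60
    x = 27
    for m in range(10, 20):
        x = x + 3                 # x advanced by its constant step
        A[x] = 0
    return A

def pointer_form():
    A = [None] * 60
    x = 30                        # stands in for &A + 30
    while x <= 57:                # stands in for x <= &A + 57
        A[x] = 0
        x = x + 3
    return A

# Closed form x = 30 + 3 * i for iteration counts i = 0, ..., 9.
closed_form = [30 + 3 * i for i in range(10)]
written = [k for k, v in enumerate(original()) if v == 0]
```

All three loops touch exactly the indexes given by the closed form.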
Other Reference Variables
If a variable is not a linear function of the reference variables already chosen, we have the option of treating its value as a new reference variable for future operations.

    a = f();
    b = a + 10;
    c = a + 11;

The value of a is unknown, but with a as a reference variable the analysis still learns that c = b + 1.
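A Running Example

    a = 0;
    for (f = 100; f < 200; f++) {
        a = a + 1;
        b = 10 * a;
        c = 0;
        for (g = 10; g < 20; g++) {
            d = b + c;
            c = c + 1;
        }
    }

Flow graph, with the loop counters normalized to i and j:

    B1: a = 0; i = 1;
    B2: a = a + 1; b = 10 * a; c = 0; j = 1;
    B3: d = b + c; c = c + 1; j = j + 1; if j <= 10 goto B3
    B4: i = i + 1; if i <= 100 goto B2

Regions: R5 is the inner loop (B3), R6 is the outer loop body (B2, R5, B4), R7 is the outer loop, and R8 is the whole program (B1, R7).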
Data-Flow Values: Symbolic Maps
The domain of data-flow values for symbolic analysis is symbolic maps, which are functions that map each variable in the program to a value.
The value is either an affine function of reference values, or the special symbol NAA to represent a non-affine expression.
If there is only one variable, the bottom value of the semilattice is a map that sends the variable to NAA.
The semilattice for n variables is the product of the individual semilattices.
We use $m_{\mathrm{NAA}}$ to denote the bottom of the semilattice, which maps all variables to NAA.
The Running Example
Values taken by each variable, for the first outer iteration and for a general outer iteration:

    var    i = 1             1 ≤ i ≤ 100
           j = 1, …, 10      j = 1, …, 10
    a      1                 i
    b      10                10i
    d      10, …, 19         10i, …, 10i + 9
    c      1, …, 10          1, …, 10
The Running Example
Symbolic maps at the entry and exit of each block:

    m          m(a)     m(b)        m(c)     m(d)
    IN[B1]     NAA      NAA         NAA      NAA
    OUT[B1]    0        NAA         NAA      NAA
    IN[B2]     i - 1    NAA         NAA      NAA
    OUT[B2]    i        10i         0        NAA
    IN[B3]     i        10i         j - 1    NAA
    OUT[B3]    i        10i         j        10i + j - 1
    IN[B4]     i        10i         j        10i + j - 1
    OUT[B4]    i - 1    10i - 10    j        10i + j - 11
The Running Example
The program rewritten with every value expressed in terms of the loop indexes:

    a = 0;
    for (i = 1; i <= 100; i++) {
        a = i;
        b = 10 * i;
        c = 0;
        for (j = 1; j <= 10; j++) {
            d = 10 * i + j - 1;
            c = j;
        }
    }
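One way to gain confidence in the rewritten program is to run both versions and compare the values of a, b, c, and d after every inner iteration. The sketch below does that; the `trace` list is an addition for the comparison and is not part of either program.

```python
def original():
    """Loop nest as first written, recording (a, b, c, d) per inner iteration."""
    trace = []
    a = 0
    for f in range(100, 200):
        a = a + 1
        b = 10 * a
        c = 0
        for g in range(10, 20):
            d = b + c
            c = c + 1
            trace.append((a, b, c, d))
    return trace

def rewritten():
    """Version whose values are affine expressions of the loop indexes i and j."""
    trace = []
    a = 0
    for i in range(1, 101):
        a = i
        b = 10 * i
        c = 0
        for j in range(1, 11):
            d = 10 * i + j - 1
            c = j
            trace.append((a, b, c, d))
    return trace

same = original() == rewritten()
```

The traces agree over all 1000 inner iterations, consistent with the symbolic maps OUT[B3] given for the running example.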
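Transfer Functions
The transfer functions in symbolic analysis send symbolic maps to symbolic maps.
The transfer function of statement s, denoted $f_s$, is defined as follows:
If s is not an assignment, then $f_s = I$, the identity function.
If s is an assignment to variable x, then
$$f_s(m)(v) = m(v) \quad \text{for all variables } v \neq x,$$
$$f_s(m)(x) = \begin{cases} c_0 + c_1 m(y) + c_2 m(z) & \text{if } x \text{ is assigned } c_0 + c_1 y + c_2 z, \\ \mathrm{NAA} & \text{otherwise.} \end{cases}$$
The first case requires the operands that are actually used, $m(y)$ when $c_1 \neq 0$ and $m(z)$ when $c_2 \neq 0$, to be affine; otherwise the result is NAA.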
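The definition of $f_s$ can be sketched in a few lines. The encoding is an assumption made for illustration: a symbolic map is a dictionary from variable to either the string "NAA" or an affine expression stored as {reference: coefficient} with the key "1" for the constant, and an assignment is a hypothetical tuple (target, c0, terms). The sketch also assumes that an NAA operand with a nonzero coefficient makes the result NAA.

```python
NAA = "NAA"  # sentinel for "not an affine expression"

def transfer(stmt, m):
    """Apply f_s to symbolic map m.

    stmt is (target, c0, [(c1, y), (c2, z), ...]) for an affine right-hand
    side, or (target, None, None) for any other assignment.
    """
    target, c0, terms = stmt
    out = dict(m)                      # f_s(m)(v) = m(v) for all v != target
    if c0 is None:                     # right-hand side is not affine
        out[target] = NAA
        return out
    result = {"1": c0}
    for coeff, var in terms:
        if coeff == 0:
            continue                   # unused operand, its value is irrelevant
        if m[var] == NAA:              # a needed operand is not affine
            out[target] = NAA
            return out
        for ref, k in m[var].items():
            result[ref] = result.get(ref, 0) + coeff * k
    out[target] = {r: k for r, k in result.items() if k != 0}
    return out

# IN[B2] of the running example: a = i - 1, everything else NAA.
m = {"a": {"i": 1, "1": -1}, "b": NAA, "c": NAA, "d": NAA}
for stmt in [("a", 1, [(1, "a")]),     # a = a + 1
             ("b", 0, [(10, "a")]),    # b = 10 * a
             ("c", 0, [])]:            # c = 0
    m = transfer(stmt, m)
```

After the three statements of B2, `m` maps a to i, b to 10i, c to 0 (the empty dictionary), and d to NAA, which agrees with OUT[B2] in the running example.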
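Composition of Transfer Functions
If $f_2(m)(v) = \mathrm{NAA}$, then $(f_2 \circ f_1)(m)(v) = \mathrm{NAA}$.
If $f_2(m)(v) = c_0 + \sum_i c_i m(v_i)$, then
$$(f_2 \circ f_1)(m)(v) = \begin{cases} \mathrm{NAA} & \text{if } f_1(m)(v_i) = \mathrm{NAA} \text{ for some } i \neq 0 \text{ with } c_i \neq 0, \\ c_0 + \sum_i c_i f_1(m)(v_i) & \text{otherwise.} \end{cases}$$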
The Running Example
Transfer functions of the basic blocks:

    f         f(m)(a)     f(m)(b)        f(m)(c)     f(m)(d)
    f_{B1}    0           m(b)           m(c)        m(d)
    f_{B2}    m(a) + 1    10m(a) + 10    0           m(d)
    f_{B3}    m(a)        m(b)           m(c) + 1    m(b) + m(c)
    f_{B4}    m(a)        m(b)           m(c)        m(d)
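The composition rule can be sketched by storing a transfer function symbolically. This is an assumed encoding, not one given in the slides: a function is a dictionary from variable v to the affine expression f(m)(v), written over the input values m(x) as {x: coefficient} with the key "1" for the constant, or to the string "NAA". Zero coefficients are never stored, so every stored term counts as "used".

```python
NAA = "NAA"

def compose(f2, f1):
    """Return f2 o f1: apply f1 first, then f2."""
    out = {}
    for v, expr in f2.items():
        if expr == NAA:                    # f2(m)(v) = NAA stays NAA
            out[v] = NAA
            continue
        acc = {}
        is_naa = False
        for var, c in expr.items():
            if var == "1":                 # constant term c0
                acc["1"] = acc.get("1", 0) + c
            elif f1[var] == NAA:           # a used operand is NAA after f1
                is_naa = True
                break
            else:                          # substitute f1(m)(var) for m(var)
                for w, k in f1[var].items():
                    acc[w] = acc.get(w, 0) + c * k
        out[v] = NAA if is_naa else {w: k for w, k in acc.items() if k != 0}
    return out

# Block transfer functions of the running example.
f_B2 = {"a": {"a": 1, "1": 1}, "b": {"a": 10, "1": 10}, "c": {}, "d": {"d": 1}}
f_B3 = {"a": {"a": 1}, "b": {"b": 1}, "c": {"c": 1, "1": 1}, "d": {"b": 1, "c": 1}}

b2_then_b3 = compose(f_B3, f_B2)
```

Running B2 and then B3 once gives c = 1 and d = 10m(a) + 10, which is what `b2_then_b3` records.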
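Solutions to Data-Flow Problem
Subscripts give the iteration indexes i and j at which a value is taken.
$\mathrm{OUT}[B_k] = f_{B_k}(\mathrm{IN}[B_k])$, for all $B_k$
$\mathrm{OUT}[B_1] \geq \mathrm{IN}_1[B_2]$
$\mathrm{OUT}_i[B_2] \geq \mathrm{IN}_{i,1}[B_3]$, $1 \leq i \leq 100$
$\mathrm{OUT}_{i,j-1}[B_3] \geq \mathrm{IN}_{i,j}[B_3]$, $1 \leq i \leq 100$, $2 \leq j \leq 10$
$\mathrm{OUT}_{i,10}[B_3] \geq \mathrm{IN}_i[B_4]$, $1 \leq i \leq 100$
$\mathrm{OUT}_{i-1}[B_4] \geq \mathrm{IN}_i[B_2]$, $2 \leq i \leq 100$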
Meet of Transfer Functions
The meet of two transfer functions:
$$(f_1 \wedge f_2)(m)(v) = \begin{cases} f_1(m)(v) & \text{if } f_1(m)(v) = f_2(m)(v), \\ \mathrm{NAA} & \text{otherwise.} \end{cases}$$
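A minimal sketch of the meet, assuming a transfer function is stored as a dictionary from each variable v to the affine expression f(m)(v), itself a dictionary {input variable: coefficient} with the key "1" for the constant, or to the string "NAA". The two branch functions are hypothetical and are not taken from the running example.

```python
NAA = "NAA"

def meet(f1, f2):
    """Variable-wise meet: keep an expression only where both functions agree."""
    return {v: (f1[v] if f1[v] == f2[v] else NAA) for v in f1}

# Hypothetical two-way branch: one arm adds 1 to x, the other adds 2;
# neither arm touches y.
f_then = {"x": {"x": 1, "1": 1}, "y": {"y": 1}}
f_else = {"x": {"x": 1, "1": 2}, "y": {"y": 1}}

merged = meet(f_then, f_else)
```

After the merge, y is still known to be unchanged, while x has no single affine description and becomes NAA.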
Parameterized Function Compositions
If $f(m)(x) = m(x) + c$, then $f^i(m)(x) = m(x) + ci$ for all $i \geq 0$; x is a basic induction variable.
If $f(m)(x) = m(x)$, then $f^i(m)(x) = m(x)$ for all $i \geq 0$; x is a symbolic constant.
If $f(m)(x) = c_0 + c_1 m(x_1) + \dots + c_n m(x_n)$, where each $x_k$ is either a basic induction variable or a symbolic constant, then $f^i(m)(x) = c_0 + c_1 f^i(m)(x_1) + \dots + c_n f^i(m)(x_n)$ for all $i > 0$; x is an induction variable.
In all other cases, $f^i(m)(x) = \mathrm{NAA}$.
Parameterized Function Compositions
The effect of executing a fixed number of iterations is obtained by replacing i above by that number.
If the number of iterations is unknown, the value at the start of the last iteration is given by $f^*$:
$$f^*(m)(v) = \begin{cases} m(v) & \text{if } f(m)(v) = m(v), \\ \mathrm{NAA} & \text{otherwise.} \end{cases}$$
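The Running Example
$$f_{B_3}^{i}(m)(v) = \begin{cases} m(a) & \text{if } v = a \\ m(b) & \text{if } v = b \\ m(c) + i & \text{if } v = c \\ m(b) + m(c) + i & \text{if } v = d. \end{cases}$$
$$f_{B_3}^{*}(m)(v) = \begin{cases} m(a) & \text{if } v = a \\ m(b) & \text{if } v = b \\ \mathrm{NAA} & \text{if } v = c \\ \mathrm{NAA} & \text{if } v = d. \end{cases}$$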
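The rules for $f^i$ and $f^*$ can be applied mechanically. The sketch below follows the rules literally as stated above, under an assumed encoding: a transfer function is a dictionary from variable v to the expression f(m)(v), stored as {input variable: coefficient} with the key "1" for the constant, or to the string "NAA". The iteration count is represented by the hypothetical key "i", which assumes no program variable has that name.

```python
NAA = "NAA"

def power(f, count="i"):
    """Symbolic f^i, with `count` naming the iteration count."""
    out = {}
    # Symbolic constants (f(m)(x) = m(x)) and basic induction
    # variables (f(m)(x) = m(x) + c).
    for x, e in f.items():
        if e == {x: 1}:
            out[x] = {x: 1}
        elif e != NAA and set(e) == {x, "1"} and e[x] == 1:
            out[x] = {x: 1, count: e["1"]}
    simple = dict(out)
    # Induction variables: affine in symbolic constants and basic
    # induction variables only. Everything else becomes NAA.
    for x, e in f.items():
        if x in simple:
            continue
        if e != NAA and all(v == "1" or v in simple for v in e):
            acc = {}
            for v, c in e.items():
                sub = {"1": 1} if v == "1" else simple[v]
                for w, k in sub.items():
                    acc[w] = acc.get(w, 0) + c * k
            out[x] = {w: k for w, k in acc.items() if k != 0}
        else:
            out[x] = NAA
    return out

def closure(f):
    """f*: only values left unchanged by f survive an unknown iteration count."""
    return {x: ({x: 1} if f[x] == {x: 1} else NAA) for x in f}

f_B3 = {"a": {"a": 1}, "b": {"b": 1}, "c": {"c": 1, "1": 1}, "d": {"b": 1, "c": 1}}
f_B3_i = power(f_B3)
f_B3_star = closure(f_B3)
```

For block B3 this yields m(c) + i for c and m(b) + m(c) + i for d, and NAA for both under $f^*$, matching the two case definitions given for the running example.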
A Region-Based Algorithm
The effect of execution from the start of the loop region R to the entry of the ith iteration:
$$f_{R,i,\mathrm{IN}[S]} = \Bigl( \bigwedge_{B \in \mathrm{pred}(S)} f_{S,\mathrm{OUT}[B]} \Bigr)^{i-1}$$
If the number of iterations of a region is known, replace i with the actual count.
In the top-down pass, compute $f_{R,i,\mathrm{IN}[B]}$. If $m(v) = \mathrm{NAA}$, introduce a new reference variable t; all references to $m(v)$ are replaced by t.
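The Running Example
$f_{R_5,j,\mathrm{IN}[B_3]} = f_{B_3}^{j-1}$
$f_{R_5,j,\mathrm{OUT}[B_3]} = f_{B_3}^{j}$
$f_{R_6,\mathrm{IN}[B_2]} = I$
$f_{R_6,\mathrm{IN}[R_5]} = f_{B_2}$
$f_{R_6,\mathrm{OUT}[B_4]} = I \circ f_{R_5,10,\mathrm{OUT}[B_3]} \circ f_{B_2}$
$f_{R_7,i,\mathrm{IN}[R_6]} = f_{R_6,\mathrm{OUT}[B_4]}^{i-1}$
$f_{R_7,i,\mathrm{OUT}[B_4]} = f_{R_6,\mathrm{OUT}[B_4]}^{i}$
$f_{R_8,\mathrm{IN}[B_1]} = I$
$f_{R_8,\mathrm{IN}[R_7]} = f_{B_1}$
$f_{R_8,\mathrm{OUT}[B_4]} = I \circ f_{R_7,100,\mathrm{OUT}[B_4]} \circ f_{B_1}$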
The Running Example

    f                     f(m)(a)         f(m)(b)          f(m)(c)         f(m)(d)
    f_{R5,j,IN[B3]}       m(a)            m(b)             m(c) + j - 1    NAA
    f_{R5,j,OUT[B3]}      m(a)            m(b)             m(c) + j        m(b) + m(c) + j - 1
    f_{R6,IN[B2]}         m(a)            m(b)             m(c)            m(d)
    f_{R6,IN[R5]}         m(a) + 1        10m(a) + 10      0               m(d)
    f_{R6,OUT[B4]}        m(a) + 1        10m(a) + 10      10              10m(a) + 19
    f_{R7,i,IN[R6]}       m(a) + i - 1    NAA              NAA             NAA
    f_{R7,i,OUT[B4]}      m(a) + i        10m(a) + 10i     10              10m(a) + 10i + 9
    f_{R8,IN[B1]}         m(a)            m(b)             m(c)            m(d)
    f_{R8,IN[R7]}         0               m(b)             m(c)            m(d)
    f_{R8,OUT[B4]}        100             1000             10              1009
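The Running Example
$\mathrm{IN}[B_1] = m_{\mathrm{NAA}}$
$\mathrm{OUT}[B_1] = f_{B_1}(\mathrm{IN}[B_1])$
$\mathrm{IN}_i[B_2] = f_{R_7,i,\mathrm{IN}[R_6]}(\mathrm{OUT}[B_1])$
$\mathrm{OUT}_i[B_2] = f_{B_2}(\mathrm{IN}_i[B_2])$
$\mathrm{IN}_{i,j}[B_3] = f_{R_5,j,\mathrm{IN}[B_3]}(\mathrm{OUT}_i[B_2])$
$\mathrm{OUT}_{i,j}[B_3] = f_{B_3}(\mathrm{IN}_{i,j}[B_3])$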
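The Running Example
A loop nest in which a is read from input and changed only temporarily inside the inner loop:

    for (i = 1; i < n; i++) {
        a = input();
        for (j = 1; j < 10; j++) {
            a = a - 1;
            b = j + a;
            a = a + 1;
        }
    }

Since input() is not affine, a new reference variable t is introduced for its value, and the inner loop is rewritten in terms of t:

    for (i = 1; i < n; i++) {
        a = input();
        t = a;
        for (j = 1; j < 10; j++) {
            a = t - 1;
            b = t - 1 + j;
            a = t;
        }
    }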
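The two versions can be compared by simulation. In this sketch the `inputs` list is a stand-in for the successive results of input(), one per outer iteration, and the `trace` list is added only for the comparison. Both are assumptions of the illustration.

```python
def original(inputs):
    trace = []
    for a in inputs:                  # a = input(), once per outer iteration
        for j in range(1, 10):
            a = a - 1
            b = j + a
            a = a + 1
            trace.append((a, b))
    return trace

def rewritten(inputs):
    trace = []
    for a in inputs:
        t = a                         # new reference variable for the input
        for j in range(1, 10):
            a = t - 1
            b = t - 1 + j
            a = t
            trace.append((a, b))
    return trace

sample = [5, -3, 0]
same = original(sample) == rewritten(sample)
```

With t as a reference variable, a is a symbolic constant across the inner loop and b = t - 1 + j is an induction variable, so the traces agree.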