1 theory and practice of dependence testing data and control dependences scalar data dependences ...

1

Theory and Practice of Dependence Testing

Data and control dependences Scalar data dependences

True-, anti-, and output-dependences

Loop dependences Loop parallelism Dependence distance and direction vectors Loop-carried dependences Loop-independent dependences

Loop dependence and loop iteration reordering Dependence tests

2

Data and Control Dependences

Data dependence: data is produced and consumed in the correct order

Control dependence: a dependence that arises as a result of control flow

S1 PI = 3.14S2 R = 5.0S3 AREA = PI * R ** 2

S1 IF (T.EQ.0.0) GOTO S3

S2 A = A / TS3 CONTINUE

3

Data Dependence

Definition There is a data dependence from statement S1 to

statement S2 iff

1) both S1 and S2 access the same memory location and at least one of them stores into it, and

2) there is a feasible run-time execution path from S1 to S2

4

Data Dependence Classification

Dependence relations are similar to hardware hazards (which may cause pipeline stalls) True dependences (also called flow dependences), denoted by

S1 S2, are the same as RAW (Read After Write) hazards

Anti dependences, denoted by S1 -1 S2, are the same as WAR (Write After Read) hazards

Output dependences, denoted by S1 o S2, are the same as WAW (Write After Write) hazards

S1 X = …S2 … = X

S1 … = XS2 X = …

S1 X = …S2 X = …

S1 S2 S1 -1 S2 S1 o S2

5

Compiling for Loop Parallelism

Dependence properties are crucial for determining and exploiting loop parallelism

Bernstein’s conditions Iteration I1 does not write into a location that is read by

iteration I2 Iteration I2 does not write into a location that is read by

iteration I1 Iteration I1 does not write into a location that is written

into by iteration I2

6

Fortran’s PARALLEL DO

The PARALLEL DO guarantees that there are no scheduling constraints among its iterations

PARALLEL DO I = 1, N A(I) = A(I) + B(I)ENDDO

PARALLEL DO I = 1, N A(I+1) = A(I) + B(I)ENDDO

PARALLEL DO I = 1, N A(I-1) = A(I) + B(I)ENDDO

PARALLEL DO I = 1, N S = S + B(I)ENDDO

OK

Not OK

7

Loop Dependences

While the PARALLEL DO requires the programmer to assure the loop iterations can be executed in parallel, an optimizing compiler must analyze loop dependences for automatic loop parallelization and vectorization

Granularity of parallelism Instruction level parallelism (ILP), not loop-specific Synchronous (e.g. SIMD, also vector and SSE) parallelism is

fine grain and compiler aims to optimize inner loops Asynchronous (e.g. MIMD, work sharing) parallelism is coarse

grain and compiler aims to optimize outer loops

8

Synchronous Parallel Machines

SIMD model, simpler and cheaper compared to MIMD

Processors operate in lockstep executing the same instruction on different portions of the data space

Application should exhibit data parallelism

Control flow requires masking operations (similar to predicated instructions), which can be inefficient

p1,1 p1,2 p1,3 p1,4

p2,1 p2,2 p2,3 p2,4

p3,1 p3,2 p3,3 p3,4

p4,1 p4,2 p4,3 p4,4

Example 4x4 processor mesh

9

Asynchronous Parallel Computers

Multiprocessors Shared memory is directly

accessible by each PE (processing element)

Multicomputers Distributed memory requires

message passing between PEs

SMPs Shared memory in small

clusters of processors, need message passing across clusters

p1 p2 p3 p4

mem

p1 p2 p3 p4

m1 m2 m3 m4

bus

bus, x-bar, or network

10

Advanced Compiler Technology

Powerful restructuring compilers transform loops into parallel or vector code

Must analyze data dependences in loop nests besides straight-line code

DO I = 1, N DO J = 1, N C(J,I) = 0.0 DO K = 1, N C(J,I) = C(J,I)+A(J,K)*B(K,I) ENDDO ENDDOENDDO

PARALLEL DO I = 1, N DO J = 1, N C(J,I) = 0.0 DO K = 1, N C(J,I) = C(J,I)+A(J,K)*B(K,I) ENDDO ENDDOENDDO

DO I = 1, N DO J = 1, N, 64 C(J:J+63,I) = 0.0 DO K = 1, N C(J:J+63,I) = C(J:J+63,I) +A(J:J+63,K)*B(K,I) ENDDO ENDDOENDDO

parallelize vectorize

11

Normalized Iteration Space

In most situations it is preferable to use normalized iteration spaces

For an arbitrary loop with loop index I and loop bounds L and U and stride S, the normalized iteration number isi = (I-L+S)/S

DO I = 100, 20, -10 A(I) = B(100-I) + C(I/5)ENDDO

DO i = 1, 9 A(110-10*i) = B(10*i-10) + C(22-2*i)ENDDO

is analyzed withi = (110-I)/10

12

Dependence in Loops

In this example loop statement S1 “depends on itself” from the previous iteration

More precisely, statement S1 in iteration i has a true dependence with S1 in iteration i+1

DO I = 1, NS1 A(I+1) = A(I) + B(I) ENDDO

S1 A(2) = A(1) + B(1)S1 A(3) = A(2) + B(2)S1 A(4) = A(3) + B(3)S1 A(5) = A(4) + B(4) …

is executed as

13

Iteration Vector

Definition Given a nest of n loops, the iteration vector i of a

particular iteration of the innermost loop is a vector of integers

i = (i1, i2, …, in)where ik represents the iteration number for the loop at nesting level k

The set of all possible iteration vectors is an iteration space

14

Iteration Vector Example

The iteration space of the statement at S1 is {(1,1), (2,1), (2,2), (3,1), (3,2), (3,3)}

At iteration i = (2,1) the value of A(1,2) is assigned to A(2,1)

DO I = 1, 3 DO J = 1, IS1 A(I,J) = A(J,I) ENDDO ENDDO

(1,1)

(2,1) (2,2)

(3,2) (3,3)(3,1)

I

J

15

Iteration Vector Ordering

The iteration vectors are naturally ordered according to a lexicographical order, e.g. iteration (1,2) precedes (2,1) and (2,2) in the example on the previous slide

Definition Iteration i precedes iteration j, denoted i < j, iff

1) i[1:n-1] < j[1:n-1], or2) i[1:n-1] = j[1:n-1] and in < jn

16

Loop Dependence

Definition There exist a dependence from S1 to S2 in a loop nest iff

there exist two iteration vectors i and j such that

1) i < j and there is a path from S1 to S2

2) S1 accesses memory location M on iteration i and S2 accesses memory location M on iteration j

3) one of these accesses is a write

17

Loop Dependence Example

Show that the example loop has no loop dependences

Answer: there are no iteration vectors i and j in {(1,1), (2,1), (2,2), (3,1), (3,2), (3,3)} such that i < j and either S1 in i writes to the same element of A that is read at S1 in iteration j, or S1 in iteration i reads an element A that is written in iteration j

DO I = 1, 3 DO J = 1, IS1 A(I,J) = A(J,I) ENDDO ENDDO

18

Loop Reordering Transformations

Definitions A reordering transformation is any program

transformation that changes the execution order of the code, without adding or deleting any statement executions

A reordering transformation preserves a dependence if it preserves the relative execution order of the source and sink of that dependence

19

Fundamental Theorem of Dependence

Any reordering transformation that preserves every dependence in a program preserves the meaning of that program

20

Valid Transformations

A transformation is said to be valid for the program to which it applies if it preserves all dependences in the program

DO I = 1, 3S1 A(I+1) = B(I)S2 B(I+1) = A(I) ENDDO

DO I = 1, 3 B(I+1) = A(I) A(I+1) = B(I) ENDDO

DO I = 3, 1, -1 A(I+1) = B(I) B(I+1) = A(I) ENDDO

valid invalid

21

Dependences and Loop Transformations Loop dependences are tested before a

transformation is applied When a dependence test is inconclusive,

dependence must be assumed In this example the value of K is unknown and

loop dependence is assumed:

DO I = 1, NS1 A(I+K) = A(I) + B(I) ENDDO

22

Dependence Distance Vector

Definition Suppose that there is a dependence from S1 on iteration i

to S2 on iteration j; then the dependence distance vector d(i, j) is defined as d(i, j) = j - i

Example:

DO I = 1, 3 DO J = 1, IS1 A(I+1,J) = A(I,J) ENDDO ENDDO

True dependence between S1 and itself oni = (1,1) and j = (2,1): d(i, j) = (1,0)i = (2,1) and j = (3,1): d(i, j) = (1,0)i = (2,2) and j = (3,2): d(i, j) = (1,0)

23

Dependence Direction Vector

Definition Suppose that there is a dependence from S1 on iteration i

and S2 on iteration j; then the dependence direction vector D(i, j) is defined as

D(i, j)k = “<” if d(i, j)k > 0“=” if d(i, j)k = 0“>” if d(i, j)k < 0

24

Example


True dependence between S1 and itself oni = (1,1) and j = (2,1): d(i, j) = (1,0), D(i, j) = (<, =)i = (2,1) and j = (3,1): d(i, j) = (1,0), D(i, j) = (<, =)i = (2,2) and j = (3,2): d(i, j) = (1,0), D(i, j) = (<, =)

(1,1)

(2,1) (2,2)

(3,2) (3,3)(3,1)

I

J

source sink

25

Data Dependence Graphs

Data dependences arise between statement instances

It is generally infeasible to represent all data dependences that arise in a program

Usually only static data dependences are recorded using the data dependence direction vector and depicted in a data dependence graph to compactly represent data dependences in a loop nest

DO I = 1, 10000S1 A(I) = B(I) * 5S2 C(I+1) = C(I) + A(I) ENDDO

Static data dependences foraccesses to A and C:S1 (=) S2 and S2 (<) S2

Data dependence graph: S1

S2

(=)

(<)

26

Loop-Independent Dependences

Definition Statement S1 has a loop-independent dependence on S2 iff

there exist two iteration vectors i and j such that

1) S1 refers to memory location M on iteration i, S2 refers to M on iteration j, and i = j

2) there is a control flow path from S1 to S2 within the iteration

27

Loop-Carried Dependences

Definitions Statement S1 has a loop-carried dependence on S2 iff there exist

two iteration vectors i and j such that S1 refers to memory location M on iteration i, S2 refers to M on iteration j, and d(i, j) > 0, (that is, D(i, j) contains a “<” as its leftmost non-“=” component)

A loop-carried dependence from S1 to S2 is1) forward if S2 appears after S1 in the loop body2) backward if S2 appears before S1 (or if S1=S2)

The level of a loop-carried dependence is the index of the leftmost non- “=” of D(i, j) for the dependence

28

Example (1)

DO I = 1, 3S1 A(I+1) = F(I)S2 F(I+1) = A(I) ENDDO

S1 (<) S2 and S2 (<) S1

The loop-carried dependence S1 (<) S2 is forward

The loop-carried dependence S2 (<) S1 is backward

All loop-carried dependences are of level 1, because D(i, j) = (<) for every dependence

29

Example (2)

DO I = 1, 10 DO J = 1, 10 DO K = 1, 10S1 A(I,J,K+1) = A(I,J,K) ENDDO ENDDO ENDDO

S1 (=,=,<) S1

All loop-carried dependences are of level 3, because D(i, j) = (=, =, <)

Note: level-k dependences are also denoted by Sx k Sy

S1 3 S1

Alternative notation for alevel-3 dependence

30

Direction Vectors and Reordering Transformations (1)

Let T be a transformation on a loop nest that does not rearrange the statements in the body of the innermost loop; then T is valid if, after it is applied, none of the direction vectors for dependences with source and sink in the nest has a leftmost non-“=” component that is “>”

31


A reordering transformation preserves all level-k dependences if

1) it preserves the iteration order of the level-k loop,

2) if it does not interchange any loop at level < k to a position inside the level-k loop, and

3) if it does not interchange any loop at level > k to a position outside the level-k loop

32

Examples DO I = 1, 10 DO J = 1, 10 DO K = 1, 10S1 A(I+1,J+2,K+3) = A(I,J,K) + B ENDDO ENDDO ENDDO

DO I = 1, 10S1 F(I+1) = A(I)S2 A(I+1) = F(I) ENDDO

Both S1 (<) S2 and S2 (<) S1

are level-1 dependencesS1 (<,<,<) S1 is a level-1 dependence

DO I = 1, 10S2 A(I+1) = F(I)S1 F(I+1) = A(I) ENDDO

DO I = 1, 10 DO K = 10, 1, -1 DO J = 1, 10S1 A(I+1,J+2,K+3) = A(I,J,K) + B ENDDO ENDDO ENDDO

Find deps.

Choose transformChoosetransform

Find deps.

33


A reordering transformation preserves a loop-independent dependence between statements S1 and S2 if it does not move statement instances between iterations and preserves the relative order of S1 and S2

34

Examples DO I = 1, 10S1 A(I) = …S2 … = A(I) ENDDO

S1 (=) S1 is a loop-independentdependence

DO I = 10, 1, -1S2 A(I) = …S1 … = A(I) ENDDO

Choosetransform

Find deps.

DO I = 1, 10 DO J = 1, 10S1 A(I,J) = …S2 … = A(I,J) ENDDO ENDDO

S1 (=,=) S1 is a loop-independent dependence

DO J = 10, 1, -1 DO I = 10, 1, -1S1 A(I,J) = …S2 … = A(I,J) ENDDO ENDDO

Find deps.

Choose transform

35

Combining Direction Vectors

When a loop nest has multiple different directions at the same loop level k, we sometimes abbreviate the level-k direction vector component with “*”

DO I = 1, 10 DO J = 1, 10S1 A(J) = A(J)+1 ENDDO ENDDO

DO I = 1, 10S1 S = S + A(I) ENDDO

DO I = 1, 9S1 A(I) = …S2 … = A(10-I) ENDDO

S1 (*) S2

=S1 (<) S2

S1 (=) S2

S1 (>) S2

S1 (*) S1

S1 (*,=) S1

36

Iteration Vectors

Iteration vectors are renditions of the dependences in a loop’s iteration space on a grid

Helps to visualize dependences


J

I1 2 3

1

2

3

Distance vector is (1,0)S1 (<,=) S1

37

Example 1

DO I = 1, 4 DO J = 1, 4S1 A(I,J+1) = A(I-1,J) ENDDO ENDDO

J

I1 2 3

1

2

3

4

4

Distance vector is (1,1)S1 (<,<) S1

38

Example 2

I1 2 3

1

2

3

4

4

Distance vector is (1,-1) S1 (<,>) S1

DO I = 1, 4 DO J = 1, 5-IS1 A(I+1,J+I-1) = A(I,J+I-1) ENDDO ENDDO

J

39

Example 3

I1 2 3

1

2

3

4

4

Distance vectors are (0,1) and (1,0)S1 (=,<) S1 and S2 (<,=) S2

DO I = 1, 4 DO J = 1, 4S1 B(I) = B(I) + C(I)S2 A(I+1,J) = A(I,J) ENDDO ENDDO

J

40

Parallelization

It is valid to convert a sequential loop to a parallel loop if the loop carries no dependence

DO I = 1, 4 DO J = 1, 4S1 A(I,J+1) = A(I,J) ENDDO ENDDO

S1 (=,<) S1J

I1 2 3

1

2

3

4

4

PARALLEL DO I = 1, 4 DO J = 1, 4S1 A(I,J+1) = A(I,J) ENDDO ENDDO

parallelize

41

Vectorization (1)

A single-statement loop that carries no dependence can be vectorized

DO I = 1, 4S1 X(I) = X(I) + C ENDDO

S1 X(1:4) = X(1:4) + C

Fortran 90 array statement

X(1) X(2) X(3) X(4)

X(1) X(2) X(3) X(4)

C C C C+

vectorize

Vector operation

42

Vectorization (2)

This example has a loop-carried dependence, and is therefore invalid

DO I = 1, NS1 X(I+1) = X(I) + C ENDDO

S1 X(2:N+1) = X(1:N) + C

Invalid Fortran 90 statement

S1 (<) S1

43

Vectorization (3)

Because only single loop statements can be vectorized, loops with multiple statements must be transformed using the loop distribution transformation

Loop has no loop-carried dependence or has forward flow dependences

DO I = 1, NS1 A(I+1) = B(I) + CS2 D(I) = A(I) + E ENDDO

S1 A(2:N+1) = B(1:N) + CS2 D(1:N) = A(1:N) + E

DO I = 1, NS1 A(I+1) = B(I) + C ENDDO DO I = 1, NS2 D(I) = A(I) + E ENDDO

vectorize

loop distribution

44

Vectorization (4)

When a loop has backward flow dependences and no loop-independent dependences, interchange the statements to enable loop distribution

DO I = 1, NS1 D(I) = A(I) + ES2 A(I+1) = B(I) + C ENDDO

S2 A(2:N+1) = B(1:N) + CS1 D(1:N) = A(1:N) + E

DO I = 1, NS2 A(I+1) = B(I) + C ENDDO DO I = 1, NS1 D(I) = A(I) + E ENDDO

vectorize

loop distribution

45

Preliminary Transformations to Support Dependence Testing To determine distance and direction vectors with

dependence tests require array subscripts to be in a standard form

Preliminary transformations put more subscripts in affine form, where subscripts are linear integer functions of loop induction variables Loop normalization Induction-variable substitution Constant propagation

46

Loop Normalization

See slide on normalized iteration space

Dependence tests assume iteration spaces are normalized

The normalization distorts dependences, but it is otherwise preferred to support loop analysis Make lower bound 1 (or 0) Make stride 1

DO I = 1, M DO J = I, NS1 A(J,I) = A(J,I-1) + 5 ENDDO ENDDO

DO I = 1, M DO J = 1, N - I + 1S1 A(J+I-1,I) = A(J+I-1,I-1) + 5 ENDDO ENDDO

normalize

S1 (<,>) S1

S1 (<,=) S1

47

Data Flow Analysis

Data flow analysis with reaching definitions or SSA form supports constant propagation and dead code elimination

The above are needed to standardize array subscripts

K = 2 DO I = 1, 100S1 A(I+K) = A(I) + 5 ENDDO

DO I = 1, 100S1 A(I+2,I) = A(I) + 5 ENDDO

constant propagation

48

Forward Substitution

Forward substitution and induction variable substitution (IVS) yield array subscripts that are amenable to dependence analysis

DO I = 1, 100 K = I + 2S1 A(K) = A(K) + 5 ENDDO

forward substitution

DO I = 1, 100S1 A(I+2) = A(I+2) + 5 ENDDO

49

Induction Variable Substitution

Example loop with IVs

After IV substitution (IVS) (note the affine indexes)

After parallelization

I = 0 J = 1 while (I<N) I = I+1 … = A[J] J = J+2 K = 2*I A[K] = … endwhile

for i=0 to N-1 S1: … = A[2*i+1] S2: A[2*i+2] = … endfor

forall (i=0,N-1) … = A[2*i+1] A[2*i+2] = … endforall

GCD test to solve dependence equation 2id - 2iu = -1Since 2 does not divide 1 there is no data dependence.

W R W R W R

A[2*i+1]

…

A[2*i+2]

A[]

Dep testIVS

50

Example (1)

Before and after induction-variable substitution of KI in the inner loop

INC = 2KI = 0DO I = 1, 100 DO J = 1, 100 KI = KI + INC U(KI) = U(KI) + W(J) ENDDO S(I) = U(KI)ENDDO

INC = 2KI = 0DO I = 1, 100 DO J = 1, 100 U(KI+J*INC) = U(KI+J*INC) + W(J) ENDDO KI = KI + 100 * INC S(I) = U(KI)ENDDO

51

Example (2)

Before and after induction-variable substitution of KI in the outer loop

INC = 2KI = 0DO I = 1, 100 DO J = 1, 100 U(KI+J*INC) = U(KI+J*INC) + W(J) ENDDO KI = KI + 100*INC S(I) = U(KI)ENDDO INC = 2

KI = 0DO I = 1, 100 DO J = 1, 100 U(KI+(I-1)*100*INC+J*INC) = U(KI+(I-1)*100*INC+J*INC) + W(J) ENDDO S(I) = U(KI+I*100*INC)ENDDOKI = KI + 100*100*INC

52

Example (3)

Before and after constant propagation of KI and INC

INC = 2KI = 0DO I = 1, 100 DO J = 1, 100 U(KI+(I-1)*100*INC+J*INC) = U(KI+(I-1)*100*INC+J*INC) + W(J) ENDDO S(I) = U(KI+I*100*INC)ENDDOKI = KI + 100*100*INC

DO I = 1, 100 DO J = 1, 100 U(I*200+J*2-200) = U(I*200+J*2-200) + W(J) ENDDO S(I) = U(I*200)ENDDOKI = 20000

Affine subscripts

53

Dependence Testing

After loop normalization, IVS, and constant propagation, array subscripts are standardized

Affine subscripts are important, since most dependence tests requires affine forms:

a1 i1 + a2 i2 + … + an in + e Non-linear subscripts contain unknowns

(symbolic terms), function values, array values, etc.

54

Dependence Equation (1)

A dependence equation defines the access requirement

DO I = 1, NS1 A(I+1) = A(I) ENDDO

DO I = 1, NS1 A(f(I)) = A(g(I)) ENDDO

DO I = 1, NS1 A(2*I+1) = A(2*I) ENDDO

To prove flow dependence:for which values of < is

f() = g()

2*+1 = 2* has no solution

+1 = has solution =-1

55


A dependence equation defines the access requirement


DO I = 1, NS1 A(f(I)) = A(g(I)) ENDDO

DO I = 1, NS1 A(2*I) = A(2*I+2) ENDDO

To prove anti dependence:for which values of > is

f() = g()

2* = 2*+2 has solution =+1

+1 = has no solution

56


DO I = 1, NS1 A(5) = A(I) ENDDO

DO I = 1, NS1 A(5) = A(6) ENDDO

To prove flow dependence: for which values of < is f() = g()

5 = 6 has no solution

5 = has solution if 5 < N

DO I = 1, N DO J = 1, NS1 A(I-1) = A(J) ENDDO ENDDO

1-1 = 2 has solution,note =(1, 2) and =(1, 2)are iteration vectors

57

ZIV, SIV, and MIV

ZIV (zero index variable) subscript pairs

SIV (single index variable) subscript pairs

MIV (multiple index variable) subscript pairs

DO I = … DO J = … DO K = …S1 A(5,I+1,J) = A(N,I,K) ENDDO ENDDO ENDDO

ZIV pair SIV pair MIV pair

58

Example

Distance vector is (1,-1) S1 (<,>) S1

DO I = 1, 4 DO J = 1, 5-IS1 A(I+1,J+I-1) = A(I,J+I-1) ENDDO ENDDO

First distance vector component is easy (SIV), but second is not so easy (MIV)

SIV pair MIV pair

I1 2 3

1

2

3

4

4

J

59

Separability and Coupled Subscripts

When testing multidimensional arrays, subscript are separable if indices do not occur in other subscripts

ZIV and SIV subscripts pairs are separable

MIV pairs may give rise to coupled subscript groups

DO I = … DO J = … DO K = …S1 A(I,J,J) = A(I,J,K) ENDDO ENDDO ENDDO

SIV pairis separable

MIV pairgives coupled

indices

60

Dependence Testing Overview

1. Partition subscripts into separable and coupled groups

2. Classify each subscript as ZIV, SIV, or MIV

3. For each separable subscript, apply the applicable single subscript test (ZIV, SIV, or MIV) to prove independence or produce direction vectors for dependence

4. For each coupled group, apply a multiple subscript test

5. If any test yields independence, no further testing needed

6. Otherwise, merge all dependence vectors into one set

61

Dependence Testing is Conservative or Exact Dependence tests solve dependence equations Conservative tests attempts to prove that there is

no solution to a dependence equation Absence of a dependence is proven Proving dependence sometimes possible and useful But if proofs fail, dependence must be assumed When dependence has to be assumed, translated code

is correct but suboptimal

Exact tests detect dependences iff they exists

62

ZIV Test

The ZIV test compares two subscripts

If the expressions are proved unequal, no dependence can exist

DO I = 1, NS1 A(5) = A(6) ENDDO

K = 10 DO I = 1, NS1 A(5) = A(K) ENDDO

K = 10 DO I = 1, N DO J = 1, NS1 A(I,5) = A(I,K) ENDDO ENDDO

63

Strong SIV Test

Requires subscript pairs of the form aI+c1 and aI’+c2 for the def and use, respectively

Dependence equationaI+c1 = aI’+c2

has a solution if the dependence distanced = (c1 - c2)/a is integer and|d| < U - L with L and U loop bounds


DO I = 1, NS1 A(2*I+2) = A(2*I) ENDDO

DO I = 1, NS1 A(I+N) = A(I) ENDDO

64

Weak-Zero SIV Test

Requires subscript pairs of the form a1I+c1 and a2I’+c2 with either a1= 0 or a2 = 0

If a2 = 0, the dependence equation I = (c2 - c1)/a1 has a solution if (c2 - c1)/a1 is integer and L < (c2 - c1)/a1 < U

If a1 = 0, similar case

DO I = 1, NS1 A(I) = A(1) ENDDO

DO I = 1, NS1 A(2*I+2) = A(2) ENDDO

DO I = 1, NS1 A(1) = A(I) + A(1) ENDDO

65

GCD Test

Assumes that f() and g() are affine:f(x1,x2,…,xn)= a0+a1x1+…+anxn

g(y1,y2,…,yn)= b0+b1y1+…+bnyn

Reordering gives linear Diophantine equationa1x1-b1y1 +…+anxn-bnyn =b0-a0

which has a solution iffgcd(a1,…,an,b1,…,bn) divides b0-a0

DO I = 1, NS1 A(2*I+1) = A(2*I) ENDDO

DO I = 1, NS1 A(4*I+1) = A(2*I+3) ENDDO

DO I = 1, 10 DO J = 1, 10S1 A(1+2*I+20*J) = A(2+20*I+2*J) ENDDO ENDDO

66

Banerjee Test (1)

f() and g() are affine:f(x1,x2,…,xn)= a0+a1x1+…+anxn

g(y1,y2,…,yn)= b0+b1y1+…+bnyn

Dependence equation a0-b0+a1x1-b1y1 +…+anxn-bnyn=0

has no solution if 0 is not within the lower bound L and upper bound U of the equation’s LHS

IF M > 0 THEN DO I = 1, 10S1 A(I) = A(I+M) + B(I) ENDDO

Dependence equation = +M

is rewritten as- M + - = 0

Constraints:1 < M < 0 < < 90 < < 9

67


- M + - = 0

L U step

- M + - - M + - original eq.

- M + - 9 - M + - 0 eliminate

- M - 9 - M + 9 eliminate

- 8 eliminate M

Banerjee Test (2)

Test for (*) dependence:possible dependence because 0 lies within[-,8]

Constraints:1 < M < 0 < < 90 < < 9

68


- M + - = 0

L U step


- M + - 9 - M + - (+1) eliminate

- M - 9 - M - 1 eliminate

- - 2 eliminate M

Banerjee Test (3)

Test for (<) dependence:assume that < -1

No dependence because 0 does no lie within [-,-2]

Constraints:1 < M <

0 < < -1 < 81 < +1 < < 9

69


- M + - = 0

L U step


- M + - (+1) - M + eliminate

- M - 1 - M + 9 eliminate

- 8 eliminate M

Banerjee Test (4)

Test for (>) dependence:assume that +1 >

Possible dependence because 0 lies within[-,8]

Constraints:1 < M <

1 < -1 < < 90 < < +1 < 8

70


- M + - = 0

L U step


- M + - - M + - eliminate

- M - M eliminate

- - 1 eliminate M

Banerjee Test (5)

Test for (=) dependence:assume that =

No dependence because 0 does not lie within [-,-1]

Constraints:1 < M <

0 < = < 9

71

Testing for All Direction Vectors

Refine dependence between a pair of statements by testing for the most general set of direction vectors and upon failure refine each component recursively

(*,*,*)

(=,*,*)(<,*,*) (>,*,*)

(<,=,*)(<,<,*) (<,>,*)

(<,=,=)(<,=,<) (<,=,>)

yes (may depend)

yes

yesno (indep)

72

Run-Time Dependence Testing

When a dependence equation includes unknowns, we can sometimes find the constraints on these unknowns to solve the equation and impose these constraints at run time on multi-version code

DO I = 1, 10S1 A(I) = A(I-K) + B(I) ENDDO

IF K = 0 OR K < -9 OR K > 9 THEN PARALEL DO I = 1, 10S1 A(I) = A(I-K) + B(I) ENDDO ELSE DO I = 1, 10S1 A(I) = A(I-K) + B(I) ENDDO ENDIF

1 theory and practice of dependence testing data and control dependences scalar data dependences ...

Documents

x s2 x

s1 s2s1

statement s1

goto s3 s2

statement s2 iff

ai bi enddoparallel

loop iterations

memory location