Dependence Analysis
- Edith Norton
- 6 years ago
Goals:
- determine what operations can be done in parallel
- determine whether the order of execution of operations can be altered

Basic idea:
- determine a partial order on operations
- two operations not ordered in the partial order => these operations can be done in parallel
- any new execution order of operations that is consistent with the partial order is legal

What should be the granularity of operations?
- depends on application
- scheduling for pipelined machines: granularity = machine instructions
- loop restructuring: granularity = loop iterations
Dependences: data (flow, anti, output, input) and control.

Flow dependence: S1 -> S2
(i) S1 executes before S2 in program order
(ii) S1 writes into a location that is read by S2

Anti-dependence: S1 -> S2
(i) S1 executes before S2
(ii) S1 reads from a location that is overwritten later by S2

Output dependence: S1 -> S2
(i) S1 executes before S2
(ii) S1 and S2 write to the same location

Input dependence: S1 -> S2
(i) S1 executes before S2
(ii) S1 and S2 both read from the same location

Example:
  S1: x := 2
  S2: y := x + 1
  S3: x := 3
  S4: y := 7
Flow: S1 -> S2; anti: S2 -> S3; output: S1 -> S3 and S2 -> S4.
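The four dependence kinds above can be checked mechanically. The sketch below is illustrative Python, not code from the lecture; the statement encoding and the function name `dependences` are invented for the example.

```python
# Classify dependences between statements of a straight-line program.
# Each statement is a pair (writes, reads) of variable-name sets;
# list position is program order.
def dependences(stmts):
    deps = []
    for i, (w1, r1) in enumerate(stmts):
        for j in range(i + 1, len(stmts)):
            w2, r2 = stmts[j]
            if w1 & r2: deps.append(("flow", i, j))    # S_i writes, S_j reads
            if r1 & w2: deps.append(("anti", i, j))    # S_i reads, S_j overwrites
            if w1 & w2: deps.append(("output", i, j))  # both write same location
            if r1 & r2: deps.append(("input", i, j))   # both read same location
    return deps

# The slide's example: S0: x := 2   S1: y := x + 1   S2: x := 3   S3: y := 7
stmts = [({"x"}, set()), ({"y"}, {"x"}), ({"x"}, set()), ({"y"}, set())]
print(dependences(stmts))
# [('flow', 0, 1), ('output', 0, 2), ('anti', 1, 2), ('output', 1, 3)]
```

Running it on the slide's four statements reports exactly the flow, anti, and output dependences listed above.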
Conservative approximation:
Real programs: imprecise information => need for safe approximation.
When you are not sure whether a dependence exists, you must assume it does.

Example:
  procedure f(X,i,j)
  begin
    X(i) = 10;
    X(j) = 5;
  end

Question: Is there an output dependence from the first assignment to the second?
Answer: If (i = j), there is a dependence; otherwise, not.
=> Unless we know from interprocedural analysis that the parameters i and j are always distinct, we must play it safe and insert the dependence.

Key notion: aliasing: two program names may refer to the same location (like X(i) and X(j)).
May-dependence vs. must-dependence: more precise analysis may eliminate may-dependences.
Loop-level analysis: granularity is a loop iteration.
Key notion: iteration space of a loop

  DO I = 1, 100
    DO J = 1, 100
      S

Each (I,J) value of the loop indices corresponds to one point in the iteration space.
Program order = lexicographic order on the iteration space:
  (1,1) (1,2) (1,3) ... (1,100) (2,1) (2,2) ... (100,100)

How do we compute dependences between loop iterations?
Dependences in loops

  DO 10 I = 1, N
    X(f(I)) = ...
10  ... = ...X(g(I))...

Conditions for flow dependence from iteration I_w to I_r:
  1 <= I_w <= I_r <= N   (write before read)
  f(I_w) = g(I_r)        (same array location)

Conditions for anti-dependence from iteration I_g to I_o:
  1 <= I_g < I_o <= N    (read before write)
  f(I_o) = g(I_g)        (same array location)

Conditions for output dependence from iteration I_w1 to I_w2:
  1 <= I_w1 < I_w2 <= N  (write in program order)
  f(I_w1) = f(I_w2)      (same array location)
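These conditions can be tested by brute force for concrete f, g, and N. A minimal sketch, assuming Python and an invented helper `loop_dependences`; this enumerates the definition of dependence, which is not the compile-time test an analyzer would use, but it makes the conditions concrete.

```python
# Brute-force dependences of "DO I = 1,N: X(f(I)) = ... X(g(I))":
# enumerate iteration pairs and test the three conditions directly.
def loop_dependences(f, g, N):
    flow = [(iw, ir) for iw in range(1, N + 1) for ir in range(iw, N + 1)
            if f(iw) == g(ir)]                  # write, then a later (or same) read
    anti = [(ig, io) for ig in range(1, N + 1) for io in range(ig + 1, N + 1)
            if f(io) == g(ig)]                  # read, then a later write
    output = [(i1, i2) for i1 in range(1, N + 1) for i2 in range(i1 + 1, N + 1)
              if f(i1) == f(i2)]                # two writes to the same location
    return flow, anti, output

# X(I+1) = ... X(I) ... : every iteration writes what the next one reads
flow, anti, output = loop_dependences(lambda i: i + 1, lambda i: i, N=5)
print(flow)    # [(1, 2), (2, 3), (3, 4), (4, 5)]
```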
Dependences in nested loops

  DO 10 I = 1, N
    DO 10 J = 1, M
      X(f(I,J),g(I,J)) = ...
10    ... = ...X(h(I,J),k(I,J))...

Recall: program order is the lexicographic order on iterations of nested loops.

Conditions for flow dependence from iteration (I_1,J_1) to (I_2,J_2):
  (1,1) <= (I_1,J_1) <= (I_2,J_2) <= (N,M)   (lexicographic order)
  f(I_1,J_1) = h(I_2,J_2)
  g(I_1,J_1) = k(I_2,J_2)

Anti and output dependences can be defined analogously.
In general, we need more information than the presence/absence of dependences.

Example: loop interchange

  DO 10 I = 1,100               DO 10 I = 1,100
    DO 10 J = 1,100        vs     DO 10 J = 1,100
10    X(I,J) = X(I-1,J+1) + 1 10    X(I,J) = X(I-1,J-1) + 1

Check: both loop nests have dependences.
Loop interchange: illegal in the first loop nest (eg: (2,1) depends on (1,2)), but legal in the second loop nest!
What additional information about dependences should we compute?
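The legality claim can be checked empirically: run each nest in the original (I,J) order and in the interchanged (J,I) order and compare the results. An illustrative Python sketch, with bounds shrunk to 10x10 and helper names invented:

```python
# Execute a 2-D loop nest body in either iteration order and return the array.
def run(update, order, n=10):
    X = [[0] * (n + 2) for _ in range(n + 2)]   # pad so I-1 and J+1 stay in range
    for a in range(1, n + 1):
        for b in range(1, n + 1):
            i, j = (a, b) if order == "IJ" else (b, a)
            update(X, i, j)
    return X

def nest1(X, i, j):
    X[i][j] = X[i - 1][j + 1] + 1   # X(I,J) = X(I-1,J+1) + 1

def nest2(X, i, j):
    X[i][j] = X[i - 1][j - 1] + 1   # X(I,J) = X(I-1,J-1) + 1

print(run(nest1, "IJ") == run(nest1, "JI"))   # False: interchange is illegal
print(run(nest2, "IJ") == run(nest2, "JI"))   # True:  interchange is legal
```

The first nest's dependence goes from (1,2) to (2,1), which the interchanged order reverses; the second nest's dependences are preserved by interchange.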
One approach: compute the full dependence relation using an IP (integer programming) calculator.

  DO 10 I = 1,100
    DO 10 J = 1,100
10    X(I,J) = X(I-1,J+1) + 1

Flow dependence constraints: (I_w,J_w) -> (I_r,J_r)
  (1,1) <= (I_w,J_w) <= (I_r,J_r) <= (100,100)
  I_w = I_r - 1
  J_w = J_r + 1

Use the IP calculator to compute the full dependence relation:
  {(1,2) -> (2,1); (1,3) -> (2,2); ... (2,2) -> (3,1); ...}

Use the full relation to check legality of loop transformations.

Problems:
(i) Expensive (time/space) to compute the full relation
(ii) Expensive to look up a big relation
(iii) Variable loop bounds?
Summarize the dependence relation: distance/direction.

Look at the dependence relation from the previous slide:
  {(1,2) -> (2,1); (1,3) -> (2,2); ... (2,2) -> (3,1); ...}

Difference between dependent iterations = (1,-1). That is, (I_w,J_w) -> (I_r,J_r) in the dependence relation implies
  I_r - I_w = 1
  J_r - J_w = -1

We will say that the distance vector is [1,-1].
Note: from the distance vector, we can easily recover the full relation. In this case, the distance vector is an exact summary of the relation.
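A quick sketch of how the distance vector summarizes this relation (illustrative Python; bounds shrunk from 100 to 10):

```python
# Enumerate the flow dependence relation of X(I,J) = X(I-1,J+1) + 1
# and collect the set of iteration differences.
N = 10
relation = [((iw, jw), (ir, jr))
            for iw in range(1, N + 1) for jw in range(1, N + 1)
            for ir in range(1, N + 1) for jr in range(1, N + 1)
            if (iw, jw) < (ir, jr)                # lexicographic program order
            and iw == ir - 1 and jw == jr + 1]    # same array location

distances = {(ir - iw, jr - jw) for ((iw, jw), (ir, jr)) in relation}
print(distances)   # {(1, -1)}: one distance vector summarizes the whole relation
```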
Summaries of dependence relations are usually approximate.

Example:
  DO 10 I = 1,100
10  X(2I+1) = X(I) + 1

Dependence equation: 2I_w + 1 = I_r.
Flow relation: {(1 -> 3); (2 -> 5); (3 -> 7); ...}   (1)
No fixed distance between dependent iterations!
Since all distances are +ve, use a direction vector instead: direction = [+].
In general, direction = [+] or [0] or [-]. Also written by some authors as (<), (=), or (>).
Direction vectors are not exact. (eg) if we try to recover the dependence relation from direction [+], we get a bigger relation than (1):
  {(1 -> 2); (1 -> 3); ...; (1 -> 100); (2 -> 3); (2 -> 4); ...}
Intuition: [+] direction = all distances in the range [1, infinity) etc.
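The same kind of enumeration shows why no single distance works for this example and why [+] is the right summary (illustrative Python):

```python
# Flow relation of X(2I+1) = X(I) + 1 on DO I = 1,100.
N = 100
relation = [(iw, ir) for iw in range(1, N + 1) for ir in range(1, N + 1)
            if 2 * iw + 1 == ir]                 # dependence equation
distances = sorted(ir - iw for iw, ir in relation)
print(distances[:4])                             # [2, 3, 4, 5]: no single distance
direction = "+" if all(d > 0 for d in distances) else "*"
print(direction)                                 # +
```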
Directions for nested loops

If the loop nest is (I,J) and (I_1,J_1) -> (I_2,J_2) is in the dependence relation, then
  Distance  = (I_2 - I_1, J_2 - J_1)
  Direction = (sign(I_2 - I_1), sign(J_2 - J_1))

Figure 1: Legal directions for nested loop (I,J): (<,<), (<,=), (<,>), (=,<), (=,=)
How to compute directions: use the IP engine

  DO 10 I = 1, 100
    X(f(I)) = ...
10  ... = ...X(g(I))...

Focus on flow dependences:
  f(I_w) = g(I_r)
  1 <= I_w <= I_r <= 100

First, use the inequalities shown above to test if a dependence exists in any direction (called the [*] direction).
If the IP engine says there are no solutions, there is no dependence. Otherwise, determine the direction(s) of dependence:
  Test for direction [<]: add the inequality I_w < I_r
  Test for direction [=]: add the inequality I_w = I_r
In a single loop, direction [>] cannot occur.
Computing directions: nested loops

Same idea as single loop: hierarchical testing. Refine direction vectors top down.

Figure 2: Hierarchical testing for a nested loop
  (*,*)
    (<,*) -> (<,<) (<,=) (<,>)
    (=,*) -> (=,<) (=,=) (=,>)
    (>,*)  [illegal direction]

Key ideas:
(1) No dependence in the (*,*) direction => no need to do more tests.
(2) Do not test for impossible directions like (>,*).
Engineering a dependence analyzer:
In principle, we can use the IP engine to compute all directions.
Reality: most subscripts and loop bounds are simple!
First check for simple cases. Call the IP engine for more complex cases.
ZIV, SIV and MIV subscripts

  DO 10 I
    DO 10 J
      DO 10 K
10      A(5,I+1,J) = ...A(N,I,K) + c

Subscripts in the 1st dimension of A do not involve loop variables
  => called Zero Index Variable (ZIV) subscripts.
Subscripts in the 2nd dimension of A involve only one loop variable (I)
  => called Single Index Variable (SIV) subscripts.
Subscripts in the 3rd dimension of A involve many loop variables (J,K)
  => called Multiple Index Variable (MIV) subscripts.
Separable SIV subscripts

  DO 10 I
    DO 10 J
      DO 10 K
10      A(I,J,J) = ...A(I,J,K) + c

Subscripts in both the first and second dimensions are SIV.
However, the index variable in the first subscript (I) does not appear in any other dimension
  => separable SIV subscript.
The second subscript is also SIV, but its index variable J appears in the 3rd dimension as well
  => coupled SIV subscript.
Significance of separability: break the problem into smaller pieces.

  DO 10 I
    DO 10 J
      DO 10 K
10      A(I,J,J) = ...A(I,J,K) + c

Equations for flow dependence:
  I_w = I_r
  J_w = J_r
  J_w = K_r

If the bounds on I are independent of J and K (as here), the first equation can be solved separately from the other two: the 1st component of direction vectors can be computed independently of the 2nd and 3rd components.
In benchmarks, 80% of subscripts are separable!
Separable SIV subscripts: simple, precise tests exist.

  DO 10 J
    DO 10 I
      DO 10 K
10      X(aI+b,..,..) = ..X(cI+d,..,..)..

Equation for flow dependence: a*I_w + b = c*I_r + d.

Strong SIV subscript: a = c
  => I_r - I_w = (b - d)/a
If a divides (b - d), and the quotient is within the loop bounds of I, there is a dependence, and we have the Ith component of the direction/distance vector. Otherwise, no dependence exists!
In benchmarks, roughly 37% of subscripts are strong SIV!
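A sketch of the strong SIV test in Python. The function name and the exact bounds check are assumptions; a full implementation would also record the sign of the distance to distinguish flow from anti directions.

```python
# Strong SIV test (a == c): solve a*Iw + b = a*Ir + d for the dependence
# distance, then check divisibility and the loop bounds.
def strong_siv(a, b, d, lo, hi):
    """Subscripts X(a*I+b) and X(a*I+d) on DO I = lo,hi.
    Returns the distance Ir - Iw, or None if no dependence."""
    if (b - d) % a != 0:
        return None                  # a does not divide (b - d): no dependence
    dist = (b - d) // a              # Ir - Iw = (b - d)/a
    if abs(dist) > hi - lo:
        return None                  # quotient outside the loop bounds
    return dist

# X(2I+4) = ... X(2I) ... on DO I = 1,100: distance (4 - 0)/2 = 2
print(strong_siv(2, 4, 0, 1, 100))   # 2
print(strong_siv(2, 3, 0, 1, 100))   # None: 2 does not divide 3
```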
Weak SIV subscript: either a or c is 0.

Another important case:
  DO 10 I
10  X(aI+b,..,..) = ..X(cI+d,..,..)..

Say c is 0 => I_w = (d - b)/a and I_r > I_w.
If a divides (d - b), and the quotient is within the loop bounds, then a dependence exists with all iterations beyond I_w.

Important loop transformation: index-set splitting.
It may be worth eliminating the dependence by performing iterations 1..((d - b)/a) - 1 in one loop, iteration (d - b)/a by itself, and then the remaining iterations in another loop.
General SIV test

Equation: a*I_w + b = c*I_r + d   (1)

We can use column operations to reduce to echelon form etc. But usually, a and c are small integers (magnitude < 5). Exploit this.

Build a table indexed by (a,c) pairs for a and c between 1 and 5.
Two entries in each table position:
(i) gcd(a,c)
(ii) one solution (I_w,I_r) = (s,t) to the eqn a*I_w + c*I_r = gcd(a,c)

Given Equation (1), if a and c are between 1 and 5:
(i) if gcd(a,c) does not divide (d - b), no solution
(ii) otherwise, one solution is (s,t) * (d - b)/gcd(a,c)
(iii) general solution: (I_w,I_r) = n * (c,-a)/gcd(c,a) + (s,t) * (d - b)/gcd(a,c)   (n is a parameter)

Case when a or c in Equation (1) are -ve: minor modification of this procedure.
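Rather than the precomputed (a,c) table, the same computation can be sketched with the extended Euclidean algorithm, writing equation (1) as a*I_w - c*I_r = d - b. This is illustrative Python with invented names; a real test would also intersect the general solution family with the loop bounds.

```python
# Extended Euclid: returns (g, x, y) with a*x + b*y = g.
def ext_gcd(a, b):
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def general_siv(a, b, c, d):
    """One integer solution (Iw, Ir) of a*Iw + b = c*Ir + d, or None.
    Adding multiples of (c, a)/gcd(a, c) gives the general solution."""
    g, s, t = ext_gcd(a, -c)           # g = a*s + (-c)*t
    if (d - b) % g != 0:
        return None                    # gcd does not divide (d - b): no solution
    k = (d - b) // g
    return s * k, t * k                # scaled particular solution

iw, ir = general_siv(2, 0, 3, 1)       # 2*Iw = 3*Ir + 1
print(2 * iw == 3 * ir + 1)            # True
```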
Linearization: reduce the number of equations.
Not as important nowadays, but widely used earlier.

  DO 10 I = ...
    DO 10 J = ...
      X(f(I,J),g(I,J)) = ...
10    ... = ...X(h(I,J),k(I,J))...

Equations for flow dependence:
  f(I_w,J_w) = h(I_r,J_r)
  g(I_w,J_w) = k(I_r,J_r)

We could convert the system to a single equation (any constant C):
  C*f(I_w,J_w) + g(I_w,J_w) = C*h(I_r,J_r) + k(I_r,J_r)

Any solution of the simultaneous system is a solution of the single equation. But the single equation may have solutions even if the system does not
  => the single equation is a conservative approximation to the system.
Can we choose C so we do not lose precision?
If the range of the functions g and k is less than some u, choose C = u.

Common case: we know the size of array X (say (1..50,1..100)). So g and k values are < 101. Convert
  f(I_w,J_w) = h(I_r,J_r)
  g(I_w,J_w) = k(I_r,J_r)
to
  101*f(I_w,J_w) + g(I_w,J_w) = 101*h(I_r,J_r) + k(I_r,J_r)
w/o loss of precision. Easy to verify that all solutions to the single equation are also solutions to the system.

Problem: we may not know the array size at compile time, eg. for array parameters, dynamically allocated arrays.
Banerjee's test

First dependence test to take inequalities into account.

  DO 10 I = 1, 100
    X(2I+3) = ...
10  ... = ... X(I+7)

Flow dependence constraints:
  2I_w + 3 = I_r + 7
  1 <= I_w <= I_r <= 100

Equation: write as h(I_w,I_r) = 2I_w - I_r - 4 = 0.
Inequalities: bounds on a convex region R.
Solving the constraints is equivalent to the following question: does h become 0 at an integer point within region R?
Banerjee's test: checks whether h has a zero in region R.
Does not check that the zero is reached at an integer point.

Theorem (intermediate value theorem): A continuous function f(x,y) has a zero in a convex, connected region R iff
  max_R f(x,y) >= 0 and min_R f(x,y) <= 0.

Special case: f is linear (or affine) and region R is defined by linear inequalities. In this case, Dantzig's theorem: f obtains its minimum and maximum values at the corners of region R.

Intuitive idea of Banerjee's test:
(i) Find the corners of region R.
(ii) Evaluate f at the corners and find the maximum and minimum.
(iii) If max >= 0 and min <= 0, then f has a zero in R.
Example: f(x) = 3x - 2, R = [a,b]
  a < b => 3a < 3b => 3a - 2 < 3b - 2
  "Corners" are a and b.
  So minimum value = 3a - 2; maximum value = 3b - 2.

Notation:
  x+ := x if x >= 0; 0 otherwise
  x- := -x if x <= 0; 0 otherwise

Let f(x) = c*x + d and R = [x_min, x_max].
Using these functions, we can compute the max and min easily:
  max_R f = c+ * x_max - c- * x_min + d
  min_R f = -c- * x_max + c+ * x_min + d

What about higher dimensions?
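The x+/x- notation and the 1-D max/min formulas translate directly (illustrative Python; the helper names are invented):

```python
# Positive and negative parts, both non-negative, with x = pos(x) - neg(x).
pos = lambda x: x if x >= 0 else 0        # x+
neg = lambda x: -x if x <= 0 else 0       # x-

def affine_max(c, d, x_min, x_max):       # max of c*x + d over [x_min, x_max]
    return pos(c) * x_max - neg(c) * x_min + d

def affine_min(c, d, x_min, x_max):       # min of c*x + d over [x_min, x_max]
    return -neg(c) * x_max + pos(c) * x_min + d

# f(x) = 3x - 2 on [a,b] = [1,10]: min = 3*1 - 2 = 1, max = 3*10 - 2 = 28
print(affine_min(3, -2, 1, 10), affine_max(3, -2, 1, 10))   # 1 28
```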
Function: F(I,J). What are max_R F and min_R F?

One way: evaluate F at the corners (compute F(1,1), F(N,1) and F(N,N)) etc.
In high dimensions, it is hard to determine the corners of the region.

Important special case:
(i) variables ordered so that the bounds on variable x_i depend only on x_1, x_2, ..., x_{i-1} (eg. I bounds are constants, J bounds depend only on I)
(ii) each variable has just one linear/affine lower and upper bound

In the special case, variables can be dealt with one at a time in inner to outer order.

[Figure: triangular region R defined by 1 <= I <= N, 1 <= J <= I, with corners (1,1), (N,1), (N,N)]
Example: region R given by 1 <= I <= N, 1 <= J <= I; corners (1,1), (N,1), (N,N).
Objective function: F(I,J) = 3I - 4J + 3.

Key idea: first, maximize/minimize over J, keeping I fixed (J_min = 1, J_max = I):
  max_J F(I,J) = 3I + (-4)+ * J_max - (-4)- * J_min + 3 = 3I - 4 + 3 = 3I - 1
  min_J F(I,J) = 3I - (-4)- * J_max + (-4)+ * J_min + 3 = 3I - 4I + 3 = 3 - I

Now use the bounds on I:
  max_R F(I,J) = (3)+ * I_max - (3)- * I_min - 1 = 3N - 1
  min_R F(I,J) = -(-1)- * I_max + (-1)+ * I_min + 3 = 3 - N

Summary: for the special case, the higher-dimensional problem is solved by repeated application of the 1-D solution.
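The two elimination steps can be checked against brute-force enumeration of the triangular region (illustrative Python, with N fixed to 10):

```python
# F(I,J) = 3I - 4J + 3 over R: 1 <= I <= N, 1 <= J <= I.
pos = lambda x: max(x, 0)
neg = lambda x: max(-x, 0)
N = 10

# Eliminate J first (coefficient -4, J in [1, I], symbolically):
#   max_J F = 3I - 1     min_J F = 3 - I
# Then eliminate I over [1, N]:
max_R = pos(3) * N - neg(3) * 1 - 1        # max of 3I - 1  ->  3N - 1
min_R = -neg(-1) * N + pos(-1) * 1 + 3     # min of 3 - I   ->  3 - N
print(max_R, min_R)                        # 29 -7

# Brute-force check over the triangular region:
vals = [3 * i - 4 * j + 3 for i in range(1, N + 1) for j in range(1, i + 1)]
print(max(vals), min(vals))                # 29 -7
```

Since max_R >= 0 and min_R <= 0, Banerjee's test would (conservatively) report a possible dependence here.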
Summary of Banerjee's test:

Conditions:
(i) single equality constraint (multiple equalities: use linearization first)
(ii) variables can be ordered x_1, x_2, ... so that each upper/lower bound on x_i is an affine function of x_1, x_2, ..., x_{i-1}

Banerjee's test:
(i) Interpret the equality constraint as a function F over the domain R given by the inequalities.
(ii) Maximize/minimize F over each variable separately in inner to outer order: x_i, x_{i-1}, ..., x_1.
(iii) If max_R F >= 0 and min_R F <= 0, then F becomes 0 at some point in R.
(iv) Declare conservatively that F becomes 0 at an integer point => there exist integer solutions to the system of inequalities/equality.
Implementation notes:

(I) Check for ZIV/separable SIV first before calling the IP engine.
(II) In hierarchical testing for directions, solution of the equalities should be done only once.
(III) Output of the equality solver may be useful to determine distances and to eliminate some directions from consideration.

(eg)
  DO 10 I
    DO 10 J
10    A(J) = A(J+1) + 1

Flow dependence equation: J_w = J_r + 1 => distance(J) = -1.
Direction vector cannot be (=,>). So the only possibility is (<,>): test only for this.
(IV) Array aliasing introduces complications!

  procedure f(X,Y)
    DO I ...
      X(I) = ...
      ... = ...Y(I-1)...

If X and Y may be aliased, there are may-dependences in the loop.
FORTRAN convention: aliased parameters may not be modified in the procedure.

(V) Pointer-based data structures: require totally different technology.
(VI) Negative loop step sizes:

  DO 10 I = 10,1,-1

If we use I to index into the iteration space, dependence distances become -ve!
Solution: use trip counts (0,1,...) to index loop iterations.

  DO 10 I = l,u,s
10  X(I) = X(2I-5)...

Flow dependence from trip n_w to n_r:
  l + n_w*s = 2(l + n_r*s) - 5
Distance vector = [n_r - n_w].

Loop normalization: transform all loops so the low index is 0 and the step size is 1. We are doing it implicitly.
(VII) Imperfectly nested loops

What should the distance/direction analog be for imperfectly nested loops?

Perfectly nested loop: all statements nested within the same loops
  DO 10 I = 1,10
    DO 10 J = 1,10
10    Y(I) = Y(I) + A(I,J)*X(J)

Imperfectly nested loop: triangular solve/Cholesky/LU
  DO 10 I = 1,N
    DO 20 J = 1, I-1
20    B(I) = B(I) - L(I,J)*X(J)
10  X(I) = B(I)/L(I,I)

Distance/direction is not adequate for imperfectly nested loops.
One approach: compute distance/direction only for common loops.
Not adequate for many applications like imperfect loop interchange:

(row triangular solve)
  DO 10 I = 1,N
    DO 20 J = 1, I-1
20    B(I) = B(I) - L(I,J)*X(J)
10  X(I) = B(I)/L(I,I)

=>

(column triangular solve)
  DO 10 I = 1,N
    X(I) = B(I)/L(I,I)
    DO 20 J = I+1, N
20    B(J) = B(J) - L(I,J)*X(I)
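The two versions can be checked for equivalence on a small concrete system (illustrative Python, not from the slides; note that for a lower-triangular matrix the column-oriented update reads the entry L(J,I)):

```python
# Forward substitution for lower-triangular L: row- vs column-oriented order.
def row_solve(L, B, n):
    X = [0.0] * n
    for i in range(n):
        for j in range(i):            # B(I) = B(I) - L(I,J)*X(J)
            B[i] -= L[i][j] * X[j]
        X[i] = B[i] / L[i][i]         # X(I) = B(I)/L(I,I)
    return X

def col_solve(L, B, n):
    X = [0.0] * n
    for i in range(n):
        X[i] = B[i] / L[i][i]         # X(I) = B(I)/L(I,I)
        for j in range(i + 1, n):     # update later rows: B(J) -= L(J,I)*X(I)
            B[j] -= L[j][i] * X[i]
    return X

L = [[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [4.0, 5.0, 6.0]]
print(row_solve(L, [2.0, 4.0, 15.0], 3))   # [1.0, 1.0, 1.0]
print(col_solve(L, [2.0, 4.0, 15.0], 3))   # [1.0, 1.0, 1.0]: same answer
```

Both orders compute the same X even though the loop structures differ, which is exactly what a common-loops-only dependence summary fails to establish.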
What is a good dependence abstraction for imperfectly nested loops?
Conclusions

Traditional position: exact dependence testing (using the IP engine) is too expensive.

Recent experience:
(i) exact dependence testing is OK provided we first check for the easy cases (ZIV, strong SIV, weak SIV)
(ii) the IP engine is called for 3-4% of tests for direction vectors
(iii) cost of exact dependence testing: 3-5% of compile time
Homework #2: http://www.cs.cornell.edu/courses/cs612/2002sp/assignments/ps2/ps2.htm
Due Thursday, March 7th.
More informationThe purpose of computing is insight, not numbers. Richard Wesley Hamming
Systems of Linear Equations The purpose of computing is insight, not numbers. Richard Wesley Hamming Fall 2010 1 Topics to Be Discussed This is a long unit and will include the following important topics:
More informationSolving Linear Systems of Equations
November 6, 2013 Introduction The type of problems that we have to solve are: Solve the system: A x = B, where a 11 a 1N a 12 a 2N A =.. a 1N a NN x = x 1 x 2. x N B = b 1 b 2. b N To find A 1 (inverse
More informationAlgorithms and Their Complexity
CSCE 222 Discrete Structures for Computing David Kebo Houngninou Algorithms and Their Complexity Chapter 3 Algorithm An algorithm is a finite sequence of steps that solves a problem. Computational complexity
More informationComputational Linear Algebra
Computational Linear Algebra PD Dr. rer. nat. habil. Ralf Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2017/18 Part 2: Direct Methods PD Dr.
More informationBLAS: Basic Linear Algebra Subroutines Analysis of the Matrix-Vector-Product Analysis of Matrix-Matrix Product
Level-1 BLAS: SAXPY BLAS-Notation: S single precision (D for double, C for complex) A α scalar X vector P plus operation Y vector SAXPY: y = αx + y Vectorization of SAXPY (αx + y) by pipelining: page 8
More informationELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS. 1. Linear Equations and Matrices
ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS KOLMAN & HILL NOTES BY OTTO MUTZBAUER 11 Systems of Linear Equations 1 Linear Equations and Matrices Numbers in our context are either real numbers or complex
More informationAnalysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College
Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Why analysis? We want to predict how the algorithm will behave (e.g. running time) on arbitrary inputs, and how it will
More informationG METHOD IN ACTION: FROM EXACT SAMPLING TO APPROXIMATE ONE
G METHOD IN ACTION: FROM EXACT SAMPLING TO APPROXIMATE ONE UDREA PÄUN Communicated by Marius Iosifescu The main contribution of this work is the unication, by G method using Markov chains, therefore, a
More informationA Polynomial-Time Algorithm for Memory Space Reduction
A Polynomial-Time Algorithm for Memory Space Reduction Yonghong Song Cheng Wang Zhiyuan Li Sun Microsystems, Inc. Department of Computer Sciences 4150 Network Circle Purdue University Santa Clara, CA 95054
More informationMatrix Arithmetic. j=1
An m n matrix is an array A = Matrix Arithmetic a 11 a 12 a 1n a 21 a 22 a 2n a m1 a m2 a mn of real numbers a ij An m n matrix has m rows and n columns a ij is the entry in the i-th row and j-th column
More informationComputation of the mtx-vec product based on storage scheme on vector CPUs
BLAS: Basic Linear Algebra Subroutines BLAS: Basic Linear Algebra Subroutines BLAS: Basic Linear Algebra Subroutines Analysis of the Matrix Computation of the mtx-vec product based on storage scheme on
More informationUniversity of the Virgin Islands, St. Thomas January 14, 2015 Algorithms and Programming for High Schoolers. Lecture 5
University of the Virgin Islands, St. Thomas January 14, 2015 Algorithms and Programming for High Schoolers Numerical algorithms: Lecture 5 Today we ll cover algorithms for various numerical problems:
More informationTuring Machines, diagonalization, the halting problem, reducibility
Notes on Computer Theory Last updated: September, 015 Turing Machines, diagonalization, the halting problem, reducibility 1 Turing Machines A Turing machine is a state machine, similar to the ones we have
More informationChapter 2. Solving Systems of Equations. 2.1 Gaussian elimination
Chapter 2 Solving Systems of Equations A large number of real life applications which are resolved through mathematical modeling will end up taking the form of the following very simple looking matrix
More informationFundamentals of Engineering Analysis (650163)
Philadelphia University Faculty of Engineering Communications and Electronics Engineering Fundamentals of Engineering Analysis (6563) Part Dr. Omar R Daoud Matrices: Introduction DEFINITION A matrix is
More informationThe Gauss-Jordan Elimination Algorithm
The Gauss-Jordan Elimination Algorithm Solving Systems of Real Linear Equations A. Havens Department of Mathematics University of Massachusetts, Amherst January 24, 2018 Outline 1 Definitions Echelon Forms
More informationCS 246 Review of Linear Algebra 01/17/19
1 Linear algebra In this section we will discuss vectors and matrices. We denote the (i, j)th entry of a matrix A as A ij, and the ith entry of a vector as v i. 1.1 Vectors and vector operations A vector
More informationPhys 201. Matrices and Determinants
Phys 201 Matrices and Determinants 1 1.1 Matrices 1.2 Operations of matrices 1.3 Types of matrices 1.4 Properties of matrices 1.5 Determinants 1.6 Inverse of a 3 3 matrix 2 1.1 Matrices A 2 3 7 =! " 1
More informationIE 5531: Engineering Optimization I
IE 5531: Engineering Optimization I Lecture 5: The Simplex method, continued Prof. John Gunnar Carlsson September 22, 2010 Prof. John Gunnar Carlsson IE 5531: Engineering Optimization I September 22, 2010
More informationTime Analysis of Sorting and Searching Algorithms
Time Analysis of Sorting and Searching Algorithms CSE21 Winter 2017, Day 5 (B00), Day 3 (A00) January 20, 2017 http://vlsicad.ucsd.edu/courses/cse21-w17 Big-O : How to determine? Is f(n) big-o of g(n)?
More informationMATRICES. a m,1 a m,n A =
MATRICES Matrices are rectangular arrays of real or complex numbers With them, we define arithmetic operations that are generalizations of those for real and complex numbers The general form a matrix of
More informationOutline. Relaxation. Outline DMP204 SCHEDULING, TIMETABLING AND ROUTING. 1. Lagrangian Relaxation. Lecture 12 Single Machine Models, Column Generation
Outline DMP204 SCHEDULING, TIMETABLING AND ROUTING 1. Lagrangian Relaxation Lecture 12 Single Machine Models, Column Generation 2. Dantzig-Wolfe Decomposition Dantzig-Wolfe Decomposition Delayed Column
More informationMTH 2530: Linear Algebra. Sec Systems of Linear Equations
MTH 0 Linear Algebra Professor Chao Huang Department of Mathematics and Statistics Wright State University Week # Section.,. Sec... Systems of Linear Equations... D examples Example Consider a system of
More information9.1 - Systems of Linear Equations: Two Variables
9.1 - Systems of Linear Equations: Two Variables Recall that a system of equations consists of two or more equations each with two or more variables. A solution to a system in two variables is an ordered
More informationThe Data-Dependence graph of Adjoint Codes
The Data-Dependence graph of Adjoint Codes Laurent Hascoët INRIA Sophia-Antipolis November 19th, 2012 (INRIA Sophia-Antipolis) Data-Deps of Adjoints 19/11/2012 1 / 14 Data-Dependence graph of this talk
More informationChapter 5 Linear Programming (LP)
Chapter 5 Linear Programming (LP) General constrained optimization problem: minimize f(x) subject to x R n is called the constraint set or feasible set. any point x is called a feasible point We consider
More informationSection 5.3 Systems of Linear Equations: Determinants
Section 5. Systems of Linear Equations: Determinants In this section, we will explore another technique for solving systems called Cramer's Rule. Cramer's rule can only be used if the number of equations
More informationLoop parallelization using compiler analysis
Loop parallelization using compiler analysis Which of these loops is parallel? How can we determine this automatically using compiler analysis? Organization of a Modern Compiler Source Program Front-end
More informationLinear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space
Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................
More informationAck: 1. LD Garcia, MTH 199, Sam Houston State University 2. Linear Algebra and Its Applications - Gilbert Strang
Gaussian Elimination CS6015 : Linear Algebra Ack: 1. LD Garcia, MTH 199, Sam Houston State University 2. Linear Algebra and Its Applications - Gilbert Strang The Gaussian Elimination Method The Gaussian
More informationMath 471 (Numerical methods) Chapter 3 (second half). System of equations
Math 47 (Numerical methods) Chapter 3 (second half). System of equations Overlap 3.5 3.8 of Bradie 3.5 LU factorization w/o pivoting. Motivation: ( ) A I Gaussian Elimination (U L ) where U is upper triangular
More informationMatrix Algebra Determinant, Inverse matrix. Matrices. A. Fabretti. Mathematics 2 A.Y. 2015/2016. A. Fabretti Matrices
Matrices A. Fabretti Mathematics 2 A.Y. 2015/2016 Table of contents Matrix Algebra Determinant Inverse Matrix Introduction A matrix is a rectangular array of numbers. The size of a matrix is indicated
More informationA fast algorithm to generate necklaces with xed content
Theoretical Computer Science 301 (003) 477 489 www.elsevier.com/locate/tcs Note A fast algorithm to generate necklaces with xed content Joe Sawada 1 Department of Computer Science, University of Toronto,
More information0 0 0 # In other words it must saisfy the condition that the successive rst non-zero entries in each row are indented.
Mathematics 07 October 11, 199 Gauss elimination and row reduction I recall that a matrix is said to be in row-echelon form if it looks like this: # 0 # 0 0 0 # where the # are all non-zero and the are
More informationTopic 15 Notes Jeremy Orloff
Topic 5 Notes Jeremy Orloff 5 Transpose, Inverse, Determinant 5. Goals. Know the definition and be able to compute the inverse of any square matrix using row operations. 2. Know the properties of inverses.
More informationLoop Scheduling and Software Pipelining \course\cpeg421-08s\topic-7.ppt 1
Loop Scheduling and Software Pipelining 2008-04-24 \course\cpeg421-08s\topic-7.ppt 1 Reading List Slides: Topic 7 and 7a Other papers as assigned in class or homework: 2008-04-24 \course\cpeg421-08s\topic-7.ppt
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)
AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 19: Computing the SVD; Sparse Linear Systems Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical
More informationModels of Computation, Recall Register Machines. A register machine (sometimes abbreviated to RM) is specified by:
Models of Computation, 2010 1 Definition Recall Register Machines A register machine (sometimes abbreviated M) is specified by: Slide 1 finitely many registers R 0, R 1,..., R n, each capable of storing
More information