Department of Computer Science and Engineering, IIT Kanpur, INDIA. Code Optimization. Sanjeev K Aggarwal Code Optimization 1 of I


1 Code Optimization Sanjeev K Aggarwal Code Optimization 1 of I

2 Code Optimization

Source code:
    int a, b, c, d
    c = a + b
    d = c + 1

Naive SPARC code (10 cycles):
    ldw a, r1
    ldw b, r2
    add r1, r2, r3
    stw r3, c
    ldw c, r3
    add r3, 1, r4
    stw r4, d

Optimized code (2 cycles):
    add r1, r2, r3
    add r3, 1, r4

3 Most Important Optimizations

- Loop optimizations
    - moving loop invariant computations
    - simplifying or eliminating computations on induction variables
- Global register allocation
- Instruction scheduling

Some optimizations are relevant only to particular programs, depending on the structure and details of the program:
- a highly recursive program may benefit from tail call optimization
- a program with few loops and large basic blocks may benefit from loop distribution
- in-lining may decrease call overheads, but it may also increase the code size and hurt performance by increasing cache misses

4 Low level model:
    String of characters -> Lexical analyzer -> Parser -> Semantic analyzer -> (parse tree) -> Translator -> Low level intermediate code -> Optimizer -> Low level intermediate code -> Final Assembly -> Relocatable object code or executable machine code

Mixed model:
    String of characters -> Lexical analyzer -> Parser -> Semantic analyzer -> (parse tree) -> Intermediate code generator -> Medium level intermediate code -> Optimizer -> Medium level intermediate code -> Code generator -> Low level intermediate code -> Post pass optimizer -> Relocatable object code or executable machine code

Choice is largely one of investment and development focus.

5 Low Level model of Optimization

- difficult to port unless the second architecture is close to the first one
- used in IBM Power-PC and HP PA-RISC
- easier to avoid the phase ordering problem
- exposes all addresses to the optimizer
- recommended for similar architectures

6 Mixed model of optimization

- more easily adapted to new architectures
- may be more efficient at compile time
- used in Sun Sparc, Digital Alpha, Intel, SGI MIPS
- more appropriate for the same language and many architectures

7 Code optimization

Criteria for a code improving transformation:
- it must preserve the meaning of the program
- it must speed up the program
- it must be worth the effort
- the analysis must be fast

local transformation : within basic blocks
global transformation : across basic blocks

8 Common subexpression elimination

    X := a + b          X := a + b
    Y := a + b    =>    Y := X

Local CSE on a block:
    t6 := 4 * i             t6 := 4 * i
    X := a[t6]              X := a[t6]
    t7 := 4 * i             t8 := 4 * j
    t8 := 4 * j       =>    t9 := a[t8]
    t9 := a[t8]             a[t6] := t9
    a[t7] := t9             a[t8] := X
    t10 := 4 * j            goto L
    a[t10] := X
    goto L

9 t4 := 4 * j is already computed, so t8 is replaced by t4:
    t6 := 4 * i
    X := a[t6]
    t9 := a[t4]
    a[t6] := t9
    a[t4] := X
    goto L

t9 computes a[j], which is already held in t5:
    t6 := 4 * i
    X := a[t6]
    a[t6] := t5
    a[t4] := X
    goto L

10 The value X contains is a[i], so use t3 (and t2 in place of t6):
    X := t3
    a[t2] := t5
    a[t4] := X
    goto L

11 Copy propagation

Use g for f after the assignment f = g.

    X := t3                 X := t3
    a[t2] := t5             a[t2] := t5
    a[t4] := X        =>    a[t4] := t3
    goto L                  goto L

    X := Y                  X := Y
    if X > n goto L   =>    if Y > n goto L

12 Dead code elimination

Dead operation:
- unreachable by any path
- produces a value that is not used
- an if whose true and false arcs are the same
- an if whose boolean expression is known at compile time
- a loop that will not be executed
- a procedure that will not be called

    debug := false
    ...
    if (debug) { ... }

    X := t3
    a[t2] := t5             a[t2] := t5
    a[t4] := t3       =>    a[t4] := t3
    goto L                  goto L

13 Renaming temporary variable
Rename temporary variable t to u and replace all the occurrences of t by u. This transformation increases parallelism.

Interchange statements
Two statements may be interchanged if the value of the block is not affected.
    t1 = b + c          t2 = X + Y
    t2 = X + Y    =>    t1 = b + c

Constant folding
Evaluate constant expressions at compile time.
    X = ...       =>    X = 8
    Y = X * 2     =>    Y = 16

14 Algebraic Transformation
- eliminate addition/subtraction with 0: X = X ± 0 should be eliminated
- eliminate multiplication/division by 1: X = X * 1 or X = X / 1 should be eliminated
- eliminate multiplication by 0: X = X * 0 should be replaced with X = 0

Strength reduction
Costly operators should be replaced by cheaper operators:
- replace T = X^2 by T = X * X
- replace T = X * 4 by T = ls(X, 2)
- replace 2 * X by X + X
- replace X / 2 by rs(X, 1)

15 Loop Optimizations

Code motion : an expression whose value does not change inside the loop (a loop invariant expression) can be moved out of the loop and evaluated once.

    while (i <= limit - 2) {
        /* statement not changing limit */
    }
=>
    t := limit - 2
    while (i <= t) {
        statement
    }

16
    i := 1                          i := 1
    loop                            X := a+b
        if i > n then exitloop      loop
        X := a+b              =>        if i > n then exit
        V[i] := X                       V[i] := X
        i := i+1                        i := i+1
    exit                            exit

17 Loop-unrolling

Original loop:
    i := 1
    loop
        if i > n then exitloop
        body
    exit

Unrolled loop:
    i := 1
    loop
        if i > n then exitloop
        body 0
        if i > n then exitloop
        body 1
        ...
        if i > n then exitloop
        body n-1
    exit

Example:
    DO I = 1 to 100 by 1
        A(I) = A(I) + B(I)
    END
=>
    DO I = 1 to 100 by 2
        A(I) = A(I) + B(I)
        A(I+1) = A(I+1) + B(I+1)
    END

18 Induction variable simplification

Before, block B3:
    j = j - 1
    t4 = 4 * j
    t5 = a[t4]
    if t5 > v goto B3

After, t4 is initialized once at the end of B1 and decremented in step with j:
B1:
    i = m - 1
    j = n
    t1 = 4 * n
    v = a[t1]
    t4 = 4 * j
B3:
    j = j - 1
    t4 = t4 - 4
    t5 = a[t4]
    if t5 > v goto B3

Blocks B2, B4 (if i >= j goto B6), B5 and B6 are unchanged.

19 Induction variable simplification

Before:
B1:
    Prod = 0
    I = 1
B2:
    T1 = 4 * I
    T2 = addr(a) - 4
    T3 = T2[T1]
    T4 = addr(b) - 4
    T5 = T4[T1]
    T6 = T3 * T5
    Prod = Prod + T6
    I = I + 1
    if I <= 20 goto B2

After:
B1:
    Prod = 0
    T1 = 0
    T2 = addr(a) - 4
    T4 = addr(b) - 4
B2:
    T1 = T1 + 4
    T3 = T2[T1]
    T5 = T4[T1]
    T6 = T3 * T5
    Prod = Prod + T6
    if T1 <= 76 goto B2
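The effect of this transformation can be checked with a small sketch. The Python below is an illustration only, not part of the slides: the function names are hypothetical, and the byte offset T1 is mapped to a 0-based list index as T1 // 4 - 1 to model the 1-based, 4-byte-element addressing of the three-address code.

```python
def dot_before(a, b):
    """Original loop: T1 = 4 * I is recomputed on every iteration."""
    prod, i = 0, 1
    while True:
        t1 = 4 * i                       # byte offset of element I (1-based)
        prod += a[t1 // 4 - 1] * b[t1 // 4 - 1]
        i += 1
        if not i <= 20:                  # if I <= 20 goto B2
            break
    return prod


def dot_after(a, b):
    """After simplification: T1 advances by 4 and I is eliminated."""
    prod, t1 = 0, 0
    while True:
        t1 += 4                          # replaces the multiplication 4 * I
        prod += a[t1 // 4 - 1] * b[t1 // 4 - 1]
        if not t1 <= 76:                 # test on T1 replaces the test on I
            break
    return prod
```

Both versions visit the same 20 elements, so they should return the same product for any two vectors of length 20.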

20 Loop Jamming

Two adjacent loops may be merged into a single loop.

    For I = 1 to 100
        A(I) = 0
    Endfor
    For I = 1 to 100
        B(I) = X(I) + Y
    Endfor

can be replaced by

    For I = 1 to 100
        A(I) = 0
        B(I) = X(I) + Y
    Endfor

21 Similarly

    For I = 1 to 10 do
        For J = 1 to 10 do
            A(I,J) = 0;
    For I = 1 to 10 do
        A(I,I) = 1;

can be replaced by

    For I = 1 to 10 do
        For J = 1 to 10 do
            A(I,J) = 0;
        A(I,I) = 1;

22 Loop Unswitching

A loop may be split into two loops when the condition tested inside it does not change during the loop. For example, the following loop

    For I = 1 to 100 do
        if Bexpr then
            X[I] = A[I] + B[I]
        else
            X[I] = A[I] - B[I]
        Endif
    Endfor

23 may be replaced by

    If Bexpr then
        For I = 1 to 100 do
            X[I] = A[I] + B[I]
        Endfor
    else
        For I = 1 to 100 do
            X[I] = A[I] - B[I]
        Endfor
    Endif

24 Control Flow Analysis Sanjeev K Aggarwal Control Flow Analysis 1 of I

25 Control Flow Analysis shows hierarchical flow of control source control flow is not available in MIR or LIR loops may be constructed of ifs and gotos Sanjeev K Aggarwal Control Flow Analysis 2 of I

26 example 1

    unsigned int fib(m)
        unsigned int m;
    {
        unsigned int f0 = 0, f1 = 1, f2, i;
        if (m <= 1) {
            return m;
        }
        else {
            for (i = 2; i <= m; i++) {
                f2 = f0 + f1;
                f0 = f1;
                f1 = f2;
            }
            return f2;
        }
    }

27 example 1 continued...

        receive m
        f0 <- 0
        f1 <- 1
        if m <= 1 goto L3
        i <- 2
    L1: if i <= m goto L2
        return f2
    L2: f2 <- f0 + f1
        f0 <- f1
        f1 <- f2
        i <- i + 1
        goto L1
    L3: return m

28 receive m f0 0 f1 1 Y m <= 1 N return m i 2 N i <= m Y return f2 f2 f0 + f1 f0 f1 f1 f2 i i + 1 Sanjeev K Aggarwal Control Flow Analysis 5 of I

29 Basic blocks Useful for collecting information for optimization Basic block : Sequence of consecutive statements where flow of control enters at the beginning and leaves at the end. No branching permitted at an intermediate statement. Sanjeev K Aggarwal Control Flow Analysis 6 of I

30 Partition into basic blocks

Input : a sequence of three address statements
Output : a list of basic blocks

1. Mark leaders
    a : the first statement is a leader
    b : any statement which is the target of a goto (conditional or unconditional) is a leader
    c : any statement that immediately follows a goto (conditional or unconditional) is a leader
2. For each leader, the statements following it up to, but not including, the next leader or the end of the program make up a basic block.
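The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: statements are plain strings, jump targets are written as "goto (n)" with n a 1-based statement number, and the function names find_leaders and basic_blocks are hypothetical.

```python
import re


def find_leaders(code):
    """Return the sorted 1-based statement numbers that are leaders."""
    leaders = {1}                                   # rule (a): first statement
    for number, stmt in enumerate(code, start=1):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)))        # rule (b): target of a goto
            if number < len(code):
                leaders.add(number + 1)             # rule (c): follows a goto
    return sorted(leaders)


def basic_blocks(code):
    """Group statement numbers into blocks, one block per leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]


# A small hypothetical program, not taken from the slides.
program = [
    "i := 1",
    "if i > n goto (5)",
    "i := i + 1",
    "goto (2)",
    "x := i",
]
blocks = basic_blocks(program)
```

For this program the leaders are statements 1, 2, 3 and 5, so statement 4 joins the block that starts at statement 3.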

31
    (1)  i := m - 1
    (2)  j := n
    (3)  t1 := 4 * n
    (4)  v := a[t1]
    (5)  i := i + 1
    (6)  t2 := 4 * i
    (7)  t3 := a[t2]
    (8)  if t3 < v goto (5)
    (9)  j := j - 1
    (10) t4 := 4 * j
    (11) t5 := a[t4]
    (12) if t5 > v goto (9)
    (13) if i >= j goto (23)
    (14) t6 := 4 * i
    (15) x := a[t6]
    (16) t7 := 4 * i
    (17) t8 := 4 * j
    (18) t9 := a[t8]
    (19) a[t7] := t9
    (20) t10 := 4 * j
    (21) a[t10] := x
    (22) goto (5)
    (23) t11 := 4 * i
    (24) x := a[t11]
    (25) t12 := 4 * i
    (26) t13 := 4 * n
    (27) t14 := a[t13]
    (28) a[t12] := t14
    (29) t15 := 4 * n
    (30) a[t15] := x

32 Flow graph

Add flow of control information to basic blocks. The resulting directed graph is called a flow graph.
- The nodes of the flow graph are basic blocks.
- One node is initial.
- There is a directed edge from block B1 to block B2 if B2 can follow B1 in execution order, that is, if
    - there is a jump from B1 to B2, or
    - B2 follows B1 in the order of the program and B1 does not end in an unconditional jump

33
B1:
    i := m - 1
    j := n
    t1 := 4 * n
    v := a[t1]
B2:
    i := i + 1
    t2 := 4 * i
    t3 := a[t2]
    if t3 < v goto B2
B3:
    j := j - 1
    t4 := 4 * j
    t5 := a[t4]
    if t5 > v goto B3
B4:
    if i >= j goto B6
B5 (reached from B4 when the test is false):
    t6 := 4 * i
    x := a[t6]
    t7 := 4 * i
    t8 := 4 * j
    t9 := a[t8]
    a[t7] := t9
    t10 := 4 * j
    a[t10] := x
    goto B2
B6 (reached from B4 when the test is true):
    t11 := 4 * i
    x := a[t11]
    t12 := 4 * i
    t13 := 4 * n
    t14 := a[t13]
    a[t12] := t14
    t15 := 4 * n
    a[t15] := x

34 Introduce two special nodes entry and exit Control always enters through the entry node Control always leaves through the exit node Sanjeev K Aggarwal Control Flow Analysis 11 of I

35 Loops in flow graphs

A loop has a single entry point, and the components of a loop are strongly connected in the CFG.

Dominators
A node d dominates a node n if every path from entry to n passes through d. Every node dominates itself.
Dominance is a reflexive partial order:
    reflexive: a dom a for every a
    antisymmetric: a dom b and b dom a implies a = b
    transitive: a dom b and b dom c implies a dom c

36 Algorithm to find dominators

    D(n0) := {n0};
    for n in N - {n0} do
        D(n) := N;
    while changes to any D(n) occur do
        for n in N - {n0} do
            D(n) := {n} union (intersection of D(p) over all predecessors p of n);

That is, $D(n) = \{n\} \cup \bigcap_{p \in pred(n)} D(p)$.
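A minimal Python sketch of this iteration follows. It assumes the flow graph is given as a dictionary from each node to the list of its predecessors, and that every node other than the entry has at least one predecessor; the function name and the four-node example graph are hypothetical.

```python
def dominators(preds, entry):
    """Iteratively compute D(n), the set of dominators of each node n."""
    nodes = set(preds)
    dom = {n: set(nodes) for n in nodes}     # D(n) := N for every n
    dom[entry] = {entry}                     # D(n0) := {n0}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes - {entry}):
            # D(n) := {n} union the intersection of D(p) over predecessors p
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Hypothetical graph: A -> B, A -> C, B -> D, C -> D, D -> B
example_preds = {"A": [], "B": ["A", "D"], "C": ["A"], "D": ["B", "C"]}
example_dom = dominators(example_preds, "A")
```

In the example, D can be reached through either B or C, so only A and D itself dominate D.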

37 Dominator Tree Control Flow Graph Sanjeev K Aggarwal Control Flow Analysis 14 of I

38 Loop detection

- search for a back edge: an edge a -> b such that b dom a
- given a back edge a -> b, find the set of nodes that can reach node a without going through node b; these nodes, together with node b, form a natural loop with b as header

Given a back edge a -> b, the natural loop is the subgraph which contains a, b and all the nodes which can reach a without passing through b.

39 Algorithm to detect loops

    stack := empty;
    loop := {b};
    insert(a);
    while stack is not empty do begin
        pop m off the stack;
        for each predecessor p of m do insert(p)
    end;

    procedure insert(m);
        if m is not in loop then begin
            loop := loop union {m};
            push m onto stack
        end;
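The procedure above translates almost line by line into Python. The sketch below is illustrative: the function name is hypothetical, and the predecessor lists are read off the six-block flow graph B1 to B6 shown earlier in these slides (B2 and B3 branch to themselves, B5 jumps back to B2).

```python
def natural_loop(preds, a, b):
    """Return the natural loop of the back edge a -> b."""
    loop = {b}
    stack = []

    def insert(m):
        if m not in loop:
            loop.add(m)
            stack.append(m)

    insert(a)
    while stack:
        m = stack.pop()
        for p in preds[m]:
            insert(p)
    return loop


# Predecessors in the six-block flow graph (an interpretation of the figure).
quicksort_preds = {
    "B1": [],
    "B2": ["B1", "B2", "B5"],
    "B3": ["B2", "B3"],
    "B4": ["B3"],
    "B5": ["B4"],
    "B6": ["B4"],
}
outer = natural_loop(quicksort_preds, "B5", "B2")   # back edge B5 -> B2
inner = natural_loop(quicksort_preds, "B3", "B3")   # back edge B3 -> B3
```

Because the header b is placed in the loop before the search starts, the backward walk from a never passes through b.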

40 Approaches to Control Flow Analysis

Approach 1: use dominators
- discover loops
- use loops in optimization
- do iterative data flow analysis

Approach 2: use interval analysis
- analyze the overall structure of the program
- decompose it into nested regions
- the nesting structure forms a control tree

Approach 3: use structural analysis
- speeds up dataflow analysis
- also called elimination method

41
- most compilers use the first approach
- it is easy to implement and provides most of the information needed for optimization
- it is inferior to the other two approaches
- interval based approaches are faster
- the interval based approach can be used in incremental analysis
- structural analysis makes control flow transformations easy

42 Pre-Header

Many optimizations require code movement from inside a loop to just before it. A pre-header is a new, empty block placed just before the header:
- all edges from outside the loop to the header now go to the pre-header
- a new edge goes from the pre-header to the header

Figure: a loop with blocks B1, B2, header, B3, and the same loop with a pre-header.

43 Reducible Flow Graphs

A graph is reducible if applying a sequence of transformations reduces it to a single node.
A flow graph G = (N, E) is reducible if E can be partitioned into two disjoint groups E_b and E_f such that:
- (N, E_f) forms an acyclic graph in which every node can be reached from entry
- E_b are the back edges, edges whose heads dominate their tails

Figure: example of a non reducible flow graph on nodes A, B, C.

44 In a reducible graph
- all the loops are natural loops
- there are no jumps into the middle of a loop

Improper regions are multiple entry strongly connected regions.

Figure: flow graphs on B1, B2, B3 and on B1, B2, B3, B4.

45
- structured programming constructs do not produce irreducible flow graphs
- Fortran programs that build loops out of ifs and gotos can produce irreducible flow graphs
- most of the optimizations do not work for irreducible flow graphs
- node splitting is a possible solution

Figure: B1, B2, B3, B3a.

46 Node Splitting Transformation used for converting non-reducible flow graphs to reducible flow graphs. If there is a node n with k predecessors: split n into k nodes generating nodes n 1, n 2,...,n k the ith predecessor of n becomes the predecessors of n i all the successors of n become successors of all of the n i s 1 1 1,2a 1,2a,2b, a 3 2b,3 2b Sanjeev K Aggarwal Control Flow Analysis 23 of I

47 Interval Analysis divides flow graph into various regions consolidate each region into a (abstract) node resulting flow graph is an abstract flow graph result of transformations produce control tree root represents the original graph leaf of control tree are basic blocks internal nodes are abstract nodes edges represent relationship between abstract nodes Sanjeev K Aggarwal Control Flow Analysis 24 of I

48 Intervals

An interval is a natural loop plus an acyclic structure that dangles from nodes of the loop. Each interval has a header node that dominates all the nodes in the interval.
Given a flow graph with initial node n0, the interval with header n is denoted by I(n) and is defined as:
- n is in I(n)
- if all the predecessors of a node m (other than n0) are in I(n), then m is in I(n)
- no other node is in I(n)

49 Interval Partition

    construct I(n0);
    while there is a node m not yet selected but with a selected predecessor do
        construct I(m);

To construct I(n):
    I(n) := {n};
    while there exists a node m != n0 all of whose predecessors are in I(n) do
        I(n) := I(n) union {m}

Example:
    I(1) = {1, 2}
    I(3) = {3}
    I(4) = {4, 5, 6}
    I(7) = {7, 8, 9, 10}
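The partition procedure can be sketched as follows. The ten-node edge list is an assumption: the figure did not survive in this transcript, so the graph below was chosen because it reproduces the four intervals listed above. Function names are hypothetical.

```python
def interval(preds, n, n0):
    """Construct I(n): grow from the header n by absorbing nodes whose
    predecessors all lie inside the interval (never the initial node n0)."""
    result = {n}
    changed = True
    while changed:
        changed = False
        for m in sorted(preds):
            if (m != n0 and m not in result and preds[m]
                    and all(p in result for p in preds[m])):
                result.add(m)
                changed = True
    return result


def interval_partition(preds, n0):
    """Return a dict mapping each interval header to its interval."""
    intervals = {n0: interval(preds, n0, n0)}
    selected = set(intervals[n0])
    while True:
        # next header: not yet selected, but with a selected predecessor
        headers = [m for m in sorted(preds)
                   if m not in selected and any(p in selected for p in preds[m])]
        if not headers:
            return intervals
        intervals[headers[0]] = interval(preds, headers[0], n0)
        selected |= intervals[headers[0]]


# Assumed flow graph (predecessor lists), initial node 1.
assumed_preds = {
    1: [9], 2: [1], 3: [1, 2, 4, 8], 4: [3, 7], 5: [4],
    6: [4], 7: [5, 6, 10], 8: [7], 9: [8], 10: [8],
}
partition = interval_partition(assumed_preds, 1)
```

Note the role of the test m != n0: node 1 has its only predecessor (node 9) inside I(7), yet it must stay the header of its own interval.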

50 Interval Graphs From the interval graph construct a new graph I(G) nodes of the new graph correspond to interval partitions. initial node is the node containing initial node of G. there is an edge from node I to node J if there is some edge from element of I to the header of J. Interval partition can be repeatedly applied to the new Interval graph. Sanjeev K Aggarwal Control Flow Analysis 27 of I

51 Figure: successive interval graphs, ending in the limit flow graph:
    {1,2}, {3}, {4,5,6}, {7,8,9,10}
    {1,2}, {3}, {4,...,10}
    {1,2}, {3,...,10}
    {1,...,10}

If the limit flow graph is a single node then the graph is reducible.

52 Control Tree

The result of applying a sequence of such transformations is a control tree. It is defined as follows:
- the root is an abstract graph representing the original flowgraph
- the leaves are individual basic blocks
- the internal nodes are abstract nodes representing regions of the flowgraph
- the edges of the tree represent the relationship between each abstract node and the regions that are its descendants

53 B1 B1 B2 B2, B4 B3 B4 B3 B5, B6 B5 B1 B7 B6 B2, B4 B1... B7 B7 B3, B5, B6 B7 Sanjeev K Aggarwal Control Flow Analysis 30 of I

54 { B1, { B2, B4 }, { B3, { B5, B6 } }, B7 } B1 { B2, B4 } { B3, { B5, B6 } } B7 B2 B4 B3 { B5, B6 } B5 B6 Sanjeev K Aggarwal Control Flow Analysis 31 of I

55 T1 - T2 Analysis

- the T1 transformation collapses a node with a self loop into a single node
- the T2 transformation collapses a sequence of two nodes into one if the second node has only one predecessor

56 B1 B1a B1a T2 B1b T1 B2 T2 B3a B3b B3 B4 B1b B1a B3b B1 B2 B3a B3 B4 Sanjeev K Aggarwal Control Flow Analysis 33 of I

57 Structural Analysis

- it is a more refined form of interval analysis
- it uses a syntax directed method of dataflow analysis
- for each structure in the source it gives a formula
- it is more efficient than the iterative method
- it has a construct for each type of region
- the control tree is larger than the one generated by interval analysis
- each region is simple and small
- every region has exactly one entry point

58 Some types of acyclic regions used in structural analysis B1 B1 B1 B2 B2 B2 B3.... if - then if - then - else Bn B0 block B1 B2 B3.... Bn schema case/switch schema Sanjeev K Aggarwal Control Flow Analysis 35 of I

59 Some types of cyclic regions used in structural analysis B1 B1 self loop B2 while loop B1 B2 B1 natural loop B2 B3 improper interval Sanjeev K Aggarwal Control Flow Analysis 36 of I

60 An acyclic region that does not fit any of the simple categories and so is identified as a proper interval B1 B2 B3 B4 B5 B6 Sanjeev K Aggarwal Control Flow Analysis 37 of I

61 entry Structural Analysis of a flow graph entry Department of Computer Science and Engineering, IIT Kanpur, INDIA entry B1 B1 B1a B2 B2a B3 B3 B3 B4 B5 B4 B5a B4 B5b B6 exit B7 B7 exit exit entry B1a entrya B3a exit Sanjeev K Aggarwal Control Flow Analysis 38 of I

62 Control tree of the flow graph analyzed in the previous slide entrya entry B1a B3a exit B1 B2a B3 B4 B5b B2 B5a B7 B5 B6 Sanjeev K Aggarwal Control Flow Analysis 39 of I

63 B1 Department of Computer Science and Engineering, IIT Kanpur, INDIA B2 B3 B4 B5 B6 B1 B2 B3 B4 B5 Improper Intervals Sanjeev K Aggarwal Control Flow Analysis 40 of I

64 Dataflow Analysis Sanjeev K Aggarwal Data Flow Analysis 1 of I

65 Dataflow analysis

- provides information about how a program segment manipulates data
- the analysis must be conservative (it must never justify a wrong transformation) and, within that limit, as aggressive as possible
- collects information for optimization:
    - reaching definition
    - available expression
    - live variable
    - busy expression

66
reaching definition : a definition d reaches a point p if there is a path from d to p and d is not killed on the path

available expression : an expression X+Y is available at point p if every path to p evaluates X+Y and, after the last such evaluation, there is no assignment to X or Y

live variable : for a variable X and point p, whether the value of X at p can be used along some path starting from p. If yes, X is live at p; else X is dead at p

busy expression : an expression B op C is busy at point p if along every path from p we come to a computation of B op C before any definition of B or C

67 Typical equation

    $out(S) = gen(S) \cup (in(S) - kill(S))$

    gen : definitions generated
    kill : definitions killed
    in : input definitions
    out : output definitions

Reaching definitions
- unambiguous definitions : assignments
- ambiguous definitions :
    - procedure call with X as var parameter
    - procedure that can access X
    - assignment through a pointer, *q = y

68 Analysis of structured programs

S is a single definition d: a = b + c
    $gen(S) = \{d\}$
    $kill(S) = D_a - \{d\}$
    $out(S) = gen(S) \cup (in(S) - kill(S))$

69 S is the sequence S1 ; S2
    $gen(S) = gen(S_2) \cup (gen(S_1) - kill(S_2))$
    $kill(S) = kill(S_2) \cup (kill(S_1) - gen(S_2))$
    $in(S_1) = in(S)$
    $in(S_2) = out(S_1)$
    $out(S) = out(S_2)$

70 S is a choice between S1 and S2
    $gen(S) = gen(S_1) \cup gen(S_2)$
    $kill(S) = kill(S_1) \cap kill(S_2)$
    $out(S) = out(S_1) \cup out(S_2)$

71 S is a loop with body S1
    $gen(S) = gen(S_1)$
    $kill(S) = kill(S_1)$
    $out(S) = out(S_1)$
    $in(S_1) = in(S) \cup gen(S_1)$

72 Assumptions : All paths in the flow graph are possible.
For a choice between S1 and S2 on condition E, suppose E is always true, so that control never goes to S2. Then in fact
    $gen(S) = gen(S_1)$
    $kill(S) = kill(S_1)$
    $out(S) = out(S_1)$

73 Therefore
    true $gen(S) \subseteq gen(S)$
    true $kill(S) \supseteq kill(S)$
- "true" is what is actually computed during execution
- therefore this is a safe estimate: it may prevent an optimization, but it causes no wrong optimization

74 Reaching definition analysis

    $in(B) = \bigcup_{P \in pred(B)} out(P)$
    $out(B) = gen(B) \cup (in(B) - kill(B))$

A definition d reaches the end of a block iff either
- it is generated in the block, or
- it reaches the beginning of the block and is not killed in it

kill and gen are known for each block. A program with N blocks has 2N equations with 2N unknowns, and therefore a solution is possible.
- use the iterative forward bit vector approach

75
    for each block B do
        in(B) = φ;
        out(B) = gen(B)
    endfor;
    change = true;
    while change do
        change = false;
        for each block B do
            newin = union of out(P) over all predecessors P of B;
            if newin != in(B) then {
                change = true;
                in(B) = newin;
                out(B) = (in(B) - kill(B)) union gen(B);
            }
        endfor
    endwhile
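A minimal Python sketch of this iteration, using sets in place of bit vectors, is shown below. The gen and kill sets are the ones tabulated for the five-block example in these slides; the predecessor lists are an assumption, inferred from how the in sets evolve in the pass-by-pass table, since the figure itself is not recoverable. The function name is hypothetical.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Iterative forward analysis; returns the (in, out) dictionaries."""
    in_ = {b: set() for b in blocks}
    out = {b: set(gen[b]) for b in blocks}
    change = True
    while change:
        change = False
        for b in blocks:
            # union of out(P) over all predecessors P of b
            newin = set().union(*(out[p] for p in preds[b]))
            if newin != in_[b]:
                change = True
                in_[b] = newin
                out[b] = (in_[b] - kill[b]) | gen[b]
    return in_, out


blocks = ["B1", "B2", "B3", "B4", "B5"]
gen = {"B1": {"d1", "d2"}, "B2": {"d3"}, "B3": {"d4"}, "B4": {"d5"}, "B5": set()}
kill = {"B1": {"d3", "d4", "d5"}, "B2": {"d1"}, "B3": {"d2", "d5"},
        "B4": {"d2", "d4"}, "B5": set()}
# Assumed predecessor lists (inferred, not given explicitly in the slides).
preds = {"B1": ["B2"], "B2": ["B1", "B5"], "B3": ["B2"],
         "B4": ["B3"], "B5": ["B3", "B4"]}
rd_in, rd_out = reaching_definitions(blocks, preds, gen, kill)
```

Under these assumptions the fixed point agrees with the last column of the table, for example in(B3) = {d2, d3, d4, d5} and out(B4) = {d3, d5}.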

76
    B1:  d1: I := 2
         d2: J := I + 1
    B2:  d3: I := 1
    B3:  d4: J := J + 1
    B4:  d5: J := J - 4
    B5:  (no definitions)

    block   gen       kill
    B1      d1 d2     d3 d4 d5
    B2      d3        d1
    B3      d4        d2 d5
    B4      d5        d2 d4
    B5      φ         φ

77
    block   init          pass1                 pass2                            pass3
            in   out      in        out         in              out              in              out
    B1      φ    d1d2     d3        d1d2        d2d3            d1d2             d2d3d4d5        d1d2
    B2      φ    d3       d1d2      d2d3        d1d2d3d4d5      d2d3d4d5         d1d2d3d4d5      d2d3d4d5
    B3      φ    d4       d2d3      d3d4        d2d3d4d5        d3d4             d2d3d4d5        d3d4
    B4      φ    d5       d3d4      d3d5        d3d4            d3d5             d3d4            d3d5
    B5      φ    φ        d3d4d5    d3d4d5      d3d4d5          d3d4d5           d3d4d5          d3d4d5

    ud(I, d2) = {d1}
    ud(J, d4) = {d2, d4, d5}
    ud(J, d5) = {d4}
    ud(I, B5) = {d3}

78 Constant Folding

    while changes occur do
        for all the stmts S of the program do
            for each operand B of S do
                if there is a unique definition of B that reaches S and is a constant C
                then replace B by C in S;
                if all the operands of S are constant
                then replace rhs by eval(rhs);
            endfor
        endfor
    endwhile
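The idea can be illustrated on straight-line code, where the unique reaching definition of an operand is simply its most recent assignment. The Python sketch below makes that simplifying assumption (a single basic block, no ud chains); the tuple encoding of statements, the name fold_constants and the sample statements are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(stmts):
    """stmts: list of (dest, op, left, right); operands are ints or names.
    Returns the statements with constants propagated and folded."""
    known = {}                               # variable -> constant reaching here
    folded = []
    for dest, op, left, right in stmts:
        left = known.get(left, left)         # replace operand by its constant
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            value = OPS[op](left, right)     # replace rhs by eval(rhs)
            known[dest] = value
            folded.append((dest, "const", value, None))
        else:
            known.pop(dest, None)            # dest is no longer a known constant
            folded.append((dest, op, left, right))
    return folded


sample = [("x", "*", 2, 4), ("y", "*", "x", 2), ("z", "+", "y", "n")]
result = fold_constants(sample)
```

Here x folds to 8, y to 16, and z keeps its addition because n is not a known constant. A full implementation would consult reaching definitions across blocks, as the algorithm above requires.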

79
                smallest solution     largest solution
    Forward     reaching def          available expression
    Backward    live variable         busy expression

transfer function: $out = (in - kill) \cup gen$
- in forward dataflow, out(B) is a transfer function of in(B)
- in backward dataflow, in(B) is a transfer function of out(B)

80 A lattice L consists of a set of values meet operator join operator Sanjeev K Aggarwal Data Flow Analysis 17 of I

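81 Properties of Lattices

- closure: $\forall x, y \in L\ \exists w, z \in L$ such that $x \sqcap y = w$ and $x \sqcup y = z$
- commutative: $\forall x, y \in L$, $x \sqcap y = y \sqcap x$ and $x \sqcup y = y \sqcup x$
- associativity: $\forall x, y, z \in L$, $(x \sqcap y) \sqcap z = x \sqcap (y \sqcap z)$ and $(x \sqcup y) \sqcup z = x \sqcup (y \sqcup z)$
- distributive: $\forall x, y, z \in L$, $(x \sqcap y) \sqcup z = (x \sqcup z) \sqcap (y \sqcup z)$ and $(x \sqcup y) \sqcap z = (x \sqcap z) \sqcup (y \sqcap z)$
- it has two unique elements, top $\top$ and bottom $\bot$, such that $\forall x \in L$, $x \sqcap \bot = \bot$ and $x \sqcup \top = \top$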

82 Forward Reaching Definition Analysis

- elements are bit vectors
- meet is bitwise and
- join is bitwise or
- bottom is $0^n$ and top is $1^n$
- $BV^n$ denotes the lattice of bit vectors of n bits; for example, $BV^3$ has the eight elements from 000 (bottom) to 111 (top)

83 Example entry 1: recieve m 2: f0 = 0 3: f1= 1 m <= 1 return m 4: i = 2 i <= m exit return f2 5: f2 = f0 + f1 6: f0 = f1 7: f1 = f2 8: i = i + 1 Sanjeev K Aggarwal Data Flow Analysis 20 of I

84 A function f mapping a lattice to itself, $f : L \to L$, is monotonic if for all x, y: $x \sqsubseteq y \Rightarrow f(x) \sqsubseteq f(y)$.
For example, $f : BV^3 \to BV^3$ defined as $f(x_1, x_2, x_3) = (x_1, 1, x_3)$ is monotonic.

Height: the length of the longest strictly ascending chain, that is, the largest n such that there exist $x_1, x_2, \ldots, x_n$ with $\bot = x_1 \sqsubset x_2 \sqsubset \cdots \sqsubset x_n = \top$.

Monotonicity and finite height ensure that the data-flow algorithms implementing function f terminate.

A flow function maps lattice to lattice. The flow function for B1 is given as
    $F_{B1}(\langle x_1 x_2 \ldots x_8 \rangle) = \langle 1 1 1 x_4 x_5 0 0 x_8 \rangle$
Let $F_B()$ be the flow function representing flow through block B, and let $F_p$ represent the composition of the flow functions encountered in following the path p = B1, ..., Bn. Then
    $F_p = F_{Bn} \circ \cdots \circ F_{B1}$

85 Flow functions for the flow-graph in the example

    $F_{entry} = id$
    $F_{B1}(\langle x_1 x_2 \ldots x_8 \rangle) = \langle 1 1 1 x_4 x_5 0 0 x_8 \rangle$
    $F_{B2} = id$
    $F_{B3}(\langle x_1 x_2 \ldots x_8 \rangle) = \langle x_1 x_2 x_3 1 x_5 x_6 x_7 0 \rangle$
    $F_{B4} = id$
    $F_{B5} = id$
    $F_{B6}(\langle x_1 x_2 \ldots x_8 \rangle) = \langle x_1 0 0 0 1 1 1 1 \rangle$

Iterative (forward) Data-flow Analysis
Compute in(B) and out(B) in L for each block B of the flow graph:
    in(B) = init, for B = entry
    in(B) = the confluence of out(P) over all P in pred(B), otherwise
    $out(B) = F_B(in(B))$

86 Control Tree based Data-Flow Analysis these methods are known as elimination methods applies to both intervals and structures significantly harder to implement - requiring node splitting, iteration, solving of data flow problems over improper regions can be easily adapted to incremental updating of data flow information it makes two passes over the control tree first pass (bottom up) constructs flow functions second pass (top down) constructs and evaluates data-flow equations for each region Sanjeev K Aggarwal Data Flow Analysis 23 of I

87 If-then Construct

Figure: the if node, with branch functions F_if/Y (to then) and F_if/N (bypassing then), is reduced together with the then node (F_then) to a single if-then node with function F_if-then.

First pass
    $F_{if\text{-}then} = (F_{then} \circ F_{if/Y}) \sqcap F_{if/N}$
If we do not distinguish between $F_{if/Y}$ and $F_{if/N}$ then
    $F_{if\text{-}then} = (F_{then} \circ F_{if}) \sqcap F_{if} = (F_{then} \sqcap id) \circ F_{if}$

88 Second pass
    in(if) = in(if-then)
    $in(then) = F_{if/Y}(in(if))$
If we do not distinguish between the true and the false parts
    in(if) = in(if-then)
    $in(then) = F_{if}(in(if))$

89 If-then-else Construct

Figure: the if node (F_if/Y, F_if/N), the then node (F_then) and the else node (F_else) are reduced to a single if-then-else node with function F_if-then-else.

First pass
    $F_{if\text{-}then\text{-}else} = (F_{then} \circ F_{if/Y}) \sqcap (F_{else} \circ F_{if/N})$
Second pass
    in(if) = in(if-then-else)
    $in(then) = F_{if/Y}(in(if))$
    $in(else) = F_{if/N}(in(if))$

90 While Loop

Figure: the while node (F_while/Y, F_while/N) and the body node (F_body) are reduced to a single while-loop node with function F_while-loop.

First pass
    $F_{loop} = (F_{body} \circ F_{while/Y})^*$
    $F_{while\text{-}loop} = F_{while/N} \circ F_{loop}$
Second pass
    $in(while) = F_{loop}(in(while\text{-}loop))$
    $in(body) = F_{while/Y}(in(while))$

91 If we do not distinguish between $F_{while/Y}$ and $F_{while/N}$ then
    $F_{loop} = (F_{body} \circ F_{while})^*$
    $F_{while\text{-}loop} = F_{while} \circ F_{loop} = F_{while} \circ (F_{body} \circ F_{while})^*$
    $in(while) = F_{loop}(in(while\text{-}loop))$
    $in(body) = F_{while}(in(while))$

92 Example Consider the flow graph in an earlier slide and its structural flow analysis entry entry B1 B1 B2 B3 while B2 B3 B4 B4a B5 B5 B6 exit exit Sanjeev K Aggarwal Data Flow Analysis 29 of I

93 entry entry block B1 if-then-else B1a block entrya B2 B3a exit exit Sanjeev K Aggarwal Data Flow Analysis 30 of I

94 First Pass
    $F_{B4a} = F_{B4} \circ (F_{B6} \circ F_{B4})^*$
    $F_{B3a} = F_{B5} \circ F_{B4a} \circ F_{B3}$
    $F_{B1a} = (F_{B2} \circ F_{B1}) \sqcap (F_{B3a} \circ F_{B1})$
    $F_{entrya} = F_{exit} \circ F_{B1a} \circ F_{entry}$

Second Pass
for the entrya block
    in(entry) = init
    $in(B1a) = F_{entry}(in(entry))$
    $in(exit) = F_{B1a}(in(B1a))$
for the if-then-else block
    in(B1) = in(B1a)
    $in(B2) = F_{B1}(in(B1))$
    $in(B3a) = F_{B1}(in(B1))$

95 for the B3a block
    in(B3) = in(B3a)
    $in(B4a) = F_{B3}(in(B3))$
    $in(B5) = F_{B4a}(in(B4a))$
for the while loop
    $in(B4) = (F_{B6} \circ F_{B4})^*(in(B4a)) = (id \sqcap (F_{B6} \circ F_{B4}))(in(B4a))$
    $in(B6) = F_{B4}(in(B4))$

Interval Analysis
Data-flow analysis on intervals is now trivial: it is identical to structural analysis, except that only three kinds of regions appear: general acyclic, proper, and improper.

96 Backward Analysis

- harder to model, as a single exit is not guaranteed in programs
- for constructs with a single exit we can turn the equations around

Figure: the if-then-else construct with backward functions B_if/Y, B_if/N, B_then, B_else, reduced to a single node with function B_if-then-else.

bottom up equation:
    $B_{if\text{-}then\text{-}else} = (B_{if/Y} \circ B_{then}) \sqcap (B_{if/N} \circ B_{else})$
top down equations:
    out(then) = out(if-then-else)
    out(else) = out(if-then-else)
    $out(if) = B_{then}(out(then)) \sqcap B_{else}(out(else))$

97 Available Expression

Used in detecting common subexpressions.

Figure: B1 (t1 = 4 * i) reaches B3 (t2 = 4 * i) directly and through B2. In the first graph the contents of B2 are unknown; in the second, B2 contains i = ... followed by t0 = 4 * i.

- the expression 4*i in B3 is a common subexpression if 4*i is available at the entry point of B3
- it will be available if i is not assigned a value in B2, or if 4*i is re-computed after i is assigned in B2

98 How to compute the set of generated expressions:
- at a point prior to the block, no expressions are available
- if at a point p the set A of expressions is available, and q is the point after p with the statement x := y+z between them, then the set of expressions available at q is obtained by
    1. adding to A the expression y+z
    2. deleting from A any expression involving x

99 Example

    Statement        Available expressions
                     none
    a := b+c
                     b+c
    b := a-d
                     a-d
    c := b+c
                     a-d
    d := a-d
                     none
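The rule "add y+z, then delete every expression involving x" can be replayed on this example with a short Python sketch. The tuple encoding of statements and the function name are hypothetical.

```python
def available_after_each(stmts):
    """stmts: list of (x, y, op, z) meaning x := y op z.
    Returns the available set before the first statement and after each one."""
    available = set()
    history = [set(available)]
    for x, y, op, z in stmts:
        available.add((y, op, z))                        # add y op z
        available = {e for e in available
                     if x not in (e[0], e[2])}           # delete expressions involving x
        history.append(set(available))
    return history


example = [
    ("a", "b", "+", "c"),
    ("b", "a", "-", "d"),
    ("c", "b", "+", "c"),
    ("d", "a", "-", "d"),
]
history = available_after_each(example)
```

The order of the two steps matters: in c := b+c the expression b+c is first added and then deleted again, because it involves c.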

100
- U is the universal set of all expressions appearing in a program
- in[B] and out[B] are the sets of expressions available at the beginning/end of B
- e-gen[B] and e-kill[B] are the sets of expressions generated and killed in B

    $out[B] = (in[B] - e\text{-}kill[B]) \cup e\text{-}gen[B]$
    $in[B] = \bigcap_{P \in pred(B)} out[P]$ for B not initial
    $in[B_0] = \phi$ where $B_0$ is the initial block

initialization
    $in[B_0] = \phi$
    $out[B_0] = e\text{-}gen[B_0]$
    $out[B] = U - e\text{-}kill[B]$ if B is not the initial block

101 Live Variable Analysis

Used for dead code elimination.
- in[B] and out[B] are the sets of live variables at the entry and exit of B
- def[B] is the set of variables assigned a value in B prior to any use in B
- use[B] is the set of variables whose value may be used in B before any definition in B

    $in[B] = use[B] \cup (out[B] - def[B])$
    $out[B] = \bigcup_{S \in succ(B)} in[S]$

- a variable is live coming into a block if EITHER it is used in the block before re-definition OR it is live coming out of the block and is not re-defined in it
- a variable is live coming out of a block if it is live coming into one of its successors

Initialization: in[B] = φ for all B
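These equations can be solved by the same kind of iteration as reaching definitions, run backward. The Python sketch below is illustrative: the function name and the three-block example (an initialization block, a summing loop, and a return block) are hypothetical, not taken from the slides.

```python
def live_variables(blocks, succs, use, defs):
    """Iterative backward analysis; returns the (in, out) dictionaries."""
    in_ = {b: set() for b in blocks}          # in[B] = empty set for all B
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):            # visit successors first
            out[b] = set().union(*(in_[s] for s in succs[b]))
            newin = use[b] | (out[b] - defs[b])
            if newin != in_[b]:
                in_[b] = newin
                changed = True
    return in_, out


# Hypothetical program:
#   B1: i := 0
#   B2: s := s + a; i := i + 1; if i < n goto B2
#   B3: return s
blocks = ["B1", "B2", "B3"]
succs = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}
use = {"B1": set(), "B2": {"s", "a", "i", "n"}, "B3": {"s"}}
defs = {"B1": {"i"}, "B2": set(), "B3": set()}
live_in, live_out = live_variables(blocks, succs, use, defs)
```

In the example, i is not live on entry to B1 because B1 defines it before any use, while s, a and n must already hold values when the program starts.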

102 Very Busy Expression

- in[B] and out[B] are the sets of very busy expressions (VBE) at the beginning and end of B
- use[B] is the set of expressions b+c computed in B with no prior definition of b or c in B
- def[B] is the set of expressions b+c for which either b or c is defined in block B prior to any computation of b+c

    $in[B] = (out[B] - def[B]) \cup use[B]$
    $out[B] = \bigcap_{S \in succ(B)} in[S]$

- an expression is VBE coming into a block if either it is used in B, or it is VBE coming out of B and not defined in B
- an expression is VBE coming out of a block if it is VBE going into all the successors of B

Initialization: in[B] = U for all B

103 Common Sub-expression Elimination
For every statement s of the form x=y+z such that y+z is available at the beginning of the block containing s, and y and z are not re-defined in that block prior to s:
1. find all the evaluations of y+z that reach the block containing s
2. create a new variable u
3. replace each statement w=y+z found in (1) by u=y+z; w=u
4. replace statement s by x=u

104
    Original        After CSE       After copy      After CSE       After dead code
                                    propagation                     elimination
    t2 = 4 * i      u = 4 * i       u = 4 * i       u = 4 * i       u = 4 * i
    t3 = a[t2]      t2 = u          t2 = u          t2 = u          t3 = a[u]
    t6 = 4 * i      t3 = a[t2]      t3 = a[u]       t3 = a[u]       t7 = t3
    t7 = a[t6]      t6 = u          t6 = u          t6 = u
                    t7 = a[t6]      t7 = a[u]       t7 = t3

105 Copy Propagation
An assignment s: x=y may be eliminated if y is substituted for x at all the places where x is used
    statement s must be the only definition of x reaching the point where the substitution is made
    on every path from s to that point there are no assignments to y (additional data flow analysis is needed to establish this)
Algorithm: for each copy s: x=y do the following:
1. determine those uses of x that are reached by this definition
2. for each such use, determine whether s is the only definition of x reaching it and whether there is no definition of y on any path from s to it
3. if s meets the above conditions, then remove s and replace all the uses of x found in (1) by y

106 Loop Invariant Computations
If, for an assignment x=y+z, all the definitions of y and z are outside the loop, then x=y+z is invariant in the loop.
Input: a loop L consisting of basic blocks. Assume that ud chains are available for the individual statements.
1. mark as invariant those statements whose operands are all either constants or have all their reaching definitions outside L
2. repeat step (3) until no new statements are marked invariant
3. mark as invariant those statements whose operands are each either constant, have all their reaching definitions outside L, or have exactly one reaching definition, and that definition is a statement in L already marked invariant
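The marking steps above amount to a small fixed-point computation over the ud chains. The following Python sketch assumes a hypothetical encoding (statement ids mapped to operand lists, and a `ud_chain` dictionary keyed by statement and variable); none of these names come from the slides.

```python
def mark_invariant(loop_stmts, ud_chain):
    """loop_stmts: id -> list of operands (an int is a constant, a str a variable).
    ud_chain: (stmt id, variable) -> set of ids of the definitions reaching that use.
    Any definition id that is not a key of loop_stmts is taken to lie outside the loop.
    """
    invariant = set()
    changed = True
    while changed:                      # repeat until no new statement is marked
        changed = False
        for sid, operands in loop_stmts.items():
            if sid in invariant:
                continue

            def operand_ok(v):
                if isinstance(v, int):                           # constant operand
                    return True
                defs = ud_chain[(sid, v)]
                if all(d not in loop_stmts for d in defs):       # all definitions outside L
                    return True
                # exactly one reaching definition, already marked invariant
                return len(defs) == 1 and next(iter(defs)) in invariant

            if all(operand_ok(v) for v in operands):
                invariant.add(sid)
                changed = True
    return invariant


# Hypothetical loop body: s1: t = n * 4   s2: u = t + 1   s3: i = i + 1   s4: v = u + i
loop_stmts = {"s1": ["n", 4], "s2": ["t", 1], "s3": ["i", 1], "s4": ["u", "i"]}
ud_chain = {
    ("s1", "n"): {"d0"},          # n is defined only outside the loop
    ("s2", "t"): {"s1"},
    ("s3", "i"): {"d1", "s3"},    # i has one definition outside and one inside
    ("s4", "u"): {"s2"},
    ("s4", "i"): {"s3"},
}
print(sorted(mark_invariant(loop_stmts, ud_chain)))
```

Here s1 is marked in the first pass and s2 only once s1 is known to be invariant; s3 and s4 depend on the induction variable i and stay unmarked.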

107 Performing Code Motion
Move an invariant statement s: x=y+z to the pre-header if the following conditions are met:
1. the block containing s dominates all exit nodes of the loop
2. there is no other statement in the loop that assigns to x
3. no use of x in the loop is reached by any definition of x other than s
Maintaining dataflow information:
1. ud chains: not changed by code motion
2. dominator information: changed by code motion; it needs to be recomputed
More general code motion: if the three conditions are not satisfied, then for a loop invariant statement A=B+C define T=B+C in the pre-header and replace A=B+C by A=T

108 Example of condition 1: illegal code motion
[Figure: two flow graphs.
 Before: B1: i = 1; B2: if u<v goto B3; B3: i = 2; u = u + 1; B4: v = v-1; if v<=20 goto B5; B5: j = i.
 After: i = 2 has been moved out of B3 into a new block B6; B3 contains only u = u + 1.]

109 Examples of condition 2 and condition 3
[Figure: two flow graphs.
 Condition 2: B1: i = 1; B2: i = 3; if u<v goto B3; B3: i = 2; u = u + 1; B4: v = v-1; if v<=20 goto B5; B5: j = i.
 Condition 3: B1: i = 1; B2: if u<v goto B3; B3: i = 2; u = u + 1; B4: k = i; v = v - 1; if v<=20 goto B5; B5: j = i.]

110 Elimination of Induction Variables
A variable X is called an induction variable of a loop if in every iteration the value of X changes by a constant amount.
basic induction variable: one whose only assignments in the loop have the form i = i ± c
secondary induction variable: a simple (linear) function of a basic induction variable

111 Detection of Induction Variables
Input: a loop L with reaching definition information and loop invariant computation information
Output: a set of induction variables. Associated with each induction variable j is a triple (i, c, d) such that j = c*i + d; i is assumed to be a basic induction variable, and j is said to belong to the family of i.
1. find all the basic induction variables of L (using the loop invariant information). Each basic induction variable i has the triple (i, 1, 0).
2. search for a variable k with a single assignment to k within L having one of the following forms: k = j*const, k = j/const, k = j ± const, where j is an induction variable.

112 1. if j is a basic induction variable, then k is in the family of j.
   if j is not basic and is in the family of i, then additionally:
       there is no assignment to i between the assignment to j and the assignment to k
       no definition of j outside L reaches k
2. modify the instructions computing induction variables so that additions and subtractions (±) are used rather than multiplications (strength reduction).

113 Strength Reduction
Consider each basic induction variable i in turn. For every induction variable j in the family of i with triple (i, c, d):
1. create a new variable s
2. replace all assignments to j by j=s
3. immediately after each assignment i=i+n, append s=s+c*n; place s in the family of i with triple (i, c, d)
4. initialize s to c*i+d in the pre-header
Then eliminate induction variables.
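A before-and-after pair makes the four steps concrete. This is a hand-applied sketch in Python, with the loop, the triple (i, 4, 3) and the increment n = 2 all chosen purely for illustration.

```python
def original(n):
    out = []
    i = 0
    while i < n:
        j = 4 * i + 3        # j is in the family of i with triple (i, 4, 3)
        out.append(j)
        i = i + 2
    return out


def reduced(n):
    out = []
    i = 0
    s = 4 * i + 3            # step 4: initialize s to c*i + d in the pre-header
    while i < n:
        j = s                # step 2: the assignment to j becomes j = s
        out.append(j)
        i = i + 2
        s = s + 4 * 2        # step 3: s = s + c*n right after i = i + n
    return out


print(original(10))
print(reduced(10))
```

The multiplication inside the loop is replaced by one addition per iteration; c*n is a compile-time constant here, so the loop body no longer multiplies at all.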

114 Pointers
    A := B + C
    *P := D
    F := *P
    E := B + C
There are no direct definitions of B or C. Whether B+C is available at E := B+C depends on whether the assignment through P changes B or C.
safe assumption: an indirect assignment can change any variable, and an indirect use can use any name
Therefore the analysis reports more live variables and reaching definitions, and fewer available expressions, than is realistic.

115 A simple pointer language
The language consists of:
    elementary data types (integers and reals) requiring one word each
    arrays of these types
    pointers, used as cursors to run through an array; a pointer p points to an element of an array
    the variables that could be used as pointers are those declared to be pointers, and temporaries that receive a value that is a pointer plus or minus a constant

116 if there is a statement s: p:=&a, then after s, p points to a
if there is a statement s: p:=q ± c, where c is an integer and p and q are pointers, then after s, p points to any array that q could point to before s
if there is a statement s: p:=q, then after s, p points to whatever q could point to before s
in[B] is a set of pairs (p,a), where p is a pointer and a is a variable that p could point to at the beginning of B
out[B] is defined in a similar manner for the end of block B

117 Transfer function
if s is p:=&a, or p:=&a ± c in the case a is an array, then
    trans_s(S) = (S - {(p,b) | any variable b}) ∪ {(p,a)}
if s is p:=q ± c for pointer q and nonzero integer c, then
    trans_s(S) = (S - {(p,b) | any variable b}) ∪ {(p,b) | (q,b) is in S and b is an array variable}
if s is p:=q, then
    trans_s(S) = (S - {(p,b) | any variable b}) ∪ {(p,b) | (q,b) is in S}
if s assigns any other expression to pointer p, then
    trans_s(S) = S - {(p,b) | any variable b}
if s is not an assignment to a pointer, then
    trans_s(S) = S

118 We can relate in, out and the transfer function as follows:
    in[B] = ∪ out[P] over all predecessors P of B
    out[B] = trans_B(in[B])
Consider the following control flow graph:
    B1: q := &c
    B2: p := &c; q := &(a[2])
    B3: p := &(a[0])
    B4: p := p + 1
    B5: p := q
[Figure: flow graph with edges B1→B2, B1→B3, B2→B4, B3→B4, B4→B5 and B5→B4, as used in the equations on the next slide]

119 Suppose a is an array, c is an integer, and p and q are pointers. Initially in[B1] = φ.
    out[B1] = trans_B1(φ) = {(q,c)}
    in[B2] = out[B1]
    out[B2] = trans_B2({(q,c)}) = {(p,c),(q,a)}
    in[B3] = out[B1]
    out[B3] = trans_B3({(q,c)}) = {(p,a),(q,c)}
    in[B4] = out[B2] ∪ out[B3] ∪ out[B5]
    in[B4] = {(p,a),(p,c),(q,a),(q,c)}
    out[B4] = trans_B4(in[B4]) = {(p,a),(q,a),(q,c)}
    in[B5] = out[B4]
    out[B5] = {(p,a),(p,c),(q,a),(q,c)}
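The worked example can be replayed with a few lines of code. The sketch below encodes three of the transfer-function cases (address-of, pointer plus constant, pointer copy) and iterates to a fixed point; the statement tuples, the names `trans_stmt`, `trans_block` and `solve`, and the `PREDS` table are assumptions made for illustration.

```python
ARRAYS = {"a"}          # a is an array; c is a plain integer


def trans_stmt(stmt, S):
    """Transfer function for one pointer assignment; S is a set of (pointer, variable) pairs."""
    kind, p, arg = stmt
    kept = {(x, b) for (x, b) in S if x != p}        # S - {(p,b) | any variable b}
    if kind == "addr":                               # p := &a  (or &a ± c when a is an array)
        return kept | {(p, arg)}
    if kind == "offset":                             # p := q ± c with c nonzero
        return kept | {(p, b) for (q, b) in S if q == arg and b in ARRAYS}
    if kind == "copy":                               # p := q
        return kept | {(p, b) for (q, b) in S if q == arg}
    raise ValueError(kind)


def trans_block(stmts, S):
    for stmt in stmts:
        S = trans_stmt(stmt, S)
    return S


BLOCKS = {
    "B1": [("addr", "q", "c")],
    "B2": [("addr", "p", "c"), ("addr", "q", "a")],
    "B3": [("addr", "p", "a")],
    "B4": [("offset", "p", "p")],
    "B5": [("copy", "p", "q")],
}
PREDS = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3", "B5"], "B5": ["B4"]}


def solve():
    out = {b: set() for b in BLOCKS}
    changed = True
    while changed:
        changed = False
        for b in BLOCKS:
            in_b = set().union(*(out[p] for p in PREDS[b]))   # in[B] = union of out[P]
            new_out = trans_block(BLOCKS[b], in_b)            # out[B] = trans_B(in[B])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return out


for name, pairs in solve().items():
    print(name, sorted(pairs))
```

Note that for p := p + 1 the new pairs are computed from the incoming set S before the old pairs for p are discarded, which is why (p,a) survives in out[B4] while (p,c) does not.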

120 Interprocedural dataflow analysis
Aliases: two variables are aliases if they denote the same memory location
    s1: a := b+x
    s2: y := c
    s3: d := b+x
Is b+x available at s3? Yes, provided x and y are not aliases.
Assumed language:
    permits recursive procedures
    procedures may refer to both global and local definitions
    the data variables of a procedure consist of the globals and its own locals (no block structuring)
    parameters are passed by reference
    each procedure has a single return node

121 Alias computation
1. Rename variables so that no two procedures use the same formal parameters or local identifiers
2. If there is a procedure P(X1, ..., Xn) and an invocation P(Y1, ..., Yn), set Xi ≡ Yi for each i
3. Take the symmetric and transitive closure by adding
       X ≡ Y whenever Y ≡ X
       X ≡ Z whenever X ≡ Y and Y ≡ Z

122 Example
    global g,h

    zero();
        local i;
        g := one(h, i);          h ≡ w, i ≡ x
    end zero;

    one(w, x)
        x := two(w, w);          w ≡ y, w ≡ z
        two(g, x);               g ≡ y, x ≡ z
    end one;

    two(y, z)
        local k;
        h := one(k, y)           k ≡ w, y ≡ x
    end two;

Therefore, h ≡ w ≡ y ≡ z ≡ k ≡ x ≡ i ≡ g
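Closing the formal/actual pairs under symmetry and transitivity yields equivalence classes, so a union-find structure is one convenient way to compute them. The sketch below is an illustration under that assumption; the helper name `alias_classes` is hypothetical, and the pairs are the ones read off the example above.

```python
def alias_classes(pairs):
    """Group names into classes under the symmetric, transitive closure of `pairs`."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]    # path halving
            v = parent[v]
        return v

    for x, y in pairs:
        parent[find(x)] = find(y)            # merge the two classes

    classes = {}
    for v in list(parent):
        classes.setdefault(find(v), set()).add(v)
    return list(classes.values())


# (formal, actual) pairs from the calls in zero, one and two
pairs = [("w", "h"), ("x", "i"),      # g := one(h, i)
         ("y", "w"), ("z", "w"),      # x := two(w, w)
         ("y", "g"), ("z", "x"),      # two(g, x)
         ("w", "k"), ("x", "y")]      # h := one(k, y)

print(alias_classes(pairs))
```

For this program every name ends up in a single class, which is the conservative conclusion stated on the slide.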

123 Data flow analysis in presence of procedure calls
change[p]: the set of global variables and formal parameters of p that might be changed during an execution of p
def[p]: the set of formal parameters and global variables having explicit definitions within p
    A = { a | a is a global variable or formal parameter of p such that, for some procedure q and integer i, p calls q with a as the i-th actual parameter and the i-th formal parameter of q is in change[q] }
    G = { g | g is a global in change[q] and p calls q }
    change[p] = def[p] ∪ A ∪ G
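Because change[p] is defined in terms of change[q] for the procedures q that p calls, recursion forces an iterative solution. The following Python sketch computes the least fixed point of the equation above; the function `compute_change` and its dictionary arguments are assumptions, the test program is modeled on the three procedures zero, one and two of the earlier alias example, and aliasing is deliberately ignored.

```python
def compute_change(defs, calls, formals, globals_):
    """defs:    proc -> set of globals/formals explicitly defined in it (def[p])
    calls:   proc -> list of (callee, [actual parameter names])
    formals: proc -> list of formal parameter names
    """
    change = {p: set(d) for p, d in defs.items()}      # start from def[p]
    changed = True
    while changed:
        changed = False
        for p in defs:
            visible = globals_ | set(formals[p])        # globals and formals of p
            for q, actuals in calls.get(p, []):
                # set A: actuals of p bound to formals of q that q may change
                new = {a for a, f in zip(actuals, formals[q])
                       if f in change[q] and a in visible}
                # set G: globals that q may change
                new |= change[q] & globals_
                if not new <= change[p]:
                    change[p] |= new
                    changed = True
    return change


globals_ = {"g", "h"}
formals = {"zero": [], "one": ["w", "x"], "two": ["y", "z"]}
defs = {"zero": {"g"}, "one": {"x"}, "two": {"h"}}
calls = {
    "zero": [("one", ["h", "i"])],
    "one": [("two", ["w", "w"]), ("two", ["g", "x"])],
    "two": [("one", ["k", "y"])],
}
print(compute_change(defs, calls, formals, globals_))
```

Locals such as i and k never enter a change set, since change[p] only tracks globals and formals of p.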

124 Data Dependence Analysis Sanjeev K Aggarwal Data Dependence Analysis 1 of I

125 Used for instruction scheduling
Used for data cache optimization
Data dependence analysis determines ordering relationships; a dependence between two statements constrains their execution order
Control dependence: arises from control flow
    S1: a = b+c
    S2: if a > 10 goto L1
    S3: d = b*e
    S4: e = d+1
    S5: L1: d = e/2
S3 and S4 are executed only if the condition in S2 is false
Therefore, S2 δ^c S3 and S2 δ^c S4

126 Data dependence arises from the flow of data between two statements
The compiler must analyze programs to find the constraints that prevent the reordering of operations.
Consider:
    A = 0          (1)
    B = A          (2)
    C = A + D      (3)
    D = 2          (4)
Moving (2) above (1): the value of A used in (2) changes
Moving (4) above (3): results in the wrong value of D in (3)

127 Three types of constraints:
Flow or true dependence: a variable is assigned (defined) in one statement and used in a subsequently executed statement
Anti dependence: a variable is used in one statement and reassigned in a subsequently executed statement
Output dependence: a variable is assigned in one statement and reassigned in a subsequently executed statement
Anti dependence and output dependence arise from the reuse of variables and are also called false dependences.
Flow dependence is inherent in the computation and cannot be eliminated by renaming. Therefore it is also called true dependence.

128 Data Dependence Graph
Data structure used to depict the dependences between statements.
Each statement is represented by a node in the graph
Nodes are connected by directed edges
1. When S2 is flow dependent on S1, this is denoted by S1 δ^f S2 or S1 δ S2 and represented by an edge from S1 to S2
2. When there is an anti dependence from S1 to S2, this is denoted by S1 δ̄ S2 or S1 δ^a S2 and represented by an edge from S1 to S2
3. When there is an output dependence from S1 to S2, this is denoted by S1 δ^o S2 and represented by an edge from S1 to S2

129 [Figure: data dependence graph with nodes S1, S2, S3 and S4]

130 Approaches to data dependence relations:
Address based: dependences between statements which use the same address
Value based: dependences between statements which use the same value
Consider:
    A = 0          (1)
    B = A          (2)
    A = B + 1      (3)
    C = A          (4)
Address-based approach: there is a flow dependence between S1 and S4 (S1 → S4) because S4 uses A
Value-based approach: in (4) the value of A used is the one defined in (3) and not the one defined in (1); thus, there is no data dependence
Value-based dependence is a subset of address-based dependence.

131 For address-based dependence:
    out(S1) ∩ in(S2) ≠ φ  ⇒  S1 δ^f S2
    in(S1) ∩ out(S2) ≠ φ  ⇒  S1 δ^a S2
    out(S1) ∩ out(S2) ≠ φ  ⇒  S1 δ^o S2
If in(S1) ∩ in(S2) ≠ φ then? This is written as S1 δ^i S2 (input dependence) and is used for cache optimizations
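The four set intersections translate directly into code. This is a minimal sketch with an assumed encoding (each statement as a dictionary of `in` and `out` name sets); the statements are the four assignments A = 0, B = A, C = A + D, D = 2 used in the earlier reordering example.

```python
def dependences(s1, s2):
    """Address-based dependences from s1 to s2, where s1 executes before s2."""
    kinds = []
    if s1["out"] & s2["in"]:
        kinds.append("flow")      # s1 writes what s2 reads
    if s1["in"] & s2["out"]:
        kinds.append("anti")      # s1 reads what s2 overwrites
    if s1["out"] & s2["out"]:
        kinds.append("output")    # both write the same location
    if s1["in"] & s2["in"]:
        kinds.append("input")     # both read the same location
    return kinds


S = [
    {"in": set(),      "out": {"A"}},   # (1) A = 0
    {"in": {"A"},      "out": {"B"}},   # (2) B = A
    {"in": {"A", "D"}, "out": {"C"}},   # (3) C = A + D
    {"in": set(),      "out": {"D"}},   # (4) D = 2
]

for i in range(len(S)):
    for j in range(i + 1, len(S)):
        print(f"S{i + 1} -> S{j + 1}:", dependences(S[i], S[j]))
```

The flow dependence from (1) to (2) and the anti dependence from (3) to (4) are exactly the constraints that forbid the two reorderings discussed earlier.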

132 Basic Block Dependence
construct the dependence graph for the instructions of a basic block
I1 and I2 may have a flow, anti or output dependence
it may not be possible to determine whether I1 can be moved beyond I2
suppose an instruction reads from [r11](4) and the next instruction writes to [r](4); unless we know that r11 and r point to different locations, we must assume an anti dependence (a read followed by a write)
I1 is a predecessor of I2 if I2 must not execute until I1 has executed for some number of cycles

133 Latency: the delay required between the initiation times of I1 and I2, minus the execution time required by I1 before another instruction can start.
For example, if two cycles must elapse between I1 and I2 then the latency is 1.
    r2 ← [r1](4)
    r3 ← [r1+4](4)
    r4 ← r2 + r3
    r5 ← r2 - 1
assume a load has a latency of 1 and requires 2 cycles to finish

134
    r3 ← [r15](4)
    r4 ← [r15 + 4](4)
    r2 ← r3 - r4
    r5 ← [r12](4)
    r12 ← r
    r6 ← r3 * r5
    [r15 + 4](4) ← r3
    r5 ← r

135 Resource vector: for an instruction, an array of the sets of resources it requires in each cycle.
For the MIPS R4000:
    A: mantissa add
    E: exception test
    M: multiplier first stage
    N: multiplier second stage
    R: adder round
    S: operand shift
    U: unpack

    add.s   U | S,A | A,R | R,S
    mul.s   U | M | M | M | N | N,A | R

Starting an add.s in the 4th cycle of a mul.s results in conflicts in the 6th and 7th cycles of the mul.s: in the 6th cycle both need A, and in the 7th both need R.
Competition by two or more instructions for a resource at the same time is called a structural hazard.
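The conflict claimed above can be verified by overlaying the two resource vectors. The sketch below is illustrative; the list-of-sets encoding and the function name `conflicts` are assumptions, while the vectors themselves are the ones given for add.s and mul.s.

```python
ADD_S = [{"U"}, {"S", "A"}, {"A", "R"}, {"R", "S"}]
MUL_S = [{"U"}, {"M"}, {"M"}, {"M"}, {"N"}, {"N", "A"}, {"R"}]


def conflicts(first, second, start):
    """Structural hazards when `second` is issued in cycle `start` (1-based) of `first`.

    Returns a dict: cycle of `first` -> resources needed by both instructions.
    """
    found = {}
    for k, needs in enumerate(second):
        cycle = start + k                         # cycle of `first` that overlaps
        if cycle <= len(first):
            shared = first[cycle - 1] & needs
            if shared:
                found[cycle] = shared
    return found


print(conflicts(MUL_S, ADD_S, 4))
```

A scheduler could call such a check for each candidate issue cycle and pick the earliest one with no shared resources; issuing the add.s in cycle 5 of the mul.s, for example, leaves only a cycle-7 overlap to test.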

136 Data Dependence in Loops
Each statement is executed many times
dependence can flow from one statement to any other
dependence can flow to the same statement
    for i = 2, 9 do
        x(i) = y(i) + z(i)       S1
        a(i) = x(i-1) + 1        S2
    endfor
    S1 → S2

137
    do i = 1, N
        a(i) = b(i)
        c(i) = a(i) + b(i)
        e(i) = c(i+1)
    enddo
Unroll the loop:
[Figure: dependence graph of the unrolled loop, with one instance of S1, S2 and S3 per iteration]

138
    do I = 1, N
        A = B(I)
        C(I) = A + B(I)
        E(I) = C(I+1)
    enddo
[Figure: dependence graph with instances of S1, S2 and S3 for successive iterations]
Each node represents many instances of the same statement
data dependence relations may be annotated with information about the relative iterations in which the related instances occur

139 A data dependence is loop independent if the dependence is between instances in the same iteration
Consider a loop
    do i = 1, N
        X( f(i) ) = ...      S1
        ... = X( g(i) )      S2
    enddo
there is a loop independent dependence from S1 to S2 if there is an integer i such that 1 ≤ i ≤ N and f(i) = g(i)
OR, equivalently, there is an iteration in which S1 writes into X and S2 reads from the same element of X.

140 A data dependence is loop dependent if the dependence is between different iterations.
there is a loop dependent dependence from S1 to S2 if there exist integers i1 and i2 such that 1 ≤ i1 < i2 ≤ N and f(i1) = g(i2)
OR, equivalently, S1 writes into X in iteration i1 and S2 reads from the same location in a later iteration i2.
Therefore, to find a data dependence from S1 to S2, one has to solve f(i1) = g(i2) such that 1 ≤ i1 ≤ i2 ≤ N holds.

142 Example:
    do I = 1, 100
        X( 2I+1 ) = ...      S1
        ... = X( 2I+4 )      S2
    enddo
Is there a dependence from S1 to S2?
Coarse grain analysis: S1 writes into X and S2 reads from X. Therefore, S1 → S2 or S1 δ^f S2
Fine grain analysis: the equation 2*i1 + 1 = 2*i2 + 4 has no integer solution (the left side is odd, the right side is even). Therefore, there is no dependence from S1 to S2

144 Example:
    DO I = 1, 50
        X(I) = ...           S1
        ... = X(I+50)        S2
    ENDDO
Fine grain analysis: the equation i1 = i2 + 50 has integer solutions.
However, there is no integer solution in the range 1 ≤ i1 ≤ i2 ≤ 50; therefore, there is no dependence from S1 to S2

145 If f and g are general functions, then the problem is intractable.
If f and g are linear functions of the loop index, then to test dependence we need to find two integers i1 and i2 such that
    1 ≤ i1 ≤ i2 ≤ N and a0 + a1*i1 = b0 + b1*i2
which can be rewritten as
    1 ≤ i1 ≤ i2 ≤ N
    a1*i1 - b1*i2 = b0 - a0
These are called linear Diophantine equations.
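For linear subscripts the two preceding examples can be replayed in code. The sketch below combines a divisibility (GCD) check, which decides whether the unbounded equation has any integer solution, with a brute-force search over the loop bounds. The GCD test is a standard technique that the slides do not name explicitly, and the function names are assumptions; a production compiler would use an exact bounded test rather than enumeration.

```python
from math import gcd


def gcd_test(a0, a1, b0, b1):
    """False means a1*i1 - b1*i2 = b0 - a0 has no integer solution at all."""
    g = gcd(a1, b1)
    if g == 0:                      # both subscripts are constants
        return a0 == b0
    return (b0 - a0) % g == 0


def has_dependence(a0, a1, b0, b1, n):
    """True if some 1 <= i1 <= i2 <= n satisfies a0 + a1*i1 == b0 + b1*i2.

    S1 writes X(a0 + a1*i) and S2 reads X(b0 + b1*i).
    """
    if not gcd_test(a0, a1, b0, b1):
        return False                # ruled out without looking at the bounds
    return any(a0 + a1 * i1 == b0 + b1 * i2
               for i1 in range(1, n + 1)
               for i2 in range(i1, n + 1))


# X(2I+1) = ... ; ... = X(2I+4), I = 1..100
print(gcd_test(1, 2, 4, 2), has_dependence(1, 2, 4, 2, 100))
# X(I) = ... ; ... = X(I+50), I = 1..50
print(gcd_test(0, 1, 50, 1), has_dependence(0, 1, 50, 1, 50))
```

The first example fails the divisibility check outright, while the second passes it and is only ruled out by the loop bounds, matching the fine grain analyses on the slides.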


Model Checking & Program Analysis

Model Checking & Program Analysis Model Checking & Program Analysis Markus Müller-Olm Dortmund University Overview Introduction Model Checking Flow Analysis Some Links between MC and FA Conclusion Apology for not giving proper credit to

More information

Announcements PA2 due Friday Midterm is Wednesday next week, in class, one week from today

Announcements PA2 due Friday Midterm is Wednesday next week, in class, one week from today Loop Transformations Announcements PA2 due Friday Midterm is Wednesday next week, in class, one week from today Today Recall stencil computations Intro to loop transformations Data dependencies between

More information

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Worst-Case Execution Time Analysis. LS 12, TU Dortmund Worst-Case Execution Time Analysis Prof. Dr. Jian-Jia Chen LS 12, TU Dortmund 09/10, Jan., 2018 Prof. Dr. Jian-Jia Chen (LS 12, TU Dortmund) 1 / 43 Most Essential Assumptions for Real-Time Systems Upper

More information

Computing Static Single Assignment (SSA) Form

Computing Static Single Assignment (SSA) Form Computing Static Single Assignment (SSA) Form Overview What is SSA? Advantages of SSA over use-def chains Flavors of SSA Dominance frontiers revisited Inserting φ-nodes Renaming the variables Translating

More information

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules). Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules). 1a) G = ({R, S, T}, {0,1}, P, S) where P is: S R0R R R0R1R R1R0R T T 0T ε (S generates the first 0. R generates

More information

Program verification

Program verification Program verification Data-flow analyses, numerical domains Laure Gonnord and David Monniaux University of Lyon / LIP September 29, 2015 1 / 75 Context Program Verification / Code generation: Variables

More information

Scalar Optimisation Part 2

Scalar Optimisation Part 2 Scalar Optimisation Part 2 Michael O Boyle January 2014 1 Course Structure L1 Introduction and Recap 4-5 lectures on classical optimisation 2 lectures on scalar optimisation Last lecture on redundant expressions

More information

BASIC MATHEMATICAL TECHNIQUES

BASIC MATHEMATICAL TECHNIQUES CHAPTER 1 ASIC MATHEMATICAL TECHNIQUES 1.1 Introduction To understand automata theory, one must have a strong foundation about discrete mathematics. Discrete mathematics is a branch of mathematics dealing

More information

Goal. Partially-ordered set. Game plan 2/2/2013. Solving fixpoint equations

Goal. Partially-ordered set. Game plan 2/2/2013. Solving fixpoint equations Goal Solving fixpoint equations Many problems in programming languages can be formulated as the solution of a set of mutually recursive equations: D: set, f,g:dxd D x = f(x,y) y = g(x,y) Examples Parsing:

More information

' $ Dependence Analysis & % 1

' $ Dependence Analysis & % 1 Dependence Analysis 1 Goals - determine what operations can be done in parallel - determine whether the order of execution of operations can be altered Basic idea - determine a partial order on operations

More information

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET Regular Languages and FA A language is a set of strings over a finite alphabet Σ. All languages are finite or countably infinite. The set of all languages

More information

UNIT II REGULAR LANGUAGES

UNIT II REGULAR LANGUAGES 1 UNIT II REGULAR LANGUAGES Introduction: A regular expression is a way of describing a regular language. The various operations are closure, union and concatenation. We can also find the equivalent regular

More information

CPSC 421: Tutorial #1

CPSC 421: Tutorial #1 CPSC 421: Tutorial #1 October 14, 2016 Set Theory. 1. Let A be an arbitrary set, and let B = {x A : x / x}. That is, B contains all sets in A that do not contain themselves: For all y, ( ) y B if and only

More information

Loop Interchange. Loop Transformations. Taxonomy. do I = 1, N do J = 1, N S 1 A(I,J) = A(I-1,J) + 1 enddo enddo. Loop unrolling.

Loop Interchange. Loop Transformations. Taxonomy. do I = 1, N do J = 1, N S 1 A(I,J) = A(I-1,J) + 1 enddo enddo. Loop unrolling. Advanced Topics Which Loops are Parallel? review Optimization for parallel machines and memory hierarchies Last Time Dependence analysis Today Loop transformations An example - McKinley, Carr, Tseng loop

More information

Automata and Computability. Solutions to Exercises

Automata and Computability. Solutions to Exercises Automata and Computability Solutions to Exercises Fall 28 Alexis Maciel Department of Computer Science Clarkson University Copyright c 28 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata

More information

Models of Computation, Recall Register Machines. A register machine (sometimes abbreviated to RM) is specified by:

Models of Computation, Recall Register Machines. A register machine (sometimes abbreviated to RM) is specified by: Models of Computation, 2010 1 Definition Recall Register Machines A register machine (sometimes abbreviated M) is specified by: Slide 1 finitely many registers R 0, R 1,..., R n, each capable of storing

More information

Automata and Computability. Solutions to Exercises

Automata and Computability. Solutions to Exercises Automata and Computability Solutions to Exercises Spring 27 Alexis Maciel Department of Computer Science Clarkson University Copyright c 27 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata

More information

Chapter 5. Finite Automata

Chapter 5. Finite Automata Chapter 5 Finite Automata 5.1 Finite State Automata Capable of recognizing numerous symbol patterns, the class of regular languages Suitable for pattern-recognition type applications, such as the lexical

More information

Spring, 2010 CIS 511. Introduction to the Theory of Computation Jean Gallier. Homework 4

Spring, 2010 CIS 511. Introduction to the Theory of Computation Jean Gallier. Homework 4 Spring, 00 CIS 5 Introduction to the Theory of Computation Jean Gallier Do either Problem B or Problem B. Do Problems B3, B4, B5 and B6. B problems must be turned in. Homework 4 March 4, 00; Due March

More information

CSE 200 Lecture Notes Turing machine vs. RAM machine vs. circuits

CSE 200 Lecture Notes Turing machine vs. RAM machine vs. circuits CSE 200 Lecture Notes Turing machine vs. RAM machine vs. circuits Chris Calabro January 13, 2016 1 RAM model There are many possible, roughly equivalent RAM models. Below we will define one in the fashion

More information

Syntax Analysis Part I

Syntax Analysis Part I 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2013 2 Position of a Parser in the Compiler Model Source Program Lexical Analyzer

More information

Syntax Analysis Part I

Syntax Analysis Part I 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2013 2 Position of a Parser in the Compiler Model Source Program Lexical Analyzer

More information

CISC4090: Theory of Computation

CISC4090: Theory of Computation CISC4090: Theory of Computation Chapter 2 Context-Free Languages Courtesy of Prof. Arthur G. Werschulz Fordham University Department of Computer and Information Sciences Spring, 2014 Overview In Chapter

More information

CS6901: review of Theory of Computation and Algorithms

CS6901: review of Theory of Computation and Algorithms CS6901: review of Theory of Computation and Algorithms Any mechanically (automatically) discretely computation of problem solving contains at least three components: - problem description - computational

More information

Topics on Compilers

Topics on Compilers Assignment 2 4541.775 Topics on Compilers Sample Solution 1. Test for dependences on S. Write down the subscripts. Which positions are separable, which are coupled? Which dependence test would you apply

More information

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van ngelen, Flora State University, 2007 Position of a Parser in the Compiler Model 2 Source Program Lexical Analyzer Lexical

More information

Abstract parsing: static analysis of dynamically generated string output using LR-parsing technology

Abstract parsing: static analysis of dynamically generated string output using LR-parsing technology Abstract parsing: static analysis of dynamically generated string output using LR-parsing technology Kyung-Goo Doh 1, Hyunha Kim 1, David A. Schmidt 2 1. Hanyang University, Ansan, South Korea 2. Kansas

More information

Computational Models - Lecture 3

Computational Models - Lecture 3 Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 1 Computational Models - Lecture 3 Equivalence of regular expressions and regular languages (lukewarm leftover

More information

Jacques Fleuriot. Automated Reasoning Rewrite Rules

Jacques Fleuriot. Automated Reasoning Rewrite Rules Automated Reasoning Rewrite Rules Jacques Fleuriot Lecture 8, page 1 Term Rewriting Rewriting is a technique for replacing terms in an expression with equivalent terms useful for simplification, e.g. given

More information

CSE 355 Test 2, Fall 2016

CSE 355 Test 2, Fall 2016 CSE 355 Test 2, Fall 2016 28 October 2016, 8:35-9:25 a.m., LSA 191 Last Name SAMPLE ASU ID 1357924680 First Name(s) Ima Regrading of Midterms If you believe that your grade has not been added up correctly,

More information

Non-context-Free Languages. CS215, Lecture 5 c

Non-context-Free Languages. CS215, Lecture 5 c Non-context-Free Languages CS215, Lecture 5 c 2007 1 The Pumping Lemma Theorem. (Pumping Lemma) Let be context-free. There exists a positive integer divided into five pieces, Proof for for each, and..

More information

(a) Definition of TMs. First Problem of URMs

(a) Definition of TMs. First Problem of URMs Sec. 4: Turing Machines First Problem of URMs (a) Definition of the Turing Machine. (b) URM computable functions are Turing computable. (c) Undecidability of the Turing Halting Problem That incrementing

More information

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully EXAM Please read all instructions, including these, carefully There are 7 questions on the exam, with multiple parts. You have 3 hours to work on the exam. The exam is open book, open notes. Please write

More information

Lecture 17: Language Recognition

Lecture 17: Language Recognition Lecture 17: Language Recognition Finite State Automata Deterministic and Non-Deterministic Finite Automata Regular Expressions Push-Down Automata Turing Machines Modeling Computation When attempting to

More information

Probabilistic Model Checking and [Program] Analysis (CO469)

Probabilistic Model Checking and [Program] Analysis (CO469) Probabilistic Model Checking and [Program] Analysis (CO469) Program Analysis Herbert Wiklicky herbert@doc.ic.ac.uk Spring 208 / 64 Overview Topics we will cover in this part will include:. Language WHILE

More information

Register Allocation. Maryam Siahbani CMPT 379 4/5/2016 1

Register Allocation. Maryam Siahbani CMPT 379 4/5/2016 1 Register Allocation Maryam Siahbani CMPT 379 4/5/2016 1 Register Allocation Intermediate code uses unlimited temporaries Simplifying code generation and optimization Complicates final translation to assembly

More information

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar TAFL 1 (ECS-403) Unit- III 3.1 Definition of CFG (Context Free Grammar) and problems 3.2 Derivation 3.3 Ambiguity in Grammar 3.3.1 Inherent Ambiguity 3.3.2 Ambiguous to Unambiguous CFG 3.4 Simplification

More information

LINEAR SYSTEMS AND MATRICES

LINEAR SYSTEMS AND MATRICES CHAPTER 3 LINEAR SYSTEMS AND MATRICES SECTION 3. INTRODUCTION TO LINEAR SYSTEMS This initial section takes account of the fact that some students remember only hazily the method of elimination for and

More information

Syntax Directed Transla1on

Syntax Directed Transla1on Syntax Directed Transla1on Syntax Directed Transla1on CMPT 379: Compilers Instructor: Anoop Sarkar anoopsarkar.github.io/compilers-class Syntax directed Translation Models for translation from parse trees

More information

Lecture 4. Finite Automata and Safe State Machines (SSM) Daniel Kästner AbsInt GmbH 2012

Lecture 4. Finite Automata and Safe State Machines (SSM) Daniel Kästner AbsInt GmbH 2012 Lecture 4 Finite Automata and Safe State Machines (SSM) Daniel Kästner AbsInt GmbH 2012 Initialization Analysis 2 Is this node well initialized? node init1() returns (out: int) let out = 1 + pre( 1 ->

More information

Space-aware data flow analysis

Space-aware data flow analysis Space-aware data flow analysis C. Bernardeschi, G. Lettieri, L. Martini, P. Masci Dip. di Ingegneria dell Informazione, Università di Pisa, Via Diotisalvi 2, 56126 Pisa, Italy {cinzia,g.lettieri,luca.martini,paolo.masci}@iet.unipi.it

More information

On the Complexity of Mapping Pipelined Filtering Services on Heterogeneous Platforms

On the Complexity of Mapping Pipelined Filtering Services on Heterogeneous Platforms On the Complexity of Mapping Pipelined Filtering Services on Heterogeneous Platforms Anne Benoit, Fanny Dufossé and Yves Robert LIP, École Normale Supérieure de Lyon, France {Anne.Benoit Fanny.Dufosse

More information

Compilers. Lexical analysis. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers. Lexical analysis. Yannis Smaragdakis, U. Athens (original slides by Sam Compilers Lecture 3 Lexical analysis Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Big picture Source code Front End IR Back End Machine code Errors Front end responsibilities Check

More information

Lexical Analysis. Reinhard Wilhelm, Sebastian Hack, Mooly Sagiv Saarland University, Tel Aviv University.

Lexical Analysis. Reinhard Wilhelm, Sebastian Hack, Mooly Sagiv Saarland University, Tel Aviv University. Lexical Analysis Reinhard Wilhelm, Sebastian Hack, Mooly Sagiv Saarland University, Tel Aviv University http://compilers.cs.uni-saarland.de Compiler Construction Core Course 2017 Saarland University Today

More information

Pushdown Automata: Introduction (2)

Pushdown Automata: Introduction (2) Pushdown Automata: Introduction Pushdown automaton (PDA) M = (K, Σ, Γ,, s, A) where K is a set of states Σ is an input alphabet Γ is a set of stack symbols s K is the start state A K is a set of accepting

More information

Parallelism and Machine Models

Parallelism and Machine Models Parallelism and Machine Models Andrew D Smith University of New Brunswick, Fredericton Faculty of Computer Science Overview Part 1: The Parallel Computation Thesis Part 2: Parallelism of Arithmetic RAMs

More information

Section Summary. Relations and Functions Properties of Relations. Combining Relations

Section Summary. Relations and Functions Properties of Relations. Combining Relations Chapter 9 Chapter Summary Relations and Their Properties n-ary Relations and Their Applications (not currently included in overheads) Representing Relations Closures of Relations (not currently included

More information

Roy L. Crole. Operational Semantics Abstract Machines and Correctness. University of Leicester, UK

Roy L. Crole. Operational Semantics Abstract Machines and Correctness. University of Leicester, UK Midlands Graduate School, University of Birmingham, April 2008 1 Operational Semantics Abstract Machines and Correctness Roy L. Crole University of Leicester, UK Midlands Graduate School, University of

More information

AST rewriting. Part I: Specifying Transformations. Imperative object programs. The need for path queries. fromentry, paths, toexit.

AST rewriting. Part I: Specifying Transformations. Imperative object programs. The need for path queries. fromentry, paths, toexit. Part I: Specifying Transformations Oege de Moor Ganesh Sittampalam Programming Tools Group, Oxford focus AST rewriting To apply rule: rewrite(pat 0, pat 1 ) Match: focus = φ(pat 0 ) Replace: focus := φ(pat

More information

COMPUTER SCIENCE TRIPOS

COMPUTER SCIENCE TRIPOS CST.2016.2.1 COMPUTER SCIENCE TRIPOS Part IA Tuesday 31 May 2016 1.30 to 4.30 COMPUTER SCIENCE Paper 2 Answer one question from each of Sections A, B and C, and two questions from Section D. Submit the

More information