Dependence Analysis
- Edith Norton
- 6 years ago
Goals:
- determine what operations can be done in parallel
- determine whether the order of execution of operations can be altered

Basic idea:
- determine a partial order on operations
- two operations not ordered in the partial order => these operations can be done in parallel
- any new execution order of operations that is consistent with the partial order is legal

What should be the granularity of operations?
- depends on application
- scheduling for pipelined machines: granularity = machine instructions
- loop restructuring: granularity = loop iterations
Dependences: data (flow, anti, output, input) and control.

Flow dependence: S1 -> S2
(i) S1 executes before S2 in program order
(ii) S1 writes into a location that is read by S2

Anti-dependence: S1 -> S2
(i) S1 executes before S2
(ii) S1 reads from a location that is overwritten later by S2

Output dependence: S1 -> S2
(i) S1 executes before S2
(ii) S1 and S2 write to the same location

Input dependence: S1 -> S2
(i) S1 executes before S2
(ii) S1 and S2 both read from the same location

Example:
  S1: x := 2
  S2: y := x + 1
  S3: x := 3
  S4: y := 7
Flow: S1 -> S2; anti: S2 -> S3; output: S1 -> S3 and S2 -> S4.
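The four dependence kinds above can be checked mechanically. The sketch below is illustrative Python, not code from the lecture; the statement encoding and the function name `dependences` are invented for the example.

```python
# Classify dependences between statements of a straight-line program.
# Each statement is a pair (writes, reads) of variable-name sets;
# list position is program order.
def dependences(stmts):
    deps = []
    for i, (w1, r1) in enumerate(stmts):
        for j in range(i + 1, len(stmts)):
            w2, r2 = stmts[j]
            if w1 & r2: deps.append(("flow", i, j))    # S_i writes, S_j reads
            if r1 & w2: deps.append(("anti", i, j))    # S_i reads, S_j overwrites
            if w1 & w2: deps.append(("output", i, j))  # both write same location
            if r1 & r2: deps.append(("input", i, j))   # both read same location
    return deps

# The slide's example: S0: x := 2   S1: y := x + 1   S2: x := 3   S3: y := 7
stmts = [({"x"}, set()), ({"y"}, {"x"}), ({"x"}, set()), ({"y"}, set())]
print(dependences(stmts))
# [('flow', 0, 1), ('output', 0, 2), ('anti', 1, 2), ('output', 1, 3)]
```

Running it on the slide's four statements reports exactly the flow, anti, and output dependences listed above.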
Conservative approximation:
Real programs: imprecise information => need for safe approximation.
When you are not sure whether a dependence exists, you must assume it does.

Example:
  procedure f(X,i,j)
  begin
    X(i) = 10;
    X(j) = 5;
  end

Question: Is there an output dependence from the first assignment to the second?
Answer: If (i = j), there is a dependence; otherwise, not.
=> Unless we know from interprocedural analysis that the parameters i and j are always distinct, we must play it safe and insert the dependence.

Key notion: aliasing: two program names may refer to the same location (like X(i) and X(j)).
May-dependence vs. must-dependence: more precise analysis may eliminate may-dependences.
Loop-level analysis: granularity is a loop iteration.
Key notion: iteration space of a loop

  DO I = 1, 100
    DO J = 1, 100
      S

Each (I,J) value of the loop indices corresponds to one point in the iteration space.
Program order = lexicographic order on the iteration space:
  (1,1) (1,2) (1,3) ... (1,100) (2,1) (2,2) ... (100,100)

How do we compute dependences between loop iterations?
Dependences in loops

  DO 10 I = 1, N
    X(f(I)) = ...
10  ... = ...X(g(I))...

Conditions for flow dependence from iteration I_w to I_r:
  1 <= I_w <= I_r <= N   (write before read)
  f(I_w) = g(I_r)        (same array location)

Conditions for anti-dependence from iteration I_g to I_o:
  1 <= I_g < I_o <= N    (read before write)
  f(I_o) = g(I_g)        (same array location)

Conditions for output dependence from iteration I_w1 to I_w2:
  1 <= I_w1 < I_w2 <= N  (write in program order)
  f(I_w1) = f(I_w2)      (same array location)
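These conditions can be tested by brute force for concrete f, g, and N. A minimal sketch, assuming Python and an invented helper `loop_dependences`; this enumerates the definition of dependence, which is not the compile-time test an analyzer would use, but it makes the conditions concrete.

```python
# Brute-force dependences of "DO I = 1,N: X(f(I)) = ... X(g(I))":
# enumerate iteration pairs and test the three conditions directly.
def loop_dependences(f, g, N):
    flow = [(iw, ir) for iw in range(1, N + 1) for ir in range(iw, N + 1)
            if f(iw) == g(ir)]                  # write, then a later (or same) read
    anti = [(ig, io) for ig in range(1, N + 1) for io in range(ig + 1, N + 1)
            if f(io) == g(ig)]                  # read, then a later write
    output = [(i1, i2) for i1 in range(1, N + 1) for i2 in range(i1 + 1, N + 1)
              if f(i1) == f(i2)]                # two writes to the same location
    return flow, anti, output

# X(I+1) = ... X(I) ... : every iteration writes what the next one reads
flow, anti, output = loop_dependences(lambda i: i + 1, lambda i: i, N=5)
print(flow)    # [(1, 2), (2, 3), (3, 4), (4, 5)]
```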
Dependences in nested loops

  DO 10 I = 1, N
    DO 10 J = 1, M
      X(f(I,J),g(I,J)) = ...
10    ... = ...X(h(I,J),k(I,J))...

Recall: program order is the lexicographic order on iterations of nested loops.

Conditions for flow dependence from iteration (I_1,J_1) to (I_2,J_2):
  (1,1) <= (I_1,J_1) <= (I_2,J_2) <= (N,M)   (lexicographic order)
  f(I_1,J_1) = h(I_2,J_2)
  g(I_1,J_1) = k(I_2,J_2)

Anti and output dependences can be defined analogously.
In general, we need more information than the presence/absence of dependences.

Example: loop interchange

  DO 10 I = 1,100               DO 10 I = 1,100
    DO 10 J = 1,100        vs     DO 10 J = 1,100
10    X(I,J) = X(I-1,J+1) + 1 10    X(I,J) = X(I-1,J-1) + 1

Check: both loop nests have dependences.
Loop interchange: illegal in the first loop nest (eg: (2,1) depends on (1,2)), but legal in the second loop nest!
What additional information about dependences should we compute?
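The legality claim can be checked empirically: run each nest in the original (I,J) order and in the interchanged (J,I) order and compare the results. An illustrative Python sketch, with bounds shrunk to 10x10 and helper names invented:

```python
# Execute a 2-D loop nest body in either iteration order and return the array.
def run(update, order, n=10):
    X = [[0] * (n + 2) for _ in range(n + 2)]   # pad so I-1 and J+1 stay in range
    for a in range(1, n + 1):
        for b in range(1, n + 1):
            i, j = (a, b) if order == "IJ" else (b, a)
            update(X, i, j)
    return X

def nest1(X, i, j):
    X[i][j] = X[i - 1][j + 1] + 1   # X(I,J) = X(I-1,J+1) + 1

def nest2(X, i, j):
    X[i][j] = X[i - 1][j - 1] + 1   # X(I,J) = X(I-1,J-1) + 1

print(run(nest1, "IJ") == run(nest1, "JI"))   # False: interchange is illegal
print(run(nest2, "IJ") == run(nest2, "JI"))   # True:  interchange is legal
```

The first nest's dependence goes from (1,2) to (2,1), which the interchanged order reverses; the second nest's dependences are preserved by interchange.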
One approach: compute the full dependence relation using an IP (integer programming) calculator.

  DO 10 I = 1,100
    DO 10 J = 1,100
10    X(I,J) = X(I-1,J+1) + 1

Flow dependence constraints: (I_w,J_w) -> (I_r,J_r)
  (1,1) <= (I_w,J_w) <= (I_r,J_r) <= (100,100)
  I_w = I_r - 1
  J_w = J_r + 1

Use the IP calculator to compute the full dependence relation:
  {(1,2) -> (2,1); (1,3) -> (2,2); ... (2,2) -> (3,1); ...}

Use the full relation to check legality of loop transformations.

Problems:
(i) Expensive (time/space) to compute the full relation
(ii) Expensive to look up a big relation
(iii) Variable loop bounds?
Summarize the dependence relation: distance/direction.

Look at the dependence relation from the previous slide:
  {(1,2) -> (2,1); (1,3) -> (2,2); ... (2,2) -> (3,1); ...}

Difference between dependent iterations = (1,-1). That is, (I_w,J_w) -> (I_r,J_r) in the dependence relation implies
  I_r - I_w = 1
  J_r - J_w = -1

We will say that the distance vector is [1,-1].
Note: from the distance vector, we can easily recover the full relation. In this case, the distance vector is an exact summary of the relation.
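A quick sketch of how the distance vector summarizes this relation (illustrative Python; bounds shrunk from 100 to 10):

```python
# Enumerate the flow dependence relation of X(I,J) = X(I-1,J+1) + 1
# and collect the set of iteration differences.
N = 10
relation = [((iw, jw), (ir, jr))
            for iw in range(1, N + 1) for jw in range(1, N + 1)
            for ir in range(1, N + 1) for jr in range(1, N + 1)
            if (iw, jw) < (ir, jr)                # lexicographic program order
            and iw == ir - 1 and jw == jr + 1]    # same array location

distances = {(ir - iw, jr - jw) for ((iw, jw), (ir, jr)) in relation}
print(distances)   # {(1, -1)}: one distance vector summarizes the whole relation
```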
Summaries of dependence relations are usually approximate.

Example:
  DO 10 I = 1,100
10  X(2I+1) = X(I) + 1

Dependence equation: 2I_w + 1 = I_r.
Flow relation: {(1 -> 3); (2 -> 5); (3 -> 7); ...}   (1)
No fixed distance between dependent iterations!
Since all distances are +ve, use a direction vector instead: direction = [+].
In general, direction = [+] or [0] or [-]. Also written by some authors as (<), (=), or (>).
Direction vectors are not exact. (eg) if we try to recover the dependence relation from direction [+], we get a bigger relation than (1):
  {(1 -> 2); (1 -> 3); ...; (1 -> 100); (2 -> 3); (2 -> 4); ...}
Intuition: [+] direction = all distances in the range [1, infinity) etc.
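The same kind of enumeration shows why no single distance works for this example and why [+] is the right summary (illustrative Python):

```python
# Flow relation of X(2I+1) = X(I) + 1 on DO I = 1,100.
N = 100
relation = [(iw, ir) for iw in range(1, N + 1) for ir in range(1, N + 1)
            if 2 * iw + 1 == ir]                 # dependence equation
distances = sorted(ir - iw for iw, ir in relation)
print(distances[:4])                             # [2, 3, 4, 5]: no single distance
direction = "+" if all(d > 0 for d in distances) else "*"
print(direction)                                 # +
```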
Directions for nested loops

If the loop nest is (I,J) and (I_1,J_1) -> (I_2,J_2) is in the dependence relation, then
  Distance  = (I_2 - I_1, J_2 - J_1)
  Direction = (sign(I_2 - I_1), sign(J_2 - J_1))

Figure 1: Legal directions for nested loop (I,J): (<,<), (<,=), (<,>), (=,<), (=,=)
How to compute directions: use the IP engine

  DO 10 I = 1, 100
    X(f(I)) = ...
10  ... = ...X(g(I))...

Focus on flow dependences:
  f(I_w) = g(I_r)
  1 <= I_w <= I_r <= 100

First, use the inequalities shown above to test if a dependence exists in any direction (called the [*] direction).
If the IP engine says there are no solutions, there is no dependence. Otherwise, determine the direction(s) of dependence:
  Test for direction [<]: add the inequality I_w < I_r
  Test for direction [=]: add the inequality I_w = I_r
In a single loop, direction [>] cannot occur.
Computing directions: nested loops

Same idea as single loop: hierarchical testing. Refine direction vectors top down.

Figure 2: Hierarchical testing for a nested loop
  (*,*)
    (<,*) -> (<,<) (<,=) (<,>)
    (=,*) -> (=,<) (=,=) (=,>)
    (>,*)  [illegal direction]

Key ideas:
(1) No dependence in the (*,*) direction => no need to do more tests.
(2) Do not test for impossible directions like (>,*).
Engineering a dependence analyzer:
In principle, we can use the IP engine to compute all directions.
Reality: most subscripts and loop bounds are simple!
First check for simple cases. Call the IP engine for more complex cases.
ZIV, SIV and MIV subscripts

  DO 10 I
    DO 10 J
      DO 10 K
10      A(5,I+1,J) = ...A(N,I,K) + c

Subscripts in the 1st dimension of A do not involve loop variables
  => called Zero Index Variable (ZIV) subscripts.
Subscripts in the 2nd dimension of A involve only one loop variable (I)
  => called Single Index Variable (SIV) subscripts.
Subscripts in the 3rd dimension of A involve many loop variables (J,K)
  => called Multiple Index Variable (MIV) subscripts.
Separable SIV subscripts

  DO 10 I
    DO 10 J
      DO 10 K
10      A(I,J,J) = ...A(I,J,K) + c

Subscripts in both the first and second dimensions are SIV.
However, the index variable in the first subscript (I) does not appear in any other dimension
  => separable SIV subscript.
The second subscript is also SIV, but its index variable J appears in the 3rd dimension as well
  => coupled SIV subscript.
Significance of separability: break the problem into smaller pieces.

  DO 10 I
    DO 10 J
      DO 10 K
10      A(I,J,J) = ...A(I,J,K) + c

Equations for flow dependence:
  I_w = I_r
  J_w = J_r
  J_w = K_r

If the bounds on I are independent of J and K (as here), the first equation can be solved separately from the other two: the 1st component of direction vectors can be computed independently of the 2nd and 3rd components.
In benchmarks, 80% of subscripts are separable!
Separable SIV subscripts: simple, precise tests exist.

  DO 10 J
    DO 10 I
      DO 10 K
10      X(aI+b,..,..) = ..X(cI+d,..,..)..

Equation for flow dependence: a*I_w + b = c*I_r + d.

Strong SIV subscript: a = c
  => I_r - I_w = (b - d)/a
If a divides (b - d), and the quotient is within the loop bounds of I, there is a dependence, and we have the Ith component of the direction/distance vector. Otherwise, no dependence exists!
In benchmarks, roughly 37% of subscripts are strong SIV!
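A sketch of the strong SIV test in Python. The function name and the exact bounds check are assumptions; a full implementation would also record the sign of the distance to distinguish flow from anti directions.

```python
# Strong SIV test (a == c): solve a*Iw + b = a*Ir + d for the dependence
# distance, then check divisibility and the loop bounds.
def strong_siv(a, b, d, lo, hi):
    """Subscripts X(a*I+b) and X(a*I+d) on DO I = lo,hi.
    Returns the distance Ir - Iw, or None if no dependence."""
    if (b - d) % a != 0:
        return None                  # a does not divide (b - d): no dependence
    dist = (b - d) // a              # Ir - Iw = (b - d)/a
    if abs(dist) > hi - lo:
        return None                  # quotient outside the loop bounds
    return dist

# X(2I+4) = ... X(2I) ... on DO I = 1,100: distance (4 - 0)/2 = 2
print(strong_siv(2, 4, 0, 1, 100))   # 2
print(strong_siv(2, 3, 0, 1, 100))   # None: 2 does not divide 3
```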
Weak SIV subscript: either a or c is 0.

Another important case:
  DO 10 I
10  X(aI+b,..,..) = ..X(cI+d,..,..)..

Say c is 0 => I_w = (d - b)/a and I_r > I_w.
If a divides (d - b), and the quotient is within the loop bounds, then a dependence exists with all iterations beyond I_w.

Important loop transformation: index-set splitting.
It may be worth eliminating the dependence by performing iterations 1..((d - b)/a) - 1 in one loop, iteration (d - b)/a by itself, and then the remaining iterations in another loop.
General SIV test

Equation: a*I_w + b = c*I_r + d   (1)

We can use column operations to reduce to echelon form etc. But usually, a and c are small integers (magnitude < 5). Exploit this.

Build a table indexed by (a,c) pairs for a and c between 1 and 5.
Two entries in each table position:
(i) gcd(a,c)
(ii) one solution (I_w,I_r) = (s,t) to the eqn a*I_w + c*I_r = gcd(a,c)

Given Equation (1), if a and c are between 1 and 5:
(i) if gcd(a,c) does not divide (d - b), no solution
(ii) otherwise, one solution is (s,t) * (d - b)/gcd(a,c)
(iii) general solution: (I_w,I_r) = n * (c,-a)/gcd(c,a) + (s,t) * (d - b)/gcd(a,c)   (n is a parameter)

Case when a or c in Equation (1) are -ve: minor modification of this procedure.
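Rather than the precomputed (a,c) table, the same computation can be sketched with the extended Euclidean algorithm, writing equation (1) as a*I_w - c*I_r = d - b. This is illustrative Python with invented names; a real test would also intersect the general solution family with the loop bounds.

```python
# Extended Euclid: returns (g, x, y) with a*x + b*y = g.
def ext_gcd(a, b):
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def general_siv(a, b, c, d):
    """One integer solution (Iw, Ir) of a*Iw + b = c*Ir + d, or None.
    Adding multiples of (c, a)/gcd(a, c) gives the general solution."""
    g, s, t = ext_gcd(a, -c)           # g = a*s + (-c)*t
    if (d - b) % g != 0:
        return None                    # gcd does not divide (d - b): no solution
    k = (d - b) // g
    return s * k, t * k                # scaled particular solution

iw, ir = general_siv(2, 0, 3, 1)       # 2*Iw = 3*Ir + 1
print(2 * iw == 3 * ir + 1)            # True
```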
Linearization: reduce the number of equations.
Not as important nowadays, but widely used earlier.

  DO 10 I = ...
    DO 10 J = ...
      X(f(I,J),g(I,J)) = ...
10    ... = ...X(h(I,J),k(I,J))...

Equations for flow dependence:
  f(I_w,J_w) = h(I_r,J_r)
  g(I_w,J_w) = k(I_r,J_r)

We could convert the system to a single equation (any constant C):
  C*f(I_w,J_w) + g(I_w,J_w) = C*h(I_r,J_r) + k(I_r,J_r)

Any solution of the simultaneous system is a solution of the single equation. But the single equation may have solutions even if the system does not
  => the single equation is a conservative approximation to the system.
Can we choose C so we do not lose precision?
If the range of the functions g and k is less than some u, choose C = u.

Common case: we know the size of array X (say (1..50,1..100)). So g and k values are < 101. Convert
  f(I_w,J_w) = h(I_r,J_r)
  g(I_w,J_w) = k(I_r,J_r)
to
  101*f(I_w,J_w) + g(I_w,J_w) = 101*h(I_r,J_r) + k(I_r,J_r)
w/o loss of precision. Easy to verify that all solutions to the single equation are also solutions to the system.

Problem: we may not know the array size at compile time, eg. for array parameters, dynamically allocated arrays.
Banerjee's test

First dependence test to take inequalities into account.

  DO 10 I = 1, 100
    X(2I+3) = ...
10  ... = ... X(I+7)

Flow dependence constraints:
  2I_w + 3 = I_r + 7
  1 <= I_w <= I_r <= 100

Equation: write as h(I_w,I_r) = 2I_w - I_r - 4 = 0.
Inequalities: bounds on a convex region R.
Solving the constraints is equivalent to the following question: does h become 0 at an integer point within region R?
Banerjee's test: checks whether h has a zero in region R.
Does not check that the zero is reached at an integer point.

Theorem (intermediate value theorem): A continuous function f(x,y) has a zero in a convex, connected region R iff
  max_R f(x,y) >= 0 and min_R f(x,y) <= 0.

Special case: f is linear (or affine) and region R is defined by linear inequalities. In this case, Dantzig's theorem: f obtains its minimum and maximum values at the corners of region R.

Intuitive idea of Banerjee's test:
(i) Find the corners of region R.
(ii) Evaluate f at the corners and find the maximum and minimum.
(iii) If max >= 0 and min <= 0, then f has a zero in R.
Example: f(x) = 3x - 2, R = [a,b]
  a < b => 3a < 3b => 3a - 2 < 3b - 2
  "Corners" are a and b.
  So minimum value = 3a - 2; maximum value = 3b - 2.

Notation:
  x+ := x if x >= 0; 0 otherwise
  x- := -x if x <= 0; 0 otherwise

Let f(x) = c*x + d and R = [x_min, x_max].
Using these functions, we can compute the max and min easily:
  max_R f = c+ * x_max - c- * x_min + d
  min_R f = -c- * x_max + c+ * x_min + d

What about higher dimensions?
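The x+/x- notation and the 1-D max/min formulas translate directly (illustrative Python; the helper names are invented):

```python
# Positive and negative parts, both non-negative, with x = pos(x) - neg(x).
pos = lambda x: x if x >= 0 else 0        # x+
neg = lambda x: -x if x <= 0 else 0       # x-

def affine_max(c, d, x_min, x_max):       # max of c*x + d over [x_min, x_max]
    return pos(c) * x_max - neg(c) * x_min + d

def affine_min(c, d, x_min, x_max):       # min of c*x + d over [x_min, x_max]
    return -neg(c) * x_max + pos(c) * x_min + d

# f(x) = 3x - 2 on [a,b] = [1,10]: min = 3*1 - 2 = 1, max = 3*10 - 2 = 28
print(affine_min(3, -2, 1, 10), affine_max(3, -2, 1, 10))   # 1 28
```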
Function: F(I,J). What are max_R F and min_R F?

One way: evaluate F at the corners (compute F(1,1), F(N,1) and F(N,N)) etc.
In high dimensions, it is hard to determine the corners of the region.

Important special case:
(i) variables ordered so that the bounds on variable x_i depend only on x_1, x_2, ..., x_{i-1} (eg. I bounds are constants, J bounds depend only on I)
(ii) each variable has just one linear/affine lower and upper bound

In the special case, variables can be dealt with one at a time in inner to outer order.

[Figure: triangular region R defined by 1 <= I <= N, 1 <= J <= I, with corners (1,1), (N,1), (N,N)]
Example: region R given by 1 <= I <= N, 1 <= J <= I; corners (1,1), (N,1), (N,N).
Objective function: F(I,J) = 3I - 4J + 3.

Key idea: first, maximize/minimize over J, keeping I fixed (J_min = 1, J_max = I):
  max_J F(I,J) = 3I + (-4)+ * J_max - (-4)- * J_min + 3 = 3I - 4 + 3 = 3I - 1
  min_J F(I,J) = 3I - (-4)- * J_max + (-4)+ * J_min + 3 = 3I - 4I + 3 = 3 - I

Now use the bounds on I:
  max_R F(I,J) = (3)+ * I_max - (3)- * I_min - 1 = 3N - 1
  min_R F(I,J) = -(-1)- * I_max + (-1)+ * I_min + 3 = 3 - N

Summary: for the special case, the higher-dimensional problem is solved by repeated application of the 1-D solution.
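The two elimination steps can be checked against brute-force enumeration of the triangular region (illustrative Python, with N fixed to 10):

```python
# F(I,J) = 3I - 4J + 3 over R: 1 <= I <= N, 1 <= J <= I.
pos = lambda x: max(x, 0)
neg = lambda x: max(-x, 0)
N = 10

# Eliminate J first (coefficient -4, J in [1, I], symbolically):
#   max_J F = 3I - 1     min_J F = 3 - I
# Then eliminate I over [1, N]:
max_R = pos(3) * N - neg(3) * 1 - 1        # max of 3I - 1  ->  3N - 1
min_R = -neg(-1) * N + pos(-1) * 1 + 3     # min of 3 - I   ->  3 - N
print(max_R, min_R)                        # 29 -7

# Brute-force check over the triangular region:
vals = [3 * i - 4 * j + 3 for i in range(1, N + 1) for j in range(1, i + 1)]
print(max(vals), min(vals))                # 29 -7
```

Since max_R >= 0 and min_R <= 0, Banerjee's test would (conservatively) report a possible dependence here.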
Summary of Banerjee's test:

Conditions:
(i) single equality constraint (multiple equalities: use linearization first)
(ii) variables can be ordered x_1, x_2, ... so that each upper/lower bound on x_i is an affine function of x_1, x_2, ..., x_{i-1}

Banerjee's test:
(i) Interpret the equality constraint as a function F over the domain R given by the inequalities.
(ii) Maximize/minimize F over each variable separately in inner to outer order: x_i, x_{i-1}, ..., x_1.
(iii) If max_R F >= 0 and min_R F <= 0, then F becomes 0 at some point in R.
(iv) Declare conservatively that F becomes 0 at an integer point => there exist integer solutions to the system of inequalities/equality.
Implementation notes:

(I) Check for ZIV/separable SIV first before calling the IP engine.
(II) In hierarchical testing for directions, solution of the equalities should be done only once.
(III) Output of the equality solver may be useful to determine distances and to eliminate some directions from consideration.

(eg)
  DO 10 I
    DO 10 J
10    A(J) = A(J+1) + 1

Flow dependence equation: J_w = J_r + 1 => distance(J) = -1.
Direction vector cannot be (=,>). So the only possibility is (<,>): test only for this.
(IV) Array aliasing introduces complications!

  procedure f(X,Y)
    DO I ...
      X(I) = ...
      ... = ...Y(I-1)...

If X and Y may be aliased, there are may-dependences in the loop.
FORTRAN convention: aliased parameters may not be modified in the procedure.

(V) Pointer-based data structures: require totally different technology.
(VI) Negative loop step sizes:

  DO 10 I = 10,1,-1

If we use I to index into the iteration space, dependence distances become -ve!
Solution: use trip counts (0,1,...) to index loop iterations.

  DO 10 I = l,u,s
10  X(I) = X(2I-5)...

Flow dependence from trip n_w to n_r:
  l + n_w*s = 2(l + n_r*s) - 5
Distance vector = [n_r - n_w].

Loop normalization: transform all loops so the low index is 0 and the step size is 1. We are doing it implicitly.
(VII) Imperfectly nested loops

What should the distance/direction analog be for imperfectly nested loops?

Perfectly nested loop: all statements nested within the same loops
  DO 10 I = 1,10
    DO 10 J = 1,10
10    Y(I) = Y(I) + A(I,J)*X(J)

Imperfectly nested loop: triangular solve/Cholesky/LU
  DO 10 I = 1,N
    DO 20 J = 1, I-1
20    B(I) = B(I) - L(I,J)*X(J)
10  X(I) = B(I)/L(I,I)

Distance/direction is not adequate for imperfectly nested loops.
One approach: compute distance/direction only for common loops.
Not adequate for many applications like imperfect loop interchange:

(row triangular solve)
  DO 10 I = 1,N
    DO 20 J = 1, I-1
20    B(I) = B(I) - L(I,J)*X(J)
10  X(I) = B(I)/L(I,I)

=>

(column triangular solve)
  DO 10 I = 1,N
    X(I) = B(I)/L(I,I)
    DO 20 J = I+1, N
20    B(J) = B(J) - L(I,J)*X(I)
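The two versions can be checked for equivalence on a small concrete system (illustrative Python, not from the slides; note that for a lower-triangular matrix the column-oriented update reads the entry L(J,I)):

```python
# Forward substitution for lower-triangular L: row- vs column-oriented order.
def row_solve(L, B, n):
    X = [0.0] * n
    for i in range(n):
        for j in range(i):            # B(I) = B(I) - L(I,J)*X(J)
            B[i] -= L[i][j] * X[j]
        X[i] = B[i] / L[i][i]         # X(I) = B(I)/L(I,I)
    return X

def col_solve(L, B, n):
    X = [0.0] * n
    for i in range(n):
        X[i] = B[i] / L[i][i]         # X(I) = B(I)/L(I,I)
        for j in range(i + 1, n):     # update later rows: B(J) -= L(J,I)*X(I)
            B[j] -= L[j][i] * X[i]
    return X

L = [[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [4.0, 5.0, 6.0]]
print(row_solve(L, [2.0, 4.0, 15.0], 3))   # [1.0, 1.0, 1.0]
print(col_solve(L, [2.0, 4.0, 15.0], 3))   # [1.0, 1.0, 1.0]: same answer
```

Both orders compute the same X even though the loop structures differ, which is exactly what a common-loops-only dependence summary fails to establish.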
What is a good dependence abstraction for imperfectly nested loops?
Conclusions

Traditional position: exact dependence testing (using the IP engine) is too expensive.

Recent experience:
(i) exact dependence testing is OK provided we first check for the easy cases (ZIV, strong SIV, weak SIV)
(ii) the IP engine is called for 3-4% of tests for direction vectors
(iii) cost of exact dependence testing: 3-5% of compile time
Homework #2: http://www.cs.cornell.edu/courses/cs612/2002sp/assignments/ps2/ps2.htm
Due Thursday, March 7th.
More informationThe purpose of computing is insight, not numbers. Richard Wesley Hamming
Systems of Linear Equations The purpose of computing is insight, not numbers. Richard Wesley Hamming Fall 2010 1 Topics to Be Discussed This is a long unit and will include the following important topics:
More informationSolving Linear Systems of Equations
November 6, 2013 Introduction The type of problems that we have to solve are: Solve the system: A x = B, where a 11 a 1N a 12 a 2N A =.. a 1N a NN x = x 1 x 2. x N B = b 1 b 2. b N To find A 1 (inverse
More informationAlgorithms and Their Complexity
CSCE 222 Discrete Structures for Computing David Kebo Houngninou Algorithms and Their Complexity Chapter 3 Algorithm An algorithm is a finite sequence of steps that solves a problem. Computational complexity
More informationComputational Linear Algebra
Computational Linear Algebra PD Dr. rer. nat. habil. Ralf Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2017/18 Part 2: Direct Methods PD Dr.
More informationBLAS: Basic Linear Algebra Subroutines Analysis of the Matrix-Vector-Product Analysis of Matrix-Matrix Product
Level-1 BLAS: SAXPY BLAS-Notation: S single precision (D for double, C for complex) A α scalar X vector P plus operation Y vector SAXPY: y = αx + y Vectorization of SAXPY (αx + y) by pipelining: page 8
More informationELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS. 1. Linear Equations and Matrices
ELEMENTARY LINEAR ALGEBRA WITH APPLICATIONS KOLMAN & HILL NOTES BY OTTO MUTZBAUER 11 Systems of Linear Equations 1 Linear Equations and Matrices Numbers in our context are either real numbers or complex
More informationAnalysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College
Analysis of Algorithms [Reading: CLRS 2.2, 3] Laura Toma, csci2200, Bowdoin College Why analysis? We want to predict how the algorithm will behave (e.g. running time) on arbitrary inputs, and how it will
More informationG METHOD IN ACTION: FROM EXACT SAMPLING TO APPROXIMATE ONE
G METHOD IN ACTION: FROM EXACT SAMPLING TO APPROXIMATE ONE UDREA PÄUN Communicated by Marius Iosifescu The main contribution of this work is the unication, by G method using Markov chains, therefore, a
More informationA Polynomial-Time Algorithm for Memory Space Reduction
A Polynomial-Time Algorithm for Memory Space Reduction Yonghong Song Cheng Wang Zhiyuan Li Sun Microsystems, Inc. Department of Computer Sciences 4150 Network Circle Purdue University Santa Clara, CA 95054
More informationMatrix Arithmetic. j=1
An m n matrix is an array A = Matrix Arithmetic a 11 a 12 a 1n a 21 a 22 a 2n a m1 a m2 a mn of real numbers a ij An m n matrix has m rows and n columns a ij is the entry in the i-th row and j-th column
More informationComputation of the mtx-vec product based on storage scheme on vector CPUs
BLAS: Basic Linear Algebra Subroutines BLAS: Basic Linear Algebra Subroutines BLAS: Basic Linear Algebra Subroutines Analysis of the Matrix Computation of the mtx-vec product based on storage scheme on
More informationUniversity of the Virgin Islands, St. Thomas January 14, 2015 Algorithms and Programming for High Schoolers. Lecture 5
University of the Virgin Islands, St. Thomas January 14, 2015 Algorithms and Programming for High Schoolers Numerical algorithms: Lecture 5 Today we ll cover algorithms for various numerical problems:
More informationTuring Machines, diagonalization, the halting problem, reducibility
Notes on Computer Theory Last updated: September, 015 Turing Machines, diagonalization, the halting problem, reducibility 1 Turing Machines A Turing machine is a state machine, similar to the ones we have
More informationChapter 2. Solving Systems of Equations. 2.1 Gaussian elimination
Chapter 2 Solving Systems of Equations A large number of real life applications which are resolved through mathematical modeling will end up taking the form of the following very simple looking matrix
More informationFundamentals of Engineering Analysis (650163)
Philadelphia University Faculty of Engineering Communications and Electronics Engineering Fundamentals of Engineering Analysis (6563) Part Dr. Omar R Daoud Matrices: Introduction DEFINITION A matrix is
More informationThe Gauss-Jordan Elimination Algorithm
The Gauss-Jordan Elimination Algorithm Solving Systems of Real Linear Equations A. Havens Department of Mathematics University of Massachusetts, Amherst January 24, 2018 Outline 1 Definitions Echelon Forms
More informationCS 246 Review of Linear Algebra 01/17/19
1 Linear algebra In this section we will discuss vectors and matrices. We denote the (i, j)th entry of a matrix A as A ij, and the ith entry of a vector as v i. 1.1 Vectors and vector operations A vector
More informationPhys 201. Matrices and Determinants
Phys 201 Matrices and Determinants 1 1.1 Matrices 1.2 Operations of matrices 1.3 Types of matrices 1.4 Properties of matrices 1.5 Determinants 1.6 Inverse of a 3 3 matrix 2 1.1 Matrices A 2 3 7 =! " 1
More informationIE 5531: Engineering Optimization I
IE 5531: Engineering Optimization I Lecture 5: The Simplex method, continued Prof. John Gunnar Carlsson September 22, 2010 Prof. John Gunnar Carlsson IE 5531: Engineering Optimization I September 22, 2010
More informationTime Analysis of Sorting and Searching Algorithms
Time Analysis of Sorting and Searching Algorithms CSE21 Winter 2017, Day 5 (B00), Day 3 (A00) January 20, 2017 http://vlsicad.ucsd.edu/courses/cse21-w17 Big-O : How to determine? Is f(n) big-o of g(n)?
More informationMATRICES. a m,1 a m,n A =
MATRICES Matrices are rectangular arrays of real or complex numbers With them, we define arithmetic operations that are generalizations of those for real and complex numbers The general form a matrix of
More informationOutline. Relaxation. Outline DMP204 SCHEDULING, TIMETABLING AND ROUTING. 1. Lagrangian Relaxation. Lecture 12 Single Machine Models, Column Generation
Outline DMP204 SCHEDULING, TIMETABLING AND ROUTING 1. Lagrangian Relaxation Lecture 12 Single Machine Models, Column Generation 2. Dantzig-Wolfe Decomposition Dantzig-Wolfe Decomposition Delayed Column
More informationMTH 2530: Linear Algebra. Sec Systems of Linear Equations
MTH 0 Linear Algebra Professor Chao Huang Department of Mathematics and Statistics Wright State University Week # Section.,. Sec... Systems of Linear Equations... D examples Example Consider a system of
More information9.1 - Systems of Linear Equations: Two Variables
9.1 - Systems of Linear Equations: Two Variables Recall that a system of equations consists of two or more equations each with two or more variables. A solution to a system in two variables is an ordered
More informationThe Data-Dependence graph of Adjoint Codes
The Data-Dependence graph of Adjoint Codes Laurent Hascoët INRIA Sophia-Antipolis November 19th, 2012 (INRIA Sophia-Antipolis) Data-Deps of Adjoints 19/11/2012 1 / 14 Data-Dependence graph of this talk
More informationChapter 5 Linear Programming (LP)
Chapter 5 Linear Programming (LP) General constrained optimization problem: minimize f(x) subject to x R n is called the constraint set or feasible set. any point x is called a feasible point We consider
More informationSection 5.3 Systems of Linear Equations: Determinants
Section 5. Systems of Linear Equations: Determinants In this section, we will explore another technique for solving systems called Cramer's Rule. Cramer's rule can only be used if the number of equations
More informationLoop parallelization using compiler analysis
Loop parallelization using compiler analysis Which of these loops is parallel? How can we determine this automatically using compiler analysis? Organization of a Modern Compiler Source Program Front-end
More informationLinear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space
Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................
More informationAck: 1. LD Garcia, MTH 199, Sam Houston State University 2. Linear Algebra and Its Applications - Gilbert Strang
Gaussian Elimination CS6015 : Linear Algebra Ack: 1. LD Garcia, MTH 199, Sam Houston State University 2. Linear Algebra and Its Applications - Gilbert Strang The Gaussian Elimination Method The Gaussian
More informationMath 471 (Numerical methods) Chapter 3 (second half). System of equations
Math 47 (Numerical methods) Chapter 3 (second half). System of equations Overlap 3.5 3.8 of Bradie 3.5 LU factorization w/o pivoting. Motivation: ( ) A I Gaussian Elimination (U L ) where U is upper triangular
More informationMatrix Algebra Determinant, Inverse matrix. Matrices. A. Fabretti. Mathematics 2 A.Y. 2015/2016. A. Fabretti Matrices
Matrices A. Fabretti Mathematics 2 A.Y. 2015/2016 Table of contents Matrix Algebra Determinant Inverse Matrix Introduction A matrix is a rectangular array of numbers. The size of a matrix is indicated
More informationA fast algorithm to generate necklaces with xed content
Theoretical Computer Science 301 (003) 477 489 www.elsevier.com/locate/tcs Note A fast algorithm to generate necklaces with xed content Joe Sawada 1 Department of Computer Science, University of Toronto,
More information0 0 0 # In other words it must saisfy the condition that the successive rst non-zero entries in each row are indented.
Mathematics 07 October 11, 199 Gauss elimination and row reduction I recall that a matrix is said to be in row-echelon form if it looks like this: # 0 # 0 0 0 # where the # are all non-zero and the are
More informationTopic 15 Notes Jeremy Orloff
Topic 5 Notes Jeremy Orloff 5 Transpose, Inverse, Determinant 5. Goals. Know the definition and be able to compute the inverse of any square matrix using row operations. 2. Know the properties of inverses.
More informationLoop Scheduling and Software Pipelining \course\cpeg421-08s\topic-7.ppt 1
Loop Scheduling and Software Pipelining 2008-04-24 \course\cpeg421-08s\topic-7.ppt 1 Reading List Slides: Topic 7 and 7a Other papers as assigned in class or homework: 2008-04-24 \course\cpeg421-08s\topic-7.ppt
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)
AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 19: Computing the SVD; Sparse Linear Systems Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical
More informationModels of Computation, Recall Register Machines. A register machine (sometimes abbreviated to RM) is specified by:
Models of Computation, 2010 1 Definition Recall Register Machines A register machine (sometimes abbreviated M) is specified by: Slide 1 finitely many registers R 0, R 1,..., R n, each capable of storing
More information