Common loop optimizations. Example to improve locality. Why Dependence Analysis. Data Dependence in Loops. Goal is to find best schedule:


15-745 Lecture 6: Data Dependence in Loops. Copyright Seth Goldstein, 2008. Based on slides from Allen & Kennedy.

Common loop optimizations:
Hoisting of loop-invariant computations: pre-compute them before entering the loop.
Elimination of induction variables: change p = i*w + b to p = b; p += w, when w and b are loop invariant (see the C sketch after this slide).
Loop unrolling: to improve scheduling of the loop body.
Software pipelining: to improve scheduling of the loop body.
Loop permutation: to improve cache memory performance.
All of these require understanding data dependences.

Why Dependence Analysis? Example to improve locality. The goal is to find the best schedule: improve memory locality, increase parallelism, and decrease scheduling stalls. Before we schedule, we need to know the possible legal schedules and the impact of a schedule on performance.

for i = 0 to N
  for j = 0 to M
    A[j] = f(A[j]);

Unroll to see the dependences:
A[0] = f(A[0])
A[1] = f(A[1])
A[2] = f(A[2])
...
A[M] = f(A[M])
A[0] = f(A[0])
...

Is there a better schedule? (Iteration space diagram omitted.)
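The induction-variable elimination mentioned in the list above, as a minimal C sketch (the surrounding loop, the array names, and the arithmetic are illustrative, not from the slides):

/* Before: p is recomputed from i with a multiply on every iteration. */
void before(double *dst, const double *src, double w, double b, int n) {
    for (int i = 0; i < n; i++) {
        double p = i * w + b;       /* induction expression */
        dst[i] = src[i] + p;
    }
}

/* After: p becomes an induction variable updated by a constant step,
 * which is legal because w and b are loop invariant. */
void after(double *dst, const double *src, double w, double b, int n) {
    double p = b;
    for (int i = 0; i < n; i++) {
        dst[i] = src[i] + p;
        p += w;                     /* strength-reduced update */
    }
}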

Example to improve locality (continued). The original loop

for i = 0 to N
  for j = 0 to M
    A[j] = f(A[j]);

can be interchanged to give a transformed iteration space:

for j = 0 to M
  for i = 0 to N
    A[j] = f(A[j]);

In the old iteration space, each inner sweep touches A[0] through A[M] before any element is revisited; in the new iteration space, each A[j] is used N+1 times in a row, which greatly improves temporal locality. (Old and new iteration space diagrams omitted.)

What about

for i = 0 to N
  for j = 0 to M
    A[j] = f(A[j]);
    B[i] = f(B[i]);

Is there a better schedule? Unroll to see the dependences:

A[0] = f(A[0]); B[0] = f(B[0])
A[1] = f(A[1]); B[0] = f(B[0])
...
A[M] = f(A[M]); B[0] = f(B[0])
A[0] = f(A[0]); B[1] = f(B[1])
...

(Iteration space diagram omitted.)
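A concrete C rendering of the interchange, as a minimal sketch (the function f and the array extents are assumptions, not from the slides):

double f(double x);   /* placeholder for the slide's f(), assumed defined elsewhere */

/* Original order: every outer iteration sweeps A[0..M], so when M is large
 * each element is evicted from the cache before it is touched again. */
void original_order(double *A, int N, int M) {
    for (int i = 0; i <= N; i++)
        for (int j = 0; j <= M; j++)
            A[j] = f(A[j]);
}

/* Interchanged order: each A[j] is updated N+1 times in a row, so it stays
 * in a register or cache line. The interchange is legal here because the
 * only dependences are between successive updates of the same A[j], and
 * their relative order is unchanged. */
void interchanged_order(double *A, int N, int M) {
    for (int j = 0; j <= M; j++)
        for (int i = 0; i <= N; i++)
            A[j] = f(A[j]);
}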

But, what if:

for i = 0 to N
  for j = 1 to M
    A[j] = f(A[j-1]);

Can we reschedule? Unroll to see the dependences:

A[1] = f(A[0])
A[2] = f(A[1])
A[3] = f(A[2])
...
A[M] = f(A[M-1])
A[1] = f(A[0])
A[2] = f(A[1])
A[3] = f(A[2])
...

Now each iteration reads the value written by the previous iteration of the inner loop, so the inner loop is a chain of dependences. (Iteration space diagram omitted.)

So, how do we know when and how? When should we transform a loop? What transforms are legal? How should we transform the loop? Dependence information helps with all three questions. In short: determine all dependence information, use the dependence information to analyze the loop, and guide transformations using the dependence information. The key is: any transformation that preserves every dependence in a program preserves the meaning of the program.

Dependences in Loops. A loop-independent data dependence occurs between accesses in the same loop iteration. A loop-carried data dependence occurs between accesses across different loop iterations. There is a data dependence between access a at iteration i-k and access b at iteration i when: a and b access the same memory location, there is a path from a to b, and either a or b is a write.

Defining Dependences. Flow dependence (a true dependence): write then read (W -> R), written δf. Anti-dependence: read then write (R -> W), written δa. Output dependence: write then write (W -> W), written δo. Anti- and output dependences are false dependences.

S1) a = 0;
S2) b = a;
S3) c = a + d + e;
S4) d = b;
S5) b = 5 + e;

Example Dependences. These are scalar dependences; the same idea holds for memory accesses.

source  type  target  due to
S1      δf    S2      a
S1      δf    S3      a
S2      δf    S4      b
S3      δa    S4      d
S4      δa    S5      b
S2      δo    S5      b

What can we do with this information? Why are anti- and output dependences called false dependences?

Data Dependence in Loops. Dependence can flow across iterations of the loop, so dependence information is annotated with iteration information. If a dependence is across iterations it is loop carried, otherwise it is loop independent.

for (i=0; i<n; i++) {
  A[i] = B[i];
  B[i+1] = A[i];
}
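To see why anti- and output dependences are "false", here is a minimal C sketch (not from the slides) of the scalar example above: these dependences come from reusing a storage location and can be removed by renaming, while flow dependences reflect a real flow of values and cannot.

/* Original code (S1-S5 from the slide). S4 -> S5 is an anti-dependence and
 * S2 -> S5 is an output dependence, both on b. */
void original(int d, int e, int *out) {
    int a, b, c;
    a = 0;          /* S1 */
    b = a;          /* S2 */
    c = a + d + e;  /* S3 */
    d = b;          /* S4 */
    b = 5 + e;      /* S5 */
    out[0] = b; out[1] = c; out[2] = d;
}

/* Renamed: giving S5 its own variable b2 removes the anti- and output
 * dependences on b, so S5 could now be scheduled earlier. The flow
 * dependences (e.g., S2 -> S4 through b) cannot be removed this way. */
void renamed(int d, int e, int *out) {
    int a, b, c, b2;
    a = 0;          /* S1 */
    b = a;          /* S2 */
    c = a + d + e;  /* S3 */
    d = b;          /* S4 */
    b2 = 5 + e;     /* S5 */
    out[0] = b2; out[1] = c; out[2] = d;
}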

Data Dependence in Loops (continued). In the loop

for (i=0; i<n; i++) {
  A[i] = B[i];
  B[i+1] = A[i];
}

the flow dependence δf from the write of B[i+1] to the read of B[i] in the next iteration is loop carried, while the flow dependence δf from the write of A[i] to the read of A[i] in the same iteration is loop independent.

Data Dependence. There is a data dependence from statement S1 to statement S2 (S2 depends on S1) if:
1. Both statements access the same memory location and at least one of them stores into it, and
2. There is a feasible run-time execution path from S1 to S2.

We need to characterize the dependence information in terms of the loop iterations involved in the dependence, so we need a way to talk about iterations of a loop.
Iteration vector: a label for a loop iteration using the induction variables.
Iteration space: the set of all possible iteration vectors for a loop.
Lexicographic order: the order of the iterations.

Iteration Space. Every iteration generates a point in an n-dimensional space, where n is the depth of the loop nest.

for (i=0; i<n; i++) {
  for (j=0; j<4; j++) {
    ...
  }
}

(Iteration space diagram omitted.)

Iteration Vectors. We need to consider the nesting level of a loop: the nesting level of a loop is one more than the number of loops that enclose it. Given a nest of n loops, the iteration vector i of a particular iteration of the innermost loop is a vector of integers containing the iteration numbers for each of the loops, in order of nesting level. Thus the iteration vector is i = (i_1, i_2, ..., i_n), where i_k, 1 <= k <= n, is the iteration number for the loop at nesting level k.

Iteration Space (continued). Every iteration generates a point in an n-dimensional space, where n is the depth of the loop nest; for the nest above, the points are the pairs (i, j) with 0 <= i < n and 0 <= j < 4. (Iteration space diagram omitted.)

Ordering of Iteration Vectors. We need an ordering for iteration vectors; we use the intuitive lexicographic order. Iteration i precedes iteration j, denoted i < j, iff i[1:k-1] = j[1:k-1] and i_k < j_k for some k, 1 <= k <= n (equivalently: i[1:n-1] < j[1:n-1], or i[1:n-1] = j[1:n-1] and i_n < j_n).

Example Iteration Space / Visitation Order in Iteration Space.

for i = 0 to N-1
  for j = 0 to N-1
    A[i][j] = B[i][j];

Each position in the diagram represents an iteration. Note: the iteration space is not the data space. (Diagrams omitted.)
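A minimal C sketch of the lexicographic test (representing an iteration vector as an int array, outermost loop first, is an assumption for illustration):

#include <stdbool.h>

/* Returns true iff iteration vector a lexicographically precedes b.
 * Both vectors have n components, outermost loop first. */
bool lex_precedes(const int *a, const int *b, int n) {
    for (int k = 0; k < n; k++) {
        if (a[k] < b[k]) return true;    /* first differing component decides */
        if (a[k] > b[k]) return false;
    }
    return false;                        /* equal vectors: a does not precede b */
}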

Formal Definition of Loop Dependence. There exists a dependence from statement S1 to statement S2 in a common nest of loops iff there exist two iteration vectors i and j for the nest such that:
(1) (a) i < j, or (b) i = j and there is a path from S1 to S2 in the body of the loop;
(2) statement S1 accesses memory location M on iteration i and statement S2 accesses location M on iteration j; and
(3) one of these accesses is a write.
Case 1a is a loop-carried dependence; case 1b is loop independent. S1 is the source of the dependence, S2 is the sink (or target) of the dependence.

Dependence Distance. Using iteration vectors and the definition of dependence we can determine the distance of a dependence. In an n-deep loop nest, if S1 is the source in iteration i and S2 is the sink in iteration j, the distance of the dependence is represented with a distance vector D of length n, where d_k = j_k - i_k.

Distance Vector.

for (i=0; i<n; i++) {
  A[i] = B[i];
  B[i+1] = A[i];
}

Unrolled:
A[0] = B[0]; B[1] = A[0];   (i=0)
A[1] = B[1]; B[2] = A[1];   (i=1)
A[2] = B[2]; B[3] = A[2];   (i=2)

The distance vector is the difference between the target and source iterations, d = I_t - I_s; it is exactly the distance of the dependence, i.e., I_s + d = I_t. Here the loop-carried dependence on B has distance vector (1).

Example of Distance Vectors.

for (i=0; i<n; i++)
  for (j=0; j<m; j++) {
    A[i,j] = ...;
    ... = A[i,j];
    B[i,j+1] = ...;
    ... = B[i,j];
    C[i+1,j] = ...;
    ... = C[i,j+1];
  }

(The grid of per-iteration accesses is omitted.)
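A minimal C sketch of the distance-vector computation d = I_t - I_s (the array representation of iteration vectors is an assumption for illustration):

/* Compute the distance vector d = target - source for an n-deep loop nest.
 * source and target are iteration vectors, outermost loop first. */
void distance_vector(const int *source, const int *target, int *d, int n) {
    for (int k = 0; k < n; k++)
        d[k] = target[k] - source[k];   /* d_k = j_k - i_k */
}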

Example of Distance Vectors (continued). For the loop above, A yields distance vector (0, 0), B yields (0, 1), and C yields (1, -1).

Direction Vectors. Less precise than distance vectors, but often good enough. In an n-deep loop nest, if S1 is the source in iteration i and S2 is the sink in iteration j, the distance vector is the vector of length n with d_k = j_k - i_k, and the direction vector is also a vector of length n, whose k-th entry is
"<" if d_k > 0 (i.e., i_k < j_k),
"=" if d_k = 0 (i.e., i_k = j_k),
">" if d_k < 0 (i.e., i_k > j_k).

Example of Direction Vectors. For the same loop, A yields (=, =), B yields (=, <), and C yields (<, >).

Direction Vectors Example:

DO I = 1, N
  DO J = 1, M
    DO K = 1, L
S1    A(I+1, J, K-1) = A(I, J, K) + 10

S1 has a true dependence on itself. Distance vector: (1, 0, -1). Direction vector: (<, =, >).
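A small C sketch that derives a direction vector from a distance vector (the array representation is an assumption for illustration):

/* Derive a direction vector from a distance vector: '<' where the distance
 * is positive, '=' where it is zero, '>' where it is negative. */
void direction_vector(const int *d, char *dir, int n) {
    for (int k = 0; k < n; k++)
        dir[k] = (d[k] > 0) ? '<' : (d[k] == 0) ? '=' : '>';
}

For example, the dependence on array C in the two-dimensional loop above has distance (1, -1), which this mapping turns into the direction vector (<, >).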

Note on Vectors. A dependence cannot exist if it has a direction vector whose leftmost non-"=" component is not "<", as this would imply that the sink of the dependence occurs before the source. Likewise, the first non-zero entry in a distance vector must be positive.

The Key. Any reordering transformation that preserves every dependence in a program preserves the meaning of the program. A reordering transformation may change the order of execution, but it does not add or remove statements.

Main Theme: Finding Data Dependences. Determining whether dependences exist between two subscripted references to the same array in a loop nest. There are several tests to detect these dependences.
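A small C sketch of that check (a sketch, not from the slides): a computed distance vector can describe a real dependence only if it is all zero or its first non-zero entry is positive.

#include <stdbool.h>

/* Returns true iff d could be the distance vector of a real dependence:
 * either all components are zero (loop independent) or the first non-zero
 * component is positive (loop carried). Otherwise the sink would precede
 * the source, which is impossible. */
bool plausible_distance(const int *d, int n) {
    for (int k = 0; k < n; k++) {
        if (d[k] > 0) return true;    /* first non-zero entry is positive */
        if (d[k] < 0) return false;   /* first non-zero entry is negative */
    }
    return true;                      /* all zero: loop-independent dependence */
}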

The General Problem.

DO i1 = L1, U1
  DO i2 = L2, U2
    ...
      DO in = Ln, Un
S1      A(f1(i1,...,in), ..., fm(i1,...,in)) = ...
S2      ... = A(g1(i1,...,in), ..., gm(i1,...,in))

A dependence exists from S1 to S2 if there exist iteration vectors α and β such that α < β (the control-flow requirement) and f_i(α) = g_i(β) for all i, 1 <= i <= m (the common-access requirement).

Basics: Conservative Testing. Consider only linear subscript expressions. Finding integer solutions to a system of linear Diophantine equations is NP-complete. The most common approximation is conservative testing, i.e., seeing whether you can assert that no dependence exists between two subscripted references to the same array. This is never incorrect, but it may be less than optimal.

Basics: Indices and Subscripts. Index: an index variable for some loop surrounding a pair of references. Subscript: a PAIR of subscript positions in a pair of array references. For example, in A(I,j) = A(I,k) + C, <I,I> is the first subscript and <j,k> is the second subscript.

Basics: Complexity. A subscript is said to be ZIV if it contains no index (zero index variables), SIV if it contains only one index (single index variable), and MIV if it contains more than one index (multiple index variables). For example, in A(5,I+1,j) = A(1,I,k) + C, the first subscript is ZIV, the second subscript is SIV, and the third subscript is MIV.
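A minimal C sketch of that classification (representing a subscript pair by which loop indices it mentions is an assumption for illustration):

typedef enum { ZIV, SIV, MIV } subscript_class;

/* Classify a subscript pair by how many distinct loop index variables it
 * mentions. index_used[k] is nonzero if loop index k appears on either
 * side of the subscript pair; n is the depth of the loop nest. */
subscript_class classify(const int *index_used, int n) {
    int count = 0;
    for (int k = 0; k < n; k++)
        if (index_used[k]) count++;
    if (count == 0) return ZIV;    /* zero index variables  */
    if (count == 1) return SIV;    /* single index variable */
    return MIV;                    /* multiple index variables */
}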

Basics: Separability. A subscript is separable if its indices do not occur in other subscripts. If two different subscripts contain the same index, they are coupled. For example, in A(I+1,j) = A(k,j) + C both subscripts are separable, while in A(I,j,j) = A(I,j,k) + C the second and third subscripts are coupled.

Basics: Coupled Subscript Groups. Why are they important? Coupling can cause imprecision in dependence testing:

DO I = 1, 100
S1  A(I+1,I) = B(I) + C
S2  D(I) = A(I,I) * E
ENDDO

Dependence Testing: Overview. Partition the subscripts of a pair of array references into separable and coupled groups. Classify each subscript as ZIV, SIV or MIV; the reason for the classification is to reduce the complexity of the tests. For each separable subscript, apply the single-subscript test and continue until independence is proved. Then deal with the coupled groups. If independence is shown, we are done; otherwise, merge all direction vectors computed in the previous steps into a single set of direction vectors.

Step 1: Subscript Partitioning. Partition the subscripts into separable and minimal coupled groups. Notation: S is a set of m subscript pairs S1, S2, ..., Sm, each enclosed in n loops with indexes I1, I2, ..., In, which is to be partitioned into separable or minimal coupled groups; P is an output variable containing the set of partitions; n_p is the number of partitions.

Subscript Partitioning Algorithm.

procedure partition(S, P, n_p)
  n_p := m;
  for i := 1 to m do P_i := { S_i };
  for i := 1 to n do begin
    k := <none>;
    for each remaining partition P_j do
      if there exists s in P_j such that s contains I_i then
        if k = <none> then k := j
        else begin
          P_k := P_k ∪ P_j;
          discard P_j;
          n_p := n_p - 1;
        end
  end
end partition

Step 2: Classify as ZIV/SIV/MIV. An easy step: just count the number of different indices in each subscript.

Step 3: Apply Single Subscript Tests. ZIV test; SIV tests (strong SIV test, weak SIV tests: weak-zero SIV, weak-crossing SIV); SIV tests in complex iteration spaces.

ZIV Test.

DO i = 1, 100
S   A(e1) = A(e2) + B(i)
ENDDO

e1 and e2 are constants or loop-invariant symbols. If (e1 - e2) != 0, no dependence exists.
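A C sketch of the partitioning step (a minimal rendering of the pseudocode above; the bitmask encoding of which loop indices a subscript pair mentions, and the assumption n <= 32, are for illustration only):

/* Partition m subscript pairs into separable and minimal coupled groups.
 * uses[s] is a bitmask of the loop indices mentioned by subscript pair s
 * (bit i set means index I_i appears in it). group[s] receives a group id;
 * pairs that end up with the same id form one coupled group. Returns the
 * number of groups, mirroring n_p in the pseudocode. */
int partition_subscripts(const unsigned *uses, int m, int n, int *group) {
    int np = m;
    for (int s = 0; s < m; s++) group[s] = s;        /* P_s = { S_s } */
    for (int i = 0; i < n; i++) {                    /* for each loop index I_i */
        int k = -1;                                  /* k := <none> */
        for (int s = 0; s < m; s++) {
            if (!(uses[s] & (1u << i))) continue;    /* S_s does not contain I_i */
            if (k == -1) { k = group[s]; continue; } /* first group containing I_i */
            if (group[s] != k) {                     /* merge that group into P_k */
                int old = group[s];
                for (int t = 0; t < m; t++)
                    if (group[t] == old) group[t] = k;
                np--;
            }
        }
    }
    return np;
}

For A(I,j,j) = A(I,j,k), the subscript pairs mention {I}, {j}, and {j,k}, and the routine reports two groups: the first subscript alone, and the coupled pair of the second and third subscripts.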

Strong SIV Test. Strong SIV subscripts are of the form <a*i + c1, a*i' + c2> (the two coefficients are equal). For example, the following are strong SIV subscripts: <i+1, i> and <4*i+2, 4*i+4>. The test: the dependence distance is d = i' - i = (c1 - c2)/a, and a dependence exists iff d is an integer and |d| <= U - L, where L and U are the loop bounds.

Strong SIV Test Example:

DO k = 1, 100
  DO i = 1, 100
S1    A(i+1, k) = ...
S2    ... = A(i, k) + 32

Weak SIV Tests. Weak SIV subscripts are of the form <a1*i + c1, a2*i' + c2>, where the coefficients a1 and a2 may differ. For example, the following are weak SIV subscripts: <i+1, 5>, <2*i+1, i+5>, and <2*i+1, 2>.
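A minimal C sketch of the strong SIV test (parameter names are illustrative, not from the slides):

#include <stdbool.h>
#include <stdlib.h>

/* Strong SIV test for the subscript pair <a*i + c1, a*i' + c2>, with a != 0
 * (a == 0 would make the subscript ZIV). A dependence exists iff
 * d = (c1 - c2)/a is an integer and |d| <= U - L; *dist then receives d. */
bool strong_siv(int a, int c1, int c2, int L, int U, int *dist) {
    int diff = c1 - c2;
    if (diff % a != 0) return false;     /* no integer solution: independent */
    int d = diff / a;
    if (abs(d) > U - L) return false;    /* solution lies outside the loop bounds */
    *dist = d;
    return true;                         /* dependence with distance d */
}

For the example above (A(i+1,k) versus A(i,k) with 1 <= i <= 100): a = 1, c1 = 1, c2 = 0, so d = 1 and the test reports a dependence carried by the i loop with distance 1.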

Geometric View of Weak SIV. (Diagram omitted.)

Weak-zero SIV Test. A special case of weak SIV where one of the coefficients of the index is zero. The test consists merely of checking whether the solution i = (c2 - c1)/a1 is an integer and is within the loop bounds.

Weak-zero SIV & Loop Peeling.

DO i = 1, N
S1  Y(i, N) = Y(1, N) + Y(N, N)
ENDDO

can be loop peeled to

Y(1, N) = Y(1, N) + Y(N, N)
DO i = 2, N-1
S1  Y(i, N) = Y(1, N) + Y(N, N)
ENDDO
Y(N, N) = Y(1, N) + Y(N, N)
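A minimal C sketch of the weak-zero SIV test for the subscript pair <a1*i + c1, c2> (parameter names are illustrative):

#include <stdbool.h>

/* Weak-zero SIV test: subscripts a1*i + c1 and c2 (the other coefficient is
 * zero), with a1 != 0. A dependence exists iff i = (c2 - c1)/a1 is an
 * integer within the loop bounds L..U; *iter then receives that iteration,
 * which is the candidate for loop peeling. */
bool weak_zero_siv(int a1, int c1, int c2, int L, int U, int *iter) {
    int diff = c2 - c1;
    if (diff % a1 != 0) return false;    /* no integer solution: independent */
    int i = diff / a1;
    if (i < L || i > U) return false;    /* solution outside the loop bounds */
    *iter = i;
    return true;
}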

Weak-crossing SIV Test. A special case of weak SIV where the coefficients of the index are equal in magnitude but opposite in sign. The test consists merely of checking whether the solution index i = (c2 - c1)/(2*a1) is (1) within the loop bounds, and (2) either an integer or has a non-integer part equal to 1/2.

Weak-crossing SIV & Loop Splitting.

DO i = 1, N
S1  A(i) = A(N-i+1) + C
ENDDO

This loop can be split into

DO i = 1, (N+1)/2
  A(i) = A(N-i+1) + C
ENDDO
DO i = (N+1)/2 + 1, N
  A(i) = A(N-i+1) + C
ENDDO

Complex Iteration Spaces. Until now we have applied the tests only to rectangular iteration spaces. These tests can also be extended to apply to triangular or trapezoidal loops. Triangular: one of the loop bounds is a function of at least one other loop index. Trapezoidal: both of the loop bounds are functions of at least one other loop index.
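A minimal C sketch of the weak-crossing SIV test for the subscript pair <a*i + c1, -a*i' + c2> (parameter names are illustrative):

#include <stdbool.h>

/* Weak-crossing SIV test: subscripts a*i + c1 and -a*i' + c2, with a != 0.
 * The crossing point is i = (c2 - c1)/(2*a); a dependence exists iff that
 * point lies within the loop bounds L..U and is either an integer or an
 * integer plus 1/2. Works with twice the crossing point to stay in
 * integer arithmetic. */
bool weak_crossing_siv(int a, int c1, int c2, int L, int U) {
    int diff = c2 - c1;
    if (diff % a != 0) return false;            /* 2*i is not an integer */
    int twice_i = diff / a;                     /* equals 2 * crossing point */
    if (twice_i < 2 * L || twice_i > 2 * U)     /* crossing point outside bounds */
        return false;
    return true;
}

For the loop above (A(i) versus A(N-i+1)): a = 1, c1 = 0, c2 = N+1, so the crossing point is (N+1)/2, which lies within 1..N for N >= 1 and is an integer or half-integer, and splitting the loop at that point removes the crossing dependence.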

Next Time. Complex iteration spaces, MIV tests, tests in coupled groups, and merging direction vectors.