CSC D70: Compiler Optimization Static Single Assignment (SSA)


CSC D70: Compiler Optimization Static Single Assignment (SSA) Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip Gibbons

From Last Lecture What is a Loop? Dominator Tree Natural Loops Back Edges

Finding Loops: Summary Define loops in graph-theoretic terms. Definitions and algorithms for: dominators, back edges, natural loops.

Where Is a Variable Defined or Used?
Example: Loop-Invariant Code Motion (a loop body containing A = B + C and E = A + D): Are B, C, and D only defined outside the loop? Are there other definitions of A inside the loop? What are the uses of A inside the loop?
Example: Copy Propagation (two definitions X = Y on different paths reaching W = X + Z): For a given use of X, are all reaching definitions of X copies from the same variable (e.g., X = Y), where Y is not redefined since that copy? If so, substitute the use of X with a use of Y.
It would be nice if we could traverse directly between related uses and defs; this would enable a form of sparse code analysis (skipping over the don't-care cases).

Appearances of Same Variable Name May Be Unrelated
X = A + 1
Y = X + B
F = 2
F = 3
X = F + 7
C = X + D
The values in reused storage locations may be provably independent, in which case the compiler can optimize them as separate values. The compiler could use renaming to make these different versions more explicit.

Definition-Use and Use-Definition Chains
X = A + 1
Y = X + B
F = 2
F = 3
X = F + 7
C = X + D
Definition-Use (DU) Chains: for a given definition of a variable X, what are all of its uses?
Use-Definition (UD) Chains: for a given use of a variable X, what are all of the reaching definitions of X?
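
For straight-line code the chains are easy to build. The sketch below is a minimal illustration (the statement encoding and the specific constants are my own, not from the slides): it records, for the block above, which definition reaches which use, and shows that the two defs of X reach disjoint uses.

from collections import defaultdict

stmts = [
    ("X", ["A"]),       # 0: X = A + 1
    ("Y", ["X", "B"]),  # 1: Y = X + B
    ("X", ["F"]),       # 2: X = F + 7
    ("C", ["X", "D"]),  # 3: C = X + D
]

du = defaultdict(list)   # def site -> use sites it reaches
ud = defaultdict(list)   # (use site, var) -> def sites reaching it
cur_def = {}             # most recent def site of each variable

for i, (d, uses) in enumerate(stmts):
    for v in uses:
        if v in cur_def:              # defs from outside the block are not tracked
            du[cur_def[v]].append(i)
            ud[(i, v)].append(cur_def[v])
    cur_def[d] = i

print(dict(du))   # {0: [1], 2: [3]}: the two defs of X reach disjoint uses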

DU and UD Chains Can Be Expensive
foo(int i, int j) {
  switch (i) {
    case 0: x = 3; break;
    case 1: x = 1; break;
    case 2: x = 6; break;
    case 3: x = 7; break;
    default: x = 11;
  }
  switch (j) {
    case 0: y = x + 7; break;
    case 1: y = x + 4; break;
    case 2: y = x - 2; break;
    case 3: y = x + 1; break;
    default: y = x + 9;
  }
}
In general, with N defs and M uses, this takes O(NM) space and time.
One solution: limit each variable to ONE definition site.

DU and UD Chains Can Be Expensive (2)
(Same example.) Each use of x in the second switch may see any one of the five definitions of x from the first switch: "x is one of the above x's".
One solution: limit each variable to ONE definition site.

Static Single Assignment (SSA)
Static single assignment is an IR where every variable is assigned a value at most once in the program text.
Easy for a basic block (reminiscent of Value Numbering). Visit each instruction in program order: LHS: assign to a fresh version of the variable; RHS: use the most recent version of each variable.
a ← x + y        a1 ← x + y
b ← a + x        b1 ← a1 + x
a ← b + 2        a2 ← b1 + 2
c ← y + 1        c1 ← y + 1
a ← c + a        a3 ← c1 + a2
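
The per-block rule above is mechanical enough to sketch directly. The snippet below is a minimal illustration (the statement encoding is my own, not from the slides): each statement lists the variable it defines and its operands, and renaming keeps a version counter per variable.

from collections import defaultdict

def local_ssa(block):
    # block: list of (defined_var, [operand names or literals])
    version = defaultdict(int)            # current version of each variable
    out = []
    for lhs, rhs in block:
        new_rhs = [f"{v}{version[v]}" if version[v] else v for v in rhs]
        version[lhs] += 1                 # the LHS always gets a fresh version
        out.append((f"{lhs}{version[lhs]}", new_rhs))
    return out

block = [("a", ["x", "y"]), ("b", ["a", "x"]),
         ("a", ["b", "2"]), ("c", ["y", "1"]), ("a", ["c", "a"])]
for lhs, rhs in local_ssa(block):
    print(lhs, "<-", " + ".join(rhs))     # reproduces the right-hand column above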

What about Joins in the CFG?
c ← 12
if (i) {
  a ← x + y
  b ← a + x
} else {
  a ← b + 2
  c ← y + 1
}
a ← c + a
Renaming inside each block gives c1 ← 12, then a1 and b1 in the then-branch and a2 and c2 in the else-branch. But what should the statement after the join be? a4 ← c? + a?
Use a notational fiction: a φ function.

Merging at Joins: the φ function
c1 ← 12
if (i)
  then: a1 ← x + y; b1 ← a1 + x
  else: a2 ← b + 2; c2 ← y + 1
join:
  a3 ← φ(a1, a2)
  c3 ← φ(c1, c2)
  b2 ← φ(b1, ?)
  a4 ← c3 + a3

The φ function merges multiple definitions along multiple control paths into a single definition. At a basic block with p predecessors, there are p arguments to the φ function: x_new ← φ(x1, x2, x3, ..., xp). How do we choose which x_i to use? We don't really care! If we care, use moves on each incoming edge.
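
As a tiny illustration of φ semantics (the variable names and encoding are mine, not the slides'): the value a φ produces is determined purely by which predecessor edge control arrived on.

def run_join(came_from, env):
    # a3 <- phi(a1, a2): predecessor 0 supplies a1, predecessor 1 supplies a2
    env["a3"] = env["a1"] if came_from == 0 else env["a2"]
    return env

print(run_join(0, {"a1": 5, "a2": 9}))   # a3 = 5
print(run_join(1, {"a1": 5, "a2": 9}))   # a3 = 9

This is also why a φ can be compiled away into ordinary copies placed on the incoming edges, as the next slide shows.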

Implementing φ
c1 ← 12
if (i)
  then: a1 ← x + y; b1 ← a1 + x; a3 ← a1; c3 ← c1
  else: a2 ← b + 2; c2 ← y + 1; a3 ← a2; c3 ← c2
join:
  a3 ← φ(a1, a2)   (replaced by the copies on the incoming edges)
  c3 ← φ(c1, c2)
  a4 ← c3 + a3

Trivial SSA
Each assignment generates a fresh variable. At each join point insert φ functions for all live variables.
x ← 1
y ← x        y ← 2
z ← y + x
becomes
x1 ← 1
y1 ← x1      y2 ← 2
x2 ← φ(x1, x1)
y3 ← φ(y1, y2)
z ← y3 + x2
Too many φ functions inserted.

Minimal SSA
Each assignment generates a fresh variable. At each join point insert φ functions for all live variables with multiple outstanding defs.
x ← 1
y ← x        y ← 2
z ← y + x
becomes
x1 ← 1
y1 ← x1      y2 ← 2
y3 ← φ(y1, y2)
z ← y3 + x1

Another Example
a ← 0
loop:
  b ← a + 1
  c ← c + b
  a ← b * 2
  if a < N goto loop
return c
In SSA form:
a1 ← 0
loop:
  a3 ← φ(a1, a2)
  c3 ← φ(c1, c2)
  b1 ← a3 + 1
  c2 ← c3 + b1
  a2 ← b1 * 2
  if a2 < N goto loop
return c2
Notice the use of c1, which has no explicit definition (the entry block implicitly defines all variables).

When Do We Insert φ?
If there is a def of a in block 5, which nodes need a φ()?
(Figure: an example CFG with numbered blocks.)

When do we insert φ?
We insert a φ function for variable A in block Z iff:
A was defined more than once before (i.e., A defined in blocks X and Y, with X ≠ Y)
There exists a non-empty path from X to Z, P_xz, and a non-empty path from Y to Z, P_yz, such that P_xz ∩ P_yz = { Z } (Z is the only common block along the paths)
Z ∉ P_xq or Z ∉ P_yr, where P_xz = P_xq → Z and P_yz = P_yr → Z (at least one path reaches Z for the first time)
The entry block contains an implicit def of all vars.
Note: v ← φ(...) is itself a def of v.

Dominance Property of SSA
In SSA, definitions dominate uses.
If x_i is used in x ← φ(..., x_i, ...), then BB(x_i) dominates the i-th predecessor of BB(the φ).
If x is used in y ← ... x ..., then BB(x) dominates BB(y).
We can use this for an efficient algorithm to convert to SSA.

Dominance
If there is a def of a in block 5, which nodes need a φ()?
(Figure: the example CFG and its dominator tree (D-Tree).)
x strictly dominates w (x sdom w) iff x dom w AND x ≠ w
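
As a reminder of where the dominator tree comes from, here is a small sketch of the classic iterative computation dom(n) = {n} ∪ ⋂ dom(p) over predecessors p. The slides do not prescribe this particular algorithm; the CFG used is the 7-block loop example that appears later in the lecture.

def dominators(succ, entry):
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, ss in succ.items():
        for s in ss:
            pred[s].add(n)
    dom = {n: set(nodes) for n in nodes}   # start with "dominated by everyone"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            ps = [dom[p] for p in pred[n]]
            new = ({n} | set.intersection(*ps)) if ps else {n}
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

succ = {1: [2], 2: [3, 4], 3: [5, 6], 4: [], 5: [7], 6: [7], 7: [2]}
print(dominators(succ, 1))   # e.g. dom(7) = {1, 2, 3, 7}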

Dominance Frontier
(Figure: the example CFG and its dominator tree.)
The Dominance Frontier of a node x = { w | x dom pred(w) AND !(x sdom w) }
x strictly dominates w (x sdom w) iff x dom w AND x ≠ w

Dominance Frontier and Path Convergence
(Figure: the example CFG, highlighting how paths from distinct definitions converge at nodes in the dominance frontier.)

Using Dominance Frontier to Compute SSA
1. Place all φ()
2. Rename all variables

Using Dominance Frontier to Place φ()
Gather all the defsites of every variable. Then, for every variable:
foreach defsite
  foreach node in DominanceFrontier(defsite)
    if we haven't put a φ() in the node, then put one in
    if this node didn't define the variable before, then add this node to the defsites
This essentially computes the Iterated Dominance Frontier on the fly, inserting the minimal number of φ() necessary.

Using Dominance Frontier to Place φ()
foreach node n {
  foreach variable v defined in n {
    orig[n] = orig[n] ∪ {v}
    defsites[v] = defsites[v] ∪ {n}
  }
}
foreach variable v {
  W = defsites[v]
  while W not empty {
    n = remove a node from W
    foreach y in DF[n]
      if y ∉ PHI[v] {
        insert "v ← φ(v, v, ...)" at top of y
        PHI[v] = PHI[v] ∪ {y}
        if v ∉ orig[y]: W = W ∪ {y}
      }
  }
}
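
A runnable sketch of the pseudocode above, assuming the dominance frontier DF and the set of variables originally defined in each block are already available (the dict-of-sets layout is my own choice, not prescribed by the slides):

from collections import defaultdict

def place_phis(defs_in_block, df):
    defsites = defaultdict(set)
    for n, vs in defs_in_block.items():
        for v in vs:
            defsites[v].add(n)
    phis = defaultdict(set)                 # block -> vars needing a phi there
    for v, sites in defsites.items():
        work = list(sites)
        while work:
            n = work.pop()
            for y in df.get(n, ()):
                if v not in phis[y]:
                    phis[y].add(v)          # insert v <- phi(v, v, ...) at top of y
                    if v not in defs_in_block.get(y, set()):
                        work.append(y)      # the phi is itself a new def of v
    return phis

# The lecture's 7-block example: i, j, k defined in 1; j, k redefined in 5 and 6.
defs_in_block = {1: {"i", "j", "k"}, 5: {"j", "k"}, 6: {"j", "k"}}
df = {1: set(), 2: {2}, 3: {2}, 4: set(), 5: {7}, 6: {7}, 7: {2}}
print(place_phis(defs_in_block, df))        # phis for j and k at blocks 7 and 2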

Renaming Variables
Algorithm: walk the D-tree, renaming variables as you go; replace each use with the most recent renamed def.
For straight-line code this is easy. What if there are branches and joins? Use the closest def such that the def is above the use in the D-tree.
Easy implementation: for each var v, call rename(v).
rename(v):
  in this block: replace uses of v with the top of v's stack; at each def of v, push a fresh name onto the stack
  call rename(v) on all children in the D-tree
  for each def of v in this block, pop from the stack
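
A compact sketch of this stack-based walk (the statement encoding is my own; filling in the φ arguments at CFG successors is deliberately omitted here to stay close to the slide):

from collections import defaultdict

def rename_all(blocks, dtree_children, entry):
    # blocks: block -> list of (defined_var_or_None, [used vars])
    stack = defaultdict(list)
    counter = defaultdict(int)

    def new_name(v):
        counter[v] += 1
        stack[v].append(f"{v}{counter[v]}")
        return stack[v][-1]

    def rename(b):
        pushed = []
        for idx, (d, uses) in enumerate(blocks[b]):
            uses[:] = [stack[u][-1] if stack[u] else u for u in uses]
            if d is not None:
                blocks[b][idx] = (new_name(d), uses)
                pushed.append(d)
        for c in dtree_children.get(b, []):
            rename(c)
        for d in pushed:                    # undo this block's defs on the way out
            stack[d].pop()

    rename(entry)
    return blocks

blocks = {1: [("x", []), ("y", ["x"])], 2: [("x", ["y"]), (None, ["x"])]}
print(rename_all(blocks, {1: [2]}, 1))      # x1, y1 in block 1; x2 defined and used in block 2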

Compute Dominance Tree
The running example:
Block 1: i ← 1; j ← 1; k ← 0
Block 2: k < 100?
Block 3: j < 20?          Block 4: return j
Block 5: j ← i; k ← k + 1    Block 6: j ← k; k ← k + 2
Block 7: (join; back edge to block 2)
D-tree: 1 → 2; 2 → {3, 4}; 3 → {5, 6, 7}

Compute Dominance Frontiers
(Same example CFG and D-tree.)
DFs: 1: {}   2: {2}   3: {2}   4: {}   5: {7}   6: {7}   7: {2}

Insert φ()
(Same example CFG.)
DFs: 1: {}   2: {2}   3: {2}   4: {}   5: {7}   6: {7}   7: {2}
orig[n]: 1: {i, j, k}   2: {}   3: {}   4: {}   5: {j, k}   6: {j, k}   7: {}
defsites[v]: i: {1}   j: {1, 5, 6}   k: {1, 5, 6}
var i: W = {1}; DF{1} = {}, so no φ is needed for i.
var j: W = {1, 5, 6}; DF{1} = {}; DF{5} = {7} ...

Insert φ()
(Tables as above.) var j: W = {1, 5, 6}; DF{1} = {}; DF{5} = {7}, so insert j ← φ(j, j) at the top of block 7.

(Continued.) Since the φ in block 7 is a new def of j, W becomes {1, 5, 6, 7} and defsites[j] becomes {1, 5, 6, 7}. Processing DF{7} = {2} inserts j ← φ(j, j) at the top of block 2 as well, so block 2 now begins with j ← φ(j, j); k < 100? and block 7 contains j ← φ(j, j).

(Continued.) Processing the remaining def sites for j (DF{6} = {7}, and DF{2} = {2} for the newly added block 2) adds nothing new, so j gets φ functions exactly at blocks 7 and 2.

Now variable k: W = {1, 5, 6}; by the same process, k ← φ(k, k) is inserted at the top of blocks 7 and 2. Block 2 now begins with j ← φ(j, j); k ← φ(k, k); k < 100?, and block 7 contains j ← φ(j, j); k ← φ(k, k).

Rename Vars
After φ insertion (before renaming):
Block 1: i ← 1; j ← 1; k ← 0
Block 2: j ← φ(j, j); k ← φ(k, k); k < 100?
Block 3: j < 20?          Block 4: return j
Block 5: j ← i; k ← k + 1    Block 6: j ← k; k ← k + 2
Block 7: j ← φ(j, j); k ← φ(k, k)
Rename by walking the D-tree: 1 → 2; 2 → {3, 4}; 3 → {5, 6, 7}.

Rename Vars
After renaming:
Block 1: i1 ← 1; j1 ← 1; k1 ← 0
Block 2: j2 ← φ(j4, j1); k2 ← φ(k4, k1); k2 < 100?
Block 3: j2 < 20?          Block 4: return j2
Block 5: j3 ← i1; k3 ← k2 + 1    Block 6: j5 ← k2; k5 ← k2 + 2
Block 7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

Computing DF(n)
(Figure: a node n whose dominator-tree region contains a and b, with an edge leaving the region into c: n dom a, n dom b, !(n dom c).)

Computing DF(n)
(Figure: the same picture, now with a node x in DF(a) and DF(b); nodes in the children's dominance frontiers that n does not strictly dominate also belong to DF(n).)

Computing the Dominance Frontier
compute-DF(n):
  S = {}
  foreach node y in succ[n]
    if idom(y) ≠ n
      S = S ∪ { y }
  foreach child c of n in the D-tree
    compute-DF(c)
    foreach w in DF[c]
      if !(n sdom w)
        S = S ∪ { w }
  DF[n] = S
The Dominance Frontier of a node x = { w | x dom pred(w) AND !(x sdom w) }
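
A direct, self-contained translation of compute-DF (the CFG, idom, and D-tree children for the running example are filled in by hand below; the dominance test is derived from idom):

def compute_df(n, succ, idom, children, df):
    def dom(a, b):                          # does a dominate b? walk b's idom chain
        while True:
            if a == b:
                return True
            if b not in idom:               # reached the entry block
                return False
            b = idom[b]
    s = set()
    for y in succ.get(n, ()):
        if idom.get(y) != n:
            s.add(y)
    for c in children.get(n, ()):
        compute_df(c, succ, idom, children, df)
        for w in df[c]:
            if not (dom(n, w) and n != w):  # i.e. not (n sdom w)
                s.add(w)
    df[n] = s
    return df

succ = {1: [2], 2: [3, 4], 3: [5, 6], 4: [], 5: [7], 6: [7], 7: [2]}
idom = {2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 3}
children = {1: [2], 2: [3, 4], 3: [5, 6, 7]}
print(compute_df(1, succ, idom, children, {}))
# DF(2) = DF(3) = DF(7) = {2}, DF(5) = DF(6) = {7}, DF(1) = DF(4) = {}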

SSA Properties
Only one assignment per variable.
Definitions dominate uses.

Constant Propagation
If v ← c, replace all uses of v with c.
If v ← φ(c, c, c) (each input is the same constant c), replace all uses of v with c.
W ← list of all defs
while !W.isEmpty {
  Stmt S ← W.removeOne
  if ((S has form v ← c) || (S has form v ← φ(c, ..., c))) then {
    delete S
    foreach stmt U that uses v {
      replace v with c in U
      W.add(U)
    }
  }
}
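
A small sketch of this worklist over a toy SSA program (the statement encoding — "copy" and "phi" nodes with operands that are either variable names or integer literals — is my own, not from the slides):

def const_prop(stmts):
    # stmts: var -> (kind, [operands]); operands are var names or int literals
    uses = {v: [u for u, (_, ops) in stmts.items() if v in ops] for v in stmts}

    def const_of(v):
        kind, ops = stmts[v]
        if kind == "copy" and isinstance(ops[0], int):
            return ops[0]
        if kind == "phi" and all(isinstance(o, int) for o in ops) and len(set(ops)) == 1:
            return ops[0]
        return None

    work = list(stmts)
    while work:
        v = work.pop()
        if v not in stmts:
            continue
        c = const_of(v)
        if c is None:
            continue
        del stmts[v]                        # delete S
        for u in uses.get(v, []):
            if u in stmts:
                kind, ops = stmts[u]
                stmts[u] = (kind, [c if o == v else o for o in ops])
                work.append(u)              # u may now be foldable too
    return stmts

prog = {"i1": ("copy", [1]), "j3": ("copy", ["i1"]), "j4": ("phi", ["j3", "j5"])}
print(const_prop(prog))                     # only j4 remains, as phi(1, 'j5')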

Other Optimizations with SSA
Copy propagation: delete x ← φ(y, y, y) and replace all x with y; delete x ← y and replace all x with y.
Constant folding (constant conditions too!).

Constant Propagation
Convert the example to SSA form.
Original:
Block 1: i ← 1; j ← 1; k ← 0
Block 2: k < 100?
Block 3: j < 20?          Block 4: return j
Block 5: j ← i; k ← k + 1    Block 6: j ← k; k ← k + 2
SSA form:
Block 1: i1 ← 1; j1 ← 1; k1 ← 0
Block 2: j2 ← φ(j4, j1); k2 ← φ(k4, k1); k2 < 100?
Block 3: j2 < 20?          Block 4: return j2
Block 5: j3 ← i1; k3 ← k2 + 1    Block 6: j5 ← k2; k5 ← k2 + 2
Block 7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

(The SSA form of the example, repeated as the starting point for constant propagation.)

Propagating i1 = 1, j1 = 1, k1 = 0 into their uses:
Block 2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); k2 < 100?
Block 5: j3 ← 1; k3 ← k2 + 1
(Other blocks unchanged.)

Then j3 = 1 propagates into block 7, giving j4 ← φ(1, j5) and k4 ← φ(k3, k5).
Not a very exciting result (yet).

Conditional Constant Propagation
(Same partially propagated SSA program.)
Does block 6 ever execute?
Simple Constant Propagation can't tell.
But Conditional Constant Propagation can tell:
Assumes blocks don't execute until proven otherwise.
Assumes values are constants until proven otherwise.

Conditional Constant Propagation Algorithm
Keeps track of:
Blocks: assumed unexecuted until proven otherwise.
Variables: assumed not executed (only with proof of an assignment of a non-constant value do we assume not constant).
Lattice for representing variables:
  ⊤ (top): not executed (no assignment seen yet)
  ..., -2, -1, 0, 1, 2, ...: we have seen evidence that the variable has been assigned a constant with this value
  ⊥ (bottom): we have seen evidence that the variable can hold different values at different times
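
A minimal sketch of this variable lattice and its meet (the names TOP and BOT are mine; in the full algorithm the meet is applied only to φ inputs arriving along edges already proven executable):

TOP, BOT = "top", "bot"   # TOP: no assignment seen yet; BOT: not a constant

def meet(a, b):
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOT   # two different constants -> not a constant

val = TOP
for evidence in [TOP, 4, 4, 7]:   # assignments observed over time
    val = meet(val, evidence)
print(val)                        # bot: the variable was assigned both 4 and 7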

Conditional Constant Propagation
Run on the partially propagated SSA program above, with blocks assumed unexecuted and variables assumed constant.

(As before, but block 5 now shows k3 ← 0 + 1: with only the edge from block 1 executable so far, the φ in block 2 gives j2 = 1 and k2 = 0, so block 3 and then block 5 become executable while block 6 does not.)

Conditional Constant Propagation
Since j2 = 1 on every executable path, j2 < 20 is always true: block 6 never executes and every j is the constant 1. The program simplifies to:
k2 ← φ(k3, 0)
k2 < 100?
  k3 ← k2 + 1
return 1

CSC D70: Compiler Optimization Static Single Assignment (SSA) Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip Gibbons

Backup Slides