Compiler Design. Data Flow Analysis. Hwansoo Han

Similar documents
Data Flow Analysis (I)

Topic-I-C Dataflow Analysis

Lecture 5 Introduction to Data Flow Analysis

Dataflow Analysis Lecture 2. Simple Constant Propagation. A sample program int fib10(void) {

Lecture 4 Introduc-on to Data Flow Analysis

Data flow analysis. DataFlow analysis

Dataflow Analysis. A sample program int fib10(void) { int n = 10; int older = 0; int old = 1; Simple Constant Propagation

(b). Identify all basic blocks in your three address code. B1: 1; B2: 2; B3: 3,4,5,6; B4: 7,8,9; B5: 10; B6: 11; B7: 12,13,14; B8: 15;

Monotone Data Flow Analysis Framework

3/11/18. Final Code Generation and Code Optimization

CSC D70: Compiler Optimization Dataflow-2 and Loops

Adventures in Dataflow Analysis

CMSC 631 Program Analysis and Understanding. Spring Data Flow Analysis

Data-Flow Analysis. Compiler Design CSE 504. Preliminaries

We define a dataflow framework. A dataflow framework consists of:

The Meet-Over-All-Paths Solution to Data-Flow Problems

Dataflow analysis. Theory and Applications. cs6463 1

Department of Computer Science and Engineering, IIT Kanpur, INDIA. Code Optimization. Sanjeev K Aggarwal Code Optimization 1 of I

Generalizing Data-flow Analysis

Introduction to data-flow analysis. Data-flow analysis. Control-flow graphs. Data-flow analysis. Example: liveness. Requirements

Scalar Optimisation Part 2

MIT Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

Static Analysis of Programs: A Heap-Centric View

Dataflow Analysis. Dragon book, Chapter 9, Section 9.2, 9.3, 9.4

Lecture 10: Data Flow Analysis II

What is SSA? each assignment to a variable is given a unique name all of the uses reached by that assignment are renamed

Compiler Design Spring 2017

CSC D70: Compiler Optimization Static Single Assignment (SSA)

Dataflow Analysis. Chapter 9, Section 9.2, 9.3, 9.4

CS 553 Compiler Construction Fall 2006 Homework #2 Dominators, Loops, SSA, and Value Numbering

DFA of non-distributive properties

Compiler Design Spring 2017

Intra-procedural Data Flow Analysis Backwards analyses

Lecture 4. Finite Automata and Safe State Machines (SSM) Daniel Kästner AbsInt GmbH 2012

Scalar Optimisation Part 1

COSE312: Compilers. Lecture 17 Intermediate Representation (2)

Flow grammars a flow analysis methodology

Data Flow Analysis. Lecture 6 ECS 240. ECS 240 Data Flow Analysis 1

Computing Static Single Assignment (SSA) Form

MIT Loop Optimizations. Martin Rinard

Static Program Analysis

Topics on Compilers

Iterative Dataflow Analysis

Dataflow Analysis - 2. Monotone Dataflow Frameworks

: Compiler Design. 9.5 Live variables 9.6 Available expressions. Thomas R. Gross. Computer Science Department ETH Zurich, Switzerland

Causal Dataflow Analysis for Concurrent Programs

Lecture 11: Data Flow Analysis III

Lecture Overview SSA Maximal SSA Semipruned SSA o Placing -functions o Renaming Translation out of SSA Using SSA: Sparse Simple Constant Propagation

Construction of Static Single-Assignment Form

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Space-aware data flow analysis

Sparse Analysis of Variable Path Predicates Based Upon SSA-Form

: Advanced Compiler Design Compu=ng DF(X) 3.4 Algorithm for inser=on of φ func=ons 3.5 Algorithm for variable renaming

Model Checking & Program Analysis

Optimizing Finite Automata

Probabilistic Model Checking and [Program] Analysis (CO469)

Goal. Partially-ordered set. Game plan 2/2/2013. Solving fixpoint equations

directed weighted graphs as flow networks the Ford-Fulkerson algorithm termination and running time

CSE P 501 Compilers. Value Numbering & Op;miza;ons Hal Perkins Winter UW CSE P 501 Winter 2016 S-1

Logic and Automata I. Wolfgang Thomas. EATCS School, Telc, July 2014

Class 5. Review; questions. Data-flow Analysis Review. Slicing overview Problem Set 2: due 9/3/09 Problem Set 3: due 9/8/09

Symbolic Model Checking with ROBDDs

ECE 5775 (Fall 17) High-Level Digital Design Automation. Control Flow Graph

Abstract Interpretation II

Models of Computation, Recall Register Machines. A register machine (sometimes abbreviated to RM) is specified by:

Models of Sequential Systems

Discrete Fixpoint Approximation Methods in Program Static Analysis

1 Matchings in Non-Bipartite Graphs

Optimizing Intra-Task Voltage Scheduling Using Data Flow Analysis Λ

Software Verification

Mechanics of Static Analysis

AST rewriting. Part I: Specifying Transformations. Imperative object programs. The need for path queries. fromentry, paths, toexit.

ECE 5775 (Fall 17) High-Level Digital Design Automation. Scheduling: Exact Methods

Reduced Ordered Binary Decision Diagrams

Path Testing and Test Coverage. Chapter 9

Lexical Analysis. DFA Minimization & Equivalence to Regular Expressions

Probabilistic Program Analysis

Path Testing and Test Coverage. Chapter 9

MAS210 Graph Theory Exercises 5 Solutions (1) v 5 (1)

Linearity Analysis for Automatic Differentiation

Reading: Chapter 9.3. Carnegie Mellon

arxiv: v1 [cs.dm] 26 Apr 2010

Verifying the LLVM. Steve Zdancewic DeepSpec Summer School 2017

STAAR Science Tutorial 21 TEK 6.8D: Graphing Motion

Optimal Metric Planning with State Sets in Automata Representation

Program Analysis. Lecture 5. Rayna Dimitrova WS 2016/2017

Linearity Analysis for Automatic Differentiation

The Reachability-Bound Problem. Gulwani and Zuleger PLDI 10

Value Range Propagation

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN. Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Principles of Program Analysis: A Sampler of Approaches

Compilers. Lexical analysis. Yannis Smaragdakis, U. Athens (original slides by Sam

Static Program Analysis


Worst-Case Execution Time Analysis. LS 12, TU Dortmund

CS360 Homework 12 Solution

Multiple-Dispatching Based on Automata

Maximum flow problem (part I)

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

Transcription:

Compiler Design Data Flow Analysis Hwansoo Han

Control Flow Graph What is CFG? Represents program structure for internal use of compilers Used in various program analyses Generated from AST or a sequential list of statements Basic block To build CFG, first partition a program into basic blocks Longest possible sequence of straight-line code no branch-in/out in the middle of basic block single entry at the beginning (label, join-point) single exit at the end (can be branch or non-branch) 2

CFG Example CFG = Basic blocks + Control transfers Unreachable code elimination DFS (or BFS) on the CFG can remove unreachable blocks As a result, the declaration of z may also be removed. int test(int n) { int x,y,z,w; if (n < 2) x = 3; else x = 0; goto l; z = x + 5; y = z * 3; l: w = x + 10; y = w * 2; return y; } 3 x=3 z=x+5 y=z*3 test() n<2 goto l w=x+!0 y=w*2 return y x=0

Data Flow Problem Applying data flow analysis on CFG Identify how data are manipulated/modified Optimization relies on information quality acquired thru DFA DFA is a process to solve the effects of statements, basic blocks or larger segments Typical data flow problems Available expressions, reaching definitions Aliases, live variables, copy propagation Upward exposed uses (live-in for basic block) n<2 x=3 x=0 goto l w=x+!0 y=w*2 return y Analyze data information at each point 4

Data Flow Analysis (DFA) Data flow information (DFI) Abstract property of program data (variable, value, name) At various points in the control flow Data flow equation (DFE) Represents effects of statements, basic blocks, or program segments Use systematic equations using following components 1. sets or functions (in, out, gen, kill, f, g, ) 2. operations (,,, ) on sets/functions To gather DFI, DFEs are solved by well-defined mathematical procedures. 5

DFA on Program Hierarchy Local data flow analysis Composes effect of each statement within a basic block Proceeds sequentially from the leader to the last statement in the basic block Global data flow analysis Composes effect of each program segment (basic blocks, loops or intervals) at the segment boundaries in a CFG Proceeds along the control transfer in a control flow graph 6

Local Data Flow Analysis Transfer function (f S ) Represents effects of statement S on DFI within a basic block. Corresponding data flow equation out[s] = f S (in[s]) DFI in[s i ] S i-1 Typical transfer function f S (in[s]) = gen[s] (in[s] kill[s]) gen[s]: DFI newly generated by S kill[s]: DFI that was originally in in[s], but regenerated by S f S i gen[s i ] kill[s i ] out[s i ] S i S i+1 7

Example: Local Reaching Definition DFI of reaching definition Which variable definitions reach which points of uses? DFE for each statement S. in[s] : set of definitions that may reach S out[s] : set of definitions that are valid right after S kill[s] : set of definitions that reach S but are killed because S assigns new values to the definitions gen[s] : set of definitions that S newly assigns values to and not subsequently killed by itself Transfer function: f S (in[s]) = gen[s] (in[s] kill[s]) 8

Computing local reaching definitions v_s 1 x_s 7 y_s 9 in[b] S 10 x := y+4 S 11 y := y+v S 12 z := x*v in[s 10 ] = in[b]={v_s 1, x_s 7, y_s 9 } in[s 11 ] = out[s 10 ]={v_s 1, x_s 10, y_s 9 } kill[s 10 ] = {x_*}, gen[s 10 ]={x_s 10 } kill[s 11 ] = {y_*}, gen[s 11 ]={y_s 11 } in[s 12 ] = out[s 11 ]={v_s 1, x_s 10, y_s 11 } kill[s 12 ] = {z_*}, gen[s 12 ]={z_s 12 } in[s 13 ] = out[s 12 ]={v_s 1, x_s 10, y_s 11, z_s 12 } S 13 x := x-z kill[s 13 ] = {x_*}, gen[s 13 ]={x_s 13 } out[b] v_s 1 x_s 13 y_s 11 z_s 12 in[s 14 ] = out[s 13 ]={v_s 1, x_s 13, y_s 11, z_s 12 } S 14 goto L on z>0 kill[s 14 ] = {}, gen[s 14 ]={} out[b] = out[s 14 ]={v_s 1, x_s 13, y_s 11,z_S 12 } 9

Global Data Flow Analysis Transfer function out[x] = f X (in[x]) in[x] : DFI given to the beginning of X out[x] : output DFI at the end of X, DFI CFG Collects outputs of predecessors in[x] in[x] = out[p] P Pred[X] A typical form of a transfer function f X (in[x]) = gen[x] (in[x] kill[x]) gen[x]: DFI newly generated within X f X gen[x] kill[x] out[x] X kill[x]: DFI that was originally in in[x], but regenerated within X 10

Example: Global Reaching Definition DFE components for each basic block B in[b] : set of definitions that may reach the beginning of B out[b] : set of definitions that may reach the end of B kill[b] : set of definitions that reach B but are killed by being assigned values in it gen[b] : set of definitions that are newly assigned values in B and not subsequently killed in it Transfer function: f B (in[b]) = gen[b] (in[b] kill[b]) 11

Computing Reaching Definition B 0 S 0... B 4 S 50 x := y+4 S 51 y := y+v S 52 z := x*v S 53 x := x-z S 54 goto L on z>0 v_s 12 x_s 27 y_s 39 in[b 4 ] kill[b 4 ] x_* y_* z_* z_s 52 y_s 51 x_s 53 gen[b4 ] v_s 12 x_s 53 y_s 51 out[b 4 ] = gen[b 4 ] (in[b 4 ] kill[b 4 ]) B 8 z_s 52 B 6 S 70 L: t := z+x S 71 s := y*v... S 55... 12

Propagating DFI thru Control Flow Propagate DFI to a join node A node u with multiple predecessors p 1,,p n Need to specify how to combine the DFI propagated from all its predecessors Define a binary meet (or called confluence) operation Find greatest lower bound (glb) on lattice in[u] = out[p 1 ] out[p 2 ] out[p n ] = out[p i ] i = 1,n in[y] initial values y v in[u]=out[v] out[w] u x w out[u]=f u (in[u]) 13

Propagating Reaching Definitions The meet operation in the reaching definitions problem is union. B 3 S 40 read(x) S 41 v := z*v S 42 goto L on x=0 x_s 40 z_s 30 v_s 41 out[b 3 ] in[b 8 ] x_s 22 z_s 30 v_s 34 v_s 12, v_s 41 x_s 40, x_s 53 z_s 30, z_s 52 y_s 51 in[b 3 ] B 4 v_s 12 x_s 27 y_s 39 S 50 x := y+4 S 51 y := y+v S 52 z := x*v S 53 x := x-z S 54 goto L on z>0 v_s 12 x_s 53 y_s 51 z_s 52 out[b 4 ] S 0... in[b 8 ] = out[b 3 ] out[b 4 ] in[b 4 ] B 0 in[b 6 ] = out[b 4 ] in[b 6 ] v_s 12 x_s 53 y_s 51 z_s 52 B B 6 8 S 70 L: t := z+x S S 70... 71 s := y*v... 14

Iterative Data Flow Analysis Given G=(N,E,S), forward data-flow problem s DFE in[b] = Init (Init = Φ for Reaching Definition) out[b] = f B ( in[b] ) = gen[b] ( in[b] kill[b] ) A typical algorithm for iterative data flow analysis for each program segment X do initialize out[x]; od while changes to any out[x] occur do for each segment X in some order do in[x] = out[p X ]; all predecessors P X of X out[x] = gen[x] (in[x]-kill[x]); od od 15

Live Variables Problem Dataflow problem Find which variables are live in each basic block! Definition of live A variable v is live at point p, if the value of v is used along some path in CFG starting from p. Otherwise, v is dead. 16

A Framework For Live Variables Backward dataflow problem DFI propagates backward the flow of control DFI in live variables problem is a set of variables : V = { v v a set of all variables in the program} Transfer function and meet operator a variable is live at the beginning of B if used in B before redefinition (local definition) or live at the end of B and is not redefined in B in[b] = f B (out[b]) = use[b] (out[b] def[b]) a variable is live at the end of B if it is live at the beginning of any of its successors. out[b] = in[s] = S succ(b) 17

Forward vs. Backward Data Flow Forward dataflow equation in[*] = Init, in[b] = out[p] P pred(b) in[b] out[b] out[b] = f B ( in[b] ) = gen[b] ( in[b] kill[b] ) B DFI Backward dataflow equation out[*] = Init, out[b] = 18 in[s] S succ(b) in[b] out[b] in[b] = f B ( out[b] ) = gen[b] ( out[b] kill[b] ) B DFI

Backward Data Flow Analysis To handle backward data flow analysis, 1. Build a reverse CFG G R =(N,E R ) from the original CFG G=(N,E) s.t. for each (u,v) E, there exists (v,u) in E R. 2. Apply the same algorithm that is applied to forward data flow problems. 19

Iterative Algorithm For Liveness Iterative backward data flow analysis to compute live variables for each basic block B do in[b] = use[b]; // f B ( ) = use[b] ( - def[b]) where od // is the initial DFI of live variables while changes to any in[b] occur do for each X in some order do out[b] = in[s B ]; all successors s B of B in[b] = use[b] (out[b] - def[b]); od od 20

Code Example B 1 B 2 B 3 B 5 x := u+3 y := v-4 t := w*2 L 1 : x := x-1 y := y+1 br L 2 if x>y t := y+1 br L 3 L 2 : x := x-1 L 3 :br L 1 if x>0 B 4 B3 PostOrder: B5,B3,B4,B2,B1 B1 B2 B5 B4 Initial Condition (in[ ] = use[ ]) out[b 1 ] = { } in[b 1 ] = {u,v,w} out[b 2 ] = { } in[b 2 ] = {x,y} out[b 3 ] = { } in[b 3 ] = {y} out[b 4 ] = { } in[b 4 ] = {x} out[b 5 ] = { } in[b 5 ] = {x} After 1 st iteration (postorder) out[b 5 ] = {x,y} in[b 5 ] = {x,y} out[b 3 ] = {x,y} in[b 3 ] = {x,y} out[b 4 ] = {x,y} in[b 4 ] = {x,y} out[b 2 ] = {x,y} in[b 2 ] = {x,y} out[b 1 ] = {x,y} in[b 1 ] = {u,v,w} use[b 1 ] = {u,v,w} def[b 1 ] = {x,y,t} use[b 2 ] = {x,y} def[b 2 ] = {x,y} use[b 3 ] = {y} def[b 3 ] = {t} use[b 4 ] = {x} def[b 4 ] = {x} use[b 5 ] = {x} def[b 5 ] = { } After 2 nd iteration (stabilized) out[b 5 ] = {x,y} in[b 5 ] = {x,y} out[b 3 ] = {x,y} in[b 3 ] = {x,y} out[b 4 ] = {x,y} in[b 4 ] = {x,y} out[b 2 ] = {x,y} in[b 2 ] = {x,y} out[b 1 ] = {x,y} in[b 1 ] = {u,v,w} 21

How Fast Is The Iterative Algorithm? Execution time One pass visits all nodes (basic blocks) Number of visits to nodes : O(n 2 ) algorithm = length of longest acyclic path on CFG Merge straight basic blocks to reduce number of nodes Bit vector Instead of set operations, use bit vector and bit operations reaching definition : 1 bit for 1 variable available expression : 1 bit for 1 definition of expression Work-list algorithm Only visit nodes which have changed in[x] i.e. successors of changed out[x] Visiting order Post-order, reverse post-order (DFS) 22

Work List Iterative Algorithm Typical Iterative algorithm implements worklist for each x V do OUT(x) = GEN(x) (IN(x) KILL(x)); od worklist all x V while worklist do remove a node x from worklist oldout(x) = OUT(x); od // IN(x) = initial value IN(X) = OUT(P); P pred(x) OUT(X) = GEN(x) ( IN(x) KILL(x)); // = GEN ( IN & ~KILL ) if oldout(x) OUT(x) then worklist worklist succ(x); 23

Visiting Order To avoid unnecessary work Limit the number of visits by visiting a node roughly after all its predecessors Post-order for backward problem Reverse post-order for forward problem (reverse PO DFS ) b a c main() // Post-Order count = 1; DFS-visit(root); d e f g post-order : d e b f g c a reverse PO: a c g f b e d DFS-visit(n) mark n as visited foreach s succ(n) not yet visited DFS-visit(s); PostOrder(n) = count++; 24

Summary Data flow analysis on CFG Data flow information Data flow equation in[b] = out[p] P Pred[B] out[b] = f B ( in[b] ) = gen[b] ( in[b] kill[b] ) Define a meet operation for join points Faster Iterative Algorithm for DFA Bit vector instead of set operation Work list Visiting order 25