iretilp : An efficient incremental algorithm for min-period retiming under general delay model

Similar documents
VLSI Signal Processing

Problem Set 9 Solutions

DSP Design Lecture 5. Dr. Fredrik Edman.

Network Flow-based Simultaneous Retiming and Slack Budgeting for Low Power Design

Chaper 4: Retiming (Tái định thì) GV: Hoàng Trang

Risk Aversion Min-Period Retiming Under Process Variations

for generating the LP. Utilizing the retiming-skew relation for level-clocked circuits from [7], bounds on the variables of the minarea LP are obtaine

CMPS 6610 Fall 2018 Shortest Paths Carola Wenk

Issues on Timing and Clocking

Clock Skew Scheduling with Delay Padding for Prescribed Skew Domains

Single Source Shortest Paths

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

Mathematics for Decision Making: An Introduction. Lecture 8

Skew Management of NBTI Impacted Gated Clock Trees

Discrete Optimization 2010 Lecture 2 Matroids & Shortest Paths

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN. Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering

EEE2135 Digital Logic Design

G1 G2 G3 G4 G2 G3 G4

Retiming. delay elements in a circuit without affecting the input/output characteristics of the circuit.

Timing Analysis with Clock Skew

Rahul B. Deokar and Sachin S. Sapatnekar. 201 Coover Hall, Iowa State University, Ames, IA

Network Flow-based Simultaneous Retiming and Slack Budgeting for Low Power Design

The Fast Optimal Voltage Partitioning Algorithm For Peak Power Density Minimization

Travelling Salesman Problem

Clock Skew Scheduling in the Presence of Heavily Gated Clock Networks

ENGG 1203 Tutorial_9 - Review. Boolean Algebra. Simplifying Logic Circuits. Combinational Logic. 1. Combinational & Sequential Logic

III. Linear Programming

UNIVERSITY OF CALIFORNIA

Lecture 7: Shortest Paths in Graphs with Negative Arc Lengths. Reading: AM&O Chapter 5

Memory Elements I. CS31 Pascal Van Hentenryck. CS031 Lecture 6 Page 1

Introduction to Algorithms

THE UNIVERSITY OF MICHIGAN. Faster Static Timing Analysis via Bus Compression

A VLSI DSP DESIGN AND IMPLEMENTATION O F ALL POLE LATTICE FILTER USING RETIMING METHODOLOGY

Optimum Prefix Adders in a Comprehensive Area, Timing and Power Design Space

Running Time. Assumption. All capacities are integers between 1 and C.

Logic Design II (17.342) Spring Lecture Outline

Micro-architecture Pipelining Optimization with Throughput- Aware Floorplanning

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II

Analysis of Algorithms I: All-Pairs Shortest Paths

Implementation of Clock Network Based on Clock Mesh

TAU 2014 Contest Pessimism Removal of Timing Analysis v1.6 December 11 th,

Review: Designing with FSM. EECS Components and Design Techniques for Digital Systems. Lec09 Counters Outline.

Flows. Chapter Circulations

CSC 1700 Analysis of Algorithms: Warshall s and Floyd s algorithms

Review: Designing with FSM. EECS Components and Design Techniques for Digital Systems. Lec 09 Counters Outline.

Lecture 11. Single-Source Shortest Paths All-Pairs Shortest Paths

Sequential Circuit Analysis

CSE140: Design of Sequential Logic

Low Power Discrete Voltage Assignment Under Clock Skew Scheduling

Design for Testability

Dynamic Programming: Shortest Paths and DFA to Reg Exps

Logic BIST. Sungho Kang Yonsei University

Design and Analysis of Algorithms

Design for Testability

Constrained Clock Shifting for Field Programmable Gate Arrays

Topics in Approximation Algorithms Solution for Homework 3

CS675: Convex and Combinatorial Optimization Fall 2014 Combinatorial Problems as Linear Programs. Instructor: Shaddin Dughmi

10/12/2016. An FSM with No Inputs Moves from State to State. ECE 120: Introduction to Computing. Eventually, the States Form a Loop

Design and Analysis of Algorithms

Worst-Case Execution Time Analysis. LS 12, TU Dortmund

Decision Diagrams for Discrete Optimization

All-Pairs Shortest Paths

Lecture Notes for Chapter 25: All-Pairs Shortest Paths

Timing Driven Power Gating in High-Level Synthesis

VLSI Physical Design Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

CS261: A Second Course in Algorithms Lecture #12: Applications of Multiplicative Weights to Games and Linear Programs

Programmable Logic Devices II

Lecture 10: Sequential Networks: Timing and Retiming

For smaller NRE cost For faster time to market For smaller high-volume manufacturing cost For higher performance

Introduction to Algorithms

VLSI Design Verification and Test Simulation CMPE 646. Specification. Design(netlist) True-value Simulator

Buffered Clock Tree Sizing for Skew Minimization under Power and Thermal Budgets

Latch/Register Scheduling For Process Variation

EECS 579: Logic and Fault Simulation. Simulation

Solving Linear and Integer Programs

IBIS Modeling Using Latency Insertion Method (LIM)

Contents. Introduction

University of Toronto Faculty of Applied Science and Engineering Edward S. Rogers Sr. Department of Electrical and Computer Engineering

Disconnecting Networks via Node Deletions

The min cost flow problem Course notes for Optimization Spring 2007

A Polynomial-Time Algorithm for Memory Space Reduction

Dynamic Programming: Shortest Paths and DFA to Reg Exps

CS 4407 Algorithms Lecture: Shortest Path Algorithms

A Mathematical Solution to. by Utilizing Soft Edge Flip Flops

The simplex algorithm

Lecture 9 Tuesday, 4/20/10. Linear Programming

Structural Delay Testing Under Restricted Scan of Latch-based Pipelines with Time Borrowing

Timing Constraints in Sequential Designs. 63 Sources: TSR, Katz, Boriello & Vahid

Determine the size of an instance of the minimum spanning tree problem.

FF A. C I1 I2 I3 I4 Ck Ck Ck FF B I1 I2 I3 I4 FF A FF C

On Two Class-Constrained Versions of the Multiple Knapsack Problem

Single Source Shortest Paths

Chapter 5 Synchronous Sequential Logic

Clocked Synchronous State-machine Analysis

Introduction EE 224: INTRODUCTION TO DIGITAL CIRCUITS & COMPUTER DESIGN. Lecture 6: Sequential Logic 3 Registers & Counters 5/9/2010

Discrete Optimization 2010 Lecture 3 Maximum Flows

TAU 2015 Contest Incremental Timing Analysis and Incremental Common Path Pessimism Removal (CPPR) Contest Education. v1.9 January 19 th, 2015

Lecture 2: Divide and conquer and Dynamic programming

Shortest paths with negative lengths

CSC 373: Algorithm Design and Analysis Lecture 12

Transcription:

iretilp : An efficient incremental algorithm for min-period retiming under general delay model Debasish Das, Jia Wang and Hai Zhou EECS, Northwestern University, Evanston, IL 60201 Place and Route Group, Mentor Graphics, San Jose, CA 95131 ECE, Illinois Institute of Technology, Chicago, IL 60616 January 19, 2010 1 / 43

Outline Motivation Problem Formulation Algorithmic Ideas Algorithmic Details Experimental Results Conclusions and Future work 2 / 43

Delay Models Simple Delay Model Constant combinational delays are considered General Delay Model (Similar to Lalgudi et al.) considers 4 physical effects Clock skews Load dependent FF setup times Combinational gate delays Interconnect delays 3 / 43

Delay Models Simple Delay Model Constant combinational delays are considered General Delay Model (Similar to Lalgudi et al.) considers 4 physical effects Clock skews Load dependent FF setup times Combinational gate delays Interconnect delays 3 / 43

Retiming Relocate flip-flops (FFs) w/o changing circuit functionality. [Leiserson and Saxe 83] Retiming under simple constant delay model Polynomial time algorithm proposed by Leiserson et al. An Efficient implementation proposed by Shenoy et al. Algorithm considering hold conditions proposed by Papaethymiou et al. Incremental algorithm proposed by Zhou et al., Lin et al. Retiming under general delay model Branch and Bound Algorithm proposed by Soyata et al. Integer linear programming based algorithm proposed by Lalgudi et al. 4 / 43

Retiming Relocate flip-flops (FFs) w/o changing circuit functionality. [Leiserson and Saxe 83] Retiming under simple constant delay model Polynomial time algorithm proposed by Leiserson et al. An Efficient implementation proposed by Shenoy et al. Algorithm considering hold conditions proposed by Papaethymiou et al. Incremental algorithm proposed by Zhou et al., Lin et al. Retiming under general delay model Branch and Bound Algorithm proposed by Soyata et al. Integer linear programming based algorithm proposed by Lalgudi et al. 4 / 43

Retiming Relocate flip-flops (FFs) w/o changing circuit functionality. [Leiserson and Saxe 83] Retiming under simple constant delay model Polynomial time algorithm proposed by Leiserson et al. An Efficient implementation proposed by Shenoy et al. Algorithm considering hold conditions proposed by Papaethymiou et al. Incremental algorithm proposed by Zhou et al., Lin et al. Retiming under general delay model Branch and Bound Algorithm proposed by Soyata et al. Integer linear programming based algorithm proposed by Lalgudi et al. 4 / 43

Circuit classification General delay model classifies the circuit into two categories Begin/end extendible circuit Property (Path Delay Monotonicity) Delay of a register-to-register path decreases as the number of combinational elements on the path is decreased. Efficiently solved by extension of algorithm proposed by Shenoy et al. Efficiently solved by extension of incremental algorithm proposed by Zhou et al. Two-way extendible circuit Can be solved by ILP based algorithm proposed by Lalgudi et al. Inefficient in terms of performance and memory Motivated us to develop iretilp 5 / 43

Outline Motivation Problem Formulation Algorithmic Ideas Algorithmic Details Experimental Results Conclusions and Future work 6 / 43

Sequential System Optimization using Retiming Why General Delay Model? Considers prominent physical effects More important for lower process nodes Min-period retiming: Relocate FFs to minimize clock period. Ignore cost can increase FF area. Lalgudi et al s algorithm solved min-period retiming. Min-area retiming: Relocate FFs to minimize FF are under given clock period. Why study Min-period retiming? Min-area retiming needs a clock period. Min-period retiming drives Min-area retiming. This paper focusses on Min-period retiming 7 / 43

Circuit Graph Generation Circuit graph G = (V, E) of n vertices and m edges Each vertex v V can be primary input/output port input/output port of combinational cell Each edge e V can be Combinational edge c with 2 labels δ(c) : Minimum load dependent gate delay (c) : Maximum load dependent gate delay Interconnect edge i with 4 labels d(i) : Interconnect delay without FF α(i) : Interconnect delay driving FF input port β(i) : Interconnect delay driven by FF output port w(i) : Number of FFs on interconnect 8 / 43

Circuit Graph Example Example of our circuit graph generation 9 / 43

Extension of Circuit Graph Circuit Extension for concise treatment of timing constraints Two vertices added : H-IN and H-OUT 0 delay edges added from Primary outputs to H-IN H-OUT to Primary inputs H-IN to H-OUT One Virtual FF added on H-IN to H-OUT edge 10 / 43

Example of Extended Circuit Graph Following figure shows the extended circuit graph 11 / 43

Retiming Feasibility Constraints Retiming is represented by an integer-valued vertex label r : V Z r(v) is the # FFs moved from fanout interconnect edges to fanin interconnect edges of the ports of gate v Number of FFs on edge (i,j) is computed as w r (i, j) = w(i, j) + r(j) r(i) Retiming feasibility constraints P0(r) : w r (i, j) 0, (i, j) E w r (i, j) = 0, (i, j) C, w r (H-OUT, H-IN) = 1 12 / 43

Timing Constraints in Retiming For ease of presentation we focus on setup constraints Use label T : V R + to denote latest arrival time at each vertex Clock period of the circuit is φ Following inequalities model the timing constraints Theorem P1(r, φ) : T such that: T (i) + (i, j) T (j), (i, j) C T (i) + d(i, j) T (j), (i, j) I w r (i, j) = 0 T (i) + α(i, j) φ T (j) β(i, j), (i, j) I w r (i, j) 1 The clock period of the circuit G after retiming r is given by φ(r) = max{t (i) + α(i, j) : (i, j) I w r (i, j) 1}. 13 / 43

Timing Analysis Algorithm TA Given an assignment of w r on each edge, TA provides 3 procedures get-period : Computes critical path e e Returns φ as delay of critical path get-head-edge : returns e get-tail-edge : returns e Theorem Given a graph G of n vertices and m edges, the algorithm TA runs in O(m + n) time. 14 / 43

Problem Formulation We formulate the following Setup Retiming problem Problem (Generalized Setup Retiming) Given a circuit G = (V, E) and the number of FFs w on the edges, find a retiming r such that the retiming validity constraints P0(r) and timing feasibility constraints P1(r, φ) can be satisfied with the minimum clock period φ. 15 / 43

Sub-optimality of Polynomial algorithms Polynomial Retiming Algorithm : Step 1 16 / 43

Sub-optimality of Polynomial algorithms Polynomial Retiming Algorithm : Step 2 17 / 43

Sub-optimality of Polynomial algorithms Polynomial Retiming Algorithm : Step 3 18 / 43

Sub-optimality of Polynomial algorithms Sub-optimality of Polynomial Retiming Algorithm 19 / 43

Outline Motivation Problem Formulation Algorithmic Ideas Algorithmic Details Experimental Results Conclusions and Future work 20 / 43

Traditional Algorithm Preprocessing We call the algorithm proposed by Lalgudi et al. as Retime-General Extension of Leisserson and Saxe s Algorithm. Two matrices : W (size n n) and D (size m m) used W[u][v] indicates minimum register count among all paths between vertices u and v W (u, v) = min{w(p) : u p v} Given pair of edges e = (û, u), ê = (v, ˆv), D(e,ê) is computed as D(e, ê) = max{ω(e, p, ê) : û e u p v ê ˆv, w(p) = W (u, v)} Ω is longest propagation delay from e to ê under minimum register count constraint Matrix D stores all possible clock periods 21 / 43

Clock Period Feasibility Theorem Matrix D and W is a preprocessing stage for Retime-General Given graph G, G r is obtained using retiming transformation r For period c, conditions that satisfy setup constraints in G r Theorem (ILP Constraint) Let G r be a graph with retiming transformation r : V Z and c be a positive real number. c is a feasible clock period for G r if and only if for every edge u e v I, we have w r (e) 0 and for every edge pair e, ê I such that û e u v ê ˆv and D(e,ê) > c, we have W r (u, v) = 0 (w r (e) = 0 w r (ê) = 0) 22 / 43

Traditional Algorithm Overview 1. Generate a clock period bound φ ub using TA 2. Linear search to generate period c < φ ub 3. Feasibility of c is checked using ILP Constraint Theorem 4. Clock period updated by TA on the feasible retimed graph G r 5. Retime-General decreases period until it becomes infeasible 6. Period larger than the infeasible period is declared optimal. 23 / 43

ILP Formulation We use a modification to PO(r) PO (r) = (i, j) E : 0 w r (i, j) 1 (1) Chuan et al s idea for hold constraint violation Practical condition that simplifies ILP formulation ILP formulation for feasibility checking Corollary Let G r be a graph with retiming transformation r and c be a positive real number. c is a feasible clock period if for every edge pair e = (û, u), ê = (v, ˆv) I such that such that û e u v ê ˆv and D(e,ê) > c, we have w r (e) + w r (ê) <= 1 + W r (u, v) 4 integer variables, Bellman-Ford algorithm can t be used 24 / 43

ILP Formulation We use a modification to PO(r) PO (r) = (i, j) E : 0 w r (i, j) 1 (1) Chuan et al s idea for hold constraint violation Practical condition that simplifies ILP formulation ILP formulation for feasibility checking Corollary Let G r be a graph with retiming transformation r and c be a positive real number. c is a feasible clock period if for every edge pair e = (û, u), ê = (v, ˆv) I such that such that û e u v ê ˆv and D(e,ê) > c, we have w r (e) + w r (ê) <= 1 + W r (u, v) 4 integer variables, Bellman-Ford algorithm can t be used 24 / 43

ILP Formulation We use a modification to PO(r) PO (r) = (i, j) E : 0 w r (i, j) 1 (1) Chuan et al s idea for hold constraint violation Practical condition that simplifies ILP formulation ILP formulation for feasibility checking Corollary Let G r be a graph with retiming transformation r and c be a positive real number. c is a feasible clock period if for every edge pair e = (û, u), ê = (v, ˆv) I such that such that û e u v ê ˆv and D(e,ê) > c, we have w r (e) + w r (ê) <= 1 + W r (u, v) 4 integer variables, Bellman-Ford algorithm can t be used 24 / 43

Traditional Algorithm Shortcomings Too many constraints in ILP formulation as D is dense Generating D and W matrix requires all pair shortest path algorithm Generating D and W matrix needs O(n 2 ) and O(m 2 ) memory Needs an artificial clock period decrease factor Proposed Algorithm iretilp uses only critical constraints to generate the optimal clock period, eliminate the need to generate matrix D and W and also removes the need to employ artificial clock period decrease factor. iretilp solves a number of ILP formulations. Each formulation has few constraints. Total runtime is significantly improved. 25 / 43

Outline Motivation Problem Formulation Algorithmic Ideas Algorithmic Details Experimental Results Conclusions and Future work 26 / 43

Initialization We use 6 variables in our algorithm Optimal clock period φ and register vector F of size m Intermediate clock period φ and register vector F of size m Critical constraint vector CV initialized to Retiming label r for each vertex TA is used to generate initial clock period φ F is populated with the current register count w(e) 0 w(e) 1 r label on each vertex initialized to 0 27 / 43

Iterations Iterations in iretilp are used to either Generate optimal period φ Provide proof that φ is optimal Each iteration of iretilp generates a retimed graph G r Identifies a unique critical constraint, populates to CV If the critical period improves present clock period φ Register assignment of G r is updated to F φ is updated to present critical period Critical constraint at each iteration generated using ILP 28 / 43

Iterations illustrated Loop 1. For all e = (u,v) E: F[e] = w[e] + r[v] - r[u] 2. Invoke TA: φ = get-period() e = get-head-edge(), ê = get-tail-edge() 3. Add (e,ê) to CV 4. If φ < φ : φ = φ, F = F 5. Formulate and Solve ILP to generate next r assignments 6. Terminate if ILP is infeasible or there is critical cycle 29 / 43

ILP for Unique Constraint Generation Consider an arbitrary iteration of iretilp r be the retiming transformation associated with this iteration Generate an ILP where constraints from CV do not appear as clock period of G r Each entry c CV has edges e, ê associated with it Retiming associated with constraint c be r c w rc (e), w rc (ê) = 1 For retiming r, critical path e ê is equivalent to w r (û, u) = 1 w r (v, ˆv) = 1 W r (u, v) = 0 30 / 43

ILP for Unique Constraint Generation (contd.) Lemma to be satisfied so that e ê does not exist in G r Lemma (G r ILP formulation) Formation of G r used the following integer linear constraints for every edge e I, ê C, we have 0 w r (e) 1, w r (ê) = 0 for every 2-tuple { (e, ê), w uv } CV w r (e) + w r (ê) <= w uv + 1 How to generate w uv for each constraint c CV? w uv is equivalent to W (u, v) + r c (v) - r c (u) Traditional algorithm had W matrix, we do not have We have a property, when c was critical,wr (u, v) = 0 W (u, v) = r c (u) r c (v) 31 / 43

ILP for Unique Constraint Generation (contd.) Lemma to be satisfied so that e ê does not exist in G r Lemma (G r ILP formulation) Formation of G r used the following integer linear constraints for every edge e I, ê C, we have 0 w r (e) 1, w r (ê) = 0 for every 2-tuple { (e, ê), w uv } CV w r (e) + w r (ê) <= w uv + 1 How to generate w uv for each constraint c CV? w uv is equivalent to W (u, v) + r c (v) - r c (u) Traditional algorithm had W matrix, we do not have We have a property, when c was critical,wr (u, v) = 0 W (u, v) = r c (u) r c (v) 31 / 43

Correctness and Termination Theorem for algorithm correctness Theorem (Critical Constraints) Each iteration of our algorithm finds a entry from D matrix which is one of the timing feasibility constraint for the ILP formulation used by Retime-General with c as the optimal clock period φ. Invariants used in our iterations Theorem (Loop Invariant) Loop invariant of iretilp is that the integer linear program generated by Lemma 1 is either feasible or there are no critical cycle in the graph G r. 32 / 43

Complexity Analysis Let size of CV be k upon termination. Let size of CV be s i at i th iteration Let complexity of solving ILP = η s i (Linear function) iretilp s runtime T is dominated by ILP solver s runtime T = k η s i i=0 T = O(k 2 ) : quadratic function of total critical constraints Peak memory consumption M = O(k) iretilp generates significantly lesser constraints than Retime-General 33 / 43

Outline Motivation Problem Formulation Algorithmic Ideas Algorithmic Details Experimental Results Conclusions and Future work 34 / 43

Experimental Setup Random delay parameter generation Floyd-Warshall to generate D,W matrix for Retime-General ILP solved using CPLEX C++ API level integration Divided ISCAS benchmarks into two sets, small and big Small benchmarks vertex count: 1K Big benchmarks vertex count: 52K Running Retime-General till infeasibility is slow Mentioned in Lalgudi et al s work, they run 34 vertices s298 (368 vertices) didn t terminate in 6 hours iretilp on s298 terminated in 91.12 secs iretilp is used to generate optimal period Retime-General is terminated at optimal period Comparisons exclude infeasibility proof runtime 35 / 43

Performance Comparison : Small Benchmarks Blue bar : Constraint solving speedup Red bar : Total speedup (includes Floyd-Warshall runtime) 36 / 43

Memory Comparison : Small Benchmarks Peak Memory consumption of selected benchmarks Larger than 200 critical constraints Blue bar indicates Retime-General Red bar indicates iretilp 37 / 43

iretilp incremental run : Large Benchmarks Incremental mode iretilp runs on ISCAS89 Large benchmark R100: Time taken for 100 iterations R200: Time taken for 200 iterations Runtime scaling in incremental runs 38 / 43

Clock period improvement iretilp is run in incremental mode, initial clock period φ init φ 100 : Optimized period after 100 iterations φ 200 : Optimized period after 200 iterations Period decrease (PD) computed as PD = (φ 200 φ 100 ) φ init 100 Period decrease over incremental runs 39 / 43

Outline Motivation Problem Formulation Algorithmic Ideas Algorithmic Details Experimental Results Conclusions and Future work 40 / 43

Conclusions and Future work Efficient incremental min-period retiming algorithm On an average 100X faster than Retime-General Upto 40X less peak memory consumption than Retime-General Infeasibility proof should be avoided for practical usage For bigger benchmarks, incremental algorithm can be stopped any time, generating a feasible upper bound of optimal clock period Minimum area retiming for general delay models Optimization version of the same problem Experimenting with extended iretilp for min-area retiming Open questions? Complexity class of 4 variable ILP formulation Can we generate a polynomial time algorithm? 41 / 43

Conclusions and Future work Efficient incremental min-period retiming algorithm On an average 100X faster than Retime-General Upto 40X less peak memory consumption than Retime-General Infeasibility proof should be avoided for practical usage For bigger benchmarks, incremental algorithm can be stopped any time, generating a feasible upper bound of optimal clock period Minimum area retiming for general delay models Optimization version of the same problem Experimenting with extended iretilp for min-area retiming Open questions? Complexity class of 4 variable ILP formulation Can we generate a polynomial time algorithm? 41 / 43

Conclusions and Future work Efficient incremental min-period retiming algorithm On an average 100X faster than Retime-General Upto 40X less peak memory consumption than Retime-General Infeasibility proof should be avoided for practical usage For bigger benchmarks, incremental algorithm can be stopped any time, generating a feasible upper bound of optimal clock period Minimum area retiming for general delay models Optimization version of the same problem Experimenting with extended iretilp for min-area retiming Open questions? Complexity class of 4 variable ILP formulation Can we generate a polynomial time algorithm? 41 / 43

Conclusions and Future work Efficient incremental min-period retiming algorithm On an average 100X faster than Retime-General Upto 40X less peak memory consumption than Retime-General Infeasibility proof should be avoided for practical usage For bigger benchmarks, incremental algorithm can be stopped any time, generating a feasible upper bound of optimal clock period Minimum area retiming for general delay models Optimization version of the same problem Experimenting with extended iretilp for min-area retiming Open questions? Complexity class of 4 variable ILP formulation Can we generate a polynomial time algorithm? 41 / 43

Q & A 42 / 43

Thank you! 43 / 43