A Simultaneous Routing Tree Construction and Fanout Optimization Algorithm *

Amir H. Salek, Jinan Lou, Massoud Pedram
Department of Electrical Engineering - Systems
University of Southern California
Los Angeles, CA 90089
{amir, jlou, massoud}@zugros.usc.edu

ABSTRACT - This paper presents an optimal algorithm for solving the problem of simultaneous fanout optimization and routing tree construction for an ordered set of critical sinks. The algorithm, which is based on dynamic programming, generates a rectilinear Steiner tree routing solution containing appropriately sized and placed buffers. The resulting solution, which inherits the topology of LT-Trees and the detailed structure of P-Trees, maximizes the signal required time at the driver of the given set of sinks. Experimental results on benchmark circuits demonstrate the effectiveness of this simultaneous approach compared to the sequential methods.

1. INTRODUCTION

The current deep-submicron (DSM) process technologies have increased the contribution of the interconnect delay to the total path delay in digital circuits. At the same time, the existing design flows and tools have had limited, and only marginal, success in incorporating interconnect planning and optimization early in the design process. This situation has forced IC designers to re-evaluate the existing computer-aided design (CAD) methodologies and techniques. To address the DSM design challenges, one can either increase the look-ahead capability of high-level tools or develop new algorithms for solving larger portions of the overall design problem simultaneously. This latter unification-based approach is, in our view, more promising. Indeed, the nature of IC design problems and the current state of CAD solutions have reached a point where it is both necessary and possible to combine some steps of the synthesis and physical design processes. The unification-based algorithms are capable of capturing existing interactions among the merged design steps and producing higher-quality implementations by systematically searching a much larger solution space (see [SLP98] and [LSP97]).
The algorithm proposed in this paper integrates two major design steps: fanout optimization and routing tree generation. Each of these two optimization steps has been very effective in reducing the circuit delay, in one case by boosting the transmitted signals via insertion of sized buffers and in the other case by generating suitable wire structures. The goal of this work is to optimally integrate these two steps and thereby provide a uniform framework for optimizing the nets of a placed circuit to achieve faster implementations.

The proposed dynamic programming based algorithm generates and propagates a set of buffered routing tree structures in the form of two-dimensional (required time versus input load) solution curves. The resulting wiring structure is guaranteed to inherit the topology of LT-Trees [To90] and the detailed physical implementation of P-Trees [LCLH96]. This algorithm takes a given order for the sinks and, starting from the higher indexed sinks, combines them into groups which are to be driven by buffers. For each group, proper routing structures and buffer locations are examined to generate a set of possible solutions for that subset of ordered sinks. Only the solutions which are not dominated by other solutions are kept. These two steps are repeated in a dynamic programming fashion until the whole set of sinks is combined together. Experimental results reported in this paper demonstrate the effectiveness of this method versus the conventional flows that sequentially perform routing tree generation and fanout optimization.

The remainder of the paper is organized as follows. In section 2, background and motivation are given. Section 3 introduces the proposed algorithm. Section 4 discusses the quality and complexity of the algorithm. In sections 5 and 6, our experimental results and concluding remarks are presented.

* This work was funded in part by SRC under contract no. 98-DJ-606 and by an NSF PECASE award (contract no. MIP-9628999).

2. BACKGROUND AND MOTIVATION

2.1 Fanout Optimization

Fanout optimization, an operation performed in the logic domain, addresses the problem of distributing a signal to a set of sinks with known loads and required times so as to maximize the required time at the signal driver. Interconnect delay is not incorporated in this operation because the locations of the buffers are not known at this stage. The general fanout optimization problem is NP-hard [To90]; however, its restriction to some special families of topologies is known to have polynomial complexity. Among the many existing works on the fanout optimization problem, we are interested in the algorithm proposed by [To90]. That work introduces a special class of tree topologies, called LT-Trees, for which the fanout problem is solved with polynomial complexity. The LT-Tree of type-I (in this paper referred to as LT-Tree) is a tree that permits at most one internal node among the immediate children of every internal node in the tree.

Touati in [To90] proposed a dynamic programming based algorithm for the fanout optimization problem where the buffer structure is restricted to the LT-Tree topology and sinks with larger required times are placed further from the root of the tree. His algorithm first sorts the sinks in non-decreasing required time order and then, starting from the least critical sink, it enumerates all rightmost groupings of the sinks to be driven by a buffer. Finally, for each grouping, it enumerates all possible ways of adding either zero or one buffer to drive the rightmost subset of the sinks. Touati gives sufficient conditions for his LT-Tree construction algorithm, LTTREE, to be optimal.

[Fig. 1: A simple LT-Tree, showing sink nodes and internal nodes (buffers)]

Lemma 1: LTTREE works optimally with respect to the signal required time at the root (driver) if all the sinks have equal load capacitances and are sorted in non-decreasing required times [To90].

Lemma 2: LTTREE has $O(n^2)$ polynomial complexity where $n$ is the number of sink nodes [To90].

2.2 Routing Tree Generation

Performance-driven interconnect design, an operation performed in the physical domain, addresses the problem of connecting a source driver to a set of sinks with known loads, required times and positions so as to maximize the required time at the driver. The inherent complexity of this problem has forced researchers to either solve it heuristically or to impose constraints on the structure of the resulting interconnect. For an overview of the existing performance-driven interconnect design techniques, interested readers are referred to [CHKM96].

Lillis et al. in [LCLH96] proposed the Permutation-Constrained Routing Tree or P-Tree structure as a solution to the above mentioned problem. Their approach consists of two major phases: finding a proper ordering for the sinks, and then generating the routing structure based on the calculated ordering. The second phase of the algorithm, called PTREE throughout this paper, is employed in the present paper. Given an ordering of the sink nodes, PTREE finds the optimal embedding of the net into the Hanan grid (the set of points formed by the intersection of horizontal and vertical lines through the terminals of a net [Ha66]) by a dynamic programming approach.
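The Hanan grid defined in the preceding paragraph can be illustrated with a short Python sketch. The function name `hanan_grid` and the sample coordinates are hypothetical and serve only as an illustration; they are not part of PTREE or of any published implementation.

```python
from itertools import product


def hanan_grid(terminals):
    """Return the Hanan grid of a net, given its terminals as (x, y) pairs.

    The grid is the set of intersections of the horizontal and vertical
    lines drawn through every terminal (illustrative helper, assumed name).
    """
    xs = sorted({x for x, _ in terminals})  # distinct vertical lines
    ys = sorted({y for _, y in terminals})  # distinct horizontal lines
    return [(x, y) for x, y in product(xs, ys)]


# Hypothetical net: one source and two sinks.
terminals = [(0, 0), (2, 3), (5, 1)]
grid = hanan_grid(terminals)
print(len(grid), "Hanan points")
```

With $t$ terminals the grid holds at most $t^2$ points, which is the origin of the quadratic Hanan-point factor that appears in the complexity of PTREE.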
In PTREE, the (intermediate) routing solutions are stored in the form of two-dimensional, non-dominated solution curves of total area versus required time for every Hanan point. The worst case complexity of PTREE is rather high, $O(n^5)$; however, the runtime for practical purposes remains within an acceptable range [LCLH96]. Furthermore, by applying some techniques such as controlling the maximum number of Hanan points, the complexity of PTREE is considerably reduced without losing much in terms of the quality.

Lemma 3: For a given order on the sinks and with the restriction that the Steiner points lie on the Hanan grid, PTREE computes the set of all rectilinear Steiner trees with non-dominated required time and total capacitance [LCLH96].

Lemma 4: If the individual capacitive values are polynomially bounded integers or can be mapped to such with sufficient precision, PTREE has $O(n^5)$ pseudo-polynomial complexity where $n$ is the number of sink nodes [GJ79][LCLH96].

Note: Later in section 4, it will be helpful to know that an $O(n^2)$ portion of the $O(n^5)$ complexity of PTREE is due to the existence of $n^2$ Hanan points.

[Fig. 2: An output of P-Tree for dcba order]

2.3 Other Works

Okamoto and Cong in [OC96a] proposed a combination of A-Tree routing generation [CLZ93] and van Ginneken's buffer insertion [Gi90] as a solution to the problem of buffered Steiner tree construction. They later extended their work in [OC96b] to include wire sizing as well. Their algorithm takes the placement information of the source and the sinks in addition to the signal required arrival times and then heuristically generates a buffered routing structure such that it maximizes the required time at the source of the net. This technique consists of two phases: bottom-up tree construction with non-inferior solution computation and top-down buffer insertion. The non-inferior solution which gives the maximum required time at the root is chosen, and then it is traced back through the computations performed during the first phase that led to this solution. During the backtrace, the buffer positions are determined.
During the bottom-up phase, the subtrees are combined using a weighted addition function with a user-specified parameter to heuristically decide which two subtrees are to be merged. Although this method employs the A-Tree construction algorithm, it cannot guarantee that the resulting structure remains an A-Tree. Furthermore, the fanout optimization algorithm, which is based on critical sink isolation, is ad hoc. The overall algorithm has no guarantee of optimality. In contrast, our proposed method produces a buffered rectilinear Steiner tree which is optimal subject to the given order of the sinks, the topology of LT-Trees and the detailed structure of P-Trees.

3. THE FANROUT ALGORITHM

FANROUT, the simultaneous fanout and routing tree optimization algorithm, is a dynamic programming based algorithm which constructs a buffered routing structure for a given net, based on the available placement, loading, and timing information. The goal is to maximize the required time at the driver of the net.

3.1 Problem Formulation

A given net, $N = (s, S)$, determines the set of sink nodes, $S = \{s_1, s_2, \ldots, s_n\}$, which are to be driven by the driver of the net, called $s$. In addition to the input net, the following information is required and used by FANROUT:

I. Position of the source $s = (s^x, s^y)$, where $s^x$ and $s^y$ are the horizontal and vertical coordinates of $s$.
II. Input data for each sink node $s_i = (s_i^x, s_i^y, s_i^l, s_i^r)$ for $1 \le i \le n$, where $s_i^x$ and $s_i^y$ are the horizontal and vertical coordinates, $s_i^l$ is the capacitive load, and $s_i^r$ is the signal required time at node $s_i$.
III. A library, $L = \{b_1, b_2, \ldots, b_m\}$, containing $m$ buffers with different strengths.
IV. A linear ordering of the sinks.

3.2 Two Dimensional Solution Curves

Although the objective is to find an implementation with the maximum required time at $s$, during every step of FANROUT, load versus required time curves are generated and the solutions are compared and evaluated with respect to these two parameters. Comparison of two sub-solutions based on only the required time is an invalid comparison and may result in dropping the optimal solution. This is due to the fact that the loading imposed by a sub-solution on the next level of the LT-Tree may cause a large increase in the overall delay such that the difference between the required times is more than that which was compensated for. Therefore, both the required time and the input load are needed to evaluate the effect of a sub-solution on the overall structure.

Definition 1: Suppose $\sigma_1$ and $\sigma_2$ are two buffered routing structures for a source and a set of sinks. $\sigma_2$ is called inferior to $\sigma_1$ if $load(\sigma_1) \le load(\sigma_2)$ and $reqtime(\sigma_2) \le reqtime(\sigma_1)$.

3.3 Detailed Approach

FANROUT incorporates LT-Tree and P-Tree construction techniques into a unified framework such that the resulting routing structure is both an LT-Tree, in terms of the overall topology, and a P-Tree, in terms of the detailed physical structure. FANROUT requires an ordering of the sinks and guarantees the optimality of the solution with respect to this ordering only. In line 1 of Fig. 3, FANROUT loads the subject net, $N$, which includes a driver, $s$, and $n$ sinks ordered in some fashion (e.g. based on their placement locations, required times, or a combination thereof). In line 2, it loads the library of the buffers, $L$, consisting of $m$ buffers with different design parameters, including driving strength, intrinsic delay, and input load. In line 3, $HG(N)$ is loaded with a maximum of $n^2$ Hanan nodes which are formed by the intersection of horizontal and vertical lines through the terminals of net $N$; see Fig. 5.
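The inferiority test of Definition 1, and the pruning of a load versus required time curve that follows from it, can be sketched in Python as below. This is a minimal illustration under stated assumptions: each solution is reduced to a hypothetical `(load, req_time)` pair, and the names `is_inferior` and `prune` are invented here and are not taken from the FANROUT implementation.

```python
def is_inferior(s2, s1):
    """Definition 1: s2 is inferior to s1 if s1 loads its driver no more
    than s2 does and s2's required time is no larger than s1's."""
    load1, req1 = s1
    load2, req2 = s2
    return load1 <= load2 and req2 <= req1


def prune(curve):
    """Return the non-inferior (load, req_time) pairs of a solution curve."""
    # Visit solutions by ascending load; for equal loads, best required time first.
    ordered = sorted(curve, key=lambda s: (s[0], -s[1]))
    kept = []
    best_req = float("-inf")
    for load, req in ordered:
        # A solution survives only if it beats every solution with smaller
        # or equal load on required time.
        if req > best_req:
            kept.append((load, req))
            best_req = req
    return kept


# Hypothetical curve: (input load, required time) pairs.
curve = [(3, 10), (1, 5), (2, 4), (3, 12), (4, 12)]
print(prune(curve))
```

Sorting once and sweeping keeps the pruning step at $O(m \log m)$ for a curve with $m$ entries; a pairwise comparison of all solutions would give the same result at quadratic cost.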
At every step, $z$ is the index showing that the $n-z+1$ rightmost sinks (in the ordered list of sinks) are being combined into a group driven by a buffer; see Fig. 4. The LT-Tree topology allows the use of an already processed sub-group of the last $n-h+1$ sinks, where $h$ is a number between $z$ and $n$. This guarantees that in the final solution, each buffer directly drives at most one other buffer. For every Hanan node $v$ and every index $z$, $\Gamma(z, v)$ is a two-dimensional solution curve including all the non-inferior buffered routing structures, each connecting sinks $s_z$ through $s_n$ with its root located on $v$. In line 4, these solution curves are initialized to the set of all non-inferior buffered paths connecting $v$ to $s_n$.

The code in lines 5 through 16 is for calculating all the buffered routing structures for $\Gamma(z, v)$ using the solutions available in $\Gamma(h, v)$ as described next. Corresponding to group $h$, there exist $n^2$ $\Gamma$'s, one for each Hanan node. In line 7, all the Hanan nodes are enumerated by a variable, $v$, and in line 8, all the solutions in $\Gamma(h, v)$ are retrieved one by one by a variable, $\gamma$. So, $\gamma$ is a routing structure connecting $s_h$ through $s_n$ with its root driven by a buffer located at $v$. In line 9, PTREE is called on the set of sink nodes (i.e. $s_z$ through $s_{h-1}$) to be combined with $\gamma$; see Fig. 5. Note that for PTREE, $\gamma$ acts like a sink node with its corresponding required time and load (where the load is equal to the input capacitance of the buffer driving the routing structure of $\gamma$).

    algorithm FANROUT {
     1. read N = (s, S) where s is the source and S = {s_1, s_2, ..., s_n} is an ordered list of sinks;
     2. read L = {b_1, b_2, ..., b_m}, the library of buffers;
     3. set HG(N) = all the Hanan grid points of N;
     4. foreach v ∈ HG(N) set Γ(n, v) = {the set of all non-inferior paths from v to s_n};
     5. for z = n to 1 {
     6.   for h = z to n
     7.     foreach v ∈ HG(N)
     8.       foreach γ ∈ Γ(h, v) {
     9.         set D = PTREE(HG(N), {v, s_z, ..., s_{h-1}, γ});
    10.         foreach Δ ∈ D {
    11.           set u = Hanan node corresponding to Δ;
    12.           foreach δ ∈ Δ
    13.             foreach b ∈ L {
    14.               drive δ by b and calculate the required time, r, at the input of b;
    15.               set σ = (<inputload(b), r>, δ, b);
    16.               add σ to Γ(z, u); } } }
    17.   foreach v ∈ HG(N) prune Γ(z, v); }
    18. find (<l, r>, δ, b) ∈ Γ(1, v) which results in the largest required time at the input of the driver, call it bestsolution;
    19. retrieve fanouttree by following the pointers starting from the bestsolution;
    20. return fanouttree; }

Fig. 3: Pseudo code of FANROUT

PTREE returns a collection of solution curves, each corresponding to a distinct Hanan node. The collection of curves is stored in $D$ by PTREE. Then in line 10, these solution curves are selected one by one using a variable, $\Delta$. Recall that each $\Delta$ corresponds uniquely to a Hanan node, which is referred to as $u$ in line 11. Once a $\Delta$ is in hand, its encapsulated routing structures are retrieved one by one by a variable, $\delta$. For all these routing structures, all possible buffers are tried in lines 13 through 16, and for each choice the required time at the input of the buffer is calculated using the specified delay model. In line 15, for every match a solution, $\sigma$, is generated (while saving pointers to its sub-solutions, for later use in the top-down traceback phase) which corresponds to a routing structure (i.e., $\delta$) and a buffer (i.e., $b$). This solution is added to $\Gamma(z, u)$ because the root of $\sigma$ is located at $u$. The solution curves $\Gamma(z, v)$ are calculated in this way; however, these curves may contain inferior solutions, which are pruned in line 17.

Finally, FANROUT builds the $\Gamma(1, v)$ solution curves (for every $v$) which contain buffered routing structures connected to all the sink nodes. Then, for every $v$ and for every solution of $\Gamma(1, v)$, the root of the buffered routing structure is connected to the driver and the required time at the input of the driver is calculated in line 18. The structure which results in the largest required time is chosen and is traced down through the stored pointers. The buffered routing structure is retrieved and returned in lines 19 and 20.

Note that the operations performed in lines 10 through 16, in fact, can be performed internally by a modified PTREE with no increase in the worst case complexity of PTREE. Therefore, in the following complexity analysis we do not take into account the complexity of that part of the pseudo-code.

[Fig. 4: Processing the nodes (example with z = 2 and h = 4 on sinks s_1 through s_5)]

[Fig. 5: Call PTREE on the current sink nodes (legend: a sink; a Hanan point)]

4. DISCUSSION

4.1 Quality and Complexity of FANROUT

The proposed algorithm is an optimal polynomial algorithm based on a set of assumptions. The following set of lemmas and theorems formally prove these claims.

Theorem 1: The solution space of FANROUT is the product of those of PTREE and LTTREE.
Proof: Any P-Tree structure with inserted buffers such that no buffer immediately drives more than one other buffer can be visited by FANROUT. Also, any LT-Tree such that the output nets of its buffers are implemented using PTREE can be visited by FANROUT.

Lemma 5: For any arbitrary routing with no buffer, $R$, which connects the source to the sinks, we have:
I. By decreasing the load of any sink, the capacitance observed at the root of $R$ does not increase.
II. By increasing the required time of any sink, the required time at the root of $R$ does not decrease.
Proof: For case I, decreasing the load of a sink decreases the amount of the charge needed to bring the voltage of $R$ to a certain level. For case II, if that particular sink is on the critical path, the statement is trivially true. Otherwise, the required time of the driver is determined by the required time of the other sinks and remains unchanged.

Lemma 6: PTREE is monotone with respect to the load and the required time of the sinks.
Proof: Suppose $R$ is a routing structure generated by PTREE. Reducing the capacitance and/or increasing the required time of a sink while preserving $R$ results in the decrease of the capacitance and increase of the required time at the root of $R$.
Therefore, if PTREE is run after changing the load and the required time of the sinks in this way, the resulting structure is non-inferior with respect to $R$ and PTREE would store it in the curve (c.f. Lemma 1).

Lemma 7: The use of the pruning operation by FANROUT does not result in the loss of any non-inferior solution.
Proof: Assume that $\sigma_2$ is inferior w.r.t. $\sigma_1$. By induction, if $\sigma_2$ is the whole net and its input is directly connected to the net driver, then the required time does not decrease and the load does not increase by replacing $\sigma_2$ with $\sigma_1$. If $\sigma_2$ is a solution to a sub-problem, its input is driven by another internal node, call it $g$. Due to the monotone behavior of PTREE (c.f. Lemma 6), at $g$ the required time and the input load of the implementation including $\sigma_2$ are guaranteed to be no better than those of the implementation containing $\sigma_1$. A similar argument is then valid for $g$ and the rest of the internal nodes down to the leaf nodes.

Theorem 2: FANROUT is an optimal algorithm w.r.t. required time, subject to a set of constraints.
Proof: An examination of the dynamic programming structure of FANROUT shows that if no pruning is performed, all the possible solutions would be considered. Therefore, to prove the optimality of the algorithm it is enough to prove that for an optimal solution, replacing a non-inferior solution with an inferior solution cannot improve the whole implementation; this, however, was proved in Lemma 7.

Lemma 8: The number of solutions in any solution curve is bounded by the number of the buffers in the library, $|L|$.
Proof: The load of any solution is equal to the input capacitance of the driving buffer. However, the number of distinct input capacitances of the buffers is bounded by the total number of the available buffers in the library, $|L|$. For each load value the solution with the maximum required time is stored and the rest will be pruned out.

Theorem 3: FANROUT has $O(n^3)$ memory complexity.
Proof: There are $n^2$ Hanan points and for each of them $n$ solution curves are stored. Each solution curve stores no more than $|L|$ solutions. Therefore, the claim is proved.
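Written out as a product, the count used in the proof of Theorem 3 reads as follows. Treating the library size $|L|$ as a constant independent of $n$ is an assumption made here so that the result matches the stated bound.

```latex
\underbrace{n^{2}}_{\text{Hanan points } v}
\;\times\;
\underbrace{n}_{\text{curves } \Gamma(z,v),\; 1 \le z \le n}
\;\times\;
\underbrace{|L|}_{\text{solutions per curve (Lemma 8)}}
\;=\; O\!\left(|L|\, n^{3}\right) \;=\; O\!\left(n^{3}\right)
```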
Theorem 4: FANROUT has $O(n^9)$ runtime complexity.
Proof: PTREE has $O(n^5)$ worst case runtime complexity (c.f. Lemma 4). Lines 5 and 6 of the pseudo-code each introduce $O(n)$ complexity and line 7 introduces another $O(n^2)$ complexity. Therefore the overall worst case complexity is $O(n^9)$.

4.2 Reducing the Complexity

Undoubtedly, the worst case complexity of FANROUT is too high for use in many practical cases. However, that complexity can be considerably reduced by applying some simple heuristics. In the following, a couple of heuristics are introduced which have proved to be highly effective with little compromise in terms of the quality of the final results.

I. Restrict the number of Hanan points: In the exact version of FANROUT, there are $n^2$ Hanan points, which is a major source of excessive runtime. We may, however, not allow more than $g$ Hanan points and change the complexity of line 7 to $O(g)$ and the complexity of PTREE to $O(g n^3)$ (c.f. the note given at the end of sub-section 2.2). Consequently, the worst case complexity of FANROUT is changed to $O(g^2 n^5)$.

II. Bound the maximum number of fanouts driven by a buffer: We may impose a practical upper bound on the number of fanouts that a buffer drives. Using that value, say $k$, we do not allow FANROUT to connect a buffer to more than $k$ fanouts. FANROUT can easily handle this case by changing $n$ in line 6 of the pseudo-code to $z+k-1$. In this case the complexities introduced by lines 6 and 9 are changed to $O(k)$ and $O(n^2 k^3)$ (c.f. the note given at the end of sub-section 2.2), respectively. Consequently, the worst case complexity of FANROUT is changed to $O(k^4 n^5)$.

III. Fast method: By applying both of the above techniques the complexity of FANROUT is changed to $O(g^2 k^4 n)$, which results in a linear worst-case complexity when $g$ and $k$ are assumed to be independent of $n$.

5. EXPERIMENTAL RESULTS

In order to verify the effectiveness of FANROUT, a set of experimental results is reported here. In the presented conventional flows (below), we do not impose any restrictions on the ordering for the sinks. In other words, the fanout optimization and routing tree generation methods are each free to choose their own appropriate ordering for the sinks (if any is needed).

In Table 1, the results are presented for a set of nets taken from a number of benchmarks where the sinks are placed randomly. For these examples, two conventional flows are compared against FANROUT, where FANROUT has been used for two different orderings:

I. Ordering with respect to the sink required times, REQ.
II. Ordering generated by solving the traveling salesman problem on the set of sinks, TSP.

The first conventional flow setup, conv-I, uses SIS [SSLM92] for fanout optimization, followed by using PTREE for routing tree generation. For each net, different fanout optimization methods available in SIS are used and for each net only the best result in terms of the required time is reported. The second conventional flow setup, conv-II, uses PTREE for routing tree generation followed by using the buffer insertion method introduced in [Gi90]. Note that in Table 1, total-area, req-time and w-length stand for the sum of the area of buffers, the required time at the input of the driver and the total wire length, respectively.

Our next set of experiments (c.f. Table 2) compares the performance of the conventional design flows against our proposed simultaneous algorithm on a number of benchmarks using a CASCADE standard cell library (0.5u HP CMOS process). Gate and wire delays are calculated using a 4-parameter delay equation (similar to that in [LSP97]) and the Elmore delay model [E48], respectively. Also, the fast FANROUT (c.f. sub-section 4.2) has been run with TSP orderings for the experiments reported in Table 2. These experiments showed that the runtime of the fast FANROUT is in the order of a few minutes, comparable to the runtimes of the conventional flows. Note that the area and delay reported in this table are total chip area and delay after detailed routing. These experiments were run in the SIS environment on an Ultra-2 Sun Sparc workstation (sahand.usc.edu) with 256MB memory.

6. CONCLUSIONS

This paper presents a novel algorithm, FANROUT, which performs simultaneous routing and fanout optimization. It is a dynamic-programming based algorithm which properly uses LT-Tree and P-Tree construction algorithms in order to generate buffered routing structures with maximum signal required time. It computes load versus required time solution curves for every point on the Hanan grid and propagates them while grouping more sinks according to the given order.
Merge and prune operations are defined on the solution curves to propagate the solution curves through the steps of the algorithm and drop the low quality solutions to maintain the polynomial complexity. FANROUT is an optimal algorithm for the required time maximization problem for a given order on the sinks. It also inherits all the restrictions that the LT-Tree and P-Tree construction algorithms have. FANROUT is a polynomial algorithm as well. This new unified design step yields high quality circuits in terms of post-layout chip area and delay.

7. ACKNOWLEDGEMENTS

We would like to thank Dr. John Lillis of the University of Illinois at Chicago for providing the implementation of the PTREE algorithm and for helpful discussions about the complexity of PTREE.

Table 1: FANROUT vs. conventional flows for single nets

| Circuit | Net | # of sinks | conv-I req-time | conv-I area | conv-I w-length | conv-II req-time | conv-II area | conv-II w-length | FANROUT (REQ) req-time | FANROUT (REQ) area | FANROUT (REQ) w-length | FANROUT (TSP) req-time | FANROUT (TSP) area | FANROUT (TSP) w-length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C432 | net1 | 6 | 35.23 | 219.12 | 83.07 | 21.52 | 149.93 | 63.15 | 21.83 | 165.66 | 63.41 | 17.08 | 165.66 | 55.49 |
| C432 | net2 | 10 | 33.50 | 329.12 | 135.15 | 29.74 | 297.66 | 118.64 | 28.05 | 222.20 | 93.22 | 24.46 | 222.20 | 113.11 |
| C1355 | net3 | 8 | 36.23 | 329.12 | 101.28 | 25.12 | 297.66 | 87.09 | 32.14 | 195.47 | 98.44 | 29.46 | 195.47 | 96.57 |
| C1355 | net4 | 9 | 34.23 | 382.58 | 116.94 | 30.33 | 268.18 | 116.58 | 28.78 | 195.47 | 109.74 | 26.65 | 61.82 | 110.31 |
| C3540 | net5 | 35 | 38.70 | 457.49 | 119.17 | 38.20 | 1147.30 | 152.62 | 32.16 | 270.38 | 122.82 | 32.01 | 270.38 | 132.27 |
| C3540 | net6 | 73 | 59.44 | 836.99 | 535.58 | 59.78 | 836.99 | 549.36 | 54.75 | 649.88 | 549.30 | 54.69 | 649.88 | 583.05 |
| C5315 | net7 | 12 | 24.94 | 516.23 | 68.22 | 12.21 | 268.18 | 42.89 | 21.83 | 248.93 | 71.40 | 17.23 | 248.93 | 70.15 |
| C5315 | net8 | 21 | 33.10 | 542.96 | 195.17 | 35.59 | 533.50 | 254.74 | 32.61 | 409.31 | 206.01 | 25.32 | 409.31 | 200.46 |
| C6288 | net9 | 16 | 48.33 | 516.23 | 144.30 | 43.75 | 415.58 | 168.95 | 40.38 | 222.20 | 157.68 | 28.96 | 222.20 | 160.35 |
| C6288 | net10 | 20 | 62.49 | 436.04 | 146.61 | 95.96 | 238.70 | 175.93 | 51.67 | 222.20 | 136.42 | 42.86 | 222.20 | 145.90 |
| C7552 | net11 | 16 | 48.57 | 516.23 | 179.16 | 30.28 | 504.02 | 211.38 | 37.83 | 222.20 | 182.51 | 21.98 | 222.20 | 171.69 |
| C7552 | net12 | 23 | 41.68 | 245.85 | 185.45 | 54.88 | 503.69 | 261.70 | 33.00 | 272.58 | 157.30 | 31.62 | 272.58 | 189.66 |
| Average ratio vs. conv-I | | | | | | | | | 0.84 | 0.63 | 0.95 | 0.71 | 0.61 | 0.98 |
| Average ratio vs. conv-II | | | | | | | | | 1.00 | 0.70 | 0.94 | 0.83 | 0.66 | 0.96 |

Table 2:

| Circuit | conv-I Area | conv-I Delay | conv-II Area | conv-II Delay | FANROUT (TSP) Area | FANROUT (TSP) Delay | FANROUT/conv-I Area | FANROUT/conv-I Delay | FANROUT/conv-II Area | FANROUT/conv-II Delay |
|---|---|---|---|---|---|---|---|---|---|---|
| C17 | 400.50 | 0.87 | 400.50 | 0.87 | 416.50 | 0.90 | 1.04 | 1.03 | 1.04 | 1.03 |
| C1355 | 35539.54 | 10.39 | 35225.19 | 10.20 | 25215.25 | 7.49 | 0.71 | 0.72 | 0.72 | 0.73 |
| C1908 | 51936.70 | 16.34 | 48694.77 | 18.54 | 43705.90 | 11.03 | 0.84 | 0.68 | 0.90 | 0.59 |
| C432 | 21947.10 | 11.59 | 19179.60 | 13.54 | 22241.46 | 11.63 | 1.01 | 1.00 | 1.16 | 0.86 |
| C499 | 29203.65 | 9.27 | 29208.99 | 8.99 | 31201.45 | 7.17 | 1.07 | 0.77 | 1.07 | 0.80 |
| C5315 | 134504.94 | 19.31 | 127776.26 | 20.04 | 112800.15 | 10.88 | 0.84 | 0.56 | 0.88 | 0.54 |
| C880 | 29786.25 | 10.53 | 28626.21 | 10.20 | 20811.15 | 10.01 | 0.70 | 0.95 | 0.73 | 0.98 |
| alu2 | 30199.15 | 14.72 | 27942.48 | 17.23 | 23561.25 | 10.53 | 0.78 | 0.72 | 0.84 | 0.61 |
| alu4 | 50985.15 | 21.60 | 46912.67 | 23.89 | 51801.75 | 17.34 | 1.02 | 0.80 | 1.10 | 0.73 |
| apex6 | 44626.00 | 7.12 | 44514.75 | 6.67 | 39516.96 | 5.27 | 0.89 | 0.74 | 0.89 | 0.79 |
| cm151a | 2042.32 | 2.88 | 1753.01 | 3.21 | 1560.45 | 1.83 | 0.76 | 0.64 | 0.89 | 0.57 |
| dalu | 95323.54 | 23.65 | 53424.14 | 26.47 | 88595.86 | 23.92 | 0.93 | 1.01 | 1.66 | 0.90 |
| misex1 | 4015.55 | 4.25 | 3166.56 | 5.27 | 3097.44 | 2.87 | 0.77 | 0.68 | 0.98 | 0.54 |
| lal | 5810.46 | 3.78 | 5931.42 | 4.10 | 4942.99 | 2.80 | 0.85 | 0.74 | 0.83 | 0.68 |
| frg1 | 6319.74 | 3.61 | 6425.50 | 3.54 | 5467.56 | 2.69 | 0.87 | 0.75 | 0.85 | 0.76 |
| pcle | 4775.31 | 3.51 | 4644.03 | 3.53 | 4161.83 | 1.89 | 0.87 | 0.54 | 0.90 | 0.54 |
| rd73 | 3519.67 | 3.62 | 3594.87 | 3.50 | 3676.39 | 3.67 | 1.04 | 1.01 | 1.02 | 1.05 |
| vg2 | 5264.19 | 3.69 | 5334.03 | 3.62 | 5086.35 | 2.52 | 0.97 | 0.68 | 0.95 | 0.70 |
| Average Ratios | | | | | | | 0.89 | 0.78 | 0.97 | 0.75 |

8. REFERENCES

[CHKM96] J. Cong, L. He, C. Koh, and P. Madden, Performance optimization of VLSI interconnect layout, In Integration, the VLSI Journal 21, pp. 1-94, 1996.
[CLZ93] J. Cong, K. Leung, and D. Zhou, Performance-driven interconnect design based on distributed RC delay model, In Proceedings of the 30th Design Automation Conference, pp. 606-611, 1993.
[E48] W. C. Elmore, The transient response of damped linear network with particular regard to wideband amplifiers, In Journal of Applied Physics 19, pp. 55-63, 1948.
[Gi90] L.P.P.P. van Ginneken, Buffer placement in distributed RC-tree networks for minimal Elmore delay, In Proceedings of International Symposium on Circuits and Systems, pp. 865-868, 1990.
[GJ79] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, San Francisco, 1979.
[Ha66] M. Hanan, On Steiner's problem with rectilinear distance, SIAM Journal of Applied Mathematics, No. 14, pp. 255-265, 1966.
[LCLH96] J. Lillis, C. K. Cheng, T. Y. Lin, and C. Ho, New performance driven routing techniques with explicit area/delay tradeoff and simultaneous wire sizing, In Proceedings of the 33rd Design Automation Conference, pp. 395-400, 1996.
[LSP97] J. Lou, A. H. Salek, and M. Pedram, An exact solution to simultaneous technology mapping and linear placement problem, In Proceedings of International Conference on Computer-Aided Design, pp. 671-675, 1997.
[OC96a] T. Okamoto and J. Cong, Buffered Steiner tree construction with wire sizing for interconnect layout optimization, In Proceedings of International Conference on Computer-Aided Design, pp. 44-49, 1996.
[OC96b] T. Okamoto and J. Cong, Interconnect layout optimization by simultaneous Steiner tree construction and buffer insertion, In Proceedings of the 5th ACM/SIGDA Physical Design Workshop, pp. 1-6, 1996.
[SLP98] A. H. Salek, J. Lou, and M. Pedram, A DSM design flow: Putting floorplanning, technology-mapping, and gate-placement together, In Proceedings of the 35th Design Automation Conference, 1998.
[SSLM92] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. K. Brayton, and A.
Sangiovanni-Vincentei, SIS: A system fo sequentia cicuit synthesis, Memoandum No. UCB/ERL M92/41, Eectonics Reseach Laboatoy, Coege of Engineeing, Univesity of Caifonia, Bekeey, CA 94720, May 1992. [To90] H. Touati, Pefomance-oiented technoogy mapping, Ph.D. thesis, Univesity of Caifonia, Bekeey, Technica Repot UCB/ERL M90/109, Novembe 1990.
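
As a rough illustration of the merge and prune operations on load versus required time solution curves summarized in the conclusions, the following Python sketch assumes the usual dominance rule in the style of [Gi90]: a solution is dropped when some other solution has no larger load and no smaller required time. The names `prune` and `merge`, the tuple representation of a curve point, and the rule that loads add while required times take the minimum at a branching point are assumptions made for illustration only; they are not taken from the FANROUT implementation.

```python
from typing import List, Tuple

# One point on a solution curve: (load, required_time). Hypothetical representation.
Solution = Tuple[float, float]


def prune(curve: List[Solution]) -> List[Solution]:
    """Keep only non-dominated points (assumed rule: lower load and higher required time are better)."""
    # Sort by increasing load; among equal loads, the larger required time comes first.
    ordered = sorted(curve, key=lambda s: (s[0], -s[1]))
    kept: List[Solution] = []
    best_req = float("-inf")
    for load, req in ordered:
        # A point survives only if it offers strictly more required time
        # than every point with a smaller or equal load seen so far.
        if req > best_req:
            kept.append((load, req))
            best_req = req
    return kept


def merge(a: List[Solution], b: List[Solution]) -> List[Solution]:
    """Combine the curves of two subtrees joined at one point (assumed rule)."""
    # Loads seen at the join point add up; the required time is limited by the tighter branch.
    combined = [(la + lb, min(ra, rb)) for la, ra in a for lb, rb in b]
    return prune(combined)


if __name__ == "__main__":
    left = prune([(2.0, 5.0), (1.0, 4.0), (3.0, 4.5), (1.0, 3.0)])
    right = [(1.0, 6.0)]
    print(left)
    print(merge(left, right))
```

Pruning after every merge keeps each curve sorted and free of dominated points, which is the kind of bookkeeping the conclusions credit for maintaining polynomial complexity.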