Theory and Algorithm for SPFD-Based Global Rewiring

Jason Cong and Wangning Long
Department of Computer Science, University of California, Los Angeles, CA 90095

Abstract

In this paper we present the theory and the algorithm for SPFD-based global rewiring (SPFD-GR), which allows us to replace a target wire globally in the circuit (by some wire possibly far away from the target). It overcomes the limitation of the existing SPFD-based local rewiring (SPFD-LR), which can only replace a wire with another wire having the same sink node. We apply SPFD-GR to post-mapping area reduction for LUT-based FPGAs under a circuit depth restriction. Experimental results show that the rewiring ability of SPFD-GR, in terms of the number of target wires found to have alternative wires, is 45 and […] times that of SPFD-LR and an ATPG algorithm (with a preliminary experimental flow), respectively, and the run time is quite acceptable. Using partitioning, the SPFD-GR algorithm scales well to large circuits with good synthesis quality.

1 Introduction

Rewiring is a technique that replaces a wire with another wire so as to achieve performance improvement or area reduction. Recently, it has received increasing attention due to the need for closer interaction between synthesis and layout for timing closure. The rewiring problem has been widely investigated. The existing approaches are based on automatic test pattern generation (ATPG) [1][4][5][10][12][13], symmetry detection [3], graph pattern recognition [17], and the SPFD (Set of Pairs of Functions to be Distinguished) [18][15], etc. Among them, ATPG-based redundancy addition and removal is probably the earliest and most widely used approach. It uses ATPG techniques to add a redundant wire (alternative wire) that makes the target wire redundant and removable. The advantage of the ATPG-based method is that it is capable of global rewiring, i.e., removing a target wire by adding an alternative wire far away from the target wire in the circuit. When applied to LUT-based FPGA circuits, however, ATPG-based rewiring has limited flexibility for changing the internal logic function of a node, which may provide a greater opportunity for optimization.
Recently, SPFD-based rewiring was proposed and developed in [18] and [15]. It has been used in technology-independent logic synthesis [15], LUT-based FPGA synthesis [18], floorplanning and placement of multi-level PLAs [6], and low power design for FPGAs [11]. The SPFD-based rewiring algorithm calculates a set of function pairs to be distinguished (SPFD) at each wire according to the initial circuit structure and the circuit's primary output functions. During the rewiring stage, if the SPFD at a node's in-pin, say $p_i$, can be distinguished by the logic function at another node's out-pin, say $p_j$, then a new wire $p_j \to p_i$ can replace the original wire ending at $p_i$. Unlike the ATPG-based method, however, conventional SPFD-based rewiring (referred to as SPFD-LR in this paper) can only find alternative wires locally, by which we mean that the sink node of the alternative wire is the same as that of the target wire. Conversely, the ATPG-based method is capable of finding a global alternative wire that might be far away from the target wire.

In this paper we propose an SPFD-based global rewiring (SPFD-GR). Our purpose is to apply this method to LUT-based FPGA synthesis. Our main contributions include:
1) We developed the theory and algorithm for solving a fundamental problem in SPFD-based rewiring: given the set of in-pin functions of a node and the SPFD at the node's out-pin, is there a way to modify the node's internal function so that the SPFD at the node's out-pin can be satisfied?
2) Using the concept of dominators and the node modification technique stated above, we developed SPFD-GR, which allows global rewiring with the flexibility of changing the internal functions of nodes to maximize the opportunity for rewiring.
3) Using a state-of-the-art multi-level/multi-way partitioning algorithm [8], SPFD-GR scales well to large designs.

2 Terminology and Definitions

In this section we review some terminology and definitions. The circuits used in this paper are combinational circuits. We restrict the circuits in our research to K-bounded networks, meaning that each logic cell (or node) in the circuit, as shown in Figure 1, has an out-pin $p_0$ and up to K in-pins ($p_1, p_2, \ldots, p_n$, $n \le K$). Each node has an internal logic function $p_0 = f(p_1, \ldots, p_n)$ that defines the logic relationship between the out-pin and the in-pins of the node. In this paper, the internal logic function of a node can be any combination of the node's inputs. Every pin in the circuit has a global logic function in terms of the primary inputs of the circuit (denoted as $g_i$ in Figure 1).

[Figure 1. Logic cell: node G with internal function $p_0 = f(p_1, \ldots, p_n)$, out-pin $p_0$ with global function $g_0$, and in-pins $p_1, p_2, \ldots, p_n$ ($n \le K$) with in-pin functions $g_1, g_2, \ldots, g_n$.]

[Figure 2. Illustration of dominators: a network of nodes $G_1$ to $G_5$ between the PIs and the primary output O.]

For wire $p_1 \to p_2$, as shown in Figure 2, we call $G_1$ its source node and $G_2$ its sink node. A transitive fanout of pin p is a node on one of the paths from p to a primary output (PO). A transitive fanout of a node is a transitive fanout of the node's out-pin. A dominator of pin p is a transitive fanout of p through which all the paths from p to POs pass. A dominator of a wire is a dominator of the wire's sink pin. For example, in Figure 2, $G_5$ is the only dominator of pin $p_1$, while both $G_2$ and $G_5$ are dominators of pin $p_2$ as well as of wire $p_1 \to p_2$.

A function f is said to distinguish a function pair $(\pi_1, \pi_0)$, where $\pi_1 \ne 0$, $\pi_0 \ne 0$ and $\pi_1 \pi_0 = 0$, if either one of the following two conditions is satisfied [15]:

$$\pi_1 \le f \le \bar{\pi}_0 \quad \text{or} \quad \pi_0 \le f \le \bar{\pi}_1$$
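The "distinguish" condition can be checked mechanically once functions are stored as truth tables. The following is a minimal sketch, not the authors' implementation (which uses BDDs): each function of the primary inputs is packed into a Python integer with one bit per input assignment, and the names `truth_table`, `distinguishes` and `satisfies` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product


def truth_table(fn, n_vars):
    """Pack fn's value on every primary-input assignment into an int bitmask."""
    mask = 0
    for idx, bits in enumerate(product((0, 1), repeat=n_vars)):
        if fn(*bits):
            mask |= 1 << idx
    return mask


def distinguishes(f, pi_a, pi_b, n_vars):
    """True if pi_a <= f <= NOT pi_b, or pi_b <= f <= NOT pi_a (bitwise containment)."""
    full = (1 << (1 << n_vars)) - 1      # all-ones mask over 2**n_vars rows
    not_f = full & ~f
    # pi_a inside f's on-set and pi_b inside f's off-set
    a_on_b_off = (pi_a & not_f) == 0 and (pi_b & f) == 0
    # the symmetric case: pi_b inside the on-set, pi_a inside the off-set
    b_on_a_off = (pi_b & not_f) == 0 and (pi_a & f) == 0
    return a_on_b_off or b_on_a_off


def satisfies(f, spfd, n_vars):
    """An SPFD (list of pairs) is satisfied when f distinguishes every pair."""
    return all(distinguishes(f, a, b, n_vars) for a, b in spfd)
```

For instance, with two primary inputs, the XOR function and its complement both distinguish the pair (XOR, XNOR), whereas the single-variable function $x_1$ does not, because it takes both values inside each member of the pair.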

An SPFD is a set of pairs of functions to be distinguished. A function f is said to satisfy the SPFD $P = \{(\pi_{11}, \pi_{10}), (\pi_{21}, \pi_{20}), \ldots, (\pi_{m1}, \pi_{m0})\}$ if f distinguishes all the function pairs in P. An SPFD can be considered as a way to express don't-care conditions, and it provides flexibility in implementing a node [2].

3 Review of SPFD Calculation

For a logic network, the calculation of the SPFDs usually consists of two steps [18]. First, traverse the circuit from primary inputs (PIs) to primary outputs (POs) and calculate the logic functions at all pins. Second, calculate the SPFDs backward from POs to PIs. The SPFD calculation method depends on the kind of pin:
1) At each PO, O, the SPFD has only one function pair, $P = \{(f_1, f_0)\}$, where $f_1$ is the on-set function of O and $f_0$ is the off-set function of O.
2) At a node's out-pin, the SPFD is the union of its fanout pins' SPFDs.
3) For the in-pins of a node, once its out-pin SPFD has been obtained, the in-pin SPFDs are obtained by decomposing the out-pin SPFD into an atomic SPFD and assigning the function pairs backwards to the in-pins, as illustrated in the following example.

Example 1: Assigning SPFDs to the inputs of a node using the method in [18], for the 2-input node G shown in Figure 3(a). Assume the circuit's PIs are $x_1$ and $x_2$ and the PO is at $p_0$. G's in-pins are $p_1$ and $p_2$ and its out-pin is $p_0$. The internal function of G is $p_0 = p_1 + p_2$. Given that the in-pin functions of G are $g_1 = \bar{x}_1 x_2$ at $p_1$ and $g_2 = x_1 \bar{x}_2$ at $p_2$, the output function of the circuit is $y = \bar{x}_1 x_2 + x_1 \bar{x}_2$. Hence, the SPFD at $p_0$ is simply $SPFD_0 = \{(f_1, f_0)\}$, where $f_1 = \bar{x}_1 x_2 + x_1 \bar{x}_2$ and $f_0 = \bar{x}_1 \bar{x}_2 + x_1 x_2$.

[Figure 3. (a) A node in a circuit. (b) Local rewiring.]

To get the SPFDs at $p_1$ and $p_2$, we first calculate all minimum product terms of G's input functions, and then multiply them by $(f_1 + f_0)$:

$$\hat\beta_0 = \bar{g}_1 \bar{g}_2 (f_1 + f_0) = \bar{g}_1 \bar{g}_2 = \bar{x}_1 \bar{x}_2 + x_1 x_2 \le f_0$$
$$\hat\beta_1 = \bar{g}_1 g_2 (f_1 + f_0) = x_1 \bar{x}_2 \le f_1$$
$$\hat\beta_2 = g_1 \bar{g}_2 (f_1 + f_0) = \bar{x}_1 x_2 \le f_1$$
$$\hat\beta_3 = g_1 g_2 (f_1 + f_0) = 0$$

We then decompose the single pair in $SPFD_0$ into a multiple-pair atomic SPFD (thereby propagating the function pairs in $SPFD_0$ to $p_1$ and $p_2$):

$$SPFD_0 = \{(\hat\beta_2, \hat\beta_0), (\hat\beta_1, \hat\beta_0)\} = \{(\bar{x}_1 x_2,\ \bar{x}_1 \bar{x}_2 + x_1 x_2),\ (x_1 \bar{x}_2,\ \bar{x}_1 \bar{x}_2 + x_1 x_2)\}$$

As $g_1$ distinguishes $(\hat\beta_2, \hat\beta_0)$, we assign $(\hat\beta_2, \hat\beta_0)$ to pin $p_1$. For the same reason, we assign $(\hat\beta_1, \hat\beta_0)$ to pin $p_2$. (Notice that the SPFD assigning procedure depends on the ordering of the in-pins.) We then obtain the SPFDs at $p_1$ and $p_2$:

$$SPFD_1 = \{(\hat\beta_2, \hat\beta_0)\} = \{(\bar{x}_1 x_2,\ \bar{x}_1 \bar{x}_2 + x_1 x_2)\}$$
$$SPFD_2 = \{(\hat\beta_1, \hat\beta_0)\} = \{(x_1 \bar{x}_2,\ \bar{x}_1 \bar{x}_2 + x_1 x_2)\}$$

After the SPFDs have been obtained for the circuit, we can perform rewiring. In Figure 3(b), for example, $p_3$ with function $g_3 = x_1 + \bar{x}_2$ satisfies $\hat\beta_0 \le g_3 \le \overline{\hat\beta_2}$ and therefore satisfies $SPFD_1$. Therefore we can use wire $p_3 \to p_1$ to replace the original wire ending at $p_1$. After the replacement, the internal function of G should be modified into $p_0 = \bar{p}_1 + p_2$. Notice that the alternative wire found by the above process must have the same sink node as the target wire. In this sense, SPFD-LR is a local rewiring approach.

4 SPFD-Based Global Rewiring

In this section we present our SPFD-based global rewiring algorithm, SPFD-GR. The general global rewiring problem can be formulated as follows: given a target wire $w_r$ with sink node $G_1$, can we add at most one wire to some node $G_D$ in the network, with possible modification of the internal functions of $G_D$ and other nodes in the circuit, so that we can remove $w_r$ while preserving the network's primary-output functions?
(See the illustration in Figure 4.) The above problem can be divided into two sub-problems: 1) when $G_D = G_1$, the problem can be solved by SPFD-LR; 2) when $G_D \ne G_1$ and the SPFD on $w_r$ is not empty, we must determine how to select $G_D$ and how to perform the logic transformation such that the logic function of the circuit is preserved. This paper solves the second sub-problem.

[Figure 4. Illustration of SPFD-based global rewiring: the target wire $w_r$ and its sink node, the dominator $G_D$ of $w_r$, and the alternative wire $w_a$ driven by node $G_4$.]

Inspired by an idea used in some ATPG-based rewiring algorithms, we make use of the concept of a dominator of the target wire (e.g., $G_D$ in Figure 4), as the effect of removing the target wire must pass through the dominator to reach any PO. Therefore, after removing $w_r$, if we have a way to modify the internal functions of $G_D$ and possibly other nodes so that $G_D$'s out-pin SPFD is satisfied, then we have a way to keep the logic functions of all POs unchanged. This idea leads to the following algorithm.

SPFD-Based Global Rewiring Algorithm (SPFD-GR): Given target wire $w_r$ and its dominator $G_D$ (Figure 4),
1) Temporarily remove $w_r$ from $G_1$ and re-calculate the output function of $G_1$;
2) Propagate $G_1$'s new output function through its transitive fanouts until reaching $G_D$;
3) Try to modify $G_D$ without wire addition so that $f_D$ distinguishes the SPFD at $G_D$'s out-pin (using the theory to be presented in Section 5.1). If successful, go to Step 6. Otherwise, go to the next step;
4) Try to modify $G_D$ by adding a wire $w_a$ so that $f_D$ distinguishes the SPFD at $G_D$'s out-pin (using the algorithm to be presented in Section 5.2). If successful, go to Step 6; otherwise try another candidate wire and repeat this step. If no candidate is left, go to the next step;
5) (Fail) Restore the functions of $G_1$ and its transitive fanouts up to $G_D$. Return fail.
6) (Success) Permanently remove $w_r$, and update the internal functions of the transitive fanouts of $G_D$ as necessary. Update the SPFDs from the changed nodes close to POs backwards to the sink node of the wire selected as the next target wire. Return success.

By removing a wire from a node in Step 1 of the above algorithm, we mean setting the wire's logic value to 0 or 1. As our approach deals with complex nodes, it can sometimes be difficult to determine whether to assign 0 or 1 to the wire. We have a procedure to guide the assignment that minimizes the number of wires to be removed from this node. For example, given a node's internal function $f = ab + c$, where a is to be removed, we shall set $a = 1$ so that b and c are not removed at the same time. Otherwise, if we set $a = 0$, b will also be removed. Increasing the number of wires to be removed may decrease the possibility for Steps 3 and 4 to be successful.

Note that our algorithm assumes that the target wire is given. The choice of a target wire depends on the application. To reduce the circuit delay, we select a wire on the critical path as the target wire. If our objective is to minimize the circuit area, we select a wire that enables deletion of a node or packing with some other nodes.

It is worth noting that in Figure 4, $G_1$ is also a dominator of $w_r$. Therefore, the above SPFD-GR algorithm actually covers the cases covered by SPFD-LR. In practice we use SPFD-LR to do rewiring when $G_1 = G_D$.

5 Theory of SPFD-Based Global Rewiring

As seen in the previous section, the key question for the SPFD-GR algorithm can be generalized as two questions: 1) For a node in the circuit, given its in-pin functions and out-pin SPFD, is there a way to modify the internal function of the node so that its out-pin function still distinguishes its out-pin SPFD? 2) If the answer to question 1 is no, can we add a wire to the node and modify the node's internal function so as to make its out-pin function distinguish its out-pin SPFD?
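Question 1 can be made concrete with a naive exhaustive search, which is only practical for very small nodes. The sketch below is an illustration under stated assumptions rather than the paper's procedure: functions are stored as integer truth tables over the primary inputs, an internal function is a lookup table indexed by the in-pin values, and `truth_table`, `separates` and `find_internal_function` are hypothetical names.

```python
from itertools import product


def truth_table(fn, n_vars):
    """Pack fn's value on every primary-input assignment into an int bitmask."""
    mask = 0
    for idx, bits in enumerate(product((0, 1), repeat=n_vars)):
        if fn(*bits):
            mask |= 1 << idx
    return mask


def separates(f, pi_a, pi_b, full):
    """True if pi_a <= f <= NOT pi_b, or pi_b <= f <= NOT pi_a."""
    not_f = full & ~f
    return ((pi_a & not_f) == 0 and (pi_b & f) == 0) or \
           ((pi_b & not_f) == 0 and (pi_a & f) == 0)


def find_internal_function(in_pin_tts, spfd, n_vars):
    """Try every lookup table over the in-pins; return one whose out-pin
    function distinguishes all pairs in spfd, or None if none exists."""
    n_rows = 1 << n_vars                  # primary-input assignments
    full = (1 << n_rows) - 1
    for lut in product((0, 1), repeat=1 << len(in_pin_tts)):
        out = 0
        for row in range(n_rows):
            idx = 0
            for g in in_pin_tts:          # in-pin values select the LUT entry
                idx = (idx << 1) | ((g >> row) & 1)
            if lut[idx]:
                out |= 1 << row
        if all(separates(out, a, b, full) for a, b in spfd):
            return lut
    return None
```

With n in-pins the loop visits up to $2^{2^n}$ lookup tables, which is why a cheaper check is desirable even for nodes with four or five inputs.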
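We call this problem the node modification problem. In the following two subsections, we present two efficient checking procedures to solve the node modification problem. We consider two cases: 1) modifying a node without adding a wire; and 2) modifying a node by adding a wire.

5.1 Node Modification Without Wire Addition

Given a node's in-pin functions, the output function of the node can be expressed as the sum of minimum product terms of the in-pin functions, defined as follows.

Definition 1 (minimum product term): Let $B = \{\beta_0, \beta_1, \ldots, \beta_{N-1}\}$ ($N = 2^n$), where $\beta_i$ ($0 \le i \le N-1$) is a minimum product term of the node's in-pin functions, called an MP-term, which has the form

$$\beta_0 = \bar{g}_1 \bar{g}_2 \cdots \bar{g}_n, \quad \beta_1 = \bar{g}_1 \bar{g}_2 \cdots g_n, \quad \ldots, \quad \beta_{N-1} = g_1 g_2 \cdots g_n \qquad (1)$$

where $g_i$ ($1 \le i \le n$) is the global function at node G's i-th input.

Given a function pair $(\pi_1, \pi_0)$, $\hat\beta_i = (\pi_1 + \pi_0)\beta_i$ is called the restricted MP-term, and $\hat{f} = (\pi_1 + \pi_0) f$ is called the restricted function of f, where f is the original function. The restricted function and the original function have the following relationship.

Lemma 1: $\pi_1 \le \hat{f} \le \bar{\pi}_0 \iff \pi_1 \le f \le \bar{\pi}_0$, and $\pi_0 \le \hat{f} \le \bar{\pi}_1 \iff \pi_0 \le f \le \bar{\pi}_1$. In other words, $\hat{f}$ distinguishes $(\pi_1, \pi_0)$ if and only if f distinguishes $(\pi_1, \pi_0)$.

To choose a proper node function f that satisfies the SPFD at the node's out-pin, we can try every combination of the MP-terms, i.e., $f = \sum_{i=0}^{N-1} k_i \beta_i$, where $k_i = 0$ or 1. However, the time complexity of this simple approach, $O(2^{2^n})$, is too high. For a single-pair SPFD, the following theorem provides a more efficient approach to perform the above check with only $O(2^n)$ time complexity. This is much more efficient and quite affordable for a node with a small number of inputs, usually 4 or 5 for LUT-based FPGAs.

Theorem 1: Given a node whose out-pin SPFD is $P = \{(\pi_1, \pi_0)\}$ and whose in-pin functions are $g_1, g_2, \ldots, g_n$, where n is the number of inputs of the node, if every non-empty restricted MP-term $\hat\beta_i = (\pi_1 + \pi_0)\beta_i$ satisfies one of the following conditions,

$$\hat\beta_i \le \pi_1 \quad \text{or} \quad \hat\beta_i \le \pi_0 \qquad (2)$$

then there exists a function f that distinguishes $(\pi_1, \pi_0)$. In particular, f can be constructed as follows:

$$f = f(g_1, g_2, \ldots, g_n) = \sum_{i=0}^{N-1} k_i \beta_i \qquad (3)$$

where $N = 2^n$; $k_i = 1$ if $\hat\beta_i \le \pi_1$, and $k_i = 0$ if $\hat\beta_i \le \pi_0$.

Proof: (1) Let $\hat{f} = \sum_{i=0}^{N-1} k_i \hat\beta_i = \sum_{i=0}^{N-1} k_i (\pi_1 + \pi_0)\beta_i$, where $k_i = 1$ if $\hat\beta_i \le \pi_1$ and $k_i = 0$ if $\hat\beta_i \le \pi_0$. It is clear that $\hat{f}$ only contains min-terms of $\pi_1$. Hence, $\hat{f} \le \pi_1$. (2) On the other hand, since $\sum_{i=0}^{N-1} \beta_i = 1$, we know that $\sum_{i=0}^{N-1} \hat\beta_i = \pi_1 + \pi_0$ includes all the min-terms (of the primary inputs) of $\pi_1$. Since $\pi_1 \pi_0 = 0$, we know that $\hat\beta_i$ contains no min-terms of $\pi_1$ when $\hat\beta_i \le \pi_0$. Thus, $\hat{f}$ includes all the min-terms of $\pi_1$. Hence, $\hat{f} \ge \pi_1$. Combining (1) and (2), we know $\hat{f} = \pi_1$, which means that $\hat{f}$ satisfies $\pi_1 \le \hat{f} \le \bar{\pi}_0$ (because $\pi_1 \pi_0 = 0$). Therefore, from Lemma 1, we know $\pi_1 \le f \le \bar{\pi}_0$. QED

Equation (3) gives us a way to construct the internal function of a node when the condition in Theorem 1 is satisfied. For node G in Figure 1, we can get G's internal function by substituting $g_i$ in each MP-term of (3) with $p_i$ and f with $p_0$. (We can also use the method proposed in [18] to construct the internal function.)

Example 2: For node G in Figure 1, suppose the SPFD at G's out-pin $p_0$ is $SPFD_0 = \{(f_1, f_0)\}$, where $f_1 = x_1 x_2 + \bar{x}_1 \bar{x}_2$ and $f_0 = x_1 \bar{x}_2 + \bar{x}_1 x_2$; the in-pin functions are $g_1 = x_1 + x_2$ at $p_1$ and $g_2 = x_1 x_2$ at $p_2$. Since $f_1 + f_0 = 1$, we have

$$\hat\beta_0 = \beta_0 = \bar{g}_1 \bar{g}_2 = \bar{x}_1 \bar{x}_2 \le f_1$$
$$\hat\beta_1 = \beta_1 = \bar{g}_1 g_2 = 0$$
$$\hat\beta_2 = \beta_2 = g_1 \bar{g}_2 = x_1 \bar{x}_2 + \bar{x}_1 x_2 \le f_0$$
$$\hat\beta_3 = \beta_3 = g_1 g_2 = x_1 x_2 \le f_1$$

As all the restricted MP-terms satisfy the conditions in (2), we get G's output function $f = \beta_0 + \beta_3$, which distinguishes $(f_1, f_0)$. By substituting f with $p_0$, $g_1$ with $p_1$, and $g_2$ with $p_2$, we get a feasible internal function $p_0 = \bar{p}_1 \bar{p}_2 + p_1 p_2$. (End of example)

Usually, the function constructed by Equation (3) needs to be simplified, e.g., using SIS's node simplification [14]. In fact, we can show that the conditions in Theorem 1 are also necessary for a node to have an output function that distinguishes $(\pi_1, \pi_0)$. The proof is not included here due to the page limitation.

Theorem 1 assumes that the SPFD at the output of a node has a single function pair. In practice, however, the SPFD at a node's out-pin may contain several function pairs. The following theorem demonstrates a way to combine the function pairs.

Theorem 2: Given the SPFD $P = \{(\pi_{11}, \pi_{10}), (\pi_{21}, \pi_{20}), \ldots, (\pi_{m1}, \pi_{m0})\}$ at a node's out-pin, and a function $\varphi$ which distinguishes all the pairs in P, suppose without loss of generality that $\pi_{i1} \le \varphi \le \bar{\pi}_{i0}$ for every pair $(\pi_{i1}, \pi_{i0})$ ($1 \le i \le m$). Let $\pi_1 = \sum_{i=1}^{m} \pi_{i1}$ and $\pi_0 = \sum_{i=1}^{m} \pi_{i0}$. Then $f = f(g_1, g_2, \ldots, g_n)$ distinguishes any pair in P if f distinguishes $(\pi_1, \pi_0)$.

Proof: Without loss of generality, assume $\pi_1 \le f \le \bar{\pi}_0$. For any function pair $(\pi_{i1}, \pi_{i0}) \in P$, we have $\pi_{i1} \le \pi_1 \le f$. On the other hand, $f \le \bar{\pi}_0 = \bar{\pi}_{10} \bar{\pi}_{20} \cdots \bar{\pi}_{m0} = \prod_{j=1}^{m} \bar{\pi}_{j0} \le \bar{\pi}_{i0}$. Therefore $\pi_{i1} \le f \le \bar{\pi}_{i0}$. Similarly, we can prove that $\pi_0 \le f \le \bar{\pi}_1 \Rightarrow \pi_{i0} \le f \le \bar{\pi}_{i1}$. Therefore, f distinguishes any pair in P if f distinguishes $(\pi_1, \pi_0)$. (QED)

Notice that the condition in Theorem 2 is only a sufficient condition, and no longer a necessary one (unlike Theorem 1). In Theorem 2, the function $\varphi$ is needed to classify the functions in each pair into $\varphi$'s on-set and off-set. Since the node's initial output function must satisfy the SPFD at its out-pin, we can simply use it as $\varphi$.

5.2 Node Modification by Adding a Wire

Given a node, if no combination of its in-pin functions distinguishes its out-pin SPFD, we can try to add a wire to the node so that the combination of the in-pin functions, including the new wire's function, distinguishes the out-pin SPFD. The following algorithm gives us an efficient way to determine which wire can be added to the node.

Wire Addition Algorithm:
1) Calculate the MP-term set $B = \{\beta_0, \beta_1, \ldots, \beta_{N-1}\}$ ($N = 2^n$), which is defined in (1). The non-empty MP-terms can be classified into three sets, $B_1$, $B_0$ and $B_s$, where $B_1$ is the set whose members satisfy $(\pi_1 + \pi_0)\beta \le \pi_1$; $B_0$ is the set whose members satisfy $(\pi_1 + \pi_0)\beta \le \pi_0$; and $B_s$ is the set whose members do not satisfy any of the conditions in (2). Let $B_1 = \{\beta_{t_1}, \beta_{t_2}, \ldots, \beta_{t_T}\}$ and $B_s = \{\beta_{s_1}, \beta_{s_2}, \ldots, \beta_{s_S}\}$. If $B_s = \emptyset$, then we can successfully modify the node without adding a wire; return success. Otherwise, it is necessary to add a wire; go to the next step.
2) Choose a candidate node in the network from which we want to link a wire to G's new in-pin $p_{n+1}$. Suppose its function is $g_{n+1}$. For each $\beta_{s_i} \in B_s$, let $\hat\beta_{s_i} = (\pi_1 + \pi_0)\beta_{s_i}$, and calculate $r_{i1} = \hat\beta_{s_i} g_{n+1}$ and $r_{i0} = \hat\beta_{s_i} \bar{g}_{n+1}$. If each r satisfies either $r \le \pi_1$ or $r \le \pi_0$ ($r \in \{r_{i1}, r_{i0}\}$), we can add this wire; go to the next step. Otherwise, find another candidate and repeat this step. If all nodes have been tried, return fail.
3) Use the calculation results of Steps 1 and 2 to get G's output function that distinguishes $(\pi_1, \pi_0)$:

$$f = f(g_1, g_2, \ldots, g_n, g_{n+1}) = f_t + \sum_{i=1}^{S} k_i \beta_{s_i} g_{n+1} + \sum_{j=1}^{S} k'_j \beta_{s_j} \bar{g}_{n+1} \qquad (4)$$

where $f_t = \beta_{t_1} + \beta_{t_2} + \cdots + \beta_{t_T}$, $\beta_{t_i} \in B_1$, which is obtained in Step 1. For each $k_i$ in the second term, $k_i = 1$ if $\hat\beta_{s_i} g_{n+1} \le \pi_1$, and $k_i = 0$ otherwise. For each $k'_j$ in the third term, $k'_j = 1$ if $\hat\beta_{s_j} \bar{g}_{n+1} \le \pi_1$, and $k'_j = 0$ otherwise.
4) Use a method similar to the one used in the previous section to calculate G's internal function according to (4). Return success.

As $B_s$ is a subset of B, which is usually very small, the checking procedure in the above algorithm does not consume too much time. In addition, $B_1$ and $B_0$ remain the same for any candidate node. Therefore, they can be re-used in the computation.

[Figure 5. Illustration of node modification with wire addition: node G with out-pin $p_0$, $SPFD_0 = \{(\pi_1, \pi_0)\}$, in-pin functions $g_1$ and $g_2$, and the wire to be added with function $g_3$.]

Example 3 (Node modification with wire addition): The in-pin functions $g_1 = \bar{x}_1 x_2$ and $g_2 = x_1 \bar{x}_2$ are given as shown in Figure 5. The SPFD at G's out-pin $p_0$ is $SPFD_0 = \{(\pi_1, \pi_0)\}$, where $\pi_1 = \bar{x}_1 + \bar{x}_2$ and $\pi_0 = x_1 x_2$. We carry out the wire addition algorithm in the following steps:
1) Condition checking (without wire addition):

$$\hat\beta_0 = \bar{g}_1 \bar{g}_2 (\pi_1 + \pi_0) = x_1 x_2 + \bar{x}_1 \bar{x}_2$$
$$\hat\beta_1 = \bar{g}_1 g_2 (\pi_1 + \pi_0) = x_1 \bar{x}_2 \le \pi_1$$
$$\hat\beta_2 = g_1 \bar{g}_2 (\pi_1 + \pi_0) = \bar{x}_1 x_2 \le \pi_1$$
$$\hat\beta_3 = g_1 g_2 (\pi_1 + \pi_0) = 0$$

Thus, we can set $B_0 = \emptyset$, $B_1 = \{\beta_1, \beta_2\}$ and $B_s = \{\beta_0\}$. As $B_s \ne \emptyset$, it is necessary to add a wire.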
2) Try to add a wire from $g_3 = \bar{x}_1 \bar{x}_2$, and check:

$$\bar{g}_1 \bar{g}_2 g_3 (\pi_1 + \pi_0) = \bar{x}_1 \bar{x}_2 \le \pi_1$$
$$\bar{g}_1 \bar{g}_2 \bar{g}_3 (\pi_1 + \pi_0) = x_1 x_2 \le \pi_0$$

3) The conditions are satisfied. Therefore we can add the wire from $g_3$ to G. $g_0 = g_1 \bar{g}_2 + \bar{g}_1 g_2 + \bar{g}_1 \bar{g}_2 g_3$ is a function that distinguishes $(\pi_1, \pi_0)$.
4) Finally, we get G's internal function: $p_0 = p_1 \bar{p}_2 + \bar{p}_1 p_2 + \bar{p}_1 \bar{p}_2 p_3$.

6 Implementation Notes

In the implementation of SPFD-GR, the CUDD BDD package [16] is used to represent logic functions and SPFDs. The logic function at each pin is expressed in terms of the primary inputs of the circuit and is represented by a BDD. SPFDs are also represented by BDDs. The runtime of the SPFD-GR algorithm is determined by the runtime of BDD constructions and variable BDD operations.

In order to handle large circuits, we use a state-of-the-art multi-level, multi-way partitioning program, named KPM/MESC [8], to partition a circuit into smaller blocks and separately apply SPFD-GR to each block. After all the blocks have been processed, we combine the results to get the solution for the entire circuit.

7 Experimental Results

We applied SPFD-GR to LUT-based FPGA synthesis. We performed three experiments: 1) rewiring ability comparison; 2) area minimization for large circuits; and 3) impact of the partition size on the synthesis quality. As the circuit depth is important for LUT-based FPGAs, all of our experiments are under the circuit depth restriction, which means that the changes to the circuit will not increase the maximum circuit depth. All of our experiments are carried out on a Sun Ultra 6 workstation.

Table 1 compares SPFD-GR with SPFD-LR and an ATPG-based rewiring algorithm with a preliminary experimental flow. $A_0$ refers to the case in which the ATPG rewiring engine does not use recursive learning, while $A_1$ refers to the case in which the ATPG engine uses recursive learning of level 1. In the experiments for Table 1, each program traverses the entire circuit once, selecting every wire as a target wire and trying to find alternative wires that satisfy the maximum depth restriction. For the purpose of collecting statistical data, we did not make real changes to the circuit. The circuits used in Table 1 are 4-LUT networks obtained through script.rugged, CutMap [7], Red_Removal and Greedy_Pack. All of these routines are available in SIS [14] and RASP [9]. Column 2 lists the number of wires in each circuit. Columns 3 ~ 6 show the rewiring ability of the four methods, which are the ATPG-based algorithms without ($A_0$) or with ($A_1$) recursive learning, SPFD-LR and SPFD-GR, respectively. By the rewiring ability of a rewiring algorithm, we mean the percentage of wires having at least one alternative wire. Columns 7 ~ 10 show the run time in seconds of the four algorithms.

We implemented SPFD-LR according to [18] and used it for comparison purposes. Moreover, we also used it in our program to cover the cases of local rewiring (which is covered by SPFD-GR). For ATPG-based rewiring, we use an ATPG rewiring engine that was used in [10][12], which was originally designed for rewiring in cell-based designs. To apply this
ATPG rewiring engine, we decompose the LUT-based network into a simple-gate-based network that is then fed into the rewiring engine, which returns the alternative wire candidates. Notice that our experiment does not reflect the recent improvements of ATPG-based rewiring, e.g., [5]. The comparison with ATPG-based rewiring is preliminary. Table 1 shows that the rewiring ability of the ATPG rewiring algorithms with and without recursive learning is […]% and 98%, respectively. The rewiring ability of SPFD-LR and SPFD-GR is 5% and 67%, respectively.

Table 1. Comparison of rewiring ability for 4-LUT FPGA designs under circuit depth restriction
Columns: circuit name; number of wires; rewiring ability ($A_0$, $A_1$, SPFD-LR, SPFD-GR); CPU time in seconds ($A_0$, $A_1$, SPFD-LR, SPFD-GR)
C98 4 6 8 7 96 4 76 76 549
C4 58 4 8 4 76 87 479 5
C55 77 4 6 4 5 7 9 47 47
alu 5 88 6 59 5 4 5
alu4 99 54 94 67 99 76 48 9
apex6 5 7 9 6 5 77 58 54
dalu 8 4 57 457 69 768 977 74 466
example 4 9 4 88 9 4 6 7
x 557 9 4 55 5 9 5 58
x 958 48 65 56 55 4
Total / Ratio 849 84 9 49 5 495 479 6 49 98% % 5% 67% 88

We applied the SPFD-based algorithms to post-mapping area reduction. We use 5 circuits whose size is up to […]K […]-input gates. The results are listed in Table 2. The circuits fed into the rewiring program are 4-LUT networks. Ind1 ~ Ind4 are synthesized and mapped by script.algebraic [14], CutMap and Greedy_Pack. Ind5 only goes through CutMap and Greedy_Pack because script.algebraic fails to get a result within […] hours. We use a greedy strategy to do the area minimization, which tries to remove as many wires as possible. At the end of each pass, Greedy_Pack [9] is called to further pack the circuit. We chose this strategy based on two considerations: 1) if a node has only one fanout wire and it can be removed, then the node can also be removed; 2) if the neighboring nodes have no more than 4 inputs, Greedy_Pack may further pack them.

Since these circuits are quite large, we used the KPM/MESC partitioning program [8] to partition the design into a set of blocks. The average partition size, which is the average number of LUT nodes within each partition, is set to 6 LUT nodes and the skew of the partition is set to 5%. Note that with this setting there may be some blocks larger than 8 LUT nodes. The program runs 4 iterations for a circuit. Within each iteration, the program runs up to […] times for each partition. All the results in Table 2 pass the simulation verification of SIS with 8 random patterns. In Table 2, area represents the number of 4-LUT nodes. Column 2 lists the area of the initial circuits. Columns 3 ~ 6 show the area and run time of SPFD-LR and SPFD-GR. The data show that SPFD-LR and SPFD-GR achieve area reductions of 11.4% and 14.4%, respectively. Notice that SPFD-GR's run time is only […] times that of SPFD-LR. Therefore, the speed of SPFD-GR is quite acceptable with a moderate partition size.

Table 2. Area minimization for large circuits under circuit depth restriction with average partition size of 6
Columns: circuit; initial area; SPFD-LR area, CPU(s); SPFD-GR area, CPU(s)
Ind 5 4 7 5 84
Ind 55 989 85 966 5
Ind 4 88 777 85 64
Ind4 87 649 7449 68
Ind5 449 495 987 8657 465
Aver 96 786 489 685 5787
Ratio 88.6% 85.6%

A major concern is whether the use of the partitioning technique with SPFD-GR will considerably degrade the solution quality. In Table 3, we compare the results of SPFD-GR for area minimization with and without partitioning for a set of medium-size circuits. For the circuits whose 4-LUT numbers are less than […], we use a […]-way partition. For the others, we use a 4-way partition. The initial circuits are obtained by the same method as that for Table 1. In Table 3, Non-p refers to the non-partitioned result, while Part refers to the partitioned result. Both programs run […] iterations for each circuit. The total area of the partitioned results is […]% worse than that without partitioning, while the former is 6 times faster than the latter. For some circuits, the partitioned results actually have a smaller area. This is due to the fact that we adopt a kind of greedy algorithm for area minimization. For alu4, the partitioned one is slower than the non-partitioned one. The reason is that the BDD variable order for one partition is very bad, resulting in a much longer BDD calculation time. We also performed an experiment similar to that in Table 3, except that we use a […]-way partition for all circuits. We do not list the data here because of the page limitation. In that experiment, the overall result is that the partitioned one is only 8% worse than the non-partitioned one and the run time of the former is around 9 times faster than the latter.

Table 3. Impact of partitioning in area minimization
Columns: circuit; area (Init, Non-p, Part); CPU(s) (Non-p, Part)
C4 54 9 4 54
C98 7 8 7 9
x 6 4 9 9
alu 5 4 4
alu4 84 4 5 4 96
apex6 86 4 6 5 5
example 5 4 7 6
x 68 4 44 5 7
dalu 7 75 97 69
C55 5 48 487 46
Total 45 6 74 859 6
Ratio 86% 886% 6

8 Conclusion and Future Work

This paper presents an SPFD-based global rewiring (SPFD-GR), which is capable of finding alternative wires far away from the target wire. The experimental results show that the rewiring ability of the SPFD-GR method is 45 and […] times that of the conventional SPFD method and an ATPG-based method (with a preliminary experimental flow), respectively. Using partitioning, the SPFD-GR-based algorithm scales well to large circuits. Our future work will be: 1) applying SPFD-GR methods to post-placement logic synthesis of LUT-based FPGAs for delay reduction; 2) finding heuristics to help us choose candidate alternative wires more efficiently, so as to improve the speed of SPFD-GR; and 3) using BDD variable ordering to speed up the BDD calculation.

Acknowledgement

This work is partially supported by the Gigascale Silicon Research Center (GSRC) and the California MICRO program with Actel, Altera, Lucent and Xilinx. We thank Prof. Tim Cheng and Ric Huang of UC Santa Barbara for providing the ATPG rewiring engine, and S. Yamashita, H. Sawada and A. Nagoya of NTT Corp., Japan, for their binary code of the original SPFD program [18] for experimental purposes. We also
thank Sung Lim of UCLA for adapting his partitioning algorithm [8] for the needs of this work. We also thank Prof. Robert Brayton of UC Berkeley for his stimulating discussions on the SPFD technique during various GSRC workshops.

References

[1] L. A. Entrena and K.-T. Cheng. Combinational and Sequential Logic Optimization by Redundancy Addition and Removal. IEEE Transactions on CAD of ICS, Vol. 4, No. 7, pp. 99-96, July 1995.
[2] R. K. Brayton. Understanding SPFDs: A New Method for Specifying Flexibility. In International Workshop on Logic Synthesis, 1997.
[3] C.-W. Chang, C.-K. Cheng, P. Suaris, and M. Marek-Sadowska. Fast Post-placement Rewiring Using Easily Detectable Functional Symmetries. In Design Automation Conference, p. 86-89.
[4] S.-C. Chang, K.-T. Cheng, N.-S. Woo, and M. Marek-Sadowska. Postlayout Rewiring Using Alternative Wires. IEEE Trans. on CAD of ICS, Vol. 6, No. 6, p. 587-96, June 1997.
[5] S.-C. Chang, L. V. Ginneken, and M. Marek-Sadowska. Circuit Optimization by Rewiring. IEEE Transactions on Computers, Vol. 48, No. 9, pp. 96-97, September 1999.
[6] P. Chong, Y. Jiang, S. Khatri, F. Mo, S. Sinha, and R. Brayton. Don't Care Wires in Logical/Physical Design. In International Workshop on Logic Synthesis, pp. - 9.
[7] J. Cong and Y. Hwang. Simultaneous Depth and Area Minimization in LUT-Based FPGA Mapping. Proc. ACM 3rd Int'l Symp. on FPGA, Feb. 1995, pp. 68-74.
[8] J. Cong and S. K. Lim. Edge Separability based Circuit Clustering With Application to Circuit Partitioning. IEEE/ACM Asia South Pacific Design Automation Conference, p. 49-44.
[9] J. Cong, J. Peck, and Y. Ding. RASP: A General Logic Synthesis System for SRAM-based FPGAs. In Proc. ACM/SIGDA Int'l Symp. on FPGAs, p. 7-4, Feb. 1996.
[10] R. Huang, Y. Wang, and K.-T. Cheng. LIBRA - a library-independent framework for post-layout performance optimization. In International Symposium on Physical Design, p. 5-4, 1998.
[11] J.-M. Hwang, F.-Y. Chiang, and T.-T. Hwang. A Re-engineering Approach to Low Power FPGA Design Using SPFD. In Design Automation Conference, p. 7-75, 1998.
[12] Y.-M. Jiang, A. Krstic, K.-T. Cheng, and M. Marek-Sadowska. Post-layout rewiring for performance optimization. In Design Automation Conference, p. 66-665, 1997.
[13] W. Kunz and P. R. Menon. Multilevel Logic
Optimization by Implication Analysis. In International Conference on Computer Aided Design, p. 6-
[14] E. Sentovich, et al. SIS: A System for Sequential Circuit Synthesis. Memorandum No. UCB/ERL M92/41, Dept. EECS, UC Berkeley, 1992.
[15] S. Sinha and R. K. Brayton. Implementation and Use of SPFDs in Optimizing Boolean Networks. In International Conference on Computer Aided Design, p., 1997.
[16] F. Somenzi. CUDD: CU Decision Diagram Package Release. Technical Report, Dept. of ECE, Univ. of Colorado at Boulder, 1998.
[17] Y. Wu, W. Long and H. Fan. A Fast Graph-Based Alternative Wiring Scheme For Boolean Networks. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E8-A, No. 6, p. -7.
[18] S. Yamashita, H. Sawada and A. Nagoya. A New Method to Express Functional Permissibilities for LUT based FPGAs and Its Applications. In International Conference on Computer Aided Design, p. 54-6, 1996.