

Circuit Partitioning for Dynamically Reconfigurable FPGAs

Huiqun Liu and D. F. Wong
Department of Computer Sciences, University of Texas at Austin, Austin, TX 78712
Email: {hqliu, wong}@cs.utexas.edu

This work was partially supported by the Texas Advanced Research Program under Grant No. 00365888.

Abstract

Dynamically reconfigurable FPGAs have the potential to dramatically improve logic density by time-sharing a physical FPGA device. This paper presents a network-flow based partitioning algorithm for dynamically reconfigurable FPGAs based on the architecture in [2]. Experiments show that our approach outperforms the enhanced force-directed scheduling method in [2] in terms of communication cost.

1 Introduction

One of the major benefits provided by FPGAs is the ability of run-time reconfiguration. Currently there is growing interest in dynamically reconfigurable FPGAs (DRFPGA), which have the potential to dramatically improve logic density by time-sharing logic. Several different architectures have been proposed for dynamically reconfigurable FPGAs, such as the Xilinx time-multiplexed FPGA configuration model [1], the dynamically reconfigurable FPGA [2], Dharma [6], the Dynamically Programmable Gate Array [7, 8] and the Virtual Element Gate Array [9]. These dynamically reconfigurable FPGAs allow the dynamic reuse of the logic blocks and wire segments by having more than one on-chip SRAM bit controlling them. Each on-chip configuration is called a context, and a device with more than one context is a multi-context device. A large logic design is partitioned into multiple stages to share the same physical device in a time-multiplexed fashion. This is analogous to a virtual memory system: where a program can be larger than the actual size of the physical memory, a dynamically reconfigurable FPGA allows a virtually large logic design to be implemented on a smaller physical device.

For a dynamically reconfigurable FPGA, a circuit is partitioned into $k$ stages (or partitions), such that the logic in different stages temporally shares the same physical FPGA device (Figure 1). Each stage is called a micro-cycle and the $k$ micro-cycles form one user cycle. Between the micro-cycles, the logic blocks and interconnect in the FPGA are reconfigured by a different context. One user cycle should produce the same results on the outputs as would be seen by a non-time-multiplexed device.

Figure 1: Temporal partitioning of a circuit for a dynamically reconfigurable FPGA.

The nodes (i.e. LUTs) in the physical FPGA are called the real nodes, while the nodes in any stage (micro-cycle) of the partitioning solution are called the virtual nodes. To fit into a physical device, the number of virtual nodes in any stage should be less than or equal to the number of real nodes. Because the logic blocks and interconnect needed for a circuit are time-multiplexed on a DRFPGA, it is necessary to have a good partitioning strategy to ensure the correctness of the execution, as well as to satisfy both the area and pin constraints for the physical FPGA device. It is also crucial to minimize the number of interconnections in order to reduce the overhead for the placement and routing process.

The partitioning problem for dynamically reconfigurable FPGAs was studied in [1, 2, 3, 16]. The traditional directed-acyclic-graph (DAG) scheduling methods are applied, such as list scheduling [1] and force-directed scheduling [2, 3]. Recently [16] proposed a network-flow based approach for multi-way precedence constrained partitioning based on the Xilinx time-multiplexed FPGA architecture [1], and [16] achieved better results than the list scheduling heuristic in [1] in terms of minimizing the communication cost. However, the network-flow based approach in [16] can not be used to solve the partitioning problem for [2], since the architecture in [2] is different from that of [1] and imposes different constraints on the partitioning problem.

In this paper, we focus on partitioning a large logic design into dynamically reconfigurable FPGAs based on the architecture proposed in [2]. We present a network-flow based approach for multi-way partitioning. We show how to correctly model the nets in both combinational and sequential circuits, so that by the max-flow computation, the min-cut corresponds to the number of communications required. An $\epsilon$-bounded bipartitioning algorithm is presented and then it is iteratively applied to partition a netlist into multiple stages, so that each stage can temporally share the same FPGA device. Experimental results show that our approach outperforms the enhanced force-directed scheduling in [2] in terms of communication cost.

The organization of the paper is as follows. In Section 2, we give a brief summary of the time-multiplexed communicating logic model proposed in [2]. In Section 3, we introduce the problem formulation of the partitioning for dynamically reconfigurable FPGAs. In Section 4 we first present the net modeling method for both combinational and sequential circuits, then present a network-flow based approach for bipartitioning. Section 5 explains the multi-way partitioning algorithm for dynamically reconfigurable FPGAs. Experimental results are discussed in Section 6.

2 Model of Dynamically Reconfigurable FPGA

For dynamically reconfigurable FPGAs, the communication cost, which is the storage needed for buffering a signal from the time it is created to the time it is no longer needed, creates considerable overhead. Different architectures have been proposed for storing the communication values among the micro-cycles [1, 2], and they impose different constraints on the partitioning problem. For Xilinx's architecture [1], the signal from a cut net is stored in on-chip micro-registers, and [1] requires that the precedence constraints be satisfied in order to guarantee the correctness. In this paper, we specifically examine the partitioning problem for the dynamically reconfigurable FPGA architecture proposed in [2].

[2] presented a gate-level model for DRFPGA computation called the time-multiplexed communicating logic (TMCL) model (Figure 2). This model consists of two parts. First, there is a finite set of combinational logic units (CLUs) $\{C_1, C_2, \ldots, C_k\}$, where each $C_i$ contains a set of logic blocks (e.g. LUTs). Secondly, there is a finite set of memory elements (MEs) $\{M_1, M_2, \ldots, M_m\}$, which can be used to store values for communication between the CLUs. A circuit is partitioned into $(\{C_1, \ldots, C_k\}, \{M_1, \ldots, M_m\})$, with the execution sequence being $C_1, \ldots, C_k$. Each $C_i$ is a subcircuit to execute at a different micro-cycle on the DRFPGA. Each $C_i$ plus the MEs needed for $C_i$ will be called context $i$ since it corresponds to the $i$-th context on the DRFPGA. The MEs needed for $C_i$ are those that are pseudo primary inputs or pseudo primary outputs of $C_i$. One clock cycle of context $i$ proceeds as follows. First, read the needed pseudo primary input signals from the memory for $C_i$, and read the primary input signals from the input pads. Second, propagate signal values through $C_i$. Third, latch the primary output signals into the output pads, and store the pseudo primary output signals for $C_i$ into memory.

Figure 2: Model of time-multiplexed communicating logic.

In this model, a circuit is represented by $G = (V, N_c, N_f)$. Each node $v \in V$ is a gate. The nets are classified into two types, $N_c$ and $N_f$. A net $n = \{v_1, \ldots, v_p\} \in N_c$ if $v_1$ is the input to the other nodes in this net. A net $n = \{v_1, \ldots, v_p\} \in N_f$ if there is a flip-flop (FF) between $v_1$ and the rest of the nodes in net $n$, i.e. $v_1$ is the input to an FF and the FF is the input to the other nodes $v_2, \ldots, v_p$. Figure 3 shows part of a sequential circuit and its conversion to a net $\{v_1, v_2, v_3\}$ in $N_f$. If there are adjacent FFs in the circuit, then dummy gates are added between adjacent FFs.

Figure 3: Example of a net in $N_f$.

We define $s(v)$ to be the stage to which node $v$ is assigned in the partitioning solution. In a combinational circuit or the combinational part of a sequential circuit, the nodes in a net in $N_c$ must follow the precedence constraints, such that if node $v$ is the input of $u$, then $v$ must be scheduled in a stage no later than $u$, i.e. $s(v) \le s(u)$. If a net $n = (v_1, \ldots, v_p) \in N_c$ is cut, such that $\exists v_j \in n$, $s(v_1) < s(v_j)$, then an ME (memory element) is used to store the value between $v_1$ and $v_j$ (Figure 4). The output of $v_1$ will be stored in the ME and read by $v_j$ in a later micro-cycle within the current user clock cycle. Thus this ME is used for communication within a user cycle.

Figure 4: For a net in $N_c$, if $s(v_1) < s(v_j)$, one memory element is used to store the communication value for a combinational net.

For a sequential circuit, memory elements are used for passing values to different micro-cycles of the same user cycle or to the next user cycle. The nodes in a net in $N_f$ can be in any order, but the different ordering will result in a different number of memory elements required. There are the following two cases.

First, for a net $n = (v_1, \ldots, v_p) \in N_f$, if $s(v_1) \ge s(v_j)$ ($v_1, v_j \in n$), then an ME is inserted between $v_1$ and $v_j$ (Figure 5). The signal from $v_1$ is stored in the ME and will not be used until the next user cycle. Thus this ME is used for communication between user clock cycles.

Figure 5: If $s(v_1) \ge s(v_j)$, then one memory element is used to store the communication value.

Secondly, for a net $n \in N_f$, if $s(v_1) < s(v_j)$ ($v_1, v_j \in n$), two MEs must be inserted between $v_1$ and $v_j$ (Figure 6). We refer to this as the two-ME situation. Two MEs are needed because the first acts as storage for communication within the current user clock cycle. A second ME is needed to store the value between user clock cycles. The second ME can be in any stage later than $s(v_j)$.

Figure 6: If $s(v_1) < s(v_j)$, then two memory elements are used to store the communication value.

Here we further improve the model in [2]. While [2] assumed each net is a two-terminal net, we consider the more general case where a net can be either a two-terminal or a multi-terminal net. This TMCL model is different from the Xilinx model [1]. In [1], no memory elements are needed to store the communication values, and micro-registers are used to save values to pass to a later micro-cycle or the next user cycle. Each cut net, including combinational and sequential nets, must be uni-directional to satisfy the precedence constraints in order to guarantee the correct execution. However, in the model of [2], the sequential nets do not need to be uni-directional, but the different ordering of the nodes will cause different communication cost. Therefore, the different architectures impose different constraints on the partitioning process.

3 Problem Statement

A circuit can be represented by a hypergraph $G = (V, N)$, where $V$ is a set of nodes, $N$ is a set of nets where each net is a subset of nodes which are interconnected, and $N = N_c \cup N_f$. Each node $v$ in $V$ has an area $w(v)$, and the area of a subset $U$ of $V$, denoted by $w(U)$, is the total area of all the nodes in $U$. For a net $n = \{v_1, \ldots, v_p\}$ with $p$ nodes, let $v_1$ be the input to $v_j$ ($2 \le j \le p$), and $v_j$ ($2 \le j \le p$) be the output of $v_1$. If a net only connects two nodes (i.e. $p = 2$), then it is a two-terminal net; if it connects more than two nodes (i.e. $p > 2$), then it is a multi-terminal net.
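The memory-element rules of the TMCL model described in Section 2 can be sketched as a small function. This is only an illustrative sketch: the function name `net_memory_elements`, the `stages` dictionary, and the `has_ff` flag are hypothetical names introduced here, not part of the paper.

```python
def net_memory_elements(stages, net, has_ff):
    """Number of memory elements (MEs) one net needs under the TMCL rules.

    stages : dict mapping node -> stage index s(v)   (hypothetical layout)
    net    : sequence (v1, v2, ..., vp); v1 drives the other nodes
    has_ff : True for a net in N_f (flip-flop after v1), False for N_c
    """
    driver, sinks = net[0], net[1:]
    s1 = stages[driver]
    # Is any driven node scheduled in a strictly later stage than v1?
    later = any(stages[v] > s1 for v in sinks)
    if not has_ff:
        # Combinational net: precedence requires s(v1) <= s(vj) for all j.
        if any(stages[v] < s1 for v in sinks):
            raise ValueError("precedence constraint violated")
        return 1 if later else 0
    # Sequential net: two MEs if s(v1) < s(vj) for some j, otherwise one.
    return 2 if later else 1


stages = {"a": 1, "b": 2, "c": 1}
print(net_memory_elements(stages, ("a", "b"), has_ff=False))  # cut combinational net
print(net_memory_elements(stages, ("a", "b"), has_ff=True))   # s(v1) < s(vj)
print(net_memory_elements(stages, ("b", "a"), has_ff=True))   # s(v1) >= s(vj)
```

Summing this value over the nets that touch a stage gives one possible way to tally the memory elements the model charges to that stage.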
Based on the TMCL model, the partitioning problem for dynamically reconfigurable FPGAs is to partition a circuit $G = (V, N)$ into $k$ non-overlapping subsets $V_1, V_2, \ldots, V_k$, subject to:

1. $V = \bigcup_{i=1}^{k} V_i$;
2. Precedence constraints, i.e. for a net $n = \{v_1, \ldots, v_p\} \in N_c$, $s(v_1) \le s(v_j)$ for $2 \le j \le p$;
3. Timing constraint: the number of levels of nodes in any stage is less than $D$.

The objective is to:

1. minimize the maximum communication cost for any stage;
2. minimize the maximum area of any stage, i.e. minimize $\max\{w(V_i) \mid 1 \le i \le k\}$.

The precedence constraints guarantee the correctness of the execution, and the timing constraint allows the design to run as fast as possible. The communication cost for a stage is the total number of memory elements to be used by this stage. The communication cost $c_n$ of a net $n$ is measured as follows. For a net $n = \{v_1, \ldots, v_p\} \in N_c$, if it is cut such that $\exists v_j$, $s(v_1) < s(v_j)$, then the communication cost is 1; otherwise, if all the nodes in net $n$ are in the same stage, then the communication cost is 0. Notice that the precedence constraints require that $s(v_1) \le s(v_j)$ for $v_1, v_j \in n$.

$$c_n = \begin{cases} 1, & \text{if } \exists j \text{ such that } s(v_1) < s(v_j) \\ 0, & \text{if all the nodes are in the same stage} \end{cases}$$

For a net $n \in N_f$, the different ordering of the nodes will result in different communication cost, as discussed in Section 2.

$$c_n = \begin{cases} 1, & \text{if } s(v_1) \ge s(v_j) \\ 2, & \text{if } s(v_1) < s(v_j) \end{cases}$$

For $k$-way partitioning, it is desirable to balance the total area among the stages so that the design can fit into a smaller device, i.e. to have the area of each stage be close to the average $w(V)/k$. We define $\epsilon$-bounded bipartitioning to be partitioning a set of nodes $V$ into two subsets $(X, \bar{X})$ so that $w(X)$ is as close to $\alpha$ as possible, i.e. $(1 - \epsilon)\alpha \le w(X) \le (1 + \epsilon)\alpha$. Here $\epsilon$ is the deviation factor with $0 < \epsilon < 1$, e.g. $\epsilon = 0.05$. The $k$-way partitioning problem can be reduced to finding $k - 1$ $\epsilon$-bounded bipartitionings.

4 Network-Flow Based Bipartitioning

The network-flow technique is well known for finding a min-cut due to the max-flow min-cut theorem [5]. FBB [14] applied repeated max-flow min-cut computation to find a min-net-cut for balanced circuit bi-partitioning. But [14] did not consider the ordering of the nodes. In our partitioning problem, the nodes in the same net are not symmetric. For a combinational net in $N_c$, the nodes must satisfy the precedence constraints. For a sequential net in $N_f$, the ordering of nodes influences the number of memory elements used for communication. [16] applied the network-flow technique to multi-way partitioning for time-multiplexed FPGAs based on the Xilinx architecture [1]. A net modeling method is given in [16] to build a network $G'$ from the netlist $G$, so that a min-cut in $G'$ corresponds to a uni-directional net cut in $G$ satisfying the precedence constraints. However, since the TMCL model [2] uses a different architecture than [1] for storing the communication, the net modeling for sequential circuits in [16] can not be applied here. In the following sections, we present net modeling for two-terminal and multi-terminal nets in combinational and sequential circuits based on the TMCL model. Then we present a network flow based approach for finding an $\epsilon$-bounded bipartitioning.

4.1 Net Modeling for Combinational Circuits

A proper net modeling for combinational circuits must meet two requirements: (1) it correctly models a net cut, so that a net is counted exactly once if it is cut; (2) it correctly models the precedence constraints among the nodes, i.e. a net-cut must be uni-directional. A uni-directional cut is a two-way partitioning $(X, \bar{X})$ such that for any net $n = \{v_1, \ldots, v_p\} \in N_c$, either all the nodes in $n$ are in the same subset, or $v_1$ is in $X$. If we let $X$ be an earlier stage than $\bar{X}$, then it is easy to prove that a uni-directional cut satisfies the precedence constraints. We construct a network $G' = (V', N')$ from $G$ by the following net modeling of nets in $N_c$ (Figure 7).

1. All the nodes in $V$ are in $V'$, i.e. $V \subseteq V'$, and each node in $V'$ has the same area as in $V$.
2. For a two-terminal net $v_1 \to v_2$ in $N_c$, add a bridging edge $v_1 \to v_2$ in $G'$ with capacity 1, and add an edge $v_2 \to v_1$ in $G'$ with capacity $\infty$ (Figure 7(a)).
3. For a multi-terminal net $n = \{v_1, \ldots, v_p\}$ with $p > 2$, let $v_1$ be the input to all other nodes in $n$. Add a node $x$ in $G'$ with $w(x) = 0$. Add a bridging edge from $v_1$ to $x$ with capacity 1, and add an edge from $x$ to each node $v_j$ ($2 \le j \le p$) with capacity $\infty$. Add an edge from each node $v_j$ ($2 \le j \le p$) to $v_1$ with capacity $\infty$ (Figure 7(b)).

Figure 7: Net modeling for a two-terminal net (a) and a multi-terminal net (b) in $N_c$.

Here we distinguish the net modeling of two-terminal nets and multi-terminal nets, because for two-terminal nets we add fewer edges and nodes, which will reduce the size of the network and speed up the max-flow computation. For each net, exactly one bridging edge $v_1 \to x$ with capacity 1 is added, and all the other edges have capacity $\infty$. Notice that nodes in the same net are asymmetric: the bridging edge starts from $v_1$ and there is an edge with capacity $\infty$ from $v_j$ ($2 \le j \le p$) to $v_1$.

After the max-flow computation on the constructed network $G'$, for a min-cut $(X, \bar{X})$, all the forward edges from $X$ to $\bar{X}$ must be saturated (i.e. flow equals the capacity) and all the backward edges from $\bar{X}$ to $X$ have zero flow. If a net is cut, then only the bridging edge $v_1 \to x$ will be the forward cut edge from $X$ to $\bar{X}$, and therefore $v_1$ must be in $X$, which preserves the precedence constraints. Since the capacity on the bridging edge $v_1 \to x$ is one, the cut net contributes exactly 1 to the total cut size. Figure 8 shows an example of how to get the corresponding net cut in $G$ from a cut in $G'$. Lemma 1 shows the correctness of the above net modeling for combinational circuits.

Lemma 1: The min-cut size in $G'$ equals the minimum number of uni-directional cut-nets in $G$.

Figure 8: A cut in $G'$ and the corresponding net-cut in $G$.

4.2 Net Modeling for Sequential Circuits

In sequential circuits, for nets in $N_c$, we use the same net modeling as introduced in Section 4.1. For nets in $N_f$, the ordering of the nodes influences the number of memory elements, and therefore the communication cost. If $s(v_1) \ge s(v_j)$ for $v_1, v_j \in n$, then one memory element (ME) will be needed. If $s(v_1) < s(v_j)$, then two memory elements (MEs) will be used. We want to find a min-cut which minimizes the number of memory elements needed. We introduce the following net modeling for nets in $N_f$ (Figure 9).

1. For a two-terminal net $(v_1, v_2)$, add an edge from $v_1$ to $v_2$ with capacity 2, and add an edge from $v_2$ to $v_1$ with capacity 1 (Figure 9(a)).
2. For a multi-terminal net $n = (v_1, \ldots, v_p)$, add two nodes $w_1$ and $w_2$ with $w(w_1) = 0$, $w(w_2) = 0$. Add an edge from $v_1$ to $w_1$ with capacity 2, and an edge from $w_2$ to $v_1$ with capacity 1. Add an edge from $w_1$ to each of the nodes $v_j$ ($2 \le j \le p$) with capacity $\infty$. Add an edge from each node $v_j$ ($2 \le j \le p$) to $w_2$ with capacity $\infty$ (Figure 9(b)).
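The two net models above can be tried out on very small netlists. The following is a minimal sketch, assuming bridging capacities of 1 for nets in $N_c$ and capacities of 2 and 1 for nets in $N_f$; the helper names `build_network` and `min_cut` are hypothetical, and a plain Edmonds-Karp max-flow is swapped in here rather than the incremental flow technique the paper uses.

```python
from collections import defaultdict, deque

INF = float("inf")

def build_network(comb_nets, seq_nets):
    """Capacity map cap[u][v] of G'; each net is a tuple (v1, v2, ..., vp)."""
    cap = defaultdict(lambda: defaultdict(float))

    def add(u, v, c):
        cap[u][v] += c
        cap[v][u] += 0.0  # make sure the reverse residual entry exists

    for idx, (v1, *sinks) in enumerate(comb_nets):
        if len(sinks) == 1:                 # two-terminal net in N_c
            add(v1, sinks[0], 1)
            add(sinks[0], v1, INF)
        else:                               # multi-terminal net in N_c
            x = ("x", idx)                  # zero-area helper node
            add(v1, x, 1)
            for vj in sinks:
                add(x, vj, INF)
                add(vj, v1, INF)
    for idx, (v1, *sinks) in enumerate(seq_nets):
        if len(sinks) == 1:                 # two-terminal net in N_f
            add(v1, sinks[0], 2)
            add(sinks[0], v1, 1)
        else:                               # multi-terminal net in N_f
            w1, w2 = ("w1", idx), ("w2", idx)
            add(v1, w1, 2)
            add(w2, v1, 1)
            for vj in sinks:
                add(w1, vj, INF)
                add(vj, w2, INF)
    return cap

def min_cut(cap, s, t):
    """Edmonds-Karp max-flow; returns (cut size, source-side node set X)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:    # BFS in the residual network
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                 # no augmenting path is left
            return total, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        if push == INF:
            raise ValueError("source and sink are joined by infinite edges")
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        total += push

# A single N_f net cut in both orientations (compare Lemma 2).
print(min_cut(build_network([], [("s", "t")]), "s", "t")[0])  # v1 on the source side
print(min_cut(build_network([], [("t", "s")]), "s", "t")[0])  # v1 on the sink side
```

The set returned alongside the cut size is the source side $X$ of the min-cut, which is what an area check such as $(1 - \epsilon)\alpha \le w(X) \le (1 + \epsilon)\alpha$ would be applied to.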

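Figure 9: Net modeling for a two-terminal net (a) and a multi-terminal net (b) in $N_f$.

For the above net modeling, the different ordering of the nodes will cause different net-cut size, and the cut size reflects the communication cost. Lemma 2 shows the correctness of the above net modeling for nets in $N_f$.

Lemma 2: For a net $n = \{v_1, \ldots, v_p\} \in N_f$, if $v_1 \in X$ and $v_j \in \bar{X}$, then the cut-size is 2; if $v_1 \in \bar{X}$ and $v_j \in X$, then the cut size is 1.

Proof: For a two-terminal net, if $v_1 \in X$ and $v_2 \in \bar{X}$, then edge $v_1 \to v_2$ will be the forward cut edge from $X$ to $\bar{X}$ with cut size 2, which is equal to the capacity on this edge. If $v_1 \in \bar{X}$ and $v_2 \in X$, then $v_2 \to v_1$ will be the forward cut edge with cut size 1. For a multi-terminal net $n \in N_f$, if $v_1 \in \bar{X}$ and $v_j \in X$, then since only $v_1 \to w_1$ and $w_2 \to v_1$ have capacity less than $\infty$ in the net modeling, $w_2 \to v_1$ will be the forward cut edge with $w_2 \in X$. So the cut size is 1. On the other hand, if $v_1 \in X$ and $v_j \in \bar{X}$, then $v_1 \to w_1$ will be the forward cut edge from $X$ to $\bar{X}$. Since the capacity on edge $v_1 \to w_1$ is 2 and the edge is saturated after the max-flow computation, the cut size is 2 in this case. $\square$

Figure 10 shows a cut in the constructed network $G'$ and the corresponding net cut in $G$. In the example of Figure 10(a), if net $n$ is cut and $v_1 \in X$, then the bridging edge $v_1 \to w_1$ is cut and the cut size is 2, which equals the capacity on $v_1 \to w_1$. Figure 10(b) shows that if $v_1 \in \bar{X}$, then the bridging edge $w_2 \to v_1$ will be cut and contributes 1 to the cut size. Figure 11 shows a netlist $G$ and the corresponding network $G'$ after net modeling of the nets in both $N_c$ and $N_f$. Notice that one of the nets is a multi-terminal net belonging to $N_f$.

Figure 10: The corresponding cut in the network $G'$ and the netlist $G$.

Figure 11: Example of net modeling: the netlist $G$ and the network $G'$ after net modeling.

4.3 $\epsilon$-bounded Bipartitioning

By the net modeling, we can build a network $G'$ from the netlist $G$, then apply the repeated max-flow min-cut strategy similar to the algorithm in [16] to find an $\epsilon$-bounded bipartitioning that minimizes the number of crossing nets. First, a network $G'$ is constructed from $G$ by the net modeling discussed in Sections 4.1 and 4.2. Next, a source $s$ and a sink $t$ are selected. Then by the max-flow computation, the maximum amount of flow is pushed from the source to the sink, and a min-cut $(X, \bar{X})$ is found in $G'$. If $(1 - \epsilon)\alpha \le w(X) \le (1 + \epsilon)\alpha$, then return $(X, \bar{X})$ as the result. If $w(X) < (1 - \epsilon)\alpha$, then the nodes in $X$ are collapsed to $s$ and a node from $\bar{X}$ is collapsed to $s$, so that in the next iteration more flow can be pushed through the network to explore a different net cut with larger area in $X$.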
If $w(X) > (1 + \epsilon)\alpha$, then all nodes in $\bar{X}$ are collapsed to $t$, and a node from $X$ is collapsed to $t$. Then the max-flow min-cut computation repeats until the area $w(X)$ for subset $X$ is within range.

An incremental flow technique is employed for efficient implementation. It is not necessary to calculate the max-flow from scratch in each iteration. Only additional flow is added through the network from the source to the sink to saturate the bridging edges during the max-flow computation. The time complexity for the repeated max-flow min-cut is asymptotically the same as one max-flow computation, which is $O(|V||E|)$.

Figure 12 shows an example of finding an $\epsilon$-bounded bipartitioning with $\alpha = 6$. In the first iteration, a min-cut is found after the max-flow computation, and $w(X)$ is below the lower bound. A node is then collapsed to $s$ so that more flow can be pushed through the network in the next iteration. In the second iteration, after pushing the max-flow, the min-cut size is unchanged and $w(X) = 3$. Another node from $\bar{X}$ is collapsed to $s$ and $w(X) = 4$. In the third iteration, a min-cut is found and a node is collapsed to $t$ with $w(\bar{X}) = 3$. In the next iteration, after pushing more flow through the network, a min-cut is found that reaches the area limit with $w(X) = 5$. So $(X, \bar{X})$ forms an $\epsilon$-bounded min-cut. We can then find the corresponding net cut in the original netlist $G$.

Figure 12: Example of $\epsilon$-bounded bipartitioning.

5 Multi-way Partitioning

To partition a netlist into $k$ ($k > 2$) stages for a dynamically reconfigurable FPGA, we repeatedly apply the network-flow based bipartitioning algorithm $k - 1$ times to partition the netlist into $k$ stages.

Since the length of the critical path is usually longer than the number of stages, there will be more than one level of nodes in one stage. Let $depth$ be the number of levels of nodes on the critical path in the netlist. When partitioning into $k$ stages, the number of levels in one stage can be $D = depth/k$ in order to make the design as fast as possible. If the timing constraint for each stage is known a priori, then the maximum number of levels in one stage can be calculated accordingly. Besides minimizing the communication cost, our objective also includes minimizing the maximum number of nodes in any stage in order to allow the design to fit into a smaller physical FPGA. It is desirable to make each stage have an area as close to the average, $w(V)/k$, as possible (i.e. $\alpha = w(V)/k$).

Our strategy is to first apply the As-Soon-As-Possible (ASAP) and As-Late-As-Possible (ALAP) scheduling to assign each node a range of feasible stages. Then the nodes on the critical paths are fixed to certain stages, and those nodes on the shorter paths have the flexibility to be put in more than one stage. Network flow based $\epsilon$-bounded bipartitioning is iteratively applied to partition these flexible nodes between stage $i$ and $i + 1$ ($i < k$), with the objective of minimizing the communication cost and balancing the number of nodes in each stage as well. Due to the precedence constraints, the fixed nodes will serve as source and sink when partitioning the flexible nodes.

The partitioning process of Algorithm 1 has four major steps.

Step 1: perform As-Soon-As-Possible (ASAP) and As-Late-As-Possible (ALAP) scheduling. In the ASAP scheduling, each node is assigned to the earliest possible stage by breadth-first search. In the ALAP scheduling, each node is assigned to the latest possible stage. Let $AS(v)$, $AL(v)$ be the earliest and latest stage for node $v$. If $AS(v) = AL(v) = j$, then $v$ must be scheduled in stage $j$. We call $v$ a fixed node. If $AS(v) < AL(v)$, then $v$ can be assigned to any stage from $AS(v)$ to $AL(v)$. We call $v$ a flexible node.

Step 2: let $P_i$ be the subset of nodes fixed to stage $i$ ($1 \le i \le k$), i.e. $P_i = \{v \mid AS(v) = AL(v) = i\}$. Assign all the nodes in $P_i$ to stage $i$. The other unassigned nodes are the flexible nodes which can be put in more than one stage. In our partitioning process, the goal is to assign a stage to each of the flexible nodes while balancing the number of nodes in each stage and minimizing the net-cut size between the stages.

Step 3: iteratively call the network-flow based bipartitioning algorithm to partition the flexible nodes between stages $i$ and $i + 1$ ($1 \le i < k$). For the $i$-th iteration, the details of the partitioning process are as follows.
Algorithm 1: Network-flow based multi-way partitioning for DRFPGAs
begin
  1. perform ASAP and ALAP scheduling;
     for each node $v$, let $AS(v)$ be the earliest stage by the ASAP scheduling;
     let $AL(v)$ be the latest stage by the ALAP scheduling;
  2. for $i = 1$ to $k$ do $P_i = \{v \mid AS(v) = AL(v) = i\}$;
     for $i = 1$ to $k$ do assign all nodes in $P_i$ to stage $i$;
  3. for $i = 1$ to $k - 1$ do
     begin
       3.1. $F = \{v \mid AS(v) \le i$, and $v$ is unassigned$\}$;
       3.2. source $s = (\bigcup_{j=1}^{i-1} V_j) \cup P_i$, and $w(s) = w(P_i)$;
            sink node $t = \{v \mid AS(v) > i\}$, and $w(t) = w(P_{i+1})$;
       3.3. construct network $F'$ from $F \cup s \cup t$ by net modeling;
       3.4. find an $\epsilon$-bounded bipartitioning $(X, \bar{X})$ in $F'$;
       3.5. assign nodes in $X$ to stage $i$, let $V_i = P_i \cup (X - s)$;
       3.6. for $v \in F$ with $AL(v) = i + 1$, assign $v$ to stage $i + 1$;
     end
  4. Optimally determine the locations of the second MEs.
end

In step 3.1, all the flexible nodes that can be put in stage $i$ form a subset $F = \{v \mid AS(v) \le i$, and $v$ is unassigned$\}$. Notice that nodes in $F$ can either be put in stage $i$ or $i + 1$. This step guarantees that no node is assigned to a stage earlier than $AS(v)$. In step 3.2, the source $s$ and sink $t$ of the network are decided. The source is a subset of nodes with $s = (\bigcup_{j=1}^{i-1} V_j) \cup P_i$ and $w(s) = w(P_i)$. The source $s$ contains all the nodes assigned to stages prior to $i$ and the fixed nodes in $P_i$. The sink is $t = \{v \mid AS(v) > i\}$ with $w(t) = w(P_{i+1})$; $t$ contains the nodes which can only be put in a stage later than $i$. In step 3.3, a network $F'$ is constructed from $F \cup s \cup t$ by the net modeling. In step 3.4, the $\epsilon$-bounded bipartitioning algorithm is applied on $F'$ to find a min-cut $(X, \bar{X})$. Then in step 3.5, the nodes in $X$ are assigned to stage $i$, such that $V_i = P_i \cup (X - s)$. Next in step 3.6, all the unassigned nodes with $AL(v) = i + 1$ are assigned to stage $i + 1$, i.e. $P_{i+1} = P_{i+1} \cup \{v \mid AL(v) = i + 1\}$. This step is to guarantee that any node $v$ is assigned to a stage no later than $AL(v)$. Then $i$ is incremented by 1 and control goes to step 3.1 to start the next iteration. Since the nodes on the critical path are fixed to a stage for the partitioning process, $P_i \ne \emptyset$, and the source $s$ and sink $t$ are known in each iteration. With fixed source and sink, the network-flow based approach is capable of finding a good min-cut.

Step 4: after the $k$-way partitioning is done, the locations of the second MEs (the two-ME situation) are determined. These MEs are assigned to the stage which will minimally increase the number of MEs needed. This can be optimally solved by the max-flow based algorithm [2].
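Steps 1 and 2 of Algorithm 1 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `asap_alap_stages` is hypothetical, the input is assumed to be the combinational DAG (driver-to-sink edges of nets in $N_c$ only), and a level is mapped to a stage by integer division by the number of levels per stage $D$.

```python
from collections import defaultdict, deque

def asap_alap_stages(nodes, edges, levels_per_stage):
    """Return (AS, AL, fixed) for a DAG given as (driver, sink) edge pairs.

    AS[v] / AL[v] are 1-based earliest / latest stages; 'fixed' holds the
    nodes with AS[v] == AL[v]. Level-to-stage mapping is an assumption.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    # Topological order by breadth-first traversal from the inputs (Kahn).
    indeg = {v: len(pred[v]) for v in nodes}
    queue = deque(v for v in nodes if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # ASAP level: longest distance from any input (0-based).
    asap = {}
    for v in order:
        asap[v] = max((asap[u] + 1 for u in pred[v]), default=0)
    depth = max(asap.values()) + 1          # levels on the critical path

    # ALAP level: as late as the successors allow.
    alap = {}
    for v in reversed(order):
        alap[v] = min((alap[u] - 1 for u in succ[v]), default=depth - 1)

    AS = {v: asap[v] // levels_per_stage + 1 for v in nodes}
    AL = {v: alap[v] // levels_per_stage + 1 for v in nodes}
    fixed = {v for v in nodes if AS[v] == AL[v]}
    return AS, AL, fixed


nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "d")]  # a-b-c-d is critical
AS, AL, fixed = asap_alap_stages(nodes, edges, levels_per_stage=2)
print(sorted(fixed), AS["e"], AL["e"])
```

In this small example the four nodes on the critical path end up fixed, while node `e` remains flexible between stages 1 and 2, so it is the only node that the bipartitioning in step 3 would have to place.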

By Lemma 1, it is easy to prove that the partitioning solution of Algorithm 1 satisfies the precedence constraints among all the nodes. The following Lemma 3 shows that Algorithm 1 also satisfies the timing constraint. Given the maximum number of levels in each stage, $AS(v)$ and $AL(v)$ are the earliest and latest stage that node $v$ can be assigned to by the ASAP and ALAP scheduling. To satisfy the timing constraint, $v$ must be in a stage within the range $[AS(v), AL(v)]$.

Lemma 3: The partitioning solution of Algorithm 1 satisfies the timing constraint, such that for any node $v$, $AS(v) \le s(v) \le AL(v)$.

Proof: If $AS(v) = AL(v) = j$, then $v \in P_j$, $v$ is fixed to stage $j$, and $AS(v) = s(v) = AL(v)$. If $AS(v) < AL(v)$, $v$ is a flexible node. By the construction of $F$ in step 3.3, $v$ is in $F$ only in the $i$-th iteration when $AS(v) \le i$, and $v$ can only be assigned to a stage either in step 3.5 or 3.6. If $v \in X$, then $v$ is assigned to stage $i$ in step 3.5, else $v$ is assigned to stage $AL(v)$ by step 3.6. In both cases, $AS(v) \le s(v)$. Step 3.6 guarantees that $v$ is assigned to a stage no later than $AL(v)$, so $s(v) \le AL(v)$. Therefore, $AS(v) \le s(v) \le AL(v)$. $\square$

The ASAP and ALAP scheduling in step 1 takes $O(|V|)$ time. Each iteration in step 3 takes $O(|V||E|)$ time, so the time complexity for the $k - 1$ iterations is $O(k|V||E|)$. The time complexity for Algorithm 1 is $O(k|V||E|)$.

Table 1: Characteristics of the netlists in our experiment

Circuit   # LUT   # DFF   # Nets   Depth
s5378     4       6       590      0
s9234     37      35      46       3
s13207    688     453     4
s15850    056     540     570      3
s35932    756     78      455      6
s38417    3458    464     4894     8
s38584    3545    94      4793     0

6 Experimental Results

We implemented Algorithm 1 in C++ on an Intel Pentium-Pro of 200 MHz with 32 MB memory, and experimented on the same netlists as in [2]. These netlists are derived by technology-mapping the ISCAS'89 sequential benchmark circuits into 4-LUTs. Table 1 shows the characteristics of these netlists. Columns 2 to 4 show the number of LUTs, FFs and nets in each netlist. In column 5, depth refers to the number of levels of nodes on the critical path. In the experiment, we perform multi-way partitioning into 2, 4 and 8 stages. We fix the number of levels in each stage to be $depth/k$, with $k$ being the number of stages. In Table 2, we compare the communication cost with traditional force-directed scheduling (FDS) and enhanced force-directed scheduling (EFDS) [2] when partitioning into 8 stages. $T_{CM}$ is the maximum number of MEs required for any stage. Notice that in our algorithm, the LUTs are balanced among the stages. Table 2 also shows the CPU time of Algorithm 1.
From the experiment, our network-flow based method outperforms both the FDS and enhanced FDS methods, with an average improvement of 4.5% and 5% respectively. Table 3 compares the communication cost of Algorithm 1 with the enhanced FDS method [2] while partitioning into 2, 4 and 8 stages. Our method consistently performs better, with an average improvement of 4.0%, 0.5% and 5.4% respectively. Since our net modeling does not favor the two-ME situation, our experiments also show that the number of second MEs is relatively low and usually does not add to the overall communication cost.

The experimental results show that with proper net modeling, the network-flow based partitioning approach can handle the scheduling tasks. The net modeling correctly models the precedence constraints in combinational circuits, and the cut-size reflects the communication cost in both combinational and sequential circuits. The iterative max-flow min-cut computation balances the number of nodes in each stage.

Table 2: Comparing partitioning results with FDS and enhanced FDS (EFDS)

Circuit   FDS T_CM   EFDS [2] T_CM   ours T_CM   Imp. over FDS   Imp. over EFDS   CPU (sec.)
s5378     4          98              76          46.%            .4%              0.7
s9234     5          80              64          44.3%           0.0%             0.09
s13207    500        73              4           7.6%            7.9%             0.5
s15850    384        5               0           45.3%           6.7%             0.8
s35932    733        664             656         0.5%            .%               5.56
s38417    88         743             63          8.5%            5.%              9.63
s38584    004        744             56          44.%            4.6%             .3
Average                                          4.5%            5.4%

Table 3: Comparing the communication cost with enhanced FDS (EFDS)

          T_CM (n = 2)              T_CM (n = 4)              T_CM (n = 8)
Circuit   EFDS [2]  ours  Imp.      EFDS [2]  ours  Imp.      EFDS [2]  ours  Imp.
s5378     59        3     6.9%      7         09    4.%       98        76    .4%
s9234     30        6     3.%       07        00    6.5%      80        64    0.0%
s13207    305       88    5.6%      37        9     3.4%      73        4     7.9%
s15850    400       336   6.0%      335       85    4.9%      5         0     6.7%
s35932    47        308   .%        094       067   .5%       664       656   .%
s38417    34        38    4.5%      49        875   3.8%      743       63    5.%
s38584    054       76    3.%       80        66    8.4%      744       56    4.6%
Average                   4.0%                      0.5%                      5.4%

7 Conclusion

Dynamically reconfigurable FPGAs (DRFPGA) have the potential to dramatically improve logic density by time-sharing logic, and have become an active research area for reconfigurable computing. The different DRFPGA architectures impose different requirements on the partitioning problem. We presented a network-flow based method for multi-way partitioning for dynamically reconfigurable FPGAs based on the TMCL model in [2]. We first give net modeling for both combinational and sequential circuits so that by the max-flow computation on the constructed network, the min-cut size reflects the number of communications required. We then present a repeated max-flow min-cut based approach to find an $\epsilon$-bounded bipartitioning. Algorithm 1 iteratively applies the bipartitioning algorithm to find a multi-way partitioning. The experiments show that the network-flow based algorithm outperforms the enhanced force-directed scheduling method [2].

8 Acknowledgment

We thank Dr. Douglas Chang for kindly providing us with the data for the experiment.

References

[1] Steve Trimberger, "Scheduling Designs into a Time-Multiplexed FPGA", International Symposium on Field Programmable Gate Arrays, Feb. 1998.
[2] Douglas Chang and Malgorzata Marek-Sadowska, "Partitioning Sequential Circuits on Dynamically Reconfigurable FPGAs", International Symposium on Field Programmable Gate Arrays, Feb. 1998.
[3] Douglas Chang and Malgorzata Marek-Sadowska, "Buffer Minimization and Time-multiplexed I/O on Dynamically Reconfigurable FPGAs", International Symposium on Field Programmable Gate Arrays, Feb. 1997.
[4] Sasan Iman, Massoud Pedram, Charles Fabian and Jason Cong, "Finding Uni-Directional Cuts Based on Physical Partitioning and Logic Restructuring", 4th International Workshop on Physical Design, 1993.
[5] L.R. Ford and D.R. Fulkerson, "Flows in Networks", Princeton University Press, 1962.
[6] N.B. Bhat, K. Chaudhary and E.S. Kuh, "Performance-oriented fully routable dynamic architecture for a field programmable logic device", Memorandum No. UCB/ERL M93/4, University of California, Berkeley, 1993.
[7] Jeremy Brown, Derrick Chen, et al., "DELTA: Prototype for a first-generation dynamically programmable gate array", Transit Note, MIT, 1995.
[8] Andre DeHon, "DPGA-coupled microprocessors: Commodity ICs for the early 21st century", in IEEE Workshop on FPGAs for Custom Computing Machines, 1994.
[9] D. Jones and D.M. Lewis, "A time-multiplexed FPGA architecture for logic emulation", in IEEE Custom Integrated Circuits Conference, 1995.
[10] R. S. Tsay, E. S. Kuh and C. P. Hsu, "Proud: A sea-of-gates placement algorithm", in Proceedings of the IEEE International Conference on Computer Aided Design, pp. 318-323, Nov. 1988.
[11] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, 1979.
[12] C. M. Fiduccia and R. M. Mattheyses, "A linear-time heuristic for improving network partitions", in Proceedings of the 19th Design Automation Conference, pp. 175-181, June 1982.
[13] B. W. Kernighan and S. Lin, "An efficient heuristic procedure for partitioning graphs", IEEE Transactions on Computers, pp064-068, Nov. 1978.
[14] Honghua Yang and D. F. Wong, "Efficient Network Flow Based Min-Cut Balanced Partitioning", Proc. ICCAD, 1994.
[15] Xilinx, The Programmable Logic Data Book, 1996.
[16] Huiqun Liu and D. F. Wong, "Network flow based circuit partitioning for time-multiplexed FPGAs", International Conference on Computer Aided Design, San Jose, CA, Nov. 1998.