Statistical Circuit Optimization Considering Device and Interconnect Process Variations
I-Jye Lin, Tsui-Yee Ling, and Yao-Wen Chang
The Electronic Design Automation Laboratory, Department of Electrical Engineering, National Taiwan University
March 17, 2007
NTUEE

Outline
- Introduction
- Deterministic Algorithm
- Statistical Algorithm
- Experimental Results
- Conclusions

Interconnect Process Variation
- Interconnect delay and reliability strongly affect VLSI performance.
- The variability of interconnect parameters will rise to as much as 35% (Srivastava et al., Springer, 2005).
- Worst-case corner models cannot capture the worst-case variations in interconnect delay (Liu et al., DAC 2000).
- Interconnect optimization guided by statistical analysis techniques has become an inevitable trend (Visweswariah, SLIP 2006).

Previous Work in Statistical Optimization
- Statistical gate sizing with timing constraints using Lagrangian relaxation (Cho et al., DAC 2005).
- Statistical power minimization by delay budgeting using second-order conic programming (Orshansky et al., DAC 2005).
- Statistical gate sizing using geometric programming (Patil et al., ISQED 2005).
- No statistical optimization work considers both interconnect and device sizing.

Comparison with Previous Work

| | Sizing variable | Delay model | Objective | Constraint |
|---|---|---|---|---|
| Orshansky's work (DAC 2005) | Gate only | Linear model (linear term) | Power | Timing |
| Our work | Gate and wire | Elmore delay model (nonlinear term) | Area | Power, timing, thermal |

Because of the nonlinear term introduced by the Elmore delay model, optimization using both gate and wire sizing is much harder to solve.

Delay Model
Our delay model and timing constraint:
- Elmore delay model: $D_i = R_g (C_w X_w L_w + C_g X_i)/X_j + R_w L_w (C_w X_w L_w/2 + C_g X_i)/X_w$
- Timing constraint: $a_i \ge a_j + D_i$, where $a_i$ is the arrival time of gate $i$.
- Higher-order (quadratic) terms!
Delay model and timing constraint used in previous work (DAC 2005):
- $a_j \ge a_i + d_i^0 + d_i$ (linear terms!)
- $d_i^0$: delay due to the sizing for maximum slack
- $d_i$: slack added to node $i$ due to the loading $j$

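The Elmore delay expression on this slide can be evaluated directly. The sketch below is only an illustration: the function name, argument names, and the unit-valued inputs are assumptions, not values from the talk.

```python
def elmore_delay(r_g, c_g, r_w, c_w, l_w, x_w, x_i, x_j):
    """Stage delay D_i for driving gate j, wire w, and loading gate i."""
    # Driver resistance (scaled by 1/X_j) charging the wire and the load gate.
    gate_term = r_g * (c_w * x_w * l_w + c_g * x_i) / x_j
    # Distributed wire resistance charging half its own capacitance plus the load.
    wire_term = r_w * l_w * (c_w * x_w * l_w / 2 + c_g * x_i) / x_w
    return gate_term + wire_term


# Illustrative unit parameters (assumed, not from the slides).
base = elmore_delay(1.0, 1.0, 1.0, 1.0, 2.0, x_w=1.0, x_i=1.0, x_j=1.0)
wide = elmore_delay(1.0, 1.0, 1.0, 1.0, 2.0, x_w=2.0, x_i=1.0, x_j=1.0)
```

With these inputs, widening the wire lowers the wire term but raises the driver term, which shows why the delay is not monotone in the wire width. The ratios $X_w/X_j$ and $X_i/X_w$ are the source of the nonlinearity noted on the slide.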
Statistical Circuit Optimization with SOCP
Second-order conic programming (SOCP):
  Minimize $f^T x$
  subject to $\|A_i x + b_i\|_2 \le c_i^T x + d_i$, $Fx = g$
- Convex optimization
- Theoretical runtime $O(N^{1.3})$ (Orshansky, DAC 2005; Davoodi, DAC 2006)
Second-order conic constraint: $\|A_i x + b_i\|_2 \le c_i^T x + d_i$
- Linear terms!
- Nonlinear (quadratic) terms are not applicable! This motivates the approximation method.

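To make the second-order conic constraint concrete, the following sketch checks whether a given point satisfies $\|Ax + b\|_2 \le c^T x + d$. It uses plain Python lists; the function name and the sample data are assumptions chosen for illustration, and this is a feasibility check only, not a solver.

```python
import math


def in_second_order_cone(A, b, c, d, x):
    """Return True if ||A x + b||_2 <= c^T x + d for the given point x."""
    residual = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
                for row, b_i in zip(A, b)]
    lhs = math.sqrt(sum(r * r for r in residual))
    rhs = sum(c_j * x_j for c_j, x_j in zip(c, x)) + d
    return lhs <= rhs


# Assumed toy data: identity A, zero b and c, so the constraint is ||x||_2 <= 5.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
c = [0.0, 0.0]
d = 5.0
feasible = in_second_order_cone(A, b, c, d, [3.0, 4.0])    # norm is exactly 5
infeasible = in_second_order_cone(A, b, c, d, [4.0, 4.0])  # norm is about 5.66
```

Both sides of the inequality are built from affine functions of $x$, which is why products of sizing variables cannot be placed in this form directly.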
Approximation Method
- Fix the gate size in the timing constraint.
  - This reduces the timing constraint from quadratic order to linear order.
- Approximate the gate sizes by a two-stage flow.
  - Iteratively reduce the approximation errors.
  - The flow is similar to sequential linear programming (SLP).
Flow:
1. Solve the SOCP problem under the current constraints (gate sizes fixed).
2. Update the gate sizes in the timing constraints and form a new SOCP problem.
3. Convergence or max iterations reached? If no, go back to step 1; if yes, finish.

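The fix, solve, and update loop can be sketched as below. This is a minimal outline under stated assumptions: `solve_with_fixed_gates` is a hypothetical callable standing in for the SOCP solve, the relative-change convergence test is an assumed criterion, and `toy_solver` is a made-up contraction used only so the loop can run.

```python
def iterate_sizing(solve_with_fixed_gates, gate_sizes, tol=0.02, max_iters=10):
    """Repeat: fix gate sizes, solve, update; stop on convergence or max_iters."""
    for iteration in range(1, max_iters + 1):
        # Hypothetical solve step: returns new gate sizes given the fixed ones.
        new_sizes = solve_with_fixed_gates(gate_sizes)
        # Assumed convergence measure: largest relative change in any gate size.
        change = max(abs(n - o) / abs(o) for n, o in zip(new_sizes, gate_sizes))
        gate_sizes = new_sizes
        if change <= tol:
            break
    return gate_sizes, iteration


def toy_solver(fixed):
    """Stand-in for the real solve: a contraction whose fixed point is 2.0."""
    return [0.5 * g + 1.0 for g in fixed]


sizes, iters = iterate_sizing(toy_solver, [4.0, 1.0])
```

The defaults of 2% tolerance and 10 iterations are borrowed from the figures quoted in the experimental results, but how the original tool measures convergence is not stated in the slides.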
Our Contributions
- The first work on statistical optimization of both circuit interconnect and devices.
  - Previous work considers only circuit devices (gates).
  - Statistical optimization considering both interconnect and devices is much harder.
- The first work that statistically optimizes the area under thermal- and timing-constrained parametric yields.
  - Most existing statistical optimization considers only timing.
- The first work capable of analytically transforming the statistical RC model into an SOCP.
  - Previous work uses a linear delay model.

Timing Constraint
[Figure: circuit timing graph with source node 0, internal nodes 1 to 16, sink node 17, drivers DR1 to DR3, and loads LC1 and LC2]
Timing constraint: [equation not recovered]
- The number of paths may grow exponentially with the circuit size.
- To reduce the problem size, we distribute the timing information to each node.

Thermal Constraint
- Electromigration (EM) lifetime reliability of metal interconnects is governed by the well-known Black's equation: [equation not recovered]
- The design is reliable when: [condition not recovered]
Symbols:
- TTF: time-to-fail period
- $A^*$: a constant
- $j$: average current density
- $Q$: activation energy
- $K_B$: Boltzmann's constant
- $T_m$: metal temperature
- $j_0$: specific current density
- $T_{ref}$: specific metal temperature

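For reference, Black's equation in its standard textbook form is written below with the slide's symbols. The current-density exponent $n$ (often taken as 2) is not in the slide's symbol list and is an assumption here. The reliability condition shown second is one reading consistent with the listed $j_0$ and $T_{ref}$ (time-to-fail at operating conditions no shorter than at the reference conditions); it is not confirmed by the recovered text.

```latex
\[
\mathrm{TTF} = A^{*}\, j^{-n} \exp\!\left(\frac{Q}{K_B T_m}\right)
\]
\[
\frac{1}{j^{\,n}} \exp\!\left(\frac{Q}{K_B T_m}\right)
\;\ge\;
\frac{1}{j_0^{\,n}} \exp\!\left(\frac{Q}{K_B T_{ref}}\right)
\]
```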
Average Temperature of the Chip
The average temperature of the chip, $T_{avg}$, can be estimated by: [equation not recovered]
- $P_{tot}$: total power consumption of the chip
- $T_{air}$: ambient temperature
- $R_n$: thermal resistance of the substrate and the package
- $A$: chip area
Banerjee et al., ISPD 2001.

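A common estimate that uses exactly the four listed quantities is $T_{avg} = T_{air} + R_n P_{tot} / A$. The sketch below assumes that form, since the slide's own equation did not survive; the function name, the units, and the sample numbers are illustrative assumptions.

```python
def average_chip_temperature(t_air, r_n, p_tot, area):
    """Assumed estimate: T_avg = T_air + R_n * P_tot / A.

    t_air : ambient temperature (deg C)
    r_n   : area-normalized thermal resistance of substrate and package
            (deg C * cm^2 / W, an assumed unit)
    p_tot : total chip power (W)
    area  : chip area (cm^2)
    """
    return t_air + r_n * p_tot / area


# Made-up numbers: 25 deg C ambient, 2 deg C*cm^2/W, 10 W, 1 cm^2.
t_avg = average_chip_temperature(t_air=25.0, r_n=2.0, p_tot=10.0, area=1.0)
```

Under this form, shrinking the area at constant power raises the average temperature, which is why area minimization needs a thermal constraint.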
Power Constraint
- The chip's temperature must be kept under a reasonable bound during the optimization: [equation not recovered]
- For simplicity, consider the dynamic power consumption only.
- $P_B$: the power bound of the gate
- $c_i$: the downstream capacitance of gate $i$
- $\alpha_i$: switching activity of component $i$

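Deterministic Formulation
[Formulation not recovered: an objective over $l_i$ and $x_i$, subject to a thermal constraint, timing constraints, and a power constraint]
- $l_i$: gate unit area or wire length
- $x_i$: gate or wire size (sizing variable)
- $f$: working frequency
- $\alpha_i$: switching activity of component $i$
- $c_i$: load capacitance of component $i$
- $\omega$: path in the path set $\Omega$
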
Variation Models
- Introduce two process parameters as the variation sources: inter-layer dielectric (ILD) thickness ($H$) and metal thickness ($T$).
- R and C can be approximated by the first-order Taylor expansion: $R = R_{nom} + a_1 \Delta T$, $C = C_{nom} + b_1 \Delta T + b_2 \Delta H$
- $R_{nom}$ / $C_{nom}$: nominal value of R / C
- $\Delta T$ / $\Delta H$: random deviation of metal thickness / ILD thickness
- $a_1$, $b_1$, $b_2$ are sensitivities calculated by differentiating: [equation not recovered]
- $C_{gnd}/\varepsilon$: normalized capacitance
- $S$: space between parallel lines
Srivastava et al., Springer, 2005.

RC Variability
- Assuming the deviations in $T$ and $H$ are Gaussian, the variability magnitude of R and C can easily be calculated by:
  $\sigma_R^2 = a_1^2 \sigma_T^2$
  $\sigma_C^2 = b_1^2 \sigma_T^2 + b_2^2 \sigma_H^2$
- Apply the interconnect delay variation metric to calculate the variability of the product of R and C.
  - It is well captured by a normal distribution, with 1.2% average error in the mean delay and 3.8% average error in the standard deviation.
Blaauw et al., DAC 2004.

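The two variance formulas translate into a few lines of code. In this sketch the function name and the sensitivity values are assumptions; the absence of a cross term relies on the deviations in $T$ and $H$ being independent, which the slide's formula implies but does not state.

```python
import math


def rc_sigma(a1, b1, b2, sigma_t, sigma_h):
    """Standard deviations of R and C from first-order sensitivities.

    Assumes the deviations in T and H are independent Gaussians,
    so no covariance term appears.
    """
    var_r = a1 ** 2 * sigma_t ** 2
    var_c = b1 ** 2 * sigma_t ** 2 + b2 ** 2 * sigma_h ** 2
    return math.sqrt(var_r), math.sqrt(var_c)


# Made-up sensitivities; the signs do not matter because they are squared.
sigma_r, sigma_c = rc_sigma(a1=-2.0, b1=3.0, b2=-4.0, sigma_t=1.0, sigma_h=1.0)
```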
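Statistical Formulation
$\delta$ / $\zeta$ / $\eta$: thermal / timing / power yield constraints
Deterministic formulation:
  Minimize $\sum_i l_i x_i$ subject to the thermal constraint, the timing constraints, the power constraint, and the size bounds $L \le x_i \le U$.
Statistical formulation:
  Minimize $\sum_i l_i x_i$ subject to $\mathrm{Prob}(\text{thermal constraint}) \ge \delta$, $\mathrm{Prob}(\text{each timing constraint}) \ge \zeta$, $\mathrm{Prob}(\text{power constraint}) \ge \eta$, and $L \le x_i \le U$.
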
Transformation into SOCP
Theorem: Given independent Gaussian random vectors $a_i$ and bound vectors $b_i$, the parametric yield ($\eta$) problem is as follows: [equation not recovered]
The problem can be reformulated as an SOCP: [equation not recovered]
- $\Phi^{-1}$: the inverse cumulative distribution function
Boyd and Vandenberghe, Cambridge, 2004.

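The cited textbook result states that, for Gaussian $a$ with mean $\bar{a}$ and covariance $\Sigma$, the constraint $\mathrm{Prob}(a^T x \le b) \ge \eta$ is equivalent to $\bar{a}^T x + \Phi^{-1}(\eta)\,\|\Sigma^{1/2} x\|_2 \le b$, which is a second-order conic constraint when $\eta \ge 0.5$. The sketch below checks that inequality for independent components (diagonal $\Sigma$); the function name and numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist


def chance_constraint_holds(mean_a, sigma_a, x, bound, eta):
    """Check mean_a^T x + Phi^{-1}(eta) * ||Sigma^{1/2} x||_2 <= bound.

    sigma_a holds per-component standard deviations, i.e. the components
    of a are assumed independent.
    """
    mean_term = sum(m * xi for m, xi in zip(mean_a, x))
    std_term = math.sqrt(sum((s * xi) ** 2 for s, xi in zip(sigma_a, x)))
    return mean_term + NormalDist().inv_cdf(eta) * std_term <= bound


# Assumed data: a^T x has mean 3.0 and standard deviation 0.5.
mean_a, sigma_a, x = [1.0, 2.0], [0.3, 0.4], [1.0, 1.0]
tight_999 = chance_constraint_holds(mean_a, sigma_a, x, bound=4.0, eta=0.999)
tight_841 = chance_constraint_holds(mean_a, sigma_a, x, bound=4.0, eta=0.841)
loose_999 = chance_constraint_holds(mean_a, sigma_a, x, bound=5.0, eta=0.999)
```

The same bound that fails at 99.9% yield passes at 84.1% yield, because $\Phi^{-1}(0.841)$ is close to 1 while $\Phi^{-1}(0.999)$ is close to 3.09. A lower yield target adds a smaller margin on top of the mean.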
Transformation Flow
[Derivation figure; equations not recovered. Labels: variance, mean, zero-mean unit-variance Gaussian variable, cumulative distribution function]

Thermal & Power Constraints in SOCP Form
- Thermal constraint: [equation not recovered]
- Power (thermal distribution) constraint: [equation not recovered]

Timing Constraint in SOCP Form
- $X_j$: size of the driving gate
- $X_i$: size of the loading gate
- $X_w$: width of the interconnect
- $L_w$: length of the interconnect (constant)
$D_i = R_g (C_w X_w L_w + C_g X_i)/X_j + R_w L_w (C_w X_w L_w/2 + C_g X_i)/X_w$
With the gate sizes fixed, only $X_w$ is the sizing variable. Writing $R_j = R_g / X_j$ and $C_i = C_g X_i$:
$D_i = R_j C_w X_w L_w + R_j C_i + R_w C_w L_w^2/2 + R_w L_w C_i / X_w$
Timing constraint: [equation not recovered]

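The fixed-gate form can be checked numerically against the full Elmore expression. In this sketch the helper names and parameter values are assumptions; `R_j = R_g / X_j` and `C_i = C_g * X_i` follow from expanding the full expression term by term.

```python
def full_delay(r_g, c_g, r_w, c_w, l_w, x_w, x_i, x_j):
    """Full Elmore stage delay with all sizes as variables."""
    return (r_g * (c_w * x_w * l_w + c_g * x_i) / x_j
            + r_w * l_w * (c_w * x_w * l_w / 2 + c_g * x_i) / x_w)


def fixed_gate_coefficients(r_g, c_g, r_w, c_w, l_w, x_i, x_j):
    """Coefficients of D = k_w * X_w + k_0 + k_inv / X_w for fixed gate sizes."""
    r_j = r_g / x_j          # effective driver resistance
    c_i = c_g * x_i          # effective load capacitance
    k_w = r_j * c_w * l_w                        # multiplies X_w
    k_0 = r_j * c_i + r_w * c_w * l_w ** 2 / 2   # independent of X_w
    k_inv = r_w * l_w * c_i                      # multiplies 1 / X_w
    return k_w, k_0, k_inv


# Made-up parameters for the comparison.
params = dict(r_g=2.0, c_g=3.0, r_w=0.5, c_w=4.0, l_w=2.0, x_i=1.5, x_j=2.0)
x_w = 2.0
k_w, k_0, k_inv = fixed_gate_coefficients(**params)
reduced = k_w * x_w + k_0 + k_inv / x_w
full = full_delay(x_w=x_w, **params)
```

Once the gate sizes are constants, the wire width is the only unknown left in the delay, which is the point of the fixing step.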
Statistical Problem Formulation using SOCP
[Formulation not recovered: objective subject to the thermal constraint, the timing constraints, and the power constraint in SOCP form]

Program Flow
1. Begin.
2. Formulate the problem with the RC variation.
3. Assign the values of the current gate sizes to the gate size variables in the timing constraint.
4. Formulate the problem as an SOCP with the gate sizes as fixed values in the timing constraint.
5. Solve it with the interior-point method.
6. Convergence or max iterations reached? If no, repeat from step 3; if yes, end.
The loop iteratively reduces the approximation errors.

Experimental Setup
- Implemented in C++, with the MOSEK optimization tool as the solver.
- Tested on the ISCAS85 benchmark circuits commonly used in this area.
- Used Design Compiler and Astro with the UMC 0.18 µm technology library to synthesize and place the circuits.

| Circuit name | #Gate | #Wire | #Total |
|---|---|---|---|
| c17 | 11 | 12 | 23 |
| c432 | 122 | 230 | 352 |
| c499 | 246 | 396 | 642 |
| c880 | 256 | 230 | 486 |
| c1355 | 297 | 555 | 852 |
| c1908 | 201 | 336 | 537 |
| c2670 | 499 | 754 | 1253 |
| c3540 | 429 | 1021 | 1450 |
| c5315 | 927 | 1792 | 2719 |
| c6288 | 1298 | 3596 | 4894 |

Experimental Results
- Achieved 51%, 39%, and 26% area reductions for 70%, 84.1%, and 99.9% yield constraints, respectively.
- Avg. / max. number of running iterations: 5.6 / 10
- Timing constraint error bound: 2%

| Circuit name | Deterministic area (µm²) | Deterministic runtime / ite. (s) | Deterministic total runtime (s) | 70% yield area (µm²) | 70% yield area improv. | 70% yield runtime / ite. (s) | 70% yield total runtime (s) |
|---|---|---|---|---|---|---|---|
| c17 | 7160 | 0.06 | 0.6 | 2892 | 59.61% | 0.09 | 0.36 |
| c432 | 47752 | 0.24 | 1.21 | 21543 | 54.89% | 0.83 | 4.15 |
| c499 | 127103 | 0.41 | 2.07 | 56957 | 55.19% | 2.41 | 9.62 |
| c880 | 152804 | 0.37 | 1.11 | 38346 | 74.91% | 1.40 | 13.96 |
| c1355 | 174896 | 0.58 | 5.79 | 84076 | 51.93% | 3.87 | 19.33 |
| c1908 | 96968 | 0.33 | 3.26 | 44350 | 54.26% | 2.79 | 5.57 |
| c2670 | 275967 | 0.74 | 7.39 | 121065 | 56.13% | 7.32 | 14.64 |
| c3540 | 362409 | 1.10 | 11.03 | 146519 | 59.57% | 7.43 | 22.29 |
| c5315 | 913522 | 1.88 | 13.18 | 727853 | 20.30% | 10.43 | 31.28 |
| c6288 | 1455730 | 5.23 | 15.69 | 1100120 | 24.43% | 70.56 | 352.78 |
| Avg. | | | | | 51.12% | | |

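The area-improvement column appears to be the relative reduction from the deterministic area to the 70%-yield area. The sketch below recomputes it from the two area columns under that assumption; the variable names are illustrative, and recomputed values can differ from the tabulated ones by a few hundredths of a percent.

```python
# circuit: (deterministic area, 70%-yield area), both in square micrometers,
# copied from the results table.
areas = {
    "c17": (7160, 2892),
    "c432": (47752, 21543),
    "c499": (127103, 56957),
    "c880": (152804, 38346),
    "c1355": (174896, 84076),
    "c1908": (96968, 44350),
    "c2670": (275967, 121065),
    "c3540": (362409, 146519),
    "c5315": (913522, 727853),
    "c6288": (1455730, 1100120),
}

# Assumed definition: improvement = 1 - statistical area / deterministic area.
improvements = {name: 100.0 * (1.0 - stat / det)
                for name, (det, stat) in areas.items()}
average = sum(improvements.values()) / len(improvements)
```

The average is an unweighted mean over the ten circuits, so the small circuits count as much as c6288.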
Experimental Results of 84.1% and 99.9% Yield
- The lower the yield constraint, the better the area optimization.
- All constraints (timing, power, thermal) are met.

| Circuit name | 84.1% yield area (µm²) | 84.1% yield area improv. | 84.1% yield runtime / ite. (s) | 84.1% yield total runtime (s) | 99.9% yield area (µm²) | 99.9% yield area improv. | 99.9% yield runtime / ite. (s) | 99.9% yield total runtime (s) |
|---|---|---|---|---|---|---|---|---|
| c17 | 3394 | 52.60% | 0.09 | 0.6 | 3460 | 51.68% | 0.09 | 0.47 |
| c432 | 27860 | 41.66% | 1.26 | 7.48 | 29179 | 38.89% | 0.80 | 2.41 |
| c499 | 57758 | 54.56% | 2.11 | 8.43 | 89148 | 29.86% | 1.54 | 4.61 |
| c880 | 66420 | 56.53% | 2.91 | 7.82 | 107349 | 29.75% | 1.54 | 15.41 |
| c1355 | 147397 | 15.72% | 2.11 | 19.03 | 169347 | 3.17% | 2.29 | 22.9 |
| c1908 | 65020 | 32.95% | 1.56 | 12.48 | 70830 | 26.96% | 1.38 | 13.57 |
| c2670 | 161426 | 41.51% | 2.93 | 5.85 | 248474 | 9.96% | 3.47 | 24.32 |
| c3540 | 169331 | 53.28% | 5.57 | 22.27 | 176715 | 51.24% | 5.23 | 15.70 |
| c5315 | 735838 | 19.45% | 9.24 | 36.95 | 884514 | 3.18% | 7.16 | 28.65 |
| c6288 | 1109090 | 23.81% | 69.74 | 348.71 | 1291240 | 11.30% | 82.32 | 411.63 |
| Avg. | | 39.21% | | | | 25.60% | | |

Delay, Power and Temperature Performance
- Though the delay and the maximum metal temperature are increased, they all meet the given bounds.
- The constraint bound is fully utilized to get the best optimization results.

| Circuit name | Delay bound (ns) | Delay before (ns) | Delay after (ns) | Power before (mW) | Power after (mW) | Max T increase before (°C) | Max T increase after (°C) |
|---|---|---|---|---|---|---|---|
| c17 | 36.82 | 22.19 | 32.21 | 2.02 | 1.35 | 8.19 | 10.05 |
| c432 | 247.65 | 154.59 | 136.62 | 22.96 | 12.59 | 6.97 | 23.46 |
| c499 | 186.13 | 153.79 | 135.32 | 56.10 | 35.23 | 7.20 | 27.20 |
| c880 | 253.43 | 208.84 | 170.15 | 64.03 | 43.44 | 7.26 | 19.69 |
| c1355 | 274.55 | 203.45 | 241.45 | 78.56 | 67.24 | 7.37 | 27.47 |
| c1908 | 222.91 | 161.09 | 136.11 | 43.32 | 28.36 | 7.09 | 31.78 |
| c2670 | 290.84 | 176.66 | 229.94 | 103.81 | 98.69 | 8.88 | 22.72 |
| c3540 | 507.80 | 308.59 | 245.85 | 143.14 | 72.65 | 8.25 | 14.03 |
| c5315 | 445.58 | 313.52 | 421.54 | 365.65 | 355.45 | 7.78 | 19.55 |
| c6288 | 1333.91 | 913.33 | 1148.41 | 661.62 | 513.14 | 7.35 | 11.62 |
| Comparison | | 1 | 1.13 | 1 | 0.80 | 1 | 2.72 |

Conclusions
- Presented the first statistical work on area minimization under thermal and timing constraints by gate and wire sizing.
  - Obtained much better results than the deterministic method.
- Formulated the statistical RC model as SOCPs, which can be solved efficiently and effectively.
  - Used a more accurate delay model (the Elmore delay model).
  - Solved the problem by a two-stage approximation flow, because nonlinear terms are not applicable to SOCP.

Thank You

Backup Slides

Temperature Distribution
Applying the Finite Difference Method (FDM), we can divide the whole chip into $m$ mesh nodes and calculate each node's temperature by: [equation not recovered]
- $P_i$: the $i$-th mesh node's power dissipation
- $T_i$: the $i$-th mesh node's temperature
- $g$: power density of the heat sources (W/m³)
Chapman, Heat Transfer, 4th ed., New York: Macmillan, 1984.

Temperature-Dependent Delay
- An inseparable aspect of electrical power distribution and signal transmission through the interconnects.
- Resistance depends on temperature: $r(x) = \rho_0 (1 + \beta\, T(x))$
- $\rho_0$: the resistance per unit length at the reference temperature
- $\beta$: the temperature coefficient of resistance (1/°C)

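As a small worked instance of the linear resistance model, the sketch below evaluates $r = \rho_0 (1 + \beta T)$. It assumes $T$ is measured as a rise above the reference temperature at which $\rho_0$ is specified, which the slide does not state explicitly; the function name and numbers are made up for illustration.

```python
def unit_resistance(rho_0, beta, temp_rise):
    """Per-unit-length resistance r = rho_0 * (1 + beta * T).

    temp_rise is assumed to be the temperature above the reference
    temperature, in deg C; beta is in 1/deg C.
    """
    return rho_0 * (1.0 + beta * temp_rise)


# Made-up values: a 50 deg C rise with beta = 0.004 / deg C gives a 20% increase.
r_hot = unit_resistance(rho_0=0.1, beta=0.004, temp_rise=50.0)
```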
Interconnect Temperature Calculation
The interconnect temperature is given by: [equation not recovered]
- $x_i$: wire width
- $\theta_{int}$: the thermal impedance of the interconnect line to the chip
- $\sigma$: duty cycle
- $V_{cross}$: cross voltage of the wire
- $t_{ox}$: the total thickness of the underlying dielectric
- $t_m$: the thickness of the wire
- $K_{ox}$: the thermal conductivity
- $l_i$: wire length
- $R_m$: the temperature-dependent unit resistance
- $\psi$: the heat spreading parameter
These are not linear functions, so a Least Square Estimator is applied.

Least Square Estimator (LSE)
- Least squares solves the problem by finding the line for which the sum of the squared deviations (or residuals) in the d direction (the noisy variable direction) is minimized.
- Apply Cramer's rule to find the $A_1$ and $A_0$ that minimize the squared deviations for the line $y = A_1 x + A_0$.
- Cramer's rule: [equation not recovered]

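A straight-line least-squares fit solved with Cramer's rule can be sketched as follows. The normal equations are $A_1 \sum x^2 + A_0 \sum x = \sum xy$ and $A_1 \sum x + A_0 N = \sum y$; the function name and the five sample points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a1 * x + a0 via Cramer's rule."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Determinant of the normal-equation matrix [[sxx, sx], [sx, n]].
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("x values must not all be equal")
    a1 = (sxy * n - sx * sy) / det    # replace first column with (sxy, sy)
    a0 = (sxx * sy - sx * sxy) / det  # replace second column with (sxy, sy)
    return a1, a0


# Five made-up sample sizes on the exact line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
a1, a0 = fit_line(xs, ys)
```

For data that lie exactly on a line the fit recovers the slope and intercept; for a nonlinear function sampled at a few sizes it returns the best linear approximation over those samples.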
Approximation for Thermal Constraint
Let $N = 5$ and pick five sizes of $x$; the thermal constraint can then be approximated by the Least Square Estimator (LSE).
Banerjee et al., DAC 1999.