Statistical Circuit Optimization Considering Device and Interconnect Process Variations


Statistical Circuit Optimization Considering Device and Interconnect Process Variations
I-Jye Lin, Tsui-Yee Ling, and Yao-Wen Chang
The Electronic Design Automation Laboratory, Department of Electrical Engineering, National Taiwan University
March 17, 2007
NTUEE 1

Outline
Introduction
Deterministic Algorithm
Statistical Algorithm
Experimental Results
Conclusions


Interconnect Process Variation
Interconnect delay and reliability strongly affect VLSI performance.
The variability of interconnect parameters can rise to as much as 35% (Srivastava et al., Springer, 2005).
Worst-case corner models cannot capture the worst-case variations in interconnect delay (Liu et al., DAC 2000).
Interconnect optimization guided by statistical analysis techniques has become an inevitable trend (Visweswariah, SLIP 2006).

Previous Work in Statistical Optimization
Statistical gate sizing with timing constraints using Lagrangian relaxation (Choi et al., DAC 2005).
Statistical power minimization by delay budgeting using second-order conic programming (Orshansky et al., DAC 2005).
Statistical gate sizing using geometric programming (Patil et al., ISQED 2005).
No previous statistical optimization work considers both interconnect and device sizing.

Comparison with Previous Work

                             Sizing variable  Delay model                          Objective  Constraint
Orshansky's work (DAC 2005)  Gate only        Linear model (linear term)           Power      Timing
Our work                     Gate and wire    Elmore delay model (nonlinear term)  Area       Power, timing, thermal

Because of the nonlinear term introduced by the Elmore delay model, optimization using both gate and wire sizing is much harder to solve.

Delay Model
Our delay model and timing constraint use the Elmore delay model:
\[ D_i = \frac{R_g \left( C_w X_w L_w + C_g X_i \right)}{X_j} + \frac{R_w L_w \left( C_w X_w L_w / 2 + C_g X_i \right)}{X_w} \]
Timing constraint, where $a_i$ is the arrival time of gate $i$:
\[ a_i \ge a_j + D_i \]
The delay $D_i$ contains higher-order (quadratic) terms.
Delay model and timing constraint used in previous work (DAC 2005):
\[ a_i \ge a_j + d_i^0 + d_i \]
This constraint contains only linear terms. Here $d_i^0$ is the delay due to the sizing for maximum slack, and $d_i$ is the slack added to node $i$ due to the loading.
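As a rough illustration of why the Elmore model introduces nonlinear terms, the following Python sketch evaluates the delay of one driver, wire, and load stage. The function name and every numeric value are illustrative assumptions, not values from the slides.

```python
def elmore_stage_delay(r_g, c_g, r_w, c_w, x_j, x_i, x_w, l_w):
    """Elmore delay of one stage: driving gate j -> wire -> loading gate i.

    r_g, c_g: unit gate resistance and capacitance
    r_w, c_w: unit wire resistance and capacitance
    x_j, x_i: sizes of the driving and loading gates
    x_w, l_w: wire width and wire length
    """
    wire_cap = c_w * x_w * l_w          # wire capacitance grows with width
    load_cap = c_g * x_i                # input capacitance of the loading gate
    gate_term = r_g * (wire_cap + load_cap) / x_j
    wire_term = r_w * l_w * (wire_cap / 2 + load_cap) / x_w
    return gate_term + wire_term


# Made-up parameters, chosen only to make the arithmetic easy to follow.
params = dict(r_g=1.0, c_g=1.0, r_w=0.1, c_w=0.2, l_w=10.0)
d_base = elmore_stage_delay(x_j=1.0, x_i=1.0, x_w=1.0, **params)
d_wide = elmore_stage_delay(x_j=1.0, x_i=1.0, x_w=2.0, **params)
d_wider = elmore_stage_delay(x_j=1.0, x_i=1.0, x_w=3.0, **params)

# Equal steps in wire width do not give equal steps in delay.
print(d_base, d_wide, d_wider)
```

The sizing variables appear both multiplied together and in denominators, which is what keeps the constraint from being linear in the sizes.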

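Statistical Circuit Optimization with SOCP
Second-order conic programming (SOCP):
\[ \text{Minimize } f^T x \quad \text{subject to } \| A_i x + b_i \|_2 \le c_i^T x + d_i, \quad F x = g \]
SOCP is a convex optimization problem with theoretical runtime $O(N^{1.3})$ (Orshansky, DAC 2005; Davoodi, DAC 2006).
Second-order conic constraint:
\[ \| A x + b \|_2 \le c^T x + d \]
The expressions inside the norm and on the right-hand side are linear in $x$. Nonlinear (quadratic) terms are not applicable, so an approximation method is needed.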

Approximation Method
Fix the gate size in the timing constraint, which reduces the timing constraint from quadratic order to linear order.
Approximate the gate sizes by a two-stage flow that iteratively reduces the approximation errors.
The flow is similar to sequential linear programming (SLP):
1. Solve the SOCP problem under the current constraints (gate size fixed).
2. If the flow has converged or reached the maximum number of iterations, finish.
3. Otherwise, update the gate size in the timing constraints, form a new SOCP problem, and return to step 1.
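The loop structure of this flow can be sketched in a few lines of Python. This is only a skeleton under stated assumptions: `solve_with_fixed_gates` is a hypothetical callable standing in for one SOCP solve, the relative-change convergence test and its 2% threshold are assumptions, and the toy solver below is not an SOCP at all.

```python
def iterative_sizing(solve_with_fixed_gates, initial_gate_sizes,
                     tol=0.02, max_iters=10):
    """Repeat 'fix gate sizes -> solve -> update' until the sizes settle.

    solve_with_fixed_gates: hypothetical callable standing in for one SOCP
        solve; it receives the gate sizes frozen in the timing constraints
        and returns the newly optimized gate sizes.
    tol: assumed relative-change threshold used as the convergence test.
    """
    gate_sizes = list(initial_gate_sizes)
    iterations = 0
    for iterations in range(1, max_iters + 1):
        new_sizes = solve_with_fixed_gates(gate_sizes)
        change = max(abs(n - o) / abs(o) for n, o in zip(new_sizes, gate_sizes))
        gate_sizes = list(new_sizes)    # update the frozen gate sizes
        if change <= tol:               # converged: stop early
            break
    return gate_sizes, iterations


# Toy stand-in for the solver (NOT an SOCP): a fixed-point map that
# converges to sqrt(2), used only to exercise the loop.
def toy_solver(sizes):
    return [0.5 * (s + 2.0 / s) for s in sizes]


final_sizes, n_iters = iterative_sizing(toy_solver, [1.0, 3.0])
print(final_sizes, n_iters)
```

Passing the solver in as a callable keeps the outer loop independent of whichever conic solver is actually used for each step.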

Our Contributions
The first work on statistical optimization of both circuit interconnect and devices.
  Previous work considers only circuit devices (gates).
  Statistical optimization considering both interconnect and devices is much harder.
The first work that statistically optimizes the area with thermal- and timing-constrained parametric yields.
  Most existing statistical optimization considers only timing.
The first work capable of analytically transforming the statistical RC model into an SOCP.
  Previous work uses a linear delay model.


Timing Constraint
[Figure: example circuit graph with nodes 0 to 17 and elements labeled DR1 to DR3 and LC1 to LC2]
The number of paths may grow exponentially with the circuit size.
To reduce the problem size, we distribute the timing information to each node.

Thermal Constraint
Electromigration (EM) lifetime reliability of metal interconnects is governed by the well-known Black's equation. The equation and the associated reliability condition use the following symbols:
TTF: time-to-fail period
A*: a constant
j: average current density
Q: activation energy
k_B: Boltzmann's constant
T_m: metal temperature
j_0: specific current density
T_ref: specific metal temperature
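For reference, Black's equation is commonly written in the form below. The current-density exponent $n$ (often taken as 2) is not in the symbol list above and is an assumption here. The second relation is also an assumption: it expresses one plausible reading of the reliability condition, namely that the lifetime at $(j, T_m)$ is no shorter than the lifetime at the reference point $(j_0, T_{ref})$.

```latex
\mathrm{TTF} = A^{*}\, j^{-n} \exp\!\left( \frac{Q}{k_B T_m} \right),
\qquad
j^{-n} \exp\!\left( \frac{Q}{k_B T_m} \right) \;\ge\; j_0^{-n} \exp\!\left( \frac{Q}{k_B T_{ref}} \right)
```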

Average Temperature of the Chip
The average temperature of the chip, T_avg, can be estimated from the following quantities (Banerjee et al., ISPD 2001):
P_tot: total power consumption of the chip
T_air: ambient temperature
R_n: thermal resistance of the substrate and the package
A: chip area
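A minimal sketch of a relation consistent with the symbols listed above is T_avg = T_air + R_n * P_tot / A. This form is an assumption inferred from the symbol list; the function name, units, and numbers below are illustrative only.

```python
def average_chip_temperature(p_tot, area, r_n, t_air):
    """Estimate the average chip temperature (assumed form, see lead-in).

    p_tot: total power consumption (W)
    area:  chip area (cm^2)
    r_n:   thermal resistance of substrate and package (cm^2 * degC / W)
    t_air: ambient temperature (degC)
    """
    return t_air + r_n * p_tot / area


# Illustrative values: 50 W over 2 cm^2 with r_n = 4 and a 25 degC ambient.
t_avg = average_chip_temperature(p_tot=50.0, area=2.0, r_n=4.0, t_air=25.0)
print(t_avg)
```

Under this form, shrinking the chip area at constant power raises the average temperature, which is why area minimization needs a thermal constraint.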

Power Constraint
The chip's temperature needs to be kept under a reasonable bound during the optimization.
For simplicity, only the dynamic power consumption is considered.
P_B: the power bound of the gate
c_i: the downstream capacitance of gate i
α_i: switching activity of component i

Deterministic Formulation
The formulation consists of an objective subject to a thermal constraint, a timing constraint, and a power constraint, with the following symbols:
l_i: gate unit area or wire length
x_i: gate or wire size (sizing variable)
f: working frequency
α_i: switching activity of component i
c_i: load capacitance of component i
ω: a path in the path set Ω


Variation Models
Introduce two process parameters as the variation sources: inter-layer dielectric (ILD) thickness (H) and metal thickness (T).
R and C can be approximated by the first-order Taylor expansion:
\[ R = R_{nom} + a_1 \Delta T, \qquad C = C_{nom} + b_1 \Delta T + b_2 \Delta H \]
R_nom / C_nom: nominal value of R / C
ΔT / ΔH: random deviation of metal thickness / ILD thickness
a_1, b_1, b_2: sensitivities calculated by differentiating the interconnect model expression, in which C_gnd/ε is the normalized capacitance and S is the space between parallel lines (Srivastava et al., Springer 2005).

RC Variability
Assuming T and H are Gaussian, the variability magnitude of R and C can easily be calculated by:
\[ \sigma_R^2 = a_1^2 \sigma_T^2, \qquad \sigma_C^2 = b_1^2 \sigma_T^2 + b_2^2 \sigma_H^2 \]
Apply the interconnect delay variation metric to calculate the variability of the product of R and C.
The product is well captured by a normal distribution, with 1.2% average error of the mean delay and 3.8% average error of the standard deviation (Blaauw et al., DAC 2004).
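The variance formulas can be checked numerically. The sketch below computes σ_R and σ_C from the sensitivities a1, b1, b2 and compares them with a seeded Monte Carlo estimate. It assumes the two thickness deviations are independent (the formulas contain no cross term); the sensitivities, sigmas, and nominal values are made-up numbers.

```python
import math
import random
import statistics


def rc_sigmas(a1, b1, b2, sigma_t, sigma_h):
    """Standard deviations of R and C under the first-order model."""
    sigma_r = abs(a1) * sigma_t
    sigma_c = math.sqrt((b1 * sigma_t) ** 2 + (b2 * sigma_h) ** 2)
    return sigma_r, sigma_c


# Made-up sensitivities and thickness sigmas, for illustration only.
a1, b1, b2 = -2.0, 0.5, -1.5
sigma_t, sigma_h = 0.1, 0.2
sigma_r, sigma_c = rc_sigmas(a1, b1, b2, sigma_t, sigma_h)

# Monte Carlo check with independent Gaussian deviations.
rng = random.Random(0)
r_samples, c_samples = [], []
for _ in range(20000):
    d_t = rng.gauss(0.0, sigma_t)   # deviation of metal thickness
    d_h = rng.gauss(0.0, sigma_h)   # deviation of ILD thickness
    r_samples.append(10.0 + a1 * d_t)            # assumed nominal R = 10
    c_samples.append(5.0 + b1 * d_t + b2 * d_h)  # assumed nominal C = 5

mc_sigma_r = statistics.pstdev(r_samples)
mc_sigma_c = statistics.pstdev(c_samples)
print(sigma_r, mc_sigma_r)
print(sigma_c, mc_sigma_c)
```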

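Statistical Formulation
δ / ζ / η: thermal / timing / power yield constraint.
Deterministic formulation:
\[
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{n+s} l_i x_i \\
\text{subject to} \quad & T' \le T_B, \\
& D_i \le a_i, \\
& a_j + D_i \le a_i, \\
& a_j \le D_B, \\
& c_i \alpha_i \le P_B, \\
& L \le x_i \le U
\end{aligned}
\]
Statistical formulation:
\[
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{n+s} l_i x_i \\
\text{subject to} \quad & \mathrm{Prob}(T' \le T_B) \ge \delta, \\
& \mathrm{Prob}(D_i \le a_i) \ge \zeta, \\
& \mathrm{Prob}(a_j + D_i \le a_i) \ge \zeta, \\
& \mathrm{Prob}(a_j \le D_B) \ge \zeta, \\
& \mathrm{Prob}(c_i \alpha_i \le P_B) \ge \eta, \\
& L \le x_i \le U
\end{aligned}
\]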

Transformation into SOCP
Theorem: Given independent Gaussian random vectors a and bound vectors b, the parametric yield (η) problem can be reformulated as an SOCP, where Φ^{-1} is the inverse of the cumulative distribution function (Boyd and Vandenberghe, Cambridge, 2004).
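The standard result behind this kind of transformation is that, for a Gaussian vector a with mean ā and covariance Σ, and for η ≥ 0.5, the constraint Prob(aᵀx ≤ b) ≥ η is equivalent to āᵀx + Φ⁻¹(η)·‖Σ^{1/2}x‖₂ ≤ b. The Python sketch below evaluates that deterministic equivalent for independent coefficients (diagonal Σ). The function names and the numbers are illustrative assumptions; only the yield levels 70%, 84.1%, and 99.9% come from the slides.

```python
import math
from statistics import NormalDist


def chance_constraint_lhs(mean_a, var_a, x, eta):
    """Left-hand side of the deterministic equivalent of Prob(a^T x <= b) >= eta.

    mean_a, var_a: per-coefficient means and variances (independent Gaussians)
    x:             decision vector
    eta:           required yield, assumed >= 0.5 so the constraint is conic
    """
    mean_term = sum(m * xi for m, xi in zip(mean_a, x))
    std_term = math.sqrt(sum(v * xi * xi for v, xi in zip(var_a, x)))
    return mean_term + NormalDist().inv_cdf(eta) * std_term


def meets_yield(mean_a, var_a, x, b, eta):
    return chance_constraint_lhs(mean_a, var_a, x, eta) <= b


# Illustrative data: two Gaussian coefficients and a fixed design point.
mean_a, var_a, x, b = [1.0, 2.0], [0.04, 0.09], [1.0, 1.0], 3.5
for eta in (0.70, 0.841, 0.999):
    margin = NormalDist().inv_cdf(eta)   # number of sigmas of guard band
    print(eta, round(margin, 3), meets_yield(mean_a, var_a, x, b, eta))
```

A higher yield target multiplies the standard-deviation term by a larger factor, so the same design point can pass at 70% yield and fail at 99.9%.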

Transformation Flow
The constraint is characterized by its mean and variance, normalized to a zero-mean, unit-variance Gaussian variable, and then expressed through the cumulative distribution function.

Thermal & Power Constraints in SOCP Form
The thermal constraint and the power (thermal distribution) constraint are each written in SOCP form.

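Timing Constraint in SOCP Form
X_j: size of the driving gate
X_i: size of the loading gate
X_w: width of the interconnect
L_w: length of the interconnect (constant)
\[ D_i = \frac{R_g \left( C_w X_w L_w + C_g X_i \right)}{X_j} + \frac{R_w L_w \left( C_w X_w L_w / 2 + C_g X_i \right)}{X_w} \]
With the gate sizes fixed, only $X_w$ is the sizing variable. Writing $R_j = R_g / X_j$ and $C_i = C_g X_i$:
\[ D_i = R_j C_w X_w L_w + R_j C_i + \frac{R_w C_w L_w^2}{2} + \frac{R_w L_w C_i}{X_w} \]
The timing constraint is then written in terms of this expression.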

Statistical Problem Formulation using SOCP
The formulation combines the thermal constraint, the timing constraint, and the power constraint.

Program Flow
1. Begin: formulate the problem with the RC variation.
2. Assign the values of the current gate sizes to the gate size variables in the timing constraint.
3. Formulate the problem into an SOCP with the gate size as a fixed value in the timing constraint.
4. Solve it with the interior point method.
5. If the flow has converged or reached the maximum number of iterations, end; otherwise return to step 2.
Each pass iteratively reduces the approximation errors.


Experimental Setup
Implemented in C++ and applied the MOSEK optimization tool to solve the problem.
Tested on the ISCAS85 benchmark circuits commonly used in this area.
Used Design Compiler and Astro with the UMC 0.18 µm technology library to synthesize and place the circuits.

Circuit   #Gate   #Wire   #Total
c17          11      12       23
c432        122     230      352
c499        246     396      642
c880        256     230      486
c1355       297     555      852
c1908       201     336      537
c2670       499     754     1253
c3540       429    1021     1450
c5315       927    1792     2719
c6288      1298    3596     4894

Experimental Results
Achieved 51%, 39%, and 26% area reductions for the 70%, 84.1%, and 99.9% yield constraints, respectively.
Average / maximum number of running iterations: 5.6 / 10.
Timing constraint error bound: 2%.

Circuit    Deterministic                             70% yield
           Area (µm²)  Runtime/iter. (s)  Total (s)  Area (µm²)  Improv.  Runtime/iter. (s)  Total (s)
c17        7160        0.06               0.6        2892        59.61%   0.09               0.36
c432       47752       0.24               1.21       21543       54.89%   0.83               4.15
c499       127103      0.41               2.07       56957       55.19%   2.41               9.62
c880       152804      0.37               1.11       38346       74.91%   1.40               13.96
c1355      174896      0.58               5.79       84076       51.93%   3.87               19.33
c1908      96968       0.33               3.26       44350       54.26%   2.79               5.57
c2670      275967      0.74               7.39       121065      56.13%   7.32               14.64
c3540      362409      1.10               11.03      146519      59.57%   7.43               22.29
c5315      913522      1.88               13.18      727853      20.30%   10.43              31.28
c6288      1455730     5.23               15.69      1100120     24.43%   70.56              352.78
Avg.                                                             51.12%
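The area-improvement figures can be re-derived from the two area columns as 1 − (70%-yield area / deterministic area). The short Python check below uses the areas reported above; the variable names are illustrative. The recomputed values agree with the reported column to within a few hundredths of a percentage point.

```python
# Areas in square micrometers, copied from the results reported above.
det_area = {
    "c17": 7160, "c432": 47752, "c499": 127103, "c880": 152804,
    "c1355": 174896, "c1908": 96968, "c2670": 275967, "c3540": 362409,
    "c5315": 913522, "c6288": 1455730,
}
yield70_area = {
    "c17": 2892, "c432": 21543, "c499": 56957, "c880": 38346,
    "c1355": 84076, "c1908": 44350, "c2670": 121065, "c3540": 146519,
    "c5315": 727853, "c6288": 1100120,
}

# Improvement (%) = 100 * (1 - statistical area / deterministic area).
improvement = {
    name: 100.0 * (1.0 - yield70_area[name] / det_area[name])
    for name in det_area
}
average = sum(improvement.values()) / len(improvement)

for name, value in improvement.items():
    print(f"{name}: {value:.2f}%")
print(f"average: {average:.2f}%")
```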

Experimental Results of 84.1% and 99.9% Yield
The lower the yield constraint, the better the area optimization.
All constraints (timing, power, thermal) are met.

Circuit    84.1% yield                                        99.9% yield
           Area (µm²)  Improv.  Runtime/iter. (s)  Total (s)  Area (µm²)  Improv.  Runtime/iter. (s)  Total (s)
c17        3394        52.60%   0.09               0.6        3460        51.68%   0.09               0.47
c432       27860       41.66%   1.26               7.48       29179       38.89%   0.80               2.41
c499       57758       54.56%   2.11               8.43       89148       29.86%   1.54               4.61
c880       66420       56.53%   2.91               7.82       107349      29.75%   1.54               15.41
c1355      147397      15.72%   2.11               19.03      169347      3.17%    2.29               22.9
c1908      65020       32.95%   1.56               12.48      70830       26.96%   1.38               13.57
c2670      161426      41.51%   2.93               5.85       248474      9.96%    3.47               24.32
c3540      169331      53.28%   5.57               22.27      176715      51.24%   5.23               15.70
c5315      735838      19.45%   9.24               36.95      884514      3.18%    7.16               28.65
c6288      1109090     23.81%   69.74              348.71     1291240     11.30%   82.32              411.63
Avg.                   39.21%                                             25.60%

Delay, Power and Temperature Performance
Though the delay and the maximum metal temperature are increased, they all meet the given bounds.
The constraint bounds are fully utilized to get the best optimization results.

Circuit    Delay (ns)                     Power (mW)          Max T increase
           Bound     Before    After      Before    After     Before   After
c17        36.82     22.19     32.21      2.02      1.35      8.19     10.05
c432       247.65    154.59    136.62     22.96     12.59     6.97     23.46
c499       186.13    153.79    135.32     56.10     35.23     7.20     27.20
c880       253.43    208.84    170.15     64.03     43.44     7.26     19.69
c1355      274.55    203.45    241.45     78.56     67.24     7.37     27.47
c1908      222.91    161.09    136.11     43.32     28.36     7.09     31.78
c2670      290.84    176.66    229.94     103.81    98.69     8.88     22.72
c3540      507.80    308.59    245.85     143.14    72.65     8.25     14.03
c5315      445.58    313.52    421.54     365.65    355.45    7.78     19.55
c6288      1333.91   913.33    1148.41    661.62    513.14    7.35     11.62
Comparison           1         1.13       1         0.80      1        2.72


Conclusions
Presented the first statistical work for area minimization under thermal and timing constraints by gate and wire sizing.
  Obtained much better results than those of the deterministic method.
Formulated the statistical RC model as SOCPs, which can be solved efficiently and effectively.
  Used a more accurate delay model (the Elmore delay model).
Solved the problem by a two-stage approximation flow.
  Nonlinear terms are not applicable to SOCP.

Thank You

Backup Slides

Temperature Distribution
Applying the Finite Difference Method (FDM), we can divide the whole chip into m mesh nodes and calculate each node's temperature from the following quantities (Chapman, Heat Transfer, 4th ed., New York: Macmillan, 1984):
P_i: the ith mesh node's power dissipation
T_i: the ith mesh node's temperature
g: power density of the heat sources (W/m³)

Temperature Dependent Delay
An inseparable aspect of electrical power distribution and signal transmission through the interconnects.
Resistance is dependent on temperature:
\[ r(x) = \rho_0 \left( 1 + \beta\, T(x) \right) \]
ρ_0: the resistance per unit length at the reference temperature
β: the temperature coefficient of resistance (1/°C)

Interconnect Temperature Calculation
The interconnect temperature is given in terms of the following quantities:
x: wire width
θ_int: the thermal impedance of the interconnect line to the chip
σ: duty cycle
V_cross: cross voltage of the wire
t_ox: the total thickness of the underlying dielectric
t_m: the thickness of the wire
K_ox: the thermal conductivity
l: wire length
R_m: the temperature-dependent unit resistance
ψ: the heat spreading parameter
The resulting expressions are not linear functions, so the Least Square Estimator is applied.

Least Square Estimator (LSE)
Least squares solves the problem by finding the line for which the sum of the squared deviations (or residuals) in the d direction (the noisy variable direction) is minimized.
Apply Cramer's rule to find the A_1 and A_0 that minimize the squared deviations of the line
\[ y = A_1 x + A_0 \]
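A minimal Python sketch of the line fit: the normal equations for y = A_1 x + A_0 form a 2-by-2 linear system, which Cramer's rule solves with three determinants. The function name and the five sample points are illustrative assumptions.

```python
def fit_line_cramer(xs, ys):
    """Least-squares fit of y = a1 * x + a0 using Cramer's rule.

    Normal equations:
        [sum(x^2)  sum(x)] [a1]   [sum(x*y)]
        [sum(x)    n     ] [a0] = [sum(y)  ]
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    det = sxx * n - sx * sx             # determinant of the coefficient matrix
    if det == 0:
        raise ValueError("x values must not all be identical")
    a1 = (sxy * n - sx * sy) / det      # first column replaced by the RHS
    a0 = (sxx * sy - sx * sxy) / det    # second column replaced by the RHS
    return a1, a0


# Five sample points lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
a1, a0 = fit_line_cramer(xs, ys)
print(a1, a0)
```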

Approximation for Thermal Constraint
Let N = 5 and pick five sizes of x; the thermal constraint can then be approximated by the Least Square Estimator (LSE) (Banerjee et al., DAC 1999).