Digital Integrated Circuits Lecture 5: Logical Effort

Similar documents
Lecture 6: Logical Effort

EE 447 VLSI Design. Lecture 5: Logical Effort

ECE429 Introduction to VLSI Design

VLSI Design, Fall Logical Effort. Jacob Abraham

Introduction to CMOS VLSI Design. Lecture 5: Logical Effort. David Harris. Harvey Mudd College Spring Outline

Lecture 8: Combinational Circuit Design

Logical Effort: Designing for Speed on the Back of an Envelope David Harris Harvey Mudd College Claremont, CA

Very Large Scale Integration (VLSI)

Logical Effort. Sizing Transistors for Speed. Estimating Delays

Introduction to CMOS VLSI Design. Logical Effort B. Original Lecture by Jay Brockman. University of Notre Dame Fall 2008

Lecture 8: Logic Effort and Combinational Circuit Design

Lecture 5: DC & Transient Response

Lecture 5: DC & Transient Response

EE5780 Advanced VLSI CAD

Lecture 6: DC & Transient Response

Chapter 6. Series-Parallel Circuits ISU EE. C.Y. Lee

Lecture 4: DC & Transient Response

DC and Transient. Courtesy of Dr. Daehyun Dr. Dr. Shmuel and Dr.

0 0 = 1 0 = 0 1 = = 1 1 = 0 0 = 1

Lecture 12 CMOS Delay & Transient Response

Algorithms and Complexity

7. Combinational Circuits

CPE/EE 427, CPE 527 VLSI Design I Delay Estimation. Department of Electrical and Computer Engineering University of Alabama in Huntsville

Chapter 22 Lecture. Essential University Physics Richard Wolfson 2 nd Edition. Electric Potential 電位 Pearson Education, Inc.

生物統計教育訓練 - 課程. Introduction to equivalence, superior, inferior studies in RCT 謝宗成副教授慈濟大學醫學科學研究所. TEL: ext 2015

5.5 Using Entropy to Calculate the Natural Direction of a Process in an Isolated System

5. CMOS Gate Characteristics CS755

數位系統設計 Digital System Design

相關分析. Scatter Diagram. Ch 13 線性迴歸與相關分析. Correlation Analysis. Correlation Analysis. Linear Regression And Correlation Analysis

= lim(x + 1) lim x 1 x 1 (x 2 + 1) 2 (for the latter let y = x2 + 1) lim

Chapter 1 Linear Regression with One Predictor Variable

邏輯設計 Hw#6 請於 6/13( 五 ) 下課前繳交

雷射原理. The Principle of Laser. 授課教授 : 林彥勝博士 Contents

pseudo-code-2012.docx 2013/5/9

Static CMOS Circuits. Example 1

Lecture 6: Circuit design part 1

Delay and Power Estimation

Interconnect (2) Buffering Techniques. Logical Effort

Ch.9 Liquids and Solids

Linear Regression. Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) SDA Regression 1 / 34

EXPERMENT 9. To determination of Quinine by fluorescence spectroscopy. Introduction

Lecture 9: Combinational Circuit Design

EE M216A.:. Fall Lecture 4. Speed Optimization. Prof. Dejan Marković Speed Optimization via Gate Sizing

Chapter 4. Digital Integrated Circuit Design I. ECE 425/525 Chapter 4. CMOS design can be realized meet requirements from

C.K. Ken Yang UCLA Courtesy of MAH EE 215B

Lecture 5. Logical Effort Using LE on a Decoder

EE M216A.:. Fall Lecture 5. Logical Effort. Prof. Dejan Marković

Lecture 1: Gate Delay Models

GSAS 安裝使用簡介 楊仲準中原大學物理系. Department of Physics, Chung Yuan Christian University

原子模型 Atomic Model 有了正確的原子模型, 才會發明了雷射

Digital Integrated Circuits A Design Perspective

Chapter 20 Cell Division Summary

EE141. Administrative Stuff

Lecture 7: Multistage Logic Networks. Best Number of Stages

CPE/EE 427, CPE 527 VLSI Design I L13: Wires, Design for Speed. Course Administration

ESE570 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Integrated Cicruits AND VLSI Fundamentals

EE115C Digital Electronic Circuits Homework #6

Logic Gate Sizing. The method of logical effort. João Canas Ferreira. March University of Porto Faculty of Engineering

Frequency Response (Bode Plot) with MATLAB

EE115C Digital Electronic Circuits Homework #5

Lecture 8: Combinational Circuits

14-A Orthogonal and Dual Orthogonal Y = A X

Lecture 8: Combinational Circuits

台灣大學開放式課程 有機化學乙 蔡蘊明教授 本著作除另有註明, 作者皆為蔡蘊明教授, 所有內容皆採用創用 CC 姓名標示 - 非商業使用 - 相同方式分享 3.0 台灣授權條款釋出

Lecture Outline. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. Review: 1st Order RC Delay Models. Review: Two-Input NOR Gate (NOR2)

統計學 Spring 2011 授課教師 : 統計系余清祥日期 :2011 年 3 月 22 日第十三章 : 變異數分析與實驗設計

EECS 151/251A Homework 5

CARNEGIE MELLON UNIVERSITY DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING DIGITAL INTEGRATED CIRCUITS FALL 2002

Integrated Circuits & Systems

Differential Equations (DE)

Lecture 9: Combinational Circuits

and V DS V GS V T (the saturation region) I DS = k 2 (V GS V T )2 (1+ V DS )

ESE570 Spring University of Pennsylvania Department of Electrical and System Engineering Digital Integrated Cicruits AND VLSI Fundamentals

Statistical Intervals and the Applications. Hsiuying Wang Institute of Statistics National Chiao Tung University Hsinchu, Taiwan

CHAPTER 2. Energy Bands and Carrier Concentration in Thermal Equilibrium

Chapter 8 Lecture. Essential University Physics Richard Wolfson 2 nd Edition. Gravity 重力 Pearson Education, Inc. Slide 8-1

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

Lecture 14: Circuit Families

EEE 421 VLSI Circuits

2019 年第 51 屆國際化學奧林匹亞競賽 國內初選筆試 - 選擇題答案卷

EEC 118 Lecture #6: CMOS Logic. Rajeevan Amirtharajah University of California, Davis Jeff Parkhurst Intel Corporation

University of Toronto. Final Exam

Logic effort and gate sizing

Lecture 34: Portable Systems Technology Background Professor Randy H. Katz Computer Science 252 Fall 1995

Spiral 2 7. Capacitance, Delay and Sizing. Mark Redekopp

Chapter 7 Propositional and Predicate Logic

壓差式迴路式均熱片之研製 Fabrication of Pressure-Difference Loop Heat Spreader

Lecture 4: CMOS Transistor Theory

EEC 116 Lecture #5: CMOS Logic. Rajeevan Amirtharajah Bevan Baas University of California, Davis Jeff Parkhurst Intel Corporation

Advanced Engineering Mathematics 長榮大學科工系 105 級

Chapter 9. Estimating circuit speed. 9.1 Counting gate delays

國立中正大學八十一學年度應用數學研究所 碩士班研究生招生考試試題

Announcements. EE141-Spring 2007 Digital Integrated Circuits. CMOS SRAM Analysis (Read/Write) Class Material. Layout. Read Static Noise Margin

Name: Grade: Q1 Q2 Q3 Q4 Q5 Total. ESE370 Fall 2015

CMOS Technology Worksheet

在雲層閃光放電之前就開始提前釋放出離子是非常重要的因素 所有 FOREND 放電式避雷針都有離子加速裝置支援離子產生器 在產品設計時, 為增加電場更大範圍, 使用電極支援大氣離子化,

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

CPE/EE 427, CPE 527 VLSI Design I L18: Circuit Families. Outline

VLSI GATE LEVEL DESIGN UNIT - III P.VIDYA SAGAR ( ASSOCIATE PROFESSOR) Department of Electronics and Communication Engineering, VBIT

A Direct Simulation Method for Continuous Variable Transmission with Component-wise Design Specifications

Transcription:

Digital Integrated Circuits Lecture 5: Logical Effort Chih-Wei Liu VLSI Signal Processing LAB National Chiao Tung University cwliu@twins.ee.nctu.edu.tw DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 1

Outline RC Delay Estimation Logical Effort Delay in a Logic Gate Multistage Logic Networks Choosing the Best Number of Stages Example Summary DIC-Lec5 cwliu@twins.ee.nctu.edu.tw

RC Delay Model Use equivalent circuits for MOS transistors Ideal switch + capacitance and ON resistance Unit nmos has resistance R, capacitance C Unit pmos has resistance R (mobility 較小 ), capacitance C Capacitance proportional to width Resistance inversely proportional to width DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 3

RC Values Capacitance C = C g = C s = C d = ff/μm of gate width Values similar across many processes Resistance R 6 KΩ*μm in 0.6um process Improves with shorter channel lengths Unit transistors May refer to minimum contacted device (4/ λ) 假設 nmos/pmos 寄生電容與閘極電容相同 Or maybe 1 μm wide device Doesn t matter as long as you are consistent DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 4

Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter by using the basic RC delay model of nmos and pmos RC model A 1 Y 1 s kc g kc R/k kc d DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 5

Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter R C A 1 Y 1 R C C Y C C 假設 A 電位提昇 C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 6

Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter R C A 1 Y 1 R C C Y C C R C C C C A=1 C output DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 7

Inverter Delay Estimate Estimate the delay of a fanout-of-1 inverter R C A 1 Y 1 R C C Y C C R C C C C C t pd = 6RC 若沒有寄生電容的 ideal Inverter 則為 t pd = 3RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 8

Example: 3-input NAND Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R). DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 9

Example: 3-input NAND Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R). Step 1 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 10

3-input NAND Caps We still assume C=C g =C diff Annotate the 3-input NAND gate with gate and diffusion capacitance. 3 3 3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 11

3-input NAND Caps We assume C=C g =C diff Annotate the 3-input NAND gate with gate and diffusion capacitance. C C C C C C C C C Step 3C 3C 3C 3 3 3 3C 3C 3C 3C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 1

3-input NAND Caps Annotate the 3-input NAND gate with gate and diffusion capacitance. Final result 5C 5C 5C 3 3 3 9C 3C 3C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 13

Elmore Delay Model ON transistors look like resistors Pullup or pulldown network modeled as RC ladder Elmore delay of RC ladder t R C pd i to source i nodes i ( )... (... ) = RC + R + R C + + R + R + + R C 1 1 1 1 R 1 R R 3 R N C 1 C C 3 C N N N RC ladder DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 14

Example: -input NAND Estimate rising and falling propagation delays of a - input NAND driving h identical gates. A B x Y h copies -input NAND DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 15

Example: -input NAND Estimate rising and falling propagation delays of a - input NAND driving h identical gates. A B x 6C C Y 4hC h copies DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 16

Example: -input NAND Estimate worst-case rising and falling propagation delays of a -input NAND driving h identical gates. A B x 6C C Y 4hC h copies Worst case rising: 只有一個 pmos 開啟 R Y ( ) (6+4h)C t pdr = 6+ 4h RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 17

Example: -input NAND Estimate worst-case rising and falling propagation delays of a - input NAND driving h identical gates. A B x 6C C Y 4hC h copies R/ x R/ C Y (6+4h)C Worst case falling:a 已經為 1 ;B 的電位開始上升 t pdf R R = ( )C + ( = (7 + 4h) RC + R )[(6 + 4h) C] DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 18

Delay Components Delay has two parts Parasitic delay 6 or 7 RC Independent of load Effort delay 4h RC Proportional to load capacitance DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 19

Contamination Delay Best-case (contamination) delay can be substantially less than worst-case propagation delay. Ex: If both inputs fall simultaneously A B x 6C C Y 4hC R R Y (6+4h)C t cdr R = ( )[(6 + 4h) C] = (3 + h) RC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 0

Contamination Delay Best-case (contamination) falling delay can be substantially less than worst-case propagation delay. Ex: B 已經為 1 ;A 的電位開始上升 A B x 6C C Y 4hC t cdf = = R R ( + )[(6 + (6 + 4h) RC 4h) C] DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 1

Delay Components Again, Delay has two parts Parasitic delay 3 or 6 RC Independent of load Effort delay 4h or h RC Proportional to load capacitance DIC-Lec5 cwliu@twins.ee.nctu.edu.tw

Diffusion Capacitance We assumed contacted diffusion on every s / d. Good layout minimizes diffusion area Ex: NAND3 layout shares one diffusion contact Reduces output capacitance by C Merged un-contacted diffusion might help too 3 9C 5C 3 3C 5C 3 3C 5C Recall 3-input NAND Shared Contacted Diffusion Merged Uncontacted Diffusion C C Gate capacitance: 由電晶體的寬度決定 Diffusion capacitance: 與佈局有關 Isolated Contacted Diffusion 3 3 7C 3C 3C 3C 3C 3 3C DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 3

Layout Comparison Which layout is better? V DD A B V DD A B Y Y GND GND DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 4

Introduction Chip designers face a bewildering array of choices What is the best circuit topology for a function? How many stages of logic give least delay? How wide should the transistors be???? Logical effort is a method to make these decisions Uses a simple model of delay Allows back-of-the-envelope calculations Helps make rapid comparisons between alternatives Emphasizes remarkable symmetries DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 5

Example Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file. Decoder specifications: A[3:0] A[3:0] 3 bits 16 word register file Each word is 3 bits wide Each bit presents load of 3 unit-sized transistors 4:16 Decoder 16 Register File 16 words True and complementary address inputs A[3:0] Each input may drive 10 unit-sized transistors Ben needs to decide: How many stages to use? How large should each gate be? How fast can decoder operate? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 6

Delay in a Logic Gate Express delays in process-independent unit d = d abs τ = 3RC 沒有寄生電容的理想反向器 1 ps in 180 nm process τ Normalized delay representation 40 ps in 0.6 μm process DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 7

Delay in a Logic Gate Express delays in process-independent unit d = d abs τ Delay has two components d = f + p 無關於製程型態的方式來表示 Delay, 可使電路能依照佈局來做 Delay 的比較, 而不必考慮不同製程對電路速度的影響 Linear delay model DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 8

Delay in a Logic Gate Express delays in process-independent unit Delay has two components Effort delay: f = gh (a.k.a. stage effort) d = d d abs τ = f + p Again has two components DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 9

Delay in a Logic Gate Express delays in process-independent unit Delay has two components Effort delay f = gh (a.k.a. stage effort) Again has two components g: logical effort d = d abs τ d = f + p Measures relative ability of gate to deliver current g 1 for inverter Gate Complexity, g i.e. 會耗費較多的時間去驅動相同 fanout 數的電晶體 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 30

Delay in a Logic Gate Express delays in process-independent unit Delay has two components Effort delay f = gh (a.k.a. stage effort) Again has two components h: electrical effort = C out / C in d = d abs τ d = f + p Ratio of output to input capacitance Sometimes called fanout Fan-out, h DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 31

Delay in a Logic Gate Express delays in process-independent unit Delay has two components Parasitic delay p d = d d abs τ = f + p Represents delay of gate driving no load Set by internal parasitic capacitance DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 3

Delay in NAND Effort delay: 4hC ( 與外部電容有關 ) Delay in -input NAND 6C d = f + p 4hC = + 3C 4 = ( ) h + = gh + 3 p A B Normalized delay representation How to calculate the logical effort g? x C 註 :Ideal Inverter ( 無寄生電容 Single Fanout) τ = 3RC;g = 1 Y 4hC DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 33

Delay Plots d = f + p = gh + p Normalized Delay: d 6 5 4 3 -input NAND g = p = d = Inverter g = p = d = 1 0 0 1 3 4 5 Electrical Effort: h = C out / C in DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 34

Delay Plots d = f + p = gh + p What about NOR? Normalized Delay: d 6 5 4 3 -input NAND Inverter g = 4/3 p = d = (4/3)h + g = 1 p = 1 d = h + 1 Effort Delay: f 1 0 Parasitic Delay: p 0 1 3 4 5 Electrical Effort: h = C out / C in DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 35

Computing Logical Effort DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current. Measure from delay vs. fanout plots Or estimate by counting transistor widths A 1 Y A B Y A B 4 4 1 1 Y C in = 3 g = 3/3 C in = 4 g = 4/3 C in = 5 g = 5/3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 36

Catalog of Gates Logical effort of common gates Gate type Number of inputs 1 3 4 n Inverter 1 NAND 4/3 5/3 6/3 (n+)/3 NOR 5/3 7/3 9/3 (n+1)/3 Tristate / mux XOR, XNOR 4, 4 6, 1, 6 8, 16, 16, 8 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 37

Recall: 4:1 Multiplexer 4:1 mux chooses one of 4 inputs using two selects S1S0 S1S0 S1S0 S1S0 D0 D1 Logical Effort =; 與輸入的數目無關 D D3 大型多工器的速度與小型多工器的速度相當? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 38

Computing Parasitic Delay DEF: Represents delay of gate driving no load Set by internal parasitic capacitance Measure from delay vs. fanout plots Or estimate by counting total diffusion capacitances at output A A Y 1 B Y A 4 B 4 Y 1 1 C= 3 p= 3/3= 1 C = 6 p= 6/3= C= 6 p= 6/3= 增加電晶體大小會降低電阻, 但卻會增加寄生電容, 但較寬的電晶體可以摺疊, 使內部的寄生電容可能變小 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 39

Catalog of Gates Parasitic delay of common gates In multiples of p inv ( 1) Gate type Number of inputs 1 3 4 n Inverter 1 NAND 3 4 n NOR 3 4 n Tristate / mux 4 6 8 n XOR, XNOR 4 6 8 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 40

Example: FO4 Inverter Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort: g = Electrical Effort: h = Parasitic Delay: p = Stage Delay: d = DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 41

Example1: FO4 Inverter Estimate the delay of a fanout-of-4 (FO4) inverter d Logical Effort: g = 1 Electrical Effort: h = 4 Parasitic Delay: p = 1 Stage Delay: d = gh + p = 5 經驗法則 The FO4 delay is about 00 ps in 0.6 μm process 60 ps in a 180 nm process f/3 ns in an f μm process DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 4

Example: Ring Oscillator Estimate the frequency of an N-stage ring oscillator Odd-number stages Logical Effort: g = 1 Electrical Effort: h = 1 31 stage ring oscillator in 0.6 μm process has Parasitic Delay: p = 1 frequency of ~ 00 MHz Stage Delay: d = Frequency: f osc = 1/(*N*d) = 1/4N 如果 g 1, 那如何計算多級邏輯網路的延遲呢? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 43

Multistage Logic Networks Logical effort generalizes to multistage networks Path Logical Effort Path Electrical Effort Path Effort G = g i C H = C out-path in-path F = f = gh 與電晶體大小無關 與電晶體大小有關 i i i 10 g 1 = 1 h 1 = x/10 x g = 5/3 h = y/x y g 3 = 4/3 h 3 = z/y z g 4 = 1 h 4 = 0/z 0 F=GH=(5/3)(4/3)(x/10)(y/x)(z/y)(0/z)=G(0/10) DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 44

Multistage Logic Networks Logical effort generalizes to multistage networks Path Logical Effort Path Electrical Effort Path Effort G = g i C H = C out-path in-path F = fi = gh i i Can we always write F = GH? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 45

Branch Paths No! Consider paths that branch: G = H = 5 15 90 GH = h 1 = 15 90 h = F = GH? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 46

Paths that Branch No! Consider paths that branch: G = 1 H = 90 / 5 = 18 5 15 90 GH = 18 h 1 = (15 +15) / 5 = 6 15 90 h = 90 / 15 = 6 F = g 1 g h 1 h = 36 = GH DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 47

Branching Effort Branching effort Accounts for branching between stages in path Now we compute the path effort b = C F = GBH on path B= b i C + C on path off path Path branching effort Note: h i = BH DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 48

Multistage Delays Path Effort Delay D F = f i Path Parasitic Delay Path Delay P= p i = i = F + D d D P Recall that F = GBH 各級 path effort 的乘積為 F;path delay 則為各級 path effort delay 的總和 最小值發生在各級的 path effort delay 相同, 亦即 f i = F 1/N DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 49

Designing Fast Circuits Delay is smallest when each stage bears same effort Thus minimum delay of N stage path is This is a key result of logical effort = i = F + D d D P fˆ = gh = F i i 1 N D= NF + P 1 N Find fastest possible delay Doesn t require calculating gate sizes 這種方式優於模擬 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 50

Choose Gate Sizes to Meet the Least Delay Design How wide should the gates be for least delay? fˆ = gh= g C = in i C C out in gc i fˆ out i Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives. Check work by verifying input cap spec is met. DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 51

Example: 3-stage path Select gate sizes x and y for least delay from A to B x x y 45 A 8 x y B 45 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 5

Example: 3-stage path x A 8 x x y y 45 B 45 Logical Effort G = Electrical Effort H = Branching Effort B = Path Effort F = Best Stage Effort ˆf = Parasitic Delay P = Delay D = DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 53

Example: 3-stage path x A 8 x x y y 45 B 45 Logical Effort G = (4/3)*(5/3)*(5/3) = 100/7 Electrical Effort H = 45/8 Branching Effort B = 3 * = 6 Path Effort F = GBH = 15 Best Stage Effort ˆ 3 F 5 Parasitic Delay P = + 3 + = 7 f = = Delay D = 3*5 + 7 = = 4.4 FO4 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 54

Example: 3-stage path Work backward for sizes y = x = x x y 45 A 8 x y B 45 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 55

Example: 3-stage path Work backward for sizes y = 45 * (5/3) / 5 = 15 x = (15+15) * (5/3) / 5 = 10 fˆ = gh= g C = in i C C out in gic fˆ out i 45 A P: 4 N: 4 P: 4 N: 6 P: 1 N: 3 B 45 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 56

Example: 3-stage path Check the result: 1. the size of the NAND gate: (10+10+10)*(4/3) / 5 = 8. NAND gate delay: d 1 =g 1 h 1 +p 1 = (4/3)*(10+10+10) / 8 + = 7 3. NAND3 gate delay: d =g h +p = (5/3)*(15+15) / 10 + 3 = 8 4. NOR gate delay: d 3 =g 3 h 3 +p 3 = (5/3)*(45/15) + = 7 7+8+7 = 45 A P: 4 N: 4 P: 4 N: 6 P: 1 N: 3 B 45 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 57

Best Number of Stages How many stages should a path use? Minimizing number of stages is not always fastest Example: drive 64-bit datapath with unit inverter Initial Driver 1 1 1 1 D = Datapath Load 64 64 64 64 N: f: D: 1 3 4 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 58

Best Number of Stages Example: drive 64-bit datapath with unit inverter Initial Driver 1 1 1 1 D = NF 1/N + P = N(64) 1/N + N 8 4.8 16 8 3 Datapath Load 64 64 64 64 N: f: D: 1 64 65 8 18 3 4 15 Fastest 4.8 15.3 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 59

Derivation for Best Number of Stages Consider adding inverters to end of path How many give least delay? = n 1 1 N + i + i= 1 ( ) D NF p N n p D N 1 1 1 N N N 1 inv = F ln F + F + p = 0 inv Logic Block: n 1 Stages Path Effort F N - n 1 Extra Inverters Define best stage effort 1 ρ = F N p + ρ 1 lnρ = 0 inv ( ) DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 60

Best Number of Stages p + ρ 1 lnρ = 0 ( ) has no closed-form solution inv Neglecting parasitics (p inv = 0), we find ρ =.718 (e) For p inv = 1, solve numerically for ρ = 3.59 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 61

Sensitivity Analysis How sensitive is delay to using exactly the best number 1.6 of stages? 1.51 Nˆ = log ρ F D(N) /D(N) 1.4 1.6 1. 1.15 1.0 (ρ=6) (ρ =.4) If p inv = 1 0.0 0.5 0.7 1.0 1.4.0.4 < ρ < 6 gives delay within 15% of optimal We can be sloppy! I like ρ = 4 N / N DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 6

Example, Revisited Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file. Decoder specifications: A[3:0] A[3:0] 3 bits 16 word register file Each word is 3 bits wide Each bit presents load of 3 unit-sized transistors 4:16 Decoder 16 Register File 16 words True and complementary address inputs A[3:0] Each input may drive 10 unit-sized transistors Ben needs to decide: How many stages to use? How large should each gate be? How fast can decoder operate? DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 63

Possible 4:16 Decoder Each word is 3 bits wide Each bit presents load of 3 unit-sized transistors Each input may drive 10 unit-sized transistors A[3] A[3] A[] A[] A[1] A[1] A[0] A[0] 10 10 10 10 10 10 10 10 y z word[0] 96 units of wordline capacitance y z word[15] DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 64

Number of Stages Decoder effort is mainly electrical and branching Electrical Effort: H = (3*3) / 10 = 9.6 Branching Effort: B = 8 If we neglect logical effort (assume G = 1) Path Effort: F = GBH = 76.8 Number of Stages: N = log 4 F = 3.1 Try a 3-stage design DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 65

Gate Sizes & Delay Logical Effort: G = 1 * 6/3 * 1 = Path Effort: F = GBH = 154 Stage Effort: Path Delay: Gate sizes: z = 96*1/5.36 = 18 y = 18*/5.36 = 6.7 A[3] A[3] A[] A[] A[1] A[1] A[0] A[0] 10 10 10 10 10 10 10 10 ˆ 1/3 f = F = 5.36 D= 3 fˆ + 1+ 4 + 1 =.1 落在.4 與 6 之間 y z word[0] 96 units of wordline capacitance y z word[15] DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 66

Comparison Compare many alternatives with a spreadsheet Design N G P D NAND4-INV 5 9.8 NAND-NOR 0/9 4 30.1 INV-NAND4-INV 3 6.1 NAND4-INV-INV-INV 4 7 1.1 NAND-NOR-INV-INV 4 0/9 6 0.5 NAND-INV-NAND-INV 4 16/9 6 19.7 INV-NAND-INV-NAND-INV 5 16/9 7 0.4 NAND-INV-NAND-INV-INV-INV 6 16/9 8 1.6 DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 67

Review of Definitions Term number of stages logical effort electrical effort branching effort effort effort delay parasitic delay delay Stage 1 g h = b = f f p = C C C out in on-path gh + C C on-path d = f + p off-path Path N G = g i H = C C out-path in-path B = b i F = GBH D F = f i P = p i D = d = D + P i F DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 68

Method of Logical Effort 1) Compute path effort ) Estimate best number of stages 3) Sketch path with N stages 4) Estimate least delay 5) Determine best stage effort F = GBH N = log 4 F 1 N D= NF + P fˆ = F 1 N 6) Find gate sizes C in i = gic fˆ out i DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 69

Limits of Logical Effort Chicken and egg problem Need path to compute G But don t know number of stages without G Simplistic delay model Neglects input rise time effects Interconnect Iteration required in designs with wire Maximum speed only Not minimum area/power for constrained delay DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 70

Summary Logical effort is useful for thinking of delay in circuits Numeric logical effort characterizes gates NANDs are faster than NORs in CMOS Paths are fastest when effort delays are ~4 Path delay is weakly sensitive to stages, sizes But using fewer stages doesn t mean faster paths Delay of path is about log 4 F FO4 inverter delays Inverters and NAND best for driving large caps Provides language for discussing fast circuits But requires practice to master DIC-Lec5 cwliu@twins.ee.nctu.edu.tw 71