Clocking Issues: Distribution, Energy

Similar documents
EE115C Winter 2017 Digital Electronic Circuits. Lecture 19: Timing Analysis

The Linear-Feedback Shift Register

Xarxes de distribució del senyal de. interferència electromagnètica, consum, soroll de conmutació.

Lecture 25. Dealing with Interconnect and Timing. Digital Integrated Circuits Interconnect

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics

Clock signal in digital circuit is responsible for synchronizing the transfer to the data between processing elements.

Timing Issues. Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolić. January 2003

GMU, ECE 680 Physical VLSI Design 1

Lecture 9: Sequential Logic Circuits. Reading: CH 7

Lecture 9: Clocking, Clock Skew, Clock Jitter, Clock Distribution and some FM

9/18/2008 GMU, ECE 680 Physical VLSI Design

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II

Clock Strategy. VLSI System Design NCKUEE-KJLEE

GMU, ECE 680 Physical VLSI Design

Digital Integrated Circuits A Design Perspective

Designing Sequential Logic Circuits

Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. November Digital Integrated Circuits 2nd Sequential Circuits

Digital Integrated Circuits A Design Perspective

Y. Tsiatouhas. VLSI Systems and Computer Architecture Lab

Digital Integrated Circuits A Design Perspective

Skew-Tolerant Circuit Design

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

Itanium TM Processor Clock Design

MODULE 5 Chapter 7. Clocked Storage Elements

Hw 6 due Thursday, Nov 3, 5pm No lab this week

EE241 - Spring 2006 Advanced Digital Integrated Circuits

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

Integrated Circuits & Systems

CMPEN 411. Spring Lecture 18: Static Sequential Circuits

EECS 427 Lecture 14: Timing Readings: EECS 427 F09 Lecture Reminders

Jin-Fu Li Advanced Reliable Systems (ARES) Lab. Department of Electrical Engineering. Jungli, Taiwan

Digital Integrated Circuits A Design Perspective

ΗΜΥ 307 ΨΗΦΙΑΚΑ ΟΛΟΚΛΗΡΩΜΕΝΑ ΚΥΚΛΩΜΑΤΑ Εαρινό Εξάμηνο 2018

EE141- Spring 2007 Digital Integrated Circuits

Issues on Timing and Clocking

Lecture 16: Circuit Pitfalls

Lecture 4: Implementing Logic in CMOS

Dynamic Combinational Circuits. Dynamic Logic

EE141Microelettronica. CMOS Logic

9/18/2008 GMU, ECE 680 Physical VLSI Design

Energy Delay Optimization

COMP 103. Lecture 16. Dynamic Logic

Memory, Latches, & Registers

Lecture 1: Circuits & Layout

Topics. Dynamic CMOS Sequential Design Memory and Control. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

EECS 151/251A Homework 5

EE M216A.:. Fall Lecture 5. Logical Effort. Prof. Dejan Marković

Digital System Clocking: High-Performance and Low-Power Aspects. Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M.

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

EE 560 CHIP INPUT AND OUTPUT (I/0) CIRCUITS. Kenneth R. Laker, University of Pennsylvania

Digital Integrated Circuits A Design Perspective

Lecture 21: Packaging, Power, & Clock

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Topics. CMOS Design Multi-input delay analysis. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

Chapter 5 CMOS Logic Gate Design

Logical Effort: Designing for Speed on the Back of an Envelope David Harris Harvey Mudd College Claremont, CA

5. Sequential Logic x Computation Structures Part 1 Digital Circuits. Copyright 2015 MIT EECS

UNIVERSITY OF CALIFORNIA, BERKELEY College of Engineering Department of Electrical Engineering and Computer Sciences

Adders, subtractors comparators, multipliers and other ALU elements

EE141-Fall 2011 Digital Integrated Circuits

Lecture 27: Latches. Final presentations May 8, 1-5pm, BWRC Final reports due May 7 Final exam, Monday, May :30pm, 241 Cory

EE 447 VLSI Design. Lecture 5: Logical Effort

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

! Charge Leakage/Charge Sharing. " Domino Logic Design Considerations. ! Logic Comparisons. ! Memory. " Classification. " ROM Memories.

Announcements. EE141- Fall 2002 Lecture 25. Interconnect Effects I/O, Power Distribution

Digital Integrated Circuits Designing Combinational Logic Circuits. Fuyuzhuo

VLSI Design, Fall Logical Effort. Jacob Abraham

Last Lecture. Power Dissipation CMOS Scaling. EECS 141 S02 Lecture 8

Circuit A. Circuit B

Digital VLSI Design. Lecture 8: Clock Tree Synthesis

ECE429 Introduction to VLSI Design

Lecture 14: Circuit Families

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

L15: Custom and ASIC VLSI Integration

EECS 141 F01 Lecture 17

Properties of CMOS Gates Snapshot

Lecture 6: Logical Effort

! Crosstalk. ! Repeaters in Wiring. ! Transmission Lines. " Where transmission lines arise? " Lossless Transmission Line.

ECE321 Electronics I

! Memory. " RAM Memory. ! Cell size accounts for most of memory array size. ! 6T SRAM Cell. " Used in most commercial chips

C.K. Ken Yang UCLA Courtesy of MAH EE 215B

EE241 - Spring 2000 Advanced Digital Integrated Circuits. Announcements

Lecture 16: Circuit Pitfalls

Next, we check the race condition to see if the circuit will work properly. Note that the minimum logic delay is a single sum.

Semiconductor Memories

Adders, subtractors comparators, multipliers and other ALU elements

UNIVERSITY OF CALIFORNIA

EE382 Processor Design Winter 1999 Chapter 2 Lectures Clocking and Pipelining

Luis Manuel Santana Gallego 31 Investigation and simulation of the clock skew in modern integrated circuits

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1

EECS 427 Lecture 8: Adders Readings: EECS 427 F09 Lecture 8 1. Reminders. HW3 project initial proposal: due Wednesday 10/7

Lecture 8: Combinational Circuit Design

CPE/EE 427, CPE 527 VLSI Design I L18: Circuit Families. Outline

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Topics to be Covered. capacitance inductance transmission lines

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

EE 330 Lecture 6. Improved Switch-Level Model Propagation Delay Stick Diagrams Technology Files

ΗΜΥ 307 ΨΗΦΙΑΚΑ ΟΛΟΚΛΗΡΩΜΕΝΑ ΚΥΚΛΩΜΑΤΑ Εαρινό Εξάμηνο 2018

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

Transcription:

EE M216A.:. Fall 2010 Lecture 12 Clocking Issues: istribution, Energy Prof. ejan Marković ee216a@gmail.com Clock istribution Goals: eliver clock to all memory elements with acceptable skew eliver clock edges with acceptable sharpness Clock network design is another one of the big challenges in the design of a large chip Clocks are generally distributed via wiring trees Want to use low resistance interconnect to minimize delay Use multiple drivers to distribute driver requirements Use optimal sizing principles to design buffers Clock lines can create significant crosstalk Lecture 12:. Clocking Markovic Issues / Slide 2 2

Issues in Clock istribution Network Skew Process, voltage, temp. ata dependence Noise coupling Load balancing Power, CV 2 f (no ½ or α) Clock gating Flexibility/Tunability Compactness fit into existing layout/design Reliability Electromigration Skew from distribution See videos from P. Restle (IBM) on classwiki Lecture 12:. Clocking Markovic Issues / Slide 3 3 Clock istribution Methods RC Tree Less capacitance More accuracy Fl ibl i i Grids Reliable Less data dependency T bl (l t i d i ) Flexible wiring Tunable (late in design) Shown here for final stage drivers driving F F loads Lecture 12:. Clocking Markovic Issues / Slide 4 4

RC Trees, Clock istribution Observe: only relative skew is important H Tree X Tree Binary Tree Asymmetric trees that match RCs can be used CLOCK H-Tree Network Lecture 12:. Clocking Markovic Issues / Slide 5 5 More Realistic H Tree [Restle98] Lecture 12:. Clocking Markovic Issues / Slide 6 6

The Grid System No RC matching Large power river GCLK clock grid EC Alpha Examples GCLK river river GCLK 21064 21164 river GCLK 21264 Lecture 12:. Clocking Markovic Issues / Slide 7 7 Examples of istribution H Tree Asymmetric RC Tree IBM Grids EC [Alphas] Serpentines Intel x86 [Young ISSCC97] Lecture 12:. Clocking Markovic Issues / Slide 8 8

Examples of Processor Chips EC Alpha 21064 clock spines Note: reverse Z axis EC Alpha 21064 RC delays EC Alpha 21164 RC delays for global distribution Spine + Grid EC Alpha 21164 RC local delays Lecture 12:. Clocking Markovic Issues / Slide 9 9 Example: EV5 (Alpha 21164) Clocking (1995) t cycle = 3.3ns t rise = 0.35ns t skew = 150ps Clock rivers Single phase clocking 2 distributed driver channels Reduced RC delay/skew Improved thermal distribution 3.75 nf clock load 58 cm final driver width Local inverters for latching Conditional clocks in caches to reduce power More complex race checking evice variation Lecture 12:. Clocking Markovic Issues / Slide 10 10

Example: EV6 (Alpha 21264) Clocking (1998) 600MHz, 0.35µm CMOS t cycle = 1.67ns t rise = 0.15ns Global clock waveform t skew = 50ps rise skew p PLL Multiple conditional buffered clocks 2.8 nf clock load 40 cm final driver width Reduced load/skew Reduced thermal issues Multiple clocks complicate race checking Lecture 12:. Clocking Markovic Issues / Slide 11 11 21264 Clocking Lecture 12:. Clocking Markovic Issues / Slide 12 12

EV6 Clock Results ps 5 10 15 20 25 30 35 40 45 50 ps 300 305 310 315 320 325 330 335 340 345 GCLK Skew (at V /2 Crossings) GCLK Rise Times (20% to 80% Extrapolated to 0% to 100%) Lecture 12:. Clocking Markovic Issues / Slide 13 13 EV7 Clock Hierarchy, 2002 152 million transistors, 15/137 logic/memory Active Skew Management and Multiple Clock omains L2L_CLK (L2 Cache) LL NCLK (Mem Ctrl) GCLK (U Core) LL PLL LL L2R_CLK (L2 Cache) Widely dispersed drivers LLs compensate static and low frequency variation ivides design and verification effort LL design and verification is added work Tailored clocks SYSCLK Lecture 12:. Clocking Markovic Issues / Slide 14 14

Alpha Processors Case Study EV4 (21064) 0.75µm, 200 MHz ~ 1992 Single global clock driver, 5 levels of buffering 35 cm driver, 3.25 nf, 40% power EV5 (21164) 0.5µm, 300 MHz ~ 1995 One central, two side clock drivers 58 cm driver, 3.75 nf, 40% power EV6 (21264) 0.35µm, 600 MHz ~ 1998 Clock grid, 4 window panes, hierarchical, gated clock domains 40 cm driver, 2.8 nf EV7 0.18µm, 1.2GHz~ 2002 Multiple clock domains, LLs Lecture 12:. Clocking Markovic Issues / Slide 15 15 Intel Processors Pentium II Pentium III Pentium 4 MPR Issue June 1997 April 2000 ec 2001 Clock Speed 266 MHz 1GHz 2GHz Pipeline Stages 12/14 12/14 22/24 Transistors 7.5M 24M 42M Cache (I//L2) 16k/16K/ 16K/16K/256K 12K/8K/256K ie Size 203mm 2 106mm 2 217mm 2 IC Process 0.25µm, 4M 0.18µm, 6M 0.18µm, 6M Max Power 27W 23W 67W Increasing clock speeds and die size Balancing skew using simple RC trees becoming less effective Insertion delay 7 8FO4 due to increased die Clock skew control harder to due to increasing PVT variations Use of active deskewing circuits Lecture 12:. Clocking Markovic Issues / Slide 16 16

IA 32 Pentium Pro: Active eskewing Ext FB CLK Gen elay Line elay SR eskew Control elay Line elay SR Core Lef ft Spine P Rig ght Spine Clock distribution network with deskewing circuit [Geannopoulos and ai 1998] Lecture 12:. Clocking Markovic Issues / Slide 17 17 IA 32 Pentium Pro: elay Shift Register In Load<1:15,2> elay Line Load<0:14,2> Out <1:15,2> <0:14,2> elay shift register [Geannopoulos and ai 1998] elay Shift Register Lecture 12:. Clocking Markovic Issues / Slide 18 18

IA 32 Pentium Pro: Phase etector Right Bandwidth Control elay = n Phase etector 1 Left ftleads Left elay = n Phase etector 2 Right Leads Phase detector [Geannopoulos and ai 1998] Lecture 12:. Clocking Markovic Issues / Slide 19 19 First IA 64 Processor: Clock istribution PLL RCs PLL Clock distribution topology [Rusu and Tam 2000] Core Clock Reference Clock eskew Cluster PLL generates internal clock 2x frequency distribution architecture Balanced global clock tree Multiple deskew buffers Multiple local clock buffers Lecture 12:. Clocking Markovic Issues / Slide 20 20

Result of Active eskewing Measured regional clock skew [Rusu and Tam 2000] Lecture 12:. Clocking Markovic Issues / Slide 21 21 Some Energy Reduction Ideas Clock gating Global Local Low swing clocking Reduced swing drivers CSE redesign N only CSE w/ low swing ual edge triggering Latch mux Pulsed latch Lecture 12:. Clocking Markovic Issues / Slide 22 22

Global Clock Gating Used to save clocking energy when data activity is low (a) In 0 1 S (b) In Load Time mux (no gating) REG EN REG Global Gating (a) Nongated clock circuit, (b) gated clock circuit. [Kitahara et al. 1998] Lecture 12:. Clocking Markovic Issues / Slide 23 23 Local Clock Gating (Requires FF Redesign) Same concept, control at a fine level of granularity (single FF) Pulse Generator I M Clock Control P 1 ata-transition look-ahead latch [Nogawa and Ohtomo, 1998] ata Transition Look Ahead Lecture 12:. Clocking Markovic Issues / Slide 24 24

Local Clock Gating, Another Example Clock enabled only when * * S S' R' R N N 1 Conditional-capture FF [Kong et al. 2000] S' R R' S Lecture 12:. Clocking Markovic Issues / Slide 25 25 Gated Transmission Gate M S Latch Concept: inhibit clock when = comp= XNOR comp = 0 ( ), M S Latch comp = 1 ( = ), = 0, 1 = 1 latches closed, no output change, no internal power S M 1 comp 1 M 0.5 * 0.5 0.5 0.5 * 1 S S comp S 0.5 0.5 S S S Gated M-S latch [Markovic et al. 2001] Lecture 12:. Clocking Markovic Issues / Slide 26 26

Gated TG MS Latch: Timing and Energy Setup (U) and Hold time (H) Energy Increased t setup in gated MSL due to inclusion of the comparator into the critical path slower than conventional MSL Smaller energy per transition if switching activity of is <0.3 For higher α, comparator and gen dominate the energy Lecture 12:. Clocking Markovic Issues / Slide 27 27 ata Transition Look Ahead (TLA) Pulsed Latch Pulsed latch, pulses are gated with XOR TLA circuit If P 1 = 0, circuit operates as a conventional pulsed latch If = P 1 = 1 = 0, no change or energy consumption XOR and Control in the critical path large t setup, and t M Pulse Generator I Clock Control P 1 [Nogawa and Ohtomo, 1998] ata Transition Look Ahead Lecture 12:. Clocking Markovic Issues / Slide 28 28

TLA L: Analysis of Energy Consumption M TLA L without clock gating Pulse Generator I C in (CC) TLA L Pulse Generator α 1 α E = ( E + E ) + E + E CMSL 01 10 Cin 2 2 E L FF α = ( E 2 0 1 + E 1 0 1 α ) + E 2 idle = E idle + ECLK + EL CC + E E E 0 1 + int ext 1 0 = E idle + ECLK + EG + Eint E + E N PG E idle = E0 0 = E1 1 = + E Cin PG shared among N TLA L s α input switching activity Lecture 12:. Clocking Markovic Issues / Slide 29 29 Energy Comparison: TLA L vs. Conventional M S TLA L is more energy efficient than CMSL when N > 2 and α < 0.25 E(TLA L) < E(CMSL) E(TLA L) > E(CMSL) Lecture 12:. Clocking Markovic Issues / Slide 30 30

Clock on emand (CO) Pulsed Latch Pulsed latch, pulses gated with XNOR TLA ckt If XNOR=0, 1 when, and 0 after has changed to If = XNOR=1 =0, no change or energy consumption ata Transition Look Ahead XNOR Pulse Generator includes clock control Cannot be shared among multiple PL s Pulse Generator [Hamada et al., 1999] Lecture 12:. Clocking Markovic Issues / Slide 31 31 Energy Efficient Pulse Generator in CO PL XNOR 1 "0 C int "1 " "1" Straightforward realization with CMOS gates C int switches in each cycle Energy inefficient XNOR "0 inv inv XNOR Compound AN NOR "1" Compound AN NOR gate Energyefficient Lecture 12:. Clocking Markovic Issues / Slide 32 32

Circuit Sizing Impacts Energy Efficiency of CO PL CO PL more effective for high speed due to large transistors [Markovic et al., 2001] Lecture 12:. Clocking Markovic Issues / Slide 33 33 Conditional Capture Flip Flop (CCFF) First stage: pulse generator with internal clock gating When =1, S=R=1 When =1, 1=0, S can switch low if =1, =0, R can switch low if =0, =1 Otherwise, S=R=1 no energy consumed Second stage: pass gate implementation of SAFF No t setup degradation due to clock gating S * * S R R N N 1 S R R S [Kong et al. 2000] Lecture 12:. Clocking Markovic Issues / Slide 34 34

Timing Comparison elay Race Immunity (R) esigns with gating / elay: Relative delay constant w/ V Large delay due to long t setup Internal race immunity: R(FF) < R(MSL) < R(Gated MSL) CO PL bad (wide pulse) [Markovic et al. 2001] Lecture 12:. Clocking Markovic Issues / Slide 35 35 Low Swing Clocking: river Re design Half swing clock drivers: 50% power reduction (minus some penalty in clock drivers) V T C p1 B C p2 C A V V thp T B GN CNT C n1 CNB C n2 H V C B V thn GN CNT CNB [Kojima at al. 1995] Lecture 12:. Clocking Markovic Issues / Slide 36 36

N only Clocked Latches Bring clock only to n MOS transistors to allow reduced clock swing without conflict with partially turned off p MOS transistors Reduced clocking energy with some penalty in performance is in the critical path S SM M N PL (c) SS d 1 N 1 N 2 S N MSL (a) N FF (b) N PPL (d) Pulse Generator (b) (d) (a) conventional TG MSL, (b) pulsed latch, (c) conventional PL, (d) push pull PL Lecture 12:. Clocking Markovic Issues / Slide 37 37 Low Swing FFs: Energy elay Comparison Full swing: PL preferred for high speed MSL preferred for low energy Low swing clock: N FF preferred for high speed N PPL is preferred for low energy 130nm technology, C L = 50fF, C in 12.5fF, α = 0.1 cycle (norm) Energy / / cycle (norm) Energy 1.2 1.0 0.8 0.6 0.4 0.2 0.00 0.8 0.6 0.4 0.2 0 PL N-PPL N-FF N-PPL N-FF MSL N-PL High-Vdd Low-Swing NMSL N-MSL 0 1 2 3 4 5 6 ata-to- delay (FO3 inverter delay) [Markovic et al. 2004] Lecture 12:. Clocking Markovic Issues / Slide 38 38

Effect of Clock Noise on Latch elay All latches fail for clock noise > 12% of clock voltage N FF gives best clock noise rejection delay degradation - 20% 16% 12% 8% 4% 0% N-CL N-PPL N-FF 0% 3% 6% 9% 12% Noise on low-swing clock Lecture 12:. Clocking Markovic Issues / Slide 39 39 ual Edge Triggering: Latch Mux Saves clocking energy regardless of data activity! Extra Mux longer delay C 0 1 S C Lecture 12:. Clocking Markovic Issues / Slide 40 40

ET Latch Mux: Circuit Example Pass gate latches One transparent when = 0 One transparent when = 1 Pass gate multiplexer l that t selects the output t of the opaque latch [Llopis and Sachdev, 1996] Lecture 12:. Clocking Markovic Issues / Slide 41 41 C 2 MOS Latch Mux (C 2 MOS LM) C 2 MOS latches One transparent when = 1 One transparent when = 0 N4 N2 Multiplexer: two C 2 MOS inverters that propagate the output of the opaque latch Large clock transistors shared between the latches and the multiplexer N5 N2 N3 N4 [Gago et al., 1993] N3 N5 Lecture 12:. Clocking Markovic Issues / Slide 42 42

ual Edge Triggering: Pulsed Latch First stage: pulse generator Second stage: capturing latch Pulse Gen C C Pulse Gen C Lecture 12:. Clocking Markovic Issues / Slide 43 43 ET Pulsed Latch Circuit Example 2 2 1 1 1 1 1 1 2 2 2 1 2 1 (a) Pulse generator transparent to only when = 1 = 1, or when = 2 = 1 shortly after both edges of the clock ET PL consumes lot of energy for four clocked pass gates To improve speed, modified from original design (Strollo et al., 1999) which implemented n MOS pass gate and p MOS keeper Lecture 12:. Clocking Markovic Issues / Slide 44 44 (b)

ual Edge Triggered Flip Flop Based on SR latch S C R S CL C R Lecture 12:. Clocking Markovic Issues / Slide 45 45 ET Flip Flop Circuit Example Two pulse generators: X active at rising edge of the clock, Y active at falling edge of the S X and S Y alternately l precharge / evaluate At any moment, one of S X and S Y keeps the value of data sampled at the most recent clock edge The other S X or S Y is precharged high 1 1 st STAGE: X 2 nd STAGE 1 st STAGE: Y S X S Y 1 1 2 2 Pulses at S X and S Y have same width as clock Second stage is a simple NAN gate no need for a latch Lecture 12:. Clocking Markovic Issues / Slide 46 46

SET vs. ET: elay Comparison 6 5 SE E elay [F FO4] 4 3 2 1 0 MSL/LM C2MOS-LM PL SPGFF Latch MUX s have two equally critical paths, somewhat shorter than that of MSL PL is more complex, adding more capacitance to the critical path compared to SET PL SPGFF has short domino like critical path fastest Lecture 12:. Clocking Markovic Issues / Slide 47 47 SET vs. ET: Power Consumption LM s benefit from clever implementation of latch mux structure with clock transistors sharing 180 160 140 120 Non-clk PL adds extra highactivity capacitance compared to SET PL Power [uw] 100 80 60 40 SPGFF power consumption is in the 20 middle, mainly due to 0 alternate switching of nodes SX and SY M SL LM PL SE PL E C2M OS SE C2M OS E SPGFF (0.18 µm, 500MHz for SET, 250MHZ for ET, high load) Lecture 12:. Clocking Markovic Issues / Slide 48 48

SET vs. ET: EP Comparison fj/250mhz] EP [fj/500mhz], [f 60 50 40 30 20 10 0 Single Edge ouble Edge M SL/LM C2M OS PL SPGFF Latch MUX s have similar il or better EP than their SET peers PL exhibits worse delay and energy compared to SET PL, due to more complex design SPGFF is fastest with moderate energy consumption: lowest EP EP (SPGFF) < EP (LM) < EP (PL) Lecture 12:. Clocking Markovic Issues / Slide 49 49 Clock istribution for ET H tree clock distribution with L levels of clock buffers Example below: L = 3 s = Storage Element s/2 L 1 s PLL s/2 L 1 M/4 L 1 storage elements Lecture 12:. Clocking Markovic Issues / Slide 50 50

SET vs. ET: Clocking Power lk (SET) P (ET) / P Cl 1.2 1.0 08 0.8 0.6 0.4 SET better ET better 0.2 1 2 3 0.0 0 0.3 0.6 0.9 1.2 1.5 C CSE,SE / C wire L ET wins if the total clock load capacitance is < 2x the capacitance of a SET design Lecture 12:. Clocking Markovic Issues / Slide 51 51