Lecture 9: Clocking, Clock Skew, Clock Jitter, Clock Distribution and some FM

Similar documents
CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

EE115C Winter 2017 Digital Electronic Circuits. Lecture 19: Timing Analysis

Clock signal in digital circuit is responsible for synchronizing the transfer to the data between processing elements.

EE241 - Spring 2006 Advanced Digital Integrated Circuits

Lecture 21: Packaging, Power, & Clock

Timing Issues. Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolić. January 2003

GMU, ECE 680 Physical VLSI Design 1

Xarxes de distribució del senyal de. interferència electromagnètica, consum, soroll de conmutació.

The Linear-Feedback Shift Register

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II

Issues on Timing and Clocking

Lecture 25. Dealing with Interconnect and Timing. Digital Integrated Circuits Interconnect

Lecture 27: Latches. Final presentations May 8, 1-5pm, BWRC Final reports due May 7 Final exam, Monday, May :30pm, 241 Cory

Itanium TM Processor Clock Design

Clock Strategy. VLSI System Design NCKUEE-KJLEE

Lecture 9: Sequential Logic Circuits. Reading: CH 7

Designing Sequential Logic Circuits

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

Digital VLSI Design. Lecture 8: Clock Tree Synthesis

Digital Integrated Circuits A Design Perspective

EECS 427 Lecture 14: Timing Readings: EECS 427 F09 Lecture Reminders

Y. Tsiatouhas. VLSI Systems and Computer Architecture Lab

Digital System Clocking: High-Performance and Low-Power Aspects. Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M.

EE241 - Spring 2007 Advanced Digital Integrated Circuits. Announcements

CMPEN 411. Spring Lecture 18: Static Sequential Circuits

Chapter 13. Clocked Circuits SEQUENTIAL VS. COMBINATIONAL CMOS TG LATCHES, FLIP FLOPS. Baker Ch. 13 Clocked Circuits. Introduction to VLSI

EE371 - Advanced VLSI Circuit Design

Digital Integrated Circuits A Design Perspective

Digital Integrated Circuits A Design Perspective

GMU, ECE 680 Physical VLSI Design

9/18/2008 GMU, ECE 680 Physical VLSI Design

Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. November Digital Integrated Circuits 2nd Sequential Circuits

Interconnect s Role in Deep Submicron. Second class to first class

Luis Manuel Santana Gallego 31 Investigation and simulation of the clock skew in modern integrated circuits

Problem Set 9 Solutions

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

! Crosstalk. ! Repeaters in Wiring. ! Transmission Lines. " Where transmission lines arise? " Lossless Transmission Line.

MODULE 5 Chapter 7. Clocked Storage Elements

ΗΜΥ 307 ΨΗΦΙΑΚΑ ΟΛΟΚΛΗΡΩΜΕΝΑ ΚΥΚΛΩΜΑΤΑ Εαρινό Εξάμηνο 2018

L15: Custom and ASIC VLSI Integration

Clocking Issues: Distribution, Energy

Integrated Circuits & Systems

Topics. Dynamic CMOS Sequential Design Memory and Control. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

EECS 427 Lecture 15: Timing, Latches, and Registers Reading: Chapter 7. EECS 427 F09 Lecture Reminders

Implementation of Clock Network Based on Clock Mesh

Homework 2 due on Wednesday Quiz #2 on Wednesday Midterm project report due next Week (4 pages)

Skew-Tolerant Circuit Design

EE382 Processor Design Winter 1999 Chapter 2 Lectures Clocking and Pipelining

EE141- Spring 2007 Digital Integrated Circuits

EE141Microelettronica. CMOS Logic

EE 560 CHIP INPUT AND OUTPUT (I/0) CIRCUITS. Kenneth R. Laker, University of Pennsylvania

Next, we check the race condition to see if the circuit will work properly. Note that the minimum logic delay is a single sum.

Motivation for CDR: Deserializer (1)

Lecture 7: Logic design. Combinational logic circuits

Managing Physical Design Issues in ASIC Toolflows Complex Digital Systems Christopher Batten February 21, 2006

Very Large Scale Integration (VLSI)

Statistical Clock Skew Modeling With Data Delay Variations

Power in Digital CMOS Circuits. Fruits of Scaling SpecInt 2000

9/18/2008 GMU, ECE 680 Physical VLSI Design

Statistical Clock Skew Modeling with Data Delay Variations

EE371 - Advanced VLSI Circuit Design

24.2: Self-Biased, High-Bandwidth, Low-Jitter 1-to-4096 Multiplier Clock Generator PLL

Hold Time Illustrations

Timing Analysis with Clock Skew

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

Chapter 5 CMOS Logic Gate Design

CPE/EE 422/522. Chapter 1 - Review of Logic Design Fundamentals. Dr. Rhonda Kay Gaede UAH. 1.1 Combinational Logic

NTE74HC173 Integrated Circuit TTL High Speed CMOS, 4 Bit D Type Flip Flop with 3 State Outputs

EE241 - Spring 2003 Advanced Digital Integrated Circuits

Chapter 8. Low-Power VLSI Design Methodology

Chapter 7 Sequential Logic

L4: Sequential Building Blocks (Flip-flops, Latches and Registers)

CMPEN 411 VLSI Digital Circuits Spring Lecture 14: Designing for Low Power

L4: Sequential Building Blocks (Flip-flops, Latches and Registers)

Topics. CMOS Design Multi-input delay analysis. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

Reducing Delay Uncertainty in Deeply Scaled Integrated Circuits Using Interdependent Timing Constraints

Name: Answers. Mean: 83, Standard Deviation: 12 Q1 Q2 Q3 Q4 Q5 Q6 Total. ESE370 Fall 2015

LOGIC CIRCUITS. Basic Experiment and Design of Electronics. Ho Kyung Kim, Ph.D.

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

VLSI Design, Fall Logical Effort. Jacob Abraham

Jin-Fu Li Advanced Reliable Systems (ARES) Lab. Department of Electrical Engineering. Jungli, Taiwan

Interconnect (2) Buffering Techniques.Transmission Lines. Lecture Fall 2003

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

Digital Logic Design - Chapter 4

CMPEN 411 VLSI Digital Circuits Spring 2012

EE 447 VLSI Design. Lecture 5: Logical Effort

ALU, Latches and Flip-Flops

UNIVERSITY OF CALIFORNIA, BERKELEY College of Engineering Department of Electrical Engineering and Computer Sciences

EECS 151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: Nick Weaver & John Wawrzynek. Lecture 10 EE141

Memory, Latches, & Registers

CMOS logic gates. João Canas Ferreira. March University of Porto Faculty of Engineering

ENGG 1203 Tutorial _03 Laboratory 3 Build a ball counter. Lab 3. Lab 3 Gate Timing. Lab 3 Steps in designing a State Machine. Timing diagram of a DFF

CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 07: Pass Transistor Logic

Design of Control Modules for Use in a Globally Asynchronous, Locally Synchronous Design Methodology

ELEC Digital Logic Circuits Fall 2014 Sequential Circuits (Chapter 6) Finite State Machines (Ch. 7-10)

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

Lecture 8: Combinational Circuit Design

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

Transcription:

Lecture 9: Clocking, Clock Skew, Clock Jitter, Clock Distribution and some FM Mark McDermott Electrical and Computer Engineering The University of Texas at Austin 9/27/18 VLSI-1 Class Notes

Why Clocking? Synchronous systems use a clock to keep operations in sequence Distinguish this cycle from previous cycle or next cycle Determine speed at which machine operates Clock must be distributed to all the sequencing elements Flip-flops and latches Also distribute clock to other elements Domino circuits and memories Clocking overhead % increases as the frequency is increased. Page 2

Logic Transactions and Clock Dependence Output(s) Input(s) Internal logic & wire delay from input pin to register Internal logic bound by F2F timing Internal logic & wire delay from register to output pin Din Delay DFF COMB DFF Dout Delay CLK Clock Skew = CLK - CLK CLK Clk Tcycle Tuncertainty Clk 3

Clocking Overhead per Technology Generation Clocking overhead ( skew and jitter ) is growing as we move to DSM processes. Careful design of the clock generation and distribution circuits is now required for all high performance processor designs. Page 4

Clock Distribution On a small chip, the clock distribution network is just a wire And possibly an inverter for clk On practical chips, the RC delay of the wire resistance and gate load is very long Variations in this delay cause clock to get to different elements at different times This is called clock skew Most chips use repeaters to buffer the clock and equalize the delay Reduces but does not eliminate skew Page 5

Example Skew comes from differences in gate and wire delay With right buffer sizing, clk 1 and clk 2 could ideally arrive at the same time. But power supply noise changes buffer delays clk 2 and clk 3 will always see RC skew 3 mm gclk 3.1 mm 0.5 mm clk 1 1.3 pf clk 2 clk 3 0.4 pf 0.4 pf 9/27/18 VLSI-1 Class Notes Page 6

Sources of Clock Uncertainties 4 Power Supply Devices 2 3 Interconnect 6 Capacitive Load 1 Clock Generation 5 Temperature 7 Coupling to Adjacent Lines J. Rabaey, 2002 7

Clock Non-idealities Clock skew Spatial variation in temporally equivalent clock edges; deterministic + random, t SK Clock jitter Temporal variations in consecutive edges of the clock signal; modulation + random noise Cycle-to-cycle (short-term) t JS Long term t JL Variation of the pulse width Important for level sensitive clocking J. Rabaey, 2002 9/27/18 VLSI-1 Class Notes Page 8

Clock Skew and Jitter Clk_A t Skew Clk_B t Jitter Both skew and jitter affect the effective cycle time Only skew affects the race margin assuming jitter tracks in the same direction J. Rabaey, 2002 9/27/18 VLSI-1 Class Notes Page 9

Positive Clock Skew Clock and data flow in the same direction In D R1 Q Combinational logic D R2 Q clk t clk1 T delay t clk2 1 d > 0 T + d 3 2 4 T : d + t hold T + d ³ t c-q + t plogic + t su so T ³ t c-q + t plogic + t su - d t hold : t hold + d t cdlogic + t cdreg so t hold t cdlogic + t cdreg - d d > 0: Improves performance, but makes t hold harder to meet. If t hold is not met (race conditions), the circuit malfunctions independent of the clock period! J. Rabaey, 2002 9/27/18 10 VLSI-1 Class Notes

Negative Clock Skew Clock and data flow in opposite directions In D R1 Q Combinational logic D R2 Q t clk1 T delay t clk2 clk 1 T + d 3 2 d < 0 4 T : T + d ³ t c-q + t plogic + t su so T ³ t c-q + t plogic + t su - d t hold : t hold + d t cdlogic + t cdreg so t hold t cdlogic + t cdreg - d d < 0: Degrades performance (t setup ), but t hold is easier to meet (eliminating race conditions) J. Rabaey, 2002 11

Clock Jitter Jitter causes T to vary on a cycle-by-cycle basis In clk R1 t clk T Combinational logic -t jitter +t jitter T : T - 2t jitter ³ t c-q + t plogic + t su so T ³ t c-q + t plogic + t su + 2t jitter Jitter directly reduces the performance of a sequential circuit J. Rabaey, 2002 9/27/18 VLSI-1 Class Notes Page 12

Combined Impact of Skew and Jitter Constraints on the minimum clock period (d > 0) In D R1 Q Combinational logic D R2 Q t clk1 T t clk2 1 d > 0 T + d 6 12 -t jitter T ³ t c-q + t plogic + t su - d + 2t jitter t hold t cdlogic + t cdreg d 2t jitter d > 0 with jitter: Degrades performance, and makes t hold even harder to meet. (The acceptable skew is reduced by jitter.) J. Rabaey, 2002 9/27/18 VLSI-1 Class Notes Page 13

Clock Skew Solutions Reduce clock skew Careful clock distribution network design Plenty of metal wiring resources Supply Filtering Active de-skewing Analyze clock skew Only budget actual, not worst case skews Local vs. global skew budgets Tolerate clock skew Choose circuit structures insensitive to skew Take advantage of useful skew Page 14

Jitter reduction using local supply filtering 9/27/18 VLSI-1 Class Notes Page 15

Active Clock De-skewing Active clock de-skewing is accomplished by dynamically delaying the global clock signals. This can result in clock jitter. Careful analysis is required to validate the benefits. VLSI-1 Class Notes

Clock Distribution There are four basic types of clock distribution networks used in high performance processor designs: Tree: IBM and Freescale PowerPC, HP PA-RISC Grid: SPARC, Alpha Serpentine: Pentium-III Spine: Alpha, Pentium-4 Each technique has advantages and disadvantages: Wire Cap Delay Skew Grid High 15x Low sub100 ps Low-Med Trees Low 1x High 100 s ps Low Serpentine Very High 30x High 100 s ps Low Spine High 10x Low-sub100ps Med 9/27/18 VLSI-1 Class Notes Page 17

Clock Grids Use grid on two or more levels to carry clock Make wires wide to reduce RC delay Ensures low skew between nearby points But possibly large skew across die Page 18

Alpha Clock Grids Alpha 21064 Alpha 21164 Alpha 21264 PLL gclk grid gclk grid Alpha 21064 Alpha 21164 Alpha 21264 9/27/18 19 VLSI-1 Class Notes

H-Trees Fractal structure Gets clock arbitrarily close to any point Matched delay along all paths Delay variations cause skew A and B might see big skew A B 9/27/18 VLSI-1 Class Notes 20

Itanium 2 H-Tree Four levels of buffering: Primary driver Repeater Second-level clock buffer Gater Route around obstructions Repeaters Typical SLCB Locations Primary Buffer 9/27/18 VLSI-1 Class Notes Page 21

Hybrid Networks Use H-tree to distribute clock to many points Tie these points together with a grid Ex: IBM Power4, PowerPC H-tree drives 16-64 sector buffers Buffers drive total of 1024 points All points shorted together with grid Page 22

Dealing with Clock Skew and Jitter To minimize skew, balance clock paths using H-tree or matchedtree clock distribution structures. If possible, route data and clock in opposite directions; eliminates races at the cost of performance. The use of gated clocks to help with dynamic power consumption make jitter worse. Shield clock wires (route power lines VDD or GND next to clock lines) to minimize/eliminate coupling with neighboring signal nets. J. Rabaey, 2002 9/27/18 VLSI-1 Class Notes Page 23

Dealing with Clock Skew and Jitter Use dummy fills to reduce skew by reducing variations in interconnect capacitances due to interlayer dielectric thickness variations. Beware of temperature and supply rail variations and their effects on skew and jitter. Power supply noise fundamentally limits the performance of clock networks. J. Rabaey, 2002 9/27/18 VLSI-1 Class Notes Page 24

PLL Synchronization There are two techniques used to synchronize the clocks in a high performance system: Phase Locked Loop (PLL) or a Delay Locked Loop (DLL) The PLL is used to phase synchronize (and probably multiply) the system clock WRT to a reference clock (internal or external). PLL features: Frequency Multiplication to run processor at faster speed than memory interface. Skew reduction. The reference clock is aligned to the feedback clock. Possible stability issues with PLL due to 2nd or 3rd order loop behavior. AVCC C1 C2 Ref Fdbk PFD UP DN Charge Pump R VCNTL Bias Gen NBIAS VCO 1/N 9/27/18 Clock Network VLSI-1 Class Notes Page 25

DLL Synchronization The DLL is used to delay synchronize the system clock to a reference clock. Some high performance systems use a combination of both to generate the various clocks in a multiple clock domain design. SOC designs can have many multiple frequency clock domains. 9/27/18 VLSI-1 Class Notes 26

Chip-2-Chip PLL Synchronization Digital System Chip 1 Data Chip 2 Digital System f system = N x f crystal PLL Divider reference clock PLL Clock Buffer Crystal Oscillator f crystal, 200<Mhz J. Rabaey, 2002 27

High Performance Processor Clock Network 4*XX 4*XX/N 4*XX*F/N 28

Clock Distribution tree for a Datapath High peak currents to drive typical clock loads (¼ 1000 pf) 29

Matching Delays in Clock Distribution Balance delays of paths Match buffer and wire delays to minimize skew Issues Load of latch (driven by clock) is data-dependent (capacitance depends on source voltage) Process variations IR drops and temperature variations Need tools to support clock tree design Page 30

Backup 31

Variations of tree distribution networks Target: Metallization and Gate topology uniformity Tapered H-Tree 32

Clock Generation 33

Core Clock Distribution Adjustable delay buffer 34

SLCB (Second Level Clock Buffer) 35

First Level Route Geometry 36

Measured Skew Laser probed SLCB skew (non-loading, valid DC levels) 37

ISSCC 2007: Multi-Core Clocking Approach Asynchronous Communication and Independent Core Frequencies Source: An Integrated Quad-Core Opteron Processor ISSCC 2007 Source: A 4320MIPS Four-Processor Core SMP/AMP with Individually Managed Clock Frequency for Low Power Consumption, ISSCC 2007 38

Non-overlapping clock generator 39

HP PA-RISC Clock Distribution 40

IBM PowerPC Clock Distribution Source: A 1GHz single-issue 64b PowerPC processor, ISSCC 2000 41

Sun Microsystems UltraSparc III 42

DEC Alpha 21164 Clock Distribution 43

DEC Alpha 21264 Clock Distribution 44

Pentium 4 Processor Clock Network 2GHz triple-spine clock distribution 45

IBM POWER-4 in.18u SOI H-Tree and Grid Distribution Clock skew: <25ps ~70% power in clock and latches Dual core, shared L2 174M transistors 115W at 1.1GHz, 1.5V 9/27/18 VLSI-1 Class Notes Page 46

IBM POWER-4 3D Skew Visualization 9/27/18 VLSI-1 Class Notes 47

Clock Skew across CELL Processor Global Clock Distribution 9/27/18 VLSI-1 Class Notes 48