Timing Issues. Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolić. January 2003

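Synchronous Timing
[Figure: In feeds register R1, then combinational logic (C_in to C_out), then register R2 to Out; both registers are clocked by CLK.]

Timing Definitions

Latch Parameters
[Figure: latch with inputs D and Clk and output Q; the waveforms mark the clock period T, the minimum pulse width PW_m, t_su, t_hold, t_c-q and t_d-q.]
Delays can be different for rising and falling data transitions.

Register Parameters
[Figure: register with inputs D and Clk and output Q; the waveforms mark the clock period T, t_su, t_hold and t_c-q.]
Delays can be different for rising and falling data transitions.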

Clock Uncertainties
Sources of clock uncertainty: (1) clock generation, (2) devices, (3) interconnect, (4) power supply, (5) temperature, (6) capacitive load, (7) coupling to adjacent lines.

Clock Nonidealities
Clock skew: spatial variation in temporally equivalent clock edges; deterministic + random, t_SK.
Clock jitter: temporal variations in consecutive edges of the clock signal; modulation + random noise. Cycle-to-cycle (short-term) t_JS; long-term t_JL.
Variation of the pulse width: important for level-sensitive clocking.

Clock Skew and Jitter
[Figure: clock waveforms marking the skew t_SK and the jitter t_JS.]
Both skew and jitter affect the effective cycle time.
Only skew affects the race margin.

Clock Skew
[Figure: histogram of the number of registers versus clock delay. The insertion delay is the nominal delay; the earliest occurrence of the Clk edge is at nominal − δ/2 and the latest at nominal + δ/2, so the maximum Clk skew is δ.]

Positive and Negative Skew
[Figure: pipeline In, R1, combinational logic, R2, combinational logic, R3, with clock arrival times t_CLK1, t_CLK2, t_CLK3 separated by delay elements. (a) Positive skew. (b) Negative skew, with CLK entering from the opposite end.]

Positive Skew
[Figure: CLK1 (edges 1 and 3) and CLK2 (edges 2 and 4) with period T_CLK; CLK2 is delayed by δ; the intervals T_CLK + δ and δ + t_h are marked.]
Launching edge arrives before the receiving edge.

Negative Skew
[Figure: CLK1 (edges 1 and 3) and CLK2 (edges 2 and 4) with period T_CLK; the interval T_CLK + δ and the skew δ are marked.]
Receiving edge arrives before the launching edge.

Timing Constraints
[Figure: In, R1, combinational logic, R2, with clock arrival times t_CLK1 and t_CLK2. The registers have t_c-q, t_c-q,cd, t_su and t_hold; the logic has t_logic and t_logic,cd.]
Minimum cycle time: $T - \delta = t_{c-q} + t_{su} + t_{logic}$
Worst case is when the receiving edge arrives early (positive δ).

Timing Constraints
[Figure: same circuit as on the previous slide.]
Hold time constraint: $t_{(c-q,cd)} + t_{(logic,cd)} > t_{hold} + \delta$
Worst case is when the receiving edge arrives late.
Race between data and clock.
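The two constraints on these slides can be turned into a quick numeric check. The sketch below is only an illustration: the function names and the delay values (in picoseconds) are hypothetical, and δ is taken, as on the slides, as the worst-case skew magnitude.

```python
def min_cycle_time(t_cq, t_logic, t_su, delta):
    """Smallest T that satisfies T - delta = t_cq + t_su + t_logic."""
    return t_cq + t_su + t_logic + delta


def max_hold_skew(t_cq_cd, t_logic_cd, t_hold):
    """Skew bound from the hold constraint.

    The constraint t_cq_cd + t_logic_cd > t_hold + delta holds only
    while delta stays strictly below the returned value.
    """
    return t_cq_cd + t_logic_cd - t_hold


# Hypothetical delays in picoseconds (not taken from the slides).
T_MIN = min_cycle_time(t_cq=100, t_logic=700, t_su=50, delta=50)
SKEW_LIMIT = max_hold_skew(t_cq_cd=60, t_logic_cd=30, t_hold=40)

print("minimum cycle time [ps]:", T_MIN)
print("hold constraint met for skew below [ps]:", SKEW_LIMIT)
```

With these made-up numbers a 50 ps skew sits exactly on the hold limit, so the strict inequality fails and the short path would need extra delay.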

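Impact of Jitter
[Figure: CLK waveform with period T_CLK whose edges can move between −t_jitter and +t_jitter; registers (t_c-q, t_c-q,cd, t_su, t_hold, t_jitter) clocked by CLK around combinational logic (t_logic, t_logic,cd).]

Longest Logic Path in Edge-Triggered Systems
[Figure: Clk waveform with period T; T_Clk-Q, T_LM and T_SU are laid end to end, with T_JI + δ margins, between the latest point of launching and the earliest arrival of the next cycle.]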

Clock Constraints in Edge-Triggered Systems
If the launching edge is late and the receiving edge is early, the data will not be too late if:
$T_{c-q} + T_{LM} + T_{SU} < T - T_{JI,1} - T_{JI,2} - \delta$
Minimum cycle time is determined by the maximum delays through the logic:
$T_{c-q} + T_{LM} + T_{SU} + \delta + 2T_{JI} < T$
Skew can be either positive or negative.

Shortest Path
[Figure: Clk waveform showing the earliest point of launching, followed by T_Clk-Q and T_Lm, and the nominal clock edge followed by T_H.]
Data must not arrive before this time.

Clock Constraints in Edge-Triggered Systems
If the launching edge is early and the receiving edge is late, the data does not race through as long as:
$T_{c-q} + T_{Lm} - T_{JI,1} > T_H + T_{JI,2} + \delta$
Minimum logic delay:
$T_{c-q} + T_{Lm} > T_H + 2T_{JI} + \delta$
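As a rough illustration, the longest-path and shortest-path conditions with skew and jitter can be written as slack computations. This sketch reads the shortest-path condition as a hold requirement (data must not arrive before the hold window closes, as the Shortest Path slide states). The function names and all numbers (in picoseconds) are hypothetical.

```python
def setup_slack(T, t_cq, t_lm_max, t_su, delta, t_ji):
    """Margin on the longest path: T - (T_c-q + T_LM + T_SU + delta + 2*T_JI).

    A positive result means the cycle time is long enough.
    """
    return T - (t_cq + t_lm_max + t_su + delta + 2 * t_ji)


def hold_slack(t_cq, t_lm_min, t_h, delta, t_ji):
    """Margin on the shortest path: (T_c-q + T_Lm) - (T_H + 2*T_JI + delta).

    A positive result means there is no race.
    """
    return (t_cq + t_lm_min) - (t_h + 2 * t_ji + delta)


# Hypothetical values in picoseconds (not taken from the slides).
SETUP = setup_slack(T=1000, t_cq=100, t_lm_max=700, t_su=50, delta=50, t_ji=25)
HOLD = hold_slack(t_cq=100, t_lm_min=60, t_h=40, delta=50, t_ji=25)

print("setup slack [ps]:", SETUP)
print("hold slack [ps]:", HOLD)
```

Note that jitter enters both checks twice, once for the launching edge and once for the receiving edge, which is why the factor 2·T_JI appears.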

How to counter Clock Skew?
[Figure: registers (REG) and logic between In and Out, clocked by φ; the clock distribution is routed so that some paths see negative skew and others positive skew.]
Data and clock routing.

Flip-Flop-Based Timing
[Figure: flip-flop followed by logic, clocked by φ; the timing diagram over φ = 1 and φ = 0 marks the flip-flop delay (T_Clk-Q), the logic delay, T_SU and the skew.]
Representation after M. Horowitz, VLSI Circuits 1996.

Flip-Flops and Dynamic Logic
[Figure: timing diagrams over φ = 0 and φ = 1 marking T_Clk-Q, the logic delay and T_SU, with the precharge and evaluate phases of the dynamic logic.]
Flip-flops are used only with static logic.

Latch timing
[Figure: latch with D, Q and Clk, marking the delays t_D-Q and t_Clk-Q.]
When data arrives at a transparent latch: the latch is a "soft" barrier.
When data arrives at a closed latch: data has to be "re-launched".

Single-Phase Clock with Latches
[Figure: latch and logic clocked by φ; the Clk waveform marks the period P, the pulse width PW, and the skews T_skl and T_skt.]

Latch-Based Design
L1 latch is transparent when φ = 0.
L2 latch is transparent when φ = 1.
[Figure: L1 latch, logic, L2 latch, logic, clocked by φ.]

Slack-borrowing
[Figure: In, L1 latch, CLB_A (t_pd,A), L2 latch, CLB_B (t_pd,B), L1 latch, with nodes a to e; the latches are clocked by CLK1, CLK2 and CLK1. The timing diagram over T_CLK (edges 1 to 4) shows a, b, c, d and e becoming valid after t_pd,A, t_DQ, t_pd,B and t_DQ, with slack passed to the next stage.]

Latch-Based Timing
[Figure: static logic between L1 and L2 latches; the timing diagram over φ = 1 and φ = 0 marks the skew, the long path and the short path.]
Can tolerate skew!

Clock Distribution
[Figure: H-tree driven from CLK.]
Clock is distributed in a tree-like fashion.

More realistic H-tree
[Figure: from [Restle98].]

The Grid System
[Figure: clock grid driven by GCLK drivers on all four sides.]
No RC-matching.
Large power.

Example: DEC Alpha 21164
Clock frequency: 300 MHz; 9.3 million transistors.
Total clock load: 3.75 nF.
Power in clock distribution network: 20 W (out of 50).
Uses two-level clock distribution: a single 6-stage driver at the center of the chip; secondary buffers drive the left and right side clock grid in Metal3 and Metal4.
Total driver size: 58 cm!

21164 Clocking
[Figure: clock waveform with t_rise = 0.35 ns, t_cycle = 3.3 ns and t_skew = 150 ps; location of the clock driver (pre-driver and final drivers) on the die.]
2-phase single-wire clock, distributed globally.
2 distributed driver channels: reduced RC delay/skew, improved thermal distribution.
3.75 nF clock load.
58 cm final driver width.
Local inverters for latching.
Conditional clocks in caches to reduce power.
More complex race checking.
Device variation.

Clock Drivers
[Figure.]

Clock Skew in Alpha Processor
[Figure.]

EV6 (Alpha 21264) Clocking
600 MHz, 0.35 micron CMOS.
[Figure: global clock waveform with t_cycle = 1.67 ns, t_rise = 0.35 ns and t_skew = 50 ps; PLL.]
2-phase, with multiple conditional buffered clocks.
2.8 nF clock load.
40 cm final driver width.
Local clocks can be gated off to save power.
Reduced load/skew.
Reduced thermal issues.
Multiple clocks complicate race checking.

21264 Clocking
[Figure.]

EV6 Clock Results
[Figure: two die maps. GCLK skew (at Vdd/2 crossings), scale 5 to 50 ps. GCLK rise times (20% to 80%, extrapolated to 0% to 100%), scale 300 to 345 ps.]

EV7 Clock Hierarchy
Active Skew Management and Multiple Clock Domains
[Figure: clock domains NCLK (Mem Ctrl), L2L_CLK (L2 Cache), GCLK (CPU Core) and L2R_CLK (L2 Cache), with a PLL, three DLLs and SYSCLK.]
+ widely dispersed drivers
+ DLLs compensate static and low-frequency variation
+ divides design and verification effort
+ tailored clocks
- DLL design and verification is added work

Self-timed and Asynchronous Design
Functions of clock in synchronous design: 1) acts as completion signal; 2) ensures the correct ordering of events.
Truly asynchronous design: 1) completion is ensured by careful timing analysis; 2) ordering of events is implicit in logic.
Self-timed design: 1) completion ensured by completion signal; 2) ordering imposed by handshaking protocol.

Synchronous Pipelined Datapath
[Figure: In, R1, Logic Block #1, R2, Logic Block #2, R3, Logic Block #3, R4, all registers clocked by CLK; delays t_pd,reg, t_pd1, t_pd2 and t_pd3.]

Self-Timed Pipelined Datapath
[Figure: In, R1, F1, R2, F2, R3, F3, Out, with delays t_pF1, t_pF2 and t_pF3; each stage has a handshake block (HS) exchanging Req and Ack with its neighbors and Start and Done with its logic block.]

Completion Signal Generation
[Figure: logic network from In to Out, with a delay module from Start to Done.]
Using delay element (e.g. in memories).

Completion Signal Generation
Using redundant signal encoding.
[Figure.]

Completion Signal in DCVSL
[Figure: DCVSL gate with complementary pull-down networks (PDN) driven by In1, In2 and their complements, precharged by Start; Done is derived from the outputs B0 and B1.]

Self-Timed Adder
[Figure: (a) differential carry generation, with carry chains controlled by Start and by the P_i, G_i and K_i signals, producing C_0 to C_4 and their complements; (b) completion signal Done generated from the carries and Start.]

Completion Signal Using Current Sensing
[Figure: input register and static CMOS logic whose GND_sense node feeds a current sensor (output A); a minimum delay generator triggered by Start (output B); A and B are combined to produce Done. The timing diagram marks t_delay, t_overlap, t_MDG, t_pd-NOR and the point where the output is valid.]

Hand-Shaking Protocol
[Figure: (a) sender-receiver configuration with Data, Req and Ack; (b) timing diagram of the two-phase handshake over cycle 1 and cycle 2, with events Data (1), Req (2) and Ack (3) marked as sender's or receiver's actions.]

Event Logic: The Muller C-Element
[Figure: (a) schematic with inputs A and B and output F; (b) truth table.]
A B F_n+1
0 0 0
0 1 F_n
1 0 F_n
1 1 1
[Figure: implementations. (a) Logic, using a latch with S, R and Q; (b) majority function; (c) dynamic.]
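The truth table of the Muller C-element (output follows the inputs when they agree, otherwise holds its previous value) can be modeled in a few lines. This is a behavioral sketch only; the function names are hypothetical and no circuit-level timing is represented.

```python
def c_element(a, b, f_prev):
    """Next output of a Muller C-element.

    When both inputs agree, the output takes their value;
    when they differ, the output keeps its previous value.
    """
    return a if a == b else f_prev


def run_c_element(inputs, f_init=0):
    """Apply a sequence of (A, B) pairs and return the output after each."""
    outputs = []
    f = f_init
    for a, b in inputs:
        f = c_element(a, b, f)
        outputs.append(f)
    return outputs


# Output rises only after both inputs are 1 and falls only after both are 0.
TRACE = run_c_element([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
print(TRACE)
```

The hold behavior is what makes the element useful for joining events: a transition on F signals that transitions have occurred on both A and B.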

2-Phase Handshake Protocol
[Figure: sender logic (Data ready) and receiver logic (Data accepted) exchanging Data, with handshake logic built around a C-element generating Req and Ack.]
Advantage: FAST, minimal number of signaling events (important for global interconnect).
Disadvantage: edge-sensitive, has state.

Example: Self-timed FIFO
[Figure: In, R1, R2, R3, Out; each register has En and Done; a chain of three C-elements passes Req_i to Req_o and Ack_o back to Ack_i.]
All 1s or 0s -> pipeline empty.
Alternating 1s and 0s -> pipeline full.

2-Phase Protocol
[Figure.]

Example
[Figures on four slides, the first from [Horowitz].]

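4-Phase Handshake Protocol
[Figure: timing diagram over cycle 1 and cycle 2 with events Data (1), Req rising (2), Ack rising (3), Req falling (4) and Ack falling (5), marked as sender's or receiver's actions.]
Also known as RTZ.
Slower, but unambiguous.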
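The event order of the 4-phase (return-to-zero) protocol can be spelled out as a short trace. The sketch below follows the numbering on the slide (data, then Req up, Ack up, Req down, Ack down); the function name and tuple layout are hypothetical, and the assignment of events to sender and receiver assumes the usual arrangement in which the sender drives Req and the receiver drives Ack.

```python
def four_phase_transfer(word):
    """Return the ordered steps of one 4-phase handshake.

    Each step is (event_number, actor, action, req, ack), where req and
    ack are the signal levels after the step.
    """
    req, ack = 0, 0
    steps = [(1, "sender", f"drive data {word!r}", req, ack)]
    req = 1
    steps.append((2, "sender", "raise Req", req, ack))
    ack = 1
    steps.append((3, "receiver", "raise Ack", req, ack))
    req = 0
    steps.append((4, "sender", "lower Req", req, ack))
    ack = 0
    steps.append((5, "receiver", "lower Ack", req, ack))
    return steps


STEPS = four_phase_transfer(0xA5)
for step in STEPS:
    print(step)
```

Both wires end at 0 after every transfer, so the signal level alone identifies the protocol state. The price is four signaling transitions per word, compared with two for the 2-phase protocol.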

4-Phase Handshake Protocol
Implementation using Muller-C elements.
[Figure: sender logic (Data ready) and receiver logic (Data accepted) exchanging Data, with handshake logic built from C-elements generating Req and Ack.]

Self-Resetting Logic
[Figure: three precharged logic blocks (L1, L2, L3), each with its own completion detection; a gate-level example with inputs A, B and C, internal node int, output out and post-charge logic.]

Clock-Delayed Domino
[Figure: domino gate with a pulldown network, input D1 and output Q1 (also D2), clocked by CLK1; CLK2 is passed to the next stage.]

Asynchronous-Synchronous Interface
[Figure: asynchronous system (f_in) connected through synchronization to a synchronous system (f_CLK).]

Synchronizers and Arbiters
Arbiter: circuit to decide which of 2 events occurred first.
Synchronizer: arbiter with clock φ as one of the inputs.
Problem: the circuit HAS to make a decision in limited time; which decision is not important.
Caveat: it is impossible to ensure correct operation. But we can decrease the error probability at the expense of delay.

A Simple Synchronizer
[Figure: latch with input D, internal node int, inverters I_1 and I_2, output Q, clocked by CLK.]
Data sampled on rising edge of the clock.
Latch will eventually resolve the signal value, but... this might take infinite time!

Synchronizer: Output Trajectories
[Figure: V_out (0.0 to 2.0 V) versus time (0 to 300 ps).]
Single-pole model for a flip-flop.

Mean Time to Failure

Example
T_φ = 10 nsec = T
T_signal = 50 nsec
t_r = 1 nsec
τ = 310 psec
V_IH − V_IL = 1 V (V_DD = 5 V)
N(T) = 3.9 × 10^-9 errors/sec
MTF(T) = 2.6 × 10^8 sec = 8.3 years
MTF(0) = 2.5 μsec
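The numbers in this example can be reproduced with a short calculation. The formula used here is an assumption, since the Mean Time to Failure slide survives only as a title: it is the standard expression N(T) = P_init · e^(−T/τ) / T_φ with P_init = ((V_IH − V_IL) / V_swing) · (t_r / T_signal), taking V_swing = V_DD. All function and variable names are hypothetical.

```python
import math


def sync_error_rate(t_wait, tau, t_r, t_signal, t_clk, v_gap, v_swing):
    """Synchronization errors per second after waiting t_wait seconds.

    Assumed model: the probability of sampling inside the undefined
    region (v_gap / v_swing) * (t_r / t_signal) decays as exp(-t_wait / tau),
    and one sample is taken per clock period t_clk.
    """
    p_init = (v_gap / v_swing) * (t_r / t_signal)
    return p_init * math.exp(-t_wait / tau) / t_clk


PARAMS = dict(tau=310e-12, t_r=1e-9, t_signal=50e-9, t_clk=10e-9,
              v_gap=1.0, v_swing=5.0)

N_T = sync_error_rate(t_wait=10e-9, **PARAMS)   # wait one clock period
N_0 = sync_error_rate(t_wait=0.0, **PARAMS)     # no waiting time

MTF_T = 1.0 / N_T
MTF_0 = 1.0 / N_0
SECONDS_PER_YEAR = 365.25 * 24 * 3600

print(f"N(T)   = {N_T:.2e} errors/sec")
print(f"MTF(T) = {MTF_T:.2e} sec = {MTF_T / SECONDS_PER_YEAR:.1f} years")
print(f"MTF(0) = {MTF_0 * 1e6:.2f} usec")
```

Under this model the result is dominated by the exponential term: waiting one 10 nsec period is about 32 time constants of 310 psec, which is what moves the mean time to failure from microseconds to years.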

Influence of Noise
[Figure: probability density p(v) between V_IL and V_IH, uniform around V_M; the initial distribution is reduced logarithmically over time T and is still uniform.]
Low-amplitude noise does not influence synchronization behavior.

Typical Synchronizers
[Figure: two synchronizer circuits, one using a 2-phase clocking circuit (φ1, φ2) and one using a delay line.]

Cascaded Synchronizers Reduce the Failure Rate (Increase MTF)
[Figure: In, Sync, O_1, Sync, O_2, Sync, Out, all clocked by φ.]

Arbiters
[Figure: (a) schematic symbol with inputs Req1 and Req2 and outputs Ack1 and Ack2; (b) implementation with internal nodes A and B; (c) timing diagram of Req1, Req2, A, B and Ack1 versus t, marking V_T gap and the metastable interval.]

PLL-Based Synchronization
[Figure: Chip 1 and Chip 2, each with a digital system, PLL and clock buffer, exchanging Data; a crystal oscillator provides the reference clock through a divider.]
f_system = N × f_crystal
f_crystal < 200 MHz

PLL Block Diagram
[Figure: reference clock, phase detector (Up, Down), charge pump, loop filter (v_cont), VCO, local clock; the local clock is fed back through a divide-by-N block; system clock.]

Phase Detector
[Figure: (a) ref and local clock waveforms with the output before filtering; (b) output after low-pass filtering; (c) transfer characteristic: output (up to V_DD) versus phase error from −180 to 180 degrees.]

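Phase-Frequency Detector
[Figure: (a) schematic: two resettable D flip-flops, one clocked by A producing UP and one clocked by B producing DN, with a common Rst; (b) state transition diagram with states (UP = 0, DN = 1), (UP = 0, DN = 0) and (UP = 1, DN = 0), and transitions on A and B; (c) timing waveforms of A, B, UP and DN.]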
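The three-state behavior in the state transition diagram can be sketched as a small state machine. This is an idealized model with hypothetical names: it treats the reset as instantaneous, so the brief interval in which both UP and DN are high in a real circuit is not represented.

```python
def pfd_step(state, edge):
    """Advance the phase-frequency detector on a rising edge of A or B.

    state is (UP, DN). An edge on A moves toward UP = 1; an edge on B
    moves toward DN = 1; an edge of the opposite kind returns to (0, 0).
    """
    up, dn = state
    if edge == "A":
        return (0, 0) if dn else (1, 0)
    if edge == "B":
        return (0, 0) if up else (0, 1)
    raise ValueError(f"unknown edge: {edge!r}")


def pfd_trace(edges, state=(0, 0)):
    """Return the state after each edge in the sequence."""
    trace = []
    for edge in edges:
        state = pfd_step(state, edge)
        trace.append(state)
    return trace


# A leads B: UP pulses between each A edge and the following B edge.
LEAD = pfd_trace(["A", "B", "A", "B"])
# A is faster than B: extra A edges keep UP asserted.
FAST = pfd_trace(["A", "A", "B", "A"])
print(LEAD)
print(FAST)
```

Because extra edges on the faster input hold the detector in the same state, the average of UP − DN keeps its sign under a frequency difference, which is what lets the loop acquire frequency as well as phase.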

PFD Response to Frequency
[Figure: waveforms of A, B, UP and DN.]

PFD Phase Transfer Characteristic
[Figure: average (UP − DN), up to V_DD, versus phase error from −2π to 2π.]

Charge Pump
[Figure: UP and DN switches between V_DD and ground driving the VCO control input.]

PLL Simulation
[Figure: control voltage (0.0 to 1.0 V) versus time (0 to 5 μs), with the ref, div and vco waveforms shown at two points in time.]

Clock Generation using DLLs
[Figure: delay-locked loop (delay line based): f_REF, phase detector (U, D), charge pump, filter, delay line DL, f_O. Phase-locked loop (VCO based): f_REF, phase detector PD (U, D), charge pump CP, filter, VCO, f_O, with a divide-by-N block in the feedback path.]

Delay Locked Loop
[Figure: (a) block diagram: F_REF, phase detect (ΔPH; U, D), charge pump, capacitor C, V_CTRL, VCDL, F_O; (b) waveforms of REF, OUT, UP and DN; (c) ΔPH, V_CTRL and delay.]

DLL-Based Clock Distribution
[Figure: GLOBAL CLK feeding several digital circuits, each through its own VCDL controlled by a phase detector and CP/LF.]