EECS 141 S02
Timing
Project 2: A Random Number Generator

The Linear-Feedback Shift Register
[Figure: shift register built from three registers R with state bits S0, S1, S2]

S0 S1 S2
1  0  0
0  1  0
1  0  1
1  1  0
1  1  1
0  1  1
0  0  1
1  0  0
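The S0 S1 S2 state sequence on this slide (1 0 0, 0 1 0, 1 0 1, and so on, returning to 1 0 0 after seven clocks) can be reproduced with a few lines of Python. This is a minimal sketch: the update rule S0 <- S1 XOR S2, S1 <- S0, S2 <- S1 is inferred from the listed states rather than taken from a schematic, and the function names are hypothetical.

```python
def lfsr3_step(state):
    """Advance the 3-bit LFSR by one clock.

    Assumed update rule (inferred from the state table, not from a schematic):
    S0 <- S1 XOR S2, S1 <- S0, S2 <- S1.
    """
    s0, s1, s2 = state
    return (s1 ^ s2, s0, s1)


def lfsr3_sequence(seed, n_clocks):
    """Return the seed followed by the states after each of n_clocks clocks."""
    states = [tuple(seed)]
    for _ in range(n_clocks):
        states.append(lfsr3_step(states[-1]))
    return states


if __name__ == "__main__":
    # Print one state per line in S0 S1 S2 order, starting from the seed 1 0 0.
    for s in lfsr3_sequence((1, 0, 0), 7):
        print(*s)
```

Starting from the seed 1 0 0, the register visits all seven non-zero states before repeating, which is the longest sequence a 3-bit XOR-feedback register can produce.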
Project Goal
Design a 4-bit LFSR
SPEED, SPEED, SPEED!
Feel free to use:
  » Logic style
  » Register style
  » Clocking style
No layout, only schematics and simulation

4-bit LFSR
[Figure: shift register built from four registers R with state bits A0, A1, A2, A3]
Pseudo-random number generator, plus additional circuitry for asynchronous reset to a seed
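Before committing a 4-bit LFSR to schematics, it can help to check how long its sequence is for a given feedback choice. The sketch below is only an illustration: the slides do not specify the feedback taps, so `taps` is a hypothetical parameter, and the shift direction (feedback into A0, then A0 to A1, A1 to A2, A2 to A3) is an assumption.

```python
def lfsr_period(taps, seed):
    """Count the clocks until an XOR-feedback shift register returns to its seed.

    taps: indices of the bits XORed together to form the new bit 0
          (hypothetical parameter; the feedback choice is left to the designer).
    seed: tuple of bits (A0, A1, ...), assumed shift direction A0 -> A1 -> ...
    Returns None if the seed does not recur within 2**n clocks.
    """
    seed = tuple(seed)
    if not any(seed):
        raise ValueError("an all-zero seed locks up an XOR-feedback LFSR")
    state = seed
    for period in range(1, 2 ** len(seed) + 1):
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        # Shift by one position and insert the feedback bit at A0.
        state = (feedback,) + state[:-1]
        if state == seed:
            return period
    return None


if __name__ == "__main__":
    for taps in [(2, 3), (0, 3), (1, 3)]:
        print(taps, lfsr_period(taps, (1, 0, 0, 0)))
```

Under these assumptions, tapping A2 and A3 (or A0 and A3) gives the full 15-state sequence, while tapping A1 and A3 repeats after only 6 clocks. The all-zero state is excluded, which is one reason a reset to a non-zero seed is needed.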
Constraints
TECHNOLOGY: 0.25 µm CMOS technology
SUPPLY: 2.5 V
PERFORMANCE METRIC: $V_{OH}$, $V_{OL}$: The output signals should settle to within 10% of their final value before the next clock event can be introduced!!!
NOISE MARGINS: The noise margins should be at least 10% of the voltage swing.
LOAD CAPACITANCE: Each output bit of the generator should have a 20 fF load.
CLOCKS: You are given a primary clock signal with a rise and fall time of 50 psec and a duty cycle of 50%.
NO RACES!!!!!!!!!!!!!!!

Reporting
No written report
Submit a short summary of your results on May 6
Poster presentation on the afternoon of Tu May 7 (9 PowerPoint slides)
  » Tell your story in 5 minutes!
  » And convince us that your design is the best
And submit your poster electronically
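Latch Parameters
[Figure: latch with data input D and output Q; timing waveforms annotated with $PW_m$, $T_H$, $T_{SU}$, $T_{c-q}$ and $T_{D-Q}$]
Delays can be different for rising and falling data transitions

Flip-Flop Parameters
[Figure: flip-flop with data input D and output Q; timing waveforms annotated with $PW_m$, $T_H$, $T_{SU}$ and $T_{c-q}$]
Delays can be different for rising and falling data transitions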
The Clock Distribution Challenge
Global operations: low bandwidth, high latency and high power
Local, parallel operations: high bandwidth, low latency and low power
[Figure labels: 20 clocks; 90,000 tracks]
Source: Bill Dally, Stanford

Example Clock System
[Figure] Courtesy of IEEE Press, New York. 2000
Clock Nonidealities
Clock skew
  » Spatial variation in temporally equivalent clock edges; deterministic + random, $t_{SK}$
Clock jitter
  » Temporal variations in consecutive edges of the clock signal; modulation + random noise
  » Cycle-to-cycle (short-term) $t_{JS}$
  » Long term $t_{JL}$
Variation of the pulse width
  » for level sensitive clocking

Clock Skew and Jitter
[Figure: clock waveforms annotated with $t_{SK}$ and $t_{JS}$]
Both skew and jitter affect the effective cycle time
Only skew affects the race margin
Clock Skew
[Figure: distribution of clock edge arrival times. Labels: # of registers; insertion delay; earliest occurrence of edge at nominal $- T_{sk}/2$; latest occurrence of edge at nominal $+ T_{sk}/2$; max skew delay $T_{sk}$]

Sources of skew and jitter
[Figure: clock path with seven numbered callouts]
Clock generation, devices, interconnect, power supply, temperature, capacitive load, coupling to adjacent lines
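The quantities in the clock skew figure can be extracted from a list of simulated clock arrival times, one per register. The sketch below assumes that the nominal insertion delay is the midpoint between the earliest and latest edges, which matches the nominal ± $T_{sk}/2$ labels. The function name and the sample numbers are hypothetical.

```python
def skew_stats(arrival_times_ps):
    """Summarize clock arrival times (in ps) measured at each register.

    Assumption: the nominal insertion delay is taken as the midpoint between
    the earliest and the latest edge, so that the edges fall within
    nominal - T_sk/2 and nominal + T_sk/2.
    """
    earliest = min(arrival_times_ps)
    latest = max(arrival_times_ps)
    t_sk = latest - earliest           # maximum skew between any two registers
    nominal = earliest + t_sk / 2      # assumed nominal insertion delay
    return {"earliest": earliest, "latest": latest, "t_sk": t_sk, "nominal": nominal}


if __name__ == "__main__":
    # Made-up arrival times for four registers, for illustration only.
    print(skew_stats([980, 1000, 1020, 1010]))
```

Only the difference between arrival times enters the skew. Adding the same delay to every register shifts the nominal value but leaves $T_{sk}$ unchanged.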
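Positive and Negative Skew
[Figure: (a) Positive skew, (b) Negative skew. Each panel shows data passing through registers R and combinational logic CL, clocked by φ]

Constraints on Skew
[Figure (a): Race between clock and data. Registers R1 and R2; labels: $\delta$, $t_{\phi}$, $t_{\phi} + \delta$, $t_{r,min} + t_i + t_{l,min}$, late]
[Figure (b): Data should be stable before clock pulse is applied. Registers R1 and R2; labels: $\delta$, $t_{\phi}$, $t_{\phi} + T$, $t_{\phi} + T + \delta$, $t_{r,max} + t_i + t_{l,max}$, early]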
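Clock Constraints in Edge-Triggered Logic
$\delta \le t_{r,min} + t_i + t_{l,min}$
Maximum Clock Skew Determined by Minimum Delay between Latches
$T \ge t_{r,max} + t_i + t_{l,max} - \delta$
Minimum Clock Period Determined by Maximum Delay between Latches

Impact of Jitter
[Figure: clock CLK with period $T_{CLK}$ and numbered edges 1 to 6, each edge uncertain between $-t_{jitter}$ and $+t_{jitter}$; registers (REGS) with $t_{c-q}$, $t_{c-q,cd}$, $t_{su}$, $t_{hold}$ and $t_{jitter}$; combinational logic between In and the registers with $t_{logic}$ and $t_{logic,cd}$]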
Longest Logic Path in Edge-Triggered Systems
[Figure: timing diagram from the latest point of launching to the earliest arrival of the next cycle; labels: $T_{c-q}$, $T_{LM}$, $T_{SU}$, $T$, $T_{JI}$, $\delta$]
Unger and Tan, Trans. on Comp., 10/86

Clock Constraints in Edge-Triggered Systems
If launching edge is late and receiving edge is early, the data will not be too late if:
$T_{c-q} + T_{LM} + T_{SU} < T - T_{JI,1} - T_{JI,2} + \delta$
Minimum cycle time is determined by the maximum delays through the logic:
$T_{c-q} + T_{LM} + T_{SU} - \delta + 2T_{JI} < T$
Jitter always works negatively
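The longest-path constraint translates directly into a lower bound on the clock period. The sketch below evaluates $T_{c-q} + T_{LM} + T_{SU} - \delta + 2T_{JI}$. The function name and the numbers in the example are hypothetical, and equal jitter on the launching and receiving edges is assumed.

```python
def min_clock_period(t_cq, t_logic_max, t_su, skew, jitter):
    """Lower bound on the clock period of an edge-triggered path.

    Evaluates T_c-q + T_LM + T_SU - delta + 2*T_JI, where `skew` is delta
    (positive when the receiving clock edge is later than the launching edge)
    and `jitter` is T_JI, assumed equal on both edges. All times share one unit.
    """
    return t_cq + t_logic_max + t_su - skew + 2 * jitter


if __name__ == "__main__":
    # Illustrative values in ps, not taken from any real design.
    print(min_clock_period(t_cq=100, t_logic_max=800, t_su=50, skew=30, jitter=20))
```

Positive skew relaxes the bound while jitter tightens it, so the same path with zero skew and zero jitter would need a period of at least 950 in this example, compared with 960 here.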
Shortest Path
[Figure: timing diagram starting at the earliest point of launching; labels: $T_{c-q}$, $T_{Lm}$, $T_H$, nominal clock edge; data must not arrive before this time]

Clock Constraints in Edge-Triggered Systems
If launching edge is early and receiving edge is late, the data will not arrive too early if:
$T_{c-q} + T_{Lm} - T_{JI,1} > T_H + T_{JI,2} + \delta$
Minimum logic delay:
$T_{c-q} + T_{Lm} > T_H + 2T_{JI} + \delta$
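A race on the shortest path is avoided when the earliest data arrival, $T_{c-q} + T_{Lm}$, comes after the hold window of the late receiving edge, $T_H + 2T_{JI} + \delta$. The sketch below computes the margin between the two. The function name and the example numbers are hypothetical, and equal jitter on both edges is assumed.

```python
def hold_slack(t_cq, t_logic_min, t_hold, skew, jitter):
    """Margin against a race on the shortest path of an edge-triggered system.

    Returns (T_c-q + T_Lm) - (T_H + 2*T_JI + delta). A positive value means
    the new data arrives after the hold window has closed; a negative value
    means a race. `skew` is delta and `jitter` is T_JI, assumed equal on the
    launching and receiving edges. All times share one unit.
    """
    return (t_cq + t_logic_min) - (t_hold + 2 * jitter + skew)


if __name__ == "__main__":
    # Illustrative values in ps, not taken from any real design.
    for skew in (50, 80):
        slack = hold_slack(t_cq=100, t_logic_min=40, t_hold=30, skew=skew, jitter=20)
        print(f"skew={skew} ps: slack={slack} ps, race={'yes' if slack < 0 else 'no'}")
```

The clock period does not appear in this check, so a race cannot be fixed by slowing the clock. Only more delay in the short path or less skew and jitter helps.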
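How to counter Clock Skew?
[Figure: Data and Clock Routing. Registers (REG) and logic between In and Out, clocked by φ from the clock distribution, with one region of negative skew and one of positive skew]

Flip-Flop Based Timing
[Figure: flip-flop followed by logic, clocked by φ; the cycle over φ = 1 and φ = 0 is divided into flip-flop delay ($T_{c-q}$), logic delay, $T_{SU}$ and skew]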
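Flip-Flops and Dynamic Logic
[Figure: two timing diagrams over φ = 0 and φ = 1, each showing $T_{c-q}$, logic delay and $T_{SU}$; one with precharge followed by evaluate, the other with evaluate followed by precharge]
Flip-flops are used only with static logic

Latch timing
[Figure: latch with input D and output Q, annotated with $t_{D-Q}$ and $t_{c-q}$]
$t_{D-Q}$: when data arrives at a transparent latch. The latch is a soft barrier.
$t_{c-q}$: when data arrives at a closed latch. Data has to be re-launched.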
Single-Phase Clock with Latches
[Figure: latch and logic clocked by φ; clock waveform annotated with $T_{skl}$, $T_{skt}$, $PW$ and $P$]
Unger and Tan, Trans. on Comp., 10/86

Latch-Based Design
L1 latch is transparent when φ = 0
L2 latch is transparent when φ = 1
[Figure: L1 latch, logic, L2 latch, logic, clocked by φ]
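Slack-borrowing
[Figure: In, L1 latch (CLK1), node a, CLB_A ($t_{pd,a}$), node b, L2 latch (CLK2), node c, CLB_B ($t_{pd,b}$), node d, L1 latch (CLK1), node e. Timing diagram over clock period $T_{CLK}$ with edges 1 to 4 of CLK1 and CLK2, showing when a, b, c, d and e are valid, the delays $t_{pd,a}$, $t_{DQ}$, $t_{pd,b}$, $t_{DQ}$, and the slack passed to the next stage]

Latch-Based Timing
[Figure: L1 latch, logic, L2 latch with static logic, clocked by φ; L2 latch during φ = 1, L1 latch during φ = 0; long path, short path and skew marked]
Can tolerate skew!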
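Slack borrowing can be illustrated by tracking when a data edge arrives at and leaves each latch in the chain. The sketch below is a simplified model under stated assumptions: each latch has a known transparency window, a single latch delay `t_dq` is used whether the data arrives before or during transparency, and setup time defaults to zero. All names and numbers are hypothetical.

```python
def propagate_through_latches(stages, t_data_ready, t_dq, t_su=0.0):
    """Propagate a data edge through a chain of level-sensitive latches.

    stages: list of (t_open, t_close, t_pd) tuples, one per latch. The latch
            is transparent from t_open to t_close and is followed by a logic
            block with delay t_pd.
    Returns a list of (arrival, departure, borrowed) tuples per latch, where
    `borrowed` is how far into the transparent window the data arrived.
    Simplification: t_dq is used for both the transparent and the
    re-launched case.
    """
    results = []
    t = t_data_ready
    for t_open, t_close, t_pd in stages:
        arrival = t
        if arrival > t_close - t_su:
            raise ValueError(f"data arrives at {arrival}, after the latch closes")
        # Data waits for the latch to open, or passes straight through if it is open.
        departure = max(arrival, t_open) + t_dq
        borrowed = max(0.0, arrival - t_open)
        results.append((arrival, departure, borrowed))
        t = departure + t_pd
    return results


if __name__ == "__main__":
    # Clock period 10: the first latch is open 0..5, the second 5..10, the third 10..15.
    # The first logic block (delay 6) is longer than half a period.
    stages = [(0, 5, 6), (5, 10, 3), (10, 15, 0)]
    for row in propagate_through_latches(stages, t_data_ready=0, t_dq=0.5):
        print(row)
```

In this example the first logic block overruns its half period by 1.5 time units. The second latch is already transparent, so the data passes through and the shorter second logic block absorbs the overrun. With flip-flops the same path would force a longer clock period.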
Clock Distribution
[Figure: H-tree network driven from CLOCK]
Observe: Only Relative Skew is Important

More realistic H-tree
[Figure] [Restle98]
Clock Network with Distributed Buffering
[Figure: main clock driver (CLOCK) feeding secondary clock drivers, which drive the modules in each local area]
Reduces absolute delay, and makes Power-Down easier
Sensitive to variations in Buffer Delay

The Grid System
[Figure: GCLK grid with four drivers]
No RC-matching
Large power
Example: DEC Alpha 21164
Clock Frequency: 300 MHz - 9.3 Million Transistors
Total Clock Load: 3.75 nF
Power in Clock Distribution network: 20 W (out of 50)
Uses Two Level Clock Distribution:
Single 6-stage driver at center of chip
Secondary buffers drive left and right side clock grid in Metal3 and Metal4
Total driver size: 58 cm!

21164 Clocking
[Figure: clock waveform with $t_{rise}$ = 0.35 ns and $t_{cycle}$ = 3.3 ns; location of clock driver on die (pre-driver, final drivers); $t_{skew}$ = 150 ps]
2 phase single wire clock, distributed globally
2 distributed driver channels
  » Reduced RC delay/skew
  » Improved thermal distribution
  » 3.75 nF clock load
  » 58 cm final driver width
Local inverters for latching
Conditional clocks in caches to reduce power
More complex race checking
Device variation
Clock Drivers
[Figure]

Clock Skew in Alpha Processor
[Figure]
EV6 (Alpha 21264) Clocking
600 MHz, 0.35 micron CMOS
[Figure: global clock waveform with $t_{cycle}$ = 1.67 ns and $t_{rise}$ = 0.35 ns; $t_{skew}$ = 50 ps; PLL]
2 Phase, with multiple conditional buffered clocks
  » 2.8 nF clock load
  » 40 cm final driver width
Local clocks can be gated off to save power
Reduced load/skew
Reduced thermal issues
Multiple clocks complicate race checking

21264 Clocking
[Figure]
EV6 Clock Results
[Figure: two plots. GCLK Skew (at Vdd/2 Crossings), scale 5 to 50 ps. GCLK Rise Times (20% to 80% Extrapolated to 0% to 100%), scale 300 to 345 ps]

EV7 Clock Hierarchy
Active Skew Management and Multiple Clock Domains
[Figure: clock hierarchy with SYSCLK, PLL, three DLLs, and the clocks GCLK (CPU Core), NCLK (Mem Ctrl), L2L_CLK (L2 Cache) and L2R_CLK (L2 Cache)]
+ widely dispersed drivers
+ DLLs compensate static and low-frequency variation
+ divides design and verification effort
+ tailored clocks
- DLL design and verification is added work