Session 8C-5: Inductive Issues in Power Grids and Packages. Controlling Inductive Cross-talk and Power in Off-chip Buses using CODECs

Similar documents
DesignConEast 2005 Track 4: Power and Packaging (4-WA1)

Design and Implementation of Memory-based Cross Talk Reducing Algorithm to Eliminate Worst Case Crosstalk in On- Chip VLSI Interconnect

Agilent EEsof EDA.

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II

Chapter 8. Low-Power VLSI Design Methodology

10/16/2008 GMU, ECE 680 Physical VLSI Design

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

! Crosstalk. ! Repeaters in Wiring. ! Transmission Lines. " Where transmission lines arise? " Lossless Transmission Line.

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

Power Consumption in CMOS CONCORDIA VLSI DESIGN LAB

Design for Manufacturability and Power Estimation. Physical issues verification (DSM)

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics

Announcements. EE141- Fall 2002 Lecture 25. Interconnect Effects I/O, Power Distribution

Digital Integrated Circuits. The Wire * Fuyuzhuo. *Thanks for Dr.Guoyong.SHI for his slides contributed for the talk. Digital IC.

EE 466/586 VLSI Design. Partha Pande School of EECS Washington State University

Temporal Redundancy Based Encoding Technique for Peak Power and Delay Reduction of On-Chip Buses

Grasping The Deep Sub-Micron Challenge in POWERFUL Integrated Circuits

Interconnect s Role in Deep Submicron. Second class to first class

EE241 - Spring 2000 Advanced Digital Integrated Circuits. Announcements

Equivalent Circuit Model Extraction for Interconnects in 3D ICs

EE115C Winter 2017 Digital Electronic Circuits. Lecture 6: Power Consumption

Signal integrity in deep-sub-micron integrated circuits

Rg2 Lg2 Rg6 Lg6 Rg7 Lg7. PCB Trace & Plane. Figure 1 Bypass Decoupling Loop

PDN Planning and Capacitor Selection, Part 1

VLSI Design I. Defect Mechanisms and Fault Models

Explicit Constructions of Memoryless Crosstalk Avoidance Codes via C-transform

Accurate Estimating Simultaneous Switching Noises by Using Application Specific Device Modeling

Where Does Power Go in CMOS?

Power Dissipation. Where Does Power Go in CMOS?

Introduction. HFSS 3D EM Analysis S-parameter. Q3D R/L/C/G Extraction Model. magnitude [db] Frequency [GHz] S11 S21 -30

Making Fast Buffer Insertion Even Faster via Approximation Techniques

Administrative Stuff

Today. ESE532: System-on-a-Chip Architecture. Energy. Message. Preclass Challenge: Power. Energy Today s bottleneck What drives Efficiency of

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier

DKDT: A Performance Aware Dual Dielectric Assignment for Tunneling Current Reduction

! Memory. " RAM Memory. ! Cell size accounts for most of memory array size. ! 6T SRAM Cell. " Used in most commercial chips

Introduction to CMOS VLSI Design (E158) Lecture 20: Low Power Design

SCSI Connector and Cable Modeling from TDR Measurements

Lecture 9: Clocking, Clock Skew, Clock Jitter, Clock Distribution and some FM

Implementation of Clock Network Based on Clock Mesh

Digital Integrated Circuits Designing Combinational Logic Circuits. Fuyuzhuo

Interconnect (2) Buffering Techniques.Transmission Lines. Lecture Fall 2003

Preamplifier in 0.5µm CMOS

Technology Mapping for Reliability Enhancement in Logic Synthesis

Problems in VLSI design

Lecture 2: CMOS technology. Energy-aware computing

EE241 - Spring 2001 Advanced Digital Integrated Circuits

Topics. Dynamic CMOS Sequential Design Memory and Control. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

Lecture 6 Power Zhuo Feng. Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 2010

Timing-Aware Decoupling Capacitance Allocation in Power Distribution Networks

Last Lecture. Power Dissipation CMOS Scaling. EECS 141 S02 Lecture 8

Performance Metrics & Architectural Adaptivity. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

Chapter 2 Fault Modeling

EECS 427 Lecture 11: Power and Energy Reading: EECS 427 F09 Lecture Reminders

Announcements. EE141- Fall 2002 Lecture 7. MOS Capacitances Inverter Delay Power

School of EECS Seoul National University

SRAM System Design Guidelines

Lecture 8-1. Low Power Design

EECS 312: Digital Integrated Circuits Midterm Exam 2 December 2010

NAND3_X1. Page 104 of 969

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

VLSI VLSI CIRCUIT DESIGN PROCESSES P.VIDYA SAGAR ( ASSOCIATE PROFESSOR) Department of Electronics and Communication Engineering, VBIT

Logic BIST. Sungho Kang Yonsei University

Dynamic operation 20

Section 3: Combinational Logic Design. Department of Electrical Engineering, University of Waterloo. Combinational Logic

Signal Encoding Schemes for Low-Power Interface Design

Lecture 16: Circuit Pitfalls

S.Y. Diploma : Sem. III [CO/CM/IF/CD/CW] Digital Techniques s complement 2 s complement 1 s complement

Clock signal in digital circuit is responsible for synchronizing the transfer to the data between processing elements.

Switch Fabrics. Switching Technology S P. Raatikainen Switching Technology / 2004.

The Wire. Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. July 30, 2002

EECS 141 F01 Lecture 17


S.Y. Diploma : Sem. III [DE/ED/EI/EJ/EN/ET/EV/EX/IC/IE/IS/IU/MU] Principles of Digital Techniques

MODULE 5 Chapter 7. Clocked Storage Elements

CMPEN 411 VLSI Digital Circuits Spring Lecture 14: Designing for Low Power

The Wire EE141. Microelettronica

Lecture 21: Packaging, Power, & Clock

Name: Answers. Mean: 38, Standard Deviation: 15. ESE370 Fall 2012

Floating Point Representation and Digital Logic. Lecture 11 CS301

IBIS EBD Modeling, Usage and Enhancement An Example of Memory Channel Multi-board Simulation

Sample Test Paper - I

Interconnects. Wire Resistance Wire Capacitance Wire RC Delay Crosstalk Wire Engineering Repeaters. ECE 261 James Morizio 1

Semiconductor memories

Software Engineering 2DA4. Slides 8: Multiplexors and More

Digital Integrated Circuits EECS 312. Review. Fringe vs. parallel plate capacitance. Rent s rule. Impact of inter-wire capacitance

ECE 407 Computer Aided Design for Electronic Systems. Simulation. Instructor: Maria K. Michael. Overview

Transistor Sizing for Radiation Hardening

Reliability Breakdown Analysis of an MP-SoC platform due to Interconnect Wear-out

DS0026 Dual High-Speed MOS Driver

A Gray Code Based Time-to-Digital Converter Architecture and its FPGA Implementation

Digital Integrated Circuits EECS 312. Midterm exam 1 II. Homework 3 walkthrough. Review. Rent s rule. Inter-wire capacitance

Simultaneous Switching Noise Analysis Using Application Specific Device Modeling

Lecture 15: Scaling & Economics

A Mathematical Solution to. by Utilizing Soft Edge Flip Flops

Lecture 7 Circuit Delay, Area and Power

MM74C922 MM74C Key Encoder 20-Key Encoder

Optimal Folding Of Bit Sliced Stacks+

Fast On-Chip Inductance Simulation Using a Precorrected-FFT Method

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

Transcription:

ASP-DAC 2006 Session 8C-5: Inductive Issues in Power Grids and Packages Controlling Inductive Cross-talk and Power in Off-chip Buses using CODECs Authors: Brock J. LaMeres Agilent Technologies Kanupriya Gulati, Texas A&M University Sunil P. Khatri Texas A&M University January 27, 2006 1

Motivation Power delivery is the biggest challenge facing designers entering DSM - The IC core current continues to increases (P4 = 80Amps). - The package interconnect inductance limits instantaneous current delivery. - The inductance leads to ground and power supply bounce. SSN on signal pins is the leading cause of inter-chip bus failure - Ground/power supply bounce causes unwanted switching. - Mutual Inductive cross-talk causes edge degradation which limits speed. - Mutual Inductive cross-talk causes glitches which results in unwanted switching. Further, power in off-chip buses can be significant. - Large percentage of power may be consumed in the output stages Aggressive package design helps, but is too expensive: - Flip-Chip technology can reduce the interconnect inductance. - Flip-Chip requires a unique package design for each ASIC. - This leads to longer process time which equals cost. - 90% of ASIC design starts use wire-bonding due to its low cost. - Wire-bonding has large parasitic inductance that must be addressed. January 27, 2006 2

Our Solution Encode Off-Chip Data to Avoid Inductive Cross-talk & Power Consumption Avoid the following cases: 1) Excessive switching in the same direction = reduce ground/power bounce 2) Excessive X-talk on a signal when switching = reduce edge degradation 3) Excessive X-talk on signal when static = reduce glitching 4) At the same time, limit the number of transitions = reduce power January 27, 2006 3

Our Solution This results in: 1) A subset of vectors is transmitted that avoids inductive X-talk & power. 2) The off-chip bus can now be ran at a higher data rate. 3) The subset of vectors running faster can achieve a higher throughput over the original set of vectors running slower. Throughput of less vectors at higher data-rate Throughput of more vectors at lower data-rate January 27, 2006 4

Agenda 1) Inductive X-talk & Power 2) Terminology 3) Methodology 4) Experimental Results 5) Conclusion January 27, 2006 5

1) Inductive X-Talk Supply Bounce The instantaneous current that flows when signals switch induces a voltage across the inductance of the power supply interconnect following: V bnc di = L dt When more than one signal returns current through one supply pin, the expression becomes: V bnc di = L dt NOTE: Reducing the number of signals switching in the same direction at the same time will reduce the supply bounce. January 27, 2006 6

1) Inductive X-Talk Glitching Mutual inductive coupling from neighboring signals that are switching cause a voltage to induce on the victim that is static: The net coupling is the summation from all neighboring signals that are switching: V i glitch V i glitch di m k = ± Mik k = 1 dt k =± Mik di dt NOTE: The mutual inductive coupling can be canceled out when two neighbors of equal K ik switch in opposite directions. Also, K ik is the mutual inductive coupling coefficient January 27, 2006 7 M ik = Kik Li Lk

1) Inductive X-Talk Edge Degradation Mutual inductive coupling from neighboring signals that are switching cause a voltage to be induced on the victim that is also switching. This follows the same expression as glitch coupling: V glitch di k k = ± M1 k 1 dt The mutual inductive coupling can be manipulated to cause a positive (negative) glitch for a rising (falling) signal. Mutual coupling can thus be exploited so as to help the transition resulting in a faster rise-time or fall-time (alternately, to not hinder the risetime of the transition) January 27, 2006 8

1) Power Power Consumption The power consumed in the output stage is proportional to the capacitance being driven, the output voltage swing, and the switching frequency. p = CV f pin 2 DD NOTE: Power is proportional to the number of switching pins. January 27, 2006 9

2) Terminology Define the following: n = width of the bus segment where each bus segment consists of n-2 signals and 1 VDD and 1 VSS. = the segment consisting of an n-bit bus. is the segment under consideration. -1 is the segment to the immediate left. +1 is the segment to the immediate right. each segment has the same VDD/VSS placement. January 27, 2006 10

2) Terminology Define the following: v i = the transition (vector sequence) that the i th signal in the th segment is undergoing, where v i v i v i = 1 = rising edge = -1 = falling edge = 0 = signal is static This 3-valued algebra enables us to model mutual inductive coupling of any sign January 27, 2006 11

2) Terminology Define the following coding constraints: Supply Bounce v i if is a supply pin, the total bounce on this pin is bounded by P bnc. P bnc is a user defined constant. Glitching if v i is a signal pin and is static ( v i = 0), the total magnitude of the glitch from switching neighbors should be less than P 0. P 0 is a user defined constant. Edge Degradation if v i is a signal pin and is switching ( v i = 1/-1), the total magnitude of the coupling from switching neighbors should be greater than P 1 / P -1. This coupling should not hurt (should aid) the transition. P 1 / P -1 is a user defined constant. January 27, 2006 12

2) Terminology - Power Define the following coding constraints: Power for a given segment, the total power consumption on that segment is bounded by Ppower. Ppower is a user defined constant. January 27, 2006 13

2) Terminology Also define the following: p = how far away to consider coupling (ex., p = 3, consider K 11, K 12, and K 13 on each side of the victim) k q = Magnitude of coupled voltage on pin i when its q th neighbor p switches: k q di dt p = Mip January 27, 2006 14

3) Methodology For each pin v i within segment, we will write a series of constraints that will bound the inductive cross-talk magnitude. The constraints will differ depending on whether v i is a signal or power pin. The coupling constraints will consider signals in adacent segments (+1, -1) depending on p. January 27, 2006 15

3) Methodology Signal Pin Constraints Glitching : coupling is bounded by P 0 Example: v 2 =0, and p=3. This means the three adacent neighbors on either side of v 2 need to be considered (v -1 4, v 0, v 1, v 3, v 4, v +1 0 ). Note we use modulo n arithmetic (and consider adacent segments as required). v 2 = 0 (static) 0 0 0 0 -P 0 < k 3 (v 4-1 ) + k 2 (v 0 ) + k 1 (v 1 ) + k 1 (v 3 ) + k 2 (v 4 ) + k 3 (v 0 +1 ) < P 0 The constraint equation is tested against each possible transition and the transitions that violate the constraint are eliminated. January 27, 2006 16

3) Methodology Signal Pin Constraints Edge Degradation : coupling is bounded by P 1 and P - 1 Example: v 2 = 1 or -1, and p = 3. This means the three adacent neighbors on either side of v 2 need to be considered (v -1 4, v 0, v 1, v 3, v 4, v +1 0 ). v 2 = 1 (rising) 0 0 0 0 k 3 (v 4-1 ) + k 2 (v 0 ) + k 1 (v 1 ) + k 1 (v 3 ) + k 2 (v 4 ) + k 3 (v 0 +1 ) > P 1 v 2 = -1 (falling) 0 0 0 0 k 3 (v 4-1 ) + k 2 (v 0 ) + k 1 (v 1 ) + k 1 (v 3 ) + k 2 (v 4 ) + k 3 (v 0 +1 ) < P -1 Again, the constraint equations are tested against each possible transition and the transitions that violate the constraints are eliminated. January 27, 2006 17

3) Methodology Power Pin Constraints Supply Bounce : coupling is bounded by P bnc Example: v 0 =VDD or VSS. The total number of switching signals that use v 0 to return current must be considered. Due to symmetry of the bus arrangement, signal pins will always return current through two supply pins. i.e., (v -1 0 and v 0 ) or (v 4 and v +1 4 ). This results in the self inductance of the return path being divided by 2. Let z = L di/dt for any pin. Then, v 0 = VDD (z/2) (# of v i pins that are 1) < P bnc v 4 = VSS (z/2) (# of v i pins that are -1) < P bnc January 27, 2006 18

3) Methodology Power Constraints Power Consumption : consumption is bounded by Ppower Example: For segment. The total number of switching signals can be constrained to reduce power. Segment (# of v i pins that are 1 or -1) < Ppower January 27, 2006 19

3) Methodology Constructing Legal Vectors Sequences For each bit in the th segment bus, constraints are written. If the pin is a signal, 3 constraint equations are written; - v 0 = 0, the bit is static and a glitching constraint is written - v 0 = 1, the bit is rising and an edge degradation constraint is written. - v 0 = -1, the bit is falling and an edge degradation constraint is written. If the pin is VDD, 1 constraint equation is written to avoid supply bounce. If the pin is VSS, 1 constraint equation is written to avoid ground bounce. For the segment, 1 constraint equation is written to constrain power. January 27, 2006 20

3) Methodology Constructing Legal Vectors Sequences This results in the total number of constraint equations written is: (3 n 3) Each equation must be evaluated for each possible transition to verify if the transition meets the constraints. The total number of transitions that are evaluated depends on n and p: 3 (n+2p 6) This follows since there are n-2 signal pins in the segment, and 2p-4 signal pins in neighboring segments. The values of n and p are small in practice, hence this is tractable. January 27, 2006 21

3) Methodology Constructing the CODEC The remaining legal transitions are used to create the CODEC. The total number of remaining legal transitions will depend on how aggressive the user-defined constants are chosen (P 0, P 1, P -1, P bnc, P power ) From the remaining legal transitions, find the effective bus width m that can be encoded using a physical bus of width n, using a memorybased CODEC. Utilize a fixpoint computation January 27, 2006 22

3) Methodology Constructing the CODEC Represent remaining legal transitions in a digraph 000 Algorithm to find CODEC: 001 101 Let n = size of physical bus Let m = size of effective bus Then the digraph of legal transitions of the n bit bus can encode an m bit bus (m < n) iff 010 011 We can find a closed set S of nodes such that S 2 m Each vertex s in S has at least 2 m out-edges (including self-edges) to vertices s in S 110 111 100 Now we can synthesize the encoder and decoder (memory based). January 27, 2006 23

4) Experimental Results 5 Signal Pins Example Bus: n=7, p=2 Aggressive Encoding Non-Aggressive Encoding Power Encoding P0, P1, P-1, Pbnc 5% of VDD 12.5% of VDD 20% of Max January 27, 2006 24

4) Experimental Results Constraint Equations # of Constraints = (3n 3) = 12 1) v 0 = VDD (L/2) (# of v i pins that are 1) < Pbnc 2) v 1 = 1 k1 (v 2 ) + k2 (v 3 ) > P1 3) v 1 = -1 k1 (v 2 ) + k2 (v 3 ) < P-1 4) v 1 = 0 - P0 < k1 (v 2 ) + k2 (v 3 ) < P0 5) v 2 = 1 k1 (v 1 ) + k1 (v 3 ) > P1 6) v 2 = -1 k1 (v 1 ) + k1 (v 3 ) < P-1 7) v 2 = 0 - P0 < k1 (v 1 ) + k1 (v 3 ) < P0 8) v 3 = 1 k2 (v 1 ) + k1 (v 2 ) > P1 9) v 3 = -1 k2 (v 1 ) + k1 (v 2 ) < P-1 10) v 3 = 0 - P0 < k2 (v 1 ) + k1 (v 2 ) < P0 11) v 4 = VSS (L/2) (# of v i pins that are -1) < Pbnc 12) (# of v i pins that are -1 or 1) < Ppower January 27, 2006 25

4) Experimental Results CASE 1: Fixed di/dt Transitions Eliminated due to Rule Violations Rule(s) Violated Transition Aggressive Non Aggressive 011 violates 1,4-0-1-1 violates 4,11-101 violates 1,7-110 violates 1,10-111 violates 1,2,5,8 violates 11 11-1 violates 1-1-11 violates 1-1-1-1 violates 11 - -10-1 violates 7,11 - -111 violates 1 - -11-1 violates 11 - -1-10 violates 10,11 - -1-11 violates 11 - -1-1-1 violates 3,6,9,11 violates 1 January 27, 2006 26

4) Experimental Results CASE 1: Fixed di/dt Encoded data avoids Inductive X-talk pattern Overhead = 1 - Effective = n - m Physical m Bus can be ran faster January 27, 2006 27

4) Experimental Results CASE 1: Fixed di/dt Ground Bounce Simulation January 27, 2006 28

4) Experimental Results CASE 1: Fixed di/dt Glitch Simulation January 27, 2006 29

4) Experimental Results CASE 1: Fixed di/dt Edge Degradation Simulation January 27, 2006 30

4) Experimental Results CASE 2: Variable di/dt di/dt was swept for both the non-encoded and encoded configuration. the maximum di/dt was recorded that resulted in a failure. Failure : 5% of VDD (Aggressive) and 12.5% of VDD (Non-Aggressive) the maximum di/dt was converted to data rate and throughput. Original Aggressive Non-Aggr Maximum di/dt: 8 MA/s 19.9 MA/s 37 MA/s Maximum data-rate per pin: 133 Mb/s 333 Mb/s 667 Mb/s Effective bus width: 5 4 2 Total Throughput: 667 Mb/s 1332 Mb/s 1332 Mb/s Improvement - 100% 100% Power Constraint (% of Max) 100% 20% 20% January 27, 2006 31

4) Experimental Results ASIC Synthesis A 0.13um, TSMC ASIC process was used. Delay and Area Extracted January 27, 2006 32

4) Experimental Results FPGA Implementation A Xilinx, Virtex-II, 0.35um, FPGA was used. Delay and Area Extracted January 27, 2006 33

5) Conclusion Using a single mathematical framework, inductive X-talk & power constraints can be written that consider supply bounce, glitching, and edge degradation. This technique can be used to encode off-chip data transmission to reduce inductive X-talk & power to acceptable levels. It was demonstrated that even after reducing the effective bus size, the improvement in per pin data-rate resulted in an increase in throughput compared to a non-encoded bus. January 27, 2006 34

Thank you! January 27, 2006 35