High-Performance SRAM Design Rahul Rao IBM Systems and Technology Group
Exercise
[Figure: two bitcell schematics, each with a separate READ path; signals RWL, WWL, WBL, WBLb, RBL]
Worst-case read condition: worst-case bitline leakage when reading a 1
Data Independent Leakage Cell
[Figure 9(c): Schematic of the ten-transistor cell, with M9 and M10 added to the eight-transistor schematic to lower leakage power (Calhoun, 2010). Labels: BL, RdWL, RdBL, WL, V_DD, Q, transistors M1 to M10]
Mechanisms of Parametric Failures
- Read Failure
- Access Failure
- Write Failure
- Hold Failure
[Figure: four voltage-vs-time panels, one per mechanism, showing WL, V_L, V_R, the bitline differential Δ_MIN between BL and BR, and V_DDH]
Question
Which of the following are true for the 6-T SRAM cell?
a) A cell with poor READ margin is unlikely to have an access failure
b) Differential read means there is no worst-case data condition for read
c) The worst-case write condition is having cells with alternating 0s and 1s along the column
d) Access failures can be minimized by running the array at a slower frequency
Topics
- Introduction to memory
- SRAM basics and bitcell array (refresher)
- Current Challenges
- Alternative Cell Types (6 to 10T), Asymmetric Cells, Subthreshold Cells, Low leakage cells
- Impact of Variation, Assist Circuits
- BTI and impact on SRAMs
- Power
Sources of Manufacturing Variations
Impact of Manufacturing Variations
[Figure: location of identical ring oscillators on a die, and their frequency correlation (averaged over 300 die)]
Source: Manjul Bhushan, ICMTS, 2005
Environmental Variations
[Figure: hot/cold thermal map]
Temperature Variation
- Switching characteristics of blocks
- Material properties: thermal coefficient
- Cooling and packaging solutions
- Workload and thermal management policies
- Delay and leakage increase with temperature
Power Supply Variation
- IR drop: leakage, power grid robustness
- L di/dt: transient activity, decoupling capacitors
- Power-efficient design strategies: clock gating, power gating
- Delay increases with power supply droop
Source: P. Restle, ICCAD 2006
Global and Local Variations
- LOCAL (intra-die): V_t variation from random dopant fluctuation
- GLOBAL (inter-die): V_t variation
Hold Failure
[Figure: 6T cell (AXL, AXR, PL, PR, NL, NR) in standby with WL = 0, storing L = 1 and R = 0, supply V_DDH; two voltage-vs-time panels showing WL, V_DDH, V_L, and V_R]
Source: S. Mukhopadhyay, ITC 2010
Read Failure
[Figure: 6T cell (AXL, AXR, PL, PR, NL, NR) during read, storing V_L = 1 and V_R = 0; V_R rises to V_READ and is compared against the trip point V_TRIPR; two voltage-vs-time panels showing WL, V_L, and V_R]
Source: S. Mukhopadhyay, ITC 2010
Write Failure
[Figure: 6T cell (AXL, AXR, PL, PR, NL, NR) during write, storing L = 1 and R = 0, with wordline pulse width T_WL; two voltage-vs-time panels showing WL, V_L, and V_R]
Source: S. Mukhopadhyay, ITC 2010
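Access Failure
[Figure: column with an accessed cell (WL = 1, V_L = 1, V_R = 0) and an unaccessed cell (WL = 0, V_L = 0) sharing BL and BR; voltage-vs-time panel showing the bitline differential Δ_MIN developing after WL rises, with time limit T_MAX]
Failure condition: T_AC > T_MAX
Source: S. Mukhopadhyay, ITC 2010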
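One way to make the T_AC > T_MAX condition concrete is a first-order estimate of the time needed to develop Δ_MIN on the bitline. The sketch below assumes a constant cell read current discharging a lumped bitline capacitance. The function names, the constant-current model, and all numbers are illustrative assumptions, not taken from the slides.

```python
import math


def access_time(c_bl, delta_min, i_read):
    """First-order time (s) to develop delta_min (V) on a bitline.

    Assumption (not from the slides): a constant read current i_read (A)
    discharges a lumped bitline capacitance c_bl (F), so t = C * dV / I.
    """
    if i_read <= 0:
        raise ValueError("i_read must be positive")
    return c_bl * delta_min / i_read


def is_access_failure(c_bl, delta_min, i_read, t_max):
    """True when the estimated access time T_AC exceeds the allowed T_MAX."""
    return access_time(c_bl, delta_min, i_read) > t_max


# Hypothetical numbers: 100 fF bitline, 100 mV required differential.
t_nominal = access_time(100e-15, 0.1, 25e-6)  # nominal cell
t_weak = access_time(100e-15, 0.1, 15e-6)     # weaker (higher-Vt) cell
print(f"nominal: {t_nominal * 1e12:.0f} ps, weak: {t_weak * 1e12:.0f} ps")
print("weak cell fails at T_MAX = 500 ps:",
      is_access_failure(100e-15, 0.1, 15e-6, t_max=500e-12))
```

In this model a cell with lower read current needs proportionally longer to develop the same differential, which is why access failure is associated with weak cells.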
Question
Mark the worst-case V_T variation condition for each device for write failure.
[Figure: 6T cell with inverters INV-1 and INV-2 (devices MP1, MP2, MN1, MN2), access transistor MA1, internal nodes A and B, and signals VDD, BL, WL]
Inter-die Variation & Cell Failures
- Low V_t corners: read failure, hold failure
- High V_t corners: access failure, write failure
[Figure: cell failures vs. GLOBAL inter-die V_t shift (ΔV_th-global)]
Source: S. Mukhopadhyay et al., ITC 2005, VLSI 2006, JSSC 2007, TCAD 2008
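Failures in SRAM Array
Overall cell failure: $P_F = P[\mathrm{Fail}] = P[A_F \cup R_F \cup W_F \cup H_F]$
(A_F: access failure, R_F: read failure, W_F: write failure, H_F: hold failure)
$P_{COL}$: probability that any of the cells in a column fail
$P_{COL} = 1 - (1 - P_F)^{N_{ROW}}$
[Figure: failure tree from cell failure (fail with P_F, pass with 1 - P_F) through column failure (P_COL) and redundant columns to memory failure (P_MEM)]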
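The column relation P_COL = 1 - (1 - P_F)^N_ROW can be evaluated numerically, and extended to a memory-level estimate. The sketch below is illustrative only: the slide names P_MEM but does not give its formula, so `memory_failure_prob` assumes independent column failures and a memory that fails only when faulty columns outnumber the redundant ones. Function names and example values are hypothetical.

```python
import math


def column_failure_prob(p_f, n_row):
    """P_COL = 1 - (1 - P_F)^N_ROW, evaluated stably for very small P_F."""
    if not 0.0 <= p_f <= 1.0:
        raise ValueError("p_f must be in [0, 1]")
    if p_f == 1.0:
        return 1.0
    return -math.expm1(n_row * math.log1p(-p_f))


def memory_failure_prob(p_col, n_col, n_red):
    """Probability that more than n_red of n_col columns are faulty.

    Assumptions (not from the slides): column failures are independent, and
    the memory fails only when faulty columns outnumber redundant ones.
    """
    if p_col >= 1.0:
        return 1.0 if n_red < n_col else 0.0
    term = (1.0 - p_col) ** n_col  # binomial term for 0 faulty columns
    ratio = p_col / (1.0 - p_col)
    tail = 0.0
    for i in range(1, n_col + 1):
        term *= (n_col - i + 1) / i * ratio  # binomial term for i faulty columns
        if i > n_red:
            tail += term
    return tail


# Hypothetical array: 256 rows, 128 columns, cell failure probability 1e-6.
p_col = column_failure_prob(1e-6, 256)
for n_red in (0, 1, 2, 4):
    print(n_red, memory_failure_prob(p_col, 128, n_red))
```

This sketch holds P_F fixed while redundancy grows. It does not model the constant-area trade-off, in which adding redundant columns shrinks the cell and raises P_F.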
Impact of Redundancy on Memory Failure
[Figure: array of actual and redundant columns; failure probability (cell failure and P_MEM) vs. redundant columns / total columns [%], at constant total area, for different σV_t]
Larger redundancy means:
(1) more columns available for replacement (less memory failure)
(2) smaller cell area (larger cell failure)
Transistor Sizing
[Figure: three plots of failure probability (log scale) vs. width of pull-up transistor (105 to 125 nm), width of pull-down transistor (185 to 245 nm), and width of access transistor (130 to 150 nm); curves for read failure, write failure, access failure, and cell failure]
$\sigma_{V_{ti}} = \sigma_{V_{t0}} \sqrt{\dfrac{L_{MIN} W_{MIN}}{L_i W_i}}$
Slide contributed by K. Roy, Purdue
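The sizing plots rest on the way threshold-voltage spread shrinks as device area grows. The sketch below assumes the slide's relation is the usual area-scaling form, σ_Vti = σ_Vt0 · sqrt(L_MIN·W_MIN / (L_i·W_i)). The function name, the 30 mV minimum-size spread, and the device dimensions are hypothetical.

```python
import math


def sigma_vt(sigma_vt0, w, l, w_min, l_min):
    """Vt standard deviation of a W x L device from the minimum-size value.

    Assumed form: sigma_vt0 * sqrt((l_min * w_min) / (l * w)).
    All dimensions must use the same unit (for example nm).
    """
    if min(w, l, w_min, l_min) <= 0:
        raise ValueError("dimensions must be positive")
    return sigma_vt0 * math.sqrt((l_min * w_min) / (l * w))


# Hypothetical: 30 mV spread at minimum size (W = 100 nm, L = 50 nm).
for width in (100, 150, 200, 400):
    s = sigma_vt(0.030, width, 50, 100, 50)
    print(f"W = {width} nm -> sigma_Vt = {s * 1e3:.1f} mV")
```

Under this form, quadrupling the device area halves the spread, so upsizing reduces random mismatch at the cost of cell area.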
Question
Array redundancy
a) Improves cell stability
b) Degrades cell performance (i.e., increases read and write times)
c) Does not require any change to cell peripheral circuits
d) Row redundancy is better than column redundancy
Example: Multi-VCC for SRAM Cell
[Figure: plot of V2 (V) vs. V1 (V) for V_WL - V_Cell = 0 V, -0.1 V, and -0.2 V; plot of normalized cell write margin vs. V_WL - V_Cell (V), showing improved write margin]
Create a differential voltage between WL and Cell to decouple Read & Write
- Write: V_WL > V_Cell
- Read: V_WL < V_Cell
Source: K. Zhang et al., ISSCC 2005
Dynamic Circuit Techniques for Variation Tolerant SRAM
[Figure: 6T cell (AXL, AXR, PL, PR, NL, NR) storing 1 and 0, annotated with V_WL = V_DD + Δ, V_cell = V_DD - Δ, and V_BL = 0 - Δ (nominal V_BL = 0, V_BR = V_DD)]
- V_WL. Read: lower V_WL => lower V_read (weak AX). Write: higher V_WL => strong AX helps discharge.
- V_cs. Read: higher V_cs => lower V_read (strong PD), higher V_trip. Write: lower V_cs => weak PUP.
- V_BL. Read: weak impact. Write: negative V_BL for 0 => strong AX helps discharge.
Example: Dual-Vcc based Dynamic Circuit Techniques
[Figure: subarray in which a per-column VCC MUX selects VCC_hi or VCC_lo (via VCC_Select) as VCC_SRAM for each column of cells; 8:1 column MUX; one column marked W (write), the others marked R (read)]
- Dynamic VCC MUX is integrated into the subarray
- VCC selection is along the column direction to decouple Read & Write
Source: K. Zhang et al., ISSCC 2005
Implementation Consideration: Half-Select Stability
[Figure: selected row WL_1 = V_DD + Δ and unselected row WL_2 = 0; selected column with V_cell = V_DD - Δ or bit-line at -Δ; half-selected columns at V_DD]
Higher V_WL
- Row-based scheme
- Degrades half-select read stability of the unselected columns
Lower V_cell or negative bit-line
+ Column-based scheme
+ Half-select read stability remains the same
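Negative Bit Line Scheme
[Figure: conventional scheme vs. this scheme. Conventional: cell on BL and BR with WL and PCHG. This scheme: a BIT_EN generating block (signals WR, PCHG, CS), boosting capacitors C_boost, and NSEL-controlled devices (N_BL, P_BL, N_BR, P_BR) on BL and BR with bit-line capacitance C_BL; waveforms of BIT_EN & WL, NSEL, PCHG, and V_BL for writing D = 0, DB = 1; annotation: ~ C_boost / C_BL]
Source: S. Mukhopadhyay, R. Rao et al., TVLSI 2009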
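The C_boost / C_BL annotation suggests that the negative bit-line excursion is set by capacitive coupling. A minimal sketch, assuming an ideal capacitive divider: the bit-line floats at 0 V and the driven plate of C_boost falls by `v_step`. The function name and the capacitance values are hypothetical, and leakage and device parasitics are ignored.

```python
def negative_bitline_voltage(v_step, c_boost, c_bl):
    """Bit-line voltage (V) after the boost capacitor's driven plate falls.

    Assumption (not from the slides): BL floats at 0 V and couples through an
    ideal divider, so V_BL = -v_step * C_boost / (C_boost + C_BL).
    For C_boost << C_BL this is approximately -v_step * C_boost / C_BL.
    """
    if c_boost < 0 or c_bl <= 0:
        raise ValueError("capacitances must be non-negative, c_bl positive")
    return -v_step * c_boost / (c_boost + c_bl)


# Hypothetical values: 1.0 V step, 90 fF bit-line.
for c_boost in (5e-15, 10e-15, 20e-15):
    v_bl = negative_bitline_voltage(1.0, c_boost, 90e-15)
    print(f"C_boost = {c_boost * 1e15:.0f} fF -> V_BL = {v_bl * 1e3:.0f} mV")
```

A larger boost capacitor gives a deeper negative excursion, at the cost of area and of energy spent driving it.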
Effectiveness Considerations: Writability improvement
[Figure: normalized write fail probability (10^0 down to 10^-15) vs. change in terminal voltage Δ (0 to 200 mV), from fast Monte-Carlo simulations for 45 nm PD/SOI; curves for V_cell = V_DD - Δ, V_WL = V_DD + Δ, and V_BL = -Δ]
- Various dynamic schemes have different effectiveness in improving writability for similar read stability
- Higher V_WL is most effective
Source: S. Mukhopadhyay, R. Rao et al., TVLSI 2009
Impact on Active Data-Retention
[Figure: active data-retention fail probability for the DC-NBL and lower V_cell schemes; cell on an unselected row (WL_2 = 0) in the selected column, with bit-line at -Δ or V_cell = V_DD - Δ. Fail probabilities are normalized to the write fail probability at the nominal condition]
- Column-based read-write control adversely impacts active data-retention failures
- DC negative bit-line has higher active data-retention failures
- Tran-NBL and lower V_cs have comparable failure rates
Source: S. Mukhopadhyay, R. Rao et al., TVLSI 2009
Assist Methods
Question
Of the various assist methods:
a) Negative bit line scheme does not help the 8-T SRAM cell
b) Word line underdrive does not help the 8-T SRAM cell
c) Word line overdrive does not help the 7-T conditionally decoupled SRAM cell
d) VCDL does not help any kind of asymmetric SRAM cell
Block Diagram
[Figure: SRAM block diagram. A 2^n x 2^m cell array with wordlines WL[0] to WL[2^n - 1] and 2^m bits on bitline pairs BL_0/BLB_0 to BL_{2^m - 1}/BLB_{2^m - 1}; precharge circuit; row decoder driven by the row address bits starting at A_0, A_1; column decoder driven by address bits A_n to A_{n+m-1}; sense amplifier & write driver blocks; block decoder; address buffer; global data bus; timing & control with R/W and CS inputs; global read/write]
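The addressing in the block diagram can be sketched in a few lines: the low n address bits pick one of 2^n wordlines and the next m bits pick one of 2^m columns. The function names are hypothetical, and the sketch assumes A_0 is the least significant bit and that the decoders produce one-hot outputs. Block decoding is omitted.

```python
def split_address(addr, n, m):
    """Split an (n + m)-bit address into (row, column) indices.

    Assumption (not from the slides): A_0 is the least significant bit, bits
    A_0..A_{n-1} feed the row decoder and A_n..A_{n+m-1} the column decoder.
    """
    if not 0 <= addr < (1 << (n + m)):
        raise ValueError("address out of range")
    row = addr & ((1 << n) - 1)
    col = (addr >> n) & ((1 << m) - 1)
    return row, col


def one_hot(index, width):
    """Decoder output: exactly one of `width` select lines is asserted."""
    if not 0 <= index < width:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(width)]


# Hypothetical 8-row x 4-column array (n = 3, m = 2).
row, col = split_address(0b10011, n=3, m=2)
wordlines = one_hot(row, 1 << 3)       # WL[0] .. WL[7]
column_select = one_hot(col, 1 << 2)   # column mux select lines
print(row, col, wordlines, column_select)
```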