Semiconductor Memories Adapted from Chapter 12 of Digital Integrated Circuits A Design Perspective Jan M. Rabaey et al. Copyright 2003 Prentice Hall/Pearson Outline Memory Classification Memory Architectures The Memory Core Periphery Reliability Case Studies
Semiconductor Memory Classification Read-Write Memory Non-Volatile Read-Write Memory Read-Only Memory Random Access Non-Random Access EPROM E 2 PROM Mask-Programmed Programmable (PROM) SRAM FIFO FLASH DRAM LIFO Shift Register CAM Memory Timing: Definitions Read cycle READ Read access Read access Write cycle WRITE Data valid Write access DATA Data written
N wo r d s D e c o d e r Memory Architecture: Decoders M bits M bits S 0 S 1 S 2 Word 0 Word 1 Word 2 Storage cell A 0 A 1 S 0 Word 0 Word 1 Word 2 Storage cell S N2 2 S N2 1 Word N2 2 Word N2 1 A K-1 K = log 2 N Word N2 2 Word N2 1 Input-Output (M bits) Input-Output (M bits) Intuitive architecture for N x M memory Too many select signals: N words == N select signals Decoder reduces the number of select signals K = log 2 N Problem: ASPECT RATIO or HEIGHT >> WIDTH Array-Structured Memory Architecture 2 L 2 K Bit line Storage cell A K A K11 A L 21 Row Decoder Word line Sense amplifiers / Drivers M.2 K Amplify swing to rail-to-rail amplitude A 0 A K21 Column decoder Selects appropriate word Input-Output (M bits)
Block31 Block30 Subglobalrowdecoder Globalrowdecoder Subglobalrowdecoder Block1 128KArayBlock0 Localrowdecoder Hierarchical Memory Architecture Block 0 Block i Block P 2 1 Row address Column address Block address Control circuitry Block selector Global amplifier/driver Global data bus I/O Advantages: 1. Shorter wires within blocks 2. Block address activates only 1 block => power savings Block Diagram of 4 Mbit SRAM Clock generator Z-address buffer X-address buffer Predecoder and block selector Bit line load Transfer gate Column decoder Sense amplifier and write driver CS, WE buffer I/O buffer x1/x4 controller Y-address buffer X-address buffer Digital Integrated Circuits 2nd [Hirose90] Memories
I / O B u f f e r s I / O B u f f e r s Co m m a n d s C o m m a n d s Ad d r e s s De c o d e r A d d r e s s D e c o d e r 9 9 2 V a l i d i t y B i t s 2 V a l i d i t y B i t s P r i o r i t y E n c o d e r P r i o r i t y E n c o d e r Contents-Addressable Memory I/O Buffers Data (64 bits) Commands Comparand Mask Control Logic R/W Address (9 bits) Address Decoder CAM Array 2 9 words 3 64 bits 2 9 Validity Bits Priority Encoder Memory Timing: Approaches Address bus Row Address Column Address RAS CAS Address Bus Address Address transition initiates memory operation RAS-CAS timing DRAM Timing Multiplexed Addressing SRAM Timing Self-timed
Memory core Read-Only Memory Cells 1 0 GND Diode ROM MOS ROM 1 MOS ROM 2 MOS OR ROM [0] [1] [2] [3] [0] [1] [2] [3] V bias Pull-down loads
MOS NOR ROM Pull-up devices [0] [1] GND [2] GND [3] [0] [1] [2] [3] MOS NOR ROM Layout Cell (9.5λ x 7λ) Programmming using the Active Layer Only Polysilicon Metal1 Diffusion Metal1 on Diffusion
MOS NOR ROM Layout Cell (11λ x 7λ) Programmming using the Contact Layer Only Polysilicon Metal1 Diffusion Metal1 on Diffusion MOS NAND ROM Pull-up devices [0] [1] [2] [3] [0] [1] [2] [3] All word lines high by default with exception of selected row
MOS NAND ROM Layout Cell (8λ x 7λ) Programmming using the Metal-1 Layer Only No contact to VDD or GND necessary; drastically reduced cell size Loss in performance compared to NOR ROM Polysilicon Diffusion Metal1 on Diffusion NAND ROM Layout Cell (5λ x 6λ) Programmming using Implants Only Polysilicon Threshold-altering implant Metal1 on Diffusion
Equivalent Transient Model for MOS NOR ROM Model for NOR ROM r word C bit c word Word line parasitics Wire capacitance and gate capacitance Wire resistance (polysilicon) Bit line parasitics Resistance not dominant (metal) Drain and Gate-Drain capacitance Equivalent Transient Model for MOS NAND ROM Model for NAND ROM r bit C L r word c bit c word Word line parasitics Similar to NOR ROM Bit line parasitics Resistance of cascaded transistors dominates Drain/Source and complete gate capacitance
Decreasing Word Line Delay Driver Polysilicon word line Metal word line (a) Driving the word line from both sides Metal bypass K cells (b) Using a metal bypass Polysilicon word line (c) Use silicides Precharged MOS NOR ROM f pre Precharge devices [0] [1] GND [2] [3] GND [0] [1] [2] [3] PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design.
Non-Volatile Memories The Floating-gate gate transistor (FAMOS) Source Floating gate Gate Drain D t ox G n + p Substrate t ox n +_ S Device cross-section Schematic symbol Floating-Gate Transistor Programming 20 V 0 V 5 V 10 V 5 V 20 V - 5 V 0 V - 2.5 V 5 V S D S D S D Avalanche injection Removing programming voltage leaves charge trapped Programming results in higher V T.
A Programmable-Threshold Transistor I D 0 -state 1 -state ON DV T OFF V V GS FLOTOX EEPROM Floating gate Source Gate Drain I 20 30 nm n 1 Substrate p n 1 10 nm -10 V 10 V V GD FLOTOX transistor Fowler-Nordheim I-V characteristic
EEPROM Cell Absolute threshold control is hard Unprogrammed transistor might be depletion 2 transistor cell Flash EEPROM Control gate Floating gate erasure n 1 source programming p-substrate Thin tunneling oxide n 1 drain Many other options
Cross-sections sections of NVM cells Flash EPROM Digital Integrated Circuits 2nd Courtesy Intel Memories Basic Operations in a NOR Flash Memory Erase cell array 0 1 12 V G 0 V 0 S D 12 V 0 V 1 open open
Basic Operations in a NOR Flash Memory Write 12 V G 6 V 12 V 0 1 0 S D 0 V 0 V 1 6 V 0 V Basic Operations in a NOR Flash Memory Read 5 V G 1 V 5 V 0 1 0 S D 0 V 0 V 1 1 V 0 V
Characteristics of State-of of-the-art NVM Read-Write Memories (RAM) STATIC (SRAM) Data stored as long as supply is applied Large (6 transistors/cell) Fast Differential DYNAMIC (DRAM) Periodic refresh required Small (1-3 transistors/cell) Slower Single Ended
6-transistor CMOS SRAM Cell M 2 M 4 Q M Q M 5 6 M 1 M 3 CMOS SRAM Analysis (Read) M 4 Q = 0 M 5 Q = 1 M 6 M 1 C bit C bit
V o l t a g e r i s e [ V ] CMOS SRAM Analysis (Read) 1.2 1 Voltage Rise (V) 0.8 0.6 0.4 0.2 0 0 0.5 11.2 1.5 2 Cell Ratio (CR) 2.5 3 CMOS SRAM Analysis (Write) M 4 Q = 0 M 6 M 5 Q = 1 M 1 = 1 = 0 W4 / L4 PR = W / L 6 6
CMOS SRAM Analysis (Write) W4 / L4 PR = W / L 6 6 6T-SRAM Layout M2 M4 Q Q M1 M3 M5 M6 GND
Resistance-load SRAM Cell R L R L M 3 Q Q M 4 M 1 M 2 Static power dissipation -- Want R L large Bit lines precharged to to address t p problem SRAM Characteristics
3-Transistor DRAM Cell 1 2 W R W M 3 R M 1 X M 2 X 2 V T C S 1 2 2 V T DV No constraints on device ratios Reads are non-destructive Value stored at node X when writing a 1 = V W -V Tn 3T-DRAM Layout 2 1 GND R M3 M2 W M1
1-Transistor DRAM Cell Write 1 Read 1 M 1 C S X GND 2 V T /2 V sensing DD /2 C Write: C S is charged or discharged by asserting and. Read: Charge redistribution takes places between bit line and storage capacitance C S V = V V PRE = (V BIT V PRE ) ------------ C S + C Voltage swing is small; typically around 250 mv. DRAM Cell Observations 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out. DRAM memory cells are single ended in contrast to SRAM cells. The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation. Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design. When writing a 1 into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than
Sense Amp Operation V V(1) V PRE DV(1) Sense amp activated Word line activated V(0) t 1-T T DRAM Cell Capacitor Metal word line Poly n + n + Inversion layer Poly induced by plate bias Cross-section SiO 2 Field Oxide Diffused bit line Polysilicon gate Layout Polysilicon plate M 1 word line Uses Polysilicon-Diffusion Capacitance Expensive in Area
SEM of poly-diffusion capacitor 1T-DRAM Advanced 1T DRAM Cells Word line Insulating Layer Cell plate Capacitor dielectric layer Cell Plate Si Capacitor Insulator Refilling Poly Transfer gate Isolation Storage electrode Storage Node Poly 2nd Field Oxide Si Substrate Trench Cell Stacked-capacitor Cell
Periphery Decoders Sense Amplifiers Input/Output Buffers Control / Timing Circuitry Row Decoders Collection of 2 M complex logic gates Organized in regular and dense fashion (N)AND Decoder NOR Decoder
Hierarchical Decoders Multi-stage implementation improves performance 1 0 A 0 A 1 A 0 A 1 A 0 A 1 A 0 A 1 A 2 A 3 A 2 A 3 A 2 A 3 A 2 A 3 NAND decoder using 2-input pre-decoders A 1 A 0 A 0 A 1 A 3 A 2 A 2 A 3 Dynamic Decoders Precharge devices GND GND 3 3 2 2 1 0 1 0 φ A 0 A 0 A 1 A 1 A 0 A 0 A 1 A 1 φ 2-input NOR decoder 2-input NAND decoder
2 - i n p u t N O R d e c o d e r 4-input pass-transistor based column decoder 0 1 2 3 A 0 S 0 S 1 S 2 A 1 S 3 Advantages: speed (t pd does not add to overall memory access time) Only one extra transistor in signal path Disadvantage: Large transistor count D 4-to-11 tree based column decoder 0 1 2 3 A 0 A 0 A 1 A 1 Number of devices drastically reduced Delay increases quadratically with # of sections; prohibitive for large decoders Solutions: buffers progressive sizing combination of tree and pass transistor approaches D
Sense Amplifiers t p = C ---------------- V I av make V as small as possible large small Idea: Use Sense Amplifer small transition s.a. input output Differential Sense Amplifier M 3 M 4 y Out bit M 1 M 2 bit SE M 5 Directly applicable to SRAMs
Differential Sensing SRAM PC EQ i SRAM cell i x Diff. Sense Amp 2 x Output (a) SRAM sensing scheme Latch-Based Sense Amplifier (DRAM) EQ SE SE Initialized in its meta-stable point with EQ Once adequate voltage gap created, sense amp enabled with SE Positive feedback quickly forces output to a stable operating point.
Single-to to-differential Conversion Cell x Diff. S.A. 2 x 1 2 V ref Output How to make a good V ref? Open bitline architecture with dummy cells EQ L L 1 L 0 R0 SE R 1 L L R C S C S C S SE CS Dummy cell C S C S Dummy cell
V V V DRAM Read Process with Dummy Cell 3 3 2 2 1 1 0 0 1 2 3 0 0 1 2 3 t (ns) t (ns) reading 0 reading 1 3 EQ 2 SE 1 0 0 1 2 3 t (ns) control signals Noise Sources in 1T DRam substrate Adjacent C W a-particles leakage C S electrode C cross
Open Bit-line Architecture Cross Coupling EQ 1 0 C D W C D 0 1 W C C C C Sense Amplifier C C C C Folded-Bitline Architecture 1 1 0 0 D C W D C x y C C C C C C EQ Sense Amplifier C x y C W
Transposed-Bitline Architecture 9 C cross 99 (a) Straightforward bit-line routing SA 9 99 C cross SA (b) Transposed bit-line architecture