CMPEN 411 VLSI Digital Circuits, Spring 2011
Lecture 21: Shifters, Decoders, Muxes
[Adapted from Rabaey's Digital Integrated Circuits, Second Edition, 2003, J. Rabaey, A. Chandrakasan, B. Nikolic]
Sp11 CMPEN 411 L21 S.1
Review: Basic Building Blocks
  Datapath
    Execution units: adder, multiplier, divider, shifter, etc.
    Register file and pipeline registers
    Multiplexers, decoders
  Control
    Finite state machines (PLA, ROM, random logic)
  Interconnect
    Switches, arbiters, buses
  Memory
    Caches (SRAMs), TLBs, DRAMs, buffers
Parallel Programmable Shifters
  Control =
    Shift amount (Sh2 Sh1 Sh0)
    Shift direction (left, right)
    Shift type (logical, arithmetic, circular)
  Shifters are used in multipliers and floating-point units
  They consume lots of area if done in random logic gates
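The three shift types listed above differ only in what fills the vacated bit positions. The following is a minimal behavioral sketch in Python, not a circuit description. The function name `shift_word`, the default 8-bit width, and the choice to take the shift amount modulo the word width are assumptions made for illustration.

```python
def shift_word(value, amount, direction, kind, width=8):
    """Shift a width-bit unsigned word and return a width-bit unsigned result.

    direction: "left" or "right"
    kind:      "logical", "arithmetic", or "circular"
    """
    mask = (1 << width) - 1
    value &= mask
    amount %= width  # assumption: shift amount wraps at the word width

    if kind == "circular":
        # Bits shifted out on one side re-enter on the other side.
        if direction == "left":
            return ((value << amount) | (value >> (width - amount))) & mask
        return ((value >> amount) | (value << (width - amount))) & mask

    if direction == "left":
        # Logical and arithmetic left shifts both fill with zeros.
        return (value << amount) & mask

    if kind == "arithmetic":
        # Arithmetic right shift replicates the sign (most significant) bit.
        sign = value >> (width - 1)
        fill = (mask << (width - amount)) & mask if sign else 0
        return (value >> amount) | fill

    # Logical right shift fills with zeros.
    return value >> amount
```

For the word 1001 0110, a right shift by two gives 0010 0101 (logical), 1110 0101 (arithmetic), and 1010 0101 (circular).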
A Programmable Binary Shifter
  [Figure: pass-transistor shifter slice with inputs Ai, Ai-1, outputs Bi, Bi-1, and control lines rgt, nop, left]

  A1 A0 | rgt nop left | B1 B0
  ------+--------------+------
  A1 A0 |  0   1   0   | A1 A0
  A1 A0 |  1   0   0   | 0  A1
  A1 A0 |  0   0   1   | A0 0
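The behavior of the one-bit programmable shifter can be sketched as follows. This is a hedged behavioral model only: the function name `shift_one` and the list representation (most significant bit first) are assumptions for illustration, and the zero fill matches the 0 inputs shown at the edges of the slice.

```python
def shift_one(a_bits, rgt, nop, left):
    """Shift a word by at most one position.

    a_bits is [A_{n-1}, ..., A_1, A_0] with 0/1 entries.
    Exactly one of rgt, nop, left must be 1.
    """
    assert rgt + nop + left == 1, "exactly one control line may be active"
    if nop:
        return list(a_bits)          # B_i = A_i
    if rgt:
        return [0] + a_bits[:-1]     # B_i = A_{i+1}, zero enters at the top
    return a_bits[1:] + [0]          # B_i = A_{i-1}, zero enters at the bottom
```

With A1 A0 = 1 0, `shift_one([1, 0], 1, 0, 0)` returns `[0, 1]`, which corresponds to the rgt case B1 B0 = 0 A1.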
4-bit Barrel Shifter
  [Figure: array of pass transistors with inputs A3..A0, outputs B3..B0, and decoded control lines !Sh1 !Sh0, !Sh1 Sh0, Sh1 !Sh0, Sh1 Sh0]
  Example:
    !Sh1 !Sh0 = 1:  B3 B2 B1 B0 = A3 A2 A1 A0
    !Sh1  Sh0 = 1:  B3 B2 B1 B0 = A3 A3 A2 A1
     Sh1 !Sh0 = 1:  B3 B2 B1 B0 = A3 A3 A3 A2
     Sh1  Sh0 = 1:  B3 B2 B1 B0 = A3 A3 A3 A3
  Area dominated by wiring
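The example rows above describe an arithmetic right shift in which exactly one decoded control line connects each output to a single input. A minimal Python sketch of that behavior follows; `barrel_shift` and the LSB-first list convention are assumptions for illustration, and the model ignores all electrical effects.

```python
def barrel_shift(a, shift):
    """Arithmetic right shift modeled as a one-hot selected pass-transistor array.

    a[i] holds bit A_i (index 0 is the least significant bit).
    shift is the shift amount, 0 <= shift < len(a).
    Returns b with b[i] holding bit B_i.
    """
    n = len(a)
    # Decoded shift amount: exactly one control line is active.
    one_hot = [int(k == shift) for k in range(n)]
    b = []
    for i in range(n):
        # Output B_i has one pass transistor per control line; only the
        # active line conducts. Positions past the top reuse the sign bit.
        b.append(sum(one_hot[k] & a[min(i + k, n - 1)] for k in range(n)))
    return b
```

With A3 A2 A1 A0 = 1 0 1 0, written LSB first as `[0, 1, 0, 1]`, a shift of 1 returns `[1, 0, 1, 1]`, that is B3 B2 B1 B0 = A3 A3 A2 A1 = 1 1 0 1.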
4-bit Barrel Shifter Layout
  [Figure: layout of the 4-bit barrel shifter with inputs A3..A0, an output buffer, and control rows Sh0 (!Sh1 !Sh0), Sh1 (!Sh1 Sh0), Sh2 (Sh1 !Sh0), Sh3 (Sh1 Sh0); the horizontal dimension is labeled Width_barrel]
  Only one Sh# active at a time
  Width_barrel ~ O(N)
  Delay ~ 1 fet + N diff caps
Logarithmic Shifter Structure

  Control     Stage shifts by    Cumulative shifts available
  Sh0, !Sh0   0 or 1 bits        0, 1
  Sh1, !Sh1   0 or 2 bits        0, 1, 2, 3
  Sh2, !Sh2   0 or 4 bits        0, 1, 2, 3, 4, 5, 6, 7
  Sh3, !Sh3   0 or 8 bits        0, 1, 2, ..., 15
  Sh4, !Sh4   0 or 16 bits       0, 1, 2, ..., 31
8-bit Logarithmic Shifter
  [Figure: pass-transistor stages controlled by Sh0/!Sh0, Sh1/!Sh1, Sh2/!Sh2 (values 0 1, 1 0, 0 1 shown), with inputs A3..A0 and outputs B3..B0 visible]
  log N stages
8-bit Logarithmic Shifter Layout Slice
  [Figure: layout slice with stages shifting by 1, 2, and 4, inputs A3..A0, outputs B3..B0]
  K = log2 N
  Delay ~ K fets + 2 diff caps
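The staged structure can be sketched behaviorally: stage k either passes the word through or shifts it by 2^k, depending on bit k of the shift amount. The Python model below is an illustration under assumptions, namely a right shift with zero fill, a power-of-two word width, and the hypothetical name `log_shift_right`.

```python
def log_shift_right(a, shift):
    """Right shift built from K = log2(N) stages of 0-or-2^k shifts.

    a[i] holds bit A_i (index 0 is the least significant bit).
    len(a) is assumed to be a power of two and 0 <= shift < len(a).
    Vacated positions are filled with zeros (an assumption of this sketch).
    """
    n = len(a)
    k_stages = (n - 1).bit_length()   # K = log2 N for power-of-two N
    bits = list(a)
    for k in range(k_stages):
        if (shift >> k) & 1:          # Sh_k = 1: this stage shifts by 2^k
            step = 1 << k
            bits = bits[step:] + [0] * step
        # Sh_k = 0: the stage passes the word through unchanged
    return bits
```

Every bit passes through exactly K stages regardless of the shift amount, which is the origin of the "K fets" term in the delay estimate.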
Shifter Implementation Comparisons

            Barrel                   Logarithmic
  N   K     Width     Speed          Width                 Speed
            2N p_m    1 + N diffs    p_m (2K + 2^K - 1)    K + 2 diffs
  8   3     16 p_m    1 + 8          13 p_m                3 + 2
  16  4     32 p_m    1 + 16         23 p_m                4 + 2
  32  5     64 p_m    1 + 32         41 p_m                5 + 2
  64  6     128 p_m   1 + 64         75 p_m                6 + 2

  Barrel shifter needs a K x 2^K shift amount decoder
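The table rows follow directly from the width and speed expressions in its header row. As a quick check, the sketch below evaluates them for a given N; the function name `shifter_costs` and the dictionary keys are hypothetical, widths are expressed in multiples of p_m, and speed is reported as a pair (series fets, diffusion capacitances).

```python
def shifter_costs(n):
    """Evaluate the barrel and logarithmic shifter estimates for word size n.

    n is assumed to be a power of two, so K = log2(n) is an integer.
    """
    k = n.bit_length() - 1                     # K = log2 N
    return {
        "K": k,
        "barrel_width": 2 * n,                 # 2N p_m
        "barrel_speed": (1, n),                # 1 fet + N diffs
        "log_width": 2 * k + 2 ** k - 1,       # p_m (2K + 2^K - 1)
        "log_speed": (k, 2),                   # K fets + 2 diffs
    }


for n in (8, 16, 32, 64):
    print(n, shifter_costs(n))
```

The logarithmic shifter is the narrower of the two at every size in the table, and the gap grows with N because the barrel width grows as 2N while the number of series fets in the logarithmic design grows only as log2 N.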
Decoders
  Decodes inputs to activate one of many outputs
  2x4 decoder with inputs In1, In0 and Enable:
    Out0 = !In1 & !In0
    Out1 = !In1 &  In0
    Out2 =  In1 & !In0
    Out3 =  In1 &  In0
  In random gate logic, this needs two inverters, four 2-input NAND gates, and four inverters, plus enable logic
  How about for a 3-to-8, 4-to-16, etc. decoder?
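The four output equations can be checked with a small truth-table model. This is a behavioral sketch only; the function name `decoder_2x4` and the treatment of Enable as an active-high signal ANDed into every output are assumptions for illustration.

```python
def decoder_2x4(in1, in0, enable=1):
    """2-to-4 decoder: returns [Out0, Out1, Out2, Out3] with 0/1 entries."""
    n1, n0 = 1 - in1, 1 - in0        # the two input inverters
    return [
        enable & n1 & n0,            # Out0 = !In1 & !In0
        enable & n1 & in0,           # Out1 = !In1 &  In0
        enable & in1 & n0,           # Out2 =  In1 & !In0
        enable & in1 & in0,          # Out3 =  In1 &  In0
    ]
```

For example, `decoder_2x4(1, 0)` returns `[0, 0, 1, 0]`, and any input with `enable=0` returns all zeros.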
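Dynamic NOR Row Decoder
  [Figure: dynamic NOR array between VDD and GND with a precharge device per row and pull-down devices driven by !A0, A0, !A1, A1]
  Example shown: precharge goes 0 -> 1 with !A0 A0 !A1 A1 = 0 1 0 1
    Out0: 1 -> 0
    Out1: 1 -> 0
    Out2: 1 -> 0
    Out3: 1 -> 1 (selected row stays high)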
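Dynamic NAND Row Decoder
  [Figure: dynamic NAND array with a precharge device per row and series pull-down devices driven by !A0, A0, !A1, A1]
  Example shown: precharge goes 0 -> 1 with !A0 A0 !A1 A1 = 0 1 0 1
    Out0: 1 -> 1
    Out1: 1 -> 1
    Out2: 1 -> 1
    Out3: 1 -> 0 (selected row is pulled low)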
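The two dynamic row decoders differ in output polarity: in the NOR version every unselected row discharges, while in the NAND version only the selected row discharges. A hedged logic-level sketch of the evaluate phase follows. The function names are hypothetical, and the model assumes every output has already been precharged to 1 and ignores charge sharing and leakage.

```python
def dynamic_nor_decoder(a1, a0):
    """Evaluate a 2-bit dynamic NOR row decoder; returns [Out0..Out3].

    Each row has parallel pull-downs, so the row discharges if ANY address
    bit fails to match. Only the selected row stays at its precharged 1.
    """
    outs = []
    for row in range(4):
        r1, r0 = (row >> 1) & 1, row & 1
        pulled_down = (a1 != r1) or (a0 != r0)
        outs.append(0 if pulled_down else 1)
    return outs


def dynamic_nand_decoder(a1, a0):
    """Evaluate a 2-bit dynamic NAND row decoder; returns [Out0..Out3].

    Each row has series pull-downs, so the row discharges only if ALL
    address bits match. Only the selected row goes to 0 (active low).
    """
    outs = []
    for row in range(4):
        r1, r0 = (row >> 1) & 1, row & 1
        pulled_down = (a1 == r1) and (a0 == r0)
        outs.append(0 if pulled_down else 1)
    return outs
```

With A1 = A0 = 1, as in the two examples, the NOR decoder evaluates to `[0, 0, 0, 1]` and the NAND decoder to `[1, 1, 1, 0]`.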
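Building Big Decoders from Small
  [Figure: decoder tree on address bits A4 A3 A2 A1 A0 built from a 1x2 decoder and several 2x4 decoders chained through their enable inputs; example signal values 1 0 1 and 0 0 0 0 1 are shown]
  Active low enable
  Active low output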
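One way to chain small decoders through their enable inputs is sketched below. It is an illustration under assumptions and may not match the figure exactly: the arrangement (a 1x2 stage on A4, 2x4 stages on A3 A2, 2x4 leaves on A1 A0) is inferred from the labels, the helper names `small_decoder` and `decoder_5x32` are hypothetical, and the model uses active-high signals for readability where the slide uses active-low enables and outputs.

```python
def small_decoder(bits, enable=1):
    """Enabled decoder for a short address; bits is [MSB, ..., LSB].

    Returns a list of 2**len(bits) outputs, one-hot when enabled and
    all zeros when disabled.
    """
    index = 0
    for b in bits:
        index = (index << 1) | b
    return [enable & int(i == index) for i in range(1 << len(bits))]


def decoder_5x32(a4, a3, a2, a1, a0):
    """5-to-32 decoder assembled from one 1x2 stage and 2x4 stages."""
    outputs = []
    for en_top in small_decoder([a4]):                    # 1x2 stage
        for en_mid in small_decoder([a3, a2], en_top):    # 2x4 stages
            # Each middle output enables one 2x4 leaf decoder.
            outputs.extend(small_decoder([a1, a0], en_mid))
    return outputs
```

Only the leaf whose enable chain is fully asserted can raise an output, so exactly one of the 32 outputs is active for any address.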
Multiplexers
  Selects one of several inputs to gate to the single output
  4x1 mux with data inputs In0..In3 and selects S1, S0:
    Out = In0 & !S1 & !S0
        | In1 & !S1 &  S0
        | In2 &  S1 & !S0
        | In3 &  S1 &  S0
  In random gate logic, this needs two inverters, four 3-input NANDs, and one 4-input NAND
  How about for an 8x1, 16x1, etc. mux?
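The sum-of-products expression for the 4x1 mux can be checked with a direct translation. This is a behavioral sketch; the function name `mux_4x1` and the list argument are assumptions for illustration.

```python
def mux_4x1(inputs, s1, s0):
    """4-to-1 multiplexer as a sum of products; inputs is [In0, In1, In2, In3]."""
    in0, in1, in2, in3 = inputs
    n1, n0 = 1 - s1, 1 - s0          # the two select inverters
    return ((in0 & n1 & n0) |
            (in1 & n1 & s0) |
            (in2 & s1 & n0) |
            (in3 & s1 & s0))
```

Exactly one product term has both select literals true, so the output equals the input whose index is the binary value S1 S0.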
Review: TG 2x1 Multiplexer
  [Figure: transmission-gate 2x1 multiplexer schematic and layout with inputs In1, In2, select S and !S, output F, and VDD and GND rails]
  F = !((In1 & S) | (In2 & !S))
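Building Big Muxes from Small
  [Figure: 4x1 mux built from three 2x1 muxes; the first level selects between A0, A1 and between A2, A3 using S0, and the second level selects between the two results using S1 to drive Out]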
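The tree construction can be sketched as a composition of 2x1 muxes. This is a hedged behavioral model: the function names are hypothetical, and the assignment of S0 to the first level and S1 to the second level is an assumption based on the labels in the figure.

```python
def mux_2x1(in0, in1, s):
    """2-to-1 multiplexer: passes in1 when s is 1, otherwise in0."""
    return in1 if s else in0


def mux_4x1_tree(a0, a1, a2, a3, s1, s0):
    """4-to-1 multiplexer built from three 2x1 multiplexers."""
    lower = mux_2x1(a0, a1, s0)       # first level: choose within A0, A1
    upper = mux_2x1(a2, a3, s0)       # first level: choose within A2, A3
    return mux_2x1(lower, upper, s1)  # second level: choose between pairs
```

The same pattern extends to wider muxes: an 8x1 tree needs seven 2x1 muxes arranged in three levels, one level per select bit.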
Memory Definitions
  Size: Kbytes, Mbytes, Gbytes, Tbytes
  Speed
    Read access: delay between the read request and the data being available
    Write access: delay between the write request and the writing of the data into the memory
    (Read or write) cycle: minimum time required between successive reads or writes
  [Timing diagram: a read cycle showing Read, read access, and Data Valid; a write cycle showing Write, write setup, write access, and Data Written]
A Typical Memory Hierarchy
  By taking advantage of the principle of locality, we can present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology.
  [Figure: on-chip components (control, datapath, RegFile, ITLB, DTLB, instruction cache, data cache, eDRAM), second-level cache (SRAM), main memory (DRAM), secondary memory (disk)]
  Left to right, from the on-chip components to secondary memory:
    Speed (ns):    .1's    1's    10's    100's    1,000's
    Size (bytes):  100's   K's    10K's   M's      T's
    Cost:          highest ... lowest
Random Access Read Write Memories
  SRAM: Static Random Access Memory
    Data is stored as long as supply is applied
    Large cells (6 fets/cell), so fewer bits/chip
    Fast, so used where speed is important (e.g., caches)
    Differential outputs (output BL and !BL)
    Use sense amps for performance
    Compatible with CMOS technology
  DRAM: Dynamic Random Access Memory
    Periodic refresh required (every 1 to 4 ms) to compensate for the charge loss caused by leakage
    Small cells (1 to 3 fets/cell), so more bits/chip
    Slower, so used for main memories
    Single-ended output (output BL only)
    Need sense amps for correct operation
    Not typically compatible with CMOS technology
Next Lecture and Reminders
  Next lecture
    SRAM, DRAM, and CAM cores
    Reading assignment: Rabaey, et al., 12.1-12.2.4