EECS 427 Lecture 8: dders Readings: 11.1-11.3.3 3 EECS 427 F09 Lecture 8 1 Reminders HW3 project initial proposal: due Wednesday 10/7 You can schedule a half-hour hour appointment with me to discuss your initial proposal before submission Email your initial proposal (one per group) in doc format to zhengya@eecs.umich by 7 pm on Wednesday night Limit your write-up to 4 pages maximum. Keep it brief and clear. Quiz 1 on Wednesday 10/14 Samples (with solutions) from past terms are posted Half-lecture review next Monday CD4 is due Wednesday 10/14 You can submit it by Thursday 10/15 at noon EECS 427 F09 Lecture 8 2 1
Last Time Dynamic circuits high performance Dynamic logic non-ratioed, dynamic power only, no static current, higher activity, low noise margin Domino logic can be safely cascaded, only non-inverting logic Footless domino ripple precharge, delayed clock, extra power Dual-rail domino both inverting and noninverting outputs NP CMOS cascade n- and p- evaluation networks EECS 427 F09 Lecture 8 3 Overview Full adder Mirror adder Transmission-gate adder Manchester carry chain dder topologies Ripple carry Carry bypass Carry select Carry lookahead Carry lookahead adder Kogge-Stone Radix 2 or Radix 4 Sparse tree, rent-kung EECS 427 F09 Lecture 8 4 2
Full dder Cin Full adder Sum Cout EECS 427 F09 Lecture 8 5 The inary dder Cin Full adder Sum Cout S = = + + + = + + EECS 427 F09 Lecture 8 6 3
Express Sum and Carry as a function of P, G, D Define 3 new variables which ONLY depend on, Generate (G) = Propagate (P) = Delete = WHY? Can also derive expressions for S and based on D and P EECS 427 F09 Lecture 8 7 The Ripple-Carry dder 0 0 1 1 2 2 3 3,0,0,1,2,3 F F F F (,1 ) S 0 S 1 S 2 S 3 Worst case delay linear with the number of bits t d = O(N) t adder = (N-1)t carry + t sum Goal: Make the fastest possible carry path circuit EECS 427 F09 Lecture 8 8 4
Complementary Static CMOS Full dder X S 28 Transistors EECS 427 F09 Lecture 8 9 Inversion Property F F S S EECS 427 F09 Lecture 8 10 5
Minimize Critical Path by Reducing Inverting Stages long Carry Path Even cell Odd cell 0 0 1 1 2 2 3 3,0,0,1,2,3 F F F F S 0 S 1 S 2 S 3 Exploit Inversion Property llows us to remove inverter in carry chain at what cost? EECS 427 F09 Lecture 8 11 etter Structure: Mirror dder Delete "0"-Propagate C i "1"-Propagate Generate S 24 transistors EECS 427 F09 Lecture 8 12 6
Mirror dder Layout S GND EECS 427 F09 Lecture 8 13 Mirror dder Details NMOS and PMOS chains are completely symmetric. Maximum of 2 series transistors in carry ygeneration In layout critical to minimize capacitance at node. Reduction of junction capacitances is particularly important Capacitance at node is composed of 4 junction capacitances, 2 internal gate capacitances, and 6 gate capacitances in connecting adder cell Transistors connected to are closest to output Only optimize transistors in carry stage for speed Transistors in sum stage can be small EECS 427 F09 Lecture 8 14 7
Sizing Mirror dder 6 0-Propagate 1-Propagate 12 12 4 Delete 4 4 4 6 12 4 4 6 6 2 2 3 Ci Generate 6 6 2 2 2 2 3 S Fanout (effective) ~2 3 EECS 427 F09 Lecture 8 15 Transmission Gate Full dder P P P S Sum Generation P P P Carry Generation Setup P EECS 427 F09 Lecture 8 16 8
Manchester Carry Chain V P i P i G i Ci G i P i D i Static Dynamic EECS 427 F09 Lecture 8 17 Manchester Carry Chain Implement P with pass-transistors Implement G with pull-up OR kill (delete) with pull-down (note inversion) Use dynamic logic to reduce complexity and speed up P 0 P 1 P 2 P 3 C 3,0 G 0 G 1 G 2 G 3 C 0 C 1 C 2 C 3 EECS 427 F09 Lecture 8 18 9
Carry-ypass dder P 0 G 1 P 0 G 1 P 2 G 2 P 3 G 3,0,0,1,2 F F F F,3 lso called Carry-Skip P 0 G 1 P 0 G 1 P 2 G 2 P 3 G 3 P=P o P 1 P 2 P 3,0,0,1,2 F F F F Mu ultiplexer,3 Idea: If (P0 and P1 and P2 and P3 = 1) then 3 =,0, else delete or generate EECS 427 F09 Lecture 8 19 Carry-ypass dder (cont.) it 0 3 Setup t setup it 4 7 Setup t bypass it 8 11 Setup it 12 15 Setup Carry propagation Carry propagation Carry propagation Carry propagation Sum Sum Sum t sum Sum M bits t adder = t setup + M tcarry + (N/M-1)t bypass + (M-1)t carry + t sum Inner blocks do not contribute to worst-case delay since they have time to compute while bits 0-3 are propagating (assuming they have a generate or delete) EECS 427 F09 Lecture 8 20 10
Carry Ripple versus Carry ypass t p ripple adder bypass adder 4..8 N EECS 427 F09 Lecture 8 21 Carry-Select dder "0" Setup PG P,G "0" Carry Propagation asic idea: Precompute both possibilities, choose when n available "1" "1" Carry Propagation,k-1 l C Co,k+3 Sum Generation Carry Vector EECS 427 F09 Lecture 8 22 11
Carry Select dder: Critical Path it 0 3 it 4 7 it 8 11 it 12 15 Setup Setup Setup Setup 0 0-Carry 0 0-Carry 0 0-Carry 0 0-Carry 1 1-Carry 1 1-Carry 1 1-Carry 1 1-Carry,0,3,7,11,15 Sum Generation Sum Generation Sum Generation Sum Generation S 0 3 S 4 7 S 8 11 S 12 15 t add = t setup + Mt carry + (N/M)t mux + t sum EECS 427 F09 Lecture 8 23 Linear Carry Select it 0-3 it 4-7 it 8-11 it 12-15 St Setup St Setup St Setup St Setup (1) "0" (1) "0" Carry "0" "0" Carry "0" "0" Carry "0" "0" Carry "1" "1" Carry (5) (5) "1" Carry "1" Carry "1" Carry "1" "1" "1" (5) (5) (5) (6) (7) (8),0 (9) Sum Generation Sum Generation Sum Generation Sum Generation S 0-3 S 4-7 S 8-11 S 12-15 (10) t add = t setup + M*t carry + (N/M)t mux + t sum EECS 427 F09 Lecture 8 24 12
Square Root Carry Select it 0-1 it 2-4 it 5-8 it 9-13 it 14-19 Setup Setup Setup Setup (1) "0" Carry "0" Carry "0" Carry "0" Carry "0" "0" "0" "0" (1) "1" Carry "1" Carry "1" Carry "1" Carry "1" "1" "1" "1" (3) (3) (4) (5) (6) (4) (5) (6) (7),0 Sum Generation Sum Generation Sum Generation Sum Generation S 0-1 S 2-4 S 5-8 S 9-13 (7) Mux (8) Sum S 14-19 (9) t add t Mt 2 setup carry N tmux tsum EECS 427 F09 Lecture 8 25 dder Delays - Comparison 50 t p (in unit delays) 40 30 20 10 Ripple adder Linear select Square root select 0 0 20 40 N 60 EECS 427 F09 Lecture 8 26 13
Lookahead - asic Idea 0, 0 1, 1 N-1, N-1,0 P 0,1 P 1, N-1 P N-1 S 0 S 1 S N-1 k = f k k k = G k + P k k 1 1 EECS 427 F09 Lecture 8 27 4-bit Lookahead dder Expanding Lookahead equations: k = G k + P k G k 1 + P k 1 k 2 G 3 G 2 ll the way: k = G k + P k G k 1 + P k 1 + P 1 G 0 + P 0 0,0 G 1 G 0,3 P 0 P 1 P 2 P 3 EECS 427 F09 Lecture 8 28 14
Logarithmic Lookahead dder 0 F 1 2 3 4 5 6 7 0 1 t p N 2 3 4 5 6 7 F t p log 2 (N) EECS 427 F09 Lecture 8 29 Carry Lookahead Trees 0 = G 0 + P 0 0 1 = G 1 + P 1 G 0 + P 1 P 0 0 2 = G 2 + P 2 G 1 + P 2 P 1 G 0 + P 2 P 1 P 0 0 = G 2 + P 2 G 1 + P 2 P 1 G 0 + P 0 0 = G 2:1 + P 2:1 0 Can continue building the tree hierarchically. In general G j:k = G j + P j G j-1:k P j:k = P j P j-1:k EECS 427 F09 Lecture 8 30 15
Tree dder ( 0, 0 ) ( 1, 1 ) ( 2, 2 ) ( 3, 3 ) ( 4, 4 ) ( 5, 5 ) ( 6, 6 ) ( 7, 7 ) ( 8, 8 ) ( 9, 9 ) ( 10, 10 ) ( 11, 11 ) ( 12, 12 ) ( 13, 13 ) ( 14, 14 ) ( 15, 15 ) S 0 S 1 S 2 S 3 S 4 S 5 S 6 S 7 S 8 S 9 S 10 S 11 S 12 S 13 S 14 S 15 16-bit radix-2 Kogge-Stone tree EECS 427 F09 Lecture 8 31 Domino dder Clk G i =ab i i Clk P i = a i + b i a i a i b i b i Clk Clk Propagate Generate EECS 427 F09 Lecture 8 32 16
Domino dder Clk k P i:i-2k+1 Clk k G i:i-2k+1 P i:i-k+1 P i:i-k+1 G i:i-k+1 P i-k:i-2k+1 G i-k:i-2k+1 Propagate Generate EECS 427 F09 Lecture 8 33 Domino Sum Keeper Clk Clkd Sum Gi:0 Clk S i 0 Clkd Clk Gi:00 S i 1 Clk EECS 427 F09 Lecture 8 34 17
Tree dder (a ) 0, b 0 (a ) 1, b 1 (a ) 2, b 2 (a ) 3, b 3 (a ) 4, b 4 (a ) 5, b 5 (a ) 6, b 6 (a ) 7, b 7 (a ) 8, b 8 (a ) 9, b 9 (a ) 10, b 10 (a 11, b 11 (a ) 12, b 12 (a ) 13, b 13 (a ) 14, b 14 (a ) 15, b 15 S 0 S 1 S 2 S 3 S 4 S 5 S 6 S 7 S 8 S 9 S 10 S 11 S 12 S 13 S 14 S 15 ) 16-bit radix-4 Kogge-Stone Tree EECS 427 F09 Lecture 8 35 Sparse Tree (a 0, b 0 ) (a 1, b 1 ) (a 2, b 2 ) (a 3, b 3 ) (a 4, b 4 ) (a 5, b 5 ) (a 6, b 6 ) (a 7, b 7 ) (a 8, b 8 ) (a 9, b 9 ) (a 10, b 10 ) (a 11, b 11 ) (a 12, b 12 ) (a 13, b 13 ) (a 14, b 14 ) (a 15, b 15 ) S 0 S 1 S 2 S 3 S 4 S 5 S 6 S 7 S 8 S 9 S 10 S 11 S 12 S 13 S 14 S 15 16-bit radix-2 sparse tree with sparseness of 2 EECS 427 F09 Lecture 8 36 18
Tree dder ( 0, 0 ) ( 1, 1 ) ( 2, 2 ) ( 3, 3 ) ( 4, 4 ) ( 5, 5 ) ( 6, 6 ) ( 7, 7 ) ( 8, 8 ) ( 9, 9 ) ( 10, 10 ) ( 11, 11 ) ( 12, 12 ) ( 13, 13 ) ( 14, 14 ) ( 15, 15 ) S 0 S 1 S 2 S 3 S 4 S 5 S 6 S 7 S 8 S 9 S 10 S 11 S 12 S 13 S 14 S 15 rent-kung Tree EECS 427 F09 Lecture 8 37 Radix 2 vs. Radix 4 Fanout consideration Easier to drive large fanouts with Radix-2 Radix-4 has fewer stages and could have speed advantage when driving low fanouts Wire length consideration Radix-4 has longer wires (64 bits: crosses 48 bit slices vs. 32 in radix-2). Fewer logic stages precedes large wireload. Logic consideration Radix-2 has lower stack heights EECS 427 F09 Lecture 8 38 19
Full vs. Sparse Trees Not all the carries are calculated in sparse trees Only every 2 nd, 4 th, etc. Reduced (uneven) input capacitance Fewer transistors and wires Lower power Sparse tree adders need to recover missing carries Ripple (extra gate delay) Precompute (extra fanout) Complex precomputation can get into the critical path EECS 427 F09 Lecture 8 39 Overview Full adder Mirror adder size it carefully Transmission-gate adder efficient XOR implementation Manchester carry chain long RC chain dder topologies Ripple carry simple, small adders Carry bypass skip in groups Carry select precomputation to shorten the critical path Carry lookahead compute carry-in for higher order bits to shorten the critical path Carry look-ahead adder Kogge-Stone full tree, regular interconnect, consistent fanout, large area and high power Radix 2 or Radix 4 tradeoff between gate complexity and tree depth Sparse tree, rent-kung fewer gates, fewer wires, some high-fanout gates slows down the performance EECS 427 F09 Lecture 8 40 20