Low Latency Architectures of a Comparator for Binary Signed Digits in a 28-nm CMOS Technology

Martin Schmidt, Thomas Veigel, Sebastian Haug, Markus Grözing, Manfred Berroth
Stuttgart, Germany

Outline
- Motivation
- Binary Signed-Digit Number System
- Adder
- Comparator
- Conclusion

Motivation

Motivation
Applications for fast adders / multipliers:
- CPU / GPU
- Cryptography
- DSP
- FFT for optical OFDM transmitter
  [Figure: OFDM transmitter block diagram with QAM mapper and IFFT processor]
- Delta-sigma modulators for class-S power amplifiers
  [Figure: delta-sigma modulator block diagram with input X and output Y]

Motivation
Methods to increase throughput of adders and multipliers:
- Carry look-ahead architectures
  - Con: high complexity
  - Con: long interconnects in layout
  - Pro: MSB is sign bit
- Redundant number systems: carry-save
  - Con: conversion necessary
  - Con: sign operation is very slow

Binary Signed-Digit Number System
- Sign-Zero Representation
- Difference Representation

Binary Signed-Digit (BSD) Number System
$X = \sum_{i=0}^{N-1} x_i 2^i$ with $x_i \in \{-1, 0, 1\}$
X: BSD number, $x_i$: BSD digit
Redundancy: one value has several encodings. With digit weights 4, 2, 1:
  ( 0  1  1) = 2 + 1     = 3
  ( 1 -1  1) = 4 - 2 + 1 = 3
  ( 1  0 -1) = 4 - 1     = 3
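
The definition and the redundancy property can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `bsd_value` and `encodings` are made up for this example and digits are written MSB first.

```python
from itertools import product

def bsd_value(digits):
    """Value of a BSD digit string, MSB first, each digit in {-1, 0, 1}."""
    value = 0
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError("BSD digits must be -1, 0 or 1")
        value = 2 * value + d  # Horner form of sum(x_i * 2**i)
    return value

def encodings(target, n):
    """All n-digit BSD strings whose value equals target (brute force)."""
    return [d for d in product((-1, 0, 1), repeat=n) if bsd_value(d) == target]

# The value 3 has three different 3-digit encodings.
print(encodings(3, 3))  # [(0, 1, 1), (1, -1, 1), (1, 0, -1)]
```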

Number Representation in BSD Number System
Sign-zero representation: $x = (-1)^{x_s} (1 - x_z)$ with $x_s, x_z \in \{0, 1\}$
- Sign operation equals leading-one detection
Difference representation: $x = x^+ - x^-$ with $x^+, x^- \in \{0, 1\}$
- Better for addition / multiplication
- Chosen for this design

BSD Adder
- BSD-TC
- BSD-BSD

BSD-TC (Two's Complement) Adder
Implementation of BSD adder with full adders and inverters:
Full adder (FA) with inputs a, b, $c_{in}$: $s = a \oplus b \oplus c_{in}$, $c_{out} = ab + a c_{in} + b c_{in}$
[Figure: full adder symbol with markers for inverted inputs/outputs]
Addition a + b = s with a in BSD, b in TC, s in BSD
[Figure: one full adder per digit position n-1 ... 0 with inputs $a_i$ (BSD) and $b_i$ (TC), a constant 0 at the LSB, and outputs $s_n$ ... $s_0$ (BSD)]
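
A minimal Python sketch of such a carry-free adder row follows. It rests on assumptions that the extracted slide text does not confirm: digits are in difference representation, b is restricted to a non-negative binary number (no TC sign bit handling), and the inverters sit on the a^- input and on the sum output. The function names are hypothetical. Each position uses exactly one full adder and no carry ripples between positions.

```python
def full_adder(a, b, c):
    """Plain full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

def bsd_plus_binary(a_plus, a_minus, b):
    """Add an n-bit non-negative binary number b to an n-digit BSD number a.

    All lists are LSB first. Returns (s_plus, s_minus) with n+1 digits.
    Assumed wiring: FA(a+, b, NOT a-) gives sum + 2*carry = a+ + b + 1 - a-,
    so a+ - a- + b = 2*carry - (NOT sum).
    """
    n = len(b)
    s_plus = [0] * (n + 1)   # constant 0 at the LSB
    s_minus = [0] * (n + 1)
    for i in range(n):       # positions are independent of each other
        s, c = full_adder(a_plus[i], b[i], 1 - a_minus[i])
        s_minus[i] = 1 - s   # inverted sum is the negative part of digit i
        s_plus[i + 1] = c    # carry is the positive part of digit i+1
    return s_plus, s_minus

def bsd_value(plus, minus):
    """Value of a BSD number in difference representation, LSB first."""
    return sum((p - m) * 2**i for i, (p, m) in enumerate(zip(plus, minus)))

# a = 5 - 2 = 3, b = 3, so the BSD result must evaluate to 6.
s_plus, s_minus = bsd_plus_binary([1, 0, 1], [0, 1, 0], [1, 1, 0])
print(bsd_value(s_plus, s_minus))  # 6
```

Because the loop body never reads a result of another position, the delay of the row equals one full adder delay regardless of n.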

BSD-BSD Adder = BSD+(TC-TC)
- Addition a + b = s with a in BSD, b in TC, s in BSD
  [Figure: full adder row for digit positions n-1 ... 0 with a constant 0 at the LSB, outputs $s_n$ ... $s_0$]
- Subtraction a - b = s with a in BSD, b in TC, s in BSD
  [Figure: the same full adder row with a constant 1 at the LSB]
- Addition $a + b = a + b^+ - b^- = s$ with a, b and s in BSD
  [Figure: two cascaded full adder rows, one with a constant 0 and one with a constant 1 at the LSB, outputs $s_n$ ... $s_0$]
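
The decomposition a + b = a + b^+ - b^- can be sketched in Python as two cascaded rows: one that adds the bit vector b^+ and one that subtracts the bit vector b^-. The placement of the inverters and all function names are assumptions chosen so that the arithmetic works out; the exact wiring of the original circuit is not recoverable from the extracted text.

```python
def full_adder(a, b, c):
    """Plain full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

def add_bit_row(a_plus, a_minus, b):
    """BSD a (n digits) plus bit vector b (n bits) -> BSD with n+1 digits."""
    n = len(b)
    s_plus, s_minus = [0] * (n + 1), [0] * (n + 1)
    for i in range(n):
        s, c = full_adder(a_plus[i], b[i], 1 - a_minus[i])
        s_minus[i] = 1 - s       # a+ - a- + b = 2*c - (NOT s)
        s_plus[i + 1] = c
    return s_plus, s_minus

def sub_bit_row(x_plus, x_minus, b):
    """BSD x (n+1 digits, top negative part 0) minus bit vector b (n bits)."""
    n = len(b)
    s_plus, s_minus = [0] * (n + 1), [0] * (n + 1)
    for i in range(n):
        s, c = full_adder(x_plus[i], 1 - x_minus[i], 1 - b[i])
        s_plus[i] = s            # x+ - x- - b = s - 2*(NOT c)
        s_minus[i + 1] = 1 - c
    s_plus[n] = x_plus[n]        # top digit has nothing to subtract
    return s_plus, s_minus

def bsd_add(a_plus, a_minus, b_plus, b_minus):
    """a + b for two n-digit BSD numbers (LSB first) via two FA rows."""
    x_plus, x_minus = add_bit_row(a_plus, a_minus, b_plus)
    return sub_bit_row(x_plus, x_minus, b_minus)

def bsd_value(plus, minus):
    """Value of a BSD number in difference representation, LSB first."""
    return sum((p - m) * 2**i for i, (p, m) in enumerate(zip(plus, minus)))

# a = 3 (5 - 2), b = -1 (2 - 3): the result must evaluate to 2.
s_plus, s_minus = bsd_add([1, 0, 1], [0, 1, 0], [0, 1, 0], [1, 1, 0])
print(bsd_value(s_plus, s_minus))  # 2
```

Every digit passes through two full adders in sequence, which corresponds to a delay of two full adder delays independent of the word width.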

Summary BSD Adder
Advantages
- Adder delay for BSD+TC sum: single full adder delay
- Adder delay for BSD+BSD sum: two times full adder delay
- Hence delay independent of word width
- Only local interconnects, hence easy layout
Disadvantages
- Slow conversion to TC
- Comparator / sign operator is not as simple as in TC

BSD Comparator
- BSD-TC Converter
- Linear Search
- Tree Search

Comparator Architectures
- Conversion to TC: MSB is sign bit
- Conversion to sign-zero representation: leading-one detector

BSD-TC Conversion
a -> s (BSD -> TC)
[Figure: ripple carry adder with one full adder per digit position n-1 ... 0, inputs $a_i$, a constant 1 input, and outputs $s_{n-1}$ ... $s_0$]
Comparator output: $Y = s_{n-1}$
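
As an illustration of why this conversion is slow, the Python sketch below computes a^+ - a^- with a ripple carry chain (a^+ plus the inverted a^- plus a carry-in of 1). It is a sketch under assumptions, not the presented circuit: one guard position is appended so that every n-digit BSD value fits into the TC result, and the function names are hypothetical.

```python
def bsd_to_tc(a_plus, a_minus):
    """Convert an n-digit BSD number (difference representation, LSB first)
    into n+1 two's complement bits (LSB first) with a ripple carry adder."""
    p = list(a_plus) + [0]    # guard position (assumption of this sketch)
    m = list(a_minus) + [0]
    carry = 1                 # a+ - a- = a+ + NOT(a-) + 1
    bits = []
    for p_i, m_i in zip(p, m):
        x, y = p_i, 1 - m_i
        bits.append(x ^ y ^ carry)
        carry = (x & y) | (x & carry) | (y & carry)  # ripples to next position
    return bits

def tc_value(bits):
    """Value of a two's complement bit list, LSB first."""
    n = len(bits)
    return sum(b << i for i, b in enumerate(bits[:-1])) - (bits[-1] << (n - 1))

bits = bsd_to_tc([1, 0, 0], [0, 1, 1])  # 1 - 6 = -5
print(tc_value(bits), "sign bit:", bits[-1])  # -5 sign bit: 1
```

The `carry` variable links each loop iteration to the previous one, so the sign bit is only valid after the carry has passed through all positions.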

Conversion to Sign-Zero Representation
Difference representation ($x^+ x^-$)   Dec   Sign-zero representation ($x_s x_z$)
00                                      0     x1
11                                      0     x1
01                                      -1    10
10                                      +1    00
(x = don't care; the gate circuit outputs 11 for both zero codes.)
Each digit a in difference representation (BSD Diff) is mapped to $(-1)^{b_s} (1 - b_z)$ in sign-zero representation (BSD SZ).
[Figure: for each digit position n-1 ... 0, an XOR gate and an AND gate derive $b_{z,i}$ and $b_{s,i}$ from $a_i$]
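
The per-digit conversion can be sketched in Python as two Boolean expressions. The inversions used here (XNOR for the zero bit, NAND with inverted a^- for the sign bit) are an assumption chosen so that the outputs match the table, including 11 for both zero codes; the function names are hypothetical.

```python
def diff_to_sign_zero(a_plus, a_minus):
    """Map one digit from difference to sign-zero representation.

    Returns (b_s, b_z). Gate polarities are assumed, see lead-in.
    """
    b_z = 1 - (a_plus ^ a_minus)        # 1 when the digit is zero
    b_s = 1 - (a_plus & (1 - a_minus))  # 0 only for the digit +1
    return b_s, b_z

def sign_zero_value(b_s, b_z):
    """Digit value x = (-1)**b_s * (1 - b_z)."""
    return (-1) ** b_s * (1 - b_z)

for a_plus, a_minus in [(0, 0), (1, 1), (0, 1), (1, 0)]:
    b_s, b_z = diff_to_sign_zero(a_plus, a_minus)
    print(a_plus, a_minus, "->", b_s, b_z, "value", sign_zero_value(b_s, b_z))
```

Both gates only see the two bits of one digit, so the conversion adds a constant delay independent of the word width.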

Sign-Zero Representation and Leading-One Detection
Example (MSB first):
+1 is coded in sign-zero representation:
  zero: 111000
  sign: 101011
  value: -0 + 0 - 0 + 4 - 2 - 1 = +1
-1 is coded in sign-zero representation:
  zero: 111000
  sign: 101100
  value: -0 + 0 - 0 - 4 + 2 + 1 = -1
- The weight $2^n$ of a non-zero n-th digit is larger than the magnitude of the sum of all lower digits k < n
- Hence the leading non-zero digit ($zero_i = 0$) of a BSD number determines its sign
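
The bound behind this statement can be written out explicitly. The block below uses the digit weights from the BSD definition and assumes that $x_n$ denotes the leading non-zero digit, so all digits above position n are zero.

```latex
\left| \sum_{k=0}^{n-1} x_k 2^k \right|
  \le \sum_{k=0}^{n-1} 2^k
  = 2^n - 1
  < 2^n
  = \left| x_n 2^n \right|
\quad\Longrightarrow\quad
\operatorname{sign}(X) = \operatorname{sign}(x_n)
```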

Leading-One Detector: Linear Search
[Figure: per-digit XOR/AND stage producing $b_{z,i}$ and $b_{s,i}$ for digit positions n-1 ... 0, followed by a chain of 2:1 MUXes that produces the sign output Y]
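
A behavioural Python sketch of the MUX chain is given below. It assumes that each MUX is controlled by the zero bit of its digit and that the chain starts with the sign bit of the LSB; the function name and the LSB-first list convention are choices made for this example only.

```python
def linear_sign(b_z, b_s):
    """Sign of a BSD number in sign-zero representation (lists LSB first).

    Returns the sign bit of the leading non-zero digit (1 = negative).
    For an all-zero number the result is simply the LSB sign bit.
    """
    y = b_s[0]
    for z, s in zip(b_z[1:], b_s[1:]):
        y = y if z else s  # 2:1 MUX: keep lower result while digit is zero
    return y

# Example from the slide, written MSB first and reversed to LSB first.
zero = [int(c) for c in reversed("111000")]
sign_pos = [int(c) for c in reversed("101011")]  # encodes +1
sign_neg = [int(c) for c in reversed("101100")]  # encodes -1
print(linear_sign(zero, sign_pos), linear_sign(zero, sign_neg))  # 0 1
```

Each loop iteration depends on the previous one, which mirrors the MUX chain: the delay grows linearly with the number of digits.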

Leading-One Detector: Tree Search - 2 Digits
[Figure: tree node for digits 1 and 0. An AND gate and a 2:1 MUX combine $b_{z,1}$, $b_{z,0}$, $b_{s,1}$, $b_{s,0}$ into $c_{z,10}$ and $c_{s,10}$]

Leading-One Detector: Tree Search - 4 Digits
[Figure: first level with two nodes (AND + MUX each) combining digits 3, 2 into $c_{z,32}$, $c_{s,32}$ and digits 1, 0 into $c_{z,10}$, $c_{s,10}$; second level with one node combining these into $d_z$ and $d_s$]

AND Tree - Sign Operator for 4 Digits
[Figure: complete circuit. Per-digit XOR/AND stage producing $b_{z,3}$, $b_{s,3}$ ... $b_{z,0}$, $b_{s,0}$, followed by the two-level AND/MUX tree with intermediate signals $c_{z,32}$, $c_{s,32}$, $c_{z,10}$, $c_{s,10}$ and outputs $d_z$, $d_s$]
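
The tree can be sketched in Python by applying the two-digit node repeatedly. This is a behavioural sketch under assumptions: the node ANDs the two zero bits and uses the zero bit of the upper half as MUX select, the word width is a power of two, and the names `combine` and `tree_sign` are hypothetical.

```python
def combine(hi, lo):
    """Tree node: merge the (zero, sign) pairs of an upper and a lower half."""
    z_hi, s_hi = hi
    z_lo, s_lo = lo
    # AND: the merged group is zero only if both halves are zero.
    # MUX: take the lower sign only when the upper half is zero.
    return z_hi & z_lo, (s_lo if z_hi else s_hi)

def tree_sign(b_z, b_s):
    """Return (d_z, d_s) for a sign-zero coded BSD number, lists LSB first."""
    n = len(b_z)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch expects a power-of-two word width")
    nodes = list(zip(b_z, b_s))
    while len(nodes) > 1:  # one tree level per pass
        nodes = [combine(nodes[i + 1], nodes[i])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

# Digits MSB first: 0, +1, -1, -1, i.e. 4 - 2 - 1 = +1
print(tree_sign([0, 0, 0, 1], [1, 1, 0, 1]))  # (0, 0): non-zero, positive
```

The number of passes through the `while` loop is log2(n), so four digits need two node levels and sixteen digits need four.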

Summary Comparator
Comparator            Transistor count   Layout         Delay (4 digits)   Delay (16 digits)
Ripple carry adder    High               Regular        48 ps              192 ps
Linear search         Low                Regular        31 ps              153 ps
AND tree              Medium             Hierarchical   14 ps              34 ps
Delay is simulated on schematic level in a 28 nm low-power CMOS technology with low-threshold-voltage MOSFETs (preliminary(!) models).

Conclusion
- BSD adder delay is independent of word width
- 3 BSD comparator architectures:
  - Conversion to two's complement
  - Linear search
  - Tree search
- 34 ps comparator delay for a 16-digit BSD number

Thank you for your attention