Tunable Floating-Point for Energy Efficient Accelerators
|
|
- Valerie O’Neal’
- 5 years ago
- Views:
Transcription
1 Tunable Floating-Point for Energy Efficient Accelerators Alberto Nannarelli DTU Compute, Technical University of Denmark 25 th IEEE Symposium on Computer Arithmetic A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 1 / 22
2 Outline Motivation and Background Tunable Floating-Point Multiplication Hardware Implementation Application Example Improvements Conclusions and Future Work A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 2 / 22
3 Tunable Floating-Point (TFP) TFP is a floating-point format with adjustable significand and exponent fields bit-width Features significand m = [4, 24] bits (including hidden bit) exponent e = [5,8] bits rounding modes RTZ Round toward zero (truncation) RTN Round to the nearest RTNE Round to the nearest (tie to even) IEEE 754 roundtiestoeven mode subnormals flushed to zero TFP includes binary32 (m = 24, e = 8), binary16 (m = 11, e = 5), Google s Brain-FP (m = 8, e = 8). A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 3 / 22
4 Motivation High throughput parallel multiplication required in most computing applications Parallel multiplication is a power demanding operation A flexible unit, handling a flexible format, can increase the power efficiency Power efficiency = throughput watt Multiplication can be approximated by reducing the precision Accuracy of reduced precision can be improved by rounding A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 4 / 22
5 Example: Machine Learning Machine Learning applications, such as Deep Neural Networks (DNN), require huge number of multiplications: N w i x i Reducing the precision, when possible, saves energy. Example DNN A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 5 / 22
6 Example: Machine Learning Machine Learning applications, such as Deep Neural Networks (DNN), require huge number of multiplications: N w i x i Reducing the precision, when possible, saves energy. Example DNN Training Training dataset DNN w i error Ground Truth Backpropagation Operations in binary32 (single-precision) A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 5 / 22
7 Example: Machine Learning Machine Learning applications, such as Deep Neural Networks (DNN), require huge number of multiplications: N w i x i Reducing the precision, when possible, saves energy. Example DNN Inference Testing dataset DNN Output w i error Backpropagation Operations in binary16 (half-precision) A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 5 / 22
8 Baseline FP-Multiplier (Significand) Radix-4 array multiplier Radix-4 Recoding PPs generation Adder tree (carry-save): P S,P C M X 2m 1 m PPgen P s P c 2m 1 M Y radix 4 Recoding T R E E m OVF P2 tie flush to 0 HA m 1 P1 1 2:1 mux 0 m 1 ulp/2 m 1 3:1 mux M Z HA m 1 ulp/2 <<1 1 0 P0 mux 2m 1 2m 1 MSBs m 1 P0 LSBs m+1 sticky comp. tie A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 6 / 22
9 Baseline FP-Multiplier (Significand) Radix-4 array multiplier Radix-4 Recoding PPs generation Adder tree (carry-save): P S,P C Speculative Rounding P2: 2 P < 4 P1: 1 P < 2 (norm.) P0: no rounding bit injection 2m 1 OVF M X P2 m PPgen P s P c tie flush to 0 2m 1 HA m 1 M Y radix 4 Recoding T R E E P1 1 2:1 mux 0 m 1 m ulp/2 m 1 3:1 mux M Z HA m 1 ulp/2 <<1 1 0 P0 mux 2m 1 2m 1 MSBs m 1 P0 LSBs m+1 sticky comp. tie A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 6 / 22
10 Baseline FP-Multiplier (Significand) Radix-4 array multiplier Radix-4 Recoding PPs generation Adder tree (carry-save): P S,P C Speculative Rounding P2: 2 P < 4 P1: 1 P < 2 (norm.) P0: no rounding bit injection Normalization & Selection OVF overflow P 2 tie-to-even underflow (exp=0) overflow (exp= ) 2m 1 OVF M X P2 m PPgen P s P c tie flush to 0 2m 1 HA m 1 M Y radix 4 Recoding T R E E P1 1 2:1 mux 0 m 1 m ulp/2 m 1 3:1 mux M Z HA m 1 ulp/2 <<1 1 0 P0 mux 2m 1 2m 1 MSBs m 1 P0 LSBs m+1 sticky comp. tie A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 6 / 22
11 Speculative Rounding 2m 1 P s P c 2m 1 HA ulp/2 HA ulp/2 <<1 P0 P2 m 1 P1 m P0 mux 2m 1 2m 1 pos. 2m-1 2m-2 2m-3... m+2 m+1 m m-1 m P2 1 b. b... b L R b b... b b ulp/2 1 P b... b b L R b... b b ulp/2 1 Sticky bit T = OR m 1 i=0 b i A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 7 / 22
12 Speculative Rounding 2m 1 P s P c 2m 1 HA ulp/2 HA ulp/2 <<1 P0 P2 m 1 P1 m P0 mux 2m 1 2m 1 pos. 2m-1 2m-2 2m-3... m+2 m+1 m m-1 m P2 1 b. b... b L R b b... b b P S ulp/2 1 P C P b... b b L R b... b b HA s ulp/2 1 HA c P... Sticky bit T = OR m 1 i=0 b i A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 7 / 22
13 Baseline FP-Multiplier (Exponent) At beginning of operation and in parallel Detect if operands are zero set integer bit in M Add exponents E 1 = E X +E Y Bias Increment exponent E 2 = E 1 +1 Select exponent on OVF Check conditions underflow (exp=0) overflow (exp= ) Set flush-to-0 Set exponent E Z E x e E y Exp. add E 1 E z e E mux E max Exp. adjust e e Int. bit det. Bias Exp incr. e ZERO OVF INFTY M x (m) M y (m) flush to 0 A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 8 / 22
14 TFP Rounding Need to round in variable position according to m baseline binary32 P S P C ulp/2 A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 9 / 22
15 TFP Rounding Need to round in variable position according to m TFP m = 24 P S P C RW... A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 9 / 22
16 TFP Rounding Need to round in variable position according to m TFP m = 24 TFP m = 21 P S P C RW... P S P C RW... A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 9 / 22
17 TFP Rounding Need to round in variable position according to m TFP m = 24 TFP m = 21 P S P C RW... P S P C RW... Need to introduce RW rounding word for rounding Need to select blue bits for sticky bit computation A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 9 / 22
18 TFP Multiplier (Significand) Hardware to support TFP Decoder needed to generate RW from m HA changed in FA (3:2 CSA) M X PPgen M Y P s P c radix 4 Recoding T R E E m 5 decoder 24 [47,24] <<1 RW : rounding word 3:2 CSA 3:2 CSA P0 <<1 OVF P2 P1 1 2:1 mux P047 mux MSBs 47 SHAMT shift right tie flush to :1 mux M Z 25 sticky comp. tie A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
19 TFP Multiplier (Significand) Hardware to support TFP Decoder needed to generate RW from m HA changed in FA (3:2 CSA) P S P C RW... FA s FA c P M Z... OVF M X PPgen P s P c 3:2 CSA P2 tie flush to 0 M Y radix 4 Recoding T R E E <<1 3:2 CSA P1 1 2:1 mux 0 3:1 mux M Z m decoder P047 MSBs [47,24] RW : rounding word SHAMT <<1 1 mux 0 47 tie P0 shift right 25 sticky comp. A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
20 TFP Multiplier (Significand) Hardware to support TFP Decoder needed to generate RW from m HA changed in FA (3:2 CSA) P S P C RW... FA s FA c P M Z... Barrel shifter (to right) to select blue bits for sticky bit computation Modified selection multiplexer to zero LSBs of result OVF M X PPgen P s P c 3:2 CSA P2 tie flush to 0 M Y radix 4 Recoding T R E E <<1 3:2 CSA P1 1 2:1 mux 0 3:1 mux M Z m decoder P047 MSBs [47,24] RW : rounding word SHAMT <<1 1 mux 0 47 tie P0 shift right 25 sticky comp. A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
21 TFP Multiplier Rounding Modes RTNE IEEE roundtiestoeven Round to the nearest (tie to even) M X PPgen M Y P s P c radix 4 Recoding T R E E m 5 decoder 24 [47,24] <<1 RW : rounding word 3:2 CSA 3:2 CSA P0 <<1 OVF P2 P1 1 2:1 mux P047 mux MSBs 47 SHAMT shift right tie flush to :1 mux M Z 25 sticky comp. tie A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
22 TFP Multiplier Rounding Modes M X M Y RTN Round to the nearest PPgen P s P c radix 4 Recoding T R E E m 5 decoder 24 [47,24] <<1 RW : rounding word 3:2 CSA 3:2 CSA P0 <<1 OVF P2 P1 1 2:1 mux P047 mux MSBs 47 SHAMT shift right tie flush to :1 mux M Z 25 sticky comp. tie A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
23 TFP Multiplier Rounding Modes M X M Y PPgen radix 4 Recoding m 5 decoder RTZ Round toward zero T R E E P s P c <<1 24 [47,24] RW : rounding word (truncation) 3:2 CSA 3:2 CSA P0 <<1 OVF P2 P1 1 2:1 mux OVF mux MSBs 47 SHAMT shift right tie flush to :1 mux M Z 25 sticky comp. tie A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
24 TFP Error vs. Precision m Matrix multiplication 8 8 for m = {6,8,11,14,16,20,24} 0.1 error 0.01 RTZ RTN RTNE e-05 1e-06 1e Error RTNE Error RTN m A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
25 Hardware Implementation STM 45 nm low-power CMOS standard cells Clock period 1.0 ns 15 FO4 2-stage pipeline Post synthesis comparison Area (1) Power (2) error RTNE 15, RTN 12, RTZ 10, (1) [NAND2 equiv.], (2) [mw] at 1 GHz Trade-offs RTNE RTN RTZ Power (ratio) RTNE RTN RTZ Accuracy (ratio) Best tradeoffs for TFP multiplier RTN mode A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
26 TFP multiplier RTN Physical implementation (layout) E x 8 E y 8 Exp. add e 2 Bias tbl Bias 3 (3 MSBs) M X PPgen M Y radix 4 Recoding T R E E P s P c m 5 decoder 24 RW Clock period 1.0 ns (2-stage) Area 10,930 [NAND2 equiv.] Power dissipation 15.5 mw (1 GHz) (binary32 operands) E 1 8 E max <<1 Exp incr. 3:2 CSA 3:2 CSA E max 8 E z E mux Exp. adjust 8 ZERO OVF INFTY P2 P1 OVF MASK 1 masked 2:1 mux flush to 0 0 2:1 mux 1 M Z A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
27 TFP multiplier RTN Physical implementation (layout) E x 8 E 1 Exp. add E max E y 8 8 Bias 3 (3 MSBs) E z Exp incr. E mux Exp. adjust 8 8 e 2 Bias tbl E max ZERO OVF INFTY OVF M X PPgen :2 CSA P2 flush to 0 M Y radix 4 Recoding T R E E P s P c <<1 3:2 CSA P1 1 masked 2:1 mux 0 0 2:1 mux 1 M Z RW m 5 decoder 24 MASK Clock period 1.0 ns (2-stage) Area 10,930 [NAND2 equiv.] Power dissipation 15.5 mw (1 GHz) (binary32 operands) Power dissipation breakdown percent Mult. array 75 Spec. rounding 13 Norm. & selection 2 Exponent 3 Regs. etc. 6 A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
28 binary32 multiplier RTN Physical implementation (layout) E x 8 E 1 E y 8 Exp. add 8 Bias e 2 Bias tbl Exp incr. 3 E max M X PPgen M Y radix 4 Recoding T R E E P s P c HA m 5 decoder 24 RW <<1 HA Comparison multiplier TFP vs. binary32 (RTN mode) Area (post-layout) b32 9,790 TFP 10,930 (+11%) [NAND2 equiv.] E max 8 E z E mux Exp. adjust 8 ZERO OVF INFTY P2 P1 OVF 1 2:1 mux 0 MASK flush to 0 0 2:1 mux 1 M Z Power dissipation 1 Z = X Y 2 Z = (X Y) W A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
29 Switching Activity in TFP and binary32 Multiplier 1 Z = X Y TFP binary32 s exp X : fraction s exp fraction s exp Y : fraction = s exp fraction = Z : s exp fraction s exp fraction A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
30 Switching Activity in TFP and binary32 Multiplier 1 Z = X Y TFP binary32 s exp X : fraction s exp fraction s exp Y : fraction = s exp fraction = Z : s exp fraction s exp fraction 2 Z = (X Y) W TFP binary32 s exp X : fraction s exp fraction s exp Y : fraction = s exp fraction = s exp fraction s exp fraction s exp W : fraction = s exp fraction = Z : s exp fraction s exp fraction A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
31 Power Dissipation TFP vs. binary32 Multiplier P ave under changing significand bitwidth m, e = 8 1 Z = X Y P ave 16 TFP b A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
32 Power Dissipation TFP vs. binary32 Multiplier P ave under changing significand bitwidth m, e = 8 2 Z = (X Y) W P ave 16 TFP b A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
33 Power Dissipation TFP vs. binary32 Multiplier P ave under changing significand bitwidth m, e = 8 2 Z = (X Y) W P ave 16 TFP b32 b32 10 TFP TFP more power efficient than b32 fom m < 20 A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
34 Example: Neural Network Neural network with two hidden layers (depth=2) input layer hidden layer 1 hidden layer 2 output layer A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
35 Example: Neural Network Neural network with two hidden layers (depth=2) input layer hidden layer 1 hidden layer 2 output layer Training/Learning A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
36 Example: Neural Network Neural network with two hidden layers (depth=2) input layer hidden layer 1 hidden layer 2 output layer Training dataset DNN w i error Ground Truth Backpropagation Training/Learning A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
37 Example: Neural Network Neural network with two hidden layers (depth=2) input layer hidden layer 1 hidden layer 2 output layer Training/Learning binary32 binary16 m e ǫ ave epoch n op P ave E Learn , , , , , , , , [mw] [J] A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
38 Example: Neural Network Neural network with two hidden layers (depth=2) input layer hidden layer 1 hidden layer 2 output layer C simulator ǫ ave, epoch, n op test-vectors Synopsys VCS switching activity Synopsys ICC P ave E Learn = P ave n op T clk Training/Learning binary32 binary16 m e ǫ ave epoch n op P ave E Learn , , , , , , , , [mw] [J] A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
39 Example: Neural Network Inference Approximation Error (binary32) A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
40 Example: Neural Network Inference Approximation Error (binary32) A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
41 Example: Neural Network Inference Approximation Error (binary32) Ideal n = 120 A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
42 Example: Neural Network Inference Approximation Error (binary32) Ideal n = 120 NN approx. n = 129 ( ) 97 Miss-classified (in) 32 (out) A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
43 Example: Neural Network Inference Quantization Error / Reduced Precision binary32 binary16 m=8, e=8 m=6, e=5 m=5, e=5 A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
44 Example: Neural Network Inference Quantization Error / Reduced Precision binary32 binary16 m=8, e=8 m=6, e=5 m=5, e=5 m e Np TOT ( ) points in b32 and not in (m,e) points in (m,e) and not in b32 A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
45 M X M Y M X M Y M X M Y Improvements Use compound adder in speculative rounding Implement programmable rounding modes PPgen radix 4 Recoding m 5 PPgen radix 4 Recoding m 5 PPgen radix 4 Recoding m 5 decoder decoder decoder T R E E P s P c 24 [47,24] T R E E P s P c 24 [47,24] T R E E P s P c 24 [47,24] <<1 RW : rounding word <<1 RW : rounding word <<1 RW : rounding word 3:2 CSA 3:2 CSA P0 3:2 CSA 3:2 CSA P0 3:2 CSA 3:2 CSA P0 <<1 <<1 <<1 P2 P1 1 0 P047 mux MSBs 47 P2 P1 1 0 P047 mux MSBs 47 P2 P1 1 0 OVF mux MSBs 47 OVF 1 2:1 mux 0 SHAMT shift right OVF 1 2:1 mux 0 SHAMT shift right OVF 1 2:1 mux 0 SHAMT shift right tie flush to 0 3:1 mux sticky comp. tie 3:1 mux sticky comp. tie 3:1 mux flush to 0 flush to 0 tie tie M Z M Z M Z RTNE RTN RTZ sticky comp. tie RTNE for binary32 and binary16 only replace barrel shifter with multiplexer Split the unit in two when m < 12 two binary16 in parallel A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
46 Conclusions and Future Work Tunable Floating-Point multiplier for accelerator use correct rounding for the selected precision reduced power dissipation Future Work Currently testing a TFP-adder (2-stage) Integrating multiplication and addition in FMA Division A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
47 Additional Slides A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH-25 / 22
48 TFP-mult vs. b32-mult: Power Dissipation Average power dissipation for TFP-mult and b32-mult (RTN) P ave Z = X Y Z = (X Y) W m TFP-mult b32-mult ratio TFP-mult b32-mult ratio P ave [mw] measured at 1 GHz. A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
49 Average Power Dissipation for TFP-mult Average power dissipation for TFP-mult (RTN) as m scales (e = 8) Matrix multiplication Gauss elimination m P ave [mw] ratio P ave [mw] ratio P ave measured at 1 GHz. A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
50 TFP-mult Trends 16 P ave Matrix mult Gauss Trends of average power dissipation for TFP-mult (RTN) as m scales (e = 8) A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
51 Hardware for binary32 to binary16 Reduction /* range checking (exponent) */ E b16 = E b32 B b32 +B b16 = E b // must be positive /* check lower bound (exponent) */ E b16 E max(5b) < 0 E b = E b < 0 /* check significand for non-zero bits */ zero = 0; for i=0 to 12 do zero = zero or M b32 (i); end for c1 Eb = 112 = bit sign if ((E b16 > 0)and(E b < 0)and(zero = 0)) then reduce to binary16 else keep binary32 end if 5 E b16 8 c2 c1 c2 zero 8 bit sign binary16 M M zero... binary32 1 2:1 mux 0 OR TREE A. Nannarelli (DTU Compute) Tunable Floating-Point for Energy Efficient Accelerators ARITH / 22
9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017
9. Datapath Design Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 October 2, 2017 ECE Department, University of Texas at Austin
More informationChapter 8: Solutions to Exercises
1 DIGITAL ARITHMETIC Miloš D. Ercegovac and Tomás Lang Morgan Kaufmann Publishers, an imprint of Elsevier Science, c 2004 with contributions by Fabrizio Lamberti Exercise 8.1 Fixed-point representation
More informationLRADNN: High-Throughput and Energy- Efficient Deep Neural Network Accelerator using Low Rank Approximation
LRADNN: High-Throughput and Energy- Efficient Deep Neural Network Accelerator using Low Rank Approximation Jingyang Zhu 1, Zhiliang Qian 2, and Chi-Ying Tsui 1 1 The Hong Kong University of Science and
More informationVectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier
Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier Espen Stenersen Master of Science in Electronics Submission date: June 2008 Supervisor: Per Gunnar Kjeldsberg, IET Co-supervisor: Torstein
More informationBinary Floating-Point Numbers
Binary Floating-Point Numbers S exponent E significand M F=(-1) s M β E Significand M pure fraction [0, 1-ulp] or [1, 2) for β=2 Normalized form significand has no leading zeros maximum # of significant
More informationProposal to Improve Data Format Conversions for a Hybrid Number System Processor
Proceedings of the 11th WSEAS International Conference on COMPUTERS, Agios Nikolaos, Crete Island, Greece, July 6-8, 007 653 Proposal to Improve Data Format Conversions for a Hybrid Number System Processor
More informationAdders, subtractors comparators, multipliers and other ALU elements
CSE4: Components and Design Techniques for Digital Systems Adders, subtractors comparators, multipliers and other ALU elements Instructor: Mohsen Imani UC San Diego Slides from: Prof.Tajana Simunic Rosing
More informationWhat s the Deal? MULTIPLICATION. Time to multiply
What s the Deal? MULTIPLICATION Time to multiply Multiplying two numbers requires a multiply Luckily, in binary that s just an AND gate! 0*0=0, 0*1=0, 1*0=0, 1*1=1 Generate a bunch of partial products
More informationEECS150 - Digital Design Lecture 24 - Arithmetic Blocks, Part 2 + Shifters
EECS150 - Digital Design Lecture 24 - Arithmetic Blocks, Part 2 + Shifters April 15, 2010 John Wawrzynek 1 Multiplication a 3 a 2 a 1 a 0 Multiplicand b 3 b 2 b 1 b 0 Multiplier X a 3 b 0 a 2 b 0 a 1 b
More informationHardware Design I Chap. 4 Representative combinational logic
Hardware Design I Chap. 4 Representative combinational logic E-mail: shimada@is.naist.jp Already optimized circuits There are many optimized circuits which are well used You can reduce your design workload
More informationBinary Multipliers. Reading: Study Chapter 3. The key trick of multiplication is memorizing a digit-to-digit table Everything else was just adding
Binary Multipliers The key trick of multiplication is memorizing a digit-to-digit table Everything else was just adding 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 2 2 4 6 8 2 4 6 8 3 3 6 9 2 5 8 2 24 27 4 4 8 2 6
More informationCS 140 Lecture 14 Standard Combinational Modules
CS 14 Lecture 14 Standard Combinational Modules Professor CK Cheng CSE Dept. UC San Diego Some slides from Harris and Harris 1 Part III. Standard Modules A. Interconnect B. Operators. Adders Multiplier
More informationProposal to Improve Data Format Conversions for a Hybrid Number System Processor
Proposal to Improve Data Format Conversions for a Hybrid Number System Processor LUCIAN JURCA, DANIEL-IOAN CURIAC, AUREL GONTEAN, FLORIN ALEXA Department of Applied Electronics, Department of Automation
More informationDigital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.
Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEMORY INPUT-OUTPUT CONTROL DATAPATH
More informationProcessor Design & ALU Design
3/8/2 Processor Design A. Sahu CSE, IIT Guwahati Please be updated with http://jatinga.iitg.ernet.in/~asahu/c22/ Outline Components of CPU Register, Multiplexor, Decoder, / Adder, substractor, Varity of
More informationARITHMETIC COMBINATIONAL MODULES AND NETWORKS
ARITHMETIC COMBINATIONAL MODULES AND NETWORKS 1 SPECIFICATION OF ADDER MODULES FOR POSITIVE INTEGERS HALF-ADDER AND FULL-ADDER MODULES CARRY-RIPPLE AND CARRY-LOOKAHEAD ADDER MODULES NETWORKS OF ADDER MODULES
More informationChapter 5: Solutions to Exercises
1 DIGITAL ARITHMETIC Miloš D. Ercegovac and Tomás Lang Morgan Kaufmann Publishers, an imprint of Elsevier Science, c 2004 Updated: September 9, 2003 Chapter 5: Solutions to Selected Exercises With contributions
More informationALU (3) - Division Algorithms
HUMBOLDT-UNIVERSITÄT ZU BERLIN INSTITUT FÜR INFORMATIK Lecture 12 ALU (3) - Division Algorithms Sommersemester 2002 Leitung: Prof. Dr. Miroslaw Malek www.informatik.hu-berlin.de/rok/ca CA - XII - ALU(3)
More informationLecture 18: Datapath Functional Units
Lecture 8: Datapath Functional Unit Outline Comparator Shifter Multi-input Adder Multiplier 8: Datapath Functional Unit CMOS VLSI Deign 4th Ed. 2 Comparator 0 detector: A = 00 000 detector: A = Equality
More informationLogic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits
Logic and Computer Design Fundamentals Chapter 5 Arithmetic Functions and Circuits Arithmetic functions Operate on binary vectors Use the same subfunction in each bit position Can design functional block
More informationDesign and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives
Design and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives Miloš D. Ercegovac Computer Science Department Univ. of California at Los Angeles California Robert McIlhenny
More informationLecture 12: Datapath Functional Units
Lecture 2: Datapath Functional Unit Slide courtey of Deming Chen Slide baed on the initial et from David Harri CMOS VLSI Deign Outline Comparator Shifter Multi-input Adder Multiplier Reading:.3-4;.8-9
More informationLecture 11. Advanced Dividers
Lecture 11 Advanced Dividers Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 15 Variation in Dividers 15.3, Combinational and Array Dividers Chapter 16, Division
More informationNumbering Systems. Computational Platforms. Scaling and Round-off Noise. Special Purpose. here that is dedicated architecture
Computational Platforms Numbering Systems Basic Building Blocks Scaling and Round-off Noise Computational Platforms Viktor Öwall viktor.owall@eit.lth.seowall@eit lth Standard Processors or Special Purpose
More informationA 32-bit Decimal Floating-Point Logarithmic Converter
A 3-bit Decimal Floating-Point Logarithmic Converter Dongdong Chen 1, Yu Zhang 1, Younhee Choi 1, Moon Ho Lee, Seok-Bum Ko 1, Department of Electrical and Computer Engineering, University of Saskatchewan
More informationLecture 8: Sequential Multipliers
Lecture 8: Sequential Multipliers ECE 645 Computer Arithmetic 3/25/08 ECE 645 Computer Arithmetic Lecture Roadmap Sequential Multipliers Unsigned Signed Radix-2 Booth Recoding High-Radix Multiplication
More informationPart VI Function Evaluation
Part VI Function Evaluation Parts Chapters I. Number Representation 1. 2. 3. 4. Numbers and Arithmetic Representing Signed Numbers Redundant Number Systems Residue Number Systems Elementary Operations
More informationLecture 12: Datapath Functional Units
Introduction to CMOS VLSI Deign Lecture 2: Datapath Functional Unit David Harri Harvey Mudd College Spring 2004 Outline Comparator Shifter Multi-input Adder Multiplier 2: Datapath Functional Unit CMOS
More informationChapter 4 Number Representations
Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point
More informationDIVISION BY DIGIT RECURRENCE
DIVISION BY DIGIT RECURRENCE 1 SEVERAL DIVISION METHODS: DIGIT-RECURRENCE METHOD studied in this chapter MULTIPLICATIVE METHOD (Chapter 7) VARIOUS APPROXIMATION METHODS (power series expansion), SPECIAL
More informationCSE 140 Lecture 11 Standard Combinational Modules. CK Cheng and Diba Mirza CSE Dept. UC San Diego
CSE 4 Lecture Standard Combinational Modules CK Cheng and Diba Mirza CSE Dept. UC San Diego Part III - Standard Combinational Modules (Harris: 2.8, 5) Signal Transport Decoder: Decode address Encoder:
More informationVLSI Arithmetic. Lecture 9: Carry-Save and Multi-Operand Addition. Prof. Vojin G. Oklobdzija University of California
VLSI Arithmetic Lecture 9: Carry-Save and Multi-Operand Addition Prof. Vojin G. Oklobdzija University of California http://www.ece.ucdavis.edu/acsel Carry-Save Addition* *from Parhami 2 June 18, 2003 Carry-Save
More information8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points.
Data analysis and modeling: the tools of the trade Patrice Koehl Department of Biological Sciences National University of Singapore http://www.cs.ucdavis.edu/~koehl/teaching/bl5229 koehl@cs.ucdavis.edu
More informationLow Power Division and Square Root
UNIVERSITY OF CALIFORNIA, IRVINE Low Power Division and Square Root DISSERTATION submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in Engineering by Alberto Nannarelli
More informationA COMBINED 16-BIT BINARY AND DUAL GALOIS FIELD MULTIPLIER. Jesus Garcia and Michael J. Schulte
A COMBINED 16-BIT BINARY AND DUAL GALOIS FIELD MULTIPLIER Jesus Garcia and Michael J. Schulte Lehigh University Department of Computer Science and Engineering Bethlehem, PA 15 ABSTRACT Galois field arithmetic
More informationDesign of Sequential Circuits
Design of Sequential Circuits Seven Steps: Construct a state diagram (showing contents of flip flop and inputs with next state) Assign letter variables to each flip flop and each input and output variable
More informationDigital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.
Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEM ORY INPUT-OUTPUT CONTROL DATAPATH
More informationArithmetic Circuits-2
Arithmetic Circuits-2 Multipliers Array multipliers Shifters Barrel shifter Logarithmic shifter ECE 261 Krish Chakrabarty 1 Binary Multiplication M-1 X = X i 2 i i=0 Multiplicand N-1 Y = Y i 2 i i=0 Multiplier
More informationAppendix B. Review of Digital Logic. Baback Izadi Division of Engineering Programs
Appendix B Review of Digital Logic Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Elect. & Comp. Eng. 2 DeMorgan Symbols NAND (A.B) = A +B NOR (A+B) = A.B AND A.B = A.B = (A +B ) OR
More informationComplement Arithmetic
Complement Arithmetic Objectives In this lesson, you will learn: How additions and subtractions are performed using the complement representation, What is the Overflow condition, and How to perform arithmetic
More informationCMPEN 411 VLSI Digital Circuits Spring Lecture 21: Shifters, Decoders, Muxes
CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 21: Shifters, Decoders, Muxes [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp11 CMPEN
More informationOn various ways to split a floating-point number
On various ways to split a floating-point number Claude-Pierre Jeannerod Jean-Michel Muller Paul Zimmermann Inria, CNRS, ENS Lyon, Université de Lyon, Université de Lorraine France ARITH-25 June 2018 -2-
More informationReduced-Area Constant-Coefficient and Multiple-Constant Multipliers for Xilinx FPGAs with 6-Input LUTs
Article Reduced-Area Constant-Coefficient and Multiple-Constant Multipliers for Xilinx FPGAs with 6-Input LUTs E. George Walters III Department of Electrical and Computer Engineering, Penn State Erie,
More informationA HIGH-SPEED PROCESSOR FOR RECTANGULAR-TO-POLAR CONVERSION WITH APPLICATIONS IN DIGITAL COMMUNICATIONS *
Copyright IEEE 999: Published in the Proceedings of Globecom 999, Rio de Janeiro, Dec 5-9, 999 A HIGH-SPEED PROCESSOR FOR RECTAGULAR-TO-POLAR COVERSIO WITH APPLICATIOS I DIGITAL COMMUICATIOS * Dengwei
More informationImplementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System
Implementation Of Digital Fir Filter Using Improved Table Look Up Scheme For Residue Number System G.Suresh, G.Indira Devi, P.Pavankumar Abstract The use of the improved table look up Residue Number System
More informationUNIT V FINITE WORD LENGTH EFFECTS IN DIGITAL FILTERS PART A 1. Define 1 s complement form? In 1,s complement form the positive number is represented as in the sign magnitude form. To obtain the negative
More informationResearch Article Implementation of Special Function Unit for Vertex Shader Processor Using Hybrid Number System
Computer Networks and Communications, Article ID 890354, 7 pages http://dx.doi.org/10.1155/2014/890354 Research Article Implementation of Special Function Unit for Vertex Shader Processor Using Hybrid
More informationNeural Network Approximation. Low rank, Sparsity, and Quantization Oct. 2017
Neural Network Approximation Low rank, Sparsity, and Quantization zsc@megvii.com Oct. 2017 Motivation Faster Inference Faster Training Latency critical scenarios VR/AR, UGV/UAV Saves time and energy Higher
More informationAdders, subtractors comparators, multipliers and other ALU elements
CSE4: Components and Design Techniques for Digital Systems Adders, subtractors comparators, multipliers and other ALU elements Adders 2 Circuit Delay Transistors have instrinsic resistance and capacitance
More informationCost/Performance Tradeoff of n-select Square Root Implementations
Australian Computer Science Communications, Vol.22, No.4, 2, pp.9 6, IEEE Comp. Society Press Cost/Performance Tradeoff of n-select Square Root Implementations Wanming Chu and Yamin Li Computer Architecture
More informationDIGITAL TECHNICS. Dr. Bálint Pődör. Óbuda University, Microelectronics and Technology Institute
DIGITAL TECHNICS Dr. Bálint Pődör Óbuda University, Microelectronics and Technology Institute 4. LECTURE: COMBINATIONAL LOGIC DESIGN: ARITHMETICS (THROUGH EXAMPLES) 2016/2017 COMBINATIONAL LOGIC DESIGN:
More informationNumber Representation and Waveform Quantization
1 Number Representation and Waveform Quantization 1 Introduction This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory.
More informationPart II Addition / Subtraction
Part II Addition / Subtraction Parts Chapters I. Number Representation 1. 2. 3. 4. Numbers and Arithmetic Representing Signed Numbers Redundant Number Systems Residue Number Systems Elementary Operations
More informationComputer Architecture. ESE 345 Computer Architecture. Design Process. CA: Design process
Computer Architecture ESE 345 Computer Architecture Design Process 1 The Design Process "To Design Is To Represent" Design activity yields description/representation of an object -- Traditional craftsman
More informationDual-Field Arithmetic Unit for GF(p) and GF(2 m ) *
Institute for Applied Information Processing and Communications Graz University of Technology Dual-Field Arithmetic Unit for GF(p) and GF(2 m ) * CHES 2002 Workshop on Cryptographic Hardware and Embedded
More informationInteger Multipliers 1
Integer Multipliers Multipliers must have circuit in most DS applications variety of multipliers exists that can be chosen based on their performance Serial, Serial/arallel,Shift and dd, rray, ooth, Wallace
More informationA Suggestion for a Fast Residue Multiplier for a Family of Moduli of the Form (2 n (2 p ± 1))
The Computer Journal, 47(1), The British Computer Society; all rights reserved A Suggestion for a Fast Residue Multiplier for a Family of Moduli of the Form ( n ( p ± 1)) Ahmad A. Hiasat Electronics Engineering
More informationA Low-Error Statistical Fixed-Width Multiplier and Its Applications
A Low-Error Statistical Fixed-Width Multiplier and Its Applications Yuan-Ho Chen 1, Chih-Wen Lu 1, Hsin-Chen Chiang, Tsin-Yuan Chang, and Chin Hsia 3 1 Department of Engineering and System Science, National
More informationDigital Logic: Boolean Algebra and Gates. Textbook Chapter 3
Digital Logic: Boolean Algebra and Gates Textbook Chapter 3 Basic Logic Gates XOR CMPE12 Summer 2009 02-2 Truth Table The most basic representation of a logic function Lists the output for all possible
More informationThe Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal
-1- The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal Claude-Pierre Jeannerod Jean-Michel Muller Antoine Plet Inria,
More informationOptimized Linear, Quadratic and Cubic Interpolators for Elementary Function Hardware Implementations
electronics Article Optimized Linear, Quadratic and Cubic Interpolators for Elementary Function Hardware Implementations Masoud Sadeghian 1,, James E. Stine 1, *, and E. George Walters III 2, 1 Oklahoma
More informationECS 231 Computer Arithmetic 1 / 27
ECS 231 Computer Arithmetic 1 / 27 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 2 / 27 Outline 1 Floating-point numbers
More informationLecture 2 Review on Digital Logic (Part 1)
Lecture 2 Review on Digital Logic (Part 1) Xuan Silvia Zhang Washington University in St. Louis http://classes.engineering.wustl.edu/ese461/ Grading Engagement 5% Review Quiz 10% Homework 10% Labs 40%
More informationSample Test Paper - I
Scheme G Sample Test Paper - I Course Name : Computer Engineering Group Marks : 25 Hours: 1 Hrs. Q.1) Attempt any THREE: 09 Marks a) Define i) Propagation delay ii) Fan-in iii) Fan-out b) Convert the following:
More informationInnocuous Double Rounding of Basic Arithmetic Operations
Innocuous Double Rounding of Basic Arithmetic Operations Pierre Roux ISAE, ONERA Double rounding occurs when a floating-point value is first rounded to an intermediate precision before being rounded to
More informationELEN Electronique numérique
ELEN0040 - Electronique numérique Patricia ROUSSEAUX Année académique 2014-2015 CHAPITRE 3 Combinational Logic Circuits ELEN0040 3-4 1 Combinational Functional Blocks 1.1 Rudimentary Functions 1.2 Functions
More informationElements of Floating-point Arithmetic
Elements of Floating-point Arithmetic Sanzheng Qiao Department of Computing and Software McMaster University July, 2012 Outline 1 Floating-point Numbers Representations IEEE Floating-point Standards Underflow
More informationEfficient Function Approximation Using Truncated Multipliers and Squarers
Efficient Function Approximation Using Truncated Multipliers and Squarers E. George Walters III Lehigh University Bethlehem, PA, USA waltersg@ieee.org Michael J. Schulte University of Wisconsin Madison
More informationChapter 6: Solutions to Exercises
1 DIGITAL ARITHMETIC Miloš D. Ercegovac and Tomás Lang Morgan Kaufmann Publishers, an imprint of Elsevier Science, c 00 Updated: September 3, 003 With contributions by Elisardo Antelo and Fabrizio Lamberti
More informationInternational Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational
More informationCSE140: Components and Design Techniques for Digital Systems. Decoders, adders, comparators, multipliers and other ALU elements. Tajana Simunic Rosing
CSE4: Components and Design Techniques for Digital Systems Decoders, adders, comparators, multipliers and other ALU elements Tajana Simunic Rosing Mux, Demux Encoder, Decoder 2 Transmission Gate: Mux/Tristate
More informationProject Two RISC Processor Implementation ECE 485
Project Two RISC Processor Implementation ECE 485 Chenqi Bao Peter Chinetti November 6, 2013 Instructor: Professor Borkar 1 Statement of Problem This project requires the design and test of a RISC processor
More informationClass Website:
ECE 20B, Winter 2003 Introduction to Electrical Engineering, II LECTURE NOTES #5 Instructor: Andrew B. Kahng (lecture) Email: abk@ece.ucsd.edu Telephone: 858-822-4884 office, 858-353-0550 cell Office:
More informationA High-Speed Realization of Chinese Remainder Theorem
Proceedings of the 2007 WSEAS Int. Conference on Circuits, Systems, Signal and Telecommunications, Gold Coast, Australia, January 17-19, 2007 97 A High-Speed Realization of Chinese Remainder Theorem Shuangching
More informationECE/CS 552: Introduction To Computer Architecture 1. Instructor:Mikko H Lipasti. Fall 2010 University i of Wisconsin-Madison
ECE/CS 552: Arithmetic I Instructor:Mikko H Lipasti Fall 2010 Univsity i of Wisconsin-Madison i Lecture notes partially based on set created by Mark Hill. Basic Arithmetic and the ALU Numb representations:
More informationDesign and Implementation of REA for Single Precision Floating Point Multiplier Using Reversible Logic
Design and Implementation of REA for Single Precision Floating Point Multiplier Using Reversible Logic MadivalappaTalakal 1, G.Jyothi 2, K.N.Muralidhara 3, M.Z.Kurian 4 PG Student [VLSI & ES], Dept. of
More informationEC 413 Computer Organization
EC 413 Computer Organization rithmetic Logic Unit (LU) and Register File Prof. Michel. Kinsy Computing: Computer Organization The DN of Modern Computing Computer CPU Memory System LU Register File Disks
More informationECE 545 Digital System Design with VHDL Lecture 1. Digital Logic Refresher Part A Combinational Logic Building Blocks
ECE 545 Digital System Design with VHDL Lecture Digital Logic Refresher Part A Combinational Logic Building Blocks Lecture Roadmap Combinational Logic Basic Logic Review Basic Gates De Morgan s Law Combinational
More informationEECS150 - Digital Design Lecture 25 Shifters and Counters. Recap
EECS150 - Digital Design Lecture 25 Shifters and Counters Nov. 21, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof. John
More informationNEW SELF-CHECKING BOOTH MULTIPLIERS
Int. J. Appl. Math. Comput. Sci., 2008, Vol. 18, No. 3, 319 328 DOI: 10.2478/v10006-008-0029-4 NEW SELF-CHECKING BOOTH MULTIPLIERS MARC HUNGER, DANIEL MARIENFELD Department of Electrical Engineering and
More informationAn Approximate Parallel Multiplier with Deterministic Errors for Ultra-High Speed Integrated Optical Circuits
An Approximate Parallel Multiplier with Deterministic Errors for Ultra-High Speed Integrated Optical Circuits Jun Shiomi 1, Tohru Ishihara 1, Hidetoshi Onodera 1, Akihiko Shinya 2, Masaya Notomi 2 1 Graduate
More informationECE 545 Digital System Design with VHDL Lecture 1A. Digital Logic Refresher Part A Combinational Logic Building Blocks
ECE 545 Digital System Design with VHDL Lecture A Digital Logic Refresher Part A Combinational Logic Building Blocks Lecture Roadmap Combinational Logic Basic Logic Review Basic Gates De Morgan s Laws
More informationHakim Weatherspoon CS 3410 Computer Science Cornell University
Hakim Weatherspoon CS 3410 Computer Science Cornell University The slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, and Sirer. memory inst 32 register
More informationTree and Array Multipliers Ivor Page 1
Tree and Array Multipliers 1 Tree and Array Multipliers Ivor Page 1 11.1 Tree Multipliers In Figure 1 seven input operands are combined by a tree of CSAs. The final level of the tree is a carry-completion
More informationLecture 8. Sequential Multipliers
Lecture 8 Sequential Multipliers Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 9, Basic Multiplication Scheme Chapter 10, High-Radix Multipliers Chapter
More informationFundamentals of Digital Design
Fundamentals of Digital Design Digital Radiation Measurement and Spectroscopy NE/RHP 537 1 Binary Number System The binary numeral system, or base-2 number system, is a numeral system that represents numeric
More informationThe Design Procedure. Output Equation Determination - Derive output equations from the state table
The Design Procedure Specification Formulation - Obtain a state diagram or state table State Assignment - Assign binary codes to the states Flip-Flop Input Equation Determination - Select flipflop types
More informationWritten exam with solutions IE Digital Design Friday 21/
Written exam with solutions IE204-5 Digital Design Friday 2/0 206 09.00-3.00 General Information Examiner: Ingo Sander. Teacher: Kista, William Sandvist tel 08-7904487, Elena Dubrova phone 08-790 4 4 Exam
More informationDSP Design Lecture 7. Unfolding cont. & Folding. Dr. Fredrik Edman.
SP esign Lecture 7 Unfolding cont. & Folding r. Fredrik Edman fredrik.edman@eit.lth.se Unfolding Unfolding creates a program with more than one iteration, J=unfolding factor Unfolding is a structured way
More informationHow to Prepare Weather and Climate Models for Future HPC Hardware
How to Prepare Weather and Climate Models for Future HPC Hardware Peter Düben European Weather Centre (ECMWF) Peter Düben Page 2 The European Weather Centre (ECMWF) www.ecmwf.int Independent, intergovernmental
More informationPart II Addition / Subtraction
Part II Addition / Subtraction Parts Chapters I. Number Representation 1. 2. 3. 4. Numbers and Arithmetic Representing Signed Numbers Redundant Number Systems Residue Number Systems Elementary Operations
More informationComputer Architecture. ECE 361 Lecture 5: The Design Process & ALU Design. 361 design.1
Computer Architecture ECE 361 Lecture 5: The Design Process & Design 361 design.1 Quick Review of Last Lecture 361 design.2 MIPS ISA Design Objectives and Implications Support general OS and C- style language
More informationDIGIT-SERIAL ARITHMETIC
DIGIT-SERIAL ARITHMETIC 1 Modes of operation:lsdf and MSDF Algorithm and implementation models LSDF arithmetic MSDF: Online arithmetic TIMING PARAMETERS 2 radix-r number system: conventional and redundant
More informationCombinational Logic Design Arithmetic Functions and Circuits
Combinational Logic Design Arithmetic Functions and Circuits Overview Binary Addition Half Adder Full Adder Ripple Carry Adder Carry Look-ahead Adder Binary Subtraction Binary Subtractor Binary Adder-Subtractor
More informationNotes for Chapter 1 of. Scientific Computing with Case Studies
Notes for Chapter 1 of Scientific Computing with Case Studies Dianne P. O Leary SIAM Press, 2008 Mathematical modeling Computer arithmetic Errors 1999-2008 Dianne P. O'Leary 1 Arithmetic and Error What
More informationLOGIC CIRCUITS. Basic Experiment and Design of Electronics
Basic Experiment and Design of Electronics LOGIC CIRCUITS Ho Kyung Kim, Ph.D. hokyung@pusan.ac.kr School of Mechanical Engineering Pusan National University Outline Combinational logic circuits Output
More informationLecture 2: Number Representations (2)
Lecture 2: Number Representations (2) ECE 645 Computer Arithmetic 1/29/08 ECE 645 Computer Arithmetic Lecture Roadmap Number systems (cont'd) Floating point number system representations Residue number
More informationCS61C : Machine Structures
CS 61C L15 Blocks (1) inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #15: Combinational Logic Blocks Outline CL Blocks Latches & Flip Flops A Closer Look 2005-07-14 Andy Carle CS
More informationMultiplication of signed-operands
Multiplication of signed-operands Recall we discussed multiplication of unsigned numbers: Combinatorial array multiplier. Sequential multiplier. Need an approach that works uniformly with unsigned and
More informationNyquist-Rate A/D Converters
IsLab Analog Integrated ircuit Design AD-51 Nyquist-ate A/D onverters כ Kyungpook National University IsLab Analog Integrated ircuit Design AD-1 Nyquist-ate MOS A/D onverters Nyquist-rate : oversampling
More information