A COMBINED 16-BIT BINARY AND DUAL GALOIS FIELD MULTIPLIER. Jesus Garcia and Michael J. Schulte


Lehigh University, Department of Computer Science and Engineering, Bethlehem, PA 18015

ABSTRACT

Galois field arithmetic is commonly used in Reed-Solomon encoding and decoding. This paper presents the design of a combined 16-bit binary and dual Galois field (GF) multiplier. The multiplier performs either a 16-bit two's complement or unsigned multiplication, or two independent 8-bit GF($2^8$) multiplications in SIMD fashion. The combined multiplier is designed by modifying a conventional binary tree multiplier. It uses a novel wiring methodology to provide two simultaneous GF($2^8$) multiplies with a minor impact on area and delay. Three alternatives for the multiplier design are presented. Area and delay estimates indicate that, compared to a conventional binary tree multiplier, the combined multiplier has roughly 6% more delay and 23% more area.

1. INTRODUCTION

Galois field (GF) arithmetic is a powerful algebraic tool, employed in many encoding techniques [1]. In particular, it is used in Reed-Solomon (R-S) encoding, which frequently provides error correction for wireless communications and compact discs. R-S codes are usually implemented with the 8-bit field GF($2^8$).

A brief explanation of some key concepts and operations in GF multiplication follows. Further details can be found in [2]. A GF($2^m$) field is an extension of the field GF(2), whose elements are $\{0, 1\}$. The operations defined in GF(2) are addition and multiplication, each performed modulo 2. When GF(2) is extended to GF($2^m$), the result is a vector space of dimension $m$ over GF(2). Elements of GF($2^m$) can thus be represented as $m$-bit binary words. The field is characterized by the irreducible polynomial

$$f(x) = x^m + f_{m-1}x^{m-1} + \cdots + f_1 x + f_0, \tag{1}$$

with $f_i \in$ GF(2).
All $2^m$ elements in the field can be represented by means of the vector basis $\{\alpha^{m-1}, \ldots, \alpha^1, \alpha^0\}$, where $\alpha$ is a root of the irreducible polynomial $f(x)$ and is called a primitive element of the field. This basis allows an element $A \in$ GF($2^m$) to be expressed as

$$A = a_{m-1}\alpha^{m-1} + \cdots + a_1\alpha^1 + a_0, \tag{2}$$

with $a_i \in$ GF(2). Thus, elements of GF($2^m$) can be associated with polynomials that have coefficients in GF(2), where the bit in the $i$th position represents the coefficient of $\alpha^i$. For example, in GF($2^4$) the word $(a_3\, a_2\, a_1\, a_0)$ corresponds to $a_3\alpha^3 + a_2\alpha^2 + a_1\alpha + a_0$.

Addition and multiplication in GF($2^m$) can be viewed as polynomial addition and multiplication modulo $f(x)$. Since the polynomial's coefficients belong to GF(2), operations at the coefficient level are taken modulo 2. Thus, addition in GF($2^m$) is performed by bitwise XORing the $m$ coefficients of the two polynomials being added. Similarly, multiplication is computed as a modulo 2 sum of shifted partial products, where the sum is again computed using bitwise XORing. The result of a GF($2^m$) multiplication is a $(2m-1)$-bit word, which represents a polynomial of degree $2m-2$, in what is called extended form. To achieve closure, the extended form polynomial is reduced modulo $f(x)$. This is equivalent to calculating the remainder of the extended form polynomial divided by $f(x)$.

For a given $m$, all GF($2^m$) fields are isomorphic. Consequently, a particular field can be chosen without affecting the results, if its representation offers computational advantages. Using a fixed $f(x)$, however, is inconvenient when designing hardware to work with existing applications that use different field representations. Thus, it is useful to allow $f(x)$ to be programmable.

There exists abundant literature on specialized designs for GF multipliers. Few of these designs, however, are oriented towards the use of GF multipliers as part of a programmable digital signal processor (DSP). Consequently, they are designed for a specific field, which makes them less versatile.
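The multiply-then-reduce procedure described above can be sketched in software. This is a minimal bit-level illustration, not the hardware design of the paper; the function names `gf_mul_extended`, `gf_reduce`, and `gf_mul` are hypothetical, and the operands in the usage lines are arbitrary values chosen for GF($2^4$) with $f(x) = x^4 + x + 1$.

```python
def gf_mul_extended(a: int, b: int, m: int) -> int:
    """Carry-free product of two m-bit words: XOR of shifted partial products.

    The result is the (2m-1)-bit extended form, before reduction.
    """
    result = 0
    for i in range(m):
        if (b >> i) & 1:
            result ^= a << i  # add (modulo 2) the partial product shifted by i
    return result


def gf_reduce(p: int, f: int, m: int) -> int:
    """Remainder of p(x) divided by f(x); f includes its leading x^m term."""
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1:
            p ^= f << (i - m)  # subtracting a multiple of f(x) is an XOR
    return p


def gf_mul(a: int, b: int, f: int, m: int) -> int:
    """GF(2^m) product in the polynomial basis defined by f(x)."""
    return gf_reduce(gf_mul_extended(a, b, m), f, m)


F16 = 0b10011  # f(x) = x^4 + x + 1
# (x^3 + x) * (x^2 + x) = x^5 + x^4 + x^3 + x^2, which reduces to x^3 + 1
print(bin(gf_mul_extended(0b1010, 0b0110, 4)))  # 0b111100
print(bin(gf_mul(0b1010, 0b0110, F16, 4)))      # 0b1001
```

Because `f` is an argument rather than a constant, the sketch mirrors the programmable-polynomial requirement discussed above.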
Initial designs for GF multiplication used a serial approach. Although serial GF multipliers have low hardware requirements, they are very slow. Consequently, several parallel designs for GF multipliers have been introduced:

Mastrovito multipliers use a fixed irreducible polynomial that allows the hardware to be optimized [3],[4]. Although these designs have been used successfully in application-specific implementations, they are less flexible for use in programmable DSPs.

Systolic GF multipliers are very useful when the rate of the input operands is constant at high clock rates [5]. Systolic GF multipliers, however, are not suitable for implementation in programmable DSPs, which have slower clock cycles and irregular input rates.

Parallel GF multipliers use a dedicated functional unit and allow $f(x)$ to be specified. The parallel GF multiplier presented in [6] is conceptually very similar to part of the design presented here. The main difference is that the design from [6] is a dedicated unit, which does not also support conventional binary multiplications. The design from [6] is used in this paper for comparison purposes.

The TMS320C64x DSP provides support for programmable GF multiplications, with $m$ ranging from 1 to 8 [7]. It executes four independent GF($2^8$) multiplies with results ready after four cycles. No details about the actual implementation could be found.

The combined GF multiplier presented in [8],[9] is similar to the designs presented in this paper. That design is based on a Wallace tree multiplier, which has been modified to perform either conventional binary or GF multiplication. Its polynomial reduction introduces a linear delay. A similar design is used in this paper for comparison purposes.

2. COMBINED BINARY AND DUAL GF MULTIPLIER

The multiplier presented in this paper receives two 16-bit inputs, $X$ and $Y$, and produces a 32-bit output $Z$. $X$ and $Y$ represent 16-bit integers, or the concatenation of two independent 8-bit GF($2^8$) elements:

$$X_{high} = X[15 \ldots 8], \quad X_{low} = X[7 \ldots 0]$$
$$Y_{high} = Y[15 \ldots 8], \quad Y_{low} = Y[7 \ldots 0]$$

The multiplier also receives two control signals, $f$ and $t$. $f$ is one for fixed point multiplication and zero for GF multiplication. $t$ is one for two's complement multiplication and zero for unsigned multiplication.
When $f$ is zero, $t$ does not affect the GF multiplication. For two's complement and unsigned multiplication, $Z$ is the corresponding 32-bit (signed or unsigned) product. For GF multiplication, $Z$ consists of sixteen leading zeros, followed by $Z^{GF}_{high} = X_{high} \cdot Y_{high}$ and $Z^{GF}_{low} = X_{low} \cdot Y_{low}$, with the products calculated in GF($2^8$). An additional input to the unit is the vector necessary for the polynomial reduction of GF products. Depending on the polynomial reduction method, this is either an 8-bit or a 56-bit word. The polynomial reduction unit is explained in detail in Section 2.3. Figure 1 shows a block diagram of the combined multiplier.

Fig. 1. Block diagram of the combined multiplier.

2.1. Partial Product Matrix

For both fixed point and GF multiplications, the first step is generating a matrix of partial products. These are calculated by ANDing the corresponding bits of $X$ and $Y$: $pp_{i,j} = x_i \cdot y_j$. Partial products are arranged in rows, with each row shifted $i$ positions to the left as in Figure 2. Each dot represents the output of an AND gate. The fixed point product $Z = X \cdot Y$ is obtained by adding the resulting partial products.

The partial product matrix is composed of four submatrices, $X_{low} Y_{low}$, $X_{low} Y_{high}$, $X_{high} Y_{low}$ and $X_{high} Y_{high}$, as shown in Figure 2. The upper-right and lower-left submatrices correspond to the partial products to be added for GF multiplications. These partial products are indicated by hollow dots in Figure 2. The partial products in the other two submatrices, indicated by black dots, are set to zero when calculating a GF product, by ANDing the $x_j$ inputs to these submatrices with the control signal $f$. The new inputs are called $X^m_{high}$ and $X^m_{low}$. This extra hardware amounts to only 16 AND gates.
It adds only one AND gate delay to the critical path.

Fig. 2. Partial product matrix.

2.2. Partial Product Reduction Tree

A tree multiplier was selected to implement the partial product reduction, due to its speed advantage over array multipliers. The Reduced Area multiplier was chosen because it requires less area than other techniques, places full adders as early as possible in the reduction tree, and minimizes the size of the carry-propagate adder [10]. The layout of the Reduced Area multiplier also offers advantages when performing GF multiplication. The 16-bit Reduced Area multiplier that serves as the base for the combined multiplier uses six reduction stages that have a worst case delay equivalent to six full adders.

GF and fixed point multiplication are similar in concept, but there is an important difference in the partial product reduction. As explained in Section 1, the partial products are added modulo 2 for GF multiplication, which implies that no carries are included in the summation. Two ways to avoid adding carries in the GF multiplication are considered:

1. Modify the full adders and half adders to set the carry output to zero when a GF multiplication is performed. This is the approach taken in [8],[9].

2. Do not modify the adder cells. Instead, modify the wiring of the partial product reduction tree and add a small number of XOR gates, to ensure that no carries are added with partial product bits before the GF product bit for each column is calculated.

With the first alternative, an extra control input to each adder cell prevents it from generating a carry when performing GF multiplication. The extra logic (an AND gate at the carry output of each adder cell) increases the area and delay.

The new wiring scheme presented here as the second alternative allows important savings in terms of area and delay. Each adder cell computes a sum bit that is either relevant or irrelevant to the GF multiplication. A sum bit is relevant if it depends on at least one partial product from either GF submatrix.
In this case, the sum bit cannot be an input to an adder that has inputs that depend on carry signals from other adders. Relevant sum bits have to be added together as early as possible. In practice, there are few places where this restriction modifies the wiring of the adder cells. There are also six columns where only two GF-relevant sum bits are available after the first reduction stage. In these cases, both bits are extracted and added with an XOR gate, to form the expected result without modifying the partial product reduction tree. After the third reduction stage, all the GF-relevant partial products in each column have been added together, and can be extracted to form the two extended results (15-bit vectors), which are sent to the polynomial reducers. The only overhead for this design is six XOR gates and some minor restrictions on the connection of the full adders. The theoretical critical delay path does not increase.

The output of the partial product reduction tree for fixed point multiplication is obtained at the bottom of the tree. It consists of the least significant 7 bits of the final result, and two 25-bit vectors that are added using a carry-propagate adder to yield the remaining bits of the final product [10]. The combined multiplier uses a carry-lookahead adder with a block size of 4 [11].

With the modified full adders, the GF extended result vectors are extracted at the bottom of the tree. With the new wiring technique, each bit of the two extended result vectors is extracted from the partial product reduction tree as soon as it is ready. For the 16-bit combined multiplier, the resulting bits for the GF multiplication are ready after the third reduction stage. In both cases, the extended results are reduced modulo the irreducible polynomial.
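The relationship between the shared partial product matrix and the two kinds of result can be illustrated with a small behavioral model. This is only a sketch: it reproduces the arithmetic (carry-propagating column sums for the unsigned fixed point product, carry-free column sums over the low-low and high-high submatrices for the GF extended results), not the adder wiring of the reduction tree, and it omits two's complement handling. The helper names `column_bits`, `fixed_point_product`, and `gf_extended_products` are hypothetical.

```python
def column_bits(x, y, n=16, keep=lambda i, j: True):
    """Group partial products pp[i][j] = x_i AND y_j by column weight i + j."""
    cols = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            if keep(i, j):
                cols[i + j].append((x >> i) & (y >> j) & 1)
    return cols


def fixed_point_product(x, y, n=16):
    """Unsigned product: every column is summed with carries (integer add)."""
    return sum(sum(bits) << k for k, bits in enumerate(column_bits(x, y, n)))


def gf_extended_products(x, y, m=8):
    """Two (2m-1)-bit extended results: columns summed modulo 2, no carries."""
    def xor_columns(keep, offset):
        out = 0
        for k, bits in enumerate(column_bits(x, y, 2 * m, keep)):
            if bits:
                out |= (sum(bits) & 1) << (k - offset)
        return out

    low = xor_columns(lambda i, j: i < m and j < m, 0)        # X_low * Y_low
    high = xor_columns(lambda i, j: i >= m and j >= m, 2 * m)  # X_high * Y_high
    return high, low


print(hex(fixed_point_product(0xBEEF, 0x1234)))  # same value as 0xBEEF * 0x1234
print(gf_extended_products(0x0302, 0x0503))      # (15, 6)
```

In the second call, the high bytes 0x03 and 0x05 give $(x+1)(x^2+1) = x^3+x^2+x+1$, and the low bytes 0x02 and 0x03 give $x(x+1) = x^2+x$; both are still in extended form and would go on to polynomial reduction.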
To reduce the two extended results, two polynomial reduction units operate in parallel.

2.3. Polynomial Reduction

When performing a GF multiplication, the two 15-bit results obtained from the reduction tree each represent a polynomial $P(\alpha)$ of order 14 (in general, order $2m-2$ for GF($2^m$)). GF multiplication is computed modulo the irreducible polynomial $f(x)$, such that $C(\alpha) = P(\alpha) \bmod f(\alpha)$. $C(\alpha)$ is computed as the remainder of the polynomial division. The example below shows how to compute $C(\alpha)$ using the Linear Polynomial Reduction (LPR) algorithm, with $m = 4$, $P(\alpha) = (p_6\, p_5\, p_4\, p_3\, p_2\, p_1\, p_0)$, and $f(x) = x^4 + x + 1 \rightarrow f(\alpha) = (1\,0\,0\,1\,1)$. Subtraction in GF($2^m$) is also equivalent to bitwise XORing.

$$
\begin{array}{rccccccc}
P: & p_6 & p_5 & p_4 & p_3 & p_2 & p_1 & p_0 \\
\oplus & p_6 & p_6 f_3 & p_6 f_2 & p_6 f_1 & p_6 f_0 & & \\ \hline
 & & p'_5 & p'_4 & p'_3 & p'_2 & p_1 & p_0 \\
\oplus & & p'_5 & p'_5 f_3 & p'_5 f_2 & p'_5 f_1 & p'_5 f_0 & \\ \hline
 & & & p''_4 & p''_3 & p''_2 & p''_1 & p_0 \\
\oplus & & & p''_4 & p''_4 f_3 & p''_4 f_2 & p''_4 f_1 & p''_4 f_0 \\ \hline
C: & & & & c_3 & c_2 & c_1 & c_0
\end{array}
$$

The hardware implementation of the LPR algorithm is straightforward [12], as shown in Figure 3 for GF($2^4$). It requires $m(m-1)$ AND gates and $m(m-1)$ XOR gates. The worst case delay is $(m-1)$ AND gates and $(m-1)$ XOR gates.

Fig. 3. Linear polynomial reduction for m = 4.

The inputs to each polynomial reduction unit are the extended GF result (15 bits) and the 8 least significant bits of the irreducible polynomial. The MSB of $f(\alpha)$ is not needed because it is always 1. It is important to notice that the LPR algorithm causes a linear delay, as opposed to the logarithmic delay introduced by the partial product reduction tree. However, the combined structure of the multiplier masks this delay. The 25-bit wide carry-lookahead adder (CLA), which operates in parallel with the polynomial reduction, is considerably slower. Therefore, the GF-specific part of the design is not on the critical path.

A different implementation, the Parallel Polynomial Reduction (PPR), has been considered to reduce the delay of this stage [6]. The extended result $P(\alpha)$ can be expressed as

$$P(\alpha) = \sum_{i=m}^{2m-2} p_i \alpha^i + \sum_{i=0}^{m-1} p_i \alpha^i, \tag{3}$$

where, using Equation (2), we can substitute

$$\alpha^i = A_i(\alpha) = \sum_{j=0}^{m-1} a_{i,j}\,\alpha^j, \quad m \le i \le 2m-2. \tag{4}$$

The $A_i(\alpha)$ are the canonical representations, in the field's basis, of the field elements $\alpha^i$ that appear in the first summation in Equation (3). The $A_i(\alpha)$ can be added to the second summation (which represents one element in canonical form) to produce the final result. The previous example for GF($2^4$) polynomial reduction is now computed as follows, with $f(x) = x^4 + x + 1 \rightarrow \alpha^4 = (0011),\ \alpha^5 = (0110),\ \alpha^6 = (1100)$ and $P(\alpha) = (p_6\, p_5\, p_4\, p_3\, p_2\, p_1\, p_0)$:

$$
\begin{array}{rcccc}
P[3 \ldots 0]: & p_3 & p_2 & p_1 & p_0 \\
\oplus\ p_6 \cdot \alpha^6 = p_6 \cdot (1100): & p_6 & p_6 & 0 & 0 \\
\oplus\ p_5 \cdot \alpha^5 = p_5 \cdot (0110): & 0 & p_5 & p_5 & 0 \\
\oplus\ p_4 \cdot \alpha^4 = p_4 \cdot (0011): & 0 & 0 & p_4 & p_4 \\ \hline
C: & c_3 & c_2 & c_1 & c_0
\end{array}
$$

Fig. 4. Parallel polynomial reduction for m = 4.

The design presented in this paper uses the pre-calculated canonical representation of the seven GF elements of the form $\alpha^i$, $i = 8, \ldots, 14$.
Each of these seven values is an 8-bit vector. The reduction is performed by adding the corresponding GF element $A_i(\alpha)$ to substitute for each bit above the 8th. The seven 8-bit values to be added are computed as soon as the extended result is ready. The modulo 2 addition of each $A_i(\alpha)$ to the 8 least significant bits of the extended result is done in parallel, in a binary tree configuration, which has logarithmic delay. An implementation is shown for $m = 4$ in Figure 4, where the AND blocks perform $p_i \cdot A_i(\alpha)$, and the XOR blocks perform GF addition. The gate count is $m(m-1)$ AND gates and $m(m-1)$ XOR gates, the same as for the LPR. The theoretical worst case delay of the PPR is only 1 AND gate and $\lceil \log_2 m \rceil$ XOR gates.
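The two reduction methods can be compared with a short software sketch. It assumes the extended result is held in an integer and that `f_low` carries the $m$ low-order coefficients of $f(x)$ (the leading 1 is implied, as in the text); the function names `lpr`, `precompute_a`, and `ppr` are hypothetical. The code models the arithmetic and the pairwise XOR tree, not the gate-level timing.

```python
def lpr(p, f_low, m):
    """Linear reduction: cancel bits 2m-2 down to m, one after another."""
    for i in range(2 * m - 2, m - 1, -1):
        if (p >> i) & 1:
            # XOR with f(x) * x^(i-m); the implied leading 1 clears bit i
            p ^= (1 << i) | (f_low << (i - m))
    return p & ((1 << m) - 1)


def precompute_a(f_low, m):
    """Canonical forms A_i of alpha^i for i = m .. 2m-2 (m-1 words of m bits)."""
    table = []
    a = f_low                      # alpha^m equals the low part of f(alpha)
    for _ in range(m - 1):
        table.append(a)
        a <<= 1                    # multiply by alpha
        if (a >> m) & 1:
            a = (a ^ (1 << m)) ^ f_low
    return table


def ppr(p, table, m):
    """Parallel reduction: one AND stage, then a binary tree of XORs."""
    terms = [p & ((1 << m) - 1)]
    for k, a_i in enumerate(table):
        terms.append(a_i if (p >> (m + k)) & 1 else 0)  # p_i AND A_i
    while len(terms) > 1:          # each pass models one XOR level of the tree
        paired = []
        for k in range(0, len(terms), 2):
            pair = terms[k:k + 2]
            paired.append(pair[0] ^ pair[1] if len(pair) == 2 else pair[0])
        terms = paired
    return terms[0]


table = precompute_a(0b0011, 4)            # f(x) = x^4 + x + 1
print([format(a, "04b") for a in table])   # ['0011', '0110', '1100']
print(lpr(0b111100, 0b0011, 4), ppr(0b111100, table, 4))  # 9 9
```

For $m = 8$ the list `terms` holds eight words, so the loop runs three times, matching the $\lceil \log_2 m \rceil$ XOR levels stated above.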

This polynomial reduction unit requires more registers (seven bytes instead of one for GF($2^8$)), but the decrease in delay is considerable. The delay is now comparable to that of the partial product reduction stage, which is important for pipelined designs.

The outputs of the polynomial reduction units are the two 8-bit GF products, $Z^{GF}_{high}$ and $Z^{GF}_{low}$. The only operation left is multiplexing the final output depending on the type of multiplication, GF or fixed point.

3. SYNTHESIS RESULTS

Variations of the combined 16-bit binary and dual GF multiplier were modeled in VHDL and then synthesized. The following units were chosen for modeling and synthesis:

Design A is a regular 16-bit fixed point multiplier [10], which serves as the reference design.

Design B uses modified adder cells in the partial product reduction tree to avoid adding carries. The polynomial reduction stage is performed with the LPR. This design is similar to the one presented in [9].

Design C rewires the reduction tree to avoid adding carry bits before GF results are ready. It uses the LPR.

Design D is the same as Design C, but uses the PPR.

Design E is the regular fixed point multiplier of Design A, plus two dedicated parallel GF multipliers [6].

These designs were synthesized for minimum area with Exemplar Logic's Leonardo toolset and the LCA300K 0.6 micron standard cell library. To minimize the effect of the particular technology, Design A is used as a reference, and all measures are normalized to its area and delay.

              Design A   Design B   Design C   Design D   Design E
Delay (ns)
Norm. Delay

Table 1. Total and normalized delays for each design.

              Design A   Design B   Design C   Design D   Design E
Equiv. Gates
Norm. Area

Table 2. Total and normalized areas for each design.

Delay figures are shown in Table 1. The worst case delay is lower for designs C and D than for design B, with designs C and D having roughly 7.5% less delay than B.
The overhead in designs C and D corresponds to the output multiplexer and the AND gates used to produce $X^m_{high}$ and $X^m_{low}$. In design B, the extra AND gates in the adders of the partial product reduction tree further increase the delay. The similarity of the results for C and D shows that the delay of the unit depends mostly on the partial product reduction and the carry-propagate adder (CPA). Changing the implementation of the polynomial reduction does not affect the total delay, because the CPA is slower than either polynomial reduction method. This effect is also reflected in design E. The dedicated parallel GF multiplier only requires 2.9 ns to complete the multiplication, but without pipelining the clock cycle has to be set to the larger fixed point multiplication time.

Table 2 gives area estimates for each design. As expected, for designs C and D these estimates are identical, since the gate counts for both polynomial reduction stages are the same. The normalized increase over the base multiplier is about 23%. In comparison, design B requires about 59% more area than the base multiplier. The extra buffers required for supplying the $f$ signal to every adder in the reduction tree increase the area. Although not every adder needs the conditional carry function, the critical path still has six extra AND gates and the area is still larger than for designs C and D.

The area for design E is clearly larger than for designs C and D. The estimate given only accounts for the area invested in gates, but in practice adding dedicated GF multipliers has a much larger cost, in terms of registers, buses, extra control logic, etc.

Regarding area measures, it should be noted that the registers required to store the field's representation (either $f(x)$ for the LPR, or the seven $A_i(\alpha)$ for the PPR) are not included in the gate count. These data can be kept in special-purpose registers in the register file of the DSP.

                   Parallel   Linear
Area (eq. gates)
Delay (ns)

Table 3. Area and delay of the polynomial reduction stages.

Table 3 shows the synthesis results for the linear and parallel polynomial reduction methods. The delay is about 4.67 times smaller for the PPR, for exactly the same gate count.

In pipelined designs, the delay for polynomial reduction can be on the critical delay path. For processors with high clock rates, it may be desirable to pipeline the combined multiplier. One approach is to use a two-stage pipeline. For fixed point multiplication, the first pipeline stage generates partial products and reduces them to sum and carry vectors. The second stage performs the carry-propagate addition and the final output selection. Since the extended result for GF multiplication is available after two AND gate delays plus three full adder delays, and the parallel polynomial reducer has one AND gate delay plus three XOR gate delays, the complete GF multiplication can be performed in a single pipeline stage. With this approach, fixed point multiplication has a two-cycle latency, but two parallel GF multiplications have just a one-cycle latency.

4. CONCLUSIONS

This paper has shown that a DSP's 16-bit fixed point tree multiplier can be easily modified to support two parallel GF($2^8$) multiplications. For GF multiplications, the addition of carries is avoided with the novel connection methodology presented in this paper. This approach requires significantly less area and delay than previous designs, which use additional gates to set the carries to zero. The GF multiply results are ready in extended form after only two AND gate delays plus three full adder delays. The subsequent polynomial reduction of the two extended GF products can be performed in one AND gate delay plus three XOR gate delays by two Parallel Polynomial Reduction units. Adding dual GF($2^8$) multiplication to a 16-bit multiplier increases the delay by about 6% and the gate count by about 23%. A combined multiplier has the advantage of reusing the data buses and control logic of the existing multiplier, which simplifies the implementation.
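The external behavior of the combined unit, as described in Section 2, can be summarized in a short functional model. This is a sketch of the input/output contract only (control signals $f$ and $t$, 16-bit operands, 32-bit result), not of the hardware. The names `combined_multiply` and `gf8_mul` are hypothetical, and the polynomial $x^8 + x^4 + x^3 + x^2 + 1$ used in the usage lines is an assumed example, since the paper leaves $f(x)$ programmable.

```python
def gf8_mul(a, b, f_low):
    """GF(2^8) product; f_low holds the 8 low coefficients of f(x)."""
    p = 0
    for i in range(8):                 # carry-free sum of partial products
        if (b >> i) & 1:
            p ^= a << i
    for i in range(14, 7, -1):         # reduce the 15-bit extended form
        if (p >> i) & 1:
            p ^= (1 << i) | (f_low << (i - 8))
    return p


def combined_multiply(x, y, f, t, f_low):
    """Model of Z for 16-bit inputs x, y and control signals f, t."""
    if f:                              # fixed point multiplication
        if t:                          # two's complement operands
            x = x - (1 << 16) if x & 0x8000 else x
            y = y - (1 << 16) if y & 0x8000 else y
        return (x * y) & 0xFFFFFFFF
    # GF mode: t is ignored; sixteen leading zeros, then Z_high and Z_low
    z_high = gf8_mul(x >> 8, y >> 8, f_low)
    z_low = gf8_mul(x & 0xFF, y & 0xFF, f_low)
    return (z_high << 8) | z_low


POLY_LOW = 0x1D  # assumed example: x^8 + x^4 + x^3 + x^2 + 1, leading 1 implied
print(hex(combined_multiply(0xFFFF, 0x0002, 1, 1, POLY_LOW)))  # 0xfffffffe
print(hex(combined_multiply(0xFFFF, 0x0002, 1, 0, POLY_LOW)))  # 0x1fffe
print(hex(combined_multiply(0x0202, 0x8003, 0, 0, POLY_LOW)))  # 0x1d06
```

In the last call the high bytes give $\alpha \cdot \alpha^7 = \alpha^8$, which reduces to 0x1D under the assumed polynomial, and the low bytes give $\alpha \cdot (\alpha + 1) = \alpha^2 + \alpha$, that is 0x06.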
Acknowledgment

This material is based upon work supported by the National Science Foundation under Grant No. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

REFERENCES

[1] S. B. Wicker and V. K. Bhargava, Reed-Solomon Codes and Their Applications, IEEE Press.
[2] R. Lidl and H. Niederreiter, Introduction to Finite Fields and Their Applications, Cambridge Univ. Press, 1994.
[3] E. D. Mastrovito, VLSI Designs for Multiplications over Finite Fields GF(2^m), in Proc. Sixth Int'l Conf. Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes (AAECC-6), 1988.
[4] T. Zhang and K. K. Parhi, Systematic Design of Original and Modified Mastrovito Multipliers for General Irreducible Polynomials, IEEE Transactions on Computers, vol. 50.
[5] C. Yeh, I. S. Reed, and T. K. Truong, Systolic Multipliers for Finite Fields GF(2^m), IEEE Transactions on Computers, vol. C-33, pp. 357, 1984.
[6] L. Gao and K. K. Parhi, Custom VLSI Design of Efficient Low Latency and Low Power Finite Field Multiplier for Reed-Solomon Codec, in Proc. IEEE International Symposium on Circuits and Systems (ISCAS), vol. IV.
[7] Texas Instruments, TMS320C64x Technical Overview.
[8] W. Drescher, G. Fettweis, and K. Bachmann, VLSI Architecture For Non-Sequential Inversion Over GF(2^m) Using The Euclidean Algorithm, in Int. Conf. on Signal Processing Applications & Technology.
[9] W. Drescher, K. Bachmann, and G. Fettweis, VLSI Architecture for Datapath Integration of Arithmetic over GF(2^m) on DSPs, in Proc. IEEE ICASSP 97.
[10] K. C. Bickerstaff, M. J. Schulte, and E. Swartzlander, Parallel Reduced Area Multipliers, Journal of VLSI Signal Processing, vol. 9.
[11] P. Pirsch, Architectures for Digital Signal Processing, Wiley, 1998.
[12] M. Matsumoto and K. Murase, Multiplier in a Galois Field, U.S. Patent 4,918,638.


ARITHMETIC COMBINATIONAL MODULES AND NETWORKS ARITHMETIC COMBINATIONAL MODULES AND NETWORKS 1 SPECIFICATION OF ADDER MODULES FOR POSITIVE INTEGERS HALF-ADDER AND FULL-ADDER MODULES CARRY-RIPPLE AND CARRY-LOOKAHEAD ADDER MODULES NETWORKS OF ADDER MODULES

More information

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor Proceedings of the 11th WSEAS International Conference on COMPUTERS, Agios Nikolaos, Crete Island, Greece, July 6-8, 007 653 Proposal to Improve Data Format Conversions for a Hybrid Number System Processor

More information

Arithmetic Circuits-2

Arithmetic Circuits-2 Arithmetic Circuits-2 Multipliers Array multipliers Shifters Barrel shifter Logarithmic shifter ECE 261 Krish Chakrabarty 1 Binary Multiplication M-1 X = X i 2 i i=0 Multiplicand N-1 Y = Y i 2 i i=0 Multiplier

More information

Calculating Algebraic Signatures Thomas Schwarz, S.J.

Calculating Algebraic Signatures Thomas Schwarz, S.J. Calculating Algebraic Signatures Thomas Schwarz, S.J. 1 Introduction A signature is a small string calculated from a large object. The primary use of signatures is the identification of objects: equal

More information

Reducing the Complexity of Normal Basis Multiplication

Reducing the Complexity of Normal Basis Multiplication Reducing the Complexity of Normal Basis Multiplication Ömer Eǧecioǧlu and Çetin Kaya Koç Department of Computer Science University of California Santa Barbara {omer,koc}@cs.ucsb.edu Abstract In this paper

More information

Logic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits

Logic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits Logic and Computer Design Fundamentals Chapter 5 Arithmetic Functions and Circuits Arithmetic functions Operate on binary vectors Use the same subfunction in each bit position Can design functional block

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 21: Shifters, Decoders, Muxes

CMPEN 411 VLSI Digital Circuits Spring Lecture 21: Shifters, Decoders, Muxes CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 21: Shifters, Decoders, Muxes [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp11 CMPEN

More information

Area-Time Optimal Adder with Relative Placement Generator

Area-Time Optimal Adder with Relative Placement Generator Area-Time Optimal Adder with Relative Placement Generator Abstract: This paper presents the design of a generator, for the production of area-time-optimal adders. A unique feature of this generator is

More information

EECS Components and Design Techniques for Digital Systems. Lec 26 CRCs, LFSRs (and a little power)

EECS Components and Design Techniques for Digital Systems. Lec 26 CRCs, LFSRs (and a little power) EECS 150 - Components and esign Techniques for igital Systems Lec 26 CRCs, LFSRs (and a little power) avid Culler Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~culler

More information

Cost/Performance Tradeoffs:

Cost/Performance Tradeoffs: Cost/Performance Tradeoffs: a case study Digital Systems Architecture I. L10 - Multipliers 1 Binary Multiplication x a b n bits n bits EASY PROBLEM: design combinational circuit to multiply tiny (1-, 2-,

More information

GALOP : A Generalized VLSI Architecture for Ultrafast Carry Originate-Propagate adders

GALOP : A Generalized VLSI Architecture for Ultrafast Carry Originate-Propagate adders GALOP : A Generalized VLSI Architecture for Ultrafast Carry Originate-Propagate adders Dhananjay S. Phatak Electrical Engineering Department State University of New York, Binghamton, NY 13902-6000 Israel

More information

ISSN (PRINT): , (ONLINE): , VOLUME-4, ISSUE-10,

ISSN (PRINT): , (ONLINE): , VOLUME-4, ISSUE-10, A NOVEL DOMINO LOGIC DESIGN FOR EMBEDDED APPLICATION Dr.K.Sujatha Associate Professor, Department of Computer science and Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, Tamilnadu,

More information

Binary Multipliers. Reading: Study Chapter 3. The key trick of multiplication is memorizing a digit-to-digit table Everything else was just adding

Binary Multipliers. Reading: Study Chapter 3. The key trick of multiplication is memorizing a digit-to-digit table Everything else was just adding Binary Multipliers The key trick of multiplication is memorizing a digit-to-digit table Everything else was just adding 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 2 2 4 6 8 2 4 6 8 3 3 6 9 2 5 8 2 24 27 4 4 8 2 6

More information

Combinational Logic Design Combinational Functions and Circuits

Combinational Logic Design Combinational Functions and Circuits Combinational Logic Design Combinational Functions and Circuits Overview Combinational Circuits Design Procedure Generic Example Example with don t cares: BCD-to-SevenSegment converter Binary Decoders

More information

Galois Field Algebra and RAID6. By David Jacob

Galois Field Algebra and RAID6. By David Jacob Galois Field Algebra and RAID6 By David Jacob 1 Overview Galois Field Definitions Addition/Subtraction Multiplication Division Hardware Implementation RAID6 Definitions Encoding Error Detection Error Correction

More information

CMP 334: Seventh Class

CMP 334: Seventh Class CMP 334: Seventh Class Performance HW 5 solution Averages and weighted averages (review) Amdahl's law Ripple-carry adder circuits Binary addition Half-adder circuits Full-adder circuits Subtraction, negative

More information

Design and Implementation of High Speed CRC Generators

Design and Implementation of High Speed CRC Generators Department of ECE, Adhiyamaan College of Engineering, Hosur, Tamilnadu, India Design and Implementation of High Speed CRC Generators ChidambarakumarS 1, Thaky Ahmed 2, UbaidullahMM 3, VenketeshK 4, JSubhash

More information

High Performance GHASH Function for Long Messages

High Performance GHASH Function for Long Messages High Performance GHASH Function for Long Messages Nicolas Méloni 1, Christophe Négre 2 and M. Anwar Hasan 1 1 Department of Electrical and Computer Engineering University of Waterloo, Canada 2 Team DALI/ELIAUS

More information

Digital Logic: Boolean Algebra and Gates. Textbook Chapter 3

Digital Logic: Boolean Algebra and Gates. Textbook Chapter 3 Digital Logic: Boolean Algebra and Gates Textbook Chapter 3 Basic Logic Gates XOR CMPE12 Summer 2009 02-2 Truth Table The most basic representation of a logic function Lists the output for all possible

More information

ECE 545 Digital System Design with VHDL Lecture 1. Digital Logic Refresher Part A Combinational Logic Building Blocks

ECE 545 Digital System Design with VHDL Lecture 1. Digital Logic Refresher Part A Combinational Logic Building Blocks ECE 545 Digital System Design with VHDL Lecture Digital Logic Refresher Part A Combinational Logic Building Blocks Lecture Roadmap Combinational Logic Basic Logic Review Basic Gates De Morgan s Law Combinational

More information

1 Reed Solomon Decoder Final Project. Group 3 Abhinav Agarwal S Branavan Grant Elliott. 14 th May 2007

1 Reed Solomon Decoder Final Project. Group 3 Abhinav Agarwal S Branavan Grant Elliott. 14 th May 2007 1 Reed Solomon Decoder 6.375 Final Project Group 3 Abhinav Agarwal S Branavan Grant Elliott 14 th May 2007 2 Outline Error Correcting Codes Mathematical Foundation of Reed Solomon Codes Decoder Architecture

More information

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEM ORY INPUT-OUTPUT CONTROL DATAPATH

More information

VHDL Implementation of Reed Solomon Improved Encoding Algorithm

VHDL Implementation of Reed Solomon Improved Encoding Algorithm VHDL Implementation of Reed Solomon Improved Encoding Algorithm P.Ravi Tej 1, Smt.K.Jhansi Rani 2 1 Project Associate, Department of ECE, UCEK, JNTUK, Kakinada A.P. 2 Assistant Professor, Department of

More information

7 Multipliers and their VHDL representation

7 Multipliers and their VHDL representation 7 Multipliers and their VHDL representation 7.1 Introduction to arithmetic algorithms If a is a number, then a vector of digits A n 1:0 = [a n 1... a 1 a 0 ] is a numeral representing the number in the

More information

Subquadratic space complexity multiplier for a class of binary fields using Toeplitz matrix approach

Subquadratic space complexity multiplier for a class of binary fields using Toeplitz matrix approach Subquadratic space complexity multiplier for a class of binary fields using Toeplitz matrix approach M A Hasan 1 and C Negre 2 1 ECE Department and CACR, University of Waterloo, Ontario, Canada 2 Team

More information

Chapter 2 Basic Arithmetic Circuits

Chapter 2 Basic Arithmetic Circuits Chapter 2 Basic Arithmetic Circuits This chapter is devoted to the description of simple circuits for the implementation of some of the arithmetic operations presented in Chap. 1. Specifically, the design

More information

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 Multi-processor vs. Multi-computer architecture µp vs. DSP RISC vs. DSP RISC Reduced-instruction-set Register-to-register operation Higher throughput by using

More information

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEMORY INPUT-OUTPUT CONTROL DATAPATH

More information

CprE 281: Digital Logic

CprE 281: Digital Logic CprE 28: Digital Logic Instructor: Alexander Stoytchev http://www.ece.iastate.edu/~alexs/classes/ Simple Processor CprE 28: Digital Logic Iowa State University, Ames, IA Copyright Alexander Stoytchev Digital

More information

Overview. Arithmetic circuits. Binary half adder. Binary full adder. Last lecture PLDs ROMs Tristates Design examples

Overview. Arithmetic circuits. Binary half adder. Binary full adder. Last lecture PLDs ROMs Tristates Design examples Overview rithmetic circuits Last lecture PLDs ROMs Tristates Design examples Today dders Ripple-carry Carry-lookahead Carry-select The conclusion of combinational logic!!! General-purpose building blocks

More information

A High-Speed Realization of Chinese Remainder Theorem

A High-Speed Realization of Chinese Remainder Theorem Proceedings of the 2007 WSEAS Int. Conference on Circuits, Systems, Signal and Telecommunications, Gold Coast, Australia, January 17-19, 2007 97 A High-Speed Realization of Chinese Remainder Theorem Shuangching

More information

1 Short adders. t total_ripple8 = t first + 6*t middle + t last = 4t p + 6*2t p + 2t p = 18t p

1 Short adders. t total_ripple8 = t first + 6*t middle + t last = 4t p + 6*2t p + 2t p = 18t p UNIVERSITY OF CALIFORNIA College of Engineering Department of Electrical Engineering and Computer Sciences Study Homework: Arithmetic NTU IC54CA (Fall 2004) SOLUTIONS Short adders A The delay of the ripple

More information

The equivalence of twos-complement addition and the conversion of redundant-binary to twos-complement numbers

The equivalence of twos-complement addition and the conversion of redundant-binary to twos-complement numbers The equivalence of twos-complement addition and the conversion of redundant-binary to twos-complement numbers Gerard MBlair The Department of Electrical Engineering The University of Edinburgh The King

More information

VHDL DESIGN AND IMPLEMENTATION OF C.P.U BY REVERSIBLE LOGIC GATES

VHDL DESIGN AND IMPLEMENTATION OF C.P.U BY REVERSIBLE LOGIC GATES VHDL DESIGN AND IMPLEMENTATION OF C.P.U BY REVERSIBLE LOGIC GATES 1.Devarasetty Vinod Kumar/ M.tech,2. Dr. Tata Jagannadha Swamy/Professor, Dept of Electronics and Commn. Engineering, Gokaraju Rangaraju

More information

Hakim Weatherspoon CS 3410 Computer Science Cornell University

Hakim Weatherspoon CS 3410 Computer Science Cornell University Hakim Weatherspoon CS 3410 Computer Science Cornell University The slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, and Sirer. memory inst 32 register

More information

EECS150. Arithmetic Circuits

EECS150. Arithmetic Circuits EE5 ection 8 Arithmetic ircuits Fall 2 Arithmetic ircuits Excellent Examples of ombinational Logic Design Time vs. pace Trade-offs Doing things fast may require more logic and thus more space Example:

More information

Problem Set 6 Solutions

Problem Set 6 Solutions CS/EE 260 Digital Computers: Organization and Logical Design Problem Set 6 Solutions Jon Turner Quiz on 2/21/02 1. The logic diagram at left below shows a 5 bit ripple-carry decrement circuit. Draw a logic

More information

Fundamentals of Digital Design

Fundamentals of Digital Design Fundamentals of Digital Design Digital Radiation Measurement and Spectroscopy NE/RHP 537 1 Binary Number System The binary numeral system, or base-2 number system, is a numeral system that represents numeric

More information

Carry Look Ahead Adders

Carry Look Ahead Adders Carry Look Ahead Adders Lesson Objectives: The objectives of this lesson are to learn about: 1. Carry Look Ahead Adder circuit. 2. Binary Parallel Adder/Subtractor circuit. 3. BCD adder circuit. 4. Binary

More information

I. INTRODUCTION. CMOS Technology: An Introduction to QCA Technology As an. T. Srinivasa Padmaja, C. M. Sri Priya

I. INTRODUCTION. CMOS Technology: An Introduction to QCA Technology As an. T. Srinivasa Padmaja, C. M. Sri Priya International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 5 ISSN : 2456-3307 Design and Implementation of Carry Look Ahead Adder

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

NEW SELF-CHECKING BOOTH MULTIPLIERS

NEW SELF-CHECKING BOOTH MULTIPLIERS Int. J. Appl. Math. Comput. Sci., 2008, Vol. 18, No. 3, 319 328 DOI: 10.2478/v10006-008-0029-4 NEW SELF-CHECKING BOOTH MULTIPLIERS MARC HUNGER, DANIEL MARIENFELD Department of Electrical Engineering and

More information

Combinational Logic Design Arithmetic Functions and Circuits

Combinational Logic Design Arithmetic Functions and Circuits Combinational Logic Design Arithmetic Functions and Circuits Overview Binary Addition Half Adder Full Adder Ripple Carry Adder Carry Look-ahead Adder Binary Subtraction Binary Subtractor Binary Adder-Subtractor

More information

Binary addition by hand. Adding two bits

Binary addition by hand. Adding two bits Chapter 3 Arithmetic is the most basic thing you can do with a computer We focus on addition, subtraction, multiplication and arithmetic-logic units, or ALUs, which are the heart of CPUs. ALU design Bit

More information

Boolean Algebra and Digital Logic 2009, University of Colombo School of Computing

Boolean Algebra and Digital Logic 2009, University of Colombo School of Computing IT 204 Section 3.0 Boolean Algebra and Digital Logic Boolean Algebra 2 Logic Equations to Truth Tables X = A. B + A. B + AB A B X 0 0 0 0 3 Sum of Products The OR operation performed on the products of

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 19: Adder Design [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp11 CMPEN 411 L19

More information

Revisiting Finite Field Multiplication Using Dickson Bases

Revisiting Finite Field Multiplication Using Dickson Bases Revisiting Finite Field Multiplication Using Dickson Bases Bijan Ansari and M. Anwar Hasan Department of Electrical and Computer Engineering University of Waterloo, Waterloo, Ontario, Canada {bansari,

More information

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

More information

KEYWORDS: Multiple Valued Logic (MVL), Residue Number System (RNS), Quinary Logic (Q uin), Quinary Full Adder, QFA, Quinary Half Adder, QHA.

KEYWORDS: Multiple Valued Logic (MVL), Residue Number System (RNS), Quinary Logic (Q uin), Quinary Full Adder, QFA, Quinary Half Adder, QHA. GLOBAL JOURNAL OF ADVANCED ENGINEERING TECHNOLOGIES AND SCIENCES DESIGN OF A QUINARY TO RESIDUE NUMBER SYSTEM CONVERTER USING MULTI-LEVELS OF CONVERSION Hassan Amin Osseily Electrical and Electronics Department,

More information

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier Espen Stenersen Master of Science in Electronics Submission date: June 2008 Supervisor: Per Gunnar Kjeldsberg, IET Co-supervisor: Torstein

More information

Arithmetic in Integer Rings and Prime Fields

Arithmetic in Integer Rings and Prime Fields Arithmetic in Integer Rings and Prime Fields A 3 B 3 A 2 B 2 A 1 B 1 A 0 B 0 FA C 3 FA C 2 FA C 1 FA C 0 C 4 S 3 S 2 S 1 S 0 http://koclab.org Çetin Kaya Koç Spring 2018 1 / 71 Contents Arithmetic in Integer

More information

Low complexity bit-parallel GF (2 m ) multiplier for all-one polynomials

Low complexity bit-parallel GF (2 m ) multiplier for all-one polynomials Low complexity bit-parallel GF (2 m ) multiplier for all-one polynomials Yin Li 1, Gong-liang Chen 2, and Xiao-ning Xie 1 Xinyang local taxation bureau, Henan, China. Email:yunfeiyangli@gmail.com, 2 School

More information

Part II Addition / Subtraction

Part II Addition / Subtraction Part II Addition / Subtraction Parts Chapters I. Number Representation 1. 2. 3. 4. Numbers and Arithmetic Representing Signed Numbers Redundant Number Systems Residue Number Systems Elementary Operations

More information

Simplification of Procedure for Decoding Reed- Solomon Codes Using Various Algorithms: An Introductory Survey

Simplification of Procedure for Decoding Reed- Solomon Codes Using Various Algorithms: An Introductory Survey 2014 IJEDR Volume 2, Issue 1 ISSN: 2321-9939 Simplification of Procedure for Decoding Reed- Solomon Codes Using Various Algorithms: An Introductory Survey 1 Vivek Tilavat, 2 Dr.Yagnesh Shukla 1 PG Student,

More information

Forward and Reverse Converters and Moduli Set Selection in Signed-Digit Residue Number Systems

Forward and Reverse Converters and Moduli Set Selection in Signed-Digit Residue Number Systems J Sign Process Syst DOI 10.1007/s11265-008-0249-8 Forward and Reverse Converters and Moduli Set Selection in Signed-Digit Residue Number Systems Andreas Persson Lars Bengtsson Received: 8 March 2007 /

More information

Finite Fields. SOLUTIONS Network Coding - Prof. Frank H.P. Fitzek

Finite Fields. SOLUTIONS Network Coding - Prof. Frank H.P. Fitzek Finite Fields In practice most finite field applications e.g. cryptography and error correcting codes utilizes a specific type of finite fields, namely the binary extension fields. The following exercises

More information

Design and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives

Design and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives Design and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives Miloš D. Ercegovac Computer Science Department Univ. of California at Los Angeles California Robert McIlhenny

More information

FPGA Realization of Low Register Systolic All One-Polynomial Multipliers Over GF (2 m ) and their Applications in Trinomial Multipliers

FPGA Realization of Low Register Systolic All One-Polynomial Multipliers Over GF (2 m ) and their Applications in Trinomial Multipliers Wright State University CORE Scholar Browse all Theses and Dissertations Theses and Dissertations 2016 FPGA Realization of Low Register Systolic All One-Polynomial Multipliers Over GF (2 m ) and their

More information

Novel Bit Adder Using Arithmetic Logic Unit of QCA Technology

Novel Bit Adder Using Arithmetic Logic Unit of QCA Technology Novel Bit Adder Using Arithmetic Logic Unit of QCA Technology Uppoju Shiva Jyothi M.Tech (ES & VLSI Design), Malla Reddy Engineering College For Women, Secunderabad. Abstract: Quantum cellular automata

More information

VLSI Arithmetic. Lecture 9: Carry-Save and Multi-Operand Addition. Prof. Vojin G. Oklobdzija University of California

VLSI Arithmetic. Lecture 9: Carry-Save and Multi-Operand Addition. Prof. Vojin G. Oklobdzija University of California VLSI Arithmetic Lecture 9: Carry-Save and Multi-Operand Addition Prof. Vojin G. Oklobdzija University of California http://www.ece.ucdavis.edu/acsel Carry-Save Addition* *from Parhami 2 June 18, 2003 Carry-Save

More information

B. Cyclic Codes. Primitive polynomials are the generator polynomials of cyclic codes.

B. Cyclic Codes. Primitive polynomials are the generator polynomials of cyclic codes. B. Cyclic Codes A cyclic code is a linear block code with the further property that a shift of a codeword results in another codeword. These are based on polynomials whose elements are coefficients from

More information

Adders, subtractors comparators, multipliers and other ALU elements

Adders, subtractors comparators, multipliers and other ALU elements CSE4: Components and Design Techniques for Digital Systems Adders, subtractors comparators, multipliers and other ALU elements Instructor: Mohsen Imani UC San Diego Slides from: Prof.Tajana Simunic Rosing

More information

A Suggestion for a Fast Residue Multiplier for a Family of Moduli of the Form (2 n (2 p ± 1))

A Suggestion for a Fast Residue Multiplier for a Family of Moduli of the Form (2 n (2 p ± 1)) The Computer Journal, 47(1), The British Computer Society; all rights reserved A Suggestion for a Fast Residue Multiplier for a Family of Moduli of the Form ( n ( p ± 1)) Ahmad A. Hiasat Electronics Engineering

More information

A Low-Error Statistical Fixed-Width Multiplier and Its Applications

A Low-Error Statistical Fixed-Width Multiplier and Its Applications A Low-Error Statistical Fixed-Width Multiplier and Its Applications Yuan-Ho Chen 1, Chih-Wen Lu 1, Hsin-Chen Chiang, Tsin-Yuan Chang, and Chin Hsia 3 1 Department of Engineering and System Science, National

More information

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1 VLSI Design Adder Design [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1 Major Components of a Computer Processor Devices Control Memory Input Datapath

More information

We are here. Assembly Language. Processors Arithmetic Logic Units. Finite State Machines. Circuits Gates. Transistors

We are here. Assembly Language. Processors Arithmetic Logic Units. Finite State Machines. Circuits Gates. Transistors CSC258 Week 3 1 Logistics If you cannot login to MarkUs, email me your UTORID and name. Check lab marks on MarkUs, if it s recorded wrong, contact Larry within a week after the lab. Quiz 1 average: 86%

More information

DESİGN AND ANALYSİS OF FULL ADDER CİRCUİT USİNG NANOTECHNOLOGY BASED QUANTUM DOT CELLULAR AUTOMATA (QCA)

DESİGN AND ANALYSİS OF FULL ADDER CİRCUİT USİNG NANOTECHNOLOGY BASED QUANTUM DOT CELLULAR AUTOMATA (QCA) DESİGN AND ANALYSİS OF FULL ADDER CİRCUİT USİNG NANOTECHNOLOGY BASED QUANTUM DOT CELLULAR AUTOMATA (QCA) Rashmi Chawla 1, Priya Yadav 2 1 Assistant Professor, 2 PG Scholar, Dept of ECE, YMCA University

More information

Highly Efficient GF(2 8 ) Inversion Circuit Based on Redundant GF Arithmetic and Its Application to AES Design

Highly Efficient GF(2 8 ) Inversion Circuit Based on Redundant GF Arithmetic and Its Application to AES Design Saint-Malo, September 13th, 2015 Cryptographic Hardware and Embedded Systems Highly Efficient GF(2 8 ) Inversion Circuit Based on Redundant GF Arithmetic and Its Application to AES Design Rei Ueno 1, Naofumi

More information

FPGA BASED DESIGN OF PARALLEL CRC GENERATION FOR HIGH SPEED APPLICATION

FPGA BASED DESIGN OF PARALLEL CRC GENERATION FOR HIGH SPEED APPLICATION 258 FPGA BASED DESIGN OF PARALLEL CRC GENERATION FOR HIGH SPEED APPLICATION Sri N.V.N.Prasanna Kumar 1, S.Bhagya Jyothi 2,G.K.S.Tejaswi 3 1 prasannakumar429@gmail.com, 2 sjyothi567@gmail.com, 3 tejaswikakatiya@gmail.com

More information

Design at the Register Transfer Level

Design at the Register Transfer Level Week-7 Design at the Register Transfer Level Algorithmic State Machines Algorithmic State Machine (ASM) q Our design methodologies do not scale well to real-world problems. q 232 - Logic Design / Algorithmic

More information

Chapter 6. BCH Codes

Chapter 6. BCH Codes Chapter 6 BCH Codes Description of the Codes Decoding of the BCH Codes Outline Implementation of Galois Field Arithmetic Implementation of Error Correction Nonbinary BCH Codes and Reed-Solomon Codes Weight

More information