A 32-bit Decimal Floating-Point Logarithmic Converter

Dongdong Chen¹, Yu Zhang¹, Younhee Choi¹, Moon Ho Lee², Seok-Bum Ko¹
¹ Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK, Canada, seokbum.ko@usask.ca
² Institute of Information and Communication, Chonbuk National University, Jeonju 1-7, Korea, moonho@chonbuk.ac.kr

Abstract

This paper presents a new design and implementation of a 32-bit decimal floating-point (DFP) logarithmic converter based on the digit-recurrence algorithm. The converter calculates accurate logarithms of 32-bit DFP numbers as defined in the IEEE 754-2008 standard. The redundant digit $e_1$ is obtained from a look-up table in the first iteration, and the remaining redundant digits $e_j$ are selected by rounding the scaled remainder during the succeeding iterations. The sequential architecture of the proposed 32-bit DFP logarithmic converter is implemented on a Xilinx Virtex-II Pro P30 FPGA device and then synthesized with the TSMC 0.18-um standard cell library. The implementation results indicate that the maximum frequency of the proposed architecture is 7.7 MHz in FPGA and 107.9 MHz in TSMC 0.18-um technology. The faithful 32-bit DFP logarithm results can be obtained in 18 cycles.

Keywords: Decimal Logarithmic Converter, Decimal Floating-Point, Digit-Recurrence Algorithm, Selection by Rounding.

1. Introduction

There is considerable commercial demand for DFP arithmetic in applications such as financial analysis, tax calculation, phone billing, currency conversion, Internet-based applications, and e-commerce [3]. This trend drives further development of DFP arithmetic units, which can perform more accurate decimal calculations than a binary floating-point (BFP) arithmetic unit. Because of the significance of DFP arithmetic, the IEEE 754-2008 standard for floating-point arithmetic [1] includes it in its specifications. The decimal arithmetic unit, as a main part of a decimal processor, is attracting the attention of more and more researchers.
The decimal-encoded formats and arithmetic have been implemented in IBM's POWER6 [5], System z9 [] and z10 processors [1]. The logarithm, as one of the elementary functions, is a useful arithmetic operation in many areas of science and engineering. Some applications, such as the logarithmic number system (LNS) and digital signal processing, use a logarithmic unit to replace the normal computer arithmetic. Moreover, the decimal logarithm is defined as a decimal arithmetic operation in the new IEEE 754-2008 standard [1]. With the improvement of basic decimal arithmetic units, more complex DFP elementary operations such as logarithm, exponential, and trigonometric functions would be the next useful building blocks.

Muller [10] presents both software-oriented and hardware-oriented algorithms to compute elementary functions. Most elementary functions are implemented by software-oriented methods, which can use large look-up tables and provide more accurate results, but these methods are usually too slow for numerically intensive and real-time applications. Hardware-oriented methods have been developed as a high-speed alternative. A CORDIC-like BKM algorithm is presented in [7] for fast computation of complex exponentials and logarithms. The digit-recurrence algorithm is another interesting hardware-oriented alternative because of its low area requirements, especially for high-precision computations. Selection by rounding was introduced for high-radix binary division and square root in [6], [8] and for logarithm in [11]. This method efficiently decreases the cost of implementation and, in particular, the complexity of the digit selection function. In this paper, a radix-10 digit-recurrence algorithm with selection by rounding is proposed to implement a 32-bit DFP logarithmic converter.
This paper is organized as follows. Section 2 describes the basic DFP standard and the 32-bit DFP logarithmic calculation. Section 3 presents a radix-10 fixed-point (FXP) logarithm operation by a digit-recurrence algorithm with selection by rounding, and constructs the related architecture. In Section 4, the architecture of the proposed 32-bit DFP logarithmic converter is presented with an example and then verified on a function verification platform. In Section 5, the implementation results of the proposed converter are analyzed first; then the hardware performance of the decimal logarithmic converter is compared with a binary logarithmic converter [11]; finally, we analyze how the proposed 32-bit DFP logarithmic converter can be scaled to a 64-bit or 128-bit converter. Section 6 gives conclusions.

2. A 32-bit DFP Logarithm

2.1. DFP Standard

The IEEE 754-2008 standard specifies three interchange DFP formats: decimal32, decimal64 and decimal128, encoded in 32, 64 and 128 bits respectively. Figure 1 shows the basic DFP format specified in IEEE 754-2008.

Figure 1. DFP Number Format. Fields from left to right: Sign S (1 bit), Combination Field G (5 bits), Exponent Continuation E (w bits), Coefficient Continuation C (j bits, encoding 3j/10 digits).

The sign is a 1-bit field and indicates the sign of the number in the same way as for BFP numbers. The combination field is a 5-bit field that encodes the two most significant bits (MSBs) of the exponent and the most significant digit (MSD) of the coefficient. Not-a-Number (NaN) and infinity (Inf) are also indicated in the combination field. The exponent field (w+2 bits) is formed by appending the w-bit exponent continuation as a suffix to the 2-bit MSBs derived from the combination field. The whole encoded exponent is an unsigned binary integer with the largest unsigned value. The value of the exponent is calculated by subtracting an exponent bias from the value of the encoded exponent, so that both negative and positive exponents can be represented. The coefficient field (j+4 bits) is formed by appending the decoded continuation digits (j-bit) as a suffix to the MSD derived from the combination field. The length j of the coefficient continuation is a multiple of 10 bits, and the most significant group is on the left.
Each 10-bit group represents three decimal digits using Densely Packed Decimal (DPD) encoding [] and can be decoded to a 12-bit binary-coded decimal (BCD) representation. The total coefficient length is $q = 3j/10 + 1$ digits¹. In the IEEE 754-2008 DFP standard, the value of the coefficient is a non-normalized unsigned decimal fraction of the form $d_0.d_1 d_2 \ldots d_6$, $0 \le d_i < 10$. In decimal computer arithmetic, the coefficient is usually represented as an integer. The value of a 32-bit DFP number, compliant with the decimal32 format, is represented as:

$$X = (-1)^{s} \times 10^{e} \times \text{coefficient} \tag{1}$$

¹ Note that w = 6, 8 and 12; j = 20, 50 and 110; exponent bias = 101, 398 and 6176; q = 7, 16 and 34 respectively in the decimal32, decimal64 and decimal128 formats.

In (1), $e$ is in the range $(E_{min} - 6) \le e \le (E_{max} - 6)$ with $E_{max} = 96$ and $E_{min} = -95$, and the coefficient field is represented as an integer. If the absolute value of a DFP number is larger than the largest DFP number ($X_{max} = 9999999 \times 10^{90}$), overflow occurs. Similarly, if it is less than the smallest 32-bit DFP number ($X_{min} = 10^{-101}$), underflow occurs. When the absolute value of a DFP number is less than $1000000 \times 10^{-101}$ and larger than $0000001 \times 10^{-101}$, the number is subnormal.

2.2. Calculation of 32-bit DFP Logarithm

A valid 32-bit DFP logarithmic calculation is defined as:

$$R = \log_{10}(X) = \log_{10}(10^{e}) + \log_{10}(\text{coefficient}) \tag{2}$$

In (2), the exponent is in the range $-101 \le e \le 90$, and the coefficient is a q-digit (q = 7) non-normalized integer in the range $0000001 \le \text{coefficient} \le 9999999$. Some exceptional cases need to be dealt with during a 32-bit DFP logarithmic calculation. First of all, X must be a positive floating-point number (S = 0); otherwise the logarithmic converter simply returns NaN. Moreover, if X is NaN or zero, the logarithmic converter returns NaN, and if X is infinite, the logarithmic converter returns Inf. Inexact logarithm results need to be rounded and normalized to exact q-digit logarithm results.
Since the maximum and minimum logarithm results are $\log_{10}(X_{max}) = 96.99999$ and $\log_{10}(X_{min}) = -101$ respectively, subnormal results, overflow and underflow are not produced during logarithmic calculations.

The calculation of $\log_{10}(\text{coefficient})$ is a 7-digit FXP decimal logarithm operation. Since the coefficient of a DFP number is defined as a non-normalized integer, it should be adjusted into the range [0.1, 1) before the calculation. Therefore, (3) is obtained:

$$R = \log_{10}(X) = e + k + \log_{10}(m) \tag{3}$$

Here k is the characteristic of the logarithm and can be easily obtained with a leading-zero detector (LZD), $1 \le k \le 7$; m is a decimal fraction that consists of all q digits (q = 7) of the coefficient part of the 32-bit DFP number, $0.1 \le m < 1$. Since the target is a 32-bit DFP calculation, the 7-digit FXP logarithm calculation must achieve enough accuracy to guarantee faithful rounding. A straightforward approach is to guarantee a precision of at least 2q digits (14 digits), so that the inexact rounding can be implemented after a left shift of up to exactly q digits (7 digits).
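As an illustration of the decomposition in (3), the following Python sketch splits a decimal32 operand, given as an exponent e and an integer coefficient, into the integer part e + k and the fraction m. It is only a software model under stated assumptions: the function name `decompose` is hypothetical, k is obtained by counting significant digits instead of using a hardware leading-zero detector, and the reference logarithm comes from Python's `decimal` module, not from the proposed converter.

```python
from decimal import Decimal, getcontext

getcontext().prec = 30  # working precision chosen for this illustration only


def decompose(e: int, coefficient: int, q: int = 7):
    """Split X = coefficient * 10**e into (e + k, m) with 0.1 <= m < 1.

    k is the number of significant digits of the coefficient, which is
    what a leading-zero detector would report for a q-digit operand.
    """
    if not 0 < coefficient < 10 ** q:
        raise ValueError("coefficient must be a positive q-digit integer")
    k = len(str(coefficient))              # 1 <= k <= q
    m = Decimal(coefficient).scaleb(-k)    # m = coefficient * 10**(-k)
    return e + k, m


# Example operand 9999999 * 10**-7 (the operand used later in Section 4.1)
int_part, m = decompose(-7, 9999999)
log_x = int_part + m.log10()               # R = e + k + log10(m)
print(int_part, m, log_x)
```

For the smallest operand, `decompose(-101, 1)` gives the integer part -100 and m = 0.1, so R = -100 + log10(0.1) = -101, in agreement with the minimum result quoted above.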

3. Decimal FXP Logarithmic Converter

3.1. Overview of Algorithm

A digit-recurrence algorithm to calculate $\log_{10}(m)$, $0.1 \le m < 1$, is summarized as follows:

$$\log_{10}(m) = \log_{10}\Big(m \prod_{j=1}^{\infty} f_j\Big) - \sum_{j=1}^{\infty} \log_{10}(f_j) \tag{4}$$

If the following condition is satisfied:

$$\lim_{j \to \infty} \Big\{ m \prod f_j \Big\} \to 1 \tag{5}$$

then

$$\lim_{j \to \infty} \Big\{ \log_{10}\Big(m \prod f_j\Big) \Big\} \to 0 \tag{6}$$

Finally,

$$\log_{10}(m) = 0 - \sum_{j=1}^{\infty} \log_{10}(f_j) \tag{7}$$

$f_j$ is defined as $f_j = 1 + e_j 10^{-j}$, by which m is transformed to 1 through successive multiplication. This form of $f_j$ allows the use of a shift-and-add implementation. The corresponding recurrences for transforming m and computing the logarithm are presented in (8) and (9), where $j \ge 1$, $E[1] = m$ and $L[1] = 0$.

$$E[j+1] = E[j]\,(1 + e_j 10^{-j}) \tag{8}$$

$$L[j+1] = L[j] - \log_{10}(1 + e_j 10^{-j}) \tag{9}$$

The digits $e_j$ are selected so that $E[j+1]$ converges to 1; one digit of accuracy of the calculation result is therefore obtained in each iteration. After performing the last iteration of the recurrence, the results are:

$$E[N+1] \approx 1 \tag{10}$$

$$L[N+1] \approx \log_{10}(m) \tag{11}$$

To obtain the selection function for $e_j$, a scaled remainder is defined as:

$$W[j] = 10^{j}\,(1 - E[j]) \tag{12}$$

Thus,

$$E[j] = 1 - W[j]\,10^{-j} \tag{13}$$

Substituting (13) in (8) yields

$$W[j+1] = 10\,\big(W[j] - e_j + e_j W[j]\,10^{-j}\big) \tag{14}$$

According to (14), the digits $e_j$ are selected as a function of the leading digits of the scaled residual in such a way that the residual $W[j]$ remains bounded.

3.2. Selection by Rounding

The redundant digits are selected by rounding the residual to its integer part, as indicated in (15), where $e_j \in \{-9, -8, -7, \ldots, 0, \ldots, 7, 8, 9\}$. This gives (16):

$$e_j = \mathrm{round}(W[j]) \tag{15}$$

$$-0.5 \le W[j] - e_j \le 0.5 \tag{16}$$

Since $|e_{j+1}| \le 9$,

$$-9.5 < W[j+1] < 9.5 \tag{17}$$

Equation (14) can be written as:

$$W[j+1] = 10\,(W[j] - e_j) + e_j 10^{1-j}\,(W[j] - e_j + e_j) \tag{18}$$

According to (16), (17) and (18), the numerical analysis proceeds as follows:

$$-0.5 \times 10 + e_j 10^{1-j}\,(-0.5 + e_j) > -9.5 \tag{19}$$

$$0.5 \times 10 + e_j 10^{1-j}\,(0.5 + e_j) < 9.5 \tag{20}$$

The numerical analysis results show that the conditions (19) and (20) are satisfied if and only if $j \ge 3$.
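The claim about conditions (19) and (20) can be checked numerically. The sketch below is an illustrative check, not the authors' analysis: the helper name `bounds_hold` and its `e_max` parameter are assumptions introduced here, and exact rational arithmetic from Python's `fractions` module is used to avoid floating-point rounding.

```python
from fractions import Fraction


def bounds_hold(j: int, e_max: int = 9) -> bool:
    """Return True if (19) and (20) hold for every digit e_j with |e_j| <= e_max."""
    half = Fraction(1, 2)
    limit = Fraction(19, 2)                              # 9.5
    for e in range(-e_max, e_max + 1):
        scale = Fraction(e) * Fraction(10) ** (1 - j)    # e_j * 10^(1-j)
        lower = -half * 10 + scale * (-half + e)         # left side of (19)
        upper = half * 10 + scale * (half + e)           # left side of (20)
        if not (lower > -limit and upper < limit):
            return False
    return True


# Iterations for which selection by rounding is valid with the full digit set
print([j for j in range(1, 8) if bounds_hold(j)])
# For j = 2 the conditions hold only with a restricted digit set
print(bounds_hold(2, e_max=6), bounds_hold(2, e_max=7))
```

With the full digit set the conditions first hold at j = 3; for j = 2 they hold when the digit magnitude is limited to 6, which is the restriction exploited in the rest of this section.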
Consequently, selection by rounding is only valid for iterations $j \ge 3$, and $e_1$ and $e_2$ could only be obtained from look-up tables. However, using two look-up tables for j = 1, 2 would significantly increase the overall hardware cost. Therefore, a restriction on $e_1$ is defined so that $e_2$ can be obtained by selection by rounding and one look-up table is saved. Because $W[1] = 10(1 - m)$, $W[2]$ can be obtained as:

$$W[2] = 100 - 100\,m - 10\,e_1 m \tag{21}$$

When j equals 2, (19) and (20) are satisfied if $e_2$ is in the range $-7 < e_2 < 7$. Substituting $-7 < e_2 < 7$ into (16) gives (22):

$$-6.5 < W[2] < 6.5 \tag{22}$$

From (21) and (22), we conclude that when the input FXP decimal number m is in the range $0.5 \le m \le 1$, $e_2$ can be obtained by selection by rounding. The look-up table for the selection of $e_1$ is shown in Table 1. Because m is in the range $0.1 \le m < 1$, an input number in the range $0.1 \le m < 0.5$ needs to be adjusted by multiplying it by 2, 3 or 5. The adjusted numbers $m'$, which are in the range $0.5 \le m' \le 1$, are then processed with selection by rounding. Finally, the logarithm results $\log_{10}(m')$ are adjusted by subtracting the constant ($\log_{10}(2)$, $\log_{10}(3)$ or $\log_{10}(5)$) to obtain the final logarithm results $\log_{10}(m)$.
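The recurrences (8), (9) and (12) with selection by rounding can be modelled in software as follows. This is a minimal sketch and not the hardware datapath: it uses Python's `decimal` module at 40-digit precision instead of the finite-width registers, the function name `log10_digit_recurrence` is hypothetical, the $e_1$ thresholds are read from Table 1 as lower bounds of each range, the choice of which multiplier (2, 3 or 5) applies to which sub-range of m is an assumption, and ties in the rounding are resolved with round-half-even, which the text does not specify. The switch from stored logarithms to the series term after the eighth iteration follows the algorithm summary in Section 3.4.

```python
from decimal import Decimal, getcontext, ROUND_HALF_EVEN

getcontext().prec = 40  # software precision for illustration, not the hardware width

# Table 1 read as (lower bound of m', e_1); thresholds assumed from the table ranges
E1_TABLE = [("0.96", 0), ("0.88", 1), ("0.81", 2), ("0.76", 3), ("0.70", 4),
            ("0.66", 5), ("0.62", 6), ("0.59", 7), ("0.56", 8), ("0.50", 9)]


def log10_digit_recurrence(m, iterations: int = 15) -> Decimal:
    """Approximate log10(m), 0.1 <= m < 1, by radix-10 digit recurrence."""
    m = Decimal(m)
    if not Decimal("0.1") <= m < 1:
        raise ValueError("m must satisfy 0.1 <= m < 1")
    # Adjust m into [0.5, 1); the sub-range boundaries are an assumption
    if m < Decimal("0.2"):
        c = 5
    elif m < Decimal("0.25"):
        c = 3
    elif m < Decimal("0.5"):
        c = 2
    else:
        c = 1
    m_adj = m * c
    e = next(d for lo, d in E1_TABLE if m_adj >= Decimal(lo))  # e_1 from Table 1
    E = m_adj            # E[1]
    L = Decimal(0)       # L[1]
    for j in range(1, iterations + 1):
        if j > 1:
            W = (1 - E).scaleb(j)                        # W[j] = 10^j (1 - E[j])
            e = int(W.to_integral_value(rounding=ROUND_HALF_EVEN))
        f = 1 + Decimal(e).scaleb(-j)                    # f_j = 1 + e_j 10^-j
        E *= f                                           # recurrence (8)
        if j <= 8:
            L -= f.log10()                               # recurrence (9), table value
        else:
            L -= Decimal(e).scaleb(-j) / Decimal(10).ln()  # series approximation
    return L - Decimal(c).log10()                        # undo the adjustment


for s in ("0.9999999", "0.5", "0.1234567"):
    print(s, log10_digit_recurrence(s), Decimal(s).log10())
```

Comparing the two printed columns shows how closely 15 iterations track the reference logarithm; the model ignores the quantization effects that the hardware introduces.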

Table 1. Look-up Table for e_1 Selection

Range of m'       e_1 (BCD)
[0.96, 1.00)      0 (0000)
[0.88, 0.95]      1 (0001)
[0.81, 0.87]      2 (0010)
[0.76, 0.80]      3 (0011)
[0.70, 0.75]      4 (0100)
[0.66, 0.69]      5 (0101)
[0.62, 0.65]      6 (0110)
[0.59, 0.61]      7 (0111)
[0.56, 0.58]      8 (1000)
[0.50, 0.55]      9 (1001)

3.3. Approximation of Logarithm

The logarithm result is obtained by accumulating the values of $\log_{10}(1 + e_j 10^{-j})$ in each iteration. The values of $\log_{10}(1 + e_j 10^{-j})$ are stored in another look-up table, table II. With an increasing number of iterations, however, the size of the table becomes prohibitively large. Therefore, a method for reducing the size of the table is necessary, since it can achieve a significant reduction in the overall hardware requirement. A series expansion of the logarithm function $\log_{10}(1 + x)$ is expressed in (23):

$$\log_{10}(1 + x) = \Big(x - \frac{x^2}{2} + \ldots\Big) / \ln(10) \tag{23}$$

After iteration j = k, the values of $\log_{10}(1 + e_j 10^{-j})$ can be approximated by $e_j 10^{-j} / \ln(10)$. Since 14-digit accuracy needs to be guaranteed in this study, the series approximation can be used in the iterations where the constraint $x^2 / (2\ln(10)) < 10^{-16}$ is met, where $x = e_j 10^{-j}$:

$$\frac{(e_j 10^{-j})^2}{2 \ln(10)} < 10^{-16} \tag{24}$$

The numerical analysis of (24) shows that after k = 8 iterations the values of $\log_{10}(1 + e_j 10^{-j})$ do not need to be stored in the table; the values of $e_j 10^{-j} / \ln(10)$ are used for approximation instead.

3.4. Algorithm Summary

In the first iteration (j = 1), $e_1$ is obtained from look-up table I under the restriction $0.5 \le m' \le 1$, and numbers in the range $0.1 \le m < 0.5$ need to be adjusted. In iterations j = 2 to j = 15, convergence is achieved with selection by rounding and the redundant digits $e_j$ are obtained. In iterations $j \le 8$, the logarithms are obtained from look-up table II, in which the values of $\log_{10}(1 + e_j 10^{-j})$ are stored. In iterations j > 8, the logarithm results are approximated by $e_j 10^{-j} / \ln(10)$.
The values of $\log_{10}(1 + e_j 10^{-j})$ in iterations $j \le 8$ and of $e_j 10^{-j} / \ln(10)$ in iterations j > 8 are accumulated to obtain $\log_{10}(m')$, which is adjusted by subtracting the constant (0, $\log_{10}(2)$, $\log_{10}(3)$ or $\log_{10}(5)$) to obtain an FXP decimal logarithm result with 14-digit accuracy.

3.5. Error Analysis and Evaluation

The errors in the proposed algorithm are produced in four ways. The first is the inherent error of the algorithm, $\varepsilon_i$, which results from the difference between the logarithm results obtained from a finite number of iterations and the exact results obtained from infinitely many iterations. The second is the approximation error, $\varepsilon_a$, produced by approximating the values of $\log_{10}(1 + e_j 10^{-j})$ with the values of $e_j 10^{-j} / \ln(10)$. The third is the quantization error, $\varepsilon_q$, which results from the finite precision of the intermediate values in the hardware-oriented algorithm. The fourth is the final output rounding error, $\varepsilon_r$, whose maximum value is 1/2 unit in the last place (ulp). In order to obtain a logarithm result with 14-digit accuracy, the following condition must be satisfied:

$$E_{absolute} = \varepsilon_i + \varepsilon_a + \varepsilon_q + \varepsilon_r \le 10^{-14} \tag{25}$$

3.5.1 Inherent Error of Algorithm

Since each FXP decimal logarithm result is obtained after the 15th iteration, $\varepsilon_i$ can be defined as:

$$\varepsilon_i = \sum_{j=16}^{\infty} \log_{10}(1 + e_j 10^{-j}) \tag{26}$$

In order to use the static error analysis method, we choose the worst cases ($e_j = 9$ or $-9$) to analyze the maximum $\varepsilon_i$:

$$\varepsilon_i = \sum_{j=16}^{\infty} \log_{10}(1 \pm 9 \times 10^{-j}) \tag{27}$$

According to (27), the maximum $\varepsilon_i$ is in the range:

$$-4.3 \times 10^{-16} \le \varepsilon_i \le 4.3 \times 10^{-16} \tag{28}$$

3.5.2 Approximation Error

We use the approximate value $e_j 10^{-j} / \ln(10)$ to estimate $\log_{10}(1 + e_j 10^{-j})$ from the 9th to the 15th iteration. According to the series expansion of the logarithm function in (23), this approach produces an approximation error, $\varepsilon_a$:

$$\varepsilon_a = \sum_{j=9}^{15} \Big( -\frac{(e_j 10^{-j})^2}{2} + \frac{(e_j 10^{-j})^3}{3} - \ldots \Big) / \ln(10) \tag{29}$$

Since

$$\sum_{j=9}^{15} \Big( \frac{(e_j 10^{-j})^3}{3} - \ldots \Big) / \ln(10) \ll 10^{-16} \tag{30}$$

we keep only the term $(e_j 10^{-j})^2 / (2\ln(10))$ to analyze $\varepsilon_a$:

$$\varepsilon_a \approx \sum_{j=9}^{15} \Big( -\frac{(e_j 10^{-j})^2}{2} \Big) / \ln(10) \tag{31}$$

Considering the worst cases ($e_j = 9$ or $-9$), we obtain the maximum $\varepsilon_a$:

$$|\varepsilon_a| \le 1.78 \times 10^{-17} \tag{32}$$

3.5.3 Quantization Error

Since only intermediate values with finite precision are operated on in the hardware-oriented algorithm, three quantization errors occur. First, the logarithm results are obtained by accumulating the 16-digit rounded values of $\log_{10}(1 + e_j 10^{-j})$ from the 1st to the 8th iteration. In each iteration, the maximum rounding error of $\log_{10}(1 + e_j 10^{-j})$ is $0.5 \times 10^{-16}$, therefore the maximum $\varepsilon_{q1}$ is:

$$\varepsilon_{q1} \le \sum_{j=1}^{8} 0.5 \times 10^{-16} = 4 \times 10^{-16} \tag{33}$$

Second, the logarithm results are obtained by accumulating the 16-digit values of $e_j 10^{-j} / \ln(10)$ from the 9th to the 15th iteration. The stored value of $1/\ln(10)$ is itself quantized; writing $\Delta$ for its maximum quantization error, when $e_j = 9$ or $-9$ the maximum $\varepsilon^{1}_{q2}$ is:

$$\varepsilon^{1}_{q2} \le \sum_{j=9}^{15} \big| \pm 9 \times 10^{-j} \big| \times \Delta \ll 10^{-16} \tag{34}$$

Another quantization error, $\varepsilon^{2}_{q2}$, is produced by truncating the value of $e_j 10^{-j} / \ln(10)$ to a finite 16-digit precision. In each iteration, the maximum truncation error of $e_j 10^{-j} / \ln(10)$ is $1 \times 10^{-16}$, therefore the maximum $\varepsilon^{2}_{q2}$ is:

$$\varepsilon^{2}_{q2} \le \sum_{j=9}^{15} 1 \times 10^{-16} = 7 \times 10^{-16} \tag{35}$$

Third, the logarithm result $\log_{10}(m')$ is adjusted by a finite 16-digit rounded constant (0, $\log_{10}(2)$, $\log_{10}(3)$ or $\log_{10}(5)$) in the last iteration, so the quantization error $\varepsilon_{q3}$ occurs. The maximum $\varepsilon_{q3}$ is:

$$\varepsilon_{q3} \le 0.5 \times 10^{-16} \tag{36}$$

Therefore, the maximum quantization error, $\varepsilon_q$, is:

$$\varepsilon_q \le \varepsilon_{q1} + \varepsilon^{1}_{q2} + \varepsilon^{2}_{q2} + \varepsilon_{q3} \le 1.15 \times 10^{-15} \tag{37}$$

3.5.4 Error Evaluation

Since the final logarithm result has 14-digit accuracy, the maximum final rounding error is 1/2 ulp, $\varepsilon_r = 0.5 \times 10^{-14}$. With $\varepsilon_i$, $\varepsilon_a$ and $\varepsilon_q$ from (28), (32) and (37) respectively,

$$E_{absolute} = \varepsilon_i + \varepsilon_a + \varepsilon_q + \varepsilon_r = 0.660 \times 10^{-14} \tag{38}$$

$E_{absolute}$ satisfies the condition (25), so the proposed algorithm can guarantee faithful rounding for decimal logarithm results with 14-digit precision.
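The error budget above can be reproduced with a few lines of arithmetic. The sketch below recomputes the worst-case terms in double precision; it is only a plausibility check under stated assumptions: the variable names are hypothetical, the infinite sum in (27) is truncated at j = 40, and the contribution of the quantization of 1/ln(10) is treated as negligible.

```python
import math

LN10 = math.log(10)

# Inherent error, worst case e_j = 9: tail of the series from j = 16 on (27)
eps_i = sum(math.log1p(9 * 10.0 ** -j) for j in range(16, 40)) / LN10

# Approximation error, worst case: second-order term for j = 9..15 (31)
eps_a = sum((9 * 10.0 ** -j) ** 2 / 2 for j in range(9, 16)) / LN10

eps_q1 = 8 * 0.5e-16      # rounded table values, iterations 1..8 (33)
eps_q2 = 7 * 1.0e-16      # truncated series terms, iterations 9..15 (35)
eps_q3 = 0.5e-16          # rounded adjustment constant (36)
eps_q = eps_q1 + eps_q2 + eps_q3

eps_r = 0.5e-14           # final rounding, 1/2 ulp at 14 digits

total = eps_i + eps_a + eps_q + eps_r
print(f"eps_i = {eps_i:.3e}, eps_a = {eps_a:.3e}, eps_q = {eps_q:.3e}")
print(f"total = {total:.4e}, within 1e-14: {total <= 1e-14}")
```

The rounding term dominates the budget, which is why the intermediate values are kept at 16 digits, two more than the target accuracy.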
Moreover, a MATLAB simulation model that is completely consistent with the hardware implementation of the proposed 7-digit FXP logarithmic converter was set up. The MATLAB simulation model shows that at least 1-digit precision must be kept for $W[j]$ to obtain correct $e_j$ during the 15 iterations. Furthermore, both 10,000 7-digit decimal operands close to 1.0 and 100,000 random decimal operands in the range [0.1, 1) were simulated as test vectors in the MATLAB model. All the logarithm results obtained from the simulation model achieve 14-digit accuracy.

3.6. Architecture

Figure 2 shows a sequential architecture of the proposed 7-digit FXP decimal logarithmic converter. The hardware implementation of this logarithmic converter includes two stages. Stage 1, shown in Figure 2, obtains $e_j$ with selection by rounding. After $e_j$ is obtained, the logarithm results are produced in stage 2. Finally, for input decimal numbers in the range $0.1 \le m < 0.5$, the corresponding logarithm results are adjusted.

3.6.1 Main Features of Architecture

All variables in this architecture are represented in the 10's complement number system. Each digit of a positive FXP decimal number is represented by a 4-bit BCD code, whereas a negative number is represented in its 10's complement format. The reason for choosing the 10's complement format is the same as for the binary 2's complement format: all digits, including the sign digit, participate in the add or subtract operation. Moreover, a decimal subtraction can be replaced by a decimal addition in 10's complement format. The architecture of this logarithmic converter includes two look-up tables. Look-up table I is a ROM in which the values of $e_1$ are stored as shown in Table 1.
Look-up table II stores all the 16-digit values of $\log_{10}(1 + e_j 10^{-j})$ used to obtain the logarithm results²; here j is in the range $1 \le j \le 8$, because after 8 iterations the logarithm results can be obtained from the series-expansion approximation of the logarithm function. Furthermore, the 16-digit adjustment parameters $\log_{10}(2)$, $\log_{10}(3)$ and $\log_{10}(5)$ are stored in this table. The size of look-up table II is $2^8 \times 64$.

² Note that the proposed architecture can be transformed into a decimal base-e logarithmic converter by storing the values of $\ln(1 + e_j 10^{-j})$, $\ln(5)$, $\ln(3)$ and $\ln(2)$ in look-up table II.

Figure 2. Architecture of FXP Decimal Logarithmic Converter. Stage 1 contains Reg 1, the detector, Mult1, Table I, the register for W[j], Mux 1 to Mux 6, the 9's complement blocks, Mult2, the shifters (x10^-j, x10, x100), the 1-digit decimal CLA adders and the rounding logic that produces e_j; the critical path lies in this stage. Stage 2 contains Mux 7 to Mux 9, Table II (log_10(1 + e_j 10^-j)), Mult3 (1/ln(10)), the adjustment constants (0, log_10(5), log_10(2), log_10(3)), Reg 6 and the 16-digit decimal CLA adder.

Mult1, Mult2 and Mult3 in the architecture are multiple-generation logic blocks. Mult1 produces $m$, $2m$, $3m$ and $5m$; Mult2 and Mult3 are designed to produce $e_j m$ and $e_j / \ln(10)$. Here $e_j$ is a value in the range $-9 \le e_j \le 9$, so the multiple logic produces the results from $-9m$ to $9m$. In this paper, Mult1, Mult2 and Mult3 are implemented based on the partial product generation logic described in [9]. The 10's complement decimal CLA adder is implemented based on the 1-digit decimal carry-look-ahead (CLA) adder described in [15]. For faster speed, the 16-digit and 1-digit decimal numbers are divided into four groups, with a separate CLA adder in each group. The subtraction operations in the algorithm are carried out by this CLA adder thanks to the 10's complement decimal format used in this architecture.

3.6.2 Cycle Process

At the first clock cycle, the 7-digit FXP decimal number is obtained from Reg1. Input numbers in the range $0.1 \le m < 0.5$ are adjusted in Mult1.
The corresponding input $m'$ (selected from $m$, $2m$, $3m$ and $5m$) and $e_1$ (obtained from look-up table I) are sent to Reg2. In the first iteration (2nd clock cycle), $m'$ and $e_1$ are selected by Mux1 and Mux2 respectively to produce $e_1 m'$ in Mult2. At the same time, $m'$ and 1 are chosen by Mux3 and Mux4 to obtain $1 - m'$ in the 1-digit CLA adder. Then $e_1 m'$ is shifted left by 1 digit to obtain $10 e_1 m'$, and $1 - m'$ is shifted left by 2 digits to obtain $100(1 - m')$. Finally, $W[2] = 100(1 - m') - 10 e_1 m'$ is obtained from these two terms in the 1-digit CLA adder, in agreement with (21). Then $W[2]$ is rounded to an integer in the rounding logic to obtain $e_2$. At the same time, $e_1$ is chosen by Mux7 and sent to stage 2, so the value of $\log_{10}(1 + e_1 10^{-1})$ is obtained from look-up table II. This value is selected by Mux8, and the adjustment constant ($\log_{10}(2)$, $\log_{10}(3)$, $\log_{10}(5)$ or 0) is chosen by Mux9. Finally, the logarithm result $L[2]$ is obtained in the 16-digit CLA adder in stage 2.

From the second to the eighth iteration (3rd to 9th clock cycle), $W[j]$ is chosen by Mux1, and $e_j$, obtained in the previous iteration, is selected by Mux2; then $e_j W[j]$ is obtained in Mult2. Meanwhile, $e_j$ and $W[j]$ are chosen by Mux3 and Mux4 to obtain $W[j] - e_j$ in the 1-digit CLA adder. Then $e_j W[j]$ from Mult2 is shifted right by $(j - 1)$ digits to obtain $e_j W[j] 10^{-(j-1)}$, and $W[j] - e_j$ is shifted left by 1 digit to obtain $10(W[j] - e_j)$. Finally, $W[j+1]$ is obtained by adding $e_j W[j] 10^{-(j-1)}$ and $10(W[j] - e_j)$ together in the 1-digit CLA adder. Then $W[j+1]$ is rounded to an integer in the rounding logic to obtain $e_{j+1}$. At the same time, $e_j$ is chosen by Mux7 and sent to stage 2. The values of $\log_{10}(1 + e_j 10^{-j})$ are read from look-up table II and chosen by Mux8. The logarithm result of the previous iteration is chosen by Mux9, and the two are added together in the 16-digit CLA adder to obtain $L[j+1]$.

From the ninth to the fifteenth iteration (10th to 16th clock cycle), $e_{j+1}$ is obtained by the same process as in the previous iterations. However, the logarithm results are approximated by $e_j 10^{-j} / \ln(10)$ instead of being read from look-up table II. After 15 iterations, the final logarithm results are obtained. This 7-digit FXP decimal logarithmic converter takes 16 clock cycles to obtain a decimal logarithm result with 14-digit accuracy.

4. A 32-bit DFP Logarithmic Converter

4.1. Architecture

The architecture of the 32-bit DFP logarithmic converter is shown in Figure 3. First, the 32-bit DFP number is sent to an IEEE-754 decoder, which unpacks the 32-bit DFP format into an 8-bit exponent, a 28-bit coefficient and a 2-bit signal that represents the exception cases. Second, the 8-bit unsigned binary exponent is converted to a 12-bit BCD representation with a combinational Bin-to-BCD converter, which is implemented with a shift-and-add algorithm. The Leading-Zero-Detector adjusts the 7-digit integer coefficient into the range [0.1, 1). Meanwhile, the characteristic k minus the bias (k − bias) is added to the BCD exponent to form the integer part of the decimal logarithm result. The 7-digit adjusted coefficient is then processed by the FXP logarithmic converter, and the 16-digit logarithm is saved for the subsequent faithful rounding. If the integer field of the decimal logarithm result is positive, 1 is subtracted from it and it is combined with the 10's complement of the 16-digit FXP logarithm result to obtain a decimal logarithm result. Otherwise, the integer field is directly combined with the 16-digit FXP logarithm result to obtain an inexact logarithm result. The Shift register shifts the inexact logarithm result to obtain the exact 36-bit coefficient part, the 8-bit unsigned binary exponent field and the 1-bit sign field. The Rounding logic rounds the 36-bit coefficient to a 28-bit faithful coefficient with the round-half-even algorithm.
Finally, the 1-bit sign field, the 8-bit exponent field, the 28-bit coefficient field and the 2-bit exception signal are encoded by the IEEE 754 coder, which packs a faithful 32-bit DFP logarithm result.

[Figure 3. 32-bit DFP Logarithmic Converter. Block diagram: IEEE 754 decoder with input register, Bin-to-BCD converter, Leading-Zero-Detector, 3-digit decimal CLA adder, fixed-point decimal log converter, 10's complement, multiplexers, combine logic, Shift register, Rounding logic, and IEEE 754 coder with output register. Exception signal encoding: 01 Infinite, 10 NaN, 00 Normal.]

We choose the DFP number $(-1)^0 \times 9999999 \times 10^{-7}$ as an example to illustrate the data flow of the proposed 32-bit DFP logarithmic architecture. The 32-bit DFP format of this number, in hexadecimal, is 6DE3FCFF. First, the IEEE 754 decoder decomposes the 32-bit DFP format into the 8-bit unsigned binary exponent 01011110 and the 28-bit coefficient 9999999 in BCD code. Second, the 8-bit unsigned binary exponent is converted to the 12-bit BCD exponent 094 in the Bin-to-BCD converter; the adjusted 28-bit coefficient 0.9999999 and the 3-digit characteristic minus bias, $(k - \mathrm{bias}) = 906$, are obtained in the Leading-Zero-Detector. Third, the integer part of the logarithm result, 000, is obtained by adding the BCD exponent 094 to $(k - \mathrm{bias}) = 906$ in the 3-digit decimal CLA adder. Meanwhile, the 16-digit FXP logarithm, 0.0000000434294461, is obtained in the FXP decimal logarithmic converter.
Fourth, since the integer part of the logarithm result is 000, which is not positive, it is directly combined with the 16-digit FXP logarithm result to obtain the inexact logarithm result 000.0000000434294461. Fifth, the exact 36-bit integer coefficient 434294461, the 8-bit unsigned binary exponent 01011110, and the 1-bit sign 1 are obtained by the Shift register. Finally, the exact 36-bit integer coefficient is rounded to the 28-bit faithful integer coefficient 4342945 in the Rounding logic, and the 32-bit DFP format of the logarithm result, B1770ACD in hexadecimal, is obtained in the IEEE 754 coder.

4.2. Function Verification

This section presents the platform used to verify the function of the proposed 32-bit DFP logarithmic converter. The platform is implemented on the Xilinx University Program Virtex-II Pro Development System [14] with the Embedded Development Kit (EDK). This system includes a Virtex-II Pro P30 FPGA [14]. The verification program is written in C and runs on the PowerPC. First, the valid DFP test vectors, encoded in the 32-bit DFP format, are sent to the 32-bit

DFP logarithmic converter. The logarithm results calculated by this converter are sent back to the PowerPC. Meanwhile, the PowerPC computes the logarithms of the same test vectors as accurate double-precision BFP results, which serve as benchmarks. Finally, the logarithm results calculated by the 32-bit DFP logarithmic converter are compared with the accurate results obtained by the PowerPC. If they are not identical, the corresponding 32-bit DFP test vector is displayed on a personal computer for debugging the converter. Verifying all test vectors ($192 \times 10^7$) is impractical because of the prohibitive processing time on this platform, so 10,000 special cases (NaN, Infinite, Zero, Subnormal) and 100,000 random test vectors are chosen and sent to the verification platform. The verification results show that all the 32-bit DFP logarithm results calculated by the proposed converter are correct.

5. Experimental Results and Analysis

5.1. Implementation Results

The proposed 32-bit DFP logarithmic converter is modeled in VHDL and implemented on the Virtex-II Pro P30 FPGA. It is synthesized with XST and placed and routed with Xilinx ISE 9.1. It occupies 1 out of 16 GCLK I/O blocks, 66 out of 6 I/O blocks, and ,8 out of 13696 slices. The maximum clock frequency and latency are 47.7 MHz and 18 clock cycles respectively. The critical path of the proposed architecture is in stage 1 of the FXP decimal logarithmic converter, which is highlighted in Figure 2 (dotted line); its details are given in Table 2.

Table 2. Critical Path of the Proposed 32-bit DFP Logarithmic Converter.

Block      Delay
Reg        1.188 ns
Mux2       1. ns
Mult2      9.37 ns
Shifter    1.38 ns
Mux5       1.350 ns
CLA        5.519 ns
Rounding   0.6 ns
Total      20.97 ns
Furthermore, the proposed 32-bit DFP logarithmic converter is synthesized with a TSMC 0.18-µm standard cell library, and the implementation results indicate that its maximum frequency and area are 107.9 MHz and 1589.66 units.

Since there is no comparable DFP logarithmic converter, we compare the proposed decimal FXP logarithmic converter with the radix-8 binary FXP logarithmic converter [11] for two cases with different precisions (Case 1: 7-digit and 24-bit; Case 2: 16-digit and 53-bit), because 1) they have similar dynamic ranges for the normalized coefficients ($2^{23} < 10^{7} < 2^{24}$ for case 1 and $2^{53} < 10^{16} < 2^{54}$ for case 2); 2) they are implemented with the same digit-recurrence algorithm with selection by rounding; and 3) radix 10 is close to radix 8. For the purpose of comparison, the proposed decimal FXP logarithmic converter is synthesized with a TSMC 0.18-µm standard cell library [13]. The synthesis results show that the worst-case path delay and area of the 7-digit decimal FXP logarithmic converter are 8.5 ns and 1577.8 units; those of the 16-digit decimal FXP logarithmic converter are 9.8 ns and 3616.33 units. Since the timing and area units in [11] are $\tau$ and fa ($1\tau$ = the delay of a 1-bit full adder, 1 fa = the area of a 1-bit full adder), we use the same units to express the delay and area of the decimal FXP logarithmic converter in this paper (footnote 3). Table 3 shows the comparison for cases 1 and 2: the proposed 7-digit architecture is 2.73 times slower and .51 times larger than the 24-bit radix-8 binary FXP logarithmic converter in case 1, and the proposed 16-digit architecture is 2.38 times slower and 1. times larger than the 53-bit radix-8 binary FXP logarithmic converter in case 2. The reasons are that 1) numbers in BCD code, as used in the proposed architecture, are less efficient than the binary numbers in the radix-8 binary FXP logarithmic converter and need more resources to implement;
2) the latency of decimal arithmetic blocks, such as the decimal CLA adder and the Multiple logic in Figure 2, is larger than that of the signed-digit (SD) binary adder and SD Multiple logic in the architecture of the radix-8 binary FXP logarithmic converter.

(Footnote 3) Note that $\tau$ and fa are the delay and area of a 1-bit full adder (ADFULD) in the TSMC 0.18-µm standard cell library [13].

5.2. Scale to Decimal64 and Decimal128

Note that while decimal32 is only a storage format in the IEEE 754-2008 standard, decimal64 and decimal128 are more accurate formats for decimal calculation. To explain how we scale the proposed 32-bit DFP logarithmic converter to 64-bit and 128-bit converters, compliant with the decimal64 and decimal128 formats, we mainly discuss the transformation of the core part, the decimal FXP logarithmic converter. The 7-digit coefficient field of the decimal32 format is extended to 16 and 34 digits in the decimal64 and decimal128 formats respectively, so the decimal FXP logarithmic converter must produce 32-digit and 68-digit accurate results in order to guarantee faithful rounding of the 64-bit and 128-bit DFP logarithm results. The main alterations of the decimal FXP logarithmic converter for decimal64 and decimal128 are: 1) The digit width of Mult1 in stage 1 of the decimal FXP logarithmic converter (refer to Figure 2) needs to be extended to 16 and 34 digits. 2) It needs to keep at least 32-digit

and 68-digit precision for $W[j]$ in order to obtain the correct $e_j$ during 33 and 69 iterations; therefore the digit width of the CLA adder, Mult2 and other blocks in stage 1 of the decimal FXP logarithmic converter needs to be extended to 32 and 68 digits. 3) The decimal FXP logarithm results are obtained by accumulating the values of $\log_{10}(1 + e_j 10^{-j})$ in iterations $j \le k$ and $e_j 10^{-j} / \ln(10)$ in iterations $j > k$, where $k = 17$ for 32-digit accurate results and $k = 35$ for 68-digit accurate results. The digit width of the CLA adder, Mult3 and other blocks in stage 2 of the decimal FXP logarithmic converter needs to be extended to 34 and 70 digits. 4) Look-up table I, which stores the values of $e_1$, remains the same. However, look-up table II needs to store the 34-digit and 70-digit values of $\log_{10}(1 + e_j 10^{-j})$ for $1 \le j \le 17$ and $1 \le j \le 35$ in order to achieve 32-digit and 68-digit accurate logarithm results. Furthermore, the 32-digit and 68-digit adjustment constants $\log_{10}(2)$, $\log_{10}(3)$ and $\log_{10}(5)$ are stored in this table too. The size of look-up table II in the decimal FXP logarithmic converter needs to be extended to $2^{9} \times 136$ and $2^{10} \times 280$ for the decimal64 and decimal128 formats respectively.

Table 3. Hardware Performance Comparison.

                    Radix-10 Decimal Log Converter    Radix-8 Binary Log Converter [11]
Precision           7-digit      16-digit             24-bit       53-bit
Area                1630 fa      60 fa                67 fa        189 fa
Cycle time          17 τ         19 τ                 7 τ          8 τ
Number of cycles    9            18                   8            18
Latency             153 τ        342 τ                56 τ         144 τ

6. Conclusions

In this paper, we first present the 32-bit DFP format and its related logarithm operation. Second, we develop a decimal digit-recurrence algorithm with selection by rounding to achieve the radix-10 fixed-point (FXP) logarithm operation. Third, we construct the architecture of the 32-bit DFP logarithmic converter, which is implemented and verified on an FPGA.
Finally, we analyze the implementation results of the proposed architecture and compare the proposed decimal FXP logarithmic converter with a radix-8 binary FXP logarithmic converter for two cases. The comparison shows that the decimal FXP logarithmic converter is slower and occupies more area than the binary FXP logarithmic converter. The presented architecture, however, can be optimized for higher speed or smaller area.

(Footnote 4) Note that the error of approximating $\log_{10}(1 + e_j 10^{-j})$ by $e_j 10^{-j} / \ln(10)$ is below $10^{-34}$ for $j > 17$ (decimal64) and below $10^{-70}$ for $j > 35$ (decimal128).

References
[1] IEEE Standard 754-2008. IEEE Standard for Floating-Point Arithmetic. IEEE Computer Society, Aug 2008.
[2] M. F. Cowlishaw. Densely Packed Decimal Encoding. IEEE Computers and Digital Techniques, pp. 102-104, May 2002.
[3] M. F. Cowlishaw. Decimal Floating-Point: Algorism for Computers. IEEE Symp. on Computer Arithmetic, pp. 104-111, Jun 2003.
[4] A. Y. Duale, M. H. Decker, H.-G. Zipperer, M. Aharoni, and T. J. Bohizic. Decimal Floating-Point in z9: An Implementation and Testing Perspective. J. IBM Res. and Dev., Jan 2007.
[5] L. Eisen, J. W. W. III, H.-W. Tast, N. Mading, J. Leenstra, S. M. Mueller, C. Jacobi, J. Preiss, E. M. Schwarz, and S. R. Carlough. IBM POWER6 Accelerators: VMX and DFU. J. IBM Res. and Dev., Nov 2007.
[6] M. D. Ercegovac, T. Lang, and P. Montuschi. Very High-Radix Division with Selection by Rounding and Prescaling. IEEE Trans. on Computers, pp. 909-918, May 1994.
[7] L. Imbert, J. Muller, and F. Rico. A Radix-10 BKM Algorithm for Computing Transcendentals on Pocket Computers. J. VLSI Signal Processing, pp. 179-186, Jun 2000.
[8] T. Lang and P. Montuschi. Very-High Radix Square Root with Prescaling and Rounding and a Combined Division/Square Root Unit. IEEE Trans. on Computers, pp. 827-841, May 1999.
[9] T. Lang and A. Nannarelli. A Radix-10 Combinational Multiplier. IEEE Asilomar Conference on Signals, Systems and Computers, pp. 313-317, Oct 2006.
[10] J. Muller.
Elementary Functions, Algorithms and Implementation. Birkhäuser.
[11] A. Piñeiro, M. D. Ercegovac, and J. D. Bruguera. High-Radix Logarithm with Selection by Rounding: Algorithm and Implementation. J. VLSI Signal Processing, pp. 109-123, May 2005.
[12] E. M. Schwarz, J. S. Kapernick, and M. F. Cowlishaw. Decimal Floating-Point Support on the IBM System z10 Processor. J. IBM Res. and Dev., Jan 2009.
[13] Virtual Silicon Technology Inc. Native-18 Standard Cell Library 0.18V TSMC Process, Sep 1999.
[14] Xilinx Inc. Xilinx University Program Virtex-II Pro Development System, Hardware Reference Manual.
[15] Y. You, Y. Kim, and J. Choi. Dynamic Decimal Adder Circuit Design by using the Carry Lookahead. IEEE Design and Diagnostics of Electronic Circuits and Systems, pp. -, Apr 2006.