ABSTRACT

We know that computers store numbers not with infinite precision but rather in some approximation that can be packed into a fixed number of bits or bytes, because of which we lose some information. Our aim is to study the effect of losing this information on the response of digital filters. We call this effect the finite word length effect. There are a number of finite word length effects, such as overflow error in addition, round-off error in multiplication, effects of coefficient quantization, limit cycles, etc. This paper discusses the effects of coefficient quantization on the response of IIR filters. Section 1 gives a brief introduction to number systems and shows why finite word length effects occur. Section 2 studies the same phenomena from the viewpoint of filters and includes the results we have obtained. We have studied the effect of finite word length on the response of a Butterworth low pass IIR filter. We have also studied the effect of finite word length on the response of a 4th order IIR filter for direct form and parallel form realization. On the basis of these results we confirm the well-known result that parallel form realization is better than direct form realization.

1. INTRODUCTION

Computers store numbers not with infinite precision but rather in some approximation that can be packed into a fixed number of bits or bytes. Almost all computers allow the programmer a choice among several different such representations or data types. Data types can differ in the number of bits utilized, but also in the more fundamental respect of whether the stored number is represented in fixed-point or floating-point format.

1.1 Fixed point representation

A number in fixed point representation is exact. Arithmetic between numbers in fixed point representation is also exact, provided that (i) the answer is not outside the range of integers that can be represented, and (ii) division is interpreted as producing an integer result, throwing away any integer remainder. There are many formats for representing fixed point numbers, such as sign-magnitude, one's complement and two's complement. A real number can be represented with infinite precision in two's complement form as

$$x = X_m \left( -b_0 + \sum_{i=1}^{\infty} b_i 2^{-i} \right)$$

where $X_m$ is an arbitrary scale factor and the $b_i$ are either 0 or 1. The quantity $b_0$ is referred to as the sign bit. If $b_0 = 0$, then $0 \le x \le X_m$, and if $b_0 = 1$, then $-X_m \le x < 0$. An arbitrary real number x would require an infinite number of bits for its exact binary representation. If we use only a finite number of bits (B + 1), then the representation in the above equation must be modified to

$$\hat{x} = Q_B[x] = X_m \left( -b_0 + \sum_{i=1}^{B} b_i 2^{-i} \right) = X_m \hat{x}_B$$

The resulting binary representation is quantized, so that the smallest difference between numbers is

$$\Delta = X_m 2^{-B}$$

The operation of quantizing a number to (B + 1) bits can be implemented by rounding or by truncation, but in either case quantization is a nonlinear memoryless operation. Figure 1.1 shows the input-output relation for two's complement rounding and truncation, respectively, for the case of B = 2.

Figure 1.1 Nonlinear relationship representing two's complement (a) rounding and (b) truncation for B = 2 []

In considering the effects of quantization, we often define the quantization error as $e = Q_B[x] - x$. For the case of two's complement rounding, $-\Delta/2 < e \le \Delta/2$, and for two's complement truncation, $-\Delta < e \le 0$ (Figure 1.2).
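The two quantizers described above can be sketched in a few lines of C. This is only an illustration, not the routine used for the results in this paper: the function names `quantize_truncate` and `quantize_round` and the helper `floor_to_long` are hypothetical, the input is assumed to lie inside the representable range (overflow is not handled), and ties in rounding are assumed to go upward so that the error satisfies $-\Delta/2 < e \le \Delta/2$.

```c
#include <assert.h>

/* Largest integer not greater than v (a floor written without libm). */
static long floor_to_long(double v)
{
    long k = (long)v;                 /* a C cast truncates toward zero */
    return (v < (double)k) ? k - 1 : k;
}

/* Two's complement truncation of x to (B + 1) bits with scale factor Xm.
   The step is delta = Xm * 2^-B. The value is always moved toward minus
   infinity, so the error e = Q[x] - x satisfies -delta < e <= 0. */
double quantize_truncate(double x, double Xm, int B)
{
    assert(B >= 0 && B < 31 && Xm > 0.0);
    double delta = Xm / (double)(1L << B);
    return delta * (double)floor_to_long(x / delta);
}

/* Two's complement rounding to the nearest level, ties rounded upward,
   so the error satisfies -delta/2 < e <= delta/2. */
double quantize_round(double x, double Xm, int B)
{
    assert(B >= 0 && B < 31 && Xm > 0.0);
    double delta = Xm / (double)(1L << B);
    return delta * (double)floor_to_long(x / delta + 0.5);
}
```

With $X_m = 1$ and B = 2 the step is 0.25. The input 0.3 becomes 0.25 under both rules, while -0.3 becomes -0.5 under truncation and -0.25 under rounding, which shows the one-sided error of truncation.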

Figure 1.2 Probability density function for quantization errors (a) rounding (b) truncation []

If a number is larger than $X_m$, a situation called overflow occurs. Figure 1.3 (a) shows a two's complement quantizer, including the effect of regular two's complement arithmetic overflow. An alternative, which is called saturation overflow or clipping, is shown in Figure 1.3 (b). This method of handling overflow is generally implemented for A/D conversion, and it is sometimes implemented in specialized DSP microprocessors for the addition of two's complement numbers. With saturation overflow, the size of the error does not increase abruptly when overflow occurs. However, a disadvantage of this method is that it voids the following property of two's complement arithmetic: if several two's complement numbers whose sum would not overflow are added, then the result of two's complement accumulation of these numbers is correct even though intermediate sums might overflow.
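The property that saturation gives up can be checked on a small word length. The sketch below assumes 8-bit words, and the function names `add_wrap` and `add_saturate` are hypothetical. It illustrates the two overflow rules and does not model any particular DSP processor.

```c
#include <assert.h>
#include <stdint.h>

/* Natural two's complement addition on 8-bit words: the sum is taken
   modulo 2^8 and then reinterpreted as a signed value. */
int8_t add_wrap(int8_t a, int8_t b)
{
    uint8_t s = (uint8_t)((uint8_t)a + (uint8_t)b);
    /* Map 128..255 back to -128..-1 without relying on
       implementation-defined conversions. */
    return (s < 128) ? (int8_t)s : (int8_t)((int)s - 256);
}

/* Saturation (clipping) addition: the sum is limited to the word range. */
int8_t add_saturate(int8_t a, int8_t b)
{
    int s = (int)a + (int)b;
    assert(s >= -256 && s <= 254);
    if (s > INT8_MAX) return INT8_MAX;
    if (s < INT8_MIN) return INT8_MIN;
    return (int8_t)s;
}
```

Accumulating 100 + 100 - 100 has a final sum of 100, which fits in 8 bits, although the intermediate sum 200 does not. With `add_wrap` the intermediate result is -56 and the final result is the correct 100. With `add_saturate` the intermediate result is clipped to 127 and the final result is 27.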

Figure 1.3 Two's complement rounding (a) natural overflow (b) saturation []

1.2 Floating point representation

In floating-point representation (IEEE 754 standard), a number is represented internally by a sign bit s, an exact integer exponent E, and an exact positive integer mantissa M. Taken together these represent the number

$$x = (-1)^s \times 2^{E-127} \times 1.f$$

where E is the eight bit exponent (0 < E < 255), s is the sign bit (0 for positive and 1 for negative) and f is the 23 bit fraction ($0 \le f < 1$). Floating point representations provide a convenient means for maintaining a wide dynamic range.
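The three fields of a single precision number can be inspected directly. The sketch below assumes that `float` is the 32 bit IEEE 754 format on the target machine. The type `float_fields` and the functions `decode_float` and `rebuild_normalized` are hypothetical names introduced for this illustration, and only normalized numbers (0 < E < 255) are rebuilt.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;      /* 1 bit: 0 for positive, 1 for negative */
    unsigned exponent;  /* 8 bit biased exponent E                */
    uint32_t fraction;  /* 23 bit fraction field f                */
} float_fields;

/* Split a float into its sign, exponent and fraction bit fields. */
float_fields decode_float(float x)
{
    uint32_t bits;
    float_fields f;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);   /* copy the raw bit pattern */
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Rebuild (-1)^s * 2^(E-127) * (1 + f / 2^23) for a normalized number. */
double rebuild_normalized(float_fields f)
{
    assert(f.exponent > 0 && f.exponent < 255);
    double value = 1.0 + (double)f.fraction / 8388608.0;  /* 2^23 */
    int e = (int)f.exponent - 127;
    while (e > 0) { value *= 2.0; --e; }
    while (e < 0) { value /= 2.0; ++e; }
    return f.sign ? -value : value;
}
```

For example, -6.5 is 1.625 times 2^2, so it decodes to s = 1, E = 129 and a fraction field of 0x500000.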

2. FINITE WORD LENGTH EFFECTS

Numerical quantization affects the implementation of linear time-invariant discrete time systems in several ways. Below we give a brief overview of some of them.

Parameter quantization in digital filters

In the realization of FIR and IIR filters in hardware or in software on a general purpose computer, the accuracy with which filter coefficients can be specified is limited by the word length of the computer. Since the coefficients used in implementing a given filter are not exact, the poles and zeros of the system function will be different from the desired poles and zeros. Consequently, we obtain a filter whose frequency response differs from the frequency response of the filter with unquantized coefficients. Coefficient quantization can also affect the stability of the filter.

Round off noise in multiplication

As already explained, when a signal is sampled or a calculation is performed in the computer, the result must be placed in a register or memory location of fixed bit length. Rounding the value to the required size introduces an error in the sampling or calculation equal to the value of the lost bits, creating a nonlinear effect. Round off error is a characteristic of the computer hardware.

Sampling/Digitization Error

There is another, different, kind of error that is a characteristic of the program or algorithm used, independent of the hardware on which the program is executed. Many numerical algorithms compute discrete approximations to some desired continuous quantity. For example, an integral is evaluated numerically by computing a function at a discrete set of points, rather than at every point. Or, a function may be evaluated by summing a finite number of leading terms in its infinite series, rather than all of its infinitely many terms. In cases like this, there is an adjustable parameter, e.g., the number of points or of terms, such that the true answer is obtained only when that parameter goes to infinity. Any practical calculation is done with a finite, but sufficiently large, choice of that parameter. The difference between the true answer and the answer obtained in a practical calculation is called the truncation error. Truncation error would persist even on a hypothetical, perfect computer that had an infinitely accurate representation and no round off error.

Overflow in addition

Overflow in addition of two or more binary numbers occurs when the sum exceeds the word size available in the digital implementation of the system.

Limit cycles

Since the quantization inherent in finite precision arithmetic operations renders the system nonlinear, in recursive systems these nonlinearities often cause periodic oscillations to occur in the output, even when the input sequence is zero or some nonzero constant value. Such oscillations in recursive systems are called limit cycles.

As explained in the above paragraphs, finite word length affects LTI systems in many ways. We have concentrated on the effects of coefficient quantization on filter response, and within that on IIR filters. Later we give a brief overview of the effects of coefficient quantization in FIR systems for the sake of completeness.

2.1 Effects of coefficient quantization in IIR system

When the parameters of a rational system function or the corresponding difference equation are quantized, the poles and zeros of the system move to new positions in the z-plane. Equivalently, the frequency response is perturbed from its original value.

The system function representation corresponding to both direct forms is

$$H(z) = \frac{\sum_{k=0}^{M} b_k z^{-k}}{1 - \sum_{k=1}^{N} a_k z^{-k}}$$

The sets of coefficients $\{a_k\}$ and $\{b_k\}$ are the ideal infinite-precision coefficients. If we quantize these coefficients, we obtain the system function

$$\hat{H}(z) = \frac{\sum_{k=0}^{M} \hat{b}_k z^{-k}}{1 - \sum_{k=1}^{N} \hat{a}_k z^{-k}}$$

where $\hat{a}_k = a_k + \Delta a_k$ and $\hat{b}_k = b_k + \Delta b_k$ are the quantized coefficients, which differ from the original coefficients by the quantization errors $\Delta a_k$ and $\Delta b_k$.

Kaiser showed that if the poles (or zeros) are tightly clustered, small errors in the denominator (numerator) coefficients can cause large shifts of the poles (zeros) for the direct form structures. Thus, if the poles (zeros) are tightly clustered, corresponding to a narrow band pass filter or a narrow-bandwidth low pass filter, then we can expect the poles of the direct-form structure to be quite sensitive to quantization errors in the coefficients. Kaiser's analysis also showed that the larger the number of clustered poles (zeros), the greater the sensitivity to quantization error.

The cascade and parallel form system functions consist of second order direct form systems. However, in both cases each pair of complex conjugate poles is realized independently of all other poles. Thus, the error in a particular pole pair is independent of its distance from the other poles of the system function. For the cascade form the same argument holds for the zeros, since they are realized as independent second order factors. Thus the cascade form is generally much less sensitive to coefficient quantization than the equivalent direct-form realization.

The cascade form system function is

$$H(z) = \prod_{k=1}^{\lfloor (N+1)/2 \rfloor} \frac{b_{0k} + b_{1k} z^{-1} + b_{2k} z^{-2}}{1 - a_{1k} z^{-1} - a_{2k} z^{-2}}$$

The zeros of the parallel form structure are realized implicitly through combining the quantized second order sections. Thus, a particular zero is affected by quantization errors in the numerator and denominator coefficients of all the second order sections. However, for most practical filters the parallel form is also found to be much less sensitive to coefficient quantization than the equivalent direct-form realization. The parallel form system function is

$$H(z) = \sum_{k=0}^{M-N} C_k z^{-k} + \sum_{k=1}^{\lfloor (N+1)/2 \rfloor} \frac{e_{0k} + e_{1k} z^{-1}}{1 - a_{1k} z^{-1} - a_{2k} z^{-2}}$$

In summary, because of this sensitivity to finite word length effects, the direct forms are rarely used for implementing anything other than second-order structures. Cascade and parallel structures are more often used.

2.1.1 What we did

Before moving on to filter design and the finite word length effect, let us explain what we have done. We do not explain filter design or its fundamentals here; one can refer to any good book for these [][]. But one should ask how we performed the quantization, so let us explain it and give some examples which show the capability and limitations of our routine. Our quantization routine is very simple and basically performs the following steps:

1) Take a 32 bit floating point number in the range 0 to 1.
2) Multiply it by $2^{32}$ (if your numbers are between 0 and 2, multiply by $2^{31}$ instead) to get the equivalent integer, and store it in 32 bit format.
3) Shift the above number by the required number of bits to obtain the N bit representation of the number (in effect, set the least significant 32 - N bits to zero, so that we have a number which is still in 32 bits but with its least significant bits removed).
4) Convert the above number back into the corresponding floating point number.

Quantization routine and examples

Below we give our C routine, which takes as input the floating point number to be quantized and the desired number of bits, and returns the corresponding floating point number in the desired bit representation.

    #include <math.h>

    /* Routine that generates the decimal equivalent of the binary
       representation of a decimal number with n bits for the magnitude
       part, obtained from a floating point number in IEEE 754 format
       (number between 0 and 1).
       n - bit representation wanted, any number between 0 and 32 */
    float quant(float cof, int n)
    {
        /* at least 64 bits wide, so that 2^32 and a shift by 32 are safe */
        unsigned long long icof = 0, m;
        int sign = 1;
        float fract, quan;

        if (cof < 0) {          /* store the sign of the number */
            sign = -1;
            cof = -1 * cof;     /* if the number is negative make it positive */
        }

        /* Convert the floating point number between 0 and 1 into the
           corresponding 32 bit integer representation (a kind of scaling).
           ceil() is a C function which rounds up to the next integer. */
        icof = (unsigned long long)ceil(pow(2, 32) * cof);

        /* m is the number of positions by which the number must be
           shifted to get the n bit representation */
        m = 32 - n;
        icof = icof >> m;
        icof = icof << m;

        fract = (float)icof / pow(2, 32);   /* convert the integer back
                                               into floating point */
        quan = sign * fract;                /* put back the sign */
        return (quan);
    }

Examples:

Before starting, let us see how much one bit represents (note: the examples below consider numbers in the range 0 to 1).

Table columns: Input floating point number | Number of bits in representation | Obtained floating point number | Comment

Comments recorded in the table: "We are not using the full dynamic range" and "Here it fails".

2.1.2 Designing of Butterworth low pass filter using bilinear transformation

Let us start with the fundamental steps needed to design a Butterworth low pass filter using the bilinear transformation. The description is very brief, just to give the basic idea:

1) Determination of the analog filter's edge frequencies. Use the equation

$$\Omega = \frac{2}{T} \tan\left(\frac{\omega}{2}\right)$$

where $\Omega$ is the analog frequency, T is the sampling time period and $\omega$ is the digital frequency.

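2) Determination of the order of the filter

$$N \ge \frac{\log\left[\left(\dfrac{1}{\delta_2^2} - 1\right) \Big/ \left(\dfrac{1}{\delta_1^2} - 1\right)\right]}{2 \log\left(\Omega_2 / \Omega_1\right)}$$

where N is the filter order, $\delta_1$ and $\delta_2$ are the pass band and stop band ripple respectively, and $\Omega_1$ and $\Omega_2$ are the filter edge frequencies.

3) Determination of the -3 dB cutoff frequency

$$\Omega_c = \frac{\Omega_1}{\left(\dfrac{1}{\delta_1^2} - 1\right)^{1/2N}}$$

4) The transfer function of the Butterworth filter is usually written in the factored form given below

$$H(s) = \prod_{k=1}^{N/2} \frac{B_k \Omega_c^2}{s^2 + b_k \Omega_c s + c_k \Omega_c^2}, \quad N = 2, 4, 6, \ldots$$

or

$$H(s) = \frac{B_0 \Omega_c}{s + c_0 \Omega_c} \prod_{k=1}^{(N-1)/2} \frac{B_k \Omega_c^2}{s^2 + b_k \Omega_c s + c_k \Omega_c^2}, \quad N = 3, 5, 7, \ldots$$

where $b_k$ and $c_k$ are given by

$$b_k = 2 \sin\left[\frac{(2k - 1)\pi}{2N}\right] \quad \text{and} \quad c_k = 1$$

The parameter $B_k$ can be obtained from

$$A = \prod_{k=1}^{N/2} B_k \ \text{for even } N, \qquad A = \prod_{k=0}^{(N-1)/2} B_k \ \text{for odd } N$$

5) Determination of H(z)

$$H(z) = H(s)\Big|_{s = \frac{2}{T} \frac{z - 1}{z + 1}}$$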

Filter parameters:

Pass band ripple: 0.99
Stop band ripple: 0.001
Pass band frequency: 1.2566
Stop band frequency: 1.885
Filter order: 14 (so there are seven 2nd order sections in total)
Cutoff frequency:

Filter coefficients:

Numerator coefficients $B_1$ to $B_7$ are $B_k \Omega_c^2$.
Denominator coefficients $b_1$ to $b_7$ are $b_k \Omega_c$, and $c_1$ to $c_7$ are $c_k \Omega_c^2$.

Table columns: Coefficient | Original value | Quantized value 4 bits | Quantized value 6 bits | Quantized value bits | Quantized value 8 bits | Quantized value 5 bits

Table rows: B, followed by seven b coefficients.

Note: In the above table, do not be confused by the values of the coefficients. They may seem to go beyond the range 0 to 1, but this is because of the multiplication by the $\Omega_c$ term. See the equation for H(s).

Note: In the figures below, the red line is the quantized response.

Fig 2.1 Response when coefficients are quantized to 3 bits
Fig 2.2 Response when coefficients are quantized to 4 bits
Fig 2.3 Response when coefficients are quantized to 6 bits
Fig 2.4 Response when coefficients are quantized to bits
Fig 2.5 Response when coefficients are quantized to 8 bits
Fig 2.6 Response when coefficients are quantized to 5 bits

2.1.3 Designing of 4th order low pass filter and showing the response of the filter for direct form realization and parallel form realization

Direct form realization: H(z)

Parallel form realization: H(z)

Filter coefficients, direct form realization:

Table columns: Coefficient | Original value | Quantized value 4 bits | Quantized value bits | Quantized value 8 bits | Quantized value 6 bits | Quantized value 4 bits

Table rows: b, b, b, a, a, a, a, a

Filter coefficients, parallel form realization:

Table columns: Coefficient | Original value | Quantized value 4 bits | Quantized value bits | Quantized value 8 bits | Quantized value 6 bits | Quantized value 4 bits

Table rows: b, b, b, b, a, a, a, a, a, a

Fig 2.7 Response when coefficients are quantized to 4 bits (direct form)
Fig 2.8 Response when coefficients are quantized to 4 bits (parallel form)
Fig 2.9 Response when coefficients are quantized to bits (direct form)
Fig 2.10 Response when coefficients are quantized to bits (parallel form)
Fig 2.11 Response when coefficients are quantized to 8 bits (direct form)
Fig 2.12 Response when coefficients are quantized to 8 bits (parallel form)
Fig 2.13 Response when coefficients are quantized to 6 bits (direct form)
Fig 2.14 Response when coefficients are quantized to 6 bits (parallel form)
Fig 2.15 Response when coefficients are quantized to 4 bits (direct form)
Fig 2.16 Response when coefficients are quantized to 4 bits (parallel form)

2.2 Effects of coefficient quantization in FIR system

For FIR systems, we need to be concerned with the locations of the zeros only, since for causal FIR systems all poles are at z = 0. Although we have just seen that the direct form structure should be avoided for high order IIR systems, it turns out that the direct form structure is commonly used for FIR systems. To understand why this is so, we express the system function for a direct form FIR system in the form

$$H(z) = \sum_{n=0}^{M} h[n] z^{-n}$$

Now suppose that the coefficients $\{h[n]\}$ are quantized, resulting in a new set of coefficients $\{\hat{h}[n] = h[n] + \Delta h[n]\}$. The system function for the quantized system is then

$$\hat{H}(z) = \sum_{n=0}^{M} \hat{h}[n] z^{-n} = H(z) + \Delta H(z)$$

where

$$\Delta H(z) = \sum_{n=0}^{M} \Delta h[n] z^{-n}$$

Thus, the system function of the quantized system is linearly related to the quantization errors in the impulse response coefficients.

If the zeros of H(z) are tightly clustered, then their locations will be highly sensitive to quantization errors in the impulse response coefficients. The reason that the direct form FIR system is widely used is that for most linear phase FIR filters, the zeros are more or less uniformly spread in the z-plane.

Designing of FIR low pass filter using Parks-McClellan design technique

Pass band ripple: 0.99
Stop band ripple: 0.001
Pass band frequency: 1.2566
Stop band frequency: 1.885

Fig 2.17 FIR quantization example (a) log magnitude for unquantized case; approximation error for (b) unquantized case (c) 16 bit quantization []

Fig 2.17 (continued) Approximation error for (d) 14 bit quantization (e) 13 bit quantization (f) 8 bit quantization []

CONCLUSION

Finite word length is an inherent problem which occurs due to the finite bit representation of numbers in digital systems. The effects of finite word length include overflow in addition, limit cycles and round off noise in multiplication. We have seen the effect of coefficient quantization on filter response. We also conclude that the coupled form and parallel form structures of filter realization are more robust against finite word length effects than the direct form realization. Although, due to advances in technology, machines with 64 bit representation (which is almost infinite precision) are now available, finite word length effects still need to be considered because of the rise of embedded technology and a competitive market which demands low cost products.

REFERENCES

Oppenheim A. V. and Schafer R. W., Discrete-Time Signal Processing, Prentice-Hall.
John G. Proakis and Dimitris G. Manolakis, Digital Signal Processing, Prentice-Hall.
William H. Press, Saul A. Teukolsky, William T. Vetterling and Brian P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, Second Edition, Cambridge University Press.
Sanjit K. Mitra, Digital Signal Processing: A Computer-Based Approach, Second Edition, McGraw-Hill.
