FINITE PRECISION EFFECTS 1. FLOATING POINT VERSUS FIXED POINT 3. TYPES OF FINITE PRECISION EFFECTS 4. FACTORS INFLUENCING FINITE PRECISION EFFECTS

Similar documents
Analysis of Finite Wordlength Effects


INFINITE-IMPULSE RESPONSE DIGITAL FILTERS Classical analog filters and their conversion to digital filters 4. THE BUTTERWORTH ANALOG FILTER

DSP Design Lecture 2. Fredrik Edman.

ECE4270 Fundamentals of DSP Lecture 20. Fixed-Point Arithmetic in FIR and IIR Filters (part I) Overview of Lecture. Overflow. FIR Digital Filter

LINEAR-PHASE FIR FILTER DESIGN BY LINEAR PROGRAMMING

REAL TIME DIGITAL SIGNAL PROCESSING

DSP Configurations. responded with: thus the system function for this filter would be

DFT & Fast Fourier Transform PART-A. 7. Calculate the number of multiplications needed in the calculation of DFT and FFT with 64 point sequence.

Lecture 7 - IIR Filters

Optimum Ordering and Pole-Zero Pairing of the Cascade Form IIR. Digital Filter

Lecture 19 IIR Filters

Stability Condition in Terms of the Pole Locations

Digital Signal Processing

Lab 4: Quantization, Oversampling, and Noise Shaping

Numbering Systems. Computational Platforms. Scaling and Round-off Noise. Special Purpose. here that is dedicated architecture

APPLIED SIGNAL PROCESSING

Examination with solution suggestions SSY130 Applied Signal Processing

Chapter 7: IIR Filter Design Techniques

Optimum Ordering and Pole-Zero Pairing. Optimum Ordering and Pole-Zero Pairing Consider the scaled cascade structure shown below

DHANALAKSHMI COLLEGE OF ENGINEERING DEPARTMENT OF ELECTRICAL AND ELECTRONICS ENGINEERING EC2314- DIGITAL SIGNAL PROCESSING UNIT I INTRODUCTION PART A

EE482: Digital Signal Processing Applications

Other types of errors due to using a finite no. of bits: Round- off error due to rounding of products

E : Lecture 1 Introduction

Practical Considerations in Fixed-Point FIR Filter Implementations

DIGITAL SIGNAL PROCESSING UNIT III INFINITE IMPULSE RESPONSE DIGITAL FILTERS. 3.6 Design of Digital Filter using Digital to Digital

Lecture 7 Discrete Systems

UNIVERSITI SAINS MALAYSIA. EEE 512/4 Advanced Digital Signal and Image Processing

Exercises in Digital Signal Processing

Question Bank. UNIT 1 Part-A

Digital Filters Ying Sun

Digital Filter Implementation 1

Finite Word Length Effects and Quantisation Noise. Professors A G Constantinides & L R Arnaut

Lecture 11 FIR Filters

Digital Signal Processing Lecture 4

EE 521: Instrumentation and Measurements

Efficient algorithms for the design of finite impulse response digital filters

INF3440/INF4440. Design of digital filters

Digital Signal Processing Lecture 8 - Filter Design - IIR

Introduction to DSP Time Domain Representation of Signals and Systems

Discrete-time signals and systems

VALLIAMMAI ENGINEERING COLLEGE. SRM Nagar, Kattankulathur DEPARTMENT OF INFORMATION TECHNOLOGY. Academic Year

EE 313 Linear Signals & Systems (Fall 2018) Solution Set for Homework #7 on Infinite Impulse Response (IIR) Filters CORRECTED

Oversampling Converters

IT DIGITAL SIGNAL PROCESSING (2013 regulation) UNIT-1 SIGNALS AND SYSTEMS PART-A

Lecture 14: Windowing

UNIT 1. SIGNALS AND SYSTEM

Introduction to Digital Signal Processing

( ) John A. Quinn Lecture. ESE 531: Digital Signal Processing. Lecture Outline. Frequency Response of LTI System. Example: Zero on Real Axis

3 Finite Wordlength Effects

CHANNEL PROCESSING TWIN DETECTORS CHANNEL PROCESSING TWIN DETECTORS CHANNEL PROCESSING TWIN DETECTORS CHANNEL PROCESSING TWIN DETECTORS

Z - Transform. It offers the techniques for digital filter design and frequency analysis of digital signals.

ECE503: Digital Signal Processing Lecture 6

EECE 301 Signals & Systems Prof. Mark Fowler

/ (2π) X(e jω ) dω. 4. An 8 point sequence is given by x(n) = {2,2,2,2,1,1,1,1}. Compute 8 point DFT of x(n) by

EE482: Digital Signal Processing Applications

Digital Signal Processing

III. Time Domain Analysis of systems

LECTURE NOTES DIGITAL SIGNAL PROCESSING III B.TECH II SEMESTER (JNTUK R 13)

Digital Signal Processing Lecture 10 - Discrete Fourier Transform

Discrete Time Systems

Signals and Systems. Problem Set: The z-transform and DT Fourier Transform

Digital Signal Processing Lecture 9 - Design of Digital Filters - FIR

Roundoff Noise in Digital Feedback Control Systems

Chap 2. Discrete-Time Signals and Systems

PS403 - Digital Signal processing

Let H(z) = P(z)/Q(z) be the system function of a rational form. Let us represent both P(z) and Q(z) as polynomials of z (not z -1 )

Filter Analysis and Design

UNIVERSITY OF OSLO. Please make sure that your copy of the problem set is complete before you attempt to answer anything.

Signal Processing. Lecture 10: FIR Filter Design. Ahmet Taha Koru, Ph. D. Yildiz Technical University Fall

Vel Tech High Tech Dr.Ranagarajan Dr.Sakunthala Engineering College Department of ECE

ELEN E4810: Digital Signal Processing Topic 11: Continuous Signals. 1. Sampling and Reconstruction 2. Quantization

DSP IC, Solutions. The pseudo-power entering into the adaptor is: 2 b 2 2 ) (a 2. Simple, but long and tedious simplification, yields p = 0.

Recursive Gaussian filters

-Digital Signal Processing- FIR Filter Design. Lecture May-16

Second and Higher-Order Delta-Sigma Modulators

EE 354 Fall 2013 Lecture 10 The Sampling Process and Evaluation of Difference Equations

Lecture 18: Stability

MULTIRATE SYSTEMS. work load, lower filter order, lower coefficient sensitivity and noise,

Recursive, Infinite Impulse Response (IIR) Digital Filters:

A DESIGN OF FIR FILTERS WITH VARIABLE NOTCHES CONSIDERING REDUCTION METHOD OF POLYNOMIAL COEFFICIENTS FOR REAL-TIME SIGNAL PROCESSING

Convolution. Define a mathematical operation on discrete-time signals called convolution, represented by *. Given two discrete-time signals x 1, x 2,

Problem Set 9 Solutions

Electronic Circuits EE359A

Analog vs. discrete signals

Higher-Order Σ Modulators and the Σ Toolbox

ECSE 512 Digital Signal Processing I Fall 2010 FINAL EXAMINATION

ECE503: Digital Signal Processing Lecture 5

Discrete-Time Systems

Topic 7: Filter types and structures

Universiti Malaysia Perlis EKT430: DIGITAL SIGNAL PROCESSING LAB ASSIGNMENT 3: DISCRETE TIME SYSTEM IN TIME DOMAIN

Determining Appropriate Precisions for Signals in Fixed-Point IIR Filters

Digital Signal Processing 2/ Advanced Digital Signal Processing Lecture 3, SNR, non-linear Quantisation Gerald Schuller, TU Ilmenau

Z Transform (Part - II)

George Mason University ECE 201: Introduction to Signal Analysis Spring 2017

The information loss in quantization

Digital Signal Processing. Midterm 1 Solution

University of Illinois at Urbana-Champaign ECE 310: Digital Signal Processing

DIGITAL SIGNAL PROCESSING. Chapter 6 IIR Filter Design

LAB 6: FIR Filter Design Summer 2011

Transcription:

FINITE PRECISION EFFECTS 1. FLOATING POINT VERSUS FIXED POINT 2. WHEN IS FIXED POINT NEEDED? 3. TYPES OF FINITE PRECISION EFFECTS 4. FACTORS INFLUENCING FINITE PRECISION EFFECTS 5. FINITE PRECISION EFFECTS: FIR VERSUS IIR FILTERS 6. DIRECT FIR FILTER STRUCTURE 7. CASCADE FIR FILTER STRUCTURE 8. STRUCTURES WITH QUANTIZERS 9. THE OVERFLOW PROBLEM 10. TIME-DOMAIN SCALING 11. FREQUENCY-DOMAIN SCALING 12. OVERFLOW OF INTERNAL SIGNALS 13. QUANTIZATION OF FILTER COEFFICIENTS 14. COEFFICIENT QUANTIZATION: EXAMPLE 15. SIGNAL QUANTIZATION 16. QUANTIZATION ERROR POWER 17. EXPRESSING σe 2 IN TERMS OF THE NUMBER OF BITS 18. SIGNAL TO QUANTIZATION NOISE RATIO (SNR) 19. SIGNAL QUANTIZATION I. Selesnick EL 713 Lecture Notes 1

FLOATING POINT VERSUS FIXED POINT Signals can not be represented exactly on a computer, but only with finite precision. In floating point arithmetic, the finite precision errors are generally not a problem. However, with fixed point arithmetic, the finite word length causes several problems. For this reason, a floating point implementation is preferred when it is feasible. Because of finite precision effects, fixed point implementation is used only when 1. speed, 2. power, 3. size, and 4. cost are very important. I. Selesnick EL 713 Lecture Notes 2

WHEN IS FIXED POINT NEEDED? From Matlab documentation: When you have enough power to run a floating point DSP, such as on a desktop PC or in your car, fixed-point processing and filtering is unnecessary. But, when your filter needs to run in a cellular phone, or you want to run a hearing aid for hours instead of seconds, fixed-point processing can be essential to ensure long battery life and small size. Many real-world DSP systems require that filters use minimum power, generate minimum heat, etc. Meeting these requirements often means using a fixed-point implementation. I. Selesnick EL 713 Lecture Notes 3

TYPES OF FINITE PRECISION EFFECTS Overflow Quantization of filter coefficients Signal quantization 1. Analog/Digital conversion (A/D converter) 2. Round-off noise 3. Limit cycles (recursive filters only) FACTORS INFLUENCING FINITE PRECISION EFFECTS 1. The structure used for implementation (Direct, transpose, etc) 2. The word-length and data-type (fixed-point, floating-point, etc) (1-s complement, 2-s complement, etc) 3. The multiplication-type (rounding, truncation, etc) I. Selesnick EL 713 Lecture Notes 4

FINITE PRECISION EFFECT: FIR VERSUS IIR FILTERS Remarks on FIR filters: 1. Filter coefficient quantization: coefficient quantization is quite serious for FIR filters due to the typically large dynamic range of coefficients of FIR filters. 2. Round-off noise: Round-off noise is not as serious for FIR filters as it is for IIR filters. 3. Limit cycles: Non-recursive (FIR) filters do not have limit cycles. 4. For FIR filters, the direct form is generally preferred. Remarks on IIR filters: 1. Filter coefficient quantization: coefficient quantization can make a stable IIR filter unstable! (The implementation of an IIR filters using a cascade of second order sections prevents that.) 2. Round-off noise: For a cascade of second order sections the round-off noise depends on (a) the poles-zero pairing, (b) the ordering of the sections. I. Selesnick EL 713 Lecture Notes 5

DIRECT FIR FILTER STRUCTURE To clarify the forgoing discussion on the effects of fixed-point arithmetic, it will be convenient to refer a couple of different structures for the implementation of FIR filters. Fig 1 This structure implements an FIR filter of length 4. The output y(n) is given by y(n) = h(0) x(n) + h(1) x(n 1) + h(2) x(n 2) + h(3) x(n 3) and so the impulse response is h(n) = h(0) δ(n) + h(1) δ(n 1) + h(2) δ(n 2) + h(3) δ(n 3) and the transfer function is given by H(z) = h(0) + h(1) z 1 + h(2) z 2 + h(3) z 3. This is called a direct structure because the transfer function parameters appear as multipliers in the structure itself. I. Selesnick EL 713 Lecture Notes 6

CASCADE FIR FILTER STRUCTURE If the FIR transfer function H(z) is factored as H(z) = H 1 (z) H 2 (z) then the FIR filter can be implemented as a cascade of two shorter FIR filters. If each of the shorter filters are implemented with the direct form, then the total flowgraph for the cascade structure is as shown in the following figure. Fig 2 If both structures are implemented with infinite precision, then they will implemented exactly the same transfer function. However, when they are implemented with finite-precision arithmetic, they will have different properties. Some structures are more immune than other structures when fixed-point precision is used. I. Selesnick EL 713 Lecture Notes 7

STRUCTURES WITH QUANTIZERS In fixed-point implementations 1. The multipliers are quantized. 2. The input, output, and internal signals are quantized. Both types of quantization introduce some error. If the binary numbers X and Y are each B bits long, then the product Z = X Y is 2B 1 bits long. Therefore, the multiplications occurring in a digital filter with fixed point arithmetic must be truncated down to the word length used. (This is represented by the Q block in the following diagrams.) On the other hand, the addition of two-fixed point numbers does not lead to a loss of precision. In some (old) processors, signal quantization occurs after every multiplier. For example, the FIR direct structure can be illustrated as shown. Fig 3 In most processors today, the quantizer Q comes after the multiplysum operation. This is the multiply-accumulate, or MAC, command. The following figure illustrates this. Fig 4 Later, we will see how to analyze the degradation of accuracy caused by the quantizer. In processors where the quantizer comes after every multiplier, it can be seen that there are many more quantizations than when the quantizer comes after the multiply-sum operation. We will assume from now that the quantizer comes after the sum. I. Selesnick EL 713 Lecture Notes 8

OVERFLOW The range of values of a B-bit binary number is limited. If a digital filter is not properly implemented, overflow may occur when the output or internal signals are larger than the maximum value possible. To avoid overflow, it may be necessary to scale down the input signal. If the signal is not scaled appropriately, many overflows will occur. If the signal is scaled down too much, then the the full available dynamic range is not utilized and the ratio of the signal power to the round-off noise (SNR) will be degraded. Scaling is usually 1. absorbed in the other multipliers (so it does not introduce further round-off noise), or 2. by 2 k so it can be implemented by a bit shift (to minimize complexity) I. Selesnick EL 713 Lecture Notes 9

TIME-DOMAIN SCALING Suppose the signals x(n) and y(n) can only be represented in the range x(n) x max y(n) y max We want to implement a scaled version of the filter H(z) and we want to choose the scaling constant c so as to maximize the dynamic range of the output, while at the same time avoiding y(n) from overflowing (exceeding y max ). x(n) c H(z) y(n) Problem: find the maximum value of c such that y(n) y max. You can assume that x(n) x max. The solution begins with the convolutional sum. y(n) = k c h(k) x(n k) y(n) c k h(k) x(n k) c h(k) x max k c x max h(k) c x max h(n) 1 where the 1-norm of h(n) is defined as k h(n) 1 := k h(k). I. Selesnick EL 713 Lecture Notes 10

So gives y(n) c x max h(n) 1 y max c y max x max 1 h(n) 1 This is the time-domain 1-norm scaling rule. In practice, c is often 1. rounded downward to 1/2 l (so it can be implemented with a bit shift), or 2. absorbed into existing multipliers in the filter structure. The value c = y max 1 x max h(n) 1 is often overly conservative. It guarantees that overflow will never occur, however, it many cases, it is acceptable to allow overflow to occur as long as it is rare, and to use a larger value of c so that the full dynamic range is better utilized. I. Selesnick EL 713 Lecture Notes 11

FREQUENCY-DOMAIN SCALING Another approach to scaling for overflow control is derived in the frequency domain. y(n) = IDTFT { Y f (ω) } = IDTFT { c H f (ω) X f (ω) } = 1 2π π π c H f (ω) X f (ω) e jnω dω Because the absolute value of an integral is smaller than the integral of the absolute value, we can write y(n) c 2π π π c 2π Xf H f (ω) X f (ω) dω π π H f (ω) dω where the frequency-domain infinity-norm is defined as X f := max X f (ω). ω Defining the frequency-domain 1-norm as we can write H f 1 := 1 2π π π H f (ω) dω, y(n) c X f H f 1 To guarantee that no overflows occur, we have the condition or y(n) c X f H f 1 y max c y max X f 1 H f 1 I. Selesnick EL 713 Lecture Notes 12

This is the frequency-domain 1-norm scaling rule. Note that it requires an estimate of X f. In the development of this scaling rule, we could have swapped the role of X f (ω) and H f (ω). Then we would obtain the following scaling rule. c y max X f 1 1 H f This is the frequency-domain infinity-norm scaling rule. I. Selesnick EL 713 Lecture Notes 13

OVERFLOW OF INTERNAL SIGNALS In addition to ensuring that the output signal rarely overflows, it is also necessary to ensure that the internal signals rarely overflow as well. For example, consider the cascade of two FIR filters. x(n) H 1 (z) H 2 (z) y(n) Both H 1 (z) and H 2 (z) must be scaled. We then have the following diagram. x(n) c 1 H 1 (z) c 2 H 2 (z) y(n) It is necessary to choose the two scaling constants c 1 and c 2 so that overflow of both the intermediate signal and the output signal y(n) are rare. If we call the intermediate (internal) signal v(n), then 1. Choose c 1 first, so that v(n) does not overflow. 2. Choose c 2 second, so that y(n) does not overflow. If there are more than one internal signal which may overflow, then begin with the first internal signal and proceed through the filter structure. I. Selesnick EL 713 Lecture Notes 14

QUANTIZATION OF FILTER COEFFICIENTS For a fixed point implementation of a digital filter the filter coefficients must be quantized to binary numbers. This changes the actual system to be implemented and usually results in a degradation of the frequency response. The quantization of floating point values can be performed in Matlab with the round command: y = q*round(x/q); where q is the quantization step. COEFFICIENT QUANTIZATION: EXAMPLE If floating-point (or infinite precision) is used for h(n), then the minimal length TYPE-I FIR filter satisfying with 1 δ p A(ω) 1 + δ p ω 0.32π A(ω) δ p δ p = 0.02, 0.37π ω π δ s = 50 db is of length N = 83. (See the figures on the next page.) However, if a fixed-point filter h(n) is used by quantizing h(n), then the degraded frequency response does not meet the specifications. A longer impulse response is needed. (See the next page.) I. Selesnick EL 713 Lecture Notes 15

COEFFICIENT QUANTIZATION: EXAMPLE 10 0 10 20 30 40 50 60 70 H f (ω) (N = 83, FLOATING POINT) 80 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 10 0 10 20 30 40 50 60 70 H f (ω) (N = 83, 12 BITS) 80 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 10 0 10 20 30 40 50 60 70 H f (ω) (N = 93, 12 BITS) 80 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 I. Selesnick EL 713 Lecture Notes 16

This example was produced with the following Matlab code. N = 83; Dp = 0.02; % desired passband error Ds = 10^(-50/20); % desired stopband error (50dB) Kp = 1/Dp; Ks = 1/Ds; wp = 0.32*pi; ws = 0.37*pi; wo = (wp+ws)/2; L = 1000; w = [0:L]*pi/L; W = Kp*(w<=wp) + Ks*(w>=ws); D = (w<=wo); [h,del] = fircheb(n,d,w); dp = del/kp; % actual passband error ds = del/ks; % actual stopband error [H,w] = freqz(h); subplot(3,1,1) plot(w/pi,20*log10(abs(h)),[0 1],20*log10(Ds)*[1 1], -- ) axis([0 1-80 10]) title( H^f(\omega) (N = 83, FLOATING POINT) ) b = 12; % number of bits used for fixed-point word-length Q = 1/2^b; % quantization step hq = Q*round(h/Q); % quantized coefficients [Hq,w] = freqz(hq); % frequency response of quantized coefficients subplot(3,1,2) plot(w/pi,20*log10(abs(hq)),[0 1],20*log10(Ds)*[1 1], -- ) axis([0 1-80 10]) title( H^f(\omega) (N = 83, 12 BITS) ) N = 93; % increase the filter length [h,del] = fircheb(n,d,w); % redesign with longer filter length hq = Q*round(h/Q); % quantize longer impulse response [Hq,w] = freqz(hq); subplot(3,1,3) plot(w/pi,20*log10(abs(hq)),[0 1],20*log10(Ds)*[1 1], -- ) axis([0 1-80 10]) title( H^f(\omega) (N = 93, 12 BITS) ) I. Selesnick EL 713 Lecture Notes 17

SIGNAL QUANTIZATION Quantizing a signal is a memoryless nonlinear operation. UNIFORM QUANTIZER CHARACTERISTIC 1 0.8 0.6 0.4 0.2 Q[x] 0 0.2 0.4 0.6 0.8 1 1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 x I. Selesnick EL 713 Lecture Notes 18

SIGNAL QUANTIZATION If Q[x(n)] is the quantized version of x(n), then we can define the quantization error signal e(n) as Then we can write e(n) := Q[x(n)] x(n) Q[x(n)] = x(n) + e(n) Under many realistic operating conditions, the quantization error signal e(n) will appear to fluctuate randomly from sample to sample. So even though the quantization is totally deterministic, it can be modeled as a random variable. Specifically, under the following conditions, or assumptions, the quantization error signal can be accurately modeled as random signal which is (1) uncorrelated from sample to sample and (2) uncorrelated from the true signal x(n). 1. The quantization step is small compared to the signal amplitude. 2. Overflow is negligible. 3. The signal changes rapidly enough from sample to sample. We will assume that a uniform quantizer is used (all quantization steps are equal). If the quantization step is Q, then the quantization error signal will vary between Q/2 and Q/2, Q 2 e(n) Q 2 In addition, if all values between Q/2 and Q/2 are equally likely, then the probability distribution function (PDF) p(e) that describes e(n) is uniform between Q/2 and Q/2. Fig 5: Uniform PDF I. Selesnick EL 713 Lecture Notes 19

QUANTIZATION ERROR: MATLAB EXAMPLE L = 10000; x = 2*rand(L,1)-1; % original signal x(n), with -1 < x(n) < 1 Rfs = 2; b = 8; Q = Rfs/2^b; y = Q*round(x/Q); e = x - y; % Rfs: full scale range % number of bits % quantization step % quantized signal y(n) % quantization error signal e(n) figure(1) subplot(2,1,1) stem(e(1:100),. ) title( QUANTIZATION ERROR SIGNAL ) subplot(2,1,2) hist(e,10) % compute histogram (with 10 bins) colormap([1 1 1]*1) title( QUANTIZATION ERROR HISTOGRAM ) axis([-q Q 0 2*L/10]) print -deps sig_quant 4 x 10 3 QUANTIZATION ERROR SIGNAL 2 0 2 4 0 10 20 30 40 50 60 70 80 90 100 2000 QUANTIZATION ERROR HISTOGRAM 1500 1000 500 0 6 4 2 0 2 4 6 x 10 3 I. Selesnick EL 713 Lecture Notes 20

QUANTIZATION ERROR POWER The variance of e(n) is used to quantify how bad the quantization error is. As e(n) is zero mean, the variance of e(n) is just given by the average value of e 2 (n). σ 2 e := VAR{e(n)} = E{e 2 (n)} E{e 2 (n)} is the called the power of the random signal e(n). We can compute the average value of e 2 (n) by the formula E{e 2 (n)} = = Q/2 Q/2 Q/2 Q/2 e 2 p(e) de e 2 1 Q de = 1 Q e3 e=q/2 3 e= Q/2 [ (Q = 1 ) 3 ( Q ) ] 3 3 Q 2 2 = 1 [ ] Q 3 3 Q 8 + Q3 8 [ ] = Q2 2 3 8 = Q2 12 σ 2 e = Q2 12 I. Selesnick EL 713 Lecture Notes 21

QUANTIZATION ERROR POWER The following Matlab program numerically checks the correctness of the expression σ 2 e = Q2 12. L = 10000; x = 2*rand(L,1)-1; Rfs = 2; b = 8; Q = Rfs/2^b; y = Q*round(x/Q); e = x - y; errvar_meas = mean(e.^2) errvar_model = Q^2/12 % original signal x(n) % with -1 < x(n) < 1 % Rfs: full scale range % number of bits % quantization step % quantized signal y(n) % quantization error signal e(n) % E{e(n)^2} : measured value % E{e(n)^2} : value from model We got the following result when we ran this program, which verifies the formula for σ 2 e. errvar_meas = 5.1035e-06 errvar_model = 5.0863e-06 Note that each time we run the program, we will get a different value for errvar meas because the simulation begins with a randomly genegerated signal x(n). I. Selesnick EL 713 Lecture Notes 22

EXPRESSING σ 2 e IN TERMS OF THE NUMBER OF BITS Lets find and expression for σ 2 e in terms of the number of bits used. Let R fs denote the full-scale range of the quantizer. That is the range from the least value to the greatest value. If b bits are used to represent x(n), then there are 2 b quantization levels. Therefore 2 b Q = R fs and and Q = R fs 2 b = 2 b R fs σe 2 = Q2 12 = 2 2b Rfs 2 12 σ 2 e = 2 2b R 2 fs 12 I. Selesnick EL 713 Lecture Notes 23

SIGNAL TO QUANTIZATION NOISE RATIO (SNR) The signal to quantization noise ratio (SNR OR SQRN) is defined as where SNR := 10 log 10 σ 2 x σ 2 e = 20 log 10 σ x σ e σ 2 x := VAR{x(n)} σ 2 e := VAR{e(n)} in db It will be interesting to examine how the SNR depends on the number of bits b. First, we can write Using σ 2 e = 2 2b R 2 fs 12 or SNR = 10 log 10 σ 2 x 10 log 10 σ 2 e. we have SNR = 10 log 10 σ 2 x 10 log 10 ( 2 2b R 2 fs 12 ( R 2 = 10 log 10 σx 2 10 log 10 2 2b fs 10 log 10 12 ) = 10 log 10 σ 2 x + 20 b log 10 2 10 log 10 ( R 2 fs 12 SNR = C + 20 b log 10 2 where C does not depend on b. As 20 log 10 2 = 6.02, we have ) ) SNR = C + 6.02 b That is the well known result: for each additional bit, the SNR is improved by 6 db. I. Selesnick EL 713 Lecture Notes 24