Digital Communications III (ECE 154C) Introduction to Coding and Information Theory Tara Javidi These lecture notes were originally developed by late Prof. J. K. Wolf. UC San Diego Spring 2014 1 / 26
Lossy Coding Distortion Coding With Distortion 2 / 26
Coding with Distortion Lossy Coding Distortion Placeholder Figure A ǫ 2 = lim T 1 T T/2 T/2 E(x(t) ˆx(t)) 2 dt = M.S.E. 3 / 26
Discrete-Time Signals Lossy Coding Distortion If signals are bandlimited, one can sample at nyquist rate and convert continuous-time problem to discrete-time problem. This sampling is part of the A/D converter. Placeholder Figure A ǫ 2 = lim m 1 m m E(x i ˆx i ) 2 i=1 4 / 26
A/D A/D Conversion and D/A Conversion 5 / 26
A/D Conversion A/D Assume a random variablex which falls into the range (X min,x max ). The goal is forx to be converted intok binary digits. Let M = 2 k. The usual A/D converter first subdivides the interval (X min,x max ) intom equal sub-intervals. Subintervals are of width = (X max X min )/M. Shown below for the case ofk = 3 andm = 8. Placeholder Figure A We call thei th sub-interval,r i. 6 / 26
D/A Conversion A/D Assume that ifx falls in the regionr i, i.e.x R i converter uses as an estimate ofx, the value ˆX = Y which is the center of thei th region. The mean-squared error betweenx and ˆX is ǫ 2 = E[(X ˆX) 2 ] = Xmax X min (X ˆX) 2 f x (x)dx wheref x (x) is the probability density function of the random variablex. Letf X Ri (x) be the conditional density function ofx given that X falls in the regionr i. Then ǫ 2 = M P[x R i ] (x y i ) 2 f x Ri (x)dx x R i i=1 7 / 26
D/A Conversion A/D Note that for i = 1,2,...,M M P[x R i ] = i=1 and x R i f X Ri (x)dx = 1. Make the further assumption thatk is large enough so that f X Ri (x) is a constant over the regionr i. Thenf X Ri (x) = 1 for alli, and x R i (x y i ) 2 f X Ri (x)dx = 1 = 1 b a 2 2 = 1 2 3 (x ( b a 2 ))2 dx (x 0) 2 dx ( ) 3 = 2 2 12 8 / 26
D/A Conversion A/D In other words,ǫ 2 = M i=1 P[x R i ] 2 12 = 2 12 IfX has varianceσx 2, the signal-to-noise ratio of the A to the D (& D to A) converter is often defined as ( σ 2 x/ 2 12 IfX min is equal to and/orx max = +, then the last and first intervals can be infinite in extent. Howeverf x (x) is usually small enough in those intervals so that the result is still approximately the same. ) Placeholder Figure A 9 / 26
Quantization Quantizer Optimal Quantizer Iterative Solution SCALAR QUANTIZATION of GAUSSIAN SAMPLES 10 / 26
Quantization Quantizer Optimal Quantizer Iterative Solution ENCODER: Placeholder Figure A x 3b 000 0 < x < b 100 3b < x 2b 001 b < x 2b 101 2b x b 010 2b < x 3b 110 b x 0 011 3b < x 111 DECODER: 000 3.5b 100 +.5b 001 2.5b 101 +1.5b 010 1.5b 110 +2.5b 011.5b 111 +3.5b 11 / 26
Optimum Scalar Quantizer Quantization Quantizer Optimal Quantizer Iterative Solution Let us construct boundariesb i,(b 0 =,b M = + and quantization symbolsa i such that b i 1 x < b i ˆx = a i i = 1,2,...,M Placeholder Figure A The question is how to optimize{b i } and{a i } to minimize distortionǫ 2 ǫ 2 = M i=1 bi b i 1 (x a i ) 2 f x (x)dx 12 / 26
Optimum Scalar Quantizer Quantization Quantizer Optimal Quantizer Iterative Solution To optimizeǫ 2 = M i=1 bi b i 1 (x a i ) 2 f x (x)dx we take derivatives and putting them equal to zero: δǫ 2 δa j = 0 And use Leibnitz s Rule: δǫ 2 δb j = 0 δ δt b(t) a(t) f(x,t)dx =f(b(t),t) δb(t) δt f(a(t),t) δa(t) δt b(t) δ + δt f(x,t)dt a(t) 13 / 26
Optimum Scalar Quantizer Quantization Quantizer Optimal Quantizer Iterative Solution δ δb j bj δ δb j ( M i=1 bi b j 1 (x a j ) 2 f x (x)dx+ δ δb j b i 1 (x a i ) 2 f x (x)dx bj+1 b j ) = (x a j+1 ) 2 f x (x)dx = (b j a j ) 2 f x (x) x=bj (b j a j+1 ) 2 f x (x) x=bj = 0 b 2 j 2a j b j +a 2 j = b 2 j 2b j a j+1 +a 2 j+1 2b j (a j+1 a j ) = a 2 j+1 a 2 j b j = a j+1+a j 2 (I) 14 / 26
Optimum Scalar Quantizer Quantization Quantizer Optimal Quantizer Iterative Solution δ δa j ( M i=1 b i 1 b i (x a i ) 2 f x (x)dx a j bj a j = b j 1 f x (x)dx = ) bj b j b j 1 xf x (x)dx b j b j 1 f x (x)dx = 2 bj b j 1 xf x (x)dx b j 1 (x a j )f x (x)dx = (II) 15 / 26
Optimum Scalar Quantizer Quantization Quantizer Optimal Quantizer Iterative Solution Note that the{b i } can be found from (I) once the{a i } is known. In fact, the{b i } are the midpoints of the{a i }. The{a i } can also be solved from (II) once the{b i } are known. The{a i } are centroids of the corresponding regions. Thus one can use a computer to iteratively solve for the{a i } and the{b i } 1. One starts with an initial guess for the{b i }. 2. One uses (II) to solve for the{a i }. 3. One uses (I) to solve for the{b i }. 4. One repeats steps 2 and 3 until the{a i } and the{b i } stop changing. 16 / 26
Comments on Optimum Scalar Quantizer Quantization Quantizer Optimal Quantizer Iterative Solution 1. This works for anyf x (x) 2. Iff x (x) only has a finite support one adjustsb 0 &b M to be the limits of the support. 3. One needs to know Placeholder Figure β f x (x)dx and α A β xf x (x)dx (true for anyf x (x)) 4. For a Gaussian, we can integrate by parts or lety = x 2 α β α 1 β e 1 2 x2 dx = Q(β) Q(α) 2π α x 1 2π e 1 2 x2 dx =... 17 / 26
Comments on Optimum Scalar Quantizer Quantization Quantizer Optimal Quantizer Iterative Solution 5. IfM = 2 a one could useabinary digits to represent the quantized value. However since the quantized values are not necessarily equally likely, one could use a HUFFMAN CODE to use fewer binary digits(on the average). 6. After the{a i } and{b i } are known, one computesǫ 2 from M ǫ 2 = i=1 bi b i 1 (x a i ) 2 f x (x)dx 7. ForM = 2 andf x (x) = 1 2πσ 2 e 1 2 x 2 σ 2 we have b 0 =,b 1 = 0,b 2 = +, anda 2 = a 1 = 2σ 2 π 8. Also easy to show thatǫ 2 = (1 2 π )σ2 =.3634σ 2. 18 / 26
Quantization Gaussain DMS Gaussain DMS 19 / 26
Quantization Gaussain DMS Gaussain DMS One can achieve a smallerǫ 2 by quantizing several samples at a time. We would then use regions in anm-dimensional space Placeholder Figure A Shannon characterized this in terms of rate-distortion formula which tells us how smallǫ 2 can be (m ). For a Gaussian source with one binary digit per sample, ǫ 2 σ2 4 = 0.25σ2 This follows from the result on the next page. Contrast this with scalar case: ǫ 2 s = (1 2 π )σ2 =.3634σ 2. 20 / 26
VQ: Discrete Memoryless Gaussian Source Quantization Gaussain DMS Gaussain DMS Let source produce i.i.d. Gaussian samplesx 1,X 2,... where f X (x) = 1 2πσ 2 e 1 2 Let the source encoder produce a sequence of binary digits at a rate ofrbinary digits/source symbol. x 2 σ 2 In our previous terminologyr = logm Let the source decoder produce the sequence ˆX 1, ˆX 2,..., ˆX i,... such that the mean-squared error between{x i } and{ ˆX i } isǫ 2. ǫ 2 = 1 n n E{(X i ˆX i ) 2 } i=1 21 / 26
VQ: Discrete Memoryless Gaussian Source Quantization Gaussain DMS Gaussain DMS Then one can prove that for any such system R 1 2 log 2( σ2 ǫ 2 ) for ǫ2 σ 2 Note thatr = 0 forǫ 2 σ 2. What does this mean? Note that forr = logm = 1, 1 1 2 log 2( σ2 ǫ 2 ) 2 log 2( σ2 ǫ 2 ) 4 σ2 ǫ 2 ǫ2 (1/4)σ 2 This is an example of Rate-Distortion Theory. 22 / 26
MP3 CD MPEG-1 Layer 3 Reduced Fidelity Audio Compression 23 / 26
Reduced Fidelity MP3 players use a form of audio compression called MPEG-1 Audio Layer 3. MP3 CD MPEG-1 Layer 3 It takes advantage of a psycho-acoustic phenomena whereby a loud tone at one frequency masks the presence of softer tones at neighboring frequencies; hence, these softer neighbouring tones need not be stored(or transmitted). Compression efficiency of an audio compression scheme is usually described by the encoded bit rate (prior to the introduction of coding bits.) 24 / 26
Reduced Fidelity MP3 CD MPEG-1 Layer 3 The CD has a bit rate of(44.1 10 3 2 16) = 1.41 10 6 bits/second. The term44.1 10 3 is the sampling rate which is approximately the Nyquist frequency of the audio to be compressed. The term2comes from the fact that there are two channels in a stereo audio system. The term16 comes from the 16-bit (or2 16 = 65536 level) A to D converter. Note that a slightly higher sampling rate48 10 3 samples/second is used for a DAT recorder. 25 / 26
Reduced Fidelity MP3 CD MPEG-1 Layer 3 Different standards are used in MP3 players. Several bit rates are specified in the MPEG-1, Layer 3 standard. These are32,40,48,56,64,80,96,112,128,144,160, 192, 224, 256 and 320 kilobits/sec. The sampling rates allowed are32,44.1 and48 kilohz but the sampling rate of44.1 10 3 Hz is almost always used. The basic idea behind the scheme is as follows. A block of 576 time domain samples are converted into 576 frequency domain samples using a DFT. The coefficients then modified using psycho-acoustic principles. The processed coefficients are then converted into a bit stream using various schemes including Huffman Encoding. The process is reversed at the receiver: bits frequency domain coefficients time domain samples. 26 / 26