EE5585 Data Compression, April 18, 2013. Lecture 23.

Instructor: Arya Mazumdar. Scribe: Trevor Webster.

Differential Encoding

Suppose we have a signal that is slowly varying. For instance, if we were looking at a video frame by frame, we would see that only a few pixels change between subsequent frames. In this case, rather than encoding the signal as is, we would first sample it, form the difference signal, and encode that instead:

d_n = x_n - x_{n-1}

This d_n would then need to be quantized, creating an estimate d̂_n that contains some quantization noise q_n:

Q(d_n) = d̂_n,    d̂_n = d_n + q_n

What we are truly interested in recovering are the values of x. Unfortunately, the quantization error gets accumulated in the reconstructed values of x. The reason is that the operation forming the d_n has the following matrix form:

    [  1   0   0   ⋯   0 ]
    [ -1   1   0   ⋯   0 ]
    [  0  -1   1   ⋯   0 ]
    [  ⋮       ⋱   ⋱   ⋮ ]
    [  0   0   ⋯  -1   1 ]

Inverting this matrix, we obtain the accumulator matrix

    [ 1  0  0  ⋯  0 ]
    [ 1  1  0  ⋯  0 ]
    [ 1  1  1  ⋯  0 ]
    [ ⋮        ⋱  ⋮ ]
    [ 1  1  1  ⋯  1 ]
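As a rough illustration, here is a minimal Python sketch of this naive open-loop scheme. The uniform quantizer, its step size, and the slowly varying test signal are arbitrary choices for illustration, not anything specified in the lecture; the point is that the quantization errors accumulate in the reconstruction.

```python
import numpy as np

# Naive differential coding sketch: quantize d_n = x_n - x_{n-1} with a
# uniform quantizer and reconstruct by accumulating the quantized
# differences. Step size and test signal are arbitrary illustrative choices.

def quantize(v, step):
    """Uniform mid-tread quantizer with the given step size."""
    return step * np.round(v / step)

rng = np.random.default_rng(0)
x = np.cumsum(0.01 * rng.standard_normal(1000))   # slowly varying signal

step = 0.02
d = np.diff(x, prepend=0.0)        # d_0 = x_0, d_n = x_n - x_{n-1}
d_hat = quantize(d, step)          # d̂_n = d_n + q_n
x_hat = np.cumsum(d_hat)           # decoder: accumulate the differences

# The reconstruction error is the running sum of the q_n, so with high
# probability it drifts well beyond the per-sample quantizer error step/2.
err = x_hat - x
print("per-step bound:", step / 2, " max |accumulated error|:", np.abs(err).max())
```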

If we make an error in the d_n, the noise in x_n accumulates as follows:

x̂_n = x_0 + Σ_{i=1}^{n} q_i

The quantization noise q_n can be positive or negative, and in the long term we expect it to add up to zero, but what happens with high probability is that the range of the error becomes too large for us to handle. Therefore, we adopt the following strategy:

d_n = x_n - x̂_{n-1}

We can then implement this technique using the encoder shown below, and the corresponding decoder has the matching structure. This encoder/decoder scheme is governed by the relationship introduced previously; rearranged, it reads

x̂_n = d̂_n + x̂_{n-1}

Walking through an example using this scheme, we have the following sequence:

x̂_0 = x_0                    (1)
x̂_1 = d̂_1 + x̂_0              (2)
    = d_1 + q_1 + x̂_0        (3)
    = x_1 + q_1               (4)
x̂_2 = d̂_2 + x̂_1              (5)
    = d_2 + q_2 + x̂_1        (6)
    = x_2 + q_2               (7)
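A matching sketch of the closed-loop strategy d_n = x_n - x̂_{n-1}, using the same illustrative quantizer and test signal as before, shows that the reconstruction error now stays within a single quantization step.

```python
import numpy as np

# Closed-loop differential coding sketch: the encoder quantizes the
# difference to its OWN previous reconstruction, so the decoder can track
# it exactly and the error never exceeds one quantization step.

def quantize(v, step):
    return step * np.round(v / step)

rng = np.random.default_rng(0)
x = np.cumsum(0.01 * rng.standard_normal(1000))   # slowly varying signal

step = 0.02
x_hat = np.zeros_like(x)
prev = 0.0                       # x̂_{-1}, shared initial state
for n in range(len(x)):
    d = x[n] - prev              # d_n = x_n - x̂_{n-1}
    d_hat = quantize(d, step)    # d̂_n = d_n + q_n
    x_hat[n] = prev + d_hat      # x̂_n = d̂_n + x̂_{n-1} (decoder does the same)
    prev = x_hat[n]

# x̂_n = x_n + q_n, so the error is bounded by step/2 and does not drift.
print("max |error|:", np.abs(x_hat - x).max(), "<= step/2 =", step / 2)
```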

We see that in general

x̂_n = d̂_n + x̂_{n-1} = d_n + q_n + x̂_{n-1} = x_n + q_n

Notice that while the quantization noise q_n was originally accumulated in x_n, by adopting the aforementioned strategy we have made each estimate of x_n depend only on its own quantization error. This type of method is known as a Differential Pulse Coded Modulation (DPCM) scheme.

Differential Pulse Coded Modulation Scheme

Generalizing the method described above, we use a predicted value of x_n, denoted by p_n, to construct the difference sequence, which is then quantized and used to reconstruct the current value of x_n. Furthermore, the reconstructed values are operated on by a predictor (P), which provides an estimate of x_n from the previous reconstructions so that the difference signal can be formed recursively. This is shown in the respective encoder and decoder figures. Formalizing this procedure, we write

p_n = f(x̂_{n-1}, x̂_{n-2}, x̂_{n-3}, ..., x̂_0)

d_n = x_n - p_n = x_n - f(x̂_{n-1}, x̂_{n-2}, x̂_{n-3}, ..., x̂_0)

We hope that the values of d_n are much smaller than the values of x_n, since d_n is the difference signal of a slowly varying signal. So if we look at the energy in d_n, we expect it to be much smaller than the energy in x_n. The energy of a signal is analogous to its variance, which for the (zero-mean) difference signal we define as

σ_d² = E[d_n²]

So if we were to optimize something in this DPCM scheme, we would want to find f such that the energy in d_n is minimized.
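The following sketch generalizes the closed loop above to an N-tap linear predictor p_n = Σ_i a_i x̂_{n-i}. The coefficient vector a, the quantizer step, and the test signal are placeholders for illustration; how to choose a well is derived next.

```python
import numpy as np

# DPCM sketch with a linear predictor over the past N reconstructions.
# The taps `a` are assumed given here; the next section shows how to pick
# them so as to minimize the variance of d_n.

def dpcm_encode_decode(x, a, step=0.02):
    """Return (quantized differences, reconstruction) for predictor taps a."""
    N = len(a)
    x_hat = np.zeros_like(x)
    d_hat = np.zeros_like(x)
    for n in range(len(x)):
        # p_n = f(x̂_{n-1}, ..., x̂_{n-N}); missing history is treated as 0.
        past = [x_hat[n - i] if n - i >= 0 else 0.0 for i in range(1, N + 1)]
        p = float(np.dot(a, past))
        d = x[n] - p                          # d_n = x_n - p_n
        d_hat[n] = step * np.round(d / step)  # quantize the difference
        x_hat[n] = p + d_hat[n]               # decoder forms the same x̂_n
    return d_hat, x_hat

# Example: a first-order predictor p_n = x̂_{n-1} recovers the earlier scheme.
rng = np.random.default_rng(0)
x = np.cumsum(0.01 * rng.standard_normal(500))
d_hat, x_hat = dpcm_encode_decode(x, a=np.array([1.0]))
print("max |error|:", np.abs(x_hat - x).max())
```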

Find f(x̂_{n-1}, x̂_{n-2}, x̂_{n-3}, ..., x̂_0) such that σ_d² is minimized, where

σ_d² = E[(x_n - f(x̂_{n-1}, x̂_{n-2}, x̂_{n-3}, ..., x̂_0))²]

Since it is necessary to know the previous estimates of x_n in order to calculate f, but we also need to know f in order to calculate those estimates, we find ourselves in a roundabout and difficult situation. Therefore, we make the following assumption. Suppose

p_n = f(x_{n-1}, x_{n-2}, x_{n-3}, ..., x_0)

In other words, we design the predictor assuming there is no quantization. This is called a fine quantization assumption. In many cases this makes sense because the estimate of x_n is very close to x_n: the difference signal is very small, and the quantization error is even smaller. Now we have

σ_d² = E[(x_n - f(x_{n-1}, x_{n-2}, x_{n-3}, ..., x_0))²]

Considering the limitations on f, we note that f can be any nonlinear function, but we often do not know how to implement many nonlinear functions. So we assume further that the predictor f is linear and seek the best linear function. By definition, a linear function f(x_{n-1}, x_{n-2}, x_{n-3}, ..., x_0) is merely a weighted sum of previous values of x_n. Assuming we look back through a window at the previous N samples of the signal, the predictor can be expressed as

p_n = Σ_{i=1}^{N} a_i x_{n-i}

Now the variance of d_n becomes

σ_d² = E[(x_n - Σ_{i=1}^{N} a_i x_{n-i})²]

We need to find the coefficients a_i that minimize this variance. To do this we take the derivative with respect to each coefficient and set it equal to zero:

∂σ_d²/∂a_1 = -2 E[(x_n - Σ_{i=1}^{N} a_i x_{n-i}) x_{n-1}] = 0

and, in general,

∂σ_d²/∂a_j = -2 E[(x_n - Σ_{i=1}^{N} a_i x_{n-i}) x_{n-j}] = 0

Setting these derivatives to zero and rearranging, we see that the coefficients we are looking for depend only on second-order statistics:

E[x_n x_{n-j}] = Σ_{i=1}^{N} a_i E[x_{n-i} x_{n-j}]

We make the assumption that x is stationary, which means that the correlation between two values of x_n is only a function of the lag between them. Formally, the expectation of the product of two values of x separated in time by k samples (the autocorrelation) is purely a function of the time difference, or lag, k:

E[x_n x_{n+k}] = R_xx(k)

The relationship governing the coefficients a_i can then be rewritten using this notation as

R_xx(j) = Σ_{i=1}^{N} a_i R_xx(i - j),  for 1 ≤ j ≤ N

Thus, writing out the autocorrelation equation for each j we have

R_xx(1) = Σ_{i=1}^{N} a_i R_xx(i - 1)
R_xx(2) = Σ_{i=1}^{N} a_i R_xx(i - 2)
   ⋮
R_xx(N) = Σ_{i=1}^{N} a_i R_xx(i - N)

Now that we have N equations and N unknowns, we can solve for the coefficients a_i in matrix form:

    [ R_xx(0)     R_xx(1)     ⋯  R_xx(N-1) ] [ a_1 ]   [ R_xx(1) ]
    [ R_xx(1)     R_xx(0)     ⋯  R_xx(N-2) ] [ a_2 ] = [ R_xx(2) ]
    [    ⋮           ⋮        ⋱     ⋮      ] [  ⋮  ]   [    ⋮    ]
    [ R_xx(N-1)   R_xx(N-2)   ⋯  R_xx(0)   ] [ a_N ]   [ R_xx(N) ]

Writing this more compactly, with R a matrix and a and P column vectors, we have

R a = P,    a = R⁻¹ P

Thus, to determine the coefficients of a linear predictor for DPCM, we must invert the autocorrelation matrix R and multiply by the autocorrelation vector P. Once we have determined the coefficients, we can design the predictor needed for our DPCM scheme.

Now suppose that our signal is rapidly changing. Since the signal varies significantly over short intervals, the difference signal d_n would be very large and DPCM might not be a very good scheme. Suppose instead that we pass this rapidly changing signal through both a low-pass and a high-pass filter. (Figure: the low-pass and high-pass filtered versions of the rapidly varying signal.)
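As a sketch of this computation, the snippet below estimates R_xx(k) from sample data, builds the Toeplitz matrix R and the vector P, and solves R a = P. The AR(1)-style test signal and its parameters are arbitrary choices for illustration; the lecture treats the autocorrelation as known.

```python
import numpy as np

# Solve R a = P for the optimal N-tap linear predictor, using (biased)
# sample estimates of the autocorrelation R_xx(k).

def autocorr(x, max_lag):
    """Sample estimates of R_xx(0), ..., R_xx(max_lag)."""
    return np.array([np.dot(x[:len(x) - k], x[k:]) / len(x)
                     for k in range(max_lag + 1)])

def predictor_coeffs(x, N):
    """Solve R a = P with R the N x N Toeplitz autocorrelation matrix."""
    r = autocorr(x, N)
    R = np.array([[r[abs(i - j)] for j in range(N)] for i in range(N)])
    P = r[1:N + 1]
    return np.linalg.solve(R, P)

# Example: an AR(1)-style signal x_n = 0.95 x_{n-1} + noise; the optimal
# predictor should put most of its weight on the first tap.
rng = np.random.default_rng(0)
x = np.zeros(5000)
for n in range(1, len(x)):
    x[n] = 0.95 * x[n - 1] + 0.1 * rng.standard_normal()
print(predictor_coeffs(x, N=3))
```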

Let us create two signals from the values of x_n to emulate this low-pass and high-pass behavior. Let y_n represent an averaging operation that smooths out x_n (low pass), and let z_n represent a differencing operation that captures the high-frequency variation of x_n (high pass):

y_n = (x_n + x_{n-1})/2,    z_n = (x_n - x_{n-1})/2

Applying this for every x_n, we would send or store two values (y_n and z_n) per sample, which is unnecessary. Instead, we can apply the following strategy and keep only one pair per two samples:

y_n = (x_{2n} + x_{2n-1})/2,    z_n = (x_{2n} - x_{2n-1})/2

Then we can recover both the even and the odd values of x_n as follows:

y_n + z_n = x_{2n},    y_n - z_n = x_{2n-1}

This process of splitting the signal into components and keeping only part of the samples of each is called decimation, and it can be carried further until a point at which you perform bit allocation. This method is called sub-band coding. We note that DPCM is well suited to the low-pass components y_n, whereas another technique would likely suit the high-pass components z_n more effectively.
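Here is a minimal sketch of this two-band averaging/differencing split and its exact inverse; the sample values are arbitrary.

```python
import numpy as np

# Two-band split: pair up consecutive samples, keep their average y_n
# (low pass) and half-difference z_n (high pass), and recover the original
# samples exactly from y_n + z_n and y_n - z_n.

def analyze(x):
    """Split x (even length assumed) into low-pass y and high-pass z bands."""
    x_first, x_second = x[0::2], x[1::2]   # consecutive pairs (x_{2n-1}, x_{2n})
    y = (x_second + x_first) / 2           # y_n = (x_{2n} + x_{2n-1}) / 2
    z = (x_second - x_first) / 2           # z_n = (x_{2n} - x_{2n-1}) / 2
    return y, z

def synthesize(y, z):
    """Perfectly reconstruct x from the two bands."""
    x = np.empty(2 * len(y))
    x[1::2] = y + z                        # x_{2n}   = y_n + z_n
    x[0::2] = y - z                        # x_{2n-1} = y_n - z_n
    return x

x = np.array([4.0, 6.0, 6.0, 8.0, 7.0, 5.0])
y, z = analyze(x)
print(y, z, np.allclose(synthesize(y, z), x))   # reconstruction is exact
```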

Distributed Storage

Today's storage systems often use distributed storage. In such a system, data is stored in many separate locations on servers. Suppose we want to do data compression in such a system. What we would like to do is query a portion of our data set, for instance one page out of an entire document. Rather than having to sift through the entire data set to find that page, we would like to be able to go directly to it. In other words, we would like such a system to be query efficient. For this reason, suppose we divide the data into sections, such as by page, before compressing it, in an effort to preserve information about where each piece of the data came from. Such a partitioning of a data set looks like

(101 | 010 | 011 | ...)

Next we define the query efficiency as the number of bits we need to process to get back a single bit. Suppose that the partitioned bit stream shown above has length N and we have partitioned it into chunks of m = 3 bits. In this instance, the query efficiency would be m.

Suppose we have a binary vector of length n, represented by x_1^n, which we can compress to H(x_1^n) + 1 bits. The compression rate would then be (using H(x_1^n) = nH(x) for an i.i.d. source)

Compression Rate = (H(x_1^n) + 1)/n = H(x_1^n)/n + 1/n = nH(x)/n + 1/n = H(x) + 1/n

Now, in the case of our previous example regarding query efficiency, each chunk of m bits is compressed separately, so the compression rate will be

Rate = H(x) + 1/m = H(x) + 1/(query efficiency)

If we were able to compress the file as a whole, the query efficiency would be huge (on the order of N) and the compression rate would be

Rate = H(x) + 1/N

In general,

difference in rate from the optimal = 1/(query efficiency)

Thus we see there is a trade-off between compression rate and query efficiency. Let us end the lecture by proposing a research problem: find a better relation between query efficiency and redundancy.
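As a small worked illustration of this trade-off, with an assumed per-bit entropy H(x) = 0.5 (a made-up value), the rate for a few chunk sizes m works out as follows.

```python
# Numerical illustration of the trade-off: the redundancy beyond H(x) is
# 1/m, where m is the chunk size, i.e. the query efficiency. H is an
# arbitrarily assumed per-bit entropy.

H = 0.5   # assumed entropy of the source, bits per source bit
for m in (3, 10, 100, 10_000):
    print(f"query efficiency m = {m:>6}: rate = {H + 1/m:.4f}, redundancy = {1/m:.4f}")
```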