Shujun LI (李树钧): INF-10845-20091 Multimedia Coding Lecture 7 Predictive Coding & Quantization June 3, 2009
Outline: Predictive Coding; Motion Estimation and Compensation; Context-Based Coding; Quantization; Scalar Quantization; Vector Quantization
Predictive Coding
Why is compression possible: Statistical redundancy
Finally, lossless coding (lossless data compression) is always useful for removing any statistical redundancy that still remains in the data.
Why is compression possible: Spatial redundancy
[Figure: a picture of Konstanz (13.12.2008)]
[Figure: correlation between horizontally (left) and vertically (right) adjacent pixels; the axes are pixel at (1,i) versus pixel at (1,i+1), and pixel at (i,1) versus pixel at (i+1,1)]
[Figure: autocorrelation in the horizontal direction of natural images]
Spatial predictive coding and transform coding become useful!
Why is compression possible: Temporal redundancy
[Figure: the 1st frame, the 2nd frame, and the difference between them]
Temporal predictive coding (motion estimation and compensation) becomes useful!
Where is predictive coding?
[Figure: block diagram of the coding pipeline. Main stages: Input Image/Video, Pre-Processing, Lossy Coding, Lossless Coding, Post-Processing (Post-filtering), Encoded Image/Video. Component labels: A/D Conversion, Color Space Conversion, Pre-Filtering, Partitioning; Predictive Coding (Differential Coding, Motion Estimation and Compensation, Context-Based Coding); Quantization, Transform Coding, Model-Based Coding; Entropy Coding, Dictionary-Based Coding, Run-Length Coding]
The basic idea
What do we have?
Some prior information about the source: it is not a memoryless random source but one with memory, whether stationary or time-varying.
A number of symbols that have already been encoded.
What can we do?
Construct a predictor from the previously encoded symbols.
Encode the prediction errors (differences) instead of the original signal.
Typical predictive coding methods
Naive differential coding
Differential pulse code modulation (DPCM)
Delta modulation (DM)
Motion estimation and compensation
Context-based coding
Naive differential coding
What is the to-be-encoded signal? $x = (x_1, x_2, \ldots, x_n, \ldots)$
What is the signal we actually encode? $\Delta x = (x_1, x_2 - x_1, \ldots, x_n - x_{n-1}, \ldots)$
Widely used in multimedia coding standards.
[Figure: two plots, one for $x$ (horizontal axis from 0 to 250) and one for $\Delta x$ (horizontal axis from -250 to 250)]
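As a minimal sketch of naive differential coding, the following Python functions compute the difference signal and invert it. The function names `diff_encode` and `diff_decode` and the sample values are illustrative assumptions, not part of the lecture.

```python
def diff_encode(x):
    """Return (x1, x2 - x1, ..., xn - x(n-1)) for a list of numbers."""
    if not x:
        return []
    return [x[0]] + [x[i] - x[i - 1] for i in range(1, len(x))]


def diff_decode(dx):
    """Invert diff_encode by accumulating the differences."""
    x = []
    total = 0
    for d in dx:
        total += d
        x.append(total)
    return x


# Hypothetical row of pixel values: neighbors are similar, so differences are small.
pixels = [100, 102, 103, 103, 101, 98]
residual = diff_encode(pixels)      # [100, 2, 1, 0, -2, -3]
restored = diff_decode(residual)    # identical to pixels (lossless)
```

Apart from the first value, the residual concentrates around zero, which is what makes the subsequent entropy coding more effective.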
Differential Pulse Code Modulation (DPCM)
What is PCM (pulse code modulation)? It is an analog-to-digital (A/D) transformation used in digital communication systems. It mainly belongs to the topic of quantization.
DPCM = differential coding + PCM
Adaptive DPCM (ADPCM): it belongs to the topic of quantization, too.
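DPCM
First-order DPCM = naive differential coding
[Figure: block diagrams of a simple encoder, which predicts $x_i$ from the delayed original sample $x_{i-1}$, and a simple decoder, which reconstructs $x_i^*$ from $\Delta x_i^*$ and the delayed $x_{i-1}^*$; this pair suffers from error propagation]
[Figure: block diagrams of a practical encoder, which contains a simulated decoder and predicts from the delayed reconstruction $x_{i-1}^*$, and the matching practical decoder]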
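The difference between the simple and the practical encoder can be sketched in Python as follows. This is an illustration under stated assumptions: a uniform quantizer with step size `step` stands in for the lossy stage, the initial prediction is 0, and all function names are hypothetical.

```python
import math


def quantize(value, step):
    """Uniform quantizer (assumed for illustration): nearest multiple of step."""
    return step * math.floor(value / step + 0.5)


def dpcm_simple_encode(x, step):
    """Simple encoder: predicts from the ORIGINAL previous sample."""
    prev = 0
    out = []
    for xi in x:
        out.append(quantize(xi - prev, step))
        prev = xi
    return out


def dpcm_practical_encode(x, step):
    """Practical encoder: predicts from the RECONSTRUCTED previous sample,
    i.e. it runs a simulated decoder inside the encoding loop."""
    prev_rec = 0
    out = []
    for xi in x:
        dq = quantize(xi - prev_rec, step)
        out.append(dq)
        prev_rec += dq  # what the decoder will reconstruct
    return out


def dpcm_decode(dq):
    """Decoder: accumulate the quantized differences."""
    rec = []
    prev = 0
    for d in dq:
        prev += d
        rec.append(prev)
    return rec


x = [3, 6, 9, 12, 15]
simple = dpcm_decode(dpcm_simple_encode(x, 4))        # [4, 8, 12, 16, 20]
practical = dpcm_decode(dpcm_practical_encode(x, 4))  # [4, 8, 8, 12, 16]
```

With the simple encoder the reconstruction error grows from sample to sample (1, 2, 3, 4, 5 in this example), whereas with the practical encoder it never exceeds half the step size.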
Higher-order DPCM
Modify the Delay component to include $n > 1$ previously encoded symbols.
For each previously encoded symbol $x_{i-j}$, assign a weight $a_{i-j} \in (0,1)$.
The prediction is the weighted sum of the $n$ terms.
How to determine the weights becomes an optimization problem.
Assuming that the predictor is linear makes the problem easier to solve.
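Written out, the linear predictor and the resulting optimization problem look as follows. The choice of the mean squared prediction error as the criterion is an assumption made here for illustration; the slide does not name a specific criterion.

```latex
\hat{x}_i = \sum_{j=1}^{n} a_{i-j}\, x_{i-j}, \qquad
\Delta x_i = x_i - \hat{x}_i

\min_{a_{i-1}, \ldots, a_{i-n}} E\left[ (x_i - \hat{x}_i)^2 \right]
\;\Longrightarrow\;
E\left[ (x_i - \hat{x}_i)\, x_{i-k} \right] = 0, \quad k = 1, \ldots, n
```

Because the predictor is linear in the weights, setting the derivatives to zero yields $n$ linear equations in the $n$ unknown weights.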
2/3-D DPCM for digital images/video
2-D DPCM for digital images: pixel values in the previously encoded row and those in the same row (to the left of the to-be-encoded pixel value) are used for prediction.
[Figure: neighboring pixels labeled A to K]
3-D DPCM for digital video: pixel values in the previous frame are also used.
Delta modulation (DM)
DM = first-order DPCM + one-bit (two-level) quantization
[Figure: block diagrams of a simple encoder and a simple decoder, and of a practical encoder and a practical decoder, each with a one-bit quantizer $Q$ producing 0/1 and an inverse quantizer $Q^{-1}$]
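A minimal Python sketch of delta modulation, in the practical (closed-loop) form where the encoder tracks the decoder's reconstruction. The fixed step `delta`, the starting value `start`, and the function names are assumptions for illustration.

```python
def dm_encode(x, delta, start=0):
    """Emit one bit per sample: 1 = step up by delta, 0 = step down by delta."""
    bits = []
    rec = start  # reconstruction that the decoder will also compute
    for xi in x:
        bit = 1 if xi >= rec else 0
        bits.append(bit)
        rec += delta if bit else -delta
    return bits


def dm_decode(bits, delta, start=0):
    """Rebuild the staircase approximation from the bit stream."""
    rec = start
    out = []
    for bit in bits:
        rec += delta if bit else -delta
        out.append(rec)
    return out


signal = [1, 2, 3, 3, 3, 2]
bits = dm_encode(signal, delta=1)     # [1, 1, 1, 1, 0, 0]
approx = dm_decode(bits, delta=1)     # [1, 2, 3, 4, 3, 2]
```

The reconstruction can only move by one step per sample, so it oscillates around flat regions and lags behind fast changes.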
Frame replenishment
The basic idea: transmit only the locations of significantly changed pixels and the corresponding differences of pixel values. "Significantly changed" is defined by a threshold on the pixel value difference.
Problem: the threshold has to be larger if there are more rapid changes (to limit the number of pixels transmitted), so the video quality becomes worse.
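The idea can be sketched in Python as below. Frames are assumed to be lists of rows of integer pixel values, and the function names are hypothetical. For simplicity the sketch compares against the original previous frame; a real encoder would compare against the previously reconstructed frame.

```python
def replenish_encode(prev_frame, cur_frame, threshold):
    """Return (row, col, difference) for every significantly changed pixel."""
    updates = []
    for r, (prev_row, cur_row) in enumerate(zip(prev_frame, cur_frame)):
        for c, (p, q) in enumerate(zip(prev_row, cur_row)):
            diff = q - p
            if abs(diff) > threshold:
                updates.append((r, c, diff))
    return updates


def replenish_decode(prev_frame, updates):
    """Apply the transmitted differences to a copy of the previous frame."""
    frame = [row[:] for row in prev_frame]
    for r, c, diff in updates:
        frame[r][c] += diff
    return frame


prev = [[10, 10], [10, 10]]
cur = [[11, 10], [10, 30]]
updates = replenish_encode(prev, cur, threshold=5)   # [(1, 1, 20)]
decoded = replenish_decode(prev, updates)            # [[10, 10], [10, 30]]
```

The small change at position (0, 0) is below the threshold and is therefore lost, which illustrates why a larger threshold lowers the quality.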
Motion Estimation and Compensation
Motion estimation
Temporal prediction
Problem: 3-D DPCM can be used for video, but moving objects change their locations from frame to frame and thus enlarge the prediction errors.
Solution: we estimate the movements of objects to some extent and compensate for them before doing the prediction.
An example
[Figure: two panels, "Without motion compensation" and "With motion compensation"]
Forward and backward motion
We need one or more reference frames (in the past or in the future) to encode a frame.
There are three kinds of frames in video coding.
I-frames: intra-encoded frames, independent of any reference frames.
P-frames: frames predicted from reference frame(s) in the past.
B-frames: bi-directionally encoded frames, predicted from reference frames in the past and in the future.
I-/P-/B-frames
The encoding order of frames is different from the display order!
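A small Python sketch of why the two orders differ. It assumes a simplified model in which every B-frame depends on the nearest I- or P-frame before it and after it, so the later reference frame must be encoded first. The function name `encoding_order` and the frame pattern are illustrative assumptions.

```python
def encoding_order(display_types):
    """Map a display-order string of frame types (e.g. "IBBP") to the
    list of display indices in the order the frames are encoded."""
    order = []
    pending_b = []  # B-frames waiting for their future reference frame
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)
        else:
            # An I- or P-frame is encoded first, then the B-frames before it.
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B-frames without a future reference
    return order


# Display order I B B P B B P gives encoding order I P B B P B B.
order = encoding_order("IBBPBBP")   # [0, 3, 1, 2, 6, 4, 5]
```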
Motion Estimation and Compensation
Motion Estimation
Motion Vector Calculation
Block Matching
Pel Recursive Technique (Optimization)
Optical Flow Method (Computer Vision)
Motion Compensated Predictive Coding
Block matching
Why do we use block matching? It is simple, fast and easy to implement.
What is matched? Small blocks in a video frame, most often square or rectangular, non-overlapping, and of the same size.
Motion model: only translational motion is considered. Other types of motion of large objects can be approximated by the translational motion of their smaller parts.
How to match blocks?
Goal: find the block in the reference frame that best matches the current block in the to-be-encoded frame.
Method: simply search some blocks in a searching region.
Criteria: a similarity measure between two blocks. Do you still remember the image quality metrics we learned before?
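The goal, method and criterion above can be sketched in Python as an exhaustive search. The sum of absolute differences (SAD) is used as the similarity measure here purely as an assumption, since the slide leaves the measure open. Frames are assumed to be lists of rows, and all names are hypothetical.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between the size x size block of cur
    at (cy, cx) and the block of ref at (ry, rx)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cy, cx, size, search_range):
    """Test every displacement within +/- search_range and return
    ((dy, dx), cost) of the best match; the motion vector points from the
    current block position to the matched block in the reference frame."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            ry, rx = cy + my, cx + mx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue  # candidate block lies outside the reference frame
            cost = sad(cur, ref, cy, cx, ry, rx, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (my, mx), cost
    return best_mv, best_cost
```

A call such as `full_search(cur, ref, 2, 2, 2, 2)` examines up to 25 candidate positions for a 2x2 block, which shows how the cost grows with the square of the search range.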
Block matching: A visual show
Block matching: motion vectors
[Figure: a P-frame with MVs and a B-frame with MVs]
Searching strategies for block matching
Full search: simply search all blocks in the searching region.
Three-step search (TSS, Koga et al. 1981): after each step, the step size is reduced by one.
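A hedged Python sketch of a three-step search. The step sizes (3, 2, 1) follow the "reduced by one" description; many descriptions of TSS instead halve the step (4, 2, 1), which can be passed through the `steps` parameter. The SAD criterion, the nine-point pattern around the current best position, and all names are assumptions for illustration.

```python
def three_step_search(cur, ref, cy, cx, size, steps=(3, 2, 1)):
    """Coarse-to-fine search: in each step, test the center and its eight
    neighbors at the current step size, then recenter on the best one."""
    height, width = len(ref), len(ref[0])

    def cost(my, mx):
        ry, rx = cy + my, cx + mx
        if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
            return float("inf")  # candidate outside the reference frame
        return sum(
            abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
            for dy in range(size)
            for dx in range(size)
        )

    best_mv = (0, 0)
    best_cost = cost(0, 0)
    for step in steps:
        center = best_mv
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cand = (center[0] + dy, center[1] + dx)
                cand_cost = cost(*cand)
                if cand_cost < best_cost:
                    best_mv, best_cost = cand, cand_cost
    return best_mv, best_cost
```

This tests at most 25 distinct positions instead of the 169 that a full search over a range of 6 would need, at the risk of ending in a local minimum.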
Searching strategies for block matching
Two-dimensional logarithmic search (TDL, Jain & Jain 1981): the step size is halved when the best-matched position is the center.
Searching strategies for block matching
Cross search (Ghanbari 1990): similar to TSS, but searches four positions instead of nine.
Searching strategies for block matching
One-at-a-time search (OTS, Srinivasan and Rao 1985): search first horizontally and then vertically.
Searching strategies for block matching
Orthogonal search algorithm (OSA, Puri et al. 1987): a horizontal search followed by a vertical search, after which the step size is halved.
Advanced forms of block matching
Sub-pixel block matching (via interpolation)
Greedy block matching
Hierarchical/Multi-resolution block matching
Overlapped block matching
Non-square block matching
Multigrid/Variable-size block matching
Object segmentation based block matching
More complicated motion models: scaling, rotation, skewing, more complicated deformation, ...
Block matching: Performance comparison
Matching performance
Contribution to compression
Computational complexity
Scalability (large and small scales)
Implementation issues (hardware and software)
Context-based Coding
Yet another form of predictive coding
The Basic Idea: the context dynamically influences the coding process.
Context-Based Entropy Coding
Adaptive Huffman Coding
Adaptive Arithmetic Coding
Adaptive Dictionary Based Coding
Adaptive Predictive Coding
Quantization
Where is quantization?
[Figure: block diagram of the coding pipeline. Main stages: Input Image/Video, Pre-Processing, Lossy Coding, Lossless Coding, Post-Processing (Post-filtering), Encoded Image/Video. Component labels: A/D Conversion, Color Space Conversion, Pre-Filtering, Partitioning; Predictive Coding (Differential Coding, Motion Estimation and Compensation, Context-Based Coding); Quantization, Transform Coding, Model-Based Coding; Entropy Coding, Dictionary-Based Coding, Run-Length Coding]
What is quantization?
It is the process of converting an analog signal to a digital one (A/D). Example: $130.75 \to 131$.
Note: an analog signal should first be sampled (quantized in the temporal domain).
It is also the process of converting a digital signal with precision $n_1$ into another digital signal with precision $n_2 < n_1$ (D/D). Example: $131/2^8 \to 32/2^6$.
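Both examples can be reproduced with a few lines of Python. The sketch assumes rounding to the nearest integer for the A/D case and truncation of the least significant bits for the D/D case, which is what the numbers 131 and 32 imply. The function names are hypothetical.

```python
import math


def quantize_analog(value):
    """A/D: round a real value to the nearest integer level."""
    return math.floor(value + 0.5)


def reduce_precision(value, n1, n2):
    """D/D: keep the n2 most significant bits of an n1-bit integer."""
    if not 0 <= value < 2 ** n1:
        raise ValueError("value does not fit in n1 bits")
    if n2 > n1:
        raise ValueError("n2 must not exceed n1")
    return value >> (n1 - n2)


a = quantize_analog(130.75)        # 131
b = reduce_precision(131, 8, 6)    # 32
```

In the second case 131/256 is about 0.512 while 32/64 is exactly 0.5, so the represented value changes slightly: this is the quantization error.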
How to do quantization?
Uniform quantization
Non-uniform quantization
Adaptive quantization
Scalar quantization
Vector quantization (VQ)
Quantization and subsampling are the main sources of information loss! Lossy coding!
Uniform quantization
Quantize an analog signal $x$ uniformly into $N$ levels $\{y_i\}$ with a fixed step size $d_{i+1} - d_i = \Delta$.
[Figure: uniform midtread quantizer and uniform midrise quantizer]
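The two uniform quantizer types can be sketched in Python as follows. The sketch ignores the limit of $N$ levels (that is, it does not clip large inputs), and the function names are assumptions for illustration.

```python
import math


def midtread(x, delta):
    """Uniform midtread quantizer: zero is a reconstruction level,
    so small inputs around zero are mapped to exactly 0."""
    return delta * math.floor(x / delta + 0.5)


def midrise(x, delta):
    """Uniform midrise quantizer: zero is a decision boundary,
    so the reconstruction levels are odd multiples of delta / 2."""
    return delta * (math.floor(x / delta) + 0.5)


# With delta = 1.0: midtread maps 0.2 to 0.0, midrise maps 0.2 to 0.5.
t = midtread(0.2, 1.0)
r = midrise(0.2, 1.0)
```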
Quantization error
$$\mathrm{MSE} = \sum_{i=1}^{N} \int_{d_i}^{d_{i+1}} (x - y_i)^2 f(x)\,dx,$$
where $f(x)$ is the probability density function of the signal $x$.
Optimizing the quantizer
Two necessary conditions: set the derivatives of the MSE with respect to $d_i$ and $y_i$ to 0.
$$(d_i - y_{i-1})^2 f(d_i) - (d_i - y_i)^2 f(d_i) = 0$$
$$\int_{d_i}^{d_{i+1}} (x - y_i) f(x)\,dx = 0$$
Three sufficient conditions:
$$d_1 = -\infty, \quad d_{N+1} = +\infty$$
$$\int_{d_i}^{d_{i+1}} (x - y_i) f(x)\,dx = 0, \quad i = 1, \ldots, N$$
$$d_i = (y_{i-1} + y_i)/2, \quad i = 2, \ldots, N$$
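The conditions suggest an iterative design procedure: alternately place each decision level midway between neighboring reconstruction levels and move each reconstruction level to the centroid of its cell. The Python sketch below applies this idea (commonly known as the Lloyd-Max iteration) to a finite list of samples instead of a density $f(x)$; working on samples, the uniform initialization, and the names are assumptions for illustration.

```python
import bisect


def lloyd_max(samples, n_levels, iterations=50):
    """Return (decision_levels, reconstruction_levels) for the samples.
    Only the interior decision levels d_2..d_N are returned; d_1 and
    d_(N+1) are implicitly -infinity and +infinity."""
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / n_levels
    # Start from uniformly spaced reconstruction levels.
    y = [lo + (k + 0.5) * step for k in range(n_levels)]
    for _ in range(iterations):
        # Midpoint condition: d_i = (y_(i-1) + y_i) / 2.
        d = [(y[k - 1] + y[k]) / 2 for k in range(1, n_levels)]
        # Centroid condition: y_i is the mean of the samples in cell i.
        sums = [0.0] * n_levels
        counts = [0] * n_levels
        for x in samples:
            k = bisect.bisect_right(d, x)
            sums[k] += x
            counts[k] += 1
        y = [sums[k] / counts[k] if counts[k] else y[k]
             for k in range(n_levels)]
    d = [(y[k - 1] + y[k]) / 2 for k in range(1, n_levels)]
    return d, y


d, y = lloyd_max([0, 0, 0, 1, 1, 10, 10, 11], n_levels=2)
# y is approximately [0.4, 10.33]: each level sits at the mean of its cluster.
```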
Non-uniform quantization
When $f(x)$ is a uniform distribution, uniform quantization is optimal.
When $f(x)$ is not a uniform distribution, uniform quantization is generally not optimal, so non-uniform quantization is needed!
Non-uniform quantization: An example
Gaussian distribution with zero mean and unit variance ($N = 8$).
Adaptive quantization
Making the quantizer adapt to the statistics of the source $f(x)$.
Forward adaptive quantization
Backward adaptive quantization
Switched quantization
DPCM + Quantization and DM
[Figure: block diagrams of the practical encoder and practical decoder for DPCM with a quantizer $Q$ and an inverse quantizer $Q^{-1}$, and for DM with a one-bit (0/1) quantizer; in both cases the prediction uses the delayed reconstruction $x_{i-1}^*$]
Vector Quantization (VQ)
VQ is a pattern recognition problem!
Find the best code in the codebook.
Vector Quantization (VQ)
VQ is also a lossy code $f: X^n \to C = \{c_1, \ldots, c_m\}$, where $C$ is called a codebook.
For a set of training data, an algorithm is used to generate the codebook $C$. This part is the core, and it is a pattern recognition (PR) problem.
For each input $n$-dimensional vector, the quantizer tries to find the best code in $C$ to minimize the quantization error. This is a usual optimization problem, like motion estimation.
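Both parts, codebook generation and the search for the best code, can be sketched in Python. The training step below uses a plain k-means style iteration (in the spirit of the generalized Lloyd algorithm) as a stand-in, since the slide does not name a specific training algorithm. Squared Euclidean distance, the initialization from the first `m` training vectors, and all names are assumptions.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def vq_encode(vector, codebook):
    """Return the index of the code that minimizes the quantization error."""
    return min(range(len(codebook)),
               key=lambda k: sq_dist(vector, codebook[k]))


def train_codebook(training, m, iterations=20):
    """Generate an m-entry codebook from the training vectors."""
    codebook = [list(v) for v in training[:m]]
    for _ in range(iterations):
        clusters = [[] for _ in range(m)]
        for v in training:
            clusters[vq_encode(v, codebook)].append(v)
        for k, members in enumerate(clusters):
            if members:  # keep the old code if no vector was assigned
                dim = len(members[0])
                codebook[k] = [sum(v[j] for v in members) / len(members)
                               for j in range(dim)]
    return codebook


training = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
codebook = train_codebook(training, m=2)
index = vq_encode((9, 9), codebook)   # 1: the code near (10.33, 10.33)
```

Only the index is transmitted; the decoder looks up `codebook[index]`, so the reconstruction is the code vector rather than the original input.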
References
References for further reading
Khalid Sayood, Introduction to Data Compression, Chapter 11 Differential Encoding, Chapter 9 Scalar Quantization and Chapter 10 Vector Quantization, 3rd Edition, Morgan Kaufmann, 2005
Yun Q. Shi and Huifang Sun, Image and Video Compression for Multimedia Engineering: Fundamentals, Algorithms, and Standards, Chapter 3 Differential Coding, Chapter 10 Motion Analysis and Motion Compensation, Chapter 11 Block Matching, Chapter 2 Quantization and Section 9.2 Vector Quantization, 2nd Edition, CRC Press, 2008
Iain E.G. Richardson, Video Codec Design: Developing Image and Video Compression Systems, Chapter 6 Motion Estimation and Compensation, John Wiley & Sons Ltd, 2002
David S. Taubman and Michael W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, Chapter 3 Quantization, Kluwer Academic Publishers, 2002 (*)