Transform coding - topics. Principle of block-wise transform coding

Similar documents
Transform Coding. Transform Coding Principle

Multimedia Networking ECE 599

Basic Principles of Video Coding

IMAGE COMPRESSION-II. Week IX. 03/6/2003 Image Compression-II 1

Waveform-Based Coding: Outline

SYDE 575: Introduction to Image Processing. Image Compression Part 2: Variable-rate compression

L. Yaroslavsky. Fundamentals of Digital Image Processing. Course

IMAGE COMPRESSION IMAGE COMPRESSION-II. Coding Redundancy (contd.) Data Redundancy. Predictive coding. General Model

The Karhunen-Loeve, Discrete Cosine, and Related Transforms Obtained via the Hadamard Transform

Compression methods: the 1 st generation

Image Data Compression

Overview. Analog capturing device (camera, microphone) PCM encoded or raw signal ( wav, bmp, ) A/D CONVERTER. Compressed bit stream (mp3, jpg, )

Laboratory 1 Discrete Cosine Transform and Karhunen-Loeve Transform

3 rd Generation Approach to Video Compression for Multimedia

Basics of DCT, Quantization and Entropy Coding

EE67I Multimedia Communication Systems

Image Compression. Fundamentals: Coding redundancy. The gray level histogram of an image can reveal a great deal of information about the image

Contents. Acknowledgments

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p.

BASICS OF COMPRESSION THEORY

Objectives of Image Coding

encoding without prediction) (Server) Quantization: Initial Data 0, 1, 2, Quantized Data 0, 1, 2, 3, 4, 8, 16, 32, 64, 128, 256

Basics of DCT, Quantization and Entropy Coding. Nimrod Peleg Update: Dec. 2005

7. Variable extraction and dimensionality reduction

Compression. What. Why. Reduce the amount of information (bits) needed to represent image Video: 720 x 480 res, 30 fps, color

Digital Image Processing Lectures 25 & 26

Video Coding with Motion Compensation for Groups of Pictures

Transform Coding. Transform Coding Principle

6. H.261 Video Coding Standard

Wavelet-based Image Coding: An Overview

Multimedia & Computer Visualization. Exercise #5. JPEG compression

Compressing a 1D Discrete Signal

Digital communication system. Shannon s separation principle

2. the basis functions have different symmetries. 1 k = 0. x( t) 1 t 0 x(t) 0 t 1

Lecture 7 Predictive Coding & Quantization

Compressing a 1D Discrete Signal

Linear, Worst-Case Estimators for Denoising Quantization Noise in Transform Coded Images

Predictive Coding. Prediction Prediction in Images

Gaussian source Assumptions d = (x-y) 2, given D, find lower bound of I(X;Y)

EE368B Image and Video Compression

Source Coding for Compression

Proyecto final de carrera

GRAPH SIGNAL PROCESSING: A STATISTICAL VIEWPOINT

Lossy coding. Lossless coding. Organization: 12 h Lectures + 8 h Labworks. Hybrid coding. Introduction: generalities, elements of information theory

CSE 126 Multimedia Systems Midterm Exam (Form A)

Compression and Coding. Theory and Applications Part 1: Fundamentals

Digital Image Processing

On Compression Encrypted Data part 2. Prof. Ja-Ling Wu The Graduate Institute of Networking and Multimedia National Taiwan University

Quantization. Introduction. Roadmap. Optimal Quantizer Uniform Quantizer Non Uniform Quantizer Rate Distorsion Theory. Source coding.

JPEG Standard Uniform Quantization Error Modeling with Applications to Sequential and Progressive Operation Modes

Half-Pel Accurate Motion-Compensated Orthogonal Video Transforms

Direct Conversions Between DV Format DCT and Ordinary DCT

AN IMPROVED CONTEXT ADAPTIVE BINARY ARITHMETIC CODER FOR THE H.264/AVC STANDARD

EECS490: Digital Image Processing. Lecture #26

SIGNAL COMPRESSION. 8. Lossy image compression: Principle of embedding

A Variation on SVD Based Image Compression

Rate Bounds on SSIM Index of Quantized Image DCT Coefficients

Filterbank Optimization with Convex Objectives and the Optimality of Principal Component Forms

Objective: Reduction of data redundancy. Coding redundancy Interpixel redundancy Psychovisual redundancy Fall LIST 2

CSE 408 Multimedia Information System Yezhou Yang

on a per-coecient basis in large images is computationally expensive. Further, the algorithm in [CR95] needs to be rerun, every time a new rate of com

Image Compression. Qiaoyong Zhong. November 19, CAS-MPG Partner Institute for Computational Biology (PICB)

Multidimensional Signal Processing

Predictive Coding. Prediction

Direction-Adaptive Transforms for Coding Prediction Residuals

Compression and Coding

VIDEO CODING USING A SELF-ADAPTIVE REDUNDANT DICTIONARY CONSISTING OF SPATIAL AND TEMPORAL PREDICTION CANDIDATES. Author 1 and Author 2

Wavelet Scalable Video Codec Part 1: image compression by JPEG2000

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments

Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation)

Introduction to Video Compression H.261

Statistical signal processing

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments. Tutorial 1. Acknowledgement and References for lectures 1 to 5

Statistical Pattern Recognition

Lossless Image and Intra-frame Compression with Integer-to-Integer DST

Fourier Kingdom 2 Time-Frequency Wedding 2 Windowed Fourier Transform 3 Wavelet Transform 4 Bases of Time-Frequency Atoms 6 Wavelet Bases and Filter

Image Coding Algorithm Based on All Phase Walsh Biorthogonal Transform

Module 4. Multi-Resolution Analysis. Version 2 ECE IIT, Kharagpur

EE16B - Spring 17 - Lecture 12A Notes 1

Source Coding: Part I of Fundamentals of Source and Video Coding

EFFICIENT encoding of signals relies on an understanding

at Some sort of quantization is necessary to represent continuous signals in digital form

Compression and Coding. Theory and Applications Part 1: Fundamentals

Multimedia Communications. Scalar Quantization

Audio Coding. Fundamentals Quantization Waveform Coding Subband Coding P NCTU/CSIE DSPLAB C.M..LIU

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

Analysis of Rate-distortion Functions and Congestion Control in Scalable Internet Video Streaming

Department of Electrical Engineering, Polytechnic University, Brooklyn Fall 05 EL DIGITAL IMAGE PROCESSING (I) Final Exam 1/5/06, 1PM-4PM

ECG782: Multidimensional Digital Signal Processing

Modelling of produced bit rate through the percentage of null quantized transform coefficients ( zeros )

Multimedia communications

c 2010 Melody I. Bonham

Computer Assisted Image Analysis

RLE = [ ; ], with compression ratio (CR) = 4/8. RLE actually increases the size of the compressed image.

CHAPTER 3. Transformed Vector Quantization with Orthogonal Polynomials Introduction Vector quantization

Improvement of DCT-based Compression Algorithms Using Poisson s Equation

ECE Lecture Notes: Matrix Computations for Signal Processing

Image Compression - JPEG

Intelligent Visual Prosthesis

Non-Iterative MAP Reconstruction Using Sparse Matrix Representations

Transcription:

Transform coding - topics Principle of block-wise transform coding Properties of orthonormal transforms Discrete cosine transform (DCT) Bit allocation for transform Threshold coding Typical coding artifacts Fast implementation of the DCT Bernd Girod: EE38b Image and Video Compression Transform Coding no. Principle of block-wise transform coding original image reconstructed image original image block reconstructed block transform Transform A Quantization & Transmission Inverse transform A- quantized transform Bernd Girod: EE38b Image and Video Compression Transform Coding no.

Properties of orthonormal transforms Forward transform y = Ax NxN transform, arranged as a vector Inverse transform Transform matrix of size N xn x = A y = A T y Input signal block of size NxN, arranged as a vector x Linearity: functions. is represented as linear combination of basis Parseval s Theorem holds: transform is a rotation of the signal vector around the origin of an N -dimensional vector space. Bernd Girod: EE38b Image and Video Compression Transform Coding no. 3 Separable orthonormal transforms, I An orthonormal transform is separable, if the transform of a signal block of size NxN can be expressed by y = AxA T Note: A = A A NxN transform Orthonormal transform matrix of size NxN The inverse transform is NxN block of input signal Kronecker product x = A T ya Great practical importance: The transform requires matrix multiplications of size NxN instead one multiplication of a vector of size xn with a matrix of size N xn Reduction of the complexity from O(N ) to O(N 3 ) Bernd Girod: EE38b Image and Video Compression Transform Coding no.

Separable orthonormal transforms, II -d transform realized by one-dimensional transforms (along rows and columns of the signal block) N NxN block of pixels N x Ax AxA T column-wise N-transform row-wise N-transform NxN block of transform Bernd Girod: EE38b Image and Video Compression Transform Coding no. 5 Criteria for the selection of a particular transform Decorrelation, energy concentration (e.g., KLT, DCT,...) Visually pleasant basis functions (e.g., pseudo-randomnoise, m-sequences, lapped transforms) Low complexity of computation Bernd Girod: EE38b Image and Video Compression Transform Coding no. 3

Karhunen Loève Transform (KLT) Karhunen Loève Transform (KLT) yields decorrelated transform. Basis functions are eigenvectors of the covariance matrix of the input signal. KLT achieves optimum energy concentration. Disadvantages: KLT dependent on signal statistics KLT not separable for image blocks Transform matrix cannot be factored into sparse matrices. Bernd Girod: EE38b Image and Video Compression Transform Coding no. 7 Comparison of various transforms, I Karhunen Loève transform (98/9) Haar transform (9) Walsh-Hadamard transform(93) Slant transform (Enomoto, Shibata, 97) Discrete CosineTransform (DCT) (Ahmet, Natarajan, Rao, 97) Comparison of -d basis functions for block size N=8 Bernd Girod: EE38b Image and Video Compression Transform Coding no. 8

Comparison of various transforms, II Energy concentration measured for typical natural images, block size x3 [Lohscheller]: KLT is optimum DCT performs only slightly worse than KLT Bernd Girod: EE38b Image and Video Compression Transform Coding no. 9 Discrete cosine transform and discrete Fourier transform Transform coding of images using the Discrete Fourier Transform (DFT): For stationary image statistics, the energy concentration properties of the DFT converge against those of the KLT for large block sizes. Problem of blockwise DFT coding: blocking effects due to circular topology of the DFT and Gibbs phenomena. Remedy: reflect image at block boundaries, DFT of larger symmetric block -> DCT edge folded pixel folded Bernd Girod: EE38b Image and Video Compression Transform Coding no. 5

DCT Type II-DCT of blocksize MxM is defined by transform matrix A containing elements D basis functions of the DCT: π(k +)i a ik = α i cos M for i, k =,..., M with α = α i = M M i Bernd Girod: EE38b Image and Video Compression Transform Coding no. Amplitude distribution of the DCT Histograms for 8x8 DCT coefficient amplitudes measured for natural images (from Mauersberger): DC coefficient is typically uniformly distributed. For the other, the distribution resembles a Laplacian pdf. Bernd Girod: EE38b Image and Video Compression Transform Coding no.

Bit allocation for transform I Problem: divide bit-rate R among MxM transform i such that resulting distortion D is minimized. Assumptions R = R i D = D i i i Total rate Rate for coefficient i Total distortion Distortion contributed by coefficient i lead to "Pareto condition" D i R i = D j R j for all i, j Bernd Girod: EE38b Image and Video Compression Transform Coding no. 3 Bit allocation for transform II Additional assumptions Gaussian r.v. and mse distortion yield the optimum rate for each transform coefficient i: variance of transform coefficient i R i = max, log σ i θ { } D i = min σ i,θ bit Threshold parameter, same for all i In practice, with variable length coding, one often uses distortion allocation instead of bit allocation Bernd Girod: EE38b Image and Video Compression Transform Coding no. 7

Bit allocation for transform III Extension to weighted m.s.e. distortion measure D = i w i D i R i = max, log w i σ i bit θ D i = min σ i, θ w i Often implemented by scaling by to quantization ( weighting matrix ) ( w i ) prior Bernd Girod: EE38b Image and Video Compression Transform Coding no. 5 Threshold coding, I Transform that fall below a threshold are discarded. Implementation by uniform quantizer with threshold characteristic: Quantizer output Quantizer input Positions of non-zero transform are transmitted in addition to their amplitude values. Bernd Girod: EE38b Image and Video Compression Transform Coding no. 8

Threshold coding, II Efficient encoding of the position of non-zero transform : zig-zag-scan + run-level-coding ordering of the transform by zig-zag-scan Bernd Girod: EE38b Image and Video Compression Transform Coding no. 7 Threshold coding, III 98 9 79 8 8 9 8 87 9 9 8 8 85 89 7 88 85 93 79 88 88 87 7 DCT 8 88 8 87 83 8 95 7 9 93 89 87 8 83 8 85 93 95 93 9 7 89 87 8 8 85 83 8 75 8 85 7 95 85 77 78 7 79 95 75 Original 8x8 block Reconstructed 8x8 block 9 95 8 77 8 93 7 89 9 95 8 8 87 9 7 88 85 9 8 85 87 89 7 89 88 85 83 83 8 9 75 9 9 8 89 79 8 88 78 9 9 89 9 77 8 8 79 89 88 85 8 75 8 87 79 89 88 78 7 73 83 93 8 8 9-5 -8 8-8 -8 - -3-5 9 5-8 -5 5 5-8 - - 3-8 8-8 - -5 - - -3 3 3-3 -3 - - scaling and inverse DCT Q 85 3-3 - - - - - - - 85 3-3 - - - - - - - inverse zig-zagscan (85 3 - -3 - - - - - - EOB) run-levelcoding Mean of block: 85 (,3) (,) (,) (,) (,) (,-) (,) (,) (,) (,-3) (,) (,-) (,) (,-) (,- ) (,-) (,) (9,-) (,-) (EOB) transmission Mean of block: 85 (,3) (,) (,) (,) (,) (,-) (,) (,) (,) (,-3) (,) (,-) (,) (,-) (,- ) (,-) (,) (9,-) (,-) (EOB) run-leveldecoding (85 3 - -3 - - - - - - EOB) Bernd Girod: EE38b Image and Video Compression Transform Coding no. 8 9

3 - - -3 3 - - -3 3 - - -3 3 - - -3 3 - - -3 3 - - -3 Detail in a block vs. DCT transmitted image block DCT of block quantized DCT of block block reconstructed from quantized Bernd Girod: EE38b Image and Video Compression Transform Coding no. 9 Typical DCT coding artifacts DCT coding with increasingly coarse quantization, block size 8x8 quantizer stepsize for AC : 5 quantizer stepsize for AC : quantizer stepsize for AC : Bernd Girod: EE38b Image and Video Compression Transform Coding no.

Adaptive transform coding Input signal Transform Quantization Entropy coding Block classification class Quantization and entropy coding optimized separately for each class. Typical classes: Blocks without detail Horizontal structures Vertical structures Diagonals Textures without preferred orientation Bernd Girod: EE38b Image and Video Compression Transform Coding no. Influence of DCT block size Efficiency as a function of blocksize NxN, measured for 8 bit quantization in the original domain and equivalent quantization in the transform domain G = memoryless entropy of original signal mean entropy of transform Block size 8x8 is a good compromise. Bernd Girod: EE38b Image and Video Compression Transform Coding no.

Fast DCT algorithm I DCT matrix factored into sparse matrices (Arai, Agui, and Nakajima; 988): y = Ax = SPM M M 3 M M 5 M x S S S S = S 3 S S 5 S S 7 P = M = M = M 3 = C C C C C C M = M 5 = M = Bernd Girod: EE38b Image and Video Compression Transform Coding no. 3 Fast DCT algorithm II Signal flow graph for fast (scaled) 8-DCT according to Arai, Agui, Nakajima: x s y scaling x x x 3 a s s s y y y only 5 + 8 multiplications x x 5 a a 3 s 5 s s 7 s 3 y 5 y y 7 y 3 (direct matrix multiplication: multiplications) x a x 7 a 5 Multiplication: u m m u Addition: u v u v u+v u-v a = C a = C C a 3 = C a = C + C a 5 = C s = s k = C k k =,..., 7 C k = cos π k Bernd Girod: EE38b Image and Video Compression Transform Coding no.

Transform coding: summary Orthonormal transform: rotation of coordinate system in signal space Purpose of transform: decorrelation, energy concentration KLT is optimum, but signal dependent and, hence, without a fast algorithm DCT shows reduced blocking artifacts compared to DFT Bit allocation proportional to logarithm of variance Threshold coding + zig-zag-scan + 8x8 block size is widely used today (e.g. JPEG, MPEG, ITU-T H.3) Fast algorithm for scaled 8-DCT: 5 multiplications, 9 additions Bernd Girod: EE38b Image and Video Compression Transform Coding no. 5 3