Lec 03 Entropy and Coding II: Huffman and Golomb Coding


CS/EE 5590 / ENG 40 Special Topics Multimedia Communication, Spring 2017
Lec 03 Entropy and Coding II: Huffman and Golomb Coding
Zhu Li

Outline
- Lecture 02 Recap
- Huffman Coding
- Golomb Coding and JPEG 2000 Lossless Coding

Entropy (Lecture 02 Recap)
- Self-information of an event: i(X = k) = -log Pr(X = k) = -log(p_k)
- Entropy of a source: H(X) = -Σ_k p_k log(p_k)
- Conditional entropy and mutual information:
  H(X1 | X2) = H(X1, X2) - H(X2)
  I(X1; X2) = H(X1) + H(X2) - H(X1, X2)
- Relative entropy: D(p || q) = Σ_k p_k log(p_k / q_k)
- [Venn diagram: total area H(X, Y), with regions H(X|Y), I(X; Y), H(Y|X) inside the circles H(X) and H(Y).]
- Main application: context modeling, e.g., for a symbol stream a b c b c a b c b a b c b a
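As a quick numerical companion to these definitions, here is a minimal C sketch (the 2x2 joint pmf is a made-up example, not from the lecture) that computes H(X), H(Y), H(X,Y), H(X|Y) and I(X;Y) and confirms the identities above.

/* Minimal sketch: entropy, conditional entropy and mutual information
 * for a small joint pmf. The 2x2 joint distribution below is a made-up
 * example, not taken from the lecture. */
#include <math.h>
#include <stdio.h>

static double h(double p) { return p > 0.0 ? -p * log2(p) : 0.0; }

int main(void) {
    double pxy[2][2] = { {0.4, 0.1},
                         {0.1, 0.4} };   /* joint pmf P(X,Y) */
    double px[2] = {0}, py[2] = {0};
    double Hxy = 0, Hx = 0, Hy = 0;

    for (int x = 0; x < 2; x++)
        for (int y = 0; y < 2; y++) {
            px[x] += pxy[x][y];
            py[y] += pxy[x][y];
            Hxy   += h(pxy[x][y]);       /* H(X,Y) = -sum p log p */
        }
    for (int x = 0; x < 2; x++) Hx += h(px[x]);
    for (int y = 0; y < 2; y++) Hy += h(py[y]);

    double Hx_given_y = Hxy - Hy;        /* H(X|Y) = H(X,Y) - H(Y) */
    double Ixy = Hx + Hy - Hxy;          /* I(X;Y) = H(X)+H(Y)-H(X,Y) */
    printf("H(X)=%.3f H(Y)=%.3f H(X,Y)=%.3f H(X|Y)=%.3f I(X;Y)=%.3f\n",
           Hx, Hy, Hxy, Hx_given_y, Ixy);
    return 0;
}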

Context Reduces Entropy: Example
- Conditioning reduces entropy:
  H(x5) > H(x5 | x4, x3, x2, x1)
  H(x5) > H(x5 | f(x4, x3, x2, x1))
- Context: the neighborhood x1, x2, x3, x4 of the current sample x5, split into two classes by the context function value, e.g., f(x4, x3, x2, x1) == 100 vs. f(x4, x3, x2, x1) < 100 (see getentropy.m, lossless_coding.m)
- Context function: f(x4, x3, x2, x1) = sum(x4, x3, x2, x1)

Lossless Coding: Prefix Coding
- Codewords sit on the leaves of a binary code tree (root node, internal nodes, leaf nodes); no code is a prefix of any other code; simple encoding/decoding.
- Kraft-McMillan Inequality: for a prefix coding scheme with code lengths l_1, l_2, ..., l_n:
  Σ_k 2^(-l_k) <= 1
- Conversely, given a set of integer lengths {l_1, l_2, ..., l_n} that satisfies the above inequality, we can always find a prefix code with code lengths l_1, l_2, ..., l_n (a small check routine is sketched below).
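A minimal sketch of this check (the helper name is mine): given proposed codeword lengths, verify that Σ_k 2^(-l_k) <= 1 before trying to build a prefix code. The length set {2, 2, 3, 3, 3, 4, 4} used in the canonical Huffman example later sums to exactly 1.

/* Kraft-McMillan check: returns 1 if a prefix code with these
 * codeword lengths can exist, 0 otherwise. */
#include <stdio.h>

int kraft_ok(const int *len, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += 1.0 / (double)(1u << len[i]);   /* 2^(-l_i) */
    return sum <= 1.0 + 1e-12;
}

int main(void) {
    int lens[] = {2, 2, 3, 3, 3, 4, 4};        /* sums to exactly 1 */
    printf("prefix code possible: %s\n",
           kraft_ok(lens, 7) ? "yes" : "no");
    return 0;
}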

Outline
- Lecture 02 Recap
- Huffman Coding
- Golomb Coding and JPEG 2000 Lossless Coding

Huffman Coding
- A procedure to construct an optimal prefix code.
- Result of David Huffman's term paper in 1952, when he was a PhD student at MIT.
- [Photos: Shannon, Fano, Huffman (1925-1999).]

Huffman Code Design
- Requirement: the source probability distribution (not available in many cases).
- Procedure (a code-length construction sketch follows below):
  1. Sort the probabilities of all source symbols in descending order.
  2. Merge the last two into a new symbol; add their probabilities.
  3. Repeat Steps 1 and 2 until only one symbol (the root) is left.
  4. Code assignment: traverse the tree from the root to each leaf node, assigning 0 to the top branch and 1 to the bottom branch.
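A minimal C sketch of this procedure (not the instructor's code): it computes codeword lengths for the example on the next slide by repeated merging. Tie-breaking among equal probabilities may differ from the tree drawn on the slide, so the individual lengths can differ (Huffman codes are not unique), but the average length comes out the same, 2.2 bits/symbol.

/* Sketch: Huffman codeword lengths by repeatedly merging the two
 * smallest probabilities (O(n^2) selection; fine for small alphabets). */
#include <stdio.h>

#define N 5

int main(void) {
    double p[2 * N]      = {0.2, 0.4, 0.2, 0.1, 0.1};   /* a1..a5 */
    int    parent[2 * N] = {0};
    int    alive[2 * N]  = {1, 1, 1, 1, 1};
    int    nodes = N;
    int    next  = N;                     /* index for merged nodes */

    while (nodes > 1) {
        int lo1 = -1, lo2 = -1;           /* two smallest alive nodes */
        for (int i = 0; i < next; i++) {
            if (!alive[i]) continue;
            if (lo1 < 0 || p[i] < p[lo1]) { lo2 = lo1; lo1 = i; }
            else if (lo2 < 0 || p[i] < p[lo2]) { lo2 = i; }
        }
        p[next] = p[lo1] + p[lo2];        /* merge them into a new node */
        alive[next] = 1;
        alive[lo1] = alive[lo2] = 0;
        parent[lo1] = parent[lo2] = next;
        next++;
        nodes--;
    }

    int root = next - 1;
    double avg = 0.0;
    for (int i = 0; i < N; i++) {         /* codeword length = leaf depth */
        int len = 0;
        for (int j = i; j != root; j = parent[j]) len++;
        avg += p[i] * len;
        printf("a%d: p = %.1f, length = %d\n", i + 1, p[i], len);
    }
    printf("average length = %.2f bits/symbol\n", avg);  /* 2.20 */
    return 0;
}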

Example
- Source alphabet A = {a1, a2, a3, a4, a5}
- Probability distribution: {0.2, 0.4, 0.2, 0.1, 0.1}
- [Figure: repeated sort and merge steps. Merge a4 (0.1) and a5 (0.1) into 0.2; merge a3 (0.2) with that 0.2 into 0.4; merge a1 (0.2) with that 0.4 into 0.6; merge 0.6 with a2 (0.4) into 1.0; then assign codes from the root.]
- Resulting code: a2 = 1, a1 = 01, a3 = 000, a4 = 0010, a5 = 0011

Huffman code is prefix-free
- Codewords: 1 (a2), 01 (a1), 000 (a3), 0010 (a4), 0011 (a5)
- All codewords are leaf nodes.
- No code is a prefix of any other code (prefix-free).

Average Codeword Length vs Entropy
- Source alphabet A = {a, b, c, d, e}
- Probability distribution: {0.2, 0.4, 0.2, 0.1, 0.1}
- Code: {01, 1, 000, 0010, 0011}
- Entropy: H(S) = -(0.2*log2(0.2)*2 + 0.4*log2(0.4) + 0.1*log2(0.1)*2) = 2.122 bits/symbol
- Average Huffman codeword length: L = 0.2*2 + 0.4*1 + 0.2*3 + 0.1*4 + 0.1*4 = 2.2 bits/symbol
- This verifies H(S) <= L < H(S) + 1 (a short verification snippet is shown below).
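A few lines to reproduce the two numbers above from the distribution and the codeword lengths (a sketch; log2 is from C99 math.h, link with -lm).

/* Verify H(S) and the average Huffman codeword length for the example. */
#include <math.h>
#include <stdio.h>

int main(void) {
    double p[5]   = {0.2, 0.4, 0.2, 0.1, 0.1};
    int    len[5] = {2,   1,   3,   4,   4};   /* lengths of 01, 1, 000, 0010, 0011 */
    double H = 0.0, L = 0.0;
    for (int i = 0; i < 5; i++) {
        H += -p[i] * log2(p[i]);
        L += p[i] * len[i];
    }
    printf("H(S) = %.3f bits/symbol, L = %.2f bits/symbol\n", H, L);
    /* Prints H(S) = 2.122, L = 2.20, consistent with H(S) <= L < H(S)+1 */
    return 0;
}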

Huffman Code is not unique
- Two choices for each split: {0, 1} or {1, 0}.
- Multiple ordering choices for tied probabilities.
- [Figures: equivalent Huffman trees for the same source, obtained by swapping the 0/1 labels and by breaking ties in different orders; all have the same average length.]

Huffman Coding is Optimal
- Assume the probabilities are ordered: p1 >= p2 >= ... >= pm.
- Lemma: for any distribution, there exists an optimal prefix code that satisfies:
  - If p_j >= p_k, then l_j <= l_k; otherwise we could swap codewords to reduce the average length.
  - The two least probable letters have codewords of the same length; otherwise we could truncate the longer one without violating the prefix-free condition.
  - The two longest codewords differ only in the last bit and correspond to the two least probable symbols; otherwise we could rearrange the codewords to achieve this.
- Proof skipped.

Canonical Huffman Code
- The Huffman algorithm is needed only to compute the optimal codeword lengths.
- The optimal codewords for a given data set are not unique.
- The canonical Huffman code is well structured: given the codeword lengths, we can find a canonical Huffman code.
- Also known as slice code or alphabetic code.

Canonical Huffman Code
- Rules:
  - Assign 0 to the left branch and 1 to the right branch.
  - Build the tree from left to right in increasing order of depth.
  - Each leaf is placed at the first available position.
- Example: codeword lengths 2, 2, 3, 3, 3, 4, 4.
- Verify that they satisfy the Kraft-McMillan inequality: Σ_i 2^(-l_i) = 2/4 + 3/8 + 2/16 = 1 <= 1.
- [Figure: a non-canonical example tree vs. the canonical tree.]
- Canonical codewords: 00, 01, 100, 101, 110, 1110, 1111.

Canonical Huffman Properties
- The first code is a series of 0s.
- Codes of the same length are consecutive: 100, 101, 110.
- If we pad zeros to the right so that all codewords have the same length, a shorter code has a lower value than any longer code: 0000 < 0100 < 1000 < 1010 < 1100 < 1110 < 1111.
- Coding from length level n to level n+1 (see the sketch below):
  C(n+1, 1) = 2 (C(n, last) + 1): append a 0 to the next available level-n code.
  (C(n+1, 1) is the first code of length n+1; C(n, last) is the last code of length n.)
- If we go from length n to n+2 directly, e.g., lengths 1, 3, 3, 3, 4, 4:
  C(n+2, 1) = 4 (C(n, last) + 1).
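A minimal sketch of this assignment rule (my own helper, not from the slides): starting from sorted codeword lengths, the next canonical code is (previous + 1) shifted left by the length difference, which covers both the n to n+1 and the n to n+2 cases above. For lengths {2, 2, 3, 3, 3, 4, 4} it prints 00, 01, 100, 101, 110, 1110, 1111.

/* Canonical Huffman code assignment from sorted codeword lengths:
 * next_code = (prev_code + 1) << (len - prev_len). */
#include <stdio.h>

int main(void) {
    int len[] = {2, 2, 3, 3, 3, 4, 4};          /* must be non-decreasing */
    int n = 7;
    unsigned code = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0)
            code = (code + 1) << (len[i] - len[i - 1]);
        /* print the code as a binary string of len[i] bits */
        for (int b = len[i] - 1; b >= 0; b--)
            putchar((code >> b) & 1 ? '1' : '0');
        printf("  (length %d)\n", len[i]);
    }
    return 0;
}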

Advantages of Canonical Huffman
1. Reduced memory requirement.
- A non-canonical tree needs all codewords and the lengths of all codewords: a lot of space for a large table.
- A canonical tree only needs (a decoding sketch follows below):
  - Min: shortest codeword length
  - Max: longest codeword length
  - Distribution: number of codewords at each level
- For the example above: Min = 2, Max = 4, number per level: 2, 3, 2.
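To illustrate the memory saving, here is a sketch of a canonical decoder that stores only Min/Max, the per-length counts {2, 3, 2} and the symbols in canonical order; no codeword table at all. The bit source and the symbol names are hypothetical.

/* Sketch: canonical Huffman decoding from per-length counts only.
 * count[l] = number of codewords of length l (lengths 2..4: 2, 3, 2).
 * Symbols are stored in canonical order; codewords themselves are not stored. */
#include <stdio.h>

#define MIN_LEN 2
#define MAX_LEN 4

static const int  count[MAX_LEN + 1] = {0, 0, 2, 3, 2};
static const char symbol[]           = "ABCDEFG";       /* canonical order */

/* hypothetical bit source: returns one bit at a time from a test buffer */
static const char *bits = "110" "00" "1111";            /* codes for E, A, G */
static int read_bit(void) { return *bits ? *bits++ - '0' : -1; }

static int decode_symbol(void) {
    int code = 0, first = 0, index = 0;
    for (int l = 1; l <= MAX_LEN; l++) {
        code = (code << 1) | read_bit();
        if (l >= MIN_LEN) {
            if (code - first < count[l])           /* falls in this length? */
                return symbol[index + (code - first)];
            index += count[l];
            first = (first + count[l]) << 1;       /* first code of length l+1 */
        } else {
            first <<= 1;
        }
    }
    return -1;                                     /* invalid stream */
}

int main(void) {
    for (int i = 0; i < 3; i++) putchar(decode_symbol());
    putchar('\n');                                 /* prints "EAG" */
    return 0;
}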

Outline
- Lecture 02 Recap
- Huffman Coding
- Golomb Coding

Unary Code (Comma Code)
- Encode a nonnegative integer n by n 1s followed by a 0 (or n 0s followed by a 1).
- No need to store a codeword table; very simple.
- Is this code prefix-free?
  n:        0  1   2    3     4      5
  Codeword: 0, 10, 110, 1110, 11110, 111110
- When is this code optimal? When the probabilities are 1/2, 1/4, 1/8, 1/16, 1/32, ...: the dyadic Huffman code becomes the unary code in this case.

Implementation
- Very efficient.
Encoding:
UnaryEncode(n) {
  while (n > 0) {
    WriteBit(1);
    n--;
  }
  WriteBit(0);
}
Decoding:
UnaryDecode() {
  n = 0;
  while (ReadBit() == 1) {
    n++;
  }
  return n;
}

Golomb Code [Golomb, 1966]
- A multi-resolution approach: divide all numbers into groups of equal size m (denote as Golomb(m) or Golomb-m).
- Groups with smaller symbol values have shorter codes; symbols in the same group have codewords of similar lengths.
- The codeword length grows much more slowly than in the unary code.
- [Figure: the range 0..max partitioned into consecutive groups of size m.]
- Codeword = (unary, fixed-length): the group ID is coded with a unary code, the index within the group with a (nearly) fixed-length code.

Golomb Code
- q: quotient, q = floor(n / m), coded with the unary code:
  q:        0  1   2    3     4      5       6
  Codeword: 0, 10, 110, 1110, 11110, 111110, 1111110
- r: remainder, r = n - qm, coded with a fixed-length code:
  - k bits if m = 2^k. m = 8: 000, 001, ..., 111.
  - If m != 2^k (not desired): floor(log2 m) bits for the smaller values of r, ceil(log2 m) bits for the larger values of r (truncated binary).
    m = 5: 00, 01, 10, 110, 111.
A general Golomb(m) encoding sketch follows below.
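A sketch of Golomb(m) encoding for general m (my own helper names): unary code for the quotient, truncated binary for the remainder, with floor(log2 m) bits for the first 2^ceil(log2 m) - m remainders and ceil(log2 m) bits for the rest. Bits are printed as characters rather than packed into a bitstream. Running it with m = 5 reproduces the Golomb-5 table on the next slide.

/* Golomb(m) encoding sketch: unary quotient + truncated binary remainder.
 * Emits bits as '0'/'1' characters for illustration. */
#include <stdio.h>

static void put_bit(int b) { putchar(b ? '1' : '0'); }

static void golomb_encode(unsigned n, unsigned m) {
    unsigned q = n / m, r = n % m;

    /* quotient: q ones followed by a zero (unary) */
    for (unsigned i = 0; i < q; i++) put_bit(1);
    put_bit(0);

    /* remainder: truncated binary code */
    unsigned b = 0;
    while ((1u << b) < m) b++;          /* b = ceil(log2 m) */
    unsigned cutoff = (1u << b) - m;    /* first 'cutoff' remainders use b-1 bits */
    if (r < cutoff) {
        for (int i = (int)b - 2; i >= 0; i--) put_bit((r >> i) & 1);
    } else {
        unsigned v = r + cutoff;        /* shift into the b-bit range */
        for (int i = (int)b - 1; i >= 0; i--) put_bit((v >> i) & 1);
    }
}

int main(void) {
    for (unsigned n = 0; n < 10; n++) { /* reproduces the Golomb-5 table */
        printf("n=%u: ", n);
        golomb_encode(n, 5);
        putchar('\n');
    }
    return 0;
}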

Golomb Code with m = 5 (Golomb-5)
n  q r code      n  q r code      n   q r code
0  0 0 000       5  1 0 1000      10  2 0 11000
1  0 1 001       6  1 1 1001      11  2 1 11001
2  0 2 010       7  1 2 1010      12  2 2 11010
3  0 3 0110      8  1 3 10110     13  2 3 110110
4  0 4 0111      9  1 4 10111     14  2 4 110111

Golomb vs Canonical Huffman
- Codewords (Golomb-5, n = 0..9): 000, 001, 010, 0110, 0111, 1000, 1001, 1010, 10110, 10111
- Canonical form: from left to right, from short to long, take the first valid spot.
- The Golomb code is a canonical Huffman code, with more properties.

Golomb-Rice Code
- A special Golomb code with m = 2^k.
- The remainder r is simply the k LSBs of n.
- m = 8:
  n  q r code      n   q r code
  0  0 0 0000      8   1 0 10000
  1  0 1 0001      9   1 1 10001
  2  0 2 0010      10  1 2 10010
  3  0 3 0011      11  1 3 10011
  4  0 4 0100      12  1 4 10100
  5  0 5 0101      13  1 5 10101
  6  0 6 0110      14  1 6 10110
  7  0 7 0111      15  1 7 10111

Implementation
- Remainder bits: RBits = 3 for m = 8. WriteBits(n, RBits) outputs the lower RBits bits of n.
Encoding:
GolombEncode(n, RBits) {
  q = n >> RBits;
  UnaryCode(q);
  WriteBits(n, RBits);
}
Decoding:
GolombDecode(RBits) {
  q = UnaryDecode();
  n = (q << RBits) + ReadBits(RBits);
  return n;
}

Exponential Golomb Code (Exp-Golomb)
- The Golomb code divides the alphabet into groups of equal size m; in the Exp-Golomb code the group size increases exponentially: 1, 2, 4, 8, 16, ...
- Codes still contain two parts: a unary code followed by a fixed-length code.
- Proposed by Teuhola in 1978.
  n  code       n   code
  0  0          8   1110001
  1  100        9   1110010
  2  101        10  1110011
  3  11000      11  1110100
  4  11001      12  1110101
  5  11010      13  1110110
  6  11011      14  1110111
  7  1110000    15  111100000
An encoder sketch matching this table follows below.
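The decoder on the next slide inverts the following mapping; here is a matching encoder sketch (order-0 Exp-Golomb, as tabulated above): find the group ID g such that 2^g - 1 <= n < 2^(g+1) - 1, send g in unary, then send the g-bit index n - (2^g - 1).

/* Order-0 Exp-Golomb encoder sketch, matching the table above.
 * Group g holds the 2^g values from 2^g - 1 to 2^(g+1) - 2. */
#include <stdio.h>

static void put_bit(int b) { putchar(b ? '1' : '0'); }

static void exp_golomb_encode(unsigned n) {
    unsigned g = 0;
    while (((1u << (g + 1)) - 1) <= n) g++;   /* group ID */
    for (unsigned i = 0; i < g; i++) put_bit(1);
    put_bit(0);                               /* unary group ID */
    unsigned index = n - ((1u << g) - 1);     /* offset within the group */
    for (int i = (int)g - 1; i >= 0; i--) put_bit((index >> i) & 1);
}

int main(void) {
    for (unsigned n = 0; n <= 15; n++) {
        printf("n=%2u: ", n);
        exp_golomb_encode(n);
        putchar('\n');
    }
    return 0;
}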

Decoding Implementation
ExpGolombDecode() {
  GroupID = UnaryDecode();
  if (GroupID == 0) {
    return 0;
  } else {
    Base = (1 << GroupID) - 1;
    Index = ReadBits(GroupID);
    return (Base + Index);
  }
}
(Group IDs for the table above: n = 0 has ID 0; n = 1..2 have ID 1; n = 3..6 have ID 2; n = 7..14 have ID 3.)

Outline
- Golomb Code Family: Unary Code, Golomb Code, Golomb-Rice Code, Exponential Golomb Code
- Why Golomb code?

Geometric Distribution (GD)
- Geometric distribution with parameter ρ: P(x) = ρ^x (1 - ρ), x >= 0, integer.
- Probability of the number of failures before the first success in a series of independent Yes/No experiments (Bernoulli trials).
- The unary code is the optimal prefix code for a geometric distribution with ρ <= 1/2:
  - ρ = 1/4: P(x): 0.75, 0.19, 0.05, 0.01, 0.003, ... Huffman coding never needs to re-order, so it is equivalent to the unary code. The unary code is the optimal prefix code, but not efficient (average length >> entropy).
  - ρ = 3/4: P(x): 0.25, 0.19, 0.14, 0.11, 0.08, ... Reordering is needed for the Huffman code; the unary code is not the optimal prefix code.
  - ρ = 1/2: expected length = entropy. The unary code is not only the optimal prefix code, but also optimal among all entropy coders (including arithmetic coding).

Geometric Distribution (GD)
- The geometric distribution is very useful for image/video compression.
- Example 1: run-length coding. Binary sequence with i.i.d. distribution and P(0) = ρ close to 1, e.g.:
  0000010000000010000110000001
- Entropy << 1 bit/symbol, so a per-symbol prefix code has poor performance.
- Run-length coding is efficient for compressing such data: code the number of consecutive 0s between two 1s.
  Run-length representation of the sequence: 5, 8, 4, 0, 6.
- Probability distribution of the run length r: P(r = n) = ρ^n (1 - ρ): n 0s followed by a 1.
- The run has a one-sided geometric distribution with parameter ρ. [Figure: P(r) decaying in r.]
A run-length extraction sketch follows below.
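A small sketch of the run-length extraction step on the example sequence above: each run is the number of consecutive 0s before the next 1; the resulting runs, which are geometrically distributed, would then be handed to a Golomb coder.

/* Sketch: run-length extraction for a sparse binary source.
 * Each run = number of consecutive 0s before the next 1; the runs
 * (geometrically distributed) would then be fed to a Golomb coder. */
#include <stdio.h>

int main(void) {
    const char *seq = "0000010000000010000110000001";  /* example sequence */
    int run = 0;
    printf("run-lengths: ");
    for (const char *p = seq; *p; p++) {
        if (*p == '0') {
            run++;
        } else {                 /* a 1 terminates the current run */
            printf("%d ", run);
            run = 0;
        }
    }
    putchar('\n');               /* prints: 5 8 4 0 6 */
    return 0;
}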

Geometric Distribution
- GD is the discrete analog of the exponential distribution: f(x) = λ e^(-λx), x >= 0.
- The two-sided geometric distribution is the discrete analog of the Laplacian distribution (also called the double exponential distribution): f(x) = (λ/2) e^(-λ|x|).
- [Figures: the exponential and Laplacian densities.]

Why Golomb Code?
- Significance of the Golomb code: for any geometric distribution (GD), a Golomb code is the optimal prefix code and gets as close to the entropy as possible (among all prefix codes).
- How to determine the Golomb parameter?
- How to apply it in a practical codec?

Geometric Distribution
- Example 2: GD is also a good model for the prediction error e(n) = x(n) - pred(x(1), ..., x(n-1)).
- Most e(n) values are small and concentrated around 0, so they can be modeled by a (two-sided) geometric distribution: p(n) proportional to ρ^|n|, 0 < ρ < 1.
- [Figure: histogram of p(n) vs. n.]

Optimal Code for Geometric Distribution
- Geometric distribution with parameter ρ: P(X = n) = ρ^n (1 - ρ).
- The unary code is the optimal prefix code when ρ <= 1/2; it is also optimal among all entropy coders for ρ = 1/2.
- How to design the optimal code when ρ > 1/2? Transform into a GD with parameter <= 1/2 (as close as possible). How? By grouping m events together.
- Each x can be written as x = qm + r. Then
  P_q(q) = Σ_{r=0..m-1} P_X(qm + r) = Σ_{r=0..m-1} ρ^(qm+r) (1 - ρ) = (ρ^m)^q (1 - ρ^m),
  i.e., q has a geometric distribution with parameter ρ^m.
- The unary code is optimal for q if ρ^m <= 1/2; m is the minimal such integer:
  m = ceil( -log 2 / log ρ ) = ceil( log(1/2) / log ρ ).
A sketch of this rule follows below.
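A small sketch of this rule (the helper name is mine): the optimal Golomb parameter is the smallest integer m with ρ^m <= 1/2, falling back to m = 1 (plain unary) when ρ <= 1/2.

/* Optimal Golomb parameter for a geometric source:
 * smallest m with rho^m <= 1/2, i.e. m = ceil(log(1/2) / log(rho)). */
#include <math.h>
#include <stdio.h>

static unsigned golomb_m(double rho) {
    if (rho <= 0.5) return 1;                 /* unary code is already optimal */
    return (unsigned)ceil(log(0.5) / log(rho));
}

int main(void) {
    double rhos[] = {0.5, 0.75, 0.9, 0.95, 0.99};
    for (int i = 0; i < 5; i++)
        printf("rho = %.2f -> m = %u\n", rhos[i], golomb_m(rhos[i]));
    /* e.g. rho = 0.75 -> m = 3, rho = 0.9 -> m = 7, rho = 0.99 -> m = 69 */
    return 0;
}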

Golomb Parameter Estimation (J2K book: pp. 55)
- Goal of the adaptive Golomb code: for the given data, find the best m such that ρ^m <= 1/2.
- How to find ρ from the statistics of past data? With P(x) = ρ^x (1 - ρ):
  E(x) = Σ_{x>=0} x ρ^x (1 - ρ) = ρ / (1 - ρ), so ρ = E(x) / (1 + E(x)).
- ρ^m <= 1/2 gives m >= log(1/2) / log( E(x) / (1 + E(x)) ).
- Let m = 2^k: k = ceil( log2( log(1/2) / log( E(x) / (1 + E(x)) ) ) ).
- Too costly to compute.

Golomb Parameter Estimation (J2K book: pp. 55)
- A faster method: assume ρ is close to 1, so 1 - ρ is close to 0 and E(x) = ρ / (1 - ρ) is approximately 1 / (1 - ρ).
- Then ρ^m = (1 - (1 - ρ))^m is approximately 1 - m(1 - ρ), i.e., approximately 1 - m / E(x).
- Setting ρ^m = 1/2 gives m approximately E(x) / 2, i.e., 2^k is approximately E(x) / 2:
  k = max( 0, ceil( log2( E(x) / 2 ) ) ).
An adaptive estimation sketch follows below.
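A sketch of the fast rule k = max(0, ceil(log2(E(x)/2))), with E(x) estimated by the running mean of the values coded so far. The variable names and the integer-shift search are my own; JPEG-LS and JPEG 2000 use closely related counter-based updates rather than exactly this code.

/* Adaptive Rice parameter sketch: k = max(0, ceil(log2(mean/2))),
 * with the mean of past samples as the estimate of E(x). */
#include <stdio.h>

static unsigned rice_k(unsigned long sum, unsigned long count) {
    if (count == 0) return 0;
    /* smallest k with count * 2^(k+1) >= sum, i.e. 2^k >= mean/2 */
    unsigned k = 0;
    while ((count << (k + 1)) < sum) k++;
    return k;
}

int main(void) {
    /* hypothetical prediction residual magnitudes */
    unsigned x[] = {3, 0, 5, 2, 9, 1, 4, 7, 2, 3};
    unsigned long sum = 0, count = 0;
    for (int i = 0; i < 10; i++) {
        unsigned k = rice_k(sum, count);     /* parameter used to code x[i] */
        printf("x=%u coded with Golomb-Rice k=%u\n", x[i], k);
        sum += x[i];
        count++;                             /* update statistics afterwards */
    }
    return 0;
}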

Summary
- Huffman Coding
  - A prefix code that is optimal in (average) code length.
  - Canonical form to reduce the variation of the code length and the table storage.
  - Widely used.
- Golomb Coding
  - Suitable for coding prediction errors in images.
  - Optimal prefix code for geometrically distributed sources (reduces to the unary code for ρ = 0.5).
  - Simple to encode and decode.
  - Many practical applications, e.g., JPEG 2000 lossless coding.

Q&A