
Chapter 3 Source Coding

3.1 An Introduction to Source Coding
3.2 Optimal Source Codes
3.3 Shannon-Fano Code
3.4 Huffman Code

3.1 An Introduction to Source Coding

Entropy (in bits per symbol) gives the average number of bits required to represent a source symbol. This suggests a mapping between source symbols and bits: source coding can be seen as a mapping mechanism between symbols and bits. For a string of symbols, how can we use fewer bits to represent them? Intuition: use short descriptions for the most frequent symbols, and necessarily longer descriptions for the less frequent symbols.

3.1 An Introduction to Source Coding

Symbols: 1 2 1 4 4 3 4 4
Bits (with a fixed 2-bit mapping, e.g., 1→00, 2→01, 3→10, 4→11): 00 01 00 11 11 10 11 11
Or can that be a shorter string of bits?

Definition: Let x denote a source symbol and C(x) denote the source codeword of x. If the length of C(x) is l(x) (in bits) and x occurs with probability p(x), then the expected length L(C) of source code C is

L(C) = Σ_x p(x) l(x).

It is the average number of bits required to represent a symbol under source coding scheme C.

3.1 An Introduction to Source Coding

Let us look at the following example.

Example 3.1 Let X be a random variable with the following distribution: X ∈ {1, 2, 3, 4},
P(X=1) = 1/2, P(X=2) = 1/4, P(X=3) = 1/8, P(X=4) = 1/8.

The entropy of X is H(X) = −Σ_{x∈{1,2,3,4}} P(x) log2 P(x) = 1.75 bits/sym.
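As a quick numerical check of the entropy value above, here is a minimal Python sketch (not part of the slides; variable names are illustrative), using the distribution of Example 3.1:

import math

p = {1: 1/2, 2: 1/4, 3: 1/8, 4: 1/8}               # distribution of Example 3.1
H = -sum(px * math.log2(px) for px in p.values())   # H(X) = -sum p(x) log2 p(x)
print(H)                                            # 1.75 bits/symbol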

3.1 An Introduction to Source Coding

Source Coding 1 (C): C(1) = 00, C(2) = 01, C(3) = 10, C(4) = 11.
L(C) = (1/2)·2 + (1/4)·2 + (1/8)·2 + (1/8)·2 = 2 bits.
On average, we use 2 bits to denote a symbol: L(C) > H(X).

Source Coding 2 (C'): C'(1) = 0, C'(2) = 10, C'(3) = 110, C'(4) = 111.
L(C') = (1/2)·1 + (1/4)·2 + (1/8)·3 + (1/8)·3 = 1.75 bits.
On average, we use 1.75 bits to denote a symbol: L(C') = H(X).

Remark: C' should be a better source coding scheme than C.
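The comparison of the two codes can be reproduced numerically. A minimal Python sketch (not from the slides), assuming the two code tables just listed:

p = {1: 1/2, 2: 1/4, 3: 1/8, 4: 1/8}
C1 = {1: "00", 2: "01", 3: "10", 4: "11"}     # fixed-length code C
C2 = {1: "0", 2: "10", 3: "110", 4: "111"}    # variable-length code C'

L1 = sum(p[x] * len(w) for x, w in C1.items())
L2 = sum(p[x] * len(w) for x, w in C2.items())
print(L1, L2)   # 2.0 and 1.75 bits/symbol; H(X) = 1.75, so L(C') meets the entropy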

3.1 An Introduction to Source Coding

Theorem 3.1 (Shannon's Source Coding Theorem) Given a memoryless source X whose symbols are chosen from the alphabet {x_1, x_2, ..., x_m} with symbol probabilities P(x_1) = p_1, P(x_2) = p_2, ..., P(x_m) = p_m, where Σ_{i=1}^{m} p_i = 1: if the source sequence is of length n, then as n → ∞ it can be encoded with H(X) bits per symbol, i.e., the coded sequence will be of about nH(X) bits.

Note: H(X) = −Σ_{i=1}^{m} p_i log2(p_i) bits/sym.

3.1 An Introduction to Source Coding

Important features of source coding:

1. Unambiguous representation of source symbols (non-singularity)

X : C(X)
1 : 0
2 : 010
3 : 01
4 : 10

Problem: when we try to decode 010, it can be 2, or 1 followed by 4, or 3 followed by 1. The decoding is NOT unique.

2. Uniquely decodable

X : C(X)
1 : 10
2 : 00
3 : 11
4 : 110

Problem: when we try to decode a string that starts with 11, such as 1100...0, we cannot tell whether the first codeword is 11 (symbol 3) or 110 (symbol 4) until we see how many 0s follow. We will have to wait and see the end of the bit string. The decoding is NOT instantaneous.

3.1 An Introduction to Source Coding

3. Instantaneous code

Definition: For an instantaneous code, no codeword is a prefix of any other codeword.

X : C(X)
1 : 0
2 : 10
3 : 110
4 : 111

Observation: if you try to decode 11111010110111 (= 111 110 10 110 111), you will notice that the puncturing positions are determined the instant you reach a complete source codeword. The decoding is instantaneous, and the decoding output is 4 3 2 3 4.
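Decoding a prefix code really is instantaneous: the decoder emits a symbol the moment its buffer matches a codeword. A minimal Python sketch (not from the slides), assuming the code table above:

code = {1: "0", 2: "10", 3: "110", 4: "111"}     # instantaneous (prefix) code
decode_table = {w: x for x, w in code.items()}   # reverse lookup: codeword -> symbol

def decode(bits):
    symbols, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode_table:          # a complete codeword has been reached
            symbols.append(decode_table[buf])
            buf = ""                     # emit immediately and start over
    return symbols

print(decode("11111010110111"))          # [4, 3, 2, 3, 4]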

3.2 Optimal Source Codes

- How can we find an optimal source code?
- An optimal source code is:
(1) an instantaneous code (prefix code);
(2) one with the smallest expected length L = Σ_i p_i l_i.

Theorem 3.2 (Kraft Inequality) For an instantaneous code over an alphabet of size D (e.g., D = 2 for binary codes), the codeword lengths l_1, l_2, ..., l_m must satisfy Σ_i D^(−l_i) ≤ 1.

Remark: an instantaneous code with codeword lengths l_1, ..., l_m exists if and only if Σ_i D^(−l_i) ≤ 1.

Example 3.2 For the source code C' of Example 3.1: 2^(−1) + 2^(−2) + 2^(−3) + 2^(−3) = 1.
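A quick numerical check of the Kraft inequality for a set of codeword lengths. A minimal Python sketch (not from the slides; the first length set is that of code C' from Example 3.1):

def kraft_sum(lengths, D=2):
    # Sum of D^(-l_i); an instantaneous code with these lengths exists iff this is <= 1.
    return sum(D ** (-l) for l in lengths)

print(kraft_sum([1, 2, 3, 3]))   # 1.0  -> satisfies the Kraft inequality
print(kraft_sum([1, 1, 2, 2]))   # 1.5  -> no binary prefix code has these lengths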

3.2 Optimal Source Codes

(Figure: a binary code tree, rooted at the top.)

Proof:
- The tree above illustrates the assignment of source codeword symbols in a binary way when D = 2. A complete solid path from the root represents a source codeword.
- By the property of an instantaneous code, if the first source codeword is assigned the path labelled 0, no other codeword may continue along that path, i.e., no other codeword may have 0 as a prefix. This codeword-symbol assignment process repeats as the number of source symbols increases.

3.2 Optimal Source Codes

(Figure: a binary code tree of depth l_max.)

- At level l_max of the tree (source codeword length l_max), there are at most D^(l_max) nodes; similarly, at level l_i of the tree there are at most D^(l_i) nodes. A codeword at level l_i has at most D^(l_max − l_i) descendants at level l_max, and because of the prefix property the descendant sets of different codewords are disjoint. Summing over all levels l_i, the total number of descendants cannot exceed the number of nodes at level l_max:

Σ_i D^(l_max − l_i) ≤ D^(l_max)  ⟹  Σ_i D^(−l_i) ≤ 1.
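The tree argument also works in the converse direction mentioned in the remark after Theorem 3.2: given lengths that satisfy the Kraft inequality, codewords can be assigned greedily, shortest first, so that no assigned codeword is ever extended. A minimal Python sketch of this canonical-style construction (not from the slides; the function name is illustrative):

def code_from_lengths(lengths):
    # Assign binary codewords canonically: shortest lengths first; each codeword is the
    # previous one plus one, left-shifted to the new length. Prefix-free iff Kraft holds.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codewords = [None] * len(lengths)
    value, prev_len = 0, lengths[order[0]]
    for i in order:
        value <<= lengths[i] - prev_len
        prev_len = lengths[i]
        codewords[i] = format(value, "0{}b".format(lengths[i]))
        value += 1
    return codewords

print(code_from_lengths([1, 2, 3, 3]))   # ['0', '10', '110', '111']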

3.2 Optimal Source Codes

(Figure: a binary code tree.)

- The expected length of this tree is E[l] = Σ_i l_i p_i, where l_i is the length of the source codeword for symbol x_i and p_i is the probability of symbol x_i. The expected length of the tree is the expected length of the source code.
- The tree represents an instantaneous source code.

3.2 Optimal Source Codes

- Finding the smallest expected length L becomes:

minimize L = Σ_i p_i l_i subject to Σ_i D^(−l_i) ≤ 1.

- Using Lagrange multipliers, the constrained minimization can be written as:

minimize J = Σ_i p_i l_i + λ (Σ_i D^(−l_i)).

- Calculus: ∂J/∂l_i = p_i − λ D^(−l_i) log_e(D). To make ∂J/∂l_i = 0, we need D^(−l_i) = p_i / (λ log_e D). To satisfy the Kraft inequality with equality, we need λ = 1/log_e(D). Hence p_i = D^(−l_i), and to minimize L we need l_i = −log_D(p_i).

3.2 Optimal Source Codes

- With l_i = −log_D(p_i), we have L = Σ_i p_i l_i = −Σ_i p_i log_D(p_i) = H_D(X), the entropy of the source symbols (taken with base-D logarithms).

Theorem 3.3 (Lower Bound of the Expected Length) The expected length L of an instantaneous D-ary code for a random variable X is lower bounded by L ≥ H_D(X).

Remark: since l_i can only be an integer,
L = H_D(X) if l_i = −log_D(p_i) is an integer for every i;
L > H_D(X) otherwise, e.g., when l_i = ⌈−log_D(p_i)⌉.

3.2 Optimal Source Codes

Corollary 3.4 (Upper Bound of the Expected Length) The expected length L of an optimal instantaneous D-ary code for a random variable X is upper bounded by L < H_D(X) + 1.

Proof: choose l_i = ⌈−log_D(p_i)⌉, so that −log_D(p_i) ≤ l_i < −log_D(p_i) + 1. Multiplying the inequality by p_i and summing over i gives

−Σ_i p_i log_D(p_i) ≤ Σ_i p_i l_i < −Σ_i p_i log_D(p_i) + Σ_i p_i

⟹ H_D(X) ≤ L < H_D(X) + 1.
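The two bounds can be checked numerically for any distribution by using the lengths l_i = ⌈−log_D(p_i)⌉ from the proof. A minimal Python sketch (not from the slides), assuming D = 2 and an arbitrary example distribution:

import math

def entropy(p, D=2):
    return -sum(pi * math.log(pi, D) for pi in p)

def ceil_length_code_L(p, D=2):
    # Expected length when l_i = ceil(-log_D p_i); satisfies H_D(X) <= L < H_D(X) + 1.
    return sum(pi * math.ceil(-math.log(pi, D)) for pi in p)

p = [0.6, 0.3, 0.1]                    # an illustrative distribution
H, L = entropy(p), ceil_length_code_L(p)
print(H, L, H <= L < H + 1)            # approx 1.295, 1.6, True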

3.3 Shannon-Fano Code

- Given a source that contains symbols x_1, x_2, x_3, ..., x_m with probabilities p_1, p_2, p_3, ..., p_m, respectively.
- Determine the source codeword length for symbol x_i as l_i = ⌈log2(1/p_i)⌉ bits.
- Further determine l_max = max{l_i, ∀i}.
- Shannon-Fano code construction:
Step 1: Construct a binary tree of depth l_max.
Step 2: Choose a free node of depth l_i and delete its descendant paths and nodes. The path from the root to that node represents the source codeword for symbol x_i.

3.3 Shannon-Fano Code

Example 3.3 Given a source with symbols x_1, x_2, x_3, x_4 that occur with probabilities p_1 = .4, p_2 = .3, p_3 = .2, p_4 = .1, respectively, construct its Shannon-Fano code.

We can determine l_1 = ⌈log2(1/p_1)⌉ = 2, l_2 = ⌈log2(1/p_2)⌉ = 2, l_3 = ⌈log2(1/p_3)⌉ = 3, l_4 = ⌈log2(1/p_4)⌉ = 4, and l_max = 4. Construct a binary tree of depth 4.

(Figure: binary code tree of depth 4, pruned at the chosen codeword nodes.)

The source codewords are (one valid assignment) x_1: 00, x_2: 01, x_3: 100, x_4: 1010.
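The codeword lengths of Example 3.3 (though not the particular bit patterns, which depend on which tree nodes are chosen) follow directly from the ceiling rule. A minimal Python sketch, not from the slides:

import math

p = [0.4, 0.3, 0.2, 0.1]                           # Example 3.3
lengths = [math.ceil(math.log2(1 / pi)) for pi in p]
print(lengths)                                      # [2, 2, 3, 4]
print(sum(2 ** (-l) for l in lengths))              # 0.6875 <= 1, so Kraft is satisfied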

3.4 Huffman Code

- Given a source that contains symbols x_1, x_2, x_3, ..., x_m with probabilities p_1, p_2, p_3, ..., p_m, respectively.
- Huffman code construction (see the sketch below):
Step 1: Merge the 2 smallest symbol probabilities;
Step 2: Assign the 2 corresponding symbols (or merged groups) the bits 0 and 1, then go back to Step 1.
Repeat the above process until the two remaining probabilities are merged into a probability of 1.
- The Huffman code is the shortest prefix code, i.e., an optimal code.
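For reference, here is a minimal Python sketch (not part of the slides; function and variable names are illustrative) of the binary Huffman construction described above, using a heap to repeatedly merge the two smallest probabilities. Ties may be broken differently than in the worked examples, so individual codewords can differ, but the expected length is always optimal.

import heapq
import itertools

def huffman_code(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> binary codeword."""
    counter = itertools.count()                    # tie-breaker so the heap never compares lists
    heap = [(p, next(counter), [x]) for x, p in probs.items()]
    heapq.heapify(heap)
    code = {x: "" for x in probs}
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)        # two smallest probabilities
        p1, _, group1 = heapq.heappop(heap)
        for x in group0:                           # prepend 0 / 1 to every symbol in each group
            code[x] = "0" + code[x]
        for x in group1:
            code[x] = "1" + code[x]
        heapq.heappush(heap, (p0 + p1, next(counter), group0 + group1))
    return code

p = {1: 0.25, 2: 0.25, 3: 0.2, 4: 0.15, 5: 0.15}   # distribution of Example 3.4 below
code = huffman_code(p)
L = sum(p[x] * len(w) for x, w in code.items())
print(code, L)                                     # expected length 2.3 bits/symbol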

3.4 Huffman Code

Example 3.4 Consider a random variable X taking values in {1, 2, 3, 4, 5} with probabilities P(X=1) = .25, P(X=2) = .25, P(X=3) = .2, P(X=4) = .15, P(X=5) = .15. Construct a Huffman code to represent X.

Codeword   X   P(X)
           1   .25    .3
           2   .25    .25
           3   .2     .25
           4   .15    .2
           5   .15

(First merge: .15 + .15 = .3; the probability columns are re-sorted in decreasing order after each merge.)

3.4 Huffman Code

Codeword   X   P(X)
           1   .25    .3     .45
           2   .25    .25    .3
           3   .2     .25    .25
           4   .15    .2
           5   .15

Codeword   X   P(X)
           1   .25    .3     .45    .55
           2   .25    .25    .3     .45
           3   .2     .25    .25
           4   .15    .2
           5   .15

3.4 Huffman Code

Codeword   X   P(X)
01         1   .25    .3     .45    .55
10         2   .25    .25    .3     .45
11         3   .2     .25    .25
000        4   .15    .2
001        5   .15

(The codewords shown are one valid assignment of 0/1 labels to the merges.)

Validation: l_1 = 2, l_2 = 2, l_3 = 2, l_4 = 3, l_5 = 3.
L = Σ_X l(X) P(X) = 2.3 bits/symbol.
H_2(X) = −Σ_X P(X) log2 P(X) ≈ 2.285 bits/symbol.
L ≥ H_2(X).

3.4 Huffman Code

So now, let us look back at the problem proposed at the beginning: how to represent the source vector {1 2 1 4 4 3 4 4}? The empirical probabilities are P(1) = .25, P(2) = .125, P(3) = .125, P(4) = .5.

Codeword   X   P(X)
0          4   .5      .5     .5
10         1   .25     .25    .5
110        2   .125    .25
111        3   .125

(Again, the codewords shown are one valid assignment.) The vector should then be represented as {10 110 10 0 0 111 0 0}, and L = 1.75 bits/symbol.

Question: what happens if the source vector changes, so that the symbols occur with different frequencies?

Remark: the Huffman code and its expected length depend on the source vector, i.e., on the entropy of the source.
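The code above can be obtained directly from the empirical symbol frequencies of the vector. A minimal Python sketch (not from the slides), assuming the source vector and codeword table just shown:

from collections import Counter

vector = [1, 2, 1, 4, 4, 3, 4, 4]
counts = Counter(vector)
p = {x: c / len(vector) for x, c in counts.items()}   # empirical probabilities
code = {4: "0", 1: "10", 2: "110", 3: "111"}          # one valid Huffman code for p

bits = "".join(code[x] for x in vector)
print(p)                                 # {1: 0.25, 2: 0.125, 4: 0.5, 3: 0.125}
print(bits, len(bits) / len(vector))     # 14 bits in total -> 1.75 bits/symbol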

3.4 Huffman Code

- The Huffman code can also be defined as a D-ary code.
- A D-ary Huffman code can be constructed similarly to the binary construction:
Step 1: Merge the D smallest symbol probabilities;
Step 2: Assign the corresponding symbols (or merged groups) the code symbols 0, 1, ..., D−1, then go back to Step 1.
Repeat the above process until D probabilities are merged into a probability of 1.

3.4 Huffman Code

Example 3.5 Consider a random variable X taking values in {1, 2, 3, 4, 5, 6} with probabilities P(X=1) = .25, P(X=2) = .25, P(X=3) = .2, P(X=4) = .1, P(X=5) = .1, P(X=6) = .1. Construct a ternary ({0, 1, 2}) Huffman code.

Codeword   X       P(X)
0          1       .25    .25    .25
1          2       .25    .25    .25
20         3       .2     .2     .5
21         4       .1     .1
220        5       .1     .2
221        6       .1
222        Dummy   0

Note: a dummy symbol of probability 0 is created so that 3 probabilities can be merged at every step and the final merge yields a probability of 1.
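A D-ary version of the earlier Huffman sketch, padding with dummy zero-probability symbols until the symbol count is congruent to 1 modulo D−1, so that every merge can take exactly D items (this is exactly why the dummy appears in Example 3.5). A minimal Python sketch, not from the slides; ties may be broken differently from the table above, but the expected length is the same.

import heapq
import itertools

def huffman_code_dary(probs, D=3):
    """probs: dict symbol -> probability. Returns dict symbol -> codeword over {0,...,D-1}."""
    counter = itertools.count()
    items = [(p, next(counter), [x]) for x, p in probs.items()]
    # Pad with dummy symbols of probability 0 until len(items) % (D-1) == 1 (mod D-1).
    while len(items) % (D - 1) != 1 % (D - 1):
        items.append((0.0, next(counter), []))
    heapq.heapify(items)
    code = {x: "" for x in probs}
    while len(items) > 1:
        merged_p, merged_group = 0.0, []
        for digit in range(D):                   # take the D smallest probabilities
            p, _, group = heapq.heappop(items)
            merged_p += p
            merged_group += group
            for x in group:                      # prepend the branch digit
                code[x] = str(digit) + code[x]
        heapq.heappush(items, (merged_p, next(counter), merged_group))
    return code

p = {1: 0.25, 2: 0.25, 3: 0.2, 4: 0.1, 5: 0.1, 6: 0.1}   # Example 3.5
print(huffman_code_dary(p, D=3))   # lengths give L = 1.7 ternary symbols per source symbol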

3.4 Huffman Code

Properties of an optimal D-ary source code (Huffman code):
(1) If p_j > p_k, then l_j ≤ l_k;
(2) The D longest codewords have the same length;
(3) The D longest codewords differ only in the last symbol and correspond to the D least likely source symbols.

Theorem 3.5 (Optimal Source Code) A source code C* is optimal if, for any other uniquely decodable source code C', we have L(C*) ≤ L(C').

Note: the Huffman code is optimal.

References:
[1] T. Cover and J. Thomas, Elements of Information Theory.
[2] M. Bossert, scriptum for the lectures Applied Information Theory.