Ch. 2 Math Preliminaries for Lossless Compression. Section 2.4 Coding

Some General Considerations

Definition: A code maps each symbol into a codeword.
Notation: a_i → φ(a_i)

Ex. 1:
  a1 → 0
  a2 → 1
  a3 → 00
  a4 → 11
For Ex. 1: φ(a3) = 00

Ex. 2:
  a1 → 0
  a2 → 10
  a3 → 110
  a4 → 111

This code has a tree structure (branches labeled 0/1, symbols at the leaves):

        ●
      0/ \1
     a1   ●
        0/ \1
       a2   ●
          0/ \1
         a3   a4

What characteristics must a code φ have?

Unambiguous (UA): For a_i ≠ a_j, φ(a_i) ≠ φ(a_j)

The codes in Ex. 1 and Ex. 2 are each UA. Is UA enough?? No! Consider Ex. 1 coding two different source sequences:

  a1 a2 a1 a1 a2 a2
  a1 a2 a3 a4

They each get coded to the same bit stream: 010011

Can't uniquely decode this bit sequence!! So UA guarantees that we can decode each symbol by itself, but not necessarily a stream of coded symbols!!

Define the mapping of sequences under code φ:

  Φ(S_i) = Φ(a_{i1} a_{i2} a_{i3} a_{i4} … a_{iN}) = φ(a_{i1}) φ(a_{i2}) φ(a_{i3}) φ(a_{i4}) … φ(a_{iN})

i.e., the concatenation of the codewords. We don't want two sequences of symbols to map to the same bit stream. (Diagram: Φ maps the source sequence space into the code sequence space.)

Leads to the need for Uniquely Decodable (UD): Let S_i and S_j be two sequences from the same source (not necessarily of the same length). Then code φ is UD if the only way that Φ(S_i) = Φ(S_j) is for S_i = S_j.
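As a quick illustration, here is a minimal Python sketch (the dictionary and function names are mine, not the lecture's) that implements Φ as codeword concatenation and reproduces the collision from the previous slide: two different source sequences encode to the same bit stream under Ex. 1's code, so that code is not UD.

code_ex1 = {"a1": "0", "a2": "1", "a3": "00", "a4": "11"}

def Phi(seq, code):
    """Concatenate the codewords of the symbols in seq."""
    return "".join(code[s] for s in seq)

S1 = ["a1", "a2", "a3", "a4"]
S2 = ["a1", "a2", "a1", "a1", "a2", "a2"]
print(Phi(S1, code_ex1))  # 010011
print(Phi(S2, code_ex1))  # 010011 -- same bit stream from different sequences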

Does UD ⇒ UA??? YES!
(Venn diagram: UD codes are a subset of UA codes.)

Then UD is enough??? YES! But in practice it is helpful to restrict to a subset of UD codes called prefix codes.

Prefix Code: A UD code in which no codeword may be the prefix of another codeword.

Ex. 2 above is a prefix code:
  a1 → 0
  a2 → 10
  a3 → 110
  a4 → 111

This code is not prefix:
Ex. 3:
  a1 → 0
  a2 → 01
  a3 → 011
  a4 → 0111

Does Prefix ⇒ UD??? YES!
(Venn diagram: Prefix codes ⊂ UD codes ⊂ UA codes.)

Do we lose anything by restricting to prefix codes? No, as we'll see later!
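A direct way to test the prefix condition is to check every ordered pair of codewords. The following Python sketch (the function name is illustrative, not from the lecture) does exactly that for Ex. 2 and Ex. 3:

def is_prefix_code(codewords):
    """True iff no codeword is a prefix of a different codeword."""
    for i, c in enumerate(codewords):
        for j, d in enumerate(codewords):
            if i != j and d.startswith(c):
                return False
    return True

print(is_prefix_code(["0", "10", "110", "111"]))   # Ex. 2 -> True
print(is_prefix_code(["0", "01", "011", "0111"]))  # Ex. 3 -> False ("0" is a prefix of "01")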

How do we compare various UD codes??? (i.e., What is our measure of performance?)

Average Code Length: Info theory says to use the average code length per symbol. For a source with symbols a1, a2, …, aN and a code φ, the average code length is defined by

  l̄(φ) = Σ_{i=1}^{N} P(a_i) · n{φ(a_i)}

where P(a_i) is the probability of a_i and n{φ(a_i)} is the # of bits in the codeword for a_i.

Optimum Code: The UD code with the smallest average code length.

Example: For P(a1) = 1/2, P(a2) = 1/4, P(a3) = P(a4) = 1/8, this source has an entropy of 1.75 bits.

Here are three possible codes and their average lengths:

  Symbol        UA Non-UD Code (Ex. 1)   Prefix Code (Ex. 2)   UD Non-Prefix (Ex. 3)   Info of symbol: -log2[P(a_i)]
  a1            0                        0                     0                       1
  a2            1                        10                    01                      2
  a3            00                       110                   011                     3
  a4            11                       111                   0111                    3
  Avg. Length   1.25 bits                1.75 bits             1.875 bits              H(S) = 1.75 bits

Ex. 1's length is better than H(S)!! BUT it is not usable because it is non-UD. Ex. 2's length equals H(S). Ex. 3's length is larger than H(S).

The prefix code gives the smallest usable code!!
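The table's numbers are easy to verify. Here is a short Python check (variable names are mine) of the entropy and the three average lengths:

from math import log2

P = {"a1": 1/2, "a2": 1/4, "a3": 1/8, "a4": 1/8}
codes = {
    "Ex. 1 (UA, non-UD)":     {"a1": "0", "a2": "1",  "a3": "00",  "a4": "11"},
    "Ex. 2 (prefix)":         {"a1": "0", "a2": "10", "a3": "110", "a4": "111"},
    "Ex. 3 (UD, non-prefix)": {"a1": "0", "a2": "01", "a3": "011", "a4": "0111"},
}

# Entropy H(S) = -sum_i P(a_i) log2 P(a_i)
print(-sum(p * log2(p) for p in P.values()))          # 1.75 bits
for name, code in codes.items():
    # Average length = sum_i P(a_i) * (# bits in codeword for a_i)
    print(name, sum(P[s] * len(code[s]) for s in P))  # 1.25, 1.75, 1.875 bits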

Info Theory Says: an optimum code can always be chosen to be a prefix code!!
(Venn diagram: Prefix codes ⊂ UD codes ⊂ UA codes.)

The proof of this uses the Kraft-McMillan Inequality, which we'll discuss next.

How do we find the optimum prefix code? (Note: not just any prefix code will be optimum!) We'll discuss this later.

2.4.3 Kraft-McMillan Inequality

This result tells us that an optimal code can always be chosen to be a prefix code!!! The theorem has 2 parts.

Theorem Part #1: Let C be a code having N codewords with codeword lengths l1, l2, l3, …, lN. If C is uniquely decodable, then

  K(C) ≜ Σ_{i=1}^{N} 2^{-l_i} ≤ 1

Proof: Here is the main idea used in the proof. If K(C) > 1, then [K(C)]^n grows exponentially w.r.t. n. So if we can show that [K(C)]^n grows, say, no more than linearly, we have our proof. Thus we need to show that

  [K(C)]^n ≤ αn + β    for some constants α, β

For an arbitrary integer n:

  [K(C)]^n = [Σ_{i=1}^{N} 2^{-l_i}]^n = Σ_{i1=1}^{N} Σ_{i2=1}^{N} … Σ_{in=1}^{N} 2^{-(l_{i1} + l_{i2} + … + l_{in})}    (★)

Note the use of n different dummy variables! Note that the exponent is nothing more than the total length of a sequence of n selected codewords of code C. Let this be L(i1, i2, …, in), and we can rewrite (★) as

  [K(C)]^n = Σ_{i1=1}^{N} … Σ_{in=1}^{N} 2^{-L(i1, i2, …, in)}

The smallest L(i1, i2, …, in) can be is n (when each codeword in the sequence is 1 bit long). The longest L(i1, i2, …, in) can be is nl, where l is the longest codeword length in C. So then:

  [K(C)]^n = A_n 2^{-n} + A_{n+1} 2^{-(n+1)} + … + A_{nl} 2^{-nl} = Σ_{k=n}^{nl} A_k 2^{-k}

where A_k = # of times that L(i1, i2, …, in) = k.
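To make the counting concrete, here is a brute-force Python check I added (small n, illustrative names) that enumerates all length-n codeword sequences, tallies A_k by total length, and confirms the identity [K(C)]^n = Σ_k A_k 2^{-k}:

from itertools import product
from collections import Counter

lengths = [1, 2, 3, 3]   # Ex. 2's codeword lengths
n = 4
# A_k = number of n-codeword sequences whose total length is k
A = Counter(sum(combo) for combo in product(lengths, repeat=n))
lhs = sum(2 ** -l for l in lengths) ** n
rhs = sum(a * 2 ** -k for k, a in A.items())
print(lhs, rhs)  # both 1.0 for these lengths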

Remember that we are trying to establish this bound: [K(C)]^n ≤ αn + β. We don't need the A_k values exactly; we just need a good upper bound on them!

First: the # of k-bit binary sequences is 2^k. Second: if our code is uniquely decodable (the "If" part of the theorem!), then each of these can represent one and only one sequence of codewords whose total length is k bits. (There may be some of the 2^k that are not valid codeword sequences.) Thus

  A_k ≤ 2^k

We can now use this bound in (★) to get a bound on [K(C)]^n:

  [K(C)]^n = Σ_{k=n}^{nl} A_k 2^{-k} ≤ Σ_{k=n}^{nl} 2^k 2^{-k} = nl − n + 1

Thus [K(C)]^n grows slower than exponentially. Hence K(C) ≤ 1. <End of Proof>
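In practice, Part #1 is most often used in its contrapositive form: if the lengths violate the inequality, the code cannot be UD. A one-line Python check suffices (the helper name is mine):

def kraft_sum(lengths):
    """K(C) = sum_i 2^(-l_i) over the codeword lengths."""
    return sum(2 ** -l for l in lengths)

print(kraft_sum([1, 1, 2, 2]))  # Ex. 1: 1.5 > 1, so it cannot be UD
print(kraft_sum([1, 2, 3, 3]))  # Ex. 2 / Ex. 3: 1.0 <= 1, consistent with UD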

Part #1 says: if a code with lengths {l1, l2, …, lN} is uniquely decodable, then the lengths satisfy the inequality.
Part #2 says: given lengths {l1, l2, …, lN} that satisfy the inequality, we can always find a prefix code with these lengths.

Theorem Part #2: Given integers {l1, l2, …, lN} such that Σ_{i=1}^{N} 2^{-l_i} ≤ 1, we can always find a prefix code with lengths {l1, l2, …, lN}.

Proof: This is a "proof by construction": we will show how to construct the desired prefix code. WLOG, assume that l1 ≤ l2 ≤ … ≤ lN. Define the numbers w1, w2, …, wN using

  w_1 = 0
  w_j = Σ_{i=1}^{j-1} 2^{l_j − l_i},   j > 1

Think of these in terms of a binary representation (see the next slide for an example).

Example of creating the w_j: for l1 = 1, l2 = 3, l3 = 3, l4 = 5, l5 = 5 we have Σ 2^{-l_i} = 0.8125 ≤ 1, and

  w_1 = 0
  w_2 = 2^{l2−l1} = 2^2 = 4 = (100)_2
  w_3 = 2^{l3−l1} + 2^{l3−l2} = 2^2 + 2^0 = 5 = (101)_2
  w_4 = 2^{l4−l1} + 2^{l4−l2} + 2^{l4−l3} = 2^4 + 2^2 + 2^2 = 24 = (11000)_2
  w_5 = 2^{l5−l1} + 2^{l5−l2} + 2^{l5−l3} + 2^{l5−l4} = 2^4 + 2^2 + 2^2 + 2^0 = 25 = (11001)_2

For j > 1, the binary representation of w_j uses ⌈log2 w_j⌉ bits. It is easy to show (see textbook) that

  # bits in w_j ≤ l_j, i.e., ⌈log2 w_j⌉ ≤ l_j, for all j

This is where we use that Σ_{i=1}^{N} 2^{-l_i} ≤ 1.

Now use the binary representations of the w_j to construct the prefix codewords having lengths {l1, l2, …, lN}:

  If ⌈log2 w_j⌉ = l_j, then set the j-th codeword = binary w_j.
  If ⌈log2 w_j⌉ < l_j, then set the j-th codeword = binary w_j padded with enough 0s to get l_j total bits (i.e., the l_j-bit binary representation of w_j).

So now we've constructed a code with the desired lengths. Is it a prefix code??? Show that it is by using contradiction: assume that it is NOT a prefix code, and show that this leads to something that contradicts a known condition.
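Before the proof, here is a compact Python sketch of this construction (the function name is mine); it reproduces the w_j values from the example two slides back and emits the corresponding prefix codewords:

def kraft_prefix_code(lengths):
    """Build prefix codewords with the given lengths (requires K(C) <= 1)."""
    assert sum(2 ** -l for l in lengths) <= 1, "lengths violate the K-M inequality"
    ls = sorted(lengths)                              # WLOG l1 <= l2 <= ... <= lN
    codewords = []
    for j, lj in enumerate(ls):
        w = sum(2 ** (lj - ls[i]) for i in range(j))  # w_1 = 0 (empty sum)
        codewords.append(format(w, "b").zfill(lj))    # l_j-bit binary rep of w_j
    return codewords

print(kraft_prefix_code([1, 3, 3, 5, 5]))
# ['0', '100', '101', '11000', '11001']  <- w = 0, 4, 5, 24, 25, as in the example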

Suppose that the constructed code is not prefix; thus, for some j < k, the codeword C_j is a prefix of codeword C_k. Then the l_j MSBs of w_k equal w_j. "Right-shift & chop":

  ⌊ w_k / 2^{l_k − l_j} ⌋ = w_j    (★★)

But by design:

  w_k = Σ_{i=1}^{k-1} 2^{l_k − l_i}

(There is always a "But" in a proof by contradiction!) So see if (★★) contradicts this required condition: put this w_k into (★★) and show that something goes wrong.

Substituting the design equation for w_k:

  w_k / 2^{l_k − l_j} = Σ_{i=1}^{k-1} 2^{l_j − l_i}
                      = Σ_{i=1}^{j} 2^{l_j − l_i} + Σ_{i=j+1}^{k-1} 2^{l_j − l_i}
                      = w_j + 2^{l_j − l_j} + Σ_{i=j+1}^{k-1} 2^{l_j − l_i}
                      = w_j + 1 + (a non-negative term)

(w_j is an integer by definition, so the value must fall at or above the next integer, w_j + 1.) Thus

  ⌊ w_k / 2^{l_k − l_j} ⌋ ≥ w_j + 1 > w_j

which contradicts (★★). So the code is prefix! <End of Proof>

Meaning of the Kraft-McMillan Theorem

Question: So what do these two parts of the theorem tell us???

Answer: We are looking for the optimal (shortest average length) UD code. Once we find it, we know its codeword lengths satisfy the K-M inequality; Part #1 of the theorem tells us that!!! Once we have such lengths (that satisfy the K-M inequality), we can always construct a prefix code having those optimal lengths; this is guaranteed by Part #2 of the theorem. This gives us a prefix code that is optimal!!!

So every time we find the optimal code, if it isn't already prefix we can replace it with a prefix code that is just as optimal! We can focus on finding optimal prefix codes w/o worrying that we could find a better code that is not prefix!