Ch. 2 Math Preliminaries for Lossless Compression. Section 2.4 Coding

Some General Considerations

Definition: A code maps each symbol into a codeword.
Notation: a_i → φ(a_i)

Ex. 1:
  a1 → 0
  a2 → 1
  a3 → 00
  a4 → 11
For Ex. 1: φ(a3) = 00

Ex. 2:
  a1 → 0
  a2 → 10
  a3 → 110
  a4 → 111

This code has a tree structure (branches labeled 0/1, symbols at the leaves):

        ●
      0/ \1
     a1   ●
        0/ \1
       a2   ●
          0/ \1
         a3   a4

What characteristics must a code φ have?

Unambiguous (UA): For a_i ≠ a_j, φ(a_i) ≠ φ(a_j)

The codes in Ex. 1 and Ex. 2 are each UA. Is UA enough?? No! Consider Ex. 1 coding two different source sequences:

  a1 a2 a1 a1 a2 a2
  a1 a2 a3 a4

They each get coded to the same bit stream: 010011

Can't uniquely decode this bit sequence!! So UA guarantees that we can decode each symbol by itself, but not necessarily a stream of coded symbols!!

Define the mapping of sequences under code φ:

  Φ(S_i) = Φ(a_{i1} a_{i2} a_{i3} a_{i4} … a_{iN}) = φ(a_{i1}) φ(a_{i2}) φ(a_{i3}) φ(a_{i4}) … φ(a_{iN})

i.e., the concatenation of the codewords. We don't want two sequences of symbols to map to the same bit stream. (Diagram: Φ maps the source sequence space into the code sequence space.)

Leads to the need for Uniquely Decodable (UD): Let S_i and S_j be two sequences from the same source (not necessarily of the same length). Then code φ is UD if the only way that Φ(S_i) = Φ(S_j) is for S_i = S_j.
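As a quick illustration, here is a minimal Python sketch (the dictionary and function names are mine, not the lecture's) that implements Φ as codeword concatenation and reproduces the collision from the previous slide: two different source sequences encode to the same bit stream under Ex. 1's code, so that code is not UD.

code_ex1 = {"a1": "0", "a2": "1", "a3": "00", "a4": "11"}

def Phi(seq, code):
    """Concatenate the codewords of the symbols in seq."""
    return "".join(code[s] for s in seq)

S1 = ["a1", "a2", "a3", "a4"]
S2 = ["a1", "a2", "a1", "a1", "a2", "a2"]
print(Phi(S1, code_ex1))  # 010011
print(Phi(S2, code_ex1))  # 010011 -- same bit stream from different sequences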

Does UD ⇒ UA??? YES!
(Venn diagram: UD codes are a subset of UA codes.)

Then UD is enough??? YES! But in practice it is helpful to restrict to a subset of UD codes called prefix codes.

Prefix Code: A UD code in which no codeword may be the prefix of another codeword.

Ex. 2 above is a prefix code:
  a1 → 0
  a2 → 10
  a3 → 110
  a4 → 111

This code is not prefix:
Ex. 3:
  a1 → 0
  a2 → 01
  a3 → 011
  a4 → 0111

Does Prefix ⇒ UD??? YES!
(Venn diagram: Prefix codes ⊂ UD codes ⊂ UA codes.)

Do we lose anything by restricting to prefix codes? No, as we'll see later!
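A direct way to test the prefix condition is to check every ordered pair of codewords. The following Python sketch (the function name is illustrative, not from the lecture) does exactly that for Ex. 2 and Ex. 3:

def is_prefix_code(codewords):
    """True iff no codeword is a prefix of a different codeword."""
    for i, c in enumerate(codewords):
        for j, d in enumerate(codewords):
            if i != j and d.startswith(c):
                return False
    return True

print(is_prefix_code(["0", "10", "110", "111"]))   # Ex. 2 -> True
print(is_prefix_code(["0", "01", "011", "0111"]))  # Ex. 3 -> False ("0" is a prefix of "01")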

How do we compare various UD codes??? (i.e., What is our measure of performance?)

Average Code Length: Info theory says to use the average code length per symbol. For a source with symbols a1, a2, …, aN and a code φ, the average code length is defined by

  l̄(φ) = Σ_{i=1}^{N} P(a_i) · n{φ(a_i)}

where P(a_i) is the probability of a_i and n{φ(a_i)} is the # of bits in the codeword for a_i.

Optimum Code: The UD code with the smallest average code length.

Example: For P(a1) = 1/2, P(a2) = 1/4, P(a3) = P(a4) = 1/8, this source has an entropy of 1.75 bits.

Here are three possible codes and their average lengths:

  Symbol        UA Non-UD Code (Ex. 1)   Prefix Code (Ex. 2)   UD Non-Prefix (Ex. 3)   Info of symbol: -log2[P(a_i)]
  a1            0                        0                     0                       1
  a2            1                        10                    01                      2
  a3            00                       110                   011                     3
  a4            11                       111                   0111                    3
  Avg. Length   1.25 bits                1.75 bits             1.875 bits              H(S) = 1.75 bits

Ex. 1's length is better than H(S)!! BUT it is not usable because it is non-UD. Ex. 2's length equals H(S). Ex. 3's length is larger than H(S).

The prefix code gives the smallest usable code!!
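The table's numbers are easy to verify. Here is a short Python check (variable names are mine) of the entropy and the three average lengths:

from math import log2

P = {"a1": 1/2, "a2": 1/4, "a3": 1/8, "a4": 1/8}
codes = {
    "Ex. 1 (UA, non-UD)":     {"a1": "0", "a2": "1",  "a3": "00",  "a4": "11"},
    "Ex. 2 (prefix)":         {"a1": "0", "a2": "10", "a3": "110", "a4": "111"},
    "Ex. 3 (UD, non-prefix)": {"a1": "0", "a2": "01", "a3": "011", "a4": "0111"},
}

# Entropy H(S) = -sum_i P(a_i) log2 P(a_i)
print(-sum(p * log2(p) for p in P.values()))          # 1.75 bits
for name, code in codes.items():
    # Average length = sum_i P(a_i) * (# bits in codeword for a_i)
    print(name, sum(P[s] * len(code[s]) for s in P))  # 1.25, 1.75, 1.875 bits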

Info Theory Says: an optimum code can always be chosen to be a prefix code!!
(Venn diagram: Prefix codes ⊂ UD codes ⊂ UA codes.)

The proof of this uses the Kraft-McMillan Inequality, which we'll discuss next.

How do we find the optimum prefix code? (Note: not just any prefix code will be optimum!) We'll discuss this later.

2.4.3 Kraft-McMillan Inequality

This result tells us that an optimal code can always be chosen to be a prefix code!!! The theorem has 2 parts.

Theorem Part #1: Let C be a code having N codewords with codeword lengths l1, l2, l3, …, lN. If C is uniquely decodable, then

  K(C) ≜ Σ_{i=1}^{N} 2^{-l_i} ≤ 1

Proof: Here is the main idea used in the proof. If K(C) > 1, then [K(C)]^n grows exponentially w.r.t. n. So if we can show that [K(C)]^n grows, say, no more than linearly, we have our proof. Thus we need to show that

  [K(C)]^n ≤ αn + β    for some constants α, β

For an arbitrary integer n:

  [K(C)]^n = [Σ_{i=1}^{N} 2^{-l_i}]^n = Σ_{i1=1}^{N} Σ_{i2=1}^{N} … Σ_{in=1}^{N} 2^{-(l_{i1} + l_{i2} + … + l_{in})}    (★)

Note the use of n different dummy variables! Note that the exponent is nothing more than the total length of a sequence of n selected codewords of code C. Let this be L(i1, i2, …, in), and we can rewrite (★) as

  [K(C)]^n = Σ_{i1=1}^{N} … Σ_{in=1}^{N} 2^{-L(i1, i2, …, in)}

The smallest L(i1, i2, …, in) can be is n (when each codeword in the sequence is 1 bit long). The longest L(i1, i2, …, in) can be is nl, where l is the longest codeword length in C. So then:

  [K(C)]^n = A_n 2^{-n} + A_{n+1} 2^{-(n+1)} + … + A_{nl} 2^{-nl} = Σ_{k=n}^{nl} A_k 2^{-k}

where A_k = # of times that L(i1, i2, …, in) = k.
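To make the counting concrete, here is a brute-force Python check I added (small n, illustrative names) that enumerates all length-n codeword sequences, tallies A_k by total length, and confirms the identity [K(C)]^n = Σ_k A_k 2^{-k}:

from itertools import product
from collections import Counter

lengths = [1, 2, 3, 3]   # Ex. 2's codeword lengths
n = 4
# A_k = number of n-codeword sequences whose total length is k
A = Counter(sum(combo) for combo in product(lengths, repeat=n))
lhs = sum(2 ** -l for l in lengths) ** n
rhs = sum(a * 2 ** -k for k, a in A.items())
print(lhs, rhs)  # both 1.0 for these lengths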

Remember that we are trying to establish this bound: [K(C)]^n ≤ αn + β. We don't need the A_k values exactly; we just need a good upper bound on them!

First: the # of k-bit binary sequences is 2^k. Second: if our code is uniquely decodable (the "If" part of the theorem!), then each of these can represent one and only one sequence of codewords whose total length is k bits. (There may be some of the 2^k that are not valid codeword sequences.) Thus

  A_k ≤ 2^k

We can now use this bound in (★) to get a bound on [K(C)]^n:

  [K(C)]^n = Σ_{k=n}^{nl} A_k 2^{-k} ≤ Σ_{k=n}^{nl} 2^k 2^{-k} = nl − n + 1

Thus [K(C)]^n grows slower than exponentially. Hence K(C) ≤ 1. <End of Proof>
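In practice, Part #1 is most often used in its contrapositive form: if the lengths violate the inequality, the code cannot be UD. A one-line Python check suffices (the helper name is mine):

def kraft_sum(lengths):
    """K(C) = sum_i 2^(-l_i) over the codeword lengths."""
    return sum(2 ** -l for l in lengths)

print(kraft_sum([1, 1, 2, 2]))  # Ex. 1: 1.5 > 1, so it cannot be UD
print(kraft_sum([1, 2, 3, 3]))  # Ex. 2 / Ex. 3: 1.0 <= 1, consistent with UD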

Part #1 says: if a code with lengths {l1, l2, …, lN} is uniquely decodable, then the lengths satisfy the inequality.
Part #2 says: given lengths {l1, l2, …, lN} that satisfy the inequality, we can always find a prefix code with these lengths.

Theorem Part #2: Given integers {l1, l2, …, lN} such that Σ_{i=1}^{N} 2^{-l_i} ≤ 1, we can always find a prefix code with lengths {l1, l2, …, lN}.

Proof: This is a "proof by construction": we will show how to construct the desired prefix code. WLOG, assume that l1 ≤ l2 ≤ … ≤ lN. Define the numbers w1, w2, …, wN using

  w_1 = 0
  w_j = Σ_{i=1}^{j-1} 2^{l_j − l_i},   j > 1

Think of these in terms of a binary representation (see the next slide for an example).

Example of creating the w_j: for l1 = 1, l2 = 3, l3 = 3, l4 = 5, l5 = 5 we have Σ 2^{-l_i} = 0.8125 ≤ 1, and

  w_1 = 0
  w_2 = 2^{l2−l1} = 2^2 = 4 = (100)_2
  w_3 = 2^{l3−l1} + 2^{l3−l2} = 2^2 + 2^0 = 5 = (101)_2
  w_4 = 2^{l4−l1} + 2^{l4−l2} + 2^{l4−l3} = 2^4 + 2^2 + 2^2 = 24 = (11000)_2
  w_5 = 2^{l5−l1} + 2^{l5−l2} + 2^{l5−l3} + 2^{l5−l4} = 2^4 + 2^2 + 2^2 + 2^0 = 25 = (11001)_2

For j > 1, the binary representation of w_j uses ⌈log2 w_j⌉ bits. It is easy to show (see textbook) that

  # bits in w_j ≤ l_j, i.e., ⌈log2 w_j⌉ ≤ l_j, for all j

This is where we use that Σ_{i=1}^{N} 2^{-l_i} ≤ 1.

Now use the binary representations of the w_j to construct the prefix codewords having lengths {l1, l2, …, lN}:

  If ⌈log2 w_j⌉ = l_j, then set the j-th codeword = binary w_j.
  If ⌈log2 w_j⌉ < l_j, then set the j-th codeword = binary w_j padded with enough 0s to get l_j total bits (i.e., the l_j-bit binary representation of w_j).

So now we've constructed a code with the desired lengths. Is it a prefix code??? Show that it is by using contradiction: assume that it is NOT a prefix code, and show that this leads to something that contradicts a known condition.
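Before the proof, here is a compact Python sketch of this construction (the function name is mine); it reproduces the w_j values from the example two slides back and emits the corresponding prefix codewords:

def kraft_prefix_code(lengths):
    """Build prefix codewords with the given lengths (requires K(C) <= 1)."""
    assert sum(2 ** -l for l in lengths) <= 1, "lengths violate the K-M inequality"
    ls = sorted(lengths)                              # WLOG l1 <= l2 <= ... <= lN
    codewords = []
    for j, lj in enumerate(ls):
        w = sum(2 ** (lj - ls[i]) for i in range(j))  # w_1 = 0 (empty sum)
        codewords.append(format(w, "b").zfill(lj))    # l_j-bit binary rep of w_j
    return codewords

print(kraft_prefix_code([1, 3, 3, 5, 5]))
# ['0', '100', '101', '11000', '11001']  <- w = 0, 4, 5, 24, 25, as in the example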

Suppose that the constructed code is not prefix; thus, for some j < k, the codeword C_j is a prefix of codeword C_k. Then the l_j MSBs of w_k equal w_j. "Right-shift & chop":

  ⌊ w_k / 2^{l_k − l_j} ⌋ = w_j    (★★)

But by design:

  w_k = Σ_{i=1}^{k-1} 2^{l_k − l_i}

(There is always a "But" in a proof by contradiction!) So see if (★★) contradicts this required condition: put this w_k into (★★) and show that something goes wrong.

Substituting the design equation for w_k:

  w_k / 2^{l_k − l_j} = Σ_{i=1}^{k-1} 2^{l_j − l_i}
                      = Σ_{i=1}^{j} 2^{l_j − l_i} + Σ_{i=j+1}^{k-1} 2^{l_j − l_i}
                      = w_j + 2^{l_j − l_j} + Σ_{i=j+1}^{k-1} 2^{l_j − l_i}
                      = w_j + 1 + (a non-negative term)

(w_j is an integer by definition, so the value must fall at or above the next integer, w_j + 1.) Thus

  ⌊ w_k / 2^{l_k − l_j} ⌋ ≥ w_j + 1 > w_j

which contradicts (★★). So the code is prefix! <End of Proof>

Meaning of the Kraft-McMillan Theorem

Question: So what do these two parts of the theorem tell us???

Answer: We are looking for the optimal (shortest average length) UD code. Once we find it, we know its codeword lengths satisfy the K-M inequality; Part #1 of the theorem tells us that!!! Once we have such lengths (that satisfy the K-M inequality), we can always construct a prefix code having those optimal lengths; this is guaranteed by Part #2 of the theorem. This gives us a prefix code that is optimal!!!

So every time we find the optimal code, if it isn't already prefix we can replace it with a prefix code that is just as optimal! We can focus on finding optimal prefix codes w/o worrying that we could find a better code that is not prefix!