Data Compression. Limit of Information Compression


Radu Trîmbiţaş
October, 202

Outline

1 Examples of codes
2 Kraft Inequality
  2.1 Kraft Inequality
  2.2 Kraft inequality - infinite case
3 Optimal codes
  3.1 Construction of optimal codes
  3.2 Bounds on the optimal code length
4 Kraft Inequality for Uniquely Decodable Codes
5 Huffman Codes
  5.1 Huffman codes
  5.2 Optimality of Huffman codes
6 Shannon-Fano-Elias Coding
  6.1 The code
  6.2 Competitive optimality of the Shannon code

1 Examples of codes

Let $X : S \to \mathcal{X}$ be a random variable (RV) with probability mass function (pmf) $p(x)$, written $X \sim p(x)$, and let $\mathcal{D}^*$ be the set of finite-length strings of symbols from a $D$-ary alphabet $\mathcal{D}$.

Definition 1. A source code for $X$ is a mapping $C : \mathcal{X} \to \mathcal{D}^*$; $C(x)$ is the codeword corresponding to $x$ and $l(x)$ is the length of $C(x)$.

Example. $C(\text{red}) = 00$, $C(\text{blue}) = 11$ is a source code for $\mathcal{X} = \{\text{red}, \text{blue}\}$ with alphabet $\mathcal{D} = \{0, 1\}$.

Definition 2. The expected length $L(C)$ of a source code $C(x)$ for a RV $X$ with pmf $p(x)$ is
$$L(C) = \sum_{x \in \mathcal{X}} p(x)\, l(x), \tag{1}$$
where $l(x)$ is the length of the codeword associated with $x$.

Without loss of generality we can assume $\mathcal{D} = \{0, 1, \dots, D-1\}$.

Example 3. Let $X \sim \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1/2 & 1/4 & 1/8 & 1/8 \end{pmatrix}$, with the codewords $C(1) = 0$, $C(2) = 10$, $C(3) = 110$, $C(4) = 111$. The entropy of $X$ is $H(X) = 1.75$ bits and the expected length is $L(C) = E\,l(X) = 1.75$ bits. Any sequence of bits can be uniquely decoded into a sequence of symbols of $X$. For example, the bit string 0110111100110 is decoded as 134213.

Example 4. Let $X \sim \begin{pmatrix} 1 & 2 & 3 \\ 1/3 & 1/3 & 1/3 \end{pmatrix}$, with the codewords $C(1) = 0$, $C(2) = 10$, $C(3) = 11$. The code is uniquely decodable. Here $H(X) = \log 3 = 1.58$ bits, but $L(C) = 1.66$ bits $> H(X)$.

Example 5 (Morse code). A code for the English alphabet over four symbols: dot, dash, letter space, word space. Short sequences represent frequent letters (e.g. a single dot is E); long sequences represent infrequent letters (dash, dash, dot, dash is Q). The code is not optimal.

Definition 6. A code is nonsingular if every element of $\mathcal{X}$ maps into a different string in $\mathcal{D}^*$, i.e.
$$x \neq x' \;\Rightarrow\; C(x) \neq C(x'). \tag{2}$$
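The figures quoted in Examples 3 and 4 can be checked numerically. The sketch below is only an illustration: the probabilities and codewords are assumptions chosen to be consistent with the stated values ($H = L = 1.75$ bits, and $H = \log 3$, $L \approx 1.66$ bits), and the function names `expected_length` and `entropy_bits` are hypothetical helpers.

```python
from fractions import Fraction
from math import log2

def expected_length(pmf, code):
    """L(C) = sum_x p(x) * l(x), for pmf and code given as dicts keyed by symbol."""
    return sum(p * len(code[x]) for x, p in pmf.items())

def entropy_bits(pmf):
    """H(X) = -sum_x p(x) log2 p(x), in bits."""
    return -sum(float(p) * log2(float(p)) for p in pmf.values() if p > 0)

# Assumed data for Example 3 (consistent with H(X) = L(C) = 1.75 bits).
pmf3 = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}
code3 = {1: "0", 2: "10", 3: "110", 4: "111"}

# Assumed data for Example 4 (uniform distribution on three symbols).
pmf4 = {x: Fraction(1, 3) for x in (1, 2, 3)}
code4 = {1: "0", 2: "10", 3: "11"}

if __name__ == "__main__":
    print(float(expected_length(pmf3, code3)), entropy_bits(pmf3))
    print(float(expected_length(pmf4, code4)), entropy_bits(pmf4))
```

Exact fractions are used for the probabilities so that the expected length comes out as an exact rational; only the entropy is computed in floating point.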

Nonsingularity suffices for an unambiguous description of a single value of $X$. For sequences of values we can ensure decodability by adding a special symbol (a "comma") between any two codewords, but this is inefficient.

Definition 7. The extension $C^*$ of a code $C$ is the mapping $C^* : \mathcal{X}^n \to \mathcal{D}^*$ defined by
$$C^*(x_1 x_2 \cdots x_n) = C(x_1)\,C(x_2) \cdots C(x_n). \tag{3}$$

Example 8. If $C(x_1) = 00$ and $C(x_2) = 11$, then $C^*(x_1 x_2) = 0011$.

Definition 9. A code is uniquely decodable if its extension is nonsingular.

Any encoded string in a uniquely decodable code has only one possible source string producing it. However, one may have to look at the entire string to determine even the first symbol of the corresponding source string.

Definition 10. A code is called a prefix code or an instantaneous code if no codeword is a prefix of any other codeword.

An instantaneous code can be decoded without reference to future codewords: the symbol $x_i$ can be decoded as soon as we come to the end of the codeword corresponding to it. An instantaneous code is a self-punctuating code: we can look at the sequence of code symbols and add commas to separate codewords without looking at later symbols. Example: with the code of Example 3, the string 01011111010 is parsed as 0,10,111,110,10.

Figure 1 shows the relationship between these classes of codes.

Figure 1: Classes of codes

Examples 11. Write I for instantaneous and U for uniquely decodable.
- $C(E, F, G, H) = (0, 1, 00, 11)$: not I, not U
- $C(E, F) = (0, 101)$: I, U
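The prefix condition and the "decode as soon as a codeword ends" property can be illustrated with a short sketch. The helper names `is_prefix_code` and `decode_instantaneous` are hypothetical, and the code table is the one assumed for Example 3 (codewords 0, 10, 110, 111).

```python
def is_prefix_code(codewords):
    """True if no codeword is a prefix of another (duplicates also fail)."""
    words = sorted(codewords)
    # In lexicographic order, a word that is a prefix of some other word
    # is immediately followed by a word that starts with it.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

def decode_instantaneous(bits, code):
    """Decode a string of code symbols, emitting each source symbol
    as soon as the end of its codeword is reached."""
    inverse = {w: x for x, w in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:      # end of a codeword: no look-ahead needed
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing symbols do not form a codeword")
    return out

# Assumed code table (Example 3).
code3 = {1: "0", 2: "10", 3: "110", 4: "111"}
```

The decoder is only valid for prefix codes; for a uniquely decodable code that is not instantaneous, greedy matching of this kind can commit to a codeword too early.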

Table 1: Classes of codes

X   Singular   Nonsingular, but not     Uniquely decodable,       Instantaneous
               uniquely decodable       but not instantaneous
1   0          0                        10                        0
2   0          010                      00                        10
3   0          01                       11                        110
4   0          10                       110                       111

Further examples, in the same notation (I for instantaneous, U for uniquely decodable):
- $C(E, F) = (1, 101)$: not I, U
- $C(E, F, G, H) = (00, 01, 10, 11)$: I, U
- $C(E, F, G, H) = (0, 01, 011, 111)$: not I, U

2 Kraft Inequality

2.1 Kraft Inequality

Aim: to construct instantaneous codes of minimum expected length. We cannot assign short codewords to all source symbols and still be prefix-free. The set of codeword lengths possible for instantaneous codes is limited by the following inequality.

Theorem 12 (Kraft inequality). For any instantaneous code (prefix code) over an alphabet of size $D$, the codeword lengths $l_1, l_2, \dots, l_m$ must satisfy the inequality
$$\sum_i D^{-l_i} \le 1. \tag{4}$$
Conversely, given a set of codeword lengths that satisfy this inequality, there exists an instantaneous code with these word lengths.

Proof. Necessity. We construct a $D$-ary tree whose branches represent the symbols of the codeword. Each codeword is represented by a leaf of the tree, and the path from the root traces out the symbols of the codeword. See an example for $D = 2$ in Figure 2. The prefix condition means that no codeword is an ancestor of any other codeword, so each codeword eliminates its descendants as possible codewords. Let $l_{\max}$ be the length of the longest codeword and consider all the nodes at level $l_{\max}$; these can be codewords, descendants of codewords, or neither (unused). A codeword at level $l_i$ has $D^{l_{\max} - l_i}$ descendants at level $l_{\max}$. These descendant sets must be disjoint, and the total number of nodes in them is at most $D^{l_{\max}}$. Summing over all codewords,
$$\sum_i D^{l_{\max} - l_i} \le D^{l_{\max}} \;\Rightarrow\; \sum_i D^{-l_i} \le 1.$$

Sufficiency. Conversely, given any set of codeword lengths $l_1, l_2, \dots, l_m$ that satisfy the Kraft inequality, we can construct a tree like the one in Figure 2. Label the first node (lexicographically) of depth $l_1$ as codeword 1, and remove its descendants from the tree. Then label the first remaining node of depth $l_2$ as codeword 2, and so on. Proceeding in this way, we construct a prefix code with the specified $l_1, l_2, \dots, l_m$.

Figure 2: Code tree for the Kraft inequality ($D = 2$)

2.2 Kraft inequality - infinite case

Theorem 13 (Extended Kraft inequality). For any countably infinite set of codewords that form a prefix code, the codeword lengths satisfy the extended Kraft inequality,
$$\sum_{i=1}^{\infty} D^{-l_i} \le 1. \tag{5}$$
Conversely, given any $l_1, l_2, \dots$ satisfying the extended Kraft inequality, we can construct a prefix code with these codeword lengths.

Proof. Let $\mathcal{D} = \{0, 1, \dots, D-1\}$, let the $i$th codeword be $y_1 y_2 \cdots y_{l_i}$, and let
$$0.y_1 y_2 \cdots y_{l_i} = \sum_{j=1}^{l_i} y_j D^{-j}. \tag{6}$$
This codeword corresponds to the interval
$$\left[\,0.y_1 y_2 \cdots y_{l_i},\; 0.y_1 y_2 \cdots y_{l_i} + D^{-l_i}\right) \subseteq [0, 1],$$
the set of all real numbers whose $D$-ary expansion begins with $0.y_1 y_2 \cdots y_{l_i}$. These intervals are disjoint, due to the prefix condition, so the sum of their lengths is at most 1, that is
$$\sum_{i=1}^{\infty} D^{-l_i} \le 1.$$

Proof (continued). Conversely, if the lengths $l_1, l_2, \dots$ satisfy the Kraft inequality, we reorder the indexing such that $l_1 \le l_2 \le \cdots$. Then we assign the intervals in order from the low end of the unit interval. For example, if we wish to construct a binary code with $l_1 = 1$, $l_2 = 2$, ..., we assign the intervals $[0, \tfrac12)$, $[\tfrac12, \tfrac34)$, ... to the symbols, with the corresponding codewords 0, 10, ....

3 Optimal codes

3.1 Construction of optimal codes

We look for the prefix code with the minimum expected length. Minimize
$$L = \sum_{i=1}^{m} p_i l_i, \qquad m = |\mathcal{X}|, \tag{7}$$
over all integers $l_1, l_2, \dots, l_m$ satisfying
$$\sum_i D^{-l_i} \le 1. \tag{8}$$

We ignore the constraint that the $l_i$ be integers, assume equality in (8), and use the method of Lagrange multipliers. Find
$$\min J = \sum_i p_i l_i + \lambda \left( \sum_i D^{-l_i} \right). \tag{9}$$
Differentiating with respect to $l_i$, we obtain
$$\frac{\partial J}{\partial l_i} = p_i - \lambda D^{-l_i} \ln D.$$
Equating to 0,
$$D^{-l_i} = \frac{p_i}{\lambda \ln D}.$$
Substituting this in (8) we find $\lambda = 1/\ln D$ and $p_i = D^{-l_i}$, yielding the optimal code lengths
$$l_i^* = -\log_D p_i. \tag{10}$$
These noninteger codeword lengths yield the expected codeword length
$$L^* = \sum_i p_i l_i^* = -\sum_i p_i \log_D p_i = H_D(X). \tag{11}$$
Rather than demonstrate that $l_i^* = -\log_D p_i$ is a global minimum, we verify optimality directly in the proof of the following theorem.

Theorem 14. The expected length $L$ of any instantaneous $D$-ary code for a random variable $X$ is greater than or equal to the entropy $H_D(X)$; that is,
$$L \ge H_D(X), \tag{12}$$
with equality iff $D^{-l_i} = p_i$.

Proof.
$$L - H_D(X) = \sum_i p_i l_i - \sum_i p_i \log_D \frac{1}{p_i} = -\sum_i p_i \log_D D^{-l_i} + \sum_i p_i \log_D p_i.$$
Letting $r_i = D^{-l_i} / \sum_j D^{-l_j}$ and $c = \sum_i D^{-l_i}$, we obtain
$$L - H_D(X) = \sum_i p_i \log_D \frac{p_i}{r_i} - \log_D c = D(p \,\|\, r) + \log_D \frac{1}{c} \ge 0$$
by the nonnegativity of relative entropy and the Kraft inequality ($c \le 1$). Hence $L \ge H_D(X)$, with equality iff $p_i = D^{-l_i}$ (i.e. iff $-\log_D p_i \in \mathbb{N}$ for all $i$).

Definition 15. A probability distribution is called $D$-adic if each of the probabilities is equal to $D^{-n}$ for some $n$.
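The decomposition used in the proof, $L - H_D(X) = D(p\,\|\,r) + \log_D(1/c)$, can be checked numerically for any choice of lengths. The sketch below is a minimal illustration; the function name `length_gap_terms` and the test distribution are assumptions, not part of the lecture.

```python
from math import log

def length_gap_terms(p, lengths, D=2):
    """Return (L, H_D, D(p||r), log_D(1/c)) for a pmf p and codeword lengths.

    r_i = D^{-l_i} / c is the normalised distribution induced by the lengths,
    and c = sum_i D^{-l_i} is the Kraft sum.
    """
    c = sum(D ** -l for l in lengths)
    r = [D ** -l / c for l in lengths]
    L = sum(pi * li for pi, li in zip(p, lengths))
    H = -sum(pi * log(pi, D) for pi in p if pi > 0)
    kl = sum(pi * log(pi / ri, D) for pi, ri in zip(p, r) if pi > 0)
    return L, H, kl, log(1 / c, D)

if __name__ == "__main__":
    # Hypothetical non-dyadic distribution with lengths whose Kraft sum is 0.875.
    L, H, kl, slack = length_gap_terms([0.25, 0.25, 0.2, 0.15, 0.15], [2, 2, 3, 3, 3])
    print(L - H, kl + slack)  # the two numbers should agree up to rounding
```

Both terms on the right are nonnegative, so the gap vanishes only when the Kraft sum equals 1 and $r = p$, which is the D-adic case.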

Thus, we have equality in the theorem iff the distribution of $X$ is $D$-adic. The preceding proof also indicates a procedure for finding an optimal code:
1. Find the $D$-adic distribution that is closest (in the relative entropy sense) to the distribution of $X$. This distribution provides the set of codeword lengths.
2. Construct the code by choosing the first available node as in the proof of the Kraft inequality.
We then have an optimal code for $X$. However, this procedure is not easy, since the search for the closest $D$-adic distribution is not obvious.

3.2 Bounds on the optimal code length

Since $\log_D \frac{1}{p_i}$ may not be an integer, we round up:
$$l_i = \left\lceil \log_D \frac{1}{p_i} \right\rceil.$$
These lengths satisfy the Kraft inequality:
$$\sum_i D^{-\lceil \log_D (1/p_i) \rceil} \le \sum_i D^{-\log_D (1/p_i)} = \sum_i p_i = 1.$$
But
$$\log_D \frac{1}{p_i} \le l_i < \log_D \frac{1}{p_i} + 1.$$
Multiplying by $p_i$ and summing, we obtain
$$H_D(X) \le L < H_D(X) + 1. \tag{13}$$
An optimal code can do better than this code!

Theorem 16. Let $l_1^*, l_2^*, \dots, l_m^*$ be optimal codeword lengths for a source distribution $p$ and a $D$-ary alphabet, and let $L^*$ be the associated expected length of an optimal code ($L^* = \sum_i p_i l_i^*$). Then
$$H_D(X) \le L^* < H_D(X) + 1. \tag{14}$$

Proof. Let $l_i = \lceil \log_D \frac{1}{p_i} \rceil$. Then the $l_i$ satisfy the Kraft inequality and, from (13),
$$H_D(X) \le L = \sum_i p_i l_i < H_D(X) + 1.$$
Since $L^* \le L$ (the code is optimal) and $L^* \ge H_D(X)$ from Theorem 14, we have the conclusion.

Assume $D = 2$. The overhead is at most 1 bit per symbol. We can do better by encoding blocks of $n$ symbols; a block is a supersymbol from the alphabet $\mathcal{X}^n$.
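The rounding argument and the remark on block coding can be illustrated as follows. This is a sketch under stated assumptions: the helper names `shannon_lengths`, `entropy_bits` and `per_symbol_length` are hypothetical, the source is taken to be i.i.d. and binary-coded, and the lengths are found by exact comparison rather than by a floating-point logarithm (which can round $\log_2(1/p)$ just above an integer).

```python
from fractions import Fraction
from itertools import product
from math import log2, prod

def shannon_lengths(p, D=2):
    """l_i = ceil(log_D(1/p_i)), computed as the smallest l with D^{-l} <= p_i."""
    lengths = []
    for pi in p:
        l = 0
        while D ** l * pi < 1:
            l += 1
        lengths.append(l)
    return lengths

def entropy_bits(p):
    """H(X) in bits."""
    return -sum(float(pi) * log2(float(pi)) for pi in p if pi > 0)

def per_symbol_length(p, n):
    """(1/n) * E[l(X_1, ..., X_n)] when blocks of n i.i.d. symbols
    are coded with the rounded-up lengths."""
    block_probs = [prod(block) for block in product(p, repeat=n)]
    lengths = shannon_lengths(block_probs)
    return sum(float(q) * l for q, l in zip(block_probs, lengths)) / n

if __name__ == "__main__":
    p = [Fraction(9, 10), Fraction(1, 10)]   # hypothetical skewed binary source
    for n in (1, 2, 4, 8):
        print(n, per_symbol_length(p, n), entropy_bits(p))
```

For every block length $n$ the printed rate lies in $[H(X), H(X) + 1/n)$, although it need not decrease monotonically in $n$.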

Define $L_n$, the expected codeword length per input symbol:
$$L_n = \frac{1}{n} \sum p(x_1, \dots, x_n)\, l(x_1, \dots, x_n) = \frac{1}{n} E\, l(X_1, \dots, X_n). \tag{15}$$
Bounds:
$$H(X_1, \dots, X_n) \le E\, l(X_1, \dots, X_n) < H(X_1, \dots, X_n) + 1. \tag{16}$$
If $X_1, \dots, X_n$ are i.i.d., then $H(X_1, \dots, X_n) = \sum_i H(X_i) = nH(X)$, and therefore
$$H(X) \le L_n < H(X) + \frac{1}{n}.$$
Using large block lengths we can achieve an expected code length per symbol arbitrarily close to the entropy.

For a stochastic process with $(X_i)$ not necessarily i.i.d.:

Theorem 17. The minimum expected codeword length per symbol satisfies
$$\frac{H(X_1, \dots, X_n)}{n} \le L_n^* < \frac{H(X_1, \dots, X_n)}{n} + \frac{1}{n}. \tag{17}$$
Moreover, if $X_1, \dots, X_n$ is a stationary stochastic process,
$$L_n^* \to H(\mathcal{X}), \tag{18}$$
where $H(\mathcal{X})$ is the entropy rate of the process.

Proof. In this case we still have the bound (16). Dividing by $n$ and defining $L_n$ to be the expected description length per symbol, we obtain
$$\frac{H(X_1, \dots, X_n)}{n} \le L_n < \frac{H(X_1, \dots, X_n)}{n} + \frac{1}{n}.$$
If the stochastic process is stationary, then $H(X_1, \dots, X_n)/n \to H(\mathcal{X})$ as $n \to \infty$.

Suppose the code is designed for the wrong distribution $q(x)$ (for example, the wrong distribution may be the best estimate that we can make of the unknown true distribution).

Theorem 18 (Wrong code). The expected length under $p(x)$ of the code assignment $l(x) = \left\lceil \log \frac{1}{q(x)} \right\rceil$ satisfies
$$H(p) + D(p \,\|\, q) \le E_p\, l(X) < H(p) + D(p \,\|\, q) + 1. \tag{19}$$
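The wrong-code bound can be checked numerically for a given pair of distributions. The sketch below is illustrative only; the function name `wrong_code_report` and the example distributions are assumptions, and logarithms are taken to base 2.

```python
from math import log2

def wrong_code_report(p, q):
    """Return (E_p[l(X)], H(p), D(p||q)) in bits, where the lengths
    l(x) = ceil(log2(1/q(x))) are designed for q but the source follows p."""
    lengths = []
    for qi in q:
        l = 0
        while 2 ** l * qi < 1:      # smallest l with 2^{-l} <= q_i
            l += 1
        lengths.append(l)
    expected = sum(pi * li for pi, li in zip(p, lengths))
    entropy = -sum(pi * log2(pi) for pi in p if pi > 0)
    divergence = sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return expected, entropy, divergence

if __name__ == "__main__":
    # Hypothetical example: true source is dyadic, code designed for the uniform pmf.
    E, H, KL = wrong_code_report([0.5, 0.25, 0.125, 0.125], [0.25] * 4)
    print(E, H + KL, H + KL + 1)
```

In this example the designed lengths are all 2 bits, so the expected length equals $H(p) + D(p\,\|\,q)$ exactly, which shows that the lower bound can be attained.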

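Proof. The expected code length is
$$E\, l(X) = \sum_x p(x) \left\lceil \log \frac{1}{q(x)} \right\rceil < \sum_x p(x) \left( \log \frac{1}{q(x)} + 1 \right) = \sum_x p(x) \log \left( \frac{p(x)}{q(x)} \cdot \frac{1}{p(x)} \right) + 1$$
$$= \sum_x p(x) \log \frac{p(x)}{q(x)} + \sum_x p(x) \log \frac{1}{p(x)} + 1 = D(p \,\|\, q) + H(p) + 1.$$
The lower bound is obtained similarly.

Thus, believing that the distribution is $q(x)$ when the true distribution is $p(x)$ incurs a penalty of $D(p \,\|\, q)$ in the average description length.

4 Kraft Inequality for Uniquely Decodable Codes

The class of uniquely decodable codes is larger than the class of instantaneous codes. Can we achieve a lower expected codeword length for this class? No!

Theorem 19 (McMillan). The codeword lengths of any uniquely decodable $D$-ary code must satisfy the Kraft inequality
$$\sum_i D^{-l_i} \le 1. \tag{20}$$
Conversely, given a set of codeword lengths that satisfy this inequality, it is possible to construct a uniquely decodable code with these codeword lengths.

Proof. For the $k$th extension of the code, the length of the codeword of a sequence is
$$l(x_1, \dots, x_k) = \sum_{i=1}^{k} l(x_i). \tag{21}$$
We want to show that
$$\sum_{x \in \mathcal{X}} D^{-l(x)} \le 1. \tag{22}$$
Trick: consider the $k$th power of the left-hand side of (22):
$$\left( \sum_{x \in \mathcal{X}} D^{-l(x)} \right)^k = \sum_{x_1 \in \mathcal{X}} \cdots \sum_{x_k \in \mathcal{X}} D^{-l(x_1)} \cdots D^{-l(x_k)} = \sum_{(x_1, \dots, x_k) \in \mathcal{X}^k} D^{-l(x_1)} \cdots D^{-l(x_k)} = \sum_{x^k \in \mathcal{X}^k} D^{-l(x^k)}.$$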

Proof (continued). The last relation follows from (21). Gathering terms by word lengths,
$$\sum_{x^k \in \mathcal{X}^k} D^{-l(x^k)} = \sum_{m=1}^{k l_{\max}} a(m) D^{-m},$$
where $l_{\max}$ is the maximum codeword length and $a(m)$ is the number of source sequences $x^k$ mapping into codewords of length $m$. Since the code is uniquely decodable, $a(m) \le D^m$, and we have
$$\left( \sum_{x \in \mathcal{X}} D^{-l(x)} \right)^k \le \sum_{m=1}^{k l_{\max}} D^m D^{-m} = k l_{\max}. \tag{23}$$

Proof (continued). (23) is equivalent to
$$\sum_j D^{-l_j} \le (k l_{\max})^{1/k}. \tag{24}$$
Since $\lim_{k \to \infty} (k l_{\max})^{1/k} = 1$, we have
$$\sum_j D^{-l_j} \le 1.$$
Conversely, given any set of $l_1, l_2, \dots, l_m$ satisfying the Kraft inequality, we can construct an instantaneous code as proved in the section on the Kraft inequality. Since every instantaneous code is uniquely decodable, we have also constructed a uniquely decodable code.

Corollary 20. A uniquely decodable code for an infinite source alphabet $\mathcal{X}$ also satisfies the Kraft inequality.

Proof. For infinite $\mathcal{X}$ the preceding proof breaks down at (24), because $l_{\max}$ need not be finite. The fix: any subset of a uniquely decodable code is uniquely decodable, and any finite subset satisfies the Kraft inequality, hence
$$\sum_{i=1}^{\infty} D^{-l_i} = \lim_{N \to \infty} \sum_{i=1}^{N} D^{-l_i} \le 1.$$
Conversely, given any set of $l_1, l_2, \dots$ satisfying the Kraft inequality, we can construct an instantaneous code as proved in the section on the Kraft inequality. Since every instantaneous code is uniquely decodable, we have also constructed a uniquely decodable code with an infinite number of codewords. Hence the McMillan inequality also applies to infinite alphabets.
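The converse direction relies on building an instantaneous code from any lengths that satisfy the Kraft inequality. One way to carry out the "first available node" construction for a finite list of lengths is sketched below; the function names are hypothetical, and single-character digits are assumed, so the sketch is limited to $D \le 10$.

```python
from fractions import Fraction

def kraft_sum(lengths, D=2):
    """Exact value of sum_i D^{-l_i}."""
    return sum(Fraction(1, D ** l) for l in lengths)

def prefix_code_from_lengths(lengths, D=2):
    """Build a prefix code with the given codeword lengths (D <= 10 assumed).

    Lengths are processed in increasing order; each codeword is the first
    node at its depth that is not below an already chosen codeword.
    """
    if kraft_sum(lengths, D) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    code = [None] * len(lengths)
    value, prev_len = 0, 0            # next free node, as an integer with prev_len digits
    for i in order:
        l = lengths[i]
        value *= D ** (l - prev_len)  # move down to depth l
        digits, v = [], value
        for _ in range(l):            # write value as l base-D digits
            digits.append(str(v % D))
            v //= D
        code[i] = "".join(reversed(digits))
        value += 1                    # skip this node together with its subtree
        prev_len = l
    return code

if __name__ == "__main__":
    print(prefix_code_from_lengths([1, 2, 3, 3]))
```

Sorting by length first mirrors the reindexing $l_1 \le l_2 \le \cdots$ used in the infinite case; the returned list is in the caller's original symbol order.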

5 Huffman Codes

5.1 Huffman codes

Huffman gave in [1] an algorithm for the construction of an optimal code.

An optimal binary instantaneous code must satisfy:
1. $p(x_i) > p(x_j) \Rightarrow l(x_i) \le l(x_j)$ (else swap the codewords).
2. The two longest codewords have the same length (else chop a bit off the longer codeword).
3. There are two longest codewords differing only in the last bit (else chop a bit off all of them).

Huffman code construction:
1. Take the two smallest $p(x_i)$ and assign each a different last bit. Then merge them into a single symbol.
2. Repeat step 1 until only one symbol remains.

Used in JPEG, MP3, ...

Example 21. Let $X \sim \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 0.25 & 0.25 & 0.2 & 0.15 & 0.15 \end{pmatrix}$. We can combine the symbols 4 and 5 into a single source symbol, with a probability assignment of 0.3. Proceeding this way, combining the two least likely symbols into one symbol until we are finally left with only one symbol, and then assigning codewords to the symbols, we obtain the following table:

Codeword length   Codeword   X   Probability
2                 01         1   0.25
2                 10         2   0.25
2                 11         3   0.2
3                 000        4   0.15
3                 001        5   0.15

This code has average length $L = 2.3$ bits, while $H(X) \approx 2.29$ bits.

For a $D$-ary code, first add extra zero-probability (dummy) symbols until $|\mathcal{X}| - 1$ is a multiple of $D - 1$, and then group $D$ symbols at a time.
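The two-step construction above can be sketched with a priority queue. This is a minimal binary version, not the lecture's own implementation: the function name `huffman_code` is hypothetical, ties are broken by insertion order (so the codewords may differ from a hand-built table while the lengths agree), and the example distribution is the one assumed for the five-symbol example.

```python
import heapq
from fractions import Fraction
from itertools import count

def huffman_code(pmf):
    """Binary Huffman code for a pmf given as {symbol: probability}.

    Returns {symbol: codeword}. Each heap entry carries the partial
    codewords of all source symbols merged into that supersymbol.
    """
    if len(pmf) == 1:
        return {next(iter(pmf)): "0"}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {x: ""}) for x, p in pmf.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # the two least likely (super)symbols
        p1, _, c1 = heapq.heappop(heap)
        merged = {x: "0" + w for x, w in c0.items()}
        merged.update({x: "1" + w for x, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

if __name__ == "__main__":
    pmf = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 5),
           4: Fraction(3, 20), 5: Fraction(3, 20)}
    code = huffman_code(pmf)
    print(code, float(sum(p * len(code[x]) for x, p in pmf.items())))
```

Prepending the new bit at each merge means the bit assigned in the first merge ends up last in the codeword, which matches the "different last bit" description of step 1.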

5.2 Optimality of Huffman codes

We prove by induction that the binary Huffman code is optimal. There are many optimal codes: inverting all the bits or exchanging two codewords of the same length will give another optimal code. The Huffman procedure constructs one such optimal code.

Assume w.l.o.g. that $p_1 \ge p_2 \ge \cdots \ge p_m$. A code is optimal if $\sum_i p_i l_i$ is minimal.

Lemma 22. For any distribution, there exists an optimal instantaneous code (with minimum expected length) that satisfies the following properties:
1. The lengths are ordered inversely with the probabilities (i.e., if $p_j > p_k$, then $l_j \le l_k$).
2. The two longest codewords have the same length.
3. Two of the longest codewords differ only in the last bit and correspond to the two least likely symbols.

Proof. The proof amounts to swapping, trimming, and rearranging, as shown in Figure 3.

Figure 3: Properties of optimal codes.

Optimality of Huffman codes - proof

We assume that $p_1 \ge p_2 \ge \cdots \ge p_m$. A possible instantaneous code is given in (a). By trimming branches without siblings, we improve the code to (b). We now rearrange the tree as shown in (c), so that the word lengths are ordered by increasing length from top to bottom. Finally, we swap probability assignments to improve the expected depth of the tree, as shown in (d). Every optimal code can be rearranged and swapped into canonical form as in (d), where $l_1 \le l_2 \le \cdots \le l_m$ and $l_{m-1} = l_m$, and the last two codewords differ only in the last bit.

Consider an optimal code $C_m$.

1. If $p_j > p_k$, then $l_j \le l_k$. (Swap) Let $C'_m$ be $C_m$ with the codewords of $j$ and $k$ exchanged. Then
$$L(C'_m) - L(C_m) = \sum_i p_i l'_i - \sum_i p_i l_i = p_j l_k + p_k l_j - p_j l_j - p_k l_k = (p_j - p_k)(l_k - l_j).$$
But $p_j - p_k > 0$, and since $C_m$ is optimal, $L(C'_m) - L(C_m) \ge 0$. Hence, we must have $l_k \ge l_j$. Thus, $C_m$ itself satisfies property 1.

2. The two longest codewords are of the same length. (Trim) Otherwise, one can delete the last bit of the longer one, preserving the prefix property and achieving a lower expected codeword length. By property 1, the longest codewords must belong to the least probable source symbols.

3. The two longest codewords differ only in the last bit and correspond to the two least likely symbols. Not all optimal codes satisfy this property, but by rearranging, we can find an optimal code that does. If there is a maximal-length codeword without a sibling, we can delete the last bit of the codeword and still satisfy the prefix property. This reduces the average codeword length and contradicts the optimality of the code. Hence, every maximal-length codeword in any optimal code has a sibling. Now we can exchange the longest codewords so that the two lowest-probability source symbols are associated with two siblings on the tree. This does not change the expected length $\sum_i p_i l_i$. Thus, the codewords for the two lowest-probability source symbols have maximal length and agree in all but the last bit.

Thus, we have shown that there exists an optimal code satisfying the properties of the lemma. We call such codes canonical codes.

Theorem 23. Huffman coding is optimal; that is, if $C^*$ is a Huffman code and $C'$ is any other uniquely decodable code, then $L(C^*) \le L(C')$.

Proof. For any probability mass function for an alphabet of size $m$, $p = (p_1, p_2, \dots, p_m)$ with $p_1 \ge p_2 \ge \cdots \ge p_m$, we define the Huffman reduction $p' = (p_1, p_2, \dots, p_{m-2}, p_{m-1} + p_m)$ over an alphabet of size $m - 1$ (Figure 4). Let $C^*_{m-1}(p')$ be an optimal code for $p'$, and let $C^*_m(p)$ be the canonical optimal code for $p$.

Induction - two steps:
1. expand an optimal code for $p'$ to construct a code for $p$;
2. condense an optimal canonical code for $p$ to construct a code for the Huffman reduction $p'$.
Comparing the average codeword lengths for the two codes establishes that the optimal code for $p$ can be obtained by extending the optimal code for $p'$.

From the optimal code for $p'$ we construct an extension code for $m$ elements: take the codeword in $C^*_{m-1}$ corresponding to the weight $p_{m-1} + p_m$ and extend it by adding a 0 to form the codeword for symbol $m - 1$ and adding a 1 to form the codeword for symbol $m$:

$$\begin{array}{llll}
p' & C^*_{m-1}(p') & & C_m(p) \\
p_1 & w'_1,\ l'_1 & & w_1 = w'_1,\ l_1 = l'_1 \\
p_2 & w'_2,\ l'_2 & & w_2 = w'_2,\ l_2 = l'_2 \\
\vdots & \vdots & & \vdots \\
p_{m-2} & w'_{m-2},\ l'_{m-2} & & w_{m-2} = w'_{m-2},\ l_{m-2} = l'_{m-2} \\
p_{m-1} + p_m & w'_{m-1},\ l'_{m-1} & & w_{m-1} = w'_{m-1}0,\ l_{m-1} = l'_{m-1} + 1 \\
 & & & w_m = w'_{m-1}1,\ l_m = l'_{m-1} + 1
\end{array} \tag{25}$$

Calculating the average length $\sum_i p_i l_i$ gives
$$L(p) = L^*(p') + p_{m-1} + p_m. \tag{26}$$
Similarly, from the canonical code for $p$, we construct a code for $p'$ by merging the codewords for the two lowest-probability symbols $m - 1$ and $m$, with probabilities $p_{m-1}$ and $p_m$, which are siblings by the properties of the canonical code. The new code for $p'$ has average length
$$L(p') = \sum_{i=1}^{m-2} p_i l_i + p_{m-1}(l_{m-1} - 1) + p_m (l_m - 1) = \sum_{i=1}^{m} p_i l_i - p_{m-1} - p_m = L^*(p) - p_{m-1} - p_m. \tag{27}$$

Adding (26) and (27),
$$L(p') + L(p) = L^*(p') + L^*(p) \tag{28}$$
or
$$\left( L(p') - L^*(p') \right) + \left( L(p) - L^*(p) \right) = 0. \tag{29}$$
The expressions in parentheses are nonnegative, so both vanish; in particular $L(p) = L^*(p)$.

Figure 4: Induction step for Huffman coding. A canonical optimal code is illustrated in (a). Combining the two lowest probabilities gives (b). Rearranging the probabilities in decreasing order gives (c), for $m - 1$ symbols.

6 Shannon-Fano-Elias Coding

6.1 The code

Let $\mathcal{X} = \{1, 2, \dots, m\}$ with $p(x) > 0$ for all $x$. The cumulative distribution function (cdf) is
$$F(x) = \sum_{a \le x} p(a). \tag{30}$$

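Figure 5: cdf and Shannon-Fano-Elias coding

The modified cumulative distribution function (see Figure 5) is
$$\bar{F}(x) = \sum_{a < x} p(a) + \frac{1}{2} p(x). \tag{31}$$
- $a \neq b \Rightarrow \bar{F}(a) \neq \bar{F}(b)$; we can determine $x$ if we know $\bar{F}(x)$, thus $\bar{F}(x)$ can be used as a code for $x$.
- $\bar{F}(x) \in \mathbb{R}$, so in general $\bar{F}(x)$ has an infinite number of bits. We therefore use $\lfloor \bar{F}(x) \rfloor_{l(x)}$, which is $\bar{F}(x)$ truncated to $l(x)$ bits, with $l(x) = \left\lceil \log \frac{1}{p(x)} \right\rceil + 1$. Then
$$\bar{F}(x) - \lfloor \bar{F}(x) \rfloor_{l(x)} < \frac{1}{2^{l(x)}} \tag{32}$$
and
$$\frac{1}{2^{l(x)}} \le \frac{p(x)}{2} = \bar{F}(x) - F(x - 1), \tag{33}$$
so $\lfloor \bar{F}(x) \rfloor_{l(x)}$ lies within the step corresponding to $x$.

The code is prefix-free:
- a codeword $z_1 z_2 \dots z_l$ corresponds to the interval $[\,0.z_1 z_2 \dots z_l,\; 0.z_1 z_2 \dots z_l + 2^{-l})$; the code is prefix-free iff these intervals are disjoint;
- the interval length $2^{-l(x)}$ is at most half of the step in (33), so each interval lies inside its own step and the intervals are disjoint.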
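The construction can be sketched with exact rational arithmetic, which avoids rounding problems when extracting the binary digits of $\bar{F}(x)$. The function name `sfe_code` is hypothetical; probabilities are assumed to be passed as `Fraction` objects (or decimal strings such as `"0.2"`), in symbol order, and all strictly positive.

```python
from fractions import Fraction

def sfe_code(probs):
    """Shannon-Fano-Elias codewords for probabilities listed in symbol order."""
    code, F_prev = [], Fraction(0)       # F_prev = F(x - 1)
    for p in probs:
        p = Fraction(p)
        Fbar = F_prev + p / 2            # modified cdf
        l = 0
        while Fraction(1, 2 ** l) > p:   # l = ceil(log2(1/p))
            l += 1
        l += 1                           # l(x) = ceil(log2(1/p(x))) + 1
        bits, frac = [], Fbar
        for _ in range(l):               # first l binary digits of Fbar
            frac *= 2
            bit = int(frac >= 1)
            bits.append(str(bit))
            frac -= bit
        code.append("".join(bits))
        F_prev += p
    return code

if __name__ == "__main__":
    dyadic = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 8), Fraction(1, 8)]
    print(sfe_code(dyadic))
```

For the dyadic distribution in the usage lines, the sketch returns the codewords 001, 10, 1101, 1111, with expected length 2.75 bits. No sorting of the probabilities is needed, which is the main practical difference from the Huffman construction.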

Since $l(x) = \left\lceil \log \frac{1}{p(x)} \right\rceil + 1$, the expected length of the code is
$$L = \sum_x p(x)\, l(x) = \sum_x p(x) \left( \left\lceil \log \frac{1}{p(x)} \right\rceil + 1 \right) < H(X) + 2. \tag{34}$$
The procedure does not require ordered probabilities.

Example 24. We consider an example where all the probabilities are dyadic.

x   p(x)    F(x)    F̄(x)     F̄(x) in binary   l(x) = ⌈log 1/p(x)⌉ + 1   Codeword
1   0.25    0.25    0.125    0.001            3                          001
2   0.5     0.75    0.5      0.10             2                          10
3   0.125   0.875   0.8125   0.1101           4                          1101
4   0.125   1.0     0.9375   0.1111           4                          1111

$L = 2.75$ bits, $H(X) = 1.75$ bits.

Example 25. The distribution is not dyadic, so the binary representation of $\bar{F}(x)$ is in general infinite and only its first $l(x)$ bits are kept. The table has the same columns as in Example 24: x, p(x), F(x), F̄(x), F̄(x) in binary, l(x) = ⌈log 1/p(x)⌉ + 1, Codeword.

6.2 Competitive optimality of the Shannon code

Huffman coding is optimal: it has minimum expected length. For particular sequences, however, the Huffman code can be worse than other codes.

Formalization as a zero-sum game: two people are given a probability distribution and are asked to design an instantaneous code for the distribution. Then a source symbol is drawn, and the payoff to player A is 1, -1 or 0 (A's codeword is shorter, longer, or the same length).

Since the analysis for the Huffman code is difficult, we consider the Shannon code with codeword lengths $l(x) = \left\lceil \log \frac{1}{p(x)} \right\rceil$.

Theorem 26. Let $l(x)$ be the codeword lengths of the Shannon code and $l'(x)$ be the codeword lengths associated with any other uniquely decodable code. Then
$$P\left( l(X) \ge l'(X) + c \right) \le \frac{1}{2^{c-1}}. \tag{35}$$

19 Proof. P ( l(x) l (X) + c ) ( ) = P log l (X) + c p(x) ( ) P log p(x) l (X) + c = P (p(x) ) 2 l (X) c+ = :p() 2 l (X) c+ p() 2 l (X) (c ) :p() 2 l (X) c+ 2 l (X) 2 (c ) Kraft inequality 2 (c ) Stronger result if p() is dyadic: Theorem 27. For a dyadic pmf p(), if l() = log is the length of the binary p() Shannon code and l () is the length of any other uniquely decodable binary code, then P ( l(x) < l (X) ) P ( l(x) > l (X) ), (36) with equality iff l () = l() for all. Thus, the code length assignment l() = is uniquely competitively optimal. log p() Proof. Consider It can be shown graphically that if t > 0 sgn(t) = 0 if t = 0 if t < 0 sgn(t) 2 t, t Z. (37) P ( l (X) < l(x) ) P ( l (X) > l(x) ) = :l ()<l() p() = p()sgn ( l() l () ) = E ( sgn ( l(x) l (X) )) p() ( ) 2 l() l () = ( 2 l() 2 l() l () ) = 2 l () Kraft inequality = 0. bound on sgn 2 l () 2 l() :l ()>l() p() 9

Figure 6: sgn function and bound

We have equality in the above chain iff $t = 0$ or $t = 1$ in the bound on $\operatorname{sgn}(t)$ (i.e. $l(x) = l'(x)$ or $l(x) = l'(x) + 1$) and $l'(x)$ satisfies the Kraft inequality with equality, that is, $l'(x)$ is the length assignment of an optimal code (i.e. $l'(x) = l(x)$ for all $x$).

Corollary 28. For nondyadic probability mass functions,
$$E \operatorname{sgn}\left( l(X) - l'(X) - 1 \right) \le 0, \tag{38}$$
where $l(x) = \left\lceil \log \frac{1}{p(x)} \right\rceil$ and $l'(x)$ is any other code for the source.

Proof. Along the same lines as the preceding proof.

Hence Shannon coding is optimal under a large variety of criteria; it is robust with respect to the payoff function. In particular, for dyadic $p$, $E(l - l') \le 0$, $E \operatorname{sgn}(l - l') \le 0$, and, by use of inequality (37), $E f(l - l') \le 0$ for any function $f$ satisfying $f(t) \le 2^t - 1$, $t = 0, \pm 1, \pm 2, \dots$.

References

[1] Thomas M. Cover, Joy A. Thomas, Elements of Information Theory, 2nd edition, Wiley.
[2] David J.C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press.
[3] Robert M. Gray, Entropy and Information Theory, Springer.

Figure 7: David A. Huffman

Figure 8: Robert Fano

References

[1] D. A. Huffman, A method for the construction of minimum-redundancy codes, Proc. IRE, 40:1098-1101, 1952.


More information

Lecture 6: Kraft-McMillan Inequality and Huffman Coding

Lecture 6: Kraft-McMillan Inequality and Huffman Coding EE376A/STATS376A Information Theory Lecture 6-0/25/208 Lecture 6: Kraft-McMillan Inequality and Huffman Coding Lecturer: Tsachy Weissman Scribe: Akhil Prakash, Kai Yee Wan In this lecture, we begin with

More information

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16 EE539R: Problem Set 4 Assigned: 3/08/6, Due: 07/09/6. Cover and Thomas: Problem 3.5 Sets defined by probabilities: Define the set C n (t = {x n : P X n(x n 2 nt } (a We have = P X n(x n P X n(x n 2 nt

More information

Intro to Information Theory

Intro to Information Theory Intro to Information Theory Math Circle February 11, 2018 1. Random variables Let us review discrete random variables and some notation. A random variable X takes value a A with probability P (a) 0. Here

More information

10-704: Information Processing and Learning Spring Lecture 8: Feb 5

10-704: Information Processing and Learning Spring Lecture 8: Feb 5 10-704: Information Processing and Learning Spring 2015 Lecture 8: Feb 5 Lecturer: Aarti Singh Scribe: Siheng Chen Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal

More information

Lecture 4 : Adaptive source coding algorithms

Lecture 4 : Adaptive source coding algorithms Lecture 4 : Adaptive source coding algorithms February 2, 28 Information Theory Outline 1. Motivation ; 2. adaptive Huffman encoding ; 3. Gallager and Knuth s method ; 4. Dictionary methods : Lempel-Ziv

More information

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols.

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols. Universal Lossless coding Lempel-Ziv Coding Basic principles of lossless compression Historical review Variable-length-to-block coding Lempel-Ziv coding 1 Basic Principles of Lossless Coding 1. Exploit

More information

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3 3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2011, 28 November 2011 Memoryless Sources Arithmetic Coding Sources with Memory 2 / 19 Summary of last lecture Prefix-free

More information

Lecture Notes on Digital Transmission Source and Channel Coding. José Manuel Bioucas Dias

Lecture Notes on Digital Transmission Source and Channel Coding. José Manuel Bioucas Dias Lecture Notes on Digital Transmission Source and Channel Coding José Manuel Bioucas Dias February 2015 CHAPTER 1 Source and Channel Coding Contents 1 Source and Channel Coding 1 1.1 Introduction......................................

More information

Shannon-Fano-Elias coding

Shannon-Fano-Elias coding Shannon-Fano-Elias coding Suppose that we have a memoryless source X t taking values in the alphabet {1, 2,..., L}. Suppose that the probabilities for all symbols are strictly positive: p(i) > 0, i. The

More information

Multimedia Communications. Mathematical Preliminaries for Lossless Compression

Multimedia Communications. Mathematical Preliminaries for Lossless Compression Multimedia Communications Mathematical Preliminaries for Lossless Compression What we will see in this chapter Definition of information and entropy Modeling a data source Definition of coding and when

More information

Solutions to Set #2 Data Compression, Huffman code and AEP

Solutions to Set #2 Data Compression, Huffman code and AEP Solutions to Set #2 Data Compression, Huffman code and AEP. Huffman coding. Consider the random variable ( ) x x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0. 0.04 0.04 0.03 0.02 (a) Find a binary Huffman code

More information

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3 3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2013, 29 November 2013 Memoryless Sources Arithmetic Coding Sources with Memory Markov Example 2 / 21 Encoding the output

More information

On the Cost of Worst-Case Coding Length Constraints

On the Cost of Worst-Case Coding Length Constraints On the Cost of Worst-Case Coding Length Constraints Dror Baron and Andrew C. Singer Abstract We investigate the redundancy that arises from adding a worst-case length-constraint to uniquely decodable fixed

More information

Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes

Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes Weihua Hu Dept. of Mathematical Eng. Email: weihua96@gmail.com Hirosuke Yamamoto Dept. of Complexity Sci. and Eng. Email: Hirosuke@ieee.org

More information

Quantum-inspired Huffman Coding

Quantum-inspired Huffman Coding Quantum-inspired Huffman Coding A. S. Tolba, M. Z. Rashad, and M. A. El-Dosuky Dept. of Computer Science, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, Egypt. tolba_954@yahoo.com,

More information

U Logo Use Guidelines

U Logo Use Guidelines COMP2610/6261 - Information Theory Lecture 15: Arithmetic Coding U Logo Use Guidelines Mark Reid and Aditya Menon logo is a contemporary n of our heritage. presents our name, d and our motto: arn the nature

More information

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have EECS 229A Spring 2007 * * Solutions to Homework 3 1. Problem 4.11 on pg. 93 of the text. Stationary processes (a) By stationarity and the chain rule for entropy, we have H(X 0 ) + H(X n X 0 ) = H(X 0,

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 AEP Asymptotic Equipartition Property AEP In information theory, the analog of

More information

Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

More information

Asymptotic redundancy and prolixity

Asymptotic redundancy and prolixity Asymptotic redundancy and prolixity Yuval Dagan, Yuval Filmus, and Shay Moran April 6, 2017 Abstract Gallager (1978) considered the worst-case redundancy of Huffman codes as the maximum probability tends

More information

2 Generating Functions

2 Generating Functions 2 Generating Functions In this part of the course, we re going to introduce algebraic methods for counting and proving combinatorial identities. This is often greatly advantageous over the method of finding

More information

Motivation for Arithmetic Coding

Motivation for Arithmetic Coding Motivation for Arithmetic Coding Motivations for arithmetic coding: 1) Huffman coding algorithm can generate prefix codes with a minimum average codeword length. But this length is usually strictly greater

More information

ELEC 515 Information Theory. Distortionless Source Coding

ELEC 515 Information Theory. Distortionless Source Coding ELEC 515 Information Theory Distortionless Source Coding 1 Source Coding Output Alphabet Y={y 1,,y J } Source Encoder Lengths 2 Source Coding Two coding requirements The source sequence can be recovered

More information

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE General e Image Coder Structure Motion Video x(s 1,s 2,t) or x(s 1,s 2 ) Natural Image Sampling A form of data compression; usually lossless, but can be lossy Redundancy Removal Lossless compression: predictive

More information

Exercises with solutions (Set B)

Exercises with solutions (Set B) Exercises with solutions (Set B) 3. A fair coin is tossed an infinite number of times. Let Y n be a random variable, with n Z, that describes the outcome of the n-th coin toss. If the outcome of the n-th

More information

COS597D: Information Theory in Computer Science October 19, Lecture 10

COS597D: Information Theory in Computer Science October 19, Lecture 10 COS597D: Information Theory in Computer Science October 9, 20 Lecture 0 Lecturer: Mark Braverman Scribe: Andrej Risteski Kolmogorov Complexity In the previous lectures, we became acquainted with the concept

More information

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak 4. Quantization and Data Compression ECE 32 Spring 22 Purdue University, School of ECE Prof. What is data compression? Reducing the file size without compromising the quality of the data stored in the

More information

2018/5/3. YU Xiangyu

2018/5/3. YU Xiangyu 2018/5/3 YU Xiangyu yuxy@scut.edu.cn Entropy Huffman Code Entropy of Discrete Source Definition of entropy: If an information source X can generate n different messages x 1, x 2,, x i,, x n, then the

More information

Entropy and Ergodic Theory Lecture 3: The meaning of entropy in information theory

Entropy and Ergodic Theory Lecture 3: The meaning of entropy in information theory Entropy and Ergodic Theory Lecture 3: The meaning of entropy in information theory 1 The intuitive meaning of entropy Modern information theory was born in Shannon s 1948 paper A Mathematical Theory of

More information

Lecture 1 : Data Compression and Entropy

Lecture 1 : Data Compression and Entropy CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for

More information

Using an innovative coding algorithm for data encryption

Using an innovative coding algorithm for data encryption Using an innovative coding algorithm for data encryption Xiaoyu Ruan and Rajendra S. Katti Abstract This paper discusses the problem of using data compression for encryption. We first propose an algorithm

More information

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Information Theory and Distribution Modeling

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Information Theory and Distribution Modeling TTIC 31230, Fundamentals of Deep Learning David McAllester, April 2017 Information Theory and Distribution Modeling Why do we model distributions and conditional distributions using the following objective

More information

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd 4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we

More information

On the redundancy of optimum fixed-to-variable length codes

On the redundancy of optimum fixed-to-variable length codes On the redundancy of optimum fixed-to-variable length codes Peter R. Stubley' Bell-Northern Reserch Abstract There has been much interest in recent years in bounds on the redundancy of Huffman codes, given

More information

CSCI 2570 Introduction to Nanocomputing

CSCI 2570 Introduction to Nanocomputing CSCI 2570 Introduction to Nanocomputing Information Theory John E Savage What is Information Theory Introduced by Claude Shannon. See Wikipedia Two foci: a) data compression and b) reliable communication

More information

COMM901 Source Coding and Compression. Quiz 1

COMM901 Source Coding and Compression. Quiz 1 German University in Cairo - GUC Faculty of Information Engineering & Technology - IET Department of Communication Engineering Winter Semester 2013/2014 Students Name: Students ID: COMM901 Source Coding

More information

Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2

Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2 582487 Data Compression Techniques (Spring 22) Model Solutions for Exercise 2 If you have any feedback or corrections, please contact nvalimak at cs.helsinki.fi.. Problem: Construct a canonical prefix

More information

Introduction to information theory and coding

Introduction to information theory and coding Introduction to information theory and coding Louis WEHENKEL Set of slides No 5 State of the art in data compression Stochastic processes and models for information sources First Shannon theorem : data

More information

18.310A Final exam practice questions

18.310A Final exam practice questions 18.310A Final exam practice questions This is a collection of practice questions, gathered randomly from previous exams and quizzes. They may not be representative of what will be on the final. In particular,

More information

Information Theory. Week 4 Compressing streams. Iain Murray,

Information Theory. Week 4 Compressing streams. Iain Murray, Information Theory http://www.inf.ed.ac.uk/teaching/courses/it/ Week 4 Compressing streams Iain Murray, 2014 School of Informatics, University of Edinburgh Jensen s inequality For convex functions: E[f(x)]

More information

Complex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity

Complex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity Complex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity Eckehard Olbrich MPI MiS Leipzig Potsdam WS 2007/08 Olbrich (Leipzig) 26.10.2007 1 / 18 Overview 1 Summary

More information

EE5319R: Problem Set 3 Assigned: 24/08/16, Due: 31/08/16

EE5319R: Problem Set 3 Assigned: 24/08/16, Due: 31/08/16 EE539R: Problem Set 3 Assigned: 24/08/6, Due: 3/08/6. Cover and Thomas: Problem 2.30 (Maimum Entropy): Solution: We are required to maimize H(P X ) over all distributions P X on the non-negative integers

More information

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for MARKOV CHAINS A finite state Markov chain is a sequence S 0,S 1,... of discrete cv s from a finite alphabet S where q 0 (s) is a pmf on S 0 and for n 1, Q(s s ) = Pr(S n =s S n 1 =s ) = Pr(S n =s S n 1

More information

Information Theory. Coding and Information Theory. Information Theory Textbooks. Entropy

Information Theory. Coding and Information Theory. Information Theory Textbooks. Entropy Coding and Information Theory Chris Williams, School of Informatics, University of Edinburgh Overview What is information theory? Entropy Coding Information Theory Shannon (1948): Information theory is

More information

Information Theory. M1 Informatique (parcours recherche et innovation) Aline Roumy. January INRIA Rennes 1/ 73

Information Theory. M1 Informatique (parcours recherche et innovation) Aline Roumy. January INRIA Rennes 1/ 73 1/ 73 Information Theory M1 Informatique (parcours recherche et innovation) Aline Roumy INRIA Rennes January 2018 Outline 2/ 73 1 Non mathematical introduction 2 Mathematical introduction: definitions

More information

Introduction to information theory and coding

Introduction to information theory and coding Introduction to information theory and coding Louis WEHENKEL Set of slides No 4 Source modeling and source coding Stochastic processes and models for information sources First Shannon theorem : data compression

More information

A Mathematical Theory of Communication

A Mathematical Theory of Communication A Mathematical Theory of Communication Ben Eggers Abstract This paper defines information-theoretic entropy and proves some elementary results about it. Notably, we prove that given a few basic assumptions

More information

Information & Correlation

Information & Correlation Information & Correlation Jilles Vreeken 11 June 2014 (TADA) Questions of the day What is information? How can we measure correlation? and what do talking drums have to do with this? Bits and Pieces What

More information

Summary of Last Lectures

Summary of Last Lectures Lossless Coding IV a k p k b k a 0.16 111 b 0.04 0001 c 0.04 0000 d 0.16 110 e 0.23 01 f 0.07 1001 g 0.06 1000 h 0.09 001 i 0.15 101 100 root 1 60 1 0 0 1 40 0 32 28 23 e 17 1 0 1 0 1 0 16 a 16 d 15 i

More information

EE-597 Notes Quantization

EE-597 Notes Quantization EE-597 Notes Quantization Phil Schniter June, 4 Quantization Given a continuous-time and continuous-amplitude signal (t, processing and storage by modern digital hardware requires discretization in both

More information

CS4800: Algorithms & Data Jonathan Ullman

CS4800: Algorithms & Data Jonathan Ullman CS4800: Algorithms & Data Jonathan Ullman Lecture 22: Greedy Algorithms: Huffman Codes Data Compression and Entropy Apr 5, 2018 Data Compression How do we store strings of text compactly? A (binary) code

More information

(Classical) Information Theory II: Source coding

(Classical) Information Theory II: Source coding (Classical) Information Theory II: Source coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract The information content of a random variable

More information

Course notes for Data Compression - 1 The Statistical Coding Method Fall 2005

Course notes for Data Compression - 1 The Statistical Coding Method Fall 2005 Course notes for Data Compression - 1 The Statistical Coding Method Fall 2005 Peter Bro Miltersen August 29, 2005 Version 2.0 1 The paradox of data compression Definition 1 Let Σ be an alphabet and let

More information

Generalized Kraft Inequality and Arithmetic Coding

Generalized Kraft Inequality and Arithmetic Coding J. J. Rissanen Generalized Kraft Inequality and Arithmetic Coding Abstract: Algorithms for encoding and decoding finite strings over a finite alphabet are described. The coding operations are arithmetic

More information

UNIT I INFORMATION THEORY. I k log 2

UNIT I INFORMATION THEORY. I k log 2 UNIT I INFORMATION THEORY Claude Shannon 1916-2001 Creator of Information Theory, lays the foundation for implementing logic in digital circuits as part of his Masters Thesis! (1939) and published a paper

More information

Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet)

Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet) Compression Motivation Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet) Storage: Store large & complex 3D models (e.g. 3D scanner

More information

Data Compression Techniques

Data Compression Techniques Data Compression Techniques Part 2: Text Compression Lecture 5: Context-Based Compression Juha Kärkkäinen 14.11.2017 1 / 19 Text Compression We will now look at techniques for text compression. These techniques

More information

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST Information Theory Lecture 5 Entropy rate and Markov sources STEFAN HÖST Universal Source Coding Huffman coding is optimal, what is the problem? In the previous coding schemes (Huffman and Shannon-Fano)it

More information

Lecture 1: September 25, A quick reminder about random variables and convexity

Lecture 1: September 25, A quick reminder about random variables and convexity Information and Coding Theory Autumn 207 Lecturer: Madhur Tulsiani Lecture : September 25, 207 Administrivia This course will cover some basic concepts in information and coding theory, and their applications

More information

LECTURE 3. Last time:

LECTURE 3. Last time: LECTURE 3 Last time: Mutual Information. Convexity and concavity Jensen s inequality Information Inequality Data processing theorem Fano s Inequality Lecture outline Stochastic processes, Entropy rate

More information

(Classical) Information Theory III: Noisy channel coding

(Classical) Information Theory III: Noisy channel coding (Classical) Information Theory III: Noisy channel coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract What is the best possible way

More information

Tight Bounds on Minimum Maximum Pointwise Redundancy

Tight Bounds on Minimum Maximum Pointwise Redundancy Tight Bounds on Minimum Maximum Pointwise Redundancy Michael B. Baer vlnks Mountain View, CA 94041-2803, USA Email:.calbear@ 1eee.org Abstract This paper presents new lower and upper bounds for the optimal

More information

Information Theory in Intelligent Decision Making

Information Theory in Intelligent Decision Making Information Theory in Intelligent Decision Making Adaptive Systems and Algorithms Research Groups School of Computer Science University of Hertfordshire, United Kingdom June 7, 2015 Information Theory

More information

Introduction to algebraic codings Lecture Notes for MTH 416 Fall Ulrich Meierfrankenfeld

Introduction to algebraic codings Lecture Notes for MTH 416 Fall Ulrich Meierfrankenfeld Introduction to algebraic codings Lecture Notes for MTH 416 Fall 2014 Ulrich Meierfrankenfeld December 9, 2014 2 Preface These are the Lecture Notes for the class MTH 416 in Fall 2014 at Michigan State

More information

Robustness and duality of maximum entropy and exponential family distributions

Robustness and duality of maximum entropy and exponential family distributions Chapter 7 Robustness and duality of maximum entropy and exponential family distributions In this lecture, we continue our study of exponential families, but now we investigate their properties in somewhat

More information

Information Theory, Statistics, and Decision Trees

Information Theory, Statistics, and Decision Trees Information Theory, Statistics, and Decision Trees Léon Bottou COS 424 4/6/2010 Summary 1. Basic information theory. 2. Decision trees. 3. Information theory and statistics. Léon Bottou 2/31 COS 424 4/6/2010

More information

Lecture 11: Quantum Information III - Source Coding

Lecture 11: Quantum Information III - Source Coding CSCI5370 Quantum Computing November 25, 203 Lecture : Quantum Information III - Source Coding Lecturer: Shengyu Zhang Scribe: Hing Yin Tsang. Holevo s bound Suppose Alice has an information source X that

More information

Lecture 22: Final Review

Lecture 22: Final Review Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information

More information

Information Theory and Coding Techniques

Information Theory and Coding Techniques Information Theory and Coding Techniques Lecture 1.2: Introduction and Course Outlines Information Theory 1 Information Theory and Coding Techniques Prof. Ja-Ling Wu Department of Computer Science and

More information

( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r C h a p t e r 1 7 : I n f o r m a t i o n S c i e n c e P a g e 1

( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r C h a p t e r 1 7 : I n f o r m a t i o n S c i e n c e P a g e 1 ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2 0 1 6 C h a p t e r 1 7 : I n f o r m a t i o n S c i e n c e P a g e 1 CHAPTER 17: Information Science In this chapter, we learn how data can

More information