ECE 587 / STA 563: Lecture 5 Lossless Compression

Information Theory, Duke University, Fall 2017
Author: Galen Reeves
Last Modified: October 18, 2017

Outline of lecture:
5.1 Introduction to Lossless Source Coding
    Motivating Example
    Definitions
5.2 Fundamental Limits of Compression
    Uniquely Decodable Codes & Kraft Inequality
    Codes on Trees
    Prefix-Free Codes & Kraft Inequality
5.3 Shannon Code
5.4 Huffman Code
    Optimality of Huffman code
5.5 Coding Over Blocks
5.6 Coding with Unknown Distributions
    Minimax Redundancy
    Coding with Unknown Alphabet
5.7 Lempel-Ziv Code

5.1 Introduction to Lossless Source Coding

Motivating Example

Example: Consider assigning binary phone numbers to your friends.

[Table: columns friend, probability, code (i) through code (vi); rows Alice, Bob, Carol. The probabilities are truncated to "1/" and the codeword entries are missing in this transcription.]

Analysis of codes:
(i) Alice and Bob have the same number. Does not work.
(ii) Works, but the phone numbers are too long.
(iii) Not decodable. 10 could mean Carol, or could mean Bob, Alice.
(iv) Works, but why do we need two zeros for Alice? After the first zero it is clear who we want.
(v) OK, but Alice has a shorter code than Bob.
(vi) This is the optimal code. Once you are finished dialing you can be connected immediately.

Desirable properties of a code:
(1) Uniquely decodable.
(2) Efficient, i.e., it minimizes the average codeword length
$$E[l(X)] = \sum_{x \in \mathcal{X}} p(x)\, l(x),$$
where $l(x)$ is the length of the codeword associated with symbol $x$.
(3) Prefix-free, i.e., no codeword is the prefix of another codeword.

Definitions

- A source code is a mapping $C$ from a source alphabet $\mathcal{X}$ to $D$-ary sequences.
- $\mathcal{D}^*$ is the set of finite-length strings of symbols from the $D$-ary alphabet $\mathcal{D} = \{1, 2, \dots, D\}$, i.e. $\mathcal{D}^* = \mathcal{D} \cup \mathcal{D}^2 \cup \mathcal{D}^3 \cup \cdots$
- $C(x) \in \mathcal{D}^*$ is the codeword for $x \in \mathcal{X}$.
- $l(x)$ is the length of $C(x)$.
- A code is nonsingular if $x \neq \tilde{x} \implies C(x) \neq C(\tilde{x})$.
- The extension of the code $C$ is the mapping $C^*$ from finite-length strings of $\mathcal{X}$ to finite-length strings of $\mathcal{D}$:
$$C^*(\underbrace{x_1 x_2 \cdots x_n}_{\text{input (source)}}) = \underbrace{C(x_1) C(x_2) \cdots C(x_n)}_{\text{output (code)}}$$
- A code $C$ is uniquely decodable if its extension $C^*$ is nonsingular, i.e., for all $n$,
$$x_1 x_2 \cdots x_n \neq \tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_n \implies C(x_1) C(x_2) \cdots C(x_n) \neq C(\tilde{x}_1) C(\tilde{x}_2) \cdots C(\tilde{x}_n)$$
- A code is prefix-free if no codeword is prefixed by another codeword. Such codes are also known as prefix codes or instantaneous codes.

[Figure: Venn diagram of instantaneous, uniquely decodable, and nonsingular codes.]

Given a distribution $p(x)$ on the input symbols, the goal is to minimize the expected length per symbol
$$E[l(X)] = \sum_{x \in \mathcal{X}} l(x)\, p(x).$$
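The definitions above can be checked mechanically on small codes. The following is a minimal sketch, assuming binary codewords stored as Python strings; the function names and the three toy codes are hypothetical and are not the codes from the phone-number table. The collision search over the extension is bounded, so a result of `None` does not prove unique decodability.

```python
from itertools import product

def is_nonsingular(code):
    """True if distinct symbols are mapped to distinct codewords."""
    return len(set(code.values())) == len(code)

def is_prefix_free(code):
    """True if no codeword is a prefix of a different symbol's codeword."""
    words = list(code.values())
    return not any(
        i != j and w.startswith(v)
        for i, v in enumerate(words)
        for j, w in enumerate(words)
    )

def expected_length(code, p):
    """E[l(X)] = sum over x of p(x) * l(x)."""
    return sum(p[x] * len(code[x]) for x in code)

def extension_collision(code, max_n=4):
    """Look for two different source strings (length <= max_n) whose
    concatenated codewords coincide. Returns the pair, or None."""
    seen = {}
    for n in range(1, max_n + 1):
        for xs in product(code, repeat=n):
            enc = "".join(code[x] for x in xs)
            if enc in seen and seen[enc] != xs:
                return seen[enc], xs
            seen[enc] = xs
    return None

# Hypothetical toy codes for illustration.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
prefix_code = {"a": "0", "b": "10", "c": "11"}   # instantaneous
suffix_code = {"a": "0", "b": "01", "c": "11"}   # uniquely decodable, not prefix-free
ambiguous = {"a": "0", "b": "1", "c": "10"}      # nonsingular, not uniquely decodable

print(is_prefix_free(prefix_code), expected_length(prefix_code, p))
print(is_prefix_free(suffix_code), extension_collision(suffix_code))
print(is_nonsingular(ambiguous), extension_collision(ambiguous))
```

The third code mirrors the failure described for code (iii): the string 10 decodes either as the single symbol `c` or as `b` followed by `a`.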

5.2 Fundamental Limits of Compression

This section considers the limits of lossless compression and proves the following result.

Theorem: For any source distribution $p(x)$, the expected length $E[l(X)]$ of the optimal uniquely decodable $D$-ary code obeys
$$\frac{H(X)}{\log D} \le E[l(X)] < \frac{H(X)}{\log D} + 1.$$
Furthermore, there exists a prefix-free code which is optimal.

Uniquely Decodable Codes & Kraft Inequality

Let $l(x)$ be the length function associated with a code $C$, i.e. $l(x)$ is the length of the codeword $C(x)$ for all $x \in \mathcal{X}$.

A code $C$ satisfies the Kraft inequality if and only if
$$\sum_{x \in \mathcal{X}} D^{-l(x)} \le 1. \qquad \text{(Kraft Inequality)}$$

Theorem: Every uniquely decodable code satisfies the Kraft inequality, i.e.
$$\text{Uniquely decodable} \implies \text{Kraft Inequality}$$

Proof

Let $C$ be a uniquely decodable source code. For a source sequence $x^n$, the length of the extended codeword $C^*(x^n)$ is given by
$$l(x^n) = \sum_{i=1}^{n} l(x_i),$$
and thus the length function of the extended codeword obeys
$$n\, l_{\min} \le l(x^n) \le n\, l_{\max},$$
where $l_{\min} = \min_{x \in \mathcal{X}} l(x)$ and $l_{\max} = \max_{x \in \mathcal{X}} l(x)$.

Let $A_k$ be the number of source sequences of length $n$ for which $l(x^n) = k$, i.e.
$$A_k = \#\{x^n \in \mathcal{X}^n : l(x^n) = k\}.$$
Since the code is uniquely decodable, the number of source sequences with codewords of length $k$ cannot exceed the number of $D$-ary sequences of length $k$, and so
$$A_k \le D^k.$$

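The extended codeword lengths must obey
$$\sum_{x^n \in \mathcal{X}^n} D^{-l(x^n)} = \sum_{x^n \in \mathcal{X}^n} \left( \sum_{k=1}^{\infty} \mathbf{1}(l(x^n) = k)\, D^{-k} \right) = \sum_{k=1}^{\infty} \left( \sum_{x^n \in \mathcal{X}^n} \mathbf{1}(l(x^n) = k) \right) D^{-k} = \sum_{k=1}^{\infty} A_k D^{-k}$$
$$\le \sum_{k = n l_{\min}}^{n l_{\max}} D^{k} D^{-k} \quad \text{(since uniquely decodable)}$$
$$\le n\, l_{\max}.$$

The extended codeword lengths must also obey
$$\sum_{x^n \in \mathcal{X}^n} D^{-l(x^n)} = \sum_{x_1 \in \mathcal{X}} \sum_{x_2 \in \mathcal{X}} \cdots \sum_{x_n \in \mathcal{X}} D^{-l(x_1)} D^{-l(x_2)} \cdots D^{-l(x_n)} = \left[ \sum_{x \in \mathcal{X}} D^{-l(x)} \right]^n.$$

Combining the above displays shows that
$$\left[ \sum_{x \in \mathcal{X}} D^{-l(x)} \right]^n \le n\, l_{\max} \quad \text{for all } n.$$
If the code does not satisfy the Kraft inequality, then the left-hand side will blow up exponentially as $n$ becomes large, and this inequality will be violated. Thus, the code must satisfy the Kraft inequality.

Theorem: For any source distribution $p(x)$, the expected codeword length of every $D$-ary uniquely decodable code obeys the lower bound
$$E[l(X)] \ge \frac{H(X)}{\log D}.$$

Proof
$$E[l(X)] - \frac{H(X)}{\log D} = \sum_{x} p(x) \left[ l(x) + \log_D p(x) \right]$$
$$= \sum_{x} p(x) \left[ -\log_D D^{-l(x)} + \log_D p(x) \right]$$
$$= -\sum_{x} p(x) \log_D \left( \frac{D^{-l(x)}}{p(x)} \right)$$
$$\ge \log_D(e) \sum_{x} p(x) \left[ 1 - \frac{D^{-l(x)}}{p(x)} \right] \quad \text{(Fundamental Inequality)}$$
$$= \log_D(e) \left( 1 - \sum_{x} D^{-l(x)} \right)$$
$$\ge 0. \quad \text{(Kraft Inequality)}$$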
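Both results in this subsection are easy to check numerically. The sketch below is illustrative only: the distribution, the length assignments, and the function names are assumptions chosen for the example. It evaluates the Kraft sum, compares the expected length against $H(X)/\log D$, and for a length assignment that violates the Kraft inequality it finds the first $n$ at which $\left[\sum_x D^{-l(x)}\right]^n$ exceeds $n\, l_{\max}$, which is the contradiction used in the proof.

```python
import math

def kraft_sum(lengths, D=2):
    """Left-hand side of the Kraft inequality: sum over x of D^(-l(x))."""
    return sum(D ** (-l) for l in lengths)

def entropy(p, D=2):
    """Entropy of p measured in base-D units, i.e. H(X) / log D."""
    return -sum(q * math.log(q) for q in p if q > 0) / math.log(D)

def first_violation(lengths, D=2, n_max=1000):
    """Smallest n with kraft_sum^n > n * l_max, or None if not found."""
    s, l_max = kraft_sum(lengths, D), max(lengths)
    for n in range(1, n_max + 1):
        if s ** n > n * l_max:
            return n
    return None

# Hypothetical dyadic distribution with matching codeword lengths.
p = [0.5, 0.25, 0.125, 0.125]
good_lengths = [1, 2, 3, 3]
mean_len = sum(q * l for q, l in zip(p, good_lengths))
print(kraft_sum(good_lengths), entropy(p), mean_len)

# Lengths (1, 1, 2) have Kraft sum 1.25 > 1, so no uniquely decodable
# binary code can have them.
bad_lengths = [1, 1, 2]
print(kraft_sum(bad_lengths), first_violation(bad_lengths))
```

For the dyadic distribution the lower bound is met with equality, since $l(x) = \log_2 (1/p(x))$ is already an integer for every symbol.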

Codes on Trees

- Any $D$-ary code can be represented as a $D$-ary tree.
- A $D$-ary tree consists of a root with branches, nodes, and leaves. The root and every node have exactly $D$ children.

[Figure: Example of a binary tree.]

- The depth of a leaf (i.e., the number of steps it takes to reach the root) corresponds to the length of the codeword.

Lemma: A code is prefix-free if, and only if, each of its codewords is a leaf:
$$\text{code is prefix-free} \iff \text{every codeword is a leaf}$$

Prefix-Free Codes & Kraft Inequality

Theorem: There exists a prefix-free code with length function $l(x)$ if and only if $l(x)$ satisfies the Kraft inequality, i.e.
$$l(x) \text{ is the length function of a prefix-free code} \iff \sum_{x \in \mathcal{X}} D^{-l(x)} \le 1.$$

Proof of $\implies$

One option is to recognize that a prefix-free code is uniquely decodable and recall that all uniquely decodable codes must satisfy the Kraft inequality. A different proof is given below.

Let $l(x)$ be the length function of a prefix-free code and let $l_{\max} = \max_x l(x)$. The tree corresponding to the code has maximum depth $l_{\max}$, and thus the total number of leaves in the tree is upper bounded by $D^{l_{\max}}$. Note that this number is attained only if all codewords are the same length.

Since the code is prefix-free, any codeword of length $l(x) < l_{\max}$ is a leaf, and thus prunes (i.e., removes) $D^{l_{\max} - l(x)}$ leaves from the total number of possible leaves.

Since the number of leaves that are removed cannot exceed the total number of possible leaves, we have
$$\sum_{x \in \mathcal{X}} D^{l_{\max} - l(x)} \le D^{l_{\max}}.$$

Dividing both sides by $D^{l_{\max}}$ yields
$$\sum_{x \in \mathcal{X}} D^{-l(x)} \le 1.$$

Proof of $\impliedby$

Let $l(x)$ be a length function that satisfies the Kraft inequality. The goal is to create a $D$-ary tree where the depths of the leaves correspond to $l(x)$.

It suffices to show that, for each integer $k$, after all codewords of length $l(x) < k$ have been assigned, there remain enough unpruned nodes on level $k$ to handle the codewords with length $l(x) = k$. That is, we need to show that for each $k$,
$$\underbrace{D^k - \sum_{x : l(x) < k} D^{k - l(x)}}_{\text{no. remaining nodes after assigning short codes}} \ge \underbrace{\#\{x : l(x) = k\}}_{\text{no. needed for codes of length } k}.$$
The right-hand side can be written as
$$\#\{x : l(x) = k\} = \sum_{x : l(x) = k} D^{k - l(x)}.$$
So, to succeed on level $k$ we need
$$D^k \ge \sum_{x : l(x) < k} D^{k - l(x)} + \sum_{x : l(x) = k} D^{k - l(x)} = \sum_{x : l(x) \le k} D^{k - l(x)}.$$
Dividing both sides by $D^k$ yields
$$1 \ge \sum_{x : l(x) \le k} D^{-l(x)}.$$
Since the lengths satisfy the Kraft inequality,
$$\sum_{x : l(x) \le k} D^{-l(x)} \le \sum_{x \in \mathcal{X}} D^{-l(x)} \le 1,$$
and thus we have shown that there always exist enough remaining nodes to handle the codewords of length $k$.

Theorem: For any source distribution $p(x)$, there exists a $D$-ary prefix-free code whose expected length satisfies the upper bound
$$E[l(X)] < \frac{H(X)}{\log D} + 1.$$

Proof (This proof is nonintuitive; the next section gives an explicit construction.)

By the previous theorem, it suffices to show that there exists a length function $l(x)$ that satisfies both the Kraft inequality and the stated inequality.

Consider the length function
$$l(x) = \left\lceil \log_D \frac{1}{p(x)} \right\rceil,$$
where $\lceil \cdot \rceil$ denotes the ceiling function (i.e., round up to the nearest integer). Then
$$\log_D \left( \frac{1}{p(x)} \right) \le l(x) < \log_D \left( \frac{1}{p(x)} \right) + 1.$$
Since
$$\sum_{x \in \mathcal{X}} D^{-l(x)} \le \sum_{x \in \mathcal{X}} D^{-\log_D \frac{1}{p(x)}} = \sum_{x \in \mathcal{X}} p(x) = 1,$$
this length function satisfies the Kraft inequality, and there exists a prefix-free code with length function $l(x)$.

The expected word length is given by
$$E[l(X)] = E\left[ \left\lceil \log_D \frac{1}{p(X)} \right\rceil \right] < E\left[ \log_D \left( \frac{1}{p(X)} \right) + 1 \right] = \frac{H(X)}{\log D} + 1.$$

5.3 Shannon Code

We now investigate how to construct codes with nice properties. These include:
- short expected code length: better compression
- prefix-free: can decode instantaneously
- efficient representation: don't need a huge lookup table for encoding and decoding

Intuitively, the key idea is to assign shorter codewords to more likely source symbols.

The results of the previous section show that there exists a prefix-free code such that:
- The length function $l(x)$ is given by
$$l(x) = \left\lceil \log_D \left( \frac{1}{p(x)} \right) \right\rceil$$
- The expected length obeys
$$E[l(X)] < \frac{H(X)}{\log D} + 1$$

In 1948, Shannon proposed a specific way to build this code. The resulting code is also known as the Shannon-Fano-Elias code.

Without loss of generality let the source alphabet be $\mathcal{X} = \{1, 2, \dots, m\}$. The cumulative distribution function (cdf) of the source distribution is
$$F(x) = \sum_{k \le x} p(k).$$

[Figure: Illustration of cdf.]
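The existence argument can be made concrete. The sketch below first computes the lengths $l(x) = \lceil \log_D (1/p(x)) \rceil$ with exact rational arithmetic, then builds a prefix-free code with those lengths by walking the $D$-ary tree level by level and always taking the next unused node, which is the idea behind the proof that the Kraft inequality is sufficient. The function names and the example distribution are assumptions made for illustration, and digits are rendered as characters, so the sketch assumes $D \le 10$.

```python
from fractions import Fraction

def shannon_lengths(p, D=2):
    """l(x) = ceil(log_D(1/p(x))): the smallest l with D^(-l) <= p(x)."""
    lengths = []
    for q in map(Fraction, p):
        l = 0
        while Fraction(1, D ** l) > q:
            l += 1
        lengths.append(l)
    return lengths

def to_digits(value, width, D):
    """Write a nonnegative integer as a D-ary string of fixed width."""
    digits = []
    for _ in range(width):
        value, r = divmod(value, D)
        digits.append(str(r))
    return "".join(reversed(digits))

def prefix_code_from_lengths(lengths, D=2):
    """Assign codewords in order of increasing length, always taking the
    leftmost node at the required depth that has not been pruned."""
    if sum(Fraction(1, D ** l) for l in lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codewords = [None] * len(lengths)
    value, depth = 0, 0
    for i in order:
        value *= D ** (lengths[i] - depth)   # descend to depth lengths[i]
        depth = lengths[i]
        codewords[i] = to_digits(value, depth, D)
        value += 1                           # next free node on this level
    return codewords

p = [0.4, 0.3, 0.2, 0.1]                     # hypothetical distribution
lengths = shannon_lengths(p)
code = prefix_code_from_lengths(lengths)
print(lengths, code)
```

Sorting by length matters: short codewords are placed first, so each one prunes a whole subtree before longer codewords are assigned.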

Construction of the Shannon Code

For $x \in \{1, 2, \dots, m\}$, let $\bar{F}(x)$ be the midpoint of the interval $[F(x-1), F(x))$, i.e.
$$\bar{F}(x) = \frac{F(x-1) + F(x)}{2} = F(x-1) + \frac{p(x)}{2}.$$
Note that $\bar{F}(x)$ is a real number between zero and one that uniquely identifies $x$.

The codeword $C(x)$ corresponds to the $D$-ary expansion of the real number $\bar{F}(x)$, truncated at the point where the codeword is unique (i.e. cannot be confused with the midpoint of any other interval):
$$C(x) = D\text{-ary expansion of } \bar{F}(x) \text{ such that } |C(x) - \bar{F}(x)| < \tfrac{1}{2} p(x).$$
If $l(x)$ terms are retained then the codeword is given by
$$\bar{F}(x) = \overbrace{0.\underbrace{z_1 z_2 \cdots z_{l(x)}}_{C(x)} z_{l(x)+1} z_{l(x)+2} \cdots}^{D\text{-ary expansion}}$$
Note that it is sufficient to retain the first $l(x)$ terms where
$$l(x) = \left\lceil \log_D \left( \frac{1}{p(x)} \right) \right\rceil + 1,$$
since this implies that
$$|C(x) - \bar{F}(x)| < D^{-l(x)} \le \frac{p(x)}{D} \le \frac{1}{2} p(x).$$
Thus, the expected length of the Shannon code obeys
$$E[l(X)] < \frac{H(X)}{\log D} + 2.$$

Example: Consider the following binary Shannon code.

[Table: columns $x$, $p(x)$, $F(x)$, $\bar{F}(x)$, $\bar{F}(x)$ in binary, $\lceil \log \frac{1}{p(x)} \rceil + 1$, $C(x)$. The table entries and the numerical value of the entropy are missing in this transcription.]

The entropy $H(X)$ is measured in bits, and the expected length is $E[l(X)] = 3.5$.

Note that there is a one-to-one mapping between $p(x)$ and $C(x)$. In fact, one can encode and decode without representing the code explicitly.
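The construction can be sketched in a few lines for the binary case. This is a minimal illustration, not the table from the notes: the dyadic distribution below is a hypothetical choice, the function names are assumptions, and probabilities are held as exact fractions so that the binary expansion of the midpoint is computed without rounding.

```python
from fractions import Fraction

def ceil_log2_inverse(q):
    """ceil(log2(1/q)) for a probability q, using exact arithmetic."""
    l = 0
    while Fraction(1, 2 ** l) > q:
        l += 1
    return l

def shannon_fano_elias(p):
    """Binary codewords: first l(x) bits of the midpoint of [F(x-1), F(x)),
    with l(x) = ceil(log2(1/p(x))) + 1."""
    codewords = []
    cdf_below = Fraction(0)                  # F(x-1)
    for q in map(Fraction, p):
        midpoint = cdf_below + q / 2
        length = ceil_log2_inverse(q) + 1
        bits, frac = [], midpoint
        for _ in range(length):              # truncated binary expansion
            frac *= 2
            bit = int(frac >= 1)
            bits.append(str(bit))
            frac -= bit
        codewords.append("".join(bits))
        cdf_below += q
    return codewords

# Hypothetical distribution with dyadic probabilities.
p = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 8), Fraction(1, 8)]
code = shannon_fano_elias(p)
mean_len = sum(q * len(c) for q, c in zip(p, code))
print(code, float(mean_len))
```

Because each codeword depends only on $F(x-1)$ and $p(x)$, the encoder needs no stored codebook, which is the sense in which the code need not be represented explicitly.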

5.4 Huffman Code

The Shannon code described in the previous section is good, but it is not necessarily optimal.

Recall that the Kraft inequality is a:
- necessary condition for unique decodability
- sufficient condition for the existence of a prefix-free code

The search for the optimal code can be stated as the following optimization problem. Given $p(x)$, find a length function $l(x)$ that minimizes the expected length and satisfies the Kraft inequality:
$$\min_{l(\cdot)} \sum_{x \in \mathcal{X}} p(x)\, l(x) \quad \text{s.t.} \quad \sum_{x \in \mathcal{X}} D^{-l(x)} \le 1, \quad l(x) \text{ is an integer}.$$

The optimal code was discovered by David Huffman, who was a graduate student in an information theory course (1952).

Construction of the Huffman Code
(1) Take the two least probable symbols. These are assigned the longest codewords, which have equal length and differ only in the last digit.
(2) Merge these two symbols into a new symbol with the combined probability mass and repeat.

Example: Consider the following source distribution.

[Table: columns code, $x$, $p(x)$. The table entries and the numerical value of the expected length are missing in this transcription.]

The entropy is $H(X) \approx 2.45$ bits.

Optimality of Huffman code

Let $\mathcal{X} = \{1, 2, \dots, m\}$ and let $l_i = l(i)$, $p_i = p(i)$, and $C_i = C(i)$. Without loss of generality, assume the probabilities are in descending order, $p_1 \ge p_2 \ge \cdots \ge p_m$.

Lemma 1: In an optimal code, shorter codewords are assigned to larger probabilities, i.e.
$$p_i > p_j \implies l_i \le l_j.$$

Proof: Assume otherwise, that is, $l_i > l_j$ and $p_i > p_j$. Then, by exchanging these codewords the expected length decreases by $(p_i - p_j)(l_i - l_j) > 0$, and thus the code is not optimal.
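The two-step Huffman construction described above can be sketched with a priority queue. This is an illustrative binary implementation using Python's standard `heapq` module; the function name, the tie-breaking counter, and the example distribution are assumptions (the distribution is not the one from the lost example table). It assumes at least two symbols.

```python
import heapq
from itertools import count

def huffman_code(p):
    """Binary Huffman code for a dict mapping symbol -> probability.
    Returns a dict mapping symbol -> codeword string."""
    tiebreak = count()   # keeps heap entries comparable when probabilities tie
    heap = [(prob, next(tiebreak), {sym: ""}) for sym, prob in p.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # least probable
        p1, _, group1 = heapq.heappop(heap)   # second least probable
        # The two groups become siblings: prepend the distinguishing digit.
        merged = {s: "0" + w for s, w in group0.items()}
        merged.update({s: "1" + w for s, w in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Hypothetical distribution for illustration.
p = {"a": 0.25, "b": 0.25, "c": 0.2, "d": 0.15, "e": 0.15}
code = huffman_code(p)
mean_len = sum(p[s] * len(code[s]) for s in p)
print(code, mean_len)
```

Each merge prepends one digit to every codeword in the two merged groups, so a symbol's final length equals the number of merges it took part in.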

Lemma 2: There exists an optimal code for which the codewords assigned to the two smallest probabilities are siblings (i.e., they have the same length and differ only in the last symbol).

Proof: Consider any optimal code. By Lemma 1, codeword $C_m$ has the longest length. Assume, for the sake of contradiction, that its sibling is not a codeword. Then the expected length can be decreased by moving $C_m$ to its parent. Thus, the code is not optimal and a contradiction is reached.

Now we know that the sibling of $C_m$ is a codeword. If it is $C_{m-1}$, we are done. Assume it is some $C_i$ for $i \neq m-1$ and the code is optimal. By Lemma 1, this implies $p_i = p_{m-1}$. Therefore, $C_i$ and $C_{m-1}$ can be exchanged without changing the expected length.

Theorem: Huffman's algorithm produces an optimal code tree.

Proof of optimality of Huffman Code

Let $l(x)$ be the length function of the optimal code. By Lemmas 1 and 2, $C_{m-1}$ and $C_m$ are siblings and the longest codewords.

Let $\tilde{p}_1 \ge \tilde{p}_2 \ge \cdots \ge \tilde{p}_{m-1}$ denote the ordered probabilities after merging $p_{m-1}$ and $p_m$. Let $\tilde{l}(\tilde{x})$ be the length function of the resulting code for this new distribution. (Note that the new distribution has support of size $m-1$.)

Let $E[l(X)]$ be the expected length of the original code and $E[\tilde{l}(\tilde{X})]$ the expected length of the reduced code. Then
$$E[l(X)] = E[\tilde{l}(\tilde{X})] + \mathbb{P}[\tilde{l}(\tilde{X}) \neq l(X)] \cdot 1 = E[\tilde{l}(\tilde{X})] + \underbrace{p_{m-1} + p_m}_{\text{prob. of merged symbol}}.$$
Thus, $l(x)$ is the length function of an optimal code if and only if $\tilde{l}(\tilde{x})$ is the length function of an optimal code.

Therefore, we have reduced the problem to finding an optimal code tree for $\tilde{p}_1, \dots, \tilde{p}_{m-1}$. Again, merge, and continue the process.

Thus, the Huffman algorithm yields the optimal code in a greedy fashion (there may be other optimal codes).

5.5 Coding Over Blocks

Let $X_1, X_2, \dots$ be an iid source with finite alphabet $\mathcal{X}$. This is known as a discrete memoryless source.

One issue with symbol codes is that there is a penalty for using integer codeword lengths.
Example: Suppose that $X_1, X_2, \dots$ are iid Bernoulli($p$) with $p$ very small. The optimal code is given by
$$C(x) = \begin{cases} 0, & x = 0 \\ 1, & x = 1 \end{cases}$$
The expected length is $E[l(X)] = 1$, but the entropy obeys
$$H(X) = H_b(p) \approx p \log(1/p), \quad p \to 0.$$

We can overcome the integer effects by coding over blocks of input symbols. Group the inputs into blocks of size $n$ to create a new source $\tilde{X}_1, \tilde{X}_2, \dots$ where
$$\tilde{X}_1 = [X_1, X_2, \dots, X_n]$$
$$\tilde{X}_2 = [X_{n+1}, X_{n+2}, \dots, X_{2n}]$$
$$\vdots$$
$$\tilde{X}_i = [X_{(i-1)n+1}, X_{(i-1)n+2}, \dots, X_{in}]$$

Each length-$n$ vector can be viewed as a symbol from the alphabet $\tilde{\mathcal{X}} = \mathcal{X}^n$. This new source alphabet has size $|\mathcal{X}|^n$.

The new probabilities are given by
$$p(\tilde{x}) = \prod_{k=1}^{n} p(x_k).$$
The entropy of the new source distribution is
$$H(\tilde{X}) = H(X_1, X_2, \dots, X_n) = n H(X).$$
The expected length of the optimal code for the source distribution $p(\tilde{x})$ obeys
$$\underbrace{n H(X)}_{H(\tilde{X})} \le E[\tilde{l}(\tilde{X})] < \underbrace{n H(X)}_{H(\tilde{X})} + 1.$$

To encode the source $X_1, X_2, \dots$ it is sufficient to encode the new source $\tilde{X}_1, \tilde{X}_2, \dots$. If we use a prefix-free code, then once the codeword $C(\tilde{X}_1)$ is received, we can decode $\tilde{X}_1$, and thus recover the first $n$ source symbols $X_1, \dots, X_n$.

The expected codeword length per source symbol is given by the expected codeword length $E[\tilde{l}(\tilde{X})]$ per block, normalized by the block length. It obeys
$$H(X) \le \frac{1}{n} E[\tilde{l}(\tilde{X})] < H(X) + \frac{1}{n}.$$
Thus, the integer effects become negligible as we increase the block length.

However, by coding over an input block of length $n$ we have introduced delay in the system. Furthermore, we have increased the complexity of the code.
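The effect of the block length can be illustrated numerically. The sketch below computes the per-symbol expected length of an optimal binary code on blocks of $n$ iid Bernoulli($p$) symbols. To obtain the optimal expected length without building codewords, it relies on the identity from the Huffman optimality proof, $E[l(X)] = E[\tilde{l}(\tilde{X})] + p_{m-1} + p_m$, so the expected length equals the sum of the merged probabilities over all merges. The value $p = 0.1$, the block lengths, and the function names are assumptions chosen for illustration.

```python
import heapq
import math
from itertools import product

def optimal_expected_length(probs):
    """Expected length of a binary Huffman code: every merge of the two
    smallest probabilities contributes their sum to E[l]."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

def block_rate(p, n):
    """Expected bits per source symbol when coding blocks of length n."""
    block_probs = [
        math.prod(p if bit else 1 - p for bit in block)
        for block in product([0, 1], repeat=n)
    ]
    return optimal_expected_length(block_probs) / n

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1
H = binary_entropy(p)
for n in (1, 2, 4, 8):
    print(n, round(block_rate(p, n), 4), "entropy:", round(H, 4))
```

The number of block symbols is $2^n$, so this brute-force enumeration also shows the complexity cost of long blocks noted above.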

5.6 Coding with Unknown Distributions

Minimax Redundancy

Suppose $X$ is drawn according to a distribution $p_\theta(x)$ with unknown parameter $\theta$ belonging to a set $\Theta$.

If $\theta$ is known, then we can construct a code that achieves the optimal expected length
$$\sum_x p_\theta(x)\, l(x) = H(p_\theta).$$

The redundancy of coding a distribution $p$ with the optimal code for a distribution $q$ (i.e., $l(x) = -\log q(x)$) is given by
$$R(p, q) = \overbrace{\sum_x p(x)\, l(x)}^{\text{actual length}} - \overbrace{H(p)}^{\text{optimal length}}$$
$$= \sum_x p(x) \left( \log \left( \frac{1}{q(x)} \right) - \log \left( \frac{1}{p(x)} \right) \right)$$
$$= \sum_x p(x) \log \left( \frac{p(x)}{q(x)} \right)$$
$$= D(p \,\|\, q).$$

The minimax redundancy is defined by
$$R^* = \min_q \max_{\theta \in \Theta} R(p_\theta, q) = \min_q \max_{\theta \in \Theta} D(p_\theta \,\|\, q).$$
Intuitively, the distribution $q$ that leads to a code minimizing the minimax redundancy is the distribution at the center of the information ball of radius $R^*$.

Minimax Theorem: Let $f(x, y)$ be a continuous function that is convex in $x$ and concave in $y$, and let $\mathcal{X}$ and $\mathcal{Y}$ be compact convex sets. Then
$$\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} f(x, y) = \max_{y \in \mathcal{Y}} \min_{x \in \mathcal{X}} f(x, y).$$
This is a classic result in game theory. There are many extensions, such as Sion's minimax theorem, which applies when $f(x, y)$ is quasi-convex-concave and at least one of the sets is compact.

Recall that $D(p \,\|\, q)$ is convex in the pair $(p, q)$, i.e., for all $\lambda \in [0, 1]$,
$$D(\lambda p_1 + (1 - \lambda) p_2 \,\|\, \lambda q_1 + (1 - \lambda) q_2) \le \lambda D(p_1 \,\|\, q_1) + (1 - \lambda) D(p_2 \,\|\, q_2).$$

Let $\Pi$ be the set of all distributions on $\theta$. Note that $\Pi$ is a convex set, i.e., for all $\lambda \in [0, 1]$,
$$\pi_1, \pi_2 \in \Pi \implies \lambda \pi_1 + (1 - \lambda) \pi_2 \in \Pi.$$

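Since the maximum of a convex function over a convex set is attained at an extreme point of the set, the maximum of $R(p_\theta, q)$ with respect to $\theta \in \Theta$ is equal to the worst-case expectation with respect to $\pi \in \Pi$:
$$\max_{\theta \in \Theta} D(p_\theta \,\|\, q) = \max_{\pi \in \Pi} \sum_{\theta \in \Theta} \pi(\theta) D(p_\theta \,\|\, q).$$
This means that the minimax redundancy can be expressed equivalently as
$$R^* = \min_q \max_{\pi \in \Pi} \sum_{\theta \in \Theta} \pi(\theta) D(p_\theta \,\|\, q).$$
Note that the objective is linear (and hence both convex and concave) in $\pi$ and convex in $q$. Applying the minimax theorem yields
$$R^* = \max_{\pi \in \Pi} \min_q \sum_{\theta \in \Theta} \pi(\theta) D(p_\theta \,\|\, q).$$

For each distribution $\pi$ we want to find the optimal distribution $q$. As an educated guess, consider the distribution induced on $x$ by $p_\theta$ when $\theta$ is drawn according to $\pi$, i.e.
$$q_\pi(x) = \sum_{\theta \in \Theta} \pi(\theta)\, p_\theta(x).$$
To see that $q_\pi$ achieves the minimum, observe that
$$\sum_\theta \pi(\theta) D(p_\theta \,\|\, q) = \sum_\theta \pi(\theta) D(p_\theta \,\|\, q) - D(q_\pi \,\|\, q) + D(q_\pi \,\|\, q)$$
$$= \sum_\theta \sum_x \pi(\theta) p_\theta(x) \log \left( \frac{p_\theta(x)}{q(x)} \right) - \sum_\theta \sum_x \pi(\theta) p_\theta(x) \log \left( \frac{q_\pi(x)}{q(x)} \right) + D(q_\pi \,\|\, q)$$
$$= \sum_\theta \sum_x \pi(\theta) p_\theta(x) \left[ \log \left( \frac{p_\theta(x)}{q(x)} \right) - \log \left( \frac{q_\pi(x)}{q(x)} \right) \right] + D(q_\pi \,\|\, q)$$
$$= \sum_\theta \sum_x \pi(\theta) p_\theta(x) \log \left( \frac{p_\theta(x)}{q_\pi(x)} \right) + D(q_\pi \,\|\, q).$$
Since $D(q_\pi \,\|\, q)$ is nonnegative and equal to zero if and only if $q = q_\pi$, we see that $q_\pi$ is the optimal distribution.

To make the expression more interpretable, consider the notation
$$p(\theta) = \pi(\theta), \quad p(x \mid \theta) = p_\theta(x), \quad p(x) = q_\pi(x).$$
Then we have shown that the minimax redundancy can be expressed as
$$R^* = \max_{p(\theta)} \sum_\theta \sum_x p(\theta)\, p(x \mid \theta) \log \left( \frac{p(x \mid \theta)}{p(x)} \right) = \max_{p(\theta)} I(\theta; X).$$

Theorem: The minimax redundancy is equal to the maximum mutual information between the parameter $\theta$ and the source $X$.

In other words, the code which minimizes the minimax redundancy has length function $l(x) = -\log p(x)$, where $p(x)$ is the distribution of $X \sim p(x \mid \theta)$ when $\theta$ is drawn according to the distribution that maximizes the mutual information $I(\theta; X)$.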
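The decomposition used in this argument can be verified numerically for a fixed prior. The sketch below is illustrative only: the two-distribution family, the prior $\pi$, the competing $q$, and the function names are hypothetical. It checks that the average redundancy under an arbitrary $q$ equals $I(\theta; X) + D(q_\pi \,\|\, q)$, with the mutual information computed independently as $H(X) - H(X \mid \theta)$. It does not carry out the outer maximization over $\pi$.

```python
import math

def kl(p, q):
    """Relative entropy D(p || q) in bits."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def entropy(p):
    return -sum(a * math.log2(a) for a in p if a > 0)

def mixture(pi, family):
    """q_pi(x) = sum over theta of pi(theta) * p_theta(x)."""
    return [sum(w * p[x] for w, p in zip(pi, family))
            for x in range(len(family[0]))]

def average_redundancy(pi, family, q):
    """sum over theta of pi(theta) * D(p_theta || q)."""
    return sum(w * kl(p, q) for w, p in zip(pi, family))

def mutual_information(pi, family):
    """I(theta; X) = H(X) - H(X | theta)."""
    return entropy(mixture(pi, family)) - sum(
        w * entropy(p) for w, p in zip(pi, family))

# Hypothetical family of two distributions on a binary alphabet.
family = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
q_pi = mixture(pi, family)
q_other = [0.5, 0.5]

print(average_redundancy(pi, family, q_pi), mutual_information(pi, family))
print(average_redundancy(pi, family, q_other),
      mutual_information(pi, family) + kl(q_pi, q_other))
```

Any choice of `q_other` different from `q_pi` pays the extra penalty $D(q_\pi \,\|\, q)$, which is why the mixture is the best coding distribution for a given prior.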

Coding with Unknown Alphabet

We want to compress integers $x \in \mathbb{Z}$ without specifying a probability distribution: $c(x) = \; ??$

First consider the setting where we have an upper bound $N$ on the integer, and thus $\mathcal{X} = \{1, 2, \dots, N\}$. We can simply send $\lceil \log N \rceil$ bits. For example, if $N = 7$, then we send three bits per integer, so the pair 3, 7 is sent as the six-bit string $c(3)c(7)$.

To analyze the minimax redundancy of this approach, consider the set of distributions
$$p_\theta(x) = \begin{cases} 1, & x = \theta \\ 0, & x \neq \theta \end{cases}, \qquad \mathcal{X} = \Theta = \{1, 2, \dots, N\}.$$
The minimax redundancy is given by
$$R^* = \max_{p(\theta)} I(\theta; X) = \max_{p(x)} H(X) = \log N,$$
since the uniform distribution maximizes entropy on a finite set.

But we want the code to be universal and work for any integer. A unary code sends a sequence of $x - 1$ 0's followed by a 1 to mark the end of the codeword. For example,
$$3, 7 \implies c(3)c(7) = 001\,0000001.$$
The unary code requires $x$ bits to represent each symbol. This seems wasteful.

Idea: First use a unary code to describe how many bits are needed for the binary code, and then send the binary code:
$$c_{\text{universal}}(x) = (c_{\text{unary}}(l_{\text{binary}}(x)),\; c_{\text{binary}}(x)).$$
For example, suppose we want to compress 9:
- The binary code is $c_{\text{binary}}(9) = 1001$.
- The length of the binary code is $l_{\text{binary}}(9) = 4$.
- So the universal code is
$$c_{\text{universal}}(9) = \underbrace{0001}_{\text{header}} \underbrace{1001}_{\text{number}}.$$

This universal code requires $\lceil \log_2(x) \rceil + \lceil \log_2(x) \rceil = 2 \lceil \log_2(x) \rceil$ bits.

In fact, we can do better by repeating the process and first compressing the header with the universal code itself:
$$c^{(2)}_{\text{universal}}(x) = \left( c^{(1)}_{\text{universal}}(l_{\text{binary}}(x)),\; c_{\text{binary}}(x) \right).$$
The number of bits this scheme requires obeys
$$l^{(2)}_{\text{universal}}(x) = \lceil \log_2(x) \rceil + 2 \lceil \log_2(\lceil \log_2(x) \rceil) \rceil \le \log_2(x) + 2 \log_2(\log_2(x)) + 4.$$
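The unary, universal, and doubly universal codes can be sketched directly from their definitions. This is an illustrative implementation for positive integers; the function names are assumptions, and the binary length is taken to be the number of digits of `x` in base 2 (that is, $\lfloor \log_2 x \rfloor + 1$), which agrees with the example $l_{\text{binary}}(9) = 4$ but differs from $\lceil \log_2 x \rceil$ when $x$ is a power of two.

```python
def unary(n):
    """n - 1 zeros followed by a terminating one (n >= 1)."""
    return "0" * (n - 1) + "1"

def binary(x):
    """Plain binary representation of a positive integer."""
    return bin(x)[2:]

def universal1(x):
    """Unary header giving the payload length, then the binary payload."""
    payload = binary(x)
    return unary(len(payload)) + payload

def universal2(x):
    """Same idea, but the length header is itself coded with universal1."""
    payload = binary(x)
    return universal1(len(payload)) + payload

def decode_universal1(bits, pos=0):
    """Decode one integer starting at bits[pos]; return (value, next_pos)."""
    n = bits.index("1", pos) - pos + 1        # length of the unary header
    start = pos + n
    return int(bits[start:start + n], 2), start + n

def decode_universal2(bits, pos=0):
    n, start = decode_universal1(bits, pos)   # decoded header = payload length
    return int(bits[start:start + n], 2), start + n

print(unary(3) + unary(7))
print(universal1(9), universal2(9))

# Decode a concatenated stream without any separators between codewords.
stream = "".join(universal2(x) for x in [3, 7, 9, 1000])
pos, decoded = 0, []
while pos < len(stream):
    value, pos = decode_universal2(stream, pos)
    decoded.append(value)
print(decoded)
```

Since every codeword announces its own length, the codewords are prefix-free and can be concatenated and decoded one after another.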

It is interesting to note that this length function obeys the Kraft inequality. Thus, the length function may be viewed as a universal prior
$$p_{\text{universal}}(x) = 2^{-l^{(2)}_{\text{universal}}(x)} \approx \frac{1}{x (\log(x))^2}. \qquad (5.1)$$
Recall that $\sum_{n \ge 2} 1/(n (\log n)^p)$ diverges for $p = 1$ but converges for $p > 1$.

It is also interesting to note that the entropy of this distribution is infinite,
$$H(p_{\text{universal}}) = \sum_{x \ge 2} \frac{1}{x (\log(x))^2} \log\left( x \log(x)^2 \right) = +\infty.$$
In this case, we have $\theta = X$, and so the minimax redundancy corresponds to a distribution which maximizes $I(X; X) = H(X)$.

5.7 Lempel-Ziv Code

Lempel-Ziv (LZ) codes are a key component of many modern data compression algorithms, including:
- compress, gzip, pkzip, and the ZIP file format
- Graphics Interchange Format (GIF)
- Portable Document Format (PDF)

There are several variations; some are patented, some are not.

Basic idea: Compress a string using references to its past. The more redundant the string, the better this process works.

Roughly speaking, Lempel-Ziv codes are optimal in two senses:
(1) They approach the entropy rate if the source is generated from a stationary and ergodic distribution.
(2) They are competitive against all possible finite-state machines.

There are two different variations, LZ77 and LZ78. The gzip algorithm uses LZ77 followed by a Huffman code.

Construction of Lempel-Ziv code:
- Input: a string of source symbols $x_1, x_2, x_3, \dots$
- Output: a sequence of codewords $c(x_1), c(x_2), c(x_3), \dots$

Assume that we have compressed the string from $x_1$ to $x_{i-1}$. The goal is to find the longest possible match between the next symbols and a sequence in the previous symbols. In other words, we want to find the largest integer $k$ such that
$$\underbrace{[x_i, x_{i+1}, \dots, x_{i+k}]}_{\text{new bits}} = \underbrace{[x_j, x_{j+1}, \dots, x_{j+k}]}_{\text{previous bits}}$$
for some $j \le i - 1$.

Thus, this matching phrase can be represented by a pointer to index $i - j$ and its length $k$. For convenience,

If no match is found, we send the next symbol uncompressed. Use a flag to distinguish the two cases:
- Find a match $\implies$ send (1, pointer, length)
- No match $\implies$ send (0, $x_i$)

Example: Compress the following sequence with window size $W = 4$.

String: ABBABBBAABBBA
Parsed string: A, B, B, ABB, BA, ABBBA
Output: (0, A), (0, B), (1, 1, 1), (1, 3, 3), (1, 4, 2), (1, 5, 5)

Theorem: If a process $X_1, X_2, \dots$ is stationary and ergodic, then the per-symbol expected codeword length of the Lempel-Ziv code asymptotically achieves the entropy rate of the source.

Proof sketch

Assume infinite window size. Assume that we only consider matches of exactly length $n$, and that the sequence has been running long enough that all possible strings of length $n$ have occurred previously.

Given a new sequence of length $n$, how far back in time must we look to find a match? The return time is defined by
$$R_n(X_1, X_2, \dots, X_n) = \min\{ j \ge 1 : [X_{1-j}, X_{2-j}, \dots, X_{n-j}] = [X_1, X_2, \dots, X_n] \}.$$
Using the universal integer code, we can describe $R_n$ with $\log_2 R_n + 2 \log_2 \log_2 R_n + 4$ bits. Thus, the expected per-symbol length of our code is given by
$$\frac{1}{n} E\left[ \log_2 R_n + 2 \log_2 \log_2 R_n + 4 \right].$$

Observe that if the sequence is iid, then the return time of a sequence $x_1^n$ is geometrically distributed with probability $p(x_1^n)$, and thus the expected wait time is
$$E[R_n(X_1^n) \mid X_1^n = x_1^n] = \frac{1}{p(x_1^n)}.$$

Kac's Lemma: If $X_1, X_2, \dots$ is a stationary ergodic process, then
$$E[R_n(X_1^n) \mid X_1^n = x_1^n] = \frac{1}{p(x_1^n)}.$$

To conclude the proof, we use Jensen's inequality:
$$E[\log R_n] = E_{X_1^n}\left[ E[\log(R_n(X_1^n)) \mid X_1^n] \right]$$
$$\le E_{X_1^n}\left[ \log\left( E[R_n(X_1^n) \mid X_1^n] \right) \right]$$
$$= E_{X_1^n}\left[ \log\left( \frac{1}{p(X_1^n)} \right) \right]$$
$$= H(X_1, \dots, X_n).$$
By the AEP for stationary ergodic processes,
$$\frac{H(X_1, \dots, X_n)}{n} \to H(\mathcal{X}).$$
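The parsing rule and the token format can be sketched as a small encoder and decoder. This is an illustrative LZ77-style implementation, not a reference one: the function names are assumptions, ties between equally long matches go to the nearest one, and matches are allowed to overlap the position being encoded. The last token of the example points 5 symbols back, so the sketch defaults to an unbounded window (as in the proof sketch) in order to reproduce the example output; passing `window=4` gives a different parse for the final phrase.

```python
def lz_encode(s, window=None):
    """Greedy parse into (0, symbol) and (1, pointer, length) tokens."""
    tokens, i = [], 0
    while i < len(s):
        best_len, best_ptr = 0, 0
        lowest = 0 if window is None else max(0, i - window)
        for j in range(i - 1, lowest - 1, -1):   # nearest candidates first
            k = 0
            while i + k < len(s) and s[j + k] == s[i + k]:
                k += 1
            if k > best_len:
                best_len, best_ptr = k, i - j
        if best_len == 0:
            tokens.append((0, s[i]))             # no match: send the symbol
            i += 1
        else:
            tokens.append((1, best_ptr, best_len))
            i += best_len
    return tokens

def lz_decode(tokens):
    """Rebuild the string by copying from the already decoded past."""
    out = []
    for token in tokens:
        if token[0] == 0:
            out.append(token[1])
        else:
            _, pointer, length = token
            for _ in range(length):
                out.append(out[-pointer])        # one symbol at a time
    return "".join(out)

text = "ABBABBBAABBBA"
tokens = lz_encode(text)
print(tokens)
print(lz_decode(tokens))
```

Copying one symbol at a time in the decoder is what makes overlapping matches work, since each copied symbol becomes available as a source for the next one.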


More information

Motivation for Arithmetic Coding

Motivation for Arithmetic Coding Motivation for Arithmetic Coding Motivations for arithmetic coding: 1) Huffman coding algorithm can generate prefix codes with a minimum average codeword length. But this length is usually strictly greater

More information

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1 Kraft s inequality An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if N 2 l i 1 Proof: Suppose that we have a tree code. Let l max = max{l 1,...,

More information

EE376A - Information Theory Midterm, Tuesday February 10th. Please start answering each question on a new page of the answer booklet.

EE376A - Information Theory Midterm, Tuesday February 10th. Please start answering each question on a new page of the answer booklet. EE376A - Information Theory Midterm, Tuesday February 10th Instructions: You have two hours, 7PM - 9PM The exam has 3 questions, totaling 100 points. Please start answering each question on a new page

More information

Lecture 5: Asymptotic Equipartition Property

Lecture 5: Asymptotic Equipartition Property Lecture 5: Asymptotic Equipartition Property Law of large number for product of random variables AEP and consequences Dr. Yao Xie, ECE587, Information Theory, Duke University Stock market Initial investment

More information

Lecture 22: Final Review

Lecture 22: Final Review Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information

More information

Shannon-Fano-Elias coding

Shannon-Fano-Elias coding Shannon-Fano-Elias coding Suppose that we have a memoryless source X t taking values in the alphabet {1, 2,..., L}. Suppose that the probabilities for all symbols are strictly positive: p(i) > 0, i. The

More information

ELEC 515 Information Theory. Distortionless Source Coding

ELEC 515 Information Theory. Distortionless Source Coding ELEC 515 Information Theory Distortionless Source Coding 1 Source Coding Output Alphabet Y={y 1,,y J } Source Encoder Lengths 2 Source Coding Two coding requirements The source sequence can be recovered

More information

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3 3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2013, 29 November 2013 Memoryless Sources Arithmetic Coding Sources with Memory Markov Example 2 / 21 Encoding the output

More information

Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

More information

Information Theory CHAPTER. 5.1 Introduction. 5.2 Entropy

Information Theory CHAPTER. 5.1 Introduction. 5.2 Entropy Haykin_ch05_pp3.fm Page 207 Monday, November 26, 202 2:44 PM CHAPTER 5 Information Theory 5. Introduction As mentioned in Chapter and reiterated along the way, the purpose of a communication system is

More information

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min.

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min. Huffman coding Optimal codes - I A code is optimal if it has the shortest codeword length L L m = i= pl i i This can be seen as an optimization problem min i= li subject to D m m i= lp Gabriele Monfardini

More information

Homework Set #2 Data Compression, Huffman code and AEP

Homework Set #2 Data Compression, Huffman code and AEP Homework Set #2 Data Compression, Huffman code and AEP 1. Huffman coding. Consider the random variable ( x1 x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0.11 0.04 0.04 0.03 0.02 (a Find a binary Huffman code

More information

Coding of memoryless sources 1/35

Coding of memoryless sources 1/35 Coding of memoryless sources 1/35 Outline 1. Morse coding ; 2. Definitions : encoding, encoding efficiency ; 3. fixed length codes, encoding integers ; 4. prefix condition ; 5. Kraft and Mac Millan theorems

More information

Information & Correlation

Information & Correlation Information & Correlation Jilles Vreeken 11 June 2014 (TADA) Questions of the day What is information? How can we measure correlation? and what do talking drums have to do with this? Bits and Pieces What

More information

1 Ex. 1 Verify that the function H(p 1,..., p n ) = k p k log 2 p k satisfies all 8 axioms on H.

1 Ex. 1 Verify that the function H(p 1,..., p n ) = k p k log 2 p k satisfies all 8 axioms on H. Problem sheet Ex. Verify that the function H(p,..., p n ) = k p k log p k satisfies all 8 axioms on H. Ex. (Not to be handed in). looking at the notes). List as many of the 8 axioms as you can, (without

More information

Intro to Information Theory

Intro to Information Theory Intro to Information Theory Math Circle February 11, 2018 1. Random variables Let us review discrete random variables and some notation. A random variable X takes value a A with probability P (a) 0. Here

More information

Solutions to Set #2 Data Compression, Huffman code and AEP

Solutions to Set #2 Data Compression, Huffman code and AEP Solutions to Set #2 Data Compression, Huffman code and AEP. Huffman coding. Consider the random variable ( ) x x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0. 0.04 0.04 0.03 0.02 (a) Find a binary Huffman code

More information

Compressing Kinetic Data From Sensor Networks. Sorelle A. Friedler (Swat 04) Joint work with David Mount University of Maryland, College Park

Compressing Kinetic Data From Sensor Networks. Sorelle A. Friedler (Swat 04) Joint work with David Mount University of Maryland, College Park Compressing Kinetic Data From Sensor Networks Sorelle A. Friedler (Swat 04) Joint work with David Mount University of Maryland, College Park Motivation Motivation Computer Science Graphics: Image and video

More information

CS4800: Algorithms & Data Jonathan Ullman

CS4800: Algorithms & Data Jonathan Ullman CS4800: Algorithms & Data Jonathan Ullman Lecture 22: Greedy Algorithms: Huffman Codes Data Compression and Entropy Apr 5, 2018 Data Compression How do we store strings of text compactly? A (binary) code

More information

SGN-2306 Signal Compression. 1. Simple Codes

SGN-2306 Signal Compression. 1. Simple Codes SGN-236 Signal Compression. Simple Codes. Signal Representation versus Signal Compression.2 Prefix Codes.3 Trees associated with prefix codes.4 Kraft inequality.5 A lower bound on the average length of

More information

CMPT 365 Multimedia Systems. Lossless Compression

CMPT 365 Multimedia Systems. Lossless Compression CMPT 365 Multimedia Systems Lossless Compression Spring 2017 Edited from slides by Dr. Jiangchuan Liu CMPT365 Multimedia Systems 1 Outline Why compression? Entropy Variable Length Coding Shannon-Fano Coding

More information

UNIT I INFORMATION THEORY. I k log 2

UNIT I INFORMATION THEORY. I k log 2 UNIT I INFORMATION THEORY Claude Shannon 1916-2001 Creator of Information Theory, lays the foundation for implementing logic in digital circuits as part of his Masters Thesis! (1939) and published a paper

More information

Information Theory, Statistics, and Decision Trees

Information Theory, Statistics, and Decision Trees Information Theory, Statistics, and Decision Trees Léon Bottou COS 424 4/6/2010 Summary 1. Basic information theory. 2. Decision trees. 3. Information theory and statistics. Léon Bottou 2/31 COS 424 4/6/2010

More information

Source Coding Techniques

Source Coding Techniques Source Coding Techniques. Huffman Code. 2. Two-pass Huffman Code. 3. Lemple-Ziv Code. 4. Fano code. 5. Shannon Code. 6. Arithmetic Code. Source Coding Techniques. Huffman Code. 2. Two-path Huffman Code.

More information

Lecture 16 Oct 21, 2014

Lecture 16 Oct 21, 2014 CS 395T: Sublinear Algorithms Fall 24 Prof. Eric Price Lecture 6 Oct 2, 24 Scribe: Chi-Kit Lam Overview In this lecture we will talk about information and compression, which the Huffman coding can achieve

More information

lossless, optimal compressor

lossless, optimal compressor 6. Variable-length Lossless Compression The principal engineering goal of compression is to represent a given sequence a, a 2,..., a n produced by a source as a sequence of bits of minimal possible length.

More information

Chapter 5. Data Compression

Chapter 5. Data Compression Chapter 5 Data Compression Peng-Hua Wang Graduate Inst. of Comm. Engineering National Taipei University Chapter Outline Chap. 5 Data Compression 5.1 Example of Codes 5.2 Kraft Inequality 5.3 Optimal Codes

More information

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding SIGNAL COMPRESSION Lecture 7 Variable to Fix Encoding 1. Tunstall codes 2. Petry codes 3. Generalized Tunstall codes for Markov sources (a presentation of the paper by I. Tabus, G. Korodi, J. Rissanen.

More information

COMM901 Source Coding and Compression. Quiz 1

COMM901 Source Coding and Compression. Quiz 1 German University in Cairo - GUC Faculty of Information Engineering & Technology - IET Department of Communication Engineering Winter Semester 2013/2014 Students Name: Students ID: COMM901 Source Coding

More information

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have EECS 229A Spring 2007 * * Solutions to Homework 3 1. Problem 4.11 on pg. 93 of the text. Stationary processes (a) By stationarity and the chain rule for entropy, we have H(X 0 ) + H(X n X 0 ) = H(X 0,

More information

H(X) = plog 1 p +(1 p)log 1 1 p. With a slight abuse of notation, we denote this quantity by H(p) and refer to it as the binary entropy function.

H(X) = plog 1 p +(1 p)log 1 1 p. With a slight abuse of notation, we denote this quantity by H(p) and refer to it as the binary entropy function. LECTURE 2 Information Measures 2. ENTROPY LetXbeadiscreterandomvariableonanalphabetX drawnaccordingtotheprobability mass function (pmf) p() = P(X = ), X, denoted in short as X p(). The uncertainty about

More information

EE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1)

EE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1) EE5585 Data Compression January 29, 2013 Lecture 3 Instructor: Arya Mazumdar Scribe: Katie Moenkhaus Uniquely Decodable Codes Recall that for a uniquely decodable code with source set X, if l(x) is the

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006 MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.44 Transmission of Information Spring 2006 Homework 2 Solution name username April 4, 2006 Reading: Chapter

More information

Universal Loseless Compression: Context Tree Weighting(CTW)

Universal Loseless Compression: Context Tree Weighting(CTW) Universal Loseless Compression: Context Tree Weighting(CTW) Dept. Electrical Engineering, Stanford University Dec 9, 2014 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic)

More information

PROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

PROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS PROBABILITY AND INFORMATION THEORY Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Probability space Rules of probability

More information

Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2

Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2 582487 Data Compression Techniques (Spring 22) Model Solutions for Exercise 2 If you have any feedback or corrections, please contact nvalimak at cs.helsinki.fi.. Problem: Construct a canonical prefix

More information

Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes

Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes Weihua Hu Dept. of Mathematical Eng. Email: weihua96@gmail.com Hirosuke Yamamoto Dept. of Complexity Sci. and Eng. Email: Hirosuke@ieee.org

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 AEP Asymptotic Equipartition Property AEP In information theory, the analog of

More information

(Classical) Information Theory II: Source coding

(Classical) Information Theory II: Source coding (Classical) Information Theory II: Source coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract The information content of a random variable

More information

Lecture 4 Noisy Channel Coding

Lecture 4 Noisy Channel Coding Lecture 4 Noisy Channel Coding I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw October 9, 2015 1 / 56 I-Hsiang Wang IT Lecture 4 The Channel Coding Problem

More information

Multimedia Communications. Mathematical Preliminaries for Lossless Compression

Multimedia Communications. Mathematical Preliminaries for Lossless Compression Multimedia Communications Mathematical Preliminaries for Lossless Compression What we will see in this chapter Definition of information and entropy Modeling a data source Definition of coding and when

More information

18.310A Final exam practice questions

18.310A Final exam practice questions 18.310A Final exam practice questions This is a collection of practice questions, gathered randomly from previous exams and quizzes. They may not be representative of what will be on the final. In particular,

More information

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE General e Image Coder Structure Motion Video x(s 1,s 2,t) or x(s 1,s 2 ) Natural Image Sampling A form of data compression; usually lossless, but can be lossy Redundancy Removal Lossless compression: predictive

More information

Variable-to-Variable Codes with Small Redundancy Rates

Variable-to-Variable Codes with Small Redundancy Rates Variable-to-Variable Codes with Small Redundancy Rates M. Drmota W. Szpankowski September 25, 2004 This research is supported by NSF, NSA and NIH. Institut f. Diskrete Mathematik und Geometrie, TU Wien,

More information

On Universal Types. Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA. University of Minnesota, September 14, 2004

On Universal Types. Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA. University of Minnesota, September 14, 2004 On Universal Types Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA University of Minnesota, September 14, 2004 Types for Parametric Probability Distributions A = finite alphabet,

More information

Generalized Kraft Inequality and Arithmetic Coding

Generalized Kraft Inequality and Arithmetic Coding J. J. Rissanen Generalized Kraft Inequality and Arithmetic Coding Abstract: Algorithms for encoding and decoding finite strings over a finite alphabet are described. The coding operations are arithmetic

More information

Lecture 11: Quantum Information III - Source Coding

Lecture 11: Quantum Information III - Source Coding CSCI5370 Quantum Computing November 25, 203 Lecture : Quantum Information III - Source Coding Lecturer: Shengyu Zhang Scribe: Hing Yin Tsang. Holevo s bound Suppose Alice has an information source X that

More information

Information Theory: Entropy, Markov Chains, and Huffman Coding

Information Theory: Entropy, Markov Chains, and Huffman Coding The University of Notre Dame A senior thesis submitted to the Department of Mathematics and the Glynn Family Honors Program Information Theory: Entropy, Markov Chains, and Huffman Coding Patrick LeBlanc

More information

Digital Communications III (ECE 154C) Introduction to Coding and Information Theory

Digital Communications III (ECE 154C) Introduction to Coding and Information Theory Digital Communications III (ECE 154C) Introduction to Coding and Information Theory Tara Javidi These lecture notes were originally developed by late Prof. J. K. Wolf. UC San Diego Spring 2014 1 / 8 I

More information

10-704: Information Processing and Learning Fall Lecture 9: Sept 28

10-704: Information Processing and Learning Fall Lecture 9: Sept 28 10-704: Information Processing and Learning Fall 2016 Lecturer: Siheng Chen Lecture 9: Sept 28 Note: These notes are based on scribed notes from Spring15 offering of this course. LaTeX template courtesy

More information

3F1: Signals and Systems INFORMATION THEORY Examples Paper Solutions

3F1: Signals and Systems INFORMATION THEORY Examples Paper Solutions Engineering Tripos Part IIA THIRD YEAR 3F: Signals and Systems INFORMATION THEORY Examples Paper Solutions. Let the joint probability mass function of two binary random variables X and Y be given in the

More information

CSCI 2570 Introduction to Nanocomputing

CSCI 2570 Introduction to Nanocomputing CSCI 2570 Introduction to Nanocomputing Information Theory John E Savage What is Information Theory Introduced by Claude Shannon. See Wikipedia Two foci: a) data compression and b) reliable communication

More information

Multimedia Information Systems

Multimedia Information Systems Multimedia Information Systems Samson Cheung EE 639, Fall 2004 Lecture 3 & 4: Color, Video, and Fundamentals of Data Compression 1 Color Science Light is an electromagnetic wave. Its color is characterized

More information

Summary of Last Lectures

Summary of Last Lectures Lossless Coding IV a k p k b k a 0.16 111 b 0.04 0001 c 0.04 0000 d 0.16 110 e 0.23 01 f 0.07 1001 g 0.06 1000 h 0.09 001 i 0.15 101 100 root 1 60 1 0 0 1 40 0 32 28 23 e 17 1 0 1 0 1 0 16 a 16 d 15 i

More information

6.02 Fall 2012 Lecture #1

6.02 Fall 2012 Lecture #1 6.02 Fall 2012 Lecture #1 Digital vs. analog communication The birth of modern digital communication Information and entropy Codes, Huffman coding 6.02 Fall 2012 Lecture 1, Slide #1 6.02 Fall 2012 Lecture

More information

Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University

Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University Huffman Coding C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University http://www.csie.nctu.edu.tw/~cmliu/courses/compression/ Office: EC538 (03)573877 cmliu@cs.nctu.edu.tw

More information

Lecture Notes on Digital Transmission Source and Channel Coding. José Manuel Bioucas Dias

Lecture Notes on Digital Transmission Source and Channel Coding. José Manuel Bioucas Dias Lecture Notes on Digital Transmission Source and Channel Coding José Manuel Bioucas Dias February 2015 CHAPTER 1 Source and Channel Coding Contents 1 Source and Channel Coding 1 1.1 Introduction......................................

More information

Introduction to Information Theory. Uncertainty. Entropy. Surprisal. Joint entropy. Conditional entropy. Mutual information.

Introduction to Information Theory. Uncertainty. Entropy. Surprisal. Joint entropy. Conditional entropy. Mutual information. L65 Dept. of Linguistics, Indiana University Fall 205 Information theory answers two fundamental questions in communication theory: What is the ultimate data compression? What is the transmission rate

More information

repetition, part ii Ole-Johan Skrede INF Digital Image Processing

repetition, part ii Ole-Johan Skrede INF Digital Image Processing repetition, part ii Ole-Johan Skrede 24.05.2017 INF2310 - Digital Image Processing Department of Informatics The Faculty of Mathematics and Natural Sciences University of Oslo today s lecture Coding and

More information

Dept. of Linguistics, Indiana University Fall 2015

Dept. of Linguistics, Indiana University Fall 2015 L645 Dept. of Linguistics, Indiana University Fall 2015 1 / 28 Information theory answers two fundamental questions in communication theory: What is the ultimate data compression? What is the transmission

More information

Multiple Choice Tries and Distributed Hash Tables

Multiple Choice Tries and Distributed Hash Tables Multiple Choice Tries and Distributed Hash Tables Luc Devroye and Gabor Lugosi and Gahyun Park and W. Szpankowski January 3, 2007 McGill University, Montreal, Canada U. Pompeu Fabra, Barcelona, Spain U.

More information

Lecture 1: Shannon s Theorem

Lecture 1: Shannon s Theorem Lecture 1: Shannon s Theorem Lecturer: Travis Gagie January 13th, 2015 Welcome to Data Compression! I m Travis and I ll be your instructor this week. If you haven t registered yet, don t worry, we ll work

More information

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak 4. Quantization and Data Compression ECE 32 Spring 22 Purdue University, School of ECE Prof. What is data compression? Reducing the file size without compromising the quality of the data stored in the

More information

CS 229r Information Theory in Computer Science Feb 12, Lecture 5

CS 229r Information Theory in Computer Science Feb 12, Lecture 5 CS 229r Information Theory in Computer Science Feb 12, 2019 Lecture 5 Instructor: Madhu Sudan Scribe: Pranay Tankala 1 Overview A universal compression algorithm is a single compression algorithm applicable

More information

(Classical) Information Theory III: Noisy channel coding

(Classical) Information Theory III: Noisy channel coding (Classical) Information Theory III: Noisy channel coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract What is the best possible way

More information

Lec 05 Arithmetic Coding

Lec 05 Arithmetic Coding ECE 5578 Multimedia Communication Lec 05 Arithmetic Coding Zhu Li Dept of CSEE, UMKC web: http://l.web.umkc.edu/lizhu phone: x2346 Z. Li, Multimedia Communciation, 208 p. Outline Lecture 04 ReCap Arithmetic

More information

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Information Theory and Distribution Modeling

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Information Theory and Distribution Modeling TTIC 31230, Fundamentals of Deep Learning David McAllester, April 2017 Information Theory and Distribution Modeling Why do we model distributions and conditional distributions using the following objective

More information

ECE Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE)

ECE Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE) ECE 74 - Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE) 1. A Huffman code finds the optimal codeword to assign to a given block of source symbols. (a) Show that cannot be a Huffman

More information

Approximate inference, Sampling & Variational inference Fall Cours 9 November 25

Approximate inference, Sampling & Variational inference Fall Cours 9 November 25 Approimate inference, Sampling & Variational inference Fall 2015 Cours 9 November 25 Enseignant: Guillaume Obozinski Scribe: Basile Clément, Nathan de Lara 9.1 Approimate inference with MCMC 9.1.1 Gibbs

More information

Text Compression. Jayadev Misra The University of Texas at Austin December 5, A Very Incomplete Introduction to Information Theory 2

Text Compression. Jayadev Misra The University of Texas at Austin December 5, A Very Incomplete Introduction to Information Theory 2 Text Compression Jayadev Misra The University of Texas at Austin December 5, 2003 Contents 1 Introduction 1 2 A Very Incomplete Introduction to Information Theory 2 3 Huffman Coding 5 3.1 Uniquely Decodable

More information