
1 Source Coding Master Universitario en Ingeniería de Telecomunicación I. Santamaría Universidad de Cantabria

2 Contents Introduction Asymptotic Equipartition Property Optimal Codes (Huffman Coding) Universal Codes (Lempel-Ziv Coding) Source Coding 1/49

3 Source Coding Let us consider a source (a discrete random variable X taking values in the alphabet X) that produces i.i.d. symbols generated according to a given pmf p(x). A source code is a mapping from X to a set of codewords, each one a finite-length string of bits. [Block diagram: the source p(x) emits X_1, X_2, ..., X_n; the code C maps them to C(X_1), C(X_2), ..., C(X_n).] Example: Let X in X = {1, 2, 3} be a r.v. with the following pmf and codeword assignment: Pr(X = 1) = 1/2, C(1) = 0; Pr(X = 2) = 1/4, C(2) = 10; Pr(X = 3) = 1/4, C(3) = 11 Source Coding 2/49

4 Lossless source coding: the source symbols can be exactly recovered from the binary string. Q: What is the minimum expected length (in bits/symbol), L(C), of any lossless source code? A: The source's entropy H(X) (Shannon's source coding theorem). For the example considered, the entropy of the source is H(X) = -Σ_{x ∈ X} p(x) log p(x) = (1/2) log 2 + (1/4) log 4 + (1/4) log 4 = 1.5 bits, and the average length of the code is L(C) = (1/2)·1 + (1/4)·2 + (1/4)·2 = 1.5 bits. We cannot find any lossless code with expected length < 1.5 bits/symbol Source Coding 3/49
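As a quick numerical check of the figures above, here is a short Python sketch (the variable names are mine, not from the slides) that computes H(X) and L(C) for this three-symbol example:

from math import log2

# pmf of the example and the lengths of the codewords C(1)=0, C(2)=10, C(3)=11
p = {1: 0.5, 2: 0.25, 3: 0.25}
l = {1: 1, 2: 2, 3: 2}

H = -sum(px * log2(px) for px in p.values())   # source entropy
L = sum(p[x] * l[x] for x in p)                # expected codeword length

print(f"H(X) = {H} bits, L(C) = {L} bits")     # both are 1.5 bits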

5 General goal: To study the fundamental limits for the compression of information Outline: Typical sequences and the Asymptotic Equipartition Property Shannon's source coding theorem Optimal codes (Huffman coding) Universal codes (Lempel-Ziv coding) Source Coding 4/49

6 Typicality: intuition Let X be a Bernoulli random variable with parameter p = 0.2: Pr{X = 1} = 0.2 and Pr{X = 0} = 0.8. Let s = (x_1, x_2, ..., x_N) be a sequence of N independent realizations of X. Let us consider two possible outcomes of s for N = 20: s_1, the all-zeros sequence (00000000000000000000), and s_2, a sequence containing four ones and sixteen zeros. Which one is more likely to have come from X? Source Coding 5/49

7 The probabilities of each sequence are Pr(s_1) = (1 - p)^N ≈ 0.0115 and Pr(s_2) = p^4 (1 - p)^(N-4) ≈ 4.5·10^-5. Therefore, s_1 is more probable than s_2. However, one would expect that a string of N i.i.d. Bernoulli variables with parameter p should have (on average) Np ones (in our example Np = 4). In this sense, s_2 seems more typical than s_1: Why? How many sequences exist with all zeros? 1. How many sequences exist with 4 ones and 16 zeros? (20 choose 4) = 4845, all of them with the same probability! Source Coding 6/49
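These numbers are straightforward to reproduce; the following sketch (assuming the same N = 20 and p = 0.2) computes both probabilities and the number of sequences with four ones:

from math import comb

p, N = 0.2, 20
pr_s1 = (1 - p) ** N                 # the all-zeros sequence
pr_s2 = p ** 4 * (1 - p) ** (N - 4)  # any particular sequence with 4 ones

print(f"Pr(s1) = {pr_s1:.4f}")                 # about 0.0115
print(f"Pr(s2) = {pr_s2:.2e}")                 # about 4.5e-05
print(f"sequences with 4 ones: {comb(N, 4)}")  # 4845, all equally likely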

8 Let's take a look at the entropy. What's the entropy of X? H(X) = -p log(p) - (1 - p) log(1 - p) = -0.2 log(0.2) - 0.8 log(0.8) ≈ 0.72 bits The average information or sample entropy of a sequence s = (x_1, x_2, ..., x_N) is i(s) = -(1/N) log p(x_1, x_2, ..., x_N) For the example considered, we have i(s_1) = -(1/N) log(1 - p)^N = -log(1 - p) ≈ 0.32 bits and i(s_2) = -(1/N) log(p^4 (1 - p)^(N-4)) = -(4/20) log(p) - (16/20) log(1 - p) ≈ 0.72 bits Source Coding 7/49

9 The value of i(s_2) is identical to the entropy H(X), whereas the value of the non-typical sequence, i(s_1), is very different from H(X). We can divide the set of all 2^N sequences into two sets: 1. The typical set, which contains all sequences whose sample entropy is close to the true entropy 2. The non-typical set, which contains the other sequences (note, for instance, that the most probable and least probable sequences belong to the non-typical set) Key findings as N grows large: The typical set has probability close to 1. The typical set contains nearly 2^{NH(X)} elements (sequences); note that when H(X) << 1 this number can be a tiny fraction of the total number of 2^N sequences. All elements in the typical set are nearly equiprobable. Source Coding 8/49

10 Typical set Definition: The typical set A_ε^N is the set of sequences (x_1, x_2, ..., x_N) ∈ X^N satisfying 2^{-N(H(X)+ε)} ≤ p(x_1, x_2, ..., x_N) ≤ 2^{-N(H(X)-ε)} or, equivalently, A_ε^N = {(x_1, x_2, ..., x_N) ∈ X^N : |-(1/N) log p(x_1, x_2, ..., x_N) - H(X)| < ε} For N sufficiently large, the set A_ε^N has the following properties: 1. Pr{A_ε^N} > 1 - ε 2. The number of elements in A_ε^N satisfies (1 - ε) 2^{N(H(X)-ε)} ≤ |A_ε^N| ≤ 2^{N(H(X)+ε)} Source Coding 9/49

11 An example Consider a sequence of N i.i.d. Bernoulli random variables with parameter p: The probability of a sequence s_k with k ones is p(s_k) = p^k (1 - p)^{N-k} The sample entropy of all sequences with k ones is -(1/N) log p(s_k) The number of sequences with k ones is (N choose k) The pmf of the number of ones is (N choose k) p^k (1 - p)^{N-k}, k = 0, 1, ..., N The entropy is H = -p log p - (1 - p) log(1 - p) What is the typical set A_ε^N for ε = 0.1? Source Coding 10/49

12 Let's consider the case N = 8 and p = 0.4, for which H = 0.971 bits. The typical set for ε = 0.1 is composed of all sequences whose sample entropy lies between H - ε = 0.871 and H + ε = 1.071. [Table on the slide: for each k = 0, ..., 8, the number of sequences (8 choose k), the probability (8 choose k) p^k (1 - p)^{8-k}, and the sample entropy -(1/8) log p(s_k).] The sample entropy falls inside this interval for k = 2, 3, 4 ones, so the number of sequences in the typical set is 28 + 56 + 70 = 154, while the bounds give (1 - ε) 2^{N(H(X)-ε)} = 112.6 ≤ |A_ε^N| ≤ 2^{N(H(X)+ε)} = 379.4 The probability of the typical set is 0.72. What happens when N increases? Source Coding 11/49
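The whole table can be regenerated with a few lines of Python; this is only a sketch under the slide's assumptions (N = 8, p = 0.4, ε = 0.1), and running it with N = 25, 100 or 200 reproduces the behaviour shown in the next slides:

from math import comb, log2

def typical_set_stats(N, p, eps):
    """Size and probability of the typical set of an i.i.d. Bernoulli(p) source."""
    H = -p * log2(p) - (1 - p) * log2(1 - p)
    count, prob = 0, 0.0
    for k in range(N + 1):                              # k = number of ones
        sample_H = -(k * log2(p) + (N - k) * log2(1 - p)) / N
        if abs(sample_H - H) < eps:                     # the sequence is typical
            count += comb(N, k)
            prob += comb(N, k) * p**k * (1 - p)**(N - k)
    return H, count, prob

H, count, prob = typical_set_stats(N=8, p=0.4, eps=0.1)
print(H, count, prob)    # about 0.971, 154 sequences, probability about 0.72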

13 N = 25 [Figure: sample entropy -log(p(s_k))/N versus k with the band H ± ε, and the pmf of k below; the slide reports the number of sequences in the typical set and its probability.] Source Coding 12/49

14 N = 100 [Figure: the same plots for N = 100.] Source Coding 13/49

15 N = 200 [Figure: the same plots for N = 200; the probability of the typical set is now ≈ 1.] Source Coding 14/49

16 AEP This concentration of measure phenomenon is a consequence of the weak (convergence in probability) law of large numbers, and is formalized in the Asymptotic Equipartition Property (AEP) theorem. Theorem (AEP): If X_1, X_2, ... are i.i.d. with pmf p(x), then -(1/N) log p(X_1, X_2, ..., X_N) → H(X) in probability. Convergence in probability means that for every ε > 0, Pr{ |-(1/N) log p(X_1, X_2, ..., X_N) - H(X)| > ε } → 0 The proof follows from the independence of the random variables and the weak law of large numbers Source Coding 15/49
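The convergence can also be illustrated by simulation; a minimal Monte Carlo sketch (the sample sizes and the number of trials are arbitrary choices of mine):

import random
from math import log2

def sample_entropy(N, p):
    """-1/N log p(X_1, ..., X_N) for one i.i.d. Bernoulli(p) realization."""
    ones = sum(random.random() < p for _ in range(N))
    return -(ones * log2(p) + (N - ones) * log2(1 - p)) / N

p = 0.2
H = -p * log2(p) - (1 - p) * log2(1 - p)
eps, trials = 0.05, 2000
for N in (20, 200, 2000):
    inside = sum(abs(sample_entropy(N, p) - H) <= eps for _ in range(trials))
    print(N, inside / trials)   # empirical Pr{|i(s) - H| <= eps} approaches 1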

17 Consequences of the AEP: data compression When H < 1, a consequence of the AEP is that a tiny fraction of the total number of sequences contains most of the probability: this can be used for data compression. [Figure: for an alphabet with |X| = 2, the set of all 2^N sequences is split into the non-typical set and the typical set, which has only 2^{N(H+ε)} elements.] Source Coding 16/49

18 A coding scheme Let x^N = (x_1, x_2, ..., x_N) denote a sequence and let l(x^N) be the length of the codeword assigned to x^N. Let us denote the typical set as A_ε^N and the non-typical set as its complement (A_ε^N)^c. Proposed coding scheme (brute-force enumeration): If x^N ∈ A_ε^N: prefix bit 0 + an index of at most 1 + N(H + ε) bits. If x^N ∉ A_ε^N: prefix bit 1 + an index of at most 1 + N bits (recall that |X| = 2). The code is one-to-one and easily decodable. Typical sequences have short codewords of length ≈ NH. We have overestimated the size of the non-typical set. Source Coding 17/49
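A toy implementation of this brute-force scheme, written directly from the definition, may help; it is only a sketch (the enumeration is exponential in N, so it is usable only for small N, and all names are mine):

from math import log2, ceil
from itertools import product

def two_part_code(N, p, eps):
    """Flag-bit code for a Bernoulli(p) source: 0 + index if typical, 1 + index otherwise."""
    H = -p * log2(p) - (1 - p) * log2(1 - p)

    def is_typical(seq):
        ones = sum(seq)
        sample_H = -(ones * log2(p) + (N - ones) * log2(1 - p)) / N
        return abs(sample_H - H) <= eps

    typical = [s for s in product((0, 1), repeat=N) if is_typical(s)]
    atypical = [s for s in product((0, 1), repeat=N) if not is_typical(s)]
    bits_typ = max(1, ceil(log2(len(typical))))     # at most N(H + eps) + 1 bits
    code = {}
    for i, s in enumerate(typical):
        code[s] = "0" + format(i, f"0{bits_typ}b")
    for i, s in enumerate(atypical):
        code[s] = "1" + format(i, f"0{N}b")         # brute-force N-bit index
    return code

code = two_part_code(N=12, p=0.2, eps=0.2)
avg = sum(0.2**sum(s) * 0.8**(12 - sum(s)) * len(w) for s, w in code.items())
print(avg / 12)   # expected bits/symbol; approaches H + eps only as N grows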

19 [Figure: a non-typical sequence requires a description of N + 2 bits, whereas a typical sequence requires a description of only N(H + ε) + 2 bits.] Source Coding 18/49

20 If N is sufficiently large so that Pr{A_ε^N} > 1 - ε, the expected codeword length is E[l(X^N)] = Σ_{x^N ∈ A_ε^N} p(x^N) l(x^N) + Σ_{x^N ∉ A_ε^N} p(x^N) l(x^N) ≤ Σ_{x^N ∈ A_ε^N} p(x^N) (2 + N(H + ε)) + Σ_{x^N ∉ A_ε^N} p(x^N) (2 + N) = Pr{A_ε^N} (2 + N(H + ε)) + Pr{(A_ε^N)^c} (2 + N) ≤ (2 + N(H + ε)) + ε(2 + N) = N(H + ε) + εN + 2 + 2ε ≤ N(H + ε'), where ε' = 2ε + 4/N can be made arbitrarily small by choosing ε small and N large Source Coding 19/49

21 Theorem (data compression): Let X^N = (X_1, X_2, ..., X_N) be i.i.d. with pmf p(x), and let ε > 0. Then there exists a code that maps sequences x^N of length N into binary strings such that the mapping is one-to-one (invertible) and E[(1/N) l(X^N)] ≤ H(X) + ε for N sufficiently large. Shannon's source coding theorem (informal statement): N i.i.d. random variables with entropy H can be compressed into NH bits with negligible risk of information loss as N → ∞. Conversely, if they are compressed into fewer than NH bits, information will be lost. In the following, we will study some practical codes for data compression Source Coding 20/49

22 Definitions Definition: A source code C for a random variable X is a mapping from X, the alphabet of X, to D*, the set of finite-length strings of symbols from a D-ary alphabet D = {0, 1, ..., D - 1}. Let C(x) denote the codeword corresponding to x and let l(x) denote the length of C(x). Definition: The expected length of a source code C for a random variable X with pmf p(x) is given by L(C) = E[l(X)] = Σ_{x ∈ X} p(x) l(x) Source Coding 21/49

23 Example Let X in X = {1, 2, 3, 4} be a r.v. with the following pmf and codeword assignment from a binary alphabet D = {0, 1}: Pr(X = 1) = 1/2, C(1) = 00; Pr(X = 2) = 1/4, C(2) = 01; Pr(X = 3) = 1/8, C(3) = 10; Pr(X = 4) = 1/8, C(4) = 11. The sequence of symbols X = ( ) is encoded as C(X) = ( ), and the bit string C(X) = ( ) is decoded as X = ( ). The expected length of the code is L(C) = 2 bits. Source Coding 22/49

24 There are some basic requirements for a useful code: Every element in X should map into a different string in D*: non-singular codes. Any encoded string must have a unique decoding (that is, a unique sequence of symbols producing it): uniquely decodable codes. The encoded string must be easy to decode (that is, we should be able to perform symbol-by-symbol decoding): prefix or instantaneous codes. In addition, the code should have minimum expected length (as close as possible to H). Our goal is to construct instantaneous codes of minimum expected length Source Coding 23/49

25 [Figure: nested classes of codes — all codes ⊃ nonsingular codes ⊃ uniquely decodable codes ⊃ instantaneous codes.] Source Coding 24/49

26 Example Find the expected lengths of the following codes, and determine whether or not they are nonsingular, uniquely decodable and instantaneous. [Table on the slide: the pmf of X and three candidate codes C_1, C_2, C_3.] Source Coding 25/49

27 Code trees Any prefix code over a D-ary alphabet can be represented as a D-ary tree (D branches at each node). Consider the prefix code shown as a tree on the slide. Every internal node along the path to a codeword (a leaf) is a prefix of that codeword, so it cannot itself be a codeword. Some leaves may be unused Source Coding 26/49

28 Kraft inequality Suppose we define a code by assigning a set of integer codeword lengths (l_1, l_2, ..., l_m). If we restrict ourselves to instantaneous (prefix) codes: what is the restriction on the set of integers {l_i}? Intuition: Suppose we have an instantaneous code with codewords {00, 01, 10, 11}. If we shorten one of the codewords, 00 → 0, then the only way to retain instantaneous decodability is to lengthen other codewords. There seems to be a constrained budget that we can spend on codewords Source Coding 27/49

29 Suppose we build a code from codewords of length l = 3 using a binary alphabet (D = 2). How many codewords can we have and retain unique decodability? D^l = 2^3 = 8: {000, 001, 010, 011, 100, 101, 110, 111}. Can we add another codeword of length l > 3 and retain instantaneous decodability? No. Suppose now we fix the first codeword to 0 and complete the code with codewords of length 3. How many codewords of length 3 can we have? Only the four that start with 1: {100, 101, 110, 111}. So a codeword of length 3 seems to have a cost that is 2^2 = 4 times smaller than a codeword of length 1. If the total budget is 1, the cost of a codeword of length l is 2^{-l}: Kraft inequality Source Coding 28/49

30 Theorem (Kraft inequality): For any instantaneous code (prefix code) over an alphabet of D symbols, the codeword lengths l_1, l_2, ..., l_m must satisfy the inequality Σ_i D^{-l_i} ≤ 1 Conversely, given a set of codeword lengths that satisfy this inequality, there exists an instantaneous code with these codeword lengths. If we represent a prefix code as a tree, Σ_i D^{-l_i} = 1 is achieved iff all leaves are used Source Coding 29/49

31 Example Suppose a code with codeword lengths (2, 2, 3, 3, 3): Σ_{i=1}^{5} 2^{-l_i} = 1/4 + 1/4 + 1/8 + 1/8 + 1/8 = 7/8 ≤ 1 It satisfies Kraft's inequality, so there exists an instantaneous code with these codeword lengths. Can we find one? Take c_k = Σ_{i=1}^{k-1} 2^{-l_i} and use as codeword the first l_k bits of the binary expansion of c_k: c_1 = 0 → 00, c_2 = 0.25 → 01, c_3 = 0.5 → 100, c_4 = 0.625 → 101, c_5 = 0.75 → 110. Try the same with (2, 2, 3, 3, 3, 3) and (2, 2, 2, 3, 3, 3) Source Coding 30/49
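The Kraft test and the cumulative-sum construction above are easy to code; a sketch (function names are mine, binary alphabet assumed):

def kraft_sum(lengths, D=2):
    """Left-hand side of the Kraft inequality."""
    return sum(D ** -l for l in lengths)

def code_from_lengths(lengths):
    """Prefix code from lengths: c_k = cumulative sum of 2^{-l_i}, keep the first l_k bits."""
    assert kraft_sum(lengths) <= 1, "Kraft inequality violated: no prefix code exists"
    codewords, c = [], 0.0
    for l in sorted(lengths):
        codewords.append(format(int(c * 2**l), f"0{l}b"))  # first l bits of c
        c += 2.0 ** -l
    return codewords

print(kraft_sum([2, 2, 3, 3, 3]))           # 0.875 <= 1
print(code_from_lengths([2, 2, 3, 3, 3]))   # ['00', '01', '100', '101', '110']
print(kraft_sum([2, 2, 3, 3, 3, 3]))        # exactly 1: all leaves are used
print(kraft_sum([2, 2, 2, 3, 3, 3]))        # 1.125 > 1: no prefix code exists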

32 Codeword supermarket (from MacKay's textbook) [Figure: the total symbol code budget.] Source Coding 31/49

33 Optimal codes So far, we have proved that: Any instantaneous code must satisfy Kraft's inequality. Kraft's inequality is a sufficient condition for the existence of a code with the specified codeword lengths. Notice, however, that Kraft's inequality only involves the codeword lengths, not the symbol probabilities. Now we consider the design of optimal codes: 1. Instantaneous or prefix codes (thus satisfying Kraft's inequality) 2. With minimum expected length L = Σ_i p_i l_i Source Coding 32/49

34 Given a set of probabilities (p_1, ..., p_m), the optimization problem for designing an optimal code is as follows (we assume a binary alphabet for the codewords): minimize L(l_1, ..., l_m) = Σ_i p_i l_i subject to Σ_i 2^{-l_i} ≤ 1, over the set of integer lengths. We consider a relaxed version of the problem that neglects the integer constraint on the codeword lengths l_i, and solve the constrained minimization problem using Lagrange multipliers: J = Σ_i p_i l_i + λ (Σ_i 2^{-l_i}) Source Coding 33/49

35 Differentiating with respect to l_i we obtain ∂J/∂l_i = p_i - λ 2^{-l_i} ln 2 Setting the derivative to zero gives 2^{-l_i} = p_i / (λ ln 2), and substituting this in the constraint to find λ we obtain λ = 1/ln 2 and hence p_i = 2^{-l_i}, yielding the optimal code lengths l_i* = -log p_i With this solution the expected length of the code would be H(X); however, notice that l_i* might be non-integer Source Coding 34/49

36 We round up l_i* as l_i = ⌈l_i*⌉ = ⌈-log p_i⌉, where ⌈x⌉ denotes the smallest integer ≥ x. This choice of lengths satisfies -log p_i ≤ l_i ≤ -log p_i + 1, and therefore we have the following theorem. Theorem: Let (l_1, ..., l_m) be optimal integer-valued codeword lengths over a binary alphabet (D = 2) for a source distribution (p_1, ..., p_m), and let L = Σ_i p_i l_i be the associated expected length. Then H(X) ≤ L ≤ H(X) + 1 So there is an overhead of at most 1 bit due to the fact that the l_i are constrained to be integer values Source Coding 35/49
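A quick numerical check of this rounding argument (the pmfs below are made up for illustration):

from math import log2, ceil

def rounded_lengths(pmf):
    """Round the ideal lengths -log p_i up to integers."""
    return [ceil(-log2(p)) for p in pmf]

for pmf in ([0.5, 0.25, 0.125, 0.125], [0.4, 0.3, 0.2, 0.1]):
    H = -sum(p * log2(p) for p in pmf)
    L = sum(p * l for p, l in zip(pmf, rounded_lengths(pmf)))
    print(f"H = {H:.3f}, L = {L:.3f}")   # H <= L <= H + 1 in both cases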

37 Huffman coding David A. Huffman discovered a simple algorithm for designing optimal (shortest expected length) prefix codes. No other code for the same alphabet can have lower expected length than Huffman's code. The basic idea is to assign short codewords to the symbols with high probabilities and long codewords to those with low probabilities. A Huffman code is designed by merging the two least probable characters (outcomes) of the random variable, and repeating this procedure until there is only one character remaining. A tree is thus generated, and the Huffman code is obtained from the labeling of the code tree Source Coding 36/49

38 Example 1 [The worked example is shown graphically on the slide.] Source Coding 37/49

39 Example 2 Construct a Huffman code for the following pmf, calculate the information content h(p_i) = -log p_i of each symbol and indicate the length l_i of each codeword: x_i = a, b, c, d, e with p_i = 0.25, 0.25, 0.2, 0.15, 0.15 (columns C(x_i), h(p_i), l_i to be filled in). Source Coding 38/49

40 Example 2 (solution) The Huffman procedure gives codeword lengths l_i = 2, 2, 2, 3, 3 for a, b, c, d, e; one valid codeword assignment (the tree labeling is not unique) is a → 00, b → 01, c → 10, d → 110, e → 111. The information contents are h(p_i) = -log p_i = 2, 2, 2.32, 2.74, 2.74 bits. The entropy is H = 2.29 bits. The expected length of the code is L = 2.3 bits Source Coding 39/49
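A compact Huffman construction for this pmf, as a sketch; tie-breaking differs between implementations, so the codewords (but not the lengths or the expected length) may differ from the tree on the slide:

import heapq
from math import log2

def huffman(pmf):
    """Return a dict symbol -> codeword for a binary Huffman code."""
    # heap entries: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # the two least probable groups
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

pmf = {"a": 0.25, "b": 0.25, "c": 0.2, "d": 0.15, "e": 0.15}
code = huffman(pmf)
H = -sum(p * log2(p) for p in pmf.values())
L = sum(pmf[s] * len(w) for s, w in code.items())
print(code)
print(f"H = {H:.3f} bits, L = {L:.2f} bits")   # H = 2.285..., L = 2.3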

41 Huffman code for the English language [Table on the slide: for each letter a_i of the English alphabet, its probability p_i, its information content log2(1/p_i), the codeword length l_i and the codeword c(a_i).] Source Coding 40/49

42 Universal codes We have seen that Huffman coding produces optimal (minimum expected length) prefix codes. Does it have any disadvantage? Huffman coding is optimal for a specific source distribution, which has to be known in advance; however, the probability distribution underlying the source may be unknown. To solve this problem we could first estimate the pmf of the source and then design the code, but this is not practical. We would rather have lossless codes that are asymptotically optimal without depending on the source pmf. Such codes are called universal Source Coding 41/49

43 The cost of pmf mismatch Assume we have a random variable X drawn from some parameterized pmf, p_θ(x), which depends on a parameter θ ∈ {1, 2, ..., m}. If θ is known, we can construct an optimal code with codeword lengths (ignoring the integer constraint) l(x) = -log p_θ(x) = log(1/p_θ(x)) and expected length E[l(X)] = E[-log p_θ(X)] = H(p_θ) If the true distribution is unknown, some questions arise: What is the cost of using a wrong (mismatched) pmf? What is the pmf that minimizes this cost for a given family of distributions p_θ(x)? Source Coding 42/49

44 Suppose we use a code with codeword lengths l(x). This code would be optimal for a source with pmf given by q(x) = 2^{-l(x)}, that is, l(x) = log(1/q(x)). The redundancy is defined as the difference between the expected length of the code that assumes the wrong distribution, q(x), and the expected length of the optimal code designed for p_θ(x): R(p_θ(x), q(x)) = Σ_x p_θ(x) l(x) - Σ_x p_θ(x) log(1/p_θ(x)) = Σ_x p_θ(x) (log(1/q(x)) - log(1/p_θ(x))) = Σ_x p_θ(x) log(p_θ(x)/q(x)) = D(p_θ(x) || q(x)) The cost (in bits) is the relative entropy between the true and the wrong distributions Source Coding 43/49
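A small numerical check that the extra expected length equals the relative entropy; the two distributions below are made up for illustration:

from math import log2

def kl(p, q):
    """Relative entropy D(p || q) in bits."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p_true = [0.5, 0.25, 0.125, 0.125]   # actual source pmf
q_wrong = [0.25, 0.25, 0.25, 0.25]   # pmf the code was designed for

lengths = [-log2(q) for q in q_wrong]            # ideal lengths l(x) = log 1/q(x)
E_len = sum(p * l for p, l in zip(p_true, lengths))
H = -sum(p * log2(p) for p in p_true)

print(E_len - H, kl(p_true, q_wrong))   # both equal 0.25 bits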

45 If we want to design a code that works well regardless of the true distribution, we can use a minimax criterion: min_{q(x)} max_{p_θ(x)} R(p_θ(x), q(x)) = min_{q(x)} max_{p_θ(x)} D(p_θ(x) || q(x)) The solution of this problem is the distribution q*(x) that is at the center (in KL distance) of the family of distributions p_θ(x). [Figure: q*(x) located at the center of the distributions p_1(x), p_2(x), p_3(x), ..., p_m(x).] Source Coding 44/49

46 Lempel-Ziv (LZ) coding Discovered by Abraham Lempel and Jacob Ziv, LZ codes are a popular class of universal codes that are asymptotically optimal: their asymptotic compression rate approaches the entropy of the source. Most file compression programs (gzip, pkzip, compress) use different implementations of the basic LZ algorithm. LZ coding is based on adaptive dictionaries that contain substrings that have already appeared. When we find a substring that is already in the dictionary, we just have to encode a pointer to that substring, thus effectively compressing the source Source Coding 45/49

47 Basic Lempel-Ziv algorithm As an example, let us consider the following binary string (we consider a binary alphabet without loss of generality): source = 1011010100010111 The first step is to parse the source of symbols into an ordered dictionary of substrings that have not appeared before: 1, 0, 11, 01, 010, 00, 10, 111 In the second step we enumerate the substrings: n = 1, 2, ..., 8 Source Coding 46/49

48 The basic idea is to encode each substring by giving a pointer to the previously parsed substring formed by all but its last bit, followed by that extra bit. The first pointer is empty, and a pointer equal to 0 also denotes the empty substring. For the parsed source 1, 0, 11, 01, 010, 00, 10, 111 (n = 1, ..., 8) the (pointer, bit) pairs are (, 1) (0, 0) (1, 1) (2, 1) (4, 0) (2, 0) (1, 0) (3, 1) If we have already enumerated n substrings, the pointer can be encoded with log n bits Source Coding 47/49

49 Writing the pointers in binary, the pairs (, 1) (0, 0) (1, 1) (2, 1) (4, 0) (2, 0) (1, 0) (3, 1) become (, 1) (0, 0) (01, 1) (10, 1) (100, 0) (010, 0) (001, 0) (011, 1) Finally, the encoded string is the concatenation 1 00 011 101 1000 0100 0010 0111 (25 bits). For this example the encoded sequence is actually larger than the original 16-bit sequence; this is because the source string is too short. How does the decoder work? Source Coding 48/49
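A minimal parser and decoder along the lines of this example, ignoring the bit-level packing of the pointers; it is only a sketch, with names of my own choosing:

def lz_parse(bits):
    """Parse a binary string into (pointer, extra bit) pairs (LZ78-style)."""
    dictionary = {"": 0}            # substring -> index; index 0 is the empty string
    pairs, phrase = [], ""
    for b in bits:
        if phrase + b in dictionary:
            phrase += b             # keep extending the current substring
        else:
            pairs.append((dictionary[phrase], b))
            dictionary[phrase + b] = len(dictionary)
            phrase = ""
    if phrase:                      # leftover substring already in the dictionary
        pairs.append((dictionary[phrase[:-1]], phrase[-1]))
    return pairs

def lz_decode(pairs):
    phrases, out = [""], []
    for ptr, b in pairs:
        phrases.append(phrases[ptr] + b)
        out.append(phrases[-1])
    return "".join(out)

source = "1011010100010111"
pairs = lz_parse(source)
print(pairs)                        # [(0,'1'), (0,'0'), (1,'1'), (2,'1'), (4,'0'), (2,'0'), (1,'0'), (3,'1')]
print(lz_decode(pairs) == source)   # True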

50 Sending an uncompressed bit with each substring results in a loss of efficiency. The way to solve this issue is to consider the extra bit as part of the next substring. This modification, proposed by Terry Welch in 1984 (the LZW algorithm), is the basis of most practical implementations of LZ coding. In practice, the dictionary size is limited (e.g., to 4096 entries) Source Coding 49/49
