Chapter 5. Data Compression
|
|
- Logan Terry
- 5 years ago
- Views:
Transcription
1 Chapter 5 Data Compression Peng-Hua Wang Graduate Inst. of Comm. Engineering National Taipei University
2 Chapter Outline Chap. 5 Data Compression 5.1 Example of Codes 5.2 Kraft Inequality 5.3 Optimal Codes 5.4 Bound on Optimal Code Length 5.5 Kraft Inequality for Uniquely Decodable Codes 5.6 Huffman Codes 5.7 Some Comments on Huffman Codes 5.8 Optimality of Huffman Codes 5.9 Shannon-Fano-Elias Coding 5.10 Competitive Optimality of the Shannon Code 5.11 Generation of Discrete Distributions from Fair Coins Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 2/41
3 5.1 Example of Codes Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 3/41
4 Source code Definition (Source code) A source codec for a random variablex is a mapping fromx, the range ofx, tod, the set of finite-length strings of symbols from ad-ary alphabet. LetC(x) denote the codeword corresponding toxand letl(x) denote the length ofc(x). For example, C(red) = 00, C(blue) = 11 is a source code with mapping fromx = {red, blue} tod 2 with alphabetd = {0,1}. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 4/41
5 Source code Definition (Expected length) The expected lengthl(c) of a source codec(x) for a random variablex with probability mass functionp(x) is given by L(C) = x X p(x)l(x). wherel(x) is the length of the codeword associated withx. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 5/41
6 Example Example LetX be a random variable with the following distribution and codeword assignment Pr{X = 1} = 1, codewordc(1) = 0 2 Pr{X = 2} = 1, codewordc(2) = 10 4 Pr{X = 3} = 1, codewordc(3) = Pr{X = 4} = 1, codewordc(4) = H(X) = 1.75 bits. El(x) = 1.75 bits. uniquely decodable Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 6/41
7 Example Example Consider following example. Pr{X = 1} = 1, codewordc(1) = 0 3 Pr{X = 2} = 1, codewordc(2) = 10 3 Pr{X = 3} = 1, codewordc(3) = 11 3 H(X) = 1.58 bits. El(x) = 1.66 bits. uniquely decodable Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 7/41
8 Source code Definition (non-singular) A code is said to be nonsingular if every element of the range ofx maps into a different string ind ; that is, x x C(x) C(x ) Definition (extension code) The extensionc of a codec is the mapping from finite length-strings ofx to finite-length strings ofd, defined by C(x 1 x 2 x n ) = C(x 1 )C(x 2 ) C(x n ) wherec(x 1 )C(x 2 ) C(x n ) indicates concatenation of the corresponding codewords. Example IfC(x 1 ) = 00 andc(x 2 ) = 11, thenc(x 1 x 2 ) = Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 8/41
9 Source code Definition (uniquely decodable) A code is called uniquely decodable if its extension is nonsingular. Definition (prefix code) A code is called a prefix code or an instantaneous code if no codeword is a prefix of any other codeword. For an instantaneous code, the symbolx i can be decoded as soon as we come to the end of the codeword corresponding to it. For example, the binary string produced by the code of Example is parsed as0,10,111,110,10. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 9/41
10 Source code X Singular Nonsingular, but not UD, UD But Not Inst. Inst Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 10/41
11 Decoding Tree Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 11/41
12 5.2 Kraft Inequality Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 12/41
13 Kraft Inequality Theorem (Kraft Inequality) For any instantaneous code (prefix code) over an alphabet of sized, the codeword lengthsl 1,l 2,...,l m must satisfy the inequality D l i 1. i Conversely, given a set of codeword lengths that satisfy this inequality, there exists an instantaneous code with these word lengths. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 13/41
14 Extended Kraft Inequality Theorem (Extended Kraft Inequality) For any countably infinite set of codewords that form a prefix code, the codeword lengths satisfy the extended Kraft inequality, i=1 D l i 1. Conversely, given anyl 1,l 2,... satisfying the extended Kraft inequality, we can construct a prefix code with these codeword lengths. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 14/41
15 5.3 Optimal Codes Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 15/41
16 Minimize expected length Problem Given the source pmfpp 1,p 2,...p m, find the code length l 1,l 2,...,l m such that the expected code length is minimized L = p i l i with constraint D l i 1. l 1,l 2,...,l m are integers. We first relax the original integer programming problem. The restriction of integer length is relaxed to real number. Solve by Lagrange multipliers. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 16/41
17 Solve the relaxed problem J = ( ) p i l i +λ D l i J, = p i λd l i lnd l i J = 0 D l i = p i l i λlnd D l i 1 λ = 1 lnd p i = D l i optimal code lengthli = log D p i The expected code length is L = p i l i = p i log D p i = H D (X) Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 17/41
18 Expected code length Theorem The expected lengthlof any instantaneousd-ary code for a random variablex is greater than or equal to the entropy H D (X); that is, L H D (X) with equality if and only ifd l i = p i. Proof. L H D (X) = p i l i + p i log D p i = p i log D D l i + p i log D p i = D(p q) 0 whereq i = D li. WRONG!! Because D l i 1 may not be a valid distribution. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 18/41
19 Expected code length Proof. L H D (X) = p i l i + p i log D p i Letc = j D l j,r i = D l i /c, = p i log D D l i + p i log D p i L H D (X) = p i log D p i r i log D c = D(p r)+log D 1 c 0 sinced(p r) 0 andc 1 by Kraft inequality. L H D (X) with equality iffp i = D l i. That is, iff log D p i is an integer for alli. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 19/41
20 D-adic Definition (D-adic) A probability distribution is calledd-adic if each of the probabilities is equal tod n for somen. L = H D (X) if and only if the distribution ofx isd-adic. How to find the optimal code? Find thed-adic distribution that is closest (in the relative entropy sense) to the distribution ofx. What is the upper bound of the optimal code? Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 20/41
21 5.4 Bound on Optimal Code Length Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 21/41
22 Optimal code length Theorem Letl 1,l 2,...,l m be optimal codeword lengths for a source distributionpand sad-ary alphabet, and letl be the associated expected length of an optimal code (L ast = p i l i ). Then H D (X) L < H D (X)+1. Proof. Letl i = log D 1 p i where x is the smallest integer x. These lengths satisfy the Kraft inequality since D log D 1 p i D log D 1 p i = p i = 1. These choice of codeword lengths satisfies log D 1 p i l i log D 1 p i +1. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 22/41
23 Optimal code length Multiplying byp i and summing overi, we obtain H D (X) L < H D (X)+1. SinceL is the expected length of the optimal code, L L < H D (X)+1. On another hand, from Theorem 5.3.1, L H D (X). Therefore, H D (X) L < H D (X)+1. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 23/41
24 Optimal code length Consider a system in which we send a sequence ofnsymbols fromx. DefineL n to be the expected codeword length per input symbol, We have L n = 1 p(x1,x 2,...,x n )l(x 1,x 2,...,x n ) n = 1 n E[l(X 1,X 2,...,X n )] H(X 1,X 2,...,X n ) E[l(X 1,X 2,...,X n )] < H(X 1,X 2,...,X n )+1 IfX 1,X 2,...,X n are i.i.d., we have H(X) L n < H(X)+ 1 n Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 24/41
25 Optimal code length Consider a system in which we send a sequence ofnsymbols fromx. DefineL n to be the expected codeword length per input symbol, We have L n = 1 p(x1,x 2,...,x n )l(x 1,x 2,...,x n ) n = 1 n E[l(X 1,X 2,...,X n )] H(X 1,X 2,...,X n ) E[l(X 1,X 2,...,X n )] < H(X 1,X 2,...,X n )+1 Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 25/41
26 Optimal code length IfX 1,X 2,...,X n are independent but not identically distributed, we have 1 n H(X 1,X 2,...,X n ) L n < 1 n H(X 1,X 2,...,X n )+1 If the random process is stationary 1 n H(X 1,X 2,...,X n ) H(X) Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 26/41
27 Optimal code length Theorem The minimum expected codeword length per symbol satisfies 1 n H(X 1,X 2,...,X n ) L n < 1 n H(X 1,X 2,...,X n )+1 IfX 1,X 2,...,X n is a stationary random process, L n H(X) whereh(x) is the entropy rate of the random process. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 27/41
28 Wrong Code Theorem (Wrong Code) The expected length underp(x) of the code assignmentl(x) = log 1 q(x) satisfies H(p)+D(p q) E p [l(x)] < H(p)+D(p q)+1. Proof. The expected codelength is E[l(X)] = x = x = x p(x) log 1 q(x) < x p(x) p(x)log 1 q(x) +1 ( p(x) log p(x) ) q(x) logp(x) +1 = H(p)+D(p q)+1 ( log 1 ) q(x) +1 The lower bound can be derived similarly. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 28/41
29 5.5 Kraft Inequality for Uniquely Decodable Codes Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 29/41
30 Uniquely Decodable Codes Theorem (Wrong Code) The codeword lengths of any uniquely decodabled-ary code must satisfy the Kraft inequality D l i 1. Conversely, given a set of codeword lengths that satisfy this inequality, it is possible to construct a uniquely decodable code with these codeword lengths. Proof. Consider ( ) k D l(x) = D l(x1) l(xk) D l(x2) D x x 1 x 2 x k = D l(x1) D l(x2) D l(xk) = x 1,...,x k X k x k X k D l(x k ) Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 30/41
31 Uniquely Decodable Codes We now gather the terms by word lengths to obtain x k X k D l(xk) = kl max m=1 a(m)d m wherel max is the maximum codeword length anda(m) is the number of source sequences mapping into codewords of lengthm. Since the code is uniquely decodable, so there is at most one sequence mapping into each codem-sequence and there are at mostd m code m-sequences. Thus, a(m) D m. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 31/41
32 Uniquely Decodable Codes Therefore, ( D l(x) ) k = kl max a(m)d m kl max D m D m = kl max x m=1 m=1 or D l j (kl max ) 1/k j Since this inequality is true for allk, it is true in the limit ask. Since(kl max ) 1/k 1, we have D l j 1. j Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 32/41
33 5.6 Huffman Codes Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 33/41
34 Example X = {1,2,3,4,5},p = {0.25,0.25,0.2,0.15,0.15}, binary. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 34/41
35 Example X = {1,2,3,4,5},p = {0.25,0.25,0.2,0.15,0.15}, ternary. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 35/41
36 Example X = {1,2,3,4,5,6},p = {0.25,0.25,0.2,0.1,0.1,0.1}, ternary. Atkth stage, the total number of symbols is1+k(d 1). Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 36/41
37 Shannon Code Using codeword lengths of log 1 p i May be much worse than the optimal code. For example, p 1 = 1 1/1024 andp 2 = 1/1024. log 1 p 1 = 1 and log 1 p 2 = 10. However, we can use exactly 1 bit. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 37/41
38 5.8 Optimality of Huffman Codes Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 38/41
39 Properties of Optimal Codes Lemma For any distribution, there exists an optimal instantaneous code that satisfies the following properties: 1. Ifp j > p k, thenl j l k. 2. The two longest codewords have the same length. 3. Two of the longest codewords differ only in the last bits. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 39/41
40 Optimality of Huffman codes Lemma LetC be the codes with distribution p 1 p 2 p K 1 p K. C is the codes with distribution p 1,p 2,...,p K 1 +p K. IfC is optimal with code assignment p 1 w 1,p 2 w 2,...,p K 1 +p K w K 1, thenc is also optimal with code assignment p 1 w 1,p 2 w 2,...,p K 1 w K 1 0,p K w K 1 1. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 40/41
41 Optimality of Huffman codes Proof. The average length forc is L(C ) = p 1 l 1 +p 2 l 2 + +(p K 1 +p K )l K 1. The average length forc is L(C) = p 1 l 1 +p 2 l 2 + +p K 1 (l K 1 +1)+p K (l K 1 +1). We have L(C) = L(C )+p K 1 +p K. That is, we can minimizel(c) by minimizingl(c ) sincep K 1 +p K is a constant. Peng-Hua Wang, April 2, 2012 Information Theory, Chap. 5 - p. 41/41
Chapter 5: Data Compression
Chapter 5: Data Compression Definition. A source code C for a random variable X is a mapping from the range of X to the set of finite length strings of symbols from a D-ary alphabet. ˆX: source alphabet,
More informationChapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code
Chapter 2 Date Compression: Source Coding 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code 2.1 An Introduction to Source Coding Source coding can be seen as an efficient way
More informationChapter 3 Source Coding. 3.1 An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code
Chapter 3 Source Coding 3. An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code 3. An Introduction to Source Coding Entropy (in bits per symbol) implies in average
More informationData Compression. Limit of Information Compression. October, Examples of codes 1
Data Compression Limit of Information Compression Radu Trîmbiţaş October, 202 Outline Contents Eamples of codes 2 Kraft Inequality 4 2. Kraft Inequality............................ 4 2.2 Kraft inequality
More informationSource Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria
Source Coding Master Universitario en Ingeniería de Telecomunicación I. Santamaría Universidad de Cantabria Contents Introduction Asymptotic Equipartition Property Optimal Codes (Huffman Coding) Universal
More information10-704: Information Processing and Learning Fall Lecture 10: Oct 3
0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 0: Oct 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of
More informationLecture 3. Mathematical methods in communication I. REMINDER. A. Convex Set. A set R is a convex set iff, x 1,x 2 R, θ, 0 θ 1, θx 1 + θx 2 R, (1)
3- Mathematical methods in communication Lecture 3 Lecturer: Haim Permuter Scribe: Yuval Carmel, Dima Khaykin, Ziv Goldfeld I. REMINDER A. Convex Set A set R is a convex set iff, x,x 2 R, θ, θ, θx + θx
More informationChapter 2: Source coding
Chapter 2: meghdadi@ensil.unilim.fr University of Limoges Chapter 2: Entropy of Markov Source Chapter 2: Entropy of Markov Source Markov model for information sources Given the present, the future is independent
More information10-704: Information Processing and Learning Fall Lecture 9: Sept 28
10-704: Information Processing and Learning Fall 2016 Lecturer: Siheng Chen Lecture 9: Sept 28 Note: These notes are based on scribed notes from Spring15 offering of this course. LaTeX template courtesy
More informationCoding of memoryless sources 1/35
Coding of memoryless sources 1/35 Outline 1. Morse coding ; 2. Definitions : encoding, encoding efficiency ; 3. fixed length codes, encoding integers ; 4. prefix condition ; 5. Kraft and Mac Millan theorems
More informationEE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16
EE539R: Problem Set 4 Assigned: 3/08/6, Due: 07/09/6. Cover and Thomas: Problem 3.5 Sets defined by probabilities: Define the set C n (t = {x n : P X n(x n 2 nt } (a We have = P X n(x n P X n(x n 2 nt
More informationEE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018
Please submit the solutions on Gradescope. EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 1. Optimal codeword lengths. Although the codeword lengths of an optimal variable length code
More informationPART III. Outline. Codes and Cryptography. Sources. Optimal Codes (I) Jorge L. Villar. MAMME, Fall 2015
Outline Codes and Cryptography 1 Information Sources and Optimal Codes 2 Building Optimal Codes: Huffman Codes MAMME, Fall 2015 3 Shannon Entropy and Mutual Information PART III Sources Information source:
More informationInformation Theory and Statistics Lecture 2: Source coding
Information Theory and Statistics Lecture 2: Source coding Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Injections and codes Definition (injection) Function f is called an injection
More informationInformation Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18
Information Theory David Rosenberg New York University June 15, 2015 David Rosenberg (New York University) DS-GA 1003 June 15, 2015 1 / 18 A Measure of Information? Consider a discrete random variable
More informationLecture 16. Error-free variable length schemes (contd.): Shannon-Fano-Elias code, Huffman code
Lecture 16 Agenda for the lecture Error-free variable length schemes (contd.): Shannon-Fano-Elias code, Huffman code Variable-length source codes with error 16.1 Error-free coding schemes 16.1.1 The Shannon-Fano-Elias
More information4F5: Advanced Communications and Coding Handout 2: The Typical Set, Compression, Mutual Information
4F5: Advanced Communications and Coding Handout 2: The Typical Set, Compression, Mutual Information Ramji Venkataramanan Signal Processing and Communications Lab Department of Engineering ramji.v@eng.cam.ac.uk
More informationSIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding
SIGNAL COMPRESSION Lecture 3 4.9.2007 Shannon-Fano-Elias Codes and Arithmetic Coding 1 Shannon-Fano-Elias Coding We discuss how to encode the symbols {a 1, a 2,..., a m }, knowing their probabilities,
More informationLecture 3 : Algorithms for source coding. September 30, 2016
Lecture 3 : Algorithms for source coding September 30, 2016 Outline 1. Huffman code ; proof of optimality ; 2. Coding with intervals : Shannon-Fano-Elias code and Shannon code ; 3. Arithmetic coding. 1/39
More informationCoding for Discrete Source
EGR 544 Communication Theory 3. Coding for Discrete Sources Z. Aliyazicioglu Electrical and Computer Engineering Department Cal Poly Pomona Coding for Discrete Source Coding Represent source data effectively
More informationCOMM901 Source Coding and Compression. Quiz 1
German University in Cairo - GUC Faculty of Information Engineering & Technology - IET Department of Communication Engineering Winter Semester 2013/2014 Students Name: Students ID: COMM901 Source Coding
More information3F1 Information Theory, Lecture 3
3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2011, 28 November 2011 Memoryless Sources Arithmetic Coding Sources with Memory 2 / 19 Summary of last lecture Prefix-free
More informationSolutions to Set #2 Data Compression, Huffman code and AEP
Solutions to Set #2 Data Compression, Huffman code and AEP. Huffman coding. Consider the random variable ( ) x x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0. 0.04 0.04 0.03 0.02 (a) Find a binary Huffman code
More informationECE 587 / STA 563: Lecture 5 Lossless Compression
ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 2017 Author: Galen Reeves Last Modified: October 18, 2017 Outline of lecture: 5.1 Introduction to Lossless Source
More informationECE 587 / STA 563: Lecture 5 Lossless Compression
ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 28 Author: Galen Reeves Last Modified: September 27, 28 Outline of lecture: 5. Introduction to Lossless Source
More informationEE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1)
EE5585 Data Compression January 29, 2013 Lecture 3 Instructor: Arya Mazumdar Scribe: Katie Moenkhaus Uniquely Decodable Codes Recall that for a uniquely decodable code with source set X, if l(x) is the
More informationHomework Set #2 Data Compression, Huffman code and AEP
Homework Set #2 Data Compression, Huffman code and AEP 1. Huffman coding. Consider the random variable ( x1 x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0.11 0.04 0.04 0.03 0.02 (a Find a binary Huffman code
More informationBasic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols.
Universal Lossless coding Lempel-Ziv Coding Basic principles of lossless compression Historical review Variable-length-to-block coding Lempel-Ziv coding 1 Basic Principles of Lossless Coding 1. Exploit
More informationOptimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min.
Huffman coding Optimal codes - I A code is optimal if it has the shortest codeword length L L m = i= pl i i This can be seen as an optimization problem min i= li subject to D m m i= lp Gabriele Monfardini
More informationLecture 6: Kraft-McMillan Inequality and Huffman Coding
EE376A/STATS376A Information Theory Lecture 6-0/25/208 Lecture 6: Kraft-McMillan Inequality and Huffman Coding Lecturer: Tsachy Weissman Scribe: Akhil Prakash, Kai Yee Wan In this lecture, we begin with
More information10-704: Information Processing and Learning Spring Lecture 8: Feb 5
10-704: Information Processing and Learning Spring 2015 Lecture 8: Feb 5 Lecturer: Aarti Singh Scribe: Siheng Chen Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal
More information3F1 Information Theory, Lecture 3
3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2013, 29 November 2013 Memoryless Sources Arithmetic Coding Sources with Memory Markov Example 2 / 21 Encoding the output
More informationMotivation for Arithmetic Coding
Motivation for Arithmetic Coding Motivations for arithmetic coding: 1) Huffman coding algorithm can generate prefix codes with a minimum average codeword length. But this length is usually strictly greater
More informationAn instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1
Kraft s inequality An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if N 2 l i 1 Proof: Suppose that we have a tree code. Let l max = max{l 1,...,
More informationData Compression Techniques (Spring 2012) Model Solutions for Exercise 2
582487 Data Compression Techniques (Spring 22) Model Solutions for Exercise 2 If you have any feedback or corrections, please contact nvalimak at cs.helsinki.fi.. Problem: Construct a canonical prefix
More informationExercises with solutions (Set B)
Exercises with solutions (Set B) 3. A fair coin is tossed an infinite number of times. Let Y n be a random variable, with n Z, that describes the outcome of the n-th coin toss. If the outcome of the n-th
More informationEntropy as a measure of surprise
Entropy as a measure of surprise Lecture 5: Sam Roweis September 26, 25 What does information do? It removes uncertainty. Information Conveyed = Uncertainty Removed = Surprise Yielded. How should we quantify
More informationMultimedia Communications. Mathematical Preliminaries for Lossless Compression
Multimedia Communications Mathematical Preliminaries for Lossless Compression What we will see in this chapter Definition of information and entropy Modeling a data source Definition of coding and when
More informationEE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018
Please submit the solutions on Gradescope. Some definitions that may be useful: EE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018 Definition 1: A sequence of random variables X
More informationlossless, optimal compressor
6. Variable-length Lossless Compression The principal engineering goal of compression is to represent a given sequence a, a 2,..., a n produced by a source as a sequence of bits of minimal possible length.
More informationChapter 9 Fundamental Limits in Information Theory
Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For
More informationIntro to Information Theory
Intro to Information Theory Math Circle February 11, 2018 1. Random variables Let us review discrete random variables and some notation. A random variable X takes value a A with probability P (a) 0. Here
More informationChapter 11. Information Theory and Statistics
Chapter 11 Information Theory and Statistics Peng-Hua Wang Graduate Inst. of Comm. Engineering National Taipei University Chapter Outline Chap. 11 Information Theory and Statistics 11.1 Method of Types
More informationEE376A - Information Theory Midterm, Tuesday February 10th. Please start answering each question on a new page of the answer booklet.
EE376A - Information Theory Midterm, Tuesday February 10th Instructions: You have two hours, 7PM - 9PM The exam has 3 questions, totaling 100 points. Please start answering each question on a new page
More informationU Logo Use Guidelines
COMP2610/6261 - Information Theory Lecture 15: Arithmetic Coding U Logo Use Guidelines Mark Reid and Aditya Menon logo is a contemporary n of our heritage. presents our name, d and our motto: arn the nature
More information1 Introduction to information theory
1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through
More informationLecture Notes on Digital Transmission Source and Channel Coding. José Manuel Bioucas Dias
Lecture Notes on Digital Transmission Source and Channel Coding José Manuel Bioucas Dias February 2015 CHAPTER 1 Source and Channel Coding Contents 1 Source and Channel Coding 1 1.1 Introduction......................................
More informationQuantum-inspired Huffman Coding
Quantum-inspired Huffman Coding A. S. Tolba, M. Z. Rashad, and M. A. El-Dosuky Dept. of Computer Science, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, Egypt. tolba_954@yahoo.com,
More informationAPC486/ELE486: Transmission and Compression of Information. Bounds on the Expected Length of Code Words
APC486/ELE486: Transmission and Compression of Information Bounds on the Expected Length of Code Words Scribe: Kiran Vodrahalli September 8, 204 Notations In these notes, denotes a finite set, called the
More informationDCSP-3: Minimal Length Coding. Jianfeng Feng
DCSP-3: Minimal Length Coding Jianfeng Feng Department of Computer Science Warwick Univ., UK Jianfeng.feng@warwick.ac.uk http://www.dcs.warwick.ac.uk/~feng/dcsp.html Automatic Image Caption (better than
More information(Classical) Information Theory II: Source coding
(Classical) Information Theory II: Source coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract The information content of a random variable
More informationLecture 1 : Data Compression and Entropy
CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for
More informationInformation Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes
Information Theory with Applications, Math6397 Lecture Notes from September 3, 24 taken by Ilknur Telkes Last Time Kraft inequality (sep.or) prefix code Shannon Fano code Bound for average code-word length
More informationCommunications Theory and Engineering
Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 AEP Asymptotic Equipartition Property AEP In information theory, the analog of
More informationLec 03 Entropy and Coding II Hoffman and Golomb Coding
CS/EE 5590 / ENG 40 Special Topics Multimedia Communication, Spring 207 Lec 03 Entropy and Coding II Hoffman and Golomb Coding Zhu Li Z. Li Multimedia Communciation, 207 Spring p. Outline Lecture 02 ReCap
More informationShannon-Fano-Elias coding
Shannon-Fano-Elias coding Suppose that we have a memoryless source X t taking values in the alphabet {1, 2,..., L}. Suppose that the probabilities for all symbols are strictly positive: p(i) > 0, i. The
More informationOn the Cost of Worst-Case Coding Length Constraints
On the Cost of Worst-Case Coding Length Constraints Dror Baron and Andrew C. Singer Abstract We investigate the redundancy that arises from adding a worst-case length-constraint to uniquely decodable fixed
More informationComplex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity
Complex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity Eckehard Olbrich MPI MiS Leipzig Potsdam WS 2007/08 Olbrich (Leipzig) 26.10.2007 1 / 18 Overview 1 Summary
More informationVariable-to-Variable Codes with Small Redundancy Rates
Variable-to-Variable Codes with Small Redundancy Rates M. Drmota W. Szpankowski September 25, 2004 This research is supported by NSF, NSA and NIH. Institut f. Diskrete Mathematik und Geometrie, TU Wien,
More informationLecture 4 : Adaptive source coding algorithms
Lecture 4 : Adaptive source coding algorithms February 2, 28 Information Theory Outline 1. Motivation ; 2. adaptive Huffman encoding ; 3. Gallager and Knuth s method ; 4. Dictionary methods : Lempel-Ziv
More informationDigital Communications III (ECE 154C) Introduction to Coding and Information Theory
Digital Communications III (ECE 154C) Introduction to Coding and Information Theory Tara Javidi These lecture notes were originally developed by late Prof. J. K. Wolf. UC San Diego Spring 2014 1 / 14 Statement
More informationInformation Theory. M1 Informatique (parcours recherche et innovation) Aline Roumy. January INRIA Rennes 1/ 73
1/ 73 Information Theory M1 Informatique (parcours recherche et innovation) Aline Roumy INRIA Rennes January 2018 Outline 2/ 73 1 Non mathematical introduction 2 Mathematical introduction: definitions
More informationSummary of Last Lectures
Lossless Coding IV a k p k b k a 0.16 111 b 0.04 0001 c 0.04 0000 d 0.16 110 e 0.23 01 f 0.07 1001 g 0.06 1000 h 0.09 001 i 0.15 101 100 root 1 60 1 0 0 1 40 0 32 28 23 e 17 1 0 1 0 1 0 16 a 16 d 15 i
More informationLecture 1: September 25, A quick reminder about random variables and convexity
Information and Coding Theory Autumn 207 Lecturer: Madhur Tulsiani Lecture : September 25, 207 Administrivia This course will cover some basic concepts in information and coding theory, and their applications
More informationInformation Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 13 Competitive Optimality of the Shannon Code So, far we have studied
More informationEECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have
EECS 229A Spring 2007 * * Solutions to Homework 3 1. Problem 4.11 on pg. 93 of the text. Stationary processes (a) By stationarity and the chain rule for entropy, we have H(X 0 ) + H(X n X 0 ) = H(X 0,
More informationELEC 515 Information Theory. Distortionless Source Coding
ELEC 515 Information Theory Distortionless Source Coding 1 Source Coding Output Alphabet Y={y 1,,y J } Source Encoder Lengths 2 Source Coding Two coding requirements The source sequence can be recovered
More informationLecture 22: Final Review
Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information
More informationChapter 9. Gaussian Channel
Chapter 9 Gaussian Channel Peng-Hua Wang Graduate Inst. of Comm. Engineering National Taipei University Chapter Outline Chap. 9 Gaussian Channel 9.1 Gaussian Channel: Definitions 9.2 Converse to the Coding
More informationUNIT I INFORMATION THEORY. I k log 2
UNIT I INFORMATION THEORY Claude Shannon 1916-2001 Creator of Information Theory, lays the foundation for implementing logic in digital circuits as part of his Masters Thesis! (1939) and published a paper
More informationInformation Theory in Intelligent Decision Making
Information Theory in Intelligent Decision Making Adaptive Systems and Algorithms Research Groups School of Computer Science University of Hertfordshire, United Kingdom June 7, 2015 Information Theory
More informationSIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding
SIGNAL COMPRESSION Lecture 7 Variable to Fix Encoding 1. Tunstall codes 2. Petry codes 3. Generalized Tunstall codes for Markov sources (a presentation of the paper by I. Tabus, G. Korodi, J. Rissanen.
More informationCSCI 2570 Introduction to Nanocomputing
CSCI 2570 Introduction to Nanocomputing Information Theory John E Savage What is Information Theory Introduced by Claude Shannon. See Wikipedia Two foci: a) data compression and b) reliable communication
More informationRun-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE
General e Image Coder Structure Motion Video x(s 1,s 2,t) or x(s 1,s 2 ) Natural Image Sampling A form of data compression; usually lossless, but can be lossy Redundancy Removal Lossless compression: predictive
More informationDigital Communications III (ECE 154C) Introduction to Coding and Information Theory
Digital Communications III (ECE 54C) Introduction to Coding and Information Theory Tara Javidi These lecture notes were originally developed by late Prof. J. K. Wolf. UC San Diego Spring 204 / 2 Noiseless
More informationCS6304 / Analog and Digital Communication UNIT IV - SOURCE AND ERROR CONTROL CODING PART A 1. What is the use of error control coding? The main use of error control coding is to reduce the overall probability
More informationNational University of Singapore Department of Electrical & Computer Engineering. Examination for
National University of Singapore Department of Electrical & Computer Engineering Examination for EE5139R Information Theory for Communication Systems (Semester I, 2014/15) November/December 2014 Time Allowed:
More informationSource Coding: Part I of Fundamentals of Source and Video Coding
Foundations and Trends R in sample Vol. 1, No 1 (2011) 1 217 c 2011 Thomas Wiegand and Heiko Schwarz DOI: xxxxxx Source Coding: Part I of Fundamentals of Source and Video Coding Thomas Wiegand 1 and Heiko
More informationUniversal Loseless Compression: Context Tree Weighting(CTW)
Universal Loseless Compression: Context Tree Weighting(CTW) Dept. Electrical Engineering, Stanford University Dec 9, 2014 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic)
More informationTight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes
Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes Weihua Hu Dept. of Mathematical Eng. Email: weihua96@gmail.com Hirosuke Yamamoto Dept. of Complexity Sci. and Eng. Email: Hirosuke@ieee.org
More informationLecture 11: Quantum Information III - Source Coding
CSCI5370 Quantum Computing November 25, 203 Lecture : Quantum Information III - Source Coding Lecturer: Shengyu Zhang Scribe: Hing Yin Tsang. Holevo s bound Suppose Alice has an information source X that
More informationMARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for
MARKOV CHAINS A finite state Markov chain is a sequence S 0,S 1,... of discrete cv s from a finite alphabet S where q 0 (s) is a pmf on S 0 and for n 1, Q(s s ) = Pr(S n =s S n 1 =s ) = Pr(S n =s S n 1
More informationInformation Theory and Coding Techniques
Information Theory and Coding Techniques Lecture 1.2: Introduction and Course Outlines Information Theory 1 Information Theory and Coding Techniques Prof. Ja-Ling Wu Department of Computer Science and
More informationMultimedia. Multimedia Data Compression (Lossless Compression Algorithms)
Course Code 005636 (Fall 2017) Multimedia Multimedia Data Compression (Lossless Compression Algorithms) Prof. S. M. Riazul Islam, Dept. of Computer Engineering, Sejong University, Korea E-mail: riaz@sejong.ac.kr
More informationContext tree models for source coding
Context tree models for source coding Toward Non-parametric Information Theory Licence de droits d usage Outline Lossless Source Coding = density estimation with log-loss Source Coding and Universal Coding
More informationCS225/ECE205A, Spring 2005: Information Theory Wim van Dam Engineering 1, Room 5109
CS225/ECE205A, Spring 2005: Information Theory Wim van Dam Engineering 1, Room 5109 vandam@cs http://www.cs.ucsb.edu/~vandam/teaching/s06_cs225/ Formalities There was a mistake in earlier announcement.
More informationECE Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE)
ECE 74 - Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE) 1. A Huffman code finds the optimal codeword to assign to a given block of source symbols. (a) Show that cannot be a Huffman
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006
MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.44 Transmission of Information Spring 2006 Homework 2 Solution name username April 4, 2006 Reading: Chapter
More informationKolmogorov complexity ; induction, prediction and compression
Kolmogorov complexity ; induction, prediction and compression Contents 1 Motivation for Kolmogorov complexity 1 2 Formal Definition 2 3 Trying to compute Kolmogorov complexity 3 4 Standard upper bounds
More informationIntroduction to algebraic codings Lecture Notes for MTH 416 Fall Ulrich Meierfrankenfeld
Introduction to algebraic codings Lecture Notes for MTH 416 Fall 2014 Ulrich Meierfrankenfeld December 9, 2014 2 Preface These are the Lecture Notes for the class MTH 416 in Fall 2014 at Michigan State
More informationLecture 1: Shannon s Theorem
Lecture 1: Shannon s Theorem Lecturer: Travis Gagie January 13th, 2015 Welcome to Data Compression! I m Travis and I ll be your instructor this week. If you haven t registered yet, don t worry, we ll work
More informationIntroduction to information theory and coding
Introduction to information theory and coding Louis WEHENKEL Set of slides No 5 State of the art in data compression Stochastic processes and models for information sources First Shannon theorem : data
More informationEntropy. Probability and Computing. Presentation 22. Probability and Computing Presentation 22 Entropy 1/39
Entropy Probability and Computing Presentation 22 Probability and Computing Presentation 22 Entropy 1/39 Introduction Why randomness and information are related? An event that is almost certain to occur
More informationInformation Theory: Entropy, Markov Chains, and Huffman Coding
The University of Notre Dame A senior thesis submitted to the Department of Mathematics and the Glynn Family Honors Program Information Theory: Entropy, Markov Chains, and Huffman Coding Patrick LeBlanc
More informationInformation Theory. Coding and Information Theory. Information Theory Textbooks. Entropy
Coding and Information Theory Chris Williams, School of Informatics, University of Edinburgh Overview What is information theory? Entropy Coding Information Theory Shannon (1948): Information theory is
More informationIntroduction to information theory and coding
Introduction to information theory and coding Louis WEHENKEL Set of slides No 4 Source modeling and source coding Stochastic processes and models for information sources First Shannon theorem : data compression
More informationChapter 3: Asymptotic Equipartition Property
Chapter 3: Asymptotic Equipartition Property Chapter 3 outline Strong vs. Weak Typicality Convergence Asymptotic Equipartition Property Theorem High-probability sets and the typical set Consequences of
More informationData Compression Techniques
Data Compression Techniques Part 1: Entropy Coding Lecture 4: Asymmetric Numeral Systems Juha Kärkkäinen 08.11.2017 1 / 19 Asymmetric Numeral Systems Asymmetric numeral systems (ANS) is a recent entropy
More informationBandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet)
Compression Motivation Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet) Storage: Store large & complex 3D models (e.g. 3D scanner
More informationChapter 4. Data Transmission and Channel Capacity. Po-Ning Chen, Professor. Department of Communications Engineering. National Chiao Tung University
Chapter 4 Data Transmission and Channel Capacity Po-Ning Chen, Professor Department of Communications Engineering National Chiao Tung University Hsin Chu, Taiwan 30050, R.O.C. Principle of Data Transmission
More information