Lecture 3 : Algorithms for source coding. September 30, 2016
2 Outline
1. Huffman code; proof of optimality;
2. Coding with intervals: Shannon-Fano-Elias code and Shannon code;
3. Arithmetic coding.
3 1. Coding and decoding with a prefix code
Coding: use a table indexed by the letters.
Decoding: use the associated binary tree. Start at the root; for each bit, go left or right. When a leaf is reached, output the corresponding letter and go back to the root.
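To make the table-plus-tree picture concrete, here is a minimal Python sketch (illustrative; the four-letter code below is an assumed example, not one from the slides): encoding is a dictionary lookup, decoding walks the binary tree bit by bit and returns to the root after each leaf.

```python
# Prefix code {a: 0, b: 10, c: 110, d: 111} -- an assumed example.
ENCODE_TABLE = {"a": "0", "b": "10", "c": "110", "d": "111"}

# The associated binary tree as nested pairs (left, right); leaves are letters.
DECODE_TREE = ("a", ("b", ("c", "d")))

def encode(message):
    """Coding: concatenate the codewords read from the table."""
    return "".join(ENCODE_TABLE[letter] for letter in message)

def decode(bits):
    """Decoding: 0 = go left, 1 = go right; output the leaf, go back to the root."""
    output, node = [], DECODE_TREE
    for bit in bits:
        node = node[int(bit)]
        if isinstance(node, str):   # a leaf was reached
            output.append(node)
            node = DECODE_TREE      # back to the root
    return "".join(output)

assert decode(encode("abacd")) == "abacd"
```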
4 Optimal Code
Corollary of the Kraft and McMillan theorems:
Corollary 1. If there exists a uniquely decodable code with K words of lengths $n_1, n_2, \dots, n_K$, then there exists a prefix code with the same lengths; these lengths satisfy $\sum_{k=1}^{K} 2^{-n_k} \leq 1$.
Definition. A uniquely decodable code of a source X is optimal if there exists no other uniquely decodable code with a smaller average length.
Proposition 1. For every source there exists an optimal prefix code.
5 Huffman Code
Let X be a discrete source with alphabet $\mathcal{X} = \{a_1, \dots, a_{K-2}, a_{K-1}, a_K\}$ and probability distribution p. W.l.o.g. we can assume that $p(a_1) \geq \dots \geq p(a_{K-1}) \geq p(a_K) > 0$.
Let Y be the source with alphabet $\mathcal{Y} = \{a_1, \dots, a_{K-2}, b_{K-1}\}$ and probability distribution q given by
$q(a_k) = p(a_k)$ for $k = 1, \dots, K-2$, and $q(b_{K-1}) = p(a_{K-1}) + p(a_K)$.
Algorithm (Huffman). We compute a prefix code of X recursively.
When K = 2, the codewords are $\varphi_K(a_1) = 0$ and $\varphi_K(a_2) = 1$.
If K > 2, let $\varphi_{K-1}$ be a Huffman code for Y, and set
$\varphi_K(a_k) = \varphi_{K-1}(a_k)$ for $k = 1, \dots, K-2$,
$\varphi_K(a_{K-1}) = \varphi_{K-1}(b_{K-1}) \cdot 0$,
$\varphi_K(a_K) = \varphi_{K-1}(b_{K-1}) \cdot 1$,
where $\cdot$ denotes concatenation.
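This recursion is usually implemented iteratively with a priority queue: repeatedly merge the two least likely (super-)symbols, prepending a 0 to one side and a 1 to the other. The sketch below is illustrative; tie-breaking is arbitrary, so only the codeword lengths (and hence the average length) are guaranteed to match a particular hand-drawn tree.

```python
import heapq

def huffman_code(probs):
    """Binary Huffman code for a distribution given as {symbol: probability}.
    Returns {symbol: codeword string}."""
    if len(probs) == 1:
        return {sym: "0" for sym in probs}   # degenerate one-symbol source

    # Heap entries: (probability, tie-breaker, {symbol: codeword-so-far}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)

    while len(heap) > 1:
        # Merge the two least likely super-symbols (the recursion of the slide):
        # one subtree gets a leading 0, the other a leading 1.
        p0, _, code0 = heapq.heappop(heap)
        p1, _, code1 = heapq.heappop(heap)
        merged = {sym: "0" + w for sym, w in code0.items()}
        merged.update({sym: "1" + w for sym, w in code1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1

    return heap[0][2]
```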
6 Huffman Code: example
Source alphabet {a, b, c, d, e, f} with probabilities p(a) = 0.43, p(b) = 0.17, p(c) = 0.15, p(d) = 0.11, p(e) = 0.09, p(f) = 0.05. The slide shows the corresponding Huffman tree (internal node weights 0.14, 0.25, 0.32, 0.57, 1) and, for each letter x, a table with p(x), $-\log_2 P(x)$, the codeword $\varphi(x)$ and its length $n_x$; the resulting lengths are 1 bit for a, 3 bits for b, c, d and 4 bits for e, f.
H ≈ 2.25, $\bar{n} = 2.28$, efficiency E = 98.6%.
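A quick numerical check of the example's summary figures, using the huffman_code sketch above (the probabilities are the ones listed in the example):

```python
import math

probs = {"a": 0.43, "b": 0.17, "c": 0.15, "d": 0.11, "e": 0.09, "f": 0.05}
code = huffman_code(probs)

avg_len = sum(p * len(code[x]) for x, p in probs.items())
entropy = -sum(p * math.log2(p) for p in probs.values())

print(f"average length = {avg_len:.2f} bits")        # 2.28
print(f"entropy        = {entropy:.2f} bits")        # 2.25
print(f"efficiency     = {entropy / avg_len:.1%}")   # 98.6%
```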
7 The Huffman code is optimal
Lemma 1. For every source, there exists an optimal prefix code such that
1. if $p(x_j) > p(x_k)$ then $|\varphi(x_j)| \leq |\varphi(x_k)|$;
2. the two longest codewords have the same length and correspond to the two least likely symbols;
3. these two codewords differ only in their last bit.
8 Proof of 1
1. If $p(x_j) > p(x_k)$ then $|\varphi(x_j)| \leq |\varphi(x_k)|$.
Let $\varphi$ be an optimal code and assume that $|\varphi(x_j)| > |\varphi(x_k)|$. Let $\varphi'$ be the code obtained by swapping the codewords of $x_j$ and $x_k$. Then
$\overline{\varphi'} - \overline{\varphi} = \sum_x p(x)\,|\varphi'(x)| - \sum_x p(x)\,|\varphi(x)| = p(x_j)|\varphi(x_k)| + p(x_k)|\varphi(x_j)| - p(x_j)|\varphi(x_j)| - p(x_k)|\varphi(x_k)| = (p(x_j) - p(x_k))\,(|\varphi(x_k)| - |\varphi(x_j)|) < 0.$
This contradicts the optimality of $\varphi$.
9 Proof of 2
2. The two longest codewords have the same length.
If they did not have the same length, a bit could be removed from the longest one and the resulting code would still be a prefix code, i.e. a shorter prefix code would be obtained, contradicting optimality. By the previous point we know that the two longest codewords correspond to the least likely symbols.
10 Proof of 3
3. The two longest codewords differ only in their last bit.
Every codeword $\varphi(x)$ of maximal length has a sibling (otherwise it would be possible to shorten the code). Among the codewords of maximal length, there are two which correspond to the two least likely symbols. By swapping leaves we can arrange for these two symbols to become siblings.
11 The Huffman code is optimal
Proof (induction on the alphabet size k). Let f be an optimal code for X with $p(a_1) \geq \dots \geq p(a_{k-1}) \geq p(a_k)$; by Lemma 1 we may take $f(a_{k-1}) = m \cdot 0$ and $f(a_k) = m \cdot 1$ for some word m. We derive from f a code g on $\mathcal{Y} = \{a_1, \dots, a_{k-2}, b_{k-1}\}$:
$g(a_j) = f(a_j)$ for $j \leq k-2$, and $g(b_{k-1}) = m$.
Then $\bar{f} - \bar{g} = p(a_{k-1}) + p(a_k)$ and $\bar{\varphi}_k - \bar{\varphi}_{k-1} = p(a_{k-1}) + p(a_k)$, hence $\bar{f} - \bar{\varphi}_k = \bar{g} - \bar{\varphi}_{k-1}$.
Optimality of $\varphi_{k-1}$ $\Rightarrow$ $\bar{g} - \bar{\varphi}_{k-1} \geq 0$ $\Rightarrow$ $\bar{f} - \bar{\varphi}_k \geq 0$. Optimality of f $\Rightarrow$ $\bar{f} - \bar{\varphi}_k \leq 0$. Hence $\bar{f} - \bar{\varphi}_k = 0$, i.e. $\varphi_k$ is optimal.
12 Drawbacks of the Huffman code
Need to know the source statistics beforehand. If the wrong probability distribution q is used instead of the true distribution p, we compress at a rate of $H(p) + D(p\|q)$ instead of $H(p)$. This can be dealt with by using an adaptive algorithm that computes the statistics on the fly and updates them during the coding process.
Based on a memoryless source model. It is better to group letters into blocks of size l, for l large enough: (1) the memoryless model becomes sufficiently realistic; (2) the coding efficiency tends to 1. But the space complexity is $O(|\mathcal{X}|^l)$.
13 Huffman code in practice
Used almost everywhere in compression:
1. compression algorithms such as gzip, pkzip, winzip, bzip2;
2. compressed images such as JPEG, PNG;
3. compressed audio files such as MP3.
Even the rather recent Google compression algorithm Brotli uses it.
14 2. Interval coding
Gives a scheme in which, in a natural way, a symbol $x_i$ costs about $\log_2 \frac{1}{p(x_i)}$ bits, and therefore $\bar{\varphi} \approx H(X)$. This is the idea behind arithmetic coding.
15 2-adic numbers
In the interval [0, 1], they are of the form $\sum_{i=1}^{\infty} d_i 2^{-i}$, where $d_i \in \{0, 1\}$. We write $0.d_1 d_2 d_3 \dots$
Certain numbers have several different 2-adic representations: for instance $0.1000\ldots$ and $0.0111\ldots$ both represent 1/2. In that case we choose the smallest (finite) representation.
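To make the 2-adic representation concrete, here is a small helper (an illustrative sketch) returning the first l bits of the binary expansion of a number in [0, 1); it is reused by the coding sketches below. Exact rational arithmetic (Fraction) is used so that dyadic numbers get their finite representation.

```python
from fractions import Fraction

def binary_expansion(x, num_bits):
    """First `num_bits` bits of the 2-adic (binary) expansion of x in [0, 1)."""
    x = Fraction(x)
    bits = []
    for _ in range(num_bits):
        x *= 2
        bit = 1 if x >= 1 else 0   # next binary digit
        bits.append(str(bit))
        x -= bit
    return "".join(bits)

# 3/8 = 0.011000..., 1/3 = 0.010101...
assert binary_expansion(Fraction(3, 8), 6) == "011000"
assert binary_expansion(Fraction(1, 3), 6) == "010101"
```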
16 Shannon-Fano-Elias Code
Let X be a memoryless source with alphabet $\mathcal{X}$ and probability distribution p. Define
$S(x) = \sum_{x' < x} p(x').$
The slide shows the unit interval partitioned at the points $p(x_1)$, $p(x_1)+p(x_2)$, ..., $p(x_1)+\dots+p(x_{n-1})$: each $x_i$ corresponds to the interval $[S(x_i), S(x_i)+p(x_i)[$, whose width is $p(x_i)$. Each $x_i$ is encoded by its interval.
17 Shannon-Fano-Elias Coding
Consider the middle of the interval: $\bar{S}(x) \stackrel{\mathrm{def}}{=} S(x) + \frac{p(x)}{2}$.
For all $x \in \mathcal{X}$, let $\varphi(x)$ be the first $\lceil \log_2 \frac{1}{p(x)} \rceil + 1$ bits of the 2-adic representation of $\bar{S}(x)$.
Proposition 2. The code $\varphi$ is prefix and its average length $\bar{\varphi}$ verifies $H(X) \leq \bar{\varphi} < H(X) + 2$.
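This definition translates line by line into code. The sketch below is illustrative: it assumes the distribution is given as a {symbol: probability} dict and reuses binary_expansion from the 2-adic slide; the symbol order used for the cumulative sums can be any fixed order on the alphabet.

```python
import math
from fractions import Fraction

def sfe_code(probs):
    """Shannon-Fano-Elias code: for each symbol x, take the first
    ceil(log2(1/p(x))) + 1 bits of the midpoint S(x) + p(x)/2."""
    code = {}
    cumulative = Fraction(0)          # S(x): total probability of smaller symbols
    for sym, p in probs.items():      # any fixed order on the alphabet
        p = Fraction(p)
        length = math.ceil(-math.log2(p)) + 1
        midpoint = cumulative + p / 2
        code[sym] = binary_expansion(midpoint, length)
        cumulative += p
    return code
```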
18 A simple explanation
Lemma 2. For all reals u and v in [0, 1[ and for every integer l > 0, if $|u - v| \geq 2^{-l}$ then the first l bits of the 2-adic representations of u and v are distinct.
19 The Shannon-Fano-Elias code is prefix
Proof. W.l.o.g. we may assume $y > x$. Then
$\bar{S}(y) - \bar{S}(x) = \frac{p(y)}{2} + \sum_{x' < y} p(x') - \frac{p(x)}{2} - \sum_{x' < x} p(x') = \frac{p(y)}{2} - \frac{p(x)}{2} + \sum_{x \leq x' < y} p(x') = \frac{p(y)}{2} + \frac{p(x)}{2} + \sum_{x < x' < y} p(x') \geq \max\left(\frac{p(x)}{2}, \frac{p(y)}{2}\right) \geq \max\left(2^{-\lceil \log_2 \frac{1}{p(x)} \rceil - 1}, 2^{-\lceil \log_2 \frac{1}{p(y)} \rceil - 1}\right).$
Therefore, by Lemma 2, $\bar{S}(x)$ and $\bar{S}(y)$ differ within their first $\min(\lceil \log_2 \frac{1}{p(x)} \rceil + 1, \lceil \log_2 \frac{1}{p(y)} \rceil + 1)$ bits: $\varphi(x)$ and $\varphi(y)$ cannot be prefixes of each other.
20 Shannon-Fano-Elias Code: example
Table comparing, for each symbol $x \in \{a, b, c, d, e, f\}$: its probability p(x), the length $\lceil \log_2 \frac{1}{p(x)} \rceil + 1$, the midpoint $\bar{S}(x)$, the resulting codeword $\varphi(x)$, and the corresponding Huffman codeword.
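A table of this shape can be generated with the sketches above; the distribution below is the one from the Huffman example and is used purely for illustration.

```python
probs = {"a": 0.43, "b": 0.17, "c": 0.15, "d": 0.11, "e": 0.09, "f": 0.05}
sfe = sfe_code(probs)          # Shannon-Fano-Elias sketch above
huff = huffman_code(probs)     # Huffman sketch above

print("x   p(x)   len   SFE codeword   Huffman")
for x, p in probs.items():
    print(f"{x}   {p:.2f}   {len(sfe[x]):>3}   {sfe[x]:<12}   {huff[x]}")
```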
21 Shannon-Fano-Elias code: another example
Two tables over the alphabet {a, b, c, d}, each listing p(x), $\lceil \log_2 \frac{1}{p(x)} \rceil + 1$, $\bar{S}(x)$, $\varphi(x)$ and the Huffman codeword. In the first table the symbols are taken in the order a, b, c, d; if the last bit of $\varphi$ is deleted, the code is not prefix anymore. In the second table the symbols are taken in the order b, a, c, d; if the last bit of $\varphi$ is deleted, the code is still a prefix code.
22 Shannon Code
The Shannon code is defined in the same way as the Shannon-Fano-Elias code, with two exceptions:
the symbols are ordered by decreasing probability;
the codeword of x is formed by the first $\lceil \log_2 \frac{1}{p(x)} \rceil$ bits of $S(x)$.
(In particular, the codeword corresponding to the most likely letter consists of $\lceil \log_2 \frac{1}{p(x)} \rceil$ zeros.)
Proposition 3. The Shannon code is prefix and its average length $\bar{\varphi}$ verifies $H(X) \leq \bar{\varphi} < H(X) + 1$.
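The same skeleton as for the Shannon-Fano-Elias sketch gives the Shannon code: sort by decreasing probability, use S(x) itself instead of the midpoint, and drop the extra bit (illustrative sketch, same assumptions as before).

```python
import math
from fractions import Fraction

def shannon_code(probs):
    """Shannon code: symbols sorted by decreasing probability; the codeword of x
    is the first ceil(log2(1/p(x))) bits of S(x)."""
    code = {}
    cumulative = Fraction(0)
    for sym, p in sorted(probs.items(), key=lambda item: -item[1]):
        p = Fraction(p)
        code[sym] = binary_expansion(cumulative, math.ceil(-math.log2(p)))
        cumulative += p
    return code
```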
23 The Shannon code is prefix
Proof. For all $x, y \in \mathcal{X}$ such that $x < y$ (therefore $\lceil \log_2 \frac{1}{p(y)} \rceil \geq \lceil \log_2 \frac{1}{p(x)} \rceil$, since the symbols are sorted by decreasing probability):
$S(y) - S(x) = \sum_{z < y} p(z) - \sum_{z < x} p(z) = \sum_{x \leq z < y} p(z) = p(x) + \sum_{x < z < y} p(z) \geq p(x) \geq 2^{-\lceil \log_2 \frac{1}{p(x)} \rceil}.$
By Lemma 2, $S(x)$ and $S(y)$ therefore differ within their first $\lceil \log_2 \frac{1}{p(x)} \rceil$ bits, so neither codeword is a prefix of the other.
24 Proof for the lengths
Proposition 4. The Shannon-Fano-Elias code $\varphi$ verifies $H(X) \leq \bar{\varphi} < H(X) + 2$.
Proof. $|\varphi(x)| = \lceil \log_2 \frac{1}{p(x)} \rceil + 1 < \log_2 \frac{1}{p(x)} + 2$. The conclusion follows by taking the average.
Proposition 5. The Shannon code $\varphi$ verifies $H(X) \leq \bar{\varphi} < H(X) + 1$.
Proof. $|\varphi(x)| = \lceil \log_2 \frac{1}{p(x)} \rceil < \log_2 \frac{1}{p(x)} + 1$. The conclusion follows by taking the average.
25 The Shannon code is optimal for a dyadic distribution
In the dyadic case, the probabilities verify $p(x) = 2^{-\lceil \log_2 \frac{1}{p(x)} \rceil}$, i.e. every p(x) is of the form $2^{-k}$ with k an integer. In this case the encoding lengths satisfy $|\varphi(x)| = \lceil \log_2 \frac{1}{p(x)} \rceil = \log_2 \frac{1}{p(x)}$.
By taking the average, $\sum_x p(x)\,|\varphi(x)| = -\sum_x p(x) \log_2 p(x) = H(X)$.
As for the Huffman code: average coding length = entropy.
26 Non-optimality of Shannon's code
For the following source:
1. output 1 with probability $1 - 2^{-10}$;
2. output 0 with probability $2^{-10}$.
The Shannon code encodes 1 with a codeword of length $\lceil \log_2 \frac{1}{1 - 2^{-10}} \rceil = 1$, and it encodes 0 with a codeword of length $\lceil \log_2 2^{10} \rceil = 10$.
The Huffman code instead uses the codewords 1 and 0, which is of course optimal.
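A quick computation of the resulting average lengths (illustrative; probabilities as in the example):

```python
p1, p0 = 1 - 2**-10, 2**-10
shannon_avg = p1 * 1 + p0 * 10        # codeword lengths 1 and 10
huffman_avg = p1 * 1 + p0 * 1         # codewords 1 and 0
print(shannon_avg, huffman_avg)       # ~1.0088 vs 1.0
```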
27 3. Arithmetic coding
One of the main drawbacks of the Huffman code is its space complexity when blocks of length l are formed from the alphabet letters (space complexity $C = O(|\mathcal{X}|^l)$).
r = redundancy $\stackrel{\mathrm{def}}{=}$ 1 - efficiency. For the Huffman code on blocks of size l: $r = O\!\left(\frac{1}{l}\right) = O\!\left(\frac{1}{\log C}\right)$.
Arithmetic coding allows working on blocks of arbitrary size with acceptable algorithmic cost. Here again $r = O\!\left(\frac{1}{l}\right)$, but the dependency of the space complexity on l is much better.
28 The fundamental idea
Instead of encoding the symbols of $\mathcal{X}$, work directly on $\mathcal{X}^l$, where l is the length of the word to be encoded. To encode $x_1, \dots, x_l$:
1. compute the interval $[S(x_1, \dots, x_l),\ S(x_1, \dots, x_l) + p(x_1, \dots, x_l)[$, with
$S(x_1, \dots, x_l) \stackrel{\mathrm{def}}{=} \sum_{(y_1, \dots, y_l) < (x_1, \dots, x_l)} p(y_1, \dots, y_l);$
2. encode $(x_1, \dots, x_l)$ by an element of this interval whose 2-adic representation has length $\lceil \log_2 \frac{1}{p(x_1, \dots, x_l)} \rceil + 1$.
This identifies the interval, so $x_1 \dots x_l$ can be deduced at the decoding step.
29 Advantage of arithmetic encoding
codelength$(x_1, \dots, x_l) = \log_2 \frac{1}{p(x_1, \dots, x_l)} + O(1)$.
Only O(1) bits of overhead are used to encode the l symbols, and not O(l) as before!
Very interesting in the previous example, where 0s are generated with probability $2^{-10}$: the average coding length becomes $h(2^{-10})\, l + O(1) \approx 0.011\, l$ instead of l for the Huffman code (h denotes the binary entropy function).
30 The major problem for implementing arithmetic coding
We need to compute $S(x_1, \dots, x_l)$ with a precision of $\lceil \log_2 \frac{1}{p(x_1, \dots, x_l)} \rceil + 1$ bits, i.e. perform computations with arbitrarily large precision.
31 The key idea that allows working with blocks of size l
Principle: efficient computation of $p(x_1, \dots, x_l)$ and $S(x_1, \dots, x_l)$.
Computation of $p(x_1, \dots, x_l)$ (memoryless source): $p(x_1, \dots, x_l) = p(x_1) \cdots p(x_l)$.
Iterative computation of $S(x_1, \dots, x_l)$, with the lexicographic order on $\mathcal{X}^l$:
$S(x_1, \dots, x_l) \stackrel{\mathrm{def}}{=} \sum_{(y_1, \dots, y_l) < (x_1, \dots, x_l)} p(y_1, \dots, y_l).$
Proposition 6. $S(x_1, \dots, x_l) = S(x_1, \dots, x_{l-1}) + p(x_1, \dots, x_{l-1})\, S(x_l)$.
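Proposition 6 turns block encoding into a left-to-right scan over the message. The sketch below is illustrative: exact rational arithmetic (Fraction) stands in for the arbitrary-precision computation mentioned on the previous slide, binary_expansion is the helper from the 2-adic slide, and the emitted point is the truncated midpoint of the interval, as in the Shannon-Fano-Elias construction.

```python
import math
from fractions import Fraction

def arithmetic_encode(message, probs):
    """Encode a block of symbols from a memoryless source given as {symbol: prob}.
    Returns ceil(log2(1/p(block))) + 1 bits identifying the block's interval."""
    # Cumulative probabilities S(x) of the single symbols, in a fixed order.
    symbols = sorted(probs)
    S_single, acc = {}, Fraction(0)
    for s in symbols:
        S_single[s] = acc
        acc += Fraction(probs[s])

    S, p = Fraction(0), Fraction(1)
    for x in message:
        # Proposition 6: S(x_1..x_i) = S(x_1..x_{i-1}) + p(x_1..x_{i-1}) S(x_i)
        S += p * S_single[x]
        p *= Fraction(probs[x])

    length = math.ceil(-math.log2(p)) + 1
    return binary_expansion(S + p / 2, length)   # a point inside [S, S + p)

probs = {"a": Fraction(1, 2), "b": Fraction(1, 6), "c": Fraction(1, 3)}
print(arithmetic_encode("abca", probs))          # '01010000' (encodes 5/16)
```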
32 Proof
Let m be the smallest element of the alphabet. Then
$S(x_1, \dots, x_l) = \sum_{(y_1, \dots, y_l) < (x_1, \dots, x_l)} p(y_1, y_2, \dots, y_l) = A + B,$
where
$A \stackrel{\mathrm{def}}{=} \sum_{(y_1, \dots, y_l) < (x_1, \dots, x_{l-1}, m)} p(y_1, y_2, \dots, y_l)$ and $B \stackrel{\mathrm{def}}{=} \sum_{(x_1, \dots, x_{l-1}, m) \leq (y_1, \dots, y_l) < (x_1, \dots, x_l)} p(y_1, y_2, \dots, y_l).$
33 Proof (cont'd)
$A = \sum_{(y_1, \dots, y_l) < (x_1, \dots, x_{l-1}, m)} p(y_1, y_2, \dots, y_l) = \sum_{y_1, \dots, y_l : (y_1, \dots, y_{l-1}) < (x_1, \dots, x_{l-1})} p(y_1, y_2, \dots, y_{l-1})\, p(y_l) = \sum_{(y_1, \dots, y_{l-1}) < (x_1, \dots, x_{l-1})} p(y_1, y_2, \dots, y_{l-1}) \sum_{y_l} p(y_l) = \sum_{(y_1, \dots, y_{l-1}) < (x_1, \dots, x_{l-1})} p(y_1, y_2, \dots, y_{l-1}) = S(x_1, \dots, x_{l-1}).$
34 Proof (end)
$B = \sum_{(x_1, \dots, x_{l-1}, m) \leq (y_1, \dots, y_l) < (x_1, \dots, x_l)} p(y_1, y_2, \dots, y_l) = \sum_{m \leq y_l < x_l} p(x_1, \dots, x_{l-1}, y_l) = \sum_{m \leq y_l < x_l} p(x_1, \dots, x_{l-1})\, p(y_l) = p(x_1, \dots, x_{l-1})\, S(x_l).$
35 fractals...
Example: Prob(a) = 1/2, Prob(b) = 1/6, Prob(c) = 1/3. The slide shows the unit interval [0, 1] divided into the three intervals associated with a, b and c.
36 fractals
Same example: each of the intervals is subdivided in the same proportions, giving the intervals associated with the pairs aa, ab, ac, ba, bb, bc, ca, cb, cc.
37 coding
$S(x_1, \dots, x_l) = \underbrace{S(x_1, \dots, x_{l-1})}_{\text{beginning of the previous interval}} + \underbrace{p(x_1, \dots, x_{l-1})}_{\text{length of the previous interval}}\, S(x_l).$
Therefore, if the current interval is [a, b[ and $x_i$ is read, then
$a_{\mathrm{new}} = a + (b - a)\, S(x_i)$ and $b_{\mathrm{new}} = a_{\mathrm{new}} + (b - a)\, p(x_i)$.
The width of the interval becomes $(b - a)\, p(x_i) = p(x_1) \cdots p(x_{i-1})\, p(x_i) = p(x_1, \dots, x_i)$.
38 Zooming
Let [a, b[ be the current interval.
If $b \leq 1/2$: $[a, b[\ \to\ [2a, 2b[$, output "0";
If $a \geq 1/2$: $[a, b[\ \to\ [2a - 1, 2b - 1[$, output "1";
If $1/4 \leq a < b \leq 3/4$: $[a, b[\ \to\ [2a - 1/2, 2b - 1/2[$, no output (underflow expansion), but keep this in mind with the help of a counter.
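A floating-point sketch of the rescaling loop (illustrative; `pending` is the underflow counter mentioned above, and the postponed bits are flushed with the next decided bit), together with the per-symbol interval update of the previous slide:

```python
def renormalize(a, b, pending, out):
    """Rescale [a, b) and emit decided bits until the interval is no longer
    confined to [0, 1/2), [1/2, 1) or the middle region [1/4, 3/4]."""
    while True:
        if b <= 0.5:                            # next bit is 0
            out.append("0" + "1" * pending)
            a, b, pending = 2 * a, 2 * b, 0
        elif a >= 0.5:                          # next bit is 1
            out.append("1" + "0" * pending)
            a, b, pending = 2 * a - 1, 2 * b - 1, 0
        elif 0.25 <= a and b <= 0.75:           # underflow expansion: postpone
            a, b, pending = 2 * a - 0.5, 2 * b - 0.5, pending + 1
        else:
            return a, b, pending

def encode_symbol(a, b, s_x, p_x, pending, out):
    """One encoder step: narrow [a, b) to the symbol's sub-interval
    (s_x = S(x), p_x = p(x)), then rescale."""
    width = b - a
    a, b = a + width * s_x, a + width * (s_x + p_x)
    return renormalize(a, b, pending, out)
```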
39 Decoding
Let $r = \varphi(x_1, \dots, x_l)$ be the real number whose 2-adic representation encodes $(x_1, \dots, x_l) \in \mathcal{X}^l$. The letters $x_1, \dots, x_l$ are the only ones for which
$S(x_1) < r < S(x_1) + p(x_1)$
$S(x_1, x_2) < r < S(x_1, x_2) + p(x_1, x_2)$
$\vdots$
$S(x_1, \dots, x_l) < r < S(x_1, \dots, x_l) + p(x_1, \dots, x_l)$
40 Procedure
$S(x_1) < r < S(x_1) + p(x_1)$, so $x_1$ is found by locating r among the single-symbol intervals. Suppose that $x_1$ was found. We have
$S(x_1, x_2) < r < S(x_1, x_2) + p(x_1, x_2)$
$\Leftrightarrow S(x_1) + p(x_1) S(x_2) < r < S(x_1) + p(x_1) S(x_2) + p(x_1) p(x_2)$
$\Leftrightarrow p(x_1) S(x_2) < r - S(x_1) < p(x_1) (S(x_2) + p(x_2))$
$\Leftrightarrow S(x_2) < \frac{r - S(x_1)}{p(x_1)} < S(x_2) + p(x_2)$
We therefore have to find out in which interval $r_1 = \frac{r - S(x_1)}{p(x_1)}$ lies.
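This successive-refinement procedure translates directly into a decoder (illustrative sketch; r is kept as an exact Fraction so that the interval tests are exact). The value 5/16 used in the check is the one produced by the block-encoding sketch above for the message "abca".

```python
from fractions import Fraction

def arithmetic_decode(r, probs, length):
    """Recover `length` symbols from the real number r in [0, 1)."""
    symbols = sorted(probs)               # same symbol order as the encoder
    message = []
    for _ in range(length):
        cumulative = Fraction(0)
        for x in symbols:
            p = Fraction(probs[x])
            if cumulative <= r < cumulative + p:   # S(x) <= r < S(x) + p(x)
                message.append(x)
                r = (r - cumulative) / p           # zoom: r_1 = (r - S(x)) / p(x)
                break
            cumulative += p
    return "".join(message)

probs = {"a": Fraction(1, 2), "b": Fraction(1, 6), "c": Fraction(1, 3)}
assert arithmetic_decode(Fraction(5, 16), probs, 4) == "abca"
```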