Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2

Size: px
Start display at page:

Download "Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2"

Transcription

1 Data Compression Techniques (Spring 22) Model Solutions for Exercise 2 If you have any feedback or corrections, please contact nvalimak at cs.helsinki.fi.. Problem: Construct a canonical prefix code for the alphabet {a, b, c, d, e, f} with the following codeword lengths symbol a b c d e f length Specify also the intervals associated with the codewords. Solution: First, sort the alphabet by their code lengths. The resulting intervals and codewords are: symbol c b d f a e length [ [ interval, 2) 2, 8) 5 [ 5 8, 3 [ 3 4) 4, 7 ) [ 7 8 8, 5 ) [ 5 6 6, ) code 2. Problem: Give the codes γ(n), δ(n) and GR 4 (n) for n {, 2, 4}. Solution: Let n =, and k = log(n + ) = log = 6. γ() = unary(k) binary k (n 2 k + ) = unary(6) binary 6 ( ) = { { δ() = γ(k) binary k (n 2 k + ) = γ(6) binary 6 ( ) = { { Let k = 4 and q = n 2 k = 2 4 = 6. GR 4 () = unary(q) binary k (n q 2 k ) = unary(6) binary 4 ( ) = {{ Let n = 2, and k = log(n + ) = log 2 = 7. γ(2) = unary(k) binary k (n 2 k + ) = unary(7) binary 7 ( ) = { { δ(2) = γ(k) binary k (n 2 k + ) = γ(7) binary 7 ( ) = { {

2 Let k = 4 and q = n 2 k = = 2. GR 4 (2) = unary(q) binary k (n q 2 k ) = unary(2) binary 4 ( ) = { { Let n = 4, and k = log(n + ) = log 4 = 8. γ(4) = unary(k) binary k (n 2 k + ) = unary(8) binary 8 ( ) = { { δ(4) = γ(k) binary k (n 2 k + ) = γ(8) binary 8 ( ) = {{ Let k = 4 and q = n 2 k = = 25. GR 4 (4) = unary(q) binary k (n q 2 k ) = unary(25) binary 4 ( ) = {{ 3. Problem: Use the Shannon-Fano algorithm (Exercise.3) to construct a prefix code for the set N of nonnegative integers and give the codes for the integers -9 using the following probability distributions. (a) P (n) = ( p)p n, where p = 2 /4.84. (b) P (n) = (n + ) 2 /ζ(2), where ζ(2) = π 2 /6.645 (ζ is the Riemann zeta function). Solution: Below is a Python3 solution for the problem. The code words are in Table 3. Figure 3 shows the recursion tree of the algorithm for part (a). Note: the self-information of n in part (a) is log(( p)p n ) n/ , which is close to the code length of the GR 2 code, which is n/ Correspondingly, the self-information of n in part (b) is log((n+) 2 /ζ(2)) 2 log(n+)+.7, which is close to the length of the gamma code of n, which is 2 log(n+) +. import math 2 3 def P(x): 4 p = 2**(-/4) 5 return (-p) * p**x 6 7 def P2(x): 8 return (x+)**(-2) / (math.pi**2 / 6) 9 # A recursive Shannon-Fano coding function. Parameters: # P: a probability distribution for the integers 2 # [left, right]: range of integers to be coded 3 # total: total probability of integers in the range [left, right] 2

3 4 # codeword: longest prefix shared by all codewords in the range [left, right] 5 def shannon_fano(p, left, right, total, codeword): 6 7 if left > 9: 8 # Only interested in codewords for integers -9 9 return 2 2 if left == right: # Recursion bottom 22 print(str(left) + ": " + codeword) 23 return # Find the best split point 26 best_split = # Last index in the left side of the best split 27 split_sum = # Sum of probabilities in the left side of the split 28 smallest_bias = # Bias means abs(split_sum - total/2) 29 for x in range(left,right+): 3 split_sum += P(x) 3 if abs(split_sum - total/2) < smallest_bias: 32 smallest_bias = abs(split_sum - total/2) 33 best_split = x 34 else: break # Recurse into the left and right parts 37 left_total = sum(map(p, [x for x in range(left,best_split+)])) 38 right_total = total - left_total 39 shannon_fano(p, left, best_split, left_total, codeword + "") 4 shannon_fano(p, best_split+, right, right_total, codeword + "") 4 42 shannon_fano(p,,**9,,"") 43 shannon_fano(p2,,**9,,"") number (a) codes (b) codes Table : Code words generated by the Shannon-Fano algorithm 4. Problem: Let x, x 2,..., x n be a sequence of n non-negative integers with an average value x. Show that if we encode the sequence using a Golomb-Rice code GR k with a suitable k, the average code length is less than log( x + ) + 2. Compare the result to Elias δ code length. Solution: The average length of the code words for x, x 2,..., x n is: n GR k (x) = n ( x i 2 k + k + ) n ( x i 2 k + k + ) = 2 k ( n x i ) + n 3 (k + ) = x 2 k + k + ()

4 ,,2,3, /2,,2,3 4,5,6,... /4, 2,3 4,5,6,7 8,9,, / ,5 6,7 8,9,, / ,9... / Figure : Recursion of the Shannon-Fano algorithn in problem 3a. Selecting k = log 2 x gives the bound log 2 ( x) + 2. However, if x is close to zero, then k = log 2 ( x) is negative, which does not work, because parameter k can not be negative. We can fix this by choosing k = log 2 ( x + ), which gives the desired bound log 2 ( x + ) + 2. Let s now compare this result with the code lengths of delta coding. The length of the delta code of an integer x is log(x + 2) + 2 log log(x + 2), which is asymptotically larger than log 2 (x + ) + 2 because of the loglog term in the delta coding. See Figure 4 for a plot of both functions for comparison. Note: the choice k = log 2 x gives the bound that was asked for in the problem statement, but is this choice of k optimal? To find the best possible k, let us find the zero of the derivative of the average code length. The derivative is: d dk ( x 2 k + k + ) = x ln(2)e k ln(2) + Setting this to zero and solving for k gives k = log 2 ( x) + log 2 ln 2. Since log 2 ln 2 is only approximately.69, we see that k = log 2 ( x + ) was quite close to optimal. 5. Problem: Assume that the model of computation supports the following operations (in addition to standard arithmetic operations) in constant time: Read up to w consecutive bits starting from any position in a binary string, and interpret the bits as a binary representation of an integer. Determine the position of the left-most -bit in a w-bit integer. Suppose we have encoded a sequence of integers in the range [..2 w ] using the Elias -code. Show how the sequence can be decoded in constant time per integer. Solution: Suppose we want to decode a bit string B representing a concatenation of gammacoded integers. The number of leading ones in B is the parameter k in the gamma code of the first integer. The number of leading ones can be computed by computing the left-most -bit in B using second the operation given in the problem statement. The k bits after the zero are the binary part of the gamma code of the first integer, which can be interpreted using the first operation in the problem statement. In other words, the value of the first integer is: 2 k + InterpretBinary(B[(k + )..2k]) 4

5 Golomb-Rice Elias delta Figure 2: Golomb rice ( log 2 (x + ) + 2) versus Elias delta ( log(x + 2) + 2 log log(x + 2) ). After this we can forget about the first 2k + bits of B and repeat the procedure to decode the next integer, until all integers coded in B have been decoded. 6. Problem: What is (a) the entropy (b) the average Huffman code length of the following distribution symbol a b c d e f g prob. /6 /5 /2 /5 /2 /8 /4 Solution: The entropy is H(P ) 2.5 and the average Huffman code length is about See the lecture slides for details. 5

Lec 03 Entropy and Coding II Hoffman and Golomb Coding

Lec 03 Entropy and Coding II Hoffman and Golomb Coding CS/EE 5590 / ENG 40 Special Topics Multimedia Communication, Spring 207 Lec 03 Entropy and Coding II Hoffman and Golomb Coding Zhu Li Z. Li Multimedia Communciation, 207 Spring p. Outline Lecture 02 ReCap

More information

Chapter 3 Source Coding. 3.1 An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code

Chapter 3 Source Coding. 3.1 An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code Chapter 3 Source Coding 3. An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code 3. An Introduction to Source Coding Entropy (in bits per symbol) implies in average

More information

Entropy as a measure of surprise

Entropy as a measure of surprise Entropy as a measure of surprise Lecture 5: Sam Roweis September 26, 25 What does information do? It removes uncertainty. Information Conveyed = Uncertainty Removed = Surprise Yielded. How should we quantify

More information

Lecture 16. Error-free variable length schemes (contd.): Shannon-Fano-Elias code, Huffman code

Lecture 16. Error-free variable length schemes (contd.): Shannon-Fano-Elias code, Huffman code Lecture 16 Agenda for the lecture Error-free variable length schemes (contd.): Shannon-Fano-Elias code, Huffman code Variable-length source codes with error 16.1 Error-free coding schemes 16.1.1 The Shannon-Fano-Elias

More information

Lecture 3 : Algorithms for source coding. September 30, 2016

Lecture 3 : Algorithms for source coding. September 30, 2016 Lecture 3 : Algorithms for source coding September 30, 2016 Outline 1. Huffman code ; proof of optimality ; 2. Coding with intervals : Shannon-Fano-Elias code and Shannon code ; 3. Arithmetic coding. 1/39

More information

Chapter 5: Data Compression

Chapter 5: Data Compression Chapter 5: Data Compression Definition. A source code C for a random variable X is a mapping from the range of X to the set of finite length strings of symbols from a D-ary alphabet. ˆX: source alphabet,

More information

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3 3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2011, 28 November 2011 Memoryless Sources Arithmetic Coding Sources with Memory 2 / 19 Summary of last lecture Prefix-free

More information

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3 3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2013, 29 November 2013 Memoryless Sources Arithmetic Coding Sources with Memory Markov Example 2 / 21 Encoding the output

More information

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code Chapter 2 Date Compression: Source Coding 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code 2.1 An Introduction to Source Coding Source coding can be seen as an efficient way

More information

Lecture 4 : Adaptive source coding algorithms

Lecture 4 : Adaptive source coding algorithms Lecture 4 : Adaptive source coding algorithms February 2, 28 Information Theory Outline 1. Motivation ; 2. adaptive Huffman encoding ; 3. Gallager and Knuth s method ; 4. Dictionary methods : Lempel-Ziv

More information

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding SIGNAL COMPRESSION Lecture 3 4.9.2007 Shannon-Fano-Elias Codes and Arithmetic Coding 1 Shannon-Fano-Elias Coding We discuss how to encode the symbols {a 1, a 2,..., a m }, knowing their probabilities,

More information

Motivation for Arithmetic Coding

Motivation for Arithmetic Coding Motivation for Arithmetic Coding Motivations for arithmetic coding: 1) Huffman coding algorithm can generate prefix codes with a minimum average codeword length. But this length is usually strictly greater

More information

PART III. Outline. Codes and Cryptography. Sources. Optimal Codes (I) Jorge L. Villar. MAMME, Fall 2015

PART III. Outline. Codes and Cryptography. Sources. Optimal Codes (I) Jorge L. Villar. MAMME, Fall 2015 Outline Codes and Cryptography 1 Information Sources and Optimal Codes 2 Building Optimal Codes: Huffman Codes MAMME, Fall 2015 3 Shannon Entropy and Mutual Information PART III Sources Information source:

More information

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols.

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols. Universal Lossless coding Lempel-Ziv Coding Basic principles of lossless compression Historical review Variable-length-to-block coding Lempel-Ziv coding 1 Basic Principles of Lossless Coding 1. Exploit

More information

Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University

Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University Huffman Coding C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University http://www.csie.nctu.edu.tw/~cmliu/courses/compression/ Office: EC538 (03)573877 cmliu@cs.nctu.edu.tw

More information

10-704: Information Processing and Learning Fall Lecture 10: Oct 3

10-704: Information Processing and Learning Fall Lecture 10: Oct 3 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 0: Oct 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of

More information

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 Please submit the solutions on Gradescope. EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 1. Optimal codeword lengths. Although the codeword lengths of an optimal variable length code

More information

EE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1)

EE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1) EE5585 Data Compression January 29, 2013 Lecture 3 Instructor: Arya Mazumdar Scribe: Katie Moenkhaus Uniquely Decodable Codes Recall that for a uniquely decodable code with source set X, if l(x) is the

More information

Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet)

Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet) Compression Motivation Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet) Storage: Store large & complex 3D models (e.g. 3D scanner

More information

U Logo Use Guidelines

U Logo Use Guidelines COMP2610/6261 - Information Theory Lecture 15: Arithmetic Coding U Logo Use Guidelines Mark Reid and Aditya Menon logo is a contemporary n of our heritage. presents our name, d and our motto: arn the nature

More information

EE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018

EE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018 Please submit the solutions on Gradescope. Some definitions that may be useful: EE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018 Definition 1: A sequence of random variables X

More information

Data Compression Techniques

Data Compression Techniques Data Compression Techniques Part 1: Entropy Coding Lecture 4: Asymmetric Numeral Systems Juha Kärkkäinen 08.11.2017 1 / 19 Asymmetric Numeral Systems Asymmetric numeral systems (ANS) is a recent entropy

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 AEP Asymptotic Equipartition Property AEP In information theory, the analog of

More information

CSEP 590 Data Compression Autumn Arithmetic Coding

CSEP 590 Data Compression Autumn Arithmetic Coding CSEP 590 Data Compression Autumn 2007 Arithmetic Coding Reals in Binary Any real number x in the interval [0,1) can be represented in binary as.b 1 b 2... where b i is a bit. x 0 0 1 0 1... binary representation

More information

Lec 05 Arithmetic Coding

Lec 05 Arithmetic Coding ECE 5578 Multimedia Communication Lec 05 Arithmetic Coding Zhu Li Dept of CSEE, UMKC web: http://l.web.umkc.edu/lizhu phone: x2346 Z. Li, Multimedia Communciation, 208 p. Outline Lecture 04 ReCap Arithmetic

More information

Lecture 1: Shannon s Theorem

Lecture 1: Shannon s Theorem Lecture 1: Shannon s Theorem Lecturer: Travis Gagie January 13th, 2015 Welcome to Data Compression! I m Travis and I ll be your instructor this week. If you haven t registered yet, don t worry, we ll work

More information

Lecture 6: Kraft-McMillan Inequality and Huffman Coding

Lecture 6: Kraft-McMillan Inequality and Huffman Coding EE376A/STATS376A Information Theory Lecture 6-0/25/208 Lecture 6: Kraft-McMillan Inequality and Huffman Coding Lecturer: Tsachy Weissman Scribe: Akhil Prakash, Kai Yee Wan In this lecture, we begin with

More information

DCSP-3: Minimal Length Coding. Jianfeng Feng

DCSP-3: Minimal Length Coding. Jianfeng Feng DCSP-3: Minimal Length Coding Jianfeng Feng Department of Computer Science Warwick Univ., UK Jianfeng.feng@warwick.ac.uk http://www.dcs.warwick.ac.uk/~feng/dcsp.html Automatic Image Caption (better than

More information

Entropy Coding. Connectivity coding. Entropy coding. Definitions. Lossles coder. Input: a set of symbols Output: bitstream. Idea

Entropy Coding. Connectivity coding. Entropy coding. Definitions. Lossles coder. Input: a set of symbols Output: bitstream. Idea Connectivity coding Entropy Coding dd 7, dd 6, dd 7, dd 5,... TG output... CRRRLSLECRRE Entropy coder output Connectivity data Edgebreaker output Digital Geometry Processing - Spring 8, Technion Digital

More information

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE General e Image Coder Structure Motion Video x(s 1,s 2,t) or x(s 1,s 2 ) Natural Image Sampling A form of data compression; usually lossless, but can be lossy Redundancy Removal Lossless compression: predictive

More information

Shannon-Fano-Elias coding

Shannon-Fano-Elias coding Shannon-Fano-Elias coding Suppose that we have a memoryless source X t taking values in the alphabet {1, 2,..., L}. Suppose that the probabilities for all symbols are strictly positive: p(i) > 0, i. The

More information

Chapter 2: Source coding

Chapter 2: Source coding Chapter 2: meghdadi@ensil.unilim.fr University of Limoges Chapter 2: Entropy of Markov Source Chapter 2: Entropy of Markov Source Markov model for information sources Given the present, the future is independent

More information

1 Introduction to information theory

1 Introduction to information theory 1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through

More information

Information Theory and Statistics Lecture 2: Source coding

Information Theory and Statistics Lecture 2: Source coding Information Theory and Statistics Lecture 2: Source coding Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Injections and codes Definition (injection) Function f is called an injection

More information

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes Information Theory with Applications, Math6397 Lecture Notes from September 3, 24 taken by Ilknur Telkes Last Time Kraft inequality (sep.or) prefix code Shannon Fano code Bound for average code-word length

More information

Lec 04 Variable Length Coding (VLC) in JPEG

Lec 04 Variable Length Coding (VLC) in JPEG ECE 5578 Multimedia Communication Lec 04 Variable Length Coding (VLC) in JPEG Zhu Li Dept of CSEE, UMKC Z. Li Multimedia Communciation, 2018 p.1 Outline Lecture 03 ReCap VLC JPEG Image Coding Framework

More information

Lecture 3. Mathematical methods in communication I. REMINDER. A. Convex Set. A set R is a convex set iff, x 1,x 2 R, θ, 0 θ 1, θx 1 + θx 2 R, (1)

Lecture 3. Mathematical methods in communication I. REMINDER. A. Convex Set. A set R is a convex set iff, x 1,x 2 R, θ, 0 θ 1, θx 1 + θx 2 R, (1) 3- Mathematical methods in communication Lecture 3 Lecturer: Haim Permuter Scribe: Yuval Carmel, Dima Khaykin, Ziv Goldfeld I. REMINDER A. Convex Set A set R is a convex set iff, x,x 2 R, θ, θ, θx + θx

More information

Summary of Last Lectures

Summary of Last Lectures Lossless Coding IV a k p k b k a 0.16 111 b 0.04 0001 c 0.04 0000 d 0.16 110 e 0.23 01 f 0.07 1001 g 0.06 1000 h 0.09 001 i 0.15 101 100 root 1 60 1 0 0 1 40 0 32 28 23 e 17 1 0 1 0 1 0 16 a 16 d 15 i

More information

ECE Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE)

ECE Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE) ECE 74 - Advanced Communication Theory, Spring 2009 Homework #1 (INCOMPLETE) 1. A Huffman code finds the optimal codeword to assign to a given block of source symbols. (a) Show that cannot be a Huffman

More information

Data Compression. Limit of Information Compression. October, Examples of codes 1

Data Compression. Limit of Information Compression. October, Examples of codes 1 Data Compression Limit of Information Compression Radu Trîmbiţaş October, 202 Outline Contents Eamples of codes 2 Kraft Inequality 4 2. Kraft Inequality............................ 4 2.2 Kraft inequality

More information

EE376A - Information Theory Midterm, Tuesday February 10th. Please start answering each question on a new page of the answer booklet.

EE376A - Information Theory Midterm, Tuesday February 10th. Please start answering each question on a new page of the answer booklet. EE376A - Information Theory Midterm, Tuesday February 10th Instructions: You have two hours, 7PM - 9PM The exam has 3 questions, totaling 100 points. Please start answering each question on a new page

More information

Lecture 1 : Data Compression and Entropy

Lecture 1 : Data Compression and Entropy CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for

More information

Exercise 1. = P(y a 1)P(a 1 )

Exercise 1. = P(y a 1)P(a 1 ) Chapter 7 Channel Capacity Exercise 1 A source produces independent, equally probable symbols from an alphabet {a 1, a 2 } at a rate of one symbol every 3 seconds. These symbols are transmitted over a

More information

ECE 587 / STA 563: Lecture 5 Lossless Compression

ECE 587 / STA 563: Lecture 5 Lossless Compression ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 2017 Author: Galen Reeves Last Modified: October 18, 2017 Outline of lecture: 5.1 Introduction to Lossless Source

More information

CMPT 365 Multimedia Systems. Lossless Compression

CMPT 365 Multimedia Systems. Lossless Compression CMPT 365 Multimedia Systems Lossless Compression Spring 2017 Edited from slides by Dr. Jiangchuan Liu CMPT365 Multimedia Systems 1 Outline Why compression? Entropy Variable Length Coding Shannon-Fano Coding

More information

ECE 587 / STA 563: Lecture 5 Lossless Compression

ECE 587 / STA 563: Lecture 5 Lossless Compression ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 28 Author: Galen Reeves Last Modified: September 27, 28 Outline of lecture: 5. Introduction to Lossless Source

More information

Coding of memoryless sources 1/35

Coding of memoryless sources 1/35 Coding of memoryless sources 1/35 Outline 1. Morse coding ; 2. Definitions : encoding, encoding efficiency ; 3. fixed length codes, encoding integers ; 4. prefix condition ; 5. Kraft and Mac Millan theorems

More information

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1 Kraft s inequality An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if N 2 l i 1 Proof: Suppose that we have a tree code. Let l max = max{l 1,...,

More information

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria Source Coding Master Universitario en Ingeniería de Telecomunicación I. Santamaría Universidad de Cantabria Contents Introduction Asymptotic Equipartition Property Optimal Codes (Huffman Coding) Universal

More information

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min.

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min. Huffman coding Optimal codes - I A code is optimal if it has the shortest codeword length L L m = i= pl i i This can be seen as an optimization problem min i= li subject to D m m i= lp Gabriele Monfardini

More information

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd 4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we

More information

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16 EE539R: Problem Set 4 Assigned: 3/08/6, Due: 07/09/6. Cover and Thomas: Problem 3.5 Sets defined by probabilities: Define the set C n (t = {x n : P X n(x n 2 nt } (a We have = P X n(x n P X n(x n 2 nt

More information

Homework Set #2 Data Compression, Huffman code and AEP

Homework Set #2 Data Compression, Huffman code and AEP Homework Set #2 Data Compression, Huffman code and AEP 1. Huffman coding. Consider the random variable ( x1 x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0.11 0.04 0.04 0.03 0.02 (a Find a binary Huffman code

More information

Multimedia. Multimedia Data Compression (Lossless Compression Algorithms)

Multimedia. Multimedia Data Compression (Lossless Compression Algorithms) Course Code 005636 (Fall 2017) Multimedia Multimedia Data Compression (Lossless Compression Algorithms) Prof. S. M. Riazul Islam, Dept. of Computer Engineering, Sejong University, Korea E-mail: riaz@sejong.ac.kr

More information

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have

EECS 229A Spring 2007 * * (a) By stationarity and the chain rule for entropy, we have EECS 229A Spring 2007 * * Solutions to Homework 3 1. Problem 4.11 on pg. 93 of the text. Stationary processes (a) By stationarity and the chain rule for entropy, we have H(X 0 ) + H(X n X 0 ) = H(X 0,

More information

Information Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18

Information Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18 Information Theory David Rosenberg New York University June 15, 2015 David Rosenberg (New York University) DS-GA 1003 June 15, 2015 1 / 18 A Measure of Information? Consider a discrete random variable

More information

Quantum-inspired Huffman Coding

Quantum-inspired Huffman Coding Quantum-inspired Huffman Coding A. S. Tolba, M. Z. Rashad, and M. A. El-Dosuky Dept. of Computer Science, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, Egypt. tolba_954@yahoo.com,

More information

CSEP 521 Applied Algorithms Spring Statistical Lossless Data Compression

CSEP 521 Applied Algorithms Spring Statistical Lossless Data Compression CSEP 52 Applied Algorithms Spring 25 Statistical Lossless Data Compression Outline for Tonight Basic Concepts in Data Compression Entropy Prefix codes Huffman Coding Arithmetic Coding Run Length Coding

More information

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for MARKOV CHAINS A finite state Markov chain is a sequence S 0,S 1,... of discrete cv s from a finite alphabet S where q 0 (s) is a pmf on S 0 and for n 1, Q(s s ) = Pr(S n =s S n 1 =s ) = Pr(S n =s S n 1

More information

CSCI 2570 Introduction to Nanocomputing

CSCI 2570 Introduction to Nanocomputing CSCI 2570 Introduction to Nanocomputing Information Theory John E Savage What is Information Theory Introduced by Claude Shannon. See Wikipedia Two foci: a) data compression and b) reliable communication

More information

CS4800: Algorithms & Data Jonathan Ullman

CS4800: Algorithms & Data Jonathan Ullman CS4800: Algorithms & Data Jonathan Ullman Lecture 22: Greedy Algorithms: Huffman Codes Data Compression and Entropy Apr 5, 2018 Data Compression How do we store strings of text compactly? A (binary) code

More information

Greedy. Outline CS141. Stefano Lonardi, UCR 1. Activity selection Fractional knapsack Huffman encoding Later:

Greedy. Outline CS141. Stefano Lonardi, UCR 1. Activity selection Fractional knapsack Huffman encoding Later: October 5, 017 Greedy Chapters 5 of Dasgupta et al. 1 Activity selection Fractional knapsack Huffman encoding Later: Outline Dijkstra (single source shortest path) Prim and Kruskal (minimum spanning tree)

More information

CMPT 365 Multimedia Systems. Final Review - 1

CMPT 365 Multimedia Systems. Final Review - 1 CMPT 365 Multimedia Systems Final Review - 1 Spring 2017 CMPT365 Multimedia Systems 1 Outline Entropy Lossless Compression Shannon-Fano Coding Huffman Coding LZW Coding Arithmetic Coding Lossy Compression

More information

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding SIGNAL COMPRESSION Lecture 7 Variable to Fix Encoding 1. Tunstall codes 2. Petry codes 3. Generalized Tunstall codes for Markov sources (a presentation of the paper by I. Tabus, G. Korodi, J. Rissanen.

More information

17.1 Binary Codes Normal numbers we use are in base 10, which are called decimal numbers. Each digit can be 10 possible numbers: 0, 1, 2, 9.

17.1 Binary Codes Normal numbers we use are in base 10, which are called decimal numbers. Each digit can be 10 possible numbers: 0, 1, 2, 9. ( c ) E p s t e i n, C a r t e r, B o l l i n g e r, A u r i s p a C h a p t e r 17: I n f o r m a t i o n S c i e n c e P a g e 1 CHAPTER 17: Information Science 17.1 Binary Codes Normal numbers we use

More information

Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

More information

Source Coding Techniques

Source Coding Techniques Source Coding Techniques. Huffman Code. 2. Two-pass Huffman Code. 3. Lemple-Ziv Code. 4. Fano code. 5. Shannon Code. 6. Arithmetic Code. Source Coding Techniques. Huffman Code. 2. Two-path Huffman Code.

More information

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST Information Theory Lecture 5 Entropy rate and Markov sources STEFAN HÖST Universal Source Coding Huffman coding is optimal, what is the problem? In the previous coding schemes (Huffman and Shannon-Fano)it

More information

Data Compression Techniques

Data Compression Techniques Data Compression Techniques Part 2: Text Compression Lecture 5: Context-Based Compression Juha Kärkkäinen 14.11.2017 1 / 19 Text Compression We will now look at techniques for text compression. These techniques

More information

Error Detection and Correction: Hamming Code; Reed-Muller Code

Error Detection and Correction: Hamming Code; Reed-Muller Code Error Detection and Correction: Hamming Code; Reed-Muller Code Greg Plaxton Theory in Programming Practice, Spring 2005 Department of Computer Science University of Texas at Austin Hamming Code: Motivation

More information

Exercises with solutions (Set B)

Exercises with solutions (Set B) Exercises with solutions (Set B) 3. A fair coin is tossed an infinite number of times. Let Y n be a random variable, with n Z, that describes the outcome of the n-th coin toss. If the outcome of the n-th

More information

Digital Communications III (ECE 154C) Introduction to Coding and Information Theory

Digital Communications III (ECE 154C) Introduction to Coding and Information Theory Digital Communications III (ECE 54C) Introduction to Coding and Information Theory Tara Javidi These lecture notes were originally developed by late Prof. J. K. Wolf. UC San Diego Spring 204 / 2 Noiseless

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 13 Competitive Optimality of the Shannon Code So, far we have studied

More information

(Classical) Information Theory II: Source coding

(Classical) Information Theory II: Source coding (Classical) Information Theory II: Source coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract The information content of a random variable

More information

UNIT I INFORMATION THEORY. I k log 2

UNIT I INFORMATION THEORY. I k log 2 UNIT I INFORMATION THEORY Claude Shannon 1916-2001 Creator of Information Theory, lays the foundation for implementing logic in digital circuits as part of his Masters Thesis! (1939) and published a paper

More information

Information & Correlation

Information & Correlation Information & Correlation Jilles Vreeken 11 June 2014 (TADA) Questions of the day What is information? How can we measure correlation? and what do talking drums have to do with this? Bits and Pieces What

More information

Introduction to Information Theory. By Prof. S.J. Soni Asst. Professor, CE Department, SPCE, Visnagar

Introduction to Information Theory. By Prof. S.J. Soni Asst. Professor, CE Department, SPCE, Visnagar Introduction to Information Theory By Prof. S.J. Soni Asst. Professor, CE Department, SPCE, Visnagar Introduction [B.P. Lathi] Almost in all the means of communication, none produces error-free communication.

More information

1 Ex. 1 Verify that the function H(p 1,..., p n ) = k p k log 2 p k satisfies all 8 axioms on H.

1 Ex. 1 Verify that the function H(p 1,..., p n ) = k p k log 2 p k satisfies all 8 axioms on H. Problem sheet Ex. Verify that the function H(p,..., p n ) = k p k log p k satisfies all 8 axioms on H. Ex. (Not to be handed in). looking at the notes). List as many of the 8 axioms as you can, (without

More information

10-704: Information Processing and Learning Fall Lecture 9: Sept 28

10-704: Information Processing and Learning Fall Lecture 9: Sept 28 10-704: Information Processing and Learning Fall 2016 Lecturer: Siheng Chen Lecture 9: Sept 28 Note: These notes are based on scribed notes from Spring15 offering of this course. LaTeX template courtesy

More information

Generalized Kraft Inequality and Arithmetic Coding

Generalized Kraft Inequality and Arithmetic Coding J. J. Rissanen Generalized Kraft Inequality and Arithmetic Coding Abstract: Algorithms for encoding and decoding finite strings over a finite alphabet are described. The coding operations are arithmetic

More information

Coding for Discrete Source

Coding for Discrete Source EGR 544 Communication Theory 3. Coding for Discrete Sources Z. Aliyazicioglu Electrical and Computer Engineering Department Cal Poly Pomona Coding for Discrete Source Coding Represent source data effectively

More information

Reduce the amount of data required to represent a given quantity of information Data vs information R = 1 1 C

Reduce the amount of data required to represent a given quantity of information Data vs information R = 1 1 C Image Compression Background Reduce the amount of data to represent a digital image Storage and transmission Consider the live streaming of a movie at standard definition video A color frame is 720 480

More information

Introduction to information theory and coding

Introduction to information theory and coding Introduction to information theory and coding Louis WEHENKEL Set of slides No 5 State of the art in data compression Stochastic processes and models for information sources First Shannon theorem : data

More information

Chapter 5. Data Compression

Chapter 5. Data Compression Chapter 5 Data Compression Peng-Hua Wang Graduate Inst. of Comm. Engineering National Taipei University Chapter Outline Chap. 5 Data Compression 5.1 Example of Codes 5.2 Kraft Inequality 5.3 Optimal Codes

More information

Sample solutions to Homework 4, Information-Theoretic Modeling (Fall 2014)

Sample solutions to Homework 4, Information-Theoretic Modeling (Fall 2014) Sample solutions to Homework 4, Information-Theoretic Modeling (Fall 204) Jussi Määttä October 2, 204 Question [First, note that we use the symbol! as an end-of-message symbol. When we see it, we know

More information

Using an innovative coding algorithm for data encryption

Using an innovative coding algorithm for data encryption Using an innovative coding algorithm for data encryption Xiaoyu Ruan and Rajendra S. Katti Abstract This paper discusses the problem of using data compression for encryption. We first propose an algorithm

More information

( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r C h a p t e r 1 7 : I n f o r m a t i o n S c i e n c e P a g e 1

( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r C h a p t e r 1 7 : I n f o r m a t i o n S c i e n c e P a g e 1 ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2 0 1 6 C h a p t e r 1 7 : I n f o r m a t i o n S c i e n c e P a g e 1 CHAPTER 17: Information Science In this chapter, we learn how data can

More information

A One-to-One Code and Its Anti-Redundancy

A One-to-One Code and Its Anti-Redundancy A One-to-One Code and Its Anti-Redundancy W. Szpankowski Department of Computer Science, Purdue University July 4, 2005 This research is supported by NSF, NSA and NIH. Outline of the Talk. Prefix Codes

More information

COMM901 Source Coding and Compression. Quiz 1

COMM901 Source Coding and Compression. Quiz 1 German University in Cairo - GUC Faculty of Information Engineering & Technology - IET Department of Communication Engineering Winter Semester 2013/2014 Students Name: Students ID: COMM901 Source Coding

More information

Information Theory CHAPTER. 5.1 Introduction. 5.2 Entropy

Information Theory CHAPTER. 5.1 Introduction. 5.2 Entropy Haykin_ch05_pp3.fm Page 207 Monday, November 26, 202 2:44 PM CHAPTER 5 Information Theory 5. Introduction As mentioned in Chapter and reiterated along the way, the purpose of a communication system is

More information

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments Dr. Jian Zhang Conjoint Associate Professor NICTA & CSE UNSW COMP9519 Multimedia Systems S2 2006 jzhang@cse.unsw.edu.au

More information

Lecture 4 Channel Coding

Lecture 4 Channel Coding Capacity and the Weak Converse Lecture 4 Coding I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw October 15, 2014 1 / 16 I-Hsiang Wang NIT Lecture 4 Capacity

More information

CSE 421 Greedy: Huffman Codes

CSE 421 Greedy: Huffman Codes CSE 421 Greedy: Huffman Codes Yin Tat Lee 1 Compression Example 100k file, 6 letter alphabet: File Size: ASCII, 8 bits/char: 800kbits 2 3 > 6; 3 bits/char: 300kbits better: 2.52 bits/char 74%*2 +26%*4:

More information

Lecture 22: Final Review

Lecture 22: Final Review Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information

More information

Kolmogorov complexity ; induction, prediction and compression

Kolmogorov complexity ; induction, prediction and compression Kolmogorov complexity ; induction, prediction and compression Contents 1 Motivation for Kolmogorov complexity 1 2 Formal Definition 2 3 Trying to compute Kolmogorov complexity 3 4 Standard upper bounds

More information

Information Theory, Statistics, and Decision Trees

Information Theory, Statistics, and Decision Trees Information Theory, Statistics, and Decision Trees Léon Bottou COS 424 4/6/2010 Summary 1. Basic information theory. 2. Decision trees. 3. Information theory and statistics. Léon Bottou 2/31 COS 424 4/6/2010

More information

Compression and Coding

Compression and Coding Compression and Coding Theory and Applications Part 1: Fundamentals Gloria Menegaz 1 Transmitter (Encoder) What is the problem? Receiver (Decoder) Transformation information unit Channel Ordering (significance)

More information

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments. Tutorial 1. Acknowledgement and References for lectures 1 to 5

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments. Tutorial 1. Acknowledgement and References for lectures 1 to 5 Lecture : Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments Dr. Jian Zhang Conjoint Associate Professor NICTA & CSE UNSW COMP959 Multimedia Systems S 006 jzhang@cse.unsw.edu.au Acknowledgement

More information

Fibonacci Coding for Lossless Data Compression A Review

Fibonacci Coding for Lossless Data Compression A Review RESEARCH ARTICLE OPEN ACCESS Fibonacci Coding for Lossless Data Compression A Review Ezhilarasu P Associate Professor Department of Computer Science and Engineering Hindusthan College of Engineering and

More information

Entropy. Probability and Computing. Presentation 22. Probability and Computing Presentation 22 Entropy 1/39

Entropy. Probability and Computing. Presentation 22. Probability and Computing Presentation 22 Entropy 1/39 Entropy Probability and Computing Presentation 22 Probability and Computing Presentation 22 Entropy 1/39 Introduction Why randomness and information are related? An event that is almost certain to occur

More information