COMP2610/6261 - Information Theory
Lecture 15: Arithmetic Coding
Mark Reid and Aditya Menon
Research School of Computer Science, The Australian National University
September 23rd, 2014

Outline
1 The Trouble with Huffman Coding
2 Guessing Game
3 Interval Coding: Shannon-Fano-Elias Coding

Huffman Coding: Advantages and Disadvantages

Advantages:
- Huffman codes are provably optimal
- The algorithm is simple and efficient

Disadvantages:
- Assumes a fixed distribution of symbols
- The extra bit in the SCT:
  - If H(X) is large, not a problem
  - If H(X) is small (e.g., about 1 bit for English), codes can be roughly 2x optimal
- Huffman codes are the best possible symbol code, but symbol coding is not always the best type of code

A Guessing Game: Let's play!

An Idealised Guessing Game

Encoding: Given message x and guesser G, for i from 1 to |x|:
1 Set count n_i = 1
2 While G guesses x_i incorrectly: n_i ← n_i + 1
3 Output n_i

Decoding: Given counts n and guesser G, for i from 1 to |n|:
1 While G has made fewer than n_i guesses: x_i ← next guess from G
2 Output x_i
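
A minimal sketch of this scheme in Python (an illustration, not part of the lecture). It assumes a toy deterministic guesser that ignores the history and always proposes symbols in a fixed alphabet order; the names make_guesser, encode and decode are hypothetical.

```python
def make_guesser(alphabet):
    """Deterministic guesser: given the history of symbols seen so far,
    yield guesses in a fixed alphabet order (a smarter guesser would learn)."""
    def guesses(history):
        for symbol in alphabet:
            yield symbol
    return guesses

def encode(message, guesser):
    counts, history = [], []
    for target in message:
        n = 0
        for guess in guesser(history):
            n += 1
            if guess == target:          # count guesses until G is correct
                break
        counts.append(n)
        history.append(target)
    return counts

def decode(counts, guesser):
    message = []
    for n in counts:                     # replay exactly n guesses; the n-th is the symbol
        gen = guesser(message)
        symbol = None
        for _ in range(n):
            symbol = next(gen)
        message.append(symbol)
    return message

guesser = make_guesser("abc")
msg = list("cabba")
counts = encode(msg, guesser)
print(counts)                            # [3, 1, 2, 2, 1]
assert decode(counts, guesser) == msg    # the same deterministic guesser recovers the message
```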

An Idealised Guesser

If the guesser G is deterministic (i.e., its next output depends only on the history of values it has seen), the same guesser can be used to encode and decode.

Advantages:
- The scheme works regardless of how the sequence x was generated
- We only need to compute codes for the observed sequence (not all possible blocks of some size)

How can we use this?
- Have the guesser guess distributions rather than symbols (repeatedly)
- Use bits instead of counts (with fewer bits for lower counts)
- Define a clever guesser (one that "learns")
- Prove that this scheme is close to optimal (at least, not much worse than Huffman on probabilistic sources)

Real Numbers in Binary

Real numbers are commonly expressed in decimal:
12_10 = 1 × 10^1 + 2 × 10^0
3.7_10 = 3 × 10^0 + 7 × 10^-1
0.94_10 = 9 × 10^-1 + 4 × 10^-2

Some real numbers have infinite, repeating decimal expansions:
1/3 = 0.33333..._10 and 22/7 = 3.142857142857..._10 (the block 142857 repeats)

Real numbers can also be similarly expressed in binary:
11_2 = 1 × 2^1 + 1 × 2^0
1.1_2 = 1 × 2^0 + 1 × 2^-1
0.01_2 = 0 × 2^-1 + 1 × 2^-2
1/3 = 0.010101..._2 and 22/7 = 11.001001..._2 (the blocks 01 and 001 repeat)

Question: 1) What is 0.1011_2 in decimal? 2) What is 7/5 in binary?
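
As a quick check on these expansions, here is a small sketch (plain Python, not from the lecture; binary_fraction is a hypothetical helper) that computes the first few binary digits of a fraction by long division in base 2:

```python
def binary_fraction(numerator, denominator, digits=12):
    """Binary expansion of numerator/denominator, truncated to `digits` fractional bits."""
    integer_part, remainder = divmod(numerator, denominator)
    bits = []
    for _ in range(digits):                    # long division in base 2
        remainder *= 2
        bit, remainder = divmod(remainder, denominator)
        bits.append(str(bit))
    return f"{integer_part:b}." + "".join(bits)

print(binary_fraction(1, 3))    # 0.010101010101
print(binary_fraction(22, 7))   # 11.001001001001
print(binary_fraction(7, 5))    # 1.011001100110   (question 2: 7/5 = 1.01100110..._2)
```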

Intervals in Binary

An interval [a, b) is the set of all numbers at least as big as a but smaller than b. That is, [a, b) = {x : a ≤ x < b}. Examples: [0, 1), [0.3, 0.6), [0.2, 0.4).

The set of numbers in [0, 1) whose binary expansion starts with a given sequence of bits b = b_1...b_n forms the interval [0.b_1...b_n, 0.(b_1...b_n + 1)):
1    → [0.1, 1.0)_2      = [0.5, 1)_10
01   → [0.01, 0.10)_2    = [0.25, 0.5)_10
1101 → [0.1101, 0.1110)_2 = [0.8125, 0.875)_10

If b is a prefix of b', the interval for b' is contained in the interval for b. For example, b = 01 is a prefix of b' = 0101, so [0.0101, 0.0110)_2 ⊂ [0.01, 0.10)_2, i.e., [0.3125, 0.375)_10 ⊂ [0.25, 0.5)_10.
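
A tiny sketch of this correspondence (illustrative only; interval_for_bits is a hypothetical name), mapping a bit string to the interval of numbers whose expansion starts with it:

```python
from fractions import Fraction

def interval_for_bits(bits):
    """Interval [lo, hi) of numbers in [0, 1) whose binary expansion starts with `bits`."""
    lo = Fraction(int(bits, 2), 2 ** len(bits))
    return lo, lo + Fraction(1, 2 ** len(bits))

for b in ["1", "01", "1101", "0101"]:
    lo, hi = interval_for_bits(b)
    print(f"{b:>4} -> [{float(lo)}, {float(hi)})")
# 1 -> [0.5, 1.0), 01 -> [0.25, 0.5), 1101 -> [0.8125, 0.875), 0101 -> [0.3125, 0.375)
```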

Cumulative Distribution

Suppose p = (p_1, ..., p_K) is a distribution over A = {a_1, ..., a_K}. Assume the alphabet A is ordered and define a_i ≤ a_j to mean i ≤ j. Consider F, the cumulative distribution for p:

F(a) = P(x ≤ a) = Σ_{a_i ≤ a} p_i

Each symbol a_i ∈ A is associated with the interval [F(a_{i-1}), F(a_i)), where F(a_0) = 0.

Example: A = {r, g, b} ⟹ a_1 = r, a_2 = g, a_3 = b ⟹ r ≤ b since 1 ≤ 3.
If p = (1/2, 1/4, 1/4) then F(r) = 1/2, F(g) = 3/4, F(b) = 1.
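
A minimal check of this example (illustrative Python, assuming the ordering a_1 = r, a_2 = g, a_3 = b):

```python
from fractions import Fraction

alphabet = ["r", "g", "b"]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

F_prev = Fraction(0)                       # F(a_0) = 0
for symbol, p in zip(alphabet, probs):
    F = F_prev + p                         # F(a_i) = F(a_{i-1}) + p_i
    print(f"F({symbol}) = {F}, interval [{F_prev}, {F})")
    F_prev = F
# F(r) = 1/2, F(g) = 3/4, F(b) = 1
```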

Cumulative Distribution Example

[Figure: cumulative distribution for p = (2/9, 1/9, 1/3, 1/3), showing for each symbol a_i the values F(a_{i-1}) and F(a_i), the midpoint F(a_i) - ½p_i, and the resulting codewords 0001, 01000, 100, 110 for a_1, a_2, a_3, a_4.]

Shannon-Fano-Elias Coding

[Figure: the symbol intervals [0, 2/9), [2/9, 1/3), [1/3, 2/3), [2/3, 1) shown in decimal and (truncated) binary, with the midpoints F̄(a_i) = F(a_i) - ½p_i marked.]

Define the midpoint F̄(a_i) = F(a_i) - ½p_i and the length l(a_i) = ⌈log_2(1/p_i)⌉ + 1. Code x ∈ A using the first l(x) bits of F̄(x).

x     p(x)   F(x)   F̄(x)   F̄(x) in binary    l(x)   Code
a_1   2/9    2/9    1/9    0.000111..._2     4      0001
a_2   1/9    1/3    5/18   0.01000111..._2   5      01000
a_3   1/3    2/3    1/2    0.1_2             3      100
a_4   1/3    1      5/6    0.110..._2        3      110

Example: The sequence x = a_3 a_3 a_1 is coded as 100 100 0001.
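
A compact sketch of this construction (illustrative Python; sfe_code is a hypothetical name), using exact fractions so the midpoints and their binary digits are not disturbed by floating-point rounding:

```python
from fractions import Fraction
from math import ceil, log2

def sfe_code(probs):
    """Shannon-Fano-Elias codewords for symbols with the given probabilities,
    listed in the fixed alphabet order a_1, ..., a_K."""
    codes = []
    cumulative = Fraction(0)                 # F(a_{i-1})
    for p in probs:
        midpoint = cumulative + p / 2        # F-bar(a_i) = F(a_i) - p_i/2
        length = ceil(log2(1 / p)) + 1       # l(a_i) = ceil(log2(1/p_i)) + 1
        bits, frac = "", midpoint
        for _ in range(length):              # take the first l(a_i) bits of F-bar(a_i)
            frac *= 2
            bit = int(frac)
            bits += str(bit)
            frac -= bit
        codes.append(bits)
        cumulative += p
    return codes

probs = [Fraction(2, 9), Fraction(1, 9), Fraction(1, 3), Fraction(1, 3)]
print(sfe_code(probs))   # ['0001', '01000', '100', '110'], matching the table above
```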

Shannon-Fano-Elias Decoding

Let p = (2/9, 1/9, 1/3, 1/3). Suppose we want to decode 01000: find the symbol whose interval contains the interval for 01000.

Reading the bits one at a time narrows the interval:
0     → [0, 0.5)
01    → [0.25, 0.5)
010   → [0.25, 0.375)
0100  → [0.25, 0.3125)
01000 → [0.25, 0.28125)

The final interval lies inside [2/9, 1/3) ≈ [0.22, 0.33), the interval for a_2, so the decoded symbol is a_2.

Note: we did not need to know all the codewords in C.
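
A sketch of this single-symbol decoding step (illustrative Python; sfe_decode_symbol is a hypothetical name, assuming the same alphabet ordering and probabilities as above):

```python
from fractions import Fraction

def sfe_decode_symbol(bits, probs):
    """Return the index of the symbol whose interval [F(a_{i-1}), F(a_i))
    contains the whole interval of numbers starting with `bits`."""
    lo = Fraction(int(bits, 2), 2 ** len(bits))     # interval for the bit string
    hi = lo + Fraction(1, 2 ** len(bits))
    cumulative = Fraction(0)
    for i, p in enumerate(probs):
        if cumulative <= lo and hi <= cumulative + p:   # [lo, hi) inside symbol interval
            return i                                    # 0-based index, i.e. a_{i+1}
        cumulative += p
    raise ValueError("bit string does not identify a unique symbol")

probs = [Fraction(2, 9), Fraction(1, 9), Fraction(1, 3), Fraction(1, 3)]
print(sfe_decode_symbol("01000", probs))   # 1, i.e. symbol a_2
```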

Expected Code Length of SFE Code

The extra bit in the code lengths arises because we code using p_i/2 (the midpoint), and
log_2(2/p_i) = log_2(1/p_i) + log_2 2 = log_2(1/p_i) + 1.

What is the expected length of an SFE code C for an ensemble X with probabilities p?

L(C, X) = Σ_{i=1}^K p_i l(a_i) = Σ_{i=1}^K p_i (⌈log_2(1/p_i)⌉ + 1) ≤ Σ_{i=1}^K p_i (log_2(1/p_i) + 2) = H(X) + 2

Similarly, H(X) + 1 ≤ L(C, X) for SFE codes.
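
A quick numerical check of the bound H(X) + 1 ≤ L(C, X) ≤ H(X) + 2 for the example distribution used above (an illustration, not part of the lecture):

```python
from fractions import Fraction
from math import ceil, log2

probs = [Fraction(2, 9), Fraction(1, 9), Fraction(1, 3), Fraction(1, 3)]
H = sum(float(p) * log2(1 / p) for p in probs)               # entropy in bits
L = sum(float(p) * (ceil(log2(1 / p)) + 1) for p in probs)   # expected SFE code length
print(f"H(X) = {H:.3f}, L(C, X) = {L:.3f}")                  # H ~ 1.891, L ~ 3.444
assert H + 1 <= L <= H + 2
```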

What does the extra bit buy us?

Let X be an ensemble, C_SFE a Shannon-Fano-Elias code for X, and C_H a Huffman code for X. Then

H(X) ≤ L(C_H, X) ≤ H(X) + 1 ≤ L(C_SFE, X) ≤ H(X) + 2

where the first chain, H(X) ≤ L(C_H, X) ≤ H(X) + 1, is the Source Coding Theorem.

The extra bit guarantees that the interval for each codeword lies entirely within the interval for its symbol. [Why?]

Example: The SFE code for p = (2/9, 1/9, 1/3, 1/3) is C = {0001, 01000, 100, 110}.
Intervals for p are [0, 0.22...), [0.22..., 0.33...), [0.33..., 0.66...), [0.66..., 1).
Intervals for {000, 0100, 10, 11} (each codeword with its final bit dropped) are [0, 0.125), [0.25, 0.3125), [0.5, 0.75), [0.75, 1).
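
A sketch that checks this containment for the example above, and shows that dropping the final bit breaks it for some symbols (illustrative Python; codewords taken from the earlier table):

```python
from fractions import Fraction

def bit_interval(bits):
    """Interval [lo, hi) of numbers whose binary expansion starts with `bits`."""
    lo = Fraction(int(bits, 2), 2 ** len(bits))
    return lo, lo + Fraction(1, 2 ** len(bits))

probs = [Fraction(2, 9), Fraction(1, 9), Fraction(1, 3), Fraction(1, 3)]
codes = ["0001", "01000", "100", "110"]

F = Fraction(0)
for p, c in zip(probs, codes):
    sym_lo, sym_hi = F, F + p                      # symbol interval [F(a_{i-1}), F(a_i))
    for word in (c, c[:-1]):                       # full codeword, then one bit shorter
        lo, hi = bit_interval(word)
        inside = sym_lo <= lo and hi <= sym_hi
        print(f"{word:>5}: [{float(lo):.4f}, {float(hi):.4f}) inside symbol interval? {inside}")
    F += p
# e.g. '10' gives [0.5, 0.75), which is not contained in [1/3, 2/3) for a_3
```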

Summary and Reading

Main points:
- Problems with Huffman coding
- Guessing game for coding without assuming a fixed symbol distribution
- Binary strings to/from intervals in [0, 1)
- Shannon-Fano-Elias coding: code C via the cumulative distribution function for p; H(X) + 1 ≤ L(C, X) ≤ H(X) + 2; the extra bit guarantees interval containment

Reading:
- Guessing game and interval coding: MacKay 6.1 and 6.2
- Shannon-Fano-Elias coding: Cover & Thomas 5.9

Next time: Extending SFE coding to sequences of symbols