3F1 Information Theory, Lecture 3


3F1 Information Theory, Lecture 3
Jossy Sayir, Department of Engineering
Michaelmas 2013, 29 November 2013

Slide 2/21: Encoding the output of a source

Binary Memoryless Source X1, X2, X3, ...
- X1, X2, ... independent and identically distributed
- P_X1(1) = 1 − P_X1(0) = 0.01
What's the optimal binary variable-length code for Xi?
E[W] = 1, but H(X) = 0.081 bits.
Redundancy ρ1 = E[W] − H(X) = 0.919
Can we do better?


Slide 3/21: Parsing the source into blocks of 2

x1 x2   P_X1X2(x1, x2)   Codeword
00      0.9801           0
01      0.0099           10
10      0.0099           110
11      0.0001           111

E[W] = 1.0299, H(X1X2) = 0.1616 bits, where
H(X1X2) = − Σ_{x1,x2} P_X1X2(x1, x2) log P_X1X2(x1, x2)
Redundancy per symbol: ρ2 = (E[W] − H(X1X2)) / 2 = 0.4342
Can we do better?
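A quick numeric check of the two redundancy figures above (a minimal Python sketch; the probabilities and codeword lengths are taken from the tables on these slides):

```python
from math import log2

def H(probs):
    """Entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Single-symbol code: one bit per symbol, regardless of P(1) = 0.01
p1 = 0.01
H1 = H([p1, 1 - p1])                 # ~0.0808 bits
rho1 = 1 - H1                        # E[W] - H(X) ~ 0.919

# Blocks of two symbols with codewords 0, 10, 110, 111
blocks = {"00": (0.9801, "0"), "01": (0.0099, "10"),
          "10": (0.0099, "110"), "11": (0.0001, "111")}
EW = sum(p * len(c) for p, c in blocks.values())   # 1.0299
H2 = H([p for p, _ in blocks.values()])            # 0.1616 bits
rho2 = (EW - H2) / 2                               # ~0.4342

print(f"H(X) = {H1:.4f}, rho_1 = {rho1:.4f}")
print(f"E[W] = {EW:.4f}, H(X1X2) = {H2:.4f}, rho_2 = {rho2:.4f}")
```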

Slide 4/21: Encoding a block of length N

If encoding a block of length N using Huffman or Shannon-Fano coding,
H(X1 ... XN) ≤ E[W] < H(X1 ... XN) + 1
where
H(X1 ... XN) = − Σ_{x1,...,xN} P(x1, ..., xN) log P(x1, ..., xN)
Thus,
ρN = (E[W] − H(X1 ... XN)) / N < (H(X1 ... XN) + 1 − H(X1 ... XN)) / N = 1/N
and
lim_{N→∞} ρN = 0,
i.e., the redundancy per source symbol tends to zero.

Slide 5/21: The problem with block compression

- The alphabet size grows exponentially in the block length N
- Huffman's and Fano's algorithms become infeasible for large N, as does storing the codebook
- In Shannon's version of Shannon-Fano coding, the probability and cumulative probability can be computed recursively (illustrated in the sketch after this slide):
  P(x1, ..., xN) = P(x1, ..., x_{N−1}) P(xN)
  F(x1, ..., xN) = F(x1, ..., x_{N−1}) + F(xN) P(x1, ..., x_{N−1})
- Compute the cumulative probability of a specific block without computing all the others!
- If only there weren't the need for the alphabet to be ordered...
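A minimal sketch of this recursion for a memoryless source. The alphabet, probabilities P and per-symbol cumulative probabilities F below are illustrative values, not taken from the lecture:

```python
# Recursive computation of block probability and cumulative probability
# for a memoryless source, as on the slide:
#   P(x1..xN) = P(x1..x_{N-1}) * P(xN)
#   F(x1..xN) = F(x1..x_{N-1}) + F(xN) * P(x1..x_{N-1})
P = {"a": 0.5, "b": 0.3, "c": 0.2}     # assumed example alphabet
F = {"a": 0.0, "b": 0.5, "c": 0.8}     # F(x) = P(X < x) in this ordering

def block_P_and_F(block):
    p, f = 1.0, 0.0
    for x in block:
        f = f + F[x] * p   # cumulative probability of the extended block
        p = p * P[x]       # probability of the extended block
    return p, f

print(block_P_and_F("bca"))   # no need to enumerate all 3**3 blocks
```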

Slide 6/21: Reminder: Shannon's version of S-F coding

x   P_X(x)   P(X<x)   P(X<x) in binary   ⌈−log P_X(x)⌉   Codeword
F   .30      0.0      0.00000000         2                00
D   .20      0.3      0.01001101         3                010
E   .20      0.5      0.10000000         3                100
C   .15      0.7      0.10110011         3                101
B   .10      0.85     0.11011010         4                1101
A   .05      0.95     0.11110011         5                11110

- Order the symbols in order of non-increasing probability
- Compute the cumulative probabilities
- Express the cumulative probabilities in D-ary
- The codeword is the fractional part of the cumulative probability, truncated to length ⌈−log_D P_X(x)⌉
(A sketch of this construction follows.)
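The construction above can be reproduced in a few lines. This is a sketch for the binary case (D = 2) using the slide's probabilities, not a general-purpose implementation:

```python
from math import ceil, log2

def shannon_code(pmf):
    """Shannon's version of Shannon-Fano coding (binary case)."""
    # order symbols by non-increasing probability
    symbols = sorted(pmf, key=pmf.get, reverse=True)
    code, cum = {}, 0.0
    for x in symbols:
        w = ceil(-log2(pmf[x]))       # codeword length
        # binary expansion of the cumulative probability, truncated to w digits
        frac, bits = cum, ""
        for _ in range(w):
            frac *= 2
            bits += "1" if frac >= 1 else "0"
            frac -= int(frac)
        code[x] = bits
        cum += pmf[x]
    return code

pmf = {"F": 0.30, "D": 0.20, "E": 0.20, "C": 0.15, "B": 0.10, "A": 0.05}
print(shannon_code(pmf))   # F:00, D:010, E:100, C:101, B:1101, A:11110
```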

Slide 7/21: Source intervals / code intervals

x   P(X<x)   P(X<x) in binary (truncated)
F   0.0      0.00
D   0.3      0.010
E   0.5      0.100
C   0.7      0.101
B   0.85     0.1101
A   0.95     0.11110

[Figure: the source intervals (widths P_X(x)) and the corresponding code intervals on the unit interval, for the ordered alphabet F, D, E, C, B, A]

Slide 8/21: What if we don't order the source probabilities?

x   P(X<x)   P(X<x) in binary (truncated)
A   0.0      0.00000
B   0.05     0.000
C   0.15     0.001
D   0.3      0.010
E   0.5      0.100
F   0.7      0.11

[Figure: source intervals and code intervals when the alphabet is not ordered by probability]

Slide 9/21: New approach

x   [a, b]
A   [0.0, 0.05]
B   [0.05, 0.15]
C   [0.15, 0.3]
D   [0.3, 0.5]
E   [0.5, 0.7]
F   [0.7, 1.0]

a = P(X < x), b = P(X < x) + P(X = x)

- Draw the source intervals in no particular order
- Pick the largest code interval [k·2^(−w_i), (k+1)·2^(−w_i)] that fits in each source interval
- How large is w_i? w_i ≤ ⌈−log p_i⌉ + 1
(A sketch of this interval search follows.)

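A minimal sketch of the interval search described above: for each source interval [a, b] it finds the largest dyadic interval [k·2^(−w), (k+1)·2^(−w)] contained in it, and checks the bound w ≤ ⌈−log2 p⌉ + 1 on the slide's example. The function name is illustrative:

```python
from math import ceil, log2

def largest_dyadic_interval(a, b):
    """Smallest w (largest interval) such that some [k/2**w, (k+1)/2**w] fits in [a, b]."""
    w = 0
    while True:
        k = ceil(a * 2**w)           # first grid point >= a
        if (k + 1) / 2**w <= b:      # does the next grid cell fit inside [a, b]?
            return w, k
        w += 1

intervals = {"A": (0.0, 0.05), "B": (0.05, 0.15), "C": (0.15, 0.3),
             "D": (0.3, 0.5), "E": (0.5, 0.7), "F": (0.7, 1.0)}
for x, (a, b) in intervals.items():
    w, k = largest_dyadic_interval(a, b)
    bound = ceil(-log2(b - a)) + 1
    print(f"{x}: codeword length w = {w} (bound {bound}), "
          f"interval [{k}/2^{w}, {k+1}/2^{w}]")
```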

Slide 10/21: Arithmetic Coding

Recursive computation of the source interval:
[Figure: the unit interval is split recursively into sub-intervals labelled 0 and 1, one split per source symbol]
At the end of the source block, generate the shortest possible codeword whose interval fits in the computed source interval.
E[W] < H(X1 ... XN) + 2
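A sketch of the recursive interval computation for a memoryless source, reusing the A–F alphabet from the earlier slides; the block "FDE" is just an illustrative input:

```python
# Recursive source-interval computation for a memoryless source.
# Each symbol narrows [low, high) to the sub-interval assigned to it.
P = {"A": 0.05, "B": 0.10, "C": 0.15, "D": 0.20, "E": 0.20, "F": 0.30}
F_low = {"A": 0.0, "B": 0.05, "C": 0.15, "D": 0.3, "E": 0.5, "F": 0.7}  # P(X < x)

def source_interval(block):
    low, high = 0.0, 1.0
    for x in block:
        width = high - low
        low, high = low + width * F_low[x], low + width * (F_low[x] + P[x])
    return low, high

print(source_interval("FDE"))   # width = P(F) * P(D) * P(E) = 0.012
```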

Slide 11/21: Arithmetic Coding = Recursive Shannon-Fano Coding

Claude E. Shannon, Robert L. Fano, Peter Elias, Richard C. Pasco, Jorma Rissanen

H(X1 ... XN) ≤ E[W] < H(X1 ... XN) + 2, so ρN < 2/N

- No need to wait until the whole source block has been received to start generating code symbols
- Arithmetic coding is an infinite state machine whose states need ever-growing precision
- A finite-precision implementation requires a few tricks (and loses some performance)

Slide 12/21: English Text

Letter  Frequency   Letter  Frequency   Letter  Frequency
a       0.08167     j       0.00153     s       0.06327
b       0.01492     k       0.00772     t       0.09056
c       0.02782     l       0.04025     u       0.02758
d       0.04253     m       0.02406     v       0.00978
e       0.12702     n       0.06749     w       0.02360
f       0.02228     o       0.07507     x       0.00150
g       0.02015     p       0.01929     y       0.01974
h       0.06094     q       0.00095     z       0.00074
i       0.06966     r       0.05987

H(X) = 4.17576 bits
P("and") = 0.08167 × 0.06749 × 0.04253 = 0.0002344216
P("eee") = 0.12702^3 = 0.00204935
P("eee") >> P("and"). Is this right? No!
Can we do better than H(X) for sources with memory?

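A quick check of the figures on this slide: the single-letter entropy of the frequency table, and the two three-letter probabilities under a memoryless model.

```python
from math import log2

freq = {"a": 0.08167, "b": 0.01492, "c": 0.02782, "d": 0.04253, "e": 0.12702,
        "f": 0.02228, "g": 0.02015, "h": 0.06094, "i": 0.06966, "j": 0.00153,
        "k": 0.00772, "l": 0.04025, "m": 0.02406, "n": 0.06749, "o": 0.07507,
        "p": 0.01929, "q": 0.00095, "r": 0.05987, "s": 0.06327, "t": 0.09056,
        "u": 0.02758, "v": 0.00978, "w": 0.02360, "x": 0.00150, "y": 0.01974,
        "z": 0.00074}

H = -sum(p * log2(p) for p in freq.values())
p_and = freq["a"] * freq["n"] * freq["d"]
p_eee = freq["e"] ** 3
print(f"H(X) = {H:.5f} bits")                         # ~4.176
print(f"P('and') = {p_and:.7f}, P('eee') = {p_eee:.7f}")
# Under a memoryless model P('eee') >> P('and'), which contradicts intuition
# about English text, so a memoryless model leaves structure uncompressed.
```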

Slide 13/21: Discrete Stationary Source (DSS) properties

Entropy rate:
- H_N(X) := (1/N) H(X1 ... XN)
- H(X_N | X1 ... X_{N−1}) ≤ H_N(X)
- H(X_N | X1 ... X_{N−1}) and H_N(X) are non-increasing functions of N
- lim_{N→∞} H(X_N | X1 ... X_{N−1}) = lim_{N→∞} H_N(X) =: H_∞(X)
- H_∞(X) is the entropy rate of the DSS

Shannon's converse source coding theorem for a DSS:
E[W]/N ≥ H_∞(X) / log D

Slide 14/21: Coding for discrete stationary sources

- Arithmetic coding can use conditional probabilities
- The intervals will be different at every step, depending on the source context
- Prediction by Partial Matching (PPM) and Context Tree Weighting (CTW) are techniques that build the context-tree-based source model on the fly, achieving compression rates of approx. 2.2 binary symbols per ASCII character
- What is H_∞(X) for English text? (assuming language is a stationary source, which is a disputed proposition)

[Figure: block diagram — Source → Source Encoder → Source Decoder → Destination, with identical Predictors at the encoder and the decoder]

Slide 15/21: Markov Chain

Andrey Andreyevich Markov

- Stationary state random process S1, S2, ... with P(s_N | s_1 ... s_{N−1}) = P(s_N | s_{N−1})
- Markov information source: the states S_i are mapped into source symbols X_i
- Unifilar information source: from any state, all neighbouring states map to distinct symbols

Slide 16/21: Unifilar Markov Source

P_X2|X1(1|0) = 1 − P_X2|X1(0|0) = 0.9
P_X2|X1(1|1) = 1 − P_X2|X1(0|1) = 0.8

Can we compute P_X1(1) = 1 − P_X1(0)?
Stationarity implies P_X1(1) = P_X2(1) and thus
P_X1(1) = P_X2(1) = P_X1X2(01) + P_X1X2(11)
        = P_X2|X1(1|0) P_X1(0) + P_X2|X1(1|1) P_X1(1)


Slide 16/21 (continued): Unifilar Markov Source

Define the matrix
T = [ P_X2|X1(0|0)  P_X2|X1(0|1) ]
    [ P_X2|X1(1|0)  P_X2|X1(1|1) ]
and the vector P = [P_X1(0), P_X1(1)]^T. Then we are looking for the solution P of the equation P = TP, i.e., the eigenvector of T for the eigenvalue 1.
Note that since T is a stochastic matrix (its columns sum to 1), it will always have 1 as an eigenvalue.

Slide 16/21 (continued): Unifilar Markov Source

P = [ 0.1  0.2 ] P    implies    [ −0.9   0.2 ] P = 0
    [ 0.9  0.8 ]                 [  0.9  −0.2 ]

which, together with the constraint [1 1] P = 1 (probabilities sum to 1), yields

[ −0.9  0.2 ] P = [ 0 ]
[  1    1   ]     [ 1 ]

and finally

P = [ P_X1(0) ]  =  [ 0.1818 ]
    [ P_X1(1) ]     [ 0.8182 ]

Entropy rate of the source:
H_∞(X) = lim_{N→∞} H(X_N | X_1 ... X_{N−1}) = H(X_N | X_{N−1}) = H(X2 | X1)
       = H(X2 | X1 = 0) P_X1(0) + H(X2 | X1 = 1) P_X1(1)
       = 0.1818 h(0.1) + 0.8182 h(0.2) = 0.6759 bits
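A short numeric check of the stationary distribution and entropy rate. The closed-form solve below is just one way to handle the 2-state case; it is equivalent to the eigenvector equation above:

```python
from math import log2

def h(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0, 1) else -p * log2(p) - (1 - p) * log2(1 - p)

# Transition probabilities P(X2=1 | X1=0) = 0.9 and P(X2=1 | X1=1) = 0.8
p10, p11 = 0.9, 0.8

# Stationary distribution: pi1 = p10*pi0 + p11*pi1, with pi0 + pi1 = 1
pi0 = (1 - p11) / (1 - p11 + p10)   # = 0.2 / 1.1 = 0.1818...
pi1 = 1 - pi0                       # = 0.8182...

H_rate = pi0 * h(1 - p10) + pi1 * h(1 - p11)   # 0.1818*h(0.1) + 0.8182*h(0.2)
print(f"pi = ({pi0:.4f}, {pi1:.4f}), entropy rate = {H_rate:.4f} bits")  # ~0.6759
```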

Slide 17/21: Encoding a unifilar Markov Source

Encode the source output sequence 0,1,1,1,1,1,1,1.
Starting from [0, 1], each symbol narrows the current interval to the sub-interval assigned to it; the split point is determined by the conditional probability given the previous symbol (P_X1(0) = 0.1818 for the first symbol, then P_X2|X1(0|0) = 0.1 and P_X2|X1(0|1) = 0.2).

Symbol   Source interval after the symbol
0        [0.0,    0.1818]
1        [0.0182, 0.1818]
1        [0.0509, 0.1818]
1        [0.0771, 0.1818]
1        [0.0980, 0.1818]
1        [0.1148, 0.1818]
1        [0.1282, 0.1818]
1        [0.1389, 0.1818]

(A sketch reproducing this recursion follows.)
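A sketch that reproduces the interval sequence above for the conditional model of slide 16. Exact fractions are used to avoid rounding; the function and variable names are illustrative:

```python
from fractions import Fraction as Fr

# P(next symbol = 0 | previous symbol); "start" uses the stationary P_X1(0) = 2/11
p0_given = {"start": Fr(2, 11), "0": Fr(1, 10), "1": Fr(1, 5)}

def encode_intervals(symbols):
    """Return the source interval [low, high) after each encoded symbol."""
    low, high, prev = Fr(0), Fr(1), "start"
    out = []
    for s in symbols:
        split = low + (high - low) * p0_given[prev]
        low, high = (low, split) if s == "0" else (split, high)
        out.append((low, high))
        prev = s
    return out

for s, (lo, hi) in zip("01111111", encode_intervals("01111111")):
    print(s, float(lo), float(hi))   # final interval ~ [0.1389, 0.1818]
```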

Slide 18/21: Determining the codeword

Source interval [0.1389, 0.1818] in binary: [0.00100011, 0.00101110]_b
The probability of the source sequence is
P_X1...X8(0,1,1,1,1,1,1,1) = 0.1818 − 0.1389 = 0.042896
−log2 P_X1...X8(0,1,1,1,1,1,1,1) = 4.543, so the codeword will have either 5 or 6 digits, depending on whether the resulting code interval is contained in the source interval.
No 5-digit code sequence corresponds to a code interval contained in our source interval: the length-5 code intervals near [0.1389, 0.1818] have endpoints 0.125, 0.15625 and 0.1875, and neither [0.125, 0.15625] nor [0.15625, 0.1875] fits inside it.
The 6-digit code sequence 001010 corresponds to the code interval [0.001010, 0.001011]_b = [0.15625, 0.171875], which is fully contained in the source interval and therefore satisfies the prefix condition.
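A small sketch of this last step: for a given codeword length, list every codeword whose code interval fits inside the source interval (endpoints rounded as on the slide). No length-5 codeword fits; at length 6 both 001001 and 001010 fit, and the slide uses 001010:

```python
from fractions import Fraction as Fr

def fitting_codewords(low, high, w):
    """All length-w bit strings whose interval [k/2^w, (k+1)/2^w] lies in [low, high]."""
    step = Fr(1, 2**w)
    k = -(-low // step)                 # first grid point >= low (ceiling division)
    words = []
    while (k + 1) * step <= high:
        words.append(format(int(k), f"0{w}b"))
        k += 1
    return words

low, high = Fr(1389, 10000), Fr(1818, 10000)   # source interval, rounded as on the slide
print(fitting_codewords(low, high, 5))   # []                     -> no 5-digit codeword
print(fitting_codewords(low, high, 6))   # ['001001', '001010']   -> 001010 works
```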

Slide 19/21: Decoding a unifilar Markov Source

Decode the code sequence 0,0,1,0,1,0, corresponding to the code interval [0.15625, 0.171875].
Decoding rule: always pick the sub-interval that contains the codeword interval.
The decoder reproduces the same interval subdivision as the encoder ([0, 0.1818], [0.0182, 0.1818], [0.0509, 0.1818], ..., [0.1389, 0.1818]), at each step choosing the sub-interval that contains [0.15625, 0.171875].
Result: 0,1,1,1,1,1,1,1
(A decoder sketch follows.)
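A minimal decoder sketch matching the rule above, again using exact fractions; the block length 8 is assumed to be known to the decoder:

```python
from fractions import Fraction as Fr

p0_given = {"start": Fr(2, 11), "0": Fr(1, 10), "1": Fr(1, 5)}  # P(next = 0 | previous)

def decode(codeword, n_symbols):
    """Decode an arithmetic codeword for the 2-state Markov model above."""
    # code interval of the received codeword
    k, w = int(codeword, 2), len(codeword)
    c_low, c_high = Fr(k, 2**w), Fr(k + 1, 2**w)

    low, high, prev, out = Fr(0), Fr(1), "start", []
    for _ in range(n_symbols):
        split = low + (high - low) * p0_given[prev]
        # for a valid codeword the code interval lies entirely in one sub-interval
        if c_high <= split:
            high, symbol = split, "0"
        else:
            low, symbol = split, "1"
        out.append(symbol)
        prev = symbol
    return "".join(out)

print(decode("001010", 8))   # -> 01111111
```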

Slide 20/21: Shannon's Twin Experiment

[Figure: Source → Encoder → Decoder → Sink block diagram for the twin experiment]

Slide 21/21: Shannon's twin experiment

- Shannon 1 and Shannon 2 are hypothetical fully identical twins (who look alike, talk alike, and think exactly alike)
- An operator at the transmitter asks Shannon 1 to guess the next source symbol based on the context
- The operator counts the number of guesses until Shannon 1 gets it right, and transmits this number
- An operator at the receiver asks Shannon 2 to guess the next source symbol based on the context. Shannon 2 will get the same answer as Shannon 1 after the same number of guesses.
- An upper bound on the entropy rate of English is the entropy of the number of guesses
- A better bound would take dependencies between numbers of guesses into account (if Shannon 1 needed many guesses for a symbol, then chances are he will need many for the next as well, whereas if he guessed right the first time, chances are he's in the middle of a word and will guess the next symbol correctly as well)
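A toy sketch of the guess-number transformation: with the same deterministic predictor on both sides, the sequence of guess counts is enough to reconstruct the text. The fixed frequency-order "predictor" below is only an illustration, not Shannon's actual experiment (a real predictor would rank symbols using the context).

```python
LETTERS = "etaoinshrdlcumwfgypbvkjxqz "   # a fixed guessing order (illustrative)

def guess_order(context):
    """Deterministic 'twin': always guesses in the same fixed order.
    A context-aware predictor would reorder LETTERS based on the context."""
    return LETTERS

def to_guess_counts(text):
    """Transmitter side: number of guesses needed for each symbol."""
    return [guess_order(text[:i]).index(c) + 1 for i, c in enumerate(text)]

def from_guess_counts(counts):
    """Receiver side: the identical twin makes the same guesses, so the text is recovered."""
    out = ""
    for n in counts:
        out += guess_order(out)[n - 1]
    return out

counts = to_guess_counts("the tea")
print(counts)                     # mostly small numbers -> low-entropy sequence
print(from_guess_counts(counts))  # 'the tea'
```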