Motivation for Arithmetic Coding


Motivations for arithmetic coding:
1) The Huffman coding algorithm generates prefix codes with minimum average codeword length. But this length is usually strictly greater than $H(X_1)$.
2) To improve the coding efficiency, one can use a block memoryless code by working with the extended alphabet $\mathcal{X}^n$. But the computational complexity grows exponentially as $n$ increases.
Thus for small $n$, Huffman coding is inefficient. On the other hand, for large $n$, it is impractical due to its exponential coding complexity.
Solution: Arithmetic coding is one of the algorithms that can address the above issue. It can achieve the entropy rate of a stationary source with linear coding complexity.

Shannon-Fano-Elias Codes
Let $(X_1, \ldots, X_n)$ be a random vector with joint pmf $p(u_1 u_2 \cdots u_n)$, $u_i \in \mathcal{X} = \{x_0, \ldots, x_{J-1}\}$. We partition the interval $[0,1]$ into disjoint sub-intervals $I(u_1 u_2 \cdots u_n)$, $u_1 u_2 \cdots u_n \in \mathcal{X}^n$, such that the following properties hold:
1) The length of the interval $I(u_1 u_2 \cdots u_n)$ is equal to $p(u_1 u_2 \cdots u_n)$.
2) $\bigcup_{u_1 \cdots u_n \in \mathcal{X}^n} I(u_1 u_2 \cdots u_n) = [0,1]$.
3) The intervals $I(u_1 u_2 \cdots u_n)$ are arranged according to the natural lexicographic order on the sequences $u_1 u_2 \cdots u_n$.
[Figure: for $n=1$, $[0,1]$ is divided from left to right into $I(x_0), I(x_1), I(x_2), \ldots, I(x_{J-2}), I(x_{J-1})$; for $n=2$, into $I(x_0 x_0), I(x_0 x_1), \ldots, I(x_0 x_{J-1}), I(x_1 x_0), \ldots, I(x_{J-1} x_{J-1})$.]

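Shannon-Fano-Elias Codes (continued)
$I(x_0 x_0 \cdots x_0 x_0) = [0,\ p(x_0 x_0 \cdots x_0 x_0)]$
$I(x_0 x_0 \cdots x_0 x_1) = [p(x_0 x_0 \cdots x_0 x_0),\ p(x_0 x_0 \cdots x_0 x_0) + p(x_0 x_0 \cdots x_0 x_1)]$
$\vdots$
$I(x_{J-1} x_{J-1} \cdots x_{J-1}) = [1 - p(x_{J-1} x_{J-1} \cdots x_{J-1}),\ 1]$
To get the codeword corresponding to $u_1 u_2 \cdots u_n$, let $I(u_1 u_2 \cdots u_n) = [a, b]$. Represent the midpoint $(a+b)/2$ by its binary expansion
$$\frac{a+b}{2} = 0.B_1 B_2 \cdots B_L \cdots = \sum_{i=1}^{\infty} B_i 2^{-i}, \qquad B_i \in \{0,1\}.$$
Let
$$L = \lceil -\log p(u_1 \cdots u_n) \rceil + 1 = \lceil -\log (b-a) \rceil + 1,$$
where $\log$ denotes the base-2 logarithm. The binary sequence $B_1 B_2 \cdots B_L$ is the codeword of $u_1 u_2 \cdots u_n$. The length of the codeword assigned to $u_1 u_2 \cdots u_n$ is equal to $\lceil -\log p(u_1 \cdots u_n) \rceil + 1$.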

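Shannon-Fano-Elias Codes: Decoding
Let
$$\left\lfloor \frac{a+b}{2} \right\rfloor_L = 0.B_1 B_2 \cdots B_L$$
be the real number obtained by rounding off $(a+b)/2$ to the first $L$ bits. We can prove that $\left\lfloor \frac{a+b}{2} \right\rfloor_L$ is inside the interval $[a,b]$:
$$\frac{a+b}{2} - \left\lfloor \frac{a+b}{2} \right\rfloor_L = 0.0 \cdots 0\, B_{L+1} B_{L+2} \cdots = \sum_{i=L+1}^{\infty} B_i 2^{-i} < 2^{-L} = 2^{-(\lceil -\log p(u_1 \cdots u_n) \rceil + 1)} \le \frac{1}{2}\, p(u_1 \cdots u_n) = \frac{b-a}{2}.$$
Since $\frac{a+b}{2} - \frac{b-a}{2} = a$, the truncated midpoint $\left\lfloor \frac{a+b}{2} \right\rfloor_L$ is inside $[a, b]$. Furthermore,
$$\left[ \left\lfloor \frac{a+b}{2} \right\rfloor_L,\ \left\lfloor \frac{a+b}{2} \right\rfloor_L + 2^{-L} \right] \subseteq [a, b].$$
After receiving the codeword $B_1 B_2 \cdots B_L$, the decoder searches through all $u_1 u_2 \cdots u_n \in \mathcal{X}^n$ until the unique $u_1 u_2 \cdots u_n$ is found for which $I(u_1 u_2 \cdots u_n)$ contains $\left\lfloor \frac{a+b}{2} \right\rfloor_L = 0.B_1 B_2 \cdots B_L$, and then decodes $B_1 B_2 \cdots B_L$ as this $u_1 u_2 \cdots u_n$.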

Shannon-Fano-Elias Codes: Example

x      p(x)    I(x)             L(x) = ⌈-log p(x)⌉ + 1    midpoint (binary)    C(x)
x_0    0.25    [0, 0.25]        3                          0.001                001
x_1    0.5     [0.25, 0.75]     2                          0.10                 10
x_2    0.125   [0.75, 0.875]    4                          0.1101               1101
x_3    0.125   [0.875, 1]       4                          0.1111               1111

The Shannon-Fano-Elias code is a prefix code.
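The table above can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function names `sfe_code` and `ceil_neg_log2` are hypothetical, the pmf is passed as an ordered list of (symbol, probability) pairs, and exact rational arithmetic is used so that no rounding issues arise.

```python
from fractions import Fraction


def ceil_neg_log2(p):
    """Smallest integer k with 2**(-k) <= p, i.e. ceil(-log2 p) for 0 < p <= 1."""
    k = 0
    while Fraction(1, 2 ** k) > p:
        k += 1
    return k


def sfe_code(pmf):
    """Shannon-Fano-Elias codewords for an ordered list of (symbol, probability)."""
    codes = {}
    low = Fraction(0)
    for symbol, p in pmf:
        p = Fraction(p)
        high = low + p                      # I(symbol) = [low, high]
        length = ceil_neg_log2(p) + 1       # L = ceil(-log2 p) + 1
        x = (low + high) / 2                # midpoint of the interval
        bits = []
        for _ in range(length):             # first L bits of the binary expansion
            x *= 2
            bit = 1 if x >= 1 else 0
            bits.append(str(bit))
            x -= bit
        codes[symbol] = "".join(bits)
        low = high
    return codes


example = [("x0", Fraction(1, 4)), ("x1", Fraction(1, 2)),
           ("x2", Fraction(1, 8)), ("x3", Fraction(1, 8))]
print(sfe_code(example))
```

For the pmf in the table this yields the codewords 001, 10, 1101 and 1111.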

Arithmetic Coding
The encoding complexity of the Shannon-Fano-Elias coding algorithm mainly lies in the process of determining the interval $I(u_1 u_2 \cdots u_n)$. Similarly, given $B_1 B_2 \cdots B_L$, the decoding complexity of the Shannon-Fano-Elias coding algorithm mainly lies in the process of finding the unique interval $I(u_1 u_2 \cdots u_n)$ such that the point $0.B_1 B_2 \cdots B_L$ is in $I(u_1 u_2 \cdots u_n)$.
In arithmetic coding, both of these processes can be realized sequentially with linear complexity.
The idea of arithmetic coding originated with Elias and was later made practical by Rissanen, Pasco, Moffat and Witten.

Arithmetic Coding (Continued)
1) To determine the interval $I(u_1 u_2 \cdots u_n)$, we decompose the joint probability $p(u_1 u_2 \cdots u_n)$ as
$$p(u_1 u_2 \cdots u_n) = p(u_1)\, p(u_2 \mid u_1)\, p(u_3 \mid u_1 u_2) \cdots p(u_n \mid u_1 \cdots u_{n-1}).$$
We then construct a sequence of embedded intervals
$$I(u_1) \supseteq I(u_1 u_2) \supseteq \cdots \supseteq I(u_1 u_2 \cdots u_n).$$
2) Partition the interval $[0, 1]$ into disjoint sub-intervals $I(x_j)$, $0 \le j \le J-1$, arranged from left to right as $I(x_0), I(x_1), I(x_2), \ldots, I(x_{J-2}), I(x_{J-1})$. The length of the interval $I(x_j)$ is equal to $p(x_j)$. Then $I(u_1) = I(x_j)$ if $u_1 = x_j$.
3) If $I(u_1 u_2 \cdots u_i) = [a_i, b_i]$, we then partition $[a_i, b_i]$ into disjoint sub-intervals $I(u_1 \cdots u_i x_j)$, $0 \le j \le J-1$, according to the conditional pmf $p(x_j \mid u_1 \cdots u_i)$, $0 \le j \le J-1$. From $a_i$ on the left to $b_i$ on the right, the sub-intervals are $I(u_1 \cdots u_i x_0), I(u_1 \cdots u_i x_1), \ldots, I(u_1 \cdots u_i x_{J-1})$.

Arithmetic Coding (Continued)
The length of the interval $I(u_1 \cdots u_i x_j)$ is equal to
$$p(u_1 \cdots u_i x_j) = p(u_1 \cdots u_i)\, p(x_j \mid u_1 \cdots u_i) = (\text{the length of } [a_i, b_i]) \times p(x_j \mid u_1 \cdots u_i).$$
Then $I(u_1 \cdots u_i u_{i+1}) = I(u_1 \cdots u_i x_j)$ if $u_{i+1} = x_j$.
4) Repeat step 3) until the interval $I(u_1 \cdots u_n)$ is determined. The last interval $I(u_1 \cdots u_n)$ is the desired interval.
5) To get the codeword corresponding to $u_1 \cdots u_n$, we apply the same procedure as in Shannon-Fano-Elias coding. Let $I(u_1 u_2 \cdots u_n) = [a, b]$ and let $L = \lceil -\log p(u_1 \cdots u_n) \rceil + 1$. Rounding off the midpoint $(a+b)/2$ to the first $L$ bits, we get
$$\left\lfloor \frac{a+b}{2} \right\rfloor_L = 0.B_1 B_2 \cdots B_L.$$
The sequence $B_1 B_2 \cdots B_L$ is the codeword corresponding to $u_1 \cdots u_n$.
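The encoding steps can be sketched in Python as follows. This is a minimal illustration, not a practical coder: it uses exact rational arithmetic (`fractions.Fraction`) instead of the finite-precision arithmetic that real implementations need, and the names `arithmetic_encode`, `cond_pmf` and `ceil_neg_log2` are hypothetical. `cond_pmf(prefix)` is assumed to return the conditional pmf given the symbols already processed, as a list of (symbol, probability) pairs in alphabet order.

```python
from fractions import Fraction


def ceil_neg_log2(p):
    """Smallest integer k with 2**(-k) <= p, i.e. ceil(-log2 p) for 0 < p <= 1."""
    k = 0
    while Fraction(1, 2 ** k) > p:
        k += 1
    return k


def arithmetic_encode(sequence, cond_pmf):
    """Return (a, b, codeword) where [a, b] = I(u_1 ... u_n)."""
    low, width = Fraction(0), Fraction(1)
    prefix = []
    for u in sequence:
        cum = Fraction(0)                   # probability mass to the left of u
        for symbol, p in cond_pmf(prefix):
            if symbol == u:
                low += width * cum          # move to the chosen sub-interval
                width *= p                  # its length is width * p(u | prefix)
                break
            cum += p
        else:
            raise ValueError("symbol %r not in alphabet" % (u,))
        prefix.append(u)
    length = ceil_neg_log2(width) + 1       # L = ceil(-log2 p(u_1 ... u_n)) + 1
    x = low + width / 2                     # midpoint (a + b) / 2
    bits = []
    for _ in range(length):                 # keep the first L bits
        x *= 2
        bit = 1 if x >= 1 else 0
        bits.append(str(bit))
        x -= bit
    return low, low + width, "".join(bits)


def memoryless(prefix):
    """Assumed example source: i.i.d. bits with p(0) = 2/5, p(1) = 3/5."""
    return [("0", Fraction(2, 5)), ("1", Fraction(3, 5))]


print(arithmetic_encode("10110", memoryless))
```

For this memoryless source and the sequence 10110, the sketch returns the interval [346/625, 1838/3125] and the codeword 100100. Each symbol costs one pass over the alphabet, so the work grows linearly in $n$.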

Arithmetic Coding (Decoding)
The decoding process can be realized sequentially.
1) Partition $[0, 1)$ into disjoint sub-intervals $I(x_j)$, $0 \le j \le J-1$. If $0.B_1 B_2 \cdots B_L \in I(x_j)$, set $u_1 = x_j$.
2) Having decoded $u_1 u_2 \cdots u_i$, we then partition $I(u_1 u_2 \cdots u_i)$ into disjoint sub-intervals $I(u_1 u_2 \cdots u_i x_j)$, $0 \le j \le J-1$. If $0.B_1 B_2 \cdots B_L \in I(u_1 u_2 \cdots u_i x_j)$, then set $u_{i+1} = x_j$.
3) Repeat step 2) until the sequence $u_1 u_2 \cdots u_n$ is decoded.
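A matching decoder sketch in Python is given below, again with exact rational arithmetic and hypothetical names (`arithmetic_decode`, `cond_pmf`). It assumes half-open sub-intervals $[\text{lo}, \text{hi})$, that the sequence length `n` is known to the decoder, and that `cond_pmf(prefix)` returns the same conditional pmf the encoder used, as (symbol, probability) pairs in alphabet order.

```python
from fractions import Fraction


def arithmetic_decode(codeword, n, cond_pmf):
    """Decode n symbols from the bit string B_1 ... B_L."""
    # The point 0.B_1 B_2 ... B_L as an exact fraction.
    value = Fraction(int(codeword, 2), 2 ** len(codeword))
    low, width = Fraction(0), Fraction(1)
    decoded = []
    for _ in range(n):
        cum = Fraction(0)
        for symbol, p in cond_pmf(decoded):
            lo = low + width * cum          # left end of I(u_1 ... u_i x_j)
            hi = lo + width * p             # right end of I(u_1 ... u_i x_j)
            if lo <= value < hi:
                decoded.append(symbol)
                low, width = lo, width * p
                break
            cum += p
        else:
            raise ValueError("point is outside the current interval")
    return "".join(decoded)


def memoryless(prefix):
    """Assumed example source: i.i.d. bits with p(0) = 2/5, p(1) = 3/5."""
    return [("0", Fraction(2, 5)), ("1", Fraction(3, 5))]


print(arithmetic_decode("100100", 5, memoryless))
```

With the codeword 100100 and $n = 5$, the point is $36/64 = 0.5625$ and the sketch recovers the sequence 10110.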

Arithmetic Coding
1) In arithmetic coding, the length $n$ of the sequence $u_1 u_2 \cdots u_n$ to be compressed is assumed to be known to both the encoder and the decoder.
2) The length of the codeword assigned to $u_1 u_2 \cdots u_n$ is
$$L = \lceil -\log p(u_1 \cdots u_n) \rceil + 1.$$
Thus the average codeword length in bits/symbol converges to the entropy rate of a stationary source as $n$ approaches infinity.
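The convergence claim follows from a short bound, sketched here under the assumption that the true joint pmf of the source is used for coding. The symbols $E[L]$ (expected codeword length) and $H(X_1 \cdots X_n)$ (joint entropy in bits) are introduced for this sketch.

```latex
% The ceiling adds less than one bit, so L is within one bit of -log p + 1:
-\log p(u_1 \cdots u_n) + 1 \;\le\; L \;<\; -\log p(u_1 \cdots u_n) + 2

% Taking expectations over (X_1, ..., X_n) and dividing by n:
\frac{H(X_1 \cdots X_n)}{n} + \frac{1}{n} \;\le\; \frac{E[L]}{n} \;<\; \frac{H(X_1 \cdots X_n)}{n} + \frac{2}{n}
```

For a stationary source, $H(X_1 \cdots X_n)/n$ converges to the entropy rate, and the overhead of at most $2/n$ bits per symbol vanishes as $n$ grows.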

Arithmetic Coding (Example)
Let $\{X_i\}$ be a discrete memoryless source with common pmf $p(0) = 2/5$, $p(1) = 3/5$ and alphabet $\mathcal{X} = \{0, 1\}$. Let $u_1 u_2 \cdots u_5 = 10110$. We have
I(1) = [2/5, 1]
I(10) = [2/5, 16/25]
I(101) = [62/125, 16/25]
I(1011) = [346/625, 16/25]
I(10110) = [346/625, 1838/3125]
The length of I(10110) is 108/3125, so
$$L = \left\lceil -\log \frac{108}{3125} \right\rceil + 1 = 6.$$
The midpoint is 1784/3125, whose binary expansion begins 0.100100, so the codeword is 100100.

Arithmetic Coding

Source symbol    Probability    Initial subinterval
x_0              0.2            [0.0, 0.2)
x_1              0.2            [0.2, 0.4)
x_2              0.4            [0.4, 0.8)
x_3              0.2            [0.8, 1.0]

Let the message to be encoded be $x_0 x_1 x_2 x_2 x_3$.

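Encoding sequence: $x_0 x_1 x_2 x_2 x_3$
[Figure: successive subdivision of the current interval. Each column lists the sub-interval boundaries of the current interval, from which the symbol at the top of the column selects one sub-interval.]

Symbol to encode         x_0     x_1     x_2      x_2       x_3
Upper end                1.0     0.2     0.08     0.072     0.0688
Boundary x_2 / x_3       0.8     0.16    0.072    0.0688    0.06752
Boundary x_1 / x_2       0.4     0.08    0.056    0.0624    0.06496
Boundary x_0 / x_1       0.2     0.04    0.048    0.0592    0.06368
Lower end                0.0     0.0     0.04     0.056     0.0624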

The final interval is [0.06752, 0.0688). From it we can obtain the codeword length $L$ and the corresponding codeword, as in Shannon-Fano-Elias coding.
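As a hedged illustration of this last step, the following Python sketch computes $L$ and the codeword from the final interval, using the rule $L = \lceil -\log(b-a) \rceil + 1$ and truncation of the midpoint to $L$ bits. The function name `codeword_for_interval` is hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def codeword_for_interval(a, b):
    """Return (L, codeword) for the interval [a, b) with 0 <= a < b <= 1."""
    width = b - a
    k = 0
    while Fraction(1, 2 ** k) > width:      # k = ceil(-log2(b - a))
        k += 1
    length = k + 1
    x = (a + b) / 2                         # midpoint of the interval
    bits = []
    for _ in range(length):                 # first L bits of its binary expansion
        x *= 2
        bit = 1 if x >= 1 else 0
        bits.append(str(bit))
        x -= bit
    return length, "".join(bits)


print(codeword_for_interval(Fraction("0.06752"), Fraction("0.0688")))
```

The interval length is 0.00128, which gives $L = 11$, and the first 11 bits of the midpoint 0.06816 are 00010001011.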

Adaptive Arithmetic Coding
In the above description of arithmetic coding, we assume that both the encoder and decoder know in advance the joint pmf of the random vector $(X_1, X_2, \ldots, X_n)$. In practice, the pmf is often unknown and has to be estimated online or offline.
For simplicity, let $\mathcal{X} = \{0, 1\}$. The initial pmf is uniform, i.e., $p(0) = p(1) = 1/2$. After $u_1 u_2 \cdots u_i$ is processed, the conditional pmf given $u_1 u_2 \cdots u_i$ is
$$p(1 \mid u_1 u_2 \cdots u_i) = \frac{(\text{number of 1s in } u_1 u_2 \cdots u_i) + 1}{i + 2}, \qquad p(0 \mid u_1 u_2 \cdots u_i) = \frac{(\text{number of 0s in } u_1 u_2 \cdots u_i) + 1}{i + 2}.$$
Let $u_1 u_2 \cdots u_8 = 11001010$. Then according to the above,
$$p(u_1 u_2 \cdots u_8) = p(11001010) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{1}{4} \cdot \frac{2}{5} \cdot \frac{3}{6} \cdot \frac{3}{7} \cdot \frac{4}{8} \cdot \frac{4}{9}.$$

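Adaptive Arithmetic Coding
Another choice for the conditional pmf given $u_1 u_2 \cdots u_i$ is as follows:
$$p(1 \mid u_1 u_2 \cdots u_i) = \frac{(\text{number of 1s in } u_1 u_2 \cdots u_i) + 1/2}{i + 1}, \qquad p(0 \mid u_1 u_2 \cdots u_i) = \frac{(\text{number of 0s in } u_1 u_2 \cdots u_i) + 1/2}{i + 1}.$$
For the same sequence,
$$p(11001010) = \frac{1}{2} \cdot \frac{3/2}{2} \cdot \frac{1/2}{3} \cdot \frac{3/2}{4} \cdot \frac{5/2}{5} \cdot \frac{5/2}{6} \cdot \frac{7/2}{7} \cdot \frac{7/2}{8}.$$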
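Both estimators have the form (count + $\alpha$)/($i$ + 2$\alpha$), with $\alpha = 1$ for the first choice and $\alpha = 1/2$ for the second. The Python sketch below evaluates the resulting sequence probability for a binary string. The function name `sequence_probability` and the parameter `alpha` are hypothetical, introduced only to cover both rules with one function.

```python
from fractions import Fraction


def sequence_probability(bits, alpha):
    """Product of the sequential estimates (count + alpha) / (i + 2 * alpha)."""
    counts = {"0": 0, "1": 0}
    prob = Fraction(1)
    for i, b in enumerate(bits):            # i symbols have been seen so far
        prob *= (counts[b] + alpha) / (i + 2 * alpha)
        counts[b] += 1                      # update the statistics after coding b
    return prob


print(sequence_probability("11001010", Fraction(1)))       # add-one rule
print(sequence_probability("11001010", Fraction(1, 2)))    # add-one-half rule
```

For the sequence 11001010, the two products above evaluate to 1/630 and 35/32768 respectively. Because the decoder sees the same past symbols, it can recompute the same conditional pmf at every step without any side information.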

Lempel-Ziv Algorithm
The adaptive arithmetic coding presented at the end of the last section is universal because it does not require source statistics and can achieve the ultimate compression rate of any discrete memoryless source.
Lempel-Ziv is another universal source coding algorithm, developed by Ziv and Lempel. One Lempel-Ziv algorithm is LZ77, known as the sliding-window Lempel-Ziv algorithm, which was published in 1977. One year later, they proposed a variant of LZ77, the incremental parsing Lempel-Ziv algorithm, i.e., LZ78. In this course we will look at LZ78.

Lempel-Ziv Parsing
LZ78 adopts an incremental parsing procedure, which parses the source sequence $u_1 u_2 \cdots u_n$ into non-overlapping variable-length blocks.
The first substring in the incremental parsing of $u_1 u_2 \cdots u_n$ is $u_1$. The second substring in the parsing is the shortest prefix of $u_2 \cdots u_n$ that has not appeared so far in the parsing.
Assume that $u_1,\ u_2 \cdots u_{n_2},\ u_{n_2+1} \cdots u_{n_3},\ \ldots,\ u_{n_{i-1}+1} \cdots u_{n_i}$ are the substrings created so far in the parsing process. The next substring, denoted $u_{n_i+1} \cdots u_{n_{i+1}}$, is the shortest prefix of $u_{n_i+1} \cdots u_n$ that has not appeared in $\{u_1,\ u_2 \cdots u_{n_2},\ u_{n_2+1} \cdots u_{n_3},\ \ldots,\ u_{n_{i-1}+1} \cdots u_{n_i}\}$, if such a prefix exists. Otherwise, $u_{n_i+1} \cdots u_{n_{i+1}} = u_{n_i+1} \cdots u_n$ with $n_{i+1} = n$, and the incremental parsing procedure terminates.

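Lempel-Ziv Parsing: Example
Example 1
Source sequence (shown with phrase boundaries): 1 0 10 11 100 111 00 1110 001 110 01
The incremental parsing procedure yields the following partition:
1, 0, 10, 11, 100, 111, 00, 1110, 001, 110, 01
Example 2
Source sequence (shown with phrase boundaries): 1 10 11 0 00 110 1
The incremental parsing procedure yields the following partition:
1, 10, 11, 0, 00, 110, 1
In this example, the last substring 1 has already appeared.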
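The incremental parsing procedure can be sketched in a few lines of Python. The function name `lz78_parse` is hypothetical, and the source sequence is assumed to be given as a string of symbols.

```python
def lz78_parse(sequence):
    """Split `sequence` into LZ78 phrases by incremental parsing."""
    seen = set()          # phrases created so far
    phrases = []
    current = ""          # prefix of the phrase being built
    for symbol in sequence:
        current += symbol
        if current not in seen:
            # Shortest prefix of the remaining input not seen before.
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # Input ended inside a phrase that has already appeared.
        phrases.append(current)
    return phrases


print(lz78_parse("10101110011100111000111001"))   # Example 1
print(lz78_parse("110110001101"))                 # Example 2
```

Applied to the two example sequences, the sketch reproduces the partitions listed above, including the repeated final phrase 1 in Example 2.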

Lempel-Ziv Parsing
The concatenation of all phrases is equal to the original source sequence.
All phrases are distinct, except that the last phrase could be equal to one of the preceding ones. In Example 2, the last phrase is equal to the first one.
Let $\Lambda$ denote the empty string. Think of $\Lambda$ as an initial phrase before the first phrase in the incremental parsing.
Each new phrase in the parsing is the concatenation of a previous phrase with a new output letter from the source sequence. For example, the first phrase 1 is the concatenation of the empty string $\Lambda$ with the new symbol 1. Similarly, the phrase 110 is the concatenation of the phrase 11 with the new symbol 0.

Lempel-Ziv Encoding
Let $\mathcal{X} = \{x_0, \ldots, x_{J-1}\}$. The Lempel-Ziv encoding of the sequence $u_1 u_2 \cdots u_n$ can be implemented sequentially as follows.
1. The first phrase $u_1$ is uniquely determined by the pair $(0, u_1)$, where the index 0 corresponds to the initial empty phrase $\Lambda$. Represent the pair $(0, u_1)$ by the integer $0 \times J + \mathrm{index}(u_1)$, where $\mathrm{index}(u_1) = j$ if $u_1 = x_j$, $0 \le j \le J-1$. Encode the first phrase into the binary representation of the integer $0 \times J + \mathrm{index}(u_1) = \mathrm{index}(u_1)$, padded with zeros on the left if necessary so that the total length of the codeword is $\lceil \log J \rceil$.
2. Having determined the $i$th phrase, we know that it is equal to the concatenation of the $m$th phrase with a new symbol $x_j$ for some $0 \le m \le i-1$ and $0 \le j \le J-1$. Encode the $i$th phrase into the binary representation of the integer $mJ + j$, padded with zeros on the left if necessary so that the total length of the codeword is $\lceil \log (iJ) \rceil$.
3. Repeat step 2 until all phrases are encoded.

Lempel-Ziv Encoding: Example
Partitioned phrases: 1 10 11 0 00 110 1
$\mathcal{X} = \{0, 1\}$, $J = 2$.

Phrase    (m, j)    Codeword    Length
1         (0, 1)    1           1
10        (1, 0)    10          2
11        (1, 1)    011         3
0         (0, 0)    000         3
00        (4, 0)    1000        4
110       (3, 0)    0110        4
1         (0, 1)    0001        4

So the Lempel-Ziv coding transforms the original source sequence
1 10 11 0 00 110 1
into
1 10 011 000 1000 0110 0001
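A compact Python sketch of this encoding is shown below. The function name `lz78_encode` is hypothetical, the alphabet is passed as a string whose character positions serve as the indices $j$, and the codeword length $\lceil \log (iJ) \rceil$ is computed as `(i * J - 1).bit_length()`, which equals the ceiling of the base-2 logarithm for positive integers. The sketch assumes $J \ge 2$.

```python
def lz78_encode(sequence, alphabet):
    """Return the list of LZ78 codewords, one per phrase, as bit strings."""
    J = len(alphabet)
    index = {"": 0}       # phrase -> phrase number; 0 is the empty phrase
    codewords = []
    current = ""          # previously seen phrase that the new phrase extends
    i = 0                 # number of phrases emitted so far
    for pos, symbol in enumerate(sequence):
        candidate = current + symbol
        is_last = pos == len(sequence) - 1
        if candidate in index and not is_last:
            current = candidate             # keep extending the phrase
            continue
        i += 1
        m, j = index[current], alphabet.index(symbol)
        width = (i * J - 1).bit_length()    # ceil(log2(i * J))
        codewords.append(format(m * J + j, "0{}b".format(width)))
        index.setdefault(candidate, i)      # a repeated last phrase keeps its old number
        current = ""
    return codewords


print(lz78_encode("110110001101", "01"))
```

For the source sequence 110110001101 over the alphabet {0, 1}, this produces the codewords 1, 10, 011, 000, 1000, 0110 and 0001 from the table.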

Lempel-Ziv Encoding
In the example, instead of compression, we get expansion: the 12-bit source sequence becomes a 21-bit codeword. The problem is that the source sequence in the example is too short. In fact, LZ78 can achieve the entropy rate of any stationary source as the length of the source sequence grows without bound.
If there are $t$ phrases in the incremental parsing of $u_1 u_2 \cdots u_n$, then the length of the whole Lempel-Ziv codeword for $u_1 u_2 \cdots u_n$ is
$$\sum_{i=1}^{t} \lceil \log (iJ) \rceil.$$

Lempel-Ziv Decoding
The decoding process is easy and can also be done sequentially, since the decoder knows in advance that the length of the codeword corresponding to the $i$th phrase is $\lceil \log (iJ) \rceil$.
After receiving the whole codeword, the decoder parses it into non-overlapping substrings of lengths $\lceil \log (iJ) \rceil$, $1 \le i \le t$.
From the $i$th substring, the decoder finds the integer $mJ + j$ and hence the pair $(m, j)$. Then the $i$th phrase is the concatenation of the $m$th phrase with the symbol $x_j$.

Lempel-Ziv Decoding: Example

Codewords    1        10       011      000      1000     0110     0001
Integers     1        2        3        0        8        6        1
Pairs        (0,1)    (1,0)    (1,1)    (0,0)    (4,0)    (3,0)    (0,1)
Phrases      1        10       11       0        00       110      1
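The decoding table can be reproduced with the following Python sketch. The function name `lz78_decode` is hypothetical, the alphabet is passed as a string indexed by $j$, and $J \ge 2$ is assumed so that every codeword has positive length.

```python
def lz78_decode(bitstream, alphabet):
    """Recover the source sequence from the concatenated LZ78 codewords."""
    J = len(alphabet)
    phrases = [""]        # phrase 0 is the empty phrase
    pos = 0
    i = 0
    while pos < len(bitstream):
        i += 1
        width = (i * J - 1).bit_length()    # ceil(log2(i * J)) bits for phrase i
        value = int(bitstream[pos:pos + width], 2)
        pos += width
        m, j = divmod(value, J)             # value = m * J + j
        phrases.append(phrases[m] + alphabet[j])
    return "".join(phrases[1:])


codeword = "".join(["1", "10", "011", "000", "1000", "0110", "0001"])
print(lz78_decode(codeword, "01"))
```

Decoding the 21-bit codeword of the example returns the original source sequence 110110001101.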

Performance of Lempel-Ziv Coding
Theorem 2.6.1 Let $\{X_i\}$ be a discrete stationary source. Let $r(X_1 \cdots X_n)$ be the ratio between the length of the whole Lempel-Ziv codeword for $X_1 \cdots X_n$ and the length $n$ of $X_1 \cdots X_n$, i.e., the compression rate in bits per symbol. Then
$$E[r(X_1 \cdots X_n)] \to H_\infty(X) \quad \text{as } n \to \infty,$$
where $H_\infty(X)$ denotes the entropy rate of the source.