Image and Video Coding: Lossless Coding IV
H. Schwarz (FU Berlin)

Example code:

  a_k   p_k    b_k
  a     0.16   111
  b     0.04   0001
  c     0.04   0000
  d     0.16   110
  e     0.23   01
  f     0.07   1001
  g     0.06   1000
  h     0.09   001
  i     0.15   101

[Figure: binary code tree for this prefix code; the root carries weight 100, the inner nodes carry weights 60/40, 32/28, 23/17, 16/15/13/9/8/7/6/4, and the leaves a, ..., i correspond to the codewords above.]

Summary of Last Lectures

Unique Decodability / Prefix Codes
- There are no better uniquely decodable codes than the best prefix codes
- Only need to consider prefix codes (which are also instantaneously decodable)

Variable-Length Codes
- Scalar codes: \bar{\ell} \ge H(S)
- Conditional codes: \bar{\ell} \ge H(S_n | S_{n-1}) with H(S_n | S_{n-1}) \le H(S)
- Block codes: \bar{\ell} \ge H_N(S)/N with H_N(S)/N \le H_{N-1}(S)/(N-1)
- V2V codes: codes for variable-length symbol sequences
- Optimal code construction: Huffman algorithm for the corresponding pmf

Fundamental Lossless Source Coding Theorem
For all lossless coding techniques:
    \bar{\ell} \ge \bar{H}(S) = \lim_{N \to \infty} H_N(S)/N    (entropy rate: largest lower bound)

Variable-Length Codes for Instationary or Unknown Sources
- Adapt the code during encoding/decoding (forward/backward adaptation)

Shannon-Fano-Elias Codes / Review: Basic Concept

Shannon-Fano-Elias Coding: Review
- Special block code of size N
- The order of the N-symbol sequences {s_k} is known by encoder and decoder
- The N-th order pmf p(s_k) = P(S = s_k) is known by encoder and decoder
- On-the-fly encoding and decoding (no codeword table)

Basic Concept
- Unique mapping of N-symbol sequences to intervals I_k of the cdf F(s)
- Half-open intervals I_k = [L_k, U_k) = [L_k, L_k + W_k) are characterized by
    interval width:          W_k = F(s_k) - F(s_{k-1}) = p(s_k)
    lower interval boundary: L_k = F(s_{k-1}) = \sum_{i<k} p(s_i)
- Codewords: fractional bits of a representative value v_k inside the interval I_k
    length of codeword:   K = \lceil -\log_2 W_k \rceil
    representative value: v_k = \lceil L_k 2^K \rceil 2^{-K} = z_k 2^{-K}
    codeword:             binary representation of z_k = \lceil L_k 2^K \rceil with K bits

Shannon-Fano-Elias Encoding: Illustration

[Figure: cdf F(s) over the ordered messages s_0, ..., s_{k-1}, s_k, s_{k+1}, .... The sequence s_k is mapped to the interval I(s_k) = [L, L+W) with W = p(s_k) and L = \sum_{i<k} p(s_i). The codeword b(s_k) is the binary representation, with K = \lceil -\log_2 W \rceil bits, of the integer z = \lceil L 2^K \rceil; the representative value is v = \lceil L 2^K \rceil 2^{-K}.]

Shannon-Fano-Elias Encoding: Summary

Determination of Codewords
Given: an ordered set of symbol sequences {s_k} with associated pmf {p_k}.
Construct the codeword b_k = b(s_k) for a particular sequence s_k as follows:
1. Determine the interval width W_k and the lower interval boundary L_k:
     W_k = p_k    (1)
     L_k = \sum_{i<k} p_i    (2)
2. Determine the codeword length K_k:
     K_k = \lceil -\log_2 W_k \rceil    (3)
3. Determine the representative integer z_k:
     z_k = \lceil L_k 2^{K_k} \rceil    (4)
4. Codeword b_k: binary representation of z_k with K_k bits.
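The four steps map directly to code. Below is a minimal C++ sketch (a hypothetical helper, not the lecture's reference implementation) that computes K_k, z_k, and the codeword bits from a pmf stored as doubles; floating-point precision limits it to short sequences, which is exactly the issue arithmetic coding addresses later.

```cpp
#include <cmath>
#include <cstdint>
#include <string>
#include <vector>

// Shannon-Fano-Elias codeword for the k-th sequence of an ordered pmf {p_0, ..., p_{M-1}}.
std::string sfeCodeword(const std::vector<double>& p, std::size_t k, bool prefixFree = false) {
    double W = p[k];                                     // W_k = p_k
    double L = 0.0;                                      // L_k = sum_{i<k} p_i
    for (std::size_t i = 0; i < k; ++i) L += p[i];
    int K = static_cast<int>(std::ceil(-std::log2(W)));  // K_k = ceil(-log2 W_k)
    if (prefixFree) K += 1;                              // prefix-free variant: one extra bit
    auto z = static_cast<std::uint64_t>(std::ceil(L * std::ldexp(1.0, K)));  // z_k = ceil(L_k 2^K)
    std::string bits(K, '0');                            // z_k as K bits, MSB first
    for (int i = K - 1; i >= 0; --i, z >>= 1) bits[i] = char('0' + (z & 1));
    return bits;
}
```

For the block pmf of the example below, sfeCodeword({0.512, 0.128, 0.128, 0.032, 0.128, 0.032, 0.032, 0.008}, 1) returns "101", matching the table row for aab.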

Shannon-Fano-Elias Decoding: Illustration

[Figure: cdf F(s) over the ordered messages s_0, ..., s_{k-1}, s_k, s_{k+1}, .... The decoder reads the codeword b (binary representation of z with K bits) and forms the representative value v = z 2^{-K}. It compares v with the upper interval boundaries U = L + W in increasing order, U_k = \sum_{i \le k} p(s_i): since U_0 \le v, ..., U_{k-1} \le v, but U_k > v, the decoded message is s_k.]

Shannon-Fano-Elias Decoding: Summary

Decoding of a Symbol Sequence
Given: an ordered set of symbol sequences {s_k} with associated pmf {p_k}.
1. Read the codeword b: it has K bits and represents the binary value of the integer z.
2. Determine the representative value v according to
     v = z 2^{-K}    (5)
3. Initialization: set the index k = 0 and the upper interval boundary U_0 = p_0.
4. Compare v with U_k:
   - If v < U_k: output the decoded symbol sequence s_k and terminate decoding.
   - Otherwise (v \ge U_k): set k = k + 1 and U_k = U_{k-1} + p_k; go to step 4.
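The corresponding decoder is a linear search over the upper interval boundaries; a sketch under the same assumptions as the encoder helper above:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Decode one Shannon-Fano-Elias codeword (integer z of K bits) against an ordered pmf.
// Assumes a valid codeword, i.e., v lies inside one of the intervals.
std::size_t sfeDecode(const std::vector<double>& p, std::uint64_t z, int K) {
    double v = std::ldexp(static_cast<double>(z), -K);  // v = z 2^{-K}
    double U = 0.0;                                     // U_k = sum_{i<=k} p_i
    for (std::size_t k = 0; ; ++k) {
        U += p[k];
        if (v < U) return k;    // first interval whose upper boundary exceeds v
    }
}
```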

Shannon-Fano-Elias Codes / Prefix-free Variant

Example for a Shannon-Fano-Elias Code: Blocks of Three Symbols for a Binary IID Source

Binary iid source with alphabet A = {a, b} and pmf p = {0.8, 0.2}:

  s_k   p_k     W_k     L_k     K_k   z_k   b_k
  aaa   0.512   0.512   0.000   1     0     0
  aab   0.128   0.128   0.512   3     5     101
  aba   0.128   0.128   0.640   3     6     110
  abb   0.032   0.032   0.768   5     25    11001
  baa   0.128   0.128   0.800   3     7     111
  bab   0.032   0.032   0.928   5     30    11110
  bba   0.032   0.032   0.960   5     31    11111
  bbb   0.008   0.008   0.992   7     127   1111111

with W_k = p_k, L_k = \sum_{i<k} p_i, K_k = \lceil -\log_2 W_k \rceil, z_k = \lceil L_k 2^{K_k} \rceil; b_k is the binary representation of z_k with K_k bits.

- Average codeword length: \bar{\ell} = 0.733 (block Huffman code: \bar{\ell} = 0.728)
- Worse than the block Huffman code for the same block size (N = 3)
- The code is not prefix-free! This can be a problem (depends on the application).
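As a quick sanity check of the table, the row for abb works out as follows:

```latex
W = 0.8 \cdot 0.2 \cdot 0.2 = 0.032, \qquad
K = \lceil -\log_2 0.032 \rceil = \lceil 4.97 \rceil = 5, \qquad
z = \lceil 0.768 \cdot 2^5 \rceil = \lceil 24.576 \rceil = 25 = (11001)_2 .
```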

Why Is the Code Not Prefix-Free? Effect of Codeword Concatenation

[Figure: number line with the grid points (i-2) 2^{-K}, ..., (i+2) 2^{-K}; the transmitted value v lies inside the interval I, while the modified value v' can fall outside of I.]

- The encoder transmits the codeword b = {b_0, b_1, b_2, ..., b_{K-1}} of K bits, signaling the binary fraction
    v = (0.b_0 b_1 b_2 ... b_{K-1})_b
- The decoder sees a modified binary fraction v' given by
    v' = (0.b_0 b_1 b_2 ... b_{K-1} b_K b_{K+1} b_{K+2} ...)_b
  where {b_K, b_{K+1}, b_{K+2}, ...} are the bits of the following codewords.
- Depending on the location of v inside the interval I and on the values of the following bits, v' can lie outside the interval I.

Prefix-Free Shannon-Fano-Elias Code

Prefix Code Construction
- Ensure that the binary fraction v' (seen by the decoder) lies inside the interval.
- Worst case: all following bits are equal to 1,
    v' = v + \sum_{i=K+1}^{\infty} 2^{-i} < L + W    (6)
- Since the sum in the above equation is less than 2^{-K}, we require
    v' < v + 2^{-K} \le L + W    (7)
- Question: how many bits K do we need for representing v according to v = \lceil L 2^K \rceil 2^{-K}?
- We have to choose K so that the following inequality is fulfilled:
    v + 2^{-K} = \lceil L 2^K \rceil 2^{-K} + 2^{-K} \le L + W    (8)

Prefix-Free Shannon-Fano-Elias Code

Prefix Code Construction (continued)
Since \lceil x \rceil < x + 1, the inequality \lceil L 2^K \rceil 2^{-K} + 2^{-K} \le L + W is always fulfilled if we have
    ( L 2^K + 1 ) 2^{-K} + 2^{-K} \le L + W
    L + 2 \cdot 2^{-K} \le L + W
    2^{1-K} \le W
    1 - K \le \log_2 W
Unique decodability is guaranteed if we choose
    K \ge 1 - \log_2 W    (9)
    K = \lceil 1 - \log_2 W \rceil = \lceil -\log_2 W \rceil + 1    (10)
One additional bit per codeword (compared to the non-prefix-free version).

Example for a Prefix-Free Shannon-Fano-Elias Code

Repeated Example: Blocks of Three Symbols for a Binary IID Source
Binary iid source with alphabet A = {a, b} and pmf p = {0.8, 0.2}:

  s_k   p_k     W_k     L_k     K_k   z_k   b_k
  aaa   0.512   0.512   0.000   2     0     00
  aab   0.128   0.128   0.512   4     9     1001
  aba   0.128   0.128   0.640   4     11    1011
  abb   0.032   0.032   0.768   6     50    110010
  baa   0.128   0.128   0.800   4     13    1101
  bab   0.032   0.032   0.928   6     60    111100
  bba   0.032   0.032   0.960   6     62    111110
  bbb   0.008   0.008   0.992   8     254   11111110

with W_k = p_k, L_k = \sum_{i<k} p_i, K_k = \lceil 1 - \log_2 W_k \rceil, z_k = \lceil L_k 2^{K_k} \rceil; b_k is the binary representation of z_k with K_k bits.

- Average codeword length: \bar{\ell} = 1.067 (block Huffman code: \bar{\ell} = 0.728)
- The additional bit ensures that the code becomes a prefix code
- Worse than the block Huffman code (several redundant bits)

Shannon-Fano-Elias Codes / Bounds on Average Codeword Length

Efficiency of Shannon-Fano-Elias Codes

Average Codeword Length
Average codeword length \bar{\ell} per symbol:
    \bar{\ell} = E\{ K(S) \} / N = E\{ \lceil A - \log_2 p_N(S) \rceil \} / N,
    with A = 1 for the prefix-free variant and A = 0 otherwise.    (11)

Bounds on Average Codeword Length
Using the inequalities \lceil x \rceil \ge x and \lceil x \rceil < x + 1, we obtain
    H_N(S)/N + A/N \le \bar{\ell} < H_N(S)/N + (1 + A)/N,
    where H_N(S) = E\{ -\log_2 p_N(S) \}.    (12)

- Non-prefix-free version (A = 0): same bounds as for block Huffman coding
- Both versions: close to the entropy rate for N \gg 1 (for typical sources)

Shannon-Fano-Elias Codes / Iterative Coding

Shannon-Fano-Elias Coding: Intermediate Results

Shannon-Fano-Elias Code
- Special block code (for a given number of symbols N)
- Worse than the block Huffman code of the same size N
- Still close to the entropy bound H_N/N for N \gg 1
- No need to store a codeword table!
- But: we have to store the N-th order pmf (or N-th order cdf)! What is the advantage?

Iterative Coding
- Can define a suitable order for sequences of N symbols
- Probability intervals are nested
- Iterative calculation of the interval boundaries
- Iterative codeword construction

Sorting of Symbol Sequences

Lexicographical Order
- Define any order for single symbols s_k (sorted symbol alphabet)
- Consider two sequences of N symbols, s^a = {s^a_0, s^a_1, ..., s^a_{N-1}} and s^b = {s^b_0, s^b_1, ..., s^b_{N-1}}
- Lexicographical order for sequences of N symbols:
    s^a < s^b  \iff  \exists n < N : ( \forall k < n : s^a_k = s^b_k ) \wedge ( s^a_n < s^b_n )    (13)

Example: Alphabetical Order a < b < ... < z
[Figure: list of lexicographically ordered sequences (..., aaaa, aaab, ..., aaaz, aaba, aabb, ...) with the associated cumulative probabilities P(s^{(2)} < aa) \le P(s^{(2)} < ab) and P(s^{(3)} < aaa) \le P(s^{(3)} < aab) \le P(s^{(3)} < aaz) \le P(s^{(3)} < aba) marking the interval boundaries.]

Iterative Interval Refinement: Illustration

Example: 3-symbol alphabet A = {a, b, c}, with the modified cmf c(z | ...) = \sum_{x<z} p(x | ...).

[Figure: the interval I(abb...) is partitioned into the nested sub-intervals I(abba...), I(abbb...), I(abbc...) with
    W(abbx) = W(abb) p(x | abb),
    L(abbx) = L(abb) + W(abb) c(x | abb),   for x \in {a, b, c}.]

Important: the probability intervals are nested!

Iterative Interval Refinement: Interval Width

Interval Refinement
- Consider sub-sequences s^{(n)} = {s_0, s_1, ..., s_{n-1}} with n < N
- Determine the interval I_{n+1} = [L_{n+1}, L_{n+1} + W_{n+1}) for s^{(n+1)} = {s^{(n)}, s_n} based on the interval I_n = [L_n, L_n + W_n) for the prefix sequence s^{(n)}

Refinement of Interval Width
Interval width W_{n+1} for the sub-sequence s^{(n+1)} = {s^{(n)}, s_n}:
    W_{n+1} = P( S^{(n+1)} = s^{(n+1)} )
            = P( S^{(n)} = s^{(n)}, S_n = s_n )
            = P( S^{(n)} = s^{(n)} ) \cdot P( S_n = s_n | S^{(n)} = s^{(n)} )    (14)
Iteration rule for the interval width:
    W_{n+1} = W_n \cdot p(s_n | s_{n-1}, s_{n-2}, ..., s_0)    (15)

Iterative Interval Refinement: Lower Interval Boundary

Refinement of Lower Interval Boundary
Lower interval boundary L_{n+1} for the sub-sequence s^{(n+1)} = {s^{(n)}, s_n}:
    L_{n+1} = P( S^{(n+1)} < s^{(n+1)} )
            = P( S^{(n)} < s^{(n)} ) + P( S^{(n)} = s^{(n)}, S_n < s_n )
            = P( S^{(n)} < s^{(n)} ) + P( S^{(n)} = s^{(n)} ) \cdot P( S_n < s_n | S^{(n)} = s^{(n)} )    (16)
Define the modified cmf c(.), which excludes the current symbol:
    c(s_n | s_{n-1}, ..., s_0) = P( S_n < s_n | S^{(n)} = s^{(n)} ) = \sum_{a < s_n} p(a | s_{n-1}, ..., s_0)    (17)
Iteration rule for the lower interval boundary:
    L_{n+1} = L_n + W_n \cdot c(s_n | s_{n-1}, s_{n-2}, ..., s_0)    (18)

Nested Probability Intervals: Verification

Intervals are Nested
Lower interval boundary:
    L_{n+1} = L_n + W_n P( S_n < s_n | S^{(n)} = s^{(n)} ) \ge L_n    (19)
Upper interval boundary:
    L_{n+1} + W_{n+1} = L_n + W_n P( S_n < s_n | S^{(n)} = s^{(n)} ) + W_n P( S_n = s_n | S^{(n)} = s^{(n)} )
                      = L_n + W_n P( S_n \le s_n | S^{(n)} = s^{(n)} )
                      = L_n + W_n - W_n P( S_n > s_n | S^{(n)} = s^{(n)} )
                      \le L_n + W_n    (20)
The interval for s^{(n+1)} = {s^{(n)}, s_n} is fully included in the interval for s^{(n)}.

Iterative Interval Refinement

Iterative Algorithm for the Calculation of Interval Boundaries
Initialization:
    W_0 = 1    (21)
    L_0 = 0    (22)
Iteration step:
    W_{n+1} = W_n \cdot p(s_n | s_{n-1}, ..., s_0)    (23)
    L_{n+1} = L_n + W_n \cdot c(s_n | s_{n-1}, ..., s_0)    (24)

Advantage of Interval Refinement?
- Derived iteration approach: conditional pmf p(s_n | s_{n-1}, ...) instead of the joint pmf p(s_0, ..., s_{N-1})
- In general, we need to store the same amount of data
- But: conditional pmfs can be well approximated using simple models
    IID model:    p(s_n | s_{n-1}, ..., s_0) = p(s_n)
    Markov model: p(s_n | s_{n-1}, ..., s_0) = p(s_n | s_{n-1})

Practical Iterative Interval Refinement

Simplified Iteration
IID model:
    W_{n+1} = W_n \cdot p(s_n)    (25)
    L_{n+1} = L_n + W_n \cdot c(s_n)    (26)
Markov model:
    W_{n+1} = W_n \cdot p(s_n | s_{n-1})    (27)
    L_{n+1} = L_n + W_n \cdot c(s_n | s_{n-1})    (28)
Many other simple models are possible: condition = f(s_{n-1}, ...)

Other Aspects
- Switching between symbol alphabets is possible: suitable for complicated syntax (as for prefix codes)
- Adaptation of probability models: probabilities can be estimated during encoding and decoding; an elegant way to deal with instationary or unknown sources

Iterative Encoding Algorithm

Iterative Shannon-Fano-Elias Encoding
1. Given is a sequence s = {s_0, s_1, s_2, ..., s_{N-1}} of N symbols.
2. Initialization of the probability interval:
     W_0 = 1 and L_0 = 0    (29)
3. Determine the probability interval: for each n = 0, 1, ..., N-1, do
     W_{n+1} = W_n \cdot p(s_n | ...)    (30)
     L_{n+1} = L_n + W_n \cdot c(s_n | ...)    (31)
4. Determine the codeword length and the codeword value:
     K = \lceil -\log_2 W_N \rceil    (for the prefix-free code: K \to K + 1)    (32)
     z = \lceil L_N 2^K \rceil    (33)
5. Transmit the codeword b(s): binary representation of z with K bits.
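A direct translation of this algorithm into C++ for an iid model (a sketch with hypothetical names; doubles again limit it to short sequences):

```cpp
#include <cmath>
#include <cstdint>
#include <string>
#include <vector>

// Iterative Shannon-Fano-Elias encoding of a symbol sequence under an iid model.
// p[a] is the pmf, c[a] = sum_{b<a} p[b] the modified cmf; s holds alphabet indices.
std::string sfeEncodeIterative(const std::vector<double>& p, const std::vector<double>& c,
                               const std::vector<int>& s, bool prefixFree = false) {
    double W = 1.0, L = 0.0;                       // W_0 = 1, L_0 = 0
    for (int a : s) {                              // interval refinement, Eqs. (30)-(31)
        L += W * c[a];
        W *= p[a];
    }
    int K = static_cast<int>(std::ceil(-std::log2(W))) + (prefixFree ? 1 : 0);
    auto z = static_cast<std::uint64_t>(std::ceil(L * std::ldexp(1.0, K)));
    std::string bits(K, '0');                      // z as K bits, MSB first
    for (int i = K - 1; i >= 0; --i, z >>= 1) bits[i] = char('0' + (z & 1));
    return bits;
}
```

With p = {1/2, 1/3, 1/6}, c = {0, 1/2, 5/6}, and s = {2, 0, 1, 0, 1, 0} (the sequence "BANANA" over the alphabet {A, N, B}), the function returns "111000100", matching the example on the next slide.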

Iterative Encoding Example: IID Source

Alphabet and model:

  a   p(a)   c(a)
  A   1/2    0
  N   1/3    1/2
  B   1/6    5/6

Interval refinement (W_{n+1} = W_n p(s_n), L_{n+1} = L_n + W_n c(s_n)) for the sequence "BANANA":

  n      s_n   W_n     L_n
  init   -     1       0
  1      B     1/6     5/6
  2      A     1/12    5/6
  3      N     1/36    21/24
  4      A     1/72    21/24
  5      N     1/216   127/144
  6      A     1/432   127/144

Codeword construction:
  K = \lceil -\log_2 W \rceil = \lceil \log_2 432 \rceil = 9
  z = \lceil L 2^K \rceil = \lceil (127/144) \cdot 512 \rceil = 452
  b = "111000100"   (z = 452 with K = 9 bits)

The representative value v = 452/512 = 0.8828... lies inside the final interval [L, L+W) = [0.8819..., 0.8842...).

Iterative Decoding Algorithm

Iterative Shannon-Fano-Elias Decoding
1. Given: a bitstream b = {b_0, b_1, ..., b_{K'-1}} of K' \ge K bits and the number N of symbols to be decoded.
2. Determine the interval representative: v = (0.b_0 b_1 ... b_{K'-1})_b = z 2^{-K'}
3. Initialization of the probability interval: W_0 = 1 and L_0 = 0.
4. Iterative decoding: for each n = 0, 1, ..., N-1, do
   a. Calculate the symbol intervals: for each a \in A_n, calculate
        W_{n+1}(a) = W_n \cdot p(a | ...)    (34)
        L_{n+1}(a) = L_n + W_n \cdot c(a | ...)    (35)
   b. Compare v with the upper interval boundaries in increasing symbol order and output the symbol s_n with
        L_{n+1}(s_n) \le v < U_{n+1}(s_n) = L_{n+1}(s_n) + W_{n+1}(s_n)    (36)
   c. Update the interval boundary and interval width:
        W_{n+1} = W_{n+1}(s_n) and L_{n+1} = L_{n+1}(s_n)    (37)
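The matching decoder sketch (same assumptions and hypothetical naming as the encoder sketch above):

```cpp
#include <cmath>
#include <string>
#include <vector>

// Iterative Shannon-Fano-Elias decoding under an iid model: recover N symbols.
std::vector<int> sfeDecodeIterative(const std::vector<double>& p, const std::vector<double>& c,
                                    const std::string& bits, std::size_t N) {
    double v = 0.0;                                 // v = (0.b0 b1 ...)_b
    for (std::size_t i = 0; i < bits.size(); ++i)
        if (bits[i] == '1') v += std::ldexp(1.0, -static_cast<int>(i) - 1);
    double W = 1.0, L = 0.0;                        // W_0 = 1, L_0 = 0
    std::vector<int> s;
    for (std::size_t n = 0; n < N; ++n) {
        int a = 0;                                  // smallest a with v < L + W (c(a) + p(a))
        while (v >= L + W * (c[a] + p[a])) ++a;
        s.push_back(a);
        L += W * c[a];                              // Eqs. (34)-(37)
        W *= p[a];
    }
    return s;
}
```

Called with the pmf from above, bits = "111000100", and N = 6, it returns {2, 0, 1, 0, 1, 0}, i.e., "BANANA".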

Iterative Decoding Example: IID Source

Decode b = "111000100", i.e., v = 452/512, with N = 6 symbols (same source as in the encoding example).
Candidate intervals per step: L_{n+1}(a) = L_n + W_n c(a), W_{n+1}(a) = W_n p(a).

  n   (L_n, W_n)         A: (L, W)          N: (L, W)          B: (L, W)           s_n
  0   (0, 1)             (0, 1/2)           (1/2, 1/3)         (5/6, 1/6)          B
  1   (5/6, 1/6)         (5/6, 1/12)        (11/12, 1/18)      (35/36, 1/36)       A
  2   (5/6, 1/12)        (5/6, 1/24)        (21/24, 1/36)      (65/72, 1/72)       N
  3   (21/24, 1/36)      (21/24, 1/72)      (8/9, 1/108)       (97/108, 1/216)     A
  4   (21/24, 1/72)      (21/24, 1/144)     (127/144, 1/216)   (383/432, 1/432)    N
  5   (127/144, 1/216)   (127/144, 1/432)   (191/216, 1/648)   (287/324, 1/1296)   A

Result: v = 452/512, b = "111000100"  =>  s = "BANANA"

Arithmetic Coding

Iterative Shannon-Fano-Elias Coding
- Iterative encoding and decoding: iterative interval refinement, simple codeword construction
- But: precision requirements and delay for larger N
    Requires extremely high precision for W_n, L_n, and v
    Encoder: the complete codeword is written at the end of encoding
    Decoder: the complete codeword is read at the start of decoding

Arithmetic Coding
- Fixed-precision approximation of Shannon-Fano-Elias coding
- Represent the pmf(s) with fixed-precision integers
- Represent the interval width with fixed-precision integers
- Output bits as soon as possible

Arithmetic Coding / Quantization of Pmf and Interval Width

Quantization of PMF(s)

Fixed-Precision Approximation of Pmf(s)
- Choose a number of bits V for representing the probability masses
- Represent the probability masses p(a) by V-bit integers p_V(a):
    p(a) = p_V(a) \cdot 2^{-V}    (38)
- The resulting modified cmf c(a) can then also be represented by V-bit integers c_V(a):
    c(a) = \sum_{b<a} p(b) = \Big( \sum_{b<a} p_V(b) \Big) 2^{-V} = c_V(a) \cdot 2^{-V}    (39)

Requirements on the Pmf Approximation
Probability masses must be non-zero and the pmf must be valid:
    \forall a : p_V(a) > 0   and   \sum_a p_V(a) \le 2^V    (40)
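One simple way to obtain such a quantized pmf (a sketch; the lecture example below rounds to the nearest integer, and the repair loop is just one possible way to enforce requirement (40)):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Quantize a pmf to V-bit integers p_V(a) with p_V(a) > 0 and sum_a p_V(a) <= 2^V.
std::vector<std::uint32_t> quantizePmf(const std::vector<double>& p, int V) {
    std::vector<std::uint32_t> pV(p.size());
    std::int64_t sum = 0;
    for (std::size_t a = 0; a < p.size(); ++a) {
        auto q = static_cast<std::uint32_t>(std::lround(p[a] * (1 << V)));
        pV[a] = std::max<std::uint32_t>(1u, q);     // keep all masses non-zero
        sum += pV[a];
    }
    while (sum > (1 << V)) {                        // repair: enforce sum <= 2^V
        auto m = std::max_element(pV.begin(), pV.end());
        --(*m); --sum;
    }
    return pV;
}
```

For p = {1/2, 1/3, 1/6} and V = 4 this yields p_V = {8, 5, 3}, the values used in the coding example below.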

Quantization of Interval Width

Fixed-Precision Approximation of the Interval Width
- Represent the interval width W_n by a U-bit integer A_n and a counter z_n \ge U:
    W_n = A_n \cdot 2^{-z_n}    (41)
- Use the maximum possible precision for A_n; restrict A_n according to
    2^{U-1} \le A_n < 2^U    (42)
- Use the following initialization:
    A_0 = 2^U - 1,  z_0 = U   =>   W_0 = 1 - 2^{-U}    (43)
- Binary representation of the interval width W_n = A_n 2^{-z_n}:
    W_n = 0.[ (z_n - U) zeros ][ 1xx...x : U bits ]000...

Arithmetic Coding / Rounding in Interval Refinement

Rounding in Interval Refinement

Conventional Refinement of the Interval Width
Refinement of the interval width:
    W_{n+1} = W_n \cdot p(s_n)    (44)
    A_{n+1} \cdot 2^{-z_{n+1}} = ( A_n \cdot p_V(s_n) ) \cdot 2^{-(z_n + V)}    (45)
where A_n p_V(s_n) is a (U+V)-bit integer. In general, W_n p(s_n) cannot be represented using a U-bit integer. What can we do?

Requirement for Unique Decodability
The code remains uniquely decodable if we ensure
    0 < W_{n+1} \le W_n \cdot p(s_n)    (46)
Solution: round down W_n p(s_n) in each iteration so that W_{n+1} can be represented as A_{n+1} 2^{-z_{n+1}} with 2^{U-1} \le A_{n+1} < 2^U.

Refinement of Interval Width

Product of Interval Width and Pmf Entry
Binary representations of W_n = A_n 2^{-z_n} and p(s_n) = p_V(s_n) 2^{-V}:
    W_n    = 0.[ (z_n - U) zeros ][ 1xx...x : U bits ]000...
    p(s_n) = 0.[ xxx...x : V bits ]000...
Interval refinement W_n p(s_n) \to W_{n+1}:
    W_n p(s_n) = 0.[ (z_n - U) zeros ][ z zeros ][ 1xx...xxx...x : remaining bits of the (U+V)-bit product A_n p_V(s_n) ]000...
    W_{n+1}    = 0.[ (z_n - U) zeros ][ z zeros ][ 1xx...x : U bits = A_{n+1} ][ 00...0 ]000...
i.e., the product is rounded down by keeping only the U most significant bits starting at its leading 1; its (V - z) least significant bits are discarded.

Refinement of Interval Width

Arithmetic Operations for the Interval Width Update
- Determine the number z of leading zeros in the (U+V)-bit integer ( A_n p_V(s_n) )
- Update the interval width according to
    A_{n+1} = ( A_n \cdot p_V(s_n) ) >> (V - z)    (">>" = bit shift to the right)
    z_{n+1} = z_n + z
- This ensures unique decodability: 0 < W_{n+1} \le W_n p(s_n)
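In code, assuming U and V are small enough that the product fits into a 32-bit integer (a sketch, not the lecture's reference implementation), this update reads:

```cpp
#include <cstdint>

// One interval-width refinement step: A_{n+1} = (A_n * p_V(s_n)) >> (V - z),
// where z is the number of leading zeros of the product within U+V bits.
// Adds z to the exponent counter zn and returns the new U-bit width A_{n+1}.
std::uint32_t refineWidth(std::uint32_t A, std::uint32_t pV, int U, int V, int& zn) {
    std::uint32_t prod = A * pV;                         // (U+V)-bit integer A_n p_V(s_n)
    int z = 0;                                           // leading zeros within U+V bits
    while (((prod >> (U + V - 1 - z)) & 1u) == 0) ++z;
    zn += z;                                             // z_{n+1} = z_n + z
    return prod >> (V - z);                              // 2^{U-1} <= A_{n+1} < 2^U
}
```

In the example below (U = V = 4), the first step computes A_0 p_V(B) = 15 * 3 = 45 = 0010 1101, which has z = 2 leading zeros, so A_1 = 45 >> 2 = 11.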

Arithmetic Coding / Effect on Lower Interval Boundary

Effect on the Binary Representation of the Lower Interval Boundary

Product of Interval Width and Modified Cmf Entry
Remember: L_{n+1} = L_n + W_n c(s_n).
Binary representations of W_n = A_n 2^{-z_n} and c(s_n) = c_V(s_n) 2^{-V}:
    W_n    = 0.[ (z_n - U) zeros ][ 1xx...x : U bits ]000...
    c(s_n) = 0.[ xxx...x : V bits ]000...
Binary representation of the product W_n c(s_n):
    W_n c(s_n) = 0.[ (z_n - U) zeros ][ xxx...x : U+V bits ]000...
i.e., the update term affects at most the (U+V) bits that follow the first (z_n - U) bits of the lower interval boundary.

Effect on the Binary Representation of the Lower Interval Boundary

Update of the Lower Interval Boundary
    W_n c(s_n) = 0.[ (z_n - U) zeros ][ xxx...x : U+V bits ]000...
What is the effect on the lower interval boundary?
    L_n = 0.[ aaa...a : (z_n - c_n - U) settled bits ][ 0111...1 : c_n outstanding bits ][ xxx...x : (U+V) active bits ][ 00000 : trailing bits ]
- Trailing bits: equal to 0, but may be changed later
- Active bits: directly modified by the update L_{n+1} = L_n + W_n c(s_n)
- Outstanding bits: may be modified by a carry from the active bits
- Settled bits: not modified by any following interval update

Arithmetic Coding / Interval Refinement

Arithmetic Coding: Lower Interval Boundary & Output

Representation of the Lower Interval Boundary
    L_n = 0.[ aaa...a : (z_n - c_n - U) settled bits ][ 0111...1 : c_n outstanding bits ][ xxx...x : (U+V) active bits ][ 00000 : trailing bits ]
- Active bits: stored as a (U+V)-bit integer B_n; the intermediate value B_n + A_n c_V(s_n) requires a (U+V+1)-bit integer
- Outstanding bits: only a counter c_n is stored (the first outstanding bit is 0, the trailing c_n - 1 bits are equal to 1)
- Settled bits: output as soon as they become settled

Arithmetic Coding: Interval Refinement

Update of the Probability Interval
With B_n denoting the (U+V) active bits of L_n, one refinement step modifies the representations as follows:
    W_n     = 0.[ (z_n - U) zeros ][ A_n : U bits ]000...
    W_{n+1} = 0.[ (z_n - U) zeros ][ z zeros ][ A_{n+1} : U bits ]000...
    L_n     = 0.[ settled bits ][ outstanding bits : c_n ][ B_n : U+V bits ]00000
    L_{n+1} = 0.[ settled bits ][ (c_n + z) modified bits ][ B_{n+1} : U+V bits ]00000
Interval update:
    z       = number of leading zeros in the (U+V)-bit integer ( A_n p_V(s_n) )
    mask    = ( 1 << (U + V - z) ) - 1
    A_{n+1} = ( A_n p_V(s_n) ) >> (V - z)
    B_{n+1} = ( ( B_n + A_n c_V(s_n) ) & mask ) << z

Arithmetic Coding / Continuous Output of Bits

Arithmetic Coding: Output of Settled Bits

Output of Bits
    L_{n+1} = 0.[ aaa...a : (z_n - c_n - U) bits ][ xxx...x : (c_n + z) modified bits ][ B_{n+1} : U+V bits ]00000
- Investigate the (c_n + z) modified bits
- Update the counter c_{n+1} and output the new settled bits

Total Number of Bits for the Arithmetic Codeword
Total number of bits to output (note: W_N = A_N 2^{-z_N}):
    K = \lceil -\log_2 W_N \rceil = \lceil z_N - \log_2 A_N \rceil = z_N - U + 1    (47)
Prefix-free version: K = z_N - U + 2.
Note: z_N - U is the sum of settled and outstanding bits, so at the end
- output the c_N outstanding bits, and
- output one bit of B_N (prefix-free: two bits).

Arithmetic Coding / Codeword Termination

Termination of the Arithmetic Codeword

Required Output
Required output after the last symbol was coded:
- c_N outstanding bits
- most significant bit of B_N (prefix-free: two most significant bits)
Note: the lower boundary must be rounded up to the next multiple of 2^{-K}.

Codeword Termination
Set n = 1 (non-prefix-free) or n = 2 (prefix-free).
- If any of the last (U + V - n) bits in B_N is equal to 1, do
    B_N = B_N + ( 1 << (U + V - n) )    (rounding up)
- If B_N \ge ( 1 << (U + V) )    (carry condition)
    Invert the outstanding bits, output the new settled bits, set c_N = 1
    Remove the carry: B_N = B_N - ( 1 << (U + V) )
- Output the outstanding bits (one 0 and (c_N - 1) times 1)
- Output the n most significant bits of B_N

Arithmetic Coding / Summary of Arithmetic Encoding

Overview of the Arithmetic Encoding Process

Arithmetic Encoding
1. Initialization: A_0 = 2^U - 1, B_0 = 0, c_0 = 0
2. Iteration: for n = 0, 1, ..., N-1, do
   a. Calculate:
        A' = A_n \cdot p_V(s_n)          (U+V bits)
        B' = B_n + A_n \cdot c_V(s_n)    (U+V+1 bits)
   b. Determine the number z of leading zeros in the (U+V)-bit integer A'
   c. Update:
        mask    = ( 1 << (U + V - z) ) - 1
        A_{n+1} = A' >> (V - z)
        B_{n+1} = ( B' & mask ) << z
   d. Determine the outstanding bits counter c_{n+1} (based on c_n and B')
   e. Output the new settled bits (c_n + z - c_{n+1} bits)
3. Termination: round up B_N; output the c_N outstanding bits plus one/two most significant bit(s) of B_N
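The following sketch implements this process in C++ with one simplification relative to the slide: instead of counting outstanding bits, all output bits are buffered and a carry is propagated directly into the buffer (equivalent in result, easier to get right; U + V is assumed small enough for 32-bit arithmetic):

```cpp
#include <cstdint>
#include <vector>

// Arithmetic encoder sketch with buffered output and direct carry propagation.
struct ArithmeticEncoder {
    int U, V;
    std::uint32_t A, B;          // interval width (U bits), active bits (U+V bits)
    std::vector<char> bits;      // output bits as '0'/'1'

    ArithmeticEncoder(int u, int v) : U(u), V(v), A((1u << u) - 1), B(0) {}

    void propagateCarry() {      // add 1 to the bit string written so far
        for (std::size_t i = bits.size(); i-- > 0; ) {
            if (bits[i] == '0') { bits[i] = '1'; return; }
            bits[i] = '0';       // a 1 flips to 0, the carry continues to the left
        }
    }
    void encode(std::uint32_t pV, std::uint32_t cV) {
        std::uint32_t Ap = A * pV;                        // (U+V)-bit product
        std::uint32_t Bp = B + A * cV;                    // (U+V+1)-bit intermediate value
        if (Bp >> (U + V)) { propagateCarry(); Bp -= 1u << (U + V); }
        int z = 0;
        while (((Ap >> (U + V - 1 - z)) & 1u) == 0) ++z;  // leading zeros of A_n p_V(s_n)
        for (int i = 1; i <= z; ++i)                      // z top bits of B leave the window
            bits.push_back(((Bp >> (U + V - i)) & 1u) ? '1' : '0');
        A = Ap >> (V - z);
        B = (Bp & ((1u << (U + V - z)) - 1u)) << z;
    }
    void terminate(bool prefixFree = false) {             // rounding up + final bit(s)
        int n = prefixFree ? 2 : 1;
        if (B & ((1u << (U + V - n)) - 1u)) B += 1u << (U + V - n);
        if (B >> (U + V)) { propagateCarry(); B -= 1u << (U + V); }
        for (int i = 1; i <= n; ++i)
            bits.push_back(((B >> (U + V - i)) & 1u) ? '1' : '0');
    }
};
```

With U = V = 4 and the quantized pmf {8, 5, 3} from the example below, the call sequence encode(3,13), encode(8,0), encode(5,8), encode(8,0), encode(5,8), encode(8,0), terminate() reproduces the bitstream "110100000" of the coding example.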

Arithmetic Coding Example

Example: Preparation
Coding example:
- IID source with symbol alphabet {A, N, B}
- pmf given by {1/2, 1/3, 1/6}
- Arithmetic coding with V = 4 and U = 4
- Symbol sequence "BANANA"

Preparation: quantization of the pmf (and cmf) with V = 4 bits:

  a   p(a)   p(a) * 2^4    p_V(a)   c_V(a)
  A   1/2    16/2 = 8.00   8        0
  N   1/3    16/3 ≈ 5.33   5        8
  B   1/6    16/6 ≈ 2.67   3        13

Note: the quantized pmf fulfills the requirement \sum_a p_V(a) \le 2^V.

Arithmetic Coding Example

Example: Step 1 (s_n = B, p_V = 3, c_V = 13)

initialization:
  A_0 = 15 = 1111, c_0 = 0 (outstanding bits: none), B_0 = 0 = 0000 0000, bitstream = (empty)

updates & output:
  A_0 * p_V = 15 * 3 = 45 = 0010 1101
  B_0 + A_0 * c_V = 0 + 15 * 13 = 195 = 0 1100 0011
  z = 2
  A_1 = 1011 = 11, c_1 = 0 (outstanding bits: none), B_1 = 0000 1100 = 12
  output = "11", bitstream = "11"

Arithmetic Coding Example

Example: Step 2 (s_n = A, p_V = 8, c_V = 0)

after step 1:
  A_1 = 11 = 1011, c_1 = 0 (outstanding bits: none), B_1 = 12 = 0000 1100, bitstream = "11"

updates & output:
  A_1 * p_V = 11 * 8 = 88 = 0101 1000
  B_1 + A_1 * c_V = 12 + 11 * 0 = 12 = 0 0000 1100
  z = 1
  A_2 = 1011 = 11, c_2 = 1 (outstanding bits: "0"), B_2 = 0001 1000 = 24
  output = (none), bitstream = "11"

Arithmetic Coding Example

Example: Step 3 (s_n = N, p_V = 5, c_V = 8)

after step 2:
  A_2 = 11 = 1011, c_2 = 1 (outstanding bits: "0"), B_2 = 24 = 0001 1000, bitstream = "11"

updates & output:
  A_2 * p_V = 11 * 5 = 55 = 0011 0111
  B_2 + A_2 * c_V = 24 + 11 * 8 = 112 = 0 0111 0000
  z = 2
  A_3 = 1101 = 13, c_3 = 2 (outstanding bits: "01"), B_3 = 1100 0000 = 192
  output = "0", bitstream = "110"

Arithmetic Coding Example

Example: Step 4 (s_n = A, p_V = 8, c_V = 0)

after step 3:
  A_3 = 13 = 1101, c_3 = 2 (outstanding bits: "01"), B_3 = 192 = 1100 0000, bitstream = "110"

updates & output:
  A_3 * p_V = 13 * 8 = 104 = 0110 1000
  B_3 + A_3 * c_V = 192 + 13 * 0 = 192 = 0 1100 0000
  z = 1
  A_4 = 1101 = 13, c_4 = 3 (outstanding bits: "011"), B_4 = 1000 0000 = 128
  output = (none), bitstream = "110"

Arithmetic Coding Example

Example: Step 5 (s_n = N, p_V = 5, c_V = 8)

after step 4:
  A_4 = 13 = 1101, c_4 = 3 (outstanding bits: "011"), B_4 = 128 = 1000 0000, bitstream = "110"

updates & output:
  A_4 * p_V = 13 * 5 = 65 = 0100 0001
  B_4 + A_4 * c_V = 128 + 13 * 8 = 232 = 0 1110 1000
  z = 1
  A_5 = 1000 = 8, c_5 = 4 (outstanding bits: "0111"), B_5 = 1101 0000 = 208
  output = (none), bitstream = "110"

Arithmetic Coding Example

Example: Step 6 (s_n = A, p_V = 8, c_V = 0)

after step 5:
  A_5 = 8 = 1000, c_5 = 4 (outstanding bits: "0111"), B_5 = 208 = 1101 0000, bitstream = "110"

updates & output:
  A_5 * p_V = 8 * 8 = 64 = 0100 0000
  B_5 + A_5 * c_V = 208 + 8 * 0 = 208 = 0 1101 0000
  z = 1
  A_6 = 1000 = 8, c_6 = 5 (outstanding bits: "01111"), B_6 = 1010 0000 = 160
  output = (none), bitstream = "110"

Arithmetic Coding Example

Example: Codeword Termination

after step 6:
  A_6 = 8 = 1000, c_6 = 5 (outstanding bits: "01111"), B_6 = 160 = 1010 0000, bitstream = "110"

final rounding (some of the last U + V - 1 bits of B_6 are equal to 1):
  B' = 1 0010 0000    (rounding up B_6; carry condition fulfilled)
  carry: invert the outstanding bits and output the new settled bits
    bitstream = "1101 000"    (c_6 - 1 inverted bits appended)
    c = 1 (outstanding bits: "0"), B' = 0010 0000    (carry removed)

termination:
  output the outstanding bits and the most significant bit of B'
  final bitstream = "1101 0000 0"    (c + 1 bits added)

- Bitstream b = "1101 0000 0" (for the sequence s = "BANANA")
- Same number of bits (K = 9) as for Shannon-Fano-Elias coding

Arithmetic Coding / Arithmetic Decoding

Decoding With Finite Precision

Identification of Intervals
- Important: the same rounding of the interval width as in the encoder (A_n \to A_{n+1})
- The arithmetic codeword b_0 b_1 b_2 b_3 b_4 b_5 b_6 ... represents the binary fraction
    v = (0.b_0 b_1 b_2 b_3 b_4 b_5 b_6 ...)_b
- Iterative decoding: output the symbol s_n which fulfills the inequality
    L_n + W_n c(s_n) \le v < L_n + W_n c(s_n) + W_n p(s_n)
- Observation: the lower interval boundary L_n cannot be represented with reasonable precision
- Idea: subtract L_n from the inequality; the symbol s_n is then identified by
    W_n c(s_n) \le v - L_n < W_n c(s_n) + W_n p(s_n)
- The value u_n = v - L_n used in the comparisons can be stored with (U + V) bits, but needs to be updated after a symbol s_n is decoded

Arithmetic Decoding: Binary Representations

Analyse the Binary Representations
    W_n             = 0.[ (z_n - U) zeros ][ 1xx...x : U bits ]00000...
    W_n (c_V + p_V) = 0.[ (z_n - U) zeros ][ xxx...x : U+V bits ]00000...
    W_n c_V         = 0.[ (z_n - U) zeros ][ xxx...x : U+V bits ]00000...
    v - L_n         = 0.[ (z_n - U) zeros ][ xxx...x : U+V bits ]xxxxx...
- Use a (U+V)-bit integer u_n in the comparisons (down-rounded value of v - L_n)
- Initialization: u_0 = (first U+V bits from the bitstream)
- Update u_n \to u_{n+1}:
    Subtract the lower boundary: u' = u_n - A_n c_V(s_n)
    Align with the interval width: u' = u' << z    (z = leading zeros in A_n p_V(s_n))
    Fill the z least significant bits with the next z bits from the bitstream

Overview of the Arithmetic Decoding Process

Arithmetic Decoding
1. Initialization: A_0 = 2^U - 1, u_0 = (first U+V bits from the bitstream)
2. Iteration: for n = 0, 1, ..., N-1, do
   a. Identify the next symbol: for k = 0, 1, ..., do (loop over the alphabet)
        Calculate the upper boundary U(a_k) = A_n ( c_V(a_k) + p_V(a_k) )
        If u_n < U(a_k): output the next symbol s_n = a_k and break the loop over k
   b. Update the parameters:
        Calculate the intermediate value A' = A_n \cdot p_V(s_n)
        Determine the number z of leading zeros in the (U+V)-bit integer A'
        A_{n+1} = A' >> (V - z)
        u_{n+1} = ( (u_n - A_n c_V(s_n)) << z ) + (next z bits from the bitstream)
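A matching decoder sketch (same integer layout and hypothetical naming as the encoder sketch above; bits past the end of the bitstream are read as 0, as noted on the last example slide):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Arithmetic decoder sketch. Assumes a valid bitstream produced with the same U, V
// and quantized pmf/cmf tables as the encoder.
struct ArithmeticDecoder {
    int U, V;
    std::uint32_t A, u;                 // interval width, down-rounded value of v - L_n
    const std::string& bs;
    std::size_t pos = 0;

    ArithmeticDecoder(int U_, int V_, const std::string& b)
        : U(U_), V(V_), A((1u << U_) - 1), u(0), bs(b) {
        for (int i = 0; i < U + V; ++i) u = (u << 1) | nextBit();  // first U+V bits
    }
    std::uint32_t nextBit() { return pos < bs.size() && bs[pos++] == '1' ? 1u : 0u; }

    int decode(const std::vector<std::uint32_t>& pV, const std::vector<std::uint32_t>& cV) {
        int a = 0;                      // identify symbol: u_n < A_n (c_V(a) + p_V(a))
        while (u >= A * (cV[a] + pV[a])) ++a;
        std::uint32_t Ap = A * pV[a];
        u -= A * cV[a];                 // subtract the lower boundary
        int z = 0;
        while (((Ap >> (U + V - 1 - z)) & 1u) == 0) ++z;
        A = Ap >> (V - z);
        for (int i = 0; i < z; ++i) u = (u << 1) | nextBit();  // refill z bits
        return a;
    }
};
```

For the bitstream "110100000" with U = V = 4, p_V = {8, 5, 3}, c_V = {0, 8, 13}, six decode calls return 2, 0, 1, 0, 1, 0, i.e., "BANANA", reproducing the steps of the decoding example below.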

Arithmetic Decoding Example

Decode the Bitstream Obtained in the Encoding Example
- IID source with symbol alphabet {A, N, B}, pmf {1/2, 1/3, 1/6}
- Arithmetic coding with V = 4 and U = 4
- Quantized pmf (and cmf) with V = 4 bits:

  a   p(a)   p(a) * 2^4    p_V(a)   c_V(a)
  A   1/2    16/2 = 8.00   8        0
  N   1/3    16/3 ≈ 5.33   5        8
  B   1/6    16/6 ≈ 2.67   3        13

- Bitstream b = "1101 0000 0"

Arithmetic Decoding Example

Example: Step 1
bitstream = "1101 0000 0 (000 0000 0 ...)"    (bits past the end are read as 0)

initialization: A_0 = 15 = 1111, u_0 = 208 = 1101 0000

symbol identification:
  U(A) = 15 * (0 + 8)  = 120    U(A) \le u_0
  U(N) = 15 * (8 + 5)  = 195    U(N) \le u_0
  U(B) = 15 * (13 + 3) = 240    U(B) > u_0   =>  output B

update:
  A' = 15 * 3 = 45 = 0010 1101
  u' = 208 - 15 * 13 = 13 = 0000 1101
  z = 2
  A_1 = 1011 = 11, u_1 = 0011 0100 = 52

Arithmetic Decoding Example

Example: Step 2
after step 1: A_1 = 11 = 1011, u_1 = 52 = 0011 0100

symbol identification:
  U(A) = 11 * (0 + 8) = 88    U(A) > u_1   =>  output A
  (U(N) = 11 * (8 + 5) = 143 and U(B) = 11 * (13 + 3) = 176 need not be evaluated)

update:
  A' = 11 * 8 = 88 = 0101 1000
  u' = 52 - 11 * 0 = 52 = 0011 0100
  z = 1
  A_2 = 1011 = 11, u_2 = 0110 1000 = 104

Arithmetic Decoding Example

Example: Step 3
after step 2: A_2 = 11 = 1011, u_2 = 104 = 0110 1000

symbol identification:
  U(A) = 11 * (0 + 8) = 88     U(A) \le u_2
  U(N) = 11 * (8 + 5) = 143    U(N) > u_2   =>  output N

update:
  A' = 11 * 5 = 55 = 0011 0111
  u' = 104 - 11 * 8 = 16 = 0001 0000
  z = 2
  A_3 = 1101 = 13, u_3 = 0100 0000 = 64

Arithmetic Decoding Example

Example: Step 4
after step 3: A_3 = 13 = 1101, u_3 = 64 = 0100 0000

symbol identification:
  U(A) = 13 * (0 + 8) = 104    U(A) > u_3   =>  output A

update:
  A' = 13 * 8 = 104 = 0110 1000
  u' = 64 - 13 * 0 = 64 = 0100 0000
  z = 1
  A_4 = 1101 = 13, u_4 = 1000 0000 = 128

Arithmetic Decoding Example

Example: Step 5
after step 4: A_4 = 13 = 1101, u_4 = 128 = 1000 0000

symbol identification:
  U(A) = 13 * (0 + 8) = 104    U(A) \le u_4
  U(N) = 13 * (8 + 5) = 169    U(N) > u_4   =>  output N

update:
  A' = 13 * 5 = 65 = 0100 0001
  u' = 128 - 13 * 8 = 24 = 0001 1000
  z = 1
  A_5 = 1000 = 8, u_5 = 0011 0000 = 48

Arithmetic Decoding Example

Example: Step 6 (last symbol)
after step 5: A_5 = 8 = 1000, u_5 = 48 = 0011 0000

symbol identification:
  U(A) = 8 * (0 + 8) = 64    U(A) > u_5   =>  output A

Result: bitstream "1101 0000 0"  =>  symbol sequence "BANANA"
Note: some bits after the end of the bitstream were required:
- for the non-prefix-free variant, use bits equal to 0;
- for the prefix-free variant, any bit values (0 or 1) can be used.

Arithmetic Coding / Efficiency

Efficiency of Arithmetic Coding

Increase in Codeword Length relative to Shannon-Fano-Elias Coding
Excess rate due to the rounding of the interval width:
    \Delta\ell = \lceil -\log_2 W_N \rceil - \lceil -\log_2 p(s) \rceil < 1 + \log_2 \frac{p(s)}{W_N}    (48)
Upper bound for the increase in codeword length per symbol relative to infinite-precision Shannon-Fano-Elias coding:
    \Delta\bar{\ell} < \frac{1}{N} + \log_2( 1 + 2^{1-U} ) - \log_2( 1 - 2^{-V} / p_{min} )    (49)
(for a derivation, see Wiegand, Schwarz, pages 51-52)

Example: number of coded symbols N = 1000, arithmetic precision V = 16 and U = 12, minimum probability mass p_min = 0.02:
the increase in codeword length is less than 0.003 bit per symbol.
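Plugging the example numbers into (49) makes the size of the three contributions explicit:

```latex
\Delta\bar{\ell} \;<\; \frac{1}{1000} \;+\; \log_2\!\bigl(1 + 2^{-11}\bigr) \;-\; \log_2\!\Bigl(1 - \tfrac{2^{-16}}{0.02}\Bigr)
\;\approx\; 0.00100 + 0.00070 + 0.00110 \;\approx\; 0.0028 \text{ bit per symbol.}
```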

Arithmetic Coding / Arithmetic Coding in Practice

Complexity Reduction: Binary Arithmetic Coding

Binary Arithmetic Coding
- Most popular type of arithmetic coding: JPEG 2000, H.264, H.265
- Binarization of S \in {a_0, a_1, ..., a_{M-1}} produces bins C \in {0, 1}
- Any prefix code can be used for binarization
- Example: truncated unary binarization

  S_n       number of bins B   C_0  C_1  C_2  ...  C_{M-2}
  a_0       1                  1
  a_1       2                  0    1
  ...       ...                ...
  a_{M-2}   M-1                0    0    0    0    1
  a_{M-1}   M-1                0    0    0    0    0

- The entropy is unchanged by the binarization S \to C:
    H(S) = E\{ -\log_2 p(S) \} = E\{ -\log_2 p(C) \} = H(C)
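A truncated unary binarizer is essentially a one-liner; a sketch (symbols are alphabet indices):

```cpp
#include <string>

// Truncated unary binarization: index k in {0, ..., M-1} maps to k zeros
// followed by a terminating 1; the last symbol (k = M-1) omits the 1.
std::string truncatedUnary(int k, int M) {
    std::string bins(k, '0');
    if (k < M - 1) bins += '1';
    return bins;
}
```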

Practical Arithmetic Coding

Complexity Reduction
- Binary arithmetic coding
- Multiplication-free implementations
- Bypass mode: low-complexity coding of bins with p = 0.5

Practical Design Aspects
1. Context selection
   - Use reasonable context variables X = f(S_{n-1}, S_{n-2}, ...) for switching between probability tables p(a | X)
   - Use context switching only when useful (certain bins)
2. Estimate probabilities during coding
   - Choose appropriate window sizes for the estimation
3. Suitably combine context selection and probability estimation

Comparison of Lossless Coding Techniques

Experimental Comparison of Lossless Coding Techniques

Example: Markov Source
Stationary Markov source given by the conditional pmf:

  a     p(a | a_0)   p(a | a_1)   p(a | a_2)
  a_0   0.90         0.15         0.05
  a_1   0.05         0.80         0.05
  a_2   0.05         0.05         0.60

Marginal entropy: H(S) = 1.2575; entropy rate: \bar{H}(S) = 0.7331.

Bounds for lossless coding:
- entropy rate \bar{H}(S) for coding infinitely many symbols
- instantaneous entropy rate H_inst(S, L) for coding L symbols:
    H_inst(S, L) = (1/L) H(S_0, S_1, ..., S_{L-1})    (50)

Coding experiment:
- coding of 1 000 000 realizations of the example stationary Markov source
- calculate the average codeword length for sequences of 1 to 1000 symbols

Experimental Results

[Figure: average codeword length per symbol (0.0 to 2.5) versus the number of coded symbols (1 to 1000, logarithmic scale) for: scalar Huffman code (3 codewords); conditional Huffman code (3 x 3 codewords); Huffman code for fixed-length vectors (5 symbols, 243 codewords); Huffman code for variable-length vectors (17 codewords); arithmetic coding (16 bits of precision for interval sizes and probabilities); compared with the instantaneous entropy rate and the entropy rate.]

Part Summary

- Uniquely decodable codes & bounds for lossless coding
    Kraft inequality, prefix codes
    Scalar entropy, conditional entropy, block entropy
    Entropy rate, instantaneous entropy rate
- Variable-length codes for scalars and vectors
    Optimal code for a given pmf: Huffman code
    Scalar and conditional codes: inefficient for pmfs with p(a) > 0.5
    Block codes and V2V codes: code tables can become extremely large
    Difficult adaptation to instationary sources
- Arithmetic coding
    No codeword table: iterative construction of the codeword
    Close to the entropy bound for N \gg 1
    Well suited for exploiting statistical dependencies
    Well suited for adapting probabilities during coding

Exercises

Exercise 1: Implement an Arithmetic Encoder/Decoder
Implement an arithmetic encoder and decoder (the one discussed in the lecture):
- Use 64-bit integer arithmetic for all operations.
- Start with a 4-symbol alphabet {a, b, c, d} and a fixed pmf {6/16, 5/16, 3/16, 2/16} for verifying the implementation.
- Measure the average codeword length (bits per symbol) for long symbol sequences and compare it to the entropy rate.
- Extend the implementation to general file types. Use a fixed pmf (p_k = 1/256) for the bytes of a file.

For preparing the implementation, think about the following:
- How do we determine the newly settled bits and the new number of outstanding bits c_{n+1} during encoding (based on c_n and B')?

Suggestion: use the framework (written in C++) on the web-site. It provides file input/output and file comparison, and implements classes for reading and writing bitstreams (bit by bit). In ac/test you'll find examples for the 4-symbol pmf ("test1mioabcd.txt") and general text files ("Goethe.txt").

Exercise 2: Arithmetic Coding with Adaptive Pmfs
Extend the implemented arithmetic codec by backward-adaptive pmf estimation and the usage of conditional pmfs:
- Use the iid model and estimate the pmf during encoding/decoding.
- Try to improve the performance by using a Markov model (estimation of conditional pmfs).
- Try to further improve the performance by using the two preceding symbols as condition (2nd-order Markov model).
- Test the different probability models (iid with fixed pmf, iid with adaptive pmf, Markov with adaptive pmfs, and 2nd-order Markov with adaptive pmfs) with the test file "Goethe.txt" provided in ac/test.

For preparing the implementation, think about the following:
- How can we estimate a rounded pmf (with V bits of precision) during coding?