Summary of Last Lectures

Stephany Lamb
5 years ago
1 Lossless Coding IV
[Title figure: code table with columns $a_k$, $p_k$, $b_k$ for the symbols a to i, and the corresponding binary code tree]

2 H. Schwarz (FU Berlin) Image and Video Coding: Lossless Coding IV 2 / 64
Last Lectures
Summary of Last Lectures
Unique Decodability / Prefix Codes
- There are no better uniquely decodable codes than the best prefix codes
- Only need to consider prefix codes (also instantaneously decodable)
Variable-Length Codes
- Scalar codes: $\bar{\ell} \ge H(S)$
- Conditional codes: $\bar{\ell} \ge H(S_n \mid S_{n-1})$ with $H(S_n \mid S_{n-1}) \le H(S)$
- Block codes: $\bar{\ell} \ge H_N(S)/N$ with $H_N(S)/N \le H_{N-1}(S)/(N-1)$
- V2V codes: Code for variable-length symbol sequences
- Optimal code construction: Huffman algorithm for corresponding pmf
Fundamental Lossless Source Coding Theorem
- For all lossless coding techniques: $\bar{\ell} \ge \bar{H}(S) = \lim_{N \to \infty} H_N(S)/N$ (entropy rate: greatest lower bound)
Variable-Length Codes for Non-Stationary or Unknown Sources
- Adapt code during encoding/decoding (forward/backward adaptation)

3 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Coding: Review
Shannon-Fano-Elias Coding: Special Block Code of Size N
- Order of N-symbol sequences $\{s_k\}$ is known by encoder and decoder
- N-th order pmf $p(s_k) = P(S = s_k)$ is known by encoder and decoder
- On-the-fly encoding and decoding (no codeword table)
Basic Concept
- Unique mapping of N-symbol sequences to intervals $I_k$ of the cdf $F(s)$
- Half-open intervals $I_k = [L_k, U_k) = [L_k, L_k + W_k)$ are characterized by
  - Interval width: $W_k = F(s_k) - F(s_{k-1}) = p(s_k)$
  - Lower interval boundary: $L_k = F(s_{k-1}) = \sum_{i<k} p(s_i)$
- Codewords: Fractional bits of a representative value $v_k$ inside interval $I_k$
  - Length of codeword: $K = \lceil -\log_2 W_k \rceil$
  - Representative value: $v_k = \lceil L_k \cdot 2^K \rceil \cdot 2^{-K} = z_k \cdot 2^{-K}$
  - Codeword: Binary representation of $z_k = \lceil L_k \cdot 2^K \rceil$ with $K$ bits

4 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Encoding: Illustration
[Figure: cdf $F(s)$ over the ordered messages $s_0, \dots, s_{k-1}, s_k, s_{k+1}, \dots$ with the interval $I(s_k) = [L, L+W)$, where $W = p(s_k)$ and $L = \sum_{i<k} p(s_i)$. Number of bits: $K = \lceil -\log_2 W \rceil$. Integer part: $z = \lceil L \cdot 2^K \rceil$. Representative value: $v = \lceil L \cdot 2^K \rceil \cdot 2^{-K}$. Codeword $b(s_k)$: binary representation of $z$ with $K$ bits.]

5 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Encoding: Summary
Determination of Codewords
Given: Ordered set of symbol sequences $\{s_k\}$ with associated pmf $\{p_k\}$
Construct codeword $b_k = b(s_k)$ for any particular sequence $s_k$ by
1 Determine interval width $W_k$ and lower interval boundary $L_k$
  $W_k = p_k$ (1)
  $L_k = \sum_{i<k} p_i$ (2)
2 Determine codeword length $K_k$
  $K_k = \lceil -\log_2 W_k \rceil$ (3)
3 Determine representative integer $z_k$
  $z_k = \lceil L_k \cdot 2^{K_k} \rceil$ (4)
4 Codeword $b_k$: Binary representation of $z_k$ with $K_k$ bits

6 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Decoding: Illustration
[Figure: cdf $F(s)$ over the ordered messages with the upper interval boundaries $U_k = \sum_{i \le k} p(s_i)$. Read codeword $b$: binary representation of $z$ with $K$ bits. Representative value: $v = z \cdot 2^{-K}$. Decoding process: compare $v$ with the upper interval boundaries $U = L + W$ in increasing order ($U_0 \le v$, ..., $U_{k-1} \le v$, $U_k > v$). Decoded message: $s_k$.]

7 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Decoding: Summary
Decoding of a Symbol Sequence
Given: Ordered set of symbol sequences $\{s_k\}$ with associated pmf $\{p_k\}$
1 Read codeword $b$: Codeword $b$ has $K$ bits and represents the binary value of the integer $z$
2 Determine the representative value $v$ according to
  $v = z \cdot 2^{-K}$ (5)
3 Initialization: Set index $k = 0$ and upper interval boundary $U_0 = p_0$
4 Compare $v$ with $U_k$
  - If $v < U_k$: Output decoded symbol sequence $s_k$ and terminate decoding
  - Otherwise ($v \ge U_k$): Set $k = k + 1$ and $U_k = U_{k-1} + p_k$, then go to step 4
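The decoding steps above can be sketched in Python as follows. This is a minimal illustration, not part of the lecture: the function name `sfe_decode`, the list-of-pairs representation of the ordered pmf, and the three-entry toy pmf are assumptions, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def sfe_decode(codeword, ordered_pmf):
    """Decode one Shannon-Fano-Elias codeword (hypothetical helper).

    codeword    -- string of '0'/'1' characters (the K bits of z)
    ordered_pmf -- list of (sequence, probability) pairs in the agreed order
    """
    # Step 2: representative value v = z * 2^(-K)
    v = Fraction(int(codeword, 2), 2 ** len(codeword))
    upper = Fraction(0)
    for sequence, p in ordered_pmf:
        upper += p              # U_k = U_{k-1} + p_k
        if v < upper:           # step 4: first upper boundary above v
            return sequence
    raise ValueError("v does not lie inside [0, 1)")


# Toy pmf, chosen for illustration only
toy_pmf = [("x", Fraction(1, 2)), ("y", Fraction(1, 4)), ("z", Fraction(1, 4))]
```

With this toy pmf the codewords produced by the encoding rules are "0", "10" and "11", and `sfe_decode("10", toy_pmf)` walks past the first upper boundary 1/2 and stops at 3/4.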

8 Shannon-Fano-Elias Codes / Prefix-free Variant
Example for a Shannon-Fano-Elias Code
Blocks of Three Symbols for a Binary IID Source
- Binary iid source with alphabet $A = \{a, b\}$ and pmf $p = \{0.8, 0.2\}$
- [Table: one row per block $s_k \in \{aaa, aab, aba, abb, baa, bab, bba, bbb\}$ with columns joint pmf $p_k$, interval ($W_k$, $L_k$), number of bits $K_k$, integer $z_k$, and codeword $b_k$]
- Table entries follow from $W_k = p_k$, $L_k = \sum_{i<k} p_i$, $K_k = \lceil -\log_2 W_k \rceil$, $z_k = \lceil L_k \cdot 2^{K_k} \rceil$, and $b_k$: $z_k$ with $K_k$ bits
- The average codeword length is larger than that of the block Huffman code for the same block size ($N = 3$)
- Code is not prefix-free! Can be a problem (depends on application)!

9 Shannon-Fano-Elias Codes / Prefix-free Variant
Why Is The Code Not Prefix-Free?
Effect of Codeword Concatenation
[Figure: number line with the grid points $(i-2) \cdot 2^{-K}$, $(i-1) \cdot 2^{-K}$, $i \cdot 2^{-K}$, $(i+1) \cdot 2^{-K}$, $(i+2) \cdot 2^{-K}$, the interval $I$, a value $v \in I$, and a value $v' \notin I$]
- Encoder transmits codeword $b = \{b_0, b_1, b_2, \dots, b_{K-1}\}$ of $K$ bits, signaling the binary fraction $v \in I$
  $v = (0.b_0 b_1 b_2 \cdots b_{K-1})_b$
- Decoder sees a modified binary fraction $v'$ given by
  $v' = (0.b_0 b_1 b_2 \cdots b_{K-1} b_K b_{K+1} b_{K+2} \cdots)_b$
  where $\{b_K, b_{K+1}, b_{K+2}, \dots\}$ are the bits of the following codewords
- Depending on the location of $v$ inside the interval $I$ and the values of the following bits, $v'$ can lie outside the interval $I$

10 Shannon-Fano-Elias Codes / Prefix-free Variant
Prefix-Free Shannon-Fano-Elias Code
Prefix Code Construction
- Ensure that the binary fraction $v'$ (seen by the decoder) lies inside the interval
- Worst case: All following bits are equal to 1
  $v' = v + \sum_{i=K+1}^{\infty} 2^{-i} < L + W$ (6)
- Since the sum in the above equation is bounded by $2^{-K}$ (and stays below it for any finite number of following bits), we require
  $v' < v + 2^{-K} \le L + W$ (7)
- Question: How many bits $K$ do we need for representing $v$ according to $v = \lceil L \cdot 2^K \rceil \cdot 2^{-K}$?
- Have to choose $K$ so that the following inequality is fulfilled
  $v + 2^{-K} = \lceil L \cdot 2^K \rceil \cdot 2^{-K} + 2^{-K} \le L + W$ (8)

11 Shannon-Fano-Elias Codes / Prefix-free Variant
Prefix-Free Shannon-Fano-Elias Code
Prefix Code Construction (continued)
- Since $\lceil x \rceil < x + 1$, the inequality $\lceil L \cdot 2^K \rceil \cdot 2^{-K} + 2^{-K} \le L + W$ is always fulfilled if we have
  $(L \cdot 2^K + 1) \cdot 2^{-K} + 2^{-K} \le L + W$
  $L + 2^{1-K} \le L + W$
  $2^{1-K} \le W$
  $1 - K \le \log_2 W$
- Unique decodability is guaranteed if we choose
  $K \ge 1 - \log_2 W$ (9)
  $K = \lceil 1 - \log_2 W \rceil = \lceil -\log_2 W \rceil + 1$ (10)
- One additional bit per codeword (compared to non-prefix-free version)

12 Shannon-Fano-Elias Codes / Prefix-free Variant
Example for a Prefix-Free Shannon-Fano-Elias Code
Repeated Example: Blocks of Three Symbols for a Binary IID Source
- Binary iid source with alphabet $A = \{a, b\}$ and pmf $p = \{0.8, 0.2\}$
- [Table: one row per block $s_k \in \{aaa, aab, aba, abb, baa, bab, bba, bbb\}$ with columns joint pmf $p_k$, interval ($W_k$, $L_k$), number of bits $K_k$, integer $z_k$, and codeword $b_k$]
- Table entries follow from $W_k = p_k$, $L_k = \sum_{i<k} p_i$, $K_k = 1 + \lceil -\log_2 W_k \rceil$, $z_k = \lceil L_k \cdot 2^{K_k} \rceil$, and $b_k$: $z_k$ with $K_k$ bits
- Additional bit ensures that the code becomes a prefix code
- Worse than block Huffman code (several redundant bits)
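The two example tables (non-prefix-free and prefix-free variant) can be recomputed with a short Python sketch. The helper names `sfe_block_code`, `average_length` and `is_prefix_free` are hypothetical, and the code simply applies the formulas listed on the slides with exact fractions.

```python
from fractions import Fraction
from itertools import product
from math import ceil


def sfe_block_code(ordered_pmf, prefix_free=False):
    """Map each sequence of an ordered pmf to its Shannon-Fano-Elias codeword."""
    code = {}
    lower = Fraction(0)                          # L_k = sum of p_i for i < k
    for sequence, width in ordered_pmf:          # W_k = p_k
        length = 0
        while Fraction(1, 2 ** length) > width:  # K_k = ceil(-log2 W_k)
            length += 1
        if prefix_free:
            length += 1                          # one additional bit
        z = ceil(lower * 2 ** length)            # z_k = ceil(L_k * 2^K_k)
        code[sequence] = format(z, "b").zfill(length)
        lower += width
    return code


# Binary iid source with p(a) = 0.8 and p(b) = 0.2, blocks of three symbols
p = {"a": Fraction(4, 5), "b": Fraction(1, 5)}
blocks = [("".join(t), p[t[0]] * p[t[1]] * p[t[2]])
          for t in product("ab", repeat=3)]      # lexicographic order


def average_length(code):
    """Average codeword length per block of three symbols."""
    return sum(prob * len(code[seq]) for seq, prob in blocks)


def is_prefix_free(code):
    words = list(code.values())
    return not any(x != y and y.startswith(x) for x in words for y in words)


plain = sfe_block_code(blocks)
prefix = sfe_block_code(blocks, prefix_free=True)
```

In the plain variant the codeword for `baa` is a prefix of the codeword for `bab`, which is exactly the problem the additional bit removes; `average_length` can be used to compare both variants against a block Huffman code.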

13 Shannon-Fano-Elias Codes / Bounds on Average Codeword Length
Efficiency of Shannon-Fano-Elias Codes
Average Codeword Length
- Average codeword length $\bar{\ell}$ per symbol
  $\bar{\ell} = \frac{E\{K(S)\}}{N} = \frac{E\{A + \lceil -\log_2 p_N(S) \rceil\}}{N}$ with $A = 1$ for the prefix-free version and $A = 0$ otherwise (11)
Bounds on Average Codeword Length
- Using the inequalities $\lceil x \rceil \ge x$ and $\lceil x \rceil < x + 1$, we obtain
  $\frac{E\{-\log_2 p_N(S)\}}{N} + \frac{A}{N} \le \bar{\ell} < \frac{E\{1 - \log_2 p_N(S)\}}{N} + \frac{A}{N}$
  $\frac{H_N(S)}{N} + \frac{A}{N} \le \bar{\ell} < \frac{H_N(S)}{N} + \frac{A + 1}{N}$ (12)
- Non-prefix-free version ($A = 0$): Same bounds as for block Huffman coding
- Both versions: Close to entropy rate for $N \gg 1$ (for typical sources)

14 Shannon-Fano-Elias Codes / Iterative Coding
Shannon-Fano-Elias Coding: Intermediate Results
Shannon-Fano-Elias Code
- Special block code (for given number of symbols $N$)
- Worse than block Huffman code of same size $N$
- Still close to entropy bound $H_N/N$ for $N \gg 1$
- No need to store codeword table!
- Have to store N-th order pmf (or N-th order cdf)! What is the advantage?
Iterative Coding
- Can define a suitable order for sequences of $N$ symbols
- Probability intervals are nested
- Iterative calculation of interval boundaries
- Iterative codeword construction

15 Shannon-Fano-Elias Codes / Iterative Coding
Sorting of Symbol Sequences
Lexicographical Order
- Define any order for single symbols $s_k$ (sorted symbol alphabet)
- Consider two sequences of $N$ symbols $s^a = \{s_0^a, s_1^a, \dots, s_{N-1}^a\}$ and $s^b = \{s_0^b, s_1^b, \dots, s_{N-1}^b\}$
- Lexicographical order for sequences of $N$ symbols
  $s^a < s^b \iff \exists\, n < N : \left( \forall\, k < n : s_k^a = s_k^b \right) \wedge \left( s_n^a < s_n^b \right)$ (13)
Example: Alphabetical Order
- $a < b < \cdots < z$ implies the sequence order ..., aaaa..., aaab..., ..., aaaz..., aaba..., aabb..., ...
- [Figure: nested cumulative probabilities $P(s^{(3)} < aaa)$, $P(s^{(3)} < aab)$, $P(s^{(3)} < aaz)$, $P(s^{(3)} < aba)$ inside $P(s^{(2)} < aa)$ and $P(s^{(2)} < ab)$]

16 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement: Illustration
Example: 3-symbol alphabet $A = \{a, b, c\}$ with $c(z \mid \cdots) = \sum_{x<z} p(x \mid \cdots)$
[Figure: interval $I(abb\ldots)$ with lower boundary $L(abb)$ and width $W(abb)$, subdivided into]
- $I(abba\ldots)$: $W(abba) = W(abb) \cdot p(a \mid abb)$, $L(abba) = L(abb) + W(abb) \cdot c(a \mid abb)$
- $I(abbb\ldots)$: $W(abbb) = W(abb) \cdot p(b \mid abb)$, $L(abbb) = L(abb) + W(abb) \cdot c(b \mid abb)$
- $I(abbc\ldots)$: $W(abbc) = W(abb) \cdot p(c \mid abb)$, $L(abbc) = L(abb) + W(abb) \cdot c(c \mid abb)$
Important: Probability intervals are nested!

17 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement: Interval Width
Interval Refinement
- Consider sub-sequences $s^{(n)} = \{s_0, s_1, \dots, s_{n-1}\}$ with $n < N$
- Determine interval $I_{n+1} = [L_{n+1}, L_{n+1} + W_{n+1})$ for $s^{(n+1)} = \{s^{(n)}, s_n\}$ based on interval $I_n = [L_n, L_n + W_n)$ for the prefix sequence $s^{(n)}$
Refinement of Interval Width
- Interval width $W_{n+1}$ for sub-sequence $s^{(n+1)} = \{s^{(n)}, s_n\}$
  $W_{n+1} = P(S^{(n+1)} = s^{(n+1)}) = P(S^{(n)} = s^{(n)}, S_n = s_n) = P(S^{(n)} = s^{(n)}) \cdot P(S_n = s_n \mid S^{(n)} = s^{(n)})$ (14)
- Iteration rule for interval width
  $W_{n+1} = W_n \cdot p(s_n \mid s_{n-1}, s_{n-2}, \dots, s_0)$ (15)

18 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement: Lower Interval Boundary
Refinement of Lower Interval Boundary
- Lower interval boundary $L_{n+1}$ for sub-sequence $s^{(n+1)} = \{s^{(n)}, s_n\}$
  $L_{n+1} = P(S^{(n+1)} < s^{(n+1)})$
  $\quad = P(S^{(n)} < s^{(n)}) + P(S^{(n)} = s^{(n)}, S_n < s_n)$
  $\quad = P(S^{(n)} < s^{(n)}) + P(S^{(n)} = s^{(n)}) \cdot P(S_n < s_n \mid S^{(n)} = s^{(n)})$ (16)
- Define modified cmf $c(\cdot)$, which excludes the current symbol
  $c(s_n \mid s_{n-1}, \dots, s_0) = P(S_n < s_n \mid S^{(n)} = s^{(n)}) = \sum_{a < s_n} p(a \mid s_{n-1}, \dots, s_0)$ (17)
- Iteration rule for lower interval boundary
  $L_{n+1} = L_n + W_n \cdot c(s_n \mid s_{n-1}, s_{n-2}, \dots, s_0)$ (18)

19 Shannon-Fano-Elias Codes / Iterative Coding
Nested Probability Intervals: Verification
Intervals are Nested
- Lower interval boundary
  $L_{n+1} = L_n + W_n \cdot P(S_n < s_n \mid S^{(n)} = s^{(n)}) \ge L_n$ (19)
- Upper interval boundary
  $L_{n+1} + W_{n+1} = L_n + W_n \cdot P(S_n < s_n \mid S^{(n)} = s^{(n)}) + W_n \cdot P(S_n = s_n \mid S^{(n)} = s^{(n)})$
  $\quad = L_n + W_n \cdot P(S_n \le s_n \mid S^{(n)} = s^{(n)})$
  $\quad = L_n + W_n - W_n \cdot P(S_n > s_n \mid S^{(n)} = s^{(n)}) \le L_n + W_n$ (20)
- Interval for $s^{(n+1)} = \{s^{(n)}, s_n\}$ is fully included in interval for $s^{(n)}$

20 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement
Iterative Algorithm for Calculation of Interval Boundaries
- Initialization:
  $W_0 = 1$ (21)
  $L_0 = 0$ (22)
- Iteration step:
  $W_{n+1} = W_n \cdot p(s_n \mid s_{n-1}, \dots, s_0)$ (23)
  $L_{n+1} = L_n + W_n \cdot c(s_n \mid s_{n-1}, \dots, s_0)$ (24)
Advantage of Interval Refinement?
- Derived iteration approach: Conditional pmf $p(s_n \mid s_{n-1}, \dots)$ instead of joint pmf $p(s_0, \dots, s_{N-1})$
- Need to store the same amount of data
- But: Conditional pmfs can be well approximated using simple models
  - IID model: $p(s_n \mid s_{n-1}, \dots, s_0) = p(s_n)$
  - Markov model: $p(s_n \mid s_{n-1}, \dots, s_0) = p(s_n \mid s_{n-1})$

21 Shannon-Fano-Elias Codes / Iterative Coding
Practical Iterative Interval Refinement
Simplified Iteration
- IID model:
  $W_{n+1} = W_n \cdot p(s_n)$ (25)
  $L_{n+1} = L_n + W_n \cdot c(s_n)$ (26)
- Markov model:
  $W_{n+1} = W_n \cdot p(s_n \mid s_{n-1})$ (27)
  $L_{n+1} = L_n + W_n \cdot c(s_n \mid s_{n-1})$ (28)
- Many other simple models possible: Condition $= f(s_{n-1}, \dots)$
Other Aspects
- Switching between symbol alphabets possible
  - Suitable for complicated syntax (as for prefix codes)
- Adaptation of probability models
  - Probabilities can be estimated during encoding and decoding
  - Elegant way to deal with non-stationary or unknown sources

22 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Encoding Algorithm
Iterative Shannon-Fano-Elias Encoding
1 Given is a sequence $s = \{s_0, s_1, s_2, \dots, s_{N-1}\}$ of $N$ symbols
2 Initialization of probability interval
  $W_0 = 1$ and $L_0 = 0$ (29)
3 Determine probability interval: For each $n = 0, 1, \dots, N-1$, do
  $W_{n+1} = W_n \cdot p(s_n \mid \cdots)$ (30)
  $L_{n+1} = L_n + W_n \cdot c(s_n \mid \cdots)$ (31)
4 Determine codeword length and codeword value
  $K = \lceil -\log_2 W_N \rceil$ (for prefix code: $K \leftarrow K + 1$) (32)
  $z = \lceil L_N \cdot 2^K \rceil$ (33)
5 Transmit codeword $b(s)$: Binary representation of $z$ with $K$ bits
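The five encoding steps can be sketched in Python for the IID model. The function name `sfe_encode_iid` and the dictionary representation of the pmf (insertion order defines the symbol order) are assumptions made for this illustration; exact fractions stand in for the infinite-precision arithmetic that the algorithm presumes.

```python
from fractions import Fraction
from math import ceil


def sfe_encode_iid(symbols, pmf, prefix_free=False):
    """Iterative Shannon-Fano-Elias encoding for an IID model (sketch)."""
    # Modified cmf c(a): sum of the probabilities of all symbols before a
    cmf, running = {}, Fraction(0)
    for a, p in pmf.items():
        cmf[a] = running
        running += p

    width, lower = Fraction(1), Fraction(0)      # W_0 = 1, L_0 = 0
    for s in symbols:
        lower = lower + width * cmf[s]           # L_{n+1} = L_n + W_n c(s_n)
        width = width * pmf[s]                   # W_{n+1} = W_n p(s_n)

    length = 0
    while Fraction(1, 2 ** length) > width:      # K = ceil(-log2 W_N)
        length += 1
    if prefix_free:
        length += 1
    z = ceil(lower * 2 ** length)                # z = ceil(L_N 2^K)
    return format(z, "b").zfill(length)


# Symbol order A < N < B with pmf {1/2, 1/3, 1/6}
banana_pmf = {"A": Fraction(1, 2), "N": Fraction(1, 3), "B": Fraction(1, 6)}
codeword = sfe_encode_iid("BANANA", banana_pmf)
```

For the sequence "BANANA" this yields a 9-bit codeword whose integer value is 452, matching the iterative encoding example of the lecture.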

23 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Encoding Example: IID Source
[Figure: nested intervals $[L, L+W)$ over the alphabet {A, N, B} for the symbol sequence B, A, N, A, N, A, together with a table of $p(a)$, $c(a)$ and of $W_n$, $L_n$ per step, starting from the initialization $W_0 = 1$, $L_0 = 0$ and using $W_{n+1} = W_n \cdot p(s_n)$, $L_{n+1} = L_n + W_n \cdot c(s_n)$]
- $K = \lceil -\log_2 W \rceil = 9$
- $z = \lceil L \cdot 2^K \rceil = 452$
- $b$ = "111000100" ($z = 452$ with $K = 9$ bits), $v = 452 \cdot 2^{-9}$

24 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Decoding Algorithm
Iterative Shannon-Fano-Elias Decoding
1 Given: Bitstream $b = \{b_0, b_1, \dots, b_{K-1}\}$ of $K$ bits and number $N$ of symbols to be decoded
2 Determine interval representative: $v = (0.b_0 b_1 \cdots b_{K-1})_b = z \cdot 2^{-K}$
3 Initialization of probability interval: $W_0 = 1$ and $L_0 = 0$
4 Iterative decoding: For each $n = 0, 1, \dots, N-1$, do
  a Calculate symbol intervals: For each $a \in A_n$, calculate
    $W_{n+1}(a) = W_n \cdot p(a \mid \cdots)$ (34)
    $L_{n+1}(a) = L_n + W_n \cdot c(a \mid \cdots)$ (35)
  b Compare $v$ with the upper interval boundaries in increasing symbol order and output the symbol $s_n$ with
    $L_{n+1}(s_n) \le v < U_{n+1}(s_n) = L_{n+1}(s_n) + W_{n+1}(s_n)$ (36)
  c Update interval boundary and interval width
    $W_{n+1} = W_{n+1}(s_n)$ and $L_{n+1} = L_{n+1}(s_n)$ (37)

25 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Decoding Example: IID Source
[Figure: nested intervals over the alphabet {A, N, B} with $L_{n+1}(a) = L_n + W_n \cdot c(a)$ and $W_{n+1}(a) = W_n \cdot p(a)$, together with a table listing per step the current interval $(L_n, W_n)$, starting at $(0, 1)$ and followed by $(5/6, 1/6)$, the candidate intervals $(L_{n+1}, W_{n+1})$ for A, N and B, and the decoded symbol $s_n$: B, A, N, A, N, A]
- The value $v$ given by the bitstream $b$ yields the decoded sequence $s$ = "BANANA"
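The iterative decoding algorithm can be sketched in Python in the same style. The function name `sfe_decode_iid` and the dictionary-based pmf are assumptions; the sketch assumes an IID model and exact fractions, and the input codeword is the 9-bit representation of $z = 452$ from the encoding example.

```python
from fractions import Fraction


def sfe_decode_iid(bits, n_symbols, pmf):
    """Iterative Shannon-Fano-Elias decoding for an IID model (sketch)."""
    v = Fraction(int(bits, 2), 2 ** len(bits))    # v = z * 2^(-K)
    lower, width = Fraction(0), Fraction(1)       # L_0 = 0, W_0 = 1
    decoded = []
    for _ in range(n_symbols):
        cumulative = Fraction(0)                  # modified cmf c(a)
        for symbol, p in pmf.items():             # increasing symbol order
            sym_lower = lower + width * cumulative
            sym_width = width * p
            if v < sym_lower + sym_width:         # first upper boundary above v
                decoded.append(symbol)
                lower, width = sym_lower, sym_width
                break
            cumulative += p
        else:
            raise ValueError("v lies outside the current interval")
    return "".join(decoded)


banana_pmf = {"A": Fraction(1, 2), "N": Fraction(1, 3), "B": Fraction(1, 6)}
decoded = sfe_decode_iid("111000100", 6, banana_pmf)
```

After the first symbol the interval is $(L, W) = (5/6, 1/6)$, as in the table of the decoding example.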

26 Arithmetic Coding
Arithmetic Coding
Iterative Shannon-Fano-Elias Coding
- Iterative encoding and decoding
  - Iterative interval refinement
  - Simple codeword construction
- Precision requirements and delay for larger $N$
  - Require extremely high precision for $W_n$, $L_n$, and $v$
  - Encoder: Complete codeword written at end of encoding
  - Decoder: Complete codeword read at start of decoding
Arithmetic Coding
- Fixed-precision approximation of Shannon-Fano-Elias coding
  - Represent pmf(s) with fixed-precision integers
  - Represent interval width with fixed-precision integers
  - Output bits as soon as possible

27 Arithmetic Coding / Quantization of Pmf and Interval Width
Quantization of Pmf(s)
Fixed-Precision Approximation of Pmf(s)
- Choose number of bits $V$ for representing probability masses
- Represent probability masses $p(a)$ by $V$-bit integers $p_V(a)$
  $p(a) = p_V(a) \cdot 2^{-V}$ (38)
- Resulting modified cmf $c(a)$ can also be represented by $V$-bit integers $c_V(a)$
  $c(a) = \sum_{b<a} p(b) = \left( \sum_{b<a} p_V(b) \right) \cdot 2^{-V} = c_V(a) \cdot 2^{-V}$ (39)
Requirements on Pmf Approximation
- Probability masses must be non-zero and the pmf must be valid
  $\forall a : p_V(a) > 0$ and $\sum_a p_V(a) \le 2^V$ (40)

28 Arithmetic Coding / Quantization of Pmf and Interval Width
Quantization of Interval Width
Fixed-Precision Approximation of Interval Width
- Represent interval width $W_n$ by a $U$-bit integer $A_n$ and a counter $z_n \ge U$
  $W_n = A_n \cdot 2^{-z_n}$ (41)
- Use maximum possible precision for $A_n$: Restrict $A_n$ according to
  $2^{U-1} \le A_n < 2^U$ (42)
- Use the following initialization
  $A_0 = 2^U - 1$, $z_0 = U \implies W_0 = 1 - 2^{-U}$ (43)
Binary representation of interval width $W_n = A_n \cdot 2^{-z_n}$
- $W_n = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}}\underbrace{1xx\cdots x}_{U \text{ bits}}000\cdots$ (the first two groups together span $z_n$ bits)

29 Arithmetic Coding / Rounding in Interval Refinement
Rounding in Interval Refinement
Conventional Refinement of Interval Width
- Refinement of interval width
  $W_{n+1} = W_n \cdot p(s_n)$ (44)
  $A_{n+1} \cdot 2^{-z_{n+1}} = \left( A_n \cdot p_V(s_n) \right) \cdot 2^{-(z_n + V)}$, where $A_n \cdot p_V(s_n)$ is a $(U+V)$-bit integer (45)
- In general: $W_n \cdot p(s_n)$ cannot be represented using a $U$-bit integer. What can we do?
Requirement for Unique Decodability
- Code remains uniquely decodable if we ensure
  $0 < W_{n+1} \le W_n \cdot p(s_n)$ (46)
- Solution: Rounding down of $W_n \cdot p(s_n)$ in each iteration so that $W_{n+1}$ can be represented using $A_{n+1} \cdot 2^{-z_{n+1}}$ with $2^{U-1} \le A_{n+1} < 2^U$

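30 Arithmetic Coding / Rounding in Interval Refinement
Refinement of Interval Width
Product of Interval Width and Pmf Entry
- Binary representations of $W_n = A_n \cdot 2^{-z_n}$ and $p(s_n) = p_V(s_n) \cdot 2^{-V}$
  $W_n = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}}\underbrace{1xx\cdots x}_{U \text{ bits}}000\cdots$
  $p(s_n) = 0.\underbrace{xxx\cdots x}_{V \text{ bits}}000\cdots$
- Interval refinement $W_n \cdot p(s_n) \to W_{n+1}$
  $W_n \cdot p(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}}\underbrace{\overbrace{00\cdots0}^{\Delta z \text{ bits}}\overbrace{1x\cdots x}^{U \text{ bits}}\overbrace{xx\cdots x}^{V - \Delta z \text{ bits}}}_{U+V \text{ bits: } A_n \cdot p_V(s_n)}000\cdots$
  $W_{n+1} = 0.\underbrace{00\cdots0\;00\cdots0}_{z_{n+1} - U \text{ bits}}\underbrace{1x\cdots x}_{A_{n+1}\ (U \text{ bits})}000\cdots$
  where $\Delta z$ is the number of leading zeros of $A_n \cdot p_V(s_n)$ and $z_{n+1} - U = (z_n - U) + \Delta z$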

31 Arithmetic Coding / Rounding in Interval Refinement
Refinement of Interval Width
Arithmetic Operations for Interval Width Update
- Binary representations
  $W_n \cdot p(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}}\underbrace{\overbrace{00\cdots0}^{\Delta z \text{ bits}}\overbrace{1x\cdots x}^{U \text{ bits}}\overbrace{xx\cdots x}^{V - \Delta z \text{ bits}}}_{U+V \text{ bits: } A_n \cdot p_V(s_n)}000\cdots$
  $W_{n+1} = 0.\underbrace{00\cdots0\;00\cdots0}_{z_{n+1} - U \text{ bits}}\underbrace{1x\cdots x}_{A_{n+1}\ (U \text{ bits})}000\cdots$
Update of interval width
- Determine number $\Delta z$ of leading zeros in the $(U+V)$-bit integer $A_n \cdot p_V(s_n)$
- Update interval width according to
  $A_{n+1} = \left( A_n \cdot p_V(s_n) \right) \gg (V - \Delta z)$ ($\gg$ = bit shift to the right)
  $z_{n+1} = z_n + \Delta z$
- Ensures unique decodability: $0 < W_{n+1} \le W_n \cdot p(s_n)$
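The width update amounts to one integer multiplication, a leading-zero count and a right shift. A minimal Python sketch, with the hypothetical function name `refine_width`, could look as follows; the leading zeros are obtained from `int.bit_length()`.

```python
def refine_width(a_n, p_v, u_bits, v_bits):
    """Return (A_{n+1}, delta_z) for the fixed-precision width update (sketch).

    a_n    -- current U-bit width integer with 2^(U-1) <= a_n < 2^U
    p_v    -- V-bit integer probability mass of the coded symbol (p_v > 0)
    """
    product = a_n * p_v                                   # (U+V)-bit integer
    delta_z = (u_bits + v_bits) - product.bit_length()    # leading zeros
    a_next = product >> (v_bits - delta_z)                # rounding down
    return a_next, delta_z


# U = V = 4: width 15 ('1111') times probability mass 3
a_next, delta_z = refine_width(15, 3, 4, 4)
```

Because the shift discards the lowest $V - \Delta z$ bits, the new width never exceeds the exact product, which is the rounding-down requirement stated above.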

32 Arithmetic Coding / Effect on Lower Interval Boundary
Effect on Binary Representation of Lower Interval Boundary
Product of Interval Width and Modified Cmf Entry
- Remember: $L_{n+1} = L_n + W_n \cdot c(s_n)$
- Binary representations of $W_n = A_n \cdot 2^{-z_n}$ and $c(s_n) = c_V(s_n) \cdot 2^{-V}$
  $W_n = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}}\underbrace{1xx\cdots x}_{U \text{ bits}}000\cdots$
  $c(s_n) = 0.\underbrace{xxx\cdots x}_{V \text{ bits}}000\cdots$
- Binary representation of the product $W_n \cdot c(s_n)$
  $W_n \cdot c(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}}\underbrace{xxx\cdots x}_{U+V \text{ bits}}000\cdots$ (the first two groups together span $z_n + V$ bits)

33 Arithmetic Coding / Effect on Lower Interval Boundary
Effect on Binary Representation of Lower Interval Boundary
Update of Lower Interval Boundary
- $W_n \cdot c(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}}\underbrace{xxx\cdots x}_{U+V \text{ bits}}000\cdots$
- What is the effect on the lower interval boundary?
  $L_n = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U \text{ settled bits}}\underbrace{011\cdots1}_{c_n \text{ outstanding bits}}\underbrace{xxxxx\cdots x}_{U+V \text{ active bits}}\underbrace{000\cdots}_{\text{trailing bits}}$
- Trailing bits: Equal to 0, but may be changed later
- Active bits: Directly modified by the update $L_{n+1} = L_n + W_n \cdot c(s_n)$
- Outstanding bits: May be modified by a carry from the active bits
- Settled bits: Not modified in any following interval update

34 Arithmetic Coding / Interval Refinement
Arithmetic Coding: Lower Interval Boundary & Output
Representation of Lower Interval Boundary
- $L_n = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U \text{ settled bits}}\underbrace{011\cdots1}_{c_n \text{ outstanding bits}}\underbrace{xxxxx\cdots x}_{U+V \text{ active bits}}\underbrace{000\cdots}_{\text{trailing bits}}$
- Active bits: $(U+V)$-bit integer $B_n$
  - Intermediate value $B_n + A_n \cdot c_V(s_n)$ requires a $(U+V+1)$-bit integer
- Outstanding bits: Counter $c_n$ (a single 0 followed by $c_n - 1$ bits equal to 1)
- Settled bits: Output as soon as they become settled

35 Arithmetic Coding / Interval Refinement
Arithmetic Coding: Interval Refinement
Update of Probability Interval
- $W_n = 0.\underbrace{00\cdots0}_{z_n - U}\underbrace{1xx\cdots x}_{A_n\ (U)}$
- $L_n = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U}\underbrace{011\cdots1}_{c_n}\underbrace{xxx\cdots xxxxxx}_{B_n\ (U+V)}$
- $W_{n+1} = 0.\underbrace{00\cdots0}_{z_n - U}\underbrace{0\cdots0}_{\Delta z}\underbrace{1xx\cdots x}_{A_{n+1}\ (U)}$
- $L_{n+1} = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U}\underbrace{xxxxxxxx\cdots xxx}_{c_n + \Delta z}\underbrace{xxx\cdots xxxxxx}_{B_{n+1}\ (U+V)}$
Interval update
- $\Delta z$ = number of leading zeros in the $(U+V)$-bit integer $A_n \cdot p_V(s_n)$
- $\text{mask} = \left( 1 \ll (U + V - \Delta z) \right) - 1$
- $A_{n+1} = \left( A_n \cdot p_V(s_n) \right) \gg (V - \Delta z)$
- $B_{n+1} = \left( \left( B_n + A_n \cdot c_V(s_n) \right) \,\&\, \text{mask} \right) \ll \Delta z$

36 Arithmetic Coding / Continuous Output of Bits
Arithmetic Coding: Output of Settled Bits
Output of Bits
- $L_{n+1} = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U}\underbrace{xxxxxxxx\cdots xxx}_{c_n + \Delta z}\underbrace{xxx\cdots xxxxxx}_{B_{n+1}\ (U+V)}$
- Investigate the modified $(c_n + \Delta z)$ bits
- Update counter $c_{n+1}$ and output new settled bits
Total Number of Bits for Arithmetic Codeword
- Total number of bits to output (note: $W_N = A_N \cdot 2^{-z_N}$)
  $K = \lceil -\log_2 W_N \rceil = z_N - \lfloor \log_2 A_N \rfloor = z_N - U + 1$ (47)
- Prefix-free version: $K = z_N - U + 2$
- Note: $z_N - U$ is the sum of settled and outstanding bits
  - Output $c_N$ outstanding bits
  - Output one bit of $B_N$ (prefix-free: two bits)

37 Arithmetic Coding / Codeword Termination
Termination of Arithmetic Codeword
Required Output
- Required output after the last symbol was coded
  - $c_N$ outstanding bits
  - most significant bit of $B_N$ (prefix-free: two most significant bits)
- Note: Lower boundary must be rounded up to the next multiple of $2^{-K}$
Codeword Termination
- Set $n = 1$ (non-prefix-free) or $n = 2$ (prefix-free)
- If any of the last $(U + V - n)$ bits in $B_N$ is equal to 1, do
  - Set $B_N = B_N + \left( 1 \ll (U + V - n) \right)$ (rounding up)
  - If $B_N \ge \left( 1 \ll (U + V) \right)$ (carry condition)
    - Invert outstanding bits, output new settled bits, set $c_N = 1$
    - Remove carry: $B_N = B_N - \left( 1 \ll (U + V) \right)$
- Output outstanding bits (one 0 and $(c_N - 1)$ times 1)
- Output the $n$ most significant bits of $B_N$

38 Arithmetic Coding / Summary of Arithmetic Encoding
Overview of Arithmetic Encoding Process
Arithmetic Encoding
1 Initialization: $A_0 = 2^U - 1$, $B_0 = 0$, $c_0 = 0$
2 Iteration: For $n = 0, 1, \dots, N-1$, do
  a Calculate:
    $A^* = A_n \cdot p_V(s_n)$ ($U + V$ bits)
    $B^* = B_n + A_n \cdot c_V(s_n)$ ($U + V + 1$ bits)
  b Determine number $\Delta z$ of leading zeros in the $(U+V)$-bit integer $A^*$
  c Update:
    $\text{mask} = \left( 1 \ll (U + V - \Delta z) \right) - 1$
    $A_{n+1} = A^* \gg (V - \Delta z)$
    $B_{n+1} = \left( B^* \,\&\, \text{mask} \right) \ll \Delta z$
  d Determine outstanding bits counter $c_{n+1}$ (based on $c_n$ and $B^*$)
  e Output new settled bits ($c_n + \Delta z - c_{n+1}$ bits)
3 Termination:
  - Round up $B_N$
  - Output $c_N$ outstanding bits + one/two most significant bit(s) of $B_N$

39 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Preparation
Coding example
- IID source with symbol alphabet {A, N, B}
- Pmf is given by {1/2, 1/3, 1/6}
- Consider arithmetic coding with $V = 4$ and $U = 4$
- Symbol sequence BANANA
Preparation: Quantization of pmf (and cmf) with $V = 4$ bits
  a | p(a) | p(a) · 2^4    | p_V(a) | c_V(a)
  A | 1/2  | 16/2 = 8      | 8      | 0
  N | 1/3  | 16/3 ≈ 5.33   | 5      | 8
  B | 1/6  | 16/6 ≈ 2.67   | 3      | 13
Note: Quantized pmf $p_V(a)$ fulfills the requirement $\sum_a p_V(a) \le 2^V$
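One possible way to obtain such integer tables is sketched below. The lecture only states the requirements on the quantized pmf, so the rounding strategy here (round to nearest, clamp to at least 1, then decrement the largest entries until the sum fits) is an assumption, as is the function name `quantize_pmf`. The sketch presumes the alphabet has at most $2^V$ symbols.

```python
from fractions import Fraction


def quantize_pmf(pmf, v_bits):
    """Quantize a pmf to V-bit integers p_V and derive the modified cmf c_V.

    Assumed strategy (not prescribed by the lecture): round to nearest,
    keep every mass non-zero, and shrink the largest masses if the sum
    would exceed 2^V.
    """
    scale = 1 << v_bits
    p_v = {a: max(1, round(p * scale)) for a, p in pmf.items()}
    while sum(p_v.values()) > scale:
        largest = max(p_v, key=p_v.get)
        p_v[largest] -= 1
    c_v, running = {}, 0
    for a in pmf:                      # symbol order = insertion order
        c_v[a] = running               # c_V(a) = sum of p_V(b) for b < a
        running += p_v[a]
    return p_v, c_v


pmf = {"A": Fraction(1, 2), "N": Fraction(1, 3), "B": Fraction(1, 6)}
p_v, c_v = quantize_pmf(pmf, 4)
```

For the example source this reproduces the integer masses 8, 5, 3 and the cumulative values 0, 8, 13 used in the following steps.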

40 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Step 1
- Initialization: $A_0 = 15$ = 1111, $c_0 = 0$ (no outstanding bits), $B_0 = 0$ = 00000000, bitstream = (empty)
- Symbol $s_n$ = B with $p_V = 3$, $c_V = 13$
  - $A_0 \cdot p_V = 15 \cdot 3 = 45$ = 00101101
  - $B_0 + A_0 \cdot c_V = 0 + 15 \cdot 13 = 195$ = 11000011
  - $\Delta z = 2$
  - $A_1$ = 1011 = 11
  - $c_1 = 0$ (no outstanding bits)
  - $B_1$ = 00001100 = 12
  - output = 11
  - bitstream = 11

41 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Step 2
- After step 1: $A_1 = 11$ = 1011, $c_1 = 0$ (no outstanding bits), $B_1 = 12$ = 00001100, bitstream = 11
- Symbol $s_n$ = A with $p_V = 8$, $c_V = 0$
  - $A_1 \cdot p_V = 11 \cdot 8 = 88$ = 01011000
  - $B_1 + A_1 \cdot c_V = 12 + 11 \cdot 0 = 12$ = 00001100
  - $\Delta z = 1$
  - $A_2$ = 1011 = 11
  - $c_2 = 1$ (outstanding bits: 0)
  - $B_2$ = 00011000 = 24
  - output = (none)
  - bitstream = 11

42 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Step 3
- After step 2: $A_2 = 11$ = 1011, $c_2 = 1$ (outstanding bits: 0), $B_2 = 24$ = 00011000, bitstream = 11
- Symbol $s_n$ = N with $p_V = 5$, $c_V = 8$
  - $A_2 \cdot p_V = 11 \cdot 5 = 55$ = 00110111
  - $B_2 + A_2 \cdot c_V = 24 + 11 \cdot 8 = 112$ = 01110000
  - $\Delta z = 2$
  - $A_3$ = 1101 = 13
  - $c_3 = 2$ (outstanding bits: 01)
  - $B_3$ = 11000000 = 192
  - output = 0
  - bitstream = 110

43 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Step 4
- After step 3: $A_3 = 13$ = 1101, $c_3 = 2$ (outstanding bits: 01), $B_3 = 192$ = 11000000, bitstream = 110
- Symbol $s_n$ = A with $p_V = 8$, $c_V = 0$
  - $A_3 \cdot p_V = 13 \cdot 8 = 104$ = 01101000
  - $B_3 + A_3 \cdot c_V = 192 + 13 \cdot 0 = 192$ = 11000000
  - $\Delta z = 1$
  - $A_4$ = 1101 = 13
  - $c_4 = 3$ (outstanding bits: 011)
  - $B_4$ = 10000000 = 128
  - output = (none)
  - bitstream = 110

44 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Step 5
- After step 4: $A_4 = 13$ = 1101, $c_4 = 3$ (outstanding bits: 011), $B_4 = 128$ = 10000000, bitstream = 110
- Symbol $s_n$ = N with $p_V = 5$, $c_V = 8$
  - $A_4 \cdot p_V = 13 \cdot 5 = 65$ = 01000001
  - $B_4 + A_4 \cdot c_V = 128 + 13 \cdot 8 = 232$ = 11101000
  - $\Delta z = 1$
  - $A_5$ = 1000 = 8
  - $c_5 = 4$ (outstanding bits: 0111)
  - $B_5$ = 11010000 = 208
  - output = (none)
  - bitstream = 110

45 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Step 6
- After step 5: $A_5 = 8$ = 1000, $c_5 = 4$ (outstanding bits: 0111), $B_5 = 208$ = 11010000, bitstream = 110
- Symbol $s_n$ = A with $p_V = 8$, $c_V = 0$
  - $A_5 \cdot p_V = 8 \cdot 8 = 64$ = 01000000
  - $B_5 + A_5 \cdot c_V = 208 + 8 \cdot 0 = 208$ = 11010000
  - $\Delta z = 1$
  - $A_6$ = 1000 = 8
  - $c_6 = 5$ (outstanding bits: 01111)
  - $B_6$ = 10100000 = 160
  - output = (none)
  - bitstream = 110

46 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example
Example: Codeword Termination
- After step 6: $A_6 = 8$ = 1000, $c_6 = 5$ (outstanding bits: 01111), $B_6 = 160$ = 10100000, bitstream = 110
- Final rounding: $B = 160 + 128 = 288$ = 1 00100000 (rounding up $B_6$, with carry)
  - bitstream = 1101000 ($c_6 - 1$ inverted bits)
  - $c = 1$ (outstanding bits: 0)
  - $B$ = 00100000 = 32
- Termination: final bitstream = 110100000 ($c + 1$ bits added)
Bitstream $b$ = "110100000" (for sequence $s$ = "BANANA")
- Same number of bits ($K = 9$) as for Shannon-Fano-Elias coding
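The complete encoding example can be reproduced with the Python sketch below. The function name `arithmetic_encode` is hypothetical. The lecture leaves the exact bookkeeping of the outstanding-bit counter open; this sketch uses one possible variant in which a carry settles all outstanding bits at once (the lecture keeps the last inverted bit outstanding instead), which produces the same final bitstream.

```python
def arithmetic_encode(symbols, p_v, c_v, u_bits, v_bits, prefix_free=False):
    """Fixed-precision arithmetic encoder (illustrative sketch)."""
    total_bits = u_bits + v_bits
    a, b, c = (1 << u_bits) - 1, 0, 0   # A_0 = 2^U - 1, B_0 = 0, c_0 = 0
    out = []

    def flush():                        # outstanding "0 1...1" becomes settled
        nonlocal c
        if c:
            out.append("0" + "1" * (c - 1))
            c = 0

    def carry():                        # "0 1...1" + carry = "1 0...0"
        nonlocal c
        assert c > 0, "a carry requires outstanding bits"
        out.append("1" + "0" * (c - 1))
        c = 0

    def push(bit):                      # bit shifted out of the active window
        nonlocal c
        if bit == 0:
            flush()
            c = 1                       # a new 0 may still absorb a carry
        elif c == 0:
            out.append("1")             # no 0 in front: cannot change anymore
        else:
            c += 1

    for s in symbols:
        a_star = a * p_v[s]                         # U+V bits
        b_star = b + a * c_v[s]                     # U+V+1 bits
        dz = total_bits - a_star.bit_length()       # leading zeros of A*
        if b_star >> total_bits:                    # carry out of active bits
            carry()
            b_star -= 1 << total_bits
        for i in range(dz):                         # top dz bits leave window
            push((b_star >> (total_bits - 1 - i)) & 1)
        mask = (1 << (total_bits - dz)) - 1
        a = a_star >> (v_bits - dz)
        b = (b_star & mask) << dz

    n = 2 if prefix_free else 1                     # termination
    if b & ((1 << (total_bits - n)) - 1):           # round up B_N
        b += 1 << (total_bits - n)
        if b >> total_bits:
            carry()
            b -= 1 << total_bits
    flush()
    out.append(format(b >> (total_bits - n), "b").zfill(n))
    return "".join(out)


P_V = {"A": 8, "N": 5, "B": 3}
C_V = {"A": 0, "N": 8, "B": 13}
bitstream = arithmetic_encode("BANANA", P_V, C_V, 4, 4)
```

The intermediate values of `a` and `b` follow the six example steps (11, 11, 13, 13, 8, 8 and 12, 24, 192, 128, 208, 160), and the number of output bits equals $z_N - U + n$.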

47 Arithmetic Coding / Arithmetic Decoding
Decoding With Finite Precision
Identification of Intervals
- Important: Same rounding of interval width as in encoder ($A_n \to A_{n+1}$)
- Arithmetic codeword $b_0 b_1 b_2 b_3 b_4 b_5 b_6 \cdots$ represents the binary fraction $v = (0.b_0 b_1 b_2 b_3 b_4 b_5 b_6 \cdots)_b$
- Iterative decoding: Output the symbol $s_n$ which fulfills the inequality
  $L_n + W_n \cdot c(s_n) \le v < L_n + W_n \cdot c(s_n) + W_n \cdot p(s_n)$
- Observation: Lower interval boundary $L_n$ cannot be represented with reasonable precision
- Idea: Subtract $L_n$ from the inequality
  - Symbol $s_n$ is identified by
    $W_n \cdot c(s_n) \le v - L_n < W_n \cdot c(s_n) + W_n \cdot p(s_n)$
  - The value $u_n = v - L_n$ used in the comparisons can be stored with $(U + V)$ bits, but needs to be updated after a symbol $s_n$ is decoded

48 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding: Binary Representations
Analyse Binary Representations
- $W_n = 0.\underbrace{00\cdots0}_{z_n - U}\underbrace{1xx\cdots x}_{U}$
- $W_n \cdot (c_V + p_V) = 0.\underbrace{00\cdots0}_{z_n - U}\underbrace{xxx\cdots xxxxxx}_{U+V}$
- $v - L_n = 0.\underbrace{00\cdots0}_{z_n - U}\underbrace{xxx\cdots xxxxxx}_{U+V}xxxxxxxxx\cdots$
- $W_n \cdot c_V = 0.\underbrace{00\cdots0}_{z_n - U}\underbrace{xxx\cdots xxxxxx}_{U+V}$
Use $(U+V)$-bit integer $u_n$ in comparisons (down-rounded value of $v - L_n$)
- Initialization: $u_0$ = (first $U+V$ bits from bitstream)
- Update $u_n \to u_{n+1}$
  - Subtract lower boundary: $u_n^* = u_n - A_n \cdot c_V(s_n)$
  - Align with interval width: $u_n^{**} = u_n^* \ll \Delta z$ ($\Delta z$: leading zeros in $A_n \cdot p_V(s_n)$)
  - Fill the $\Delta z$ least significant bits with the next $\Delta z$ bits from the bitstream

49 Arithmetic Coding / Arithmetic Decoding
Overview of Arithmetic Decoding Process
Arithmetic Decoding
1 Initialization: $A_0 = 2^U - 1$, $u_0$ = (first $U+V$ bits from bitstream)
2 Iteration: For $n = 0, 1, \dots, N-1$, do
  a Identify next symbol: For $k = 0, 1, \dots$, do (loop over alphabet)
    - Calculate upper boundary $U(a_k) = A_n \cdot \left( c_V(a_k) + p_V(a_k) \right)$
    - If $u_n < U(a_k)$, then output next symbol $s_n = a_k$ and break the loop over $k$
  b Update parameters:
    - Calculate intermediate values: $A^* = A_n \cdot p_V(s_n)$ and $u^* = u_n - A_n \cdot c_V(s_n)$
    - Determine number $\Delta z$ of leading zeros in the $(U+V)$-bit integer $A^*$
    - $A_{n+1} = A^* \gg (V - \Delta z)$
    - $u_{n+1} = (u^* \ll \Delta z)$ + (next $\Delta z$ bits from bitstream)
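A matching decoder sketch in Python is given below. The function name `arithmetic_decode` is hypothetical; bits requested beyond the end of the bitstream are read as 0, which corresponds to the non-prefix-free variant, and the number of symbols is assumed to be known to the decoder.

```python
def arithmetic_decode(bits, n_symbols, p_v, c_v, u_bits, v_bits):
    """Fixed-precision arithmetic decoder (illustrative sketch)."""
    total_bits = u_bits + v_bits
    stream = iter(bits)

    def read(count):
        """Read `count` bits; bits after the end of the stream count as 0."""
        value = 0
        for _ in range(count):
            value = (value << 1) | int(next(stream, "0"))
        return value

    a = (1 << u_bits) - 1               # A_0 = 2^U - 1
    u = read(total_bits)                # u_0 = first U+V bits
    decoded = []
    for _ in range(n_symbols):
        for symbol in p_v:              # loop over alphabet in symbol order
            if u < a * (c_v[symbol] + p_v[symbol]):   # u_n < U(a_k)
                break
        else:
            raise ValueError("u lies outside the current interval")
        decoded.append(symbol)
        a_star = a * p_v[symbol]
        dz = total_bits - a_star.bit_length()         # leading zeros of A*
        u = ((u - a * c_v[symbol]) << dz) + read(dz)  # subtract, align, refill
        a = a_star >> (v_bits - dz)
    return "".join(decoded)


P_V = {"A": 8, "N": 5, "B": 3}
C_V = {"A": 0, "N": 8, "B": 13}
decoded = arithmetic_decode("110100000", 6, P_V, C_V, 4, 4)
```

The values of `u` pass through 208, 52, 104, 64, 128 and 48, which are the comparison values of the decoding example.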

50 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example
Decode Bitstream Obtained in Encoding Example
Coding example (see encoding example)
- IID source with symbol alphabet {A, N, B}
- Pmf is given by {1/2, 1/3, 1/6}
- Arithmetic coding with $V = 4$ and $U = 4$
Quantized pmf (and cmf) with $V = 4$ bits
  a | p(a) | p(a) · 2^4    | p_V(a) | c_V(a)
  A | 1/2  | 16/2 = 8      | 8      | 0
  N | 1/3  | 16/3 ≈ 5.33   | 5      | 8
  B | 1/6  | 16/6 ≈ 2.67   | 3      | 13
Bitstream $b$ = 110100000

51 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example
Example: Step 1
- Initialization: bitstream = 110100000, $A_0 = 15$ = 1111, $u_0 = 208$ = 11010000
- Symbol identification
  - A ($c_V = 0$, $p_V = 8$): $U(A) = 15 \cdot (0 + 8) = 120$, $U(A) \le u_0$
  - N ($c_V = 8$, $p_V = 5$): $U(N) = 15 \cdot (8 + 5) = 195$, $U(N) \le u_0$
  - B ($c_V = 13$, $p_V = 3$): $U(B) = 15 \cdot (13 + 3) = 240$, $U(B) > u_0$, output B
- Update
  - $A^* = 15 \cdot 3 = 45$ = 00101101
  - $u^* = 208 - 195 = 13$ = 00001101
  - $\Delta z = 2$
  - $A_1$ = 1011 = 11
  - $u_1$ = 00110100 = 52

52 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example
Example: Step 2
- After step 1: $A_1 = 11$ = 1011, $u_1 = 52$ = 00110100
- Symbol identification
  - A ($c_V = 0$, $p_V = 8$): $U(A) = 11 \cdot (0 + 8) = 88$, $U(A) > u_1$, output A
  - N ($c_V = 8$, $p_V = 5$): $U(N) = 11 \cdot (8 + 5) = 143$
  - B ($c_V = 13$, $p_V = 3$): $U(B) = 11 \cdot (13 + 3) = 176$
- Update
  - $A^* = 11 \cdot 8 = 88$ = 01011000
  - $u^* = 52 - 0 = 52$ = 00110100
  - $\Delta z = 1$
  - $A_2$ = 1011 = 11
  - $u_2$ = 01101000 = 104

53 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example
Example: Step 3
- After step 2: $A_2 = 11$ = 1011, $u_2 = 104$ = 01101000
- Symbol identification
  - A ($c_V = 0$, $p_V = 8$): $U(A) = 11 \cdot (0 + 8) = 88$, $U(A) \le u_2$
  - N ($c_V = 8$, $p_V = 5$): $U(N) = 11 \cdot (8 + 5) = 143$, $U(N) > u_2$, output N
  - B ($c_V = 13$, $p_V = 3$): $U(B) = 11 \cdot (13 + 3) = 176$
- Update
  - $A^* = 11 \cdot 5 = 55$ = 00110111
  - $u^* = 104 - 88 = 16$ = 00010000
  - $\Delta z = 2$
  - $A_3$ = 1101 = 13
  - $u_3$ = 01000000 = 64

54 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example
Example: Step 4
- After step 3: $A_3 = 13$ = 1101, $u_3 = 64$ = 01000000
- Symbol identification
  - A ($c_V = 0$, $p_V = 8$): $U(A) = 13 \cdot (0 + 8) = 104$, $U(A) > u_3$, output A
  - N ($c_V = 8$, $p_V = 5$): $U(N) = 13 \cdot (8 + 5) = 169$
  - B ($c_V = 13$, $p_V = 3$): $U(B) = 13 \cdot (13 + 3) = 208$
- Update
  - $A^* = 13 \cdot 8 = 104$ = 01101000
  - $u^* = 64 - 0 = 64$ = 01000000
  - $\Delta z = 1$
  - $A_4$ = 1101 = 13
  - $u_4$ = 10000000 = 128

55 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example
Example: Step 5
- After step 4: $A_4 = 13$ = 1101, $u_4 = 128$ = 10000000
- Symbol identification
  - A ($c_V = 0$, $p_V = 8$): $U(A) = 13 \cdot (0 + 8) = 104$, $U(A) \le u_4$
  - N ($c_V = 8$, $p_V = 5$): $U(N) = 13 \cdot (8 + 5) = 169$, $U(N) > u_4$, output N
  - B ($c_V = 13$, $p_V = 3$): $U(B) = 13 \cdot (13 + 3) = 208$
- Update
  - $A^* = 13 \cdot 5 = 65$ = 01000001
  - $u^* = 128 - 104 = 24$ = 00011000
  - $\Delta z = 1$
  - $A_5$ = 1000 = 8
  - $u_5$ = 00110000 = 48

56 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example
Example: Step 6 (last symbol)
- After step 5: $A_5 = 8$ = 1000, $u_5 = 48$ = 00110000
- Symbol identification
  - A ($c_V = 0$, $p_V = 8$): $U(A) = 8 \cdot (0 + 8) = 64$, $U(A) > u_5$, output A
  - N ($c_V = 8$, $p_V = 5$): $U(N) = 8 \cdot (8 + 5) = 104$
  - B ($c_V = 13$, $p_V = 3$): $U(B) = 8 \cdot (13 + 3) = 128$
- The bitstream yields the symbol sequence BANANA
Note: Decoding required some bits after the end of the bitstream
- For the non-prefix-free variant: Use bits equal to 0
- For the prefix-free variant: Any bit values (0 or 1) can be used

57 Arithmetic Coding / Efficiency
Efficiency of Arithmetic Coding
Increase in Codeword Length relative to Shannon-Fano-Elias Coding
Excess rate due to rounding of the interval width:
  $\Delta\ell = \lceil -\log_2 W_N \rceil - \lceil -\log_2 p(s) \rceil < 1 + \log_2 \frac{p(s)}{W_N}$   (48)
Upper bound for the increase in codeword length per symbol relative to infinite-precision Shannon-Fano-Elias coding:
  $\Delta\bar{\ell} = \frac{\Delta\ell}{N} < \frac{1}{N} + \log_2\left(1 + 2^{1-U}\right) - \log_2\left(1 - \frac{2^{-V}}{p_{\min}}\right)$   (49)
  (for a derivation see Wiegand, Schwarz, pages 51-52)
Example: number of coded symbols N = 1000, arithmetic precision V = 16 and U = 12, minimum probability mass p_min = 0.02. The bound (49), evaluated with these values, limits the increase in codeword length per symbol.
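To get a feeling for the size of the bound, the right-hand side of (49) can be evaluated numerically. The sketch below assumes the form of (49) with the terms 1/N, log2(1 + 2^(1-U)) and -log2(1 - 2^(-V)/p_min); the function name `excess_rate_bound` is chosen here for illustration.

```python
from math import log2

def excess_rate_bound(N, U, V, p_min):
    """Right-hand side of the per-symbol bound (49), in bits per symbol."""
    rounding_of_width = log2(1 + 2.0 ** (1 - U))        # interval width kept with U bits
    rounding_of_pmf = -log2(1 - 2.0 ** (-V) / p_min)    # probabilities kept with V bits
    return 1.0 / N + rounding_of_width + rounding_of_pmf

bound = excess_rate_bound(N=1000, U=12, V=16, p_min=0.02)
print(f"{bound:.4f} bit per symbol")   # 0.0028 bit per symbol
```

For the example parameters, the 1/N term and the two rounding terms are of similar magnitude (about 0.001, 0.0007 and 0.0011 bit per symbol).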

58 Arithmetic Coding / Arithmetic Coding in Practice
Complexity Reduction: Binary Arithmetic Coding
Binary Arithmetic Coding
  - Most popular type of arithmetic coding: JPEG 2000, H.264, H.265
  - Binarization of S ∈ {a_0, a_1, ..., a_{M-1}} produces bins C ∈ {0, 1}
  - Any prefix code can be used for binarization
Example: Truncated unary binarization
  Table columns: S_n | number of bins B | C_0 C_1 C_2 ... C_{M-2} C_{M-1}
  Table rows: a_0, a_1, ..., a_{M-2}, a_{M-1} (the bin counts and bin values are garbled in this copy)
Entropy is unchanged by the binarization S → C:
  $H(S) = E\{ -\log_2 p(S) \} = E\{ -\log_2 p(C) \} = H(C)$
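A small sketch may help to see why the binarization leaves the entropy unchanged. It assumes one common convention for truncated unary binarization (symbol a_k becomes k zeros followed by a terminating 1, and the terminator is dropped for the last symbol); the slide's own 0/1 convention cannot be recovered from this copy. The pmf is a hypothetical example, and all function names are chosen here.

```python
from math import log2

def truncated_unary(k, M):
    """Bins for symbol index k in an alphabet of size M (assumed convention)."""
    return [0] * k + ([1] if k < M - 1 else [])

def entropy(pmf):
    return -sum(p * log2(p) for p in pmf if p > 0)

def binary_entropy(q):
    return 0.0 if q <= 0.0 or q >= 1.0 else -q * log2(q) - (1 - q) * log2(1 - q)

def entropy_of_bins(pmf):
    """Sum over bin positions of P(bin is coded) * H(bin | all earlier bins are 0)."""
    total, tail = 0.0, 1.0              # tail = probability that the current bin is reached
    for p in pmf[:-1]:                  # the last symbol needs no bin of its own
        if tail > 0:
            total += tail * binary_entropy(p / tail)
        tail -= p
    return total

pmf = [0.5, 0.25, 0.125, 0.125]         # hypothetical pmf, not taken from the slides
for k in range(len(pmf)):
    print(k, truncated_unary(k, len(pmf)))
print(entropy(pmf), entropy_of_bins(pmf))   # 1.75 1.75
```

The second function is just the chain rule applied to the bin sequence, so both numbers agree whenever each bin is coded with its correct conditional probability.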

59 Arithmetic Coding / Arithmetic Coding in Practice
Practical Arithmetic Coding
Complexity Reduction
  - Binary arithmetic coding
  - Multiplication-free implementations
  - Bypass mode: low-complexity coding of bins with p = 0.5
Practical Design Aspects
  1. Context selection
     - Use reasonable context variables X = f(S_{n-1}, S_{n-2}, ...) for switching probability tables p(a | X)
     - Use context switching only when useful (certain bins)
  2. Estimate probabilities during coding
     - Choose appropriate window sizes for estimation
  3. Suitably combine context selection and probability estimation

60 Comparison of Lossless Coding Techniques
Experimental Comparison of Lossless Coding Techniques
Example: Markov Source
  - Stationary Markov source given by its conditional pmf
    Table columns: a | p(a | a_0) | p(a | a_1) | p(a | a_2); table rows: a_0, a_1, a_2 (the numeric entries are missing in this copy)
  - $H(S) = \ldots$, $\bar{H}(S) = \ldots$
Bounds for lossless coding
  - Entropy rate $\bar{H}(S)$ for coding of infinitely many symbols
  - Instantaneous entropy rate $H_{\text{inst}}(S, L)$ for coding L symbols:
    $H_{\text{inst}}(S, L) = \frac{1}{L} H(S_0, S_1, \ldots, S_{L-1})$   (50)
Coding experiment
  - Coding of realizations of the example stationary Markov source
  - Calculate the average codeword length for sequences of 1 to 1000 symbols
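The behaviour of the instantaneous entropy rate (50) can be illustrated with a small computation. Since the conditional pmf of the example source is not readable in this copy, the sketch uses a hypothetical 2-symbol Markov source; the transition matrix `P` and all function names are assumptions made here. For a stationary first-order Markov source the chain rule gives H(S_0, ..., S_{L-1}) = H(S_0) + (L - 1) · H(S_n | S_{n-1}).

```python
from math import log2

# Hypothetical 2-symbol stationary Markov source (NOT the source from the slides).
# P[a][b] = probability that the next symbol is b given that the previous symbol is a.
P = [[0.9, 0.1],
     [0.2, 0.8]]

def stationary(P, iterations=500):
    """Stationary pmf by repeated application of the transition matrix."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[a] * P[a][b] for a in range(n)) for b in range(n)]
    return pi

def entropy(pmf):
    return -sum(p * log2(p) for p in pmf if p > 0)

pi = stationary(P)
H_marginal = entropy(pi)                                      # H(S_0)
H_rate = sum(pi[a] * entropy(P[a]) for a in range(len(P)))    # H(S_n | S_{n-1})

def h_inst(L):
    """Instantaneous entropy rate (50) for L symbols of the Markov source."""
    return (H_marginal + (L - 1) * H_rate) / L

for L in (1, 10, 100, 1000):
    print(L, round(h_inst(L), 4))
```

For L = 1 the value equals the marginal entropy, and for growing L it approaches the entropy rate from above, which is why short sequences cannot be coded at the entropy rate.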

61 Comparison of Lossless Coding Techniques
Experimental Results
[Figure: average codeword length per symbol versus number of coded symbols (logarithmic scale). Curves: scalar Huffman code (3 codewords); conditional Huffman code (3 × 3 codewords); Huffman code for fixed-length vectors (5 symbols, 243 codewords); Huffman code for variable-length vectors (17 codewords); arithmetic coding (16 bits of precision for interval sizes and probabilities); instantaneous entropy rate; entropy rate.]

62 Summary
Part Summary
Uniquely decodable codes & bounds for lossless coding
  - Kraft inequality, prefix codes
  - Scalar entropy, conditional entropy, block entropy
  - Entropy rate, instantaneous entropy rate
Variable-length codes for scalars and vectors
  - Optimal code for given pmf: Huffman code
  - Scalar, conditional codes: inefficient for pmfs with p(a) > 0.5
  - Block codes and V2V codes: code tables can become extremely large
  - Difficult adaptation to instationary sources
Arithmetic coding
  - No codeword table: iterative construction of codeword
  - Close to entropy bound for N ≫ 1
  - Well suited for exploiting statistical dependencies
  - Well suited for adapting probabilities during coding

63 Exercises
Exercise 1: Implement an Arithmetic Encoder/Decoder
Implement an arithmetic encoder and decoder (the one discussed in the lecture):
  - Use 64-bit integer arithmetic for all operations.
  - Start with a 4-symbol alphabet {a, b, c, d} and a fixed pmf {6/16, 5/16, 3/16, 2/16} for verifying the implementation.
  - Measure the average codeword length (bits per symbol) for long symbol sequences and compare it to the entropy rate.
  - Extend the implementation to general file types. Use a fixed pmf (p_k = 1/256) for the bytes of a file.
For preparing the implementation, think about the following:
  - How do we determine the newly settled bits and the new number of outstanding bits c_{n+1} during encoding (based on c_n and B)?
Suggestion: Use the framework (written in C++) on the web-site. It provides file input/output and file comparison, and implements classes for reading and writing bitstreams (bit by bit). In ac/test you'll find examples for the 4-symbol pmf ("test1mioabcd.txt") and general text files ("Goethe.txt").
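As a reference value for the comparison asked for in Exercise 1, the entropy of the fixed 4-symbol pmf can be computed directly. The sketch assumes that the test sequence is iid with exactly this pmf, in which case the entropy rate equals the scalar entropy; the helper `bits_per_symbol` is a hypothetical name for the measurement, not part of the C++ framework.

```python
from fractions import Fraction
from math import log2

pmf = [Fraction(6, 16), Fraction(5, 16), Fraction(3, 16), Fraction(2, 16)]   # symbols a, b, c, d

def entropy(pmf):
    """Scalar entropy in bits per symbol."""
    return -sum(float(p) * log2(p) for p in pmf if p > 0)

def bits_per_symbol(num_bits_written, num_symbols):
    """Average codeword length of a coded sequence."""
    return num_bits_written / num_symbols

H = entropy(pmf)
print(f"H(S) = {H:.4f} bit per symbol")     # H(S) = 1.8829 bit per symbol
print(f"uniform bytes: {log2(256):.1f}")    # 8.0, for the fixed pmf p_k = 1/256
```

A correct codec should reach an average codeword length only slightly above this value for long sequences; for the byte model with p_k = 1/256 the reference value is 8 bits per byte.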

64 Exercises
Exercise 2: Arithmetic Coding with Adaptive Pmfs
Extend the implemented arithmetic codec by backward-adaptive pmf estimation and the usage of conditional pmfs:
  - Use the iid model and estimate the pmf during encoding/decoding.
  - Try to improve the performance by using a Markov model (estimation of conditional pmfs).
  - Try to further improve the performance by using two preceding symbols as condition (2nd order Markov model).
  - Test the different probability models (iid with fixed pmf, iid with adaptive pmf, Markov with adaptive pmfs, and 2nd order Markov with adaptive pmfs) with the test file Goethe.txt provided in ac/test.
For preparing the implementation, think about the following:
  - How can we estimate a rounded pmf (with V bits of precision) during coding?
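One possible starting point for the last question is to keep symbol counts and to map them to integer probability masses with V bits. The sketch below is only one of several reasonable choices and is not taken from the lecture: it reserves one unit of mass per symbol so that no mass becomes zero, and it allows the masses to sum to slightly less than 2^V. The names `rounded_pmf` and `AdaptiveModel` are made up here.

```python
def rounded_pmf(counts, V):
    """Map positive counts to integer masses with sum <= 2**V and every mass >= 1."""
    alphabet_size, total = len(counts), sum(counts)
    budget = (1 << V) - alphabet_size       # mass left after reserving 1 unit per symbol
    if budget < 0:
        raise ValueError("2**V must be at least the alphabet size")
    return [1 + (c * budget) // total for c in counts]

class AdaptiveModel:
    """Backward-adaptive iid model: encoder and decoder apply the same updates."""

    def __init__(self, alphabet_size, V=16):
        self.counts = [1] * alphabet_size   # start from a uniform estimate
        self.V = V

    def pmf(self):
        return rounded_pmf(self.counts, self.V)

    def update(self, symbol_index):
        self.counts[symbol_index] += 1      # call after each coded symbol

model = AdaptiveModel(alphabet_size=4, V=4)
for s in [0, 0, 1, 0]:
    model.update(s)
print(model.pmf())   # [7, 4, 2, 2]
```

For conditional pmfs, one such model per context (for example per preceding symbol) can be kept in a list or dictionary; the update must always use only symbols that the decoder has already decoded.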


CSEP 590 Data Compression, Autumn 2007: Arithmetic Coding
Reals in Binary
Any real number x in the interval [0, 1) can be represented in binary as .b_1 b_2 ..., where b_i is a bit.
Example: x = .00101... (binary representation)
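The binary representation can be produced bit by bit through repeated doubling. The following is a minimal sketch; the function name `binary_digits` and the example value 5/32 are chosen here for illustration, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

def binary_digits(x, n):
    """First n bits b_1, ..., b_n of a number x in [0, 1)."""
    if not 0 <= x < 1:
        raise ValueError("x must lie in [0, 1)")
    bits = []
    for _ in range(n):
        x *= 2                  # shift the binary point one position to the right
        bit = int(x >= 1)       # the integer part is the next bit
        bits.append(bit)
        x -= bit
    return bits

print(binary_digits(Fraction(5, 32), 5))   # [0, 0, 1, 0, 1]
print(binary_digits(Fraction(1, 3), 6))    # [0, 1, 0, 1, 0, 1]
```

The first call shows a number with a finite expansion (5/32 = .00101), the second one a number whose expansion is periodic and never terminates.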