Summary of Last Lectures

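1 Lossless Coding IV
[Title-slide figure: Huffman code tree with a root node and leaves for the symbols a to i, next to a table with columns $a_k$, $p_k$, $b_k$. Node weights and codewords are not recoverable from the transcription.]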

2 Last Lectures
Summary of Last Lectures
H. Schwarz (FU Berlin), Image and Video Coding: Lossless Coding IV

Unique Decodability / Prefix Codes
- There are no better uniquely decodable codes than the best prefix codes
- Only need to consider prefix codes (which are also instantaneously decodable)

Variable-Length Codes
- Scalar codes: $\bar{\ell} \ge H(S)$
- Conditional codes: $\bar{\ell} \ge H(S_n \mid S_{n-1})$ with $H(S_n \mid S_{n-1}) \le H(S)$
- Block codes: $\bar{\ell} \ge H_N(S)/N$ with $H_N(S)/N \le H_{N-1}(S)/(N-1)$
- V2V codes: Code for variable-length symbol sequences
- Optimal code construction: Huffman algorithm for the corresponding pmf

Fundamental Lossless Source Coding Theorem
- For all lossless coding techniques: $\bar{\ell} \ge \bar{H}(S) = \lim_{N \to \infty} H_N(S)/N$ (entropy rate: largest lower bound)

Variable-Length Codes for Nonstationary or Unknown Sources
- Adapt code during encoding/decoding (forward/backward adaptation)

3 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Coding: Review

Shannon-Fano-Elias Coding: Special Block Code of size N
- Order of N-symbol sequences $\{s_k\}$ is known by encoder and decoder
- N-th order pmf $p(s_k) = P(S = s_k)$ is known by encoder and decoder
- On-the-fly encoding and decoding (no codeword table)

Basic Concept
- Unique mapping of N-symbol sequences to intervals $I_k$ of the cdf $F(s)$
- Half-open intervals $I_k = [L_k, U_k) = [L_k, L_k + W_k)$ are characterized by
  - Interval width: $W_k = F(s_k) - F(s_{k-1}) = p(s_k)$
  - Lower interval boundary: $L_k = F(s_{k-1}) = \sum_{i<k} p(s_i)$
- Codewords: Fractional bits of a representative value $v_k$ inside interval $I_k$
  - Length of codeword: $K = \lceil -\log_2 W_k \rceil$
  - Representative value: $v_k = \lceil L_k \, 2^K \rceil \, 2^{-K} = z_k \, 2^{-K}$
  - Codeword: Binary representation of $z_k = \lceil L_k \, 2^K \rceil$ with $K$ bits

4 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Encoding: Illustration

[Figure: cdf $F(s)$ over the ordered messages $s_0, \dots, s_{k-1}, s_k, s_{k+1}, \dots$ The interval of message $s_k$ is $I(s_k) = [L, L+W)$ with $W = p(s_k)$ and $L = \sum_{i<k} p(s_i)$. Number of bits: $K = \lceil -\log_2 W \rceil$. Integer part: $z = \lceil L \, 2^K \rceil$. Representative value: $v = \lceil L \, 2^K \rceil \, 2^{-K}$. Codeword $b(s_k)$: binary representation of $z$ with $K$ bits.]

5 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Encoding: Summary

Determination of Codewords
Given: Ordered set of symbol sequences $\{s_k\}$ with associated pmf $\{p_k\}$
Construct codeword $b_k = b(s_k)$ for any particular sequence $s_k$ by
1. Determine interval width $W_k$ and lower interval boundary $L_k$
   $W_k = p_k$   (1)
   $L_k = \sum_{i<k} p_i$   (2)
2. Determine codeword length $K_k$
   $K_k = \lceil -\log_2 W_k \rceil$   (3)
3. Determine representative integer $z_k$
   $z_k = \lceil L_k \, 2^{K_k} \rceil$   (4)
4. Codeword $b_k$: Binary representation of $z_k$ with $K_k$ bits
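The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the lecture: the function names `codeword_length` and `sfe_codewords` are made up here, exact fractions are used to avoid floating-point rounding, and the input is assumed to be the block pmf of the lecture's binary iid example (p(a) = 0.8, p(b) = 0.2, blocks of three symbols in the order aaa, aab, ..., bbb).

```python
from fractions import Fraction
from math import ceil


def codeword_length(width):
    """Smallest integer K with 2**-K <= width, i.e. K = ceil(-log2(width))."""
    k = 0
    while Fraction(1, 2**k) > width:
        k += 1
    return k


def sfe_codewords(pmf):
    """Map an ordered pmf (list of Fractions) to Shannon-Fano-Elias codewords."""
    codewords = []
    lower = Fraction(0)                      # L_k = sum of p_i for i < k
    for p in pmf:
        k = codeword_length(p)               # K_k = ceil(-log2 W_k), W_k = p_k
        z = ceil(lower * 2**k)               # z_k = ceil(L_k * 2^K_k)
        codewords.append(format(z, "b").zfill(k) if k else "")
        lower += p
    return codewords


# Assumed input: blocks of three symbols from the iid source with
# p(a) = 4/5 and p(b) = 1/5, ordered aaa, aab, aba, abb, baa, bab, bba, bbb.
pa, pb = Fraction(4, 5), Fraction(1, 5)
block_pmf = [x * y * w for x in (pa, pb) for y in (pa, pb) for w in (pa, pb)]
print(sfe_codewords(block_pmf))
```

Note that no codeword table is stored: each codeword follows directly from the pmf and the agreed order of the sequences.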

6 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Decoding: Illustration

[Figure: cdf $F(s)$ over the ordered messages $s_0, \dots, s_{k-1}, s_k, s_{k+1}, \dots$ with upper interval boundaries $U_k = \sum_{i \le k} p(s_i)$.]
- Read codeword $b$: binary representation of $z$ with $K$ bits
- Representative value: $v = z \, 2^{-K}$
- Decoding process: Compare $v$ with the upper interval boundaries $U = L + W$ in increasing order: $U_0 \le v$, ..., $U_{k-1} \le v$, $U_k > v$
- Decoded message: $s_k$

7 Shannon-Fano-Elias Codes / Review: Basic Concept
Shannon-Fano-Elias Decoding: Summary

Decoding of a Symbol Sequence
Given: Ordered set of symbol sequences $\{s_k\}$ with associated pmf $\{p_k\}$
1. Read codeword $b$: Codeword $b$ has $K$ bits and represents the binary value of the integer $z$
2. Determine the representative value $v$ according to
   $v = z \, 2^{-K}$   (5)
3. Initialization: Set index $k = 0$ and upper interval boundary $U_0 = p_0$
4. Compare $v$ with $U_k$
   - If $v < U_k$: Output decoded symbol sequence $s_k$ and terminate decoding
   - Otherwise ($v \ge U_k$): Set $k = k + 1$ and $U_k = U_{k-1} + p_k$, then go to step 4
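As a rough sketch of the decoding loop above (again not the lecturer's code), the following Python function walks through the upper interval boundaries until one exceeds v. The name `sfe_decode` is hypothetical, the codeword is assumed to be a non-empty bit string, and `block_pmf` is the assumed pmf of the lecture's three-symbol block example.

```python
from fractions import Fraction


def sfe_decode(codeword, pmf):
    """Return the index k of the sequence whose interval contains v."""
    # v = z * 2^-K, with z given by the K bits of the codeword
    v = Fraction(int(codeword, 2), 2 ** len(codeword))
    upper = Fraction(0)
    for k, p in enumerate(pmf):
        upper += p                     # U_k = U_{k-1} + p_k (U_0 = p_0)
        if v < upper:
            return k                   # first boundary above v identifies s_k
    raise ValueError("v is not below the last upper boundary")


# Assumed input: block pmf for p(a) = 4/5, p(b) = 1/5, order aaa ... bbb.
pa, pb = Fraction(4, 5), Fraction(1, 5)
block_pmf = [x * y * w for x in (pa, pb) for y in (pa, pb) for w in (pa, pb)]
print(sfe_decode("11001", block_pmf))  # index into the ordered sequence list
```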

8 Shannon-Fano-Elias Codes / Prefix-free Variant
Example for a Shannon-Fano-Elias Code

Blocks of Three Symbols for a Binary IID Source
- Binary iid source with alphabet $A = \{a, b\}$ and pmf $p = \{0.8, 0.2\}$
- [Table: one row per sequence $s_k$ in {aaa, aab, aba, abb, baa, bab, bba, bbb}, with columns joint pmf $p_k$, interval ($W_k$, $L_k$), number of bits $K_k$, integer $z_k$, and codeword $b_k$. The numeric entries, the average codeword length $\bar{\ell}$, and the block Huffman reference value are missing from the transcription.]
- Column definitions: $W_k = p_k$, $L_k = \sum_{i<k} p_i$, $K_k = \lceil -\log_2 W_k \rceil$, $z_k = \lceil L_k \, 2^{K_k} \rceil$, $b_k$: $z_k$ with $K_k$ bits
- Worse than block Huffman code for same block size (N = 3)
- Code is not prefix-free! This can be a problem (depends on the application)

9 Shannon-Fano-Elias Codes / Prefix-free Variant
Why Is The Code Not Prefix-Free?

Effect of Codeword Concatenation
[Figure: number line with grid points $(i-2)\,2^{-K}, (i-1)\,2^{-K}, i\,2^{-K}, (i+1)\,2^{-K}, (i+2)\,2^{-K}$, showing one interval $I$ with $v \in I$ and $v' \in I$, and one with $v \in I$ but $v' \notin I$.]
- Encoder transmits codeword $b = \{b_0, b_1, b_2, \dots, b_{K-1}\}$ of $K$ bits, signaling the binary fraction $v \in I$
  $v = (0.b_0 b_1 b_2 \cdots b_{K-1})_b$
- Decoder sees a modified binary fraction $v'$ given by
  $v' = (0.b_0 b_1 b_2 \cdots b_{K-1} b_K b_{K+1} b_{K+2} \cdots)_b$
  where $\{b_K, b_{K+1}, b_{K+2}, \dots\}$ are the bits of the following codewords
- Depending on the location of $v$ inside the interval $I$ and the values of the following bits, $v'$ can lie outside the interval $I$

10 Shannon-Fano-Elias Codes / Prefix-free Variant
Prefix-Free Shannon-Fano-Elias Code

Prefix Code Construction
- Ensure that the binary fraction $v'$ (seen by the decoder) lies inside the interval
- Worst case: All following bits are equal to 1
  $v' = v + \sum_{i=K+1}^{\infty} 2^{-i} < L + W$   (6)
- Since the following bits add less than $2^{-K}$ (for any finite number of following bits), we require
  $v' < v + 2^{-K} \le L + W$   (7)
- Question: How many bits $K$ do we need for representing $v$ according to $v = \lceil L \, 2^K \rceil \, 2^{-K}$?
- Have to choose $K$ so that the following inequality is fulfilled
  $v + 2^{-K} = \lceil L \, 2^K \rceil \, 2^{-K} + 2^{-K} \le L + W$   (8)

11 Shannon-Fano-Elias Codes / Prefix-free Variant
Prefix-Free Shannon-Fano-Elias Code

Prefix Code Construction (continued)
- Since $\lceil x \rceil < x + 1$, the inequality $\lceil L \, 2^K \rceil \, 2^{-K} + 2^{-K} \le L + W$ is always fulfilled if we have
  $(L \, 2^K + 1) \, 2^{-K} + 2^{-K} \le L + W$
  $L + 2 \cdot 2^{-K} \le L + W$
  $2^{1-K} \le W$
  $1 - K \le \log_2 W$
- Unique decodability is guaranteed if we choose
  $K \ge 1 - \log_2 W$   (9)
  $K = \lceil 1 - \log_2 W \rceil = \lceil -\log_2 W \rceil + 1$   (10)
- One additional bit per codeword (compared to the non-prefix-free version)

12 Shannon-Fano-Elias Codes / Prefix-free Variant
Example for a Prefix-Free Shannon-Fano-Elias Code

Repeated Example: Blocks of Three Symbols for a Binary IID Source
- Binary iid source with alphabet $A = \{a, b\}$ and pmf $p = \{0.8, 0.2\}$
- [Table: one row per sequence $s_k$ in {aaa, aab, aba, abb, baa, bab, bba, bbb}, with columns joint pmf $p_k$, interval ($W_k$, $L_k$), number of bits $K_k$, integer $z_k$, and codeword $b_k$. The numeric entries, the average codeword length $\bar{\ell}$, and the block Huffman reference value are missing from the transcription.]
- Column definitions: $W_k = p_k$, $L_k = \sum_{i<k} p_i$, $K_k = \lceil 1 - \log_2 W_k \rceil$, $z_k = \lceil L_k \, 2^{K_k} \rceil$, $b_k$: $z_k$ with $K_k$ bits
- The additional bit ensures that the code becomes a prefix code
- Worse than block Huffman code (several redundant bits)
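The effect of the additional bit can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the lecture: it rebuilds both code variants for the assumed block pmf (p(a) = 4/5, p(b) = 1/5, order aaa to bbb) and tests the prefix property by brute force. The helper names `sfe_codewords`, `is_prefix_free` and `average_length` are hypothetical.

```python
from fractions import Fraction
from math import ceil


def sfe_codewords(pmf, extra_bits=0):
    """Shannon-Fano-Elias codewords; extra_bits=1 gives the prefix-free variant."""
    codewords, lower = [], Fraction(0)
    for p in pmf:
        k = 0
        while Fraction(1, 2**k) > p:         # K = ceil(-log2 W)
            k += 1
        k += extra_bits                      # K = ceil(-log2 W) + 1 if prefix-free
        z = ceil(lower * 2**k)               # z = ceil(L * 2^K)
        codewords.append(format(z, "b").zfill(k) if k else "")
        lower += p
    return codewords


def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(i != j and b.startswith(a)
                   for i, a in enumerate(codewords)
                   for j, b in enumerate(codewords))


def average_length(pmf, codewords):
    """Expected codeword length in bits per block."""
    return sum(p * len(b) for p, b in zip(pmf, codewords))


pa, pb = Fraction(4, 5), Fraction(1, 5)
block_pmf = [x * y * w for x in (pa, pb) for y in (pa, pb) for w in (pa, pb)]
plain = sfe_codewords(block_pmf)
prefix_free = sfe_codewords(block_pmf, extra_bits=1)
print(is_prefix_free(plain), is_prefix_free(prefix_free))
print(average_length(block_pmf, plain), average_length(block_pmf, prefix_free))
```

Under these assumptions the average length is reported in bits per block of three symbols; dividing by N = 3 gives bits per symbol.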

13 Shannon-Fano-Elias Codes / Bounds on Average Codeword Length
Efficiency of Shannon-Fano-Elias Codes

Average Codeword Length
- Average codeword length $\bar{\ell}$ per symbol
  $\bar{\ell} = \frac{E\{K(S)\}}{N} = \frac{E\{A + \lceil -\log_2 p_N(S) \rceil\}}{N}$ with $A = 1$ (prefix-free) or $A = 0$ (otherwise)   (11)

Bounds on Average Codeword Length
- Using the inequalities $\lceil x \rceil \ge x$ and $\lceil x \rceil < x + 1$, we obtain
  $\frac{E\{-\log_2 p_N(S)\}}{N} + \frac{A}{N} \le \bar{\ell} < \frac{E\{-\log_2 p_N(S)\}}{N} + \frac{A}{N} + \frac{1}{N}$
  $\frac{H_N(S)}{N} + \frac{A}{N} \le \bar{\ell} < \frac{H_N(S)}{N} + \frac{A}{N} + \frac{1}{N}$   (12)
- Non-prefix-free version ($A = 0$): Same bounds as for block Huffman coding
- Both versions: Close to entropy rate for $N \gg 1$ (for typical sources)

14 Shannon-Fano-Elias Codes / Iterative Coding
Shannon-Fano-Elias Coding: Intermediate Results

Shannon-Fano-Elias Code
- Special block code (for a given number of symbols N)
- Worse than block Huffman code of the same size N
- Still close to the entropy bound $H_N/N$ for $N \gg 1$
- No need to store a codeword table!
- Have to store the N-th order pmf (or N-th order cdf)! What is the advantage?

Iterative Coding
- Can define a suitable order for sequences of N symbols
- Probability intervals are nested
- Iterative calculation of interval boundaries
- Iterative codeword construction

15 Shannon-Fano-Elias Codes / Iterative Coding
Sorting of Symbol Sequences

Lexicographical Order
- Define any order for single symbols $s_k$ (sorted symbol alphabet)
- Consider two sequences of N symbols $s^a = \{s_0^a, s_1^a, \dots, s_{N-1}^a\}$ and $s^b = \{s_0^b, s_1^b, \dots, s_{N-1}^b\}$
- Lexicographical order for sequences of N symbols
  $s^a < s^b \iff \exists\, n < N : (\forall\, k < n : s_k^a = s_k^b) \wedge (s_n^a < s_n^b)$   (13)

Example: Alphabetical Order
- $a < b < \dots < z$ implies ... < aaaa... < aaab... < ... < aaaz... < aaba... < aabb... < ...
- [Figure: the ordered sequences on the cdf axis with the nested boundary values $P(s^{(3)} < aaa)$, $P(s^{(3)} < aab)$, $P(s^{(3)} < aaz)$, $P(s^{(3)} < aba)$, $P(s^{(2)} < aa)$, $P(s^{(2)} < ab)$.]

16 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement: Illustration

Example: 3-symbol alphabet $A = \{a, b, c\}$, with $c(z \mid \dots) = \sum_{x<z} p(x \mid \dots)$
[Figure: interval $I(abb\dots)$ with lower boundary $L(abb)$ and width $W(abb)$, split into three nested sub-intervals:]
- $I(abba\dots)$: $W(abba) = W(abb) \cdot p(a \mid abb)$, $L(abba) = L(abb) + W(abb) \cdot c(a \mid abb)$
- $I(abbb\dots)$: $W(abbb) = W(abb) \cdot p(b \mid abb)$, $L(abbb) = L(abb) + W(abb) \cdot c(b \mid abb)$
- $I(abbc\dots)$: $W(abbc) = W(abb) \cdot p(c \mid abb)$, $L(abbc) = L(abb) + W(abb) \cdot c(c \mid abb)$
Important: Probability intervals are nested!

17 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement: Interval Width

Interval Refinement
- Consider sub-sequences $s^{(n)} = \{s_0, s_1, \dots, s_{n-1}\}$ with $n < N$
- Determine interval $I_{n+1} = [L_{n+1}, L_{n+1} + W_{n+1})$ for $s^{(n+1)} = \{s^{(n)}, s_n\}$ based on interval $I_n = [L_n, L_n + W_n)$ for the prefix sequence $s^{(n)}$

Refinement of Interval Width
- Interval width $W_{n+1}$ for sub-sequence $s^{(n+1)} = \{s^{(n)}, s_n\}$
  $W_{n+1} = P(S^{(n+1)} = s^{(n+1)})$
  $\quad = P(S^{(n)} = s^{(n)}, S_n = s_n)$
  $\quad = P(S^{(n)} = s^{(n)}) \cdot P(S_n = s_n \mid S^{(n)} = s^{(n)})$   (14)
- Iteration rule for interval width
  $W_{n+1} = W_n \cdot p(s_n \mid s_{n-1}, s_{n-2}, \dots, s_0)$   (15)

18 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement: Lower Interval Boundary

Refinement of Lower Interval Boundary
- Lower interval boundary $L_{n+1}$ for sub-sequence $s^{(n+1)} = \{s^{(n)}, s_n\}$
  $L_{n+1} = P(S^{(n+1)} < s^{(n+1)})$
  $\quad = P(S^{(n)} < s^{(n)}) + P(S^{(n)} = s^{(n)}, S_n < s_n)$
  $\quad = P(S^{(n)} < s^{(n)}) + P(S^{(n)} = s^{(n)}) \cdot P(S_n < s_n \mid S^{(n)} = s^{(n)})$   (16)
- Define modified cmf $c(\cdot)$, which excludes the current symbol
  $c(s_n \mid s_{n-1}, \dots, s_0) = P(S_n < s_n \mid S^{(n)} = s^{(n)}) = \sum_{a < s_n} p(a \mid s_{n-1}, \dots, s_0)$   (17)
- Iteration rule for lower interval boundary
  $L_{n+1} = L_n + W_n \cdot c(s_n \mid s_{n-1}, s_{n-2}, \dots, s_0)$   (18)

19 Shannon-Fano-Elias Codes / Iterative Coding
Nested Probability Intervals: Verification

Intervals are Nested
- Lower interval boundary
  $L_{n+1} = L_n + W_n \cdot P(S_n < s_n \mid S^{(n)} = s^{(n)}) \ge L_n$   (19)
- Upper interval boundary
  $L_{n+1} + W_{n+1} = L_n + W_n \cdot P(S_n < s_n \mid S^{(n)} = s^{(n)}) + W_n \cdot P(S_n = s_n \mid S^{(n)} = s^{(n)})$
  $\quad = L_n + W_n \cdot P(S_n \le s_n \mid S^{(n)} = s^{(n)})$
  $\quad = L_n + W_n - W_n \cdot P(S_n > s_n \mid S^{(n)} = s^{(n)}) \le L_n + W_n$   (20)
- The interval for $s^{(n+1)} = \{s^{(n)}, s_n\}$ is fully included in the interval for $s^{(n)}$

20 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Interval Refinement

Iterative Algorithm for Calculation of Interval Boundaries
- Initialization:
  $W_0 = 1$   (21)
  $L_0 = 0$   (22)
- Iteration step:
  $W_{n+1} = W_n \cdot p(s_n \mid s_{n-1}, \dots, s_0)$   (23)
  $L_{n+1} = L_n + W_n \cdot c(s_n \mid s_{n-1}, \dots, s_0)$   (24)

Advantage of Interval Refinement?
- Derived iteration approach: Conditional pmf $p(s_n \mid s_{n-1}, \dots)$ instead of joint pmf $p(s_0, \dots, s_{N-1})$
- Need to store the same amount of data
- But: Conditional pmfs can be well approximated using simple models
  - IID model: $p(s_n \mid s_{n-1}, \dots, s_0) = p(s_n)$
  - Markov model: $p(s_n \mid s_{n-1}, \dots, s_0) = p(s_n \mid s_{n-1})$

21 Shannon-Fano-Elias Codes / Iterative Coding
Practical Iterative Interval Refinement

Simplified Iteration
- IID model:
  $W_{n+1} = W_n \cdot p(s_n)$   (25)
  $L_{n+1} = L_n + W_n \cdot c(s_n)$   (26)
- Markov model:
  $W_{n+1} = W_n \cdot p(s_n \mid s_{n-1})$   (27)
  $L_{n+1} = L_n + W_n \cdot c(s_n \mid s_{n-1})$   (28)
- Many other simple models are possible: Condition $= f(s_{n-1}, \dots)$

Other Aspects
- Switching between symbol alphabets is possible
  - Suitable for complicated syntax (as for prefix codes)
- Adaptation of probability models
  - Probabilities can be estimated during encoding and decoding
  - Elegant way to deal with nonstationary or unknown sources

22 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Encoding Algorithm

Iterative Shannon-Fano-Elias Encoding
1. Given is a sequence $s = \{s_0, s_1, s_2, \dots, s_{N-1}\}$ of N symbols
2. Initialization of probability interval
   $W_0 = 1$ and $L_0 = 0$   (29)
3. Determine probability interval: For each $n = 0, 1, \dots, N-1$, do
   $W_{n+1} = W_n \cdot p(s_n \mid \cdots)$   (30)
   $L_{n+1} = L_n + W_n \cdot c(s_n \mid \cdots)$   (31)
4. Determine codeword length and codeword value
   $K = \lceil -\log_2 W_N \rceil$ (for prefix code: $K \leftarrow K + 1$)   (32)
   $z = \lceil L_N \, 2^K \rceil$   (33)
5. Transmit codeword $b(s)$: Binary representation of $z$ with $K$ bits
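The five steps above translate almost line by line into code. The following Python sketch assumes an IID model and uses exact fractions in place of the "extremely high precision" the infinite-precision scheme needs; the name `iterative_sfe_encode` and the dictionary-based pmf layout are illustrative choices, not taken from the lecture. The pmf {A: 1/2, N: 1/3, B: 1/6} with symbol order A < N < B is the one used in the lecture's BANANA example.

```python
from fractions import Fraction
from math import ceil


def iterative_sfe_encode(symbols, pmf, prefix_free=False):
    """Encode a symbol sequence with an IID model; returns a bit string."""
    # Modified cmf c(a): sum of p(b) over all symbols b ordered before a.
    cmf, acc = {}, Fraction(0)
    for a in pmf:                       # dict order defines the symbol order
        cmf[a] = acc
        acc += pmf[a]

    W, L = Fraction(1), Fraction(0)     # W_0 = 1, L_0 = 0
    for s in symbols:
        L = L + W * cmf[s]              # L_{n+1} = L_n + W_n * c(s_n)
        W = W * pmf[s]                  # W_{n+1} = W_n * p(s_n)

    K = 0
    while Fraction(1, 2**K) > W:        # K = ceil(-log2 W_N)
        K += 1
    if prefix_free:
        K += 1
    z = ceil(L * 2**K)                  # z = ceil(L_N * 2^K)
    return format(z, "b").zfill(K)


pmf = {"A": Fraction(1, 2), "N": Fraction(1, 3), "B": Fraction(1, 6)}
print(iterative_sfe_encode("BANANA", pmf))
```

For "BANANA" this gives K = 9 and z = 452, which agrees with the values stated in the lecture's example.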

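23 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Encoding Example: IID Source

[Figure: nested intervals for the symbol sequence B, A, N, A, N, A, each step splitting the current interval $[L, L+W)$ into sub-intervals for A, N, B.]

IID source with symbol order A, N, B:
a     | A   | N   | B
p(a)  | 1/2 | 1/3 | 1/6
c(a)  | 0   | 1/2 | 5/6

Iteration: $W_{n+1} = W_n \cdot p(s_n)$, $L_{n+1} = L_n + W_n \cdot c(s_n)$
step  | init | B   | A    | N    | A    | N       | A
$W_n$ | 1    | 1/6 | 1/12 | 1/36 | 1/72 | 1/216   | 1/432
$L_n$ | 0    | 5/6 | 5/6  | 7/8  | 7/8  | 127/144 | 127/144

Codeword:
- $K = \lceil -\log_2 W \rceil = \lceil -\log_2 (1/432) \rceil = 9$
- $z = \lceil L \, 2^K \rceil = \lceil (127/144) \cdot 512 \rceil = 452$
- $b$ = "111000100" ($z = 452$ with $K = 9$ bits)
- $v = (0.111000100)_b = 0.8828125$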

24 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Decoding Algorithm

Iterative Shannon-Fano-Elias Decoding
1. Given:
   - Bitstream $b = \{b_0, b_1, \dots, b_{K-1}\}$ of $K$ bits
   - Number N of symbols to be decoded
2. Determine interval representative: $v = (0.b_0 b_1 \cdots b_{K-1})_b = z \, 2^{-K}$
3. Initialization of probability interval: $W_0 = 1$ and $L_0 = 0$
4. Iterative decoding: For each $n = 0, 1, \dots, N-1$, do
   a. Calculate symbol intervals: For each $a \in A_n$, calculate
      $W_{n+1}(a) = W_n \cdot p(a \mid \cdots)$   (34)
      $L_{n+1}(a) = L_n + W_n \cdot c(a \mid \cdots)$   (35)
   b. Compare $v$ with the upper interval boundaries in increasing symbol order and output the symbol $s_n$ with
      $L_{n+1}(s_n) \le v < U_{n+1}(s_n) = L_{n+1}(s_n) + W_{n+1}(s_n)$   (36)
   c. Update interval boundary and interval width
      $W_{n+1} = W_{n+1}(s_n)$ and $L_{n+1} = L_{n+1}(s_n)$   (37)
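A possible Python rendering of this decoding loop, under the same assumptions as for the encoder (IID model, exact fractions, dictionary order as symbol order), is sketched below. The function name `iterative_sfe_decode` is hypothetical; the test bitstream "111000100" is the 9-bit representation of z = 452 from the lecture's encoding example.

```python
from fractions import Fraction


def iterative_sfe_decode(bits, num_symbols, pmf):
    """Decode num_symbols symbols from a bit string using an IID model."""
    v = Fraction(int(bits, 2), 2 ** len(bits))   # v = z * 2^-K
    W, L = Fraction(1), Fraction(0)              # W_0 = 1, L_0 = 0
    decoded = []
    for _ in range(num_symbols):
        c = Fraction(0)                          # modified cmf c(a)
        for a, p in pmf.items():                 # increasing symbol order
            lower = L + W * c                    # L_{n+1}(a) = L_n + W_n c(a)
            width = W * p                        # W_{n+1}(a) = W_n p(a)
            if v < lower + width:                # first upper boundary above v
                decoded.append(a)
                L, W = lower, width              # update the interval
                break
            c += p
        else:
            raise ValueError("v lies outside the current interval")
    return "".join(decoded)


pmf = {"A": Fraction(1, 2), "N": Fraction(1, 3), "B": Fraction(1, 6)}
print(iterative_sfe_decode("111000100", 6, pmf))
```

The decoder needs the number N of symbols as side information, exactly as step 1 of the algorithm states.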

25 Shannon-Fano-Elias Codes / Iterative Coding
Iterative Decoding Example: IID Source

[Figure and table: for each step, the candidate intervals $(L_{n+1}, W_{n+1})(a)$ for $a \in \{A, N, B\}$, computed as $L_{n+1}(a) = L_n + W_n \cdot c(a)$ and $W_{n+1}(a) = W_n \cdot p(a)$, starting from $(L_0, W_0) = (0, 1)$; after the first symbol, $(L_1, W_1) = (5/6, 1/6)$. The remaining numeric entries and the values of $v$ and $b$ are not recoverable from the transcription.]
- Decoded symbols $s_n$: B, A, N, A, N, A
- Result: $s$ = "BANANA"

26 Arithmetic Coding
Arithmetic Coding

Iterative Shannon-Fano-Elias Coding
- Iterative encoding and decoding
  - Iterative interval refinement
  - Simple codeword construction
- Precision requirements and delay for larger N
  - Require extremely high precision for $W_n$, $L_n$, and $v$
  - Encoder: Complete codeword written at end of encoding
  - Decoder: Complete codeword read at start of decoding

Arithmetic Coding
- Fixed-precision approximation of Shannon-Fano-Elias coding
  - Represent pmf(s) with fixed-precision integers
  - Represent interval width with fixed-precision integers
  - Output bits as soon as possible

27 Arithmetic Coding / Quantization of Pmf and Interval Width
Quantization of Pmf(s)

Fixed-Precision Approximation of Pmf(s)
- Choose number of bits $V$ for representing probability masses
- Represent probability masses $p(a)$ by $V$-bit integers $p_V(a)$
  $p(a) = p_V(a) \cdot 2^{-V}$   (38)
- The resulting modified cmf $c(a)$ can also be represented by $V$-bit integers $c_V(a)$
  $c(a) = \sum_{b<a} p(b) = \Big( \sum_{b<a} p_V(b) \Big) \cdot 2^{-V} = c_V(a) \cdot 2^{-V}$   (39)

Requirements on Pmf Approximation
- Probability masses must be non-zero and the pmf must be valid
  $\forall a : p_V(a) > 0$ and $\sum_a p_V(a) \le 2^V$   (40)
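The lecture does not prescribe how the integers p_V(a) are chosen, only the two requirements in (40). One simple possibility, shown here purely as an assumption, is to round p(a) * 2^V to the nearest integer, enforce a minimum of 1, and then shrink the largest entries until the sum no longer exceeds 2^V. The function name `quantize_pmf` is hypothetical.

```python
from fractions import Fraction


def quantize_pmf(pmf, V):
    """Return integer tables (p_V, c_V) for a pmf given as a dict of Fractions."""
    total = 1 << V
    # Assumed rounding rule: nearest integer, but never zero.
    p_v = {a: max(1, round(p * total)) for a, p in pmf.items()}
    # Repair step: make sure the quantized masses sum to at most 2^V.
    while sum(p_v.values()) > total:
        largest = max(p_v, key=p_v.get)
        if p_v[largest] == 1:
            raise ValueError("alphabet too large for V bits")
        p_v[largest] -= 1
    # Modified cmf: c_V(a) = sum of p_V(b) for all b ordered before a.
    c_v, acc = {}, 0
    for a in pmf:
        c_v[a] = acc
        acc += p_v[a]
    return p_v, c_v


pmf = {"A": Fraction(1, 2), "N": Fraction(1, 3), "B": Fraction(1, 6)}
p_v, c_v = quantize_pmf(pmf, V=4)
print(p_v, c_v)
```

With V = 4 this rule happens to reproduce the integer tables used in the lecture's coding example (p_V = 8, 5, 3 and c_V = 0, 8, 13 for A, N, B); other rounding rules that satisfy (40) would work as well, as long as encoder and decoder use the same tables.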

28 Arithmetic Coding / Quantization of Pmf and Interval Width
Quantization of Interval Width

Fixed-Precision Approximation of Interval Width
- Represent the interval width $W_n$ by a $U$-bit integer $A_n$ and a counter $z_n \ge U$
  $W_n = A_n \cdot 2^{-z_n}$   (41)
- Use the maximum possible precision for $A_n$: Restrict $A_n$ according to
  $2^{U-1} \le A_n < 2^U$   (42)
- Use the following initialization
  $A_0 = 2^U - 1$, $z_0 = U$, so that $W_0 = 1 - 2^{-U}$   (43)
- Binary representation of the interval width $W_n = A_n \cdot 2^{-z_n}$ ($z_n$ bits in total after the binary point):
  $W_n = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}} \underbrace{1xx\cdots x}_{U \text{ bits}} 000\cdots$

29 Arithmetic Coding / Rounding in Interval Refinement
Rounding in Interval Refinement

Conventional Refinement of Interval Width
- Refinement of interval width
  $W_{n+1} = W_n \cdot p(s_n)$   (44)
  $A_{n+1} \cdot 2^{-z_{n+1}} = \big( A_n \cdot p_V(s_n) \big) \cdot 2^{-(z_n + V)}$, where $A_n \cdot p_V(s_n)$ is a $(U+V)$-bit integer   (45)
- In general: $W_n \cdot p(s_n)$ cannot be represented using a $U$-bit integer. What can we do?

Requirement for Unique Decodability
- The code remains uniquely decodable if we ensure
  $0 < W_{n+1} \le W_n \cdot p(s_n)$   (46)
- Solution: Rounding down of $W_n \cdot p(s_n)$ in each iteration so that $W_{n+1}$ can be represented using $A_{n+1} \cdot 2^{-z_{n+1}}$ with $2^{U-1} \le A_{n+1} < 2^U$

30 Arithmetic Coding / Rounding in Interval Refinement
Refinement of Interval Width

Product of Interval Width and Pmf Entry
- Binary representations of $W_n = A_n \cdot 2^{-z_n}$ and $p(s_n) = p_V(s_n) \cdot 2^{-V}$
  $W_n = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}} \underbrace{1xx\cdots x}_{U \text{ bits}} 000\cdots$
  $p(s_n) = 0.\underbrace{xxx\cdots x}_{V \text{ bits}} 000\cdots$
- Interval refinement $W_n \cdot p(s_n) \to W_{n+1}$
  $W_n \cdot p(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}} \underbrace{\underbrace{00\cdots0}_{\Delta z \text{ bits}} \underbrace{1x\cdots x}_{U \text{ bits}} \underbrace{xx\cdots x}_{V - \Delta z \text{ bits}}}_{U+V \text{ bits: } A_n \cdot p_V(s_n)} 000\cdots$
  $W_{n+1} = 0.\underbrace{00\cdots0 \; 00\cdots0}_{z_{n+1} - U = z_n - U + \Delta z \text{ bits}} \underbrace{1x\cdots x}_{U \text{ bits: } A_{n+1}} 000\cdots$

31 Arithmetic Coding / Rounding in Interval Refinement
Refinement of Interval Width

Arithmetic Operations for Interval Width Update
  $W_n \cdot p(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}} \underbrace{\underbrace{00\cdots0}_{\Delta z \text{ bits}} \underbrace{1x\cdots x}_{U \text{ bits}} \underbrace{xx\cdots x}_{V - \Delta z \text{ bits}}}_{U+V \text{ bits: } A_n \cdot p_V(s_n)} 000\cdots$
  $W_{n+1} = 0.\underbrace{00\cdots0 \; 00\cdots0}_{z_{n+1} - U \text{ bits}} \underbrace{1x\cdots x}_{U \text{ bits: } A_{n+1}} 000\cdots$

Update of interval width
- Determine the number $\Delta z$ of leading zeros in the $(U+V)$-bit integer $A_n \cdot p_V(s_n)$
- Update the interval width according to ($\gg$ denotes a bit shift to the right)
  $A_{n+1} = \big( A_n \cdot p_V(s_n) \big) \gg (V - \Delta z)$
  $z_{n+1} = z_n + \Delta z$
- Ensures unique decodability: $0 < W_{n+1} \le W_n \cdot p(s_n)$
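The width update is small enough to show in full. The sketch below uses Python's `int.bit_length` to count the leading zeros of the (U+V)-bit product; the function name `refine_width` is an illustrative choice. The two calls replay the first two steps of the lecture's example with U = V = 4.

```python
def refine_width(A, z, p_v, U, V):
    """One rounding step for the interval width W = A * 2**-z.

    Returns (A_next, z_next) with 2**(U-1) <= A_next < 2**U.
    """
    product = A * p_v                          # (U+V)-bit integer A_n * p_V(s_n)
    dz = (U + V) - product.bit_length()        # number of leading zeros
    A_next = product >> (V - dz)               # drop the V - dz low bits (round down)
    return A_next, z + dz


A1, z1 = refine_width(15, 4, 3, U=4, V=4)      # symbol B: 15 * 3 = 45
A2, z2 = refine_width(A1, z1, 8, U=4, V=4)     # symbol A: 11 * 8 = 88
print((A1, z1), (A2, z2))
```

Because the shift only discards low-order bits, the new width never exceeds the exact product, which is the condition (46) for unique decodability.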

32 Arithmetic Coding / Effect on Lower Interval Boundary
Effect on Binary Representation of Lower Interval Boundary

Product of Interval Width and Modified Cmf Entry
- Remember: $L_{n+1} = L_n + W_n \cdot c(s_n)$
- Binary representations of $W_n = A_n \cdot 2^{-z_n}$ and $c(s_n) = c_V(s_n) \cdot 2^{-V}$
  $W_n = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}} \underbrace{1xx\cdots x}_{U \text{ bits}} 000\cdots$
  $c(s_n) = 0.\underbrace{xxx\cdots x}_{V \text{ bits}} 000\cdots$
- Binary representation of the product $W_n \cdot c(s_n)$ ($z_n + V$ bits in total after the binary point)
  $W_n \cdot c(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}} \underbrace{xxx\cdots x}_{U+V \text{ bits}} 000\cdots$

33 Arithmetic Coding / Effect on Lower Interval Boundary
Effect on Binary Representation of Lower Interval Boundary

Update of Lower Interval Boundary
  $W_n \cdot c(s_n) = 0.\underbrace{00\cdots0}_{z_n - U \text{ bits}} \underbrace{xxx\cdots x}_{U+V \text{ bits}} 000\cdots$
What is the effect on the lower interval boundary?
  $L_n = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U \text{ settled bits}} \underbrace{011\cdots1}_{c_n \text{ outstanding bits}} \underbrace{xxxxx\cdots x}_{U+V \text{ active bits}} \underbrace{000\cdots}_{\text{trailing bits}}$
- Trailing bits: Equal to 0, but may be changed later
- Active bits: Directly modified by the update $L_{n+1} = L_n + W_n \cdot c(s_n)$
- Outstanding bits: May be modified by a carry from the active bits
- Settled bits: Not modified in any following interval update

34 Arithmetic Coding / Interval Refinement
Arithmetic Coding: Lower Interval Boundary & Output

Representation of Lower Interval Boundary
  $L_n = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U \text{ settled bits}} \underbrace{011\cdots1}_{c_n \text{ outstanding bits}} \underbrace{xxxxx\cdots x}_{U+V \text{ active bits}} \underbrace{000\cdots}_{\text{trailing bits}}$
- Active bits: $(U+V)$-bit integer $B_n$
  - The intermediate value $B_n + A_n \cdot c_V(s_n)$ requires a $(U+V+1)$-bit integer
- Outstanding bits: Counter $c_n$ (one bit equal to 0, followed by $c_n - 1$ bits equal to 1)
- Settled bits: Output as soon as they become settled

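35 Arithmetic Coding / Interval Refinement
Arithmetic Coding: Interval Refinement

Update of Probability Interval
  $W_n = 0.\underbrace{00\cdots0}_{z_n - U} \underbrace{1xx\cdots x}_{A_n \; (U)}$
  $L_n = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U} \underbrace{011\cdots1}_{c_n} \underbrace{xxx\cdots xxxxxx}_{B_n \; (U+V)}$
  $W_{n+1} = 0.\underbrace{00\cdots0}_{z_n - U} \underbrace{0\cdots0}_{\Delta z} \underbrace{1xx\cdots x}_{A_{n+1} \; (U)}$
  $L_{n+1} = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U} \underbrace{xxxxxxxx\cdots xxx}_{c_n + \Delta z} \underbrace{xxx\cdots xxxxxx}_{B_{n+1} \; (U+V)}$

Interval update ($\ll$ and $\gg$ denote bit shifts to the left and right, $\&$ a bitwise AND)
- $\Delta z$ = number of leading zeros in the $(U+V)$-bit integer $A_n \cdot p_V(s_n)$
- mask $= \big( 1 \ll (U + V - \Delta z) \big) - 1$
- $A_{n+1} = \big( A_n \cdot p_V(s_n) \big) \gg (V - \Delta z)$
- $B_{n+1} = \big( ( B_n + A_n \cdot c_V(s_n) ) \;\&\; \text{mask} \big) \ll \Delta z$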

36 Arithmetic Coding / Continuous Output of Bits
Arithmetic Coding: Output of Settled Bits

Output of Bits
  $L_{n+1} = 0.\underbrace{aaaaa\cdots a}_{z_n - c_n - U} \underbrace{xxxxxxxx\cdots xxx}_{c_n + \Delta z} \underbrace{xxx\cdots xxxxxx}_{B_{n+1} \; (U+V)}$
- Investigate the modified $(c_n + \Delta z)$ bits
- Update the counter $c_{n+1}$ and output the new settled bits

Total Number of Bits for Arithmetic Codeword
- Total number of bits to output (note: $W_N = A_N \cdot 2^{-z_N}$)
  $K = \lceil -\log_2 W_N \rceil = z_N - \lfloor \log_2 A_N \rfloor = z_N - U + 1$   (47)
- Prefix-free version: $K = z_N - U + 2$
- Note: $z_N - U$ is the sum of settled and outstanding bits
  - Output the $c_N$ outstanding bits
  - Output one bit of $B_N$ (prefix-free: two bits)

37 Arithmetic Coding / Codeword Termination
Termination of Arithmetic Codeword

Required Output
- Required output after the last symbol was coded
  - $c_N$ outstanding bits
  - most significant bit of $B_N$ (prefix-free: two most significant bits)
- Note: The lower boundary must be rounded up to the next multiple of $2^{-K}$

Codeword Termination
- Set $n = 1$ (non-prefix-free) or $n = 2$ (prefix-free)
- If any of the last $(U + V - n)$ bits in $B_N$ is equal to 1, do
  - Set $B_N = B_N + (1 \ll (U + V - n))$ (rounding up)
  - If $B_N \ge (1 \ll (U + V))$ (carry condition)
    - Invert outstanding bits, output new settled bits, set $c_N = 1$
    - Remove carry: $B_N = B_N - (1 \ll (U + V))$
- Output outstanding bits (one "0" and $(c_N - 1)$ times "1")
- Output the $n$ most significant bits of $B_N$

38 Arithmetic Coding / Summary of Arithmetic Encoding
Overview of Arithmetic Encoding Process

Arithmetic Encoding
1. Initialization: $A_0 = 2^U - 1$, $B_0 = 0$, $c_0 = 0$
2. Iteration: For $n = 0, 1, \dots, N-1$, do
   a. Calculate:
      $A' = A_n \cdot p_V(s_n)$ ($U+V$ bits)
      $B' = B_n + A_n \cdot c_V(s_n)$ ($U+V+1$ bits)
   b. Determine the number $\Delta z$ of leading zeros in the $(U+V)$-bit integer $A'$
   c. Update:
      mask $= \big( 1 \ll (U + V - \Delta z) \big) - 1$
      $A_{n+1} = A' \gg (V - \Delta z)$
      $B_{n+1} = ( B' \;\&\; \text{mask} ) \ll \Delta z$
   d. Determine the outstanding bits counter $c_{n+1}$ (based on $c_n$ and $B'$)
   e. Output the new settled bits ($c_n + \Delta z - c_{n+1}$ bits)
3. Termination:
   - Round up $B_N$
   - Output $c_N$ outstanding bits + one/two most significant bit(s) of $B_N$
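The encoding process summarized above can be sketched as follows. This is an illustrative Python sketch, not the lecturer's implementation: the name `arithmetic_encode` is made up, the outstanding bits are kept as a short string `pending` (its length plays the role of the counter c_n), and the number of leading zeros is obtained via `int.bit_length`. The tables p_V = (8, 5, 3) and c_V = (0, 8, 13) for the symbols A, N, B with U = V = 4 are the ones from the lecture's example.

```python
def arithmetic_encode(symbols, p_v, c_v, U, V, prefix_free=False):
    """Fixed-precision arithmetic encoder; returns the codeword as a bit string."""
    full = 1 << (U + V)
    A, B = (1 << U) - 1, 0        # A_0 = 2^U - 1, B_0 = 0
    settled = ""                  # bits that can no longer change
    pending = ""                  # outstanding bits: "" or "0" followed by ones

    def carry(bits):
        # Adding 1 to the outstanding bits "01...1" gives "10...0".
        if not bits:
            raise ValueError("carry without outstanding bits")
        return "1" + "0" * (len(bits) - 1)

    for s in symbols:
        A_prod = A * p_v[s]                       # (U+V)-bit integer
        B_sum = B + A * c_v[s]                    # may need U+V+1 bits
        if B_sum >= full:                         # carry into the outstanding bits
            pending = carry(pending)
            B_sum -= full
        dz = (U + V) - A_prod.bit_length()        # leading zeros of A_prod
        A = A_prod >> (V - dz)
        shifted_out = format(B_sum >> (U + V - dz), "b").zfill(dz) if dz else ""
        B = (B_sum & ((1 << (U + V - dz)) - 1)) << dz
        # Everything before the last 0 is settled; the rest stays outstanding.
        bits = pending + shifted_out
        cut = bits.rfind("0")
        if cut < 0:
            settled, pending = settled + bits, ""
        else:
            settled, pending = settled + bits[:cut], bits[cut:]

    n = 2 if prefix_free else 1                   # termination bits taken from B_N
    if B & ((1 << (U + V - n)) - 1):              # round up the lower boundary
        B += 1 << (U + V - n)
        if B >= full:
            pending = carry(pending)
            B -= full
    tail = format(B >> (U + V - n), "b").zfill(n)
    return settled + pending + tail


p_v = {"A": 8, "N": 5, "B": 3}
c_v = {"A": 0, "N": 8, "B": 13}
print(arithmetic_encode("BANANA", p_v, c_v, U=4, V=4))
```

For "BANANA" the sketch passes through the same intermediate values as the lecture's step-by-step example (A_1 = 11, B_1 = 12, ..., A_6 = 8, B_6 = 160) and produces a 9-bit codeword.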

39 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example

Example: Preparation
- Coding example
  - IID source with symbol alphabet {A, N, B}
  - Pmf is given by {1/2, 1/3, 1/6}
  - Consider arithmetic coding with $V = 4$ and $U = 4$
  - Symbol sequence "BANANA"
- Preparation: Quantization of pmf (and cmf) with $V = 4$ bits

  a | p(a) | p(a) · 2^4 | p_V(a) | c_V(a)
  A | 1/2  | 16/2 = 8   | 8      | 0
  N | 1/3  | 16/3       | 5      | 8
  B | 1/6  | 16/6       | 3      | 13

- Note: The quantized pmf $p_V(a)$ fulfills the requirement $\sum_a p_V(a) \le 2^V$

40 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example: Step 1

Initialization: $A_0$ = 15 = 1111, $c_0$ = 0 (no outstanding bits), $B_0$ = 0 = 0000 0000, bitstream = (empty)
Symbol $s_0$ = B with $p_V$ = 3, $c_V$ = 13:
- $A_0 \cdot p_V$ = 15 · 3 = 45 = 0010 1101
- $B_0 + A_0 \cdot c_V$ = 0 + 15 · 13 = 195 = 0 1100 0011
- $\Delta z$ = 2
- $A_1$ = 1011 = 11
- $c_1$ = 0 (no outstanding bits)
- $B_1$ = 0000 1100 = 12
- output = 11, bitstream = 11

41 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example: Step 2

After step 1: $A_1$ = 11 = 1011, $c_1$ = 0 (no outstanding bits), $B_1$ = 12 = 0000 1100, bitstream = 11
Symbol $s_1$ = A with $p_V$ = 8, $c_V$ = 0:
- $A_1 \cdot p_V$ = 11 · 8 = 88 = 0101 1000
- $B_1 + A_1 \cdot c_V$ = 12 + 11 · 0 = 12 = 0 0000 1100
- $\Delta z$ = 1
- $A_2$ = 1011 = 11
- $c_2$ = 1 (outstanding bits: 0)
- $B_2$ = 0001 1000 = 24
- output = (none), bitstream = 11

42 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example: Step 3

After step 2: $A_2$ = 11 = 1011, $c_2$ = 1 (outstanding bits: 0), $B_2$ = 24 = 0001 1000, bitstream = 11
Symbol $s_2$ = N with $p_V$ = 5, $c_V$ = 8:
- $A_2 \cdot p_V$ = 11 · 5 = 55 = 0011 0111
- $B_2 + A_2 \cdot c_V$ = 24 + 11 · 8 = 112 = 0 0111 0000
- $\Delta z$ = 2
- $A_3$ = 1101 = 13
- $c_3$ = 2 (outstanding bits: 01)
- $B_3$ = 1100 0000 = 192
- output = 0, bitstream = 110

43 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example: Step 4

After step 3: $A_3$ = 13 = 1101, $c_3$ = 2 (outstanding bits: 01), $B_3$ = 192 = 1100 0000, bitstream = 110
Symbol $s_3$ = A with $p_V$ = 8, $c_V$ = 0:
- $A_3 \cdot p_V$ = 13 · 8 = 104 = 0110 1000
- $B_3 + A_3 \cdot c_V$ = 192 + 13 · 0 = 192 = 0 1100 0000
- $\Delta z$ = 1
- $A_4$ = 1101 = 13
- $c_4$ = 3 (outstanding bits: 011)
- $B_4$ = 1000 0000 = 128
- output = (none), bitstream = 110

44 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example: Step 5

After step 4: $A_4$ = 13 = 1101, $c_4$ = 3 (outstanding bits: 011), $B_4$ = 128 = 1000 0000, bitstream = 110
Symbol $s_4$ = N with $p_V$ = 5, $c_V$ = 8:
- $A_4 \cdot p_V$ = 13 · 5 = 65 = 0100 0001
- $B_4 + A_4 \cdot c_V$ = 128 + 13 · 8 = 232 = 0 1110 1000
- $\Delta z$ = 1
- $A_5$ = 1000 = 8
- $c_5$ = 4 (outstanding bits: 0111)
- $B_5$ = 1101 0000 = 208
- output = (none), bitstream = 110

45 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example: Step 6

After step 5: $A_5$ = 8 = 1000, $c_5$ = 4 (outstanding bits: 0111), $B_5$ = 208 = 1101 0000, bitstream = 110
Symbol $s_5$ = A with $p_V$ = 8, $c_V$ = 0:
- $A_5 \cdot p_V$ = 8 · 8 = 64 = 0100 0000
- $B_5 + A_5 \cdot c_V$ = 208 + 8 · 0 = 208 = 0 1101 0000
- $\Delta z$ = 1
- $A_6$ = 1000 = 8
- $c_6$ = 5 (outstanding bits: 01111)
- $B_6$ = 1010 0000 = 160
- output = (none), bitstream = 110

46 Arithmetic Coding / Summary of Arithmetic Encoding
Arithmetic Coding Example: Codeword Termination

After step 6: $A_6$ = 8 = 1000, $c_6$ = 5 (outstanding bits: 01111), $B_6$ = 160 = 1010 0000, bitstream = 110
- Final rounding (non-prefix-free, $n = 1$): rounding up $B_6$ gives $B = 160 + (1 \ll 7) = 288 \ge 2^8$, so a carry occurs
  - The outstanding bits 01111 are inverted to 10000; bitstream = 110 1000 ($c_6 - 1$ inverted bits added)
  - $c$ = 1 (outstanding bits: 0), and the carry is removed: $B = 288 - 256 = 32$ = 0010 0000
- Termination: final bitstream = 110 1000 00 ($c + 1$ bits added)
- Bitstream $b$ = "110100000" (for sequence $s$ = "BANANA")
- Same number of bits ($K = 9$) as for Shannon-Fano-Elias coding

47 Arithmetic Coding / Arithmetic Decoding
Decoding With Finite Precision

Identification of Intervals
- Important: Same rounding of interval width as in encoder ($A_n \to A_{n+1}$)
- Arithmetic codeword $b_0 b_1 b_2 b_3 b_4 b_5 b_6 \cdots$ represents the binary fraction $v = (0.b_0 b_1 b_2 b_3 b_4 b_5 b_6 \cdots)_b$
- Iterative decoding: Output the symbol $s_n$ which fulfills the inequality
  $L_n + W_n \cdot c(s_n) \le v < L_n + W_n \cdot c(s_n) + W_n \cdot p(s_n)$
- Observation: The lower interval boundary $L_n$ cannot be represented with reasonable precision
- Idea: Subtract $L_n$ from the inequality. The symbol $s_n$ is then identified by
  $W_n \cdot c(s_n) \le v - L_n < W_n \cdot c(s_n) + W_n \cdot p(s_n)$
- The value $u_n = v - L_n$ used in the comparisons can be stored with $(U + V)$ bits, but needs to be updated after a symbol $s_n$ is decoded

48 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding: Binary Representations

Analyse Binary Representations
  $W_n = 0.\underbrace{00\cdots0}_{z_n - U} \underbrace{1xx\cdots x}_{U}$
  $W_n \cdot (c_V + p_V) = 0.\underbrace{00\cdots0}_{z_n - U} \underbrace{xxx\cdots xxxxxx}_{U+V}$
  $v - L_n = 0.\underbrace{00\cdots0}_{z_n - U} \underbrace{xxx\cdots xxxxxx}_{U+V} xxxxxxxxx\cdots$
  $W_n \cdot c_V = 0.\underbrace{00\cdots0}_{z_n - U} \underbrace{xxx\cdots xxxxxx}_{U+V}$
- Use a $(U+V)$-bit integer $u_n$ in the comparisons (down-rounded value of $v - L_n$)
- Initialization: $u_0$ = (first $U+V$ bits from bitstream)
- Update $u_n \to u_{n+1}$
  - Subtract lower boundary: $u_n' = u_n - A_n \cdot c_V(s_n)$
  - Align with interval width: $u_n'' = u_n' \ll \Delta z$ ($\Delta z$: leading zeros in $A_n \cdot p_V(s_n)$)
  - Fill the least significant bits with the next $\Delta z$ bits from the bitstream

49 Arithmetic Coding / Arithmetic Decoding
Overview of Arithmetic Decoding Process

Arithmetic Decoding
1. Initialization: $A_0 = 2^U - 1$, $u_0$ = (first $U+V$ bits from bitstream)
2. Iteration: For $n = 0, 1, \dots, N-1$, do
   a. Identify next symbol (loop over alphabet): For $k = 0, 1, \dots$, do
      - Calculate upper boundary $U(a_k) = A_n \cdot (c_V(a_k) + p_V(a_k))$
      - If $u_n < U(a_k)$, then output next symbol $s_n = a_k$ and break the loop over $k$
   b. Update parameters:
      - Calculate intermediate values: $A' = A_n \cdot p_V(s_n)$ and $u' = u_n - A_n \cdot c_V(s_n)$
      - Determine the number $\Delta z$ of leading zeros in the $(U+V)$-bit integer $A'$
      - $A_{n+1} = A' \gg (V - \Delta z)$
      - $u_{n+1} = (u' \ll \Delta z)$ + (next $\Delta z$ bits from bitstream)
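A matching decoder sketch is given below, again as an illustration under assumptions rather than the lecture's own code. The name `arithmetic_decode` is hypothetical, the symbol order is taken from increasing c_V, and bits requested past the end of the bitstream are assumed to be 0 (the lecture's rule for the non-prefix-free variant). The input "110100000" is the codeword that results from the lecture's encoding example; its first U + V = 8 bits give u_0 = 208, as in the decoding example.

```python
def arithmetic_decode(bits, num_symbols, p_v, c_v, U, V):
    """Fixed-precision arithmetic decoder for a bit string."""
    order = sorted(p_v, key=c_v.get)           # symbols in increasing cmf order
    stream = iter(bits)

    def read(count):
        # Read `count` bits; bits past the end of the stream are taken as 0.
        value = 0
        for _ in range(count):
            value = (value << 1) | int(next(stream, "0"))
        return value

    A = (1 << U) - 1                           # A_0 = 2^U - 1
    u = read(U + V)                            # u_0 = first U+V bits
    decoded = []
    for _ in range(num_symbols):
        # Identify the symbol: first upper boundary A_n * (c_V + p_V) above u_n.
        # For a valid bitstream the loop always breaks before the alphabet ends.
        for a in order:
            if u < A * (c_v[a] + p_v[a]):
                break
        decoded.append(a)
        A_prod = A * p_v[a]                    # (U+V)-bit integer A'
        dz = (U + V) - A_prod.bit_length()     # leading zeros of A'
        u = ((u - A * c_v[a]) << dz) | read(dz)
        A = A_prod >> (V - dz)                 # same rounding as in the encoder
    return "".join(decoded)


p_v = {"A": 8, "N": 5, "B": 3}
c_v = {"A": 0, "N": 8, "B": 13}
print(arithmetic_decode("110100000", 6, p_v, c_v, U=4, V=4))
```

The decoder reproduces the encoder's width rounding exactly; if the two sides rounded differently, the comparisons against the upper boundaries would drift apart after a few symbols.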

50 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example

Decode Bitstream Obtained in Encoding Example
- Coding example (see encoding example)
  - IID source with symbol alphabet {A, N, B}
  - Pmf is given by {1/2, 1/3, 1/6}
  - Arithmetic coding with $V = 4$ and $U = 4$
- Quantized pmf (and cmf) with $V = 4$ bits

  a | p(a) | p(a) · 2^4 | p_V(a) | c_V(a)
  A | 1/2  | 16/2 = 8   | 8      | 0
  N | 1/3  | 16/3       | 5      | 8
  B | 1/6  | 16/6       | 3      | 13

- Bitstream $b$ = "110100000"

51 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example: Step 1

Initialization: bitstream = 110100000, $A_0$ = 15 = 1111, $u_0$ = 208 = 1101 0000 (first 8 bits)
Symbol identification ($c_V$, $p_V$ per symbol):
- A (0, 8): $U(A)$ = 15 · (0 + 8) = 120, $U(A) \le u_0$
- N (8, 5): $U(N)$ = 15 · (8 + 5) = 195, $U(N) \le u_0$
- B (13, 3): $U(B)$ = 15 · (13 + 3) = 240, $U(B) > u_0$, output B
Update:
- $A'$ = 15 · 3 = 45 = 0010 1101
- $u'$ = 208 − 15 · 13 = 13 = 0000 1101
- $\Delta z$ = 2
- $A_1$ = 1011 = 11
- $u_1$ = (13 ≪ 2) + (next 2 bits) = 52 = 0011 0100

52 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example: Step 2

After step 1: $A_1$ = 11 = 1011, $u_1$ = 52 = 0011 0100
Symbol identification ($c_V$, $p_V$ per symbol):
- A (0, 8): $U(A)$ = 11 · (0 + 8) = 88, $U(A) > u_1$, output A
- N (8, 5): $U(N)$ = 11 · (8 + 5) = 143 (not needed)
- B (13, 3): $U(B)$ = 11 · (13 + 3) = 176 (not needed)
Update:
- $A'$ = 11 · 8 = 88 = 0101 1000
- $u'$ = 52 − 11 · 0 = 52 = 0011 0100
- $\Delta z$ = 1
- $A_2$ = 1011 = 11
- $u_2$ = (52 ≪ 1) + (next bit) = 104 = 0110 1000

53 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example: Step 3

After step 2: $A_2$ = 11 = 1011, $u_2$ = 104 = 0110 1000
Symbol identification ($c_V$, $p_V$ per symbol):
- A (0, 8): $U(A)$ = 11 · (0 + 8) = 88, $U(A) \le u_2$
- N (8, 5): $U(N)$ = 11 · (8 + 5) = 143, $U(N) > u_2$, output N
- B (13, 3): $U(B)$ = 11 · (13 + 3) = 176 (not needed)
Update:
- $A'$ = 11 · 5 = 55 = 0011 0111
- $u'$ = 104 − 11 · 8 = 16 = 0001 0000
- $\Delta z$ = 2
- $A_3$ = 1101 = 13
- $u_3$ = (16 ≪ 2) + (next 2 bits) = 64 = 0100 0000

54 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example: Step 4

After step 3: $A_3$ = 13 = 1101, $u_3$ = 64 = 0100 0000
Symbol identification ($c_V$, $p_V$ per symbol):
- A (0, 8): $U(A)$ = 13 · (0 + 8) = 104, $U(A) > u_3$, output A
- N (8, 5): $U(N)$ = 13 · (8 + 5) = 169 (not needed)
- B (13, 3): $U(B)$ = 13 · (13 + 3) = 208 (not needed)
Update:
- $A'$ = 13 · 8 = 104 = 0110 1000
- $u'$ = 64 − 13 · 0 = 64 = 0100 0000
- $\Delta z$ = 1
- $A_4$ = 1101 = 13
- $u_4$ = (64 ≪ 1) + (next bit) = 128 = 1000 0000

55 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example: Step 5

After step 4: $A_4$ = 13 = 1101, $u_4$ = 128 = 1000 0000
Symbol identification ($c_V$, $p_V$ per symbol):
- A (0, 8): $U(A)$ = 13 · (0 + 8) = 104, $U(A) \le u_4$
- N (8, 5): $U(N)$ = 13 · (8 + 5) = 169, $U(N) > u_4$, output N
- B (13, 3): $U(B)$ = 13 · (13 + 3) = 208 (not needed)
Update:
- $A'$ = 13 · 5 = 65 = 0100 0001
- $u'$ = 128 − 13 · 8 = 24 = 0001 1000
- $\Delta z$ = 1
- $A_5$ = 1000 = 8
- $u_5$ = (24 ≪ 1) + (next bit) = 48 = 0011 0000

56 Arithmetic Coding / Arithmetic Decoding
Arithmetic Decoding Example: Step 6 (last symbol)

After step 5: $A_5$ = 8 = 1000, $u_5$ = 48 = 0011 0000
Symbol identification ($c_V$, $p_V$ per symbol):
- A (0, 8): $U(A)$ = 8 · (0 + 8) = 64, $U(A) > u_5$, output A
- N (8, 5): $U(N)$ = 8 · (8 + 5) = 104 (not needed)
- B (13, 3): $U(B)$ = 8 · (13 + 3) = 128 (not needed)
Result: the bitstream decodes to the symbol sequence "BANANA"
Note: The decoder required some bits after the end of the bitstream
- For the non-prefix-free variant: Use bits equal to 0
- For the prefix-free variant: Any bit values (0 or 1) can be used

57 Arithmetic Coding / Efficiency
Efficiency of Arithmetic Coding

Increase in Codeword Length relative to Shannon-Fano-Elias Coding
Excess rate due to rounding of the interval width:
  $\Delta\ell = \lceil -\log_2 W_N \rceil - \lceil -\log_2 p(s) \rceil < 1 + \log_2 \frac{p(s)}{W_N}$   (48)
Upper bound for the increase in codeword length per symbol relative to infinite-precision Shannon-Fano-Elias coding:
  $\Delta\bar{\ell} < \frac{1}{N} + \log_2\left(1 + 2^{1-U}\right) - \log_2\left(1 - \frac{2^{-V}}{p_{\min}}\right)$   (49)
  (for a derivation see Wiegand, Schwarz, pages 51-52)
Example: Number of coded symbols N = 1000, arithmetic precision V = 16 and U = 12, minimum probability mass p_min = 0.02
  Increase in codeword length is less than 0.003 bit per symbol
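The bound can be evaluated numerically for the example parameters. The sketch below assumes that bound (49) has the form 1/N + log2(1 + 2^(1-U)) - log2(1 - 2^(-V)/p_min); the function name `excess_rate_bound` is chosen here for illustration only.

```python
import math


def excess_rate_bound(N, U, V, p_min):
    """Upper bound on the extra bits per symbol caused by finite precision.

    Assumed form of the bound:
    1/N + log2(1 + 2^(1-U)) - log2(1 - 2^(-V) / p_min)
    """
    ratio = 2.0 ** (-V) / p_min
    if ratio >= 1.0:
        raise ValueError("V is too small to represent p_min")
    return 1.0 / N + math.log2(1.0 + 2.0 ** (1 - U)) - math.log2(1.0 - ratio)


bound = excess_rate_bound(N=1000, U=12, V=16, p_min=0.02)
print(f"{bound:.4f} bit per symbol")
```

For N = 1000 the termination term 1/N and the probability rounding term are of similar size, so raising V alone would not remove most of the overhead in this setting.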

58 Arithmetic Coding / Arithmetic Coding in Practice
Complexity Reduction: Binary Arithmetic Coding

Binary Arithmetic Coding
  Most popular type of arithmetic coding: JPEG 2000, H.264, H.265
  Binarization of S ∈ {a_0, a_1, ..., a_{M-1}} produces bins C ∈ {0, 1}
  Any prefix code can be used for binarization

Example: Truncated unary binarization

  S_n        number of bins B
  a_0        1
  a_1        2
  ...        ...
  a_{M-2}    M-1
  a_{M-1}    M-1

Entropy unchanged due to binarization (S ↔ C):
  $H(S) = E\{-\log_2 p(S)\} = E\{-\log_2 p(C)\} = H(C)$
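A truncated unary binarization can be sketched as follows. The bin polarity is an assumption here (a run of 0 bins terminated by a 1 bin, with the terminator dropped for the last symbol); the opposite polarity works equally well. The helper names are illustrative.

```python
def truncated_unary(k, M):
    """Bin string for symbol index k in an alphabet of M symbols."""
    if not 0 <= k < M:
        raise ValueError("symbol index out of range")
    if k < M - 1:
        return [0] * k + [1]   # k zeros, then a terminating one
    return [0] * (M - 1)       # last symbol: the terminator is not needed


def decode_truncated_unary(bins, M):
    """Read one symbol index from an iterator over bins."""
    k = 0
    while k < M - 1 and next(bins) == 0:
        k += 1
    return k


M = 4
for k in range(M):
    print(k, truncated_unary(k, M))
```

Because the bin strings form a prefix code, a concatenated bin sequence can be split back into symbols without separators, which is what `decode_truncated_unary` relies on.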

59 Arithmetic Coding / Arithmetic Coding in Practice
Practical Arithmetic Coding

Complexity Reduction
  Binary arithmetic coding
  Multiplication-free implementations
  Bypass mode: Low-complexity coding of bins with p = 0.5

Practical Design Aspects
  1 Context selection
      Use reasonable context variables X = f(S_{n-1}, S_{n-2}, ...) for switching probability tables p(a | X)
      Use context switching only when useful (certain bins)
  2 Estimate probabilities during coding
      Choose appropriate window sizes for estimation
  3 Suitably combine context selection and probability estimation
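The interplay of context selection and probability estimation can be illustrated with a small experiment. The sketch below is an assumption-laden toy, not the estimator of any particular standard: `BinProbability` is a hypothetical exponential-window estimator with 15-bit precision, the window size is controlled by `shift`, and the "ideal cost" is the sum of -log2 p over all bins rather than the output of an actual arithmetic coder.

```python
import math


class BinProbability:
    """Backward-adaptive estimate of P(bin = 1) with an exponential window."""

    BITS = 15  # precision of the integer probability state (assumption)

    def __init__(self, shift=4):
        self.shift = shift               # window size is roughly 2**shift bins
        self.p1 = 1 << (self.BITS - 1)   # start at P(1) = 0.5

    def cost(self, b):
        """Ideal code length in bits for bin b under the current estimate."""
        p = self.p1 / (1 << self.BITS)
        return -math.log2(p if b else 1.0 - p)

    def update(self, b):
        # Move the estimate towards the observed bin value.
        if b:
            self.p1 += ((1 << self.BITS) - self.p1) >> self.shift
        else:
            self.p1 -= self.p1 >> self.shift


def ideal_cost(bins, context_of):
    """Sum of ideal code lengths with one estimator per context value."""
    models = {}
    total = 0.0
    for n, b in enumerate(bins):
        model = models.setdefault(context_of(bins, n), BinProbability())
        total += model.cost(b)
        model.update(b)
    return total


bins = [n % 2 for n in range(2000)]   # strongly dependent toy sequence 0101...
no_context = ideal_cost(bins, lambda seq, n: None)
prev_bin = ideal_cost(bins, lambda seq, n: seq[n - 1] if n > 0 else None)
print(round(no_context), round(prev_bin))
```

With a single estimator the alternating sequence costs about one bit per bin, while using the previous bin as context makes each bin almost free. A smaller `shift` adapts faster but gives noisier estimates, which is the window-size trade-off named above.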

60 Comparison of Lossless Coding Techniques
Experimental Comparison of Lossless Coding Techniques

Example: Markov Source
  Stationary Markov source given by conditional pmf
  [Table: conditional pmfs p(a | a_0), p(a | a_1), p(a | a_2) for a ∈ {a_0, a_1, a_2}, with the resulting values of H(S); the numerical entries are not preserved]

Bounds for lossless coding
  Entropy rate $\bar{H}(S)$ for coding of infinitely many symbols
  Instantaneous entropy rate $H_{inst}(S, L)$ for coding L symbols:
    $H_{inst}(S, L) = \frac{1}{L} H(S_0, S_1, \ldots, S_{L-1})$   (50)

Coding experiment
  Coding of realizations of the example stationary Markov source
  Calculate average codeword length for sequences of 1 to 1000 symbols
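For a stationary first-order Markov source, the joint entropy in (50) splits into H(S_0) + (L-1) · H(S_n | S_{n-1}), so both bounds are easy to compute. The sketch below uses a made-up transition matrix `P_EXAMPLE`; it is not the conditional pmf of the lecture's example, whose entries are not available here.

```python
import math


def entropy(pmf):
    """Entropy in bits of a probability mass function."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def stationary_pmf(P, iterations=1000):
    """Stationary pmf of a transition matrix with P[i][j] = p(a_j | a_i)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):          # simple power iteration
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def entropy_rate(P):
    """Entropy rate = conditional entropy H(S_n | S_{n-1}) for a Markov source."""
    pi = stationary_pmf(P)
    return sum(pi[i] * entropy(P[i]) for i in range(len(P)))


def instantaneous_entropy_rate(P, L):
    """(1/L) * H(S_0, ..., S_{L-1}) for a stationary Markov source."""
    marginal = entropy(stationary_pmf(P))
    return (marginal + (L - 1) * entropy_rate(P)) / L


# Hypothetical conditional pmfs, one row per preceding symbol (illustration only).
P_EXAMPLE = [[0.8, 0.1, 0.1],
             [0.2, 0.7, 0.1],
             [0.3, 0.2, 0.5]]

for L in (1, 10, 100, 1000):
    print(L, instantaneous_entropy_rate(P_EXAMPLE, L))
print("entropy rate", entropy_rate(P_EXAMPLE))
```

The instantaneous entropy rate starts at the marginal entropy for L = 1 and approaches the entropy rate as L grows, which is why short sequences cannot be coded at the entropy rate on average.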

61 Comparison of Lossless Coding Techniques
Experimental Results

[Figure: average codeword length per symbol versus number of coded symbols (logarithmic scale). Curves: scalar Huffman code (3 codewords); conditional Huffman code (3 × 3 codewords); Huffman code for fixed-length vectors (5 symbols, 243 codewords); Huffman code for variable-length vectors (17 codewords); arithmetic coding (16 bits of precision for interval sizes and probabilities); instantaneous entropy rate; entropy rate]

62 Summary
Part Summary

Uniquely decodable codes & bounds for lossless coding
  Kraft inequality, prefix codes
  Scalar entropy, conditional entropy, block entropy
  Entropy rate, instantaneous entropy rate

Variable-length codes for scalars and vectors
  Optimal code for given pmf: Huffman code
  Scalar, conditional codes: Inefficient for pmfs with p(a) ≫ 0.5
  Block codes and V2V codes: Code tables can become extremely large
  Difficult adaptation to instationary sources

Arithmetic coding
  No codeword table: Iterative construction of codeword
  Close to entropy bound for N ≫ 1
  Well suited for exploiting statistical dependencies
  Well suited for adapting probabilities during coding

63 Exercises
Exercise 1: Implement an Arithmetic Encoder/Decoder

Implement an arithmetic encoder and decoder (the one discussed in the lecture):
  Use 64-bit integer arithmetic for all operations.
  Start with a 4-symbol alphabet {a, b, c, d} and a fixed pmf {6/16, 5/16, 3/16, 2/16} for verifying the implementation.
  Measure the average codeword length (bits per symbol) for long symbol sequences and compare it to the entropy rate.
  Extend the implementation to general file types. Use a fixed pmf (p_k = 1/256) for the bytes of a file.

For preparing the implementation, think about the following:
  How do we determine the newly settled bits and the new number of outstanding bits c_{n+1} during encoding (based on c_n and B)?

Suggestion: Use the framework (written in C++) on the web-site. It provides file input/output, file comparison, and implements classes for reading and writing bitstreams (bit by bit). In ac/test you'll find examples for the 4-symbol pmf (test1mioabcd.txt) and general text files (Goethe.txt).
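As a reference point for the measurement in Exercise 1, the target value can be computed directly. The sketch assumes the test sequence is iid with the given pmf, in which case the entropy rate equals the marginal entropy; the function name `entropy_bits` is illustrative.

```python
import math


def entropy_bits(pmf):
    """Entropy in bits per symbol of a probability mass function."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


PMF_ABCD = [6 / 16, 5 / 16, 3 / 16, 2 / 16]   # fixed pmf for {a, b, c, d}
PMF_BYTES = [1 / 256] * 256                   # fixed uniform pmf for file bytes

print(entropy_bits(PMF_ABCD))    # target for the 4-symbol test sequence
print(entropy_bits(PMF_BYTES))   # a uniform byte pmf cannot compress: 8 bits
```

A working codec should produce an average codeword length slightly above the first value for long sequences; a result below it points to a bug, for example bits that are never written.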

64 Exercises
Exercise 2: Arithmetic Coding with Adaptive Pmfs

Extend the implemented arithmetic codec by backward-adaptive pmf estimation and the usage of conditional pmfs:
  Use the iid model and estimate the pmf during encoding/decoding.
  Try to improve the performance by using a Markov model (estimation of conditional pmfs).
  Try to further improve the performance by using two preceding symbols as condition (2nd order Markov model).
  Test the different probability models (iid with fixed pmf, iid with adaptive pmf, Markov with adaptive pmfs, and 2nd order Markov with adaptive pmfs) with the test file Goethe.txt provided in ac/test.

For preparing the implementation, think about the following:
  How can we estimate a rounded pmf (with V bits of precision) during coding?
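One possible starting point for the last question is a count-based estimator whose output is rounded to V-bit integer masses. This is only a sketch under stated assumptions, not the intended solution: the class `AdaptivePmf` is hypothetical, every symbol is given a mass of at least 1 so that it stays codable, and the masses are allowed to sum to slightly less than 2^V.

```python
class AdaptivePmf:
    """Backward-adaptive pmf estimate rounded to V bits of precision."""

    def __init__(self, num_symbols, V):
        if (1 << V) < num_symbols:
            raise ValueError("2**V must be at least the alphabet size")
        self.V = V
        self.counts = [1] * num_symbols   # start from a uniform estimate

    def quantized(self):
        """Integer masses p_k >= 1 with sum(p_k) <= 2**V."""
        M = len(self.counts)
        total = sum(self.counts)
        spare = (1 << self.V) - M         # mass left after reserving 1 per symbol
        return [1 + (n * spare) // total for n in self.counts]

    def update(self, symbol):
        """Call after coding a symbol, identically in encoder and decoder."""
        self.counts[symbol] += 1


pmf = AdaptivePmf(num_symbols=4, V=4)
print(pmf.quantized())
for s in [0, 0, 1, 0, 0, 2, 0]:
    pmf.update(s)
print(pmf.quantized())
```

Encoder and decoder stay synchronized as long as both query `quantized()` before coding a symbol and call `update()` afterwards. For a Markov model, one such object per conditioning symbol could be kept.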


More information

Solutions to Set #2 Data Compression, Huffman code and AEP

Solutions to Set #2 Data Compression, Huffman code and AEP Solutions to Set #2 Data Compression, Huffman code and AEP. Huffman coding. Consider the random variable ( ) x x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0. 0.04 0.04 0.03 0.02 (a) Find a binary Huffman code

More information

Lecture 22: Final Review

Lecture 22: Final Review Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information

More information

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min.

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min. Huffman coding Optimal codes - I A code is optimal if it has the shortest codeword length L L m = i= pl i i This can be seen as an optimization problem min i= li subject to D m m i= lp Gabriele Monfardini

More information

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes Information Theory with Applications, Math6397 Lecture Notes from September 3, 24 taken by Ilknur Telkes Last Time Kraft inequality (sep.or) prefix code Shannon Fano code Bound for average code-word length

More information

Lecture 1: Shannon s Theorem

Lecture 1: Shannon s Theorem Lecture 1: Shannon s Theorem Lecturer: Travis Gagie January 13th, 2015 Welcome to Data Compression! I m Travis and I ll be your instructor this week. If you haven t registered yet, don t worry, we ll work

More information

DCSP-3: Minimal Length Coding. Jianfeng Feng

DCSP-3: Minimal Length Coding. Jianfeng Feng DCSP-3: Minimal Length Coding Jianfeng Feng Department of Computer Science Warwick Univ., UK Jianfeng.feng@warwick.ac.uk http://www.dcs.warwick.ac.uk/~feng/dcsp.html Automatic Image Caption (better than

More information

The Binomial Theorem.

The Binomial Theorem. The Binomial Theorem RajeshRathod42@gmail.com The Problem Evaluate (A+B) N as a polynomial in powers of A and B Where N is a positive integer A and B are numbers Example: (A+B) 5 = A 5 +5A 4 B+10A 3 B

More information

Fundamentele Informatica II

Fundamentele Informatica II Fundamentele Informatica II Answer to selected exercises 5 John C Martin: Introduction to Languages and the Theory of Computation M.M. Bonsangue (and J. Kleijn) Fall 2011 5.1.a (q 0, ab, Z 0 ) (q 1, b,

More information

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1)

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1) COSE212: Programming Languages Lecture 1 Inductive Definitions (1) Hakjoo Oh 2018 Fall Hakjoo Oh COSE212 2018 Fall, Lecture 1 September 5, 2018 1 / 10 Inductive Definitions Inductive definition (induction)

More information

2018/5/3. YU Xiangyu

2018/5/3. YU Xiangyu 2018/5/3 YU Xiangyu yuxy@scut.edu.cn Entropy Huffman Code Entropy of Discrete Source Definition of entropy: If an information source X can generate n different messages x 1, x 2,, x i,, x n, then the

More information

Lecture 4 : Adaptive source coding algorithms

Lecture 4 : Adaptive source coding algorithms Lecture 4 : Adaptive source coding algorithms February 2, 28 Information Theory Outline 1. Motivation ; 2. adaptive Huffman encoding ; 3. Gallager and Knuth s method ; 4. Dictionary methods : Lempel-Ziv

More information

ASYMMETRIC NUMERAL SYSTEMS: ADDING FRACTIONAL BITS TO HUFFMAN CODER

ASYMMETRIC NUMERAL SYSTEMS: ADDING FRACTIONAL BITS TO HUFFMAN CODER ASYMMETRIC NUMERAL SYSTEMS: ADDING FRACTIONAL BITS TO HUFFMAN CODER Huffman coding Arithmetic coding fast, but operates on integer number of bits: approximates probabilities with powers of ½, getting inferior

More information

Ch 0 Introduction. 0.1 Overview of Information Theory and Coding

Ch 0 Introduction. 0.1 Overview of Information Theory and Coding Ch 0 Introduction 0.1 Overview of Information Theory and Coding Overview The information theory was founded by Shannon in 1948. This theory is for transmission (communication system) or recording (storage

More information

Information Theory and Statistics Lecture 2: Source coding

Information Theory and Statistics Lecture 2: Source coding Information Theory and Statistics Lecture 2: Source coding Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Injections and codes Definition (injection) Function f is called an injection

More information

Theory of Computer Science

Theory of Computer Science Theory of Computer Science C1. Formal Languages and Grammars Malte Helmert University of Basel March 14, 2016 Introduction Example: Propositional Formulas from the logic part: Definition (Syntax of Propositional

More information

Information Theory and Coding Techniques

Information Theory and Coding Techniques Information Theory and Coding Techniques Lecture 1.2: Introduction and Course Outlines Information Theory 1 Information Theory and Coding Techniques Prof. Ja-Ling Wu Department of Computer Science and

More information

Information Theory. Week 4 Compressing streams. Iain Murray,

Information Theory. Week 4 Compressing streams. Iain Murray, Information Theory http://www.inf.ed.ac.uk/teaching/courses/it/ Week 4 Compressing streams Iain Murray, 2014 School of Informatics, University of Edinburgh Jensen s inequality For convex functions: E[f(x)]

More information

ECE 587 / STA 563: Lecture 5 Lossless Compression

ECE 587 / STA 563: Lecture 5 Lossless Compression ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 28 Author: Galen Reeves Last Modified: September 27, 28 Outline of lecture: 5. Introduction to Lossless Source

More information

Generalized Kraft Inequality and Arithmetic Coding

Generalized Kraft Inequality and Arithmetic Coding J. J. Rissanen Generalized Kraft Inequality and Arithmetic Coding Abstract: Algorithms for encoding and decoding finite strings over a finite alphabet are described. The coding operations are arithmetic

More information