Universal Loseless Compression: Context Tree Weighting(CTW)

Size: px
Start display at page:

Download "Universal Loseless Compression: Context Tree Weighting(CTW)"

Transcription

1 Universal Loseless Compression: Context Tree Weighting(CTW) Dept. Electrical Engineering, Stanford University Dec 9, 2014

2 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic) model of data is known, and aims at compressing the data optimally w.r.t. the model. e.g. Huffman coding and Arithmetic coding

3 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic) model of data is known, and aims at compressing the data optimally w.r.t. the model. e.g. Huffman coding and Arithmetic coding In the case where the actual model is unknown, we can still compress the sequence with universal compression. e.g. Lempel Ziv, and CTW

4 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic) model of data is known, and aims at compressing the data optimally w.r.t. the model. e.g. Huffman coding and Arithmetic coding In the case where the actual model is unknown, we can still compress the sequence with universal compression. e.g. Lempel Ziv, and CTW Aren t the LZ good enough?

5 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic) model of data is known, and aims at compressing the data optimally w.r.t. the model. e.g. Huffman coding and Arithmetic coding In the case where the actual model is unknown, we can still compress the sequence with universal compression. e.g. Lempel Ziv, and CTW Aren t the LZ good enough? Yes. Universal w.r.t class of finite-state coders, and converging to entropy rate for stationary ergodic distribution

6 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic) model of data is known, and aims at compressing the data optimally w.r.t. the model. e.g. Huffman coding and Arithmetic coding In the case where the actual model is unknown, we can still compress the sequence with universal compression. e.g. Lempel Ziv, and CTW Aren t the LZ good enough? Yes. Universal w.r.t class of finite-state coders, and converging to entropy rate for stationary ergodic distribution No. convergence is slow

7 Universal Coding with Model Classes Traditional Shannon theory assume a (probabilistic) model of data is known, and aims at compressing the data optimally w.r.t. the model. e.g. Huffman coding and Arithmetic coding In the case where the actual model is unknown, we can still compress the sequence with universal compression. e.g. Lempel Ziv, and CTW Aren t the LZ good enough? Yes. Universal w.r.t class of finite-state coders, and converging to entropy rate for stationary ergodic distribution No. convergence is slow We explore CTW.

8 Universal Compression and Universal Estimation A good compressor implies a good estimate of the source distribution. For a given sequence x n, P(x n ) be true distribution and P L(x n ) be estimated distribution.

9 Universal Compression and Universal Estimation A good compressor implies a good estimate of the source distribution. For a given sequence x n, P(x n ) be true distribution and P L(x n ) be estimated distribution. We know that P L(x n ) = 2 L(xn ) k n achieves the minimum expected code length where k n = x n 2 L(xn) 1 by Kraft Inequality

10 Universal Compression and Universal Estimation A good compressor implies a good estimate of the source distribution. For a given sequence x n, P(x n ) be true distribution and P L(x n ) be estimated distribution. We know that P L(x n ) = 2 L(xn ) k n achieves the minimum expected code length where k n = x n 2 L(xn) 1 by Kraft Inequality Then the expected redundancy is Red n(p L, P) = E[L(x n ) log 1 P(x n ) ] (1) = log k n + D(P P L) (2)

11 Universal Compression and Universal Estimation A good compressor implies a good estimate of the source distribution. For a given sequence x n, P(x n ) be true distribution and P L(x n ) be estimated distribution. We know that P L(x n ) = 2 L(xn ) k n achieves the minimum expected code length where k n = x n 2 L(xn) 1 by Kraft Inequality Then the expected redundancy is Red n(p L, P) = E[L(x n ) log 1 P(x n ) ] (1) = log k n + D(P P L) (2) A good compressor has a small redundacny. Therefore, a good compressor implies a good estimate of the true distribution.

12 Two Entropy Coding for known parameters Huffman code an optimal prefix code Arithmetic Code For L(x n 1 ) = log( P(x n ) ) + 1,.c = Q(x n ) 2 L(xn ) 2 L(xn ) Q c L3 0.c 3 Q 3 Q 2 Q 1 Q 0

13 Two Entropy Coding for known parameters Huffman code an optimal prefix code Arithmetic Code For L(x n 1 ) = log( P(x n ) ) + 1,.c = Q(x n ) 2 L(xn ) 2 L(xn ) Q c L3 0.c 3 Q 3 Q 2 Q 1 What if parameters are unknown? Q 0

14 Bernoulli Model: Two part code Let C = {P θ, θ [0, 1]}, X n iid Bern(θ). But we don t know θ.

15 Bernoulli Model: Two part code Let C = {P θ, θ [0, 1]}, X n iid Bern(θ). But we don t know θ. Can we we still design the compressor?

16 Bernoulli Model: Two part code Let C = {P θ, θ [0, 1]}, X n iid Bern(θ). But we don t know θ. Can we we still design the compressor? Yes! with acceptable pointwise redundacny for all x n

17 Bernoulli Model: Two part code Let C = {P θ, θ [0, 1]}, X n iid Bern(θ). But we don t know θ. Can we we still design the compressor? Yes! with acceptable pointwise redundacny for all x n Note that min C C L C (x n ) = min θ [0,1] log 1 P θ = nĥ(xn ) = nh 2 (ˆθ)

18 Bernoulli Model: Two part code Let C = {P θ, θ [0, 1]}, X n iid Bern(θ). But we don t know θ. Can we we still design the compressor? Yes! with acceptable pointwise redundacny for all x n Note that min C C L C (x n ) = min θ [0,1] log 1 P θ = nĥ(xn ) = nh 2 (ˆθ) Two part code: Use log(n + 1) bits to encode n 1 and then tune code to θ = n1 n.

19 Bernoulli Model: Two part code Let C = {P θ, θ [0, 1]}, X n iid Bern(θ). But we don t know θ. Can we we still design the compressor? Yes! with acceptable pointwise redundacny for all x n Note that min C C L C (x n ) = min θ [0,1] log 1 P θ = nĥ(xn ) = nh 2 (ˆθ) Two part code: Use log(n + 1) bits to encode n 1 and then tune code to θ = n1 Then n. Red n(l, x n ) = log(n + 1) + 2 n 0

20 Bernoulli Model: Two part code Let C = {P θ, θ [0, 1]}, X n iid Bern(θ). But we don t know θ. Can we we still design the compressor? Yes! with acceptable pointwise redundacny for all x n Note that min C C L C (x n ) = min θ [0,1] log 1 P θ = nĥ(xn ) = nh 2 (ˆθ) Two part code: Use log(n + 1) bits to encode n 1 and then tune code to θ = n1 n. Then Red n(l, x n log(n + 1) + 2 ) = 0 n Natural but dependent on n(not sequential)

21 Bernoulli Model: Mixture and Plug in code Better code:

22 Bernoulli Model: Mixture and Plug in code Better code:

23 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1

24 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3)

25 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3) Uniform assignment over types is equivalent to uniform mixture, P L(x n ) = 1 0 θn 0 θn 1 1dθ = ( ) n 1 1 = n 1 n 1 n+1 t=0 pl(xt t+1 x t ) where

26 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3) Uniform assignment over types is equivalent to uniform mixture, P L(x n ) = 1 0 θn 0 θn 1 1dθ = ( ) n 1 1 = n 1 n 1 n+1 t=0 pl(xt t+1 x t ) where where p L(0 x t ) = n 0(x t )+1 is Rule of succession t+2

27 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3) Uniform assignment over types is equivalent to uniform mixture, P L(x n ) = 1 0 θn 0 θn 1 1dθ = ( ) n 1 1 = n 1 n 1 n+1 t=0 pl(xt t+1 x t ) where where p L(0 x t ) = n 0(x t )+1 is Rule of succession t+2 Use bias estimator of the conditional probability(plug-in approach)

28 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3) Uniform assignment over types is equivalent to uniform mixture, P L(x n ) = 1 0 θn 0 θn 1 1dθ = ( ) n 1 1 = n 1 n 1 n+1 t=0 pl(xt t+1 x t ) where where p L(0 x t ) = n 0(x t )+1 is Rule of succession t+2 Use bias estimator of the conditional probability(plug-in approach) dθ Mixture code Mix over Dirichlet s density : Γ( 1 )2 = dw(θ) 2 θ(1 θ)

29 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3) Uniform assignment over types is equivalent to uniform mixture, P L(x n ) = 1 0 θn 0 θn 1 1dθ = ( ) n 1 1 = n 1 n 1 n+1 t=0 pl(xt t+1 x t ) where where p L(0 x t ) = n 0(x t )+1 is Rule of succession t+2 Use bias estimator of the conditional probability(plug-in approach) dθ Mixture code Mix over Dirichlet s density : Γ( 1 )2 = dw(θ) 2 θ(1 θ)

30 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3) Uniform assignment over types is equivalent to uniform mixture, P L(x n ) = 1 0 θn 0 θn 1 1dθ = ( ) n 1 1 = n 1 n 1 n+1 t=0 pl(xt t+1 x t ) where where p L(0 x t ) = n 0(x t )+1 is Rule of succession t+2 Use bias estimator of the conditional probability(plug-in approach) dθ Mixture code Mix over Dirichlet s density : Γ( 1 )2 = dw(θ) 2 θ(1 θ) Red n (L, x n ) log(n + 1) 2n + O( 1 n ) 0 (4) Note that P L(x n ) = 1 0 P θ(x n )dw(θ) = n 1 t=0 pl(xt t+1 x t )

31 Bernoulli Model: Mixture and Plug in code Better code: Enumerate all ( ) n n 1 sequences with n1 (n 1 is away from 0, n), L(x n ) = log ( ) n n 1 Red n (L, x n ) log(n + 1) + 2 2n + O( 1 n ) 0 (3) Uniform assignment over types is equivalent to uniform mixture, P L(x n ) = 1 0 θn 0 θn 1 1dθ = ( ) n 1 1 = n 1 n 1 n+1 t=0 pl(xt t+1 x t ) where where p L(0 x t ) = n 0(x t )+1 is Rule of succession t+2 Use bias estimator of the conditional probability(plug-in approach) dθ Mixture code Mix over Dirichlet s density : Γ( 1 )2 = dw(θ) 2 θ(1 θ) Red n (L, x n ) log(n + 1) 2n + O( 1 n ) 0 (4) Note that P L(x n ) = 1 P 0 θ(x n )dw(θ) = n 1 t=0 pl(xt t+1 x t ) where p L(0 x t ) = n 0(x t )+ 2 1 is KT(Krichevski-Trofimov) estimator t+1 only depending on n 0, n 1

32 Tree Source with Known Model If the next symbol depends on past symbols, then Tree model is a good choice. Let a binary tree have leaves S. Then θ R S. For a fixed x n, define the number of 0 s after s as n 0 (s). Similarly define n 1 (s)

33 Tree Source with Known Model If the next symbol depends on past symbols, then Tree model is a good choice. Let a binary tree have leaves S. Then θ R S. For a fixed x n, define the number of 0 s after s as n 0 (s). Similarly define n 1 (s) For a x n = , sequence n = 10, n 0 (10) = 2, n 1 (10) = 1

34 Tree Source with Known Model If the next symbol depends on past symbols, then Tree model is a good choice. Let a binary tree have leaves S. Then θ R S. For a fixed x n, define the number of 0 s after s as n 0 (s). Similarly define n 1 (s) For a x n = , sequence n = 10, n 0 (10) = 2, n 1 (10) = 1 For each s, calculate KT estimator for P L,s (n 0 (s), n 1 (s)) and use P L (x n ) = s S P L,s(n 0 (s), n 1 (s)) for Arithmetic coding

35 Tree Source with Known Model If the next symbol depends on past symbols, then Tree model is a good choice. Let a binary tree have leaves S. Then θ R S. For a fixed x n, define the number of 0 s after s as n 0 (s). Similarly define n 1 (s) For a x n = , sequence n = 10, n 0 (10) = 2, n 1 (10) = 1 For each s, calculate KT estimator for P L,s (n 0 (s), n 1 (s)) and use P L (x n ) = s S P L,s(n 0 (s), n 1 (s)) for Arithmetic coding Then redundancy is Red n (L, x n ) = L(x n 1 ) log P(x n ) (5) 1 < log( P L (x n ) ) + 2 log 1 P(x n ) (6) 1 = log s S P L,s(n 0 (s), n 1 (s)) log 1 P(x n ) + 2 (7)

36 Tree Source with Known Model Continued. s S Red n (L, x n ) = log( θ s n0(s) n 1(s) θs s S P L,s(n 0 (s), n 1 (s)) ) + 2 (8) ( ) 1 2 log(n 0(s) + n 1 (s)) (9) s S ( 1 S 2 log s S (n ) 0(s) + n 1 (s)) (10) S ( 1 = S 2 log n ) S (11)

37 Tree Source with Known Model Continued. s S Red n (L, x n ) = log( θ s n0(s) n 1(s) θs s S P L,s(n 0 (s), n 1 (s)) ) + 2 (8) ( ) 1 2 log(n 0(s) + n 1 (s)) (9) s S ( 1 S 2 log s S (n ) 0(s) + n 1 (s)) (10) S ( 1 = S 2 log n ) S (11) O(log(n)) term is the cost for unknown paramters θ s, s S

38 Tree Source with Known Model Continued. s S Red n (L, x n ) = log( θ s n0(s) n 1(s) θs s S P L,s(n 0 (s), n 1 (s)) ) + 2 (8) ( ) 1 2 log(n 0(s) + n 1 (s)) (9) s S ( 1 S 2 log s S (n ) 0(s) + n 1 (s)) (10) S ( 1 = S 2 log n ) S (11) O(log(n)) term is the cost for unknown paramters θ s, s S This upperbound through KT estimator meets the lowerbound of Minimax redundancy, i.e., S 2 log n + o(1)

39 Tree Source with Unknown Model What if we do not know the tree model either?

40 Tree Source with Unknown Model What if we do not know the tree model either? With assumption that tree has at most D depth,

41 Tree Source with Unknown Model What if we do not know the tree model either? With assumption that tree has at most D depth, Two part code: Use n T bits for representing tree and then use KT estimator for that tree, L T (x n ) + n T

42 Tree Source with Unknown Model What if we do not know the tree model either? With assumption that tree has at most D depth, Two part code: Use n T bits for representing tree and then use KT estimator for that tree, L T (x n ) + n T

43 Tree Source with Unknown Model What if we do not know the tree model either? With assumption that tree has at most D depth, Two part code: Use n T bits for representing tree and then use KT estimator for that tree, L T (x n ) + n T But it should be optimized over all trees and is not sequentially implementable

44 Tree Source with Unknown Model What if we do not know the tree model either? With assumption that tree has at most D depth, Two part code: Use n T bits for representing tree and then use KT estimator for that tree, L T (x n ) + n T But it should be optimized over all trees and is not sequentially implementable CTW(Context Tree Weighting) defines a weighted coding distribution which takes into account all the possible tree sources that could lead to the sequence that we observe

45 Tree Source with Unknown Model What if we do not know the tree model either? With assumption that tree has at most D depth, Two part code: Use n T bits for representing tree and then use KT estimator for that tree, L T (x n ) + n T But it should be optimized over all trees and is not sequentially implementable CTW(Context Tree Weighting) defines a weighted coding distribution which takes into account all the possible tree sources that could lead to the sequence that we observe

46 Tree Source with Unknown Model What if we do not know the tree model either? With assumption that tree has at most D depth, Two part code: Use n T bits for representing tree and then use KT estimator for that tree, L T (x n ) + n T But it should be optimized over all trees and is not sequentially implementable CTW(Context Tree Weighting) defines a weighted coding distribution which takes into account all the possible tree sources that could lead to the sequence that we observe Question: The leaves S from this context tree might be different from actual tree model. But still good?

47 Tree Source with Unknown Model Calculate n 0 (s), n 1 (s) for all s {s s {0, 1}, s D} s = λ Ex) x n = n 0 (0) = 3, n 1 (0) = 3 s = 0 s = s = 0 s = 1 n 0 (10) = 2, n 1 (10) = 1 (x n = ) s = 00s = 01s = 10s = 11

48 Tree Source with Unknown Model Calculate n 0 (s), n 1 (s) for all s {s s {0, 1}, s D} s = λ Ex) x n = n 0 (0) = 3, n 1 (0) = 3 s = 0 s = s = 0 s = 1 n 0 (10) = 2, n 1 (10) = 1 (x n = ) s = 00s = 01s = 10s = 11 Calculate corresponding KT estimatorp e (n 0 (s), n 1 (s)) for each s.

49 Tree Source with Unknown Model Calculate n 0 (s), n 1 (s) for all s {s s {0, 1}, s D} s = λ Ex) x n = n 0 (0) = 3, n 1 (0) = 3 s = 0 s = s = 0 s = 1 n 0 (10) = 2, n 1 (10) = 1 (x n = ) s = 00s = 01s = 10s = 11 Calculate corresponding KT estimatorp e (n 0 (s), n 1 (s)) for each s. Assign weighting probability P s w = { Pe (n 0 (s), n 1 (s)) if l(s) = D P e(n 0(s),n 1(s))+P 0s w P 1s w 2 if l(s) D (12) And take P L (x n ) = P λ w(x n ) and use it for Arithmetic coding.

50 Tree Source with Unknown Model Calculate n 0 (s), n 1 (s) for all s {s s {0, 1}, s D} s = λ Ex) x n = n 0 (0) = 3, n 1 (0) = 3 s = 0 s = s = 0 s = 1 n 0 (10) = 2, n 1 (10) = 1 (x n = ) s = 00s = 01s = 10s = 11 Calculate corresponding KT estimatorp e (n 0 (s), n 1 (s)) for each s. Assign weighting probability P s w = { Pe (n 0 (s), n 1 (s)) if l(s) = D P e(n 0(s),n 1(s))+P 0s w P 1s w 2 if l(s) D (12) And take P L (x n ) = P λ w(x n ) and use it for Arithmetic coding. Corollary: Suppose x n is drawn from P 1 or P 2. The one can achieve a redundacny of 1 bit by using the weighted distribution P w = P1+P2 2

51 Tree Source with Unknown Model

52 Tree Source with Unknown Model Then P λ w(x n ) = S all T D 2 ΓD(S ) P e (n 0 (s), n 1 (s)) s S (13) S P e (n 0 (s), n 1 (s)) (14) s S S P known L (x n ) (15)

53 Tree Source with Unknown Model Then P λ w(x n ) = The redundancy is S all T D 2 ΓD(S ) P e (n 0 (s), n 1 (s)) s S (13) S P e (n 0 (s), n 1 (s)) (14) s S S P known L (x n ) (15) Red n (x n ) = Red known n (x n ) + 2 S 1 (16) = S 2 log n + S S 1 (17) S

54 Summary Two part code vs. Mixture code(plug-in code)

55 Summary Two part code vs. Mixture code(plug-in code) Look at KT estimator which is is optimal in minimax sense. And the mixture is sequentially implementable by KT estimator.

56 Summary Two part code vs. Mixture code(plug-in code) Look at KT estimator which is is optimal in minimax sense. And the mixture is sequentially implementable by KT estimator. Even though we do not know the tree model(only depth is given), can design good compressor using CTW.

57 Reference Willems; Shtarkov; Tjalkens (1995), The Context-Tree Weighting Method: Basic Properties 41, IEEE Transactions on Information Theory Begleiter; El-Yaniv; Yona (2004), On Prediction Using Variable Order Markov Models 22, Journal of Artificial Intelligence Research: Journal of Artificial Intelligence Research, pp Some tutorial and lecture note

Context tree models for source coding

Context tree models for source coding Context tree models for source coding Toward Non-parametric Information Theory Licence de droits d usage Outline Lossless Source Coding = density estimation with log-loss Source Coding and Universal Coding

More information

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria Source Coding Master Universitario en Ingeniería de Telecomunicación I. Santamaría Universidad de Cantabria Contents Introduction Asymptotic Equipartition Property Optimal Codes (Huffman Coding) Universal

More information

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols.

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols. Universal Lossless coding Lempel-Ziv Coding Basic principles of lossless compression Historical review Variable-length-to-block coding Lempel-Ziv coding 1 Basic Principles of Lossless Coding 1. Exploit

More information

lossless, optimal compressor

lossless, optimal compressor 6. Variable-length Lossless Compression The principal engineering goal of compression is to represent a given sequence a, a 2,..., a n produced by a source as a sequence of bits of minimal possible length.

More information

10-704: Information Processing and Learning Fall Lecture 9: Sept 28

10-704: Information Processing and Learning Fall Lecture 9: Sept 28 10-704: Information Processing and Learning Fall 2016 Lecturer: Siheng Chen Lecture 9: Sept 28 Note: These notes are based on scribed notes from Spring15 offering of this course. LaTeX template courtesy

More information

Lecture 3. Mathematical methods in communication I. REMINDER. A. Convex Set. A set R is a convex set iff, x 1,x 2 R, θ, 0 θ 1, θx 1 + θx 2 R, (1)

Lecture 3. Mathematical methods in communication I. REMINDER. A. Convex Set. A set R is a convex set iff, x 1,x 2 R, θ, 0 θ 1, θx 1 + θx 2 R, (1) 3- Mathematical methods in communication Lecture 3 Lecturer: Haim Permuter Scribe: Yuval Carmel, Dima Khaykin, Ziv Goldfeld I. REMINDER A. Convex Set A set R is a convex set iff, x,x 2 R, θ, θ, θx + θx

More information

Chapter 5: Data Compression

Chapter 5: Data Compression Chapter 5: Data Compression Definition. A source code C for a random variable X is a mapping from the range of X to the set of finite length strings of symbols from a D-ary alphabet. ˆX: source alphabet,

More information

IN this paper, we study the problem of universal lossless compression

IN this paper, we study the problem of universal lossless compression 4008 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 9, SEPTEMBER 2006 An Algorithm for Universal Lossless Compression With Side Information Haixiao Cai, Member, IEEE, Sanjeev R. Kulkarni, Fellow,

More information

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding SIGNAL COMPRESSION Lecture 7 Variable to Fix Encoding 1. Tunstall codes 2. Petry codes 3. Generalized Tunstall codes for Markov sources (a presentation of the paper by I. Tabus, G. Korodi, J. Rissanen.

More information

ECE 587 / STA 563: Lecture 5 Lossless Compression

ECE 587 / STA 563: Lecture 5 Lossless Compression ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 2017 Author: Galen Reeves Last Modified: October 18, 2017 Outline of lecture: 5.1 Introduction to Lossless Source

More information

ECE 587 / STA 563: Lecture 5 Lossless Compression

ECE 587 / STA 563: Lecture 5 Lossless Compression ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 28 Author: Galen Reeves Last Modified: September 27, 28 Outline of lecture: 5. Introduction to Lossless Source

More information

Lecture 5: Asymptotic Equipartition Property

Lecture 5: Asymptotic Equipartition Property Lecture 5: Asymptotic Equipartition Property Law of large number for product of random variables AEP and consequences Dr. Yao Xie, ECE587, Information Theory, Duke University Stock market Initial investment

More information

Chapter 2: Source coding

Chapter 2: Source coding Chapter 2: meghdadi@ensil.unilim.fr University of Limoges Chapter 2: Entropy of Markov Source Chapter 2: Entropy of Markov Source Markov model for information sources Given the present, the future is independent

More information

CSCI 2570 Introduction to Nanocomputing

CSCI 2570 Introduction to Nanocomputing CSCI 2570 Introduction to Nanocomputing Information Theory John E Savage What is Information Theory Introduced by Claude Shannon. See Wikipedia Two foci: a) data compression and b) reliable communication

More information

An O(N) Semi-Predictive Universal Encoder via the BWT

An O(N) Semi-Predictive Universal Encoder via the BWT An O(N) Semi-Predictive Universal Encoder via the BWT Dror Baron and Yoram Bresler Abstract We provide an O(N) algorithm for a non-sequential semi-predictive encoder whose pointwise redundancy with respect

More information

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code Chapter 2 Date Compression: Source Coding 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code 2.1 An Introduction to Source Coding Source coding can be seen as an efficient way

More information

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding SIGNAL COMPRESSION Lecture 3 4.9.2007 Shannon-Fano-Elias Codes and Arithmetic Coding 1 Shannon-Fano-Elias Coding We discuss how to encode the symbols {a 1, a 2,..., a m }, knowing their probabilities,

More information

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3 3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2011, 28 November 2011 Memoryless Sources Arithmetic Coding Sources with Memory 2 / 19 Summary of last lecture Prefix-free

More information

Chapter 3 Source Coding. 3.1 An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code

Chapter 3 Source Coding. 3.1 An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code Chapter 3 Source Coding 3. An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code 3. An Introduction to Source Coding Entropy (in bits per symbol) implies in average

More information

4F5: Advanced Communications and Coding Handout 2: The Typical Set, Compression, Mutual Information

4F5: Advanced Communications and Coding Handout 2: The Typical Set, Compression, Mutual Information 4F5: Advanced Communications and Coding Handout 2: The Typical Set, Compression, Mutual Information Ramji Venkataramanan Signal Processing and Communications Lab Department of Engineering ramji.v@eng.cam.ac.uk

More information

COMM901 Source Coding and Compression. Quiz 1

COMM901 Source Coding and Compression. Quiz 1 German University in Cairo - GUC Faculty of Information Engineering & Technology - IET Department of Communication Engineering Winter Semester 2013/2014 Students Name: Students ID: COMM901 Source Coding

More information

10-704: Information Processing and Learning Fall Lecture 10: Oct 3

10-704: Information Processing and Learning Fall Lecture 10: Oct 3 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 0: Oct 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of

More information

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for

MARKOV CHAINS A finite state Markov chain is a sequence of discrete cv s from a finite alphabet where is a pmf on and for MARKOV CHAINS A finite state Markov chain is a sequence S 0,S 1,... of discrete cv s from a finite alphabet S where q 0 (s) is a pmf on S 0 and for n 1, Q(s s ) = Pr(S n =s S n 1 =s ) = Pr(S n =s S n 1

More information

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3 3F1 Information Theory, Lecture 3 Jossy Sayir Department of Engineering Michaelmas 2013, 29 November 2013 Memoryless Sources Arithmetic Coding Sources with Memory Markov Example 2 / 21 Encoding the output

More information

Motivation for Arithmetic Coding

Motivation for Arithmetic Coding Motivation for Arithmetic Coding Motivations for arithmetic coding: 1) Huffman coding algorithm can generate prefix codes with a minimum average codeword length. But this length is usually strictly greater

More information

Lecture 22: Final Review

Lecture 22: Final Review Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information

More information

Image and Multidimensional Signal Processing

Image and Multidimensional Signal Processing Image and Multidimensional Signal Processing Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ Image Compression 2 Image Compression Goal: Reduce amount

More information

Homework Set #2 Data Compression, Huffman code and AEP

Homework Set #2 Data Compression, Huffman code and AEP Homework Set #2 Data Compression, Huffman code and AEP 1. Huffman coding. Consider the random variable ( x1 x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0.11 0.04 0.04 0.03 0.02 (a Find a binary Huffman code

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 AEP Asymptotic Equipartition Property AEP In information theory, the analog of

More information

Introduction to information theory and coding

Introduction to information theory and coding Introduction to information theory and coding Louis WEHENKEL Set of slides No 5 State of the art in data compression Stochastic processes and models for information sources First Shannon theorem : data

More information

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST Information Theory Lecture 5 Entropy rate and Markov sources STEFAN HÖST Universal Source Coding Huffman coding is optimal, what is the problem? In the previous coding schemes (Huffman and Shannon-Fano)it

More information

arxiv:cs/ v1 [cs.it] 21 Nov 2006

arxiv:cs/ v1 [cs.it] 21 Nov 2006 On the space complexity of one-pass compression Travis Gagie Department of Computer Science University of Toronto travis@cs.toronto.edu arxiv:cs/0611099v1 [cs.it] 21 Nov 2006 STUDENT PAPER Abstract. We

More information

Information Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18

Information Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18 Information Theory David Rosenberg New York University June 15, 2015 David Rosenberg (New York University) DS-GA 1003 June 15, 2015 1 / 18 A Measure of Information? Consider a discrete random variable

More information

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16 EE539R: Problem Set 4 Assigned: 3/08/6, Due: 07/09/6. Cover and Thomas: Problem 3.5 Sets defined by probabilities: Define the set C n (t = {x n : P X n(x n 2 nt } (a We have = P X n(x n P X n(x n 2 nt

More information

Coding on Countably Infinite Alphabets

Coding on Countably Infinite Alphabets Coding on Countably Infinite Alphabets Non-parametric Information Theory Licence de droits d usage Outline Lossless Coding on infinite alphabets Source Coding Universal Coding Infinite Alphabets Enveloppe

More information

Superior Guarantees for Sequential Prediction and Lossless Compression via Alphabet Decomposition

Superior Guarantees for Sequential Prediction and Lossless Compression via Alphabet Decomposition Journal of Machine Learning Research 7 2006) 379 4 Submitted 8/05; Published 2/06 Superior Guarantees for Sequential Prediction and Lossless Compression via Alphabet Decomposition Ron Begleiter Ran El-Yaniv

More information

much more on minimax (order bounds) cf. lecture by Iain Johnstone

much more on minimax (order bounds) cf. lecture by Iain Johnstone much more on minimax (order bounds) cf. lecture by Iain Johnstone http://www-stat.stanford.edu/~imj/wald/wald1web.pdf today s lecture parametric estimation, Fisher information, Cramer-Rao lower bound:

More information

ELEC 515 Information Theory. Distortionless Source Coding

ELEC 515 Information Theory. Distortionless Source Coding ELEC 515 Information Theory Distortionless Source Coding 1 Source Coding Output Alphabet Y={y 1,,y J } Source Encoder Lengths 2 Source Coding Two coding requirements The source sequence can be recovered

More information

Sample solutions to Homework 4, Information-Theoretic Modeling (Fall 2014)

Sample solutions to Homework 4, Information-Theoretic Modeling (Fall 2014) Sample solutions to Homework 4, Information-Theoretic Modeling (Fall 204) Jussi Määttä October 2, 204 Question [First, note that we use the symbol! as an end-of-message symbol. When we see it, we know

More information

CS 229r Information Theory in Computer Science Feb 12, Lecture 5

CS 229r Information Theory in Computer Science Feb 12, Lecture 5 CS 229r Information Theory in Computer Science Feb 12, 2019 Lecture 5 Instructor: Madhu Sudan Scribe: Pranay Tankala 1 Overview A universal compression algorithm is a single compression algorithm applicable

More information

Information. Abstract. This paper presents conditional versions of Lempel-Ziv (LZ) algorithm for settings where compressor and decompressor

Information. Abstract. This paper presents conditional versions of Lempel-Ziv (LZ) algorithm for settings where compressor and decompressor Optimal Universal Lossless Compression with Side Information Yeohee Im, Student Member, IEEE and Sergio Verdú, Fellow, IEEE arxiv:707.0542v cs.it 8 Jul 207 Abstract This paper presents conditional versions

More information

Coding for Discrete Source

Coding for Discrete Source EGR 544 Communication Theory 3. Coding for Discrete Sources Z. Aliyazicioglu Electrical and Computer Engineering Department Cal Poly Pomona Coding for Discrete Source Coding Represent source data effectively

More information

Lecture 4 : Adaptive source coding algorithms

Lecture 4 : Adaptive source coding algorithms Lecture 4 : Adaptive source coding algorithms February 2, 28 Information Theory Outline 1. Motivation ; 2. adaptive Huffman encoding ; 3. Gallager and Knuth s method ; 4. Dictionary methods : Lempel-Ziv

More information

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE General e Image Coder Structure Motion Video x(s 1,s 2,t) or x(s 1,s 2 ) Natural Image Sampling A form of data compression; usually lossless, but can be lossy Redundancy Removal Lossless compression: predictive

More information

Information Theory. Week 4 Compressing streams. Iain Murray,

Information Theory. Week 4 Compressing streams. Iain Murray, Information Theory http://www.inf.ed.ac.uk/teaching/courses/it/ Week 4 Compressing streams Iain Murray, 2014 School of Informatics, University of Edinburgh Jensen s inequality For convex functions: E[f(x)]

More information

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes Information Theory with Applications, Math6397 Lecture Notes from September 3, 24 taken by Ilknur Telkes Last Time Kraft inequality (sep.or) prefix code Shannon Fano code Bound for average code-word length

More information

On Prediction Using Variable Order Markov Models

On Prediction Using Variable Order Markov Models Journal of Artificial Intelligence Research 22 (2004) 385-421 Submitted 05/04; published 12/04 On Prediction Using Variable Order Markov Models Ron Begleiter Ran El-Yaniv Department of Computer Science

More information

Complex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity

Complex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity Complex Systems Methods 2. Conditional mutual information, entropy rate and algorithmic complexity Eckehard Olbrich MPI MiS Leipzig Potsdam WS 2007/08 Olbrich (Leipzig) 26.10.2007 1 / 18 Overview 1 Summary

More information

Exercises with solutions (Set B)

Exercises with solutions (Set B) Exercises with solutions (Set B) 3. A fair coin is tossed an infinite number of times. Let Y n be a random variable, with n Z, that describes the outcome of the n-th coin toss. If the outcome of the n-th

More information

(Classical) Information Theory II: Source coding

(Classical) Information Theory II: Source coding (Classical) Information Theory II: Source coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract The information content of a random variable

More information

PROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

PROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS PROBABILITY AND INFORMATION THEORY Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Probability space Rules of probability

More information

Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

More information

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 Please submit the solutions on Gradescope. EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 1. Optimal codeword lengths. Although the codeword lengths of an optimal variable length code

More information

On Universal Types. Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA. University of Minnesota, September 14, 2004

On Universal Types. Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA. University of Minnesota, September 14, 2004 On Universal Types Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA University of Minnesota, September 14, 2004 Types for Parametric Probability Distributions A = finite alphabet,

More information

Digital communication system. Shannon s separation principle

Digital communication system. Shannon s separation principle Digital communication system Representation of the source signal by a stream of (binary) symbols Adaptation to the properties of the transmission channel information source source coder channel coder modulation

More information

Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2

Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2 582487 Data Compression Techniques (Spring 22) Model Solutions for Exercise 2 If you have any feedback or corrections, please contact nvalimak at cs.helsinki.fi.. Problem: Construct a canonical prefix

More information

Bounded Expected Delay in Arithmetic Coding

Bounded Expected Delay in Arithmetic Coding Bounded Expected Delay in Arithmetic Coding Ofer Shayevitz, Ram Zamir, and Meir Feder Tel Aviv University, Dept. of EE-Systems Tel Aviv 69978, Israel Email: {ofersha, zamir, meir }@eng.tau.ac.il arxiv:cs/0604106v1

More information

Lecture 2: August 31

Lecture 2: August 31 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 2: August 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy

More information

1 Ex. 1 Verify that the function H(p 1,..., p n ) = k p k log 2 p k satisfies all 8 axioms on H.

1 Ex. 1 Verify that the function H(p 1,..., p n ) = k p k log 2 p k satisfies all 8 axioms on H. Problem sheet Ex. Verify that the function H(p,..., p n ) = k p k log p k satisfies all 8 axioms on H. Ex. (Not to be handed in). looking at the notes). List as many of the 8 axioms as you can, (without

More information

Information and Entropy

Information and Entropy Information and Entropy Shannon s Separation Principle Source Coding Principles Entropy Variable Length Codes Huffman Codes Joint Sources Arithmetic Codes Adaptive Codes Thomas Wiegand: Digital Image Communication

More information

Solutions to Set #2 Data Compression, Huffman code and AEP

Solutions to Set #2 Data Compression, Huffman code and AEP Solutions to Set #2 Data Compression, Huffman code and AEP. Huffman coding. Consider the random variable ( ) x x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0. 0.04 0.04 0.03 0.02 (a) Find a binary Huffman code

More information

Information & Correlation

Information & Correlation Information & Correlation Jilles Vreeken 11 June 2014 (TADA) Questions of the day What is information? How can we measure correlation? and what do talking drums have to do with this? Bits and Pieces What

More information

Coding of memoryless sources 1/35

Coding of memoryless sources 1/35 Coding of memoryless sources 1/35 Outline 1. Morse coding ; 2. Definitions : encoding, encoding efficiency ; 3. fixed length codes, encoding integers ; 4. prefix condition ; 5. Kraft and Mac Millan theorems

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 7: Information Theory Cosma Shalizi 3 February 2009 Entropy and Information Measuring randomness and dependence in bits The connection to statistics Long-run

More information

Applications of Information Theory in Science and in Engineering

Applications of Information Theory in Science and in Engineering Applications of Information Theory in Science and in Engineering Some Uses of Shannon Entropy (Mostly) Outside of Engineering and Computer Science Mário A. T. Figueiredo 1 Communications: The Big Picture

More information

Information Theory in Intelligent Decision Making

Information Theory in Intelligent Decision Making Information Theory in Intelligent Decision Making Adaptive Systems and Algorithms Research Groups School of Computer Science University of Hertfordshire, United Kingdom June 7, 2015 Information Theory

More information

Two-Part Codes with Low Worst-Case Redundancies for Distributed Compression of Bernoulli Sequences

Two-Part Codes with Low Worst-Case Redundancies for Distributed Compression of Bernoulli Sequences 2003 Conference on Information Sciences and Systems, The Johns Hopkins University, March 2 4, 2003 Two-Part Codes with Low Worst-Case Redundancies for Distributed Compression of Bernoulli Sequences Dror

More information

Lecture 3 : Algorithms for source coding. September 30, 2016

Lecture 3 : Algorithms for source coding. September 30, 2016 Lecture 3 : Algorithms for source coding September 30, 2016 Outline 1. Huffman code ; proof of optimality ; 2. Coding with intervals : Shannon-Fano-Elias code and Shannon code ; 3. Arithmetic coding. 1/39

More information

Data Compression. Limit of Information Compression. October, Examples of codes 1

Data Compression. Limit of Information Compression. October, Examples of codes 1 Data Compression Limit of Information Compression Radu Trîmbiţaş October, 202 Outline Contents Eamples of codes 2 Kraft Inequality 4 2. Kraft Inequality............................ 4 2.2 Kraft inequality

More information

Source Coding Techniques

Source Coding Techniques Source Coding Techniques. Huffman Code. 2. Two-pass Huffman Code. 3. Lemple-Ziv Code. 4. Fano code. 5. Shannon Code. 6. Arithmetic Code. Source Coding Techniques. Huffman Code. 2. Two-path Huffman Code.

More information

CS4800: Algorithms & Data Jonathan Ullman

CS4800: Algorithms & Data Jonathan Ullman CS4800: Algorithms & Data Jonathan Ullman Lecture 22: Greedy Algorithms: Huffman Codes Data Compression and Entropy Apr 5, 2018 Data Compression How do we store strings of text compactly? A (binary) code

More information

Non-binary Distributed Arithmetic Coding

Non-binary Distributed Arithmetic Coding Non-binary Distributed Arithmetic Coding by Ziyang Wang Thesis submitted to the Faculty of Graduate and Postdoctoral Studies In partial fulfillment of the requirements For the Masc degree in Electrical

More information

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1 Kraft s inequality An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if N 2 l i 1 Proof: Suppose that we have a tree code. Let l max = max{l 1,...,

More information

EE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1)

EE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1) EE5585 Data Compression January 29, 2013 Lecture 3 Instructor: Arya Mazumdar Scribe: Katie Moenkhaus Uniquely Decodable Codes Recall that for a uniquely decodable code with source set X, if l(x) is the

More information

ee378a spring 2013 April 1st intro lecture statistical signal processing/ inference, estimation, and information processing Monday, April 1, 13

ee378a spring 2013 April 1st intro lecture statistical signal processing/ inference, estimation, and information processing Monday, April 1, 13 ee378a statistical signal processing/ inference, estimation, and information processing spring 2013 April 1st intro lecture 1 what is statistical signal processing? anything & everything inference, estimation,

More information

Information Theory CHAPTER. 5.1 Introduction. 5.2 Entropy

Information Theory CHAPTER. 5.1 Introduction. 5.2 Entropy Haykin_ch05_pp3.fm Page 207 Monday, November 26, 202 2:44 PM CHAPTER 5 Information Theory 5. Introduction As mentioned in Chapter and reiterated along the way, the purpose of a communication system is

More information

Information Theory and Statistics, Part I

Information Theory and Statistics, Part I Information Theory and Statistics, Part I Information Theory 2013 Lecture 6 George Mathai May 16, 2013 Outline This lecture will cover Method of Types. Law of Large Numbers. Universal Source Coding. Large

More information

EE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018

EE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018 Please submit the solutions on Gradescope. Some definitions that may be useful: EE376A: Homework #2 Solutions Due by 11:59pm Thursday, February 1st, 2018 Definition 1: A sequence of random variables X

More information

An MCMC Approach to Lossy Compression of Continuous Sources

An MCMC Approach to Lossy Compression of Continuous Sources An MCMC Approach to Lossy Compression of Continuous Sources Dror Baron Electrical Engineering Department Technion Israel Institute of Technology Haifa, Israel Email: drorb@ee.technion.ac.il Tsachy Weissman

More information

A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm

A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm 1 Nikhil Krishnan, Student Member, IEEE, and Dror Baron, Senior Member, IEEE arxiv:submit/11714 [cs.it] 1 Mar 015 Abstract Computing

More information

Information Theory and Statistics Lecture 2: Source coding

Information Theory and Statistics Lecture 2: Source coding Information Theory and Statistics Lecture 2: Source coding Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Injections and codes Definition (injection) Function f is called an injection

More information

A One-to-One Code and Its Anti-Redundancy

A One-to-One Code and Its Anti-Redundancy A One-to-One Code and Its Anti-Redundancy W. Szpankowski Department of Computer Science, Purdue University July 4, 2005 This research is supported by NSF, NSA and NIH. Outline of the Talk. Prefix Codes

More information

Entropy Coding. Connectivity coding. Entropy coding. Definitions. Lossles coder. Input: a set of symbols Output: bitstream. Idea

Entropy Coding. Connectivity coding. Entropy coding. Definitions. Lossles coder. Input: a set of symbols Output: bitstream. Idea Connectivity coding Entropy Coding dd 7, dd 6, dd 7, dd 5,... TG output... CRRRLSLECRRE Entropy coder output Connectivity data Edgebreaker output Digital Geometry Processing - Spring 8, Technion Digital

More information

Data Compression Techniques

Data Compression Techniques Data Compression Techniques Part 2: Text Compression Lecture 5: Context-Based Compression Juha Kärkkäinen 14.11.2017 1 / 19 Text Compression We will now look at techniques for text compression. These techniques

More information

Multimedia Information Systems

Multimedia Information Systems Multimedia Information Systems Samson Cheung EE 639, Fall 2004 Lecture 3 & 4: Color, Video, and Fundamentals of Data Compression 1 Color Science Light is an electromagnetic wave. Its color is characterized

More information

Multimedia Communications. Mathematical Preliminaries for Lossless Compression

Multimedia Communications. Mathematical Preliminaries for Lossless Compression Multimedia Communications Mathematical Preliminaries for Lossless Compression What we will see in this chapter Definition of information and entropy Modeling a data source Definition of coding and when

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006 MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.44 Transmission of Information Spring 2006 Homework 2 Solution name username April 4, 2006 Reading: Chapter

More information

Midterm Exam Information Theory Fall Midterm Exam. Time: 09:10 12:10 11/23, 2016

Midterm Exam Information Theory Fall Midterm Exam. Time: 09:10 12:10 11/23, 2016 Midterm Exam Time: 09:10 12:10 11/23, 2016 Name: Student ID: Policy: (Read before You Start to Work) The exam is closed book. However, you are allowed to bring TWO A4-size cheat sheet (single-sheet, two-sided).

More information

On Probability Estimation by Exponential Smoothing

On Probability Estimation by Exponential Smoothing On Probability Estimation by Exponential Smoothing Christopher Mattern Technische Universität Ilmenau Ilmenau, Germany christopher.mattern@tu-ilmenau.de Abstract. Probability estimation is essential for

More information

On universal types. Gadiel Seroussi Information Theory Research HP Laboratories Palo Alto HPL September 6, 2004*

On universal types. Gadiel Seroussi Information Theory Research HP Laboratories Palo Alto HPL September 6, 2004* On universal types Gadiel Seroussi Information Theory Research HP Laboratories Palo Alto HPL-2004-153 September 6, 2004* E-mail: gadiel.seroussi@hp.com method of types, type classes, Lempel-Ziv coding,

More information

Lecture 1: Shannon s Theorem

Lecture 1: Shannon s Theorem Lecture 1: Shannon s Theorem Lecturer: Travis Gagie January 13th, 2015 Welcome to Data Compression! I m Travis and I ll be your instructor this week. If you haven t registered yet, don t worry, we ll work

More information

On the lower limits of entropy estimation

On the lower limits of entropy estimation On the lower limits of entropy estimation Abraham J. Wyner and Dean Foster Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, PA e-mail: ajw@wharton.upenn.edu foster@wharton.upenn.edu

More information

arxiv: v2 [cs.ds] 28 Jan 2009

arxiv: v2 [cs.ds] 28 Jan 2009 Minimax Trees in Linear Time Pawe l Gawrychowski 1 and Travis Gagie 2, arxiv:0812.2868v2 [cs.ds] 28 Jan 2009 1 Institute of Computer Science University of Wroclaw, Poland gawry1@gmail.com 2 Research Group

More information

ELEMENTS O F INFORMATION THEORY

ELEMENTS O F INFORMATION THEORY ELEMENTS O F INFORMATION THEORY THOMAS M. COVER JOY A. THOMAS Preface to the Second Edition Preface to the First Edition Acknowledgments for the Second Edition Acknowledgments for the First Edition x

More information

Block 2: Introduction to Information Theory

Block 2: Introduction to Information Theory Block 2: Introduction to Information Theory Francisco J. Escribano April 26, 2015 Francisco J. Escribano Block 2: Introduction to Information Theory April 26, 2015 1 / 51 Table of contents 1 Motivation

More information

1 Introduction to information theory

1 Introduction to information theory 1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through

More information

Piecewise Constant Prediction

Piecewise Constant Prediction Piecewise Constant Prediction Erik Ordentlich Information heory Research Hewlett-Packard Laboratories Palo Alto, CA 94304 Email: erik.ordentlich@hp.com Marcelo J. Weinberger Information heory Research

More information

Asymptotic Minimax Regret for Data Compression, Gambling, and Prediction

Asymptotic Minimax Regret for Data Compression, Gambling, and Prediction IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 46, NO. 2, MARCH 2000 431 Asymptotic Minimax Regret for Data Compression, Gambling, Prediction Qun Xie Andrew R. Barron, Member, IEEE Abstract For problems

More information

Ch 0 Introduction. 0.1 Overview of Information Theory and Coding

Ch 0 Introduction. 0.1 Overview of Information Theory and Coding Ch 0 Introduction 0.1 Overview of Information Theory and Coding Overview The information theory was founded by Shannon in 1948. This theory is for transmission (communication system) or recording (storage

More information

Algorithmic probability, Part 1 of n. A presentation to the Maths Study Group at London South Bank University 09/09/2015

Algorithmic probability, Part 1 of n. A presentation to the Maths Study Group at London South Bank University 09/09/2015 Algorithmic probability, Part 1 of n A presentation to the Maths Study Group at London South Bank University 09/09/2015 Motivation Effective clustering the partitioning of a collection of objects such

More information