1 On Universal Types
Gadiel Seroussi, Hewlett-Packard Laboratories, Palo Alto, California, USA
University of Minnesota, September 14, 2004

2 Types for Parametric Probability Distributions

$A$ = finite alphabet; a sequence or string $x^n = x_1 x_2 \cdots x_n \in A^n$ ($n \ge 0$); $x_j^k = x_j x_{j+1} \cdots x_k$ denotes a sub-string.

Parametric family $\mathcal{P} = \{P_\Theta\}$ of distributions on $A^n$, $P_\Theta : A^n \to [0,1]$, $\Theta \in D \subseteq \mathbb{R}^K$.

Type class of $x^n$ with respect to $\mathcal{P}$:
$T_{\mathcal{P}}(x^n) = \{\, y^n \in A^n : P_\Theta(y^n) = P_\Theta(x^n) \ \forall\, \Theta \in D \,\}$
We say that $y^n$ and $x^n$ are of the same type.

3 Types for Parametric Probability Distributions (cont.)

Type class of $x^n$ with respect to $\mathcal{P}$: $T_{\mathcal{P}}(x^n) = \{\, y^n \in A^n : P_\Theta(y^n) = P_\Theta(x^n) \ \forall\, \Theta \in D \,\}$.

Example: $A = \{0,1\}$, $\mathcal{P} = \{\mathrm{Bernoulli}(\theta) : 0 \le \theta \le 1\}$ (1-parameter family): $x^n$ and $y^n$ are of the same type iff they have the same count of 1's.

Example: $k$-th order binary Markov: $2^k$ parameters $P(x_i = 1 \mid x_{i-k}^{i-1})$; same type iff same $(k{+}1)$-st order joint empirical distributions (with appropriate initial state assumptions).

The method of types is classical in information theory [Csiszár & Körner 81, Csiszár 98, ...].
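Both examples are easy to check mechanically. A minimal Python sketch (the function names are mine, and the cyclic wrap-around below is one possible stand-in for the slide's "appropriate initial state assumptions"):

    from collections import Counter

    def bernoulli_same_type(x, y):
        # Bernoulli family: same type <=> same length and same number of 1's
        return len(x) == len(y) and x.count('1') == y.count('1')

    def markov_same_type(x, y, k):
        # k-th order Markov: same type <=> same (k+1)-st order joint counts;
        # the cyclic extension is an assumed initial-state convention
        def counts(s):
            ext = s + s[:k]
            return Counter(ext[i:i + k + 1] for i in range(len(s)))
        return len(x) == len(y) and counts(x) == counts(y)

    print(bernoulli_same_type("0110", "1010"))  # True: two 1's each
    print(markov_same_type("0110", "1010", 1))  # False: bigram counts differ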

4 The Lempel-Ziv incremental parsing

Parse $x^n$ into phrases: $x^n = p_0\, p_1\, p_2 \cdots p_{c-1}\, t_x$, with $p_0 = \lambda$, the null string.

$p_i$, $i > 0$, is the shortest substring of $x^n$, starting at the position following $p_{i-1}$, such that $p_i \ne p_j$ for all $j < i$ (all $p_i$'s are different). Every phrase $p_i$, except $p_0 = \lambda$, is an extension of another phrase. $t_x$ is the remaining tail of $x^n$, where parsing is truncated due to the end of the sequence (it must be equal to one of the $p_i$'s).

The LZ78 universal data compression algorithm [Ziv & Lempel, 1978] has code length $L_{\mathrm{LZ}} = c \log c + O(c)$.

Example: $x^8 = 10101100$ parses into $\lambda, 1, 0, 10, 11, 00$; $c = 6$, $t_x = \lambda$. Tree representation $T$: nodes correspond to phrases. [parsing tree figure]
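The parsing rule transcribes directly into code. A minimal sketch (the function name is mine):

    def lz78_parse(x):
        # Each phrase is the shortest substring, starting where the previous
        # phrase ended, that has not occurred as a phrase before.
        phrases, parsed, i = set(), [], 0
        while i < len(x):
            j = i + 1
            while j < len(x) and x[i:j] in phrases:
                j += 1
            phrase = x[i:j]
            if phrase in phrases:          # ran off the end: remaining tail t_x
                return parsed, phrase
            phrases.add(phrase)
            parsed.append(phrase)
            i = j
        return parsed, ''                  # empty tail (t_x = lambda)

    print(lz78_parse('10101100'))  # (['1', '0', '10', '11', '00'], '')

On the slide's example this recovers the phrases 1, 0, 10, 11, 00 with an empty tail, i.e., $c = 6$ counting $p_0 = \lambda$.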

5 Universal type classes

$T_{x^n} = \{p_0, p_1, \ldots, p_{c-1}\}$ = set of phrases of $x^n$.

Universal (LZ) type class of $x^n$: $U_{x^n} = \{\, y^n \in A^n : T_{y^n} = T_{x^n} \,\}$, the set of all strings of length $n$ that produce the same set of phrases as $x^n$: same phrases, generally in a different order, where the phrase permutation must respect the prefix relation.

Example: $x^8 = 10101100 = 1, 0, 10, 11, 00$ and $y^8 = 00011110 = 0, 00, 1, 11, 10$ are of the same universal type.
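Membership in $U_{x^n}$ can then be tested by comparing phrase sets, reusing lz78_parse from the sketch above (I additionally require equal tails, a simplifying choice consistent with the talk's later decision to ignore tail effects):

    def same_universal_type(x, y):
        # y^n is in U_{x^n} iff both sequences have the same length and their
        # LZ78 parsings yield the same *set* of phrases (order may differ)
        px, tx = lz78_parse(x)
        py, ty = lz78_parse(y)
        return len(x) == len(y) and set(px) == set(py) and tx == ty

    print(same_universal_type('10101100', '00011110'))  # True: 0,00,1,11,10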

6 Why are universal type classes interesting?

Let $N(u^k, v^m)$ = number of (possibly overlapping) occurrences of $u^k$ in $v^m$, and let $\hat{P}^{(k)}_{x^n}(u^k) = N(u^k, x^n)/(n-k+1)$, $u^k \in A^k$, be the empirical joint distribution of order $k \le n$.

Theorem. Let $x^n$ be an arbitrary sequence of length $n$, and $k$ a fixed positive integer. If $y^n \in U_{x^n}$, then, for all $u^k \in A^k$, $\hat{P}^{(k)}_{x^n}(u^k) - \hat{P}^{(k)}_{y^n}(u^k) \to 0$ as $n \to \infty$. (Empirical distributions of any order $k$ converge in the variational sense.)

Asymptotics are over an infinite sequence of sequences $x^n$, not necessarily prefix-related.

Proof arguments: any occurrence of $u^k$ is either contained in a phrase, spans a phrase boundary ($O(kc)$ places), or is contained in the tail ($\le |t_x|$ places); well-known LZ properties: $c \le n/(\log n - o(\log n))$ and $|p_i| \le \sqrt{2n} + o(\sqrt{n})$.

Rate of convergence: $O(1/\log n)$.
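The quantities in the theorem are straightforward to compute for finite $n$. A sketch (at $n = 8$ the distance is of course not yet small):

    from collections import Counter

    def empirical_dist(x, k):
        # \hat P^(k)_{x^n}: overlapping k-block relative frequencies
        m = len(x) - k + 1
        return {u: c / m for u, c in Counter(x[i:i + k] for i in range(m)).items()}

    def variational_distance(x, y, k):
        px, py = empirical_dist(x, k), empirical_dist(y, k)
        return sum(abs(px.get(u, 0) - py.get(u, 0)) for u in set(px) | set(py))

    print(variational_distance('10101100', '00011110', 2))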

7 Finite memory probability assignments

A $k$-th order (finite-memory) probability assignment on $A^n$ is defined by a set of conditional distributions $Q_k(u_{k+1} \mid u^k)$, $u^{k+1} \in A^{k+1}$, and a distribution $Q_k(u^k)$ on the initial state, so that
$Q_k(x^n) = Q_k(x^k) \prod_{i=k+1}^{n} Q_k(x_i \mid x_{i-k}^{i-1})$.
In particular, $Q_k$ could be defined by the $k$-th order approximation of an ergodic measure.

Corollary. Let $x^n$ and $y^n$ be sequences of the same universal type. Then, for any nonnegative integer $k$ and any $k$-th order probability assignment $Q_k$ such that $Q_k(x^n), Q_k(y^n) \ne 0$, we have $\frac{1}{n} \log \frac{Q_k(x^n)}{Q_k(y^n)} \to 0$ as $n \to \infty$.
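The corollary is also easy to probe numerically. A sketch with a toy first-order assignment (the conditional probabilities below are made-up illustration values, not anything from the talk):

    import math

    def log2_prob(x, k, cond, init):
        # Q_k(x^n) = Q_k(x^k) * prod_{i>k} Q_k(x_i | x_{i-k}^{i-1}), in log2
        lp = math.log2(init[x[:k]])
        for i in range(k, len(x)):
            lp += math.log2(cond[x[i - k:i], x[i]])
        return lp

    cond = {('0', '0'): 0.6, ('0', '1'): 0.4, ('1', '0'): 0.5, ('1', '1'): 0.5}
    init = {'0': 0.5, '1': 0.5}
    lx = log2_prob('10101100', 1, cond, init)
    ly = log2_prob('00011110', 1, cond, init)
    print((lx - ly) / 8)   # (1/n) log Q_k(x^n)/Q_k(y^n); -> 0 as n grows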

8 Conventional vs. Universal types

Sequences of the same conventional type $T_{\mathcal{P}}(x^n)$:
- have identical empirical distributions of a given structure;
- are assigned identical probabilities by assignments from a given parametric class.

Sequences of the same universal type $U_{x^n}$:
- have converging empirical distributions of any finite order;
- are assigned probabilities whose normalized logs converge, for finite-memory assignments of any finite order.

The trade-off: degree of indistinguishability vs. universality.

9 Other things we like to do with type classes

- Ask how many sequences are in the type class of $x^n$ (conventional: $2^{n\hat{H}(x^n) + o(n)}$).
- Ask how many types there are for a given sequence length $n$ (conventional: $\approx n^K$).
- Draw a sequence at random with uniform probability from the class.
- Enumerate the type class.
- Use the class for (enumerative) coding and for universal simulation.

10 The tree is the type (almost)

Assumptions: for simplicity, $A = \{0,1\}$ (everything carries over to $|A| > 2$); we ignore the effects of tails (a nuisance, which does not affect the main results); when in doubt, $\log$ is to base 2.

Parsing tree $T$:
- $c$ = number of nodes
- $n = \sum_i |p_i|$ = path length of $T$
- $l$ = number of leaves
- $h$ = height = length of the longest path

Example: the tree of $x^8 = 10101100$ has $c = 6$, $n = 8$, $l = 3$, $h = 2$. [parsing tree figure]

All sequences in $U_{x^n}$ yield the same tree: the type is $T$ (or $(T, n)$), a combinatorial characterization, not explicitly based on counts.

11 Size of the universal type class

Phrases coming from the sub-trees $T_0$ and $T_1$ have no prefix-relation constraints between them, and can be interleaved arbitrarily. Let $U_T$ = the type class associated with a tree $T$. Then
$|U_T| = \binom{c-1}{c_0}\, |U_{T_0}|\, |U_{T_1}|$
(note: $c - 1 = c_0 + c_1$; cases where a sub-tree is missing are easily accounted for: deficient trees).

Unroll the recursion and telescope. Let $c_{p_i}$ = size of the sub-tree rooted at $p_i$. Then
$|U_T| = \frac{(c-1)!}{\prod_{i>0} c_{p_i}}$.
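The closed form is easy to evaluate from the phrase set, since $c_{p_i}$ is just the number of phrases having $p_i$ as a prefix. A sketch (integer division is exact because the product always divides $(c-1)!$):

    from math import factorial

    def universal_type_size(phrases):
        # |U_T| = (c-1)! / prod_{i>0} c_{p_i}; the root lambda is excluded
        # from `phrases`, so c - 1 = len(phrases)
        denom = 1
        for p in phrases:
            denom *= sum(q.startswith(p) for q in phrases)  # sub-tree size c_p
        return factorial(len(phrases)) // denom

    print(universal_type_size({'1', '0', '10', '11', '00'}))  # 20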

12 Size of the universal type class: example

$x^8 = 10101100$, $c = 6$; $c_0 = 2$, $c_{00} = 1$, $c_1 = 3$, $c_{10} = c_{11} = 1$, so
$|U_T| = \frac{5!}{2 \cdot 3} = 20$.

The 20 members, as phrase orderings:
0, 00, 1, 10, 11    0, 00, 1, 11, 10
0, 1, 00, 10, 11    0, 1, 00, 11, 10
0, 1, 10, 00, 11    0, 1, 11, 00, 10
0, 1, 10, 11, 00    0, 1, 11, 10, 00
1, 0, 00, 10, 11    1, 0, 00, 11, 10
1, 0, 10, 00, 11    1, 0, 11, 00, 10
1, 0, 10, 11, 00    1, 0, 11, 10, 00
1, 10, 0, 00, 11    1, 11, 0, 00, 10
1, 10, 0, 11, 00    1, 11, 0, 10, 00
1, 10, 11, 0, 00    1, 11, 10, 0, 00
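The count can be cross-checked by brute force: a phrase ordering is feasible exactly when each phrase appears after its parent (its longest proper prefix in the set). A sketch:

    from itertools import permutations

    phrases = ['1', '0', '10', '11', '00']

    def parent(p):
        # longest proper prefix of p that is itself a phrase ('' = the root)
        for j in range(len(p) - 1, 0, -1):
            if p[:j] in phrases:
                return p[:j]
        return ''

    valid = [order for order in permutations(phrases)
             if all(parent(p) == '' or order.index(parent(p)) < i
                    for i, p in enumerate(order))]
    print(len(valid))          # 20 = 5!/(2*3)
    print(''.join(valid[0]))   # '10101100', one member of the class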

13 Asymptotic size of a universal type class

Theorem. The size of $U_{x^n}$ satisfies $\frac{1}{n}\big( \log |U_{x^n}| - L_{\mathrm{LZ}}(x^n) \big) \to 0$ as $n \to \infty$.

This is analogous to conventional types, where $\frac{1}{n} \log |T_{\mathcal{P}}(x^n)| - \hat{H}(x^n) \to 0$ as $n \to \infty$, with $L_{\mathrm{LZ}}(x^n) \approx c \log c$ in lieu of the empirical entropy $\hat{H}(x^n)$ (the entropy of the ML distribution).

14 Asymptotic size of a universal type class (zoom-in)

Theorem. The size of $U_T$ satisfies
$(1 - \beta - o(1))\, c \log c \;\le\; \log |U_T| \;\le\; (1 - \eta - o(1))\, c \log c$,
where $0 \le \beta, \eta \le 1$, and $\beta$ is bounded away from 0 iff $\eta$ is bounded away from 0 iff $\frac{\log n}{\log c}$ is bounded away from 1.

Notes:
- We have $1 \le \frac{\log n}{\log c} \le 2$ (up to lower-order terms).
- $\frac{\log n}{\log c}$ bounded away from 1 $\iff$ $c \log c = o(n)$ $\iff$ $\frac{1}{n} L_{\mathrm{LZ}} \to 0$.
- In more detail: $\beta = \big(1 - \frac{l}{c}\big)\big(\frac{\log n}{\log c} - 1\big)$ and $\eta = \frac{(h+1)\log(h+1)}{c \log c}$.

15 Asymptotic size of the universal type: proof outline

$|U_T| = \frac{(c-1)!}{\prod_{i>0} c_{p_i}} = \frac{(c-1)!}{D}$; we need to estimate $D = \prod_{i>0} c_{p_i}$ (hard?).

Let's estimate the sum instead: $S = \sum_{i>0} c_{p_i} = n$. Easy!

If $p_i$ is a leaf, then $c_{p_i} = 1$. Rewrite the sum and product over internal (non-root) nodes only:
$\sum_{c_{p_i} > 1} c_{p_i} = n - l$, $\qquad D = \prod_{c_{p_i} > 1} c_{p_i}$.

$D$ is a product of $c - l - 1$ positive integers whose sum is known, hence
$D \le \Big( \frac{n - l}{c - l - 1} \Big)^{c - l - 1}$,
and the rest follows.

16 Examples

Skinny full tree: $c = 2k+1$, $n = k(k+1) = \frac{c^2 - 1}{4}$, $\frac{\log c}{\log n} \to \frac{1}{2}$, and
$\log |U_T| = \log\big(2^k\, k!\big) = \frac{1}{2}\, c \log c + O(c)$.

Fat tree (complete binary tree of depth $m$): $c = 2^{m+1} - 1$, $n = \sum_{i=1}^{m} i\, 2^i = (m-1)2^{m+1} + 2$, $\frac{\log c}{\log n} \to 1$, and
$L_{\mathrm{LZ}} = c \log c + O(c)$, $\quad \log |U_T| = c \log c + O(c)$.

Values $0 \le 1 - \beta < \frac{1}{2}$ are attained with deficient trees, e.g., the totally skinny tree (a single path): $\log |U_T| = 0$.
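Both trees can be built explicitly and fed to the size formula from slide 11. A sketch (my reading of "skinny full tree" is a spine $1, 11, \ldots, 1^k$ with one leaf hanging at each level, which matches $c = 2k+1$ and $n = k(k+1)$, but that shape is an assumption):

    from itertools import product
    from math import factorial, log2

    def type_size(phrases):
        # |U_T| = (c-1)! / prod of sub-tree sizes, as on slide 11
        denom = 1
        for p in phrases:
            denom *= sum(q.startswith(p) for q in phrases)
        return factorial(len(phrases)) // denom

    k, m = 8, 4
    skinny = {'1' * i for i in range(1, k + 1)} | {'1' * i + '0' for i in range(k)}
    fat = {''.join(t) for d in range(1, m + 1) for t in product('01', repeat=d)}
    for name, tree in (('skinny', skinny), ('fat', fat)):
        c = len(tree) + 1
        print(name, log2(type_size(tree)) / (c * log2(c)))
    # the ratios head toward 1/2 and 1 respectively, though convergence is slow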

17 Coding with the universal type

Enumerative coding [Cover 1973]: encode $x^n$ by describing its type, plus the index of the sequence within the type.

For universal types:
- description of the parsing tree: $2c + o(c)$ bits;
- index within the type: $\log |U_{x^n}|$ bits.

In some cases the total description length can be shorter than the LZ78 code length; e.g., for the skinny full tree, description length $= \frac{1}{2}\, c \log c + O(c)$. This is possible only when $\frac{1}{n} L_{\mathrm{LZ}} \to 0$.

18 How many universal types?

Number of types for sequences of length $n$ = number of LZ78 parsing dictionaries = number of binary trees of given path length $n$.

This is a known (hard?) open combinatorial problem. Let $b_{c,n}$ = number of trees with $c$ nodes and path length $n$, and $B(w, z) = \sum_{c,n} b_{c,n} w^n z^c$. The generating function satisfies $z B(w, wz)^2 = B(w, z) - 1$ (we need $B(w, 1)$).

Related: the Airy phenomenon, Brownian motion, the analysis of quicksort, ...
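The coefficients $b_{c,n}$ satisfy a direct recursion mirroring the generating-function equation: a nonempty tree is a root plus two sub-trees, and every non-root node gains one level of depth, so $n = n_L + n_R + (c - 1)$. A sketch that tabulates the counts (tails ignored, per the talk's standing assumption; the summation range over $c$ uses $c - 1 \le n$):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def b(c, n):
        # number of binary trees with c nodes and path length n
        if c == 0:
            return 1 if n == 0 else 0
        rest = n - (c - 1)          # depth shift: each non-root node adds 1
        if rest < 0:
            return 0
        return sum(b(cl, nl) * b(c - 1 - cl, rest - nl)
                   for cl in range(c) for nl in range(rest + 1))

    def num_types(n):
        # number of universal types = trees with path length exactly n
        return sum(b(c, n) for c in range(1, n + 2))

    print([num_types(n) for n in range(10)])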

19 How many universal types?

Using information-theoretic/combinatorial arguments:

Theorem. Let $\tau_n = \log(\text{number of types for sequences of length } n)$. Then $\tau_n = \frac{2n}{\log n}\,(1 + o(1))$.

Sloane's OEIS #A… : number of binary trees for given path length. G. Seroussi, On the number of t-ary trees with a given path length.

20 How many universal types? (general alphabet)

Theorem. Let $t = |A|$ and $\tau_n = \log_t(\text{number of types for sequences of length } n)$. Then $\tau_n = t\, H_2(t^{-1})\, \frac{n}{\log n}\,(1 + o(1))$, where $H_2$ denotes the binary entropy function (for $t = 2$ this recovers the previous theorem).

21 Bounds

Upper bound: $\tau_n \le \frac{2n}{\log n}\,(1 + o(1))$.
- A binary tree with $c$ nodes can be encoded using $2c$ bits.
- A tree with path length $n$ has at most $c = \frac{n}{\log n - O(\log\log n)}$ nodes, so $\frac{2n}{\log n - O(\log\log n)}$ bits are sufficient to (losslessly) encode all trees of path length $n$.

Lower bound: $\tau_n \ge \frac{2n}{\log n}\,(1 - o(1))$.
- Proof by building a large class of trees of the same path length: take $2^{m-1}$ nodes at depth $m-1$ and hang $l-1$ different sub-trees of size about $m$ below them; the $(l-1)!$ permutations give $(l-1)!$ different trees of the same path length.

22 Drawing a random sequence from the universal type

Procedure:
- Mark phrases as unused or used (initially all unused, except the root $\lambda$).
- Pick the next phrase by starting from the root and following branches to an unused phrase.
- At a bifurcation, choose a branch randomly, with probabilities proportional to the number of unused nodes in each sub-tree.

Example (tree of $x^8 = 10101100$): at the root, choose a branch with $\mathrm{Prob}(0) = \frac{2}{5}$, $\mathrm{Prob}(1) = \frac{3}{5}$. Choose 1: $P = \frac{3}{5}$, $y = 1$.

23 Drawing a random sequence from the universal type

(Same procedure.) At the root, choose a branch with $\mathrm{Prob}(0) = \frac{1}{2}$, $\mathrm{Prob}(1) = \frac{1}{2}$. Choose 1: $P = \frac{3}{5} \cdot \frac{1}{2}$; we reached a used node, so keep going.

24 Drawing a random sequence from the universal type

(Same procedure.) At node 1, choose a branch with $\mathrm{Prob}(0) = \frac{1}{2}$, $\mathrm{Prob}(1) = \frac{1}{2}$. Choose 0 (phrase 10): $P = \frac{3}{5} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{20}$, $y = 1\,10$.

25 Drawing a random sequence from the universal type

(Same procedure.) At the root, choose a branch with $\mathrm{Prob}(0) = \frac{2}{3}$, $\mathrm{Prob}(1) = \frac{1}{3}$. Choose 0 (phrase 0): $P = \frac{3}{20} \cdot \frac{2}{3} = \frac{1}{10}$, $y = 1\,10\,0$.

26 Drawing a random sequence from the universal type

(Same procedure.) At the root, choose a branch with $\mathrm{Prob}(0) = \frac{1}{2}$, $\mathrm{Prob}(1) = \frac{1}{2}$. Choose 0; the rest is deterministic (phrases 00 and then 11): $P = \frac{1}{10} \cdot \frac{1}{2} = \frac{1}{20} = \frac{1}{|U_T|}$, $y = 1\,10\,0\,00\,11 = 11000011$.
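The whole procedure fits in a few lines. A sketch (the walk stops at the first unused node, which is exactly the next phrase; the product of branch probabilities comes out to $1/|U_T|$ for every outcome):

    import random

    def draw_uniform(phrases, rng=random):
        # sample a member of U_T uniformly; `phrases` excludes the root lambda
        remaining, out = set(phrases), []

        def unused_in(prefix):
            return sum(1 for p in remaining if p.startswith(prefix))

        while remaining:
            node = ''
            while node not in remaining:             # walk until an unused node
                n0, n1 = unused_in(node + '0'), unused_in(node + '1')
                node += '0' if rng.random() * (n0 + n1) < n0 else '1'
            remaining.discard(node)                  # mark the phrase as used
            out.append(node)
        return ''.join(out)

    print(draw_uniform({'1', '0', '10', '11', '00'}))  # one of the 20 members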

27 Drawing a random sequence from the universal type

Proposition. The procedure outlined
(i) produces a sequence $y^n \in U_{x^n}$ with uniform probability;
(ii) requires an average of $\log |U_{x^n}| + O(1)$ random bits to do so (i.e., it is entropy-efficient).

Part (ii) is accomplished by letting an arithmetic decoder, fed with a Bernoulli($\frac{1}{2}$) sequence and the sequence of bifurcation probabilities, generate the random choices called for by the procedure [Han & Hoshi 97].

28 Universal simulation of individual sequences

Given an individual sequence $x^n$, we wish
1. to produce a sequence $y^n$ that is statistically similar to $x^n$;
2. to let $y^n$ have as much uncertainty (entropy) as possible, given the previous requirement.

Proposed solution: choose $y^n$ randomly and uniformly from $U_{x^n}$.
1. Similarity: convergence in variation of the empirical distributions of any order.
2. Asymptotic optimality, in the following sense: any simulation scheme satisfying (1) must have $H(y^n) \le \log |U_{x^n}| + n\varepsilon$ for any $\varepsilon > 0$ and almost all $x^n$ (a doubly-universal analogue of [Merhav & Weinberger 04]).

29 Example: Simulation of a binary texture

[figure] Simulation of a binary texture (… out of …, scanned with a Peano scan).

30 Example: Simulation of a binary texture

[figure] Simulation of a binary texture (… out of …, scanned with a Peano scan).

31 Additional results and research questions

Explicit enumeration of the universal type: given $x^n$, efficiently implement the map $\{1, 2, 3, \ldots, |U_{x^n}|\} \to U_{x^n}$ (see the sketch below).
- Provides an alternative way to draw uniformly from the type class. Trade-off: a fixed number of random bits with approximately uniform probability vs. a variable number of random bits with exactly uniform probability.
- Allows the implementation of an enumerative code.

Ongoing research: study universal types for other universal compression algorithms (e.g., Rissanen's Context). No problems in principle, but the combinatorial analysis seems more challenging than for LZ.
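For tiny examples the enumeration can be realized by brute force; an efficient version would instead unrank through the recursion $|U_T| = \binom{c-1}{c_0}|U_{T_0}||U_{T_1}|$ from slide 11. A lazy brute-force sketch:

    def enumerate_type_class(phrases):
        # yield every phrase ordering in which each phrase follows its parent
        phrases = set(phrases)

        def parent(p):
            # longest proper prefix of p that is itself a phrase ('' = root)
            for j in range(len(p) - 1, 0, -1):
                if p[:j] in phrases:
                    return p[:j]
            return ''

        def rec(used, seq):
            if len(seq) == len(phrases):
                yield ''.join(seq)
                return
            for p in sorted(phrases - used):
                if parent(p) == '' or parent(p) in used:
                    yield from rec(used | {p}, seq + [p])

        return rec(set(), [])

    members = list(enumerate_type_class({'1', '0', '10', '11', '00'}))
    print(len(members), members[0])   # 20 members; members[i] realizes index i+1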
