On Universal Types. Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA. University of Minnesota, September 14, 2004
1 On Universal Types Gadiel Seroussi Hewlett-Packard Laboratories Palo Alto, California, USA University of Minnesota, September 14, 2004
2 Types for Parametric Probability Distributions
$A$ = finite alphabet; a sequence or string $x^n = x_1 x_2 \cdots x_n \in A^n$ ($n \ge 0$); $x_j^k = x_j x_{j+1} \cdots x_k$ denotes a substring
Parametric family $\mathcal{P} = \{P_\Theta\}$ of distributions on $A^n$, $P_\Theta : A^n \to [0,1]$, $\Theta \in D \subseteq \mathbb{R}^K$
Type class of $x^n$ with respect to $\mathcal{P}$: $T^{\mathcal{P}}_{x^n} = \{\, y^n \in A^n : P_\Theta(y^n) = P_\Theta(x^n) \;\; \forall\, \Theta \in D \,\}$
We say that $y^n$ and $x^n$ are of the same type
3 Types for Parametric Probability Distributions (cont.)
Type class of $x^n$ with respect to $\mathcal{P}$: $T^{\mathcal{P}}_{x^n} = \{\, y^n \in A^n : P_\Theta(y^n) = P_\Theta(x^n) \;\; \forall\, \Theta \in D \,\}$
Example: $A = \{0,1\}$, $\mathcal{P} = \{\mathrm{Bernoulli}(\theta) : 0 \le \theta \le 1\}$ (1-parameter family): $x^n$ and $y^n$ are of the same type $\iff$ they have the same count of 1s
Example: $k$-th order binary Markov: $2^k$ parameters $P(x_i = 1 \mid x_{i-k}^{i-1})$; same type $\iff$ same $(k{+}1)$-st order joint empirical distributions (with appropriate initial-state assumptions)
The method of types is classical in information theory [Csiszár & Körner 81, Csiszár 98, ...]
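The Bernoulli case is easy to check mechanically. A minimal Python sketch (the helper name same_bernoulli_type is ours, not from the talk):

```python
def same_bernoulli_type(x, y):
    """Same type under the Bernoulli(theta) family
    <=> same length and same count of 1s."""
    return len(x) == len(y) and x.count("1") == y.count("1")

print(same_bernoulli_type("10101100", "00011110"))  # True: both have four 1s
```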
4 The Lempel-Ziv incremental parsing
Parse $x^n$ into phrases: $x^n = p_0 p_1 p_2 \cdots p_{c-1} t_x$, where $p_0 = \lambda$, the null string
$p_i$, $i > 0$, is the shortest substring of $x^n$, starting at the position following $p_{i-1}$, such that $p_i \ne p_j$ for all $j < i$ (all $p_i$'s are different)
Every phrase $p_i$, except $p_0 = \lambda$, is an extension of another phrase
$t_x$ is the remaining tail of $x^n$, where parsing is truncated due to the end of the sequence (it must be equal to one of the $p_i$'s)
The LZ78 universal data compression algorithm [Ziv & Lempel, 1978] has code length $L_{\mathrm{LZ}} = c \log c + O(c)$
Example: $x^8 = 10101100 = \lambda, 1, 0, 10, 11, 00$; $c = 6$, $t_x = \lambda$
Tree representation: nodes $\leftrightarrow$ phrases [tree figure: root $\lambda$ with children 0 and 1; node 0 has child 00; node 1 has children 10 and 11]
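A minimal Python sketch of the incremental parsing rule just described (the function name lz78_parse is ours); it returns both the phrase list and the tail:

```python
def lz78_parse(x):
    """LZ78 incremental parsing: returns (phrases, tail).

    phrases[0] is the null string lambda; each later phrase is the
    shortest prefix of the unparsed remainder not seen before."""
    phrases = [""]          # p_0 = lambda
    seen = {""}
    i = 0
    while i < len(x):
        j = i + 1
        while j <= len(x) and x[i:j] in seen:
            j += 1
        if j > len(x):      # ran off the end: the remainder is the tail t_x
            return phrases, x[i:]
        seen.add(x[i:j])
        phrases.append(x[i:j])
        i = j
    return phrases, ""      # empty tail

phrases, tail = lz78_parse("10101100")
print(phrases, repr(tail))  # ['', '1', '0', '10', '11', '00'] ''
```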
5 Universal type classes
$T_{x^n} = \{p_0, p_1, \ldots, p_{c-1}\}$ = set of phrases of $x^n$
Universal (LZ) type class of $x^n$: $U_{x^n} = \{\, y^n \in A^n : T_{y^n} = T_{x^n} \,\}$
All strings of length $n$ that produce the same set of phrases as $x^n$: same phrases, generally in a different order; the phrase permutation must respect the prefix relation
Example: $x^8 = 10101100 = 1, 0, 10, 11, 00$ and $y^8 = 00011110 = 0, 00, 1, 11, 10$ have the same phrase set
6 Why are universal type classes interesting?
$N(u^k, v^m)$ = number of (possibly overlapping) occurrences of $u^k$ in $v^m$
$\hat{P}^{(k)}_{x^n}(u^k) = N(u^k, x^n)/(n-k+1)$, $u^k \in A^k$: empirical joint distribution of order $k \le n$
Theorem. Let $x^n$ be an arbitrary sequence of length $n$, and $k$ a fixed positive integer. If $y^n \in U_{x^n}$, then, for all $u^k \in A^k$, $\hat{P}^{(k)}_{x^n}(u^k) - \hat{P}^{(k)}_{y^n}(u^k) \to 0$ as $n \to \infty$
(empirical distributions of any order $k$ converge in the variational sense)
Asymptotics: an infinite sequence of sequences $x^n$, not necessarily prefix-related
Proof arguments: any occurrence of $u^k$ is either contained in a phrase, spans a phrase boundary ($O(kc)$ places), or is contained in the tail ($\le |t_x|$ places); well-known LZ properties: $c \le n/(\log n - o(\log n))$, $|p_i| \le \sqrt{2n} + o(\sqrt{n})$
Convergence rate: $O(1/\log n)$
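The empirical distributions in the theorem are straightforward to compute. An illustrative Python sketch (helper names are ours) that evaluates the variational distance between the order-$k$ empirical distributions of the two same-type sequences from the earlier example:

```python
from collections import Counter

def empirical(x, k):
    """k-th order empirical distribution: relative frequencies of the
    (overlapping) k-grams of x, normalized by n - k + 1."""
    n = len(x)
    counts = Counter(x[i:i + k] for i in range(n - k + 1))
    return {u: c / (n - k + 1) for u, c in counts.items()}

def variational_distance(p, q):
    """Sum of |p(u) - q(u)| over all k-grams u."""
    keys = set(p) | set(q)
    return sum(abs(p.get(u, 0.0) - q.get(u, 0.0)) for u in keys)

# x and y are in the same universal type (same LZ78 phrase set).
px = empirical("10101100", 2)
py = empirical("00011110", 2)
# The theorem says this -> 0 as n grows; at n = 8 the gap is still large.
print(variational_distance(px, py))
```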
7 Finite memory probability assignments
A $k$-th order (finite-memory) probability assignment on $A^n$ is defined by a set of conditional distributions $Q_k(u_{k+1} \mid u^k)$, $u^{k+1} \in A^{k+1}$, and a distribution $Q_k(u^k)$ on the initial state, so that $Q_k(x^n) = Q_k(x^k) \prod_{i=k+1}^{n} Q_k(x_i \mid x_{i-k}^{i-1})$
In particular, $Q_k$ could be defined by the $k$-th order approximation of an ergodic measure
Corollary. Let $x^n$ and $y^n$ be sequences of the same universal type. Then, for any nonnegative integer $k$ and any $k$-th order probability assignment $Q_k$ such that $Q_k(x^n), Q_k(y^n) \ne 0$, we have $\frac{1}{n} \log \frac{Q_k(x^n)}{Q_k(y^n)} \to 0$ as $n \to \infty$
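As an illustration, a $k$-th order assignment of this form can be evaluated in a few lines of Python; the function and its dictionary-based interface are our own sketch, not from the talk:

```python
import math

def log_prob_markov(x, k, q_init, q_cond):
    """log2 Q_k(x^n) for a k-th order finite-memory assignment.

    q_init maps the initial k-gram x^k to a probability;
    q_cond maps (context of length k, next symbol) to a probability."""
    lp = math.log2(q_init[x[:k]])
    for i in range(k, len(x)):
        lp += math.log2(q_cond[(x[i - k:i], x[i])])
    return lp

# Degenerate k = 0 case: Bernoulli(1/2) assigns 2^{-n} to every x^n.
q_init = {"": 1.0}
q_cond = {("", "0"): 0.5, ("", "1"): 0.5}
print(log_prob_markov("10101100", 0, q_init, q_cond))  # -8.0
```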
8 Conventional vs. Universal types
Sequences of the same conventional type $T^{\mathcal{P}}_{x^n}$: have identical empirical distributions of a given structure; are assigned identical probabilities by assignments from a given parametric class
Sequences of the same universal type $U_{x^n}$: have converging empirical distributions of any finite order; are assigned probabilities whose normalized logs converge, for finite-memory assignments of any finite order
Trade-off: degree of indistinguishability vs. universality
9 Other things we like to do with type classes
Ask how many sequences are in the type class of $x^n$ (conventional: $2^{n\hat{H}(x^n)+o(n)}$)
Ask how many types there are for a given sequence length $n$ (conventional: $\sim n^K$)
Draw a sequence at random with uniform probability from the class
Enumerate the type class
Use the class for (enumerative) coding and universal simulation
10 The tree is the type (almost)
Assumptions: for simplicity, $A = \{0,1\}$ (everything carries over to $|A| > 2$); we ignore the effects of tails (a nuisance that does not affect the main results); when in doubt, $\log$ is to base 2
Parsing tree $T$: $c$ = number of nodes; $n = \sum_i |p_i|$ = path length of $T$; $l$ = number of leaves; $h$ = height = length of the longest path
[Tree figure: the running example, with $c = 6$, $n = 8$, $l = 3$, $h = 2$]
All sequences in $U_{x^n}$ yield the same tree: the type is $T$ (or $(T, n)$), a combinatorial characterization, not explicitly based on counts
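These tree parameters can be read off the phrase set directly. A small Python sketch (our own helper name) recovers $(c, n, l, h)$ for the running example:

```python
def tree_stats(phrases):
    """c = nodes, n = path length, l = leaves, h = height of the
    parsing tree whose nodes are the phrases (binary-string labels)."""
    nodes = set(phrases)
    c = len(nodes)
    n = sum(len(p) for p in nodes)
    internal = {p[:-1] for p in nodes if p}   # parents of non-root nodes
    l = len(nodes - internal)                 # nodes that are nobody's parent
    h = max(len(p) for p in nodes)
    return c, n, l, h

print(tree_stats(["", "1", "0", "10", "11", "00"]))  # (6, 8, 3, 2)
```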
11 Size of the universal type class
Phrases coming from the sub-trees $T_0$ and $T_1$ have no prefix-relation constraints between them, and can be interleaved arbitrarily
Let $U_T$ = the type class associated with a tree $T$. Then $|U_T| = |U_{T_0}|\,|U_{T_1}| \binom{c-1}{c_0}$
Note: $c - 1 = c_0 + c_1$; cases where a sub-tree is missing are easily accounted for (deficient trees)
Unroll the recursion and telescope. Let $c_{p_i}$ = size of the sub-tree rooted at $p_i$. Then $|U_T| = \dfrac{(c-1)!}{\prod_{i>0} c_{p_i}}$
12 Size of the universal type class: example
$x^8 = 10101100$, $c = 6$; $c_0 = 2$, $c_{00} = 1$, $c_1 = 3$, $c_{10} = c_{11} = 1$; $|U_T| = \dfrac{5!}{2 \cdot 3} = 20$
0, 00, 1, 10, 11 | 0, 00, 1, 11, 10 | 0, 1, 00, 10, 11 | 0, 1, 00, 11, 10 | 0, 1, 10, 00, 11 | 0, 1, 11, 00, 10 | 0, 1, 10, 11, 00 | 0, 1, 11, 10, 00 | 1, 0, 00, 10, 11 | 1, 0, 00, 11, 10 | 1, 0, 10, 00, 11 | 1, 0, 11, 00, 10 | 1, 0, 10, 11, 00 | 1, 0, 11, 10, 00 | 1, 10, 0, 00, 11 | 1, 11, 0, 00, 10 | 1, 10, 0, 11, 00 | 1, 11, 0, 10, 00 | 1, 10, 11, 0, 00 | 1, 11, 10, 0, 00
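The closed-form count is easy to evaluate programmatically. This Python sketch (our own representation: a children map keyed by phrase) computes $|U_T|$ from the sub-tree sizes and reproduces the value 20 for the example tree:

```python
from math import factorial

def type_class_size(children, root=""):
    """|U_T| = (c-1)! / prod over non-root nodes of sub-tree sizes.
    `children` maps each node to the list of its children."""
    sizes = {}
    def subtree(v):
        sizes[v] = 1 + sum(subtree(w) for w in children.get(v, []))
        return sizes[v]
    c = subtree(root)
    denom = 1
    for v, s in sizes.items():
        if v != root:
            denom *= s
    return factorial(c - 1) // denom

# Tree for the running example x^8 = 10101100:
T = {"": ["0", "1"], "0": ["00"], "1": ["10", "11"]}
print(type_class_size(T))  # 20
```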
13 Asymptotic size of a universal type class
Theorem. The size of $U_{x^n}$ satisfies $\frac{1}{n}\left(\log |U_{x^n}| - L_{\mathrm{LZ}}(x^n)\right) \to 0$ as $n \to \infty$
Analogous to conventional types, where $\frac{1}{n}\log |T^{\mathcal{P}}_{x^n}| - \hat{H}(x^n) \to 0$ as $n \to \infty$, with $L_{\mathrm{LZ}}(x^n) \approx c \log c$ in lieu of the empirical entropy $\hat{H}(x^n)$ (entropy of the ML distribution)
14 Asymptotic size of a universal type class (zoom-in)
Theorem. The size of $U_T$ satisfies $(1 - \beta - o(1))\, c \log c \;\le\; \log |U_T| \;\le\; (1 - \eta - o(1))\, c \log c$, where $0 \le \beta, \eta \le 1$, and $\beta$ is bounded away from 0 iff $\eta$ is bounded away from 0 iff $\frac{\log n}{\log c}$ is bounded away from 1
Notes: we have $1 \le \frac{\log n}{\log c} \le 2$ (up to lower-order terms); $\frac{\log n}{\log c}$ bounded away from 1 $\iff$ $c \log c = o(n)$ $\iff$ $\frac{1}{n} L_{\mathrm{LZ}} \to 0$
More detail: $\beta = \left(1 - \frac{l}{c}\right)\left(\frac{\log n}{\log c} - 1\right)$, $\eta = \frac{(h+1)\log(h+1)}{c \log c}$
15 Asymptotic size of the universal type: proof outline
$|U_T| = \frac{(c-1)!}{\prod_{i>0} c_{p_i}} = \frac{(c-1)!}{D}$; we need to estimate $D = \prod_{i>0} c_{p_i}$ (hard?)
Let's estimate the sum instead: $S = \sum_{i>0} c_{p_i} = n$ — easy!
If $p_i$ is a leaf, then $c_{p_i} = 1$. Rewrite the sum and the product over non-leaf nodes: $\sum_{c_{p_i}>1} c_{p_i} = n - l$; $D = \prod_{c_{p_i}>1} c_{p_i}$
$D$ is a product of positive integers whose sum is known, so $D \le \left(\frac{n-l}{c-l-1}\right)^{c-l-1}$; the rest follows
16 Examples
Skinny full tree: $c = 2k+1$, $n = k(k+1) = \frac{c^2-1}{4}$, $\frac{\log c}{\log n} \to \frac{1}{2}$, $\log |U_T| = \log\left(2^k\, k!\right) = \frac{1}{2}\, c \log c + O(c)$
Fat (complete) tree: $c = 2^{m+1} - 1$, $n = \sum_{i=1}^{m} i\, 2^i = (m-1)2^{m+1} + 2$, $\frac{\log c}{\log n} \to 1$, $L_{\mathrm{LZ}} = c \log c + O(c)$, $\log |U_T| = c \log c + O(c)$
Values $0 \le 1 - \beta \le \frac{1}{2}$ are attained with deficient trees; e.g., for the totally skinny tree (a single path), $\log |U_T| = 0$
17 Coding with the universal type
Enumerative coding [Cover 1973]: encode $x^n$ by describing its type + the index of the sequence within the type
For universal types: description of the parsing tree takes $2c + o(c)$ bits; the index within the type takes $\lceil \log |U_{x^n}| \rceil$ bits
In some cases the total description length can be shorter than the LZ78 code length, e.g., for the skinny full tree, description length $= \frac{1}{2}\, c \log c + O(c)$
This is possible only when $\frac{1}{n} L_{\mathrm{LZ}} \to 0$
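For a rough feel of the comparison, the enumerative description length can be tallied as tree bits plus index bits; a toy sketch of our own, ignoring the $o(c)$ term in the tree description:

```python
import math

def enumerative_length_bits(c, type_class_size):
    """~2c bits for the parsing tree + ceil(log2 |U_T|) bits for the index."""
    return 2 * c + math.ceil(math.log2(type_class_size))

# Running example: c = 6, |U_T| = 20  ->  12 + 5 = 17 bits
print(enumerative_length_bits(6, 20))
```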
18 How many universal types?
Number of types for sequences of length $n$ = number of LZ78 parsing dictionaries = number of binary trees of given path length $n$
A known (hard?) open combinatorial problem
$b_{c,n}$ = number of trees with $c$ nodes and path length $n$; the generating function $B(w, z) = \sum_{c,n} b_{c,n} w^n z^c$ satisfies $z B(w, wz)^2 = B(w, z) - 1$ (we need $B(w, 1)$)
Airy phenomenon, Brownian motion, analysis of quicksort, ...
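The decomposition behind the generating-function identity (a tree is a root plus two sub-trees, each of whose nodes gains one level of depth) gives a direct dynamic program for $b_{c,n}$; the recursion, function name, and demo below are our own sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def b(c, n):
    """Number of binary trees with c nodes and path length n, from the
    decomposition underlying z*B(w,wz)^2 = B(w,z) - 1."""
    if c == 0:
        return 1 if n == 0 else 0       # the empty tree
    total = 0
    for cl in range(c):                 # left sub-tree size; right gets c-1-cl
        cr = c - 1 - cl
        # attaching below the root adds c-1 to the combined path length
        for nl in range(n - (c - 1) + 1):
            total += b(cl, nl) * b(cr, n - (c - 1) - nl)
    return total

# Number of types for n = 8: trees of path length 8, any node count.
print(sum(b(c, 8) for c in range(1, 9)))
```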
19 How many universal types?
Using information-theoretic/combinatorial arguments:
Theorem. Let $\tau_n = \log(\text{number of types for sequences of length } n)$. Then $\tau_n = \frac{2n}{\log n}(1 + o(1))$
Sloane's OEIS #A… : number of binary trees for a given path length
G. Seroussi, On the number of t-ary trees with a given path length
20 How many universal types? (general alphabet)
Theorem. Let $t = |A|$ and $\tau_n = \log_t(\text{number of types for sequences of length } n)$. Then $\tau_n = t\, H_2(t^{-1})\, \frac{n}{\log n}(1 + o(1))$
21 Bounds
Upper bound: $\tau_n \le \frac{2n}{\log n}(1 + o(1))$
A binary tree with $c$ nodes can be encoded using $2c$ bits; a tree with path length $n$ has at most $c = \frac{n}{\log n - O(\log\log n)}$ nodes, so $\frac{2n}{\log n - O(\log\log n)}$ bits are sufficient to (losslessly) encode all trees of path length $n$
Lower bound: $\tau_n \ge \frac{2n}{\log n}(1 - o(1))$
Proof by building a large class of trees of the same path length: take $2^{m-1}$ nodes at depth $m$ and attach $l - 1$ different sub-trees of size $\approx m$ at these nodes; the $(l-1)!$ permutations of the sub-trees yield $(l-1)!$ different trees of the same path length
22 Drawing a random sequence from the universal type
Mark phrases as unused or used (initially all unused except the root $\lambda$)
Pick the next phrase by starting from the root and following branches to an unused phrase
When at a bifurcation, choose a branch randomly with probabilities proportional to the number of unused nodes in each sub-tree
Example [tree figure]: at the root, choose a branch with $\mathrm{Prob}(0) = \frac{2}{5}$, $\mathrm{Prob}(1) = \frac{3}{5}$; choose 1; $P = \frac{3}{5}$; $y = 1$
23 (cont.) At the root: choose a branch with $\mathrm{Prob}(0) = \frac{1}{2}$, $\mathrm{Prob}(1) = \frac{1}{2}$; choose 1; $P = \frac{3}{5} \cdot \frac{1}{2}$; reached a used node, keep going
24 (cont.) At node 1: choose a branch with $\mathrm{Prob}(0) = \frac{1}{2}$, $\mathrm{Prob}(1) = \frac{1}{2}$; choose 0; $P = \frac{3}{5} \cdot \frac{1}{2} \cdot \frac{1}{2}$; $y = 1\,10$
25 (cont.) At the root: choose a branch with $\mathrm{Prob}(0) = \frac{2}{3}$, $\mathrm{Prob}(1) = \frac{1}{3}$; choose 0; $P = \frac{3}{5} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{2}{3}$; $y = 1\,10\,0$
26 (cont.) At the root: choose a branch with $\mathrm{Prob}(0) = \frac{1}{2}$, $\mathrm{Prob}(1) = \frac{1}{2}$; choose 0 (the rest is deterministic); $P = \frac{3}{5} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{1}{2} = \frac{1}{20} = \frac{1}{|U_T|}$; $y = 1\,10\,0\,00\,11$
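Putting the procedure together, here is a Python sketch (our own data representation: a children map keyed by phrase) that performs the walk, tracks the accumulated probability exactly, and returns a uniformly drawn phrase ordering; on the example tree the returned probability is always $\frac{1}{20} = \frac{1}{|U_T|}$, matching the slides:

```python
import random
from fractions import Fraction

def draw_from_type(children, root="", rng=random):
    """Draw a parse uniformly from U_T: repeatedly walk from the root,
    branching with probability proportional to the number of unused
    nodes in each sub-tree, until an unused node is reached."""
    unused = {}                       # unused[v] = unused nodes under v (incl. v)
    def count(v):
        unused[v] = 1 + sum(count(w) for w in children.get(v, []))
        return unused[v]
    count(root)
    used = {root}                     # p_0 = lambda starts out used
    unused[root] -= 1
    phrases, prob = [], Fraction(1)
    while unused[root] > 0:
        path, v = [root], root
        while True:
            kids = [w for w in children.get(v, []) if unused[w] > 0]
            w = rng.choices(kids, weights=[unused[k] for k in kids])[0]
            if len(kids) > 1:         # only true bifurcations consume randomness
                prob *= Fraction(unused[w], sum(unused[k] for k in kids))
            path.append(w)
            if w not in used:         # found the next phrase
                used.add(w)
                phrases.append(w)
                break
            v = w
        for u in path:                # one fewer unused node below each
            unused[u] -= 1
    return phrases, prob              # prob == 1 / |U_T|

T = {"": ["0", "1"], "0": ["00"], "1": ["10", "11"]}
print(draw_from_type(T))  # e.g. (['1', '10', '0', '00', '11'], Fraction(1, 20))
```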
27 Drawing a random sequence from the universal type
Proposition. The procedure outlined (i) produces a sequence $y^n \in U_{x^n}$ with uniform probability, and (ii) requires an average of $\log |U_{x^n}| + O(1)$ random bits to do so (i.e., it is entropy-efficient)
(ii) is accomplished by letting an arithmetic decoder, fed with a Bernoulli($\frac{1}{2}$) sequence and the sequence of bifurcation probabilities, generate the random choices called for by the procedure [Han-Hoshi 97]
28 Universal simulation of individual sequences
Given an individual sequence $x^n$, we wish: 1. to produce a sequence $y^n$ that is statistically similar to $x^n$; 2. to let $y^n$ have as much uncertainty (entropy) as possible given the previous requirement
Proposed solution: choose $y^n$ randomly and uniformly from $U_{x^n}$
1. Similarity = convergence in variation of empirical distributions of any order
2. Asymptotically optimal in the following sense: any simulation scheme satisfying (1) must have $H(y^n) \le \log |U_{x^n}| + n\epsilon$ for any $\epsilon > 0$ and almost all $x^n$ (a doubly-universal analogue of [Merhav-Weinberger 04])
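An end-to-end simulation then chains the earlier sketches: parse $x^n$, rebuild the parsing tree, and draw $y^n$ uniformly from the type (this reuses lz78_parse and draw_from_type as defined above; the tail is ignored, per the standing assumption):

```python
# End-to-end sketch reusing lz78_parse and draw_from_type from above.
x = "10101100"
phrases, _ = lz78_parse(x)
children = {}
for p in phrases:
    if p:                                   # skip the root lambda
        children.setdefault(p[:-1], []).append(p)
order, prob = draw_from_type(children)
y = "".join(order)                          # statistically similar to x
print(y, prob)                              # e.g. "11000011" Fraction(1, 20)
```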
29 Example: simulation of a binary texture [figure: binary texture and its simulation, scanned with a Peano scan; image dimensions not recovered]
30 Example: simulation of a binary texture (cont.) [figure: binary texture and its simulation, scanned with a Peano scan; image dimensions not recovered]
31 Additional results and research questions
Explicit enumeration of the universal type: given $x^n$, efficiently implement the map $\{1, 2, 3, \ldots, |U_{x^n}|\} \to U_{x^n}$
This provides an alternative way to draw uniformly from the type class; trade-off: a fixed number of random bits with approximately uniform probability vs. a variable number of random bits with exactly uniform probability; it also allows implementation of the enumerative code (a sketch follows below)
Ongoing research: study universal types for other universal compression algorithms (e.g., Rissanen's Context); no problems in principle, but the combinatorial analysis seems more challenging than for LZ
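One direct (non-optimized) way to realize the enumeration map is to unrank the phrase orderings, using the sub-tree-size formula to count completions at each step; a Python sketch under our own naming, with nodes labeled by their binary strings so that the parent of v is v[:-1]:

```python
from math import factorial

def num_extensions(nodes):
    """Number of valid phrase orderings of a remaining forest of nodes
    (binary-string labels): m! / prod of remaining-sub-tree sizes."""
    size = {v: 1 for v in nodes}
    for v in sorted(nodes, key=len, reverse=True):  # children before parents
        if v[:-1] in size:
            size[v[:-1]] += size[v]
    denom = 1
    for s in size.values():
        denom *= s
    return factorial(len(nodes)) // denom

def unrank(nodes, r):
    """Map r in {0, ..., |U_T|-1} to the r-th phrase ordering, trying
    candidates in lexicographic order at each step."""
    nodes, order = set(nodes), []
    while nodes:
        # available phrases: unused nodes whose parent is already used
        for a in sorted(v for v in nodes if v[:-1] not in nodes):
            n_a = num_extensions(nodes - {a})   # completions starting with a
            if r < n_a:
                order.append(a)
                nodes.remove(a)
                break
            r -= n_a
    return order

phrases = {"0", "00", "1", "10", "11"}   # running example, root omitted
print(unrank(phrases, 0), unrank(phrases, 19))
# ['0', '00', '1', '10', '11'] and ['1', '11', '10', '0', '00']
```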
More information