A One-to-One Code and Its Anti-Redundancy

A One-to-One Code and Its Anti-Redundancy W. Szpankowski Department of Computer Science, Purdue University July 4, 2005 This research is supported by NSF, NSA and NIH.

Outline of the Talk
1. Prefix Codes
2. Redundancy
3. Our One-to-One Code
4. Asymptotic Results for Anti-Redundancy
5. Sketch of Proof
   - Sums involving the floor function
   - Saddle point method
   - Sums and sequences modulo 1

Some Definitions

A block code $C_n : \mathcal{A}^n \to \{0,1\}^*$ is an injective mapping from the set $\mathcal{A}^n$ of all sequences $x^n = x_1 \ldots x_n$ of length $n$ over the alphabet $\mathcal{A}$ to the set $\{0,1\}^*$ of binary sequences.

For a given source $P$, the pointwise redundancy and the average redundancy are defined, respectively, as

$$R_n(C_n, P; x^n) = L(C_n, x^n) + \lg P(x^n),$$
$$\bar{R}_n(C_n, P) = \mathbf{E}_{X^n}\left[R_n(C_n, P; X^n)\right] = \mathbf{E}[L(C_n, X^n)] - H_n(P),$$

where $L(C_n, x^n)$ is the code length, $H_n(P) = -\sum_{x^n} P(x^n)\lg P(x^n)$ is the source entropy, and $\mathbf{E}$ denotes the expectation.

Prefix Codes

Usually we deal with prefix codes, defined as codes in which no codeword is a prefix of another codeword. Prefix codes satisfy Kraft's inequality: $\sum_{x^n} 2^{-L(x^n)} \le 1$.

Shannon Lower Bound: For any prefix code, $\mathbf{E}[L(C_n, X^n)] \ge H_n(P)$.

Indeed, let $K = \sum_{x^n} 2^{-L(x^n)} \le 1$ (Kraft). Then

$$\mathbf{E}[L(C_n, X^n)] - H_n(P) = \sum_{x^n \in \mathcal{A}^n} P(x^n)L(x^n) + \sum_{x^n \in \mathcal{A}^n} P(x^n)\log P(x^n) = \sum_{x^n \in \mathcal{A}^n} P(x^n)\log\frac{P(x^n)}{2^{-L(x^n)}/K} - \log K \ge 0,$$

since the first term is a divergence and cannot be negative, and $-\log K \ge 0$ (or use $\log x \le x - 1$ for $0 < x \le 1$).
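As a quick numerical illustration (my own sketch, not part of the original slides), the following Python snippet builds Shannon code lengths $\lceil -\log_2 P(x^n)\rceil$ for a binary memoryless source and checks both Kraft's inequality and the Shannon lower bound; the values of p and n are arbitrary choices.

```python
import math
from itertools import product

def check_kraft_and_entropy(p=0.3, n=8):
    """Shannon code lengths L(x^n) = ceil(-log2 P(x^n)) for a binary
    memoryless source: verify Kraft's inequality and E[L] >= H_n(P)."""
    probs = [p ** x.count(0) * (1 - p) ** (n - x.count(0))
             for x in product((0, 1), repeat=n)]
    lengths = [math.ceil(-math.log2(P)) for P in probs]
    kraft = sum(2.0 ** (-L) for L in lengths)           # should be <= 1
    avg_len = sum(P * L for P, L in zip(probs, lengths))
    entropy = -sum(P * math.log2(P) for P in probs)     # H_n(P) = n*h(p)
    print(f"Kraft sum = {kraft:.6f}, E[L] = {avg_len:.6f}, H_n(P) = {entropy:.6f}")

check_kraft_and_entropy()
```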

Redundancy for Prefix Codes

Throughout this talk we assume that the source $P$ is given and is binary memoryless with probability $p$ of transmitting a 0. That is, $P(x^n) = p^k(1-p)^{n-k}$, where $k$ is the number of 0's. Let

$$\alpha = \log_2\frac{1-p}{p}, \qquad \beta = \log_2\frac{1}{1-p},$$

and let $\langle x\rangle = x - \lfloor x\rfloor$ denote the fractional part of $x$.

Redundancy of the Shannon-Fano Code:

$$\bar{R}_n^{SF} = \begin{cases} \frac{1}{2} + o(1) & \alpha \text{ irrational}, \\[2pt] \frac{1}{2} - \frac{1}{M}\left(\langle Mn\beta\rangle - \frac{1}{2}\right) + O(\rho^n) & \alpha = \frac{N}{M},\ \gcd(N,M)=1. \end{cases}$$

Redundancy of the Huffman Code:

$$\bar{R}_n^{H} = \begin{cases} \frac{3}{2} - \frac{1}{\ln 2} + o(1) \approx 0.057304 & \alpha \text{ irrational}, \\[2pt] \frac{3}{2} - \frac{1}{M}\left(\langle \beta M n\rangle - \frac{1}{2}\right) - \frac{1}{M\left(2^{1/M}-1\right)}\,2^{-\langle n\beta M\rangle/M} + O(\rho^n) & \alpha = \frac{N}{M}, \end{cases}$$

where $N, M$ are integers such that $\gcd(N,M)=1$ and $\rho < 1$.
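These oscillations are easy to observe numerically. The sketch below is my own illustration, not from the talk; it takes the "Shannon-Fano" code to mean the Shannon code with lengths $\lceil -\log_2 P(x^n)\rceil$, and the choices p = 1/π and p = 1/9 mirror the figures that follow. For p = 1/9 (rational α = 3) the exact average redundancy oscillates around 1/2, while for p = 1/π the oscillations damp out and the values approach 1/2 as n grows.

```python
import math

def shannon_fano_redundancy(p, n):
    """Exact average redundancy of the code with lengths ceil(-log2 P(x^n))
    for a binary memoryless source, P(x^n) = p^k q^(n-k) with k zeros."""
    q = 1.0 - p
    total = 0.0
    for k in range(n + 1):
        logP = k * math.log2(p) + (n - k) * math.log2(q)
        total += math.comb(n, k) * p**k * q**(n - k) * (math.ceil(-logP) + logP)
    return total

for p in (1 / math.pi, 1 / 9):   # irrational vs. rational alpha = log2((1-p)/p)
    vals = [f"{shannon_fano_redundancy(p, n):.4f}" for n in range(10, 71, 10)]
    print(f"p = {p:.4f}:", " ".join(vals))
```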

Oscillations

Figure 1: Shannon-Fano code redundancy.

Figure 2: Huffman's code redundancy versus block size $n$ for: (a) irrational $\alpha = \log_2((1-p)/p)$ with $p = 1/\pi$; (b) rational $\alpha = \log_2((1-p)/p)$ with $p = 1/9$.

One-to-One Codes

One-to-one codes are not prefix codes. In a one-to-one code a distinct codeword is assigned to each source symbol, and unique decodability is not required. Such codes are usually one-shot codes, and there is a designated end-of-message channel symbol.

Wyner in 1972 proved that $\bar{L} \le H(X)$, which was further improved by Alon and Orlitsky, who showed $\bar{L} \ge H(X) - \log_2(H(X)+1) - \log_2 e$.

Can we establish more precise bounds? Where are the oscillations observed in prefix codes?
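The following sketch (my own illustration, not from the talk) makes the comparison concrete: the optimal one-to-one code sorts the probabilities in nonincreasing order and gives the i-th most probable outcome the i-th shortest binary string, i.e. length ⌊log₂ i⌋ (the empty codeword is allowed); its average length is then compared with Wyner's upper bound H(X) and with the Alon-Orlitsky lower bound. The random 64-point distribution is an arbitrary test case.

```python
import math, random

def one_to_one_avg_length(probs):
    """Average length of the optimal one-to-one code: the i-th most
    probable outcome (i = 1, 2, ...) gets a codeword of length floor(log2 i)."""
    return sum(P * (i.bit_length() - 1)
               for i, P in enumerate(sorted(probs, reverse=True), start=1))

random.seed(1)
weights = [random.random() for _ in range(64)]
probs = [w / sum(weights) for w in weights]
H = -sum(P * math.log2(P) for P in probs)
L = one_to_one_avg_length(probs)
print(f"H(X) = {H:.4f}")
print(f"E[L] = {L:.4f}   (Wyner: E[L] <= H(X))")
print(f"lower bound = {H - math.log2(H + 1) - math.log2(math.e):.4f}   (Alon-Orlitsky)")
```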

Block One-to-One Codes

We consider a block one-to-one code for $x^n = x_1\ldots x_n \in \mathcal{A}^n$ generated by a memoryless source with $p$ the probability of generating a 0, and $q = 1-p$. We write $P(x^n) = p^k q^{n-k}$, where $k$ is the number of 0's. Throughout we assume $p \le q$.

We now list all $2^n$ probabilities in nonincreasing order and assign code lengths as follows:

probabilities: $q^n \ \ge\ p\,q^{n-1} \ \ge\ \cdots \ \ge\ p^n$
code lengths:  $\lfloor\log_2(1)\rfloor,\ \lfloor\log_2(2)\rfloor,\ \ldots,\ \lfloor\log_2(2^n)\rfloor$

that is, the $j$-th most probable string receives a codeword of length $\lfloor\log_2 j\rfloor$.

Average Code Length

There are $\binom{n}{k}$ equal probabilities $p^k q^{n-k}$. Define

$$A_k = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{k-1}, \qquad A_0 = 0.$$

Starting from position $A_k + 1$, the next $\binom{n}{k}$ probabilities $P(x^n)$ are the same. The average code length is

$$\bar{L}_n = \sum_{k=0}^{n} p^k q^{n-k} \sum_{j=A_k+1}^{A_{k+1}} \lfloor\log_2 j\rfloor = \sum_{k=0}^{n} p^k q^{n-k} \sum_{i=1}^{\binom{n}{k}} \lfloor\log_2(A_k + i)\rfloor.$$

Our goal is to estimate $\bar{L}_n$ asymptotically for large $n$.
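For moderate n the sum can be evaluated exactly. The sketch below (my own check, not from the talk) computes L̄_n both from the formula above and by brute-force enumeration of all 2ⁿ strings, which agree up to floating-point rounding; p and n are arbitrary small choices.

```python
import math
from itertools import product

def avg_len_formula(p, n):
    """L_n = sum_k p^k q^(n-k) * sum_{i=1..C(n,k)} floor(log2(A_k + i))."""
    q = 1.0 - p
    A, total = 0, 0.0
    for k in range(n + 1):
        block = sum((A + i).bit_length() - 1 for i in range(1, math.comb(n, k) + 1))
        total += p**k * q**(n - k) * block
        A += math.comb(n, k)
    return total

def avg_len_bruteforce(p, n):
    """Sort all 2^n string probabilities; the j-th gets length floor(log2 j)."""
    probs = sorted((p ** x.count(0) * (1 - p) ** (n - x.count(0))
                    for x in product((0, 1), repeat=n)), reverse=True)
    return sum(P * (j.bit_length() - 1) for j, P in enumerate(probs, start=1))

p, n = 0.3, 12
print(avg_len_formula(p, n), avg_len_bruteforce(p, n))   # the two values agree
```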

An Ugly Sum

To evaluate the inner part of the sum for $\bar{L}_n$ we apply the following identity (cf. Knuth, Ex. 1.2.4-42),

$$\sum_{j=1}^{N} a_j = N a_N - \sum_{j=1}^{N-1} j\,(a_{j+1} - a_j),$$

valid for any sequence $a_j$. With $a_j = \lfloor\log_2 j\rfloor$ (so that $a_{j+1} - a_j = 1$ exactly when $j+1$ is a power of 2) this yields the closed form

$$T(N) := \sum_{j=1}^{N}\lfloor\log_2 j\rfloor = (N+1)\lfloor\log_2 N\rfloor - 2^{\lfloor\log_2 N\rfloor+1} + 2, \qquad T(0) = 0,$$

and hence

$$\bar{L}_n = \sum_{k=0}^{n} p^k q^{n-k}\left[T(A_{k+1}) - T(A_k)\right].$$

Writing $\lfloor\log_2 A\rfloor = \log_2 A - \langle\log_2 A\rangle$ and $2^{\lfloor\log_2 A\rfloor} = A\,2^{-\langle\log_2 A\rangle}$, where $\langle x\rangle = x - \lfloor x\rfloor$ is the fractional part of $x$, this expresses $\bar{L}_n$ through sums, weighted by $p^k q^{n-k}$, of $\log_2 A_k$, of the fractional parts $\langle\log_2 A_k\rangle$, and of $A_k\,2^{-\langle\log_2 A_k\rangle}$.
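A small sketch (my own check, assuming the closed form stated above) verifies T(N) against a brute-force sum and recomputes L̄_n as Σ_k p^k q^(n−k)[T(A_{k+1}) − T(A_k)], reproducing the value obtained from the direct formula.

```python
import math

def T_closed(N):
    """T(N) = sum_{j=1..N} floor(log2 j) = (N+1)*floor(log2 N) - 2^(floor(log2 N)+1) + 2."""
    if N == 0:
        return 0
    f = N.bit_length() - 1                  # floor(log2 N), exact for integers
    return (N + 1) * f - 2 ** (f + 1) + 2

# check the closed form against the definition
assert all(T_closed(N) == sum(j.bit_length() - 1 for j in range(1, N + 1))
           for N in range(0, 2000))

def avg_len_via_T(p, n):
    """L_n = sum_k p^k q^(n-k) * [T(A_{k+1}) - T(A_k)]."""
    q = 1.0 - p
    A, total = 0, 0.0
    for k in range(n + 1):
        A_next = A + math.comb(n, k)
        total += p**k * q**(n - k) * (T_closed(A_next) - T_closed(A))
        A = A_next
    return total

print(avg_len_via_T(0.3, 12))   # same value as the direct computation above
```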

Main Result

Theorem. Consider a binary memoryless source and the one-to-one block code described above. Then for $p < \frac{1}{2}$

$$\bar{L}_n = nH(p) - \frac{1}{2}\log_2 n + C_p + F(n) + G(n) + o(1),$$

where $H(p) = -p\log_2 p - (1-p)\log_2(1-p)$, $C_p$ is an explicit constant depending only on $p$ (involving $\frac{1}{2\ln 2}$, $\log_2\frac{p}{1-2p}$ and $\sqrt{2\pi pq}$), and $G(n) = F(n) = 0$ if $\alpha = \log_2\frac{1-p}{p}$ is irrational. If $\alpha = N/M$ for some integers $M, N$ such that $\gcd(N,M) = 1$, then $G(n)$ and $F(n)$ are bounded oscillating functions of a complicated nature. For example, $F(n)$ is equal to

$$\frac{1}{M\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-x^2/2}\left(\left\langle M\left(n\beta - \log_2\frac{(1-2p)\sqrt{2\pi pqn}}{p} - \frac{x^2}{2\ln 2}\right)\right\rangle - \frac{1}{2}\right)dx,$$

where $\beta = \log_2\frac{1}{1-p}$. For $p = \frac{1}{2}$ we have, for every $n$, $\bar{L}_n = nH(1/2) - 2 + (n+2)\,2^{-n}$.

Oscillations

Figure 3: The constant part of the average anti-redundancy versus $n$ for: (a) irrational $\alpha = \log_2((1-p)/p)$ with $p = 1/\pi$; (b) rational $\alpha = \log_2((1-p)/p)$ with $p = 1/9$.

The anti-redundancy $\bar{R}_n = \bar{L}_n - nH(p)$ of our one-to-one code is therefore

$$\bar{R}_n = -\frac{1}{2}\log_2 n + O(1),$$

where the $O(1)$ term contains the oscillations.
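A numerical sketch (my own, reusing the exact computation of L̄_n from the previous slides) of this behaviour: the quantity L̄_n − nH(p) + ½log₂ n stays bounded as n grows, and for rational α it visibly oscillates.

```python
import math

def T_closed(N):
    """T(N) = sum_{j<=N} floor(log2 j) in closed form."""
    if N == 0:
        return 0
    f = N.bit_length() - 1
    return (N + 1) * f - 2 ** (f + 1) + 2

def anti_redundancy_plus_half_log(p, n):
    """L_n - n*H(p) + (1/2)*log2(n): should remain O(1) as n grows."""
    q = 1.0 - p
    H = -p * math.log2(p) - q * math.log2(q)
    A, L = 0, 0.0
    for k in range(n + 1):
        A_next = A + math.comb(n, k)
        L += p**k * q**(n - k) * (T_closed(A_next) - T_closed(A))
        A = A_next
    return L - n * H + 0.5 * math.log2(n)

for n in (16, 32, 64, 128, 256):
    print(n, round(anti_redundancy_plus_half_log(1 / 9, n), 4))
```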

Sketch of Proof

1. We only deal with the sum

$$S_n = \sum_{k} \binom{n}{k} p^k q^{n-k}\,\lfloor\log_2 A_k\rfloor = \sum_{k} \binom{n}{k} p^k q^{n-k}\log_2 A_k - \sum_{k} \binom{n}{k} p^k q^{n-k}\langle\log_2 A_k\rangle = a_n + b_n,$$

where

$$a_n = \sum_{k} \binom{n}{k} p^k q^{n-k}\log_2 A_k, \qquad b_n = -\sum_{k} \binom{n}{k} p^k q^{n-k}\langle\log_2 A_k\rangle.$$

Asymptotics of $A_n$

2. We need the saddle point approximation of $A_k$.

Lemma 1. For large $n$ and $p < 1/2$,

$$A_{np} = \frac{p}{1-2p}\,\frac{2^{nH(p)}}{\sqrt{2\pi np(1-p)}}\left(1 + O(n^{-1/2})\right).$$

More precisely, for an $\varepsilon > 0$ and $k = np + \Theta(n^{1/2+\varepsilon})$ we have, for some $\delta > 0$,

$$A_k = \frac{p}{1-2p}\,\frac{1}{\sqrt{2\pi np(1-p)}}\,\frac{1}{p^k(1-p)^{n-k}}\,\exp\!\left(-\frac{(k-np)^2}{2p(1-p)n}\right)\left(1 + O(n^{-\delta})\right).$$

Proof. Notice that

$$A_n(z) = \sum_{k=0}^{n+1} A_k z^k = \frac{z\left[(1+z)^n - 2^n z^{n+1}\right]}{1-z}$$

and apply the saddle point method to the Cauchy formula.
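The accuracy of Lemma 1 can be probed directly. The sketch below (my own check, leading term only) compares the exact partial binomial sum A_{np} = Σ_{j<np} C(n,j) with the saddle-point estimate; the ratio slowly approaches 1 as n grows. The values of p and n are arbitrary, chosen so that np is an integer.

```python
import math

def A_exact(n, k):
    """A_k = C(n,0) + ... + C(n,k-1)."""
    return sum(math.comb(n, j) for j in range(k))

def A_saddle(n, p):
    """Leading term of Lemma 1: p/(1-2p) * 2^(n*H(p)) / sqrt(2*pi*n*p*(1-p))."""
    q = 1.0 - p
    H = -p * math.log2(p) - q * math.log2(q)
    return p / (1 - 2 * p) * 2 ** (n * H) / math.sqrt(2 * math.pi * n * p * q)

p = 0.3
for n in (50, 100, 200, 400):
    k = round(n * p)                       # integer for these n
    print(n, A_exact(n, k) / A_saddle(n, p))
```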

Binomial Distribution Approximation

3. Using Stirling's approximation we find a good approximation for the binomial distribution.

Lemma 2. Let $p_n(k) = \binom{n}{k} p^k q^{n-k}$, where $q = 1-p$, be the binomial distribution. Then for $|k - pn| \le n^{1/2+\varepsilon}$ we have

$$p_n(k) = \frac{1}{\sqrt{2\pi p(1-p)n}}\,\exp\!\left(-\frac{(k-pn)^2}{2p(1-p)n}\right)\left(1 + O(n^{-\delta})\right)$$

uniformly as $n \to \infty$. Furthermore, for large $n$,

$$\sum_{k:\,|k-np| > \sqrt{np}\,n^{\varepsilon}} p_n(k) < 2 n^{\varepsilon} e^{-n^{2\varepsilon}/2}.$$
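Lemma 2 is the local central limit theorem for the binomial distribution; the sketch below (my own check) compares the exact probabilities with the Gaussian approximation near k = np for one arbitrary choice of n and p.

```python
import math

def binomial_pmf(n, k, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def gaussian_approx(n, k, p):
    """Local CLT approximation exp(-(k-np)^2/(2npq)) / sqrt(2*pi*n*p*q)."""
    q = 1 - p
    return math.exp(-(k - n * p) ** 2 / (2 * n * p * q)) / math.sqrt(2 * math.pi * n * p * q)

n, p = 1000, 0.3
for k in (285, 295, 300, 305, 315):
    print(k, f"{binomial_pmf(n, k, p):.6e}", f"{gaussian_approx(n, k, p):.6e}")
```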

Asymptotics of $a_n$

4. From the above lemmas we find

$$\log_2 A_k = \log_2 A_{np} + \alpha(k - np) - \frac{(k-np)^2}{2pqn\ln 2} + O(n^{-\delta}),$$

and then, since the linear term has zero mean and $\mathbf{E}[(K - np)^2] = npq$ for $K$ binomially distributed,

$$a_n = \log_2 A_{np} - \frac{1}{2\ln 2} + O(n^{-\delta}),$$

which is the desired result.

Returning to $b_n$

5. Recall we need the asymptotics of

$$b_n = -\sum_{k} \binom{n}{k} p^k q^{n-k}\,\langle\log_2 A_k\rangle.$$

From the previous lemmas we conclude that

$$\log_2 A_k = \alpha k + n\beta - \log_2(\omega\sqrt{n}) - \frac{(k-np)^2}{2pqn\ln 2} + O(n^{-\delta})$$

for some $\omega > 0$. Thus we need the asymptotics of the following sum:

$$\sum_{k} \binom{n}{k} p^k q^{n-k}\left\langle \alpha k + n\beta - \log_2(\omega\sqrt{n}) - \frac{(k-np)^2}{2pqn\ln 2}\right\rangle.$$

Final Lemma

6. To complete the proof we need the following lemma.

Lemma 3. Let $0 < p < 1$ be a fixed real number and let $f : [0,1] \to \mathbb{R}$ be a Riemann integrable function.

(i) If $\alpha$ is irrational, then

$$\lim_{n\to\infty} \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k}\, f\!\left(\left\langle k\alpha + y - \frac{(k-np)^2}{2pqn\ln 2}\right\rangle\right) = \int_0^1 f(t)\,dt,$$

where the convergence is uniform for all shifts $y \in \mathbb{R}$.

Continue... (ii) Suppose that α = N M is a rational number with integers N, M such that gcd(n, M) =. Then uniformly for all y R Z = p k ( p) n k f 0 f(t) dt + H M (y) D E kα + y (k np) 2 /(2pqn ln 2) where Z H M (y) := M 2π Z «f(t) dt 0 e x2 /2 dx *!+ M y x2 2ln2 is a periodic function with period M.