# Generalized Kraft Inequality and Arithmetic Coding

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 J. J. Rissanen Generalized Kraft Inequality and Arithmetic Coding Abstract: Algorithms for encoding and decoding finite strings over a finite alphabet are described. The coding operations are arithmetic involving rational numbers li as parameters such that Zi2"i 5 2". This coding technique requires no blocking, and the per-symbol length of the encoded string approaches the associated entropy within E. The coding speed is comparable to that of conventional coding methods. Introduction The optimal conventional instantaneous Huffman code [ for an independent information source with symbol probabilities (p,,..., p,) may be viewed as a solution to the integer programming problem: Find m natural numbers li as lengths of binary code words such that Cipil, is minimized under the constraining Kraft inequality I. Then the minimized sum B,p,l,approximates the Shannon- Boltzmann entropy function H (p,,..., p,) = -zip, log pi from above with an error no more than one. If a better approximation is required, blocking is needed; e.g., a kth extension of the alphabet must be encoded, which reduces the least upper bound of the error to / k [2]. We describe another coding technique in which m positive rational numbers I,,...,, are selected such that a generalized Kraft inequality holds: 2"i 5 2-, E > 0. (For E = 0, the rationality requirement would have to be relaxed.) The length of the code of a sequence s with ni occurrences of the ith symbol is given by the sum L (s) = i nili, which when minimized over the I, subject to the preceding inequality and divided by n = Bini is never larger than 98 where E is determined by the difference li - log(n/n,). This means that, if the strings are generated by an independent information source, the mean of n"l(s) approaches the entropy function from above within an error E + O( /n). The coding operations are arithmetic, and they resemble the concatenation operations in conventional coding in that the code of the string sak, where uk is a new symbol, is obtained by replacing only a few (always less than some fixed number) of the left-most bits of the code representing s by a new longer string. As a result, the coding operations can be accomplished with a speed comparable to that of conventional coding. The primary advantage of the resulting "arithmetic coding" is that there is no blocking needed even for the binary alphabet. In addition, for small alphabet sizes the size of tables to be stored and searched is smaller than the code word tables in conventional coding methods. For a binary alphabet a special choice of parameters I, and I, leads to a particularly simple coding technique that is closely related to one due to Fano [ 3. The coding method described here is reminiscent of the enumerative coding techniques of Schalkwijk [4] and Cover [5], and perhaps also of that due to Elias, as sketched in [2]. All of these are inferior, however, in one or more crucial respects, especially speed. Binary alphabet The coding algorithm is derived and studied for any finite alphabet, including the binary case. But because of the special nature of a binary alphabet, which admits certain simplifications, we study it separately. The importance of applications of the binary alphabet in data storage also warrants separate study. J. J. RISSANEN IBM.I. RES. DEVELOP.

2 Let I, and I, be two positive rational numbers such that I, 5 I, and when written in binary notation they have q binary digits in their fractional part. Further, for x, a rational number, 0 5 x <, with q binary digits in its fractional part, let e(x) be a rational-valued approximation to 2s such that ( x) = 2s+% '3 I > -6,0, and that e(x) has r > 0 binary digits in its fractional part. Clearly, the minimum size for r depends on E and q. For a choice of these, see Theorem 2. Let s denote a binary string and A an empty string. Write the concatenation of s and k, k = 0 or, as sk. Define the rational-valued function L by the recursion L(sk) = L(s) + Ik+,, L(A) =r. Write the numbers L(s) as L(s) = y (s) + x(s), (3) where y(s) is the integer part LL(s)d and x(s) is the fraction with fractional bits. The encoding function C transforms binary strings into nonnegative integers by the recursive formula () In addition, in the former case make C(s') = C(s) and in the latter case make C(s') = C(s) - 2"'"'e[x(s)], L(s') = L(s) - I,, to complete the recursion. As a practical matter, the test in (7) is done by truncat- ing C(s) to its r + left-most bits c(s) and writing - C(s) = 2"'"'p(s), where 5 'p(s) < 2 and a(s) = (C(s)l + I. Then the order inequality in (7) is equivalent to the lexicographic order inequality with priority on the first component. We next give a Kraft inequality type condition for the numbers, and I,, which guarantees a successful decoding. then D(C(s), L(s)) = s for all binary strings s. Because the code increases only when a has been appended to the string, C takes a binary string s,, described by the list (k,;.., k,), where ki denotes the position of its ith I from left, to the sum Proof We have to prove that s = s' I C( s) 2'") e[x(s)]. By (4) the implication from left to right is clear. To prove the converse is equivalent to proving that for all strings = 2'/'si"e[x(si)] Y(s,) = L(ki- i)i, + d2-i + r x(si) = (ki - i)i, + ii, + r - y(s,). (6) The code for s consists of the pair (C(s), L (s) ), where L (s) may be replaced by the length n = / SI of s and m, the number of 's in s. Decoding function D recovers s from right to left recursively as follows: If ~ ( s < ) 2Y'"'e[x(s)l, Then generate a 0, i.e., s = s'0; (7 Else, generate a, i.e., s = s'. and, because by ( ) I, > E and I, > (sj). (4) We prove ( 2) by induction. For the induction base C(A) = 0 < 2''0'e[x(~)], because L(0) = I, + r > 0. Assume then that (2) holds for all s with length Is/ 5 n. Case - st = SO By (4),(2), and (l4), in turn, and ( 2) holds for all strings s with length n + ending at MAY 976 ARITHMETIC CODING

3 Case 2 - s' = s l By (4), (2), and (3), in turn, C(s') = C(s) (SI) (SO) + (sl) 5 (2+i3 + 2' Again by ( 3) twice ~ ( \$3 ~ ~ ( ~ ) X 2-+"3~(s0), and we have < (,-(I+, + 2-b+' By ( ), finally, C(s') (S'O), which completes the induction and the proof of the theorem. Theorem 2 If s is a binary string of length n with m l's, then log C(s) 5 rn, + (n - m)l, + r + + ~ /3. (5) With E 5 el, E, 5 E + 2-' n, = log n" + ' n I, = log + E,, inequality ( )' holds and and n + E + 2-'+ O(i), where H(p) = p log p-' + ( - p ) log ( - p)" Proof By (l), (5), and (3), m q S ) 2r+r/3 2 2(ki-i)ll+i2 i= I The sum is clearly at its maximum when kt = n - m + i, and hence Numerical considerations, example, and special case We next study the addition in (9). From (4), (3), and (2, In cases of interest, where I, < and I, > 2, this gives It then follows that never more than r- L2d + 2 left-most bits of C (s) in (9) need be replaced in order to get C (s ). This means that the recursion (9) is reminiscent of conventional coding methods such as Huffman coding in which a new code word is appended to the string of the previous code words. Here, there is a small overlapping between the manipulated strings. What then are the practical choices for q, E, and r? Suppose that p = (m/n) < \$, which are the values that provide worthwhile savings in data storage; [H(f) E 0.8. Then E should be smaller than, say, p E I,. This makes q not smaller than log p-' - E, -. The function 2" is such that ( ) in general can be satisfied approximately with Y = q + 2. The function e(x) may be tabulated for small or moderate values of q, or it may be calculated from the standard formula of the type With a table lookup for e(x) the coding operations involve only two arithmetic operations each per bit with the version (7)-(0) and approximately 6(m/n) operations per bit when the formulas (5), (6) are used. In the latter case the decoding test (0) is not done for every bit but rather the ki in si c* (k,,..., kt) are solved as: ki is the maximum for which In fact, ki=60r6+, By ( ) and from the fact that, 2,, 2-l~ 5 2-f- or 2'2 > 2. Therefore, / (22 - ) < I, and the claim (5) follows. The rest follows by a direct calculation. Remark If the symbols 0 and are generated by an independent or memoryless information source, then, because E(m/ n) = p, inequality ( 6) holds also for the 200 means of both sides. where 6 = R-'[ a( si) + i(, - /,) - r] can be calculated recursively with two arithmetic operations. If log e(x) is also approximated by a table, then ki can be calculated without the above ambiguity with three arithmetic operations from Thus the coding operations can be done at a rate faster or comparable to that of Huffman coding. If the function e(x) and its logarithm are tabulated, the sizes of the tables are of the order of p-', which gives the error E E p as the "penalty" term in ( 6). J. J. RISSANEN IBM J. RES. DEVELOP.

4 With Huffman coding the same penalty term is k" where k is the block size [2]. Therefore, in order to guarantee the same penalty term, we must put k E ec E - p, which gives 2k 2"Ii for the number of code words. In typical cases this is much larger than the number of entries p-' in the table of e(x) with the present coding method. Example We now illustrate our coding technique by a simple example. Let I, = 0.0 I and l2 =. I, both numbers being in binary form. Then 2-' < < Hence, E = 0.28 will do, and we get from ( ) r = 4. The function e is defined by the table Take s above, X e(x) ( I, 3, 20, 22,42). Then from (6) and the table By adding all these we I) 2) = 2 3) = 28 4) = LZ2 = 23 X LOOOO. f(s)= ll Because of the complications involved in blocking, Huffman coding is rarely used for a binary alphabet. Instead, binary strings are often coded by various other techniques, such as run-length coding in which runs of consecutive zeros are encoded as their number. We describe a special case of arithmetic coding that is particularly fast and simple. This method turns out to be related to one due to Fano [3], and the resulting compression is much better than in ordinary run-length coding. Suppose that we wish to store m binary integers, z,=(z,,z2,~~~,z,),0~zl~z,~'~~iz,i2", each written with w bits. Let n = [log ml, and let n < w. The list Z, may be encoded as a binary string s, whose ith isin position ki = zi + i, as in (5). Then p is approximately given by 2"-", and we may put = 2"-m l,=w-n+l. Then inequality ( ) can be shown to hold for some E whose size we do not need. By setting e(x) = I + x and r = w - n we get from (6) y(s,) = qi + i(r + ) + r, x(si) = = (2' + ri)2qi+i'r+', where ri consists of the w - n right-most bits of zi and qi of the rest; i.e., zi = qi2w-" + Ti. The encoding of Z, or, equivalently, of s, is now straightforward by (5). It can be seen to be a binary string where the m strings ri and qi appear as follows lr, lr20~~~0r0~~~0, where the first run of zeros counted from the right is of length ql, the next of length q, - q,, and so on; the last or the left-most run is of length q, - q,-l. From this the ri and the qi can be recovered in the evident manner. The length of the code is seen to be not greater than m( w - n + 2), which is an excellent approximation to the optimum, 22LH( 2'-,), whenever w - n > 3. Finite alphabet Let ( ul, a2;.., a,) be an ordered alphabet, and let be a positive number, which we will show determines the accuracy within which the optimum per symbol length, the entropy, is achieved. Let I,,..., I, be positive rational numbers with q binary digits in their fractional parts such that 5 2-i 5 2". i= Further, let pk be a rational number such that and define k P, = pi, Po = 0. i=l Finally, for x, a q-bit fraction, let e(x) be an approximation to 2" such that e(x) = 2s+sx, 0 5 6z 5 5 2' Write pi, P,, and e(x) with r fractional bits. Because of (2 ) and (22), P, m2-r, so that r P log m. The encoding function C is defined as follows: (24) C(sa,) = C(s) + Pk-l C(h) = 0, 20 MAY 976 ARITHMETIC (:ODlNG

5 L(su,) = L(s) +, = y(sa,) + x(sa,), L(X) = = 2Y(sa,) X e[x(su,)l, (25) where y (sa,) = LL(sa,) and x(sa,) is the fractional part of L (sa,). Let u (t) be the largest number in {,..., m} such that Pu(l)-l 5 t for t 5. (26) Further, for the decoding we need an upper bound for a truncation of C(s). We pick this as follows: - C(s) = 2"[p(s) + 2-y+'], (7) where a(s) = IC(s)l + and p(s), 5 p(s) < 2, is defined by the y left-most bits of C(s). Put = r5.6 - logel. (28) The decoding function D recovers s from (C (s), L ( s) ) as follows: s t(s) = s'aullcs,,' - = C(s') = C(s) - Pu[t(s),-l L(s') = L(s) - (29 Theorem 3 If for E > 0, m i=l 2-'i 5 2-~, then for every s, D(C(s), L(s)) = s. Proof We prove first that (30) C(s) (3) for every s. This inequality holds for s = A by (25). Arguing by induction, suppose that it holds for all strings s of length no more than n. By (25) and (3 ), C(sa,) + Pk-l (32) By (24) and (25) 5 (33) which with (32) leads to C(sa,) 5 (2-'k+t/2 + This with (22) and (23) gives C(sa,) 5 and because by (22) A! P, 5 2' E 2-4 i= 202 the desired inequality, C(sa,) follows with (30). The induction and the proof of (3) is complete. When we apply the decoding algorithm to the pair (C(sa,),L(su,)),wefindthatu[t(su,)]=kifandonlyif P,-] 5 t(su,) < P,. (34) If we putc(s) = C(s) +ti(\$), where 0 < 6(s) 5 2-y+' C(s) 5 > (35) the inequalities (34) become The former inequality holds by (25) and the fact that 6 (sa,) 3 0. The second inequality holds with the first equation of (25), and (23) translates into C(s) + S(sa,) < (36) By (3), (33) and (35, C(s) + S(sa,) 5 2-"+t2( + - * -la!+c/2 ( + 2--y+) 2l"'k x = (2(e/2)-ck + 2(f/2)-~+l-t~ (sa,). By (22), E, - E/ 2? E/ 6. Thus by a well-known exponential inequality we have The coefficient of (sa,) is then smaller than one for the given value (28) for y, and (36) holds. The proof is complete. From (3 I) and (25) we immediately obtain -logc(s) n n. 2r + Inequality (30) holds if we put in which case where H(Pl,..., Pm) = E Pi log pi-,. By (33 ) and the fact that Pk-, > 2-+"2 for k >, It therefore follows that in the first equation of (25) no more than 2r + -, +, left-most bits of C(s) need by changed in order to get C(sa,). J. J. RISSANEN IBM J. RES. DEVELOP.

6 The crux of the decoding process is to calculate u(t) by (26 j for the r-bit fractions t. This can either be tabulated or calculated from a suitable approximation to k p = k i=l 2- i+ci. Further, the products fk-l X e[x(sa,j] appearing in (25 j can be tabulated, which is practicable for small or moderate sized alphabets. Then only two arithmetic operations per symbol are needed in the coding operations. The general case does not reduce to the binary case when rn = 2, because the factor f, = p, approximated as 2-Ii, can be absorbed by +(sl), which is cheaper. Finally, if for large alphabets the symbols are ordered such that the numbers li form an increasing sequence, the function k + P, is concave. Then both are easy to approximate by an analytic function, and the overhead storage consists mainly of the table of symbols. The coding operations then require about the time for the table lookups k c-2 ak. In addition, the li may be chosen, except for small values for i, in a universal way; e.g. as li E log i + (log and still have a near-entropy compression. (For universal coding, see [ 6 ). This means that there is no need to collect elaborate statistics about the symbol probabilities. We leave the details to another paper. References. D. A. Huffman, A Method for the Construction of Minimum-Redundancy Codes, Proc. IRE 40, 098 (952). 2. N. Abramson, Information Theory and Coding, McGraw- Hill Book Co., Inc., New York, R. M. Fano, On the Number of Bits Required to Implement an Associative Memory, Memorandum 6, Computer Structures Group, Project MAC, Massachusetts, J. P. M. Schalkwijk, An Algorithm for Source Coding, IEEE Truns. Information Theory IT-8, 395 (972). 5. T. M. Cover, Enumerative Source Encoding, IEEE Trans. lnfornzation Theory IT-9, 73 (973). 6. P. Elias, Universal Codeword Sets and Representations of the Integers, IEEE Trans. Infiwmution Theory IT-2, 94 (March 975). Received May 6, 975; revised October 5, 975 The author is located ut the IBM Research Laborutory, Monterey and Cottle Roads, Sun Jose, California MAY 976 ARITHMETIC CODING

### Data Compression Techniques

Data Compression Techniques Part 1: Entropy Coding Lecture 4: Asymmetric Numeral Systems Juha Kärkkäinen 08.11.2017 1 / 19 Asymmetric Numeral Systems Asymmetric numeral systems (ANS) is a recent entropy

### Chapter 2: Source coding

Chapter 2: meghdadi@ensil.unilim.fr University of Limoges Chapter 2: Entropy of Markov Source Chapter 2: Entropy of Markov Source Markov model for information sources Given the present, the future is independent

### ECE 587 / STA 563: Lecture 5 Lossless Compression

ECE 587 / STA 563: Lecture 5 Lossless Compression Information Theory Duke University, Fall 2017 Author: Galen Reeves Last Modified: October 18, 2017 Outline of lecture: 5.1 Introduction to Lossless Source

### Information Theory CHAPTER. 5.1 Introduction. 5.2 Entropy

Haykin_ch05_pp3.fm Page 207 Monday, November 26, 202 2:44 PM CHAPTER 5 Information Theory 5. Introduction As mentioned in Chapter and reiterated along the way, the purpose of a communication system is

### Fibonacci Coding for Lossless Data Compression A Review

RESEARCH ARTICLE OPEN ACCESS Fibonacci Coding for Lossless Data Compression A Review Ezhilarasu P Associate Professor Department of Computer Science and Engineering Hindusthan College of Engineering and

### 1 Introduction to information theory

1 Introduction to information theory 1.1 Introduction In this chapter we present some of the basic concepts of information theory. The situations we have in mind involve the exchange of information through

### Multimedia Communications. Mathematical Preliminaries for Lossless Compression

Multimedia Communications Mathematical Preliminaries for Lossless Compression What we will see in this chapter Definition of information and entropy Modeling a data source Definition of coding and when

### An introduction to basic information theory. Hampus Wessman

An introduction to basic information theory Hampus Wessman Abstract We give a short and simple introduction to basic information theory, by stripping away all the non-essentials. Theoretical bounds on

### Lecture 3 : Algorithms for source coding. September 30, 2016

Lecture 3 : Algorithms for source coding September 30, 2016 Outline 1. Huffman code ; proof of optimality ; 2. Coding with intervals : Shannon-Fano-Elias code and Shannon code ; 3. Arithmetic coding. 1/39

### The Optimal Fix-Free Code for Anti-Uniform Sources

Entropy 2015, 17, 1379-1386; doi:10.3390/e17031379 OPEN ACCESS entropy ISSN 1099-4300 www.mdpi.com/journal/entropy Article The Optimal Fix-Free Code for Anti-Uniform Sources Ali Zaghian 1, Adel Aghajan

### On the Cost of Worst-Case Coding Length Constraints

On the Cost of Worst-Case Coding Length Constraints Dror Baron and Andrew C. Singer Abstract We investigate the redundancy that arises from adding a worst-case length-constraint to uniquely decodable fixed

### Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

### Data Compression Techniques

Data Compression Techniques Part 2: Text Compression Lecture 5: Context-Based Compression Juha Kärkkäinen 14.11.2017 1 / 19 Text Compression We will now look at techniques for text compression. These techniques

### Coding for Discrete Source

EGR 544 Communication Theory 3. Coding for Discrete Sources Z. Aliyazicioglu Electrical and Computer Engineering Department Cal Poly Pomona Coding for Discrete Source Coding Represent source data effectively

### An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1

Kraft s inequality An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if N 2 l i 1 Proof: Suppose that we have a tree code. Let l max = max{l 1,...,

### Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University

Huffman Coding C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University http://www.csie.nctu.edu.tw/~cmliu/courses/compression/ Office: EC538 (03)573877 cmliu@cs.nctu.edu.tw

### Lec 03 Entropy and Coding II Hoffman and Golomb Coding

CS/EE 5590 / ENG 40 Special Topics Multimedia Communication, Spring 207 Lec 03 Entropy and Coding II Hoffman and Golomb Coding Zhu Li Z. Li Multimedia Communciation, 207 Spring p. Outline Lecture 02 ReCap

### Shannon-Fano-Elias coding

Shannon-Fano-Elias coding Suppose that we have a memoryless source X t taking values in the alphabet {1, 2,..., L}. Suppose that the probabilities for all symbols are strictly positive: p(i) > 0, i. The

### COS597D: Information Theory in Computer Science October 19, Lecture 10

COS597D: Information Theory in Computer Science October 9, 20 Lecture 0 Lecturer: Mark Braverman Scribe: Andrej Risteski Kolmogorov Complexity In the previous lectures, we became acquainted with the concept

### Course notes for Data Compression - 1 The Statistical Coding Method Fall 2005

Course notes for Data Compression - 1 The Statistical Coding Method Fall 2005 Peter Bro Miltersen August 29, 2005 Version 2.0 1 The paradox of data compression Definition 1 Let Σ be an alphabet and let

### Introduction to algebraic codings Lecture Notes for MTH 416 Fall Ulrich Meierfrankenfeld

Introduction to algebraic codings Lecture Notes for MTH 416 Fall 2014 Ulrich Meierfrankenfeld December 9, 2014 2 Preface These are the Lecture Notes for the class MTH 416 in Fall 2014 at Michigan State

### COMMUNICATION SCIENCES AND ENGINEERING

COMMUNICATION SCIENCES AND ENGINEERING X. PROCESSING AND TRANSMISSION OF INFORMATION Academic and Research Staff Prof. Peter Elias Prof. Robert G. Gallager Vincent Chan Samuel J. Dolinar, Jr. Richard

### Lecture 6: Kraft-McMillan Inequality and Huffman Coding

EE376A/STATS376A Information Theory Lecture 6-0/25/208 Lecture 6: Kraft-McMillan Inequality and Huffman Coding Lecturer: Tsachy Weissman Scribe: Akhil Prakash, Kai Yee Wan In this lecture, we begin with

### Text Compression. Jayadev Misra The University of Texas at Austin December 5, A Very Incomplete Introduction to Information Theory 2

Text Compression Jayadev Misra The University of Texas at Austin December 5, 2003 Contents 1 Introduction 1 2 A Very Incomplete Introduction to Information Theory 2 3 Huffman Coding 5 3.1 Uniquely Decodable

### ELEC 515 Information Theory. Distortionless Source Coding

ELEC 515 Information Theory Distortionless Source Coding 1 Source Coding Output Alphabet Y={y 1,,y J } Source Encoder Lengths 2 Source Coding Two coding requirements The source sequence can be recovered

### Reduce the amount of data required to represent a given quantity of information Data vs information R = 1 1 C

Image Compression Background Reduce the amount of data to represent a digital image Storage and transmission Consider the live streaming of a movie at standard definition video A color frame is 720 480

### Fault Tolerance Technique in Huffman Coding applies to Baseline JPEG

Fault Tolerance Technique in Huffman Coding applies to Baseline JPEG Cung Nguyen and Robert G. Redinbo Department of Electrical and Computer Engineering University of California, Davis, CA email: cunguyen,

### Introduction to Information Theory. By Prof. S.J. Soni Asst. Professor, CE Department, SPCE, Visnagar

Introduction to Information Theory By Prof. S.J. Soni Asst. Professor, CE Department, SPCE, Visnagar Introduction [B.P. Lathi] Almost in all the means of communication, none produces error-free communication.

### THIS paper is aimed at designing efficient decoding algorithms

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 45, NO. 7, NOVEMBER 1999 2333 Sort-and-Match Algorithm for Soft-Decision Decoding Ilya Dumer, Member, IEEE Abstract Let a q-ary linear (n; k)-code C be used

### Kolmogorov complexity ; induction, prediction and compression

Kolmogorov complexity ; induction, prediction and compression Contents 1 Motivation for Kolmogorov complexity 1 2 Formal Definition 2 3 Trying to compute Kolmogorov complexity 3 4 Standard upper bounds

### Digital Communication Systems ECS 452. Asst. Prof. Dr. Prapun Suksompong 5.2 Binary Convolutional Codes

Digital Communication Systems ECS 452 Asst. Prof. Dr. Prapun Suksompong prapun@siit.tu.ac.th 5.2 Binary Convolutional Codes 35 Binary Convolutional Codes Introduced by Elias in 1955 There, it is referred

### Error Detection and Correction: Hamming Code; Reed-Muller Code

Error Detection and Correction: Hamming Code; Reed-Muller Code Greg Plaxton Theory in Programming Practice, Spring 2005 Department of Computer Science University of Texas at Austin Hamming Code: Motivation

### k-protected VERTICES IN BINARY SEARCH TREES

k-protected VERTICES IN BINARY SEARCH TREES MIKLÓS BÓNA Abstract. We show that for every k, the probability that a randomly selected vertex of a random binary search tree on n nodes is at distance k from

### /633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17

601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17 12.1 Introduction Today we re going to do a couple more examples of dynamic programming. While

### 4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

4. Quantization and Data Compression ECE 32 Spring 22 Purdue University, School of ECE Prof. What is data compression? Reducing the file size without compromising the quality of the data stored in the

### Variable-to-Variable Codes with Small Redundancy Rates

Variable-to-Variable Codes with Small Redundancy Rates M. Drmota W. Szpankowski September 25, 2004 This research is supported by NSF, NSA and NIH. Institut f. Diskrete Mathematik und Geometrie, TU Wien,

### (Classical) Information Theory II: Source coding

(Classical) Information Theory II: Source coding Sibasish Ghosh The Institute of Mathematical Sciences CIT Campus, Taramani, Chennai 600 113, India. p. 1 Abstract The information content of a random variable

### repetition, part ii Ole-Johan Skrede INF Digital Image Processing

repetition, part ii Ole-Johan Skrede 24.05.2017 INF2310 - Digital Image Processing Department of Informatics The Faculty of Mathematics and Natural Sciences University of Oslo today s lecture Coding and

### Reserved-Length Prefix Coding

Reserved-Length Prefix Coding Michael B. Baer vlnks Mountain View, CA 94041-2803, USA Email:.calbear@ 1eee.org Abstract Huffman coding finds an optimal prefix code for a given probability mass function.

### Context tree models for source coding

Context tree models for source coding Toward Non-parametric Information Theory Licence de droits d usage Outline Lossless Source Coding = density estimation with log-loss Source Coding and Universal Coding

### Fast Progressive Wavelet Coding

PRESENTED AT THE IEEE DCC 99 CONFERENCE SNOWBIRD, UTAH, MARCH/APRIL 1999 Fast Progressive Wavelet Coding Henrique S. Malvar Microsoft Research One Microsoft Way, Redmond, WA 98052 E-mail: malvar@microsoft.com

### A Combinatorial Bound on the List Size

1 A Combinatorial Bound on the List Size Yuval Cassuto and Jehoshua Bruck California Institute of Technology Electrical Engineering Department MC 136-93 Pasadena, CA 9115, U.S.A. E-mail: {ycassuto,bruck}@paradise.caltech.edu

### Local Asymptotics and the Minimum Description Length

Local Asymptotics and the Minimum Description Length Dean P. Foster and Robert A. Stine Department of Statistics The Wharton School of the University of Pennsylvania Philadelphia, PA 19104-6302 March 27,

### ,

Kolmogorov Complexity Carleton College, CS 254, Fall 2013, Prof. Joshua R. Davis based on Sipser, Introduction to the Theory of Computation 1. Introduction Kolmogorov complexity is a theory of lossless

### J. E. Cunningham, U. F. Gronemann, T. S. Huang, J. W. Pan, O. J. Tretiak, W. F. Schreiber

X. PROCESSNG AND TRANSMSSON OF NFORMATON D. N. Arden E. Arthurs J. B. Dennis M. Eden P. Elias R. M. Fano R. G. Gallager E. M. Hofstetter D. A. Huffman W. F. Schreiber C. E. Shannon J. M. Wozencraft S.

### LONGEST SUCCESS RUNS AND FIBONACCI-TYPE POLYNOMIALS

LONGEST SUCCESS RUNS AND FIBONACCI-TYPE POLYNOMIALS ANDREAS N. PHILIPPOU & FROSSO S. MAKRI University of Patras, Patras, Greece (Submitted January 1984; Revised April 1984) 1. INTRODUCTION AND SUMMARY

### :s ej2mttlm-(iii+j2mlnm )(J21nm/m-lnm/m)

BALLS, BINS, AND RANDOM GRAPHS We use the Chernoff bound for the Poisson distribution (Theorem 5.4) to bound this probability, writing the bound as Pr(X 2: x) :s ex-ill-x In(x/m). For x = m + J2m In m,

### Entropies & Information Theory

Entropies & Information Theory LECTURE I Nilanjana Datta University of Cambridge,U.K. See lecture notes on: http://www.qi.damtp.cam.ac.uk/node/223 Quantum Information Theory Born out of Classical Information

### The memory centre IMUJ PREPRINT 2012/03. P. Spurek

The memory centre IMUJ PREPRINT 202/03 P. Spurek Faculty of Mathematics and Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Kraków, Poland J. Tabor Faculty of Mathematics and Computer

### Information Theory. Week 4 Compressing streams. Iain Murray,

Information Theory http://www.inf.ed.ac.uk/teaching/courses/it/ Week 4 Compressing streams Iain Murray, 2014 School of Informatics, University of Edinburgh Jensen s inequality For convex functions: E[f(x)]

### Introduction to Convolutional Codes, Part 1

Introduction to Convolutional Codes, Part 1 Frans M.J. Willems, Eindhoven University of Technology September 29, 2009 Elias, Father of Coding Theory Textbook Encoder Encoder Properties Systematic Codes

### Sources: The Data Compression Book, 2 nd Ed., Mark Nelson and Jean-Loup Gailly.

Lossless ompression Multimedia Systems (Module 2 Lesson 2) Summary: daptive oding daptive Huffman oding Sibling Property Update lgorithm rithmetic oding oding and ecoding Issues: OF problem, Zero frequency

### CISC 876: Kolmogorov Complexity

March 27, 2007 Outline 1 Introduction 2 Definition Incompressibility and Randomness 3 Prefix Complexity Resource-Bounded K-Complexity 4 Incompressibility Method Gödel s Incompleteness Theorem 5 Outline

### Quantum-inspired Huffman Coding

Quantum-inspired Huffman Coding A. S. Tolba, M. Z. Rashad, and M. A. El-Dosuky Dept. of Computer Science, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, Egypt. tolba_954@yahoo.com,

### PSK bit mappings with good minimax error probability

PSK bit mappings with good minimax error probability Erik Agrell Department of Signals and Systems Chalmers University of Technology 4196 Göteborg, Sweden Email: agrell@chalmers.se Erik G. Ström Department

### A version of for which ZFC can not predict a single bit Robert M. Solovay May 16, Introduction In [2], Chaitin introd

CDMTCS Research Report Series A Version of for which ZFC can not Predict a Single Bit Robert M. Solovay University of California at Berkeley CDMTCS-104 May 1999 Centre for Discrete Mathematics and Theoretical

### CS 6820 Fall 2014 Lectures, October 3-20, 2014

Analysis of Algorithms Linear Programming Notes CS 6820 Fall 2014 Lectures, October 3-20, 2014 1 Linear programming The linear programming (LP) problem is the following optimization problem. We are given

### Design of Optimal Quantizers for Distributed Source Coding

Design of Optimal Quantizers for Distributed Source Coding David Rebollo-Monedero, Rui Zhang and Bernd Girod Information Systems Laboratory, Electrical Eng. Dept. Stanford University, Stanford, CA 94305

### Greedy. Outline CS141. Stefano Lonardi, UCR 1. Activity selection Fractional knapsack Huffman encoding Later:

October 5, 017 Greedy Chapters 5 of Dasgupta et al. 1 Activity selection Fractional knapsack Huffman encoding Later: Outline Dijkstra (single source shortest path) Prim and Kruskal (minimum spanning tree)

### 4F5: Advanced Communications and Coding

4F5: Advanced Communications and Coding Coding Handout 4: Reed Solomon Codes Jossy Sayir Signal Processing and Communications Lab Department of Engineering University of Cambridge jossy.sayir@eng.cam.ac.uk

### Notes 10: List Decoding Reed-Solomon Codes and Concatenated codes

Introduction to Coding Theory CMU: Spring 010 Notes 10: List Decoding Reed-Solomon Codes and Concatenated codes April 010 Lecturer: Venkatesan Guruswami Scribe: Venkat Guruswami & Ali Kemal Sinop DRAFT

### Stream Codes. 6.1 The guessing game

About Chapter 6 Before reading Chapter 6, you should have read the previous chapter and worked on most of the exercises in it. We ll also make use of some Bayesian modelling ideas that arrived in the vicinity

### Progressive Wavelet Coding of Images

Progressive Wavelet Coding of Images Henrique Malvar May 1999 Technical Report MSR-TR-99-26 Microsoft Research Microsoft Corporation One Microsoft Way Redmond, WA 98052 1999 IEEE. Published in the IEEE

### Journal of Cryptology International Association for Cryptologic Research

J. Cryptology (1991) 3:149-155 Journal of Cryptology 9 1991 International Association for Cryptologic Research On the Chor-Rivest Knapsack Cryptosystem 1 H. W. Lenstra, Jr. Department of Mathematics, University

### The simple ideal cipher system

The simple ideal cipher system Boris Ryabko February 19, 2001 1 Prof. and Head of Department of appl. math and cybernetics Siberian State University of Telecommunication and Computer Science Head of Laboratory

### Optimum Binary-Constrained Homophonic Coding

Optimum Binary-Constrained Homophonic Coding Valdemar C. da Rocha Jr. and Cecilio Pimentel Communications Research Group - CODEC Department of Electronics and Systems, P.O. Box 7800 Federal University

### Relating Entropy Theory to Test Data Compression

Relating Entropy Theory to Test Data Compression Kedarnath J. Balakrishnan and Nur A. Touba Computer Engineering Research Center University of Texas, Austin, TX 7872 Email: {kjbala, touba}@ece.utexas.edu

### Multimedia Communications. Scalar Quantization

Multimedia Communications Scalar Quantization Scalar Quantization In many lossy compression applications we want to represent source outputs using a small number of code words. Process of representing

### A L A BA M A L A W R E V IE W

A L A BA M A L A W R E V IE W Volume 52 Fall 2000 Number 1 B E F O R E D I S A B I L I T Y C I V I L R I G HT S : C I V I L W A R P E N S I O N S A N D TH E P O L I T I C S O F D I S A B I L I T Y I N

### Information Theory, Statistics, and Decision Trees

Information Theory, Statistics, and Decision Trees Léon Bottou COS 424 4/6/2010 Summary 1. Basic information theory. 2. Decision trees. 3. Information theory and statistics. Léon Bottou 2/31 COS 424 4/6/2010

### Lecture #8: We now take up the concept of dynamic programming again using examples.

Lecture #8: 0.0.1 Dynamic Programming:(Chapter 15) We now take up the concept of dynamic programming again using examples. Example 1 (Chapter 15.2) Matrix Chain Multiplication INPUT: An Ordered set of

### Block 2: Introduction to Information Theory

Block 2: Introduction to Information Theory Francisco J. Escribano April 26, 2015 Francisco J. Escribano Block 2: Introduction to Information Theory April 26, 2015 1 / 51 Table of contents 1 Motivation

### Indeterminate-length quantum coding

Indeterminate-length quantum coding arxiv:quant-ph/0011014v1 2 Nov 2000 Benjamin Schumacher (1) and Michael D. Westmoreland (2) February 1, 2008 (1) Department of Physics, Kenyon College, Gambier, OH 43022

### A Divide-and-Conquer Algorithm for Functions of Triangular Matrices

A Divide-and-Conquer Algorithm for Functions of Triangular Matrices Ç. K. Koç Electrical & Computer Engineering Oregon State University Corvallis, Oregon 97331 Technical Report, June 1996 Abstract We propose

### Lecture 7: Dynamic Programming I: Optimal BSTs

5-750: Graduate Algorithms February, 06 Lecture 7: Dynamic Programming I: Optimal BSTs Lecturer: David Witmer Scribes: Ellango Jothimurugesan, Ziqiang Feng Overview The basic idea of dynamic programming

### Sec\$on Summary. Sequences. Recurrence Relations. Summations. Ex: Geometric Progression, Arithmetic Progression. Ex: Fibonacci Sequence

Section 2.4 Sec\$on Summary Sequences Ex: Geometric Progression, Arithmetic Progression Recurrence Relations Ex: Fibonacci Sequence Summations 2 Introduc\$on Sequences are ordered lists of elements. 1, 2,

### Entropy. Probability and Computing. Presentation 22. Probability and Computing Presentation 22 Entropy 1/39

Entropy Probability and Computing Presentation 22 Probability and Computing Presentation 22 Entropy 1/39 Introduction Why randomness and information are related? An event that is almost certain to occur

### The Poisson Channel with Side Information

The Poisson Channel with Side Information Shraga Bross School of Enginerring Bar-Ilan University, Israel brosss@macs.biu.ac.il Amos Lapidoth Ligong Wang Signal and Information Processing Laboratory ETH

### On the Exponent of the All Pairs Shortest Path Problem

On the Exponent of the All Pairs Shortest Path Problem Noga Alon Department of Mathematics Sackler Faculty of Exact Sciences Tel Aviv University Zvi Galil Department of Computer Science Sackler Faculty

### Computational complexity and some Graph Theory

Graph Theory Lars Hellström January 22, 2014 Contents of todays lecture An important concern when choosing the algorithm to use for something (after basic requirements such as correctness and stability)

### Weighted Activity Selection

Weighted Activity Selection Problem This problem is a generalization of the activity selection problem that we solvd with a greedy algorithm. Given a set of activities A = {[l, r ], [l, r ],..., [l n,

### Shannon's Theory of Communication

Shannon's Theory of Communication An operational introduction 5 September 2014, Introduction to Information Systems Giovanni Sileno g.sileno@uva.nl Leibniz Center for Law University of Amsterdam Fundamental

### Turbo Codes for Deep-Space Communications

TDA Progress Report 42-120 February 15, 1995 Turbo Codes for Deep-Space Communications D. Divsalar and F. Pollara Communications Systems Research Section Turbo codes were recently proposed by Berrou, Glavieux,

### Exhibit 2-9/30/15 Invoice Filing Page 1841 of Page 3660 Docket No

xhibit 2-9/3/15 Invie Filing Pge 1841 f Pge 366 Dket. 44498 F u v 7? u ' 1 L ffi s xs L. s 91 S'.e q ; t w W yn S. s t = p '1 F? 5! 4 ` p V -', {} f6 3 j v > ; gl. li -. " F LL tfi = g us J 3 y 4 @" V)

### 54 D. S. HOODA AND U. S. BHAKER Belis and Guiasu [2] observed that a source is not completely specied by the probability distribution P over the sourc

SOOCHOW JOURNAL OF MATHEMATICS Volume 23, No. 1, pp. 53-62, January 1997 A GENERALIZED `USEFUL' INFORMATION MEASURE AND CODING THEOREMS BY D. S. HOODA AND U. S. BHAKER Abstract. In the present communication

### F U N C T I O N A L P E A R L S Inverting the Burrows-Wheeler Transform

Under consideration for publication in J. Functional Programming 1 F U N C T I O N A L P E A R L S Inverting the Burrows-Wheeler Transform RICHARD BIRD and SHIN-CHENG MU Programming Research Group, Oxford

### A New Task Model and Utilization Bound for Uniform Multiprocessors

A New Task Model and Utilization Bound for Uniform Multiprocessors Shelby Funk Department of Computer Science, The University of Georgia Email: shelby@cs.uga.edu Abstract This paper introduces a new model

### Variable-to-Fixed Length Homophonic Coding with a Modified Shannon-Fano-Elias Code

Variable-to-Fixed Length Homophonic Coding with a Modified Shannon-Fano-Elias Code Junya Honda Hirosuke Yamamoto Department of Complexity Science and Engineering The University of Tokyo, Kashiwa-shi Chiba

### 3 Self-Delimiting Kolmogorov complexity

3 Self-Delimiting Kolmogorov complexity 3. Prefix codes A set is called a prefix-free set (or a prefix set) if no string in the set is the proper prefix of another string in it. A prefix set cannot therefore

### The Real Number System

MATH 337 The Real Number System Sets of Numbers Dr. Neal, WKU A set S is a well-defined collection of objects, with well-defined meaning that there is a specific description from which we can tell precisely

### Successive Cancellation Decoding of Single Parity-Check Product Codes

Successive Cancellation Decoding of Single Parity-Check Product Codes Mustafa Cemil Coşkun, Gianluigi Liva, Alexandre Graell i Amat and Michael Lentmaier Institute of Communications and Navigation, German

### MAHALAKSHMI ENGINEERING COLLEGE-TRICHY QUESTION BANK UNIT V PART-A. 1. What is binary symmetric channel (AUC DEC 2006)

MAHALAKSHMI ENGINEERING COLLEGE-TRICHY QUESTION BANK SATELLITE COMMUNICATION DEPT./SEM.:ECE/VIII UNIT V PART-A 1. What is binary symmetric channel (AUC DEC 2006) 2. Define information rate? (AUC DEC 2007)

### CSE 140 Lecture 11 Standard Combinational Modules. CK Cheng and Diba Mirza CSE Dept. UC San Diego

CSE 4 Lecture Standard Combinational Modules CK Cheng and Diba Mirza CSE Dept. UC San Diego Part III - Standard Combinational Modules (Harris: 2.8, 5) Signal Transport Decoder: Decode address Encoder:

### Limits and Continuity

Chapter Limits and Continuity. Limits of Sequences.. The Concept of Limit and Its Properties A sequence { } is an ordered infinite list x,x,...,,... The n-th term of the sequence is, and n is the index

### Exercise 1. = P(y a 1)P(a 1 )

Chapter 7 Channel Capacity Exercise 1 A source produces independent, equally probable symbols from an alphabet {a 1, a 2 } at a rate of one symbol every 3 seconds. These symbols are transmitted over a

### CHAPTER log 2 64 = 6 lines/mux or decoder 9-2.* C = C 8 V = C 8 C * 9-4.* (Errata: Delete 1 after problem number) 9-5.

CHPTER 9 2008 Pearson Education, Inc. 9-. log 2 64 = 6 lines/mux or decoder 9-2.* C = C 8 V = C 8 C 7 Z = F 7 + F 6 + F 5 + F 4 + F 3 + F 2 + F + F 0 N = F 7 9-3.* = S + S = S + S S S S0 C in C 0 dder