On the Cost of Worst-Case Coding Length Constraints

Similar documents
Chapter 3 Source Coding. 3.1 An Introduction to Source Coding 3.2 Optimal Source Codes 3.3 Shannon-Fano Code 3.4 Huffman Code

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code

10-704: Information Processing and Learning Fall Lecture 10: Oct 3

Run-length & Entropy Coding. Redundancy Removal. Sampling. Quantization. Perform inverse operations at the receiver EEE

Tight Upper Bounds on the Redundancy of Optimal Binary AIFV Codes

Source Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria

Data Compression. Limit of Information Compression. October, Examples of codes 1

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

Lecture 6: Kraft-McMillan Inequality and Huffman Coding

Information Theory and Statistics Lecture 2: Source coding

Lecture 16. Error-free variable length schemes (contd.): Shannon-Fano-Elias code, Huffman code

Tight Bounds on Minimum Maximum Pointwise Redundancy

Lecture 4 : Adaptive source coding algorithms

Chapter 5: Data Compression

On the redundancy of optimum fixed-to-variable length codes

EE5585 Data Compression January 29, Lecture 3. x X x X. 2 l(x) 1 (1)

Chapter 2: Source coding

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding

Using an innovative coding algorithm for data encryption

An O(N) Semi-Predictive Universal Encoder via the BWT

PART III. Outline. Codes and Cryptography. Sources. Optimal Codes (I) Jorge L. Villar. MAMME, Fall 2015

Quantum-inspired Huffman Coding

Information Theory with Applications, Math6397 Lecture Notes from September 30, 2014 taken by Ilknur Telkes

Lecture 3 : Algorithms for source coding. September 30, 2016

Motivation for Arithmetic Coding

APC486/ELE486: Transmission and Compression of Information. Bounds on the Expected Length of Code Words

Information Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18

COMM901 Source Coding and Compression. Quiz 1

Progressive Wavelet Coding of Images

An Approximation Algorithm for Constructing Error Detecting Prefix Codes

Information Theory. Coding and Information Theory. Information Theory Textbooks. Entropy

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018

Coding of memoryless sources 1/35

Lecture 1 : Data Compression and Entropy

4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

1 Introduction to information theory

Lecture 3. Mathematical methods in communication I. REMINDER. A. Convex Set. A set R is a convex set iff, x 1,x 2 R, θ, 0 θ 1, θx 1 + θx 2 R, (1)

Reserved-Length Prefix Coding

ECE 587 / STA 563: Lecture 5 Lossless Compression

Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet)

Variable-to-Variable Codes with Small Redundancy Rates

EE376A - Information Theory Midterm, Tuesday February 10th. Please start answering each question on a new page of the answer booklet.

On the minimum neighborhood of independent sets in the n-cube

10-704: Information Processing and Learning Spring Lecture 8: Feb 5

Asymptotically Optimal Tree-based Group Key Management Schemes

Intro to Information Theory

Asymptotic redundancy and prolixity

Entropy as a measure of surprise

ECE 587 / STA 563: Lecture 5 Lossless Compression

Reserved-Length Prefix Coding

Chapter 5. Data Compression

Redundancy-Related Bounds for Generalized Huffman Codes

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

D-ary Bounded-Length Huffman Coding

ECE533 Digital Image Processing. Embedded Zerotree Wavelet Image Codec

Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University

Data Compression Techniques (Spring 2012) Model Solutions for Exercise 2

Homework Set #2 Data Compression, Huffman code and AEP

EE-597 Notes Quantization

Lec 03 Entropy and Coding II Hoffman and Golomb Coding

4F5: Advanced Communications and Coding Handout 2: The Typical Set, Compression, Mutual Information

lossless, optimal compressor

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols.

2018/5/3. YU Xiangyu

Digital communication system. Shannon s separation principle

Variable-Rate Universal Slepian-Wolf Coding with Feedback

UNIT I INFORMATION THEORY. I k log 2

repetition, part ii Ole-Johan Skrede INF Digital Image Processing

Autumn Coping with NP-completeness (Conclusion) Introduction to Data Compression

10-704: Information Processing and Learning Fall Lecture 9: Sept 28

Multimedia Communications. Mathematical Preliminaries for Lossless Compression

Chapter 9 Fundamental Limits in Information Theory

Fractal Dimension and Vector Quantization

Infinite anti-uniform sources

Optimal prefix codes for pairs of geometrically-distributed random variables

CSE 421 Greedy: Huffman Codes

Approximation Metrics for Discrete and Continuous Systems

Non-binary Distributed Arithmetic Coding

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1

On Probability Estimation by Exponential Smoothing

Generalized Kraft Inequality and Arithmetic Coding

The Optimal Fix-Free Code for Anti-Uniform Sources

3F1 Information Theory, Lecture 3

3F1 Information Theory, Lecture 3

Optimal computation of symmetric Boolean functions in Tree networks

CS Data Structures and Algorithm Analysis

Compression and Coding

Logarithmic quantisation of wavelet coefficients for improved texture classification performance

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min.

Lecture 1: September 25, A quick reminder about random variables and convexity

Optimum Binary-Constrained Homophonic Coding

An algorithm for computing minimal bidirectional linear recurrence relations

Data Compression Using a Sort-Based Context Similarity Measure

Shannon-Fano-Elias coding

Information and Entropy

Design of Optimal Quantizers for Distributed Source Coding

IEEE copyright notice

Solutions to Set #2 Data Compression, Huffman code and AEP

Efficient Alphabet Partitioning Algorithms for Low-complexity Entropy Coding

An Entropy Bound for Random Number Generation

Transcription:

On the Cost of Worst-Case Coding Length Constraints Dror Baron and Andrew C. Singer Abstract We investigate the redundancy that arises from adding a worst-case length-constraint to uniquely decodable fixed to variable codes over achievable Huffman codes. This is in contrast to the traditional metric of the redundancy over the entropy. We show that the cost for adding constraints on the worstcase coding length is small, and that the resulting bound is related to the Fibonacci numbers. Keywords Data compression, Fibonacci numbers, Huffman coding, redundancy, source coding, uniquely decodable. I. Introduction A fundamental tradeoff in lossless source coding is that some inputs can be compressed only if others are expanded. A reasonable objective is to compress well on average, while expanding little in the worst-case. The tradeoff between the expected coding length and the worst-case coding expansion has received research attention. In [] an algorithm for finding a code meeting these constraints is proposed, and in [] the redundancy of the expected coding length over the entropy is bounded. In this paper, we investigate the redundancy of the expected coding length of constrained codes over that of achievable Huffman codes []. We bound this redundancy by a term that decays exponentially in the worst-case coding expansion, and note that this term is related to the Fibonacci numbers. The problem is stated in Section II, the main results are given in Section III, and a discussion is provided in Section IV. II. Problem formulation Consider a discrete alphabet X and length-n input sequences x, i.e., x X N. We define a source code C as a mapping C : X N X where X is the set of finite-length sequences over X. Following [4], let C(x) be the codeword corresponding to x, and l(x) This work was supported in part by NSF grants No. MIP-97-076 and NSF CDA 96-496. The authors are with the Coordinated Science Laboratory and the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. They can be reached at dbaron@uiuc.edu and acsinger@uiuc.edu. c 00 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

the length of C(x). The expected coding length L(C) for a random variable X with a probability mass function (PMF) p(x) is defined as L(C) x X N p(x)l(x). () A uniquely decodable code is a source code C with a dual mapping C : X X N, such that C(C(x)) = x, x X N. We say that the code C expands x when l(x) > N, and we define the worst-case coding expansion as W(C) max x X N {l(x) N} () (it is non-negative because no source code compresses all input sequences). Given the PMF p(x), x X N, and an (integer) constraint on the worst-case coding expansion, the constrained expected coding length is defined as L min {L(C)}. () C: W(C) When the constraint is relaxed, and is increased, the constrained expected coding length will go down until at some stage it is equal to the unconstrained expected coding length. This is the expected coding length of the Huffman code, and is denoted by L. In the following section we bound L L. III. Results Any Huffman code can be viewed as a tree, where codewords correspond to leaves, and their prefixes correspond to internal nodes. The nodes, be they leaves or internal nodes, make up a Huffman tree. A leaf corresponds to some input x and its probability p(x); an internal node corresponds to a set of descendant leaves, and the total probability of those leaves. Lemma bounds the probabilities corresponding to internal nodes, and is related to Theorem 7 in [5], which bounds the probabilities corresponding to leaves. Both proofs are by induction, the only difference is due to initial conditions. Lemma : Any depth-k internal node in a Huffman tree corresponds to a set of codewords with total probability p k satisfying p k f k, (4) where f n+ f n + ( X )f n, with initial conditions f 0, f X. Proof: In the Huffman tree, let p k, p k,..., p 0 = be the probabilities corresponding to internal nodes on the path from our depth-k node, denoted by α, to the root. Let q i l, i {,..., X } be the corresponding probability of node i merged with p l into p l. We prove p l f k l p k by induction on l. First, p k = p k = f 0 p k. Second, the lemma requires α to be an internal node, so at least one of its descendant nodes corresponds

to at least a probability of X p k. But α and its parent are internal nodes, so when the parent is created in the Huffman algorithm, its descendants are nodes corresponding to the minimal probabilities among all the nodes at that stage, so q i k X p k. Therefore, p k = p k + X i= q i k (5) ( ) pk p k + ( X ) X = f p k, (6) where (5) is the merging of corresponding probabilities. The inductive step is p l = p l + X i= q i l (7) p l + ( X )p l+ (8) (f k l + ( X )f k l )p k (9) = f k l+ p k (0) where (7) is similar to (5), the reasoning in the second step leads to (8), (9) is by induction, and (0) uses the definition of f. The lemma (4) follows because p 0 =. For X =, f n are essentially scaled Fibonacci numbers, i.e., f 0 =, f =, f = 5, f = 8,. In the general case, the recursion for f n can be solved using methods for difference equations [6] leading to f n = + ( + ) ( X 4 X ( )( X 4 X + ) n 4 X 4 X ) n, n 0. () Theorem : Given a random variable X with a PMF p(x) over X N, then for > 0 L L. () f Proof: We begin with C H, a Huffman code that achieves L. We create a new code C (not necessarily a Huffman code), with length function l(x) and expected coding length L( C), by modifying C H in two steps. Step : prune X = {x : l(x) N + } from the Huffman tree of C H. If X is empty we are done, else nodes were pruned off a depth-n + internal node, so there exists a depth- internal node, including the one in the path from the depth-n + internal node to the root. Step : take any depth- internal node in the tree; denote it by β. Let X be the descendant leaves of β. Replace β with a new node γ with two descendants. The first descendant is β, and the

second descendant is a depth-n full tree with up to X N leaves that can accommodate all of X, since X < X N. The full tree for X starts at depth and goes up to depth N +, so l(x) N +, x X. For X, l(x) < N +, x X, so adding an additional symbol gives l(x) N +, x X. Therefore W( C). The structure β originally resided at depth in the tree, so by Lemma x X p(x) f. Therefore, C satisfies L( C) = x X N p(x) l(x) p(x) l(x) + p(x) l(x) + p(x) l(x) x X x X x X X p(x)l(x) + p(x)(l(x) + ) + x X x X = x p(x)l(x) + x X p(x) x X X p(x)l(x) () L(C H ) + f, (4) where () arises because codewords in X became shorter. The result is obtained by noting that L(C H ) = L. IV. Discussion We begin with several technical remarks on Theorem. First, the theorem does not apply for = 0, because there is no depth- internal node that can be split. In fact, for = 0 there is no expansion, nor is there any compression, thus L 0 = N. Second, the theorem upper bounds L L, but we cannot give a lower bound, because for a uniform PMF we have L = N, hence L L = 0. Third, we can get a stronger bound on the expected coding length for =. In this case, there always exists some depth- node (not necessarily internal) with x X p(x), so X L L X. (5) Although Lemma bounds the probabilities corresponding to depth-k internal nodes, there could be nodes at that depth that correspond to even smaller probabilities. The constructive method used in the proof of the theorem can be used to derive codes that satisfy constraints on the worst-case coding expansion, but these are not necessarily optimal codes. However, the theorem is useful because it bounds the cost of the constraint by a term that decays exponentially in the expansion. A tighter bound in the main theorem could be obtained by finding a stronger version of Lemma for the depth-k node that corresponds to the smallest probabilities. 4

Acknowledgments The authors wish to thank the two anonymous reviewers and Marcelo Weinberger for their insightful comments. References [] A. Moffat, A. Turpin, and J. Katajainen, Space-efficient construction of optimal prefix codes, Proc. Data Compression Conference, Snowbird, UT, pp. 9-0, March 995. [] R. M. Capocelli and A. De Santis, On the Redundancy of Optimal Codes with Limited Word Length, IEEE Trans. Information Theory, vol. IT-8, no., pp. 49-445, March 99. [] D. A. Huffman, A Method for the Construction of Minimum Redundancy Codes, Proc. IRE, vol. 40, no. 9, pp. 098-0, September 95. [4] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York; John Wiley and Sons, 99. [5] R. M. Capocelli and A. De Santis, A Note on D-ary Huffman Codes, IEEE Trans. Information Theory, vol. IT-7, no., pp. 74-79, January 99. [6] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing. Englewood Cliffs, NJ; Prentice Hall, 989. 5