
Introduction to Information Theory, Data Compression, Coding

Mehdi Ibm Brahim, Laura Minkova

April 5, 2018

This is the augmented transcript of a lecture given by Luc Devroye on the 13th of March 2018 for a Data Structures and Algorithms class (COMP 252) at McGill University [1]. The subject was Information Theory, Data Compression, and Coding.

Data Compression: the efficient encoding of information. In many compression methods, input symbols are mapped to codewords (bit sequences). The set of codewords is called a code. If all codewords are of equal length, then we have a fixed-length code. Otherwise, we have a variable-length code. The most important codes are prefix codes, i.e., codes in which no codeword is the prefix of another codeword. If codewords are mapped to binary trees (a 0 corresponding to a left edge, and a 1 to a right edge), then one can associate each symbol in a prefix code with a unique leaf. The compressed (i.e., coded) sequence can then be decoded to yield the input by repeatedly going down the tree until leaves are reached; a concrete sketch of this decoding appears at the end of this section.

Claude E. Shannon (1916-2001) was a highly recognized American mathematician and computer scientist. He studied electrical engineering and mathematics at the University of Michigan before going on to complete a master's degree and a doctorate at MIT. The computer science and engineering community began to take notice of his brilliant mind after the publication of his master's thesis, "A Symbolic Analysis of Relay and Switching Circuits", written in 1937. His most notable and well-known publication, "A Mathematical Theory of Communication", followed in 1948. Although he worked in a field in which no Nobel Prize existed, he was granted numerous prestigious prizes throughout his career. He passed away at the age of 84 after a long fight with Alzheimer's disease.

Information Theory

Information Theory is the study of information and how it can be processed and communicated. Not long after beginning work at the Bell Laboratories, Claude E. Shannon published his paper "A Mathematical Theory of Communication" [2] in 1948, in the Bell System Technical Journal. This paper quickly gained widespread recognition as the groundwork for what is now known as modern information theory [3].

The main premise of the paper was an investigation into solving communication problems, discussing them both in a theoretical and in a real-life sense. The greatest difference between the two is that in real life there is often noise that can interfere with the medium over which information is transmitted, which he called the channel. For the purposes of this course, we consider a communication system in which no noise is present.
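Returning to prefix codes: here is a minimal Python sketch of the decoding procedure (the Node class and function names are my own; the three-symbol code 1 -> 0, 2 -> 10, 3 -> 11 is the Huffman code used in the examples later in these notes). Starting at the root, follow a left edge on 0 and a right edge on 1, and emit a symbol each time a leaf is reached.

    class Node:
        def __init__(self, symbol=None, left=None, right=None):
            self.symbol = symbol        # non-None only at leaves
            self.left = left            # edge labelled 0
            self.right = right          # edge labelled 1

    # Code tree for the prefix code 1 -> 0, 2 -> 10, 3 -> 11.
    root = Node(left=Node(symbol=1),
                right=Node(left=Node(symbol=2), right=Node(symbol=3)))

    def decode(bits, root):
        out, node = [], root
        for b in bits:
            node = node.left if b == "0" else node.right
            if node.symbol is not None:  # reached a leaf: emit and restart
                out.append(node.symbol)
                node = root
        return out

    print(decode("010110", root))        # [1, 2, 3, 1]

Because no codeword is a prefix of another, the decoder never needs to look ahead: each leaf reached determines a symbol unambiguously.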

Figure 1: Noiseless communication system.

Shannon's greatest concern was the "how" and not the "what" of information transmission. He did note, however, that in the case of data compression, how well you compress (and how easily) depends on the input you are considering. That being said, he pays no attention to the actual meaning of the input, stating that "... these semantic aspects of communication are irrelevant to the engineering problem" [2].

The compression ratio, $C$, is defined by
$$C = \frac{\text{number of symbols in output}}{\text{number of symbols in input}}.$$

In order to determine the expected length of the output sequence, Shannon considered every possible input. He assumed that every input sequence that may have to be compressed has a given probability $p_i$, where the $p_i$ sum to one. If the $i$-th input is given some encoding of length $l_i$ bits, then the expected length of the output bit sequence is $\sum_i p_i l_i$.

A binary tree proved to be very useful in representing the encoding of information. The internal nodes of this tree carry no value; each leaf, however, represents a possible input. Every left edge is labelled with a 0, and every right edge with a 1.

Things to note:
1. Input in a communication system is not limited to words, characters, etc. It can be anything!
2. Output is always binary.

[Margin example with its corresponding translation table omitted from this transcript.]

Entropy (Symbol E)

In information theory, entropy is a quantity that measures the amount of information in a random variable. Entropy thus provides a theoretical (sometimes unachievable) limit for the efficiency of any possible encoding [4]. The binary entropy is defined as
$$E = \sum_i p_i \log_2 \frac{1}{p_i} \ge 0,$$
where the $p_i$ are the probabilities of the input sequences.
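As a quick illustration (the distribution and codeword lengths below are made-up examples, not from the lecture), both the entropy and the expected output length are one-liners in Python:

    from math import log2

    p = [0.5, 0.25, 0.25]        # input probabilities, summing to one
    l = [1, 2, 2]                # codeword lengths of some prefix code

    E = sum(pi * log2(1 / pi) for pi in p)
    expected_length = sum(pi * li for pi, li in zip(p, l))

    print(E, expected_length)    # 1.5 1.5: this code exactly meets the entropy bound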

Shannon faced three problems:

1. Find a binary tree that minimizes $\sum_i p_i l_i$ (solved by his student, David Huffman).
2. Prove $E \le \min \sum_i p_i l_i$, where "min" refers to the minimum over all binary trees. (Thus, the expected length of the output, regardless of the compression method, is at least $E$.)
3. Prove $\sum_i p_i l_i \le E + 1$ for some binary tree. (This reassures us, since we can come close to the lower bound, $E$.)

We will first prove (2), $E \le \min \sum_i p_i l_i$.

Proof. Recall Kraft's inequality, which is valid for all binary trees:
$$\sum_i 2^{-l_i} \le 1.$$
By Taylor's series expansion, $\log_e x \le x - 1$. Now observe that:
$$\begin{aligned}
\sum_i p_i l_i &= \sum_i p_i \log_2 2^{l_i} \\
&= \sum_i p_i \log_2\left(2^{l_i} p_i \cdot \frac{1}{p_i}\right) \\
&= \sum_i p_i \log_2 \frac{1}{p_i} + \sum_i p_i \log_2\left(p_i 2^{l_i}\right) \\
&= E - (\log_2 e) \sum_i p_i \log_e \frac{1}{p_i 2^{l_i}} \\
&\ge E - (\log_2 e) \sum_i p_i \left(\frac{1}{p_i 2^{l_i}} - 1\right) \\
&= E - (\log_2 e) \left(\sum_i 2^{-l_i} - 1\right) \;\ge\; E,
\end{aligned}$$
where the final inequality uses Kraft's inequality. We have shown that $\sum_i p_i l_i \ge E$.

We must now exhibit a compression method with $\sum_i p_i l_i \le E + 1$.

Proof. We take $l_i = \lceil \log_2 (1/p_i) \rceil$, so we have
$$2^{-l_i} \le 2^{-\log_2 (1/p_i)} = p_i.$$
So Kraft's inequality holds:
$$\sum_i 2^{-l_i} \le \sum_i p_i = 1.$$
By ordering the lengths $l_i$ from small to large, and assigning the $l_i$'s to leaves in a binary tree from left to right, one can find a code with the given $l_i$'s. This code is called the Shannon-Fano code.
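A small numerical check of this construction (the function name shannon_fano_lengths and the sample distribution are my own choices): compute $l_i = \lceil \log_2(1/p_i) \rceil$, then verify Kraft's inequality and the bounds $E \le \sum_i p_i l_i \le E + 1$.

    from math import ceil, log2

    def shannon_fano_lengths(p):
        # Codeword lengths l_i = ceil(log2(1/p_i)) from the proof above.
        return [ceil(log2(1 / pi)) for pi in p]

    p = [0.4, 0.3, 0.2, 0.1]
    l = shannon_fano_lengths(p)        # [2, 2, 3, 4]

    kraft = sum(2 ** -li for li in l)  # must be <= 1 for a valid binary tree
    E = sum(pi * log2(1 / pi) for pi in p)
    avg = sum(pi * li for pi, li in zip(p, l))

    assert kraft <= 1 and E <= avg <= E + 1
    print(round(kraft, 4), round(E, 3), round(avg, 3))   # 0.6875 1.846 2.4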

Now, since $\lceil x \rceil \le 1 + x$,
$$\sum_i p_i l_i \le \sum_i p_i \left(1 + \log_2 \frac{1}{p_i}\right) = 1 + E.$$

Huffman Tree

A Huffman tree is a binary tree that minimizes $\sum_i p_i l_i$, where $p_i$ is the weight of leaf $i$ and $l_i$ is the distance from leaf $i$ to the root. It has the following properties:

1. Two inputs with the smallest $p_i$ values are furthest from the root.
2. Every internal node has 2 children.
3. Two inputs with the smallest $p_i$ values can safely be made siblings.

It is important to note that Huffman trees are not unique! Huffman's algorithm is a greedy algorithm designed to output a Huffman tree given a set of inputs and their $p_i$'s. It has time complexity $O(n \log n)$.

Setup: Let PQ be a binary heap holding pairs $(i, p_i)$ with the smallest key $p_i$ near the root. Assuming that there are $n$ leaves, we can reserve $n - 1$ internal nodes, for an array of total size $2n - 1$. Let us use left[i] and right[i] to denote the children of node $i$. Node 1 is the root.

HuffmanTree:
    MAKENULL(PQ)
    for i = n to 2n - 1 do              // the leaves
        left[i] = right[i] = nil
        INSERT((i, p_i), PQ)
    for i = n - 1 down to 1 do          // the internal nodes, built bottom-up
        (a, p_a) = DELETEMIN(PQ)
        (b, p_b) = DELETEMIN(PQ)
        left[i] = a
        right[i] = b
        INSERT((i, p_a + p_b), PQ)
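The same greedy procedure is directly runnable in Python, with the standard-library heapq module standing in for PQ (a sketch under my own naming conventions, not the lecture's code):

    import heapq

    def huffman_tree(p):
        # Repeatedly merge the two smallest weights; leaves are 0..n-1,
        # internal nodes are numbered n, n+1, ...; the last one is the root.
        n = len(p)
        left, right = {}, {}
        heap = [(p[i], i) for i in range(n)]
        heapq.heapify(heap)
        nxt = n
        while len(heap) > 1:
            pa, a = heapq.heappop(heap)
            pb, b = heapq.heappop(heap)
            left[nxt], right[nxt] = a, b
            heapq.heappush(heap, (pa + pb, nxt))
            nxt += 1
        return heap[0][1], left, right   # (root, children maps)

    def leaf_depths(v, left, right, d=0):
        # The depth of each leaf is its codeword length l_i.
        if v not in left:                # leaf
            return {v: d}
        return {**leaf_depths(left[v], left, right, d + 1),
                **leaf_depths(right[v], left, right, d + 1)}

    root, left, right = huffman_tree([1/3, 1/3, 1/3])
    print(leaf_depths(root, left, right))   # {2: 1, 0: 2, 1: 2}, up to tie-breaking

For three equal weights this gives lengths 1, 2, 2, hence expected length $\frac{1}{3}(1 + 2 + 2) = \frac{5}{3}$, matching the single-symbol Huffman code in the examples below.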

Example: how to construct a Huffman tree. [Worked figure omitted from this transcript.]

Examples

We will now show different methods of coding and see how they compare with Shannon's lower bound. Suppose our input is $x_1, x_2, \ldots, x_n$, where the $x_i$ are uniformly random elements of $\{1, 2, 3\}$. There are, therefore, $3^n$ equally likely input sequences of length $n$. Note that
$$E = \log_2 3^n = n \log_2 3 \approx 1.58n.$$

1) (Fixed-width code.) We use two bits per input symbol, using the fixed-width code $1 \to 01$, $2 \to 10$, $3 \to 11$. So the length of the output is $2n$, which is not optimal. There is room for a smaller expected output length.

2) (Huffman code.) Consider the Huffman code where symbols are coded symbol by symbol using a Huffman tree prefix code: $1 \to 0$, $2 \to 10$, $3 \to 11$. The expected output length per symbol is
$$\sum_i p_i l_i = \tfrac{1}{3}(1) + \tfrac{1}{3}(2) + \tfrac{1}{3}(2) = \tfrac{5}{3}.$$
Thus, the expected output length is $\frac{5}{3} n \approx 1.67n$, which is still noticeably larger than $E \approx 1.58n$.

3) (Huffman code on groups.) Let us now make groups of fixed length $d$. Each group of $d$ symbols is an input symbol coded by a Huffman code. The expected output length in bits will be $n/d$ times the expected length of the Huffman code for one group, which we know is at most $1 + \log_2 3^d$. So the overall expected length is at most
$$\frac{n}{d} \left(1 + \log_2 3^d\right) = \frac{n}{d} \left(1 + d \log_2 3\right) = n \left(\log_2 3 + \frac{1}{d}\right).$$
Finally, by choosing $d$ large enough, we can get arbitrarily close to $E$. We cannot take $d$ too large, though, because computing the Huffman code would require too much space, as the Huffman tree has $3^d$ leaves.
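A quick numerical look at this convergence (the loop is mine; it evaluates the per-symbol bound $(1 + d \log_2 3)/d$ derived above, not an exact Huffman construction):

    from math import log2

    for d in (1, 2, 4, 8, 16):
        bound = (1 + d * log2(3)) / d    # bits per symbol for blocks of length d
        print(d, round(bound, 3))
    # 1 2.585 / 2 2.085 / 4 1.835 / 8 1.71 / 16 1.647 -> approaches log2(3) ~ 1.585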

References

[1] Charles E. Leiserson, Thomas H. Cormen, and Ronald L. Rivest. Introduction to Algorithms. Cambridge, MA, 2009.

[2] Claude E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 1948.

[3] Adel Magra, Emma Goune, Irene Woo. Information theory. http://luc.devroye.org/magra-goune-woo--shannon+InformationTheory-LectureNotes-McGillUniversity-2017.pdf, March 2017. Accessed on 2018-03-20.

[4] George Markowsky. Information theory. https://www.britannica.com/science/information-theory, June 2017. Accessed on 2018-03-11.