Information and Entropy

Thomas Wiegand: Digital Image Communication

Outline: Shannon's Separation Principle, Source Coding Principles, Entropy, Variable Length Codes, Huffman Codes, Joint Sources, Arithmetic Codes, Adaptive Codes

Shannon's Separation Principle
Assumptions: single source and user, unlimited complexity and delay.
- Information source: generates the information we want to transmit or store.
- Source coding: reduces the number of bits needed to store or transmit the relevant information.
- Channel coding: increases the number of bits, or changes them, to protect against channel errors.

Practical Systems
Many applications are not uni-directional point-to-point transmissions: feedback, networks.
In any practical system, we can afford neither unlimited complexity nor unlimited delay:
- There will always be a small error rate unless we tolerate sub-optimality.
- It might work better to consider source and channel coding jointly.
- Consider the effect of transmission errors on the source decoding result.

Source Coding Principles
The source coder shall represent the video signal by the minimum number of (binary) symbols without exceeding an acceptable level of distortion. Two principles are utilized:
1. Properties of the information source that are known a priori result in redundant information that need not be transmitted (redundancy reduction).
2. The human observer does not perceive certain deviations of the received signal from the original (irrelevancy reduction).
Lossless coding: completely reversible, exploits principle 1 only.
Lossy coding: not reversible, exploits principles 1 and 2.

Entropy of a Memoryless Source
Let a memoryless source be characterized by an ensemble U_0 with:
- Alphabet {a_0, a_1, a_2, ..., a_{K-1}}
- Probabilities {P(a_0), P(a_1), P(a_2), ..., P(a_{K-1})}
Shannon: the information conveyed by message a_k is
  I(a_k) = -log(P(a_k))
The entropy of the source is the average information content:
  H(U_0) = E{I(a_k)} = -Σ_{k=0}^{K-1} P(a_k) log(P(a_k))
For log = log_2, the unit is bits/symbol.

Entropy and Bit-Rate
Properties of entropy:
- H(U_0) ≥ 0
- max{H(U_0)} = log K, attained when P(a_j) = P(a_k) for all j, k
- The entropy H(U_0) is a lower bound for the average word length l_av of a decodable variable length code.
- Conversely, the average word length l_av can approach H(U_0) if sufficiently large blocks of symbols are encoded jointly.
Redundancy of a code: ρ = l_av - H(U_0) ≥ 0
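
As a concrete check of the definitions above, the following sketch computes H(U_0) and the maximum entropy log2 K for a small alphabet; the probabilities are the ones used in the redundant-code example on a later slide.

```python
import math

def entropy(probs):
    """Entropy H(U0) = -sum P(a_k) * log2 P(a_k), in bits/symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Probabilities from the redundant-code example later in the lecture.
probs = [0.5, 0.25, 0.125, 0.125]

H = entropy(probs)
H_max = math.log2(len(probs))             # maximum entropy: uniform distribution
print(f"H(U0)  = {H:.3f} bits/symbol")    # 1.750
print(f"log2 K = {H_max:.3f} bits/symbol (upper bound)")
```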

Encoding with Variable Word Lengths
A code without redundancy, i.e. l_av = H(U_0), is achieved if all individual code word lengths satisfy
  l_cw(a_k) = -log(P(a_k))
For binary code words, all probabilities would have to be binary fractions:
  P(a_k) = 2^{-l_cw(a_k)}

Redundant Codes: Example

  a_i   P(a_i)   redundant code   optimum code
  a_1   0.500    00               0
  a_2   0.250    01               10
  a_3   0.125    10               110
  a_4   0.125    11               111

H(U_0) = 1.75 bits
Redundant code: l_av = 2 bits, ρ = 0.25 bits
Optimum code:   l_av = 1.75 bits, ρ = 0 bits
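
A small sketch to verify the numbers in the table above: it computes l_av and the redundancy ρ = l_av - H(U_0) for both codes (probabilities and code words taken from the example).

```python
import math

probs     = [0.500, 0.250, 0.125, 0.125]
redundant = ["00", "01", "10", "11"]
optimum   = ["0", "10", "110", "111"]

H = -sum(p * math.log2(p) for p in probs)            # 1.75 bits/symbol

for name, code in [("redundant", redundant), ("optimum", optimum)]:
    l_av = sum(p * len(cw) for p, cw in zip(probs, code))
    print(f"{name}: l_av = {l_av:.2f} bits, redundancy = {l_av - H:.2f} bits")
```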

Variable Length Codes
Unique decodability: where does each code word start or end?
- Insert a start symbol, e.g. 01.0.010.1. : wasteful
- Construct a prefix-free code instead.
Kraft inequality: test for uniquely decodable codes. A uniquely decodable code with code word lengths l_cw(a_k) exists if
  ς = Σ_{k=0}^{K-1} 2^{-l_cw(a_k)} ≤ 1

Application:

  a_i   P(a_i)   -log2(P(a_i))   Code A   Code B
  a_1   0.5      1               0        0
  a_2   0.2      2.32            01       10
  a_3   0.2      2.32            10       110
  a_4   0.1      3.32            111      111

Code A: ς = 1.125, not uniquely decodable
Code B: ς = 1, uniquely decodable

Prefix-Free Codes
Prefix-free codes are instantaneously and uniquely decodable.
Prefix-free codes can be represented by trees.
(Figure: binary code tree with root node, branches, terminal and interior nodes; the terminal nodes carry the code words 0, 10, 110, 111.)
- Terminal nodes may be assigned code words.
- Interior nodes cannot be assigned code words.
- For binary trees: N terminal nodes imply N-1 interior nodes.
The code {0, 01, 11} is not prefix-free; it is uniquely decodable, but non-instantaneous.
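
A minimal sketch of the Kraft inequality test applied to Code A and Code B from the table above; the helper name kraft_sum is my own.

```python
def kraft_sum(code_words):
    """Kraft sum: sum of 2^(-l) over all code word lengths l."""
    return sum(2.0 ** -len(cw) for cw in code_words)

code_a = ["0", "01", "10", "111"]
code_b = ["0", "10", "110", "111"]

for name, code in [("Code A", code_a), ("Code B", code_b)]:
    s = kraft_sum(code)
    verdict = "may be uniquely decodable" if s <= 1 else "not uniquely decodable"
    print(f"{name}: Kraft sum = {s:.3f} -> {verdict}")
```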

Huffman Code
Design algorithm for variable length codes proposed by D. A. Huffman (1952); it always finds a code with minimum redundancy.
Obtain the code tree as follows:
1. Pick the two symbols with the lowest probabilities and merge them into a new auxiliary symbol.
2. Calculate the probability of the auxiliary symbol.
3. If more than one symbol remains, repeat steps 1 and 2 for the new auxiliary alphabet.
4. Convert the code tree into a prefix code.

Huffman Code: Example
Source symbols and probabilities, with the resulting code words:

  symbol   probability   code word
  7        0.29          11
  6        0.28          10
  5        0.16          01
  4        0.14          001
  3        0.07          0001
  2        0.03          00001
  1        0.02          000001
  0        0.01          000000

(Figure: the corresponding code tree; the auxiliary symbols created by successive merging have probabilities 0.03, 0.06, 0.13, 0.27, 0.43 and 0.57.)
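
The four-step procedure above maps directly onto a priority queue. The sketch below is a generic Huffman construction (not the lecturer's code) applied to the example probabilities; the code word lengths should match the table, though the 0/1 label chosen at each merge may differ.

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code; probs maps symbol -> probability.
    Returns a dict symbol -> binary code word (as a string)."""
    # Heap entries: (probability, tie-breaker, {symbol: partial code word})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two lowest-probability entries
        p2, _, c2 = heapq.heappop(heap)
        # Merge: prepend '0' to one branch, '1' to the other.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {7: 0.29, 6: 0.28, 5: 0.16, 4: 0.14,
         3: 0.07, 2: 0.03, 1: 0.02, 0: 0.01}
code = huffman_code(probs)
l_av = sum(probs[s] * len(cw) for s, cw in code.items())
print(code)
print(f"average code word length: {l_av:.2f} bits/symbol")
```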

Joint Sources
Joint sources generate N symbols simultaneously. A coding gain can be achieved by encoding those symbols jointly. The lower bound for the average code word length is the joint entropy:
  H(U_1, U_2, ..., U_N) = -Σ_{u_1} Σ_{u_2} ... Σ_{u_N} P(u_1, u_2, ..., u_N) log P(u_1, u_2, ..., u_N)
It generally holds that
  H(U_1, U_2, ..., U_N) ≤ H(U_1) + H(U_2) + ... + H(U_N)
with equality if U_1, U_2, ..., U_N are statistically independent.

Markov Process
Neighboring samples of the video signal are not statistically independent: source with memory,
  P(u_T) ≠ P(u_T | u_{T-1}, u_{T-2}, ..., u_{T-N})
A source with memory can be modeled by a Markov random process. Conditional probabilities of the source symbols u_T of a Markov source of order N:
  P(u_T | Z_T) = P(u_T | u_{T-1}, u_{T-2}, ..., u_{T-N})
where Z_T is the state of the Markov source at time T.
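
A small sketch of the inequality above for N = 2: it compares the joint entropy of a symbol pair with the sum of the marginal entropies; the joint pmf is made up for illustration and is not from the lecture.

```python
import math
from collections import defaultdict

def H(pmf):
    """Entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# Illustrative joint pmf of two dependent binary symbols (U1, U2).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal pmfs of U1 and U2.
p1, p2 = defaultdict(float), defaultdict(float)
for (u1, u2), p in joint.items():
    p1[u1] += p
    p2[u2] += p

print(f"H(U1,U2)      = {H(joint):.3f} bits")       # joint entropy
print(f"H(U1) + H(U2) = {H(p1) + H(p2):.3f} bits")  # >= joint entropy
```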

Entropy of a Source with Memory
Markov source of order N: conditional entropy
  H(U_T | Z_T) = H(U_T | U_{T-1}, U_{T-2}, ..., U_{T-N})
              = -E{log P(u_T | u_{T-1}, u_{T-2}, ..., u_{T-N})}
              = -Σ_{u_T} ... Σ_{u_{T-N}} P(u_T, u_{T-1}, ..., u_{T-N}) log P(u_T | u_{T-1}, ..., u_{T-N})
It holds that H(U_T) ≥ H(U_T | Z_T), with equality for a memoryless source.
The average code word length can approach H(U_T | Z_T), e.g. with a switched Huffman code.
Number of states for an 8-bit video signal:
  N = 1:        256 states
  N = 2:     65,536 states
  N = 3: 16,777,216 states

Second Order Statistics for the Luminance Signal
(Figure: histogram of two horizontally adjacent pels, i.e. relative frequency of occurrence over the amplitude of the current pixel and the amplitude of the adjacent pixel; picture: female head-and-shoulder view.)
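
The gain from exploiting memory can be estimated numerically. The sketch below compares H(U_T) with an estimate of H(U_T | U_{T-1}) for a synthetic first-order Markov chain; the transition matrix is invented for illustration and is not from the lecture.

```python
import math
import random
from collections import Counter

random.seed(0)

# Synthetic first-order Markov source over a tiny alphabet {0, 1, 2}.
transition = {0: [0.8, 0.1, 0.1],
              1: [0.1, 0.8, 0.1],
              2: [0.1, 0.1, 0.8]}

# Generate a long realization of the chain.
samples, state = [], 0
for _ in range(200_000):
    state = random.choices([0, 1, 2], weights=transition[state])[0]
    samples.append(state)

def H(counts):
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)

marginal = Counter(samples)
joint    = Counter(zip(samples, samples[1:]))

H_marginal = H(marginal)             # estimate of H(U_T)
H_cond     = H(joint) - H(marginal)  # chain rule: H(U_T | U_{T-1}) = H(U_T, U_{T-1}) - H(U_{T-1})
print(f"H(U_T)         ≈ {H_marginal:.3f} bits")
print(f"H(U_T | U_T-1) ≈ {H_cond:.3f} bits (smaller: memory helps)")
```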

Arithmetic Coding
Universal entropy coding algorithm for strings:
- A string s is represented by a subinterval of the unit interval [0,1).
- The width of the subinterval is approximately equal to the probability p(s) of the string.
(Figure: the unit interval subdivided at the binary fractions 0.0001, 0.001, ..., 0.1111; a subinterval of width p(s) is marked.)
An interval of width p(s) is guaranteed to contain one number that can be represented by b binary digits, with
  -log2 p(s) + 1 ≤ b ≤ -log2 p(s) + 2
Thus each interval can be represented by a number which needs 1 to 2 bits more than the ideal code word length.

Arithmetic Coding: Probability Intervals
Random experiment with two symbols whose probabilities are the binary fractions (0.01)_b and (0.11)_b.
(Figure: successive subdivision of the probability interval, symbol by symbol; in the example the lower interval bound stays at (0.01)_b while the upper bound shrinks from (0.0111)_b via (0.010011)_b to (0.01000011)_b.)
The interval refinement requires multiplications and additions with (potentially very long) word lengths.
Universal coding: the probabilities can be changed on the fly, e.g. conditional probabilities p(· | ·) can be used.
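
A minimal sketch of the interval refinement idea, under assumptions of my own: symbol probabilities p('w') = 0.25 and p('s') = 0.75 (matching the (0.01)_b and (0.11)_b values above, with the symbol-to-probability assignment and the sub-interval ordering being assumptions, not the lecture's layout). It narrows the interval for the string 'wsss' and picks a binary fraction inside it.

```python
import math

# Assumed symbol probabilities: 0.01b = 0.25 and 0.11b = 0.75.
p   = {"w": 0.25, "s": 0.75}
# Assumed fixed symbol order: 'w' occupies the lower sub-interval, 's' the upper.
cum = {"w": 0.0, "s": 0.25}

def encode_interval(string):
    """Return (low, width) of the interval representing the string."""
    low, width = 0.0, 1.0
    for sym in string:
        low   += width * cum[sym]
        width *= p[sym]
    return low, width

low, width = encode_interval("wsss")
# A number inside the interval needs at most ceil(-log2(width)) + 1 bits.
b = math.ceil(-math.log2(width)) + 1
code_value = math.ceil(low * 2**b) / 2**b      # smallest b-bit fraction >= low
assert low <= code_value < low + width
print(f"interval: [{low:.6f}, {low + width:.6f}), width = {width:.6f}")
print(f"{b}-bit code value: {code_value:.6f}")
```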

Arithmetic Encoding and Decoding
(Figure, from Ohm: encoding of the string w, s, s, s into the bit sequence 010000, and decoding of the prefix 010 back into w, s; the probability intervals I_N=1..4 and the code intervals I_M=1..6 are shown side by side.)

Adaptive Entropy Coding
For non-adaptive coding methods, the pdf of the source must be known a priori (inherent assumption: stationary source). Image and video signals are not stationary, so performance is sub-optimal. Solution: adaptive entropy coding.
Two basic approaches to adaptation:
1. Forward adaptation
   - Gather statistics for a large enough block of source symbols.
   - Transmit the adaptation signal to the decoder as side information.
   - Drawback: increased bit-rate.
2. Backward adaptation
   - Gather statistics simultaneously at coder and decoder.
   - Drawback: reduced error resilience.
The two approaches can be combined to circumvent the drawbacks (packet-based transmission systems). A minimal backward-adaptive model update is sketched below.
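
As a sketch of backward adaptation (assumptions: a simple symbol-count model with periodic rescaling; a generic scheme, not the lecture's): encoder and decoder both update the same frequency table from already processed symbols, so no side information has to be transmitted.

```python
class AdaptiveModel:
    """Backward-adaptive probability model shared by encoder and decoder.
    Both sides call update() with the symbols they have already processed,
    so their probability estimates stay in sync without side information."""

    def __init__(self, alphabet, max_total=4096):
        self.counts = {s: 1 for s in alphabet}   # start from uniform counts
        self.max_total = max_total               # rescale threshold

    def probability(self, symbol):
        return self.counts[symbol] / sum(self.counts.values())

    def update(self, symbol):
        self.counts[symbol] += 1
        # Periodic rescaling lets the model track non-stationary statistics.
        if sum(self.counts.values()) > self.max_total:
            self.counts = {s: max(1, c // 2) for s, c in self.counts.items()}

# Encoder and decoder each keep their own copy and feed it the same symbols.
enc_model = AdaptiveModel(alphabet="ws")
for sym in "wsssswss":
    p = enc_model.probability(sym)   # would drive the arithmetic coder here
    enc_model.update(sym)
print({s: round(enc_model.probability(s), 2) for s in "ws"})
```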

Forward vs. Backward Adaptive Systems
(Figure, from Ohm: block diagrams of both systems. Forward adaptation: the adaptation signal is computed from the source symbols, which are delayed before encoding, and is conveyed over the channel to the decoder. Backward adaptation: the adaptation signal is computed from delayed symbols at both the encoder and the decoder, so no side information is transmitted.)

Summary
- Shannon's information theory vs. practical systems
- Source coding principles: redundancy and irrelevancy reduction
- Lossless vs. lossy coding
- Redundancy reduction exploits the properties of the signal source.
- Entropy is the lower bound for the average code word length.
- The Huffman code is an optimum entropy code, but Huffman coding needs a code table.
- Arithmetic coding is a universal method for encoding strings of symbols and does not need a code table.
- Adaptive entropy coding: gains for sources that are not stationary.