
UNIT I INFORMATION THEORY

Claude Shannon (1916-2001), the creator of information theory, laid the foundation for implementing logic in digital circuits in his master's thesis (1937) and published the paper "A Mathematical Theory of Communication" in 1948. The ideas of information theory have since been applied to many areas, such as wireless communication, video compression, and bioinformatics.

Objective
To introduce to the students the concepts of amount of information, entropy, channel capacity, error-detection and error-correction codes, block coding, convolutional coding, and the Viterbi decoding algorithm; to calculate the capacity of a communication channel, with and without noise; to study coding schemes, including error-correcting codes; to see how discrete channels and measures of information generalize to their continuous forms; the Fourier perspective; and extensions to wavelets, complexity, compression, and efficient coding of audio-visual information.

Information theory
Information theory is a branch of mathematics that overlaps with communications engineering, biology, medical science, sociology, and psychology. The theory is devoted to the discovery and exploration of mathematical laws that govern the behavior of data as it is transferred, stored, or retrieved. Information theory deals with the measurement and transmission of information through a channel. Whenever data is transmitted, stored, or retrieved, a number of variables are involved, such as bandwidth, noise, data transfer rate, storage capacity, number of channels, propagation delay, signal-to-noise ratio, accuracy (or error rate), intelligibility, and reliability. In audio systems, additional variables include fidelity and dynamic range. In video systems, image resolution, contrast, color depth, color realism, distortion, and the number of frames per second are significant variables.

Information
Suppose the allowable messages (or symbols) are m_1, m_2, ... with probabilities of occurrence p_1, p_2, ... The transmitter selects message k with probability p_k. (The complete set of symbols {m_1, m_2, ...} is called the alphabet.) If the receiver correctly identifies the message, then an amount of information

    I_k = \log_2(1/p_k)

has been conveyed. I_k is dimensionless, but is measured in bits. The definition of information satisfies a number of useful criteria:

It is intuitive: the occurrence of a highly probable event carries little information (I_k = 0 for p_k = 1).
It is positive: information may not decrease upon receiving a message (I_k ≥ 0 for 0 ≤ p_k ≤ 1).
We gain more information when a less probable message is received (I_k > I_l for p_k < p_l).
Information is additive if the messages are independent:

    I_{k,l} = \log_2(1/(p_k p_l)) = \log_2(1/p_k) + \log_2(1/p_l) = I_k + I_l

Entropy
In information theory, entropy is a measure of the uncertainty associated with a random variable. Shannon entropy quantifies the expected value of the information contained in a message, usually in units of bits. Equivalently, the Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable.

Suppose we have M different independent messages (as before), and that a long sequence of L messages is generated. In the L-message sequence, we expect p_1 L occurrences of m_1, p_2 L occurrences of m_2, and so on. The total information in the sequence is

    I_total = p_1 L \log_2(1/p_1) + p_2 L \log_2(1/p_2) + ...

so the average information per message interval is

    H = I_total / L = p_1 \log_2(1/p_1) + p_2 \log_2(1/p_2) + ... = \sum_{k=1}^{M} p_k \log_2(1/p_k)

This average information is referred to as the entropy.
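The two quantities defined above are straightforward to compute. The following minimal Python sketch (the probability values are an assumed example, not taken from the text) evaluates the self-information of individual messages and the entropy of the source:

import math

def self_information(p):
    # I = log2(1/p) bits conveyed by a message of probability p
    return math.log2(1.0 / p)

def entropy(probs):
    # H = sum of p_k * log2(1/p_k) bits per message (zero-probability terms contribute nothing)
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Assumed example: four messages with probabilities 1/2, 1/4, 1/8, 1/8
probs = [0.5, 0.25, 0.125, 0.125]
print([self_information(p) for p in probs])   # [1.0, 2.0, 3.0, 3.0] bits
print(entropy(probs))                         # 1.75 bits per message

For this dyadic distribution the script prints an entropy of exactly 1.75 bits per message.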

Information rate
If the source generates messages at the rate of r messages per second, then the information rate is defined as

    R = r H  (average number of bits of information per second).

Classification of codes
A binary code encodes each character as a binary string, or codeword. There are two ways to encode a file:

A fixed-length code: each codeword has the same length. ASCII, the most widely used code for representing text in computer systems, is a fixed-length code.
A variable-length code: codewords may have different lengths. Examples are Morse code, the Shannon-Fano code, and the Huffman code.

A prefix code is one in which no codeword is a prefix of any other codeword.

Source Coding Theorem (Shannon's first theorem)
The theorem can be stated as follows: given a discrete memoryless source of entropy H(S), the average codeword length L for any distortionless source coding is bounded as

    L ≥ H(S)

This theorem provides the mathematical tool for assessing data compaction, i.e. lossless data compression, of data generated by a discrete memoryless source. The entropy of a source is a function of the probabilities of the source symbols that constitute the alphabet of the source.
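As a small numerical illustration of these definitions (the prefix code, probabilities, and message rate below are assumed for the example and are not taken from the text), the following Python sketch computes the average codeword length of a prefix code, checks the source coding bound, and evaluates the information rate R = rH:

import math

# Assumed example source: symbol -> (probability, prefix codeword)
source = {
    "A": (0.5,   "0"),
    "B": (0.25,  "10"),
    "C": (0.125, "110"),
    "D": (0.125, "111"),
}

H = sum(p * math.log2(1.0 / p) for p, _ in source.values())   # source entropy, bits/symbol
L_avg = sum(p * len(code) for p, code in source.values())     # average codeword length, bits/symbol

r = 1000            # assumed message rate, symbols per second
R = r * H           # information rate R = r * H, bits per second

print(H, L_avg, L_avg >= H)   # 1.75 1.75 True  (this code meets the bound with equality)
print(R)                      # 1750.0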

Entropy of Discrete Memoryless Source
Assume that the source output is modeled as a discrete random variable S, which takes on symbols from a fixed finite alphabet

    S = {s_0, s_1, ..., s_{K-1}}

with probabilities

    P(S = s_k) = p_k,   k = 0, 1, ..., K-1,   where \sum_{k=0}^{K-1} p_k = 1.

Define the amount of information gained after observing the event S = s_k as the logarithmic function

    I(s_k) = \log_2(1/p_k)  bits.

The entropy of the source is defined as the mean of I(s_k) over the source alphabet S:

    H(S) = E[I(s_k)] = \sum_{k=0}^{K-1} p_k \log_2(1/p_k)  bits.

The entropy is a measure of the average information content per source symbol. The source coding theorem is also known as the "noiseless coding theorem" in the sense that it establishes the condition for error-free encoding to be possible.

Channel Coding Theorem (Shannon's second theorem)
The channel coding theorem for a discrete memoryless channel is stated in two parts as follows:

(a) Let a discrete memoryless source with an alphabet S have entropy H(S) and produce symbols once every T_s seconds. Let a discrete memoryless channel have capacity C and be used once every T_c seconds. Then if

    H(S)/T_s ≤ C/T_c

there exists a coding scheme for which the source output can be transmitted over the channel and be reconstructed with an arbitrarily small probability of error.

(b) Conversely, if

    H(S)/T_s > C/T_c

it is not possible to transmit information over the channel and reconstruct it with an arbitrarily small probability of error.

The theorem specifies the channel capacity C as a fundamental limit on the rate at which the transmission of reliable, error-free messages can take place over a discrete memoryless channel.

Theorem: Kraft inequality
An instantaneous code (prefix code, tree code) with codeword lengths l_1, ..., l_N exists if and only if

    \sum_{i=1}^{N} 2^{-l_i} ≤ 1.
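The Kraft inequality is easy to check numerically. A minimal Python sketch (the codeword length sets are assumed examples):

def kraft_sum(lengths):
    # Sum of 2^(-l_i) over the proposed codeword lengths
    return sum(2.0 ** -l for l in lengths)

def prefix_code_exists(lengths):
    # A prefix (instantaneous) code with these lengths exists iff the sum is <= 1
    return kraft_sum(lengths) <= 1.0

print(prefix_code_exists([1, 2, 3, 3]))   # True  (sum = 1.0, e.g. the code 0, 10, 110, 111)
print(prefix_code_exists([1, 1, 2]))      # False (sum = 1.25, no such prefix code exists)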

Shannon-Fano coding
A variable-length coding based on the frequency of occurrence of each character. Divide the characters into two sets whose total frequencies are as close to equal as possible, and assign 0 to one set and 1 to the other. Repeatedly divide each set in the same way until every character has a unique code. Shannon-Fano coding produces a prefix code. Huffman coding is optimal for character coding (one character, one codeword) and simple to program. Arithmetic coding is better still, since it can allocate fractional bits, but it is more complicated and has been encumbered by patents.

Example: Shannon-Fano Coding
To create a code tree according to Shannon and Fano, an ordered table is required giving the frequency of each symbol. Each part of the table is divided into two segments; the algorithm has to ensure that the upper and the lower segment have nearly the same sum of frequencies. This procedure is repeated until only single symbols are left.

Symbol   Frequency   Code Length   Code   Total Length
-------------------------------------------------------
A           24            2         00         48
B           12            2         01         24
C           10            2         10         20
D            8            3         110        24
E            8            3         111        24
-------------------------------------------------------
Total: 62 symbols; Shannon-Fano coded: 140 bits; linear coding (3 bits/symbol): 186 bits

The original data can thus be coded with an average length of 2.26 bits per symbol (140/62), whereas linear coding of the 5 symbols would require 3 bits per symbol. But before generating a Shannon-Fano code tree, the frequency table must be known or be derived from preceding data.

Step-by-Step Construction

Symbol   Frequency   Step 1         Step 2         Step 3
                     Sum   Code     Sum   Code     Sum   Code
---------------------------------------------------------------
A           24        24    0        24    00
B           12        36    0        12    01
C           10        26    1        10    10
D            8        16    1        16             16    110
E            8         8    1         8              8    111
---------------------------------------------------------------

Code trees according to the steps mentioned above: (figures for steps 1, 2, and 3 not reproduced here)
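A compact Python sketch of this splitting procedure (a straightforward recursive implementation, not code from the text) reproduces the code table above:

def shannon_fano(freqs):
    # freqs: dict of symbol -> frequency; returns dict of symbol -> binary codeword string.
    symbols = sorted(freqs, key=lambda s: freqs[s], reverse=True)
    codes = {s: "" for s in symbols}

    def split(group):
        # Split the frequency-ordered group so both halves have nearly equal total frequency,
        # append 0 to the upper half's codes and 1 to the lower half's, then recurse.
        if len(group) <= 1:
            return
        total = sum(freqs[s] for s in group)
        running, best_i, best_diff = 0, 1, float("inf")
        for i in range(1, len(group)):
            running += freqs[group[i - 1]]
            diff = abs(total - 2 * running)
            if diff < best_diff:
                best_diff, best_i = diff, i
        upper, lower = group[:best_i], group[best_i:]
        for s in upper:
            codes[s] += "0"
        for s in lower:
            codes[s] += "1"
        split(upper)
        split(lower)

    split(symbols)
    return codes

# The example from the table above:
print(shannon_fano({"A": 24, "B": 12, "C": 10, "D": 8, "E": 8}))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}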

Huffman coding
Definition: a minimal variable-length character coding based on the frequency of each character. First, each character becomes a one-node binary tree, with the character as the only node; the character's frequency is the tree's frequency. The two trees with the least frequencies are joined as the subtrees of a new root that is assigned the sum of their frequencies. Repeat until all characters are in one tree. One code bit represents each level, so more frequent characters are near the root and are coded with few bits, while rare characters are far from the root and are coded with many bits. The worst case for Huffman coding (equivalently, the longest Huffman codeword for a set of characters) occurs when the distribution of frequencies follows the Fibonacci numbers. Joining trees by frequency is the same as merging sequences by length in an optimal merge. Since a node with only one child is not optimal, any Huffman coding corresponds to a full binary tree. Huffman coding is one of many lossless compression algorithms, and it produces a prefix code. A Python sketch of this procedure is given after the worked example below.

Example: "abracadabra" Symbol Frequency a 5 b 2 r 2 c 1 d 1 According to the outlined coding scheme the symbols "d" and "c" will be coupled together in a first step. The new interior node will get the frequency 2. Step 1 Symbol Frequency Symbol Frequency a 5 a 5 b 2 b 2 r 2 r 2 c 1 -----------> 1 2 d 1

Code tree after the 1st step: Step 2 Symbol Frequency Symbol Frequency a 5 a 5 b 2 b 2 r 2 -----------> 2 4 1 2 Code tree after the 2nd step: Step3 Symbol Frequency Symbol Frequency a 5 a 5 2 4 -----------> 3 6 b 2 Code tree after the 3rd step:

Step 4
Symbol           Frequency                Symbol                Frequency
(b,(r,(c,d)))        6    ----------->    (a,(b,(r,(c,d))))        11
a                    5

Code tree after the 4th step: (figure omitted)

Code Table
If only one single node remains in the table, it forms the root of the Huffman tree. The paths from the root node to the leaf nodes define the code word used for the corresponding symbol:

Symbol   Frequency   Code Word
a            5          0
b            2          10
r            2          111
c            1          1101
d            1          1100

Complete Huffman Tree: (figure omitted)
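A short Python sketch of the procedure described above (a standard heap-based construction, not code from the text) reproduces the "abracadabra" example. Note that ties in frequency may be broken differently, so individual codewords can differ from the table while the total coded length stays the same:

import heapq
from itertools import count

def huffman_codes(freqs):
    # Build the Huffman tree with a min-heap of (frequency, tiebreak, subtree) entries,
    # then read codewords off the tree. freqs is a dict of symbol -> frequency.
    tick = count()                       # tiebreaker so the heap never compares subtrees
    heap = [(f, next(tick), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # the two trees with the least frequencies
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tick), (left, right)))
    _, _, tree = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):           # interior node: 0 for left subtree, 1 for right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                 # leaf: record the accumulated codeword
            codes[node] = prefix or "0"
    walk(tree, "")
    return codes

text = "abracadabra"
freqs = {s: text.count(s) for s in set(text)}
codes = huffman_codes(freqs)
print(codes)                                          # e.g. {'a': '0', 'b': '10', ...}
print(sum(freqs[s] * len(codes[s]) for s in freqs))   # 23 bits for the whole string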

Applications of Huffman Codes
Lossless image compression
Text compression
Lossless audio compression

Block Huffman Codes (or Extended Huffman Codes)
If the source alphabet is rather large, p_max (the probability of the most likely symbol) is likely to be comparatively small. On the other hand, if the source alphabet contains only a few symbols, the chances are that p_max is quite large compared to the other probabilities. The average Huffman code length is upper bounded by H(X) + p_max + 0.086, which gives some assurance that the code is never very bad; block (extended) Huffman codes encode blocks of source symbols together to tighten this bound when p_max is large.
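As a check of this bound on the "abracadabra" example above (the numbers are computed from the frequency and code tables given earlier and rounded):

    H(X) = (5/11) \log_2(11/5) + 2 (2/11) \log_2(11/2) + 2 (1/11) \log_2(11) ≈ 2.04 bits/symbol
    L = (5·1 + 2·2 + 2·3 + 1·4 + 1·4)/11 = 23/11 ≈ 2.09 bits/symbol
    H(X) + p_max + 0.086 ≈ 2.04 + 5/11 + 0.086 ≈ 2.58 bits/symbol

so the actual average code length of about 2.09 bits/symbol comfortably satisfies the bound.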

Discrete Memoryless Channel
A communication channel may be defined as the path or medium through which the symbols flow to the receiver. A Discrete Memoryless Channel (DMC) is a statistical model with an input X and an output Y, as shown in the figure. During each unit of time (the signaling interval), the channel accepts an input symbol from X and, in response, emits an output symbol from Y. The channel is said to be "discrete" when the alphabets of X and Y are both finite. It is said to be "memoryless" when the current output symbol depends only on the current input symbol and not on any of the previous inputs.

Discrete Memoryless Channels
Binary Symmetric Channel
Binary Erasure Channel
Asymmetric Channel
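A DMC is fully described by its channel (transition probability) matrix p(y|x). As a small illustration, the Python sketch below models a binary symmetric channel with an assumed crossover probability and input distribution, computes the output distribution, and evaluates the well-known BSC capacity C = 1 - H(eps):

import math

# A binary symmetric channel flips each input bit with crossover probability eps.
# Rows of the channel matrix are p(y | x) for x = 0 and x = 1.
eps = 0.1                          # assumed crossover probability
channel = [
    [1 - eps, eps],                # p(y=0|x=0), p(y=1|x=0)
    [eps, 1 - eps],                # p(y=0|x=1), p(y=1|x=1)
]

p_x = [0.7, 0.3]                   # assumed input distribution p(x)

# Output distribution: p(y) = sum over x of p(x) * p(y|x)
p_y = [sum(p_x[x] * channel[x][y] for x in range(2)) for y in range(2)]
print(p_y)                         # approx. [0.66, 0.34]

# BSC capacity C = 1 - H(eps), with H the binary entropy function
H_b = -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)
print(1 - H_b)                     # about 0.531 bits per channel use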