Module 1. Introduction to Digital Communications and Information Theory. Version 2 ECE IIT, Kharagpur

Module 1. Introduction to Digital Communications and Information Theory

Lesson 3. Information Theoretic Approach to Digital Communications

After reading this lesson, you will learn about:
- the scope of Information Theory,
- measures of self and mutual information,
- entropy and average mutual information.

The classical theory of information plays an important role in defining the principles as well as the design of digital communication systems. Broadly, Information Theory provides answers to questions such as:
(a) What is information, and how do we quantify and measure it?
(b) What kind of processing of information may be possible to facilitate transmission of information efficiently through a transmission medium?
(c) What, if at all, are the fundamental limits for such processing of information, and how do we design efficient schemes to facilitate transmission (or storage) of information?
(d) How should devices be designed to approach these limits?

More specifically, knowledge of information theory helps us in the efficient design of digital transmission schemes and also in appreciating the scope and limits of the improvement that may be possible.

In this lesson, we introduce the concepts of information the way they are used in digital communications. It is customary and convenient to define information through the basic concepts of a statistical experiment. Let us consider a statistical experiment having a number of outcomes, rather than a single outcome. Examples of several such experiments with multiple outcomes can be drawn from the theory of digital communications; a few are mentioned below.

Examples of random experiments with multiple outcomes:
1. A signal source with a finite and discrete number of possible output letters may be viewed as a statistical experiment wherein an outcome of interest may be a sequence of source letters.
2. Another example, frequently referred to, is that of a discrete-input, discrete-output communication channel, where the input and output letters (or sequences) may be the outcomes of interest. We discuss more about such a channel later.

Joint Experiment

Let us consider a two-outcome experiment. Let x and y denote the two outcomes, where x denotes a selection from the set of alternatives

X = \{ a_1, a_2, \ldots, a_K \}    (1.3.1)

K is the number of elements in the set X. Similarly, let y denote a selection from the set of alternatives Y = \{ b_1, b_2, \ldots, b_J \}, with J being the number of elements in set Y. Now, the set of pairs \{ (a_k, b_j) \}, 1 \le k \le K and 1 \le j \le J, forms a joint sample space.

Let the corresponding joint probability of the event (x = a_k and y = b_j) over the joint sample space be denoted as

P_{XY}(a_k, b_j), where 1 \le k \le K and 1 \le j \le J    (1.3.2)

We are now in a position to define a joint ensemble XY that consists of the joint sample space and the probability measures of its elements \{ (a_k, b_j) \}. An event over the joint ensemble is defined as a subset of elements in the joint sample space.

Example: In the XY joint ensemble, the event that x = a_k corresponds to the subset of pairs \{ (a_k, b_1), (a_k, b_2), \ldots, (a_k, b_J) \}. So, we can write an expression for P_X(a_k) as

P_X(a_k) = \sum_{j=1}^{J} P_{XY}(a_k, b_j), or in short, P(x) = \sum_y P(x, y)    (1.3.3)

Conditional Probability

The conditional probability that the outcome y is b_j, given that the outcome x is a_k, is defined as

P_{Y|X}(b_j | a_k) = \frac{P_{XY}(a_k, b_j)}{P_X(a_k)}, assuming P_X(a_k) > 0    (1.3.4)

or, in short, P(y | x) = \frac{p(x, y)}{p(x)}.

Two events x = a_k and y = b_j are said to be statistically independent if P_{XY}(a_k, b_j) = P_X(a_k) \cdot P_Y(b_j). So, in this case,

P_{Y|X}(b_j | a_k) = \frac{P_X(a_k) \cdot P_Y(b_j)}{P_X(a_k)} = P_Y(b_j)    (1.3.5)

Two ensembles X and Y are statistically independent if and only if the above condition holds for each element in the joint sample space, i.e., for all j's and k's. So, two ensembles X and Y are statistically independent if all the pairs (a_k, b_j) are statistically independent in the joint sample space.
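As a quick numerical illustration of the marginal, conditional, and independence relations in (1.3.3)-(1.3.5), here is a minimal Python sketch (not part of the original lesson) that starts from a small, made-up joint probability table P_XY and derives everything else from it.

```python
import numpy as np

# Hypothetical joint PMF P_XY(a_k, b_j) for K = 2 and J = 2 (rows: a_k, columns: b_j).
# The entries are made up for illustration and must sum to 1.
P_XY = np.array([[0.40, 0.10],
                 [0.15, 0.35]])

# Marginals, Eq. (1.3.3): P_X(a_k) = sum_j P_XY(a_k, b_j), and similarly for P_Y.
P_X = P_XY.sum(axis=1)
P_Y = P_XY.sum(axis=0)

# Conditional probability, Eq. (1.3.4): P_{Y|X}(b_j | a_k) = P_XY(a_k, b_j) / P_X(a_k).
P_Y_given_X = P_XY / P_X[:, None]

# Independence check, Eq. (1.3.5): X and Y are independent iff P_XY = P_X * P_Y for all pairs.
independent = np.allclose(P_XY, np.outer(P_X, P_Y))

print("P_X          =", P_X)            # [0.5 0.5]
print("P_Y          =", P_Y)            # [0.55 0.45]
print("P(Y|X)       =\n", P_Y_given_X)  # each row sums to 1
print("independent? =", independent)    # False for this table
```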

The above concept of a joint ensemble can easily be extended to more than two (many) outcomes. We are now in a position to introduce the concept of information. For the sake of generality, we first introduce the definition of mutual information.

Mutual Information

Let X = \{ a_1, a_2, \ldots, a_K \} and Y = \{ b_1, b_2, \ldots, b_J \} form a joint ensemble with joint probability assignments P_{XY}(a_k, b_j). The mutual information between the events x = a_k and y = b_j is defined as

I_{X;Y}(a_k; b_j) \triangleq \log_2 \frac{P_{X|Y}(a_k | b_j)}{P_X(a_k)} bits = \log_2 \frac{\text{a posteriori probability of } x}{\text{a priori probability of } x} bits    (1.3.6)

i.e., mutual information = \log_2 \frac{\text{conditional probability of } a_k \text{ given } b_j}{\text{probability of } a_k}

This is the information provided about the event x = a_k by the occurrence of the event y = b_j. Similarly, it is easy to note that

I_{Y;X}(b_j; a_k) = \log_2 \frac{P_{Y|X}(b_j | a_k)}{P_Y(b_j)}    (1.3.7)

Further, it is interesting to note that

I_{Y;X}(b_j; a_k) = \log_2 \frac{P_{Y|X}(b_j | a_k)}{P_Y(b_j)} = \log_2 \frac{P_{XY}(a_k, b_j)}{P_X(a_k) P_Y(b_j)} = \log_2 \frac{P_{X|Y}(a_k | b_j)}{P_X(a_k)} = I_{X;Y}(a_k; b_j)

i.e., I_{X;Y}(a_k; b_j) = I_{Y;X}(b_j; a_k)    (1.3.8)

That is, the information obtained about a_k given b_j is the same as the information that may be obtained about b_j given that x = a_k. Hence this parameter is known as mutual information. Note that mutual information can be negative.
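To make the event-level definition (1.3.6) and the symmetry (1.3.8) concrete, the following sketch evaluates I(a_k; b_j) for every pair in the same hypothetical joint table used above; the numbers are illustrative only.

```python
import numpy as np

# Same hypothetical joint PMF as before (rows: a_k, columns: b_j).
P_XY = np.array([[0.40, 0.10],
                 [0.15, 0.35]])
P_X = P_XY.sum(axis=1)
P_Y = P_XY.sum(axis=0)

# Eqs. (1.3.6)/(1.3.8): I(a_k; b_j) = log2( P_{X|Y}(a_k|b_j) / P_X(a_k) )
#                                   = log2( P_XY(a_k, b_j) / (P_X(a_k) * P_Y(b_j)) ),
# which is symmetric in the two events.
I_events = np.log2(P_XY / np.outer(P_X, P_Y))

print(I_events)
# Diagonal entries come out positive (observing b_1 or b_2 makes the matching a_k more likely);
# off-diagonal entries are negative, showing that the mutual information of a pair of events
# can indeed be negative.
```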

Self-Information

Consider once again the expression for mutual information over a joint ensemble:

I_{X;Y}(a_k; b_j) = \log_2 \frac{P_{X|Y}(a_k | b_j)}{P_X(a_k)} bits    (1.3.9)

Now, if P_{X|Y}(a_k | b_j) = 1, i.e., if the occurrence of the event y = b_j certainly specifies the outcome x = a_k, we see that

I_{X;Y}(a_k; b_j) = I_X(a_k) = \log_2 \frac{1}{P_X(a_k)} = -\log_2 P_X(a_k) bits    (1.3.10)

This is defined as the self-information of the event x = a_k. It is always non-negative. So, we say that self-information is a special case of mutual information over a joint ensemble.

Entropy (average self-information)

The entropy of an ensemble is the average value of the self-information over the ensemble and is given by

H(X) = \sum_{k=1}^{K} P_X(a_k) \log_2 \frac{1}{P_X(a_k)} = -\sum_x p(x) \log_2 p(x) bits    (1.3.11)

To summarize: the self-information of an event x = a_k is I_X(a_k) = -\log_2 P_X(a_k) bits, while the entropy of the ensemble X is H(X) = \sum_{k=1}^{K} P_X(a_k) \log_2 \frac{1}{P_X(a_k)} bits.

Average Mutual Information

The concept of averaging information over an ensemble is also applicable over a joint ensemble, and the average information thus obtained is known as the average mutual information:

I(X; Y) \triangleq \sum_{k=1}^{K} \sum_{j=1}^{J} P_{XY}(a_k, b_j) \log_2 \frac{P_{X|Y}(a_k | b_j)}{P_X(a_k)} bits    (1.3.12)
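The entropy of (1.3.11) and the average mutual information of (1.3.12) can be computed directly from a joint probability table. The sketch below does this for the same made-up table used earlier; the values quoted in the comments are approximate.

```python
import numpy as np

# Same hypothetical joint PMF as before.
P_XY = np.array([[0.40, 0.10],
                 [0.15, 0.35]])
P_X = P_XY.sum(axis=1)
P_Y = P_XY.sum(axis=0)

def entropy(p):
    """H = -sum p*log2(p), Eq. (1.3.11); zero-probability terms contribute nothing."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Eqs. (1.3.12)/(1.3.13): I(X;Y) = sum_{k,j} P_XY * log2( P_XY / (P_X P_Y) ).
I_XY = np.sum(P_XY * np.log2(P_XY / np.outer(P_X, P_Y)))

print("H(X)   =", entropy(P_X))   # 1.0 bit, since P_X is uniform here
print("H(Y)   =", entropy(P_Y))   # about 0.99 bit
print("I(X;Y) =", I_XY)           # about 0.19 bit; the same value is obtained as I(Y;X)
```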

It is straightforward to verify that

I(X; Y) = \sum_{k=1}^{K} \sum_{j=1}^{J} P_{XY}(a_k, b_j) \log_2 \frac{P_{XY}(a_k, b_j)}{P_X(a_k) P_Y(b_j)} = \sum_{k=1}^{K} \sum_{j=1}^{J} P_{XY}(a_k, b_j) \log_2 \frac{P_{Y|X}(b_j | a_k)}{P_Y(b_j)} = I(Y; X)    (1.3.13)

The unit of average mutual information is the bit. Average mutual information does not depend on specific joint events; it is a property of the joint ensemble.

Let us consider an example in some detail to appreciate the significance of self and mutual information.

Example of a Binary Symmetric Channel (BSC)

Let us consider a channel input alphabet X = \{ a_1, a_2 \} and a channel output alphabet Y = \{ b_1, b_2 \}. Further, let P(b_1 | a_1) = P(b_2 | a_2) = 1 - \varepsilon and P(b_2 | a_1) = P(b_1 | a_2) = \varepsilon. This is an example of a binary channel, as both the input and output alphabets have two elements each. Further, the channel is symmetric and unbiased in its behavior towards the two possible input letters a_1 and a_2. Note that, in general, the output alphabet Y need not be the same as the input alphabet.

[Fig. 1.3.1: Representation of a binary symmetric channel. Input a_1 goes to output b_1 and input a_2 to output b_2 with probability 1 - \varepsilon; the crossovers a_1 \to b_2 and a_2 \to b_1 occur with probability \varepsilon, the probability of an error during transmission.]

Usually, if a_1 is transmitted through the channel, b_1 will be received at the output provided the channel has not caused any error. So, \varepsilon in our description represents the average probability of the channel causing an error in a transmitted letter.

Let us assume that the probabilities of occurrence of a_1 and a_2 are the same, i.e., P_X(a_1) = P_X(a_2) = 0.5. A source presenting a finite variety of letters with equal probability is known as a discrete memoryless source (DMS). Such a source is unbiased towards any letter or symbol.

Now, observe that

P_{XY}(a_1, b_1) = P_{XY}(a_2, b_2) = \frac{1 - \varepsilon}{2} and P_{XY}(a_1, b_2) = P_{XY}(a_2, b_1) = \frac{\varepsilon}{2}

Note that the output letters are also equally probable, i.e.,

P_Y(b_j) = \sum_i P(b_j | x_i) P_X(x_i) = \frac{1}{2} for j = 1, 2

So,

P_{X|Y}(a_1 | b_1) = P_{X|Y}(a_2 | b_2) = 1 - \varepsilon and P_{X|Y}(a_1 | b_2) = P_{X|Y}(a_2 | b_1) = \varepsilon

Therefore, the possible mutual information values are:

I_{X;Y}(a_1; b_1) = I_{X;Y}(a_2; b_2) = \log_2 2(1 - \varepsilon) and I_{X;Y}(a_1; b_2) = I_{X;Y}(a_2; b_1) = \log_2 2\varepsilon

Next, we determine an expression for the average mutual information of the joint ensemble XY:

I(X; Y) = (1 - \varepsilon) \log_2 2(1 - \varepsilon) + \varepsilon \log_2 2\varepsilon bits

Some observations:

(a) Case 1: Let \varepsilon \approx 0. In this case,
I(a_1; b_1) = I(a_2; b_2) = \log_2 2(1 - \varepsilon) \approx 1.0 bit, while
I(a_1; b_2) = I(a_2; b_1) = \log_2 2\varepsilon bits, a large negative quantity.
This is an almost noiseless channel. When a_1 is transmitted, we receive b_1 at the output almost certainly.

(b) Case 2: Let us put \varepsilon = 0.5. In this case,
I(a_1; b_1) = I(a_1; b_2) = 0 bit.
This represents a very noisy channel over which no information is received. The input and output are statistically independent, i.e., the received symbols do not help in assessing which symbols were transmitted.

(c) Case 3: Fortunately, in most practical scenarios, 0 < \varepsilon \ll 0.5 and usually
I(a_1; b_1) = I(a_2; b_2) = \log_2 2(1 - \varepsilon) \approx 1.0 bit.
So, reception of the letter y = b_1 implies that it is highly likely that a_1 was transmitted. Typically, in a practical digital transmission system, \varepsilon is about 10^{-3} or less.

Fig. 1.3.2 shows the variation of I(X; Y) vs. \varepsilon: the average mutual information decreases from 1.0 bit at \varepsilon = 0 to 0 bit at \varepsilon = 0.5.

[Fig. 1.3.2: Variation of average mutual information I(X; Y) vs. \varepsilon for a binary symmetric channel]
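The three cases above are easy to check numerically. The sketch below (an illustrative aid with a hypothetical helper name, not part of the original lesson) evaluates I(X; Y) = (1 - \varepsilon) \log_2 2(1 - \varepsilon) + \varepsilon \log_2 2\varepsilon for a few values of \varepsilon and reproduces the trend of Fig. 1.3.2. Note that this expression equals 1 minus the binary entropy of \varepsilon.

```python
import numpy as np

def bsc_avg_mutual_info(eps):
    """Average mutual information of a BSC with equiprobable inputs:
    I(X;Y) = (1 - eps) * log2(2 * (1 - eps)) + eps * log2(2 * eps).
    Terms whose probability is zero contribute nothing."""
    total = 0.0
    for p in (1.0 - eps, eps):
        if p > 0.0:
            total += p * np.log2(2.0 * p)
    return total

for eps in (0.0, 1e-3, 0.1, 0.5):
    print(f"eps = {eps:6.4f}  ->  I(X;Y) = {bsc_avg_mutual_info(eps):.4f} bit")

# Expected output (approximate):
# eps = 0.0000  ->  I(X;Y) = 1.0000 bit   (almost noiseless channel, Case 1)
# eps = 0.0010  ->  I(X;Y) = 0.9886 bit   (typical practical channel, Case 3)
# eps = 0.1000  ->  I(X;Y) = 0.5310 bit
# eps = 0.5000  ->  I(X;Y) = 0.0000 bit   (useless channel, Case 2)
```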

Problems

Q1.3.1 Why is the theory of information relevant for understanding the principles of digital communication systems?

Q1.3.2 Consider a random sequence of 6 binary digits where the probability of occurrence of each digit is 0.5. How much information is contained in this sequence?

Q1.3.3 A discrete memoryless source has the alphabet {1, 2, ..., 9}. If the probability of generating a digit is inversely proportional to its value, determine the entropy of the source.

Q1.3.4 Describe a situation drawn from your experience where the concept of mutual information may be useful.