I - Information theory basics


Introduction

To communicate, that is, to carry information between two points, we can employ analog or digital transmission techniques. In digital communications the message is constituted by sequences of bits. Digital transmissions offer the following advantages:

1. Robustness to noise and interference that cannot be attained with analog systems (through the use of source and channel coding and a convenient transmission rate in bits/second).

2. Integration of several information sources (analog and digital) in a common format.

3. Security of the information along the path between the source and the destination (through the use of encrypted messages and spread spectrum techniques).

4. Efficient storage of large amounts of data in optical or magnetic media.

5. Flexibility in the transmission of information through the communication network by formatting data in packets (data + origin and destination addresses + packet number).

Fig. 1 depicts a point-to-point digital communication system. The transmission channel is the physical transmission medium used to connect the source of information (transmitter) to the user (receiver). Different types of channels can be defined, depending on the part of the system we are analyzing. Between the modulator output and the demodulator input we have a continuous channel modeled, for instance, according to Fig. 2. In this case, the channel is completely characterized by the noise probability density function of w(t). A common channel in Telecommunications is the AWGN (additive white Gaussian noise) channel, where w(t) is additive Gaussian noise with power spectral density $G_w(f) = N_0/2$. If, alternatively, we consider in Fig. 1 the channel encoder output and the decoder input, we have a discrete channel that accepts symbols $x_i$ of an input alphabet X provided by the channel encoder and produces symbols $y_j$ belonging to an output alphabet Y. When X and Y contain the same symbols, $y_j$ is an estimate of the transmitted symbol $x_i$.

Figure 1: Functional block diagram of a point-to-point digital communication system (discrete source, source encoder, channel encoder, modulator, transmission channel, demodulator, channel decoder, source decoder, user).

Figure 2: Diagram of the AWGN channel, $y(t) = x(t) + w(t)$.

2 Source encoding

Consider that the message consists of a sequence of symbols selected from a finite set, named the source alphabet. In general, we can associate a given probability with the occurrence of each symbol of the alphabet. Besides, the successive emitted symbols may be statistically independent or exhibit some type of dependency between them. In the former case we say that the source is memoryless.

The amount of information carried by a given symbol of the alphabet depends on the uncertainty of its occurrence. For instance, given the sentences "a dog bit a man" and "a man bit a dog", the amount of information is larger in the latter sentence because the probability of occurrence of the latter event is smaller (an event with probability one corresponds to a null amount of information).

Consider a finite alphabet X formed by M symbols $\{x_i\}_{i=1}^{M}$ and define a message as a sequence of independent symbols $x(n)$, $n = 0, 1, \ldots$, with $n$ denoting time. A probability of occurrence $p_i = \mathrm{Prob}(x_i)$ is associated with each symbol. The amount of information corresponding to that symbol is

$$I(x_i) = \log_2 \frac{1}{p_i} \quad \text{(bits)}$$
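As a quick numerical illustration of this definition, the minimal Python sketch below (the probabilities are arbitrary, chosen only for the example) evaluates $I(x_i)$ for a few symbol probabilities, showing that rarer symbols carry more information and that a sure event carries none.

```python
import math

def self_information(p):
    """Amount of information I = log2(1/p), in bits, of a symbol with probability p."""
    return math.log2(1.0 / p)

for p in (1.0, 0.5, 0.25, 0.125):
    print(f"p = {p:5.3f} -> I = {self_information(p):.2f} bits")
# p = 1.000 -> I = 0.00 bits  (a certain event carries no information)
# p = 0.125 -> I = 3.00 bits  (rarer symbols carry more information)
```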

In order to characterize the alphabet we define the average content of information (or entropy) of X,

$$H(X) = \sum_{i=1}^{M} p_i I(x_i) = \sum_{i=1}^{M} p_i \log_2 \frac{1}{p_i}$$

which is expressed in bits/symbol.

Example: A source alphabet consists of four symbols with probabilities $p_1 = 1/2$, $p_2 = 1/4$, $p_3 = p_4 = 1/8$. The source entropy is given by

$$H(X) = \frac{1}{2}\log_2 2 + \frac{1}{4}\log_2 4 + 2\cdot\frac{1}{8}\log_2 8 = 1.75 \ \text{bits/symbol}$$

A problem that arises is the encoding of each source symbol through a binary code word (using the binary symbols 0 and 1). Since the way of encoding each symbol is not unique, this leads to the question of optimizing the encoding process in the sense of minimizing the average number of bits (binary symbols) used to transmit the message. A classical source encoding example is the Morse code, where the letters A..Z, the numbers 0..9 and some punctuation marks are encoded in words constituted by dashes and dots.

Let L be the average length of the code words, given by

$$L = \sum_{i=1}^{M} p_i l_i,$$

where $l_i$ is the length (in bits) of the code word associated with the symbol $x_i$. It can be proven that the average length of the code words presents a minimum value such that $L \geq H(X)$ in order to allow the discrete memoryless source X to be encoded and uniquely decoded (without ambiguity), that is, in such a way that to each finite sequence of bits there corresponds at most one message.
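A minimal Python sketch of these two quantities, reproducing the four-symbol example above (H = 1.75 bits/symbol) and checking the bound $L \geq H(X)$; the prefix code used here, {0, 10, 110, 111}, is an illustrative choice and is not taken from the notes.

```python
import math

def entropy(probs):
    """H(X) = sum_i p_i log2(1/p_i), in bits/symbol (zero-probability terms are skipped)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [1/2, 1/4, 1/8, 1/8]
code  = ["0", "10", "110", "111"]                   # an illustrative prefix code

H = entropy(probs)                                  # 1.75 bits/symbol
L = sum(p * len(w) for p, w in zip(probs, code))    # average code word length

print(f"H(X) = {H:.2f} bits/symbol, L = {L:.2f} bits/word, L >= H(X): {L >= H}")
```

For this particular source the bound is attained with equality (L = 1.75 bits/word), since all symbol probabilities are negative powers of two.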

A sufficient condition that allows the code to be uniquely decodable and instantaneous (each word is immediately decoded after its occurrence) is that no code word is the prefix of another longer code word (prefix code). For instance, the following code

symbols    code words
x_1        0
x_2        10
x_3        110
x_4        1100

is ambiguous, or not uniquely decodable, because the sequence of bits 1100 can represent either the symbols $x_3 + x_1$ or the symbol $x_4$. But the next code is decodable without ambiguity (prefix code):

symbols    code words
x_1        0
x_2        10
x_3        110
x_4        1110

The Huffman procedure can be used to build uniquely decodable codes:

step 1. Order the M symbols according to decreasing values of their probabilities.

step 2. Group the last two symbols, $x_{M-1}$ and $x_M$, into an equivalent symbol with probability $p_{M-1} + p_M$.

step 3. Repeat steps 1 and 2 until only one symbol is left.

step 4. Using the tree generated by the previous steps, associate the binary symbols 0 and 1 to each pair of branches originating from a given intermediate node. The code word of each message symbol is written (from left to right) as a binary sequence read from the root of the tree (thus, from right to left).

Example: Determine a Huffman code for the following source

symbols    probabilities p_i
x_1        0.4
x_2        0.25
x_3        0.20
x_4        0.1
x_5        0.05

Solution: A possible solution is

symbols    code words
x_1        1
x_2        01
x_3        000
x_4        0010
x_5        0011

corresponding to the Huffman tree of Fig. 3.

Figure 3: Example of a Huffman tree (successive probability columns: 0.4, 0.25, 0.2, 0.1, 0.05 → 0.4, 0.25, 0.2, 0.15 → 0.4, 0.35, 0.25 → 0.6, 0.4 → 1.0).

The efficiency of the resulting code is defined as

$$\eta = \frac{H}{L},$$

where

$$H = 0.4\log_2(1/0.4) + 0.25\log_2(1/0.25) + 0.2\log_2(1/0.2) + 0.1\log_2(1/0.1) + 0.05\log_2(1/0.05) = 2.04 \ \text{bits/symbol}$$

and

$$L = 0.4\cdot 1 + 0.25\cdot 2 + 0.20\cdot 3 + 0.1\cdot 4 + 0.05\cdot 4 = 2.1 \ \text{bits/word},$$

yielding $\eta = H/L = 97.2\%$.

The Huffman algorithm, proposed in 1952, requires a probabilistic source model. This data compression technique was later surpassed by the Lempel-Ziv algorithm (invented in 1978), which is adaptive and does not require knowledge of the source distribution model. The Lempel-Ziv algorithm is nowadays the most popular data compression technique; when applied to English texts it allows a compaction of about 55%, whereas the Huffman algorithm allows about 43% compaction.
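The following Python sketch (an illustration, not part of the worked example) builds a Huffman code for the five-symbol source above using the standard library heapq module and recomputes H, L and the efficiency $\eta = H/L$. The code word lengths should come out as 1, 2, 3, 4, 4, matching the solution, although the actual bit patterns may differ since several equivalent Huffman codes exist.

```python
import heapq
import math

def huffman_lengths(probs):
    """Return the Huffman code word length of each symbol, given the symbol probabilities."""
    # Heap items: (probability of the group, tie-breaker, symbol indices in the group)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)                  # unique tie-breaker for merged groups
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # the two least probable groups...
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:                 # ...are merged: every symbol in the merged
            lengths[s] += 1               # group gets one more code bit
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = [0.4, 0.25, 0.20, 0.1, 0.05]
lengths = huffman_lengths(probs)                 # expected: [1, 2, 3, 4, 4]
H = sum(p * math.log2(1.0 / p) for p in probs)   # ~2.04 bits/symbol
L = sum(p * l for p, l in zip(probs, lengths))   # ~2.1 bits/word
print("code word lengths:", lengths)
print(f"H = {H:.2f} bits/symbol, L = {L:.2f} bits/word, efficiency = {H / L:.1%}")
```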

Note that the purpose of source encoding is to reduce the source code redundancy and not to protect against channel errors. That task is assigned to channel encoding, to be discussed later in this course.

3 Gaussian channel capacity

The optimal digital system is the one that minimizes the bit error probability when certain constraints are imposed on the transmitted energy and the channel bandwidth. An issue is the possibility of transmitting data without bit errors through a noisy channel. This problem was solved by Claude Shannon in 1948, who showed that, for an AWGN channel, it is possible to transmit data with a bit error probability as small as desired (virtually tending to zero) provided that the transmission rate (in bits/second) is smaller than the channel capacity

$$C = B \log_2\left(1 + \frac{P}{N_0 B}\right) \quad \text{bits/s}$$

where B is the channel bandwidth in Hz, P is the average power of the received signal in watts and $P/(N_0 B)$ is the reception signal-to-noise ratio. The channel capacity theorem establishes the theoretical limit that actual communication systems can achieve, although it does not specify which modulation and encoding/decoding techniques should be used to attain that limit.

Example: What is the capacity of the AWGN channel with bandwidth B = 10 kHz when the signal-to-noise ratio is: a) 0 dB; b) 20 dB?

Solution:

a) $C = 10^4 \log_2 2 = 10$ kbit/s

b) $C = 10^4 \log_2 101 \simeq 66.6$ kbit/s
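A short Python check of these two numbers (an illustrative sketch; the helper name is ours, not from the notes):

```python
import math

def awgn_capacity(bandwidth_hz, snr_db):
    """Shannon capacity C = B log2(1 + SNR) of an AWGN channel, in bit/s."""
    snr = 10 ** (snr_db / 10)                 # dB -> linear power ratio
    return bandwidth_hz * math.log2(1 + snr)

B = 10e3                                      # 10 kHz, as in the example
for snr_db in (0, 20):
    print(f"SNR = {snr_db:2d} dB -> C = {awgn_capacity(B, snr_db) / 1e3:.1f} kbit/s")
# SNR =  0 dB -> C = 10.0 kbit/s
# SNR = 20 dB -> C = 66.6 kbit/s
```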

Let $E_b$ be the average bit energy and $R_b$ the transmission rate in bits/second. The Shannon theorem may be re-written as

$$\frac{C}{B} = \log_2\left(1 + \frac{E_b R_b}{N_0 B}\right).$$

But $R_b \leq C$; thus

$$\frac{R_b}{B} \leq \log_2\left(1 + \frac{E_b R_b}{N_0 B}\right)$$

or

$$\frac{E_b}{N_0} \geq \frac{2^{R_b/B} - 1}{R_b/B}.$$

This inequality gives us the minimum value of the bit signal-to-noise ratio for transmissions with arbitrarily small error probabilities. If now we allow the channel bandwidth to increase to infinity, the asymptotic value of the capacity is

$$C_\infty = \lim_{B\to\infty} C = \lim_{B\to\infty} \frac{B}{\ln 2}\,\ln\left(1 + \frac{P}{N_0 B}\right) = \frac{1}{\ln 2}\,\lim_{B\to\infty} B\,\ln\left(1 + \frac{P}{N_0 B}\right)$$

But $\lim_{n\to\infty}\left(1 + 1/n\right)^n = e$, leading to

$$C_\infty = \frac{P/N_0}{\ln 2}$$

where $P = E_b/T_b = r_b E_b$ ($r_b = 1/T_b$ is the transmission rate in bits/s). But $r_b < C_\infty$, so

$$\frac{E_b}{N_0} > \ln 2 \simeq -1.6 \ \text{dB}$$

This value is the absolute minimum for communications with virtually null error probabilities, being named the Shannon limit.

Example: Determine the minimum bit signal-to-noise ratio to transmit with an arbitrarily small error probability at the rate of 1 kbit/second when the channel bandwidth is a) B = 1 kHz, b) B = 100 Hz.

Solution:

a) $R_b/B = 1$, $E_b/N_0 \geq 1$ (0 dB)

b) $R_b/B = 10$, $E_b/N_0 \geq 102.3$ (20.1 dB)
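The bound and the two example values can be checked numerically with the sketch below (illustrative helper, not from the notes):

```python
import math

def min_ebn0_db(spectral_efficiency):
    """Minimum Eb/N0, in dB, for reliable transmission at Rb/B = spectral_efficiency."""
    ebn0 = (2 ** spectral_efficiency - 1) / spectral_efficiency
    return 10 * math.log10(ebn0)

for rb_over_b in (1, 10):
    print(f"Rb/B = {rb_over_b:2d} -> Eb/N0 >= {min_ebn0_db(rb_over_b):5.1f} dB")
# Rb/B =  1 -> Eb/N0 >=   0.0 dB
# Rb/B = 10 -> Eb/N0 >=  20.1 dB

# Shannon limit (Rb/B -> 0): Eb/N0 > ln 2, i.e. about -1.59 dB
print(f"Shannon limit: {10 * math.log10(math.log(2)):.2f} dB")
```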

4 Discrete memoryless channel

A discrete channel is characterized by an input alphabet $X = \{x_i\}$, $i = 1, \ldots, M$, an output alphabet $Y = \{y_j\}$, $j = 1, \ldots, N$, and a set of conditional probabilities $p_{ij}$, where $p_{ij} = P(y_j \mid x_i)$ represents the probability of receiving the symbol $y_j$ when symbol $x_i$ was transmitted (see Fig. 4). It is assumed that the channel does not have memory, that is,

$$P(y(1),\ldots,y(n) \mid x(1),\ldots,x(n)) = \prod_{i=1}^{n} P(y(i) \mid x(i))$$

where $x(i)$ and $y(i)$ are respectively the channel input and output symbols that occur at the discrete time $i$, with $i = 1, \ldots, n$.

Figure 4: Model of the discrete memoryless channel (inputs $x_1,\ldots,x_M$, outputs $y_1,\ldots,y_N$, transition probabilities $p_{ij}$).

In general we have

$$\sum_{j=1}^{N} p_{ij} = 1, \qquad i = 1,\ldots,M$$

that is, the sum of all the transition probabilities using the same input symbol is equal to one. It is usual to organize the transition probabilities in the so-called channel matrix

$$\mathbf{P} = \begin{bmatrix} p_{11} & \cdots & p_{1N} \\ p_{21} & \cdots & p_{2N} \\ \vdots & & \vdots \\ p_{M1} & \cdots & p_{MN} \end{bmatrix}$$
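As a minimal illustration (not from the notes), a channel matrix can be stored as a nested list and the row-sum condition checked directly; the binary symmetric channel of Section 6, with crossover probability p, is used here as the example.

```python
def is_stochastic(P, tol=1e-9):
    """Check that every row of a channel (transition) matrix sums to one."""
    return all(abs(sum(row) - 1.0) < tol for row in P)

p = 0.1                              # illustrative crossover probability
P_bsc = [[1 - p, p],                 # rows: x1, x2; columns: y1, y2
         [p, 1 - p]]
print(is_stochastic(P_bsc))          # True
```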

For M = N we define the average error probability as

$$P(e) = \sum_{i=1}^{N}\sum_{\substack{j=1 \\ j\neq i}}^{N} P(x_i, y_j) = \sum_{i=1}^{N} P(x_i)\sum_{\substack{j=1 \\ j\neq i}}^{N} P(y_j \mid x_i) = \sum_{i=1}^{N} P(x_i)\sum_{\substack{j=1 \\ j\neq i}}^{N} p_{ij} = \sum_{i=1}^{N} P(x_i)\,(1 - p_{ii})$$

whereas the probability of receiving correctly the transmitted symbol is

$$P(c) = 1 - P(e) = \sum_{i=1}^{N} P(x_i)\, p_{ii}$$

noiseless channel. We have M = N and the transition probabilities are

$$p_{ij} = \begin{cases} 1, & j = i \\ 0, & j \neq i \end{cases}$$

Thus, P(e) = 0.

useless channel. We have M = N and the output symbols are independent of the input symbols,

$$p_{ij} = P(y_j \mid x_i) = P(y_j), \qquad \forall\, i, j$$

The noiseless channel and the useless channel are the extreme cases of the possible channel behavior. The output symbol of the noiseless channel defines uniquely the input symbol. In the useless channel the received symbol does not give any useful information about the transmitted symbol.

symmetric channel. In this channel each row of $\mathbf{P}$ contains the same set of values $\{r_j\}$, $j = 1, \ldots, N$, and each column contains the same set of values $\{q_i\}$, $i = 1, \ldots, M$. Examples:

$$\mathbf{P} = \begin{bmatrix} 1/2 & 1/3 & 1/6 \\ 1/6 & 1/2 & 1/3 \\ 1/3 & 1/6 & 1/2 \end{bmatrix}, \qquad \mathbf{P} = \begin{bmatrix} 1/3 & 1/3 & 1/6 & 1/6 \\ 1/6 & 1/6 & 1/3 & 1/3 \end{bmatrix}$$
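A brief numeric sketch of P(e) and P(c), applied to the first symmetric channel above; the equiprobable input distribution is an assumption made for the example, not given in the notes.

```python
def error_probability(px, P):
    """Average error probability P(e) = sum_i P(x_i) (1 - p_ii), for a channel with M = N."""
    return sum(px[i] * (1 - P[i][i]) for i in range(len(px)))

P = [[1/2, 1/3, 1/6],                 # symmetric channel example from the notes
     [1/6, 1/2, 1/3],
     [1/3, 1/6, 1/2]]
px = [1/3, 1/3, 1/3]                  # assumed equiprobable input symbols
Pe = error_probability(px, P)
print(f"P(e) = {Pe:.3f}, P(c) = {1 - Pe:.3f}")   # P(e) = 0.500, P(c) = 0.500
```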

Using the channel input and output alphabets, respectively X and Y, and the channel matrix $\mathbf{P}$, we can define the following five entropies.

(i) input entropy H(X)

$$H(X) = \sum_{i=1}^{M} P(x_i)\log_2\left(\frac{1}{P(x_i)}\right) \quad \text{bit/symbol}$$

which measures the average amount of information of each symbol of X.

(ii) output entropy H(Y)

$$H(Y) = \sum_{j=1}^{N} P(y_j)\log_2\left(\frac{1}{P(y_j)}\right) \quad \text{bit/symbol}$$

which measures the average amount of information of each symbol of Y.

(iii) joint entropy H(X, Y)

$$H(X,Y) = \sum_{i=1}^{M}\sum_{j=1}^{N} P(x_i, y_j)\log_2\left(\frac{1}{P(x_i, y_j)}\right) \quad \text{bit/(pair of symbols)}$$

which measures the average information content of a pair of output and input channel symbols.

(iv) conditional entropy H(Y | X)

$$H(Y \mid X) = \sum_{i=1}^{M}\sum_{j=1}^{N} P(x_i, y_j)\log_2\left(\frac{1}{P(y_j \mid x_i)}\right) \quad \text{bit/symbol}$$

which measures the average amount of information required to specify the output (received) symbol when the input (transmitted) symbol is known.

(v) conditional entropy H(X | Y)

$$H(X \mid Y) = \sum_{i=1}^{M}\sum_{j=1}^{N} P(x_i, y_j)\log_2\left(\frac{1}{P(x_i \mid y_j)}\right) \quad \text{bit/symbol}$$

which measures the average amount of information required to specify the input symbol when the output symbol is known. This conditional entropy represents the average amount of information that is lost in the channel (or equivocation). It can also be conceived as the uncertainty about the channel input after the observation of the channel output. Note that for a noiseless channel there is no loss of information in the channel and we have $H(X \mid Y) = 0$, whereas in the useless channel we have $H(X \mid Y) = H(X)$. In this case, the uncertainty about the transmitted symbol remains unaltered by the observation (reception) of the output symbol (all the information was lost in the channel).

Using the previous entropy definitions and the fact that $H(X \mid Y) \leq H(X)$ and $H(Y \mid X) \leq H(Y)$, we obtain

$$H(X,Y) = H(Y,X) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y) \tag{1}$$
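The five entropies and identity (1) can be checked numerically; the sketch below is illustrative and uses, as its example, the joint distribution of a binary symmetric channel with crossover probability p = 0.1 and $P(x_1) = 0.5$ (values assumed here, not taken from the notes).

```python
import math

def H(probs):
    """Entropy, in bits, of a list of probabilities (zero-probability terms are skipped)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Illustrative joint distribution P(x_i, y_j): BSC with p = 0.1 and P(x1) = q = 0.5
p, q = 0.1, 0.5
Pxy = [[(1 - p) * q,       p * q      ],
       [p * (1 - q), (1 - p) * (1 - q)]]

Px = [sum(row) for row in Pxy]                  # marginal distribution of X
Py = [sum(col) for col in zip(*Pxy)]            # marginal distribution of Y
HX, HY = H(Px), H(Py)
HXY = H([v for row in Pxy for v in row])        # joint entropy H(X,Y)

# Conditional entropies from definitions (iv) and (v), using P(yj|xi) = P(xi,yj)/P(xi), etc.
HY_X = sum(Pxy[i][j] * math.log2(Px[i] / Pxy[i][j]) for i in range(2) for j in range(2))
HX_Y = sum(Pxy[i][j] * math.log2(Py[j] / Pxy[i][j]) for i in range(2) for j in range(2))

print(f"H(X)={HX:.4f}  H(Y)={HY:.4f}  H(X,Y)={HXY:.4f}  H(Y|X)={HY_X:.4f}  H(X|Y)={HX_Y:.4f}")
assert abs(HX + HY_X - HXY) < 1e-12 and abs(HY + HX_Y - HXY) < 1e-12   # identity (1)
```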

5 Capacity of the discrete memoryless channel

We define the flow of information (or mutual information) between X and Y through the channel as

$$I(X;Y) \equiv H(X) - H(X \mid Y) \quad \text{bit/symbol} \tag{2}$$

or, using (1),

$$I(X;Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(X,Y) \tag{3}$$

Figure 5: Relation between the conditional entropies and the mutual information (diagram of H(X), H(Y), H(X | Y), H(Y | X) and I(X;Y)).

We have

$$I(X;Y) = H(X) + H(Y) - H(X,Y) = E\left[\log_2\frac{1}{P(X)}\right] + E\left[\log_2\frac{1}{P(Y)}\right] - E\left[\log_2\frac{1}{P(X,Y)}\right] = E\left[\log_2\frac{P(X,Y)}{P(X)P(Y)}\right] = \sum_{i=1}^{M}\sum_{j=1}^{N} P(x_i, y_j)\log_2\left(\frac{P(x_i, y_j)}{P(x_i)P(y_j)}\right)$$

But $P(x_i, y_j) = P(y_j \mid x_i)P(x_i)$, leading to

$$I(X;Y) = \sum_{i=1}^{M}\sum_{j=1}^{N} P(x_i, y_j)\log_2\left(\frac{P(y_j \mid x_i)}{P(y_j)}\right)$$

From (2) and (3) we also get $I(X;Y) = I(Y;X)$. The mutual information $I(X;Y)$ quantifies the reduction of the uncertainty about X given the knowledge of Y (see Fig. 5).
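A compact numeric check of this double-sum expression (illustrative; the example joint distribution is the same assumed BSC with p = 0.1 and equiprobable inputs used above):

```python
import math

def mutual_information(Pxy):
    """I(X;Y) = sum_ij P(x_i,y_j) log2[ P(x_i,y_j) / (P(x_i) P(y_j)) ], in bits."""
    Px = [sum(row) for row in Pxy]
    Py = [sum(col) for col in zip(*Pxy)]
    return sum(Pxy[i][j] * math.log2(Pxy[i][j] / (Px[i] * Py[j]))
               for i in range(len(Px)) for j in range(len(Py)) if Pxy[i][j] > 0)

p, q = 0.1, 0.5                                   # illustrative BSC joint distribution
Pxy = [[(1 - p) * q,       p * q      ],
       [p * (1 - q), (1 - p) * (1 - q)]]
print(f"I(X;Y) = {mutual_information(Pxy):.4f} bits")   # ~0.5310, i.e. 1 - H(0.1)
```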

The capacity C of a discrete memoryless channel is defined as the maximum of the mutual information I(X;Y) that can be transmitted through the channel,

$$C \equiv \max_{P(x)} I(X;Y) \quad \text{bit/transmission}$$

Maximization is carried out relative to the probabilities $P(x_i)$ of the input symbols.

6 Capacity of the binary symmetric channel

Consider the binary symmetric channel (BSC) of Fig. 6, with transition probability p, and let $q \equiv P(x_1)$ and $r \equiv P(y_1)$.

Figure 6: Binary symmetric channel (BSC), with inputs $x_1, x_2$, outputs $y_1, y_2$ and crossover probability p.

The entropy H(X) of source X is (see Fig. 7)

$$H(X) = H(q) = q\log_2\frac{1}{q} + (1 - q)\log_2\frac{1}{1-q}$$

and the entropy of source Y is $H(Y) = H(r)$. Besides,

$$H(Y \mid X) = \sum_{i=1}^{2}\sum_{j=1}^{2} P(x_i, y_j)\log_2\left(\frac{1}{P(y_j \mid x_i)}\right) \quad \text{bit/symbol}$$

with

$$P(x_1, y_1) = P(y_1 \mid x_1)P(x_1) = (1-p)\,q$$
$$P(x_1, y_2) = P(y_2 \mid x_1)P(x_1) = p\,q$$
$$P(x_2, y_1) = P(y_1 \mid x_2)P(x_2) = p\,(1-q)$$
$$P(x_2, y_2) = P(y_2 \mid x_2)P(x_2) = (1-p)(1-q)$$

Figure 7: Entropy H(q) of the binary source X, plotted versus q.

resulting in

$$H(Y \mid X) = (1-p)\,q\log_2\frac{1}{1-p} + p\,q\log_2\frac{1}{p} + p\,(1-q)\log_2\frac{1}{p} + (1-p)(1-q)\log_2\frac{1}{1-p} = H(p)$$

Thus, the mutual information of the BSC is given by

$$I(X;Y) = H(Y) - H(Y \mid X) = H(r) - H(p)$$

and the capacity of the BSC is

$$C = \max_{r}\{H(r)\} - H(p) \quad \text{bit/transmission}$$

or, taking into account Fig. 7,

$$C = 1 - H(p)$$

The plot of the BSC capacity versus the transition probability p is shown in Fig. 8.

Figure 8: Capacity of the BSC versus the transition probability p.
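The capacity formula can be cross-checked by maximizing $I(X;Y) = H(r) - H(p)$ numerically over the input probability q; a small illustrative sketch (crossover probability chosen arbitrarily):

```python
import math

def Hb(u):
    """Binary entropy function, in bits."""
    return 0.0 if u in (0.0, 1.0) else u * math.log2(1 / u) + (1 - u) * math.log2(1 / (1 - u))

def bsc_mutual_information(p, q):
    """I(X;Y) = H(r) - H(p) for a BSC with crossover probability p and P(x1) = q."""
    r = (1 - p) * q + p * (1 - q)         # r = P(y1)
    return Hb(r) - Hb(p)

p = 0.1                                   # illustrative crossover probability
qs = [k / 1000 for k in range(1001)]      # brute-force search over the input distribution
C_numeric = max(bsc_mutual_information(p, q) for q in qs)
print(f"max_q I(X;Y) = {C_numeric:.4f},  1 - H(p) = {1 - Hb(p):.4f}")   # both ~0.5310
```

The maximum is attained at q = 1/2, in agreement with the discussion that follows.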

The situation that leads to the maximum, that is, $H(r) = 1$, corresponds to

$$r\log_2\frac{1}{r} + (1-r)\log_2\frac{1}{1-r} = 1$$

which, by inspection of Fig. 7, gives $r = P(y_1) = 1/2$. In other words, the maximum of information transmission from the channel input to the output, for any value of p, occurs when the probabilities of $y_1$ and $y_2$ are equal.

The channel capacity is maximum when $p = 0$ or $p = 1$, since in both cases the channel is noiseless (see Fig. 9).

Figure 9: Noiseless channels (p = 0 and p = 1) that maximize the capacity C.

For $p = 1/2$, the channel capacity is zero because the output symbols are independent from the input symbols and no information can flow through the channel. We have then

$$I(X;Y) = H(Y) - H(Y \mid X) = H(r) - H(p) = H\left(\tfrac{1}{2}\right) - H\left(\tfrac{1}{2}\right) = 0$$
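A final numeric illustration of the curve $C(p) = 1 - H(p)$ (the sample points below are arbitrary): the capacity equals 1 bit/transmission at p = 0 and p = 1, and drops to zero at p = 1/2.

```python
import math

def Hb(u):
    """Binary entropy function, in bits."""
    return 0.0 if u in (0.0, 1.0) else u * math.log2(1 / u) + (1 - u) * math.log2(1 / (1 - u))

def bsc_capacity(p):
    """Capacity of the binary symmetric channel: C = 1 - H(p), in bit/transmission."""
    return 1 - Hb(p)

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p = {p:3.1f} -> C = {bsc_capacity(p):.3f} bit/transmission")
# p = 0.0 and p = 1.0 give C = 1 (noiseless channels); p = 0.5 gives C = 0 (useless channel)
```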

Bibliography

S. Benedetto, E. Biglieri, Principles of Digital Transmission with Wireless Applications, Kluwer, 1999.

S. Haykin, Communication Systems, 4th edition, Wiley, 2001.

C. E. Shannon, "A mathematical theory of communication", Bell Syst. Tech. J., vol. 27, pp. 379-423, 623-656, July-Oct. 1948.

S. Verdú, "Fifty years of Shannon theory", IEEE Trans. Inform. Theory, vol. 44, pp. 2057-2078, Oct. 1998.