
EE/Stats 376A: Information Theory                                   Winter 2017
Lecture 14: February 28
Lecturer: David Tse                                 Scribe: Sagnik M., Vivek B.

14.1 Outline

- Gaussian channel and capacity
- Information measures for continuous random variables

14.2 Recap

So far we have focused only on communication channels with a discrete alphabet. For instance, the binary erasure channel (BEC) is a good model for links and routes in a network, where packets of data are either transferred correctly or lost entirely. The binary symmetric channel (BSC), on the other hand, is quite an oversimplification of the errors in a physical channel and is therefore not a very realistic model.

14.3 Gaussian Channel

In reality, a communication channel is analog, which motivates the study of continuous-alphabet communication channels. The physical layer of a communication channel has additive noise arising from a variety of sources. By the central limit theorem, the cumulative effect of a large number of small random effects is approximately normal, so modelling the channel noise as a Gaussian random variable is natural and valid in a large number of situations. Mathematically, the Gaussian channel is a continuous-alphabet, discrete-time channel whose output Y_i is the sum of the input X_i and independent noise Z_i ~ N(0, σ^2):

    Y_i = X_i + Z_i.

14.3.1 Power Constraint

If we analyze the Gaussian channel like a discrete-alphabet channel, the natural approach would be to try to find its capacity. However, if the input is unconstrained, we can choose an infinite subset of inputs arbitrarily far apart, so that they are distinguishable at the output with arbitrarily small probability of error. Such a scheme would have infinite capacity. Here we will impose an average power constraint on the input X: any codeword (X_1, ..., X_n) transmitted over the channel must satisfy

    (1/n) Σ_{i=1}^n X_i^2 ≤ P,                                            (14.1)

i.e., the total energy of a codeword is at most nP.
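As a concrete illustration of the model above (a sketch of my own, not part of the scribed notes), the snippet below pushes one long block through the Gaussian channel. The parameter values n, P and sigma2 are arbitrary choices; the printout confirms that the input meets the power constraint (14.1) and that the output energy is roughly n(P + σ^2), a fact used in the sphere-packing argument of the next subsection.

```python
# Sketch (not from the notes): simulate Y_i = X_i + Z_i with Z_i ~ N(0, sigma2)
# under the average power constraint (1/n) * sum(X_i^2) <= P.
import numpy as np

rng = np.random.default_rng(0)
n, P, sigma2 = 100_000, 4.0, 1.0        # illustrative values, not from the notes

# Power-constrained input: i.i.d. N(0, P) entries, rescaled so the empirical
# power meets constraint (14.1) with equality.
X = rng.normal(0.0, np.sqrt(P), size=n)
X *= np.sqrt(P / np.mean(X ** 2))

Z = rng.normal(0.0, np.sqrt(sigma2), size=n)   # channel noise
Y = X + Z                                      # channel output

print("empirical input power :", np.mean(X ** 2))   # ~ P
print("empirical noise power :", np.mean(Z ** 2))   # ~ sigma^2
print("empirical output power:", np.mean(Y ** 2))   # ~ P + sigma^2
```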

14.3.2 Capacity

Theorem 1. In a Gaussian channel with average power constrained by P and noise distributed as N(0, σ^2), the channel capacity C is given by

    C = (1/2) log(1 + P/σ^2).

Theorem 1 is a cornerstone result of information theory, and we will devote this section to analyzing the sphere-packing structure of the Gaussian channel and giving an intuitive explanation of the theorem. In the next lecture we will prove the theorem rigorously.

The sphere-packing argument for the Gaussian channel is characterized by the noise spheres and the output sphere. In particular, the ratio of the volume of the output sphere to the volume of an average noise sphere gives an upper bound on the codebook size 2^{nR}.

Noise spheres: Consider any input codeword X of length n and the received vector Y, where Y_i = X_i + Z_i with Z_i ~ N(0, σ^2). The radius of the noise sphere is the distance between the input vector X and the received vector Y, equal to

    sqrt( Σ_{i=1}^n (X_i − Y_i)^2 ) = sqrt( Σ_{i=1}^n Z_i^2 ).

Now E[Z_i^2] = Var(Z_i) + (E[Z_i])^2 = σ^2, and since the Z_i are i.i.d., by the weak law of large numbers

    Σ_{i=1}^n Z_i^2 ≈ nσ^2.

Thus the radius of a noise sphere is approximately sqrt(nσ^2).

Output sphere: Because of the power constraint P, the input sphere has radius at most sqrt(nP), and the noise expands the input sphere into the output sphere. Since the output vectors have energy no greater than nP + nσ^2, they lie in a sphere of radius sqrt(n(P + σ^2)).

The volume of an n-dimensional sphere is given by C_n R^n, where R is its radius and C_n is some constant. Therefore,

    Volume of output sphere = C_n (n(P + σ^2))^{n/2},
    Volume of noise sphere  = C_n (nσ^2)^{n/2}.

Hence the number of non-intersecting noise spheres that fit inside the output sphere is at most

    C_n (n(P + σ^2))^{n/2} / ( C_n (nσ^2)^{n/2} ) = (1 + P/σ^2)^{n/2}.

For decoding with a low probability of error, we therefore need

    2^{nR} ≤ (1 + P/σ^2)^{n/2}   ⟹   R ≤ (1/2) log(1 + P/σ^2).

Similar to the discrete-alphabet case, we expect the expression (1/2) log(1 + P/σ^2) to be the solution of a mutual-information maximization problem under the power constraint (14.1):

    max_{f_X} I(X; Y)   s.t.   E[X^2] ≤ P,                                (14.2)

where I(X; Y) is some notion of mutual information between continuous random variables X and Y. A rigorous proof of the result necessitates the introduction of the corresponding information measures (mutual information, KL-divergence, entropy, etc.) for continuous random variables, which will then be used to solve the optimization problem (14.2) in the next lecture.
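To make the counting argument concrete, here is a small numerical sketch (added by me, not in the original notes) that evaluates the capacity formula and the sphere-packing count (1 + P/σ^2)^{n/2} for a few block lengths; the values of P and σ^2 are arbitrary. Note that the resulting bound on the rate per channel use is the same for every n and equals C.

```python
# Sketch (not from the notes): capacity formula vs. sphere-packing count.
import numpy as np

P, sigma2 = 4.0, 1.0                       # illustrative values
C = 0.5 * np.log2(1 + P / sigma2)          # capacity in bits per channel use
print("C =", round(C, 3), "bits/use")

for n in (10, 100, 1000):
    # log2 of (volume of output sphere) / (volume of one noise sphere);
    # the constant C_n cancels in the ratio.
    log2_count = (n / 2) * np.log2(1 + P / sigma2)
    print(f"n={n:4d}: at most 2^{log2_count:.1f} distinguishable codewords"
          f" -> rate <= {log2_count / n:.3f} bits/use")
```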

Definition 1. The quantity P/σ^2 is called the signal-to-noise ratio (SNR).

14.4 Information Measures for Continuous RVs

14.4.1 Mutual Information

For discrete random variables (X, Y) ~ p, the mutual information I(X; Y) equals

    E[ log ( p(X, Y) / (p(X) p(Y)) ) ].

The probability mass function for discrete RVs is akin to the density function for continuous ones. This motivates the following definition.

Definition 2. The mutual information of continuous random variables (X, Y) ~ f is given by

    I(X; Y) ≜ E[ log ( f(X, Y) / (f(X) f(Y)) ) ].

[Figure 14.1: A typical flow-chart for a communication channel: X → digital-to-analog converter → physical channel → analog-to-digital converter → Y.]

We now show that Definition 2 is sensible by proving that it is the limit of the mutual information between discretized versions of X and Y; since X and Y are the limits of arbitrarily fine discretizations, this shows that the definition is consistent with our previous (discrete) definitions. For Δ > 0, define

    X_Δ = iΔ  if  iΔ ≤ X < (i + 1)Δ,     Y_Δ = jΔ  if  jΔ ≤ Y < (j + 1)Δ.

Now, for small Δ we have p(x_Δ) ≈ f(x)Δ, p(y_Δ) ≈ f(y)Δ and p(x_Δ, y_Δ) ≈ f(x, y)Δ^2, since f is the probability density function. Then

    I(X_Δ; Y_Δ) = E[ log ( p(X_Δ, Y_Δ) / (p(X_Δ) p(Y_Δ)) ) ]
                ≈ E[ log ( f(X, Y)Δ^2 / ( (f(X)Δ)(f(Y)Δ) ) ) ]
                = E[ log ( f(X, Y) / (f(X) f(Y)) ) ]
                = I(X; Y),

and hence lim_{Δ→0} I(X_Δ; Y_Δ) = I(X; Y). Since Δ can be made arbitrarily small, I(X_Δ; Y_Δ) approximates I(X; Y) to arbitrary precision. Therefore, the definition of mutual information for continuous random variables is consistent with that for discrete ones.
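As a sanity check on Definition 2 (my own addition, not part of the notes), the snippet below estimates I(X; Y) by Monte Carlo directly from the density-ratio definition, for X ~ N(0, P) and Y = X + Z with Z ~ N(0, σ^2), and compares it with (1/2) log(1 + P/σ^2). Natural logarithms (nats) are used, and the parameter values and sample size are arbitrary.

```python
# Sketch (not from the notes): Monte Carlo estimate of
# I(X;Y) = E[ log f(X,Y) / (f(X) f(Y)) ] for X ~ N(0,P), Y = X + Z.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
P, sigma2, m = 4.0, 1.0, 500_000

X = rng.normal(0.0, np.sqrt(P), size=m)
Y = X + rng.normal(0.0, np.sqrt(sigma2), size=m)

# f(x,y) = f_X(x) * f_Z(y - x), so f(X,Y)/(f_X(X) f_Y(Y)) = f_Z(Y-X)/f_Y(Y),
# where Y ~ N(0, P + sigma2).
log_ratio = (norm.logpdf(Y - X, scale=np.sqrt(sigma2))
             - norm.logpdf(Y, scale=np.sqrt(P + sigma2)))

print("Monte Carlo I(X;Y):", np.mean(log_ratio), "nats")
print("(1/2) log(1+P/s2) :", 0.5 * np.log(1 + P / sigma2), "nats")
```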

14.4.2 Differential Entropy

The definition of mutual information immediately gives us

    I(X; Y) = E[ log ( f(X, Y) / (f(X) f(Y)) ) ]
            = E[ log ( f(Y | X) / f(Y) ) ]
            = E[ log ( 1 / f(Y) ) ] − E[ log ( 1 / f(Y | X) ) ].

Akin to the discrete case, we would like to define the entropy of Y to be E[log(1/f(Y))] and the conditional entropy of Y given X to be E[log(1/f(Y|X))]. However, the quantities 1/f(y) and 1/f(y|x) are not dimensionless. Furthermore, the entropy of a continuous random variable in its purest sense does not exist, since the number of random bits necessary to specify a real number to arbitrary precision is infinite. It would nevertheless be very convenient to have a measure for continuous random variables similar to entropy. This prompts the following definitions.

Definition 3. For a continuous random variable Y ~ f, the differential entropy is defined as

    h(Y) ≜ E[ log ( 1 / f(Y) ) ].

Definition 4. For continuous random variables (X, Y) ~ f, the differential conditional entropy of Y conditioned on X is defined as

    h(Y | X) ≜ E[ log ( 1 / f(Y | X) ) ].

There are, of course, both similarities and dissimilarities between entropy and differential entropy. We explore some of these in detail:

- H(X) ≥ 0 for any discrete random variable X. However, h(X) need not be nonnegative for every continuous random variable. This is because a probability mass function is always at most 1, but a density function can be arbitrarily large.

- H(X) is label-invariant. However, h(X) need not be label-invariant. The simplest change of labels, say Y = aX for a scalar a, proves this. Indeed, the density of Y is given by f_Y(y) = (1/|a|) f_X(y/a), and so

      h(Y) = E[ log ( 1 / f_Y(Y) ) ] = E[ log ( |a| / f_X(Y/a) ) ]
           = log|a| + E[ log ( 1 / f_X(X) ) ] = h(X) + log|a|.

14.4.3 Differential Entropies of Common Distributions

We finish these lecture notes by computing the differential entropies of random variables with some common distributions.

Uniform distribution: Suppose X ~ Unif(0, a). Then f(x) = 1/a for every x ∈ [0, a], so 1/f(x) = a and

    h(X) = E[ log ( 1 / f(X) ) ] = E[ log a ] = log a.
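The two properties listed above are easy to check numerically. The snippet below (my illustration, not from the notes) uses Unif(0, a) with a < 1 to show that differential entropy can be negative, and checks the scaling rule h(aU) = h(U) + log|a| against the closed form h(Unif(0, a)) = log a just computed. Natural logarithms are used and the value of a is arbitrary.

```python
# Sketch (not from the notes): negativity and non-label-invariance of h(.).
import numpy as np

a = 0.25                              # arbitrary scale, chosen < 1
h_unif_a = np.log(a)                  # h(Unif(0, a)) = log a  (nats)
print("h(Unif(0, 0.25)) =", h_unif_a, "nats (negative)")

# Scaling: U ~ Unif(0, 1) has h(U) = log 1 = 0, and aU ~ Unif(0, a), so
# h(aU) = h(U) + log|a| should equal log a.
h_U = 0.0
print("h(U) + log|a|    =", h_U + np.log(abs(a)))
```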

Normal distribution: Suppose X ~ N(0, 1). Then f(x) = (1/√(2π)) e^{−x²/2} for every x, so

    log ( 1 / f(x) ) = (1/2) log(2π) + (x²/2) log e.

The differential entropy of X is therefore

    h(X) = E[ log ( 1 / f(X) ) ]
         = (1/2) log(2π) + (log e / 2) · E[X²]
         = (1/2) log(2π) + (log e / 2) · ( Var(X) + (E[X])² )
         = (1/2) log(2πe).
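To close, a small Monte Carlo check (added by me, not part of the notes) that the differential entropy of a standard normal matches the closed form (1/2) log(2πe) derived above; natural logarithms are used, so the value is about 1.419 nats.

```python
# Sketch (not from the notes): Monte Carlo estimate of h(X) = E[log 1/f(X)]
# for X ~ N(0, 1), compared with the closed form (1/2) log(2*pi*e).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
X = rng.normal(size=1_000_000)

h_mc = -np.mean(norm.logpdf(X))            # E[-log f(X)]
h_exact = 0.5 * np.log(2 * np.pi * np.e)   # ~ 1.4189 nats

print("Monte Carlo h(N(0,1)):", h_mc)
print("Closed form          :", h_exact)
```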