EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15

1. Cascade of Binary Symmetric Channels

The conditional probability distribution $p(y|x)$ for each of the BSCs may be expressed by the transition probability matrix $A$, given by
\[
A = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}.
\]
The transition matrix of the cascade of $n$ BSCs is $A_n = A^n$; it is possible to exploit the eigendecomposition of $A$ to compute $A^n$ easily:
\[
A = T^{-1} \begin{pmatrix} 1 & 0 \\ 0 & 1-2p \end{pmatrix} T, \qquad \text{where } T = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\]
Hence,
\[
A_n = T^{-1} \begin{pmatrix} 1 & 0 \\ 0 & (1-2p)^n \end{pmatrix} T
= \begin{pmatrix} \tfrac{1}{2}\bigl(1 + (1-2p)^n\bigr) & \tfrac{1}{2}\bigl(1 - (1-2p)^n\bigr) \\ \tfrac{1}{2}\bigl(1 - (1-2p)^n\bigr) & \tfrac{1}{2}\bigl(1 + (1-2p)^n\bigr) \end{pmatrix}.
\]
Hence, the error probability of the cascade is $\tfrac{1}{2}\bigl(1 - (1-2p)^n\bigr)$, and the cascade is equivalent to a single BSC with this crossover probability. Now let $E$ be a random variable indicating whether an error occurs in the cascade channel. $E$ takes values in $\{0, 1\}$, its distribution is $\Pr(E = 1) = \tfrac{1}{2}\bigl(1 - (1-2p)^n\bigr)$, and the capacity is
\[
C = I(X_0; X_n) = 1 - H(E),
\]
achieved using a uniform input distribution. Now, as $n \to \infty$, $\Pr(E = 1) \to \tfrac{1}{2}$, so the distribution of the error becomes uniform; we can now compute the limit:
\[
\lim_{n \to \infty} I(X_0; X_n) = \lim_{n \to \infty} \bigl(1 - H(E)\bigr) = 1 - 1 = 0.
\]

2. Channel with two independent looks at $Y$

(a) We have
\begin{align*}
I(X; Y_1, Y_2) &= H(Y_1, Y_2) - H(Y_1, Y_2 \mid X) \\
&\overset{(a)}{=} H(Y_1, Y_2) - H(Y_1 \mid X) - H(Y_2 \mid X) \\
&= H(Y_1) + H(Y_2) - I(Y_1; Y_2) - H(Y_1 \mid X) - H(Y_2 \mid X) \\
&= I(X; Y_1) + I(X; Y_2) - I(Y_1; Y_2) \\
&\overset{(b)}{=} 2 I(X; Y_1) - I(Y_1; Y_2),
\end{align*}
where equality $(a)$ is due to the Markov chain $Y_1 - X - Y_2$ and equality $(b)$ is because $Y_1$ and $Y_2$ are conditionally identically distributed given $X$.
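
A quick numerical sanity check of Problem 1 (not part of the original solution): the sketch below verifies the closed form for $A^n$ and the vanishing capacity, with the crossover probability $p$ and the cascade depth $n$ chosen arbitrarily.

import numpy as np

p, n = 0.1, 25                                   # arbitrary crossover probability and cascade depth
A = np.array([[1 - p, p], [p, 1 - p]])

# Closed form from the eigendecomposition: off-diagonal entry is (1 - (1-2p)^n)/2
q = 0.5 * (1 - (1 - 2 * p) ** n)
An_closed = np.array([[1 - q, q], [q, 1 - q]])

assert np.allclose(np.linalg.matrix_power(A, n), An_closed)

# Capacity of the equivalent BSC: 1 - H_b(q), which tends to 0 as n grows
Hb = lambda t: -t * np.log2(t) - (1 - t) * np.log2(1 - t)
print(q, 1 - Hb(q))                              # q -> 1/2 and capacity -> 0 for large n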

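The identity just derived in part (a) can also be checked numerically for a concrete channel. The sketch below is illustrative only (the joint distribution is my own choice, not given in the problem): $X \sim \mathrm{Bern}(1/2)$ is observed through two independent BSC(0.1) channels, so $Y_1$ and $Y_2$ are conditionally i.i.d. given $X$.

import itertools
import numpy as np

def H(p):
    """Entropy in bits of a pmf given as an array (zeros ignored)."""
    p = np.ravel(p)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Illustrative setup: X ~ Bern(1/2), two independent looks through BSCs with crossover 0.1
eps = 0.1
p_joint = np.zeros((2, 2, 2))                       # p(x, y1, y2)
for x, y1, y2 in itertools.product(range(2), repeat=3):
    p_joint[x, y1, y2] = 0.5 * (1 - eps if y1 == x else eps) * (1 - eps if y2 == x else eps)

def I(pxy):
    """Mutual information I(A;B) in bits from a joint pmf with A on axis 0."""
    return H(pxy.sum(1)) + H(pxy.sum(0)) - H(pxy)

I_X_Y1Y2 = I(p_joint.reshape(2, 4))                 # I(X; Y1, Y2)
I_X_Y1   = I(p_joint.sum(axis=2))                   # I(X; Y1)
I_Y1_Y2  = I(p_joint.sum(axis=0))                   # I(Y1; Y2)

print(I_X_Y1Y2, 2 * I_X_Y1 - I_Y1_Y2)               # equal, confirming the identity
print(I_X_Y1Y2 <= 2 * I_X_Y1)                        # True: I(X; Y1, Y2) <= 2 I(X; Y1)
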
(b) The capacity of the single-look channel $X \to Y_1$ is
\[
C_1 = \max_{P_X} I(X; Y_1).
\]
The capacity of the channel $X \to (Y_1, Y_2)$ is
\[
C_2 = \max_{P_X} I(X; Y_1, Y_2) = \max_{P_X} \bigl[ 2 I(X; Y_1) - I(Y_1; Y_2) \bigr] \leq \max_{P_X} 2 I(X; Y_1) = 2 C_1.
\]

3. Tall Fat People

(a) The average height of the individuals in the population is 5 feet, i.e., $\frac{1}{n} \sum_i h_i = 5$, where $n$ is the population size and $h_i$ is the height of the $i$-th person. If more than $1/3$ of the population were at least 15 feet tall, then the average would be greater than $\frac{1}{3} \cdot 15 = 5$ feet, since each person is at least 0 feet tall. Thus no more than $1/3$ of the population is at least 15 feet tall.

(b) By the same reasoning as in part (a), at most $1/2$ of the population is at least 10 feet tall and at most $1/3$ of the population weighs at least 300 lbs. Therefore, at most $1/3$ are both at least 10 feet tall and weigh at least 300 lbs.

4. Noise Alphabets

(a) The maximum capacity is 2 bits, attained with $\mathcal{Z} = \{10, 20, 30\}$ and $P_X = (1/4, 1/4, 1/4, 1/4)$.

(b) The minimum capacity is 1 bit, attained with $\mathcal{Z} = \{0, 1, 2\}$ and $P_X = (1/2, 0, 0, 1/2)$.

5. Joint Typicality

(a) Consider
\begin{align*}
\Pr\bigl( (\tilde{X}^n, \tilde{Y}^n, \tilde{Z}^n) \in A_\epsilon^{(n)} \bigr)
&= \sum_{(x^n, y^n, z^n) \in A_\epsilon^{(n)}} p(x^n)\, p(y^n)\, p(z^n) \\
&\leq \sum_{(x^n, y^n, z^n) \in A_\epsilon^{(n)}} 2^{-n(H(X)-\epsilon)}\, 2^{-n(H(Y)-\epsilon)}\, 2^{-n(H(Z)-\epsilon)} \\
&= |A_\epsilon^{(n)}|\, 2^{-n(H(X)-\epsilon)}\, 2^{-n(H(Y)-\epsilon)}\, 2^{-n(H(Z)-\epsilon)} \\
&\leq 2^{n(H(X,Y,Z)+\epsilon)}\, 2^{-n(H(X)-\epsilon)}\, 2^{-n(H(Y)-\epsilon)}\, 2^{-n(H(Z)-\epsilon)} \\
&= 2^{-n(H(X)+H(Y)+H(Z)-H(X,Y,Z)-4\epsilon)}.
\end{align*}
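
To get a feel for the exponent in the bound just derived, one can evaluate $H(X)+H(Y)+H(Z)-H(X,Y,Z)$ for a concrete joint pmf and see how quickly the upper bound decays in $n$. The snippet below is only an illustration; the Markov-structured joint pmf and the value of $\epsilon$ are arbitrary choices, not part of the problem.

import itertools
import numpy as np

def H(p):
    """Entropy in bits of a pmf given as an array (zeros ignored)."""
    p = np.ravel(p)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Arbitrary joint pmf: X ~ Bern(1/2), Y = X flipped w.p. 0.1, Z = Y flipped w.p. 0.1
p_xyz = np.zeros((2, 2, 2))
for x, y, z in itertools.product(range(2), repeat=3):
    p_xyz[x, y, z] = 0.5 * (0.9 if y == x else 0.1) * (0.9 if z == y else 0.1)

exponent = H(p_xyz.sum((1, 2))) + H(p_xyz.sum((0, 2))) + H(p_xyz.sum((0, 1))) - H(p_xyz)
eps = 0.05
for n in [50, 100, 200]:
    # Upper bound on Pr((X~n, Y~n, Z~n) in A_eps^(n)) from part (a)
    print(n, 2.0 ** (-n * (exponent - 4 * eps)))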

(b) We may reverse all the inequalities to obtain
\begin{align*}
\Pr\bigl( (\tilde{X}^n, \tilde{Y}^n, \tilde{Z}^n) \in A_\epsilon^{(n)} \bigr)
&= \sum_{(x^n, y^n, z^n) \in A_\epsilon^{(n)}} p(x^n)\, p(y^n)\, p(z^n) \\
&\geq \sum_{(x^n, y^n, z^n) \in A_\epsilon^{(n)}} 2^{-n(H(X)+\epsilon)}\, 2^{-n(H(Y)+\epsilon)}\, 2^{-n(H(Z)+\epsilon)} \\
&= |A_\epsilon^{(n)}|\, 2^{-n(H(X)+\epsilon)}\, 2^{-n(H(Y)+\epsilon)}\, 2^{-n(H(Z)+\epsilon)} \\
&\geq (1-\epsilon)\, 2^{n(H(X,Y,Z)-\epsilon)}\, 2^{-n(H(X)+\epsilon)}\, 2^{-n(H(Y)+\epsilon)}\, 2^{-n(H(Z)+\epsilon)} \\
&= (1-\epsilon)\, 2^{-n(H(X)+H(Y)+H(Z)-H(X,Y,Z)+4\epsilon)},
\end{align*}
where the inequality for the size of $A_\epsilon^{(n)}$ holds for all $n$ large enough (depending on $\epsilon$).

6. Information Spectrum Analysis: In class, we saw how to do typical set decoding and proved that for all rates $R$ smaller than capacity $C = \max_{P_X} I(X; Y)$, there exists a sequence of $(2^{nR}, n)$-codes with vanishing error probabilities. Here, we consider a refined version of this analysis, leading to better bounds on the error probability in decoding. We can also derive a general formula for channel capacity. Let $\mathcal{X}$ and $\mathcal{Y}$ be the input and output alphabets of a channel. These alphabets need not be discrete. Let $P_{Y|X}$ be a channel from $\mathcal{X}$ to $\mathcal{Y}$.

(a) Suppose we use the channel once. Show that there exists a code with $M$ codewords and average error probability $\varepsilon$ satisfying
\[
\varepsilon \leq \Pr\left( \log \frac{P_{Y|X}(Y|X)}{P_Y(Y)} \leq \log M + \gamma \right) + 2^{-\gamma}
\]
for any choice of $\gamma > 0$ and any input distribution $P_X$, where $P_Y(y) = \sum_x P_{Y|X}(y|x)\, P_X(x)$.

Hint: Generate codewords independently according to $P_X$. Instead of using typical set decoding, declare that $\hat{m} \in \{1, \ldots, M\}$ is the transmitted message if it is the unique one satisfying
\[
\log \frac{P_{Y|X}(y|x(\hat{m}))}{P_Y(y)} > \log M + \gamma.
\]
If there is no unique $\hat{m}$ satisfying the above condition, declare an error. The analysis to arrive at the one-shot (finite blocklength) bound above is very similar to typical set decoding. A stronger version of this bound for maximum error probability was shown by Feinstein [Fei54].

As suggested in the hint, we generate $M$ codewords $x(m)$ independently from $P_X$. To send message $m$, transmit codeword $x(m)$. Decode using the rule given above. Assume $m = 1$ is transmitted. We make an error if and only if one or more of the following events occurs:
\begin{align*}
\mathcal{E}_1 &:= \left\{ \log \frac{P_{Y|X}(Y|X(1))}{P_Y(Y)} \leq \log M + \gamma \right\}, \\
\mathcal{E}_2 &:= \left\{ \exists\, m \neq 1 : \log \frac{P_{Y|X}(Y|X(m))}{P_Y(Y)} > \log M + \gamma \right\}.
\end{align*}
The probability of error can be bounded as
\[
\Pr(\mathcal{E}) \leq \Pr(\mathcal{E}_1) + \Pr(\mathcal{E}_2).
\]
Now note that $(X(1), Y) \sim P_X P_{Y|X}$, and so $\Pr(\mathcal{E}_1)$ gives the first term in the bound we have to show.
We simply have to show that $\Pr(\mathcal{E}_2) \leq 2^{-\gamma}$. For this, consider
\begin{align*}
\Pr(\mathcal{E}_2) &= \Pr\left( \exists\, m \neq 1 : \log \frac{P_{Y|X}(Y|X(m))}{P_Y(Y)} > \log M + \gamma \right) \\
&\overset{(a)}{\leq} \sum_{m=2}^{M} \Pr\left( \log \frac{P_{Y|X}(Y|X(m))}{P_Y(Y)} > \log M + \gamma \right) \\
&\overset{(b)}{=} \sum_{m=2}^{M} \sum_{x,y} P_X(x)\, P_Y(y)\, \mathbf{1}\left\{ \log \frac{P_{Y|X}(y|x)}{P_Y(y)} > \log M + \gamma \right\} \\
&\overset{(c)}{\leq} \sum_{m=2}^{M} \sum_{x,y} P_X(x)\, P_{Y|X}(y|x)\, M^{-1} 2^{-\gamma}\, \mathbf{1}\left\{ \log \frac{P_{Y|X}(y|x)}{P_Y(y)} > \log M + \gamma \right\} \\
&\overset{(d)}{\leq} \sum_{m=2}^{M} \sum_{x,y} P_X(x)\, P_{Y|X}(y|x)\, M^{-1} 2^{-\gamma} \\
&\overset{(e)}{\leq} 2^{-\gamma},
\end{align*}
where $(a)$ is due to the union bound, $(b)$ is due to the fact that for $m \neq 1$ the codeword $X(m)$ and the channel output $Y$ are independent, $(c)$ is due to the fact that we are only summing over all $(x, y)$ such that $\log \frac{P_{Y|X}(y|x)}{P_Y(y)} > \log M + \gamma$, i.e., $P_Y(y) < P_{Y|X}(y|x)\, M^{-1} 2^{-\gamma}$, in $(d)$ we drop the indicator, and in $(e)$ we use the fact that $\sum_{x,y} P_X(x) P_{Y|X}(y|x) = 1$ and that there are $M - 1$ terms in the outer sum.

(b) Based on part (a), prove the channel coding theorem for finite $\mathcal{X}$, $\mathcal{Y}$ and memoryless channels.

Hint: Set $P_{X^n}$ above to be the $n$-fold product distribution corresponding to a capacity-achieving input distribution $P_X^* \in \arg\max_{P_X} I(X; Y)$. Set the $\gamma$ above to be $n\gamma$ for some $\gamma > 0$. Set $\log M = n(C - 2\gamma)$. Apply the law of large numbers to the first term to see that there exists a sequence of $(n, 2^{n(C - 2\gamma)})$-codes with vanishing average error probabilities.

Going to the setting of $n$ channel uses, we have that there exists a code with blocklength $n$, $M_n$ codewords and average error probability $\varepsilon_n$ satisfying
\[
\varepsilon_n \leq \Pr\left( \log \frac{P_{Y^n|X^n}(Y^n|X^n)}{P_{Y^n}(Y^n)} \leq \log M_n + n\gamma \right) + 2^{-n\gamma}.
\]
Choose $P_{X^n}$ to be the $n$-fold product distribution corresponding to a capacity-achieving input distribution $P_X^* \in \arg\max_{P_X} I(X; Y)$. Since the channel is a DMC, we have
\begin{align*}
\Pr\left( \log \frac{P_{Y^n|X^n}(Y^n|X^n)}{P_{Y^n}(Y^n)} \leq \log M_n + n\gamma \right)
&= \Pr\left( \sum_{i=1}^{n} \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} \leq \log M_n + n\gamma \right) \\
&= \Pr\left( \sum_{i=1}^{n} \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} \leq nC - 2n\gamma + n\gamma \right) \\
&= \Pr\left( \frac{1}{n} \sum_{i=1}^{n} \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} \leq C - \gamma \right).
\end{align*}
Since
\[
\mathbb{E}\left[ \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} \right] = I(X; Y) = C
\]
for all $i$, we have that the probability above tends to zero by the weak law of large numbers.
Clearly, the second term in the bound, $2^{-\gamma} = 2^{-n\gamma}$, also tends to zero because $\gamma > 0$. So we have demonstrated a sequence of codes for which $\varepsilon_n \to 0$ and the code rate is $C - 2\gamma$, which is arbitrarily close to $C$.

(c) Again consider the setup in (b). Let
\[
V := \mathrm{Var}\left( \log \frac{P_{Y|X}(Y|X)}{P_Y(Y)} \right),
\]
evaluated at the (say, unique) capacity-achieving input distribution. Based on part (a), show using the central limit theorem that there exists a sequence of codes indexed by blocklength $n$, with sizes $M_n$ satisfying
\[
\log M_n = nC + \sqrt{nV}\, \Phi^{-1}(\varepsilon) + o(\sqrt{n}),
\]
such that the average error probability is no larger than $\varepsilon + o(1)$. This exercise demonstrates a cool refinement to the channel coding theorem we have seen. For more information about this class of results (information theory problems with non-vanishing errors), you may refer to my monograph [Tan14]. This result was first shown by Strassen [Str62]. See also Hayashi [Hay09] and Polyanskiy-Poor-Verdú [PPV10].

Now we again use the bound
\[
\varepsilon_n \leq \Pr\left( \log \frac{P_{Y^n|X^n}(Y^n|X^n)}{P_{Y^n}(Y^n)} \leq \log M_n + n\gamma \right) + 2^{-n\gamma},
\]
with $\gamma = \frac{\log n}{n}$, so that the final term is $1/n$. Plug the value of $M_n$ into the probability in the bound above. We have
\begin{align*}
\Pr\left( \log \frac{P_{Y^n|X^n}(Y^n|X^n)}{P_{Y^n}(Y^n)} \leq \log M_n + n\gamma \right)
&= \Pr\left( \sum_{i=1}^{n} \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} \leq nC + \sqrt{nV}\, \Phi^{-1}(\varepsilon) + \log n \right) \\
&= \Pr\left( \frac{1}{\sqrt{nV}} \sum_{i=1}^{n} \left( \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} - C \right) \leq \Phi^{-1}(\varepsilon) + O\left( \frac{\log n}{\sqrt{nV}} \right) \right).
\end{align*}
Now note that for all $i \in [n]$,
\[
\mathbb{E}\left[ \frac{1}{\sqrt{V}} \left( \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} - C \right) \right] = 0, \qquad
\mathrm{Var}\left[ \frac{1}{\sqrt{V}} \left( \log \frac{P_{Y|X}(Y_i|X_i)}{P_Y(Y_i)} - C \right) \right] = 1,
\]
so the normalized random variable in the probability converges to a standard Gaussian by the central limit theorem. Consequently,
\[
\Pr\left( \log \frac{P_{Y^n|X^n}(Y^n|X^n)}{P_{Y^n}(Y^n)} \leq \log M_n + n\gamma \right) \to \varepsilon,
\]
and we are done.

(d) Now consider a general channel $\{ P_{Y^n|X^n} : \mathcal{X}^n \to \mathcal{Y}^n \}_{n \geq 1}$, which is simply a sequence of stochastic maps from $\mathcal{X}^n$ to $\mathcal{Y}^n$. Show that the capacity of this channel is bounded from below as follows:
\[
C \geq \sup_{\mathbf{X}}\, \sup\left\{ a \in \mathbb{R} : \lim_{n \to \infty} \Pr\left( \frac{1}{n} \log \frac{P_{Y^n|X^n}(Y^n|X^n)}{P_{Y^n}(Y^n)} \leq a \right) = 0 \right\},
\]
where the outer supremum is over all sequences of input distributions $\mathbf{X} = \{P_{X^n}\}_{n \geq 1}$.
In fact, this bound is tight; that is, there is a matching upper bound. What is cool is that the lower bound above is a generalization of the notion of convergence in probability. See Verdú-Han's beautiful paper on general formulas [VH94].

The result follows directly from the definition and the bound in part (a).

7. List Decoding for Channel Coding

(a) Fano's inequality for list decoding: Define the error random variable
\[
E = \begin{cases} 1 & W \notin \mathcal{L}(Y), \\ 0 & W \in \mathcal{L}(Y). \end{cases}
\]
Now consider
\[
H(W, E \mid \mathcal{L}(Y)) = H(W \mid E, \mathcal{L}(Y)) + H(E \mid \mathcal{L}(Y)) = H(E \mid W, \mathcal{L}(Y)) + H(W \mid \mathcal{L}(Y)).
\]
Let $P_e := \Pr(W \notin \mathcal{L}(Y))$. Now clearly $H(E \mid W, \mathcal{L}(Y)) = 0$, and $H(E \mid \mathcal{L}(Y)) \leq H(E) = H_b(P_e)$. Next, we examine the term $H(W \mid E, \mathcal{L}(Y))$. We have
\begin{align*}
H(W \mid E, \mathcal{L}(Y)) &= \Pr(E = 0)\, H(W \mid E = 0, \mathcal{L}(Y)) + \Pr(E = 1)\, H(W \mid E = 1, \mathcal{L}(Y)) \\
&\leq (1 - P_e) \log l + P_e \log\bigl( |\mathcal{W}| - l \bigr),
\end{align*}
since if we know that $E = 0$, the number of values that $W$ can take on is no more than $l$, and if $E = 1$, the number of values that $W$ can take on is no more than $|\mathcal{W}| - l$. Putting everything together and upper bounding $H_b(P_e)$ by 1, we have
\[
H(W \mid \mathcal{L}(Y)) \leq 1 + \log l + P_e \log \frac{|\mathcal{W}| - l}{l}.
\]

(b) The minimum $a$ is the capacity $C$.

(c) We have
\[
I(X^n; Y^n) = H(Y^n) - H(Y^n \mid X^n) = H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid X_i) \leq \sum_{i=1}^{n} \bigl[ H(Y_i) - H(Y_i \mid X_i) \bigr] = \sum_{i=1}^{n} I(X_i; Y_i) \leq nC,
\]
and
\[
nR = H(W) = H(W \mid \mathcal{L}(Y^n)) + I(W; \mathcal{L}(Y^n)).
\]
Now from Fano's inequality for list decoding,
\[
H(W \mid \mathcal{L}(Y^n)) \leq 1 + \log 2^{nL} + P_e \log \frac{2^{nR} - 2^{nL}}{2^{nL}} = nL + n\epsilon_n,
\]
where $\epsilon_n \to 0$ as $n \to \infty$ (since $P_e \to 0$ by assumption).
Furthermore,
\[
I(W; \mathcal{L}(Y^n)) \leq I(X^n; Y^n) \leq nC,
\]
where the first inequality follows from data processing; cf. $W - X^n - Y^n - \mathcal{L}(Y^n)$ forms a Markov chain. So we have
\[
nR \leq nL + n\epsilon_n + nC,
\]
which upon dividing by $n$ and taking $\limsup$ on both sides yields $R \leq L + C =: R^+$.

8. Capacity Calculation for Symmetric Channels

The capacity is $\log_2 m - h(1/4)$. Consider
\[
I(X; Y) = H(Y) - H(Y \mid X) = H(Y) - h(1/4).
\]
Note that $H(Y)$ is maximized at the value $\log_2 m$, and this is achievable using the uniform input distribution $p(x) = 1/m$ for all $x \in \{0, 1, \ldots, m-1\}$.

References

[Fei54] A. Feinstein. A new basic theorem of information theory. IEEE Transactions on Information Theory, 4(4):2-22, 1954.

[Hay09] M. Hayashi. Information spectrum approach to second-order coding rate in channel coding. IEEE Transactions on Information Theory, 55(11):4947-4966, 2009.

[PPV10] Y. Polyanskiy, H. V. Poor, and S. Verdú. Channel coding rate in the finite blocklength regime. IEEE Transactions on Information Theory, 56(5):2307-2359, 2010.

[Str62] V. Strassen. Asymptotische Abschätzungen in Shannons Informationstheorie. In Trans. Third Prague Conf. Inf. Theory, pages 689-723, Prague, 1962. http://www.math.cornell.edu/~pmlut/strassen.pdf.

[Tan14] V. Y. F. Tan. Asymptotic estimates in information theory with non-vanishing error probabilities. Foundations and Trends in Communications and Information Theory, 11(1-2):1-184, 2014.

[VH94] S. Verdú and T. S. Han. A general formula for channel capacity. IEEE Transactions on Information Theory, 40(4):1147-1157, 1994.
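
As a supplementary numerical check of Problem 8 (appended here, not part of the original solution), the value $\log_2 m - h(1/4)$ can be recovered by running the Blahut-Arimoto algorithm. Both the channel form $Y = X + Z \bmod m$ with $\Pr(Z = 1) = 1/4$ and the alphabet size $m = 4$ are my own illustrative assumptions, since the problem statement is not reproduced above.

import numpy as np

def mutual_information_bits(p, W):
    """I(X;Y) in bits for input pmf p and channel matrix W[x, y] = W(y|x)."""
    py = p @ W
    with np.errstate(divide="ignore"):
        ratio = np.where(W > 0, np.log2(W / py), 0.0)
    return float((p[:, None] * W * ratio).sum())

def blahut_arimoto(W, n_iter=300):
    """Blahut-Arimoto iterations for the capacity of a DMC with transition matrix W[x, y]."""
    m = W.shape[0]
    p = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        py = p @ W
        with np.errstate(divide="ignore"):
            # D( W(.|x) || P_Y ) in nats, for each input x
            d = (W * np.where(W > 0, np.log(W / py), 0.0)).sum(axis=1)
        p *= np.exp(d)
        p /= p.sum()
    return mutual_information_bits(p, W), p

m = 4                                            # assumed input/output alphabet size
W = np.zeros((m, m))
for x in range(m):                               # assumed channel: Y = X + Z mod m, Pr(Z = 1) = 1/4
    W[x, x] = 0.75
    W[x, (x + 1) % m] = 0.25

h = lambda t: -t * np.log2(t) - (1 - t) * np.log2(1 - t)
C, p_star = blahut_arimoto(W)
print(C, np.log2(m) - h(0.25))                   # both approximately 1.1887 bits
print(p_star)                                    # approximately uniform, as claimed in the solution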