
LECTURE 10

Last time: Joint AEP; coding theorem.

Lecture outline: Error exponents; strong coding theorem.

Reading: Gallager, Chapter 5.

Review

Joint AEP: $A_\epsilon^{(n)}(X) \times A_\epsilon^{(n)}(Y)$ vs. $A_\epsilon^{(n)}(X, Y)$: roughly $2^{nH(X)} \cdot 2^{nH(Y)} \gg 2^{nH(X,Y)}$. Passing $x^n$ through the channel to obtain $y^n$, the pair $(x^n, y^n)$ is jointly typical with high probability. For another independently chosen $\tilde{x}^n$, the pair $(\tilde{x}^n, y^n)$ is jointly typical with probability about $2^{-nI(X;Y)}$.

Coding theorem: random coding, joint typicality decoding; converse proved by using Fano's inequality.

A possible confusion: i.i.d. input distribution vs. transmitting independent symbols.

Remaining Topics

Can we get rid of the random coding? Instead, we will take a closer look at random coding. Joint typicality decoding vs. maximum likelihood decoding. Example: a binary source sequence passing through a BSC, with finite codeword length $n$.

Our plan for the next two lectures: maximum likelihood decoding; upper bound on the error probability; random coding error exponent; the binary symmetric channel as an example.

Maximum Likelihood Decoding: Notation

Message $W$ uniformly distributed in $\{1, 2, \ldots, M\}$, with $M = 2^{nR}$. The encoder transmits codeword $x^n(m)$ if the incoming message is $W = m$. The receiver receives $y^n$ and finds the most likely transmitted codeword:
$$\hat{W} = \arg\max_m P_{Y^n|X^n}(y^n \mid x^n(m)).$$
Define $\mathcal{Y}_m$ as the set of $y^n$'s that decode to $m$. The probability of error conditioned on $W = m$ being transmitted is
$$P_{e,m} = \sum_{y^n \in \mathcal{Y}_m^c} P_{Y^n|X^n}(y^n \mid x^n(m)).$$
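To make the notation concrete, here is a minimal brute-force ML decoder for a small discrete memoryless channel, computing $P_{e,m}$ exactly by enumerating all $y^n$; the channel matrix, codebook, and block length are illustrative choices, not taken from the lecture.

```python
import itertools
import numpy as np

# Illustrative DMC: a BSC with crossover probability 0.1; row x is P_{Y|X}(. | x).
eps = 0.1
P_Y_given_X = np.array([[1 - eps, eps],
                        [eps, 1 - eps]])

# Illustrative codebook: M = 4 codewords x^n(m) of length n = 5.
codebook = np.array([[0, 0, 0, 0, 0],
                     [1, 1, 1, 1, 1],
                     [0, 1, 0, 1, 0],
                     [1, 0, 1, 0, 1]])
n = codebook.shape[1]

def likelihood(yn, xn):
    """P_{Y^n|X^n}(y^n | x^n): product of per-letter probabilities (memoryless channel)."""
    return np.prod(P_Y_given_X[xn, yn])

def ml_decode(yn):
    """hat{W} = argmax_m P_{Y^n|X^n}(y^n | x^n(m))."""
    return max(range(len(codebook)), key=lambda m: likelihood(yn, codebook[m]))

def error_prob(m):
    """Exact P_{e,m}: sum of P(y^n | x^n(m)) over every y^n that does not decode to m."""
    return sum(likelihood(np.array(yn), codebook[m])
               for yn in itertools.product([0, 1], repeat=n)
               if ml_decode(np.array(yn)) != m)

for m in range(len(codebook)):
    print(f"P_e,{m} = {error_prob(m):.4f}")
```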

Pairwise Error Probability

Consider the case $M = 2$:
$$P_{e,1} = \sum_{y^n \in \mathcal{Y}_1^c} P(y^n \mid x^n(1)).$$
We really hate the $\mathcal{Y}_1^c$, since we have to figure out the decision region. Can we sum over the entire set $\mathcal{Y}^n$ without dealing with the discontinuity?

Consider any $y^n \in \mathcal{Y}_1^c$; by definition $P(y^n \mid x^n(2)) \ge P(y^n \mid x^n(1))$, so we can bound
$$P_{e,1} = \sum_{y^n \in \mathcal{Y}_1^c} P(y^n \mid x^n(1))
\le \sum_{y^n \in \mathcal{Y}_1^c} P(y^n \mid x^n(1)) \left[\frac{P(y^n \mid x^n(2))}{P(y^n \mid x^n(1))}\right]^{s}
= \sum_{y^n \in \mathcal{Y}_1^c} P(y^n \mid x^n(1))^{1-s}\, P(y^n \mid x^n(2))^{s}
\le \sum_{y^n} P(y^n \mid x^n(1))^{1-s}\, P(y^n \mid x^n(2))^{s}$$
for any $s \in (0, 1)$.
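A quick numerical sanity check of this bound (a sketch with arbitrary choices of $\epsilon$, block length, and codeword pair): compare the exact pairwise ML error probability with $\sum_{y^n} P(y^n \mid x^n(1))^{1-s}\, P(y^n \mid x^n(2))^{s}$ for a few values of $s$.

```python
import itertools
import numpy as np

eps = 0.1                       # illustrative BSC crossover probability
P = np.array([[1 - eps, eps],   # P_{Y|X}(y | x), rows indexed by x
              [eps, 1 - eps]])

n = 6                                  # illustrative block length
x1 = np.zeros(n, dtype=int)            # x^n(1): the all-zero word
x2 = np.array([1, 1, 1, 0, 0, 1])      # x^n(2): an arbitrary competing word

def p(yn, xn):
    return np.prod(P[xn, yn])

# Exact pairwise ML error: sum P(y^n | x^n(1)) over y^n where x^n(2) is at least as likely.
exact = sum(p(np.array(y), x1)
            for y in itertools.product([0, 1], repeat=n)
            if p(np.array(y), x2) >= p(np.array(y), x1))

# The s-parameterized bound, summed over ALL y^n (no decision region needed).
for s in (0.25, 0.5, 0.75):
    bound = sum(p(np.array(y), x1) ** (1 - s) * p(np.array(y), x2) ** s
                for y in itertools.product([0, 1], repeat=n))
    print(f"s = {s:.2f}: bound = {bound:.4f}   (exact P_e,1 = {exact:.4f})")
```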

Random Codewords

Choose the codewords $x^n(1)$ and $x^n(2)$ i.i.d. from the distribution $P_X$ (or, equivalently, $P_{X^n}$). Then
$$\bar{P}_{e,1} = \sum_{x^n(1)} \sum_{y^n} P_{X^n}(x^n(1))\, P(y^n \mid x^n(1))\, P(\text{error} \mid W = 1, x^n(1), y^n)$$
and
$$P(\text{error} \mid W = 1, x^n(1), y^n) \le \sum_{x^n(2)} P_{X^n}(x^n(2)) \left[\frac{P(y^n \mid x^n(2))}{P(y^n \mid x^n(1))}\right]^{s}.$$
In general, this is a good bound for both fixed and random codewords. Random coding allows for generalization to many codewords. The upper bound allows us to compute the error probability without dealing with specific decision regions.
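A small Monte Carlo sketch of the random-codeword version on a BSC (all parameters are illustrative): draw both codewords i.i.d. uniform, pass $x^n(1)$ through the channel, and compare the empirical average pairwise error with the averaged $s = 1/2$ bound, which for uniform inputs on a BSC works out to $\left(\tfrac12 + \sqrt{\epsilon(1-\epsilon)}\right)^n$.

```python
import numpy as np

rng = np.random.default_rng(0)
eps, n, trials = 0.1, 20, 200_000      # illustrative parameters

def x2_at_least_as_likely(x1, x2, y):
    """On a BSC with eps < 1/2, likelihood is decreasing in Hamming distance."""
    return np.count_nonzero(x2 != y) <= np.count_nonzero(x1 != y)

errors = 0
for _ in range(trials):
    x1 = rng.integers(0, 2, n)                  # codewords drawn i.i.d. Bernoulli(1/2)
    x2 = rng.integers(0, 2, n)
    noise = (rng.random(n) < eps).astype(int)
    y = x1 ^ noise                              # pass x^n(1) through the BSC
    errors += x2_at_least_as_likely(x1, x2, y)

empirical = errors / trials
# Averaged s = 1/2 bound: per letter,
# E[ sum_y sqrt(P(y|X1) P(y|X2)) ] = 1/2 + sqrt(eps (1 - eps)) for uniform inputs on a BSC.
bound = (0.5 + np.sqrt(eps * (1 - eps))) ** n
print(f"empirical average pairwise error = {empirical:.3e}")
print(f"averaged s = 1/2 bound           = {bound:.3e}")
```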

Example: Binary Source / BSC

Let $x^n(1)$ and $x^n(2)$ be the all-0 and the all-1 words. Using the DMC property, we have for $m = 1, 2$ and any $s \in (0,1)$,
$$P_{e,m} \le \sum_{y_1}\sum_{y_2}\cdots\sum_{y_n} \prod_{i=1}^{n} P(y_i \mid x_{1,i})^{1-s}\, P(y_i \mid x_{2,i})^{s}
= \prod_{i=1}^{n} \sum_{y_i} P(y_i \mid x_{1,i})^{1-s}\, P(y_i \mid x_{2,i})^{s}.$$

Example: Binary Source / BSC (continued)

Plug in the specific choice of codewords:
$$\sum_{y_i} P(y_i \mid x_{1,i})^{1-s}\, P(y_i \mid x_{2,i})^{s} = \epsilon^{1-s}(1-\epsilon)^{s} + \epsilon^{s}(1-\epsilon)^{1-s}.$$
Optimizing gives $s = 1/2$, and
$$P_{e,m} \le \left[2\sqrt{\epsilon(1-\epsilon)}\right]^{n}.$$

Alternative approach: conditioned on the all-0 word being transmitted, an error occurs when there are more than $n/2$ 1's, so
$$P_{e,1} \le 2^{-n D\left(\frac12 \,\|\, \epsilon\right)} = 2^{\,n\left[\log 2 + \frac12\log\epsilon + \frac12\log(1-\epsilon)\right]}.$$
The upper bound is quite tight! A similar development can be done with random codewords. We are now one step away from the case with many codewords.
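To see how tight this is, the sketch below compares $[2\sqrt{\epsilon(1-\epsilon)}]^n$ with the exact two-codeword error probability, i.e. the chance that an i.i.d. Bernoulli($\epsilon$) noise pattern flips at least half of the $n$ positions; $\epsilon$ and the block lengths are illustrative.

```python
from math import comb, sqrt

eps = 0.1   # illustrative crossover probability

def exact_pairwise_error(n, eps):
    """P(at least half of the n positions are flipped): the ML error region for all-0 vs all-1,
    with ties counted as errors (consistent with the >= convention above)."""
    return sum(comb(n, k) * eps**k * (1 - eps)**(n - k) for k in range((n + 1) // 2, n + 1))

for n in (10, 20, 40, 80):
    bound = (2 * sqrt(eps * (1 - eps))) ** n
    print(f"n = {n:3d}: exact = {exact_pairwise_error(n, eps):.3e}, bound = {bound:.3e}")
```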

BSC with Many Codewords

Can we generalize this to many codewords by using the union bound? Assume there are $2^{nR}$ codewords; the union bound gives
$$P_{e,m} \le \sum_{m' \ne m} P(m \to m') \le 2^{nR}\, 2^{-n D\left(\frac12 \,\|\, \epsilon\right)}.$$
The error probability goes to 0 if
$$R - D\!\left(\tfrac12 \,\middle\|\, \epsilon\right) < 0.$$
In that case the probability of error decays exponentially with $n$.
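As a quick numerical illustration (with an arbitrary $\epsilon = 0.1$), the sketch below evaluates the union-bound exponent $D(\tfrac12\|\epsilon) - R$ at a few rates:

```python
from math import log2

eps = 0.1                                                          # illustrative BSC
D_half = 0.5 * log2(0.5 / eps) + 0.5 * log2(0.5 / (1 - eps))       # D(1/2 || eps) in bits

for R in (0.2, 0.5, 0.7, 0.8):
    exponent = D_half - R                # union bound ~ 2^{-n (D(1/2||eps) - R)}
    verdict = "decays exponentially" if exponent > 0 else "does not decay"
    print(f"R = {R:.1f}: exponent D(1/2||eps) - R = {exponent:+.3f}  -> union bound {verdict}")
```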

The Error Exponent

Rewrite the union bound as
$$P_{e,m,\text{union}} \le 2^{-n\left(D\left(\frac12 \,\|\, \epsilon\right) - R\right)}.$$
The error exponent is $E_u(R) = D\!\left(\tfrac12 \,\middle\|\, \epsilon\right) - R$. As long as $R$ is small enough that $E_u(R) > 0$, the error probability decays exponentially with $n$.

Question: does this give the capacity? Too bad, the maximum data rate in this bound is not the capacity. Comparing the two thresholds,
$$1 - H(\epsilon) = \log 2 + \epsilon\log\epsilon + (1-\epsilon)\log(1-\epsilon),
\qquad
D\!\left(\tfrac12 \,\middle\|\, \epsilon\right) = -\left[\log 2 + \tfrac12\log\epsilon + \tfrac12\log(1-\epsilon)\right],$$
we see that they do not coincide for any $\epsilon \in (0, \tfrac12)$, so this simple union-bound argument does not settle the capacity.
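For concreteness, here is a small numerical sketch of that comparison (the $\epsilon$ values are arbitrary):

```python
from math import log2

def H(p):
    """Binary entropy in bits."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

def D(p, q):
    """Binary KL divergence D(p || q) in bits."""
    return p * log2(p / q) + (1 - p) * log2((1 - p) / (1 - q))

for eps in (0.05, 0.1, 0.2, 0.3):
    print(f"eps = {eps:.2f}: D(1/2||eps) = {D(0.5, eps):.3f} bits,  1 - H(eps) = {1 - H(eps):.3f} bits")
```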

A Better Way than the Union Bound

Lemma. For any $\rho \in (0, 1]$,
$$P\Big(\bigcup_m A_m\Big) \le \Big[\sum_m P(A_m)\Big]^{\rho}.$$
Proof: $P\big(\bigcup_m A_m\big) \le \min\big\{\sum_m P(A_m),\, 1\big\}$, and $\min\{x, 1\} \le x^{\rho}$ for any $x \ge 0$ and $\rho \in (0, 1]$.

Idea: use $\rho$ to compensate for serious overlap, at a cost.

Now define the event $A_{m'} = \{m \to m' \mid W = m,\, x^n(m),\, y^n\}$, and we have just computed
$$P(A_{m'}) \le \sum_{x^n(m')} P_{X^n}(x^n(m')) \left[\frac{P(y^n \mid x^n(m'))}{P(y^n \mid x^n(m))}\right]^{s}.$$

Upper Bound for the Error Probability

$$P(\text{error} \mid W = m, x^n(m), y^n) = P\Big(\bigcup_{m' \ne m} A_{m'}\Big)
\le (M-1)^{\rho} \left[\sum_{x^n(m')} P_{X^n}(x^n(m')) \left[\frac{P(y^n \mid x^n(m'))}{P(y^n \mid x^n(m))}\right]^{s}\right]^{\rho}.$$
Averaging over $x^n(m)$ and $y^n$,
$$\bar{P}_{e,m} \le \sum_{y^n} \sum_{x^n(m)} P_{X^n}(x^n(m))\, P(y^n \mid x^n(m))^{1 - s\rho}\, (M-1)^{\rho} \left[\sum_{x^n(m')} P_{X^n}(x^n(m'))\, P(y^n \mid x^n(m'))^{s}\right]^{\rho}$$
for any $s \in (0, 1)$ and $\rho \in (0, 1]$. Take $s = 1/(1+\rho)$.

Upper Bound

Theorem.
$$\bar{P}_{e,m} \le (M-1)^{\rho} \sum_{y^n} \left[\sum_{x^n} P_{X^n}(x^n)\, P(y^n \mid x^n)^{\frac{1}{1+\rho}}\right]^{1+\rho}.$$
Corollary (apply the DMC property).
$$\bar{P}_{e,m} \le (M-1)^{\rho} \prod_{i=1}^{n} \sum_{y_i} \left[\sum_{x_i} P_X(x_i)\, P_{Y|X}(y_i \mid x_i)^{\frac{1}{1+\rho}}\right]^{1+\rho}
= (M-1)^{\rho} \left(\sum_{y} \left[\sum_{x} P_X(x)\, P_{Y|X}(y \mid x)^{\frac{1}{1+\rho}}\right]^{1+\rho}\right)^{n}.$$

Random Coding Error Exponent

For a fixed input distribution $P_X$, define
$$E_0(\rho) = -\log \sum_{y} \left[\sum_{x} P_X(x)\, P_{Y|X}(y \mid x)^{\frac{1}{1+\rho}}\right]^{1+\rho}.$$
Then the average probability of error satisfies
$$\bar{P}_{e,m} \le 2^{-n\left(E_0(\rho) - \rho R\right)}.$$
As long as the random coding error exponent
$$E_r(R) = \max_{\rho \in [0,1]} \big[E_0(\rho) - \rho R\big]$$
is positive, the error probability can be driven to 0 as $n \to \infty$.
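The sketch below evaluates $E_0(\rho)$ for a BSC with a uniform input (implementing the single-letter expression from the corollary above) and approximates $E_r(R)$ by a grid search over $\rho \in [0,1]$; the crossover probability and the rates are illustrative choices.

```python
import numpy as np

eps = 0.1                                     # illustrative BSC crossover probability
Px = np.array([0.5, 0.5])                     # fixed (uniform) input distribution
P = np.array([[1 - eps, eps],                 # P_{Y|X}(y | x), rows indexed by x
              [eps, 1 - eps]])

def E0(rho):
    """E_0(rho) = -log2 sum_y [ sum_x P_X(x) P_{Y|X}(y|x)^(1/(1+rho)) ]^(1+rho)."""
    inner = (Px[:, None] * P ** (1.0 / (1.0 + rho))).sum(axis=0)   # one term per output y
    return -np.log2((inner ** (1.0 + rho)).sum())

def Er(R, grid=np.linspace(0.0, 1.0, 1001)):
    """Random coding exponent E_r(R) = max over rho in [0,1] of E_0(rho) - rho R (grid search)."""
    return max(E0(rho) - rho * R for rho in grid)

C = 1 + eps * np.log2(eps) + (1 - eps) * np.log2(1 - eps)          # BSC capacity in bits/use
for R in (0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"R = {R:.1f}: E_r(R) = {Er(R):.5f}   (capacity = {C:.4f})")
```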

The Behavior of the Error Exponent

Facts: $E_0(\rho) \ge 0$, with equality only at $\rho = 0$; $\partial E_0(\rho)/\partial\rho \ge 0$; $\partial E_0(\rho)/\partial\rho \le I(X;Y)$, with equality at $\rho = 0$; $E_0(\rho)$ is concave in $\rho$.

Consider
$$E_r(R) = \max_{\rho \in [0,1]} \big[E_0(\rho) - \rho R\big].$$
Ignoring the constraint, the maximum occurs at the $\rho$ for which $R = \partial E_0(\rho)/\partial\rho$. The maximizing $\rho$ lies in $[0, 1]$ if
$$\left.\frac{\partial E_0(\rho)}{\partial\rho}\right|_{\rho = 1} \le R \le \left.\frac{\partial E_0(\rho)}{\partial\rho}\right|_{\rho = 0} = I(X;Y).$$
For any $R < I(X;Y)$, we get a positive error exponent, and the error probability can be driven to 0 as $n \to \infty$.
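A quick numerical check of these facts for the same illustrative BSC (uniform input, $\epsilon = 0.1$), using a grid over $\rho$ and a finite difference for the slope at $\rho = 0$; a sanity check, not a proof:

```python
import numpy as np

eps = 0.1                                     # same illustrative BSC, uniform input
Px = np.array([0.5, 0.5])
P = np.array([[1 - eps, eps],
              [eps, 1 - eps]])

def E0(rho):
    inner = (Px[:, None] * P ** (1.0 / (1.0 + rho))).sum(axis=0)
    return -np.log2((inner ** (1.0 + rho)).sum())

rhos = np.linspace(0.0, 1.0, 101)
vals = np.array([E0(r) for r in rhos])

I_XY = 1 + eps * np.log2(eps) + (1 - eps) * np.log2(1 - eps)   # I(X;Y) for uniform input
slope_at_0 = (E0(1e-6) - E0(0.0)) / 1e-6                        # forward difference at rho = 0

print("E_0(rho) >= 0 on the grid:", bool(np.all(vals >= -1e-12)))
print("E_0(rho) nondecreasing:   ", bool(np.all(np.diff(vals) >= -1e-12)))
print("E_0(rho) concave:         ", bool(np.all(np.diff(vals, 2) <= 1e-12)))
print(f"dE_0/drho at rho=0 = {slope_at_0:.4f}  vs  I(X;Y) = {I_XY:.4f}")
```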

Summary

We have proved the coding theorem in another way: for $R < C$, the error probability decays exponentially with $n$.

Remaining Questions

Is this a good bound? We have chosen random codes and computed the average performance; is there a specific code that can do better than this? The two pieces of the error exponent curve remain mysterious.