Second-Order Asymptotics in Information Theory

Vincent Y. F. Tan (vtan@nus.edu.sg)
Dept. of ECE and Dept. of Mathematics, National University of Singapore (NUS)
National Taiwan University, November 2015
Vincent Tan (NUS) Second-Order Asymptotics NTU 1 / 109

Outline
1 Motivation, Background and History
2 Binary Hypothesis Testing
3 Fixed-Length Lossless Source Coding
4 Channel Coding
5 Slepian-Wolf Coding
6 Summary and Open Problems

Transmission of Information

[Figure: Shannon's Figure 1, with blocks INFORMATION SOURCE, TRANSMITTER, RECEIVER, DESTINATION and NOISE SOURCE, and labels MESSAGE, SIGNAL, RECEIVED SIGNAL, MESSAGE.]

Shannon abstracted away the meaning and semantics of information: treat all data equally.
Shannon's bits act as a universal currency, a crucial abstraction for modern communication and computing systems.
He also relaxed computation and delay constraints to discover a fundamental limit, capacity, providing a goal-post to work toward.
Information theory: finding fundamental limits for reliable information transmission.
Channel coding: concerned with the maximum rate of communication in bits/channel use.

Channel Coding (One-Shot)

[Diagram: $M \to f \to X \to W \to Y \to \varphi \to \hat{M}$]

A code is a triple $\mathcal{C} = \{\mathcal{M}, f, \varphi\}$, where $\mathcal{M}$ is the message set.
The average error probability $p_{\mathrm{err}}(\mathcal{C})$ is
\[ p_{\mathrm{err}}(\mathcal{C}) := \Pr[\hat{M} \neq M], \]
where $M$ is uniform on $\mathcal{M}$.
A non-asymptotic fundamental limit can be defined as
\[ M^*(W, \varepsilon) := \sup\{ m \in \mathbb{N} : \exists\, \mathcal{C} \text{ s.t. } m = |\mathcal{M}|,\ p_{\mathrm{err}}(\mathcal{C}) \le \varepsilon \}. \]
A central problem in information theory is to characterize $M^*(W, \varepsilon)$.

Channel Coding (n-shot)

[Diagram: $M \to f \to X^n \to W^n \to Y^n \to \varphi \to \hat{M}$]

Consider $n$ independent uses of a discrete memoryless channel (DMC) $W^n$.
For vectors $x^n = (x_1, \ldots, x_n) \in \mathcal{X}^n$ and $y^n := (y_1, \ldots, y_n) \in \mathcal{Y}^n$, the channel law is
\[ W^n(y^n \mid x^n) = \prod_{i=1}^n W(y_i \mid x_i). \]
The non-asymptotic fundamental limit for $n$ uses of $W$ is $M^*(W^n, \varepsilon)$.
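As a small illustration of the product channel law, the sketch below evaluates $W^n(y^n \mid x^n)$ for a binary symmetric channel. The choice of channel, the crossover probability 0.1 and the function names are assumptions made for this example and do not come from the slides.

```python
import math

def bsc(p):
    """Transition matrix W[x][y] of a binary symmetric channel (illustrative choice)."""
    return [[1 - p, p], [p, 1 - p]]

def channel_law(W, x_seq, y_seq):
    """W^n(y^n | x^n) = prod_i W(y_i | x_i) for a memoryless channel."""
    if len(x_seq) != len(y_seq):
        raise ValueError("x^n and y^n must have the same length")
    return math.prod(W[x][y] for x, y in zip(x_seq, y_seq))

W = bsc(0.1)
# One flipped symbol out of four, so the value is 0.9^3 * 0.1.
print(channel_law(W, [0, 1, 1, 0], [0, 1, 0, 0]))
```

Because the channel is memoryless, the probabilities of all output sequences for a fixed input still sum to one, which is a quick sanity check on any transition matrix passed in.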

Background: Shannon's Channel Coding Theorem

Shannon's (1948) noisy channel coding theorem and Wolfowitz's (1959) strong converse state the following.

Theorem (Shannon (1948), Wolfowitz (1959))
\[ \lim_{n\to\infty} \frac{1}{n} \log M^*(W^n, \varepsilon) = C, \quad \forall\, \varepsilon \in (0,1), \]
where
\[ C = \max_P I(P, W) \]
is the capacity of the DMC.

Background: Shannon's Channel Coding Theorem

\[ \lim_{n\to\infty} \frac{1}{n} \log M^*(W^n, \varepsilon) = C \ \text{bits/channel use} \]
The channel coding theorem for DMCs is independent of $\varepsilon \in (0,1)$.

[Figure: $\lim_{n\to\infty} p_{\mathrm{err}}(\mathcal{C}_n)$ versus the rate $R$, with levels 0 and 1 on the vertical axis and $C$ marked on the rate axis.]

Phase transition at capacity.

Background: Second-Order Coding Rates

What happens at capacity?
More precisely, what happens when
\[ \log M_n \approx nC + L\sqrt{n} \]
for some $L \in \mathbb{R}$?
Here $L$ is known as the second-order coding rate of the code.
Note that $L$ can be negative (cf. Hayashi (2008), Hayashi (2009)).

Background: Second-Order Coding Rates

Assume the rate of the code satisfies
\[ \frac{1}{n} \log M_n = C + \frac{L}{\sqrt{n}} + o\Big(\frac{1}{\sqrt{n}}\Big). \]

[Figure: $\lim_{n\to\infty} p_{\mathrm{err}}(\mathcal{C}_n)$ versus $L$, with levels 0, 0.5 and 1 on the vertical axis; the curve is labelled $p_{\mathrm{err}}(\mathcal{C}_n) = \Phi(L/\sqrt{V}) + o(1)$.]

For an error probability $\varepsilon$, the optimum second-order coding rate is
\[ L^*(\varepsilon) := \sqrt{V}\, \Phi^{-1}(\varepsilon). \]
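The relation between the second-order rate $L$ and the limiting error probability can be tabulated directly. The following is a minimal sketch; the value of $V$ is a hypothetical placeholder chosen only for illustration, and the function names are not from the slides.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Gaussian; its cdf is Phi

def limiting_error(L, V):
    """Limiting error probability Phi(L / sqrt(V)) at second-order rate L."""
    return _STD_NORMAL.cdf(L / sqrt(V))

def optimal_second_order_rate(eps, V):
    """L*(eps) = sqrt(V) * Phi^{-1}(eps)."""
    return sqrt(V) * _STD_NORMAL.inv_cdf(eps)

V = 0.5  # hypothetical dispersion value, for illustration only
for eps in (0.01, 0.1, 0.5, 0.9):
    L = optimal_second_order_rate(eps, V)
    print(f"eps={eps}: L*={L:+.4f}, Phi(L*/sqrt(V))={limiting_error(L, V):.4f}")
```

For $\varepsilon < 1/2$ the optimum $L^*(\varepsilon)$ is negative, which matches the remark that the second-order coding rate can be negative.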

Error Exponents vs Normal Approximation

For error exponent analysis, we fix the rate $R < C$ and study
\[ \varepsilon^*(W^n, 2^{nR}) = \min\{ \varepsilon : \exists \text{ a } (2^{nR}, \varepsilon)\text{-code for } W^n \}. \]
Most of the time, $\varepsilon^*(W^n, 2^{nR}) \approx \exp(-nE(R))$.
For normal approximation or second-order analysis, we fix the error probability $\varepsilon \in (0,1)$ and seek
\[ R^*(W^n, \varepsilon) = \frac{\log M^*(W^n, \varepsilon)}{n} \approx C + \sqrt{\frac{V}{n}}\, \Phi^{-1}(\varepsilon). \]
Some form of duality in the analyses...
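To give a feel for the normal approximation, the sketch below evaluates $C + \sqrt{V/n}\,\Phi^{-1}(\varepsilon)$ for a binary symmetric channel. The BSC, its crossover probability and target error are assumptions for this example; the closed forms used for the BSC capacity and dispersion are the standard ones and are not stated on the slides. Lower-order terms are dropped.

```python
from math import log2, sqrt
from statistics import NormalDist

def bsc_capacity(p):
    """Capacity 1 - h(p) of a BSC in bits per channel use (standard formula)."""
    return 1 + p * log2(p) + (1 - p) * log2(1 - p)

def bsc_dispersion(p):
    """Dispersion p(1-p) * log2((1-p)/p)^2 of a BSC in bits^2 (standard formula)."""
    return p * (1 - p) * log2((1 - p) / p) ** 2

def normal_approx_rate(C, V, n, eps):
    """C + sqrt(V/n) * Phi^{-1}(eps), ignoring lower-order terms."""
    return C + sqrt(V / n) * NormalDist().inv_cdf(eps)

p, eps = 0.11, 1e-3  # assumed crossover probability and target error
C, V = bsc_capacity(p), bsc_dispersion(p)
for n in (100, 1000, 10000):
    print(n, round(normal_approx_rate(C, V, n, eps), 4), "vs capacity", round(C, 4))
```

With a fixed $\varepsilon < 1/2$, the approximate rate sits below capacity and approaches it at speed $1/\sqrt{n}$.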

Agenda: Part I

Agenda for today's tutorial: point-to-point communication
1 Most results in point-to-point information theory can be derived by understanding the fundamental limits of binary hypothesis testing (Strassen (1962))
2 Lossless source coding is an easy corollary of binary hypothesis testing (Strassen (1962))
3 Prove the channel coding dispersion for DMCs
  1 Strassen (1962)
  2 Hayashi (2009)
  3 Polyanskiy-Poor-Verdú (2010)
  4 Tomamichel-Tan (2013)

Agenda: Part II

An extension to a multiterminal (network) setting: the Slepian-Wolf problem (Tan-Kosut (2014), Nomura-Han (2015)).
I will be talking about a subset of the material from my monograph:
V. Y. F. Tan, "Asymptotic expansions in information theory with non-vanishing error probabilities," Now Publishers, Foundations and Trends in Communications and Information Theory.

Setup of the Binary Hypothesis Testing Problem

In binary hypothesis testing, we are concerned with the problem
\[ H_0 : Z \sim P, \qquad H_1 : Z \sim Q, \]
where $P, Q \in \mathcal{P}(\mathcal{Z})$ are distributions on the same space $\mathcal{Z}$.
The alphabet $\mathcal{Z}$ may be assumed to be finite.
Design a test $\delta : \mathcal{Z} \to \{0, 1\}$ that outputs 0 if $Z \sim P$ and 1 otherwise.

Various Error Probabilities

For a given test $\delta : \mathcal{Z} \to \{0, 1\}$, we may define:
Probability of false alarm
\[ P_{\mathrm{FA}}(\delta) := \sum_z \delta(z) P(z) = \mathbb{E}_P[\delta(Z)] \]
Probability of missed detection
\[ P_{\mathrm{MD}}(\delta) := \sum_z (1 - \delta(z)) Q(z) = \mathbb{E}_Q[1 - \delta(Z)] \]
Holy grail: design a test $\delta$ such that $P_{\mathrm{FA}}(\delta) \approx 0$ while $P_{\mathrm{D}}(\delta) = 1 - P_{\mathrm{MD}}(\delta) \approx 1$, but this is impossible most of the time.

A Measure of the Performance of a Test δ

For given distributions $P$ and $Q$ and an $\varepsilon \in (0,1)$, we may define
\[ \beta_{1-\varepsilon}(P, Q) := \inf_{\delta : \mathcal{Z} \to \{0,1\}} \{ P_{\mathrm{MD}}(\delta) : P_{\mathrm{FA}}(\delta) \le \varepsilon \}. \]
This is the same as
\[ \beta_{1-\varepsilon}(P, Q) := \inf_{A \subset \mathcal{Z} : P(A) \le \varepsilon} Q(A^c), \]
where $A$ represents the acceptance region for $H_1$, i.e., $\delta(z) = 1$ iff $z \in A$.
The larger the tolerance $\varepsilon$, the smaller $\beta_{1-\varepsilon}$.
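For a small alphabet, the set form of $\beta_{1-\varepsilon}(P, Q)$ can be evaluated by brute force over all regions $A$. The sketch below does this for a three-symbol example; the distributions and the function name are illustrative assumptions, and the exhaustive search is only practical for tiny alphabets.

```python
from itertools import combinations

def beta(P, Q, eps):
    """beta_{1-eps}(P, Q): minimum of Q(A^c) over sets A with P(A) <= eps.

    A is the region where the test outputs 1 (declares H1). All subsets are
    enumerated, so this is only suitable for small alphabets.
    """
    n = len(P)
    best = 1.0  # A = empty set is always feasible: P(A) = 0 and Q(A^c) = 1
    for size in range(1, n + 1):
        for A in combinations(range(n), size):
            if sum(P[z] for z in A) <= eps:
                best = min(best, 1.0 - sum(Q[z] for z in A))
    return best

P = [0.5, 0.3, 0.2]  # example distributions, chosen for illustration
Q = [0.2, 0.3, 0.5]
for eps in (0.1, 0.25, 0.6):
    print(eps, beta(P, Q, eps))
```

The printed values decrease as $\varepsilon$ grows, in line with the remark that a larger tolerance gives a smaller $\beta_{1-\varepsilon}$.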

ε-Hypothesis Testing Divergence

The ε-hypothesis testing divergence is
\[ D_h^\varepsilon(P \| Q) := -\log \frac{\beta_{1-\varepsilon}(P, Q)}{1 - \varepsilon}. \]
It is a measure of the distinguishability of $P$ from $Q$.
Similar to the divergence, we have non-negativity and the data processing inequality:
\[ D_h^\varepsilon(P \| Q) \ge 0, \qquad D_h^\varepsilon(P \| Q) \ge D_h^\varepsilon(PW \| QW), \]
where $PW(z') = \sum_z P(z) W(z' \mid z)$ for any channel $W : \mathcal{Z} \to \mathcal{Z}'$.

ε-Information Spectrum Divergence

While $D_h^\varepsilon(P \| Q)$ is very useful and fundamental, it is hard to compute (optimization over functions $\delta : \mathcal{Z} \to \{0, 1\}$).
Define another related quantity, the ε-information spectrum divergence:
\[ D_s^\varepsilon(P \| Q) := \sup\Big\{ R : P\Big( \Big\{ z \in \mathcal{Z} : \log \frac{P(z)}{Q(z)} \le R \Big\} \Big) \le \varepsilon \Big\}. \]
See "Information Spectrum Methods in Information Theory" by T. S. Han (2003).

ε-Information Spectrum Divergence

[Figure: density of $\log \frac{P(Z)}{Q(Z)}$ when $Z \sim P$, with mass $\varepsilon$ to the left of the point $R$ and mass $1 - \varepsilon$ to the right.]

$D_s^\varepsilon(P \| Q)$ is the largest point $R$ for which the probability mass to the left is no larger than $\varepsilon$.

ε-Information Spectrum Divergence

The ε-information spectrum divergence is easy to estimate.
If $P^n$ and $Q^n$ are product distributions, i.e.,
\[ P^n(z^n) = \prod_{i=1}^n P(z_i), \qquad Q^n(z^n) = \prod_{i=1}^n Q(z_i), \]
then the probability
\[ P^n\Big( \Big\{ z^n \in \mathcal{Z}^n : \log \frac{P^n(z^n)}{Q^n(z^n)} \le R \Big\} \Big) = \Pr\Big( \sum_{i=1}^n \log \frac{P(Z_i)}{Q(Z_i)} \le R \Big) \]
and the $\log \frac{P(Z_i)}{Q(Z_i)}$ (where $Z_i \sim P$) are iid random variables.
The probability is that of the tail of a sum of iid random variables: easy!
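On a finite alphabet the supremum in the definition of $D_s^\varepsilon$ is the smallest value of the log-likelihood ratio whose cumulative $P$-mass exceeds $\varepsilon$, so it can be read off a sorted list. The sketch below implements that observation; the function names and example distributions are assumptions made for illustration, and the product distribution is built by plain enumeration, which is only feasible for small $n$.

```python
from itertools import product
from math import inf, log, prod

def info_spectrum_divergence(P, Q, eps):
    """D_s^eps(P||Q) = sup{R : P(log(P/Q) <= R) <= eps}, in nats."""
    atoms = []
    for p, q in zip(P, Q):
        if p > 0:  # symbols with P(z) = 0 carry no P-mass
            atoms.append((log(p / q) if q > 0 else inf, p))
    atoms.sort()
    mass = 0.0
    for value, p in atoms:
        mass += p
        if mass > eps:  # first LLR value whose cumulative mass exceeds eps
            return value
    return inf  # only reached if eps >= 1

def product_distribution(P, n):
    """List of P^n(z^n) over all z^n, in the order given by itertools.product."""
    return [prod(P[z] for z in zn) for zn in product(range(len(P)), repeat=n)]

P = [0.5, 0.3, 0.2]  # example distributions, chosen for illustration
Q = [0.2, 0.3, 0.5]
print(info_spectrum_divergence(P, Q, 0.25))
print(info_spectrum_divergence(product_distribution(P, 4),
                               product_distribution(Q, 4), 0.25))
```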

Relation Between Divergences

Lemma
For every $\varepsilon \in (0,1)$ and $\eta \in (0, 1 - \varepsilon)$, we have
\[ D_s^\varepsilon(P \| Q) - \log \frac{1}{1 - \varepsilon} \le D_h^\varepsilon(P \| Q) \le D_s^{\varepsilon + \eta}(P \| Q) + \log \frac{1 - \varepsilon}{\eta}. \]
Proof: We only prove the lower bound. Let $\delta$ be the likelihood ratio test
\[ \delta(z) := \mathbb{1}\Big\{ \log \frac{P(z)}{Q(z)} \le \gamma \Big\}, \]
where $\gamma := D_s^\varepsilon(P \| Q) - \xi$ for an arbitrary $\xi > 0$.

Proof of Relation Between Divergences

By the definition of the ε-information spectrum divergence, and since $\gamma = D_s^\varepsilon(P \| Q) - \xi$,
\[ \mathbb{E}_P[\delta(Z)] = P\Big( \Big\{ z \in \mathcal{Z} : \log \frac{P(z)}{Q(z)} \le \gamma \Big\} \Big) \le \varepsilon. \]
Next, we estimate
\[ \mathbb{E}_Q[1 - \delta(Z)] = \sum_z Q(z)\, \mathbb{1}\Big\{ \log \frac{P(z)}{Q(z)} > \gamma \Big\} \le \sum_z P(z) \exp(-\gamma)\, \mathbb{1}\Big\{ \log \frac{P(z)}{Q(z)} > \gamma \Big\} \le \exp(-\gamma). \]
Thus, one has
\[ D_h^\varepsilon(P \| Q) \ge \gamma - \log \frac{1}{1 - \varepsilon} = D_s^\varepsilon(P \| Q) - \xi - \log \frac{1}{1 - \varepsilon}. \]
The proof is completed by taking $\xi \downarrow 0$.
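The two-sided bound in the lemma can be checked numerically on a small example. The sketch below is a rough illustration under stated assumptions: tests are deterministic, as in the definition of $\beta_{1-\varepsilon}$ above, the alphabet is small enough to enumerate all regions $A$, and the distributions and helper names are made up for the example.

```python
from itertools import combinations
from math import inf, log

def beta(P, Q, eps):
    """Minimum of Q(A^c) over regions A with P(A) <= eps (brute force)."""
    best = 1.0
    for size in range(1, len(P) + 1):
        for A in combinations(range(len(P)), size):
            if sum(P[z] for z in A) <= eps:
                best = min(best, 1.0 - sum(Q[z] for z in A))
    return best

def d_h(P, Q, eps):
    """Hypothesis testing divergence -log(beta_{1-eps} / (1 - eps)), in nats."""
    b = beta(P, Q, eps)
    return inf if b <= 0 else -log(b / (1 - eps))

def d_s(P, Q, eps):
    """Information spectrum divergence on a finite alphabet, in nats."""
    atoms = sorted((log(p / q) if q > 0 else inf, p) for p, q in zip(P, Q) if p > 0)
    mass = 0.0
    for value, p in atoms:
        mass += p
        if mass > eps:
            return value
    return inf

def lemma_bounds(P, Q, eps, eta):
    """Return (lower bound, D_h^eps, upper bound) from the lemma."""
    lower = d_s(P, Q, eps) - log(1 / (1 - eps))
    upper = d_s(P, Q, eps + eta) + log((1 - eps) / eta)
    return lower, d_h(P, Q, eps), upper

P = [0.5, 0.3, 0.2]  # example distributions, chosen for illustration
Q = [0.2, 0.3, 0.5]
print(lemma_bounds(P, Q, eps=0.25, eta=0.2))
```

The middle value should lie between the other two for any admissible pair $(\varepsilon, \eta)$; how tight the sandwich is depends on the example.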

Basic Definitions

Define the product distributions
\[ P^{(n)}(z^n) = \prod_{i=1}^n P_i(z_i), \qquad Q^{(n)}(z^n) = \prod_{i=1}^n Q_i(z_i). \]
The relative entropy is
\[ D(P \| Q) = \sum_z P(z) \log \frac{P(z)}{Q(z)}. \]
The relative entropy variance is
\[ V(P \| Q) := \sum_z P(z) \Big[ \log \frac{P(z)}{Q(z)} - D(P \| Q) \Big]^2. \]
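Both quantities are the mean and variance of the log-likelihood ratio under $P$, so they take only a few lines to compute. This is a minimal sketch in nats; the function names and example distributions are illustrative, and it assumes $Q(z) > 0$ wherever $P(z) > 0$.

```python
from math import log

def relative_entropy(P, Q):
    """D(P||Q) = sum_z P(z) log(P(z)/Q(z)), in nats."""
    if any(p > 0 and q == 0 for p, q in zip(P, Q)):
        raise ValueError("D(P||Q) is infinite: Q(z) = 0 where P(z) > 0")
    return sum(p * log(p / q) for p, q in zip(P, Q) if p > 0)

def relative_entropy_variance(P, Q):
    """V(P||Q) = sum_z P(z) [log(P(z)/Q(z)) - D(P||Q)]^2, in nats^2."""
    D = relative_entropy(P, Q)
    return sum(p * (log(p / q) - D) ** 2 for p, q in zip(P, Q) if p > 0)

P = [0.5, 0.3, 0.2]  # example distributions, chosen for illustration
Q = [0.2, 0.3, 0.5]
print(relative_entropy(P, Q), relative_entropy_variance(P, Q))
```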

Basic Asymptotic Expansions

Lemma (Asymptotic Expansion for $D_s^\varepsilon$)
Assume that $V(P_i \| Q_i) \ge V_-$ for all $i$ and for some $V_- > 0$. Then,
\[ D_s^\varepsilon(P^{(n)} \| Q^{(n)}) = n D_n + \sqrt{n V_n}\, \Phi^{-1}(\varepsilon) + O(1), \]
where
\[ D_n := \frac{1}{n} \sum_{i=1}^n D(P_i \| Q_i), \qquad V_n := \frac{1}{n} \sum_{i=1}^n V(P_i \| Q_i). \]

Corollary (Asymptotic Expansion for $D_s^\varepsilon$ for Identical Distributions)
If $P_i = P$ and $Q_i = Q$ for all $i \in \{1, 2, \ldots, n\}$ and $P \ll Q$, then
\[ D_s^\varepsilon(P^{(n)} \| Q^{(n)}) = n D + \sqrt{n V}\, \Phi^{-1}(\varepsilon) + O(1), \]
where $D = D(P \| Q)$ and $V = V(P \| Q)$.

Asymptotic Expansion for $D_s^\varepsilon(P^n \| Q^n)$

Moral:
\[ \sqrt{n} \Big( \frac{1}{n} \sum_{i=1}^n \log \frac{P(Z_i)}{Q(Z_i)} - D(P \| Q) \Big) \xrightarrow{d} \mathcal{N}(0, V(P \| Q)). \]

[Figure: density sketch with labels $D(P \| Q)$ and $V(P \| Q)^{1/2}$.]

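Berry-Esseen Theorem

Theorem
Let $X_1, \ldots, X_n$ be independent random variables with
\[ \mathbb{E}[X_i] = 0, \qquad \mathbb{E}[X_i^2] = \sigma_i^2, \qquad \mathbb{E}[|X_i|^3] = T_i. \]
Define
\[ \sigma^2 := \frac{1}{n} \sum_{i=1}^n \sigma_i^2, \qquad T := \frac{1}{n} \sum_{i=1}^n T_i. \]
Then for every $n \ge 1$,
\[ \sup_{a \in \mathbb{R}} \Big| \Pr\Big( \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^n X_i < a \Big) - \Phi(a) \Big| \le \frac{6T}{\sigma^3 \sqrt{n}}, \]
where $\Phi(a) = \int_{-\infty}^a \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt$.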

Berry-Esseen Theorem vs Central Limit Theorem

Recall that the CLT implies that
\[ \Pr\Big( \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^n X_i < a \Big) \to \Phi(a) = \Pr(Z < a) \]
for every $a \in \mathbb{R}$.
Thus, the Berry-Esseen theorem quantifies the rate of convergence of the distribution function of $\frac{1}{\sigma \sqrt{n}} \sum_{i=1}^n X_i$ to that of the standard Gaussian $Z$.
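The bound can be compared with the exact Kolmogorov distance in a case where the distribution of the sum is known in closed form. The sketch below uses centered Bernoulli summands, an assumption made only for this example, so that the sum is a shifted binomial; the parameter values and function names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_gap(n, p):
    """sup_a |Pr(S_n / (sigma sqrt(n)) < a) - Phi(a)| for X_i = B_i - p, B_i ~ Bern(p)."""
    Phi = NormalDist().cdf
    sigma = sqrt(p * (1 - p))
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    gap, cdf_left = 0.0, 0.0  # cdf_left = Pr(K < k) for K ~ Bin(n, p)
    for k in range(n + 1):
        a = (k - n * p) / (sigma * sqrt(n))  # location of the k-th jump
        # The supremum is attained at a jump, from the left or from the right.
        gap = max(gap, abs(cdf_left - Phi(a)), abs(cdf_left + pmf[k] - Phi(a)))
        cdf_left += pmf[k]
    return gap

def berry_esseen_bound(n, p):
    """6 T / (sigma^3 sqrt(n)) with T = E|B - p|^3."""
    sigma = sqrt(p * (1 - p))
    T = p * (1 - p) * ((1 - p) ** 2 + p**2)
    return 6 * T / (sigma**3 * sqrt(n))

for n in (10, 50, 200):
    print(n, exact_gap(n, 0.3), berry_esseen_bound(n, 0.3))
```

The exact gap shrinks roughly like $1/\sqrt{n}$ and stays below the bound, which is loose because of the conservative constant 6.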

Basic Asymptotic Expansions: Proof

Now we show
\[ D_s^\varepsilon(P^{(n)} \| Q^{(n)}) = n D_n + \sqrt{n V_n}\, \Phi^{-1}(\varepsilon) + O(1). \]
We may write
\[ \Pr\Big( \log \frac{P^{(n)}(Z^n)}{Q^{(n)}(Z^n)} \le R \Big) = \Pr\Big( \sum_{i=1}^n \log \frac{P_i(Z_i)}{Q_i(Z_i)} \le R \Big). \]
By the Berry-Esseen theorem,
\[ \Pr\Big( \sum_{i=1}^n \log \frac{P_i(Z_i)}{Q_i(Z_i)} \le R \Big) = \Phi\Big( \frac{R - n D_n}{\sqrt{n V_n}} \Big) \pm \frac{c}{\sqrt{n}}. \]
The constant $c > 0$ is finite because $V(P_i \| Q_i) \ge V_- > 0$ for all $i$.
Now upper bound the RHS by $\varepsilon$ and solve for $R$.
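The $O(1)$ claim can be probed numerically in the iid binary case, where the log-likelihood ratio of $z^n$ depends only on its number of ones and the exact $D_s^\varepsilon(P^n \| Q^n)$ follows from a binomial distribution. The sketch below is one such check; the Bernoulli parameters and function names are assumptions for the example, and the log-domain binomial pmf is used only to avoid overflow.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def exact_ds_binary(p, q, n, eps):
    """Exact D_s^eps(P^n||Q^n) in nats for P = Bern(p), Q = Bern(q)."""
    a1, a0 = log(p / q), log((1 - p) / (1 - q))  # per-symbol LLR for z = 1 and z = 0
    atoms = []
    for k in range(n + 1):  # k = number of ones; K ~ Bin(n, p) under P^n
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1 - p))
        atoms.append((k * a1 + (n - k) * a0, exp(log_pmf)))
    atoms.sort()
    mass = 0.0
    for value, prob in atoms:
        mass += prob
        if mass > eps:  # smallest LLR value whose cumulative mass exceeds eps
            return value
    return float("inf")

def gaussian_approx(p, q, n, eps):
    """n D + sqrt(n V) Phi^{-1}(eps) for the same pair of Bernoulli distributions."""
    a1, a0 = log(p / q), log((1 - p) / (1 - q))
    D = p * a1 + (1 - p) * a0
    V = p * (1 - p) * (a1 - a0) ** 2
    return n * D + sqrt(n * V) * NormalDist().inv_cdf(eps)

for n in (100, 400, 1000):
    exact, approx = exact_ds_binary(0.3, 0.6, n, 0.1), gaussian_approx(0.3, 0.6, n, 0.1)
    print(n, round(exact, 3), round(approx, 3), round(exact - approx, 3))
```

In this example the difference between the two columns should stay bounded as $n$ grows, while both grow linearly in $n$.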

Further Asymptotic Expansions

Our real objective in this section is
\[ \beta_{1-\varepsilon}(P^n, Q^n) := \inf_{\delta : \mathcal{Z}^n \to \{0,1\}} \{ P_{\mathrm{MD}}(\delta) : P_{\mathrm{FA}}(\delta) \le \varepsilon \}. \]
Now, we assume that $P^n$ and $Q^n$ are product distributions where the component distributions are identical, i.e.,
\[ P^n(z^n) = \prod_{i=1}^n P(z_i), \qquad Q^n(z^n) = \prod_{i=1}^n Q(z_i). \]
Recall that
\[ D_h^\varepsilon(P^n \| Q^n) := -\log \frac{\beta_{1-\varepsilon}(P^n, Q^n)}{1 - \varepsilon} \]
and that $D_h^\varepsilon$ is related to $D_s^\varepsilon$ by simple bounds.

Further Asymptotic Expansions

Lemma
Assume that $P \ll Q$. Then, for every $\varepsilon \in (0,1)$,
\[ D_h^\varepsilon(P^n \| Q^n) = n D + \sqrt{n V}\, \Phi^{-1}(\varepsilon) + O(\log n), \]
where $D = D(P \| Q)$ and $V = V(P \| Q)$.

Corollary
Assume that $P \ll Q$. Then, for every $\varepsilon \in (0,1)$,
\[ \beta_{1-\varepsilon}(P^n, Q^n) = \exp\big( -n D - \sqrt{n V}\, \Phi^{-1}(\varepsilon) + O(\log n) \big). \]

Chernoff-Stein Lemma

[Photos: H. Chernoff, C. Stein]

Corollary (Chernoff-Stein Lemma)
\[ \lim_{n\to\infty} -\frac{1}{n} \log \beta_{1-\varepsilon}(P^n, Q^n) = D(P \| Q), \quad \forall\, \varepsilon \in (0,1). \]
We have proved a refinement of the Chernoff-Stein Lemma, cf.
\[ -\log \beta_{1-\varepsilon}(P^n, Q^n) = n D(P \| Q) + \sqrt{n V(P \| Q)}\, \Phi^{-1}(\varepsilon) + O(\log n). \]
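The refinement shows how quickly the normalized exponent approaches $D(P \| Q)$. The following sketch evaluates the first two terms of the expansion and drops the $O(\log n)$ remainder, so the numbers are approximations only; the distributions and the function name are assumptions made for the example.

```python
from math import log, sqrt
from statistics import NormalDist

def neg_log_beta_approx(P, Q, n, eps):
    """n D + sqrt(n V) Phi^{-1}(eps) in nats, ignoring the O(log n) term."""
    llr = [(p, log(p / q)) for p, q in zip(P, Q) if p > 0]
    D = sum(p * value for p, value in llr)
    V = sum(p * (value - D) ** 2 for p, value in llr)
    return n * D + sqrt(n * V) * NormalDist().inv_cdf(eps)

P = [0.5, 0.3, 0.2]  # example distributions, chosen for illustration
Q = [0.2, 0.3, 0.5]
for n in (100, 1000, 10000):
    # Per-sample exponent; its first-order limit is D(P||Q) = 0.3 * log(2.5).
    print(n, neg_log_beta_approx(P, Q, n, 0.1) / n)
```

For $\varepsilon < 1/2$ the per-sample exponent approaches $D(P \| Q)$ from below, with a gap of order $1/\sqrt{n}$.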

Further Asymptotic Expansions: Proof

Recall that
\[ D_s^\varepsilon(P^n \| Q^n) - \log \frac{1}{1 - \varepsilon} \le D_h^\varepsilon(P^n \| Q^n) \le D_s^{\varepsilon + \eta}(P^n \| Q^n) + \log \frac{1 - \varepsilon}{\eta}. \]
From the lower bound, we obtain
\[ D_s^\varepsilon(P^n \| Q^n) - \log \frac{1}{1 - \varepsilon} = n D + \sqrt{n V}\, \Phi^{-1}(\varepsilon) + O(1). \]
From the upper bound, setting $\eta = \frac{1}{\sqrt{n}}$, we obtain
\[ D_s^{\varepsilon + \eta}(P^n \| Q^n) + \log \frac{1 - \varepsilon}{\eta} = n D + \sqrt{n V}\, \Phi^{-1}\Big( \varepsilon + \frac{1}{\sqrt{n}} \Big) + \frac{1}{2} \log n + O(1), \]
and a Taylor expansion of $\Phi^{-1}$ around $\varepsilon$ gives
\[ = n D + \sqrt{n V}\, \Phi^{-1}(\varepsilon) + \frac{1}{2} \log n + O(1). \]

Summary of Binary Hypothesis Testing

The main result of this section is
\[ -\frac{1}{n} \log \beta_{1-\varepsilon}(P^n, Q^n) \approx D(P \| Q) + \sqrt{\frac{V(P \| Q)}{n}}\, \Phi^{-1}(\varepsilon). \]
We can be more precise about the third-order terms (omitted in this tutorial).
Sometimes we can't guarantee that
\[ \inf_{i \ge 1} V(P_i \| Q_i) > 0. \]
In this case, we can use Chebyshev's inequality instead of the Berry-Esseen theorem to upper bound $D_s^\varepsilon(P^{(n)} \| Q^{(n)})$.

Setup for Lossless Source Coding
[Figure: x → f → m → ϕ → x̂. Illustration of the fixed-to-fixed length source coding problem.]
An $(M, \varepsilon)$-code for the source $P \in \mathcal{P}(\mathcal{X})$ consists of
an encoder $f : \mathcal{X} \to \{1, \ldots, M\}$ and
a decoder $\varphi : \{1, \ldots, M\} \to \mathcal{X}$
such that the probability of error satisfies
$$P\big(\{x \in \mathcal{X} : \varphi(f(x)) \ne x\}\big) \le \varepsilon.$$

Non-Asymptotic Fundamental Limit for Source Coding
Define
$$M^*(P, \varepsilon) := \min\{M : \exists \text{ an } (M, \varepsilon)\text{-code for } P\}.$$
When we observe $n$ independent and identically distributed realizations of the source, we are interested in $M^*(P^n, \varepsilon)$, where
$$P^n(x^n) = \prod_{i=1}^n P(x_i).$$

The Source Coding Theorem
Shannon, in his seminal 1948 paper, showed using typicality arguments the following.
Theorem (Shannon (1948)). For any discrete memoryless source and any $\varepsilon \in (0, 1)$,
$$\lim_{n\to\infty} \frac{1}{n} \log M^*(P^n, \varepsilon) = H(P) = \sum_x P(x) \log\frac{1}{P(x)} \quad \text{bits per source symbol}.$$
The quantity $H(P)$ is known as the entropy of the source.
Interpretation: entropy is the smallest exponent of the size of sets in $\mathcal{X}^n$ with $P^n$-probability of at least $1 - \varepsilon$.

Refinements to the Source Coding Theorem
Can we refine the remainder term in
$$\log M^*(P^n, \varepsilon) = nH(P) + o(n) \quad \text{bits}?$$
In fact, this can be done very easily by using binary hypothesis testing!
There is a strong and simple connection between binary hypothesis testing and fixed-length lossless source coding.

Non-Asymptotic Achievability for Source Coding
Lemma. Let $\varepsilon \in (0, 1)$. We have
$$\log M^*(P, \varepsilon) \le \log \beta_{1-\varepsilon}(P, \mu) = -D_h^{\varepsilon}(P\|\mu) - \log\frac{1}{1-\varepsilon},$$
where $\mu$ is the counting measure, i.e.,
$$\mu(\mathcal{A}) = |\mathcal{A}|, \quad \forall\, \mathcal{A} \subset \mathcal{X}.$$
Remark: the second argument of $\beta_{1-\varepsilon}(P, Q)$ can be any unnormalized measure (not necessarily a probability measure); here we take $Q = \mu$.

Proof of Achievability for Source Coding
Let $\mathcal{T} \subset \mathcal{X}$ be a typical set of symbols with $P$-probability at least $1 - \varepsilon$, i.e., $P(\mathcal{T}) \ge 1 - \varepsilon$.
Search over all such sets for the one with the smallest cardinality.
Assign each symbol in $\mathcal{T}$ a unique index from $\{1, \ldots, |\mathcal{T}|\}$ and those in $\mathcal{X} \setminus \mathcal{T}$ the index 1.
Clearly, the error probability is at most $\varepsilon$ and hence
$$M^*(P, \varepsilon) \le \min_{\mathcal{T} \subset \mathcal{X} :\, P(\mathcal{T}) \ge 1-\varepsilon} |\mathcal{T}|.$$
The RHS is exactly $\beta_{1-\varepsilon}(P, \mu)$.
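The search for the smallest high-probability set can be carried out exactly for a small example. The sketch below assumes a Bernoulli($p$) source with $p \le 1/2$, so that strings with fewer ones are more probable and the set can be filled greedily type class by type class; the function name and the use of exact rational arithmetic are illustrative choices.

```python
from fractions import Fraction
from math import ceil, comb

def smallest_high_prob_set(n, p, eps):
    """Size of the smallest set of length-n binary strings whose probability
    under Bernoulli(p)^n is at least 1 - eps. Assumes 0 <= p <= 1/2."""
    p, eps = Fraction(p), Fraction(eps)
    target = 1 - eps
    size, mass = 0, Fraction(0)
    for k in range(n + 1):                      # k = number of ones in the string
        q = p**k * (1 - p)**(n - k)             # probability of each such string
        count = comb(n, k)
        if mass + count * q >= target:
            # Only part of this type class is needed to reach the target mass.
            return size + ceil((target - mass) / q)
        size += count
        mass += count * q
    return size

# Strings 00, 01, 10 have total probability 0.99 >= 0.95; 00 and 01 alone do not suffice.
print(smallest_high_prob_set(2, "1/10", "1/20"))
```

Passing `p` and `eps` as strings such as `"1/10"` keeps the arithmetic exact, which avoids floating-point ties at the threshold $1 - \varepsilon$.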

Non-Asymptotic Converse for Source Coding
Lemma. Let $\varepsilon \in (0, 1)$. For any $\eta \in (0, 1 - \varepsilon)$, we have
$$\log M^*(P, \varepsilon) \ge -D_s^{\varepsilon+\eta}(P\|\mu) - \log\frac{1}{\eta}.$$
The proof follows the same idea as upper bounding $D_h^{\varepsilon}$ by $D_s^{\varepsilon+\eta}$.
In summary,
$$-D_s^{\varepsilon+\eta}(P\|\mu) - \log\frac{1}{\eta} \le \log M^*(P, \varepsilon) \le -D_h^{\varepsilon}(P\|\mu) - \log\frac{1}{1-\varepsilon}.$$

Asymptotic Expansion for Lossless Source Coding
Define the entropy variance as
$$V(P) = \sum_x P(x) \Big[\log\frac{1}{P(x)} - H(P)\Big]^2.$$
Theorem (Strassen (1962)). For any source $P$ with $V(P) > 0$, we have
$$\log M^*(P^n, \varepsilon) = nH(P) - \sqrt{nV(P)}\,\Phi^{-1}(\varepsilon) + O(\log n).$$
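The first two terms of the expansion are easy to evaluate. The following is a minimal sketch, assuming base-2 logarithms and a pmf passed as a list; the function names and the example source are assumptions made for illustration, and the $O(\log n)$ term is simply dropped.

```python
from math import log2, sqrt
from statistics import NormalDist

def entropy(P):
    """Entropy H(P) in bits."""
    return sum(p * log2(1 / p) for p in P if p > 0)

def varentropy(P):
    """Entropy variance V(P): variance of log2(1/P(X)) under P."""
    h = entropy(P)
    return sum(p * (log2(1 / p) - h) ** 2 for p in P if p > 0)

def log_M_approx(P, n, eps):
    """nH(P) - sqrt(nV(P)) * Phi^{-1}(eps), in bits, ignoring the O(log n) term."""
    return n * entropy(P) - sqrt(n * varentropy(P)) * NormalDist().inv_cdf(eps)

P = [0.25, 0.75]                                # hypothetical binary source
for n in (100, 1000, 10000):
    print(n, log_M_approx(P, n, eps=0.1) / n, entropy(P))
```

Since $\Phi^{-1}(\varepsilon) < 0$ for $\varepsilon < 1/2$, the approximate rate exceeds $H(P)$ at finite blocklengths, reflecting the backoff from the entropy.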

Asymptotic Expansion for Source Coding: Proof
The lower bound reads
$$\log M^*(P^n, \varepsilon) \ge -D_s^{\varepsilon+\eta}(P^n\|\mu^n) - \log\frac{1}{\eta}.$$
Choose $\eta = 1/\sqrt{n}$. Use the asymptotic expansion for $D_s^{\varepsilon}$:
$$\log M^*(P^n, \varepsilon) \ge -D_s^{\varepsilon + 1/\sqrt{n}}(P^n\|\mu^n) + O(\log n) = nH(P) - \sqrt{nV(P)}\,\Phi^{-1}\Big(\varepsilon + \frac{1}{\sqrt{n}}\Big) + O(\log n),$$
because $D(P\|\mu) = -H(P)$ and $V(P\|\mu) = V(P)$.
The proof is completed by Taylor expanding $\Phi^{-1}(\cdot)$.
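The two identities used in the proof follow directly from the definitions once $\mu(\{x\}) = 1$ is substituted for the counting measure. A short worked derivation, using only the definitions of $D$, $V$ and $H$ from the earlier slides:

```latex
\begin{align*}
D(P\|\mu) &= \sum_x P(x)\log\frac{P(x)}{\mu(\{x\})}
           = \sum_x P(x)\log P(x) = -H(P),\\
V(P\|\mu) &= \sum_x P(x)\Big[\log\frac{P(x)}{\mu(\{x\})} - D(P\|\mu)\Big]^2
           = \sum_x P(x)\Big[\log\frac{1}{P(x)} - H(P)\Big]^2 = V(P).
\end{align*}
```

The sign flip in $D(P\|\mu)$ is what turns the $+\sqrt{nV}\,\Phi^{-1}(\varepsilon)$ of hypothesis testing into the $-\sqrt{nV(P)}\,\Phi^{-1}(\varepsilon)$ of source coding.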

Asymptotic Expansion for Source Coding: Proof
The upper bound reads
$$\log M^*(P^n, \varepsilon) \le -D_h^{\varepsilon}(P^n\|\mu^n) - \log\frac{1}{1-\varepsilon}.$$
The asymptotic expansion for $D_h^{\varepsilon}$ then yields
$$\log M^*(P^n, \varepsilon) \le nH(P) - \sqrt{nV(P)}\,\Phi^{-1}(\varepsilon) + O(\log n),$$
again because $D(P\|\mu) = -H(P)$ and $V(P\|\mu) = V(P)$.

Lossless Source Coding: Remarks
The main takeaway here is that
$$\log M^*(P^n, \varepsilon) = nH(P) - \sqrt{nV(P)}\,\Phi^{-1}(\varepsilon) + O(\log n).$$
We can be more precise about the third-order term:
$$O(\log n) = -\frac{1}{2}\log n + O(1).$$
This requires some additional techniques.
Observe that fixed-length lossless source coding is nothing but binary hypothesis testing with $Q$ taken to be the counting measure $\mu$ while $P$ remains as $P$!
The proof follows directly from the expansions of $D_h^{\varepsilon}(P^n\|Q^n)$ and $D_s^{\varepsilon}(P^n\|Q^n)$.

The Setup of the Channel Coding Problem
[Figure: m → f → x → W → y → ϕ → m̂. Illustration of the channel coding problem.]
An $(M, \varepsilon)_{\mathrm{av}}$-code for the channel $W \in \mathcal{P}(\mathcal{Y}|\mathcal{X})$ consists of
an encoder $f : \{1, \ldots, M\} \to \mathcal{X}$ and
a decoder $\varphi : \mathcal{Y} \to \{1, \ldots, M\}$
such that the average probability of error satisfies
$$\frac{1}{M} \sum_{m=1}^{M} W\big(\mathcal{Y} \setminus \varphi^{-1}(m) \,\big|\, f(m)\big) \le \varepsilon.$$

Non-Asymptotic Fund. Limit for Channel Coding
Define
$$M^*_{\mathrm{av}}(W, \varepsilon) := \max\{M : \exists \text{ an } (M, \varepsilon)_{\mathrm{av}}\text{-code for } W\}.$$
When we use the channel $n$ times, we are interested in $M^*_{\mathrm{av}}(W^n, \varepsilon)$, where
$$W^n(y^n|x^n) = \prod_{i=1}^n W(y_i|x_i).$$
This means that the channel is stationary and memoryless.
We also assume $\mathcal{X}$ and $\mathcal{Y}$ are finite, so $W$ is a DMC.
Later we will also consider AWGN channels, in which there is a power constraint.

The Channel Coding Theorem
Shannon, in his seminal 1948 paper, showed using typicality arguments the following.
Theorem (Shannon (1948)). For any discrete memoryless channel and any $\varepsilon \in (0, 1)$,
$$\lim_{n\to\infty} \frac{1}{n} \log M^*_{\mathrm{av}}(W^n, \varepsilon) = C = \max_P I(P, W) \quad \text{bits per channel use}.$$
Mutual information for an input distribution $P$ and channel $W$ is defined as
$$I(P, W) = \sum_x P(x)\, D(W(\cdot|x)\|PW) = \sum_{x,y} P(x) W(y|x) \log\frac{W(y|x)}{PW(y)}.$$
Interpretation: we can send up to $C$ bits per channel use over $W$.
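The definition of $I(P, W)$ translates directly into code. Below is a minimal sketch, assuming the channel is given as a row-stochastic matrix `W[x][y]` and logarithms are in base 2; the binary symmetric channel with crossover probability 0.11 is a hypothetical example, for which the uniform input is known to achieve capacity $1 - h(p)$.

```python
from math import log2

def output_distribution(P, W):
    """PW(y) = sum_x P(x) W(y|x), with W[x][y] a row-stochastic matrix."""
    return [sum(P[x] * W[x][y] for x in range(len(P))) for y in range(len(W[0]))]

def mutual_information(P, W):
    """I(P, W) in bits, skipping zero-probability terms."""
    PW = output_distribution(P, W)
    return sum(P[x] * W[x][y] * log2(W[x][y] / PW[y])
               for x in range(len(P)) for y in range(len(PW))
               if P[x] > 0 and W[x][y] > 0)

def binary_entropy(p):
    """Binary entropy function h(p) in bits."""
    return 0.0 if p in (0, 1) else -p * log2(p) - (1 - p) * log2(1 - p)

p = 0.11
bsc = [[1 - p, p], [p, 1 - p]]
# Uniform input on a BSC: I(P, W) should match 1 - h(p).
print(mutual_information([0.5, 0.5], bsc), 1 - binary_entropy(p))
```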

Refinements to the Channel Coding Theorem
Can we refine the remainder term in
$$\log M^*_{\mathrm{av}}(W^n, \varepsilon) = nC + o(n)?$$
This is not as simple as lossless source coding.
It requires an understanding of non-asymptotic (finite blocklength) bounds.
It also requires careful asymptotic evaluations.

Non-Asymptotic Achievability for Channel Coding
Lemma (Feinstein (1954)). Let $\varepsilon \in (0, 1)$ and let $W$ be any channel from $\mathcal{X}$ to $\mathcal{Y}$. Then for any $\eta \in (0, \varepsilon)$, we have
$$\log M^*_{\mathrm{av}}(W, \varepsilon) \ge \sup_P D_s^{\varepsilon-\eta}(P \times W \,\|\, P \times PW) - \log\frac{1}{\eta}.$$
In fact, the original Feinstein's lemma (1954) works for the maximum probability of error:
$$\log M^*_{\max}(W, \varepsilon) \ge \sup_P D_s^{\varepsilon-\eta}(P \times W \,\|\, P \times PW) - \log\frac{1}{\eta}.$$
Again there is a connection to binary hypothesis testing.

Proof of Feinstein's Lemma: Part I
Generate $M$ codewords independently from $P$. This forms the codebook $\mathcal{C} = \{x(1), \ldots, x(M)\}$.
Given $y$, decode to $m \in \{1, \ldots, M\}$ if and only if
$$\log\frac{W(y|x(m))}{PW(y)} > \gamma.$$
Assume $m = 1$. The error events are
$$\mathcal{E}_1 := \Big\{\log\frac{W(Y|X(1))}{PW(Y)} \le \gamma\Big\}, \qquad \mathcal{E}_2 := \Big\{\exists\, \tilde m \ne 1 : \log\frac{W(Y|X(\tilde m))}{PW(Y)} > \gamma\Big\}.$$

Proof of Feinstein's Lemma: Part II
The probability of error is
$$\Pr(\mathcal{E}) \le \Pr(\mathcal{E}_1) + \Pr(\mathcal{E}_2).$$
The first term is (related to the information spectrum divergence)
$$\Pr(\mathcal{E}_1) = \Pr\Big(\log\frac{W(Y|X(1))}{PW(Y)} \le \gamma\Big).$$
The second term is
$$\Pr(\mathcal{E}_2) = \Pr\Big(\exists\, \tilde m \ne 1 : \log\frac{W(Y|X(\tilde m))}{PW(Y)} > \gamma\Big) \le M \sum_{x,y} P(x) PW(y)\, 1\Big\{\log\frac{W(y|x)}{PW(y)} > \gamma\Big\} \le M \sum_{x,y} P(x) W(y|x) \exp(-\gamma)\, 1\Big\{\log\frac{W(y|x)}{PW(y)} > \gamma\Big\} \le M \exp(-\gamma).$$

Proof of Feinstein's Lemma: Part III
Hence, there exists an $(M, \varepsilon)_{\mathrm{av}}$-code such that for every $P$ and every $\gamma$,
$$\varepsilon \le \Pr\Big(\log\frac{W(Y|X(1))}{PW(Y)} \le \gamma\Big) + M \exp(-\gamma),$$
or, setting $\eta := M \exp(-\gamma)$,
$$\varepsilon - \eta \le \Pr\Big(\log\frac{W(Y|X(1))}{PW(Y)} \le \log\frac{M}{\eta}\Big).$$
In other words, for every $P$ and every $\eta$, we have an $(M, \varepsilon)_{\mathrm{av}}$-code such that
$$\log\frac{M}{\eta} \ge D_s^{\varepsilon-\eta}(P \times W \,\|\, P \times PW).$$
This completes the proof of Feinstein's lemma.
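The two-term bound $\Pr(\mathcal{E}_1) + M\exp(-\gamma)$ can be evaluated exactly for a simple channel. The sketch below assumes a binary symmetric channel with crossover probability $0 < p < 1$, i.i.d. uniform inputs, and natural logarithms; in that case the information density depends only on the number of bit flips. The function name, the rate, and the grid over $\gamma$ are illustrative assumptions.

```python
from math import comb, exp, log

def feinstein_error_bound(n, p, log_M, gamma):
    """Pr(i(X^n; Y^n) <= gamma) + M * exp(-gamma) for a BSC(p) with uniform
    inputs; log_M and gamma are in nats."""
    prob_low_density = 0.0
    for k in range(n + 1):                       # k = number of bit flips
        # Information density log( W^n(y|x) / PW^n(y) ) with PW^n(y) = 2^{-n}.
        density = n * log(2) + k * log(p) + (n - k) * log(1 - p)
        if density <= gamma:
            prob_low_density += comb(n, k) * p**k * (1 - p)**(n - k)
    return prob_low_density + exp(log_M - gamma)

# Hypothetical operating point: blocklength 500, rate 0.4 bits per channel use.
n, p, rate_bits = 500, 0.11, 0.4
log_M = rate_bits * n * log(2)
best = min(feinstein_error_bound(n, p, log_M, log_M + t) for t in range(1, 60))
print(best)
```

Scanning $\gamma$ above $\log M$ trades the first term (which grows with $\gamma$) against the second (which decays), mirroring the role of $\eta = M\exp(-\gamma)$ in the lemma.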

Second-Order Achievability for Channel Coding
Feinstein's lemma says that
$$\log M^*_{\mathrm{av}}(W^n, \varepsilon) \ge \sup_{P \in \mathcal{P}(\mathcal{X}^n)} D_s^{\varepsilon-\eta}(P \times W^n \,\|\, P \times PW^n) - \log\frac{1}{\eta}.$$
Choose $\eta = 1/\sqrt{n}$ and
$$P(x^n) = \prod_{i=1}^n P^*(x_i), \quad \text{where} \quad P^* = \arg\max_P I(P, W),$$
which we assume to be unique.

Second-Order Achievability for Channel Coding
By the asymptotic expansion of $D_s^{\varepsilon}$, we obtain
$$D_s^{\varepsilon-\eta}\big((P^*)^n \times W^n \,\|\, (P^*)^n \times (P^*W)^n\big) = nI(P^*, W) + \sqrt{nU(P^*, W)}\,\Phi^{-1}(\varepsilon) + O(\log n).$$
Then, using the bound on $\log M^*_{\mathrm{av}}(W^n, \varepsilon)$, we obtain
$$\log M^*_{\mathrm{av}}(W^n, \varepsilon) \ge nI(P^*, W) + \sqrt{nU(P^*, W)}\,\Phi^{-1}(\varepsilon) + O(\log n) = nC + \sqrt{nU(P^*, W)}\,\Phi^{-1}(\varepsilon) + O(\log n),$$
where the unconditional information variance is
$$U(P^*, W) = \sum_{x,y} P^*(x) W(y|x) \Big[\log\frac{W(y|x)}{P^*W(y)} - C\Big]^2.$$

Second-Order Achievability for Channel Coding
Lemma. For a DMC with unique capacity-achieving input distribution $P^*$,
$$\log M^*_{\mathrm{av}}(W^n, \varepsilon) \ge nC + \sqrt{nU(P^*, W)}\,\Phi^{-1}(\varepsilon) + O(\log n).$$
The second-order term contains
$$U(P^*, W) = \sum_{x,y} P^*(x) W(y|x) \Big[\log\frac{W(y|x)}{P^*W(y)} - C\Big]^2.$$
This is not quite right, but almost right; we need a converse.
We can refine the third-order term (Polyanskiy's thesis (2010)).
Since Feinstein's lemma holds under the max error setting, the same bound holds for $\log M^*_{\max}(W^n, \varepsilon)$.

Non-Asymptotic Converse for Channel Coding
Lemma (Hayashi-Nagaoka (2003)). Let $\varepsilon \in (0, 1)$ and $\eta \in (0, 1 - \varepsilon)$. Let $W$ be any channel from $\mathcal{X}$ to $\mathcal{Y}$. Then,
$$\log M^*_{\mathrm{av}}(W, \varepsilon) \le \inf_{Q \in \mathcal{P}(\mathcal{Y})} \sup_{P \in \mathcal{P}(\mathcal{X})} D_s^{\varepsilon+\eta}(P \times W \,\|\, P \times Q) + \log\frac{1}{\eta}.$$
There is an amazing duality with Feinstein's lemma:
$$\log M^*_{\mathrm{av}}(W, \varepsilon) \ge \sup_{P \in \mathcal{P}(\mathcal{X})} D_s^{\varepsilon-\eta}(P \times W \,\|\, P \times PW) - \log\frac{1}{\eta}.$$
But here the user can choose the output distribution $Q$.
The key in the second-order converse is how to choose $Q$ so that the RHS is easy to evaluate.

Proof of Hayashi-Nagaoka Lemma: Part I
Fix any $(M, \varepsilon)_{\mathrm{av}}$-code for $W$. This induces the Markov chain $J \to X \to Y \to \hat J$, where $J$ is uniformly distributed on $\{1, \ldots, M\}$.
This Markov chain induces the code distribution
$$P_{JXY\hat J}(j, x, y, \hat j) = \frac{1}{M}\, 1\{x = f(j)\}\, W(y|x)\, 1\{\hat j = \varphi(y)\}.$$
Due to the data processing inequality for $D_h^{\varepsilon}$, we obtain
$$D_h^{\varepsilon}(P \times W \,\|\, P \times Q) = D_h^{\varepsilon}(P_{XY} \,\|\, P_X \times Q_Y) \ge D_h^{\varepsilon}(P_{J\hat J} \,\|\, P_J \times Q_{\hat J}),$$
where $Q_{\hat J}$ is induced by the decoder $\varphi$ applied to $Q_Y$.
Consider the test
$$\delta(j, \hat j) = 1\{j \ne \hat j\}.$$

Proof of Hayashi-Nagaoka Lemma: Part II
The test satisfies
$$\mathbb{E}_{P_{J\hat J}}[\delta(J, \hat J)] = \Pr(J \ne \hat J) \le \varepsilon.$$
Furthermore,
$$\mathbb{E}_{P_J \times Q_{\hat J}}[\delta(J, \hat J)] = \sum_{j, \hat j} P_J(j) Q_{\hat J}(\hat j)\, 1\{j \ne \hat j\} = 1 - \sum_{j, \hat j} P_J(j) Q_{\hat J}(\hat j)\, 1\{j = \hat j\} = 1 - \sum_{\hat j} Q_{\hat J}(\hat j) \sum_j P_J(j)\, 1\{j = \hat j\} = 1 - \sum_{\hat j} Q_{\hat J}(\hat j) \frac{1}{M} = 1 - \frac{1}{M}.$$
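The point of the second computation is that the value $1 - 1/M$ does not depend on $Q_{\hat J}$ at all, only on $J$ being uniform. A quick numerical sanity check, with a randomly drawn (hypothetical) $Q_{\hat J}$ and a brute-force double sum:

```python
import random

def expected_test_under_product(M, Q_hat):
    """sum over (j, jhat) of P_J(j) * Q_hat(jhat) * 1{j != jhat}, with P_J uniform."""
    return sum((1 / M) * Q_hat[jh]
               for j in range(M) for jh in range(M) if j != jh)

rng = random.Random(0)                      # fixed seed for reproducibility
M = 8
weights = [rng.random() for _ in range(M)]
Q_hat = [w / sum(weights) for w in weights] # arbitrary distribution on decoder outputs
print(expected_test_under_product(M, Q_hat), 1 - 1 / M)
```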

Proof of Hayashi-Nagaoka Lemma: Part III
By the definition of the hypothesis-testing divergence,
$$D_h^{\varepsilon}(P_{J\hat J} \,\|\, P_J \times Q_{\hat J}) \ge \log M + \log(1 - \varepsilon).$$
By the relation between $D_s^{\varepsilon}$ and $D_h^{\varepsilon}$,
$$\log M \le D_h^{\varepsilon}(P \times W \,\|\, P \times Q) + \log\frac{1}{1-\varepsilon} \le D_s^{\varepsilon+\eta}(P \times W \,\|\, P \times Q) + \log\frac{1}{\eta}.$$
Maximize over $P$ to make the bound code-independent:
$$\log M \le \sup_P D_s^{\varepsilon+\eta}(P \times W \,\|\, P \times Q) + \log\frac{1}{\eta}.$$
$Q$ is a free parameter. Minimize over it.

Second-Order Converse for Channel Coding: Part I
Fix an $(M, \varepsilon)_{\mathrm{av}}$-code. Starting from the Hayashi-Nagaoka converse for the channel $W^n$, we obtain
$$\log M \le \sup_P D_s^{\varepsilon+\eta}(P \times W^n \,\|\, P \times Q) + \log\frac{1}{\eta}$$
for any fixed $Q \in \mathcal{P}(\mathcal{Y}^n)$.
We can replace the optimization over $P \in \mathcal{P}(\mathcal{X}^n)$ with an optimization over input sequences $x^n \in \mathcal{X}^n$:
$$\log M \le \max_{x^n} D_s^{\varepsilon+\eta}(W^n(\cdot|x^n) \,\|\, Q) + \log\frac{1}{\eta}.$$
Now choose $Q \in \mathcal{P}(\mathcal{Y}^n)$ to be the convex combination
$$Q(y^n) = \frac{1}{|\mathcal{P}_n(\mathcal{X})|} \sum_{P \in \mathcal{P}_n(\mathcal{X})} (PW)^n(y^n).$$

Second-Order Converse for Channel Coding: Part II
Lemma. Let $\theta_i \ge 0$ be such that $\sum_i \theta_i = 1$. Then
$$D_s^{\varepsilon}\Big(P \,\Big\|\, \sum_i \theta_i Q_i\Big) \le \inf_i \Big\{D_s^{\varepsilon}(P\|Q_i) + \log\frac{1}{\theta_i}\Big\}.$$
From the previous derivations,
$$\log M \le \max_{x^n} D_s^{\varepsilon+\eta}\Big(W^n(\cdot|x^n) \,\Big\|\, \frac{1}{|\mathcal{P}_n(\mathcal{X})|} \sum_{P \in \mathcal{P}_n(\mathcal{X})} (PW)^n\Big) + \log\frac{1}{\eta}.$$
By sieving out the type $\hat P_{x^n}$ of $x^n$, one has
$$\log M \le \max_{x^n} D_s^{\varepsilon+\eta}\big(W^n(\cdot|x^n) \,\|\, (\hat P_{x^n} W)^n\big) + \log|\mathcal{P}_n(\mathcal{X})| + \log\frac{1}{\eta}.$$

Second-Order Converse for Channel Coding: Part III
$$\log M \le \max_{x^n} D_s^{\varepsilon+\eta}\big(W^n(\cdot|x^n) \,\|\, (\hat P_{x^n} W)^n\big) + \underbrace{\log|\mathcal{P}_n(\mathcal{X})| + \log\frac{1}{\eta}}_{= O(\log n)}$$
Choose $\eta = 1/\sqrt{n}$. By the asymptotic expansion of the information spectrum divergence,
$$\log M \le \max_{P \in \mathcal{P}_n(\mathcal{X})} \Big\{ nI(P, W) + \sqrt{nV(P, W)}\,\Phi^{-1}(\varepsilon) \Big\} + O(\log n),$$
where the conditional information variance is
$$V(P, W) = \sum_x P(x) \sum_y W(y|x) \Big[\log\frac{W(y|x)}{PW(y)} - D(W(\cdot|x)\|PW)\Big]^2.$$
Now, invoke continuity of $P \mapsto I(P, W)$ and $P \mapsto V(P, W)$ to replace $P$ with $P^*$ above.
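The penalty $\log|\mathcal{P}_n(\mathcal{X})|$ is $O(\log n)$ because the number of types grows only polynomially in $n$: it equals $\binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1} \le (n+1)^{|\mathcal{X}|}$. A small sketch that enumerates types as count vectors and checks this count (the function names are illustrative):

```python
from math import comb, log2

def types(n, alphabet_size):
    """Yield all count vectors (n_1, ..., n_k) of nonnegative integers summing to n."""
    if alphabet_size == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in types(n - first, alphabet_size - 1):
            yield (first,) + rest

def num_types(n, alphabet_size):
    """Closed-form number of types (stars and bars)."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)

# log2 of the number of types grows like log n, far slower than n.
for n in (10, 100, 1000):
    print(n, num_types(n, 3), log2(num_types(n, 3)))
```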

Second-Order Converse for Channel Coding: Part IV
Lemma. For a DMC with unique capacity-achieving input distribution $P^*$,
$$\log M^*_{\mathrm{av}}(W^n, \varepsilon) \le nC + \sqrt{nV(P^*, W)}\,\Phi^{-1}(\varepsilon) + O(\log n).$$
This is almost the same as the achievability bound
$$\log M^*_{\mathrm{av}}(W^n, \varepsilon) \ge nC + \sqrt{nU(P^*, W)}\,\Phi^{-1}(\varepsilon) + O(\log n).$$
So is $U(P^*, W) = V(P^*, W)$?

Second-Order Asymptotics for Channel Coding
Conditional information variance is
$$V(P, W) = \sum_x P(x) \sum_y W(y|x) \Big[\log\frac{W(y|x)}{PW(y)} - D(W(\cdot|x)\|PW)\Big]^2.$$
Unconditional information variance is
$$U(P, W) = \sum_x P(x) \sum_y W(y|x) \Big[\log\frac{W(y|x)}{PW(y)} - I(P, W)\Big]^2,$$
where the centering $I(P, W)$ equals $C$ when $P$ is capacity-achieving.
In general, if $(X, Y) \sim P \times W$ and $i(X;Y) = \log\frac{W(Y|X)}{PW(Y)}$, then
$$\mathrm{var}(i(X;Y)) = U(P, W) \ge V(P, W) = \mathbb{E}[\mathrm{var}(i(X;Y) \mid X)]$$
by the law of total variance.

Second-Order Asymptotics for Channel Coding
If we choose $P^*$ to be capacity-achieving (i.e., $I(P^*, W) = C$), then by the KKT conditions,
$$D(W(\cdot|x)\|P^*W) = C \quad \text{for every } x \text{ with } P^*(x) > 0,$$
and so
$$V(P^*, W) = \sum_x P^*(x) \sum_y W(y|x) \Big[\log\frac{W(y|x)}{P^*W(y)} - D(W(\cdot|x)\|P^*W)\Big]^2 = \sum_x P^*(x) \sum_y W(y|x) \Big[\log\frac{W(y|x)}{P^*W(y)} - C\Big]^2 = U(P^*, W).$$
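The equality of the two variances at the capacity-achieving input, and the strict gap elsewhere, can be checked numerically. The sketch below assumes a channel matrix `W[x][y]` whose output distribution has full support, base-2 logarithms, and $U$ computed as the variance of the information density (centered at $I(P, W)$, which coincides with $C$ at $P^*$). The BSC example and the function name are assumptions for illustration.

```python
from math import log2

def info_variances(P, W):
    """Return (I, U, V): mutual information, unconditional and conditional
    information variance, all in bits. Assumes PW(y) > 0 for every y."""
    nx, ny = len(P), len(W[0])
    PW = [sum(P[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    # Information density log2(W(y|x) / PW(y)); entries with W = 0 get weight 0.
    dens = [[log2(W[x][y] / PW[y]) if W[x][y] > 0 else 0.0
             for y in range(ny)] for x in range(nx)]
    D = [sum(W[x][y] * dens[x][y] for y in range(ny)) for x in range(nx)]
    I = sum(P[x] * D[x] for x in range(nx))
    U = sum(P[x] * W[x][y] * (dens[x][y] - I) ** 2
            for x in range(nx) for y in range(ny))
    V = sum(P[x] * W[x][y] * (dens[x][y] - D[x]) ** 2
            for x in range(nx) for y in range(ny))
    return I, U, V

bsc = [[0.9, 0.1], [0.1, 0.9]]
print(info_variances([0.5, 0.5], bsc))   # capacity-achieving input: U and V agree
print(info_variances([0.2, 0.8], bsc))   # other input: U exceeds V
```

The difference $U - V$ is the variance of $D(W(\cdot|X)\|PW)$ over $X \sim P$, which vanishes exactly when that divergence is constant on the support of $P$.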

Second-Order Asymptotics for Channel Coding
Theorem (Strassen (1962)). For a DMC with unique capacity-achieving input distribution $P^*$,
$$\log M^*_{\mathrm{av}}(W^n, \varepsilon) = nC + \sqrt{nV}\,\Phi^{-1}(\varepsilon) + O(\log n),$$
where $V = V(P^*, W)$.
The direct part is based on Feinstein's lemma.
The converse part is based on the Hayashi-Nagaoka lemma with a clever choice of $Q$.
We can optimize the third-order term; usually
$$O(\log n) = \frac{1}{2}\log n + O(1).$$
See Tomamichel-Tan (2013) and Altuğ-Wagner (2014).
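As an illustration of the theorem, the normal approximation can be computed in closed form for a binary symmetric channel, for which $C = 1 - h(p)$ and $V = p(1-p)\log^2\frac{1-p}{p}$. This is a sketch only: the BSC, its parameters, and the function names are assumptions, and the $O(\log n)$ term is dropped.

```python
from math import log2, sqrt
from statistics import NormalDist

def bsc_capacity(p):
    """Capacity 1 - h(p) of a BSC(p), in bits per channel use (0 < p < 1)."""
    return 1 + p * log2(p) + (1 - p) * log2(1 - p)

def bsc_dispersion(p):
    """Information variance of a BSC(p) at the uniform input, in bits^2."""
    return p * (1 - p) * log2((1 - p) / p) ** 2

def normal_approx_rate(n, p, eps):
    """(nC + sqrt(nV) * Phi^{-1}(eps)) / n, ignoring the O(log n) term."""
    return bsc_capacity(p) + sqrt(bsc_dispersion(p) / n) * NormalDist().inv_cdf(eps)

# Backoff from capacity at target error probability 1e-3.
for n in (100, 1000, 10000):
    print(n, normal_approx_rate(n, 0.11, 1e-3), bsc_capacity(0.11))
```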

Summary: Channel Coding
We derived the second-order asymptotic expansions for DMCs and AWGN channels.
Achievability hinges on Feinstein's lemma or its generalized version.
The converse hinges on the Hayashi-Nagaoka lemma with a good choice of output distribution.

Setup of the Slepian-Wolf coding problem

[Figure: Illustration of the Slepian-Wolf problem. Encoders $f_1$ and $f_2$ map $x_1^n$ and $x_2^n$ to messages $m_1$ and $m_2$; the decoder $\varphi$ outputs $(\hat{x}_1^n, \hat{x}_2^n)$.]

Two correlated sources $(X_1^n, X_2^n) \sim \prod_{i=1}^n P_{X_1 X_2}(x_{1i}, x_{2i})$.
Separately encoded.
Both to be decoded at destination.

The Slepian-Wolf theorem

Sources to be compressed to $nR_1$ and $nR_2$ bits respectively.

$(R_1, R_2)$ is achievable if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n)$-codes such that

    $\lim_{n \to \infty} \Pr\big( (\hat{X}_1^n, \hat{X}_2^n) \ne (X_1^n, X_2^n) \big) = 0$.

$\mathcal{R}(P_{XY})$ is the set of all achievable $(R_1, R_2)$ pairs.

Slepian and Wolf (1973):

    $\mathcal{R}(P_{XY}) = \{ R_1 \ge H(X_1 | X_2),\ R_2 \ge H(X_2 | X_1),\ R_1 + R_2 \ge H(X_1, X_2) \}$

[Photos: D. Slepian, J. Wolf]

The Slepian-Wolf region

[Figure: the region $\mathcal{R}(P_{XY})$ in the $(R_1, R_2)$ plane, with marks $H_{1|2}$ and $H_1$ on the $R_1$ axis and $H_{2|1}$ and $H_2$ on the $R_2$ axis.]

    $R_1 \ge H(X_1 | X_2)$
    $R_2 \ge H(X_2 | X_1)$
    $R_1 + R_2 \ge H(X_1, X_2)$
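The three thresholds that define the region can be computed directly from a joint pmf. The sketch below does this in bits for a doubly symmetric binary source with crossover 0.1; the source, the function names, and the tolerance parameter are assumptions chosen for illustration only.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities (zeros are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def slepian_wolf_thresholds(joint):
    """joint[a][b] = P(X1 = a, X2 = b). Returns (H(X1|X2), H(X2|X1), H(X1,X2)) in bits."""
    p1 = [sum(row) for row in joint]          # marginal of X1
    p2 = [sum(col) for col in zip(*joint)]    # marginal of X2
    h12 = entropy([p for row in joint for p in row])
    return h12 - entropy(p2), h12 - entropy(p1), h12

def in_region(r1, r2, joint, tol=1e-12):
    """Check the three Slepian-Wolf inequalities for the rate pair (r1, r2)."""
    h1g2, h2g1, h12 = slepian_wolf_thresholds(joint)
    return r1 >= h1g2 - tol and r2 >= h2g1 - tol and r1 + r2 >= h12 - tol

# Illustrative source: X1 uniform on {0, 1}, X2 = X1 flipped with probability q.
q = 0.1
joint = [[(1 - q) / 2, q / 2], [q / 2, (1 - q) / 2]]
print(slepian_wolf_thresholds(joint))
print(in_region(1.0, 0.5, joint), in_region(0.7, 0.7, joint))
```

For this source the sum-rate constraint is the binding one at the pair (0.7, 0.7): each individual rate exceeds its conditional entropy, but the sum falls short of the joint entropy.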

A Review of Slepian-Wolf coding

Partition $\mathcal{X}_j^n$ randomly into $\exp(nR_j)$ bins, i.e.,

    $\Pr\big( x_j^n \in \mathcal{B}_j(m_j) \big) = \exp(-nR_j), \quad j = 1, 2$

Transmit bin index $m_j \in [1 : \exp(nR_j)]$ if $X_j^n \in \mathcal{B}_j(m_j)$.

At decoder, declare that $(X_1^n, X_2^n) \in \mathcal{B}_1(m_1) \times \mathcal{B}_2(m_2)$ are the transmitted vectors iff $(X_1^n, X_2^n) \in \mathcal{T}_\epsilon$.

By standard typicality arguments, we need

    $R_1 \ge H(X_1 | X_2)$
    $R_2 \ge H(X_2 | X_1)$
    $R_1 + R_2 \ge H(X_1, X_2)$
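The binning scheme above can be sketched for very short binary sequences as follows. Note the substitution: instead of a joint-typicality test, the decoder here accepts a pair when its Hamming distance is at most `max_dist`, which is only a stand-in suited to a binary source whose second component is a noisy copy of the first. The block length, bin counts, flip probability, and all function names are hypothetical choices for illustration.

```python
import itertools
import random

def random_binning(n, num_bins, rng):
    """Assign every binary n-tuple to a bin chosen uniformly at random."""
    return {x: rng.randrange(num_bins) for x in itertools.product((0, 1), repeat=n)}

def invert(bins):
    """Map each bin index to the list of sequences that fell into it."""
    table = {}
    for x, m in bins.items():
        table.setdefault(m, []).append(x)
    return table

def decode(m1, m2, table1, table2, max_dist):
    """Return the unique pair in B1(m1) x B2(m2) that passes the test, else None.

    Assumption: Hamming distance <= max_dist replaces the typicality test.
    """
    hits = [(x1, x2)
            for x1 in table1.get(m1, [])
            for x2 in table2.get(m2, [])
            if sum(a != b for a, b in zip(x1, x2)) <= max_dist]
    return hits[0] if len(hits) == 1 else None

rng = random.Random(0)
n, M1, M2 = 8, 64, 64  # toy sizes, chosen only to keep the search small
bins1, bins2 = random_binning(n, M1, rng), random_binning(n, M2, rng)
table1, table2 = invert(bins1), invert(bins2)

x1 = tuple(rng.randrange(2) for _ in range(n))
x2 = tuple(b ^ (rng.random() < 0.1) for b in x1)  # noisy copy of x1
guess = decode(bins1[x1], bins2[x2], table1, table2, max_dist=1)
print(guess == (x1, x2))
```

The decoder returns None both when no pair passes the test and when more than one does; both outcomes count as errors in the usual analysis.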

Setup for Second-Order Asymptotics for Slepian-Wolf

We chose

    $M_1 = \exp(nR_1), \quad M_2 = \exp(nR_2)$

for the optimum rate region and sought rates $(R_1, R_2)$ such that

    $\Pr\big( (\hat{X}_1^n, \hat{X}_2^n) \ne (X_1^n, X_2^n) \big) \to 0$.

Alternatively, we can fix $(R_1, R_2) \in \mathrm{Bd}(\mathcal{R}(P_{XY}))$ and $\varepsilon \in (0, 1)$ and choose

    $M_1 = \exp(nR_1 + \sqrt{n} L_1), \quad M_2 = \exp(nR_2 + \sqrt{n} L_2)$

Then seek $(L_1, L_2)$ pairs such that

    $\Pr\big( (\hat{X}_1^n, \hat{X}_2^n) \ne (X_1^n, X_2^n) \big) \le \varepsilon + o(1)$.

Setup for Second-Order Asymptotics for Slepian-Wolf

Note that we're operating on the boundary of the SW region!

[Figure: the region $\mathcal{R}(P_{XY})$ with a point $(R_1, R_2)$ marked on its boundary $\mathrm{Bd}(\mathcal{R}(P_{XY}))$; marks $H_{1|2}$ and $H_1$ on the $R_1$ axis and $H_{2|1}$ and $H_2$ on the $R_2$ axis.]

Definitions for Second-Order Asymptotics for SW

$(L_1, L_2) \in \mathbb{R}^2$ is $(R_1, R_2, \varepsilon)$-achievable if there exists a sequence of $(n, M_{1n}, M_{2n}, \varepsilon_n)$-codes such that

    $\limsup_{n \to \infty} \frac{1}{\sqrt{n}} (\log M_{1n} - nR_1) \le L_1$,
    $\limsup_{n \to \infty} \frac{1}{\sqrt{n}} (\log M_{2n} - nR_2) \le L_2$,

and

    $\limsup_{n \to \infty} \varepsilon_n \le \varepsilon$.

$\mathcal{L}(\varepsilon; R_1, R_2)$ is the set of all $(R_1, R_2, \varepsilon)$-achievable $(L_1, L_2)$ pairs.
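As a small numerical illustration of the normalization in this definition, the sketch below evaluates $(\log M_n - nR)/\sqrt{n}$ for a codebook size of the form $\exp(nR + \sqrt{n} L + \frac{1}{2}\log n)$. The extra $\frac{1}{2}\log n$ term, the numerical values, and the function names are assumptions made only to show that lower-order terms vanish under the $1/\sqrt{n}$ scaling.

```python
import math

def log_codebook_size(n, R, L, extra=0.0):
    """log M_n = n*R + sqrt(n)*L + extra, with all quantities in nats."""
    return n * R + math.sqrt(n) * L + extra

def second_order_excess(log_M, n, R):
    """The normalized quantity (log M_n - n*R) / sqrt(n) from the definition."""
    return (log_M - n * R) / math.sqrt(n)

R, L = 0.5, -1.2  # hypothetical first- and second-order rates
for n in (10**2, 10**4, 10**6):
    log_M = log_codebook_size(n, R, L, extra=0.5 * math.log(n))
    print(n, round(second_order_excess(log_M, n, R), 4))  # tends to L as n grows
```

Any term that grows more slowly than $\sqrt{n}$ in $\log M_n$ leaves the limit superior unchanged, which is why the definition isolates $L_1$ and $L_2$.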

Second-Order Asymptotics for SW

Joint work with Oliver Kosut.
Paper published in Feb 2014 issue of IT Transactions.