On Concentration of Martingales and Applications in Information Theory, Communication & Coding


Igal Sason
Department of Electrical Engineering
Technion - Israel Institute of Technology, Haifa 32000, Israel

Seminar Talk, ETH, Zurich, Switzerland, August 22-23

Concentration of Measures

Concentration of measure is a central issue in probability theory, and it is strongly related to information theory and coding. Roughly speaking, the concentration of measure phenomenon can be stated in the following simple way: "A random variable that depends in a smooth way on many independent random variables (but not too much on any of them) is essentially constant" (Talagrand, 1996).

Concentration Inequalities

Concentration inequalities provide upper bounds on tail probabilities of the type $P(|X - \bar{x}| \ge t)$ (or $P(X - \bar{x} \ge t)$) for a random variable (RV) $X$, where $\bar{x}$ denotes the expectation or median of $X$. Several techniques were developed to prove concentration.

Survey Papers on Concentration Inequalities for Martingales

1. C. McDiarmid, "Concentration," Probabilistic Methods for Algorithmic Discrete Mathematics, Springer, 1998.
2. F. Chung and L. Lu, "Concentration inequalities and martingale inequalities: a survey," Internet Mathematics, vol. 3, no. 1, 2006. Available at fan/wp/concen.pdf.
3. I. Sason, "On Refined Versions of the Azuma-Hoeffding Inequality with Applications in Information Theory, Communications and Coding," a tutorial paper with some original results.

Concentration via the Martingale Approach

Definition [Discrete-Time Martingales]
Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \ldots$ be a sequence of sub $\sigma$-algebras of $\mathcal{F}$ (which is called a filtration). A sequence $X_0, X_1, \ldots$ of RVs is a martingale if, for every $i$:
1. $X_i : \Omega \to \mathbb{R}$ is $\mathcal{F}_i$-measurable, i.e., $\{\omega \in \Omega : X_i(\omega) \le t\} \in \mathcal{F}_i$ for every $i \in \{0, 1, \ldots\}$ and $t \in \mathbb{R}$;
2. $E[|X_i|] < \infty$;
3. $X_i = E[X_{i+1} \mid \mathcal{F}_i]$ almost surely.

A Simple Example: the random walk $X_n = \sum_{i=0}^{n} U_i$, where the $U_i$ are i.i.d. RVs with $P(U_i = +1) = P(U_i = -1) = \frac{1}{2}$.

Martingales - Remarks

Remark 1: Given a RV $X \in L^1(\Omega, \mathcal{F}, P)$ and a filtration of sub $\sigma$-algebras $\{\mathcal{F}_i\}$, let $X_i = E[X \mid \mathcal{F}_i]$ for $i = 0, 1, \ldots$. Then the sequence $X_0, X_1, \ldots$ forms a martingale.

Remark 2: Choose $\mathcal{F}_0 = \{\Omega, \emptyset\}$ and $\mathcal{F}_n = \mathcal{F}$. Then Remark 1 gives a martingale with
$X_0 = E[X \mid \mathcal{F}_0] = E[X]$ ($\mathcal{F}_0$ doesn't provide information about $X$), and
$X_n = E[X \mid \mathcal{F}_n] = X$ a.s. ($\mathcal{F}_n$ provides full information about $X$).

Theorem [Azuma-Hoeffding inequality]
Let $\{X_k, \mathcal{F}_k\}_{k=0}^{\infty}$ be a discrete-parameter real-valued martingale. If the jumps of the sequence $\{X_k\}$ are bounded almost surely (a.s.), i.e.,
$$|X_i - X_{i-1}| \le d_i, \quad i = 1, 2, \ldots, n \quad \text{a.s.}$$
then
$$P(|X_n - X_0| \ge r) \le 2\exp\left(-\frac{r^2}{2\sum_{i=1}^{n} d_i^2}\right), \quad \forall\, r > 0.$$

Concentration Inequalities Around the Average
The Azuma-Hoeffding inequality and Remark 2 enable one to obtain a concentration inequality for $X$ around its expected value $E[X]$.
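As a quick sanity check (not part of the talk), the Azuma-Hoeffding bound can be compared with the empirical tail of the $\pm 1$ random walk from the earlier example, whose jumps satisfy $d_i = 1$; the parameters below are arbitrary illustrative choices.

```python
import math
import random

# Empirical tail of a +-1 random walk (a martingale with jumps d_i = 1)
# versus the Azuma-Hoeffding bound 2 exp(-r^2 / (2 sum_i d_i^2)).
random.seed(0)
n, r, trials = 100, 25, 50000

def azuma_bound(r, sum_d_sq):
    return 2.0 * math.exp(-r * r / (2.0 * sum_d_sq))

count = 0
for _ in range(trials):
    x = sum(random.choice((-1, 1)) for _ in range(n))
    if abs(x) >= r:
        count += 1

empirical = count / trials   # estimate of P(|X_n - X_0| >= r)
bound = azuma_bound(r, n)    # sum of d_i^2 equals n when every d_i = 1
```

The empirical tail sits well below the bound, which previews the slackness discussed next.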

Refined Versions of the Azuma-Hoeffding Inequality

However, the Azuma-Hoeffding inequality is not tight. For example, if $r > \sum_{i=1}^{n} d_i$, then $P(|X_n - X_0| \ge r) = 0$, whereas the Azuma-Hoeffding bound stays strictly positive.

Theorem 1 [McDiarmid '89]
Let $\{X_k, \mathcal{F}_k\}_{k=0}^{\infty}$ be a discrete-parameter real-valued martingale. Assume that, for some constants $d, \sigma > 0$, the following two requirements hold a.s. for every $k \in \{1, \ldots, n\}$:
$$|X_k - X_{k-1}| \le d, \qquad \text{Var}(X_k \mid \mathcal{F}_{k-1}) = E\left[(X_k - X_{k-1})^2 \mid \mathcal{F}_{k-1}\right] \le \sigma^2.$$
Then, for every $\alpha \ge 0$,
$$P(|X_n - X_0| \ge \alpha n) \le 2\exp\left(-n\, D\left(\frac{\delta+\gamma}{1+\gamma} \,\middle\|\, \frac{\gamma}{1+\gamma}\right)\right)$$
where $\gamma \triangleq \frac{\sigma^2}{d^2}$, $\delta \triangleq \frac{\alpha}{d}$, and $D(p\|q) \triangleq p \ln\frac{p}{q} + (1-p)\ln\frac{1-p}{1-q}$.

Corollary
Under the conditions of Theorem 1, for every $\alpha \ge 0$,
$$P(|X_n - X_0| \ge \alpha n) \le 2\exp(-n f(\delta))$$
where
$$f(\delta) = \begin{cases} \ln(2)\left[1 - h_2\left(\frac{1-\delta}{2}\right)\right], & 0 \le \delta \le 1 \\ +\infty, & \delta > 1 \end{cases}$$
and $h_2(x) \triangleq -x\log_2(x) - (1-x)\log_2(1-x)$ for $0 \le x \le 1$.

Proof: Set $\gamma = 1$ in Theorem 1.

Observation (first set $\gamma = 1$, and then use Pinsker's inequality):
Theorem 1 $\Rightarrow$ Corollary $\Rightarrow$ Azuma-Hoeffding inequality.
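The last implication can be spot-checked numerically (a quick sketch, not from the talk): Pinsker's inequality predicts that the Corollary's exponent dominates the Azuma exponent, i.e., $f(\delta) \ge \delta^2/2$.

```python
import math

def h2(x):
    # Binary entropy function in bits.
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def f(delta):
    # Exponent from the Corollary (in nats); +infinity for delta > 1.
    if delta > 1.0:
        return math.inf
    return math.log(2.0) * (1.0 - h2((1.0 - delta) / 2.0))

# Pinsker's inequality implies f(delta) >= delta^2 / 2 (the Azuma exponent).
for k in range(101):
    delta = k / 100.0
    assert f(delta) >= delta ** 2 / 2.0 - 1e-12
```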

[Figure: lower bounds on the exponents - the refined exponent for γ = 1/8, 1/4, 1/2, the exponent f(δ) from the Corollary, and the exponent δ²/2 of the Azuma inequality, plotted versus δ = α/d.]

A Simple Example to Motivate Further Improvements

Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $d > 0$ and $\gamma \in (0,1]$ be fixed numbers. Let $\{U_k\}_{k \in \mathbb{N}}$ be i.i.d. random variables with
$$P(U_k = +d) = P(U_k = -d) = \frac{\gamma}{2}, \qquad P(U_k = 0) = 1 - \gamma, \quad \forall\, k \in \mathbb{N}.$$
Consider the Markov process $\{X_k\}_{k \ge 0}$ where $X_0 = 0$ and $X_k = X_{k-1} + U_k$ for $k \in \mathbb{N}$.

A Simple Example (Cont.) - Large Deviation Analysis

From Cramér's theorem in $\mathbb{R}$, for every $\alpha \ge E[U_1] = 0$,
$$\lim_{n\to\infty} \frac{1}{n} \ln P(X_n \ge \alpha n) = -I(\alpha)$$
where the rate function is given by
$$I(\alpha) = \sup_{t \ge 0}\, \{t\alpha - \ln E[\exp(tU_1)]\}.$$
Let $\delta \triangleq \frac{\alpha}{d}$. Calculation shows that
$$I(\alpha) = \delta x - \ln\left(1 + \gamma\left[\cosh(x) - 1\right]\right), \qquad x \triangleq \ln\left(\frac{\delta(1-\gamma) + \sqrt{\delta^2(1-\gamma)^2 + \gamma^2(1-\delta^2)}}{\gamma(1-\delta)}\right).$$

A Simple Example (Cont.) - Large Deviation Analysis

Let us examine the martingale approach in this simple case, where
$$X_k = \sum_{i=1}^{k} U_i, \quad k \in \mathbb{N}, \qquad X_0 = 0$$
with the natural filtration
$$\mathcal{F}_0 = \{\emptyset, \Omega\}, \qquad \mathcal{F}_k = \sigma(U_1, \ldots, U_k), \quad k \in \mathbb{N}.$$
In this case, the martingale has bounded increments, and for every $k \in \mathbb{N}$,
$$|X_k - X_{k-1}| \le d, \qquad E\left[(X_k - X_{k-1})^2 \mid \mathcal{F}_{k-1}\right] = \text{Var}(U_k) = \gamma d^2$$
almost surely.

A Simple Example (Cont.) - Large Deviation Analysis

Via the martingale approach, the following exponential inequality follows:
$$P(X_n - X_0 \ge \alpha n) \le \exp\left(-n\, D\left(\frac{\delta+\gamma}{1+\gamma} \,\middle\|\, \frac{\gamma}{1+\gamma}\right)\right)$$
where $\delta \triangleq \frac{\alpha}{d}$, and
$$D(p\|q) \triangleq p \ln\left(\frac{p}{q}\right) + (1-p)\ln\left(\frac{1-p}{1-q}\right), \quad p, q \in [0,1]$$
is the divergence (relative entropy) between the probability distributions $(p, 1-p)$ and $(q, 1-q)$. If $\delta > 1$, then the above probability is zero.

This exponential bound is not tight (unless $\gamma = 1$), and the gap to the exact asymptotic exponent increases as the value of $\gamma \in (0,1]$ is decreased.
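As a numerical cross-check (a sketch, not from the talk), the closed-form rate function stated above can be compared against a direct grid maximization of $t\alpha - \ln E[\exp(tU_1)]$; the grid range and resolution below are arbitrary.

```python
import math

def cumulant(t, gamma, d=1.0):
    # ln E[exp(t U_1)] for the example: E[exp(t U_1)] = 1 + gamma (cosh(t d) - 1).
    return math.log(1.0 + gamma * (math.cosh(t * d) - 1.0))

def rate_numeric(delta, gamma, d=1.0, t_max=10.0, steps=100000):
    # I(alpha) = sup_{t >= 0} { t alpha - ln E[exp(t U_1)] }, with alpha = delta d.
    alpha = delta * d
    best = 0.0
    for i in range(steps + 1):
        t = t_max * i / steps
        best = max(best, t * alpha - cumulant(t, gamma, d))
    return best

def rate_closed(delta, gamma):
    # Closed form from the slide, valid for 0 <= delta < 1.
    x = math.log((delta * (1.0 - gamma)
                  + math.sqrt(delta ** 2 * (1.0 - gamma) ** 2
                              + gamma ** 2 * (1.0 - delta ** 2)))
                 / (gamma * (1.0 - delta)))
    return delta * x - math.log(1.0 + gamma * (math.cosh(x) - 1.0))
```

For several $(\delta, \gamma)$ pairs the two agree to within the grid resolution.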

Comparison of the Exact Exponent and Its Lower Bound

[Figure: comparison between the exponents (exact vs. lower bound) for γ = 9/... , plotted versus δ = α/d.]

Comparison of the Exact Exponent and Its Lower Bound (Cont.)

[Figure: comparison between the exponents (exact vs. lower bound) for γ = 1/... , plotted versus δ = α/d.]

Questions:
- Why is the lower bound on the exponent not asymptotically tight?
- Why does this gap increase as the value of γ is decreased?
- How can this situation be improved via the martingale approach?

This exponential bound relies on Bennett's inequality: since $U_k \le d$, $E[U_k] = 0$, and $\text{Var}(U_k) = \gamma d^2$, then
$$E\left[\exp(tU_k)\right] \le \frac{\gamma \exp(td) + \exp(-\gamma t d)}{1+\gamma}, \quad \forall\, t \ge 0.$$
The probability distribution that achieves Bennett's inequality with equality is asymmetric (unless $\gamma = 1$), and is given by
$$P(\tilde{U}_k = d) = \frac{\gamma}{1+\gamma}, \qquad P(\tilde{U}_k = -\gamma d) = \frac{1}{1+\gamma}.$$
By reducing the value of $\gamma \in (0,1]$, the above asymmetry grows. Indeed, this enlarges the gap to the exact asymptotic exponent.
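These relations can be verified numerically (an illustrative sketch, not from the talk): the symmetric $\pm d$/0 distribution has MGF $1 + \gamma[\cosh(td) - 1]$, which lies below Bennett's bound, while the asymmetric distribution $\tilde{U}_k$ meets it with equality.

```python
import math

def mgf_symmetric(t, gamma, d=1.0):
    # E[exp(t U)] for P(U = +-d) = gamma/2, P(U = 0) = 1 - gamma.
    return 1.0 + gamma * (math.cosh(t * d) - 1.0)

def bennett_rhs(t, gamma, d=1.0):
    # Bennett's bound for any zero-mean U with U <= d and Var(U) = gamma d^2.
    return (gamma * math.exp(t * d) + math.exp(-gamma * t * d)) / (1.0 + gamma)

def mgf_asymmetric(t, gamma, d=1.0):
    # MGF of the achieving law P(U = d) = gamma/(1+gamma), P(U = -gamma d) = 1/(1+gamma).
    p_hi = gamma / (1.0 + gamma)
    p_lo = 1.0 / (1.0 + gamma)
    return p_hi * math.exp(t * d) + p_lo * math.exp(-gamma * t * d)
```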

Conditionally Symmetric Martingales

Interim Conclusion
An improvement is likely to be obtained by a refinement of Bennett's inequality when $X_k$ is conditionally symmetric given $\mathcal{F}_{k-1}$.

Definition: Conditionally Symmetric Martingales
Let $\{X_k, \mathcal{F}_k\}_{k \in \mathbb{N}_0}$, where $\mathbb{N}_0 \triangleq \mathbb{N} \cup \{0\}$, be a discrete-time real-valued martingale, and let $\xi_k \triangleq X_k - X_{k-1}$ for every $k \in \mathbb{N}$. Then $\{X_k, \mathcal{F}_k\}_{k \in \mathbb{N}_0}$ is a conditionally symmetric martingale if, conditioned on $\mathcal{F}_{k-1}$, the RV $\xi_k$ is symmetrically distributed around zero.

Definition: Conditionally Symmetric Sub/Supermartingales
Let $\{X_k, \mathcal{F}_k\}_{k \in \mathbb{N}_0}$ be a discrete-time real-valued sub- or supermartingale, and let $\eta_k \triangleq X_k - E[X_k \mid \mathcal{F}_{k-1}]$ for every $k \in \mathbb{N}$. Then it is conditionally symmetric if, conditioned on $\mathcal{F}_{k-1}$, the RV $\eta_k$ is symmetrically distributed around zero.

Construction of Conditionally Symmetric Martingales

Example 1: Let
- $(\Omega, \mathcal{F}, P)$ be a probability space,
- $\{U_k\}_{k \in \mathbb{N}} \subseteq L^1(\Omega, \mathcal{F}, P)$ be independent random variables that are symmetrically distributed around zero,
- $\{\mathcal{F}_k\}_{k \ge 0}$ be the natural filtration, where $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_k = \sigma(U_1, \ldots, U_k)$ for $k \in \mathbb{N}$,
- for $k \in \mathbb{N}$, $A_k \in L^\infty(\Omega, \mathcal{F}_{k-1}, P)$ be an $\mathcal{F}_{k-1}$-measurable random variable with a finite essential supremum.

Define a new sequence of random variables in $L^1(\Omega, \mathcal{F}, P)$ by
$$X_n = \sum_{k=1}^{n} A_k U_k, \quad n \in \mathbb{N}$$
and $X_0 = 0$. Then $\{X_n, \mathcal{F}_n\}_{n \in \mathbb{N}_0}$ is a conditionally symmetric martingale.
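On a small finite probability space, Example 1 can be checked exhaustively (an illustrative sketch; the choice $U_k = \pm 1$ and the coefficient $A_k = 1 + (U_1 + \ldots + U_{k-1})^2$ are arbitrary).

```python
from itertools import product

# Exhaustive check of Example 1 with U_k = +-1 (symmetric, i.i.d.) and an
# arbitrary F_{k-1}-measurable coefficient A_k = 1 + (U_1 + ... + U_{k-1})^2.
n = 6

def A(prefix):
    # F_{k-1}-measurable: depends only on the past outcomes U_1, ..., U_{k-1}.
    return 1 + sum(prefix) ** 2

# For every past prefix, the conditional law of the increment xi_k = A_k U_k
# is {-a, +a} with equal probability: symmetric around zero, hence with zero
# conditional mean (the martingale property).
for k in range(1, n + 1):
    for prefix in product((-1, 1), repeat=k - 1):
        a = A(prefix)
        vals = sorted(a * u for u in (-1, 1))
        assert vals[0] == -vals[1]               # conditional symmetry
        assert sum(a * u for u in (-1, 1)) == 0  # zero conditional mean

# Consequently E[X_n] = X_0 = 0 exactly (enumerating all 2^n equally likely paths).
total = sum(sum(A(path[:k]) * path[k] for k in range(n))
            for path in product((-1, 1), repeat=n))
assert total == 0
```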

Construction of Conditionally Symmetric Martingales (Cont.)

Example 2: Let
- $\{X_n, \mathcal{F}_n\}_{n \in \mathbb{N}_0}$ be a conditionally symmetric martingale,
- $A_k \in L^\infty(\Omega, \mathcal{F}_{k-1}, P)$ be an $\mathcal{F}_{k-1}$-measurable random variable with a finite essential supremum.

Define $Y_0 = 0$ and $Y_n = \sum_{k=1}^{n} A_k (X_k - X_{k-1})$ for $n \in \mathbb{N}$. Then $\{Y_n, \mathcal{F}_n\}_{n \in \mathbb{N}_0}$ is a conditionally symmetric martingale.

Example 3: A sampled Brownian motion is a discrete-time, conditionally symmetric martingale with unbounded increments.

Goal: Our next goal is to demonstrate how the assumption of conditional symmetry improves existing exponential inequalities for discrete-time real-valued martingales with bounded increments.

Exponential Inequalities for Conditionally Symmetric Martingales

Lemma
Let $X$ be a real-valued RV with a symmetric distribution around zero and support $[-d, d]$, and assume $\text{Var}(X) \le \gamma d^2$ for some $d > 0$ and $\gamma \in [0,1]$. Let $h$ be a real-valued convex function with $h(d^2) \ge h(0)$. Then
$$E[h(X^2)] \le (1-\gamma)\, h(0) + \gamma\, h(d^2)$$
with equality if $P(X = \pm d) = \frac{\gamma}{2}$ and $P(X = 0) = 1 - \gamma$.

Proof
Since $h$ is convex and $X \in [-d, d]$ a.s.,
$$h(X^2) \le h(0) + \left(\frac{X}{d}\right)^2 \left(h(d^2) - h(0)\right).$$
Taking expectations on both sides gives the inequality, with equality for the above symmetric distribution.

Corollary
Let $X$ be a real-valued RV with a symmetric distribution around zero and support $[-d, d]$, and assume $\text{Var}(X) \le \gamma d^2$ for some $d > 0$ and $\gamma \in [0,1]$. Then
$$E\left[\exp(\lambda X)\right] \le 1 + \gamma\left[\cosh(\lambda d) - 1\right], \quad \forall\, \lambda \in \mathbb{R}$$
with equality if $P(X = \pm d) = \frac{\gamma}{2}$ and $P(X = 0) = 1 - \gamma$.

Proof
The symmetric distribution of $X$ implies $E[\exp(\lambda X)] = E[\cosh(\lambda X)]$. The corollary follows from the Lemma since, for every $x \in \mathbb{R}$, $\cosh(\lambda x) = h(x^2)$ where
$$h(x) \triangleq \sum_{n=0}^{\infty} \frac{\lambda^{2n} x^n}{(2n)!}$$
is a convex function, and $h(d^2) = \cosh(\lambda d) \ge 1 = h(0)$.
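The corollary applies to any symmetric law on $[-d, d]$, not only the three-point one; for instance, $X \sim \text{Uniform}[-d, d]$ has $\text{Var}(X) = d^2/3$ (so $\gamma = 1/3$) and MGF $\sinh(\lambda d)/(\lambda d)$, which stays below the bound. A quick numerical sketch (not from the talk):

```python
import math

def mgf_uniform(lam, d=1.0):
    # E[exp(lam X)] for X ~ Uniform[-d, d] equals sinh(lam d) / (lam d).
    x = lam * d
    return math.sinh(x) / x if x != 0.0 else 1.0

def cosh_bound(lam, gamma, d=1.0):
    # Bound from the corollary for symmetric X on [-d, d] with Var(X) <= gamma d^2.
    return 1.0 + gamma * (math.cosh(lam * d) - 1.0)

# Uniform[-d, d] has Var(X) = d^2 / 3, i.e. gamma = 1/3.
for lam in (0.1, 0.5, 1.0, 2.0, 5.0):
    assert mgf_uniform(lam) <= cosh_bound(lam, 1.0 / 3.0) + 1e-12
```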

Theorem 2 (I. S., 2012)
Let $\{X_k, \mathcal{F}_k\}_{k \in \mathbb{N}_0}$ be a discrete-time real-valued and conditionally symmetric martingale. Assume that, for some fixed numbers $d, \sigma > 0$, the following two requirements are satisfied a.s. for every $k \in \mathbb{N}$:
$$|X_k - X_{k-1}| \le d, \qquad \text{Var}(X_k \mid \mathcal{F}_{k-1}) = E\left[(X_k - X_{k-1})^2 \mid \mathcal{F}_{k-1}\right] \le \sigma^2.$$
Then, for every $\alpha \ge 0$ and $n \in \mathbb{N}$,
$$P\left(\max_{1 \le k \le n} |X_k - X_0| \ge \alpha n\right) \le 2\exp\left(-n\, E(\gamma, \delta)\right)$$
where $\gamma \triangleq \frac{\sigma^2}{d^2}$ and $\delta \triangleq \frac{\alpha}{d}$; for $\gamma \in (0,1]$ and $\delta \in [0,1)$, the exponent $E(\gamma, \delta)$ is given as follows:

Theorem 2 (Cont.)
$$E(\gamma, \delta) \triangleq \delta x - \ln\left(1 + \gamma\left[\cosh(x) - 1\right]\right), \qquad x \triangleq \ln\left(\frac{\delta(1-\gamma) + \sqrt{\delta^2(1-\gamma)^2 + \gamma^2(1-\delta^2)}}{\gamma(1-\delta)}\right).$$
If $\delta > 1$, then the probability is zero (so $E(\gamma, \delta) \triangleq +\infty$), and $E(\gamma, 1) = \ln\left(\frac{2}{\gamma}\right)$.

Asymptotic Optimality of the Exponent
The example we studied earlier shows that the exponent of the bound in Theorem 2 is asymptotically optimal. That is, there exists a conditionally symmetric martingale, satisfying the conditions of this theorem, that attains the exponent $E(\gamma, \delta)$ in the limit where $n \to \infty$.
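A numerical comparison (a sketch, not from the talk) of $E(\gamma, \delta)$ against the divergence exponent of the Bennett-based bound, which does not use conditional symmetry, shows the refinement: the two coincide at $\gamma = 1$, and $E(\gamma, \delta)$ is strictly larger for $\gamma < 1$.

```python
import math

def divergence(p, q):
    # Binary divergence D(p || q) in nats.
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def exponent_bennett(delta, gamma):
    # Exponent of the Bennett-based bound (no conditional symmetry used).
    return divergence((delta + gamma) / (1.0 + gamma), gamma / (1.0 + gamma))

def exponent_E(delta, gamma):
    # E(gamma, delta) from the theorem above, for 0 <= delta < 1.
    x = math.log((delta * (1.0 - gamma)
                  + math.sqrt(delta ** 2 * (1.0 - gamma) ** 2
                              + gamma ** 2 * (1.0 - delta ** 2)))
                 / (gamma * (1.0 - delta)))
    return delta * x - math.log(1.0 + gamma * (math.cosh(x) - 1.0))
```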

Theorem 2 should be compared to the bound in Theorem 1, which does not require the conditional symmetry property.

Theorem 1 (Reminder) [McDiarmid 1989; book of Dembo & Zeitouni, Corollary 2.4.7]
Let $\{X_k, \mathcal{F}_k\}_{k \in \mathbb{N}_0}$ be a discrete-time real-valued martingale with bounded jumps. Assume that the same two conditions on the bounded increments and the conditional variance as in Theorem 2 are satisfied a.s. for every $k \in \mathbb{N}$. Then, for every $\alpha \ge 0$ and $n \in \mathbb{N}$,
$$P\left(\max_{1 \le k \le n} |X_k - X_0| \ge \alpha n\right) \le 2\exp\left(-n\, D\left(\frac{\delta+\gamma}{1+\gamma} \,\middle\|\, \frac{\gamma}{1+\gamma}\right)\right)$$
where $\gamma \triangleq \frac{\sigma^2}{d^2}$ and $\delta \triangleq \frac{\alpha}{d}$ were introduced earlier, and
$$D(p\|q) \triangleq p \ln\left(\frac{p}{q}\right) + (1-p)\ln\left(\frac{1-p}{1-q}\right), \quad p, q \in [0,1].$$
If $\delta > 1$, then the probability is zero.

Proof Technique for Theorems 1 and 2
- Write $X_n - X_0 = \sum_{k=1}^{n} \xi_k$, where $\xi_k \triangleq X_k - X_{k-1}$.
- $|\xi_k| \le d$, $E[\xi_k \mid \mathcal{F}_{k-1}] = 0$, and $\text{Var}(\xi_k \mid \mathcal{F}_{k-1}) \le \gamma d^2$; the RV $\xi_k$ is $\mathcal{F}_k$-measurable.
- Exponentiation and the maximal inequality for submartingales give that, for every $\alpha \ge 0$,
$$P\left(\max_{1 \le k \le n} (X_k - X_0) \ge n\alpha\right) \le e^{-n\alpha t}\, E\left[\exp\left(t \sum_{k=1}^{n} \xi_k\right)\right], \quad \forall\, t \ge 0.$$
- Since $\{\mathcal{F}_i\}$ forms a filtration, for every $t \ge 0$,
$$E\left[\exp\left(t \sum_{k=1}^{n} \xi_k\right)\right] = E\left[\exp\left(t \sum_{k=1}^{n-1} \xi_k\right) E\left[\exp(t\xi_n) \mid \mathcal{F}_{n-1}\right]\right].$$

Proof Technique for Theorems 1 and 2 (Cont.)
- For proving Theorem 1, the use of Bennett's inequality gives
$$E\left[\exp(t\xi_k) \mid \mathcal{F}_{k-1}\right] \le \frac{\gamma \exp(td) + \exp(-\gamma t d)}{1+\gamma}.$$
- For the refinement in Theorem 2 for conditionally symmetric martingales with bounded increments, the use of the Lemma gives
$$E\left[\exp(t\xi_k) \mid \mathcal{F}_{k-1}\right] \le 1 + \gamma\left[\cosh(td) - 1\right].$$
- In both cases, one gets exponential bounds with the free parameter $t$.
- Optimize over $t \ge 0$, and use the union bound to get two-sided inequalities.
- The asymptotic optimality of the exponent in Theorem 2 follows from the simple example that was shown earlier.
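A Monte Carlo sketch (illustrative parameters, not from the talk) ties the pieces together for the $\pm d$/0 example: the empirical maximal-tail probability sits below the conditionally-symmetric bound, which in turn is tighter than the Bennett-based one.

```python
import math
import random

random.seed(0)
d, gamma, n, alpha, trials = 1.0, 0.5, 100, 0.15, 20000
delta = alpha / d

def divergence(p, q):
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

# Bennett-based exponent and the conditionally-symmetric exponent E(gamma, delta).
exp_bennett = divergence((delta + gamma) / (1.0 + gamma), gamma / (1.0 + gamma))
x = math.log((delta * (1.0 - gamma)
              + math.sqrt(delta ** 2 * (1.0 - gamma) ** 2
                          + gamma ** 2 * (1.0 - delta ** 2)))
             / (gamma * (1.0 - delta)))
exp_sym = delta * x - math.log(1.0 + gamma * (math.cosh(x) - 1.0))

bound_bennett = 2.0 * math.exp(-n * exp_bennett)
bound_sym = 2.0 * math.exp(-n * exp_sym)

# Empirical P(max_k X_k >= alpha n) for X_k = U_1 + ... + U_k.
count = 0
for _ in range(trials):
    s, smax = 0.0, 0.0
    for _ in range(n):
        u = random.random()
        if u < gamma / 2.0:
            s += d
        elif u < gamma:
            s -= d
        smax = max(smax, s)
    if smax >= alpha * n:
        count += 1
empirical = count / trials
```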

Relation to Classical Results in Probability Theory

The above concentration inequalities are linked to:
- the central limit theorem for martingales,
- the law of the iterated logarithm,
- the moderate deviations principle for real-valued i.i.d. RVs.

For proofs and discussions of these relations, see Section IV of the tutorial paper cited above.

Application 1: MDP for Binary Hypothesis Testing

Let the RVs $X_1, X_2, \ldots$ be i.i.d. $\sim Q$, and consider two hypotheses:
- $H_1$: $Q = P_1$,
- $H_2$: $Q = P_2$.

For simplicity, assume that the RVs are discrete and take their values on a finite alphabet $\mathcal{X}$, where $P_1(x), P_2(x) > 0$ for every $x \in \mathcal{X}$.

The log-likelihood ratio (LLR): $L(X_1^n) = \sum_{i=1}^{n} \ln \frac{P_1(X_i)}{P_2(X_i)}$.

By the strong law of large numbers (SLLN), if $H_1$ is true, then a.s. $\lim_{n\to\infty} \frac{L(X_1^n)}{n} = D(P_1 \| P_2)$, and if $H_2$ is true, then $\lim_{n\to\infty} \frac{L(X_1^n)}{n} = -D(P_2 \| P_1)$.

Let $\underline{\lambda}, \overline{\lambda} \in \mathbb{R}$ satisfy $-D(P_2 \| P_1) < \underline{\lambda} \le \overline{\lambda} < D(P_1 \| P_2)$. Decide on $H_1$ if $L(X_1^n) > n\overline{\lambda}$, and on $H_2$ if $L(X_1^n) < n\underline{\lambda}$.
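A small simulation of this test (the alphabet, distributions, and thresholds below are illustrative choices, not from the talk):

```python
import math
import random

# Illustrative distributions on a three-letter alphabet.
P1 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
P2 = {'a': 0.2, 'b': 0.3, 'c': 0.5}

def kl(P, Q):
    # D(P || Q) in nats.
    return sum(P[x] * math.log(P[x] / Q[x]) for x in P)

def llr(samples):
    # L(X_1^n) = sum_i ln(P1(X_i) / P2(X_i)).
    return sum(math.log(P1[x] / P2[x]) for x in samples)

def decide(samples, lam_lower, lam_upper):
    n = len(samples)
    L = llr(samples)
    if L > n * lam_upper:
        return 'H1'
    if L < n * lam_lower:
        return 'H2'
    return 'erasure'

random.seed(1)
d12, d21 = kl(P1, P2), kl(P2, P1)
# Thresholds chosen strictly inside (-D(P2||P1), D(P1||P2)).
lam_lower, lam_upper = -d21 / 2.0, d12 / 2.0

xs = random.choices(list(P1), weights=list(P1.values()), k=2000)  # sample under H1
decision = decide(xs, lam_lower, lam_upper)
```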

Let
$$\alpha_n^{(1)} \triangleq P_1^n\left(L(X_1^n) \le n\overline{\lambda}\right), \qquad \beta_n^{(1)} \triangleq P_2^n\left(L(X_1^n) \ge n\underline{\lambda}\right),$$
$$\alpha_n^{(2)} \triangleq P_1^n\left(L(X_1^n) \le n\underline{\lambda}\right), \qquad \beta_n^{(2)} \triangleq P_2^n\left(L(X_1^n) \ge n\overline{\lambda}\right).$$
Then
- $\alpha_n^{(1)}$ and $\beta_n^{(1)}$ are the probabilities of either making an error or declaring an erasure under, respectively, hypotheses $H_1$ and $H_2$;
- $\alpha_n^{(2)}$ and $\beta_n^{(2)}$ are the probabilities of making an error under, respectively, hypotheses $H_1$ and $H_2$.

Large deviations analysis: the $\alpha_n$'s and $\beta_n$'s all decay exponentially to zero as a function of the block length $n$.

Moderate-Deviations Analysis for Binary Hypothesis Testing

Instead of the setting where the thresholds are kept fixed independently of $n$, let these thresholds tend to their asymptotic limits (due to the SLLN), i.e.,
$$\lim_{n\to\infty} \overline{\lambda}^{(n)} = D(P_1 \| P_2), \qquad \lim_{n\to\infty} \underline{\lambda}^{(n)} = -D(P_2 \| P_1).$$
In moderate-deviations analysis of binary hypothesis testing, we are interested in analyzing the case where
1. the block length $n$ of the input sequence tends to infinity, and
2. the thresholds tend to their asymptotic limits simultaneously.

To this end, let $\eta \in \left(\frac{1}{2}, 1\right)$ and $\varepsilon_1, \varepsilon_2 > 0$ be arbitrary fixed numbers. Set the upper and lower thresholds to
$$\overline{\lambda}^{(n)} = D(P_1 \| P_2) - \frac{\varepsilon_1}{n^{1-\eta}}, \qquad \underline{\lambda}^{(n)} = -D(P_2 \| P_1) + \frac{\varepsilon_2}{n^{1-\eta}}.$$

Moderate-Deviations Analysis via the Martingale Approach

Under hypothesis $H_1$, construct the martingale $\{U_k, \mathcal{F}_k\}_{k=0}^{n}$, where $\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \ldots \subseteq \mathcal{F}_n$ is the natural filtration
$$\mathcal{F}_0 = \{\emptyset, \Omega\}, \qquad \mathcal{F}_k = \sigma(X_1, \ldots, X_k), \quad k \in \{1, \ldots, n\},$$
and $U_k = E_{P_1^n}\left[L(X_1^n) \mid \mathcal{F}_k\right]$.
For every $k \in \{0, \ldots, n\}$:
$$U_k = \sum_{i=1}^{k} \ln \frac{P_1(X_i)}{P_2(X_i)} + (n-k)\, D(P_1 \| P_2).$$

$U_0 = n D(P_1 \| P_2)$ and $U_n = L(X_1^n)$. Let
$$d_1 \triangleq \max_{x \in \mathcal{X}} \left| \ln \frac{P_1(x)}{P_2(x)} - D(P_1 \| P_2) \right| \implies d_1 < \infty,$$
so $|U_k - U_{k-1}| \le d_1$ holds a.s. for every $k \in \{1, \ldots, n\}$. Due to the independence of the RVs in the sequence $\{X_i\}$,
$$E_{P_1^n}\left[(U_k - U_{k-1})^2 \mid \mathcal{F}_{k-1}\right] = \sum_{x \in \mathcal{X}} P_1(x) \left(\ln \frac{P_1(x)}{P_2(x)} - D(P_1 \| P_2)\right)^2 \triangleq \sigma_1^2.$$
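For concreteness (an illustrative sketch with hypothetical distributions, not from the talk), $d_1$ and $\sigma_1^2$ are easy to compute on a small alphabet:

```python
import math

# Illustrative distributions on a three-letter alphabet.
P1 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
P2 = {'a': 0.2, 'b': 0.3, 'c': 0.5}

D12 = sum(P1[x] * math.log(P1[x] / P2[x]) for x in P1)  # D(P1 || P2)
llr = {x: math.log(P1[x] / P2[x]) for x in P1}

d1 = max(abs(llr[x] - D12) for x in P1)                   # jump bound: |U_k - U_{k-1}| <= d1
sigma1_sq = sum(P1[x] * (llr[x] - D12) ** 2 for x in P1)  # conditional variance sigma_1^2
gamma1 = sigma1_sq / d1 ** 2
```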

Application 1: MDP for Binary Hypothesis Testing

Moderate-Deviations Analysis via the Martingale Approach (Cont.)

Let $\varepsilon_1 > 0$ and $\eta \in (\frac{1}{2},1)$ be two arbitrarily fixed numbers as above. Under hypothesis $H_1$, it follows from Theorem 2 and the above construction of a martingale that
$$P_1^n\bigl( L(X_1^n) \leq n \overline{\lambda}^{(n)} \bigr) \leq \exp\left( -n\, D\left( \frac{\delta_1^{(\eta,n)} + \gamma_1}{1 + \gamma_1} \,\Big\|\, \frac{\gamma_1}{1 + \gamma_1} \right) \right)$$
where
$$\delta_1^{(\eta,n)} \triangleq \frac{\varepsilon_1 n^{-(1-\eta)}}{d_1}, \qquad \gamma_1 \triangleq \frac{\sigma_1^2}{d_1^2}$$
with $d_1$ and $\sigma_1^2$ as defined on the previous slide.
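Assuming the bound has the standard form of the refined Azuma inequality quoted above (with the binary divergence $D(p\|q)$ in nats), it can be evaluated numerically; the distributions and the choices $\eta = 0.75$, $\varepsilon_1 = 0.1$ are illustrative:

```python
import math

# Numerical evaluation of the exponential bound; distributions and parameters
# are illustrative assumptions.
P1 = {0: 0.5, 1: 0.5}
P2 = {0: 0.8, 1: 0.2}
D12 = sum(P1[x] * math.log(P1[x] / P2[x]) for x in P1)
centered = {x: math.log(P1[x] / P2[x]) - D12 for x in P1}
d1 = max(abs(v) for v in centered.values())
sigma1_sq = sum(P1[x] * centered[x] ** 2 for x in P1)
gamma1 = sigma1_sq / d1 ** 2   # equals 1 for this particular example

def kl_binary(p, q):
    """Divergence D(p||q) between Bernoulli(p) and Bernoulli(q), in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

eta, eps1 = 0.75, 0.1

def bound(n):
    delta = eps1 * n ** (-(1 - eta)) / d1
    return math.exp(-n * kl_binary((delta + gamma1) / (1 + gamma1),
                                   gamma1 / (1 + gamma1)))

bounds = [bound(n) for n in (10**2, 10**3, 10**4, 10**5)]
print(bounds)  # decays sub-exponentially in n
```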

Application 1: MDP for Binary Hypothesis Testing

Moderate-Deviations Analysis via the Martingale Approach (Cont.)

The concentration inequality in Theorem 2 and a simple inequality give that, under hypothesis $H_1$, for every $\eta \in (\frac{1}{2},1)$,
$$\alpha_n^{(1)} \leq \exp\left( -\frac{\varepsilon_1^2\, n^{2\eta-1}}{2\sigma_1^2} \left( 1 - \frac{\varepsilon_1 d_1}{3\sigma_1^2 (1+\gamma_1)} \cdot n^{-(1-\eta)} \right) \right),$$
so this upper bound on the overall probability of either making an error or declaring an erasure under hypothesis $H_1$ decays sub-exponentially to zero, and it improves by increasing $\eta \in (\frac{1}{2},1)$. On the other hand, the exponential decay of the probability of error $\alpha_n^{(2)}$ improves as the value of $\eta \in (\frac{1}{2},1)$ is decreased (since the margin for making an error is increased).

Application 1: MDP for Binary Hypothesis Testing

Moderate-Deviations Analysis via the Martingale Approach (Cont.)

Moderate-deviations analysis reflects, as can be expected, a tradeoff between the two $\alpha_n$'s and also between the two $\beta_n$'s. However, all of them decay asymptotically to zero as $n$ tends to infinity (which is indeed consistent with the SLLN).

Application 1: MDP for Binary Hypothesis Testing

Moderate-Deviations Analysis via the Martingale Approach (Cont.)

The above upper bound implies that
$$\limsup_{n \to \infty} n^{1-2\eta} \ln P_1^n\bigl( L(X_1^n) \leq n \overline{\lambda}^{(n)} \bigr) \leq -\frac{\varepsilon_1^2}{2\sigma_1^2}.$$
Question: Does this upper bound reflect the correct asymptotic scaling of this probability?

Reply: Indeed, it does; this follows from the moderate-deviations principle (MDP) for real-valued RVs.

Application 1: MDP for Binary Hypothesis Testing

Moderate-Deviations Principle (MDP) for Real-Valued RVs

Theorem

Let $\{X_i\}$ be i.i.d. real-valued RVs satisfying $\mathbb{E}[X_i] = 0$, $\mathrm{Var}(X_i) = \sigma^2$, $|X_i| \leq d$ a.s., and let $\eta \in (\frac{1}{2},1)$. Then, for every $\alpha \geq 0$,
$$\lim_{n \to \infty} n^{1-2\eta} \ln P\left( \left| \sum_{i=1}^{n} X_i \right| \geq \alpha n^{\eta} \right) = -\frac{\alpha^2}{2\sigma^2}.$$
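The theorem can be illustrated by Monte Carlo for Rademacher variables ($\sigma^2 = d = 1$); the parameters below are illustrative, and the empirical tail is also compared against the Azuma-Hoeffding bound $2\exp(-t^2/(2n))$, which holds for every $n$:

```python
import math, random

# Monte-Carlo sketch of the moderate-deviations scaling for i.i.d. +/-1
# variables; n, eta and alpha are illustrative choices.
random.seed(3)
n, eta, alpha = 2000, 0.6, 1.0
t = alpha * n ** eta            # threshold alpha * n^eta
trials = 20000

hits = 0
for _ in range(trials):
    # Sum of n i.i.d. +/-1 variables via the popcount of n random bits.
    s = 2 * bin(random.getrandbits(n)).count('1') - n
    if abs(s) >= t:
        hits += 1
p_hat = hits / trials

# Azuma-Hoeffding bound for comparison: P(|S_n| >= t) <= 2 exp(-t^2/(2n)).
hoeffding = 2 * math.exp(-t ** 2 / (2 * n))
# The MDP predicts n^{1-2 eta} ln P -> -alpha^2/2 (slowly, as n grows).
print(p_hat, hoeffding, n ** (1 - 2 * eta) * math.log(p_hat))
```

At this moderate block length the normalized log-probability is still far from its limit $-\alpha^2/(2\sigma^2)$, which is consistent with the slow (sub-exponential) convergence discussed above.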

Application 1: MDP for Binary Hypothesis Testing

Moderate-Deviations Analysis via the Martingale Approach (Cont.)

The MDP shows that this inequality in fact holds with equality. The refined inequality in Theorem 2 gives the correct asymptotic scaling in this case, and it also gives an analytical bound for finite $n$. This is in contrast to the analysis that follows from the Azuma-Hoeffding inequality, which does not coincide with the correct asymptotic scaling (since $\frac{\varepsilon_1^2}{2\sigma_1^2}$ is replaced by $\frac{\varepsilon_1^2}{2d_1^2}$, and $\sigma_1^2 \leq d_1^2$). In the considered setting of moderate-deviations analysis for binary hypothesis testing, the error probability decays sub-exponentially to 0.

Application 1: MDP for Binary Hypothesis Testing

Papers on Moderate-Deviations Analysis in IT Aspects

1. E. A. Abbe, "Local to Global Geometric Methods in Information Theory," Ph.D. dissertation, MIT, Cambridge, MA, USA, June 2008.
2. Y. Altuğ and A. B. Wagner, "Moderate-deviations analysis of channel coding: discrete memoryless case," Proc. ISIT 2010, Austin, Texas, USA, June 2010.
3. D. He, L. A. Lastras-Montaño, E. Yang, A. Jagmohan and J. Chen, "On the redundancy of Slepian-Wolf coding," IEEE Trans. on Information Theory, vol. 55, no. 12, December 2009.
4. Y. Polyanskiy and S. Verdú, "Channel dispersion and moderate-deviations limits of memoryless channels," Proc. Forty-Eighth Annual Allerton Conference, UIUC, Illinois, USA, October 2010.
5. I. Sason, "Moderate deviations analysis of binary hypothesis testing," November 2011.
6. V. Y. F. Tan, "Moderate-deviations of lossy source coding for discrete and Gaussian sources," November 2011.

Application 2 - Concentration of Martingales in Coding Theory

Concentration Phenomena for Codes Defined on Graphs

Motivation & Background

The performance analysis of a particular code is difficult, especially for codes of large block lengths.

Does the performance of an individual code concentrate around the average performance of the ensemble?

The existence of such a concentration validates the use of the density-evolution technique as an analytical tool for assessing the performance of long enough codes (e.g., LDPC codes) and their asymptotic gap to capacity.

The current concentration results for codes defined on graphs, which mainly rely on the Azuma-Hoeffding inequality, are weak, since in practice concentration is observed at much shorter block lengths.

Application 2 - Concentration of Martingales in Coding Theory

Performance under Message-Passing Decoding

Theorem I - [Concentration of performance under iterative message-passing decoding (Richardson and Urbanke, 2001)]

Let $C$, a code chosen uniformly at random from the ensemble $\mathrm{LDPC}(n, \lambda, \rho)$, be used for transmission over a memoryless binary-input output-symmetric (MBIOS) channel. Assume that the decoder performs $l$ iterations of message-passing decoding, and let $P_b(C, l)$ denote the resulting bit error probability. Then, for every $\delta > 0$, there exists an $\alpha > 0$, where $\alpha = \alpha(\lambda, \rho, \delta, l)$ is independent of the block length $n$, such that
$$P\bigl( \bigl| P_b(C, l) - \mathbb{E}_{\mathrm{LDPC}(n,\lambda,\rho)}[P_b(C, l)] \bigr| \geq \delta \bigr) \leq e^{-\alpha n}.$$
Proof: The proof applies Azuma's inequality to a martingale sequence with bounded differences (IEEE Trans. on IT, Feb. 2001).

Application 2 - Concentration of Martingales in Coding Theory

Conditional Entropy of LDPC Code Ensembles

Theorem II - [Concentration of the conditional entropy of LDPC code ensembles (Méasson et al., 2008)]

Let $C$ be chosen uniformly at random from the ensemble $\mathrm{LDPC}(n, \lambda, \rho)$. Assume that the transmission of the code $C$ takes place over an MBIOS channel. Let $H(X|Y)$ designate the conditional entropy of the transmitted codeword $X$ given the received sequence $Y$ from the channel. Then, for any $\xi > 0$,
$$P\bigl( \bigl| H(X|Y) - \mathbb{E}_{\mathrm{LDPC}(n,\lambda,\rho)}[H(X|Y)] \bigr| \geq n\xi \bigr) \leq 2 \exp(-B n \xi^2)$$
where $B \triangleq \frac{1}{2(d_c^{\max}+1)^2 (1-R_d)}$, $d_c^{\max}$ is the maximal check-node degree, and $R_d$ is the design rate of the ensemble.

Application 2 - Concentration of Martingales in Coding Theory

Proof - [outline]

1. Introduction of a martingale sequence with bounded differences: Define the RV $Z = H_G(X|Y)$, where $G$ is the graph of a code chosen uniformly at random from the ensemble $\mathrm{LDPC}(n, \lambda, \rho)$. Define the martingale sequence $Z_t = \mathbb{E}[Z \mid \mathcal{F}_t]$, $t \in \{0, 1, \ldots, m\}$, where the filtration is the sequence of $\sigma$-algebras generated by revealing, one at a time, another parity-check equation of the code.

2. Upper bounds on the differences $|Z_{t+1} - Z_t|$: It was proved that $|Z_{t+1} - Z_t| \leq (r+1) H_G(\tilde{X}|Y)$, where $r$ is the degree of the parity-check equation revealed at time $t$, and $\tilde{X} = X_{i_1} \oplus \ldots \oplus X_{i_r}$ (i.e., $\tilde{X}$ is the modulo-2 sum of some $r$ bits of the codeword $X$). Then $r \leq d_c^{\max}$ and $H_G(\tilde{X}|Y) \leq 1$.

3. Azuma's inequality was applied to get a concentration inequality, using $|Z_{t+1} - Z_t| \leq d_c^{\max} + 1$ for every $t = 0, \ldots, m-1$, where $m = n(1-R_d)$ is the number of parity-check nodes.

Application 2 - Concentration of Martingales in Coding Theory

Improvements to Theorem II

Improvement 1 - A tightened upper bound on the conditional entropy

Instead of upper bounding $H_G(\tilde{X}|Y)$ by 1, which is independent of the channel capacity $C$, it can be proved that
$$H_G(\tilde{X}|Y) \leq h_2\left( \frac{1 - C^{\,r/2}}{2} \right)$$
where $h_2$ is the binary entropy function to the base 2. For a BSC or a BEC, this bound can be further improved.

Application 2 - Concentration of Martingales in Coding Theory

Improvement 2 (trivial)

Instead of taking the trivial bound $r \leq d_c^{\max}$ for all $m$ terms in Azuma's inequality, one can rely on the degree distribution of the parity-check nodes: the number of parity-check nodes of degree $i$ is $n(1-R_d)\Gamma_i$.

Theorem III - [Tightened Expressions for B]

In the setting of Theorem II, applying these two improvements yields tightened expressions for $B$:
1. General MBIOS: $B = \dfrac{1}{2(1-R_d) \sum_{i=1}^{d_c^{\max}} (i+1)^2 \Gamma_i \left[ h_2\left( \frac{1 - C^{\,i/2}}{2} \right) \right]^2}$
2. BSC: $B = \dfrac{1}{2(1-R_d) \sum_{i=1}^{d_c^{\max}} (i+1)^2 \Gamma_i \left[ h_2\left( \frac{1 - [1 - 2h_2^{-1}(1-C)]^i}{2} \right) \right]^2}$
3. BEC: $B = \dfrac{1}{2(1-R_d) \sum_{i=1}^{d_c^{\max}} (i+1)^2 \Gamma_i (1 - C^i)^2}$
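As a sketch, the BEC expression can be compared against the $B$ of Theorem II for the (2,20)-regular ensemble considered on the next slide (BEC capacity $C = 0.98$); since every check node has degree 20, $\Gamma_{20} = 1$ and the sum has a single term:

```python
# Sketch: B of Theorem II vs. the tightened BEC expression of Theorem III
# for a (2,20)-regular LDPC ensemble over a BEC with capacity C = 0.98.
dc, dv = 20, 2
R_d = 1 - dv / dc            # design rate = 0.9
C = 0.98                     # BEC capacity (erasure probability 0.02)

# Theorem II: B = 1 / (2 (dc+1)^2 (1 - R_d)), from |Z_{t+1} - Z_t| <= dc + 1.
B_II = 1 / (2 * (dc + 1) ** 2 * (1 - R_d))

# Theorem III (BEC): the single degree is dc, so Gamma_dc = 1.
B_III = 1 / (2 * (1 - R_d) * (dc + 1) ** 2 * (1 - C ** dc) ** 2)

ratio = B_III / B_II         # improvement factor 1 / (1 - C^dc)^2
print(B_II, B_III, ratio)    # ratio is about 9.05
```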

Application 2 - Concentration of Martingales in Coding Theory

Numerical Comparison for the BEC and the BIAWGN Channel

Let us consider the (2,20)-regular LDPC code ensemble, and communication over a BEC or a BIAWGN channel with a capacity of 0.98 bits per channel use. Compared to Theorem II, applying Theorem III results in tighter expressions for $B$:
BIAWGN - improvement by a factor of $\left[ h_2\left( \frac{1 - C^{\,d_c/2}}{2} \right) \right]^{-2}$.
BEC - improvement by a factor of $\frac{1}{(1 - C^{d_c})^2}$.

Application 2 - Concentration of Martingales in Coding Theory

Comparison for the Heavy-Tail Poisson Distribution (Tornado Codes)

Consider the capacity-achieving ensemble of Tornado LDPC codes for a BEC with erasure probability $p$. We wish to design a code ensemble that achieves a fraction $1-\varepsilon$ of the capacity.

Theorem II - The parity-check degrees are Poisson distributed, therefore $d_c^{\max} = \infty$. Hence, $B = 0$ and this result is useless.

Theorem III - $B$ scales (at least) like $O\left( \frac{1}{\log^2 \frac{1}{\varepsilon}} \right)$.

The parameter $B$ tends to zero slowly as we let the fractional gap $\varepsilon$ tend to zero; this demonstrates a rather fast concentration.

Application 2 - Concentration of Martingales in Coding Theory

Interim Conclusions

The use of Azuma's inequality was addressed in the context of proving concentration phenomena for code ensembles defined on graphs and iterative decoding algorithms.

A concentration inequality for the conditional entropy of LDPC code ensembles was tightened by deriving an improved upper bound on the jumps of the martingale sequence before applying Azuma's inequality.

The improved inequality makes it possible to prove concentration of the conditional entropy for ensembles of Tornado codes (in contrast to the original concentration inequality).

Application 2 - Concentration of Martingales in Coding Theory

Concentration of Martingales in Coding Theory

Selected Papers Applying Azuma's Inequality for LDPC Ensembles

1. M. Sipser and D. A. Spielman, "Expander codes," IEEE Trans. on Information Theory, vol. 42, no. 6, November 1996.
2. M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi and D. A. Spielman, "Efficient erasure correcting codes," IEEE Trans. on Information Theory, vol. 47, no. 2, February 2001.
3. T. Richardson and R. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Trans. on Information Theory, vol. 47, February 2001.
4. A. Kavcic, X. Ma and M. Mitzenmacher, "Binary intersymbol interference channels: Gallager bounds, density evolution, and code performance bounds," IEEE Trans. on Information Theory, vol. 49, no. 7, July 2003.

Application 2 - Concentration of Martingales in Coding Theory (Cont.)

5. A. Montanari, "Tight bounds for LDPC and LDGM codes under MAP decoding," IEEE Trans. on Information Theory, vol. 51, no. 9, September 2005.
6. C. Méasson, A. Montanari and R. Urbanke, "Maxwell construction: the hidden bridge between iterative and maximum a-posteriori decoding," IEEE Trans. on Information Theory, vol. 54, December 2008.
7. L. R. Varshney, "Performance of LDPC codes under faulty iterative decoding," IEEE Trans. on Information Theory, vol. 57, no. 7, July 2011.

Application 3: Concentration for OFDM Signals

Orthogonal Frequency Division Multiplexing (OFDM)

OFDM modulation converts a high-rate data stream into a number of low-rate streams that are transmitted over parallel narrow-band channels. One of the problems of OFDM is that the peak amplitude of the signal can be significantly higher than the average amplitude. This causes sensitivity to non-linear devices in the communication path (e.g., digital-to-analog converters, mixers and high-power amplifiers), and leads to an increase in the symbol error rate as well as a reduction in power efficiency as compared to single-carrier systems.

Application 3: Concentration for OFDM Signals

OFDM (Cont.)

Given an $n$-length codeword $\{X_i\}_{i=0}^{n-1}$, a single OFDM baseband symbol is described by
$$s(t; X_0, \ldots, X_{n-1}) = \frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} X_i \exp\left( \frac{j 2\pi i t}{T} \right), \quad 0 \leq t \leq T.$$
Assume that $X_0, \ldots, X_{n-1}$ are i.i.d. complex RVs with $|X_i| = 1$. Since the sub-carriers are orthonormal over $[0,T]$, the power of the signal $s$ over this interval is 1 a.s. The crest factor (CF) of the signal $s$, composed of $n$ sub-carriers, is defined as
$$\mathrm{CF}_n(s) \triangleq \max_{0 \leq t \leq T} |s(t)|.$$
The CF scales with high probability like $\sqrt{\log n}$ for large $n$.
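A Monte-Carlo sketch of the crest factor, using BPSK symbols $X_i \in \{\pm 1\}$ (so $|X_i| = 1$); the block length, oversampling factor and trial count are illustrative choices:

```python
import cmath, math, random

# Monte-Carlo sketch of the crest factor of an OFDM symbol with n sub-carriers.
random.seed(4)
n, oversample, trials = 64, 4, 50
N = n * oversample                       # time samples t_k = kT/N over [0,T)
w = [cmath.exp(2j * math.pi * k / N) for k in range(N)]  # twiddle factors

cfs = []
for _ in range(trials):
    X = [random.choice((-1.0, 1.0)) for _ in range(n)]
    # s(t_k) = (1/sqrt(n)) * sum_i X_i exp(j 2 pi i t_k / T)
    samples = [sum(X[i] * w[(i * k) % N] for i in range(n)) / math.sqrt(n)
               for k in range(N)]
    # By orthonormality of the sub-carriers, the time-averaged power is 1.
    avg_power = sum(abs(s) ** 2 for s in samples) / N
    cfs.append(max(abs(s) for s in samples))

mean_cf = sum(cfs) / trials
print(mean_cf, math.sqrt(math.log(n)))   # mean CF is on the order of sqrt(log n)
```

The discrete-time maximum slightly underestimates the continuous-time one, but already at this modest oversampling the observed CF values cluster near $\sqrt{\log n}$.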

Application 3: Concentration for OFDM Signals

Concentration of Measures

In the following, we consider two of the main approaches for proving concentration inequalities, and apply them to prove concentration for the crest factor of OFDM signals:
1. The first approach is based on martingales (the Azuma-Hoeffding inequality and some of its refinements).
2. The second approach is based on Talagrand's inequalities.

Application 3: Concentration for OFDM Signals

A Previously Reported Result

A concentration inequality for the CF of OFDM signals was derived by Litsyn and Wunder (IEEE Trans. on IT, 2006). It states that for every $c \geq 2.5$,
$$P\left( \left| \mathrm{CF}_n(s) - \sqrt{\log n} \right| < \frac{c \log \log n}{\sqrt{\log n}} \right) = 1 - O\left( \frac{1}{(\log n)^4} \right).$$

Application 3: Concentration for OFDM Signals

Theorem - [McDiarmid's Inequality]

Let $X = (X_1, \ldots, X_n)$ be a vector of independent random variables with $X_k$ taking values in a set $A_k$ for each $k$. Suppose that a real-valued function $f$, defined on $\prod_k A_k$, satisfies $|f(x) - f(x')| \leq c_k$ whenever the vectors $x$ and $x'$ differ only in the $k$-th coordinate. Let $\mu \triangleq \mathbb{E}[f(X)]$ be the expected value of $f(X)$. Then, for every $\alpha \geq 0$,
$$P\bigl( |f(X) - \mu| \geq \alpha \bigr) \leq 2 \exp\left( -\frac{2\alpha^2}{\sum_k c_k^2} \right).$$
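McDiarmid's inequality can be illustrated with $f$ taken as the empirical mean of $n$ i.i.d. uniform $[0,1]$ variables, for which changing one coordinate moves $f$ by at most $c_k = 1/n$; the parameters below are illustrative:

```python
import math, random

# Sketch: McDiarmid's inequality for f(X) = (X_1 + ... + X_n)/n with X_k
# i.i.d. uniform on [0,1]; here c_k = 1/n, so sum_k c_k^2 = 1/n and the
# bound reads 2 exp(-2 alpha^2 n). Parameters are illustrative choices.
random.seed(5)
n, alpha, trials = 50, 0.1, 5000
mu = 0.5                                   # E[f(X)] for uniform [0,1] inputs
bound = 2 * math.exp(-2 * alpha ** 2 / (n * (1 / n) ** 2))  # = 2 e^{-2 a^2 n}

hits = sum(abs(sum(random.random() for _ in range(n)) / n - mu) >= alpha
           for _ in range(trials))
freq = hits / trials
print(freq, bound)   # the empirical frequency stays below the bound
```

The bound is loose at these parameters (the true deviation probability is much smaller), which is the typical behavior of bounded-difference inequalities and motivates the refinements discussed in this talk.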


More information

Tight Bounds for Symmetric Divergence Measures and a Refined Bound for Lossless Source Coding

Tight Bounds for Symmetric Divergence Measures and a Refined Bound for Lossless Source Coding APPEARS IN THE IEEE TRANSACTIONS ON INFORMATION THEORY, FEBRUARY 015 1 Tight Bounds for Symmetric Divergence Measures and a Refined Bound for Lossless Source Coding Igal Sason Abstract Tight bounds for

More information

Lecture 11: Polar codes construction

Lecture 11: Polar codes construction 15-859: Information Theory and Applications in TCS CMU: Spring 2013 Lecturer: Venkatesan Guruswami Lecture 11: Polar codes construction February 26, 2013 Scribe: Dan Stahlke 1 Polar codes: recap of last

More information

Generalized Writing on Dirty Paper

Generalized Writing on Dirty Paper Generalized Writing on Dirty Paper Aaron S. Cohen acohen@mit.edu MIT, 36-689 77 Massachusetts Ave. Cambridge, MA 02139-4307 Amos Lapidoth lapidoth@isi.ee.ethz.ch ETF E107 ETH-Zentrum CH-8092 Zürich, Switzerland

More information

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 1. Cascade of Binary Symmetric Channels The conditional probability distribution py x for each of the BSCs may be expressed by the transition probability

More information

Hypothesis Testing with Communication Constraints

Hypothesis Testing with Communication Constraints Hypothesis Testing with Communication Constraints Dinesh Krithivasan EECS 750 April 17, 2006 Dinesh Krithivasan (EECS 750) Hyp. testing with comm. constraints April 17, 2006 1 / 21 Presentation Outline

More information

Analysis of Sum-Product Decoding of Low-Density Parity-Check Codes Using a Gaussian Approximation

Analysis of Sum-Product Decoding of Low-Density Parity-Check Codes Using a Gaussian Approximation IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 2, FEBRUARY 2001 657 Analysis of Sum-Product Decoding of Low-Density Parity-Check Codes Using a Gaussian Approximation Sae-Young Chung, Member, IEEE,

More information

Arimoto-Rényi Conditional Entropy. and Bayesian M-ary Hypothesis Testing. Abstract

Arimoto-Rényi Conditional Entropy. and Bayesian M-ary Hypothesis Testing. Abstract Arimoto-Rényi Conditional Entropy and Bayesian M-ary Hypothesis Testing Igal Sason Sergio Verdú Abstract This paper gives upper and lower bounds on the minimum error probability of Bayesian M-ary hypothesis

More information

Introduction to Low-Density Parity Check Codes. Brian Kurkoski

Introduction to Low-Density Parity Check Codes. Brian Kurkoski Introduction to Low-Density Parity Check Codes Brian Kurkoski kurkoski@ice.uec.ac.jp Outline: Low Density Parity Check Codes Review block codes History Low Density Parity Check Codes Gallager s LDPC code

More information

Bounds on Mutual Information for Simple Codes Using Information Combining

Bounds on Mutual Information for Simple Codes Using Information Combining ACCEPTED FOR PUBLICATION IN ANNALS OF TELECOMM., SPECIAL ISSUE 3RD INT. SYMP. TURBO CODES, 003. FINAL VERSION, AUGUST 004. Bounds on Mutual Information for Simple Codes Using Information Combining Ingmar

More information

Secret Key and Private Key Constructions for Simple Multiterminal Source Models

Secret Key and Private Key Constructions for Simple Multiterminal Source Models Secret Key and Private Key Constructions for Simple Multiterminal Source Models arxiv:cs/05050v [csit] 3 Nov 005 Chunxuan Ye Department of Electrical and Computer Engineering and Institute for Systems

More information

Upper Bounds on the Capacity of Binary Intermittent Communication

Upper Bounds on the Capacity of Binary Intermittent Communication Upper Bounds on the Capacity of Binary Intermittent Communication Mostafa Khoshnevisan and J. Nicholas Laneman Department of Electrical Engineering University of Notre Dame Notre Dame, Indiana 46556 Email:{mhoshne,

More information

Capacity of the Discrete Memoryless Energy Harvesting Channel with Side Information

Capacity of the Discrete Memoryless Energy Harvesting Channel with Side Information 204 IEEE International Symposium on Information Theory Capacity of the Discrete Memoryless Energy Harvesting Channel with Side Information Omur Ozel, Kaya Tutuncuoglu 2, Sennur Ulukus, and Aylin Yener

More information

Spatially Coupled LDPC Codes

Spatially Coupled LDPC Codes Spatially Coupled LDPC Codes Kenta Kasai Tokyo Institute of Technology 30 Aug, 2013 We already have very good codes. Efficiently-decodable asymptotically capacity-approaching codes Irregular LDPC Codes

More information

An Achievable Error Exponent for the Mismatched Multiple-Access Channel

An Achievable Error Exponent for the Mismatched Multiple-Access Channel An Achievable Error Exponent for the Mismatched Multiple-Access Channel Jonathan Scarlett University of Cambridge jms265@camacuk Albert Guillén i Fàbregas ICREA & Universitat Pompeu Fabra University of

More information

Computing the threshold shift for general channels

Computing the threshold shift for general channels Computing the threshold shift for general channels Jeremie Ezri and Rüdiger Urbanke LTHC EPFL jeremie.ezri@epfl.ch, ruediger.urbanke@epfl.ch Andrea Montanari and Sewoong Oh Electrical Engineering Department

More information

Lecture 6 I. CHANNEL CODING. X n (m) P Y X

Lecture 6 I. CHANNEL CODING. X n (m) P Y X 6- Introduction to Information Theory Lecture 6 Lecturer: Haim Permuter Scribe: Yoav Eisenberg and Yakov Miron I. CHANNEL CODING We consider the following channel coding problem: m = {,2,..,2 nr} Encoder

More information

Lecture 22: Final Review

Lecture 22: Final Review Lecture 22: Final Review Nuts and bolts Fundamental questions and limits Tools Practical algorithms Future topics Dr Yao Xie, ECE587, Information Theory, Duke University Basics Dr Yao Xie, ECE587, Information

More information

Quantum Sphere-Packing Bounds and Moderate Deviation Analysis for Classical-Quantum Channels

Quantum Sphere-Packing Bounds and Moderate Deviation Analysis for Classical-Quantum Channels Quantum Sphere-Packing Bounds and Moderate Deviation Analysis for Classical-Quantum Channels (, ) Joint work with Min-Hsiu Hsieh and Marco Tomamichel Hao-Chung Cheng University of Technology Sydney National

More information

Entropy and Ergodic Theory Lecture 15: A first look at concentration

Entropy and Ergodic Theory Lecture 15: A first look at concentration Entropy and Ergodic Theory Lecture 15: A first look at concentration 1 Introduction to concentration Let X 1, X 2,... be i.i.d. R-valued RVs with common distribution µ, and suppose for simplicity that

More information

On the Shamai-Laroia Approximation for the Information Rate of the ISI Channel

On the Shamai-Laroia Approximation for the Information Rate of the ISI Channel On the Shamai-Laroia Approximation for the Information Rate of the ISI Channel Yair Carmon and Shlomo Shamai (Shitz) Department of Electrical Engineering, Technion - Israel Institute of Technology 2014

More information

LECTURE 10. Last time: Lecture outline

LECTURE 10. Last time: Lecture outline LECTURE 10 Joint AEP Coding Theorem Last time: Error Exponents Lecture outline Strong Coding Theorem Reading: Gallager, Chapter 5. Review Joint AEP A ( ɛ n) (X) A ( ɛ n) (Y ) vs. A ( ɛ n) (X, Y ) 2 nh(x)

More information

Superposition Encoding and Partial Decoding Is Optimal for a Class of Z-interference Channels

Superposition Encoding and Partial Decoding Is Optimal for a Class of Z-interference Channels Superposition Encoding and Partial Decoding Is Optimal for a Class of Z-interference Channels Nan Liu and Andrea Goldsmith Department of Electrical Engineering Stanford University, Stanford CA 94305 Email:

More information

Distributed Source Coding Using LDPC Codes

Distributed Source Coding Using LDPC Codes Distributed Source Coding Using LDPC Codes Telecommunications Laboratory Alex Balatsoukas-Stimming Technical University of Crete May 29, 2010 Telecommunications Laboratory (TUC) Distributed Source Coding

More information

Lecture 4 : Introduction to Low-density Parity-check Codes

Lecture 4 : Introduction to Low-density Parity-check Codes Lecture 4 : Introduction to Low-density Parity-check Codes LDPC codes are a class of linear block codes with implementable decoders, which provide near-capacity performance. History: 1. LDPC codes were

More information

Approaching Blokh-Zyablov Error Exponent with Linear-Time Encodable/Decodable Codes

Approaching Blokh-Zyablov Error Exponent with Linear-Time Encodable/Decodable Codes Approaching Blokh-Zyablov Error Exponent with Linear-Time Encodable/Decodable Codes 1 Zheng Wang, Student Member, IEEE, Jie Luo, Member, IEEE arxiv:0808.3756v1 [cs.it] 27 Aug 2008 Abstract We show that

More information

The PPM Poisson Channel: Finite-Length Bounds and Code Design

The PPM Poisson Channel: Finite-Length Bounds and Code Design August 21, 2014 The PPM Poisson Channel: Finite-Length Bounds and Code Design Flavio Zabini DEI - University of Bologna and Institute for Communications and Navigation German Aerospace Center (DLR) Balazs

More information

An Efficient Maximum Likelihood Decoding of LDPC Codes Over the Binary Erasure Channel

An Efficient Maximum Likelihood Decoding of LDPC Codes Over the Binary Erasure Channel IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 5, NO. 11, NOVEMBER 24 1 An Efficient Maximum Likelihood Decoding of LDPC Codes Over the Binary Erasure Channel David Burshtein and Gadi Miller School of Electrical

More information

Low-complexity error correction in LDPC codes with constituent RS codes 1

Low-complexity error correction in LDPC codes with constituent RS codes 1 Eleventh International Workshop on Algebraic and Combinatorial Coding Theory June 16-22, 2008, Pamporovo, Bulgaria pp. 348-353 Low-complexity error correction in LDPC codes with constituent RS codes 1

More information

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe SHARED INFORMATION Prakash Narayan with Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe 2/40 Acknowledgement Praneeth Boda Himanshu Tyagi Shun Watanabe 3/40 Outline Two-terminal model: Mutual

More information

Iterative Encoding of Low-Density Parity-Check Codes

Iterative Encoding of Low-Density Parity-Check Codes Iterative Encoding of Low-Density Parity-Check Codes David Haley, Alex Grant and John Buetefuer Institute for Telecommunications Research University of South Australia Mawson Lakes Blvd Mawson Lakes SA

More information

Bounds on the Maximum Likelihood Decoding Error Probability of Low Density Parity Check Codes

Bounds on the Maximum Likelihood Decoding Error Probability of Low Density Parity Check Codes Bounds on the Maximum ikelihood Decoding Error Probability of ow Density Parity Check Codes Gadi Miller and David Burshtein Dept. of Electrical Engineering Systems Tel-Aviv University Tel-Aviv 69978, Israel

More information

ELEC546 Review of Information Theory

ELEC546 Review of Information Theory ELEC546 Review of Information Theory Vincent Lau 1/1/004 1 Review of Information Theory Entropy: Measure of uncertainty of a random variable X. The entropy of X, H(X), is given by: If X is a discrete random

More information

Practical Polar Code Construction Using Generalised Generator Matrices

Practical Polar Code Construction Using Generalised Generator Matrices Practical Polar Code Construction Using Generalised Generator Matrices Berksan Serbetci and Ali E. Pusane Department of Electrical and Electronics Engineering Bogazici University Istanbul, Turkey E-mail:

More information

Joint Iterative Decoding of LDPC Codes and Channels with Memory

Joint Iterative Decoding of LDPC Codes and Channels with Memory Joint Iterative Decoding of LDPC Codes and Channels with Memory Henry D. Pfister and Paul H. Siegel University of California, San Diego 3 rd Int l Symposium on Turbo Codes September 1, 2003 Outline Channels

More information

Exact Probability of Erasure and a Decoding Algorithm for Convolutional Codes on the Binary Erasure Channel

Exact Probability of Erasure and a Decoding Algorithm for Convolutional Codes on the Binary Erasure Channel Exact Probability of Erasure and a Decoding Algorithm for Convolutional Codes on the Binary Erasure Channel Brian M. Kurkoski, Paul H. Siegel, and Jack K. Wolf Department of Electrical and Computer Engineering

More information

Why We Can Not Surpass Capacity: The Matching Condition

Why We Can Not Surpass Capacity: The Matching Condition Why We Can Not Surpass Capacity: The Matching Condition Cyril Méasson EPFL CH-1015 Lausanne cyril.measson@epfl.ch, Andrea Montanari LPT, ENS F-75231 Paris montanar@lpt.ens.fr Rüdiger Urbanke EPFL CH-1015

More information

An Introduction to Low-Density Parity-Check Codes

An Introduction to Low-Density Parity-Check Codes An Introduction to Low-Density Parity-Check Codes Paul H. Siegel Electrical and Computer Engineering University of California, San Diego 5/ 3/ 7 Copyright 27 by Paul H. Siegel Outline Shannon s Channel

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 AEP Asymptotic Equipartition Property AEP In information theory, the analog of

More information

THE seminal paper of Gallager [1, p. 48] suggested to evaluate

THE seminal paper of Gallager [1, p. 48] suggested to evaluate IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 50, NO. 11, NOVEMBER 2004 2657 Extrinsic Information Transfer Functions: Model and Erasure Channel Properties Alexei Ashikhmin, Member, IEEE, Gerhard Kramer,

More information

On Bit Error Rate Performance of Polar Codes in Finite Regime

On Bit Error Rate Performance of Polar Codes in Finite Regime On Bit Error Rate Performance of Polar Codes in Finite Regime A. Eslami and H. Pishro-Nik Abstract Polar codes have been recently proposed as the first low complexity class of codes that can provably achieve

More information

Accumulate-Repeat-Accumulate Codes: Capacity-Achieving Ensembles of Systematic Codes for the Erasure Channel with Bounded Complexity

Accumulate-Repeat-Accumulate Codes: Capacity-Achieving Ensembles of Systematic Codes for the Erasure Channel with Bounded Complexity Accumulate-Repeat-Accumulate Codes: Capacity-Achieving Ensembles of Systematic Codes for the Erasure Channel with Bounded Complexity Henry D. Pfister, Member, Igal Sason, Member Abstract The paper introduces

More information

5. Density evolution. Density evolution 5-1

5. Density evolution. Density evolution 5-1 5. Density evolution Density evolution 5-1 Probabilistic analysis of message passing algorithms variable nodes factor nodes x1 a x i x2 a(x i ; x j ; x k ) x3 b x4 consider factor graph model G = (V ;

More information

Second-Order Asymptotics in Information Theory

Second-Order Asymptotics in Information Theory Second-Order Asymptotics in Information Theory Vincent Y. F. Tan (vtan@nus.edu.sg) Dept. of ECE and Dept. of Mathematics National University of Singapore (NUS) National Taiwan University November 2015

More information

Entropy, Inference, and Channel Coding

Entropy, Inference, and Channel Coding Entropy, Inference, and Channel Coding Sean Meyn Department of Electrical and Computer Engineering University of Illinois and the Coordinated Science Laboratory NSF support: ECS 02-17836, ITR 00-85929

More information

A t super-channel. trellis code and the channel. inner X t. Y t. S t-1. S t. S t+1. stages into. group two. one stage P 12 / 0,-2 P 21 / 0,2

A t super-channel. trellis code and the channel. inner X t. Y t. S t-1. S t. S t+1. stages into. group two. one stage P 12 / 0,-2 P 21 / 0,2 Capacity Approaching Signal Constellations for Channels with Memory Λ Aleksandar Kav»cić, Xiao Ma, Michael Mitzenmacher, and Nedeljko Varnica Division of Engineering and Applied Sciences Harvard University

More information

On Third-Order Asymptotics for DMCs

On Third-Order Asymptotics for DMCs On Third-Order Asymptotics for DMCs Vincent Y. F. Tan Institute for Infocomm Research (I R) National University of Singapore (NUS) January 0, 013 Vincent Tan (I R and NUS) Third-Order Asymptotics for DMCs

More information

Interactions of Information Theory and Estimation in Single- and Multi-user Communications

Interactions of Information Theory and Estimation in Single- and Multi-user Communications Interactions of Information Theory and Estimation in Single- and Multi-user Communications Dongning Guo Department of Electrical Engineering Princeton University March 8, 2004 p 1 Dongning Guo Communications

More information

Information Theory and Hypothesis Testing

Information Theory and Hypothesis Testing Summer School on Game Theory and Telecommunications Campione, 7-12 September, 2014 Information Theory and Hypothesis Testing Mauro Barni University of Siena September 8 Review of some basic results linking

More information

On the Design of Raptor Codes for Binary-Input Gaussian Channels

On the Design of Raptor Codes for Binary-Input Gaussian Channels 1 On the Design of Raptor Codes for Binary-Input Gaussian Channels Zhong Cheng, Jeff Castura, and Yongyi Mao, Member, IEEE Abstract This paper studies the problem of Raptor-code design for binary-input

More information

Arimoto Channel Coding Converse and Rényi Divergence

Arimoto Channel Coding Converse and Rényi Divergence Arimoto Channel Coding Converse and Rényi Divergence Yury Polyanskiy and Sergio Verdú Abstract Arimoto proved a non-asymptotic upper bound on the probability of successful decoding achievable by any code

More information

Belief propagation decoding of quantum channels by passing quantum messages

Belief propagation decoding of quantum channels by passing quantum messages Belief propagation decoding of quantum channels by passing quantum messages arxiv:67.4833 QIP 27 Joseph M. Renes lempelziv@flickr To do research in quantum information theory, pick a favorite text on classical

More information