Hypercontractivity of spherical averages in Hamming space Yury Polyanskiy Department of EECS MIT yp@mit.edu Allerton Conference, Oct 2, 2014 Yury Polyanskiy Hypercontractivity in Hamming space 1
Hypercontractivity: definition L p -norms of random variables: ( f(x) Lp(µ) f(x) p µ(dx) U p (E[ U p ]) 1 p ) 1 p Let P XY joint distribution. Define T Y X stochastic (Markov) averaging operator T Y X f(x) E[f(Y ) X = x] = T f p f p by Jensen s. Yury Polyanskiy Hypercontractivity in Hamming space 2
Hypercontractivity: definition L p -norms of random variables: ( f(x) Lp(µ) f(x) p µ(dx) U p (E[ U p ]) 1 p ) 1 p Let P XY joint distribution. Define T Y X stochastic (Markov) averaging operator T Y X f(x) E[f(Y ) X = x] = T f p f p by Jensen s. T is hypercontractive if T p q = 1 for p < q: T f q f p, T p q sup f T f q f p T f gains integrability! (by smoothing peaks) T is HC for some p T is HC for all p. Yury Polyanskiy Hypercontractivity in Hamming space 2
Hypercontractivity: examples Nelson-Gross: (X, Y ) jointly Gaussian: E[f(Y ) X] 4 f(y ) 2 ρ(x, Y ) 1 3. [ ] Bonami-Gross: (X, Y ) 1 1 δ δ 2 BSS(δ): δ 1 δ E[f(Y ) X] q f(y ) p p 1 (1 2δ)2 q 1 Bonami-Beckner: T p q = 1 T n p q = 1 Ahslwede-Gács: for finite X, Y pretty much any T Y X is HC Yury Polyanskiy Hypercontractivity in Hamming space 3
What is HC good for? Canonical example: Estimate probability of a rectangle P[X A, Y B], where P[X A] P[Y B] = θ 1. Spectral gap (expanders, etc): 1 = σ 1 (T ) > σ 2 (T ) > P[X A, Y B] θ 2 + σ 2 θ(1 θ) σ 2 θ Hypercontractivity: P[X A, Y B] = (1 A, T 1 B ) 1 A q T 1 B q 1 A q 1 B p = θ 1 1 q + 1 p = θ 1+ɛ σ 2 θ (!) Note: Used 0/1 nature of indicator functions. Yury Polyanskiy Hypercontractivity in Hamming space 4
Hypercontractivity and KL-divergence P X P Y X P Y X Y Q X Q Y Apply to typical sets: So we get: 2 nd(q X P X ) P[X n T n [Q X ], Y n T n [Q Y ] ] 2 n 1 q D(Q X P X ) n 1 p D(Q Y P Y ) T Y X p q = 1 = D(Q Y P Y ) p q D(Q X P X ) Q X Yury Polyanskiy Hypercontractivity in Hamming space 5
Hypercontractivity and KL-divergence P X P Y X P Y X Y Q X Q Y Apply to typical sets: So we get: 2 nd(q X P X ) P[X n T n [Q X ], Y n T n [Q Y ] ] 2 n 1 q D(Q X P X ) n 1 p D(Q Y P Y ) T Y X p q = 1 = D(Q Y P Y ) p q D(Q X P X ) Q X = Yury Polyanskiy Hypercontractivity in Hamming space 5
Hypercontractivity and KL-divergence P X P Y X P Y X Y Q X Q Y For fixed P XY, KL-contraction ratio: η KL (P XY ) sup Q X D(Q Y P Y ) D(Q X P X ) = sup U X Y I(U; Y ) I(U; X) Yury Polyanskiy Hypercontractivity in Hamming space 6
Hypercontractivity and KL-divergence P X P Y X P Y X Y Q X Q Y For fixed P XY, KL-contraction ratio: Theorem (Ahlswede-Gács) Roughly: η KL (P XY ) sup Q X D(Q Y P Y ) D(Q X P X ) = η KL (P XY ) = inf sup U X Y { } p q : T Y X p q = 1 I(U; Y ) I(U; X) Q X : D(Q Y P Y ) rd(q X P X ) = T Y X p p r = 1, p 1 Yury Polyanskiy Hypercontractivity in Hamming space 6
Hypercontractivity and KL-divergence [Nair 2014]: Apply HC to typical sets of arbitrary Q XY to get T Y X p q = 1 = 1 p D(Q Y P Y ) 1 q D(Q X P X ) + D(Q Y X P Y X Q X ) Q XY Yury Polyanskiy Hypercontractivity in Hamming space 7
Hypercontractivity and KL-divergence [Nair 2014]: Apply HC to typical sets of arbitrary Q XY to get T Y X p q = 1 1 p D(Q Y P Y ) 1 q D(Q X P X ) + D(Q Y X P Y X Q X ) Q XY Proof: Take g 1 1 = 1 and let f = (g 1 ) 1 p STP: T f q 1. Ahlswede-Gács: L p Nair: Q X (x) = P X (x) T f(x)q T f q q Q Y X (y x) = P Y X (y x) Q X (x) =... same... Q Y X (y x) = P Y X (y x) f(y) T f(x) log T f q 1 p D(Q Y P Y ) 1 q D(Q X P X ) D(Q Y X P Y X Q X ) 0 Yury Polyanskiy Hypercontractivity in Hamming space 7
Hypercontractivity and KL-divergence [Nair 2014]: Apply HC to typical sets of arbitrary Q XY to get T Y X p q = 1 1 p D(Q Y P Y ) 1 q D(Q X P X ) + D(Q Y X P Y X Q X ) Q XY Proof: Take g 1 1 = 1 and let f = (g 1 ) 1 p STP: T f q 1. Ahlswede-Gács: L p Nair: Q X (x) = P X (x) T f(x)q T f q q Q Y X (y x) = P Y X (y x) Q X (x) =... same... Q Y X (y x) = P Y X (y x) f(y) T f(x) log T f q 1 p D(Q Y P Y ) 1 q D(Q X P X ) D(Q Y X P Y X Q X ) 0 Note: For p 1 both have Q Y X P Y X. Yury Polyanskiy Hypercontractivity in Hamming space 7
L p L q norm as measure of dependence T Y X p q measures dependence: Invariant to relabeling Satisfies data-processing Unusual: sticky at 1. Yury Polyanskiy Hypercontractivity in Hamming space 8
L p L q norm as measure of dependence T Y X p q measures dependence: Invariant to relabeling Satisfies data-processing Unusual: sticky at 1. Turns out the last one is a general phenomenon: All operators T sufficiently noisy are p q HC. Yury Polyanskiy Hypercontractivity in Hamming space 8
Semenov-Shneiberg 88 Theorem For any q 2 p there is ɛ = ɛ(p, q) s.t. simultaneously for all P XY : T Y X E Y p q ɛ T Y X p q = 1 where E Y f 1 X E[f(Y )]. Remarks: Also for general q < p with extra condition: T E 2 2 ɛ Also for other operators of the type: T 1 Y = 1 X, (1 X, T ) = (1 X, E ) Orig. result is for P X = P Y, but proof easily generalizes. For finite (X, Y ), suff. to check P XY P X P Y T V ɛ = ɛ (p, q, P X, P Y ) Yury Polyanskiy Hypercontractivity in Hamming space 9
Semenov-Shneiberg 88: Proof Proof for q 2 p: WLOG can take f = 1 + Z with Z 1. Decompose T = E +. STP: ( ) 1 + Z q 1 + Z p Cool fact: universal τ, a, b s.t. 1 + Z q 1 + a Z 2 q if Z q τ 1 + Z p 1 + b Z 2 p if Z p τ Whenever 2 p q b a and Z p τ: 1 + Z) q 1 + a Z 2 q 1 + Z p Yury Polyanskiy Hypercontractivity in Hamming space 10
Semenov-Shneiberg 88: Proof Proof for q 2 p: WLOG can take f = 1 + Z with Z 1. Decompose T = E +. STP: ( ) 1 + Z q 1 + Z p Cool fact: universal τ, a, b s.t. 1 + Z q 1 + a Z 2 q if Z q τ 1 + Z p 1 + b Z 2 p if Z p τ Whenever 2 p q b a and Z p τ: 1 + Z) q 1 + a Z 2 q 1 + Z p Large Z: Another cool fact: universal β p ( ) 1 + Z p 1 + β p ( Z p ), and β p(u) u increasing So (*) holds if p q βp(τ) τ. Yury Polyanskiy Hypercontractivity in Hamming space 10
Spherical averages in Hamming space Hamming space F n 2 with uniform probability Bonami-Gross operator N δ f(x) E[f(x + Z)], Z Bern(δ) n Spherical average: T δ f(x) E[f(x + Z)], Z Unif(S δn ) where S j {y : wt(y) = j}. Theorem (P 13) δ 1 2 and p = 1 + (1 2δ)2 there is a dim-independent C p,δ s.t. T δ f 2 C p,δ f p Remark: Bonami-Gross N δ f 2 1 f p Yury Polyanskiy Hypercontractivity in Hamming space 11
Hypercontractivity of T δ : exact version 1 + (1 2δ)2 p 2 F = (δ, p) : 0 δ 1 p 1 Theorem (P 13) For any compact subset K F there is a constant C K s.t. T δ p 2 C K (δ, p) K, where T δ f = f P Z, P Z = Unif(S δn ). For p = 1, δ = 1/2: T 1 p 2 = θ(n 1 4 ). 2 Note: for Bernoulli noise N δ p 2 = 1 everywhere on closure of F. Yury Polyanskiy Hypercontractivity in Hamming space 12
Proof: Prelim remarks Why is constant not 1? Let 1 even = 1{x : wt(x)-even} Then: T δ 1 even = 1 even (or 1 odd ) So: T δ p q 2 1 p 1 q > 1. Yury Polyanskiy Hypercontractivity in Hamming space 13
Proof: Prelim remarks Why is constant not 1? Let 1 even = 1{x : wt(x)-even} Then: T δ 1 even = 1 even (or 1 odd ) So: T δ p q 2 1 p 1 q > 1. Beautiful methods that failed: Tensorization (T δ not a tensor power) Semigroup method, aka d dδ (T δ not a semigroup) Comparison of Dirichlet forms (not useful for δ 1 n ) Ugly method that worked: Direct computation of eigenvalues of T δ Yury Polyanskiy Hypercontractivity in Hamming space 13
Proof: Orthogonal decomposition T δ and N δ are S n -equivariant f(x) = poly(( 1) x 1,..., ( 1) xn ) = n f a (x), a=0 where f a (x) = degree a part of the poly. By odd/even trick can assume f a = 0 for a n/2. Have: N δ f a = (1 2δ) a f a T δ f a = (?? ) f a Yury Polyanskiy Hypercontractivity in Hamming space 14
Proof: Orthogonal decomposition T δ and N δ are S n -equivariant f(x) = poly(( 1) x 1,..., ( 1) xn ) = n f a (x), a=0 where f a (x) = degree a part of the poly. By odd/even trick can assume f a = 0 for a n/2. Have: N δ f a = (1 2δ) a f a T δ f a = κ δn (a) f a and κ δn (a) Krawtchouk polynomial of degree δn. Yury Polyanskiy Hypercontractivity in Hamming space 14
Proof (cont d): Eigenvalues of N δ vs T δ 0 Asymptotic spectra of T δ and N δ : δ = 0.1 Bernoulli noise N δ Spherical average T δ 0.05 (1/n) log(fourier coef.) 0.1 0.15 0.2 0.25 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 a/n Yury Polyanskiy Hypercontractivity in Hamming space 15
Proof (cont d): Eigenvalues of N δ vs T δ 0 Asymptotic spectra of T δ and N δ : δ = 0.1 Bernoulli noise N δ Spherical average T δ 0.05 (1/n) log(fourier coef.) 0.1 0.15 For any eigenspace a λ a (T δ ) λ a (N δ ) 0.2 0.25 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 a/n Yury Polyanskiy Hypercontractivity in Hamming space 15
Proof (cont d): Eigenvalues of N δ vs T δ 0 Asymptotic spectra of T δ and N δ : δ = 0.1 Bernoulli noise N δ Spherical average T δ 0.05 (1/n) log(fourier coef.) 0.1 0.15 0.2 For any eigenspace a λ a (T δ ) λ a (N δ ) But only for δ 0.17! 0.25 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 a/n Yury Polyanskiy Hypercontractivity in Hamming space 15
Proof (cont d): Failure for δ 0.17 0 Asymptotic spectra of T δ and N δ : δ = 0.25 Fourier projection norm Π a p(δ) >2 0.05 Bernoulli noise N δ Spherical average T δ 0.1 0.15 (1/n) log(fourier coef.) 0.2 0.25 0.3 0.35 0.4 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 a/n Yury Polyanskiy Hypercontractivity in Hamming space 16
Proof (cont d): Failure for δ 0.17 0 Asymptotic spectra of T δ and N δ : δ = 0.25 Fourier projection norm Π a p(δ) >2 0.05 Bernoulli noise N δ Spherical average T δ 0.1 (1/n) log(fourier coef.) 0.15 0.2 0.25 0.3 When δ > 0.17 and a large λ a (T δ ) λ a (N δ ) 0.35 Fortunately: f a 2 f 2 0.4 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 a/n Yury Polyanskiy Hypercontractivity in Hamming space 16
Proof (cont d): Overall argument Know: N δ f 2 2 = a (1 2δ) 2a f a 2 2 f 2 p by Bonami! Need to show: T δ f 2 2 = a κ δn (a) 2 f a 2 2 C f 2 p a a crit : κ δn (a) 2 f a 2 2 C (1 2δ) 2a f a 2 2 a a crit a = C N δ f 2 2 C f 2 p a > a crit : Need extra estimate: κ δn (a) f a 2 e Cn f p = a ξ crit n contributes C f 2 p. Yury Polyanskiy Hypercontractivity in Hamming space 17
Proof (cont d): Overall argument Know: N δ f 2 2 = a (1 2δ) 2a f a 2 2 f 2 p by Bonami! Need to show: T δ f 2 2 = a κ δn (a) 2 f a 2 2 C f 2 p a a crit : For δ < 0.17: a crit = n/2! κ δn (a) 2 f a 2 2 C (1 2δ) 2a f a 2 2 a a crit a a > a crit : Need extra estimate: = C N δ f 2 2 C f 2 p κ δn (a) f a 2 e Cn f p = a ξ crit n contributes C f 2 p. Yury Polyanskiy Hypercontractivity in Hamming space 17
Application to additive-combinatorics Known fact: Subspace L F n 2 s.t. L S δn S δn implies dim L n. Yury Polyanskiy Hypercontractivity in Hamming space 18
Application to additive-combinatorics Known fact: Subspace L F n 2 s.t. L S δn S δn implies dim L n. Corollary Let A F n 2 be a subset s.t. sumset A + A intersects some S δn with average multiplicty > λ A. Then Proof: Statement means A λ ɛ 2 n. 2 n (1 A 1 A, 1 Sδn ) λ A S δn Rearrange (1 A 1 A, 1 S ) = (T δ 1 A, 1A) Apply Hölder and HC. Yury Polyanskiy Hypercontractivity in Hamming space 18
Thank you! References R. Ahlswede and P. Gacs, Spreading of sets in product spaces and hypercontraction of the Markov operator, Ann. Probab., pp. 925 939, 1976. E. M. Semenov and I. Y. Shneiberg, Hypercontractive operators and Khinchin s inequality, Func. Analysis and Appl., vol. 22, no. 3, pp. 244 246, 1988. Y. Polyanskiy, Hypercontractivity of spherical averages in Hamming space, arxiv:1309.3014, Sep. 2013 Y. Polyanskiy, Hypothesis testing via a comparator and hypercontractivity, preprint, Feb. 2013. C. Nair, Equivalent formulations of hypercontractivity using information measures, Zurich Seminar on Comm., 2014 Yury Polyanskiy Hypercontractivity in Hamming space 19
Dobrushin coefficient total variation P Q TV = 1 2 dp dq Dobrushin coefficient η TV = P Y Q Y sup TV P X Q X P X Q X TV Original motivation: mixing of Markov chains = sup x,x P Y X=x P Y X=x TV. Yury Polyanskiy Hypercontractivity in Hamming space 20
Contraction coeffs and mixing of Markov chains Consider a M.C. with invariant dist. P : Contraction: X 0 X 1 X 2 P Xn P TV (η TV ) n χ 2 (P Xn P ) (η χ 2) n χ 2 (P X0 P ) D(P Xn P ) (η KL ) n D(P X0 P ) (Dobrushin) (spectral gap) (log-sobolev) Yury Polyanskiy Hypercontractivity in Hamming space 21
Contraction coeffs and mixing of Markov chains Consider a M.C. with invariant dist. P : Contraction: X 0 X 1 X 2 P Xn P TV (η TV ) n χ 2 (P Xn P ) (η χ 2) n χ 2 (P X0 P ) D(P Xn P ) (η KL ) n D(P X0 P ) (Dobrushin) (spectral gap) (log-sobolev) TODO: add mixing estimate via HC Yury Polyanskiy Hypercontractivity in Hamming space 21
From TV to KL... and further Theorem (Cohen-Iwasa-Rautu-Ruskai-Seneta-Zbăganu 93) η KL η TV Yury Polyanskiy Hypercontractivity in Hamming space 22
From TV to KL... and further Theorem (Cohen-Iwasa-Rautu-Ruskai-Seneta-Zbăganu 93) η KL η TV Fun application: Let T kernel of a finite-state M.C. with invariant distribution P. Let E be indep. M.C.: Ef fdp If T (x, ) E(x, ) TV < 1/2 then η TV = max x,x T (x, ) T (x, ) TV < 1 Then η KL < 1. By Ahslwede-Gács q 1 T f q f p, p = η KL q < q (hypercont) There is a ball around E s.t. for all T : T Lp L q = 1. Yury Polyanskiy Hypercontractivity in Hamming space 22
From TV to KL... and further Theorem (Cohen-Iwasa-Rautu-Ruskai-Seneta-Zbăganu 93) η KL η TV Fun application: Let T kernel of a finite-state M.C. with invariant distribution P. Let E be indep. M.C.: Ef fdp If T (x, ) E(x, ) TV < 1/2 then η TV = max x,x T (x, ) T (x, ) TV < 1 Then η KL < 1. By Ahslwede-Gács q 1 T f q f p, p = η KL q < q (hypercont) There is a ball around E s.t. for all T : T Lp L q = 1. Every (mixing) M.C. eventually (hyper-)contracts L p L q. Yury Polyanskiy Hypercontractivity in Hamming space 22
From TV to KL... and further Theorem (Cohen-Iwasa-Rautu-Ruskai-Seneta-Zbăganu 93) η KL η TV Fun application: Let T kernel of a finite-state M.C. with invariant distribution P. Let E be indep. M.C.: Ef fdp If T (x, ) E(x, ) TV < 1/2 then η TV = max x,x T (x, ) T (x, ) TV < 1 Then η KL < 1. By Ahslwede-Gács q 1 T f q f p, p = η KL q < q (hypercont) There is a ball around E s.t. for all T : T Lp L q = 1. Every (mixing) M.C. eventually (hyper-)contracts L p L q. Euclidean space: planar face of the ball of Fourier multipliers: [Segal 70], [Fefferman-Shapiro 72], [Semenov-Shneiberg 88] Yury Polyanskiy Hypercontractivity in Hamming space 22