General Strong Polarization
Madhu Sudan, Harvard University
Joint work with Jaroslaw Blasiok (Harvard), Venkatesan Guruswami (CMU), Preetum Nakkiran (Harvard) and Atri Rudra (Buffalo)
May 1, 2018 · G.Tech: General Strong Polarization

This Talk: Martingales and Polarization
- [0,1]-Martingale: sequence of r.v.s $X_0, X_1, X_2, \ldots$ with $X_t \in [0,1]$ and $\mathbb{E}[X_{t+1} \mid X_0, \ldots, X_t] = X_t$.
- Well-studied in algorithms: [MotwaniRaghavan, 4.4], [MitzenmacherUpfal, 1].
  - E.g., $X_t$ = expected approximation factor of an algorithm given its first $t$ random choices.
- Usually study "concentration": e.g., whp $X_{\text{final}} \in [X_0 \pm \varepsilon]$.
- Today: polarization: whp $X_{\text{final}} \notin (\varepsilon, 1-\varepsilon)$.
- In the limit: concentration reads $\lim_{t\to\infty} X_t = X_0$; polarization reads $\lim_{t\to\infty} X_t = \mathrm{Bern}(X_0)$.
- Why? Polar codes.

Main Result: Definition and Theorem
- Strong Polarization:
  - Informally: $\Pr[X_t \in (\tau, 1-\tau)] \le \varepsilon$ if $t \ge \max\{O(\log 1/\varepsilon),\ o(\log 1/\tau)\}$.
  - Formally: $\forall \gamma > 0\ \exists \beta < 1, c$ s.t. $\forall t$: $\Pr[X_t \in (\gamma^t, 1-\gamma^t)] \le c \cdot \beta^t$.
- A non-polarizing example: $X_0 = 1/2$; $X_{t+1} = X_t \pm 2^{-(t+2)}$.
- Local Polarization (both definitions qualitative!):
  - Variance in the middle, $X_t \in (\tau, 1-\tau)$: $\forall \tau > 0\ \exists \sigma > 0$ s.t. $\forall t$, $X_t \in (\tau, 1-\tau) \Rightarrow \mathrm{Var}[X_{t+1} \mid X_t] \ge \sigma$.
  - Suction at the ends, $X_t \notin (\tau, 1-\tau)$: $\exists \theta > 0$, $\forall c < \infty$, $\exists \tau > 0$ s.t. $X_t < \tau \Rightarrow \Pr[X_{t+1} < X_t / c] \ge \theta$. This is the low-end condition; a similar condition holds for the high end.
- Theorem: Local Polarization ⇒ Strong Polarization.
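
The contrast between the non-polarizing example and a polarizing martingale can be checked by exact enumeration. The sketch below is illustrative only and is not from the talk: it assumes $X_0 = 1/2$ with step size $2^{-(t+2)}$ for the non-polarizing example, and it uses the update $X \to X^2$ or $2X - X^2$ (each w.p. ½) as an assumed stand-in for a polarizing martingale. All function names are hypothetical.

```python
def step_nonpolarizing(values, t):
    """X_{t+1} = X_t +/- 2^-(t+2), each sign with probability 1/2."""
    d = 2.0 ** -(t + 2)
    return [x + s * d for x in values for s in (-1.0, 1.0)]


def step_erasure(values, t):
    """Assumed polarizing rule: X_{t+1} = X_t^2 or 2*X_t - X_t^2, each w.p. 1/2."""
    return [y for x in values for y in (x * x, 2 * x - x * x)]


def distribution(step, t_max, x0=0.5):
    """Return the 2^t_max equally likely values of X_{t_max}."""
    values = [x0]
    for t in range(t_max):
        values = step(values, t)
    return values


def middle_mass(values, tau):
    """Pr[X_t in (tau, 1 - tau)] under the uniform distribution on `values`."""
    return sum(1 for x in values if tau < x < 1 - tau) / len(values)


if __name__ == "__main__":
    for t in (4, 8, 12, 16):
        non = middle_mass(distribution(step_nonpolarizing, t), 0.1)
        era = middle_mass(distribution(step_erasure, t), 0.1)
        print(f"t={t:2d}  non-polarizing: {non:.3f}  assumed polarizing rule: {era:.3f}")
```

Both processes keep the mean at $X_0$, so both are martingales; only the second pushes mass out of the middle interval as $t$ grows.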

Rest of the talk
- Background/Motivation:
  - The Shannon Capacity Challenge.
  - Resolution by [Arikan '08], [Guruswami-Xia '13], [Hassani, Alishahi, Urbanke '13]: Polar Codes!
- Key ideas: Polar Codes & Arikan-Martingale.
  - Implications of Weak/Strong Polarization.
- Our Contributions: Simplification & Generalization.
  - Local Polarization ⇒ Strong Polarization.
  - Arikan-Martingale locally polarizes (generally).

Shannon and Channel Capacity
- Shannon ['48]: for every (noisy) channel of communication, $\exists$ capacity $C$ s.t. communication at rate $R < C$ is possible and $R > C$ is not.
  - Needs lot of formalization, omitted.
- Example: $X \in \mathbb{F}_2 \to \mathrm{BSC}(p) \to X$ w.p. $1-p$, and $1-X$ w.p. $p$.
  - Acts independently on bits.
  - Capacity $= 1 - H(p)$; $H(p)$ = binary entropy!
  - $H(p) = p \log \frac{1}{p} + (1-p) \log \frac{1}{1-p}$.
- Rate? Encode: $\mathbb{F}_2^{k_n} \to \mathbb{F}_2^{n}$ with $\lim_{n \to \infty} \frac{k_n}{n} \ge R$.
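
As a quick numeric companion to the capacity formula, the sketch below evaluates $H(p)$ and $1 - H(p)$. It assumes logarithms are base 2 (so capacity is measured in bits per channel use), which the slide does not state explicitly; the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """H(p) = p*log2(1/p) + (1-p)*log2(1/(1-p)), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


def bsc_capacity(p):
    """Capacity 1 - H(p) of the binary symmetric channel BSC(p)."""
    return 1.0 - binary_entropy(p)


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.11, 0.25, 0.5):
        print(f"p={p:.2f}  H(p)={binary_entropy(p):.4f}  capacity={bsc_capacity(p):.4f}")
```

At $p = 1/2$ the output bit is independent of the input, and the capacity drops to zero.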

Achieving Channel Capacity
- Commonly used phrase.
- Many mathematical interpretations. Some possibilities:
  - How small can $n$ be?
  - Get $R > C - \varepsilon$ with polytime algorithms? [Luby et al. '95]
  - Make running time $\mathrm{poly}(n/\varepsilon)$? Open till 2008.
- Shannon '48: $n = O\big(1/(C-R)^2\big)$.
- Forney '66: time $= \mathrm{poly}\big(n, 2^{1/\varepsilon^2}\big)$.
- Arikan '08: invented "Polar Codes".
- Final resolution: Guruswami+Xia '13, Hassani+Alishahi+Urbanke '13 (strong analysis).

Polar Codes and Polarization
- Arikan: defined Polar Codes, one for every $t$.
- Associated martingale $X_0, \ldots, X_t, \ldots$
- $t$-th code: length $n = 2^t$.
- If $\Pr[X_t \in (\gamma, 1-\delta)] \le \varepsilon$, then the $t$-th code is $(\varepsilon + \delta)$-close to capacity, and $\Pr[\mathrm{Decode}(\mathrm{Encode}(m) + \mathrm{error}) \ne m] \le n \cdot \gamma$.
- Need $\gamma = o(1/n)$ (strong polarization ⇒ $\gamma = \mathrm{neg}(n)$).
- Need $\varepsilon = 1/n^{\Omega(1)}$ (⇐ strong polarization).

Context for today's talk
- [Arikan '08]: martingale associated to a general class of codes.
  - Martingale polarizes (weakly) implies code achieves capacity (weakly).
  - Martingale polarizes weakly for the entire class.
- [GX '13, HAU '13]: strong polarization for one specific code.
  - Why so specific? Bottleneck: strong polarization.
- Rest of talk: Polar codes, Martingale and Strong Polarization.

Lesson 0: Compression ⇒ Coding
- Linear Compression Scheme: $(H, D)$ form a compression scheme for $\mathrm{Bern}(p)^n$ if
  - $H: \mathbb{F}_2^n \to \mathbb{F}_2^m$ is a linear map, and
  - $\Pr_{Z \sim \mathrm{Bern}(p)^n}[D(H \cdot Z) \ne Z] = o(1)$.
  - Hopefully $m \le (H(p) + \varepsilon) \cdot n$.
- Compression ⇒ Coding:
  - Let $H^{\perp}$ be such that $H \cdot H^{\perp} = 0$.
  - Encoder: $X \mapsto H^{\perp} \cdot X$.
  - Error-Corrector: $Y = H^{\perp} \cdot X + \eta \mapsto Y - D(H \cdot Y)$, which equals $H^{\perp} \cdot X$ w.p. $1 - o(1)$.
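
The reason the error-corrector works can be spelled out in three lines. This is a reconstruction of the implied argument rather than text from the slide, and it assumes the channel noise $\eta$ is distributed as $\mathrm{Bern}(p)^n$:

```latex
\begin{align*}
H \cdot Y &= H H^{\perp} X + H \eta = H \eta
  && \text{since } H \cdot H^{\perp} = 0, \\
D(H \cdot Y) &= D(H \eta) = \eta
  && \text{w.p. } 1 - o(1), \text{ since } \eta \sim \mathrm{Bern}(p)^n, \\
Y - D(H \cdot Y) &= (H^{\perp} X + \eta) - \eta = H^{\perp} X
  && \text{w.p. } 1 - o(1).
\end{align*}
```

Under the further assumption that $H^{\perp}$ is an $n \times (n-m)$ matrix of full column rank, the resulting code has rate $(n-m)/n \ge 1 - H(p) - \varepsilon$ whenever $m \le (H(p) + \varepsilon) \cdot n$, which matches the BSC capacity up to $\varepsilon$.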

Question: How to compress?
- Arikan's key idea:
  - Start with the $2 \times 2$ "Polarization Transform": $(U, V) \to (U + V, V)$.
  - Invertible, so does nothing?
  - If $U, V$ are independent, then
    - $U + V$ is "more random" than either;
    - $V \mid U + V$ is "less random" than either.
  - Iterate (ignoring conditioning).
  - End with bits that are almost random, or almost determined (by others).
  - Output the "totally random part" to get compression!

Information Theory Basics
- Entropy: for a random variable $X \in \Omega$,
  - $H(X) = \sum_{x \in \Omega} p_x \log \frac{1}{p_x}$, where $p_x = \Pr[X = x]$.
  - Bounds: $0 \le H(X) \le \log |\Omega|$.
- Conditional Entropy: for jointly distributed $(X, Y)$,
  - $H(X \mid Y) = \sum_y p_y \, H(X \mid Y = y)$.
- Chain Rule: $H(X, Y) = H(X) + H(Y \mid X)$.
- Conditioning does not increase entropy: $H(X \mid Y) \le H(X)$.
- Equality ⇔ Independence: $H(X \mid Y) = H(X) \Leftrightarrow X \perp Y$.
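
These definitions are easy to check by brute force on the $(U, V) \to (U + V, V)$ transform. The sketch below is illustrative and uses hypothetical helper names; it assumes base-2 logarithms and represents a joint distribution as a dict from outcome tuples to probabilities.

```python
import math
from itertools import product


def entropy(pmf):
    """H(X) = sum_x p_x * log2(1/p_x) for a dict {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in pmf.values() if p > 0)


def marginal(joint, idx):
    """Marginal pmf of coordinate `idx` of a joint pmf over tuples."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[idx]] = out.get(outcome[idx], 0.0) + p
    return out


def conditional_entropy(joint, target, given):
    """H(target | given) = sum_y p_y * H(target | given = y)."""
    total = 0.0
    for y, py in marginal(joint, given).items():
        if py == 0:
            continue
        cond = {}
        for outcome, p in joint.items():
            if outcome[given] == y:
                cond[outcome[target]] = cond.get(outcome[target], 0.0) + p / py
        total += py * entropy(cond)
    return total


def polar_pair_joint(p):
    """Joint pmf of (U + V mod 2, V) for independent U, V ~ Bern(p)."""
    joint = {}
    for u, v in product((0, 1), repeat=2):
        pr = (p if u else 1 - p) * (p if v else 1 - p)
        key = ((u + v) % 2, v)
        joint[key] = joint.get(key, 0.0) + pr
    return joint


if __name__ == "__main__":
    joint = polar_pair_joint(0.1)
    h_sum = entropy(marginal(joint, 0))            # H(U+V)
    h_v_given = conditional_entropy(joint, 1, 0)   # H(V | U+V)
    print(h_sum, h_v_given, entropy(joint))        # chain rule: first two sum to the third
```

The chain rule shows up as $H(U+V) + H(V \mid U+V) = H(U+V, V)$, and invertibility of the transform makes that total equal to $H(U, V)$.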

Iteration by Example
- $(A, B, C, D) \to (A+B+C+D,\ B+D,\ C+D,\ D)$
- $(E, F, G, H) \to (E+F+G+H,\ F+H,\ G+H,\ H)$
- Conditional entropies to track (here $H(\cdot \mid \cdot)$ denotes entropy, not the variable $H$):
  - For $B+D$: $H(B+D \mid A+B+C+D)$.
  - For $B+D+F+H$: $H(B+D+F+H \mid A+B+C+D,\ E+F+G+H)$.
  - For $F+H$: $H(F+H \mid E+F+G+H)$.
  - For $F+H$: $H(F+H \mid A+B+C+D,\ E+F+G+H,\ B+D+F+H)$.

The Polarization Butterfly
- $P_n(U, V) = \big(P_{n/2}(U + V),\ P_{n/2}(V)\big)$, with $U = (Z_1, \ldots, Z_{n/2})$ and $V = (Z_{n/2+1}, \ldots, Z_n)$.
- [Figure: butterfly network. Inputs $Z_1, \ldots, Z_n$; first layer $Z_1 + Z_{n/2+1}, \ldots, Z_{n/2} + Z_n$ and $Z_{n/2+1}, \ldots, Z_n$; outputs $W_1, \ldots, W_n$.]
- Polarization ⇒ $\exists S \subseteq [n]$ with $H(W_{[n] \setminus S} \mid W_S) \to 0$ and $|S| \le (H(p) + \varepsilon) \cdot n$.
- Compression: $E(Z) = P_n(Z)|_S$.
- Encoding time $= O(n \log n)$ (given $S$).
- To be shown:
  1. Decoding $= O(n \log n)$.
  2. Polarization!
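
The recursion $P_n(U, V) = (P_{n/2}(U+V), P_{n/2}(V))$ translates directly into code. The following is a minimal sketch with a hypothetical function name; it assumes the input length is a power of two, bits are 0/1 integers, and $P_1$ is the identity.

```python
def polar_transform(z):
    """Apply P_n(U, V) = (P_{n/2}(U + V), P_{n/2}(V)) over GF(2); P_1 is the identity."""
    n = len(z)
    assert n >= 1 and n & (n - 1) == 0, "length must be a power of two"
    if n == 1:
        return list(z)
    half = n // 2
    u, v = z[:half], z[half:]
    u_plus_v = [(a + b) % 2 for a, b in zip(u, v)]
    # Two recursive calls on half-size inputs plus O(n) additions: O(n log n) total.
    return polar_transform(u_plus_v) + polar_transform(v)


if __name__ == "__main__":
    # Unit vectors for A, B, C, D reproduce (A+B+C+D, B+D, C+D, D).
    for name, z in zip("ABCD", ([1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1])):
        print(name, polar_transform(z))
```

Over GF(2) this map is its own inverse, so applying it twice returns the input; compression then keeps only the coordinates indexed by $S$.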

The Polarization Butterfly: Decoding
- [Figure: the same butterfly network, inputs $Z_1, \ldots, Z_n$ and outputs $W_1, \ldots, W_n$.]
- Decoding idea: given $S$ and $W_S = P_n(U, V)|_S$:
  - Compute $U + V \sim \mathrm{Bern}(2p - 2p^2)$.
  - Determine $V \sim\ ?$ Answer: $V \mid U + V \sim \prod_i \mathrm{Bern}(r_i)$.
- Key idea: stronger induction! Work with a non-identical product distribution.
- Decoding Algorithm: given $S$, $W_S = P_n(U, V)|_S$, and $p_1, \ldots, p_n$:
  - Compute $q_1, \ldots, q_{n/2} \leftarrow p_1, \ldots, p_n$.
  - Determine $U + V = D(W_{S^+}, q_1, \ldots, q_{n/2})$.
  - Compute $r_1, \ldots, r_{n/2} \leftarrow p_1, \ldots, p_n;\ U + V$.
  - Determine $V = D(W_{S^-}, r_1, \ldots, r_{n/2})$.
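
The recursive structure of the decoding algorithm can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: the formulas for $q_i$ and $r_i$ are the standard ones for independent $\mathrm{Bern}(p_i)$ bits, the base case (guess the more likely value when the bit is not in $S$) is an assumption, and all names are hypothetical. `known[i]` marks whether output coordinate $i$ belongs to $S$.

```python
def xor_bias(a, b):
    """q = Pr[U + V = 1] for independent U ~ Bern(a), V ~ Bern(b)."""
    return a * (1 - b) + b * (1 - a)


def posterior_bias(a, b, s):
    """r = Pr[V = 1 | U + V = s] for independent U ~ Bern(a), V ~ Bern(b)."""
    if s == 0:
        num, den = a * b, a * b + (1 - a) * (1 - b)
    else:
        num, den = (1 - a) * b, (1 - a) * b + a * (1 - b)
    return num / den if den > 0 else 0.5  # arbitrary value on a probability-zero event


def decode(w, known, p):
    """Estimate Z from the known coordinates of W = P_n(Z), where Z_i ~ Bern(p[i])."""
    n = len(w)
    if n == 1:
        # Assumed base case: use the bit if it is in S, else guess the likelier value.
        return [w[0]] if known[0] else [1 if p[0] > 0.5 else 0]
    half = n // 2
    pu, pv = p[:half], p[half:]
    q = [xor_bias(a, b) for a, b in zip(pu, pv)]
    s_hat = decode(w[:half], known[:half], q)        # estimate of U + V
    r = [posterior_bias(a, b, s) for a, b, s in zip(pu, pv, s_hat)]
    v_hat = decode(w[half:], known[half:], r)        # estimate of V
    u_hat = [(s + v) % 2 for s, v in zip(s_hat, v_hat)]
    return u_hat + v_hat


if __name__ == "__main__":
    # W = (A+B+C+D, B+D, C+D, D) for Z = (1, 0, 1, 1), with every coordinate known.
    print(decode([1, 1, 0, 1], [True] * 4, [0.1] * 4))
```

Each level does $O(n)$ work and makes two half-size recursive calls, giving the $O(n \log n)$ running time claimed for decoding.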

Polarization? Martingale?
- [Figure: butterfly network with intermediate layers $Z_{i,0}, Z_{i,1}, \ldots, Z_{i,t}, \ldots, Z_{i,\log n} = W_i$.]
- Idea: follow $X_t = H(Z_{i,t} \mid Z_{<i,t})$ for random $i$.
- Martingale?
  - One step maps the pair $(Z_{j,t}, Z_{k,t})$ to $(Z_{j,t+1}, Z_{k,t+1})$.
  - $X_t$ can't distinguish $i = j$ from $i = k$.
  - $H(Z_{j,t}, Z_{k,t}) = H(Z_{j,t+1}, Z_{k,t+1})$.
  - Chain Rule ⇒ $\mathbb{E}[X_{t+1} \mid X_t] = X_t$.
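
The chain-rule step can be written out in a simplified setting. The derivation below is a sketch that ignores the conditioning on earlier coordinates: it assumes $A = Z_{j,t}$ and $B = Z_{k,t}$ are independent with $H(A) = H(B) = X_t$, that the step maps them to $A' = A + B$ and $B' = B$, and that $i = j$ and $i = k$ are equally likely.

```latex
\begin{align*}
\mathbb{E}[X_{t+1} \mid X_t]
  &= \tfrac{1}{2} H(A') + \tfrac{1}{2} H(B' \mid A') \\
  &= \tfrac{1}{2} H(A', B')
     && \text{(chain rule)} \\
  &= \tfrac{1}{2} H(A, B)
     && \text{(the map } (A, B) \mapsto (A', B') \text{ is invertible)} \\
  &= \tfrac{1}{2} \big( H(A) + H(B) \big)
     && \text{(independence)} \\
  &= X_t .
\end{align*}
```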

Remaining Tasks
1. Prove that $X_t \triangleq H(Z_{i,t} \mid Z_{<i,t})$ polarizes locally.
2. Prove that Local Polarization ⇒ Strong Polarization.
- Recall Strong Polarization:
  - Informally: $\Pr[X_t \in (\tau, 1-\tau)] \le \varepsilon$ if $t \ge \max\{O(\log 1/\varepsilon),\ o(\log 1/\tau)\}$.
  - Formally: $\forall \gamma > 0\ \exists \beta < 1, c$ s.t. $\forall t$: $\Pr[X_t \in (\gamma^t, 1-\gamma^t)] \le c \cdot \beta^t$.
- Recall Local Polarization:
  - Variance in the middle, $X_t \in (\tau, 1-\tau)$: $\forall \tau > 0\ \exists \sigma > 0$ s.t. $\forall t$, $X_t \in (\tau, 1-\tau) \Rightarrow \mathrm{Var}[X_{t+1} \mid X_t] \ge \sigma$.
  - Suction at the ends, $X_t \notin (\tau, 1-\tau)$: $\exists \theta > 0$, $\forall c < \infty$, $\exists \tau > 0$ s.t. $X_t < \tau \Rightarrow \Pr[X_{t+1} < X_t / c] \ge \theta$.

Local ⇒ (Global) Strong Polarization
- Step 1: Medium Polarization: $\exists \alpha, \beta < 1$ s.t. $\Pr[X_t \in (\beta^t, 1-\beta^t)] \le \alpha^t$.
- Proof:
  - Potential $\Phi_t = \mathbb{E}\big[\sqrt{\min\{X_t, 1-X_t\}}\big]$.
  - Variance + Suction ⇒ $\exists \eta < 1$ s.t. $\Phi_t \le \eta \cdot \Phi_{t-1}$.
  - (Aside: why $\sqrt{X_t}$? Clearly $X_t$ won't work. And $X_t^2$ increases!)
  - $\Phi_t \le \eta^t$ ⇒ $\Pr[X_t \in (\beta^t, 1-\beta^t)] \le \big(\eta / \sqrt{\beta}\big)^t$.
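
The last implication in Step 1 is an application of Markov's inequality to the nonnegative variable $\sqrt{\min\{X_t, 1-X_t\}}$. The following spells it out as a reconstruction of the implied step; the choice $\beta > \eta^2$ is an assumption needed to make the resulting base smaller than 1.

```latex
\begin{align*}
\Pr\big[X_t \in (\beta^t, 1-\beta^t)\big]
  &= \Pr\big[\min\{X_t, 1-X_t\} > \beta^t\big] \\
  &= \Pr\Big[\sqrt{\min\{X_t, 1-X_t\}} > \beta^{t/2}\Big] \\
  &\le \frac{\Phi_t}{\beta^{t/2}}
     && \text{(Markov)} \\
  &\le \left(\frac{\eta}{\sqrt{\beta}}\right)^{t}
     && \text{(since } \Phi_t \le \eta^t \text{)}.
\end{align*}
```

Taking any $\beta \in (\eta^2, 1)$ gives $\alpha = \eta / \sqrt{\beta} < 1$, which is the medium polarization bound.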

Local ⇒ (Global) Strong Polarization
- Step 1: Medium Polarization: $\exists \alpha, \beta < 1$ s.t. $\Pr[X_t \in (\beta^t, 1-\beta^t)] \le \alpha^t$.
- Step 2: Medium Polarization + $t$ more steps ⇒ Strong Polarization.
- Proof:
  - Assume $X_0 \le \alpha^t$; want $X_t \le \gamma^t$ w.p. $1 - O(\alpha_1^t)$.
  - Pick $c$ large and $\tau$ s.t. $\Pr[X_i < X_{i-1}/c \mid X_{i-1} \le \tau] \ge \theta$.
  - 2.1: $\Pr[\exists i \text{ s.t. } X_i \ge \tau] \le \tau^{-1} \cdot \alpha^t$ (Doob's Inequality).
  - 2.2: Let $Y_i = \log X_i$. Assuming $X_i \le \tau$ for all $i$:
    - $Y_i - Y_{i-1} \le -\log c$ w.p. $\ge \theta$, and $Y_i - Y_{i-1} \ge k$ w.p. $\le 2^{-k}$.
    - $\mathbb{E}[Y_t] \le t\,(-\theta \log c + 1) \le 2t \log \gamma$.
    - Concentration! $\Pr[Y_t \ge t \log \gamma] \le \alpha_2^t$. QED!
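
The two tail bounds used in Step 2 both come from the martingale property. The sketch below reconstructs them under the assumption that logarithms are base 2 and that $X_i$ is a nonnegative martingale; it bounds the per-step drift by an absolute constant rather than tracking the exact additive term.

```latex
\begin{align*}
\Pr\Big[\max_{i \le t} X_i \ge \tau\Big]
  &\le \frac{\mathbb{E}[X_0]}{\tau} \le \tau^{-1} \alpha^t
  && \text{(Doob's maximal inequality)}, \\
\Pr\big[Y_i - Y_{i-1} \ge k \mid X_{i-1}\big]
  &= \Pr\big[X_i \ge 2^{k} X_{i-1} \mid X_{i-1}\big] \le 2^{-k}
  && \text{(Markov, since } \mathbb{E}[X_i \mid X_{i-1}] = X_{i-1}\text{)}.
\end{align*}
% Drift of D = Y_i - Y_{i-1}: use D <= D^+ - (log c) * 1[D <= -log c].
\begin{align*}
\mathbb{E}[D]
  \le \int_0^{\infty} \Pr[D \ge s]\, ds \;-\; \theta \log c
  \le \int_0^{\infty} 2^{-s}\, ds \;-\; \theta \log c
  = \frac{1}{\ln 2} - \theta \log c .
\end{align*}
```

So the expected one-step change of $Y_i$ is at most $O(1) - \theta \log c$, which is as negative as desired once $c$ is chosen large enough relative to $1/\theta$ and $\log(1/\gamma)$.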

Polarization of Arikan Martingale
- Ignoring conditioning: $X_t = H(p)$ ⇒ $X_{t+1} = H(2p - 2p^2)$ w.p. ½, and $X_{t+1} = 2H(p) - H(2p - 2p^2)$ w.p. ½.
- Variance in the middle: continuity of $H(\cdot)$ and separation of $2p - 2p^2$ from $p$.
- Suction at high end: $X_t = 1 - \varepsilon$ ⇒ $X_{t+1} = 1 - O(\varepsilon^2)$ w.p. ½.
- Suction at low end: $X_t = H(p)$ ⇒ $X_{t+1} \approx H(p) / \log(1/p)$ w.p. ½.
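
The one-step behaviour described on this slide can be checked numerically. The sketch below is illustrative, with hypothetical function names and base-2 logarithms assumed; it computes the two possible values of $X_{t+1}$ in the "ignoring conditioning" model and leaves the comparison with the suction claims to the printed table.

```python
import math


def h(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


def arikan_step(p):
    """Return (X_t, up, down): X_t = H(p) and the two equally likely values of X_{t+1}."""
    q = 2 * p - 2 * p * p        # bias of U + V for independent U, V ~ Bern(p)
    x = h(p)
    up = h(q)                    # H(U + V)
    down = 2 * x - up            # H(V | U + V), by the chain rule
    return x, up, down


if __name__ == "__main__":
    for p in (1e-6, 1e-3, 0.11, 0.4, 0.49):
        x, up, down = arikan_step(p)
        print(f"p={p:g}  X_t={x:.6g}  up={up:.6g}  down={down:.6g}  down/X_t={down / x:.4f}")
```

The average of the two values equals $X_t$ (the martingale property); for small $p$ the lower value is a small fraction of $X_t$, and for $p$ near ½ the upper value is much closer to 1 than $X_t$ is.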

General Strong Polarization?
- So far: $P_n = H_2^{\otimes \ell}$ for $H_2 = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$.
- What about other matrices? E.g., $P_n = H_3^{\otimes \ell}$ for $H_3 = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$. Open till our work!
- Necessary conditions: $H$ invertible, not lower-triangular.
- Our work: necessary conditions suffice (for local polarization)!
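
The two necessary conditions are mechanical to test for a given kernel. The sketch below is illustrative with hypothetical helper names: it takes the matrix entries as they read row by row on the slide, implements "not lower-triangular" literally as stated there, and builds Kronecker powers over GF(2) with plain Python lists.

```python
def is_invertible_gf2(m):
    """Gaussian elimination over GF(2); True iff the square matrix has full rank."""
    rows = [[x % 2 for x in r] for r in m]
    n = len(rows)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            return False
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][col]:
                rows[r] = [(a + b) % 2 for a, b in zip(rows[r], rows[col])]
    return True


def is_lower_triangular(m):
    """True iff every entry strictly above the diagonal is zero."""
    return all(m[i][j] % 2 == 0 for i in range(len(m)) for j in range(i + 1, len(m)))


def kron_gf2(a, b):
    """Kronecker product of two matrices with entries reduced mod 2."""
    return [[(x * y) % 2 for x in row_a for y in row_b] for row_a in a for row_b in b]


def satisfies_necessary_conditions(m):
    return is_invertible_gf2(m) and not is_lower_triangular(m)


H2 = [[1, 1], [1, 0]]
H3 = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]

if __name__ == "__main__":
    print(satisfies_necessary_conditions(H2), satisfies_necessary_conditions(H3))
    print(kron_gf2(H2, H2))  # the 4 x 4 matrix for l = 2
```

A Kronecker power of an invertible kernel stays invertible, so the conditions only need to be checked on the small kernel itself.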

Conclusion
- Simple general analysis of strong polarization.
- But the main reason to study general polarization: maybe some matrices polarize even faster!
- This analysis does not get us there.
- But hopefully following the (simpler) steps in specific cases can lead to improvements.

Thank You!