General Strong Polarization
Madhu Sudan (Harvard University)
Joint work with Jaroslaw Blasiok (Harvard), Venkatesan Guruswami (CMU), Preetum Nakkiran (Harvard) and Atri Rudra (Buffalo)
Oct. 5, 2018, Stanford: General Strong Polarization (20 slides)
This Talk: Martingales and Polarization

A [0,1]-martingale is a sequence of random variables X_0, X_1, X_2, ... with X_t in [0,1] and E[X_{t+1} | X_0, ..., X_t] = X_t.
Well studied in algorithms: [Motwani-Raghavan, Ch. 4.4], [Mitzenmacher-Upfal, Ch. 12].
E.g., X_t = expected approximation factor of an algorithm given its first t random choices.
Usually one studies concentration: e.g., whp X_final in [X_0 ± σ] (heuristically, lim_t X_t behaves like N(X_0, σ^2)).
Today, polarization: whp X_final is outside (ε, 1 - ε) (lim_t X_t = Bern(X_0)).
Why? Polar codes.
Some examples

1. X_{t+1} = X_t + 4^{-t} w.p. 1/2, X_t - 4^{-t} w.p. 1/2. Converges! (The increments are absolutely summable.)
2. X_{t+1} = X_t + 2^{-t} w.p. 1/2, X_t - 2^{-t} w.p. 1/2. Uniform on [0,1].
3. If X_t <= 1/2: X_{t+1} = (1/2) X_t w.p. 1/2, (3/2) X_t w.p. 1/2 (symmetrically in 1 - X_t if X_t >= 1/2). Polarizes (weakly)!
4. If X_t <= 1/2: X_{t+1} = X_t^2 w.p. 1/2, 2 X_t - X_t^2 w.p. 1/2 (symmetrically if X_t >= 1/2). Polarizes (strongly)!
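These dynamics are easy to simulate. The sketch below (my own illustrative code, not from the talk) runs the strongly polarizing rule X_{t+1} in {X_t^2, 2X_t - X_t^2} from X_0 = 1/2 and checks that nearly all trajectories end extremely close to 0 or 1:

```python
import random

def step_converging(x, t):
    # X_{t+1} = X_t +/- 4^{-t}, each w.p. 1/2 (increments are absolutely summable)
    d = 4.0 ** (-t)
    return x + d if random.random() < 0.5 else x - d

def step_strong(x, t):
    # X_{t+1} = X_t^2 or 2*X_t - X_t^2, each w.p. 1/2.
    # The mean is X_t, so this is a martingale; squaring gives "suction" near 0.
    if x > 0.5:  # apply the rule symmetrically in 1 - X_t
        return 1.0 - step_strong(1.0 - x, t)
    return x * x if random.random() < 0.5 else 2 * x - x * x

def run(step, x0=0.5, T=60):
    x = x0
    for t in range(1, T + 1):
        x = step(x, t)
    return x

random.seed(1)
finals = [run(step_strong) for _ in range(2000)]
frac_polarized = sum(1 for x in finals if min(x, 1 - x) < 1e-9) / len(finals)
print(frac_polarized)  # most of the mass is already within 1e-9 of {0, 1}
```

By contrast, `step_converging` trajectories stay within 1/3 of X_0 forever, since the increments sum to at most 1/3.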
Main Result: Definition and Theorem

Strong polarization (informally): Pr[X_t in (τ, 1 - τ)] <= ε provided t >= max{O(log 1/ε), o(log 1/τ)}.
Formally: for all γ > 0 there exist β < 1 and c s.t. for all t, Pr[X_t in (γ^t, 1 - γ^t)] <= c·β^t.

Local polarization (note: both conditions are qualitative!):
- Variance in the middle: for all τ > 0 there exists σ > 0 s.t. for all t, X_t in (τ, 1 - τ) implies Var[X_{t+1} | X_t] >= σ.
- Suction at the ends: there exists θ > 0 s.t. for all c < ∞ there exists τ > 0 s.t. X_t < τ implies Pr[X_{t+1} < X_t / c] >= θ. (This is the low-end condition; a similar condition holds at the high end.)

Theorem: Local polarization implies strong polarization.
Proof (idea)

Step 1: The potential Φ_t = min{√X_t, √(1 - X_t)} decreases by a constant factor in expectation at each step.
So E[Φ_T] <= exp(-T); by Markov, Pr[X_T in (exp(-T), 1 - exp(-T))] <= exp(-T/2).
Step 2: Over the next T time steps, X_t plummets whp (say from the low end):
2.1: If X_t <= τ then X_{t+1} <= X_t / 100 w.p. >= 1/2.
2.2: Pr[there is t in [T, 2T] s.t. X_t > √τ] <= X_T / √τ [Doob].
2.3: If the above doesn't happen, at least half of the T steps shrink X_t by a factor 100, so X_{2T} <= 5^{-T} whp. QED
Rest of the talk

Background/Motivation: the Shannon capacity challenge; its resolution by [Arikan '08], [Guruswami-Xia '13], [Hassani-Alishahi-Urbanke '13]: polar codes!
Key ideas: polar codes and the Arikan martingale; implications of weak/strong polarization.
Our contributions: simplification and generalization.
- Local polarization implies strong polarization.
- The Arikan martingale polarizes locally (in general settings).
Shannon and Channel Capacity

Shannon ['48]: every (noisy) channel of communication has a capacity C s.t. communication at rate R < C is possible and at R > C it is not. (Needs a lot of formalization, omitted here.)
Example: the binary symmetric channel BSC(p) acts independently on bits: X stays X w.p. 1 - p and flips to 1 - X w.p. p.
Encoding: E: F_2^k -> F_2^n; Rate R = lim_n k/n.
Capacity = 1 - H(p), where H(p) = p log(1/p) + (1 - p) log(1/(1 - p)) is the binary entropy function.
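The capacity formula is easy to check numerically (a minimal sketch; the function names are mine):

```python
import math

def H(p):
    """Binary entropy: H(p) = p*log2(1/p) + (1-p)*log2(1/(1-p))."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

def bsc_capacity(p):
    # Shannon capacity of BSC(p) is 1 - H(p)
    return 1 - H(p)

print(bsc_capacity(0.11))  # about 0.5: BSC(0.11) supports rates just below 1/2
```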
Achieving Channel Capacity

"Achieving capacity" is a commonly used phrase with many mathematical interpretations. Some possibilities:
- How small can n be? Shannon ['48]: n = O(1/(C - R)^2).
- Get R > C - ε with poly-time algorithms? Forney ['66]: time = poly(n, 2^{1/ε}).
- Make the running time poly(n/ε)? Open till 2008.
Arikan ['08]: invented polar codes. Final resolution (strong analysis): Guruswami-Xia ['13], Hassani-Alishahi-Urbanke ['13].
Polar Codes and Polarization

Arikan: defined polar codes, one for every t, with an associated martingale X_0, ..., X_t, ...
The t-th code has length n = 2^t.
If Pr[X_t in (γ, 1 - δ)] <= ε, then the t-th code is (ε + δ)-close to capacity, and Pr[Decode(Encode(m) + error) != m] <= n·γ.
Need γ = o(1/n^2) (strong polarization gives γ = negl(n)).
Need ε = 1/n^{Ω(1)} (this is what strong polarization provides).
Context for today's talk

[Arikan '08]: a martingale associated to a general class of codes. If the martingale polarizes (weakly), the code achieves capacity (weakly); and the martingale polarizes weakly for the entire class.
[GX '13, HAU '13]: strong polarization, but for one specific code. Why so specific? Bottleneck: proving strong polarization.
Rest of the talk: polar codes, the martingale, and strong polarization.
Lesson 0: Compression implies Coding

Defn: (H, D) is a linear compression scheme for bern(p)^n if
- H: F_2^n -> F_2^m is a linear map, and
- Pr_{Z ~ bern(p)^n}[D(H·Z) != Z] = o(1).
Hopefully m <= (H(p) + ε)·n.
Compression implies Coding: let H^⊥ be such that H·H^⊥ = 0.
Encoder: X -> H^⊥·X. Error-corrector: on receiving Y = H^⊥·X + Z, output Y - D(H·Y); since H·Y = H·H^⊥·X + H·Z = H·Z, we get D(H·Y) = Z w.p. 1 - o(1), and then Y - D(H·Y) = H^⊥·X.
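A toy instance of this reduction over F_2 (my own illustrative example, not from the talk): take H to be the 2x3 syndrome map of the 3-bit repetition code, so H·H^⊥ = 0 for the repetition encoder H^⊥, and let D return the lowest-weight preimage of a syndrome.

```python
import itertools

# H: F_2^3 -> F_2^2 compresses a sparse noise vector Z to its syndrome;
# the repetition encoder H_perp: b -> (b, b, b) satisfies H · H_perp = 0.
H = [[1, 1, 0],
     [0, 1, 1]]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) % 2 for row in M]

def D(s):
    """Decompressor: lowest-weight Z with H·Z = s (correct whp for Z ~ bern(p)^3, p small)."""
    return min((list(z) for z in itertools.product([0, 1], repeat=3)
                if mat_vec(H, list(z)) == s), key=sum)

# Coding from compression: on Y = codeword + Z, the syndrome H·Y equals H·Z
# (the codeword is annihilated), so D recovers Z and Y - Z recovers the codeword.
codeword = [1, 1, 1]           # H_perp applied to the bit 1
Z = [0, 1, 0]                  # one bit flipped by the channel
Y = [(c + z) % 2 for c, z in zip(codeword, Z)]
recovered = [(y + z) % 2 for y, z in zip(Y, D(mat_vec(H, Y)))]
assert recovered == codeword
```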
Question: How to compress?

Arikan's key idea: start with the polarization transform (U, V) -> (U + V, V).
Invertible, so it does nothing? But if U, V are independent, then:
- U + V is more random than either;
- V given U + V is less random than either.
Iterate (ignoring conditioning). End with bits that are almost totally random, or almost determined (by the others). Output the totally random part to get compression!
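Both directions can be checked numerically (a sketch; for U, V i.i.d. Bern(p), the sum U + V is Bern(2p(1-p)), and since the transform is invertible the chain rule gives H(V | U+V) = 2H(p) - H(U+V)):

```python
import math

def H(p):
    # binary entropy in bits
    return 0.0 if p <= 0 or p >= 1 else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.11
q = 2 * p * (1 - p)            # U + V ~ Bern(2p(1-p)) for U, V i.i.d. Bern(p)
h_plus = H(q)                  # entropy of U + V: larger than H(p)
h_minus = 2 * H(p) - h_plus    # H(V | U + V), by the chain rule (the map is invertible)
print(H(p), h_plus, h_minus)   # the middle value moved up, the last moved down
```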
Iteration elaborated: a martingale

X_t = conditional entropy of a random output bit after t iterations.
[Figure: three levels of the transform applied to eight inputs A, ..., H, producing outputs such as A+B+C+D, C+D, B+D, ..., E+F+G+H, ..., H.]
The Polarization Butterfly

P_n(U, V) = (P_{n/2}(U + V), P_{n/2}(V)).
[Figure: butterfly network mapping inputs Z_1, ..., Z_n to outputs W_1, ..., W_n.]
Compression: E(Z) = P_n(Z)_S, where S is the set of high-entropy output coordinates, |S| <= (H(p) + ε)·n.
Encoding time = O(n log n) (given S).
To be shown: 1. Decoding in O(n log n). 2. Polarization!
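The recursion P_n(U, V) = (P_{n/2}(U + V), P_{n/2}(V)) can be sketched directly over bit lists (a minimal illustrative implementation, not Arikan's actual code; two half-size calls plus O(n) XORs give the O(n log n) bound):

```python
def polarize(z):
    """P_n over F_2 on a list of n bits, n a power of 2."""
    n = len(z)
    if n == 1:
        return z[:]
    u, v = z[:n // 2], z[n // 2:]
    s = [ui ^ vi for ui, vi in zip(u, v)]  # U + V coordinatewise over F_2
    return polarize(s) + polarize(v)

def unpolarize(w):
    """Inverse: recover (U, V) from (P_{n/2}(U+V), P_{n/2}(V))."""
    n = len(w)
    if n == 1:
        return w[:]
    s = unpolarize(w[:n // 2])   # U + V
    v = unpolarize(w[n // 2:])   # V
    u = [si ^ vi for si, vi in zip(s, v)]
    return u + v

z = [1, 0, 1, 1, 0, 0, 1, 0]
assert unpolarize(polarize(z)) == z  # the transform is invertible
```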
The Polarization Butterfly: Decoding

Decoding idea: given S and W = P_n(U, V)_S:
- Compute U + V ~ bern(2p(1 - p))^{n/2} and determine it recursively.
- Then (V | U + V)_i ~ Bern(r_i); determine V recursively.
Key idea: a stronger induction, over non-identical product distributions!
Decoding algorithm: given S, W = P_n(U, V)_S, and p_1, ..., p_n:
- Compute q_1, ..., q_{n/2} from p_1, ..., p_n.
- Determine U + V = D(W_{S^+}, q_1, ..., q_{n/2}).
- Compute r_1, ..., r_{n/2} from p_1, ..., p_n and U + V.
- Determine V = D(W_{S^-}, r_1, ..., r_{n/2}).
Polarization Martingale

Let Z_{i,t} denote the i-th bit after t levels of the butterfly (so Z_{i,0}, Z_{i,1}, ..., Z_{i,log n}).
Idea: follow X_t = H(Z_{i,t} | Z_{<i,t}) for a random index i.
X_t is a martingale! Does it polarize? Strongly? Locally?
Local Polarization of the Arikan Martingale

Variance in the middle: roughly, (H(p), H(p)) -> (H(2p(1 - p)), 2H(p) - H(2p(1 - p))).
For p in (τ, 1 - τ): 2p(1 - p) is far from p, so by continuity of H, H(2p(1 - p)) is far from H(p).
Suction, high end: p = 1/2 - γ has H(p) = 1 - Θ(γ^2), while 2p(1 - p) = 1/2 - 2γ^2 has H(2p(1 - p)) = 1 - Θ(γ^4).
Suction, low end: for small p, H(p) ≈ p log(1/p) and H(2p(1 - p)) ≈ 2p log(1/p) ≈ 2H(p), so the other child 2H(p) - H(2p(1 - p)) is much smaller than H(p).
Dealing with the conditioning takes more work (lots of Markov).
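These asymptotics are easy to sanity-check numerically (my own sketch; H is the binary entropy):

```python
import math

def H(p):
    # binary entropy in bits
    return 0.0 if p <= 0 or p >= 1 else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# High end: p = 1/2 - gamma has entropy 1 - Theta(gamma^2), while the "up" child
# 2p(1-p) = 1/2 - 2*gamma^2 has entropy 1 - Theta(gamma^4): a big jump toward 1.
gamma = 1e-3
p = 0.5 - gamma
gap_parent = 1 - H(p)
gap_child = 1 - H(2 * p * (1 - p))
print(gap_parent, gap_child)  # the child's gap is smaller by orders of magnitude

# Low end: for small p, H(2p(1-p)) is close to 2*H(p), so the "down" child
# 2*H(p) - H(2p(1-p)) sits far below H(p): a big jump toward 0.
p = 1e-4
low_child = 2 * H(p) - H(2 * p * (1 - p))
print(low_child / H(p))  # well below 1
```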
New contributions

So far: reproduced old work (with simpler proofs?).
Main new technical contributions:
- Strong polarization of general transforms, e.g. (U, V, W) -> (U + V, V, W)?
- Exponentially strong polarization [BGS '18]: suction at the low end is very strong; a random k x k matrix yields X_{t+1} <= X_t^{k^{0.99}} whp.
- Strong polarization of Markovian sources [GNS '19?]: separates compression of known sources from unknown ones.
Conclusions

Polarization: an important phenomenon associated with bounded random variables. Not always bad; needs better understanding!
Recently also used in CS: [Lovett-Meka] (implicitly), [Chattopadhyay-Hatami-Hosseini-Lovett] (explicitly).
Thank You!