General Strong Polarization
Madhu Sudan (Harvard University)
Joint work with Jaroslaw Blasiok (Harvard), Venkatesan Guruswami (CMU), Preetum Nakkiran (Harvard), and Atri Rudra (Buffalo)
December 4, 2017, IAS
Shannon and Channel Capacity
Shannon ['48]: For every (noisy) channel of communication, there is a capacity C such that communication at rate R < C is possible and at rate R > C is not. (Lots of formalization needed; omitted here.)
Example: the binary symmetric channel BSC(p). An input X ∈ F_2 arrives as X w.p. 1-p and as 1-X w.p. p; the channel acts independently on bits.
Capacity = 1 - H(p), where H(p) = p log(1/p) + (1-p) log(1/(1-p)) is the binary entropy function.
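To make the capacity formula concrete, here is a minimal sketch in Python (the function names binary_entropy and bsc_capacity are mine, not from the talk):

```python
import math

def binary_entropy(p):
    """H(p) = p*log2(1/p) + (1-p)*log2(1/(1-p)), with H(0) = H(1) = 0."""
    if p <= 0 or p >= 1:
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

def bsc_capacity(p):
    """Shannon capacity of the binary symmetric channel BSC(p)."""
    return 1 - binary_entropy(p)

print(binary_entropy(0.5))           # 1.0: a fair coin carries a full bit of entropy
print(round(bsc_capacity(0.11), 3))  # at ~11% crossover, about half of each bit is useful
```

Note that H(0.11) ≈ 0.5, which is why p = 0.11 is a popular benchmark: the channel destroys roughly half of each transmitted bit.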
Achieving Channel Capacity
"Achieving capacity" is a commonly used phrase with many mathematical interpretations. Some possibilities:
- How small can the block length n be? Shannon ['48]: n = O(1/(C - R)^2).
- Rate R > C - ε with poly-time algorithms? Forney ['66]: time = poly(n, 2^{1/ε^2}).
- Running time poly(1/ε)? Achieved for erasure channels by [Luby et al. '95]; open in general until 2008.
- Arikan ['08]: invented polar codes. Final resolution by Guruswami-Xia ['13] and Hassani-Alishahi-Urbanke ['13]: a strong analysis of polar codes.
Context for today's talk
[Arikan '08] gives a general class of codes; [GX '13, HAU '13] analyze one specific code within it. Why only one?
Bottleneck: strong polarization. Arikan et al. show that the general class polarizes; [GX, HAU] show that the specific code polarizes strongly.
This work/talk: introduce local polarization, and show that local polarization implies strong polarization.
Theorem: all known codes that polarize also polarize strongly!
This talk
- What are polar codes?
- Local vs. global (strong) polarization
- Analysis of polarization
Lesson 0: Compression implies Coding
Linear compression scheme: (H, D) is a compression scheme for Bern(p)^n if
- H : F_2^n → F_2^m is a linear map, and
- Pr_{Z ~ Bern(p)^n} [ D(H(Z)) ≠ Z ] ≤ ε.
Hopefully m ≈ H(p)·n.
Compression implies coding: let H^⊥ be such that H·H^⊥ = 0.
- Encoder: X → H^⊥·X.
- Error-corrector: given Y = H^⊥·X + η, output Y - D(H(Y)). Since H(Y) = H(η), we have D(H(Y)) = η w.p. 1-ε, so the output equals H^⊥·X w.p. 1-ε.
Question: How to compress?
Arikan's key idea: start with the 2×2 "polarization transform": (U, V) → (U+V, V).
- It is invertible, so does it do nothing? No:
- If U, V are independent, then U+V is more random than either, while V given U+V is less random than either.
- Iterate (ignoring conditioning): end up with bits that are essentially completely random, or completely determined (by the others).
- Output the totally random part to get compression!
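The "invertible, so does it do nothing?" point can be checked directly: over F_2 the 2×2 transform is its own inverse, so it only redistributes randomness between the coordinates. A small illustrative sketch (names mine):

```python
def transform(u, v):
    """Arikan's 2x2 polarization transform over F_2: (u, v) -> (u+v, v)."""
    return (u ^ v, v)

# The map is an involution: applying it twice recovers the input, so it is
# invertible and merely redistributes randomness between the two coordinates.
for u in (0, 1):
    for v in (0, 1):
        assert transform(*transform(u, v)) == (u, v)
print("the 2x2 transform is its own inverse")
```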
Information Theory Basics
- Entropy: for a random variable X ∈ Ω, H(X) = Σ_{x ∈ Ω} p_x log(1/p_x), where p_x = Pr[X = x].
- Bounds: 0 ≤ H(X) ≤ log |Ω|.
- Conditional entropy: for jointly distributed (X, Y), H(X | Y) = Σ_y p_y · H(X | Y = y).
- Chain rule: H(X, Y) = H(X) + H(Y | X).
- Conditioning does not increase entropy: H(X | Y) ≤ H(X), with equality iff X and Y are independent.
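These identities are easy to verify numerically; the following sketch checks the chain rule and the conditioning inequality on a small hypothetical joint distribution (all names and the example distribution are mine):

```python
import math

def H(dist):
    """Shannon entropy (in bits) of a distribution given as {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# A small hypothetical joint distribution over (X, Y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
px = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
py = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}

# H(Y|X) = sum_x p(x) * H(Y | X = x)
h_y_given_x = sum(px[x] * H({y: joint[(x, y)] / px[x] for y in (0, 1)}) for x in (0, 1))

# Chain rule: H(X, Y) = H(X) + H(Y|X)
assert abs(H(joint) - (H(px) + h_y_given_x)) < 1e-9
# Conditioning does not increase entropy: H(Y|X) <= H(Y)
assert h_y_given_x <= H(py)
print("chain rule and conditioning inequality verified")
```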
Iteration by Example
(A, B, C, D) → (A+B+C+D, B+D, C+D, D)
(E, F, G, H) → (E+F+G+H, F+H, G+H, H)
Tracking conditional entropies across one more level of iteration: the pair (B+D, F+H), with entropies H(B+D | A+B+C+D) and H(F+H | E+F+G+H), transforms into
H(B+D+F+H | A+B+C+D, E+F+G+H) and
H(F+H | A+B+C+D, E+F+G+H, B+D+F+H).
Algorithmics
Subliminal channel: details in the appendix. Summary:
- Iterating t times yields an n×n polarizing transform for n = 2^t: (Z_1, ..., Z_n) → (W_1, ..., W_n); output W_S for a set S with |S| ≈ H(p)·n.
- Encoding/compression: clearly poly time.
- Decoding/decompression: the "successive cancellation decoder" is implementable in poly time; when Z ~ Bern(p)^n, one can compute Pr[W_i = 1 | W_{<i}] recursively.
- Both running times can be reduced to O(n log n).
- Only barrier: determining S. Resolved by [Tal-Vardy '13].
Analysis: The Polarization Martingale
Martingale: a sequence of random variables X_0, X_1, ... such that for all i and all x_0, ..., x_{i-1}: E[X_i | x_0, ..., x_{i-1}] = x_{i-1}.
Polarization martingale: X_i = conditional entropy of a random output at the i-th stage.
Why a martingale? At the i-th stage, two inputs (U, A) and (V, B) with input entropies H(U | A) = H(V | B) = x_{i-1} become outputs (U+V, (A, B)) and (V, (A, B, U+V)). By the chain rule the output entropies average to x_{i-1}:
H(U+V | A, B) + H(V | A, B, U+V) = H(U, V | A, B) = 2·x_{i-1}.
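A toy simulation of this martingale, ignoring conditioning exactly as the iteration slide does: each step maps x = H(p) to H(2p(1-p)) or to 2H(p) - H(2p(1-p)), with probability 1/2 each. All function names and parameter choices below are mine:

```python
import math, random

def H(p):
    """Binary entropy in bits."""
    return 0.0 if p <= 0 or p >= 1 else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def H_inv(x):
    """Invert H on [0, 1/2] by bisection (H is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if H(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def polarize_step(x, rng):
    """One step of the (conditioning-ignored) entropy martingale:
    H(p) -> H(2p(1-p)) or 2H(p) - H(2p(1-p)), each w.p. 1/2."""
    p = H_inv(x)
    x_plus = H(2 * p * (1 - p))      # entropy of U+V
    out = x_plus if rng.random() < 0.5 else 2 * x - x_plus
    return min(max(out, 0.0), 1.0)   # clamp tiny floating-point drift

rng = random.Random(0)
x0, t, trials = H(0.11), 20, 1000
finals = []
for _ in range(trials):
    x = x0
    for _ in range(t):
        x = polarize_step(x, rng)
    finals.append(x)

mean = sum(finals) / trials
mean_min = sum(min(x, 1 - x) for x in finals) / trials
print(f"E[X_t] ~ {mean:.3f}  (martingale keeps the mean near X_0 = {x0:.3f})")
print(f"E[min(X_t, 1-X_t)] ~ {mean_min:.3f}  (polarization drives this toward 0)")
```

The mean stays near X_0 (the martingale property), while the mass of X_t drifts toward the endpoints 0 and 1: that drift is polarization.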
Quantitative issues
- How quickly does this martingale converge (λ)? How well does it converge (ε)?
- Formally, want: Pr[X_t ∈ (λ, 1-λ)] ≤ ε.
- ε governs the gap to capacity; λ governs the decoding failure probability (≈ n·√λ).
- Reminder: n = 2^t.
- For a usable code, need λ ≪ 4^{-t}.
- For poly(1/ε) block length, need ε ≈ α^t for some α < 1.
Regular and Strong Polarization
(Regular) polarization: ∀γ > 0: lim_{t→∞} Pr[X_t ∈ (γ^t, 1-γ^t)] = 0.
Strong polarization: ∀γ > 0 ∃α < 1 such that ∀t: Pr[X_t ∈ (γ^t, 1-γ^t)] ≤ α^t.
Both are global properties (large-t behavior), whereas the construction yields a local property: how X_i relates to X_{i-1}. What is the local-vs-global relationship?
Regular polarization is well understood. Strong polarization was understood only for (U, V) → (U+V, V). What about (U, V, W) → (U+V, V, W), or more general transforms?
General Polarization
Basic underlying ingredient: a matrix M ∈ F_2^{k×k}.
Local polarization step: pick (U_j, A_j) for j ∈ [k] i.i.d.; in one step, generate (L_1, ..., L_k) = M·(U_1, ..., U_k).
X_{i-1} = H(U_j | A_{[k]}, U_{<j}) = H(U_j | A_j);  X_i = H(L_j | A_{[k]}, L_{<j}).
Examples (rows separated by semicolons):
H_2 = (1 1; 1 0);  H_3 = (1 1 0; 1 0 0; 0 0 1).
Which matrices polarize? Which polarize strongly?
Mixing matrices
Observation 1: the identity matrix does not polarize.
Observation 1': an upper-triangular matrix does not polarize.
Observation 2: a (row) permutation of a non-polarizing matrix does not polarize; it just corresponds to permuting U_1, ..., U_k.
Definition: M is mixing if no row permutation of M is upper-triangular.
[KSU]: M mixing ⇒ the martingale polarizes.
[Our main theorem]: M mixing ⇒ the martingale polarizes strongly.
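The mixing condition is easy to test by brute force over row permutations; a sketch (names mine) checking the observations above:

```python
from itertools import permutations

def upper_triangular(M):
    """All entries strictly below the diagonal are zero."""
    return all(M[i][j] == 0 for i in range(len(M)) for j in range(i))

def mixing(M):
    """M is mixing iff no permutation of its rows is upper-triangular."""
    return not any(upper_triangular(rows) for rows in permutations(M))

assert not mixing([[1, 0], [0, 1]])                 # identity: does not polarize
assert not mixing([[1, 1], [0, 1]])                 # upper-triangular: does not polarize
assert mixing([[1, 1], [1, 0]])                     # the 2x2 example H_2: mixing
assert mixing([[1, 1, 0], [1, 0, 0], [0, 0, 1]])    # the 3x3 example H_3: mixing
print("mixing checks pass")
```

Brute force over k! permutations is fine here since the matrices of interest are small.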
Key Concept: Local (Strong) Polarization
A martingale X_0, X_1, ... polarizes locally if it exhibits:
- Variance in the middle: ∀τ > 0 ∃σ > 0: X_{i-1} ∈ (τ, 1-τ) ⇒ Var[X_i | X_{i-1}] > σ.
- Suction at the ends: ∃θ > 0 such that ∀c < ∞ ∃τ > 0: X_{i-1} ≤ τ ⇒ Pr[X_i < X_{i-1}/c] ≥ θ (and similarly with 1 - X_i).
The definition is local: it only compares X_i with X_{i-1}. The implications are global:
Thm 1: local polarization ⇒ strong polarization.
Thm 2: mixing matrices ⇒ local polarization.
Mixing Matrices ⇒ Local Polarization
2×2 case: ignoring conditioning, if X_{i-1} = H(p), then
X_i = H(2p(1-p)) w.p. 1/2, and X_i = 2H(p) - H(2p(1-p)) w.p. 1/2.
Variance and suction then follow from the Taylor series of log.
k×k case: reduces to the 2×2 case, with the probabilities 1/2 replaced by ≥ 1/k.
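The two branch values can be confirmed by brute-force computation: H(U+V) = H(2p(1-p)), and H(V | U+V) = 2H(p) - H(2p(1-p)) by the chain rule, since (U+V, V) determines (U, V). A sketch (names mine):

```python
import math

def H(p):
    """Binary entropy in bits."""
    return 0.0 if p <= 0 or p >= 1 else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1
q = 2 * p * (1 - p)            # Pr[U+V = 1] for U, V i.i.d. Bern(p)
x_plus = H(q)                  # entropy of the "+" branch
x_minus = 2 * H(p) - x_plus    # "-" branch, via the chain rule

# Brute-force H(V | U+V) over the four outcomes to confirm the chain-rule value.
pr = {(u, v): (p if u else 1 - p) * (p if v else 1 - p) for u in (0, 1) for v in (0, 1)}
h_v_given_sum = 0.0
for s in (0, 1):
    ps = sum(pr[(u, v)] for (u, v) in pr if u ^ v == s)
    pv1 = sum(pr[(u, v)] for (u, v) in pr if u ^ v == s and v == 1) / ps
    h_v_given_sum += ps * H(pv1)

assert abs(h_v_given_sum - x_minus) < 1e-9
assert x_minus < H(p) < x_plus   # the two branches straddle H(p): polarization
print(f"H(p)={H(p):.4f} splits into {x_plus:.4f} and {x_minus:.4f}, each w.p. 1/2")
```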
Local ⇒ (Global) Strong Polarization
Step 1: Medium polarization: ∃α, β < 1: Pr[X_t ∈ (β^t, 1-β^t)] ≤ α^t.
Proof: consider the potential Φ_t = E[√(min(X_t, 1-X_t))].
Variance + suction ⇒ ∃η < 1 such that Φ_t ≤ η·Φ_{t-1}.
(Aside: why √X_t? Clearly X_t itself won't work, and X_t^2 increases!)
Hence Φ_t ≤ η^t, and Pr[X_t ∈ (β^t, 1-β^t)] ≤ (η/√β)^t.
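The last step (from the contraction Φ_t ≤ η^t to the probability bound) is a Markov-inequality calculation; a sketch in the slide's notation:

```latex
% Potential contraction gives \Phi_t \le \eta^t (since \Phi_0 \le 1).
% On the event X_t \in (\beta^t, 1-\beta^t) we have
% \sqrt{\min(X_t, 1-X_t)} \ge \beta^{t/2}, so by Markov's inequality:
\Pr\bigl[X_t \in (\beta^t,\, 1-\beta^t)\bigr]
   \;\le\; \frac{\Phi_t}{\beta^{t/2}}
   \;\le\; \Bigl(\frac{\eta}{\sqrt{\beta}}\Bigr)^{t} \;=\; \alpha^t,
\qquad \text{with } \alpha = \eta/\sqrt{\beta} < 1 \text{ whenever } \eta^2 < \beta < 1.
```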
Local ⇒ (Global) Strong Polarization (continued)
Step 1: Medium polarization: ∃α, β < 1: Pr[X_t ∈ (β^t, 1-β^t)] ≤ α^t.
Step 2: Medium polarization + t more steps ⇒ strong polarization.
Proof sketch: assume X_0 ≤ α^t; want X_t ≤ γ^t w.p. 1 - O(α_1^t).
Pick c large and τ such that X_{i-1} ≤ τ ⇒ Pr[X_i < X_{i-1}/c] ≥ θ.
2.1: Pr[∃i s.t. X_i ≥ τ] ≤ α^t/τ (Doob's inequality).
2.2: Let Y_i = log X_i. Assuming X_i ≤ τ for all i: Y_i - Y_{i-1} ≤ -log c w.p. θ, and ≤ k w.p. 2^{-k}.
So E[Y_t] ≤ t·(1 - θ·log c) ≤ 2t·log γ for c large enough. Concentration then gives Pr[Y_t ≥ t·log γ] ≤ α_2^t. QED!
Conclusion
- A simple and general analysis of strong polarization.
- But the main reason to study general polarization: maybe some matrices polarize even faster!
- This analysis does not get us there, but hopefully following the (simpler) steps in specific cases can lead to improvements.
Thank You!
Algorithmics: Encoding
Up to a permutation of the input variables, the polarizing transform is given by
E_{2n}(U, V) = (E_n(U+V), E_n(V)), where U, V ∈ F_2^n.
This recursion obviously takes O(n log n) time.
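A direct transcription of this recursion (function name mine); the 4-variable assertion matches the iteration-by-example slide:

```python
def encode(z):
    """Recursive polarizing transform: E_{2n}(U, V) = (E_n(U+V), E_n(V)).
    len(z) must be a power of two; arithmetic is over F_2 (xor)."""
    n = len(z)
    if n == 1:
        return list(z)
    u, v = z[:n // 2], z[n // 2:]
    s = [a ^ b for a, b in zip(u, v)]   # U + V coordinate-wise over F_2
    return encode(s) + encode(v)        # two half-size calls: O(n log n) total

# Matches the iteration-by-example slide: (A,B,C,D) -> (A+B+C+D, B+D, C+D, D).
A, B, C, D = 1, 0, 1, 1
assert encode([A, B, C, D]) == [A ^ B ^ C ^ D, B ^ D, C ^ D, D]
print(encode([A, B, C, D]))
```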
Algorithmics: Decoding
Key idea: strengthen the algorithm to work when the inputs are independent, but not identically distributed!
Decompressor's input: p_1, ..., p_{2n} ∈ [0,1]; w ∈ {0, 1, ?}^{2n}.
Notation:
- w^1 = (w_1, ..., w_n); w^2 = (w_{n+1}, ..., w_{2n}).
- p_⊕(a, b) = expectation of X ⊕ Y, where X ~ Bern(a) and Y ~ Bern(b) [X, Y are single bits].
- p(a, b | c) = expectation of Y given X ⊕ Y = c.
D(p_1, ..., p_{2n}; w):
1. If n = 1: output w_1 if w_1 ∈ {0, 1}; else output 1 if p_1 ≥ 1/2 and 0 otherwise.
2. Else:
   - Let q_i = p_⊕(p_i, p_{n+i}) for i ≤ n.
   - Let X = D(q_1, ..., q_n; w^1).
   - Let r_i = p(p_i, p_{n+i} | X_i) for i ≤ n.
   - Let Y = D(r_1, ..., r_n; w^2).
   - Output (X ⊕ Y, Y).
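The two probability-combination rules have simple closed forms; a sketch (the names p_xor and p_cond are mine, for the slide's p_⊕(a, b) and p(a, b | c)), checked against brute-force enumeration:

```python
def p_xor(a, b):
    """p_xor(a, b): Pr[X + Y = 1] for independent X ~ Bern(a), Y ~ Bern(b)."""
    return a * (1 - b) + (1 - a) * b

def p_cond(a, b, c):
    """p_cond(a, b, c): Pr[Y = 1 | X + Y = c] for independent X ~ Bern(a), Y ~ Bern(b)."""
    return (1 - a) * b / p_xor(a, b) if c == 1 else a * b / (1 - p_xor(a, b))

# Sanity check against brute-force enumeration of the four (X, Y) outcomes.
a, b = 0.3, 0.2
for c in (0, 1):
    num = den = 0.0
    for x in (0, 1):
        for y in (0, 1):
            w = (a if x else 1 - a) * (b if y else 1 - b)
            if x ^ y == c:
                den += w
                num += w * y
    assert abs(p_cond(a, b, c) - num / den) < 1e-12
assert abs(p_xor(a, b) - 0.38) < 1e-12
print("p_xor and p_cond agree with enumeration")
```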
Algorithmics: Decoding Analysis
Inductive claim: the recursion is correct! That is, if U_i ~ Bern(p_i) and V_i ~ Bern(p_{n+i}), then (U ⊕ V)_i ~ Bern(q_i), and (V_i | (U ⊕ V)_i = X_i) ~ Bern(r_i). So we eventually get down to guessing individual bits.
Let W = E_{2n}(U, V), and let w_1, ..., w_{2n} be the 2n individual bits output in line 1 of the decoder.
Claim: when guessing W_i, the p-value used is Pr[W_i = 1 | W_{<i} = w_{<i}].
Claim: if H(W_i | W_{<i}) = τ, then Pr_{w_{<i}}[H(W_i | w_{<i}) ≥ √τ] ≤ √τ, and so Pr[w_i ≠ W_i | w_{<i} = W_{<i}] ≤ 2√τ.