On the Construction of Polar Codes


Ramtin Pedarsani, School of Computer and Communication Systems, Lausanne, Switzerland. ramtin.pedarsani@epfl.ch
S. Hamed Hassani, School of Computer and Communication Systems, Lausanne, Switzerland. seyedhamed.hassani@epfl.ch
Ido Tal, Information Theory and Applications, UCSD, La Jolla, CA, USA. idotal@ieee.org
Emre Telatar, School of Computer and Communication Systems, Lausanne, Switzerland. emre.telatar@epfl.ch

arXiv:1209.4444v1 [cs.IT] 20 Sep 2012

Abstract - We consider the problem of efficiently constructing polar codes over binary memoryless symmetric (BMS) channels. The complexity of designing polar codes via an exact evaluation of the polarized channels, to find which ones are "good", appears to be exponential in the block length. In [3], Tal and Vardy show that if instead the evaluation is performed approximately, the construction has only linear complexity. In this paper, we follow this approach and present a framework in which the algorithms of [3] and new related algorithms can be analyzed for complexity and accuracy. We provide numerical and analytical results on the efficiency of such algorithms; in particular, we show that one can find all the good channels (except a vanishing fraction) with almost linear complexity in block-length (up to a polylogarithmic factor).

I. INTRODUCTION

A. Polar Codes

Polar coding, introduced by Arıkan in [1], is an encoding/decoding scheme that provably achieves the capacity of the class of BMS channels. Let W be a BMS channel. Given the rate R < I(W), polar coding is based on choosing a set of 2^n R rows of the matrix G_n = [1 0; 1 1]^{tensor n} to form a (2^n R) x 2^n matrix which is used as the generator matrix in the encoding procedure(1). The way this set is chosen depends on the channel W and uses a phenomenon called channel polarization: consider an infinite binary tree, place the underlying channel W at the root node, and continue recursively as follows.

(1) There are extensions of polar codes, given in [2], which use different kinds of matrices.
Having a channel P : {0,1} -> Y on a node of the tree, define the channels P^- : {0,1} -> Y^2 and P^+ : {0,1} -> {0,1} x Y^2 by

  P^-(y_1, y_2 | x_1) = sum_{x_2 in {0,1}} (1/2) P(y_1 | x_1 + x_2) P(y_2 | x_2),        (1)
  P^+(y_1, y_2, x_1 | x_2) = (1/2) P(y_1 | x_1 + x_2) P(y_2 | x_2),        (2)

(with + denoting modulo-2 addition), and place P^- and P^+ as the left and right children of this node. As a result, at level n there are N = 2^n channels, which we denote from left to right by W_N^(1) to W_N^(N). In [1], Arıkan proved that as n -> infinity, a fraction approaching I(W) of the channels at level n have capacity close to 1 (call them the noiseless channels) and a fraction approaching 1 - I(W) have capacity close to 0 (call them the completely noisy channels). Given the rate R, the indices of the matrix G_n are chosen as follows: choose the subset of the channels {W_N^(i)}_{1 <= i <= N} with the most mutual information and choose the rows of G_n with the same indices as these channels. For example, if the channel W_N^(j) is chosen, then the j-th row of G_n is selected, up to the bit-reversal permutation. In the following, given n, we call the set of indices of the NR channels with the most mutual information the set of "good" indices. We can equivalently say that as n -> infinity, the fraction of channels with Bhattacharyya constant near 0 approaches I(W) and the fraction of channels with Bhattacharyya constant near 1 approaches 1 - I(W). The Bhattacharyya constant of a channel P : {0,1} -> Y is given by

  Z(P) = sum_{y in Y} sqrt(P(y|0) P(y|1)).        (3)

Therefore, we can alternatively call the set of indices of the NR channels with the least Bhattacharyya parameters the set of good indices. It is also worth mentioning that the sum of the Bhattacharyya parameters of the chosen channels is an upper bound on the block error probability of polar codes under successive cancellation decoding.

B. Problem Formulation

Designing a polar code is equivalent to finding the set of good indices.
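As a concrete illustration (ours, not the paper's), the transforms (1), (2) and the Bhattacharyya constant (3) can be computed directly for a small channel. In this sketch a channel is stored as a map from each output symbol y to the pair (P(y|0), P(y|1)); the representation and function names are assumptions for the example.

```python
import math

# A discrete channel P: {0,1} -> Y is a dict  y -> (P(y|0), P(y|1)).

def split(P):
    """Return (P_minus, P_plus) per equations (1) and (2)."""
    minus, plus = {}, {}
    for y1, (a0, a1) in P.items():
        for y2, (b0, b1) in P.items():
            # (1): P^-(y1,y2|x1) = sum_{x2} 1/2 P(y1|x1+x2) P(y2|x2)
            minus[(y1, y2)] = (0.5 * (a0 * b0 + a1 * b1),
                               0.5 * (a1 * b0 + a0 * b1))
            # (2): P^+(y1,y2,x1|x2) = 1/2 P(y1|x1+x2) P(y2|x2)
            for x1 in (0, 1):
                plus[(y1, y2, x1)] = (
                    0.5 * (a0 if x1 == 0 else a1) * b0,   # given x2 = 0
                    0.5 * (a1 if x1 == 0 else a0) * b1)   # given x2 = 1
    return minus, plus

def bhattacharyya(P):
    """Z(P) from equation (3)."""
    return sum(math.sqrt(p0 * p1) for p0, p1 in P.values())

# Example: BSC with crossover 0.1, so Z(W) = 2*sqrt(0.1*0.9) = 0.6.
W = {0: (0.9, 0.1), 1: (0.1, 0.9)}
Wm, Wp = split(W)
```

For a BSC this reproduces Arıkan's identities Z(W^+) = Z(W)^2 and Z(W^-) <= 2 Z(W) - Z(W)^2, a quick sanity check that the two child channels are respectively better and worse than W.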
The main difficulty in this task is that, since the output alphabet of W_N^(i) is Y^N x {0,1}^(i-1), the cardinality of the output alphabets of the channels at level n of the binary tree is doubly exponential in n, i.e., exponential in the block-length. So computing the exact transition probabilities of these channels seems to be intractable, and hence we need efficient methods to approximate these channels. In [1], it is suggested to use a Monte-Carlo method for estimating the Bhattacharyya parameters. Another method in this regard is quantization [3], [4], [5], [6, Appendix B]: approximating the given channel with a channel that has fewer output symbols. More precisely, given a number k, the task is to come up with efficient methods to replace channels that have more than k outputs with "close" channels that have at most k outputs. A few comments are in order. The term "close" above depends on the definition of the quantization error, which can differ with the context. In our problem, in its most general setting,

we can define the quantization error as the difference between the true set of good indices and the approximate set of good indices. However, analyzing this type of error seems difficult, and in the sequel we consider types of errors that are easier to analyze. Thus, as a compromise, we will intuitively think of two channels as being close if they are close with respect to some given metric; typically mutual information, but sometimes probability of error. Moreover, we require that this closeness is in the "right" direction: the approximated channel must be a pessimistic version of the true channel. Thus, the approximated set of good channels will be a subset of the true set. Intuitively, we expect that as k increases, the overall error due to quantization decreases; the main art in designing quantization methods is to achieve a small error while using relatively small values of k. However, an important property for any quantization algorithm is that as k grows large, the approximate set of good indices obtained with k fixed approaches the true set of good indices. We give a precise mathematical definition in the sequel.

Taking the above-mentioned factors into account, a suitable formulation of the quantization problem is to find procedures that replace each channel P at each level of the binary tree with another symmetric channel P~ whose number of output symbols is limited to k, such that, firstly, the set of good indices obtained with this procedure is a subset of the true good indices obtained from channel polarization (i.e., channel P~ is "polar degraded" with respect to P), and, secondly, the ratio of these good indices is maximized. More precisely, we start from the channel W at the root node of the binary tree, quantize it to W~, and obtain W~^- and W~^+ according to (1) and (2). Then we quantize the two new channels and continue the procedure to complete the tree. To state things mathematically, let Q_k be a quantization procedure that assigns to each channel P a binary symmetric channel P~ such that the output alphabet of P~ is limited to a constant k. We call Q_k admissible if for any i and n

  I(W~_N^(i)) <= I(W_N^(i)).        (4)
One can alternatively call Q_k admissible if for any i and n

  Z(W~_N^(i)) >= Z(W_N^(i)).        (5)

Note that (4) and (5) are essentially equivalent as N grows large. Given an admissible procedure Q_k and a BMS channel W, let(2)

  rho(Q_k, W) = lim_{n -> infinity} |{i : I(W~_N^(i)) > 1/2}| / N.        (6)

So the quantization problem is the following: given a number k in N and a channel W, how can we find admissible procedures Q_k such that rho(Q_k, W) is maximized and close to the capacity of W? Can we reach the capacity of W as k goes to infinity? Are such schemes universal, in the sense that they work well for all BMS channels? It is worth mentioning that if we first let k tend to infinity and then n to infinity, the limit is indeed the capacity; but we are addressing a different question here, namely we first let n tend to infinity and then k (or perhaps couple k to n). In Section IV, we indeed prove that such schemes exist.

(2) Instead of 1/2 in (6) we can use any number in (0,1).

II. ALGORITHMS FOR QUANTIZATION

A. Preliminaries

Any discrete BMS channel can be represented as a collection of binary symmetric channels (BSCs). The binary input is given to one of these BSCs at random, such that the i-th BSC is chosen with probability p_i. The output of this BSC, together with its crossover probability x_i, is considered as the output of the channel. Therefore, a discrete BMS channel W can be completely described by a random variable chi in [0, 1/2]. The pdf of chi is of the form

  P_chi(x) = sum_{i=1}^{m} p_i delta(x - x_i)        (7)

such that sum_{i=1}^{m} p_i = 1 and 0 <= x_i <= 1/2. Note that Z(W) and 1 - I(W) are the expectations of the functions f(x) = 2 sqrt(x(1-x)) and g(x) = -x log(x) - (1-x) log(1-x) over the distribution P_chi, respectively. Therefore, in the quantization problem we want to replace the mass distribution P_chi with another mass distribution P_chi~ such that the number of output symbols of chi~ is at most k, and the channel W~ is polar degraded with respect to W. We know that the following two operations imply polar degradation:
- Stochastically degrading the channel.
- Replacing the channel with a BEC channel with the same Bhattacharyya parameter.
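The BSC-mixture view above lends itself to a compact sketch (our own illustration; the list-of-pairs representation is an assumption, not notation from the paper): a channel is a list of masses (p_i, x_i), and Z(W) and I(W) are expectations under this distribution.

```python
import math

# A discrete BMS channel as a list of masses (p_i, x_i):
# with probability p_i the input passes through a BSC with crossover x_i.

def bhattacharyya(masses):
    # Z(W) = E[ 2 sqrt(chi (1 - chi)) ]
    return sum(p * 2 * math.sqrt(x * (1 - x)) for p, x in masses)

def entropy_term(x):
    # binary entropy h(x), with the convention h(0) = h(1) = 0
    if x == 0.0 or x == 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def mutual_info(masses):
    # I(W) = 1 - E[ h(chi) ]
    return 1 - sum(p * entropy_term(x) for p, x in masses)

# A perfect channel (BSC(0)) and a useless one (BSC(1/2)):
assert mutual_info([(1.0, 0.0)]) == 1.0 and bhattacharyya([(1.0, 0.0)]) == 0.0
assert mutual_info([(1.0, 0.5)]) == 0.0 and bhattacharyya([(1.0, 0.5)]) == 1.0
```

The two asserts check the extreme points of the representation; any mixture of BSCs lies strictly between them.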
Furthermore, note that stochastic dominance of the random variable chi~ with respect to chi implies that W~ is stochastically degraded with respect to W (but the reverse is not true). In the following, we propose different algorithms based on different methods of polar degradation of the channel. The first is a naive algorithm, called the mass transportation algorithm, based on the stochastic dominance of the random variable chi~; the second, which outperforms the first, is called the greedy mass merging algorithm. For both algorithms the quantized channel is stochastically degraded with respect to the original one.

B. Greedy Mass Transportation Algorithm

In the most general form of this algorithm, we basically look at the problem as a mass transport problem. In fact, we have non-negative masses p_i at locations x_i, i = 1, ..., m, with x_1 < ... < x_m. What is required is to move the masses, by moves to the right only, so as to concentrate them on k < m locations, while trying to minimize sum_i p_i d_i, where d_i = x_{i+1} - x_i is the amount the i-th mass has moved. Later, we will show that this method is not optimal, but it is useful in the theoretical analysis of the algorithms that follow.
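The greedy transport step just described can be sketched as follows (our illustrative code, not the paper's; moving a mass to the right, i.e., to a larger crossover probability, degrades the channel):

```python
def transport(masses, k):
    """Greedily move masses to the right until only k locations remain.

    masses: list of (p, x) pairs with x strictly increasing; returns a new list.
    """
    masses = list(masses)
    while len(masses) > k:
        # cost of moving mass i onto its right neighbour: p_i * (x_{i+1} - x_i)
        j = min(range(len(masses) - 1),
                key=lambda i: masses[i][0] * (masses[i + 1][1] - masses[i][1]))
        p_j, _ = masses[j]
        p_n, x_n = masses[j + 1]
        masses[j:j + 2] = [(p_j + p_n, x_n)]   # p_j absorbed at x_{j+1}
    return masses

q = transport([(0.25, 0.05), (0.25, 0.10), (0.25, 0.30), (0.25, 0.45)], 2)
# total probability is conserved and only 2 support points remain
assert len(q) == 2 and abs(sum(p for p, _ in q) - 1.0) < 1e-12
```

Each iteration removes one support point, so reducing m masses to k takes m - k iterations, matching the greedy rule of the mass transportation algorithm.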

Algorithm 1 Mass Transportation Algorithm
1: Start from the list (p_1, x_1), ..., (p_m, x_m).
2: Repeat (m - k) times:
3:   Find j = argmin_i {p_i d_i}.
4:   Add p_j to p_{j+1} (i.e., move p_j to x_{j+1}).
5:   Delete (p_j, x_j) from the list.

Note that Algorithm 1 is based on the stochastic dominance of the random variable chi~ with respect to chi. Furthermore, in general we can let d_i = f(x_{i+1}) - f(x_i) for an arbitrary increasing function f.

C. Mass Merging Algorithm

The second algorithm merges the masses. Two masses p_1 and p_2 at positions x_1 and x_2 are merged into one mass p_1 + p_2 at position x_bar = (p_1/(p_1+p_2)) x_1 + (p_2/(p_1+p_2)) x_2. This algorithm is based on stochastic degradation of the channel, but the random variable chi~ is not stochastically dominated by chi. The greedy algorithm for merging the masses is the following:

Algorithm 2 Merging Masses Algorithm
1: Start from the list (p_1, x_1), ..., (p_m, x_m).
2: Repeat (m - k) times:
3:   Find j = argmin_i {p_i (f(x_bar_i) - f(x_i)) - p_{i+1} (f(x_{i+1}) - f(x_bar_i))}, where x_bar_i = (p_i/(p_i+p_{i+1})) x_i + (p_{i+1}/(p_i+p_{i+1})) x_{i+1}.
4:   Replace the two masses (p_j, x_j) and (p_{j+1}, x_{j+1}) with the single mass (p_j + p_{j+1}, x_bar_j).

Note that in practice the function f can be any increasing concave function, for example the entropy function or the Bhattacharyya function. In fact, since the algorithm is greedy and suboptimal, it is hard to investigate explicitly how changing the function f affects the total error of the algorithm in the end (i.e., how far W~ is from W).

III. BOUNDS ON THE APPROXIMATION LOSS

In this section, we provide bounds on the maximum approximation loss incurred by the algorithms. We define the approximation loss to be the difference between the expectations of the function f under the true distribution P_chi and the approximated distribution P_chi~. Note that the kind of error analyzed in this section is different from the one defined in Section I-B. The connection between the approximation loss and the quantization error is made clear in Theorem 1. For convenience, we will simply use the word "error" instead of "approximation loss" from now on.
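The merging step and the approximation loss it incurs can be sketched as follows (our own illustration, with f taken to be the binary entropy function; the function names are assumptions). Merging two masses at their weighted mean preserves E[chi] and, since f is concave, can only increase E[f]; that increase is exactly the per-step approximation loss.

```python
import math

def h(x):
    """Binary entropy, the concave increasing f used in this sketch."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def merge(masses, k, f=h):
    """Greedy mass merging: repeat until only k masses remain."""
    masses = list(masses)
    while len(masses) > k:
        def loss(i):
            (p1, x1), (p2, x2) = masses[i], masses[i + 1]
            xb = (p1 * x1 + p2 * x2) / (p1 + p2)    # merged position
            # increase in E[f] caused by merging masses i and i+1
            return p1 * (f(xb) - f(x1)) - p2 * (f(x2) - f(xb))
        j = min(range(len(masses) - 1), key=loss)
        (p1, x1), (p2, x2) = masses[j], masses[j + 1]
        masses[j:j + 2] = [(p1 + p2, (p1 * x1 + p2 * x2) / (p1 + p2))]
    return masses

def expect(masses, f=h):
    return sum(p * f(x) for p, x in masses)

orig = [(0.2, 0.02), (0.3, 0.05), (0.3, 0.30), (0.2, 0.42)]
quant = merge(orig, 2)
# merging preserves the mean of chi; by Jensen the loss is non-negative
assert expect(quant) - expect(orig) >= -1e-12
assert abs(sum(p for p, _ in quant) - 1.0) < 1e-12
```

Since 1 - I(W) = E[h(chi)], an increase in E[h] means the quantized channel has smaller mutual information, i.e., it is a degraded (pessimistic) version of the original, as required for admissibility.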
We first find an upper bound on the error made in Algorithms 1 and 2, and then use it to provide bounds on the error made while performing operations (1) and (2).

Lemma 1. The maximum error made by Algorithms 1 and 2 is upper bounded by O(1/k).

Proof: First, we derive an upper bound on the error of Algorithms 1 and 2 in each iteration, and therefore a bound on the error of the whole process. Let us consider Algorithm 1. The problem can be reduced to the following optimization problem:

  e = max_{p_i, x_i} min_i (p_i d_i)        (8)

such that

  sum_i p_i = 1,  sum_i d_i <= 1,        (9)

where d_i = f(x_{i+1}) - f(x_i), and f(1/2) - f(0) = 1 is assumed w.l.o.g. We prove the lemma by the Cauchy-Schwarz inequality. We have

  min_i (p_i d_i) = ( min_i sqrt(p_i d_i) )^2.        (10)

Now, by applying Cauchy-Schwarz,

  sum_{i=1}^{m} sqrt(p_i d_i) <= ( sum_{i=1}^{m} p_i )^{1/2} ( sum_{i=1}^{m} d_i )^{1/2} <= 1.        (11)

Since the sum of the m terms sqrt(p_i d_i) is at most 1, the minimum of these terms is at most 1/m. Therefore,

  e = min_i (p_i d_i) <= (1/m)^2.        (12)

For Algorithm 2, achieving the same bound as Algorithm 1 is trivial. Denote by e^(1) the error made in Algorithm 1 and by e^(2) the error made in Algorithm 2. Then,

  e^(2) = p_i (f(x_bar_i) - f(x_i)) - p_{i+1} (f(x_{i+1}) - f(x_bar_i))        (13)
       <= p_i (f(x_bar_i) - f(x_i))        (14)
       <= p_i (f(x_{i+1}) - f(x_i)) = e^(1).        (15)

Consequently, the error generated by running the whole algorithm can be upper bounded by sum_{i=k+1}^{m} 1/i^2, which is O(1/k).

What is stated in Lemma 1 is a loose upper bound on the error of Algorithm 2. To achieve better bounds, we upper bound the error made in each iteration of Algorithm 2 as follows:

  e_i = p_i (f(x_bar_i) - f(x_i)) - p_{i+1} (f(x_{i+1}) - f(x_bar_i))        (16)
     <= p_i (p_{i+1}/(p_i + p_{i+1})) Delta x_i f'(x_i) - p_{i+1} (p_i/(p_i + p_{i+1})) Delta x_i f'(x_{i+1})        (17)
      = (p_i p_{i+1}/(p_i + p_{i+1})) Delta x_i (f'(x_i) - f'(x_{i+1}))        (18)
     <= ((p_i + p_{i+1}) Delta x_i^2 / 4) (-f''(c_i)),        (19)

where Delta x_i = x_{i+1} - x_i; (17) is due to the concavity of f, and (19) follows from the mean value theorem, with x_i <= c_i <= x_{i+1}, together with p_i p_{i+1}/(p_i + p_{i+1}) <= (p_i + p_{i+1})/4. If f''(x) were bounded for x in (0,1), we could prove that min_i e_i = O(1/m^3) similarly to Lemma 1. Therefore the error

of the whole algorithm would be O(1/k^2). Unfortunately, this is not the case for either the entropy function or the Bhattacharyya function. However, we can still obtain a better upper bound on the error of Algorithm 2.

Lemma 2. The maximum error made by Algorithm 2 for the entropy function h(x) can be upper bounded by O(log(k)/k^{1.5}).

Proof: See Appendix.

We see that the error is improved by a factor of log(k)/sqrt(k) in comparison with Algorithm 1. Now we use the result of Lemma 1 to provide bounds on the total error made in estimating the mutual information of a channel after n levels of operations (1) and (2).

Theorem 1. Assume that W is a BMS channel, and that using Algorithm 1 or 2 we quantize the channel W to a channel W~. Taking k = n^2 is sufficient for an approximation error that decays to zero.

Proof: First notice that for any two BMS channels W and V, under the polarization operations (1) and (2) the following is true:

  (I(W^-) - I(V^-)) + (I(W^+) - I(V^+)) = 2 (I(W) - I(V)).        (20)

Replacing V with W~ in (20) and using the result of Lemma 1, we conclude that after n levels of polarization the sum of the errors in approximating the mutual information of the 2^n channels is upper-bounded by O(n 2^n / k). In particular, taking k = n^2, the average approximation error of the 2^n channels at level n is upper-bounded by O(1/n). Therefore, at least a fraction 1 - 1/sqrt(n) of the channels are distorted by at most 1/sqrt(n); i.e., except for a negligible fraction of the channels, the error in approximating the mutual information decays to zero. As a result, since the overall complexity of the encoder construction is O(k^2 N), this leads to almost-linear algorithms for encoder construction with arbitrary accuracy in identifying the good channels.

IV. EXCHANGE OF LIMITS

In this section, we show that there are admissible schemes such that as k -> infinity, the limit in (6) approaches I(W) for any BMS channel W. We use the definition stated in (5) for the admissibility of the quantization procedure.

Theorem 2. Given a BMS channel W, for large enough k there exist admissible quantization schemes Q_k such that rho(Q_k, W) is arbitrarily close to I(W).
Proof: Consider the following algorithm. The algorithm starts with a quantized version of W and performs the normal channel splitting transformation followed by quantization according to Algorithm 1 or 2; but once a sub-channel is sufficiently good, in the sense that its Bhattacharyya parameter is less than an appropriately chosen parameter delta, the algorithm replaces the sub-channel with a binary erasure channel which is degraded (polar degradation) with respect to it. (As the operations (1) and (2) applied to an erasure channel also yield erasure channels, no further quantization is needed for the children of this sub-channel.) Since the ratio of the total good indices of BEC(Z(P)) is 1 - Z(P), the total error that we make by replacing P with BEC(Z(P)) is at most Z(P), which in the above algorithm is less than the parameter delta. Now, for a fixed level n, according to Theorem 1, if we make k large enough, the ratio of the quantized sub-channels whose Bhattacharyya value is less than delta approaches its original value (with no quantization); and for these sub-channels, as explained above, the total error made by the algorithm is at most delta. From the polarization theorem, by sending delta to zero we deduce that as k -> infinity the number of good indices approaches the capacity of the original channel.

V. SIMULATION RESULTS

In order to evaluate the performance of our quantization algorithm, similarly to [3] we compare the performance of the degraded quantized channel with the performance of an upgraded quantized channel. An algorithm similar to Algorithm 2 for upgrading a channel is the following. Consider three neighboring masses at positions (x_{i-1}, x_i, x_{i+1}) with probabilities (p_{i-1}, p_i, p_{i+1}). Let t = (x_i - x_{i-1})/(x_{i+1} - x_{i-1}). Then we split the middle mass at x_i onto the other two masses, so that the final probabilities are (p_{i-1} + (1-t) p_i, p_{i+1} + t p_i) at positions (x_{i-1}, x_{i+1}). The greedy channel upgrading procedure is described in Algorithm 3.

Algorithm 3 Splitting Masses Algorithm
1: Start from the list (p_1, x_1), ..., (p_m, x_m).
2: Repeat (m - k) times:
3:   Find j = argmin_i {p_i (f(x_i) - t_i f(x_{i+1}) - (1 - t_i) f(x_{i-1}))}, over i != 1, m.
4:   Add (1 - t_j) p_j to p_{j-1} and t_j p_j to p_{j+1}.
5:   Delete (p_j, x_j) from the list.
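A sketch of this upgrading procedure (our rendering, again with the binary entropy function as f; names are assumptions): splitting a middle mass onto its neighbours preserves the mean of chi and, since f is concave, can only decrease E[f], i.e., it can only increase the mutual information.

```python
import math

def h(x):
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def split_masses(masses, k, f=h):
    """Greedy mass splitting: upgrade until only k masses remain."""
    masses = list(masses)
    while len(masses) > k:
        def drop(i):   # i ranges over interior masses only
            (pl, xl), (p, x), (pr, xr) = masses[i - 1], masses[i], masses[i + 1]
            t = (x - xl) / (xr - xl)
            # decrease in E[f] caused by splitting mass i onto its neighbours
            return p * (f(x) - t * f(xr) - (1 - t) * f(xl))
        j = min(range(1, len(masses) - 1), key=drop)
        (pl, xl), (p, x), (pr, xr) = masses[j - 1], masses[j], masses[j + 1]
        t = (x - xl) / (xr - xl)
        masses[j - 1:j + 2] = [(pl + (1 - t) * p, xl), (pr + t * p, xr)]
    return masses

orig = [(0.2, 0.02), (0.3, 0.05), (0.3, 0.30), (0.2, 0.42)]
up = split_masses(orig, 2)
assert len(up) == 2 and abs(sum(p for p, _ in up) - 1.0) < 1e-12
```

Running Algorithm 2 and this upgrading step on the same channel sandwiches the true channel between a degraded and an upgraded k-symbol approximation, which is the comparison reported in the tables below.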
The same upper bounds on the error of this algorithm can be derived similarly to Section III, with slight modifications. In the simulations, we measure the maximum achievable rate while keeping the probability of error below 10^-3, by finding the maximum possible number of channels with the smallest Bhattacharyya parameters such that the sum of their Bhattacharyya parameters is upper bounded by 10^-3. The channel is a binary symmetric channel with capacity 0.5. Using Algorithms 2 and 3 for degrading and upgrading the channels with the Bhattacharyya function f(x) = 2 sqrt(x(1-x)), we obtain the following results. It is worth restating that the algorithm runs in complexity O(k^2 N). Table I shows the achievable rates for Algorithms

  k        2       4       8       16      32      64
  degrade  0.2895  0.3667  0.3774  0.3795  0.3799  0.3800
  upgrade  0.4590  0.3943  0.3836  0.3808  0.3802  0.3801

TABLE I: Achievable rate with error probability at most 10^-3 vs. maximum number of output symbols k, for block-length N = 2^15

2 and 3 when the block-length is fixed to N = 2^15 and k ranges from 2 to 64. It can be seen from Table I that the difference between the achievable rates of the upgraded and degraded versions of the scheme is as small as 10^-4 for k = 64. We expect that for a fixed k, as the block-length increases, the difference will also increase (see Table II).

  n        5       8       11      14      17      20
  degrade  0.1250  0.2109  0.2969  0.3620  0.4085  0.4403
  upgrade  0.1250  0.2109  0.2974  0.3633  0.4102  0.4423

TABLE II: Achievable rate with error probability at most 10^-3 vs. block-length N = 2^n, for k = 16

However, in our scheme this difference remains small even as N grows arbitrarily large, as predicted by Theorem 2 (see Table III).

  n        21      22      23      24      25
  degrade  0.4484  0.4555  0.4616  0.4669  0.4715
  upgrade  0.4504  0.4575  0.4636  0.4689  0.4735

TABLE III: Achievable rate with error probability at most 10^-3 vs. block-length N = 2^n, for k = 16

We see that the difference between the rates achievable on the degraded and upgraded channels stays constant at 2 x 10^-3 even after 25 levels of polarization for k = 16.

APPENDIX

A. Proof of Lemma 2

Proof: Let us first find an upper bound on the second derivative of the entropy function h(x) = -x log(x) - (1-x) log(1-x). For 0 <= x <= 1/2, we have

  -h''(x) = 1/(x (1-x) ln 2) <= 2/(x ln 2).        (21)

Using (21), the minimum error can further be upper bounded by

  min_i e_i <= min_i (p_i + p_{i+1}) Delta x_i^2 (1/(x_i ln 4)).        (22)

Now suppose that we have l mass points with x_i <= 1/m and m - l mass points with x_i >= 1/m. For the first l mass points we use the upper bound obtained for Algorithm 1. Hence, for 1 <= i <= l, we have

  min_i e_i <= min_i p_i h(x_i)        (23)
           <= O(log(m)/l^2),        (24)

where (23) is due to (15) and (24) can be derived, again, by applying the Cauchy-Schwarz inequality. Note that this time

  sum_{i=1}^{l} h(x_i) <= m h(1/m) <= O(log(m)).        (25)

For the m - l mass points one can write

  min_i e_i <= min_i (p_i + p_{i+1}) Delta x_i^2 (1/(x_i ln 4))        (26)
           <= min_i (p_i + p_{i+1}) Delta x_i^2 (m/ln 4)        (27)
           <= O(m/(m-l)^3),        (28)

where (28) is due to Hölder's inequality, as follows. Let q_i = p_i + p_{i+1}; then sum_i q_i <= 2 and sum_i Delta x_i <= 1/2. We have

  min_i q_i Delta x_i^2 = ( min_i (q_i Delta x_i^2)^{1/3} )^3.        (29)

Now, by applying Hölder's inequality,

  sum_i (q_i Delta x_i^2)^{1/3} <= ( sum_i q_i )^{1/3} ( sum_i Delta x_i )^{2/3} <= 1.        (30)

Therefore,

  min_i e_i <= m ( min_i (q_i Delta x_i^2)^{1/3} )^3 <= O(m/(m-l)^3).        (31)

Overall, the error made in the first step of the algorithm is

  min_i e_i <= min { O(log(m)/l^2), O(m/(m-l)^3) }        (32)
           <= O(log(m)/m^{2.5}).        (33)

Thus, the error generated by running the whole algorithm can be upper bounded by sum_{m=k+1}^{infinity} O(log(m)/m^{2.5}) = O(log(k)/k^{1.5}).

ACKNOWLEDGMENTS

The authors are grateful to Rüdiger Urbanke for helpful discussions. This work was supported in part by grant number 200021-125347 of the Swiss National Science Foundation.

REFERENCES

[1] E. Arıkan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051-3073, Jul. 2009.
[2] S. B. Korada, "Polar codes for channel and source coding," Ph.D. dissertation, EPFL, Lausanne, Switzerland, Jul. 2009.
[3] I. Tal and A. Vardy, "How to construct polar codes," [Online]. Available: http://arxiv.org/pdf/1105.6164
[4] S. H. Hassani, S. B. Korada, and R. Urbanke, "The compound capacity of polar codes," Proceedings of the Allerton Conference on Communication, Control and Computing, Allerton, Sep. 2009.
[5] R. Mori and T. Tanaka, "Performance and construction of polar codes on symmetric binary-input memoryless channels," Proceedings of ISIT, Seoul, South Korea, Jul. 2009, pp. 1496-1500.
[6] T. Richardson and R. Urbanke, Modern Coding Theory. Cambridge University Press, 2008.
25) =1 For the l ass ponts one can wrte ne np +p +1 ) x 2 1 x ln4) np +p +1 ) x 2 ln4) O l) 3 26) 27) ), 28) where 28) s due to Hölder s nequalty as follows: Let q = p + p +1. Therefore, p + p +1 ) 2 and x 1/2. n q x 2 = ) 1/3 3 nq x) 2 = Now by applyng Hölder s nequalty we have n q x 2 ) 1/3 ) 3 29) ) 1/3 ) 2/3 q x 2 ) 1/3 q x 1 30) Therefore, ne ) nq x 2 )1/3) 3 O l) 3. 31) Overall, the error ade n the frst step of the algorth would be { ) )} log) ne n O l 2,O l) 3 32) ) log) O. 33) 2.5 Thus, the error generated by runnng the whole algorth can be upper bounded by log) =k+1 O logk) 2.5 k ). 1.5 ACKNOWLEDGMENTS authors are grateful to Rüdger Urbanke for helpful dscussons. Ths work was supported n part by grant nuber 200021-125347 of the Swss Natonal Scence Foundaton. REFERENCES [1] E. Arıkan, Channel Polarzaton: A Method for Constructng Capacty- Achevng Codes for Syetrc Bnary-Input Meoryless Channels, IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051 3073, Jul. 2009. [2] S. B. Korada, Polar Codes for Channel and Source Codng, Ph.D. dssertaton,, Lausanne, Swtzerland, Jul. 2009. [3] I. Tal and A. Vardy, How to Construct Polar Codes, [Onlne]. Avalable: http://arxv.org/pdf/1105.6164. [4] S. H. Hassan, S. B. Korada, and R. Urbanke, The Copound Capacty of Polar Codes, Proceedngs of Allerton Conference on Councaton, Control and Coputng, Allerton, Sep. 2009. [5] R. Mor and T. Tanaka, Perforance and Constructon of Polar Codes on Syetrc Bnary-Input Meoryless Channels, Proceedngs of ISIT, Seoul, South Korea, Jul. 2009, pp. 1496 1500. [6] T. Rchardson and R. Urbanke, Modern Codng Theory, Cabrdge Unversty Press, 2008.