Polar codes for reliable transmission


Polar codes for reliable transmission
Theoretical analysis and applications

Jing Guo
Department of Engineering
University of Cambridge

Supervisor: Prof. Albert Guillén i Fàbregas

This dissertation is submitted for the degree of Doctor of Philosophy.
Trinity Hall, June 2015


I would like to dedicate this thesis to my parents, who provide me with two essential things: life and love.


Declaration

I hereby declare that except where specific reference is made to the work of others, the contents of this dissertation are original and have not been submitted in whole or in part for consideration for any other degree or qualification in this, or any other, university. This dissertation is my own work and contains nothing which is the outcome of work done in collaboration with others, except as specified in the text and Acknowledgements. This dissertation contains fewer than 65,000 words including appendices, bibliography, footnotes, tables and equations, and has fewer than 150 figures.

Jing Guo
June 2015


Acknowledgements

First, I would like to express my sincerest gratitude to my supervisor, Prof. Albert Guillén i Fàbregas. Discussions with him have been very enlightening, and his sharp and rigorous thinking has illuminated my path into research. I have benefited a lot from his attitude towards life and work. I would like to thank him for his belief in me, which has allowed me to pursue research in my own interests, and for his patience and encouragement whenever I got stuck.

It was a great pleasure to study at the University of Cambridge; I have enjoyed my time spent here, both within and outside my research. I would like to thank Prof. Paul H. Siegel at the University of California, San Diego (UCSD) for the collaboration with his group and for inviting me to visit UCSD. I would like to thank all my collaborators, Jossy Sayir, Minghai Qin, Aman Bhatia, and Paul H. Siegel, for their great contributions to my dissertation. I would like to thank Minghai Qin for proofreading this thesis in great detail and providing many useful suggestions that dramatically improved its quality, and Jonathan Scarlett and Jossy Sayir for their help with proofreading.

I would like to thank the staff in Division F, in particular Rachel Fogg, Lorraine Baker, and Phill Richardson, for their kind help in making my life in the department more enjoyable and easier. I would like to thank the secretaries at the Universitat Pompeu Fabra (UPF), Joana Clotet, Vanessa Jiménez, and Beatriz Abad, for their kind help during my stay at UPF.

I would like to thank the current members and alumni of the group: Alfonso Martinez, Gonzalo Vázquez Vilar, Jossy Sayir, Adrià Tauste Camp, Tobias Koch, Alex Alvarado, Li Peng, Taufiq Agustian Asyhari, Jonathan Scarlett, and Seçkin Anıl Yıldırım. I have benefited a lot from the lunch-time discussions and conversations with them. In particular, I learned a lot from Alfonso: his capacious knowledge and deep sympathy towards underprivileged people have had a great influence on me.

I would like to thank all my friends, including Chong Chen, Lan Jiang, Jonathan Scarlett, Xiao Rong, Yingsong Zhang, Zhihan Xu, Bo Zhen, Yining Chen, Wei Wu,

Fei Jin, Fei Wang, Ningjun Jiang, Kai Gu, Li Peng, Xiaoke Yang, Wenjing Yan, Shihui Guo, Xuefei Wu, Minghai Qin, Keqian Yan, Xiaozhi Fu, Mengshi Wang, Ruizhi Liao, Zhouyue Wang, Xinyi Wu, Ling Tan, and many others. I have shared many precious moments with them. I would like to thank Chong Chen, in particular, for always being there and walking me through difficulties, and Minghai Qin for helpful discussions and constant support during the last two years of my PhD.

Finally, I would like to express my deepest gratitude to my parents and my grandparents for their continued love and support throughout all stages of my life.

This thesis was supported in part by the Cambridge Overseas Trust, the Chinese Scholarship Council, and the European Research Council under an ERC grant agreement.

Abstract

Polar codes, as the first provable capacity-achieving codes for binary discrete memoryless channels with low encoding and decoding complexity, have attracted considerable attention recently. We first study the channel polarization phenomenon over arbitrary-input discrete memoryless channels. We show that, under the restriction that the channel has zero zero-error capacity, channel polarization occurs for any input alphabet that forms a group. We also discuss the set of channels to which the virtual channels converge.

We then study a family of polar codes whose frozen set is chosen by discarding the bit-channels for which the mutual information falls below a certain (fixed) threshold. We show that if the threshold, which may depend on the code length, is bounded appropriately, a coding theorem can be proved for the underlying polar code. We also give accurate closed-form upper and lower bounds on the minimum distance of the resulting code when the design channel is a binary erasure channel.

Furthermore, we investigate code constructions for polar codes in the finite-blocklength regime. We propose a concatenation scheme that utilizes the bit-channels that are not fully polarized. Simulation results show that the concatenated polar codes outperform conventional polar codes under belief propagation decoding. Moreover, by tracking how the messages are updated during belief propagation decoding, we propose a log-likelihood-ratio-oriented criterion for selecting the frozen set. This criterion shows advantages over the conventional construction under both soft-output and hard-output decoding.

Finally, we study sphere decoding algorithms for polar codes, which achieve the maximum-likelihood performance. We propose branching conditions that are stricter than those of previous approaches and that significantly reduce the search complexity; simulation results show a two-order-of-magnitude improvement over existing sphere decoding algorithms. Based on the recursive structure of the generator matrix of polar codes, we also propose a bit-reversed decoding order that further reduces the complexity in the low-to-medium SNR regime.


Table of contents

List of figures
List of tables
Nomenclature

1 Introduction
   1.1 Channel model
   1.2 Channel coding
      Channel coding theorem
   1.3 A brief review of existing coding schemes
   1.4 Polar codes
   1.5 Dissertation overview
   1.6 Notations

2 A review of polar codes
   2.1 Preliminaries
   2.2 Channel transformation
      Channel combining
      Channel splitting
      Recursive channel transformation
   2.3 Channel polarization
   2.4 Polar code encoding
   2.5 SC decoding of polar codes
   2.6 Coding theorems
   2.7 Practical code constructions
   2.8 Summary

3 Channel polarization for arbitrary-input DMCs
   3.1 Introduction
      Entropy inequality for prime-input DMCs
   3.2 Preliminaries
      Zero-error capacity
      Basic algebraic structures
   3.3 Entropy inequalities for arbitrary-input DMCs
   3.4 Channel polarization for arbitrary-input DMCs
      Channel polarization over groups
      Channel polarization over monoids
   3.5 Proofs
      Proof of Lemma …
      Proof of Lemma …
      Proof of Lemma …
      Proof of Lemma …
      Proof of Lemma …
      Proof of Theorem …
   3.6 Conclusion

4 Fixed-threshold polar codes
   Introduction
   Preliminaries
   Fixed-threshold construction
   A coding theorem
   Minimum distance
   Proofs
      Proof of Lemma …
      Proof of Theorem …
      Proof of Theorem …
   Conclusion

5 Belief propagation decoding of polar codes through concatenation
   Introduction
   Factor graph
      Message updating rules
      Factor graph of standard polar codes
   Construction of concatenated polar codes
      Code construction and factor graph representation
   BP decoding of concatenated polar codes
      Flooding BP
      SCAN BP
      Early termination check
   Simulation results
      Channel ordering
      Outer LDPC code
      Outer convolutional code
      Results
   Conclusion

6 Polar code constructions adapted for BP+ and SCL decoders
   Introduction
   Analysis of LLRs with fixed iterations under BP+ decoding
   Simulation results of LLR-oriented constructions under BP+ decoding
      BP+ decoding
      BP+ decoding with guessing
   LLR-oriented construction under SCL decoding
      SCL decoding
      Weight enumerating functions
      Numerical results
   Conclusion

7 Efficient sphere decoding of polar codes
   Introduction
   Modulation and channel model
   Sphere decoding algorithms
      Sphere decoding with fixed lower bounds
      Simulations
      Sphere decoding with dynamic lower bounds
      Simulation results
   Sphere decoding with bit-permute orders
      Definitions and properties of bit-permute orders
      An example of bit-permute orders
      Analysis on bit-permute orders
   Conclusion

8 Conclusions and future work

References

List of figures

1.1 The basic communication system
1.2 Channel polarization for a BEC with ǫ = …
2.1 Vector channel W_2
2.2 Vector channel W_4
2.3 Vector channel W_N
2.4 Encoding procedure for an (8, 4) polar code, BEC with ǫ = …
2.5 A degrading operation
3.1 One step channel transformation
3.2 Channels belonging to C*_0
3.3 Setup conditioned on U_1 = …
3.4 Channel described by case …
3.5 Channel described by case …
4.1 Thresholds of Arıkan's fixed-rate polar codes for R = 0.3, 0.4 and fixed-threshold polar codes with threshold θ_N over the BEC with I(W) = …
4.2 Rate convergence for different threshold functions θ_N over a BEC with I(W) = …
4.3 Minimum distance of fixed-threshold polar codes with θ_N = 1 - 1/N - 2^{-N^β} with β = 2 over the BEC
4.4 Minimum distance d_min for Arıkan's fixed-rate polar codes with R = 0.35, 0.49 and for a fixed-threshold polar code with threshold function θ_N = 1 - 2^{-N^β} with β = 2, for a BEC with I(W) = …
5.1 Factor graph of a (7, 4) Hamming code
5.2 Message propagation between CNs and VNs
5.3 Factor graph of a standard polar code of length N = …
5.4 Thresholds for sorted Bhattacharyya parameters Z(W_N^{(i)}). The underlying channel is an AWGN channel with E_b/N_0 = 0 dB; blocklength N = …
5.5 Factor graph of concatenated polar codes of length N = …
5.6 FER of BP decoding with two schedulings (flooding and SCAN) over AWGN channels. All codes have length N = 4096 and overall rate R = 1/2. The concatenated LDPC code is a (3,5)-regular Tanner code
5.7 Average number of iterations for early-termination-enabled BP+ decoding, (a) flooding, (b) SCAN. All codes have length N = 4096 and rate R = 1/2
5.8 FER comparison of fixed-iteration BP decoding and early-termination-enabled BP+ (flooding). All codes have length N = 4096 and overall rate R = 1/2. The concatenated LDPC code is a (3,5)-regular Tanner code
5.9 FER comparison of fixed-iteration BP decoding and early-termination-enabled BP+ (SCAN). All codes have length N = 4096 and overall rate R = 1/2. The concatenated LDPC code is a (3,5)-regular Tanner code
5.10 FER and BER comparison of different schemes: 1) early-termination-enabled BP (flooding); 2) early-termination-enabled BP (flooding) with a concatenated Tanner code; 3) successive cancellation (SC) decoding; 4) early-termination-enabled BP (SCAN); 5) early-termination-enabled BP (SCAN) with a concatenated Tanner code; 6) early-termination-enabled BP (SCAN) with a concatenated convolutional code. All codes have length N = 4096 and overall rate R = 1/2. The polar code is constructed based on [75] at E_b/N_0 = 0 dB
5.11 Average number of iterations for early-termination-enabled BP+ decoding, (a) flooding and (b) SCAN. All codes have length N = 4096 and rate R = 1/2
5.12 FER and BER comparison of different concatenation schemes. All decoders are based on early-termination-enabled BP+ (SCAN) decoding over the AWGN channel at SNR = 4 dB. All codes have length N = 4096 but unequal rate
6.1 FER of polar codes and RM codes over AWGN channels under SCL decoding with list size L = 128. The codes have blocklength N = 128 and rate R = 1/2. The polar code is constructed based on [75] at E_b/N_0 = 3 dB
6.2 LLR analysis of BP (SCAN) over the AWGN channel at E_b/N_0 = 2.25 dB of a polar code with N = 1024 and R = 1/2. The polar code is constructed based on [75] at E_b/N_0 = 2.25 dB
6.3 LLR analysis of BP (SCAN) over the AWGN channel at E_b/N_0 = 2.25 dB using an LLR-oriented construction obtained by swapping 12 bit-channels of the conventional construction [75]
6.4 FER of BP (SCAN) over AWGN channels with conventional and LLR-oriented constructions. All codes have length N = 1024, and the rate is R = 0.5 and 0.86, respectively
6.5 FER comparison of BP+ (SCAN) with guessing over AWGN channels. Both codes have length N = 4096 and rate R = …
6.6 Breadth-first search of a decoding tree with list size L = …
6.7 Union bounds and FER comparison of LLR-oriented and conventional constructions of polar codes over AWGN channels. Both codes have length N = 1024 and rate R = 1/2. The decoder is an SCL decoder of list size …
6.8 FER comparison of LLR-oriented and conventional constructions of polar codes with SCL decoders over AWGN channels. All codes have length N = 1024 and rate R = 1/2. The conventional construction is based on [75] optimized at E_b/N_0 = 2.25 dB. The SCL decoders have list sizes 16 and 32. The CRC-8 is an 8-bit CRC from [74]
6.9 FER comparison of LLR-oriented and conventional constructions of polar codes with CRC-aided SCL decoding over AWGN channels. Both codes have length N = 1024 and rate R = …. The SCL decoder has list size 64. The CRC-8 is an 8-bit CRC from [74]
6.10 Union bounds and FER comparison of LLR-oriented and conventional constructions of polar codes with CRC-aided SCL decoding over AWGN channels. Both codes have length N = 1024 and R = …
7.1 Average number of nodes visited for the (64, 57) RM code
7.2 Error rate performance for the (64, 57) RM code
7.3 Average number of nodes visited for (64, K) RM codes with R = K/64 over AWGN channels with E_b/N_0 = 6 dB
7.4 Average number of nodes visited for (N, RN) polar codes with R = 0.89 over an AWGN channel with E_b/N_0 = 8 dB
7.5 Average number of nodes visited for (64, K) RM codes over AWGN channels with E_b/N_0 = 6 dB
7.6 Average number of nodes visited for a (64, 32) RM code for two decoding orders (fixed lower bounds applied) over AWGN channels: 1) natural order; 2) bit-reverse order
7.7 Average number of nodes visited for a (64, 32) polar code over AWGN channels; four decoding orders of maximum and minimum sums are shown, compared to the average over all n! decoding orders

List of tables

6.1 Number of weight-16 codewords for polar codes with N = 1024 and R = 1/2 with conventional and LLR-oriented constructions


Nomenclature

Roman Symbols
C      Channel capacity
I      Mutual information
N      Blocklength
O      Order notation for sequences

Other Symbols
R      Set of real numbers
X      Input alphabet
Y      Output alphabet
Z      Set of integers
Z^+    Set of positive integers

Acronyms / Abbreviations
AWGN   Additive white Gaussian noise
B-DMC  Binary discrete memoryless channel
BEC    Binary erasure channel
BER    Bit error rate
BP     Belief propagation
BSC    Binary symmetric channel
CRC    Cyclic redundancy check
DMC    Discrete memoryless channel
FER    Frame error rate
i.i.d. Independent and identically distributed
LDPC   Low-density parity-check
LLR    Log-likelihood ratio
ML     Maximum-likelihood
SCL    Successive cancellation list
SC     Successive cancellation
SD     Sphere decoding

Chapter 1
Introduction

One of the most important questions in information theory is at what rate information bits can be reliably transmitted over a noisy channel. In real-world communication systems, data transmission is inevitably affected by noise. In order to recover the original data after transmission in a noisy environment, redundancy needs to be added to the data before transmission. The procedure of adding redundancy is called channel coding, and it is essential in communication systems. An efficient coding scheme adds redundancy in such a way that the coding rate is high while the probability of error is kept low.

1.1 Channel model

The mathematical framework of a communication system is shown in Fig. 1.1. A message m to be transmitted is mapped into a data sequence x by the encoder. The transmitted sequence x is corrupted by noise during transmission, so the received sequence y differs from x. The output of the decoder, m̂, is an estimate of the transmitted message m based on the received sequence y.

Fig. 1.1 The basic communication system: m → Encoder → x → Channel (with noise) → y → Decoder → m̂.

The channel is the physical medium over which data is transmitted. It can be modeled

by a transition probability function W(y|x) that depends on the communication environment: W(y|x) is the probability of observing y when the sequence x is sent. Throughout the thesis, we will focus on discrete memoryless channels (DMCs), defined as follows.

Definition [44] A discrete memoryless channel (DMC), denoted by W : X → Y, consists of two finite sets X and Y and a collection of probability mass functions W(y|x), one for each x ∈ X, such that for every x and y, W(y|x) ≥ 0, and for every x, \sum_y W(y|x) = 1, with the interpretation that x is the input and y is the output of the channel. The probability distribution of the output at a given time depends only on the input at that time and is conditionally independent of previous channel inputs or outputs; i.e., for k channel uses, the probability of observing y_k when x_1^k is sent and y_1^{k-1} has been observed is

W(y_k | x_1^k, y_1^{k-1}) = W(y_k | x_k),   k = 1, 2, ...   (1.1)

Furthermore, if a DMC is used without feedback, we have

W(y_1^k | x_1^k) = \prod_{i=1}^{k} W(y_i | x_i),   k = 1, 2, ...   (1.2)

Throughout this thesis, DMCs are used without feedback.

The transition matrix of a DMC describes the transition probabilities between the input and output alphabets: the entry in the ith row and jth column is the conditional probability that the jth element of Y is received when the ith element of X is sent. A DMC can be classified by the symmetry¹ of its transition matrix.

Definition A DMC is said to be symmetric if any two rows of its transition matrix are permutations of each other, and any two columns are permutations of each other. For example, a DMC with transition matrix

W(y|x) = [ 0.3  0.2  0.5
           0.5  0.3  0.2
           0.2  0.5  0.3 ]   (1.3)

is a symmetric DMC.

¹ There are various definitions of symmetry, but we will follow the definition in [14, Chapter 8].

Definition A DMC is said to be weakly symmetric if any two rows of its transition matrix are permutations of each other, and each column sums to the same value. For example, a DMC with transition matrix

W(y|x) = [ 1/3  1/6  1/2
           1/3  1/2  1/6 ]   (1.4)

is a weakly symmetric DMC.

A DMC can also be classified by the cardinality of its input alphabet.

Definition If the input alphabet of a DMC has only two elements, the channel is said to be a binary-input discrete memoryless channel (B-DMC).

Definition If the cardinality of the input alphabet of a DMC is a prime number, the channel is said to be a prime-input discrete memoryless channel (prime-input DMC).

The mutual information provides a measure of the dependence between two random variables, and it is a key quantity in coding and information theory.

Definition The mutual information I(X; Y) of a DMC W : X → Y with input distribution p(x) is defined as

I(X; Y) = \sum_{x ∈ X} \sum_{y ∈ Y} p(x) W(y|x) \log \frac{W(y|x)}{\sum_{x' ∈ X} p(x') W(y|x')},   (1.5)

where p(x) is the distribution of the input X.

Remark 1. The base of the logarithm will be set to 2 by default. Other bases will be explicitly shown or stated.

While the mutual information depends on the input distribution, we are particularly interested in the mutual information under the uniform input distribution.

Definition The symmetric mutual information of a DMC W : X → Y is defined as

I(W) = \sum_{y ∈ Y} \sum_{x ∈ X} \frac{1}{q} W(y|x) \log \frac{W(y|x)}{\sum_{x' ∈ X} \frac{1}{q} W(y|x')},   (1.6)

where q is the cardinality of the input alphabet X.
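As a concrete illustration, Eq. (1.6) can be evaluated directly when a channel is specified by its transition matrix. The following is a minimal Python sketch, assuming the channel is given as a row-stochastic NumPy array with one row per input symbol; the helper name is illustrative.

```python
import numpy as np

def symmetric_mutual_information(W):
    """Symmetric mutual information I(W) of Eq. (1.6), in bits.

    W is a q x |Y| row-stochastic transition matrix (rows indexed by
    inputs, columns by outputs); the input is uniform over q symbols.
    """
    q, m = W.shape
    p_y = W.sum(axis=0) / q      # output distribution under the uniform input
    total = 0.0
    for x in range(q):
        for y in range(m):
            if W[x, y] > 0:      # zero-probability terms contribute nothing
                total += W[x, y] / q * np.log2(W[x, y] / p_y[y])
    return total

# Sanity check: a BSC with crossover 0.11 has I(W) = 1 - h(0.11), roughly 0.5 bits.
bsc = np.array([[0.89, 0.11],
                [0.11, 0.89]])
print(symmetric_mutual_information(bsc))   # approximately 0.500
```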

As we will see below, the mutual information maximized with respect to the input distribution also plays a fundamental role in channel coding.

Definition The information capacity of a DMC W : X → Y is defined as

C = \max_{p(x)} I(X; Y),   (1.7)

where the maximum is over the set of all input distributions p(x).

1.2 Channel coding

As long as the channel is not noiseless, data may be corrupted during transmission. Channel coding is an effective method to combat noise. We first give definitions related to channel codes. This section is mainly based on [14, Chapter 8].

Definition An (M, N) code for the channel W : X → Y consists of the following:
1. An index set {1, 2, ..., M}, representing messages.
2. An encoding function f : {1, 2, ..., M} → X^N. A codeword is generated by applying x = f(m), m ∈ {1, 2, ..., M}. The set of codewords is called the codebook.
3. A decoding function g : Y^N → {1, 2, ..., M}, which is a deterministic rule that assigns an estimated message to each possible received vector.

We use λ_m to denote the probability of error when the index m is transmitted, i.e.,

λ_m = P( g(Y_1^N) ≠ m | X_1^N = f(m) ).   (1.8)

In the following, we define some important terminology related to an (M, N) code.

Definition The average probability of error P_e for an (M, N) code is defined as

P_e = \frac{1}{M} \sum_{m=1}^{M} λ_m.   (1.9)

Definition The maximal probability of error λ^{(N)} for an (M, N) code is defined as

λ^{(N)} = \max_{m ∈ {1, 2, ..., M}} λ_m.   (1.10)

Definition The rate R of an (M, N) code is defined as

R = \frac{\log M}{N}  bits per transmission.   (1.11)

Definition A rate R is said to be achievable if there exists a sequence of (2^{NR}, N) codes such that the maximal probability of error λ^{(N)} tends to 0 as N → ∞.

Definition The operational capacity of a discrete memoryless channel is the supremum of all achievable rates. The operational capacity measures the highest rate at which one can communicate reliably.

Channel coding theorem

In the landmark work [70], Shannon proved that the operational capacity is equal to the information capacity.

Theorem (The channel coding theorem [70]). All rates below the information capacity C are achievable. Specifically, for every rate R < C, there exists a sequence of (2^{NR}, N) codes with maximal probability of error λ^{(N)} → 0 as N → ∞. Conversely, any sequence of (2^{NR}, N) codes with λ^{(N)} → 0 must have R ≤ C.

To prove the achievability of the information capacity, Shannon calculated the average probability of error P_e, averaged over a randomly generated codebook. He showed that P_e → 0 for any R < C as N → ∞. The vanishing P_e guarantees the existence of at least one good code that achieves the information capacity. By discarding the worst half of the codewords of the good code, the maximal probability of error λ^{(N)} under joint typicality decoding converges to 0 for any R < C as N → ∞.

Remark 2. Since the information capacity and the operational capacity are equal, in the rest of the thesis we will simply use the word capacity to refer to either definition, according to context, when there is no possibility of confusion.

1.3 A brief review of existing coding schemes

Since Shannon proved the existence of capacity-achieving codes, there has been a tremendous effort in searching for such codes. In addition to low probabilities of error, practical codes should have low encoding and decoding complexity. In this section, we briefly review existing low-complexity coding schemes; we refer the reader to [15] for a detailed history of channel coding theory and applications.

One direction of research focuses on finding linear codes with good algebraic properties, such as a large minimum Hamming distance. Hamming codes [28], Golay codes [26], Reed-Muller codes [59, 49], Reed-Solomon codes [60], and BCH codes [11, 31, 9, 43] are important examples in this category. The constructions of Reed-Muller codes bear some resemblance to those of polar codes, but they seek to maximize the minimum distance instead of the quality measures considered in polar coding.

Another direction of research, instead of finding specific codes with good worst-case performance, focuses on finding families of codes with good performance on average. Convolutional codes, concatenated codes (such as turbo codes [46]), and low-density parity-check (LDPC) codes [23] fall within this category.

Polar codes have an advantage over the above coding schemes in that they are provably capacity-achieving with low encoding and decoding complexity. Spatially coupled LDPC codes have been introduced recently and are also capacity-achieving, but the proofs are typically more involved.

1.4 Polar codes

Despite the excellent performance of turbo codes and LDPC codes in practice, none of the aforementioned codes has been proved to achieve the capacity of channels other than the binary erasure channel (BEC). Polar codes, introduced by Arıkan [3], are the first provably capacity-achieving codes with low encoding and decoding complexity for symmetric B-DMCs. The encoding and decoding complexities of polar codes are O(N log N), where N is the blocklength of the code. The analysis and construction of polar codes can be summarized as follows: (1) Given a symmetric B-DMC, virtual channels between the bits at the input of a linear encoder and the channel output sequence are created, such that the capacity of these virtual channels polarizes to either zero or one as the blocklength tends to infinity; the proportion of virtual channels with capacity close to one converges to the capacity of the original channel. This phenomenon is termed channel polarization. (2) By transmitting data bits through the noiseless

virtual channels, polar codes achieve the capacity under successive cancellation (SC) decoding. Fig. 1.2 demonstrates how the symmetric mutual information of the virtual channels polarizes as the blocklength grows. We will give a detailed review of polar codes in Chapter 2.

Fig. 1.2 Channel polarization for a BEC with ǫ = … : symmetric mutual information versus normalized virtual channel index, for blocklengths N = 2^8, N = 2^10, and N = ….

1.5 Dissertation overview

In Chapter 2, we give a literature review of polar codes designed for B-DMCs. We start from a theoretical point of view, explaining why polar codes can achieve the symmetric mutual information of B-DMCs as the blocklength tends to infinity. We then move on to practical constructions, including encoding and decoding schemes for polar codes in the finite-blocklength regime.

In Chapter 3, we study the channel polarization of arbitrary-input DMCs. We show that if the original channel has zero zero-error capacity, the virtual channels polarize towards channels with positive zero-error capacity or channels with zero capacity. The main contribution of this chapter is a simpler proof of the channel polarization theorem for arbitrary-input DMCs. The research work in this

chapter has been published in part in the paper: Jing Guo, Jossy Sayir, Minghai Qin and Albert Guillén i Fàbregas, "An alternative proof of channel polarization for channels with arbitrary input alphabets," accepted by the 53rd Annual Allerton Conference on Communication, Control, and Computing.

In Chapter 4, we study a family of polar codes whose construction rule discards the virtual channels for which the mutual information falls below a certain (fixed) threshold. We show that if the threshold, which may depend on the code length, is bounded appropriately, a coding theorem can be proved for the underlying polar code. We also give accurate closed-form upper and lower bounds on the minimum distance of the resulting code when the original channel is a binary erasure channel. The research work in this chapter has been published in part in the paper: Jing Guo, Albert Guillén i Fàbregas and Jossy Sayir, "Fixed-threshold polar codes," in Proc. IEEE International Symposium on Information Theory (ISIT) 2013, Istanbul, Turkey, July 2013.

In Chapter 5, we propose a concatenated polar coding scheme employing an inner polar code and an outer LDPC code or convolutional code on the intermediate-quality virtual channels, coupled with belief propagation (BP) decoding. We also propose an early termination method that reduces the decoding complexity. Both parallel and sequential updating schedules on polar decoding graphs are considered, and performance comparisons between concatenated polar codes and standard polar codes are provided. The research work in this chapter has been published in part in the paper: Jing Guo, Minghai Qin, Albert Guillén i Fàbregas and Paul H. Siegel, "Enhanced belief propagation decoding of polar codes through concatenation," in Proc. IEEE International Symposium on Information Theory (ISIT) 2014, Honolulu, HI, USA, July 2014.

In Chapter 6, we analyze how the messages passed through the decoding graph evolve during the BP decoding procedure. Based on this analysis, a construction of polar codes that is suitable for BP decoding is proposed. We also analyze the Hamming weight enumerating function of the codes formed by this construction. Simulation results show that this construction also provides better performance than the standard construction under successive cancellation list (SCL) decoding.

In Chapter 7, we propose an efficient sphere decoding (SD) algorithm for polar codes that achieves the maximum-likelihood (ML) performance. We improve standard SD branching conditions by computing lower bounds on the optimal decoding metric; both fixed and dynamic lower bounds are considered. We also propose an alternative decoding order based on the structure of polar codes, which further reduces

the decoding complexity in the low-to-medium SNR regime. The research work in this chapter has been published in part in the paper: Jing Guo and Albert Guillén i Fàbregas, "Efficient sphere decoding of polar codes," accepted for publication at the IEEE International Symposium on Information Theory (ISIT) 2015.

In Chapter 8, we conclude the thesis by summarizing the main contributions of each chapter and by discussing potentially interesting open problems for future research.

1.6 Notations

Throughout the thesis, we will use the following notation.
- We use upper case letters such as X, Y, U to denote random variables, and lower case letters such as x, y, u to denote their realizations.
- We use W : X → Y to denote a discrete memoryless channel (DMC) with input alphabet X and output alphabet Y, and W(y|x) to denote the transition probability of the channel W.
- We use bold symbols such as x, y, u to denote vectors; row vectors are assumed. We use 0 and 1 to denote the all-zero and all-one vectors, respectively.
- We use d_H(u, v) to denote the Hamming distance between binary vectors u and v.
- We define [b] = {1, ..., b} for b ∈ Z^+.
- We use F to denote a subset of [N], F^c to denote its complement, and |F| to denote the cardinality of F.
- We let u_F denote the sub-vector of u with indices i ∈ F, and u_a^b denote the sub-vector (u_a, ..., u_b) for 1 ≤ a ≤ b ≤ N.
- We use g_{i,j} to denote the element in the ith row and jth column of a matrix G, G^T to denote the transpose of G, and G^{-1} to denote the inverse of a non-singular matrix G.


Chapter 2
A review of polar codes

In this chapter, we give a literature review of polar codes designed for B-DMCs from both the theoretical and practical points of view. We first present the channel transformation from which virtual channels are created. Then we introduce the channel polarization phenomenon, based on which polar codes are constructed. Finally, we move on to practical constructions, encoding, and decoding schemes of polar codes in the finite-blocklength regime. This chapter lays the foundations for the rest of the thesis.

2.1 Preliminaries

In this chapter, we take the input alphabet X to be {0, 1}. We first introduce two channel parameters that are of primary interest. One is the symmetric mutual information defined in the previous chapter; we repeat the definition here.

Definition The symmetric mutual information of a B-DMC W : X → Y is defined as

I(W) = \sum_{y ∈ Y} \sum_{x ∈ X} \frac{1}{2} W(y|x) \log \frac{W(y|x)}{\frac{1}{2} W(y|0) + \frac{1}{2} W(y|1)}.   (2.1)

The other is the Bhattacharyya parameter, which measures the reliability of the channel for a single use.

Definition The Bhattacharyya parameter of a B-DMC W : X → Y is defined as

Z(W) = \sum_{y ∈ Y} \sqrt{W(y|0) W(y|1)}.   (2.2)
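The Bhattacharyya parameter is equally easy to evaluate numerically. The sketch below, an illustration in the same style as the mutual-information sketch of Chapter 1, computes Eq. (2.2) for a channel given as a 2 × |Y| transition matrix; for a BEC with erasure probability ǫ one recovers Z(W) = ǫ.

```python
import numpy as np

def bhattacharyya(W):
    """Bhattacharyya parameter Z(W) of a B-DMC, Eq. (2.2).

    W is a 2 x |Y| transition matrix; row 0 is W(.|0), row 1 is W(.|1).
    """
    return float(np.sum(np.sqrt(W[0] * W[1])))

eps = 0.3
bec = np.array([[1 - eps, eps, 0.0],     # output order: 0, erasure, 1
                [0.0,     eps, 1 - eps]])
print(bhattacharyya(bec))                # 0.3, i.e., Z(BEC(eps)) = eps
```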

A relationship between I(W) and Z(W) is stated in the following proposition.

Proposition ([3]). For any B-DMC W : X → Y, we have

I(W) ≥ \log \frac{2}{1 + Z(W)},   (2.3)

I(W) ≤ \sqrt{1 - Z(W)^2}.   (2.4)

This proposition implies that if Z(W) is close to zero then I(W) is close to one, whereas if Z(W) is close to one then I(W) is close to zero.

We now introduce a matrix operation on which the generator matrix of polar codes is built.

Definition The Kronecker product of an m × n matrix A and a k × l matrix B is defined as

A ⊗ B = [ a_{11}B  ...  a_{1n}B
            ⋮             ⋮
          a_{m1}B  ...  a_{mn}B ].   (2.5)

We use A^{⊗n} to denote the Kronecker product of the matrix A with itself n times, i.e.,

A^{⊗n} = A ⊗ A ⊗ ... ⊗ A  (n factors).   (2.6)

2.2 Channel transformation

Given a B-DMC, the virtual channels are created by a channel transformation that consists of two steps: channel combining and channel splitting [1]. N copies of the original channel W are transformed into N virtual channels W_N^{(i)}, which have properties that enable polar codes to achieve the symmetric mutual information. W_N^{(i)}, i ∈ [N], is also termed the ith bit-channel of W. We will discuss the channel transformation in detail in this subsection and the properties of the bit-channels in the next.

Channel combining

Channel combining is the step that combines copies of the original channel W into a vector channel W_N : X^N → Y^N. The vector channel W_N is the virtual channel between the input sequence u_1^N of a linear encoder and the output sequence y_1^N of N copies of the original channel W. We use W^N : X^N → Y^N to denote the vector

channel between the input sequence x_1^N and the output sequence y_1^N of N copies of the original channel W. The transition probabilities of the channels W_N, W^N and W are related by

W_N(y_1^N | u_1^N) = W^N(y_1^N | x_1^N)   (2.7)
                   = \prod_{i=1}^{N} W(y_i | x_i).   (2.8)

The linear encoder that maps u_1^N ↦ x_1^N can be represented by a square matrix G_N, which is created by applying the Kronecker product to the base matrix

G_2 = [ 1  0
        1  1 ]   (2.9)

n = log N times. Notice that G_N can be constructed in a recursive manner as follows:

G_N = G_2 ⊗ G_{N/2}   (2.10)
    = G_2 ⊗ (G_2 ⊗ G_{N/4})   (2.11)
    = G_2^{⊗n}.   (2.12)

This enables us to define the procedure of channel combining in a recursive fashion. Let ⊕ denote the bitwise XOR operation, i.e., for two vectors u_1^N and v_1^N,

u_1^N ⊕ v_1^N = (u_1 ⊕ v_1, ..., u_N ⊕ v_N).   (2.13)

In the first step, we combine two copies of the original channel W into W_2 (see Fig. 2.1). The mapping u_1^2 ↦ x_1^2 can be written as

x_1^2 = u_1^2 G_2   (2.14)
      = (u_1 ⊕ u_2, u_2).   (2.15)

In the second step, we combine two copies of W_2 into W_4 (see Fig. 2.2). The mapping u_1^4 ↦ x_1^4 can be written as

x_1^4 = u_1^4 G_4   (2.16)

Fig. 2.1 Vector channel W_2.

Fig. 2.2 Vector channel W_4.

      = u_1^4 G_2^{⊗2}   (2.17)
      = (u_1^2, u_3^4) [ G_2  0
                         G_2  G_2 ]   (2.18)
      = ((u_1^2 ⊕ u_3^4) G_2, u_3^4 G_2).   (2.19)

In the nth step, we combine two copies of W_{N/2} into W_N, where N = 2^n (see Fig. 2.3). The mapping u_1^N ↦ x_1^N can be written as

x_1^N = u_1^N G_N   (2.20)
      = u_1^N G_2^{⊗n}   (2.21)
      = ((u_1^{N/2} ⊕ u_{N/2+1}^N) G_{N/2}, u_{N/2+1}^N G_{N/2}).   (2.22)

In Arıkan's original paper [3], the input sequence u_1^N is permuted by a permutation matrix B_N, i.e.,

x_1^N = u_1^N B_N G_2^{⊗n},   (2.23)

where B_N is a permutation matrix. Since the permutation only serves as a reordering of the indices of (x_1, ..., x_N) and does not affect the properties of polar codes, we

Fig. 2.3 Vector channel W_N.

skip this permutation for simplicity of presentation throughout the thesis. We can now rewrite Eq. (2.7) as

W_N(y_1^N | u_1^N) = W^N(y_1^N | u_1^N G_N).   (2.24)

Channel splitting

Having synthesized the vector channel W_N, the next step is channel splitting. This involves splitting the vector channel W_N into N bit-channels W_N^{(i)} : X → Y^N × X^{i-1}. The transition probability of the bit-channel W_N^{(i)} is defined as

W_N^{(i)}(y_1^N, u_1^{i-1} | u_i) = \sum_{u_{i+1}^N ∈ X^{N-i}} \frac{1}{2^{N-1}} W_N(y_1^N | u_1^N).   (2.25)

The corresponding Bhattacharyya parameter Z(W_N^{(i)}) is defined as

Z(W_N^{(i)}) = \sum_{y_1^N ∈ Y^N} \sum_{u_1^{i-1} ∈ X^{i-1}} \sqrt{ W_N^{(i)}(y_1^N, u_1^{i-1} | u_i = 0) \, W_N^{(i)}(y_1^N, u_1^{i-1} | u_i = 1) }.   (2.26)

The bit-channel W_N^{(i)} is the channel that a successive cancellation decoder sees when decoding the ith bit u_i with perfect knowledge of the channel outputs y_1^N and the previous bits u_1^{i-1}.
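For very small blocklengths, Definition (2.25) can be evaluated by brute force, which is a useful didactic cross-check of the recursive formulas given next. The following sketch is an illustration in our notation: it builds G_N = G_2^{⊗n} without the bit-reversal permutation, consistent with the convention above, and enumerates all input and output sequences, so it is exponential in N.

```python
import itertools
import numpy as np

def bit_channel(W, n, i):
    """Brute-force transition probabilities of the bit-channel W_N^(i),
    evaluated directly from Eq. (2.25); N = 2^n and i is one-based.

    W is the 2 x |Y| matrix of the underlying B-DMC. Returns an array
    P[u_i, y_1, ..., y_N, u_1, ..., u_{i-1}]. Exponential in N: intended
    only as a sanity check for tiny N, not as a construction tool.
    """
    N = 1 << n
    m = W.shape[1]
    G = np.array([[1]])
    for _ in range(n):                       # G_N = G_2 kron G_{N/2}, Eq. (2.10)
        G = np.kron(np.array([[1, 0], [1, 1]]), G)
    P = np.zeros((2,) + (m,) * N + (2,) * (i - 1))
    for u in itertools.product((0, 1), repeat=N):
        x = np.array(u) @ G % 2              # x = u G_N
        for y in itertools.product(range(m), repeat=N):
            p = np.prod([W[x[k], y[k]] for k in range(N)])
            P[(u[i - 1],) + y + u[:i - 1]] += p / 2 ** (N - 1)
    return P
```

For N = 2 and i = 1, 2 this reproduces the pair of channels given in Eqs. (2.27) and (2.28) below.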

Recursive channel transformation

So far, we have described the procedures of channel combining and splitting, the steps necessary to obtain the bit-channels W_N^{(i)}. Due to the special structure of G_N, the channel transformation can be carried out in a recursive fashion. Let u_1^N be the input sequence and y_1^N the output sequence. A single-step channel transformation generates a pair of binary-input channels (W_2^{(1)}, W_2^{(2)}) from two independent copies of the channel W. In the first step, given an original channel W, the pair of bit-channels (W_2^{(1)}, W_2^{(2)}) is described by the transition probabilities

W_2^{(1)}(y_1^2 | u_1) = \sum_{u_2} \frac{1}{2} W(y_1 | u_1 ⊕ u_2) W(y_2 | u_2),   (2.27)

W_2^{(2)}(y_1^2, u_1 | u_2) = \frac{1}{2} W(y_1 | u_1 ⊕ u_2) W(y_2 | u_2).   (2.28)

We use W ⊟ W to denote the transformation of two copies of the channel W defined by Eq. (2.27), and W ⊞ W to denote the transformation of two copies of the channel W defined by Eq. (2.28).

At the ith step, we have synthesized the bit-channels W_{2^{i-1}}^{(j)} : X → Y^{2^{i-1}} × X^{j-1}, j = 1, ..., 2^{i-1}. By applying a single-step channel transformation to two copies of the bit-channel W_{2^{i-1}}^{(j)}, we obtain a pair of bit-channels (W_{2^i}^{(2j-1)}, W_{2^i}^{(2j)}), where

W_{2^i}^{(2j-1)} = W_{2^{i-1}}^{(j)} ⊟ W_{2^{i-1}}^{(j)},   W_{2^i}^{(2j)} = W_{2^{i-1}}^{(j)} ⊞ W_{2^{i-1}}^{(j)}.   (2.29)

Eq. (2.29) can be expanded as

W_{2^i}^{(2j-1)}(y_1^{2^i}, u_1^{2j-2} | u_{2j-1}) = \sum_{u_{2j}} \frac{1}{2} W_{2^{i-1}}^{(j)}(y_1^{2^{i-1}}, u_{1,o}^{2j-2} ⊕ u_{1,e}^{2j-2} | u_{2j-1} ⊕ u_{2j}) \, W_{2^{i-1}}^{(j)}(y_{2^{i-1}+1}^{2^i}, u_{1,e}^{2j-2} | u_{2j}),   (2.30)

W_{2^i}^{(2j)}(y_1^{2^i}, u_1^{2j-1} | u_{2j}) = \frac{1}{2} W_{2^{i-1}}^{(j)}(y_1^{2^{i-1}}, u_{1,o}^{2j-2} ⊕ u_{1,e}^{2j-2} | u_{2j-1} ⊕ u_{2j}) \, W_{2^{i-1}}^{(j)}(y_{2^{i-1}+1}^{2^i}, u_{1,e}^{2j-2} | u_{2j}).   (2.31)

Here u_{1,o}^{2j-2} = (u_1, u_3, ..., u_{2j-3}) and u_{1,e}^{2j-2} = (u_2, u_4, ..., u_{2j-2}). Note that Eq. (2.30) and Eq. (2.31) are identical to Eq. (2.27) and Eq. (2.28) if we apply the following substitutions:

u_1 → u_{2j-1},   y_1 → (y_1^{2^{i-1}}, u_{1,o}^{2j-2} ⊕ u_{1,e}^{2j-2}),

u_2 → u_{2j},   y_2 → (y_{2^{i-1}+1}^{2^i}, u_{1,e}^{2j-2}).

Thus we can first transform N copies of the original channel W into N/2 copies each of the bit-channels W_2^{(1)} and W_2^{(2)}, then into N/4 copies each of W_4^{(1)}, W_4^{(2)}, W_4^{(3)}, W_4^{(4)}, and so on, until we obtain the N bit-channels W_N^{(i)}, i = 1, ..., N.

2.3 Channel polarization

In this section, we introduce the concept of channel polarization, which is fundamental in proving that polar codes can achieve the symmetric mutual information of B-DMCs. We first show that the channel transformation preserves the overall symmetric mutual information. By the chain rule for mutual information, we have

I(Y_1^N ; X_1^N, U_1^N) = I(Y_1^N ; X_1^N) + I(Y_1^N ; U_1^N | X_1^N)   (2.32)
                        = I(Y_1^N ; U_1^N) + I(Y_1^N ; X_1^N | U_1^N).   (2.33)

Since

I(Y_1^N ; U_1^N | X_1^N) = I(Y_1^N ; X_1^N | U_1^N) = 0   (2.34)

(the first term vanishes because Y_1^N depends on U_1^N only through X_1^N, and the second because X_1^N is a deterministic function of U_1^N), we have

I(Y_1^N ; X_1^N) = I(Y_1^N ; U_1^N).   (2.35)

Thus,

I(W_N) = I(Y_1^N ; U_1^N)   (2.36)
       = I(Y_1^N ; X_1^N)   (2.37)
       = N I(W).   (2.38)

This shows that the channel transformation preserves the sum of the symmetric mutual informations of the N copies of the original channel. However, the symmetric mutual information of an individual bit-channel W_N^{(i)}, i ∈ [N], is in general different from I(W). The following theorem shows that as n → ∞, the symmetric mutual information of each individual channel converges almost surely to either 0 or 1.

Theorem 2.3.1 ([3]). For any B-DMC W, the bit-channels W_N^{(i)} polarize in the sense that, for any fixed δ ∈ (0, 1), as N tends to infinity through powers of two, the fraction of indices i ∈ {1, ..., N} for which I(W_N^{(i)}) ∈ (1 - δ, 1] tends toward I(W) and the

fraction for which I(W_N^{(i)}) ∈ [0, δ) tends toward 1 - I(W). That is,

lim_{N→∞} |{ i ∈ [N] : I(W_N^{(i)}) ∈ (1 - δ, 1] }| / N = I(W),   (2.39)

lim_{N→∞} |{ i ∈ [N] : I(W_N^{(i)}) ∈ [0, δ) }| / N = 1 - I(W).   (2.40)

We call the phenomenon shown in Eq. (2.39) and Eq. (2.40) channel polarization. Since the capacity-achieving property of polar codes is mainly based on Theorem 2.3.1, we give a sketch of its proof from [3] in the remainder of this subsection.

Let n = log N and label the bit-channels W_N^{(i)} as

W_{2^n}^{(i)} = W_{b_1 b_2 ... b_n},   i = 1 + \sum_{j=1}^{n} b_j 2^{n-j}.   (2.41)

Define {B_n ; n ≥ 1} as a sequence of independent and identically distributed (i.i.d.) Bernoulli random variables equiprobable on the set {0, 1}, and define a random tree process {W_n ; n ≥ 0} as follows:

W_0 = W,   (2.42)
W_n = W_{B_1 B_2 ... B_n}.   (2.43)

Given b_1 b_2 ... b_n as a sample value of the random variables B_1 B_2 ... B_n, the random process W_n takes the value W_{b_1 b_2 ... b_n}. Let a probability space (Ω, A, P) be defined, where Ω is the space of all binary sequences (b_1, b_2, ...) ∈ {0, 1}^∞, A_0 = {∅, Ω}, and A_n = σ(B_1, ..., B_n) is the Borel field generated by the cylinder sets defined by (b_1, ..., b_n); P is the probability measure defined on A. We now define the random processes for the channel parameters as follows:

{I_n ; n ≥ 0} := {I(W_n) ; n ≥ 0},   (2.44)
{Z_n ; n ≥ 0} := {Z(W_n) ; n ≥ 0}.   (2.45)
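For the BEC, the one-step transformation acts on the erasure probability in closed form (z ↦ 2z - z² and z ↦ z²; these are Eqs. (2.59) and (2.60) in Section 2.7), so sample paths of {Z_n} can be simulated exactly. The following sketch, an illustration under that BEC assumption, shows the concentration of Z_n on {0, 1} discussed next; for the BEC, I_n = 1 - Z_n.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_Zn(eps, n, paths=100_000):
    """Sample paths of Z_n for W = BEC(eps), following Eqs. (2.42)-(2.45).

    Each path applies n random one-step transforms, selected by the
    i.i.d. Bernoulli(1/2) sequence B_1, ..., B_n.
    """
    z = np.full(paths, eps)
    for _ in range(n):
        b = rng.integers(0, 2, size=paths)
        z = np.where(b == 0, 2 * z - z ** 2, z ** 2)   # minus / plus transform
    return z

z = simulate_Zn(eps=0.5, n=20)
# Z_n concentrates on {0, 1}; the fraction of near-noiseless paths
# (Z_n close to 0) approaches I(W) = 1 - eps = 0.5:
print((z < 1e-6).mean(), (z > 1 - 1e-6).mean())
```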

We proceed by presenting some important properties of these random processes.

Proposition 2.3.2 ([3]). {(I_n, A_n) ; n ≥ 0} is a bounded martingale, i.e.,

A_n ⊆ A_{n+1} and I_n is A_n-measurable,   (2.46)
E[|I_n|] < ∞,   (2.47)
I_n = E[I_{n+1} | A_n].   (2.48)

Building on Proposition 2.3.2, it can be shown that the sequence {I_n ; n ≥ 0} converges almost everywhere to a random variable I_∞ such that E[I_∞] = I_0.

Proposition 2.3.3 ([3]). {(Z_n, A_n) ; n ≥ 0} is a supermartingale, i.e.,

A_n ⊆ A_{n+1} and Z_n is A_n-measurable,   (2.49)
E[|Z_n|] < ∞,   (2.50)
Z_n ≥ E[Z_{n+1} | A_n].   (2.51)

Building on Proposition 2.3.3, it can be shown that the sequence {Z_n ; n ≥ 0} converges almost everywhere to a random variable Z_∞ that takes values in {0, 1}.

Proposition 2.3.4 ([3]). The limit I_∞ takes values almost everywhere in the set {0, 1}: P(I_∞ = 1) = I_0 and P(I_∞ = 0) = 1 - I_0.

Theorem 2.3.1 is a corollary of Proposition 2.3.4. Recall that I_0 = I(W); by P(I_∞ = 1) = I_0, the fraction of good channels converges to I(W), and by P(I_∞ = 0) = 1 - I_0, the fraction of completely noisy channels converges to 1 - I(W).

2.4 Polar code encoding

Given a B-DMC W, we will use F ⊂ [N] to denote the set of indices of bit-channels whose symmetric mutual information is close to 0. These bit-channels are not capable of transmitting data bits reliably, and we will freeze their corresponding inputs u_F to predetermined values known at the decoder. We call this subset F the frozen set. The complementary sub-vector u_{F^c} can be used to transmit information, and we will interchangeably call u_{F^c} information bits or data bits. The frozen set F can be composed of the indices of the N - K bit-channels with the largest Bhattacharyya

parameter. That is,

Z(W_N^{(i)}) ≥ Z(W_N^{(j)}),   ∀ i ∈ F, j ∈ F^c.   (2.52)

Once the frozen set F is decided, the encoding procedure of polar codes is quite straightforward. We set u_F to the predetermined values and u_{F^c} to the information bits to be transmitted. Then u is mapped to a codeword x through the linear encoder x = u G_N. For the sake of simplicity, we let u_i = 0 for i ∈ F. Due to the recursive structure of G_N, the encoding complexity of polar codes can be reduced from O(N²), the complexity of a generic vector-matrix multiplication, to O(N log N): the encoding is done layer by layer over n = log N layers, and within each layer the computational complexity is O(N). Fig. 2.4 illustrates the encoding of an (N, K) = (8, 4) polar code designed for a BEC with erasure probability ǫ = 1/2.

Note that for BECs, closed-form formulas are available to compute the Bhattacharyya parameters of the bit-channels efficiently [3]. For other B-DMCs, no explicit formulas are available; we will discuss a way of constructing polar codes by approximating the quality of the bit-channels in Section 2.7.

So far, the described encoder of polar codes is not systematic. Systematic polar coding was studied in [4], where it was observed that the bit error rate (BER) is smaller than for non-systematic polar codes. We will focus on non-systematic polar codes throughout the thesis, since the major body of work on polar codes is focused on the non-systematic case.

Fig. 2.4 Encoding procedure for an (8, 4) polar code, BEC with ǫ = 1/2.
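The recursion in Eq. (2.22) translates directly into an O(N log N) encoder. The following sketch implements both the Kronecker construction of G_N and the recursive encoder (natural order, without bit-reversal); the (8, 4) example uses the frozen set {1, 2, 3, 5} (one-based) that Eqs. (2.59) and (2.60) yield for the BEC with ǫ = 1/2, matching the setting of Fig. 2.4.

```python
import numpy as np

def generator_matrix(n):
    """G_N = G_2 kron ... kron G_2 (n times), Eq. (2.12); no bit-reversal."""
    G2 = np.array([[1, 0], [1, 1]])
    G = np.array([[1]])
    for _ in range(n):
        G = np.kron(G2, G)
    return G

def polar_encode(u):
    """x = u G_N over GF(2) via the recursion of Eq. (2.22); O(N log N)."""
    N = len(u)
    if N == 1:
        return u.copy()
    half = N // 2
    left = polar_encode((u[:half] + u[half:]) % 2)   # (u_1^{N/2} xor u_{N/2+1}^N) G_{N/2}
    right = polar_encode(u[half:])                   # u_{N/2+1}^N G_{N/2}
    return np.concatenate([left, right])

# (8, 4) polar code for a BEC with eps = 1/2: data bits on positions 4, 6, 7, 8.
u = np.zeros(8, dtype=int)
u[[3, 5, 6, 7]] = [1, 0, 1, 1]       # 0-based indices of the information positions
x = polar_encode(u)
assert np.array_equal(x, u @ generator_matrix(3) % 2)   # agrees with x = u G_N
```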

2.5 SC decoding of polar codes

It has been proved in [3] that polar codes with SC decoding can achieve the capacity of B-DMCs. Since then, many other decoding algorithms have been proposed [8, 13, 42, 77, 35, 78, 36, 25] to improve the error-rate performance in the finite-length regime. We review the SC decoding algorithm in this section, and review other decoding algorithms for polar codes, such as belief propagation (BP) decoding, successive cancellation list (SCL) decoding, and sphere decoding (SD), in the later chapters.

Let u_1^N be the input sequence to the polar encoder, x_1^N the corresponding codeword, and y_1^N the channel observations. SC decoding is done in a sequential manner: the estimate of bit u_i is based on the received vector y and the estimates û_1^{i-1} of the previous bits u_1^{i-1}. Letting

L_N^{(i)}(y, û_1^{i-1}) = \frac{W_N^{(i)}(y, û_1^{i-1} | u_i = 0)}{W_N^{(i)}(y, û_1^{i-1} | u_i = 1)},   (2.53)

u_i is then estimated as

û_i = 0, if i ∈ F^c and L_N^{(i)}(y, û_1^{i-1}) ≥ 1;
û_i = 1, if i ∈ F^c and L_N^{(i)}(y, û_1^{i-1}) < 1;
û_i = the frozen value, otherwise.
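The decision rule above has a compact recursive implementation that mirrors the structure of the encoder. The following LLR-domain sketch is a minimal illustration: it reuses polar_encode from the encoding sketch above, assumes frozen bits are set to 0, and uses the min-sum approximation of the exact check-node rule, a common simplification.

```python
import numpy as np

def sc_decode(llr, frozen):
    """Successive cancellation decoding in the LLR domain (minimal sketch).

    llr    : channel LLRs log W(y_i|0)/W(y_i|1), length N = 2^n
    frozen : boolean mask of length N, True where u_i is frozen (to 0)
    Returns the estimate of u in natural order, matching the encoder above.
    """
    N = len(llr)
    if N == 1:
        return np.array([0 if (frozen[0] or llr[0] >= 0) else 1])
    half = N // 2
    l1, l2 = llr[:half], llr[half:]
    # f-step: LLRs of the XOR-combined first half (min-sum approximation of
    # the exact rule 2 atanh(tanh(l1/2) tanh(l2/2)))
    f = np.sign(l1) * np.sign(l2) * np.minimum(np.abs(l1), np.abs(l2))
    u_left = sc_decode(f, frozen[:half])
    # g-step: cancel the re-encoded first half, then combine both observations
    c = polar_encode(u_left)
    g = l2 + (1 - 2 * c) * l1
    u_right = sc_decode(g, frozen[half:])
    return np.concatenate([u_left, u_right])
```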

2.6 Coding theorems

Let P_e(N, R, u_F) be the block error probability of the polar code with blocklength N, rate R = K/N, and frozen bits u_F under SC decoding. Let P_e(N, R) be the average error probability of the polar code over all choices of u_F, i.e.,

P_e(N, R) = E[P_e(N, R, U_F)]   (2.54)
          = \sum_{u_F ∈ {0,1}^{N-K}} P(U_F = u_F) P_e(N, R, u_F).   (2.55)

Arıkan proved the following coding theorem in [3].

Theorem For any given B-DMC W and any fixed rate R < I(W), the average block error probability of a polar code under SC decoding satisfies

P_e(N, R) = O(N^{-1/4}).   (2.56)

This theorem is a corollary of the following one.

Theorem For any B-DMC with I(W) > 0 and any fixed R < I(W), there exists a sequence of sets F_N^c ⊆ [N], N ∈ {1, ..., 2^n, ...}, such that |F_N^c| ≥ NR and

Z(W_N^{(i)}) ≤ O(N^{-5/4}),   for all i ∈ F_N^c.   (2.57)

The following stronger version is proved in [3] as well.

Theorem For any given B-DMC W and any fixed rate R < I(W), the block error probability of a polar code with blocklength N, rate R = K/N, and frozen bits u_F under SC decoding satisfies

P_e(N, R, u_F) = O(N^{-1/4}).   (2.58)

2.7 Practical code constructions

Although the construction of a polar code is explicitly defined in theory, calculating the quality (mutual information or Bhattacharyya parameter) of the bit-channels is a challenge in practice: the output alphabet size of the bit-channels grows exponentially with the blocklength, which makes the exact calculation intractable. The only exception is the BEC, for which closed-form formulas are given in [3] to compute the Bhattacharyya parameters of the bit-channels recursively:

Z(W ⊟ W) = 2 Z(W) - Z(W)²,   (2.59)
Z(W ⊞ W) = Z(W)².   (2.60)

Several methods have been proposed to estimate the qualities of the bit-channels [47, 75, 76, 80, 81, 37, 48, 56]. The approximation accuracy of the method proposed in [75] is theoretically guaranteed and is by far the best; we therefore use the method of [75] to construct the polar codes used in this thesis, unless specified otherwise.
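For the BEC, Eqs. (2.59) and (2.60) make the construction exact and fast. The sketch below computes Z(W_N^{(i)}) for all bit-channels (entry i - 1 corresponding to bit-channel i, following the labeling of Eq. (2.41)) and selects the frozen set of Eq. (2.52).

```python
import numpy as np

def bec_bit_channel_Z(eps, n):
    """Exact Z(W_N^(i)) for all N = 2^n bit-channels of a BEC(eps),
    by recursive application of Eqs. (2.59)-(2.60)."""
    z = np.array([eps])
    for _ in range(n):
        nxt = np.empty(2 * z.size)
        nxt[0::2] = 2 * z - z ** 2   # minus transform -> bit-channel 2j - 1
        nxt[1::2] = z ** 2           # plus transform  -> bit-channel 2j
        z = nxt
    return z

def frozen_set(eps, n, K):
    """0-based indices of the N - K least reliable bit-channels, Eq. (2.52)."""
    z = bec_bit_channel_Z(eps, n)
    return np.sort(np.argsort(z)[::-1][: z.size - K])

print(frozen_set(0.5, 3, 4))   # [0 1 2 4], i.e., frozen set {1, 2, 3, 5} one-based
```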

In this section, we outline the methods proposed in [75]. The key idea in [75] is to approximate a bit-channel having an intractable output alphabet size by channels having a manageable output alphabet size. Two approximation methods are considered, termed channel degradation and channel upgradation, which yield channels with worse and better qualities than the original bit-channel, respectively. The quality of the original bit-channel is thus bounded between the two, and the gap between the upper and lower bounds is shown to be small [75]. Hence the qualities of the approximated bit-channels are close to those of the original bit-channels.

Definition [75] A channel Q : X → Z is degraded with respect to a channel W : X → Y, denoted by Q ⪯ W, if there exists a channel P : Y → Z such that for all z ∈ Z and x ∈ X,

Q(z | x) = \sum_{y ∈ Y} W(y | x) P(z | y).   (2.61)

Definition [75] A channel Q' : X → Z' is upgraded with respect to a channel W : X → Y, denoted by Q' ⪰ W, if there exists a channel P : Z' → Y such that for all y ∈ Y and x ∈ X,

W(y | x) = \sum_{z' ∈ Z'} Q'(z' | x) P(y | z').   (2.62)

The following lemma states that the degraded channel Q is worse than the original channel W, as measured by the mutual information and the Bhattacharyya parameter.

Lemma [61] Given two B-DMCs W and Q, if Q ⪯ W, we have

Z(Q) ≥ Z(W),   (2.63)
I(Q) ≤ I(W).   (2.64)

Similarly, the following lemma states that the upgraded channel Q' is better than the original channel W, as measured by the mutual information and the Bhattacharyya parameter.

Lemma [61] Given two B-DMCs W and Q', if Q' ⪰ W, we have

Z(Q') ≤ Z(W),   (2.65)
I(Q') ≥ I(W).   (2.66)

The following lemma states that the degradation and upgradation relations are preserved by the channel transformation defined in Section 2.2.

Lemma [75] Applying the channel transformations defined in Eq. (2.27) and Eq. (2.28) to two B-DMCs W and Q, if Q ⪯ W, we have

Q ⊟ Q ⪯ W ⊟ W,   (2.67)
Q ⊞ Q ⪯ W ⊞ W.   (2.68)

On the other hand, if Q ⪰ W, we have

Q ⊟ Q ⪰ W ⊟ W,   (2.69)
Q ⊞ Q ⪰ W ⊞ W.   (2.70)

Given a B-DMC W, we now describe how to approximate its bit-channel W_N^{(i)}, i ∈ [N], by a degraded channel Q whose output alphabet size is at most L. Let (b_1(i), ..., b_n(i)) denote the binary representation of i - 1, so that i = 1 + \sum_{j=1}^{n} b_j(i) 2^{n-j}. In the algorithm, the function degrading(W, L) returns a channel that is degraded with respect to W and has an output alphabet size of at most L.

Algorithm 2.7.6 (Channel degradation)
Input: W, the original channel; (b_1(i), ..., b_n(i)), the binary representation of the index of the bit-channel to be approximated; L, a bound on the output alphabet size of the degraded channel.
Output: Q_N^{(i)}, a channel that is degraded with respect to the bit-channel W_N^{(i)}.

  Q ← degrading(W, L)
  for j = 1, ..., n do
      if b_j(i) = 0 then
          W' ← Q ⊟ Q
      else
          W' ← Q ⊞ Q
      end if
      Q ← degrading(W', L)
  end for
  return Q_N^{(i)} = Q

Consider a B-DMC W with output alphabet size |Y|. We give a simple example of the function degrading(W, L). Fig. 2.5 illustrates an operation that results in a degraded channel Q with output alphabet size |Y| - 1: two output symbols are merged into one, where the entries in the first and second rows of a channel in Fig. 2.5 are the probabilities of receiving the corresponding symbol given that 0 or 1 was transmitted, respectively. One can obtain a degraded channel with output alphabet size L by repeating this operation |Y| - L times.

Fig. 2.5 A degrading operation.

The upgrading procedure is essentially the same as Algorithm 2.7.6, except that the degrading function is replaced by an upgrading function. For details on how to find appropriate upgrading and degrading functions, we refer the reader to [75].

The frozen set F for a polar code with rate R is constructed by choosing the N(1 - R) indices such that Z(Q_N^{(i)}) ≥ Z(Q_N^{(j)}) for all i ∈ F, j ∈ F^c.
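The merging operation of Fig. 2.5 is easy to prototype. The sketch below is a naive illustration of a degrading(W, L) function that greedily merges the pair of adjacent output columns losing the least symmetric mutual information; it reuses symmetric_mutual_information from the sketch in Chapter 1 and is not the optimized merging procedure of [75].

```python
import numpy as np

def degrade_merge_once(W):
    """Merge one pair of adjacent output columns of W, choosing the pair
    whose merger loses the least symmetric mutual information. Summing
    two columns realizes an intermediate channel P as in Eq. (2.61), so
    the result is degraded with respect to W."""
    best_loss, best_Q = None, None
    for y in range(W.shape[1] - 1):
        merged = (W[:, y] + W[:, y + 1])[:, None]
        Q = np.concatenate([W[:, :y], merged, W[:, y + 2:]], axis=1)
        loss = symmetric_mutual_information(W) - symmetric_mutual_information(Q)
        if best_loss is None or loss < best_loss:
            best_loss, best_Q = loss, Q
    return best_Q

def degrading(W, L):
    """Naive degrading(W, L): merge outputs until at most L columns remain."""
    while W.shape[1] > L:
        W = degrade_merge_once(W)
    return W
```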

2.8 Summary

In this chapter, we gave a brief review of standard polar codes for B-DMCs. In particular, we provided a sketch of the proof that polar codes achieve the symmetric mutual information when the blocklength tends to infinity. The capacity-achieving property and its proof lay the theoretical foundations for Chapter 3 and Chapter 4. Furthermore, we reviewed a practical method to efficiently estimate the quality of the virtual channels in the finite-blocklength regime, which is used to generate the standard polar codes used in Chapter 5, Chapter 6, and Chapter 7. We also reviewed the SC decoding algorithm for polar codes, which is used later as a benchmark to show the improvement of our proposed decoding algorithms.

Chapter 3
Channel polarization for arbitrary-input DMCs

3.1 Introduction

In the celebrated work of Arıkan [3], the channel polarization theorem is proved only for B-DMCs. We briefly reviewed the proof in Chapter 2 and showed that polar codes, constructed based on this theorem, can achieve the channel capacity under successive cancellation (SC) decoding. The channel polarization theorem has been generalized to prime-input DMCs in [66], to prime-power-input DMCs in [54, 55], and to arbitrary-input DMCs in [67, 63, 64]. References [54, 55, 67, 63, 64] all follow Arıkan's proof technique summarized in Section 2.3, which is based on the martingale properties of the random processes Z_n and I_n. In [66], the channel polarization theorem for prime-input DMCs is proved without using the martingale property of Z_n. Instead, the proof is based on an entropy inequality for the virtual channels (mentioned in [3] for the case where the original channel is a B-DMC), i.e., the fact that the mutual information of the virtual channels is strictly different from that of the original channel, together with the martingale property of the random process {I_n ; n ≥ 0}. As an extension of [66], the channel polarization theorem is proved in [52] for arbitrary DMCs whose input alphabet forms a quasigroup.

In this chapter, we revisit the channel polarization problem for arbitrary DMCs and provide an alternative proof of the channel polarization theorem for arbitrary-input DMCs. Similarly to [52], our approach does not involve the Bhattacharyya parameter. There are two main differences between our proof technique and the one proposed in [52]. First, while [52] proves the entropy inequality for the virtual channels by lower-bounding the mutual information difference between the original channel and the worse virtual channel, we consider the difference between the better virtual channel and the

original channel, for which a simple expression is given and bounded away from zero when the input alphabet and the operation used in the channel transformation form a monoid. Though these two ideas might seem similar, ours leads to a new approach for proving the strict inequality. Second, our approach makes use of the properties of Markov chains and of the zero-error capacity, without involving distances between probability distributions. Moreover, we show that the extremal channels to which the virtual channels converge have a zero-error capacity equal to their capacity. We note that our proof of the channel polarization theorem is restricted to group operations for now, while the stronger results in [52] apply to the wider class of quasigroups.

We first review the entropy-based proof proposed in [66] and introduce the definition of the zero-error capacity. We then prove an entropy inequality for arbitrary-input DMCs with zero zero-error capacity under one step of the channel transformation, and prove that the zero-error capacity of the virtual channels is zero as well. Then, we show that the virtual channels converge asymptotically to a set of channels with positive zero-error capacity or channels with zero capacity. We conclude this chapter with some discussion of channel polarization for channels whose input alphabet forms a monoid (not necessarily a group).

Entropy inequality for prime-input DMCs

In this section, we briefly review the proofs in [66]. Consider a DMC W : X → Y. Let the input alphabet be X = {0, ..., q-1} and assume that for all y ∈ Y there exists x ∈ X such that W(y|x) > 0. Let U_1 and U_2 be independent random variables taking values in X, and let

X_1 = U_1 ⊕ U_2,   (3.1)
X_2 = U_2,   (3.2)

where ⊕ denotes modulo-q addition. Let W⁻ : X → Y² be the virtual channel between U_1 and Y_1 Y_2, and W⁺ : X → Y² × X be the virtual channel between U_2 and Y_1 Y_2 U_1. W⁻ and W⁺ are synthesized by one step of the channel transformation (see Fig. 3.1). After n recursive steps of channel transformation, we can synthesize 2^n virtual channels. We follow the notation defined in Chapter 2: let W_n be a random variable that chooses equiprobably from all possible 2^n virtual channels after the nth step, and let I_n = I(W_n) be the mutual information of W_n. The random process {I_n ; n ≥ 0} is proved in [3] to be a bounded martingale for B-DMCs.
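The one-step transformation of Fig. 3.1 can also be carried out numerically for a q-ary-input channel. The sketch below is an illustration assuming modulo-q addition and uniform U_1, U_2; it builds the transition matrices of W⁻ and W⁺ with the output tuples flattened into a single axis, and, combined with the symmetric_mutual_information sketch of Chapter 1, it lets one check the conservation law I(W⁻) + I(W⁺) = 2 I(W).

```python
import numpy as np

def polar_transform_pair(W):
    """Transition matrices of (W-, W+) for a q-ary-input DMC under
    modulo-q addition, with U_1, U_2 i.i.d. uniform (Fig. 3.1).

    W is q x m. W_minus[u1] ranges over pairs (y1, y2); W_plus[u2]
    ranges over triples (y1, y2, u1), each flattened into one axis.
    """
    q, m = W.shape
    W_minus = np.zeros((q, m * m))
    W_plus = np.zeros((q, m * m * q))
    for u1 in range(q):
        for u2 in range(q):
            joint = np.outer(W[(u1 + u2) % q], W[u2]) / q   # (1/q) W(y1|u1+u2) W(y2|u2)
            W_minus[u1] += joint.ravel()                    # summed over u2, as in Eq. (2.27)
            W_plus[u2, np.arange(m * m) * q + u1] = joint.ravel()
    return W_minus, W_plus

# Ternary symmetric channel: check I(W-) + I(W+) = 2 I(W).
W = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
Wm, Wp = polar_transform_pair(W)
I = symmetric_mutual_information
print(I(Wm) + I(Wp), 2 * I(W))   # equal up to floating-point rounding
```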

Fig. 3.1 One step channel transformation.

Based on the assumption that the input alphabet size |X| = q is a prime number, [66] proves that the symmetric mutual information of the virtual channel W⁻ is strictly less than that of the original channel W. The base of the logarithm is set to q in this chapter.

Lemma 3.1.1 ([66]). If I(W) ∈ (δ, 1 - δ) for some δ > 0, then there exists an ε(δ) > 0 such that

I(W⁻) + ε(δ) ≤ I(W) ≤ I(W⁺) - ε(δ).   (3.3)

Lemma 3.1.1 is a corollary of the following lemma.

Lemma 3.1.2 ([66]). Let X_1, X_2 ∈ X and Y_1, Y_2 ∈ Y be random variables with joint probability distribution

P_{X_1 Y_1 X_2 Y_2}(x_1, y_1, x_2, y_2) = P_{X_1 Y_1}(x_1, y_1) P_{X_2 Y_2}(x_2, y_2).   (3.4)

If H(X_1 | Y_1), H(X_2 | Y_2) ∈ (δ, 1 - δ) for some δ > 0, then there exists an ε(δ) > 0 such that

H(X_1 + X_2 | Y_1, Y_2) ≥ max{H(X_1 | Y_1), H(X_2 | Y_2)} + ε(δ).   (3.5)

It is easy to see that Lemma 3.1.1 follows from Lemma 3.1.2; we briefly explain the logic here. Consider a step of the channel transformation described in Fig. 3.1. Since U_1 and U_2 are independent and equiprobable on the supporting set X and ⊕ is modulo-q addition, X_1 and X_2 are independent and equiprobable on X. Thus X_1, X_2, Y_1, Y_2 are jointly distributed as in Eq. (3.4). Assume I(W) ∈ (δ, 1 - δ); since the base of the logarithm is q, we have H(X_1 | Y_1) = 1 - I(W) ∈ (δ, 1 - δ). According to Lemma 3.1.2, we have I(W⁻) = 1 - H(X_1 + X_2 | Y_1 Y_2) < 1 - H(X_1 | Y_1) = I(W). Since I(W⁻) + I(W⁺) = 2 I(W), Lemma 3.1.1 is proved.

Based on the martingale property of the random process I_n (as defined in Eq. (2.44)) and Lemma 3.1.1, one can prove the channel polarization theorem for q-ary-input DMCs. However, the proof of Lemma 3.1.2 (see [66] for the details) is critically based on the assumption that the input alphabet size is a prime number. We will generalize Lemma 3.1.1 to arbitrary-input DMCs with a different proof technique.

3.2 Preliminaries

Before we present the main result of this chapter, we introduce the following technical lemma, which will be used to prove Lemma 3.3.2 and Theorem 3.4.2.

Lemma 3.2.1. For random variables $X, Y, Z$ whose probability distributions are supported on their respective alphabets $\mathcal{X}, \mathcal{Y}, \mathcal{Z}$, if $X - Y - Z$ and $Y - X - Z$ both form Markov chains, then for all $x, y, z$ such that $P_{XY}(x, y) > 0$,

$$P_{Z|Y}(z|y) = P_{Z|X}(z|x). \qquad (3.6)$$

Consequently, for any $y \in \mathcal{Y}$, $P_{Z|X}(z|x)$ takes the same value for all $x$ such that $P_{XY}(x, y) > 0$.

Proof. See Section 3.5.1.

Now we introduce terminology that will be used throughout this chapter.

Zero-error capacity

In [71], Shannon introduced the concept of zero-error capacity.

Definition 3.2.2. The zero-error capacity of a noisy channel is the supremum of all rates at which information can be transmitted with zero error probability.

Since the capacity of a channel is the supremum of all rates at which information can be transmitted with vanishing error probability, a channel's zero-error capacity is always upper bounded by its capacity. We use $C_0(W)$ to denote the zero-error capacity of a channel $W$. Channels whose zero-error capacity equals zero are of primary interest. BECs with strictly positive erasure probability, BSCs with strictly positive crossover probability, and AWGN channels with strictly positive noise power are examples of such channels.

Definition 3.2.3. Let $\mathcal{C}_0$ be the set of channels whose zero-error capacity is positive. Let $\mathcal{C}$ be the set of channels whose capacity is zero. Let $\mathcal{C}_0^* = \mathcal{C}_0 \cup \mathcal{C}$.

Fig. 3.2 illustrates some channels that belong to $\mathcal{C}_0^*$: Figs. 3.2a and 3.2c show channels with positive zero-error capacity, and Fig. 3.2b shows a channel with zero capacity.
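As a sanity check on these definitions, the following sketch (our own illustration; `in_C` and `in_C0` are names of our choosing) classifies three toy channels of the kind shown in Fig. 3.2. It uses two elementary facts: a channel has zero capacity exactly when all rows of its transition matrix are identical, and it has positive zero-error capacity as soon as two inputs have disjoint output supports, which by Lemma 3.2.4 below is also necessary.

```python
import numpy as np

def in_C(W, tol=1e-12):
    """Zero capacity: the output distribution does not depend on the input."""
    return bool(np.all(np.abs(W - W[0]) < tol))

def in_C0(W, tol=1e-12):
    """Positive zero-error capacity: two inputs are never confusable, i.e.
    sum_y W(y|x1) W(y|x2) = 0 for some pair x1 != x2."""
    G = W @ W.T                                   # pairwise row overlaps
    off = G[~np.eye(W.shape[0], dtype=bool)]
    return bool((off < tol).any())

q = 4
noiseless = np.eye(q)                                            # Fig. 3.2a
useless = np.full((q, q), 1 / q)                                 # Fig. 3.2b
typewriter = 0.5 * (np.eye(q) + np.roll(np.eye(q), 1, axis=1))   # Fig. 3.2c
for name, V in [("noiseless", noiseless), ("zero-capacity", useless),
                ("typewriter", typewriter)]:
    print(f"{name:13s} in C0: {in_C0(V)}  in C: {in_C(V)}  "
          f"in C0*: {in_C0(V) or in_C(V)}")
```

All three channels land in $\mathcal{C}_0^*$: the noiseless and typewriter channels via $\mathcal{C}_0$, and the useless channel via $\mathcal{C}$.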

Fig. 3.2 Channels belonging to $\mathcal{C}_0^*$: (a) noiseless channel; (b) zero-capacity channel; (c) typewriter channel.

Now we state the following lemma without proof; it summarizes statements in [71].

Lemma 3.2.4. For a DMC $W : \mathcal{X} \to \mathcal{Y}$, the following statements are equivalent.

1. $W \notin \mathcal{C}_0$.
2. For all $x_1, x_2 \in \mathcal{X}$, $\sum_{y \in \mathcal{Y}} W(y|x_1)\, W(y|x_2) > 0$.

Basic algebraic structures

Now we introduce some basic algebraic structures that will be considered in this chapter.

Definition 3.2.5. Suppose $\mathcal{X}$ is a set and an operation $*$ is defined over $\mathcal{X}$. $(\mathcal{X}; *)$ forms a monoid if it satisfies the following three axioms.

1. For all $x_1, x_2 \in \mathcal{X}$, $x_1 * x_2 \in \mathcal{X}$.
2. For all $x_1, x_2, x_3 \in \mathcal{X}$, $(x_1 * x_2) * x_3 = x_1 * (x_2 * x_3)$.
3. There exists an element $x_0 \in \mathcal{X}$ such that for every element $x \in \mathcal{X}$, $x * x_0 = x_0 * x = x$. The element $x_0$ is also referred to as the neutral element of $(\mathcal{X}; *)$.

In short, a monoid is a single-operation algebraic structure satisfying closure, associativity, and the existence of an identity element.

Definition 3.2.6. A group is a monoid in which every element has an inverse.
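These axioms are directly machine-checkable for small operation tables. The following sketch (ours; the predicates `is_monoid` and `is_group` are hypothetical names) verifies them by brute force and reproduces the multiplication-modulo-$q$ example discussed next.

```python
from itertools import product

def is_monoid(X, op):
    closed = all(op(a, b) in X for a, b in product(X, repeat=2))
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a, b, c in product(X, repeat=3))
    has_id = any(all(op(e, x) == x == op(x, e) for x in X) for e in X)
    return closed and assoc and has_id

def is_group(X, op):
    if not is_monoid(X, op):
        return False
    e = next(e for e in X if all(op(e, x) == x == op(x, e) for x in X))
    return all(any(op(x, y) == e for y in X) for x in X)

mul6 = lambda a, b: (a * b) % 6
mul5 = lambda a, b: (a * b) % 5
print(is_monoid(set(range(6)), mul6), is_group(set(range(6)), mul6))  # True False
print(is_group(set(range(1, 5)), mul5))  # True: q = 5 is prime, 0 removed
```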

For example, the set $\{0, 1, \ldots, q-1\}$ with multiplication modulo $q$ forms a monoid for all $q$, but forms a group under multiplication modulo $q$ only if $q$ is prime and $0$ is removed from the set.

Definition 3.2.7. Let $(\mathcal{X}; *)$ be any algebraic structure and let $\mathcal{X}_s$ be a proper subset of $\mathcal{X}$. If $(\mathcal{X}_s; *)$ forms a group, we call $(\mathcal{X}_s; *)$ a subgroup of $(\mathcal{X}; *)$ and denote this relation by $(\mathcal{X}_s; *) \le (\mathcal{X}; *)$. Note that our definition allows a monoid to have a subgroup.

Definition 3.2.8. Given $(\mathcal{X}_s; *) \le (\mathcal{X}; *)$, for any $x \in \mathcal{X}$, $x * \mathcal{X}_s = \{x * x' \mid x' \in \mathcal{X}_s\}$ is called the left coset of $\mathcal{X}_s$ in $\mathcal{X}$ with respect to $x$, and $\mathcal{X}_s * x = \{x' * x \mid x' \in \mathcal{X}_s\}$ is called the right coset of $\mathcal{X}_s$ in $\mathcal{X}$ with respect to $x$.

According to Lagrange's theorem, the left cosets of a subgroup in a group partition the group, and all cosets have the same cardinality. The left cosets of a subgroup in a monoid partition the monoid as well, but their cardinalities can differ.

3.3 Entropy inequalities for arbitrary-input DMCs

In this section, we consider the scenario illustrated in Fig. 3.1, where $U_1, U_2, X_1, X_2$ are defined over a finite set $\mathcal{X}$, and the operation $*$ is defined over $\mathcal{X}$ so that $(\mathcal{X}; *)$ forms a monoid. We assume that $U_1$ and $U_2$ are i.i.d. random variables equiprobable on the set $\mathcal{X}$, and that for all $y \in \mathcal{Y}$ there exists $x \in \mathcal{X}$ such that $W(y|x) > 0$. We first derive a closed-form expression characterizing the difference between the mutual information of the virtual channels and that of the original channel after one step of channel transformation.

Lemma 3.3.1. Given a DMC $W : \mathcal{X} \to \mathcal{Y}$, we have

$$I(W^+) - I(W) = I(X_1; Y_1 \mid U_1, Y_2). \qquad (3.7)$$

Proof. See Section 3.5.2.

The rest of the chapter is devoted to finding a necessary and sufficient condition for $I(X_1; Y_1 \mid U_1, Y_2) > 0$. We first give a sufficient condition.

Lemma 3.3.2. Given a DMC $W : \mathcal{X} \to \mathcal{Y}$, if $W \notin \mathcal{C}_0^*$, then we have

$$I(X_1; Y_1 \mid U_1, Y_2) > 0. \qquad (3.8)$$

Proof. See Section 3.5.3.

Note that Lemma 3.3.2 provides a sufficient but not necessary condition for $I(X_1; Y_1 \mid U_1, Y_2) > 0$. Building on Lemma 3.3.2, we will obtain a necessary and sufficient condition for $I(X_1; Y_1 \mid U_1, Y_2) > 0$, which will be stated in Theorem 3.4.2.

Lemma 3.3.3. Given a DMC $W : \mathcal{X} \to \mathcal{Y}$, we have

$$I(W^-) + I(W^+) \le 2I(W). \qquad (3.9)$$

Proof. See Section 3.5.4.

Equality holds in Eq. (3.9) if $(\mathcal{X}; *)$ forms a group; that is, the channel transformation preserves the overall symmetric mutual information when $(\mathcal{X}; *)$ forms a group. Necessary and sufficient conditions for equality in Eq. (3.9) are studied in [51].

Based on Lemmas 3.3.1 and 3.3.2, together with Lemma 3.3.3, we can prove the main result of this chapter, which is the following theorem.

Theorem 3.3.4. For a DMC $W : \mathcal{X} \to \mathcal{Y}$ with $W \notin \mathcal{C}_0^*$,

$$I(W) - I(W^-) > 0, \qquad (3.10)$$
$$I(W^+) - I(W) > 0. \qquad (3.11)$$

The proof of Theorem 3.3.4 is straightforward. First, Eq. (3.11) is a direct consequence of Lemmas 3.3.1 and 3.3.2. Then, Eq. (3.10) is a direct consequence of Eq. (3.11) and Lemma 3.3.3. Moreover, Lemma 3.3.3 will be used in the proof of Lemma 3.3.5 (see Section 3.5.5). Theorem 3.3.4 generalizes the results in [66], where Eq. (3.10) and Eq. (3.11) are proved for prime-input DMCs only. We will show how Theorem 3.3.4 leads to a proof of channel polarization in the next section.

In order to make arguments about entropy inequalities of virtual channels over multiple channel transformation steps, we investigate whether the virtual channels $W^-$ and $W^+$ inherit the zero zero-error-capacity property of the original channel $W$.

Lemma 3.3.5. Consider a DMC $W : \mathcal{X} \to \mathcal{Y}$. If $W \notin \mathcal{C}_0^*$ and $(\mathcal{X}; *)$ forms a group, then we have

$$W^+ \notin \mathcal{C}_0^*, \qquad (3.12)$$
$$W^- \notin \mathcal{C}_0^*. \qquad (3.13)$$

Moreover, if $W \notin \mathcal{C}_0$ and $(\mathcal{X}; *)$ only forms a monoid, we have

$$W^+ \notin \mathcal{C}_0, \qquad (3.14)$$
$$W^- \notin \mathcal{C}_0. \qquad (3.15)$$

Proof. See Section 3.5.5.

3.4 Channel polarization for arbitrary-input DMCs

In the previous section, we proved that the symmetric mutual information of the virtual channels is strictly different from that of the original channel after one step of channel transformation. A natural step forward is to investigate whether the symmetric mutual information of the virtual channels converges asymptotically and, if so, the set of possible values to which it converges.

Channel polarization over groups

We first consider the case in which $(\mathcal{X}; *)$ forms a group. The corresponding proposition of Chapter 2 still holds for arbitrary-input DMCs; that is, the random process $I_n$ is a bounded martingale, so $I_n$ converges almost everywhere to a random variable $I_\infty$, and

$$\lim_{n \to \infty} \mathbb{E}\big[\,|I_{n+1} - I_n|\,\big] = 0. \qquad (3.16)$$

Since

$$\mathbb{E}\big[\,|I_{n+1} - I_n|\,\big] = \frac{1}{2^{n+1}} \sum_{i=1}^{2^n} \Big\{ \big| I\big(W_{2^{n+1}}^{(2i-1)}\big) - I\big(W_{2^n}^{(i)}\big) \big| + \big| I\big(W_{2^{n+1}}^{(2i)}\big) - I\big(W_{2^n}^{(i)}\big) \big| \Big\}, \qquad (3.17)$$

we have that, as $n \to \infty$,

$$\big| I\big(W_{2^{n+1}}^{(2i-1)}\big) - I\big(W_{2^n}^{(i)}\big) \big| \to 0, \qquad (3.18)$$
$$\big| I\big(W_{2^{n+1}}^{(2i)}\big) - I\big(W_{2^n}^{(i)}\big) \big| \to 0. \qquad (3.19)$$

So Eq. (3.16) together with Theorem 3.3.4 implies that for any $W \notin \mathcal{C}_0^*$, the corresponding virtual channels converge to channels in $\mathcal{C}_0^*$ asymptotically.
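This convergence is easy to observe numerically for small block lengths. The sketch below (ours; it is brute-force and exponential in the number of steps, so only suitable for two or three levels) recursively synthesizes all $2^n$ virtual channels of a BSC under modulo-2 addition and prints their symmetric mutual information values, which already cluster toward 0 and 1.

```python
import numpy as np
from itertools import product

q, eps, steps = 2, 0.1, 3

def sym_info(V):
    P = V / q
    Py = P.sum(axis=0)
    I = 0.0
    for x, y in product(range(V.shape[0]), range(V.shape[1])):
        if P[x, y] > 0:
            I += P[x, y] * np.log(P[x, y] / (Py[y] / q))
    return I / np.log(q)

def transform(V):
    ny = V.shape[1]
    Wm = np.zeros((q, ny * ny))
    Wp = np.zeros((q, ny * ny * q))
    for u1, u2, y1, y2 in product(range(q), range(q), range(ny), range(ny)):
        p = V[(u1 + u2) % q, y1] * V[u2, y2] / q
        Wm[u1, y1 * ny + y2] += p
        Wp[u2, (y1 * ny + y2) * q + u1] += p
    return Wm, Wp

level = [np.array([[1 - eps, eps], [eps, 1 - eps]])]   # BSC(0.1)
for _ in range(steps):
    level = [V for U in level for V in transform(U)]
print(sorted(round(sym_info(V), 4) for V in level))    # values drift toward 0 and 1
```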

As for the set of values to which the virtual channels converge, we need to investigate the set of channels that are invariant under the channel transformation, i.e., channels with $I(X_1; Y_1 \mid U_1, Y_2) = 0$.

Definition 3.4.1. Let $\mathcal{C}_{\mathrm{inv}}(\mathcal{X})$ denote the set of channels with input alphabet $\mathcal{X}$ such that $I(X_1; Y_1 \mid U_1, Y_2) = 0$ after one step of channel transformation.

It follows from Lemma 3.3.2 that $\mathcal{C}_{\mathrm{inv}}(\mathcal{X}) \subseteq \mathcal{C}_0^*$.

Theorem 3.4.2. Given a DMC $W : \mathcal{X} \to \mathcal{Y}$, a necessary and sufficient condition for $W \in \mathcal{C}_{\mathrm{inv}}(\mathcal{X})$ is that both of the following statements hold.

1. $W : \mathcal{X} \to \mathcal{Y}$ can be decomposed into $t \ge 1$ disjoint subchannels $W_i : \mathcal{X}_i \to \mathcal{Y}_i$, with $\mathcal{X}_i \subseteq \mathcal{X}$, $\mathcal{Y}_i \subseteq \mathcal{Y}$ and $W_i \in \mathcal{C}$ for all $i \in [t]$; and
2. there exists $\mathcal{X}_s \in \{\mathcal{X}_1, \ldots, \mathcal{X}_t\}$ such that $(\mathcal{X}_s; *) \le (\mathcal{X}; *)$ and every $\mathcal{X}_i \in \{\mathcal{X}_1, \ldots, \mathcal{X}_t\}$ is a left coset of $\mathcal{X}_s$.

Moreover, if $W \in \mathcal{C}_{\mathrm{inv}}$, then

$$W^- \in \mathcal{C}_{\mathrm{inv}}, \qquad (3.20)$$
$$W^+ \in \mathcal{C}_{\mathrm{inv}}. \qquad (3.21)$$

Proof. See Section 3.5.6.

The channels in Figs. 3.2a and 3.2b belong to $\mathcal{C}_{\mathrm{inv}}$. In particular, the channel of Fig. 3.2a can be decomposed into two zero-capacity subchannels, each with input alphabet size 1. With Eq. (3.16), Theorem 3.3.4, and Theorem 3.4.2, we can conclude that successive transformations of channels with zero-error capacity equal to zero give rise to channels converging asymptotically towards a set of channels $\mathcal{C}_{\mathrm{inv}}(\mathcal{X})$ with positive zero-error capacity or with zero capacity. Let $W_\infty$ denote the limit random variable of the random process $\{W_n; n \ge 0\}$ as defined previously by Eq. (2.43); we have $W_\infty = \lim_{n \to \infty} W_n \in \mathcal{C}_{\mathrm{inv}}$. Note that this does not conflict with Lemma 3.3.5, which states that after any finite number of channel transformation steps applied to a DMC $W \notin \mathcal{C}_0^*$, the corresponding bit-channels satisfy $W_{2^n}^{(i)} \notin \mathcal{C}_0^*$ for $i \in \{1, \ldots, 2^n\}$.

Now we investigate the value of the limit random variable $I_\infty$. Theorem 3.4.2 implies that $\mathcal{C}_{\mathrm{inv}}$ is a set of sum channels, every component channel of which has zero capacity. Moreover, for every $W \in \mathcal{C}_{\mathrm{inv}}$, the zero-error capacity of $W$ equals its capacity. The logic behind this is as follows: for every $W \in \mathcal{C}_{\mathrm{inv}}$ we have $C_0(W) \le C(W)$; the capacity of a sum channel gives $C(W) = \log\big(\sum_{i=1}^{t} 2^{C(W_i)}\big) = \log t$; and since the subchannels have disjoint output sets, choosing one input per subchannel yields $C_0(W) \ge \log t = C(W)$, where $t$ is the number of disjoint subchannels. Thus we can conclude that for every $W \in \mathcal{C}_{\mathrm{inv}}$, $C_0(W) = C(W) = \log t$.
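Theorem 3.4.2 can be illustrated with a concrete invariant channel. In the sketch below (our construction), the input group is $(\mathbb{Z}_4, +)$, the subgroup is $\mathcal{X}_s = \{0, 2\}$, and the channel reveals exactly the coset of its input while being useless within each coset; the one-step transformation then leaves the symmetric mutual information unchanged, i.e., $I(W^-) = I(W) = I(W^+)$.

```python
import numpy as np
from itertools import product

q = 4   # input group (Z_4, +); subgroup Xs = {0, 2} with cosets {0,2}, {1,3}
# Coset {0,2} -> outputs {0,1} uniformly; coset {1,3} -> outputs {2,3} uniformly.
# Each subchannel has zero capacity; the two cosets are perfectly distinguishable.
W = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])

def sym_info(V):
    P = V / q; Py = P.sum(axis=0); I = 0.0
    for x, y in product(range(V.shape[0]), range(V.shape[1])):
        if P[x, y] > 0:
            I += P[x, y] * np.log(P[x, y] / (Py[y] / q))
    return I / np.log(q)

def transform(V):
    ny = V.shape[1]
    Wm = np.zeros((q, ny * ny)); Wp = np.zeros((q, ny * ny * q))
    for u1, u2, y1, y2 in product(range(q), range(q), range(ny), range(ny)):
        p = V[(u1 + u2) % q, y1] * V[u2, y2] / q
        Wm[u1, y1 * ny + y2] += p
        Wp[u2, (y1 * ny + y2) * q + u1] += p
    return Wm, Wp

Wm, Wp = transform(W)
print(sym_info(Wm), sym_info(W), sym_info(Wp))   # all equal log_4(2) = 0.5
```

This matches the set of limit values given in Theorem 3.4.3 next: the channel sits exactly at $I = \log(|\mathcal{X}|/|\mathcal{X}_s|)$ for the subgroup $\{0, 2\}$.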

Theorem 3.4.3. Given a DMC $W : \mathcal{X} \to \mathcal{Y}$, $W_\infty$ takes values in the set $\mathcal{C}_{\mathrm{inv}}(\mathcal{X})$, and $I_\infty$ takes values in

$$\Big\{ \log \frac{|\mathcal{X}|}{|\mathcal{X}_s|} \;\Big|\; (\mathcal{X}_s; *) \le (\mathcal{X}; *) \Big\}.$$

For example, given a channel $W : \mathcal{X} \to \mathcal{Y}$ with $|\mathcal{X}| = 6$, $I_\infty$ takes values in $\{\log 1, \log 2, \log 3, \log 6\}$. Theorem 3.4.3 is a direct consequence of Theorem 3.4.2, so we skip the proof.

Channel polarization over monoids

We now briefly discuss channels whose input alphabet is not a group, but only a monoid. The proof of Theorem 3.3.4 is still valid, but equality may not be achieved in Eq. (3.9). A consequence of this is that the random process $I_n$ is no longer a martingale, but a supermartingale. Moreover, $I(W^+) - I(W) = I(X_1; Y_1 \mid U_1, Y_2) = 0$ does not necessarily imply that $I(W^-) - I(W) = 0$; instead, $I(W^-) - I(W) = 0$ if $W \in \mathcal{C}$. The possible values of $W_\infty$ and $I_\infty$ are not known. Intuitively, one might expect that $W_\infty$ takes values in $\mathcal{C}$ and $I_\infty = 0$.

3.5 Proofs

3.5.1 Proof of Lemma 3.2.1

Since $X - Y - Z$ and $Y - X - Z$ both form Markov chains, we have

$$P_{Z|XY}(z|x, y) = P_{Z|X}(z|x) = P_{Z|Y}(z|y), \quad \text{if } P_{XY}(x, y) > 0. \qquad (3.22)$$

This completes the proof.

3.5.2 Proof of Lemma 3.3.1

According to the chain rule of entropy, we have

$$I(W^+) - I(W) = I(U_2; Y_1, Y_2, U_1) - I(U_2; Y_2) \qquad (3.23)$$
$$= H(U_2 | Y_2) - H(U_2 | Y_1, Y_2, U_1) \qquad (3.24)$$
$$= I(U_2; Y_1, U_1 \mid Y_2) \qquad (3.25)$$
$$= H(U_1, Y_1 \mid Y_2) - H(U_1, Y_1 \mid U_2, Y_2) \qquad (3.26)$$
$$= H(U_1 | Y_2) + H(Y_1 | U_1, Y_2) - H(U_1 | U_2, Y_2) - H(Y_1 | U_1, U_2, Y_2) \qquad (3.27)$$
$$= H(Y_1 | U_1, Y_2) - H(Y_1 | U_1, U_2, Y_2) \qquad (3.28)$$
$$= H(Y_1 | U_1, Y_2) - H(Y_1 | X_1) \qquad (3.29)$$
$$= H(Y_1 | U_1, Y_2) - H(Y_1 | X_1, U_1, Y_2) \qquad (3.30)$$
$$= I(X_1; Y_1 \mid U_1, Y_2). \qquad (3.31)$$

In particular, Eq. (3.28) follows from the fact that $U_1$ is independent of $(U_2, Y_2)$, so that $H(U_1|Y_2) = H(U_1|U_2, Y_2)$. Eqs. (3.29) and (3.30) follow from the fact that $Y_1 - X_1 - (U_1, Y_2)$ forms a Markov chain. This completes the proof.

3.5.3 Proof of Lemma 3.3.2

We prove the lemma by contradiction. Assume $I(X_1; Y_1 \mid U_1, Y_2) = 0$; we will show that this leads to the contradiction $W \in \mathcal{C}_0^*$. By this assumption, $X_1 - (U_1, Y_2) - Y_1$ forms a Markov chain. By construction (see Fig. 3.1), $(U_1, Y_2) - X_1 - Y_1$ forms a Markov chain too. Hence, we are in the scenario of Lemma 3.2.1. Note that $U_1$ and $Y_2$ are independent, so $P_{U_1, Y_2}(u, y_2) = P_{U_1}(u) P_{Y_2}(y_2)$. We have assumed that $U_1$ is uniformly distributed over $\mathcal{X}$ and that for all $y \in \mathcal{Y}$ there exists $x \in \mathcal{X}$ such that $W(y|x) > 0$. Then $P_{U_1, Y_2}(u, y_2) = \frac{1}{q} P_{Y_2}(y_2) > 0$ for all $u \in \mathcal{X}$ and $y_2 \in \mathcal{Y}$.

If there exist $u, y_2$ such that $P_{X_1 | U_1, Y_2}(x | u, y_2) > 0$ for all $x \in \mathcal{X}$, then by Lemma 3.2.1, for any $y$, $W(y|x)$ takes the same value for all $x \in \mathcal{X}$, and hence $I(W) = 0$, which contradicts the condition of the lemma. We can hence assume that, for all $u, y_2$, the set $\mathcal{X}_{u, y_2} = \{x \in \mathcal{X} \mid P_{X_1 | U_1, Y_2}(x | u, y_2) > 0\}$ is a proper subset of $\mathcal{X}$.

Consider the set $\mathcal{X}_{0, y}$ for some $y$, corresponding to $U_1 = 0$ and $Y_2 = y$, where $0$ is the neutral element of the monoid $(\mathcal{X}; *)$. Examining Fig. 3.1 for $U_1 = 0$, we observe that $X_1 = X_2 = U_2$, and hence the setup conditioned on $U_1 = 0$ is equivalent to the setup in Fig. 3.3. In this case, $X_1 = X_2$ is equiprobable on the set $\mathcal{X}$. From the figure, it is clear that $\mathcal{X}_{0, y}$ is non-empty, since otherwise the definition of a channel would be contradicted.

Fig. 3.3 Setup conditioned on $U_1 = 0$.

Since $\mathcal{X}_{0, y}$ is a non-empty proper subset of $\mathcal{X}$, its complement $\mathcal{X}_{0, y}^c = \mathcal{X} \setminus \mathcal{X}_{0, y}$ is also a non-empty proper subset of $\mathcal{X}$. Let $x$ and $\bar{x}$ be elements of $\mathcal{X}_{0, y}$ and $\mathcal{X}_{0, y}^c$, respectively. By the definition of $\mathcal{X}_{0, y}$, we know that $W(y|x) > 0$ and $W(y|\bar{x}) = 0$. Pick any $\tilde{y}$ such that $W(\tilde{y}|x) > 0$, and let us assume for now that $W(\tilde{y}|\bar{x}) > 0$ as well.
Then $W(\tilde{y}|x) > 0$ and $W(\tilde{y}|\bar{x}) > 0$ imply

$$P_{X_1 Y_2}(x, \tilde{y}) > 0, \quad \text{and} \qquad (3.32)$$
$$P_{X_1 Y_2}(\bar{x}, \tilde{y}) > 0, \qquad (3.33)$$

respectively. Lemma 3.2.1 with Eqs. (3.32) and (3.33) gives, for any $y_1$,

$$P_{Y_1 | X_1}(y_1 | x) = P_{Y_1 | X_1}(y_1 | \bar{x}), \qquad (3.34)$$

which is impossible by construction because $W(y|x) > 0$ and $W(y|\bar{x}) = 0$. Hence, our assumption that $W(\tilde{y}|\bar{x}) > 0$ leads to a contradiction, and we conclude that $W(\tilde{y}|\bar{x}) = 0$. Having shown that $W(\tilde{y}|\bar{x}) = 0$ for any $\tilde{y} \in \mathcal{Y}$ such that $W(\tilde{y}|x) > 0$, it follows that the inputs $x$ and $\bar{x}$ can be used to transmit one bit with zero probability of error over the channel, which contradicts the condition of the lemma. This completes the proof.

3.5.4 Proof of Lemma 3.3.3

We now prove that the overall mutual information is non-increasing under one step of channel transformation. According to the chain rule of mutual information and entropy, we have

$$I(W^-) + I(W^+) = I(U_1; Y_1, Y_2) + I(U_2; Y_1, Y_2, U_1) \qquad (3.35)$$
$$= I(U_1; Y_1, Y_2) + I(U_2; Y_1, Y_2 \mid U_1) \qquad (3.36)$$
$$= I(U_1, U_2; Y_1, Y_2) \qquad (3.37)$$
$$= I(X_1, X_2; Y_1, Y_2) \qquad (3.38)$$
$$= H(Y_1, Y_2) - H(Y_1, Y_2 \mid X_1, X_2) \qquad (3.39)$$
$$= H(Y_2) + H(Y_1 | Y_2) - H(Y_1 | X_1) - H(Y_2 | X_2) \qquad (3.40)$$
$$\le I(X_2; Y_2) + H(Y_1) - H(Y_1 | X_1) \qquad (3.41)$$
$$= 2I(W). \qquad (3.42)$$

Here Eq. (3.36) uses the independence of $U_1$ and $U_2$, and the inequality in Eq. (3.41) uses $H(Y_1|Y_2) \le H(Y_1)$. A sufficient but not necessary condition for equality in Eq. (3.41) is that $X_1$ and $X_2$ be independent of each other, as happens, e.g., when $(\mathcal{X}; *)$ forms a group. Characterizing the full range of operations that yield equality in Eq. (3.41) is an interesting problem that has been studied in [51].
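The chain (3.35)-(3.42) is straightforward to verify numerically in the group case. The sketch below (ours; the channel and all names are illustrative) builds the joint distribution of $(U_1, U_2, Y_1, Y_2)$ for a randomly drawn 4-ary channel under modulo-4 addition. Because the group operation makes $X_1$ and $X_2$ independent, and hence $Y_1$ and $Y_2$ independent, the inequality in Eq. (3.41) is tight and the sum rule holds with equality.

```python
import numpy as np
from itertools import product

q, ny = 4, 5
rng = np.random.default_rng(0)
W = rng.random((q, ny)); W /= W.sum(axis=1, keepdims=True)   # random DMC

# Joint P(u1, u2, y1, y2) with uniform U1, U2 and X1 = U1 + U2 mod q, X2 = U2
P = np.zeros((q, q, ny, ny))
for u1, u2, y1, y2 in product(range(q), range(q), range(ny), range(ny)):
    P[u1, u2, y1, y2] = W[(u1 + u2) % q, y1] * W[u2, y2] / q**2

def H(PA):
    """Entropy (base q) of a distribution given as an array of any shape."""
    p = PA.ravel(); p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(q))

PY1Y2 = P.sum(axis=(0, 1))
HY1 = H(PY1Y2.sum(axis=1))
HY1gY2 = H(PY1Y2) - H(PY1Y2.sum(axis=0))      # H(Y1|Y2) = H(Y1,Y2) - H(Y2)
print(HY1gY2, "==", HY1)                      # Y1 and Y2 are independent

I_sum = H(PY1Y2) + H(np.full(q * q, 1 / q**2)) - H(P)    # I(U1,U2; Y1,Y2)
PXY = W / q
I_W = H(PXY.sum(axis=0)) + H(PXY.sum(axis=1)) - H(PXY)   # I(W)
print(I_sum, "==", 2 * I_W)                   # sum rule (3.42) with equality
```

Replacing the group operation by a mere monoid operation can make $Y_1$ and $Y_2$ dependent, in which case the inequality in Eq. (3.41) typically becomes strict.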

3.5.5 Proof of Lemma 3.3.5

We first prove $W^+ \notin \mathcal{C}_0$. The transition probability of the channel $W^+ : \mathcal{X} \to \mathcal{Y}^2 \times \mathcal{X}$ is

$$W^+(y_1, y_2, u_1 \mid u_2) = P_{U_1}(u_1)\, W(y_1 \mid u_1 * u_2)\, W(y_2 \mid u_2). \qquad (3.43)$$

Then for any $u_2, u_2' \in \mathcal{X}$, we have

$$\sum_{y_1 \in \mathcal{Y}} \sum_{y_2 \in \mathcal{Y}} \sum_{u_1 \in \mathcal{X}} W^+(y_1, y_2, u_1 \mid u_2)\, W^+(y_1, y_2, u_1 \mid u_2') \qquad (3.44)$$
$$= \sum_{y_1 \in \mathcal{Y}} \sum_{y_2 \in \mathcal{Y}} \sum_{u_1 \in \mathcal{X}} \big(P_{U_1}(u_1)\big)^2\, W(y_2|u_2)\, W(y_2|u_2')\, W(y_1|u_1 * u_2)\, W(y_1|u_1 * u_2') \qquad (3.45)$$
$$= \underbrace{\sum_{y_2 \in \mathcal{Y}} W(y_2|u_2)\, W(y_2|u_2')}_{>0} \sum_{u_1 \in \mathcal{X}} \big(P_{U_1}(u_1)\big)^2 \underbrace{\sum_{y_1 \in \mathcal{Y}} W(y_1|u_1 * u_2)\, W(y_1|u_1 * u_2')}_{>0} \qquad (3.46)$$
$$> 0, \qquad (3.47)$$

where the two underbraced sums are positive by Lemma 3.2.4, since $W \notin \mathcal{C}_0$. Eq. (3.47), along with Lemma 3.2.4, implies $C_0(W^+) = 0$, that is to say,

$$W^+ \notin \mathcal{C}_0. \qquad (3.48)$$

Moreover, since

$$I(W^+) = I(U_2; Y_1, Y_2, U_1) \qquad (3.49)$$
$$= I(U_2; Y_2) + I(U_2; Y_1, U_1 \mid Y_2) \qquad (3.50)$$
$$\ge I(W) \qquad (3.51)$$
$$> 0, \qquad (3.52)$$

where Eq. (3.51) holds since $I(U_2; Y_2) = I(W)$ and mutual information is non-negative, and Eq. (3.52) holds since $W \notin \mathcal{C}$, we have

$$W^+ \notin \mathcal{C}. \qquad (3.53)$$

Based on Eqs. (3.48) and (3.53), we conclude that

$$W^+ \notin \mathcal{C}_0^*, \qquad (3.54)$$
which completes the first part of the proof.

The transition probability of the channel $W^- : \mathcal{X} \to \mathcal{Y}^2$ is

$$W^-(y_1, y_2 \mid u_1) = \sum_{u_2 \in \mathcal{X}} P_{U_2}(u_2)\, W(y_1 \mid u_1 * u_2)\, W(y_2 \mid u_2). \qquad (3.55)$$

Then for any $u_1, u_1' \in \mathcal{X}$, we have

$$\sum_{y_1 \in \mathcal{Y}} \sum_{y_2 \in \mathcal{Y}} W^-(y_1, y_2 \mid u_1)\, W^-(y_1, y_2 \mid u_1') \qquad (3.56)$$
$$= \sum_{y_1 \in \mathcal{Y}} \sum_{y_2 \in \mathcal{Y}} \Big( \sum_{u_2 \in \mathcal{X}} P_{U_2}(u_2)\, W(y_1|u_1 * u_2)\, W(y_2|u_2) \Big) \Big( \sum_{u_2' \in \mathcal{X}} P_{U_2}(u_2')\, W(y_1|u_1' * u_2')\, W(y_2|u_2') \Big) \qquad (3.57)$$
$$\ge \sum_{y_1 \in \mathcal{Y}} \sum_{y_2 \in \mathcal{Y}} P_{U_2}(u_2)\, W(y_1|u_1 * u_2)\, W(y_2|u_2)\, P_{U_2}(u_2')\, W(y_1|u_1' * u_2')\, W(y_2|u_2') \qquad (3.58)$$
$$= P_{U_2}(u_2)\, P_{U_2}(u_2') \underbrace{\sum_{y_2 \in \mathcal{Y}} W(y_2|u_2)\, W(y_2|u_2')}_{>0} \underbrace{\sum_{y_1 \in \mathcal{Y}} W(y_1|u_1 * u_2)\, W(y_1|u_1' * u_2')}_{>0} \qquad (3.59)$$
$$> 0, \qquad (3.60)$$

where Eq. (3.58) holds for any fixed $u_2, u_2' \in \mathcal{X}$ and follows from the fact that a sum of non-negative numbers is greater than or equal to any of its addends. Eq. (3.60), along with Lemma 3.2.4, implies that $C_0(W^-) = 0$, that is to say,

$$W^- \notin \mathcal{C}_0. \qquad (3.61)$$

Next we prove that $W^- \notin \mathcal{C}$ if $W \notin \mathcal{C}$, i.e., that $I(W^-) > 0$ if $I(W) > 0$. We will prove the equivalent proposition that $I(W^-) = 0$ implies $I(W) = 0$. Assuming $I(W^-) = 0$, consider the following series of equations:

$$I(W^+) - I(W) = I(X_1; Y_1 \mid U_1, Y_2) = I(W) - I(W^-) = I(W), \qquad (3.62)$$
$$I(X_1; Y_1 \mid U_1, Y_2) = I(X_1; Y_1), \qquad (3.63)$$
$$H(Y_1 \mid U_1, Y_2) - H(Y_1 \mid X_1) = H(Y_1) - H(Y_1 \mid X_1), \qquad (3.64)$$
$$H(Y_1 \mid U_1, Y_2) - H(Y_1) = 0, \qquad (3.65)$$
$$I(Y_1; U_1, Y_2) = 0, \qquad (3.66)$$
$$I(Y_1; U_1) + I(Y_1; Y_2 \mid U_1) = 0, \qquad (3.67)$$
$$I(Y_1; Y_2 \mid U_1) = 0, \qquad (3.68)$$
$$I(Y_1; Y_2 \mid U_1 = 0) = 0, \qquad (3.69)$$

where in the last step $0$ is the neutral element of the group. The second equality in Eq. (3.62) holds because $(\mathcal{X}; *)$ forms a group (see Lemma 3.3.3). The left-hand side of Eq. (3.64) uses the fact that $Y_1 - X_1 - (U_1, Y_2)$ forms a Markov chain. Eq. (3.68) follows from the non-negativity of mutual information.

We now examine the joint distribution of $(Y_1, Y_2)$ given $U_1 = 0$. All following arguments are conditioned on $U_1 = 0$, and we omit this conditioning for simplicity. Since $U_1 = 0$, we let $X = X_1 = X_2 = U_2$ be a uniform random variable on $\mathcal{X}$. Then the joint distribution satisfies

$$P_{Y_1 Y_2}(y_1, y_2) = \sum_x P_{Y_1 Y_2 | X}(y_1, y_2 \mid x)\, P_X(x) \qquad (3.70)$$
$$= \sum_x P_{Y_1|X}(y_1|x)\, P_{Y_2|X}(y_2|x)\, P_X(x), \qquad (3.71)$$

and the marginal distributions satisfy

$$P_{Y_1}(y_1)\, P_{Y_2}(y_2) = \Big( \sum_x P_{Y_1|X}(y_1|x)\, P_X(x) \Big) \Big( \sum_x P_{Y_2|X}(y_2|x)\, P_X(x) \Big). \qquad (3.72)$$

Since $I(Y_1; Y_2 \mid U_1 = 0) = 0$, $Y_1$ and $Y_2$ are independent (given $U_1 = 0$). We thus have

$$P_{Y_1 Y_2}(y_1, y_2) = P_{Y_1}(y_1)\, P_{Y_2}(y_2), \qquad (3.73)$$
$$\sum_x P_X(x)\, P_{Y_1|X}(y_1|x)\, P_{Y_2|X}(y_2|x) = \Big( \sum_x P_X(x)\, P_{Y_1|X}(y_1|x) \Big) \Big( \sum_x P_X(x)\, P_{Y_2|X}(y_2|x) \Big), \qquad (3.74)$$
$$\frac{1}{q} \sum_x P_{Y_1|X}(y_1|x)\, P_{Y_2|X}(y_2|x) = \frac{1}{q^2} \Big( \sum_x P_{Y_1|X}(y_1|x) \Big) \Big( \sum_x P_{Y_2|X}(y_2|x) \Big), \qquad (3.75)$$

for all $y_1, y_2 \in \mathcal{Y}$. Setting $y_1 = y_2 = y$ and noting that $P_{Y_1|X}$ and $P_{Y_2|X}$ are both the transition probability of the original channel $W$, denoted by $P_{Y|X}$, we obtain

$$\frac{1}{q} \sum_x \big( P_{Y|X}(y|x) \big)^2 = \Big( \frac{1}{q} \sum_x P_{Y|X}(y|x) \Big)^2. \qquad (3.76)$$

According to Jensen's inequality, equality here is achieved if and only if all terms are equal, i.e., for each $y \in \mathcal{Y}$, $P_{Y|X}(y|x) = c$ for all $x \in \mathcal{X}$, where $c$ is a constant depending on $y$. Thus for each $y \in \mathcal{Y}$, $P_{X|Y}(x|y) = \frac{1}{q}$ for all $x \in \mathcal{X}$. Then $I(W)$ must satisfy

$$I(W) = H(X) - H(X|Y) \qquad (3.77)$$
$$= \log q - \sum_y P_Y(y)\, H(X \mid Y = y) \qquad (3.78)$$
$$= \log q - \log q \sum_y P_Y(y) \qquad (3.79)$$
$$= 0. \qquad (3.80)$$

We have shown that $I(W^-) = 0$ implies $I(W) = 0$; equivalently, if $W \notin \mathcal{C}$, then

$$W^- \notin \mathcal{C}. \qquad (3.81)$$

Based on Eqs. (3.61) and (3.81), we conclude that

$$W^- \notin \mathcal{C}_0^*. \qquad (3.82)$$

Furthermore, we notice that the proofs of Eq. (3.48) and Eq. (3.61) do not require the inverse property of a group. Thus if $W \notin \mathcal{C}_0$ and $(\mathcal{X}; *)$ forms a monoid, we have

$$W^- \notin \mathcal{C}_0, \qquad (3.83)$$
$$W^+ \notin \mathcal{C}_0. \qquad (3.84)$$

This completes the proof.

3.5.6 Proof of Theorem 3.4.2

We first prove the necessary condition for $W \in \mathcal{C}_{\mathrm{inv}}$. This is a stronger result than what was proved in Lemma 3.3.2, and we follow the idea of the proof of Lemma 3.3.2. Let $\mathcal{X}_{u, y_2} = \{x \in \mathcal{X} \mid P_{X_1 | U_1, Y_2}(x | u, y_2) > 0\}$ and $\mathcal{Y}_{u, y_2} = \{y \in \mathcal{Y} \mid P_{Y_1 | U_1, Y_2}(y | u, y_2) > 0\}$. Assume $W \in \mathcal{C}_{\mathrm{inv}}$, i.e., $I(X_1; Y_1 \mid U_1, Y_2) = 0$. According to the proof of Lemma 3.3.2, we have the following two cases.
Case 1: There exist $u, y_2$ such that $\mathcal{X}_{u, y_2} = \mathcal{X}$. Then $W \in \mathcal{C}$ and both conditions are fulfilled. Fig. 3.4 illustrates the channel described by Case 1, where lines with the same color represent the same transition probability.

Fig. 3.4 Channel described by Case 1.

Case 2: For all $u, y_2$, the set $\mathcal{X}_{u, y_2}$ is a proper subset of $\mathcal{X}$. Examining Fig. 3.1 for $U_1 = 0$, where $0$ is the neutral element of $(\mathcal{X}; *)$, we observe that $X_1 = X_2 = U_2$, and hence the setup conditioned on $U_1 = 0$ is equivalent to the setup in Fig. 3.3. Fig. 3.5 illustrates the channel described by Case 2.

We first prove Condition 1. According to the proof of Lemma 3.3.2, if $x_0 \in \mathcal{X}_{0, y}$ and $x_1 \in \mathcal{X}_{0, y}^c$, then

$$\sum_{y \in \mathcal{Y}} W(y|x_0)\, W(y|x_1) = 0. \qquad (3.85)$$

We claim that for all $y_i, y_j \in \mathcal{Y}$,

$$\mathcal{X}_{0, y_i} = \mathcal{X}_{0, y_j} \quad \text{or} \quad \mathcal{X}_{0, y_i} \cap \mathcal{X}_{0, y_j} = \emptyset. \qquad (3.86)$$

This can be seen via a proof by contradiction. Assuming the contrary, we can find $x_0 \in \mathcal{X}_{0, y_i} \cap \mathcal{X}_{0, y_j}$ and $x_1 \in \mathcal{X}_{0, y_i} \cap \mathcal{X}_{0, y_j}^c$ such that

$$W(y|x_0) = W(y|x_1), \qquad (3.87)$$
$$\sum_{y \in \mathcal{Y}} W(y|x_0)\, W(y|x_1) = 0. \qquad (3.88)$$
