Lecture 8: Shannon's Noise Models


Error Correcting Codes: Combinatorics, Algorithms and Applications (Fall 2007)

Lecture 8: Shannon's Noise Models
September 14, 2007
Lecturer: Atri Rudra    Scribe: Sandipan Kundu & Atri Rudra

Till now we have concentrated on the worst-case noise model pioneered by Hamming. In today's lecture, we will study some stochastic noise models, most of which were first studied by Shannon.

1 Overview of Shannon's Result

Shannon introduced the notion of reliable communication over noisy channels. Broadly, there are two types of channels that were studied by Shannon:

Noisy channel: This type of channel introduces errors during transmission, which result in incorrect reception of the transmitted signal by the receiver. Redundancy is added at the transmitter to increase the reliability of the transmitted data, and this redundancy is removed at the receiver. This process is termed channel coding.

Noise-free channel: As the name suggests, this channel does not introduce any errors during transmission. Redundancy in the source data is exploited to compress the data at the transmitter; the data is decompressed at the receiver. This process is popularly known as source coding.

The figure below presents a generic model of a communication system, which combines the two concepts we discussed above:

    Message --> Source Encoder --> Channel Encoder --> Channel --> Channel Decoder --> Source Decoder --> Decoded Message

In the figure above, source coding and channel coding are coupled. However, Shannon's source coding theorem allows us to decouple these two parts of the communication setup and study each part separately. Intuitively, this makes sense: if one can have reliable communication over the channel using channel coding, then for the source coding part the channel effectively has no noise. For source coding, Shannon proved a theorem that precisely calculates the amount by which the message can be compressed: this amount is related to the entropy of the message. We will, however, not talk about source coding in any detail in this course.

From now on, we will exclusively focus on the channel coding part of the communication setup. Note that one aspect of channel coding is how we model the channel noise. We have seen Hamming's worst-case noise model in some detail. Next, we will study some specific stochastic channels.

2 Shannon's Noise Model

Shannon proposed a stochastic way of modeling noise. The symbols that are input to the channel are assumed to come from some input alphabet X, while the channel outputs symbols from its output alphabet Y. The following diagram shows this relationship:

    x ∈ X  -->  channel  -->  y ∈ Y

The channels considered by Shannon are also memoryless, that is, noise acts independently on each transmitted symbol. In this course, we'll only study discrete channels, where both alphabets X and Y are finite (today we will define one channel that is not discrete, though we will not study it in any detail later on).

The final piece in the specification of a channel is the transition matrix M that governs how the channel introduces errors. In particular, the channel is described by a matrix whose entries are the crossover probabilities over all combinations of input and output symbols. For any pair (x, y) ∈ X × Y, let Pr(y | x) denote the probability that y is output by the channel when x is input. Then the transition matrix is given by M(x, y) = Pr(y | x): the rows of M are indexed by X, the columns by Y, and the (x, y) entry is Pr(y | x).

Next, we look at some specific instances of channels.

2.1 Binary Symmetric Channel (BSC)

Let 0 ≤ p ≤ 1/2. The Binary Symmetric Channel with crossover probability p, or BSC_p, is defined as follows. X = Y = {0, 1}. The 2 × 2 transition matrix can naturally be represented as a bipartite graph, where the left vertices correspond to the rows and the right vertices correspond to the columns of the matrix, and M(x, y) is the weight of the corresponding (x, y) edge. For BSC_p, the edges 0 -> 0 and 1 -> 1 have weight 1 - p, while the edges 0 -> 1 and 1 -> 0 have weight p; equivalently,

    M = [ 1-p    p  ]
        [  p    1-p ]

In other words, every bit is flipped with probability p.
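To make the definition concrete, here is a minimal Python sketch (my own addition, not from the notes; the function names are illustrative) that builds the BSC_p transition matrix and simulates the memoryless channel on a block of bits.

```python
import numpy as np

def bsc_matrix(p):
    """Transition matrix M with M[x, y] = Pr(y | x) for BSC_p."""
    return np.array([[1 - p, p],
                     [p, 1 - p]])

def bsc_transmit(bits, p, rng=None):
    """Memoryless noise: flip each bit independently with probability p."""
    rng = rng or np.random.default_rng()
    bits = np.asarray(bits)
    flips = (rng.random(bits.shape) < p).astype(bits.dtype)
    return bits ^ flips

if __name__ == "__main__":
    p = 0.1
    print(bsc_matrix(p))                       # each row sums to 1
    y = bsc_transmit(np.zeros(100000, dtype=int), p)
    print("empirical flip rate:", y.mean())    # should be close to p
```

Running the sketch on a long all-zeros block simply confirms that the empirical fraction of flipped bits concentrates around p.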

2.2 q-ary Symmetric Channel (qSC)

We now look at the generalization of BSC_p to alphabets of size q ≥ 2. Let 0 ≤ p ≤ 1 - 1/q. The q-ary Symmetric Channel with crossover probability p, or qSC_p, is defined as follows. X = Y = [q]. The transition matrix M for qSC_p is defined as follows:

    M(x, y) = 1 - p          if y = x
    M(x, y) = p / (q - 1)    if y ≠ x

In other words, every symbol is left untouched with probability 1 - p and is distorted to each of the q - 1 possible different symbols with equal probability.

2.3 Binary Erasure Channel (BEC)

In the previous two examples, X = Y. However, this need not always be the case. Let 0 ≤ α ≤ 1. The Binary Erasure Channel with erasure probability α, or BEC_α, is defined as follows. X = {0, 1} and Y = {0, 1, ?}, where ? denotes an erasure. The transition matrix is as follows (in the bipartite-graph representation, any edge that is not present corresponds to a transition that occurs with probability 0):

    M(0, 0) = M(1, 1) = 1 - α,    M(0, ?) = M(1, ?) = α,    M(0, 1) = M(1, 0) = 0.

In other words, every bit in BEC_α is erased with probability α (and is left as is with probability 1 - α).
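Similarly, a short Python sketch (again my own, with illustrative names; erasures are marked with -1 as a stand-in for the ? symbol) that simulates qSC_p and BEC_α on arrays of symbols.

```python
import numpy as np

def qsc_transmit(symbols, p, q, rng=None):
    """qSC_p: keep each symbol with prob. 1-p, otherwise replace it by one of
    the other q-1 symbols, each with prob. p/(q-1)."""
    rng = rng or np.random.default_rng()
    symbols = np.asarray(symbols)
    out = symbols.copy()
    corrupt = rng.random(symbols.shape) < p
    # A uniform shift in {1, ..., q-1} guarantees the new symbol differs from the old one.
    shifts = rng.integers(1, q, size=symbols.shape)
    out[corrupt] = (symbols[corrupt] + shifts[corrupt]) % q
    return out

def bec_transmit(bits, alpha, rng=None):
    """BEC_alpha: erase each bit independently with probability alpha."""
    rng = rng or np.random.default_rng()
    out = np.asarray(bits).copy()
    out[rng.random(out.shape) < alpha] = -1   # -1 plays the role of '?'
    return out
```

For example, bec_transmit(np.ones(8, dtype=int), 0.5) returns a length-8 array in which, on average, half of the entries have been replaced by -1.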

2.4 Binary Input Additive Gaussian White Noise Channel (BIAGWN)

We now look at a channel that is not discrete. Let σ ≥ 0. The Binary Input Additive Gaussian White Noise Channel with standard deviation σ, or BIAGWN_σ, is defined as follows. X = {-1, 1} and Y = ℝ. The noise is modeled by a continuous Gaussian probability distribution, which has many nice properties and is a popular choice for modeling noise of a continuous nature. Given (x, y) ∈ {-1, 1} × ℝ, the noise y - x is distributed according to the Gaussian distribution with zero mean and standard deviation σ. In other words,

    Pr(y | x) = (1 / (σ√(2π))) · exp( -(y - x)² / (2σ²) ).

3 Error Correction in Stochastic Noise Models

We now need to revisit the notion of error correction. Note that unlike in Hamming's noise model, we cannot hope to always recover the transmitted codeword. As an example, in BSC_p there is always some positive probability that a codeword is distorted into another codeword during transmission. In such a scenario no decoding algorithm can hope to recover the transmitted codeword. Thus, in stochastic channels there will always be some decoding error probability (where the randomness is over the channel noise). However, we would like this error probability to be small for every possible transmitted codeword. More precisely, for every message, we would like the decoding algorithm to recover the transmitted message with probability 1 - f(n), where lim_{n→∞} f(n) = 0, that is, f(n) is o(1). Ideally, we would like to have f(n) = 2^{-Ω(n)}.

3.1 Shannon's General Theorem

Recall that the big question we are interested in throughout this course is the tradeoff between the rate of the code and the fraction of errors that can be corrected. For the stochastic noise models we have seen, it is natural to think of the fraction of errors as the parameter that governs the amount of error introduced by the channel. For example, for BSC_p, we will think of p as the fraction of errors.

Shannon's remarkable theorem on channel coding precisely identifies when reliable transmission is possible over the stochastic noise models he considered. In particular, for the general framework of noise models that he considered, Shannon defined the notion of capacity, a real number such that reliable communication is possible if and only if the rate is less than the capacity of the channel. We are going to state (and prove) Shannon's general result for the special case of BSC_p. To state the result, we will need the following definition:

Definition 3.1 (q-ary Entropy Function). Let q ≥ 2 be an integer and 0 ≤ x ≤ 1 be a real. Then the q-ary entropy function is defined as follows:

    H_q(x) = x log_q(q - 1) - x log_q(x) - (1 - x) log_q(1 - x).

See Figure 1 for a pictorial representation of H_q(·) for the first few values of q. For the special case of q = 2, we will drop the subscript from the entropy function and denote H_2(x) by just H(x), that is,

    H(x) = -x log x - (1 - x) log(1 - x),

where log x is defined as log_2(x) (we are going to follow this convention for the rest of the course).
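As a quick sanity check on Definition 3.1, here is a small Python sketch (my own addition, not part of the notes) that evaluates H_q(x) and confirms numerically that the maximum value 1 is attained at x = 1 - 1/q, as Figure 1 below shows.

```python
import math

def entropy_q(x, q=2):
    """q-ary entropy function H_q(x), using the convention 0*log(0) = 0."""
    def xlog(z):
        return z * math.log(z, q) if z > 0 else 0.0
    return x * math.log(q - 1, q) - xlog(x) - xlog(1 - x)

if __name__ == "__main__":
    for q in (2, 3, 4):
        x_max = 1 - 1 / q
        print(q, round(entropy_q(x_max, q), 6))   # each line should print 1.0
```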

[Figure 1: A plot of H_q(x) for q = 2, 3 and 4, with x on the horizontal axis and H_q(x) on the vertical axis. The maximum value of 1 is achieved at x = 1 - 1/q.]

We are now ready to state the theorem:

Theorem 3.2 (Shannon's Capacity for BSC). For reals 0 ≤ p < 1/2 and 0 ≤ ε ≤ 1/2 - p, the following hold for large enough n:

1. There exists a real δ > 0, an encoding function E : {0, 1}^k → {0, 1}^n and a decoding function D : {0, 1}^n → {0, 1}^k, where k ≤ (1 - H(p + ε))n, such that the following holds for every m ∈ {0, 1}^k:

    Pr over noise e from BSC_p [ D(E(m) + e) ≠ m ] ≤ 2^{-δn}.

2. If k ≥ (1 - H(p) + ε)n, then for every pair of encoding and decoding functions E : {0, 1}^k → {0, 1}^n and D : {0, 1}^n → {0, 1}^k, there exists m ∈ {0, 1}^k such that

    Pr over noise e from BSC_p [ D(E(m) + e) ≠ m ] ≥ 1/2.

Remark 3.3. Theorem 3.2 implies that the capacity of BSC_p is 1 - H(p). It can also be shown that the capacities of qSC_p and BEC_α are 1 - H_q(p) and 1 - α respectively.

The appearance of the entropy function in Theorem 3.2 might surprise a reader who has not seen the theorem before. Without going into the details of the proof for now, we remark that the entropy function gives a very good estimate of the volume of a Hamming ball. In particular, recall that B_q(y, ρn) is the Hamming ball of radius ρn, that is,

    B_q(y, ρn) = { x ∈ [q]^n : Δ(x, y) ≤ ρn }.

Let Vol_q(y, ρn) = |B_q(y, ρn)| denote the volume of the Hamming ball of radius ρn. Note that since the volume of a Hamming ball is translation invariant, Vol_q(y, ρn) = Vol_q(0, ρn).
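Referring back to Remark 3.3, the following small Python sketch (my own addition) plugs a few parameter values into the capacity formulas 1 - H(p), 1 - H_q(p) and 1 - α stated there; the parameter choices are arbitrary.

```python
import math

def H_q(x, q=2):
    """q-ary entropy function, with the convention 0*log(0) = 0."""
    def xlog(z):
        return z * math.log(z, q) if z > 0 else 0.0
    return x * math.log(q - 1, q) - xlog(x) - xlog(1 - x)

# Capacity formulas from Remark 3.3, evaluated at sample parameters:
print("BSC_p capacity, p = 0.11:", 1 - H_q(0.11))          # roughly 0.5
print("qSC_p capacity, q = 4, p = 0.3:", 1 - H_q(0.3, q=4))
print("BEC_alpha capacity, alpha = 0.25:", 1 - 0.25)
```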

We will need the following inequalities in the proof of Theorem 3.2.

Proposition 3.4. Let q ≥ 2, n ≥ 1 be integers and let 0 ≤ ρ ≤ 1 - 1/q be a real. Then the following inequalities hold:

1. Vol_q(0, ρn) ≤ q^{n H_q(ρ)}; and

2. Vol_q(0, ρn) ≥ q^{n H_q(ρ) - o(n)}.

In the next lecture, we will see the proof of Proposition 3.4 as well as the proof of the negative part of Theorem 3.2.
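Proposition 3.4 is easy to check numerically for small parameters. The sketch below (my own addition) uses the standard count Vol_q(0, r) = Σ_{i=0}^{r} C(n, i)(q - 1)^i for the exact ball volume and compares it against the upper bound q^{n H_q(ρ)} from part 1; the parameter choices are arbitrary.

```python
import math

def H_q(x, q):
    """q-ary entropy function, with the convention 0*log(0) = 0."""
    def xlog(z):
        return z * math.log(z, q) if z > 0 else 0.0
    return x * math.log(q - 1, q) - xlog(x) - xlog(1 - x)

def vol_q(n, radius, q):
    """Exact volume of the q-ary Hamming ball of the given radius in [q]^n."""
    return sum(math.comb(n, i) * (q - 1) ** i for i in range(radius + 1))

if __name__ == "__main__":
    q, n, rho = 2, 200, 0.2
    vol = vol_q(n, int(rho * n), q)
    bound = q ** (n * H_q(rho, q))
    # Part 1 of Proposition 3.4: the exact volume never exceeds q^(n*H_q(rho)).
    print(vol <= bound)
    # Normalized log-volume vs. H_q(rho); the gap is the o(n)/n term of part 2.
    print(math.log(vol, q) / n, H_q(rho, q))
```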