Lecture 1: Basic problems of coding theory

Error-Correcting Codes (Spring 2016), Rutgers University
Swastik Kopparty
Scribes: Abhishek Bhrushundi & Aditya Potukuchi

Administrivia was discussed at the beginning of this lecture. All the information about the course can be found on the course web page: http://www.math.rutgers.edu/~sk133/courses/codes-S16/.

1 Basic concepts, definitions, and facts

1.1 Definitions

For most of the course, n will denote a parameter taking values in N.

Definition 1 (Hamming distance). The Hamming distance between x, y ∈ {0,1}^n is defined as the number of coordinates in which x and y differ. We shall denote the Hamming distance between x and y by Δ(x, y).

A related notion is that of Hamming weight:

Definition 2 (Hamming weight). For x ∈ {0,1}^n, the Hamming weight of x, denoted by wt(x) or |x|, is defined as Δ(x, 0), where 0 denotes the all-zero vector.

Even though we defined the above concepts for strings over {0,1}, one can extend them to strings over an arbitrary alphabet Σ in a natural manner. The reader can easily check the following:

Fact 3. For x, y, z ∈ {0,1}^n, Δ(x, y) + Δ(y, z) ≥ Δ(x, z). Also, Δ(x, y) ≥ 0, and equality holds iff x = y. Another way of saying this is that Δ(·, ·) is a metric on {0,1}^n.

We are now in a position to define what a code with minimum distance d is:

Definition 4 (Code with minimum distance d). A code with minimum distance d in {0,1}^n is a subset C ⊆ {0,1}^n with the property that min_{x,y ∈ C, x ≠ y} Δ(x, y) = d. d is called the minimum distance of the code C.

Note that the notion of minimum distance is well defined for any subset of {0,1}^n. Before we go on, we introduce yet another useful concept.

Definition 5 (Hamming ball). The Hamming ball of radius r centered around c ∈ {0,1}^n is the set {y ∈ {0,1}^n : Δ(y, c) ≤ r}. We will denote it by B_n(c, r), though it is typical to drop the n in the subscript when it is clear that we are working over {0,1}^n.
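The definitions above translate directly into code. Here is a minimal sketch (the function names are ours, chosen for illustration):

```python
from itertools import combinations

def hamming_distance(x, y):
    """Delta(x, y): the number of coordinates in which x and y differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

def hamming_weight(x):
    """wt(x) = Delta(x, 0), the number of non-zero coordinates."""
    return sum(b != 0 for b in x)

def minimum_distance(code):
    """Minimum distance: the smallest Delta(x, y) over distinct x, y in the code."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

# Example: the length-3 repetition code {000, 111} has minimum distance 3.
rep3 = [(0, 0, 0), (1, 1, 1)]
```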

1.2 A short discussion on error correction

What does error correction have to do with the combinatorial objects defined above? Let us consider the following scenario: Alice is on Earth and Bob is on the Moon. They are sending messages to each other, where the messages come from a subset C of {0,1}^n. Given the huge distance between them, it is conceivable that whenever a message is sent, some of its bits get corrupted (flipped). As we shall see later, it is reasonable to assume that whenever a message x ∈ C ⊆ {0,1}^n is transmitted, at most t of its bits get corrupted. Alice and Bob want to be able to recover the original message from a corrupted one. Let us suppose that the set C has minimum distance d. How much error can be tolerated, i.e. how large can t be?

Claim 6. t ≤ ⌊(d − 1)/2⌋.

Proof. Suppose d < 2t + 1. Let c_1 and c_2 be in C such that Δ(c_1, c_2) = d. It follows that there is a string x ∈ {0,1}^n such that Δ(x, c_1) ≤ t and Δ(x, c_2) ≤ t. Now suppose Alice sends c_1 to Bob and it gets corrupted with t errors and becomes x when it reaches Bob. Bob has no way to tell whether Alice had sent c_1 or c_2, since both can be corrupted with t errors and be made into x.

But what happens when t ≤ ⌊(d − 1)/2⌋? Can the original codeword be recovered? The following fact, which follows from the triangle inequality, tells us that recovery is possible in this regime:

Fact 7. Let C be a code with minimum distance d. For any c, c' ∈ C, c ≠ c', we have that B(c, ⌊(d − 1)/2⌋) is disjoint from B(c', ⌊(d − 1)/2⌋).

Proof. We will give a proof by contradiction. Suppose x ∈ B(c, ⌊(d − 1)/2⌋) ∩ B(c', ⌊(d − 1)/2⌋). Then Δ(x, c) ≤ (d − 1)/2 and Δ(x, c') ≤ (d − 1)/2, and by the triangle inequality, we have Δ(c, c') ≤ d − 1 < d, which is a contradiction.

Thus, we want the minimum distance to be as large as possible in order to be able to tolerate a large number of errors. But then one might argue: why not just take C to be the set containing the all-zero string and the all-one string, making the distance n? The reason is that we not only want large distance, but also want C to be large. Why?
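Fact 7 is exactly what makes unique decoding from up to ⌊(d − 1)/2⌋ errors possible: at most one codeword can be that close to the received string. A brute-force sketch (function name ours):

```python
def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def unique_decode(x, code, d):
    """Return the unique codeword within distance (d-1)//2 of x, or None.

    By Fact 7, the balls of radius (d-1)//2 around distinct codewords are
    disjoint, so at most one codeword can qualify."""
    t = (d - 1) // 2
    candidates = [c for c in code if hamming_distance(x, c) <= t]
    return candidates[0] if len(candidates) == 1 else None

# The length-5 repetition code has d = 5, so t = 2 errors are correctable.
rep5 = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
received = (1, 0, 1, 0, 1)   # the all-ones codeword with two bits flipped
```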
Let us say that Alice and Bob have come up with a way to identify the integers 1, ..., |C| with the elements of C. Now whenever Alice has a number in her mind that she wants to convey to Bob, all she has to do is send the element of C corresponding to that number. But for doing this, she has to send a string of length n. It is easy to see that integers between 1 and |C| can be represented using roughly log |C| bits, and so if |C| << 2^n, then Alice and Bob are being wasteful: the effective amount of information being sent in the n bits is log |C| << n.

1.3 The main questions

The above discussion raises the following questions:

1. Given n, d, how large can C ⊆ {0,1}^n be such that C has minimum distance d?

2. Given n, d, how can one efficiently construct a code in {0,1}^n with minimum distance d?

3. Given n, d, a code C ⊆ {0,1}^n with minimum distance d, and an x ∈ {0,1}^n with the property that there is a (unique) c ∈ C such that x ∈ B(c, ⌊(d − 1)/2⌋), how can one efficiently decode x, i.e. how does one efficiently find c?

For this lecture, we will mostly focus our attention on the first question.

2 Basic results

We begin by giving some basic existence and impossibility results about the number of codewords in a code with minimum distance d. We will always be interested in the asymptotics as n grows. Particularly interesting choices of d are (1) constant distance d = O(1), and (2) constant relative distance d = δn.

2.1 The volume bound

Set r = ⌊(d − 1)/2⌋. It follows from Fact 7 that

|∪_{c ∈ C} B(c, r)| = Σ_{c ∈ C} |B(c, r)| = |C| · |B(r)|.

Also, ∪_{c ∈ C} B(c, r) ⊆ {0,1}^n, and so we have |C| · |B(r)| ≤ 2^n, which gives us

|C| ≤ 2^n / Σ_{i=0}^{⌊(d−1)/2⌋} (n choose i).

This bound is known as the volume bound or the Hamming bound.

2.2 The greedy construction

Consider the following greedy process: Set S = {0,1}^n and let C = ∅. As long as S ≠ ∅, pick a point v ∈ S, add v to C, and remove B(v, d − 1) from S, i.e. S = S \ B(v, d − 1).

Notice that we are removing at most Σ_{i=0}^{d−1} (n choose i) points from S in every iteration. Since we add one element to C in each iteration, we have that

|C| ≥ 2^n / Σ_{i=0}^{d−1} (n choose i).

Using the fact that d = O(1), we get

|C| = Ω(2^n / n^{d−1}).

Note that this bound does not match the volume bound that we proved. It is conceivable that a modification of this approach might give better bounds, though this has eluded everyone to date.

2.3 A probabilistic construction

It turns out that randomly chosen codes also achieve roughly the same parameters as the greedy construction. However, some care is needed in the random choice, as shown by the following failed attempt. See the homework and the next lecture for more on this.

Naive probabilistic approach: A naive probabilistic approach is to pick K vectors x_1, ..., x_K in {0,1}^n uniformly and independently, and set C = {x_1, ..., x_K}.

Fact 8 (Birthday Paradox). If we pick Θ(√N) numbers between 1 and N uniformly and independently, then, with probability 0.9, at least two of them will be the same.

Thus, if K = Θ(2^{n/2}), with probability 0.99, it would be the case that x_i = x_j for some i ≠ j. In fact, it is not hard to see that, even if we do not worry about duplications, with high probability two of the strings will end up being in the same ball of radius (d − 1)/2, which would thwart our attempt.

A probabilistic approach with alterations: This approach involves picking strings uniformly and independently as above, followed by making some modifications to the set in order to get the distance property. [See the first homework set]

3 Constant distance codes

Consider the case when n is a growing parameter, and d is a fixed constant, i.e. d = O(1). In this case, the volume packing bound says that:

|C| ≤ O(2^n / n^{(d−1)/2}).

The greedy code construction produces a code C with

|C| ≥ Ω(2^n / n^{d−1}).

These bounds are roughly in the same ballpark, but they are still significantly off. In a later lecture we will see that in fact the volume packing bound is tight: this is by a remarkable algebraic family of codes known as BCH codes.
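The greedy process of Section 2.2 is easy to run directly. A sketch (only feasible for small n, since it enumerates all of {0,1}^n):

```python
from itertools import product, combinations

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def greedy_code(n, d):
    """Greedy construction: repeatedly pick a surviving point and delete
    the ball of radius d-1 around it, so all pairs end up at distance >= d."""
    S = set(product((0, 1), repeat=n))
    C = []
    while S:
        v = min(S)   # any surviving point works; min() makes the run deterministic
        C.append(v)
        S = {x for x in S if hamming_distance(x, v) > d - 1}
    return C

C = greedy_code(5, 3)
```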

4 The d = 3 case: the Hamming code

For d = 3, the volume bound becomes

|C| ≤ O(2^n / n).

We will show a construction that gives a code with |C| = Θ(2^n / n), thus resolving our question for d = 3. This code is known as the Hamming code, and is due to Richard Hamming, who also showed the volume bound.

We identify {0,1} with the field F_2, and think of the code as a subset of F_2^n. Let l ∈ N be a parameter, and let H be the l × (2^l − 1) matrix whose columns consist of all non-zero vectors in F_2^l. We set n = 2^l − 1 and define our code as

C = {c ∈ F_2^n : Hc = 0}.

The reader can check that C is a subspace of F_2^n whose dimension is n − l.¹ Thus, we have

|C| = 2^{n−l} = 2^n / (n + 1) = Θ(2^n / n).

All that remains to be shown is that the minimum distance of C is 3. We will do this by showing that the minimum distance of C cannot be 1 or 2. Before we show this, we will take a slight detour to talk about some properties of codes that are subspaces of F_2^n. The following is an easy fact:

Fact 9. Let u, v, w ∈ F_2^n. Then Δ(u, v) = Δ(u − w, v − w).

If a code is a subspace of F_2^n, it is called a linear code. We will study more properties of linear codes in the next lecture, but here is a fact that we will need for now:

Fact 10. If C is a linear code with minimum distance d, then min_{c ∈ C, c ≠ 0} wt(c) = d, where wt(·) denotes the Hamming weight function.

Proof. We will first show that min_{c ∈ C, c ≠ 0} wt(c) ≤ d. Since C has minimum distance d, there are codewords c_1, c_2 ∈ C such that Δ(c_1, c_2) = d. Using Fact 9, Δ(c_1 − c_2, 0) = d, which is the same as wt(c_1 − c_2) = d. Since C is a linear code, c_1 − c_2 ∈ C, and we have min_{c ∈ C, c ≠ 0} wt(c) ≤ d.

We will now show the other direction. Suppose there is a nonzero c ∈ C such that wt(c) < d. Since C is a linear code and hence a subspace, we have that 0 ∈ C. Then Δ(c, 0) < d, which would be a contradiction since the minimum distance of C is d.

We now get back to showing that the minimum distance of the Hamming code is 3. By Fact 10, this is the same as showing that there are no nonzero codewords of Hamming weight less than 3.

Case (i): there is a codeword c such that wt(c) = 1. From the definition of the Hamming code, we have Hc = 0.
But this is not possible, since Hc is a column of H, and all the columns of H are non-zero.

¹ This follows from the rank-nullity theorem over F_2. The reader is urged to try and prove the theorem over F_2 if he or she has not done so before.

Case (ii): there is a codeword c such that wt(c) = 2. Again, Hc = 0, and Hc = v + w, where v, w are distinct columns of H. This means v + w = 0, i.e. v = w, which is a contradiction.

This shows that the minimum distance of the Hamming code is at least 3. Codes that match the volume bound are called perfect codes, and the Hamming code is such a code for the d = 3 case.

We now try to answer the third question in Section 1.3 for the Hamming code: given x ∈ {0,1}^n such that Δ(x, c) ≤ 1 for some c ∈ C, how do we efficiently find c? Typically, "efficiently" here means that the running time of our recovery algorithm should be polynomial in n, the input size. In the case of the Hamming code, polynomial time is trivial, since one can try all 1-bit modifications of x and check whether the resulting string is in the code. This gives a quadratic time decoding algorithm. Below we give a near-linear time decoding algorithm.

1. Given x, compute Hx. This involves multiplying a Θ(log n) × n matrix with an n × 1 vector, which can be done in time Θ(n log n).

2. If Hx = 0, then x ∈ C. Otherwise x = c + e_i for some c ∈ C, where e_i is the vector in F_2^n which is 0 everywhere except in the i-th coordinate. Then Hx = H(c + e_i) = Hc + He_i = He_i.

3. Note that He_i is the i-th column of H. We now want to compute i given He_i. This can be done by comparing He_i with all the columns of H, and takes time O(n log n).

4. Once i is determined, we can compute c as x + e_i.

Thus, we have an algorithm to recover c that takes time Θ(n log n). This can be made Θ(n) by divide and conquer: exercise!

5 More on linear codes

Recall that a code C ⊆ F_2^n is said to be linear if it is a subspace of F_2^n. The definition of linear codes allows us to succinctly represent the code by selecting a basis v_1, v_2, ..., v_k of C, where k ≤ n. It also gives a bijection between {0,1}^k and C as follows: for σ ∈ {0,1}^k, the corresponding codeword is c(σ) = Σ_{i=1}^k σ_i v_i ∈ F_2^n. To this end, let G be the matrix with k rows and n columns whose i-th row is v_i^T:

G = [ v_1^T ; v_2^T ; ... ; v_k^T ]

The matrix G is called the Generator Matrix for C.
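To make the construction and the decoding steps concrete, here is a sketch of the l = 3 (so n = 7) Hamming code and the syndrome decoder from the four steps above (variable names ours):

```python
l = 3
n = 2 ** l - 1                      # n = 7
# Columns of H: all non-zero vectors of F_2^l, i.e. binary expansions of 1..n.
columns = [[(j >> b) & 1 for b in range(l)] for j in range(1, n + 1)]

def syndrome(x):
    """Hx over F_2: the mod-2 sum of the columns of H selected by the bits of x."""
    s = [0] * l
    for j, bit in enumerate(x):
        if bit:
            s = [(a + b) % 2 for a, b in zip(s, columns[j])]
    return s

def decode(x):
    """Syndrome decoding, following steps 1-4 above."""
    s = syndrome(x)
    if not any(s):
        return list(x)              # step 2: Hx = 0, so x is already a codeword
    i = columns.index(s)            # step 3: match the syndrome to a column of H
    c = list(x)
    c[i] ^= 1                       # step 4: undo the flip at coordinate i
    return c

codeword = [0] * n                  # the all-zero word satisfies Hc = 0
received = list(codeword)
received[4] ^= 1                    # one bit flipped in transit
```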
The map E : F_2^k → F_2^n given by E(m) = mG is called the Encoding Map. Here, one normally thinks of m as a message (given by a binary string) and E(m) as the encoding of m.

F_2^k --Encoding--> F_2^n --Corruption--> F_2^n --Decoding--> F_2^k
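The encoding map is just a vector-matrix product over F_2. A sketch (the generator matrix below, for the [3,2] single-parity-check code, is our illustrative example, not one from the lecture):

```python
def encode(m, G):
    """E(m) = mG over F_2: the mod-2 combination of the rows of G
    selected by the message bits."""
    c = [0] * len(G[0])
    for bit, row in zip(m, G):
        if bit:
            c = [(a + b) % 2 for a, b in zip(c, row)]
    return c

# Generator matrix of the [3,2] parity-check code: k = 2 message bits,
# the third coordinate is their parity.
G_parity = [[1, 0, 1],
            [0, 1, 1]]
```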

Since linear codes are nothing but vector spaces, we define the Dual Code C⊥ of a linear code C ⊆ F_2^n, analogous to dual spaces, as follows:

C⊥ := {y ∈ F_2^n : Σ_{i=1}^n x_i y_i = 0 for all x ∈ C}

Observation 11. (C⊥)⊥ = C.

Observation 12. From the rank-nullity theorem, we have dim(C) + dim(C⊥) = n.

A Parity Check Matrix for a linear code C ⊆ {0,1}^n is an (n − k) × n matrix H with rows w_1^T, w_2^T, ..., w_{n−k}^T, where w_1, w_2, ..., w_{n−k} are basis vectors for C⊥:

H = [ w_1^T ; w_2^T ; ... ; w_{n−k}^T ]

Observation 13. c ∈ C iff Hc = 0.

Observation 14. GH^T = 0.

The above observation is the main property of the Parity Check Matrix.

Proposition 15. The following are equivalent:

1. C has minimum distance at least d.
2. Every nonzero element of C has at least d nonzero entries.
3. Every d − 1 columns of H are linearly independent.

Proof. (1 ⇒ 2): Since 0 ∈ C, we have that for every x ∈ C with x ≠ 0, Δ(x, 0) ≥ d. Therefore, x has at least d nonzero entries.

(2 ⇒ 3): Since Hx = 0 implies x ∈ C, if the sum of some t < d columns of H were equal to 0, then we would have a codeword with t < d nonzero entries, which is a contradiction.

(3 ⇒ 1): Suppose there existed x, y ∈ C, x ≠ y, such that Δ(x, y) < d. We have Hx = Hy = 0, which gives us H(x + y) = 0. But since x + y has fewer than d nonzero entries, the sum of fewer than d columns of H equals 0, which is a contradiction.

5.1 Larger Alphabets

So far, we have thought of all of our messages and their encodings as binary strings. However, in general, we need not be restricted by this condition. Let Σ be any finite alphabet whose symbols make up the content of the messages. We can extend the definition of Hamming distance between two strings x and y over this alphabet to be the number of positions in which they differ. We can also get a similar volume packing bound as before. Let C be a code over the alphabet Σ that can correct at most r errors, and let B_Σ(r) denote the number of strings in Σ^n that differ in at most r positions from a given string in Σ^n. Then we have:

|C| ≤ |Σ|^n / B_Σ(r).

We can also pick words greedily to get a lower bound on the number of codewords in a code with minimum distance d:

|C| ≥ |Σ|^n / B_Σ(d − 1).

However, to talk about linear codes, we can only use certain alphabet sizes: in particular, if |Σ| is a prime or a prime power, we can identify Σ with a field F_p or F_{p^l}, and talk about linear codes as subspaces of F_p^n or F_{p^l}^n. We can define a Generator Matrix as in the binary case as a k × n matrix G whose rows are a basis of C. We also define a dual code C⊥ as before, and a Parity Check Matrix as an (n − k) × n matrix whose rows are a basis of C⊥.

A few remarks:

1. Even if you only care about binary codes, the study of codes over larger alphabets is very important, since some of the best known constructions of binary codes go through codes over larger alphabets.

2. Imposing the condition of linearity on codes does not seem to come with any disadvantages: we do not know of any considerable differences between general codes and linear codes.

6 Erasure Correction

In the regime of Erasure Correction, a codeword is given by a string x ∈ {0,1}^n. However, instead of being flipped, some bits are replaced with the symbol '?'. These can be thought of as erasures, i.e., the bits in those locations have been erased. Our goal is to design a code that is resilient to at most t erasures. In other words, if C is a t-erasure code, then for every c ∈ C and every set of t coordinates, if we erase these t coordinates, then c should not match any other string in C in the remaining n − t coordinates.

Observation 16. A code C is t-erasure iff it has minimum distance at least t + 1.

Observation 17. The Hamming code is 2-erasure.
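Erasure decoding is conceptually simpler than error correction: a codeword is consistent with the received string iff it agrees on all non-erased positions. A brute-force sketch (function name ours):

```python
def erasure_decode(x, code):
    """Return the codewords consistent with x on its non-erased positions.

    By Observation 16, if the code has minimum distance >= t + 1 and x has
    at most t erasures ('?'), exactly one codeword survives."""
    return [c for c in code
            if all(xi == '?' or int(xi) == ci for xi, ci in zip(x, c))]

# The length-3 repetition code has minimum distance 3, so it is 2-erasure.
rep3 = [(0, 0, 0), (1, 1, 1)]
```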

7 Constant relative distance

Now we study the regime where the distance of the code is δn. The rate of a binary code C ⊆ {0,1}^n, given by log_2 |C| / n, is a measure of how much information each coordinate contains (there are log_2 |C| bits of information needed to specify a codeword in C, and this information is then written in the form of n bits). For a code over a general alphabet Σ, the rate is given by log_{|Σ|} |C| / n. The Relative Minimum Distance (or simply relative distance) of a code C ⊆ {0,1}^n with minimum distance d is given by d/n.

Before we study the relation between the rate and minimum distance of a code, we establish the following useful estimate for the volume of the Hamming ball. For any positive constant ρ < 1/2, we have:

|B_n(ρn)| = Θ(2^{H(ρ)n} / √n)

where H(ρ) is the binary Shannon Entropy, given by H(ρ) = ρ log_2(1/ρ) + (1 − ρ) log_2(1/(1 − ρ)). This intimate relation to the volumes of Hamming balls is the reason that the function H shows up so much in the study of codes.

The following plot shows the relation between ρ and H(ρ). It is symmetric about ρ = 1/2, where it takes its maximum value, H(1/2) = 1, and is 0 at ρ = 0 and ρ = 1.

[Figure: plot of H(ρ) versus ρ]
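The entropy estimate can be checked numerically: for a constant ρ, the normalized log-volume log_2 |B_n(ρn)| / n approaches H(ρ) as n grows. A sketch:

```python
import math

def H2(p):
    """Binary Shannon entropy: H(p) = p log2(1/p) + (1 - p) log2(1/(1 - p))."""
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

def ball_volume(n, r):
    """|B_n(r)|: the number of strings within Hamming distance r of a fixed center."""
    return sum(math.comb(n, i) for i in range(r + 1))

# The normalized log-volume of B_n(rho * n) is close to H(rho) already at n = 200.
n, rho = 200, 0.3
approx = math.log2(ball_volume(n, int(rho * n))) / n
```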

7.1 Upper Bound

From the Volume Packing Bound, we have

|C| ≤ 2^n / |B_n(δn/2)| = O(√n · 2^{(1 − H(δ/2))n}).

Taking logarithms on both sides and dividing by n, we have

R ≤ 1 − H(δ/2) + o(1).

7.2 Lower Bound

Using the greedy approach as before, we get

|C| ≥ 2^n / |B_n(δn − 1)| = Ω(2^{(1 − H(δ))n}).

Taking logarithms on both sides and dividing by n,

R ≥ 1 − H(δ) + o(1).

7.3 The story so far

So far, we know that for a positive constant δ, there cannot exist codes of rate more than 1 − H(δ/2) + o(1) (Volume Packing Bound), and that there exist codes with rate at least 1 − H(δ) + o(1). These are illustrated in the following plot.

[Figure: plot of rate R versus relative distance δ, showing the upper bound 1 − H(δ/2) and the lower bound 1 − H(δ)]
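The two bounds above can be tabulated to see the gap between them for every δ in (0, 1/2). (The lower bound is commonly called the Gilbert-Varshamov bound, a name the notes do not use.) A sketch:

```python
import math

def H2(p):
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

def volume_bound_rate(delta):
    """Upper bound from Section 7.1: R <= 1 - H(delta/2) + o(1)."""
    return 1 - H2(delta / 2)

def greedy_bound_rate(delta):
    """Lower bound from Section 7.2: R >= 1 - H(delta) + o(1)."""
    return max(0.0, 1 - H2(delta))
```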

The upper bound is not tight for every δ. For instance, we know that there do not exist codes with nonzero rate with relative distance 3/4 (Exercise!). In the next lecture, we will see that in fact there are no codes with nonzero rate with relative distance 1/2, and we will give a somewhat better overall upper bound.