Examples of Noisy Channels

- an analogue telephone line over which two modems communicate digital information
- a teacher mumbling at the board
- the radio communication link between the Curiosity rover on Mars and Earth
- reproducing cells, where daughter cells contain DNA from the parent cell
- a disk drive
Example of Noisy Channel

(A passage from Lewis Carroll's Alice's Adventures in Wonderland as received over a noisy channel; corrupted characters appear as O.)

THE COIEF DIFFIOULTY ALOCE FOUOD OT FIRST WAS IN OAOAGING HER FLAOINGO: SHE SUCCEODEO ON GO OTIOG IOS BODY OUOKEO AOAO, COMFOROABLY EOOOGO, UNDER OER O OM, WITO OTS O O OS HANGIOG DOO O, BOT OENEOAO OY, OUST AS SO O HOD OOT OTS O OCK NOCEO O SOROIGHTEOEO O OT, ANO WOS O O ONG TO OIOE TO O HEDGEHOG O OLOW WOTH ITS O OAD, O O WOULO TWOST O OSEOF OOUO O ANO O O OK OP IN HOR OACO, O OTO OUO O A O O OZOED EO OREOSOOO O O O O SHO COUOD O O O O O O O O O OSO O OG O O O OAO OHO O O: AOD WHON O O O OAO OOO O O O O O O O DOO O, O OD O OS GOIOG O O BO O ON O O OIO, O O O OS O O OY O OOOOO O O O O O O O O O O O OT TO O OEOGO O O O O OD O OROLO O O O O O O OF, O O O O O O O O OHO O O O O O O O O O O O O O O O O O O
Discrete Channels

x ∈ X → noisy channel P_{Y|X} → y ∈ Y

Def: A discrete channel is denoted by (X, P_{Y|X}, Y), where X is a finite input set, Y is a finite output set, and P_{Y|X} is a conditional probability distribution, i.e.

∀x ∈ X ∀y ∈ Y: P_{Y|X}(y|x) ≥ 0
∀x ∈ X: Σ_{y∈Y} P_{Y|X}(y|x) = 1

P_{Y|X}(y|x) is the probability that the channel outputs y when given x as input.
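As a concrete instance of this definition, the binary symmetric channel with flip probability f can be written down as an explicit conditional distribution. A minimal sketch (the dictionary representation is just one possible encoding, not taken from the text):

```python
# A discrete channel (X, P_{Y|X}, Y) as explicit data: here the binary
# symmetric channel with flip probability f = 0.1.
f = 0.1
X = {0, 1}
Y = {0, 1}
# P[x][y] = probability that the channel outputs y when given x as input
P = {
    0: {0: 1 - f, 1: f},
    1: {0: f, 1: 1 - f},
}

# Check the two defining conditions: non-negativity and normalisation.
for x in X:
    assert all(P[x][y] >= 0 for y in Y)
    assert abs(sum(P[x][y] for y in Y) - 1.0) < 1e-12
```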
[Figure: the binary symmetric channel. Input 0 → output 0 with probability 1 − f, output 1 with probability f; input 1 → output 1 with probability 1 − f, output 0 with probability f.]

Figure 1.5. A binary data sequence of length 10 000 transmitted over a binary symmetric channel with noise level f = 0.1.
Algorithm 1.9. Majority-vote decoding algorithm for R3. Also shown are the likelihood ratios (1.23), assuming the channel is a binary symmetric channel; γ ≡ (1 − f)/f.

Received sequence r | Likelihood ratio P(r|s=1)/P(r|s=0) | Decoded sequence ŝ
000                 | γ^−3                               | 0
001                 | γ^−1                               | 0
010                 | γ^−1                               | 0
100                 | γ^−1                               | 0
101                 | γ^1                                | 1
110                 | γ^1                                | 1
011                 | γ^1                                | 1
111                 | γ^3                                | 1
s    0    0    1    0    1    1    0
t   000  000  111  000  111  111  000
n   000  001  000  000  101  000  000
r   000  001  111  000  010  111  000
ŝ    0    0    1    0    0    1    0
          (corrected     (undetected
           error)         error)
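The worked example above can be reproduced in a few lines. A sketch of R3 encoding and majority-vote decoding (the function names are mine, not from the text):

```python
def encode_r3(source_bits):
    """Repetition code R3: repeat each source bit three times."""
    return [b for b in source_bits for _ in range(3)]

def decode_r3(received_bits):
    """Majority-vote decoding: each block of three votes on one bit."""
    return [int(sum(received_bits[i:i + 3]) >= 2)
            for i in range(0, len(received_bits), 3)]

# The example from the slide:
s = [0, 0, 1, 0, 1, 1, 0]
t = encode_r3(s)
n = [0,0,0, 0,0,1, 0,0,0, 0,0,0, 1,0,1, 0,0,0, 0,0,0]  # noise
r = [ti ^ ni for ti, ni in zip(t, n)]                   # received = t XOR n
s_hat = decode_r3(r)
# The single flip in block 2 is corrected; the two flips in block 5
# produce an undetected error:
print(s_hat)  # [0, 0, 1, 0, 0, 1, 0]
```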
s → encoder → t → channel (f = 10%) → r → decoder → ŝ

Figure 1.11. Transmitting 10 000 source bits over a binary symmetric channel with f = 10% using a repetition code and the majority vote decoding algorithm. The probability of decoded bit error has fallen to about 3%; the rate has fallen to 1/3.
Noisy-Channel Coding

w ∈ [M] → Encoder e: [M] → Xⁿ → x ∈ Xⁿ → noisy channel P_{Y|X} → y ∈ Yⁿ → Decoder d: Yⁿ → [M] → ŵ

Def: An (M, n)-code for the channel (X, P_{Y|X}, Y) consists of
1. a message set [M] = {1, 2, ..., M}
2. an encoding function e: [M] → Xⁿ, with codebook {e(1), e(2), ..., e(M)}
3. a deterministic decoding function assigning a guess to each possible received vector, d: Yⁿ → [M]

The rate of an (M, n)-code is the number of transmitted bits per channel use: R := (log M)/n
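The rates of the codes seen so far follow directly from this definition. A quick check, with the logarithm taken base 2 since we count bits:

```python
from math import log2

def rate(M, n):
    """Rate of an (M, n)-code: transmitted bits per channel use."""
    return log2(M) / n

# Repetition code R3: M = 2 messages (one source bit), blocklength n = 3.
print(rate(2, 3))   # 1/3 ≈ 0.333
# (7, 4) Hamming code: M = 2**4 = 16 messages, blocklength n = 7.
print(rate(16, 7))  # 4/7 ≈ 0.571
```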
Rate and Error

w ∈ [M] → Encoder e: [M] → Xⁿ → x ∈ Xⁿ → noisy channel P_{Y|X} → y ∈ Yⁿ → Decoder d: Yⁿ → [M] → ŵ

The rate of an (M, n)-code is the number of transmitted bits per channel use: R := (log M)/n

Probability of error when sending w ∈ [M]: λ_w := Pr[ŵ = d(Yⁿ) ≠ w | Xⁿ = e(w)]

Maximal probability of error: λ⁽ⁿ⁾ := max_{w ∈ [M]} λ_w
Average probability of error: p_e⁽ⁿ⁾ := (1/M) Σ_{w=1}^{M} λ_w
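These error probabilities can also be estimated by simulation. A rough Monte Carlo sketch for R3 over the BSC with f = 0.1 (for R3 the maximal and average error probabilities coincide, since the channel treats 0 and 1 symmetrically):

```python
import random

random.seed(1)
f, trials = 0.1, 200_000
errors = 0
for _ in range(trials):
    # Send one source bit through R3 over the BSC: majority-vote
    # decoding fails iff at least 2 of the 3 copies are flipped.
    flips = sum(random.random() < f for _ in range(3))
    errors += flips >= 2
print(errors / trials)  # ≈ 3 f^2 (1-f) + f^3 = 0.028
```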
[Figure: p_b versus rate for the repetition codes R3, R5, ..., R61; left panel linear scale, right panel logarithmic scale, with the "more useful codes" direction indicated.]

Figure 1.12. Error probability p_b versus rate for repetition codes over a binary symmetric channel with f = 0.1. The right-hand figure shows p_b on a logarithmic scale. We would like the rate to be large and p_b to be small.
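The curves in figure 1.12 can be recomputed exactly: for the repetition code R_N over a BSC, majority-vote decoding fails precisely when more than half of the N copies are flipped. A small sketch:

```python
from math import comb

def p_b(N, f):
    """Bit-error probability of repetition code R_N over a BSC(f):
    majority-vote decoding fails iff more than N/2 bits are flipped."""
    return sum(comb(N, k) * f**k * (1 - f)**(N - k)
               for k in range((N + 1) // 2, N + 1))

f = 0.1
for N in (3, 5, 61):
    print(N, p_b(N, f))
# p_b falls from 0.028 (R3) towards ~1e-15 (R61),
# but the rate 1/N falls with it.
```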
s    t        s    t        s    t        s    t
0000 0000000  0100 0100110  1000 1000101  1100 1100011
0001 0001011  0101 0101101  1001 1001110  1101 1101000
0010 0010111  0110 0110001  1010 1010010  1110 1110100
0011 0011100  0111 0111010  1011 1011001  1111 1111111

Table 1.14. The sixteen codewords {t} of the (7, 4) Hamming code. Any pair of codewords differ from each other in at least three bits.
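The codeword table can be regenerated and its minimum-distance property checked directly. A sketch, assuming the parity rule t5 = s1⊕s2⊕s3, t6 = s2⊕s3⊕s4, t7 = s1⊕s3⊕s4 (the rule that reproduces the table above):

```python
from itertools import product, combinations

def hamming74(s):
    """Encode 4 source bits into a (7, 4) Hamming codeword:
    the 4 source bits followed by 3 parity bits."""
    s1, s2, s3, s4 = s
    return (s1, s2, s3, s4, s1 ^ s2 ^ s3, s2 ^ s3 ^ s4, s1 ^ s3 ^ s4)

codewords = [hamming74(s) for s in product((0, 1), repeat=4)]
assert len(codewords) == 16

# Any pair of codewords differs in at least three bits:
min_dist = min(sum(a != b for a, b in zip(u, v))
               for u, v in combinations(codewords, 2))
print(min_dist)  # 3
```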
Richard Hamming (1915–1998)

- mathematician, born in Chicago
- 1945: Manhattan Project
- 1946–1976: scientist at Bell Labs, working on computing machines
- 1968: Turing Award (the "Nobel Prize of Computer Science")
- best known for Hamming codes and the Hamming distance
s → encoder → t (source bits + parity bits) → channel (f = 10%) → r → decoder → ŝ

Figure 1.17. Transmitting 10 000 source bits over a binary symmetric channel with f = 10% using a (7, 4) Hamming code. The probability of decoded bit error is about 7%.
[Figure: p_b versus rate for R3, R5, H(7,4), BCH(15,7), BCH(31,16), BCH(511,76) and BCH(1023,101); left panel linear scale, right panel logarithmic scale, with the "more useful codes" direction indicated.]

Figure 1.18. Error probability p_b versus rate R for repetition codes, the (7, 4) Hamming code and BCH codes with blocklengths up to 1023 over a binary symmetric channel with f = 0.1. The right-hand figure shows p_b on a logarithmic scale.

Exercise 1.9. [4, p.19] Design an error-correcting code and a decoding algorithm for it, estimate its probability of error, and add it to figure 1.18. [Don't worry if you find it difficult to make a code better than the Hamming code, or if you find it difficult to find a good decoder for your code; that's the point of this exercise.]
[Figure: the achievable region of (R, p_b) pairs; the solid curve is the Shannon limit, meeting the rate axis at C. Left panel linear scale, right panel logarithmic scale.]

Figure 1.19. Shannon's noisy-channel coding theorem. The solid curve shows the Shannon limit on achievable values of (R, p_b) for the binary symmetric channel with f = 0.1. Rates up to R = C are achievable with arbitrarily small p_b. The points show the performance of some textbook codes, as in figure 1.18. The equation defining the Shannon limit (the solid curve) is R = C/(1 − H2(p_b)), where C and H2 are defined in equation (1.35).
Shannon's channel-coding theorem (informal version):

Every channel has a capacity C, meaning that there exist codes to communicate over this channel with arbitrarily small error at any rate R < C.

[C = 1 − H2(0.1) ≈ 0.53 for the BSC with f = 0.1]
The channel we were discussing earlier with noise level f = 0.1 has capacity C ≈ 0.53. Let us consider what this means in terms of noisy disk drives. The repetition code R3 could communicate over this channel with p_b = 0.03 at a rate R = 1/3. Thus we know how to build a single gigabyte disk drive with p_b = 0.03 from three noisy gigabyte disk drives. We also know how to make a single gigabyte disk drive with p_b ≈ 10^−15 from sixty noisy one-gigabyte drives (exercise 1.3, p.8). And now Shannon passes by, notices us juggling with disk drives and codes and says:

"What performance are you trying to achieve? 10^−15? You don't need sixty disk drives – you can get that performance with just two disk drives (since 1/2 is less than 0.53). And if you want p_b = 10^−18 or 10^−24 or anything, you can get there with two disk drives too!"

[Strictly, the above statements might not be quite right, since, as we shall see, Shannon proved his noisy-channel coding theorem by studying sequences of block codes with ever-increasing blocklengths, and the required blocklength might be bigger than a gigabyte (the size of our disk drive), in which case, Shannon might say "well, you can't do it with those tiny disk drives, but if you had two noisy terabyte drives, you could make a single high-quality terabyte drive from them."]
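The capacity figure used throughout, C ≈ 0.53 for the BSC with f = 0.1, comes from C = 1 − H2(f). A quick check:

```python
from math import log2

def H2(p):
    """Binary entropy function, in bits."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(f):
    """Capacity of the binary symmetric channel with flip probability f."""
    return 1 - H2(f)

print(bsc_capacity(0.1))  # ≈ 0.531, the C ≈ 0.53 quoted in the text
# Since 1/2 < C, rate-1/2 codes (two disk drives per stored gigabyte)
# suffice for arbitrarily small error, as Shannon points out.
```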
Graph Theory 101

Def: A graph G is defined by its vertex set V(G) and edge set E(G).

Example: V(G) = {1, 2, 3, 4, 5, 6}, E(G) = {12, 15, 25, 23, 34, 45, 46}
Here 3 and 4 are adjacent nodes; 6 and 5 are not.
Independence Number

Def: A graph G is defined by its vertex set V(G) and edge set E(G).

Def: An independent set of a graph G is a subset of pairwise non-adjacent vertices.

Def: The independence number α(G) is the maximum cardinality of an independent set.
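For a small graph such as the six-vertex example above, α(G) can be found by brute force over all vertex subsets (exponential in |V|, but fine at this size). A sketch:

```python
from itertools import combinations

V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)}

def is_independent(S):
    """A vertex set is independent iff no two of its vertices are adjacent."""
    return all((u, v) not in E and (v, u) not in E
               for u, v in combinations(S, 2))

def independence_number(V):
    """alpha(G): maximum cardinality of an independent set."""
    return max(len(S)
               for k in range(len(V) + 1)
               for S in combinations(sorted(V), k)
               if is_independent(S))

print(independence_number(V))  # 3, e.g. the independent set {1, 3, 6}
```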