COMP2610/6261 - Information Theory
Lecture 22: Hamming Codes
Mark Reid and Aditya Menon
Research School of Computer Science, The Australian National University
October 15th, 2014
Reminder: Repetition Codes

The repetition code R3:

s:  0    0    1    0    1    1    0
t:  000  000  111  000  111  111  000
η:  000  001  000  000  101  000  000
r:  000  001  111  000  010  111  000

For a BSC with bit flip probability f = 0.1, R3 drives the error rate down to about 3%.
For general f, the error probability is f^2(3 - 2f).
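As a quick check, the closed-form error probability above can be compared against a brute-force enumeration of all 8 noise patterns. This is an illustrative Python sketch, not course code; the function names are made up here.

```python
import itertools

def r3_error_closed_form(f):
    # Majority vote fails when 2 or 3 of the 3 copies are flipped:
    # 3 f^2 (1-f) + f^3 = f^2 (3 - 2f)
    return f**2 * (3 - 2*f)

def r3_error_enumeration(f):
    # Sum the probability of every noise pattern that defeats majority vote.
    total = 0.0
    for noise in itertools.product([0, 1], repeat=3):
        w = sum(noise)
        if w >= 2:  # majority of the 3 copies flipped -> decoding error
            total += f**w * (1 - f)**(3 - w)
    return total

print(r3_error_closed_form(0.1))  # 0.028, i.e. the ~3% quoted above
```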
This time:
Introduction to block codes (extension to basic repetition codes)
The (7,4) Hamming code
Redundancy in (linear) block codes through parity check bits
Syndrome decoding
Outline:
1. Motivation
2. The (7,4) Hamming code: Coding; Decoding; Syndrome Decoding; Error Probabilities
3. Wrapping up
Motivation

Goal: communication with small probability of error and high rate.
Repetition codes introduce redundancy on a per-bit basis.
Can we improve on repetition codes?
Idea: introduce redundancy to blocks of data instead.

Block Code: a block code is a rule for encoding a length-K sequence of source bits s into a length-N sequence of transmitted bits t.
Introduce redundancy: N > K.
We focus on linear codes.
We will introduce a simple type of block code called the (7,4) Hamming code.
An Example: The (7,4) Hamming Code

Consider K = 4 and a source message s = 1 0 0 0.
The repetition code R2 produces t = 1 1 0 0 0 0 0 0.
The (7,4) Hamming code produces t = 1 0 0 0 1 0 1.
Redundancy, but not repetition. How are these magic bits computed?
The (7,4) Hamming code: Coding

Consider K = 4, N = 7 and s = 1 0 0 0.
It will help to think of the code in terms of overlapping circles.
The (7,4) Hamming code: Coding

Copy the source bits into the first 4 target bits.
The (7,4) Hamming code: Coding

Set the parity-check bits so that the number of ones within each circle is even.
So we have s = 1 0 0 0 → encoder → t = 1 0 0 0 1 0 1.
The (7,4) Hamming code: Coding

It is clear that we have set:
t_i = s_i for i = 1, ..., 4
t_5 = s_1 ⊕ s_2 ⊕ s_3
t_6 = s_2 ⊕ s_3 ⊕ s_4
t_7 = s_1 ⊕ s_3 ⊕ s_4
where ⊕ denotes modulo-2 addition.
The (7,4) Hamming code: Coding

In matrix form: t = G^T s, where s = [s_1 s_2 s_3 s_4]^T and

G^T = [I_4; P] =
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
1 1 1 0
0 1 1 1
1 0 1 1

G is called the generator matrix of the code. The Hamming code is linear!
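The matrix encoding t = G^T s (mod 2) can be sketched directly in Python. This is a minimal illustration of the G^T above; `hamming_encode` is a made-up helper name, not from the course.

```python
# G^T = [I4; P] exactly as on the slide; each row yields one transmitted bit.
GT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 0],  # t5 = s1 + s2 + s3 (mod 2)
    [0, 1, 1, 1],  # t6 = s2 + s3 + s4 (mod 2)
    [1, 0, 1, 1],  # t7 = s1 + s3 + s4 (mod 2)
]

def hamming_encode(s):
    """Compute t = G^T s over GF(2) for a 4-bit source word s."""
    return [sum(g * b for g, b in zip(row, s)) % 2 for row in GT]

print(hamming_encode([1, 0, 0, 0]))  # [1, 0, 0, 0, 1, 0, 1]
```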
Codewords

Each (unique) sequence that can be transmitted is called a codeword. Codeword examples:

s      Codeword t
0010   0010111
0110   0110001
1010   1010010
1110   ?

For the (7,4) Hamming code we have a total of 16 codewords.
There are 2^7 - 2^4 other bit strings that immediately imply corruption.
Any two codewords differ in at least three bits, since each original bit belongs to at least two circles.
The (7,4) Hamming code: Coding

Write G^T = [G_1 G_2 G_3 G_4], where each G_i is a 7-dimensional bit vector.
Then the transmitted message is t = G^T s = [G_1 G_2 G_3 G_4] s = s_1 G_1 + ... + s_4 G_4.
All codewords can be obtained as linear combinations of the rows of G:

Codewords = { Σ_{i=1}^{4} α_i G_i },  α_i ∈ {0, 1},

where G_i is the ith row of G.
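The codeword claims — 16 distinct codewords, any two differing in at least three bits — can be verified by enumerating all linear combinations. An illustrative sketch (the `encode` helper is made up here, mirroring the G^T on the earlier slide):

```python
import itertools

GT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
      [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]

def encode(s):
    return tuple(sum(g * b for g, b in zip(row, s)) % 2 for row in GT)

# One codeword per 4-bit source word.
codewords = [encode(s) for s in itertools.product([0, 1], repeat=4)]
assert len(set(codewords)) == 16  # all 16 are distinct

# Minimum pairwise Hamming distance over all codeword pairs.
min_dist = min(sum(a != b for a, b in zip(u, v))
               for u, v in itertools.combinations(codewords, 2))
print(min_dist)             # 3: any two codewords differ in >= 3 bits
print(encode([1, 1, 1, 0]))  # (1, 1, 1, 0, 1, 0, 0) -- the '1110 -> ?' row
```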
Decoding

We can encode a length-4 sequence s into a length-7 sequence t using 3 parity-check bits.
t can be corrupted by noise, which can flip any of the 7 bits (including the parity-check bits):

s: 1 0 0 0
t: 1 0 0 0 1 0 1
η: 0 1 0 0 0 0 0
r: 1 1 0 0 1 0 1

How should we decode r? We could do this exhaustively using the 16 codewords.
Assuming a BSC and uniform p(s): get the most probable explanation, i.e. find s such that ‖t(s) - r‖_1 is minimum.
We can get the most probable source vector in a more efficient way.
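The exhaustive baseline just described is short enough to write out: score all 16 codewords by Hamming distance to r and pick the closest. A sketch under the stated assumptions (BSC with f < 1/2, uniform p(s)); helper names are illustrative.

```python
import itertools

GT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
      [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]

def encode(s):
    return [sum(g * b for g, b in zip(row, s)) % 2 for row in GT]

def decode_exhaustive(r):
    """Minimum-Hamming-distance decoding over all 16 codewords."""
    best = min(itertools.product([0, 1], repeat=4),
               key=lambda s: sum(a != b for a, b in zip(encode(s), r)))
    return list(best)

print(decode_exhaustive([1, 1, 0, 0, 1, 0, 1]))  # [1, 0, 0, 0]
```

Since the minimum distance between codewords is 3, a single flipped bit always leaves a unique nearest codeword.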
Decoding Example 1
We have s = 1 0 0 0 → encoder → t = 1 0 0 0 1 0 1 → noise → r = 1 1 0 0 1 0 1:
(1) Detect circles with wrong (odd) parity. What bit is responsible for this?
(2) Detect the culprit bit and flip it. The decoded sequence is ŝ = 1 0 0 0.

Decoding Example 2
We have s = 1 0 0 0 → encoder → t = 1 0 0 0 1 0 1 → noise → r = 1 0 0 0 0 0 1:
(1) Detect circles with wrong (odd) parity. What bit is responsible for this?
(2) Detect the culprit bit and flip it. The decoded sequence is ŝ = 1 0 0 0.

Decoding Example 3
We have s = 1 0 0 0 → encoder → t = 1 0 0 0 1 0 1 → noise → r = 1 0 1 0 1 0 1:
(1) Detect circles with wrong (odd) parity. What bit is responsible for this?
(2) Detect the culprit bit and flip it. The decoded sequence is ŝ = 1 0 0 0.
Optimal Decoding Algorithm: Syndrome Decoding

Given r = r_1, ..., r_7, assume a BSC with small noise level f:
1. Define the syndrome as the length-3 vector z that describes the pattern of violations of the parity bits r_5, r_6, r_7: z_i = 1 when the ith parity bit does not match the parity of r. Flipping a single bit leads to a different syndrome.
2. Check the parity bits r_5, r_6, r_7 and identify the syndrome.
3. Unflip the single bit responsible for this pattern of violation. (This syndrome could also have been caused by other noise patterns.)

z:        000   001  010  011  100  101  110  111
Flip bit: none  r7   r6   r4   r5   r1   r2   r3

The optimal decoding algorithm unflips at most one bit.
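The three steps above can be sketched with the syndrome table as a lookup. Illustrative Python, not course code; note the list indices are 0-based while the slides label bits r_1..r_7.

```python
# Syndrome bits: z1 = r1+r2+r3+r5, z2 = r2+r3+r4+r6, z3 = r1+r3+r4+r7 (mod 2).
FLIP = {  # syndrome (z1, z2, z3) -> 0-based index of the bit to unflip
    (0, 0, 0): None,
    (0, 0, 1): 6,  # r7
    (0, 1, 0): 5,  # r6
    (0, 1, 1): 3,  # r4
    (1, 0, 0): 4,  # r5
    (1, 0, 1): 0,  # r1
    (1, 1, 0): 1,  # r2
    (1, 1, 1): 2,  # r3
}

def syndrome_decode(r):
    z = ((r[0] + r[1] + r[2] + r[4]) % 2,
         (r[1] + r[2] + r[3] + r[5]) % 2,
         (r[0] + r[2] + r[3] + r[6]) % 2)
    r = list(r)
    if FLIP[z] is not None:
        r[FLIP[z]] ^= 1  # unflip the single most probable culprit bit
    return r[:4]         # the first four bits are the decoded source word

print(syndrome_decode([1, 1, 0, 0, 1, 0, 1]))  # Example 1: [1, 0, 0, 0]
```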
Optimal Decoding Algorithm: Syndrome Decoding

When the noise level f on the BSC is small, it is reasonable to expect at most a single bit flip in a block of 7 transmitted bits.
The syndrome decoding method exactly recovers the source message in this case (cf. noise flipping one bit in the repetition code R3).
But what happens if the noise flips more than one bit?
Decoding Example 4: Flipping 2 Bits
We have s = 1 0 0 0 → encoder → t = 1 0 0 0 1 0 1 → noise → r = 1 0 1 0 1 0 0:
(1) Detect circles with wrong (odd) parity. What bit is responsible for this?
(2) Detect the culprit bit and flip it. The decoded sequence is ŝ = 1 1 1 0.
We have made 3 errors, but only 2 involve the actual message.
Syndrome Decoding: Matrix Form

Recall that we just need to compare the expected parity bits with the ones we actually received:
z_1 = r_1 ⊕ r_2 ⊕ r_3 ⊕ r_5
z_2 = r_2 ⊕ r_3 ⊕ r_4 ⊕ r_6
z_3 = r_1 ⊕ r_3 ⊕ r_4 ⊕ r_7

In modulo-2 arithmetic -1 ≡ 1, so we can replace subtraction with ⊕, giving:

z = Hr, with H = [P I_3] =
1 1 1 0 1 0 0
0 1 1 1 0 1 0
1 0 1 1 0 0 1

What is the syndrome for a codeword?
Syndrome Decoding: Matrix Form

Recall that we obtain a codeword as t = G^T s.
Assume we receive r = t + η, where η = 0.
The syndrome is z = Hr = Ht = HG^T s = 0.
This is because HG^T = P + P = 0.
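The identity HG^T = 0 (mod 2) can be checked numerically with the H and G^T from these slides; a minimal sketch:

```python
H = [[1, 1, 1, 0, 1, 0, 0],
     [0, 1, 1, 1, 0, 1, 0],
     [1, 0, 1, 1, 0, 0, 1]]

GT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
      [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]

# Ordinary matrix product H * G^T, reduced mod 2 entry by entry.
HGT = [[sum(H[i][k] * GT[k][j] for k in range(7)) % 2 for j in range(4)]
       for i in range(3)]
print(HGT)  # [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

Every column of G^T is a codeword, so each has zero syndrome; by linearity, so does every codeword.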
Syndrome Decoding: Matrix Form

For the noisy case we have:
r = G^T s + η
z = Hr = HG^T s + Hη = Hη.

Therefore, syndrome decoding boils down to finding the most probable η satisfying Hη = z: a maximum-likelihood decoder.
Error Probabilities

Decoding error: occurs if at least one of the decoded bits ŝ_i does not match the corresponding source bit s_i, for i = 1, ..., 4.
p(Block Error): p_B = p(ŝ ≠ s)
p(Bit Error): p_b = (1/K) Σ_{k=1}^{K} p(ŝ_k ≠ s_k)
Rate: R = K/N = 4/7
What is the probability of block error for the (7,4) Hamming code with f = 0.1?
Leading-Term Error Probabilities

Block error: this occurs when 2 or more bits in the block of 7 are flipped.
We can approximate p_B by its leading term:

p_B = Σ_{m=2}^{7} C(7, m) f^m (1 - f)^{7-m} ≈ C(7, 2) f^2 = 21 f^2.
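The leading-term approximation can be compared with the exact sum; an illustrative sketch (function names made up here):

```python
from math import comb

def p_block_exact(f):
    # Decoder fails exactly when the noise flips 2 or more of the 7 bits.
    return sum(comb(7, m) * f**m * (1 - f)**(7 - m) for m in range(2, 8))

def p_block_leading(f):
    return comb(7, 2) * f**2  # leading term: 21 f^2

print(round(p_block_exact(0.1), 4))     # 0.1497 (exact)
print(round(p_block_leading(0.1), 2))   # 0.21   (leading-term approximation)
```

At f = 0.1 the leading term overshoots noticeably; it becomes accurate as f shrinks, since higher-order terms fall off as f^3 and beyond.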
Leading-Term Error Probabilities

Bit error: given that a block error occurs, the noise must have corrupted 2 or more bits.
The most probable case is that the noise corrupts exactly 2 bits, which induces 3 errors in the decoded vector:

p(ŝ_i ≠ s_i) ≈ (3/7) p_B for i = 1, ..., 7

since all bits are equally likely to be corrupted (by symmetry). Therefore:

p_b ≈ (3/7) p_B ≈ 9 f^2
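A seeded Monte Carlo run gives a sanity check on p_b at f = 0.1. This is an illustrative sketch that re-implements the encoder and table-based syndrome decoder from these slides; the empirical estimate lands below the leading-term value 9f^2 = 0.09.

```python
import random

GT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
      [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]
FLIP = {(0, 0, 0): None, (0, 0, 1): 6, (0, 1, 0): 5, (0, 1, 1): 3,
        (1, 0, 0): 4, (1, 0, 1): 0, (1, 1, 0): 1, (1, 1, 1): 2}

def encode(s):
    return [sum(g * b for g, b in zip(row, s)) % 2 for row in GT]

def decode(r):
    z = ((r[0] + r[1] + r[2] + r[4]) % 2,
         (r[1] + r[2] + r[3] + r[5]) % 2,
         (r[0] + r[2] + r[3] + r[6]) % 2)
    r = list(r)
    if FLIP[z] is not None:
        r[FLIP[z]] ^= 1
    return r[:4]

def estimate_pb(f, trials, seed=0):
    rng = random.Random(seed)
    bit_errors = 0
    for _ in range(trials):
        s = [rng.randint(0, 1) for _ in range(4)]
        r = [t ^ (rng.random() < f) for t in encode(s)]  # BSC(f) noise
        bit_errors += sum(a != b for a, b in zip(decode(r), s))
    return bit_errors / (4 * trials)

print(estimate_pb(0.1, 20000))  # roughly 0.07, below 9 f^2 = 0.09
```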
What Can Be Achieved with Hamming Codes?

[Figure: achievable (rate, p_b) points for the repetition codes R1, R3, R5, the Hamming code H(7,4), and BCH codes such as BCH(15,7), BCH(31,16), BCH(511,76) and BCH(1023,101), shown on linear and logarithmic p_b scales; more useful codes lie toward high rate and low p_b.]

H(7,4) improves p_b at a moderate rate R = 4/7.
BCH codes are a generalization of Hamming codes.
BCH codes are better than R_N, but still pretty depressing.
Can we do better? What is achievable / non-achievable?
Summary

The (7,4) Hamming code
Redundancy in (linear) block codes through parity check bits
Syndrome decoding via identification of single-bit noise patterns
Block error, bit error, rate

Reading: MacKay 1.2-1.5