Massachusetts Institute of Technology
Department of Electrical Engineering and Computer Science
Department of Mechanical Engineering

6.050J/2.110J Information and Entropy, Spring 2003

Problem Set 6 Solutions

Solution to Problem 1: What Kind of Gate is This?

Solution to Problem 1, part a.

If each of the inputs is equally likely, then we just need to sum the probabilities of the inputs that converge at each output to find the output probabilities. Thus:

    p(b_0) = 0.75
    p(b_1) = 0.25

The input information I_in is given by equation 5.10 in the notes, and is equal to

    I_in = p(a_00) log_2(1/p(a_00)) + p(a_01) log_2(1/p(a_01))
           + p(a_10) log_2(1/p(a_10)) + p(a_11) log_2(1/p(a_11))
         = 4 (0.25) log_2(1/0.25)
         = log_2(4)
         = 2 bits                                                        (6–1)

The output information I_out is given by the same equation:

    I_out = p(b_0) log_2(1/p(b_0)) + p(b_1) log_2(1/p(b_1))
          = (0.75) log_2(1/0.75) + (0.25) log_2(1/0.25)
          = 0.311 + 0.5
          = 0.81 bits                                                    (6–2)

The noise N is calculated via equation 7.4. Since each c_ji is either 1 or 0, every term c_ji log_2(1/c_ji) is zero, either by virtue of the logarithm (when c_ji = 1) or by the factor c_ji = 0. Thus the noise N is zero.

The loss L is calculated via equation 7.5, with N = 0:

    L = I_in - I_out = 2 - 0.81 = 1.19 bits                              (6–3)

The mutual information M is equal to I_out when there is no noise.
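The bookkeeping above is easy to check numerically. Here is a minimal Python sketch (not part of the original solution; the helper name `info` is my own):

```python
import math

def info(probs):
    # Average information in bits: sum of p * log2(1/p), skipping p == 0 terms.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Ideal Greater gate: of the four equally likely inputs, only (1,0) yields output 1.
p_in = [0.25, 0.25, 0.25, 0.25]
p_out = [0.75, 0.25]

I_in = info(p_in)        # 2 bits
I_out = info(p_out)      # about 0.81 bits
N = 0.0                  # deterministic gate: every c_ji is 0 or 1, so no noise
L = I_in - I_out + N     # about 1.19 bits
M = I_out - N            # mutual information equals I_out when N = 0
```
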
Solution to Problem 1, part b.

Figure 6–2 is a box diagram for the defective Greater gate.

[Figure 6–2: A defective Greater gate. Box diagram: inputs 00, 01, 10, and 11 fan out, with transition probabilities c_ji, to outputs 0 and 1.]

and the following is the transition matrix:

    [ c_00  c_01  c_02  c_03 ]   [ 1  0.1  0.8  1 ]
    [ c_10  c_11  c_12  c_13 ] = [ 0  0.9  0.2  0 ]                      (6–4)

Solution to Problem 1, part c.

Given that the result is 1...

i. The probability that the 1 was produced by the input (0 1) is

    c_11 / (c_11 + c_12) = 0.9 / (0.9 + 0.2) = 0.82                      (6–5)

ii. The probability that it was produced by the input (1 0) is

    c_12 / (c_11 + c_12) = 0.2 / (0.9 + 0.2) = 0.18                      (6–6)

(Since the four inputs are equally likely, the equal priors cancel in Bayes' rule, leaving these ratios of transition probabilities.)

iii. The probability that it was produced by the input (1 1) is zero, since c_13 is 0.

Solution to Problem 1, part d.

The input information I_in is the same as before, i.e., 2 bits. To calculate the output information, we first calculate B_0 and B_1, noting that each input is equally likely:

    B_0 = c_00 A_0 + c_01 A_1 + c_02 A_2 + c_03 A_3
        = 1 (0.25) + 0.1 (0.25) + 0.8 (0.25) + 1 (0.25)
        = 0.25 + 0.025 + 0.2 + 0.25
        = 0.725                                                          (6–7)

    B_1 = c_10 A_0 + c_11 A_1 + c_12 A_2 + c_13 A_3
        = 0 (0.25) + 0.9 (0.25) + 0.2 (0.25) + 0 (0.25)
        = 0.225 + 0.05
        = 0.275                                                          (6–8)
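The posterior probabilities of part c and the output probabilities B_0, B_1 of part d follow mechanically from the transition matrix; a short Python check (variable names are mine, not from the notes):

```python
# Transition matrix: rows are outputs 0 and 1, columns are inputs 00, 01, 10, 11.
C = [[1.0, 0.1, 0.8, 1.0],
     [0.0, 0.9, 0.2, 0.0]]
A = [0.25, 0.25, 0.25, 0.25]   # equally likely inputs

# Output probabilities: B_j = sum over i of c_ji * A_i
B = [sum(c * a for c, a in zip(row, A)) for row in C]   # [0.725, 0.275]

# Posterior over the inputs given that the output was 1 (Bayes' rule;
# the uniform priors cancel, leaving ratios of transition probabilities).
posterior = [C[1][i] * A[i] / B[1] for i in range(4)]   # [0, 0.818..., 0.181..., 0]
```
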
Then the output information I_out is given by the same equation as before:

    I_out = p(b_0) log_2(1/p(b_0)) + p(b_1) log_2(1/p(b_1))
          = 0.725 log_2(1/0.725) + 0.275 log_2(1/0.275)
          = 0.336 + 0.512
          = 0.85 bits                                                    (6–9)

Solution to Problem 1, part e.

The process is both noisy and lossy, since there are both fan-outs from the inputs and fan-ins to the outputs. The noise N is defined as:

    N = Σ_i p(a_i) Σ_j c_ji log_2(1/c_ji)
      = p(a_0) [c_00 log_2(1/c_00) + c_10 log_2(1/c_10)]
        + p(a_1) [c_01 log_2(1/c_01) + c_11 log_2(1/c_11)]
        + p(a_2) [c_02 log_2(1/c_02) + c_12 log_2(1/c_12)]
        + p(a_3) [c_03 log_2(1/c_03) + c_13 log_2(1/c_13)]
      = 0.25 [1 log_2(1) + 0] + 0.25 [0.1 log_2(10) + 0.9 log_2(1/0.9)]
        + 0.25 [0.8 log_2(1.25) + 0.2 log_2(5)] + 0.25 [1 log_2(1) + 0]
      = 0 + 0.25 (0.469) + 0.25 (0.722) + 0
      = 0.30 bits                                                        (6–10)

The loss L is defined as:

    L = N + I_in - I_out = 0.30 + 2 - 0.85 = 1.45 bits                   (6–11)

The mutual information M is defined as:

    M = I_in - L = I_out - N = 0.85 - 0.30 = 0.55 bits                   (6–12)
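The noise, loss, and mutual information of the defective gate can be reproduced from the transition matrix alone; a sketch in Python (my own helper names, not from the notes):

```python
import math

def info(probs):
    # sum of p * log2(1/p) over nonzero entries, in bits
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

C = [[1.0, 0.1, 0.8, 1.0],    # rows: outputs 0, 1; columns: inputs 00, 01, 10, 11
     [0.0, 0.9, 0.2, 0.0]]
A = [0.25, 0.25, 0.25, 0.25]

B = [sum(row[i] * A[i] for i in range(4)) for row in C]

# Noise: average over inputs of the spread each input sees across the outputs
N = sum(A[i] * info([C[j][i] for j in range(2)]) for i in range(4))

I_in, I_out = info(A), info(B)
L = N + I_in - I_out   # loss
M = I_out - N          # mutual information
```
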
Solution to Problem 2: Cantabridgian Computer Competency

Solution to Problem 2, part a.

The probability p(M) that any given student who took the exam is from MIT is

    p(M) = 10,000 / (10,000 + 20,000) = 0.33                             (6–13)

Solution to Problem 2, part b.

The uncertainty U is the amount of information you would receive, on average, if you knew the result. This is the average information per event:

    U = p(M) log_2(1/p(M)) + p(H) log_2(1/p(H))
      = 0.33 log_2(1/0.33) + 0.66 log_2(1/0.66)
      = 0.33 log_2(3) + 0.66 log_2(1.5)
      = 0.91 bits                                                        (6–14)

Solution to Problem 2, part c.

First let us calculate the probabilities p(H|C) and p(M|C). We know that half of the Harvard students (10,000 students) and all of the MIT students (10,000 students) are competent.

    p(H|C) = 10,000 / (10,000 + 10,000) = 0.5 = p(M|C)                   (6–15)

The uncertainty in school if you are told that the student was deemed competent is the average information gained when we learn the school:

    I_avgC = p(M|C) log_2(1/p(M|C)) + p(H|C) log_2(1/p(H|C))
           = 0.5 log_2(1/0.5) + 0.5 log_2(1/0.5)
           = 0.5 log_2(2) + 0.5 log_2(2)
           = 1 bit                                                       (6–16)

Solution to Problem 2, part d.

Here again, let us first calculate the probabilities p(H|I) and p(M|I). We know that half of the Harvard students (10,000 students) and none of the MIT students (0 students) are incompetent.
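The prior and conditional probabilities here are a direct Bayes computation; a small Python sketch using the enrollment numbers given in the problem:

```python
import math

mit, harvard = 10_000, 20_000
total = mit + harvard

p_M = mit / total        # 1/3, written as 0.33 above
p_H = harvard / total    # 2/3

# Uncertainty about the school before learning the exam result
U = p_M * math.log2(1 / p_M) + p_H * math.log2(1 / p_H)

# Competent students: all of MIT plus half of Harvard
competent = mit + harvard // 2
p_M_given_C = mit / competent    # 0.5
p_H_given_C = 1 - p_M_given_C    # 0.5
```

Note that exact thirds give U of about 0.918 bits; carrying the rounded probabilities 0.33 and 0.66 through the arithmetic instead gives about 0.91 bits.
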
    p(H|I) = 10,000 / (0 + 10,000) = 1                                   (6–17)

    p(M|I) = 0                                                           (6–18)

The uncertainty in school if you are told that the student was deemed incompetent is the average information gained when we learn the school:

    I_avgI = p(M|I) log_2(1/p(M|I)) + p(H|I) log_2(1/p(H|I))
           = 0 + 1 log_2(1)
           = 0 bits                                                      (6–19)

Not a surprising answer. If we know the student is incompetent, we know they are from Harvard, not MIT.

Solution to Problem 2, part e.

The average uncertainty on learning the competency of the student is:

    I_out = p(I) I_avgI + p(C) I_avgC = 0.33 (0) + 0.66 (1) = 0.66 bits  (6–20)

Solution to Problem 2, part f.

A process box diagram for the inference process would look as shown in Figure 6–3.

[Figure 6–3: Cambridge computer competency exam inference machine. Box diagram: inputs C and I map to outputs M and H through transition probabilities c_MC, c_HC, and c_HI.]

and the following is the transition matrix:

    [ c_HI  c_MI ]   [ 1    0   ]
    [ c_HC  c_MC ] = [ 0.5  0.5 ]                                        (6–21)

Solution to Problem 2, part g.

The transition matrix of the whole system is:
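The two conditional uncertainties and their weighted average (parts c through e) can be checked in a few lines of Python; a sketch, with list order [MIT, Harvard] assumed:

```python
import math

def info(probs):
    # average information in bits; zero-probability terms contribute nothing
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

p_school_given_C = [0.5, 0.5]   # competent: 10,000 MIT vs 10,000 Harvard
p_school_given_I = [0.0, 1.0]   # incompetent: Harvard only

I_avgC = info(p_school_given_C)   # 1 bit
I_avgI = info(p_school_given_I)   # 0 bits

p_C, p_I = 2 / 3, 1 / 3           # 20,000 competent out of 30,000 students
I_out = p_I * I_avgI + p_C * I_avgC
```
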
    [ c_HH  c_HM ]   [ c_HI  c_HC ] [ c_IH  c_IM ]
    [ c_MH  c_MM ] = [ c_MI  c_MC ] [ c_CH  c_CM ]

                   = [ 1  0.5 ] [ 0.5  0 ]
                     [ 0  0.5 ] [ 0.5  1 ]

                   = [ 0.75  0.5 ]
                     [ 0.25  0.5 ]                                       (6–22)

The noise is computed as follows:

    N = Σ_i p(a_i) Σ_j c_ji log_2(1/c_ji)
      = p(H) [c_IH log_2(1/c_IH) + c_CH log_2(1/c_CH)]
        + p(M) [c_IM log_2(1/c_IM) + c_CM log_2(1/c_CM)]
      = 0.66 (0.5 log_2(2) + 0.5 log_2(2)) + 0.33 (0 + 1 log_2(1))
      = 0.66 (1) + 0
      = 0.66 bits

The loss is as follows:

    L = I_in - I_out + N = 0.91 - 0.91 + 0.66                            (6–23)
      = 0.66 bits                                                        (6–24)

The mutual information is as follows:

    M = I_out - N = 0.91 - 0.66 = 0.25 bits                              (6–25)
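The whole-system matrix above is just the product of the two stage matrices (with columns as inputs and rows as outputs); a minimal Python check:

```python
# Exam stage: input school (columns H, M) -> verdict (rows I, C)
exam = [[0.5, 0.0],
        [0.5, 1.0]]

# Inference stage: input verdict (columns I, C) -> inferred school (rows H, M)
infer = [[1.0, 0.5],
         [0.0, 0.5]]

# Whole system: inferred school given actual school, computed as infer times exam
whole = [[sum(infer[i][k] * exam[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
# whole == [[0.75, 0.5], [0.25, 0.5]]
```
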
MIT OpenCourseWare
http://ocw.mit.edu

6.050J / 2.110J Information and Entropy
Spring 2008

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.