EE226: Random Processes in Systems. Lecturer: Jean C. Walrand. GSI: Assane Gueye.
Problem Set 4. Due Oct. 12, Fall 06.

This problem set essentially reviews detection theory and hypothesis testing, and some basic notions of estimation theory. Not all exercises are to be turned in. Only the marked exercises are due on Thursday, October 12th, at the beginning of class. Although the remaining exercises are not graded, you are encouraged to go through them. We will discuss some of the exercises during discussion sections. Please feel free to point out errors and notions that need to be clarified.

Notes: In this problem set, unless otherwise stated, we consider simple binary hypothesis testing. There are two possible hypotheses, H0 and H1. Basically, we wish to test the assumption that H0 is true against the alternative that H1 is true. By the probability of Type I error (or false alarm), we mean the probability of accepting hypothesis H1 (or rejecting H0) while H0 is true. The Type II (or missed detection) probability is the probability of accepting H0 while H1 is true. (Think of a radar system where H0 is the hypothesis that there is no target and H1 the hypothesis that a target is present.)

Exercise 4.1. In discussion section, we have seen (not really!) that in the Neyman-Pearson test, the probabilities of Type I and Type II errors (denoted α and β, respectively) cannot both be made small at the same time: reducing one type of error results in increasing the other. We have also claimed that one way to reduce both types of error is to increase the number of observations. In this exercise we will verify those claims.

Let (X1, ..., Xn) be independent observations (or realizations) of a normal random variable with mean µ and variance σ² = 100. We would like to test the hypothesis H0: µ = 50 against H1: µ = µ1 > 50, for an observation size n = 25.

A) Error calculation: We decide to reject H0 if X̄ ≥ 52, where X̄ = (X1 + ... + Xn)/n is the sample mean.
1. Find the probability of rejecting H0 as a function of µ.
2. Compute the probability of Type I error (α).
3.
Find the probability of Type II error for (i) µ1 = 53 and (ii) µ1 = 55.

B) Now consider a decision rule such that we reject H0 if X̄ ≥ c.
1. Find the value of c such that the probability of Type I error is α = 0.05.
2. Find the probability of Type II error β with the new decision rule when µ1 = 55.
3. Compare the values you get for α and β to the ones obtained in part A.

C) Repeat part B) with a sample size n = 100 and conclude.

A) Error calculation:
(1) Note that the sample mean X̄ is normal, N(µ, σ²/n). Thus the probability of rejecting H0 is given by

P(N(µ, σ²/n) ≥ 52) = P(N(0,1) ≥ (52 − µ)/√(σ²/n)) = Q((52 − µ)/√(σ²/n)), for µ ≥ 50.

(2) The Type I error is the probability of rejecting H0 given µ = µ0 = 50. Replacing µ by 50 in the result of part (1) gives:

α = Q((52 − 50)/√(100/25)) = Q(1) = 0.1587.

(3) The Type II error is the probability of accepting H0 given that µ = µ1 > µ0, i.e. P(X̄ < 52). As in part (1), we can express it as a function of µ1:

β = P(N(µ1, σ²/n) < 52) = P(N(0,1) < (52 − µ1)/√(σ²/n)) = Q((µ1 − 52)/√(σ²/n)),

using the symmetry property of the Q-function. For µ1 = 53 we obtain β = 0.3085, and for µ1 = 55 we obtain β = 0.0668. Notice that the probability of Type II error clearly depends on µ1.

B)-(1) Using the result in (2) of part A, we are looking for c such that

α = P(X̄ ≥ c | µ = µ0) = 0.05.

Again, given that µ = µ0 = 50, X̄ ~ N(µ0, σ²/n). Thus we want c such that

Q((c − 50)/√(σ²/n)) = 0.05, i.e. c = √(σ²/n) · Q⁻¹(0.05) + 50.

Looking at the table, we have c = 53.29.

(2) Using part (A)-(3) with the modified decision rule, we obtain

β = Q((µ1 − 53.29)/√(σ²/n)).
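The thresholds and error probabilities worked out in parts A and B (and in part C below) can be reproduced with Python's standard library; a minimal sketch using `statistics.NormalDist` for the normal cdf and its inverse:

```python
from statistics import NormalDist

def Q(x):
    """Standard normal tail probability P(N(0,1) > x)."""
    return 1 - NormalDist().cdf(x)

def Qinv(p):
    """Inverse of Q."""
    return NormalDist().inv_cdf(1 - p)

sigma2, n = 100, 25
se = (sigma2 / n) ** 0.5            # std dev of the sample mean = 2

# Part A: fixed threshold 52
alpha_A  = Q((52 - 50) / se)        # Type I error, approx 0.1587
beta_A55 = Q((55 - 52) / se)        # Type II error at mu1 = 55, approx 0.0668

# Part B: choose c so that alpha = 0.05
c = se * Qinv(0.05) + 50            # approx 53.29
beta_B55 = Q((55 - c) / se)         # approx 0.1963

# Part C: same rule with n = 100
se100 = (sigma2 / 100) ** 0.5
c100 = se100 * Qinv(0.05) + 50      # approx 51.645
beta_C55 = Q((55 - c100) / se100)   # approx 0.0004
```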
With µ1 = 55, we obtain β = 0.1963.

(3) Comparing with the results of part (A), we notice that with the change of decision rule, α is reduced from 0.1587 to 0.05, but β is increased from 0.0668 to 0.1963.

(C) We just use the results of part (B) with n = 100.

(1) c = √(σ²/n) · Q⁻¹(0.05) + 50, with σ² = 100 and n = 100. Using the tables, we obtain c = 51.645.

(2) Setting µ1 = 55, we obtain

β = Q((µ1 − 51.645)/√(σ²/n)) = 0.0004.

(3) Notice that with sample size n = 100, both α and β have decreased from their respective original values of 0.1587 and 0.0668 when n = 25.

Exercise 4.2. ML, MAP, and Neyman-Pearson
We consider a binary communication system where, at each time, one of the two symbols s = 0 or s = 1 is transmitted. Our two hypotheses are

H0: s = 0 was transmitted
H1: s = 1 was transmitted

The communication channel adds noise n ~ N(0, 1), and the received signal is x = s + n. At some time instant, we observe x = 0.6.
(a) Using the maximum likelihood test, determine which signal was sent. Compute α.
(b) Suppose now that the prior probability of s is given by P(s = 0) = 2/3. Repeat part (a) using the MAP test.
(c) Now suppose that we require α = 0.25. Using the Neyman-Pearson test, determine which signal was sent. What is the corresponding β?

Hint: (1) Because of the symmetry of ML detection, the decision threshold is 1/2. Since x = 0.6 > 1/2, we decide s = 1. (2) Writing out the MAP rule, we see that the threshold is now equal to 1/2 + ln 2 = 1.193 > x, and we decide s = 0. (3) Deriving the NP test, we find that the threshold is equal to Q⁻¹(0.25) = 0.674 > x, and hence we decide s = 0. Try to compute the corresponding errors.

Exercise 4.3. Bayes Test
In the Bayes test, each event (Di, Hj) has an associated cost Cij, where (Di, Hj) is the event that hypothesis Hi is accepted while Hj is true (for i, j = 0, 1). The average cost, or Bayes risk, is defined as

C̄ = C00 P(D0; H0) + C01 P(D0; H1) + C10 P(D1; H0) + C11 P(D1; H1)
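The Bayes risk just defined is a weighted sum of the four joint probabilities P(Di; Hj); a minimal generic sketch (the joint probabilities below are hypothetical, chosen only to exercise the formula):

```python
def bayes_risk(costs, joint):
    """Average cost: sum over i, j of costs[i][j] * joint[i][j],
    where costs[i][j] is the cost of deciding H_i when H_j is true
    and joint[i][j] is the joint probability P(D_i; H_j)."""
    return sum(costs[i][j] * joint[i][j] for i in range(2) for j in range(2))

# Costs C00 = C11 = 0, C01 = 2, C10 = 1 (as in the exercise below)
# paired with a hypothetical joint distribution over (decision, truth).
costs = [[0, 2], [1, 0]]
joint = [[0.4, 0.1], [0.2, 0.3]]   # rows: decision D_i, cols: true H_j
risk = bayes_risk(costs, joint)    # 2*0.1 + 1*0.2 = 0.4
```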
The test that minimizes the average cost is called the Bayes test, and it can be expressed in terms of the likelihood ratio: decide H1 if Λ(X) ≥ η and H0 otherwise, where

η = (C10 − C00) P(H0) / ((C01 − C11) P(H1)).

Now consider a binary decision problem with the following conditional pdfs:

f(x|H0) = (1/2) exp{−|x|}
f(x|H1) = exp{−2|x|}

The costs are given as follows: C00 = C11 = 0, C01 = 2, C10 = 1.
(a) Determine the Bayes test if P(H0) = 2/3, and the associated Bayes risk.
(b) Repeat part (a) for P(H0) = 1/2.

(a) The likelihood ratio is given by:

Λ(x) = f(x|H1)/f(x|H0) = exp{−2|x|} / ((1/2) exp{−|x|}) = 2 exp{−|x|}.

The Bayes test is: decide H1 if

2 exp{−|x|} ≥ η = ((1 − 0) · (2/3)) / ((2 − 0) · (1/3)) = 1.

Taking the logarithm on both sides, we obtain: decide H1 if |x| ≤ ln 2 = 0.693, and H0 otherwise. The associated Bayes risk is

C̄ = 2 P(D0|H1) P(H1) + P(D1|H0) P(H0)

with

P(D0|H1) = 2 ∫ from 0.693 to ∞ of e^{−2x} dx = 0.25
P(D1|H0) = 2 ∫ from 0 to 0.693 of (1/2) e^{−x} dx = 0.5,

giving C̄ = 2 · 0.25 · (1/3) + 0.5 · (2/3) = 0.5.

(b) For P(H0) = 1/2 we have equal priors, η = 1/2, and the test becomes: decide H1 if |x| ≤ ln 4 = 1.386, and H0 otherwise.
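The risks for both priors can be checked in closed form; a short sketch (the exponential integrals evaluate to the expressions in the comments):

```python
import math

def risk(t, p0, c01=2.0, c10=1.0):
    """Bayes risk of the rule "decide H1 iff |x| <= t" for the pdfs
    f(x|H0) = (1/2) e^{-|x|} and f(x|H1) = e^{-2|x|}."""
    p_d1_h0 = 1 - math.exp(-t)      # P(|x| <= t | H0), decide H1 wrongly
    p_d0_h1 = math.exp(-2 * t)      # P(|x| >  t | H1), decide H0 wrongly
    return c10 * p_d1_h0 * p0 + c01 * p_d0_h1 * (1 - p0)

risk_a = risk(math.log(2), 2/3)     # part (a): threshold ln 2, risk 0.5
risk_b = risk(math.log(4), 1/2)     # part (b): threshold ln 4, risk 0.4375
```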
Following the same steps as in part (a), we obtain

P(D0; H1) = 2 ∫ from 1.386 to ∞ of e^{−2x} dx = 0.0625
P(D1; H0) = 2 ∫ from 0 to 1.386 of (1/2) e^{−x} dx = 0.75

C̄ = (1/2)(0.75 + 2 · 0.0625) = 0.4375.

Exercise 4.4. The conditional pmf of Y given X is given by the table below.

            Y = +1   Y = −1
   X = +1     0.3      0.7
   X = −1     0.6      0.4

Find the estimate X̂ of X based on Y that minimizes P(X̂ = −1 | X = +1), subject to the constraint P(X̂ = +1 | X = −1) ≤ β, for β ∈ (0, 1).

Hint: See the solution of Exercise 8.6.7 in the 126 notes.

Exercise 4.5. We observe (Y1, ..., Yn), independent realizations of the same random variable Y. Y depends on a parameter θ in the following way: when θ = 0, Y is uniformly distributed in [−1, 1]; when θ = 1, Y is uniform in [0, 2]. Suppose that we would like to guess the value of θ based on our observations.
(a) Find a sufficient statistic.
(b) We do not know the prior distribution of θ, but when θ = 0, we would like to detect it correctly at least 95% of the time. Given this constraint, find the test that minimizes the error probability when θ = 1.

(a) Let Ymin = min(Y1, ..., Yn) and Ymax = max(Y1, ..., Yn). We can easily argue that (Ymin, Ymax) is a sufficient statistic. In fact, if Ymin < 0, then we know that the Yi's must have come from the U(−1, 1) random variable. Similarly, if Ymax > 1, then we know that the Yi's must have come from the U(0, 2) random variable. If 0 ≤ Ymin ≤ Ymax ≤ 1, then the Yi's are uniformly distributed in (Ymin, Ymax) whether θ = 0 or θ = 1.

(b) In this part we want to minimize the Type II error given a constraint on the Type I error. This is solved by the NP test. We have already given the decision rule when Ymin < 0 or Ymax > 1. If 0 ≤ Ymin ≤ Ymax ≤ 1, we decide θ = 0 with probability γ, where γ is chosen such that the Type I error is equal to α.
Given that θ = 0, there is an error only if 0 ≤ Yi ≤ 1 for i = 1, ..., n and we choose θ̂ = 1. This happens with probability

P(error | θ = 0) = (1 − γ) P(Yi ∈ [0, 1], i = 1, ..., n | θ = 0) = (1 − γ) ∏ P(Yi ∈ [0, 1] | θ = 0) = (1 − γ)(1/2)ⁿ.

We want this error probability to be at most α:

(1 − γ)(1/2)ⁿ = α  ⟹  γ = max(0, 1 − 2ⁿ α),

where we take the maximum because γ must be in [0, 1]. Note that if 1 − 2ⁿ α < 0 (i.e., n > log2(1/α)), then we always choose θ̂ = 1 whenever 0 ≤ Ymin ≤ Ymax ≤ 1. Intuitively, this means that if we have enough observations (so that the Type I error constraint is automatically met: P(error | θ = 0) = 2⁻ⁿ ≤ α), we always choose θ̂ = 1 in the uncertainty region. The corresponding Type II error is zero, which is minimal.

Exercise 4.6. We consider a binary detection problem where we observe (X1, ..., Xn), independent realizations of a normal random variable with mean µ and variance σ². We would like to test the hypothesis H0: µ = µ0 against H1: µ > µ0. Suppose that the null hypothesis is false and that the true value of the mean is µ = µ0 + δ, for δ > 0. What is the minimum number of observations n if we want the Type II probability of error to be at most β? We consider that the Type I error probability is equal to α.

We will use the NP test to find the threshold first, and then we will compute the corresponding β. Note that the sample mean X̄ = (X1 + X2 + ... + Xn)/n is a sufficient statistic for this detection problem, and that given H0 it is distributed as N(µ0, σ²/n). The decision rule is to choose H0 if X̄ ≤ c and to choose H1 otherwise, for some threshold c. The probability of error given H = H0 is

P(error | H0) = α = P(N(µ0, σ²/n) > c) = P(N(0,1) > (c − µ0)/√(σ²/n)) = Q((c − µ0)/√(σ²/n)).

Thus we obtain

c = √(σ²/n) · Q⁻¹(α) + µ0.
Now we write the Type II probability of error:

P(error | H1) = P(N(µ0 + δ, σ²/n) < c)
             = P(N(0,1) < (c − µ0 − δ)/√(σ²/n))
             = P(N(0,1) < (√(σ²/n) Q⁻¹(α) + µ0 − µ0 − δ)/√(σ²/n))
             = P(N(0,1) < (√(σ²/n) Q⁻¹(α) − δ)/√(σ²/n))
             = 1 − P(N(0,1) > (√(σ²/n) Q⁻¹(α) − δ)/√(σ²/n))
             = 1 − Q((√(σ²/n) Q⁻¹(α) − δ)/√(σ²/n))
             = Q((δ − √(σ²/n) Q⁻¹(α))/√(σ²/n)) ≤ β.

Since Q is a strictly decreasing function, the minimum n for which the last inequality is satisfied verifies:

−Q⁻¹(α) + δ/√(σ²/n) ≥ Q⁻¹(β)
⟺ √n ≥ (σ/δ)(Q⁻¹(α) + Q⁻¹(β))
⟺ n ≥ (σ²/δ²)(Q⁻¹(α) + Q⁻¹(β))².

Exercise 4.7. Minimum Probability of Error
Consider a Bayes test with the following costs: C00 = C11 = 0, C01 = 1, C10 = 1. What is the average cost? (You should recognize this cost.) What is the Bayes test? (This is a test that you already know.)

Hint: It should not be difficult to see that the average cost is the probability of error and that the Bayes test corresponds to the MAP test.

Exercise 4.8. Let X and Y be two independent random variables with the same distribution U(0, 1). Find the MMSE estimate of X³ given X + Y.
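Before deriving the answer analytically, one can approximate E[X³ | X + Y] by Monte Carlo, conditioning on a narrow window around a chosen value of X + Y; a sketch assuming numpy (the window width and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1_000_000)   # X ~ U(0, 1)
y = rng.random(1_000_000)   # Y ~ U(0, 1), independent of X
z = x + y

def cond_mean_cube(z0, width=0.01):
    """Empirical E[X^3 | X+Y close to z0]: average x^3 over samples
    whose sum falls within +/- width of z0."""
    mask = np.abs(z - z0) < width
    return (x[mask] ** 3).mean()

# The closed-form answer derived below is z^3 / 4 for 0 <= z < 1,
# so at z0 = 0.8 the estimate should be near 0.8**3 / 4 = 0.128.
approx = cond_mean_cube(0.8)
```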
We know that the MMSE estimate is given by E[X³ | X + Y]. So we only need to compute the distribution of X given X + Y = z. Considering the cases 0 ≤ z < 1 and 1 ≤ z ≤ 2, we obtain

f_{X | X+Y}(x | z) = 1/z       for 0 ≤ x ≤ z,      if 0 ≤ z < 1
f_{X | X+Y}(x | z) = 1/(2 − z) for z − 1 ≤ x ≤ 1,  if 1 ≤ z ≤ 2.

Now we only need to compute E[X³ | X + Y = z], which is given by:

E[X³ | X + Y = z] = z³/4                       if 0 ≤ z < 1
E[X³ | X + Y = z] = (1 − (z − 1)⁴)/(4(2 − z))  if 1 ≤ z ≤ 2.

Exercise 4.9. Show that the mean square estimation error in the LLSE case is greater than or equal to the mean square estimation error in the MMSE case.
Note: First argue that this is true (using what you have learned so far), and then use Exercise 4.5 (Gallager) for the proof (for bonus).

The MMSE computes the estimate that minimizes the error over all functions of the observations. The LLSE considers only the set of linear functions, which is a subset of the set of all functions. Since A ⊆ B implies that the minimum over B is no larger than the minimum over A, the LLSE estimation error is at least as large as the MMSE estimation error.

It is interesting to compute the variances of the estimates. Try this exercise and you will find that the MMSE estimator has a larger variance. Ask yourself why.

By going through the steps of Gallager 4.5, we see that

E[e²_llse] = E[e²_mmse] + E[(X̂_mmse − X̂_llse)²],

which shows that the MMSE estimation error is less than or equal to the LLSE estimation error.

Exercise 4.10. (Un)biased estimators. Let (X1, ..., X7) denote a random sample from a population having mean µ and variance σ². Consider the following estimators of µ:

Θ̂ = (X1 + ... + X7)/7
Θ̃ = (2X1 − X6 + X4)/2

(a) Is either estimator unbiased?
(b) Which estimator is best? In what sense is it best?
(c) Show that the estimator of σ² defined by S² = (1/n) Σ (Xi − X̄)² is biased (where X̄ is the sample mean).
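A quick simulation previews the claims of parts (a)-(c); a sketch assuming numpy (the values of µ, σ, and the number of trials are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 10.0, 3.0
trials = 200_000
X = rng.normal(mu, sigma, size=(trials, 7))        # many samples of size 7

theta_hat = X.mean(axis=1)                          # (X1 + ... + X7)/7
theta_tilde = (2*X[:, 0] - X[:, 5] + X[:, 3]) / 2   # (2 X1 - X6 + X4)/2
s2_biased = X.var(axis=1)                           # (1/n) sum (Xi - Xbar)^2

mean_hat, mean_tilde = theta_hat.mean(), theta_tilde.mean()  # both near mu
var_hat, var_tilde = theta_hat.var(), theta_tilde.var()      # sigma^2/7 vs (3/2) sigma^2
mean_s2 = s2_biased.mean()                          # near (6/7) sigma^2, not sigma^2
```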
(a) Both estimators are unbiased because they both have mean µ.
(b) The first estimator has less variance; in this sense it is better than the second.
(c) It is well known that the unbiased estimator of the variance is

S² = (1/(n − 1)) Σ (Xi − X̄)².

Try to prove it if you are still not convinced.

Exercise 4.11. Define (X1, ..., Xn) as observed values of i.i.d. B(m, p) random variables, where m is assumed to be known and p unknown. Determine the maximum likelihood estimator of p.

We want to find p̂ given by

p̂ = argmax_p P(X1 = x1, ..., Xn = xn; p)
  = argmax_p ∏ P(Xi = xi; p)                             (independence)
  = argmax_p ∏ C(m, xi) p^{xi} (1 − p)^{m − xi}
  = argmax_p p^{Σ xi} (1 − p)^{nm − Σ xi}                (the binomial coefficients do not depend on p)
  = argmax_p log( p^{Σ xi} (1 − p)^{nm − Σ xi} )          (log is an increasing function)
  = argmax_p [ (Σ xi) log p + (nm − Σ xi) log(1 − p) ].

Taking the derivative of (Σ xi) log p + (nm − Σ xi) log(1 − p) with respect to p and setting it equal to zero, we obtain

p̂ = (x1 + x2 + ... + xn)/(nm).

Exercise 4.12. Gallager's notes: Exercise 3.10.
(a) The LLR is given by:

Λ(y) = log( f_{Y|H}(y|1) / f_{Y|H}(y|0) ) = −(y − 1)²/(2σ²) + (y − 5)²/(2σ²) = (−8y + 24)/(2σ²).

The MAP decision rule chooses Ĥ = H1 when

Λ(y) ≥ log(p0/p1)  ⟺  (−8y + 24)/(2σ²) ≥ log(p0/p1)  ⟺  y ≤ 3 − (σ²/4) log(p0/p1).

Similarly, we choose Ĥ = H0 if y > Θ, where Θ = 3 − (σ²/4) log(p0/p1). The probability of error given H = H0 is:

P(e | H = H0) = P(y ≤ Θ | H = H0) = P(N(5, σ²) ≤ Θ) = P(N(0,1) ≤ (Θ − 5)/σ) = 1 − Q((Θ − 5)/σ) = Q((5 − Θ)/σ).

A similar computation gives:

P(e | H = H1) = P(y > Θ | H = H1) = P(N(1, σ²) > Θ) = Q((Θ − 1)/σ).

(b) The second observation Y2 can be written as Y2 = Y1 + Z2, where Z2 is independent of both H and Z1. Writing the log-likelihood ratio of the pair (Y1, Y2), we see that it is simply equal to the one computed in (a). Thus the decision rule and the error probabilities will be the same as the
ones in part (a). (This second observation will not help to detect H, since Z2 is independent of H, and it won't help reduce the noise Z1 either.)

(c) Adding a noisy version of the observation will not improve our estimation. In fact, Y1 is sufficient to estimate H.

(d) Note that in our discussion in part (b), the distribution of Z2 was not relevant. Using the same arguments as in (b), we see that the decision rule and error probabilities will be the same as in part (a).

(e) Now if Z1 ~ U(0, 1), then given H = H0 we have Y ~ U(5, 6), and given H = H1 we have Y ~ U(1, 2). Since these two intervals do not overlap, we can always detect H without error. Thus the decision rule is

Ĥ = H0 if y ∈ [5, 6]
Ĥ = H1 if y ∈ [1, 2]

and the probability of error is zero.
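The threshold and error expressions of part (a) can be evaluated numerically; a sketch with illustrative values of p0, p1, and σ (the exercise keeps them symbolic):

```python
from statistics import NormalDist
import math

def Q(t):
    """Standard normal tail probability P(N(0,1) > t)."""
    return 1 - NormalDist().cdf(t)

# Illustrative values only: equal priors and unit noise variance.
p0, p1, sigma = 0.5, 0.5, 1.0

# MAP threshold from part (a): decide H1 iff y <= Theta.
Theta = 3 - (sigma**2 / 4) * math.log(p0 / p1)   # = 3 for equal priors

err_h0 = Q((5 - Theta) / sigma)   # P(y <= Theta | H0), mean 5
err_h1 = Q((Theta - 1) / sigma)   # P(y >  Theta | H1), mean 1
# With equal priors the threshold is the midpoint 3 and both errors equal Q(2).
```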