Statistical and Mathematical Methods, DS-GA 1002. December 8, 2015. Sample Final Problems: Solutions

1. Short questions

a. $Ax = b$ has a solution if $b$ is in the range of $A$. The dimension of the range of $A$ is $n$ because $A$ has $n$ linearly independent columns. Since $b \in \mathbb{R}^m$ and the dimension of $\mathbb{R}^m$ is larger than $n$, there are vectors in $\mathbb{R}^m$ that are not in the range of $A$. If $b$ is equal to one of these vectors, the equation does not have a solution.

b. In this case the dimension of the range of $A$ is $m$ because the rank of the matrix is $m$ and hence the matrix has $m$ linearly independent columns. The range is consequently equal to $\mathbb{R}^m$ and includes every possible vector $b$. The equation always has a solution.

c. The linear model can be written in matrix form as $Ax = b$, where $A \in \mathbb{R}^{50 \times 50}$ contains the average temperatures of each state as its columns and $b \in \mathbb{R}^{50}$ contains the team's scores. The system of equations has a solution as long as $A$ is invertible, which is the case if the vectors of temperatures from each state are linearly independent. Of course this does not mean that the model will generalize; we are just overfitting the data.

d. We use a random variable $X$ to represent the income. Applying Markov's inequality,
\[ P(X > 50000) \le \frac{E(X)}{50000} = \frac{1}{5}. \]
Applying Chebyshev's inequality,
\[ P(X > 50000) \le P(|X - 10000| > 40000) \le \frac{\mathrm{Var}(X)}{40000^2} = 0.04. \]

e. Type I errors are false positives. We want to minimize them when our aim is to make sure that a phenomenon that we are observing is not just generated by random fluctuations in the data.

f. Type II errors are false negatives. We want to minimize them when our aim is to detect patterns in the data (that may or may not be the product of random fluctuations).

g. First we find the cdf. For $0 \le y \le 1$,
\[ F_Y(y) = P(Y \le y) = P(X \le y^2) = \int_0^{y^2} dx = y^2. \]
The pdf of $Y$ is the derivative of the cdf,
\[ f_Y(y) = \begin{cases} 2y & \text{if } 0 \le y \le 1, \\ 0 & \text{otherwise.} \end{cases} \]
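The bounds in part d and the cdf in part g can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the mean of 10000 and the standard deviation of 8000 are the values implied by the bounds 1/5 and 0.04 above, and part g is simulated assuming that $X$ is uniform on $[0, 1]$ and $Y = \sqrt{X}$, which is what the step $P(Y \le y) = P(X \le y^2)$ with a unit density suggests. The function names are hypothetical.

```python
import random


def markov_bound(mean, threshold):
    """Markov bound on P(X > threshold) for a nonnegative X with the given mean."""
    return mean / threshold


def chebyshev_bound(variance, deviation):
    """Chebyshev bound on P(|X - E(X)| > deviation)."""
    return variance / deviation ** 2


def empirical_cdf_sqrt_uniform(y, samples=100_000, seed=0):
    """Fraction of draws of Y = sqrt(X) with Y <= y.

    Assumption: X is uniform on [0, 1] (not stated explicitly in the solution).
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples) if rng.random() ** 0.5 <= y)
    return hits / samples


if __name__ == "__main__":
    # Assumed moments: E(X) = 10000, Var(X) = 8000^2.
    print("Markov bound:", markov_bound(10_000, 50_000))
    print("Chebyshev bound:", chebyshev_bound(8_000 ** 2, 40_000))
    # Should be close to F_Y(0.5) = 0.5^2 under the assumptions above.
    print("Empirical cdf at 0.5:", empirical_cdf_sqrt_uniform(0.5))
```

The Chebyshev bound is the tighter of the two here because it uses the variance in addition to the mean.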

2. Chad

a. The empirical pmf is
\[ p_C(0) = \frac{5}{15} = \frac{1}{3}, \qquad p_C(1) = \frac{10}{15} = \frac{2}{3}. \]

b. The kernel density estimator is of the form
\[ f_{T|C}(t \mid 0) = \frac{1}{5} \sum_{i=1}^{5} \Pi(t - d_{0,i}), \qquad f_{T|C}(t \mid 1) = \frac{1}{10} \sum_{i=1}^{10} \Pi(t - d_{1,i}), \]
where $\Pi$ is a rectangular kernel with unit width, $d_{0,1}, \ldots, d_{0,5}$ are the temperatures when Chad is not there and $d_{1,1}, \ldots, d_{1,10}$ are the temperatures when he is there. The estimator is shown in Figure 1.

c. We have
\[ f_{T|C}(68 \mid 0) = 0.2 > 0 = f_{T|C}(68 \mid 1), \]
so the ML estimate is that Chad is not at the office.

d. Applying Bayes' rule,
\[ p_{C|T}(0 \mid 64) = \frac{p_C(0) f_{T|C}(64 \mid 0)}{p_C(0) f_{T|C}(64 \mid 0) + p_C(1) f_{T|C}(64 \mid 1)} = \frac{0.2 \cdot \frac{1}{3}}{0.2 \cdot \frac{1}{3} + 0.1 \cdot \frac{2}{3}} = \frac{1}{2}, \]
\[ p_{C|T}(1 \mid 64) = 1 - p_{C|T}(0 \mid 64) = \frac{1}{2}. \]
The posterior probability that Chad is there is 50%, so the MAP criterion does not favor either option.

e. Both $f_{T|C}(57 \mid 0)$ and $f_{T|C}(68 \mid 1)$ are zero, so the ML and MAP estimates are inconclusive. If we use a parametric distribution such as a Gaussian to fit the data, then $f_{T|C}(57 \mid 0)$ and $f_{T|C}(68 \mid 1)$ would not be set to zero as long as the distribution has nonzero values on all of the real line (as is the case for a Gaussian pdf). This would allow us to apply MAP or ML estimation. A nonparametric solution would be to use a kernel with a larger width.

3. 3-point shooting

a. Under the assumption that the shots are independent, $g(\theta, n)$ is the probability that the player makes $n$ or more shots in a row if the probability that he makes each shot is $\theta$.

[Figure 1: Kernel density estimator for Problem 2, showing $f_{T|C}(t \mid 0)$ and $f_{T|C}(t \mid 1)$ for temperatures between 55 and 75.]

b. The null hypothesis is that the player's shooting percentage is below 80%, i.e. $\theta < 0.8$.

c. From the figure, the player needs to make at least 14 shots.

d. The p value is around 13%. This is not enough to reject the null hypothesis, so you do not declare him to be a good shooter.

e. The probability that the number of made shots is less than 14 if the player has a 90% shooting percentage is $1 - g(0.9, 14) = 0.76$. You miss such a player 76% of the time!

f. The new threshold is 24 because we only reject the null hypothesis if the p value is below $0.05/10 = 0.005$.

g. The probability is now $1 - g(0.9, 24) = 0.92$.

h. The advantage is that we reduce the probability of false positives: we control for the fact that among many players there is bound to be one that makes many shots in a row just by sheer luck. The disadvantage is that the threshold is now very strict. It is highly unlikely that we will reject the null hypothesis for any player, even if he has a high shooting percentage.

4. Linear regression with an intercept

a. The least-squares problem is
\[ \min_{\alpha, \beta} \left\| y - \begin{bmatrix} x & \mathbf{1} \end{bmatrix} \begin{bmatrix} \beta \\ \alpha \end{bmatrix} \right\|_2^2. \]

b. $\left\{ \frac{1}{\sqrt{n}} \mathbf{1} \right\}$ is an orthonormal basis of $\mathrm{span}(\mathbf{1})$, since
\[ \left\| \frac{1}{\sqrt{n}} \mathbf{1} \right\|_2^2 = \frac{1}{n} \mathbf{1}^T \mathbf{1} = 1. \]

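[Figure 2: Graph of $g(\theta, n)$ as a function of $\theta$, for $\theta$ between 0.5 and 1.0, used in the 3-point shooting problem.]

We have
\[ P_{\mathrm{span}(\mathbf{1})^{\perp}}(x) = x - P_{\mathrm{span}(\mathbf{1})}(x) = x - \left( \frac{1}{\sqrt{n}} \mathbf{1}^T x \right) \frac{1}{\sqrt{n}} \mathbf{1} = x - \frac{\mathbf{1}^T x}{n} \mathbf{1}. \]
Projecting onto $\mathrm{span}(\mathbf{1})^{\perp}$ is equivalent to subtracting the average value of $x$ from each entry. We denote the centered vector by $\tilde{x} := P_{\mathrm{span}(\mathbf{1})^{\perp}}(x)$.

c. We can write $x = \frac{\mathbf{1}^T x}{n} \mathbf{1} + \tilde{x}$, so any vector $w$ in the span of $\mathbf{1}$ and $x$,
\[ w = a \mathbf{1} + b x = \left( a + b \, \frac{\mathbf{1}^T x}{n} \right) \mathbf{1} + b \tilde{x}, \]
also belongs to the span of $\mathbf{1}$ and $\tilde{x}$. To show that any vector in the span of $\mathbf{1}$ and $\tilde{x}$ belongs to the span of $\mathbf{1}$ and $x$ we apply the same argument, since $\tilde{x} = x - \frac{\mathbf{1}^T x}{n} \mathbf{1}$.

d. The least-squares problem is
\[ \min_{\tilde{\alpha}, \tilde{\beta}} \left\| y - \begin{bmatrix} \tilde{x} & \mathbf{1} \end{bmatrix} \begin{bmatrix} \tilde{\beta} \\ \tilde{\alpha} \end{bmatrix} \right\|_2^2. \]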

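e. The solution is
\[ \begin{bmatrix} \tilde{\beta}_{LS} \\ \tilde{\alpha}_{LS} \end{bmatrix} = \left( \begin{bmatrix} \tilde{x} & \mathbf{1} \end{bmatrix}^T \begin{bmatrix} \tilde{x} & \mathbf{1} \end{bmatrix} \right)^{-1} \begin{bmatrix} \tilde{x} & \mathbf{1} \end{bmatrix}^T y = \begin{bmatrix} \tilde{x}^T \tilde{x} & 0 \\ 0 & n \end{bmatrix}^{-1} \begin{bmatrix} \tilde{x}^T y \\ \mathbf{1}^T y \end{bmatrix} = \begin{bmatrix} \dfrac{\tilde{x}^T y}{\tilde{x}^T \tilde{x}} \\[2mm] \dfrac{\mathbf{1}^T y}{n} \end{bmatrix}, \]
where the off-diagonal entries vanish because the centered vector $\tilde{x}$ (that is, $x$ minus its average) satisfies $\mathbf{1}^T \tilde{x} = 0$. So
\[ \tilde{\alpha}_{LS} = \frac{1}{n} \sum_{i=1}^{n} y_i, \qquad \tilde{\beta}_{LS} = \frac{\tilde{x}^T y}{\|\tilde{x}\|_2^2}. \]

f. Since $\alpha_{LS}$ and $\beta_{LS}$ are solutions of the least-squares problem,
\[ P_{\mathrm{span}(\mathbf{1}, x)}(y) = \begin{bmatrix} x & \mathbf{1} \end{bmatrix} \begin{bmatrix} \beta_{LS} \\ \alpha_{LS} \end{bmatrix} = \begin{bmatrix} \beta_{LS} x_1 + \alpha_{LS} \\ \beta_{LS} x_2 + \alpha_{LS} \\ \vdots \\ \beta_{LS} x_n + \alpha_{LS} \end{bmatrix}. \]
Similarly,
\[ P_{\mathrm{span}(\mathbf{1}, \tilde{x})}(y) = \begin{bmatrix} \tilde{x} & \mathbf{1} \end{bmatrix} \begin{bmatrix} \tilde{\beta}_{LS} \\ \tilde{\alpha}_{LS} \end{bmatrix} = \begin{bmatrix} \tilde{\beta}_{LS} \tilde{x}_1 + \tilde{\alpha}_{LS} \\ \tilde{\beta}_{LS} \tilde{x}_2 + \tilde{\alpha}_{LS} \\ \vdots \\ \tilde{\beta}_{LS} \tilde{x}_n + \tilde{\alpha}_{LS} \end{bmatrix}. \]
Recall that $\mathrm{span}(\mathbf{1}, x) = \mathrm{span}(\mathbf{1}, \tilde{x})$, so $P_{\mathrm{span}(\mathbf{1}, x)}(y) = P_{\mathrm{span}(\mathbf{1}, \tilde{x})}(y)$. This establishes the equality.

5. Foxes and rabbits

a. The number of rabbits in year $n$ is equal to 1.06 times the number of rabbits in year $n - 1$ minus 0.16 times the number of foxes in year $n - 1$. The number of foxes in year $n$ is equal to 0.84 times the number of foxes in year $n - 1$ plus 0.06 times the number of rabbits in year $n - 1$.

b. If that is the case, the population becomes extinct. Let the number of rabbits and foxes be equal to an arbitrary number $m$. Then
\[ \lim_{n \to \infty} \begin{bmatrix} r_n \\ f_n \end{bmatrix} = \lim_{n \to \infty} A^n \begin{bmatrix} m \\ m \end{bmatrix} = \lim_{n \to \infty} m A^n u_2 = m \lim_{n \to \infty} \lambda_2^n u_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]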

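c. Using the formula for the inverse of a $2 \times 2$ matrix,
\[ \begin{bmatrix} u_1 & u_2 \end{bmatrix}^{-1} = \frac{1}{0.8 - 0.3} \begin{bmatrix} 1 & -1 \\ -0.3 & 0.8 \end{bmatrix} = \begin{bmatrix} 2 & -2 \\ -0.6 & 1.6 \end{bmatrix}. \]
Since
\[ \begin{bmatrix} 2 & -2 \\ -0.6 & 1.6 \end{bmatrix} \begin{bmatrix} 200 \\ 100 \end{bmatrix} = \begin{bmatrix} 200 \\ 40 \end{bmatrix}, \qquad \begin{bmatrix} 2 & -2 \\ -0.6 & 1.6 \end{bmatrix} \begin{bmatrix} 100 \\ 200 \end{bmatrix} = \begin{bmatrix} -200 \\ 260 \end{bmatrix}, \]
we have
\[ x_1 = 200 u_1 + 40 u_2, \qquad x_2 = -200 u_1 + 260 u_2. \]

d. The populations tend to
\[ \lim_{n \to \infty} \begin{bmatrix} r_n \\ f_n \end{bmatrix} = \lim_{n \to \infty} A^n x_1 = \lim_{n \to \infty} 200 A^n u_1 + 40 A^n u_2 = \lim_{n \to \infty} 200 \lambda_1^n u_1 + 40 \lambda_2^n u_2 = 200 u_1 = \begin{bmatrix} 160 \\ 60 \end{bmatrix}, \]
that is, to 160 rabbits and 60 foxes.

e. The populations tend to
\[ \lim_{n \to \infty} \begin{bmatrix} r_n \\ f_n \end{bmatrix} = \lim_{n \to \infty} A^n x_2 = \lim_{n \to \infty} -200 A^n u_1 + 260 A^n u_2 = \lim_{n \to \infty} -200 \lambda_1^n u_1 + 260 \lambda_2^n u_2 = -200 u_1 = \begin{bmatrix} -160 \\ -60 \end{bmatrix}. \]
The limit is negative, so the rabbits and the foxes disappear.

f. For an arbitrary population of $r$ rabbits and $f$ foxes,
\[ \begin{bmatrix} 2 & -2 \\ -0.6 & 1.6 \end{bmatrix} \begin{bmatrix} r \\ f \end{bmatrix} = \begin{bmatrix} 2 (r - f) \\ 1.6 f - 0.6 r \end{bmatrix}, \]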

so
\[ \begin{bmatrix} r \\ f \end{bmatrix} = 2 (r - f) \, u_1 + (1.6 f - 0.6 r) \, u_2. \]
The populations tend to
\[ \lim_{n \to \infty} \begin{bmatrix} r_n \\ f_n \end{bmatrix} = \lim_{n \to \infty} 2 (r - f) A^n u_1 + (1.6 f - 0.6 r) A^n u_2 = \lim_{n \to \infty} 2 (r - f) \lambda_1^n u_1 + (1.6 f - 0.6 r) \lambda_2^n u_2 = 2 (r - f) \, u_1 = \begin{bmatrix} 1.6 (r - f) \\ 0.6 (r - f) \end{bmatrix}. \]
The number is positive only if $r - f > 0$, i.e. if there are more rabbits than foxes.

6. Defective pixels

a. We model the number of defective pixels in each TV by a random variable $D_i$ and the average number of defective pixels by a random variable $D$. By linearity of expectation,
\[ E(D) = E\left( \frac{1}{100} \sum_{i=1}^{100} D_i \right) = \frac{1}{100} \sum_{i=1}^{100} E(D_i) = 10. \]
Assuming that the TVs are sampled independently,
\[ \mathrm{Var}(D) = \mathrm{Var}\left( \frac{1}{100} \sum_{i=1}^{100} D_i \right) = \frac{1}{100^2} \sum_{i=1}^{100} \mathrm{Var}(D_i) = 0.16. \]
Now, applying Chebyshev's inequality,
\[ P(D \le 9.1) \le P(|D - E(D)| \ge 0.9) \le \frac{\mathrm{Var}(D)}{0.9^2} = \frac{0.16}{0.81} \approx 0.20. \]

b. The null hypothesis is that the mean of $D$ is greater than or equal to 10.

c. The result is not statistically significant: under the null hypothesis, the bound on the probability of the observed result is $0.2 > 0.05$.

d. By the CLT, $D$ can be approximated as a Gaussian random variable with mean 10 and variance 0.16. Equivalently, $U := \frac{D - 10}{0.4}$ is a Gaussian with zero mean and unit variance. As a result,
\[ P(D \le 9.1) = P(0.4 U + 10 \le 9.1) = P\left( U \le -\frac{0.9}{0.4} \right) = Q\left( \frac{0.9}{0.4} \right) \text{ (by symmetry)} = Q(2.25) = 1.22 \cdot 10^{-2}. \]
Note that considering any mean greater than 10 would yield a smaller p value by the same argument. This implies that the probability of the observed results under the null hypothesis is bounded by $1.22 \cdot 10^{-2} < 0.05$, so the result is significant (under the assumption that the CLT approximation is valid).

e. Bonferroni's method is meant for situations in which we perform several tests at the same time. Here we only have one test that involves several TVs.

7. Camera measurement

a. Note that $E(A) = E(A^2) = p$ and $E(X^2) = \mu^2 + \sigma_X^2$. We have
\[ E(Y) = E(AX + Z) = E(A) E(X) + E(Z) = p \mu \]
by independence of $A$ and $X$, and
\[ E(Y^2) = E\left( (AX + Z)^2 \right) = E\left( A^2 X^2 + Z^2 + 2 A X Z \right) = E(A^2) E(X^2) + E(Z^2) + 2 E(A) E(X) E(Z) = p (\mu^2 + \sigma_X^2) + \sigma_Z^2 \]
by independence of $A$, $X$ and $Z$. Consequently,
\[ \mathrm{Var}(Y) = E(Y^2) - E(Y)^2 = p (\mu^2 + \sigma_X^2) + \sigma_Z^2 - p^2 \mu^2 = p \sigma_X^2 + p (1 - p) \mu^2 + \sigma_Z^2. \]

b.
\[ \mathrm{Cov}(X, Y) = E(XY) - E(X) E(Y) = E(A X^2 + X Z) - p \mu^2 = E(A) E(X^2) + E(X) E(Z) - p \mu^2 = p (\mu^2 + \sigma_X^2) - p \mu^2 = p \sigma_X^2. \]

c. The best MSE linear estimate is given by
\[ X_{LMMSE} = \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)} (Y - E(Y)) + E(X) = \frac{p \sigma_X^2 \, (Y - p \mu)}{p \sigma_X^2 + p (1 - p) \mu^2 + \sigma_Z^2} + \mu. \]

d. When $p = 0$, $X_{LMMSE} = \mu$. In this case $A = 0$ with probability 1, so $Y = Z$. This implies that $X$ and $Y$ are independent, so the best MSE estimate is $E(X \mid Y) = E(X) = \mu$.
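The linear estimator in part c can be checked with a short simulation. The sketch below is illustrative only: it assumes the model $Y = AX + Z$ with $A$ Bernoulli with parameter $p$ and $Z$ zero mean, as used in the derivation above, and it draws $X$ and $Z$ as Gaussians purely for convenience (the derivation only uses their means and variances). The function names and the parameter values in the usage section are hypothetical.

```python
import random


def lmmse_estimate(y, p, mu, var_x, var_z):
    """Linear MMSE estimate of X given Y = y, using the moments derived above."""
    cov_xy = p * var_x
    var_y = p * var_x + p * (1 - p) * mu ** 2 + var_z
    return cov_xy / var_y * (y - p * mu) + mu


def simulate_moments(p, mu, var_x, var_z, samples=200_000, seed=0):
    """Monte Carlo estimates of Cov(X, Y) and Var(Y).

    Assumption: X and Z are Gaussian; only their first two moments matter here.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(samples):
        a = 1 if rng.random() < p else 0           # A is Bernoulli(p)
        x = rng.gauss(mu, var_x ** 0.5)            # E(X) = mu, Var(X) = var_x
        z = rng.gauss(0.0, var_z ** 0.5)           # E(Z) = 0, Var(Z) = var_z
        xs.append(x)
        ys.append(a * x + z)
    mean_x = sum(xs) / samples
    mean_y = sum(ys) / samples
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / samples
    var_y = sum((y - mean_y) ** 2 for y in ys) / samples
    return cov_xy, var_y


if __name__ == "__main__":
    p, mu, var_x, var_z = 0.5, 2.0, 1.0, 1.0      # hypothetical parameter values
    cov_xy, var_y = simulate_moments(p, mu, var_x, var_z)
    print("Empirical Cov(X, Y):", cov_xy, "formula:", p * var_x)
    print("Empirical Var(Y):", var_y,
          "formula:", p * var_x + p * (1 - p) * mu ** 2 + var_z)
    print("Estimate for y = 3:", lmmse_estimate(3.0, p, mu, var_x, var_z))
```

Setting `p = 0` makes the covariance term vanish, so the function returns `mu` for every `y`, in agreement with part d.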