Master's Written Examination - Solution

Spring 2014

Problem 1 (Stat 401). Suppose $X_1$ and $X_2$ have the joint pdf
\[
f_{X_1,X_2}(x_1,x_2) = 2e^{-(x_1+x_2)}, \qquad 0 < x_1 < x_2 < \infty,
\]
zero elsewhere. (a) Find the marginal pdf of $X_2$. (b) Find the conditional expectation $E(X_1 \mid X_2 = 2)$. (c) Find the distribution of $Y_1 = X_1 + X_2$.

Solution. (a)
\[
f_{X_2}(x_2) = \int_0^{x_2} 2e^{-(x_1+x_2)}\,dx_1 = 2e^{-x_2}\left(1 - e^{-x_2}\right), \qquad 0 < x_2 < \infty.
\]

(b) The conditional density of $X_1$ given $X_2 = 2$ is
\[
f_{X_1 \mid X_2}(x_1 \mid 2) = \frac{f_{X_1,X_2}(x_1,2)}{f_{X_2}(2)} = \frac{2e^{-(x_1+2)}}{2e^{-2}(1-e^{-2})} = \frac{e^{-x_1}}{1-e^{-2}}, \qquad 0 < x_1 < 2.
\]
Thus the conditional expectation is
\[
E(X_1 \mid X_2 = 2) = \int_0^2 x_1\,\frac{e^{-x_1}}{1-e^{-2}}\,dx_1 = \frac{\left(-x_1 e^{-x_1} - e^{-x_1}\right)\big|_0^2}{1-e^{-2}} = \frac{1-3e^{-2}}{1-e^{-2}}.
\]

(c) Let $Y_2 = X_1$; we first derive the joint distribution of $(Y_1, Y_2)$. The inverse transformation is $x_1 = y_2$, $x_2 = y_1 - y_2$, so $|J| = 1$. Notice that $0 < X_1 < X_2$, so we have $Y_1 > 2Y_2 > 0$. Thus the joint density of $(Y_1, Y_2)$ is
\[
f_{Y_1,Y_2}(y_1,y_2) = 2e^{-y_1}, \qquad 0 < 2y_2 < y_1 < \infty.
\]
The distribution of $Y_1$ is obtained by integrating out $y_2$:
\[
f_{Y_1}(y_1) = \int_0^{y_1/2} 2e^{-y_1}\,dy_2 = y_1 e^{-y_1}, \qquad y_1 > 0,
\]
a Gamma(2, 1) density.

Problem 2 (Stat 401). Suppose the number of customers visiting a bank follows a Poisson process with an average of 3 persons per time unit. Let $X$ be the length of time from the bank opening until the first customer visits the bank, and let $Y$ be the length of time from the bank opening until the second customer visits the bank.
(a) Derive the probability density function of $X$. (b) Derive the probability density function of $Y$. (c) Find the joint distribution of $X$ and $Y$.

Solution. (a) Consider the event $\{X > x\}$: it occurs exactly when no customer arrives in the interval $[0, x]$. Let $Z_x$ be the number of customers visiting the bank during the time interval $[0, x]$; then $Z_x$ has a Poisson distribution with parameter $3x$. Thus
\[
P(X > x) = P(Z_x = 0) = e^{-3x},
\]
and consequently the pdf of $X$ is $f_X(x) = 3e^{-3x}$, $x > 0$.

(b) Similarly, $P(Y > y) = P(Z_y < 2)$, where $Z_y$ is the number of customers visiting the bank during the time interval $[0, y]$. Thus $P(Y > y) = e^{-3y}(1 + 3y)$, and the pdf of $Y$ is $f_Y(y) = 9y e^{-3y}$, $y > 0$.

(c) For $0 < x < y$, using the independence of increments,
\[
P(X > x, Y > y) = P(x < X \le y,\ Y > y) + P(X > y,\ Y > y) = P(Z_x = 0,\ Z_y = 1) + P(Z_y = 0)
\]
\[
= P(Z_y - Z_x = 1)\,P(Z_x = 0) + P(Z_y = 0) = 3(y-x)e^{-3(y-x)}\,e^{-3x} + e^{-3y} = (3y - 3x + 1)\,e^{-3y}.
\]
Differentiating in $x$ and $y$, the joint pdf of $(X, Y)$ is
\[
f_{X,Y}(x,y) = 9e^{-3y}, \qquad 0 < x < y < \infty.
\]

Problem 3 (Stat 411). Consider a distribution with density $f_\theta(x) = \theta x^{\theta-1}$, for $x \in (0,1)$ and $\theta > 0$. This is a beta distribution with parameters $\theta$ and 1.
a. Let $X_1, \ldots, X_n$ be independent and identically distributed according to $f_\theta(x)$. Find the maximum likelihood estimator, $\hat\theta$, of $\theta$.
b. Let $X \sim f_\theta(x)$. Find the distribution of $Y = -\log X$. In particular, what is the mean $E_\theta(Y)$? (Hint: the distribution of $Y$ is one you know.)
c. Use part (b), the law of large numbers, and the continuous mapping theorem to show that $\hat\theta$ is a consistent estimator of $\theta$.

Solution. (a) The log-likelihood function is
\[
l(\theta) = \sum_{i=1}^n \log f_\theta(X_i) = n\log\theta + (\theta - 1)\sum_{i=1}^n \log X_i.
\]
The derivative of $l(\theta)$ with respect to $\theta$ is
\[
l'(\theta) = \frac{n}{\theta} + \sum_{i=1}^n \log X_i,
\]
and setting this equal to zero and solving for $\theta$ gives the estimator
\[
\hat\theta = \frac{-n}{\sum_{i=1}^n \log X_i} = \frac{1}{-\frac{1}{n}\sum_{i=1}^n \log X_i}.
\]
That this is indeed a maximizer of the log-likelihood is easy to check with the second derivative test.

(b) Let $X \sim f_\theta(x)$ and define $Y = -\log X$. Then $X = e^{-Y}$, and the Jacobian of the transformation is $|dx/dy| = e^{-y}$. Therefore
\[
f_{Y \mid \theta}(y) = f_{X \mid \theta}(e^{-y})\,e^{-y} = \theta\,(e^{-y})^{\theta-1}\,e^{-y} = \theta e^{-\theta y}, \qquad y > 0.
\]
This is clearly the density of an exponential distribution with mean $1/\theta$, so $E_\theta(Y) = 1/\theta$.

(c) By the law of large numbers, $\frac{1}{n}\sum_{i=1}^n(-\log X_i)$ converges in probability to $E_\theta(-\log X)$ and, according to part (b), the limit equals $1/\theta$. Consider the function $g(z) = 1/z$ for $z > 0$, a continuous function. By the continuous mapping theorem,
\[
\hat\theta = g\!\left(\tfrac{1}{n}\textstyle\sum_{i}(-\log X_i)\right) \to g\!\left(E_\theta(-\log X)\right) = g(1/\theta) = \theta,
\]
where the convergence is in probability. Therefore, $\hat\theta$ is a consistent estimator of $\theta$.

Problem 4 (Stat 411). Let $X$ be a random variable with probability mass function $f_\theta(x) = \theta(1-\theta)^x$, where $x = 0, 1, \ldots$ and $\theta \in (0,1)$.
a. For fixed $\theta_0$, suppose the goal is to test $H_0: \theta = \theta_0$ versus $H_1: \theta < \theta_0$. Show that the uniformly most powerful test is of the form "reject if $X > c$."
b. Find the constant $c$ above so that the corresponding test has size $\alpha$. Without loss of generality, you can assume $c$ is an integer.

Solution. (a) Take $\theta_1 < \theta_0$. For testing $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$, the most powerful test obtained from the Neyman-Pearson lemma rejects $H_0$, in favor of $H_1$, if and only if the likelihood ratio $L(\theta_0)/L(\theta_1)$ is too small. Since
\[
\frac{L(\theta_0)}{L(\theta_1)} = \frac{\theta_0(1-\theta_0)^X}{\theta_1(1-\theta_1)^X} = \frac{\theta_0}{\theta_1}\left(\frac{1-\theta_0}{1-\theta_1}\right)^X
\]
is a non-increasing function of $X$ (because $(1-\theta_0)/(1-\theta_1) < 1$), we can conclude that the most powerful test rejects $H_0$, in favor of $H_1$, if and only if $X > c$ for some constant $c$. Since the choice of cutoff $c$ does not depend on $\theta_1$, we can extend the optimality conclusion to hold uniformly for all $\theta_1 < \theta_0$. (This also follows since the distribution in question has the monotone likelihood ratio property in $X$.) Therefore, the stated test is uniformly most powerful for testing $H_0$ versus $H_1$.
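As a quick numerical sanity check of the monotonicity argument, the likelihood ratio can be tabulated for illustrative parameter values $\theta_0 = 0.5$ and $\theta_1 = 0.3$ (these particular values are not part of the problem):

```python
# Check that the geometric likelihood ratio
# L(theta0)/L(theta1) = (theta0/theta1) * ((1-theta0)/(1-theta1))**x
# is non-increasing in x whenever theta1 < theta0 (illustrative values).
def likelihood_ratio(x, theta0, theta1):
    return (theta0 * (1 - theta0) ** x) / (theta1 * (1 - theta1) ** x)

ratios = [likelihood_ratio(x, theta0=0.5, theta1=0.3) for x in range(25)]
# Non-increasing, so "reject when X > c" is the most powerful form.
assert all(a >= b for a, b in zip(ratios, ratios[1:]))
```

Because the base $(1-\theta_0)/(1-\theta_1)$ is below 1, the ratio in fact decreases geometrically in $x$.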
(b) The size of the test is $P_{\theta_0}(X > c)$. Following the suggestion, we assume $c$ is an integer. Then
\[
P_{\theta_0}(X > c) = \sum_{x=c+1}^{\infty} \theta_0(1-\theta_0)^x = 1 - \sum_{x=0}^{c} \theta_0(1-\theta_0)^x = (1-\theta_0)^{c+1}.
\]
So, to make the size of the test equal to $\alpha$, solve $(1-\theta_0)^{c+1} = \alpha$ and take
\[
c = \left\lceil \frac{\log\alpha}{\log(1-\theta_0)} - 1 \right\rceil,
\]
the smallest integer greater than or equal to the quantity inside the brackets.

Problem 5 (Stat 411). Let $X_1$ and $X_2$ be independent random variables, each continuous uniformly distributed on the interval $(\theta - \tfrac12, \theta + \tfrac12)$, where $\theta$ is an unknown real number. Let $Y_1 = \tfrac12(X_1 + X_2)$ be the sample mean, and $Y_2 = \tfrac12(X_1 - X_2)$, a scaled difference.
a. Find the conditional density of $Y_1$, given $Y_2 = u$. (Hint: the joint distribution of $Y_1$ and $Y_2$ is easy to get, and the conditional density is proportional to the joint density with $u$ fixed at the given value.)
b. Find the conditional variance of $Y_1$, given $Y_2 = u$, and the unconditional variance of $Y_1$. For what values of $u$ is the conditional variance smaller than the unconditional variance? (Hint: the variance of a Unif$(a, b)$ distribution is $(b-a)^2/12$.)

Solution. (a) The joint density of $(Y_1, Y_2)$ is obtained from the transformation formula. Since $X_1 = Y_1 + Y_2$ and $X_2 = Y_1 - Y_2$, the Jacobian of the inverse transformation is $|J| = 2$, so
\[
f_{Y_1,Y_2}(y_1,y_2) = 2\,f_{X_1,X_2}(y_1+y_2,\ y_1-y_2) = 2\,I_{[\theta-\frac12,\,\theta+\frac12]}(y_1+y_2)\,I_{[\theta-\frac12,\,\theta+\frac12]}(y_1-y_2),
\]
where $I$ is the indicator function. Following the hint, the conditional density of $Y_1$, given $Y_2 = u$, is proportional to $f_{Y_1,Y_2}(y_1, u)$ as a function of $y_1$, i.e.,
\[
f_{Y_1 \mid Y_2}(y_1 \mid u) \propto I_{[\theta-\frac12-u,\ \theta+\frac12-u]}(y_1)\,I_{[\theta-\frac12+u,\ \theta+\frac12+u]}(y_1).
\]
Combining the two indicators, it is clear that the conditional distribution must also be uniform; in particular,
\[
Y_1 \mid (Y_2 = u) \sim \text{Unif}\!\left(\theta - \tfrac12 + |u|,\ \theta + \tfrac12 - |u|\right).
\]

(b) From the hint, since $Y_1$ is the mean of two independent Unif$(\theta-\tfrac12, \theta+\tfrac12)$ variables,
\[
V(Y_1) = \frac{1}{2} \times \text{variance of Unif}(\theta-\tfrac12,\ \theta+\tfrac12) = \frac{1}{2} \times \frac{1}{12} = \frac{1}{24}.
\]
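The value $1/24$ can be confirmed by a small Monte Carlo sketch; the choice $\theta = 0$, the seed, and the sample size below are arbitrary illustration choices:

```python
import random

# Monte Carlo sketch: Var(Y1) for Y1 = (X1 + X2)/2 with X1, X2 iid
# Uniform(theta - 1/2, theta + 1/2) should be close to 1/24 ~ 0.0417.
random.seed(2014)
theta, n = 0.0, 200_000
draws = [(random.uniform(theta - 0.5, theta + 0.5)
          + random.uniform(theta - 0.5, theta + 0.5)) / 2 for _ in range(n)]
mean = sum(draws) / n
var = sum((v - mean) ** 2 for v in draws) / n
assert abs(var - 1 / 24) < 2e-3
```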
Similarly,
\[
V(Y_1 \mid Y_2 = u) = \frac{\left[\left(\theta + \tfrac12 - |u|\right) - \left(\theta - \tfrac12 + |u|\right)\right]^2}{12} = \frac{(1 - 2|u|)^2}{12}.
\]
The conditional variance will be smaller if $(1 - 2|u|)^2 < \tfrac12$ or, equivalently, if $|u| > \tfrac12\left(1 - \tfrac{1}{\sqrt 2}\right)$. Basically, if the distance between the observations $X_1$ and $X_2$ is sufficiently large, then the conditional variance of the sample mean is less than its unconditional variance.

Problem 6 (Stat 416). Pretest anxiety is investigated in a study in order to see whether the scores are different for two groups of students in two different sections of an introductory course in probability theory. Five students enrolled in the first section, and six students enrolled in the second section. Their scores are

Section I: 19, 26, 22, 21, 27
Section II: 34, 24, 30, 28, 25, 23

Use the Wilcoxon rank-sum test to check whether there is a significant difference between the median scores of the two groups at the 5% level.

Solution. Sort the observations in Section I (denoted by $X$) into 19, 21, 22, 26, 27. Sort the observations in Section II (denoted by $Y$) into 23, 24, 25, 28, 30, 34. Combining $X$ and $Y$ into $Z$ gives

value:   19  21  22  23  24  25  26  27  28  30  34
group:    X   X   X   Y   Y   Y   X   X   Y   Y   Y
rank i:   1   2   3   4   5   6   7   8   9  10  11
$Z_i$:    1   1   1   0   0   0   1   1   0   0   0

where $Z_i = 1$ if the $i$-th smallest observation comes from $X$. The Wilcoxon rank-sum statistic is
\[
W_n = \sum_i i\,Z_i = 1 + 2 + 3 + 7 + 8 = 21.
\]
Using Table J (p. 576) with $m = 5$, $n = 6$, we get $P(W_n \le 21 \mid H_0) = 0.063$. Therefore the two-sided p-value is $2 \times 0.063 = 0.126$. We do not reject the null hypothesis. That is, there is no significant difference between the median scores of the two groups at the 5% level.
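The rank-sum statistic, and the tail probability quoted from Table J, can be reproduced by brute force, since under $H_0$ every 5-subset of the 11 ranks is equally likely (a sketch; note there are no ties in this dataset):

```python
from itertools import combinations

# Ranks of the Section I scores within the combined sample (no ties here).
section1 = [19, 26, 22, 21, 27]
section2 = [34, 24, 30, 28, 25, 23]
rank = {v: i + 1 for i, v in enumerate(sorted(section1 + section2))}
w = sum(rank[v] for v in section1)
assert w == 21  # 1 + 2 + 3 + 7 + 8

# Exact null tail probability P(W <= 21): enumerate all C(11, 5) = 462
# equally likely rank assignments for the Section I group.
subsets = list(combinations(range(1, 12), 5))
p_tail = sum(sum(s) <= w for s in subsets) / len(subsets)
assert abs(p_tail - 0.063) < 5e-4  # agrees with the Table J value
```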
Problem 7 (Stat 431). The size $N$ of a finite population is unknown to start with. Ten (10) units are drawn, marked and then released into the population. Next, a simple random sample of 30 is drawn, only to find that three (3) of the 30 units bear the mark. From the above information, suggest a reasonable estimate for $N$. Also indicate what statistical procedure you used in your estimation methodology.

Solution. The sample proportion of marked units is $3/30$; the population proportion of marked units is $10/N$. Using the method of moments, we equate the sample proportion to the unknown population proportion and so we obtain $3/30 = 10/N$, whence $\hat N = 100$.

Problem 8 (Stat 451). The table below shows survival times (days) of patients with advanced terminal cancer of the stomach and breast. The goal is to use a permutation test to examine the hypothesis that there is no difference in mean survival times between the two groups (stomach and breast). Describe your algorithm in detail.

Stomach: 25, 42, 45, 46, 51, 103, 124, 146, 340, 396, 412, 876, 1112
Breast: 24, 40, 719, 727, 791, 1166, 1235, 1581, 1804, 3460, 3808

Solution. There are $n = 13$ observations in the stomach group, denoted by $x_1, \ldots, x_n$, and $m = 11$ observations in the breast group, denoted by $y_1, \ldots, y_m$. Define the statistic
\[
T = T(z_1, \ldots, z_n, z_{n+1}, \ldots, z_{n+m}) = \frac{1}{n}\sum_{i=1}^n z_i - \frac{1}{m}\sum_{j=1}^m z_{n+j}.
\]
1. Calculate $t_0 = T(x_1, \ldots, x_n, y_1, \ldots, y_m)$, which is the mean difference of the two groups.
2. For $k = 1, \ldots, B$, permute the original data $(x_1, \ldots, x_n, y_1, \ldots, y_m)$ into a new dataset $Z^{(k)} = (z_1^{(k)}, \ldots, z_n^{(k)}, z_{n+1}^{(k)}, \ldots, z_{n+m}^{(k)})$ and calculate $t_k = T(Z^{(k)})$.
3. Let $L$ be the number of $k$'s such that $|t_k| \ge |t_0|$. Then $L/B$ serves as an estimated p-value. We reject the hypothesis that there is no difference in group means if $L/B$ is less than a certain significance level, say 0.05.

Problem 9 (Stat 461). A die is rolled repeatedly until either two successive 1s appear or one 6 appears. Suppose the first roll is a 3. Find the probability that the game ends with two successive 1s.

Solution. We use $X_n$ to record the number of successive 1s in the following way: if the outcome of the $n$-th roll is 2, 3, 4 or 5, then $X_n = 0$;
if the number of successive 1s after the $n$-th roll is one, then $X_n = 1$; if the number of successive 1s after the $n$-th roll is two, then $X_n = 2$; and if the outcome of the $n$-th roll is 6, then $X_n = 3$. Then $X_n$ is a Markov chain with transition probability matrix
\[
P = \begin{pmatrix} 4/6 & 1/6 & 0 & 1/6 \\ 4/6 & 0 & 1/6 & 1/6 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]
States 2 and 3 are absorbing states. The transition probability matrix restricted to the non-absorbing states is
\[
Q = \begin{pmatrix} 4/6 & 1/6 \\ 4/6 & 0 \end{pmatrix},
\qquad
W = (I - Q)^{-1} = \begin{pmatrix} 2/6 & -1/6 \\ -4/6 & 1 \end{pmatrix}^{-1} = \frac{9}{2}\begin{pmatrix} 1 & 1/6 \\ 4/6 & 2/6 \end{pmatrix}.
\]
With
\[
R = \begin{pmatrix} 0 & 1/6 \\ 1/6 & 1/6 \end{pmatrix},
\qquad
U = WR = \frac{9}{2}\begin{pmatrix} 1 & 1/6 \\ 4/6 & 2/6 \end{pmatrix}\begin{pmatrix} 0 & 1/6 \\ 1/6 & 1/6 \end{pmatrix} = \begin{pmatrix} 1/8 & 7/8 \\ 1/4 & 3/4 \end{pmatrix}.
\]
Since the first roll is a 3 (state 0), the probability that the game ends with two successive 1s is given by $U_{02}$, the $(0,2)$-th entry of $U$ (the row for starting state 0 and the column for absorbing state 2), namely $U_{02} = 1/8$.

Problem 10 (Stat 461). We toss a coin repeatedly. For each toss, we get 1 if the outcome is a head; we get 2 if the outcome is a tail. Let $Y_n$ be the sum of the outcomes of the first $n$ tosses. Denote by $X_n$ the remainder of $Y_n$ divided by 3.
a. Is there a limiting distribution for the Markov chain $X_n$? If yes, determine the limiting distribution.
b. Suppose a visit to state $j$ incurs a cost $c_j$ for $j = 0, 1$ and 2. Moreover, we know that $c_0 = 1$, $c_1 = 2$ and $c_2 = 3$. What is the long-run mean cost per unit time?

Solution. The transition matrix of $X_n$ is given by
\[
P = \begin{pmatrix} 0 & 1/2 & 1/2 \\ 1/2 & 0 & 1/2 \\ 1/2 & 1/2 & 0 \end{pmatrix}.
\]
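This matrix follows from the rule that each toss adds 1 (head) or 2 (tail) modulo 3, each with probability 1/2. A mechanical check of that construction, and of the two structural facts used next (regularity and double stochasticity), can be sketched with exact fractions:

```python
from fractions import Fraction

# Build P for X_n = Y_n mod 3: each toss adds 1 or 2 (mod 3), w.p. 1/2 each.
half = Fraction(1, 2)
P = [[Fraction(0)] * 3 for _ in range(3)]
for j in range(3):
    P[j][(j + 1) % 3] += half  # head: +1
    P[j][(j + 2) % 3] += half  # tail: +2
assert P == [[0, half, half], [half, 0, half], [half, half, 0]]

# Doubly stochastic: every row and every column sums to 1.
assert all(sum(row) == 1 for row in P)
assert all(sum(P[i][j] for i in range(3)) == 1 for j in range(3))

# P^2 has strictly positive entries, so the chain is regular.
P2 = [[sum(P[i][k] * P[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
assert all(entry > 0 for row in P2 for entry in row)
```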
(a) Yes, there is a limiting distribution for the Markov chain, since $P^2$ has all entries strictly positive and hence $P$ is regular. Moreover, it is clear that $P$ is doubly stochastic, so the limiting distribution is
\[
\pi = (\pi_0, \pi_1, \pi_2) = (1/3,\ 1/3,\ 1/3).
\]

(b) Since $\pi_j$ is also the long-run mean fraction of time that the process $X_n$ spends in state $j$, we have
\[
\text{long-run mean cost per unit time} = \sum_{j=0}^{2} \pi_j c_j = \frac{1}{3}(1 + 2 + 3) = 2.
\]

Problem 11 (Stat 481). We fit data $\{(x_i, Y_i),\ i = 1, \ldots, n\}$ with a simple linear regression model $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where the iid errors $\varepsilon_i \sim N(0, \sigma^2)$.
(a) Based on the least squares criterion (loss function), calculate the least squares estimates $\hat\beta_0, \hat\beta_1$ for the intercept and slope. What is the least squares estimate for $\beta_0$ under the restriction $\beta_1 = 0$? Is it different from the unrestricted estimator?
(b) Show that $SSR = \hat\beta_1^2 \sum_{i=1}^n (x_i - \bar x)^2$, and derive its distribution under $\beta_1 = 0$. [Given: $\mathrm{Var}(\hat\beta_1) = \sigma^2\left[\sum_{i=1}^n (x_i - \bar x)^2\right]^{-1}$.]
(c) Show that the coefficient of determination $R^2 = r^2$, where $r$ is the linear correlation coefficient of $x = (x_1, \ldots, x_n)$ and $Y = (Y_1, \ldots, Y_n)$.

Solution. (a) The least squares estimators minimize
\[
Q(\beta_0, \beta_1) = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 x_i)^2.
\]
Setting $\partial Q/\partial\beta_0 = 0$ and $\partial Q/\partial\beta_1 = 0$ gives the normal equations
\[
\sum_{i=1}^n Y_i = n\beta_0 + \beta_1 \sum_{i=1}^n x_i,
\qquad
\sum_{i=1}^n Y_i x_i = \beta_0 \sum_{i=1}^n x_i + \beta_1 \sum_{i=1}^n x_i^2,
\]
whence
\[
\hat\beta_0 = \bar Y - \hat\beta_1 \bar x,
\qquad
\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^n (x_i - \bar x)(Y_i - \bar Y)}{\sum_{i=1}^n (x_i - \bar x)^2}.
\]
When $\beta_1 = 0$, the least squares estimator of $\beta_0$ minimizes $Q(\beta_0) = \sum_{i=1}^n (Y_i - \beta_0)^2$; setting $dQ/d\beta_0 = 0$ gives $\sum_i Y_i = n\beta_0$, so $\hat\beta_0 = \bar Y$, the response average. This differs from the unrestricted estimator unless $\hat\beta_1 \bar x = 0$.

(b) Note that $\bar y = \hat\beta_0 + \hat\beta_1 \bar x$ and $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$; then
\[
SSR = \sum_{i=1}^n (\hat y_i - \bar y)^2 = \sum_{i=1}^n \left(\hat\beta_0 + \hat\beta_1 x_i - \hat\beta_0 - \hat\beta_1 \bar x\right)^2 = \hat\beta_1^2 \sum_{i=1}^n (x_i - \bar x)^2,
\]
where
\[
\hat\beta_1 = \frac{\sum_{i=1}^n (x_i - \bar x)(Y_i - \bar Y)}{s_{xx}} = \sum_{i=1}^n c_i Y_i, \qquad c_i = \frac{x_i - \bar x}{s_{xx}}.
\]
It can be shown that $\hat\beta_1 \sim N(\beta_1, \sigma^2/s_{xx})$. Under the hypothesis $\beta_1 = 0$,
\[
\frac{\hat\beta_1}{\sqrt{\mathrm{Var}(\hat\beta_1)}} \sim N(0, 1)
\quad\Longrightarrow\quad
\frac{\hat\beta_1^2}{\mathrm{Var}(\hat\beta_1)} = \frac{\hat\beta_1^2\, s_{xx}}{\sigma^2} \sim \chi^2(1),
\]
i.e., $SSR/\sigma^2 \sim \chi^2(1)$.

(c) The coefficient of determination is
\[
R^2 = \frac{SSR}{SSTO} = \frac{\hat\beta_1^2\, s_{xx}}{s_{yy}} = \left(\frac{s_{xy}}{s_{xx}}\right)^2 \frac{s_{xx}}{s_{yy}} = \frac{s_{xy}^2}{s_{xx}\, s_{yy}} = \left[\frac{\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum_{i=1}^n (x_i - \bar x)^2 \sum_{i=1}^n (y_i - \bar y)^2}}\right]^2 = r^2.
\]

Problem 12 (Stat 481). A researcher studied the sodium content in beer by selecting six brands from the large number of brands of US and Canadian beers. The researcher then chose eight 12-ounce cans or bottles of each selected brand at random and measured the sodium content $Y$ (in mg).
(a) Write down an appropriate statistical model and the necessary assumptions for the model. What are the hypotheses for this study?
(b) Complete the following ANOVA table and then draw a conclusion at level $\alpha = 0.05$. [$F_{0.05}(5, 42) = 2.44$, $F_{0.05}(6, 41) = 2.33$.]

Source   DF   Sum of Squares   Mean Square   F
Brand     5   650              130           178
Error    42   30.8             0.73
Total    47   680.8

(c) Estimate the variance components in the model given in (a).
(d) Find $\mathrm{Corr}(Y_{ij}, Y_{i'j'})$, $i, i' = 1, \ldots, k$; $j, j' = 1, \ldots, n$, the correlation coefficient between any two responses.

Solution. (a) This is a random-effects one-way ANOVA model:
\[
Y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, \ldots, 6;\ j = 1, \ldots, 8,
\]
where the iid errors $\varepsilon_{ij} \sim N(0, \sigma^2)$ are independent of the random effects $\tau_i \sim N(0, \sigma_\tau^2)$. Hypotheses: $H_0: \sigma_\tau^2 = 0$ versus $H_1: \sigma_\tau^2 > 0$.

(b) See the completed table above. We reject the null hypothesis (p-value $< 0.05$) since $F = 178 > F_{0.05}(5, 42) = 2.44$.
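As a sketch, the arithmetic of the completed table can be double-checked directly, taking the sums of squares $SSTR = 650$ and $SSE = 30.8$ from the table above:

```python
# Recompute the ANOVA mean squares and F statistic from the sums of squares.
ss_brand, df_brand = 650.0, 5
ss_error, df_error = 30.8, 42
mstr = ss_brand / df_brand   # mean square for Brand
mse = ss_error / df_error    # mean square error
f_stat = mstr / mse

assert mstr == 130.0
assert abs(mse - 0.73) < 0.01
assert f_stat > 2.44         # exceeds F_0.05(5, 42), so H0 is rejected
assert abs(f_stat - 178) < 2
```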
(c) $\hat\sigma^2 = MSE = 0.73$, and $\hat\sigma_\tau^2 = (MSTR - MSE)/n = (130 - 0.73)/8 \approx 16.2$.

(d) Calculate the covariance first:
\[
\mathrm{Cov}(Y_{ij}, Y_{i'j'}) = \mathrm{Cov}(\mu + \tau_i + \varepsilon_{ij},\ \mu + \tau_{i'} + \varepsilon_{i'j'}) =
\begin{cases}
0, & i \ne i', \\
\sigma_\tau^2, & i = i',\ j \ne j', \\
\sigma^2 + \sigma_\tau^2, & i = i',\ j = j'.
\end{cases}
\]
Therefore
\[
\mathrm{Corr}(Y_{ij}, Y_{i'j'}) =
\begin{cases}
0, & i \ne i', \\
\dfrac{\sigma_\tau^2}{\sigma^2 + \sigma_\tau^2}, & i = i',\ j \ne j', \\
1, & i = i',\ j = j'.
\end{cases}
\]
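Finally, the variance-component estimates from part (c) and the implied within-brand (intraclass) correlation from part (d) can be computed numerically from the same mean squares (a sketch):

```python
# Variance components from the ANOVA mean squares (n = 8 cans per brand),
# and the implied correlation between two cans of the same brand.
n = 8
mstr = 650.0 / 5              # = 130
mse = 30.8 / 42               # ~ 0.73
sigma2_hat = mse              # error variance estimate
sigma2_tau_hat = (mstr - mse) / n   # brand variance component
icc = sigma2_tau_hat / (sigma2_hat + sigma2_tau_hat)

assert abs(sigma2_tau_hat - 16.2) < 0.1
assert 0.95 < icc < 0.96      # within-brand responses are highly correlated
```

The large intraclass correlation says that most of the variation in sodium content is between brands rather than between cans of the same brand, consistent with rejecting $H_0$ in part (b).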