STAT 350. Assignment 6 Solutions

Size: px

Start display at page:

Download "STAT 350. Assignment 6 Solutions"

Leslie Jennings
5 years ago
Views:

1 STAT 350 Assignment 6 Solutions 1. For the Nitrogen Output in Wallabies data set from Assignment 3 do forward, backward, stepwise all subsets regression. Here is code for all the methods with all subsets done both using C p using adjusted R 2. data nit; infile nit.dat ; input nitexc weight dryin wetin nitin ; proc reg data=nit; model nitexc = weight dryin wetin nitin /selection=forward; run ; proc reg data=nit; model nitexc = weight dryin wetin nitin /selection=backward; run ; proc reg data=nit; model nitexc = weight dryin wetin nitin /selection=stepwise; run ; proc reg data=nit; model nitexc = weight dryin wetin nitin /selection=cp; run ; proc reg data=nit; model nitexc = weight dryin wetin nitin /selection=adjrsq; run ; The conclusion of the output Forward Selection Procedure for Dependent Variable NITEXC Step 1 Variable NITIN Entered R-square = C(p) = Regression Error

2 INTERCEP NITIN Bounds on condition number: 1, 1 Step 2 Variable WETIN Entered R-square = C(p) = Regression Error INTERCEP WETIN NITIN Bounds on condition number: , No other variable met the significance level for entry into the model. Summary of Forward Selection Procedure for Dependent Variable NITEXC Variable Number Partial Model Step Entered In R**2 R**2 C(p) F Prob>F 1 NITIN WETIN Backward Elimination Procedure for Dependent Variable NITEXC Step 0 All Variables Entered R-square = C(p) =

3 Regression Error INTERCEP WEIGHT DRYIN WETIN NITIN Bounds on condition number: , Step 1 Variable WEIGHT Removed R-square = C(p) = Regression Error INTERCEP DRYIN WETIN NITIN Bounds on condition number: , Step 2 Variable DRYIN Removed R-square = C(p) = Regression Error

4 INTERCEP WETIN NITIN Bounds on condition number: , Step 3 Variable WETIN Removed R-square = C(p) = Regression Error INTERCEP NITIN Bounds on condition number: 1, 1 All variables left in the model are significant at the level. Summary of Backward Elimination Procedure for Dependent Variable NITEXC Variable Number Partial Model Step Removed In R**2 R**2 C(p) F Prob>F 1 WEIGHT DRYIN WETIN Stepwise Procedure for Dependent Variable NITEXC Step 1 Variable NITIN Entered R-square = C(p) =

5 Regression Error INTERCEP NITIN Bounds on condition number: 1, 1 All variables left in the model are significant at the level. No other variable met the significance level for entry into the model. Summary of Stepwise Procedure for Dependent Variable NITEXC Variable Number Partial Model Step Entered Removed In R**2 R**2 C(p) F Prob>F 1 NITIN N = 25 Regression Models for Dependent Variable: NITEXC C(p) R-square Variables in Model In NITIN WETIN NITIN DRYIN NITIN WEIGHT NITIN DRYIN WETIN NITIN WEIGHT WETIN NITIN WEIGHT DRYIN NITIN WEIGHT DRYIN WETIN NITIN WEIGHT DRYIN WETIN DRYIN WETIN DRYIN WEIGHT DRYIN WEIGHT WETIN WETIN WEIGHT 5

6 N = 25 Regression Models for Dependent Variable: NITEXC Adjusted R-square Variables in Model R-square In WETIN NITIN NITIN DRYIN NITIN WEIGHT NITIN DRYIN WETIN NITIN WEIGHT WETIN NITIN WEIGHT DRYIN NITIN WEIGHT DRYIN WETIN NITIN WEIGHT DRYIN WETIN DRYIN WETIN DRYIN WEIGHT DRYIN WETIN WEIGHT WETIN WEIGHT is that BACKWARD STEPWISE settle for the model containing only Nitrogen Intake as a predictor. The forward selection method also includes Wet Intake because of the very high level of α (0.5) to enter. The all subsets method using C p would settle on the model using only nitrogen intake but the adjusted R 2 method also includes Wet Intake. However, overall there seems little reason to include Wet Intake since it improves the fit very little is not very significant at all. 2. Suppose X 1, X 2, X 3 are independent N(µ, σ 2 ) rom variables, so that X i = µ + σz i with Z 1, Z 2, Z 3 independent stard normals. (a) If X T = (X 1, X 2, X 3 ) Z T = (Z 1, Z 2, Z 3 ) express X in the form AZ + b for a suitable matrix A vector b. We have A = σi b T = [µ, µ, µ]. (b) Show that X is MV N 3 (µ X, Σ X ) identify µ X Σ X. The definition of MVN is that X be of the form AZ + b then µ X = b Σ X = AA T. So µ T x = [µ, µ, µ] Σ X = σ 2 I. (c) Let Y i = X i X for i = 1, 2, 3 Y 4 = X. Show that Y MV N 4 (µ Y, Σ Y ) find µ Y Σ Y. 6

7 The definition of MVN is that X be of the form AZ +b. Then if Y = BX we have Y = B(AZ + b) = (BA)Z + Bb so that Y is MVN with mean µ Y = Bb = E(Y ) Σ Y = (BA)(BA) T = BAA T B T. In this case we find So B = µ Y = E(Y ) = Σ Y = σ 2 BB T = µ 2/3 1/3 1/3 1/3 2/3 1/3 1/3 1/3 2/3 1/3 1/3 1/3 2/3 1/3 1/3 0 1/3 2/3 1/3 0 1/3 1/3 2/ /3 (d) In class I may have stated that the if the covariance between two components of a multivariate normal vector is 0 then the components are independent, but I indicated a proof only when the multivariate normal distribution in question has a density. In this case the variance matrix is singular so there is no density. However, in terms of the original Z it is possible to find two independent functions of Z such that Y 1, Y 2, Y 3 are a function of the first function while Y 4 is a function of the second. i. Let U 1 = (Z 1 Z 2 )/ 2, U 2 = (Z 1 + Z 2 2Z 3 )/ 6 U 3 = (Z 1 + Z 2 + Z 3 )/3. Show that U = (U 1, U 2, U 3 ) T has a multivariate normal distribution identify the mean variance of U. We have U = AZ where 1/ 2 1/ 2 0 A = 1/ 6 1/ 6 2/ 6 1/3 1/3 1/3 Thus U is multivariate normal with mean 0 variance covariance matrix AA T. Multiply this out to check this is Σ = AA T = /3 ii. Use the result in class, for multivariate normals which have a density to show that (U 1, U 2 ) is independent of U 3. Since the variance covariance matrix is not singular we need only check that Cov(U 1, U 3 ) = Cov(U 2, U 3 ) = 0. These two entries in Σ are, indeed, 0. 7.

8 iii. Express Y 3 as a function of U. We have Y 3 = X 3 X = σ(z 3 Z). Now 3U 3 6U 2 = 3Z 3 Z = U 3. Thus Y 3 = σ(u 3 6U 2 /3 U 3 ) = σ 6U 2 /3. iv. Use the fact that if X 1 X 2 are independent then so are G(X 1 ) H(X 2 ) for any functions G H to show that Y 1, Y 2, Y 3 is independent of Y 4. We see that Y 4 is a function of U 3. I claim that Y 1, Y 2 Y 3 are each functions of U 1, U 2. You did Y 3 in the last part. Notice that Write then These show Y 2 = X 2 X = σ(z 2 Z) Y 1 = X 1 X = σ(z 1 Z). 2U 3 + 6U 2 /3 = Z 1 + Z 2 2U 3 + 6U 2 /3 + 2U 1 = 2Z 1 2U 3 + 6U 2 /3 2U 1 = 2Z 2 Z 1 Z = 6U 2 /6 + 2U 1 /2 Z 2 Z = 6U 2 /6 2U 1 /2 so both Y 1 Y 2 are functions of U 1 U 2. Since Y 1, Y 2, Y 3 is a function of U 1, U 2 Y 4 is a function of U 3 we see the desired independence. v. Express the sample variance of the X i, i = 1, 2, 3 in terms of U use this to show that (n 1)s 2 X/σ 2 has a χ 2 distribution on 2 degrees of freedom (with n = 3). Note: in fact the sample variance of X 1, X 2 is a function of U 1. Generalizations of this idea can be used to develop an identity of the form (n 1)s 2 n = (n 2)s 2 n 1 + U 2 n for a suitable U n where s 2 n is the sample variance for X 1,..., X n. 8

9 The sample variance is s 2 = Y Y2 2 + Y3 2 2 Replace each Y i by the formulas in the previous part to get that 2s 2 σ 2 = (Z 1 Z) 2 + (Z 2 Z) 2 + (Z 3 Z) 2 Write this in terms of U 1 U 2 to get 2s 2 σ 2 = { 6U2 /6 + 2U 1 /2 } 2 + { 6U2 /6 2U 1 /2 } 2 + { 6U2 /3 } 2 = U 2 1 +U 2 2 This is a sum of squares of two independent normals each with mean 0 variance 1 so we are done. 3. In class I discussed the general formula for a multivariate normal density. Suppose that Z 1 Z 2 are independent stard normal variables. Assume that X 1 = az 1 +bz 2 +c X 2 = dz 1 +ez 2 +f. Find the joint density of X 1 X 2 by evaluating the formulas I gave in class. Express P (X 1 t) as a double integral. I want to see the integr the limits of integration but you need not try to do the integral. The density in class was [ 1 2π det(aa T ) exp 1 ] 2 (x µ)t Σ 1 (x µ) where Σ = AA T. The entries in µ are c f we have [ a b A = d e ] Then Σ = AA T = (AA T ) 1 = [ a 2 + b 2 ad + be ad + be d 2 + e 2 d 2 +e 2 (ae bd) 2 Putting together all the algebra gives [ 1 f(x 1, x 2 ) = 2π ae bd exp where the exponent is the quadratic function ad+be (ae bd) 2 ad+be (ae bd) 2 a 2 +b 2 (ae bd) ] ] q(x 1, x 2, a, b, c, d, e, f). (ae bd) 2 q(x 1, x 2, a, b, c, d, e, f) = [(x 1 c) 2 (d 2 + e 2 ) 9 2(ad + be)(x 1 c)(x 2 f) +(x 2 f) 2 (a 2 + b 2 )].

10 To compute P (X 1 t) we take the joint density of X 1 X 2 integrate it over the set of (x 1, x 2 ) such that x 1 t to get P (X 1 t) = t f(x 1, x 2 )dx 1 dx Power sample size calculations must be done before the data are gathered. However: pilot studies are often used to determine the size of the unknown parameters which are needed for these calculations. Use the S Fibre Hardness data discussed in class as follows. (a) Consider the model Y i = β 0 + β 1 S i + β 2 F i + β 3 F 2 i Fit this model to get estimates of all the βs of σ. Use these fitted values as if they were the true parameter values in the following. i. Compute the power of a two sided t test (at the 5% level) of the hypothesis that β 3 = 0 for We need the non-centrality parameter + ɛ i β 3 { 0.006, 0.003, 0, 0.003, 0.006}. δ = β 3 σ a T (X T X) 1 a where a T = [ ]. We get β 3 values from the question the denominator from the stard error of ˆβ 3. I find that stard error to be This gives δ values equal to -3,-1.5,0,1.5,3 (close enough). Turning to Table B.5 with 14 degrees of freedom for error I get powers 0.8, somewhere between , 0.05, somewhere between The somewheres are, I guess in the range of ii. Find a number m of copies of the basic design (each combination of S Fibre tried twice) to guarantee that the power of the t-test of the hypothesis β 3 = 0 is 0.9 when the true parameter values are as in the fit above. The fitted value of β 3 is the corresponding value of δ is just the t statistic in the computer output for testing β 3 = 0. This is though we use in the tables because our tests are to be two sided. We need m to give us a power of 0.9. For large numbers of degrees of freedom for error this power is about half way between the figure under δ = 4 the figure under δ = 3 so we solve m = 3.5 get m = 3.5. We would have to use m = 4 giving δ = 3.74 a power closer to However we could treat the basic design as having 9 points try 63 points in total (7 at each combination of S F) get a power quite close to

11 (b) Now consider the model Y i = β 0 + β 1 S i + β 2 F i + β 3 F 2 i + β 4 S 2 i + β 5 S i F i + ɛ i Fit this model to get estimates of all the βs of σ. Use these fitted values as if they were the true parameter values in the following. Find a number m of copies of the basic design (each combination of S Fibre tried twice) to guarantee that the power of the F -test of the hypothesis β 3 = β 4 = β 5 = 0 is 0.9 when the true parameter values are as in this fit. We can use table B.11 on page 1338 since we will have 3 numerator degrees of freedom for this F test. We will have 18m 6 degrees of freedom for error so will need a φ, in the notation of the table, quite close to 2. This makes δ 2 = (p+1)2 2 = 16. The estimates are ˆβ 3 = , ˆβ 4 = ˆβ 5 = To get our value of δ 2 for the basic design I regress ˆβ 3 F 2 + ˆβ 4 S 2 + ˆβ 5 SF on S F find the error sum of squares is Divide this by ˆσ 2 = 6.77 to find δ 2 = I need to get up to 16 so I need m = 4. 11

MLES & Multivariate Normal Theory

MLES & Multivariate Normal Theory Merlise Clyde September 6, 2016 Outline Expectations of Quadratic Forms Distribution Linear Transformations Distribution of estimates under normality Properties of MLE s Recap Ŷ = ˆµ is an unbiased estimate