Linear Models and Estimation by Least Squares

Jin-Lung Lin

1 Introduction

The investigation of causal relations lies at the heart of economics:

Effect (dependent variable) $\leftarrow$ cause (independent variables)

Example: the demand for money depends upon income, the interest rate, etc.

The two most frequently used estimation methods are maximum likelihood estimation and least squares estimation.

2 Linear Statistical Models

Definition 11.1 A linear statistical model relating a random response $Y$ to a set of independent variables $x_1, x_2, \dots, x_k$ is of the form
$$Y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_k x_{kt} + \epsilon_t, \quad t = 1, 2, \dots, T,$$
where $\beta_0, \beta_1, \dots, \beta_k$ are unknown parameters, $\epsilon_t$ is a random variable, and $x_1, x_2, \dots, x_k$ are fixed variables. The errors satisfy
$$E(\epsilon_t) = 0 \;\forall t, \qquad V(\epsilon_t) = \sigma^2 \;\forall t, \qquad E(\epsilon_t \epsilon_s) = 0 \text{ for } t \neq s, \qquad E(x_{it}\epsilon_t) = 0, \; i = 1, 2, \dots, k.$$
Take expectations of both sides to obtain $\mu_y = \beta_0 + \beta_1 \mu_{x_1} + \cdots + \beta_k \mu_{x_k}$, and subtract this from the regression model to get the mean-deviation form
$$(y_t - \mu_y) = \beta_1(x_{1t} - \mu_{x_1}) + \beta_2(x_{2t} - \mu_{x_2}) + \cdots + \beta_k(x_{kt} - \mu_{x_k}) + \epsilon_t, \quad t = 1, 2, \dots, T.$$
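As a concrete illustration, here is a minimal simulation of a two-regressor version of this model in Python (a sketch assuming NumPy; the parameter values and sample size are purely hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
beta0, beta1, beta2 = 1.0, 0.5, -0.3    # hypothetical parameter values
x1 = rng.uniform(0.0, 10.0, size=T)     # fixed regressors
x2 = rng.uniform(0.0, 10.0, size=T)
eps = rng.normal(0.0, 1.0, size=T)      # E(eps_t)=0, V(eps_t)=sigma^2, serially uncorrelated
y = beta0 + beta1 * x1 + beta2 * x2 + eps
```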
3 The Method of Least Squares

Simple Linear Regression Model
$$Y = \beta_0 + \beta_1 x + \epsilon, \qquad \hat y_i = \hat\beta_0 + \hat\beta_1 x_i$$
Minimize the sum of squared deviations:
$$SSE = \sum_{i=1}^{n}(y_i - \hat y_i)^2 = \sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$$

Least Squares Estimators for the Simple Linear Regression Model
1. $\hat\beta_1 = S_{xy}/S_{xx}$, where $S_{xy} = \sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)$ and $S_{xx} = \sum_{i=1}^{n}(x_i - \bar x)^2$.
2. $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$.

Note that $(\hat Y - \bar y) = \hat\beta_1(x - \bar x)$ and that both estimators are linear in the $y_i$:
$$\hat\beta_1 = \sum_{i=1}^{n} w_i y_i, \qquad \hat\beta_0 = \sum_{i=1}^{n} v_i y_i,$$
where
$$w_i = \frac{x_i - \bar x}{\sum_{i=1}^{n}(x_i - \bar x)^2}, \qquad v_i = \frac{1}{n} - \bar x w_i, \qquad \sum_i w_i = 0, \qquad \sum_{i=1}^{n} w_i^2 = \frac{1}{\sum_{i=1}^{n}(x_i - \bar x)^2}.$$

4 Properties of the Least Squares Estimators: Simple Linear Regression
1. The estimators $\hat\beta_0$ and $\hat\beta_1$ are unbiased, that is, $E(\hat\beta_i) = \beta_i$ for $i = 0, 1$.
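A minimal Python sketch of these two estimators, computed directly from the formulas above (NumPy assumed; `ols_simple` is a hypothetical helper name reused in later sketches):

```python
import numpy as np

def ols_simple(x, y):
    """Least squares estimates for Y = beta0 + beta1 * x + eps."""
    xbar, ybar = x.mean(), y.mean()
    Sxy = np.sum((x - xbar) * (y - ybar))
    Sxx = np.sum((x - xbar) ** 2)
    b1 = Sxy / Sxx            # slope: beta1_hat = Sxy / Sxx
    b0 = ybar - b1 * xbar     # intercept: beta0_hat = ybar - beta1_hat * xbar
    return b0, b1
```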
2. $V(\hat\beta_0) = c_{00}\sigma^2$, where $c_{00} = \sum x_i^2/(n S_{xx})$.
3. $V(\hat\beta_1) = c_{11}\sigma^2$, where $c_{11} = 1/S_{xx}$.
4. $Cov(\hat\beta_0, \hat\beta_1) = c_{01}\sigma^2$, where $c_{01} = -\bar x/S_{xx}$.
5. An unbiased estimator of $\sigma^2$ is $S^2 = SSE/(n-2)$, where $SSE = S_{yy} - \hat\beta_1 S_{xy}$ and $S_{yy} = \sum(y_i - \bar y)^2$.

If, in addition, the $\epsilon_i$, for $i = 1, 2, \dots, n$, are normally distributed:
6. Both $\hat\beta_0$ and $\hat\beta_1$ are normally distributed.
7. The random variable $(n-2)S^2/\sigma^2$ has a $\chi^2$ distribution with $n-2$ degrees of freedom.
8. The statistic $S^2$ is independent of both $\hat\beta_0$ and $\hat\beta_1$.

5 Inference Concerning the Parameters $\beta_i$

Test of Hypothesis for $\beta_i$
$$H_0: \beta_i = \beta_{i0}$$
$$H_a: \begin{cases} \beta_i > \beta_{i0} & \text{(upper-tail alternative)} \\ \beta_i < \beta_{i0} & \text{(lower-tail alternative)} \\ \beta_i \neq \beta_{i0} & \text{(two-tailed alternative)} \end{cases}$$
Test statistic:
$$T = \frac{\hat\beta_i - \beta_{i0}}{S\sqrt{c_{ii}}}$$
Rejection region:
$$\begin{cases} t > t_\alpha & \text{(upper-tail alternative)} \\ t < -t_\alpha & \text{(lower-tail alternative)} \\ |t| > t_{\alpha/2} & \text{(two-tailed alternative)} \end{cases}$$
where $c_{00} = \sum x_i^2/(n S_{xx})$ and $c_{11} = 1/S_{xx}$. Notice that $t_\alpha$ is based on $n-2$ degrees of freedom.

A $(1-\alpha)100\%$ Confidence Interval for $\beta_i$
$$\hat\beta_i \pm t_{\alpha/2}\, S\sqrt{c_{ii}}, \qquad \text{where } c_{00} = \sum x_i^2/(n S_{xx}), \; c_{11} = 1/S_{xx}.$$
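As an illustration, the sketch below computes the two-tailed test statistic and confidence interval for $\hat\beta_1$ from the formulas above (assuming NumPy and SciPy, and reusing the hypothetical `ols_simple` helper from the earlier sketch):

```python
import numpy as np
from scipy import stats

def slope_inference(x, y, alpha=0.05, beta10=0.0):
    """Two-tailed t-test of H0: beta1 = beta10 and a (1-alpha)100% CI for beta1."""
    n = len(x)
    _, b1 = ols_simple(x, y)                     # hypothetical helper from the earlier sketch
    Sxx = np.sum((x - x.mean()) ** 2)
    Sxy = np.sum((x - x.mean()) * (y - y.mean()))
    Syy = np.sum((y - y.mean()) ** 2)
    s = np.sqrt((Syy - b1 * Sxy) / (n - 2))      # S = sqrt(SSE/(n-2))
    se_b1 = s * np.sqrt(1.0 / Sxx)               # S * sqrt(c11), c11 = 1/Sxx
    t_stat = (b1 - beta10) / se_b1
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=n - 2)
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return t_stat, ci
```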
6 Inference Concerning Linear Functions of the Model Parameters: Simple Linear Regression

A Test for $\theta = a_0\beta_0 + a_1\beta_1$
$$H_0: \theta = \theta_0$$
$$H_a: \begin{cases} \theta > \theta_0 \\ \theta < \theta_0 \\ \theta \neq \theta_0 \end{cases}$$
Test statistic:
$$T = \frac{\hat\theta - \theta_0}{S\sqrt{\dfrac{a_0^2 \sum x_i^2/n + a_1^2 - 2a_0 a_1 \bar x}{S_{xx}}}}$$
Rejection region:
$$\begin{cases} t > t_\alpha \\ t < -t_\alpha \\ |t| > t_{\alpha/2} \end{cases}$$
Here $t_\alpha$ is based upon $n-2$ degrees of freedom.

A $(1-\alpha)100\%$ Confidence Interval for $\theta = a_0\beta_0 + a_1\beta_1$
$$\hat\theta \pm t_{\alpha/2}\, S\sqrt{\frac{a_0^2 \sum x_i^2/n + a_1^2 - 2a_0 a_1 \bar x}{S_{xx}}}$$
where the tabulated $t_{\alpha/2}$ is based upon $n-2$ degrees of freedom.

A $(1-\alpha)100\%$ Confidence Interval for $E(Y) = \beta_0 + \beta_1 x^*$
$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2}\, S\sqrt{1/n + (x^* - \bar x)^2/S_{xx}}$$
where the tabulated $t_{\alpha/2}$ is based upon $n-2$ degrees of freedom.

7 Predicting a Particular Value of Y Using Simple Linear Regression

A $(1-\alpha)100\%$ Prediction Interval for $Y$ when $x = x^*$
$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2}\, S\sqrt{1 + 1/n + (x^* - \bar x)^2/S_{xx}}$$
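The sketch below computes both the confidence interval for $E(Y)$ and the prediction interval for $Y$ at a point $x^*$ (assumptions as before: NumPy, SciPy, and the hypothetical `ols_simple` helper):

```python
import numpy as np
from scipy import stats

def intervals_at(x, y, x_star, alpha=0.05):
    """(1-alpha)100% CI for E(Y) and prediction interval for Y at x = x_star."""
    n = len(x)
    b0, b1 = ols_simple(x, y)                    # hypothetical helper from the earlier sketch
    Sxx = np.sum((x - x.mean()) ** 2)
    Sxy = np.sum((x - x.mean()) * (y - y.mean()))
    Syy = np.sum((y - y.mean()) ** 2)
    s = np.sqrt((Syy - b1 * Sxy) / (n - 2))
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df=n - 2)
    y_hat = b0 + b1 * x_star
    half_ci = t_crit * s * np.sqrt(1.0/n + (x_star - x.mean())**2 / Sxx)        # CI for E(Y)
    half_pi = t_crit * s * np.sqrt(1.0 + 1.0/n + (x_star - x.mean())**2 / Sxx)  # PI for Y
    return (y_hat - half_ci, y_hat + half_ci), (y_hat - half_pi, y_hat + half_pi)
```

The prediction interval is wider than the confidence interval by the extra "1 +" term, which accounts for the variability of the new observation itself, not just of the fitted line.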
8 Correlation

In $Y = \beta_0 + \beta_1 x + \epsilon$, $x$ may be a fixed regressor controlled by the researcher, or it may be the observed value of a random variable. In the former case, $E(Y) = \beta_0 + \beta_1 x$; in the latter, $E(Y \mid X = x) = \beta_0 + \beta_1 x$. Typically we assume that $(X, Y)$ has a bivariate normal distribution with $E(X) = \mu_x$, $E(Y) = \mu_y$, $V(X) = \sigma_x^2$, $V(Y) = \sigma_y^2$, and $Cov(X, Y) = \sigma_{xy} = \rho\sigma_x\sigma_y$. In this case, $E(Y \mid X = x) = \beta_0 + \beta_1 x$, where
$$\beta_1 = \frac{\sigma_y}{\sigma_x}\rho.$$
The MLE of the correlation $\rho$ is given by the sample correlation coefficient:
$$r = \frac{\sum_{i=1}^{n}(X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum_{i=1}^{n}(X_i - \bar X)^2 \sum_{i=1}^{n}(Y_i - \bar Y)^2}} = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \hat\beta_1\sqrt{\frac{S_{xx}}{S_{yy}}}$$
Testing $\beta_1 = 0$ is equivalent to testing $\rho = 0$:
$$t = \frac{\hat\beta_1}{S/\sqrt{S_{xx}}} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
Z-test:
$$\frac{1}{2}\log\frac{1+r}{1-r} \approx N\!\left(\frac{1}{2}\log\frac{1+\rho}{1-\rho},\; \frac{1}{n-3}\right)$$
$$Z = \frac{\frac{1}{2}\log\left(\frac{1+r}{1-r}\right) - \frac{1}{2}\log\left(\frac{1+\rho_0}{1-\rho_0}\right)}{\sqrt{\frac{1}{n-3}}}$$

9 Fitting the Linear Model by Using Matrices

Least Squares Equations and Solutions for a General Linear Model
$$\text{Equations:}\quad (X'X)\hat\beta = X'Y$$
$$\text{Solutions:}\quad \hat\beta = (X'X)^{-1}X'Y$$
$$SSE = Y'Y - \hat\beta' X'Y$$
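A minimal sketch of the matrix solution (NumPy assumed; `ols_matrix` is a hypothetical helper name):

```python
import numpy as np

def ols_matrix(X, y):
    """General linear model: solve (X'X) beta_hat = X'Y and return (beta_hat, SSE).
    X must include a leading column of ones for the intercept."""
    XtX = X.T @ X
    Xty = X.T @ y
    beta_hat = np.linalg.solve(XtX, Xty)   # normal equations; avoids forming (X'X)^{-1}
    sse = y @ y - beta_hat @ Xty           # SSE = Y'Y - beta_hat' X'Y
    return beta_hat, sse
```

Note the design choice: `np.linalg.solve` solves the normal equations $(X'X)\hat\beta = X'Y$ directly, which is numerically more stable than computing the inverse $(X'X)^{-1}$ explicitly.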
10 Linear Functions of the Model Parameters: Multiple Linear Regression

Multiple Linear Regression
$$Y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki} + \epsilon_i, \quad i = 1, 2, \dots, n, \qquad \hat\beta = (X'X)^{-1}X'Y$$

Properties of the Least Squares Estimators: Multiple Linear Regression
1. $E(\hat\beta_i) = \beta_i$, $i = 0, 1, \dots, k$.
2. $V(\hat\beta_i) = c_{ii}\sigma^2$, where $c_{ij}$ (or, in this special case, $c_{ii}$) is the element in row $i$ and column $j$ (here, column $i$) of $(X'X)^{-1}$.
3. $Cov(\hat\beta_i, \hat\beta_j) = c_{ij}\sigma^2$.
4. An unbiased estimator of $\sigma^2$ is $S^2 = SSE/[n - (k+1)]$, where $SSE = Y'Y - \hat\beta' X'Y$.

If, in addition, the $\epsilon_i$ are normally distributed:
5. Each $\hat\beta_i$ is normally distributed.
6. The random variable $[n - (k+1)]S^2/\sigma^2$ has a $\chi^2$ distribution with $n - (k+1)$ degrees of freedom.
7. The statistic $S^2$ and the $\hat\beta_i$, for $i = 0, 1, 2, \dots, k$, are independent.

11 Inference Concerning Linear Functions of the Model Parameters: Multiple Linear Regression

A Test for $a'\beta$
$$H_0: a'\beta = (a'\beta)_0$$
$$H_a: \begin{cases} a'\beta > (a'\beta)_0 \\ a'\beta < (a'\beta)_0 \\ a'\beta \neq (a'\beta)_0 \end{cases}$$
Test statistic:
$$T = \frac{a'\hat\beta - (a'\beta)_0}{S\sqrt{a'(X'X)^{-1}a}}$$
Rejection region:
$$\begin{cases} t > t_\alpha \\ t < -t_\alpha \\ |t| > t_{\alpha/2} \end{cases}$$
Here $t_\alpha$ is based upon $n - (k+1)$ degrees of freedom.

A $(1-\alpha)100\%$ Confidence Interval for $a'\beta$
$$a'\hat\beta \pm t_{\alpha/2}\, S\sqrt{a'(X'X)^{-1}a}$$

12 Predicting a Particular Value of Y Using Multiple Regression

A $(1-\alpha)100\%$ Prediction Interval for $Y$ when $x_1 = x_1^*, x_2 = x_2^*, \dots, x_k = x_k^*$, where $a' = [1, x_1^*, x_2^*, \dots, x_k^*]$:
$$a'\hat\beta \pm t_{\alpha/2}\, S\sqrt{1 + a'(X'X)^{-1}a}$$

13 A Test for $H_0: \beta_{g+1} = \beta_{g+2} = \cdots = \beta_k = 0$

Reduced model (R): $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_g x_g + \epsilon$
Complete model (C): $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_g x_g + \beta_{g+1} x_{g+1} + \cdots + \beta_k x_k + \epsilon$
$$SSE_R = SSE_C + (SSE_R - SSE_C)$$
$$\chi_3^2 = \frac{SSE_R}{\sigma^2}, \qquad \chi_2^2 = \frac{SSE_C}{\sigma^2}, \qquad \chi_1^2 = \frac{SSE_R - SSE_C}{\sigma^2}$$
$$F = \frac{\chi_1^2/(k-g)}{\chi_2^2/(n - [k+1])} = \frac{(SSE_R - SSE_C)/(k-g)}{SSE_C/(n - [k+1])}$$
Rejection region: $F > F_\alpha$, where $F_\alpha$ is based upon $k-g$ numerator and $n-(k+1)$ denominator degrees of freedom.
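A sketch of this nested-model F test (assuming NumPy and SciPy, and reusing the hypothetical `ols_matrix` helper from the earlier sketch):

```python
import numpy as np
from scipy import stats

def nested_f_test(X_reduced, X_complete, y, alpha=0.05):
    """F test of H0: the extra coefficients in the complete model are all zero.
    Both design matrices must include the leading column of ones."""
    n, kp1 = X_complete.shape               # kp1 = k + 1 columns
    gp1 = X_reduced.shape[1]                # gp1 = g + 1 columns
    _, sse_r = ols_matrix(X_reduced, y)     # hypothetical helper from the earlier sketch
    _, sse_c = ols_matrix(X_complete, y)
    df1, df2 = kp1 - gp1, n - kp1           # k - g and n - (k+1)
    F = ((sse_r - sse_c) / df1) / (sse_c / df2)
    p_value = stats.f.sf(F, df1, df2)       # P(F_{df1,df2} > F)
    return F, p_value, p_value < alpha      # reject H0 iff F > F_alpha
```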