The Multiple Regression Model Estimation


Lesson 5: The Multiple Regression Model. Estimation

Pilar González and Susan Orbe
Dpt. Applied Economics III (Econometrics and Statistics)

Pilar González and Susan Orbe, OCW 2014. Lesson 5. Regression model: Estimation

Learning objectives

- To estimate a linear regression model by Ordinary Least Squares (OLS) using the available data.
- To interpret the estimated coefficients of a linear regression model.
- To derive the properties of the OLS estimator.
- To derive the properties of the sample regression function.
- To calculate a measure of goodness-of-fit.
- To estimate the linear regression model using non-sample information by Restricted Least Squares.

Contents

1. The Ordinary Least Squares method
2. Sample Regression Function (SRF)
3. Sampling properties of the OLS estimator
4. Goodness-of-fit of the model
5. Use of non-sample information: Restricted Least Squares estimator (RLS); omitting and including relevant variables
6. Estimation results using Gretl
7. Presentation of results
8. Tasks: T5.1 and T5.2
9. Exercises: E5.1, E5.2 and E5.3


Estimation of the Linear Regression Model

Objective: to estimate the unknown parameters of the linear regression model

    Y_i = β_1 + β_2 X_2i + ... + β_k X_ki + u_i,   i = 1, 2, ..., N,   u_i ~ NID(0, σ²)

using the available sample.

- Y: dependent variable, endogenous variable
- X_j, j = 2, ..., k: regressors
- β: unknown coefficients
- u: non-observable error term
- i: index used with cross-section data (t in the case of time series data)
- N: sample size (T in the case of time series data)

Estimation of the Linear Regression Model

General linear regression model:

    Y_i = β_1 + β_2 X_2i + ... + β_k X_ki + u_i,   i = 1, 2, ..., N,   u_i ~ NID(0, σ²)

Some remarks:

- This is the expression that will be used for the GLRM throughout the theoretical exposition of the estimation methodology.
- This expression includes all the regression models linear in the coefficients discussed in Lesson 4.
- X_j, j = 2, ..., k: regressors that can be quantitative variables, dummy variables or other terms, such as products of variables, quadratic or logarithmic transformations, etc.
- The interpretation of the coefficients depends on what X_j stands for.

Estimation of the Linear Regression Model

Consider the general linear regression model:

    Y_i = β_1 + β_2 X_2i + ... + β_k X_ki + u_i,   i = 1, 2, ..., N

Written out for each observation:

    Y_1 = β_1 + β_2 X_21 + β_3 X_31 + ... + β_k X_k1 + u_1     (i = 1)
    Y_2 = β_1 + β_2 X_22 + β_3 X_32 + ... + β_k X_k2 + u_2     (i = 2)
    ...
    Y_i = β_1 + β_2 X_2i + β_3 X_3i + ... + β_k X_ki + u_i     (i = i)
    ...
    Y_N = β_1 + β_2 X_2N + β_3 X_3N + ... + β_k X_kN + u_N     (i = N)

Estimation of the Linear Regression Model

The linear regression model can also be written in matrix form. This expression will be used for some proofs and to derive some more complex expressions.

The GLRM in matrix form: Y = Xβ + u

    | Y_1 |   | 1  X_21  X_31  ...  X_k1 |   | β_1 |   | u_1 |
    | Y_2 |   | 1  X_22  X_32  ...  X_k2 |   | β_2 |   | u_2 |
    | ... | = | ...                      | x | β_3 | + | ... |
    | Y_i |   | 1  X_2i  X_3i  ...  X_ki |   | ... |   | u_i |
    | ... |   | ...                      |   |     |   | ... |
    | Y_N |   | 1  X_2N  X_3N  ...  X_kN |   | β_k |   | u_N |

The OLS estimator

Objective: to estimate the unknown coefficients of the Population Regression Function:

    E(Y_i | X) = β_1 + β_2 X_2i + ... + β_k X_ki

Sample Regression Function (SRF): obtained by substituting the estimated coefficients:

    Ŷ_i = β̂_1 + β̂_2 X_2i + ... + β̂_k X_ki

where the Ŷ_i are called the fitted or adjusted values.

The estimation error is referred to as the residual:

    û_i = Y_i - Ŷ_i = Y_i - β̂_1 - β̂_2 X_2i - ... - β̂_k X_ki

The residual depends both on the error term and on the error due to the estimation of the coefficients:

    û_i = u_i + (β_1 - β̂_1) + (β_2 - β̂_2) X_2i + ... + (β_k - β̂_k) X_ki

The OLS estimator

The OLS criterion. The ordinary least squares (OLS) estimator is obtained by minimizing the objective function:

    min_{β̂_1,...,β̂_k} Σ_{i=1}^{N} û_i² = min_{β̂_1,...,β̂_k} Σ_{i=1}^{N} (Y_i - β̂_1 - β̂_2 X_2i - ... - β̂_k X_ki)²

First order conditions. Set the first derivatives of the objective function equal to zero:

    ∂(Σ_{i=1}^{N} û_i²)/∂β̂_1 = 0,  ...,  ∂(Σ_{i=1}^{N} û_i²)/∂β̂_k = 0

each evaluated at β̂_j = β̂_j,OLS.

The OLS estimator

The first order conditions, evaluated at the OLS solution, are:

    ∂(Σ û_i²)/∂β̂_1 = -2 Σ_{i=1}^{N} (Y_i - β̂_1 - β̂_2 X_2i - ... - β̂_k X_ki) = 0
    ∂(Σ û_i²)/∂β̂_2 = -2 Σ_{i=1}^{N} (Y_i - β̂_1 - β̂_2 X_2i - ... - β̂_k X_ki) X_2i = 0
    ...
    ∂(Σ û_i²)/∂β̂_k = -2 Σ_{i=1}^{N} (Y_i - β̂_1 - β̂_2 X_2i - ... - β̂_k X_ki) X_ki = 0

The OLS estimator

Normal equations:

    Σ Y_i      = N β̂_1      + β̂_2 Σ X_2i      + ... + β̂_k Σ X_ki
    Σ X_2i Y_i = β̂_1 Σ X_2i + β̂_2 Σ X_2i²     + ... + β̂_k Σ X_2i X_ki
    ...
    Σ X_ki Y_i = β̂_1 Σ X_ki + β̂_2 Σ X_ki X_2i + ... + β̂_k Σ X_ki²

- The first order conditions provide a system of k linearly independent equations in k unknown parameters, which is called the normal equations.
- The OLS estimator is obtained by solving this system of equations.
- Given that the second order conditions for a minimum are always satisfied, the solution of the normal equations system is indeed a minimum.

The OLS estimator in matrix form

Objective function:

    min_{β̂} û'û = min_{β̂} (Y - Xβ̂)'(Y - Xβ̂)

First order conditions:

    ∂(û'û)/∂β̂ |_{β̂ = β̂_OLS} = 0  ⇒  -2X'(Y - Xβ̂_OLS) = 0  ⇒  X'û_OLS = 0

Normal equations:

    X'Y = X'X β̂_OLS

Second order conditions:

    ∂²(û'û)/∂β̂ ∂β̂' |_{β̂ = β̂_OLS} = 2X'X, which is always a positive semidefinite matrix.

The OLS estimator

Solving this system of k linearly independent equations, we get the OLS estimator:

    β̂_OLS = (X'X)^(-1) X'Y

where

    X'X = | N        Σ X_2i       Σ X_3i       ...  Σ X_ki       |
          | Σ X_2i   Σ X_2i²      Σ X_2i X_3i  ...  Σ X_2i X_ki  |
          | Σ X_3i   Σ X_3i X_2i  Σ X_3i²      ...  Σ X_3i X_ki  |
          | ...                                                  |
          | Σ X_ki   Σ X_ki X_2i  Σ X_ki X_3i  ...  Σ X_ki²      |

    X'Y = (Σ Y_i, Σ X_2i Y_i, Σ X_3i Y_i, ..., Σ X_ki Y_i)'

    β̂ = (β̂_1, β̂_2, β̂_3, ..., β̂_k)'

See Examples 5.1 and 5.2 for applications.
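As a numerical sketch of this closed-form solution, the estimator can be computed directly with NumPy. Everything below (data, variable names, coefficient values) is illustrative and not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100

# Observation matrix: a column of 1s plus two quantitative regressors
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
beta = np.array([1.0, 2.0, -0.5])     # true coefficients used to simulate Y
u = rng.normal(scale=0.3, size=N)     # error term
Y = X @ beta + u

# OLS estimator: beta_hat = (X'X)^(-1) X'Y, obtained by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse of X'X, which is numerically preferable.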

The OLS estimator

Some equivalences.

Observation i:

    Y_i = β_1 + β_2 X_2i + ... + β_k X_ki + u_i
    E(Y_i | X) = β_1 + β_2 X_2i + ... + β_k X_ki
    Ŷ_i = β̂_1 + β̂_2 X_2i + ... + β̂_k X_ki
    u_i = Y_i - β_1 - β_2 X_2i - ... - β_k X_ki
    û_i = Y_i - Ŷ_i = Y_i - β̂_1 - β̂_2 X_2i - ... - β̂_k X_ki
    û_i = u_i + (β_1 - β̂_1) + ... + (β_k - β̂_k) X_ki

In matrix form:

    Y = Xβ + u
    E(Y | X) = Xβ
    Ŷ = Xβ̂
    u = Y - Xβ
    û = Y - Ŷ = Y - Xβ̂
    û = u + X(β - β̂)


The Sample Regression Function (SRF)

Some algebraic properties of the sample regression function Ŷ_i = β̂_1 + β̂_2 X_2i + ... + β̂_k X_ki:

1. The sum of the residuals is zero: Σ_{i=1}^{N} û_i = 0.
2. The residuals are orthogonal to the regressors: Σ_{i=1}^{N} û_i X_ji = 0, j = 2, ..., k.

Proof. From the normal equations system, X'(Y - Xβ̂) = X'û = 0. Writing out the elements of X'û:

    Σ û_i = 0,   Σ û_i X_ji = 0 for all j.

The Sample Regression Function

3. The residuals are orthogonal to the fitted values: Ŷ'û = 0.

   Proof: Ŷ'û = (Xβ̂)'û = β̂' X'û = 0, since X'û = 0.

4. The sample means of Y and Ŷ are equal: Ȳ = mean(Ŷ).

   Proof: û_i = Y_i - Ŷ_i, so Σ Y_i = Σ Ŷ_i + Σ û_i = Σ Ŷ_i, since Σ û_i = 0. Dividing by N: (1/N) Σ Y_i = (1/N) Σ Ŷ_i, that is, Ȳ = mean(Ŷ).

The Sample Regression Function

5. The SRF passes through the vector of sample means (Ȳ, X̄_2, ..., X̄_k).

   Proof:

    Σ û_i = 0  ⇒  Σ (Y_i - β̂_1 - β̂_2 X_2i - ... - β̂_k X_ki) = 0
               ⇒  Σ Y_i = N β̂_1 + β̂_2 Σ X_2i + ... + β̂_k Σ X_ki

   Dividing by N:

    Ȳ = β̂_1 + β̂_2 X̄_2 + ... + β̂_k X̄_k
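The five algebraic properties above can be checked numerically. A small sketch with simulated data (NumPy assumed; the data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
Y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)   # OLS coefficients
Y_fit = X @ beta_hat                           # fitted values
u_hat = Y - Y_fit                              # residuals

print(u_hat.sum())                 # 1. residuals sum to zero (up to rounding)
print(X.T @ u_hat)                 # 2. residuals orthogonal to every regressor
print(Y_fit @ u_hat)               # 3. residuals orthogonal to fitted values
print(Y.mean(), Y_fit.mean())      # 4. equal sample means of Y and fitted Y
print(X.mean(axis=0) @ beta_hat)   # 5. SRF evaluated at the means equals Y-bar
```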


Properties of the OLS estimator

Properties of the OLS estimator β̂ = (X'X)^(-1) X'Y:

1. Linearity. An estimator is linear if and only if it can be expressed as a linear function of the dependent variable, conditional on X. Note that, conditional on X, β̂ can also be written as a linear function of the error term.

   Proof. Under assumptions A1 and A2:

    β̂ = (X'X)^(-1) X'Y = (X'X)^(-1) X'(Xβ + u) = β + (X'X)^(-1) X'u

2. Unbiasedness. β̂ is unbiased, that is, the expected value of the OLS estimator is equal to the population value of the coefficients of the model.

   Proof. Under assumptions A1 through A3:

    E(β̂ | X) = E[β + (X'X)^(-1) X'u | X] = β + (X'X)^(-1) X' E(u | X) = β

Properties of the OLS estimator

3. Variance: V(β̂ | X) = σ² (X'X)^(-1).

   Proof. Under assumptions A1 through A5:

    V(β̂ | X) = E[(β̂ - E(β̂ | X))(β̂ - E(β̂ | X))' | X] = E[(β̂ - β)(β̂ - β)' | X]
             = E[(X'X)^(-1) X'u u'X (X'X)^(-1) | X]
             = (X'X)^(-1) X' σ²I_N X (X'X)^(-1)
             = σ² (X'X)^(-1) X'X (X'X)^(-1) = σ² (X'X)^(-1)

   For simplicity, this covariance matrix will be denoted by V(β̂):

    V(β̂) = σ² (X'X)^(-1) = | V(β̂_1)         Cov(β̂_1, β̂_2)  ...  Cov(β̂_1, β̂_k) |
                            | Cov(β̂_2, β̂_1)  V(β̂_2)         ...  Cov(β̂_2, β̂_k) |
                            | ...                                                |
                            | Cov(β̂_k, β̂_1)  Cov(β̂_k, β̂_2)  ...  V(β̂_k)        |

Properties of the OLS estimator

This variance is the minimum in the class of all linear and unbiased estimators.

Gauss-Markov Theorem. Under assumptions A1 through A5, in the class of linear unbiased estimators, OLS has the smallest variance, conditional on X. That is, under assumptions A1 through A5, β̂_OLS is the best linear unbiased estimator (BLUE) of β.

This theorem justifies the use of the OLS method instead of other competing estimators.

Note: all the analysis is conditional on X, even though, to simplify the notation, this will not be written explicitly from now on.

OLS residuals

The OLS residuals can be written in terms of the error term as follows:

    û = Mu

where M = I_N - X(X'X)^(-1)X' is a square matrix of order N, symmetric (M = M'), idempotent (MM = M), with rank rg(M) = tr(M) = N - k, and orthogonal to X (MX = 0).

Proof:

    û = Y - Ŷ = Y - Xβ̂ = Y - X(X'X)^(-1)X'Y
      = [I_N - X(X'X)^(-1)X'] Y = [I_N - X(X'X)^(-1)X'] (Xβ + u)
      = Xβ - X(X'X)^(-1)X'Xβ + [I_N - X(X'X)^(-1)X'] u
      = [I_N - X(X'X)^(-1)X'] u = Mu
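The stated properties of M (symmetry, idempotency, orthogonality to X, trace N - k) can be verified directly. A sketch with an arbitrary simulated observation matrix (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 30, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, k - 1))])

# Residual-maker matrix: M = I_N - X (X'X)^(-1) X'
M = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)

print(np.allclose(M, M.T))      # symmetric: M = M'
print(np.allclose(M @ M, M))    # idempotent: MM = M
print(np.allclose(M @ X, 0.0))  # orthogonal to X: MX = 0
print(np.trace(M))              # tr(M) = N - k, here approximately 27
```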

Properties of the OLS residuals

1. Expected value: E(û) = 0.

   Proof: E(û) = M E(u) = M 0 = 0.

2. Variance: even in the case of homoskedastic and uncorrelated disturbances (u ~ (0, σ²I)), the OLS residuals are NOT homoskedastic and they ARE correlated.

   Proof: V(û) = E(ûû') = E(Muu'M') = M σ²I_N M' = σ² MM' = σ² M, and M ≠ I.

Estimator of the variance of the error term, σ²

    σ̂² = û'û / (N - k) = SSR / (N - k) = Σ û_i² / (N - k)

This estimator of the variance of the error term is unbiased:

    E(σ̂²) = E(û'û) / (N - k) = σ²(N - k) / (N - k) = σ²

given that:

    E(û'û) = E(u'Mu) = E(tr(u'Mu)) = E(tr(Muu'))
           = tr(E(Muu')) = tr(M σ²I_N) = σ² tr(M) = σ²(N - k)

Estimator of the covariance matrix of β̂

    V̂(β̂) = σ̂² (X'X)^(-1) = | V̂(β̂_1)          Ĉov(β̂_1, β̂_2)  ...  Ĉov(β̂_1, β̂_k) |
                            | Ĉov(β̂_2, β̂_1)   V̂(β̂_2)         ...  Ĉov(β̂_2, β̂_k) |
                            | ...                                                 |
                            | Ĉov(β̂_k, β̂_1)   Ĉov(β̂_k, β̂_2)  ...  V̂(β̂_k)        |
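Putting the last two slides together: the coefficient standard errors reported by estimation software are the square roots of the diagonal of σ̂²(X'X)^(-1). A minimal sketch with simulated data (NumPy assumed; names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N, k = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, k - 1))])
Y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
u_hat = Y - X @ beta_hat

sigma2_hat = (u_hat @ u_hat) / (N - k)        # unbiased estimator of sigma^2
V_hat = sigma2_hat * np.linalg.inv(X.T @ X)   # estimated covariance matrix of beta_hat
std_errors = np.sqrt(np.diag(V_hat))          # standard errors of the coefficients
print(std_errors)
```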


Goodness-of-fit

Sum of squares decomposition: SST = SSE + SSR

- Total Sum of Squares: SST = Σ_{i=1}^{N} (Y_i - Ȳ)². It is a measure of the total sample variation in Y.
- Explained Sum of Squares: SSE = Σ_{i=1}^{N} (Ŷ_i - Ȳ)². It is a measure of the sample variation in Ŷ.
- Residual Sum of Squares: SSR = Σ_{i=1}^{N} (Y_i - Ŷ_i)² = Σ_{i=1}^{N} û_i². It is the sample variation in the OLS residuals.

Goodness-of-fit

Regression model: Y_i = Ŷ_i + û_i, that is, Y_i is decomposed into two parts, fitted value and residual.

Estimating by OLS, the following decomposition holds:

    SST = SSE + SSR
    Σ_{i=1}^{N} (Y_i - Ȳ)² = Σ_{i=1}^{N} (Ŷ_i - Ȳ)² + Σ_{i=1}^{N} û_i²

Therefore, the total sample variation in Y can be expressed as the sum of:

- the variation in the fitted values of Y, Ŷ (the variation in Y explained by the regression), plus
- the variation in the OLS residuals, û (the variation in Y UNexplained by the regression).

Goodness-of-fit

Proof. Writing the GLRM as û_i = Y_i - Ŷ_i, that is, Y_i = Ŷ_i + û_i, the total sample variation in Y is:

    SST = Σ (Y_i - Ȳ)² = Σ (Ŷ_i + û_i - Ȳ)² = Σ [(Ŷ_i - Ȳ) + û_i]²
        = Σ (Ŷ_i - Ȳ)² + Σ û_i² + 2 Σ (Ŷ_i - Ȳ) û_i
        = Σ (Ŷ_i - Ȳ)² + Σ û_i²

since the cross term vanishes: Σ (Ŷ_i - Ȳ) û_i = Σ Ŷ_i û_i - Ȳ Σ û_i = 0, by properties 1 and 3 of the SRF. Therefore:

    SST = SSE + SSR

Goodness-of-fit

Goodness-of-fit measure. Coefficient of determination:

    R² = SSE / SST

R² is the fraction of the sample variation in Y that is explained by the variation in the explanatory variables X.

If the model contains an intercept, the coefficient of determination may also be calculated as:

    R² = 1 - SSR/SST ∈ [0, 1]

Note: Gretl computes the coefficient of determination using the formula in terms of the residual sum of squares.
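A sketch of the decomposition and of both R² formulas on simulated data (NumPy assumed; the model includes an intercept, so the two formulas coincide):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 80
X = np.column_stack([np.ones(N), rng.normal(size=N)])   # model with an intercept
Y = X @ np.array([2.0, 1.5]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_fit = X @ beta_hat
u_hat = Y - Y_fit

SST = np.sum((Y - Y.mean()) ** 2)       # total sum of squares
SSE = np.sum((Y_fit - Y.mean()) ** 2)   # explained sum of squares
SSR = np.sum(u_hat ** 2)                # residual sum of squares

R2 = 1.0 - SSR / SST                    # the formula Gretl uses
print(R2)
```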

Goodness-of-fit

Some performance criteria. These criteria are useful to compare the performance of nested regression models. Rationale: to adjust the sum of squared residuals by the degrees of freedom. Given that they are based on the SSR, the smaller the value of the criterion, the better the performance of the model (except for the adjusted R², where larger is better).

- Adjusted R²: R̄² = 1 - [(N - 1)/(N - k)] (1 - R²)
- Log-likelihood: ℓ(θ̂) = -(N/2)(1 + ln 2π - ln N) - (N/2) ln SSR
- Akaike criterion: AIC = -2ℓ(θ̂) + 2k
- Schwarz criterion: BIC = -2ℓ(θ̂) + k ln N
- Hannan-Quinn criterion: HQC = -2ℓ(θ̂) + 2k ln ln N
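A sketch of how these criteria are computed from the SSR alone. The values of N, k and SSR below are illustrative, not taken from the slides:

```python
import numpy as np

N, k, SSR = 40, 3, 1250.0   # illustrative sample size, no. of coefficients, SSR

loglik = -N / 2 * (1 + np.log(2 * np.pi) - np.log(N)) - N / 2 * np.log(SSR)
AIC = -2 * loglik + 2 * k            # Akaike criterion
BIC = -2 * loglik + k * np.log(N)    # Schwarz criterion
HQC = -2 * loglik + 2 * k * np.log(np.log(N))   # Hannan-Quinn criterion
print(AIC, BIC, HQC)
```

With N = 40 the penalties order as 2k < 2k ln ln N < k ln N, so for the same fit AIC < HQC < BIC: the Schwarz criterion penalizes extra coefficients hardest.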


Restricted Least Squares estimator

Objective: to estimate a linear regression model

    Y_i = β_1 + β_2 X_2i + ... + β_k X_ki + u_i,   i = 1, 2, ..., N,   u_i ~ NID(0, σ_u²)

including non-sample information on the coefficients.

The Restricted Least Squares (RLS) estimator is

    β̂_RLS = (X_R' X_R)^(-1) X_R' Y_R

where X_R and Y_R are the observation matrix and the dependent variable of the restricted model, that is, of the model that results from incorporating the non-sample information into the regression model.

See Example 5.3 for applications.

Restricted Least Squares estimator

Properties of β̂_RLS = (X_R' X_R)^(-1) X_R' Y_R. Conditional on X, the RLS estimator is:

- Linear in u.
- Unbiased if the non-sample information included in the estimation is true.
- Its variance is always smaller than the variance of the OLS estimator, even if the non-sample information included is false.

Therefore, if the non-sample information available is true, the RLS estimator should be used because it is unbiased and has a smaller variance than the OLS estimator.

Restricted Least Squares estimator

Example 1. Consider the regression model:

    pizza_i = β_1 + β_2 income_i + β_3 age_i + u_i,   i = 1, 2, ..., N

subject to β_2 = -β_3. Restricted model:

    pizza_i = β_1 + β_2 (income_i - age_i) + u_i

    X_R = | 1  income_1 - age_1 |        Y_R = | pizza_1 |
          | 1  income_2 - age_2 |              | pizza_2 |
          | ...                 |              | ...     |
          | 1  income_N - age_N |              | pizza_N |

    β̂_RLS = (β̂_1, β̂_2)',   with β̂_3,RLS = -β̂_2,RLS

Restricted Least Squares estimator

Example 2. Consider the regression model:

    pizza_i = β_1 + β_2 income_i + β_3 (age_i x income_i) + u_i,   i = 1, ..., N

subject to β_2 = 5. Restricted model:

    pizza_i - 5 income_i = β_1 + β_3 (age_i x income_i) + u_i

    X_R = | 1  age_1 x income_1 |        Y_R = | pizza_1 - 5 income_1 |
          | 1  age_2 x income_2 |              | pizza_2 - 5 income_2 |
          | ...                 |              | ...                  |
          | 1  age_N x income_N |              | pizza_N - 5 income_N |

    β̂_RLS = (β̂_1, β̂_3)',   with β̂_2,RLS = 5

Restricted Least Squares estimator

Example 3. Consider the regression model:

    Y_t = β_1 + β_2 X2_t + β_3 X3_t + β_4 X4_t + u_t,   t = 1, 2, ..., T

subject to β_3 + β_4 = 1. Restricted model:

    Y_t - X4_t = β_1 + β_2 X2_t + β_3 (X3_t - X4_t) + u_t

    X_R = | 1  X2_1  X3_1 - X4_1 |        Y_R = | Y_1 - X4_1 |
          | 1  X2_2  X3_2 - X4_2 |              | Y_2 - X4_2 |
          | ...                  |              | ...        |
          | 1  X2_T  X3_T - X4_T |              | Y_T - X4_T |

    β̂_RLS = (β̂_1, β̂_2, β̂_3)',   with β̂_4,RLS = 1 - β̂_3,RLS

Restricted Least Squares estimator

Example 4. Consider the regression model:

    Y_t = β_1 + β_2 X3_t + β_3 X3_t² + β_4 time + u_t,   t = 1, 2, ..., T

subject to β_2 = β_3 = 0. Restricted model:

    Y_t = β_1 + β_4 time + u_t

    X_R = | 1  1 |        Y_R = | Y_1 |
          | 1  2 |              | Y_2 |
          | ...  |              | ... |
          | 1  T |              | Y_T |

    β̂_RLS = (β̂_1, β̂_4)',   with β̂_2,RLS = β̂_3,RLS = 0
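The substitution mechanics of these examples can be sketched in code. The following illustrates Example 3's restriction β_3 + β_4 = 1 on simulated data (NumPy assumed; data and coefficient values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 60
X2, X3, X4 = rng.normal(size=(3, T))
# Simulate Y under the true restriction beta3 + beta4 = 1 (0.3 + 0.7 = 1)
Y = 1.0 + 0.5 * X2 + 0.3 * X3 + 0.7 * X4 + rng.normal(scale=0.2, size=T)

# Substitute beta4 = 1 - beta3: (Y - X4) = b1 + b2*X2 + b3*(X3 - X4) + u
Y_R = Y - X4
X_R = np.column_stack([np.ones(T), X2, X3 - X4])

# RLS = OLS on the restricted model
b1, b2, b3 = np.linalg.solve(X_R.T @ X_R, X_R.T @ Y_R)
b4 = 1.0 - b3   # recovered from the restriction
print(b1, b2, b3, b4)
```

The restricted fit estimates only three free coefficients; the fourth is recovered from the restriction, so the estimates satisfy b3 + b4 = 1 by construction.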

Omitting a relevant variable

Let us assume that the following regression model satisfies all the assumptions:

    Y_i = β_1 + β_2 X_2i + β_3 X_3i + ... + β_k X_ki + u_i,   i = 1, 2, ..., N

Omitting X_2 from this model amounts to imposing a false restriction (β_2 = 0). The restricted model (omitting X_2) is:

    Y_i = β_1 + β_3 X_3i + ... + β_k X_ki + u_i,   i = 1, 2, ..., N

The Restricted Least Squares estimator, β̂_RLS = (X_R' X_R)^(-1) X_R' Y_R, then has the following properties, conditional on X:

- Linear in u.
- BIASED, because the restriction imposed is NOT true.
- Its variance is smaller than the variance of the OLS estimator.

Including an irrelevant variable

Consider the linear regression model:

    Y_i = β_1 + β_2 X_2i + ... + β_k X_ki + u_i,   i = 1, 2, ..., N     (1)

Assume that it is known that β_2 = 0, that is, that the regressor X_2 is irrelevant. In this case, it can be proved that the OLS estimator of model (1), conditional on X, is:

- Linear in u.
- Unbiased because, even though an irrelevant variable is included, E(u | X) = 0.
- Its variance is the smallest in the class of linear and unbiased estimators.

BUT it is possible to estimate the coefficients of model (1) with a smaller variance: impose the true restriction β_2 = 0 and estimate by Restricted Least Squares.


Estimation results. Example: pizza consumption

The file pizza.gdt contains information on N individuals' annual pizza consumption (in dollars) and some characteristics of the individuals, such as:

- Annual income (in thousands of dollars)
- Age of the consumer (in years)
- Gender
- Highest level of studies (basic, high school, college, postgraduate)

The factors that determine pizza consumption will be analysed in detail in Example 5.1. The objective of this section is to estimate a very simple model for pizza consumption in order to explain the estimation output produced by Gretl.

Estimation results

Consider the linear regression model:

    pizza_i = β_1 + β_2 income_i + β_3 age_i + u_i,   i = 1, ..., N

where:

- pizza: dependent variable (pizza consumption)
- income, age: quantitative explanatory variables
- β_1, β_2, β_3: unknown coefficients
- u: error term
- N: sample size
- i: index

Estimation results. Gretl output

Model 1: OLS, using observations 1-40
Dependent variable: pizza

              Coefficient   std. error   t-ratio   p-value
    const          ...          ...         ...       ...
    income         ...          ...         ...       ...
    age            ...          ...         ...       ...

    Mean dependent var  ...    S.D. dependent var  ...
    Sum squared resid   ...    S.E. of regression  ...
    R-squared           ...    Adjusted R-squared  ...
    F(2, 37)            ...    P-value(F)          ...
    Log-likelihood      ...    Akaike criterion    ...
    Schwarz criterion   ...    Hannan-Quinn        ...

Estimation results. Gretl output

Model 1: OLS, using observations 1-40
Dependent variable: pizza

The information shown in the heading of the estimation output is:

- Model 1: this is the first model estimated since Gretl was started.
- OLS: the estimation method used.
- using observations 1-40: the sample size is 40.
- Dependent variable: pizza: the dependent variable of the estimated regression model is pizza.

Estimation results. Gretl output

The first column of the coefficient table shows the regressors included in the estimated regression model. These regressors may be quantitative variables, dummy variables, products or transformations of variables, etc. That is, they represent the columns of the observation matrix X: the constant (the column of 1s) and the regressors income and age.

Estimation results. Gretl output

The second column shows the estimates of the coefficients of the model obtained using the OLS estimator: β̂_OLS = (X'X)^(-1) X'Y.

The third column shows the standard errors of the OLS estimators of the coefficients. These standard errors are the square roots of the elements on the diagonal of the estimated covariance matrix V̂(β̂) = σ̂² (X'X)^(-1), where the estimator of the variance of the error term is given by σ̂² = SSR/(N - k).

Estimation results. Gretl output

The last two columns contain the t-ratios and the p-values. This information is useful to test the individual significance of the regressors (see Lesson 6).

Estimation results. Gretl output

At the bottom of the estimation output appears some information of interest: the sum of squared residuals, the coefficient of determination, and the performance criteria explained in this lesson (adjusted R², log-likelihood, Akaike criterion, Schwarz criterion and Hannan-Quinn criterion).

Estimation results. Gretl output

    Mean dependent var = (1/N) Σ_{i=1}^{N} pizza_i

    S.D. dependent var = sqrt[ (1/(N - 1)) Σ_{i=1}^{N} (pizza_i - mean(pizza))² ]

    S.E. of regression = σ̂ = sqrt[ SSR/(N - k) ]
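A sketch of these three summary statistics on simulated pizza-style data (NumPy assumed; the data are illustrative, not the pizza.gdt sample):

```python
import numpy as np

rng = np.random.default_rng(6)
N, k = 40, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, k - 1))])
pizza = X @ np.array([30.0, 2.0, -1.0]) + rng.normal(scale=5.0, size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ pizza)
u_hat = pizza - X @ beta_hat
SSR = u_hat @ u_hat

mean_dep = pizza.mean()              # "Mean dependent var"
sd_dep = pizza.std(ddof=1)           # "S.D. dependent var" (divides by N - 1)
se_reg = np.sqrt(SSR / (N - k))      # "S.E. of regression" = sigma_hat
print(mean_dep, sd_dep, se_reg)
```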

Estimation results. Gretl output

The values of F(2, 37) and the corresponding P-value(F) are useful to test the joint significance of the regressors (see Lesson 6).

Estimation results. Gretl output

The file chicken.gdt contains annual time series data from 1990 to 2012 on:

- Y: annual chicken consumption (in kilograms)
- X2: per capita real disposable income (in euros)
- X3: price of chicken (in euros/kilogram)
- X4: price of pork (in euros/kilogram)
- X5: price of beef (in euros/kilogram)
- Avian flu epidemic

The factors that determine the evolution of chicken consumption will be analysed in detail in Example 5.2. In this section, a very simple model of chicken consumption will be estimated in order to explain the Gretl estimation output, focusing on the specific results for time series data.

Estimation results

Consider the regression model:

    Y_t = β_1 + β_2 X2_t + β_3 X3_t + β_4 X4_t + u_t,   t = 1, ..., T

where:

- Y: dependent variable (chicken consumption)
- X2, X3, X4: quantitative explanatory variables
- β_1, β_2, β_3, β_4: unknown coefficients
- u: error term
- T: sample size
- t: index

Estimation results. Gretl output

Model 1: OLS, using observations 1990-2012 (T = 23)
Dependent variable: Y

              Coefficient   Std. Error   t-ratio   p-value
    const          ...          ...         ...       ...
    X4             ...          ...         ...       ...
    X2             ...          ...         ...       ...
    X3             ...          ...         ...       ...

    Mean dependent var  ...    S.D. dependent var  ...
    Sum squared resid   ...    S.E. of regression  ...
    R²                  ...    Adjusted R²         ...
    F(3, 19)            ...    P-value(F)          1.28e-11
    Log-likelihood      ...    Akaike criterion    ...
    Schwarz criterion   ...    Hannan-Quinn        ...
    ρ̂                   ...    Durbin-Watson       ...

Estimation results. Gretl output

Model 1: OLS, using observations 1990-2012 (T = 23)
Dependent variable: Y

If the sample consists of time series data, the heading of the estimation output shows the sample frequency:

- Annual data: 1990-2012 (T = 23), that is, 23 annual observations
- Quarterly data: 1950:1-1955:3 (T = 23)
- Monthly data: 1980:01-1981:11 (T = 23)
- Daily data: shown as a range of dates, e.g. (T = 23)

Estimation results. Gretl output

Note that while the regression model is written as:

    Y_t = β_1 + β_2 X2_t + β_3 X3_t + β_4 X4_t + u_t

the list of regressors in the estimation table is: const, X4, X2, X3. This is because the regressors appear in the table in the same order in which they were introduced in the Gretl specification window (see Example 5.11). This fact has to be taken into account when reading the results: the estimate reported in the X2 row is β̂_2 and the one in the X4 row is β̂_4.

Estimation results. Gretl output

    ρ̂ = Σ_{t=2}^{T} û_t û_{t-1} / Σ_{t=2}^{T} û²_{t-1}

    Durbin-Watson = Σ_{t=2}^{T} (û_t - û_{t-1})² / Σ_{t=1}^{T} û_t²

These results appear only when the data set structure is time series. The Durbin-Watson autocorrelation test will be studied in Lesson 7.
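A sketch of both statistics on a vector of residuals (the residuals below are a simulated stand-in, not taken from the chicken model; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(7)
u_hat = rng.normal(size=23)   # stand-in for the OLS residuals of a time-series fit

# rho_hat: first-order sample autocorrelation coefficient of the residuals
rho_hat = np.sum(u_hat[1:] * u_hat[:-1]) / np.sum(u_hat[:-1] ** 2)

# Durbin-Watson statistic
DW = np.sum(np.diff(u_hat) ** 2) / np.sum(u_hat ** 2)
print(rho_hat, DW)
```

DW always lies in [0, 4] and is approximately 2(1 - ρ̂), so values near 2 suggest no first-order autocorrelation in the residuals.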


Presentation of results

The estimation results are usually summarized by writing down the Sample Regression Function along with the standard errors of the estimators, the coefficient of determination and the residual sum of squares.

Summary:

    pizza-hat_i = ... + ... income_i + ... age_i,   i = 1, ..., 40
                 (72.343)  (0.46430)    (2.3170)

    R² = 0.329    SSR = ...

    (standard errors in parentheses)

Presentation of results

If we are working with time series data, the same information is presented, adding the value of the Durbin-Watson statistic (see Lesson 7).

Summary:

    Ŷ_t = ... + ... X4_t + ... X2_t + ... X3_t,   t = 1990, ..., 2012
         (...)  (15501)    (...)      (391865)

    R² = ...    SSR = ...    DW = ...

    (standard errors in parentheses)


