OLS, MLE and related topics. Primer.


1 OLS, MLE and related topics. Primer. Katarzyna Bech, Week 1.

2 Classical Linear Regression Model (CLRM)
The model: $y = X\beta + \varepsilon$, and the assumptions:
A1 The true model is $y = X\beta + \varepsilon$.
A2 $E(\varepsilon) = 0$.
A3 $\mathrm{Var}(\varepsilon) = E(\varepsilon\varepsilon') = \sigma^2 I_n$.
A4 $X$ is a non-stochastic $n \times k$ matrix with rank $k \le n$.

3 Least Squares Estimation
The OLS estimator $\hat\beta$ minimizes the sum of squares:
$$\hat\beta = \arg\min_{\beta} RSS(\beta), \qquad RSS(\beta) = \varepsilon'\varepsilon = (y - X\beta)'(y - X\beta).$$
Using the rules of matrix calculus, the FOC is
$$\frac{\partial RSS(\beta)}{\partial \beta} = \frac{\partial (y - X\beta)'(y - X\beta)}{\partial \beta} = -2X'y + 2X'X\beta = 0,$$
and the SOC is
$$\frac{\partial^2 RSS(\beta)}{\partial \beta\,\partial \beta'} = 2X'X,$$
which is clearly positive definite.

4 Least Squares Estimation
From the FOC: $\hat\beta = (X'X)^{-1}X'y$.
Properties of the OLS estimator:
$$E[\hat\beta] = \beta + (X'X)^{-1}X'E[\varepsilon] = \beta,$$
hence it is unbiased, and
$$\mathrm{Var}[\hat\beta] = E\left[(\hat\beta - \beta)(\hat\beta - \beta)'\right] = \sigma^2(X'X)^{-1}.$$
Gauss-Markov Theorem: under A1-A4, $\hat\beta_{OLS}$ is BLUE.

5 Gauss-Markov Theorem: Proof
Consider again the model $y = X\beta + \varepsilon$. Let $\tilde\beta = Cy$, where $C$ is a $k \times n$ matrix, be another linear unbiased estimator. For $\tilde\beta$ to be unbiased we require
$$E(\tilde\beta) = E(Cy) = E(CX\beta + C\varepsilon) = CX\beta = \beta.$$
Thus, we need $CX = I_k$, so that $\tilde\beta = CX\beta + C\varepsilon = \beta + C\varepsilon$. We have
$$\mathrm{Var}(\tilde\beta) = E\left((\tilde\beta - \beta)(\tilde\beta - \beta)'\right) = E(C\varepsilon\varepsilon'C') = C\,E(\varepsilon\varepsilon')\,C' = \sigma^2 CC'.$$

6 Gauss-Markov Theorem: Proof
We want to show that $\mathrm{Var}(\tilde\beta) - \mathrm{Var}(\hat\beta) \geq 0$. Using $CX = I_k$ (and hence $X'C' = I_k$),
$$\mathrm{Var}(\tilde\beta) - \mathrm{Var}(\hat\beta) = \sigma^2\left(CC' - (X'X)^{-1}\right) = \sigma^2\left(CC' - CX(X'X)^{-1}X'C'\right)$$
$$= \sigma^2\left(CI_nC' - CX(X'X)^{-1}X'C'\right) = \sigma^2 C\left(I_n - X(X'X)^{-1}X'\right)C' = \sigma^2 CM_XC' = \sigma^2 CM_XM_X'C' = \sigma^2 DD',$$
where $D = CM_X$. $DD'$ is positive semidefinite, so $\sigma^2 DD' \geq 0$, implying that $\mathrm{Var}(\tilde\beta) - \mathrm{Var}(\hat\beta) \geq 0$.

7 Gauss-Markov Theorem: Proof
The latter result also applies to any linear combination of the elements of $\beta$, i.e. $\mathrm{Var}(c'\tilde\beta) - \mathrm{Var}(c'\hat\beta) \geq 0$. We thus have the following corollaries.
Corollary. Under A1-A4, for any vector of constants $c$, the minimum variance linear unbiased estimator of $c'\beta$ in the classical regression model is $c'\hat\beta$, where $\hat\beta$ is the least squares estimator.
Corollary. Each coefficient $\beta_j$ is estimated at least as efficiently by $\hat\beta_j$ as by any other linear unbiased estimator.

8 Least Squares Estimation
Clearly, we also need to estimate $\sigma^2$. The most obvious estimator to take is
$$\hat\sigma^2 = \frac{(y - X\hat\beta)'(y - X\hat\beta)}{n} = \frac{y'M_Xy}{n},$$
where $M_X = I - P_X = I - X(X'X)^{-1}X'$. BUT $\hat\sigma^2$ is biased, so we have to define an unbiased estimator
$$s^2 = \frac{n}{n-k}\,\hat\sigma^2 = \frac{(y - X\hat\beta)'(y - X\hat\beta)}{n - k}.$$
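To make slides 4 and 8 concrete, here is a minimal NumPy sketch (my own illustration, not from the slides; the simulated design and parameter values are invented) computing $\hat\beta$ from the normal equations and the unbiased $s^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # first column: intercept
beta_true = np.array([1.0, 2.0, -0.5])                          # illustrative values
y = X @ beta_true + rng.normal(scale=1.5, size=n)

# OLS from the normal equations X'X beta = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Unbiased error-variance estimator s^2 = e'e / (n - k)
e = y - X @ beta_hat
s2 = e @ e / (n - k)
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))  # conventional standard errors

print(beta_hat, s2, se)
```

Solving the normal equations with np.linalg.solve avoids forming an explicit inverse, which is numerically preferable to applying the textbook formula literally.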

9 How is econometrics typically taught?
What are the consequences of violating assumptions A1-A4?

10 Perfect Multicollinearity
Let's see what happens when A4 is violated, i.e. $\mathrm{rank}(X) < k$ (equivalently, the columns of $X$ are linearly dependent). In this case $X'X$ is not invertible (it is singular), so the ordinary least squares estimator cannot be computed. The parameters of such a regression model are unidentified.
The use of too many dummy variables (the dummy variable trap) is a typical cause of exact multicollinearity. Consider for instance the case where we would like to estimate a wage equation including a dummy for males ($MALE_i$), a dummy for females ($FEMALE_i$) as well as a constant (note that $MALE_i + FEMALE_i = 1$ for all $i$).
Exact multicollinearity is easily solved by excluding one of the variables from the model.

11 "Near" Multicollinearity Imperfect multicollinearity arises when two or more regressors are highly correlated, in the sense that there is a linear function of regressors which is highly correlated with another regressor. When regressors are highly correlated, it becomes di cult to disentangle the separate e ects of the regressors on the dependent variable. Near multicollinearity is not a violation of the classical linear regression assumptions. So our OLS estimates are still the best linear unbiased estimators. Best, just simply it might not be all that good. () Week 1 11 / 88

12 "Near" Multicollinearity The higher the correlation between the regressors becomes, the less precise our estimates will be (note that Var( ˆβ) = σ 2 (X 0 X ) 1 ). Consider y i = α + x i1 β 1 + x i2 β 2 + ɛ i. Var( ˆβ j ) = σ 2 n. (x ij x j ) 2 (1 rx 2 1,x 2 ) i=1 Obviously, as r 2 x 1,x 2! 1, the variance increases. () Week 1 12 / 88

13 "Near" Multicollinearity Practical consequences: Although OLS estimators are still BLUE they have large variances and covariances. Individual t-tests may fail to reject that coe cients are 0, even though they are jointly sign cant. Parameter estimates may be very sensitive to one or a small number of observations. Coe cients may have the wrong sign or implausible magnitude. How to detect multicollinearity? High R 2 but few signi cant t-ratios. High pairwise correlation among explanatory variables. How to deal with multicollinearity? A priori information. Dropping a variable. Transformation of variables. () Week 1 13 / 88

14 Misspecifying the Set of Regressors (violation of A1)
We consider two issues:
- omission of a relevant variable (e.g., due to oversight or lack of measurement);
- inclusion of irrelevant variables.
Preview of the results: in the first case, the estimator is generally biased; in the second case, there is no bias but the estimator is inefficient.

15 Omitted Variable Bias
Consider the following two models. The true model:
$$y = X\beta + Z\delta + \varepsilon, \quad E(\varepsilon) = 0, \quad E(\varepsilon\varepsilon') = \sigma^2 I \tag{1}$$
and the misspecified model which we estimate:
$$y = X\beta + v. \tag{2}$$
The OLS estimator of $\beta$ in (2) is
$$\tilde\beta = (X'X)^{-1}X'y.$$

16 Omitted Variable Bias
In order to assess the statistical properties of the latter, we have to substitute the true model (1) for $y$. We obtain
$$\tilde\beta = (X'X)^{-1}X'(X\beta + Z\delta + \varepsilon) = \beta + (X'X)^{-1}X'Z\delta + (X'X)^{-1}X'\varepsilon.$$
Taking expectations (recall that $X$ and $Z$ are non-stochastic), we have
$$E(\tilde\beta) = \beta + (X'X)^{-1}X'Z\delta + (X'X)^{-1}X'E(\varepsilon) = \beta + (X'X)^{-1}X'Z\delta.$$
Thus, generally, $\tilde\beta$ is biased. Interpretation of the bias term: $(X'X)^{-1}X'Z$ collects the regression coefficient(s) of the omitted variable(s) on all included variables, and $\delta$ is the true coefficient corresponding to the omitted variable(s).

17 Omitted Variable Bias
We can interpret $E(\tilde\beta)$ as the sum of two terms:
- $\beta$ is the direct change in $E(y)$ associated with changes in $X$;
- $(X'X)^{-1}X'Z\delta$ is the indirect change in $E(y)$ associated with changes in $Z$ ($X$ partially acts as a proxy for $Z$ when $Z$ is omitted).
There would be no omitted variable bias if $X$ and $Z$ were orthogonal, i.e. $X'Z = 0$. Trivially, there would also be no bias if $\delta = 0$.

18 Example
Consider the standard human capital earnings function
$$\ln w_i = \alpha + \beta_1 school_i + \beta_2 exp_i + v_i \tag{3}$$
where we are interested in $\beta_1$, the return to schooling. We suspect that the relevant independent variable $ability_i$ is omitted. What is the likely direction of the bias?
Let $\hat\nu$ denote the OLS coefficient corresponding to schooling in the regression of $ability_i$ on a constant, $school_i$ and $exp_i$. $\hat\nu$ is likely to be positive (note that we are not claiming a causal effect). On the other hand, consider the true model (which cannot be used since $ability_i$ is not observed):
$$\ln w_i = \alpha + \beta_1 school_i + \beta_2 exp_i + \delta\,ability_i + v_i.$$
The true coefficient $\delta$ is likely to be positive.

19 Example
Hence, the direction of the bias of $\hat\beta_1$ when (3) is estimated is given by $\hat\nu \cdot \delta = (+)\cdot(+) = +$. The return to schooling tends to be overestimated.
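A short simulation sketch (my own, with invented numbers) reproduces this sign prediction: with ability positively related to both schooling and wages, the short regression's schooling coefficient comes out above the true value.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 2000
slopes = []
for _ in range(reps):
    ability = rng.normal(size=n)
    school = 0.6 * ability + rng.normal(size=n)     # schooling correlated with ability
    logw = 1.0 + 0.10 * school + 0.30 * ability + rng.normal(scale=0.5, size=n)
    X = np.column_stack([np.ones(n), school])       # ability omitted from the regression
    slopes.append(np.linalg.solve(X.T @ X, X.T @ logw)[1])

print("true beta1 = 0.10, mean OLS estimate =", np.mean(slopes))  # biased upward
```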

20 Variance
Taking into account that $X$ and $Z$ are non-stochastic and $E(\varepsilon) = 0$,
$$\mathrm{Var}(\tilde\beta) = E\left((\tilde\beta - E(\tilde\beta))(\tilde\beta - E(\tilde\beta))'\right) = \sigma^2(X'X)^{-1}.$$
Important: $s^2$ is a biased estimate of the error variance of the true regression. Indeed, $s^2 = \hat v'\hat v/(n - k_x)$, where $\hat v = y - X\tilde\beta = M_Xy$. By substituting the true model (1) for $y$, we get
$$\hat v = M_X(X\beta + Z\delta + \varepsilon) = M_X(Z\delta + \varepsilon).$$
Thus (show it)
$$E(s^2) = \sigma^2 + \frac{\delta'Z'M_XZ\delta}{n - k_x} \neq \sigma^2.$$
This implies that the t-statistics and F-statistics are invalid.

21 Summary of the consequences of omitting a relevant variable
- If the omitted variable(s) are correlated with the included variable(s), the parameter estimates are biased.
- The disturbance variance $\sigma^2$ is incorrectly estimated.
- As a result, the usual confidence intervals and hypothesis testing procedures are likely to give misleading conclusions.

22 Irrelevant Variables
Suppose that the correct model is
$$y = X\beta + \varepsilon \tag{4}$$
but we instead choose to estimate the bigger model
$$y = X\beta + Z\delta + v. \tag{5}$$
The OLS estimator of $\beta$ in (5) is
$$\tilde\beta = (X'M_ZX)^{-1}X'M_Zy.$$

23 Irrelevant Variables
In order to assess the statistical properties of the latter, we have to substitute the true model (4) for $y$. We obtain
$$\tilde\beta = (X'M_ZX)^{-1}X'M_Z(X\beta + \varepsilon) = \beta + (X'M_ZX)^{-1}X'M_Z\varepsilon.$$
Taking expectations, we have
$$E(\tilde\beta) = \beta + (X'M_ZX)^{-1}X'M_ZE(\varepsilon) = \beta.$$
Thus, $\tilde\beta$ is unbiased. However, it can be shown that
$$\mathrm{Var}(\tilde\beta) \geq \sigma^2(X'X)^{-1},$$
where $\sigma^2(X'X)^{-1}$ is the variance of the correct OLS estimator.

24 Summary of the consequences of including an irrelevant variable
OLS estimators of the parameters of the "incorrect" model are unbiased, but less precise (larger variance). Because the price for omitting a relevant variable is so much higher than the price for including irrelevant ones, many econometric theorists have suggested a "general to specific" modelling strategy. It entails, basically, initially including every regressor that may be suspected of being relevant; then a combination of $R^2$, F and t tests is used to eliminate the least significant regressors, hopefully leading to a "correct" model.

25 Ramsey's RESET test
The most commonly used test for checking the specification of the mean function is Ramsey's RESET test. Let $\hat y_i = x_i'\hat\beta$ be the fitted values of the OLS regression. The RESET procedure is to run the augmented regression
$$y_i = x_i'\beta + \gamma_1(\hat y_i)^2 + \gamma_2(\hat y_i)^3 + \ldots + \gamma_q(\hat y_i)^{q+1} + v_i$$
and perform an F test of the hypothesis
$$H_0: \gamma_1 = \gamma_2 = \ldots = \gamma_q = 0.$$
If we cannot reject $H_0$, the model is taken to be correctly specified.
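A compact sketch of the procedure (my own helper; the function name and interface are made up) via the standard restricted/unrestricted F statistic:

```python
import numpy as np
from scipy import stats

def reset_test(y, X, q=2):
    """Ramsey RESET sketch: add powers of the fitted values, F-test their coefficients."""
    n, k = X.shape
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    rss0 = np.sum((y - yhat) ** 2)                                   # restricted RSS
    Xa = np.column_stack([X] + [yhat ** (p + 2) for p in range(q)])  # add yhat^2, ..., yhat^{q+1}
    ea = y - Xa @ np.linalg.lstsq(Xa, y, rcond=None)[0]
    rss1 = np.sum(ea ** 2)                                           # unrestricted RSS
    F = ((rss0 - rss1) / q) / (rss1 / (n - k - q))
    return F, stats.f.sf(F, q, n - k - q)                            # reject H0 for small p-values
```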

26 Modelling Strategy
Approaches to model building: "general to specific" and "simple to general". The motivating factor when building a model should be ECONOMIC THEORY.
How do we judge a model to be "good"?
- Parsimony
- Identifiability
- Goodness of fit
- Theoretical consistency
- Predictive power

27 Selecting regressors
It is good practice to select the set of potentially relevant variables on the basis of economic arguments rather than statistical ones. There is always a small (but not ignorable) probability of drawing the wrong conclusion; for example, there is always a probability of rejecting the null hypothesis that a coefficient is zero while the null is actually true. Such type I errors are rather likely to happen if we use a sequence of many tests to select the regressors to include in the model. This process is referred to as data mining.
In presenting your estimation results, it is not a mistake to have insignificant variables included in your specification. Of course, you should be careful with including many variables in your model that are multicollinear, so that, in the end, almost none of the variables appear individually significant.

28 Spherical Disturbances
Assumption A3 of the Classical Linear Regression Model ensures that the variance-covariance matrix of the errors is
$$E(\varepsilon\varepsilon') = \sigma^2 I_n.$$
This implies two things:
- $E(\varepsilon_i^2) = \sigma^2$ for all $i$, i.e. homoskedasticity;
- $E(\varepsilon_i\varepsilon_j) = 0$ if $i \neq j$, i.e. no serial correlation.
Naturally, testing for heteroskedasticity and for serial correlation forms a very important part of the econometrics of the linear model. Before we do that, however, we need to look at the sources of what the textbooks call non-spherical errors, and at what the effects might be on our model.

29 Generalized Linear Regression Model
The Generalized Linear Regression Model is just the Classical Linear Regression Model, but with non-spherical disturbances (the covariance matrix is no longer proportional to the identity matrix). We consider the model $y = X\beta + \varepsilon$, where
$$E(\varepsilon) = 0 \quad \text{and} \quad E(\varepsilon\varepsilon') = \Sigma,$$
and $\Sigma$ is any symmetric positive definite matrix.

30 Non-spherical disturbances
We are interested in two different forms of non-spherical disturbances:
- Heteroscedasticity, with $E(\varepsilon_i^2) = \sigma_i^2$ and $E(\varepsilon_i\varepsilon_j) = 0$ if $i \neq j$, so that each observation has a different variance. In this case,
$$\Sigma = \sigma^2\Omega = \begin{pmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{pmatrix} = \mathrm{diag}\{\sigma_1^2, \ldots, \sigma_n^2\}.$$
Note that $\Omega$ is a diagonal matrix of weights $\omega_i$; sometimes it is convenient to write $\sigma_i^2 = \sigma^2\omega_i$.
- Autocorrelation, with $E(\varepsilon_i^2) = \sigma^2$ and $E(\varepsilon_i\varepsilon_j) \neq 0$ if $i \neq j$. Autocorrelation is typically found in time series data.

31 Heteroscedasticity: graphical intuition
When data are homoscedastic we expect something like: [figure omitted: scatter with a constant spread of points around the regression line]

32 Heteroscedasticity: graphical intuition
With heteroscedasticity, we expect: [figure omitted: scatter whose spread around the regression line changes across the regressor]

33 Consequences for OLS estimation
Let's examine the statistical properties of the OLS estimator when $E(\varepsilon\varepsilon') = \sigma^2\Omega \neq \sigma^2 I_n$.
Unbiasedness: $\hat\beta$ is still unbiased, since $\hat\beta = \beta + (X'X)^{-1}X'\varepsilon$ and $E(\varepsilon) = 0$, so we conclude that $E(\hat\beta) = \beta$.
Efficiency: the variance of $\hat\beta$ changes to
$$\mathrm{Var}(\hat\beta) = E\left((\hat\beta - \beta)(\hat\beta - \beta)'\right) = (X'X)^{-1}X'E(\varepsilon\varepsilon')X(X'X)^{-1} = \sigma^2(X'X)^{-1}X'\Omega X(X'X)^{-1}.$$
The consequence is that the standard OLS estimated variances and standard errors (computed by statistical packages) are biased, since they are based on the wrong formula $\sigma^2(X'X)^{-1}$. Standard tests are not valid.

34 Consequences for OLS estimation
Since one of the Gauss-Markov assumptions is violated ($E(\varepsilon\varepsilon') \neq \sigma^2 I_n$), the OLS estimator of such models is no longer BLUE (although, remember, it is still unbiased). Possible remedies:
- Construct another estimator which is BLUE: we will call it Generalized Least Squares (GLS).
- Stick to OLS, but compute the correct estimated variance.

35 GLS
The idea is to transform the model $y = X\beta + \varepsilon$ so that the transformed model satisfies the Gauss-Markov assumptions. We assume for the time being that $\Omega$ is known (a rather unrealistic assumption).
Property. Since $\Omega$ is positive definite and symmetric, there exists a square, nonsingular matrix $P$ such that $P'P = \Omega^{-1}$.
Sketch of proof (spectral decomposition): since $\Omega$ is symmetric, there exist $C$ and $\Lambda$ such that $\Omega = C\Lambda C'$, $C'C = I_n$. Because $\Omega$ is positive definite (a positive definite matrix is always nonsingular):
$$\Omega^{-1} = C\Lambda^{-1}C' = P'P \quad \text{with} \quad P' = C\Lambda^{-1/2}C'.$$

36 GLS
We can transform the model $y = X\beta + \varepsilon$ by premultiplying it by the matrix $P$:
$$Py = PX\beta + P\varepsilon, \quad \text{i.e.} \quad y^* = X^*\beta + \varepsilon^*. \tag{6}$$
This transformed model satisfies the Gauss-Markov assumptions, since
$$E(\varepsilon^*) = E(P\varepsilon) = PE(\varepsilon) = 0$$
and
$$\mathrm{Var}(\varepsilon^*) = E(\varepsilon^*\varepsilon^{*\prime}) = E(P\varepsilon\varepsilon'P') = PE(\varepsilon\varepsilon')P' = \sigma^2 P\Omega P' = \sigma^2 P(P'P)^{-1}P' = \sigma^2 PP^{-1}(P')^{-1}P' = \sigma^2 I_n.$$

37 GLS
Hence, the OLS estimator of the transformed model (6) is BLUE:
$$\hat\beta_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'P'PX)^{-1}X'P'Py = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y.$$
It is easy to verify that $E[\hat\beta_{GLS}] = \beta$ and
$$\mathrm{Var}[\hat\beta_{GLS}] = \sigma^2(X'\Omega^{-1}X)^{-1} = (X'\Sigma^{-1}X)^{-1}.$$
(I leave it to you as an exercise.)

38 GLS
Note: the GLS estimator we have just discussed is useful in any general case of non-spherical disturbances. The general formula is
$$\hat\beta_{GLS} = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y.$$
Now we will specialize it to the heteroscedastic case, i.e. when $\Sigma = \mathrm{diag}\{\sigma_1^2, \ldots, \sigma_n^2\}$.

39 Unfeasible GLS
We consider again the model $y_i = x_i'\beta + \varepsilon_i$, with $\mathrm{Var}(\varepsilon_i) = \sigma_i^2$. Thus, $\mathrm{Var}(\varepsilon) = \mathrm{diag}\{\sigma_1^2, \ldots, \sigma_n^2\}$. In this setting, the transformation $P$ is given by
$$P = \mathrm{diag}\left\{\frac{1}{\sigma_1}, \ldots, \frac{1}{\sigma_n}\right\}.$$
It is easy to verify that $(P'P)^{-1} = \Omega = \Sigma$ when we set $\sigma^2 = 1$.

40 Unfeasible GLS
The transformed model is given by
$$\frac{y_i}{\sigma_i} = \frac{x_i'}{\sigma_i}\beta + \frac{\varepsilon_i}{\sigma_i}.$$
Thus,
$$\hat\beta_{GLS} = \left(\sum_{i=1}^n \frac{x_ix_i'}{\sigma_i^2}\right)^{-1}\sum_{i=1}^n \frac{x_iy_i}{\sigma_i^2},$$
which is also called "weighted least squares". This approach is only available if $\sigma_i^2$ is known.
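In code, the weighted least squares formula above is one line of linear algebra. A minimal sketch (the function name and interface are my own, not from the slides):

```python
import numpy as np

def wls(y, X, sigma2):
    """Weighted least squares with known error variances sigma2 (length-n array)."""
    w = 1.0 / sigma2
    XtWX = X.T @ (w[:, None] * X)   # sum of x_i x_i' / sigma_i^2
    XtWy = X.T @ (w * y)            # sum of x_i y_i / sigma_i^2
    return np.linalg.solve(XtWX, XtWy)
```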

41 Feasible GLS
Problem: $\sigma_i^2$ is not known in general. Therefore, the latter estimator is unfeasible. We need to use a feasible version of this estimator, which means we replace the unknown $\sigma_i^2$ with sensible estimates:
$$\hat\beta_{GLS} = \left(\sum_{i=1}^n \frac{x_ix_i'}{\hat\sigma_i^2}\right)^{-1}\sum_{i=1}^n \frac{x_iy_i}{\hat\sigma_i^2}.$$

42 Feasible GLS
The main issue is that we need to know the form of the heteroscedasticity in order to estimate $\sigma_i^2$. The simplest situation, for instance, would be if we had a conjecture that $\sigma_i^2 = z_i'a$, where the components of $z_i$ are observable. In such a case, a consistent estimator of $\sigma_i^2$ would be the fitted values of the artificial (so-called skedastic) regression
$$\hat\varepsilon_i^2 = z_i'a + v_i,$$
where $\hat\varepsilon_i$ are the residuals of the original regression when estimated by standard OLS. We can accommodate more complicated functional forms for the heteroscedasticity within this idea (e.g. $\sigma_i^2 = \exp(z_i'a)$).
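A sketch of feasible GLS under the exponential skedastic form mentioned above (an assumed specification; the function name is made up). Modelling $\log\hat\varepsilon_i^2$ keeps the fitted variances positive:

```python
import numpy as np

def fgls_exp(y, X, Z):
    """Feasible GLS sketch under the assumed form sigma_i^2 = exp(z_i'a)."""
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]     # OLS residuals of the original model
    a = np.linalg.lstsq(Z, np.log(e**2), rcond=None)[0]  # skedastic regression (residuals assumed nonzero)
    w = 1.0 / np.exp(Z @ a)                              # estimated 1/sigma_i^2, positive by construction
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
```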

43 White robust standard errors
Rather than estimating $\beta$ by GLS, we can stick to OLS and "correct" the standard errors. We have seen that
$$\mathrm{Var}[\hat\beta_{OLS}] = (X'X)^{-1}X'E(\varepsilon\varepsilon')X(X'X)^{-1} = (X'X)^{-1}X'\Sigma X(X'X)^{-1},$$
and in vector form
$$\mathrm{Var}[\hat\beta_{OLS}] = \left(\sum_{i=1}^n x_ix_i'\right)^{-1}\left(\sum_{i=1}^n \sigma_i^2x_ix_i'\right)\left(\sum_{i=1}^n x_ix_i'\right)^{-1}.$$
We do not know $\sigma_i^2$, but under general conditions White (1980) showed that the matrix
$$\frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i^2x_ix_i',$$
where again $\hat\varepsilon_i$ are the residuals of the OLS estimation, is a consistent estimator of $\frac{1}{n}\sum_{i=1}^n \sigma_i^2x_ix_i'$.
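The resulting sandwich estimator is straightforward to compute. A minimal sketch (my own helper, not from the slides):

```python
import numpy as np

def white_se(y, X):
    """OLS with heteroscedasticity-robust (White) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    e = y - X @ beta
    meat = (X * e[:, None] ** 2).T @ X      # sum of e_i^2 x_i x_i'
    V = XtX_inv @ meat @ XtX_inv            # sandwich form
    return beta, np.sqrt(np.diag(V))
```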

44 Detection of heteroscedasticity
Informal methods: the graphical method.
Formal methods:
- Goldfeld-Quandt test
- Breusch-Pagan/Godfrey test
- White's general heteroscedasticity test

45 Goldfeld-Quandt test
Suppose that we think the heteroscedasticity is particularly related to one of the variables, i.e. one of the columns of $X$, say $x_j$. The Goldfeld-Quandt test is based on the following:
1. Order the observations according to the values of $x_j$, starting with the lowest.
2. Split the sample into three parts, of lengths $n_1$, $c$ and $n_2$.
3. Obtain the residuals $\hat\varepsilon_1$ and $\hat\varepsilon_2$ from regressions using the first $n_1$, and then the last $n_2$, observations separately.
4. Test $H_0: \sigma_i^2 = \sigma^2$ for all $i$ using the statistic
$$F = \frac{\hat\varepsilon_1'\hat\varepsilon_1/(n_1 - k)}{\hat\varepsilon_2'\hat\varepsilon_2/(n_2 - k)},$$
with critical values obtained from the $F(n_1 - k, n_2 - k)$ distribution.
Practical advice: use this test for relatively small sample sizes (up to $n = 100$); with $n = 30$ choose $c = 8$, with $n = 60$ choose $c = 16$.

46 Breusch-Pagan/Godfrey test
Suppose that the source of heteroscedasticity is that $\sigma_i^2 = g(\alpha_0 + \tilde z_i'\alpha)$, where $\tilde z_i$ is the $q \times 1$ vector of regressors for observation $i$, i.e. row $i$ of a matrix $Z$, which is $n \times q$. $Z$ can contain some or all of the $X$'s in our CLRM, if desired, although ideally the choice of $Z$ is made on the basis of some economic theory. For computational purposes $Z$ does not contain a constant term.
This test involves an auxiliary regression
$$v = \tilde Z\gamma + u, \quad \tilde Z = M_0Z, \quad \text{where } M_0 = I - \frac{ee'}{e'e} \text{ with } e = (1, 1, \ldots, 1)'.$$
Here $u$ is a vector of iid errors and the dependent variable is defined by
$$v_i = \frac{\hat\varepsilon_i^2}{(\hat\varepsilon'\hat\varepsilon)/n} - 1.$$

47 Breusch-Pagan/Godfrey test
The test corresponds to an F test of the joint restrictions $H_0: \gamma = 0$ in the auxiliary regression, and the test statistic is given by
$$BP = \frac{1}{2}\,v'\tilde Z(\tilde Z'\tilde Z)^{-1}\tilde Z'v,$$
which follows a $\chi^2(q)$ distribution in large samples.
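A direct implementation sketch of the formula above (function name invented; the demeaning of $Z$ plays the role of $M_0Z$):

```python
import numpy as np
from scipy import stats

def breusch_pagan(y, X, Z):
    """Breusch-Pagan/Godfrey sketch: BP = v'Z~(Z~'Z~)^{-1}Z~'v / 2, asy. chi2(q)."""
    n = len(y)
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    v = e**2 / (e @ e / n) - 1.0            # standardized squared residuals minus 1
    Zt = Z - Z.mean(axis=0)                 # demeaned regressors, i.e. M0 Z
    proj = Zt @ np.linalg.solve(Zt.T @ Zt, Zt.T @ v)
    BP = 0.5 * (v @ proj)
    return BP, stats.chi2.sf(BP, Z.shape[1])
```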

48 White test
For simplicity, assume that the data follow $y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + \varepsilon_i$.
1. Run your regression and save the residuals (denoted $\hat\varepsilon_i$, as usual).
2. Run an auxiliary (artificial) regression with the squared residuals as the dependent variable, where the explanatory variables include all the explanatory variables from step 1 plus their squares and cross products:
$$\hat\varepsilon_i^2 = \alpha_0 + \alpha_1x_{1i} + \alpha_2x_{2i} + \alpha_3x_{1i}^2 + \alpha_4x_{2i}^2 + \alpha_5x_{1i}x_{2i} + v_i.$$
3. Obtain $R^2$ from the auxiliary regression in step 2.
4. Construct the test statistic $nR^2 \overset{asy.}{\sim} \chi^2(k - 1)$, where $k$ is the number of parameters estimated in the auxiliary regression, so $k = 6$ in this example.
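A sketch of these four steps for the two-regressor model above (my own helper; the function name is made up):

```python
import numpy as np
from scipy import stats

def white_test(y, x1, x2):
    """White's nR^2 test sketch for y = b0 + b1*x1 + b2*x2 + e."""
    n = len(y)
    X = np.column_stack([np.ones(n), x1, x2])
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]      # step 1: residuals
    A = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
    e2 = e**2
    fit = A @ np.linalg.lstsq(A, e2, rcond=None)[0]       # step 2: auxiliary regression
    R2 = 1 - np.sum((e2 - fit) ** 2) / np.sum((e2 - e2.mean()) ** 2)  # step 3
    stat = n * R2                                         # step 4
    return stat, stats.chi2.sf(stat, A.shape[1] - 1)      # chi2 with k - 1 = 5 df here
```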

49 White test
Note that $H_0: \mathrm{Var}(\varepsilon_i) = \sigma^2$ vs. $H_1$: heteroscedasticity. Thus, if the calculated value of the test statistic exceeds the critical chi-square value at the chosen level of significance, we reject the null that the error has a constant variance. The White test is a large sample test, so it is expected to work well only when the sample is sufficiently large.

50 Summary of heteroscedasticity
A crucial CLRM assumption is that $\mathrm{Var}(\varepsilon_i) = \sigma^2$. If $\mathrm{Var}(\varepsilon_i) = \sigma_i^2$ instead, we have heteroscedasticity. OLS estimators are then no longer BLUE: still unbiased, but not efficient.
Two remedial approaches:
- when $\sigma_i^2$ is known: weighted least squares (WLS);
- when $\sigma_i^2$ is unknown: make an educated guess about the likely pattern of the heteroscedasticity to consistently estimate $\sigma_i^2$ and use it in feasible GLS, or use White's heteroscedasticity-consistent variances and standard errors.
A number of tests are available to detect heteroscedasticity.

51 Serial correlation: the general case
Now we want to apply the results we discussed to the model $y_t = x_t'\beta + \varepsilon_t$, where $x_t$ is the usual $k \times 1$ vector, $E(\varepsilon_t) = 0$ and $E(\varepsilon_t\varepsilon_s) \neq 0$ for some $t \neq s$.
Note that autocorrelation (or serial correlation; the terms are interchangeable) is very common in time series data, for instance due to unobserved factors (omitted variables) that are correlated over time. A common form of serial correlation is the autoregressive structure.

52 Autocorrelation patterns
There are several forms of autocorrelation, each leading to a different structure for the error covariance matrix $\mathrm{Var}(\varepsilon) = \sigma^2\Omega$. We will only consider AR(1) structures here, i.e.
$$\varepsilon_t = \rho\varepsilon_{t-1} + v_t, \quad v_t \sim iid(0, \sigma_v^2), \quad |\rho| < 1.$$
It is easy to show that $E(\varepsilon_t) = 0$. We can also show that
$$E(\varepsilon\varepsilon') = \frac{\sigma_v^2}{1 - \rho^2}\begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \vdots & & & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \cdots & \rho & 1 \end{pmatrix}.$$

53 Autocorrelation patterns
As we discussed, in the case of non-spherical disturbances we can either stick to OLS and correct the standard errors, or derive GLS, which requires a feasible version of the matrix $P$ to be implemented. A suitable matrix $P$ can be derived fairly easily in the case of an AR(1) error term as
$$P = \begin{pmatrix} \sqrt{1 - \rho^2} & 0 & \cdots & 0 \\ -\rho & 1 & & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & -\rho & 1 \end{pmatrix}.$$
Rather than on the derivation of $P$, we focus on tests for detecting serial correlation.

54 Adjusting the standard errors when errors are serially correlated
When all other classical regression assumptions are satisfied (in particular, uncorrelatedness between the errors and the regressors), we may decide to use OLS, which is unbiased and consistent but inefficient. The corrected standard errors can be derived from
$$\mathrm{Var}(\hat\beta) = (X'X)^{-1}X'\Sigma X(X'X)^{-1},$$
using the Newey-West heteroskedasticity and autocorrelation consistent (HAC) estimator of $\Sigma$.

55 Testing for serial correlation
Informal checks:
- plot the residuals and check whether there is any systematic pattern;
- obtain the correlogram (graphical representation of the autocorrelations) of the residuals and check whether the calculated autocorrelations are significantly different from zero. From the correlogram you can also "guess" the structure of the autocorrelation.
Tests:
- Durbin-Watson d test
- Breusch-Godfrey test

56 Durbin-Watson d test
The first test we consider for testing $H_0: \rho = 0$ is the Durbin-Watson $d$ statistic:
$$d = \frac{\sum_{t=2}^T(\hat\varepsilon_t - \hat\varepsilon_{t-1})^2}{\sum_{t=1}^T \hat\varepsilon_t^2},$$
where $\hat\varepsilon_t$ are the OLS residuals obtained by estimating the parameters. It is possible to show that for large $T$, $d \to 2 - 2\rho$. The value of $d$ is given by any statistical package. If the value of $d$ is close to 2, we can conclude that the model is free from serial correlation. We need to establish a precise rejection rule.
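The statistic itself is a one-liner on the OLS residuals; a minimal sketch (my own helper):

```python
import numpy as np

def durbin_watson(e):
    """Durbin-Watson d from OLS residuals e; values near 2 suggest no AR(1) correlation."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```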

57 Durbin-Watson d test
Unfortunately, establishing a rejection rule for a test of $H_0$ based on the $d$ statistic is more complicated than usual, because the critical values also depend on the values of the regressors. The best we can do is to give upper and lower bounds for the critical values of $d$, $d_U$ and $d_L$, and establish the following rejection rule:
- If the computed value of $d$ is lower than $d_L$, we reject $H_0$.
- If the computed value of $d$ is greater than $d_U$, we fail to reject $H_0$.
- If the computed value of $d$ is between $d_L$ and $d_U$, we cannot conclude (the outcome is indeterminate).
$d_L$ and $d_U$ in the statistical tables are given in terms of the number of observations and the number of regressors.

58 Durbin-Watson d test
Limitations: the $d$ test is only valid when:
- the model contains an intercept;
- there are no lagged dependent variables among the regressors.

59 Breusch-Godfrey test
Suppose we have the general model
$$y_t = x_t'\beta + \varepsilon_t, \quad \text{where } \varepsilon_t = \rho\varepsilon_{t-1} + v_t, \tag{7}$$
and where one or more of the regressors can be a lagged $y$ (such as $y_{t-1}$). If we wish to test $H_0: \rho = 0$, we can:
1. Estimate the parameters of (7) by OLS and save the residuals $\hat\varepsilon_t$ for $t = 1, \ldots, T$.
2. Estimate the following residual regression:
$$\hat\varepsilon_t = x_t'\gamma + \delta\hat\varepsilon_{t-1} + \nu_t. \tag{8}$$
3. From the output of regression (8), compute $nR^2$ ($n$ is the actual number of observations, which is $T - 1$ in this case, as the first one is lost). Under $H_0$, $nR^2$ is asymptotically $\chi^2$ with one degree of freedom.
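A sketch of these steps for first-order serial correlation (my own helper; the function name and interface are invented):

```python
import numpy as np
from scipy import stats

def breusch_godfrey(y, X):
    """Breusch-Godfrey nR^2 sketch for AR(1) serial correlation."""
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]      # step 1: OLS residuals
    A = np.column_stack([X[1:], e[:-1]])                  # step 2: regressors plus lagged residual
    fit = A @ np.linalg.lstsq(A, e[1:], rcond=None)[0]
    R2 = 1 - np.sum((e[1:] - fit) ** 2) / np.sum((e[1:] - e[1:].mean()) ** 2)
    stat = len(e[1:]) * R2                                # step 3: n = T - 1 observations
    return stat, stats.chi2.sf(stat, 1)
```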

60 Maximum Likelihood Estimation
The most direct way of estimating unknown parameters is known as maximum likelihood estimation. Although the principle can be applied more generally, assume that the data $\{y_i\}_{i=1}^n$ are iid copies of the random variable $Y$, which is known to have density function $f(y, \theta)$. As the $y_i$ are iid, the joint sample density is
$$f(y_1, \ldots, y_n, \theta) = \prod_{i=1}^n f(y_i, \theta).$$
We then interpret this as a function of the unknown parameters given the observed data, i.e.
$$L(\theta, y) = f(y_1, \ldots, y_n, \theta),$$
called the likelihood. Then we simply ask which values of $\theta$ are most likely given this data, i.e. we maximize the likelihood with respect to $\theta$. Actually, since log is a monotone function, we define the MLE to be
$$\hat\theta = \arg\max_\theta \ell(\theta), \quad \text{where } \ell(\theta) = \log L(\theta, y).$$

61 Maximum Likelihood Estimation in the NLRM
Consider the model $y = X\beta + \varepsilon$ and let the standard assumptions (A1-A4) hold. Additionally assume that the errors are normally distributed, i.e.
A5: $\varepsilon \sim N(0, \sigma^2 I_n)$.
Since $y = X\beta + \varepsilon$, we immediately have $y \sim N(X\beta, \sigma^2 I_n)$, because if $w \sim N(\mu, \Sigma)$, then for any $n \times m$ matrix $A$ and $m \times 1$ vector $b$, $z = A'w + b \sim N(A'\mu + b, A'\Sigma A)$.

62 Maximum Likelihood Estimation in the NLRM
Moreover, we can write down the joint density (multivariate normal)
$$f(y) = \frac{\exp\left\{-\frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)\right\}}{(2\pi\sigma^2)^{n/2}},$$
from which we obtain the log-likelihood
$$\ell(\beta, \sigma^2) = -\frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta) - \frac{n}{2}\ln\sigma^2 - \frac{n}{2}\ln 2\pi.$$

63 Maximum Likelihood Estimation in the NLRM
Thus the maximum likelihood estimators must satisfy the FOC (the score vector equals 0 at the MLE), i.e.
$$S(\beta, \sigma^2) = \begin{pmatrix} \partial\ell(\beta,\sigma^2)/\partial\beta \\ \partial\ell(\beta,\sigma^2)/\partial\sigma^2 \end{pmatrix} = 0,$$
with
$$\frac{\partial\ell(\beta, \sigma^2)}{\partial\beta} = \frac{X'y - X'X\beta}{\sigma^2} = 0 \tag{9}$$
and
$$\frac{\partial\ell(\beta, \sigma^2)}{\partial\sigma^2} = \frac{1}{2\sigma^4}(y - X\beta)'(y - X\beta) - \frac{n}{2\sigma^2} = 0.$$
Solving (9), we find
$$\hat\beta_{MLE} = (X'X)^{-1}X'y, \qquad \hat\sigma^2_{MLE} = \frac{(y - X\hat\beta)'(y - X\hat\beta)}{n} = \frac{y'My}{n}.$$
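A numerical check sketch (my own, with invented data): maximizing the log-likelihood above by a generic optimizer recovers the closed-form solutions. The $\sigma^2 = \exp(\cdot)$ reparametrization just keeps the variance positive during the search.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

def neg_loglik(theta):
    beta, log_s2 = theta[:k], theta[k]          # parametrize sigma^2 = exp(log_s2) > 0
    e = y - X @ beta
    return 0.5 * (e @ e) / np.exp(log_s2) + 0.5 * n * (log_s2 + np.log(2 * np.pi))

res = minimize(neg_loglik, np.zeros(k + 1))
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # closed form matches the numeric optimum
print(res.x[:k], beta_hat)
print(np.exp(res.x[k]), np.mean((y - X @ beta_hat) ** 2))  # sigma^2_MLE = e'e / n
```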

64 Maximum Likelihood Estimation in the NLRM
The SOC is that the matrix of second-order derivatives, evaluated at the MLE,
$$H(\beta, \sigma^2)\Big|_{\hat\beta_{MLE},\,\hat\sigma^2_{MLE}} = \begin{pmatrix} \frac{\partial^2\ell}{\partial\beta\,\partial\beta'} & \frac{\partial^2\ell}{\partial\beta\,\partial\sigma^2} \\ \frac{\partial^2\ell}{\partial\sigma^2\,\partial\beta'} & \frac{\partial^2\ell}{\partial(\sigma^2)^2} \end{pmatrix}\Bigg|_{\hat\beta_{MLE},\,\hat\sigma^2_{MLE}},$$
is negative definite. Note that
$$H(\beta, \sigma^2) = \begin{pmatrix} -\frac{X'X}{\sigma^2} & -\frac{X'y - X'X\beta}{\sigma^4} \\ -\frac{(X'y - X'X\beta)'}{\sigma^4} & -\frac{1}{\sigma^6}(y - X\beta)'(y - X\beta) + \frac{n}{2\sigma^4} \end{pmatrix}$$
and
$$H(\hat\beta_{MLE}, \hat\sigma^2_{MLE}) = \begin{pmatrix} -\frac{X'X}{\hat\sigma^2_{MLE}} & 0 \\ 0 & -\frac{n}{2\hat\sigma^4_{MLE}} \end{pmatrix}, \tag{10}$$
which is obviously negative definite.

65 Properties of MLE
Clearly $\hat\beta_{MLE} \sim N(\beta, \sigma^2(X'X)^{-1})$. Note that the OLS estimator equals $\hat\beta_{MLE}$ under normality. We proved that the OLS estimator is BLUE by the Gauss-Markov Theorem (smallest variance in the class of linear unbiased estimators). Under normality, we can strengthen our notion of efficiency and show that $\hat\beta_{MLE}$ is BUE (Best Unbiased Estimator) by the Cramer-Rao Lower Bound (the variance of any unbiased estimator is at least as large as the inverse of the information).

66 Cramer-Rao Lower Bound
The information is defined as $I_n(\beta, \sigma^2) = -E[H(\beta, \sigma^2)]$, and given (10), in our case we have
$$I_n(\beta, \sigma^2) = \begin{pmatrix} \frac{X'X}{\sigma^2} & 0 \\ 0 & \frac{n}{2\sigma^4} \end{pmatrix},$$
so that the inverse of the information is
$$I_n(\beta, \sigma^2)^{-1} = \begin{pmatrix} \sigma^2(X'X)^{-1} & 0 \\ 0 & \frac{2\sigma^4}{n} \end{pmatrix}.$$

67 Maximum Likelihood Estimation in the NLRM
Since we know that $V[\hat\beta_{MLE}] = \sigma^2(X'X)^{-1}$, clearly $\hat\beta_{MLE}$ is BUE. Note that $\hat\sigma^2_{MLE}$ is biased, but we can define the unbiased estimator
$$s^2 = \frac{n}{n - k}\hat\sigma^2_{MLE} = \frac{(y - X\hat\beta)'(y - X\hat\beta)}{n - k}.$$
Unfortunately, the unbiased $s^2$ is not BUE, since
$$V[s^2] = \frac{2\sigma^4}{n - k} > \frac{2\sigma^4}{n}.$$
Note that as $n \to \infty$, $s^2$ becomes efficient as long as $k$ is fixed.


69 Oaxaca-Blinder (1973)
A microeconometric decomposition technique which allows us to study the differences in outcomes between two groups, decomposing them into differences in characteristics (explained variation) and differences in parameters (discrimination / unexplained variation). Most often used in the literature on wage inequalities (female vs. male, union vs. non-union workers, public vs. private sector workers, migrant vs. native workers).

70 Oaxaca-Blinder: preliminaries
Assume that $y$ is explained by the vector of regressors $x$ as in the linear regression model:
$$y_i = \begin{cases} x_i'\beta^{female} + \varepsilon_i^{female}, & \text{if } i \text{ is female} \\ x_i'\beta^{male} + \varepsilon_i^{male}, & \text{if } i \text{ is male} \end{cases}$$
where $\beta$ also contains the intercept. Assume that men are privileged. The difference in mean outcomes is
$$\bar y^{male} - \bar y^{female} = \bar x^{male\prime}\beta^{male} - \bar x^{female\prime}\beta^{female}.$$

71 Oaxaca-Blinder
Alternatively:
$$\bar y^{male} - \bar y^{female} = \Delta\bar x'\beta^{male} + \Delta\beta'\bar x^{female}$$
or
$$\bar y^{male} - \bar y^{female} = \Delta\bar x'\beta^{female} + \Delta\beta'\bar x^{male}.$$
Differences in outcomes come from different characteristics and different parameters (females have worse $x$ and worse $\beta$). Even more generally:
$$\bar y^{male} - \bar y^{female} = \Delta\bar x'\beta^{female} + \Delta\beta'\bar x^{female} + \Delta\bar x'\Delta\beta = E + C + CE.$$
Problem: which group do we pick as the reference?

72 General version of the decomposition
General equation:
$$\bar y^{male} - \bar y^{female} = \underbrace{\beta^{*\prime}(\bar x^{male} - \bar x^{female})}_{\text{diff. in characteristics}} + \underbrace{\bar x^{male\prime}(\beta^{male} - \beta^*)}_{\text{male advantage}} + \underbrace{\bar x^{female\prime}(\beta^* - \beta^{female})}_{\text{female disadvantage}},$$
where the last two terms capture differences in parameters and
$$\beta^* = \lambda\beta^{male} + (1 - \lambda)\beta^{female}.$$

73 General version of the decomposition
Choices of $\lambda$ (or $\beta^*$) and their interpretation:
- $\lambda = 1$: male parameters as the reference.
- $\lambda = 0$: female parameters as the reference.
- $\lambda = 0.5$: average of the two, Reimers (1983).
- $\lambda = \%male$: parameters weighted by the sample proportion, Cotton (1988).
- $\beta^* = \beta^{pooled}$: parameters for the whole sample without a gender dummy, Neumark (1988).
- $\beta^* = \beta^{pooled}$: parameters for the whole sample with a gender dummy, Fortin (2008).
- $\lambda = \%female$: parameters weighted by the opposite sex, Słoczyński (2013).
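A minimal sketch of the twofold decomposition (function name and interface invented; it assumes each design matrix includes an intercept column, so that group means of $y$ equal $\bar x'\hat\beta$):

```python
import numpy as np

def oaxaca_blinder(y_m, X_m, y_f, X_f, lam=1.0):
    """Twofold Oaxaca-Blinder sketch; lam = 1 uses male coefficients as the reference."""
    b_m = np.linalg.lstsq(X_m, y_m, rcond=None)[0]
    b_f = np.linalg.lstsq(X_f, y_f, rcond=None)[0]
    b_star = lam * b_m + (1 - lam) * b_f
    xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)
    explained = (xbar_m - xbar_f) @ b_star                          # diff. in characteristics
    unexplained = xbar_m @ (b_m - b_star) + xbar_f @ (b_star - b_f) # diff. in parameters
    return explained, unexplained   # the two parts sum to the raw mean gap
```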

74 Some asymptotic results prior to stochastic regressors
The idea is to treat results that hold as $n \to \infty$ as approximations for finite $n$. In particular, we will be interested in whether estimators are consistent and what their asymptotic distribution is. Asymptotic results are useful in finite samples (finite $n$) because, for instance, we may not be able to show unbiasedness, or we may not be able to determine the sampling distribution needed to carry out statistical inference.

75 Asymptotic theory: some results
Definition (Convergence in Probability). Let $X_n = \{X_i, i = 1, \ldots, n\}$ be a sequence of random variables. $X_n$ converges in probability to $X$ if
$$\lim_{n\to\infty} \Pr(|X_n - X| > \varepsilon) = 0 \quad \text{for any } \varepsilon > 0.$$
You will see convergence in probability written as either $X_n \to_p X$ or $\mathrm{plim}_{n\to\infty} X_n = X$.

76 Convergence in Probability
The idea behind this type of convergence is that the probability of an "unusual" outcome becomes smaller and smaller as the sequence progresses.
Example: suppose you take a basketball and start shooting free throws. Let $X_n$ be your success percentage at the $n$th shot. Initially you are likely to miss a lot, but as time goes on your skill increases and you are more likely to make the shots. After years of practice the probability that you miss becomes increasingly small. Thus, as $n \to \infty$, the sequence $X_n$ converges in probability to $X = 100\%$. Note that the probability of success never actually becomes 100%, as there is always a small probability of missing.

77 Consistency
Definition (Consistency). An estimator $\hat\theta$ of $\theta$ is consistent if, as the sample size increases, $\hat\theta$ gets "closer" to $\theta$. Formally, $\hat\theta$ is consistent when $\mathrm{plim}_{n\to\infty}\hat\theta = \theta$.
Useful result (Slutsky Theorem). Let $g(\cdot)$ be a continuous function. Then $\mathrm{plim}_{n\to\infty}\, g(X_n) = g(\mathrm{plim}_{n\to\infty} X_n)$.
Useful result (Chebyshev's inequality). For a random variable $X_n$ with mean $\mu$ and variance $\mathrm{Var}(X_n)$,
$$\Pr(|X_n - \mu| > \varepsilon) \leq \frac{\mathrm{Var}(X_n)}{\varepsilon^2} \quad \text{for any } \varepsilon > 0.$$

78 Consistency
A sufficient condition for consistency (not a necessary one): if
$$\lim_{n\to\infty} E(\hat\theta) = \theta \quad \text{and} \quad \mathrm{Var}(\hat\theta) \to 0 \text{ as } n \to \infty,$$
then
$$\lim_{n\to\infty} \Pr(|\hat\theta - \theta| > \varepsilon) = 0, \quad \text{or, as we write,} \quad \mathrm{plim}_{n\to\infty}\hat\theta = \theta.$$
If $\hat\theta$ is unbiased, in order to determine whether it is also consistent, we only need to check that $\mathrm{Var}(\hat\theta) \to 0$ as $n \to \infty$. Note that biased estimators can be consistent as long as $\lim_{n\to\infty} E(\hat\theta) = \theta$ and $\mathrm{Var}(\hat\theta) \to 0$ as $n \to \infty$.

79 Unbiasedness versus Consistency
Unbiased, but not consistent: suppose that given an iid sample $\{X_1, \ldots, X_n\}$ we want to estimate the mean of $X$, i.e. $E(X)$. We could use the first observation as the estimator, so $\hat\theta = X_1$. $\hat\theta$ is unbiased, since $E(\hat\theta) = E(X_1) = E(X)$, because we have an iid sample. However, $\hat\theta$ does not converge to any value, therefore it cannot be consistent.
Biased, but consistent: an alternative estimator for the mean of an iid sample $\{X_1, \ldots, X_n\}$ might be
$$\tilde\theta = \frac{1}{n}\sum_{i=1}^n X_i + \frac{1}{n}.$$
$\tilde\theta$ is biased, since $E(\tilde\theta) = E(X) + \frac{1}{n}$, but
$$\lim_{n\to\infty} E(\tilde\theta) = E(X) \quad \text{and} \quad \mathrm{Var}(\tilde\theta) = \frac{\mathrm{Var}(X)}{n} \to 0 \text{ as } n \to \infty.$$
Therefore, $\tilde\theta$ is consistent.
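A quick simulation sketch (my own, using an arbitrary N(5, 1) population) contrasts the two estimators: the first stays unbiased but never concentrates, while the second's bias and spread both vanish as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(4)
for n in (10, 100, 10_000):
    draws = rng.normal(loc=5.0, size=(2000, n))     # 2000 samples of size n, true mean 5
    theta1 = draws[:, 0]                            # unbiased but not consistent
    theta2 = draws.mean(axis=1) + 1.0 / n           # biased but consistent
    print(n, theta1.mean(), theta1.std(), theta2.mean(), theta2.std())
```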

80 Law of Large Numbers
Laws of large numbers provide conditions ensuring that sample moments converge to their population counterparts.
Weak Law of Large Numbers. Let $X_n = \{X_i, i = 1, \ldots, n\}$ be an independent and identically distributed (iid) sequence of random variables with $E|X_i| < \infty$. Then
$$\frac{1}{n}\sum_{i=1}^n X_i \to_p E(X_i).$$
We only consider this version of the WLLN (for iid data). When data are not iid, as with time series data, stronger conditions are needed.

81 Law of Large Numbers
Example: consider a coin (heads and tails) being flipped. Logic says there are equal chances of getting heads or tails. If the coin is flipped 10 times, there is a good chance that the proportions of heads and tails are not equal. Crudely, the law of large numbers says that as the number of flips increases, the proportion of heads converges in probability to 0.5.

82 Sampling distribution
When we do not know the exact sampling distribution of an estimator (for instance, when we do not assume normality of the error term), we may ask whether asymptotics allow us to infer something about its distribution for large $n$, so that we are still able to make inference on the estimates.
Definition (Convergence in Distribution). Let $X_n = \{X_i, i = 1, \ldots, n\}$ be a sequence of random variables and let $X$ be a random variable with distribution $F_X(x)$. $X_n$ converges in distribution to $X$ if
$$\lim_{n\to\infty} \Pr(X_n \leq x) = F_X(x),$$
written $X_n \to_d X$. The approximating distribution $F_X(x)$ is called a limiting or asymptotic distribution.

83 Useful results
Convergence in probability implies convergence in distribution, i.e. $X_n \to_p X \Rightarrow X_n \to_d X$.
Convergence in distribution to a constant implies convergence in probability to that constant, i.e. $X_n \to_d c \Rightarrow X_n \to_p c$.
Suppose that $\{Y_n\}$ is another sequence of random variables and let $g(\cdot)$ be a continuous function. If $X_n \to_d X$ and $Y_n \to_p c$, then $g(X_n, Y_n) \to_d g(X, c)$. For example, $X_n + Y_n \to_d X + c$.

84 Central Limit Theorem
The most important example of convergence in distribution is the Central Limit Theorem, which is useful for establishing asymptotic distributions. If $X_1, \ldots, X_n$ is an iid sample from any probability distribution with finite mean $\mu$ and finite variance $\sigma^2$, we have
$$\sqrt{n}(\bar X - \mu) \to_d N(0, \sigma^2), \quad \text{or equivalently} \quad \frac{\sqrt{n}(\bar X - \mu)}{\sigma} \to_d N(0, 1).$$
The CLT guarantees that, even if the errors are not normally distributed, but simply iid with zero mean and variance $\sigma^2$, we can conclude
$$\sqrt{n}\left(\hat\beta - \beta\right) \to_d N\left(0,\; \sigma^2\left(\lim \frac{1}{n}X'X\right)^{-1}\right).$$
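A simulation sketch of the first statement (my own illustration): even for skewed Exp(1) data, where $\mu = \sigma = 1$, the standardized sample mean behaves like a standard normal.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 500, 5000
means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (means - 1.0) / 1.0     # standardize with mu = sigma = 1
print(z.mean(), z.std())                 # approximately 0 and 1
print(np.mean(np.abs(z) < 1.96))         # approximately 0.95, as N(0,1) predicts
```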

85 Limiting distributions of test statistics
Thus, when the disturbances are iid with zero mean and finite variance, the tests we previously discussed are asymptotically valid, and under $H_0$
$$t \to_d N(0, 1), \qquad W = J\,F \to_d \chi^2_J,$$
where $J$ is the number of restrictions being tested.

86 Law of iterated expectations
We have a useful result that we will use extensively. Let $X$ and $Y$ be two random variables; then
$$E(Y) = E[E(Y|X)].$$
A very useful by-product of the law of iterated expectations is
$$E(XY) = E[X\,E(Y|X)],$$
sometimes written as $E(XY) = E_X[X\,E(Y|X)]$ to indicate that the outer expectation is taken with respect to $X$.

87 Stochastic regressors
We consider again $y = X\beta + \varepsilon$. So far, we have assumed that the regressors are fixed. This is a rather unrealistic assumption, so we now allow $X$ to have stochastic components. However, a crucial condition that has to hold to perform regression analysis is that the regressors and the errors must be uncorrelated.

88 Assumptions
Our new, revised assumptions become:
A1 The true model is $y = X\beta + \varepsilon$.
A2 For all $i$, $E(\varepsilon_i|X) = 0$ (conditional zero mean); more about this later.
A3 $\mathrm{Var}(\varepsilon|X) = E(\varepsilon\varepsilon'|X) = \sigma^2 I$ (spherical disturbances).
A4 $X$ has full column rank.
A5 (eventually, for testing purposes) $\varepsilon|X \sim N(0, \sigma^2 I)$.
Under A1-A4 the OLS estimator is Gauss-Markov efficient.


More information

1 Regression with Time Series Variables

1 Regression with Time Series Variables 1 Regression with Time Series Variables With time series regression, Y might not only depend on X, but also lags of Y and lags of X Autoregressive Distributed lag (or ADL(p; q)) model has these features:

More information

Advanced Econometrics I

Advanced Econometrics I Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

Problem set 1 - Solutions

Problem set 1 - Solutions EMPIRICAL FINANCE AND FINANCIAL ECONOMETRICS - MODULE (8448) Problem set 1 - Solutions Exercise 1 -Solutions 1. The correct answer is (a). In fact, the process generating daily prices is usually assumed

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 4: OLS and Statistics revision Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 4 VŠE, SS 2016/17 1 / 68 Outline 1 Econometric analysis Properties of an estimator

More information

Economics 241B Estimation with Instruments

Economics 241B Estimation with Instruments Economics 241B Estimation with Instruments Measurement Error Measurement error is de ned as the error resulting from the measurement of a variable. At some level, every variable is measured with error.

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

Heteroskedasticity. Part VII. Heteroskedasticity

Heteroskedasticity. Part VII. Heteroskedasticity Part VII Heteroskedasticity As of Oct 15, 2015 1 Heteroskedasticity Consequences Heteroskedasticity-robust inference Testing for Heteroskedasticity Weighted Least Squares (WLS) Feasible generalized Least

More information

Lecture Notes on Measurement Error

Lecture Notes on Measurement Error Steve Pischke Spring 2000 Lecture Notes on Measurement Error These notes summarize a variety of simple results on measurement error which I nd useful. They also provide some references where more complete

More information

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

PBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression.

PBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression. PBAF 528 Week 8 What are some problems with our model? Regression models are used to represent relationships between a dependent variable and one or more predictors. In order to make inference from the

More information

Chapter 8 Heteroskedasticity

Chapter 8 Heteroskedasticity Chapter 8 Walter R. Paczkowski Rutgers University Page 1 Chapter Contents 8.1 The Nature of 8. Detecting 8.3 -Consistent Standard Errors 8.4 Generalized Least Squares: Known Form of Variance 8.5 Generalized

More information

ECON 4230 Intermediate Econometric Theory Exam

ECON 4230 Intermediate Econometric Theory Exam ECON 4230 Intermediate Econometric Theory Exam Multiple Choice (20 pts). Circle the best answer. 1. The Classical assumption of mean zero errors is satisfied if the regression model a) is linear in the

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

1/34 3/ Omission of a relevant variable(s) Y i = α 1 + α 2 X 1i + α 3 X 2i + u 2i

1/34 3/ Omission of a relevant variable(s) Y i = α 1 + α 2 X 1i + α 3 X 2i + u 2i 1/34 Outline Basic Econometrics in Transportation Model Specification How does one go about finding the correct model? What are the consequences of specification errors? How does one detect specification

More information

Exercises Chapter 4 Statistical Hypothesis Testing

Exercises Chapter 4 Statistical Hypothesis Testing Exercises Chapter 4 Statistical Hypothesis Testing Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans December 5, 013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance.

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance. Heteroskedasticity y i = β + β x i + β x i +... + β k x ki + e i where E(e i ) σ, non-constant variance. Common problem with samples over individuals. ê i e ˆi x k x k AREC-ECON 535 Lec F Suppose y i =

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

1 Introduction to Generalized Least Squares

1 Introduction to Generalized Least Squares ECONOMICS 7344, Spring 2017 Bent E. Sørensen April 12, 2017 1 Introduction to Generalized Least Squares Consider the model Y = Xβ + ɛ, where the N K matrix of regressors X is fixed, independent of the

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

Economics 620, Lecture 7: Still More, But Last, on the K-Varable Linear Model

Economics 620, Lecture 7: Still More, But Last, on the K-Varable Linear Model Economics 620, Lecture 7: Still More, But Last, on the K-Varable Linear Model Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 7: the K-Varable Linear Model IV

More information

Empirical Economic Research, Part II

Empirical Economic Research, Part II Based on the text book by Ramanathan: Introductory Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna December 7, 2011 Outline Introduction

More information

Multiple Regression Analysis

Multiple Regression Analysis 1 OUTLINE Basic Concept: Multiple Regression MULTICOLLINEARITY AUTOCORRELATION HETEROSCEDASTICITY REASEARCH IN FINANCE 2 BASIC CONCEPTS: Multiple Regression Y i = β 1 + β 2 X 1i + β 3 X 2i + β 4 X 3i +

More information

Instrumental Variables and Two-Stage Least Squares

Instrumental Variables and Two-Stage Least Squares Instrumental Variables and Two-Stage Least Squares Generalised Least Squares Professor Menelaos Karanasos December 2011 Generalised Least Squares: Assume that the postulated model is y = Xb + e, (1) where

More information

Models, Testing, and Correction of Heteroskedasticity. James L. Powell Department of Economics University of California, Berkeley

Models, Testing, and Correction of Heteroskedasticity. James L. Powell Department of Economics University of California, Berkeley Models, Testing, and Correction of Heteroskedasticity James L. Powell Department of Economics University of California, Berkeley Aitken s GLS and Weighted LS The Generalized Classical Regression Model

More information

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill November

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information