ECONOMETRICS

Bruce E. Hansen

© 2000, 2001, 2002, 2003, 2004
University of Wisconsin

Revised: January 2004
Comments Welcome

This manuscript may be printed and reproduced for individual or instructional use, but may not be printed for commercial purposes.

Contents

1 Ordinary Least Squares: Framework; Estimation; Efficiency; Model in Matrix Notation; Residual Regression; Consistency; Asymptotic Normality; Covariance Matrix Estimation; Functions of Parameters; t tests; Confidence Intervals; Wald Tests; F Tests

2 Regression Models: Regression; Bias and Variance of OLS Estimator; Multicollinearity; Forecast Intervals; NonLinearity in Regressors; NonLinear Least Squares; Normal Regression Model; Least Absolute Deviations

3 Model Selection: Omitted Variables; Irrelevant Variables; Model Selection; Testing for Omitted NonLinearity; log(Y) versus Y as Dependent Variable

4 Generalized Least Squares: GLS and the Gauss-Markov Theorem; Skedastic Regression; Estimation of Skedastic Regression; Testing for Heteroskedasticity; Feasible GLS Estimation; Covariance Matrix Estimation; Commentary: FGLS versus OLS

5 Generalized Method of Moments: Overidentified Linear Model; GMM Estimator; Distribution of GMM Estimator; Estimation of the Efficient Weight Matrix; GMM: The General Case; Over-Identification Test; Hypothesis Testing: The Distance Statistic; Conditional Moment Restrictions

6 Empirical Likelihood: Non-Parametric Likelihood; Asymptotic Distribution of EL Estimator; Overidentifying Restrictions Testing; Numerical Computation (Derivatives, Inner Loop, Outer Loop)

7 Endogeneity: Instrumental Variables; Reduced Form; Identification; Estimation; Special Cases: IV and 2SLS; Bekker Asymptotics; Identification Failure

8 The Bootstrap: Monte Carlo Simulation; An Example; The Empirical Distribution Function; Definition of the Bootstrap; Bootstrap Estimation of Bias and Variance; Percentile Intervals; Percentile-t Equal-Tailed Interval; Symmetric Percentile-t Intervals; Asymptotic Expansions; One-Sided Tests; Symmetric Two-Sided Tests; Percentile Confidence Intervals; Bootstrap Methods for Regression Models; Bootstrap GMM Inference

9 Univariate Time Series: Stationarity and Ergodicity; Autoregressions; Stationarity of AR(1) Process; Lag Operator; Stationarity of AR(k); Estimation; Asymptotic Distribution; Bootstrap for Autoregressions; Trend Stationarity; Testing for Omitted Serial Correlation; Model Selection; Autoregressive Unit Roots

10 Multivariate Time Series: Vector Autoregressions (VARs); Estimation; Restricted VARs; Single Equation from a VAR; Testing for Omitted Serial Correlation; Selection of Lag Length in a VAR; Granger Causality; Cointegration; Cointegrated VARs

11 Limited Dependent Variables: Binary Choice; Count Data; Censored Data; Sample Selection

12 Panel Data: Individual-Effects Model; Fixed Effects; Dynamic Panel Regression

13 Nonparametrics: Kernel Density Estimation; Asymptotic MSE for Kernel Estimates

Appendix A: Mathematical Formula

Appendix B: Matrix Algebra: Terminology; Matrix Multiplication; Trace, Inverse, Determinant; Eigenvalues; Idempotent and Projection Matrices; Kronecker Products and the Vec Operator; Matrix Calculus

Appendix C: Probability: Foundations; Random Variables; Expectation; Common Distributions; Multivariate Random Variables; Conditional Distributions and Expectation; Transformations; Normal and Related Distributions; Maximum Likelihood

Appendix D: Asymptotic Theory: Inequalities; Convergence in Probability; Almost Sure Convergence; Convergence in Distribution; Asymptotic Transformations

Appendix E: Numerical Optimization: Grid Search; Gradient Methods; Derivative-Free Methods

Chapter 1

Ordinary Least Squares

1.1 Framework

An econometrician has observational data
$$\{(y_1, x_1), (y_2, x_2), \ldots, (y_i, x_i), \ldots, (y_n, x_n)\} = \{(y_i, x_i) : i = 1, \ldots, n\}$$
where each pair $(y_i, x_i) \in \mathbb{R} \times \mathbb{R}^k$ is an observation on an individual (e.g., household or firm). We call these observations the sample.

Notice that the observations are paired $(y_i, x_i)$. We call $y_i$ the dependent variable and $x_i$ the regressor vector.

For convenience, the vector $x_i$ is typically presumed to include a constant. That is, one element (typically written as the first) equals 1. We can write the $k \times 1$ regressor $x_i$ as
$$x_i = \begin{pmatrix} x_{1i} \\ x_{2i} \\ \vdots \\ x_{ki} \end{pmatrix} = \begin{pmatrix} 1 \\ x_{2i} \\ \vdots \\ x_{ki} \end{pmatrix}.$$

If the data is cross-sectional (each observation is a different individual) it is often reasonable to assume the observations are mutually independent. If the data is randomly gathered, it is reasonable to model each observation as a random draw from the same probability distribution. Thus the data are independent and identically distributed, or iid. We call this a random sample. Sometimes the label iid is misconstrued. It means that the pair $(y_i, x_i)$ is independent of the pair $(y_j, x_j)$ for $i \neq j$. It is not a statement about the relationship between $y_i$ and $x_i$.

The random variables $(y_i, x_i)$ have a distribution $F$ which we call the population. This "population" is infinitely large. Sometimes this is a source of confusion, but it is merely an abstraction. This distribution is unknown, and the goal of statistical inference is to learn about features of $F$ from the sample.

It is unimportant whether the observations $y_i$ and $x_i$ come from continuous or discrete distributions. For example, many regressors in econometric practice are binary, taking on only the values 0 and 1, and are typically called dummy variables.

A linear regression model for $y_i$ given $x_i$ takes the form
$$y_i = \beta_1 + x_{2i}\beta_2 + \cdots + x_{ki}\beta_k + e_i, \qquad i = 1, \ldots, n \qquad (1.1)$$
where $\beta_1$ through $\beta_k$ are parameters and $e_i$ is the error. The parameter vector is written as
$$\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}.$$
We can then write (1.1) more compactly as
$$y_i = x_i'\beta + e_i, \qquad i = 1, \ldots, n. \qquad (1.2)$$

The model is incomplete without a description of the error $e_i$. It should have mean zero and a finite variance, and be uncorrelated with the regressors. We state the needed conditions here.

Assumption 1.1.1
1. $E(e_i) = 0$
2. $E(x_i e_i) = 0$
3. $\sigma^2 = E e_i^2 < \infty$
4. $E x_i'x_i < \infty$
5. $Q = E x_i x_i' > 0$

Assumptions 1.1.1.3 and 1.1.1.4 are made to guarantee that all variables in the model have a finite variance. This is necessary to ensure that $E(x_i e_i)$ is well defined. Indeed, by the Cauchy-Schwarz inequality,
$$E|x_i e_i| \le \left(E|x_i|^2\right)^{1/2}\left(E|e_i|^2\right)^{1/2} < \infty$$
under these assumptions.

We can use Assumption 1.1.1 to derive a moment representation for the parameter vector $\beta$. Take equation (1.2) and pre-multiply by $x_i$:
$$x_i y_i = x_i x_i'\beta + x_i e_i.$$

Now take expectations:
$$E(x_i y_i) = E(x_i x_i')\beta + E(x_i e_i) = E(x_i x_i')\beta,$$
where the second equality uses Assumption 1.1.1.2. Since $E(x_i x_i')$ is invertible by Assumption 1.1.1.5, we can solve for $\beta$:
$$\beta = \left(E(x_i x_i')\right)^{-1} E(x_i y_i). \qquad (1.3)$$
Thus the parameter $\beta$ is an explicit function of population second moments of $(y_i, x_i)$.

In fact, this derivation shows that if $\beta$ is defined by (1.3), then Assumption 1.1.1.2 must hold true by construction. In this sense, Assumption 1.1.1.2 is very weak. However, it is important not to misinterpret this statement. In many economic models, the parameter $\beta$ may be defined within the model, rather than by construction as in (1.3). In this case (1.3) may not hold. These structural models require alternative estimation methods, and are discussed in Chapter 5.

To emphasize this distinction, we may describe the model of this section as a linear projection model rather than a linear regression model. This is an accurate label, as equation (1.3) shows that $e_i$ is explicitly defined as a projection error. However, conventional econometric practice labels (1.2) as a linear regression model, so we will adhere to this convention.

1.2 Estimation

Equation (1.3) writes the regression parameter $\beta$ as an explicit function of the population moments $E(x_i y_i)$ and $E(x_i x_i')$. Their moment estimators are the sample moments
$$\hat{E}(x_i y_i) = \frac{1}{n}\sum_{i=1}^n x_i y_i, \qquad \hat{E}(x_i x_i') = \frac{1}{n}\sum_{i=1}^n x_i x_i'.$$
It follows that the moment estimator of $\beta$ is (1.3) with the population moments replaced by the sample moments:
$$\hat{\beta} = \left(\frac{1}{n}\sum_{i=1}^n x_i x_i'\right)^{-1}\frac{1}{n}\sum_{i=1}^n x_i y_i = \left(\sum_{i=1}^n x_i x_i'\right)^{-1}\sum_{i=1}^n x_i y_i. \qquad (1.4)$$

Another way to derive $\hat{\beta}$ is as follows. Observe that Assumption 1.1.1.2 can be written in the parametric form
$$E\left(x_i\left(y_i - x_i'\beta\right)\right) = 0. \qquad (1.5)$$

The function $E(x_i(y_i - x_i'\beta))$ can be estimated by
$$\hat{E}\left(x_i\left(y_i - x_i'\beta\right)\right) = \frac{1}{n}\sum_{i=1}^n x_i\left(y_i - x_i'\beta\right)$$
and $\hat{\beta}$ is the value which sets this equal to zero:
$$0 = \frac{1}{n}\sum_{i=1}^n x_i\left(y_i - x_i'\hat{\beta}\right) = \frac{1}{n}\sum_{i=1}^n x_i y_i - \frac{1}{n}\sum_{i=1}^n x_i x_i'\hat{\beta} \qquad (1.6)$$
whose solution is (1.4).

There is another classic motivation for the estimator (1.4). Define the sum-of-squared-errors (SSE) function
$$S_n(\beta) = \sum_{i=1}^n \left(y_i - x_i'\beta\right)^2.$$
The Ordinary Least Squares (OLS) estimator is the value of $\beta$ which minimizes $S_n(\beta)$. Observe that we can write the latter as
$$S_n(\beta) = \sum_{i=1}^n y_i^2 - 2\beta'\sum_{i=1}^n x_i y_i + \beta'\sum_{i=1}^n x_i x_i'\beta.$$
Vector calculus (see the matrix calculus section of Appendix B) gives the first-order conditions for minimization:
$$\frac{\partial}{\partial\beta}S_n(\hat{\beta}) = -2\sum_{i=1}^n x_i y_i + 2\sum_{i=1}^n x_i x_i'\hat{\beta} = 0,$$
whose solution is (1.4). Following convention, we will call $\hat{\beta}$ the OLS estimator of $\beta$.

As a by-product of OLS estimation, we define the predicted value
$$\hat{y}_i = x_i'\hat{\beta}$$
and the residual
$$\hat{e}_i = y_i - \hat{y}_i = y_i - x_i'\hat{\beta}.$$
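To make the algebra concrete, here is a minimal numpy sketch of the estimator (1.4) with its predicted values and residuals, using simulated data (the data-generating values below are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3

# Regressor matrix with a constant in the first column, as in Section 1.1.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 0.5, -0.25])        # illustrative values
y = X @ beta_true + rng.normal(size=n)         # e_i ~ N(0,1) for the simulation

# OLS estimator (1.4): beta_hat = (X'X)^{-1} X'y.
# lstsq is the numerically preferred way to solve the normal equations.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta_hat      # predicted values
e_hat = y - y_hat         # residuals
```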

Note that $y_i = \hat{y}_i + \hat{e}_i$. It is important to understand the distinction between the error $e_i$ and the residual $\hat{e}_i$. The error is unobservable, while the residual is a by-product of estimation. These two variables are frequently mislabeled, which can cause confusion.

Equation (1.6) implies that
$$\frac{1}{n}\sum_{i=1}^n x_i\hat{e}_i = 0.$$
Thus the sample correlation between the regressors and the residual is zero. Furthermore, since $x_i$ (typically) contains a constant, one implication is that
$$\frac{1}{n}\sum_{i=1}^n \hat{e}_i = 0.$$
Thus the residuals have a sample mean of zero. These are algebraic results, and hold true for all linear regression estimates.

The error variance $\sigma^2$ is also a parameter of interest. A method of moments estimator for it is the sample average
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n \hat{e}_i^2.$$
The error variance $\sigma^2$ measures the variation in the "unexplained" part of the regression. A measure of the explained variation relative to the total variation is the coefficient of determination or R-squared,
$$R^2 = 1 - \frac{\hat{\sigma}^2}{\hat{\sigma}_y^2}$$
where
$$\hat{\sigma}_y^2 = \frac{1}{n}\sum_{i=1}^n (y_i - \bar{y})^2$$
is the sample variance of $y_i$. The $R^2$ is frequently mislabeled as a measure of "fit". It is an inappropriate label, as the value of $R^2$ does not aid in the interpretation of parameter estimates or test statistics.
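Continuing the sketch above (again illustrative, not the text's own code), the variance estimator and $R^2$ follow directly:

```python
sigma2_hat = np.mean(e_hat**2)           # method of moments estimator of sigma^2
sigma2_y = np.mean((y - y.mean())**2)    # sample variance of y_i
r_squared = 1 - sigma2_hat / sigma2_y
```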

1.3 Efficiency

Is the OLS estimator efficient, in the sense of achieving the smallest possible mean-squared error among feasible estimators? The answer was affirmatively provided by Chamberlain (1987).

Suppose that the joint distribution of $(y_i, x_i)$ is discrete. That is, for finite $r$,
$$P\left(y_i = \tau_j, \; x_i = \xi_j\right) = \pi_j, \qquad j = 1, \ldots, r$$
for some constants $\tau_j$ and $\pi_j$ and vectors $\xi_j$. Assume that the $\tau_j$ and $\xi_j$ are known, but the $\pi_j$ are unknown. (We know the values $y_i$ and $x_i$ can take, but we don't know the probabilities.)

In this discrete setting, the moment condition (1.5) can be rewritten as
$$\sum_{j=1}^r \pi_j \xi_j\left(\tau_j - \xi_j'\beta\right) = 0. \qquad (1.7)$$
By the implicit function theorem, $\beta$ is a function of $(\pi_1, \ldots, \pi_r)$.

As the data are multinomial, the maximum likelihood estimator (MLE) is
$$\hat{\pi}_j = \frac{1}{n}\sum_{i=1}^n 1\left(y_i = \tau_j\right)1\left(x_i = \xi_j\right)$$
for $j = 1, \ldots, r$, where $1(\cdot)$ is the indicator function. That is, $\hat{\pi}_j$ is the percentage of the observations which fall in each category. The MLE $\hat{\beta}_{mle}$ for $\beta$ is then the function of $(\hat{\pi}_1, \ldots, \hat{\pi}_r)$ which satisfies the analog of (1.7) with the $\pi_j$ replaced by the $\hat{\pi}_j$:
$$\sum_{j=1}^r \hat{\pi}_j \xi_j\left(\tau_j - \xi_j'\hat{\beta}_{mle}\right) = 0.$$
Substituting in the expressions for $\hat{\pi}_j$,
$$0 = \sum_{j=1}^r \left(\frac{1}{n}\sum_{i=1}^n 1\left(y_i = \tau_j\right)1\left(x_i = \xi_j\right)\right)\xi_j\left(\tau_j - \xi_j'\hat{\beta}_{mle}\right) = \frac{1}{n}\sum_{i=1}^n x_i\left(y_i - x_i'\hat{\beta}_{mle}\right).$$
But this is the same expression as (1.6), which means that $\hat{\beta}_{mle} = \hat{\beta}_{ols}$. In other words, if the data have a discrete distribution, the maximum likelihood estimator is simply the OLS estimator. Since this is a regular parametric model, the MLE is asymptotically efficient, and thus so is the OLS estimator.

Chamberlain (1987) extends this argument to the case of continuously-distributed data. He observes that the above argument holds for all multinomial distributions, and any continuous distribution can be arbitrarily well approximated by a multinomial distribution. He proves that generically the OLS estimator is asymptotically efficient for the class of regression models satisfying Assumption 1.1.1.

1.4 Model in Matrix Notation

For some purposes, including computation, it is convenient to write the model and statistics in matrix notation. We define
$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad X = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{pmatrix}, \qquad e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}.$$
Observe that $Y$ and $e$ are $n \times 1$ vectors, and $X$ is an $n \times k$ matrix.

The linear regression model (1.2) is a system of $n$ equations, one for each observation. We can stack these $n$ equations together as
$$y_1 = x_1'\beta + e_1, \quad y_2 = x_2'\beta + e_2, \quad \ldots, \quad y_n = x_n'\beta + e_n,$$
or equivalently
$$Y = X\beta + e.$$

Sample sums can also be written in matrix notation. For example,
$$\sum_{i=1}^n x_i x_i' = X'X, \qquad \sum_{i=1}^n x_i y_i = X'Y.$$
Thus the estimator (1.4), residual vector, and sample error variance can be written as
$$\hat{\beta} = \left(X'X\right)^{-1}X'Y, \qquad \hat{e} = Y - X\hat{\beta}, \qquad \hat{\sigma}^2 = n^{-1}\hat{e}'\hat{e}.$$

Define the projection matrices
$$P = X\left(X'X\right)^{-1}X', \qquad M = I_n - P.$$

Then
$$\hat{Y} = X\hat{\beta} = X\left(X'X\right)^{-1}X'Y = PY$$
and
$$\hat{e} = Y - X\hat{\beta} = Y - PY = (I_n - P)Y = MY. \qquad (1.8)$$
Another way of writing this is
$$Y = (P + M)Y = PY + MY = \hat{Y} + \hat{e}.$$
This decomposition is orthogonal, that is,
$$\hat{Y}'\hat{e} = (PY)'(MY) = Y'PMY = 0.$$

1.5 Residual Regression

Partition
$$X = \begin{bmatrix} X_1 & X_2 \end{bmatrix} \qquad \text{and} \qquad \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}.$$
Then the regression model can be rewritten as
$$Y = X_1\beta_1 + X_2\beta_2 + e. \qquad (1.9)$$
Observe that the OLS estimator of $\beta = (\beta_1', \beta_2')'$ can be obtained by regression of $Y$ on $X = [X_1 \; X_2]$. OLS estimation can be written as
$$Y = X_1\hat{\beta}_1 + X_2\hat{\beta}_2 + \hat{e}. \qquad (1.10)$$
Using the partitioned matrix inversion formula (5.1),
$$\begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}^{-1} = \begin{pmatrix} \left(X_1'M_2X_1\right)^{-1} & -\left(X_1'M_2X_1\right)^{-1}X_1'X_2\left(X_2'X_2\right)^{-1} \\ -\left(X_2'M_1X_2\right)^{-1}X_2'X_1\left(X_1'X_1\right)^{-1} & \left(X_2'M_1X_2\right)^{-1} \end{pmatrix}$$
where
$$M_1 = I_n - X_1\left(X_1'X_1\right)^{-1}X_1', \qquad M_2 = I_n - X_2\left(X_2'X_2\right)^{-1}X_2'. \qquad (1.11)$$

Thus
$$\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix} = \begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}^{-1}\begin{pmatrix} X_1'Y \\ X_2'Y \end{pmatrix} = \begin{pmatrix} \left(X_1'M_2X_1\right)^{-1}X_1'M_2Y \\ \left(X_2'M_1X_2\right)^{-1}X_2'M_1Y \end{pmatrix} = \begin{pmatrix} \left(\tilde{X}_1'\tilde{X}_1\right)^{-1}\tilde{X}_1'\tilde{Y}_1 \\ \left(\tilde{X}_2'\tilde{X}_2\right)^{-1}\tilde{X}_2'\tilde{Y}_2 \end{pmatrix} \qquad (1.12)$$
where
$$\tilde{X}_1 = M_2X_1, \quad \tilde{Y}_1 = M_2Y, \quad \tilde{X}_2 = M_1X_2, \quad \tilde{Y}_2 = M_1Y.$$
The variables $\tilde{X}_1$ and $\tilde{Y}_1$ are least-squares residuals from the regression of $X_1$ and $Y$, respectively, on the matrix $X_2$ only. Similarly, the variables $\tilde{X}_2$ and $\tilde{Y}_2$ are least-squares residuals from the regression of $X_2$ and $Y$ on the matrix $X_1$ only.

Formula (1.12) shows that the subvector $\hat{\beta}_1$ of the OLS estimator $\hat{\beta}$ can be calculated by the OLS regression of $\tilde{Y}_1$ on $\tilde{X}_1$, and similarly $\hat{\beta}_2$ can be calculated by the OLS regression of $\tilde{Y}_2$ on $\tilde{X}_2$. This technique is called residual regression.

Furthermore, recalling the definition $M = I_n - X(X'X)^{-1}X'$, observe that $X_2'M = 0$ and hence
$$M_2M = \left(I_n - X_2\left(X_2'X_2\right)^{-1}X_2'\right)M = M.$$
Then using (1.8), we find
$$M_2\hat{e} = M_2MY = MY = \hat{e}.$$
Premultiplying (1.10) by $M_2$, we obtain
$$\tilde{Y}_1 = \tilde{X}_1\hat{\beta}_1 + \hat{e}.$$
Since $\hat{\beta}_1$ is precisely the OLS coefficient from a regression of $\tilde{Y}_1$ on $\tilde{X}_1$, this shows that the residual from this regression is $\hat{e}$, numerically the same residual as from the joint regression (1.10). We have proven the following theorem.

Theorem 1.5.1 (Frisch-Waugh-Lovell). In the model (1.9), the OLS estimator of $\beta_1$ and the OLS residuals $\hat{e}$ may be equivalently computed by either the OLS regression (1.10) or via the following algorithm:

1. Regress $Y$ on $X_2$, obtain residuals $\tilde{Y}_1$;

2. Regress $X_1$ on $X_2$, obtain residuals $\tilde{X}_1$;

3. Regress $\tilde{Y}_1$ on $\tilde{X}_1$, obtain OLS estimates $\hat{\beta}_1$ and residuals $\hat{e}$.

In some contexts, the FWL theorem can be used to speed computation, but in most cases there is little computational advantage to using the two-step algorithm. Rather, the theorem's primary use is theoretical.

A common application of the FWL theorem, which you may have seen in an introductory econometrics course, is the demeaning formula for regression. Partition $X = [X_1 \; X_2]$ where $X_1 = \iota$ is an $n \times 1$ vector of ones, and $X_2$ is the matrix of observed regressors. In this case,
$$M_1 = I - \iota\left(\iota'\iota\right)^{-1}\iota'.$$
Observe that
$$\tilde{X}_2 = M_1X_2 = X_2 - \bar{X}_2 \qquad \text{and} \qquad \tilde{Y} = M_1Y = Y - \bar{Y},$$
which are "demeaned". The FWL theorem says that $\hat{\beta}_2$ is the OLS estimate from a regression of $\tilde{Y}$ on $\tilde{X}_2$, or $y_i - \bar{y}$ on $x_{2i} - \bar{x}_2$:
$$\hat{\beta}_2 = \left(\sum_{i=1}^n \left(x_{2i} - \bar{x}_2\right)\left(x_{2i} - \bar{x}_2\right)'\right)^{-1}\left(\sum_{i=1}^n \left(x_{2i} - \bar{x}_2\right)\left(y_i - \bar{y}\right)\right).$$
Thus the OLS estimator for the slope coefficients is a regression with demeaned data.
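As a quick numerical check, the following sketch (reusing the simulated X and y from above; purely illustrative) verifies that the demeaned regression reproduces the slope coefficients from the full regression:

```python
# Split X into the constant (first column) and the observed regressors.
X2 = X[:, 1:]

# Demeaned data, i.e., M1*X2 and M1*Y with X1 = a column of ones.
X2_t = X2 - X2.mean(axis=0)
y_t = y - y.mean()

# FWL: slopes from the demeaned regression equal the slope subvector of beta_hat.
beta2_fwl, *_ = np.linalg.lstsq(X2_t, y_t, rcond=None)
assert np.allclose(beta2_fwl, beta_hat[1:])
```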

1.6 Consistency

The OLS estimator $\hat{\beta}$ is a statistic, and thus has a statistical distribution. In general, this distribution is unknown. Asymptotic (large sample) methods approximate sampling distributions based on the limiting experiment that the sample size $n$ tends to infinity. A preliminary step in this approach is the demonstration that estimators are consistent: that they converge in probability to the true parameters as the sample size gets large.

The following decomposition is quite useful:
$$\hat{\beta} = \left(\sum_{i=1}^n x_i x_i'\right)^{-1}\sum_{i=1}^n x_i y_i = \left(\sum_{i=1}^n x_i x_i'\right)^{-1}\sum_{i=1}^n x_i\left(x_i'\beta + e_i\right) = \beta + \left(\sum_{i=1}^n x_i x_i'\right)^{-1}\sum_{i=1}^n x_i e_i. \qquad (1.13)$$
This shows that after centering, the distribution of $\hat{\beta}$ is determined by the joint distribution of $(x_i, e_i)$ only.

We can now deduce the consistency of $\hat{\beta}$. First, Assumption 1.1.1 and the WLLN (Section 17.2) imply that
$$\frac{1}{n}\sum_{i=1}^n x_i x_i' \to_p E\left(x_i x_i'\right) = Q \qquad (1.14)$$
and
$$\frac{1}{n}\sum_{i=1}^n x_i e_i \to_p E\left(x_i e_i\right) = 0. \qquad (1.15)$$
Using (1.13), we can write
$$\hat{\beta} = \beta + \left(\frac{1}{n}\sum_{i=1}^n x_i x_i'\right)^{-1}\left(\frac{1}{n}\sum_{i=1}^n x_i e_i\right) = \beta + g\left(\frac{1}{n}\sum_{i=1}^n x_i x_i', \; \frac{1}{n}\sum_{i=1}^n x_i e_i\right)$$
where $g(A, b) = A^{-1}b$ is a continuous function of $A$ and $b$ at all values of the arguments such that $A^{-1}$ exists. Now by (1.14) and (1.15),
$$\left(\frac{1}{n}\sum_{i=1}^n x_i x_i', \; \frac{1}{n}\sum_{i=1}^n x_i e_i\right) \to_p (Q, 0).$$

Assumption 1.1.1.5 implies that $Q^{-1}$ exists and thus $g(\cdot, \cdot)$ is continuous at $(Q, 0)$. Hence by the continuous mapping theorem (CMT) (Section 17.5),
$$g\left(\frac{1}{n}\sum_{i=1}^n x_i x_i', \; \frac{1}{n}\sum_{i=1}^n x_i e_i\right) \to_p g(Q, 0) = Q^{-1}0 = 0,$$
so
$$\hat{\beta} = \beta + g\left(\frac{1}{n}\sum_{i=1}^n x_i x_i', \; \frac{1}{n}\sum_{i=1}^n x_i e_i\right) \to_p \beta + 0 = \beta.$$

Theorem 1.6.1 Under Assumption 1.1.1, as $n \to \infty$, $\hat{\beta} \to_p \beta$.

In Section 1.2 we also defined the sample error variance $\hat{\sigma}^2$. We now demonstrate its consistency for $\sigma^2$. Using (1.8),
$$n\hat{\sigma}^2 = \hat{e}'\hat{e} = e'MMe = e'Me = e'e - e'Pe. \qquad (1.16)$$
An application of the WLLN yields
$$\frac{1}{n}\sum_{i=1}^n e_i^2 \to_p Ee_i^2 = \sigma^2$$
as $n \to \infty$, so combined with (1.14) and (1.15),
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n e_i^2 - \left(\frac{1}{n}\sum_{i=1}^n e_i x_i'\right)\left(\frac{1}{n}\sum_{i=1}^n x_i x_i'\right)^{-1}\left(\frac{1}{n}\sum_{i=1}^n x_i e_i\right) \to_p \sigma^2 - 0'Q^{-1}0 = \sigma^2, \qquad (1.17)$$
so $\hat{\sigma}^2$ is consistent for $\sigma^2$.

1.7 Asymptotic Normality

We now establish the asymptotic distribution of $\hat{\beta}$ after normalization. We need a strengthening of the moment conditions.

Assumption 1.7.1 In addition to Assumption 1.1.1, $Ee_i^4 < \infty$ and $E|x_i|^4 < \infty$.

Now define $\Omega = E\left(x_i x_i' e_i^2\right)$. Assumption 1.7.1 guarantees that the elements of $\Omega$ are finite. To see this, by the Cauchy-Schwarz inequality and Assumption 1.7.1,
$$E\left|x_i x_i' e_i^2\right| \le \left(E\left|x_i x_i'\right|^2\right)^{1/2}\left(Ee_i^4\right)^{1/2} = \left(E|x_i|^4\right)^{1/2}\left(Ee_i^4\right)^{1/2} < \infty. \qquad (1.18)$$

Thus $x_i e_i$ is iid with mean zero and covariance matrix $\Omega$. By the central limit theorem (Section 17.4),
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n x_i e_i \to_d N(0, \Omega). \qquad (1.19)$$
Then using (1.13), (1.14), and (1.19),
$$\sqrt{n}\left(\hat{\beta} - \beta\right) = \left(\frac{1}{n}\sum_{i=1}^n x_i x_i'\right)^{-1}\left(\frac{1}{\sqrt{n}}\sum_{i=1}^n x_i e_i\right) \to_d Q^{-1}N(0, \Omega) = N\left(0, Q^{-1}\Omega Q^{-1}\right).$$

Theorem 1.7.1 Under Assumption 1.7.1, as $n \to \infty$,
$$\sqrt{n}\left(\hat{\beta} - \beta\right) \to_d N(0, V)$$
where $V = Q^{-1}\Omega Q^{-1}$.

As $V$ is the variance of the asymptotic distribution of $\sqrt{n}(\hat{\beta} - \beta)$, $V$ is often referred to as the asymptotic covariance matrix of $\hat{\beta}$. The form $V = Q^{-1}\Omega Q^{-1}$ is called a sandwich form.

It may be insightful to examine a special case where $\Omega$ and $V$ simplify:

Homoskedastic Projection Error: $\mathrm{Cov}\left(x_i x_i', e_i^2\right) = 0$.

This condition holds, for example, when $x_i$ and $e_i$ are independent, but independence is not a necessary condition. We should not expect the condition to hold generically, but when it does, the asymptotic variance formulas simplify. If it is true, then
$$\Omega = E\left(x_i x_i'\right)E\left(e_i^2\right) = Q\sigma^2 \qquad (1.20)$$
and
$$V = Q^{-1}\Omega Q^{-1} = Q^{-1}\sigma^2 \equiv V^0. \qquad (1.21)$$
In (1.21) we define $V^0 = Q^{-1}\sigma^2$, as this matrix is defined even if the homoskedasticity condition is false, although in that case $V^0$ does not equal $V$. We call $V^0$ the homoskedastic covariance matrix.

1.8 Covariance Matrix Estimation

The homoskedastic covariance matrix $V^0 = Q^{-1}\sigma^2$ can be estimated by
$$\hat{V}^0 = \hat{Q}^{-1}\hat{\sigma}^2 \qquad (1.22)$$
where
$$\hat{Q} = \frac{1}{n}\sum_{i=1}^n x_i x_i' = \frac{1}{n}X'X$$
is the method of moments estimator for $Q$. Since $\hat{Q} \to_p Q$ and $\hat{\sigma}^2 \to_p \sigma^2$ (see (1.14) and (1.17)), it is clear that $\hat{V}^0 \to_p V^0$.

To estimate $V = Q^{-1}\Omega Q^{-1}$, we need an estimate of $\Omega = E(x_i x_i' e_i^2)$. Its method of moments estimator is
$$\hat{\Omega} = \frac{1}{n}\sum_{i=1}^n x_i x_i'\hat{e}_i^2$$
where $\hat{e}_i$ are the OLS residuals. A useful computational formula is to define $\hat{u}_i = x_i\hat{e}_i$ and the $n \times k$ matrix
$$\hat{u} = \begin{pmatrix} \hat{u}_1' \\ \hat{u}_2' \\ \vdots \\ \hat{u}_n' \end{pmatrix}.$$
Then
$$\hat{\Omega} = \frac{1}{n}\hat{u}'\hat{u}, \qquad \hat{V} = n\left(X'X\right)^{-1}\hat{u}'\hat{u}\left(X'X\right)^{-1}.$$
This estimator was introduced to the econometrics literature by White (1980).

The estimator $\hat{V}^0$ was the dominant covariance estimator used before 1980, and was still the standard choice in the 1980s. From my reading of the literature, the White estimator $\hat{V}$ started to come into common use in the early 1990s, and by the late 1990s was quite commonly used, especially by younger researchers. When reading and reporting applied work, it is important to pay attention to the distinction between $\hat{V}^0$ and $\hat{V}$, as it is not always clear which has been used. When $\hat{V}$ is used rather than the traditional choice $\hat{V}^0$, many authors will state that "their standard errors have been corrected for heteroskedasticity", or that they use a "heteroskedasticity-robust covariance matrix estimator", or that they use the "White formula", the "Eicker-White formula", the "Huber formula", the "Huber-White formula" or the "GMM covariance matrix". In most cases, these all mean the same thing.
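Continuing the numpy sketch from before (illustrative only), both covariance estimators and the resulting standard errors can be computed as follows; this is one way to code White's formula, not the text's own program:

```python
XtX_inv = np.linalg.inv(X.T @ X)

# Homoskedastic estimator: V0_hat = Q_hat^{-1} * sigma2_hat with Q_hat = X'X/n.
V0_hat = n * XtX_inv * sigma2_hat

# White (1980) heteroskedasticity-robust estimator:
# Omega_hat = u'u/n with u_i = x_i * e_hat_i, and V_hat = n (X'X)^{-1} u'u (X'X)^{-1}.
u = X * e_hat[:, None]
V_hat = n * XtX_inv @ (u.T @ u) @ XtX_inv

# Standard errors: square roots of the diagonal of n^{-1} V_hat (see Section 1.8).
se_robust = np.sqrt(np.diag(V_hat / n))
se_homosk = np.sqrt(np.diag(V0_hat / n))
```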

We now show $\hat{\Omega} \to_p \Omega$, from which it follows that $\hat{V} \to_p V$ as $n \to \infty$. Expanding the quadratic,
$$\hat{e}_i^2 = \left(y_i - x_i'\hat{\beta}\right)^2 = \left(e_i - x_i'\left(\hat{\beta} - \beta\right)\right)^2 = e_i^2 - 2\left(\hat{\beta} - \beta\right)'x_i e_i + \left(\hat{\beta} - \beta\right)'x_i x_i'\left(\hat{\beta} - \beta\right).$$
Hence
$$\hat{\Omega} = \frac{1}{n}\sum_{i=1}^n x_i x_i'\hat{e}_i^2 = \frac{1}{n}\sum_{i=1}^n x_i x_i'e_i^2 - \frac{2}{n}\sum_{i=1}^n x_i x_i'\left(\hat{\beta} - \beta\right)'x_i e_i + \frac{1}{n}\sum_{i=1}^n x_i x_i'\left(\hat{\beta} - \beta\right)'x_i x_i'\left(\hat{\beta} - \beta\right). \qquad (1.23)$$
We now examine each sum on the right-hand side of (1.23) in turn. First, (1.18) and the WLLN show that
$$\frac{1}{n}\sum_{i=1}^n x_i x_i'e_i^2 \to_p E\left(x_i x_i'e_i^2\right) = \Omega.$$
Second, by Holder's inequality (Section 17.1),
$$E\left(|x_i|^3|e_i|\right) \le \left(E|x_i|^4\right)^{3/4}\left(Ee_i^4\right)^{1/4} < \infty,$$
so by the WLLN
$$\frac{1}{n}\sum_{i=1}^n |x_i|^3|e_i| \to_p E\left(|x_i|^3|e_i|\right),$$
and thus, since $\hat{\beta} - \beta \to_p 0$,
$$\left|\frac{2}{n}\sum_{i=1}^n x_i x_i'\left(\hat{\beta} - \beta\right)'x_i e_i\right| \le 2\left|\hat{\beta} - \beta\right|\frac{1}{n}\sum_{i=1}^n |x_i|^3|e_i| \to_p 0.$$
Third, by the WLLN
$$\frac{1}{n}\sum_{i=1}^n |x_i|^4 \to_p E|x_i|^4,$$
so
$$\left|\frac{1}{n}\sum_{i=1}^n x_i x_i'\left(\hat{\beta} - \beta\right)'x_i x_i'\left(\hat{\beta} - \beta\right)\right| \le \left|\hat{\beta} - \beta\right|^2\frac{1}{n}\sum_{i=1}^n |x_i|^4 \to_p 0.$$
Together, these establish consistency.

Theorem 1.8.1 As $n \to \infty$, $\hat{\Omega} \to_p \Omega$.

The variance estimator $\hat{V}$ is an estimate of the variance of the asymptotic distribution of $\hat{\beta}$. A more easily interpretable measure of spread is its square root, the standard deviation. This motivates the definition of a standard error.

Definition 1.8.1 A standard error $s(\hat{\beta})$ for an estimator $\hat{\beta}$ is an estimate of the standard deviation of the distribution of $\hat{\beta}$.

When $\theta$ is scalar, and $\hat{V}$ is an estimator of the variance of $\sqrt{n}(\hat{\theta} - \theta)$, we set $s(\hat{\theta}) = n^{-1/2}\sqrt{\hat{V}}$. When $\beta$ is a vector, we focus on individual elements of $\beta$ one at a time, viz., $\beta_j$ for $j = 1, \ldots, k$. Thus
$$s(\hat{\beta}_j) = n^{-1/2}\sqrt{\hat{V}_{jj}}.$$
Generically, standard errors are not unique, as there may be more than one estimator of the variance of the estimator. It is therefore important to understand what formula and method is used by an author when studying their work. It is also important to understand that a particular standard error may be relevant under one set of model assumptions, but not under another set of assumptions, just as any other estimator.

From a computational standpoint, the standard method to calculate the standard errors is to first calculate $n^{-1}\hat{V}$, then take the diagonal elements, and then the square roots.

1.9 Functions of Parameters

Sometimes we are interested in some function of the parameter vector. Let $h : \mathbb{R}^k \to \mathbb{R}^q$, and
$$\theta = h(\beta).$$
We will assume from now on that $h(\beta)$ is continuously differentiable at the true value of $\beta$. The estimate of $\theta$ is
$$\hat{\theta} = h(\hat{\beta}).$$
What is an appropriate standard error for $\hat{\theta}$? By a first-order Taylor series approximation,
$$h(\hat{\beta}) \simeq h(\beta) + H_\beta'\left(\hat{\beta} - \beta\right)$$
where
$$H_\beta = \frac{\partial}{\partial\beta}h(\beta) \qquad (k \times q).$$
Thus
$$\sqrt{n}\left(\hat{\theta} - \theta\right) = \sqrt{n}\left(h(\hat{\beta}) - h(\beta)\right) \simeq H_\beta'\sqrt{n}\left(\hat{\beta} - \beta\right) \to_d H_\beta'N(0, V) = N(0, V_\theta) \qquad (1.24)$$
where $V_\theta = H_\beta'VH_\beta$.

If $\hat{V}$ is the estimated covariance matrix for $\hat{\beta}$, then the natural estimate for the variance of $\hat{\theta}$ is
$$\hat{V}_\theta = \hat{H}_\beta'\hat{V}\hat{H}_\beta$$
where
$$\hat{H}_\beta = \frac{\partial}{\partial\beta}h(\hat{\beta}).$$
In many cases, the function $h(\beta)$ is linear:
$$h(\beta) = R'\beta$$
for some $k \times q$ matrix $R$. In this case, $H_\beta = R$ and $\hat{H}_\beta = R$, so $\hat{V}_\theta = R'\hat{V}R$.

For example, if $R$ is a "selector matrix"
$$R = \begin{pmatrix} I \\ 0 \end{pmatrix},$$
so that if $\beta = (\beta_1, \beta_2)$, then $\theta = R'\beta = \beta_1$ and
$$\hat{V}_\theta = \begin{pmatrix} I & 0 \end{pmatrix}\hat{V}\begin{pmatrix} I \\ 0 \end{pmatrix} = \hat{V}_{11},$$
the upper-left block of $\hat{V}$.

When $q = 1$ (so $h(\beta)$ is real-valued), the standard error for $\hat{\theta}$ is the square root of $n^{-1}\hat{V}_\theta$, that is,
$$s(\hat{\theta}) = n^{-1/2}\sqrt{\hat{H}_\beta'\hat{V}\hat{H}_\beta}.$$
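As an illustration, here is a sketch of a delta-method standard error for a nonlinear scalar function of the coefficients, say θ = β₂/β₃ (a hypothetical ratio chosen only for the example), using the robust V̂ computed above:

```python
# theta = h(beta) = beta[1] / beta[2]; H is its gradient (k x 1).
theta_hat = beta_hat[1] / beta_hat[2]
H = np.zeros(k)
H[1] = 1.0 / beta_hat[2]
H[2] = -beta_hat[1] / beta_hat[2] ** 2

# Delta method: s(theta_hat) = sqrt(H' (V_hat/n) H).
se_theta = np.sqrt(H @ (V_hat / n) @ H)
```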

1.10 t tests

Let $\theta = h(\beta) : \mathbb{R}^k \to \mathbb{R}$ be any parameter of interest, $\hat{\theta}$ its estimate and $s(\hat{\theta})$ its asymptotic standard error. Consider the studentized statistic
$$t_n(\theta) = \frac{\hat{\theta} - \theta}{s(\hat{\theta})}. \qquad (1.25)$$
It is easy to calculate that this statistic has the asymptotic distribution
$$t_n(\theta) = \frac{\hat{\theta} - \theta}{s(\hat{\theta})} = \frac{\sqrt{n}\left(\hat{\theta} - \theta\right)}{\sqrt{\hat{V}_\theta}} \to_d \frac{N(0, V_\theta)}{\sqrt{V_\theta}} = N(0, 1),$$
the standard normal. This distribution is known. Since this distribution does not depend on the parameters, we say that $t_n(\theta)$ is asymptotically pivotal. In special cases (such as the normal regression model, see Section 2.7), the statistic $t_n$ has an exact t distribution, and is therefore exactly free of unknowns. In this case, we say that $t_n$ is an exactly pivotal statistic. In general, however, pivotal statistics are unavailable and so we must rely on asymptotically pivotal statistics.

A simple null and composite alternative hypothesis takes the form
$$H_0 : \theta = \theta_0$$
$$H_1 : \theta \neq \theta_0$$
where $\theta_0$ is some pre-specified value, and $\theta = h(\beta)$ is some function of the parameter vector. (For example, $\theta$ could be a single element of $\beta$.)

The standard test for $H_0$ against $H_1$ is the t-statistic (or studentized statistic)
$$t_n = t_n(\theta_0) = \frac{\hat{\theta} - \theta_0}{s(\hat{\theta})}.$$
Under $H_0$, $t_n \to_d N(0, 1)$. Let $z_{\alpha/2}$ be the upper $\alpha/2$ quantile of the standard normal distribution. That is, if $Z \sim N(0, 1)$, then $P(Z > z_{\alpha/2}) = \alpha/2$ and $P(|Z| > z_{\alpha/2}) = \alpha$. For example, $z_{.025} = 1.96$ and $z_{.05} = 1.645$. A test of asymptotic significance $\alpha$ rejects $H_0$ if $|t_n| > z_{\alpha/2}$. Otherwise the test does not reject, or "accepts" $H_0$. This is because
$$P(\text{reject } H_0 \mid H_0) = P\left(|t_n| > z_{\alpha/2} \mid \theta = \theta_0\right) \to P\left(|Z| > z_{\alpha/2}\right) = \alpha.$$
The rejection/acceptance dichotomy is associated with the Neyman-Pearson approach to hypothesis testing.

An alternative approach, associated with Fisher, is to report an asymptotic p-value. The asymptotic p-value for the above statistic is constructed as follows. Define the tail probability, or asymptotic p-value function
$$p(t) = P(|Z| > |t|) = 2(1 - \Phi(|t|)).$$
Then the asymptotic p-value of the statistic $t_n$ is
$$p_n = p(t_n).$$
If the p-value $p_n$ is small (close to zero) then the evidence against $H_0$ is strong. In a sense, p-values and hypothesis tests are equivalent since $p_n < \alpha$ if and only if $|t_n| > z_{\alpha/2}$; thus an equivalent statement of a Neyman-Pearson test is to reject at the $\alpha\%$ level if and only if $p_n < \alpha$. The p-value is more general, however, in that the reader is allowed to pick the level of significance $\alpha$, in contrast to Neyman-Pearson rejection/acceptance reporting, where the researcher picks the level.

Another helpful observation is that the p-value function is simply a unit-free transformation of the test statistic. That is, under $H_0$, $p_n \to_d U[0, 1]$, so the "unusualness" of the test statistic can be compared to the easy-to-understand uniform distribution, regardless of the complication of the distribution of the original test statistic. To see this fact, note that the asymptotic distribution of $|t_n|$ is $F(x) = 1 - p(x)$. Thus
$$P(1 - p_n \le u) = P(1 - p(t_n) \le u) = P(F(t_n) \le u) = P\left(|t_n| \le F^{-1}(u)\right) \to F\left(F^{-1}(u)\right) = u,$$
establishing that $1 - p_n \to_d U[0, 1]$, from which it follows that $p_n \to_d U[0, 1]$.

It may be helpful to note that in the GAUSS language, the function $p(t)$ may be computed by the expression p = 2*cdfnc(t).

1.11 Confidence Intervals

A confidence interval $C_n$ is an interval estimate of $\theta$, and is a function of the data and hence is random. It is designed to cover $\theta$ with high probability. Either $\theta \in C_n$ or $\theta \notin C_n$. The coverage probability is $P(\theta \in C_n)$.

We typically cannot calculate the exact coverage probability $P(\theta \in C_n)$. However we often can calculate $\lim_{n\to\infty} P(\theta \in C_n)$. We call this the asymptotic coverage probability. We say that $C_n$ has asymptotic $(1 - \alpha)\%$ coverage for $\theta$ if $P(\theta \in C_n) \to 1 - \alpha$ as $n \to \infty$.

A good method for construction of a confidence interval is the collection of parameter values which are not rejected by an appropriate statistical test. We recall the t-statistic (1.25) and its test: reject $H_0 : \theta = \theta_0$ if $|t_n(\theta_0)| > z_{\alpha/2}$, where $z_{\alpha/2}$ again is the upper $\alpha/2$ quantile of the standard normal distribution. Our confidence interval is then
$$C_n = \left\{\theta : |t_n(\theta)| \le z_{\alpha/2}\right\} = \left\{\theta : -z_{\alpha/2} \le \frac{\hat{\theta} - \theta}{s(\hat{\theta})} \le z_{\alpha/2}\right\} = \left[\hat{\theta} - z_{\alpha/2}s(\hat{\theta}), \; \hat{\theta} + z_{\alpha/2}s(\hat{\theta})\right]. \qquad (1.26)$$
While there is no hard-and-fast guideline for choosing the coverage probability $1 - \alpha$, the most common professional choice is 95%, or $\alpha = .05$. This corresponds to selecting the confidence interval $[\hat{\theta} \pm 1.96 s(\hat{\theta})] \approx [\hat{\theta} \pm 2 s(\hat{\theta})]$. Thus values of $\theta$ within two standard errors of the estimated $\hat{\theta}$ are considered "reasonable" candidates for the true value $\theta$, and values of $\theta$ outside two standard errors of the estimated $\hat{\theta}$ are considered unlikely or unreasonable candidates for the true value.

The interval has been constructed so that as $n \to \infty$,
$$P(\theta \in C_n) = P\left(|t_n(\theta)| \le z_{\alpha/2}\right) \to P\left(|Z| \le z_{\alpha/2}\right) = 1 - \alpha,$$
and $C_n$ is an asymptotic $(1 - \alpha)\%$ confidence interval.
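In Python (with scipy standing in for the GAUSS calls mentioned above; an illustrative sketch), the t-statistic, p-value, and 95% confidence interval for a single coefficient can be computed as:

```python
from scipy.stats import norm

j = 1                                        # test H0: beta_j = 0 for the second coefficient
t_stat = beta_hat[j] / se_robust[j]
p_value = 2 * (1 - norm.cdf(abs(t_stat)))    # scipy analogue of GAUSS 2*cdfnc(t)

z = norm.ppf(0.975)                          # z_{alpha/2} = 1.96 for alpha = .05
ci = (beta_hat[j] - z * se_robust[j], beta_hat[j] + z * se_robust[j])
```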

1.12 Wald Tests

Sometimes $\theta = h(\beta)$ is a $q \times 1$ vector, and it is desired to test the joint restrictions simultaneously. In this case the t-statistic approach does not work. We have the null and alternative
$$H_0 : \theta = \theta_0$$
$$H_1 : \theta \neq \theta_0.$$
The natural estimate of $\theta$ is $\hat{\theta} = h(\hat{\beta})$, which has asymptotic covariance matrix estimate
$$\hat{V}_\theta = \hat{H}_\beta'\hat{V}\hat{H}_\beta$$
where
$$\hat{H}_\beta = \frac{\partial}{\partial\beta}h(\hat{\beta}).$$
The Wald statistic for $H_0$ against $H_1$ is
$$W_n = n\left(\hat{\theta} - \theta_0\right)'\hat{V}_\theta^{-1}\left(\hat{\theta} - \theta_0\right) = n\left(h(\hat{\beta}) - \theta_0\right)'\left(\hat{H}_\beta'\hat{V}\hat{H}_\beta\right)^{-1}\left(h(\hat{\beta}) - \theta_0\right).$$
When $h$ is a linear function of $\beta$, $h(\beta) = R'\beta$, the Wald statistic takes the form
$$W_n = n\left(R'\hat{\beta} - \theta_0\right)'\left(R'\hat{V}R\right)^{-1}\left(R'\hat{\beta} - \theta_0\right).$$

The delta method (1.24) showed that $\sqrt{n}(\hat{\theta} - \theta) \to_d Z \sim N(0, V_\theta)$, and Theorem 1.8.1 showed that $\hat{V} \to_p V$. Furthermore, $H_\beta(\beta)$ is a continuous function of $\beta$, so by the continuous mapping theorem, $H_\beta(\hat{\beta}) \to_p H_\beta$. Thus $\hat{V}_\theta = \hat{H}_\beta'\hat{V}\hat{H}_\beta \to_p H_\beta'VH_\beta = V_\theta > 0$ if $H_\beta$ has full rank $q$. Hence
$$W_n = n\left(\hat{\theta} - \theta_0\right)'\hat{V}_\theta^{-1}\left(\hat{\theta} - \theta_0\right) \to_d Z'V_\theta^{-1}Z = \chi_q^2$$
by Theorem 16.8.1. We have established:

Theorem 1.12.1 Under $H_0$ and Assumption 1.7.1, if $\mathrm{rank}(H_\beta) = q$, then $W_n \to_d \chi_q^2$, a chi-square random variable with $q$ degrees of freedom.

An asymptotic Wald test rejects $H_0$ in favor of $H_1$ if $W_n$ exceeds $\chi_q^2(\alpha)$, the upper-$\alpha$ quantile of the $\chi_q^2$ distribution. For example, $\chi_1^2(.05) = 3.84 = z_{.025}^2$. The Wald test fails to reject if $W_n$ is less than $\chi_q^2(\alpha)$. The asymptotic p-value for $W_n$ is $p_n = p(W_n)$, where $p(x) = P(\chi_q^2 \ge x)$ is the tail probability function of the $\chi_q^2$ distribution. As before, the test rejects at the $\alpha\%$ level if $p_n < \alpha$, and $p_n$ is asymptotically $U[0, 1]$ under $H_0$. In addition, it may be helpful to note that in the GAUSS language, the function $p(t)$ may be computed by the expression p = cdfchic(t).
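A sketch of the linear-restriction Wald test in numpy/scipy, testing the hypothetical joint restriction β₂ = β₃ = 0 in the simulated model (illustrative only):

```python
from scipy.stats import chi2

# R selects the last two coefficients: theta = R'beta = (beta_2, beta_3)
# in the text's 1-based notation.
R = np.zeros((k, 2))
R[1, 0] = R[2, 1] = 1.0
q = 2

theta0 = np.zeros(q)
theta_est = R.T @ beta_hat
V_theta = R.T @ V_hat @ R

# W_n = n (theta_hat - theta0)' V_theta^{-1} (theta_hat - theta0).
W = n * (theta_est - theta0) @ np.linalg.solve(V_theta, theta_est - theta0)
p_wald = chi2.sf(W, df=q)                    # analogue of GAUSS cdfchic
```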

1.13 F Tests

Take the linear model
$$Y = X_1\beta_1 + X_2\beta_2 + e$$
where $X_1$ is $n \times k_1$, $X_2$ is $n \times k_2$, and $k = k_1 + k_2$. The null hypothesis is $H_0 : \beta_2 = 0$. In this case, $\theta = \beta_2$, and there are $q = k_2$ restrictions. Also $h(\beta) = R'\beta$ is linear with
$$R = \begin{pmatrix} 0 \\ I \end{pmatrix}$$
a selector matrix. We know that the Wald statistic takes the form
$$W_n = n\hat{\theta}'\hat{V}_\theta^{-1}\hat{\theta} = n\hat{\beta}_2'\left(R'\hat{V}R\right)^{-1}\hat{\beta}_2.$$
What we will show in this section is that if $\hat{V}$ is replaced with $\hat{V}^0 = \hat{\sigma}^2\left(n^{-1}X'X\right)^{-1}$, the covariance matrix estimator valid under homoskedasticity, then the Wald statistic can be written in the form
$$W_n = n\left(\frac{\tilde{\sigma}^2 - \hat{\sigma}^2}{\hat{\sigma}^2}\right) \qquad (1.27)$$
where
$$\tilde{\sigma}^2 = \frac{1}{n}\tilde{e}'\tilde{e}, \qquad \tilde{e} = Y - X_1\tilde{\beta}_1, \qquad \tilde{\beta}_1 = \left(X_1'X_1\right)^{-1}X_1'Y$$
are from OLS of $Y$ on $X_1$, and
$$\hat{\sigma}^2 = \frac{1}{n}\hat{e}'\hat{e}, \qquad \hat{e} = Y - X\hat{\beta}, \qquad \hat{\beta} = \left(X'X\right)^{-1}X'Y$$
are from OLS of $Y$ on $X = (X_1, X_2)$.

The elegant feature about (1.27) is that it is directly computable from the standard output from two simple OLS regressions, as the sum of squared errors is a typical output from statistical packages. This statistic is typically reported as an "F-statistic", which is defined as
$$F = \frac{n - k}{n k_2}W_n = \frac{\left(\tilde{\sigma}^2 - \hat{\sigma}^2\right)/k_2}{\hat{\sigma}^2/(n - k)}.$$
While it should be emphasized that equality (1.27) only holds if $\hat{V}^0 = \hat{\sigma}^2(n^{-1}X'X)^{-1}$ is used, this formula still often finds good use in reading applied papers.

We now derive expression (1.27). First, note that using (1.11),
$$R'\hat{V}^0R = n\hat{\sigma}^2 R'\begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}^{-1}R = \hat{\sigma}^2\left(\frac{1}{n}X_2'M_1X_2\right)^{-1}$$

where $M_1 = I - X_1\left(X_1'X_1\right)^{-1}X_1'$. Thus
$$W_n = n\hat{\beta}_2'\left(R'\hat{V}^0R\right)^{-1}\hat{\beta}_2 = \frac{\hat{\beta}_2'\left(X_2'M_1X_2\right)\hat{\beta}_2}{\hat{\sigma}^2}.$$
To simplify this expression further, note that if we regress $Y$ on $X_1$ alone, the residual is $\tilde{e} = M_1Y$. Now consider the residual regression of $\tilde{e}$ on $\tilde{X}_2 = M_1X_2$. By the FWL theorem, $\tilde{e} = \tilde{X}_2\hat{\beta}_2 + \hat{e}$ and $\tilde{X}_2'\hat{e} = 0$. Thus
$$\tilde{e}'\tilde{e} = \left(\tilde{X}_2\hat{\beta}_2 + \hat{e}\right)'\left(\tilde{X}_2\hat{\beta}_2 + \hat{e}\right) = \hat{\beta}_2'\tilde{X}_2'\tilde{X}_2\hat{\beta}_2 + \hat{e}'\hat{e} = \hat{\beta}_2'X_2'M_1X_2\hat{\beta}_2 + \hat{e}'\hat{e},$$
or alternatively,
$$\hat{\beta}_2'X_2'M_1X_2\hat{\beta}_2 = \tilde{e}'\tilde{e} - \hat{e}'\hat{e}.$$
Also, since $\hat{\sigma}^2 = n^{-1}\hat{e}'\hat{e}$, we conclude that
$$W_n = n\left(\frac{\tilde{e}'\tilde{e} - \hat{e}'\hat{e}}{\hat{e}'\hat{e}}\right) = n\left(\frac{\tilde{\sigma}^2 - \hat{\sigma}^2}{\hat{\sigma}^2}\right),$$
as claimed.

In many statistical packages, when an OLS regression is reported, an "F statistic" is reported. This is
$$F = \frac{\left(\tilde{\sigma}_y^2 - \hat{\sigma}^2\right)/(k - 1)}{\hat{\sigma}^2/(n - k)}$$
where
$$\tilde{\sigma}_y^2 = \frac{1}{n}(y - \bar{y})'(y - \bar{y})$$
is the sample variance of $y_i$, equivalently the residual variance from an intercept-only model. This special F statistic is testing the hypothesis that all slope coefficients (other than the intercept) are zero. This was a popular statistic in the early days of econometric reporting, when sample sizes were very small and researchers wanted to know if there was "any explanatory power" to their regression. This is rarely an issue today, as sample sizes are typically sufficiently large that this F statistic is highly "significant". Certainly, there are special cases where this F statistic is useful, but these cases are atypical.
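A sketch of the homoskedastic F statistic computed from two OLS fits, following (1.27) (again reusing the simulated data; the partition of the regressors, intercept versus slopes, is illustrative):

```python
from scipy.stats import f as f_dist

X1 = X[:, :1]                            # restricted model: intercept only
k2 = X.shape[1] - X1.shape[1]            # number of restrictions

# Restricted residual variance (unrestricted sigma2_hat was computed above).
e_tilde = y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
sigma2_tilde = np.mean(e_tilde**2)

W = n * (sigma2_tilde - sigma2_hat) / sigma2_hat
F = ((sigma2_tilde - sigma2_hat) / k2) / (sigma2_hat / (n - k))
p_f = f_dist.sf(F, k2, n - k)
```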

Chapter 2

Regression Models

2.1 Regression

In regression, we want to find the central tendency of the conditional distribution of $y_i$ given $x_i$. A standard measure of central tendency is the mean. The conditional analog is the conditional mean
$$m(x) = E(y_i \mid x_i = x).$$
In general, $m(x)$ can take any form.

The regression error $e_i$ is defined to be the difference between $y_i$ and its conditional mean:
$$e_i = y_i - m(x_i).$$
By construction, this yields the formula
$$y_i = m(x_i) + e_i. \qquad (2.1)$$
It is worth emphasizing that no assumptions have been used to develop (2.1), other than that $(y_i, x_i)$ have a joint distribution and $E|y_i| < \infty$.

Proposition 2.1.1 Properties of the regression error $e_i$:
1. $E(e_i \mid x_i) = 0$.
2. $E(e_i) = 0$.
3. $E(h(x_i)e_i) = 0$ for any function $h(\cdot)$.
4. $E(x_ie_i) = 0$.

Proof:

1. By the definition of $e_i$ and the linearity of conditional expectations,
$$E(e_i \mid x_i) = E\left((y_i - m(x_i)) \mid x_i\right) = E(y_i \mid x_i) - E(m(x_i) \mid x_i) = m(x_i) - m(x_i) = 0.$$

2. By the law of iterated expectations (Theorem 16.7) and the first result,
$$E(e_i) = E\left(E(e_i \mid x_i)\right) = E(0) = 0.$$

3. By a similar argument, and using the conditioning theorem (Theorem 16.9),
$$E(h(x_i)e_i) = E\left(E(h(x_i)e_i \mid x_i)\right) = E\left(h(x_i)E(e_i \mid x_i)\right) = E(h(x_i) \cdot 0) = 0.$$

4. Follows from the third result setting $h(x_i) = x_i$.

Equation (2.1) plus Proposition 2.1.1.1 are often stated jointly as the regression framework:
$$y_i = m(x_i) + e_i \qquad (2.2)$$
$$E(e_i \mid x_i) = 0.$$
It is important to understand that this is a framework, not a model, because no restrictions have been placed on the joint distribution of the data. These equations hold true by definition.

A regression model imposes further restrictions on the joint distribution; most typically, restrictions on the permissible class of regression functions $m(x)$. The most common choice is the linear regression model. It specifies that $m(x)$ is a linear function of $x$:
$$y_i = x_i'\beta + e_i$$
$$E(e_i \mid x_i) = 0.$$
Since this linear equation is a special case of the general conditional mean equation, this is a substantive restriction which may or may not be true in a specific application.

The fact that Proposition 2.1.1.4 is the same as Assumption 1.1.1.2 means that the linear regression model is a special case of the least-squares projection model of Chapter 1. Another way of saying this is that the conditional mean assumption $E(e_i \mid x_i) = 0$ is stronger than the uncorrelatedness assumption $E(x_ie_i) = 0$.

It is also useful to define the conditional variance of $y_i$ given $x_i = x$:
$$\mathrm{Var}(y_i \mid x_i = x) = E\left(e_i^2 \mid x_i = x\right) = \sigma^2(x).$$
Generally, this is a function of $x$. Just as the conditional mean function may take any form, so may the conditional variance function (other than the restriction that it is non-negative). Given the random variable $x_i$, the conditional variance is $\sigma_i^2 = \sigma^2(x_i)$. In the general case where $\sigma^2(x)$ is not necessarily a constant function, so that $\sigma_i^2$ may differ across $i$, we say that the error $e_i$ is heteroskedastic. When $\sigma^2(x)$ is a constant, so that
$$E\left(e_i^2 \mid x_i\right) = \sigma^2, \qquad (2.3)$$
we say that the error $e_i$ is homoskedastic.

The model
$$y_i = x_i'\beta + e_i$$
$$E(e_i \mid x_i) = 0$$
$$E\left(e_i^2 \mid x_i\right) = \sigma^2$$
is called the homoskedastic linear regression model. In this case, by the law of iterated expectations
$$E\left(x_ix_i'e_i^2\right) = E\left(x_ix_i'E\left(e_i^2 \mid x_i\right)\right) = Q\sigma^2,$$
which is (1.20). Thus the homoskedastic linear regression model is a special case of the homoskedastic projection error model.

2.2 Bias and Variance of OLS Estimator

The conditional mean assumption allows us to calculate the small sample conditional mean and variance of the OLS estimator.

To examine the bias of $\hat{\beta}$, use (1.3), the conditioning theorem, and the independence of the observations:
$$E\left(\hat{\beta} - \beta \mid X\right) = E\left(\left(\sum_{i=1}^n x_ix_i'\right)^{-1}\sum_{i=1}^n x_ie_i \;\Big|\; X\right) = \left(\sum_{i=1}^n x_ix_i'\right)^{-1}\sum_{i=1}^n x_iE(e_i \mid X) = \left(\sum_{i=1}^n x_ix_i'\right)^{-1}\sum_{i=1}^n x_iE(e_i \mid x_i) = 0.$$
Thus the OLS estimator $\hat{\beta}$ is unbiased for $\beta$.

To examine its covariance matrix, for a random vector $Y$ we define
$$\mathrm{Var}(Y) = E\left((Y - EY)(Y - EY)'\right) = EYY' - (EY)(EY)'.$$
Then by independence of the observations,
$$\mathrm{Var}\left(\sum_{i=1}^n x_ie_i \;\Big|\; X\right) = \sum_{i=1}^n \mathrm{Var}\left(x_ie_i \mid X\right) = \sum_{i=1}^n x_ix_i'\sigma_i^2,$$
and we find
$$\mathrm{Var}\left(\hat{\beta} \mid X\right) = \left(\sum_{i=1}^n x_ix_i'\right)^{-1}\left(\sum_{i=1}^n x_ix_i'\sigma_i^2\right)\left(\sum_{i=1}^n x_ix_i'\right)^{-1}.$$
In the special case of the linear homoskedastic regression model, $\sigma_i^2 = \sigma^2$ and the covariance matrix simplifies to
$$\mathrm{Var}\left(\hat{\beta} \mid X\right) = \left(\sum_{i=1}^n x_ix_i'\right)^{-1}\sigma^2.$$

Recall the method of moments estimator $\hat{\sigma}^2$ for $\sigma^2$. We now calculate its finite sample bias in the context of the homoskedastic linear regression model.

Using (1.16) and linear algebra manipulations,
$$E\left(n\hat{\sigma}^2 \mid X\right) = E\left(e'Me \mid X\right) = E\left(\mathrm{tr}\left(e'Me\right) \mid X\right) = E\left(\mathrm{tr}\left(Mee'\right) \mid X\right) = \mathrm{tr}\left(ME\left(ee' \mid X\right)\right) = \mathrm{tr}\left(M\sigma^2\right) = \sigma^2(n - k),$$
the final equality by (5.4). We have found that under these assumptions
$$E\hat{\sigma}^2 = \frac{n - k}{n}\sigma^2,$$
so $\hat{\sigma}^2$ is biased towards zero. Since the bias is proportional to $\sigma^2$, it is common to define the bias-corrected estimator
$$s^2 = \frac{1}{n - k}\sum_{i=1}^n \hat{e}_i^2$$
so that $Es^2 = \sigma^2$ is unbiased. It is important to remember, however, that this estimator is only unbiased in the special case of the homoskedastic linear regression model. It is not unbiased in the absence of homoskedasticity, or in the projection model.

2.3 Multicollinearity

If $\mathrm{rank}(X'X) < k$, then $\hat{\beta}$ is not defined. This can be called strict multicollinearity. This happens when the columns of $X$ are linearly dependent, i.e., there is some $\alpha \neq 0$ such that $X\alpha = 0$. Most commonly, this arises when sets of regressors are included which are identically related. For example, if $X$ includes both the logs of two prices and the log of the relative price: $\log(p_1)$, $\log(p_2)$ and $\log(p_1/p_2)$. When this happens, the applied researcher quickly discovers the error as the statistical software will be unable to construct $(X'X)^{-1}$. Since the error is discovered quickly, this is rarely a problem for applied econometric practice.

The more relevant issue is near multicollinearity, which is often called "multicollinearity" for brevity. This is the situation when the $X'X$ matrix is near singular, when the columns of $X$ are close to linearly dependent. This definition is not precise, because we have not said what it means for a matrix to be "near singular". This is one difficulty with the definition and interpretation of multicollinearity.

One implication of near singularity of matrices is that the numerical reliability of the calculations is reduced. It is possible that the reported calculations will be in error due to floating-point calculation difficulties.

More relevantly, in practice an implication of near multicollinearity is that the estimation precision of individual coefficients will be poor. We can see this most simply in a model with two regressors and no intercept:
$$y_i = x_{1i}\beta_1 + x_{2i}\beta_2 + e_i,$$
where $e_i$ is independent of $x_{1i}$ and $x_{2i}$, $Ee_i^2 = 1$, and
$$E\begin{pmatrix} x_{1i}^2 & x_{1i}x_{2i} \\ x_{1i}x_{2i} & x_{2i}^2 \end{pmatrix} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$
In this case the asymptotic covariance matrix $V$ is
$$V = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}^{-1} = \frac{1}{1 - \rho^2}\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}.$$
The correlation $\rho$ indexes collinearity, since as $\rho$ approaches 1 the matrix becomes singular. We can see the effect of collinearity on precision by examining the asymptotic variance of either coefficient estimate, which is $\left(1 - \rho^2\right)^{-1}$. As $\rho$ approaches 1, the variance rises quickly to infinity. Thus the more "collinear" are the regressors, the worse the precision of the individual coefficient estimates.

Basically, what is happening is that when the regressors are highly dependent, it is statistically difficult to disentangle the impact of $\beta_1$ from that of $\beta_2$. The precision of individual estimates is reduced.

Is there a simple solution? Basically, no. Fortunately, multicollinearity does not lead to errors in inference. The asymptotic distribution theory is still valid. Regression estimates are asymptotically normal, and estimated standard errors are consistent for the asymptotic variance. So reported confidence intervals are not inherently misleading. They will be large, correctly indicating the inherent uncertainty about the true parameter value.

2.4 Forecast Intervals

In the linear regression model, $m(x) = E(y_i \mid x_i = x) = x'\beta$. In some cases, we want to estimate $m(x)$ at a particular point $x$. Notice that this is a (linear) function of $\beta$. Letting $h(\beta) = x'\beta$ and $\theta = h(\beta)$, we see that $\hat{m}(x) = \hat{\theta} = x'\hat{\beta}$ and $H_\beta = x$, so $s(\hat{\theta}) = \sqrt{n^{-1}x'\hat{V}x}$. Thus an asymptotic 95% confidence interval for $m(x)$ is
$$\left[x'\hat{\beta} \pm 2\sqrt{n^{-1}x'\hat{V}x}\right].$$
It is interesting to observe that if this is viewed as a function of $x$, the width of the confidence set is dependent on $x$.

For a given value of $x_i = x$, we may want to forecast (guess) $y_i$ out-of-sample. A reasonable guess is the conditional mean $m(x)$, and indeed this is the mean-square-minimizing decision rule. Thus a point forecast is $\hat{m}(x) = x'\hat{\beta}$, the estimated conditional mean, as discussed above. We would also like a measure of uncertainty for the forecast.

The forecast error is $\hat{e}_i = y_i - \hat{m}(x) = e_i - x'\left(\hat{\beta} - \beta\right)$. As the out-of-sample error $e_i$ is independent of the in-sample estimate $\hat{\beta}$, this has variance
$$E\hat{e}_i^2 = E\left(e_i^2 \mid x_i = x\right) + x'E\left(\hat{\beta} - \beta\right)\left(\hat{\beta} - \beta\right)'x = \sigma^2(x) + n^{-1}x'Vx.$$
Assuming $E\left(e_i^2 \mid x_i\right) = \sigma^2$, the natural estimate of this variance is $\hat{\sigma}^2 + n^{-1}x'\hat{V}x$, so a standard error for the forecast is $\sqrt{\hat{\sigma}^2 + n^{-1}x'\hat{V}x}$. Notice that this is different from the standard error for the conditional mean.

It would appear natural to conclude that an asymptotic 95% forecast interval for $y_i$ is
$$\left[x'\hat{\beta} \pm 2\sqrt{\hat{\sigma}^2 + n^{-1}x'\hat{V}x}\right],$$
but this turns out to be incorrect. In general, the validity of an asymptotic confidence interval is based on the asymptotic normality of the studentized ratio. In the present case, this would require the asymptotic normality of the ratio
$$\frac{e_i - x'\left(\hat{\beta} - \beta\right)}{\sqrt{\hat{\sigma}^2 + n^{-1}x'\hat{V}x}}.$$
But no such asymptotic approximation can be made. The only special exception is the case where $e_i$ has the exact distribution $N(0, \sigma^2)$, which is generally invalid.

To get an accurate forecast interval, we need to estimate the conditional distribution of $e_i$ given $x_i = x$, which is a much more difficult task. Given the difficulty, most applied forecasters focus on the simple and unjustified interval
$$\left[x'\hat{\beta} \pm 2\sqrt{\hat{\sigma}^2 + n^{-1}x'\hat{V}x}\right].$$
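A sketch of the point forecast, its standard error, and the simple (and, as just noted, not formally justified) interval at a new regressor value x0 (hypothetical), reusing the earlier simulated objects:

```python
x0 = np.array([1.0, 0.5, -1.0])          # hypothetical out-of-sample regressor value

m_hat = x0 @ beta_hat                    # point forecast x0'beta_hat
se_mean = np.sqrt(x0 @ (V_hat / n) @ x0)                   # s.e. of the conditional mean
se_forecast = np.sqrt(sigma2_hat + x0 @ (V_hat / n) @ x0)  # forecast s.e.

forecast_interval = (m_hat - 2 * se_forecast, m_hat + 2 * se_forecast)
```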

2.5 NonLinearity in Regressors

In the regression setting we are interested in $E(y_i \mid x_i = x) = m(x)$, which need not be a linear function of $x$, and its precise form may be unknown. A common approach is to employ a polynomial approximation. Consider the case of $x_i \in \mathbb{R}$. Then a $k$'th order polynomial model is
$$y_i = \beta_0 + x_i\beta_1 + x_i^2\beta_2 + \cdots + x_i^k\beta_k + e_i.$$
Letting $\beta = (\beta_0, \beta_1, \ldots, \beta_k)$ and $z_i = (1, x_i, x_i^2, \ldots, x_i^k)$, this is the linear regression $y_i = z_i'\beta + e_i$.

Now suppose that $x \in \mathbb{R}^2$. A simple quadratic approximation is
$$y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + x_{1i}^2\beta_3 + x_{2i}^2\beta_4 + x_{1i}x_{2i}\beta_5 + e_i.$$
As the dimensionality of $x$ increases, such approximations can become quite non-parsimonious! In practice, therefore, most applications do not appear to use more than quadratic terms. Some applications add cubics without interactions:
$$y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + x_{1i}^2\beta_3 + x_{2i}^2\beta_4 + x_{1i}^3\beta_5 + x_{2i}^3\beta_6 + x_{1i}x_{2i}\beta_7 + e_i.$$
Non-linear approximations can also be made using alternative basis functions, such as Fourier series (sines and cosines), splines, neural nets, or wavelets.

Since these non-linear models are linear in the parameters, they can be estimated by OLS, and inference is conventional. However, the model is non-linear in the regressors, so interpretation must take this into account. For example, in the cubic model given above, the slope with respect to $x_{1i}$ is
$$\frac{\partial}{\partial x_{1i}}E(y_i \mid x_i) = \beta_1 + 2x_{1i}\beta_3 + 3x_{1i}^2\beta_5 + x_{2i}\beta_7,$$
which is a function of $x_{1i}$ and $x_{2i}$, making reporting of the "slope" difficult. In many applications, it will be important to report the slopes for different values of the regressors, carefully chosen to illustrate the point of interest. In other applications, an average slope may be sufficient. There are two obvious candidates: the derivative evaluated at the sample averages,
$$\frac{\partial}{\partial x_1}E(y_i \mid x_i)\Big|_{x_i = \bar{x}} = \beta_1 + 2\bar{x}_1\beta_3 + 3\bar{x}_1^2\beta_5 + \bar{x}_2\beta_7,$$
and the average derivative,
$$\frac{1}{n}\sum_{i=1}^n \frac{\partial}{\partial x_{1i}}E(y_i \mid x_i) = \beta_1 + \frac{2}{n}\sum_{i=1}^n x_{1i}\beta_3 + \frac{3}{n}\sum_{i=1}^n x_{1i}^2\beta_5 + \frac{1}{n}\sum_{i=1}^n x_{2i}\beta_7.$$
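A sketch of estimating such a specification by OLS, here a quadratic in two regressors built by hand (the data-generating values are illustrative):

```python
# Quadratic approximation in (x1, x2): columns 1, x1, x2, x1^2, x2^2, x1*x2.
x1, x2 = rng.normal(size=n), rng.normal(size=n)
Z = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
y2 = 1 + x1 - 0.5 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)  # illustrative DGP

gamma_hat, *_ = np.linalg.lstsq(Z, y2, rcond=None)

# Average slope with respect to x1 (for a quadratic this equals the slope
# evaluated at the sample means): gamma_1 + 2*gamma_3*mean(x1) + gamma_5*mean(x2).
avg_slope_x1 = gamma_hat[1] + 2 * gamma_hat[3] * x1.mean() + gamma_hat[5] * x2.mean()
```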

2.6 NonLinear Least Squares

We say that the regression function $m(x, \theta) = E(y_i \mid x_i = x)$ is nonlinear in the parameters if it cannot be written as $m(x, \theta) = z(x)'\theta$ for some function $z(x)$. Examples of nonlinear regression functions include
$$m(x, \theta) = \theta_1 + \theta_2\frac{x}{1 + \theta_3 x}$$
$$m(x, \theta) = \theta_1 + \theta_2 x^{\theta_3}$$
$$m(x, \theta) = \theta_1 + \theta_2\exp(\theta_3 x)$$
$$m(x, \theta) = G(x'\theta), \quad G \text{ known}$$
$$m(x, \theta) = \theta_1 + \theta_2 x_1 + \left(\theta_3 + \theta_4 x_1\right)\Phi\left(\frac{x_2 - \theta_5}{\theta_6}\right)$$
$$m(x, \theta) = \theta_1 + \theta_2 x + \theta_4\left(x - \theta_3\right)1\left(x > \theta_3\right)$$
$$m(x, \theta) = \left(\theta_1 + \theta_2 x_1\right)1\left(x_2 < \theta_3\right) + \left(\theta_4 + \theta_5 x_1\right)1\left(x_2 > \theta_3\right)$$
In the first five examples, $m(x, \theta)$ is (generically) differentiable in the parameters $\theta$. In the final two examples, $m$ is not differentiable with respect to $\theta_3$, which alters some of the analysis. When it exists, let
$$m_\theta(x, \theta) = \frac{\partial}{\partial\theta}m(x, \theta).$$

Nonlinear regression is frequently adopted because the functional form $m(x, \theta)$ is suggested by an economic model. In other cases, it is adopted as a flexible approximation to an unknown regression function.

The least squares estimator $\hat{\theta}$ minimizes the sum-of-squared-errors function
$$S_n(\theta) = \sum_{i=1}^n \left(y_i - m(x_i, \theta)\right)^2.$$
When the regression function is nonlinear, we call this the nonlinear least squares (NLLS) estimator. The NLLS residuals are $\hat{e}_i = y_i - m(x_i, \hat{\theta})$.

One motivation for the choice of NLLS as the estimation method is that the parameter $\theta$ is the solution to the population problem
$$\min_\theta E\left(y_i - m(x_i, \theta)\right)^2.$$
Since the sum-of-squared-errors function $S_n(\theta)$ is not quadratic, $\hat{\theta}$ must be found by numerical methods. See Appendix E. When $m(x, \theta)$ is differentiable, the FOC for minimization are
$$0 = \sum_{i=1}^n m_\theta(x_i, \hat{\theta})\hat{e}_i. \qquad (2.4)$$

Theorem 2.6.1 If the model is identified and $m(x, \theta)$ is differentiable with respect to $\theta$,
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) \to_d N(0, V)$$
$$V = \left(Em_{\theta i}m_{\theta i}'\right)^{-1}\left(Em_{\theta i}m_{\theta i}'e_i^2\right)\left(Em_{\theta i}m_{\theta i}'\right)^{-1}$$
where $m_{\theta i} = m_\theta(x_i, \theta_0)$.
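As an illustration, a sketch of NLLS for the exponential specification m(x, θ) = θ₁ + θ₂ exp(θ₃x), using scipy's general-purpose least-squares optimizer (one numerical approach among several, not the text's own code; the simulated values and starting point are illustrative):

```python
from scipy.optimize import least_squares

# Simulated data from the exponential regression (illustrative values).
x = rng.uniform(0, 2, size=n)
theta_true = np.array([1.0, 0.5, -1.0])
y3 = theta_true[0] + theta_true[1] * np.exp(theta_true[2] * x) + 0.1 * rng.normal(size=n)

def residuals(theta):
    # e_i(theta) = y_i - m(x_i, theta); least_squares minimizes the sum of squares.
    return y3 - (theta[0] + theta[1] * np.exp(theta[2] * x))

fit = least_squares(residuals, x0=np.array([0.0, 1.0, 0.0]))
theta_nlls = fit.x
```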


More information

Threshold Autoregressions and NonLinear Autoregressions

Threshold Autoregressions and NonLinear Autoregressions Threshold Autoregressions and NonLinear Autoregressions Original Presentation: Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Threshold Regression 1 / 47 Threshold Models

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

New Developments in Econometrics Lecture 16: Quantile Estimation

New Developments in Econometrics Lecture 16: Quantile Estimation New Developments in Econometrics Lecture 16: Quantile Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. Review of Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile

More information

Advanced Econometrics I

Advanced Econometrics I Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation Inference about Clustering and Parametric Assumptions in Covariance Matrix Estimation Mikko Packalen y Tony Wirjanto z 26 November 2010 Abstract Selecting an estimator for the variance covariance matrix

More information

Essential of Simple regression

Essential of Simple regression Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship

More information

Averaging Estimators for Regressions with a Possible Structural Break

Averaging Estimators for Regressions with a Possible Structural Break Averaging Estimators for Regressions with a Possible Structural Break Bruce E. Hansen University of Wisconsin y www.ssc.wisc.edu/~bhansen September 2007 Preliminary Abstract This paper investigates selection

More information

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY Time Series Analysis James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY PREFACE xiii 1 Difference Equations 1.1. First-Order Difference Equations 1 1.2. pth-order Difference Equations 7

More information

Econometria. Estimation and hypotheses testing in the uni-equational linear regression model: cross-section data. Luca Fanelli. University of Bologna

Econometria. Estimation and hypotheses testing in the uni-equational linear regression model: cross-section data. Luca Fanelli. University of Bologna Econometria Estimation and hypotheses testing in the uni-equational linear regression model: cross-section data Luca Fanelli University of Bologna luca.fanelli@unibo.it Estimation and hypotheses testing

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

ECON 4230 Intermediate Econometric Theory Exam

ECON 4230 Intermediate Econometric Theory Exam ECON 4230 Intermediate Econometric Theory Exam Multiple Choice (20 pts). Circle the best answer. 1. The Classical assumption of mean zero errors is satisfied if the regression model a) is linear in the

More information

Simple Linear Regression Model & Introduction to. OLS Estimation

Simple Linear Regression Model & Introduction to. OLS Estimation Inside ECOOMICS Introduction to Econometrics Simple Linear Regression Model & Introduction to Introduction OLS Estimation We are interested in a model that explains a variable y in terms of other variables

More information

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Introduction to Econometrics. Heteroskedasticity

Introduction to Econometrics. Heteroskedasticity Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory

More information

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).

More information

Models, Testing, and Correction of Heteroskedasticity. James L. Powell Department of Economics University of California, Berkeley

Models, Testing, and Correction of Heteroskedasticity. James L. Powell Department of Economics University of California, Berkeley Models, Testing, and Correction of Heteroskedasticity James L. Powell Department of Economics University of California, Berkeley Aitken s GLS and Weighted LS The Generalized Classical Regression Model

More information

Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8]

Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8] 1 Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8] Insights: Price movements in one market can spread easily and instantly to another market [economic globalization and internet

More information

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY Time Series Analysis James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY & Contents PREFACE xiii 1 1.1. 1.2. Difference Equations First-Order Difference Equations 1 /?th-order Difference

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, / 91. Bruce E.

Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, / 91. Bruce E. Forecasting Lecture 3 Structural Breaks Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, 2013 1 / 91 Bruce E. Hansen Organization Detection

More information

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley Panel Data Models James L. Powell Department of Economics University of California, Berkeley Overview Like Zellner s seemingly unrelated regression models, the dependent and explanatory variables for panel

More information

Economic modelling and forecasting

Economic modelling and forecasting Economic modelling and forecasting 2-6 February 2015 Bank of England he generalised method of moments Ole Rummel Adviser, CCBS at the Bank of England ole.rummel@bankofengland.co.uk Outline Classical estimation

More information

Introduction to Eco n o m et rics

Introduction to Eco n o m et rics 2008 AGI-Information Management Consultants May be used for personal purporses only or by libraries associated to dandelon.com network. Introduction to Eco n o m et rics Third Edition G.S. Maddala Formerly

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

Environmental Econometrics

Environmental Econometrics Environmental Econometrics Syngjoo Choi Fall 2008 Environmental Econometrics (GR03) Fall 2008 1 / 37 Syllabus I This is an introductory econometrics course which assumes no prior knowledge on econometrics;

More information

1. The Multivariate Classical Linear Regression Model

1. The Multivariate Classical Linear Regression Model Business School, Brunel University MSc. EC550/5509 Modelling Financial Decisions and Markets/Introduction to Quantitative Methods Prof. Menelaos Karanasos (Room SS69, Tel. 08956584) Lecture Notes 5. The

More information

Reduced rank regression in cointegrated models

Reduced rank regression in cointegrated models Journal of Econometrics 06 (2002) 203 26 www.elsevier.com/locate/econbase Reduced rank regression in cointegrated models.w. Anderson Department of Statistics, Stanford University, Stanford, CA 94305-4065,

More information

388 Index Differencing test ,232 Distributed lags , 147 arithmetic lag.

388 Index Differencing test ,232 Distributed lags , 147 arithmetic lag. INDEX Aggregation... 104 Almon lag... 135-140,149 AR(1) process... 114-130,240,246,324-325,366,370,374 ARCH... 376-379 ARlMA... 365 Asymptotically unbiased... 13,50 Autocorrelation... 113-130, 142-150,324-325,365-369

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

Econometrics II. Nonstandard Standard Error Issues: A Guide for the. Practitioner

Econometrics II. Nonstandard Standard Error Issues: A Guide for the. Practitioner Econometrics II Nonstandard Standard Error Issues: A Guide for the Practitioner Måns Söderbom 10 May 2011 Department of Economics, University of Gothenburg. Email: mans.soderbom@economics.gu.se. Web: www.economics.gu.se/soderbom,

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

Multivariate Time Series: VAR(p) Processes and Models

Multivariate Time Series: VAR(p) Processes and Models Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are k-vectors, Φ 1,..., Φ p are k k matrices, with

More information

ECONOMETRICS HONOR S EXAM REVIEW SESSION

ECONOMETRICS HONOR S EXAM REVIEW SESSION ECONOMETRICS HONOR S EXAM REVIEW SESSION Eunice Han ehan@fas.harvard.edu March 26 th, 2013 Harvard University Information 2 Exam: April 3 rd 3-6pm @ Emerson 105 Bring a calculator and extra pens. Notes

More information

Linear Model Under General Variance

Linear Model Under General Variance Linear Model Under General Variance We have a sample of T random variables y 1, y 2,, y T, satisfying the linear model Y = X β + e, where Y = (y 1,, y T )' is a (T 1) vector of random variables, X = (T

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................

More information

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Econ 423 Lecture Notes: Additional Topics in Time Series 1 Econ 423 Lecture Notes: Additional Topics in Time Series 1 John C. Chao April 25, 2017 1 These notes are based in large part on Chapter 16 of Stock and Watson (2011). They are for instructional purposes

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Econometrics Multiple Regression Analysis: Heteroskedasticity

Econometrics Multiple Regression Analysis: Heteroskedasticity Econometrics Multiple Regression Analysis: João Valle e Azevedo Faculdade de Economia Universidade Nova de Lisboa Spring Semester João Valle e Azevedo (FEUNL) Econometrics Lisbon, April 2011 1 / 19 Properties

More information

Multiple Regression Model: I

Multiple Regression Model: I Multiple Regression Model: I Suppose the data are generated according to y i 1 x i1 2 x i2 K x ik u i i 1...n Define y 1 x 11 x 1K 1 u 1 y y n X x n1 x nk K u u n So y n, X nxk, K, u n Rks: In many applications,

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

ECON3327: Financial Econometrics, Spring 2016

ECON3327: Financial Econometrics, Spring 2016 ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary

More information

Greene, Econometric Analysis (7th ed, 2012)

Greene, Econometric Analysis (7th ed, 2012) EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.

More information

Multivariate Regression: Part I

Multivariate Regression: Part I Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Statement of the objective: we want to explain the behavior of one variable as a

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic

An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic Chapter 6 ESTIMATION OF THE LONG-RUN COVARIANCE MATRIX An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic standard errors for the OLS and linear IV estimators presented

More information

Specification Test for Instrumental Variables Regression with Many Instruments

Specification Test for Instrumental Variables Regression with Many Instruments Specification Test for Instrumental Variables Regression with Many Instruments Yoonseok Lee and Ryo Okui April 009 Preliminary; comments are welcome Abstract This paper considers specification testing

More information

Econ 510 B. Brown Spring 2014 Final Exam Answers

Econ 510 B. Brown Spring 2014 Final Exam Answers Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) 1 2 Model Consider a system of two regressions y 1 = β 1 y 2 + u 1 (1) y 2 = β 2 y 1 + u 2 (2) This is a simultaneous equation model

More information

Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Introduction For pedagogical reasons, OLS is presented initially under strong simplifying assumptions. One of these is homoskedastic errors,

More information