Introductory Econometrics


1 Based on the textbook by Wooldridge, Introductory Econometrics: A Modern Approach. Robert M. Kunst, robert.kunst@univie.ac.at, University of Vienna and Institute for Advanced Studies Vienna. November 23, 2013

2 Outline
Introduction
Simple linear regression
Multiple linear regression
OLS in the multiple linear regression
Statistical properties of OLS
Inference in the multiple model
OLS asymptotics
Selection of regressors in the multiple model
Heteroskedasticity
Regressions with time-series observations
Asymptotics of OLS in time-series regression
Serial correlation in time-series regression
Instrumental variables estimation


5 OLS in the multiple linear regression. A multiple linear regression model with two regressors. The simplest multiple linear model is one in which a dependent variable y depends on two explanatory variables x_1 and x_2, for example wages on education and work experience: y = β_0 + β_1 x_1 + β_2 x_2 + u, where the slope β_1 measures the reaction of y to a marginal change in x_1 keeping x_2 fixed (ceteris paribus), i.e. ∂y/∂x_1. Often, regressors are closely related, and the ceteris paribus idea becomes problematic.

6 OLS in the multiple linear regression. The multiple linear regression model. In the general multiple linear regression model, y is regressed on k regressors: y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_k x_k + u, with an intercept β_0 and k slope parameters (coefficients) β_j, 1 ≤ j ≤ k. Again, for the error term u, it will be assumed that E(u|x_1,...,x_k) = 0. The multiple linear regression is the most important statistical model in econometrics. Note that "multiple" should not be replaced by "multivariate": multivariate regression lets a vector of variables y_1,...,y_g depend on a vector of regressors. Here, y is just a scalar dependent variable.

7 OLS in the multiple linear regression. OLS in the multiple model. In order to generalize the idea of OLS estimation to the multiple model, one minimizes Σ_{i=1}^n (y_i − β_0 − β_1 x_{i1} − ... − β_k x_{ik})² over β_0, β_1, ..., β_k, and calls the minimizing values β̂_0, β̂_1, ..., β̂_k.

8 OLS in the multiple linear regression. Formally, the solution can be obtained by taking derivatives and solving a system of first-order conditions:
Σ_{i=1}^n (y_i − β̂_0 − β̂_1 x_{i1} − ... − β̂_k x_{ik}) = 0,
Σ_{i=1}^n x_{i1} (y_i − β̂_0 − β̂_1 x_{i1} − ... − β̂_k x_{ik}) = 0,
Σ_{i=1}^n x_{i2} (y_i − β̂_0 − β̂_1 x_{i1} − ... − β̂_k x_{ik}) = 0,
...
Σ_{i=1}^n x_{ik} (y_i − β̂_0 − β̂_1 x_{i1} − ... − β̂_k x_{ik}) = 0.
This system does not yield a nice closed form for the OLS coefficients unless matrix algebra is used.

9 OLS in the multiple linear regression Interpreting the OLS first-order conditions Just like in the simple regression model, the first-order conditions have a method-of-moments interpretation: The condition for the intercept ˆβ 0 says that the sample mean of the OLS residuals is 0. This corresponds to the population moments condition that Eu = 0; Each condition for a slope coefficient ˆβ j says that the sample correlation (or covariance) between the residuals and the regressor x j is 0. This corresponds to the population condition that the regressors and errors are uncorrelated.

10 OLS in the multiple linear regression. Multiple linear regression in matrix form. Presume all values for the dependent variable y and all regressors are collected in a vector y and an {n × (k+1)} matrix X:
y = (y_1, y_2, ..., y_n)′,
X = [ 1 x_{11} ... x_{1k} ; 1 x_{21} ... x_{2k} ; ... ; 1 x_{n1} ... x_{nk} ].
In this notation, the OLS estimates β̂ = (β̂_0, β̂_1, ..., β̂_k)′ can be written compactly as β̂ = (X′X)^{-1} X′y.
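The matrix formula above can be checked numerically. A minimal sketch, assuming NumPy; the data, coefficient values, and seed are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2
x = rng.normal(size=(n, k))
beta = np.array([1.0, 0.5, -0.3])            # beta_0, beta_1, beta_2 (arbitrary)
X = np.column_stack([np.ones(n), x])         # n x (k+1) design matrix with intercept column
y = X @ beta + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y via the normal equations
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_ls))          # both routes give the same OLS solution
```

In practice `lstsq` (or a QR decomposition) is numerically preferable to forming X′X explicitly, but the normal-equations route mirrors the slide's formula directly.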

11 OLS in the multiple linear regression. Fitted values and residuals. Just as in simple regression, OLS estimation decomposes observed y into an explained part ŷ (the fitted value) and an unexplained part or residual û:
y_i = β̂_0 + β̂_1 x_{i1} + ... + β̂_k x_{ik} + û_i = ŷ_i + û_i.
Because the sample mean of the residuals is 0, the sample mean of the fitted values is ȳ, and the averages lie on the regression "hyperplane":
ȳ = β̂_0 + β̂_1 x̄_1 + ... + β̂_k x̄_k.

12 OLS in the multiple linear regression. Simple and multiple linear regression coefficients. In most cases, the estimate β̃_1 in a simple linear regression y = β̃_0 + β̃_1 x_1 + ũ differs from the estimate β̂_1 in a comparable multiple regression y = β̂_0 + β̂_1 x_1 + ... + β̂_k x_k + û. The coefficient estimates only coincide in special cases, such as cov(x_1, x_j) = 0 for all j > 1 or β̂_j = 0 for all j > 1. Note that cov(x_1, x_j) = 0 is not the typical case: regressors are usually correlated with other regressor variables.

13 OLS in the multiple linear regression. Simple and two-regressor regression: a property. Consider the simple regression y = β̃_0 + β̃_1 x_1 + ũ and the regression of y on x_1 and on an additional x_2: y = β̂_0 + β̂_1 x_1 + β̂_2 x_2 + û. It is easily shown that β̃_1 = β̂_1 + β̂_2 δ̂, where δ̂ is the slope coefficient in a regression of x_2 on x_1. Clearly, β̃_1 = β̂_1 iff one of the two factors β̂_2 and δ̂ is 0. Note: β̂_1 is not necessarily "better" or "more correct" than β̃_1.
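The omitted-variable identity β̃_1 = β̂_1 + β̂_2 δ̂ is an exact algebraic property of OLS, so it can be verified to machine precision on any sample. A sketch assuming NumPy, with simulated correlated regressors (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)              # x2 deliberately correlated with x1
y = 1.0 + 0.5 * x1 + 0.7 * x2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

one = np.ones(n)
b_simple = ols(np.column_stack([one, x1]), y)   # beta-tilde: y on x1 only
b_mult = ols(np.column_stack([one, x1, x2]), y) # beta-hat: y on x1 and x2
delta = ols(np.column_stack([one, x1]), x2)     # delta-hat: x2 on x1

# beta_tilde_1 = beta_hat_1 + beta_hat_2 * delta_hat, exactly in any sample
print(np.isclose(b_simple[1], b_mult[1] + b_mult[2] * delta[1]))
```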

14 OLS in the multiple linear regression. Goodness of fit in the multiple model. The variance decomposition equation
Σ_{i=1}^n (y_i − ȳ)² = Σ_{i=1}^n (ŷ_i − ȳ)² + Σ_{i=1}^n û_i²,
or SST = SSE + SSR, continues to hold in the multiple regression model. Likewise, R² = SSE/SST = 1 − SSR/SST defines a descriptive statistic in the interval [0,1] that measures the goodness of fit. Note, however, that R² is not the squared correlation of y and any x_j but the maximum squared correlation coefficient of y and linear combinations of x_1,...,x_k.
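The decomposition SST = SSE + SSR and the two ways of writing R² can be confirmed in a few lines. A sketch assuming NumPy, on simulated data (coefficients and seed illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_fit = X @ beta_hat
resid = y - y_fit

sst = np.sum((y - y.mean()) ** 2)        # total sum of squares
sse = np.sum((y_fit - y.mean()) ** 2)    # explained sum of squares
ssr = np.sum(resid ** 2)                 # residual sum of squares
r2 = 1 - ssr / sst

print(np.isclose(sst, sse + ssr), 0.0 <= r2 <= 1.0)
```

The decomposition relies on the regression containing an intercept, so that the residuals have sample mean zero; without an intercept, SST = SSE + SSR need not hold.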

15 Statistical properties of OLS. Assumptions for multiple linear regression. In order to establish OLS properties such as unbiasedness etc., model assumptions have to be formulated. The first assumption is the natural counterpart to (SLR.1), linearity in parameters:
MLR.1 The population model can be written y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_k x_k + u, with unknown coefficient parameters β_1,...,β_k, an intercept parameter β_0, and an unobserved random error u.

16 Statistical properties of OLS Assumption of random sampling MLR.2 The data constitute a random sample of n observations {(x i1,...,x ik,y i ) : i = 1,...,n} of random variables corresponding to the population model (MLR.1). Due to the random-sampling assumption (MLR.2), observations and also errors are independent for different i.

17 Statistical properties of OLS. No multicollinearity. In simple regression, OLS requires some variation in the regressor (it should not be entirely constant). In multiple regression, more is needed: the matrix X′X must be invertible.
MLR.3 There are no exact linear relationships connecting the regressor variables, and no regressor is constant in sample or in population.
Assumption (MLR.3) implies n > k. When it holds in the population, violation in the sample happens with probability 0 for continuous random variables. (MLR.3) is violated if a regressor is the sum or difference of other regressors. It is not violated by nonlinear identities, such as x_2 = x_1².

18 Statistical properties of OLS. Zero conditional expectation. The assumption E(u|x) = 0 is just a natural generalization of the simple regression assumption:
MLR.4 The error u has an expected value of zero given any values of the regressors, in symbols E(u|x_1,x_2,...,x_k) = 0.
Again, (MLR.4) implies E(u) = 0 but is stronger than that property. (MLR.4) also implies cov(x_j, u) = 0 for all regressors x_j.

19 Statistical properties of OLS. Violations of assumption MLR.4. There are several reasons why (MLR.4) may not hold, in particular the ensuing condition cov(x_j, u) = 0, which is often called an exogeneity condition. When it is violated, x_j is called an endogenous regressor. If the true relationship is nonlinear, E(u|x_j) ≠ 0 even though E(u) = 0; if an important influence factor has been omitted from the list of regressors ("omitted variable bias"), (MLR.4) is formally violated, and the researcher must decide whether she wishes to estimate the regression without or with the doubtful control; if there is logical feedback from y to some x_j, u and x_j are correlated, x_j is endogenous, and regression yields biased estimates of the true relationship. This last case must be handled by special techniques (instrumental variables).

20 Statistical properties of OLS. Unbiasedness of OLS. (MLR.1) to (MLR.4) suffice for unbiasedness:
Theorem. Under assumptions (MLR.1)-(MLR.4), E(β̂_j) = β_j, j = 0,1,...,k, for any values of the parameters β_j.
In words, OLS is an unbiased estimator for the intercept and all coefficients. In short, one may write E(β̂) = β, using the notation β for a (k+1)-vector (β_0, β_1, ..., β_k)′ and a corresponding notation for the expectation operator.

21 Statistical properties of OLS. Scylla and Charybdis. How many regressors should be included in a multiple regression? Omitting influential regressors (too small k) tends to overstate the effects: effects due to the omitted variables are attributed to the included regressors ("omitted variable bias"). Conversely, for example, the effect of a difference between two regressors may not be found if only one of them is included. Profligate regressions with many regressors lack degrees of freedom; results will be imprecise, variances will be large. Statistical tools for model selection are important (R² and R̄² do not work). Generally, economists tend to include too many regressors.

22 Statistical properties of OLS. Homoskedasticity. For the efficiency and variance properties, constant variance must be assumed:
MLR.5 The error u has the same variance given any values of the explanatory variables, in symbols var(u|x_1,...,x_k) = σ².
If (MLR.5) is violated, the errors' variance and also the variance of the dependent variable will change with some x_j. Heteroskedasticity is often observed in cross-section data.

23 Statistical properties of OLS. The variance of OLS. The most informative way to represent the OLS variance is by using matrices:
Theorem. Under assumptions (MLR.1)-(MLR.5), the variance of the OLS estimator β̂ = (β̂_0, β̂_1, ..., β̂_k)′ is given by var(β̂|X) = σ²(X′X)^{-1}, where the operator var applied to a vector denotes a matrix of variances and covariances.
The matrix expression must be evaluated in OLS estimation anyway. As n → ∞, the matrix X′X divided by n may converge to a moment matrix of the regressors.

24 Statistical properties of OLS. A property of the OLS variances. From the general formula in the theorem, the interesting formula
var(β̂_j|X) = σ² / {SST_j (1 − R²_j)}
is obtained, where SST_j denotes Σ_{i=1}^n (x_{ij} − x̄_j)² and R²_j is the R² from a regression of x_j on the other regressors x_l, l ≠ j. Note that this formula does not use any matrices. Strong variation in the regressor x_j and weak correlation with other regressors benefit the precision of the coefficient estimate β̂_j.
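The scalar variance formula agrees exactly with the corresponding diagonal element of σ²(X′X)^{-1}, which can be checked numerically. A sketch assuming NumPy; σ² is treated as known here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)             # correlated regressors, so R^2_1 > 0
X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 2.0                                   # assumed known error variance (illustrative)

# matrix formula: var(beta_hat | X) = sigma^2 (X'X)^{-1}
V = sigma2 * np.linalg.inv(X.T @ X)

# scalar formula for beta_1: sigma^2 / (SST_1 (1 - R^2_1))
sst1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])          # regress x1 on the other regressor
g = np.linalg.solve(Z.T @ Z, Z.T @ x1)
r2_1 = 1 - np.sum((x1 - Z @ g) ** 2) / sst1
var_b1 = sigma2 / (sst1 * (1 - r2_1))

print(np.isclose(var_b1, V[1, 1]))             # the two formulas coincide
```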

25 Statistical properties of OLS. Estimating the OLS variance. In the formulae for the OLS variance, the item σ² is unobserved and must be estimated. In analogy to simple regression, the following theorem holds:
Theorem. Under the assumptions (MLR.1)-(MLR.5), E{Σ_{i=1}^n û_i² / (n−k−1)} = E{SSR/(n−k−1)} = E(σ̂²) = σ², i.e. the estimator of the error variance is unbiased.
The scale factor n−k−1 corresponds to the degrees-of-freedom concept: n observations yield k+1 coefficient estimates, such that n−k−1 degrees of freedom remain. The proof is omitted.

26 Statistical properties of OLS. Gauss-Markov and multiple regression. In direct analogy to the case of simple regression, there is the celebrated Gauss-Markov theorem on linear efficiency:
Theorem. Under assumptions (MLR.1)-(MLR.5), the OLS estimators β̂_0, β̂_1, ..., β̂_k are the best linear unbiased estimators (BLUE) of β_0, β_1, ..., β_k, respectively.
For this reason, (MLR.1)-(MLR.5) are called the Gauss-Markov conditions. It can be shown that a genuine multivariate generalization holds and that linear combinations of OLS estimators are BLUE for linear combinations of coefficient parameters.

27 Inference in the multiple model. Normal regression. For some results, such as unbiasedness and linear efficiency, no exact distributional assumptions are needed. For others, it is convenient to assume a Gaussian (normal) distribution:
MLR.6 The error u is independent of the explanatory variables and normally distributed with mean 0 and variance σ², in symbols u ~ N(0, σ²).
Assumption (MLR.6) implies (MLR.4) and (MLR.5). Normality is often a reasonable working assumption, unless there is strong evidence to the contrary. In large samples, it can be tested.

28 Inference in the multiple model. OLS coefficient estimates as normal random variables.
Theorem. Under the assumptions (MLR.1)-(MLR.6), the distribution of the OLS coefficient estimates β̂_j conditional on the regressors is normal, i.e. β̂_j | X ~ N{β_j, var(β̂_j)}, with var(β̂_j) given either by the direct expression using the idea of a regression of x_j on the other covariates or as the (j,j) element of the variance matrix σ²(X′X)^{-1}.
Note that the variance is formally a random variable, as it depends on X. The proof is quite obvious: for given X, β̂ is a linear function of y, which in turn is normal due to normal u.

29 Inference in the multiple model. Implications of the normality of OLS estimates. Normality does not only hold for the individual β̂_j; it holds that β̂ ~ N(β, σ²(X′X)^{-1}) for a multivariate ((k+1)-variate) normal distribution. Thus, all sums, differences, and linear combinations of coefficient estimates are also normally distributed. From the properties of the normal distribution, it follows that the theoretically standardized estimate (β̂_j − β_j)/s.e.(β̂_j) is standard normal N(0,1) distributed. The denominator standard error, however, is the square root of the true and unknown variance. This distributional property does not hold for the estimated standard error.

30 Inference in the multiple model. The empirically standardized estimate. If the OLS coefficient estimates are standardized by estimated standard errors, the distribution follows the well-known t law (in the older literature, Student distribution):
Theorem. Under assumptions (MLR.1)-(MLR.6), (β̂_j − β_j)/ŝ.e.(β̂_j) ~ t_{n−k−1}; in words, the empirically standardized estimate follows a t distribution with n−k−1 degrees of freedom.

31 Inference in the multiple model. Remarks on the standardized estimate. The t distribution with m degrees of freedom is defined from m+1 independent standard normal random variables a, b_1, ..., b_m as the distribution of the ratio a / √{(b_1² + ... + b_m²)/m}. For more than around 30 degrees of freedom, the t distribution becomes so close to the normal N(0,1) that the standard normal can be used instead. The standardized estimator will not be t distributed if the normality assumption (MLR.6) is violated. Degrees of freedom can be remembered as follows: out of n original degrees of freedom, k+1 are used up by estimating coefficients and the intercept, and n−k−1 remain.

32 Inference in the multiple model. Densities of t distributions. [Figure: densities of the t distribution with 5 (black), 10 (blue), and 20 (green) degrees of freedom.]

33 Inference in the multiple model. Testing the null hypothesis β_j = 0. Researchers are interested in testing the null hypothesis H_0 : β_j = 0, usually with the alternative β_j ≠ 0, less often with the alternative β_j > 0 or β_j < 0. An appropriate statistic to test this H_0 is the empirically standardized estimate evaluated at β_j = 0, i.e. t_{β̂_j} = β̂_j / ŝ.e.(β̂_j), which is called the t ratio or t statistic.

34 Inference in the multiple model. What is a hypothesis test? A hypothesis test is a statistical decision procedure. Based on the value of a test statistic, which is a function of the sample and hence a random variable, it either rejects the null hypothesis or is unable to reject it (fails to reject). For example, the t test rejects the null of β_j = 0 if t_{β̂_j} > c in the one-sided version, or if |t_{β̂_j}| > c in the two-sided version. c is called the critical value, and the region of ℝ where the test rejects is called the critical region.

35 Inference in the multiple model How are the critical values determined? Hypothesis tests are tuned to significance levels. A significance level is the probability of a type I error, i.e. of rejecting the null even though it is correct. The construction of a test requires knowledge of the distribution of the test statistic under the null. Suppose the significance level (specified by the researcher) is 5%. Then, any interval that has the probability of 5% under the null is a valid critical region for a valid test. In order to minimize the probability of a type II error, critical regions are defined to be situated in the tails of the null distribution. For example, the 95% quantile of the t distribution is a good critical value for a 5% test against a one-sided alternative if the test statistic is t distributed.

36 Inference in the multiple model Practical implementation of hypothesis tests Presume the researcher has the value of the test statistic and searches for critical values. Several options are available: If the null distribution is a known standard law, critical values are found on the web or in books: inconvenient; Critical values may also be provided by a statistical software in this case: slightly more convenient and flexible; If the software is smart, it will provide p values instead of or in addition to the critical values: very precise and convenient; If the null distribution is non-standard and rare, the researcher may have to simulate the distribution via Monte Carlo or by bootstrap procedures: computer skills needed.
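The last option on this slide, simulating a null distribution, can be sketched for the familiar t ratio, where the exact answer is known and serves as a check. A Monte Carlo sketch assuming NumPy; sample size, regressor count, and replication count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k, reps = 30, 2, 5000
tstats = np.empty(reps)

for r in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    y = rng.normal(size=n)                       # data generated under H0: beta_1 = 0
    b = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ b
    s2 = resid @ resid / (n - k - 1)             # unbiased estimate of sigma^2
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    tstats[r] = b[1] / se                        # t ratio for beta_1

crit = np.quantile(tstats, 0.95)  # simulated one-sided 5% critical value
print(crit)                       # the exact t(27) 95% quantile is about 1.70
```

For genuinely non-standard statistics, the same recipe applies: simulate the statistic under the null and read critical values (or p values) off the empirical distribution.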

37 Inference in the multiple model. Definition of the p value. Correct definitions: the p value is the significance level at which the test becomes indifferent between rejection and acceptance for the sample at hand (the calculated value of the test statistic); the p value is the probability of generating values for the test statistic that are, under the null hypothesis, even more unusual (less typical, often "larger") than the one calculated from the sample. Incorrect definition: the p value is the probability of the null hypothesis for this sample.

38 Inference in the multiple model. Test based on quantiles. [Figure: upper quantiles (up to the 90% quantile) of the normal distribution.] The observed value of 2.2 for the test statistic, which is normally distributed under H_0, is significant at 10% for the one-sided test.

39 Inference in the multiple model. Test based on p values. The area under the density curve to the right of the observed value of 2.2 is 0.014, which is the p value. The one-sided test rejects at the 10% and 5% levels, but not at the 1% level.

40 Inference in the multiple model. Return to the t test. Assume (MLR.1)-(MLR.6). Under the null hypothesis H_0 : β_j = 0, the t ratio for β_j, t_{β̂_j} = β̂_j / ŝ.e.(β̂_j), will be t distributed with n−k−1 degrees of freedom, i.e. t_{n−k−1} distributed. Thus, reject H_0 at 5% significance in favor of the alternative H_A : β_j > 0 if the test statistic is larger than the 95% quantile of the t_{n−k−1} distribution. Reject in favor of H_A : β_j ≠ 0 if the test statistic is larger than the 97.5% quantile or less than the 2.5% quantile. When the t test rejects, it is often said that β_j is significantly different from 0, or simply that β_j is significant, or also that x_j is significant.

41 Inference in the multiple model. More general t tests. Presume one wishes to test H_0 : β_j = β_{j0} for a given value β_{j0}. Then evaluate the statistic (β̂_j − β_{j0}) / ŝ.e.(β̂_j). Under H_0, it is clearly t_{n−k−1} distributed, and the usual quantiles can be used.

42 Inference in the multiple model. Testing several exclusion restrictions jointly. Assume the null hypothesis of concern is now H_0 : β_{l+1} = β_{l+2} = ... = β_k = 0, i.e. the exclusion of k−l regressors. Then, a suitable test statistic is
F = {(SSR_r − SSR_u)/(k−l)} / {SSR_u/(n−k−1)},
where SSR_r is the SSR for the restricted model without the k−l regressors and SSR_u for the unrestricted model with all k regressors. Assuming (MLR.1)-(MLR.6), the statistic F is, under the null, distributed F with k−l numerator and n−k−1 denominator degrees of freedom.
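The F statistic above only needs the two residual sums of squares. A sketch assuming NumPy, on simulated data where the excluded coefficients are truly zero (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, l = 200, 3, 1                  # test exclusion of the last k - l = 2 regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, 0.0, 0.0]) + rng.normal(size=n)

def ssr(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return e @ e

ssr_u = ssr(X, y)                    # unrestricted: all k regressors
ssr_r = ssr(X[:, :l + 1], y)         # restricted: intercept plus first l regressors
F = ((ssr_r - ssr_u) / (k - l)) / (ssr_u / (n - k - 1))
print(F)                             # compare with the F(2, 196) 95% quantile
```

Because the restricted model is nested in the unrestricted one, SSR_r ≥ SSR_u always holds, so F is never negative.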

43 Inference in the multiple model. Densities of F distributions. [Figure: densities of F distributions with 2 (black), 4 (blue), and 6 (green) numerator and 20 denominator degrees of freedom.]

44 Inference in the multiple model. Some remarks on the F test. The F test is easily generalized to tests for general linear restrictions, such as H_0 : β_2 + β_3 = 1, β_5 = 3.61, β_6 = 2β_7, as again there exist an SSR_u and an SSR_r. The main difficulty may be the estimation of the restricted model. Numerator degrees of freedom correspond to the number of linearly independent restrictions. For large n, the F statistic will be distributed like 1/(k−l) times a χ²(k−l) distribution. The F statistic for the exclusion of one regressor x_j is the square of t_{β̂_j}.

45 Inference in the multiple model. The overall F test. A special F test has the null hypothesis H_0 : β_1 = ... = β_k = 0 and the alternative that at least one coefficient is non-zero. The statistic is
F = {(SST − SSR)/k} / {SSR/(n−k−1)} = {R²/k} / {(1 − R²)/(n−k−1)},
a transformation of the R². When it fails to reject, the regression model fails to provide a useful description of y. This is the only F statistic that shows in a standard regression printout.

46 Inference in the multiple model. The importance of F and t tests. F and t tests are restriction tests that serve as tools in searching for the best specification of a regression equation, i.e. the best selection of regressors that determine the targeted dependent variable y. Only nested models can be compared: for example, y = β_0 + β_1 x_1 + u can be tested against y = β_0 + β_1 x_1 + β_2 x_2 + u, but not against y = β_0 + β_1 x_2 + u. In the specification search, it is often recommended to start with a profligate model and to eliminate insignificant regressors (backward elimination, general-to-specific) rather than to add regressors to a small model. The decisions of t tests, say, for two coefficients β_l, β_j and of the F test for β_l = β_j = 0 are often in conflict; some researchers prefer the decision of the F test in doubtful cases.

47 Inference in the multiple model Ouch! The following statements are regarded as incorrect: The tested null hypothesis is H 0 : ˆβ j = 0; The test is rejected; The alternative hypothesis can be rejected; The test is 2.55; The coefficient β 4 is significant at 95% (unless someone really uses an unusual 95% significance level); The hypothesis that β 4 is insignificant can be rejected.

48 OLS asymptotics. The probability limit. When talking about asymptotics, i.e. large-sample behavior, statistical convergence concepts are needed. For convergence of a sequence of random variables X_1, ..., X_n, ... to a fixed limit, we use:
Definition. A sequence of random variables (X_n) is said to converge in probability to θ ∈ ℝ, in symbols plim X_n = θ, iff for every ε > 0, P(|X_n − θ| > ε) → 0 as n → ∞.
This concept is relatively weak, as it does not imply that single realizations of the random variable sequence converge. It allows simple rules, such as plim(X_n Y_n) = (plim X_n)(plim Y_n).

49 OLS asymptotics. Consistency of OLS. An estimator θ̂ for the parameter θ is called consistent iff plim_{n→∞} θ̂(n) = θ, with θ̂(n) denoting an estimate from a sample of size n. Consistency holds under relatively weak conditions:
Theorem. Under assumptions (MLR.1)-(MLR.4) and some technical conditions, the OLS estimator β̂ is consistent for β, which implies that plim_{n→∞} β̂_j = β_j for j = 0, 1, ..., k.
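Consistency can be made concrete by estimating the same slope from ever larger samples and watching the estimation error shrink. A simulation sketch assuming NumPy; note the errors are deliberately non-normal (uniform), since consistency does not require normality (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
beta1 = 0.5                                    # true slope in the simulated model

def beta_hat(n):
    x = rng.normal(size=n)
    u = rng.uniform(-1, 1, size=n)             # non-normal errors with E(u) = 0
    y = 1.0 + beta1 * x + u
    X = np.column_stack([np.ones(n), x])
    return np.linalg.solve(X.T @ X, X.T @ y)[1]

errs = {n: abs(beta_hat(n) - beta1) for n in (100, 10_000, 1_000_000)}
print(errs)   # the estimation error typically shrinks as n grows
```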

50 OLS asymptotics. A sketch of the consistency issue. Consider
β̂ = β + (X′X)^{-1} X′u = β + (n^{-1} X′X)^{-1} n^{-1} X′u.
Typically, the term n^{-1} X′X will converge to some kind of variance matrix. The term n^{-1} X′u should converge to its expectation, which is 0 if X and u are uncorrelated and E(u) = 0. Thus, the condition
MLR.4′ E(u) = 0 and cov(x_j, u) = 0 for j = 1, ..., k
will suffice for consistency and can be substituted for the stronger assumption (MLR.4).

51 OLS asymptotics. Correlation of regressor and error is pretty bad. It was shown before that correlation between a regressor and the errors (for example, with omitted variables and with endogeneity) usually causes a bias in the sense of E(β̂) ≠ β. If (MLR.4′) is violated, this bias will not even disappear as n → ∞ and becomes an inconsistency. As Clive Granger said, "If you can't get it right as n goes to infinity, you shouldn't be in the business." This means that inconsistent estimators should not be used at all. Inconsistency is more serious than a finite-sample bias.

52 OLS asymptotics. Asymptotic normality of the OLS estimator. The celebrated Central Limit Theorem can be used to prove:
Theorem. Under the Gauss-Markov assumptions (MLR.1)-(MLR.5) and some technical conditions, it holds that (β̂_j − β_j)/ŝ.e.(β̂_j) →d N(0,1), and generally that √n (β̂_j − β_j) →d N(0, σ²_{β_j}), with σ²_{β_j} determined either from the matrix formula σ²(X′X)^{-1} or by the aforementioned construction from regressions among regressors.

53 OLS asymptotics. Remarks on the asymptotic normality of OLS. Note that normality of the errors is not required: even for most non-normal error distributions, β̂ will approach a normal limit distribution. Under the assumptions of the theorem, σ̂² will converge to σ². This latter convergence is of type plim, while the main result of the theorem uses convergence in distribution (→d), a weaker type of convergence. Convergence in distribution means that the distribution of a random variable converges to a limit distribution; nothing else is stated about the random variables proper.

54 OLS asymptotics. Lagrange multiplier tests: the idea. Restriction tests (t and F) follow the Wald test principle, one of the three test construction principles used in parametric statistics. The other two are the likelihood-ratio (LR) and the Lagrange multiplier (LM) test principles. LR and LM tests are typically asymptotic tests: their small-sample null distributions are uncertain, but their large-sample distributions will be regular (chi-square) even in the absence of (MLR.6). The LM test estimates the model under the null and checks the increase in the likelihood when moving toward the alternative. It is also called the score test, as the derivative of the likelihood is called the score. Often, the LM test can be made operational in a sequence of regressions, with the test statistic simply calculated as nR² from a specific regression ("auxiliary regression").

55 OLS asymptotics. The LM test for exclusion of variables. Consider the multiple regression model y_i = β_0 + β_1 x_{1,i} + ... + β_k x_{k,i} + u_i and the null hypothesis H_0 : β_{k−q+1} = ... = β_k = 0. Estimate the restricted regression model y_i = β_0 + β_1 x_{1,i} + ... + β_{k−q} x_{k−q,i} + u_i by OLS and keep the residuals ũ. Then, regress these ũ on all k regressors: ũ_i = γ_0 + γ_1 x_{1,i} + ... + γ_k x_{k,i} + v_i. The nR² from this second, auxiliary regression is the LM test statistic. Under H_0, it is asymptotically distributed as χ²(q).
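The two-step recipe on this slide (restricted regression, then auxiliary regression of the residuals) can be sketched directly. Assuming NumPy, with the null hypothesis true in the simulated data (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
n, k, q = 300, 3, 2                  # H0: the last q = 2 slope coefficients are 0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, 0.0, 0.0]) + rng.normal(size=n)

def residuals(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    return y - X @ b

# step 1: restricted model with intercept and the first k - q regressors
u_tilde = residuals(X[:, :k - q + 1], y)

# step 2: auxiliary regression of the restricted residuals on all k regressors
v = residuals(X, u_tilde)
r2_aux = 1 - (v @ v) / np.sum((u_tilde - u_tilde.mean()) ** 2)

LM = n * r2_aux                      # asymptotically chi^2(q) under H0
print(LM)                            # compare with the chi^2(2) 95% quantile, about 5.99
```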

56 Selection of regressors in the multiple model. Model selection: the main issue. The typical situation in multiple regression is that y has been specified a priori, and the researcher looks for the optimal set of regressors that offers the best explanation for y. Tools for this specification search or regressor selection are: R² and R̄², which can be used for comparing any two or more models but tend to increase with the addition of any regressor; F and t tests, which can only be used for comparing nested models, where lengthy search sequences tend to invalidate the significance level; information criteria such as AIC and BIC, which can compare any two or more models and penalize complexity; specification tests, which can be used to eliminate ill-specified models but cannot find the optimal model.

57 Selection of regressors in the multiple model. Adjusted R². The corrected or adjusted R², often denoted R̄² or R²_c, is defined as
R̄² = 1 − (1 − R²)(n−1)/(n−k−1).
It holds that R̄² ≤ R². If R̄² is seen as an estimator for corr²(y, β′x), then the bias of R̄² is smaller than the bias of R². R² always increases if a new regressor is included in the regression; R̄² increases if the t ratio for this variable is larger than 1, which corresponds to testing at an enormous significance level. It cannot be used for serious model selection.

58 Selection of regressors in the multiple model. Penalizing complexity. Consider the estimated error variance
σ̂² = Σ_{i=1}^n û_i² / (n−k−1) = SSR/(n−k−1),
which, just like R̄², improves (here: decreases) if a new regressor with a t ratio greater than one is added. Thus, it cannot be used for serious model selection. It takes a step in the right direction, however, via a trade-off: the numerator improves (decreases) with increasing complexity, while the denominator deteriorates (decreases) with higher complexity. This idea is pursued by information criteria, which impose a stronger penalty for complexity, strong enough for useful model selection.

59 Selection of regressors in the multiple model. The AIC according to Akaike. Akaike introduced the AIC (An Information Criterion), in one possible version
AIC = log σ̂² + 2(k+1)/n,
which is to be minimized: complexity decreases the first term and increases the second. (In information criteria, σ̂² should be formed using the scale n, not n−k−1.) In nested comparisons, minimizing AIC corresponds to t or F tests at an approximate 15% significance level. For n → ∞, minimizing AIC selects the best forecasting model, which tends to keep slightly more regressors than those with non-zero coefficients.

60 Selection of regressors in the multiple model. The BIC according to Schwarz. Schwarz simplified the BIC that had been introduced by Akaike, in one version
BIC = log σ̂² + (k+1) log n / n,
which is to be minimized. The BIC complexity penalty is stronger than the AIC penalty, so selected models tend to be more parsimonious (smaller). In nested comparisons, minimizing BIC corresponds to a significance level falling to 0 as n → ∞. For n → ∞, BIC will select the true model, exactly keeping all regressors with non-zero coefficients. In smaller samples, BIC tends to select models that are too parsimonious.
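The two criteria from these slides can be computed side by side for a sequence of nested models. A sketch assuming NumPy, using the versions of AIC and BIC given above (with σ̂² scaled by n); the data-generating coefficients are illustrative, with only the first two slopes non-zero:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
y = X_full @ np.array([1.0, 0.5, -0.4, 0.0, 0.0]) + rng.normal(size=n)

def info_criteria(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2 = e @ e / len(y)                      # scale n, not n - k - 1
    p = X.shape[1]                           # k + 1 estimated parameters
    aic = np.log(s2) + 2 * p / len(y)
    bic = np.log(s2) + p * np.log(len(y)) / len(y)
    return aic, bic

# nested models with 1, 2, 3, 4 slopes; the DGP uses only the first two
results = {m: info_criteria(X_full[:, :m + 1], y) for m in range(1, 5)}
for m, (aic, bic) in results.items():
    print(m, round(aic, 3), round(bic, 3))
```

One would expect both criteria to drop sharply when the second (truly influential) slope enters and the stronger BIC penalty to discourage the superfluous third and fourth slopes more than AIC does.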



More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

Econ 510 B. Brown Spring 2014 Final Exam Answers

Econ 510 B. Brown Spring 2014 Final Exam Answers Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Econometrics - 30C00200

Econometrics - 30C00200 Econometrics - 30C00200 Lecture 11: Heteroskedasticity Antti Saastamoinen VATT Institute for Economic Research Fall 2015 30C00200 Lecture 11: Heteroskedasticity 12.10.2015 Aalto University School of Business

More information

Multiple Regression Analysis: Heteroskedasticity

Multiple Regression Analysis: Heteroskedasticity Multiple Regression Analysis: Heteroskedasticity y = β 0 + β 1 x 1 + β x +... β k x k + u Read chapter 8. EE45 -Chaiyuth Punyasavatsut 1 topics 8.1 Heteroskedasticity and OLS 8. Robust estimation 8.3 Testing

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Linear models. Linear models are computationally convenient and remain widely used in. applied econometric research

Linear models. Linear models are computationally convenient and remain widely used in. applied econometric research Linear models Linear models are computationally convenient and remain widely used in applied econometric research Our main focus in these lectures will be on single equation linear models of the form y

More information

Econometrics Multiple Regression Analysis: Heteroskedasticity

Econometrics Multiple Regression Analysis: Heteroskedasticity Econometrics Multiple Regression Analysis: João Valle e Azevedo Faculdade de Economia Universidade Nova de Lisboa Spring Semester João Valle e Azevedo (FEUNL) Econometrics Lisbon, April 2011 1 / 19 Properties

More information

Motivation for multiple regression

Motivation for multiple regression Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope

More information

Chapter 2: simple regression model

Chapter 2: simple regression model Chapter 2: simple regression model Goal: understand how to estimate and more importantly interpret the simple regression Reading: chapter 2 of the textbook Advice: this chapter is foundation of econometrics.

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

More information

Practical Econometrics. for. Finance and Economics. (Econometrics 2)

Practical Econometrics. for. Finance and Economics. (Econometrics 2) Practical Econometrics for Finance and Economics (Econometrics 2) Seppo Pynnönen and Bernd Pape Department of Mathematics and Statistics, University of Vaasa 1. Introduction 1.1 Econometrics Econometrics

More information

Heteroskedasticity. Part VII. Heteroskedasticity

Heteroskedasticity. Part VII. Heteroskedasticity Part VII Heteroskedasticity As of Oct 15, 2015 1 Heteroskedasticity Consequences Heteroskedasticity-robust inference Testing for Heteroskedasticity Weighted Least Squares (WLS) Feasible generalized Least

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the

More information

3. Linear Regression With a Single Regressor

3. Linear Regression With a Single Regressor 3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

coefficients n 2 are the residuals obtained when we estimate the regression on y equals the (simple regression) estimated effect of the part of x 1

coefficients n 2 are the residuals obtained when we estimate the regression on y equals the (simple regression) estimated effect of the part of x 1 Review - Interpreting the Regression If we estimate: It can be shown that: where ˆ1 r i coefficients β ˆ+ βˆ x+ βˆ ˆ= 0 1 1 2x2 y ˆβ n n 2 1 = rˆ i1yi rˆ i1 i= 1 i= 1 xˆ are the residuals obtained when

More information

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in

More information

Heteroskedasticity and Autocorrelation

Heteroskedasticity and Autocorrelation Lesson 7 Heteroskedasticity and Autocorrelation Pilar González and Susan Orbe Dpt. Applied Economics III (Econometrics and Statistics) Pilar González and Susan Orbe OCW 2014 Lesson 7. Heteroskedasticity

More information

Advanced Econometrics I

Advanced Econometrics I Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics

More information

Linear Models in Econometrics

Linear Models in Econometrics Linear Models in Econometrics Nicky Grant At the most fundamental level econometrics is the development of statistical techniques suited primarily to answering economic questions and testing economic theories.

More information

Introduction to Estimation Methods for Time Series models. Lecture 1

Introduction to Estimation Methods for Time Series models. Lecture 1 Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

The Simple Regression Model. Part II. The Simple Regression Model

The Simple Regression Model. Part II. The Simple Regression Model Part II The Simple Regression Model As of Sep 22, 2015 Definition 1 The Simple Regression Model Definition Estimation of the model, OLS OLS Statistics Algebraic properties Goodness-of-Fit, the R-square

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

Reliability of inference (1 of 2 lectures)

Reliability of inference (1 of 2 lectures) Reliability of inference (1 of 2 lectures) Ragnar Nymoen University of Oslo 5 March 2013 1 / 19 This lecture (#13 and 14): I The optimality of the OLS estimators and tests depend on the assumptions of

More information

Introduction to Econometrics. Heteroskedasticity

Introduction to Econometrics. Heteroskedasticity Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory

More information

Econometrics Master in Business and Quantitative Methods

Econometrics Master in Business and Quantitative Methods Econometrics Master in Business and Quantitative Methods Helena Veiga Universidad Carlos III de Madrid Models with discrete dependent variables and applications of panel data methods in all fields of economics

More information

Model comparison and selection

Model comparison and selection BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

The Statistical Property of Ordinary Least Squares

The Statistical Property of Ordinary Least Squares The Statistical Property of Ordinary Least Squares The linear equation, on which we apply the OLS is y t = X t β + u t Then, as we have derived, the OLS estimator is ˆβ = [ X T X] 1 X T y Then, substituting

More information

EC4051 Project and Introductory Econometrics

EC4051 Project and Introductory Econometrics EC4051 Project and Introductory Econometrics Dudley Cooke Trinity College Dublin Dudley Cooke (Trinity College Dublin) Intro to Econometrics 1 / 23 Project Guidelines Each student is required to undertake

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler Basic econometrics Tutorial 3 Dipl.Kfm. Introduction Some of you were asking about material to revise/prepare econometrics fundamentals. First of all, be aware that I will not be too technical, only as

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

Econometrics I Lecture 3: The Simple Linear Regression Model

Econometrics I Lecture 3: The Simple Linear Regression Model Econometrics I Lecture 3: The Simple Linear Regression Model Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 32 Outline Introduction Estimating

More information

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning Økonomisk Kandidateksamen 2004 (I) Econometrics 2 Rettevejledning This is a closed-book exam (uden hjælpemidler). Answer all questions! The group of questions 1 to 4 have equal weight. Within each group,

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

1. The Multivariate Classical Linear Regression Model

1. The Multivariate Classical Linear Regression Model Business School, Brunel University MSc. EC550/5509 Modelling Financial Decisions and Markets/Introduction to Quantitative Methods Prof. Menelaos Karanasos (Room SS69, Tel. 08956584) Lecture Notes 5. The

More information

ECO375 Tutorial 8 Instrumental Variables

ECO375 Tutorial 8 Instrumental Variables ECO375 Tutorial 8 Instrumental Variables Matt Tudball University of Toronto Mississauga November 16, 2017 Matt Tudball (University of Toronto) ECO375H5 November 16, 2017 1 / 22 Review: Endogeneity Instrumental

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

the error term could vary over the observations, in ways that are related

the error term could vary over the observations, in ways that are related Heteroskedasticity We now consider the implications of relaxing the assumption that the conditional variance Var(u i x i ) = σ 2 is common to all observations i = 1,..., n In many applications, we may

More information

Föreläsning /31

Föreläsning /31 1/31 Föreläsning 10 090420 Chapter 13 Econometric Modeling: Model Speci cation and Diagnostic testing 2/31 Types of speci cation errors Consider the following models: Y i = β 1 + β 2 X i + β 3 X 2 i +

More information

The Simple Regression Model. Simple Regression Model 1

The Simple Regression Model. Simple Regression Model 1 The Simple Regression Model Simple Regression Model 1 Simple regression model: Objectives Given the model: - where y is earnings and x years of education - Or y is sales and x is spending in advertising

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Business Economics BUSINESS ECONOMICS. PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS MODULE No. : 3, GAUSS MARKOV THEOREM

Business Economics BUSINESS ECONOMICS. PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS MODULE No. : 3, GAUSS MARKOV THEOREM Subject Business Economics Paper No and Title Module No and Title Module Tag 8, Fundamentals of Econometrics 3, The gauss Markov theorem BSE_P8_M3 1 TABLE OF CONTENTS 1. INTRODUCTION 2. ASSUMPTIONS OF

More information

10. Time series regression and forecasting

10. Time series regression and forecasting 10. Time series regression and forecasting Key feature of this section: Analysis of data on a single entity observed at multiple points in time (time series data) Typical research questions: What is the

More information

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication G. S. Maddala Kajal Lahiri WILEY A John Wiley and Sons, Ltd., Publication TEMT Foreword Preface to the Fourth Edition xvii xix Part I Introduction and the Linear Regression Model 1 CHAPTER 1 What is Econometrics?

More information

Econometric Methods for Panel Data

Econometric Methods for Panel Data Based on the books by Baltagi: Econometric Analysis of Panel Data and by Hsiao: Analysis of Panel Data Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies

More information

Asymptotic Theory. L. Magee revised January 21, 2013

Asymptotic Theory. L. Magee revised January 21, 2013 Asymptotic Theory L. Magee revised January 21, 2013 1 Convergence 1.1 Definitions Let a n to refer to a random variable that is a function of n random variables. Convergence in Probability The scalar a

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

The Multiple Regression Model Estimation

The Multiple Regression Model Estimation Lesson 5 The Multiple Regression Model Estimation Pilar González and Susan Orbe Dpt Applied Econometrics III (Econometrics and Statistics) Pilar González and Susan Orbe OCW 2014 Lesson 5 Regression model:

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I.

Vector Autoregressive Model. Vector Autoregressions II. Estimation of Vector Autoregressions II. Estimation of Vector Autoregressions I. Vector Autoregressive Model Vector Autoregressions II Empirical Macroeconomics - Lect 2 Dr. Ana Beatriz Galvao Queen Mary University of London January 2012 A VAR(p) model of the m 1 vector of time series

More information

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) 1 2 Model Consider a system of two regressions y 1 = β 1 y 2 + u 1 (1) y 2 = β 2 y 1 + u 2 (2) This is a simultaneous equation model

More information

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data Panel data Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data - possible to control for some unobserved heterogeneity - possible

More information

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators 1 2 Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE Hüseyin Taştan 1 1 Yıldız Technical University Department of Economics These presentation notes are based on Introductory

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1)

5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) 5. Erroneous Selection of Exogenous Variables (Violation of Assumption #A1) Assumption #A1: Our regression model does not lack of any further relevant exogenous variables beyond x 1i, x 2i,..., x Ki and

More information

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models

More information

Chapter 8 Heteroskedasticity

Chapter 8 Heteroskedasticity Chapter 8 Walter R. Paczkowski Rutgers University Page 1 Chapter Contents 8.1 The Nature of 8. Detecting 8.3 -Consistent Standard Errors 8.4 Generalized Least Squares: Known Form of Variance 8.5 Generalized

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley Panel Data Models James L. Powell Department of Economics University of California, Berkeley Overview Like Zellner s seemingly unrelated regression models, the dependent and explanatory variables for panel

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 3 Jakub Mućk Econometrics of Panel Data Meeting # 3 1 / 21 Outline 1 Fixed or Random Hausman Test 2 Between Estimator 3 Coefficient of determination (R 2

More information

Additional Topics on Linear Regression

Additional Topics on Linear Regression Additional Topics on Linear Regression Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Additional Topics 1 / 49 1 Tests for Functional Form Misspecification 2 Nonlinear

More information