Econometrics I KS
Module 2: Multivariate Linear Regression

Alexander Ahammer
Department of Economics, Johannes Kepler University Linz

This version: April 16, 2018

The multiple linear regression model

The major drawback of the bivariate regression model is that the key assumption SLR.4 (zero conditional mean) is often unrealistic. The multiple linear regression model (MLR) allows us to control for many other factors that might otherwise be captured in the error term. It is thus more amenable to ceteris paribus analysis. The model with $k$ independent variables, given a sample $(y_i, x_{i1}, \dots, x_{ik})$, $i = 1, \dots, n$, reads

  y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i \quad (1)
  y_i = x_i'\beta + u_i \quad (2)

with $x_i = (1, x_{i1}, \dots, x_{ik})'$ and $\beta = (\beta_0, \beta_1, \dots, \beta_k)'$. The key assumption is

  E(u_i \mid x_i) = 0, \quad i = 1, \dots, n \quad (3)

MLR: Matrix notation

Let $y$ be an $n \times 1$ vector of observations on $y$, let $X$ be the data matrix of dimension $n \times (k+1)$ with associated parameter vector $\beta \in \mathbb{R}^{(k+1) \times 1}$, and let $u$ be an $n \times 1$ vector of disturbances. With these ingredients we can write the model in (1) as

  y = X\beta + u \quad (4)

or, more explicitly,

  \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} =
  \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix}
  \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix} +
  \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} \quad (5)

where $X\beta$ is the systematic and $u$ the stochastic component.

MLR: An example

We are interested in the effect of education on hourly wage:

  wage_i = \beta_0 + \beta_1 educ_i + \beta_2 exper_i + u_i, \quad i = 1, \dots, n \quad (6)

We control for years of labor market experience (exper). We are still primarily interested in the effect of education. The MLR takes experience out of the error term $u$; with the SLR we would have to assume $exper \perp educ$.¹ $\beta_1$ measures the effect of educ on wage when exper is held constant; $\beta_2$ measures the effect of exper on wage when educ is held constant.

¹ $\perp$ denotes statistical independence.
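As a minimal illustration, equation (6) can be estimated directly in Stata. The sketch below assumes Wooldridge's wage1 dataset is available through the user-written bcuse command (installable via ssc install bcuse):

* Sketch: estimating eq. (6), assuming wage1.dta is available via -bcuse-
ssc install bcuse, replace
bcuse wage1, clear
regress wage educ exper    // _b[educ] is the effect of educ holding exper fixed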

MLR: Estimation

As in Module 1, we are looking for a coefficient vector $\hat\beta \in \mathbb{R}^{(k+1) \times 1}$ that minimizes the sum of squared residuals. Formally, the problem reads

  \arg\min_{\hat\beta} \sum_{i=1}^{n} \hat u_i^2 = \hat u'\hat u = (y - X\hat\beta)'(y - X\hat\beta) \quad (7)
  = y'y - \hat\beta'X'y - y'X\hat\beta + \hat\beta'X'X\hat\beta \quad (8)
  = y'y - 2\hat\beta'X'y + \hat\beta'X'X\hat\beta \quad (9)

Note that the last step is possible because $\hat\beta'X'y = (y'X\hat\beta)' = y'X\hat\beta$ is a scalar and therefore equal to its own transpose. The first-order condition for minimization is

  \frac{\partial \hat u'\hat u}{\partial \hat\beta} = -2X'y + 2X'X\hat\beta = 0 \quad (10)

or, written differently,

  X'X\hat\beta = X'y \quad (11)

which is called the system of least squares normal equations.
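The normal equations are easy to check numerically. A minimal Mata sketch, using Stata's shipped auto data with price, mpg, and weight standing in for $y$ and the regressors, solves (11) by hand; the result can be compared with the regress output:

* Minimal Mata sketch: solve X'X bhat = X'y by hand on the auto data
sysuse auto, clear
regress price mpg weight                // benchmark
mata:
    y = st_data(., "price")
    X = (st_data(., ("mpg", "weight")), J(st_nobs(), 1, 1))  // append constant
    bhat = invsym(X'X) * (X'y)          // solves the normal equations (11)
    bhat'                               // coefficients: mpg, weight, constant
end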

MLR: Estimation

If $X'X$ is non-singular (i.e., an inverse exists), pre-multiplying both sides of equation (11) by $(X'X)^{-1}$ yields the OLS estimator $\hat\beta$:

  \hat\beta = (X'X)^{-1}X'y \quad (12)

Important: the matrix $X'X$ is non-singular (invertible), and $\hat\beta$ a unique solution to the minimization problem, if and only if (i) we have at least as many observations as parameters, $n \geq k+1$, and (ii) the data matrix $X$ has rank $k+1$. The second condition is violated if there are linear dependencies among the explanatory variables (i.e., perfect collinearity).
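A quick way to see the rank condition at work: if one regressor is an exact linear combination of another, $X'X$ is singular and Stata omits a variable. A small sketch on the auto data:

* Perfect collinearity in practice: rank(X) < k+1, so Stata drops a regressor
sysuse auto, clear
generate weight2 = 2*weight          // exactly collinear with weight
regress price mpg weight weight2     // Stata notes the collinearity and omits one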

MLR: Properties of the OLS estimator

The OLS estimator has various important properties that do not depend on any assumptions, but rather arise from how it is constructed. First, substitute $y = X\hat\beta + \hat u$ into the system of normal equations (11) to obtain

  X'X\hat\beta = X'y
  X'X\hat\beta = X'(X\hat\beta + \hat u)
  X'X\hat\beta = X'X\hat\beta + X'\hat u
  0 = X'\hat u \quad (13)

A number of important properties can be derived from this condition.

MLR: Properties of the OLS estimator

1. The observed values of $X$ are uncorrelated with the residuals $\hat u$. This follows immediately from (13): $X'\hat u = 0$ means that every column of $X$ is orthogonal to $\hat u$. Note that this does not mean that $X$ is uncorrelated with $u$! We have to assume this.
2. The sum of the residuals is zero. If there is a constant, the first column of $X$ is a column of ones. For the first element of $X'\hat u$ to be zero it must hold that $\sum_i \hat u_i = 0$.
3. The sample mean of the residuals is zero. This follows from the previous property: $\bar{\hat u} = n^{-1} \sum_{i=1}^{n} \hat u_i = 0$.
4. The regression hyperplane passes through the means of the observed values, $\bar x$ and $\bar y$. Recall that $\hat u = y - X\hat\beta$. Averaging over the $n$ observations gives $\bar{\hat u} = \bar y - \bar x'\hat\beta$, and by the previous property $\bar{\hat u} = 0$, so $\bar y = \bar x'\hat\beta$.

MLR: Properties of the OLS estimator

5. The predicted values $\hat y$ are uncorrelated with the residuals $\hat u$. The predicted values are $\hat y = X\hat\beta$. From this we have $\hat y'\hat u = (X\hat\beta)'\hat u = \hat\beta'X'\hat u = 0$ because $X'\hat u = 0$.
6. The mean of the predicted values for the sample equals the mean of the observed values, i.e. $\bar{\hat y} = \bar y$. The proof is left as an exercise (use the result in item 4).
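These algebraic properties are straightforward to verify numerically. A small sketch on the auto data (any regression with a constant would do):

* Checking properties 1-6 on a fitted regression
sysuse auto, clear
quietly regress price mpg weight
predict uhat, residuals
predict yhat, xb
summarize price yhat uhat          // mean(uhat) = 0 and mean(yhat) = mean(price)
correlate uhat yhat mpg weight     // residuals uncorrelated with yhat and the X's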

MLR: Properties of the OLS estimator

These properties always hold, so be careful not to infer anything from the residuals about the actual disturbances! So far we know nothing about $\hat\beta$ except that it satisfies all of the properties discussed above. We need to make assumptions about the true model in order to draw any inferences regarding $\beta$ (the true population parameters) from $\hat\beta$ (our estimator of those parameters).

MLR: Expected value and variance

The assumptions from the bivariate model translate to the multivariate case as follows:

Assumption MLR.1 (Linear in parameters). The population model is linear: $y = X\beta + u$.

Assumption MLR.2 (Random sampling). We have a random sample of $n$ observations, $\{(y_i, x_i'): i = 1, \dots, n\}$, that follows the population model in assumption MLR.1.

Assumption MLR.3 (No perfect collinearity). The data matrix $X$ has rank $k+1$.

Assumption MLR.4 (Zero conditional mean). Conditional on the entire matrix $X$, each error $u_i$ has mean zero: $E(u_i \mid X) = 0$, $i = 1, \dots, n$.

Finite sample properties

Note that we only need assumption MLR.3 (no perfect collinearity) to obtain an OLS estimate $\hat\beta$. Whether this estimate actually makes sense, i.e. is unbiased and representative of the full population, depends on the other assumptions. The zero conditional mean assumption (MLR.4) in particular often poses problems in practice. It requires that, conditional on the observed covariates $x_i$, the unobservables collected in the error term $u_i$ are zero on average. It fails in the case of
- simultaneity,
- selection,
- omitted variables,
- functional form misspecification, and
- measurement error.
We will discuss sources and consequences of these cases in Module 6.

Finite sample properties: Expected value

Theorem (Unbiasedness of OLS). Under assumptions MLR.1 through MLR.4,

  E(\hat\beta) = \beta \quad (14)

In other words, $\hat\beta$ is an unbiased estimator of $\beta$.

Proof. Rewrite the OLS estimator as

  \hat\beta = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + u) = (X'X)^{-1}(X'X)\beta + (X'X)^{-1}X'u \quad (15)

Because $X'X$ is invertible by MLR.3, $(X'X)^{-1}(X'X) = I$, thus

  \hat\beta = \beta + (X'X)^{-1}X'u \quad (16)

Finite sample properties: Expected value

Taking conditional expectations on both sides of equation (16) gives

  E(\hat\beta \mid X) = \beta + (X'X)^{-1}X'E(u \mid X) \quad (17)

By MLR.4, $E(u \mid X) = 0$, so

  E(\hat\beta \mid X) = \beta \quad (18)

This completes the proof.
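A small simulation sketch of what unbiasedness means in practice: across repeated samples from an assumed data-generating process with known $\beta_1 = 2$ (chosen purely for illustration), the average of $\hat\beta_1$ is close to 2.

* Simulation sketch of unbiasedness under an assumed DGP with beta_1 = 2
clear all
set seed 42
program define unbiassim, rclass
    drop _all
    set obs 50
    generate x = rnormal()
    generate y = 1 + 2*x + rnormal()   // E(u|x) = 0 by construction
    regress y x
    return scalar b = _b[x]
end
simulate b = r(b), reps(2000) nodots: unbiassim
summarize b                            // mean of b is close to 2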

Finite sample properties: Expected variance

Assumptions MLR.1 through MLR.4 are as before (linear in parameters, random sampling, no perfect collinearity, zero conditional mean). We add:

Assumption MLR.5 (Homoskedasticity and no serial correlation). The error $u_i$ has the same variance given any values of the covariates, i.e. $Var(u_i \mid X) = \sigma^2$, $i = 1, \dots, n$, and there is no serial correlation between the errors: $Cov(u_i, u_j \mid X) = 0$ for all $j \neq i$. We can write these two assumptions compactly as $Var(u \mid X) = \sigma^2 I$.

Finite sample properties: Expected variance

Assumption MLR.5 requires that errors are homoskedastic and that there is no serial correlation (meaning that errors are not correlated across observations; this is especially important if you deal with panel data, but sometimes also in cross-sectional settings). Combining these assumptions, we can write the variance-covariance matrix of the disturbances as

  Var(u \mid X) = E(uu' \mid X) = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix} = \sigma^2 I \quad (19)

Finite sample properties: Expected variance

Theorem (Variance-covariance matrix of the OLS estimator). Under assumptions MLR.1 through MLR.5,

  Var(\hat\beta) = \sigma^2 (X'X)^{-1} \quad (20)

Proof. See Wooldridge (2013), p. 805.

For one particular element $\hat\beta_j$ of $\hat\beta$, the variance is obtained by multiplying $\sigma^2$ by the $j$th diagonal element of $(X'X)^{-1}$. It can also be written as

  Var(\hat\beta_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)} \quad (21)

where $SST_j = \sum_{i=1}^{n} (x_{ij} - \bar x_j)^2$ is the total sample variation in $x_j$ and $R_j^2$ is the R-squared from regressing $x_j$ on all other independent variables (including an intercept).
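Equation (21), with $\sigma^2$ replaced by its estimate, can be reproduced by hand in Stata. The sketch below uses the shipped auto data and treats mpg as $x_j$; the displayed value recovers the standard error of mpg reported by regress:

* Reproducing se(bhat_j)^2 = sigmahat^2 / (SST_j (1 - R2_j)) for x_j = mpg
sysuse auto, clear
regress price mpg weight
scalar s2 = e(rmse)^2                 // sigmahat^2
quietly regress mpg weight            // auxiliary regression of x_j on the rest
scalar R2j = e(r2)
quietly summarize mpg
scalar SSTj = r(Var)*(r(N) - 1)       // total sample variation in x_j
display sqrt(s2/(SSTj*(1 - R2j)))     // matches the reported se for mpg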

Finite sample properties: Expected variance

The unbiased estimator of the error variance in the multivariate case is given by

  \hat\sigma^2 = \frac{\hat u'\hat u}{n - k - 1} \quad (22)

where $\hat u'\hat u$ is again the sum of squared residuals.

Theorem (Unbiasedness of $\hat\sigma^2$). Under assumptions MLR.1 through MLR.5, $\hat\sigma^2$ is an unbiased estimator of $\sigma^2$. That is,

  E(\hat\sigma^2 \mid X) = \sigma^2 \quad \text{for all } \sigma^2 > 0 \quad (23)

Proof. See Wooldridge (2013), p. 807.
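In Stata output, the Root MSE is exactly $\hat\sigma$ from equation (22), as this short check on the auto data illustrates:

* Root MSE is sqrt(u'u/(n-k-1))
sysuse auto, clear
quietly regress price mpg weight
display e(rss)/e(df_r)    // sigmahat^2 = SSR/(n-k-1)
display e(rmse)^2         // identical by construction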

Finite sample properties: Gauss-Markov theorem

Gauss-Markov Theorem. Under assumptions MLR.1 through MLR.5, $\hat\beta$ is the best linear unbiased estimator.

Proof. See Wooldridge (2013), p. 808.

The Gauss-Markov theorem translated: OLS is the estimator with the smallest variance among all linear unbiased estimators. OLS is BLUE.

Inference: Sampling distributions of the OLS estimators

Although it is not necessary for the Gauss-Markov theorem to hold,² we assume normally distributed disturbances in order to derive sampling distributions.

Assumption MLR.6 (Normality of errors). Conditional on $X$, $u$ is distributed as multivariate normal with mean zero and variance-covariance matrix $\sigma^2 I$. That is,

  u \sim \text{Normal}(0, \sigma^2 I) \quad (24)

² We will show later that, as soon as asymptotics kick in, we no longer need the normality assumption for our test statistics to be valid.

Inference: Sampling distributions of the OLS estimators

Theorem (Normality of $\hat\beta$). Under the classical linear model assumptions MLR.1 through MLR.6, $\hat\beta$ conditional on $X$ is distributed as multivariate normal with mean $\beta$ and variance-covariance matrix $\sigma^2 (X'X)^{-1}$. That is,

  \hat\beta \sim \text{Normal}(\beta, \sigma^2 (X'X)^{-1}) \quad (25)

Therefore,

  \frac{\hat\beta_j - \beta_j}{sd(\hat\beta_j)} \sim \text{Normal}(0, 1) \quad (26)

Proof. Wooldridge (2013, p. 113) provides a sketch of the proof for (25). The result in (26) is straightforward: if we subtract the mean from a normally distributed random variable and divide by its standard deviation, we get a standard normal variable with mean zero and standard deviation one.

Inference: Sampling distributions of the OLS estimators

Theorem (Distribution of t-statistics). Under assumptions MLR.1 through MLR.6,

  \underbrace{\frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)}}_{\text{t-statistic}} \sim t_{n-k-1} \quad (27)

Proof. Wooldridge (2013), p. 808.

This is an important result for inference. It says that when we estimate $\sigma$ in $sd(\hat\beta_j)$ by $\hat\sigma$, which yields $se(\hat\beta_j)$, the ratio $(\hat\beta_j - \beta_j)/se(\hat\beta_j)$ is t-distributed with $n - k - 1$ degrees of freedom. Note that $\beta_j$ is some hypothesized value.

Inference: Testing a single population parameter

Pick a significance level and formulate a null hypothesis ($H_0$):
- One-sided alternatives: $H_0: \beta_j \leq 0$; $H_1: \beta_j > 0$.
- Two-sided alternatives: $H_0: \beta_j = 0$; $H_1: \beta_j \neq 0$.
- One-sided alternatives against a hypothesized value: $H_0: \beta_j \geq a_j$; $H_1: \beta_j < a_j$, with $a_j \in \mathbb{R}$.

t-statistic: $t \equiv (\text{estimate} - \text{hypothesized value}) / \text{standard error}$, distributed according to theorem (27).

p-value for the t-test: the smallest significance level at which $H_0$ would be rejected.

Confidence interval, a range of likely values for $\beta_j$:

  CI \equiv \hat\beta_j \pm c \cdot se(\hat\beta_j) \quad (28)

where, for a 95% confidence interval, $c$ is the 97.5th percentile of the $t_{n-k-1}$ distribution.

Economic vs. statistical significance.
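A sketch of these computations in Stata, again on the auto data: the hand-computed t statistic and 95% confidence bounds match what regress reports.

* t statistic and 95% CI for one coefficient by hand, cf. (27) and (28)
sysuse auto, clear
regress price mpg weight
display _b[mpg]/_se[mpg]                 // t for H0: beta_mpg = 0
scalar c = invttail(e(df_r), 0.025)      // 97.5th percentile of t(n-k-1)
display (_b[mpg] - c*_se[mpg]) "  " (_b[mpg] + c*_se[mpg])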

Inference: Testing multiple linear restrictions

The F-test allows us to test multiple hypotheses jointly. Consider a model with $k = 4$ independent variables: $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + u_i$. Suppose you want to test whether $x_1$, $x_2$, and $x_3$ are jointly insignificant. Formulate a null hypothesis:

  H_0: \beta_1 = \beta_2 = \beta_3 = 0
  H_1: H_0 \text{ is not true}

F-statistic:

  F \equiv \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)} \sim F_{q,\, n-k-1} \quad (29)

where $q$ is the number of restrictions, $SSR_r$ is the sum of squared residuals from the restricted model, and $SSR_{ur}$ is the SSR from the unrestricted model.
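The F statistic in (29) can be computed by hand and compared with Stata's built-in test command. A sketch on the auto data, testing $q = 2$ exclusion restrictions:

* F statistic from restricted vs. unrestricted SSR, cf. eq. (29)
sysuse auto, clear
quietly regress price mpg weight length   // unrestricted model
scalar SSRur = e(rss)
scalar dfur  = e(df_r)                    // n - k - 1
quietly regress price length              // restricted: beta_mpg = beta_weight = 0
scalar SSRr = e(rss)
display ((SSRr - SSRur)/2)/(SSRur/dfur)   // q = 2 restrictions
quietly regress price mpg weight length
test mpg weight                           // built-in F test agrees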

Inference: Example

We estimate the following model:

  final_i = \beta_0 + \beta_1 attend_i + \beta_2 hwrte_i + u_i \quad (30)

where $final \in [10, 39]$ is the number of final exam points, $attend \in [2, 32]$ is the number of classes attended out of 32, and $hwrte$ is the fraction of homework assignments turned in, times 100 (i.e., a percentage).

. reg final attend hwrte

      Source |       SS       df       MS              Number of obs =     674
-------------+------------------------------           F(  2,   671) =    9.20
       Model |  401.109761     2  200.554881           Prob > F      =  0.0001
    Residual |  14623.5445   671   21.793658           R-squared     =  0.0267
-------------+------------------------------           Adj R-squared =  0.0238
       Total |  15024.6543   673   22.324895           Root MSE      =  4.6684

------------------------------------------------------------------------------
       final |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      attend |   .0828712    .043704     1.90   0.058    -.0029418    .1686842
       hwrte |   .0217245   .0119752     1.81   0.070    -.0017889    .0452378
       _cons |    21.8012   .9725956    22.42   0.000     19.89151     23.7109
------------------------------------------------------------------------------
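The output above should be reproducible with Wooldridge's attend dataset, assuming it is available via the user-written bcuse command:

* Replicating the log above, assuming attend.dta is available via -bcuse-
bcuse attend, clear
regress final attend hwrte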

Asymptotics

So far we have looked at finite sample properties of OLS, i.e. properties that hold regardless of how large $n$ is. However, it is also important to know the large sample or asymptotic properties of OLS; these are defined as the sample size grows without bound. An important result is that even without assuming normality (MLR.6), the t and F statistics are approximately t and F distributed.

Asymptotics: Consistency

In general, an estimator $\hat\theta$ is consistent if it converges in probability³ to the population parameter $\theta$, that is, $\hat\theta \xrightarrow{p} \theta$, or $\text{plim}\,\hat\theta = \theta$. Note that unbiasedness does not necessarily imply consistency, and consistency does not automatically imply unbiasedness.

Consistency of OLS: OLS is unbiased under assumptions MLR.1 through MLR.4, so $\hat\beta_j$ is always distributed around $\beta_j$. The distribution of $\hat\beta_j$ becomes more and more tightly concentrated around $\beta_j$ as the sample size grows; as $n \to \infty$, it collapses to the single point $\beta_j$ (see the simulation sketch below).

³ A random variable $X_n$ converges to $X$ in probability if for every $\varepsilon > 0$,

  \lim_{n \to \infty} P(|X_n - X| \geq \varepsilon) = 0 \quad (31)

Note that here the probability converges, not the random variable itself.
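A minimal simulation sketch (with an assumed data-generating process, $\beta_1 = 2$, chosen for illustration): the spread of $\hat\beta_1$ across replications shrinks markedly as $n$ grows.

* Simulation sketch of consistency: sampling distribution tightens with n
clear all
set seed 42
program define conssim, rclass
    syntax [, n(integer 100)]
    drop _all
    set obs `n'
    generate x = rnormal()
    generate y = 1 + 2*x + rnormal()
    regress y x
    return scalar b = _b[x]
end
simulate b = r(b), reps(500) nodots: conssim, n(10)
summarize b      // wide spread around 2
simulate b = r(b), reps(500) nodots: conssim, n(1000)
summarize b      // much tighter around 2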

Asymptotics: Consistency

[Figure: Sampling distributions of $\hat\beta_1$. Source: Wooldridge (2013), Figure 5.1.]

Asymptotics: Consistency

Theorem (Consistency of OLS). Under assumptions MLR.1 through MLR.4, the OLS estimator $\hat\beta$ is consistent for $\beta$.

Proof. We show consistency for the bivariate case with a single regressor; the general proof for $k$ regressors is given in Wooldridge (2013), p. 810. First note that we can also write the OLS estimator $\hat\beta_1$ simply as

  \hat\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \quad (32)

and after plugging in $y_i = \beta_1 x_i + u_i$ and some algebra, we obtain

  \hat\beta_1 = \beta_1 + \frac{n^{-1} \sum_{i=1}^{n} x_i u_i}{n^{-1} \sum_{i=1}^{n} x_i^2} \quad (33)

Asymptotics Consistency As n, we have ˆβ 1 p β + plim n 1 n i=1 x iu i plim n 1 n i=1 x2 i By the law of large numbers, 4 we have plim n 1 n i=1 x i = E(x i ) and plim n 1 n i=1 u i = E(u i ). Since we assume zero conditional mean (MLR.4), which obviously implies E(u i ) = 0, we get ˆβ 1 p β + 0 plim n 1 n i=1 x2 i (34) = β (36) This proofs that ˆβ 1 converges to β in probability. 4 Let X 1,..., X n be some sequence of i.i.d. random variables with arbitrary distribution. The law of large number states that X n = n 1 (X 1,..., X n) E(X n) (35) That is, the sample average converges to the expected value. Alexander Ahammer (JKU) Module 2: Multivariate Linear Regression 30 / 33

Asymptotics: Consistency

Consistency is related to bias as follows: an estimator $\hat\theta$ is consistent if it converges to the population value $\theta$, which requires that the bias, $\text{Bias}(\hat\theta) = E(\hat\theta) - \theta$, converges to zero. Individual estimators in a sequence $\hat\theta_n$ may be biased, but the sequence is still consistent if the bias converges to zero as $n \to \infty$. Estimators can therefore be
- unbiased but not consistent, or
- biased but consistent.

Asymptotics: Normality

Another important large sample property of OLS is that $\hat\beta_j$ is asymptotically normally distributed under assumptions MLR.1 through MLR.5. Formally,

  \frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \overset{a}{\sim} \text{Normal}(0, 1) \quad (37)

where $se(\hat\beta_j)$ is the usual OLS standard error. This result stems from the central limit theorem.⁵ We do not need the normality assumption MLR.6 for our test statistics to be valid, as long as the sample size is reasonably large. All we have to assume is finite error variance, $Var(u) < \infty$.

⁵ The central limit theorem states that standardized sums of i.i.d. random variables converge in distribution to a normal distribution, irrespective of their own distribution. Let $X_1, \dots, X_n$ be i.i.d. random variables with finite expected value $\mu$ and variance $\sigma^2$. Then

  \frac{1}{\sqrt{n}\,\sigma} \left( \sum_{i=1}^{n} X_i - n\mu \right) = \sqrt{n}\, \frac{\bar X - \mu}{\sigma} \xrightarrow{d} N(0, 1) \quad (38)
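A simulation sketch of asymptotic normality: even with deliberately skewed chi-squared errors (an assumed DGP, so MLR.6 fails by construction), the standardized slope estimate behaves approximately like a standard normal draw once $n$ is moderately large.

* Asymptotic normality without MLR.6: skewed errors, standardized bhat ~ N(0,1)
clear all
set seed 42
program define cltsim, rclass
    drop _all
    set obs 200
    generate x = rnormal()
    generate u = rchi2(1) - 1          // skewed error with mean zero
    generate y = 1 + 2*x + u
    regress y x
    return scalar z = (_b[x] - 2)/_se[x]
end
simulate z = r(z), reps(2000) nodots: cltsim
summarize z, detail                    // mean near 0, sd near 1, mild skewness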

Literature

Main reference: Wooldridge, J. M. (2013). Introductory Econometrics: A Modern Approach, 5th ed., South-Western College Publishing.

Additional reference: Greene, W. H. (2012). Econometric Analysis, 7th ed., Pearson.