ECONOMETRICS Introduction & First Principles


1 ECONOMETRICS Introduction & First Principles. David C. Broadstock, Research Institute of Economics & Management, China. OSEC Pre-Master Course, 2015.

2 COURSE OUTLINE
- Part 1. Introduction to econometrics: including review of types of data used and basic notation for cross-sectional, time-series and panel data.
- OLS and the linear regression.
- Understanding test statistics with a review of Monte-Carlo simulation.
- Mis-specification testing.
- Heteroskedasticity, heterogeneity and structural breaks.
- Panel data.
- Time series methods.
- Advanced topics.

3 CLASS SCHEDULE Dates/locations of econometrics classes

TABLE: Class schedule
Date                       Venue          Lecture      Lecturer
21 Mar, Saturday morning   C205 Jingshi   Lecture 1    David Broadstock
28 Mar, Saturday morning   C205 Jingshi   Lecture 2    David Broadstock
11 Apr, Saturday morning   C205 Jingshi   Lecture 3    David Broadstock
18 Apr, Saturday morning   C205 Jingshi   Lecture 4    David Broadstock
25 Apr, Saturday morning   C205 Jingshi   Lecture 5    «Guest»
09 May, Saturday morning   C205 Jingshi   Lecture 6    David Broadstock
16 May, Saturday morning   C205 Jingshi   Lecture 7    David Broadstock
23 May, Saturday morning   C205 Jingshi   Lecture 8    David Broadstock
TBA                        TBA            Final Exam

4 GRADING/CLASS REQUIREMENTS This course is aimed at reviewing the overall merits and approaches of (statistical) econometric methods.
- Attendance = 5%
- Homeworks = 25%
- In-class presentation = 10%
- Final exam = 60%
Each week we will have a presentation by one group discussing a paper of my choosing, aiming at 15 minutes. These will be used to tackle issues relating to econometric practices from a wider viewpoint. We shall begin with the birth of Econometrica and the first winner of the Nobel Prize in Economics!

5 SOME ONLINE RESOURCES Use the web - it is invaluable! This course is aimed at reviewing the overall merits and approaches of (statistical) econometric methods.
- The website of the Wooldridge book (datasets available)
- Econometrics Journal's online resources: under these, there is a link to a list of different econometrics software packages (many are free).
You should also be able to implement basic operations in a spreadsheet, or find software within the university. You may also wish to check the course webpage, where lecture slides and homework materials will be posted.

6 INTRODUCTION Today's learning outcomes.
- To get familiar with each other
- To spark an interest in econometric thinking
- To develop an appreciation of the different types of data in the real world
- To understand the components of a simple linear model
- To understand the components of an estimated simple linear model
- To understand what the least squares criterion is
- To be aware of the different types of data we may work with
- To begin to consider the notion of causality (probably next week)

7 ECONOMETRICS - WHAT IS IT? Analysis of statistical relationships in economic data. A highly useful tool, with lots of applications:
- in business
- in government
- in research centres
- in your 3rd year dissertation
- in postgraduate studies
It takes an effort to learn, but once learned, it is very rewarding, both intellectually and in terms of your employability. If you don't learn econometrics while at university, you'll probably never learn it!

8 HOW DO WE THINK OF THE PROBLEM Part 1/2. Electricity consumption ($Y$) depends on income ($X$). However, consumption also depends on several other economic variables that we ignore for now. In economics we express our ideas about relationships between economic variables using the mathematical concept of a function. For example, to express the relationship between electricity consumption for individual $i$, $Y_i$, and his/her income $X_i$, we may write
$$Y_i = f(X_i) \qquad (1)$$
However, when studying this relationship one recognizes that the actual consumption by an individual is the result of a systematic part and a random, unpredictable component $u_i$ that we call the random term:
$$Y_i = f(X_i) + u_i \qquad (2)$$
The random term accounts for the many factors that affect consumption that we have omitted from this simple model, and it also reflects the intrinsic uncertainty in the behaviour of individuals.

9 HOW DO WE THINK OF THE PROBLEM Part 2/2. To complete the specification of the econometric model, we must also say something about the form of the algebraic relationship among our economic variables. In the example about the relationship of annual electricity expenditure and annual income, we assume that the systematic part of the demand relation is linear:
$$f(X_i) = \beta_1 + \beta_2 X_i \qquad (3)$$
The corresponding econometric model is
$$Y_i = \beta_1 + \beta_2 X_i + u_i \qquad (4)$$
This, together with some assumptions about $u_i$ and $X_i$, is called the simple linear regression model.
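To make equation (4) concrete, here is a minimal sketch that generates artificial data from the simple linear regression model. The function name `simulate_linear_model`, the parameter values, the uniform range for income and the normal distribution for the random term are all assumptions chosen for illustration; none of them come from the slides.

```python
import random

def simulate_linear_model(n, beta1, beta2, sigma, seed=0):
    """Draw n observations from Y_i = beta1 + beta2 * X_i + u_i."""
    rng = random.Random(seed)                             # fixed seed, reproducible draws
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]       # hypothetical income values
    us = [rng.gauss(0.0, sigma) for _ in range(n)]        # random term u_i (assumed normal)
    ys = [beta1 + beta2 * x + u for x, u in zip(xs, us)]  # systematic part plus random term
    return xs, ys, us

# Illustrative parameter values only.
xs, ys, us = simulate_linear_model(n=5, beta1=2.0, beta2=0.5, sigma=1.0)
for x, y, u in zip(xs, ys, us):
    print(f"X={x:6.3f}  systematic={2.0 + 0.5 * x:6.3f}  u={u:6.3f}  Y={y:6.3f}")
```

In real data only X and Y are observed; the random term is returned here purely so that the decomposition into a systematic and a random part can be inspected.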

10 THE PROBLEM OF INFERENCE We will spend some time on this. Given that the data have been generated by a model of the form $Y_i = \beta_1 + \beta_2 X_i + u_i$, what are the values of $\beta_1$ and $\beta_2$ in such a model? Using a sample of observations on $Y$ and $X$ we would like to obtain estimates $b_1$ and $b_2$ of these parameters. We would like to test hypotheses on these parameters. For example, we want to know if $b_1$ and $b_2$ are somehow informative/stable. Also, we would like to understand if we are using the correct model in the first place.

11 SOME JARGON There is a lot...
- $Y_i = f(X_i) + u_i$ with $f(X_i) = \beta_1 + \beta_2 X_i$ (plus assumptions) is the linear regression model
- $f(X_i) = \beta_1 + \beta_2 X_i$ is called the linear regression
- $\beta_1$ and $\beta_2$ are the parameters: $\beta_1$ is the intercept parameter and $\beta_2$ is the slope parameter
- Since $\frac{df(X)}{dX} = \frac{d(\beta_1 + \beta_2 X)}{dX} = \beta_2$, $\beta_2$ tells us the increase of $E(Y \mid X)$ for each unit increase in $X$
- $Y$ is the dependent variable
- $X$ is the independent variable, or the regressor
- $u$ is the error term

12 SOME MORE JARGON ... get used to the terms. Given two estimates $b_1$ and $b_2$, the quantity $\hat{Y}_i = b_1 + b_2 X_i$ is the predicted value for individual $i$. The difference $Y_i - \hat{Y}_i = \hat{u}_i$ is the residual. Notice that the residuals and the errors are different things: "The error of a sample is the deviation of the sample from the (unobservable) true function value, while the residual of a sample is the difference between the sample and the estimated function value." (Wikipedia!)
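The distinction between errors and residuals can be seen in a small constructed example. This is only a sketch: the "true" parameters, the error values and the two estimates below are made-up numbers, and the estimates are deliberately not computed by any estimator.

```python
# True (normally unobservable) parameters and errors, invented for illustration.
beta1, beta2 = 1.0, 2.0
xs = [1.0, 2.0, 3.0, 4.0]
errors = [0.5, -0.5, 0.2, -0.2]             # u_i: deviations from the true line
ys = [beta1 + beta2 * x + u for x, u in zip(xs, errors)]

# Two hypothetical estimates b1 and b2 (not obtained by OLS here).
b1, b2 = 1.2, 1.9
fitted = [b1 + b2 * x for x in xs]          # predicted values
residuals = [y - f for y, f in zip(ys, fitted)]  # deviations from the estimated line

for u, r in zip(errors, residuals):
    print(f"error={u:5.2f}  residual={r:5.2f}")
```

The residuals can be computed from the data and the estimates alone, whereas the errors require knowledge of the true parameters.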

13 BEFORE DISCUSSING ESTIMATION ... let's think about data and variables a little. There are a number of different types of data structures - we will not handle them all in this course, but here are some of the main ones to be aware of:
- Cross-sectional
- Panel
- Time series
- Discrete/categorical outcomes
- Count-data/truncated data
- Big-data

14 CROSS-SECTIONAL DATA Cross-sectional data is the simplest type of data available to us. We rarely work with this type of data in empirical research, but the teaching of econometrics and development of econometric theory still takes advantage of it: many innovative estimators start life without considering aspects of time or individual-type heterogeneity, often considering these (necessary) refinements once the cross-sectional world is understood. Imagine we have a sample of households for a single year. Denote each household $i$, and the total number of households in the sample as $I$. Note, we could easily have a cross section of individuals, or countries or firms etc. We typically will denote the data $Y_i$ for the dependent variable and $X_i$ for independent variables. An example cross-sectional demand function might look something like:
$$Q_i = \alpha + \beta_P P_i + \beta_Y Y_i + \varepsilon_i$$

15 PLOTTING CROSS-SECTION DATA Below are some hypothetical data on price, income and quantity consumed - we will revisit next week why it might be interesting to work with hypothetical data sometimes.
[Figure: two scatter plots of the hypothetical data, one with axes P and Q, one with axes Y and Q.]

16 PANEL DATA Panel data is an intuitively simple extension to cross-sectional data. In short, panel data includes information on individuals $i = 1, \dots, I$, similar in this sense to cross-sectional data, but in addition records this information over many time periods $t = 1, \dots, T$, e.g. years, months, quarters etc. Time periods are ideally (though not always) equidistant, meaning that the amount of time passed between periods 1 and 2 is the same as for 2 and 3 and all other sequential periods. We typically will denote the data $Y_{it}$ for the dependent variable and $X_{it}$ for independent variables. An example panel-data demand function might look something like:
$$Q_{it} = \alpha_i + \beta_P P_{it} + \beta_Y Y_{it} + \varepsilon_{it}$$
Note: There is an important difference between panel data (sometimes the name given to "repeated cross-sections") and longitudinal data (sometimes referred to as "pure panel").

17 PANEL-DATA ILLUSTRATED Below is an example of energy demand for 17 countries.
[Figure: OECD 17, log energy consumption per capita, log(epc), plotted against time.]

18 TIME SERIES DATA Time series data is data which concentrates on a single individual (again noting the general interpretation we have for the term "individual") over multiple time periods. It allows for more comprehensive understanding of trends in the world and the exploration of phenomena such as periods of economic boom, or sudden price collapses. We typically will denote the data $Y_t$ for the dependent variable and $X_t$ for independent variables. An example time-series demand function might look something like:
$$Q_t = \alpha + \beta_P P_t + \beta_Y Y_t + \varepsilon_t$$
A particular interest in time series is the treatment of high-frequency data, which are increasingly commonplace.

19 TIME-SERIES DATA ILLUSTRATED Below is an example of UK gasoline demand.
[Figure: UK real (weighted) gasoline price, log(price), plotted against time.]

20 DISCRETE/CATEGORICAL OUTCOMES Discrete or categorical variables can be found in cross-sections, panel-data and time series. These variables represent things that have a finite number of outcomes, for example the outcome of tossing a coin, or perhaps the decision by OPEC to change supply. Discrete variables can be simple: consider for example a variable $OPEC_t$ intended to reflect the decision by OPEC to change (either increase or decrease) oil supply in period $t$. We could have:
$$OPEC_t = \begin{cases} 0 & \text{if supply remains unchanged} \\ 1 & \text{if supply changes} \end{cases}$$
We must be careful to consider if the discrete variable has an order. For instance, choosing which color to have your car requires comparing many colors, which have no natural order. Alternatively, a variable describing satisfaction (1=unhappy, 5=very happy) has an intuitive ordering to it. This can have important consequences for estimation.

21 COUNT-DATA/TRUNCATED DATA Count-data and truncated data are two other types of special variables, which can again have important consequences for estimation when present:
- Count data will generally be variables measured in integers (i.e. they cannot be obtained in parts) and which may take any integer value from 0 to $\infty$.
- Truncated data is often continuous over a range, for example the share of gasoline in total energy will be a maximum of 1 and a minimum of zero - but is perfectly continuous on this range and not restricted to integer values.

22 BIG DATA
"Big Data represents the Information assets characterized by such a High Volume, Velocity and Variety to require specific Technology and Analytical Methods for its transformation into Value" [De Mauro et al. (2015)]
"Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes (2560 terabytes) of data, the equivalent of 167 times the information contained in all the books in the US Library of Congress." [The Economist (2010)]
I am presently trying to work with a 400GB dataset on energy consumption across 16,000 households - simply opening the data is beyond the capacity of most software and PCs - not to mention the programming ability of the average economist...

23 VARIABLE TYPES Most data types can contain one or more of the following. We will review four general types of variables:
- Continuous variables: these are variables that in their purest sense can take any value from minus infinity to plus infinity. We will in most cases assume this for the dependent variable.
- Bounded/truncated variables: variables of this form may not go above/below certain values. They can be bounded from one or two sides. This can have important effects on density estimation, for example (solved via reflection).
- Discrete variables: as already discussed, these variables depict selection among a finite set of outcomes and can lead to using quite specialized statistical approaches.
- Latent variables: see 2 slides later.

24 VARIABLE TYPES ILLUSTRATED Below are some examples of how different variables look.
[Figure: three frequency histograms: a continuous normal variable (mean=0); a 'truncated from the left' normal variable (mean=0, sd=1, lower=0, upper=infinity); a 'truncated from above and below' normal variable (mean=0, sd=1, lower=0, upper=1).]

25 LATENT VARIABLES Latent variables, and more generally latent information, are things economists have used in various ways for some time. Beyond simply unobserved, latent variables often fall into the territory of being unobservable; in economics this might for example take the form of preferences. This will at first seem counter-intuitive: how can the behavior of something immeasurable be quantified? The answers to this are interesting and highlight the more elegant aspects of economics and statistics - incorporating wisdom into rigorous mathematical structures. We will consider technological progress and underlying trends as some intuitive pedagogical examples. A relatively mainstream example would also be factor analysis and its variants.

26 GUESSING THE PARAMETERS Art or science? Let us turn attention back to the econometrics, and more specifically the nature of estimation. So, how do we determine $b_1$ and $b_2$? Fit the line in such a way that our prediction mistakes are minimised in the best possible way. Does this mean we can minimize the sum of the residuals?

27 GUESSING THE PARAMETERS The least squares criterion. Instead of choosing $b_1$ and $b_2$ to minimize the sum of the residuals, we choose $b_1$ and $b_2$ to minimize the sum of the squared residuals
$$\hat{u}_1^2 + \hat{u}_2^2 + \hat{u}_3^2 + \dots + \hat{u}_n^2 \qquad (5)$$
$$= (Y_1 - b_1 - b_2 X_1)^2 + (Y_2 - b_1 - b_2 X_2)^2 + (Y_3 - b_1 - b_2 X_3)^2 + \dots + (Y_n - b_1 - b_2 X_n)^2 \qquad (6)$$
Notice that the sum of the squared residuals can be zero if and only if all the residuals are zero.
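A short numerical sketch shows why the plain sum of residuals is a poor criterion. The three data points and the two candidate lines below are invented for illustration: a clearly wrong horizontal line has residuals that sum to zero because positive and negative mistakes cancel, while the sum of squared residuals separates the two candidates.

```python
def residuals(xs, ys, b1, b2):
    """Residuals Y_i - b1 - b2 * X_i for a candidate line."""
    return [y - b1 - b2 * x for x, y in zip(xs, ys)]

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]                        # points lying exactly on Y = 2X

flat = residuals(xs, ys, b1=4.0, b2=0.0)    # horizontal line through the mean of Y
exact = residuals(xs, ys, b1=0.0, b2=2.0)   # the line the data actually lie on

sum_flat = sum(flat)                        # positive and negative mistakes cancel
ssr_flat = sum(r ** 2 for r in flat)        # squaring stops the cancellation
ssr_exact = sum(r ** 2 for r in exact)      # zero only because every residual is zero

print(sum_flat, ssr_flat, ssr_exact)
```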

28 THE LEAST SQUARES CRITERION Our first estimator. The study of the linear regression model is the focus of this course. A good starting point for understanding the linear regression model is Chapters 1 and 2 of Wooldridge. This course requires that you are familiar with the material in the Review section at the end of the book. It is highly recommended that you read through it during the beginning of the term.

29 A RESEARCH FORMAT Before we go any further with the mechanics.
1. It all starts with a problem (or question).
2. Economic theory gives us a way of thinking about the problem: what economic variables are involved and what is the possible direction of the relationship(s)?
3. The working economic model leads to an econometric model. We must choose a functional form and make some assumptions about the nature of the error term.
4. Sample data are obtained, and a desirable method of statistical analysis chosen, based on our initial assumptions and our understanding of how the data were collected.
5. Estimates of the unknown parameters are obtained with the help of a statistical software package, predictions are made and hypothesis tests are performed.
6. Model diagnostics are performed to check the validity of the assumptions that were made. For example, were all of the right-hand side explanatory variables relevant? Was the correct functional form used?
7. The economic consequences and the implications of the empirical results are analyzed and evaluated. What economic resource allocation and distribution results are implied, and what are their policy-choice implications? What remaining questions might be answered with further study or new and better data?

30 ORDINARY LEAST SQUARES (OLS) The workhorse of econometrics. Getting back to the idea of guessing the parameters, we have already introduced the least squares criterion In the next slides we review the mechanics of least squares estimation It is little more than standard optimization, but over a large number of observations We concentrate on the simple linear regression for illustration, i.e. the case where there is only one X variable, since the mechanics get much more involved with more than one X, after which matrix algebra becomes favorable

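31 MINIMIZATION (1/6) Simple linear regression. The minimization problem
$$\min_{(b_1, b_2)} \sum_{i=1}^{n} (Y_i - b_1 - b_2 X_i)^2 \qquad (7)$$
leads to the following first order conditions:
$$\frac{\partial}{\partial b_1} \sum_{i=1}^{n} (Y_i - b_1 - b_2 X_i)^2 = 0 \qquad (8a)$$
$$\frac{\partial}{\partial b_2} \sum_{i=1}^{n} (Y_i - b_1 - b_2 X_i)^2 = 0 \qquad (8b)$$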

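32 MINIMIZATION (2/6) Simple linear regression. By taking the derivatives of each term of the sums, we can re-write the first order conditions as
$$\sum_{i=1}^{n} \frac{\partial}{\partial b_1} (Y_i - b_1 - b_2 X_i)^2 = 0 \qquad (9a)$$
$$\sum_{i=1}^{n} \frac{\partial}{\partial b_2} (Y_i - b_1 - b_2 X_i)^2 = 0 \qquad (9b)$$
Then, evaluating these two derivatives we obtain
$$-2 \sum_{i=1}^{n} (Y_i - b_1 - b_2 X_i) = 0 \qquad (10a)$$
$$-2 \sum_{i=1}^{n} X_i (Y_i - b_1 - b_2 X_i) = 0 \qquad (10b)$$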

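33 MINIMIZATION (3/6) Simple linear regression. The $-2$ term cancels from each equation, giving:
$$\sum_{i=1}^{n} (Y_i - b_1 - b_2 X_i) = 0 \qquad (11a)$$
$$\sum_{i=1}^{n} X_i (Y_i - b_1 - b_2 X_i) = 0 \qquad (11b)$$
Expanding out the brackets and re-arranging these we obtain the normal equations:
$$\sum_{i=1}^{n} Y_i = n b_1 + b_2 \sum_{i=1}^{n} X_i \qquad (12a)$$
$$\sum_{i=1}^{n} X_i Y_i = b_1 \sum_{i=1}^{n} X_i + b_2 \sum_{i=1}^{n} X_i^2 \qquad (12b)$$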

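34 MINIMIZATION (4/6) Simple linear regression. Now we can solve for $b_1$ and $b_2$ simultaneously by multiplying (12a) by $\sum_{i=1}^{n} X_i$ and (12b) by $n$ to give:
$$\sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i = n b_1 \sum_{i=1}^{n} X_i + b_2 \Big( \sum_{i=1}^{n} X_i \Big)^2 \qquad (13a)$$
$$n \sum_{i=1}^{n} X_i Y_i = n b_1 \sum_{i=1}^{n} X_i + n b_2 \sum_{i=1}^{n} X_i^2 \qquad (13b)$$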

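35 MINIMIZATION (5/6) Simple linear regression. Subtracting (13a) from (13b):
$$n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i = b_2 \Big[ n \sum_{i=1}^{n} X_i^2 - \Big( \sum_{i=1}^{n} X_i \Big)^2 \Big] \qquad (14)$$
From which it follows that:
$$b_2 = \frac{n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n \sum_{i=1}^{n} X_i^2 - \big( \sum_{i=1}^{n} X_i \big)^2} \qquad (15)$$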

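36 MINIMIZATION (6/6) Simple linear regression. Now, given $b_2$ we can recover $b_1$ by recalling normal equation (12a) and re-arranging to give:
$$n b_1 = \sum_{i=1}^{n} Y_i - b_2 \sum_{i=1}^{n} X_i \qquad (16a)$$
$$b_1 = \frac{\sum_{i=1}^{n} Y_i}{n} - b_2 \frac{\sum_{i=1}^{n} X_i}{n} = \bar{Y} - b_2 \bar{X} \qquad (16b)$$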
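The closed-form results (15) and (16b) translate directly into a few lines of code. The sketch below is one possible implementation using raw sums; the function name `ols_simple` and the five data points are assumptions made for illustration.

```python
def ols_simple(xs, ys):
    """Least squares estimates (b1, b2) from the raw-sum formulas (15) and (16b)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2          # zero when X has no variation
    if denom == 0:
        raise ValueError("X takes a single value; the slope is not defined")
    b2 = (n * sum_xy - sum_x * sum_y) / denom    # equation (15)
    b1 = sum_y / n - b2 * sum_x / n              # equation (16b)
    return b1, b2

# Made-up sample, roughly following Y = 2X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, b2 = ols_simple(xs, ys)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")

# The first order conditions (11a) and (11b) should hold at the solution.
foc_a = sum(y - b1 - b2 * x for x, y in zip(xs, ys))
foc_b = sum(x * (y - b1 - b2 * x) for x, y in zip(xs, ys))
```

Checking `foc_a` and `foc_b` against zero is a quick way to confirm that the formulas really solve the minimization problem rather than just producing plausible-looking numbers.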

37 MINIMIZATION SUMMARY The ordinary least squares estimator. Note that the slope of the estimated relationship ($b_2$) is equivalent to $cov(X, Y)$ divided by $var(X)$:
$$b_2 = \frac{cov(X, Y)}{var(X)}$$
Also note that the second equation, (16b), implies that the estimated line always passes through the means of $X_i$ and $Y_i$:
$$\bar{Y} = b_1 + b_2 \bar{X}$$
[This will be important to remember for hypothesis testing and model validation.] Since these formulas work for any values of the sample data, they are the least squares estimators.
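Both statements on this slide can be checked numerically. The following sketch computes the slope as sample covariance over sample variance and then verifies that the fitted line passes through the point of means; the helper names and the data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def slope_from_cov_var(xs, ys):
    """Slope b2 computed as cov(X, Y) / var(X) with the usual n - 1 divisor."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    return cov_xy / var_x                    # the n - 1 divisors cancel in the ratio

xs = [1.0, 2.0, 3.0, 4.0, 5.0]               # made-up data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b2 = slope_from_cov_var(xs, ys)
b1 = mean(ys) - b2 * mean(xs)                # intercept from the means
value_at_x_bar = b1 + b2 * mean(xs)          # fitted value at the mean of X

print(f"b2 = {b2:.4f}, b1 = {b1:.4f}, line at mean of X = {value_at_x_bar:.4f}")
```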

38 THE MAIN ASSUMPTIONS To be reviewed in multiple regression.
- SLR.1 Linear in parameters:
$$y = \beta_0 + \beta_1 x + u \qquad (17)$$
- SLR.2 Random sampling: we can take a random sample of size $n$, $\{(x_i, y_i) : i = 1, 2, \dots, n\}$, from the population model.
- SLR.3 Zero conditional mean: $E(u \mid x) = 0$.
- SLR.4 Sample variation in the independent variable: in the sample, the values of the independent variable, $x_i$, $i = 1, \dots, n$, are not all equal to the same constant. This requires some variation in $x$ in the population.

39 MEANING OF LINEAR REGRESSION Linearity in parameters This can be a little confusing, as it is possible to specify some non-linear relationships using the simple linear regression model Clarity over this confusion comes in terms of the role of the parameters. The parameters in the equation must enter in a linear fashion, however it is still possible for variables to enter into the equation non-linearly e.g. interaction terms, which we come back to in a future class Can anybody think of examples when we might wish to try and control for non-linear behaviour? Econometrics can deal with non-linear models (i.e. non-linear in parameters) but this is beyond the extent of this course.

40 MULTIPLE LINEAR REGRESSION When there is more than one X.
$$Y_i = b_1 + b_2 X_{1i} + b_3 X_{2i} + \dots + b_k X_{ki} + u_i \qquad (18)$$
The X variables can be transformations, e.g. $X_{1i} = X_{1i}$, $X_{2i} = X_{1i}^2$. The only limitation on the number of included variables is the sample size (and desired degrees of freedom for inference - to be discussed next week). Ceteris paribus: specification of the standard multiple linear regression allows for a "holding other things fixed" interpretation/control environment. However, it does not require anything to be fixed during data collection in order to work.

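41 MINIMIZATION (1/8) Multiple regression. The minimization problem
$$\min_{(b_1, b_2, b_3)} \sum (Y_i - b_1 - b_2 X_1 - b_3 X_2)^2 \qquad (19)$$
leads to the following first order conditions:
$$\frac{\partial}{\partial b_1} \sum (Y_i - b_1 - b_2 X_1 - b_3 X_2)^2 = 0 \qquad (20a)$$
$$\frac{\partial}{\partial b_2} \sum (Y_i - b_1 - b_2 X_1 - b_3 X_2)^2 = 0 \qquad (20b)$$
$$\frac{\partial}{\partial b_3} \sum (Y_i - b_1 - b_2 X_1 - b_3 X_2)^2 = 0 \qquad (20c)$$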

42 MINIMIZATION (2/8) Multiple regression. By taking the derivatives of each term of the sums and evaluating the three derivatives we obtain:
$$-2 \sum (Y_i - b_1 - b_2 X_1 - b_3 X_2) = 0 \qquad (21a)$$
$$-2 \sum X_1 (Y_i - b_1 - b_2 X_1 - b_3 X_2) = 0 \qquad (21b)$$
$$-2 \sum X_2 (Y_i - b_1 - b_2 X_1 - b_3 X_2) = 0 \qquad (21c)$$

43 MINIMIZATION Multiple regression This is where Wooldridge stops the derivation for the multiple linear regression

44 MINIMIZATION (3/8) Multiple regression. Canceling the $-2$ terms from each equation and re-arranging we obtain the normal equations:
$$\sum Y_i = n b_1 + b_2 \sum X_1 + b_3 \sum X_2 \qquad (22a)$$
$$\sum X_1 Y_i = b_1 \sum X_1 + b_2 \sum X_1^2 + b_3 \sum X_1 X_2 \qquad (22b)$$
$$\sum X_2 Y_i = b_1 \sum X_2 + b_2 \sum X_1 X_2 + b_3 \sum X_2^2 \qquad (22c)$$

45 MINIMIZATION Multiple regression.
- When it comes to solving this set of equations it is convenient to apply matrix algebra or Cramer's rule.
- It is not required for this course to use these methods.
- It is not really required to derive the least squares estimators for the multiple regression model, though it is instructive to do so.
- The following slides outline an approach to deriving the parameter estimates using the same notation as applied for the simple linear regression.

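46 MINIMIZATION (4/8) Multiple regression. In order to solve this set of equations it is useful to recall the structure of the estimated two (X) variable regression model:
$$Y = \hat{b}_1 + \hat{b}_2 X_1 + \hat{b}_3 X_2 + \hat{u} \qquad (23)$$
Averaging over the sample observations (noting also that the residuals average to zero, $\bar{\hat{u}} = 0$) gives:
$$\bar{Y} = \hat{b}_1 + \hat{b}_2 \bar{X}_1 + \hat{b}_3 \bar{X}_2 \qquad (24)$$
Now subtracting (24) from (23) gives the "deviation form", in which lower-case letters denote deviations from sample means ($y = Y - \bar{Y}$, $x_1 = X_1 - \bar{X}_1$, $x_2 = X_2 - \bar{X}_2$):
$$y = \hat{b}_2 x_1 + \hat{b}_3 x_2 + \hat{u} \qquad (25)$$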

47 MINIMIZATION (5/8) Multiple regression. The intercept $b_1$ disappears from the deviations form of the regression but is easily recovered by re-arranging the averages form of the equation to give:
$$\hat{b}_1 = \bar{Y} - \hat{b}_2 \bar{X}_1 - \hat{b}_3 \bar{X}_2 \qquad (26)$$
In order to determine the values for $b_2$ and $b_3$, as in the case of the simple linear regression, we wish to minimise the sum of the squared residuals for the deviations form of the regression:
$$\min_{(b_2, b_3)} \sum (y - b_2 x_1 - b_3 x_2)^2 \qquad (27)$$

48 MINIMIZATION (6/8) Multiple regression. Differentiating gives the following first order conditions:
$$\frac{\partial}{\partial b_2} \sum (y_i - b_2 x_1 - b_3 x_2)^2 = 0 \qquad (28a)$$
$$\frac{\partial}{\partial b_3} \sum (y_i - b_2 x_1 - b_3 x_2)^2 = 0 \qquad (28b)$$
which give respectively:
$$\sum x_1 y = b_2 \sum x_1^2 + b_3 \sum x_1 x_2 \qquad (29a)$$
$$\sum x_2 y = b_2 \sum x_1 x_2 + b_3 \sum x_2^2 \qquad (29b)$$

49 MINIMIZATION (7/8) Multiple regression. In order to eliminate one of the unknown parameters we multiply equation (29a) by $\sum x_2^2$ and (29b) by $\sum x_1 x_2$ to give:
$$\sum x_1 y \sum x_2^2 = b_2 \sum x_1^2 \sum x_2^2 + b_3 \sum x_1 x_2 \sum x_2^2 \qquad (30a)$$
$$\sum x_2 y \sum x_1 x_2 = b_2 \Big( \sum x_1 x_2 \Big)^2 + b_3 \sum x_1 x_2 \sum x_2^2 \qquad (30b)$$
Then subtract (30b) from (30a) to give:
$$\sum x_1 y \sum x_2^2 - \sum x_2 y \sum x_1 x_2 = b_2 \Big[ \sum x_1^2 \sum x_2^2 - \Big( \sum x_1 x_2 \Big)^2 \Big] \qquad (31)$$

50 MINIMIZATION (8/8) Multiple regression. This can be re-arranged to give:
$$b_2 = \frac{\sum x_1 y \sum x_2^2 - \sum x_2 y \sum x_1 x_2}{\sum x_1^2 \sum x_2^2 - \big( \sum x_1 x_2 \big)^2} \qquad (32)$$
In a similar fashion we can find that:
$$b_3 = \frac{\sum x_2 y \sum x_1^2 - \sum x_1 y \sum x_1 x_2}{\sum x_1^2 \sum x_2^2 - \big( \sum x_1 x_2 \big)^2} \qquad (33)$$
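The deviation-form results (32), (33) and (26) can be coded in the same spirit as the simple regression case. The sketch below is an illustrative implementation; the function name `ols_two_regressors` is an assumption, and the data are constructed without noise from Y = 1 + 2 X1 + 3 X2 so that the recovered coefficients can be checked by eye.

```python
def ols_two_regressors(x1s, x2s, ys):
    """Estimates (b1, b2, b3) for Y = b1 + b2*X1 + b3*X2 via the deviation form."""
    n = len(ys)
    m1, m2, my = sum(x1s) / n, sum(x2s) / n, sum(ys) / n
    d1 = [v - m1 for v in x1s]               # deviations from means
    d2 = [v - m2 for v in x2s]
    dy = [v - my for v in ys]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    denom = s11 * s22 - s12 ** 2             # zero under perfect collinearity
    if denom == 0:
        raise ValueError("regressors are perfectly collinear or constant")
    b2 = (s1y * s22 - s2y * s12) / denom     # equation (32)
    b3 = (s2y * s11 - s1y * s12) / denom     # equation (33)
    b1 = my - b2 * m1 - b3 * m2              # equation (26)
    return b1, b2, b3

# Noise-free illustrative data generated from Y = 1 + 2*X1 + 3*X2.
x1s = [1.0, 2.0, 3.0, 4.0, 5.0]
x2s = [2.0, 1.0, 4.0, 3.0, 6.0]
ys = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1s, x2s)]

b1, b2, b3 = ols_two_regressors(x1s, x2s, ys)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}, b3 = {b3:.4f}")
```

With more than two regressors these scalar formulas become unwieldy, which is the point at which the matrix form mentioned on the earlier slide becomes the practical route.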

51 R² (GOODNESS-OF-FIT) The coefficient of determination - multiple regression. Given SST (the total sum of squares), SSE (the explained sum of squares) and SSR (the sum of squared residuals), we can define $R^2$ as the ratio of the explained variation to the total variation; thus, it is interpreted as the fraction of the sample variation in $y$ that is explained by $x$:
$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST} \qquad (34)$$
where
$$SST = \sum (y_i - \bar{y})^2 \qquad (35a)$$
$$SSE = \sum (\hat{y}_i - \bar{y})^2 \qquad (35b)$$
$$SSR = \sum \hat{u}_i^2 \qquad (35c)$$
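As a worked illustration of (34) and (35a) to (35c), the sketch below fits a simple regression to a small made-up sample and computes the three sums of squares. It also checks the decomposition SST = SSE + SSR, which is what makes the two expressions for R² in (34) agree when the regression includes an intercept.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]               # made-up data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

# OLS estimates for the simple regression of y on x.
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
b1 = y_bar - b2 * x_bar

fitted = [b1 + b2 * x for x in xs]
sst = sum((y - y_bar) ** 2 for y in ys)                  # total sum of squares (35a)
sse = sum((f - y_bar) ** 2 for f in fitted)              # explained sum of squares (35b)
ssr = sum((y - f) ** 2 for y, f in zip(ys, fitted))      # sum of squared residuals (35c)

r2 = sse / sst                                           # first form in (34)
r2_alt = 1 - ssr / sst                                   # second form in (34)
print(f"SST={sst:.4f} SSE={sse:.4f} SSR={ssr:.4f} R2={r2:.4f}")
```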

52 R² (GOODNESS-OF-FIT) The coefficient of determination - some notes.
- $R^2$ provides an extremely useful measure of the ability of the specified regression equation to explain the variation in the dependent variable.
- $R^2$ never decreases, and it usually increases, when another independent variable is added to a regression.
- This makes $R^2$ a poor tool for deciding whether one additional variable or many additional variables should be added to a model.
- However, as we will see in week 4 and in later weeks, the $R^2$ does provide useful information for considering whether groups of independent variables are useful in explaining the dependent variable.

53 THE MAIN ASSUMPTIONS In the context of the multiple regression. In the context of the more general multiple regression we have a slightly different set of assumptions:
- MLR.1 Linear in parameters:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u \qquad (36)$$
- MLR.2 Random sampling: we have a random sample of $n$ observations, $\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i) : i = 1, 2, \dots, n\}$, from the population model described by assumption MLR.1.
- MLR.3 Zero conditional mean: $E(u \mid x_1, x_2, \dots, x_k) = 0$.
- MLR.4 No perfect collinearity: in the sample (and therefore the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.
- MLR.5 Homoskedasticity: $Var(u \mid x_1, x_2, \dots, x_k) = \sigma^2$.

54 MLR.1 LINEAR IN PARAMETERS The meaning of linear regression This can be a little confusing, as it is possible to specify some non-linear relationships using the simple linear regression model The answer comes in terms of the role of the parameters. The parameters in the equation must enter in a linear fashion, however it is still possible for variables to enter into the equation non-linearly Can anybody think of examples when we might wish to try and control for non-linear behaviour? Econometrics can deal with non-linear models (i.e. non-linear in parameters) but this is beyond the extent of this course.

55 MLR.2 RANDOM SAMPLING The meaning of linear regression We assume that the data are randomly drawn from the population For OLS to be unbiased, this assumption needs to hold for the population

56 MLR.3 ZERO CONDITIONAL MEAN Exogenous explanatory variables. One way in which this assumption can fail is if the functional relationship between the explained and the explanatory variables is mis-specified, for example by omitting a quadratic term when in fact it should be present. Functional form mis-specification can be a problem and its detection is considered in Chapter 9 of Wooldridge. Omitting an important variable that is correlated with any of the included variables $x_1, x_2, \dots, x_k$ can also cause MLR.3 to fail, as it will mean that there is still important information contained in the error term. Next week we will show how this can generate bias in the results, and in the following weeks we will consider what can be done to remedy it.

57 MLR.4 NO PERFECT COLLINEARITY Sample properties. In the sample (and therefore the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables. This says nothing about the relationship defined by MLR.3 (i.e. the relationship with $u$); rather, it relates to the relationships among the included variables $x_1, x_2, \dots, x_k$. If one of the independent variables is an exact linear function of another, then we say that the two are perfectly collinear and the model cannot be estimated by OLS. This assumption does still allow for correlation between $x_1, x_2, \dots, x_k$ - it is the co-movement in the variables which determines the value of the coefficients.

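58 RECALL REGRESSION ESTIMATOR Why X must take at least two values. Regarding constant terms, from simple OLS recall that:
$$b_2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \qquad (37)$$
If X had no variation at all, so that each $X_i$ would be equal to the mean of X, the OLS estimator would not be defined, as one can't divide by zero. But even if there is some variation, more of it would be better for the precision of the OLS estimator, as will be seen later. Regarding the slope parameters, consider the equation for $b_2$ given earlier, if $x_2 = 1 \cdot x_1$ (that is, $x_2 = x_1$):
$$b_2 = \frac{\sum x_1 y \sum x_2^2 - \sum x_2 y \sum x_1 x_2}{\sum x_1^2 \sum x_2^2 - \big( \sum x_1 x_2 \big)^2} \qquad (38)$$
With $x_2 = x_1$ the denominator becomes $\big( \sum x_1^2 \big)^2 - \big( \sum x_1^2 \big)^2 = 0$, so again the estimator is not defined.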
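Both failure cases on this slide come down to a denominator of zero, which is easy to confirm numerically. The sketch below evaluates the denominators of (37) and (38) for a constant regressor, for two identical regressors, and for a well-behaved pair; the helper names and all data values are hypothetical.

```python
def deviations(values):
    """Deviations of each value from the sample mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def simple_denominator(xs):
    """Denominator of (37): the sum of squared deviations of X."""
    return sum(d ** 2 for d in deviations(xs))

def two_regressor_denominator(x1s, x2s):
    """Denominator of (38): sum(x1^2) * sum(x2^2) - (sum(x1 * x2))^2 in deviations."""
    d1, d2 = deviations(x1s), deviations(x2s)
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    return s11 * s22 - s12 ** 2

constant_x = [3.0, 3.0, 3.0]                 # no variation in X
x1 = [1.0, 2.0, 3.0, 4.0]
x2_copy = list(x1)                           # x2 = 1 * x1: perfect collinearity
x2_other = [1.0, 0.0, 2.0, 1.0]              # correlated with x1, but not perfectly

denom_constant = simple_denominator(constant_x)
denom_collinear = two_regressor_denominator(x1, x2_copy)
denom_ok = two_regressor_denominator(x1, x2_other)
print(denom_constant, denom_collinear, denom_ok)
```

The third case illustrates the remark on the previous slide: correlation between regressors is allowed, and only an exact linear relationship drives the denominator to zero.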

59 MLR.5 HOMOSKEDASTICITY $Var(u \mid x_1, x_2, \dots, x_k) = \sigma^2$. This assumption means that the variance in the error term, $u$, conditional on the explanatory variables, is the same for all combinations of outcomes of the explanatory variables. If this variance changes with any of the explanatory variables, then the error process is said to exhibit heteroskedasticity. Heteroskedasticity generates problems that can have important implications for model inference and will be reviewed in week 7. Assumptions MLR.1-MLR.5 are known as the Gauss-Markov assumptions for cross-sectional regression. The set of assumptions developed so far is only appropriate for cross-sectional regression. Stating the assumptions for time series or for panel data is much more difficult, though there are similarities.

60 OLS VARIANCE AND COVARIANCE Parameter uncertainty. It will prove important for us to understand the variance and covariance of the OLS estimates for individual parameters. Below are definitions that we will again return to later in the course.
$$Var(b_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)} \qquad (39a)$$
$$se(b_2) = \sqrt{Var(\hat{b}_2)} \qquad (39b)$$
where $\sigma^2$ is the error variance, defined in the following slide, $SST_j$ is the total sample variation in $x_j$, and $R_j^2$ is the $R^2$ from regressing $x_j$ on all the other independent variables. Variance for any given parameter is therefore a combination of the (average) model uncertainty, inversely weighted by the ability of the model to describe the data.

61 ERROR VARIANCE Model uncertainty. We wish to define $\sigma^2 = E(u^2)$, in which the expectation is equivalent to $n^{-1} \sum_{i=1}^{n} u_i^2$. However, $u_i^2$ is unobserved and so we replace it with the residuals from the estimated regression. Instead of taking a direct average, the denominator is equal to $n - k - 1$ as opposed to $n$, and is referred to as the degrees of freedom:
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - k - 1} = \frac{SSR}{n - k - 1} \qquad (40)$$
You may wish to read Wooldridge regarding degrees of freedom.
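Equation (40) and the variance formula from the previous slide can be combined into a standard error for the slope of a simple regression. In the sketch below the data are made up, k = 1 because there is a single regressor, and the term involving the R² of x on the other regressors is taken to be zero because there are no other regressors.

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]               # made-up data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n, k = len(xs), 1                            # k = number of slope parameters

x_bar, y_bar = sum(xs) / n, sum(ys) / n
sst_x = sum((x - x_bar) ** 2 for x in xs)    # total sample variation in x
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sst_x
b1 = y_bar - b2 * x_bar

residuals = [y - b1 - b2 * x for x, y in zip(xs, ys)]
ssr = sum(r ** 2 for r in residuals)

sigma2_hat = ssr / (n - k - 1)               # equation (40): SSR over degrees of freedom
var_b2 = sigma2_hat / sst_x                  # variance formula with no other regressors
se_b2 = math.sqrt(var_b2)                    # standard error of the slope

print(f"sigma2_hat = {sigma2_hat:.5f}, se(b2) = {se_b2:.5f}")
```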

62 BLUE Best Linear Unbiased Estimator.
- Best: refers to the estimator with the smallest variance. This is met insofar as it is the objective function for OLS.
- Linear: means that $b_1$ and $b_2$ are linear estimators; that is, they are linear functions of the random variable $Y$.
- Unbiased: an unbiased estimator is an estimator for which $E(\hat{\beta}_j) = \beta_j$.
- Estimator: simply refers to the fact that we are looking at an estimator.
Under the main MLR assumptions the OLS estimators $\hat{\beta}_j$ are the BLUEs of $\beta_j$ for all $j$ ($\forall j$), by the Gauss-Markov Theorem; hence the assumptions are often referred to as the Gauss-Markov assumptions. The importance of these assumptions is that when the standard set of assumptions holds we need not look for alternative unbiased estimators.

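63 OMITTED VARIABLE BIAS 1/3 The consequences of leaving something important out. Imagine the true PRF is:
$$Y = \beta_1 + \beta_2 X_1 + \beta_3 X_2 + u$$
However, we estimate:
$$\tilde{Y} = \tilde{\beta}_1 + \tilde{\beta}_2 X_1$$
Note that we use $\tilde{\ }$ rather than $\hat{\ }$ to emphasize that $\tilde{\beta}_j$ comes from an underspecified model. We know that the OLS estimator of $\beta_2$ is (writing $X_i$ for the $i$-th observation on $X_1$):
$$\tilde{\beta}_2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X}) Y_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$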

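64 OMITTED VARIABLE BIAS 2/3 The consequences of leaving something important out. Since we know that $Y = \beta_1 + \beta_2 X_1 + \beta_3 X_2 + u$, we can re-write the numerator of $\tilde{\beta}_2$ as follows (writing $X_i$ for the $i$-th observation on $X_1$ and $SST_1 = \sum (X_i - \bar{X})^2$):
$$\sum (X_i - \bar{X})(\beta_1 + \beta_2 X_i + \beta_3 X_{2i} + u_i) = \beta_2 \sum (X_i - \bar{X})^2 + \beta_3 \sum (X_i - \bar{X}) X_{2i} + \sum (X_i - \bar{X}) u_i$$
$$= \beta_2 SST_1 + \beta_3 \sum (X_i - \bar{X}) X_{2i} + \sum (X_i - \bar{X}) u_i$$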

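65 OMITTED VARIABLE BIAS 3/3 The consequences of leaving something important out. Dividing by $SST_1$ and taking the expectation conditional on the independent variables (noting that $E(u) = 0$) we have
$$E(\tilde{\beta}_2) = \beta_2 + \beta_3 \frac{\sum_{i=1}^{n} (X_i - \bar{X}) X_{2i}}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
where $X_i$ denotes the $i$-th observation on $X_1$. The ratio $\frac{\sum (X_i - \bar{X}) X_{2i}}{\sum (X_i - \bar{X})^2}$ is equivalent to the slope coefficient from a regression of $X_2$ on $X_1$, which we could define as $\tilde{X}_2 = \tilde{\delta}_1 + \tilde{\delta}_2 X_1$. We can therefore see that
$$E(\tilde{\beta}_2) = \beta_2 + \beta_3 \tilde{\delta}_2$$
$$E(\tilde{\beta}_2) - \beta_2 = \beta_3 \tilde{\delta}_2$$
where $E(\tilde{\beta}_2) - \beta_2$ is defined as the bias.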
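The bias formula has an exact in-sample counterpart: the slope from the short regression equals the long-regression slope on X1 plus the long-regression slope on X2 times the slope from regressing X2 on X1. The sketch below checks this algebraic identity on made-up data. It illustrates the mechanics only; it is not a simulation of the expectation itself, and all function names are hypothetical.

```python
def dev(values):
    """Deviations of each value from the sample mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def slope(xs, ys):
    """Simple-regression slope of ys on xs."""
    dx, dy = dev(xs), dev(ys)
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

def slopes_two_regressors(x1s, x2s, ys):
    """Slopes on X1 and X2 from the regression of Y on both (deviation form)."""
    d1, d2, dy = dev(x1s), dev(x2s), dev(ys)
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    denom = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / denom, (s2y * s11 - s1y * s12) / denom

# Made-up data in which X1 and X2 are positively correlated.
x1s = [1.0, 2.0, 3.0, 4.0, 5.0]
x2s = [2.0, 1.0, 4.0, 3.0, 6.0]
ys = [9.5, 7.6, 19.2, 18.1, 28.4]

long_b2, long_b3 = slopes_two_regressors(x1s, x2s, ys)   # model including X2
short_b2 = slope(x1s, ys)                                # model omitting X2
delta2 = slope(x1s, x2s)                                 # regression of X2 on X1

print(f"short slope          = {short_b2:.4f}")
print(f"long slope + b3 * d2 = {long_b2 + long_b3 * delta2:.4f}")
```

Because X1 and X2 move together in this sample and X2 has a positive coefficient, the short regression attributes part of the effect of X2 to X1, which is why the short slope is larger than the long one.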

66 Thanks for listening! Any questions/comments are warmly welcomed.

67 INTERPRETATION OF MODEL OUTPUT The impact of different functional forms (given that $y = \beta_0 + \beta_1 x$)

Model         Dependent variable   Independent variable   Interpretation of $\beta_1$
level-level   $y$                  $x$                    $\Delta y = \beta_1 \Delta x$
level-log     $y$                  $\ln(x)$               $\Delta y = (\beta_1 / 100) \% \Delta x$
log-level     $\ln(y)$             $x$                    $\% \Delta y = (100 \beta_1) \Delta x$
log-log       $\ln(y)$             $\ln(x)$               $\% \Delta y = \beta_1 \% \Delta x$
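The percentage interpretations in the table are approximations that work well for small changes. A quick numerical sketch, using invented coefficient values, compares the rule of thumb with the exact change implied by the log-log and log-level forms (the intercept is set to zero because it cancels out of percentage changes).

```python
import math

# Log-log model: ln(y) = beta1 * ln(x), with a hypothetical elasticity of 0.8.
beta1_loglog = 0.8
x0, x1 = 100.0, 101.0                        # a 1% increase in x
y0 = math.exp(beta1_loglog * math.log(x0))
y1 = math.exp(beta1_loglog * math.log(x1))
pct_change_loglog = 100 * (y1 - y0) / y0     # exact percentage change in y
approx_loglog = beta1_loglog * 1.0           # table rule: %dy = beta1 * %dx

# Log-level model: ln(y) = beta1 * x, with a hypothetical coefficient of 0.05.
beta1_loglevel = 0.05
pct_change_loglevel = 100 * (math.exp(beta1_loglevel * 1.0) - 1)  # x rises by one unit
approx_loglevel = 100 * beta1_loglevel * 1.0                      # table rule: %dy = 100 * beta1 * dx

print(f"log-log:   exact {pct_change_loglog:.4f}%  vs rule {approx_loglog:.4f}%")
print(f"log-level: exact {pct_change_loglevel:.4f}%  vs rule {approx_loglevel:.4f}%")
```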
