THE MULTIVARIATE LINEAR REGRESSION MODEL

1 THE MULTIVARIATE LINEAR REGRESSION MODEL

2 Why multiple regression analysis? A model with more than one independent variable:

y = β0 + β1x1 + β2x2 + u

It allows:
- Controlling for other factors, to get a ceteris paribus effect. Ex: y: wage, x1: education, x2: IQ. IQ is no longer part of u, so we do a better job at inferring causality.
- Better predictions: more of the variation in y can be explained.

3 Why multiple regression analysis? (2) It also allows:
- Estimating non-linear relationships. Ex: a quadratic relationship between wage and experience:

wage = β0 + β1 exper + β2 exper² + u

Careful: no ceteris paribus interpretation of β1 alone here!
- Testing joint hypotheses on parameters.

Key assumption: E(u | x1, x2) = 0

4 Example: Determinants of wage. Source: Wooldridge, WAGE1.dta (data from the 1976 Current Population Survey). Population model:

wage = β0 + β1 educ + β2 exper + u

. use (dataset path not recovered)
. sum wage educ exper
[summary statistics for wage, educ, exper: values not recovered]
. corr educ exper
(obs=526)
[correlation matrix: values not recovered]

5 Example: Determinants of wage (2)

. reg wage educ
[regression output, Number of obs = 526, F(1, 524): numeric values not recovered]

. reg wage educ exper
[regression output, Number of obs = 526, F(2, 523): numeric values not recovered]

6 Example: Determinants of wage (3) Interpretation: A one-year increase in education is predicted to increase hourly wage by 64 cents, ceteris paribus. An additional year of experience is predicted to increase wage by 7 cents, ceteris paribus. Compared with the results of the bivariate model, we now obtain a higher estimate of the returns to education. We suspect the results of the bivariate case to be biased, since experience is correlated with education, and experience affects wage too. I.e., the zero conditional mean assumption was likely violated in the bivariate case. In other words: in the bivariate case, the estimated impact of education picked up the impact of experience as well. As the correlation between the two variables is negative, the estimate of the impact of education on wage was downward biased.

7 Example: introducing quadratics. What if the impact of a variable is not constant?

wage = β0 + β1 exper + β2 exper² + u

Introducing quadratics allows us to:
- model an increasing or decreasing effect of experience as experience increases:

ŵage = β̂0 + β̂1 exper + β̂2 exper²

- determine the turning point of the effect:

exper* = |β̂1 / (2 β̂2)|
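The turning point can be checked numerically. A minimal Python sketch, using the coefficient values (.298 and −.006) quoted with the regression output on the next slide:

```python
# Turning point of a quadratic regression wage = b0 + b1*exper + b2*exper^2:
# the marginal effect d(wage)/d(exper) = b1 + 2*b2*exper is zero at
# exper* = -b1 / (2*b2).  Values are the estimates quoted in the text.

b1 = 0.298   # coefficient on exper
b2 = -0.006  # coefficient on exper^2 (negative: diminishing returns)

turning_point = -b1 / (2 * b2)
print(round(turning_point, 1))  # about 24.8 years of experience
```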

8 . list exper* in 1/10
[listing of exper and expersq: values not recovered]

. reg wage exper*
[regression output, Number of obs = 526, F(2, 523): numeric values not recovered]

Interpretation: For low levels of experience, wage is predicted to increase with experience, ceteris paribus. The negative sign on the squared term indicates, however, that as the number of years of experience increases, the returns to an additional year decrease. In fact we can calculate the turning point, i.e. the point where the marginal returns to experience are 0. This happens at .298/(2 × .006), i.e. approximately at 25 years of experience.

9 Stata commands:
. scatter wage exper || qfit wage exper, name(multiple)
. scatter wage exper || lfit wage exper, name(simple)
. graph combine multiple simple, saving(simple_multiple)
(file simple_multiple.gph saved)
[combined scatter plots of wage against exper with fitted values: graphs not recovered]

10 The model with k independent variables. The general multiple linear regression model (also called the multiple regression model) can be written in the population as:

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Notation: x1, x2, ..., xk are the independent variables, k is the number of independent variables, and xik is the value of variable xk for observation i.

Key assumption: E(u | x1, x2, ..., xk) = 0

11 Deriving the OLS estimates. The estimated model is:

ŷ = β̂0 + β̂1x1 + β̂2x2 + ... + β̂kxk

We want to estimate β̂0, β̂1, ..., β̂k: k+1 OLS estimates. Minimize the sum of squared residuals:

min over (β̂0, β̂1, ..., β̂k) of Σ(i=1..n) (yi − β̂0 − β̂1xi1 − ... − β̂kxik)²

The first order conditions (using calculus, see Appendix 3A) give k+1 linear equations in the k+1 unknowns β̂0, β̂1, ..., β̂k.
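The minimization above can be sketched in code. A self-contained Python illustration (made-up data, not the Stata examples used elsewhere in these slides) that builds the normal equations (X'X)β̂ = X'y from the first order conditions and solves them:

```python
# Minimal sketch of OLS via the normal equations (X'X) b = X'y, solved with
# Gaussian elimination.  y is generated exactly as y = 1 + 2*x1 + 3*x2,
# so OLS must recover the coefficients (1, 2, 3).

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """X: list of rows, each starting with the constant 1; returns (b0, ..., bk)."""
    k1 = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k1)] for i in range(k1)]
    Xty = [sum(X[i][j] * y[i] for i in range(len(X))) for j in range(k1)]
    return solve(XtX, Xty)

X = [[1, 0, 1], [1, 1, 0], [1, 2, 2], [1, 3, 1]]   # columns: const, x1, x2
y = [1 + 2 * r[1] + 3 * r[2] for r in X]           # exact linear relationship
print([round(b, 6) for b in ols(X, y)])            # [1.0, 2.0, 3.0]
```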

12 Interpretation of OLS estimates. Estimated model:

ŷ = β̂0 + β̂1x1 + β̂2x2 + ... + β̂kxk   (3.11)

How do we interpret β̂1, β̂2, ..., β̂k? We can obtain from (3.11) the predicted change in y given changes in the xj:

Δŷ = β̂1Δx1 + β̂2Δx2 + ... + β̂kΔxk

The coefficient on x1 measures the change in y due to a one-unit increase in x1, holding all other independent variables fixed. That is, if we hold x2, x3, ..., xk constant:

Δŷ = β̂1Δx1

This allows ceteris paribus estimation, even if the data were not collected that way!

13 OLS Fitted Values and Residuals. For observation i, the fitted value is simply:

ŷi = β̂0 + β̂1xi1 + ... + β̂kxik

The actual value yi will not in general equal the predicted value. Residual:

ûi = yi − ŷi

The fitted values and residuals have the same properties as in the simple regression case:
- The sample average of the residuals is zero.
- The sample covariance between each xj and the residuals is zero, and hence so is the covariance between the fitted values and the residuals.
- The point of sample averages (x̄1, ..., x̄k, ȳ) is always on the regression line.
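These algebraic properties can be verified numerically. A Python sketch on made-up data, using a bivariate fit for brevity (the same properties hold regressor by regressor in the multiple case):

```python
# Algebraic properties of OLS residuals, checked on a small bivariate fit.
# Closed-form OLS: b1 = cov(x, y)/var(x), b0 = ybar - b1*xbar.

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]   # made-up data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(round(abs(sum(resid)), 10))                            # 0.0: residuals average to zero
print(round(abs(sum(xi * r for xi, r in zip(x, resid))), 10))  # 0.0: zero covariance with x
print(round(abs(b0 + b1 * xbar - ybar), 10))                 # 0.0: (xbar, ybar) is on the line
```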

14 Simple vs. multiple regression estimates.
Simple regression model: ỹ = β̃0 + β̃1x1
Multiple regression model: ŷ = β̂0 + β̂1x1 + β̂2x2

β̃1 = β̂1 if:
- the partial effect of x2 is zero in the sample (β̂2 = 0), or
- x1 and x2 are uncorrelated in the sample.

β̃1 ≈ β̂1 if:
- the partial effect of x2 is small in the sample, or
- x1 and x2 are weakly correlated in the sample.

15 How good is the estimation at explaining the dependent variable? Measure of sample variation, the Total Sum of Squares:

SST = Σ(i=1..n) (yi − ȳ)²

Part that is explained by the x's, the Explained Sum of Squares:

SSE = Σ(i=1..n) (ŷi − ȳ)²

Part that is left unexplained, the Residual Sum of Squares:

SSR = Σ(i=1..n) ûi²

Just as in the simple regression case, SST = SSE + SSR.

16 Goodness of fit: the R-squared

R² = SSE/SST = 1 − SSR/SST

R² is the proportion of the sample variation in yi that is explained by the OLS regression line. R² lies between 0 and 1. A higher value indicates a better fit, but: R² never decreases, and it usually increases, when another independent variable is added to a regression. It is therefore a poor tool for deciding which model to choose. We will need another criterion to decide whether to include a variable.
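Both the decomposition SST = SSE + SSR and the definition of R² can be checked numerically. A Python sketch on made-up bivariate data:

```python
# R-squared and the decomposition SST = SSE + SSR on a small bivariate fit
# (made-up data).

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 2.1, 2.8, 4.5, 4.9, 6.3]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((a - xbar) * (c - ybar) for a, c in zip(x, y)) / \
     sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * a for a in x]

SST = sum((c - ybar) ** 2 for c in y)
SSE = sum((h - ybar) ** 2 for h in yhat)
SSR = sum((c - h) ** 2 for c, h in zip(y, yhat))

R2 = SSE / SST
print(abs(SST - (SSE + SSR)) < 1e-9)  # True: the decomposition holds
print(0 <= R2 <= 1)                   # True
```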

17 Example: explaining arrest records. Population model:

narr86 = β0 + β1 pcnv + β2 avgsen + β3 ptime86 + β4 qemp86 + u

First, we estimate the model without the variable avgsen. We obtain:

. use (dataset path not recovered)
. reg narr86 pcnv ptime86 qemp86
[regression output, Number of obs = 2725, F(3, 2721), Root MSE = .8416: remaining values not recovered]

18 So we obtain the estimated equation (the intercept and the qemp86 coefficient were not recovered; the pcnv and ptime86 coefficients are −.150 and −.034):

nârr86 = β̂0 − .150 pcnv − .034 ptime86 + β̂3 qemp86
n = 2,725, R² = .0413

The three variables pcnv, ptime86, and qemp86 explain about 4.1 percent of the variation in narr86.
What happens if pcnv increases by .50 (50 percentage points)? Δnârr86 = −.150(.5) = −.075, so predicted arrests fall by .075.
What happens if ptime86 increases from 0 to 12? Predicted arrests for a particular man fall by .034(12) = .408.
What if we include avgsen in the model?

19 . reg narr86 avgsen pcnv ptime86 qemp86
[regression output, Number of obs = 2725, F(4, 2720): numeric values not recovered]

R² increases from .0413 to .0422, a practically small effect. The sign of the coefficient on avgsen is also unexpected: it implies that a longer average sentence length increases criminal activity.
=> What should we conclude about the two models?

20 Unbiasedness of OLS. Remember the assumptions:
- Linearity (in parameters!): y = β0 + β1x1 + ... + βkxk + u
- Random sampling: yi = β0 + β1xi1 + ... + βkxik + ui, i = 1, 2, ..., n.
- Zero conditional mean: E(u | x1, x2, ..., xk) = 0
- No perfect collinearity: in the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.

Using all these assumptions we can prove the first important statistical property of OLS, unbiasedness:

E(β̂j) = βj, j = 1, 2, ..., k
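Unbiasedness can be illustrated (not proved) by simulation: across many samples drawn under the four assumptions, the OLS slope averages out to the true parameter. A Python sketch with made-up parameter values:

```python
# Monte Carlo illustration of unbiasedness: the average OLS slope across
# repeated random samples should be close to the true beta1 = 2.

import random
random.seed(1)

def slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
           sum((a - xbar) ** 2 for a in x)

beta0, beta1, reps, n = 1.0, 2.0, 2000, 30
estimates = []
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    u = [random.gauss(0, 1) for _ in range(n)]           # E(u|x) = 0 holds
    y = [beta0 + beta1 * a + e for a, e in zip(x, u)]
    estimates.append(slope(x, y))

print(round(sum(estimates) / reps, 2))  # close to 2
```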

21 Violations of zero conditional mean. The ZCM assumption will not be true if the functional relationship between the explained and explanatory variables is misspecified:

Ex. 1: True model: cons = β0 + β1 inc + β2 inc² + u
Estimated model: cons = β0 + β1 inc + u

Ex. 2: True model: log(wage) = β0 + β1 educ + u
Estimated model: wage = β0 + β1 educ + u

It also fails if we omit a variable that is correlated with the xj: endogeneity.

22 Violations of no perfect collinearity. The assumption is violated if there exist (a, b) such that x1 = a + b x2:
- One variable cannot be a constant multiple of another. (Ex: inc and inc² are fine, but log(inc) and log(inc²) are not, since log(inc²) = 2 log(inc).)
- One variable cannot be the sum of some of the others.
- When variables are shares: we cannot include all the shares.

Practical note: Stata will not estimate models with perfect collinearity. Solution: drop one of the perfectly correlated variables!
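Why estimation breaks down: with perfect collinearity, X'X is singular, so the normal equations have no unique solution. A Python sketch with made-up share data (the two shares sum to one, so one column is an exact linear function of the others):

```python
# With perfect collinearity (here shareb = 1 - sharea) the cross-product
# matrix X'X is singular, shown via a zero 3x3 determinant.

sharea = [0.30, 0.55, 0.70, 0.45]          # made-up expenditure shares
shareb = [1.0 - a for a in sharea]         # shares sum to one: exact collinearity
X = [[1.0, a, b] for a, b in zip(sharea, shareb)]  # columns: const, sharea, shareb

def xtx(X):
    k = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

print(abs(det3(xtx(X))) < 1e-9)  # True: singular, so Stata would drop one variable
```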

23 The no-perfect-collinearity assumption also fails if n < k + 1: to estimate k + 1 parameters, we need at least k + 1 observations. This can happen through bad luck in collecting the sample.

24 Example of perfect collinearity: voting outcomes and campaign expenditures. Source: Wooldridge, VOTE1.dta (from M. Barone and G. Ujifusa, The Almanac of American Politics, Washington, DC: National Journal); two-party races for the US House of Representatives.

. bcuse vote1
. ge shareb = 100 - sharea

Data description:
votea: percent vote for A
expenda: campaign expenditures by A, $1000s
expendb: campaign expenditures by B, $1000s
sharea: 100*(expendA/(expendA+expendB))

. su votea expenda expendb sharea shareb
[summary statistics: values not recovered]

25 . reg votea sharea shareb
note: sharea omitted because of collinearity
[regression output, Number of obs = 173, F(1, 171): values not recovered; sharea shown as (omitted)]

. reg votea shareb
[regression output, Number of obs = 173: values not recovered]

. reg votea sharea
[regression output, Number of obs = 173: values not recovered]

26 Interpretation: The variables sharea and shareb are perfectly collinear (sharea = 100 − shareb). Therefore they cannot both be used as independent variables in the regression. Stata will automatically drop one, so the first two estimations give the same results. Using sharea as the only explanatory variable and using shareb as the only explanatory variable yield equivalent results: increasing the share of expenditures of B by one percentage point (= a one percentage point decrease in the share of A) is predicted to decrease the share of votes for A by .46 percentage points, ceteris paribus.

27 Omitted variable bias. Let

y = β0 + β1x1 + β2x2 + u

be the true model, with all 4 assumptions verified. When estimated, it gives:

ŷ = β̂0 + β̂1x1 + β̂2x2

We want the effect of x1 on y. What happens if we regress y on x1 only? The estimated (underspecified) model then is:

ỹ = β̃0 + β̃1x1

β̃1 is biased for β1: E(β̃1) = β1 + omitted variable bias.

28 About the omitted variable bias. Two cases in which β̃1 is not biased:
- When β2 = 0, so that x2 does not appear in the true model.
- When δ̃1 = 0, i.e. if and only if x1 and x2 are uncorrelated in the sample (δ̃1 is the slope from regressing x2 on x1).

Direction of the omitted variable bias (two-variable case):

            Corr(x1,x2) > 0    Corr(x1,x2) < 0
β2 > 0      Positive bias      Negative bias
β2 < 0      Negative bias      Positive bias
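The first cell of the table can be illustrated by simulation: with β2 > 0 and Corr(x1, x2) > 0, the short regression overstates β1. A Python sketch with made-up parameter values:

```python
# Simulating omitted variable bias: beta2 = 3 > 0 and corr(x1, x2) > 0 push
# the short-regression slope above the true beta1 = 2 (positive bias).

import random
random.seed(2)

n = 20000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]   # corr(x1, x2) > 0, delta1 = 0.5
y = [1.0 + 2.0 * a + 3.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

xbar = sum(x1) / n
ybar = sum(y) / n
slope_short = sum((a - xbar) * (c - ybar) for a, c in zip(x1, y)) / \
              sum((a - xbar) ** 2 for a in x1)

# The short regression picks up beta1 + beta2*delta1 = 2 + 3*0.5 = 3.5, not 2:
print(round(slope_short, 1))  # close to 3.5
```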

29 Example 3: Impact of IQ on the relationship between wage and education. Source: WAGE2.dta, Wooldridge (data used in M. Blackburn and D. Neumark (1992), "Unobserved Ability, Efficiency Wages, and Interindustry Wage Differentials," Quarterly Journal of Economics 107).

. use (dataset path not recovered)
. su wage IQ educ
[summary statistics for wage, IQ, educ: values not recovered]

30 . reg wage educ
[regression output, Number of obs = 935, F(1, 933): values not recovered]

. reg wage educ IQ
[regression output, Number of obs = 935, F(2, 932): values not recovered]

. corr educ IQ
(obs=935)
[correlation matrix: values not recovered]

31 Interpretation: Intellectual ability is likely to affect both people's wage and their education. Therefore a simple regression of wage on education is likely to be biased, as intellectual ability will be included in the error term, resulting in a violation of the zero conditional mean assumption. If we use IQ (as a proxy for intellectual ability) we correct for this bias. Given that IQ and education are positively correlated, and IQ and wage are also positively correlated, we suspect the coefficient in the bivariate model to be positively biased (i.e. overestimated). This is confirmed when we run the regression including IQ: the coefficient on education drops from 60 to 42. An increase in the IQ score of 1 is predicted to increase wage by $5 per month, ceteris paribus. Given that the correlations between education and IQ and between IQ and wage are strong, the bias in the bivariate model was large.

32 Omitted variable bias: the multiple-regressor case. What happens with multiple regressors? Correlation between a single explanatory variable and the error generally results in all OLS estimators being biased. If the focus is on the relationship between a particular explanatory variable, say x1, and the key omitted factor, deriving the sign of the bias as if the other explanatory variables were absent is strictly valid only when each of them is uncorrelated with x1, but it is still a useful guide.

33 Including irrelevant variables. Overspecifying the model: one (or more) of the independent variables is included in the model even though it has no partial effect on y in the population (that is, its population coefficient is zero). This causes no bias (when the 4 assumptions hold), but it is not harmless: it has undesirable effects on the variances of the OLS estimators.

34 Back to the broad picture. We are interested in understanding the effect of a variable x on a variable y. We need a coefficient estimate, and we need to know its sign and magnitude. We also need to know how precise this estimate is, i.e. to learn about its variance. The 4 assumptions give us unbiasedness of the coefficient estimates. We need one more assumption to obtain an unbiased estimate of the variance of the coefficient estimates, and to know that OLS is efficient.

35 The 5 Gauss-Markov assumptions: 4 + 1
- Linearity
- Random sampling
- Zero conditional mean
- No perfect collinearity
- Homoskedasticity: the variance of the error term, conditional on the explanatory variables, is constant:

Var(u | x1, x2, ..., xk) = σ²

Under these conditions:
- The OLS estimate of the error variance is unbiased: E(σ̂²) = σ².
- We can derive a formula for the sampling variance of the OLS coefficients.
- OLS is efficient (i.e. its variance is the smallest variance possible).

36 Sampling variance of the OLS coefficients. Under Assumptions 1 through 5, conditional on the sample values of the independent variables,

Var(β̂j) = σ² / [ SSTj (1 − Rj²) ],  for j = 1, 2, ..., k,

where SSTj = Σ(i=1..n) (xij − x̄j)² is the total variation in xj, and Rj² is the R-squared from regressing xj on all other independent variables. Why should we care about its size?
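The term 1/(1 − Rj²) in the standard sampling-variance formula is known as the variance inflation factor. A Python sketch with made-up simulated data, showing how it explodes as the correlation between two regressors approaches 1 (with only two regressors, Rj² is just the squared sample correlation of x1 and x2):

```python
# The multicollinearity term 1/(1 - Rj^2) (the variance inflation factor) in
# Var(bj) = sigma^2 / (SSTj * (1 - Rj^2)), simulated for two regressors.

import random
random.seed(3)

def vif_two_regressors(rho, n=5000):
    """Simulate x2 correlated with x1 at (roughly) rho; return 1/(1 - R1^2)."""
    x1 = [random.gauss(0, 1) for _ in range(n)]
    x2 = [rho * a + (1 - rho ** 2) ** 0.5 * random.gauss(0, 1) for a in x1]
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    v1 = sum((a - m1) ** 2 for a in x1)
    v2 = sum((b - m2) ** 2 for b in x2)
    r2 = cov * cov / (v1 * v2)   # R^2 from regressing x1 on x2
    return 1.0 / (1.0 - r2)

for rho in (0.0, 0.5, 0.9):
    print(round(vif_two_regressors(rho), 2))  # VIF grows sharply as rho -> 1
```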

37 Unbiased estimator of σ². We need an unbiased estimator of σ² to get an unbiased estimator of Var(β̂j). Since σ² = E(u²), a logical estimator would be (1/n) Σ ui². Problem: the errors are not observable! But the residuals are. Should we then use (1/n) Σ ûi²? This estimator is biased. An unbiased estimator of σ² is:

σ̂² = SSR / (n − k − 1) = (1/(n − k − 1)) Σ(i=1..n) ûi²

Why n − k − 1? Degrees of freedom = number of observations − number of estimated parameters = n − (k + 1).
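The degrees-of-freedom correction can be illustrated by simulation: dividing SSR by n − k − 1 yields an estimator that averages to the true σ², while dividing by n understates it. A Python sketch with made-up parameter values (simple regression, so k = 1 and the degrees of freedom are n − 2):

```python
# Monte Carlo check: SSR/(n-k-1) averages to the true sigma^2 = 1, while the
# naive SSR/n is biased downward (here toward (n-2)/n = 0.8).

import random
random.seed(4)

def fit_ssr(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
    b0 = ybar - b1 * xbar
    return sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))

n, reps = 10, 3000
unbiased, naive = [], []
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [1.0 + 2.0 * a + random.gauss(0, 1) for a in x]  # true sigma^2 = 1
    ssr = fit_ssr(x, y)
    unbiased.append(ssr / (n - 2))
    naive.append(ssr / n)

print(round(sum(unbiased) / reps, 2))  # close to 1.0
print(round(sum(naive) / reps, 2))     # close to 0.8: biased downward
```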

38 The estimate of β̂j is more precise when:
- σ² is lower: more noise in the equation (a larger σ²) makes it more difficult to estimate the partial effect of any of the xj on y. To reduce it, add more explanatory variables.
- The total variation in xj, SSTj, is larger: to increase it, increase the sample size. SSTj = 0 is ruled out by assumption 4.
- There is less correlation between xj and the other regressors. Two extreme cases: Rj² = 0 gives the smallest variance; Rj² = 1 is perfect collinearity (impossible, as it is ruled out by assumption 4).

39 What if Rj² is close to 1? This is called multicollinearity. It does not violate assumption 4, but it is still a problem, since the variance of the estimator increases. How can we reduce multicollinearity? By dropping a variable? How big a problem multicollinearity is depends on which variable is your focus.

40 Example of multicollinearity: relationship between education and family background. Source: WAGE2.dta, Wooldridge (data used in M. Blackburn and D. Neumark (1992), "Unobserved Ability, Efficiency Wages, and Interindustry Wage Differentials," Quarterly Journal of Economics 107).

. use (dataset path not recovered)
. su educ sibs meduc feduc
[summary statistics for educ, sibs, meduc, feduc: values not recovered]

In order to predict educational attainment, should we include all of these variables?

41 Omitted variable bias vs. multicollinearity. What happens if we omit father's education?

. reg educ sibs meduc
[regression output, Number of obs = 857, F(2, 854): values not recovered]

These estimates are biased if feduc affects educ and is correlated with meduc and/or sibs.

. corr feduc meduc sibs
(obs=722)
[correlation matrix: values not recovered]
. corr educ feduc
(obs=741)
[correlation values not recovered]

Corr(feduc, meduc) is high, so there is also a problem of multicollinearity if both are in the model.

42 Should we still include the omitted variable?

. reg educ sibs meduc feduc
[regression output, Number of obs = 722, F(3, 718): values not recovered]

Because of the high correlation between meduc and feduc, the standard error of the coefficient on meduc increased substantially. Given that multicollinearity is not a violation of any assumption, we prefer the second estimation over the first.

43 Try to redefine the research question: create a third variable to sum up the information contained in the two variables meduc and feduc.

. gen avpareduc = (feduc + meduc)/2
(213 missing values generated)
. reg educ sibs avpareduc
[regression output, Number of obs = 722, F(2, 719): values not recovered]

Note: if x1 is uncorrelated with x2 and x3, and x1 is the variable of interest, then we do not really care whether x2 and x3 are correlated. So include x3: it will make a better case for causality, and only the variances of the estimators of the coefficients on x2 and x3 will increase.

44 Misspecification. Let

y = β0 + β1x1 + β2x2 + u

be the true model, with all Gauss-Markov assumptions satisfied. We consider two estimators of β1:
- β̂1 from ŷ = β̂0 + β̂1x1 + β̂2x2, and
- β̃1 from the estimated (underspecified) model ỹ = β̃0 + β̃1x1.

Which one is best? If bias is the criterion, β̂1 will be better. What if variance is the criterion?

45 Trade-off between variance and bias. Var(β̃1) ≤ Var(β̂1), with equality if x1 and x2 are uncorrelated. If not:
- When β2 ≠ 0: β̃1 is biased, β̂1 is not, and Var(β̃1) < Var(β̂1).
- When β2 = 0: β̃1 and β̂1 are both unbiased, and Var(β̃1) < Var(β̂1).

Why should we still prefer β̂1?
- The variances decrease as n increases.
- When we omit x2 and β2 ≠ 0, the variance of β̃1 is bigger than it seems, because x2 ends up in the error term, so σ² is bigger.

46 Example of misspecification.

. regress educ sibs meduc feduc brthord
[regression output, Number of obs = 663, F(4, 658): values not recovered]

. regress educ sibs meduc feduc if brthord != .
[regression output, Number of obs = 663, F(3, 659): values not recovered]

The first regression suggests that birth order has no significant effect on education. There must however be a correlation between the number of siblings and the birth order, which causes some multicollinearity. As a result the standard error of the coefficient on sibs is larger in the first model than in the second.

47 Efficiency of OLS: the Gauss-Markov theorem. Under the first four assumptions, the OLS estimators are unbiased. But maybe there are other estimators with smaller variances? The Gauss-Markov theorem: if assumptions 1 to 5 are satisfied, OLS gives us the Best Linear Unbiased Estimators (BLUE).
- Unbiased: under assumptions 1 to 4, E(β̂j) = βj, j = 1, ..., k.
- Best: smallest variances, i.e. most precise.
- Linear: β̂j can be written as a linear combination of the yi.

48 Stata commands used:
. findit bcuse
. bcuse vote1
. d
. su expend*
. ge sharea2=(expenda/(expenda+expendb))*100
. ge shareb=(expendb/(expenda+expendb))*100
. list share*
. su share*
. ge a=sharea+shareb
. list sharea shareb a
. rename a sumshare
. reg votea sharea shareb
. reg votea sharea2 shareb
. su votea
. reg votea shareb
. clear
. bcuse wage2
. su hours
. reg wage educ
. su wage
. reg wage educ IQ
. reg wage educ exper
. corr educ exper
. corr educ IQ
. corr educ exper wage
. corr IQ wage


4 Instrumental Variables Single endogenous variable One continuous instrument. 2 Econ 495 - Econometric Review 1 Contents 4 Instrumental Variables 2 4.1 Single endogenous variable One continuous instrument. 2 4.2 Single endogenous variable more than one continuous instrument..........................

More information

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval] Problem Set #3-Key Sonoma State University Economics 317- Introduction to Econometrics Dr. Cuellar 1. Use the data set Wage1.dta to answer the following questions. a. For the regression model Wage i =

More information

5.2. a. Unobserved factors that tend to make an individual healthier also tend

5.2. a. Unobserved factors that tend to make an individual healthier also tend SOLUTIONS TO CHAPTER 5 PROBLEMS ^ ^ ^ ^ 5.1. Define x _ (z,y ) and x _ v, and let B _ (B,r ) be OLS estimator 1 1 1 1 ^ ^ ^ ^ from (5.5), where B = (D,a ). Using the hint, B can also be obtained by 1 1

More information

coefficients n 2 are the residuals obtained when we estimate the regression on y equals the (simple regression) estimated effect of the part of x 1

coefficients n 2 are the residuals obtained when we estimate the regression on y equals the (simple regression) estimated effect of the part of x 1 Review - Interpreting the Regression If we estimate: It can be shown that: where ˆ1 r i coefficients β ˆ+ βˆ x+ βˆ ˆ= 0 1 1 2x2 y ˆβ n n 2 1 = rˆ i1yi rˆ i1 i= 1 i= 1 xˆ are the residuals obtained when

More information

Lab 10 - Binary Variables

Lab 10 - Binary Variables Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Ordinary Least Squares (OLS): Multiple Linear Regression (MLR) Analytics What s New? Not Much!

Ordinary Least Squares (OLS): Multiple Linear Regression (MLR) Analytics What s New? Not Much! Ordinary Least Squares (OLS): Multiple Linear Regression (MLR) Analytics What s New? Not Much! OLS: Comparison of SLR and MLR Analysis Interpreting Coefficients I (SRF): Marginal effects ceteris paribus

More information

Handout 11: Measurement Error

Handout 11: Measurement Error Handout 11: Measurement Error In which you learn to recognise the consequences for OLS estimation whenever some of the variables you use are not measured as accurately as you might expect. A (potential)

More information

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors ECON4150 - Introductory Econometrics Lecture 6: OLS with Multiple Regressors Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 6 Lecture outline 2 Violation of first Least Squares assumption

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

Problem Set 10: Panel Data

Problem Set 10: Panel Data Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information

Specification Error: Omitted and Extraneous Variables

Specification Error: Omitted and Extraneous Variables Specification Error: Omitted and Extraneous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 5, 05 Omitted variable bias. Suppose that the correct

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser

Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser Simultaneous Equations with Error Components Mike Bronner Marko Ledic Anja Breitwieser PRESENTATION OUTLINE Part I: - Simultaneous equation models: overview - Empirical example Part II: - Hausman and Taylor

More information

Chapter 6: Linear Regression With Multiple Regressors

Chapter 6: Linear Regression With Multiple Regressors Chapter 6: Linear Regression With Multiple Regressors 1-1 Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

sociology 362 regression

sociology 362 regression sociology 36 regression Regression is a means of studying how the conditional distribution of a response variable (say, Y) varies for different values of one or more independent explanatory variables (say,

More information

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics C1.1 Use the data set Wage1.dta to answer the following questions. Estimate regression equation wage =

More information

sociology 362 regression

sociology 362 regression sociology 36 regression Regression is a means of modeling how the conditional distribution of a response variable (say, Y) varies for different values of one or more independent explanatory variables (say,

More information

Week 3: Simple Linear Regression

Week 3: Simple Linear Regression Week 3: Simple Linear Regression Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution

More information

Multivariate Regression: Part I

Multivariate Regression: Part I Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Statement of the objective: we want to explain the behavior of one variable as a

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Problem Set 1 ANSWERS

Problem Set 1 ANSWERS Economics 20 Prof. Patricia M. Anderson Problem Set 1 ANSWERS Part I. Multiple Choice Problems 1. If X and Z are two random variables, then E[X-Z] is d. E[X] E[Z] This is just a simple application of one

More information

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do

More information

Chapter 2: simple regression model

Chapter 2: simple regression model Chapter 2: simple regression model Goal: understand how to estimate and more importantly interpret the simple regression Reading: chapter 2 of the textbook Advice: this chapter is foundation of econometrics.

More information

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one

More information

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects Economics 113 Simple Regression Models Simple Regression Assumptions Simple Regression Derivation Changing Units of Measurement Nonlinear effects OLS and unbiased estimates Variance of the OLS estimates

More information

Introduction to Econometrics. Multiple Regression (2016/2017)

Introduction to Econometrics. Multiple Regression (2016/2017) Introduction to Econometrics STAT-S-301 Multiple Regression (016/017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 OLS estimate of the TS/STR relation: OLS estimate of the Test Score/STR relation:

More information

Econometrics Midterm Examination Answers

Econometrics Midterm Examination Answers Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Economics 326 Methods of Empirical Research in Economics Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Vadim Marmer University of British Columbia May 5, 2010 Multiple restrictions

More information

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C = Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =

More information

Problem Set 4 ANSWERS

Problem Set 4 ANSWERS Economics 20 Problem Set 4 ANSWERS Prof. Patricia M. Anderson 1. Suppose that our variable for consumption is measured with error, so cons = consumption + e 0, where e 0 is uncorrelated with inc, educ

More information

Section Least Squares Regression

Section Least Squares Regression Section 2.3 - Least Squares Regression Statistics 104 Autumn 2004 Copyright c 2004 by Mark E. Irwin Regression Correlation gives us a strength of a linear relationship is, but it doesn t tell us what it

More information

Gov 2000: 9. Regression with Two Independent Variables

Gov 2000: 9. Regression with Two Independent Variables Gov 2000: 9. Regression with Two Independent Variables Matthew Blackwell Fall 2016 1 / 62 1. Why Add Variables to a Regression? 2. Adding a Binary Covariate 3. Adding a Continuous Covariate 4. OLS Mechanics

More information

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single

More information

1 The basics of panel data

1 The basics of panel data Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Asymptotics Asymptotics Multiple Linear Regression: Assumptions Assumption MLR. (Linearity in parameters) Assumption MLR. (Random Sampling from the population) We have a random

More information

Econometrics Homework 1

Econometrics Homework 1 Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z

More information

Course Econometrics I

Course Econometrics I Course Econometrics I 4. Heteroskedasticity Martin Halla Johannes Kepler University of Linz Department of Economics Last update: May 6, 2014 Martin Halla CS Econometrics I 4 1/31 Our agenda for today Consequences

More information

Econometrics. 8) Instrumental variables

Econometrics. 8) Instrumental variables 30C00200 Econometrics 8) Instrumental variables Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Thery of IV regression Overidentification Two-stage least squates

More information

Problem 4.1. Problem 4.3

Problem 4.1. Problem 4.3 BOSTON COLLEGE Department of Economics EC 228 01 Econometric Methods Fall 2008, Prof. Baum, Ms. Phillips (tutor), Mr. Dmitriev (grader) Problem Set 3 Due at classtime, Thursday 14 Oct 2008 Problem 4.1

More information

ECON Introductory Econometrics. Lecture 13: Internal and external validity

ECON Introductory Econometrics. Lecture 13: Internal and external validity ECON4150 - Introductory Econometrics Lecture 13: Internal and external validity Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 9 Lecture outline 2 Definitions of internal and external

More information

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections Answer Key Fixed Effect and First Difference Models 1. See discussion in class.. David Neumark and William Wascher published a study in 199 of the effect of minimum wages on teenage employment using a

More information

Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis

Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Problem 1. The files

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

Introduction to Econometrics. Multiple Regression

Introduction to Econometrics. Multiple Regression Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Multiple Regression Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1 OLS estimate of the TS/STR

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

Lecture 14. More on using dummy variables (deal with seasonality)

Lecture 14. More on using dummy variables (deal with seasonality) Lecture 14. More on using dummy variables (deal with seasonality) More things to worry about: measurement error in variables (can lead to bias in OLS (endogeneity) ) Have seen that dummy variables are

More information

Heteroskedasticity. (In practice this means the spread of observations around any given value of X will not now be constant)

Heteroskedasticity. (In practice this means the spread of observations around any given value of X will not now be constant) Heteroskedasticity Occurs when the Gauss Markov assumption that the residual variance is constant across all observations in the data set so that E(u 2 i /X i ) σ 2 i (In practice this means the spread

More information

Quantitative Methods Final Exam (2017/1)

Quantitative Methods Final Exam (2017/1) Quantitative Methods Final Exam (2017/1) 1. Please write down your name and student ID number. 2. Calculator is allowed during the exam, but DO NOT use a smartphone. 3. List your answers (together with

More information

Lecture 7: OLS with qualitative information

Lecture 7: OLS with qualitative information Lecture 7: OLS with qualitative information Dummy variables Dummy variable: an indicator that says whether a particular observation is in a category or not Like a light switch: on or off Most useful values:

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Practice exam questions

Practice exam questions Practice exam questions Nathaniel Higgins nhiggins@jhu.edu, nhiggins@ers.usda.gov 1. The following question is based on the model y = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 3 + u. Discuss the following two hypotheses.

More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

Multiple Regression Analysis: Heteroskedasticity

Multiple Regression Analysis: Heteroskedasticity Multiple Regression Analysis: Heteroskedasticity y = β 0 + β 1 x 1 + β x +... β k x k + u Read chapter 8. EE45 -Chaiyuth Punyasavatsut 1 topics 8.1 Heteroskedasticity and OLS 8. Robust estimation 8.3 Testing

More information

Econometrics II Censoring & Truncation. May 5, 2011

Econometrics II Censoring & Truncation. May 5, 2011 Econometrics II Censoring & Truncation Måns Söderbom May 5, 2011 1 Censored and Truncated Models Recall that a corner solution is an actual economic outcome, e.g. zero expenditure on health by a household

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

1. The shoe size of five randomly selected men in the class is 7, 7.5, 6, 6.5 the shoe size of 4 randomly selected women is 6, 5.

1. The shoe size of five randomly selected men in the class is 7, 7.5, 6, 6.5 the shoe size of 4 randomly selected women is 6, 5. Economics 3 Introduction to Econometrics Winter 2004 Professor Dobkin Name Final Exam (Sample) You must answer all the questions. The exam is closed book and closed notes you may use calculators. You must

More information