Ordinary Least Squares (OLS): Multiple Linear Regression (MLR) Analytics. What's New? Not Much!
Contents:
OLS: Comparison of SLR and MLR Analytics
1. Interpreting Coefficients I (SRF): Marginal effects ceteris paribus
2. The Collinearity Regression
3. Multicollinearity, R²j and VIFs
4. Omitted Variable Bias/Impact (Endogeneity)
5. Interpreting Coefficients II (SLR): Regressing y on What's New about x

OLS: Comparison of SLR and MLR Analytics

Data Generation Model
  SLR.1 (Linear Model): y_i = β0 + β1 x_i + u_i
  MLR.1 (Linear Model): y_i = β0 + βx x_i + βz z_i + u_i, or
                        y_i = β0 + βx x_i + βz z_i + βw w_i + u_i, etc.

Residuals/Unexplained
  SLR: û_i = y_i − (β̂0 + β̂1 x_i)
  MLR: û_i = y_i − (β̂0 + β̂x x_i + β̂z z_i), etc.

OLS
  SLR: Min SSRs
  MLR: Min SSRs

Estimates
  Intercept:  SLR: β̂0 = ȳ − β̂1 x̄
              MLR: β̂0 = ȳ − β̂x x̄ − β̂z z̄
  Slopes:     SLR: β̂1 = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)²
              MLR: complicated

SRF (Sample Regression Function)
  SLR: ŷ = β̂0 + β̂1 x
  MLR: ŷ_i = β̂0 + β̂x x_i + β̂z z_i, etc.

Controlling for the impact of...?
  SLR: no other RHS variables
  MLR: all other RHS variables
Estimated Impact
  SLR (from changing the one RHS variable): dŷ = β̂1 dx, or Δŷ = β̂1 Δx
  MLR (from changing one RHS variable, ceteris paribus): Δŷ = β̂x Δx
  MLR (from changing several RHS variables): Δŷ = β̂x Δx + β̂z Δz, etc.

At the means
  SLR: ȳ = β̂0 + β̂1 x̄
  MLR: ȳ = β̂0 + β̂x x̄ + β̂z z̄

Elasticities (at the means)
  SLR: (dŷ/dx)(x̄/ȳ) = β̂1 (x̄/ȳ)
  MLR: (Δŷ/Δx)(x̄/ȳ) = β̂x (x̄/ȳ)

So What's New?... not much, really!

1. Interpreting Coefficients I (SRF): Marginal effects ceteris paribus

Consider the following MLR model with three RHS variables:

DGM 1: y_i = β0 + βx x_i + βz z_i + βw w_i + v_i

You estimate the unknown parameter values using ordinary least squares, which gives you the following Sample Regression Function:

SRF 1: ŷ = β̂0 + β̂x x + β̂z z + β̂w w

We use SRFs to predict y values as a function of the values of the RHS variables. Since ∂ŷ/∂x = β̂x, the estimated coefficients in the SRF tell us the relationship between changes in RHS variables (holding everything else fixed) and changes in the predicted values, the ŷ's.

The coefficients also tell us something about the impacts of discrete changes in the RHS variables. Consider a discrete change in x, from, say, x1 to x2. If the values of the other RHS variables are held fixed, we have two predicted values:

ŷ1 = β̂0 + β̂x x1 + β̂z z + β̂w w  and  ŷ2 = β̂0 + β̂x x2 + β̂z z + β̂w w.

And the change in the predicted values will be:

Δŷ = ŷ2 − ŷ1 = β̂x x2 − β̂x x1 = β̂x (x2 − x1) = β̂x Δx
And if we have x changing by Δx, z changing by Δz, and w fixed, the change in the predicted value of y will be:

Δŷ = β̂x Δx + β̂z Δz

So one interpretation of the MLR coefficients: they capture average marginal impacts on the predicted y values generated by the SRF.

Example: See the handout on European football and corner kicks.

2. The Collinearity Regression

The collinearity regression features prominently in 1) the R²j measure of multicollinearity, 2) Omitted Variable Bias/Impact, and 3) the second interpretation of coefficients in MLR models (see below). Consider the MLR model above, with three RHS variables, x, z, and w, and dependent variable y:

DGM 1: y_i = β0 + βx x_i + βz z_i + βw w_i + v_i

The collinearity regression looks solely at the RHS (explanatory) variables in the regression: one of the RHS variables is regressed on the remaining RHS variables. So focusing on the RHS variable w, we have:

DGMw: w_i = α0 + αx x_i + αz z_i + u_i, and
SRFw: ŵ = α̂0 + α̂x x + α̂z z  (residual: û_i = w_i − ŵ_i)

By construction, the SRF's predicted values, ŵ = α̂0 + α̂x x + α̂z z, are the part of the w's explained by the other two explanatory variables (the x's and z's), because the predicted ŵ's are a linear function of the x's and z's. The residuals, the û_i's, are What's New about the w's: the part of the w's not explained by the other explanatory variables in the model.

If the residuals are all zero, then the x's and z's perfectly predict the w's, and so the w variable provides no new/additional explanatory power. In this case we say that the w's are perfectly collinear with the x's and z's. And if the residuals are sizable, then the w's are not so collinear with the x's and z's, and accordingly may provide some new and useful explanatory power in predicting the y values.
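The mechanics can be sketched numerically. Here is a minimal Python example on simulated data (not the bodyfat or football data; all variable names are illustrative): it runs the collinearity regression of w on x and z, splits w into the explained part ŵ and the What's-New residuals û, and checks that the residuals carry no linear information about the included regressors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
z = rng.normal(size=n)
w = 0.5 * x - 0.3 * z + rng.normal(scale=0.8, size=n)  # w partly explained by x and z

# Collinearity regression: w on x and z (with an intercept)
X = np.column_stack([np.ones(n), x, z])
alpha, *_ = np.linalg.lstsq(X, w, rcond=None)

w_hat = X @ alpha    # the part of the w's explained by the x's and z's
u_hat = w - w_hat    # "What's New" about the w's

# The What's-New residuals are uncorrelated with the included regressors
print(np.corrcoef(u_hat, x)[0, 1], np.corrcoef(u_hat, z)[0, 1])  # both ~ 0
```

If û were (nearly) all zeros, w would be (nearly) perfectly collinear with x and z and would add no new explanatory power.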
Here are some collinearity regressions from a model that evaluates the relationship between the Brozek measure of body fat and hgt, wgt and abd (waist size):

. bcuse bodyfat
. reg Brozek hgt wgt abd
. eststo
. reg hgt wgt abd
. eststo
. reg wgt hgt abd
. eststo
. reg abd hgt wgt
. eststo
. esttab, r2 scalar(rmse)

------------------------------------------------------------
                 (1)        (2)        (3)        (4)
              Brozek        hgt        wgt        abd
------------------------------------------------------------
hgt            -.012                             -.0605***
             (-1.43)                  (9.18)     (-7.41)
wgt             -.12***    .136***                .349***
             (-5.41)      (9.18)                 (34.31)
abd              .88***   -.99***     .365***
             (15.19)     (-7.41)     (34.31)
_cons          -3.66***   73.51***   -17.6***     7.53***
             (-5.1)      (39.7)     (-11.3)      (13.3)
------------------------------------------------------------
N
R-sq
rmse

Model (1) is the original regression; Models (2)-(4) are the collinearity regressions.

3. Multicollinearity: R-squared (R²j)

Recall that the sample correlation, ρ̂xy = Sxy/(Sx Sy), captures the extent to which there is a linear relationship between two variables, x and y. By definition, the correlation concept can be applied only to pairs of explanatory variables. But what happens when you want to evaluate the extent to which larger groups of variables are moving together (in a linear fashion)?

In the SLR analysis, we found that R² is also correlation squared: ρ̂²xy = R². And so in that analysis, we could just as easily have used R² to measure the extent to which the two variables moved together in a linear fashion.[1] While the concept of correlation does not extend to sets of more than two variables, R² does. And so we will use the R² of the collinearity regression to measure multicollinearity, the extent to which sets of variables are moving together in a linear fashion.[2]

[1] Note that −1 ≤ ρ̂ ≤ 1 and 0 ≤ R² ≤ 1.
[2] Note that the squared correlation of the predicted y's, the ŷ's, with the y's does extend to MLR models: ρ̂²(ŷ,y) = R², so that's another reason to use R² as a measure of collinearity in MLR models.

In the example above, wgt is the most collinear explanatory variable, since the R² in Model (3) is .842 (which tells us that 84.2% of the variation in the wgt variable can be explained with a linear
function of the other two explanatory variables, hgt and abd). abd, in Model (4), is almost as collinear, with R² = .827. And hgt is the least collinear, with R² = .592.

The R-squared in the collinearity regression is often called R²j, since it is said to be associated with some explanatory variable xj. If it's near 1, then most of the variation in, say, the w's can be explained by the other explanatory variables (the x's and z's). And so in that sense the w's don't bring much new to the model (or offer much new independent explanatory power). In this case we say that w is highly collinear with the other RHS variables. But if R²j is small, then the w's are not so collinear with the other explanatory variables, and in that sense they (the w's) bring a lot of new explanatory power to the RHS of the model.

When we get to inference and precision of estimation, we will see that the presence of multicollinearity will lead to higher standard errors, whatever those are. But the real problem with it is that it can lead to wacky estimated coefficients because of the ceteris paribus condition (in some cases it doesn't make sense to hold everything else fixed while changing just one explanatory variable). (See the European football handout.) So if you have strange estimated coefficients, see if multicollinearity is driving those estimates.

And as for fixes? You can always get more data. But you might also just try re-estimating the model, individually dropping highly collinear RHS variables to see what happens. We'll work through some examples in class.

Another way to generate the R²j's: VIFs (Variance Inflation Factors)

We will cover Variance Inflation Factors (VIFs) in detail when we get to inference, but for now, I just note that they provide an easy way to generate the R²j's, and to get a sense of the degree of multicollinearity in a MLR model. The relationship between VIFs and R²j's:

VIFj = 1/(1 − R²j), or R²j = 1 − 1/VIFj.

So if you know one, you know the other.
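The algebra linking VIFs and R²j's can be checked directly. A minimal Python sketch on simulated data (variable names are illustrative, not from the bodyfat data): compute R²j from a hand-rolled collinearity regression, convert it to a VIF, and convert back.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
z = 0.9 * x + rng.normal(scale=0.5, size=n)  # z is highly collinear with x
w = rng.normal(size=n)                       # w is not collinear with the others

def r2_collinearity(target, others):
    """R^2 from regressing one RHS variable on the remaining RHS variables."""
    X = np.column_stack([np.ones(len(target))] + list(others))
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return 1 - resid.var() / target.var()

r2_z = r2_collinearity(z, [x, w])  # R^2_j for z
vif_z = 1 / (1 - r2_z)             # VIF_j = 1/(1 - R^2_j)

print(r2_z)              # large: z moves with x
print(vif_z)             # correspondingly large
print(1 - 1 / vif_z)     # recovers R^2_j exactly
```

Running `r2_collinearity(w, [x, z])` instead gives a near-zero R²j and a VIF near 1, since w was generated independently of the other regressors.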
Note that the VIFs and R²j's move in the same direction, so larger VIFs are associated with larger R²j's. If you just run the vif command immediately after estimating your MLR model, you'll get the VIFs, and by association, the R²j's. Here's an example using the European Football data:

. reg ptsdiff sdiff stdiff cdiff fdiff rdiff
  (output omitted; Adj R-squared = .817)

. vif

  (output omitted)

Check: Run the collinearity regression

. reg sdiff stdiff cdiff fdiff rdiff

  (output omitted; Adj R-squared = .665)

In this example, shots (sdiff) and shots on target (stdiff) are the most collinear of the RHS variables. No surprise, given that their collinearity is largely driven by their high degree of correlation with one another.[3] The R²j's for the other RHS variables are all fairly small.

[3] Test your knowledge: Show that R²j for shots must be greater than or equal to the square of the correlation between shots and shots on target.
4. Omitted Variable Bias/Impact (Endogeneity)

Estimated coefficients will be biased (or, less pejoratively, impacted) to the extent that those variables are correlated with omitted variables which are themselves correlated with the dependent variable. This is not so much a bias as a misinterpretation: the estimated coefficients reflect the incremental average relationship between changes in the particular variable and changes in the LHS variable, controlling for all the other variables in the model. But of course, the omitted variable is not in the model.

Fixes? If you can't insert the omitted variable into the model, maybe you can include a proxy variable (which might be highly correlated with the omitted variable). And if you can't do that, you might be able to at least sign the bias, and determine whether the estimated model over- or under-estimates the true parameter value(s) (relative to a model in which the omitted variable is included in the analysis). (More about fixes below.)

Example: Dropping RHS variable z from the model

Consider a simple model with two explanatory variables, x and z. Using OLS to estimate the parameter values with the full model, you get:

SRF: ŷ = β̂0 + β̂x x + β̂z z.

Now suppose that z is dropped/omitted and the estimated model is instead:

SRF: ŷ = γ̂0 + γ̂x x.

The omitted variable bias/impact will be the change in the estimated x coefficient when z is dropped from the model: γ̂x − β̂x. We can derive this using the collinearity regression in which the omitted variable z is regressed on the included variable x. If the SRF from that regression is

SRFz: ẑ = α̂0 + α̂x x,

then the omitted variable bias is just β̂z α̂x, the product of:

  β̂z, the estimated coefficient for the omitted variable (when it's in the full model), and
  α̂x, the respective estimated coefficient when the omitted variable is regressed on the other RHS variable(s) in the model.

Here's an example using the bodyfat dataset.
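Before the Stata example, the formula γ̂x − β̂x = β̂z α̂x can be verified numerically. A minimal Python sketch on simulated data (not the bodyfat data; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=n)
z = 0.6 * x + rng.normal(size=n)                 # z is correlated with x
y = 1.0 + 2.0 * x - 1.5 * z + rng.normal(size=n)

def ols(y, *cols):
    """OLS coefficients (intercept first) from regressing y on cols."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

_, beta_x, beta_z = ols(y, x, z)   # full model
_, gamma_x = ols(y, x)             # z omitted
_, alpha_x = ols(z, x)             # collinearity regression: z on x

print(gamma_x - beta_x)   # omitted variable bias/impact
print(beta_z * alpha_x)   # the same number, via the formula
```

The two printed numbers agree to machine precision: the identity holds exactly in any sample, not just on average.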
------------------------------------------------------------
                 (1)        (2)        (3)
              Brozek     Brozek        hgt
------------------------------------------------------------
wgt             .187***    .162***    .384***
             (14.48)      (1.7)      (5.1)
hgt            -.065***
             (-6.9)
_cons          31.16***              63.7***
              (4.51)     (-4.18)    (46.54)
------------------------------------------------------------
N
R-sq
rmse

t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
In Model (1), Brozek has been regressed on wgt and hgt. In Model (2), hgt has been dropped from the model, and the wgt coefficient drops by .025, from .187 to .162. Model (3) gives the results of the collinearity regression in which hgt, the omitted variable in Model (2), is regressed on wgt, the RHS variable in Model (2).

And so the Omitted Variable Bias associated with excluding hgt from the original model is the product of the wgt coefficient in Model (3), .384, and the hgt coefficient in Model (1), −.065:

(.384) × (−.065) = −.025

which is exactly the amount by which the wgt coefficient dropped when hgt was dropped from the model.

Qualitative Assessment

Return to the case of dropping z from the model, so we have:

Full Model SRF:                ŷ = β̂0 + β̂x x + β̂z z
Estimated Model SRF:           ŷ = γ̂0 + γ̂x x
Collinearity Regression SRFz:  ẑ = α̂0 + α̂x x

The following table summarizes the qualitative effects of omitted variable bias when you omit, say, z, from the model and just regress y on x. Note that since α̂x = ρ̂xz (Sz/Sx), the sign of α̂x will be the same as the sign of the correlation between x and z, ρ̂xz.

Omitted Variable Bias: β̂z α̂x

                               z coeff. in full model
correlation between x and z    β̂z > 0        β̂z < 0
α̂x > 0                        Positive       Negative
α̂x < 0                        Negative       Positive

This table is useful because often, and especially with "favorite coefficient" models, you want to be able to sign the bias. And if you're lucky, you can say something like: "I estimated a positive effect, and I know that I have an issue with omitted variable bias; but since I'm confident that that bias is negative, I know that the true effect is even more positive than I've estimated, and so I'm confident that there really is a positive relationship." But of course, if the omitted variable bias is positive, then you know that you've overestimated the effect, and now maybe you aren't so sure that the actual effect is positive: it could just be the omitted variable bias driving the result.
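The sign logic in the table can also be checked numerically. A Python sketch on simulated data (names illustrative), with a positive z coefficient and x, z negatively correlated, so the table predicts a negative bias:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
x = rng.normal(size=n)
z = -0.8 * x + rng.normal(size=n)                  # x and z negatively correlated
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)   # true x effect is 2.0; z coeff > 0

def slope(y, x):
    """OLS slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

alpha_x = slope(z, x)                # collinearity regression slope
rho_xz = np.corrcoef(x, z)[0, 1]

# alpha_x takes the sign of the x-z correlation, as claimed
print(np.sign(alpha_x) == np.sign(rho_xz))

# Positive z coefficient, negative alpha_x: the table says the bias is negative,
# so the SLR slope of y on x should land below the true value of 2.0
print(slope(y, x))
```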
What to do if you fear omitted variable bias:

1. Don't be lazy. Get the data and include it in your model.

2. But maybe you can't get the data. Then maybe use an available proxy variable which is highly correlated with the omitted variable. Or try several proxy variables and see if it matters. (Example: If you don't have data on disposable personal income by MSA, use median per capita income as a proxy, or maybe median housing sales prices, or median monthly rent data.)

3. And if you are really lazy and don't want to find proxies, try the oh-so-sophisticated Instrumental Variables approach, which we'll discuss later in the semester. But only if you are really, really lazy! (Yes, you see my bias!)

So far we've looked at going from two to one explanatory variables. How do things change if we have more RHS variables in the full model? Not much! Suppose that the full model includes a third explanatory variable w, so that the full model SRF is:

SRF 1: ŷ = β̂0 + β̂x x + β̂z z + β̂w w.

But you drop w for the analysis and just regress y on x and z, with the resulting SRF:

SRF: ŷ = b̂0 + b̂x x + b̂z z.

In this case, both of the estimated slope coefficients for x and z can be impacted by the omission of w from the model. To determine the bias, run the collinearity regression, regressing the omitted variable w on x and z, and generating:

SRFw: ŵ = α̂0 + α̂x x + α̂z z.

Then the omitted variable biases/impacts from excluding the w's from the model are:

b̂x − β̂x = α̂x β̂w   (the product of the SRFw x coeff and the SRF 1 w coeff)
b̂z − β̂z = α̂z β̂w   (the product of the SRFw z coeff and the SRF 1 w coeff)

Here's an example, returning to the bodyfat dataset. In Model (1), Brozek has been regressed on wgt, hgt and abd. In Model (2), abd has been dropped from the model. Model (3) gives the results of the collinearity regression in which abd, the omitted variable in Model (2), is regressed on hgt and wgt, the RHS variables in Model (2).
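(Before the Stata output, the two three-regressor formulas can be checked the same way. A Python sketch on simulated data, not the bodyfat data; names illustrative:)

```python
import numpy as np

rng = np.random.default_rng(3)
n = 800
x = rng.normal(size=n)
z = rng.normal(size=n)
w = 0.4 * x - 0.7 * z + rng.normal(size=n)                 # w collinear with x and z
y = 2.0 + 1.0 * x + 0.5 * z + 1.5 * w + rng.normal(size=n)

def ols(y, *cols):
    """OLS coefficients (intercept first) from regressing y on cols."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

_, beta_x, beta_z, beta_w = ols(y, x, z, w)   # full model
_, b_x, b_z = ols(y, x, z)                    # w omitted
_, alpha_x, alpha_z = ols(w, x, z)            # collinearity regression: w on x and z

# Each slope shifts by (collinearity coefficient) x (omitted w coefficient)
print(b_x - beta_x, alpha_x * beta_w)   # equal
print(b_z - beta_z, alpha_z * beta_w)   # equal
```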
------------------------------------------------------------
                 (1)        (2)        (3)
              Brozek     Brozek        abd
------------------------------------------------------------
wgt             -.12***    .187***    .349***
             (-5.41)     (14.48)     (34.31)
hgt            -.012      -.065***  -.0605***
             (-1.43)     (-6.9)      (-7.41)
abd              .88***
             (15.19)
_cons          -3.66***   31.16***    7.53***
             (-5.1)      (4.51)      (13.3)
------------------------------------------------------------
N
R-sq
rmse

t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Notice that when abd was dropped from Model (1): the estimated wgt coefficient increased by .307 (= (.187) − (−.12)), and the estimated hgt coefficient dropped by .053, going from −.012 to −.065. Applying the formulas above, we estimate the bias using the product of the abd coefficient in Model (1) and the respective RHS variable coefficients in the collinearity regression (Model (3)):

wgt bias: (.88) × (.349) = .307, as advertised
hgt bias: (.88) × (−.0605) = −.053, also as advertised

5. Interpreting Coefficients II (SLR): Regressing y on What's New about x

Earlier, we saw that the MLR coefficients told you something about the incremental impacts on predicted values of the dependent variable, holding everything else fixed (ceteris paribus). We now turn to a second interpretation: the MLR coefficients capture the SLR relationship between the dependent variable and, in each case, What's New about the specific RHS variable.

To see this, first run the collinearity regression to determine What's New about a particular explanatory variable (What's New = the residuals from that first regression), and then run a SLR model regressing y on What's New. You'll see that the SLR coefficient in the second model is exactly the same as the respective coefficient in the MLR model.

So let's consider the full model, with SRF 1: ŷ = β̂0 + β̂x x + β̂z z + β̂w w. We are interested in better understanding the x coefficient, β̂x, and proceed in two steps:
Step 1: What's New about x

Run the collinearity regression, regressing x on the other RHS variables:

SRF 2: x̂ = α̂0 + α̂z z + α̂w w   (residuals: û_i = x_i − x̂_i)

The residuals in the collinearity regression, the û_i's, are the part of the variable x not explained by the other RHS variables in the model. We call that WhatsNewx: the part of the x's not explained by the z's and w's. So: WhatsNewx = û.

Step 2: Regress y on WhatsNewx

If you run the SLR model regressing y on WhatsNewx, you'll discover that the estimated coefficient for WhatsNewx will be exactly the same as the x coefficient in the full MLR model. So β̂x captures the relationship between the dependent variable and What's New about the x's, the residuals in the collinearity regression (and the part of the x's not explained by the other RHS variables in the model).

Here's an example using the bodyfat dataset and focusing on the abd variable:

. reg Brozek wgt hgt abd

  (output omitted; Adj R-squared = .7177)

. reg abd wgt hgt

  (output omitted; Adj R-squared = .853)

Use the predict ..., res command to capture the residuals:
. predict whatsnew, res

And regress Brozek on What's New using a SLR model:

. reg Brozek whatsnew

  (output omitted; Adj R-squared = .566)

So effectively, the estimated coefficients in MLR models capture the relationship between the dependent variable and What's New with each of the RHS variables.
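The two-step What's-New recipe (the Frisch-Waugh-Lovell result) can be replicated outside Stata as well. A Python sketch on simulated data (not the bodyfat data; names illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 600
z = rng.normal(size=n)
w = rng.normal(size=n)
x = 0.5 * z - 0.2 * w + rng.normal(size=n)                 # x collinear with z and w
y = 1.0 + 2.0 * x - 1.0 * z + 0.5 * w + rng.normal(size=n)

def ols(y, *cols):
    """OLS coefficients (intercept first) from regressing y on cols."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

_, beta_x, _, _ = ols(y, x, z, w)   # x coefficient in the full MLR model

# Step 1: What's New about x = residuals from the collinearity regression
a = ols(x, z, w)
whats_new = x - np.column_stack([np.ones(n), z, w]) @ a

# Step 2: the SLR of y on What's New reproduces the MLR x coefficient exactly
_, slr_coef = ols(y, whats_new)
print(beta_x, slr_coef)   # identical, to machine precision
```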
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests
ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single
More informationECON2228 Notes 7. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 41
ECON2228 Notes 7 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 6 2014 2015 1 / 41 Chapter 8: Heteroskedasticity In laying out the standard regression model, we made
More informationsociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income
Scatterplots Quantitative Research Methods: Introduction to correlation and regression Scatterplots can be considered as interval/ratio analogue of cross-tabs: arbitrarily many values mapped out in -dimensions
More informationLab 07 Introduction to Econometrics
Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand
More informationAt this point, if you ve done everything correctly, you should have data that looks something like:
This homework is due on July 19 th. Economics 375: Introduction to Econometrics Homework #4 1. One tool to aid in understanding econometrics is the Monte Carlo experiment. A Monte Carlo experiment allows
More informationLecture 5. In the last lecture, we covered. This lecture introduces you to
Lecture 5 In the last lecture, we covered. homework 2. The linear regression model (4.) 3. Estimating the coefficients (4.2) This lecture introduces you to. Measures of Fit (4.3) 2. The Least Square Assumptions
More informationECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors
ECON4150 - Introductory Econometrics Lecture 6: OLS with Multiple Regressors Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 6 Lecture outline 2 Violation of first Least Squares assumption
More informationMulticollinearity Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2015
Multicollinearity Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Stata Example (See appendices for full example).. use http://www.nd.edu/~rwilliam/stats2/statafiles/multicoll.dta,
More informationECON3150/4150 Spring 2016
ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationCorrelation Analysis
Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the
More informationMultiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =
Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =
More informationChapter 3 Multiple Regression Complete Example
Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be
More informationSpecification Error: Omitted and Extraneous Variables
Specification Error: Omitted and Extraneous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 5, 05 Omitted variable bias. Suppose that the correct
More informationLI EAR REGRESSIO A D CORRELATIO
CHAPTER 6 LI EAR REGRESSIO A D CORRELATIO Page Contents 6.1 Introduction 10 6. Curve Fitting 10 6.3 Fitting a Simple Linear Regression Line 103 6.4 Linear Correlation Analysis 107 6.5 Spearman s Rank Correlation
More informationRegression with a Single Regressor: Hypothesis Tests and Confidence Intervals
Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression
More informationECON3150/4150 Spring 2015
ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2
More informationIntermediate Econometrics
Intermediate Econometrics Markus Haas LMU München Summer term 2011 15. Mai 2011 The Simple Linear Regression Model Considering variables x and y in a specific population (e.g., years of education and wage
More informationSTA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007
STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 LAST NAME: SOLUTIONS FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 302 STA 1001 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator.
More informationLinear Regression 9/23/17. Simple linear regression. Advertising sales: Variance changes based on # of TVs. Advertising sales: Normal error?
Simple linear regression Linear Regression Nicole Beckage y " = β % + β ' x " + ε so y* " = β+ % + β+ ' x " Method to assess and evaluate the correlation between two (continuous) variables. The slope of
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationLecture 11: Simple Linear Regression
Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink
More informationPractice exam questions
Practice exam questions Nathaniel Higgins nhiggins@jhu.edu, nhiggins@ers.usda.gov 1. The following question is based on the model y = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 3 + u. Discuss the following two hypotheses.
More information1 Multiple Regression
1 Multiple Regression In this section, we extend the linear model to the case of several quantitative explanatory variables. There are many issues involved in this problem and this section serves only
More informationStatistical Modelling in Stata 5: Linear Models
Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does
More informationStart with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model
Start with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model A1: There is a linear relationship between X and Y. A2: The error terms (and
More informationImmigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs
Logistic Regression, Part I: Problems with the Linear Probability Model (LPM) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 22, 2015 This handout steals
More informationMultiple Regression Analysis. Part III. Multiple Regression Analysis
Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant
More informationMotivation for multiple regression
Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope
More informationLab 11 - Heteroskedasticity
Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction
More information1 Independent Practice: Hypothesis tests for one parameter:
1 Independent Practice: Hypothesis tests for one parameter: Data from the Indian DHS survey from 2006 includes a measure of autonomy of the women surveyed (a scale from 0-10, 10 being the most autonomous)
More information(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.
FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December
More informationWeek 3: Simple Linear Regression
Week 3: Simple Linear Regression Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline
More informationLecture notes on Regression & SAS example demonstration
Regression & Correlation (p. 215) When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also
More informationThe Classical Linear Regression Model
The Classical Linear Regression Model ME104: Linear Regression Analysis Kenneth Benoit August 14, 2012 CLRM: Basic Assumptions 1. Specification: Relationship between X and Y in the population is linear:
More informationMultiple Regression Analysis. Basic Estimation Techniques. Multiple Regression Analysis. Multiple Regression Analysis
Multiple Regression Analysis Basic Estimation Techniques Herbert Stocker herbert.stocker@uibk.ac.at University of Innsbruck & IIS, University of Ramkhamhaeng Regression Analysis: Statistical procedure
More informationAutocorrelation. Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time
Autocorrelation Given the model Y t = b 0 + b 1 X t + u t Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time This could be caused
More informationThe Simple Regression Model. Simple Regression Model 1
The Simple Regression Model Simple Regression Model 1 Simple regression model: Objectives Given the model: - where y is earnings and x years of education - Or y is sales and x is spending in advertising
More informationLecture 8: Instrumental Variables Estimation
Lecture Notes on Advanced Econometrics Lecture 8: Instrumental Variables Estimation Endogenous Variables Consider a population model: y α y + β + β x + β x +... + β x + u i i i i k ik i Takashi Yamano
More informationChapter 4: Regression Models
Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,
More informationNonrecursive models (Extended Version) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015
Nonrecursive models (Extended Version) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 NOTE: This lecture borrows heavily from Duncan s Introduction
More informationECON 497 Final Exam Page 1 of 12
ECON 497 Final Exam Page of 2 ECON 497: Economic Research and Forecasting Name: Spring 2008 Bellas Final Exam Return this exam to me by 4:00 on Wednesday, April 23. It may be e-mailed to me. It may be
More informationEconometrics. 8) Instrumental variables
30C00200 Econometrics 8) Instrumental variables Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Thery of IV regression Overidentification Two-stage least squates
More informationRegression Analysis IV... More MLR and Model Building
Regression Analysis IV... More MLR and Model Building This session finishes up presenting the formal methods of inference based on the MLR model and then begins discussion of "model building" (use of regression
More informationGreene, Econometric Analysis (7th ed, 2012)
EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.
More information