Practical Regression: Noise, Heteroskedasticity, and Grouped Data


DAVID DRANOVE

This is one in a series of notes entitled "Practical Regression." These notes supplement the theoretical content of most statistics texts with practical advice on solving real-world empirical problems through regression analysis.

Introduction to Noisy Variables

A variable is noisy if it does not exactly equal the variable of interest (the one that best fits what the theory demands) or if it is mismeasured. Here are some examples:

- You want to measure the impact of product-level advertising on product sales. You have data on firms' total advertising budgets. To estimate product-level budgets, you divide the total budget by the number of products. Your measure of product-level advertising is noisy.

- You want to determine whether inventory turnaround is faster in firms that use just-in-time (JIT) inventory techniques. You survey logistics managers to get information about inventory turnaround. The busy managers provide rough, and therefore noisy, estimates.

- Continuing this investigation of inventory turnaround, you next study whether turnaround times differ by nation. Using the survey responses, you compute the average turnaround in each nation. The number of survey respondents ranges from seventy-five in the United States to two in Chile. By the law of large numbers, the seventy-five U.S. firms in the sample are fairly representative of the United States as a whole, but the two Chilean responses may not accurately reflect all Chilean firms. Your measure of nation-level turnaround times is noisy, especially for nations with few sample respondents.

The first part of this note describes the implications of noisy variables and suggests possible ways to deal with them.
The second part of this note discusses problems that arise when the error term does not satisfy the ordinary least squares (OLS) assumptions of homoskedasticity and independence.

© by the Kellogg School of Management, Northwestern University. This technical note was prepared by Professor David Dranove. Technical notes are developed solely as the basis for class discussion. Technical notes are not intended to serve as endorsements, sources of primary data, or illustrations of effective or ineffective management. To order copies or request permission to reproduce materials, contact cases@kellogg.northwestern.edu. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise) without the permission of the Kellogg School of Management.

Implications of Noisy Variables

It is not always easy to determine which variables are noisy. After all, the best way to know whether a variable is noisy is to compare it with an accurate measure; but if you had an accurate measure, you would not need the noisy one. It is sometimes possible to apply statistical common sense to determine whether variables are noisy. For now, we will suppose that we know when a variable is noisy and discuss what that means for the analysis.

Noisy Dependent Variables

Here are two key facts:

- Coefficients obtained from OLS regressions with noisy dependent variables are unbiased. This implies that your predictions are also unbiased.
- Coefficients obtained from OLS regressions with noisy dependent variables are estimated less precisely (i.e., the standard errors increase). Thus, your predictions are less accurate.

These statements are readily confirmed. Suppose that the true model relating X to Y is:

(1) Y = B0 + B1 X + ε_y

where ε_y is normally distributed. Suppose further that you do not have an accurate measure of Y. Instead, you have:

(2) Z = Y + ε_z

where ε_z is a normally distributed noise term that is independent of ε_y.[1] Substituting from (2) into (1) yields:

(3) Z = B0 + B1 X + (ε_y + ε_z)

This is a regression equation. In fact, the only difference between equations (1) and (3) is that the error term is larger in equation (3) (ε_y + ε_z versus ε_y).[2] This implies that the standard errors on B0 and B1 are larger when you use Z as the dependent variable. This causes the standard errors of any predictions to increase as well.

Noisy Predictor Variables

Things are a bit different when the predictor variables are noisy. Let's see what happens when X is noisy. Suppose that the true model is:

[1] In general, you do not know the precise nature of the noise. Assuming that it is normally distributed is usually a good approximation and makes the math much easier.
[2] Recall that the sum of two normally distributed variables is also normal. Thus, ε_y + ε_z is normal, so equation (3) is a standard OLS regression model.
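The two facts above are easy to verify by simulation. Here is a minimal sketch in Python with numpy (all numbers hypothetical, not from the note): it repeatedly generates a clean Y and a noisy Z = Y + ε_z, fits both by OLS, and compares the slope estimates. The average slope is the same in both cases (unbiased), but the noisy-Y slopes are spread more widely (larger standard errors).

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 2000
beta0, beta1 = 2.0, 1.5
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

def ols_slope(y):
    # slope from a one-regressor OLS fit with an intercept
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

slopes_clean, slopes_noisy = [], []
for _ in range(reps):
    eps_y = rng.normal(scale=1.0, size=n)   # regression error
    eps_z = rng.normal(scale=2.0, size=n)   # measurement noise in Y
    y = beta0 + beta1 * x + eps_y
    z = y + eps_z                           # noisy dependent variable
    slopes_clean.append(ols_slope(y))
    slopes_noisy.append(ols_slope(z))

slopes_clean = np.array(slopes_clean)
slopes_noisy = np.array(slopes_noisy)
print(slopes_clean.mean(), slopes_noisy.mean())  # both close to 1.5: unbiased
print(slopes_clean.std(), slopes_noisy.std())    # noisy-Y slopes spread wider
```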

(4) Y = B0 + B1 X + ε_y

Suppose that you cannot measure X with precision. Instead, you measure:

(5) Q = X + ε_x

where ε_x is normally distributed and independent of ε_y. We won't derive it here, but the estimated B1 will tend toward the following value:

(6) Estimated B1 = (True B1) / (1 + σ²_x/σ²_y)

where σ²_x and σ²_y are the variances of ε_x and ε_y, respectively. Noting that the denominator is larger than 1, we conclude that the estimate of B1 is biased toward zero.[3] This is known as attenuation bias. The degree of attenuation bias depends on the relative values of σ²_x and σ²_y. If σ²_x is large relative to σ²_y (i.e., X is measured with a lot of noise relative to the regression error), then the bias can be quite large.

Most of the time, you should not be overly concerned about attenuation bias. It is inevitable that you will measure some predictor variables with error. If the measurement errors are relatively small, the bias is small as well. Moreover, if you are mainly interested in hypothesis testing as opposed to examining magnitudes, then the bias is of the "right" type: if the estimated B1 is statistically significant despite measurement error, then the true B1 would be larger and likely more significant if you could eliminate that error.

Heteroskedasticity

A key assumption of OLS regression is that the errors for all observations are distributed identically. In other words, you expect the model to give equally precise predictions for all observations. Recalling that the OLS regression residuals are unbiased estimates of the errors in the underlying regression equation, we expect any variation in the residuals from one observation to another to be completely random. This requirement is violated if the magnitude of the residuals is correlated with some factor Z.[4] It does not matter whether Z is in your model. For example, your residual may be large in magnitude whenever Z is large and small in magnitude whenever Z is small. If the magnitude of the residuals is correlated with any factor Z, then your model suffers from heteroskedasticity. When you have heteroskedasticity, the OLS standard errors are incorrect.

[3] If the true value of B1 is positive, the computer will report an estimate of B1 that is a smaller positive number. Similarly, if the true value is negative, the computer will report a smaller (in magnitude) negative number.
[4] Remember, the error is the ε in the underlying model. The residual is the difference between the actual and predicted values. The two are not the same due to the randomness of the process that generates your data. Even so, the residual is your best estimate of the actual error.
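Attenuation bias is also easy to see in a simulation. The sketch below (hypothetical numbers) regresses Y on a noisy predictor Q = X + ε_x; in this setup the slope estimate converges to the standard errors-in-variables shrinkage, True B1 × var(X)/(var(X) + σ²_x), so with var(X) = σ²_x = 1 a true slope of 2 should be estimated near 1.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
beta1 = 2.0
x = rng.normal(size=n)                 # true predictor, var(X) = 1
eps_y = rng.normal(size=n)             # regression error, variance 1
eps_x = rng.normal(size=n)             # measurement noise in X, variance 1
y = 1.0 + beta1 * x + eps_y
q = x + eps_x                          # noisy measured predictor

Q = np.column_stack([np.ones(n), q])
coef, *_ = np.linalg.lstsq(Q, y, rcond=None)
b1 = coef[1]
# shrinkage factor var(x)/(var(x) + var(eps_x)) = 1/2, so b1 should be near 1.0
print(b1)
```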

Testing for Heteroskedasticity

Heteroskedasticity can arise in any number of ways, and there is no universal test for it. We can illustrate the problem by examining the relationship between sales and price of yogurt. Our data contain eighty-eight weeks of sales and pricing information on yogurt. The key dependent variable, labeled sales1 in the data set, gives the number of yogurt containers sold in a week. The key predictor, price1, is the price of yogurt in dollars per ounce. The variable promo1 indicates whether the yogurt is promoted in a special display case. Here is the result when we run regress sales1 price1 promo1:

[regression output not reproduced]

One way to test for heteroskedasticity is to perform an "interocular" (eyeball) test: plot the residuals against the predicted values, or plot the absolute values of the residuals against the predicted values:

[residual plots not reproduced]

The residuals seem to show greater variance when the predicted values are larger (look at the wide range of residuals when the fitted values are around 9,000). This is evidence of heteroskedasticity.

Eyeballing the data makes us suspicious of heteroskedasticity, but we can also perform statistical tests. Specifically, we can test specific hypotheses about the residuals. Remember, heteroskedasticity can arise in countless ways, so there are countless tests we could perform. In practice, econometricians limit their toolkits to just a few tests. The most common is the Breusch-Pagan (BP) test. To perform the BP test, regress the squared residuals on the predictor variables from the original regression and then perform a joint (F) test of all the predictors in that second regression:

Step 1: Perform the regression: Y = B0 + Bx X + e
Step 2: Regress the squared residuals on the predictors: e² = γ0 + γx X
Step 3: Perform a joint (F) test on γx

Fortunately, Stata performs this test in one command (hettest) following the regression:

[test output not reproduced]

The test statistic reveals that we can reject the null hypothesis of constant variance of the residuals, which is tantamount to rejecting the null hypothesis of homoskedasticity. Thus, the regression suffers from heteroskedasticity.
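The three BP steps can be sketched directly in Python with numpy (simulated data, since the yogurt data set is not reproduced here). A common form of the test uses the Lagrange-multiplier statistic n·R² from the auxiliary regression, which is chi-squared under the null of homoskedasticity; Stata's hettest is a variant of the same idea.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x = rng.uniform(1, 5, size=n)
# heteroskedastic data: the error spread grows with x
y = 3.0 + 2.0 * x + rng.normal(size=n) * x

# Step 1: the original regression
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# Step 2: regress the squared residuals on the predictors
e2 = resid**2
g, *_ = np.linalg.lstsq(X, e2, rcond=None)
fitted = X @ g
r2 = 1 - ((e2 - fitted) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()

# Step 3: test the auxiliary regression; the LM form uses n * R^2,
# distributed chi-squared(1) under homoskedasticity
lm = n * r2
print(lm)  # far above the chi2(1) 5% critical value of 3.84
```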

Fixing the Problem

So you have heteroskedasticity and you don't know what to do about it. You certainly can't ignore it. Stata (and all regression software) computes standard errors and performs t-tests under the assumption of homoskedastic residuals. If you have heteroskedasticity, your standard errors and significance tests are incorrect.

Most often, the cause of heteroskedasticity is a misspecified model. Misspecification can occur if we have omitted predictors or if we should have transformed the dependent and/or independent variables, for example by taking logs. We have discussed omitted variables in a previous technical note.[5] Sometimes regressions remain heteroskedastic despite the best specifications, and your standard errors are still incorrect. (If your model is well specified, however, your coefficients are unbiased.)

The standard solution to heteroskedasticity in well-specified regressions is the White correction (attributed to Halbert White). The White correction adjusts the variance-covariance matrix that is used to compute the standard errors. The technical details are not important, but there are a few things you should note:

- The White method corrects for heteroskedasticity without altering the regression coefficients.
- If the data are homoskedastic, the White-corrected standard errors are equivalent to the standard errors from OLS. (In practice, there are always at least small differences.)
- There are numerous modifications to the White correction, so different software packages may yield slightly different results.
- The White correction can be applied to models estimated via maximum likelihood techniques.

Getting White-corrected standard errors (sometimes known as "whitewashing") is very simple in Stata. Just repeat your regression and add ,robust:

[5] David Dranove, "Practical Regression: Introduction to Endogeneity: Omitted Variable Bias," Case # (Kellogg School of Management, 2012).
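A sketch of what the White correction computes, in numpy with simulated heteroskedastic data (hypothetical numbers): the coefficients come from ordinary OLS, and only the variance estimate changes, from s²(X'X)⁻¹ to an HC1-style "sandwich" built from each observation's own squared residual. This is an illustration of the idea, not Stata's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
x = rng.uniform(1, 5, size=n)
# heteroskedastic errors: the error spread grows sharply with x
y = 3.0 + 2.0 * x + rng.normal(size=n) * x**2

X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)

# conventional OLS standard errors (assume a single error variance)
s2 = resid @ resid / (n - k)
se_ols = np.sqrt(np.diag(s2 * XtX_inv))

# White (HC1) robust standard errors: sandwich estimator built from
# each observation's own squared residual
meat = (X * (resid**2)[:, None]).T @ X
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv) * n / (n - k))

print(se_ols[1], se_robust[1])  # robust slope SE is larger here
```

Note that the coefficient vector is untouched; only the standard errors change, exactly as the bullet points above describe.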

Note the following:

- The regression coefficients are unchanged; thus, the regression R² is unchanged.
- The corrected standard errors (now called "robust standard errors") differ from those in the original regression.
- The corrected standard errors in this case are larger than before. This is typical, but it does not always occur.

One final point: the simple regression I have been working with is badly misspecified. Before testing for heteroskedasticity, I should have worked harder to correct the specification by considering additional predictors, using fixed effects (for the grocery store), and using a logged dependent variable.

Using Weighted Least Squares to Correct Heteroskedasticity

The Breusch-Pagan test regresses the squared residuals on all of the predictors in the model. Sometimes the squared residuals are a function of a single predictor in the model; the BP test might capture this. Because the BP regression includes all the regressors in your model, however, it is a weaker test than if you had regressed the squared residuals on that single predictor alone.[6] Because you hypothesize that only a single predictor matters, your test should be limited to that predictor.

A potentially more severe problem with the BP test is that the Z factor that causes heteroskedasticity might not be in your regression. If this is the case, the BP test will fail. This problem occurs in a wide class of regressions and can be fixed by using weighted least squares (WLS). One important class of applications for which the weighting factor is easy to identify and essential to use occurs whenever the left-hand-side (LHS) variable is drawn from individual survey data that is aggregated up to the market level. That is a rich sentence with lots of content, so let's break it down:

1. You need to have survey data.
2. The survey data is used to construct the LHS variable.
3. The LHS variable is computed by aggregating individual responses to create a market-level mean.

If all three conditions hold (and they often do), then WLS is indicated. The following example should make things clearer. Suppose that you are studying determinants of television viewing in different cities. You survey lots of viewers in lots of cities to find out about their viewing habits. Your unit of analysis is the city, so you compute citywide average viewing levels. In some cities, you may have just one or two responses. In others, you have fifty or one hundred responses. Simple statistics tells you that in those cities with a higher number of responses, the citywide averages you compute are pretty close to the actual averages for those cities (assuming you have a representative sample). In those cities with only one or two responses, however, the averages you compute may be very different from the true citywide averages.

Because the sample sizes are rather small in many cities, your LHS variable (estimated citywide viewership) is noisy. But there is something predictable about the magnitude of the noise. A bit of statistics will show that if n_i is the number of respondents in city i, and e_i is the regression residual for city i, then the magnitude of e_i is proportional to 1/√n_i. This implies that you have heteroskedasticity, as e_i is systematically related to some factor (in this case, n_i).

[6] By adding insignificant predictors to the BP test regression, you decrease your chances of getting a significant result.

The BP test will not pick this up because sample size (n_i) is not a regressor and is therefore excluded from the test. A variant of the test can be used instead: simply regress the squared residuals on n_i.

To illustrate weighting, let's look at data on managed care organization (MCO) penetration. The dependent variable is the percentage of physician revenues derived from managed care insurers in each of 294 metropolitan areas. We will regress MCO penetration on income, education (the percentage of the population with college degrees), and a measure of hospital concentration in each market. Note that the dependent variable is derived from survey data and that the number of survey respondents differs across markets. I perform the standard tests for heteroskedasticity:

[plot and hettest output not reproduced]

Neither the plot nor the BP test indicates any problem with heteroskedasticity. But because I have survey data, I have my doubts. First, I plot the residuals against the number of survey respondents in each market:

[residual plot not reproduced]

Note the classic funnel shape; the residuals get smaller as the number of respondents gets larger. Now, perform the modified BP test:

[test output not reproduced]

I have heteroskedasticity. To eliminate this problem, I need to get the computer to pay less attention to those cities with fewer respondents. Specifically, I will weight each observation by n and run a simple OLS regression. Weighting by n means that you multiply each and every value in your data set by √n before running the regression.
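A minimal numpy sketch of this weighting scheme, using hypothetical city-level survey data rather than the MCO data set: each city mean has error spread proportional to 1/√n_i, and WLS is just OLS after multiplying every variable (including the constant) by √n_i.

```python
import numpy as np

rng = np.random.default_rng(4)
cities = 300
n_i = rng.integers(2, 100, size=cities)       # survey respondents per city
income = rng.normal(50, 10, size=cities)
true_viewing = 10.0 + 0.2 * income            # true citywide viewing level
# a mean of n_i responses has standard deviation proportional to 1/sqrt(n_i)
y = true_viewing + rng.normal(size=cities) * (6.0 / np.sqrt(n_i))

X = np.column_stack([np.ones(cities), income])

# plain OLS ignores the fact that some city means are far noisier than others
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# WLS: multiply every variable (including the constant) by sqrt(n_i)
w = np.sqrt(n_i)
b_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)

print(b_ols[1], b_wls[1])  # both near 0.2; the WLS estimate is more precise
```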

Here is why this works. If you multiply everything by √n, then the error term for each observation is also multiplied by √n. This in turn implies that the squared errors are multiplied by n. Recall that OLS works hardest to fit the observations that contribute the most to the sum of squared errors. By multiplying the squared errors by n, you force the computer to do a good job of fitting the observations with the largest n's, which is exactly what you want.

But be careful. If the magnitude of the residuals is not exactly proportional to 1/√n, then WLS is not exactly the correct solution. Still, it is probably better to use a simple solution like WLS when it is theoretically sound than to perform ex post data picking in search of the best-fitting solution (i.e., picking out a significant result after the fact and then coming up with a theory to explain it). Be warned, however, that widespread use of WLS can cause more problems than it solves.

Needless to say, there is a simple way to perform WLS in Stata. Note the addition of [w=number_surveyed] at the end of the regression command:

[regression output not reproduced]

Some observations:

- The results can be interpreted like any OLS results.
- The R² is lower than before. Ignore this. The original regression was heteroskedastic, so the standard errors used to compute the R² in the original regression were incorrect.
- The WLS model can still have general heteroskedasticity, which can be detected using the BP test and corrected using the White correction.

REVIEW: A GUIDE TO WEIGHTING

You may want to put more weight on some observations than others. This is certainly the case if the errors are systematically smaller for some observations; these observations deserve more weight (e.g., when you aggregate survey data).

- You can check whether you need WLS by correlating the absolute value of the residuals with the potential weighting factor n, or better still, by running hettest n.
- WLS multiplies the LHS and right-hand-side (RHS) variables by √n, where n is the weighting factor. This weights the squared errors by n, which is what you want.
- It is easy to perform WLS in Stata; just add [w=n] at the end of your regression statement.
- Avoid using WLS unless it is clearly justified by the nature of the data.

Summarizing Heteroskedasticity

- You have heteroskedasticity if the magnitude of the residuals is correlated with some unmeasured factor.
- Heteroskedasticity biases the standard errors, but the coefficients are unbiased.
- You can test for heteroskedasticity using the hettest command in Stata.
- The ,robust option corrects the standard errors in heteroskedastic OLS regressions.
- A common source of heteroskedasticity is the use of aggregated survey data. This can be corrected by using WLS.
- WLS and ,robust are not mutually exclusive.

Grouped Data

Another critical assumption of OLS is that all the observations are independent. This assumption is frequently violated in practice. A prime example is regression with grouped data. For example, you may run a regression of profits for firms in a variety of industries. It seems plausible that profits will be correlated for firms within any given industry.

Here is a more extreme example. Suppose you want to know if redheads are more popular than brunettes. You have two friends named John and Paul. John has brown hair and Paul has red hair. At 1:00 p.m., you poll the class to see how many classmates like John more than Paul. You find that forty-five prefer John and fifteen prefer Paul. You repeat this poll at 1:10, 1:20, and so on. Your data look as follows:

Name   Hair Color   Popularity
John   B            45
Paul   R            15
John   B            46*
Paul   R            15
John   B            46
Paul   R            15
John   B            46
Paul   R            15
John   B            46
Paul   R            14*

*One student arrived late and another student left class early to go to a job interview.

Given these ten observations, you regress popularity on an indicator variable for hair color, where Hair = 1 if the hair is brown and Hair = 0 if red. The resulting coefficient is B_hair ≈ 30.5, which is statistically significant, thanks to the ten observations and the apparent nine degrees of freedom. Do you conclude that people with brown hair are more popular? Of course not. The reason you get a significant coefficient on hair color is that you do not have ten independent observations; you have two observations that are each repeated five times. The computer has no way to know this; it thinks you have lots of experiments and computes the standard errors accordingly.

This is an extreme example of "groupiness" in the data. If you do not account for the groupiness of your data, you will overstate the true degrees of freedom in your model, and the reported standard errors will be artificially small. You run a great risk of tricking yourself into thinking that you have significant findings when in reality you do not.

One way to deal with grouped data is to estimate fixed effects. In fixed effects models, the computer ignores within-group variation when estimating the coefficients. Thus, only across-group variation matters. (To determine the effect of hair color in the prior example, either John or Paul would need to change his hair color from brown to red, or vice versa.)

There are times when you do not want to estimate fixed effects models. This is especially true if you do not have much within-group variation. For example, suppose you want to study the effect of market demographics on yogurt sales. The demographics of the communities surrounding the stores will change little over time.
If you include store fixed effects, you will not have sufficient within-store variation. You will have to omit the store dummies and rely on across-store variation. (You now run a heightened risk of omitted variable bias, of course, but if you have a rich set of demographics, this risk is minimized.) Suppose you go ahead and omit the store fixed effects. It is now likely that the errors across observations within each store are no longer independent. You have grouped data, and if you don't account for it, your standard errors will be biased.

The technique for adjusting the standard errors to account for groupiness is preprogrammed into Stata. Continuing the example, if you have data on the income of each store's local community, you could estimate the following regression:

regress sales1 price1 promo1 income

To correct the standard errors for possible groupiness, just use the cluster option in Stata. In this case, the groupiness comes from the variable store, so you type:

regress sales1 price1 promo1 income, cluster(store)

Coping with Grouped Data

- Use common sense as a guide to determine whether your data fall naturally into groups. As an alternative, examine the residuals for observations within specific groups. Are they systematically positive or negative? If so, then you may not have independent observations, and as a result your standard errors are too small.
- You can estimate a fixed effects model to avoid the resulting bias in the standard errors, but you will be unable to examine the effects of variables that vary only between groups.
- If you want to preserve between-group action but avoid biased standard errors, use the ,cluster(groupname) option in Stata.
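What the cluster correction computes can be sketched in numpy with hypothetical store data (ignoring Stata's finite-sample adjustments): the coefficients are ordinary OLS, but the variance estimate sums the residuals within each store before squaring, so within-store correlation inflates the standard errors. Here income varies across stores but not within a store, which is exactly the situation where the naive standard errors are badly too small.

```python
import numpy as np

rng = np.random.default_rng(5)
stores, weeks = 20, 50
n = stores * weeks
store_id = np.repeat(np.arange(stores), weeks)

# income varies across stores but not within a store
income = rng.normal(50, 10, size=stores)[store_id]
# a shared store-level shock makes observations within a store correlated
store_effect = rng.normal(scale=2.0, size=stores)[store_id]
sales = 20.0 + 0.5 * income + store_effect + rng.normal(size=n)

X = np.column_stack([np.ones(n), income])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
resid = sales - X @ coef
k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)

# naive OLS standard errors treat all 1,000 rows as independent
s2 = resid @ resid / (n - k)
se_naive = np.sqrt(np.diag(s2 * XtX_inv))

# cluster-robust sandwich: sum residuals within each store before squaring
meat = np.zeros((k, k))
for g in range(stores):
    m = store_id == g
    s = X[m].T @ resid[m]
    meat += np.outer(s, s)
se_cluster = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_naive[1], se_cluster[1])  # clustered SE on income is far larger
```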

An Unexpected Problem (Math Optional)

Suppose your initial model is:

Y = B0 + B1 X + ε_y

You decide that you want to divide both Y and X by some other variable V. An example might be when you express both variables in per capita amounts, where V is the size of the population. If you divide Y by V, then you must divide the RHS by V to keep the equation correct. This means you are effectively regressing:

Y/V = B0/V + B1 X/V + ε_y/V

Note that the error term is now clearly larger when V is smaller, that is, when the dependent and independent variables are smaller. This is blatant heteroskedasticity. One excuse for keeping the model this way is that the underlying model is in fact:

Y/V = B0 + B1 X/V + ε_y

OLS appears to be safe here. If you think this is the correct model, you are almost safe. A word of caution is still necessary. Suppose the true model is:

Y/V = B0 + B1 X/V + ε_y

but you do not have a precise measure of V. Instead, you have U = V + ε_u. Thus, you are actually regressing:

Y/(V + ε_u) = B0 + B1 X/(V + ε_u) + ε_y

Note that whenever ε_u is positive, both the dependent variable Y/(V + ε_u) and the predictor variable X/(V + ε_u) in the regression are smaller in magnitude than the corresponding variables in the true model. Similarly, if ε_u is negative, both variables are larger than they are supposed to be. This implies that the two variables move together in the data, not because the variables are causally related but because of noisy measurement of V. This will bias the estimate of B1 upward: the estimate is more positive than the true B1. This bias emerges whenever you divide the dependent and predictor variables by the same noisy divisor. Many empirical researchers feel that such bias is inevitable and suggest that you restate the regression in such a way as to avoid dividing both the LHS and RHS by the same variable.

I generally side with this skeptical group, although I think it important to determine whether the divisor accurately measures the theoretical construct. For example, I am less worried about dividing the LHS and RHS by population (to obtain per capita values) than I am about dividing by variables that might be measured with considerable noise (or be noisy measures of the underlying theoretical construct).
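This division bias shows up clearly in a simulation (hypothetical numbers). Below, the true B1 is exactly zero, yet regressing Y/U on X/U produces a distinctly positive slope purely because both sides share the noisy divisor U; dividing by the true V gives a slope near zero, as it should.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
V = rng.uniform(100, 200, size=n)       # true divisor (e.g., population)
X = rng.uniform(10, 20, size=n)
# true model: Y/V = 5 + noise, so the true B1 on X/V is exactly zero
Y = 5.0 * V + rng.normal(scale=5.0, size=n)
U = V + rng.normal(scale=15.0, size=n)  # noisy measure of the divisor

def slope(y, x):
    # one-regressor OLS slope with an intercept
    Xm = np.column_stack([np.ones_like(x), x])
    c, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    return c[1]

b_true_divisor = slope(Y / V, X / V)    # near zero, as it should be
b_noisy_divisor = slope(Y / U, X / U)   # pushed well above zero by shared noise
print(b_true_divisor, b_noisy_divisor)
```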


More information

Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD Introduction For pedagogical reasons, OLS is presented initially under strong simplifying assumptions. One of these is homoskedastic errors,

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias Contact Information Elena Llaudet Sections are voluntary. My office hours are Thursdays 5pm-7pm in Littauer Mezzanine 34-36 (Note room change) You can email me administrative questions to ellaudet@gmail.com.

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Topic 1. Definitions

Topic 1. Definitions S Topic. Definitions. Scalar A scalar is a number. 2. Vector A vector is a column of numbers. 3. Linear combination A scalar times a vector plus a scalar times a vector, plus a scalar times a vector...

More information

EC4051 Project and Introductory Econometrics

EC4051 Project and Introductory Econometrics EC4051 Project and Introductory Econometrics Dudley Cooke Trinity College Dublin Dudley Cooke (Trinity College Dublin) Intro to Econometrics 1 / 23 Project Guidelines Each student is required to undertake

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 24, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Lab 11 - Heteroskedasticity

Lab 11 - Heteroskedasticity Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction

More information

download instant at

download instant at Answers to Odd-Numbered Exercises Chapter One: An Overview of Regression Analysis 1-3. (a) Positive, (b) negative, (c) positive, (d) negative, (e) ambiguous, (f) negative. 1-5. (a) The coefficients in

More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

ECON 4551 Econometrics II Memorial University of Newfoundland. Panel Data Models. Adapted from Vera Tabakova s notes

ECON 4551 Econometrics II Memorial University of Newfoundland. Panel Data Models. Adapted from Vera Tabakova s notes ECON 4551 Econometrics II Memorial University of Newfoundland Panel Data Models Adapted from Vera Tabakova s notes 15.1 Grunfeld s Investment Data 15.2 Sets of Regression Equations 15.3 Seemingly Unrelated

More information

ECON2228 Notes 7. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 41

ECON2228 Notes 7. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 41 ECON2228 Notes 7 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 6 2014 2015 1 / 41 Chapter 8: Heteroskedasticity In laying out the standard regression model, we made

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Suggested Review Problems from Pindyck & Rubinfeld Original prepared by Professor Suzanne Cooper John F. Kennedy School of Government, Harvard

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

A particularly nasty aspect of this is that it is often difficult or impossible to tell if a model fails to satisfy these steps.

A particularly nasty aspect of this is that it is often difficult or impossible to tell if a model fails to satisfy these steps. ECON 497: Lecture 6 Page 1 of 1 Metropolitan State University ECON 497: Research and Forecasting Lecture Notes 6 Specification: Choosing the Independent Variables Studenmund Chapter 6 Before we start,

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H.

ACE 564 Spring Lecture 8. Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information. by Professor Scott H. ACE 564 Spring 2006 Lecture 8 Violations of Basic Assumptions I: Multicollinearity and Non-Sample Information by Professor Scott H. Irwin Readings: Griffiths, Hill and Judge. "Collinear Economic Variables,

More information

2 Prediction and Analysis of Variance

2 Prediction and Analysis of Variance 2 Prediction and Analysis of Variance Reading: Chapters and 2 of Kennedy A Guide to Econometrics Achen, Christopher H. Interpreting and Using Regression (London: Sage, 982). Chapter 4 of Andy Field, Discovering

More information

Econometrics (60 points) as the multivariate regression of Y on X 1 and X 2? [6 points]

Econometrics (60 points) as the multivariate regression of Y on X 1 and X 2? [6 points] Econometrics (60 points) Question 7: Short Answers (30 points) Answer parts 1-6 with a brief explanation. 1. Suppose the model of interest is Y i = 0 + 1 X 1i + 2 X 2i + u i, where E(u X)=0 and E(u 2 X)=

More information

Econ 1123: Section 5. Review. Internal Validity. Panel Data. Clustered SE. STATA help for Problem Set 5. Econ 1123: Section 5.

Econ 1123: Section 5. Review. Internal Validity. Panel Data. Clustered SE. STATA help for Problem Set 5. Econ 1123: Section 5. Outline 1 Elena Llaudet 2 3 4 October 6, 2010 5 based on Common Mistakes on P. Set 4 lnftmpop = -.72-2.84 higdppc -.25 lackpf +.65 higdppc * lackpf 2 lnftmpop = β 0 + β 1 higdppc + β 2 lackpf + β 3 lackpf

More information

Sampling and Sample Size. Shawn Cole Harvard Business School

Sampling and Sample Size. Shawn Cole Harvard Business School Sampling and Sample Size Shawn Cole Harvard Business School Calculating Sample Size Effect Size Power Significance Level Variance ICC EffectSize 2 ( ) 1 σ = t( 1 κ ) + tα * * 1+ ρ( m 1) P N ( 1 P) Proportion

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

Chapter 8 Heteroskedasticity

Chapter 8 Heteroskedasticity Chapter 8 Walter R. Paczkowski Rutgers University Page 1 Chapter Contents 8.1 The Nature of 8. Detecting 8.3 -Consistent Standard Errors 8.4 Generalized Least Squares: Known Form of Variance 8.5 Generalized

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94 Freeing up the Classical Assumptions () Introductory Econometrics: Topic 5 1 / 94 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions needed for derivations

More information

Ordinary Least Squares Regression

Ordinary Least Squares Regression Ordinary Least Squares Regression Goals for this unit More on notation and terminology OLS scalar versus matrix derivation Some Preliminaries In this class we will be learning to analyze Cross Section

More information

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 6: Multiple regression analysis: Further issues

Wooldridge, Introductory Econometrics, 4th ed. Chapter 6: Multiple regression analysis: Further issues Wooldridge, Introductory Econometrics, 4th ed. Chapter 6: Multiple regression analysis: Further issues What effects will the scale of the X and y variables have upon multiple regression? The coefficients

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Simultaneous quations and Two-Stage Least Squares So far, we have studied examples where the causal relationship is quite clear: the value of the

More information

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor 1. The regression equation 2. Estimating the equation 3. Assumptions required for

More information

Outline. Possible Reasons. Nature of Heteroscedasticity. Basic Econometrics in Transportation. Heteroscedasticity

Outline. Possible Reasons. Nature of Heteroscedasticity. Basic Econometrics in Transportation. Heteroscedasticity 1/25 Outline Basic Econometrics in Transportation Heteroscedasticity What is the nature of heteroscedasticity? What are its consequences? How does one detect it? What are the remedial measures? Amir Samimi

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

The returns to schooling, ability bias, and regression

The returns to schooling, ability bias, and regression The returns to schooling, ability bias, and regression Jörn-Steffen Pischke LSE October 4, 2016 Pischke (LSE) Griliches 1977 October 4, 2016 1 / 44 Counterfactual outcomes Scholing for individual i is

More information

Motivation for multiple regression

Motivation for multiple regression Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope

More information

ECON 497 Midterm Spring

ECON 497 Midterm Spring ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain

More information

Reliability of inference (1 of 2 lectures)

Reliability of inference (1 of 2 lectures) Reliability of inference (1 of 2 lectures) Ragnar Nymoen University of Oslo 5 March 2013 1 / 19 This lecture (#13 and 14): I The optimality of the OLS estimators and tests depend on the assumptions of

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Basic Linear Model. Chapters 4 and 4: Part II. Basic Linear Model

Basic Linear Model. Chapters 4 and 4: Part II. Basic Linear Model Basic Linear Model Chapters 4 and 4: Part II Statistical Properties of Least Square Estimates Y i = α+βx i + ε I Want to chooses estimates for α and β that best fit the data Objective minimize the sum

More information

Sociology 593 Exam 2 Answer Key March 28, 2002

Sociology 593 Exam 2 Answer Key March 28, 2002 Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably

More information

STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Saturday, May 9, 008 Examination time: 3

More information

Instrumental Variables and the Problem of Endogeneity

Instrumental Variables and the Problem of Endogeneity Instrumental Variables and the Problem of Endogeneity September 15, 2015 1 / 38 Exogeneity: Important Assumption of OLS In a standard OLS framework, y = xβ + ɛ (1) and for unbiasedness we need E[x ɛ] =

More information

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2. Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round

More information

Lecture 4: Heteroskedasticity

Lecture 4: Heteroskedasticity Lecture 4: Heteroskedasticity Econometric Methods Warsaw School of Economics (4) Heteroskedasticity 1 / 24 Outline 1 What is heteroskedasticity? 2 Testing for heteroskedasticity White Goldfeld-Quandt Breusch-Pagan

More information

Econometrics Part Three

Econometrics Part Three !1 I. Heteroskedasticity A. Definition 1. The variance of the error term is correlated with one of the explanatory variables 2. Example -- the variance of actual spending around the consumption line increases

More information

Regression Analysis. BUS 735: Business Decision Making and Research

Regression Analysis. BUS 735: Business Decision Making and Research Regression Analysis BUS 735: Business Decision Making and Research 1 Goals and Agenda Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 4: OLS and Statistics revision Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 4 VŠE, SS 2016/17 1 / 68 Outline 1 Econometric analysis Properties of an estimator

More information

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1 Math 66/566 - Midterm Solutions NOTE: These solutions are for both the 66 and 566 exam. The problems are the same until questions and 5. 1. The moment generating function of a random variable X is M(t)

More information

1. You have data on years of work experience, EXPER, its square, EXPER2, years of education, EDUC, and the log of hourly wages, LWAGE

1. You have data on years of work experience, EXPER, its square, EXPER2, years of education, EDUC, and the log of hourly wages, LWAGE 1. You have data on years of work experience, EXPER, its square, EXPER, years of education, EDUC, and the log of hourly wages, LWAGE You estimate the following regressions: (1) LWAGE =.00 + 0.05*EDUC +

More information

ECON 497: Lecture Notes 10 Page 1 of 1

ECON 497: Lecture Notes 10 Page 1 of 1 ECON 497: Lecture Notes 10 Page 1 of 1 Metropolitan State University ECON 497: Research and Forecasting Lecture Notes 10 Heteroskedasticity Studenmund Chapter 10 We'll start with a quote from Studenmund:

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

Instrumental variables estimation using heteroskedasticity-based instruments

Instrumental variables estimation using heteroskedasticity-based instruments Instrumental variables estimation using heteroskedasticity-based instruments Christopher F Baum, Arthur Lewbel, Mark E Schaffer, Oleksandr Talavera Boston College/DIW Berlin, Boston College, Heriot Watt

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

Econ 510 B. Brown Spring 2014 Final Exam Answers

Econ 510 B. Brown Spring 2014 Final Exam Answers Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

More information

CRP 272 Introduction To Regression Analysis

CRP 272 Introduction To Regression Analysis CRP 272 Introduction To Regression Analysis 30 Relationships Among Two Variables: Interpretations One variable is used to explain another variable X Variable Independent Variable Explaining Variable Exogenous

More information

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance.

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance. Heteroskedasticity y i = β + β x i + β x i +... + β k x ki + e i where E(e i ) σ, non-constant variance. Common problem with samples over individuals. ê i e ˆi x k x k AREC-ECON 535 Lec F Suppose y i =

More information

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression Chapter 7 Hypothesis Tests and Confidence Intervals in Multiple Regression Outline 1. Hypothesis tests and confidence intervals for a single coefficie. Joint hypothesis tests on multiple coefficients 3.

More information

Answer Key: Problem Set 6

Answer Key: Problem Set 6 : Problem Set 6 1. Consider a linear model to explain monthly beer consumption: beer = + inc + price + educ + female + u 0 1 3 4 E ( u inc, price, educ, female ) = 0 ( u inc price educ female) σ inc var,,,

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

Solutions to Problem Set 5 (Due November 22) Maximum number of points for Problem set 5 is: 220. Problem 7.3

Solutions to Problem Set 5 (Due November 22) Maximum number of points for Problem set 5 is: 220. Problem 7.3 Solutions to Problem Set 5 (Due November 22) EC 228 02, Fall 2010 Prof. Baum, Ms Hristakeva Maximum number of points for Problem set 5 is: 220 Problem 7.3 (i) (5 points) The t statistic on hsize 2 is over

More information

Simple Regression Model. January 24, 2011

Simple Regression Model. January 24, 2011 Simple Regression Model January 24, 2011 Outline Descriptive Analysis Causal Estimation Forecasting Regression Model We are actually going to derive the linear regression model in 3 very different ways

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

Inference in Regression Model

Inference in Regression Model Inference in Regression Model Christopher Taber Department of Economics University of Wisconsin-Madison March 25, 2009 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3

More information

Econometrics Problem Set 11

Econometrics Problem Set 11 Econometrics Problem Set WISE, Xiamen University Spring 207 Conceptual Questions. (SW 2.) This question refers to the panel data regressions summarized in the following table: Dependent variable: ln(q

More information

Mid-term exam Practice problems

Mid-term exam Practice problems Mid-term exam Practice problems Most problems are short answer problems. You receive points for the answer and the explanation. Full points require both, unless otherwise specified. Explaining your answer

More information

8. TRANSFORMING TOOL #1 (the Addition Property of Equality)

8. TRANSFORMING TOOL #1 (the Addition Property of Equality) 8 TRANSFORMING TOOL #1 (the Addition Property of Equality) sentences that look different, but always have the same truth values What can you DO to a sentence that will make it LOOK different, but not change

More information

STAT 212 Business Statistics II 1

STAT 212 Business Statistics II 1 STAT 1 Business Statistics II 1 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 1: BUSINESS STATISTICS II Semester 091 Final Exam Thursday Feb

More information

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

More information

8. Instrumental variables regression

8. Instrumental variables regression 8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption

More information