Inference in Regression Model

1 Inference in Regression Model Christopher Taber Department of Economics University of Wisconsin-Madison March 25, 2009

2 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3 Hypothesis Testing 4 Testing Multiple Hypotheses at the same time 5 Examples

3 Normality Assumption We need to add one more assumption to get the final part of the Classical Linear Regression Model. Assumption MLR.6 (Normality): The population error $u$ is independent of the explanatory variables $x_1, x_2, \ldots, x_k$ and is normally distributed with zero mean and variance $\sigma^2$: $u \sim N(0, \sigma^2)$

4 What do you think of this assumption? It is awfully strong: why should we believe the error term is normally distributed? Many distributions look like bell curves, but that does not necessarily mean normality is a good assumption. When we worry about OLS asymptotics we will see a better way of handling this problem.

5 Why is this assumption useful? Remember the simple regression model: $\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^N (x_i - \bar{x}) u_i}{\sum_{i=1}^N (x_i - \bar{x})^2}$ This is a linear combination of normal random variables, so it is normally distributed.

6 Putting the assumption into action gives the following result. Theorem 4.1 (Normal Sampling Distributions): Under the CLM assumptions MLR.1 through MLR.6, conditional on the sample values of the independent variables, $\hat\beta_j \sim N(\beta_j, \mathrm{Var}(\hat\beta_j))$, where $\mathrm{Var}(\hat\beta_j)$ was given in Chapter 3 [equation (3.51)]. Therefore, $\frac{\hat\beta_j - \beta_j}{\mathrm{sd}(\hat\beta_j)} \sim N(0, 1)$.

7 We never actually looked at equation 3.51 in the lecture notes, so let me give it now. Theorem 3.2 (Sampling Variance of the OLS Slope Estimators): $\mathrm{Var}(\hat\beta_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}$ for $j = 1, 2, \ldots, K$, where $SST_j = \sum_{i=1}^N (x_{ij} - \bar{x}_j)^2$ is the total sample variation in $x_j$, and $R_j^2$ is the R-squared from regressing $x_j$ on all of the other independent variables (including an intercept).

8 A Fly in the Ointment This is all fine, except for one thing: we don't know $\sigma^2$! We can estimate it in the way you might expect (well, close to it): $\hat\sigma^2 = \frac{1}{N-K-1} \sum_{i=1}^N \hat{u}_i^2.$ This turns out to be an unbiased estimate of $\sigma^2$. We define the standard error of $\hat\beta_j$ as $se(\hat\beta_j) = \sqrt{\frac{\hat\sigma^2}{SST_j (1 - R_j^2)}}$
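
To make these formulas concrete, here is a small numerical sketch (not part of the original notes: the data are simulated and every parameter value is made up). It computes OLS by hand with numpy and then reproduces $se(\hat\beta_1)$ from the formula above.

import numpy as np

rng = np.random.default_rng(0)
N = 200                              # hypothetical sample size
x1 = rng.normal(size=N)
x2 = 0.5 * x1 + rng.normal(size=N)   # correlated regressors so that R_1^2 > 0
u = rng.normal(scale=2.0, size=N)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u    # made-up true coefficients
K = 2                                # number of slope coefficients

X = np.column_stack([np.ones(N), x1, x2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS coefficients
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (N - K - 1)       # unbiased estimate of sigma^2

# Standard error of beta_1 from the Theorem 3.2 formula
SST_1 = np.sum((x1 - x1.mean()) ** 2)
# R_1^2 is the R-squared from regressing x1 on the other regressors (with an intercept)
Z = np.column_stack([np.ones(N), x2])
r1 = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
R2_1 = 1 - (r1 @ r1) / SST_1
se_beta1 = np.sqrt(sigma2_hat / (SST_1 * (1 - R2_1)))
print(beta_hat, se_beta1)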

9 Then we have the result that Theorem 4.2 (t Distribution for the Standardized Estimators): Under the CLM assumptions MLR.1 through MLR.6, $\frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \sim t_{N-K-1}$ where $K + 1$ is the number of unknown parameters in the population model $y = \beta_0 + \beta_1 x_1 + \cdots + \beta_K x_K + u$ ($K$ slope parameters and the intercept $\beta_0$). Now we don't know $\beta_j$, but this won't be particularly important for what we are doing. Note as well that if $N-K-1$ is large then the t-distribution is almost identical to the normal distribution.

10 How is this all useful? Let's forget about regression now and think more generally about inference. We have talked about how to construct regression estimators and their distribution, but not about statistical inference. It will turn out that you already learned how to do this in your Statistics class. Let's first think about why we care.

11 Example 1 You want to decide whether to be a doctor or a lawyer. You have a sample of earnings of people in the population: for doctors $\bar{Y} = \$123{,}452$; for lawyers $\bar{Y} = \$112{,}935$. Should you be a doctor or a lawyer? How confident are you?

12 Example 2 Batting Averages: You go to one basketball game. Charlie Villanueva scores 32 points and LeBron James scores 17 points. Who is better?

13 Example 3 You are a firm thinking about switching to a new mode of production. It saves about $1100 on average, but costs $1000. Should you buy it?

14 Example 4 You are trying to predict the stock market. You think unemployment might be important. You run a regression of stock gains on unemployment and find a positive estimate $b_2$. Should you necessarily invest all your money in the stock exchange? You would like to test formally whether $b_2 > 0$.

15 Example 5 You want to know whether crime in Chicago is related to the temperature. You run the regression $A_t = \beta_0 + \beta_1 T_t + \varepsilon_t$ where $A_t$ is arrests in Chicago at date $t$ and $T_t$ is the temperature in Chicago at date $t$, and you find a positive coefficient on temperature. It looks like there are more arrests when the temperature is higher, but are you sure?

16 Example 6 You are interested in how much cars depreciate. You have information on their age and their price: $P_i$ is the price of car $i$ in thousands of dollars and $A_i$ is the age of car $i$. You estimate $P_i = \beta_0 + \beta_1 A_i + \varepsilon_i$ How do you interpret the coefficient on age? How confident are you?

17 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3 Hypothesis Testing 4 Testing Multiple Hypotheses at the same time 5 Examples

18 Let's forget about regression or sample means or anything else and just assume that we have some estimator $\hat\theta$ that is a function of the data. This estimator is normally distributed and is an unbiased estimate of $\theta$: $\hat\theta \sim N(\theta, \sigma_\theta^2)$

19 Let's also not worry about t distributions for now, but assume that $N - K - 1$ is large enough that the distribution is approximately normal. There is a little trick with normal random variables that makes them easy to deal with. What is the distribution of $\frac{\hat\theta - \theta}{\sigma_\theta}$?

20 First note that $\theta$ and $\sigma_\theta$ are nonstochastic while $\hat\theta$ is normally distributed. Therefore $\frac{\hat\theta - \theta}{\sigma_\theta}$ must be normally distributed. If we know its expectation and we know its variance, we know everything there is to know about it.

21 $E\left(\frac{\hat\theta - \theta}{\sigma_\theta}\right) = \frac{1}{\sigma_\theta}\left(E(\hat\theta) - \theta\right) = 0$ and $\mathrm{Var}\left(\frac{\hat\theta - \theta}{\sigma_\theta}\right) = \frac{1}{\sigma_\theta^2}\mathrm{Var}(\hat\theta - \theta) = \frac{1}{\sigma_\theta^2}\mathrm{Var}(\hat\theta) = \frac{\sigma_\theta^2}{\sigma_\theta^2} = 1$

22 This means that $\frac{\hat\theta - \theta}{\sigma_\theta} \sim N(0, 1)$ This is true for any normally distributed random variable.

23 The distribution $N(0, 1)$ is called a standard normal. You can find tables in the back of the book that give its distribution. What the tables will show is that for any variable, say $Z$, that is a standard normal, $\Pr(-1.96 \le Z \le 1.96) = 0.95$ That is, 95% of the time a standard normal random variable will lie between $-1.96$ and $1.96$.

24 This is true for any standard normal, so it has to be true for $\frac{\hat\theta - \theta}{\sigma_\theta}$, thus $\Pr\left(-1.96 \le \frac{\hat\theta - \theta}{\sigma_\theta} \le 1.96\right) = 0.95$ Now, after doing some algebra, we get a nice expression: $0.95 = \Pr\left(-1.96 \le \frac{\hat\theta - \theta}{\sigma_\theta} \le 1.96\right) = \Pr\left(-1.96\sigma_\theta \le \hat\theta - \theta \le 1.96\sigma_\theta\right) = \Pr\left(-1.96\sigma_\theta - \hat\theta \le -\theta \le 1.96\sigma_\theta - \hat\theta\right) = \Pr\left(-1.96\sigma_\theta + \hat\theta \le \theta \le 1.96\sigma_\theta + \hat\theta\right)$ We call this a 95% confidence interval.

25 If you construct it in this way, it will cover the true $\theta$ 95% of the time. It is important to remember that what is random here is $\hat\theta$ and not $\theta$. Thus it is the interval that is random, not the parameter inside it.
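
To see what "covers the true $\theta$ 95% of the time" means, here is a short simulation sketch (purely illustrative; the values of $\theta$ and $\sigma_\theta$ are made up). Each replication draws one estimate $\hat\theta \sim N(\theta, \sigma_\theta^2)$, forms $\hat\theta \pm 1.96\,\sigma_\theta$, and records whether that random interval contains the fixed $\theta$.

import numpy as np

rng = np.random.default_rng(1)
theta, sigma_theta = 2.0, 0.5   # hypothetical true parameter and sampling standard deviation
reps = 100_000

theta_hat = rng.normal(theta, sigma_theta, size=reps)   # one estimate per imaginary sample
lower = theta_hat - 1.96 * sigma_theta
upper = theta_hat + 1.96 * sigma_theta
print(np.mean((lower <= theta) & (theta <= upper)))     # fraction covering theta, close to 0.95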

26 Constructing a Confidence Interval for the Expected Value of a Normal Distribution Let's start with the case of $Y_i \sim N(\mu_Y, \sigma_Y^2)$ where $Y_i$ and $Y_j$ are independent for $i \ne j$, so $\mathrm{cov}(Y_i, Y_j) = 0$. The sample mean $\bar{Y}$ is $\bar{Y} = \frac{1}{N}\sum_{i=1}^N Y_i$ $\bar{Y}$ is a random variable; what is its distribution?

27 Since it is a linear combination of normal random variables, it's normal. Its mean is $E(\bar{Y}) = \mu_Y$ and its variance is $\mathrm{Var}(\bar{Y}) = \frac{\sigma_Y^2}{N}$

28 The standard error of an estimator is its standard deviation (i.e. the square root of its variance), so we also have $se(\bar{Y}) = \frac{\sigma_Y}{\sqrt{N}}$ Thus $\bar{Y} \sim N\left(\mu_Y, \frac{\sigma_Y^2}{N}\right)$ This is just a special case of what we learned above, with $\bar{Y}$ taking the place of $\hat\theta$ and $\mu_Y$ taking the place of $\theta$. Thus $0.95 = \Pr\left(-1.96\frac{\sigma_Y}{\sqrt{N}} + \bar{Y} \le \mu_Y \le 1.96\frac{\sigma_Y}{\sqrt{N}} + \bar{Y}\right)$

29 Confidence Intervals for Regression Coefficients Now we want to construct confidence intervals for regression coefficients. Once again these are just special cases of $\hat\theta$. Take $\beta_1$: $\hat\beta_1 \sim N(\beta_1, \mathrm{Var}(\hat\beta_1))$. Now $\beta_1$ takes the place of $\theta$ and $\hat\beta_1$ takes the place of $\hat\theta$, so that $0.95 = \Pr\left(-1.96\,se(\hat\beta_1) + \hat\beta_1 \le \beta_1 \le 1.96\,se(\hat\beta_1) + \hat\beta_1\right)$ Similarly for any of the other regression coefficients: $0.95 = \Pr\left(-1.96\,se(\hat\beta_j) + \hat\beta_j \le \beta_j \le 1.96\,se(\hat\beta_j) + \hat\beta_j\right)$ Let's look at some examples.

30 Our first example is from the RDCHEM data set. Let's just try the simple regression of profits on Research and Development expenditure. The output gives us $\hat\beta_1$ and $se(\hat\beta_1)$, with $N = 32$ and $K = 1$. We want to construct a 95% confidence interval, and $N - K - 1 = 30$.

31 The critical value for a 2-tailed t test with 30 degrees of freedom is 2.042. The confidence interval is $\left(-2.042\,se(\hat\beta_1) + \hat\beta_1,\; 2.042\,se(\hat\beta_1) + \hat\beta_1\right) = (2.146, 2.728)$ which is what Stata shows.
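
As a quick check of this arithmetic, the sketch below looks up the critical value $t_{0.025,30} \approx 2.042$ with scipy and builds the interval. The point estimate and standard error are illustrative stand-ins roughly consistent with the interval on the slide, not numbers taken from the RDCHEM output.

from scipy import stats

df = 30                                 # N - K - 1
crit = stats.t.ppf(0.975, df)           # two-tailed 5% critical value, about 2.042
beta_hat, se = 2.44, 0.14               # hypothetical estimate and standard error
print(round(crit, 3), (beta_hat - crit * se, beta_hat + crit * se))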

32 Now consider a regression of the hours worked in a year. Let's think about a 95% confidence interval for the coefficient on the hourly wage. The output gives us $\hat\beta_1$ and $se(\hat\beta_1)$, with $N = 428$ and $K = 5$. This gives us 422 degrees of freedom, which is large enough that this is approximately normal.

33 $\left(-1.96\,se(\hat\beta_1) + \hat\beta_1,\; 1.96\,se(\hat\beta_1) + \hat\beta_1\right) = (-45.12, 0.84)$

34 Next let's get a 90% confidence interval for the coefficient on the number of kids less than 6. Again we will be approximately normal, but for a 90% interval the critical value is 1.645: $\left(-1.645\,se(\hat\beta_4) + \hat\beta_4,\; 1.645\,se(\hat\beta_4) + \hat\beta_4\right)$ Note that this is narrower than the 95% interval. That makes sense.

35 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3 Hypothesis Testing 4 Testing Multiple Hypotheses at the same time 5 Examples

36 Hypothesis Testing First let's think more generally about what hypothesis testing is and then worry about statistical inference. Let $H_0$ be some hypothesis that you want to test. Suppose that it is true. We then ask whether the world seems consistent with it. Specifically: we perform some experiment and see if the results of the experiment are consistent with the hypothesis.

37 Examples 1 There is no gravity 2 I only have one set of keys and I left them in the car 3 I always wear socks We can easily design experiments and test these hypotheses

38 In statistics life isn't so easy. We can almost never reject for sure. That is, we very rarely know for sure that the null is definitely wrong. For example, consider the hypothesis $H_0: \mu_x = 0$. Even if $\bar{x}$ is far from zero, it is still possible (although unlikely) that the true value is 0.

39 In doing hypothesis testing we formalize these ideas. We must deal with the following: Type 1 error: $\alpha = \Pr(\text{rejecting } H_0 \text{ when it is true})$ We would like this to be zero, but in statistics it generally cannot be.

40 Now we need to construct a test statistic $t$ that has two features: 1 It depends on the data in a known way 2 We know its distribution when the null hypothesis is true We then choose an acceptance region $A$ so that when $t \notin A$ we reject the null hypothesis. We choose $A$ in such a way that $\Pr(\text{reject } H_0 \mid H_0 \text{ true}) = \alpha$

41 Let's look at some examples.

42 Example 1: Probability of Getting a Job Suppose you are out trying to get a job and you think that the probability of getting a job offer is one half. That is, $H_0: \Pr(J) = 0.5$. Now you go out and interview with a bunch of different firms until you get a job. You want to test the null hypothesis that the probability of getting a job was really 0.5, based on how long it takes you to find one. Let's define the test statistic $t$ to be the number of jobs for which you interview until you get one.

43 Is this a legitimate test statistic? It depends on the data. I know its distribution when the null hypothesis is true

44 To see the last part, notice that under the null:
Prob. job after 1 interview: 0.5
Prob. job after 2 interviews: 0.25
Prob. job after 3 interviews: 0.125
Prob. job after 4 interviews: 0.0625
Prob. job after 5 interviews: 0.03125
Prob. job after more than 5 interviews: 0.03125

45 How do we do this? Set $\alpha = 0.0625$ and choose our acceptance region $A = \{1, 2, 3, 4\}$. Thus we reject when we reach our fifth interview. The probability of getting to the fifth interview is the probability of being turned down in each of the first 4 interviews, which is $0.5^4 = 0.0625 = \alpha$.
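
A tiny sketch of this calculation (illustrative only), using the geometric distribution implied by the null:

# Under H0 the number of interviews until the first offer is geometric with p = 0.5
p = 0.5
probs = {k: (1 - p) ** (k - 1) * p for k in range(1, 6)}   # P(job on interview k)
prob_more_than_5 = (1 - p) ** 5                            # still no job after 5 interviews
alpha = (1 - p) ** 4    # P(reaching the 5th interview): the size of the test with A = {1, 2, 3, 4}
print(probs, prob_more_than_5, alpha)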

46 Example 2: Sample Mean Suppose $x_i \sim N(\mu_x, \sigma_x^2)$. Take the null hypothesis to be that $\mu_x$ is some particular value $x^0$: $H_0: \mu_x = x^0$. Now set $t = \frac{\bar{x} - x^0}{\hat{se}(\bar{x})}$

47 Does this satisfy our criteria? It depends on the data: we know $x^0$, and we can estimate $\bar{x}$ and $\hat{se}(\bar{x})$. We know its distribution under the null hypothesis: under the null $x_i \sim N(x^0, \sigma_x^2)$, so we know that $\bar{x} \sim N\left(x^0, \frac{\sigma_x^2}{N}\right)$, and from statistics class you learned that $t \sim T_{N-1}$ (that is, a t distribution with $N-1$ degrees of freedom).

48 Example 3: Regression Coefficient Suppose we want to test something about $\beta_1$. In particular, suppose we want to test $H_0: \beta_1 = \beta_1^0$ for some hypothesized value $\beta_1^0$.

49 What should we use as our test statistic? It needs to depend on the data: that is, given a set of $X$s and $Y$s, I need to be able to compute the test statistic. I also need to know its distribution under the null hypothesis. We know that under the assumptions of the classical linear regression model and under the null hypothesis, $\hat\beta_1 \sim N(\beta_1^0, sd(\hat\beta_1)^2)$, so $\frac{\hat\beta_1 - \beta_1^0}{sd(\hat\beta_1)} \sim N(0, 1)$ and $t = \frac{\hat\beta_1 - \beta_1^0}{se(\hat\beta_1)} \sim T_{N-K-1}$
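
In practice the t statistic comes straight out of any regression package; the course uses Stata, but here is a sketch of the same pieces in Python's statsmodels with simulated data and made-up coefficients. The reported t value is just $(\hat\beta_1 - 0)/se(\hat\beta_1)$, and testing a nonzero null such as $H_0: \beta_1 = 1$ only changes the numerator.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 100
df = pd.DataFrame({"x": rng.normal(size=n)})
df["y"] = 0.5 + 1.3 * df.x + rng.normal(size=n)   # hypothetical true model

res = smf.ols("y ~ x", data=df).fit()

t_zero = res.params["x"] / res.bse["x"]            # t statistic for H0: beta_1 = 0
t_one = (res.params["x"] - 1.0) / res.bse["x"]     # t statistic for H0: beta_1 = 1
print(t_zero, res.tvalues["x"])                    # these two agree
print(t_one)
print(res.t_test("x = 1"))                         # statsmodels runs the same test directly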

50 There is still an issue we have not dealt with at all. We have talked about how hypothesis testing works and what we need to do it. However, we have not talked at all about how we choose the acceptance region. At this point anything will work as long as the probability of rejection under the null is $\alpha$. In particular, if we know that our distribution is $T_{N-K-1}$ there are many ways to choose the region that will work. How do we choose the right one?

55 Power Our main criterion for picking the right region is the power: $\beta = \Pr(\text{Reject } H_0 \mid H_0 \text{ false})$ We want to choose the region so that, fixing the size $\alpha$, the power is as large as possible. Let's focus on the null hypothesis that $\beta_1 = 0$. Suppose in reality that the null is false and that instead $\beta_1 = 3$.

56 What is the distribution of the test statistic under this alternative? $t = \frac{\hat\beta_1}{se(\hat\beta_1)} = \frac{3}{se(\hat\beta_1)} + \frac{\hat\beta_1 - 3}{se(\hat\beta_1)} \sim \frac{3}{se(\hat\beta_1)} + T_{N-K-1}$ This is going to be a T distribution shifted to the right.

57 Let's think about the following four 95% acceptance regions: $A = \{-1.96 \le t \le 1.96\}$, $B = \{t \le 1.64\}$, $C = \{-1.64 \le t\}$, $D = \{t \le -0.06 \text{ or } t \ge 0.06\}$. Assuming that $N-K-1$ is large so that this is approximately normal, all 4 have the same size $\alpha = 0.05$.

58 That is, under the null hypothesis $\Pr(A \mid H_0) = \Pr(-1.96 \le t \le 1.96 \mid H_0) = 0.95$, $\Pr(B \mid H_0) = \Pr(t \le 1.64 \mid H_0) = 0.95$, $\Pr(C \mid H_0) = \Pr(-1.64 \le t \mid H_0) = 0.95$, and $\Pr(D \mid H_0) = \Pr(t \le -0.06 \text{ or } t \ge 0.06 \mid H_0) = 0.95$.
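
These sizes are easy to verify under the large-sample normal approximation. A short check with scipy's standard normal CDF (not part of the notes):

from scipy.stats import norm

p_A = norm.cdf(1.96) - norm.cdf(-1.96)          # about 0.95
p_B = norm.cdf(1.64)                            # about 0.95
p_C = 1 - norm.cdf(-1.64)                       # about 0.95
p_D = norm.cdf(-0.06) + (1 - norm.cdf(0.06))    # about 0.95
print(p_A, p_B, p_C, p_D)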

63 For this example, which region gives us the most power? Answer: region B. What about if under the alternative $\beta_1 = -3$? Region C. What if you are not really sure whether $\beta_1$ is positive or negative under the alternative? Then you would probably choose region A. Would you ever choose D? Not for any reason I can think of. Let's look at a couple of examples.
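
To make the power comparison concrete, here is a small sketch (all numbers hypothetical) that computes, under the normal approximation, the probability that each region rejects when the t statistic is centered at $3/se(\hat\beta_1)$, set here to 2 for illustration.

from scipy.stats import norm

shift = 2.0   # hypothetical value of 3 / se(beta_1_hat) under the alternative beta_1 = 3

# Power = probability of landing outside each acceptance region when t ~ N(shift, 1)
power_A = norm.cdf(-1.96, loc=shift) + 1 - norm.cdf(1.96, loc=shift)
power_B = 1 - norm.cdf(1.64, loc=shift)                             # reject when t > 1.64
power_C = norm.cdf(-1.64, loc=shift)                                # reject when t < -1.64
power_D = norm.cdf(0.06, loc=shift) - norm.cdf(-0.06, loc=shift)    # reject when |t| < 0.06
print(power_A, power_B, power_C, power_D)    # B has the most power for a positive shift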

64 Cigarette Smoking Does a higher price of cigarettes lead people to smoke less? We can write the null hypothesis as $H_0: \beta_1 = 0$. The interesting alternative is $\beta_1 < 0$, so it makes sense to use a one-sided test. $N-K-1$ is large enough that we will use the normal distribution.

65 Thus the critical value for a 5% one-sided test is $-1.645$, and we reject if $t < -1.645$. The test statistic works out to about 0.32 in absolute value, nowhere near the critical value, so we fail to reject the null. Alternatively we could use a two-sided test. Then we reject if $t < -1.96$ or if $t > 1.96$. We don't reject with either of these.

66 We can also try a one-sided test at the 10% level. The critical value here is $-1.282$. We are still nowhere close. You can see from Stata what the p-value is; you would never use a significance level that large.

67 Wine and Death Let's look at the relationship between wine consumed and death rates. Here you would not really know what to think, so a two-sided test makes more sense. We have 21 observations, so the degrees of freedom are 19. This gives a critical value for a 95% test of 2.093. Calculating, we get $t = -1.98$. Since $|t| < 2.093$, we fail to reject the null.

68 What about a 90% test? In that case the critical value is 1.729, so we reject. Again we can look at the p-value, which is about 0.062: we would reject with a test whose size is a bit larger than that, but we would fail to reject with a size of 0.061.
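
The critical values and the p-value here are one-liners with scipy; this sketch uses the numbers quoted on the slides (a t statistic of $-1.98$ with 19 degrees of freedom).

from scipy.stats import t

df = 19
t_stat = -1.98
crit_95 = t.ppf(0.975, df)                 # about 2.093
crit_90 = t.ppf(0.95, df)                  # about 1.729
p_two_sided = 2 * t.sf(abs(t_stat), df)    # about 0.062
print(crit_95, crit_90, p_two_sided)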

69 Even if you reject the null, does that mean that you should drink a lot of wine? Suppose (and I do not know the exact numbers but am making them up) you can either drink wine or drink vitamin water. Say that the doctor tells you drinking vitamin water will save 30 lives per 100,000. Is drinking wine as effective?

70 The null hypothesis now is $\beta_1 = -30$. The test statistic is now $t = \frac{\hat\beta_1 - (-30)}{se(\hat\beta_1)} = \frac{\hat\beta_1 + 30}{8.20} = 1.68$ The critical values are the same as before, so we fail to reject the null at both the 5% and 10% levels.

71 Testing Linear Combinations of Parameters Sometimes we want to test more complicated hypotheses involving more than one parameter at a time. One really nice example is work by Kane and Rouse. They want to ask: what is worth more, a year in community college or a year at a four-year college? They look at the regression $\log(wage)_i = \beta_0 + \beta_1 jc_i + \beta_2 univ_i + \beta_3 exper_i + u_i$ where $jc_i$ is the number of years of junior college, $univ_i$ is the number of years of university training, and $exper_i$ is experience.

72 It is interesting to test the null hypothesis $H_0: \beta_1 = \beta_2$. How do we do this? Note that we can write this as $H_0: \beta_1 - \beta_2 = 0$. Now it looks a bit like what we did before.

73 Remember that $\hat\beta_1 - \hat\beta_2$ is a sum of normal random variables, so it is normal, with $E(\hat\beta_1 - \hat\beta_2) = \beta_1 - \beta_2$ and $\mathrm{var}(\hat\beta_1 - \hat\beta_2) = \mathrm{var}(\hat\beta_1) + \mathrm{var}(\hat\beta_2) - 2\,\mathrm{cov}(\hat\beta_1, \hat\beta_2)$ Thus under the null hypothesis $\frac{\hat\beta_1 - \hat\beta_2}{\sqrt{\mathrm{var}(\hat\beta_1 - \hat\beta_2)}} \sim N(0, 1)$

74 and when we deal with the fact that we estimate the standard error, $\frac{\hat\beta_1 - \hat\beta_2}{\sqrt{\widehat{\mathrm{var}}(\hat\beta_1 - \hat\beta_2)}} \sim T_{N-K-1}$
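
Regression packages can compute this because they store the full covariance matrix of the estimates. Here is a sketch in Python with simulated data (the names jc, univ, and exper mimic the Kane and Rouse example, but the data and coefficients are made up), showing the manual calculation and statsmodels' built-in linear-combination test.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({
    "jc": rng.poisson(1, n),
    "univ": rng.poisson(2, n),
    "exper": rng.uniform(0, 30, n),
})
# made-up coefficients; the returns to jc and univ are deliberately similar
df["lwage"] = 1.5 + 0.06 * df.jc + 0.08 * df.univ + 0.01 * df.exper + rng.normal(0, 0.4, n)

res = smf.ols("lwage ~ jc + univ + exper", data=df).fit()

# Manual t statistic for H0: beta_jc - beta_univ = 0
V = res.cov_params()
diff = res.params["jc"] - res.params["univ"]
se_diff = np.sqrt(V.loc["jc", "jc"] + V.loc["univ", "univ"] - 2 * V.loc["jc", "univ"])
print(diff / se_diff)

# The same test done by statsmodels directly
print(res.t_test("jc - univ = 0"))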

75 There is no easy way to get this from the standard regression output in Stata, but there are two other ways to do it. The first way is to redefine the model slightly. What we care about testing is $\beta_1 - \beta_2$, so let's define $\theta_1 = \beta_1 - \beta_2$. Also define $totcol = jc + univ$, which is just the total number of years of college that a person has.

76 Then think about the population regression function: $\log(wage)_i = \beta_0 + \beta_1 jc_i + \beta_2 univ_i + \beta_3 exper_i + u_i = \beta_0 + (\theta_1 + \beta_2) jc_i + \beta_2 univ_i + \beta_3 exper_i + u_i = \beta_0 + \theta_1 jc_i + \beta_2 (jc_i + univ_i) + \beta_3 exper_i + u_i = \beta_0 + \theta_1 jc_i + \beta_2 totcol_i + \beta_3 exper_i + u_i$ Regressing $\log(wage)$ on $jc$, $totcol$, and $exper$ therefore delivers $\hat\theta_1$ and its standard error directly. This gives us a nice way of testing the restriction. It also makes sense intuitively.

77 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3 Hypothesis Testing 4 Testing Multiple Hypotheses at the same time 5 Examples

78 The other test one might want to do is to test more than one hypothesis at a time. For example, in the previous example you might be interested in testing the joint null hypothesis $H_0: \beta_1 = 0,\ \beta_2 = 0$. We don't know how to do this yet.

79 To get some intuition, let's think about comparing a restricted regression model with the unrestricted model. Let's write the unrestricted model as $\log(wage)_i = \beta_0^u + \beta_1^u jc_i + \beta_2^u univ_i + \beta_3^u exper_i + u_i^u$ The restricted model is $\log(wage)_i = \beta_0^r + \beta_3^r exper_i + u_i^r$. Under the null hypothesis these models are equivalent.

80 Intuitively: if the null hypothesis is true, the two models are the same. That means that when we include $jc_i$ and $univ_i$ in the model, the sum of squared residuals should not change much. However, if the null hypothesis is false, then at least one of $\beta_1$ or $\beta_2$ is nonzero and the sum of squared residuals should fall by a lot when we include these new variables. This seems like it could be the basis of a test.

81 If the sum of squared residuals changes a lot, we reject the null hypothesis. It only makes sense to do this as a one-sided test. It turns out that the following statistic works: $F = \frac{(SSR_r - SSR_u)/q}{SSR_u/(N-K-1)}$

82 Here $q$ is the difference in degrees of freedom between the restricted model and the unrestricted model. Equivalently, it is the number of restrictions that you are testing in the null hypothesis. In the case we are looking at, $q = 2$. Note that $F$ increases with $SSR_r$, so we reject when it is a big number.

84 Under the null this statistic has an F distribution with $q$ degrees of freedom in the numerator and $N-K-1$ degrees of freedom in the denominator. We can also write it in terms of $R^2$: $F = \frac{(SSR_r - SSR_u)/q}{SSR_u/(N-K-1)} = \frac{\left(\frac{SSR_r}{SST} - \frac{SSR_u}{SST}\right)/q}{\frac{SSR_u}{SST}/(N-K-1)} = \frac{(R_u^2 - R_r^2)/q}{(1 - R_u^2)/(N-K-1)}$

85 This gives us a really easy way to test the joint hypothesis that all of the coefficients in the regression other than the intercept are zero, that is, the null hypothesis $H_0: \beta_1 = \beta_2 = \cdots = \beta_K = 0$. What is the $R^2$ in the restricted model? It has to be zero. Thus the test statistic for this regression is just $F = \frac{R^2/q}{(1 - R^2)/(N-K-1)}$
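
As a numerical sketch (simulated data, not any of the course data sets), the overall F statistic computed from $R^2$ matches the one a regression package reports.

import numpy as np
import statsmodels.api as sm
from scipy.stats import f

rng = np.random.default_rng(4)
N, K = 300, 3
X = rng.normal(size=(N, K))
y = X @ np.array([0.5, 0.0, -0.3]) + rng.normal(size=N)   # made-up coefficients

res = sm.OLS(y, sm.add_constant(X)).fit()
R2, q, df_denom = res.rsquared, K, N - K - 1
F = (R2 / q) / ((1 - R2) / df_denom)
print(F, res.fvalue)                       # formula and reported F statistic agree
print(f.sf(F, q, df_denom), res.f_pvalue)  # so do the p-values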

86 Now back to the two-year college example. I said there was another way to do it. Well, there is nothing that says you can't use an F-test when you only have one restriction ($q = 1$). For the community college model the restricted model imposes $\beta_1 = \beta_2$, so we can write it as $\log(wage)_i = \beta_0 + \beta_1 jc_i + \beta_2 univ_i + \beta_3 exper_i + u_i = \beta_0 + \beta_1 jc_i + \beta_1 univ_i + \beta_3 exper_i + u_i = \beta_0 + \beta_1 (jc_i + univ_i) + \beta_3 exper_i + u_i = \beta_0 + \beta_1 totcol_i + \beta_3 exper_i + u_i$

87 From the two regressions we obtain $SSR_u$ and $SSR_r$, with $q = 1$ and $N - K - 1 = 6760$. Thus we get $F = \frac{(SSR_r - SSR_u)/q}{SSR_u/(N-K-1)} = 2.15$ The critical value for an F-test with 1 degree of freedom in the numerator and 6760 in the denominator is 3.84, so we fail to reject the null hypothesis.

88 Relationship between F and t tests It turns out that this F-test is closely related to the t-test. In fact, the t-statistic squared has an F distribution with 1 degree of freedom in the numerator. Note that $1.96^2 \approx 3.84$, the critical value of the F-test.
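
A quick numerical check of this relationship (illustrative only): squaring the two-sided 5% normal critical value reproduces the 5% critical value of an F distribution with 1 numerator degree of freedom and a large denominator.

from scipy.stats import norm, f

crit_t = norm.ppf(0.975)             # about 1.96
crit_F = f.ppf(0.95, 1, 1_000_000)   # 5% critical value of F(1, large)
print(crit_t ** 2, crit_F)           # both are about 3.84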

89 Testing in Stata Stata does things slightly differently than the way I presented it. Let's look at a number of examples.

90 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3 Hypothesis Testing 4 Testing Multiple Hypotheses at the same time 5 Examples

91 Example 1: Baseball Salaries Use the data mlb1 from Wooldridge. First test the overall significance of the regression: $F = \frac{R^2/q}{(1 - R^2)/(N-K-1)} = \frac{0.6278/5}{0.3722/347} \approx 117$ You can see that Stata did it for you already.

92 Test whether the performance variables are jointly zero, that is, $H_0: \beta_3 = \beta_4 = \beta_5 = 0$. Run the restricted model, which gives an $R^2$ of 0.5971. Then you get $F = \frac{(R_u^2 - R_r^2)/q}{(1 - R_u^2)/(N-K-1)} = \frac{(0.6278 - 0.5971)/3}{0.3722/347} = 9.54$ The critical value is 2.6, so we reject.
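
Packages do this restricted-versus-unrestricted comparison automatically. A sketch with simulated data (generic variable names, not the mlb1 data) showing that the $R^2$ formula and the built-in joint test give the same F statistic:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 353
df = pd.DataFrame(rng.normal(size=(n, 5)), columns=["x1", "x2", "x3", "x4", "x5"])
df["y"] = 1 + 0.4 * df.x1 + 0.3 * df.x2 + 0.1 * df.x3 + rng.normal(size=n)  # made-up model

unrestricted = smf.ols("y ~ x1 + x2 + x3 + x4 + x5", data=df).fit()
restricted = smf.ols("y ~ x1 + x2", data=df).fit()    # drops x3, x4, x5

# F statistic from the two R-squareds
q, df_denom = 3, unrestricted.df_resid
F = ((unrestricted.rsquared - restricted.rsquared) / q) / \
    ((1 - unrestricted.rsquared) / df_denom)
print(F)

# The same joint test done directly
print(unrestricted.f_test("x3 = 0, x4 = 0, x5 = 0"))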

93 Example 2: Firm Size and Participation Rate Let's look at the relationship between various things and participation rates in 401K plans. Does the type of plan matter? Does the size of the firm matter? Even though firm size is statistically significant, it is not economically significant.

94 Example 3: The Death Penalty The death penalty is an extremely controversial policy. Proponents claim it deters crime; opponents claim there is no evidence of this. Let's look at the evidence using death.do and murder.raw. Key variables: mrdrte (murders per 100,000 population) and exec (total executions, past 3 years).

95 One conclusion from this is that there is no evidence that the death penalty deters crime. In fact the point estimates go the other way: murder rises with executions.

96 "Therefore there is no argument for it." This is exactly the type of argument about which one has to be very careful. Just because a coefficient is insignificant doesn't mean it is zero. The confidence interval is (-0.218, 0.548), so the lower bound is about -0.2. Interpreting this number as causal would mean that one execution every three years would save 0.2 lives for every 100,000 people each year.

97 Let's try to figure out whether this is a big number or not. The state of Illinois has about 12,000,000 people. A coefficient of -0.2 would mean that every execution saves about $0.2 \times 3 \times \frac{12{,}000{,}000}{100{,}000} = 72$ lives. This seems like a big number to me.

98 We can't reject that the deterrent effect is zero. We also can't reject that it is really powerful. We also can't reject that it is really powerful but goes in the other direction. In other words this analysis is completely uninformative. I suspect gun control works in exactly the same way (but opposite politically).
