The Simple Linear Regression Model

1 The Simple Linear Regression Model
Lesson 3
Ryan Safner, Department of Economics, Hood College
ECON Econometrics, Fall 2017
Ryan Safner (Hood College) ECON Lesson 3 Fall / 77

2 Bivariate Data and Relationships
Many uses of statistics, especially in economics and business, investigate relationships between variables:
number of police and number of crimes; healthcare spending and average lifespan; government spending and GDP; sales and profits.
We will look at bivariate data: relationships between two variables (e.g. X and Y).
Our immediate aim is to explore associations between variables.
We can quantify associations with measures such as correlation and linear regression.
An eventual goal will be to examine causation, a very difficult thing to prove (that will require a course in econometrics).

3 Bivariate Data and Relationships
We examine many individuals with multiple variables in spreadsheets.
A row contains data about all variables for a single individual.
A column contains data about a single variable across all individuals.

4 Bivariate Data and Relationships
The most basic and helpful way to visualize the relationship between two quantitative variables is a scatterplot.
Each data point coordinate $(X_i, Y_i)$ is an individual observation (e.g. a country).

5 Bivariate Data and Relationships
On the horizontal axis we usually put the independent or explanatory variable (e.g. Economic Freedom Index).
On the vertical axis we usually put the dependent or response variable (e.g. GDP per capita).

6 Bivariate Data and Relationships
We want to look for an association between the independent and dependent variables based on the following factors:
1 Direction: is the trend positive or negative?
2 Form: is the trend linear, quadratic, something else, or no pattern?
3 Strength: is the association strong or weak?
4 Outliers: are there unusual data points that break the trends above?

7 Correlation
We want a way to quantify the strength of the association between two variables.
We can measure the sample correlation:
$r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{s_X} \right) \left( \frac{Y_i - \bar{Y}}{s_Y} \right)$
Notice each parenthetical is a standardized (i.e. Z) score for each variable, so equivalently:
$r_{X,Y} = \frac{\sum_{i=1}^{n} Z_X Z_Y}{n-1}$
Take each coordinate pair, standardize the X value and the Y value, multiply them, and average the products over $n - 1$.

8 Correlation
$r_{X,Y} = \frac{\sum_{i=1}^{n} Z_X Z_Y}{n-1}$
Correlation is standardized to be between -1 and 1.
Negative values imply a negative association.
Positive values imply a positive association.
A correlation of zero implies no linear association.
The closer $|r|$ is to 1, the stronger the association.
$|r| = 1$ implies a perfect straight line.
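
As a rough illustration of the z-score formula above, the following Python sketch standardizes each variable, multiplies the scores, and divides by n - 1. The function name `sample_correlation` and the demo numbers are made up for this example and are not part of the lecture.

```python
import numpy as np

def sample_correlation(x, y):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Standardize with the sample standard deviation (divisor n - 1).
    z_x = (x - x.mean()) / x.std(ddof=1)
    z_y = (y - y.mean()) / y.std(ddof=1)
    return float(np.sum(z_x * z_y) / (n - 1))

# Hypothetical illustration data (not from the lecture).
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
r_demo = sample_correlation(x_demo, y_demo)
print(round(r_demo, 4))
```

The result should agree with NumPy's built-in `np.corrcoef(x_demo, y_demo)[0, 1]`, which is a convenient cross-check.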

9 Correlation

10 Correlation
Guess The Correlation Game

11 Correlation and Endogeneity
Correlation does not imply causation!
There is no way to conclude from correlation alone that X causes Y.
There may be lurking or confounding variables (e.g. Z) that simultaneously affect both X and Y.
There may be simultaneous or reverse causation (e.g. maybe Y causes X!).
Most of econometrics deals with trying to properly identify causal effects by controlling for lurking variables.

12 Correlation Example
The correlation between Life Expectancy and Doctors Per Person is …
So should we send more doctors to developing countries to increase their life expectancy?
Income? Living standards?

13 Linear Regression
If an association is roughly linear, we can estimate a line that would fit the data.
Recall a linear equation describing a line can be written as: $Y = a + bX$
a: vertical intercept
b: slope of the line
Note we will use different symbols for a and b, in line with standard econometric notation.
How do we find the line that best fits the data? By linear regression.

14 The Population Linear Regression Model
Linear regression lets us estimate the slope of the population regression line between two variables, X and Y.
We can then make inferences about the population slope coefficient.
Ultimately, we want to estimate the causal effect on Y of a unit change in X, $\frac{\Delta Y}{\Delta X}$, i.e. for a one unit change in X, how many units will this cause Y to change?
First, we will focus on fitting a straight line to data on X and Y.

15 The Population Linear Regression Model
The statistical analyses for linear regression are the same as the ones we looked at for estimating population means, proportions, or differences in means:
Estimation: fit a line through data to estimate population relationships (slope).
Hypothesis testing: test if the true slope is a certain value.
Confidence intervals: construct a confidence interval for the true slope.

16 The Population Linear Regression Model
Example: What is the relationship between class size and educational performance?
Policy question: What is the effect of reducing class sizes by 1 student per class on test scores? By 10 students?

17 The Population Linear Regression Model
Example: What is the relationship between class size and educational performance?
[Scatterplot: Test Score against Student to Teacher Ratio]

18 The Population Linear Regression Model
If we change the class size by an amount, what would we expect the change in test scores to be?
$\beta_{ClassSize} = \frac{\text{change in test score}}{\text{change in class size}} = \frac{\Delta \text{test score}}{\Delta \text{class size}}$
If we knew $\beta_{ClassSize}$, we could say that increasing (decreasing) class size by 1 student will change test scores by (negative) $\beta_{ClassSize}$.

19 The Population Linear Regression Model
Rearranging: $\Delta \text{test score} = \beta_{ClassSize} \times \Delta \text{class size}$
Suppose $\beta_{ClassSize} = -0.6$. If we shrank class size by 2 students, the model predicts: $\Delta \text{test score} = (-0.6) \times (-2) = 1.2$
The line relating class size and test scores has the equation: $\text{test score} = \beta_0 + \beta_{ClassSize} \times \text{class size}$
$\beta_0$ is the vertical intercept, the test score where class size is 0.
$\beta_{ClassSize}$ is the slope of the regression line.
This relationship only holds on average for all districts in the population; individual districts are also affected by other factors.

20 The Population Linear Regression Model
To get an equation that holds true for each district, we need to include other factors:
$\text{test score} = \beta_0 + \beta_{ClassSize} \times \text{class size} + \text{other factors}$
For now, we will ignore these until the next lesson.
Thus, $\beta_0 + \beta_{ClassSize} \times \text{class size}$ gives the average effect of class sizes on scores.
Later, we will want to estimate the marginal effects of each factor on a district's test score, holding all other factors constant.

21 The Population Linear Regression Model
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon$
Y is the dependent variable of interest, AKA response variable, regressand, left-hand side (LHS) variable.
$X_1$ and $X_2$ are independent variables, AKA explanatory variables, regressors, right-hand side (RHS) variables, covariates, control variables.
We have observed values of $y$, $x_1$, and $x_2$, and regress $y$ on $x_1$ and $x_2$.
$\beta_0$, $\beta_1$, and $\beta_2$ are unknown parameters to estimate.
$\epsilon$ is the error term, incorporating all other factors that affect Y.
It is stochastic (random).
We can never measure the error term, only make assumptions about it.

22 The Population Linear Regression Model
How do we draw a line through the scatterplot?
We do not know the true $\beta_{ClassSize}$.
We do have data from a sample of class sizes and test scores.
So the real question is, how can we estimate $\beta_0$ and $\beta_1$?

23 The Ordinary Least Squares Estimators
[Figure: scatterplot with a fitted line, marking an observation $(X_i, Y_i)$, its predicted value $\hat{Y}_i$, and the residual between them]
Suppose we have a scatterplot of points $(X_i, Y_i)$.
We can draw a line of best fit through our scatterplot.
The residual $\hat{\epsilon}_i$ of each data point is the difference between the actual and the predicted value of Y given X: $\hat{\epsilon}_i = Y_i - \hat{Y}_i$
If we square each residual and add them all up, we get the Sum of Squared Errors (SSE):
$SSE = \sum_{i=1}^{n} \hat{\epsilon}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
The line of best fit minimizes SSE.

24 The Ordinary Least Squares Estimators
The ordinary least squares (OLS) estimators of the unknown population parameters $\beta_0$ and $\beta_1$ solve the calculus problem of minimizing SSE:
$\min_{\beta_0, \beta_1} \sum_{i=1}^{n} \left[ Y_i - \underbrace{(\beta_0 + \beta_1 X_i)}_{\hat{Y}_i} \right]^2$
OLS estimators minimize the average squared distance between the actual values ($Y_i$) and the predicted values ($\hat{Y}_i$) along the estimated regression line.

25 The OLS Regression Line
The OLS regression line or sample regression line is the linear function constructed using the OLS estimators: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
$\hat{\beta}_0$ and $\hat{\beta}_1$ ("beta 0 hat" and "beta 1 hat") are the OLS estimators of population parameters $\beta_0$ and $\beta_1$ using sample data.
The predicted value of Y given X, based on the regression, is $\hat{Y}_i$, our estimate of $E(Y_i | X_i)$.
The residual or prediction error for the $i$th observation is the difference between observed $Y_i$ and its predicted value: $\hat{\epsilon}_i = Y_i - \hat{Y}_i$

26 The OLS Regression Estimators
The solution to the calculus problem yields:
For $\hat{\beta}_0$: $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
For $\hat{\beta}_1$: $\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{s_{XY}}{s_X^2} = \frac{cov(X, Y)}{var(X)}$
Equivalently ($r_{X,Y}$ is the correlation coefficient between X and Y): $\hat{\beta}_1 = r_{X,Y} \frac{s_Y}{s_X}$
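
The closed-form estimators above can be sketched in a few lines of Python. This is only an illustration: the function name `ols_simple` and the four data points are invented for the example, and the last lines check the equivalent form of the slope written in terms of the correlation coefficient.

```python
import numpy as np

def ols_simple(x, y):
    """Closed-form OLS estimates (intercept, slope) for one regressor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Slope: sum of cross-deviations over sum of squared X deviations.
    beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept: the fitted line passes through (x_bar, y_bar).
    beta0_hat = y_bar - beta1_hat * x_bar
    return float(beta0_hat), float(beta1_hat)

# Hypothetical illustration data (not from the lecture).
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 3.0, 2.0, 4.0])
b0, b1 = ols_simple(x_demo, y_demo)

# Equivalent slope via the correlation coefficient: r * s_Y / s_X.
r = np.corrcoef(x_demo, y_demo)[0, 1]
b1_from_r = r * y_demo.std(ddof=1) / x_demo.std(ddof=1)
print(b0, b1, b1_from_r)
```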

27 The Ordinary Least Squares Estimators
[Figure: the fitted line $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$, with slope $\hat{\beta}_1 = \frac{\Delta Y}{\Delta X}$ and vertical intercept $\hat{\beta}_0$]

28 The Population Regression Model
[Scatterplot: Test Score against Student to Teacher Ratio]
Population regression line: $\text{Test score} = \beta_0 + \beta_1 STR$
$\beta_1 = \frac{\Delta \text{test score}}{\Delta STR} = ??$

29 The Sample OLS Regression Model
[Scatterplot: Test Score against Student to Teacher Ratio, with the fitted OLS line]
Using OLS, we find: $\widehat{\text{Test Score}} = 698.93 - 2.28 \, STR$

30 The Sample OLS Regression Model
$\widehat{\text{Test Score}} = 698.93 - 2.28 \, STR$
Estimated slope: $\hat{\beta}_1 = \frac{\Delta \text{test score}}{\Delta STR} = -2.28$
Estimated intercept: $\hat{\beta}_0 = 698.93$
The intercept is not economically meaningful; it just extrapolates the line from the data.
Literally, districts with an STR of 0 have a predicted test score of 698.93.

31 The Sample OLS Regression Model
$\widehat{\text{Test Score}} = 698.93 - 2.28 \, STR$
We can now make predictions with our model.
For a district with 20 students per teacher, the predicted test score is $698.93 - 2.28(20) = 653.33$.
Is this estimate big or small? How economically meaningful is it?
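
To make the prediction arithmetic concrete, here is a small Python sketch that plugs values into the estimated line, using the coefficients reported in the Stata slide (intercept 698.93, slope -2.280). The function names are hypothetical helpers for this example only.

```python
BETA0_HAT = 698.93  # estimated intercept reported in the lecture
BETA1_HAT = -2.28   # estimated slope on STR reported in the lecture

def predict_test_score(student_teacher_ratio):
    """Predicted test score from the estimated regression line."""
    return BETA0_HAT + BETA1_HAT * student_teacher_ratio

def predicted_change(delta_str):
    """Predicted change in test score for a change in STR of delta_str."""
    return BETA1_HAT * delta_str

pred_20 = predict_test_score(20)      # district with 20 students per teacher
change_minus_2 = predicted_change(-2)  # shrinking classes by 2 students
print(round(pred_20, 2), round(change_minus_2, 2))
```

Note that the second helper only uses the slope: the intercept drops out when comparing two districts that differ in STR.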

32 The Sample OLS Regression Model: In Stata
If we plug this into Stata and run OLS, this is the output.
Highlighted in red: top row: coefficient for STR ($\hat{\beta}_1 = -2.280$); bottom row: intercept (constant) ($\hat{\beta}_0 = 698.93$).
$\widehat{\text{Test score}} = 698.93 - 2.28 \, STR$

33 The Sample OLS Regression Model
[Scatterplot: Test Score against Student to Teacher Ratio, with the Richmond district labelled; Stock & Watson (2015: p. 113)]
One district in the sample is Richmond, with STR = … and Test Score = …
Predicted value: $\hat{Y}_{Richmond} = \ldots (20.00) = \ldots$
Residual: $\hat{\epsilon}_{Richmond} = \ldots = 20.0$

34 The Sample OLS Regression Model: In Stata
If we plug this into Stata and run OLS, this is the output.
Highlighted in red: top row: coefficient for STR ($\hat{\beta}_1 = -2.280$); bottom row: intercept (constant) ($\hat{\beta}_0 = 698.93$).
$\widehat{\text{Test score}} = 698.93 - 2.28 \, STR$

35 Measures of Fit: $R^2$
How well does a line fit data? How much variation in $Y_i$ is explained by the model? How tightly clustered around the regression line are the observations?
The primary measure of fit is the regression $R^2$, the fraction of the sample variance of $Y_i$ explained (predicted) by $\hat{Y}_i$.
$Y_i = \hat{Y}_i + \hat{\epsilon}_i$
Observed values of the dependent variable are the sum of the predicted values and the residuals (errors).
Recall OLS has chosen a model specifically to minimize SSE.

36 Measures of Fit: $R^2$
$R^2 = \frac{ESS}{TSS}$
$R^2$ is the ratio of the sample variance of the model ($\hat{Y}_i$) to the sample variance of the observations ($Y_i$), ranging from 0 to 1.
Explained Sum of Squares (ESS): sum of squared deviations of the predicted values from their mean: $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
Total Sum of Squares (TSS): sum of squared deviations of the observed values from their mean: $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$

37 Measures of Fit: $R^2$
Alternatively, $R^2$ can be written in terms of the fraction of the variance of the observations $Y_i$ not explained by the model: $R^2 = 1 - \frac{SSE}{TSS}$
Sum of Squared Errors (SSE), recall: $SSE = \sum_{i=1}^{n} \hat{\epsilon}_i^2$
Note, you may see this called the sum of squared residuals (SSR).
Lastly, the $R^2$ of the regression is also equal to the square of the correlation coefficient between X and Y: $R^2 = (r_{XY})^2$
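
The three expressions for $R^2$ can be checked against each other numerically. The sketch below fits a line by the closed-form OLS formulas and then computes ESS/TSS, 1 - SSE/TSS, and the squared correlation; the function name and the data are made up for illustration.

```python
import numpy as np

def r_squared_three_ways(x, y):
    """Return R^2 computed as ESS/TSS, 1 - SSE/TSS, and r^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Closed-form OLS fit for a single regressor.
    beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    beta0 = y.mean() - beta1 * x.mean()
    y_hat = beta0 + beta1 * x
    ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
    tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
    sse = np.sum((y - y_hat) ** 2)         # sum of squared errors
    r = np.corrcoef(x, y)[0, 1]
    return float(ess / tss), float(1 - sse / tss), float(r ** 2)

# Hypothetical illustration data (not from the lecture).
r2_a, r2_b, r2_c = r_squared_three_ways([0, 1, 2, 3], [1, 3, 2, 4])
print(r2_a, r2_b, r2_c)
```

All three numbers coincide (up to floating-point rounding) because TSS = ESS + SSE holds for an OLS fit that includes an intercept.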

38 Measures of Fit: SER
The standard error of the regression (SER, $\hat{\sigma}$, or $\hat{\sigma}_\epsilon$) is an estimator of the standard deviation of $\epsilon_i$:
$\hat{\sigma} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \hat{\epsilon}_i^2}$
It measures the spread of the observations around the regression line, the average size of the residual error.
The degrees-of-freedom correction of $n - 2$ reflects the 2 degrees of freedom used to estimate $\beta_0$ and $\beta_1$.
Stata reports this as the Root Mean Squared Error (Root MSE), computed with the same $n - 2$ correction; some texts define a root mean squared error that divides by $n$ instead.
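
A minimal Python sketch of the SER calculation, assuming the single-regressor model so that the degrees-of-freedom correction is n - 2. The function name and the data are illustrative only.

```python
import numpy as np

def standard_error_of_regression(x, y):
    """SER = sqrt(SSE / (n - 2)) for a one-regressor OLS fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    beta0 = y.mean() - beta1 * x.mean()
    residuals = y - (beta0 + beta1 * x)
    sse = np.sum(residuals ** 2)
    # Two degrees of freedom are used up by the intercept and the slope.
    return float(np.sqrt(sse / (n - 2)))

# Hypothetical illustration data (not from the lecture).
ser_demo = standard_error_of_regression([0, 1, 2, 3], [1, 3, 2, 4])
print(round(ser_demo, 4))
```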

39 Measures of Fit: Example
The $R^2$ of the regression is 0.051, so 5.1% of the variation in Test Scores is explained by the variation in Student-Teacher Ratios.
The SER ("Root MSE") is 18.6, the standard deviation of the residuals.
Large spread from the line means predictions will be off by a lot!
This indicates there are other important factors that also influence test scores.
Note: it is very rare in econo(metr)ics that we get very high $R^2$ values, due to tons of unobserved variables affecting economic outcomes. Don't get discouraged!

40 Measures of Fit: Looking at Residuals
[Scatterplot: Test Score against Student to Teacher Ratio, with the fitted line]
Recall that for every data point, the equation of the line constructs a predicted value $\hat{Y}_i$ given $X_i$.
This is different from the actual value of $Y_i$.
The difference (positive or negative) between the two is the residual error.

41 Measures of Fit: Looking at Residuals
[Residual plot: Residual against Student to Teacher Ratio]
We can construct a residual plot to examine the residuals ($\hat{\epsilon}_i = Y_i - \hat{Y}_i$).
Stronger relationships should have small residuals, with data points more tightly concentrated around the regression line.
We now turn to the question of quantifying just how well the line fits the data.

42 Distribution of OLS Estimators
OLS estimators ($\hat{\beta}_0$, $\hat{\beta}_1$) are computed from a specific sample of data.
Two sources of randomness in our estimate:
1 Sampling randomness: different samples will generate different OLS estimators.
2 Modeled randomness: $\epsilon$ includes all factors affecting Y other than X, and different samples have different values of those other factors.
Thus, $\hat{\beta}_0$ and $\hat{\beta}_1$ are also random variables, with their own sampling distributions.

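43 Distribution of OLS Estimators
The central limit theorem allows us to say that the sampling distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$ are approximately normal in large samples.
It is generally agreed that $n > 100$ is sufficient.
$\hat{\beta}_1 \sim N(\beta_1, \sigma_{\hat{\beta}_1})$
[Figure: normal sampling distribution of $\hat{\beta}_1$ centered at $\beta_1$]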

44 Distribution of OLS Estimators
Similar to sampling distributions of sample means ($\bar{X}$).
We care about the sampling distribution of $\hat{\beta}_1$ ($\hat{\beta}_0$ is less useful).
What is $E[\hat{\beta}_1]$? (Where is the center?)
What is $var[\hat{\beta}_1]$? (How precise is our estimate?)
[Figure: sampling distribution of $\hat{\beta}_1$ centered at $\beta_1$]

45 Exogeneity and Unbiasedness
We want to see if $\hat{\beta}_1$ is unbiased: there is no systematic difference, on average, between sample values of $\hat{\beta}_1$ and the true population $\beta_1$, i.e. $E[\hat{\beta}_1] = \beta_1$.
This doesn't mean every sample gives us $\hat{\beta}_1 = \beta_1$, only that the estimation procedure will, on average, yield the correct value.
On average, random errors above and below the true value cancel.
A long story short: $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$, i.e. $E[\hat{\beta}_1] = \beta_1$, when X is exogenous (see handouts for proof).

46 Exogeneity and Unbiasedness
Recall, an independent variable (X) is exogenous if it is unrelated to other factors affecting Y, i.e. $corr(X, \epsilon) = 0$.
Technically, this is called the Zero Conditional Mean Assumption: $E(\epsilon | X) = 0$
For any known value of X, the expected value of $\epsilon$ is 0.
Knowing the value of X must tell us nothing about the value of $\epsilon$ (anything else relevant to Y other than X).
We can then confidently assert causation: $X \rightarrow Y$

47 Endogeneity and Bias
Nearly all independent variables are endogenous: they are related to the error term $\epsilon$, so $corr(X, \epsilon) \neq 0$.
Suppose we estimate the following relationship: $\text{Violent crimes}_t = \beta_0 + \beta_1 \text{Ice cream sales}_t + \epsilon_t$
We find $\hat{\beta}_1 > 0$.
Does this mean Ice cream sales $\rightarrow$ Violent crimes? Tell me a story.

48 Endogeneity and Bias
The true expected value of $\hat{\beta}_1$ is actually (see handouts for proof):
$E[\hat{\beta}_1] = \beta_1 + corr(X, \epsilon) \frac{\sigma_\epsilon}{\sigma_X}$
Takeaways:
If X is exogenous, $corr(X, \epsilon) = 0$ and we're just left with $\beta_1$.
The larger $corr(X, \epsilon)$ is, the larger the bias $(E[\hat{\beta}_1] - \beta_1)$.
We can also sign the direction of the bias based on $corr(X, \epsilon)$:
Positive $corr(X, \epsilon)$ overestimates the true $\beta_1$ ($\hat{\beta}_1$ is too high).
Negative $corr(X, \epsilon)$ underestimates the true $\beta_1$ ($\hat{\beta}_1$ is too low).
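
The bias formula can be illustrated with a small simulation. This is a sketch under invented assumptions, not the lecture's own example: the error term and the exogenous part of X are independent standard normals, the true slope is 2, and the endogenous regressor is built as `v + 0.5 * u` so that it is positively correlated with the error. With one large sample, the estimated slope should land close to the value the formula predicts.

```python
import numpy as np

def ols_slope(x, y):
    """Closed-form OLS slope for a single regressor."""
    return float(np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2))

rng = np.random.default_rng(0)
n = 200_000
beta0, beta1 = 1.0, 2.0          # hypothetical true parameters

u = rng.normal(size=n)           # error term epsilon, sigma_eps = 1
v = rng.normal(size=n)           # part of X unrelated to epsilon

x_exog = v                       # corr(X, eps) = 0
x_endog = v + 0.5 * u            # cov(X, eps) = 0.5, var(X) = 1.25

slope_exog = ols_slope(x_exog, beta0 + beta1 * x_exog + u)
slope_endog = ols_slope(x_endog, beta0 + beta1 * x_endog + u)

# Bias predicted by the formula: corr(X, eps) * sigma_eps / sigma_X.
rho = 0.5 / np.sqrt(1.25)
predicted_bias = rho * 1.0 / np.sqrt(1.25)
print(slope_exog, slope_endog, beta1 + predicted_bias)
```

In the exogenous case the slope is close to 2; in the endogenous case it is pushed upward by roughly the predicted bias, matching the "positive correlation overestimates" takeaway.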

49 Endogeneity and Bias
Example: $wages = \beta_0 + \beta_1 educ + \epsilon$
Is this an accurate reflection of educ $\rightarrow$ wages?
Does $E[\epsilon | educ] = 0$?
What would $E[\epsilon | educ] > 0$ mean?

50 Endogeneity and Bias
Example: $\text{per capita cigarette consumption} = \beta_0 + \beta_1 \text{State cig tax rate} + \epsilon$
Is this an accurate reflection of tax $\rightarrow$ cons?
Does $E[\epsilon | tax] = 0$?
What would $E[\epsilon | tax] > 0$ mean?

51 Exogeneity and RCTs
Think about an idealized randomized controlled experiment.
Subjects are randomly assigned to a treatment or control group.
This implies that knowing whether someone is treated (X) tells us nothing about their personal characteristics ($\epsilon$).
Random assignment makes $\epsilon$ independent of X, so $corr(X, \epsilon) = 0$ and $E[\epsilon | X] = 0$.

52 Precision and Variance of $\hat{\beta}_1$
So we know the center and expected value of the sampling distribution of $\hat{\beta}_1$ is $\beta_1$.
What about the spread or variance?
[Figure: two sampling distributions of $\hat{\beta}_1$ centered at $\beta_1$, one with small variance and one with large variance]

53 Precision and Variance of $\hat{\beta}_1$
The variance of $\hat{\beta}_1$ measures how precise our estimate of the slope is:
$var(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{n \cdot var(X)}$
where $\hat{\sigma}^2$ is the variance of the regression:
$\hat{\sigma}^2 = \frac{SSE}{n-2} = \frac{1}{n-2} \sum_{i=1}^{n} \hat{\epsilon}_i^2$
Recall we've seen the standard error of the regression (SER) $\hat{\sigma}$ as the square root of this!
The standard error of $\hat{\beta}_1$ is the square root of the variance:
$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{n \cdot var(X)}}$

54 Precision and Variance of $\hat{\beta}_1$
The variance of $\hat{\beta}_1$ measures how precise our estimate of the slope is:
$var(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{n \cdot var(X)}$
The variance of $\hat{\beta}_1$ is affected by three things:
Model fit, measured by the variance (or S.E.) of the regression $\hat{\sigma}^2$: larger $\hat{\sigma}^2$, larger $var(\hat{\beta}_1)$.
Sample size $n$: larger $n$, lower $var(\hat{\beta}_1)$.
Variation in X: larger $var(X)$, lower $var(\hat{\beta}_1)$.
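
As a sketch of the homoskedasticity-only formula, the Python below computes $se(\hat{\beta}_1)$ from the regression variance, the sample size, and the variation in X. One assumption is made explicit in the code: var(X) is taken with divisor n, so that n times var(X) equals the sum of squared deviations of X. The function name and data are illustrative.

```python
import numpy as np

def slope_standard_error(x, y):
    """Homoskedasticity-only standard error of the OLS slope."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    beta1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    beta0 = y.mean() - beta1 * x.mean()
    residuals = y - (beta0 + beta1 * x)
    sigma2_hat = np.sum(residuals ** 2) / (n - 2)  # variance of the regression
    var_x = sxx / n  # assumption: variance of X with divisor n
    var_beta1 = sigma2_hat / (n * var_x)
    return float(np.sqrt(var_beta1))

# Hypothetical illustration data (not from the lecture).
se_demo = slope_standard_error([0, 1, 2, 3], [1, 3, 2, 4])
print(round(se_demo, 4))
```

Spreading the X values further apart (larger var(X)) or adding observations shrinks the result, in line with the three factors listed above.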

55 Precision and Variance of $\hat{\beta}_1$
[Scatterplot: Test Score against Student to Teacher Ratio, with a subset of observations shown as light dots]
Smaller $var(X_i)$ (light dots), larger $var(\hat{\beta}_1)$: harder to determine a precise slope!
Larger $var(X_i)$ (all dots), smaller $var(\hat{\beta}_1)$: easier to determine a precise slope!

56 Solvable Problems: Heteroskedasticity and Autocorrelation
Now we want to look at how the errors ($\hat{\epsilon}_i$) are distributed.
Homoskedastic if errors have the same variance over all levels of X: $var(\hat{\epsilon} | X) = \hat{\sigma}^2_\epsilon$
Heteroskedastic if errors have different variance over different levels of X: $var(\hat{\epsilon} | X) \neq \hat{\sigma}^2_\epsilon$
Heteroskedasticity will not cause $\hat{\beta}_1$ to be biased!
But it does distort the estimated variance of $\hat{\beta}_1$, which can overstate statistical significance in hypothesis tests!

57 Heteroskedasticity
We've already seen the standard error of $\hat{\beta}_1$: $se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{n \cdot var(X)}}$
But this assumes errors are homoskedastic.
When errors are heteroskedastic, the standard error becomes:
$se(\hat{\beta}_1) = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2 \hat{\epsilon}_i^2}{\left[ \sum_{i=1}^{n} (X_i - \bar{X})^2 \right]^2}}$
This is known as the heteroskedasticity-robust (or just "robust") standard error of $\hat{\beta}_1$.
No need to memorize the formula, there is an easy fix in Stata, but know what robust standard errors are and when we need them!
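
The robust formula above can be transcribed directly into Python. This sketch implements exactly the expression on the slide, with no finite-sample adjustment; statistical packages often apply a small degrees-of-freedom correction on top, so their reported robust standard errors may differ slightly from this value. The function name and data are illustrative.

```python
import numpy as np

def robust_slope_standard_error(x, y):
    """Heteroskedasticity-robust SE of the OLS slope (no small-sample correction)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dev_sq = (x - x.mean()) ** 2
    beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum(dev_sq)
    beta0 = y.mean() - beta1 * x.mean()
    resid_sq = (y - (beta0 + beta1 * x)) ** 2
    # Each squared residual is weighted by its squared X deviation.
    var_beta1 = np.sum(dev_sq * resid_sq) / np.sum(dev_sq) ** 2
    return float(np.sqrt(var_beta1))

# Hypothetical illustration data (not from the lecture).
se_robust_demo = robust_slope_standard_error([0, 1, 2, 3], [1, 3, 2, 4])
print(round(se_robust_demo, 4))
```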

58 Homoskedasticity
[Figure: Stock & Watson (2011: 156)]
$E[\epsilon | X] = 0$ (exogenous)
Variance of $\epsilon$ does not depend on X.

59 Heteroskedasticity
[Figure: Stock & Watson (2011: 156)]
$E[\epsilon | X] = 0$ (exogenous)
Variance of $\epsilon$ does depend on X.

60 Variance of the Errors
[Figure: Stock & Watson (2011: 162)]
Homoskedastic or heteroskedastic?

61 Heteroskedasticity-Robust Standard Errors
Using the robust option, Stata computes heteroskedasticity-robust standard errors; otherwise Stata defaults to homoskedasticity-only standard errors!

62 Outliers
An outlier is an observation that is strongly different from the rest of the sample.
Outliers may bias our OLS estimates by having a strong influence on the shape of the line.
[Scatterplot of Test Score with an outlying observation]

63 Outliers
Often outliers are simply the result of human error in recording data, e.g. suppose you are recording height in inches (e.g. 60) and accidentally record one person's height in centimeters (e.g. 130).
Outliers may also be important and valid parts of the observed effect.
Always check your data!
Make a scatterplot and look for weird data points.
Run different models and see how adding or dropping outliers affects the OLS estimates.
In Stata: dfbeta command.

64 Hypothesis Testing: Overview
Objective: test a hypothesis ($\beta_1 = \beta_{1,0}$) using data.
$H_0$ and two-sided alternative: $H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 \neq \beta_{1,0}$
$H_0$ and one-sided alternative: $H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 > \beta_{1,0}$

65 Hypothesis Testing: Overview
General approach: construct a t-statistic and compute the p-value (or compare to the N(0, 1) critical value):
$t = \frac{\text{estimator} - \text{hypothesized value}}{\text{standard error of the estimator}}$
Again, $SE(\text{estimator}) = \sqrt{var(\text{estimator})}$.
Recall, for testing the mean of Y: $t = \frac{\bar{Y} - \mu_{Y,0}}{SE(\bar{Y})}$
For testing $\beta_1$: $t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}$
where $SE(\hat{\beta}_1)$ is the square root of an estimator of the variance of the sampling distribution of $\hat{\beta}_1$.

66 Hypothesis Testing: Overview
Construct the t-statistic: $t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}$
Reject $H_0$ at the 5% significance level if $|t| > 1.96$.
p-value $= P[|t^*| > |t|]$, the probability in the tails of the normal distribution beyond the computed $|t|$; reject $H_0$ if $p < 0.05$.
For large samples, the t-statistic is distributed as a standard normal: p-value $= P(|Z| > |t|) = 2\Phi(-|t|)$

67 Hypothesis Testing: Example
Estimated regression line: $\widehat{TestScore} = 698.93 - 2.28 \, STR$
Stata estimates the SEs: $SE(\hat{\beta}_0) = 10.4$, $SE(\hat{\beta}_1) = 0.52$
$H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
t-statistic: $t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)} = \frac{-2.28 - 0}{0.52} = -4.38$
Since $|t| = 4.38 > 1.96$ for $\alpha = 0.05$, we can reject $H_0$.
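
The test in this example can be reproduced with the Python standard library. The sketch below uses the slope and standard error reported in the lecture (-2.28 and 0.52) and the large-sample normal approximation for the p-value; the function names are hypothetical helpers.

```python
import math

def t_statistic(estimate, hypothesized, standard_error):
    """t = (estimator - hypothesized value) / standard error."""
    return (estimate - hypothesized) / standard_error

def two_sided_p_value(t):
    """Large-sample two-sided p-value: 2 * Phi(-|t|) = erfc(|t| / sqrt(2))."""
    return math.erfc(abs(t) / math.sqrt(2))

t_stat = t_statistic(-2.28, 0.0, 0.52)  # slope and SE reported in the lecture
p_value = two_sided_p_value(t_stat)
reject_h0 = abs(t_stat) > 1.96          # 5% two-sided critical value
print(round(t_stat, 2), p_value, reject_h0)
```

The p-value is far below 0.05, which is why regression output that rounds to three decimals displays it as 0.000.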

68 Confidence Intervals
Recall that a 95% confidence interval is:
The set of points that cannot be rejected at the 5% significance level.
An interval that is a function of the data and that contains the true parameter value 95% of the time over repeated samples.
For large samples, since t is standard normally distributed N(0, 1), the 95% confidence interval for $\beta_1$ is similar to the case for the sample mean:
$CI_{0.95}^{\beta_1} = \hat{\beta}_1 \pm 1.96 \, SE(\hat{\beta}_1)$

69 Confidence Intervals: Example
Estimated regression line: $\widehat{TestScore} = 698.93 - 2.28 \, STR$
Stata estimates the SEs: $SE(\hat{\beta}_0) = 10.4$, $SE(\hat{\beta}_1) = 0.52$
95% confidence interval for $\beta_1$: $\hat{\beta}_1 \pm 1.96 \, SE(\hat{\beta}_1) = -2.28 \pm 1.96(0.52) = (-3.30, -1.26)$
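
The interval arithmetic can be checked with a short Python sketch. The function name is a hypothetical helper; the inputs are the slope and standard error reported in the lecture, and 1.96 is the large-sample 5% two-sided critical value.

```python
def confidence_interval_95(estimate, standard_error, critical_value=1.96):
    """Large-sample 95% confidence interval: estimate +/- 1.96 * SE."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin

lower, upper = confidence_interval_95(-2.28, 0.52)
print(round(lower, 2), round(upper, 2))

# The interval excludes 0, consistent with rejecting H0: beta_1 = 0 at the 5% level.
excludes_zero = not (lower <= 0.0 <= upper)
```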

70 Conventional Way to Report Regressions
$\widehat{\text{Test Score}} = 698.93 - 2.28 \, STR$, $R^2 = 0.05$, $SER = 18.6$
Standard errors are reported in parentheses below each coefficient: (10.4) under the intercept and (0.52) under the slope.

71 Stata Output of Regression
Example: $\widehat{TestScore} = 698.93 - 2.28 \, STR$, $R^2 = 0.05$, $SER = 18.6$, with standard errors (10.4) and (0.52)
$t(\beta_1 = 0) = -4.38$, p-value = 0.000 (2-sided)
The 95% confidence interval for $\beta_1$ is (-3.30, -1.26).

72 Summary of Statistical Inference About $\beta_0$ and $\beta_1$
Estimation:
The OLS estimators of $\beta_0$ and $\beta_1$ are $\hat{\beta}_0$ and $\hat{\beta}_1$.
$\hat{\beta}_0$ and $\hat{\beta}_1$ have approximately normal sampling distributions for large $n$.
Testing:
$H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 \neq \beta_{1,0}$ ($\beta_{1,0}$ is the value of $\beta_1$ under $H_0$)
$t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}$
p-value: area under the standard normal distribution (Z) outside $|t|$ (for large $n$).
Confidence intervals:
The 95% confidence interval for $\beta_1$ is $\{\hat{\beta}_1 \pm 1.96 \, SE(\hat{\beta}_1)\}$.
This is the set of $\beta_1$ values not rejected at the 5% level; it contains the true $\beta_1$ in 95% of all samples.


EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data

Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data 1/2/3-1 1/2/3-2 Brief Overview of the Course Economics suggests important

More information

Chapter 6: Linear Regression With Multiple Regressors

Chapter 6: Linear Regression With Multiple Regressors Chapter 6: Linear Regression With Multiple Regressors 1-1 Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and

More information

Empirical Application of Simple Regression (Chapter 2)

Empirical Application of Simple Regression (Chapter 2) Empirical Application of Simple Regression (Chapter 2) 1. The data file is House Data, which can be downloaded from my webpage. 2. Use stata menu File Import Excel Spreadsheet to read the data. Don t forget

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution

More information

Introduction to Econometrics. Multiple Regression (2016/2017)

Introduction to Econometrics. Multiple Regression (2016/2017) Introduction to Econometrics STAT-S-301 Multiple Regression (016/017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 OLS estimate of the TS/STR relation: OLS estimate of the Test Score/STR relation:

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Gov 2000: 9. Regression with Two Independent Variables

Gov 2000: 9. Regression with Two Independent Variables Gov 2000: 9. Regression with Two Independent Variables Matthew Blackwell Fall 2016 1 / 62 1. Why Add Variables to a Regression? 2. Adding a Binary Covariate 3. Adding a Continuous Covariate 4. OLS Mechanics

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Markus Haas LMU München Summer term 2011 15. Mai 2011 The Simple Linear Regression Model Considering variables x and y in a specific population (e.g., years of education and wage

More information

Introduction to Econometrics. Multiple Regression

Introduction to Econometrics. Multiple Regression Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Multiple Regression Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1 OLS estimate of the TS/STR

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Saturday, May 9, 008 Examination time: 3

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Econometrics I Lecture 3: The Simple Linear Regression Model

Econometrics I Lecture 3: The Simple Linear Regression Model Econometrics I Lecture 3: The Simple Linear Regression Model Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 32 Outline Introduction Estimating

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information

Ordinary Least Squares Regression

Ordinary Least Squares Regression Ordinary Least Squares Regression Goals for this unit More on notation and terminology OLS scalar versus matrix derivation Some Preliminaries In this class we will be learning to analyze Cross Section

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Simple Linear Regression: The Model

Simple Linear Regression: The Model Simple Linear Regression: The Model task: quantifying the effect of change X in X on Y, with some constant β 1 : Y = β 1 X, linear relationship between X and Y, however, relationship subject to a random

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Eplained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

Correlation & Simple Regression

Correlation & Simple Regression Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.

More information

Essential of Simple regression

Essential of Simple regression Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM 1 REGRESSION AND CORRELATION As we learned in Chapter 9 ( Bivariate Tables ), the differential access to the Internet is real and persistent. Celeste Campos-Castillo s (015) research confirmed the impact

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

More information

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation? Did You Mean Association Or Correlation? AP Statistics Chapter 8 Be careful not to use the word correlation when you really mean association. Often times people will incorrectly use the word correlation

More information

Linear models and their mathematical foundations: Simple linear regression

Linear models and their mathematical foundations: Simple linear regression Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects Economics 113 Simple Regression Models Simple Regression Assumptions Simple Regression Derivation Changing Units of Measurement Nonlinear effects OLS and unbiased estimates Variance of the OLS estimates

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

MATH 1070 Introductory Statistics Lecture notes Relationships: Correlation and Simple Regression

MATH 1070 Introductory Statistics Lecture notes Relationships: Correlation and Simple Regression MATH 1070 Introductory Statistics Lecture notes Relationships: Correlation and Simple Regression Objectives: 1. Learn the concepts of independent and dependent variables 2. Learn the concept of a scatterplot

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik MAT2377 Rafa l Kulik Version 2015/November/26 Rafa l Kulik Bivariate data and scatterplot Data: Hydrocarbon level (x) and Oxygen level (y): x: 0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,

More information

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors ECON4150 - Introductory Econometrics Lecture 6: OLS with Multiple Regressors Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 6 Lecture outline 2 Violation of first Least Squares assumption

More information

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information. STA441: Spring 2018 Multiple Regression This slide show is a free open source document. See the last slide for copyright information. 1 Least Squares Plane 2 Statistical MODEL There are p-1 explanatory

More information

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y Regression and correlation Correlation & Regression, I 9.07 4/1/004 Involve bivariate, paired data, X & Y Height & weight measured for the same individual IQ & exam scores for each individual Height of

More information

ECON 450 Development Economics

ECON 450 Development Economics ECON 450 Development Economics Statistics Background University of Illinois at Urbana-Champaign Summer 2017 Outline 1 Introduction 2 3 4 5 Introduction Regression analysis is one of the most important

More information

Week 3: Simple Linear Regression

Week 3: Simple Linear Regression Week 3: Simple Linear Regression Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor ECON4150 - Introductory Econometrics Lecture 4: Linear Regression with One Regressor Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 4 Lecture outline 2 The OLS estimators The effect of

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics T H I R D E D I T I O N Global Edition James H. Stock Harvard University Mark W. Watson Princeton University Boston Columbus Indianapolis New York San Francisco Upper Saddle

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

Applied Regression Analysis. Section 2: Multiple Linear Regression

Applied Regression Analysis. Section 2: Multiple Linear Regression Applied Regression Analysis Section 2: Multiple Linear Regression 1 The Multiple Regression Model Many problems involve more than one independent variable or factor which affects the dependent or response

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

Can you tell the relationship between students SAT scores and their college grades?

Can you tell the relationship between students SAT scores and their college grades? Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower

More information

Relationships between variables. Visualizing Bivariate Distributions: Scatter Plots

Relationships between variables. Visualizing Bivariate Distributions: Scatter Plots SFBS Course Notes Part 7: Correlation Bivariate relationships (p. 1) Linear transformations (p. 3) Pearson r : Measuring a relationship (p. 5) Interpretation of correlations (p. 10) Relationships between

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna October 16, 2013 Outline Introduction Simple

More information

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2. Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round

More information

1 Correlation and Inference from Regression

1 Correlation and Inference from Regression 1 Correlation and Inference from Regression Reading: Kennedy (1998) A Guide to Econometrics, Chapters 4 and 6 Maddala, G.S. (1992) Introduction to Econometrics p. 170-177 Moore and McCabe, chapter 12 is

More information

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013

Applied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013 Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis

More information

Linear Models in Econometrics

Linear Models in Econometrics Linear Models in Econometrics Nicky Grant At the most fundamental level econometrics is the development of statistical techniques suited primarily to answering economic questions and testing economic theories.

More information

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc. Chapter 8 Linear Regression Copyright 2010 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu: Copyright

More information

11 Correlation and Regression

11 Correlation and Regression Chapter 11 Correlation and Regression August 21, 2017 1 11 Correlation and Regression When comparing two variables, sometimes one variable (the explanatory variable) can be used to help predict the value

More information

Chapter 3: Examining Relationships

Chapter 3: Examining Relationships Chapter 3: Examining Relationships Most statistical studies involve more than one variable. Often in the AP Statistics exam, you will be asked to compare two data sets by using side by side boxplots or

More information

Business Statistics. Lecture 10: Correlation and Linear Regression

Business Statistics. Lecture 10: Correlation and Linear Regression Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

Review of probability and statistics 1 / 31

Review of probability and statistics 1 / 31 Review of probability and statistics 1 / 31 2 / 31 Why? This chapter follows Stock and Watson (all graphs are from Stock and Watson). You may as well refer to the appendix in Wooldridge or any other introduction

More information

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35 Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information