LECTURE 2: SIMPLE REGRESSION I


2 Introducing Simple Regression

3 Introducing Simple Regression
- simple regression = regression with 2 variables
- y: dependent variable, explained variable, response variable, predicted variable, regressand
- x: independent variable, explanatory variable, control variable, predictor variable, regressor
- we are going to derive the linear regression model in three very different ways
- these three ways reflect the three types of econometric questions we discussed in the first lecture (descriptive, causal and forecasting)
- while the math is identical in all three, conceptually they are very different ideas

4 Descriptive Approach
- Why do we need a regression model?
- Estimation & interpretation.
- Correlation vs. causation.

5 Descriptive Analysis
- goal: estimate $E[y \mid x]$ (called the population regression function)
- x is discrete (i.e., categories)
  - example: wages vs. education
  - one can collect data for the individual categories (the more categories, the more difficult)
[Figure: wage plotted against education (years)]
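A minimal sketch of the discrete case, assuming made-up wage and education numbers (not the lecture's data): when x takes only a few values, $E[y \mid x]$ can be estimated by averaging y within each category. The function name `conditional_means` is hypothetical.

```python
from collections import defaultdict

def conditional_means(x, y):
    """Estimate E[y | x] for discrete x by averaging y within each category of x."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    return {category: sum(vals) / len(vals) for category, vals in sorted(groups.items())}

# Illustrative, made-up data: years of education and wage
educ = [12, 12, 12, 16, 16, 18]
wage = [900, 1000, 1100, 1300, 1500, 1800]

cef = conditional_means(educ, wage)
for years, mean_wage in cef.items():
    print(years, mean_wage)
```

With many categories, some groups contain very few observations, which is one way to read the remark that more categories make the task more difficult.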

6 Descriptive Analysis (cont'd)
- x is continuous
- problem: no categories
- example: CEO's salary vs. company's ROE
  - ROE = return on equity = net income as a percentage of common equity (ROE = 10 means that if I invest $100 in equity in a firm, I earn $10 a year)
  - does a CEO's salary (typically huge) reflect her performance?
[Figure: CEO's salary ($1,000s) plotted against company's ROE]

7 Descriptive Analysis (cont'd)
- therefore, we need a model for $E[y \mid x]$
- in other words, we need to find a good mathematical expression for f in $E[y \mid x] = f(x)$
- the simplest model I can think of is $E[y \mid x] = \beta_0 + \beta_1 x$
- Why use simple models? Simple models are:
  - easier to estimate.
  - easier to interpret (e.g., $\beta_1 = \Delta wage / \Delta educ$ etc.).
  - easier to analyze from the statistical standpoint.
  - safe: they serve as a good approximation to the real relationship, whose functional form might be unknown and/or complicated. Things can't go too wrong when using a simple model.
- Further reading: Angrist and Pischke (2008): Mostly Harmless Econometrics: An Empiricist's Companion.

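8 Linear Model for $E[y \mid x]$
- interpretation:
  - intercept: $\beta_0 = E[y \mid x = 0]$
  - slope: $\beta_1 = \dfrac{\Delta E[y \mid x]}{\Delta x} = \dfrac{\partial E[y \mid x]}{\partial x}$
[Figure: the line $E[y \mid x] = \beta_0 + \beta_1 x$ plotted against x, with intercept $\beta_0$ and a rise of $\beta_1$ per unit increase in x]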

9 Estimation
- once we've decided on the functional form of the model (in our case, $E[y \mid x] = \beta_0 + \beta_1 x$), we need to develop techniques to obtain estimates of the parameters $\beta_0$ and $\beta_1$
- we base our estimates on a sample of the population, but we need to make inferences about the whole population
- sample: n observations (people, countries, etc.) indexed by i
- the x and y values for the ith person are denoted $y_i$, $x_i$ (for $i = 1, \dots, n$)
Preliminaries
- it will be useful to define $u = y - \beta_0 - \beta_1 x$
- what do we know about u?
- first, because $E[y \mid x] = \beta_0 + \beta_1 x$, we have
  $E[u \mid x] = E[y - \beta_0 - \beta_1 x \mid x] = E[y \mid x] - \beta_0 - \beta_1 x = \beta_0 + \beta_1 x - \beta_0 - \beta_1 x = 0$

10 Estimation (cont'd)
- this means two things:
  1. $E u = 0$. (This should be intuitive: $E[u \mid x] = 0$ for all x $\Rightarrow$ $E u = 0$; alternatively, you can plug in CE.4 from Wooldridge, page 687.)
  2. the expected value of u does not change when we change x
- the second property has numerous implications, the most important being:
  - $\operatorname{cov}(x,u) = 0$ (this is property CE.5 from Wooldridge, page 687)
  - $E[xu] = 0$ (because $\operatorname{cov}(x,u) = E[xu] - E x \cdot E u$)
- how is this useful in estimation?
- we arrived at two important facts about expectations of u: $E u = 0$ and $E[xu] = 0$
- typically, expectations are estimated using a sample mean (e.g., how would you estimate mean wage with a sample of 10 people?)
- we'll use the idea of sample analogue, forcing the sample means of u and xu to equal zero (see below)

11 Estimation (cont'd)
- before we move on, we'll review some more statistical concepts connected with random samples
- remember that we need to make inferences about the population based on our sample and its characteristics
Population vs. sample characteristics:
Random variable | Population (size N) | Sample (size n)
$\mu_x = E x$ | $\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i$ | $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
$\operatorname{var} x = E(x - \mu_x)^2$ | $\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu_x)^2$ | $\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
$\operatorname{cov}(x,y) = E(x - \mu_x)(y - \mu_y)$ | $\frac{1}{N}\sum_{i=1}^{N} (y_i - \mu_y)(x_i - \mu_x)$ | $\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})$
- the expressions on the right are called sample mean, sample variance and sample covariance
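As a rough illustration of the sample characteristics, the following sketch computes the sample mean, sample variance and sample covariance with the n - 1 divisor. The data and the helper names (`sample_mean`, `sample_cov`, `sample_var`) are made up for this example.

```python
def sample_mean(v):
    return sum(v) / len(v)

def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor used for sample characteristics."""
    xbar, ybar = sample_mean(x), sample_mean(y)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (len(x) - 1)

def sample_var(x):
    # The variance is the covariance of a variable with itself
    return sample_cov(x, x)

# Made-up data
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
print(sample_mean(x), sample_var(x), sample_cov(x, y))
```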

12 Estimation (cont'd)
- for the ith person (i.e., observation) in our sample, we have the population regression model
  $y_i = \beta_0 + \beta_1 x_i + u_i$,
  where $\beta_0$ and $\beta_1$ are the unknown parameters to be estimated
- to think about estimation, let's define the sample regression model
  $y_i = \hat\beta_0 + \hat\beta_1 x_i + \hat{u}_i$,
  where $\hat\beta_0$ and $\hat\beta_1$ are our estimates of $\beta_0$ and $\beta_1$ from the sample
- $\hat{u}_i$ is defined as $\hat{u}_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$
- note: if $\hat\beta_0$ and $\hat\beta_1$ are like $\beta_0$ and $\beta_1$, then $\hat{u}_i$ should be like $u_i$
- sample analogue: in order to make inferences about the population, we have to believe that our sample looks like the population
- if it does, then let's force things to be true in the sample which we know would be true in the population

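13 Estimation (cont'd)
- from the discussion about (the population's) u, we know that $E u = 0$ and $E[xu] = 0$
- the sample analogue to this is:
  $0 = \frac{1}{n}\sum_{i=1}^{n} \hat{u}_i = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)$
  $0 = \frac{1}{n}\sum_{i=1}^{n} x_i \hat{u}_i = \frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i)$
- as the $y_i$ and $x_i$ are the data, the above equations are in fact two linear equations in the unknowns $\hat\beta_0$ and $\hat\beta_1$; they can be solved (fairly) easily (see Wooldridge, pages 28 and 29)
- note that the first equation tells us that $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$, where $\bar{x}$ and $\bar{y}$ are the sample means of the $x_i$'s and $y_i$'s
- solving the equations for $\hat\beta_1$ yields:
  $\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$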

14 Estimation (cont'd)
- we can rewrite the formula for the slope as
  $\hat\beta_1 = \dfrac{\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$,
  which can be viewed as the sample analogue to $\dfrac{\operatorname{cov}(x,y)}{\operatorname{var} x}$
- both $\operatorname{var} x$ and its sample counterpart are always positive (as long as x is not constant), therefore:
  - the regression coefficient ($\beta_1$), the covariance, and the correlation coefficient must all have the same sign
  - one of them is zero only if all of them are zero
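The sample-analogue estimates can be sketched in a few lines. This is only an illustration on made-up data; the function name `ols_simple` is hypothetical. It also checks that the two sample moment conditions (mean of the residuals and mean of x times the residuals) hold at the estimates.

```python
def ols_simple(x, y):
    """Intercept and slope from the sample-analogue (method of moments) formulas."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx          # the 1/(n-1) factors in numerator and denominator cancel
    b0 = ybar - b1 * xbar   # from the first moment condition
    return b0, b1

# Made-up data
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
b0, b1 = ols_simple(x, y)

# Residuals and the two sample moments that were forced to zero
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
mean_u = sum(resid) / len(resid)
mean_xu = sum(xi * ui for xi, ui in zip(x, resid)) / len(resid)
print(b0, b1, mean_u, mean_xu)
```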

15 Exercise: CEO Salaries and Performance
1. Download the ceosal1.gdt file from my website and open it in Gretl (or, if you have the Wooldridge dataset installed in Gretl, open the ceosal1 data file).
2. Find the sample mean and standard deviation for roe and salary, and the sample correlation coefficient between the two variables. Store the results in the ceo.xls file from my website. (Use the View > Summary statistics and View > Correlation matrix procedures.)
3. Estimate the regression model $E[salary \mid roe] = \beta_0 + \beta_1 roe$ using the formulas we derived for intercept and slope. Note that $\operatorname{corr}(x,y) = \dfrac{\operatorname{cov}(x,y)}{\sigma_x \sigma_y}$, where $\sigma_x$ and $\sigma_y$ are the standard deviations of x and y. The same holds for the sample counterparts.
4. Interpret the results. What does $\beta_0$ tell you about average CEO salaries? Is its level sensible? How about $\beta_1$? Would you say that a CEO's salary reflects her work performance?
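The hint in step 3 implies that the slope can be obtained from summary statistics alone, as the correlation times the ratio of standard deviations. The sketch below checks this identity on made-up numbers (not the ceosal1 data); `slope_two_ways` is a hypothetical helper.

```python
import math

def slope_two_ways(x, y):
    """Return the slope as corr * s_y / s_x and as cov / var, for comparison."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    s_x = math.sqrt(sxx / (n - 1))        # sample standard deviation of x
    s_y = math.sqrt(syy / (n - 1))        # sample standard deviation of y
    corr = sxy / math.sqrt(sxx * syy)     # sample correlation coefficient
    return corr * s_y / s_x, sxy / sxx

# Made-up data
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
from_corr, from_cov = slope_two_ways(x, y)
print(from_corr, from_cov)
```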

16 Correlation Vs. Causation
- the difference between causation and correlation is a key concept in econometrics
- you saw that in the model with conditional expectations, the estimates were based on the correlation between x and y (remember the formulas)
- many different causal interpretations are consistent with the (correlation-based) results (see next slide)
- as we'll see, no econometric tool can ever prove or find a causal relationship on its own
- having an economic model is essential in establishing the causal interpretation (we'll talk about this in the causal-analytic part)
- conclusions:
  - correlation ≠ causation
  - statistical significance ≠ "the effect of x on y is significant" (only that they are tightly "associated")
- these issues are confused all the time by politicians and the popular press

17 Three Causation Schemes
- looking at the relationship between x and y from the causal standpoint, the association (or correlation) between x and y can represent three basic situations:
  - y ← x: x causes y. If a CEO's performance is good (high ROE), he gets paid a lot.
  - y → x: y causes x. High salaries motivate CEOs, thus making them perform well (resulting in high ROE).
  - y ← z → x: there is another factor z that causes both x and y, which makes x and y associated. If a CEO is good (clever, motivated, etc.), he both performs well and gets paid a lot.
- the descriptive approach makes no claims about which one is the case
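To see how the third scheme can produce an association on its own, here is a small simulation sketch. Everything in it is an assumption made for illustration: a common factor z drives both x and y, there is no direct link between x and y, and all shocks are standard normal.

```python
import math
import random

def corr(x, y):
    """Sample correlation coefficient."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]   # common cause (e.g., CEO ability)
x = [zi + random.gauss(0, 1) for zi in z]    # x depends on z only
y = [zi + random.gauss(0, 1) for zi in z]    # y depends on z only, not on x

r = corr(x, y)
print(round(r, 2))  # clearly positive although x has no effect on y
```

Under these assumptions the population correlation is 0.5, so a descriptive regression of y on x would show a positive slope even though changing x would not change y.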

18 Exercise: Voting
1. Load the data voting.gdt (election outcomes and campaign expenditures for 173 two-party races (A, B) for the U.S. House of Representatives in 1988).
2. Consider the following regression model: $E[voteA \mid expena] = \beta_0 + \beta_1 expena$. Does it make sense to use a model like this to describe the relationship between campaign expenditures and the eventual vote share? How would you interpret $\beta_0$ and $\beta_1$? In a two-party race, do you think it makes sense to look at the campaign expenditures of party A alone?
3. Consider the following regression model: $E[voteA \mid sharea] = \beta_0 + \beta_1 sharea$, where sharea is A's percentage share in the total campaign expenditures ("total" meaning the sum across all parties). Generate sharea in Gretl (use Add > Define new variable), estimate the model (Model > Ordinary least squares) and interpret the estimates.
4. Find a story for the association between votea and sharea supporting each of the three causation schemes.

19 Causal Approach
- The need for causal analysis.
- Structural model and its assumptions.
- Estimation.
- Examples.

20 Causal Analysis
- in the descriptive framework, we couldn't really say anything about causality
- however, in decision-making situations, causal questions are typically those we need to answer:
  - if I face a decision, it means I can influence something (I have control over an economic variable)
  - in order to decide effectively, I need to know the impact of my decision on the things I'm interested in
- examples:
  - a central bank sets interest rates in order to keep inflation within specified bounds (inflation targeting)
  - companies set prices so as to maximize profit / revenues / market share
  - a ministry of education introduces a new school fees scheme; the aim is to collect money without discouraging students from education. Therefore, one has to find the effect of education on future income

21 Causal Analysis (cont'd)
- to say something about causality, we need to make some more assumptions
- in practice, these assumptions will have to be checked using economic theory / common sense + knowledge of the real problem
- mathematically, formulating these assumptions consists in writing down a structural data-generating model
- this will look similar to what we have been doing, but conceptually it is very different
  - for the description-type analysis, we started with the data and then asked which model could help summarize the conditional expectation
  - for the structural (causal) case, we start with the model and then use it to say what the data will look like (even before we actually collect them)

22 Structural model
- a simple structural model may look like this: $y = \beta_0 + \beta_1 x + u$
- what has changed from the descriptive analysis?
  - descriptive model: conditional expectation of y = a function of x, i.e. $E[y \mid x] = f(x)$
  - structural model: y = a function of x and u, i.e. $y = f(x, u)$
- it should be clear that modeling y is much more ambitious than modeling the expectation of y
- we have already encountered u, but this time it has real content (see later)
- in choosing a particular structural model, we're actually saying that we believe that, in reality, the value of y is created from x and u using the function f
- this is a daring claim, so we need to choose the model very carefully

23 Structural model (cont'd)
- let's go back to the simple model and think about it this way: $y = \beta_0 + \beta_1 x + u$
  - $\beta_0$ and $\beta_1$ are real numbers which are out there in nature (however, we do not know their values; that's what we want to estimate)
  - we choose x (i.e., x is the thing we can control)
  - Nature / God chooses u in a way that is unrelated to our choice of x
- $\beta_1$ now measures the real causal effect of x on y: if we increase x by 1, y goes up by $\beta_1$
- example: suppose that y is your annual income (in currency units) and x is your education (in years), with $\beta_0 = 10{,}000$, $\beta_1 = 5{,}000$, $u = 5{,}000$
  - Is it worth studying at a university?
  - And how about the tuition fees?
  - And what if we don't know the value of u?
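A worked version of the income example, as a sketch only. The parameter values are the ones printed on the slide; the comparison of 12 versus 16 years of education and the function name `income` are assumptions added for illustration.

```python
BETA0 = 10_000   # intercept from the slide
BETA1 = 5_000    # causal effect of one more year of education, from the slide

def income(educ_years, u):
    """Structural model y = beta0 + beta1 * x + u."""
    return BETA0 + BETA1 * educ_years + u

u = 5_000  # value as printed on the slide
# Assumed comparison: 12 years (no university) vs. 16 years (university)
gain = income(16, u) - income(12, u)
print(gain)

# The gain does not depend on u, as long as u is unrelated to our choice of x
gain_other_u = income(16, -3_000) - income(12, -3_000)
print(gain_other_u)
```

Because u enters additively and is unrelated to the choice of x, the income gain from four more years of education is 4 times $\beta_1$ whatever u is, which is why the decision can be analyzed even without knowing u. Whether that gain outweighs tuition fees is a separate comparison.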

24 How About u?
- u is probably the most important part of the structural model
- we'll spend a lot of time talking about u and its relation to x (note that the relation between u and y is obvious from the equation)
- what does u contain? Everything that the rest of the right-hand side of the equation failed to capture
- in the previous example, this means everything that affects your wages besides education:
  - intelligence
  - work effort
  - other suggestions?
- in some textbooks, you can read that u contains a couple more things:
  - measurement errors (or poor proxy variables)
  - intrinsic randomness (in human behaviour etc.)
  - model specification errors
- these will not be all that relevant for our causal discussion

25 Crucial Assumption: $E[u \mid x] = 0$
- once we have specified the structural model, we need to estimate its parameters, i.e. $\beta_0$ and $\beta_1$
- in order to be able to do so, we need to assume something about the relation between u and x
- mathematically, the crucial assumption takes the form $E[u \mid x] = 0$
- this is the same as what we did before (in the descriptive analysis), but conceptually very different
  - before, u was defined simply as $u = y - \beta_0 - \beta_1 x$. It didn't actually mean anything
  - now we think of u as a real thing that is actually out there and means something
  - it is just that we don't / can't observe it
- even though we don't observe it, we can still argue about whether the assumption is fulfilled or not
- in order to do so, we need to know what the assumption tells us in the first place

26 Crucial Assumption: $E[u \mid x] = 0$ (cont'd)
- as before, the assumption that $E[u \mid x] = 0$ really means two separate things, one of which is a big deal, the other is not:
  1. $E[u \mid x]$ doesn't vary with x (i.e., it's a constant).
     - this holds true if
       - u is assigned at random
       - u and x are independent
       - perhaps something else
     - and is not true if u and x are correlated
     - example: intelligence (a part of u) is correlated with education (x) $\Rightarrow$ we're in trouble (we'll discuss the possible solutions as we go)
  2. $E u = 0$ (i.e., the unconditional expectation is zero).
- the important part is 1; we assume 2 just for convenience

27 Crucial Assumption: $E[u \mid x] = 0$ (cont'd)
- imagine 1 holds and 2 does not; for instance, let $E[u \mid x] = 5$
- then the model $y = \beta_0 + \beta_1 x + u$ is equivalent to $y = \gamma_0 + \gamma_1 x + v$, where:
  - $\gamma_0 = \beta_0 + 5$
  - $\gamma_1 = \beta_1$
  - $u = v + 5$
- notice that now $E[v \mid x] = E[u - 5 \mid x] = E[u \mid x] - 5 = 5 - 5 = 0$
- $\gamma_1$ and $\beta_1$ are the same, and that is really what we care about (the effect of x on y)
- thus, the fact that we pick zero is really a normalization. It makes the model well defined (identified) without any real imposition on the data
- however, the fact that $E[u \mid x]$ does not vary with x is much more than a normalization. It has real content
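The normalization argument can be checked numerically. In this sketch (made-up data, hypothetical helper `ols_simple`), adding 5 to every error term shifts each y by 5; the estimated slope is unchanged and only the intercept absorbs the shift.

```python
def ols_simple(x, y):
    """Intercept and slope of the fitted line."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b1 * xbar, b1

# Made-up data
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
b0, b1 = ols_simple(x, y)

# Same data with 5 added to every error term, i.e. every y shifted by 5
y_shifted = [yi + 5 for yi in y]
g0, g1 = ols_simple(x, y_shifted)

print(b0, b1)
print(g0, g1)  # intercept moves by 5, slope stays the same
```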

28 $E[u \mid x] = 0$ Vs. Causation
- what does the assumption tell us about causation?
- remember the three causation schemes between x and y: y ← x, y → x, y ← z → x
- in the causal analysis, we want to rule out all but the first one
- this is exactly what the assumption about $E[u \mid x]$ effectively does
- the arrow scheme now contains three letters: y, x and u
- we know that u affects y (by definition): y ← u
- note that $E[u \mid x]$ = constant implies $\operatorname{cov}(x,u) = 0$
- therefore, we'd like the arrow scheme to look like this: x → y ← u
- this suggests there should be no association (correlation) between x and u

29 $E[u \mid x] = 0$ Vs. Causation (cont'd)
- imagine we have the reversed causality y → x, so the scheme is u → y → x
  - here u affects y, which in turn affects x, so u affects x, and u and x are correlated
  - therefore $\operatorname{cov}(x,u) \neq 0$ and the assumption is necessarily violated
- now consider the case y ← z → x:
  - if there's any z that affects y, it is a part of u (by definition), and therefore z (and u) affect x, so the scheme is y ← u → x
  - u contains z, so u affects x, meaning the two are correlated
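A simulation sketch of what a violated assumption does to the slope estimate. All numbers are assumptions chosen for illustration: the structural slope is 2, an unobserved factor z is part of u, and in the second case z also drives x.

```python
import random

def ols_slope(x, y):
    """Slope estimate: sample covariance of x and y over sample variance of x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

random.seed(1)
n = 20000
beta0, beta1 = 1.0, 2.0   # assumed "true" structural parameters

z = [random.gauss(0, 1) for _ in range(n)]   # unobserved factor, part of u
u = [zi + random.gauss(0, 1) for zi in z]

# Case 1: x is set independently of u, so cov(x, u) = 0
x_free = [random.gauss(0, 1) for _ in range(n)]
y_free = [beta0 + beta1 * xi + ui for xi, ui in zip(x_free, u)]

# Case 2: z also drives x, so x and u are correlated
x_conf = [zi + random.gauss(0, 1) for zi in z]
y_conf = [beta0 + beta1 * xi + ui for xi, ui in zip(x_conf, u)]

b1_free = ols_slope(x_free, y_free)
b1_conf = ols_slope(x_conf, y_conf)
print(round(b1_free, 2), round(b1_conf, 2))
```

In this setup the second estimate converges to $\beta_1 + \operatorname{cov}(x,u)/\operatorname{var} x = 2 + 1/2$ rather than to 2, so the formula still measures an association but no longer the causal effect.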

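30 Estimation
- now, let's take the assumption $E[u \mid x] = 0$ for granted
- then, nothing is really any different than in the descriptive case
- we can write down the sample regression function as
  $y_i = \hat\beta_0 + \hat\beta_1 x_i + \hat{u}_i$
- we know that $E u_i = 0$ and $E[x_i u_i] = 0$
- the sample analogue is
  $0 = \frac{1}{n}\sum_{i=1}^{n} \hat{u}_i = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)$
  $0 = \frac{1}{n}\sum_{i=1}^{n} x_i \hat{u}_i = \frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i)$
- which gives us, exactly as before,
  $\hat\beta_1 = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$
- it is only the interpretation that has changed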

31 Examples
- let's look back at our previous models and interpret things in this way
- let me be very clear about something: whether the additional assumption is reasonable is definitely questionable in all of these cases. Let's worry about that later
- for now, just take the assumption as given and focus on the interpretation
Voting example
- for the voting example we estimated the sample regression function $\widehat{votea} = \hat\beta_0 + \hat\beta_1\, sharea$, with $\hat\beta_0 \approx 27$ and $\hat\beta_1 = 0.463$
- as usual, the first parameter is not very interesting. It tells me the fraction of the votes I would get if I spent no money: I would get about 27%
- the key thing is that for every additional percentage point of my share in total spending, my vote share increases by 0.463 percentage points

32 Examples (cont'd)
- suppose that currently I am spending the same amount as my opponent: sharea = 50
- if I double my spending (and my opponent does not react), then sharea = 67 and I would get roughly an additional 7.8% of the vote
- is it worth it? Generally, this is not an econometric question. Moreover, it depends on what doubling my current expenditure means in dollars
CEO example
- here we got $\widehat{salary} = \hat\beta_0 + \hat\beta_1\, roe$, with salary measured in thousands of dollars and $\hat\beta_1 = 18.501$
- we can interpret this as the CEO's pay schedule
- if the CEO can get the return on equity to increase by 10 percentage points, she will earn an extra $185,010 next year
- this really is not that much money given how large a 10 percentage point increase is (is it a high enough motivation?)

33 Forecasting Approach
- Forecasting framework.
- Ordinary least squares estimation (OLS).

34 Forecasting Analysis
- let's shift gears completely
- imagine the following:
  - you have a bunch of data on x and y now (i.e., the $x_i$'s and $y_i$'s)
  - you know the value of x tomorrow (we'll denote it x*) and want to predict what the value of y will be tomorrow (denoted y*)
- actually, when talking about dynamic models, there are often lags in economic responses, so that today's cause (x) is yesterday's value of an economic variable
- examples:
  - inflation rate this year, unemployment rate next year
  - corporate profits today, stock price tomorrow
  - SAT scores, college GPA
- this can make x* available in advance for prediction

35 Forecasting Analysis (cont'd)
- in order to do prediction, we need a model
- let's once again use the linear sample regression model $y_i = \hat\beta_0 + \hat\beta_1 x_i + \hat{u}_i$
- once we estimate the parameters, we can predict the future value of y as $\hat{y}^* = \hat\beta_0 + \hat\beta_1 x^*$
- note that unlike in the causal analysis, we do not choose x* here!
- from the forecasting point of view, the fitted values $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$ can be regarded as the would-be predictions we'd use if we didn't have the appropriate $y_i$'s
- therefore, it seems sensible to find $\hat\beta_0$ and $\hat\beta_1$ such that the overall distance between the $\hat{y}_i$'s and the $y_i$'s is minimized
- how to measure this distance?
  - obvious idea: the absolute difference between the two, $|y_i - \hat{y}_i|$
  - however, absolute value is a really ugly function
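A minimal prediction sketch on made-up data: fit the line on the observed pairs, then plug in a known future value x*. The helper names `fit_line` and `predict` and the value of x* are hypothetical.

```python
def fit_line(x, y):
    """Estimate intercept and slope from observed (x_i, y_i) pairs."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    b0 = ybar - b1 * xbar
    return b0, b1

def predict(b0, b1, x_star):
    """Predicted y* for a given (not chosen) future value x*."""
    return b0 + b1 * x_star

# Made-up historical data and an assumed known future x*
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
x_star = 5

b0, b1 = fit_line(x, y)
y_star_hat = predict(b0, b1, x_star)
fitted = [predict(b0, b1, xi) for xi in x]   # would-be predictions for the sample
print(y_star_hat, fitted)
```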

36 Forecasting Analysis (cont'd)
- a much smoother function is $(y_i - \hat{y}_i)^2$
- we want to aggregate this distance across all data points:
  $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$
- this says how close our model is to the data

37 Ordinary Least Squares (OLS) Estimation
- note that the overall distance between the $\hat{y}_i$'s and the $y_i$'s is a function of the variables $\hat\beta_0$ and $\hat\beta_1$ (because the $y_i$'s and $x_i$'s are known: they are the data)
- we can call this function the sum of squares (SS) and write
  $SS(\hat\beta_0, \hat\beta_1) = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$
- we want to keep the model as close to the data as possible, which means we minimize SS, so we take the derivatives of this function with respect to $\hat\beta_0$ and $\hat\beta_1$ and set them to zero:
  $\dfrac{\partial SS}{\partial \hat\beta_0} = -2 \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$
  $\dfrac{\partial SS}{\partial \hat\beta_1} = -2 \sum_{i=1}^{n} x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$
- note that we're back to the same system of 2 linear equations as before (after dividing both equations by $-2$)
- the estimates are the same as before, but again with a different kind of reasoning
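As a sanity check on the least squares idea, the sketch below evaluates SS at the closed-form estimates and at nearby parameter values, using made-up data. The step size and the function name `sum_of_squares` are arbitrary choices for this illustration.

```python
def sum_of_squares(b0, b1, x, y):
    """SS(b0, b1): sum of squared distances between y_i and b0 + b1 * x_i."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

# Made-up data
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]

# Closed-form estimates from the first-order conditions
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1_hat = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
b0_hat = ybar - b1_hat * xbar
ss_min = sum_of_squares(b0_hat, b1_hat, x, y)

# SS at parameter values slightly away from the estimates
steps = [-0.1, 0.0, 0.1]
neighbours = [
    sum_of_squares(b0_hat + d0, b1_hat + d1, x, y)
    for d0 in steps for d1 in steps
    if (d0, d1) != (0.0, 0.0)
]
print(ss_min, min(neighbours))  # every neighbouring point gives a larger SS
```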

38 Three Approaches: A Comparison
- descriptive approach
  - goal: find the association between x and y expressed in terms of the conditional expectation $E[y \mid x]$
  - assumptions: approximate functional form of $E[y \mid x]$
- causal approach
  - goal: find the causal effect of x on y
  - assumptions: structural model for y ("how y is created"); assumptions about u: $E[u \mid x] = 0$ (and thus $E[xu] = 0$)
- forecasting approach
  - goal: predict future values of y based on the knowledge of future x
  - assumptions: approximate functional form of the relation y vs. x
- different goals, different assumptions, same formulas for estimates
- the econometric software has only one procedure for all three cases; you have to know what you're doing, check the assumptions, etc.

39 LECTURE 2: SIMPLE REGRESSION I

Simple Regression Model. January 24, 2011

Simple Regression Model. January 24, 2011 Simple Regression Model January 24, 2011 Outline Descriptive Analysis Causal Estimation Forecasting Regression Model We are actually going to derive the linear regression model in 3 very different ways

More information

Introductory Econometrics Exercises for tutorials (Fall 2014)

Introductory Econometrics Exercises for tutorials (Fall 2014) Introductory Econometrics Exercises for tutorials (Fall 2014) Dept. of Econometrics, Uni. of Economics, Prague, zouharj@vse.cz September 23, 2014 Tutorial 1: Review of basic statistical concepts Exercise

More information

Inference in Regression Model

Inference in Regression Model Inference in Regression Model Christopher Taber Department of Economics University of Wisconsin-Madison March 25, 2009 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3

More information

The Simple Regression Model. Simple Regression Model 1

The Simple Regression Model. Simple Regression Model 1 The Simple Regression Model Simple Regression Model 1 Simple regression model: Objectives Given the model: - where y is earnings and x years of education - Or y is sales and x is spending in advertising

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

Regression Analysis with Cross-Sectional Data

Regression Analysis with Cross-Sectional Data 89782_02_c02_p023-072.qxd 5/25/05 11:46 AM Page 23 PART 1 Regression Analysis with Cross-Sectional Data P art 1 of the text covers regression analysis with cross-sectional data. It builds upon a solid

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

LECTURE 15: SIMPLE LINEAR REGRESSION I

LECTURE 15: SIMPLE LINEAR REGRESSION I David Youngberg BSAD 20 Montgomery College LECTURE 5: SIMPLE LINEAR REGRESSION I I. From Correlation to Regression a. Recall last class when we discussed two basic types of correlation (positive and negative).

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Econometrics I Lecture 3: The Simple Linear Regression Model

Econometrics I Lecture 3: The Simple Linear Regression Model Econometrics I Lecture 3: The Simple Linear Regression Model Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 32 Outline Introduction Estimating

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics Lecture 3 : Regression: CEF and Simple OLS Zhaopeng Qu Business School,Nanjing University Oct 9th, 2017 Zhaopeng Qu (Nanjing University) Introduction to Econometrics Oct 9th,

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

1 Impact Evaluation: Randomized Controlled Trial (RCT)

1 Impact Evaluation: Randomized Controlled Trial (RCT) Introductory Applied Econometrics EEP/IAS 118 Fall 2013 Daley Kutzman Section #12 11-20-13 Warm-Up Consider the two panel data regressions below, where i indexes individuals and t indexes time in months:

More information

Lab 6 - Simple Regression

Lab 6 - Simple Regression Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello

More information

Inference in Regression Analysis

Inference in Regression Analysis ECNS 561 Inference Inference in Regression Analysis Up to this point 1.) OLS is unbiased 2.) OLS is BLUE (best linear unbiased estimator i.e., the variance is smallest among linear unbiased estimators)

More information

Regression - Modeling a response

Regression - Modeling a response Regression - Modeling a response We often wish to construct a model to Explain the association between two or more variables Predict the outcome of a variable given values of other variables. Regression

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/

More information

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math. Regression, part II I. What does it all mean? A) Notice that so far all we ve done is math. 1) One can calculate the Least Squares Regression Line for anything, regardless of any assumptions. 2) But, if

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Simple Regression Model (Assumptions)

Simple Regression Model (Assumptions) Simple Regression Model (Assumptions) Lecture 18 Reading: Sections 18.1, 18., Logarithms in Regression Analysis with Asiaphoria, 19.6 19.8 (Optional: Normal probability plot pp. 607-8) 1 Height son, inches

More information
