Final Exam Details. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, / 24

Similar documents
Final Exam - Solutions

Final Exam - Solutions

Announcements. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 February 8, / 45

ECON Introductory Econometrics. Lecture 17: Experiments

Introduction. ECN 102: Analysis of Economic Data Winter, J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 January 4, / 51

ECO220Y Simple Regression: Testing the Slope

ECON Introductory Econometrics. Lecture 16: Instrumental variables

CHAPTER 6: SPECIFICATION VARIABLES

Econometrics Review questions for exam

Announcements. Lecture 18: Simple Linear Regression. Poverty vs. HS graduate rate

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover).

An example to start off with

Absolute Value Inequalities 2.5, Functions 3.6

WISE International Masters

Midterm 2 - Solutions

CPSC 340: Machine Learning and Data Mining. Regularization Fall 2017

Lectures 5 & 6: Hypothesis Testing

Midterm 2 - Solutions

Methods of Mathematics

Chapter 2: simple regression model

Announcements Monday, September 18

Ecn Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman. Midterm 2. Name: ID Number: Section:

LECTURE 15: SIMPLE LINEAR REGRESSION I

IB Physics, Bell Work, Jan 16 19, 2017

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

Introduction to Econometrics

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =

Click to edit Master title style

Physics 321 Theoretical Mechanics I. University of Arizona Fall 2004 Prof. Erich W. Varnes

Topic 4: Model Specifications

The flu example from last class is actually one of our most common transformations called the log-linear model:

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

The response variable depends on the explanatory variable.

MULTIPLE REGRESSION ANALYSIS AND OTHER ISSUES. Business Statistics

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Regression of Inflation on Percent M3 Change

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Instrumental Variables

Physics Fall Semester. Sections 1 5. Please find a seat. Keep all walkways free for safety reasons and to comply with the fire code.

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Sampling and Sample Size. Shawn Cole Harvard Business School

Rockefeller College University at Albany

1 Correlation between an independent variable and the error

ECNS 561 Multiple Regression Analysis

PBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression.

Ability Bias, Errors in Variables and Sibling Methods. James J. Heckman University of Chicago Econ 312 This draft, May 26, 2006

ECON3150/4150 Spring 2016

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Announcements Monday, November 20

MS&E 226: Small Data

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

11 Correlation and Regression

Math 2311 TEST 2 REVIEW SHEET KEY

Methods of Mathematics

Econometrics Summary Algebraic and Statistical Preliminaries

Applied Microeconometrics I

CPSC 340: Machine Learning and Data Mining. Stochastic Gradient Fall 2017

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Outline. Recall... Limits. Problem Solving Sessions. MA211 Lecture 4: Limits and Derivatives Wednesday 17 September Definition (Limit)

Making sense of Econometrics: Basics

ASTRONOMY 10 De Anza College

Econometrics Homework 1

STAT 111 Recitation 7

ECN102: Analysis of Economics Data TA Section

USP/PLSI 493: Methods of Planning Data Analysis (4 credits) San Francisco State University Spring 2011

Econometrics -- Final Exam (Sample)

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Applied Statistics and Econometrics

Making sense of Econometrics: Basics

Regression Analysis Tutorial 34 LECTURE / DISCUSSION. Statistical Properties of OLS

Course Review. Kin 304W Week 14: April 9, 2013

ECON The Simple Regression Model

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

The remains of the course

PHYSICS 15a, Fall 2006 SPEED OF SOUND LAB Due: Tuesday, November 14

The cover page of the Encyclopedia of Health Economics (2014) Introduction to Econometric Application in Health Economics

Announcements. Lecture 10: Relationship between Measurement Variables. Poverty vs. HS graduate rate. Response vs. explanatory

1 Warm-Up: 2 Adjusted R 2. Introductory Applied Econometrics EEP/IAS 118 Spring Sylvan Herskowitz Section #

We ll start today by learning how to change a decimal to a fraction on our calculator! Then we will pick up our Unit 1-5 Review where we left off!

Algebra 1B notes and problems March 12, 2009 Factoring page 1

Motivation for multiple regression


Methods of Mathematics

EMERGING MARKETS - Lecture 2: Methodology refresher

Regression Models - Introduction

Lab 07 Introduction to Econometrics

Notes 6: Multivariate regression ECO 231W - Undergraduate Econometrics

Prerequisite: Math 40 or Math 41B with a minimum grade of C, or the equivalent.

Review. December 4 th, Review

Announcements Wednesday, November 7

MATH 1150 Chapter 2 Notation and Terminology

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression

Violation of OLS assumption - Heteroscedasticity

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

De-mystifying random effects models

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

Social Choice and Networks

Introduction to Machine Learning Fall 2017 Note 5. 1 Overview. 2 Metric

Transcription:

Final Exam Details The final is Thursday, March 17 from 10:30am to 12:30pm in the regular lecture room The final is cumulative (multiple choice will be a roughly 50/50 split between material since the second midterm and old material, short answer will be focused on the new material) The old finals are a good guide to the format and length of the exam as well as the division of the exam between old and new material Office hours during exam week: Monday 2pm-4pm, Tuesday 10am-12pm, Wednesday 10am-12pm J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 1 / 24

Review: Omitted Variable Bias True model: Estimated model: y = β 1 + β 2 x 2 + β 3 x 3 + ε y = b 1 + b 2 x 2 Relationship between x 2 and x 3 : x 3 = γ 1 + γ 2 x 2 + ν Estimated slope coefficient: E( b 2 ) = β 2 + β 3 γ 2 J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 2 / 24

Review: Omitted Variable Bias E( b 2 ) = β 2 + β 3 γ 2 So the estimated slope coefficient for x 2 will be equal to the true value (β 2 ) plus an additional term (β 3 γ 2 ) As long as this additional term is nonzero, we have a biased estimate of the slope coefficient The sign of the bias depends on the sign of β 3 and the sign of γ 2 J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 3 / 24

Including Too Many Variables We ve seen that omitting important variables leads to big problems What if we include too many variables? It s not nearly as bad Our coefficients stay unbiased for the regressors that should be there but we lose some precision These problems are small compared to the problems of omitted variables, so it is best to error on the side of including questionable regressors J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 4 / 24

Non-linear Relationships We ve covered the problems of including the wrong set of variables in our model The other way we can misspecify the model is by using the wrong functional form This is a problem we ve already encountered and we solve it with data transformations One way we ll notice we have a problem is if we get distinct patterns in the residuals plotted against a regressor J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 5 / 24

Non-linear Relationships income age J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 6 / 24

Non-linear Relationships residuals age J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 7 / 24

Badly Behaved Errors We ve just seen that one way we know that the model is misspecified is if a pattern shows up on a graph of the residuals and the regressor This leads us into a new set of problems: badly behaved error terms Several problems can pop up with the error terms: Errors are correlated with the regressors Errors have nonconstant variance Errors are correlated with each other J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 8 / 24

Errors Correlated with the Regressors Recall the problem of omitted variable bias, we used the following (misspecified) regression model: y = b 1 + b 2 x 2 Because we omitted x 3 from the equation and x 3 was correlated with both y and x 2, our estimated coefficient b 2 was a biased estimator of β 2 The error term included x 3 which was correlated with x 2 This means that omitted variable bias is actually one case of when the error term is correlated with the regressors J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 9 / 24

Errors Correlated with the Regressors So omitted variable bias is one situation in which the errors are correlated with the regressor There are a couple of other important situations when we get errors correlated with the regressors and biased coefficients Two of the most common situations: Measurement error in one of the regressors Sample selection bias J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 10 / 24

Measurement Error 182 R. Pedace and N. Bates / Using administrative records to assess earnings reporting error Table 3 SIPP misreporting by earnings category Percent within: Difference (SIPP SER): Earnings Category 5% 10% 25% Mean Median Num. of Obs. < $5,000 9.1 17.0 35.9 270.86 2,924.47 3,637 $5 $10,000 17.1 31.3 58.8 30.45 1,465.61 2,644 $10 $15,000 23.4 42.6 72.4 283.43 815.17 2,392 $15 $20,000 26.6 48.3 78.2 663.76 10.69 2,225 $20 $25,000 29.0 51.9 81.2 1,064.16 697.25 1,986 $25 $30,000 29.2 49.2 82.2 1,715.47 1,779.05 1,491 $30 $35,000 27.2 47.2 80.5 2,313.97 2,768.10 1,130 $35 $40,000 30.8 50.5 80.8 2,741.55 3,763.70 956 $40 $45,000 28.0 47.0 78.7 3,560.01 4,761.25 771 $45 $50,000 29.6 45.4 78.3 4,078.39 5,918.33 527 > $50,000 61.7 70.6 85.7 0 5,274.04 1,758 Pedace, Roberto. Using administrative records to assess earnings reporting error in the survey of income and program participation. Journal of Economic and Social Measurement, 26(3/4). -SER), $ Mean and Median Misreporting by Earning Category 4000.0 3000.0 2000.0 J. Parman (UC-Davis) 1000.0 Analysis of Economic Data, Winter 2011 March 8, 2011 11 / 24

Measurement Error 0.3 lative frequency Rel 0.25 0.2 0.15 0.1 0.05 0 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 Death certificate birthyear childhood census birthyear J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 12 / 24

Measurement Error Sometimes we can t measure things as precisely as we d like So while we might be interested in: y = β 1 + β 2 x + ε we re actually regressing y on x = x + ν where ν is some random error Why is this a problem? It will lead to error terms negatively correlated with the regressor x: y = β 1 + β 2 ( x ν) + ε y = β 1 + β 2 x + ( β 2 ν + ε) Regressing y on x will give a biased estimate of β 2 J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 13 / 24

Measurement Error With this kind of random measurement error, our estimate b 2 will actually be biased toward zero This can be shown with some math, but there is also simple intuition behind it The more random measurement error we have in x, the weaker the relationship between x and y will appear to be In the extreme case, if the measurement error is very large compared to the actual variation in x, differences in the observed values of x won t appear to have any relationship with y J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 14 / 24

Measurement Error What can we do about measurement error? Find a better way of measuring the variable we re interested in Take multiple measurements and average them If we can t improve the measurement, we need to consider the bias when we interpret the results J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 15 / 24

Measurement Error What if we have measurement error in y so we actually observe ỹ = y + ν? This is less of a problem: y = β 1 + β 2 x + ε ỹ ν = β 1 + β 2 x + ε ỹ = β 1 + β 2 x + (ε + ν) If we regress ỹ on x, we can still get an unbiased estimate of β 2 since x is still uncorrelated with the error term We do lose precision (the variance of the error term is bigger) J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 16 / 24

Sample Selection Bias J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 17 / 24

Sample Selection Bias If our chosen sample is not representative of the population, we can get a sample selection bias Often our observations look more similar to each other than a random selection from the population should This means that the observations in the sample may share unobserved characteristics that are correlated with our outcome variable and the regressors The coefficients we estimate will be capturing the relationship between y and x for the chosen sample and may not be unbiased estimates of the true population values J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 18 / 24

Sources of Sample Selection Bias We can get sample selection bias for a variety of reasons Our way of choosing the sample can favor one group over another (eg. phone survey vs internet survey) We often only get observations for a single geographical area or a single period in time Non-response on surveys or attrition is often not random J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 19 / 24

Sample Selection Bias: Non-random Samples Landline & cell, interviewed on: Landline only Landline Cell Cell only 18 29 11 12 17 47 30 49 21 37 41 36 50 64 27 31 29 12 65+ 38 18 12 4 College grad 23 42 40 25 Some college 22 25 24 28 HS grad 40 28 29 35 Some HS 14 5 6 12 Renter 28 15 20 60 From Pew Research Center, "Costs and benefits of full dual frame telephone survey designs", 2008 J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 20 / 24

Sample Selection Bias: Dewey vs Truman, Bush vs Gore J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 21 / 24

Sample Selection Bias: Attrition Morrison et. al. Tracking and follow-up of 16,915 adolescents: minimizing attrition bais. Controlled Clinical Trials, 18 (1997) J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 22 / 24

Sample Selection Bias: Attrition Morrison et. al. Tracking and follow-up of 16,915 adolescents: minimizing attrition bais. Controlled Clinical Trials, 18 (1997) J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 23 / 24

Dealing with Sample Selection Bias First is simply figuring out whether you have a problem: think about why your sample may not be representative of the population, check some of the observable characteristics of your sample to see if the sample looks random Try to determine the direction of the bias resulting from any sample selection issues (use some economic intuition) You can try to gather more data using a better sampling strategy J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, 2011 24 / 24