Regression with a Binary Dependent Variable (SW Ch. 9)

So far the dependent variable (Y) has been continuous:
o district-wide average test score
o traffic fatality rate
But we might want to understand the effect of X on a binary variable:
o Y = get into college, or not
o Y = person smokes, or not
o Y = mortgage application is accepted, or not

Example: Mortgage denial and race, the Boston Fed HMDA data set

o Individual applications for single-family mortgages made in 1990 in the greater Boston area
o 2380 observations, collected under the Home Mortgage Disclosure Act (HMDA)
Variables:
o Dependent variable: Is the mortgage denied or accepted?
o Independent variables: income, wealth, employment status; other loan and property characteristics; race of applicant

The Linear Probability Model (SW Section 9.1)

A natural starting point is the linear regression model with a single regressor:
Yi = β0 + β1Xi + ui
But:
o What does β1 mean when Y is binary? Is β1 = ΔY/ΔX?
o What does the line β0 + β1X mean when Y is binary?
o What does the predicted value Ŷ mean when Y is binary? For example, what does Ŷ = 0.26 mean?

The linear probability model, ctd.

Yi = β0 + β1Xi + ui
Recall assumption #1: E(ui|Xi) = 0, so
E(Yi|Xi) = E(β0 + β1Xi + ui|Xi) = β0 + β1Xi
When Y is binary,
E(Y) = 1×Pr(Y=1) + 0×Pr(Y=0) = Pr(Y=1)
so E(Y|X) = Pr(Y=1|X)

The linear probability model, ctd.

When Y is binary, the linear regression model
Yi = β0 + β1Xi + ui
is called the linear probability model.
o The predicted value is a probability:
  E(Y|X=x) = Pr(Y=1|X=x) = the probability that Y = 1 given x
  Ŷ = the predicted probability that Yi = 1, given X
o β1 = the change in the probability that Y = 1 for a given Δx:
  β1 = [Pr(Y = 1|X = x + Δx) − Pr(Y = 1|X = x)] / Δx

Example: linear probability model, HMDA data Mortgage denial v. ratio of debt payments to income (P/I ratio) in the HMDA data set

Linear probability model: HMDA data

deny = -.080 + .604 × (P/I ratio)   (n = 2380)
       (.032)  (.098)
What is the predicted value for P/I ratio = .3?
Pr(deny = 1|P/I ratio = .3) = -.080 + .604×.3 = .101
Calculating "effects": increase the P/I ratio from .3 to .4:
Pr(deny = 1|P/I ratio = .4) = -.080 + .604×.4 = .162

The effect on the probability of denial of an increase in the P/I ratio from .3 to .4 is to increase the probability by .060 (= .604 × .1), that is, by about 6 percentage points.
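For readers who want to reproduce numbers like these, here is a minimal sketch of estimating a linear probability model with heteroskedasticity-robust standard errors in Python (statsmodels). The file name "hmda.csv" and the column names "deny" and "pi_ratio" are assumptions for illustration, not the actual names in the Boston Fed HMDA file.

```python
# Minimal LPM sketch: OLS of a binary outcome on the P/I ratio with
# heteroskedasticity-robust (HC1) standard errors. File and column names
# ("hmda.csv", "deny", "pi_ratio") are hypothetical.
import pandas as pd
import statsmodels.api as sm

hmda = pd.read_csv("hmda.csv")                       # 2380 mortgage applications
X = sm.add_constant(hmda["pi_ratio"])                # debt-payments-to-income ratio
lpm = sm.OLS(hmda["deny"], X).fit(cov_type="HC1")    # robust SEs, as the LPM requires
print(lpm.params)                                    # should be close to (-.080, .604)

# Effect of raising the P/I ratio from .3 to .4 = difference in fitted values
b0, b1 = lpm.params
print((b0 + b1 * 0.4) - (b0 + b1 * 0.3))             # = .604 * .1, about .06
```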

Next include race as a regressor deny = -.091 +.559P/I ratio +.177black (.032) (.098) (.025) Predicted probability of denial: for black applicant with P/I ratio =.3: Pr (deny = 1) = -.091 +.559x.3 +.177x1 =.254 for white applicant, P/I ratio =.3: Pr (deny = 1) = -.091 +.559x.3 +.177x0 =.077

difference =.177 = 17.7 percentage points Coefficient on black is significant at the 5% level Still plenty of room for omitted variable bias

The linear probability model: Summary

o Models the probability that Y = 1 as a linear function of X
o Advantages:
  - simple to estimate and to interpret
  - inference is the same as for multiple regression (need heteroskedasticity-robust standard errors)
o Disadvantages:
  - Does it make sense that the probability should be linear in X?
  - Predicted probabilities can be < 0 or > 1! (see the illustration below)
These disadvantages can be solved by using a nonlinear probability model: probit and logit regression.
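To see the out-of-range problem concretely, plug extreme values of the P/I ratio into the fitted line reported above; this is pure arithmetic with the published coefficients.

```python
# The LPM's defect, using the fitted line deny = -.080 + .604 * (P/I ratio):
# for extreme P/I ratios the fitted "probability" leaves the [0, 1] interval.
b0, b1 = -0.080, 0.604
print(b0 + b1 * 0.30)   # 0.101  -- a sensible probability
print(b0 + b1 * 2.00)   # 1.128  -- greater than 1
print(b0 + b1 * 0.05)   # -0.050 -- negative
```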

Probit and Logit Regression (SW Section 9.2)

The problem with the linear probability model is that it models the probability of Y = 1 as being linear:
Pr(Y = 1|X) = β0 + β1X
Instead, we want:
o 0 ≤ Pr(Y = 1|X) ≤ 1 for all X
o Pr(Y = 1|X) to be increasing in X (for β1 > 0)
This requires a nonlinear functional form for the probability. How about an S-curve?

The probit model satisfies these conditions:
o 0 ≤ Pr(Y = 1|X) ≤ 1 for all X
o Pr(Y = 1|X) is increasing in X (for β1 > 0)

Probit regression models the probability that Y = 1 using the cumulative standard normal distribution function, evaluated at z = β0 + β1X:
Pr(Y = 1|X) = Φ(β0 + β1X)
o Φ is the cumulative standard normal distribution function.
o z = β0 + β1X is the z-value or z-index of the probit model.

Example: Suppose β0 = -2, β1 = 3, X = .4, so
Pr(Y = 1|X = .4) = Φ(-2 + 3×.4) = Φ(-0.8)
Pr(Y = 1|X = .4) = the area under the standard normal density to the left of z = -.8, which is
Pr(Z ≤ -0.8) = .2119
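Instead of the printed normal tables, the same number can be read off the standard normal CDF in code; a one-line check of the example above (scipy assumed available):

```python
# Check of the worked probit example: Phi(-2 + 3*0.4) = Phi(-0.8)
from scipy.stats import norm
print(norm.cdf(-2 + 3 * 0.4))   # 0.2119 (to 4 decimals)
```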

Probit regression, ctd.

Why use the cumulative normal probability distribution?
o The S-shape gives us what we want:
  - 0 ≤ Pr(Y = 1|X) ≤ 1 for all X
  - Pr(Y = 1|X) increasing in X (for β1 > 0)
o Easy to use: the probabilities are tabulated in the cumulative normal tables
o Relatively straightforward interpretation:
  - z-value = β0 + β1X
  - β̂0 + β̂1X is the predicted z-value, given X
  - β1 is the change in the z-value for a unit change in X

Example: HMDA data, ctd.

Pr(deny = 1|P/I ratio) = Φ(-2.19 + 2.97 × P/I ratio)
                           (.16)  (.47)
o Positive coefficient: does this make sense?
o Standard errors have the usual interpretation
o Predicted probabilities:
Pr(deny = 1|P/I ratio = .3) = Φ(-2.19 + 2.97×.3) = Φ(-1.30) = .097

Effect of a change in the P/I ratio from .3 to .4:
Pr(deny = 1|P/I ratio = .4) = Φ(-2.19 + 2.97×.4) = .159
The predicted probability of denial rises from .097 to .159.
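A sketch of this probit regression in statsmodels follows; as before, the file and column names are hypothetical stand-ins for the HMDA variables.

```python
# Probit of denial on the P/I ratio; the names "hmda.csv", "deny", "pi_ratio"
# are hypothetical. Effects are computed as differences in probabilities.
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm

hmda = pd.read_csv("hmda.csv")
X = sm.add_constant(hmda["pi_ratio"])
probit = sm.Probit(hmda["deny"], X).fit()
b0, b1 = probit.params                     # should be close to (-2.19, 2.97)

# Effect of raising the P/I ratio from .3 to .4:
print(norm.cdf(b0 + b1 * 0.4) - norm.cdf(b0 + b1 * 0.3))   # about .159 - .097
```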

Probit regression with multiple regressors

Pr(Y = 1|X1, X2) = Φ(β0 + β1X1 + β2X2)
o Φ is the cumulative standard normal distribution function.
o z = β0 + β1X1 + β2X2 is the z-value or z-index of the probit model.
o β1 is the effect on the z-score of a unit change in X1, holding X2 constant.
We'll go through the estimation details later.

Example: HMDA data, ctd.

Pr(deny = 1|P/I, black) = Φ(-2.26 + 2.74 × P/I ratio + .71 × black)
                            (.16)  (.44)               (.08)
o Is the coefficient on black statistically significant?
o Estimated effect of race for P/I ratio = .3:
Pr(deny = 1|.3, 1) = Φ(-2.26 + 2.74×.3 + .71×1) = .233
Pr(deny = 1|.3, 0) = Φ(-2.26 + 2.74×.3 + .71×0) = .075

o Difference in rejection probabilities = .158 (15.8 percentage points), reproduced in the sketch below
o Still plenty of room for omitted variable bias
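The race gap can be reproduced directly from the reported coefficients; no data are needed, just the standard normal CDF:

```python
# Race gap in predicted denial probabilities at P/I ratio = .3, computed
# from the probit coefficients reported above (typed in by hand).
from scipy.stats import norm

b0, b_pi, b_black = -2.26, 2.74, 0.71
p_black = norm.cdf(b0 + b_pi * 0.3 + b_black)   # about .233
p_white = norm.cdf(b0 + b_pi * 0.3)             # about .075
print(p_black - p_white)                        # about .158
```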

Logit regression

Logit regression models the probability of Y = 1 as the cumulative standard logistic distribution function, evaluated at z = β0 + β1X:
Pr(Y = 1|X) = F(β0 + β1X)
F is the cumulative logistic distribution function:
F(β0 + β1X) = 1 / (1 + e^−(β0 + β1X))

Logistic regression, ctd.

Pr(Y = 1|X) = F(β0 + β1X)
where F(β0 + β1X) = 1 / (1 + e^−(β0 + β1X)).
Example: β0 = -3, β1 = 2, X = .4, so
β0 + β1X = -3 + 2×.4 = -2.2
Pr(Y = 1|X = .4) = 1/(1 + e^2.2) = .0998
Why bother with logit if we have probit?
o Historically, logit was numerically more convenient
o In practice, the two are very similar
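The logit example is easy to verify numerically (numpy assumed available):

```python
# Check of the worked logit example: F(-2.2) = 1 / (1 + e^{2.2})
import numpy as np
z = -3 + 2 * 0.4                 # = -2.2
print(1 / (1 + np.exp(-z)))      # 0.0998
```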

Predicted probabilities from estimated probit and logit models usually are very close.
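As a check of this claim on data shaped like the HMDA example (hypothetical file and column names again), one can fit both models with statsmodels and compare their fitted probabilities:

```python
# Fit probit and logit on the same regressor and compare fitted probabilities;
# file and column names are hypothetical. The maximum disagreement is
# typically small, often around a percentage point or less.
import numpy as np
import pandas as pd
import statsmodels.api as sm

hmda = pd.read_csv("hmda.csv")
X = sm.add_constant(hmda["pi_ratio"])
p_probit = sm.Probit(hmda["deny"], X).fit().predict(X)
p_logit = sm.Logit(hmda["deny"], X).fit().predict(X)
print(np.abs(p_probit - p_logit).max())
```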