Modeling Binary Outcomes: Logit and Probit Models


Eric Zivot
December 5, 2009

Motivating Example: Women's labor force participation

$y_i = 1$ if married woman is in labor force, $y_i = 0$ otherwise
$x_i$ ($k \times 1$) = observed covariates

Linear probability model formulation:

$$y_i = x_i'\beta + \varepsilon_i, \quad i = 1, \ldots, n$$

Note:

$$y_i = 1 \Rightarrow \varepsilon_i = 1 - x_i'\beta$$
$$y_i = 0 \Rightarrow \varepsilon_i = -x_i'\beta$$

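Interpretation of regression model

$$E[y_i | x_i] = 1 \cdot \Pr(y_i = 1 | x_i) + 0 \cdot \Pr(y_i = 0 | x_i) = \Pr(y_i = 1 | x_i) = x_i'\beta$$

and

$$\frac{\partial E[y_i | x_i]}{\partial x_{ki}} = \frac{\partial \Pr(y_i = 1 | x_i)}{\partial x_{ki}} = \beta_k$$

Note: $y_i | x_i$ is heteroskedastic:

$$\mathrm{var}(y_i | x_i) = \Pr(y_i = 1 | x_i)\Pr(y_i = 0 | x_i) = x_i'\beta\,(1 - x_i'\beta)$$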

Problems with linear probability model

1. $\varepsilon_i | x_i$ cannot be normally distributed
2. Predicted probabilities can be less than zero or greater than one
3. Constant marginal effects, $\partial \Pr(y_i = 1 | x_i) / \partial x_i = \beta$, are often an unrealistic assumption
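Problem 2 can be illustrated with a small sketch. The data below are hypothetical toy values (not from the slides), and the fit is plain OLS with one covariate, computed by hand with the standard library.

```python
# Hypothetical toy data: one covariate x and a binary outcome y.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for the linear probability model y_i = b0 + b1 * x_i + e_i.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted "probabilities" are linear in x, so nothing keeps them inside [0, 1].
fitted = [b0 + b1 * xi for xi in x]
outside = [p for p in fitted if p < 0.0 or p > 1.0]
print(fitted)
print(outside)
```

With these toy values the fitted line dips below zero at the smallest x and rises above one at the largest x.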

Latent Variable Formulation

$y_i = 0, 1$: observed discrete response
$y_i^* \in \mathbb{R}$: unobserved (latent) continuous index

Idea: Large values of $y_i^*$ generate $y_i = 1$, and small values generate $y_i = 0$.

Example: Labor force participation

$y_i = 1$ if married woman is in labor force, $y_i = 0$ otherwise
$y_i^*$ = unobserved propensity to work based on utility of choice
$x_i$ = observed variables that influence utility
$y_i = 1$ if utility of work is greater than utility of leisure

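Assume $y_i^*$ has a linear model representation

$$y_i^* = x_i'\beta + \varepsilon_i, \quad i = 1, \ldots, n$$

The relationship between $y_i$ and $y_i^*$ is

$y_i = 1$ if $y_i^* > 0$ (normalized threshold)
$y_i = 0$ if $y_i^* \le 0$

Then

$$\Pr(y_i = 1 | x_i) = \Pr(y_i^* > 0 | x_i) = \Pr(x_i'\beta + \varepsilon_i > 0 | x_i)$$
$$= \Pr(\varepsilon_i > -x_i'\beta | x_i) = \Pr(\varepsilon_i \le x_i'\beta | x_i) = F_\varepsilon(x_i'\beta)$$

provided $\varepsilon_i$ has a distribution $F_\varepsilon$ that is symmetric about zero. Similarly,

$$\Pr(y_i = 0 | x_i) = \Pr(y_i^* \le 0 | x_i) = \Pr(\varepsilon_i \le -x_i'\beta | x_i) = 1 - F_\varepsilon(x_i'\beta)$$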

Remarks

1. The latent variable formulation provides a non-linear probability model for $\Pr(y_i | x_i)$.
2. By construction, $\Pr(y_i | x_i) \in (0, 1)$ because it is based on $F_\varepsilon$.
3. $\lim_{x_i'\beta \to \infty} \Pr(y_i = 1 | x_i) = \lim_{x_i'\beta \to \infty} F_\varepsilon(x_i'\beta) = 1$ and $\lim_{x_i'\beta \to -\infty} \Pr(y_i = 1 | x_i) = 0$.
4. To make the model operational requires specifying $F_\varepsilon$.
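The latent variable mechanism can be checked by simulation. The sketch below assumes standard normal errors as one symmetric choice for $F_\varepsilon$; the index value 0.4 and the number of draws are hypothetical illustration values.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

random.seed(0)
index = 0.4        # hypothetical value of x_i' beta
n_draws = 100_000  # hypothetical simulation size

# Draw the latent index y* = x'beta + eps and record y = 1 when y* > 0.
hits = 0
for _ in range(n_draws):
    eps = random.gauss(0.0, 1.0)
    y_star = index + eps
    if y_star > 0.0:
        hits += 1

simulated = hits / n_draws       # share of draws with y = 1
theoretical = normal_cdf(index)  # F_eps(x'beta) from the derivation
print(simulated, theoretical)
```

The simulated share of ones should be close to $F_\varepsilon(x_i'\beta)$, up to simulation noise.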

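Probit Model

Assume $\varepsilon_i \sim N(0, 1)$. Then

$$\Pr(y_i = 1 | x_i) = F_\varepsilon(x_i'\beta) = \Phi(x_i'\beta)$$
$$\Pr(y_i = 0 | x_i) = 1 - F_\varepsilon(x_i'\beta) = 1 - \Phi(x_i'\beta)$$
$$\Phi(z) = \int_{-\infty}^{z} \phi(x)\,dx, \qquad \phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}x^2\right)$$

Note: If $\varepsilon_i \sim N(0, \sigma^2)$ then $\beta$ and $\sigma$ are not separately identified. Only the ratio $\beta/\sigma$ is identified. Hence, $\sigma^2 = 1$ is an identifying assumption for $\beta$.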
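A minimal sketch of the probit probability and of the identification note, using only the standard library. The function names and the numeric values are hypothetical; with $\varepsilon_i \sim N(0, \sigma^2)$ the probability is $\Phi(x_i'\beta/\sigma)$, so scaling the index and $\sigma$ by the same factor changes nothing.

```python
import math

def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal CDF Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(index, sigma=1.0):
    """Pr(y = 1 | x) when eps ~ N(0, sigma^2) and index = x' beta."""
    return normal_cdf(index / sigma)

# Doubling both the index (i.e. beta) and sigma leaves the probability unchanged,
# so the data can only pin down the ratio beta / sigma.
p_a = probit_prob(0.8, sigma=1.0)
p_b = probit_prob(1.6, sigma=2.0)
print(p_a, p_b)
```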

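Logit Model

Assume $\varepsilon_i \sim$ Logistic. Then

$$\Pr(y_i = 1 | x_i) = F_\varepsilon(x_i'\beta) = \Lambda(x_i'\beta)$$
$$\Pr(y_i = 0 | x_i) = 1 - F_\varepsilon(x_i'\beta) = 1 - \Lambda(x_i'\beta)$$
$$\Lambda(z) = \int_{-\infty}^{z} \lambda(x)\,dx = \frac{\exp(z)}{1 + \exp(z)} = \frac{1}{\exp(-z) + 1}$$
$$\lambda(z) = \frac{d}{dz}\Lambda(z) = \Lambda(z)(1 - \Lambda(z)) = \frac{\exp(z)}{(1 + \exp(z))^2}$$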
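The logistic CDF and density translate directly into code. This is a sketch with hypothetical function names; the two-branch form of the CDF is a common way to avoid overflow in `exp` for large $|z|$ and is an implementation choice, not something stated on the slide.

```python
import math

def logistic_cdf(z):
    """Lambda(z) = exp(z) / (1 + exp(z)), evaluated without overflow."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logistic_pdf(z):
    """lambda(z) = Lambda(z) * (1 - Lambda(z))."""
    p = logistic_cdf(z)
    return p * (1.0 - p)

# Compare the density with the closed form exp(z) / (1 + exp(z))^2 at one point.
z = 0.7
closed_form = math.exp(z) / (1.0 + math.exp(z)) ** 2
print(logistic_pdf(z), closed_form)
```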

Remarks

1. If $\varepsilon_i \sim$ Logistic then $E[\varepsilon_i] = 0$ and $\mathrm{var}(\varepsilon_i) = \pi^2/3 \approx 3.29$. The logistic distribution is similar in shape to a Student's t distribution with 7 degrees of freedom.
2. Logit and probit probabilities are essentially the same in the middle of the distribution but differ slightly in the tails of the distribution.
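Remark 2 can be checked numerically. The sketch below rescales the probit index by the logistic standard deviation $\pi/\sqrt{3}$ so the two error distributions have the same variance; that rescaling, the grid, and the tail point 5.0 are assumptions made for the comparison.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

LOGISTIC_SD = math.pi / math.sqrt(3.0)  # sqrt(pi^2 / 3)

# Largest gap between the two probability curves on a grid from -6 to 6.
grid = [i / 10.0 for i in range(-60, 61)]
gaps = [abs(logistic_cdf(z) - normal_cdf(z / LOGISTIC_SD)) for z in grid]
max_gap = max(gaps)

# Upper-tail probabilities at a point far from the center.
logit_tail = 1.0 - logistic_cdf(5.0)
probit_tail = 1.0 - normal_cdf(5.0 / LOGISTIC_SD)
print(max_gap, logit_tail, probit_tail)
```

Under this scaling the curves stay within a few percentage points of each other, while the logistic tail probability is the larger of the two.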

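Marginal Effects in Latent Variable Formulation

In the latent variable formulation

$$\Pr(y_i = 1 | x_i) = F_\varepsilon(x_i'\beta)$$

Then

$$\frac{\partial \Pr(y_i = 1 | x_i)}{\partial x_{ki}} = \frac{\partial F_\varepsilon(x_i'\beta)}{\partial x_{ki}} = f_\varepsilon(x_i'\beta)\,\beta_k$$

where $f_\varepsilon$ is the pdf for $\varepsilon$. For the probit and logit models

$$\text{Probit}: \quad \frac{\partial \Pr(y_i = 1 | x_i)}{\partial x_{ki}} = \phi(x_i'\beta)\,\beta_k$$
$$\text{Logit}: \quad \frac{\partial \Pr(y_i = 1 | x_i)}{\partial x_{ki}} = \lambda(x_i'\beta)\,\beta_k$$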

Remarks

1. Marginal effects are non-linear functions of $x_i$ and $\beta$.
2. The marginal effect of $x_{ki}$ depends on the value of $x_i = (x_{1i}, \ldots, x_{ki})'$, the value of $\beta = (\beta_1, \ldots, \beta_k)'$ and the value of $\beta_k$.
3. Because $f_\varepsilon(\cdot) > 0$, the sign of $\beta_k$ determines the sign of the marginal effect.
4. Estimated standard errors for marginal effects require the delta-method.
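A small sketch of the marginal effect formula $f_\varepsilon(x_i'\beta)\beta_k$ for both models. The coefficient vector and covariate values are hypothetical, and `marginal_effects` is an illustrative helper name.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)

def marginal_effects(beta, x, pdf):
    """Return f(x'beta) * beta_k for every k, evaluated at the covariate vector x."""
    index = sum(b * v for b, v in zip(beta, x))
    return [pdf(index) * b for b in beta]

beta = [-1.0, 0.5, -0.25]  # hypothetical coefficients
x_i = [1.0, 2.0, 1.0]      # hypothetical covariate values

probit_me = marginal_effects(beta, x_i, normal_pdf)
logit_me = marginal_effects(beta, x_i, logistic_pdf)
print(probit_me)
print(logit_me)
```

Each effect has the sign of its coefficient, and the magnitudes change when `x_i` changes because the density is evaluated at a different index.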

Maximum Likelihood Estimation

Observe a random sample $\{(y_1, x_1), \ldots, (y_n, x_n)\}$ and assume that it is generated from the latent variable formulation of the binary response model. Then $y_i | x_i$ is a Bernoulli random variable with conditional probabilities

$$\pi_i = \Pr(y_i = 1 | x_i) = F_\varepsilon(x_i'\beta)$$
$$1 - \pi_i = \Pr(y_i = 0 | x_i) = 1 - F_\varepsilon(x_i'\beta)$$

The likelihood and log-likelihood functions are

$$L(\beta | y, X) = \prod_{i=1}^{n} \pi_i^{y_i}(1 - \pi_i)^{1 - y_i} = \prod_{i=1}^{n} F_\varepsilon(x_i'\beta)^{y_i}\left(1 - F_\varepsilon(x_i'\beta)\right)^{1 - y_i}$$
$$\ln L(\beta | y, X) = \sum_{i=1}^{n} \left\{ y_i \ln\left(F_\varepsilon(x_i'\beta)\right) + (1 - y_i) \ln\left(1 - F_\varepsilon(x_i'\beta)\right) \right\}$$
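The log-likelihood can be written as one function that takes the CDF $F_\varepsilon$ as an argument. This is a sketch: the function names and the three-observation data set are hypothetical.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(beta, X, y, cdf):
    """Sum of y_i * ln F(x_i'beta) + (1 - y_i) * ln(1 - F(x_i'beta))."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        p = cdf(sum(b * v for b, v in zip(beta, x_i)))
        total += y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return total

# Hypothetical data: first column is a constant, second is one covariate.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 1, 1]

ll_logit = log_likelihood([0.0, 0.0], X, y, logistic_cdf)
ll_probit = log_likelihood([0.0, 0.0], X, y, normal_cdf)
print(ll_logit, ll_probit)
```

At $\beta = 0$ both models assign probability 0.5 to every observation, so both values equal $n \ln(0.5)$.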

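The FOCs that define the MLE are

$$0 = \frac{\partial \ln L(\hat\beta_{mle} | y, X)}{\partial \beta} = S(\hat\beta_{mle} | y, X) = \sum_{i=1}^{n} S_i(\hat\beta_{mle} | y_i, x_i)$$
$$= \sum_{i=1}^{n} \left\{ y_i \frac{\partial \ln\left(F_\varepsilon(x_i'\hat\beta_{mle})\right)}{\partial \beta} + (1 - y_i) \frac{\partial \ln\left(1 - F_\varepsilon(x_i'\hat\beta_{mle})\right)}{\partial \beta} \right\}$$
$$= \sum_{i=1}^{n} \left\{ y_i \frac{f_\varepsilon(x_i'\hat\beta_{mle})}{F_\varepsilon(x_i'\hat\beta_{mle})} x_i + (1 - y_i) \frac{-f_\varepsilon(x_i'\hat\beta_{mle})}{1 - F_\varepsilon(x_i'\hat\beta_{mle})} x_i \right\}$$
$$= \sum_{i=1}^{n} \left[ y_i \frac{f_\varepsilon(x_i'\hat\beta_{mle})}{F_\varepsilon(x_i'\hat\beta_{mle})} - (1 - y_i) \frac{f_\varepsilon(x_i'\hat\beta_{mle})}{1 - F_\varepsilon(x_i'\hat\beta_{mle})} \right] x_i$$

These are $k$ non-linear equations in $k$ unknowns. No analytic solution exists.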

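The Newton-Raphson iteration is

$$\hat\beta_{n+1} = \hat\beta_n - H(\hat\beta_n | y, X)^{-1} S(\hat\beta_n | y, X)$$
$$H(\beta | y, X) = \frac{\partial^2 \ln L(\beta | y, X)}{\partial \beta\,\partial \beta'}$$

where the subscript on $\hat\beta$ indexes the iteration.

Remarks

1. For the logit model, analytic derivatives are easy to determine:

$$S(\beta | y, X) = \sum_{i=1}^{n} \left(y_i - \Lambda(x_i'\beta)\right) x_i$$
$$H(\beta | y, X) = -\sum_{i=1}^{n} \Lambda(x_i'\beta)\left(1 - \Lambda(x_i'\beta)\right) x_i x_i'$$

The Hessian is independent of $y_i$ and is always negative definite. Hence, $\ln L(\beta | y, X)$ is globally concave and a unique maximum exists.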
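The iteration can be sketched for a logit model with an intercept and one slope, so the Hessian is 2 x 2 and can be inverted by hand. The data, starting value, tolerance and function names are hypothetical; the data are chosen so that they are not perfectly separated.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def score_and_hessian(beta, X, y):
    """Logit score S = sum (y_i - L_i) x_i and Hessian H = -sum L_i (1 - L_i) x_i x_i'."""
    k = len(beta)
    S = [0.0] * k
    H = [[0.0] * k for _ in range(k)]
    for x_i, y_i in zip(X, y):
        p = logistic_cdf(sum(b * v for b, v in zip(beta, x_i)))
        for a in range(k):
            S[a] += (y_i - p) * x_i[a]
            for c in range(k):
                H[a][c] -= p * (1.0 - p) * x_i[a] * x_i[c]
    return S, H

def newton_raphson_logit(X, y, tol=1e-10, max_iter=50):
    """Newton-Raphson for a two-parameter logit, starting from beta = 0."""
    beta = [0.0, 0.0]
    for _ in range(max_iter):
        S, H = score_and_hessian(beta, X, y)
        det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
        # step = H^{-1} S, using the explicit inverse of a 2 x 2 matrix.
        step0 = (H[1][1] * S[0] - H[0][1] * S[1]) / det
        step1 = (-H[1][0] * S[0] + H[0][0] * S[1]) / det
        beta = [beta[0] - step0, beta[1] - step1]
        if max(abs(step0), abs(step1)) < tol:
            break
    return beta

# Hypothetical data: constant plus one covariate.
X = [[1.0, float(v)] for v in range(8)]
y = [0, 0, 0, 1, 0, 1, 1, 1]

beta_hat = newton_raphson_logit(X, y)
final_score, _ = score_and_hessian(beta_hat, X, y)
print(beta_hat, final_score)
```

At convergence the score is numerically zero, which is the first-order condition for the MLE. For more than two parameters the hand-written inverse would be replaced by a linear solver.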

2. For the probit model, analytic derivatives are also available:

$$S(\beta | y, X) = \sum_{i=1}^{n} m_i x_i$$
$$H(\beta | y, X) = -\sum_{i=1}^{n} m_i \left(m_i + x_i'\beta\right) x_i x_i'$$

where

$$m_i = \frac{q_i\,\phi(q_i x_i'\beta)}{\Phi(q_i x_i'\beta)}, \qquad q_i = 2y_i - 1$$

It can be shown that $H(\beta | y, X)$ is negative definite for all $\beta$, so that $\ln L(\beta | y, X)$ is globally concave and a unique maximum exists.
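The probit score formula can be checked against a numerical derivative of the log-likelihood. This is a sketch with hypothetical data, a hypothetical evaluation point `beta`, and illustrative function names; it uses $\ln L = \sum_i \ln \Phi(q_i x_i'\beta)$, which is the same log-likelihood written with $q_i = 2y_i - 1$.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_loglik(beta, X, y):
    """ln L = sum ln Phi(q_i x_i'beta) with q_i = 2 y_i - 1."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        q = 2 * y_i - 1
        total += math.log(normal_cdf(q * sum(b * v for b, v in zip(beta, x_i))))
    return total

def probit_score(beta, X, y):
    """S = sum m_i x_i with m_i = q_i phi(q_i x_i'beta) / Phi(q_i x_i'beta)."""
    S = [0.0] * len(beta)
    for x_i, y_i in zip(X, y):
        q = 2 * y_i - 1
        z = q * sum(b * v for b, v in zip(beta, x_i))
        m = q * normal_pdf(z) / normal_cdf(z)
        for a in range(len(beta)):
            S[a] += m * x_i[a]
    return S

X = [[1.0, float(v)] for v in range(8)]  # hypothetical data
y = [0, 0, 0, 1, 0, 1, 1, 1]
beta = [-1.0, 0.3]                       # hypothetical evaluation point

analytic = probit_score(beta, X, y)

# Central finite differences of the log-likelihood, one coordinate at a time.
h = 1e-6
numeric = []
for a in range(len(beta)):
    up = list(beta)
    dn = list(beta)
    up[a] += h
    dn[a] -= h
    numeric.append((probit_loglik(up, X, y) - probit_loglik(dn, X, y)) / (2.0 * h))
print(analytic, numeric)
```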

Measuring Goodness-of-fit in Binary Response Models

1. McFadden's Likelihood Ratio Index ($R^2$)

$$LRI = R^2_{McFadden} = 1 - \frac{\ln L(\hat\beta_{mle} | y, X)}{\ln L(\text{all slopes} = 0 | y, X)}$$

$LRI \approx 0$ if the model with all slopes = 0 fits the data as well as the model with estimated slopes.

2. Prediction Table

2 x 2 table of hits and misses of the prediction rule: classify $\hat{y}_i = 1$ if $\widehat{\Pr}(y_i = 1 | x_i) >$ cutoff probability.
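Both measures are easy to compute once fitted probabilities are available. In the sketch below the fitted probabilities `p_hat` are hypothetical numbers standing in for the output of an estimated model, and the cutoff 0.5 is an assumed choice. The restricted log-likelihood uses the fact that an intercept-only logit or probit fits the sample mean of $y$ to every observation.

```python
import math

y = [0, 0, 0, 1, 0, 1, 1, 1]                      # hypothetical outcomes
p_hat = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]  # hypothetical fitted Pr(y_i = 1 | x_i)

def bernoulli_loglik(y, probs):
    return sum(yi * math.log(p) + (1 - yi) * math.log(1.0 - p) for yi, p in zip(y, probs))

# Log-likelihood of the fitted model and of the model with all slopes = 0.
ll_model = bernoulli_loglik(y, p_hat)
y_bar = sum(y) / len(y)
ll_null = bernoulli_loglik(y, [y_bar] * len(y))
lri = 1.0 - ll_model / ll_null

# Prediction table: rows are actual y (0, 1), columns are predicted y (0, 1).
cutoff = 0.5
table = [[0, 0], [0, 0]]
for yi, p in zip(y, p_hat):
    predicted = 1 if p > cutoff else 0
    table[yi][predicted] += 1
print(lri, table)
```

The diagonal of `table` counts the hits and the off-diagonal cells count the misses; a different cutoff shifts observations between the columns.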

Week 7: Binary Outcomes (Scott Long Chapter 3 Part 2)

Tsun-Feng Chiang, School of Economics, Henan University, Kaifeng, China. April 29, 2014

ML Estimation for Probit and Logit