Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit


ECON 452 -- NOTE 4: Probit and Logit Models

This note demonstrates how to formulate binary dependent variables models for maximum likelihood estimation, and how to estimate by maximum likelihood the two most common formulations of such models, namely probit and logit models.

1. General Formulation of Binary Dependent Variables Models

A conventional formulation of binary dependent variables models relates the observed binary outcome variable $Y_i$ to an unobserved (or latent) dependent variable $Y_i^*$. The unobserved (or latent) dependent variable $Y_i^*$ is assumed to be generated by a classical linear regression model of the form

$$Y_i^* = x_i^T \beta + u_i \qquad (1)$$

where:

$Y_i^*$ = a continuous real-valued index variable for observation $i$ that is unobservable, or latent;

$x_i^T = (1 \;\; X_{i1} \;\; X_{i2} \;\; \cdots \;\; X_{ik})$, a $1 \times K$ row vector of regressor values for observation $i$;

$\beta = (\beta_0 \;\; \beta_1 \;\; \beta_2 \;\; \cdots \;\; \beta_k)^T$, a $K \times 1$ column vector of regression coefficients;

$u_i$ = an iid random error term for observation $i$.

ECON 452 -- Note 4: Filename 452note4.doc Page 1 of 17 pages

The random error terms $u_i$ are assumed to have zero conditional means and constant conditional variances for any set of regressor values $x_i^T$:

$$E(u_i \mid x_i^T) = 0 \qquad (2.1)$$

$$Var(u_i \mid x_i^T) = E(u_i^2 \mid x_i^T) = \sigma^2 \qquad (2.2)$$

In addition, the conditional distribution of the $u_i$ is assumed to be symmetric around their zero conditional mean. Symmetry around mean zero means that

$$\Pr(u_i \le -a) = \Pr(u_i > a).$$

Since by definition $\Pr(u_i > a) = 1 - \Pr(u_i \le a)$, symmetry means that

$$\Pr(u_i \le -a) = 1 - \Pr(u_i \le a) \quad \text{or} \quad \Pr(u_i \le a) = 1 - \Pr(u_i \le -a). \qquad (2.3)$$

The observable outcomes of the binary choice problem are represented by a binary indicator variable $Y_i$ that is related to the unobserved dependent variable $Y_i^*$ as follows:

$$Y_i = 1 \quad \text{if } Y_i^* > 0 \qquad (3.1)$$

$$Y_i = 0 \quad \text{if } Y_i^* \le 0 \qquad (3.2)$$

The random indicator variable $Y_i$ represents the observed realizations of a binomial process with the following probabilities:

$$\Pr(Y_i = 1) = \Pr(Y_i^* > 0) = \Pr(x_i^T \beta + u_i > 0) \qquad (5.1)$$

$$\Pr(Y_i = 0) = \Pr(Y_i^* \le 0) = \Pr(x_i^T \beta + u_i \le 0) \qquad (5.2)$$

What is required to estimate the coefficient vector $\beta$ are analytical representations of the binomial probabilities (5.1) and (5.2).

Interpretation of the regression function $x_i^T \beta$

Under the zero conditional mean error assumption (2.1), equation (1) implies that

$$E(Y_i^* \mid x_i^T) = E(x_i^T \beta \mid x_i^T) + E(u_i \mid x_i^T) = x_i^T \beta. \qquad (4)$$

The regression function $x_i^T \beta$ is thus the conditional mean value of the latent random variable $Y_i^*$ for given values of the regressors. The slope coefficients $\beta_j$ ($j = 1, \ldots, k$) are the partial derivatives of the regression function (4) with respect to the individual regressors:

$$\frac{\partial E(Y_i^* \mid x_i^T)}{\partial X_{ij}} = \frac{\partial (x_i^T \beta)}{\partial X_{ij}} = \frac{\partial (\beta_0 + \beta_1 X_{i1} + \cdots + \beta_j X_{ij} + \cdots + \beta_k X_{ik})}{\partial X_{ij}} = \beta_j.$$

2. Analytical Representation of Binomial Probabilities

The binomial probabilities

$$\Pr(Y_i = 1) = \Pr(Y_i^* > 0) = \Pr(x_i^T \beta + u_i > 0) \qquad (5.1)$$

$$\Pr(Y_i = 0) = \Pr(Y_i^* \le 0) = \Pr(x_i^T \beta + u_i \le 0) \qquad (5.2)$$

are represented analytically in terms of the cumulative distribution function, or c.d.f., for the random error term $u_i$ in regression equation (1):

$$Y_i^* = x_i^T \beta + u_i \qquad (1)$$

The cumulative distribution function (c.d.f.) for the random variable $u$ is denoted in general by $G(u)$ and is defined as

$$G(a) = \Pr(u \le a) = \int_{-\infty}^{a} g(u)\,du$$

where

$$G(-\infty) = \Pr(u \le -\infty) = \int_{-\infty}^{-\infty} g(u)\,du = 0,$$

$$G(\infty) = \Pr(u \le \infty) = \int_{-\infty}^{\infty} g(u)\,du = 1,$$

$$G(a) \le G(b) \quad \text{for } a < b.$$

The probability that $u > a$ is given in terms of $G(a)$ as

$$\Pr(u > a) = 1 - \Pr(u \le a) = G(\infty) - G(a) = 1 - G(a).$$

For $a < b$, the probability $\Pr(a \le u \le b)$ is given as:

$$\Pr(a \le u \le b) = G(b) - G(a).$$

The first derivative of the c.d.f. equals the corresponding probability density function, or p.d.f.:

$$\frac{dG(u)}{du} = g(u) \quad \text{or} \quad g(a) = \left.\frac{dG(u)}{du}\right|_{u=a} = \frac{dG(a)}{da}$$

where $g(a)$ is the value of $dG(u)/du$ evaluated at $u = a$.

The probability density function (p.d.f.) for the random variable $u$ is the function $g(u)$ defined over all real values of $u$ such that:

1. $g(u) \ge 0$

2. $\int_{-\infty}^{\infty} g(u)\,du = 1$

3. for any real values $a$ and $b$ where $-\infty < a < b < \infty$, $\Pr(a \le u \le b) = \int_{a}^{b} g(u)\,du$

Symmetry Property: In addition to the assumptions that the random variable $u$ has zero mean and constant (finite) variance $\sigma^2$, it is assumed that the p.d.f. $g(u)$ is symmetric about its zero mean. Symmetry of $g(u)$ around mean zero means that

$$g(-a) = g(a) \quad \text{and} \quad \Pr(u \le -a) = \Pr(u > a).$$

Since by definition $\Pr(u \le -a) = G(-a)$ and $\Pr(u > a) = 1 - \Pr(u \le a) = 1 - G(a)$, symmetry of $g(u)$ implies that

$$G(-a) = 1 - G(a) \quad \text{or equivalently that} \quad G(a) = 1 - G(-a).$$

Geometrically, the symmetry property means that the lower tail area probability that $u \le -a$ is equal to the upper tail area probability that $u > a$.

[Figure: symmetric density $g(u)$ with lower tail area $= \Pr(u \le -a)$ and upper tail area $= \Pr(u > a)$]
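The symmetry property can be checked numerically. The sketch below is an illustration only: it assumes two particular symmetric distributions, the standard normal and the standard logistic, and the helper names `normal_cdf` and `logistic_cdf` are hypothetical, not part of the note. It compares the lower tail area $G(-a)$ with the upper tail area $1 - G(a)$ for a few values of $a$.

```python
import math

def normal_cdf(a):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def logistic_cdf(a):
    """Standard logistic c.d.f., 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

for a in (0.0, 0.5, 1.0, 2.5):
    for name, G in (("normal", normal_cdf), ("logistic", logistic_cdf)):
        lower_tail = G(-a)        # Pr(u <= -a)
        upper_tail = 1.0 - G(a)   # Pr(u > a)
        print(f"{name:8s} a={a:3.1f}  G(-a)={lower_tail:.6f}  1-G(a)={upper_tail:.6f}")
```

For any symmetric zero-mean distribution the two printed columns should agree up to floating-point rounding.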

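Representation of the Binomial Probabilities

The binomial probability $\Pr(Y_i = 1) = \Pr(Y_i^* > 0) = \Pr(x_i^T \beta + u_i > 0)$ can be represented in terms of the c.d.f. for the random variable $u_i$ as follows:

$$\begin{aligned}
\Pr(Y_i = 1) &= \Pr(Y_i^* > 0) \\
&= \Pr(x_i^T \beta + u_i > 0) \\
&= \Pr(u_i > -x_i^T \beta) \\
&= 1 - \Pr(u_i \le -x_i^T \beta) \\
&= 1 - G(-x_i^T \beta) \\
&= G(x_i^T \beta) \quad \text{by symmetry of } g(u)
\end{aligned} \qquad (6.1)$$

The binomial probability $\Pr(Y_i = 0) = \Pr(Y_i^* \le 0) = \Pr(x_i^T \beta + u_i \le 0)$ can be represented in terms of the c.d.f. for the random variable $u_i$ as follows:

$$\begin{aligned}
\Pr(Y_i = 0) &= \Pr(Y_i^* \le 0) \\
&= \Pr(x_i^T \beta + u_i \le 0) \\
&= \Pr(u_i \le -x_i^T \beta) \\
&= G(-x_i^T \beta) \\
&= 1 - G(x_i^T \beta) \quad \text{by symmetry of } g(u)
\end{aligned} \qquad (6.2)$$

The probability density function, or p.d.f., for the binary dependent variable $Y_i$ can thus be written as:

$$g(Y_i) = [G(x_i^T \beta)]^{Y_i}\,[1 - G(x_i^T \beta)]^{1 - Y_i} \quad \text{for } Y_i = 0, 1. \qquad (7)$$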
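The result that $\Pr(Y_i = 1) = G(x_i^T \beta)$ can be illustrated by simulating the latent-variable model directly. The sketch below assumes standard normal errors, so that $G$ is the standard normal c.d.f.; the regressor row `x_i`, the coefficients `beta`, the seed, and the function names are all hypothetical values chosen for illustration.

```python
import math
import random

def normal_cdf(a):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def simulated_share_of_ones(index, n_draws=200_000, seed=12345):
    """Share of draws with Y = 1 when Y* = index + u and u ~ N(0, 1)."""
    rng = random.Random(seed)
    ones = 0
    for _ in range(n_draws):
        y_star = index + rng.gauss(0.0, 1.0)  # latent variable Y*
        ones += 1 if y_star > 0 else 0        # observation rule: Y = 1 if Y* > 0
    return ones / n_draws

x_i = [1.0, 0.8, -1.5]   # hypothetical regressor row; first entry is the constant
beta = [0.2, 0.5, 0.3]   # hypothetical coefficient vector
index = sum(x * b for x, b in zip(x_i, beta))  # x_i' beta

print("simulated Pr(Y = 1):", simulated_share_of_ones(index))
print("G(x_i' beta)       :", normal_cdf(index))
```

The simulated share should be close to, but not exactly equal to, the analytical probability, since it is subject to sampling error.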

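3. The Sample Likelihood and Log-Likelihood Functions

The sample likelihood function for a sample of $N$ independent observations $\{Y_i : i = 1, \ldots, N\}$ is:

$$\begin{aligned}
L(Y_1, Y_2, \ldots, Y_N) &= \prod_{i=1}^{N} g(Y_i) \\
&= \prod_{i=1}^{N} [G(x_i^T \beta)]^{Y_i}\,[1 - G(x_i^T \beta)]^{1 - Y_i} \qquad (8) \\
&= \prod_{Y_i = 1} G(x_i^T \beta) \prod_{Y_i = 0} [1 - G(x_i^T \beta)]
\end{aligned}$$

The sample log-likelihood function for a sample of $N$ independent observations $\{Y_i : i = 1, \ldots, N\}$ is:

$$\begin{aligned}
\ln L(Y_1, Y_2, \ldots, Y_N) = \ln(L) &= \sum_{i=1}^{N} \ln g(Y_i) \\
&= \sum_{i=1}^{N} \left\{ Y_i \ln G(x_i^T \beta) + (1 - Y_i) \ln[1 - G(x_i^T \beta)] \right\} \qquad (9) \\
&= \sum_{Y_i = 1} \ln G(x_i^T \beta) + \sum_{Y_i = 0} \ln[1 - G(x_i^T \beta)]
\end{aligned}$$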
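A minimal sketch of the sample log-likelihood, written for an arbitrary c.d.f. $G$ passed in as a function. The data, the coefficient values, and the function names are hypothetical; the logistic c.d.f. is used only as one example of a valid symmetric $G$. The sketch also evaluates the equivalent "split" form, which sums $\ln G$ over observations with $Y_i = 1$ and $\ln(1 - G)$ over observations with $Y_i = 0$.

```python
import math

def log_likelihood(y, x_rows, beta, G):
    """Sum of Y_i ln G(x_i' beta) + (1 - Y_i) ln[1 - G(x_i' beta)]."""
    total = 0.0
    for y_i, x_i in zip(y, x_rows):
        p_i = G(sum(x * b for x, b in zip(x_i, beta)))  # Pr(Y_i = 1)
        total += y_i * math.log(p_i) + (1 - y_i) * math.log(1.0 - p_i)
    return total

def logistic_cdf(a):
    """Standard logistic c.d.f., used here as an example choice of G."""
    return 1.0 / (1.0 + math.exp(-a))

# Hypothetical sample: five observations, a constant and one regressor.
y = [1, 0, 1, 1, 0]
x_rows = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1], [1.0, -0.3]]
beta = [0.0, 1.0]

ll = log_likelihood(y, x_rows, beta, logistic_cdf)

# Equivalent split form: separate sums over the Y_i = 1 and Y_i = 0 observations.
probs = [logistic_cdf(sum(x * b for x, b in zip(x_i, beta))) for x_i in x_rows]
ll_split = (sum(math.log(p) for p, y_i in zip(probs, y) if y_i == 1)
            + sum(math.log(1.0 - p) for p, y_i in zip(probs, y) if y_i == 0))

print("log-likelihood:", ll)
print("split form    :", ll_split)
```

Passing `G` as an argument mirrors the structure of the note: the same log-likelihood applies to any symmetric error distribution, and only the choice of c.d.f. changes.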

4. Distributional Specifications of the Model

To complete specification of the model, a specific probability distribution must be chosen for the random error terms $u_i$. The most commonly adopted distributions in econometric applications are the standard normal and the standard logistic.

1. The standard normal distribution yields the probit model.

2. The standard logistic distribution yields the logit model.

Probit Model

The standard normal distribution has mean $\mu = 0$ and variance $\sigma^2 = 1$, and is symmetric around its zero mean.

If the random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then the standard normal variable $Z = (X - \mu)/\sigma$ is normally distributed with mean 0 and variance 1. That is,

$$\text{if } X \sim N(\mu, \sigma^2), \text{ then } Z \sim N(0, 1) \text{ where } Z = (X - \mu)/\sigma.$$

The standard normal p.d.f. is

$$\phi(z) = (2\pi)^{-1/2} \exp\left(-\frac{z^2}{2}\right).$$

The standard normal c.d.f. is

$$\Phi(Z) = \Pr(z \le Z) = \int_{-\infty}^{Z} \phi(z)\,dz = \int_{-\infty}^{Z} (2\pi)^{-1/2} \exp\left(-\frac{z^2}{2}\right) dz.$$

Choice of the standard normal for the distribution of the random error terms $u_i$ leads to the probit model.

Logit Model

The standard logistic distribution has mean $\mu = 0$ and variance $\sigma^2 = \pi^2/3$, and is symmetric around its zero mean.

The standard logistic p.d.f. is

$$f(X) = \frac{\exp(X)}{(1 + \exp(X))^2} = \frac{\exp(-X)}{(1 + \exp(-X))^2}.$$

The standard logistic c.d.f. is

$$F(X) = [1 + \exp(-X)]^{-1} = \frac{1}{1 + \exp(-X)} = \frac{\exp(X)}{1 + \exp(X)}.$$

Choice of the standard logistic for the distribution of the random error terms $u_i$ leads to the logit model.
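The relation between the two c.d.f.s and their p.d.f.s (the derivative of the c.d.f. equals the p.d.f.) can be checked with a central finite difference. This is a sketch under stated assumptions: the function names, the step size `h`, and the evaluation points are hypothetical choices, and the normal c.d.f. is expressed through `math.erf`.

```python
import math

def normal_pdf(z):
    """Standard normal p.d.f., (2 pi)^(-1/2) exp(-z^2 / 2)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_pdf(x):
    """Standard logistic p.d.f., exp(x) / (1 + exp(x))^2."""
    return math.exp(x) / (1.0 + math.exp(x)) ** 2

def logistic_cdf(x):
    """Standard logistic c.d.f., 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def central_difference(G, a, h=1e-5):
    """Numerical approximation to dG/du evaluated at u = a."""
    return (G(a + h) - G(a - h)) / (2.0 * h)

for a in (-2.0, 0.0, 1.3):
    print(f"a={a:5.2f}  normal:   dPhi/dz={central_difference(normal_cdf, a):.6f}  phi={normal_pdf(a):.6f}")
    print(f"a={a:5.2f}  logistic: dF/dx  ={central_difference(logistic_cdf, a):.6f}  f  ={logistic_pdf(a):.6f}")
```

The logistic p.d.f. also satisfies $f(X) = F(X)[1 - F(X)]$, which follows directly from the two formulas for $f$ and $F$.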

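5. The Univariate Probit Model

Probit Representation of the Binomial Probabilities

In the probit model, the binomial probabilities $\Pr(Y_i = 1)$ and $\Pr(Y_i = 0)$ are represented analytically in terms of the standard normal c.d.f. $\Phi(Z)$:

$$\Phi(Z) = \Pr(z \le Z) = \int_{-\infty}^{Z} \phi(z)\,dz = \int_{-\infty}^{Z} (2\pi)^{-1/2} \exp\left(-\frac{z^2}{2}\right) dz$$

The binomial probability $\Pr(Y_i = 1) = \Pr(Y_i^* > 0) = \Pr(x_i^T \beta + u_i > 0)$ is represented in the probit model as follows:

$$\begin{aligned}
\Pr(Y_i = 1) &= \Pr(Y_i^* > 0) \\
&= \Pr(x_i^T \beta + u_i > 0) \\
&= \Pr(u_i > -x_i^T \beta) \\
&= \Pr\left(\frac{u_i}{\sigma} > -\frac{x_i^T \beta}{\sigma}\right) \quad \text{dividing by } \sigma > 0 \\
&= 1 - \Pr\left(\frac{u_i}{\sigma} \le -\frac{x_i^T \beta}{\sigma}\right) \quad \text{by definition} \\
&= 1 - \Phi\left(-\frac{x_i^T \beta}{\sigma}\right) \quad \text{since } \frac{u_i}{\sigma} \sim N(0, 1) \\
&= \Phi\left(\frac{x_i^T \beta}{\sigma}\right) \quad \text{by symmetry of } \phi(z)
\end{aligned} \qquad (10)$$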

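The binomial probability $\Pr(Y_i = 0) = \Pr(Y_i^* \le 0) = \Pr(x_i^T \beta + u_i \le 0)$ is represented in the probit model as follows:

$$\begin{aligned}
\Pr(Y_i = 0) &= \Pr(Y_i^* \le 0) \\
&= \Pr(x_i^T \beta + u_i \le 0) \\
&= \Pr(u_i \le -x_i^T \beta) \\
&= \Pr\left(\frac{u_i}{\sigma} \le -\frac{x_i^T \beta}{\sigma}\right) \quad \text{dividing by } \sigma > 0 \\
&= \Phi\left(-\frac{x_i^T \beta}{\sigma}\right) \quad \text{since } \frac{u_i}{\sigma} \sim N(0, 1) \\
&= 1 - \Phi\left(\frac{x_i^T \beta}{\sigma}\right) \quad \text{by symmetry of } \phi(z)
\end{aligned} \qquad (11)$$

Note that

$$\Phi(Z_i) = \int_{-\infty}^{Z_i} (2\pi)^{-1/2} \exp\left(-\frac{z^2}{2}\right) dz \quad \text{where } Z_i = \frac{x_i^T \beta}{\sigma}.$$

The contribution to the sample likelihood function of the $i$-th sample observation is:

$$g(Y_i) = \left[\Phi\left(\frac{x_i^T \beta}{\sigma}\right)\right]^{Y_i} \left[1 - \Phi\left(\frac{x_i^T \beta}{\sigma}\right)\right]^{1 - Y_i} \quad \text{for } Y_i = 0, 1$$

that is,

$$g(Y_i) = \Phi\left(\frac{x_i^T \beta}{\sigma}\right) \text{ for } Y_i = 1, \qquad g(Y_i) = 1 - \Phi\left(\frac{x_i^T \beta}{\sigma}\right) \text{ for } Y_i = 0.$$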

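Probit Likelihood Function

The probit likelihood function for a sample of $N$ independent observations $\{Y_i : i = 1, \ldots, N\}$ is:

$$\begin{aligned}
L(\beta, \sigma) &= \prod_{i=1}^{N} g(Y_i) \\
&= \prod_{i=1}^{N} \left[\Phi\left(\frac{x_i^T \beta}{\sigma}\right)\right]^{Y_i} \left[1 - \Phi\left(\frac{x_i^T \beta}{\sigma}\right)\right]^{1 - Y_i} \qquad (12) \\
&= \prod_{Y_i = 1} \Phi\left(\frac{x_i^T \beta}{\sigma}\right) \prod_{Y_i = 0} \left[1 - \Phi\left(\frac{x_i^T \beta}{\sigma}\right)\right]
\end{aligned}$$

Probit Log-likelihood Function

The probit log-likelihood function for a sample of $N$ independent observations $\{Y_i : i = 1, \ldots, N\}$ is:

$$\begin{aligned}
\ln L(\beta, \sigma) = \ln[L(\beta, \sigma)] &= \sum_{i=1}^{N} \ln g(Y_i) \\
&= \sum_{i=1}^{N} \left\{ Y_i \ln \Phi\left(\frac{x_i^T \beta}{\sigma}\right) + (1 - Y_i) \ln\left[1 - \Phi\left(\frac{x_i^T \beta}{\sigma}\right)\right] \right\} \qquad (13) \\
&= \sum_{Y_i = 1} \ln \Phi\left(\frac{x_i^T \beta}{\sigma}\right) + \sum_{Y_i = 0} \ln\left[1 - \Phi\left(\frac{x_i^T \beta}{\sigma}\right)\right]
\end{aligned}$$

A property of the probit log-likelihood function is that the coefficient vector $\beta$ and the scalar parameter $\sigma$ are not separately identifiable.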

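Consequently, only the probit coefficient vector $\beta^* = \beta/\sigma$ can be estimated. However, it is conventional to impose the normalization $\sigma = 1$, in which case the probit coefficient vector $\beta^* = \beta$.

Computing Probit Coefficient Estimates

Maximum likelihood estimates of the probit coefficient vector $\beta^*$ or $\beta$ are obtained by maximizing the probit log-likelihood function (13) with respect to the $K$ elements of $\beta^*$ or $\beta$:

$$\max_{\{\beta^*\}} \; \ln L(\beta^*) = \ln[L(\beta^*)] = \sum_{i=1}^{N} \left\{ Y_i \ln \Phi(x_i^T \beta^*) + (1 - Y_i) \ln[1 - \Phi(x_i^T \beta^*)] \right\} \quad \text{where } \beta^* = \beta/\sigma \qquad (13.1)$$

or

$$\max_{\{\beta\}} \; \ln L(\beta, 1) = \ln[L(\beta, 1)] = \sum_{i=1}^{N} \left\{ Y_i \ln \Phi(x_i^T \beta) + (1 - Y_i) \ln[1 - \Phi(x_i^T \beta)] \right\} \qquad (13.2)$$

Maximization of the probit log-likelihood function (13.1)/(13.2) with respect to $\beta^*$ or $\beta$ requires the use of nonlinear optimization algorithms such as Newton's method.

The result is an ML estimate $\hat{\beta}^* = \hat{\beta}$ of the probit coefficient vector together with an ML estimate of the covariance matrix for $\hat{\beta}^* = \hat{\beta}$, $\hat{V}(\hat{\beta}^*) = \hat{V}(\hat{\beta}) = \hat{V}_{\hat{\beta}}$.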
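The identification problem can be seen numerically: the probit log-likelihood depends on $\beta$ and $\sigma$ only through the ratio $\beta/\sigma$, so rescaling both by the same positive constant leaves its value unchanged. The sketch below uses hypothetical data, hypothetical parameter values, and a hypothetical scale factor `c`; the function names are illustrative.

```python
import math

def normal_cdf(a):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def probit_log_likelihood(y, x_rows, beta, sigma):
    """Probit log-likelihood evaluated at (beta, sigma)."""
    total = 0.0
    for y_i, x_i in zip(y, x_rows):
        z_i = sum(x * b for x, b in zip(x_i, beta)) / sigma  # x_i' beta / sigma
        p_i = normal_cdf(z_i)                                # Pr(Y_i = 1)
        total += y_i * math.log(p_i) + (1 - y_i) * math.log(1.0 - p_i)
    return total

# Hypothetical sample: five observations, a constant and one regressor.
y = [1, 0, 1, 1, 0]
x_rows = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1], [1.0, -0.3]]

beta = [0.3, 0.8]   # hypothetical coefficients
c = 2.5             # hypothetical positive scale factor

ll_base = probit_log_likelihood(y, x_rows, beta, 1.0)
ll_scaled = probit_log_likelihood(y, x_rows, [c * b for b in beta], c)

print("ln L(beta, 1)     :", ll_base)
print("ln L(c*beta, c)   :", ll_scaled)
```

Because every pair $(c\beta, c\sigma)$ gives the same likelihood value, the data cannot distinguish among them, which is why a normalization such as $\sigma = 1$ is imposed.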

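6. The Univariate Logit Model

Logit Representation of the Binomial Probabilities

In the logit model, the binomial probabilities $\Pr(Y_i = 1)$ and $\Pr(Y_i = 0)$ are represented analytically in terms of the standard logistic c.d.f. $F(Z)$:

$$F(Z) = \Pr(z \le Z) = \frac{\exp(Z)}{1 + \exp(Z)}.$$

The binomial probability $\Pr(Y_i = 1) = \Pr(Y_i^* > 0) = \Pr(x_i^T \beta + u_i > 0)$ is represented in the logit model as follows:

$$\begin{aligned}
\Pr(Y_i = 1) &= \Pr(Y_i^* > 0) \\
&= \Pr(x_i^T \beta + u_i > 0) \\
&= \Pr(u_i > -x_i^T \beta) \\
&= 1 - \Pr(u_i \le -x_i^T \beta) \quad \text{by definition} \\
&= 1 - F(-x_i^T \beta) \quad \text{since } u_i \sim f(z) \\
&= F(x_i^T \beta) \quad \text{by symmetry of } f(z)
\end{aligned} \qquad (14)$$

The binomial probability $\Pr(Y_i = 0) = \Pr(Y_i^* \le 0) = \Pr(x_i^T \beta + u_i \le 0)$ is represented in the logit model as follows:

$$\begin{aligned}
\Pr(Y_i = 0) &= \Pr(Y_i^* \le 0) \\
&= \Pr(x_i^T \beta + u_i \le 0) \\
&= \Pr(u_i \le -x_i^T \beta) \\
&= F(-x_i^T \beta) \quad \text{by definition of } F(Z) \\
&= 1 - F(x_i^T \beta) \quad \text{by symmetry of } f(z)
\end{aligned} \qquad (15)$$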

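The contribution to the sample likelihood function of the $i$-th sample observation is:

$$g(Y_i) = [F(x_i^T \beta)]^{Y_i}\,[1 - F(x_i^T \beta)]^{1 - Y_i} \quad \text{for } Y_i = 0, 1$$

that is,

$$g(Y_i) = F(x_i^T \beta) \text{ for } Y_i = 1, \qquad g(Y_i) = 1 - F(x_i^T \beta) \text{ for } Y_i = 0.$$

Logit Likelihood Function

The logit likelihood function for a sample of $N$ independent observations $\{Y_i : i = 1, \ldots, N\}$ is:

$$\begin{aligned}
L(\beta) &= \prod_{i=1}^{N} g(Y_i) \\
&= \prod_{i=1}^{N} [F(x_i^T \beta)]^{Y_i}\,[1 - F(x_i^T \beta)]^{1 - Y_i} \qquad (16) \\
&= \prod_{Y_i = 1} F(x_i^T \beta) \prod_{Y_i = 0} [1 - F(x_i^T \beta)]
\end{aligned}$$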

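Logit Log-likelihood Function

The logit log-likelihood function for a sample of $N$ independent observations $\{Y_i : i = 1, \ldots, N\}$ is:

$$\begin{aligned}
\ln L(\beta) = \ln[L(\beta)] &= \sum_{i=1}^{N} \ln g(Y_i) \\
&= \sum_{i=1}^{N} \ln\left\{ [F(x_i^T \beta)]^{Y_i}\,[1 - F(x_i^T \beta)]^{1 - Y_i} \right\} \\
&= \sum_{i=1}^{N} \left\{ Y_i \ln F(x_i^T \beta) + (1 - Y_i) \ln[1 - F(x_i^T \beta)] \right\} \qquad (17) \\
&= \sum_{Y_i = 1} \ln F(x_i^T \beta) + \sum_{Y_i = 0} \ln[1 - F(x_i^T \beta)]
\end{aligned}$$

Computing Logit Coefficient Estimates by Maximum Likelihood

Maximum likelihood estimates of the logit coefficient vector $\beta$ are obtained by maximizing the logit log-likelihood function (17) with respect to the $K$ elements of $\beta$:

$$\begin{aligned}
\max_{\{\beta\}} \; \ln L(\beta) = \ln[L(\beta)] &= \sum_{i=1}^{N} \left\{ Y_i \ln F(x_i^T \beta) + (1 - Y_i) \ln[1 - F(x_i^T \beta)] \right\} \qquad (17) \\
&= \sum_{Y_i = 1} \ln F(x_i^T \beta) + \sum_{Y_i = 0} \ln[1 - F(x_i^T \beta)]
\end{aligned}$$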

A convenient property of the logit log-likelihood function (17) is that it is globally concave with respect to the coefficient vector $\beta$.

$$\begin{aligned}
\ln L(\beta) &= \sum_{i=1}^{N} \left\{ Y_i \ln F(x_i^T \beta) + (1 - Y_i) \ln[1 - F(x_i^T \beta)] \right\} \qquad (17) \\
&= \sum_{Y_i = 1} \ln F(x_i^T \beta) + \sum_{Y_i = 0} \ln[1 - F(x_i^T \beta)]
\end{aligned}$$

This property makes nonlinear maximization of the logit log-likelihood function (17) with respect to $\beta$ fairly straightforward.

The most commonly used nonlinear optimization algorithm for computing the ML estimates of the logit coefficients is Newton's method, which uses analytical first and second derivatives of $\ln L(\beta)$ with respect to $\beta$.

The result is an ML estimate $\hat{\beta}_L$ of the logit coefficient vector together with an ML estimate of the covariance matrix for $\hat{\beta}_L$, $\hat{V}(\hat{\beta}_L) = \hat{V}_{\hat{\beta}_L}$.
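A minimal sketch of Newton's method for the logit log-likelihood, using only the Python standard library. It relies on the standard analytical derivatives of the logit log-likelihood, which the note mentions but does not write out: the gradient is $\sum_i (Y_i - F_i)\,x_i$ and the Hessian is $-\sum_i F_i (1 - F_i)\,x_i x_i^T$, where $F_i = F(x_i^T \beta)$. The data, starting values, tolerance, and all function names are hypothetical, and the covariance estimate shown (the inverse of the negative Hessian at the solution) is one common choice, not necessarily the one used in the note.

```python
import math

def logistic_cdf(a):
    """Standard logistic c.d.f."""
    return 1.0 / (1.0 + math.exp(-a))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z

def score_and_information(y, x_rows, beta):
    """Gradient of ln L and the negative Hessian (information matrix)."""
    K = len(beta)
    score = [0.0] * K
    info = [[0.0] * K for _ in range(K)]
    for y_i, x_i in zip(y, x_rows):
        F_i = logistic_cdf(sum(x * b for x, b in zip(x_i, beta)))
        for j in range(K):
            score[j] += (y_i - F_i) * x_i[j]
            for m in range(K):
                info[j][m] += F_i * (1.0 - F_i) * x_i[j] * x_i[m]
    return score, info

def logit_newton(y, x_rows, max_iter=25, tol=1e-10):
    """Newton iterations beta <- beta + info^{-1} score, starting from zero."""
    beta = [0.0] * len(x_rows[0])
    for _ in range(max_iter):
        score, info = score_and_information(y, x_rows, beta)
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical sample: eight observations, a constant and one regressor.
y = [0, 0, 1, 0, 1, 1, 0, 1]
x_rows = [[1.0, v] for v in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0)]

beta_hat = logit_newton(y, x_rows)
score_hat, info_hat = score_and_information(y, x_rows, beta_hat)
K = len(beta_hat)
# Inverse of the information matrix, built one unit vector at a time.
cov_hat = [solve(info_hat, [1.0 if r == j else 0.0 for r in range(K)]) for j in range(K)]

print("beta_hat       :", beta_hat)
print("score at optimum:", score_hat)
print("std. errors    :", [math.sqrt(cov_hat[j][j]) for j in range(K)])
```

Global concavity is what makes the plain Newton step workable here without a line search; for data in which a regressor perfectly separates the outcomes, the ML estimate does not exist and the iterations would not converge.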