Multivariate Response Models

The response variable is unordered and takes more than two values. "Unordered" means that, for example, response 3 is not more favored than response 2. One choice from a group is selected, and the labeling of the choices is arbitrary.

Examples:
Choice of health plan
Choice of occupation
Transportation mode for commuting to work

$Y$ takes values in $\{0, 1, \ldots, J\}$, with $J$ a positive integer.
$X$ is a set of conditioning variables.

Example: $Y$ is occupational choice; $X$ contains education, age, gender, race, and marital status.

As in the binary case, we wish to know how ceteris paribus changes in the elements of $X$ affect the response probabilities $P(Y = j \mid X)$ for $j = 0, 1, \ldots, J$. Because the probabilities must sum to unity, $P(Y = 0 \mid X)$ is determined once we know the probabilities for $j = 1, \ldots, J$.

Multinomial Logit

$X$ is a $1 \times K$ vector with first element unity.

$$P(Y = j \mid X) = \frac{\exp(X\beta_j)}{1 + \sum_{h=1}^{J} \exp(X\beta_h)}, \qquad j = 1, \ldots, J$$

Because the probabilities sum to unity,

$$P(Y = 0 \mid X) = \frac{1}{1 + \sum_{h=1}^{J} \exp(X\beta_h)}$$

If $J = 1$, let $\beta_1 = \beta$ and we have the binary logit model.

Note that the model is not derived from an assumption that the errors of a latent model are logistic. Rather, the response probabilities are assumed to be a logistic function.

Partial effects are complicated. For continuous $X_k$,

$$\frac{\partial P(Y = j \mid X)}{\partial X_k} = P(Y = j \mid X)\left[\beta_{jk} - \frac{\sum_{h=1}^{J} \beta_{hk}\exp(X\beta_h)}{1 + \sum_{h=1}^{J} \exp(X\beta_h)}\right]$$

Even the direction of the effect is not determined by $\beta_{jk}$. A simpler interpretation of $\beta_j$ is given by the odds ratio

$$\frac{P(Y = j \mid X)}{P(Y = 0 \mid X)} = \exp(X\beta_j)$$

with change approximately $\beta_{jk}\exp(X\beta_j)\,\Delta X_k$.
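As a rough illustration of the formulas above, the following Python sketch computes the multinomial logit response probabilities and the partial effects with respect to one covariate. The function names, the covariate vector, and the coefficient values are hypothetical and chosen only for illustration; $\beta_0$ is taken to be zero, as the normalization above implies.

```python
import math

def mnl_probs(x, betas):
    """Response probabilities P(Y=j|X) for j = 0..J.

    x     : list of K covariates, first element 1
    betas : list of J coefficient lists (beta_1..beta_J); beta_0 is zero
    """
    scores = [0.0] + [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    top = max(scores)                      # subtract the max to avoid overflow
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]

def mnl_partial_effects(x, betas, k):
    """dP(Y=j|X)/dX_k for j = 0..J, treating X_k as continuous."""
    p = mnl_probs(x, betas)
    beta_k = [0.0] + [beta[k] for beta in betas]                 # beta_0k = 0
    weighted = sum(p_h * b_hk for p_h, b_hk in zip(p, beta_k))   # sum_h p_h beta_hk
    return [p_j * (b_jk - weighted) for p_j, b_jk in zip(p, beta_k)]

# Hypothetical values: K = 2 covariates, J = 2 (three outcomes).
x = [1.0, 0.5]
betas = [[0.2, 1.0], [-0.1, 2.0]]

probs = mnl_probs(x, betas)
effects = mnl_partial_effects(x, betas, k=1)
print(probs)    # three probabilities that sum to one
print(effects)  # three partial effects that sum to zero
```

With these particular values the coefficient $\beta_{11}$ is positive while the partial effect on $P(Y = 1 \mid X)$ is negative, which is one instance of the direction of the effect not being determined by $\beta_{jk}$.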

The sign of $\beta_{jk}$ determines the direction of the effect on the odds ratio. To simplify the analysis even further,

$$\ln\left[\frac{P(Y = j \mid X)}{P(Y = 0 \mid X)}\right] = X\beta_j$$

so that both the sign and the magnitude are determined by $\beta_{jk}$. In general,

$$\ln\left[\frac{P(Y = j \mid X)}{P(Y = h \mid X)}\right] = X(\beta_j - \beta_h)$$

Another useful fact. Because

$$P(Y = j \text{ or } Y = h \mid X) = P(Y = j \mid X) + P(Y = h \mid X)$$

it follows that

$$P(Y = j \mid Y = j \text{ or } Y = h, X) = \frac{P(Y = j \mid X)}{P(Y = j \mid X) + P(Y = h \mid X)}$$

which simplifies as

$$\frac{\exp(X\beta_j)}{\exp(X\beta_j) + \exp(X\beta_h)} = \frac{\exp[X(\beta_j - \beta_h)]}{1 + \exp[X(\beta_j - \beta_h)]}$$

which is a logistic function. Conditional on the choice being either $j$ or $h$, the probability that the outcome is $j$ follows a standard logit model with parameter vector $\beta_j - \beta_h$.

The density of $Y$ given $X$ is fully specified, so estimation is by maximum likelihood. For each observation the conditional log-likelihood is

$$l_t(\beta) = \sum_{j=0}^{J} 1(y_t = j)\,\ln[P_j(x_t; \beta)]$$

with $P_j(x_t; \beta) = P(Y_t = j \mid x_t)$.

McFadden (1974) establishes that the log-likelihood is globally concave, which makes maximization straightforward; the MLE is consistent and asymptotically normal (CAN).

Example: School and Employment Decisions - Young Men

Sample of men in 1987
$Y = 0$: enrolled in school
$Y = 1$: home (not in school, not working)
$Y = 2$: working
$X$: education, quadratic in work experience, black binary indicator

1,717 observations: 99 in school, 332 at home, 1,286 working
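To make the log-likelihood concrete, here is a minimal sketch of maximum likelihood for the multinomial logit model using plain gradient ascent, which is enough because the log-likelihood is globally concave. The data are simulated from hypothetical parameter values (they are not the young men sample), and the function names, step size, and iteration count are assumptions made for illustration; in practice one would use a packaged optimizer.

```python
import math
import random

def mnl_probs(x, betas):
    """P(Y=j|x) for j = 0..J, with beta_0 normalized to zero."""
    scores = [0.0] + [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    top = max(scores)
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]

def log_likelihood(betas, xs, ys):
    """Sum over t of sum_j 1(y_t = j) ln P_j(x_t; beta)."""
    return sum(math.log(mnl_probs(x, betas)[y]) for x, y in zip(xs, ys))

def score(betas, xs, ys):
    """Gradient: for each j >= 1, sum_t [1(y_t = j) - P_j(x_t)] x_t."""
    grad = [[0.0] * len(xs[0]) for _ in betas]
    for x, y in zip(xs, ys):
        p = mnl_probs(x, betas)
        for j in range(1, len(p)):
            resid = (1.0 if y == j else 0.0) - p[j]
            for k, v in enumerate(x):
                grad[j - 1][k] += resid * v
    return grad

def fit_mnl(xs, ys, n_alt, steps=500, rate=0.5):
    """Gradient ascent on the average log-likelihood, starting from zero."""
    n, n_cov = len(xs), len(xs[0])
    betas = [[0.0] * n_cov for _ in range(n_alt - 1)]
    for _ in range(steps):
        grad = score(betas, xs, ys)
        betas = [[b + rate * g / n for b, g in zip(beta, g_row)]
                 for beta, g_row in zip(betas, grad)]
    return betas

# Simulated data under hypothetical true parameters (not from the notes).
random.seed(0)
true_betas = [[0.5, -1.0], [-0.5, 1.5]]
xs = [[1.0, random.uniform(-1, 1)] for _ in range(200)]
ys = [random.choices(range(3), weights=mnl_probs(x, true_betas))[0] for x in xs]

beta_hat = fit_mnl(xs, ys, n_alt=3)
print(beta_hat)
print(log_likelihood(beta_hat, xs, ys))
```

The starting value of zero gives every outcome probability 1/3, so the fitted log-likelihood can be compared against $n \ln(1/3)$ as a sanity check.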

Coefficient Estimates (standard errors in parentheses)

              Y = 1 (home)     Y = 2 (working)
educ          -.674 (.070)     -.315 (.065)
exper         -.106 (.173)      .849 (.157)
exper^2       -.013 (.025)     -.077 (.023)
black          .813 (.303)      .311 (.282)
constant      10.28 (1.13)      5.54 (1.09)

For the home column, each experience term is individually insignificant. A Wald test for joint significance yields a p-value of .047, so they are jointly significant at the 5 percent level. We would probably leave the coefficients unrestricted in $\beta_1$ rather than setting them to zero.

To interpret: the log-odds between at home and enrolled in school are

$$\ln\left(\frac{p_1}{p_0}\right) = X\beta_1$$

Effect of one more year of school: reduces the log-odds by .674.
Impact of race: the log-odds are .813 higher for black men.

These magnitudes are hard to interpret. Compute the partial effect of one more year of education for a black man with 5 years of experience and 12 years of education:

$$\frac{\partial P(Y = 1 \mid X)}{\partial X_{educ}} = .095\,(-.674 + .345) \approx -.031$$

As education degrees are discrete, compare 12 years (HS) with 16 years (college). From above,

$P(Y = 1 \mid \cdot,\ 12 \text{ years schooling}) = .095$
$P(Y = 1 \mid \cdot,\ 16 \text{ years schooling}) = .024$

so graduating from college, rather than HS, reduces the at-home probability by .071. Similar calculations reveal that it raises the employment probability by .041. As the total change across all three responses must be zero, the in-school probability must be raised by .030.

Prediction

For each observation $t$, the outcome with the highest estimated probability is the predicted outcome.

Overall: 80 percent correctly predicted
Employed: 95.2 percent correct
In school: 12.1 percent correct
At home: 39.2 percent correct

Probabilistic Choice Models

McFadden (1974) showed that a model closely related to multinomial logit can be obtained from an underlying utility comparison.

Utility from choosing alternative $j$ is

$$Y^*_{tj} = X_{tj}\beta + \varepsilon_{tj}$$

$\varepsilon_{tj}$: latent taste variables, independent of $X_{tj}$
$X_{tj}$: every element must vary across alternatives (e.g. no constant)

$Y_t = \arg\max\,(Y^*_{t0}, \ldots, Y^*_{tJ})$ takes values in $\{0, 1, \ldots, J\}$.

Examples:
$Y$ choice of health plan, $X$ co-payment (differs over plan, maybe over individuals)
$Y$ commute transport choice, $X$ commute time (differs over individual and choice)

Assume for each $j$ that the $\varepsilon_{tj}$ are independently distributed with cdf $F(a) = \exp[-\exp(-a)]$ (type I extreme value distribution). Then

$$P(Y_t = j \mid x_t) = \frac{\exp(X_{tj}\beta)}{\sum_{h=0}^{J} \exp(X_{th}\beta)}, \qquad j = 0, \ldots, J$$

These response probabilities are often termed the conditional logit model.

Marginal effects (on response probabilities) for $k = 1, \ldots, K$:

$$\frac{\partial p_j(X)}{\partial X_{jk}} = p_j(X)\,[1 - p_j(X)]\,\beta_k, \qquad j = 0, \ldots, J$$

$$\frac{\partial p_j(X)}{\partial X_{hk}} = -p_j(X)\,p_h(X)\,\beta_k, \qquad j \neq h$$

Multinomial logit: $X$ is specific to individuals, not alternatives (in occupational choice, we do not know how much someone could make in each occupation).
Conditional logit: $X$ is specific to alternatives, not individuals; the choice is made on the basis of observable attributes of each alternative.

The conditional logit model has an important restriction:

$$\frac{p_j(X_j)}{p_h(X_h)} = \frac{\exp(X_j\beta)}{\exp(X_h\beta)}$$

Relative probabilities do not depend on the attributes of other alternatives, termed independence from irrelevant alternatives (IIA).

Empirical applications often include individual-specific variables $W_t$:

$$Y^*_{tj} = X_{tj}\beta + W_t\gamma_j + \varepsilon_{tj}$$

If $\gamma_j = \gamma$ is constant across $j$, then $W_t$ drops out of the response probabilities:

$$\frac{p_j(X_j)}{p_h(X_h)} = \frac{\exp(X_{tj}\beta)\exp(W_t\gamma)}{\exp(X_{th}\beta)\exp(W_t\gamma)} = \frac{\exp(X_{tj}\beta)}{\exp(X_{th}\beta)}$$

Hence, no quantities that vary only across individuals are allowed in $X$ (which has a constant coefficient across alternatives).
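The conditional logit probabilities, their marginal effects, and the IIA property can be checked numerically with a short sketch. The attribute values (think of cost and time), the coefficient vector, and the function names are hypothetical.

```python
import math

def clogit_probs(X, beta):
    """P(Y=j|X) = exp(X_j beta) / sum_h exp(X_h beta); one row of X per alternative."""
    scores = [sum(b * v for b, v in zip(beta, row)) for row in X]
    top = max(scores)                      # subtract the max to avoid overflow
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]

def clogit_marginal_effect(X, beta, j, h, k):
    """dp_j/dX_hk: own effect if h == j, cross effect otherwise."""
    p = clogit_probs(X, beta)
    if h == j:
        return p[j] * (1.0 - p[j]) * beta[k]
    return -p[j] * p[h] * beta[k]

# Hypothetical attributes for three alternatives and a hypothetical beta.
beta = [-0.8, -0.3]
X = [[1.0, 2.0], [1.5, 1.0], [0.5, 3.0]]
p = clogit_probs(X, beta)
ratio_before = p[0] / p[1]

# Change only the attributes of alternative 2: p_0 / p_1 is unchanged (IIA).
X_changed = [X[0], X[1], [2.5, 0.5]]
q = clogit_probs(X_changed, beta)
ratio_after = q[0] / q[1]

print(ratio_before, ratio_after)
print(clogit_marginal_effect(X, beta, j=0, h=0, k=0))  # own effect
print(clogit_marginal_effect(X, beta, j=0, h=2, k=0))  # cross effect
```

The individual probabilities $p_0$ and $p_1$ do change when alternative 2 changes; only their ratio is fixed, and it equals $\exp[(X_0 - X_1)\beta]$.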

Classic example of the restriction imposed by IIA:

Two commute types, red bus and car, each selected with probability .5.
Add a blue bus; commuters choose between the bus types with probability .5.
Yet IIA implies $p_{\text{red bus}} / p_{\text{car}}$ remains equal to 1, so the probability of each of the three alternatives is 1/3!

It is unlikely that the probability one drives falls from 1/2 to 1/3 just because a different color bus is added (admittedly extreme; in practice one would combine the two bus types, but it does illustrate the unwanted restrictions).

To relax the IIA assumption:

Assume $\varepsilon_t$ is multivariate normal, independent across $t$, with arbitrary correlations among choices. This yields the conditional probit model (often termed multinomial probit).
Response probabilities are complicated and involve a $(J + 1)$-dimensional integral.
MLE is infeasible with $J > 4$; recent work focuses on simulation.

Assume a hierarchical model; the most popular is nested logit.
Group the alternatives into $S$ groups, with $G_s$ the set of alternatives in group $s$.
First, choose a group; second, choose an option within the group.

Group selection:

$$P(Y \in G_s \mid X) = \frac{\alpha_s\left[\sum_{j \in G_s} \exp(\rho_s^{-1} X_j\beta)\right]^{\rho_s}}{\sum_{r=1}^{S} \alpha_r\left[\sum_{j \in G_r} \exp(\rho_r^{-1} X_j\beta)\right]^{\rho_r}}$$

Choice selection:

$$P(Y = j \mid Y \in G_s, X) = \frac{\exp(\rho_s^{-1} X_j\beta)}{\sum_{h \in G_s} \exp(\rho_s^{-1} X_h\beta)}$$

This requires a normalization, usually $\alpha_1 = 1$. The response probability is the product of the two displayed probabilities.

Estimation: conditional on choosing group $s$, the response probabilities are conditional logit with parameter $\lambda_s = \rho_s^{-1}\beta$.
First estimate $\lambda_s$ by applying conditional logit to each group.
Then plug $\hat{\lambda}_s$ into $P(Y \in G_s \mid X)$ and estimate $\alpha_s$, $\rho_s$ by maximizing the log-likelihood

$$\sum_{t=1}^{n} \sum_{s=1}^{S} 1(Y_t \in G_s)\,\log\left[q_s(x_t; \hat{\lambda}, \alpha, \rho)\right]$$

with $q_s$ the group selection probability with $\hat{\lambda}_s$ plugged in for $\rho_s^{-1}\beta$.

Ordered Response

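Values assigned to $Y$ are not arbitrary.

Example: credit rating, with $Y = 0$ the lowest rating and $Y = 6$ the highest rating.

$Y$ remains ordinal: we cannot say the difference between 4 and 2 is twice as important as the difference between 1 and 0.

Ordered Probit

$$Y^* = X\beta + U, \qquad U \mid X \sim N(0, 1)$$

$\beta$ is a $K \times 1$ vector; $X$ does not contain a constant.

Define $\alpha_1 < \cdots < \alpha_J$ to be threshold parameters (cut points):

$Y = 0$ if $Y^* \le \alpha_1$
$Y = 1$ if $\alpha_1 < Y^* \le \alpha_2$
$\vdots$
$Y = J$ if $\alpha_J < Y^*$

Distribution of $Y \mid X$:

$P(Y = 0 \mid X) = P(Y^* \le \alpha_1 \mid X) = \Phi(\alpha_1 - X\beta)$
$P(Y = 1 \mid X) = P(\alpha_1 < Y^* \le \alpha_2 \mid X) = \Phi(\alpha_2 - X\beta) - \Phi(\alpha_1 - X\beta)$
$\vdots$
$P(Y = J \mid X) = P(\alpha_J < Y^* \mid X) = 1 - \Phi(\alpha_J - X\beta)$

If $J = 1$,

$$P(Y = 1 \mid X) = 1 - P(Y = 0 \mid X) = 1 - \Phi(\alpha_1 - X\beta) = \Phi(X\beta - \alpha_1)$$

Thus $-\alpha_1$ is the intercept, which is why $X$ does not contain a constant. (Of course, with only two outcomes, we set $\alpha_1 = 0$ and estimate an intercept.)

For each observation, the conditional log-likelihood is

$$1(Y_t = 0)\ln[\Phi(\alpha_1 - X_t\beta)] + \cdots + 1(Y_t = J)\ln[1 - \Phi(\alpha_J - X_t\beta)]$$

Replacing $\Phi$ with the logistic function delivers ordered logit.

Marginal Effects

$$\frac{\partial p_0(X)}{\partial X_k} = -\beta_k\,\phi(\alpha_1 - X\beta)$$

$$\frac{\partial p_1(X)}{\partial X_k} = -\beta_k\,[\phi(\alpha_2 - X\beta) - \phi(\alpha_1 - X\beta)]$$

For most choices (the intermediate outcomes), $\beta_k$ does not determine the sign of the effect.

Interval-Coded Data

Unlike ordered probit, we do not need to estimate the interval end points, so replace the unknown $\{\alpha_j\}$ with known interval limits $\{a_j\}$.

Example: $Y^*$ is family wealth, and the $a_j$ are known interval limits.

Interest centers on $E(Y^* \mid X)$, but only the intervals in which $Y^*$ falls are observed.

Assume $Y^* \mid X \sim N(X\beta, \sigma^2)$.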

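Use the same log-likelihood function as for ordered probit, but let $(a_j - X\beta)/\sigma$ replace $\alpha_j - X\beta$.

Interpret the coefficients $\beta_k$ as if we had observed $Y^*$ and done OLS.

This follows from the strong assumption that $Y^* \mid X$ follows the classical linear model with mean $X\beta$.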
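The ordered probit probabilities and the interval-coded variant share one formula, which the following sketch makes explicit. The function names and all numerical values are hypothetical; the only difference between the two cases is whether the cut points are unknown thresholds with $\sigma = 1$ or known interval limits with $\sigma$ a free parameter.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf

def ordered_probs(xb, cuts, sigma=1.0):
    """P(Y=j|X) for j = 0..J given the index X beta and increasing cut points.

    sigma = 1 with estimated thresholds alpha_j is ordered probit;
    known limits a_j with sigma unrestricted is the interval-coded case.
    """
    cdf = [PHI((c - xb) / sigma) for c in cuts]
    bounds = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(bounds[:-1], bounds[1:])]

def log_likelihood(beta, cuts, xs, ys, sigma=1.0):
    """Sum over observations of ln P(Y = y_t | x_t)."""
    total = 0.0
    for x, y in zip(xs, ys):
        xb = sum(b * v for b, v in zip(beta, x))
        total += math.log(ordered_probs(xb, cuts, sigma)[y])
    return total

# Hypothetical values for illustration only; X contains no constant.
beta = [0.4, -0.2]
x = [1.5, 2.0]
xb = sum(b * v for b, v in zip(beta, x))

alphas = [-0.5, 0.3, 1.2]              # ordered probit thresholds, J = 3
p = ordered_probs(xb, alphas)
print(p)

a = [0.0, 1.0, 3.0]                    # known interval limits
p_interval = ordered_probs(xb, a, sigma=2.0)
print(p_interval)
```

In an actual estimation the log-likelihood would be maximized over $(\beta, \alpha)$ for ordered probit and over $(\beta, \sigma)$ for interval-coded data.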