Comparing groups using predicted probabilities

Size: px
Start display at page:

Download "Comparing groups using predicted probabilities"

Transcription

1 Comparing groups using predicted probabilities J. Scott Long Indiana University May 9, 2006 MAPSS - May 9, Page 1

2 The problem Allison (1999): Di erences in the estimated coe cients tell us nothing about the di erences in the underlying impact of x on the two groups. MAPSS - May 9, Page 2

3 Overview 1. Does the e ect of x on y di er across groups? 2. Comparing s across groups in the LRM 3. Problems with comparing s in logit and probit models 4. Tests and CIs for comparing Pr (y = 1) across groups 5. Example of gender di erences in tenure 6. Complications due to nonlinearity of models MAPSS - May 9, Page 3

4 LRM - Chow test 1. For example, Men: y = m + m educ educ + m ageage + " Women: y = w + w educ educ + w ageage + " 2. Do men and women have the same return for education? 3. We compute: z = H 0 : m educ = w educ b m educ b w educ r Var b m educ + Var b w educ MAPSS - May 9, Page 4

5 Logit and probit using latent y 1. Chow type test is invalid for logit & probit (Allison 1999). 2. Structural model with a latent y : y = + x + " 3. Error is normal for probit or logistic for logit: " f (0; " ) 4. We only observed a binary y: y = ( 1 if y > 0 0 if y 0 5. Graphically... MAPSS - May 9, Page 5

6 Logit and probit using latent y MAPSS - May 9, Page 6

7 y and Pr (y = 1) 1. Link to observed data: Pr (y = 1 j x) = Pr (y > 0 j x) = Pr (" < [ + x] j x) 2. Since y is not observed, is only identi ed up to a scale factor. MAPSS - May 9, Page 7

8 Identi cation of and " 1. Model A: 2. Model B: ya = a + a xx + " a where a = 1 = 6 + 1x + " a 3. Where: yb = b + b x x + " b where b = 2 = x + " b b = 2 a b x = 2 a x b = 2 a MAPSS - May 9, Page 8

9 Pr (y = 1 j x) in model A MAPSS - May 9, Page 9

10 Pr (y = 1 j x) in model B MAPSS - May 9, Page 10

11 The problem and a solution In terms of Pr(y = 1), these are empirically indistinguishable: Case 1: A change in x of 1 when a x = 1 and a = 1: Case 2: A change in x of 1 when b x = 2 and b = 2. MAPSS - May 9, Page 11

12 Implications for group comparisons 1. Comparing propensity for tenure for men and women: Men : y = m + m articles articles + " m Women : y = w + w articles articles + " w 2. Assume the s are equal: m articles = w articles 3. Assume women have more unobserved heterogeneity: w > m 4. Now we estimate the model... MAPSS - May 9, Page 12

13 Estimation 1. Probit software assumes that = For men, the estimated model is: y = m + m articles articles + " m m m m m = e m + em articlesarticles + e" m ; sd (e" m ) = 1 3. For women, the estimated model is: y = w + w articles articles + " w w w w w = e w + ew articlesarticles + e" w ; sd (e" w ) = 1 MAPSS - May 9, Page 13

14 Problem with Chow-type tests 1. We want to test: H 0 : m articles = w articles 2. But, end up testing: H 0 : em articles = ew articles 3. And, e m articles = e w articles does not imply m articles = w articles unless m = w. MAPSS - May 9, Page 14

15 Alternatives for testing group di erences 1. Allison s (1999) test of H 0 : m x = w x : (a) Disentangles the s and " s. (b) But requires that m z = w z for some z. 2. I propose testing H 0 : Pr (y = 1 j x) m = Pr (y = 1 j x) w since the probabilities are invariant to ". 3. Graphically... MAPSS - May 9, Page 15

16 Invariance of Pr(y = 1 j x) MAPSS - May 9, Page 16

17 Group comparisons of Pr(y = 1 j x) MAPSS - May 9, Page 17

18 Testing (x) 1. De ne: (x) = Pr (y = 1 j x) m Pr (y = 1 j x) w 2. Then: z = c (x) rv ar hc (x) i 3. Or the con dence interval: Pr ( LB (x) UB ) = :95 4. Delta method is very fast; bootstrap is very slow; both are available with SPost s prvalue and prgen. MAPSS - May 9, Page 18

19 Comparing groups di erences in Pr (y = 1) 1. Estimate: Pr (y = 1) = F ( w w + w x wx + m m + m x mx) where w = 1 for women, else 0; m = 1 for men, else 0; wx = w x; and mx = m x: 2. Then: Pr (y = 1 j x) w = F ( w + w x x) Pr (y = 1 j x) m = F ( m + m x x) 3. The gender di erence is: (x) = Pr (y = 1 j x) m Pr (y = 1 j x) w MAPSS - May 9, Page 19

20 Example: gender di erences in tenure Women Men Variable Mean SD Mean SD Is tenured? tenure Year year Year-squared yearsq Bachelor s selectivity select Total articles articles Distinguished job? distinguished Prestige of job prestige Person-years 1,121 1,824 Scientists MAPSS - May 9, Page 20

21 M1: articles only Women Men Variable e z e z constant articles Log-lik N 1,121 1,824 MAPSS - May 9, Page 21

22 M1: Pr(tenure j articles) 1 Men Women.8 Pr(tenure) Number of Articles (group_ten03a.do 11Apr2006) MAPSS - May 9, Page 22

23 M1: group di erences in probabilities 1. We are interested in group di erences in predictions: 2. We can test: 3. Or: (articles) = Pr (y = 1 j articles) m Pr (y = 1 j articles) w H 0 : (articles) = 0 (articles) LowerBound, (articles) UpperBound 4. With one RHS variable, we can plot all comparisons. MAPSS - May 9, Page 23

24 M1: (articles) with con dence intervals.8.7 Difference 95% confidence interval Pr(men) Pr(women) Number of Articles (group_ten03a.do 11Apr2006) MAPSS - May 9, Page 24

25 Adding variables additional variables 1. Structural model: y = + x x + z z + " 2. Setting z = Z changes the intercept: y = + x x + z Z + " = [ + z Z] + x x + " = + x x + " 3. Di erent values of z lead to di erent probability curves: Pr (y = 1 j x; z = Z) = Pr (" < [ + x x]) = F ( + x x) MAPSS - May 9, Page 25

26 E ect of levels of other variables 1.8 a* = (a + b_z 4) = 8 a* = (a + b_z 3) = 10 a* = (a + b_z 2) = 12 a* = (a + b_z 1) = 14 Pr(y=1 x,z ) x (group_parallel ) MAPSS - May 9, Page 26

27 Comparing groups with added variables 1. For given z = Z : Men: Pr (y = 1 j x; Z) m = F ( m + m x x) Women: Pr (y = 1 j x; Z) w = F ( w + w x x) 2. Group di erences depend on z: (x; Z) = Pr (y = 1 j x; Z) m Pr (y = 1 j x; Z) w 3. (x; Z) for a given x depends on the level of other variables. MAPSS - May 9, Page 27

28 M2: adding job type Women Men Variable e z e z constant articles distinguished Log-lik N 1,121 1,824 Chow tests: Allison tests: 2 articles =7.93, df=1, p<.01; 2 dist =1.38, df=1, p> articles =15.06, df=1, p<.01 (other e ects equal). 2 distinguished =3.54, df=1, p=.06 (other e ects equal). MAPSS - May 9, Page 28

29 M2: Pr(tenure j articles) 1 Men distinguished Women distinguished.8 Pr(tenure) Number of Articles (group_ten04a.do 14Apr2006) MAPSS - May 9, Page 29

30 M2: Pr(tenure j articles) 1.8 Men distinguished Women distinguished Men not distinguished Women not distinguished Pr(tenure) Number of Articles (group_ten04a.do 14Apr2006) MAPSS - May 9, Page 30

31 M2: (articles if not distinguished) Pr(men) Pr(women) Scientists not in distinguished departments Male Female difference 95% confidence interval Number of Articles (group_ten04a.do 11Apr2006) MAPSS - May 9, Page 31

32 M2: (articles if distinguished) Pr(men) Pr(women) Scientists in distinguished departments Male Female difference 95% confidence interval Number of Articles (group_ten04a.do 11Apr2006) MAPSS - May 9, Page 32

33 M2: (articles) by job type Pr(men) Pr(women) Distinguished if not significant Not distinguished if not significant Number of Articles (group_ten04a.do 11Apr2006) MAPSS - May 9, Page 33

34 M3: full model Women Men Variable e z e z constant year yearsq select articles prestige Log-lik N 1,121 1,824 Chow tests: Allison tests: 2 articles =1.07, df=1, p= prestige =0.03, df=1, p= articles =1.73, df=1, p=.19 (other e ects equal). 2 prestige =1.55, df=1, p=.21 (other e ects equal). MAPSS - May 9, Page 34

35 M3: Pr(tenure if prestige = 5) 1.8 Year 7 with prestige 5 Men Women Pr(tenure) Number of Articles (group_ten05b.do 04May2006) MAPSS - May 9, Page 35

36 M3: (articles if prestige = 5).7 Year 7 with prestige 5 Male Female difference 95% confidence interval Pr(men) Pr(women) Number of Articles (group_ten05b.do 04May2006) MAPSS - May 9, Page 36

37 M3: (articles by prestige) Pr(men) Pr(women) Weak (prestige=1) Adequate (prestige=2) Good (prestige=3) Strong (prestige=4) Distinguished (prestige=5) Number of Articles (group_ten05b.do 12Apr2006) MAPSS - May 9, Page 37

38 M3: (prestige by articles) articles 10 articles 20 articles 30 articles 40 articles 50 articles Pr(men) Pr(women) Job Prestige (group_ten05c.do 12Apr2006) MAPSS - May 9, Page 38

39 M3: (articles, prestige) MAPSS - May 9, Page 39

40 Conclusions 1. Predictions o er a general approach for comparing groups. 2. The approach cannot distinguish between di erent processes generating the same probabilities. For example: Case 1: m x = w x m 6= w ) (x) = 0 Case 2: m x 6= w x m = w ) (x) = 0 3. But, it provides a very detailed understanding of group di erences. 4. Implications for interpreting nonlinear models. 5. In Stata, enter: findit spost_groups for examples MAPSS - May 9, Page 40

41 References Allison, Paul D Comparing Logit and Probit Coe cients Across Groups. Sociological Methods and Research 28: Chow, G.C Tests of equality between sets of coe cients in two linear regressions. Econometrica 28: Long, J.S. and Freese, J Regression Models for Categorical and Limited Dependent Variables with Stata. Second Edition. College Station, TX: Stata Press. Long, J. Scott, Paul D. Allison, and Robert McGinnis Rank Advancement in Academic Careers: Sex Di erences and the E ects of Productivity. American Sociological Review 58: Xu, J. and J.S. Long, 2005, Con dence intervals for predicted outcomes in regression models for categorical outcomes. The Stata Journal 5: MAPSS - May 9, Page 41

42 Delta method for (x; ) 1. Start with the predicted probability: G() = Pr (y = 1 j x) = F (x) 2. A Taylor series expansion of G( ): b G( ) b G() + ( b ) T G 0 () 3. Where G 0 () = 0 = f(x)x K i T MAPSS - May 9, Page 42

43 4. Under standard assumptions, G( b ) is distributed normally around G() with a variance Var h G( b ) i = G 0 ( b ) T V ar( b )G 0 ( b ) 5. For a di erence in probability, let G () = F (jx a ) F (jx b ) MAPSS - May 9, Page 43

44 6. Then V ar h F ( jx b a ) F jx b i b = (jxa T V ar( ) (jx (jxa T V ar( ) (jx (jxb T V ar( ) (jx (jxb ) T V ar( ) (jx # # # # MAPSS - May 9, Page 44

45 Proof of invariance of Pr(y = 1) to " 1. If N (0; = 1) 2. Then: (") = 1 p 2 exp " Pr (y = 1 j x) = 3. If N (0; = 2): Z +x 1 (") = 1 p 2 exp " 2 2 2! = 1 p 2 exp 1 p 2 exp! t 2! 2 " 2! 2 dt (1) = 1 2 p 2 exp " 2! 8 MAPSS - May 9, Page 45

46 4. Then, Pr (y = 1 j x) = Z (+x) p 2 exp t 2! dt (2) 8 5. Equations 1 and 2 simply involve a linear change of variables, so the probabilities are equal. 6. If z = g (x) and x = g 1 (z), then Z A B f (z) dz = Z g 1 (A) g 1 (B) f [g (z)] dz dx dx 7. See Long 1997: for an alternative proof. MAPSS - May 9, Page 46

Group comparisons in logit and probit using predicted probabilities 1

Group comparisons in logit and probit using predicted probabilities 1 Group comparisons in logit and probit using predicted probabilities 1 J. Scott Long Indiana University May 27, 2009 Abstract The comparison of groups in regression models for binary outcomes is complicated

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

Interpreting and using heterogeneous choice & generalized ordered logit models

Interpreting and using heterogeneous choice & generalized ordered logit models Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2

More information

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes 1

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes 1 Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes 1 JunXuJ.ScottLong Indiana University 2005-02-03 1 General Formula The delta method is a general

More information

ECON 594: Lecture #6

ECON 594: Lecture #6 ECON 594: Lecture #6 Thomas Lemieux Vancouver School of Economics, UBC May 2018 1 Limited dependent variables: introduction Up to now, we have been implicitly assuming that the dependent variable, y, was

More information

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio

h=1 exp (X : J h=1 Even the direction of the e ect is not determined by jk. A simpler interpretation of j is given by the odds-ratio Multivariate Response Models The response variable is unordered and takes more than two values. The term unordered refers to the fact that response 3 is not more favored than response 2. One choice from

More information

Speci cation of Conditional Expectation Functions

Speci cation of Conditional Expectation Functions Speci cation of Conditional Expectation Functions Econometrics Douglas G. Steigerwald UC Santa Barbara D. Steigerwald (UCSB) Specifying Expectation Functions 1 / 24 Overview Reference: B. Hansen Econometrics

More information

Limited Dependent Variable Models II

Limited Dependent Variable Models II Limited Dependent Variable Models II Fall 2008 Environmental Econometrics (GR03) LDV Fall 2008 1 / 15 Models with Multiple Choices The binary response model was dealing with a decision problem with two

More information

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 Logistic regression: Why we often can do what we think we can do Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 1 Introduction Introduction - In 2010 Carina Mood published an overview article

More information

Finansiell Statistik, GN, 15 hp, VT2008 Lecture 17-1: Regression with dichotomous outcome variable - Logistic Regression

Finansiell Statistik, GN, 15 hp, VT2008 Lecture 17-1: Regression with dichotomous outcome variable - Logistic Regression Finansiell Statistik, GN, 15 hp, VT2008 Lecture 17-1: Regression with dichotomous outcome variable - Logistic Regression Gebrenegus Ghilagaber, PhD, Associate Professor May 7, 2008 1 1 Introduction Let

More information

Using predictions to compare groups in regression models for binary outcomes

Using predictions to compare groups in regression models for binary outcomes Using predictions to compare groups in regression models for binary outcomes J. Scott Long and Sarah A. Mustillo March 5, 2018 Abstract Methods for group comparisons using predicted probabilities and marginal

More information

Economics Introduction to Econometrics - Fall 2007 Final Exam - Answers

Economics Introduction to Econometrics - Fall 2007 Final Exam - Answers Student Name: Economics 4818 - Introduction to Econometrics - Fall 2007 Final Exam - Answers SHOW ALL WORK! Evaluation: Problems: 3, 4C, 5C and 5F are worth 4 points. All other questions are worth 3 points.

More information

Comparing groups in binary regression models using predictions

Comparing groups in binary regression models using predictions Comparing groups in binary regression models using predictions J. Scott Long and Sarah A. Mustillo June 6, 2017 Abstract Methods for group comparisons using predicted probabilities and marginal effects

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Binary Outcomes. Objectives. Demonstrate the limitations of the Linear Probability Model (LPM) for binary outcomes

Binary Outcomes. Objectives. Demonstrate the limitations of the Linear Probability Model (LPM) for binary outcomes Binary Outcomes Objectives Demonstrate the limitations of the Linear Probability Model (LPM) for binary outcomes Develop latent variable & transformational approach for binary outcomes Present several

More information

Introduction: structural econometrics. Jean-Marc Robin

Introduction: structural econometrics. Jean-Marc Robin Introduction: structural econometrics Jean-Marc Robin Abstract 1. Descriptive vs structural models 2. Correlation is not causality a. Simultaneity b. Heterogeneity c. Selectivity Descriptive models Consider

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

Binary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment

Binary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment BINARY CHOICE MODELS Y ( Y ) ( Y ) 1 with Pr = 1 = P = 0 with Pr = 0 = 1 P Examples: decision-making purchase of durable consumer products unemployment Estimation with OLS? Yi = Xiβ + εi Problems: nonsense

More information

Tables and Figures. This draft, July 2, 2007

Tables and Figures. This draft, July 2, 2007 and Figures This draft, July 2, 2007 1 / 16 Figures 2 / 16 Figure 1: Density of Estimated Propensity Score Pr(D=1) % 50 40 Treated Group Untreated Group 30 f (P) 20 10 0.01~.10.11~.20.21~.30.31~.40.41~.50.51~.60.61~.70.71~.80.81~.90.91~.99

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

We begin by thinking about population relationships.

We begin by thinking about population relationships. Conditional Expectation Function (CEF) We begin by thinking about population relationships. CEF Decomposition Theorem: Given some outcome Y i and some covariates X i there is always a decomposition where

More information

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 988 performing below basic levels of reading and mathematics in 988 and dropping out of school, 988 to 990, by basic demographics Variable

More information

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 Instructions: Answer all four (4) questions. Point totals for each question are given in parenthesis; there are 00 points possible. Within

More information

ECON Interactions and Dummies

ECON Interactions and Dummies ECON 351 - Interactions and Dummies Maggie Jones 1 / 25 Readings Chapter 6: Section on Models with Interaction Terms Chapter 7: Full Chapter 2 / 25 Interaction Terms with Continuous Variables In some regressions

More information

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

Econometrics I Lecture 7: Dummy Variables

Econometrics I Lecture 7: Dummy Variables Econometrics I Lecture 7: Dummy Variables Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 27 Introduction Dummy variable: d i is a dummy variable

More information

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice Wageningen Summer School in Econometrics The Bayesian Approach in Theory and Practice September 2008 Slides for Lecture on Qualitative and Limited Dependent Variable Models Gary Koop, University of Strathclyde

More information

Estimating Explained Variation of a Latent Scale Dependent Variable Underlying a Binary Indicator of Event Occurrence

Estimating Explained Variation of a Latent Scale Dependent Variable Underlying a Binary Indicator of Event Occurrence International Journal of Statistics and Probability; Vol. 4, No. 1; 2015 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Estimating Explained Variation of a Latent

More information

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

1 The Multiple Regression Model: Freeing Up the Classical Assumptions 1 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions were crucial for many of the derivations of the previous chapters. Derivation of the OLS estimator

More information

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

STA 216, GLM, Lecture 16. October 29, 2007

STA 216, GLM, Lecture 16. October 29, 2007 STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural

More information

Marginal effects and extending the Blinder-Oaxaca. decomposition to nonlinear models. Tamás Bartus

Marginal effects and extending the Blinder-Oaxaca. decomposition to nonlinear models. Tamás Bartus Presentation at the 2th UK Stata Users Group meeting London, -2 Septermber 26 Marginal effects and extending the Blinder-Oaxaca decomposition to nonlinear models Tamás Bartus Institute of Sociology and

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics Michael Bar October 3, 08 San Francisco State University, department of economics. ii Contents Preliminaries. Probability Spaces................................. Random Variables.................................

More information

Appendix A. Math Reviews 03Jan2007. A.1 From Simple to Complex. Objectives. 1. Review tools that are needed for studying models for CLDVs.

Appendix A. Math Reviews 03Jan2007. A.1 From Simple to Complex. Objectives. 1. Review tools that are needed for studying models for CLDVs. Appendix A Math Reviews 03Jan007 Objectives. Review tools that are needed for studying models for CLDVs.. Get you used to the notation that will be used. Readings. Read this appendix before class.. Pay

More information

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 217, Boston, Massachusetts Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

Fitting heterogeneous choice models with oglm

Fitting heterogeneous choice models with oglm The Stata Journal (2010) 10, Number 4, pp. 540 567 Fitting heterogeneous choice models with oglm Richard Williams Department of Sociology University of Notre Dame Notre Dame, IN Richard.A.Williams.5@nd.edu

More information

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information 1. Qualitative Information Qualitative Information Up to now, we assume that all the variables has quantitative meaning. But often in empirical work, we must incorporate qualitative factor into regression

More information

Strati cation in Multivariate Modeling

Strati cation in Multivariate Modeling Strati cation in Multivariate Modeling Tihomir Asparouhov Muthen & Muthen Mplus Web Notes: No. 9 Version 2, December 16, 2004 1 The author is thankful to Bengt Muthen for his guidance, to Linda Muthen

More information

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 20, 2018

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 20, 2018 Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 20, 2018 References: Long 1997, Long and Freese 2003 & 2006 & 2014,

More information

ECONOMETRICS II (ECO 2401) Victor Aguirregabiria. Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS

ECONOMETRICS II (ECO 2401) Victor Aguirregabiria. Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS ECONOMETRICS II (ECO 2401) Victor Aguirregabiria Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS 1. Introduction and Notation 2. Randomized treatment 3. Conditional independence

More information

Lecture Notes 12 Advanced Topics Econ 20150, Principles of Statistics Kevin R Foster, CCNY Spring 2012

Lecture Notes 12 Advanced Topics Econ 20150, Principles of Statistics Kevin R Foster, CCNY Spring 2012 Lecture Notes 2 Advanced Topics Econ 2050, Principles of Statistics Kevin R Foster, CCNY Spring 202 Endogenous Independent Variables are Invalid Need to have X causing Y not vice-versa or both! NEVER regress

More information

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005 DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005 The lectures will survey the topic of count regression with emphasis on the role on unobserved heterogeneity.

More information

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci

More information

Regression with Qualitative Information. Part VI. Regression with Qualitative Information

Regression with Qualitative Information. Part VI. Regression with Qualitative Information Part VI Regression with Qualitative Information As of Oct 17, 2017 1 Regression with Qualitative Information Single Dummy Independent Variable Multiple Categories Ordinal Information Interaction Involving

More information

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen January 23-24, 2012 Page 1 Part I The Single Level Logit Model: A Review Motivating Example Imagine we are interested in voting

More information

Overview. Overview. Overview. Specific Examples. General Examples. Bivariate Regression & Correlation

Overview. Overview. Overview. Specific Examples. General Examples. Bivariate Regression & Correlation Bivariate Regression & Correlation Overview The Scatter Diagram Two Examples: Education & Prestige Correlation Coefficient Bivariate Linear Regression Line SPSS Output Interpretation Covariance ou already

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

Limited Dependent Variables and Panel Data

Limited Dependent Variables and Panel Data Limited Dependent Variables and Panel Data Logit, Probit and Friends Benjamin Bittschi Sebastian Koch Outline Binary dependent variables Logit Fixed Effects Models Probit Random Effects Models Censored

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

Empirical Methods in Applied Microeconomics

Empirical Methods in Applied Microeconomics Empirical Methods in Applied Microeconomics Jörn-Ste en Pischke LSE November 2007 1 Nonlinearity and Heterogeneity We have so far concentrated on the estimation of treatment e ects when the treatment e

More information

Regression. Econometrics II. Douglas G. Steigerwald. UC Santa Barbara. D. Steigerwald (UCSB) Regression 1 / 17

Regression. Econometrics II. Douglas G. Steigerwald. UC Santa Barbara. D. Steigerwald (UCSB) Regression 1 / 17 Regression Econometrics Douglas G. Steigerwald UC Santa Barbara D. Steigerwald (UCSB) Regression 1 / 17 Overview Reference: B. Hansen Econometrics Chapter 2.20-2.23, 2.25-2.29 Best Linear Projection is

More information

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School

More information

Comparing Logit & Probit Coefficients Between Nested Models (Old Version)

Comparing Logit & Probit Coefficients Between Nested Models (Old Version) Comparing Logit & Probit Coefficients Between Nested Models (Old Version) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised March 1, 2016 NOTE: Long and Freese s spost9

More information

13 Endogeneity and Nonparametric IV

13 Endogeneity and Nonparametric IV 13 Endogeneity and Nonparametric IV 13.1 Nonparametric Endogeneity A nonparametric IV equation is Y i = g (X i ) + e i (1) E (e i j i ) = 0 In this model, some elements of X i are potentially endogenous,

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

Rethinking Inequality Decomposition: Comment. *London School of Economics. University of Milan and Econpubblica

Rethinking Inequality Decomposition: Comment. *London School of Economics. University of Milan and Econpubblica Rethinking Inequality Decomposition: Comment Frank A. Cowell and Carlo V. Fiorio *London School of Economics University of Milan and Econpubblica DARP 82 February 2006 The Toyota Centre Suntory and Toyota

More information

Logistic regression: Uncovering unobserved heterogeneity

Logistic regression: Uncovering unobserved heterogeneity Logistic regression: Uncovering unobserved heterogeneity Carina Mood April 17, 2017 1 Introduction The logistic model is an elegant way of modelling non-linear relations, yet the behaviour and interpretation

More information

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z

More information

Lab 10 - Binary Variables

Lab 10 - Binary Variables Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2

More information

Nemours Biomedical Research Statistics Course. Li Xie Nemours Biostatistics Core October 14, 2014

Nemours Biomedical Research Statistics Course. Li Xie Nemours Biostatistics Core October 14, 2014 Nemours Biomedical Research Statistics Course Li Xie Nemours Biostatistics Core October 14, 2014 Outline Recap Introduction to Logistic Regression Recap Descriptive statistics Variable type Example of

More information

Regression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression

Regression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression LOGISTIC REGRESSION Regression techniques provide statistical analysis of relationships. Research designs may be classified as eperimental or observational; regression analyses are applicable to both types.

More information

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary

More information

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z

More information

Editor Executive Editor Associate Editors Copyright Statement:

Editor Executive Editor Associate Editors Copyright Statement: The Stata Journal Editor H. Joseph Newton Department of Statistics Texas A & M University College Station, Texas 77843 979-845-3142 979-845-3144 FAX jnewton@stata-journal.com Associate Editors Christopher

More information

Lecture Notes Part 4: Nonlinear Models

Lecture Notes Part 4: Nonlinear Models 17.874 Lecture Notes Part 4: Nonlinear Models 4. Non-Linear Models 4.1. Functional Form We have discussed one important violation of the regression assumption { omitted variables. And, we have touched

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor

A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z Boston College First

More information

Econometrics Problem Set 10

Econometrics Problem Set 10 Econometrics Problem Set 0 WISE, Xiamen University Spring 207 Conceptual Questions Dependent variable: P ass Probit Logit LPM Probit Logit LPM Probit () (2) (3) (4) (5) (6) (7) Experience 0.03 0.040 0.006

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Logistic Regression Some slides from Craig Burkett STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Titanic Survival Case Study The RMS Titanic A British passenger liner Collided

More information

Multiple Regression: Chapter 13. July 24, 2015

Multiple Regression: Chapter 13. July 24, 2015 Multiple Regression: Chapter 13 July 24, 2015 Multiple Regression (MR) Response Variable: Y - only one response variable (quantitative) Several Predictor Variables: X 1, X 2, X 3,..., X p (p = # predictors)

More information

Simple Estimators for Semiparametric Multinomial Choice Models

Simple Estimators for Semiparametric Multinomial Choice Models Simple Estimators for Semiparametric Multinomial Choice Models James L. Powell and Paul A. Ruud University of California, Berkeley March 2008 Preliminary and Incomplete Comments Welcome Abstract This paper

More information

Problem set 1 - Solutions

Problem set 1 - Solutions EMPIRICAL FINANCE AND FINANCIAL ECONOMETRICS - MODULE (8448) Problem set 1 - Solutions Exercise 1 -Solutions 1. The correct answer is (a). In fact, the process generating daily prices is usually assumed

More information

Notes on Generalized Method of Moments Estimation

Notes on Generalized Method of Moments Estimation Notes on Generalized Method of Moments Estimation c Bronwyn H. Hall March 1996 (revised February 1999) 1. Introduction These notes are a non-technical introduction to the method of estimation popularized

More information

1 Fixed E ects and Random E ects

1 Fixed E ects and Random E ects 1 Fixed E ects and Random E ects Estimation 1.1 Fixed E ects Introduction Fixed e ects model: y it = + x it + f i + it E ( it jx it ; f i ) = 0 Suppose we just run: y it = + x it + it Then we get: ^ =

More information

Week 7: Binary Outcomes (Scott Long Chapter 3 Part 2)

Week 7: Binary Outcomes (Scott Long Chapter 3 Part 2) Week 7: (Scott Long Chapter 3 Part 2) Tsun-Feng Chiang* *School of Economics, Henan University, Kaifeng, China April 29, 2014 1 / 38 ML Estimation for Probit and Logit ML Estimation for Probit and Logit

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

CORRELATION. suppose you get r 0. Does that mean there is no correlation between the data sets? many aspects of the data may a ect the value of r

CORRELATION. suppose you get r 0. Does that mean there is no correlation between the data sets? many aspects of the data may a ect the value of r Introduction to Statistics in Psychology PS 1 Professor Greg Francis Lecture 11 correlation Is there a relationship between IQ and problem solving ability? CORRELATION suppose you get r 0. Does that mean

More information

CIVL 7012/8012. Simple Linear Regression. Lecture 3

CIVL 7012/8012. Simple Linear Regression. Lecture 3 CIVL 7012/8012 Simple Linear Regression Lecture 3 OLS assumptions - 1 Model of population Sample estimation (best-fit line) y = β 0 + β 1 x + ε y = b 0 + b 1 x We want E b 1 = β 1 ---> (1) Meaning we want

More information

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables

More information

Problem Set 10: Panel Data

Problem Set 10: Panel Data Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005

More information

Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions

Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions Econ 513, USC, Department of Economics Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions I Introduction Here we look at a set of complications with the

More information

Christopher Dougherty London School of Economics and Political Science

Christopher Dougherty London School of Economics and Political Science Introduction to Econometrics FIFTH EDITION Christopher Dougherty London School of Economics and Political Science OXFORD UNIVERSITY PRESS Contents INTRODU CTION 1 Why study econometrics? 1 Aim of this

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Pseudo panels and repeated cross-sections

Pseudo panels and repeated cross-sections Pseudo panels and repeated cross-sections Marno Verbeek November 12, 2007 Abstract In many countries there is a lack of genuine panel data where speci c individuals or rms are followed over time. However,

More information

Making sense of Econometrics: Basics

Making sense of Econometrics: Basics Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/

More information

Cohen s s Kappa and Log-linear Models

Cohen s s Kappa and Log-linear Models Cohen s s Kappa and Log-linear Models HRP 261 03/03/03 10-11 11 am 1. Cohen s Kappa Actual agreement = sum of the proportions found on the diagonals. π ii Cohen: Compare the actual agreement with the chance

More information

Answer Key: Problem Set 5

Answer Key: Problem Set 5 : Problem Set 5. Let nopc be a dummy variable equal to one if the student does not own a PC, and zero otherwise. i. If nopc is used instead of PC in the model of: colgpa = β + δ PC + β hsgpa + β ACT +

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Economics 326 Methods of Empirical Research in Economics Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Vadim Marmer University of British Columbia May 5, 2010 Multiple restrictions

More information

xtunbalmd: Dynamic Binary Random E ects Models Estimation with Unbalanced Panels

xtunbalmd: Dynamic Binary Random E ects Models Estimation with Unbalanced Panels : Dynamic Binary Random E ects Models Estimation with Unbalanced Panels Pedro Albarran* Raquel Carrasco** Jesus M. Carro** *Universidad de Alicante **Universidad Carlos III de Madrid 2017 Spanish Stata

More information