Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes 1

Similar documents
Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Binary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment

Group comparisons in logit and probit using predicted probabilities 1

STA 216, GLM, Lecture 16. October 29, 2007

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Week 7: Binary Outcomes (Scott Long Chapter 3 Part 2)

Comparing groups using predicted probabilities

Linear Regression Models P8111

Generalized Linear Models for Non-Normal Data

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

LOGISTIC REGRESSION Joseph M. Hilbe

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Generalized Linear Models Introduction

Data-analysis and Retrieval Ordinal Classification

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

New York University Department of Economics. Applied Statistics and Econometrics G Spring 2013

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions

Linear Regression. Data Model. β, σ 2. Process Model. ,V β. ,s 2. s 1. Parameter Model

Multiple regression: Categorical dependent variables

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Ordered Response and Multinomial Logit Estimation

Introduction to General and Generalized Linear Models

Generalized Linear Models

Generalized linear models

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

STAT5044: Regression and Anova

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

ZERO INFLATED POISSON REGRESSION

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

The Logit Model: Estimation, Testing and Interpretation

,..., θ(2),..., θ(n)

Single-level Models for Binary Responses

An Introduction to Bayesian Linear Regression

ECON 594: Lecture #6

Mixed Models for Longitudinal Ordinal and Nominal Outcomes

Three hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER.

Advanced Quantitative Methods: limited dependent variables

Modeling Binary Outcomes: Logit and Probit Models

covariance between any two observations

Investigating Models with Two or Three Categories

Chapter 11 The COUNTREG Procedure (Experimental)

Applied Econometrics Lecture 1

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours

Limited Dependent Variable Models II

Generalized Linear Models

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

Applied Health Economics (for B.Sc.)

Linear Regression With Special Variables

2. We care about proportion for categorical variable, but average for numerical one.

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

Discrete Choice Modeling

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

Applied Linear Statistical Methods

Homework 1 Solutions

COMPLEMENTARY LOG-LOG MODEL

Logistic Regression. INFO-2301: Quantitative Reasoning 2 Michael Paul and Jordan Boyd-Graber SLIDES ADAPTED FROM HINRICH SCHÜTZE

POLI 618 Notes. Stuart Soroka, Department of Political Science, McGill University. March 2010

Lecture 16 Solving GLMs via IRWLS

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46

Generalized Multilevel Models for Non-Normal Outcomes

Bayesian Multivariate Logistic Regression

Generalized Linear Models and Exponential Families

8 Nominal and Ordinal Logistic Regression

Rockefeller College University at Albany

ML estimation: Random-intercepts logistic model. and z

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen

Generalized Linear Models 1

Modeling Longitudinal Count Data with Excess Zeros and Time-Dependent Covariates: Application to Drug Use

Models for Ordinal Response Data

Bayesian Regression Linear and Logistic Regression

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014

Generalized Models: Part 1

Generalized Linear Models

Mathematical statistics

Non-linear panel data modeling

Stat 5101 Lecture Notes

Linear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.

Probability and Stochastic Processes

Simple ways to interpret effects in modeling ordinal categorical data

Categorical data analysis Chapter 5

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression

LOGISTIC REGRESSION. Lalmohan Bhar Indian Agricultural Statistics Research Institute, New Delhi

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice

Week 1 Quantitative Analysis of Financial Markets Distributions A

Statistics: A review. Why statistics?

SPH 247 Statistical Analysis of Laboratory Data. April 28, 2015 SPH 247 Statistics for Laboratory Data 1

Sections 4.1, 4.2, 4.3

STATISTICAL METHODS FOR SIGNAL PROCESSING c Alfred Hero

Multiclass Logistic Regression

GENERALIZED LINEAR MODELS Joseph M. Hilbe

Transcription:

Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes 1 JunXuJ.ScottLong Indiana University 2005-02-03 1 General Formula The delta method is a general approach for computing confidence intervals for functions of maximum likelihood estimates. The delta method takes a function that is too complex for analytically computing the variance, creates a linear approximation of that function, then computes the variance of the simpler linear function that can be used for large sample inference. We begin with general result for maximum likelihood theory. Under stard regularity conditions, if β b is a vector of ML estimates, then ³ h ³ n bβ β d N 0,V ar bβ i. (1) Let G (β) be some function, such as predicted probabilities from a logit or ordinal logit model. The Taylor series expansion of G( b β) is G( β) b G(β)+( β b β) 0 G 0 (β)+( β b β) 0 G 00 (β )( β b β)/2 (2) G(β)+( β b β) 0 G 0 (β)+o( β b β) 0, where G 0 (β) G 00 (β) are matrices of first second partial derivatives with respect to β, β is some value between β b β. Then, h n G( β) b i G(β) n( β b β) 0 G 0 (β)+o p (1), (3) which leads to G( β) b ³ N G(β), G(β) Var( b β) G(β) (Greene 2000; Agresti 2002). To 0 G(β x) estimate the variance, we evaluate the partials at the ML estimates,,which β 0 β leads to ³ Var G( β) b G(b β) b 0 Var( β) b G(b β) b. (4) Forexample,considerthelogitmodelwith G(β) Pr(y 1 x) Λ (x 0 β). (5) 1 For information on related programs future updates to this program, please check www.indiana.edu/~jslsoc/spost.htm. 1

To compute the confidence interval for Pr (y 1 x), we need the gradient vector Since Λ is a cdf, Λ(x0 β) G(β) h Λ(x0 β) 0 Λ(x 0 β) 1 Λ(x0 β) k x 0 β x 0 β k λ (x 0 β) x k,then Λ(x 0 β) K i 0. (6) G(β) λ (x 0 β) λ (x 0 β) x 1 λ (x 0 0 β) x K, (7) where x 0 1. To compute the confidence interval for a change in the probability as the independent variables change from x a to x b,weusethefunction G (β) Λ(β x a ) Λ (β x b ), (8) where G(β) [Λ(β x a) Λ(β x b )] Λ(β x a) Λ(β x b) Substituting this result into equation 4, Λ(β xa ) Var(G(β)) 0 Var( β) b Λ(β x a) Λ(β xa ) 0 Var( β) b Λ(β x b) Λ(β xb ) 0 Var( β) b Λ(β x a) Λ(β xb ) + 0 Var( β) b Λ(β x b) We now apply these formula to the models considered in prvalue2. 2 Binary Models In binary models, G(β) Pr(y 1 x) F (x 0 β) where F is the cdf for the logistic, normal, or cloglog function. The gradient is F(x 0 β) F(x0 β) x 0 β f(x 0 β)x k x 0 k, (11) β k where f is the pdf corresponding to F.Forthevectorx it follows that F(x 0 β) f(x 0 β)x. (12) From equation 4, Var [Pr(y 1 x)] f(x 0 β)x 0 Var( β)xf(x b 0 β)f(x 0 β) 2 x 0 Var( β)x b. The variances of Pr(y 0 x) Pr(y 0 x) are the equal since [1 F (x 0 β)] f(x 0 β)x (13) Var [Pr(y 0 x)] [ f(x 0 β)] 2 x 0 Var( β)x b. (14) 2. (9) (10).

3 Ordered Logit Probit Assume that there are m 1,J outcome categories, where Pr (y m x) F (τ m x 0 β) F (τ m 1 x 0 β) for j 1,J. (15) Since we assume that τ 0 τ J, F (τ 0 x 0 β)0 F (τ J x 0 β)1. To compute the gradient, It follows that F (τ m x 0 β) F (τ m x 0 β) k (τ m x 0 β) F (τ m x 0 β) F (τ m x 0 β) τ j (τ m x 0 β) (τ m x 0 β) k f (τ m x 0 β)( x k ) (16) (τ m x 0 β) τ j. (17) F (τ m x 0 β) τ j f (τ m x 0 β) if j m (18) F (τ m x 0 β) 0if j 6 m. τ j (19) Using these results with equation 15, Pr (y i m x i ) k [f (τ m x 0 β)( x k )] [f (τ m 1 x 0 β)( x k )] (20) x k f (τ m x 0 β) [f (τ m 1 x 0 β)] then Pr (y i m x i ) F (τ m x 0 β) F (τ m 1 x 0 β) (21) τ j τ j τ j f (τ m x 0 β) if j m f (τ m 1 x 0 β) if j m 1 0 otherwise. For example, with three categories: Pr (y 1 x) F (τ 1 x 0 β) 0 (22) Pr (y 2 x) F (τ 2 x 0 β) F (τ 1 x 0 β) (23) Pr (y 3 x) 1 F (τ 2 x 0 β), (24) Pr (y i 1 x i ) k x k [f (τ 1 x 0 β)] (25) Pr (y i 2 x i ) k x k [f (τ 2 x 0 β) f (τ 1 x 0 β)] (26) Pr (y i 3 x i ) k x k [ f (τ 2 x 0 β)]. (27) 3

With respect to τ, Pr (y i 1 x i ) τ 1 f (τ 1 x 0 β) (28) Pr (y i 1 x i ) τ 2 0 (29) Pr (y i 2 x i ) τ 1 f (τ 1 x 0 β) (30) Pr (y i 2 x i ) τ 2 f (τ 2 x 0 β) (31) Pr (y i 3 x i ) τ 1 0 (32) Pr (y i 3 x i ) τ 2 0. (33) To implement these procedures in Stata, we create the augmented matrices: β β 0 τ 1 τ J 1 0 (34) x 1 x 0 1 0 0 0 x 2 x 0 0 1 0 0 (35). x J 1 x 0 0 0 1 0, such that x 0 j β τ j xβ. (36) We then create the gradients described above. 4 Generalized Ordered Logit The generalized ordered logit model is identical to the ordinal logit model except that the coefficients associated with x differ for each outcome. Since there is an intercept for each outcome, the τ s are fixed to zero β J 0 for identification. Then, Pr (y i 1 x i ) F ( x 0 β m ) (37) Pr (y i m x i ) F ( x 0 β m ) F x 0 β m 1 for m 2,J 1 (38) Pr (y i J x i ) F x 0 β m 1. (39) The gradient with respect to the β s is F ( x 0 β m ) m,k f ( x 0 β m )( x k ), (40) 4

while no gradient for thresholds is needed. Then, Pr (y i m x i ) x k f ( x 0 β m ) f x 0 β m 1. (41) m,k 5 Multinomial Logit Assuming outcomes 1 through J, Pr(y m x) exp (xβ m) JP exp, (42) xβ j where without loss of generality we assume that β 1 0 to identify the model ( accordingly, the derivatives below do not apply to the partial with respect to β 1 ). To simplify notation, let P exp xβ j. The derivative of the probability of m with respect to βn is Using the quotient rule, n n j1 exp (xβ m) 1 n. (43) exp (xβ m) exp (xβ m ) 2. (44) n n Examining each partial in turn. exp (xβ m ) exp (xβ m) m (45) n m n exp(xβ m ) x if m n 0 if m 6 n P J exp xβ j j1 n JP exp xβ j j1 (46) n exp(xβ n ) x, where the last equality follows since the partial of exp xβ j with respect to βn is 0 unless j n. Combining these results. If m n, exp (xβ m ) x exp (xβ m ) 2 x 2 (47) m exp (xβ m ) exp (xβ m ) 2 2 x exp (xβm ) exp (xβ m) exp (xβ m ) x [Pr(y m) Pr (y m)pr(y m)] x Pr(y m)[1 Pr (y m)] x. 5

For m 6 n, For example, for two x s m 1: [0 exp (xβ m )exp(xβ n ) x] 2 (48) n exp (xβ m) exp (xβ n ) x Pr(y m)pr(y n) x. p m (1 p m ) x 1 p m (1 p m ) x 2 p m (1 p m ) 0 (49) m 0 p m p n x 1 p m p n x 2 p m p n. (50) n6m 6 Poisson Negative Binomial Regression In the Poisson regression model, so that Using matrices, μ i exp(x 0 iβ), (51) μ exp (x0 β) (52) k k exp(x0 β) x 0 β x 0 β k exp(x 0 β)x k μx k μ exp (x0 β) exp(x0 β) x 0 β x 0 β μx (53) The probability of a given count is so we can compute the gradient as: Pr (y x) exp ( μ) μy y!, (54) exp ( μ) μ y /y! 1 exp ( μ) μ y μ. (55) k y! μ k 6

Since the last term was computed above, we only need to derive: exp ( μ) μ y exp( μ) μy exp ( μ) + μy μ μ μ exp( μ) yμ y 1 μ y exp ( μ) (56) which leads to: Using matrices, Pr (y x) 1 k y! μ exp ( μ) yμ y 1 μ y exp ( μ) x k (57) exp ( μ) yμy μ y+1 exp ( μ) x k y! yμy μ y+1 exp (μ) y! x k. Pr (y x) The negative binomial model is specified as exp ( μ) yμy μ y+1 exp ( μ) x (58) y! yμy μ y+1 exp (μ) y! x. μ exp(x 0 β + ε) (59) exp(x 0 β)exp(ε), where ε has a gamma distribution with variance α. The counts have a negative binomial distribution Pr (y i x i ) Γ(y µ ν µ yi i + ν) ν μi, (60) y i! Γ(ν) ν + μ i ν + μ i where ν α 1. The derivatives of the log-likelihood are given in Stata Reference, Version 8, page 10. To simplify notation, we define τ lnα, m 1/α, p 1/ (1 + αμ), μ exp(xβ) (61) with ψ (z) beingthedigammafunctionevaluatedatz, τ Then by the chain rule, p (y μ) (62) α (μ y) m ln (1 + αμ)+ψ (y + m) ψ (m). 1+αμ (63) Pr (y x) Pr (y x) Pr(y x) 1 Pr (y x), (64) 7

so that Similarly for τ, Pr (y x) Pr (y x) τ τ Pr (y x). (65) Pr (y x). (66) 7 References Agresti, Alan. 2002. Categorical Data Analysis. 2nd Edition. New York: Wiley. Greene, William H. 2000. Econometric Analysis, 4th Ed. New York: Prentice Hall. File prvalue2-delta-050203.tex 8