Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models
許湘伶
Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li)
14.1 Regression Models with Binary Response Variable

- The response variable of interest has only two possible qualitative outcomes.
- Such outcomes can be represented by a binary indicator variable taking the values 0 and 1.
- A response variable of this kind is said to involve binary, or dichotomous, responses.
Meaning of Response Function when Outcome Variable is Binary
1. Consider the simple linear regression model:
   Y_i = β_0 + β_1 X_i + ε_i,  Y_i = 0, 1,  E{ε_i} = 0
   so that E{Y_i} = β_0 + β_1 X_i.
2. Consider Y_i to be a Bernoulli random variable:
   Y_i = 1 with probability P(Y_i = 1) = π_i
   Y_i = 0 with probability P(Y_i = 0) = 1 − π_i
   Then E{Y_i} = 1(π_i) + 0(1 − π_i) = π_i = P(Y_i = 1) = β_0 + β_1 X_i.
Meaning of Response Function when Outcome Variable is Binary (cont.)
The mean response E{Y_i} = β_0 + β_1 X_i is simply the probability that Y_i = 1 when the level of the predictor variable is X_i.
Special Problems when Response Variable is Binary
1. Nonnormal error terms: ε_i = Y_i − (β_0 + β_1 X_i), so
   when Y_i = 1: ε_i = 1 − β_0 − β_1 X_i
   when Y_i = 0: ε_i = −β_0 − β_1 X_i
2. Nonconstant error variance: with E{Y_i} = π_i,
   σ²{Y_i} = σ²{ε_i} = π_i(1 − π_i) = (E{Y_i})(1 − E{Y_i}),
   which is not constant since it depends on X_i through π_i (illustrated in the simulation sketch below).
3. Constraints on the response function: 0 ≤ E{Y} = π ≤ 1.
   Many response functions do not automatically possess this constraint.
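A minimal simulation sketch of point 2 (all parameter values are hypothetical, not from the text): for binary Y the error variance π_i(1 − π_i) changes with X_i, so the constant-variance assumption of ordinary linear regression fails.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 0.1, 0.02          # assumed linear response function
X = rng.uniform(5, 35, size=1000)
pi = beta0 + beta1 * X            # E{Y_i} = pi_i, here kept inside [0, 1]
Y = rng.binomial(1, pi)           # Bernoulli responses

# The variance of Y in slices of X tracks pi*(1 - pi), not a constant:
for lo in (5, 15, 25):
    mask = (X >= lo) & (X < lo + 10)
    print(f"X in [{lo},{lo+10}): var(Y) = {Y[mask].var():.3f}, "
          f"pi(1-pi) avg = {(pi[mask] * (1 - pi[mask])).mean():.3f}")
```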
Sigmoidal (S-shaped) Response Functions for Binary Response
Example: a health researcher studies the effect of a mother's use of alcohol.
- X: an index of the degree of alcohol use during pregnancy
- Y^c: the duration of her pregnancy (a continuous response)
- Simple linear regression model: Y_i^c = β_0^c + β_1^c X_i + ε_i^c, ε_i^c ~ N(0, σ_c²),
  for which the usual simple linear regression analysis applies.
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
Code each pregnancy as
Y_i = 1 if Y_i^c ≤ T weeks;  Y_i = 0 if Y_i^c > T weeks.
Then
P(Y_i = 1) = π_i = P(Y_i^c ≤ T)
           = P(β_0^c + β_1^c X_i + ε_i^c ≤ T)
           = P( ε_i^c/σ_c ≤ (T − β_0^c)/σ_c − (β_1^c/σ_c) X_i )
           = P(Z ≤ β_0* + β_1* X_i) = Φ(β_0* + β_1* X_i),
where β_0* = (T − β_0^c)/σ_c, β_1* = −β_1^c/σ_c, and Φ(z) = P(Z ≤ z) for Z ~ N(0, 1).
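A small numerical check of this derivation, with assumed latent-model values (β_0^c = 40, β_1^c = −1.5, σ_c = 2, T = 36; none of these come from the text): the simulated probability P(Y_i^c ≤ T) matches Φ(β_0* + β_1* X_i).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
b0c, b1c, sigma_c, T = 40.0, -1.5, 2.0, 36.0   # hypothetical latent-model values
x = 3.0

beta0_star = (T - b0c) / sigma_c                # beta_0* = (T - beta_0^c)/sigma_c
beta1_star = -b1c / sigma_c                     # beta_1* = -beta_1^c/sigma_c

Yc = b0c + b1c * x + sigma_c * rng.standard_normal(200_000)
print((Yc <= T).mean())                         # empirical P(Y_i = 1), ~0.599
print(norm.cdf(beta0_star + beta1_star * x))    # Phi(beta_0* + beta_1* x), ~0.599
```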
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
1. Probit mean response function:
   E{Y_i} = π_i = Φ(β_0* + β_1* X_i)
   Φ^{−1}(π_i) = π_i′ = β_0* + β_1* X_i
   Φ^{−1} is sometimes called the probit transformation; π_i′ = β_0* + β_1* X_i is the probit response function, or linear predictor.
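A two-line sketch of this pair, continuing the assumed values from the check above: scipy's norm.cdf gives Φ and norm.ppf gives Φ^{−1}, so the probit transformation of π_i recovers the linear predictor.

```python
from scipy.stats import norm

b0_star, b1_star, x = -2.0, 0.75, 3.0   # assumed values, as in the check above
pi = norm.cdf(b0_star + b1_star * x)    # probit mean response: Phi(linear predictor)
print(norm.ppf(pi))                     # probit transformation recovers 0.25
```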
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
Properties of the probit response function:
- It is bounded between 0 and 1.
- As β_1* increases (with β_0* = 0), the curve becomes more S-shaped, changing more rapidly in the center.
- Changing the sign of β_1* reverses the direction of the curve.
- Symmetry: recoding the responses as Y_i* = 1 − Y_i only changes the signs of the coefficients, since Φ(z) = 1 − Φ(−z):
  P(Y_i* = 1) = P(Y_i = 0) = 1 − Φ(β_0* + β_1* X_i) = Φ(−β_0* − β_1* X_i)
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
Logistic function:
- When standardized to mean 0 and variance 1, it is similar to the standard normal distribution, but with slightly heavier tails.
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
Let ε_L be a logistic random variable with µ = 0 and σ = π/√3:
f_L(ε_L) = exp(ε_L)/[1 + exp(ε_L)]²,  F_L(ε_L) = exp(ε_L)/[1 + exp(ε_L)]
If ε_i^c instead follows a logistic distribution with µ = 0 and standard deviation σ_c, then, since E{ε_i^c/σ_c} = 0 and σ{ε_i^c/σ_c} = 1:
P(Y_i = 1) = π_i = P( ε_i^c/σ_c ≤ β_0* + β_1* X_i )
           = P( (π/√3)(ε_i^c/σ_c) ≤ (π/√3)β_0* + (π/√3)β_1* X_i )
           = P(ε_L ≤ β_0 + β_1 X_i)
           = F_L(β_0 + β_1 X_i) = exp(β_0 + β_1 X_i)/[1 + exp(β_0 + β_1 X_i)],
where β_0 = (π/√3)β_0* and β_1 = (π/√3)β_1*.
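A quick illustrative comparison of the standard logistic CDF with the normal CDF matched to the same mean 0 and standard deviation π/√3: the logistic leaves slightly more probability in the tails.

```python
import numpy as np
from scipy.stats import logistic, norm

z = np.array([0.0, 1.0, 2.0, 4.0])
print(logistic.cdf(z))                        # F_L(z) = exp(z)/(1 + exp(z))
print(norm.cdf(z, scale=np.pi / np.sqrt(3)))  # normal with matching sd pi/sqrt(3)
# At z = 4 the logistic CDF is ~0.982 vs ~0.986 for the normal: heavier tails.
```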
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
2. Logistic mean response function:
   E{Y_i} = π_i = F_L(β_0 + β_1 X_i) = exp(β_0 + β_1 X_i)/[1 + exp(β_0 + β_1 X_i)] = [1 + exp(−β_0 − β_1 X_i)]^{−1}
   F_L^{−1}(π_i) = π_i′ = β_0 + β_1 X_i
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
F_L^{−1}(π_i) = β_0 + β_1 X_i = log_e[ π_i/(1 − π_i) ]:
- log_e[π_i/(1 − π_i)] is called the logit transformation of the probability π_i.
- π_i/(1 − π_i) is called the odds.
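A short sketch of the logit transformation and odds with assumed values β_0 = −2, β_1 = 0.1, X = 30: taking the log of the odds recovers the linear predictor.

```python
import numpy as np

beta0, beta1, x = -2.0, 0.1, 30.0        # assumed values
eta = beta0 + beta1 * x                  # linear predictor = 1.0
pi = np.exp(eta) / (1 + np.exp(eta))     # logistic mean response, ~0.731
odds = pi / (1 - pi)                     # the odds, exp(1) ~= 2.718
print(np.log(odds))                      # logit(pi) recovers 1.0
```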
Sigmoidal (S-shaped) Response Functions for Binary Response (cont.)
3. Complementary log-log response function (based on the extreme value, or Gumbel, probability distribution):
   E{Y_i} = π_i = 1 − exp(−exp(β_0^G + β_1^G X_i))
   π′ = log[−log(1 − π(X_i))] = β_0^G + β_1^G X_i
Simple Logistic Regression

The logistic response function is the most widely used because:
1. The regression parameters have relatively simple and useful interpretations.
2. Statistical software is widely available for the analysis of logistic regression models (see the fitting sketch below).
Estimation of the parameters:
- Maximum likelihood estimation (MLE) is used to estimate the parameters of the logistic response function.
- It utilizes the Bernoulli distribution of a binary random variable.
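As a minimal illustration of point 2, here is a fit with statsmodels on simulated data (the true values β_0 = −2, β_1 = 0.5 are assumptions for the simulation, not data from the text).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=200)
pi = 1 / (1 + np.exp(-(-2.0 + 0.5 * x)))   # assumed true response function
y = rng.binomial(1, pi)

X = sm.add_constant(x)                     # prepend the intercept column
fit = sm.Logit(y, X).fit(disp=0)           # MLE via iterative numerical search
print(fit.params)                          # b_0, b_1 near the true -2.0, 0.5
```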
Simple Logistic Regression (cont.)
Simple logistic regression model:
- Y_i ~ Bernoulli(π_i), with E{Y_i} = π_i
- Y_i = E{Y_i} + ε_i, where the distribution of ε_i depends on the Bernoulli distribution of the response Y_i
- The Y_i are independent Bernoulli random variables with expected values
  E{Y_i} = π_i = exp(β_0 + β_1 X_i)/[1 + exp(β_0 + β_1 X_i)]
Simple Logistic Regression (cont.)
Likelihood function: each Y_i has P(Y_i = 1) = π_i and P(Y_i = 0) = 1 − π_i, so
f_i(Y_i) = π_i^{Y_i} (1 − π_i)^{1−Y_i},  Y_i = 0, 1;  i = 1, ..., n
The joint probability function:
g(Y_1, ..., Y_n) = ∏_{i=1}^n f_i(Y_i) = ∏_{i=1}^n π_i^{Y_i} (1 − π_i)^{1−Y_i}
Simple Logistic Regression (cont.)
Taking logarithms of the joint probability function:
ln g(Y_1, ..., Y_n) = Σ_{i=1}^n [ Y_i ln( π_i/(1 − π_i) ) + ln(1 − π_i) ]
Since 1 − π_i = [1 + exp(β_0 + β_1 X_i)]^{−1} and ln[π_i/(1 − π_i)] = β_0 + β_1 X_i, the log likelihood is
ln L(β_0, β_1) = Σ_{i=1}^n Y_i(β_0 + β_1 X_i) − Σ_{i=1}^n ln[1 + exp(β_0 + β_1 X_i)]
There is no closed-form solution for the values of β_0, β_1 that maximize ln L(β_0, β_1).
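A sketch of the numerical search, reusing the simulated x, y from the statsmodels example above: minimizing −ln L(β_0, β_1) with a general-purpose optimizer reproduces the MLE.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(beta, x, y):
    eta = beta[0] + beta[1] * x
    # ln L = sum Y_i (b0 + b1 X_i) - sum ln[1 + exp(b0 + b1 X_i)]
    return -(np.sum(y * eta) - np.sum(np.log1p(np.exp(eta))))

# x, y as simulated in the statsmodels sketch above
res = minimize(neg_log_lik, x0=np.zeros(2), args=(x, y))
b0, b1 = res.x
print(b0, b1)   # matches the statsmodels MLE to numerical tolerance
```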
Simple Logistic Regression (cont.)
Maximum likelihood estimation:
- Finding the MLEs b_0, b_1 requires computer-intensive numerical search procedures.
- Once b_0 and b_1 are found, the fitted response function and fitted logit are
  π̂_i = exp(b_0 + b_1 X_i)/[1 + exp(b_0 + b_1 X_i)],  π̂′ = ln[ π̂/(1 − π̂) ] = b_0 + b_1 X
Simple Logistic Regression (cont.)
1. Examine the appropriateness of the fitted response function.
2. If the fit is good, make a variety of inferences and predictions.
Simple Logistic Regression (cont.)
Interpretation of b_1:
- The estimated regression coefficient b_1 in the fitted logistic response function does not have the straightforward slope interpretation of a linear regression model.
- One interpretation of b_1: the estimated odds π̂/(1 − π̂) are multiplied by exp(b_1) for any unit increase in X.
- Let π̂′(X_j) = b_0 + b_1 X_j denote the logarithm of the estimated odds when X = X_j, written ln(odds_1).
Simple Logistic Regression (cont.)
Similarly, ln(odds_2) = π̂′(X_j + 1), so
b_1 = π̂′(X_j + 1) − π̂′(X_j) = ln(odds_2) − ln(odds_1) = ln(odds_2/odds_1)
The estimated odds ratio is
ÔR = odds_2/odds_1 = exp(b_1)
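A tiny numeric check with assumed fitted values b_0 = −2, b_1 = 0.5: the ratio of the estimated odds at X_j + 1 to those at X_j equals exp(b_1) regardless of X_j.

```python
import numpy as np

b0, b1 = -2.0, 0.5                   # assumed fitted coefficients
xj = 4.0
odds1 = np.exp(b0 + b1 * xj)         # odds at X_j (exp of the fitted logit)
odds2 = np.exp(b0 + b1 * (xj + 1))   # odds at X_j + 1
print(odds2 / odds1, np.exp(b1))     # both equal exp(b_1) ~= 1.649
```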
Deviance Goodness of Fit Test

- The deviance goodness of fit test is completely analogous to the F test for lack of fit for simple and multiple linear regression models.
- Assume there are c unique combinations of the predictors, X_1, ..., X_c, with n_j the number of repeat binary observations at X_j and Y_ij the ith binary response at X_j.
- The deviance goodness of fit test is a likelihood ratio test of the reduced model against the full model:
  Reduced model: E{Y_ij} = [1 + exp(−X_j′β)]^{−1}
  Full model: E{Y_ij} = π_j, j = 1, ..., c, where the π_j are parameters.
Deviance Goodness of Fit Test (cont.)
Let p_j = Y_j/n_j, j = 1, ..., c, be the full-model estimates, and let π̂_j be the reduced-model estimate of π_j.
The likelihood ratio test statistic, the deviance, is
G² = −2[ ln L(R) − ln L(F) ]
   = −2 Σ_{j=1}^c [ Y_j ln( π̂_j/p_j ) + (n_j − Y_j) ln( (1 − π̂_j)/(1 − p_j) ) ]
   = DEV(X_0, X_1, ..., X_{p−1})
Deviance Goodness of Fit Test (cont.)
Test the alternatives:
H_0: E{Y_ij} = [1 + exp(−X_j′β)]^{−1}
H_a: E{Y_ij} ≠ [1 + exp(−X_j′β)]^{−1}
Decision rule:
If DEV(X_0, X_1, ..., X_{p−1}) ≤ χ²(1 − α; c − p), conclude H_0.
If DEV(X_0, X_1, ..., X_{p−1}) > χ²(1 − α; c − p), conclude H_a.
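A sketch of the test computation on hypothetical grouped counts (the n_j, Y_j, and reduced-model fits π̂_j below are made up for illustration, not taken from the text):

```python
import numpy as np
from scipy.stats import chi2

n = np.array([20, 25, 30, 25, 20])                 # n_j replicates at each X_j
Y = np.array([3, 8, 15, 17, 16])                   # Y_j = number of 1s at X_j
p = Y / n                                          # full-model estimates p_j
pi_hat = np.array([0.16, 0.31, 0.50, 0.69, 0.80])  # reduced-model fits (assumed)

dev = -2 * np.sum(Y * np.log(pi_hat / p)
                  + (n - Y) * np.log((1 - pi_hat) / (1 - p)))
c, n_par = len(n), 2                               # c levels, p = 2 parameters
print(dev, chi2.ppf(0.95, df=c - n_par))           # conclude H_a if dev exceeds
```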
Logistic Regression Diagnostics

Various residuals:
- Ordinary residuals e_i:
  e_i = Y_i − π̂_i, i.e., e_i = 1 − π̂_i if Y_i = 1 and e_i = −π̂_i if Y_i = 0
  They are not normally distributed, and plots of e_i against Ŷ_i or X_j will be uninformative.
- Pearson residuals r_{Pi}:
  r_{Pi} = (Y_i − π̂_i)/√( π̂_i(1 − π̂_i) )
  These are related to the Pearson chi-square goodness of fit statistic.
Logistic Regression Diagnostics (cont.)
- Studentized Pearson residuals r_{SPi}:
  r_{SPi} = r_{Pi}/√(1 − h_{ii}),  H = Ŵ^{1/2} X (X′ŴX)^{−1} X′ Ŵ^{1/2}
  where Ŵ is the n × n diagonal matrix with elements π̂_i(1 − π̂_i).
- Deviance residuals dev_i:
  DEV(X_0, X_1, ..., X_{p−1}) = −2 Σ_{i=1}^n [ Y_i ln(π̂_i) + (1 − Y_i) ln(1 − π̂_i) ]
  dev_i = sign(Y_i − π̂_i) √( −2[ Y_i ln(π̂_i) + (1 − Y_i) ln(1 − π̂_i) ] )
  so that Σ_{i=1}^n dev_i² = DEV(X_0, X_1, ..., X_{p−1}).
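A sketch collecting these diagnostic formulas in one function; X, y, and pi_hat are assumed inputs (a design matrix with intercept column, binary responses, and fitted probabilities from some fit):

```python
import numpy as np

def logistic_residuals(X, y, pi_hat):
    """Ordinary, Pearson, studentized Pearson, and deviance residuals."""
    e = y - pi_hat                                   # ordinary residuals e_i
    r_p = e / np.sqrt(pi_hat * (1 - pi_hat))         # Pearson residuals
    W = np.diag(pi_hat * (1 - pi_hat))               # W-hat, diagonal
    Wh = np.sqrt(W)
    H = Wh @ X @ np.linalg.inv(X.T @ W @ X) @ X.T @ Wh
    r_sp = r_p / np.sqrt(1 - np.diag(H))             # studentized Pearson
    dev = np.sign(e) * np.sqrt(-2 * (y * np.log(pi_hat)
                                     + (1 - y) * np.log(1 - pi_hat)))
    return e, r_p, r_sp, dev  # note: np.sum(dev**2) equals the model deviance
```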