Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models

Size: px
Start display at page:

Download "Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models"

Transcription

1 Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 29

2 14.1 Regression Models with Binary Response Variable Regression Models with Binary Response Variable the response variable of interest has only two possible qualitative outcomes be represented by a binary indicator variables: taking values on 0 and 1 A binary response variable is said to involve binary responses or dichotomous( 二元 ) responses hsuhl (NUK) LR Chap 10 2 / 29

3 14.1 Regression Models with Binary Response Variable Meaning of Response Function when Outcome Variable is Binary 1 Consider the simple linear regression model: Y i = β 0 + β 1 X i + ε i, Y i = 0, 1 (E{ε i } = 0) E{Y i } = β 0 + β 1 X i 2 Consider Y i to be a Bernoulli random variable: Y i Probability 1 P(Y i = 1) = π i 0 P(Y i = 0) = 1 π i E{Y i } = 1(π i ) + 0(1 π i ) = π i = P(Y i = 1) = β 0 + β 1 X i hsuhl (NUK) LR Chap 10 3 / 29

4 14.1 Regression Models with Binary Response Variable Meaning of Response Function when Outcome Variable is Binary (cont.) The mean response E{Y i } = β 0 + β 1 X i is simply the probability that Y i = 1 when the level of the predictor variable is X i hsuhl (NUK) LR Chap 10 4 / 29

5 14.1 Regression Models with Binary Response Variable Special Problems when Response Variable is Binary 1 Nonnormal Error Terms: ε i = Y i (β 0 + β 1 X i ) When Y i = 1 : When Y i = 0 : ε i = 1 β 0 β 1 X i ε i = β 0 β 1 X i 2 Nonconstant Error Variance: ε i = Y i π i (π i : constant) σ 2 {Y i } = σ 2 {ε i } = π i (1 π i ) = (E{Y i })(1 E{Y i }) 3 Constraints on Response Function: 0 E{Y } = π 1 Many response function do not automatically posses this constraint. hsuhl (NUK) LR Chap 10 5 / 29

6 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response Ex: health researcher studying the effect of a mother s use of alcohol X: an index of degree of alcohol use during pregnancy Y c : the duration of her pregnancy (continuous response) simple linear regression model: Y c i = β c 0 + β c z X i + ε c i, ε c i N(0, σ 2 c) the usual simple linear regression analysis hsuhl (NUK) LR Chap 10 6 / 29

7 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) Coded each pregnancy: { 1 if Y c Y i = i T weeks 0 if Yi c > T weeks P(Y i = 1) = π i = P(Yi c T ) = P(β0 c + β1x c i + ε c i T ) = P ε c i σ c (Z) T βc 0 σ c (β0 ) = P(Z β 0 + β 1X i ) = Φ(β 0 + β 1X i ) (Φ(z) = P(Z z), Z N(0, 1) ) β1 c X i σ c ( β1 ) hsuhl (NUK) LR Chap 10 7 / 29

8 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) 1 probit mean response function: E{Y i } = π i = Φ(β 0 + β 1X i ) Φ 1 (π i ) = π i = β 0 + β 1X i Φ 1 : sometimes called the probit transformation π i = β 0 + β 1 X i: probit response function or linear predictor hsuhl (NUK) LR Chap 10 8 / 29

9 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) bounded between 0 and 1 β1 (β 0 = 0) more S-shaped, changing rapidly in the center changing the sign of β1 Symmetric function: Y i = 1 Y i (Φ(Z) = 1 Φ( Z)) P(Y i = 1) = P(Y i = 0) = 1 Φ(β 0 + β 1X i ) = Φ( β 0 β 1X i ) hsuhl (NUK) LR Chap 10 9 / 29

10 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) Logistic function Similar to the normal distribution: mean=0, variance=1 slightly heavier tails hsuhl (NUK) LR Chap / 29

11 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) ε L logistic r.v. µ = 0,σ = π/ 3 f L (ε L ) = exp(ε L ) [1 + exp(ε L )] 2, F L(ε L ) = exp(ε L) 1 + exp(ε L ) If ε c i Logistic distribution with µ = 0,σ c P(Y i = 1) = π i = P = P π ε c i 3 ( ε c ) i β0 + β σ 1X i c σ c (ε L ) π β0 + π β1 3 3 (β 0 ) (β 1 ), (E( εc i ) = 0, σ{ εc i } = 1) σ c σ c X i = P (ε L β 0 + β 1 X i ) = F L (β 0 + β 1 X i ) = exp(β 0 + β 1 X i ) 1 + exp(β 0 + β 1 X i ) hsuhl (NUK) LR Chap / 29

12 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) 2 logistic mean response function: E{Y i } = π i = F L (β 0 + β 1 X i ) = exp(β 0 + β 1 X i ) 1 + exp(β 0 + β 1 X i ) = exp( β 0 β 1 X i ) F 1 (π i ) = π i = β 0 + β 1 X i hsuhl (NUK) LR Chap / 29

13 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) ( ) F 1 πi (π i ) = β 0 + β 1 X i = log e : 1 π i called the logit transformation of the probability π i π i : called the odds( 勝算 ) 1 π i hsuhl (NUK) LR Chap / 29

14 14.1 Regression Models with Binary Response Variable Sigmoidal(S 形 ) Response Functions for Binary Response (cont.) 3 complementary log-log response function: extreme value of Gumbel probability distribution E{Y i } = π i = 1 exp( exp(β G 0 + β G 1 X i )) π = log[ log(1 π(x i ))] = β G 0 + β G 1 X i F 1 (π) = log e ( πi 1 π i ): called the logit transformation of the probability π π i 1 π i : called the odds( 勝算 ) hsuhl (NUK) LR Chap / 29

15 Simple Logistic Regression The most widely used: 1 The regression parameters have relatively simple and useful interpretations 2 statistical software is widely available for analysis of logistic regression models Estimation parameters: Estimation: MLE to estimate the parameters of the logistic response function Utilize the Bernoulli distribution for a binary random variable hsuhl (NUK) LR Chap / 29

16 Simple Logistic Regression (cont.) Simple Logistic Regression Model Y Ber(π): E{Y } = π Y i = E{Y i } + ε i the distribution of ε i depends on the Bernoulli distribution of the response Y i Y i are independent Bernoulli random variables with expected values E{Y i } = π, where E{Y i } = π i = exp(β 0 + β 1 X i ) 1 + exp(β 0 + β 1 X i ) hsuhl (NUK) LR Chap / 29

17 Simple Logistic Regression (cont.) Likelihood function: Each Y i : P(Y i = 1) = π i ; P(Y i = 0) = 1 π i f i (Y i ) = π Y i i (1 π i ) 1 Y i, Y i = 0, 1; i = 1,..., n The joint probability function: n n g(y 1,..., Y n ) = f i (Y i ) = π Y i i (1 π i ) 1 Y i i=1 i=1 hsuhl (NUK) LR Chap / 29

18 Simple Logistic Regression (cont.) n n g(y 1,..., Y n ) = f i (Y i ) = π Y i i (1 π i ) 1 Y i i=1 i=1 n [ ( )] πi n ln g(y 1,..., Y n ) = Y i ln + ln(1 π i ) 1 π i=1 i i=1 ( 1 π i = [1 + exp(β 0 + β 1 X i )] 1 ) ( ) πi ( ln = β 0 + β 1 X i ) 1 π i n n ln L(β 0, β 1 ) = Y i (β 0 + β 1 X i ) ln[1 + exp(β 0 + β 1 X i )] i=1 i=1 No closed-form solution for β 0, β 1 that max ln L(β 0, β 1 ) hsuhl (NUK) LR Chap / 29

19 Simple Logistic Regression (cont.) Maximum Likelihood Estimation: To find the MLE b 0, b 1 : require computer-intensive numerical search procedures Once b 0, b 1 are found: the logit transformation: ˆπ i = exp(b 0 + b 1 X i ) 1 + exp(b 0 + b 1 X i ) ˆπ = ln ( ) ˆπ = b 0 + b 1 X 1 ˆπ hsuhl (NUK) LR Chap / 29

20 Simple Logistic Regression (cont.) 1 examine the appropriateness of the fitted response function 2 if the fit is good make a variety of inferences and predictions hsuhl (NUK) LR Chap / 29

21 Simple Logistic Regression (cont.) Interpretation of b 1 : The interpretation of the estimated regression coefficient b 1 in the fitted logistic response function is not the straightforward interpretation of the slope in a linear regression model. An interpretation of b 1 : the estimated odds ˆπ/(1 ˆπ) are multiplied by exp(b 1 ) for any unit increase in X ˆπ (X j ): the logarithm of the estimated odds when X = X j (denoted as ln(odds 1 )) ˆπ (X j ) = b 0 + b 1 (X j ) hsuhl (NUK) LR Chap / 29

22 Simple Logistic Regression (cont.) Similarly, ln(odds 2 ) = ˆπ (X j + 1) ( ) odds2 b 1 = ln = ln(odds 2 ) ln(odds 1 ) odds 1 odds ratio( 勝算比 ): ÔR ÔR = odds 2 odds 1 = exp(b 1 ) hsuhl (NUK) LR Chap / 29

23 Deviance Goodness of Fit Test Deviance goodness of fit test: completely analogous to the F test for lack of fit for simple and multiple linear regression models Assume: c unique combinations of the predictors: X 1,..., X c n j : #{repeat binary observations at X j } Y ij : the ith binary response ar X j Deviance goodness of fit test: Likelihood ratio test of the reduced model: Reduced model: E{Y i j} = [1 + exp( X jβ)] 1 π j : are parameters Full model: E{Y i j} = π j, j = 1,..., c hsuhl (NUK) LR Chap / 29

24 Deviance Goodness of Fit Test (cont.) p j = Y j n j, j = 1,..., c ˆπ: the reduced model estimate of π The likelihood ratio test statistic: Deviance: G 2 = 2 [ln L(R) ln L(F )] [ ( ) ( )] c ˆπj 1 ˆπj = 2 Y j ln + (n j Y j ) ln p j 1 p j j=1 = DEV (X 0, X 1,..., X p 1 ) hsuhl (NUK) LR Chap / 29

25 Deviance Goodness of Fit Test (cont.) Test the alternative: H 0 : E{Y i j} = [1 + exp( X jβ)] 1 H a : E{Y i j} [1 + exp( X jβ)] 1 Decision rule: If DEV (X 0, X 1,..., X p 1 ) χ 2 (1 α; c p) conclude H 0 If DEV (X 0, X 1,..., X p 1 ) > χ 2 (1 α; c p) conclude H a hsuhl (NUK) LR Chap / 29

26 Logistic Regression Diagnostics various residuals ordinary residuals: e i e i = { 1 ˆπi if Y i = 1 ˆπ i if Y i = 0 not be normally distributed Plots of e i against Ŷ i or X j will be uninformative Pearson Residuals: r pi r pi = Y i ˆπ i ˆπ i (1 ˆπ i ) related to Pearson chi-square goodness of fit statistic hsuhl (NUK) LR Chap / 29

27 Logistic Regression Diagnostics (cont.) Studentized Pearson Residuals: r SPi r SPi = r p i, H = 1 h Ŵ 1/2 X(X Ŵ X) 1 X Ŵ 1/2 ii Ŵ : the n n diagonal matrix with elements ˆπ i (1 ˆπ i ) Deviance Residuals: dev i n DEV (X 0, X 1,..., X p 1 ) = 2 [Y i ln(ˆπ i ) + (1 Y i ) ln(1 ˆπ i )] i=1 dev i = sign(y i ˆπ i ) n 2 [Y i ln(ˆπ i ) + (1 Y i ) ln(1 ˆπ i )] i=1 n ( devi 2 = DEV (X 0, X 1,..., X p 1 )) i=1 hsuhl (NUK) LR Chap / 29

28 Logistic Regression Diagnostics (cont.) hsuhl (NUK) LR Chap / 29

29 Logistic Regression Diagnostics (cont.) hsuhl (NUK) LR Chap / 29

Binary Response: Logistic Regression. STAT 526 Professor Olga Vitek

Binary Response: Logistic Regression. STAT 526 Professor Olga Vitek Binary Response: Logistic Regression STAT 526 Professor Olga Vitek March 29, 2011 4 Model Specification and Interpretation 4-1 Probability Distribution of a Binary Outcome Y In many situations, the response

More information

Chapter 11 Building the Regression Model II:

Chapter 11 Building the Regression Model II: Chapter 11 Building the Regression Model II: Remedial Measures( 補救措施 ) 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 11 1 / 48 11.1 WLS remedial measures may

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

Data-analysis and Retrieval Ordinal Classification

Data-analysis and Retrieval Ordinal Classification Data-analysis and Retrieval Ordinal Classification Ad Feelders Universiteit Utrecht Data-analysis and Retrieval 1 / 30 Strongly disagree Ordinal Classification 1 2 3 4 5 0% (0) 10.5% (2) 21.1% (4) 42.1%

More information

Chapter 13 Introduction to Nonlinear Regression( 非線性迴歸 )

Chapter 13 Introduction to Nonlinear Regression( 非線性迴歸 ) Chapter 13 Introduction to Nonlinear Regression( 非線性迴歸 ) and Neural Networks( 類神經網路 ) 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples of nonlinear

More information

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20 Logistic regression 11 Nov 2010 Logistic regression (EPFL) Applied Statistics 11 Nov 2010 1 / 20 Modeling overview Want to capture important features of the relationship between a (set of) variable(s)

More information

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

Modeling Binary Outcomes: Logit and Probit Models

Modeling Binary Outcomes: Logit and Probit Models Modeling Binary Outcomes: Logit and Probit Models Eric Zivot December 5, 2009 Motivating Example: Women s labor force participation y i = 1 if married woman is in labor force = 0 otherwise x i k 1 = observed

More information

Chapter 10 Building the Regression Model II: Diagnostics

Chapter 10 Building the Regression Model II: Diagnostics Chapter 10 Building the Regression Model II: Diagnostics 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 41 10.1 Model Adequacy for a Predictor Variable-Added

More information

Lecture 13: More on Binary Data

Lecture 13: More on Binary Data Lecture 1: More on Binary Data Link functions for Binomial models Link η = g(π) π = g 1 (η) identity π η logarithmic log π e η logistic log ( π 1 π probit Φ 1 (π) Φ(η) log-log log( log π) exp( e η ) complementary

More information

Statistical Modelling with Stata: Binary Outcomes

Statistical Modelling with Stata: Binary Outcomes Statistical Modelling with Stata: Binary Outcomes Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 21/11/2017 Cross-tabulation Exposed Unexposed Total Cases a b a + b Controls

More information

MSH3 Generalized linear model

MSH3 Generalized linear model Contents MSH3 Generalized linear model 5 Logit Models for Binary Data 173 5.1 The Bernoulli and binomial distributions......... 173 5.1.1 Mean, variance and higher order moments.... 173 5.1.2 Normal limit....................

More information

9 Generalized Linear Models

9 Generalized Linear Models 9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

More information

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression 22s:52 Applied Linear Regression Ch. 4 (sec. and Ch. 5 (sec. & 4: Logistic Regression Logistic Regression When the response variable is a binary variable, such as 0 or live or die fail or succeed then

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46 A Generalized Linear Model for Binomial Response Data Copyright c 2017 Dan Nettleton (Iowa State University) Statistics 510 1 / 46 Now suppose that instead of a Bernoulli response, we have a binomial response

More information

Linear Regression With Special Variables

Linear Regression With Special Variables Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:

More information

Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression

Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression Tore Schweder October 28, 2011 Outline Examples of binary respons variables Probit and logit - examples

More information

LOGISTIC REGRESSION. Lalmohan Bhar Indian Agricultural Statistics Research Institute, New Delhi

LOGISTIC REGRESSION. Lalmohan Bhar Indian Agricultural Statistics Research Institute, New Delhi LOGISTIC REGRESSION Lalmohan Bhar Indian Agricultural Statistics Research Institute, New Delhi- lmbhar@gmail.com. Introduction Regression analysis is a method for investigating functional relationships

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

More Statistics tutorial at Logistic Regression and the new:

More Statistics tutorial at  Logistic Regression and the new: Logistic Regression and the new: Residual Logistic Regression 1 Outline 1. Logistic Regression 2. Confounding Variables 3. Controlling for Confounding Variables 4. Residual Linear Regression 5. Residual

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Chapter 14 Logistic and Poisson Regressions

Chapter 14 Logistic and Poisson Regressions STAT 525 SPRING 2018 Chapter 14 Logistic and Poisson Regressions Professor Min Zhang Logistic Regression Background In many situations, the response variable has only two possible outcomes Disease (Y =

More information

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables

More information

Generalized Linear Models and Exponential Families

Generalized Linear Models and Exponential Families Generalized Linear Models and Exponential Families David M. Blei COS424 Princeton University April 12, 2012 Generalized Linear Models x n y n β Linear regression and logistic regression are both linear

More information

Categorical data analysis Chapter 5

Categorical data analysis Chapter 5 Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,

More information

STAC51: Categorical data Analysis

STAC51: Categorical data Analysis STAC51: Categorical data Analysis Mahinda Samarakoon April 6, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 25 Table of contents 1 Building and applying logistic regression models (Chap

More information

12 Modelling Binomial Response Data

12 Modelling Binomial Response Data c 2005, Anthony C. Brooms Statistical Modelling and Data Analysis 12 Modelling Binomial Response Data 12.1 Examples of Binary Response Data Binary response data arise when an observation on an individual

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

Lecture #11: Classification & Logistic Regression

Lecture #11: Classification & Logistic Regression Lecture #11: Classification & Logistic Regression CS 109A, STAT 121A, AC 209A: Data Science Weiwei Pan, Pavlos Protopapas, Kevin Rader Fall 2016 Harvard University 1 Announcements Midterm: will be graded

More information

Homework 1 Solutions

Homework 1 Solutions 36-720 Homework 1 Solutions Problem 3.4 (a) X 2 79.43 and G 2 90.33. We should compare each to a χ 2 distribution with (2 1)(3 1) 2 degrees of freedom. For each, the p-value is so small that S-plus reports

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

The Logit Model: Estimation, Testing and Interpretation

The Logit Model: Estimation, Testing and Interpretation The Logit Model: Estimation, Testing and Interpretation Herman J. Bierens October 25, 2008 1 Introduction to maximum likelihood estimation 1.1 The likelihood function Consider a random sample Y 1,...,

More information

Introduction to the Logistic Regression Model

Introduction to the Logistic Regression Model CHAPTER 1 Introduction to the Logistic Regression Model 1.1 INTRODUCTION Regression methods have become an integral component of any data analysis concerned with describing the relationship between a response

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection

Model Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist

More information

Linear regression is designed for a quantitative response variable; in the model equation

Linear regression is designed for a quantitative response variable; in the model equation Logistic Regression ST 370 Linear regression is designed for a quantitative response variable; in the model equation Y = β 0 + β 1 x + ɛ, the random noise term ɛ is usually assumed to be at least approximately

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: ) NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3

More information

Testing and Model Selection

Testing and Model Selection Testing and Model Selection This is another digression on general statistics: see PE App C.8.4. The EViews output for least squares, probit and logit includes some statistics relevant to testing hypotheses

More information

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013 Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not

More information

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014 LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers

More information

Logistic Regressions. Stat 430

Logistic Regressions. Stat 430 Logistic Regressions Stat 430 Final Project Final Project is, again, team based You will decide on a project - only constraint is: you are supposed to use techniques for a solution that are related to

More information

Review of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models

Review of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models Chapter 6 Multicategory Logit Models Response Y has J > 2 categories. Extensions of logistic regression for nominal and ordinal Y assume a multinomial distribution for Y. 6.1 Logit Models for Nominal Responses

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Generalized Linear Models. Kurt Hornik

Generalized Linear Models. Kurt Hornik Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Generalized Linear Models 1

Generalized Linear Models 1 Generalized Linear Models 1 STA 2101/442: Fall 2012 1 See last slide for copyright information. 1 / 24 Suggested Reading: Davison s Statistical models Exponential families of distributions Sec. 5.2 Chapter

More information

Chapter 5: Logistic Regression-I

Chapter 5: Logistic Regression-I : Logistic Regression-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

Models with Limited Dependent Variables

Models with Limited Dependent Variables LECTURE 7 Models with Limited Dependent Variables In this lecture, we present two models with limited dependent variables. The first is a logistic regression model in which the dependent variable has a

More information

,..., θ(2),..., θ(n)

,..., θ(2),..., θ(n) Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.

More information

Logistisk regression T.K.

Logistisk regression T.K. Föreläsning 13: Logistisk regression T.K. 05.12.2017 Your Learning Outcomes Odds, Odds Ratio, Logit function, Logistic function Logistic regression definition likelihood function: maximum likelihood estimate

More information

MLE for a logit model

MLE for a logit model IGIDR, Bombay September 4, 2008 Goals The link between workforce participation and education Analysing a two-variable data-set: bivariate distributions Conditional probability Expected mean from conditional

More information

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

Binary Regression. GH Chapter 5, ISL Chapter 4. January 31, 2017

Binary Regression. GH Chapter 5, ISL Chapter 4. January 31, 2017 Binary Regression GH Chapter 5, ISL Chapter 4 January 31, 2017 Seedling Survival Tropical rain forests have up to 300 species of trees per hectare, which leads to difficulties when studying processes which

More information

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure).

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure). 1 Neuendorf Logistic Regression The Model: Y Assumptions: 1. Metric (interval/ratio) data for 2+ IVs, and dichotomous (binomial; 2-value), categorical/nominal data for a single DV... bear in mind that

More information

An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they represent.

An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they represent. Statistical Methods in Business Lecture 6. Binomial Logistic Regression An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part III Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

Logistic Regression Review Fall 2012 Recitation. September 25, 2012 TA: Selen Uguroglu

Logistic Regression Review Fall 2012 Recitation. September 25, 2012 TA: Selen Uguroglu Logistic Regression Review 10-601 Fall 2012 Recitation September 25, 2012 TA: Selen Uguroglu!1 Outline Decision Theory Logistic regression Goal Loss function Inference Gradient Descent!2 Training Data

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Advanced Methods for Data Analysis (36-402/36-608 Spring 2014 1 Generalized linear models 1.1 Introduction: two regressions So far we ve seen two canonical settings for regression.

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

More information

ABHELSINKI UNIVERSITY OF TECHNOLOGY

ABHELSINKI UNIVERSITY OF TECHNOLOGY Cross-Validation, Information Criteria, Expected Utilities and the Effective Number of Parameters Aki Vehtari and Jouko Lampinen Laboratory of Computational Engineering Introduction Expected utility -

More information

11. Generalized Linear Models: An Introduction

11. Generalized Linear Models: An Introduction Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and

More information

Econometrics II. Seppo Pynnönen. Spring Department of Mathematics and Statistics, University of Vaasa, Finland

Econometrics II. Seppo Pynnönen. Spring Department of Mathematics and Statistics, University of Vaasa, Finland Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2018 Part III Limited Dependent Variable Models As of Jan 30, 2017 1 Background 2 Binary Dependent Variable The Linear Probability

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Generalized Linear Models I

Generalized Linear Models I Statistics 203: Introduction to Regression and Analysis of Variance Generalized Linear Models I Jonathan Taylor - p. 1/16 Today s class Poisson regression. Residuals for diagnostics. Exponential families.

More information

Chapter 9 Other Topics on Factorial and Fractional Factorial Designs

Chapter 9 Other Topics on Factorial and Fractional Factorial Designs Chapter 9 Other Topics on Factorial and Fractional Factorial Designs 許湘伶 Design and Analysis of Experiments (Douglas C. Montgomery) hsuhl (NUK) DAE Chap. 9 1 / 26 The 3 k Factorial Design 3 k factorial

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Logistic Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Logistic Regression 1 / 38 Logistic Regression 1 Introduction

More information

INTRODUCING LINEAR REGRESSION MODELS Response or Dependent variable y

INTRODUCING LINEAR REGRESSION MODELS Response or Dependent variable y INTRODUCING LINEAR REGRESSION MODELS Response or Dependent variable y Predictor or Independent variable x Model with error: for i = 1,..., n, y i = α + βx i + ε i ε i : independent errors (sampling, measurement,

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

Log-linear Models for Contingency Tables

Log-linear Models for Contingency Tables Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A

More information

Generalized Linear Models Introduction

Generalized Linear Models Introduction Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,

More information

1/15. Over or under dispersion Problem

1/15. Over or under dispersion Problem 1/15 Over or under dispersion Problem 2/15 Example 1: dogs and owners data set In the dogs and owners example, we had some concerns about the dependence among the measurements from each individual. Let

More information

High-dimensional regression modeling

High-dimensional regression modeling High-dimensional regression modeling David Causeur Department of Statistics and Computer Science Agrocampus Ouest IRMAR CNRS UMR 6625 http://www.agrocampus-ouest.fr/math/causeur/ Course objectives Making

More information

2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 When and why do we use logistic regression? Binary Multinomial Theory behind logistic regression Assessing the model Assessing predictors

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Communications of the Korean Statistical Society 2009, Vol 16, No 4, 697 705 Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Kwang Mo Jeong a, Hyun Yung Lee 1, a a Department

More information

Poisson regression 1/15

Poisson regression 1/15 Poisson regression 1/15 2/15 Counts data Examples of counts data: Number of hospitalizations over a period of time Number of passengers in a bus station Blood cells number in a blood sample Number of typos

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

TMA4255 Applied Statistics V2016 (5)

TMA4255 Applied Statistics V2016 (5) TMA4255 Applied Statistics V2016 (5) Part 2: Regression Simple linear regression [11.1-11.4] Sum of squares [11.5] Anna Marie Holand To be lectured: January 26, 2016 wiki.math.ntnu.no/tma4255/2016v/start

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science

UNIVERSITY OF TORONTO Faculty of Arts and Science UNIVERSITY OF TORONTO Faculty of Arts and Science December 2013 Final Examination STA442H1F/2101HF Methods of Applied Statistics Jerry Brunner Duration - 3 hours Aids: Calculator Model(s): Any calculator

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Gibbs Sampling in Endogenous Variables Models

Gibbs Sampling in Endogenous Variables Models Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take

More information

Classification. Chapter Introduction. 6.2 The Bayes classifier

Classification. Chapter Introduction. 6.2 The Bayes classifier Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

More information

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

The logistic regression model is thus a glm-model with canonical link function so that the log-odds equals the linear predictor, that is

The logistic regression model is thus a glm-model with canonical link function so that the log-odds equals the linear predictor, that is Example The logistic regression model is thus a glm-model with canonical link function so that the log-odds equals the linear predictor, that is log p 1 p = β 0 + β 1 f 1 (y 1 ) +... + β d f d (y d ).

More information