Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016

1 of 37 Today's Lecture: logistic regression / GLMs, covering the model framework, interpretation, and estimation.

2 of 37 Linear regression. The course started with the model y_i = β_0 + β_1 x_i + ε_i, where ε_i ~ (0, σ²_ε). In particular, y_i has been continuous throughout the course.

3 of 37 Binary responses. Binary outcomes are common in practice and usually indicate some event: yes vs. no, transplant vs. no transplant, death vs. no death.

4 of 37 Binary responses. How should we deal with binary (0/1) outcomes? Regression focuses on E(y | x); for binary outcomes, we want E(y | x) = p(y = 1 | x). Does p_i = p(y_i = 1 | x_i) = β_0 + β_1 x_i work?
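
To see the problem concretely, here is a minimal R sketch on hypothetical simulated data (not the lecture's dataset, though the coefficients loosely mirror the example fitted later): a linear fit to a 0/1 outcome happily produces fitted "probabilities" outside [0, 1].

set.seed(1)
x <- runif(100, -10, 5)                          # hypothetical predictor
p <- exp(1 + 0.8 * x) / (1 + exp(1 + 0.8 * x))   # true success probabilities
y <- rbinom(100, size = 1, prob = p)             # binary outcome
lin_fit <- lm(y ~ x)                             # linear regression on the 0/1 outcome
range(fitted(lin_fit))                           # fitted values stray outside [0, 1]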

5 of 37 Linear regression for binary outcome. [Figure: binary outcome y (0 to 1 on the vertical axis) plotted against x (roughly −10 to 5), illustrating a linear regression fit to a binary outcome.]

6 of 37 What we need for binary outcomes. Fitted probabilities should be between 0 and 1. Use an invertible function g : (0, 1) → (−∞, ∞) to link probabilities to the real line, and build a model for g(p_i) = β_0 + β_1 x_i.

7 of 37 Link functions. There are many possible link functions: logit, probit, complementary log-log. By far the most common is the logit link: g(p_i) = logit(p_i) = log(p_i / (1 − p_i)). The inverse link function is also useful: g⁻¹(z) = exp(z) / (1 + exp(z)).
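
The logit and its inverse are easy to code directly; a minimal R sketch (the helper names are my own, and base R's qlogis()/plogis() do the same job):

logit     <- function(p) log(p / (1 - p))        # maps (0, 1) to the real line
inv_logit <- function(z) exp(z) / (1 + exp(z))   # maps the real line back to (0, 1)
inv_logit(logit(0.25))                           # returns 0.25: the two are inverses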

8 of 37 Logistic regression. The model is now E(y_i | x_i) = p_i. Using the logit link, we have g(p_i) = log(p_i / (1 − p_i)) = β_0 + β_1 x_i, so that p_i = g⁻¹(β_0 + β_1 x_i) = exp(β_0 + β_1 x_i) / (1 + exp(β_0 + β_1 x_i)).

9 of 37 Parameter interpretation. Suppose we can estimate β_0, β_1; what do they mean? For a binary predictor, β_1 is the difference in log odds between the x = 1 and x = 0 groups, so exp(β_1) is the corresponding odds ratio.

10 of 37 Parameter interpretation. For a continuous predictor, β_1 is the change in log odds associated with a one-unit increase in x, so exp(β_1) is the odds ratio for a one-unit increase in x.

11 of 37 Parameter estimation. For linear regression we used least squares, and found that this corresponded to maximum likelihood. We'll try using maximum likelihood for logistic regression; first we need a likelihood.

12 of 37 ML for logistic regression. Assume that [y_i | x_i] ~ Bern(p_i), with density p(y_i) = p_i^{y_i} (1 − p_i)^{1 − y_i}. As before, use logit(p_i) = β_0 + β_1 x_i. The likelihood is L(β_0, β_1; y) = ∏_{i=1}^{n} p_i^{y_i} (1 − p_i)^{1 − y_i}.
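
To make the likelihood concrete, here is a hedged R sketch that codes the Bernoulli log likelihood under the logit link and maximizes it numerically with optim(); it assumes the y and x vectors from the simulation sketch above, and is an illustration rather than how glm() is implemented.

neg_loglik <- function(beta, y, x) {
  eta <- beta[1] + beta[2] * x               # linear predictor
  p   <- exp(eta) / (1 + exp(eta))           # inverse logit
  -sum(y * log(p) + (1 - y) * log(1 - p))    # negative log likelihood
}
mle <- optim(par = c(0, 0), fn = neg_loglik, y = y, x = x)   # optim() minimizes
mle$par                                      # close to coef(glm(y ~ x, family = binomial))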

13 of 37 ML for logistic regression. The log likelihood is easier to work with, but it is typically not possible to find a closed-form solution. Iterative algorithms are used instead (Newton-Raphson, iteratively reweighted least squares), and these are implemented for a variety of link functions in R.
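
For intuition about the algorithm, here is a minimal sketch of iteratively reweighted least squares for the logistic model (my own illustration; it assumes a design matrix X with an intercept column, and glm() does all of this internally):

irls_logistic <- function(X, y, n_iter = 25) {
  beta <- rep(0, ncol(X))
  for (i in seq_len(n_iter)) {
    eta  <- X %*% beta                                  # current linear predictor
    p    <- as.vector(exp(eta) / (1 + exp(eta)))        # current fitted probabilities
    W    <- diag(p * (1 - p))                           # working weights
    z    <- eta + (y - p) / (p * (1 - p))               # working response
    beta <- solve(t(X) %*% W %*% X, t(X) %*% W %*% z)   # weighted least squares step
  }
  as.vector(beta)
}
# irls_logistic(cbind(1, x), y) should match coef(glm(y ~ x, family = binomial))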

14 of 37 Example. [Figure: two panels from the fitted logistic regression, the fitted linear predictor (fitted_logit) against x and the fitted probabilities for y against x.]

15 of 37 Code

> model = glm(y ~ x, family = binomial(link = "logit"), data = data)
> summary(model)

Call:
glm(formula = y ~ x, family = binomial(link = "logit"), data = data)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9360  -0.4631   0.1561   0.5564   1.8131

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.1072     0.3357   3.298 0.000974 ***
x             0.8097     0.1664   4.865 1.15e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 129.49  on 99  degrees of freedom
Residual deviance:  73.24  on 98  degrees of freedom
AIC: 77.24

Number of Fisher Scoring iterations: 6

16 of 37 Multiple predictors. Essentially everything that worked for linear models works for logistic models: multiple predictors of various types, interactions, polynomials, piecewise terms, splines (and, further afield, penalization, random effects, Bayesian models); a sketch of such a model formula follows below.
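
A hedged sketch of how such terms enter a logistic model formula in R (the data frame dat and variables x1 through x4 are hypothetical):

library(splines)
fit <- glm(y ~ x1 * x2 + poly(x3, 2) + ns(x4, df = 3),   # interaction, polynomial, spline
           family = binomial(link = "logit"), data = dat)
summary(fit)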

17 of 37 Testing in logistic regression. In linear models, many of our inferential procedures (ANOVA, F tests, ...) were based on RSS. For logistic regression (and GLMs) we'll use the asymptotic Normality of MLEs, √n(β̂ − β) → N(0, V) with V = (XᵀWX)⁻¹ for a weight matrix W, to construct Wald tests. Likelihood ratio tests can be used to compare nested models.
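
For example, a likelihood ratio test comparing nested logistic models uses the difference in deviances; a hedged R sketch (the formulas and data frame are hypothetical):

fit_small <- glm(y ~ x1,      family = binomial, data = dat)
fit_large <- glm(y ~ x1 + x2, family = binomial, data = dat)
anova(fit_small, fit_large, test = "Chisq")   # likelihood ratio (deviance) test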

18 of 37 Wald tests. For individual coefficients we can use the test statistic T = (β̂_j − β_j) / ŝe(β̂_j). This is compared to a Normal distribution, trusting that the asymptotics have kicked in. Recall that coefficients are on the logit scale.

19 of 37 Confidence intervals. A confidence interval with coverage (1 − α) is given by β̂_j ± t_{1−α/2, n−p−1} ŝe(β̂_j). To create a confidence interval for exp(β̂_j), the estimated odds ratio, exponentiate the endpoints: (exp(β̂_j − 2 ŝe(β̂_j)), exp(β̂_j + 2 ŝe(β̂_j))).
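
In R, Wald-type intervals on the odds ratio scale come directly from a fitted glm object; a brief sketch using the model fitted on slide 15:

exp(coef(model))              # estimated odds ratios
exp(confint.default(model))   # exponentiated Wald (normal-approximation) intervals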

20 of 37 Wald tests for multiple coefficients. Define H_0 : cᵀβ = cᵀβ_0 or H_0 : cᵀβ = 0. We can use the test statistic T = (cᵀβ̂ − cᵀβ_0) / ŝe(cᵀβ̂) = (cᵀβ̂ − cᵀβ_0) / √(cᵀ Var(β̂) c). This is useful for some tests and for looking at fitted values.
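
A minimal R sketch of such a test for a single linear combination cᵀβ, using the model from slide 15 and a hypothetical contrast vector:

c_vec <- c(0, 1)                                    # hypothetical contrast: picks out the x coefficient
est   <- sum(c_vec * coef(model))                   # c'beta-hat
se    <- sqrt(t(c_vec) %*% vcov(model) %*% c_vec)   # se(c'beta-hat)
z     <- est / se                                   # Wald statistic for H0: c'beta = 0
2 * pnorm(-abs(z))                                  # two-sided p-value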

21 of 37 Model building. We can define a model-building strategy (at least for nested models) using these tests. Other tools, like AIC and BIC, can compare non-nested models.

22 of 37 ROC curves. Forget logistic regression for a minute. Suppose you have some test for classifying subjects as diseased or non-diseased. You can describe that test using sensitivity, P(+ | D), and specificity, P(− | not D). These values depend on what threshold you use for your test.

23 of 37 Threshold effect on sensitivity and specificity. [Figure: distributions of PSA (roughly −3 to 4) for the non-diseased (Non) and diseased (Dis) groups, with a classification threshold.]

24 of 37 Threshold effect on sensitivity and specificity. [Figure: the same PSA distributions by disease status alongside the resulting ROC curve, sensitivity against 1 − specificity.]

25 of 37 Better tests give better ROCs. [Figure: PSA distributions for the Non and Dis groups for a better-separating test, and the correspondingly higher ROC curve, sensitivity against 1 − specificity.]

26 of 37 Summarizing ROCs. The area under the curve (AUC) is a useful summary of an ROC. The AUC shouldn't be less than 0.5 and can't be more than 1, and a bigger AUC indicates better classification. It is a useful alternative to AIC, BIC, etc.

27 of 37 Connection with logistic regression. Your test might be μ̂_i = p̂(y_i = 1 | x_i), and you can model this probability using logistic regression. Cross-validated ROCs are a way to compare the predictive performance of different models: based on the fitted model (from the training set), you construct fitted probabilities μ̂_i = exp(x_iᵀβ̂) / (1 + exp(x_iᵀβ̂)) for subjects in the validation set. Validation subjects test positive or negative based on their fitted values; compare these to the observed values.
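
A hedged base-R sketch of this idea (the train and valid data frames, with columns y and x, are hypothetical; packages such as pROC provide more complete tools):

fit   <- glm(y ~ x, family = binomial, data = train)        # fit on the training set
p_hat <- predict(fit, newdata = valid, type = "response")   # fitted probabilities for validation subjects

thresholds <- seq(0, 1, by = 0.01)
sens <- sapply(thresholds, function(t) mean(p_hat[valid$y == 1] >= t))   # sensitivity at each threshold
fpr  <- sapply(thresholds, function(t) mean(p_hat[valid$y == 0] >= t))   # 1 - specificity at each threshold

# Trapezoidal approximation to the area under the ROC curve
auc <- sum(diff(rev(fpr)) * (head(rev(sens), -1) + tail(rev(sens), -1)) / 2)
auc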

28 of 37 Generalizing this approach. Suppose that instead of binary data we have y_i ~ EF(µ_i, θ), where E(y_i | x_i) = µ_i and Var(y_i | x_i) = a(φ) V(µ_i), with known variance function V(·) and dispersion parameter φ.

29 of 37 Generalized Linear Model. The model components are the probability distribution, the link function, and the linear predictor.

30 of 37 Linear regression as a GLM. Linear regression fits into this framework as a GLM with a Normal distribution, the identity link, and linear predictor β_0 + β_1 x_i.

31 of 37 Comparing linear and logistic. Comparing linear, logistic, and Poisson regression models:

                  Linear          Logistic      Poisson
  Outcome         Continuous      Binary        Count
  Distribution    Normal          Binomial      Poisson
  Parameter       E(Y) = µ        E(Y) = p      E(Y) = λ
  Range of mean   −∞ < µ < ∞      0 < p < 1     0 < λ < ∞
  Variance        σ²              p(1 − p)      λ
  Natural link    identity        logit         log
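
In R, all three models are fit with glm() by changing the family; a brief sketch (formula and data frame are hypothetical):

glm(y ~ x, family = gaussian, data = dat)   # linear regression (identity link)
glm(y ~ x, family = binomial, data = dat)   # logistic regression (logit link)
glm(y ~ x, family = poisson,  data = dat)   # Poisson regression (log link)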

32 of 37 Other link functions?
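
For binary outcomes, R's binomial family supports the probit and complementary log-log links mentioned on slide 7; a brief sketch (again with hypothetical y, x, and dat):

glm(y ~ x, family = binomial(link = "probit"),  data = dat)   # probit link
glm(y ~ x, family = binomial(link = "cloglog"), data = dat)   # complementary log-log link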

33 of 37 Other GLMs. The framework holds for any member of the exponential family: probability distribution, link function, and linear predictor.

34 of 37 Exponential family distribution. Any distribution whose density can be expressed as
f(y; θ, φ) = exp{ (yθ − b(θ)) / a(φ) + c(y, φ) },
where b′(θ) = µ and b″(θ) = V. It can take some effort to convert the usual density to this form. The family includes the Normal, Bernoulli, Poisson, Gamma, Multinomial, ...

35 of 37 Exponential family examples. Normal:
f(y; µ, σ²) = (1/√(2πσ²)) exp{ −(y − µ)² / (2σ²) }
            = exp{ (yµ − µ²/2)/σ² − ½ (y²/σ² + log(2πσ²)) }

36 of 37 Exponential family examples. Bernoulli:
f(y; p) = exp{ y log(p) + (1 − y) log(1 − p) }
        = exp{ y log(p / (1 − p)) + log(1 − p) }
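
Following the same pattern, a worked Poisson example (my own addition, not from the original slides):
f(y; λ) = λ^y e^{−λ} / y! = exp{ y log(λ) − λ − log(y!) },
so θ = log(λ), b(θ) = e^θ, a(φ) = 1, and c(y, φ) = −log(y!),
with b′(θ) = e^θ = λ = µ and b″(θ) = e^θ = λ = V.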

37 of 37 Today's big ideas. Logistic regression and GLMs. Suggested reading: ISLR Ch. 4.2 and 4.3.