Generalized Linear Models
MIT 18.655, Dr. Kempthorne, Spring 2016

Outline
1 Generalized Linear Models

Generalized Linear Model

Data: $(y_i, x_i)$, $i = 1, \dots, n$, where $y_i$ is the response variable and $x_i = (x_{i,1}, \dots, x_{i,p})^T$ collects $p$ explanatory variables.

Linear predictor: for $\beta = (\beta_1, \dots, \beta_p)^T \in \mathbb{R}^p$: $x_i^T \beta = \sum_{j=1}^p x_{i,j} \beta_j$.

Probability model: the $\{y_i\}$ are independent canonical exponential family random variables with density $p(y_i \mid \eta_i) = e^{\eta_i y_i - A(\eta_i)} h(y_i)$.

Mean function: $\mu_i = E[Y_i] = A'(\eta_i)$.

Link function $g(\cdot)$: $g(\mu_i) = x_i^T \beta$; with estimate $\hat\beta$: $x_i^T \hat\beta = g(\hat\mu_i)$.

Canonical link function: $g(\mu_i) = \eta_i = [A']^{-1}(\mu_i) = x_i^T \beta$.

Matrix Notation

$$ y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\ \vdots & & & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix} $$

$$ E[y] = \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{pmatrix} = \begin{pmatrix} A'(\eta_1) \\ A'(\eta_2) \\ \vdots \\ A'(\eta_n) \end{pmatrix} = A'(\eta) $$

Examples:
$y_i \sim \text{Bernoulli}(\theta_i)$: $\eta_i = \log(\theta_i / (1 - \theta_i))$
$y_i \sim \text{Poisson}(\lambda_i)$: $\eta_i = \log(\lambda_i)$
$y_i \sim \text{Gaussian}(\mu_i, 1)$: $\eta_i = \mu_i$
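As a quick sanity check on the three examples above, the canonical link and the mean function should invert each other: $A'(g(\mu)) = \mu$. A minimal sketch in Python/NumPy (the helper names are ours, not from the lecture):

```python
import numpy as np

# Canonical links (g) and mean functions (A') for the three examples above.
# Each pair should satisfy A'(g(mu)) = mu.

def bernoulli_link(theta):
    return np.log(theta / (1 - theta))      # eta = log(theta / (1 - theta))

def bernoulli_mean(eta):
    return 1 / (1 + np.exp(-eta))           # A(eta) = log(1 + e^eta) => A'(eta)

def poisson_link(lam):
    return np.log(lam)                      # eta = log(lambda)

def poisson_mean(eta):
    return np.exp(eta)                      # A(eta) = e^eta => A'(eta) = e^eta

def gaussian_link(mu):
    return mu                               # identity link (unit variance)

def gaussian_mean(eta):
    return eta

# The mean function inverts the canonical link:
for link, mean, m in [(bernoulli_link, bernoulli_mean, 0.3),
                      (poisson_link, poisson_mean, 2.5),
                      (gaussian_link, gaussian_mean, -1.7)]:
    assert np.isclose(mean(link(m)), m)
```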


Log-Likelihood Function for a Generalized Linear Model

$$ \ell(\beta) = \sum_{i=1}^n \big( \eta_i y_i - A(\eta_i) + \log h(y_i) \big) = \eta^T y - \mathbf{1}^T A(\eta) + \text{const} = g(\mu)^T y - \mathbf{1}^T A(\eta) + \text{const} = [X\beta]^T y - \mathbf{1}^T A(\eta) + \text{const} \quad \text{(for the canonical link)} $$

Note: $T(y) = X^T y$ is sufficient for $\beta$ when $g(\cdot)$ is canonical.

Solve $\ell'(\beta) = 0$ for $\{\beta_m,\ m = 1, 2, \dots\}$ iteratively:

$$ 0 = \ell'(\beta_{m+1}) \approx \ell'(\beta_m) + \ell''(\beta_m)(\beta_{m+1} - \beta_m) \implies \beta_{m+1} = \beta_m + [-\ell''(\beta_m)]^{-1} \ell'(\beta_m) $$

Fisher scoring algorithm $\equiv$ Newton-Raphson (the two coincide for the canonical link, since the observed and expected information agree).

$\beta_m \stackrel{Pr}{\to} \hat\beta$ (the MLE).

For Canonical Link

$$ \ell'(\beta) = \frac{\partial}{\partial \beta} \sum_{i=1}^n \big( \eta_i y_i - A(\eta_i) + \log h(y_i) \big) = \sum_{i=1}^n \left( \frac{\partial \eta_i}{\partial \beta} \right) \frac{\partial}{\partial \eta_i} \big( \eta_i y_i - A(\eta_i) \big) = \sum_{i=1}^n x_i \big( y_i - A'(\eta_i) \big) = \sum_{i=1}^n x_i (y_i - \mu_i) = X^T (y - \mu) $$

$$ \ell''(\beta) = \frac{\partial}{\partial \beta^T} [\ell'(\beta)] = \frac{\partial}{\partial \beta^T} \sum_{i=1}^n x_i \big( y_i - A'(\eta_i) \big) = - \sum_{i=1}^n A''(\eta_i)\, x_i \frac{\partial \eta_i}{\partial \beta^T} = - \sum_{i=1}^n A''(\eta_i)\, x_i x_i^T = - X^T W X $$
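The closed form $\ell'(\beta) = X^T(y - \mu)$ is easy to verify numerically against a central finite difference of the log-likelihood. A sketch for the Poisson case with the canonical log link (simulated data; setup and names are ours, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
beta = np.array([0.5, -0.3, 0.2])
y = rng.poisson(np.exp(X @ beta))        # Poisson GLM, canonical log link

def loglik(b):
    # l(b) = sum_i (eta_i y_i - A(eta_i)) with A(eta) = e^eta;
    # log h(y_i) is constant in b and dropped
    eta = X @ b
    return np.sum(eta * y - np.exp(eta))

score = X.T @ (y - np.exp(X @ beta))     # closed form X^T (y - mu)

# Central finite-difference gradient of the log-likelihood
eps = 1e-6
numeric = np.array([
    (loglik(beta + eps * np.eye(p)[j]) - loglik(beta - eps * np.eye(p)[j])) / (2 * eps)
    for j in range(p)
])
assert np.allclose(score, numeric, atol=1e-3)
```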

For Canonical Link

$$ \ell'(\beta) = \frac{\partial}{\partial \beta} \sum_{i=1}^n \big( \eta_i y_i - A(\eta_i) + \log h(y_i) \big) = X^T (y - \mu) $$

$$ \ell''(\beta) = \frac{\partial}{\partial \beta^T} [\ell'(\beta)] = - \sum_{i=1}^n A''(\eta_i)\, x_i x_i^T = - X^T W X $$

where $W = \text{Cov}(y)$ is diagonal with $W_{i,i} = A''(\eta_i) = \text{Var}[y_i]$.

Iteratively Re-weighted Least Squares Interpretation

Given $\beta_m$ and $\hat\mu(\beta_m) = A'(X\beta_m)$:

$$ \beta_{m+1} = \beta_m + [-\ell''(\beta_m)]^{-1} \ell'(\beta_m) = \beta_m + [X^T W X]^{-1} X^T (y - \hat\mu(\beta_m)) $$

$\Delta = \beta_{m+1} - \beta_m$ is solved as the WLS regression of $y^* = y - \hat\mu(\beta_m)$ on $X^* = W X$ using $\Sigma^* = \text{Cov}(y^*) = W$. The WLS estimate of $\Delta$ is given by:

$$ \hat\Delta = [X^{*T} \Sigma^{*-1} X^*]^{-1} X^{*T} \Sigma^{*-1} y^* = [X^T W W^{-1} W X]^{-1} X^T W W^{-1} (y - \hat\mu(\beta_m)) = [X^T W X]^{-1} X^T (y - \hat\mu(\beta_m)) $$
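The update above can be run directly. A minimal IRLS sketch for a Poisson GLM with the canonical log link, on simulated data (setup and names are ours); at convergence the score $X^T(y - \hat\mu)$ vanishes:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([0.2, 0.5, -0.4])
y = rng.poisson(np.exp(X @ beta_true))

beta = np.zeros(p)
for _ in range(25):
    mu = np.exp(X @ beta)                 # mu_i = A'(eta_i) = e^{eta_i}
    W = mu                                # W_ii = A''(eta_i) = Var[y_i]
    step = np.linalg.solve(X.T @ (W[:, None] * X),   # [X^T W X]^{-1} ...
                           X.T @ (y - mu))           # ... X^T (y - mu)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break
```

Note that only the diagonal weights `W` change between iterations, which is what makes each step an ordinary weighted least squares solve.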


Binomial Data

Data: $Y_i \sim \text{Binomial}(m_i, \pi_i)$, $i = 1, \dots, k$, with covariates $\{x_i,\ i = 1, \dots, k\}$.

Log-likelihood function:

$$ \ell(\pi_1, \dots, \pi_k) = \sum_{i=1}^k \left[ Y_i \log\left( \frac{\pi_i}{1 - \pi_i} \right) + m_i \log(1 - \pi_i) \right] $$

Logistic regression parameter: $\eta_i = \log[\pi_i / (1 - \pi_i)] = x_i^T \beta$.

$W = \text{Cov}(Y) = \text{diag}(m_i \pi_i (1 - \pi_i))$, a $k \times k$ matrix.
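For grouped binomial data the same IRLS scheme applies with $\mu_i = m_i \pi_i$ and the weights $W = \text{diag}(m_i \pi_i (1 - \pi_i))$ above. A sketch on simulated grouped data (setup and names are ours, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
k, p = 40, 2
X = np.column_stack([np.ones(k), rng.normal(size=k)])
m = rng.integers(5, 30, size=k)           # group sizes m_i
beta_true = np.array([-0.3, 0.8])
Y = rng.binomial(m, 1 / (1 + np.exp(-X @ beta_true)))

beta = np.zeros(p)
for _ in range(50):
    pi = 1 / (1 + np.exp(-X @ beta))      # inverse logit of eta_i = x_i^T beta
    W = m * pi * (1 - pi)                 # diagonal of Cov(Y)
    step = np.linalg.solve(X.T @ (W[:, None] * X),   # [X^T W X]^{-1} ...
                           X.T @ (Y - m * pi))       # ... X^T (Y - mu)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break
```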


Likelihood Ratio Tests for Generalized Linear Models

Consider testing $H_0: \mu = \mu_0 = A'(X\beta_0)$ vs $H_{Alt}: \mu = \mu^*$ (a general $n$-vector), e.g., $\mu^* = A'(X^* \beta^*)$ with $X^* = I_n$ and $\beta^* = \eta$.

Suppose $y$ is in the interior of the convex support of $\{y : p(y \mid \eta) > 0\}$. Then $\hat\eta = \eta(y) = [A']^{-1}(y)$ is the MLE under $H_{Alt}$.

The likelihood ratio test statistic of $H_0$ vs $H_{Alt}$:

$$ 2 \log \lambda = 2\big[\ell(\eta(y)) - \ell(\eta(\mu_0))\big] = 2\Big\{ [\eta(y) - \eta(\mu_0)]^T y - \mathbf{1}^T \big[ A(\eta(y)) - A(\eta(\mu_0)) \big] \Big\} = \text{Deviance between } y \text{ and } \mu_0 $$

Deviance Formulas for Distributions

Gaussian: $\sum_i (y_i - \hat\mu_i)^2 / \sigma_0^2$
Poisson: $2 \sum_i \big[ y_i \log(y_i / \hat\mu_i) - (y_i - \hat\mu_i) \big]$
Binomial: $2 \sum_i \big[ y_i \log(y_i / \hat\mu_i) + (m_i - y_i) \log\big( (m_i - y_i) / (m_i - \hat\mu_i) \big) \big]$
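The Poisson and binomial formulas can be coded directly with the usual $0 \log 0 = 0$ convention; the deviance is zero at the saturated fit $\hat\mu = y$ and positive otherwise. A sketch (function names are ours, and the binomial version assumes $0 < \hat\mu_i < m_i$):

```python
import numpy as np

def poisson_deviance(y, mu):
    # 2 * sum_i [ y_i log(y_i/mu_i) - (y_i - mu_i) ], with 0*log(0) = 0
    term = np.where(y > 0, y * np.log(np.where(y > 0, y / mu, 1.0)), 0.0)
    return 2.0 * np.sum(term - (y - mu))

def binomial_deviance(y, m, mu):
    # 2 * sum_i [ y_i log(y_i/mu_i) + (m_i - y_i) log((m_i - y_i)/(m_i - mu_i)) ]
    # assumes 0 < mu_i < m_i; 0*log(0) = 0 convention for y_i in {0, m_i}
    def xlog(a, b):
        return np.where(a > 0, a * np.log(np.where(a > 0, a, 1.0) / b), 0.0)
    return 2.0 * np.sum(xlog(y, mu) + xlog(m - y, m - mu))
```

By the preceding slide, each formula is just the likelihood ratio statistic $2[\ell(\text{saturated}) - \ell(\text{fitted})]$ evaluated for that family.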


Data: $(y_i, x_i)$, $i = 1, \dots, n$, where $y_i$ is a $q$-dimensional response vector and $x_i = (x_{i,1}, \dots, x_{i,p})^T$ collects $p$ explanatory variables.

Probability model: the conditional distribution of each $y_i$ given $x_i$ is of the form $p(y \mid x; B, \phi) = f(y, \eta_1, \dots, \eta_M, \phi)$ for some known function $f(\cdot)$, where $B = [\beta_1\ \beta_2\ \cdots\ \beta_M]$ is a $p \times M$ matrix of unknown regression coefficients.

$M$ linear predictors: for $j = 1, \dots, M$, the $j$th linear predictor is $\eta_j = \eta_j(x) = \beta_j^T x = \sum_{k=1}^p B_{kj} x_k$, where $x = (x_1, \dots, x_p)^T$ with $x_1 = 1$ when there is an intercept.

Matrix Notation

$$ Y = \begin{pmatrix} y_1^T \\ y_2^T \\ \vdots \\ y_n^T \end{pmatrix}, \quad X = \begin{pmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_n^T \end{pmatrix} = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\ \vdots & & & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,p} \end{pmatrix}, \quad B = [\beta_1\ \beta_2\ \cdots\ \beta_M] $$

Link function: for each observation $i$, $E[y_i] = \mu_i$ ($q \times 1$ vectors) and $g(\mu_i) = \eta_i = B^T x_i$ ($M \times 1$ vectors), where:

$$ \eta_i = \eta(x_i) = \begin{pmatrix} \eta_1(x_i) \\ \vdots \\ \eta_M(x_i) \end{pmatrix} = \begin{pmatrix} \beta_1^T x_i \\ \vdots \\ \beta_M^T x_i \end{pmatrix} = B^T x_i $$

Multivariate Exponential Family Models

Density: $p(y_i \mid \eta_i) = e^{\eta_i^T y_i - A(\eta_i)} h(y_i)$
Mean function: $\mu_i = E[y_i] = \nabla A(\eta_i)$
Link function $g(\cdot)$: $g(\mu_i) = B^T x_i$
Canonical link function: $g(\mu_i) = \eta_i = [\nabla A]^{-1}(\mu_i) = B^T x_i$

Examples:
$y_i \sim \text{Multinomial}(m_i, \pi_{i,1}, \dots, \pi_{i,M+1})$ with $\pi_{i,j} \geq 0$, $j = 1, \dots, M+1$, and $\sum_{j=1}^{M+1} \pi_{i,j} = 1$
$y_i \sim M\text{-variate Gaussian}(\mu_i, \Sigma_0)$ with $\mu_i \in \mathbb{R}^M$ and $\Sigma_0$ known ($M \times M$)

Case Study: Applying Generalized Linear Models. Note: Reference Yee (2010) on the VGAM Package for R.
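For the multinomial example, one common parameterization (used, e.g., by VGAM's multinomial family, with the last category as reference) maps the $M$ linear predictors $\eta_j = \beta_j^T x$ to category probabilities via a softmax, so that $\log(\pi_j / \pi_{M+1}) = \eta_j$. A sketch (setup and names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
p, M = 4, 2                            # p covariates, M + 1 = 3 categories
B = rng.normal(size=(p, M))            # B = [beta_1 ... beta_M], p x M
x = rng.normal(size=p)

eta = B.T @ x                          # M linear predictors eta_j = beta_j^T x
expeta = np.exp(np.append(eta, 0.0))   # reference category M+1 has eta = 0
pi = expeta / expeta.sum()             # (pi_1, ..., pi_{M+1}), sums to 1
```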

MIT OpenCourseWare
http://ocw.mit.edu
18.655 Mathematical Statistics, Spring 2016
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms