Nonlinear Models. What do you do when you don t have a line? What do you do when you don t have a line? A Quadratic Adventure

Size: px
Start display at page:

Download "Nonlinear Models. What do you do when you don t have a line? What do you do when you don t have a line? A Quadratic Adventure"

Transcription

1 What do you do when you don t have a line? Nonlinear Models Spores 0e+00 2e+06 4e+06 6e+06 8e longevity What do you do when you don t have a line? A Quadratic Adventure 1. If nonlinear terms are additive fit with OLS 2. Transform? But think about what it will do to error. 3. Nonlinear Least Squares 4. Generalized Linear Models Spores 0e+00 2e+06 4e+06 6e+06 8e longevity Spores = b0 + b1 Longevity + b2 Longevity 2 + error

2 Putting Nonlinear Terms into an Additive Model Parameters are the Same as Ever summary(fungus.lmsq) fungus.lmsq <- lm(spores longevity + I(longevityˆ2), data=fungus) Call: lm(formula = Spores longevity + I(longevityˆ2), data = fungus) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) longevity e-05 I(longevityˆ2) Residual standard error: on 29 degrees of freedom Multiple R-squared: 0.607, Adjusted R-squared: 0.58 F-statistic: 22.4 on 2 and 29 DF, p-value: 1.3e-06 We Can t use abline Anymore We Can t use abline Anymore plot(spores longevity, data=fungus, pch=19) fungusfun <- function(x) coef(fungus.lmsq)[1] + coef(fungus.lmsq)[2]*x + coef(fungus.lmsq)[3]*xˆ2 curve(fungusfun, add=t, col="red", lwd=2) Spores 0e+00 2e+06 4e+06 6e+06 8e longevity

3 What if It s not a Linear Combination of Terms? Common Transformations bmr.watts log(y) arcsin(sqrt(y)) for bounded data logit for bounded data (more well behaved) Box-Cox Transform May have to add 0.01, 0.5, or 1 to many of these in cases with 0s mass.g You must ask yourself, what do the transformed variables mean? MetabolicRate = a mass b But Where does Error Come In But Where does Error Come In log(bmr.watts) log(metabolicrate) = log(a) + b log(mass) + error implies MetabolicRate = a mass b e error but we often want log(mass.g) MetabolicRate = a mass b + error log(metabolicrate) = log(a) + b log(mass) + error

4 Nonlinear Least Squares Fitting Nonlinear Least Squares Fitting summary(primate.nls) primate.nls <- nls(bmr.watts a*mass.gˆb, data=primates, start=list(a = , b = )) Uses algorithm for fitting. Very flexible. Must specify start values. Formula: bmr.watts a * mass.gˆb Parameters: Estimate Std. Error t value Pr(> t ) a b < 2e-16 Residual standard error: on 15 degrees of freedom Number of iterations to convergence: 4 Achieved convergence tolerance: 7.01e-07 NLS Performs Better Exercise: Kelp! bmr.watts NLS log log Evaluate the Frond Holdfast relationship Fit a model with a log transformation Fit a model with a nls model Compare Check the diagnostics - see anything? mass.g

5 The Kelp Data Envelope Residuals from Log Transform Residuals vs Fitted Normal Q Q Error: object kelp not found are a count variable, cannot be < 0 Residuals Standardized residuals Fitted values Theoretical Quantiles Mild Trumpet even in NLS The Kelp Data residuals(kelp.nls) Maybe the error is wrong fitted(kelp.nls) are a count variable, cannot be < 0

6 Generalized Linear Models Some Common Links Basic Premise: y dist(η, ν) dist is a distribution of the exponential family η is a link function such that η = f(µ) where µ is the mean of a curve φ is a variance function Identity: η = µ - e.g. µ = a + bx Log: η = log(µ) - e.g. µ = e ( a + bx) Logit: η = logit(µ) - e.g. µ = e( a+bx) 1+e ( a+bx) Inverse: η = 1 µ - e.g. µ = (a + bx) 1 For example, if dist is Normal, canonical link is µ, variance is σ 2 Distributions, Canonical Links, and Dispersion Distributions and Other Links Distribution Canonical Link Variance Function Normal identity 1 Poisson log µ Quasipoisson log µθ Binomial logit µ(1 µ) Quasibinomial logit µ(1 µ)θ Negative Binomial log µ + κµ 2 Gamma inverse µ 2 Inverse Normal 1/µ 2 µ 3 Distribution Normal Poisson Quasipoisson Binomial Quasibinomial Negative Binomial Gamma Inverse Normal Links identity, log, inverse log, identity, sqrt log, identity, sqrt logit, probit, cauchit, log, log-log logit, probit, cauchit, log, log-log log, identity, sqrt inverse, identity, log 1/µ 2, inverse, identity, log

7 Deviance and IWLS The Kelp Data Every GLM has a Set of Deviance Function to be Minimized i.e., for a normal distribution DM = (yi ˆµi) 2 Models are Fit using Iteratively Weighted Least Squares algorithm are a count variable, cannot be < 0 Fitting a GLM with a Poisson Error and Log Link How do we Assess Meeting Assumptions? Fronds Poisson( Fronds ˆ ) ˆ F ronds = exp(a + b * holdfast diameter) Deviance Residuals Deviance Residuals kelp.glm <- glm(, data=kelp, family=poisson(link="log")) Fitted Link Fitted Response Response Residuals Response Resdiuals Fitted Link Fitted Response

8 Predicted values Obs. number Theoretical Quantiles Cook's distance Leverage Predicted values Different Types of Residuals How do we Assess Meeting Assumptions? residuals(kelp.glm, type="deviance") residuals(kelp.glm, type="pearson") residuals(kelp.glm, type="response") Residuals Residuals vs Fitted Cook's distance Std. deviance resid Normal Q Q Residuals vs Leverage Std. deviance resid Scale Location Cook's distance Std. Pearson resid GLM Model Coefficients Call: glm(formula =, family = poisson(link = "log"), data = kelp) Deviance Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) <2e <2e-16 (Dispersion parameter for poisson family taken to be 1) Null deviance: on 107 degrees of freedom Residual deviance: on 106 degrees of freedom Checking Fit cor(fitted(kelp.glm), fitted(kelp.glm) + residuals(kelp.glm, type="response"))ˆ2 [1] summary(kelp.lm)$r.squared [1] 0.277

9 The Fitted Model Prediction Confidence Intervals by Hand upperci <- qpois(0.975, lambda = round(fitted(kelp.glm))) lowerci <- qpois(0.025, lambda = round(fitted(kelp.glm))) HLD <- na.omit(kelp)$ kelp.ggplot + geom_line(mapping=aes(x=hld, y=upperci), lty=2, col="blue") + geom_line(mapping=aes(x=hld, y=lowerci), lty=2, col="blue") Prediction Confidence Intervals by Hand Which Overdispersed Distribution to Use? v(quasipoisson) = µθ v(negative Binomial) = µ + κµ 2 20 see Ver Hoef and Boveng 2007 Ecology 0 Overdispersion?

10 GLM with Negative Binomial Negative Binomial Performs Better anova(kelp.glm, kelp.glm.nb) library(mass) kelp.glm.nb <- glm.nb(, data=kelp) Analysis of Deviance Table Model 1: Model 2: Resid. Df Resid. Dev Df Deviance The Fitted Model Fit with Prediction Error

11 Compare to Quasipoisson with Prediction Error Example: Wolf Inbreeding and Litter Size The Number of Pups is a Count! Fit GLMs with different errors and links Which is the best model? Plot with fit and prediction error

Generalized Linear Models

Generalized Linear Models Generalized Linear Models 1/37 The Kelp Data FRONDS 0 20 40 60 20 40 60 80 100 HLD_DIAM FRONDS are a count variable, cannot be < 0 2/37 Nonlinear Fits! FRONDS 0 20 40 60 log NLS 20 40 60 80 100 HLD_DIAM

More information

Nonlinear Models. Daphnia: Purveyors of Fine Fungus 1/30 2/30

Nonlinear Models. Daphnia: Purveyors of Fine Fungus 1/30 2/30 Nonlinear Models 1/30 Daphnia: Purveyors of Fine Fungus 2/30 What do you do when you don t have a straight line? 7500000 Spores 5000000 2500000 0 30 40 50 60 70 longevity 3/30 What do you do when you don

More information

Regression models. Generalized linear models in R. Normal regression models are not always appropriate. Generalized linear models. Examples.

Regression models. Generalized linear models in R. Normal regression models are not always appropriate. Generalized linear models. Examples. Regression models Generalized linear models in R Dr Peter K Dunn http://www.usq.edu.au Department of Mathematics and Computing University of Southern Queensland ASC, July 00 The usual linear regression

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

Non-Gaussian Response Variables

Non-Gaussian Response Variables Non-Gaussian Response Variables What is the Generalized Model Doing? The fixed effects are like the factors in a traditional analysis of variance or linear model The random effects are different A generalized

More information

Generalized linear models

Generalized linear models Generalized linear models Outline for today What is a generalized linear model Linear predictors and link functions Example: estimate a proportion Analysis of deviance Example: fit dose- response data

More information

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/ Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/28.0018 Statistical Analysis in Ecology using R Linear Models/GLM Ing. Daniel Volařík, Ph.D. 13.

More information

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form: Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic

More information

Reaction Days

Reaction Days Stat April 03 Week Fitting Individual Trajectories # Straight-line, constant rate of change fit > sdat = subset(sleepstudy, Subject == "37") > sdat Reaction Days Subject > lm.sdat = lm(reaction ~ Days)

More information

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017

Poisson Regression. Gelman & Hill Chapter 6. February 6, 2017 Poisson Regression Gelman & Hill Chapter 6 February 6, 2017 Military Coups Background: Sub-Sahara Africa has experienced a high proportion of regime changes due to military takeover of governments for

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

cor(dataset$measurement1, dataset$measurement2, method= pearson ) cor.test(datavector1, datavector2, method= pearson )

cor(dataset$measurement1, dataset$measurement2, method= pearson ) cor.test(datavector1, datavector2, method= pearson ) Tutorial 7: Correlation and Regression Correlation Used to test whether two variables are linearly associated. A correlation coefficient (r) indicates the strength and direction of the association. A correlation

More information

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion Biostokastikum Overdispersion is not uncommon in practice. In fact, some would maintain that overdispersion is the norm in practice and nominal dispersion the exception McCullagh and Nelder (1989) Overdispersion

More information

Modeling Overdispersion

Modeling Overdispersion James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 1 Introduction 2 Introduction In this lecture we discuss the problem of overdispersion in

More information

Generalised linear models. Response variable can take a number of different formats

Generalised linear models. Response variable can take a number of different formats Generalised linear models Response variable can take a number of different formats Structure Limitations of linear models and GLM theory GLM for count data GLM for presence \ absence data GLM for proportion

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

STAC51: Categorical data Analysis

STAC51: Categorical data Analysis STAC51: Categorical data Analysis Mahinda Samarakoon April 6, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 25 Table of contents 1 Building and applying logistic regression models (Chap

More information

SPH 247 Statistical Analysis of Laboratory Data. April 28, 2015 SPH 247 Statistics for Laboratory Data 1

SPH 247 Statistical Analysis of Laboratory Data. April 28, 2015 SPH 247 Statistics for Laboratory Data 1 SPH 247 Statistical Analysis of Laboratory Data April 28, 2015 SPH 247 Statistics for Laboratory Data 1 Outline RNA-Seq for differential expression analysis Statistical methods for RNA-Seq: Structure and

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn

A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn A Handbook of Statistical Analyses Using R Brian S. Everitt and Torsten Hothorn CHAPTER 6 Logistic Regression and Generalised Linear Models: Blood Screening, Women s Role in Society, and Colonic Polyps

More information

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics Linear, Generalized Linear, and Mixed-Effects Models in R John Fox McMaster University ICPSR 2018 John Fox (McMaster University) Statistical Models in R ICPSR 2018 1 / 19 Linear and Generalized Linear

More information

Exercise 5.4 Solution

Exercise 5.4 Solution Exercise 5.4 Solution Niels Richard Hansen University of Copenhagen May 7, 2010 1 5.4(a) > leukemia

More information

Correlation and Regression

Correlation and Regression Correlation and Regression http://xkcd.com/552/ Review Testing Hypotheses with P-Values Writing Functions Z, T, and χ 2 tests for hypothesis testing Power of different statistical tests using simulation

More information

Generalized Estimating Equations

Generalized Estimating Equations Outline Review of Generalized Linear Models (GLM) Generalized Linear Model Exponential Family Components of GLM MLE for GLM, Iterative Weighted Least Squares Measuring Goodness of Fit - Deviance and Pearson

More information

Checking the Poisson assumption in the Poisson generalized linear model

Checking the Poisson assumption in the Poisson generalized linear model Checking the Poisson assumption in the Poisson generalized linear model The Poisson regression model is a generalized linear model (glm) satisfying the following assumptions: The responses y i are independent

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Experimental Design and Statistical Methods. Workshop LOGISTIC REGRESSION. Jesús Piedrafita Arilla.

Experimental Design and Statistical Methods. Workshop LOGISTIC REGRESSION. Jesús Piedrafita Arilla. Experimental Design and Statistical Methods Workshop LOGISTIC REGRESSION Jesús Piedrafita Arilla jesus.piedrafita@uab.cat Departament de Ciència Animal i dels Aliments Items Logistic regression model Logit

More information

Generalized Linear Models. stat 557 Heike Hofmann

Generalized Linear Models. stat 557 Heike Hofmann Generalized Linear Models stat 557 Heike Hofmann Outline Intro to GLM Exponential Family Likelihood Equations GLM for Binomial Response Generalized Linear Models Three components: random, systematic, link

More information

12 Modelling Binomial Response Data

12 Modelling Binomial Response Data c 2005, Anthony C. Brooms Statistical Modelling and Data Analysis 12 Modelling Binomial Response Data 12.1 Examples of Binary Response Data Binary response data arise when an observation on an individual

More information

Poisson Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Poisson Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Poisson Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Poisson Regression 1 / 49 Poisson Regression 1 Introduction

More information

Consider fitting a model using ordinary least squares (OLS) regression:

Consider fitting a model using ordinary least squares (OLS) regression: Example 1: Mating Success of African Elephants In this study, 41 male African elephants were followed over a period of 8 years. The age of the elephant at the beginning of the study and the number of successful

More information

Linear Regression. Data Model. β, σ 2. Process Model. ,V β. ,s 2. s 1. Parameter Model

Linear Regression. Data Model. β, σ 2. Process Model. ,V β. ,s 2. s 1. Parameter Model Regression: Part II Linear Regression y~n X, 2 X Y Data Model β, σ 2 Process Model Β 0,V β s 1,s 2 Parameter Model Assumptions of Linear Model Homoskedasticity No error in X variables Error in Y variables

More information

Generalized Linear Models in R

Generalized Linear Models in R Generalized Linear Models in R NO ORDER Kenneth K. Lopiano, Garvesh Raskutti, Dan Yang last modified 28 4 2013 1 Outline 1. Background and preliminaries 2. Data manipulation and exercises 3. Data structures

More information

Homework 5 - Solution

Homework 5 - Solution STAT 526 - Spring 2011 Homework 5 - Solution Olga Vitek Each part of the problems 5 points 1. Agresti 10.1 (a) and (b). Let Patient Die Suicide Yes No sum Yes 1097 90 1187 No 203 435 638 sum 1300 525 1825

More information

Introduction and Single Predictor Regression. Correlation

Introduction and Single Predictor Regression. Correlation Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation

More information

11. Generalized Linear Models: An Introduction

11. Generalized Linear Models: An Introduction Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates Madison January 11, 2011 Contents 1 Definition 1 2 Links 2 3 Example 7 4 Model building 9 5 Conclusions 14

More information

Logistic Regression - problem 6.14

Logistic Regression - problem 6.14 Logistic Regression - problem 6.14 Let x 1, x 2,, x m be given values of an input variable x and let Y 1,, Y m be independent binomial random variables whose distributions depend on the corresponding values

More information

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

STA216: Generalized Linear Models. Lecture 1. Review and Introduction STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general

More information

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46

A Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46 A Generalized Linear Model for Binomial Response Data Copyright c 2017 Dan Nettleton (Iowa State University) Statistics 510 1 / 46 Now suppose that instead of a Bernoulli response, we have a binomial response

More information

Chapter 3: Generalized Linear Models

Chapter 3: Generalized Linear Models 92 Chapter 3: Generalized Linear Models 3.1 Components of a GLM 1. Random Component Identify response variable Y. Assume independent observations y 1,...,y n from particular family of distributions, e.g.,

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates 2011-03-16 Contents 1 Generalized Linear Mixed Models Generalized Linear Mixed Models When using linear mixed

More information

Multivariate Statistics in Ecology and Quantitative Genetics Summary

Multivariate Statistics in Ecology and Quantitative Genetics Summary Multivariate Statistics in Ecology and Quantitative Genetics Summary Dirk Metzler & Martin Hutzenthaler http://evol.bio.lmu.de/_statgen 5. August 2011 Contents Linear Models Generalized Linear Models Mixed-effects

More information

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn A Handbook of Statistical Analyses Using R 2nd Edition Brian S. Everitt and Torsten Hothorn CHAPTER 7 Logistic Regression and Generalised Linear Models: Blood Screening, Women s Role in Society, Colonic

More information

Chapter 22: Log-linear regression for Poisson counts

Chapter 22: Log-linear regression for Poisson counts Chapter 22: Log-linear regression for Poisson counts Exposure to ionizing radiation is recognized as a cancer risk. In the United States, EPA sets guidelines specifying upper limits on the amount of exposure

More information

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 26 May :00 16:00

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 26 May :00 16:00 Two Hours MATH38052 Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER GENERALISED LINEAR MODELS 26 May 2016 14:00 16:00 Answer ALL TWO questions in Section

More information

Generalized Linear Models: An Introduction

Generalized Linear Models: An Introduction Applied Statistics With R Generalized Linear Models: An Introduction John Fox WU Wien May/June 2006 2006 by John Fox Generalized Linear Models: An Introduction 1 A synthesis due to Nelder and Wedderburn,

More information

Binary Regression. GH Chapter 5, ISL Chapter 4. January 31, 2017

Binary Regression. GH Chapter 5, ISL Chapter 4. January 31, 2017 Binary Regression GH Chapter 5, ISL Chapter 4 January 31, 2017 Seedling Survival Tropical rain forests have up to 300 species of trees per hectare, which leads to difficulties when studying processes which

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

STAT 3022 Spring 2007

STAT 3022 Spring 2007 Simple Linear Regression Example These commands reproduce what we did in class. You should enter these in R and see what they do. Start by typing > set.seed(42) to reset the random number generator so

More information

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20 Logistic regression 11 Nov 2010 Logistic regression (EPFL) Applied Statistics 11 Nov 2010 1 / 20 Modeling overview Want to capture important features of the relationship between a (set of) variable(s)

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example The Big Picture Remedies after Model Diagnostics The Big Picture Model Modifications Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison February 6, 2007 Residual plots

More information

Regression and Models with Multiple Factors. Ch. 17, 18

Regression and Models with Multiple Factors. Ch. 17, 18 Regression and Models with Multiple Factors Ch. 17, 18 Mass 15 20 25 Scatter Plot 70 75 80 Snout-Vent Length Mass 15 20 25 Linear Regression 70 75 80 Snout-Vent Length Least-squares The method of least

More information

Model Modifications. Bret Larget. Departments of Botany and of Statistics University of Wisconsin Madison. February 6, 2007

Model Modifications. Bret Larget. Departments of Botany and of Statistics University of Wisconsin Madison. February 6, 2007 Model Modifications Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison February 6, 2007 Statistics 572 (Spring 2007) Model Modifications February 6, 2007 1 / 20 The Big

More information

Notes for week 4 (part 2)

Notes for week 4 (part 2) Notes for week 4 (part 2) Ben Bolker October 3, 2013 Licensed under the Creative Commons attribution-noncommercial license (http: //creativecommons.org/licenses/by-nc/3.0/). Please share & remix noncommercially,

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

1. Variance stabilizing transformations; Box-Cox Transformations - Section. 2. Transformations to linearize the model - Section 5.

1. Variance stabilizing transformations; Box-Cox Transformations - Section. 2. Transformations to linearize the model - Section 5. Ch. 5: Transformations and Weighting 1. Variance stabilizing transformations; Box-Cox Transformations - Section 5.2; 5.4 2. Transformations to linearize the model - Section 5.3 3. Weighted regression -

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Assumptions of Linear Model Homoskedasticity Model variance No error in X variables Errors in variables No missing data Missing data model Normally distributed error Error in

More information

Sample solutions. Stat 8051 Homework 8

Sample solutions. Stat 8051 Homework 8 Sample solutions Stat 8051 Homework 8 Problem 1: Faraway Exercise 3.1 A plot of the time series reveals kind of a fluctuating pattern: Trying to fit poisson regression models yields a quadratic model if

More information

Generalized Linear Models 1

Generalized Linear Models 1 Generalized Linear Models 1 STA 2101/442: Fall 2012 1 See last slide for copyright information. 1 / 24 Suggested Reading: Davison s Statistical models Exponential families of distributions Sec. 5.2 Chapter

More information

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random STA 216: GENERALIZED LINEAR MODELS Lecture 1. Review and Introduction Much of statistics is based on the assumption that random variables are continuous & normally distributed. Normal linear regression

More information

Generalized linear models and survival analysis. fligner.test (resid (ACFq.glm) ~ factor (ACF1$ endtime ))

Generalized linear models and survival analysis. fligner.test (resid (ACFq.glm) ~ factor (ACF1$ endtime )) 5. Generalized linear models and survival analysis Under the assumed model, these should be approximately equal. The differences in variance are however nowhere near statistical significance. The following

More information

Introduction to the Generalized Linear Model: Logistic regression and Poisson regression

Introduction to the Generalized Linear Model: Logistic regression and Poisson regression Introduction to the Generalized Linear Model: Logistic regression and Poisson regression Statistical modelling: Theory and practice Gilles Guillot gigu@dtu.dk November 4, 2013 Gilles Guillot (gigu@dtu.dk)

More information

Review of Poisson Distributions. Section 3.3 Generalized Linear Models For Count Data. Example (Fatalities From Horse Kicks)

Review of Poisson Distributions. Section 3.3 Generalized Linear Models For Count Data. Example (Fatalities From Horse Kicks) Section 3.3 Generalized Linear Models For Count Data Review of Poisson Distributions Outline Review of Poisson Distributions GLMs for Poisson Response Data Models for Rates Overdispersion and Negative

More information

Statistics 572 Semester Review

Statistics 572 Semester Review Statistics 572 Semester Review Final Exam Information: The final exam is Friday, May 16, 10:05-12:05, in Social Science 6104. The format will be 8 True/False and explains questions (3 pts. each/ 24 pts.

More information

MSH3 Generalized linear model Ch. 6 Count data models

MSH3 Generalized linear model Ch. 6 Count data models Contents MSH3 Generalized linear model Ch. 6 Count data models 6 Count data model 208 6.1 Introduction: The Children Ever Born Data....... 208 6.2 The Poisson Distribution................. 210 6.3 Log-Linear

More information

> modlyq <- lm(ly poly(x,2,raw=true)) > summary(modlyq) Call: lm(formula = ly poly(x, 2, raw = TRUE))

> modlyq <- lm(ly poly(x,2,raw=true)) > summary(modlyq) Call: lm(formula = ly poly(x, 2, raw = TRUE)) School of Mathematical Sciences MTH5120 Statistical Modelling I Tutorial 4 Solutions The first two models were looked at last week and both had flaws. The output for the third model with log y and a quadratic

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3 4 5 6 Full marks

More information

Week 7 Multiple factors. Ch , Some miscellaneous parts

Week 7 Multiple factors. Ch , Some miscellaneous parts Week 7 Multiple factors Ch. 18-19, Some miscellaneous parts Multiple Factors Most experiments will involve multiple factors, some of which will be nuisance variables Dealing with these factors requires

More information

Ch. 5 Transformations and Weighting

Ch. 5 Transformations and Weighting Outline Three approaches: Ch. 5 Transformations and Weighting. Variance stabilizing transformations; Box-Cox Transformations - Section 5.2; 5.4 2. Transformations to linearize the model - Section 5.3 3.

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Advanced Methods for Data Analysis (36-402/36-608 Spring 2014 1 Generalized linear models 1.1 Introduction: two regressions So far we ve seen two canonical settings for regression.

More information

Interactions in Logistic Regression

Interactions in Logistic Regression Interactions in Logistic Regression > # UCBAdmissions is a 3-D table: Gender by Dept by Admit > # Same data in another format: > # One col for Yes counts, another for No counts. > Berkeley = read.table("http://www.utstat.toronto.edu/~brunner/312f12/

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Residuals from regression on original data 1

Residuals from regression on original data 1 Residuals from regression on original data 1 Obs a b n i y 1 1 1 3 1 1 2 1 1 3 2 2 3 1 1 3 3 3 4 1 2 3 1 4 5 1 2 3 2 5 6 1 2 3 3 6 7 1 3 3 1 7 8 1 3 3 2 8 9 1 3 3 3 9 10 2 1 3 1 10 11 2 1 3 2 11 12 2 1

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information

9 Generalized Linear Models

9 Generalized Linear Models 9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

More information

PAPER 206 APPLIED STATISTICS

PAPER 206 APPLIED STATISTICS MATHEMATICAL TRIPOS Part III Thursday, 1 June, 2017 9:00 am to 12:00 pm PAPER 206 APPLIED STATISTICS Attempt no more than FOUR questions. There are SIX questions in total. The questions carry equal weight.

More information

Generalized Linear Models I

Generalized Linear Models I Statistics 203: Introduction to Regression and Analysis of Variance Generalized Linear Models I Jonathan Taylor - p. 1/16 Today s class Poisson regression. Residuals for diagnostics. Exponential families.

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Yan Lu Jan, 2018, week 3 1 / 67 Hypothesis tests Likelihood ratio tests Wald tests Score tests 2 / 67 Generalized Likelihood ratio tests Let Y = (Y 1,

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

More information

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model

More information

Extending the Linear Model

Extending the Linear Model Extending the Linear Model G.Janacek School of Computing Sciences, UEA Part I generalized linear models 1 Contents I generalized linear models 1 0.1 Introduction................................... 4 0.1.1

More information

R Hints for Chapter 10

R Hints for Chapter 10 R Hints for Chapter 10 The multiple logistic regression model assumes that the success probability p for a binomial random variable depends on independent variables or design variables x 1, x 2,, x k.

More information

11. Generalized Linear Models: An Introduction

11. Generalized Linear Models: An Introduction Sociolog 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Generalized Linear Models: An Introduction 1 1. Introduction I A snthesis due to Nelder and Wedderburn, generalized linear

More information

Log-linear Models for Contingency Tables

Log-linear Models for Contingency Tables Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A

More information

Handling Categorical Predictors: ANOVA

Handling Categorical Predictors: ANOVA Handling Categorical Predictors: ANOVA 1/33 I Hate Lines! When we think of experiments, we think of manipulating categories Control, Treatment 1, Treatment 2 Models with Categorical Predictors still reflect

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: ) NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3

More information

MSH3 Generalized linear model

MSH3 Generalized linear model Contents MSH3 Generalized linear model 5 Logit Models for Binary Data 173 5.1 The Bernoulli and binomial distributions......... 173 5.1.1 Mean, variance and higher order moments.... 173 5.1.2 Normal limit....................

More information

R Output for Linear Models using functions lm(), gls() & glm()

R Output for Linear Models using functions lm(), gls() & glm() LM 04 lm(), gls() &glm() 1 R Output for Linear Models using functions lm(), gls() & glm() Different kinds of output related to linear models can be obtained in R using function lm() {stats} in the base

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Linear Modelling: Simple Regression

Linear Modelling: Simple Regression Linear Modelling: Simple Regression 10 th of Ma 2018 R. Nicholls / D.-L. Couturier / M. Fernandes Introduction: ANOVA Used for testing hpotheses regarding differences between groups Considers the variation

More information

Regression Model Building

Regression Model Building Regression Model Building Setting: Possibly a large set of predictor variables (including interactions). Goal: Fit a parsimonious model that explains variation in Y with a small set of predictors Automated

More information

Statistical analysis of trends in climate indicators by means of counts

Statistical analysis of trends in climate indicators by means of counts U.U.D.M. Project Report 2016:43 Statistical analysis of trends in climate indicators by means of counts Erik Jansson Examensarbete i matematik, 15 hp Handledare: Jesper Rydén Examinator: Jörgen Östensson

More information

STA102 Class Notes Chapter Logistic Regression

STA102 Class Notes Chapter Logistic Regression STA0 Class Notes Chapter 0 0. Logistic Regression We continue to study the relationship between a response variable and one or more eplanatory variables. For SLR and MLR (Chapters 8 and 9), our response

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

GLMs for Count Data. Outline. Generalized linear models. Generalized linear models. Michael Friendly. Example: phdpubs.

GLMs for Count Data. Outline. Generalized linear models. Generalized linear models. Michael Friendly. Example: phdpubs. Number of articles Number of articles 1.9 1.7 1.5 1.7 1.5 0 1 female 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 phdprestige GLMs for Count Data Michael Friendly Psych 6136 November 28, 2017 Number of articles

More information

Logistic Regressions. Stat 430

Logistic Regressions. Stat 430 Logistic Regressions Stat 430 Final Project Final Project is, again, team based You will decide on a project - only constraint is: you are supposed to use techniques for a solution that are related to

More information

Booklet of Code and Output for STAD29/STA 1007 Midterm Exam

Booklet of Code and Output for STAD29/STA 1007 Midterm Exam Booklet of Code and Output for STAD29/STA 1007 Midterm Exam List of Figures in this document by page: List of Figures 1 NBA attendance data........................ 2 2 Regression model for NBA attendances...............

More information