Non-Gaussian Response Variables
- Candice Pearson
1 Non-Gaussian Response Variables
2 What is the Generalized Model Doing?
- A generalized linear mixed model (GLMM) includes fixed and random factors
- The fixed effects are like the factors in a traditional analysis of variance or linear model
- The random effects are different: they are modeled as coming from a specified distribution (e.g., each female has a different baseline offspring size, and this baseline size has a normal distribution with an estimated variance)
- The model searches through parameter values to find the set of slopes and intercepts that maximizes the probability of the observed data
- By including a random effect, you can reduce that variable's impact on the fixed-effect analysis; you usually will not care too much about the random effect itself (but you might)
3 A Flexible Modeling Framework
- GLMMs (and related models) allow a lot of modeling flexibility:
  - Explicit modeling of heterogeneity
  - Including a spatial or temporal component
  - Response variables that are not normally distributed
  - Fixed and random factors
- They naturally allow nested designs and repeated measures
4 Exponential Family of Distributions
- Normal: symmetric, continuous
- Poisson: asymmetric, discrete
  - Rare events, like the number of robberies per week in College Station
  - The mean equals the variance
- Binomial: asymmetric, discrete
  - Number of occurrences
  - The mean is larger than the variance
- Negative binomial: asymmetric, discrete
  - Like the binomial, except the variance is larger than the mean
- Gamma: asymmetric, continuous
  - Can have a variety of shapes, but all observations are positive
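The mean-variance relationships listed above can be checked by simulation. The following is a minimal sketch in base R; the parameter values (a mean of 4, 10 trials, a size of 2) are arbitrary choices made for illustration and do not come from the slides.

```r
# Simulate large samples and compare each sample variance with its sample mean.
set.seed(1)
n <- 100000

pois  <- rpois(n, lambda = 4)              # Poisson: variance equals the mean
binom <- rbinom(n, size = 10, prob = 0.4)  # binomial: variance below the mean
nbin  <- rnbinom(n, mu = 4, size = 2)      # negative binomial: variance above the mean

# Variance-to-mean ratio for each distribution
ratios <- c(
  poisson           = var(pois)  / mean(pois),
  binomial          = var(binom) / mean(binom),
  negative_binomial = var(nbin)  / mean(nbin)
)
print(round(ratios, 2))
```

A ratio near 1 is what the Poisson assumes; ratios well below or above 1 point toward the binomial or negative binomial, respectively.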
5 Parts of a GLM
- The distribution of the response variable
  - Usually we assume it's normal
- Specification of the systematic component in terms of explanatory variables
  - The fixed effects
  - If we had random factors, it would be a GLMM
- The link between the systematic part and the response variable
  - In our usual models, the link is the identity link, where the expected value of the response variable is directly estimated (like from the equation for a line: y = mx + b)
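To make the three parts concrete, the sketch below fits the same straight line twice: once with lm() and once with glm() using a normal distribution and the identity link. The simulated data and the variable names x and y are assumptions made for illustration only.

```r
# A Gaussian GLM with an identity link is the ordinary linear model.
set.seed(42)
x <- runif(50, 0, 10)
y <- 2 + 0.5 * x + rnorm(50)   # y = mx + b plus normal noise

fit_lm  <- lm(y ~ x)
fit_glm <- glm(y ~ x, family = gaussian(link = "identity"))

# The two fits should give the same intercept and slope
print(coef(fit_lm))
print(coef(fit_glm))
same_coefs <- isTRUE(all.equal(coef(fit_lm), coef(fit_glm)))
```

Changing the family and link arguments is all that is needed to move from this familiar model to the Poisson and binomial models discussed in the following slides.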
6 Implementing a Poisson GLM
- Now the response variable has a Poisson distribution
- We specify the systematic part of the model in the usual way (same goes for random parts if we want those)
- The link is logarithmic, which ensures that the predicted values are always non-negative (a Poisson distribution doesn't allow negative values)
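Written out for a single explanatory variable, the three parts of a Poisson GLM might look as follows. The symbols α, β, and x_i are notation assumed here for illustration; they do not appear on the slides.

```latex
\begin{aligned}
Y_i &\sim \mathrm{Poisson}(\mu_i), \qquad E(Y_i) = \operatorname{var}(Y_i) = \mu_i \\
\log(\mu_i) &= \alpha + \beta x_i \\
\mu_i &= e^{\alpha + \beta x_i} > 0
\end{aligned}
```

The last line shows why the log link keeps predictions in the allowed range: whatever values α and β take, the back-transformed mean is positive.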
7 Example: Amphibian Roadkills
- Dataset: roadkills of amphibians at 52 sites of varying distance from a natural park
- The number of roadkills is not normally distributed
- An amphibian getting run over by a car might be a rare, random event, so you might expect it to have a Poisson distribution (at each distance)
8 Plot of Roadkills on Distance
[Figure: total roadkills (y-axis) against distance from park (x-axis)]
9 Fitting a Poisson GLM
> M1 <- glm(TOT.N ~ D.PARK, family=poisson, data=roadkills)
> summary(M1)
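The roadkill data are not included here, so the sketch below simulates a stand-in dataset with the same column names and fits the same kind of model. The 52 sites match the slides, but the "true" intercept (4) and slope (-1e-4) are assumptions chosen for illustration, not estimates from the real data.

```r
# Simulate roadkill-like counts that decline with distance from the park.
set.seed(123)
n_sites <- 52
D.PARK  <- runif(n_sites, 0, 25000)      # distance from park (hypothetical values)
mu      <- exp(4 - 1e-4 * D.PARK)        # assumed true mean on the response scale
TOT.N   <- rpois(n_sites, mu)            # Poisson counts around that mean
sim     <- data.frame(TOT.N, D.PARK)

# Same call as on the slide, applied to the simulated data
M_sim <- glm(TOT.N ~ D.PARK, family = poisson, data = sim)
print(coef(M_sim))   # estimates are on the log scale
```

Because the coefficients are on the log scale, exp() of the intercept gives the expected count at distance zero, and exp() of the slope gives the multiplicative change in the expected count per unit of distance.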
10 Poisson GLM Output
Deviance Residuals:
Min 1Q Median 3Q Max
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.316e e <2e-16 ***
D.PARK e e <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: on 51 degrees of freedom
Residual deviance: on 50 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 4
11 Meaning of Deviance
- Null and residual deviances are kind of like maximum likelihood equivalents of the total and residual sums of squares
- An R²-like term can be obtained from:
  $100 \times \dfrac{\text{null deviance} - \text{residual deviance}}{\text{null deviance}}$
- Applying this relationship to the previous model, we find that it explains 63.5% of the variation
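The deviance-based percentage can be computed directly from a fitted glm object. This is a small sketch on simulated data (the variables x and y and their generating values are assumptions); the components null.deviance and deviance are standard elements of an R glm fit.

```r
# Percentage of deviance explained by a Poisson GLM.
set.seed(2)
x <- runif(100, 0, 2)
y <- rpois(100, exp(1 + 0.8 * x))
fit <- glm(y ~ x, family = poisson)

# 100 * (null deviance - residual deviance) / null deviance
pct_explained <- 100 * (fit$null.deviance - fit$deviance) / fit$null.deviance
print(round(pct_explained, 1))

# The null deviance is the residual deviance of the intercept-only model
fit_null <- glm(y ~ 1, family = poisson)
```

The comparison with the intercept-only fit mirrors the sums-of-squares analogy: the null model plays the role of the total sum of squares.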
12 Fitting a Line for the Model
[Figure: total roadkills (y-axis) against distance from park (x-axis), with the fitted model line]
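13 Code for the Lines
# Predict on the link (log) scale over a grid of distances
MyData <- data.frame(D.PARK = seq(from = 0, to = 25000, by = 1000))
G <- predict(M1, newdata = MyData, type = "link", se.fit = TRUE)
# Back-transform the fit and the approximate 95% limits to the count scale
F <- exp(G$fit)
FSEUP <- exp(G$fit + 1.96 * G$se.fit)
FSELOW <- exp(G$fit - 1.96 * G$se.fit)
# Add the fitted line (solid) and the limits (dashed) to the existing plot
lines(MyData$D.PARK, F, lty = 1, lwd = 3)
lines(MyData$D.PARK, FSEUP, lty = 2, lwd = 3)
lines(MyData$D.PARK, FSELOW, lty = 2, lwd = 3)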
14 Model Selection in a Poisson GLM
- Option 1: Drop terms sequentially and test full and reduced models
- Option 2: Use the drop1 command to drop each explanatory variable in turn
- Option 3: Use the anova command to sequentially remove each term and compare the resulting models to the original full model
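The options above can be sketched on simulated data in which one predictor matters and one does not. The variable names x1 and x2 and the effect sizes are assumptions for illustration; anova() and drop1() are the base R functions named on the slide.

```r
# Compare a full and a reduced Poisson GLM, then drop each term in turn.
set.seed(3)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rpois(n, exp(1 + 0.5 * x1))   # x2 has no real effect

full    <- glm(y ~ x1 + x2, family = poisson)
reduced <- glm(y ~ x1,      family = poisson)

# Option 1: likelihood-ratio (chi-square) test of reduced versus full
cmp <- anova(reduced, full, test = "Chisq")
print(cmp)

# Option 2: the same kind of test for every term, one at a time
d1 <- drop1(full, test = "Chisq")
print(d1)
```

The likelihood-ratio statistic that drop1() reports for x2 is the same deviance difference that anova() reports for the manual comparison, so the two approaches agree for a single dropped term.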
15 The drop1 command
Example: Still roadkills, but with nine explanatory variables
> M2 <- glm(TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + D.PARK + SQ.SHRUB + SQ.WATRES + L.WAT.C + SQ.LPROAD + SQ.DWATCOUR, family=poisson, data=RK)
> summary(M2)
> drop1(M2, test="Chi")
16 Results of drop1()
Single term deletions
Model:
TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + D.PARK + SQ.SHRUB + SQ.WATRES + L.WAT.C + SQ.LPROAD + SQ.DWATCOUR
Df Deviance AIC LRT Pr(>Chi)
<none>
OPEN.L
MONT.S e-09 ***
SQ.POLIC e-05 ***
D.PARK < 2.2e-16 ***
SQ.SHRUB e-07 ***
SQ.WATRES **
L.WAT.C e-16 ***
SQ.LPROAD ***
SQ.DWATCOUR
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
17 Overdispersion
- Recall that the Poisson distribution assumes the variance is equal to the mean
- If the variance is greater than the mean, then a Poisson will not accurately describe the data
- This problem is called overdispersion
18 Detecting Overdispersion
- Calculate: $\hat{\phi} = \dfrac{D}{n - p}$
- D is the residual deviance of the model [it was reported for model M1 a few slides ago]
- n − p represents the degrees of freedom for the residual deviance [also reported by the summary() function, in this case 50]
- If this value is around 1, then overdispersion should not be a problem
- If it is well above 1, then overdispersion is a problem
- Here D/50 = 7.8, so overdispersion is a problem in this dataset
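The check above is a one-line calculation on a fitted model. The sketch below applies it to two simulated responses, one truly Poisson and one deliberately overdispersed; the helper name dispersion_stat and all simulation settings are hypothetical choices for illustration.

```r
# Residual deviance divided by its degrees of freedom, D / (n - p).
dispersion_stat <- function(fit) deviance(fit) / df.residual(fit)

set.seed(4)
n  <- 200
x  <- runif(n, 0, 2)
mu <- exp(1 + 0.7 * x)

y_pois <- rpois(n, mu)                   # variance equals the mean
y_over <- rnbinom(n, mu = mu, size = 1)  # variance = mu + mu^2, much larger

fit_pois <- glm(y_pois ~ x, family = poisson)
fit_over <- glm(y_over ~ x, family = poisson)

print(c(poisson_data       = dispersion_stat(fit_pois),
        overdispersed_data = dispersion_stat(fit_over)))
```

The first value should sit close to 1 and the second well above it, which is the pattern the slide describes for the roadkill model.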
19 Overdispersion in a Poisson GLM
- One approach is to use a quasi-Poisson GLM
- This model includes a dispersion parameter to better model the variance relative to the mean
- If the dispersion parameter (φ) is large, then it might be better to use a different model
20 Fitting a Quasipoisson
> M4 <- glm(TOT.N ~ D.PARK, family=quasipoisson, data=RK)
> summary(M4)
21 Results
Deviance Residuals:
Min 1Q Median 3Q Max
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.316e e < 2e-16 ***
D.PARK e e e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for quasipoisson family taken to be )
Null deviance: on 51 degrees of freedom
Residual deviance: on 50 degrees of freedom
AIC: NA
Number of Fisher Scoring iterations: 4
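A quick way to see what the quasipoisson family changes is to fit both families to the same overdispersed data. The sketch below uses simulated data (all names and settings are assumptions); it relies only on base R's glm() and summary().

```r
# Poisson versus quasi-Poisson on the same overdispersed counts.
set.seed(5)
n <- 150
x <- runif(n, 0, 2)
y <- rnbinom(n, mu = exp(1 + 0.7 * x), size = 1.5)

fit_p  <- glm(y ~ x, family = poisson)
fit_qp <- glm(y ~ x, family = quasipoisson)

phi   <- summary(fit_qp)$dispersion              # estimated dispersion parameter
se_p  <- coef(summary(fit_p))[, "Std. Error"]
se_qp <- coef(summary(fit_qp))[, "Std. Error"]

# Point estimates are identical; standard errors grow by sqrt(phi)
same_coefs <- isTRUE(all.equal(coef(fit_p), coef(fit_qp)))
scaled_se  <- isTRUE(all.equal(se_qp, se_p * sqrt(phi)))
print(c(phi = phi, same_coefs = same_coefs, scaled_se = scaled_se))
```

Note that R's summary() estimates the dispersion from the Pearson chi-square statistic divided by the residual degrees of freedom, so it will usually be close to, but not identical to, the deviance-based value D/(n − p).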
22 Model Selection in Quasipoisson
- AIC is not defined for a quasipoisson model, so you can't use AIC
- It's possible to compare models using F-tests
drop1(M5, test="F")
23 Model Validation in Poisson GLM
- Pearson residuals: the raw residuals scaled by the square root of the expected mean for a given value of the explanatory variable (because the variance of the Poisson changes with the mean)
- Deviance residuals: the contribution of each observation to the residual deviance. In other words, a measure of how badly that point fits
- The default is to use the deviance residuals for model validation, and they will usually be the best choice
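Both residual types can be reproduced by hand for a Poisson model, which may help clarify what resid() returns. This is a sketch on simulated data; the variable names are assumptions, and the formulas are the standard Poisson Pearson and deviance residuals.

```r
# Reproduce Pearson and deviance residuals of a Poisson GLM by hand.
set.seed(6)
x   <- runif(80, 0, 2)
y   <- rpois(80, exp(1 + 0.5 * x))
fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)

# Pearson: raw residual divided by the Poisson standard deviation sqrt(mu)
pearson_manual <- (y - mu) / sqrt(mu)

# Deviance: signed square root of each observation's contribution to the
# residual deviance; the term y * log(y / mu) is taken as 0 when y = 0
unit_dev <- 2 * (ifelse(y == 0, 0, y * log(y / mu)) - (y - mu))
deviance_manual <- sign(y - mu) * sqrt(pmax(unit_dev, 0))

pearson_ok  <- isTRUE(all.equal(unname(pearson_manual),
                                unname(resid(fit, type = "pearson"))))
deviance_ok <- isTRUE(all.equal(unname(deviance_manual),
                                unname(resid(fit, type = "deviance"))))
print(c(pearson_ok = pearson_ok, deviance_ok = deviance_ok))
```

The squared deviance residuals add up to the residual deviance of the model, which is why a large deviance residual flags a poorly fitted point.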
24 What to Plot
- Deviance residuals versus:
  - The fitted values
  - Each explanatory variable in the model
  - Each explanatory variable dropped from the model
  - Time (if it's available)
  - Any spatial aspect of the data
- We don't expect normality, but we are looking for patterns and lack of fit
25 Model Validation Plots
[Figure: four diagnostic panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance)]
26 Model Validation Plots
[Figure: four panels plotted against the fitted values (mu): Response residuals, Pearson residuals, Pearson residuals scaled, and Deviance residuals]
27 Code for the Validation Plots
# Model validation example
M5 <- glm(TOT.N ~ D.PARK, family = quasipoisson, data = RK)
plot(M5)  # standard four-panel diagnostic plots
EP <- resid(M5, type = "pearson")
ED <- resid(M5, type = "deviance")
mu <- predict(M5, type = "response")
E <- RK$TOT.N - mu  # response (raw) residuals
# Pearson residuals scaled by the dispersion, here taken from the fitted model
EP2 <- E / sqrt(summary(M5)$dispersion * mu)
op <- par(mfrow = c(2, 2))
plot(x = mu, y = E, main = "Response residuals")
plot(x = mu, y = EP, main = "Pearson residuals")
plot(x = mu, y = EP2, main = "Pearson residuals scaled")
plot(x = mu, y = ED, main = "Deviance residuals")
par(op)
28 Interpretation
- This model has a couple of problems
- First, the residuals have a clear pattern: they are above the predicted line at some distances and below it at others
- Second, some outliers are strongly influencing the results
29 Negative Binomial GLM
- Assumes:
  - The distribution of the response variable is negative binomial for any value of X. Recall that the variance is larger than the mean for a negative binomial distribution
  - The link function is logarithmic, which ensures that the fitted values are always non-negative
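The assumptions above can be illustrated with simulated counts whose variance exceeds the mean. The sketch below uses glm.nb() from the MASS package (a recommended package that ships with standard R installations); the simulated variables and the "true" value of theta are assumptions made for illustration.

```r
# Fit Poisson and negative binomial GLMs to overdispersed counts.
library(MASS)

set.seed(7)
n  <- 500
x  <- runif(n, 0, 2)
mu <- exp(1 + 0.6 * x)
theta_true <- 2
# Negative binomial counts: variance = mu + mu^2 / theta, always above mu
y <- rnbinom(n, mu = mu, size = theta_true)

fit_p  <- glm(y ~ x, family = poisson)
fit_nb <- glm.nb(y ~ x)   # log link is the default

print(c(theta_hat   = fit_nb$theta,
        AIC_poisson = AIC(fit_p),
        AIC_negbin  = AIC(fit_nb)))
```

The estimated theta is the number that appears in the "Negative Binomial(...)" family label in the summary output; smaller values correspond to stronger overdispersion. Unlike the quasipoisson, this model has a likelihood, so AIC can be used to compare it with the Poisson fit.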
30 Fitting a Negative Binomial GLM
> library(MASS)
> M6 <- glm.nb(TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + D.PARK + SQ.SHRUB + SQ.WATRES + L.WAT.C + SQ.LPROAD + SQ.DWATCOUR, link="log", data=RK)
> summary(M6, cor=FALSE)
31 Some output
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 3.951e e <2e-16 ***
OPEN.L e e **
MONT.S 5.846e e
SQ.POLIC e e
D.PARK e e <2e-16 ***
SQ.SHRUB e e
SQ.WATRES 1.631e e
L.WAT.C 2.076e e *
SQ.LPROAD 5.944e e
SQ.DWATCOUR e e
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for Negative Binomial(5.5178) family taken to be 1)
Null deviance: on 51 degrees of freedom
Residual deviance: on 42 degrees of freedom
AIC:
32 Tools for Model Selection
- The z-statistic from the summary (previous slide)
- Analysis of deviance table from anova(M6, test="Chi"), which does sequential testing
- Drop each term in turn using drop1(M6, test="Chi")
- Manually specify a nested model and compare them using anova(M6, M7, test="Chi")
33 Results
Model after model selection procedure:
> M8 <- glm.nb(TOT.N ~ OPEN.L + D.PARK, link = "log", data=RK)
> summary(M8)
> plot(M8)
34 Negative Binomial Plots
[Figure: four diagnostic panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance)]
35 Poisson Plots: Which is better?
[Figure: four diagnostic panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance)]
36 Adding Random Effects in a GLMM
- What if you have a non-Gaussian response variable AND want to include random effects in your model?
- The answer is a GLMM
- Several packages are available in R, but we will use glmer from the lme4 package
37 Example: Deer Parasites
- Data consist of whether or not each deer has parasites
- Deer differ by sex, size and farm of origin
- Which factors seem like they should be fixed and which are random?
- Because the response variable is binary, a binomial distribution is appropriate
38 Implementing the GLMM
> library(lme4)
> DE.lme4 <- glmer(Ec01 ~ CLength * fSex + (1 | fFarm), family=binomial, data=deer)
> summary(DE.lme4)
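The deer data are not available here, so the sketch below simulates binary infection data with a random intercept for each farm, which is the structure that the (1 | farm) term describes. All names (infected, length_c, farm) and parameter values are hypothetical. The glmer() call runs only if the lme4 package is installed.

```r
# Simulate binary outcomes with a farm-level random intercept.
set.seed(8)
n_farms    <- 24
n_per_farm <- 30
farm <- factor(rep(seq_len(n_farms), each = n_per_farm))

# Each farm gets its own baseline, drawn from a normal distribution
farm_effect <- rnorm(n_farms, mean = 0, sd = 1)
length_c    <- rnorm(n_farms * n_per_farm)   # centred body length (hypothetical)

eta <- -0.5 + 0.8 * length_c + farm_effect[as.integer(farm)]  # logit scale
infected <- rbinom(length(eta), size = 1, prob = plogis(eta))
sim <- data.frame(infected, length_c, farm)

# Fit the binomial GLMM if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(infected ~ length_c + (1 | farm),
                     family = binomial, data = sim)
  print(lme4::VarCorr(fit))   # estimated between-farm standard deviation
}
```

In this setup the farm is treated as random because the particular farms are a sample from a larger population of farms, while length is a fixed effect whose slope is of direct interest.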
39 Results Part I
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
Family: binomial ( logit )
Formula: Ec01 ~ CLength * fSex + (1 | fFarm)
Data: deer
AIC BIC logLik deviance df.resid
Scaled residuals:
Min 1Q Median 3Q Max
Random effects:
Groups Name Variance Std.Dev.
fFarm (Intercept)
Number of obs: 826, groups: fFarm, 24
40 Results Part II
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) **
CLength e-08 ***
fSex **
CLength:fSex **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) CLngth fSex2
CLength
fSex
CLngth:fSx
41 Summary
- Generalized linear models can accommodate non-Gaussian response variables
- It's possible to include fixed and random effects, and then the model is called a generalized linear mixed model
- The syntax for the random effects depends upon the package that's being used for the analysis, so be careful
42 Summary
- Other features can be modeled as well, and you should consult Zuur et al. and the literature if your data include:
  - Temporal autocorrelation
  - Spatial autocorrelation
  - An excess or deficit of individuals in the zero category compared to the expectations of the exponential family of distributions
LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationGeneralized Linear Models in R
Generalized Linear Models in R NO ORDER Kenneth K. Lopiano, Garvesh Raskutti, Dan Yang last modified 28 4 2013 1 Outline 1. Background and preliminaries 2. Data manipulation and exercises 3. Data structures
More informationPackage HGLMMM for Hierarchical Generalized Linear Models
Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General
More informationMODULE 6 LOGISTIC REGRESSION. Module Objectives:
MODULE 6 LOGISTIC REGRESSION Module Objectives: 1. 147 6.1. LOGIT TRANSFORMATION MODULE 6. LOGISTIC REGRESSION Logistic regression models are used when a researcher is investigating the relationship between
More informationGeneralized Linear Models
Generalized Linear Models Methods@Manchester Summer School Manchester University July 2 6, 2018 Generalized Linear Models: a generic approach to statistical modelling www.research-training.net/manchester2018
More informationLinear Modelling: Simple Regression
Linear Modelling: Simple Regression 10 th of Ma 2018 R. Nicholls / D.-L. Couturier / M. Fernandes Introduction: ANOVA Used for testing hpotheses regarding differences between groups Considers the variation
More informationMatched Pair Data. Stat 557 Heike Hofmann
Matched Pair Data Stat 557 Heike Hofmann Outline Marginal Homogeneity - review Binary Response with covariates Ordinal response Symmetric Models Subject-specific vs Marginal Model conditional logistic
More information20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36
20. REML Estimation of Variance Components Copyright c 2018 (Iowa State University) 20. Statistics 510 1 / 36 Consider the General Linear Model y = Xβ + ɛ, where ɛ N(0, Σ) and Σ is an n n positive definite
More informationDuration of Unemployment - Analysis of Deviance Table for Nested Models
Duration of Unemployment - Analysis of Deviance Table for Nested Models February 8, 2012 The data unemployment is included as a contingency table. The response is the duration of unemployment, gender and
More informationA strategy for modelling count data which may have extra zeros
A strategy for modelling count data which may have extra zeros Alan Welsh Centre for Mathematics and its Applications Australian National University The Data Response is the number of Leadbeater s possum
More informationGeneralized Estimating Equations
Outline Review of Generalized Linear Models (GLM) Generalized Linear Model Exponential Family Components of GLM MLE for GLM, Iterative Weighted Least Squares Measuring Goodness of Fit - Deviance and Pearson
More informationBooklet of Code and Output for STAD29/STA 1007 Midterm Exam
Booklet of Code and Output for STAD29/STA 1007 Midterm Exam List of Figures in this document by page: List of Figures 1 Packages................................ 2 2 Hospital infection risk data (some).................
More informationMohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago
Mohammed Research in Pharmacoepidemiology (RIPE) @ National School of Pharmacy, University of Otago What is zero inflation? Suppose you want to study hippos and the effect of habitat variables on their
More informationParametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1
Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson
More informationHow to work correctly statistically about sex ratio
How to work correctly statistically about sex ratio Marc Girondot Version of 12th April 2014 Contents 1 Load packages 2 2 Introduction 2 3 Confidence interval of a proportion 4 3.1 Pourcentage..............................
More informationExploring Hierarchical Linear Mixed Models
Exploring Hierarchical Linear Mixed Models 1/49 Last time... A Greenhouse Experiment testing C:N Ratios Sam was testing how changing the C:N Ratio of soil affected plant leaf growth. He had 3 treatments.
More informationGeneralized Linear Models. stat 557 Heike Hofmann
Generalized Linear Models stat 557 Heike Hofmann Outline Intro to GLM Exponential Family Likelihood Equations GLM for Binomial Response Generalized Linear Models Three components: random, systematic, link
More informationSTA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population
More informationConsider fitting a model using ordinary least squares (OLS) regression:
Example 1: Mating Success of African Elephants In this study, 41 male African elephants were followed over a period of 8 years. The age of the elephant at the beginning of the study and the number of successful
More informationIntroduction to logistic regression
Introduction to logistic regression Tuan V. Nguyen Professor and NHMRC Senior Research Fellow Garvan Institute of Medical Research University of New South Wales Sydney, Australia What we are going to learn
More informationGeneralized Linear Models (GLZ)
Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the
More informationLogistic Regression. 1 Analysis of the budworm moth data 1. 2 Estimates and confidence intervals for the parameters 2
Logistic Regression Ulrich Halekoh, Jørgen Vinslov Hansen, Søren Højsgaard Biometry Research Unit Danish Institute of Agricultural Sciences March 31, 2006 Contents 1 Analysis of the budworm moth data 1
More information