Lecture 14: Introduction to Poisson Regression

Size: px
Start display at page:

Download "Lecture 14: Introduction to Poisson Regression"

Transcription

1 Lecture 14: Introduction to Poisson Regression Ani Manichaikul 8 May / 52

2 Overview Modelling counts Contingency tables Poisson regression models 2 / 52

3 Modelling counts I Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week Number of coughs in a concert hall, per minute Number of customers arriving in a shop, daily 3 / 52

4 Modelling counts II Counts show up in all kinds of scenarios in our every day life In public health, count data are particularly important in measuring disease rates: We often count the number of people acquiring a particular disease in a given year Modelling counts gives us a framework in which to estimate disease incidence Poisson regression will allow us to measure association between incidence, and risk factors of interest adjusting for other related covariates 4 / 52

5 A whole new model to deal with counts? I So far we have talked about: Linear regression: for normally distributed errors Logistic regression: for binomial distributed errors Typically, our count data do not follow either of these distributions Counts are not binary (0 / 1) Counts are discrete, not continuous Counts typically have a right skewed distribution 5 / 52

6 A whole new model to deal with counts? II So far, the regression strategies we ve discussed allow us to model: Expected values and expected increase in linear regression Log odds or log odds ratios in logistic regression In modelling counts, we are typically more interested in Incidence rates Incidence ratios (when comparing across levels of a risk factor) Poisson regression will provide us with a framework to handle counts properly! 6 / 52

7 Poisson Assumptions Before going on to talk about Poisson regression, let s revisit some key concepts about the Poisson distribution: The occurrences of a random event in an interval of time are independent In theory, an infinite number of occurrences of the event are possible (though perhaps rare) within the interval In any extremely small portion of the interval, the probability of more than one occurrence of the event is approximately zero 7 / 52

8 Poisson Probability I The probability of x occurrence of an event in an interval is: P(X = x) = e λ λ x x!, x = 0, 1, 2,... where λ = the expected number of occurrences in the interval e = a constant ( 2.718) For the Poisson distribution: mean = variance = λ Mode = floor (λ), largest integer less than λ We can also think of λ as the rate parameter 8 / 52

9 Poisson Probability II Poisson probability mass function for λ = 5 Poisson probablity, λ=5 Pr(X=x) x 9 / 52

10 Poisson Probability III Poisson probability mass function for λ = 30 Poisson probablity, λ=30 Pr(X=x) x 10 / 52

11 Poisson and Binomial The Poisson distribution can be used to approximate a binomial distribution when: n is large and p is very small, or np = λ is fixed, and n becomes infinitely large 11 / 52

12 Example: Cancer in a large population Yearly cases of esophageal cancer in a large city 30 cases observed in 1990 P(X = 30) = e λ λ 30 30! λ = yearly average number of cases of esophageal cancer 12 / 52

13 Example: Belief in Afterlife I Men and women were asked whether or not they believed in afterlife (General Social Survey 1991) Possible responses were: yes, no, or unsure Y N or U M F Total / 52

14 Example: Belief in Afterlife II Question: Is belief in the afterlife independent of gender? We could address this question using a χ 2 test To perform a χ 2 test: Begin by stating our model: independece Then, we calculate expected cell counts for each entry in the table 14 / 52

15 Example: Belief in Afterlife III Under an independence model, the probability of belief in afterlife for both genders would be estimated as number who believe ˆp = P(belief in afterlife) = total number asked = Then, we calculate the expected counts as: E(males answering yes) = # men asked ˆp = = 432 E(females answering yes) = # women asked ˆp = = / 52

16 Example: Belief in Afterlife IV The observed and expected cell counts are as follows: Y N or U M 435 (432) 147 (150) 582 F 375 (378) 134 (131) 509 Total Then, the χ 2 statistic is calculated as: i,j (O ij E ij ) 2 E ij = ( ) ( )2 150 ( ) = ( ) / 52

17 Example: Belief in Afterlife V Remeber, we are testing the null hypothesis of independence, versus the alternative that the proportion believing in afterlife differs across genders: H 0 : H a : p male = p female = p overall p male p female We decided to perform the hypothesis test using a χ 2 test. Here, the appropriate degrees of freedom are: (number of rows 1)(number of columns 1) = (2 1)(2 1) = 1 17 / 52

18 Example: Belief in Afterlife VI Let s test the hypothesis at level α = Look up the appropriate critical value in R: > qchisq(1-0.05, df=1) [1] Our observed χ 2 statistic of is much smaller than the critical value We can also look up the corresponding p-value for our test: > pchisq(0.173, df=1, lower.tail=f) [1] Conclusion: fail to reject the null hypothesis 18 / 52

19 Example: Belief in Afterlife Poisson Model I Just now, we had to calculate the expected counts to perform the χ 2 test Actually, we could have written down a linear model to express the expected counts systematically Make use of the Poisson approximation to binomial Model: Y ij Poisson(λ ij ) λ ij = λ α male γresponse Here we interpret λ ij as the Poisson rate for the cell in the i th row and j th column λ is the baseline rate, α is the male effect, and γ is the response 19 / 52

20 Example: Belief in Afterlife Poisson Model II Recall that the log of a product is equal to the sum of the log: log(a b) = log(a) + log(b) So, let s log-transform our model... originally we had: λ ij = λ α male γresponse Taking the log of both sides, we have: log(λ ij ) = log λ + log(α male ) + log(γ response ) This is looking like a linear model / 52

21 Example: Belief in Afterlife Poisson Model III We stated the systematic portion of this model as log(λ ij ) = log λ + log(α male ) + log(γ response ) which we can also write using β s to look like other linear models: log(λ ij ) = β 0 + β 1 I (male) + β 2 I (response) The probabilistic portion of this model enters as: Y ij Poisson(λ ij ) 21 / 52

22 Example: Belief in Afterlife Poisson Model IV In this Poisson regression model: The outcome is the log of the expected cell count The baseline β 0 is the log expected cell count for females responding no β 1 is the increase in log expected cell count for males compared to females β 2 is the increase in log expected cell count for the response yes compared to no 22 / 52

23 Fitting the afterlife model in R I Data format We began with the view of our data in a table Y N or U M 435 (432) 147 (150) 582 F 375 (378) 134 (131) 509 Total However, R expects to see unique observations listed, together with relevant covariates (indicators here), as follows: count male yes / 52

24 Fitting the afterlife model in R II Once the data is entered in R, we can analyze it using the glm command, specifying the family as poisson: > summary(out<-glm(count ~ male + yes, family=poisson)) Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) <2e-16 *** male * yes <2e-16 *** --- Signif. codes: 0 *** ** 0.01 * Null deviance: on 3 degrees of freedom Residual deviance: on 1 degrees of freedom AIC: / 52

25 Fitting the afterlife model in R III So we fit the model: log(λ ij ) = β 0 + β 1 I (male) + β 2 I (response) and our fitted model is: log(λ ij ) = I (male) I (response) 25 / 52

26 Predicting log expected cell counts I Using the fitted model: log(λ ij ) = β 0 + β 1 I (male) + β 2 I (response) we can get predicted values for log counts in each of the four cells: For females responding no : log(e(count female, no)) = = 4.88 For males responding no : log(e(count male, no)) = = / 52

27 Predicting log expected cell counts II Using the fitted model: log(λ ij ) = β 0 + β 1 I (male) + β 2 I (response) we can get predicted values for log counts in each the four cells: For females responding yes : log(e(count female, yes)) = = 5.94 For males responding yes : log(e(count male, yes)) = = / 52

28 Predicting expected cell counts I Using the fitted model: log(λ ij ) = β 0 + β 1 I (male) + β 2 I (response) we can get predicted values for counts in each of the four cells: For females responding no : E(count female, no) = exp(4.88) = 131 For males responding no : E(count male, no) = exp( ) = exp(5.01) = / 52

29 Predicting expected cell counts II Using the fitted model: log(λ ij ) = β 0 + β 1 I (male) + β 2 I (response) we can get predicted values for counts in each of the four cells: For females responding yes : E(count female, yes) = exp( ) = exp(5.94) = 378 For males responding yes : E(count male, yes) = exp( ) = exp(6.07) = / 52

30 Comparing model fit in R with our fit by hand When we did our χ 2 test, we calculated the expected cell counts by hand: Estimate ˆp = , probability of responding yes under the independence model ˆp (or 1 ˆp) by the number of males (or females) to get the expected number in each cell Expected cell counts represented the numbers we should have gotten under a perfect independence model We obtained the expected cell counts: Y N or U M F Total which are exactly what we got by Poisson regression! 30 / 52

31 Afterlife coefficients I By fitting the independence model, we force the relative rate of responding yes versus no to the question of belief in the afterline to be fixed across males and females Deviation from the independence model suggests the proportion of those believing in afterlife differs by gender 31 / 52

32 Afterlife coefficients II Under the independence model: β 0 = 4.88 is the log expected count of females responding no, the baseline group β 1 = is the difference in log expected counts comparing males to females β 2 = 1.05 is the difference in log expected counts for yes responses compared to no responses 32 / 52

33 Afterlife coefficients III Under the independence model: exp(β 0 ) = is the expected count for females responding no, the baseline group expβ 1 = 1.14 is the ratio comparing the counts of males to females exp(β 2 )=2.85 is the ratio of the number of yes responses compared to no responses 33 / 52

34 A regression model for count data Contingency table: regression model Suppose we have r categories on one axis and c categories on the other We can think of an r c contingency table with r rows and c columns Independence model: log(e(y ij )) = log(λ ij ) = log(λ) + log(α i ) + log(γ j ) To test independence, we write out χ 2 test statistic as: (O ij E ij ) 2 i,j E ij which is distributed as χ 2 (r 1)(c 1) of independence under the null hypothesis 34 / 52

35 Relation to other types of regression Both logistic and poisson regression are types of linear models Extend ideas from linear regression with continuous outcomes to: binary outcomes (logistic regression) count data (Poisson regression) We say logistic regression uses the logit link Poisson regression uses a log link Poisson models is also called log-linear models Regression Outcome Link Link type Distribution name function Linear Normal identity f (µ) = µ Logistic Binary logit logit(p) = log( p 1 p ) Log-linear Poisson log log(p) 35 / 52

36 Interpretation I Under the model log(e(y ij )) = log(λ ij ) = log(λ) + log(α i ) + log(γ j ) we can interpret: λ ij is the predicted rate for the i th row and j th column of our contingency table λ is the baseline rate 36 / 52

37 Interpretation II Under the model log(e(y ij )) = log(λ ij ) = log(λ) + log(α i ) + log(γ j ) we can interpret: α i is the rate ratio, the effect of being in row i compared to baseline (and holding column fixed) γ j is the rate ratio, the effect of being in column j compared to baseline (and holding row fixed) The rate ratio is the factor by which the baseline is multiplied to get the rate, or expected cell count, for our particular non-baseline category 37 / 52

38 Example: Customers at a lumber company I Outcome: Y = number of customers visiting store from region Predictors: X 1 : number of housing units in region X 2 : average household income X 3 : average housing unit age in region X 4 : distance to nearest competitor X 5 : average distance to store in miles Counts are obtained for 110 regions, so our n= / 52

39 Example: Customers at a lumber company II Poisson (log-linear) regression model Given observations and covariates: Y i, X ij, 1 i n, 1 j p Systematic component of model: log(λ i ) = β 0 + β 1 X 1 + β 2 X 2 + β 3 X 3 + β 4 X 4 + β 5 X 5 Probabilistic component of model: Y i X i Poisson(λ i ) 39 / 52

40 Example: Customers at a lumber company III Here s a look at some of the data: Customers Housing Income Age Competitor Store / 52

41 The Distribution of Customer counts Histogram of Customers The customer counts distribution is clearly not normally distributed Linear regression would not work well here Log-linear regression will work just fine Frequency Customers 41 / 52

42 The fitted model I > summary(lumber.glm <- glm(customers ~ Housing + Income +Age +Competitor +Store, family=poisson()) ) Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) 2.942e e < 2e-16 *** Housing 6.058e e e-05 *** Income e e e-08 *** Age e e * Competitor 1.684e e e-11 *** Store e e e-15 *** --- Signif. codes: 0 *** ** 0.01 * Null deviance: on 109 degrees of freedom Residual deviance: on 104 degrees of freedom 42 / 52

43 The fitted model II We wrote down the model: log(λ i ) = β 0 + β 1 X 1 + β 2 X 2 + β 3 X 3 + β 4 X 4 + β 5 X 5 and obtained the parameter estimates in R to be: log(λ i ) = Housing i Income i β 3 Age i Competitor i 0.129Store i 43 / 52

44 Interpretation I We interpret ˆβ 0 = 2.94 as the baseline log expected count, or log rate, in the group with all covariates (housing, income, age, distance to nearest competitor, and average distance to store) set to zero exp( ˆβ 0 ) = exp(2.94) = 18.9 is the expected count of customers in the baseline group This baseline value does not quite make sense It may be helpful to center our covariates, but this is not a big deal if we don t care about baseline because our primary inference is about the increase with respect to covariates 44 / 52

45 Interpretation II We interpret ˆβ 1 = as the increase in log expected count, or the log rate ratio comparing districts whose number of housing units differ by one, adjusting for other covariates exp( ˆβ 1 ) = exp( ) = is the rate ratio comparing districts whose mean housing units differ by one, adjusting for other covariates exp(100 ˆβ 1 ) = exp( ) = is the rate ratio comparing districts whose mean housing units differ by one hundred Keeping other factors constant, a 100 unit increase in housing units, would yield an expected 6.2% increase in customer count 45 / 52

46 Interpretation III Although our reported estimate for β 1 is very small, so is the standard error ( ) The z-statistic reported in R is 4.26, with a p-value of We can conclude that there is a statistically significant association between the customer count and mean housing units in the region We could have calculated this z-statistic ourselves as This z-statistic is also called a Wald statistic ˆβ 1 SE( ˆβ 1 ) 46 / 52

47 Interpretation IV We interpret ˆβ 2 = to be the increase in log expected count, or the log rate ratio comparing districts whose mean income differs by $1, adjusting for other covariates exp( ˆβ 2 ) = exp( ) = is the relative rate comparing regions whose mean income differs by $1, adjusting for other covariates 47 / 52

48 Interpretation V exp(10000 ˆβ 2 ) = exp( ) = is the relative rate comparing regions whose mean income differs by $10000, adjusting for other covariates Keeping other factors constant, the relative rate comparing regions whose average income differs by $10000 is For a $10000 change in average income, we would predict an 11% decrease in the rate of customers visiting the store 48 / 52

49 Interpretation VI A 95% confidence interval for β 2 is ˆβ 2 ± 1.96 SE( ˆβ 2 ) = ± = ( , ) A 95% confidence interval for exp(β 2 ) is e ( , ) = ( , ) A 95% confidence interval for exp(10000β 2 ) is e (10000 ( ),10000 ( )) = (0.854, 0.927) 49 / 52

50 Interpretation VII The 95% confidence interval for the relative rate of customers visiting the store is (0.854, 0.927) for a $10,000 increase in average income for the region Question: Based on this model, if we are going to choose a location to build a new store, should we choose areas with higher or lower income? Does it matter? 50 / 52

51 Summary I Poisson regression gives us a framework in which to build models for count data It is a special case of generalized linear models, so it is closely related to linear and logistic regression modelling All of the same modelling techniques will carry over from linear regression: Adjustment for confounding Allowing for effect modification by fitting interactions Splines and polynomial terms 51 / 52

52 Summary II Can test significance of regression coefficients using z-statistics (also called Wald statistic) Since Poisson regression specifies a model, we can calculate likelihood likelihood ratio tests will apply for testing nested models Will need a new interpretation for regression coefficients: Intercept term interpreted as log rate for baseline Coefficients for covariates interpreted as log rate ratio, or log relative rate Exponentiate to get rid of the logs 52 / 52

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Log-linear Models for Contingency Tables

Log-linear Models for Contingency Tables Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014 LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

BMI 541/699 Lecture 22

BMI 541/699 Lecture 22 BMI 541/699 Lecture 22 Where we are: 1. Introduction and Experimental Design 2. Exploratory Data Analysis 3. Probability 4. T-based methods for continous variables 5. Power and sample size for t-based

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Modeling Overdispersion

Modeling Overdispersion James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 1 Introduction 2 Introduction In this lecture we discuss the problem of overdispersion in

More information

Multinomial Logistic Regression Models

Multinomial Logistic Regression Models Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word

More information

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model 1 Linear Regression 2 Linear Regression In this lecture we will study a particular type of regression model: the linear regression model We will first consider the case of the model with one predictor

More information

Chapter 22: Log-linear regression for Poisson counts

Chapter 22: Log-linear regression for Poisson counts Chapter 22: Log-linear regression for Poisson counts Exposure to ionizing radiation is recognized as a cancer risk. In the United States, EPA sets guidelines specifying upper limits on the amount of exposure

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part III Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

Introduction to the Analysis of Tabular Data

Introduction to the Analysis of Tabular Data Introduction to the Analysis of Tabular Data Anthropological Sciences 192/292 Data Analysis in the Anthropological Sciences James Holland Jones & Ian G. Robertson March 15, 2006 1 Tabular Data Is there

More information

Logistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression

Logistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression Logistic Regression Usual linear regression (repetition) y i = b 0 + b 1 x 1i + b 2 x 2i + e i, e i N(0,σ 2 ) or: y i N(b 0 + b 1 x 1i + b 2 x 2i,σ 2 ) Example (DGA, p. 336): E(PEmax) = 47.355 + 1.024

More information

STA102 Class Notes Chapter Logistic Regression

STA102 Class Notes Chapter Logistic Regression STA0 Class Notes Chapter 0 0. Logistic Regression We continue to study the relationship between a response variable and one or more eplanatory variables. For SLR and MLR (Chapters 8 and 9), our response

More information

Truck prices - linear model? Truck prices - log transform of the response variable. Interpreting models with log transformation

Truck prices - linear model? Truck prices - log transform of the response variable. Interpreting models with log transformation Background Regression so far... Lecture 23 - Sta 111 Colin Rundel June 17, 2014 At this point we have covered: Simple linear regression Relationship between numerical response and a numerical or categorical

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Cohen s s Kappa and Log-linear Models

Cohen s s Kappa and Log-linear Models Cohen s s Kappa and Log-linear Models HRP 261 03/03/03 10-11 11 am 1. Cohen s Kappa Actual agreement = sum of the proportions found on the diagonals. π ii Cohen: Compare the actual agreement with the chance

More information

Generalised linear models. Response variable can take a number of different formats

Generalised linear models. Response variable can take a number of different formats Generalised linear models Response variable can take a number of different formats Structure Limitations of linear models and GLM theory GLM for count data GLM for presence \ absence data GLM for proportion

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

Section Poisson Regression

Section Poisson Regression Section 14.13 Poisson Regression Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 26 Poisson regression Regular regression data {(x i, Y i )} n i=1,

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Z-test χ 2 -test Confidence Interval Sample size and power Relative effect

More information

Lecture 2: Probability and Distributions

Lecture 2: Probability and Distributions Lecture 2: Probability and Distributions Ani Manichaikul amanicha@jhsph.edu 17 April 2007 1 / 65 Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions info

More information

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

More information

9 Generalized Linear Models

9 Generalized Linear Models 9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

More information

Regression so far... Lecture 21 - Logistic Regression. Odds. Recap of what you should know how to do... At this point we have covered: Sta102 / BME102

Regression so far... Lecture 21 - Logistic Regression. Odds. Recap of what you should know how to do... At this point we have covered: Sta102 / BME102 Background Regression so far... Lecture 21 - Sta102 / BME102 Colin Rundel November 18, 2014 At this point we have covered: Simple linear regression Relationship between numerical response and a numerical

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

Lecture 10: Introduction to Logistic Regression

Lecture 10: Introduction to Logistic Regression Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial

More information

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011) Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

More information

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities

More information

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Logistic Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Logistic Regression 1 / 38 Logistic Regression 1 Introduction

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Logistic Regression Some slides from Craig Burkett STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Titanic Survival Case Study The RMS Titanic A British passenger liner Collided

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression 22s:52 Applied Linear Regression Ch. 4 (sec. and Ch. 5 (sec. & 4: Logistic Regression Logistic Regression When the response variable is a binary variable, such as 0 or live or die fail or succeed then

More information

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: ) NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3

More information

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about

More information

We know from STAT.1030 that the relevant test statistic for equality of proportions is:

We know from STAT.1030 that the relevant test statistic for equality of proportions is: 2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which

More information

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20

Logistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20 Logistic regression 11 Nov 2010 Logistic regression (EPFL) Applied Statistics 11 Nov 2010 1 / 20 Modeling overview Want to capture important features of the relationship between a (set of) variable(s)

More information

Today. HW 1: due February 4, pm. Aspects of Design CD Chapter 2. Continue with Chapter 2 of ELM. In the News:

Today. HW 1: due February 4, pm. Aspects of Design CD Chapter 2. Continue with Chapter 2 of ELM. In the News: Today HW 1: due February 4, 11.59 pm. Aspects of Design CD Chapter 2 Continue with Chapter 2 of ELM In the News: STA 2201: Applied Statistics II January 14, 2015 1/35 Recap: data on proportions data: y

More information

Chapter 26: Comparing Counts (Chi Square)

Chapter 26: Comparing Counts (Chi Square) Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces

More information

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

More information

Chapter 20: Logistic regression for binary response variables

Chapter 20: Logistic regression for binary response variables Chapter 20: Logistic regression for binary response variables In 1846, the Donner and Reed families left Illinois for California by covered wagon (87 people, 20 wagons). They attempted a new and untried

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3 4 5 6 Full marks

More information

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 26 May :00 16:00

Two Hours. Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER. 26 May :00 16:00 Two Hours MATH38052 Mathematical formula books and statistical tables are to be provided THE UNIVERSITY OF MANCHESTER GENERALISED LINEAR MODELS 26 May 2016 14:00 16:00 Answer ALL TWO questions in Section

More information

Inference and Regression

Inference and Regression Name Inference and Regression Final Examination, 2015 Department of IOMS This course and this examination are governed by the Stern Honor Code. Instructions Please write your name at the top of this page.

More information

Discrete Multivariate Statistics

Discrete Multivariate Statistics Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

R Hints for Chapter 10

R Hints for Chapter 10 R Hints for Chapter 10 The multiple logistic regression model assumes that the success probability p for a binomial random variable depends on independent variables or design variables x 1, x 2,, x k.

More information

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories

More information

Lecture 4 Multiple linear regression

Lecture 4 Multiple linear regression Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science

UNIVERSITY OF TORONTO Faculty of Arts and Science UNIVERSITY OF TORONTO Faculty of Arts and Science December 2013 Final Examination STA442H1F/2101HF Methods of Applied Statistics Jerry Brunner Duration - 3 hours Aids: Calculator Model(s): Any calculator

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X

Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X Chapter 864 Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X Introduction Logistic regression expresses the relationship between a binary response variable and one or more

More information

McGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper

McGill University. Faculty of Science. Department of Mathematics and Statistics. Statistics Part A Comprehensive Exam Methodology Paper Student Name: ID: McGill University Faculty of Science Department of Mathematics and Statistics Statistics Part A Comprehensive Exam Methodology Paper Date: Friday, May 13, 2016 Time: 13:00 17:00 Instructions

More information

Hypothesis testing. Data to decisions

Hypothesis testing. Data to decisions Hypothesis testing Data to decisions The idea Null hypothesis: H 0 : the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that that distribution, the

More information

Classification. Chapter Introduction. 6.2 The Bayes classifier

Classification. Chapter Introduction. 6.2 The Bayes classifier Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

More information

Modeling the Mean: Response Profiles v. Parametric Curves

Modeling the Mean: Response Profiles v. Parametric Curves Modeling the Mean: Response Profiles v. Parametric Curves Jamie Monogan University of Georgia Escuela de Invierno en Métodos y Análisis de Datos Universidad Católica del Uruguay Jamie Monogan (UGA) Modeling

More information

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials Lecture : Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 27 Binomial Model n independent trials (e.g., coin tosses) p = probability of success on each trial (e.g., p =! =

More information

Logistic Regressions. Stat 430

Logistic Regressions. Stat 430 Logistic Regressions Stat 430 Final Project Final Project is, again, team based You will decide on a project - only constraint is: you are supposed to use techniques for a solution that are related to

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s

Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Chapter 866 Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Introduction Logistic regression expresses the relationship between a binary response variable and one or

More information

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form: Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic

More information

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p ) Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

More information

Regression Methods for Survey Data

Regression Methods for Survey Data Regression Methods for Survey Data Professor Ron Fricker! Naval Postgraduate School! Monterey, California! 3/26/13 Reading:! Lohr chapter 11! 1 Goals for this Lecture! Linear regression! Review of linear

More information

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game.

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game. EdPsych/Psych/Soc 589 C.J. Anderson Homework 5: Answer Key 1. Probelm 3.18 (page 96 of Agresti). (a) Y assume Poisson random variable. Plausible Model: E(y) = µt. The expected number of arrests arrests

More information

Poisson regression: Further topics

Poisson regression: Further topics Poisson regression: Further topics April 21 Overdispersion One of the defining characteristics of Poisson regression is its lack of a scale parameter: E(Y ) = Var(Y ), and no parameter is available to

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

Logistic Regression - problem 6.14

Logistic Regression - problem 6.14 Logistic Regression - problem 6.14 Let x 1, x 2,, x m be given values of an input variable x and let Y 1,, Y m be independent binomial random variables whose distributions depend on the corresponding values

More information

Exercise I.1 I.2 I.3 I.4 II.1 II.2 III.1 III.2 III.3 IV.1 Question (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Answer

Exercise I.1 I.2 I.3 I.4 II.1 II.2 III.1 III.2 III.3 IV.1 Question (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Answer Solutions to Exam in 02402 December 2012 Exercise I.1 I.2 I.3 I.4 II.1 II.2 III.1 III.2 III.3 IV.1 Question (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) Answer 3 1 5 2 5 2 3 5 1 3 Exercise IV.2 IV.3 IV.4 V.1

More information

Generalized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model

Generalized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles dose-response example

More information

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). STAT 515 -- Chapter 13: Categorical Data Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure). Many studies allow for more than 2 categories. Example

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Prof. J. Taylor You may use your 4 single-sided pages of notes This exam is 7 pages long. There are 4 questions, first 3 worth 10

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Thursday, August 30, 2018 Work all problems. 60 points are needed to pass at the Masters Level and 75

More information

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013 Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not

More information

Lecture 1 Introduction to Multi-level Models

Lecture 1 Introduction to Multi-level Models Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

Homework 1 Solutions

Homework 1 Solutions 36-720 Homework 1 Solutions Problem 3.4 (a) X 2 79.43 and G 2 90.33. We should compare each to a χ 2 distribution with (2 1)(3 1) 2 degrees of freedom. For each, the p-value is so small that S-plus reports

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Lecture 2: Categorical Variable. A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti

Lecture 2: Categorical Variable. A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti Lecture 2: Categorical Variable A nice book about categorical variable is An Introduction to Categorical Data Analysis authored by Alan Agresti 1 Categorical Variable Categorical variable is qualitative

More information

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model

More information

Logistic regression analysis. Birthe Lykke Thomsen H. Lundbeck A/S

Logistic regression analysis. Birthe Lykke Thomsen H. Lundbeck A/S Logistic regression analysis Birthe Lykke Thomsen H. Lundbeck A/S 1 Response with only two categories Example Odds ratio and risk ratio Quantitative explanatory variable More than one variable Logistic

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

Ron Heck, Fall Week 3: Notes Building a Two-Level Model

Ron Heck, Fall Week 3: Notes Building a Two-Level Model Ron Heck, Fall 2011 1 EDEP 768E: Seminar on Multilevel Modeling rev. 9/6/2011@11:27pm Week 3: Notes Building a Two-Level Model We will build a model to explain student math achievement using student-level

More information