# Stat 642, Lecture notes for 04/12/05 96

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal sized groups according to their predicted probabilities. Then where G 2 HL = 10 j=1 O j E j 2 E j 1 E j /n j χ2 8 n j = Number of observations in the j th group O j = i E j = i y ij = Observed number of cases in the j th group ˆp ij = Expected number of cases in the j th group Example: Using the THS data described earlier 4/7/2005, I added the lackfit option to the model statement in PROC LOGISTIC as follows: model cens28dy = trt age sbpcat aist hctbase / lackfit; The output corresponding to the Hosmer-Lemeshow statistic is: Partition for the Hosmer and Lemeshow Test CENS28DY = 0 CENS28DY = 1 Group Total Observed Expected Observed Expected Hosmer and Lemeshow Goodness-of-Fit Test Chi-Square DF Pr > ChiSq

2 Stat 642, Lecture notes for 04/12/05 97 This output shows the observed and expected numbers of cases and controls within each group, and the final test statistic. The chi-square statistic suggests that there may be lack of fit, so I refit the model including a quadratic term for age: model cens28dy = trt age age sbpcat aist hctbase / lackfit; Partition for the Hosmer and Lemeshow Test CENS28DY = 0 CENS28DY = 1 Group Total Observed Expected Observed Expected Hosmer and Lemeshow Goodness-of-Fit Test Chi-Square DF Pr > ChiSq This model shows no evidence of lack of fit based on the H-L statistic, so apparently any lack of fit has been corrected. Ideally, incorrect model specification such as non-linearity in the predictors or missing predictors should be detectable by this statistic. However, depending on the model, G 2 LH may not be particularly sensitive to departures from the true model. The H-L test does not provide any information regarding the nature of the lack of fit observed. Other procedures may be more useful. Other goodness-of-fit procedures Tsiatis A note on the goodness-of-fit test for the istic regression model, Biometrika, 67: , 1980 proposes a test for goodness-of-fit obtained by grouping observations into k groups according to covariate values. The goodness-of-fit test uses the score test of H 0 that the fitted model is correct, versus the model which adds to the fitted model a categorical variable corresponding to the groupings. For example, for the model above, one could group observations by age categories possibly equally spaced, or of equal size - the groups are somewhat arbitrary, and perform the score test for the addition of these groupings to the model similar to what one would do in a forward selection procedure.

3 Stat 642, Lecture notes for 04/12/05 98 Lin, Wei and Ying Model checking techniques based on cumulative residuals, Biometrics, 58:1-12, 2002 propose assessing goodness-of-fit using moving sums of residuals of the form: W x = i Ix b < X ij < x + by i Ey i where I is one if the argument is true, and zero otherwise, x is an arbitrary value of covariate j and X ij is the value of covariate j for subject i. W x is simply the total observed - expected for all observations with the value covariate j within b units of x. By plotting W x against x for values of x over the range of the covariate, patterns may be apparent which suggest lack of fit. LWY propose a formal test using simulation to assess whether the observed patterns are likely by chance alone. The score test of Tsiatis may also be used to provide a formal test. Polychotomous Responses note that in the interest of time, I skipped over some of the material in this section during class Now we consider the case in which we have more than two levels of outcome variable. Examples Type of Disease Site of Onset e.g. tumor Disease Severity ordered Cause of Death Suppose that we have k + 1 response categories, 0, 1,..., k. Under a multinomial model there are k + 1 event probabilities which sum to one. Hence there are k independent its to model. We consider three approaches to this modeling, all of which produce different results. Log-linear model Continuation Ratios Cumulative Logit Log-linear Model Poisson Sampling model Suppose we have a covariate with levels 0, 1, 2,.... We may construct the following table:

4 Stat 642, Lecture notes for 04/12/05 99 Response k Covariate Values For the i, j cell, we have the -linear model λ ij = α i + β j + γ ij where we impose the constraint that γ 0j = γ i0 = 0 i, j. We could, for example, choose a baseline category for the response, say i = 0. Then λij λ 0j = pij p 0j = α i α 0 + γ ij Then γ ij is the odds ratio for response level i versus level 0 and exposure level j versus level 0. This may be called the baseline it model. Similarly, if the responses are naturally ordered, then it may be reasonable to consider adjacent categories: λi+1,j λ i,j = pi+1,j p i,j = α i+1 α i + γ i+1,j γ i,j = α i + γ i,j Then γij is the odds ratio for response level i + 1 versus level i and exposure level j versus level 0. More generally, we might have In which case pij p 0j λ ij = x T j β i We may compute the probabilities themselves via = x T j β i β 0 = x T j β i. p ij = e x jβ i 1 + l 1 ex jβ l For example, if k = 2, we have: p 0j = e x jβ1 + e x jβ2, p 1j = e x jβ e x jβ 1 + e x jβ 2 p 2j = e x jβ e x jβ 1 + e x jβ 2 Hypothetical example: Suppose we consider death from Heart Disease or Cancer. There is no particular order to these outcomes, so this kind of model may make some sense. In the accompanying

5 Stat 642, Lecture notes for 04/12/ plot, the probabilities of death from heart disease and cancer change as a function of age and this model allows them to change quite independently. Hypothetical Example Death from Heart Disease vs. Cancer Heart Disease Cancer Survival or Other Probability Age Note that if we consider a plot of its for each of heart disease and cancer versus survival we have straight lines: Hypothetical Example - it scale Heart Disease Cancer p i /p 0 Age

6 Stat 642, Lecture notes for 04/12/ Continuation Ratios Take k = 2 for a moment and consider two parallel schemes for constructing the 2 independent its: In General: Scheme A p0j p 1j + p 2j p1j p 2j pij l>i p lj Scheme B p2j p 0j + p 1j p1j p 0j pij l<i p lj If the categories are 0, 1, 2,..., k, then under Scheme A, we first consider the lowest category i = 0 versus all remaining categories. For each category i, i < k, we consider category i versus all higher categories. Under Scheme B, we consider for each category i, i < 0, we consider category i versus all lower categories. For example, if the categories are Alive, Heart Disease Death and Cancer Death, we might consider its for Alive versus Dead and among the deaths Heart Disease Death versus Cancer Death. If we set the i th it equal to x T j β i, we can express the multinomial probabilities of being in a particular category as follows: take k = 3, Scheme A. First we have that next, Similarly, Finally, p 0j = Pr{y j = 0} = ext β e xt β 0 p 1j = Pr{y j = 1} = Pr{y j = 1 y j 1} Pr{y j 1} = e xt β e xt β e xt β 0. p 2j = Pr{y j = 2} = Pr{y j = 2 y j 2} Pr{y j 2 y j 1} Pr{y j 1} = e xt β e xt β e xt β e xt β 0 p 3j = Pr{y j = 3} = Pr{y j = 3 y j 2} Pr{y j 2 y j 1} Pr{y j 1} = e xt β e xt β e xt β 0

7 Stat 642, Lecture notes for 04/12/ In all cases, the p ij are products of conditional probabilities each of which depends one only on one of the β i. Since the likelihood is composed of products of the above p ij, it will also factor into a product of conditional likelihoods, each of which can be independently maximized. Hence, this model can be fit as a series of binary istic regression models. Each binary model is fit by restricting the data set to only those observations with responses in categories i through k under Scheme A or 0 through i under Scheme B and considering response level i versus the remaining categories. Continuation Ratio models give different fitted values than -linear models and Scheme A gives different fitted values than Scheme B. Cumulative Logits. Cumulative Logit models are typically used when we have ordinal responses. That is, we may consider the outcomes as a series of progressively worse or more extreme outcomes. In this case the relevant questions usually involve probabilities of subjects progressing from one category to the next, or probabilities of a subject being above of a given category versus at or lower than a given category. Example: Disease Severity Percent of Subjects None Mild Moderate Severe - + Exposure In the above plot, the size of the none category decreases slightly with exposure. This suggests that exposure increases the risk of disease without regard for severity. On the other hand, the mild category also decreases with exposure, yet this fact by itself is not very meaningful since we don t know whether or not subjects who otherwise would have been in the mild category were moved to the none category or the moderate or worse categories. The conclusion might be quite different depending on which of these happened. More meaningful is the observation that

8 Stat 642, Lecture notes for 04/12/ the combined none and mild categories shrunk, suggesting that the rate of moderate or worse disease increased. This leads us to the cumulative it model: Pr{yj i} = α i + x T j Pr{y j > i} β i. If β i = β i, this is the Proportional Odds model. We also require that α 0 α 1... although this will hold automatically in a proportional odds model. eα i+x T j β i We have Pr{y j i} =. 1 + e α i+x T j β i Note that if β i β i for some i i, then α i + β i x = α i + β i x for some x and the lines will cross. If α i + x T j β i > α i + x T j β i for some i < i then Pr{y j i} > Pr{y j i } which is nonsense. Hence if a cumulative it model is NOT a proportional odds model, there may be particular values of x for which the predicted values are inconsistent. Under the proportional odds model, the shapes of all curves are the same, they are shifted progressively to the left for increasing i. The horizontal distance between adjacent curves is α i α i 1 /β. Pr{y i} i = 0 i = 1 i = 2 Pr{y = 3 x} Pr{y = 2 x} Pr{y = 1 x} Pr{y = 0 x} Odds Ratios: Let the odds of {y j i} be ψ ij = Pr{y j i} Pr{y j > i} = eα i+x T j β x so and ψ ij ψ i j ψ ij ψ ij = e α i α i is independent of j proportional odds = e x j x j T β is independent of i With more than two levels of the outcome variable, SAS fits a proportional odds model and uses the score test for to test for proportionality.

### Multinomial Logistic Regression Models

Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word

### Logistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression

Logistic Regression Usual linear regression (repetition) y i = b 0 + b 1 x 1i + b 2 x 2i + e i, e i N(0,σ 2 ) or: y i N(b 0 + b 1 x 1i + b 2 x 2i,σ 2 ) Example (DGA, p. 336): E(PEmax) = 47.355 + 1.024

### Chapter 5: Logistic Regression-I

: Logistic Regression-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

### 8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

### Longitudinal Modeling with Logistic Regression

Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

### Correlation and regression

1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

### STA6938-Logistic Regression Model

Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of

### Analysing categorical data using logit models

Analysing categorical data using logit models Graeme Hutcheson, University of Manchester The lecture notes, exercises and data sets associated with this course are available for download from: www.research-training.net/manchester

### Generalized Linear Modeling - Logistic Regression

1 Generalized Linear Modeling - Logistic Regression Binary outcomes The logit and inverse logit interpreting coefficients and odds ratios Maximum likelihood estimation Problem of separation Evaluating

### Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 12/15/2008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

### Models for Ordinal Response Data

Models for Ordinal Response Data Robin High Department of Biostatistics Center for Public Health University of Nebraska Medical Center Omaha, Nebraska Recommendations Analyze numerical data with a statistical

### ZERO INFLATED POISSON REGRESSION

STAT 6500 ZERO INFLATED POISSON REGRESSION FINAL PROJECT DEC 6 th, 2013 SUN JEON DEPARTMENT OF SOCIOLOGY UTAH STATE UNIVERSITY POISSON REGRESSION REVIEW INTRODUCING - ZERO-INFLATED POISSON REGRESSION SAS

### 2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 When and why do we use logistic regression? Binary Multinomial Theory behind logistic regression Assessing the model Assessing predictors

### Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

### Comparing IRT with Other Models

Comparing IRT with Other Models Lecture #14 ICPSR Item Response Theory Workshop Lecture #14: 1of 45 Lecture Overview The final set of slides will describe a parallel between IRT and another commonly used

### Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

### Three-Way Contingency Tables

Newsom PSY 50/60 Categorical Data Analysis, Fall 06 Three-Way Contingency Tables Three-way contingency tables involve three binary or categorical variables. I will stick mostly to the binary case to keep

### Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Z-test χ 2 -test Confidence Interval Sample size and power Relative effect

### Investigating Models with Two or Three Categories

Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might

### Statistical Modelling with Stata: Binary Outcomes

Statistical Modelling with Stata: Binary Outcomes Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 21/11/2017 Cross-tabulation Exposed Unexposed Total Cases a b a + b Controls

### 9 Generalized Linear Models

9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

### Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence

Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence Sunil Kumar Dhar Center for Applied Mathematics and Statistics, Department of Mathematical Sciences, New Jersey

### Logistic regression modeling the probability of success

Logistic regression modeling the probability of success Regression models are usually thought of as only being appropriate for target variables that are continuous Is there any situation where we might

### Generalized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model

Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles dose-response example

### Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

### Negative Multinomial Model and Cancer. Incidence

Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence S. Lahiri & Sunil K. Dhar Department of Mathematical Sciences, CAMS New Jersey Institute of Technology, Newar,

### Model Estimation Example

Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

### Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

### Tests for Two Correlated Proportions in a Matched Case- Control Design

Chapter 155 Tests for Two Correlated Proportions in a Matched Case- Control Design Introduction A 2-by-M case-control study investigates a risk factor relevant to the development of a disease. A population

### 12 Generalized linear models

12 Generalized linear models In this chapter, we combine regression models with other parametric probability models like the binomial and Poisson distributions. Binary responses In many situations, we

### ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

### Chapter 1 Statistical Inference

Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

### Linear Regression With Special Variables

Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:

### COMPLEMENTARY LOG-LOG MODEL

COMPLEMENTARY LOG-LOG MODEL Under the assumption of binary response, there are two alternatives to logit model: probit model and complementary-log-log model. They all follow the same form π ( x) =Φ ( α

### Extensions of Cox Model for Non-Proportional Hazards Purpose

PhUSE 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used

### Correlation and simple linear regression S5

Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

### Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X

Chapter 864 Confidence Intervals for the Odds Ratio in Logistic Regression with One Binary X Introduction Logistic regression expresses the relationship between a binary response variable and one or more

### Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal

Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric

### Sections 4.1, 4.2, 4.3

Sections 4.1, 4.2, 4.3 Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1/ 32 Chapter 4: Introduction to Generalized Linear Models Generalized linear

### Models for Binary Outcomes

Models for Binary Outcomes Introduction The simple or binary response (for example, success or failure) analysis models the relationship between a binary response variable and one or more explanatory variables.

### Generalized linear models

Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

### Continuous Time Survival in Latent Variable Models

Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract

### A note on R 2 measures for Poisson and logistic regression models when both models are applicable

Journal of Clinical Epidemiology 54 (001) 99 103 A note on R measures for oisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer

### Dependent Variable Q83: Attended meetings of your town or city council (0=no, 1=yes)

Logistic Regression Kristi Andrasik COM 731 Spring 2017. MODEL all data drawn from the 2006 National Community Survey (class data set) BLOCK 1 (Stepwise) Lifestyle Values Q7: Value work Q8: Value friends

### Research Methodology: Tools

MSc Business Administration Research Methodology: Tools Applied Data Analysis (with SPSS) Lecture 05: Contingency Analysis March 2014 Prof. Dr. Jürg Schwarz Lic. phil. Heidi Bruderer Enzler Contents Slide

### CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline:

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity Outline: 1. NIEHS Uterine Fibroid Study Design of Study Scientific Questions Difficulties 2.

### Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

### Exact McNemar s Test and Matching Confidence Intervals Michael P. Fay April 25,

Exact McNemar s Test and Matching Confidence Intervals Michael P. Fay April 25, 2016 1 McNemar s Original Test Consider paired binary response data. For example, suppose you have twins randomized to two

### Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Today s Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building

### Logistic regression: Miscellaneous topics

Logistic regression: Miscellaneous topics April 11 Introduction We have covered two approaches to inference for GLMs: the Wald approach and the likelihood ratio approach I claimed that the likelihood ratio

### ESP 178 Applied Research Methods. 2/23: Quantitative Analysis

ESP 178 Applied Research Methods 2/23: Quantitative Analysis Data Preparation Data coding create codebook that defines each variable, its response scale, how it was coded Data entry for mail surveys and

### POLI 7050 Spring 2008 March 5, 2008 Unordered Response Models II

POLI 7050 Spring 2008 March 5, 2008 Unordered Response Models II Introduction Today we ll talk about interpreting MNL and CL models. We ll start with general issues of model fit, and then get to variable

### Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s

Chapter 866 Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Introduction Logistic regression expresses the relationship between a binary response variable and one or

### Mediation for the 21st Century

Mediation for the 21st Century Ross Boylan ross@biostat.ucsf.edu Center for Aids Prevention Studies and Division of Biostatistics University of California, San Francisco Mediation for the 21st Century

### Binary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment

BINARY CHOICE MODELS Y ( Y ) ( Y ) 1 with Pr = 1 = P = 0 with Pr = 0 = 1 P Examples: decision-making purchase of durable consumer products unemployment Estimation with OLS? Yi = Xiβ + εi Problems: nonsense

### Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

### BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

### Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models:

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models: Marginal models: based on the consequences of dependence on estimating model parameters.

### Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

### Inference for Binomial Parameters

Inference for Binomial Parameters Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 58 Inference for

### Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October

### Classification: Linear Discriminant Analysis

Classification: Linear Discriminant Analysis Discriminant analysis uses sample information about individuals that are known to belong to one of several populations for the purposes of classification. Based

### Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models

Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 29 14.1 Regression Models

### Linear Regression Models P8111

Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

### Regression models for multivariate ordered responses via the Plackett distribution

Journal of Multivariate Analysis 99 (2008) 2472 2478 www.elsevier.com/locate/jmva Regression models for multivariate ordered responses via the Plackett distribution A. Forcina a,, V. Dardanoni b a Dipartimento

### Hypothesis Testing hypothesis testing approach

Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

### Modelling Binary Outcomes 21/11/2017

Modelling Binary Outcomes 21/11/2017 Contents 1 Modelling Binary Outcomes 5 1.1 Cross-tabulation.................................... 5 1.1.1 Measures of Effect............................... 6 1.1.2 Limitations

### LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers

### Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

### CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or

### STAT5044: Regression and Anova

STAT5044: Regression and Anova Inyoung Kim 1 / 18 Outline 1 Logistic regression for Binary data 2 Poisson regression for Count data 2 / 18 GLM Let Y denote a binary response variable. Each observation

### Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

Lecture : Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 27 Binomial Model n independent trials (e.g., coin tosses) p = probability of success on each trial (e.g., p =! =

### The STS Surgeon Composite Technical Appendix

The STS Surgeon Composite Technical Appendix Overview Surgeon-specific risk-adjusted operative operative mortality and major complication rates were estimated using a bivariate random-effects logistic

### ON GOODNESS-OF-FIT OF LOGISTIC REGRESSION MODEL YING LIU. M.S., University of Rhode Island, 2003 M.S., Henan Normal University, 2000

ON GOODNESS-OF-FIT OF LOGISTIC REGRESSION MODEL by YING LIU M.S., University of Rhode Island, 2003 M.S., Henan Normal University, 2000 AN ABSTRACT OF A DISSERTATION submitted in partial fulfillment of

### VIII. ANCOVA. A. Introduction

VIII. ANCOVA A. Introduction In most experiments and observational studies, additional information on each experimental unit is available, information besides the factors under direct control or of interest.

### Deriving Decision Rules

Deriving Decision Rules Jonathan Yuen Department of Forest Mycology and Pathology Swedish University of Agricultural Sciences email: Jonathan.Yuen@mykopat.slu.se telephone (018) 672369 May 17, 2006 Yuen,

### Tied survival times; estimation of survival probabilities

Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation

### MS&E 226: Small Data

MS&E 226: Small Data Lecture 6: Model complexity scores (v3) Ramesh Johari ramesh.johari@stanford.edu Fall 2015 1 / 34 Estimating prediction error 2 / 34 Estimating prediction error We saw how we can estimate

### Mixture Modeling in Mplus

Mixture Modeling in Mplus Gitta Lubke University of Notre Dame VU University Amsterdam Mplus Workshop John s Hopkins 2012 G. Lubke, ND, VU Mixture Modeling in Mplus 1/89 Outline 1 Overview 2 Latent Class

### EDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.

EDF 7405 Advanced Quantitative Methods in Educational Research Data are available on IQ of the child and seven potential predictors. Four are medical variables available at the birth of the child: Birthweight

### Supplemental Materials. In the main text, we recommend graphing physiological values for individual dyad

1 Supplemental Materials Graphing Values for Individual Dyad Members over Time In the main text, we recommend graphing physiological values for individual dyad members over time to aid in the decision

### Estimated Precision for Predictions from Generalized Linear Models in Sociological Research

Quality & Quantity 34: 137 152, 2000. 2000 Kluwer Academic Publishers. Printed in the Netherlands. 137 Estimated Precision for Predictions from Generalized Linear Models in Sociological Research TIM FUTING

### Lecture 8 Stat D. Gillen

Statistics 255 - Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 8.1 Example of two ways to stratify Suppose a confounder C has 3 levels

### ABSTRACT INTRODUCTION. SESUG Paper

SESUG Paper 140-2017 Backward Variable Selection for Logistic Regression Based on Percentage Change in Odds Ratio Evan Kwiatkowski, University of North Carolina at Chapel Hill; Hannah Crooke, PAREXEL International

### Classification 1: Linear regression of indicators, linear discriminant analysis

Classification 1: Linear regression of indicators, linear discriminant analysis Ryan Tibshirani Data Mining: 36-462/36-662 April 2 2013 Optional reading: ISL 4.1, 4.2, 4.4, ESL 4.1 4.3 1 Classification

### Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

Title stata.com logistic postestimation Postestimation tools for logistic Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

### Lecture 16 Solving GLMs via IRWLS

Lecture 16 Solving GLMs via IRWLS 09 November 2015 Taylor B. Arnold Yale Statistics STAT 312/612 Notes problem set 5 posted; due next class problem set 6, November 18th Goals for today fixed PCA example

### Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

### Introduction to Survey Analysis!

Introduction to Survey Analysis! Professor Ron Fricker! Naval Postgraduate School! Monterey, California! Reading Assignment:! 2/22/13 None! 1 Goals for this Lecture! Introduction to analysis for surveys!

### 1 Introduction A common problem in categorical data analysis is to determine the effect of explanatory variables V on a binary outcome D of interest.

Conditional and Unconditional Categorical Regression Models with Missing Covariates Glen A. Satten and Raymond J. Carroll Λ December 4, 1999 Abstract We consider methods for analyzing categorical regression

### Lecture 11 Multiple Linear Regression

Lecture 11 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 11-1 Topic Overview Review: Multiple Linear Regression (MLR) Computer Science Case Study 11-2 Multiple Regression

### Analytic Methods for Applied Epidemiology: Framework and Contingency Table Analysis

Analytic Methods for Applied Epidemiology: Framework and Contingency Table Analysis 2014 Maternal and Child Health Epidemiology Training Pre-Training Webinar: Friday, May 16 2-4pm Eastern Kristin Rankin,

### Pseudo-score confidence intervals for parameters in discrete statistical models

Biometrika Advance Access published January 14, 2010 Biometrika (2009), pp. 1 8 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asp074 Pseudo-score confidence intervals for parameters

### Attributable Risk Function in the Proportional Hazards Model

UW Biostatistics Working Paper Series 5-31-2005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu

### SAS Analysis Examples Replication C8. * SAS Analysis Examples Replication for ASDA 2nd Edition * Berglund April 2017 * Chapter 8 ;

SAS Analysis Examples Replication C8 * SAS Analysis Examples Replication for ASDA 2nd Edition * Berglund April 2017 * Chapter 8 ; libname ncsr "P:\ASDA 2\Data sets\ncsr\" ; data c8_ncsr ; set ncsr.ncsr_sub_13nov2015

### STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

### Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula.

FACULTY OF PSYCHOLOGY AND EDUCATIONAL SCIENCES Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula. Modern Modeling Methods (M 3 ) Conference Beatrijs Moerkerke