Chapter 1. Modeling Basics
Outline
- What is a model?
- Model equation and probability distribution
- Types of model effects
- Writing models in matrix form
- Summary
What is a statistical model?
A model is a mathematical description of the processes we think give rise to the observations in a set of data.
Minimal elements of a statistical model:
- Observation
- Systematic part: describes the presumed impact of explanatory variables (the observation mean).
- Random part: describes the probability distributions associated with aspects of the process we assume to be characterized by random variation (the distribution of observations).
Explanatory variable
- Fixed effect: parameter of interest (mean, beta coefficient)
- Random effect: repeated measurement, multi-level design
Response
- Gaussian data
- Non-Gaussian data
  - Categorical: binomial/multinomial
  - Count: Poisson/negative binomial
  - Continuous: lognormal, proportion, beta
  - Time to event
From linear model to generalized linear mixed model
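Acronyms
- LM: linear model
- GLM: generalized linear model
- LMM: linear mixed model
- GLMM: generalized linear mixed model
SAS procedures by model effects and response type
- Fixed effects only, Gaussian response: LM (PROC GLM + all others)
- Fixed effects only, non-Gaussian response: GLM (PROC GENMOD, PROC GLIMMIX)
- Mixed effects, Gaussian response: LMM (PROC MIXED, PROC GLIMMIX)
- Mixed effects, non-Gaussian response: GLMM (PROC GLIMMIX)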
Example I: cell means model
$$y_{ij} = \mu_i + e_{ij}$$
where
- $i = 1, 2$ denotes the treatment
- $j = 1, \ldots, n_i$ indexes the observations on the i-th treatment ($n_i$ is the number of observations on that treatment)
- $y_{ij}$ denotes the j-th observation on the i-th treatment
- $\mu_i$ denotes the mean of the i-th treatment
- $e_{ij}$ denotes random error / random variation
$\mu_i$: observation mean (systematic part)
$e_{ij}$: random part; we typically assume the $e_{ij}$ are i.i.d. $N(0, \sigma^2)$.
Example II: linear regression
$$y_{ij} = \beta_0 + \beta_1 X_i + e_{ij}$$
where
- $\beta_0$: the intercept
- $\beta_1$: the slope
- $X_i$: the value of the predictor
$\beta_0 + \beta_1 X_i$: observation mean (systematic part)
$e_{ij}$: random part; we typically assume the $e_{ij}$ are i.i.d. $N(0, \sigma^2)$.
Remarks
George Box: "All models are wrong but some are useful."
i. There is much about each subject that we do not know or choose not to pursue.
ii. Instead, we settle for approximating the variation among subjects with the Gaussian probability distribution. It is wrong, but good enough for many situations.
Two model forms
Model equation
- Basic form: observation = systematic part + random part
- See the comparison of two means (Example I) and regression (Example II).
Cell mean vs effect model
- Cell mean model: $y_{ij} = \mu_i + e_{ij}$
- Effect model: $y_{ij} = \mu + \tau_i + e_{ij}$, where $\mu_i = \mu + \tau_i$ (typically with a constraint such as $\sum_i \tau_i = 0$ so that the effects are identifiable)
Probability distribution form
- As long as we assume a Gaussian distribution, Example I can be expressed as $y_{ij} \sim N(\mu_i, \sigma^2)$.
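Weakness of the model equation form
Example: for each $y_{ij}$ we observe $N_{ij}$ outcomes coded 0/1 (failure/success), and $y_{ij}$ is the number of successes.
Assumption: $y_{ij} \sim B(N_{ij}, \pi_i)$, where $\pi_i$ is the probability of a success.
This cannot be written in the model equation form (observation = systematic part + random part).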
Analysis with linear regression
- $p_{ij} = y_{ij}/N_{ij} = \beta_0 + \beta_1 X_i + e_{ij}$, with $e_{ij} \sim N(0, \sigma^2)$
i. Histograms of simulated data from repeated sampling of a binomial distribution with N = 100 and π between 0.1 and 0.9 are virtually indistinguishable from the normal distribution, and in such a case the estimates of β from this approach are close to the coefficients of logistic regression.
ii. However, for a binomial distribution the variance of the response depends on the probability of success, whereas linear regression assumes the same variance for every observation.
iii. At $X_i = 0$ the fitted line $\hat{p}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ gives a negative predicted proportion, which is impossible for a probability.
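Point ii follows directly from the binomial moments. A short worked derivation, assuming only $y_{ij} \sim B(N_{ij}, \pi_i)$ as stated earlier:

```latex
\begin{align*}
E(y_{ij}) &= N_{ij}\,\pi_i, &
\mathrm{Var}(y_{ij}) &= N_{ij}\,\pi_i(1-\pi_i), \\
E\!\left(\frac{y_{ij}}{N_{ij}}\right) &= \pi_i, &
\mathrm{Var}\!\left(\frac{y_{ij}}{N_{ij}}\right) &= \frac{\pi_i(1-\pi_i)}{N_{ij}}.
\end{align*}
```

For instance, with $N_{ij} = 100$ the variance of the observed proportion is 0.0009 at $\pi_i = 0.1$ and 0.0025 at $\pi_i = 0.5$, so a single common $\sigma^2$ cannot describe both.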
- Transformation with $\sin^{-1}\left(\sqrt{y_{ij}/N_{ij}}\right)$
i. If we know we have a binomial response variable, we would do better to deal with it as such and not try to force it to be normal when we know it is not.
ii. Therefore we prefer the probability distribution form.
SAS program for linear regression
- SAS program
    data example1;
      input obs x N y;
      proportion = y/N;
      /* data lines are not reproduced in the source */
      cards;
    ;
    run;
    proc genmod data=example1;
      model proportion = x / dist=normal link=identity;
    run;
- SAS output (not reproduced)
Modeling binomial data with the probability distribution form
Inspect the probability distribution.
Example: logistic regression (generalized linear model)
(i) $y_{ij} \sim B(N_{ij}, \pi_{ij})$: random component
(ii) $\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \beta_0 + \beta_1 X_i$: link function and systematic component
For logistic regression, $\hat{\beta}_0$ = 4.109 (sign not legible) and $\hat{\beta}_1$ = (not legible). What is $\hat{\pi}$ if $X_i = 10$?
cor(observed, logit) = (not legible), cor(observed, p_glm) = (not legible)
Fit with logistic regression is better!
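The question about $\hat{\pi}$ at $X_i = 10$ is answered by inverting the logit link in (ii). A worked form of that step, written with generic estimates because the numerical values are not available here:

```latex
\begin{align*}
\hat{\eta} &= \hat{\beta}_0 + \hat{\beta}_1 X_i, \\
\hat{\pi}  &= \frac{\exp(\hat{\eta})}{1 + \exp(\hat{\eta})}
            = \frac{1}{1 + \exp(-\hat{\eta})}, \\
\text{so at } X_i = 10:\quad
\hat{\pi}  &= \frac{1}{1 + \exp\!\left\{-(\hat{\beta}_0 + 10\,\hat{\beta}_1)\right\}}.
\end{align*}
```

Whatever the estimates are, $\hat{\pi}$ stays strictly between 0 and 1, which is the property the linear regression fit lacks.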
SAS program for logistic regression
- SAS program
    data example2;
      set example1;
      success = y;
    run;
    proc genmod data=example2;
      model success/n = x / dist=bin link=logit;
    run;
- SAS output (not reproduced)
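Because the data lines of example1 are not available, the comparison between the two fits can only be sketched on simulated data. The following is a minimal sketch under that assumption: the data set name sim_binom, the seed, and the coefficients -4 and 0.6 are all hypothetical and are not the lecture's values.

```sas
/* Hypothetical simulated data; NOT the lecture's example1 data set. */
data sim_binom;
  call streaminit(2024);                 /* arbitrary seed */
  do obs = 1 to 11;
    x  = obs - 1;                        /* assumed predictor levels 0..10 */
    n  = 100;                            /* assumed number of trials */
    pi = 1 / (1 + exp(-(-4 + 0.6*x)));   /* assumed true logistic curve */
    y  = rand('BINOMIAL', pi, n);        /* number of successes */
    proportion = y / n;
    output;
  end;
  drop pi;
run;

/* Linear regression on the observed proportion (identity link) */
proc genmod data=sim_binom;
  model proportion = x / dist=normal link=identity;
  output out=fit_lm pred=p_lm;           /* fitted proportions */
run;

/* Logistic regression on events/trials (logit link) */
proc genmod data=fit_lm;
  model y/n = x / dist=bin link=logit;
  output out=fit_both pred=p_glm;        /* fitted probabilities */
run;

/* Correlation of each set of fitted values with the observed proportion */
proc corr data=fit_both;
  var proportion;
  with p_lm p_glm;
run;
```

The PROC CORR step mirrors the cor(observed, p_glm) comparison on the slide; the numbers it prints depend entirely on the simulated values and say nothing about the original data.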
Define a plausible process
Example: probit regression
We never see the process; we only see its consequences (liability threshold model).
(i) If the process exceeds a certain value, we observe a failure, and otherwise a success. We denote the boundary between success and failure by η: $\eta = \beta_0 + \beta_1 X$.
(ii) We assume that the unobservable process has a standard normal distribution.
The probability of success at $X_i$ can then be modeled by
$$\pi_i = \Phi(\eta_i) = \Phi(\beta_0 + \beta_1 X_i)$$
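The threshold argument corresponds to a probit link in PROC GENMOD. A minimal sketch on simulated data follows; the data set name sim_probit, the seed, and the coefficients -2 and 0.4 are hypothetical choices made only so the program runs on its own.

```sas
/* Hypothetical simulated data under the threshold (probit) model */
data sim_probit;
  call streaminit(515);            /* arbitrary seed */
  do x = 0 to 10;
    n   = 100;                     /* assumed number of trials */
    eta = -2 + 0.4*x;              /* assumed boundary eta = b0 + b1*x */
    pi  = probnorm(eta);           /* P(standard normal process <= eta) */
    y   = rand('BINOMIAL', pi, n); /* number of successes */
    output;
  end;
  drop eta pi;
run;

/* Probit regression: same random component as before, different link */
proc genmod data=sim_probit;
  model y/n = x / dist=bin link=probit;
  output out=fit_probit xbeta=eta_hat pred=pi_hat;
run;

/* Check that the fitted probability is Phi(fitted linear predictor) */
data check;
  set fit_probit;
  pi_from_eta = probnorm(eta_hat);
run;

proc print data=check;
  var x y n eta_hat pi_hat pi_from_eta;
run;
```

In the printed table, pi_hat and pi_from_eta should agree, which is the relation $\pi_i = \Phi(\eta_i)$ applied to the fitted values.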
Generalized linear model
- Identify the probability distribution of the observed data.
- Focus on modeling the expected value, $E(y_{ij}) = N_{ij}\pi_i$. If $N_{ij}$ is known, we focus only on $\pi_i$.
- State the linear predictor.
- Identify the function that connects the expected value to the linear predictor (link function).
In this lecture, we focus on the probability distribution model.
Exercises
- Provide the probability distribution model for linear regression (Example II).
- Consider the two-treatment paired comparison. The response variable is continuous and can be assumed to have a Gaussian distribution.
- Consider the two-treatment paired comparison. For the i-th treatment (i = 1, 2) on the j-th pair (j = 1, ..., p) at the k-th time point (k = 1, 2), $N_{ijk}$ observations are taken and each observation has either a favorable or an unfavorable outcome. Denote by $y_{ijk}$ the number of favorable outcomes observed on the ij-th pair at the k-th time point.
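For the linear regression exercise (Example II), one possible formulation follows the four steps of the generalized linear model listed earlier. This is a sketch of one acceptable answer, not necessarily the form intended in the lecture:

```latex
\begin{align*}
\text{Distribution:}     \quad & y_{ij} \sim N(\mu_i, \sigma^2), \\
\text{Linear predictor:} \quad & \eta_i = \beta_0 + \beta_1 X_i, \\
\text{Link (identity):}  \quad & \eta_i = \mu_i .
\end{align*}
```

This says the same thing as the model equation $y_{ij} = \beta_0 + \beta_1 X_i + e_{ij}$ with $e_{ij}$ i.i.d. $N(0, \sigma^2)$, but it states the distribution of the observation itself rather than of an error term.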
Types of model effects
- Classification variable: in the two-treatment mean comparison model (Model 1), the predictor variable is a treatment, which is an example of a classification variable.
- Direct variable: if the mean is a function of $X_i$, as it was in linear regression (Model 2, $\beta_0 + \beta_1 X_i$), the predictor variable is a direct variable.
Example
We assume that levels of X are observed at multiple locations or for multiple batches, and that there is only one observation per level of X. There are 11 levels of X at varying intervals from 0 to 48. For each of four batches, a continuous variable (Y) and a binomial variable are observed at each level of X.
[Figure: series labeled Batch1, Batch2, Batch3, Batch4, and Global mean]
Assuming separate linear regressions by batch, $\eta_{ij} = \beta_{0i} + \beta_{1i} X_{ij}$ (i = 1, ..., 4) can alternatively be expressed as
$$\eta_{ij} = \beta_0 + b_{0i} + (\beta_1 + b_{1i}) X_{ij}$$
where $\beta_0$ and $\beta_1$ are the overall intercept and slope.
- Random effect: the batches could represent a sample from a larger population of batches, and we could have sampled any four batches. Then $b_{0i} \sim N(0, \sigma_0^2)$ and $b_{1i} \sim N(0, \sigma_1^2)$.
- Fixed effect: the four batches in this data set could be the entire population, and batch i really means supplier i.
Exercises
- For a linear regression (Example II), state the mixed effects model.
- For multi-batch data with a binomial distribution, state the final model.
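The random-batch version of this model can be fitted with PROC GLIMMIX. The sketch below simulates its own data, since the multi-batch data are not available: the data set name sim_batch, the seed, every numeric value, and the evenly spaced X grid (the lecture's 11 levels are unevenly spaced) are hypothetical.

```sas
/* Hypothetical simulated multi-batch data; all numbers are illustrative */
data sim_batch;
  call streaminit(101);                        /* arbitrary seed */
  do batch = 1 to 4;
    b0 = rand('NORMAL', 0, 1.0);               /* random intercept b0i */
    b1 = rand('NORMAL', 0, 0.1);               /* random slope b1i */
    do x = 0 to 48 by 6;                       /* assumed even grid */
      /* Gaussian response: eta = beta0 + b0i + (beta1 + b1i)*x */
      y   = 10 + b0 + (0.5 + b1)*x + rand('NORMAL', 0, 1);
      /* Binomial response with a logit link on its own linear predictor */
      n   = 20;
      pi  = 1 / (1 + exp(-(-2 + 0.5*b0 + (0.08 + 0.2*b1)*x)));
      fav = rand('BINOMIAL', pi, n);
      output;
    end;
  end;
  drop b0 b1 pi;
run;

/* LMM: Gaussian response, random intercept and slope for each batch */
proc glimmix data=sim_batch;
  class batch;
  model y = x / dist=normal link=identity solution;
  random intercept x / subject=batch;          /* independent b0i, b1i */
run;

/* GLMM: binomial response, same linear predictor, logit link */
proc glimmix data=sim_batch;
  class batch;
  model fav/n = x / dist=binomial link=logit solution;
  random intercept x / subject=batch;
run;
```

The RANDOM statement as written estimates separate variances for $b_{0i}$ and $b_{1i}$ with no covariance; adding type=un would also estimate their covariance. With only four batches, the variance estimates are poorly determined and the procedure may report estimates at or near zero.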
Fixed/random effects model
Definition
- Fixed effects model: a model that contains only fixed effects.
- Mixed effects model: a model that contains both fixed and random effects.
If we assume random batch effects, we must state the assumed probability distributions. For instance,
$$\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim MVN\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{pmatrix} \right)$$
Remarks
Both the random effects and the response variable can have any plausible distribution. However, computational methods for mixed models with Gaussian random effects are better developed, and we focus on the Gaussian case in this lecture.
Complete description of models discussed in this chapter
TABLE 1.3: Complete Description of Models Discussed in This Chapter
Columns: Type of Model, Distribution, Linear Predictor, Link. Only the model type and link entries are legible:
1. LM: identity link
2. LM: identity link
3. GLM: logit or probit link
4. GLM: logit or probit link
5. LM: identity link
6. GLM: logit or probit link
7. LMM: identity link
8. GLMM: logit or probit link
Writing models in matrix form
Matrix form is important for the theoretical development of estimation, inference, and statistical computing programs.
Fixed-effect models
We consider the two-treatment LM and suppose there are three observations on each treatment. If we define $\beta = (\mu, \tau_1, \tau_2)^t$ and write the model as $\eta = X\beta$, define each term.
The matrix $X$ is often called a design matrix. It is also called a derivative matrix because $X = \partial \eta / \partial \beta^t$.
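One way to write out the terms for this two-treatment case is sketched below. It assumes the effects parameterization $\eta_{ij} = \mu + \tau_i$; the symbols $\mu$ and $\tau_i$ are notation chosen for this sketch, and the rows are ordered by treatment and then by observation.

```latex
\[
\underbrace{\begin{pmatrix}
\eta_{11} \\ \eta_{12} \\ \eta_{13} \\ \eta_{21} \\ \eta_{22} \\ \eta_{23}
\end{pmatrix}}_{\boldsymbol{\eta}}
=
\underbrace{\begin{pmatrix}
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & 0 & 1 \\
1 & 0 & 1
\end{pmatrix}}_{X}
\underbrace{\begin{pmatrix}
\mu \\ \tau_1 \\ \tau_2
\end{pmatrix}}_{\boldsymbol{\beta}}
\]
```

The first column of $X$ is the sum of the other two, so $X$ is not of full column rank. This is why the effects model needs a constraint (or a generalized inverse), whereas the cell means model with columns for $\mu_1$ and $\mu_2$ only does not.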
Consider the linear regression over levels of $X_{ij} = 0, \ldots, 10$. We assume that [assumption not legible in the source]. Define the design matrix and the parameter vector.
Consider the fixed effects model for multi-batch data with $\eta_{ij} = \beta_0 + b_{0i} + (\beta_1 + b_{1i}) X_{ij}$ and find a matrix form.
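For the linear regression part, a possible answer is sketched below. It assumes a single observation at each of the 11 levels $X = 0, 1, \ldots, 10$; with replicate observations, the corresponding rows of the design matrix would simply repeat.

```latex
\[
\boldsymbol{\eta} = X\boldsymbol{\beta},
\qquad
X =
\begin{pmatrix}
1 & 0 \\
1 & 1 \\
1 & 2 \\
\vdots & \vdots \\
1 & 10
\end{pmatrix}_{11 \times 2},
\qquad
\boldsymbol{\beta} =
\begin{pmatrix}
\beta_0 \\ \beta_1
\end{pmatrix}.
\]
```

The column of ones carries the intercept and the second column holds the value of the direct variable, so row $i$ of $X\boldsymbol{\beta}$ is $\beta_0 + \beta_1 X_i$.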
Mixed-effect models
Consider the mixed effects model for multi-batch data with $\eta_{ij} = \beta_0 + b_{0i} + (\beta_1 + b_{1i}) X_{ij}$ and find a matrix form.
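A possible matrix form is sketched below, with observations stacked batch by batch. The notation $\mathbf{x}_i$ for the vector of X levels in batch $i$ and $\mathbf{1}$ for a vector of ones is introduced here for the sketch; the covariance matrix of $(b_{0i}, b_{1i})$ uses the symbols $\sigma_0^2$, $\sigma_{01}$, $\sigma_1^2$ from the MVN assumption.

```latex
\[
\boldsymbol{\eta} = X\boldsymbol{\beta} + Z\mathbf{b},
\qquad
X =
\begin{pmatrix}
\mathbf{1} & \mathbf{x}_1 \\
\mathbf{1} & \mathbf{x}_2 \\
\mathbf{1} & \mathbf{x}_3 \\
\mathbf{1} & \mathbf{x}_4
\end{pmatrix},
\quad
\boldsymbol{\beta} =
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix},
\]
\[
Z =
\begin{pmatrix}
\mathbf{1}\ \mathbf{x}_1 & & & \\
 & \mathbf{1}\ \mathbf{x}_2 & & \\
 & & \mathbf{1}\ \mathbf{x}_3 & \\
 & & & \mathbf{1}\ \mathbf{x}_4
\end{pmatrix},
\quad
\mathbf{b} =
\begin{pmatrix}
b_{01} \\ b_{11} \\ \vdots \\ b_{04} \\ b_{14}
\end{pmatrix}
\sim N(\mathbf{0}, G),
\quad
G = I_4 \otimes
\begin{pmatrix}
\sigma_0^2 & \sigma_{01} \\
\sigma_{01} & \sigma_1^2
\end{pmatrix}.
\]
```

With 11 levels of X per batch, $X$ is $44 \times 2$ and $Z$ is $44 \times 8$. The blank entries of $Z$ are zeros, so each batch's random intercept and slope affect only that batch's rows.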
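Summary
Typology of linear models
- GLMM
  - Observations: $y \mid b \sim G(\mu, R)$
  - Link: $g(\mu \mid b)$
  - Linear predictor: $\eta = X\beta + Zb$, $b \sim N(0, G)$
  - Mean model: $\hat{\mu} = g^{-1}(\hat{\eta}) = g^{-1}(X\hat{\beta} + Z\hat{b})$
- LMM
  - Observations: $y \mid b \sim N(\mu, R)$
  - Link: (identity)
  - Linear predictor: $\eta = X\beta + Zb$, $b \sim N(0, G)$
  - Mean model: $\hat{\mu} = X\hat{\beta} + Z\hat{b}$
- GLM
  - Observations: $y \sim G(\mu, R)$
  - Link: $g(\mu)$
  - Linear predictor: $\eta = X\beta$
  - Mean model: $\hat{\mu} = g^{-1}(\hat{\eta}) = g^{-1}(X\hat{\beta})$
- LM
  - Observations: $y \sim N(\mu, R)$
  - Link: (identity)
  - Linear predictor: $\eta = X\beta$
  - Mean model: $\hat{\mu} = X\hat{\beta}$
GLMM: generalized linear mixed model, LMM: linear mixed model, GLM: generalized linear model, LM: linear model
LM, GLM, and LMM are all special cases of the GLMM.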
SAS procedures
- PROC GLM: general linear model (not generalized linear model)
- PROC ANOVA: analysis of variance
- PROC MIXED (extension of ANOVA): linear mixed model
- PROC LOGISTIC/GENMOD/CATMOD: generalized linear model
- PROC GLIMMIX: generalized linear mixed model
CATMOD vs GENMOD vs LOGISTIC vs PROBIT
- CATMOD: provides maximum likelihood estimation for logistic regression, including the analysis of logits for dichotomous outcomes and the analysis of generalized logits for polychotomous outcomes. It provides weighted least squares estimation of many other response functions, such as means, cumulative logits, and proportions, and you can also compute and analyze other response functions that can be formed from the proportions corresponding to the rows of a contingency table. In addition, a user can input and analyze a set of response functions and a user-supplied covariance matrix with weighted least squares. With the CATMOD procedure, by default, all explanatory (independent) variables are treated as classification variables.
- GENMOD: a general statistical modeling tool that fits generalized linear models to data. It fits several useful models to categorical data, including logistic regression, the proportional odds model, and Poisson regression. The GENMOD procedure also provides a facility for fitting generalized estimating equations to correlated response data that are categorical, such as repeated dichotomous outcomes. The GENMOD procedure fits models using maximum likelihood estimation, and you include classification variables in your models with a CLASS statement. PROC GENMOD can perform type I and type III tests, and it provides predicted values and residuals.
- LOGISTIC: specifically designed for logistic regression. For dichotomous outcomes it performs the usual logistic regression, and for ordinal outcomes it fits the proportional odds model. Note that, by default, any polychotomous response variable is treated as an ordinal outcome by PROC LOGISTIC. This procedure has capabilities for a variety of model-building techniques, including stepwise, forward, and backward selection. It produces predicted values and can create output data sets containing these values and other statistics, including ROC statistics, and it produces a number of regression diagnostics.
- PROBIT: designed for quantal assay or other discrete event data. It performs probit regression (its default) as well as logistic regression. This procedure includes a CLASS statement.
More informationLogistic Regression. Continued Psy 524 Ainsworth
Logistic Regression Continued Psy 524 Ainsworth Equations Regression Equation Y e = 1 + A+ B X + B X + B X 1 1 2 2 3 3 i A+ B X + B X + B X e 1 1 2 2 3 3 Equations The linear part of the logistic regression
More informationSection Poisson Regression
Section 14.13 Poisson Regression Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 26 Poisson regression Regular regression data {(x i, Y i )} n i=1,
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationH-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL
H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL Intesar N. El-Saeiti Department of Statistics, Faculty of Science, University of Bengahzi-Libya. entesar.el-saeiti@uob.edu.ly
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2018 Examinations Subject CT3 Probability and Mathematical Statistics Core Technical Syllabus 1 June 2017 Aim The
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More informationSection IX. Introduction to Logistic Regression for binary outcomes. Poisson regression
Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about
More informationLecture 3.1 Basic Logistic LDA
y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data
More informationLogistic Regression in R. by Kerry Machemer 12/04/2015
Logistic Regression in R by Kerry Machemer 12/04/2015 Linear Regression {y i, x i1,, x ip } Linear Regression y i = dependent variable & x i = independent variable(s) y i = α + β 1 x i1 + + β p x ip +
More informationStatistics 262: Intermediate Biostatistics Model selection
Statistics 262: Intermediate Biostatistics Model selection Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Today s class Model selection. Strategies for model selection.
More informationDescribing Contingency tables
Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds
More informationGeneralized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence
Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence Sunil Kumar Dhar Center for Applied Mathematics and Statistics, Department of Mathematical Sciences, New Jersey
More information7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure).
1 Neuendorf Logistic Regression The Model: Y Assumptions: 1. Metric (interval/ratio) data for 2+ IVs, and dichotomous (binomial; 2-value), categorical/nominal data for a single DV... bear in mind that
More informationGood Confidence Intervals for Categorical Data Analyses. Alan Agresti
Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline
More informationLecture 13: More on Binary Data
Lecture 1: More on Binary Data Link functions for Binomial models Link η = g(π) π = g 1 (η) identity π η logarithmic log π e η logistic log ( π 1 π probit Φ 1 (π) Φ(η) log-log log( log π) exp( e η ) complementary
More informationChapter 4 Multi-factor Treatment Designs with Multiple Error Terms 93
Contents Preface ix Chapter 1 Introduction 1 1.1 Types of Models That Produce Data 1 1.2 Statistical Models 2 1.3 Fixed and Random Effects 4 1.4 Mixed Models 6 1.5 Typical Studies and the Modeling Issues
More informationGeneralized Linear Models Introduction
Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,
More informationLecture 2. The Simple Linear Regression Model: Matrix Approach
Lecture 2 The Simple Linear Regression Model: Matrix Approach Matrix algebra Matrix representation of simple linear regression model 1 Vectors and Matrices Where it is necessary to consider a distribution
More informationFrom Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...
From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...
More informationLecture 14: Introduction to Poisson Regression
Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why
More informationModelling counts. Lecture 14: Introduction to Poisson Regression. Overview
Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week
More information