Statistical Questions: Classification Modeling Complex Dose-Response


Biost 536 / Epi 536: Categorical Data Analysis in Epidemiology
Scott S. Emerson, M.D., Ph.D.
Professor of Biostatistics, University of Washington
Lecture 7: Modeling of Nonlinear Associations
October 23, 2014
Categorical Data Analysis, AUT

Lecture Outline
- Modeling nonlinear associations (complex "dose response")
- Flexible methods

Modeling Complex Dose-Response
Statistical Questions: Classification

Statistical Questions
1. Clustering of observations
   Perhaps into groups that might be different diseases
2. Clustering of variables
   Perhaps into groups representing biochemical pathways
3. Quantification of distributions
   Perhaps reporting mean life expectancy after diagnosis
4. Comparing distributions
   Perhaps investigating associations between variables
5. Prediction of individual observations
   Perhaps diagnosing disease or estimating kidney function

4. Investigating Associations
Our scientific questions can be at many different levels of detail:
1. Is there an association?
2. What is the general (first order) trend in Y with higher X?
3. Is there a nonlinear trend in the association?
4. Is the general trend a particular shape?
   Increasing exponentially? Increasing to a threshold? Constant then decreasing? U-shaped? S-shaped?
5. What is the association at particular levels of X?
   E.g., what is the difference in odds of mortality between subjects with LDL of 160 and 161 mg/dl?
Any of these questions can be about associations independent of other mechanisms (i.e., adjusted for potential confounding).

Transformation of Predictor of Interest (POI)
We choose the exact form of the modeled POI in order to answer our scientific question
- Accurately, and
- Precisely
Important issues:
- Ensuring that the question is appropriately reflected by some regression parameter(s)
- Striving to have greatest precision to make inference about those parameter(s)
- Trying to avoid overly influential observations
- Ensuring that we do not allow multiple comparisons to inflate the type 1 error

Invalid Reasons for Transformations
A commonly stated reason for transforming a predictor is: "We transformed the predictor in order that it would appear more normally distributed."
This reasoning is WRONG.
- There is nothing in our statistical theory that ever demands that a predictor in a regression model be normally distributed
- With just a little thought, this should be obvious to all:
  The most straightforward of all statistical investigations of associations is the two-sample problem
  In such a problem, our POI is far from normal: it is binary (discrete)

Valid Reasons for Transformation 1
Modeling our question
- We ensure that our full model is flexible enough to reflect our alternative hypothesis
- We ensure that our null hypothesis can be represented by constraining one or more of our parameters to some set value
  Most often constrained to 0 (equivalent to removing the term)
Examples:
- We use a linear (untransformed) predictor to detect a first order trend
- We include both a linear and a squared term to detect nonlinearity
  A linear relationship would not need the squared term
- We fit linear splines to detect U-shaped trends
  A U-shaped trend would require that slopes for lower values of X are of opposite sign to the slopes for higher values of X

Valid Reasons for Transformation 2
Increasing precision
- The more accurately we can model the true data generating mechanism, the greater precision we will have
- This avoids mixing pure error (the random variation in the distribution) and systematic error (the difference between the fitted model and the true distribution)
Examples:
- Anticipating that either too high or too low Vitamin A is harmful, we fit a model that can model a U-shaped trend
  However, this does make it difficult to test for direction of trend
- Anticipating that risk of death from prostate cancer is more closely related to multiplicative increases, we model log(PSA)
  Equal increased risk with every doubling of serum PSA

Loss of Precision from Lack of Fit
[Figure slide: no text survives in the transcription.]

Valid Reasons for Transformation 3
Avoiding overly influential observations
- Any observation having a predictor value that is distant from the rest of the data has the potential to
  Greatly influence the estimated parameters ("influential")
  Greatly influence the statistical significance ("highly leveraged")
- It is not uncommon that this criterion is especially important when
  The true effect of the predictor is based on multiplicative differences (log transformations of predictor)
  Measurement error is greatest on the largest measurements (perhaps use the variance stabilizing transformation log(X))

Methodologic Approach
We must avoid allowing multiple comparisons to inflate the type 1 error
- Generally, our null hypothesis is common to all transformations that we might consider
- Looking for the "best" transformation is thus repeatedly testing the same null hypothesis
We must prespecify the model we will fit
- Luckily, we should know our question
- If we do not know the data generation mechanism, we should make a reasonable guess at a suitably flexible model
Science is incremental: answer first questions first
- There is always room for additional exploratory descriptive analyses that might generate the next hypothesis

Modeling Complex Dose-Response
Transformations

Linear Predictors
The most commonly used regression models use linear predictors
- "Linear" refers to linear in the parameters
- The modeled predictors can be transformations of the scientific measurements
Examples:
  $g(\theta_{Y|X_i}) = \beta_0 + \beta_1 \log(X_i)$
  $g(\theta_{Y|X_i}) = \beta_0 + \beta_1 X_i + \beta_2 X_i^2$

Transformations of Predictors
We transform predictors to answer scientific questions aimed at detecting nonlinear relationships
- E.g., is the association between all cause mortality and LDL in elderly adults nonlinear?
- E.g., is the association between all cause mortality and LDL in elderly adults U-shaped?
We transform predictors to provide a more flexible description of complex associations between the response and some scientific measure (especially confounders, but also precision variables and the POI)
- Threshold effects
- Exponentially increasing effects
- U-shaped functions
- S-shaped functions
- etc.

General Applicability
The issues related to transformations of predictors are similar across all types of regression with linear predictors
- Linear regression
- Logistic regression
- Poisson regression
- Proportional hazards regression
- Accelerated failure time regression
However, it is easiest to use descriptive statistics to illustrate the issues in linear regression
In other forms of regression we can display differences between fitted values, but display of the original data is more difficult
- Binary data
- Censored data
- Models that use a log link

Ex: Cubic Relationship
FEV vs Height in Children
[Figure: scatterplot of FEV (l/sec) against Height (in.)]

Ex: Threshold Effect of Dose?
RCT of beta carotene supplementation: 4 doses plus placebo
[Figure, two panels: Plasma Beta-carotene at 3 months by Dose; Plasma Beta-carotene at 9 months by Dose]

Ex: U-shaped Trend?
Inflammatory marker vs cholesterol
[Figure: C-reactive protein against Cholesterol (mg/dl); lowess smoother, bandwidth = .8]

Ex: S-shaped Trend
In vitro cytotoxic effect of Doxorubicin with chemosensitizers
[Figure: Cell Count against Concentration of Doxorubicin; series: DOX only, DOX + Verapamil, DOX + Cyclosporine A]

1:1 Transformations
Sometimes we transform 1 scientific measurement into 1 modeled predictor
- Ex: log transformation will sometimes address apparent "threshold effects"
- Ex: cubing height produces a more linear association with FEV
- Ex: dichotomization of dose to detect efficacy in presence of a strong threshold effect against placebo

Log Transformations
Simulated data where every doubling of X has the same difference in mean of Y
[Figure, two panels: Untransformed (Y against X); Log Transformed (Y against log X)]

Cubic Transformation: FEV vs Height
[Figure slide: no further text survives in the transcription.]

Transforming Predictors: Interpretation
When using a predictor that represents a transformed predictor, we try to use the same interpretation of slopes
- Additive models: Difference in $\theta_{Y|X}$ per 1 unit difference in modeled predictor
- Multiplicative models: Ratio of $\theta_{Y|X}$ per 1 unit difference in modeled predictor
Such interpretations are generally easy for
- Dichotomization of a measured variable
- Logarithmic transformation of a measured variable
Other univariate transformations are generally difficult to interpret
- I tend not to use other transformations when interpretability of the estimate of effect is key (and I think it always is)
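The log-transformation point above (every doubling of X gives the same difference in the mean of Y) can be checked with a small sketch. The coefficients `B0` and `B1` and the function names are hypothetical values chosen only for illustration; they are not estimates from the lecture's data.

```python
import math

# Hypothetical coefficients for a model with mean(Y | X) = B0 + B1 * log2(X).
B0, B1 = 1.0, 0.5


def mean_y(x, b0=B0, b1=B1):
    """Modeled mean of Y when the predictor enters as log base 2 of X."""
    return b0 + b1 * math.log2(x)


def doubling_difference(x):
    """Difference in modeled mean between a group at 2x and a group at x."""
    return mean_y(2 * x) - mean_y(x)


if __name__ == "__main__":
    # The difference is the slope B1 no matter where on the X scale we start.
    for x in (1, 5, 40, 160):
        print(x, round(doubling_difference(x), 10))
```

With a base-2 log the slope is directly the difference per doubling. If the natural log were used instead, the difference per doubling would be the slope multiplied by ln 2.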

Diagnostics
It is natural to wonder whether univariate transformations of some measured covariate are appropriate
We can illustrate methods for investigating the appropriateness of a transformation using one of the more common flexible methods of modeling covariate associations
- I consider polynomial regression to investigate whether some of the transformations we have talked about make statistical sense
I am not suggesting that we do model building by routinely investigating many different models
- I think the question of linearity vs nonlinearity of an association is an interesting scientific question in its own right and should be placed in a hierarchy of investigation
- I revisit this later

Effect of Link Function: RD, RR, OR
With binary data, we cannot easily look at scatterplots
- Instead we can look at fitted values compared to extremely flexible models (e.g., linear splines)
But first we need to recognize that when looking at fitted values, we usually look at fitted probabilities
Examples: 5 year survival versus LDL

    regress deadin5 ldl
    predict RDfitlin
    poisson deadin5 ldl
    predict RRfitlin
    logistic deadin5 ldl
    predict ORfitlin

Linear Trend in Mortality by LDL
[Figure: Fitted Values: Linear LDL; x-axis ldl; curves: RD linear, RR linear, OR linear]

1:Many Transformations
Sometimes we transform 1 scientific measurement into several modeled predictors
- Ex: polynomial regression
- Ex: dummy variables ("factored variables")
- Ex: piecewise linear
- Ex: splines
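As a reading aid for the link-function example above, the three fits can be written out as follows. This is a sketch under the assumption that `regress`, `poisson`, and `logistic` are being used as identity-, log-, and logit-link models for the probability $p_i$ of death given LDL; the symbol $p_i$ is introduced here for illustration and does not appear on the slide.

```latex
\begin{align*}
\text{RD (regress):}  \quad & p_i = \beta_0 + \beta_1\,\mathrm{LDL}_i
  && \beta_1 = \text{risk difference per 1 mg/dl} \\
\text{RR (poisson):}  \quad & \log p_i = \beta_0 + \beta_1\,\mathrm{LDL}_i
  && e^{\beta_1} = \text{risk ratio per 1 mg/dl} \\
\text{OR (logistic):} \quad & \log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1\,\mathrm{LDL}_i
  && e^{\beta_1} = \text{odds ratio per 1 mg/dl}
\end{align*}
```

Because all three are "linear in LDL" only on their own link scale, their fitted probabilities trace different curves when plotted against LDL, which is why the fitted values from the three models need not coincide.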

Polynomial Regression
Fit linear term plus higher order terms (squared, cubic, ...)
Can fit arbitrarily complex functions
- An n-th order polynomial can fit n+1 points exactly
Generally very difficult to interpret parameters
- I usually graph the function when I want an interpretation
Special uses
- 2nd order (quadratic) model to look for U-shaped trend
- Test for linearity by testing that all higher order terms have parameters equal to zero

Ex: Mortality - LDL Assoc Linear?
We can try to assess whether any association between 5 year mortality and LDL follows a straight line association
- I am presuming this was a prespecified scientific question
  (We should not pre-test our statistical models)
I fit a 2nd order polynomial to the data

    g ldlsqr= ldl^2
    regress deadin4 ldl ldlsqr
    predict RDfitquad
    poisson deadin4 ldl ldlsqr
    predict RRfitquad
    logistic deadin4 ldl ldlsqr
    predict ORfitquad

Mortality - LDL Assoc Linear?: OR
No statistically significant evidence of a nonlinear trend based on this analysis

    . logistic deadin4 ldl ldlsqr
    Logistic regression        Number of obs = 725
                               LR chi2(2)    = 9.42

[The remaining output (Prob > chi2, log likelihood, pseudo R2, and the odds ratio rows for ldl and ldlsqr) did not survive transcription.]

Linear vs Quadratic Fitted Values
[Figure: Fitted Values; x-axis ldl; curves: RD linear, RR linear, OR linear, RD quadratic, RR quadratic, OR quadratic]
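The remark that an n-th order polynomial can fit n+1 points exactly can be illustrated with a short sketch. It uses Lagrange interpolation rather than regression (a swapped-in technique, chosen because it needs no libraries), and the three (LDL, risk) points are hypothetical numbers invented for the illustration.

```python
def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree <= n passing through
    the n+1 given (x, y) points, which must have distinct x values."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical (LDL, mortality risk) points forming a U shape.
POINTS = [(70.0, 0.30), (130.0, 0.10), (200.0, 0.25)]

if __name__ == "__main__":
    # A 2nd order polynomial reproduces all three points.
    for px, py in POINTS:
        print(px, py, lagrange_eval(POINTS, px))
```

This flexibility is also the caution: with enough terms a polynomial will chase every observed point, so exact fit says nothing about whether the curve describes the underlying association.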

Mortality - LDL Associated?: OR
Need to test both covariates
- When these are the only predictors: overall LR test
- Otherwise use post-estimation test or testparm

    . logistic deadin4 ldl ldlsqr
    Logistic regression        Number of obs = 725
                               LR chi2(2)    = 9.42

[The remaining output (Prob > chi2, log likelihood, pseudo R2, and the odds ratio rows for ldl and ldlsqr) did not survive transcription.]

Dummy Variables
Indicator variables for all but one group
This is the only appropriate way to model nominal (unordered) variables
- E.g., for marital status, indicator variables for
  married (married = 1, everything else = 0)
  widowed (widowed = 1, everything else = 0)
  divorced (divorced = 1, everything else = 0)
  (single would then be the intercept)
Often used for other settings as well
- Equivalent to Analysis of Variance (ANOVA)

Categorized Continuous
We can use dummy variables with categorized continuous random variables to explore dose-response
In Stata, we can quickly make categorized variables using egen
Examples:

    egen ldlctg = cut(ldl), at(0,70,100,130,160,250)
    egen ldlctgq = cut(ldl), group(5)
    regress deadin4 i.ldlctg
    predict RDfitstep
    poisson deadin4 i.ldlctg
    predict RRfitstep
    logistic deadin4 i.ldlctg
    predict ORfitstep

Fitted Values: Linear, Dummy
[Figure: Fitted Values: Linear LDL; x-axis ldl; curves: RD linear, RR linear, OR linear, RD step, RR step, OR step]
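The categorization and dummy-variable coding described above can be sketched outside Stata as follows. The helper names `cut_at` and `dummies` are hypothetical; the sketch assumes that `egen ... cut(), at()` labels each observation with the left endpoint of its interval and leaves values outside the break range missing, which is my understanding of the command rather than something stated on the slides.

```python
def cut_at(x, breaks):
    """Return the left endpoint of the interval [lo, hi) containing x,
    or None (missing) when x falls outside the range of the breaks."""
    for lo, hi in zip(breaks, breaks[1:]):
        if lo <= x < hi:
            return lo
    return None


def dummies(category, levels):
    """Indicator variables for every level except the first, which acts
    as the reference group absorbed into the intercept."""
    return [1 if category == level else 0 for level in levels[1:]]


# Break points taken from the egen example in the lecture.
BREAKS = [0, 70, 100, 130, 160, 250]
LEVELS = BREAKS[:-1]  # category labels: 0, 70, 100, 130, 160

if __name__ == "__main__":
    for ldl in (55, 85, 129.9, 160, 250):
        group = cut_at(ldl, BREAKS)
        print(ldl, group, None if group is None else dummies(group, LEVELS))
```

Regressing on these indicators fits a separate level for each interval, which is what produces the step-shaped fitted values.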

Flexible Methods
Linear Splines

Flexible Modeling of Predictors
We do have methods that can fit a wide variety of curve shapes
- Polynomials
  If high degree: allows many patterns of curvature
  Fractional polynomial: allows raising to a fractional power, often searching for best fit
  (I will not be a party to the propagation of these methods)
- Dummy variables
  A step function with tiny steps
  Flat lines over each interval
- Piecewise linear or piecewise polynomial
  Define intervals over which the curve is a line or polynomial
- Splines
  Piecewise linear or piecewise polynomial but joined at "knots"

Linear Splines
Draw straight lines between pre-specified knots
Model intercept and m+1 variables when using m knots
Suppose knots are $k_1, \ldots, k_m$ for variable X
Define variables Spline0, ..., SplineM, with $k_0 = 0$ and $k_{m+1} = \infty$:
  $\mathrm{Spline}_0 = X$ for $X < k_1$; $\mathrm{Spline}_0 = k_1$ for $X \ge k_1$
Then, for $J = 1, \ldots, m$:
  $\mathrm{Spline}_J = 0$ for $X < k_J$
  $\mathrm{Spline}_J = X - k_J$ for $k_J \le X < k_{J+1}$
  $\mathrm{Spline}_J = k_{J+1} - k_J$ for $X \ge k_{J+1}$

Stata: Linear Splines
Stata will make variables that will fit piecewise linear curves

    mkspline new0 #k1 new1 #k2 new2 ... #kp newp= oldvar

Regression on new0 ... newp
- Straight lines between min and k1; k1 and k2, etc.
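The spline variable definitions above translate directly into a few lines of code. This is a minimal sketch of that construction, not Stata's implementation; the function name `linear_spline_basis` is hypothetical, and the knots 6, 9, 12, 15 are the ones used in the lecture's FEV and age example.

```python
def linear_spline_basis(x, knots):
    """Return [Spline0, ..., SplineM] for a value x and sorted knots
    k_1 < ... < k_m, following the piecewise definitions in the text."""
    # Spline0 is x below the first knot and is capped at k_1 above it.
    basis = [min(x, knots[0])]
    # SplineJ covers the interval from k_J up to k_{J+1}, with k_{m+1} = infinity.
    uppers = list(knots[1:]) + [float("inf")]
    for lo, hi in zip(knots, uppers):
        basis.append(min(max(x - lo, 0.0), hi - lo))
    return basis


AGE_KNOTS = [6, 9, 12, 15]

if __name__ == "__main__":
    for age in (4, 10, 17.5):
        print(age, linear_spline_basis(age, AGE_KNOTS))
```

A useful check on the construction is that the variables always add up to the original measurement, so giving every spline variable the same coefficient reproduces an ordinary straight-line model.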

Regression with Linear Splines: FEV, Age

    . mkspline age3 6 age6 9 age9 12 age12 15 age15= age
    . list age age3 age6 age9 age12 age15 in 1/15

[Listing of age, age3, age6, age9, age12, age15 for the first 15 observations: values did not survive transcription.]

Regression with Linear Splines: FEV, Age

    . mkspline age3 6 age6 9 age9 12 age12 15 age15= age
    . regress fev age3 age6 age9 age12 age15, robust
    Linear regression          Number of obs = 654

[The F(5, 648) statistic, Prob > F, R-squared, Root MSE, and the coefficient table (robust standard errors, t, P>|t|, 95% confidence intervals for age3, age6, age9, age12, age15, _cons) did not survive transcription.]

    . predict splinefit
    (option xb assumed; fitted values)

Fitted Values with Linear Splines

    . tabstat splinefit, by(age) stat(n mean sd min max)

[Table of N, mean, sd, min, max of splinefit by age: values did not survive transcription.]

Fitted Values with Linear Splines

    . tabstat splinefit, by(age)

[Table of N, mean, sd, min, max of splinefit by age, with an added Difference column: values did not survive transcription.]

Linear Splines: Parameter Interpretation
With identity link
- Intercept $\beta_0$: $\theta_{Y|X}$ when X = 0
- Slope parameters $\beta_j$: Estimated difference in $\theta_{Y|X}$ between two groups both between the same knots but differing by 1 unit in X
With log link
- Exponentiated intercept $\exp(\beta_0)$: $\theta_{Y|X}$ when X = 0
- Exponentiated slope parameters $\exp(\beta_j)$: Estimated ratio of $\theta_{Y|X}$ between two groups both between the same knots but differing by 1 unit in X

Fitted Values
Lowess (largely hidden), linear, dummy variables, linear splines
[Figure: FEV by Age (stratified by sex); x-axis age; series: Males, Females, Lowess, Linear fit, Dummy fit, Spline fit]

Testing Linearity
A straight line is a special case of linear splines
- All the parameter coefficients would have to be equal
Can use Stata's test

    . test age3 = age6 = age9 = age12 = age15
     ( 1)  age3 - age6 = 0
     ( 2)  age3 - age9 = 0
     ( 3)  age3 - age12 = 0
     ( 4)  age3 - age15 = 0
           F(  4,   648) =    6.89

[The Prob > F value did not survive transcription.]

Linear Splines
In Stata, we can quickly make linear splines using mkspline
Examples:

    mkspline sldl0 70 sldl70 100 sldl100 130 sldl130 160 sldl160= ldl
    regress deadin4 sldl*
    predict RDfitspline
    poisson deadin4 sldl*
    predict RRfitspline
    logistic deadin4 sldl*
    predict ORfitspline
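The claim that a straight line is a special case of linear splines follows in one step from the spline definitions given earlier. The derivation below is a sketch using the same symbols ($\mathrm{Spline}_J$, $\beta_J$, knots $k_1, \ldots, k_m$); the common slope $\beta$ is notation introduced here for illustration.

```latex
\begin{align*}
\sum_{J=0}^{m} \mathrm{Spline}_J &= X
  && \text{(the pieces add back up to the original measurement)} \\
g(\theta_{Y|X}) &= \beta_0 + \sum_{J=0}^{m} \beta_J\,\mathrm{Spline}_J \\
  &= \beta_0 + \beta \sum_{J=0}^{m} \mathrm{Spline}_J
  && \text{if } \beta_0^{\text{slopes}}\!: \beta_J = \beta \text{ for all } J \\
  &= \beta_0 + \beta X
\end{align*}
```

Equality of the m+1 slopes imposes m constraints, which matches the four numbered constraints Stata reports when testing five age spline coefficients with four knots.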

Fitted Values
[Figure: Fitted Values: Linear LDL; x-axis ldl; curves: RD linear, RR linear, OR linear, RD step, RR step, OR step]

Flexible Methods
Comments

Flexible Modeling of Predictors
Commonly used flexible models include
- Polynomials
- Dummy variables
- Linear splines
Possibilities are limitless, but some you may encounter
- Cubic splines
  Makes curves smooth at knots
  But for the ways I use splines, I cannot be bothered
- Fractional polynomial: allows raising to a fractional power
  Often searching for best fit over a grid of values
  I will not be a party to the propagation of these methods

Uses of Flexible Modeling of Predictors
For predictor of interest
- When strong suspicion of a complex nonlinear fit
  May provide greater precision due to better fit
  Can test for linearity by including a linear term, then testing all the other terms
- When fit is fairly well approximated by a straight line of the untransformed predictor, or a straight line with a univariate transformation of the predictor, splines may result in loss of precision due to loss of df
  "Keep an open mind, but not so open that your brains fall out" - Virginia Gildersleeve
For confounders, ensures more accurately modeled effect of covariates
- But, again, not wise to go overboard
For precision variables, often not worth the effort

Ex: Cubic Relationship. Transformations of Predictors. Ex: Threshold Effect of Dose? Ex: U-shaped Trend?

Ex: Cubic Relationship. Transformations of Predictors. Ex: Threshold Effect of Dose? Ex: U-shaped Trend? Biost 518 Applied Biostatistics II Scott S. Emerson, M.., Ph.. Professor of Biostatistics University of Washington Lecture Outline Modeling complex dose response Flexible methods Lecture 9: Multiple Regression:

More information

Lecture Outline Biost 518 / Biost 515 Applied Biostatistics II / Biostatistics II. Linear Predictors Modeling Complex Dose-Response

Lecture Outline Biost 518 / Biost 515 Applied Biostatistics II / Biostatistics II. Linear Predictors Modeling Complex Dose-Response Lecture Outline Biost 518 / Biost 515 Applied Biostatistics II / Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Modeling complex dose response Multiple

More information

Biost 536 / Epi 536 Categorical Data Analysis in Epidemiology

Biost 536 / Epi 536 Categorical Data Analysis in Epidemiology Lecture 9: Inference with omplex Modeling of POI November 4, 2014 iost 536 / Epi 536 ategorical ata nalysis in Epidemiology Scott S. Emerson, M.., Ph.. Professor of iostatistics University of Washington

More information

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression:

Lecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression: Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture utline Choice of Model Alternative Models Effect of data driven selection of

More information

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation Biost 58 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 5: Review Purpose of Statistics Statistics is about science (Science in the broadest

More information

Consider Table 1 (Note connection to start-stop process).

Consider Table 1 (Note connection to start-stop process). Discrete-Time Data and Models Discretized duration data are still duration data! Consider Table 1 (Note connection to start-stop process). Table 1: Example of Discrete-Time Event History Data Case Event

More information

General Regression Model

General Regression Model Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical

More information

Ph.D. course: Regression models

Ph.D. course: Regression models Ph.D. course: Regression models Non-linear effect of a quantitative covariate PKA & LTS Sect. 4.2.1, 4.2.2 8 May 2017 www.biostat.ku.dk/~pka/regrmodels17 Per Kragh Andersen 1 Linear effects We have studied

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Statistical Modelling with Stata: Binary Outcomes

Statistical Modelling with Stata: Binary Outcomes Statistical Modelling with Stata: Binary Outcomes Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 21/11/2017 Cross-tabulation Exposed Unexposed Total Cases a b a + b Controls

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Lab 10 - Binary Variables

Lab 10 - Binary Variables Lab 10 - Binary Variables Spring 2017 Contents 1 Introduction 1 2 SLR on a Dummy 2 3 MLR with binary independent variables 3 3.1 MLR with a Dummy: different intercepts, same slope................. 4 3.2

More information

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Logistic Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Logistic Regression 1 / 38 Logistic Regression 1 Introduction

More information

Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis

Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis Soc 63993, Homework #7 Answer Key: Nonlinear effects/ Intro to path analysis Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Problem 1. The files

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Linear Modelling in Stata Session 6: Further Topics in Linear Modelling

Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 14/11/2017 This Week Categorical Variables Categorical

More information

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox s regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

Lecture 3: Multiple Regression. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II

Lecture 3: Multiple Regression. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II Lecture 3: Multiple Regression Prof. Sharyn O Halloran Sustainable Development Econometrics II Outline Basics of Multiple Regression Dummy Variables Interactive terms Curvilinear models Review Strategies

More information

Residuals and model diagnostics

Residuals and model diagnostics Residuals and model diagnostics Patrick Breheny November 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/42 Introduction Residuals Many assumptions go into regression models, and the Cox proportional

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information

Activity #12: More regression topics: LOWESS; polynomial, nonlinear, robust, quantile; ANOVA as regression

Activity #12: More regression topics: LOWESS; polynomial, nonlinear, robust, quantile; ANOVA as regression Activity #12: More regression topics: LOWESS; polynomial, nonlinear, robust, quantile; ANOVA as regression Scenario: 31 counts (over a 30-second period) were recorded from a Geiger counter at a nuclear

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

More information

Modelling excess mortality using fractional polynomials and spline functions. Modelling time-dependent excess hazard
