Residuals in the Analysis of Longitudinal Data
1 Residuals in the Analysis of Longitudinal Data Jemila Hamid, PhD (Joint work with WeiLiang Huang) Clinical Epidemiology and Biostatistics & Pathology and Molecular Medicine McMaster University
2 Outline
1. Introduction
2. Residuals in the analysis of longitudinal data
3. Transformed residuals
4. The growth curve model and decomposed residuals
5. Real data application
6. Discussion
3 1. Introduction
- Statistical modeling plays an important role in understanding the relationship between two or more variables.
- It is used in a wide range of applications, from finance, banking and weather forecasting to clinical medicine and public health.
- In medical and biological research in particular, modeling has proved an essential tool for improving our understanding of a variety of common as well as rare diseases.
4 Introduction (cont'd)
- Statistical modeling also plays an important role in disease diagnosis, prognosis and management, as well as in disease prevention and health promotion.
- Statistical models are commonly used to identify risk factors associated with diseases, which allows effective diagnosis, treatment and prevention.
- The terms evidence-based medicine, evidence-based diagnosis and evidence-based decision making highlight the importance of statistical methods in medicine and public health.
5 Introduction (cont'd)
- At the initial stage of modeling, we specify the model and its assumptions.
- We then estimate the model parameters based on the specified model and the underlying assumptions.
- Statistical models often rely on several assumptions, including:
  - Distributional assumptions: mostly on the outcome variables.
  - Relational assumptions: these quantify the relationship between outcome and predictors.
6 Introduction (cont'd)
- However, modeling is not complete without investigating model-data agreement: model diagnostics.
- We need to ask questions such as:
  - Do the data support the model assumptions?
  - Does the model fit the data?
  - Are the mean and the covariance modeled properly?
  - Do we need to add or remove variables?
  - Are there outliers and/or influential observations that affect our estimates and the generalizability of the model?
- Model diagnostics is therefore a crucial component of any model-fitting problem.
8 Residuals
- We cannot talk about model diagnostics without residuals.
- Residuals are used not only to check the adequacy of model fit; they are also excellent tools for validating model assumptions and identifying outliers and/or influential observations.
- Residuals in univariate models are relatively simple to explore, have been studied extensively, and are routinely used for model diagnostics.
- Several types of residuals have been proposed: ordinary, standardized, studentized and jackknife residuals.
9 Residuals (cont'd)
Consider the model: $Y = X\beta + \varepsilon$
Parameter estimates: $\hat{\beta} = (X'X)^{-1}X'Y$
Ordinary residuals: $R = Y - \hat{Y} = \left(I - X(X'X)^{-1}X'\right)Y = (I - H)Y$
[Figure: $Y$ and the column space $C(X)$.]
Note: $R = (I - H)\varepsilon$, $E(R) = 0$ and $\mathrm{Var}(r_i) = (1 - h_{ii})\sigma^2$.
Residuals represent the part of the data that is left unexplained after a model has been fitted.
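The quantities on this slide can be checked numerically. The following is a minimal sketch using NumPy on simulated data; the function name `ols_residuals`, the simulated design and the seed are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def ols_residuals(X, y):
    """Return (beta_hat, H, r) for the linear model y = X beta + eps."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y          # (X'X)^{-1} X'y
    H = X @ XtX_inv @ X.T                 # hat matrix
    r = (np.eye(len(y)) - H) @ y          # ordinary residuals (I - H) y
    return beta_hat, H, r

# Hypothetical simulated data: intercept plus one predictor
rng = np.random.default_rng(0)
n = 30
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)

beta_hat, H, r = ols_residuals(X, y)
leverages = np.diag(H)                    # h_ii, so Var(r_i) = (1 - h_ii) sigma^2
```

The residuals are orthogonal to the columns of X, and H is a projection whose trace equals the number of columns of X, which is why the residual variances shrink by the factor (1 - h_ii).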
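10 Residuals (cont'd)
Standardized residuals: $z_i = r_i / s$, where $s^2 = \frac{1}{n-p-1}\sum_{i=1}^{n} r_i^2$
Studentized residuals: $r_i^{*} = \dfrac{r_i}{s\sqrt{1 - h_{ii}}}$
Jackknife residuals: $r_{(-i)} = \dfrac{r_i}{s_{(i)}\sqrt{1 - h_{ii}}}$, where $s_{(i)}$ is computed with observation $i$ deleted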
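As an illustration, the three scaled residuals can be computed from a single fit. This sketch assumes the usual conventions: the design matrix has p + 1 columns including the intercept, and the deleted-case variance comes from the standard leave-one-out identity rather than from n separate refits. The function name and simulated data are hypothetical.

```python
import numpy as np

def scaled_residuals(X, y):
    """Standardized, studentized and jackknife residuals for y = X beta + eps."""
    n, k = X.shape                                   # k = p + 1 columns
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    h = np.diag(H)
    r = y - H @ y                                    # ordinary residuals
    s2 = r @ r / (n - k)                             # s^2 = sum r_i^2 / (n - p - 1)
    standardized = r / np.sqrt(s2)
    studentized = r / np.sqrt(s2 * (1 - h))
    # Leave-one-out variance s_(i)^2 via the deletion identity (no refitting)
    s2_deleted = ((n - k) * s2 - r**2 / (1 - h)) / (n - k - 1)
    jackknife = r / np.sqrt(s2_deleted * (1 - h))
    return standardized, studentized, jackknife

# Hypothetical simulated data
rng = np.random.default_rng(1)
n = 20
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 - x + rng.normal(size=n)

standardized, studentized, jackknife = scaled_residuals(X, y)
```

The jackknife version differs from the studentized one only in the variance estimate, which excludes the observation being assessed, so a single gross outlier cannot mask itself by inflating s.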
11 Residuals (cont'd)
How are the residuals used? Graphically:
- Checking normality: Q-Q plots
- Checking model fit: scatter plots
- Checking independence: scatter plots
- Checking for outliers and/or influential observations: leverage plots, plots of Cook's distance, DFBETAS and DFFITS
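One of the influence measures listed above, Cook's distance, can be computed directly from the residuals and leverages. The sketch below uses the standard closed form; the function name, the simulated data and the planted outlier are assumptions made for illustration only.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation in the model y = X beta + eps."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)      # leverages h_ii
    r = y - X @ (XtX_inv @ X.T @ y)                  # ordinary residuals
    s2 = r @ r / (n - k)
    return r**2 * h / (k * s2 * (1 - h) ** 2)

# Hypothetical simulated data with one planted outlier
rng = np.random.default_rng(2)
n = 25
x = rng.normal(size=n)
y = 0.5 + 1.5 * x + rng.normal(scale=0.5, size=n)
y[0] += 6.0
X = np.column_stack([np.ones(n), x])

D = cooks_distance(X, y)
```

Plotting `D` against the observation index gives the kind of index plot referred to on the slide; unusually large values flag observations whose deletion would move the fitted coefficients the most.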
12 Residuals (cont'd)
How are the residuals used? Formal tests based on residuals:
- Normality: Shapiro-Wilk test, Kolmogorov-Smirnov test
- Constant variance (homoscedasticity): White's test
- Independence: Durbin-Watson test
- Outliers and/or influential observations: Wald's test using Cook's distance
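As a small example of a residual-based test statistic, the Durbin-Watson statistic is the ratio of the sum of squared successive differences of the residuals to their sum of squares. The sketch below computes only the statistic, not its critical values, and the function name and toy inputs are illustrative.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Toy residual sequences (hypothetical)
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])   # strong negative autocorrelation
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])        # strong positive autocorrelation
```

Values near 2 are consistent with no first-order autocorrelation, while values towards 0 or 4 point to positive or negative autocorrelation respectively.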
13 Residuals (cont'd)
[Figure: two normal Q-Q plots (sample quantiles against theoretical quantiles), one for data from the normal distribution and one for data from the lognormal distribution.]
14 Residuals (cont'd)
[Figure: normal Q-Q plots and plots of resid(fitsq) against fitted(fitsq) for data from the normal distribution, comparing a wrongly fitted model with a correctly fitted model.]
15 Residuals (cont'd)
[Figure: scatter plots distinguishing one observation from the others, with index plots of DFBETAS and cooks.distance(fitnew)[-180].]
16 2. Residuals in the Analysis of Longitudinal Data
- Residuals are correlated.
- Residuals are not normally distributed.
[Figure: residuals from the analysis of longitudinal data with no systematic component (no effect of time).]
17 Residuals in the Analysis of Longitudinal Data
When there is time dependency and the mean is a function of time, it is not obvious how to use ordinary residuals, obtained as the difference between the observed and fitted values.
[Figure: correctly fitted model and wrongly fitted model.]
19 3. Transformed Residuals
Cholesky decomposition
- Recall: the estimated covariance matrix for the residuals is [equation]
- Consider the Cholesky decomposition [equation]
- Transform the residuals to get [equation] (Fitzmaurice, 2004)
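The slide's formulas did not survive transcription, so the following is only a sketch of the general idea under stated assumptions: a common estimated covariance matrix `sigma_hat` for all individuals, its lower-triangular Cholesky factor L with sigma_hat = L L', and transformed residuals obtained by solving L z = r for each individual. The function name, the AR(1)-type covariance and the simulated residuals are hypothetical.

```python
import numpy as np

def cholesky_transform(residuals, sigma_hat):
    """Decorrelate residual vectors; residuals is (n, p) with one row per individual."""
    L = np.linalg.cholesky(sigma_hat)                # sigma_hat = L @ L.T
    # Solve L z_i = r_i for every individual, i.e. z_i = L^{-1} r_i
    return np.linalg.solve(L, residuals.T).T

# Hypothetical correlated residuals with an AR(1)-type covariance
rng = np.random.default_rng(3)
n, p = 200, 4
rho = 0.7
sigma_true = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
residuals = rng.multivariate_normal(np.zeros(p), sigma_true, size=n)

sigma_hat = residuals.T @ residuals / n              # simple plug-in estimate
transformed = cholesky_transform(residuals, sigma_hat)
```

After the transformation the components are uncorrelated with unit variance with respect to `sigma_hat`, so they can be pooled into a single univariate normal Q-Q plot.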
20 Transformed Residuals (cont'd)
Small's graphical method
- The idea behind Small's graphical approach is to reduce the multivariate data to univariate data.
- Suppose $x_1, x_2, \dots, x_n$ are independently distributed as $N_p(\mu, \Sigma)$; then the statistic [equation] has a Beta distribution with parameters $\alpha = \tfrac{1}{2}p$, $\beta = \tfrac{1}{2}(n-p-1)$, where: [equation]
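The statistic itself is missing from the transcription. The sketch below assumes the usual scaled squared Mahalanobis distance, c_i = n/(n-1)^2 (x_i - xbar)' S^{-1} (x_i - xbar) with S the sample covariance matrix (divisor n - 1), whose reference distribution under multivariate normality is Beta(p/2, (n-p-1)/2). The function name and simulated data are illustrative assumptions.

```python
import numpy as np

def small_statistic(x):
    """Scaled squared Mahalanobis distances; x is (n, p) with one row per observation."""
    n, p = x.shape
    centered = x - x.mean(axis=0)
    S = centered.T @ centered / (n - 1)              # sample covariance matrix
    d2 = np.einsum("ij,ij->i", centered @ np.linalg.inv(S), centered)
    return n / (n - 1) ** 2 * d2

# Hypothetical multivariate normal sample
rng = np.random.default_rng(5)
n, p = 50, 3
x = rng.normal(size=(n, p))

c = np.sort(small_statistic(x))                      # ordered values for a probability plot
```

A beta probability plot then compares the ordered values in `c` with quantiles of the reference Beta distribution, which could be obtained, for example, from SciPy's `scipy.stats.beta.ppf`.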
21 Transformed Residuals (cont'd)
[Figure: normal Q-Q plots of the residuals; multivariate normal data, independent (left) and correlated (right); model correctly fitted.]
[Figure: normal Q-Q plots after Fitzmaurice's transformation; multivariate normal data, independent (left) and correlated (right); model correctly fitted.]
22 Transformed Residuals (cont'd)
[Figure: normal Q-Q plots of the residuals; multivariate normal data, independent (left) and correlated (right); model correctly fitted.]
[Figure: beta probability plots after Small's transformation; multivariate normal data, independent (left) and correlated (right); model correctly fitted.]
23 Transformed Residuals (cont'd)
[Figure: normal Q-Q plots of R; multivariate lognormal data, independent (left) and correlated (right); model correctly fitted.]
[Figure: normal Q-Q plots after Fitzmaurice's transformation; multivariate lognormal data, independent (left) and correlated (right); model correctly fitted.]
24 Transformed Residuals (cont'd)
[Figure: normal Q-Q plots of R; multivariate lognormal data, independent (left) and correlated (right); model correctly fitted.]
[Figure: beta probability plots after Small's transformation; multivariate lognormal data, independent (left) and correlated (right); model correctly fitted.]
25 Transformed Residuals (cont'd)
[Figure: normal Q-Q plot of R; multivariate normal, correlated data; model wrongly fitted.]
[Figure: Fitzmaurice's transformed residuals; multivariate normal, correlated data; model wrongly fitted.]
26 Transformed Residuals (cont'd)
Limitations of the above two transformations in multivariate analysis:
- They are meant for checking distributional assumptions and do not allow assessment of model fit.
- If the model is not properly fitted, they also perform poorly at checking multivariate normality.
- This is particularly important in the analysis of longitudinal data, where a within-individual model has to be prespecified to describe the mean growth or change over time.
- Residuals that allow us to check both the within-individual and the between-individual assumptions are preferable in this situation.
28 4. The GCM and Decomposed Residuals
The Growth Curve Model
- Suppose that we have m different groups and that repeated measurements are taken on each individual at p time points.
- Suppose also that the mean for the i-th group follows a polynomial curve of degree q over time, which can be described as [equation]
- Then the Growth Curve Model (GCM) can be formulated as: $X = ABC + E$
29 The Growth Curve Model
- $A_{p \times (q+1)}$: within-individual design matrix
- $B_{(q+1) \times m}$: parameter matrix
- $C_{m \times n}$: between-individual design matrix
- $X_{p \times n}$: observation matrix, with $n = n_1 + n_2$
30 The Growth Curve Model (cont'd)
Example: Dental measurements were taken on eleven girls and sixteen boys at four different ages (8, 10, 12 and 14). Each measurement is the distance, in millimeters, from the center of the pituitary to the pterygomaxillary fissure.
X = [data matrix]
31 The Growth Curve Model (cont'd)
32 The Growth Curve Model (cont'd)
Objectives
- Should the growth curves be represented by second-degree equations in time (t), or are linear equations adequate?
- Should two separate curves be used for boys and girls, or do both have the same growth curve?
- We may also be interested in estimating the growth curve(s) and obtaining confidence band(s) for the expected growth curve(s).
33 The Growth Curve Model (cont'd)
Example: Glucose data
- A standard glucose tolerance test was administered to 13 control and 20 obese patients.
- Plasma inorganic phosphate measurements were determined from blood samples taken at 0, 0.5, 1, 1.5, 2, 3, 4 and 5 hours after a standard oral dose of glucose.
- The objective of the study was to determine whether there is a significant difference between the control and obese groups of patients.
- A second-degree polynomial is used to model both groups.
34 The Growth Curve Model (cont'd)
The matrices for the model
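The matrices on this slide were not transcribed. Under the description of the glucose example (eight measurement times, a second-degree polynomial, 13 control and 20 obese patients), the design matrices could be built as in the following sketch. The ordering of the groups in the columns of C and the unscaled polynomial basis are assumptions.

```python
import numpy as np

# Measurement times in hours, as listed for the glucose example
times = np.array([0, 0.5, 1, 1.5, 2, 3, 4, 5])
q = 2                                                # second-degree polynomial

# Within-individual design matrix A (p x (q+1)): columns 1, t, t^2
A = np.vander(times, q + 1, increasing=True)

# Between-individual design matrix C (m x n): group indicator rows
n_control, n_obese = 13, 20                          # assumed column order: controls first
C = np.zeros((2, n_control + n_obese))
C[0, :n_control] = 1
C[1, n_control:] = 1
```

With these matrices the parameter matrix B is 3 x 2, one column of polynomial coefficients per group, and the observation matrix X is 8 x 33.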
35 The Growth Curve Model (cont'd)
36 Decomposed residuals (cont'd)
- The maximum likelihood estimator of the parameter matrix B in the GCM is given by Khatri (1966): [equation], where [equation]
- The predicted value is given as [equation]
- Therefore, ordinary residuals can be calculated by [equation]
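The formulas on this slide were lost in transcription. The sketch below assumes the standard form of the estimator for X = ABC + E, namely B_hat = (A' S^{-1} A)^{-1} A' S^{-1} X C' (CC')^{-1} with S = X (I - C'(CC')^{-1} C) X', fitted values A B_hat C and residuals X - A B_hat C. The function name and the simulated data (which only mimic the dimensions of the dental example) are hypothetical.

```python
import numpy as np

def gcm_mle(X, A, C):
    """MLE, fitted values and residuals for X = A B C + E (X is p x n)."""
    n = X.shape[1]
    CCt_inv = np.linalg.inv(C @ C.T)
    P_C = C.T @ CCt_inv @ C                          # projector onto the row space of C
    S = X @ (np.eye(n) - P_C) @ X.T                  # within-group sums of squares and products
    S_inv = np.linalg.inv(S)
    B_hat = np.linalg.inv(A.T @ S_inv @ A) @ A.T @ S_inv @ X @ C.T @ CCt_inv
    fitted = A @ B_hat @ C
    return B_hat, fitted, X - fitted

# Hypothetical simulated data: 4 ages, two groups of 11 and 16, linear growth
rng = np.random.default_rng(4)
ages = np.array([8.0, 10.0, 12.0, 14.0])
A = np.column_stack([np.ones(4), ages])
C = np.zeros((2, 27))
C[0, :11] = 1
C[1, 11:] = 1
B_true = np.array([[17.0, 16.0], [0.5, 0.8]])
X = A @ B_true @ C + rng.normal(scale=1.0, size=(4, 27))

B_hat, fitted, R = gcm_mle(X, A, C)
```

Each column of `B_hat` holds the estimated polynomial coefficients for one group, and `R` is the p x n matrix of ordinary residuals.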
37 Decomposed residuals (cont'd)
Recall the MANOVA model: $X = BC + E$
The MLE of B is $\hat{B} = XC'(CC')^{-1}$
Residuals are therefore given by $R = X\left(I - C'(CC')^{-1}C\right)$
38 Decomposed residuals (cont'd)
$R_1 = (I - P_A)X(I - P_C)$
$R_2 = P_A X(I - P_C)$
$R_3 = (I - P_A)XP_C$
Note that:
- $R_1 + R_2 = X\left(I - C'(CC')^{-1}C\right)$ can be used to check between-individual assumptions such as the normality assumption.
- $R_3 = XC'(CC')^{-1}C - A\hat{B}C$ can be used to check the within-individual assumption, that is, whether the fitted curve adequately represents the change over time.
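The decomposition can be verified numerically. The slide does not define the projectors, so this sketch assumes P_C = C'(CC')^{-1}C and the S-weighted projector P_A = A (A' S^{-1} A)^{-1} A' S^{-1} with S = X (I - P_C) X', under which A B_hat C = P_A X P_C. The function name and simulated data are hypothetical.

```python
import numpy as np

def decomposed_residuals(X, A, C):
    """Return R1, R2, R3 and the ordinary residual R = X - P_A X P_C."""
    p, n = X.shape
    P_C = C.T @ np.linalg.inv(C @ C.T) @ C
    S = X @ (np.eye(n) - P_C) @ X.T
    S_inv = np.linalg.inv(S)
    P_A = A @ np.linalg.inv(A.T @ S_inv @ A) @ A.T @ S_inv   # assumed S-weighted projector
    Q_A, Q_C = np.eye(p) - P_A, np.eye(n) - P_C
    R1 = Q_A @ X @ Q_C
    R2 = P_A @ X @ Q_C
    R3 = Q_A @ X @ P_C          # departure of group means from the fitted curves
    R = X - P_A @ X @ P_C       # ordinary residuals X - A B_hat C
    return R1, R2, R3, R

# Hypothetical simulated data: 4 time points, two groups, linear mean
rng = np.random.default_rng(6)
t = np.array([8.0, 10.0, 12.0, 14.0])
A = np.column_stack([np.ones(4), t])
C = np.zeros((2, 27))
C[0, :11] = 1
C[1, 11:] = 1
X = A @ np.array([[17.0, 16.0], [0.5, 0.8]]) @ C + rng.normal(size=(4, 27))

R1, R2, R3, R = decomposed_residuals(X, A, C)
```

Here `R1 + R2` equals the MANOVA residual X(I - P_C) and does not depend on the chosen polynomial degree, whereas `R3` changes with the within-individual design A, which is why it is the component used to judge the adequacy of the fitted curve.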
39 Decomposed residuals (cont'd)
[Figure: normal Q-Q plots of R; multivariate normal, correlated data; model correctly fitted (left) and wrongly fitted (right).]
[Figure: scatter plots of $R_3$; multivariate normal, correlated data; model correctly fitted (left) and wrongly fitted (right).]
40 Decomposed residuals (cont'd)
[Figure: normal Q-Q plots of R; multivariate normal, correlated data; model correctly fitted (left) and wrongly fitted (right).]
[Figure: scatter plots of $R_3$; multivariate normal, correlated data; model correctly fitted (left) and wrongly fitted (right).]
41 Decomposed residuals (cont'd)
[Figure: normal Q-Q plots of $R_1 + R_2$ after Fitzmaurice's transformation; multivariate normal data; perfectly fitted (left) and misfitted (right).]
[Figure: beta probability plots of $R_1 + R_2$ after Small's transformation; GCM with normal error; perfectly fitted (left) and misfitted (right).]
43 5. Real data application
Recall: Dental data
44 Real data application (cont'd)
[Figure: normal Q-Q plot of $R_1 + R_2$ (left) and scatter plot of $R_3$ (right).]
[Figure: normal Q-Q plot of Fitzmaurice's $R_1 + R_2$ (left) and beta quantile plot of Small's $R_1 + R_2$ (right).]
45 Real data application (cont'd)
Dental data after outliers have been removed
[Figure: normal Q-Q plot of Fitzmaurice's $R_1 + R_2$ (left) and beta quantile plot of Small's $R_1 + R_2$ (right).]
46 Real data application (cont'd)
Recall: Glucose data
47 Real data application (cont'd)
[Figure: normal Q-Q plot of $R_1 + R_2$ (left) and scatter plot of $R_3$ (right).]
[Figure: normal Q-Q plot of Fitzmaurice's $R_1 + R_2$ (left) and beta quantile plot of Small's $R_1 + R_2$ (right).]
48 Real data application (cont'd)
Glucose data without higher order of polynomial fitting
[Figure: scatter plots of decomposed $R_3$ for the quadratic fit (left) and the third-degree fit (right).]
49 6. Discussion
- Residuals play an important role in checking the adequacy of model fit, validating assumptions and identifying outliers and/or influential observations.
- Residuals in the analysis of longitudinal data are correlated and not necessarily normally distributed.
- Both Fitzmaurice's transformation and Small's graphical method successfully removed the correlation structure.
- However, Fitzmaurice's transformation did not perform well when the data were not normally distributed: the transformed residuals led to wrong decisions.
50 Discussion (cont'd)
- Residuals based on the growth curve model provide separate components that are useful for model diagnostics and for checking multivariate normality.
- The scatter plot of $R_3$ is able to identify systematic error in model fitting.
- $R_1 + R_2$ provides a reliable basis for checking the normality assumption.
- The results are consistent for small as well as large sample sizes, and for different covariance structures.
51 Thank you!
More informationSTAT 501 Assignment 2 NAME Spring Chapter 5, and Sections in Johnson & Wichern.
STAT 01 Assignment NAME Spring 00 Reading Assignment: Written Assignment: Chapter, and Sections 6.1-6.3 in Johnson & Wichern. Due Monday, February 1, in class. You should be able to do the first four problems
More information8. Example: Predicting University of New Mexico Enrollment
8. Example: Predicting University of New Mexico Enrollment year (1=1961) 6 7 8 9 10 6000 10000 14000 0 5 10 15 20 25 30 6 7 8 9 10 unem (unemployment rate) hgrad (highschool graduates) 10000 14000 18000
More informationFORECASTING STANDARDS CHECKLIST
FORECASTING STANDARDS CHECKLIST An electronic version of this checklist is available on the Forecasting Principles Web site. PROBLEM 1. Setting Objectives 1.1. Describe decisions that might be affected
More informationModule 6: Model Diagnostics
St@tmaster 02429/MIXED LINEAR MODELS PREPARED BY THE STATISTICS GROUPS AT IMM, DTU AND KU-LIFE Module 6: Model Diagnostics 6.1 Introduction............................... 1 6.2 Linear model diagnostics........................
More informationAnalysing data: regression and correlation S6 and S7
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationappstats27.notebook April 06, 2017
Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves
More informationFrom Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...
From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...
More informationRef.: Spring SOS3003 Applied data analysis for social science Lecture note
SOS3003 Applied data analysis for social science Lecture note 05-2010 Erling Berge Department of sociology and political science NTNU Spring 2010 Erling Berge 2010 1 Literature Regression criticism I Hamilton
More informationUnderstanding the Individual Contributions to Multivariate Outliers in Assessments of Data Quality
Understanding the Individual Contributions to Multivariate Outliers in Assessments of Data Quality Richard C. Zink, Ph.D. Senior Director, Data Management and Statistics TARGET PharmaSolutions Inc. rzink@targetpharmasolutions.com
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1
MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical
More informationStatistics in medicine
Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu
More informationCost analysis of alternative modes of delivery by lognormal regression model
2016; 2(9): 215-219 ISSN Print: 2394-7500 ISSN Online: 2394-5869 Impact Factor: 5.2 IJAR 2016; 2(9): 215-219 www.allresearchjournal.com Received: 02-07-2016 Accepted: 03-08-2016 Vice Principal MVP Samaj
More informationApplied Multivariate Statistical Modeling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur
Applied Multivariate Statistical Modeling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Lecture - 29 Multivariate Linear Regression- Model
More informationSimple Linear Regression
Simple Linear Regression Reading: Hoff Chapter 9 November 4, 2009 Problem Data: Observe pairs (Y i,x i ),i = 1,... n Response or dependent variable Y Predictor or independent variable X GOALS: Exploring
More informationChapter 7, continued: MANOVA
Chapter 7, continued: MANOVA The Multivariate Analysis of Variance (MANOVA) technique extends Hotelling T 2 test that compares two mean vectors to the setting in which there are m 2 groups. We wish to
More informationGlossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
More informationLecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012
Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed
More informationUnit 10: Simple Linear Regression and Correlation
Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for
More informationRegression Models - Introduction
Regression Models - Introduction In regression models, two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent variable,
More informationAn Introduction to Mplus and Path Analysis
An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression
More informationRemedial Measures for Multiple Linear Regression Models
Remedial Measures for Multiple Linear Regression Models Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Remedial Measures for Multiple Linear Regression Models 1 / 25 Outline
More informationAvailable online at (Elixir International Journal) Statistics. Elixir Statistics 49 (2012)
10108 Available online at www.elixirpublishers.com (Elixir International Journal) Statistics Elixir Statistics 49 (2012) 10108-10112 The detention and correction of multicollinearity effects in a multiple
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More information11. Generalized Linear Models: An Introduction
Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the
More informationRegression Analysis By Example
Regression Analysis By Example Third Edition SAMPRIT CHATTERJEE New York University ALI S. HADI Cornell University BERTRAM PRICE Price Associates, Inc. A Wiley-Interscience Publication JOHN WILEY & SONS,
More informationUnit 11: Multiple Linear Regression
Unit 11: Multiple Linear Regression Statistics 571: Statistical Methods Ramón V. León 7/13/2004 Unit 11 - Stat 571 - Ramón V. León 1 Main Application of Multiple Regression Isolating the effect of a variable
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationLecture 01: Introduction
Lecture 01: Introduction Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 01: Introduction
More information