Local Influence and Residual Analysis in Heteroscedastic Symmetrical Linear Models


Francisco José A. Cysneiros
Departamento de Estatística - CCEN, Universidade Federal de Pernambuco, Recife - PE, Brazil, cysneiros@de.ufpe.br

Abstract: This work extends some diagnostic procedures to heteroscedastic symmetrical linear models. This class of models includes all symmetric continuous distributions, such as the normal, Student-t, generalized Student-t, exponential power and logistic, among others. We present an iterative process for parameter estimation and derive the appropriate matrices for assessing local influence under perturbation schemes. A standardized residual is deduced and an illustrative example is given. S-Plus code implementing the method is available at cysneiros/elliptical/heteroscedastic.html.

Keywords: Symmetrical distributions; Local influence; Residuals; Heteroscedastic models; Robust models.

1 Heteroscedastic symmetrical linear models

The problem of modelling variances has been discussed by various authors, particularly in the econometric area. Under normal errors, for instance, Cook and Weisberg (1983) present graphical methods to detect heteroscedasticity, and Smyth (1989) describes a method that allows modelling the dispersion parameter in some generalized linear models. Moving away from normal errors, let $\epsilon_i$, $i = 1, \ldots, n$, be independent random variables with density function of the form

$$f_{\epsilon_i}(\epsilon) = \frac{1}{\sqrt{\phi_i}}\, g\{\epsilon^2/\phi_i\}, \quad \epsilon \in \mathbb{R}, \qquad (1)$$

where $\phi_i > 0$ is the scale parameter and $g : \mathbb{R} \to [0, \infty)$ is such that $\int_0^\infty u^{-1/2} g(u)\, du < \infty$. We shall denote $\epsilon_i \sim S(0, \phi_i)$. The function $g(\cdot)$ is called the density generator (see, for example, Fang, Kotz and Ng, 1990). We consider the linear regression model

$$y_i = \mu_i + \sqrt{\phi_i}\, \epsilon_i, \qquad (2)$$

where $y = (y_1, \ldots, y_n)^T$ contains the observed response values, $\mu_i = x_i^T\beta$, $x_i = (x_{i1}, \ldots, x_{ip})^T$ contains the values of $p$ explanatory variables, $\beta = (\beta_1, \ldots, \beta_p)^T$ and $\epsilon_i \sim S(0, 1)$. We have, when they exist, that $E(Y_i) = \mu_i$ and $\mathrm{Var}(Y_i) = \xi\phi_i$, where $\xi > 0$ is a constant given by $\xi = -2\varphi'(0)$, with $\varphi'(0) = d\varphi(u)/du|_{u=0}$ and $\varphi(\cdot)$ the function such that $\varsigma(t) = e^{it\mu}\varphi(t^2\phi)$, $t \in \mathbb{R}$, where $\varsigma(t) = E(e^{itY})$ is the characteristic function. We call the model defined by (1)-(2) a heteroscedastic symmetrical linear model.
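To make the notation concrete, the short Python/NumPy sketch below (an illustration written for this note, not the S-Plus code mentioned in the abstract; the function names are ours) evaluates the density generators of the normal and Student-t members and checks numerically that the density in (1) integrates to one for a fixed scale $\phi_i$.

```python
# Illustrative sketch (not the author's S-Plus code): density generators g(.)
# of equation (1) and a numerical check that f(eps) = g(eps^2/phi)/sqrt(phi)
# integrates to one over the real line.
import numpy as np
from math import gamma, pi, sqrt
from scipy.integrate import quad

def g_normal(u):
    """Density generator of the normal: g(u) = exp(-u/2)/sqrt(2*pi)."""
    return np.exp(-u / 2.0) / sqrt(2.0 * pi)

def g_student(u, nu=4.0):
    """Density generator of the Student-t with nu degrees of freedom."""
    c = gamma((nu + 1.0) / 2.0) / (gamma(nu / 2.0) * sqrt(nu * pi))
    return c * (1.0 + u / nu) ** (-(nu + 1.0) / 2.0)

def density(eps, phi, g):
    """Symmetric density (1): f(eps) = g(eps^2/phi) / sqrt(phi)."""
    return g(eps ** 2 / phi) / sqrt(phi)

for name, g in [("normal", g_normal), ("Student-t(4)", g_student)]:
    total, _ = quad(lambda e: density(e, phi=2.5, g=g), -np.inf, np.inf)
    print(f"{name}: integral of f over IR = {total:.6f}")  # approximately 1.0
```

Any other member of the class is handled in the same way once its generator $g(\cdot)$ is supplied.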

We assume that the dispersion parameter $\phi_i$ is parameterized as $\phi_i = h(\tau_i)$, where $h(\cdot)$ is a known, one-to-one, continuously differentiable function and $\tau_i = z_i^T\gamma$, where $z_i = (z_{i1}, \ldots, z_{iq})^T$ contains the values of $q$ explanatory variables and $\gamma = (\gamma_1, \ldots, \gamma_q)^T$. The function $h(\cdot)$ is usually called the dispersion link function and must be positive-valued; one possible choice is $h(\tau) = \exp(\tau)$. The dispersion covariates $z_i$ are not necessarily the same as the location covariates $x_i$.

It can be shown that $\beta$ and $\gamma$ are globally orthogonal parameters, so the Fisher information matrix $K$ for $\theta = (\beta^T, \gamma^T)^T$ is block-diagonal, namely $K = \mathrm{diag}\{K_\beta, K_\gamma\}$. The Fisher information matrices for $\beta$ and $\gamma$ are given by $K_\beta = X^T W_1 X$ and $K_\gamma = Z^T W_2 Z$, where $W_1 = \mathrm{diag}\{d_g/\phi_i\}$ and $W_2 = \mathrm{diag}\{(f_g - 1)h_i'^2/(4\phi_i^2)\}$, for $i = 1, \ldots, n$. Here $X$ is the $n \times p$ matrix with rows $x_i^T$, $Z$ is the $n \times q$ matrix with rows $z_i^T$, $v_i = -2W_g(u_i)$, $u_i = (y_i - \mu_i)^2/\phi_i$, $W_g(u) = g'(u)/g(u)$, $g'(u) = dg(u)/du$, $h_i' = dh(\tau_i)/d\tau_i$, $d_g = 4E\{W_g^2(U^2)U^2\}$ and $f_g = 4E\{W_g^2(U^2)U^4\}$ with $U \sim S(0, 1)$. For example, for the Student-t distribution with $\nu$ degrees of freedom one has $d_g = (\nu + 1)/(\nu + 3)$ and $f_g = 3(\nu + 1)/(\nu + 3)$.

An iterative process to obtain the maximum likelihood estimates of $\beta$ and $\gamma$ may be developed by using, for example, the Fisher scoring method, which leads to the following system of equations:

$$X^T W_1^{(k)} X\,\beta^{(k+1)} = X^T W_1^{(k)} z_\beta^{(k)} \quad \text{and} \quad Z^T W_2^{(k)} Z\,\gamma^{(k+1)} = Z^T W_2^{(k)} z_\gamma^{(k)},$$

where $z_\beta$ and $z_\gamma$ are $n \times 1$ vectors whose components take the forms $z_{\beta i} = \mu_i + v_i(y_i - \mu_i)/d_g$ and $z_{\gamma i} = \tau_i + 2\phi_i(v_i u_i - 1)/\{(f_g - 1)h_i'\}$.
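The sketch below illustrates this scoring cycle for Student-t errors with fixed $\nu$ and the log dispersion link $h(\tau) = \exp(\tau)$. It is a minimal Python/NumPy illustration under those assumptions (the S-Plus code cited in the abstract is the reference implementation); the function name, starting values, tolerance and synthetic data are our own choices.

```python
# Minimal sketch of the Fisher scoring cycle for Student-t errors with fixed
# nu and dispersion link h(tau) = exp(tau).  Not the author's S-Plus code.
import numpy as np

def fit_het_student_t(y, X, Z, nu=4.0, tol=1e-8, max_iter=200):
    n = len(y)
    d_g = (nu + 1.0) / (nu + 3.0)          # d_g for the Student-t
    f_g = 3.0 * (nu + 1.0) / (nu + 3.0)    # f_g for the Student-t

    # Starting values: OLS for beta, log squared residuals regressed on Z for gamma.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    res2 = (y - X @ beta) ** 2
    gamma = np.linalg.lstsq(Z, np.log(res2 + 1e-8), rcond=None)[0]

    for _ in range(max_iter):
        mu = X @ beta
        tau = Z @ gamma
        phi = np.exp(tau)                  # h(tau) = exp(tau), so h'(tau_i) = phi_i
        u = (y - mu) ** 2 / phi
        v = (nu + 1.0) / (nu + u)          # v_i = -2 W_g(u_i) for the Student-t

        # Working responses and weights of the two scoring equations.
        z_beta = mu + v * (y - mu) / d_g
        w1 = d_g / phi
        z_gamma = tau + 2.0 * (v * u - 1.0) / (f_g - 1.0)  # h'(tau) = phi cancels
        w2 = np.full(n, (f_g - 1.0) / 4.0)                 # (f_g-1) h'^2 / (4 phi^2)

        beta_new = np.linalg.solve(X.T @ (w1[:, None] * X), X.T @ (w1 * z_beta))
        gamma_new = np.linalg.solve(Z.T @ (w2[:, None] * Z), Z.T @ (w2 * z_gamma))

        change = max(np.max(np.abs(beta_new - beta)), np.max(np.abs(gamma_new - gamma)))
        beta, gamma = beta_new, gamma_new
        if change < tol:
            break
    return beta, gamma

# Small synthetic illustration.
rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), x2])
phi_true = np.exp(-1.0 + 0.5 * x2)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + np.sqrt(phi_true) * rng.standard_t(4, size=n)
print(fit_het_student_t(y, X, Z, nu=4.0))
```

Because $\beta$ and $\gamma$ are globally orthogonal, the two weighted least-squares updates can simply be alternated within each cycle, as the loop above does.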

2 Local influence

The idea behind local influence is to study the behaviour of some influence measure around the vector of no perturbation, $\omega_0$. For example, if the likelihood displacement $LD(\omega) = 2\{L(\hat\theta) - L(\hat\theta_\omega)\}$ is used, where $\hat\theta_\omega$ denotes the maximum likelihood estimate under the perturbed model, the suggestion of Cook (1986) is to investigate the normal curvature of the lifted line $LD(\omega_0 + a\ell)$, where $a \in \mathbb{R}$, around $a = 0$ for an arbitrary direction $\ell$ with $\|\ell\| = 1$. He shows that the normal curvature may be expressed in the general form $C_\ell(\theta) = 2|\ell^T\Delta^T\ddot L_{\theta\theta}^{-1}\Delta\,\ell|$, where $\ddot L_{\theta\theta} = \partial^2 L(\theta)/\partial\theta\,\partial\theta^T$ and $\Delta$ is a $(p+q) \times s$ matrix with elements $\Delta_{ij} = \partial^2 L(\theta|\omega)/\partial\theta_i\,\partial\omega_j$, $i = 1, \ldots, p+q$ and $j = 1, \ldots, s$, with all quantities evaluated at $\hat\theta$ and $\omega_0$.

Lesaffre and Verbeke (1998) suggest evaluating the normal curvature in the direction of the $i$th observation, which consists in evaluating $C_\ell(\theta)$ at the $n \times 1$ vector $\ell_i$ formed by zeros with a one at the $i$th position; we denote the resulting curvature by $C_i$. Paula et al. (2003) discuss some diagnostic procedures in homoscedastic symmetrical nonlinear regression models.

Suppose the log-likelihood function for $\theta$ is expressed as $L(\theta|\omega) = \sum_{i=1}^n \omega_i \log\{g(u_i)/\sqrt{\phi_i}\}$, where $0 \le \omega_i \le 1$ is a case weight. Under this perturbation scheme the matrix $\Delta$ takes the form $\Delta = [D(g)D(e)X, D(m)Z]^T$, where $D(g) = \mathrm{diag}\{g_1, \ldots, g_n\}$ with $g_i = v_i/\phi_i$, $D(m) = \mathrm{diag}\{m_1, \ldots, m_n\}$ with $m_i = h_i'(v_i u_i - 1)/(2\phi_i)$, and $D(e) = \mathrm{diag}\{e_1, \ldots, e_n\}$ with $e_i = y_i - \mu_i$.

3 Local influence on predictions

Let $q$ be a $p \times 1$ vector of explanatory variable values, for which we do not necessarily have an observed response. The prediction at $q$ is $\hat\mu(q) = \sum_{j=1}^p q_j\hat\beta_j$. Analogously, the point prediction at $q$ based on the perturbed model becomes $\hat\mu(q, \omega) = \sum_{j=1}^p q_j\hat\beta_{j\omega}$, where $\hat\beta_\omega = (\hat\beta_{1\omega}, \ldots, \hat\beta_{p\omega})^T$ denotes the maximum likelihood estimate from the perturbed model. Thomas and Cook (1990) investigated the effect of small perturbations on predictions at some particular point $q$ in continuous generalized linear models. The objective function $f(q, \omega) = \{\hat\mu(q) - \hat\mu(q, \omega)\}^2$ was chosen because of its simplicity and its invariance with respect to changes of scale. The normal curvature at the unit direction $\ell$ takes, in this case, the form $C_\ell = \ell^T\ddot f\,\ell$, where $\ddot f = \partial^2 f/\partial\omega\,\partial\omega^T = 2\Delta^T\ddot L_{\beta\beta}^{-1}qq^T\ddot L_{\beta\beta}^{-1}\Delta$, evaluated at $\omega_0$ and $\hat\beta$. One has that $\ell_{\max}(q) \propto \Delta^T\ddot L_{\beta\beta}^{-1}q$.

Consider an additive perturbation on the $i$th response, namely $y_{i\omega} = y_i + \omega_i s_i$, where $s_i$ may be an estimate of the standard deviation of $y_i$ and $\omega_i \in \mathbb{R}$. Then the matrix $\Delta$ equals $X^T D(a)D(s)$, where $D(s) = \mathrm{diag}\{s_1, \ldots, s_n\}$ and $D(a) = \mathrm{diag}\{a_1, \ldots, a_n\}$ with $a_i = \{v_i - 4W_g'(u_i)u_i\}/\phi_i$ and $W_g'(u) = dW_g(u)/du$. The vector $\ell_{\max}(q)$ is constructed here by taking $q = x_i$, which leads to the $n \times 1$ vector $\ell_{\max}(x_i) \propto D(s)D(a)X\{X^T D(a)X\}^{-1}x_i$. A large value of $|\ell_{\max i}(x_i)|$ indicates that the $i$th observation has substantial local influence on its own fitted value $\hat y_i$. The suggestion is therefore to plot the $n \times 1$ vector $(\ell_{\max 1}(x_1), \ldots, \ell_{\max n}(x_n))^T$ in order to identify observations with high influence on their own fitted values.
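As a rough computational companion to Sections 2 and 3, the sketch below evaluates the total local influence $C_i = C_{\ell_i}(\theta)$ under the case-weight scheme, reusing the Student-t fit from the previous sketch. As a simplification adopted only here, $-\ddot L_{\theta\theta}$ is replaced by the block-diagonal expected information $K = \mathrm{diag}\{K_\beta, K_\gamma\}$ evaluated at the estimates; the function name and this substitution are ours, not part of the method as stated above.

```python
# Hypothetical continuation of the fitting sketch: C_i under the case-weight
# perturbation, with -L''_{theta theta} approximated by the block-diagonal
# expected information K = diag{K_beta, K_gamma}.
import numpy as np

def case_weight_Ci(y, X, Z, beta, gamma, nu=4.0):
    d_g = (nu + 1.0) / (nu + 3.0)
    f_g = 3.0 * (nu + 1.0) / (nu + 3.0)

    mu, tau = X @ beta, Z @ gamma
    phi = np.exp(tau)                      # h(tau) = exp(tau), so h'(tau_i) = phi_i
    e = y - mu
    u = e ** 2 / phi
    v = (nu + 1.0) / (nu + u)              # v_i = -2 W_g(u_i) for the Student-t

    # Delta = [D(g)D(e)X, D(m)Z]^T, of order (p+q) x n.
    g_i = v / phi
    m_i = (v * u - 1.0) / 2.0              # h'_i/(2 phi_i) = 1/2 for the exp link
    Delta = np.vstack([X.T * (g_i * e), Z.T * m_i])

    # Block-diagonal expected information K = diag{K_beta, K_gamma}.
    p, q = X.shape[1], Z.shape[1]
    K = np.zeros((p + q, p + q))
    K[:p, :p] = X.T @ ((d_g / phi)[:, None] * X)
    K[p:, p:] = Z.T @ (((f_g - 1.0) / 4.0) * Z)

    # C_i = 2 |Delta_i^T K^{-1} Delta_i|, with Delta_i the i-th column of Delta.
    KinvD = np.linalg.solve(K, Delta)
    return 2.0 * np.abs(np.sum(Delta * KinvD, axis=0))

# Usage with the objects from the fitting sketch (index plot of C_i):
# beta_hat, gamma_hat = fit_het_student_t(y, X, Z, nu=4.0)
# C = case_weight_Ci(y, X, Z, beta_hat, gamma_hat, nu=4.0)
```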

4 Residuals

Because we have a symmetrical class of errors, it is reasonable to base a residual analysis on the ordinary residual $r_i = y_i - \hat y_i$. A standardized version of $r_i$ may be obtained from the expansions up to order $n^{-1}$ of Cox and Snell (1968). After some algebraic manipulation we find that $E(r) \approx 0$ and

$$\mathrm{Var}(r) \approx \xi\,\Phi^{1/2}\{I_n - (d_g\xi)^{-1}H\}\Phi^{1/2},$$

where $H = \Phi^{-1/2}X(X^T\Phi^{-1}X)^{-1}X^T\Phi^{-1/2}$, $\Phi = \mathrm{diag}\{\phi_1, \ldots, \phi_n\}$ and $I_n$ is the identity matrix of order $n$. Therefore, a standardized form of $r_i$ is given by

$$t_{r_i} = \frac{y_i - \hat y_i}{\sqrt{\hat\phi_i\,\xi\{1 - (d_g\xi)^{-1}\hat h_{ii}\}}}.$$

Simulation studies omitted here indicate that $t_{r_i}$ has mean approximately zero, variance exceeding one, negligible skewness and some kurtosis.

5 Application

To illustrate an application we consider the data set described in Montgomery et al. (2001, Table 3.2). The interest is in predicting the amount of time required by the route driver to service the vending machines in an outlet. The service activity includes stocking the machine with beverage products and minor maintenance or housekeeping. Montgomery et al. (2001) fitted a homoscedastic linear regression model with intercept in which the response variable was the delivery time $y$ (in minutes) and the covariates were the number of cases of product stocked ($x_1$) and the distance walked by the route driver ($x_2$), in a sample of 25 observations. In their diagnostic analysis, observations 9 and 22 appear to have large effects on the parameter estimates (see Montgomery et al., 2001). We propose to fit heteroscedastic linear models under error distributions with heavier tails than the normal, namely

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \sqrt{\phi_i}\,\epsilon_i, \quad i = 1, \ldots, 25, \qquad (3)$$

with $\phi_i = \exp\{\alpha + \gamma x_{i2}\}$ and $\epsilon_i \sim S(0, 1)$ mutually independent errors. We tried various error distributions, but only two models seem to fit the data as well as or better than the normal model: the Student-t (with fixed degrees of freedom) and the logistic-II models. The generated envelopes for the three postulated models do not present any unusual features (see Figure 1). Figure 1 also presents the index plots of $C_i$ under normal, Student-t and logistic-II errors. The influential observations appear in the Student-t model with smaller $C_i$ values than in the normal and logistic-II models.
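A sketch of the standardized residual $t_{r_i}$ for the Student-t fit of model (3) is given below, assuming $\nu > 2$ so that $\xi = \nu/(\nu - 2)$, and reusing the estimates produced by the earlier fitting sketch. The delivery-time arrays are indicated only as placeholders; they are not reproduced here.

```python
# Hypothetical sketch of the standardized residual t_ri of Section 4 for a
# Student-t fit, reusing beta_hat and gamma_hat from the fitting sketch.
import numpy as np

def standardized_residuals(y, X, Z, beta, gamma, nu=4.0):
    d_g = (nu + 1.0) / (nu + 3.0)
    xi = nu / (nu - 2.0)                   # Var of a standard Student-t error, nu > 2
    mu = X @ beta
    phi = np.exp(Z @ gamma)

    # Generalized hat matrix H = Phi^{-1/2} X (X^T Phi^{-1} X)^{-1} X^T Phi^{-1/2}.
    Xw = X / np.sqrt(phi)[:, None]
    H = Xw @ np.linalg.solve(Xw.T @ Xw, Xw.T)
    h = np.diag(H)

    return (y - mu) / np.sqrt(phi * xi * (1.0 - h / (d_g * xi)))

# Design matrices for model (3) would be built as X = [1, x1, x2] and
# Z = [1, x2]; the delivery-data arrays below are placeholders only.
# x1, x2, y = ...  # cases stocked, distance walked, delivery time (not shown here)
# X = np.column_stack([np.ones(len(y)), x1, x2])
# Z = np.column_stack([np.ones(len(y)), x2])
# t_r = standardized_residuals(y, X, Z, beta_hat, gamma_hat, nu=4.0)
```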

FIGURE 1. Envelopes and index plots of $C_i$ under the normal (left), Student-t (middle) and logistic-II (right) models fitted to the delivery data.

Acknowledgments: The author received financial support from CNPq, Brazil.

References

Cook, R.D. (1986). Assessment of local influence (with discussion). Journal of the Royal Statistical Society B, 48, 133-169.

Cook, R.D. and Weisberg, S. (1983). Diagnostics for heteroscedasticity in regression. Biometrika, 70, 1-10.

Cox, D.R. and Snell, E.J. (1968). A general definition of residuals. Journal of the Royal Statistical Society B, 30, 248-275.

Fang, K.T., Kotz, S. and Ng, K.W. (1990). Symmetric Multivariate and Related Distributions. London: Chapman & Hall.

Lesaffre, E. and Verbeke, G. (1998). Local influence in linear mixed models. Biometrics, 54, 570-582.

Montgomery, D.C., Peck, E.A. and Vining, G.G. (2001). Introduction to Linear Regression Analysis, 3rd ed. New York: Wiley.

Paula, G.A., Cysneiros, F.J.A. and Galea, M. (2003). Local influence and leverage in elliptical nonlinear regression models. In: Proceedings of the 18th International Workshop on Statistical Modelling, Verbeke, G., Molenberghs, G., Aerts, M. and Fieuws, S. (Eds). Leuven: Katholieke Universiteit Leuven.

Smyth, G.K. (1989). Generalized linear models with varying dispersion. Journal of the Royal Statistical Society B, 51, 47-60.

Thomas, W. and Cook, R.D. (1990). Assessing influence on predictions from generalized linear models. Technometrics, 32, 59-65.
