Local Influence and Residual Analysis in Heteroscedastic Symmetrical Linear Models


Francisco José A. Cysneiros
Departamento de Estatística - CCEN, Universidade Federal de Pernambuco, Recife - PE 5079-50 - Brazil, e-mail: cysneiros@de.ufpe.br

Abstract: This work extends some diagnostic procedures to heteroscedastic symmetrical linear models. This class of models includes all symmetric continuous distributions, such as the normal, Student-t, generalized Student-t, exponential power and logistic, among others. We present an iterative process for parameter estimation and derive the appropriate matrices for assessing local influence under perturbation schemes. A standardized residual is deduced and an illustrative example is given. S-Plus code implementing the author's method is available at www.de.ufpe.br/cysneiros/elliptical/heteroscedastic.html.

Keywords: Symmetrical distributions; Local influence; Residuals; Heteroscedastic models; Robust models.

1 Heteroscedastic symmetrical linear models

The problem of modelling variances has been discussed by various authors, particularly in the econometric area. Under normal errors, for instance, Cook and Weisberg (1983) present some graphical methods to detect heteroscedasticity. Smyth (1989) describes a method which allows modelling the dispersion parameter in some generalized linear models.

Moving away from normal errors, let $\epsilon_i$, $i = 1, \ldots, n$, be independent random variables with density function of the form

$$f_{\epsilon_i}(\epsilon) = \frac{1}{\sqrt{\phi_i}}\, g\{\epsilon^2/\phi_i\}, \quad \epsilon \in \mathbb{R}, \qquad (1)$$

where $\phi_i > 0$ is the scale parameter and $g : \mathbb{R} \to [0, \infty)$ is such that $\int_0^\infty g(u)\,du < \infty$. We shall denote $\epsilon_i \sim S(0, \phi_i)$. The function $g(\cdot)$ is called the density generator (see, for example, Fang, Kotz and Ng, 1990). We consider the linear regression model

$$y_i = \mu_i + \sqrt{\phi_i}\,\epsilon_i, \qquad (2)$$

where $y = (y_1, \ldots, y_n)^T$ are the observed response values, $\mu_i = x_i^T\beta$, $x_i = (x_{i1}, \ldots, x_{ip})^T$ contains the values of $p$ explanatory variables, $\beta = (\beta_1, \ldots, \beta_p)^T$ and $\epsilon_i \sim S(0, 1)$. We have, when they exist, that $E(Y_i) = \mu_i$ and $\mathrm{Var}(Y_i) = \xi\phi_i$, where $\xi > 0$ is a constant given by $\xi = -2\varphi'(0)$, with $\varphi'(0) = d\varphi(u)/du|_{u=0}$ and $\varphi(\cdot)$ a function such that $\varsigma(t) = e^{it\mu}\varphi(t^2\phi)$, $t \in \mathbb{R}$, where $\varsigma(t) = E(e^{itY})$ is the characteristic function. We call the model defined by (1)-(2) a heteroscedastic symmetrical linear model.

We assume that the dispersion parameter $\phi_i$ is parameterized as $\phi_i = h(\tau_i)$, where $h(\cdot)$ is a known one-to-one continuously differentiable function and $\tau_i = z_i^T\gamma$, where $z_i = (z_{i1}, \ldots, z_{iq})^T$ contains the values of $q$ explanatory variables and $\gamma = (\gamma_1, \ldots, \gamma_q)^T$. The function $h(\cdot)$ is usually called the dispersion link function and must be positive-valued; one possible choice is $h(\tau) = \exp(\tau)$. The dispersion covariates $z_i$ are not necessarily the same as the location covariates $x_i$.

It can be shown that $\beta$ and $\gamma$ are globally orthogonal parameters, so the Fisher information matrix $K$ for $\theta = (\beta^T, \gamma^T)^T$ is block-diagonal, namely $K = \mathrm{diag}\{K_\beta, K_\gamma\}$. The Fisher information matrices for $\beta$ and $\gamma$ are $K_\beta = X^T W_1 X$ and $K_\gamma = Z^T W_2 Z$, where $W_1 = \mathrm{diag}\{d_g/\phi_i\}$ and $W_2 = \mathrm{diag}\{(f_g - 1)h_i'^2/(4\phi_i^2)\}$, $i = 1, \ldots, n$. Here $X$ is the $n \times p$ matrix with rows $x_i^T$, $Z$ is the $n \times q$ matrix with rows $z_i^T$, $v_i = -2W_g(u_i)$, $u_i = (y_i - \mu_i)^2/\phi_i$, $W_g(u) = g'(u)/g(u)$, $g'(u) = dg(u)/du$ and $h_i' = dh(\tau_i)/d\tau_i$.

An iterative process to obtain the maximum likelihood estimates of $\beta$ and $\gamma$ may be developed by using, for example, Fisher scoring, which leads to the system of equations

$$X^T W_1^{(k)} X \beta^{(k+1)} = X^T W_1^{(k)} z_\beta^{(k)} \quad \text{and} \quad Z^T W_2^{(k)} Z \gamma^{(k+1)} = Z^T W_2^{(k)} z_\gamma^{(k)},$$

where $z_\beta$ and $z_\gamma$ are $n \times 1$ vectors whose components take the forms $z_{\beta i} = \mu_i + (v_i/d_g)(y_i - \mu_i)$ and $z_{\gamma i} = \tau_i + \frac{2\phi_i}{(f_g - 1)h_i'}(v_i u_i - 1)$, with $d_g = 4E\{W_g^2(U^2)U^2\}$ and $f_g = 4E\{W_g^2(U^2)U^4\}$, $U \sim S(0, 1)$. For example, for the Student-t distribution with $\nu$ degrees of freedom one has $d_g = (\nu+1)/(\nu+3)$ and $f_g = 3(\nu+1)/(\nu+3)$.
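As a concrete illustration, the Fisher scoring iteration of Section 1 can be sketched numerically for the Student-t case with log dispersion link $h(\tau) = \exp(\tau)$. This is a minimal sketch, not the author's S-Plus implementation: the simulated data, function name and all variable names are illustrative, and the constants $d_g$, $f_g$ use the Student-t values quoted above.

```python
import numpy as np
from scipy import stats

def fit_hetero_t(y, X, Z, nu=4, max_iter=200, tol=1e-8):
    """Fisher scoring for a heteroscedastic symmetric linear model with
    Student-t errors and log dispersion link phi_i = exp(z_i' gamma).
    Sketch following Section 1; names are illustrative."""
    n, q = X.shape[0], Z.shape[1]
    d_g = (nu + 1.0) / (nu + 3.0)            # Student-t value of d_g
    f_g = 3.0 * (nu + 1.0) / (nu + 3.0)      # Student-t value of f_g
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares start
    gamma = np.zeros(q)
    for _ in range(max_iter):
        mu, tau = X @ beta, Z @ gamma
        phi = np.exp(tau)                    # h(tau) = exp(tau), so h'(tau) = phi
        u = (y - mu) ** 2 / phi
        v = (nu + 1.0) / (nu + u)            # v_i = -2 W_g(u_i) for Student-t
        # weighted least-squares step for beta
        w1 = d_g / phi
        z_beta = mu + (v / d_g) * (y - mu)
        beta_new = np.linalg.solve(X.T @ (w1[:, None] * X), X.T @ (w1 * z_beta))
        # weighted least-squares step for gamma (the factor h' = phi cancels)
        w2 = np.full(n, (f_g - 1.0) / 4.0)
        z_gamma = tau + 2.0 * (v * u - 1.0) / (f_g - 1.0)
        gamma_new = np.linalg.solve(Z.T @ (w2[:, None] * Z), Z.T @ (w2 * z_gamma))
        delta = max(np.max(np.abs(beta_new - beta)),
                    np.max(np.abs(gamma_new - gamma)))
        beta, gamma = beta_new, gamma_new
        if delta < tol:
            break
    return beta, gamma

# simulated example (illustrative values only)
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, gamma_true = np.array([1.0, 2.0]), np.array([0.0, 0.5])
phi = np.exp(Z @ gamma_true)
y = X @ beta_true + np.sqrt(phi) * stats.t.rvs(df=4, size=n, random_state=rng)
beta_hat, gamma_hat = fit_hetero_t(y, X, Z)
```

With a moderate sample the recovered estimates land close to the generating values, reflecting the orthogonality of $\beta$ and $\gamma$ noted above.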
2 Local influence

The idea behind local influence is to study the behaviour of some influence measure around the vector of no perturbation $\omega_0$. For example, if the likelihood displacement $LD(\omega) = 2\{L(\hat\theta) - L(\hat\theta_\omega)\}$ is used, where $\hat\theta_\omega$ denotes the maximum likelihood estimate under the perturbed model, the suggestion of Cook (1986) is to investigate the normal curvature of the lifted line $LD(\omega_0 + a l)$, $a \in \mathbb{R}$, around $a = 0$ for an arbitrary unit direction $l$, $\|l\| = 1$. He shows that the normal curvature may be expressed in the general form

$$C_l(\theta) = 2\,|l^T \Delta^T \ddot{L}_{\theta\theta}^{-1} \Delta\, l|,$$

where $\Delta$ is the $(p+q) \times s$ matrix with elements $\Delta_{ij} = \partial^2 L(\theta|\omega)/\partial\theta_i \partial\omega_j$, $i = 1, \ldots, p+q$, $j = 1, \ldots, s$, and $\ddot{L}_{\theta\theta}$ is the Hessian of the log-likelihood, with all quantities evaluated at $\hat\theta$.
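Given $\Delta$ and $\ddot{L}_{\theta\theta}$, both the curvatures and the direction of maximal curvature reduce to an eigenproblem. The following generic numpy sketch (function name hypothetical) computes them:

```python
import numpy as np

def local_influence(Delta, Lddot):
    """Normal curvature of Cook (1986): C_l = 2 |l' Delta' Lddot^{-1} Delta l|.
    Returns the curvatures C_i in the coordinate directions l_i and the
    direction l_max attaining the largest curvature.
    Delta: (p+q) x s cross-derivative matrix; Lddot: Hessian at the MLE."""
    B = Delta.T @ np.linalg.solve(Lddot, Delta)   # s x s matrix Delta' Lddot^{-1} Delta
    B = 0.5 * (B + B.T)                           # symmetrize against round-off
    C = 2.0 * np.abs(np.diag(B))                  # curvature at each unit vector l_i
    w, V = np.linalg.eigh(B)
    l_max = V[:, np.argmax(np.abs(w))]            # eigenvector of largest |eigenvalue|
    return C, l_max

# toy check: with Lddot = -I the curvature is 2 l' Delta'Delta l
Delta = np.array([[1.0, 0.0, 2.0],
                  [0.0, 1.0, 0.0]])
C, l_max = local_influence(Delta, -np.eye(2))
```

Plotting `C` against the observation index and inspecting the large components of `l_max` is the usual way these quantities are read.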

Lesaffre and Verbeke (1998) suggest evaluating the normal curvature in the direction of the $i$th observation, which consists in evaluating $C_l(\theta)$ at the $n \times 1$ vector $l_i$ formed by zeros with a one in the $i$th position. Paula et al. (2003) discuss some diagnostic procedures in homoscedastic symmetrical nonlinear regression models.

Suppose the log-likelihood function for $\theta$ is expressed as $L(\theta|\omega) = \sum_{i=1}^n \omega_i \log\{g(u_i)/\sqrt{\phi_i}\}$, where $0 \le \omega_i \le 1$ is a case weight. Under this perturbation scheme the matrix $\Delta$ takes the form

$$\Delta = [D(g)D(e)X,\; D(m)Z]^T,$$

where $D(g) = \mathrm{diag}\{g_1, \ldots, g_n\}$ with $g_i = v_i/\phi_i$, $D(m) = \mathrm{diag}\{m_1, \ldots, m_n\}$ with $m_i = \frac{h_i'}{2\phi_i}(v_i u_i - 1)$, and $D(e) = \mathrm{diag}\{e_1, \ldots, e_n\}$ with $e_i = y_i - \mu_i$.

3 Local influence on predictions

Let $q$ be a $p \times 1$ vector of explanatory variable values, for which we do not necessarily have an observed response. The prediction at $q$ is $\hat\mu(q) = \sum_{j=1}^p q_j \hat\beta_j$. Analogously, the point prediction at $q$ based on the perturbed model becomes $\hat\mu(q, \omega) = \sum_{j=1}^p q_j \hat\beta_{j\omega}$, where $\hat\beta_\omega = (\hat\beta_{1\omega}, \ldots, \hat\beta_{p\omega})^T$ denotes the maximum likelihood estimate from the perturbed model. Thomas and Cook (1990) investigated the effect of small perturbations on predictions at some particular point $q$ in continuous generalized linear models. The objective function $f(q, \omega) = \{\hat\mu(q) - \hat\mu(q, \omega)\}^2$ was chosen for its simplicity and invariance with respect to scale changes. The normal curvature at the unit direction $l$ takes, in this case, the form $C_l = |l^T \ddot{f} l|$, where

$$\ddot{f} = \partial^2 f/\partial\omega\,\partial\omega^T = 2\,\Delta^T \ddot{L}_{\beta\beta}^{-1} q\, q^T \ddot{L}_{\beta\beta}^{-1} \Delta$$

is evaluated at $\omega_0$ and $\hat\beta$. One has that $l_{\max}(q) \propto \Delta^T \ddot{L}_{\beta\beta}^{-1} q$.

Consider an additive perturbation on the $i$th response, namely $y_{i\omega} = y_i + \omega_i s_i$, where $s_i$ may be an estimate of the standard deviation of $y_i$ and $\omega_i \in \mathbb{R}$. Then the matrix $\Delta$ equals $X^T D(a)D(s)$, where $D(s) = \mathrm{diag}\{s_1, \ldots, s_n\}$, $D(a) = \mathrm{diag}\{a_1, \ldots, a_n\}$ and $a_i = \frac{1}{\phi_i}\{v_i - 4W_g'(u_i)u_i\}$, with $W_g'(u) = dW_g(u)/du$.
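The direction $l_{\max}(q) \propto \Delta^T \ddot{L}_{\beta\beta}^{-1} q$ under this additive perturbation can be sketched as follows. The sketch is written for the normal-error special case, where the weight $a_i$ reduces to $1/\phi_i$; the function name and example values are illustrative only.

```python
import numpy as np

def lmax_prediction(X, phi, s, q):
    """l_max(q) proportional to Delta' Lddot_bb^{-1} q for prediction influence
    under the additive perturbation y_i -> y_i + w_i s_i (Thomas and Cook, 1990).
    Normal-error special case, where the weight a_i reduces to 1/phi_i."""
    a = 1.0 / phi                                  # a_i for normal errors
    M = X.T @ (a[:, None] * X)                     # X' D(a) X
    l = (s * a) * (X @ np.linalg.solve(M, q))      # D(s) D(a) X M^{-1} q
    return l / np.linalg.norm(l)                   # unit-norm direction

# toy example with unit scales and s_i = 1
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
l = lmax_prediction(X, np.ones(3), np.ones(3), np.array([1.0, 0.0]))
```

In the homoscedastic normal case this reduces to projecting $q$ through the ordinary hat matrix, which is the familiar least-squares result.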
The vector $l_{\max}(q)$ is constructed here by taking $q = x_i$, which corresponds to the $n \times 1$ vector

$$l_{\max}(x_i) \propto D(s)D(a)X\{X^T D(a)X\}^{-1}x_i.$$

A large value of the $i$th element $l_{\max_i}(x_i)$ indicates that the $i$th observation has substantial local influence on its own fitted value $\hat y_i$. The suggestion is then to plot the $n \times 1$ vector $(l_{\max_1}(x_1), \ldots, l_{\max_n}(x_n))^T$ in order to identify those observations with high influence on their own fitted values.

4 Residuals

Because we have a symmetrical class of errors it is reasonable to base residual analysis on the residual $r_i = y_i - \hat y_i$. A standardized version of $r_i$ may be obtained by using expansions up to order $n^{-1}$ due to Cox and Snell (1968). After some algebraic manipulation we find that $E(r) = 0$ and

$$\mathrm{Var}(r) = \xi\,\Phi^{1/2}\{I_n - (d_g\xi)^{-1}H\}\Phi^{1/2},$$

where $H = \Phi^{-1/2}X(X^T\Phi^{-1}X)^{-1}X^T\Phi^{-1/2}$, $\Phi = \mathrm{diag}\{\phi_1, \ldots, \phi_n\}$ and $I_n$ is the identity matrix of order $n$. Therefore, a standardized form of $r_i$ is given by

$$t_{r_i} = \frac{y_i - \hat y_i}{\sqrt{\hat\phi_i\,\xi\{1 - (d_g\xi)^{-1}\hat h_{ii}\}}}.$$

Simulation studies omitted here indicate that $t_{r_i}$ has mean approximately zero, variance exceeding one, negligible skewness and some kurtosis.

5 Application

To illustrate an application we consider the data set described in Montgomery et al. (2001, Table 3.2). The interest is in predicting the amount of time required by a route driver to service vending machines in an outlet. The service activity includes stocking the machine with beverage products and minor maintenance or housekeeping. They fitted a homoscedastic linear regression model with intercept, where the response variable was the delivery time $y$ (min) and the covariates were the number of cases of product stocked ($x_1$) and the distance walked by the route driver ($x_2$), in a sample of 25 observations. In their diagnostic analysis, points 9 and 22 appear with large effects on the parameter estimates (see Montgomery et al., 2001). We propose to fit heteroscedastic linear models under error distributions with heavier tails than the normal, namely

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \sqrt{\phi_i}\,\epsilon_i, \quad i = 1, \ldots, 25, \qquad (3)$$

with $\phi_i = \exp\{\alpha + \gamma x_{i2}\}$ and $\epsilon_i \sim S(0, 1)$ mutually independent errors. We tried various error distributions, but only two models seem to fit the data as well as or better than the normal model: the Student-t model (with fixed degrees of freedom) and the logistic-II model. The generated envelopes for the three postulated models do not present any unusual features (see Figure 1). Figure 1 also presents the plots of $C_i$ under normal, Student-t and logistic-II errors. Influential observations appear in the Student-t model with smaller values than in the normal and logistic-II models.
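The standardized residual $t_{r_i}$ of Section 4 can be computed directly from a fitted model. A minimal sketch follows, with the defaults $\xi = d_g = 1$ corresponding to normal errors (function name and example values are illustrative):

```python
import numpy as np

def standardized_residuals(y, X, beta_hat, phi_hat, xi=1.0, d_g=1.0):
    """Standardized residuals t_ri = (y_i - yhat_i) /
    sqrt(phi_i * xi * (1 - h_ii / (d_g * xi))), with generalized hat matrix
    H = Phi^{-1/2} X (X' Phi^{-1} X)^{-1} X' Phi^{-1/2}.
    Defaults xi = d_g = 1 correspond to normal errors."""
    r = y - X @ beta_hat
    Xw = X / np.sqrt(phi_hat)[:, None]                    # Phi^{-1/2} X
    h = np.diag(Xw @ np.linalg.solve(Xw.T @ Xw, Xw.T))    # leverages h_ii
    return r / np.sqrt(phi_hat * xi * (1.0 - h / (d_g * xi)))

# check: the homoscedastic normal case reduces to r_i / (sigma * sqrt(1 - h_ii))
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 3.0])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)              # ordinary least squares
t = standardized_residuals(y, X, beta_hat, np.full(3, 4.0))
```

These residuals are the natural quantities to display in the simulated envelopes mentioned in the application.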
Acknowledgments: The author received financial support from CNPq, Brazil.

References

Cook, R.D. (1986). Assessment of local influence (with discussion). Journal of the Royal Statistical Society, Series B, 48, 133-169.

FIGURE 1. Envelopes and plots of $C_i$ under the normal (left), Student-t (middle) and logistic-II (right) models fitted to the delivery data.

Cook, R.D. and Weisberg, S. (1983). Diagnostics for heteroscedasticity in regression. Biometrika, 70, 1-10.

Cox, D.R. and Snell, E.J. (1968). A general definition of residuals. Journal of the Royal Statistical Society, Series B, 30, 248-275.

Fang, K.T., Kotz, S. and Ng, K.W. (1990). Symmetric Multivariate and Related Distributions. London: Chapman & Hall.

Lesaffre, E. and Verbeke, G. (1998). Local influence in linear mixed models. Biometrics, 54, 570-582.

Montgomery, D.C., Peck, E.A. and Vining, G.G. (2001). Introduction to Linear Regression Analysis, 3rd ed. New York: Wiley.

Paula, G.A., Cysneiros, F.J.A. and Galea, M. (2003). Local influence and leverage in elliptical nonlinear regression models. In: Proceedings of the 18th International Workshop on Statistical Modelling, Verbeke, G., Molenberghs, G., Aerts, M. and Fieuws, S. (Eds). Leuven: Katholieke Universiteit Leuven, pp. 361-366.

Smyth, G.K. (1989). Generalized linear models with varying dispersion. Journal of the Royal Statistical Society, Series B, 51, 47-60.

Thomas, W. and Cook, R.D. (1990). Assessing influence on predictions from generalized linear models. Technometrics, 32, 59-65.