
On diagnostics in double generalized linear models

Gilberto A. Paula

Instituto de Matemática e Estatística - USP, Brazil. Correspondence to: Departamento de Estatística, IME-USP, Rua do Matão 1010, Cidade Universitária, São Paulo-SP, Brazil. E-mail addresses: giapaula@ime.usp.br, gilbertop056@gmail.com

Computational Statistics and Data Analysis 68 (2013)

Article history: Received 19 December 2012; received in revised form 6 June 2013; accepted 6 June 2013; available online 14 June 2013.

Keywords: Deviance component residual; Double gamma model; Leverage measure; Local influence; Pearson residual; Residual analysis

Abstract. The aim of this paper is to propose some diagnostic methods in double generalized linear models (DGLMs) for large samples. A review of DGLMs is given, including the iterative process for the estimation of the mean and precision coefficients as well as some asymptotic results. Then, a variety of diagnostic tools, such as leverage measures and curvatures of local influence under some usual perturbation schemes, the standardized deviance component, and Pearson residuals, are proposed. The diagnostic plots are constructed for the mean and precision models, and an illustrative example, in which the texture of four different forms of light snacks is compared across time with the texture of a traditional one, is analyzed under appropriate double gamma models. Some of the diagnostic procedures proposed in the paper are applied to analyze the fitted selected model. © 2013 Elsevier B.V. All rights reserved.

1. Introduction

The class of double generalized linear models (DGLMs) was proposed by Smyth (1989), and Verbyla (1993) derived some case deletion diagnostics for linear heteroscedastic models under maximum likelihood (ML) and restricted maximum likelihood (REML) estimation. The REML method has been considered more reliable than ML for small samples (Smyth and Verbyla, 1999), and various papers have been published under this methodology. For example, Smyth and Verbyla (1999) investigated the sensitivity of the restricted maximum likelihood estimates (REMLEs) for some DGLMs, whereas Smyth and Jørgensen (2002) applied the framework of DGLMs to insurance claims. However, under the ML approach, little has been done on diagnostic methods. In this paper, some usual diagnostic quantities, such as leverage measures, local influence curvatures, and Pearson and deviance component residuals, are derived for DGLMs under ML. A large sample data set, in which the texture of five snack types is compared across time, is fitted under appropriate double gamma models, and a diagnostic analysis is performed with the quantities proposed in the paper to analyze the selected fitted model.

The paper is organized as follows. In Section 2, a review of DGLMs is presented, whereas in Section 3 we derive some useful diagnostic quantities, such as generalized leverages, curvatures of local influence under some usual perturbation schemes, and standardized forms for the Pearson and deviance component residuals. All the calculations are performed for the mean and precision models. The application is given in Section 4, and Section 5 deals with some conclusions. Approximate standardized forms for the Pearson residuals are derived in the Appendix.
2. Review of DGLMs

Let $Y_1, \dots, Y_n$ be independent random variables with the density function of $Y_i$ expressed in the exponential family form

$f(y_i; \theta_i, \phi_i) = \exp[\phi_i\{y_i\theta_i - b(\theta_i)\} + c(y_i; \phi_i)]$,   (1)

where $c(y_i; \phi_i) = d(\phi_i) + \phi_i a(y_i) + u(y_i)$ (normal, inverse Gaussian, and gamma distributions), $b(\cdot)$, $d(\cdot)$, $a(\cdot)$, and $u(\cdot)$ are twice differentiable functions, $\theta_i$ is the canonical parameter, and $\phi_i$ ($\phi_i^{-1}$) is the precision (dispersion) parameter. Alternatively, taking $T_i = Y_i\theta_i - b(\theta_i) + a(Y_i)$, one may express the density function of $T_i$ (given $\theta_i$) in the exponential family form (1), namely $f(t_i; \phi_i) = \exp\{\phi_i t_i + d(\phi_i) + u(y_i)\}$. From standard regularity conditions it follows that $\mu_i = \mathrm{E}(Y_i) = b'(\theta_i)$ and $\mathrm{Var}(Y_i) = \phi_i^{-1}V(\mu_i)$, where $V(\mu_i) = V_i = b''(\theta_i)$ is the variance function, $\mathrm{E}(T_i) = -d'(\phi_i)$ and $\mathrm{Var}(T_i) = -d''(\phi_i)$.
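As a quick numerical illustration (a sketch added here, not part of the original paper), the gamma entries of Table 1 below and the identities $\mathrm{E}(T_i) = -d'(\phi_i)$ and $\mathrm{Var}(T_i) = -d''(\phi_i)$ can be checked by simulation in R:

```r
## Monte Carlo check of the T_i construction for the gamma case (assumed
## parameterization: shape = phi, rate = phi/mu, so E(Y) = mu, Var(Y) = mu^2/phi).
## Gamma entries of Table 1: t = log(y/mu) - y/mu, d(phi) = phi*log(phi) - lgamma(phi),
## d'(phi) = 1 + log(phi) - digamma(phi), d''(phi) = 1/phi - trigamma(phi).
set.seed(123)
mu  <- 2.5
phi <- 4
y   <- rgamma(1e6, shape = phi, rate = phi / mu)
t_i <- log(y / mu) - y / mu
c(mean_t = mean(t_i), minus_d1 = -(1 + log(phi) - digamma(phi)))  # E(T) vs -d'(phi)
c(var_t  = var(t_i),  minus_d2 = -(1 / phi - trigamma(phi)))      # Var(T) vs -d''(phi)
```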

Table 1 presents some of the quantities above derived for the normal, inverse Gaussian, and gamma distributions.

Table 1
Useful quantities derived for some exponential family distributions.
  $t_i$: normal, $y_i\mu_i - \tfrac{1}{2}(\mu_i^2 + y_i^2)$; inverse Gaussian, $-\{y_i/(2\mu_i^2) - \mu_i^{-1} + (2y_i)^{-1}\}$; gamma, $\log(y_i/\mu_i) - y_i/\mu_i$.
  $d(\phi)$: normal, $\tfrac{1}{2}\log\phi$; inverse Gaussian, $\tfrac{1}{2}\log\phi$; gamma, $\phi\log\phi - \log\Gamma(\phi)$.
  $d'(\phi)$: normal, $(2\phi)^{-1}$; inverse Gaussian, $(2\phi)^{-1}$; gamma, $(1 + \log\phi) - \psi(\phi)$.
  $d''(\phi)$: normal, $-(2\phi^2)^{-1}$; inverse Gaussian, $-(2\phi^2)^{-1}$; gamma, $\phi^{-1} - \psi'(\phi)$.
Here $\Gamma(\cdot)$, $\psi(\cdot)$, and $\psi'(\cdot)$ denote the gamma, digamma, and trigamma functions.

The DGLMs are defined by assuming the systematic components

$g(\mu_i) = \eta_i = x_i^{\top}\beta$ and $h(\phi_i) = \lambda_i = z_i^{\top}\gamma$,   (2)

where $\beta = (\beta_1, \dots, \beta_p)^{\top}$ and $\gamma = (\gamma_1, \dots, \gamma_q)^{\top}$ are the model parameters to be estimated, $x_i = (x_{i1}, \dots, x_{ip})^{\top}$ and $z_i = (z_{i1}, \dots, z_{iq})^{\top}$ contain values of explanatory variables, and $g(\cdot)$ and $h(\cdot)$ are the link functions. Models (1) and (2), called the mean model and the precision model, respectively, belong to the class of generalized additive models for location, scale, and shape proposed by Rigby and Stasinopoulos (2005).

2.1. Parameter estimation

The score functions for $\beta$ and $\gamma$ may be expressed, respectively, as $U_\beta = X^{\top}\Phi W^{1/2}V^{-1/2}(y - \mu)$ and $U_\gamma = Z^{\top}H_\gamma^{-1}(t - \mu_T)$, where $X$ is an $n \times p$ matrix of rows $x_i^{\top}$ ($i = 1, \dots, n$), $W = \mathrm{diag}\{\omega_1, \dots, \omega_n\}$ with weights $\omega_i = (d\mu_i/d\eta_i)^2/V_i$, $V = \mathrm{diag}\{V_1, \dots, V_n\}$, $\Phi = \mathrm{diag}\{\phi_1, \dots, \phi_n\}$, $y = (y_1, \dots, y_n)^{\top}$, $\mu = (\mu_1, \dots, \mu_n)^{\top}$, $Z$ is an $n \times q$ matrix of rows $z_i^{\top}$ ($i = 1, \dots, n$), $H_\gamma = \mathrm{diag}\{h'(\phi_1), \dots, h'(\phi_n)\}$, $t = (t_1, \dots, t_n)^{\top}$, and $\mu_T = (\mathrm{E}(T_1), \dots, \mathrm{E}(T_n))^{\top} = (-d'(\phi_1), \dots, -d'(\phi_n))^{\top}$. The Fisher information matrices for $\beta$ and $\gamma$ are, respectively, given by $K_{\beta\beta} = X^{\top}\Phi W X$ and $K_{\gamma\gamma} = Z^{\top}PZ$, where $P = \mathrm{diag}\{p_1, \dots, p_n\}$ with $p_i = -d''(\phi_i)\{h'(\phi_i)\}^{-2}$, $i = 1, \dots, n$. The joint iterative process for obtaining the maximum likelihood estimates $\hat\beta$ and $\hat\gamma$ takes the form

$\beta^{(m+1)} = (X^{\top}\Phi^{(m)}W^{(m)}X)^{-1}X^{\top}\Phi^{(m)}W^{(m)}y^{*(m)}$   (3)

and

$\gamma^{(m+1)} = (Z^{\top}P^{(m)}Z)^{-1}Z^{\top}P^{(m)}z^{*(m)}$,   (4)

for $m = 0, 1, 2, \dots$, where $y^{*} = X\beta + W^{-1/2}V^{-1/2}(y - \mu)$ and $z^{*} = Z\gamma + V_\gamma^{-1}H_\gamma(t - \mu_T)$ are the modified dependent variables and $V_\gamma = \mathrm{diag}\{-d''(\phi_1), \dots, -d''(\phi_n)\}$. Note that $P = V_\gamma H_\gamma^{-2}$. This joint iterative process is solved by alternating Eqs. (3)-(4) until convergence. Starting values may be the maximum likelihood estimates (MLEs) from the generalized linear model (GLM) with constant dispersion. The iterative process for obtaining the REMLEs takes the same form as (3)-(4) with the quantities $P$ and $z^{*}$ being modified appropriately (see, for instance, Smyth and Verbyla, 1999).

Fahrmeir and Tutz (2001) presented some regularity conditions for attaining the asymptotic normality of the parameter estimates in GLMs. Assuming that such regularity conditions extend to DGLMs, one has for large $n$ that $\hat\beta \sim \mathrm{N}_p(\beta, K_{\beta\beta}^{-1})$ and $\hat\gamma \sim \mathrm{N}_q(\gamma, K_{\gamma\gamma}^{-1})$. Due to the orthogonality between $\beta$ and $\gamma$, one has asymptotic independence between $\hat\beta$ and $\hat\gamma$.
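To make the alternating scheme (3)-(4) concrete, the following R sketch (not the author's code) fits a simulated double gamma model with log links for both the mean and the precision, so that $\omega_i = 1$, $V_i = \mu_i^2$ and $p_i = \{\psi'(\phi_i) - 1/\phi_i\}\phi_i^2$, starting from a constant-dispersion gamma GLM:

```r
## Sketch of the joint iterative process (3)-(4) for a double gamma model with
## log links for the mean and the precision, on simulated data.
set.seed(1)
n <- 500
X <- cbind(1, runif(n))                  # mean-model design
Z <- cbind(1, rbinom(n, 1, 0.5))         # precision-model design
beta.true <- c(1, 0.8); gamma.true <- c(2, -1)
mu  <- exp(drop(X %*% beta.true)); phi <- exp(drop(Z %*% gamma.true))
y   <- rgamma(n, shape = phi, rate = phi / mu)

fit0  <- glm(y ~ X - 1, family = Gamma(link = "log"))    # constant-dispersion GLM
beta  <- coef(fit0)
gamma <- c(log(1 / summary(fit0)$dispersion), 0)         # starting values
for (m in 1:50) {
  mu  <- exp(drop(X %*% beta))
  phi <- exp(drop(Z %*% gamma))
  ## mean step (3): weighted LS, weights phi_i * omega_i with omega_i = 1
  ystar <- drop(X %*% beta) + (y - mu) / mu               # modified dependent variable
  beta  <- drop(solve(t(X) %*% (phi * X), t(X) %*% (phi * ystar)))
  ## precision step (4): weighted LS with weights p_i
  t_i   <- log(y / mu) - y / mu
  muT   <- digamma(phi) - log(phi) - 1                    # -d'(phi)
  vT    <- trigamma(phi) - 1 / phi                        # -d''(phi)
  p     <- vT * phi^2
  zstar <- drop(Z %*% gamma) + (t_i - muT) / (vT * phi)   # modified dependent variable
  gamma <- drop(solve(t(Z) %*% (p * Z), t(Z) %*% (p * zstar)))
}
round(cbind(estimate = c(beta, gamma), true = c(beta.true, gamma.true)), 3)
```

The fixed number of iterations is used only for simplicity; in practice the loop would stop when successive estimates of $\beta$ and $\gamma$ change negligibly.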
DGLMs may be fitted by using, for instance, the packages dglm and gamlss available in the R software.

3. Diagnostic methods

3.1. Leverage

The main idea behind the concept of leverage is that of evaluating the influence of each response on its own predicted value. In DGLMs, the influence of $y$ on $\hat y$ and of $t$ on $\hat t$ may be well represented by the principal diagonal elements of the $n \times n$ matrices $(\partial\hat y/\partial y^{\top})$ and $(\partial\hat t/\partial t^{\top})$, respectively. Using results from Wei et al. (1998), we find the generalized leverage

matrices

$GL_y = (\partial\hat y/\partial y^{\top}) = \{N X(\ddot L_{\beta\beta})^{-1}X^{\top}\Phi V^{-1}N\}\big|_{\hat\theta}$ ($\gamma$ fixed)

and

$GL_t = (\partial\hat t/\partial t^{\top}) = \{H_\gamma^{-1}V_T Z(\ddot L_{\gamma\gamma})^{-1}Z^{\top}H_\gamma^{-1}\}\big|_{\hat\theta}$ ($\beta$ fixed),

where $\theta = (\beta^{\top}, \gamma^{\top})^{\top}$, $N = \mathrm{diag}\{d\mu_1/d\eta_1, \dots, d\mu_n/d\eta_n\}$, $\ddot L_{\beta\beta}$ and $\ddot L_{\gamma\gamma}$ are the observed Fisher information matrices, and $V_T = \mathrm{diag}\{-d''(\phi_1), \dots, -d''(\phi_n)\}$. For large $n$, we obtain

$GL_y = NX(X^{\top}\Phi W X)^{-1}X^{\top}\Phi V^{-1}N$   (5)

and

$GL_t = H_\gamma^{-1}V_T Z(Z^{\top}PZ)^{-1}Z^{\top}H_\gamma^{-1}$.   (6)

Thus, the principal diagonal elements $GL_{y,ii} = \hat\phi_i\hat\omega_i x_i^{\top}(X^{\top}\hat\Phi\hat W X)^{-1}x_i$ of (5) and $GL_{t,ii} = \hat p_i z_i^{\top}(Z^{\top}\hat P Z)^{-1}z_i$ of (6), for $i = 1, \dots, n$, may be interpreted as leverage measures on the predicted responses of the models (1) and (2), respectively.

3.2. Local influence

Suppose that the log-likelihood function is expressed as $L(\theta) = \sum_{i=1}^{n} L_i(\theta)$, where $L_i(\theta)$ denotes the contribution of the $i$th observation. If a perturbation scheme is applied in the model or data, the perturbed log-likelihood function takes the form $L(\theta|\delta)$, where $\delta = (\delta_1, \dots, \delta_n)^{\top}$ is the perturbation vector and $\delta_0$ denotes the no-perturbation vector, which satisfies $L(\theta|\delta_0) = L(\theta)$. A measure of discrepancy between the perturbed and non-perturbed models is the likelihood displacement, $LD(\delta) = 2\{L(\hat\theta) - L(\hat\theta_\delta)\}$, where $\hat\theta_\delta$ denotes the maximum likelihood estimate under the perturbed model. The local influence approach (Cook, 1986) takes into account the influence of small perturbations in the model or data on the measure $LD(\delta)$. The main idea is to study the normal curvatures for $\beta$ and $\gamma$ in the unitary direction $l$ evaluated at $\hat\theta$ and $\delta_0$. Such curvatures are expressed for large $n$ as

$C_l(\beta) = 2|l^{\top}\Delta_1^{\top}K_{\hat\beta\hat\beta}^{-1}\Delta_1 l|$ and $C_l(\gamma) = 2|l^{\top}\Delta_2^{\top}K_{\hat\gamma\hat\gamma}^{-1}\Delta_2 l|$,

respectively, where $\Delta_1$ and $\Delta_2$ are $p \times n$ and $q \times n$ matrices with elements $\Delta_{1ji} = \partial^2 L(\theta|\delta)/\partial\beta_j\partial\delta_i$ and $\Delta_{2ki} = \partial^2 L(\theta|\delta)/\partial\gamma_k\partial\delta_i$, for $i = 1, \dots, n$, $j = 1, \dots, p$ and $k = 1, \dots, q$. In order to have a curvature invariant under uniform change of scale, Poon and Poon (1999) proposed the conformal normal curvature, defined, for large $n$, as

$B_l(\beta) = \dfrac{|l^{\top}\Delta_1^{\top}K_{\hat\beta\hat\beta}^{-1}\Delta_1 l|}{\sqrt{\mathrm{tr}(\Delta_1^{\top}K_{\hat\beta\hat\beta}^{-1}\Delta_1)^2}}$ and $B_l(\gamma) = \dfrac{|l^{\top}\Delta_2^{\top}K_{\hat\gamma\hat\gamma}^{-1}\Delta_2 l|}{\sqrt{\mathrm{tr}(\Delta_2^{\top}K_{\hat\gamma\hat\gamma}^{-1}\Delta_2)^2}}$.

This curvature has the property that $0 \le B_l \le 1$ for any unitary direction $l$. A suggestion is evaluating the normal curvature in the direction $l = e_i$, where $e_i$ is an $n \times 1$ vector with 1 in the $i$th position and zeros in the remaining positions, and observing the index plot of $B_{e_i}$. We suggest using $B_{e_i} > \bar B + 4SE(B)$ to discriminate whether an observation is influential or not, where $\bar B$ is the mean of $B = \{B_{e_i}, i = 1, \dots, n\}$ and $SE(B)$ denotes the standard error of $B$.

Case-weight perturbation

Under this perturbation scheme, we assume that $L(\theta|\delta) = \sum_{i=1}^{n}\delta_i L_i(\theta)$, $0 \le \delta_i \le 1$, and $\delta_0 = (1, \dots, 1)^{\top}$. After some algebraic manipulation, we find that $\Delta_{1ji} = \hat r_{P_i}(\hat\phi_i\hat\omega_i)^{1/2}x_{ij}$ for $j = 1, \dots, p$ and $\Delta_{2ji} = \hat r_{T_i}\hat p_i^{1/2}z_{ij}$ for $j = 1, \dots, q$, with

$r_{P_i} = \dfrac{\phi_i^{1/2}(y_i - \mu_i)}{\sqrt{V_i}}$ and $r_{T_i} = \dfrac{t_i + d'(\phi_i)}{\sqrt{-d''(\phi_i)}}$

being the Pearson residuals for $Y_i$ and $T_i$, respectively, for $i = 1, \dots, n$. Hence, for large $n$, we obtain

$C_l(\beta) = 2|l^{\top}D_{r_P}\hat H D_{r_P}l|$   (7)

and

$C_l(\gamma) = 2|l^{\top}D_{r_T}\hat R D_{r_T}l|$,   (8)

where $D_{r_P} = \mathrm{diag}\{\hat r_{P_1}, \dots, \hat r_{P_n}\}$, $D_{r_T} = \mathrm{diag}\{\hat r_{T_1}, \dots, \hat r_{T_n}\}$, $H = (\Phi W)^{1/2}X(X^{\top}\Phi W X)^{-1}X^{\top}(\Phi W)^{1/2}$, and $R = P^{1/2}Z(Z^{\top}PZ)^{-1}Z^{\top}P^{1/2}$. To assess the sensitivity of the parameter estimates $\hat\beta$ and $\hat\gamma$ under the case-weight perturbation scheme, we can consider the largest-curvature directions $l = l_{\max}$ in (7) and (8), which correspond to the eigenvectors associated with the largest eigenvalues of the matrices $D_{r_P}\hat H D_{r_P}$ and $D_{r_T}\hat R D_{r_T}$, respectively.
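Continuing the simulated gamma/log sketch of Section 2 (an illustration under those assumptions, not the author's code), the case-weight quantities, the directions $l_{\max}$ and the index statistic $B_{e_i}$ with the cutoff $\bar B + 4SE(B)$ can be computed as follows:

```r
## Case-weight local influence for the simulated gamma/log fit above
## (omega_i = 1, V_i = mu_i^2). First re-evaluate working quantities at the
## converged estimates beta, gamma.
mu  <- exp(drop(X %*% beta));  phi <- exp(drop(Z %*% gamma))
t_i <- log(y / mu) - y / mu
muT <- digamma(phi) - log(phi) - 1;  vT <- trigamma(phi) - 1 / phi;  p <- vT * phi^2

rP <- sqrt(phi) * (y - mu) / mu            # Pearson residual for Y_i
rT <- (t_i - muT) / sqrt(vT)               # Pearson residual for T_i (t_i + d'(phi) = t_i - muT)
Ds <- sqrt(phi); Dp <- sqrt(p)
H  <- (Ds * X) %*% solve(t(X) %*% (phi * X)) %*% t(Ds * X)   # (Phi W)^(1/2) X (X'Phi W X)^-1 X'(Phi W)^(1/2)
R  <- (Dp * Z) %*% solve(t(Z) %*% (p * Z))  %*% t(Dp * Z)

F1 <- diag(rP) %*% H %*% diag(rP)          # D_rP H D_rP
F2 <- diag(rT) %*% R %*% diag(rT)          # D_rT R D_rT
lmax.beta  <- eigen(F1, symmetric = TRUE)$vectors[, 1]
lmax.gamma <- eigen(F2, symmetric = TRUE)$vectors[, 1]

## conformal normal curvature in the basic directions e_i
B.beta  <- abs(diag(F1)) / sqrt(sum(F1^2))
B.gamma <- abs(diag(F2)) / sqrt(sum(F2^2))
cut.beta  <- mean(B.beta)  + 4 * sd(B.beta)    # sample sd used here for SE(B)
cut.gamma <- mean(B.gamma) + 4 * sd(B.gamma)
which(B.beta > cut.beta); which(B.gamma > cut.gamma)
plot(B.beta, type = "h", ylab = "B_ei (mean model)"); abline(h = cut.beta, lty = 2)
```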
The total local influence may also be assessed by evaluating the curvatures (7) and (8), respectively, in the direction of the $i$th observation, obtaining $C_i(\beta) = 2\hat h_{ii}\hat r_{P_i}^2$ and $C_i(\gamma) = 2\hat r_{ii}\hat r_{T_i}^2$ for $i = 1, \dots, n$, where $\hat h_{ii}$ and $\hat r_{ii}$ are the principal diagonal elements of the matrices $\hat H$ and $\hat R$, respectively. Note that $\hat h_{ii} = GL_{y,ii}$ and $\hat r_{ii} = GL_{t,ii}$.
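In the running sketch these quantities, and the coincidence between $\hat h_{ii}$, $\hat r_{ii}$ and the generalized leverages (5)-(6), can be verified numerically:

```r
## Total local influence under case weights, and the identities h_ii = GL_y,ii,
## r_ii = GL_t,ii for the gamma/log example (N = diag(mu), V = diag(mu^2)).
h_ii <- diag(H)
r_ii <- diag(R)
Ci.beta  <- 2 * h_ii * rP^2
Ci.gamma <- 2 * r_ii * rT^2
GLy.ii <- phi * rowSums(X %*% solve(t(X) %*% (phi * X)) * X)   # diagonal of (5)
GLt.ii <- p   * rowSums(Z %*% solve(t(Z) %*% (p   * Z)) * Z)   # diagonal of (6)
all.equal(h_ii, GLy.ii); all.equal(r_ii, GLt.ii)
```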

Response perturbation

Suppose that the $i$th observed response of the mean model (1) is perturbed as $y_{i\delta} = y_i + s_{y_i}\delta_i$, where $s_{y_i}$ is a consistent estimate of the standard deviation of $Y_i$, $\delta_i \in \mathbb{R}$ and $\delta_0 = (0, \dots, 0)^{\top}$. For $\phi_i$ fixed, we have that $\Delta_{1ji} = \hat\phi_i x_{ij}s_{y_i}\sqrt{\hat\omega_i/\hat V_i}$, for $j = 1, \dots, p$ and $i = 1, \dots, n$. Then, for large $n$, the normal curvature for $\beta$ in the unitary direction $l$ takes the form

$C_l(\beta) = 2|l^{\top}D_{dy}\hat H D_{dy}l|$,   (9)

where $D_{dy} = \mathrm{diag}\{s_{y_1}/\mathrm{sd}_{y_1}, \dots, s_{y_n}/\mathrm{sd}_{y_n}\}$ and $\mathrm{sd}_{y_i}$ denotes the standard deviation of $Y_i$. Since $s_{y_i}/\mathrm{sd}_{y_i} \to 1$ in probability, the total local influence, evaluating the curvature (9) in the direction of the $i$th observation, yields $C_i(\beta) \approx 2\hat h_{ii}$. On the other hand, suppose for $\theta_i$ fixed that the $i$th observed response of the precision model (2) is perturbed as $t_{i\delta} = t_i + s_{t_i}\delta_i$, where $s_{t_i}$ is a consistent estimate of the standard deviation of $T_i$, $\delta_i \in \mathbb{R}$ and $\delta_0 = (0, \dots, 0)^{\top}$. We find $\Delta_{2ji} = s_{t_i}\{h'(\hat\phi_i)\}^{-1}z_{ij}$, for $j = 1, \dots, q$ and $i = 1, \dots, n$. Then, for large $n$, the normal curvature for $\gamma$ in the unitary direction $l$ takes the form

$C_l(\gamma) = 2|l^{\top}D_{dt}\hat R D_{dt}l|$,   (10)

where $D_{dt} = \mathrm{diag}\{s_{t_1}/\mathrm{sd}_{t_1}, \dots, s_{t_n}/\mathrm{sd}_{t_n}\}$ and $\mathrm{sd}_{t_i}$ denotes the standard deviation of $T_i$. Since $s_{t_i}/\mathrm{sd}_{t_i} \to 1$ in probability, the total local influence, evaluating the curvature (10) in the direction of the $i$th observation, yields $C_i(\gamma) \approx 2\hat r_{ii}$.

Explanatory variable perturbation

Consider now the values of the $t$th explanatory variable of $\eta_i$, assumed continuous, perturbed as $x_{it\delta} = x_{it} + s_{x_t}\delta_i$, where $s_{x_t}$ is an estimate of the standard deviation of $X_t$, $\delta_i \in \mathbb{R}$ and $\delta_0 = (0, \dots, 0)^{\top}$. Matrix $\Delta_1$ has elements $\Delta_{1ji} = s_{x_t}\hat\beta_t\hat\phi_i x_{ij}\{\hat f_i(y_i - \hat\mu_i) - \hat\omega_i\}$ for $j = 1, \dots, p$ and $j \ne t$, and $\Delta_{1ti} = s_{x_t}\hat\beta_t\hat\phi_i x_{it}\{\hat f_i(y_i - \hat\mu_i) - \hat\omega_i\} + s_{x_t}\hat\phi_i\sqrt{\hat\omega_i/\hat V_i}(y_i - \hat\mu_i)$. The normal curvature for $\beta$ in the unitary direction $l$ yields $C_l(\beta) = 2|l^{\top}\hat\Delta_1^{\top}(X^{\top}\hat\Phi\hat W X)^{-1}\hat\Delta_1 l|$, where

$\Delta_1 = s_{x_t}\hat\beta_t X^{\top}\hat\Phi\{\hat F D_r - \hat W\} + s_{x_t}A_1\hat\Phi\hat W^{1/2}\hat V^{-1/2}D_r$,

with $F = \mathrm{diag}\{f_1, \dots, f_n\}$, $f_i = d^2\theta_i/d\eta_i^2$, $D_r = \mathrm{diag}\{y_1 - \mu_1, \dots, y_n - \mu_n\}$, and $A_1$ a $p \times n$ matrix of zeros with ones in the $t$th row. Similarly, consider the values of the $t$th explanatory variable of $\lambda_i$, assumed continuous, perturbed as $z_{it\delta} = z_{it} + s_{z_t}\delta_i$, where $s_{z_t}$ is an estimate of the standard deviation of $Z_t$, $\delta_i \in \mathbb{R}$, and $\delta_0 = (0, \dots, 0)^{\top}$. Matrix $\Delta_2$ has elements $\Delta_{2ji} = s_{z_t}\hat\gamma_t z_{ij}[\hat g_i\{t_i + d'(\hat\phi_i)\} - \hat p_i]$ for $j = 1, \dots, q$ and $j \ne t$, and $\Delta_{2ti} = s_{z_t}\hat\gamma_t z_{it}[\hat g_i\{t_i + d'(\hat\phi_i)\} - \hat p_i] + s_{z_t}\{h'(\hat\phi_i)\}^{-1}\{t_i + d'(\hat\phi_i)\}$. The normal curvature for $\gamma$ in the unitary direction $l$ yields $C_l(\gamma) = 2|l^{\top}\hat\Delta_2^{\top}(Z^{\top}\hat P Z)^{-1}\hat\Delta_2 l|$, where

$\Delta_2 = s_{z_t}\hat\gamma_t Z^{\top}\{\hat G D_{rt} - \hat P\} + s_{z_t}A_2\hat H_\gamma^{-1}D_{rt}$,

with $G = \mathrm{diag}\{g_1, \dots, g_n\}$, $g_i = d^2\phi_i/d\lambda_i^2$, $D_{rt} = \mathrm{diag}\{t_1 + d'(\phi_1), \dots, t_n + d'(\phi_n)\}$, and $A_2$ a $q \times n$ matrix of zeros with ones in the $t$th row.

3.3. Residual analysis

The aim of residual analysis is to assess departures from the assumptions made for the model, particularly the error assumptions, and to detect outlying observations. Natural residuals in DGLMs are the Pearson and deviance component residuals. Approximate standardized forms for the Pearson residuals $\hat r_{P_i}$ and $\hat r_{T_i}$ (see the Appendix) are given by

$t_{P_{1i}} = \dfrac{\hat\phi_i^{1/2}(y_i - \hat\mu_i)}{\sqrt{(1 - \hat h_{ii})\hat V_i}}$ and $t_{P_{2i}} = \dfrac{\hat t_i + d'(\hat\phi_i)}{\sqrt{-d''(\hat\phi_i)(1 - \hat r_{ii})}}$,

respectively.
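In the running gamma/log sketch (again an illustration under those assumptions, not the author's code), the standardized Pearson residuals follow directly from the quantities already computed:

```r
## Standardized Pearson residuals for the simulated gamma/log fit:
## t_P1i = sqrt(phi_i)(y_i - mu_i)/sqrt{(1 - h_ii) V_i} with V_i = mu_i^2, and
## t_P2i = {t_i + d'(phi_i)}/sqrt{-d''(phi_i)(1 - r_ii)}.
tP1 <- sqrt(phi) * (y - mu) / sqrt((1 - h_ii) * mu^2)
tP2 <- (t_i - muT) / sqrt(vT * (1 - r_ii))
par(mfrow = c(1, 2))
plot(tP1, ylab = "t_P1 (mean model)");      abline(h = c(-2, 2), lty = 2)
plot(tP2, ylab = "t_P2 (precision model)"); abline(h = c(-2, 2), lty = 2)
```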
Deviance component residuals may be derived from the deviances of models (1) and (2). For model (1), one has $D_1(y; \hat\mu) = \sum_{i=1}^{n} d_1^2(y_i; \hat\mu_i)$ ($\phi_i$ fixed for all $i$), where

$d_1^2(y_i; \hat\mu_i) = 2\phi_i[y_i(\tilde\theta_i - \hat\theta_i) + \{b(\hat\theta_i) - b(\tilde\theta_i)\}]$,

with $\tilde\theta_i = \theta_i(y_i)$ being the maximum likelihood estimate of $\theta_i$ under the saturated model, so that $\tilde\theta_i$ satisfies $b'(\tilde\theta_i) = y_i$. For model (2), the deviance takes the form $D_2(t; \hat\phi) = \sum_{i=1}^{n} d_2^2(t_i; \hat\phi_i)$ ($\theta_i$ fixed for all $i$), where $\phi = (\phi_1, \dots, \phi_n)^{\top}$,

$d_2^2(t_i; \hat\phi_i) = 2[t_i(\tilde\phi_i - \hat\phi_i) + \{d(\tilde\phi_i) - d(\hat\phi_i)\}]$,

with $\tilde\phi_i = \phi_i(t_i)$ being the maximum likelihood estimate of $\phi_i$ under the saturated model, so that $\tilde\phi_i$ satisfies $d'(\tilde\phi_i) = -t_i$. Standardized forms, which may be supported by the calculations of Cox and Snell (1968), are given by

$t_{D_{1i}} = \pm\dfrac{\sqrt{d_1^2(y_i; \hat\mu_i)}}{\sqrt{1 - \hat h_{ii}}}$ and $t_{D_{2i}} = \pm\dfrac{\sqrt{d_2^2(\hat t_i; \hat\phi_i)}}{\sqrt{1 - \hat r_{ii}}}$,

where the signs are the same as those of $(y_i - \hat\mu_i)$ and $\{\hat t_i + d'(\hat\phi_i)\}$, respectively.
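For the gamma case of the running sketch, the unit deviance of the mean model reduces to $d_1^2(y_i;\hat\mu_i) = 2\hat\phi_i\{(y_i-\hat\mu_i)/\hat\mu_i - \log(y_i/\hat\mu_i)\}$, while $\tilde\phi_i$ has no closed form and can be obtained numerically from $d'(\tilde\phi_i) = -\hat t_i$; a hedged illustration:

```r
## Standardized deviance component residuals t_D1 and t_D2 (gamma/log sketch).
d1sq <- 2 * phi * ((y - mu) / mu - log(y / mu))             # mean-model unit deviance
tD1  <- sign(y - mu) * sqrt(d1sq) / sqrt(1 - h_ii)

dfun   <- function(f) f * log(f) - lgamma(f)                # d(phi) for the gamma
phitil <- sapply(t_i, function(t)                           # solve d'(phi) = -t numerically
  uniroot(function(f) 1 + log(f) - digamma(f) + t,
          lower = 1e-8, upper = 1e8)$root)
d2sq <- pmax(2 * (t_i * (phitil - phi) + dfun(phitil) - dfun(phi)), 0)
tD2  <- sign(t_i - muT) * sqrt(d2sq) / sqrt(1 - r_ii)
summary(cbind(tD1, tD2))
```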

Even though the empirical distributions of the residuals $t_{D_{1i}}$ and $t_{D_{2i}}$ are not well known, we may suggest performing a normal probability plot with a generated envelope, as suggested by Atkinson (1981) (see also Williams, 1987), to detect departures from the error assumptions as well as outlying observations in the fitted mean and precision models.

4. Application

As an illustration, we will consider a data set from an experiment developed in the School of Public Health, University of São Paulo, in which four different forms of light snacks (named B, C, D, and E) were compared across 20 weeks with a traditional snack (named A). For the light snacks, the hydrogenated vegetable fat (hvf) was replaced by canola oil under different proportions: B (0% hvf, 22% canola oil), C (17% hvf, 5% canola oil), D (11% hvf, 11% canola oil) and E (5% hvf, 17% canola oil), whereas A (22% hvf, 0% canola oil). The experiment was conducted so that in each even week a random sample of 15 units of each snack type was analyzed in a laboratory and various variables were measured. Then, a total of 75 units was analyzed in each even week, making 750 units in total during the experiment (Paula et al., 2004). In this analysis we will only consider the variable texture, which will be compared across time among the five snack types.

Fig. 1 presents the boxplots for the texture adjusted for asymmetric data (see Hubert and Vandervieren, 2008) for the five snack types and across weeks. We notice from both graphs distributions skewed to the right, with a few extreme observations. The R package robustbase was used to construct the adjusted boxplots using the function adjbox.

Fig. 1. Robust boxplots of the texture for each snack type for all weeks (left) and across weeks for all snacks (right).

In Fig. 2, one has the mean and variation coefficient profiles for each snack type and across weeks. The means and variation coefficients seem to be different among the snacks, changing across weeks, with an indication of quadratic tendencies for the means.

Fig. 2. Profile of the means for each snack type across weeks (left) and profile of the variation coefficients for each snack type across weeks (right).

Then, based on the graphs above, we used the following double gamma models to fit the snack data: (1) $Y_{ijk} \overset{\mathrm{ind}}{\sim} \mathrm{G}(\mu_{ij}, \phi_{ij})$, (2) $g(\mu_{ij}) = \eta_{ij}$ for $\eta_{ij} = \beta_0 + \beta_i + \beta_6\,\mathrm{weeks}_j$ or $\eta_{ij} = \beta_0 + \beta_i + \beta_6\,\mathrm{weeks}_j + \beta_7\,\mathrm{weeks}_j^2$, and (3) $h(\phi_{ij}) = \lambda_{ij}$ for $\lambda_{ij} = \gamma_0 + \gamma_i$ or $\lambda_{ij} = \gamma_0 + \gamma_i + \gamma_6\,\mathrm{weeks}_j$ or $\lambda_{ij} = \gamma_0 + \gamma_i + \gamma_6\,\mathrm{weeks}_j + \gamma_7\,\mathrm{weeks}_j^2$, with $g(\cdot)$ identity or logarithmic and $h(\cdot)$ logarithmic, where $Y_{ijk}$ denotes the texture corresponding to the $k$th unit of the $i$th snack type in the $j$th week, for $i = 1(\mathrm{A}), 2(\mathrm{B}), 3(\mathrm{C}), 4(\mathrm{D}), 5(\mathrm{E})$, $j = 2, 4, \dots, 20$ and $k = 1, \dots, 15$. The reciprocal link was not considered for $g(\cdot)$ due to the difficulty in parameter interpretation, and for other $h(\cdot)$ links convergence was not attained.
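A sketch (not the author's code) of how double gamma models of this form could be specified with the dglm and gamlss packages mentioned in Section 2, assuming a hypothetical data frame snacks with columns texture, group (factor with levels A-E) and weeks; note that dglm models the dispersion $\phi^{-1}$ and gamlss models the scale $\sigma = \phi^{-1/2}$, so the signs and scale of the fitted precision coefficients differ from the $\log\phi_{ij}$ parameterization used here:

```r
## Hypothetical specification of a double gamma model for the snacks data
## (the data frame `snacks` with columns texture, group, weeks is assumed, not supplied).
library(robustbase)   # adjbox()
library(dglm)
library(gamlss)

adjbox(texture ~ group, data = snacks)        # adjusted boxplots as in Fig. 1

## dglm: log-link mean model; dispersion (1/phi) modelled via dformula (log link by default)
fit.dglm <- dglm(texture ~ group + weeks + I(weeks^2), dformula = ~ group,
                 family = Gamma(link = "log"), data = snacks)
summary(fit.dglm)

## gamlss: mu model and sigma (= 1/sqrt(phi)) model under the gamma family
fit.gamlss <- gamlss(texture ~ group + weeks + I(weeks^2), sigma.formula = ~ group,
                     family = GA, data = snacks)
summary(fit.gamlss)
```

Model comparisons such as those summarized in Table 2 can then be based on the maximized likelihoods of fits of this kind.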

Because the aim of the study is to compare the snack types controlling for the effect of time on the texture, we do not consider the possibility of interactions between the snack types and the number of weeks. Table 2 summarizes the AIC (Akaike, 1973) and BIC (Schwarz, 1978) values for each fit, where $\mathrm{AIC} = -2L(\hat\theta) + 2(p+q)$ and $\mathrm{BIC} = -2L(\hat\theta) + (p+q)\log(n)$.

Table 2. Summary of the double gamma models fitted to the snacks data set: AIC and BIC values for each combination of mean link $g(\cdot)$ (identity or log), mean predictor (G + W or G + W + W2), and precision predictor (G, G + W, or G + W + W2) under the log precision link. G: group, W: weeks and W2: weeks squared.

Thus, we selected, with the smallest AIC and BIC values, the following double gamma model: (1) $Y_{ijk} \overset{\mathrm{ind}}{\sim} \mathrm{G}(\mu_{ij}, \phi_{ij})$ with (2) $\log(\mu_{ij}) = \beta_0 + \beta_i + \beta_6\,\mathrm{weeks}_j + \beta_7\,\mathrm{weeks}_j^2$ and (3) $\log(\phi_{ij}) = \gamma_0 + \gamma_i$, for $i = 1, \dots, 5$ and $j = 2, 4, \dots, 20$. The parameter estimates are given in Table 3. We also fitted the final model under the restricted maximum likelihood method, and the parameter estimates were very similar to the ones given in Table 3.

Table 3. Parameter estimates for the final double gamma model: estimates and estimate/standard-error ratios (E/S.E.) for the mean and precision components (intercept, groups A-E, week and week squared effects), together with the deviance and degrees of freedom of each component.

The normal probability plots for the deviance component residuals $t_{D_{1i}}$ and $t_{D_{2i}}$ with simulated envelopes (Fig. 3) do not present any unusual features.

Fig. 3. Normal probability plots with generated envelopes for the deviance component residual for the mean model (left) and for the precision model (right).

The graph of $B_{e_i}$ against the number of weeks to assess the sensitivity of the mean coefficients is given in Fig. 4(a), and we notice eight observations for which $B_{e_i}$ is above the cutoff line ($\bar B + 4SE(B)$). Such observations appear in the last weeks. In Fig. 4(b), one has the graph of $B_{e_i}$ against the number of weeks to assess the sensitivity of the precision coefficients. Here, six observations appear with $B_{e_i}$ above the cutoff line. It is interesting to notice that no outstanding observation corresponds to snack type A. Elimination of these observations does not change the inference for either the mean or the precision coefficients.

Fig. 4. Graphs of $B_{e_i}$ against weeks under the case-weight perturbation scheme for assessing the local influence on $\hat\beta$ (left) and on $\hat\gamma$ (right), respectively.

Finally, the predicted mean and precision values for the texture across weeks are described in Figs. 5(a) and (b), respectively, for each snack type. We notice in Fig. 5(a) that snack type A has the largest mean values across weeks, followed by snack type C. Looking at Fig. 5(b), the largest predicted dispersions for the texture appear for snack types A and C, respectively. The largest mean values for snack type A are expected, since it is the standard one, as is the quadratic tendency for all snacks, because an aim of the study is to determine the ideal storing time, which seems to be about 12 weeks.

Fig. 5. Predicted mean value of texture (left) and predicted precision of texture (right) for each snack type across weeks.

5. Conclusion

In this paper, DGLMs are revisited. Useful diagnostic methods are derived, and approximations are given for large $n$, in which the bias of the estimates tends to be small under ML. Bias expressions up to order $n^{-1}$ may be found, for instance, in Botter and Cordeiro (1998). The selected double gamma model presented an adequate fit to the snack data, as confirmed by the diagnostic analysis; however, other skew models could be applied, such as the inverse Gaussian, log-normal, Weibull, and Birnbaum-Saunders, among others.

In particular, robust parameter estimates may be obtained by applying Birnbaum-Saunders-t models (Paula et al., 2012). Extension of the diagnostic procedures discussed in this paper to small samples is more complex and depends on the definition of appropriate influence measures for REMLEs in order to derive the curvatures of local influence. Concerning leverage measures, the approach proposed in Wei et al. (1998) may be adapted for REML estimation, and the residuals proposed in Section 3.3 may be applied under REML estimation; however, their empirical distributional properties should be considered in the analysis. Nevertheless, extension of the results to dispersion models (Jørgensen, 1997; Cordeiro et al., 1994) may be performed for large samples by adapting the expressions derived in this paper. The author developed codes in R to produce the plots given in Section 4, which may be made available upon request.

Acknowledgments

The author is grateful to the editors and two anonymous referees. This work was supported by CNPq and FAPESP, Brazil.

Appendix

At the convergence of (3), the maximum likelihood estimate $\hat\beta$ may be interpreted as the least-squares solution of the linear regression of $(\hat\Phi\hat W)^{1/2}\hat y^{*}$ on the columns of the matrix $(\hat\Phi\hat W)^{1/2}X$. Thus, there follows the relationship $(\mathrm{I} - \hat H)(\hat\Phi\hat W)^{1/2}\hat y^{*} = (\hat\Phi\hat W)^{1/2}(\hat y^{*} - \hat\eta)$ and, in particular, $\hat\Phi^{1/2}\hat V^{-1/2}(y - \hat\mu) = (\mathrm{I} - \hat H)(\hat\Phi\hat W)^{1/2}\hat y^{*}$. The approximation $\mathrm{Var}(\hat y^{*}) \approx \hat\Phi^{-1}\hat W^{-1}$ supports the standardized form for $t_{P_{1i}}$ (see, for instance, McCullagh and Nelder, 1989, p. 397). Similarly, at the convergence of (4), the maximum likelihood estimate $\hat\gamma$ may be interpreted as the least-squares solution of the linear regression of $\hat P^{1/2}\hat z^{*}$ on the columns of the matrix $\hat P^{1/2}Z$. Thus, one has the relationship $\hat V_\gamma^{-1/2}(\hat t - \hat\mu_T) = (\mathrm{I} - \hat R)\hat P^{1/2}\hat z^{*}$. The approximation $\mathrm{Var}(\hat z^{*}) \approx \hat P^{-1}$ supports the standardized form for $t_{P_{2i}}$.

References

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csàki, F. (Eds.), International Symposium on Information Theory. Akadémiai Kiadó, Budapest, Hungary.
Atkinson, A.C., 1981. Two graphical displays for outlying and influential observations in regression. Biometrika 68.
Botter, D.A., Cordeiro, G.M., 1998. Improved estimators for generalized linear models with dispersion covariates. Journal of Statistical Computation and Simulation 62.
Cook, R.D., 1986. Assessment of local influence (with discussion). Journal of the Royal Statistical Society, Series B 48.
Cordeiro, G.M., Paula, G.A., Botter, D.A., 1994. Improved likelihood ratio tests for dispersion models. International Statistical Review 62.
Cox, D.R., Snell, E.J., 1968. A general definition of residuals (with discussion). Journal of the Royal Statistical Society, Series B 30.
Fahrmeir, L., Tutz, G., 2001. Multivariate Statistical Modelling Based on Generalized Linear Models. Springer, New York.
Hubert, M., Vandervieren, E., 2008. An adjusted boxplot for skewed distributions. Computational Statistics and Data Analysis 52.
Jørgensen, B., 1997. The Theory of Dispersion Models. Chapman and Hall, London.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, second ed. Chapman and Hall, London.
Paula, G.A., de Moura, A.S., Yamaguchi, A.M., 2004. Sensorial stability of snacks with canola oil and hydrogenated vegetable fat. Technical Report, Center of Applied Statistics, University of São Paulo (in Portuguese).
Paula, G.A., Leiva, V., Barros, M., Liu, S., 2012. Robust statistical modeling using the Birnbaum-Saunders-t distribution applied to insurance. Applied Stochastic Models in Business and Industry 28.
Poon, W., Poon, Y.S., 1999. Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society, Series B 61.
Rigby, R.A., Stasinopoulos, D.M., 2005. Generalized additive models for location, scale and shape. Applied Statistics 54.
Schwarz, G., 1978. Estimating the dimension of a model. Annals of Statistics 6.
Smyth, G.K., 1989. Generalized linear models with varying dispersion. Journal of the Royal Statistical Society, Series B 51.
Smyth, G.K., Jørgensen, B., 2002. Fitting Tweedie's compound Poisson model to insurance claims data: dispersion modelling. ASTIN Bulletin 32.
Smyth, G.K., Verbyla, A.P., 1999. Adjusted likelihood methods for modelling dispersion in generalized linear models. Environmetrics 10.
Verbyla, A.P., 1993. Modelling variance heterogeneity: residual maximum likelihood and diagnostics. Journal of the Royal Statistical Society, Series B 55.
Wei, B.C., Hu, Y.Q., Fung, W.K., 1998. Generalized leverage and its applications. Scandinavian Journal of Statistics 25.
Williams, D.A., 1987. Generalized linear model diagnostics using the deviance and single case deletions. Applied Statistics 36.


More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

Fractal functional regression for classification of gene expression data by wavelets

Fractal functional regression for classification of gene expression data by wavelets Fractal functional regression for classification of gene expression data by wavelets Margarita María Rincón 1 and María Dolores Ruiz-Medina 2 1 University of Granada Campus Fuente Nueva 18071 Granada,

More information

A test for improved forecasting performance at higher lead times

A test for improved forecasting performance at higher lead times A test for improved forecasting performance at higher lead times John Haywood and Granville Tunnicliffe Wilson September 3 Abstract Tiao and Xu (1993) proposed a test of whether a time series model, estimated

More information

Submitted to the Brazilian Journal of Probability and Statistics. Bootstrap-based testing inference in beta regressions

Submitted to the Brazilian Journal of Probability and Statistics. Bootstrap-based testing inference in beta regressions Submitted to the Brazilian Journal of Probability and Statistics Bootstrap-based testing inference in beta regressions Fábio P. Lima and Francisco Cribari-Neto Universidade Federal de Pernambuco Abstract.

More information

STAT 510 Final Exam Spring 2015

STAT 510 Final Exam Spring 2015 STAT 510 Final Exam Spring 2015 Instructions: The is a closed-notes, closed-book exam No calculator or electronic device of any kind may be used Use nothing but a pen or pencil Please write your name and

More information

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36 20. REML Estimation of Variance Components Copyright c 2018 (Iowa State University) 20. Statistics 510 1 / 36 Consider the General Linear Model y = Xβ + ɛ, where ɛ N(0, Σ) and Σ is an n n positive definite

More information

Improved maximum likelihood estimators in a heteroskedastic errors-in-variables model

Improved maximum likelihood estimators in a heteroskedastic errors-in-variables model Statistical Papers manuscript No. (will be inserted by the editor) Improved maximum likelihood estimators in a heteroskedastic errors-in-variables model Alexandre G. Patriota Artur J. Lemonte Heleno Bolfarine

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the authors institution

More information

Chapter 9. Model Assessment

Chapter 9. Model Assessment Chapter 9 Model Assessment In statistical modeling, once one has formulated a model and produced estimates and inferential quantities, the question remains of whether the model is adequate for its intended

More information

Small Sample Corrections for LTS and MCD

Small Sample Corrections for LTS and MCD myjournal manuscript No. (will be inserted by the editor) Small Sample Corrections for LTS and MCD G. Pison, S. Van Aelst, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling

More information