Generalized Quasi-likelihood versus Hierarchical Likelihood Inferences in Generalized Linear Mixed Models for Count Data


Sankhyā: The Indian Journal of Statistics
2009, Volume 71-B, Part 1, pp. 55-78
© 2009, Indian Statistical Institute

M.R.I. Chowdhury and B.C. Sutradhar
Memorial University of Newfoundland, Canada

Abstract

Inference for the regression parameters and the variance of the random effects in the generalized linear mixed model (GLMM) setup is an extremely important statistical issue. It is known, however, that the most widely used penalized quasi-likelihood (PQL) approach may not produce consistent estimates for the parameters, especially when the true variance of the random effects is large. In this paper, in the context of Poisson mixed models, we examine the consistency of two competing estimation approaches, namely the hierarchical likelihood (HL) and the generalized quasi-likelihood (GQL) approaches. An extensive simulation study shows that the HL approach, like the PQL approach, appears to produce highly biased and hence inconsistent estimates for the regression parameters, especially when the variance of the random effects is large. The biases of the HL estimates also appear to vary with the cluster size. As an alternative, the GQL approach appears to produce consistent estimates for all parameters of this model irrespective of the cluster size and the magnitude of the variance of the random effects. The GQL and HL estimates are also compared in a real-life data analysis.

AMS (2000) subject classification. Primary 62F10; secondary 62H20.

Keywords and phrases. GLMMs, method of moments approach, penalized quasi-likelihood approach, hierarchical likelihood approach, generalized quasi-likelihood approach, relative bias, zero-inflated Poisson mixed model.
1 Introduction

In general, whether the responses are counts or binary, the generalized linear mixed models (GLMMs) for such responses are generated from the well-known generalized linear model (GLM) (McCullagh and Nelder, 1989) by adding random effects to the linear predictor. A random variable $Y_{ij}$ for the $j$th member ($j = 1, \ldots, n_i$) of the $i$th family ($i = 1, \ldots, K$), with exponential density

$$f(y_{ij}) = \exp[\{y_{ij}\eta_{ij} - a(\eta_{ij})\} + b(y_{ij})] \tag{1.1}$$

follows a GLM when $\eta_{ij}$ has a linear form, namely $\eta_{ij} = x_{ij}'\beta$, where $x_{ij} = (x_{ij1}, \ldots, x_{iju}, \ldots, x_{ijp})'$ is the vector of covariates and $\beta$ is the vector of regression parameters. In (1.1), $a(\cdot)$ and $b(\cdot)$ are known functions. Note that the exponential form (1.1) contains both the Poisson and binary distributions.

The GLMM is developed by adding random effects, say $\gamma_i$, to the linear predictor $\eta_{ij}$, where the $\gamma_i$ for $i = 1, \ldots, K$ are generally assumed to be independently and identically distributed (i.i.d.) with mean 0 and variance $\sigma^2$. In notation, we write $\gamma_i \overset{\text{i.i.d.}}{\sim} (0, \sigma^2)$. Thus in the GLMM setup, the binary and count responses follow (1.1) with $a'(\eta_{ij}) = \exp(\eta_{ij})$ in the Poisson case, and $a'(\eta_{ij}) = \exp(\eta_{ij})/(1 + \exp(\eta_{ij}))$ in the binary case, where $\eta_{ij} = x_{ij}'\beta + \sigma z_i \gamma_i^*$. Here, $z_i$ is a known covariate associated with the random effect of the $i$th cluster, and $\gamma_i^* = \gamma_i/\sigma \overset{\text{i.i.d.}}{\sim} (0, 1)$. In this paper, to be specific, we deal with GLMMs for count data only.

Note that GLMMs for count and binary data with two or more random effects have also been discussed in the literature (see Lin and Breslow (1996), for example), but in the present paper, without any loss of generality, we confine our analysis to the GLMMs (1.1) with a single random effect. One of the main reasons for selecting such simple models is that our objective is not to analyze a higher dimensional complicated model, but to compare the relative performance of two highly competitive estimation approaches in estimating the parameters of simpler models.

Under the GLMM setup, it has proven difficult to obtain consistent and efficient estimators for the regression parameters ($\beta$) and the variance of the random effects ($\sigma^2$).
A full or exact likelihood analysis is complicated because it requires a complex integration over the distribution of the random effects. This integration problem compels one to avoid the exact likelihood estimation method, even though maximum likelihood estimators are known to be fully efficient (optimal). To overcome this computational problem, many authors over the last two decades have used various approximate methods to obtain consistent estimates for the regression effect $\beta$ and the variance of the random effects $\sigma^2$. For example, we refer to some of the leading approaches: the penalized quasi-likelihood (PQL) approach of Breslow and Clayton (1993), Breslow and Lin (1995), Kuk (1995), and Lin and Breslow (1996); the hierarchical likelihood (HL) approach due to Lee and Nelder (1996, 2001); and the recent generalized quasi-likelihood (GQL) approach of Sutradhar (2004), Jiang (1998), Jiang and Zhang (2001), and Sutradhar and Rao (2001).

Note that in both the PQL and HL approaches the estimation of $\beta$ and $\sigma^2$ requires the estimation of the $\gamma_i$ as well. These are estimated by pretending that the $\gamma_i$ are fixed parameters even though they are truly unobservable random effects. To be specific, in the first step, both the PQL and HL approaches estimate $\beta$ and $\gamma_i$. The difference between the two approaches is that the PQL approach estimates $\beta$ and $\gamma_i$ by maximizing a penalized quasi-likelihood function, whereas the HL approach maximizes the hierarchical likelihood function

$$h = \log \prod_{i=1}^{K} \prod_{j=1}^{n_i} f(y_{ij} \mid \gamma_i) + \log \prod_{i=1}^{K} h^{*}(\gamma_i; \sigma^2), \tag{1.2}$$

to estimate these parameters, where $h^{*}(\gamma_i; \sigma^2)$ is the likelihood function of the unobserved $\gamma_i$, and $f(y_{ij} \mid \gamma_i)$ is the conditional density function of the response $y_{ij}$ given $\gamma_i$. In the second step, in estimating $\sigma^2$, the PQL approach maximizes a profile quasi-likelihood function, whereas the HL approach maximizes an adjusted profile hierarchical likelihood function.

We further remark that the PQL approach of Breslow and Clayton (1993) may not provide consistent estimates for the parameters, mainly for the variance component $\sigma^2$ [Sutradhar and Qu (1998)]. To be specific, Sutradhar and Qu (1998) have argued that the PQL approach may provide an inconsistent estimate for $\sigma^2$ when the cluster sizes ($n_i$) are small, especially in situations where $\sigma^2$ is large, such as $\sigma^2 \geq 0.75$. Breslow and Lin (1995, p.20) have also reported a similar consistency problem for the estimates of $\sigma^2$. In particular, these authors argued that the PQL approach produces consistent estimates when $\sigma^2$ is small, such as $\sigma^2 \leq 0.25$.
Lin and Breslow (1996) attempted to produce bias corrected estimates, but their bias corrections also do not appear to improve the consistency results significantly when $\sigma^2$ is large. Consequently, we do not follow the PQL approach any further in the present paper.

Note that as the PQL and the HL approaches both estimate $\beta$ and $\sigma^2$ through the estimation of $\gamma_i$ ($i = 1, \ldots, K$), the HL approach may suffer from inconsistency problems for the same reasons that cause inconsistency in the PQL approach. There does not, however, exist any comparative study between the HL approach and any other competitive approach. Further note that, as an alternative to the PQL and other moment based (Jiang and Zhang (2001)) approaches, Sutradhar (2004) has recently proposed a GQL approach that produces consistent as well as more efficient estimates. The purpose of this paper is to make a comparative study between the GQL and the HL approaches.

Both the HL and GQL approaches are reviewed briefly in Section 2. An extensive simulation study is conducted in Section 3 for various small and large values of $\sigma^2$, as well as for small and large cluster sizes such as $n_i = 2$, 4 and 6. The simulation results show that the HL approach estimates the parameters with larger biases than the GQL approach, with the HL approach performing relatively worse for the estimates of the main regression parameter $\beta$. Both estimation approaches are illustrated in Section 4 by analyzing a health care utilization data set. The paper is concluded in Section 5.

2 Estimation of Parameters: GQL versus HL Inferences

2.1. GQL approach.

Let $y_i = (y_{i1}, \ldots, y_{ij}, \ldots, y_{in_i})'$ and $\mu_i = E(Y_i) = (\mu_{i1}, \ldots, \mu_{ij}, \ldots, \mu_{in_i})'$, where $\mu_{ij} = E_{\gamma_i} E(Y_{ij} \mid \gamma_i) = E_{\gamma_i}(\mu_{ij}^*)$, with $\mu_{ij}^* = \exp(x_{ij}'\beta + z_i\gamma_i)$ as the conditional mean of the Poisson distribution computed from (1.1). For normally distributed $\gamma_i$ this gives $\mu_{ij} = \exp(x_{ij}'\beta + \tfrac{1}{2} z_i^2\sigma^2)$. Further, let $\Sigma_i = (\sigma_{ijk})$ be the $n_i \times n_i$ covariance matrix of $y_i$. It can be shown that

$$\sigma_{ijj} = \mathrm{Var}(Y_{ij}) = \mu_{ij} + [\exp(z_i^2\sigma^2) - 1]\,\mu_{ij}^2$$

for $j = 1, \ldots, n_i$, and that for $j, k = 1, \ldots, n_i$, $j \neq k$,

$$\sigma_{ijk} = \mathrm{Cov}(Y_{ij}, Y_{ik}) = \mu_{ij}\mu_{ik}\,[\exp(z_i^2\sigma^2) - 1].$$
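The marginal mean vector and covariance matrix above are simple to assemble numerically. The following is a minimal Python sketch under stated assumptions: normal random effects, a common $z_i$ for all members of a cluster, and a hypothetical helper name `marginal_mean_cov` that is not part of the paper.

```python
import numpy as np

def marginal_mean_cov(X_i, beta, sigma2, z_i=1.0):
    """Marginal mean mu_i and covariance Sigma_i of one cluster under the
    Poisson-normal mixed model (hypothetical helper, for illustration only)."""
    s = z_i ** 2 * sigma2
    mu = np.exp(X_i @ beta + 0.5 * s)           # mu_ij = exp(x_ij' beta + z_i^2 sigma^2 / 2)
    c = np.exp(s) - 1.0                         # exp(z_i^2 sigma^2) - 1
    Sigma = np.diag(mu) + c * np.outer(mu, mu)  # diagonal: mu + c mu^2; off-diagonal: c mu_j mu_k
    return mu, Sigma

# Illustrative cluster of size 2; the covariate values are arbitrary.
X_i = np.array([[1.0, 1.0], [0.0, 2.0]])
mu_i, Sigma_i = marginal_mean_cov(X_i, beta=np.array([1.0, 1.0]), sigma2=0.4)
print(mu_i)
print(Sigma_i)
```

Writing $\Sigma_i$ as a diagonal matrix plus a rank-one term makes the structure explicit: the random effect adds the same multiplicative overdispersion factor to every pair of members in the cluster.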
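For given $\sigma^2$, one may then write the GQL estimating equation [Sutradhar (2004)] for $\beta$ as

$$\sum_{i=1}^{K} \frac{\partial \mu_i'}{\partial \beta}\, \Sigma_i^{-1} (y_i - \mu_i) = 0, \tag{2.1}$$

(see Wedderburn (1979) and McCullagh (1983) for the independence case) where $\partial \mu_i'/\partial \beta = X_i' A_i$, with $X_i' = [x_{i1}, \ldots, x_{in_i}]$ and $A_i = \mathrm{diag}[\mu_{i1}, \ldots, \mu_{in_i}]$.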
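Equation (2.1) has no closed-form solution, so it is solved iteratively. The sketch below uses a Gauss-Newton type iteration, which is an assumption on our part (the text does not state the numerical scheme here), takes $z_i = 1$ for every cluster, and uses a hypothetical function name `gql_beta`.

```python
import numpy as np

def gql_beta(X_list, y_list, sigma2, beta0, max_iter=50, tol=1e-10):
    """Solve the first order GQL equation (2.1) for beta at a fixed sigma2
    with a Gauss-Newton type iteration (illustrative sketch, z_i = 1)."""
    beta = np.asarray(beta0, dtype=float).copy()
    c = np.exp(sigma2) - 1.0
    for _ in range(max_iter):
        score = np.zeros_like(beta)
        info = np.zeros((beta.size, beta.size))
        for X_i, y_i in zip(X_list, y_list):
            mu = np.exp(X_i @ beta + 0.5 * sigma2)      # marginal means
            Sigma = np.diag(mu) + c * np.outer(mu, mu)  # marginal covariance
            D = mu[:, None] * X_i                       # A_i X_i, the derivative of mu_i
            Sinv_D = np.linalg.solve(Sigma, D)          # Sigma_i^{-1} A_i X_i
            score += Sinv_D.T @ (y_i - mu)              # X_i' A_i Sigma_i^{-1} (y_i - mu_i)
            info += D.T @ Sinv_D                        # X_i' A_i Sigma_i^{-1} A_i X_i
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Noise-free check: if y_i equals its marginal mean, the root of (2.1) is the generating beta.
X_list = [np.array([[1.0, 1.0], [0.0, 2.0]]), np.array([[1.0, 0.0], [1.0, 1.0]])]
beta_true = np.array([1.0, 1.0])
y_list = [np.exp(X @ beta_true + 0.5 * 0.8) for X in X_list]
print(gql_beta(X_list, y_list, sigma2=0.8, beta0=np.array([0.8, 1.2])))
```

Only marginal first and second moments enter the iteration; no estimate of the random effects $\gamma_i$ is required at any step.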

Note that as the first order responses are used to construct (2.1), we refer to this equation as the first order GQL estimating equation. Once the regression effect $\beta$ is estimated by solving (2.1), one may construct a second order GQL estimating equation to estimate $\sigma^2$. Let

$$g_i = [y_{i1}^2, \ldots, y_{in_i}^2, y_{i1}y_{i2}, \ldots, y_{i,n_i-1}\,y_{in_i}]' \tag{2.2}$$

be the $n_i(n_i + 1)/2 \times 1$ vector of all distinct second order responses for the $i$th cluster. Let $\lambda_i = E(G_i)$ and $\Omega_i = \mathrm{Cov}(G_i)$ be the mean vector and covariance matrix of $g_i$, respectively, under the Poisson mixed model. It then follows that for a given value of $\beta$, the variance component $\sigma^2$ may be estimated by solving the second order GQL estimating equation

$$\sum_{i=1}^{K} \frac{\partial \lambda_i'}{\partial \sigma^2}\, \Omega_i^{-1} (g_i - \lambda_i) = 0 \tag{2.3}$$

[Sutradhar (2004), Jiang and Zhang (2001)]. Note that the $\lambda_i$ vector in (2.3) is easily calculated, because $E(Y_{ij}^2) = \mathrm{Var}(Y_{ij}) + \mu_{ij}^2$ for all $j = 1, \ldots, n_i$, and $E(Y_{ij}Y_{ik}) = \mathrm{Cov}(Y_{ij}, Y_{ik}) + \mu_{ij}\mu_{ik}$ for all $j \neq k$, $j, k = 1, \ldots, n_i$. The derivative vector $\partial \lambda_i/\partial \sigma^2$ may also be easily computed. The computation of the fourth order moments matrix $\Omega_i$ is lengthy but straightforward. All elements of this matrix may be computed by using the conditioning and unconditioning principle, as was done for the variances and covariances. For example, the variance of $Y_{ij}^2$ may be computed as

$$\begin{aligned}
\mathrm{Var}(Y_{ij}^2) &= E(Y_{ij}^4) - [E(Y_{ij}^2)]^2 = E_{\gamma_i} E(Y_{ij}^4 \mid \gamma_i) - [E(Y_{ij}^2)]^2 \\
&= \mu_{ij} + \mu_{ij}^2\,(7e^{z_i^2\sigma^2} - 1) + 2\mu_{ij}^3\, e^{z_i^2\sigma^2}(3e^{2z_i^2\sigma^2} - 1) + \mu_{ij}^4\, e^{2z_i^2\sigma^2}(e^{4z_i^2\sigma^2} - 1),
\end{aligned} \tag{2.4}$$

and the covariance between $Y_{ij}^2$ and $Y_{ik}Y_{il}$ for $j \neq k < l$ may be computed as

$$\begin{aligned}
\mathrm{Cov}(Y_{ij}^2, Y_{ik}Y_{il}) &= E_{\gamma_i} E(Y_{ij}^2 Y_{ik} Y_{il} \mid \gamma_i) - E_{\gamma_i}[E(Y_{ij}^2 \mid \gamma_i)]\; E_{\gamma_i}[E(Y_{ik}Y_{il} \mid \gamma_i)] \\
&= \mu_{ij}\mu_{ik}\mu_{il}\, e^{z_i^2\sigma^2}(e^{2z_i^2\sigma^2} - 1) + \mu_{ij}^2\mu_{ik}\mu_{il}\, e^{2z_i^2\sigma^2}(e^{4z_i^2\sigma^2} - 1).
\end{aligned} \tag{2.5}$$
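The conditioning and unconditioning argument behind (2.4) can be checked numerically. The sketch below relies on two standard facts that are not spelled out in the text: the raw moments of a Poisson variable with mean $m$ satisfy $E(Y^2) = m^2 + m$ and $E(Y^4) = m^4 + 6m^3 + 7m^2 + m$, and for a normal random effect $E(\mu_{ij}^{*k}) = \mu_{ij}^k \exp\{k(k-1) z_i^2\sigma^2/2\}$. The function names are hypothetical.

```python
import numpy as np

def var_y2_closed_form(mu, s):
    """Right-hand side of (2.4); mu is the marginal mean and s = z_i^2 * sigma^2."""
    e = np.exp
    return (mu
            + mu ** 2 * (7 * e(s) - 1)
            + 2 * mu ** 3 * e(s) * (3 * e(2 * s) - 1)
            + mu ** 4 * e(2 * s) * (e(4 * s) - 1))

def var_y2_by_conditioning(mu, s):
    """Same quantity from E_gamma E(Y^4 | gamma) - [E(Y^2)]^2."""
    def m(k):
        # k-th moment of the log-normal conditional mean, in terms of the marginal mean
        return mu ** k * np.exp(0.5 * k * (k - 1) * s)
    ey4 = m(4) + 6 * m(3) + 7 * m(2) + m(1)   # unconditional fourth raw moment
    ey2 = m(2) + m(1)                          # unconditional second raw moment
    return ey4 - ey2 ** 2

print(var_y2_closed_form(3.0, 0.4), var_y2_by_conditioning(3.0, 0.4))
```

The two functions agree for any positive mean and any $s \geq 0$, and with $s = 0$ both reduce to the variance of the square of an ordinary Poisson variable.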

We remark that as $E(Y_i) = \mu_i$ and $E(G_i) = \lambda_i$, the first order (2.1) and the second order (2.3) GQL estimating equations are unbiased. This unbiasedness property yields consistent estimates for both $\beta$ and $\sigma^2$. Furthermore, as the covariance matrices $\Sigma_i$ and $\Omega_i$ are the appropriate weight matrices in the estimating equations, the GQL estimates of $\beta$ and $\sigma^2$ will also be more efficient than other unbiased estimators, such as moment estimators, that are derived without exploiting the weight matrices. See, for example, Jiang (1998), Jiang and Zhang (2001), and Sutradhar (2004).

We further remark that the GQL estimation of $\beta$ and $\sigma^2$ by (2.1) and (2.3) does not require any estimates of the random effects $\gamma_i$ ($i = 1, 2, \ldots, K$), whereas the existing leading analytical approaches, such as the PQL approach of Breslow and Clayton (1993) and the HL approach of Lee and Nelder (1996), need to estimate these random effects by pretending that they are fixed effects. It is known, however [Sutradhar and Qu (1998), Jiang (1998)], that the PQL approach for the estimation of $\beta$ and $\sigma^2$ through the estimation of $\gamma_i$ appears to encounter convergence problems, especially for the estimation of $\sigma^2$. As the HL approach also exploits the estimates of $\gamma_i$ for the estimation of $\beta$ and $\sigma^2$, it may encounter inconsistency problems similar to those of the PQL approach. But even though the HL approach has been illustrated by analyzing various numerical examples in Lee and Nelder (1996, 2001), there does not appear to be any empirical study examining the consistency of the HL estimates for $\beta$ and $\sigma^2$. In Section 3, we conduct an extensive simulation study to examine whether the HL estimates satisfy this important property. The HL estimating equations are reviewed in the next subsection.
Note that we also include the GQL approach in the simulation study in order to examine the performance of the HL estimators relative to that of the GQL estimators. The GQL and HL approaches are also illustrated by analyzing a real-life data set. This is done in Section 4.

2.2. HL approach.

In the present setup, that is, under the Poisson-normal mixed model, the HL function (1.2) may be expressed as

$$h = \sum_{i=1}^{K} \left[ \sum_{j=1}^{n_i} y_{ij} \log(\mu_{ij}^*) - \sum_{j=1}^{n_i} \mu_{ij}^* - \sum_{j=1}^{n_i} \log(y_{ij}!) \right] - \left[ \frac{K}{2} \log(2\pi) + \frac{K}{2} \log(\sigma^2) + \frac{1}{2\sigma^2} \sum_{i=1}^{K} \gamma_i^2 \right], \tag{2.6}$$

where $\mu_{ij}^* = \exp(x_{ij}'\beta + z_i\gamma_i)$. In this approach, for given $\sigma^2$, similar to the PQL approach, one estimates $\beta$ and $\gamma_i$ by maximizing the likelihood function in (2.6). More specifically, writing $\mu_i^* = [\mu_{i1}^*, \ldots, \mu_{in_i}^*]'$, one solves the HL estimating equations

$$\frac{\partial h}{\partial \beta} = 0 \quad \text{and} \quad \frac{\partial h}{\partial \gamma_i} = 0, \tag{2.7}$$

for $\beta$ and $\gamma_i$ ($i = 1, \ldots, K$), respectively. For the estimation of $\sigma^2$, Lee and Nelder (1996) have, however, maximized a general adjusted profile h-likelihood defined by

$$h_A = h + \frac{1}{2} \log\{\det(2\pi\phi H^{-1})\} = h + \frac{p + K}{2} \log(2\pi) - \frac{1}{2} \log\{\det(H)\}, \tag{2.8}$$

where $\phi = 1$ under the Poisson model, and the matrix $H$ is defined as

$$H = \begin{bmatrix} X'WX & X'WZ \\ Z'WX & Z'WZ + U \end{bmatrix} : (p + K) \times (p + K),$$

where $X = [X_1', \ldots, X_i', \ldots, X_K']' : \sum n_i \times p$, with $X_i$ as the $n_i \times p$ covariate matrix; $W$ and $Z$ are block-diagonal matrices given by $W = \oplus_{i=1}^{K} A_i^* : \sum n_i \times \sum n_i$ and $Z = \oplus_{i=1}^{K} 1_{n_i} : \sum n_i \times K$, respectively, with $A_i^* = \mathrm{diag}[\mu_{i1}^*, \ldots, \mu_{in_i}^*] : n_i \times n_i$ and $1_{n_i} = (1, \ldots, 1)' : n_i \times 1$; and $U = \frac{1}{\sigma^2} I_K$, $I_K$ being the $K \times K$ identity matrix. The maximization is achieved by using the iterative equation

$$\hat\sigma^2_{(r+1)} = \hat\sigma^2_{(r)} + \left[ \left( -\frac{\partial^2 h_A}{\partial \sigma^4} \right)^{-1} \frac{\partial h_A}{\partial \sigma^2} \right]_{(r)}, \tag{2.9}$$

where the square bracket $[\,\cdot\,]_{(r)}$ indicates that the quantity inside is evaluated at $\sigma^2 = \hat\sigma^2_{(r)}$, the value at the $r$th iteration.
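To make the objective behind the HL steps concrete, the h-likelihood (2.6) can be evaluated directly for trial values of $\beta$, the $\gamma_i$ and $\sigma^2$. This is a minimal sketch, assuming $z_i = 1$ (consistent with $Z$ built from vectors of ones) and a hypothetical function name `h_loglik`; it evaluates $h$ only and does not implement the adjusted profile likelihood (2.8) or the iteration (2.9).

```python
import math
import numpy as np

def h_loglik(beta, gamma, sigma2, X_list, y_list):
    """Hierarchical log-likelihood (2.6) under the Poisson-normal model with z_i = 1
    (illustrative sketch; gamma holds one trial random effect per cluster)."""
    K = len(y_list)
    h = 0.0
    for X_i, y_i, g_i in zip(X_list, y_list, gamma):
        eta = X_i @ beta + g_i                                 # log of the conditional mean
        log_fact = np.array([math.lgamma(y + 1.0) for y in y_i])  # log(y_ij!)
        h += np.sum(y_i * eta - np.exp(eta) - log_fact)        # conditional Poisson part
    h -= 0.5 * K * math.log(2.0 * math.pi) + 0.5 * K * math.log(sigma2)
    h -= np.sum(np.asarray(gamma, dtype=float) ** 2) / (2.0 * sigma2)  # normal part
    return h

# One cluster with one zero count, all parameters at trivial values.
print(h_loglik(np.zeros(1), [0.0], 1.0, [np.zeros((1, 1))], [np.array([0.0])]))
```

Because the $\gamma_i$ appear as arguments alongside $\beta$, maximizing this function treats the $K$ random effects as if they were fixed parameters, which is the feature of the HL approach examined in the simulation study.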

Let $\hat\beta_{GQL}$ and $\hat\sigma^2_{GQL}$ be the solutions of the GQL estimating equations (2.1) for $\beta$ and (2.3) for $\sigma^2$, respectively. Similarly, let $\hat\beta_{HL}$ and $\hat\sigma^2_{HL}$ be the solutions of the HL based estimating equations (2.7) for $\beta$, and (2.9) for $\sigma^2$, respectively.

3 A Simulation Study

In this section, we examine the performance of the HL estimators, $\hat\beta_{HL}$ and $\hat\sigma^2_{HL}$, relative to the corresponding GQL estimators, $\hat\beta_{GQL}$ and $\hat\sigma^2_{GQL}$, through a simulation study consisting of 500 simulations. Under each simulation, for given covariates and selected values of the design parameters, we generate the data $\{y_{ij};\ j = 1, \ldots, n_i,\ i = 1, \ldots, K\}$ by using the Poisson form of (1.1) with mean parameters

$$\mu_{ij}^* = E(Y_{ij} \mid \gamma_i^*) = \exp(\beta_1 x_{ij1} + \beta_2 x_{ij2} + \sigma\gamma_i^*) \tag{3.1}$$

where $\gamma_i^* = \gamma_i/\sigma \overset{\text{iid}}{\sim} N(0, 1)$. We choose the number of clusters as $K = 100$ and cluster sizes $n_i = 2$, 4, 6, 10 and 16. Note that the cluster sizes are chosen to accommodate both small and large clusters. For the regression parameters we use $\beta_1 = \beta_2 = 1.0$. As far as the values of the fixed covariates $x_{ij} = (x_{ij1}, x_{ij2})'$ are concerned, we choose

$$x_{ij1} = \begin{cases}
1 & \text{for } j = 1, 2, \ldots, n_i/2;\ i = 1, 2, \ldots, K/2 \\
0 & \text{for } j = n_i/2 + 1, \ldots, n_i;\ i = 1, 2, \ldots, K/2 \\
1 & \text{for } j = 1, \ldots, n_i;\ i = K/2 + 1, \ldots, K
\end{cases}$$

$$x_{ij2} = \begin{cases}
1 & \text{for } j = 1, 2, \ldots, n_i/2;\ i = 1, 2, \ldots, K/2 \\
2 & \text{for } j = n_i/2 + 1, \ldots, n_i;\ i = 1, 2, \ldots, K/2 \\
0 & \text{for } j = 1, 2, \ldots, n_i/2;\ i = K/2 + 1, \ldots, K \\
1 & \text{for } j = n_i/2 + 1, \ldots, n_i;\ i = K/2 + 1, \ldots, K
\end{cases}$$

With regard to the selection of the variance of the random effects, we choose $\sigma^2 = 0.4$, 0.8, and 1.2. We remark that even though in theory the overdispersion index parameter $\sigma^2$ can take any value from 0 to $\infty$, for practical purposes $\sigma^2 \geq 1.0$ appears to be quite large. This is because under the Poisson-normal mixed model (3.1), the overdispersion in the count data may increase significantly even if the increment in $\sigma^2$ is small. To be specific, the variance of $y_{ij}$, $\sigma_{ijj} = \mu_{ij} + [\exp(\sigma^2) - 1]\mu_{ij}^2$, under the Poisson-normal mixed model increases significantly, depending on the value of the mean function $\mu_{ij} = \exp(x_{ij}'\beta + \tfrac{1}{2}\sigma^2)$, even if $\sigma^2$ changes only from 1.0 to 1.2, for example. We further remark that Breslow and Lin (1995, p. 90) were able to obtain unbiased estimates of this overdispersion index parameter only when $\sigma^2$ ranges up to 0.5.

Next, under each simulation, the simulated values of $\{y_{ij}\}$ along with the values of the covariates $\{x_{ij}\}$ are used to obtain the GQL estimates of $\beta$ and $\sigma^2$ by solving (2.1) and (2.3), respectively, and the HL estimates of $\beta$ and $\sigma^2$ by using (2.7) and (2.9), respectively. Note that under the HL approach, we also had to estimate $\gamma_i$ ($i = 1, \ldots, 100$) by treating them as fixed parameters, but these estimates are not reported as they are not of direct interest. The simulated means (SMs) and simulated standard errors (SSEs) for each of the above four estimates are shown in Table 1 for all selected cluster sizes $n_i = 2$, 4, 6, 10 and 16.
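One replicate of the simulation design can be generated as in the following minimal Python sketch. The covariate values are taken exactly as they are printed in the text, the cluster size is assumed to be even and common to all clusters, and the function names `design` and `simulate` are hypothetical.

```python
import numpy as np

def design(K, n):
    """Covariates (x_ij1, x_ij2) of Section 3 as a (K, n, 2) array; n must be even."""
    half = n // 2
    X = np.empty((K, n, 2))
    X[:K // 2, :half] = (1.0, 1.0)   # first K/2 clusters, first half of members
    X[:K // 2, half:] = (0.0, 2.0)   # first K/2 clusters, second half of members
    X[K // 2:, :half] = (1.0, 0.0)   # remaining clusters, first half of members
    X[K // 2:, half:] = (1.0, 1.0)   # remaining clusters, second half of members
    return X

def simulate(K, n, beta, sigma2, rng):
    """Draw one data set from the Poisson-normal mixed model (3.1)."""
    X = design(K, n)
    gamma_star = rng.standard_normal(K)                       # gamma_i^* ~ N(0, 1)
    eta = X @ beta + np.sqrt(sigma2) * gamma_star[:, None]    # shared effect within a cluster
    y = rng.poisson(np.exp(eta))                              # conditionally independent counts
    return X, y

rng = np.random.default_rng(0)
X, y = simulate(K=100, n=4, beta=np.array([1.0, 1.0]), sigma2=0.8, rng=rng)
print(X.shape, y.shape, y.mean())
```

Repeating `simulate` 500 times and applying an estimator to each replicate gives the raw material for the simulated means and standard errors.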

Table 1. Comparison of simulated means (SMs), simulated standard errors (SSEs), and simulated relative biases (SRBs) of the regression estimates and estimates of the variance of the random effects by the GQL and HL approaches, under the Poisson-normal mixed model, for selected values of $\sigma^2$: $K = 100$; $n_i = 2$, 4, 6, 10 and 16 ($i = 1, 2, \ldots, K$); true values of the regression parameters: $\beta_1 = 1.0$ and $\beta_2 = 1.0$.

Cluster size  σ²    Method  Quantity  β̂1       β̂2       σ̂²
2             0.40  GQL     SM        1.0151   1.0177   0.38080
                            SSE       0.0458   0.0335   0.05454
                            SRB       33       52       35
                    HL      SM        1.0777   1.0671   0.37277
                            SSE       0.0399   0.0265   0.03490
                            SRB       195      253      78
              0.80  GQL     SM        1.0167   1.0211   0.76685
                            SSE       0.0577   0.0434   0.08797
                            SRB       29       49       38
                    HL      SM        1.2498   1.2320   0.86248
                            SSE       0.0467   0.0613   0.09693
                            SRB       535      378      64
              1.20  GQL     SM        1.0151   1.0167   1.1580
                            SSE       0.0648   0.0532   0.10970
                            SRB       23       31       38
                    HL      SM        2.0532   1.9972   4.94560
                            SSE       0.1144   0.1306   0.91610
                            SRB       921      764      409
4             0.40  GQL     SM        1.0109   1.0122   0.39234
                            SSE       0.0405   0.0305   0.06464
                            SRB       27       40       12
                    HL      SM        1.0878   1.0760   0.40431
                            SSE       0.0293   0.0185   0.05466
                            SRB       300      411      8
              0.80  GQL     SM        1.0125   1.0108   0.78193
                            SSE       0.0483   0.0399   0.07726
                            SRB       26       27       23
                    HL      SM        1.1661   1.1430   0.82808
                            SSE       0.0277   0.0205   0.05278
                            SRB       600      698      53
              1.20  GQL     SM        1.0079   1.0097   1.17190
                            SSE       0.0523   0.0442   0.08980
                            SRB       15       22       31
                    HL      SM        1.2894   1.2594   1.34880
                            SSE       0.0249   0.0164   0.07270
                            SRB       1162     1582     205
6             0.40  GQL     SM        1.0112   1.0115   0.39263
                            SSE       0.0351   0.0283   0.05388
                            SRB       32       41       14
                    HL      SM        1.09580  1.08310  0.41699
                            SSE       0.02400  0.01520  0.02610
                            SRB       399      547      65
              0.80  GQL     SM        1.0081   1.0080   0.79385
                            SSE       0.0460   0.0375   0.14485
                            SRB       18       21       4
                    HL      SM        1.15830  1.13660  0.84985
                            SSE       0.02700  0.02070  0.04668
                            SRB       586      660      107
              1.20  GQL     SM        1.0049   1.0053   1.18060
                            SSE       0.0491   0.0415   0.08470
                            SRB       10       13       23
                    HL      SM        0.62217  0.64976  1.83400
                            SSE       0.11312  0.10386  0.32340
                            SRB       334      337      196
10            0.40  GQL     SM        1.0062   1.0079   0.40305
                            SSE       0.0323   0.0268   0.06370
                            SRB       19       29       5
                    HL      SM        1.1028   1.0901   0.43130
                            SSE       0.01860  0.01220  0.02503
                            SRB       553      739      125
              0.80  GQL     SM        1.0087   1.0086   0.79286
                            SSE       0.0397   0.0334   0.09233
                            SRB       22       26       8
                    HL      SM        1.1457   1.1244   0.86208
                            SSE       0.02200  0.02040  0.03576
                            SRB       662      610      174
              1.20  GQL     SM        1.0024   1.0029   1.27940
                            SSE       0.0430   0.0370   0.52300
                            SRB       10       13       23
                    HL      SM        0.3976   0.4469   2.55910
                            SSE       0.16860  0.15750  0.81620
                            SRB       357      351      167
16            0.40  GQL     SM        1.0035   1.0033   0.40656
                            SSE       0.0346   0.0267   0.05391
                            SRB       10       12       12
                    HL      SM        1.1091   1.0933   0.43602
                            SSE       0.01560  0.00939  0.02648
                            SRB       699      994      136
              0.80  GQL     SM        1.0036   1.0010   0.80778
                            SSE       0.0310   0.0267   0.05993
                            SRB       12       4        13
                    HL      SM        1.1344   1.1111   0.79100
                            SSE       0.02070  0.01880  0.19300
                            SRB       649      591      5
              1.20  GQL     SM        1.0066   1.0053   1.50400
                            SSE       0.06190  0.04630  0.92270
                            SRB       11       11       33
                    HL      SM        0.1753   0.2341   2.98100
                            SSE       0.25280  0.25070  1.36000
                            SRB       326      306      131

Table 2. Comparison of simulated means (SMs), simulated standard errors (SSEs), and simulated relative biases (SRBs) of the regression estimates and estimates of the variance of the random effects by the GQL approach, under the Poisson-gamma mixed model, for selected values of $\sigma^2$: $K = 100$; $n_i = 4$ and 10 ($i = 1, 2, \ldots, K$); true values of the regression parameters: $\beta_1 = 1.0$ and $\beta_2 = 1.0$.

Cluster size  σ²    Quantity  β̂1       β̂2       σ̂²
4             0.40  SM        1.0122   1.0124   0.39125
                    SSE       0.0406   0.0318   0.06482
                    SRB       30       39       13
              0.80  SM        1.0085   1.0047   0.81190
                    SSE       0.0486   0.0440   0.2300
                    SRB       17       11       5
              1.20  SM        1.0061   1.0120   1.17340
                    SSE       0.0518   0.0460   0.09620
                    SRB       12       26       28
10            0.40  SM        0.99721  1.0018   0.41667
                    SSE       0.03591  0.02800  0.06479
                    SRB       8        6        26
              0.80  SM        1.0084   1.0081   0.79112
                    SSE       0.03940  0.03280  0.06760
                    SRB       21       25       13
              1.20  SM        0.9994   1.0007   1.18950
                    SSE       0.03968  0.03420  0.06870
                    SRB       1        2        15

Note that an estimate that is highly biased but has a small standard error is a useless estimate. For this reason, to check the actual convergence of the estimates to their corresponding parameter values, we have also computed the simulated relative bias (SRB) given by

$$\mathrm{SRB} = \frac{|\mathrm{SM} - \text{true parameter value}|}{\mathrm{SSE}} \times 100.$$

These SRBs are reported in Table 1 for all cluster sizes and selected values of the overdispersion index parameter.

3.1. Relative performance of GQL and HL approaches.

With regard to the estimation of $\beta_1$ and $\beta_2$, it is clear from the simulation results displayed in Table 1 that the GQL approach always produces regression estimates with smaller relative bias than the HL approach. This better performance of the GQL approach appears to hold for all small and large cluster sizes $n_i = 2$, 4, 6, 10 and 16, as well as for all small and large values of $\sigma^2 = 0.4$, 0.8 and 1.2. For example, when $n_i = 2$, the GQL estimates of $\beta_1$ and $\beta_2$ are slightly biased, with SRBs 33 and 52 when $\sigma^2 = 0.4$, and SRBs 23 and 31 when $\sigma^2 = 1.2$. But the HL estimates for the same regression parameters appear to converge to wrong values with small standard errors. To be specific, for $n_i = 2$, the HL estimates of $\beta_1$ and $\beta_2$ have SRBs 195 and 253 when $\sigma^2 = 0.4$, and 921 and 764 when $\sigma^2 = 1.2$. Similarly, for a large cluster size $n_i = 16$, the SRBs for the GQL estimates are 10 and 12 when $\sigma^2 = 0.4$, and 11 and 11 when $\sigma^2 = 1.2$, whereas the respective SRBs for the HL approach are 699 and 994 when $\sigma^2 = 0.4$, and 326 and 306 when $\sigma^2 = 1.2$. Thus, irrespective of the cluster size $n_i$ and the value of $\sigma^2$, the GQL approach performs much better than the HL approach in estimating $\beta_1$ and $\beta_2$.

Note that the results in Table 1 also reveal that the pattern of the regression estimates as a function of $n_i$ and $\sigma^2$ is different under the HL approach than under the GQL approach. When the cluster size is small, such as $n_i = 2$, 4 and 6, the SRBs of the regression estimates get smaller for the GQL estimates but larger for the HL estimates as the value of $\sigma^2$ increases. When the cluster size is large, such as $n_i = 10$ and 16, the performance of the GQL estimates of the regression parameters remains the same irrespective of the value of $\sigma^2$, whereas the HL estimates appear to perform better as the value of $\sigma^2$ increases. But when the HL and GQL approaches are compared, as mentioned above, the GQL approach performs uniformly better in estimating $\beta_1$ and $\beta_2$.

With regard to the estimation of the overdispersion parameter $\sigma^2$, the GQL approach, in general, appears to perform better than the HL approach.
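The SRB entries of Table 1 can be reproduced from the corresponding SM and SSE entries. A minimal Python sketch, assuming the absolute-value form of the SRB definition (which is what the tabulated values imply) and a hypothetical function name:

```python
def simulated_relative_bias(sm, true_value, sse):
    """SRB = |SM - true value| / SSE * 100."""
    return abs(sm - true_value) / sse * 100.0

# Table 1, n_i = 2, sigma^2 = 0.4, estimates of beta_1 (true value 1.0)
print(round(simulated_relative_bias(1.0151, 1.0, 0.0458)))   # GQL entry, tabulated as 33
print(round(simulated_relative_bias(1.0777, 1.0, 0.0399)))   # HL entry, tabulated as 195
```

An SRB of 195 means the simulated mean lies almost two simulated standard errors away from the true value, so a small SSE does not compensate for the bias.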
For example, when n i = 6, the GQL approach produces σ 2 estimates with SRBs 14 and 23 for σ 2 = 0.4 and 1.2, respectively; whereas the corresponding SRBs for the HL estimates are found to be 65 and 196, respectively. 3.2.Performance of the GQL approach under negative binomial set-up. In the simulation study in the last section, it was found that the GQL approach performed uniformly better than the HL approach in estimating both regression and overdispersion parameter. But, the comparison was made under the Poisson-normal mixed model discussed in Section 2. In this section, we conducted another simulation study by generating the random effects γ i from a gamma distribution with parameters 1/(c exp(σ 2 /2)) and 1/c with c = exp(σ 2 ) 1. To be specific, we generate γ i from the gamma distribution with density

$$f(\gamma_i) = \left[\{c \exp(\sigma^2/2)\}^{1/c}\,\Gamma(c^{-1})\right]^{-1} \exp\left[-\{c \exp(\sigma^2/2)\}^{-1}\gamma_i\right]\gamma_i^{c^{-1}-1}, \qquad (3.2)$$

(Jowaheer, 2006), and then generate $y_{ij}$ from the Poisson distribution (Poisson form of (1.1)) with mean $\mu_{ij} = \exp(x_{ij}'\beta + \gamma_i)$. It then follows that $y_{ij}$ unconditionally has a negative binomial distribution with the same mean and the same variance as in the Poisson-normal model considered in Section 3.1. But the higher order moments are different under these two models. In the present simulation study we use the same covariates and the same values for the design parameters as in the previous simulation study in Section 3.1. For simplicity, we report the simulation results in Table 2 for a small cluster size $n_i = 4$ and for a relatively large cluster size $n_i = 10$. All three values of $\sigma^2$ were considered. It is clear from the results in Table 2 that the GQL approach continues to perform well in estimating the parameters even under the present negative binomial model.

4 A Numerical Illustration: Health Care Utilization Data Analysis

In the last section it was shown through a simulation study that the HL approach produces highly biased estimates with smaller standard errors as compared to the GQL approach. Thus the GQL approach performs better than the HL approach in estimating all parameters of the Poisson familial model. The purpose of this section is to analyze a real-life data set and to interpret the estimates in light of the simulation behavior. With regard to the selection of the data, we have chosen to analyze a set of count data collected by a general hospital in Canada that contains responses and associated covariates from a moderately large number of independent families. The data set is described in Subsection 4.1.

4.1. Health care utilization data. The Department of Community Medicine, Health Science Center (General Hospital) in St. John's, Canada, collected familial-longitudinal data on health care utilization for $K = 48$ families over a period of 6 years, from 1985 to 1990. Note that some authors, such as Sutradhar, Jowaheer and Sneddon (2008), have used the GQL approach and analyzed this complete familial-longitudinal data for four years collected from 48 families. Our purpose is, however, to concentrate on the familial data analysis only, as an application of the theoretical and simulation results discussed in Sections 2 and 3.
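Returning briefly to the gamma set-up of Section 3.2: the statement that the gamma-based model shares its mean and variance with the Poisson-normal model can be checked directly from the parameters of the density (3.2). The sketch below is only illustrative and rests on one assumption that is not spelled out in the text, namely that the gamma variate enters the conditional mean multiplicatively, i.e. it plays the role of $\exp(\gamma_i)$ in the Poisson-normal model. All function names are hypothetical.

```python
import math

def gamma_effect_moments(sigma2):
    """Mean and variance of the gamma variate with density (3.2)."""
    c = math.exp(sigma2) - 1.0
    shape = 1.0 / c                        # second parameter, 1/c
    scale = c * math.exp(sigma2 / 2.0)     # reciprocal of the first parameter
    return shape * scale, shape * scale ** 2

def lognormal_effect_moments(sigma2):
    """Mean and variance of exp(gamma_i) when gamma_i ~ N(0, sigma2)."""
    mean = math.exp(sigma2 / 2.0)
    var = (math.exp(sigma2) - 1.0) * math.exp(sigma2)
    return mean, var

def marginal_count_moments(eta, effect_mean, effect_var):
    """Unconditional mean and variance of a Poisson count whose conditional
    mean is exp(eta) times a random effect (ASSUMED multiplicative)."""
    base = math.exp(eta)
    mean = base * effect_mean
    # E[Var(y | effect)] + Var(E[y | effect])
    var = mean + base ** 2 * effect_var
    return mean, var

if __name__ == "__main__":
    eta = 0.5  # arbitrary linear predictor, for illustration only
    for sigma2 in (0.4, 0.8, 1.2):
        g = marginal_count_moments(eta, *gamma_effect_moments(sigma2))
        ln = marginal_count_moments(eta, *lognormal_effect_moments(sigma2))
        print(sigma2, g, ln)
```

Under this reading the two rows printed for each $\sigma^2$ coincide, which is the sense in which only the higher order moments distinguish the two models.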

For our purpose, we consider a part of this whole data set, for 1985 only. Among these 48 families, 36 are of size $n_i = 4$ ($i = 1, \ldots, 36$), and the remaining 12 are of size $n_i = 3$ ($i = 37, \ldots, 48$). Each of the family members was asked about the number of visits they paid to a physician during 1985. In our notation, $y_{ij}$ denotes this count response for the $j$th ($j = 1, \ldots, n_i$) member of the $i$th family ($i = 1, \ldots, 48$). Their gender ($x_{ij1}$), chronic condition ($x_{ij2}$) [CC], education level ($x_{ij3}$) [EL] and age ($x_{ij4}$) were also collected. For convenience, we use the following codes for these four covariates:

$$x_{ij1} = \begin{cases} 0 & \text{female} \\ 1 & \text{male} \end{cases} \qquad x_{ij2} = \begin{cases} 0 & \text{without chronic diseases} \\ 1 & \text{with chronic diseases} \end{cases}$$

$$x_{ij3} = \begin{cases} 0 & \text{less than high school} \\ 1 & \text{high school or above} \end{cases} \qquad x_{ij4} = \text{exact age of the individual}$$

To have a feel for this familial data, the distribution of the count responses, from 180 individuals, by all four covariates is displayed in Table 3.

Table 3. Summary Statistics of Physician Visits by Four Covariates in the Health Care Utilization Data for 1985.

                                       Number of Visits
Covariate           Level              0     1     2    3-5    ≥6    Total
Gender              Male              28    22    18    16    12      96
                    Female            11     5    15    21    32      84
Chronic Condition   No                26    20    15    16    11      88
                    Yes               13     7    18    21    33      92
Education Level     < High School     17     5    11    10    15      58
                    ≥ High School     22    22    22    27    29     122
Age                 20-30             23    17    14    15    15      84
                    31-40              1     1     3     3     3      11
                    41-50              4     4     5    12     8      33
                    51-65             10     5     8     5    13      41
                    66-85              1     0     3     2     5      11

It is seen from Table 3 that, in general, more males appear to visit a physician a smaller number of times, while a large number of females visit a physician at least 3 times. As expected, we see that an individual with chronic diseases visits a physician more often. Physician visits for individuals with a higher level of education seem to be evenly distributed, i.e., individuals are just as likely to visit a physician once as 3-5 times. Those with a lower level of education appear either not to visit a physician at all or to visit a large number of times. With regard to the relationship between number of visits and age, we have temporarily made 5 age groups and observed that some of the individuals in the 20-30 age group have visited a physician a large number of times. As expected, a large number of individuals did not visit a physician at all. For older age groups, there was a tendency for an individual to see a physician more often.

In the next subsection, we examine the effects of the above four covariates on the responses by applying both the GQL and HL approaches discussed in Section 2. In our notation, $\beta = (\beta_1, \beta_2, \beta_3, \beta_4)'$ denotes their effects. Note, however, that when we computed the mean and the variance of the 180 count responses, it was found that an individual on average visited his/her physician 3.92 times, with variance 22.66. This indicates that the count responses contain over-dispersion, which in our notation is represented by the over-dispersion index parameter $\sigma^2$, the variance of the random family effects. We also need to estimate this variance parameter along with $\beta$.

4.2. GQL and HL estimates. The GQL estimates of $\beta$ and $\sigma^2$ were obtained by solving equations (2.1) and (2.3), respectively. The standard errors of the components of $\hat\beta_{GQL}$ were calculated from its asymptotic covariance matrix, given by

$$\mathrm{Cov}(\hat\beta_{GQL}) = \left[\sum_{i=1}^{K} \frac{\partial \mu_i'}{\partial \beta}\,\Sigma_i^{-1}\,\frac{\partial \mu_i}{\partial \beta'}\right]^{-1},$$

and similarly, the standard error of $\hat\sigma^2_{GQL}$ was computed from

$$\mathrm{Var}(\hat\sigma^2_{GQL}) = \left[\sum_{i=1}^{K} \frac{\partial \lambda_i'}{\partial \sigma^2}\,\Omega_i^{-1}\,\frac{\partial \lambda_i}{\partial \sigma^2}\right]^{-1}.$$

Next, the HL estimates of $\beta$ and $\sigma^2$ were obtained from (2.7) and (2.9), respectively.
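As a hedged illustration of the covariance formula for $\hat\beta_{GQL}$, the sketch below evaluates it on a toy design. It assumes the Poisson-normal moment structure $\mu_{ij} = \exp(x_{ij}'\beta + \sigma^2/2)$, $\mathrm{var}(y_{ij}) = \mu_{ij} + c\mu_{ij}^2$ and $\mathrm{cov}(y_{ij}, y_{ik}) = c\mu_{ij}\mu_{ik}$ with $c = \exp(\sigma^2) - 1$, so that $\Sigma_i = A_i + c\mu_i\mu_i'$ with $A_i = \mathrm{diag}(\mu_i)$ and $\partial\mu_i/\partial\beta' = A_i X_i$. With the Sherman-Morrison identity, each summand then reduces to $X_i'A_iX_i - c\,(X_i'\mu_i)(X_i'\mu_i)'/(1 + c\sum_j \mu_{ij})$. The toy design matrices, the parameter values and the function names are hypothetical and are not taken from the health care data.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix given as a list of rows."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def gql_covariance(designs, beta, sigma2, poisson_working=False):
    """Inverse of sum_i D_i' Sigma_i^{-1} D_i under the ASSUMED moments above.

    designs: list of cluster design matrices X_i (one row per family member).
    poisson_working=True sets c = 0, i.e. Sigma_i = diag(mu_i), for comparison.
    """
    p = len(beta)
    c = 0.0 if poisson_working else math.exp(sigma2) - 1.0
    info = [[0.0] * p for _ in range(p)]
    for X in designs:
        mu = [math.exp(sum(x * b for x, b in zip(row, beta)) + sigma2 / 2.0)
              for row in X]
        total = sum(mu)
        x_mu = [sum(m * row[a] for m, row in zip(mu, X)) for a in range(p)]
        for a in range(p):
            for b in range(p):
                x_a_x = sum(m * row[a] * row[b] for m, row in zip(mu, X))
                info[a][b] += x_a_x - c * x_mu[a] * x_mu[b] / (1.0 + c * total)
    return invert(info)

if __name__ == "__main__":
    designs = [
        [[1, 0], [1, 1], [1, 0], [1, 1]],   # a family of size 4
        [[1, 1], [1, 0], [1, 1]],           # a family of size 3
        [[1, 0], [1, 0], [1, 1], [1, 1]],
    ]
    beta, sigma2 = [0.5, -0.7], 0.8
    full = gql_covariance(designs, beta, sigma2)
    working = gql_covariance(designs, beta, sigma2, poisson_working=True)
    for a in range(len(beta)):
        print(math.sqrt(full[a][a]), math.sqrt(working[a][a]))
```

On such inputs the standard errors from the full formula are never smaller than those from the Poisson working form, because the correction term subtracted from the information matrix is non-negative definite.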
The standard errors of the components of $\hat\beta_{HL}$ were computed from

$$\mathrm{Cov}(\hat\beta_{HL}) = \left[\sum_{i=1}^{K} X_i'\,\mathrm{diag}[\mu_{i1}, \ldots, \mu_{in_i}]\,X_i\right]^{-1},$$

and similarly the standard error of $\hat\sigma^2_{HL}$ was computed by using the so-called Hessian scalar

$$\mathrm{Var}(\hat\sigma^2_{HL}) = \left[-\frac{\partial^2 h_A}{\partial \sigma^4}\right]^{-1}.$$

These estimates along with their standard errors are displayed in Table 4.

Table 4. The GQL, PGQL, ZIGQL, HL, PHL and MM estimates, along with available standard errors, for the Health Care Utilization Data for 1985.

                      Effects of the Covariates                                                      Variance
Method   Quantity     Gender ($\hat\beta_1$)   CC ($\hat\beta_2$)   EL ($\hat\beta_3$)   Age ($\hat\beta_4$)   $\hat\sigma^2$
GQL      Value        -0.754                   0.666                0.434                0.010                 0.873
         SE            0.091                   0.125                0.123                0.0030                0.409
PGQL     Value        -0.739                   0.670                0.494                0.012                 0.619
         SE            0.090                   0.125                0.117                0.0028                n/a
ZIGQL    Value        -0.630                   0.606                0.255                0.0078                1.397
         SE            0.121                   0.164                0.164                0.0043                1.317
HL       Value        -0.693                   0.689                0.633                0.016                 0.187
         SE            0.080                   0.088                0.067                0.0017                0.020
PHL      Value        -0.704                   0.685                0.615                0.015                 0.235
         SE            0.080                   0.088                0.066                0.0017                n/a
MM       Value        -0.651                   0.686                0.511                0.014                 0.529
         SE            0.079                   0.088                0.067                0.0017                n/a

Note that even though the method of moments [Jiang (1998)] is known to produce estimates inferior to those of the GQL approach [Sutradhar (2004)], for the sake of completeness we also provide in this section the moment estimates for both $\beta$ and $\sigma^2$. To be specific, the moment estimate $\hat\beta_{MM}$ of $\beta$ is obtained by solving

$$\sum_{i=1}^{K} X_i'(y_i - \mu_i) = 0, \qquad (4.1)$$

with its asymptotic variance computed from

$$\mathrm{Cov}(\hat\beta_{MM}) = \left[\sum_{i=1}^{K} X_i'\,\mathrm{diag}[\mu_{i1}, \ldots, \mu_{in_i}]\,X_i\right]^{-1},$$

where $\mu_i$ in (4.1) is the unconditional mean vector. As far as the moment estimate of $\sigma^2$ is concerned, we obtain $\hat\sigma^2_{MM}$ by solving

$$[S - E(S)] = 0, \qquad (4.2)$$

where

$$S = \sum_{i=1}^{N}\sum_{j=1}^{n_i} y_{ij}^2 + \sum_{i=1}^{N}\sum_{j<k}^{n_i} y_{ij}\,y_{ik},$$

and

$$E(S) = \sum_{i=1}^{N}\sum_{j=1}^{n_i}\left[\mu_{ij} + \mu_{ij}^2 e^{\sigma^2}\right] + \sum_{i=1}^{N}\sum_{j<k}^{n_i} \mu_{ij}\,\mu_{ik}\,e^{\sigma^2},$$

with $\mu_{ij} = \exp(x_{ij}'\beta + \tfrac{1}{2}\sigma^2)$. The MM estimate of $\beta$, along with the standard errors of the components of $\hat\beta_{MM}$, is shown in the same Table 4. The value of $\hat\sigma^2_{MM}$ is also given.

We also provide the estimates for $\beta$ and $\sigma^2$ based on a partial GQL (PGQL) as well as a partial HL (PHL) approach. To be specific, in the PGQL approach, $\beta$ and $\sigma^2$ are iteratively estimated by using their formulas based on the GQL and the MM approaches, respectively. Similarly, the PHL estimates of $\beta$ and $\sigma^2$ are iteratively computed by using their formulas based on the HL and the MM approaches, respectively. These estimates are also reported in Table 4.

The results in Table 4 show that the estimates for $\beta$ and $\sigma^2$ are generally different under the GQL and the HL approaches. Furthermore, except for the estimate of $\sigma^2$, the HL estimates appear to be closer to the MM estimates than the GQL estimates are. Since the GQL estimates are known to be better than the MM estimates [Sutradhar (2004)], and also because the GQL estimates reported in the simulation study in Section 3 were found to be less biased than the HL estimates, one would naturally find the GQL estimates in Table 4 more reliable than the HL estimates. Note that the estimated standard errors under the HL approach are found to be smaller than those under the GQL approach. This is also in agreement with the simulation results reported in Table 1. But as argued in Section 3, the large biases of the HL estimates, along with their smaller standard errors, make these HL estimates unreliable, as they converge to wrong values far away from the parameter values. Remark that, as expected, the PGQL and PHL estimates are found to be closer to the GQL and HL estimates, respectively.

Since the GQL estimates are found to be better than the HL and MM estimates, we now interpret the effects of the covariates as well as the variance of the family effects by using the GQL approach. Thus, $\hat\sigma^2_{GQL} = 0.873$ indicates that the data contain large over-dispersion. This is also in agreement with the results reported in Subsection 4.1, where it was shown that an individual visits the physician 3.92 times on average, with a very large variance of 22.66. With regard to the regression effects, the negative value of $\hat\beta_{1(GQL)}$, namely $\hat\beta_{1(GQL)} = -0.754$, indicates that females made more visits to the physician as compared to males. Next, $\hat\beta_{2(GQL)} = 0.666$ and $\hat\beta_{4(GQL)} = 0.010$ suggest that individuals having some chronic diseases, or individuals who are older, pay more visits to the physician, as expected. The effect of the education level on the health condition, however, appears to be intriguing. This is because $\hat\beta_{3(GQL)} = 0.434$ suggests that highly educated individuals make more visits compared to individuals with a lower level of education.
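Because the conditional mean has the exponential form $\exp(x_{ij}'\beta + \gamma_i)$, the GQL coefficients in Table 4 can also be read on the multiplicative scale. The following sketch, offered only as an illustration, exponentiates each estimate to obtain a rate ratio (other covariates and the family effect held fixed) and attaches an approximate 95% interval. The normal-approximation (Wald) interval is an assumption of this sketch and is not part of the analysis reported here; the variable and function names are hypothetical.

```python
import math

# GQL estimates and standard errors copied from Table 4.
GQL_TABLE4 = {
    "Gender (male vs female)": (-0.754, 0.091),
    "Chronic condition (yes vs no)": (0.666, 0.125),
    "Education (high school or above vs less)": (0.434, 0.123),
    "Age (per year)": (0.010, 0.0030),
}

def rate_ratio(estimate, se, z=1.96):
    """Return (ratio, lower, upper) on the multiplicative scale.

    The interval is a Wald-type approximation (an assumption of this sketch).
    """
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))

if __name__ == "__main__":
    for name, (est, se) in GQL_TABLE4.items():
        ratio, lower, upper = rate_ratio(est, se)
        print(f"{name}: ratio {ratio:.3f}, "
              f"approx. 95% interval ({lower:.3f}, {upper:.3f})")
```

Read this way, males are estimated to make a little under half as many visits as comparable females (ratio about 0.47), a chronic condition roughly doubles the visit rate (about 1.95), and the ratio for the higher education level is about 1.54.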
One of the reasons for this behavior of the education covariate may be that individuals with a higher level of education are more concerned about their health condition than individuals with a lower level of education.

4.3. GQL estimates under a zero-inflated Poisson model. Note that in the analysis of the health care data displayed in Table 3, we have assumed that the Poisson distribution conditional on the random effects would fit the data well. Since, on a closer look, the count 0 in particular appears to occur at a higher rate than expected, we now accommodate this feature by applying the so-called Zero-inflated Poisson (ZIP) mixed model. Here we refer to Ridout et al. (2001) and the references therein for the analysis of the ZIP fixed model.
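A rough way to see the excess of zeros is to compare the observed number of zero counts in Table 3 with what a single Poisson distribution having the overall mean would predict. This back-of-the-envelope check is an assumption-laden sketch: it ignores the covariates and the family effects, so it serves only as a crude baseline and is not the comparison carried out in the paper. The names used are hypothetical.

```python
import math

N_INDIVIDUALS = 180        # total number of individuals in Table 3
OBSERVED_ZEROS = 28 + 11   # zero-visit counts for males and females (Table 3)
OVERALL_MEAN = 3.92        # mean number of visits reported in Subsection 4.1

def poisson_zero_probability(mean):
    """P(Y = 0) for a Poisson variable with the given mean."""
    return math.exp(-mean)

def expected_zeros(n, mean):
    """Expected number of zeros among n independent Poisson counts."""
    return n * poisson_zero_probability(mean)

if __name__ == "__main__":
    print("observed proportion of zeros:", OBSERVED_ZEROS / N_INDIVIDUALS)
    print("plain Poisson baseline:", poisson_zero_probability(OVERALL_MEAN))
    print("expected zeros under baseline:",
          expected_zeros(N_INDIVIDUALS, OVERALL_MEAN))
```

With these inputs the observed share of zeros (39 of 180, about 22%) is far above the roughly 2% implied by the plain Poisson baseline. Part of that gap would also arise from the over-dispersion induced by the family effects, so the check is suggestive only.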

Let $w_i$ denote the proportion of zeros in the $i$th family ($i = 1, \ldots, 48$). Then, in a fashion similar to that of Ridout et al. (2001), we write the ZIP density for $y_{ij}$ conditional on $\gamma_i$ as

$$f(y_{ij} \mid \gamma_i) = \begin{cases} w_i + (1 - w_i)\exp(-\mu_{ij}), & y_{ij} = 0, \\ (1 - w_i)\exp(-\mu_{ij})\,\mu_{ij}^{y_{ij}}/y_{ij}!, & y_{ij} > 0, \end{cases} \qquad (4.3)$$

where $\mu_{ij} = \exp(x_{ij}'\beta + \gamma_i)$. This conditional ZIP model produces the mean and the variance of $y_{ij}$ as

$$E(Y_{ij} \mid \gamma_i) = (1 - w_i)\mu_{ij}, \qquad \mathrm{Var}(Y_{ij} \mid \gamma_i) = (1 - w_i)\mu_{ij}(1 + w_i\mu_{ij}).$$

By computing $\hat w_i = [\text{number of zeros in the } i\text{th family}]/n_i$ and using this estimate, we now compute the unconditional mean, the variance and the other moments of order up to four. We then solve the GQL estimating equations (2.1) and (2.3) to obtain the estimates of $\beta$ and $\sigma^2$ under the ZIP mixed model. These Zero-inflated Poisson GQL (ZIGQL) estimates, along with their standard errors, are reported in the same Table 4.

Note that the ZIGQL estimate of the overdispersion parameter (1.397) is larger than the GQL estimate (0.873). Thus, the count data considered here appear to have much more overdispersion than shown by the GQL approach. As far as the regression estimates are concerned, the ZIGQL estimates appear to be slightly different from the corresponding GQL estimates. The ZIGQL approach produces the regression estimates with larger standard errors than those of the GQL approach. This is expected, as the ZIGQL approach indicates that the data have more variation. Note that the ZIGQL estimates are not comparable with the HL estimates. To examine any possible changes to the HL estimates due to zero-inflation, one could compute the ZIHL estimates, but this was not done in this paper, as the HL estimates were found to be inferior to the GQL estimates.

5 Concluding Remarks

In this paper we have considered a Poisson-normal mixed model, which is an important special case of the well-known generalized linear mixed model.

In this problem, it is of interest to estimate the regression effects and the variance of the random effects consistently and efficiently. A great deal of discussion has taken place over the last two decades on the relative performance of some of the widely used estimation methods, such as the MM (Jiang, 1998; Jiang and Zhang, 2001), PQL (Breslow and Clayton, 1993) and GQL (Sutradhar, 2004) approaches. But none of these procedures was compared with the existing HL (Lee and Nelder, 1996) approach, even though this latter approach appears to be quite familiar. Since the GQL approach was found to be better than the MM and PQL approaches, in this paper we have examined the relative performance of this well-behaved GQL approach against the HL approach.

For the comparison between the HL and GQL approaches, we have first simplified all related estimating equations under these two approaches. We then conducted an extensive simulation study to examine the relative performances of these procedures in estimating both the regression effects and the variance of the random effects. Note that the HL approach requires 3 estimating equations, including the estimation of the random effects, whereas the GQL approach requires only two estimating equations and does not need to estimate the random effects. The simulation study was conducted for five different cluster sizes and three values of $\sigma^2$ (the variance of the random effects), small and large. It was found that as the value of $\sigma^2$ increases, the HL approach starts to produce highly biased estimates for the regression effects. The GQL approach was, however, found to produce almost unbiased estimates for the regression effects, irrespective of the magnitude of $\sigma^2$. As far as the estimation of the variance parameter $\sigma^2$ is concerned, the GQL approach was also found to be uniformly better than the HL approach.
In this case, contrary to the regression estimation, the HL approach was found to perform better, even though it still trails the GQL approach. Hence, the GQL approach is definitely better than the HL approach in estimating all parameters of the model. When the other studies mentioned above are taken into consideration, the GQL approach appears to be the best so far among the MM, PQL, HL and GQL approaches. We therefore recommend the use of the GQL approach in practice, irrespective of the magnitude of the over-dispersion in the familial/cluster count data.

To demonstrate the effectiveness of the GQL approach, we have applied both the GQL and HL approaches to analyze a real-life data set on health care utilization. The estimates were found to reflect the simulation results. We have also applied a ZIGQL approach to the same data set, and the estimates were found to be similar to, but somewhat different from, the GQL estimates.

Acknowledgements. The authors would like to thank the referee and the associate editor for their valuable comments that led to the improvement of the paper. This research was supported partially by a grant from the Natural Sciences and Engineering Research Council of Canada.

References

Breslow, N.E. and Clayton, D.G. (1993). Approximate inference in generalized linear mixed models. J. Amer. Statist. Assoc., 88, 9-25.
Breslow, N.E. and Lin, X. (1995). Bias correction in generalized linear models with single component of dispersion. Biometrika, 82, 81-92.
Jiang, J. (1998). Consistent estimators in generalized linear mixed models. J. Amer. Statist. Assoc., 93, 720-729.
Jiang, J. and Zhang, W. (2001). Robust estimation in generalized linear mixed models. Biometrika, 88, 753-765.
Jowaheer, V. (2006). Model misspecification effects in clustered count data analysis. Statist. Probab. Lett., 76, 470-478.
Kuk, A. C. (1995). Asymptotically unbiased estimation in generalized linear models with random effects. J. Roy. Statist. Soc. B, 57, 395-407.
Lee, Y. and Nelder, J.A. (1996). Hierarchical generalized linear models. J. Roy. Statist. Soc. B, 58, 619-678.
Lee, Y. and Nelder, J.A. (2001). Hierarchical generalized linear models: A synthesis of generalized linear models, random-effect models and structured dispersions. Biometrika, 88, 987-1006.
Lin, X. and Breslow, N.E. (1996). Bias correction in generalized linear mixed models with multiple components of dispersion. J. Amer. Statist. Assoc., 91, 1007-1016.
McCullagh, P. (1983). Quasi-likelihood functions. Ann. Statist., 11, 59-67.
McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models, 2nd edn. London: Chapman and Hall.
Ridout, M., Hinde, J. and Demetrio, C.G.B. (2001). A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives. Biometrics, 57, 219-223.
Sutradhar, B.C., Jowaheer, V. and Sneddon, G. (2008). On a unified generalized quasi-likelihood approach for familial-longitudinal non-stationary data. Scandinavian J. Statist., 35, 597-612.
Sutradhar, B.C. and Qu, Z. (1998). On approximate likelihood inference in a Poisson mixed model. Canad. J. Statist., 26, 169-186.
Sutradhar, B.C. and Rao, R.P. (2001). On marginal quasi-likelihood inference in generalized linear mixed models. J. Multivariate Anal., 76, 1-34.
Sutradhar, B.C. (2004). On exact quasi-likelihood inference in generalized linear mixed models. Ind. J. of Statist., 66, 263-291.

Wedderburn, R. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika, 61, 439-447.

M.R.I. Chowdhury and B.C. Sutradhar
Department of Mathematics and Statistics
Memorial University of Newfoundland
St. John's, NL, Canada A1C 5S7
E-mail: bsutradh@mun.ca

Paper received December 2007; revised December 2008.