MODEL SELECTION BASED ON QUASI-LIKELIHOOD WITH APPLICATION TO OVERDISPERSED DATA


J. Jpn. Soc. Comp. Statist., 26(2013), DOI: /jjscs

Yiping Tang

ABSTRACT

Overdispersion is a common phenomenon in discrete data analysis with generalized linear models (GLMs) that cannot be ignored. In the literature on model selection with overdispersion, the mainstream approach is to use the maximum likelihood method, under the strong assumption that an entire distribution is specified for the data, to construct information criteria. For example, Fitzmaurice (1997) proposed model selection methods based on Efron's (1986) double-exponential families. However, the assumption of a parametric distribution for the data is sometimes too strong. In this paper, we propose a new model selection strategy based on quasi-likelihood functions. First, we define a statistical model through its first two moments, the mean and the variance. Second, we define the semiparametric information based on such semiparametric statistical models. Finally, we propose the semiparametric information criterion (SIC) based on quasi-likelihood estimators under weak assumptions on the first two moments. Although the proposed semiparametric information criterion is inspired by data with overdispersion, it applies more generally, whether the data are discrete or continuous, and may be useful for any model for which quasi-likelihood methods are appropriate. We also demonstrate the use of SIC through simulation studies and an application to the well-known data on germination of Orobanche.

1. Introduction

1.1. Overdispersed data

In many areas of application, such as biostatistics, life insurance and epidemiological studies, when data are analyzed with GLMs (including binomial and Poisson models), the phenomenon of overdispersion is widespread.
That is, although a GLM built on a standard exponential family implies a specified mean-variance relationship, the variation in the data is greater than the variance predicted by that relationship.

Example 1 We use a familiar example introduced by McCullagh and Nelder (1989, p.125) to illustrate the problem of overdispersion. Overdispersion commonly arises in cluster sampling. Clusters usually vary in size; for simplicity, we assume that the cluster size, k, is fixed and that the m individuals sampled actually come from m/k clusters. In the i-th cluster, the number of positive respondents, Z_i, is assumed to have the binomial distribution with index k and parameter π_i, which is considered as a random variable and thus varies from cluster to cluster. The total number of positive respondents is Y = Z_1 + Z_2 + ... + Z_{m/k}.

Graduate School of Science, Chiba University, 1-33 Yayoi-cho, Inage-ku, Chiba, Japan tou.kihei@gmail.com
Key words: Akaike information criterion; Generalized linear models; Kullback-Leibler information; Model selection; Overdispersion; Quasi-likelihood

If we let E(π_i) = π and Var(π_i) = τ²π(1−π), it may be shown that the unconditional mean and variance of Y are given by

E(Y) = mπ,  Var(Y) = mπ(1−π){1 + (k−1)τ²} = mπ(1−π)σ².

Note that the above results depend only on the first two moments of π_i. Note also that the dispersion parameter σ² = 1 + (k−1)τ² depends both on the cluster size and on the variability of π from cluster to cluster, but not on the sample size m. Overdispersion can occur only if k > 1.

1.2. Modeling overdispersed data

In order to model overdispersion, many approaches have been proposed that use the maximum likelihood method under an assumption specifying an entire distribution for the data. Williams (1982) modified the standard logistic-linear analysis by using beta-binomial distributions to accommodate overdispersed binomial data. Lawless (1987) proposed negative-binomial regression models to deal with overdispersed Poisson data. Efron (1986) proposed the double-exponential family method, which adds a dispersion parameter to control the variance independently of the mean. Anderson et al. (1994) proposed the modified information criteria QAIC and QAIC_c, which are based on the quasi-likelihood, to deal with model selection for overdispersed data, and they found that these criteria performed well in product multinomial models of capture-recapture data in the presence of differing levels of overdispersion. Hurvich and Tsai (1995) provided information on the use of AIC_c with overdispersed data. Fitzmaurice (1997) developed model selection strategies based on Efron's (1986) double-exponential families, so that the maximum likelihood method can still be used and model selection can be done in the generalized linear model framework. However, in practical data analysis, the above methods share a common drawback: the true distribution of the data will never be known, so the assumption of specifying an exact distribution is very strong.
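The moment formulas in Example 1 can be checked by a small Monte Carlo experiment. The sketch below assumes, purely for concreteness, that the cluster probabilities π_i follow a beta distribution matched to E(π_i) = π and Var(π_i) = τ²π(1−π); the derivation above needs only these two moments, not the beta form.

```python
import random

random.seed(2013)
m, k = 600, 6                 # m sampled individuals in m/k clusters of fixed size k
pi, tau2 = 0.3, 0.04          # E(pi_i) and Var(pi_i) = tau2 * pi * (1 - pi)

# Beta(a, b) mixing distribution matched to the first two moments of pi_i:
# Var(pi_i) = pi(1 - pi)/(co + 1) = tau2 * pi * (1 - pi)  =>  co = 1/tau2 - 1
co = 1.0 / tau2 - 1.0
a, b = co * pi, co * (1.0 - pi)

ys = []
for _ in range(5000):
    y = 0
    for _ in range(m // k):
        p_i = random.betavariate(a, b)                       # cluster probability
        y += sum(random.random() < p_i for _ in range(k))    # Z_i ~ Binomial(k, p_i)
    ys.append(y)

mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / (len(ys) - 1)
sigma2 = 1.0 + (k - 1) * tau2     # predicted dispersion factor sigma^2

print(mean_y, m * pi)                        # both close to 180
print(var_y, m * pi * (1 - pi) * sigma2)     # inflated well above m*pi*(1-pi)
```

With τ² = 0.04 and k = 6 the predicted inflation factor is σ² = 1 + 5 × 0.04 = 1.2, and the empirical variance sits near mπ(1−π)σ² rather than the nominal binomial value mπ(1−π).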
Wedderburn (1974) proposed an important method called the quasi-likelihood method. He added a dispersion parameter to control the variance independently of the mean, but weakened the assumption of specifying the entire distribution of the data: he used assumptions on the first two moments only, rather than on the entire distribution. An attractive property of the quasi-likelihood function is that it behaves much like an ordinary likelihood function. When we deal with overdispersed data and use only the first and second moments, maximum likelihood methods are not available. Consequently, when a model selection problem arises, the maximum-likelihood-based information criterion AIC is no longer applicable.

1.3. Contributions of the paper

The purpose of this paper is to extend GLMs to situations where we do not wish to impose any distributional assumption on the data; instead we make assumptions on its first two moments. Using this semiparametric information, we proceed by extending the idea of Akaike (1974), using the Kullback-Leibler information to measure the discrepancy between the true distribution and the assumed semiparametric models. We call the resulting information criterion the semiparametric information criterion (SIC). One

major application is the use of SIC for analyzing overdispersed data. The rest of the paper is organized as follows. In Section 2, we briefly review the quasi-likelihood method of Wedderburn (1974); the quasi-likelihood estimator is used in the subsequent sections for constructing predictive semiparametric models. Section 3 contains the main theoretical results: new information criteria based on the semiparametric models are developed there. In Section 4, we apply the proposed model selection techniques to the classical seed germination data of Crowder (1978); our results are consistent with most results in the literature. Finally, in Section 5, we present a series of simulation studies that demonstrate the potential usefulness of the proposed model selection methods and also indicate problems for future research.

2. Quasi-likelihood

Generalized linear models are widely used as a standard tool in modern regression analysis. They were first introduced by Nelder and Wedderburn (1972). Their estimation is based on the iteratively reweighted least squares algorithm, which requires only a relationship between the mean and the variance instead of the full distribution. Wedderburn (1974) extended the GLM by replacing the likelihood function with the quasi-likelihood function. McCullagh (1983) further developed the quasi-likelihood approach and established the consistency and asymptotic normality of the parameter estimates under second-moment assumptions. We briefly review the quasi-likelihood method here. Suppose that the n components of the response vector Y are independent with mean vector µ and covariance matrix ϕV(µ); ϕ may be a known or unknown constant and V(·) is known. Since the components of Y are independent, the matrix V(µ) can be written in the form

V(µ) = diag{V_1(µ), ..., V_n(µ)}.
One assumption on V(µ) is that V_i(µ) depends only on the i-th component of µ. We first consider a single component Y of the response vector Y. Under the above conditions, we define

U = ∂Q(Y; µ)/∂µ = (Y − µ)/{ϕV(µ)}. (1)

The function U has the following properties:

E(U) = 0,  Var(U) = 1/{ϕV(µ)},  E(∂U/∂µ) = −1/{ϕV(µ)}.

The quasi-likelihood function Q(Y; µ) may be written as follows:

Q(Y; µ) = ∫_Y^µ (Y − t)/{ϕV(t)} dt.

We refer to Q(Y; µ) as the quasi-likelihood or, more correctly, as the log quasi-likelihood for µ. Since the components of Y are independent by assumption, the log quasi-likelihood for the complete data, Q(Y; µ), is the sum of the individual contributions. The quasi-likelihood estimating equations for the regression parameters β, obtained by differentiating Q(Y; µ) with respect to β, may be written in the form U(β) = 0, where

U(β) = D^T V^{-1} (Y − µ)/ϕ (2)

is called the quasi-score function. In this expression the components of D, of order n × q, are D_ir = ∂µ_i/∂β_r, the derivatives of µ(β) with respect to the parameters. The covariance matrix of U(β), which is also the negative expected value of ∂U(β)/∂β, is

i_β = D^T V^{-1} D/ϕ.

For quasi-likelihood functions this matrix plays a role similar to that of the Fisher information for ordinary likelihood functions. Let y_i (i = 1, ..., n) be independent response variables, which may be proportions or counts, with mean µ_i and variance ϕV(µ_i). In addition, suppose that E[y_i] = µ_i = t^{-1}(η_i), where η_i = x_i^T β, t^{-1}(·) is the inverse link function, x_i is a q-vector of covariates, β = (β_1, ..., β_q)^T is the q-vector of unknown parameters, and ϕ does not depend on β. Let X be the n × q matrix of covariates with i-th row X_(i), θ = (β^T, ϕ)^T ∈ Θ ⊂ R^{q+1}, and q < n. Since the above model depends only on assumptions about the mean and the variance, we may use the quasi-likelihood method to estimate the unknown parameters θ. For estimating β, we use the following quasi-score functions:

U_1j(β, ϕ) = Σ_{i=1}^n [ {y_i − t^{-1}(X_(i)β)} / {ϕV(t^{-1}(X_(i)β))} ] Δ_i X_ij,  for j = 1, ..., q. (3)

For estimating ϕ we use the second-moment function:

U_2(β, ϕ) = Σ_{i=1}^n [ {y_i − t^{-1}(X_(i)β)}² / {ϕV(t^{-1}(X_(i)β))} ] − (n − q − 1), (4)

where X_(i) is the i-th row of X and Δ_i = ∂t^{-1}(η_i)/∂η_i. The quasi-likelihood estimators (β̂, ϕ̂) are the simultaneous solutions of the q + 1 equations U_1j(β̂, ϕ̂) = 0, for j = 1, ..., q, and U_2(β̂, ϕ̂) = 0. According to the results of McCullagh (1983), the numerical value of β̂ is not affected by ϕ, whether ϕ is known or not, so U_1j(β, ϕ) can be viewed as a function of β alone.
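For a concrete variance function, the integral defining Q(Y; µ) can be evaluated in closed form; for instance, V(µ) = µ yields the Poisson log-likelihood up to a term not involving µ. A quick numerical check of this (a sketch, with arbitrarily chosen illustrative values of y, µ and ϕ):

```python
import math

def quasi_loglik(y, mu, phi, V, steps=200000):
    """Midpoint-rule evaluation of Q(y; mu) = integral_y^mu (y - t)/(phi*V(t)) dt."""
    h = (mu - y) / steps
    total = 0.0
    for i in range(steps):
        t = y + (i + 0.5) * h
        total += (y - t) / (phi * V(t))
    return total * h

y, mu, phi = 3.0, 5.0, 1.7                         # arbitrary illustrative values
q_num = quasi_loglik(y, mu, phi, V=lambda t: t)    # Poisson-type variance V(mu) = mu

# Closed form of the same integral: (y*log(mu/y) - (mu - y))/phi, i.e. the
# Poisson log-likelihood y*log(mu) - mu (up to a term free of mu), divided by phi.
q_exact = (y * math.log(mu / y) - (mu - y)) / phi
print(q_num, q_exact)
```

The two values agree to high precision, illustrating why Q(Y; µ) can play the role of a log-likelihood when only the mean-variance relationship is trusted.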
We rewrite the quasi-likelihood equations in matrix form,

U_1(β̂, ϕ̂) = D^T V^{-1} {Y − t^{-1}(Xβ̂)} = 0, (5)

where Y is an n × 1 vector; t^{-1}(Xβ̂) = (t^{-1}(X_(1)β̂), ..., t^{-1}(X_(n)β̂))^T; D is an n × q matrix with components D_ij = Δ_i X_ij, for i = 1, ..., n and j = 1, ..., q; and V is an n-dimensional diagonal matrix with components ϕV(t^{-1}(X_(i)β̂)), for i = 1, ..., n. The asymptotic covariance matrix of β̂ is cov(β̂) = (D^T V^{-1} D)^{-1}. Beginning with an arbitrary value β_0 sufficiently close to β, the sequence of parameter estimates defined by the Newton-Raphson method with Fisher scoring is

β_1 = β_0 + (D_0^T V_0^{-1} D_0)^{-1} D_0^T V_0^{-1} {Y − t^{-1}(Xβ_0)}. (6)

The quasi-likelihood estimator β̂ is obtained by iterating until convergence. Approximate unbiasedness and asymptotic normality of β̂ follow directly from (6) under the second-moment assumptions. Let θ̂ = (β̂^T, ϕ̂)^T. Moore (1986) established consistency and asymptotic normality of θ̂.
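As an illustration of the scoring iteration (6), the following sketch fits a quasi-binomial logistic model to simulated beta-binomial proportions and then estimates ϕ from the Pearson statistic. All function and variable names are our own, and pure-Python linear algebra is used to keep the example self-contained.

```python
import math, random

def solve(A, b):
    """Gaussian elimination with partial pivoting for the small scoring system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def fit_quasi_binomial(X, y, trials, iters=30):
    """Fisher scoring as in (6) for logit(mu_i) = x_i' beta with
    Var(y_i) = phi * mu_i * (1 - mu_i) / n_i; phi from the Pearson statistic."""
    q = len(X[0])
    beta = [0.0] * q
    for _ in range(iters):
        U = [0.0] * q                       # quasi-score D'V^{-1}(y - mu)
        J = [[0.0] * q for _ in range(q)]   # D'V^{-1}D
        for xi, yi, ni in zip(X, y, trials):
            eta = sum(b * x for b, x in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            # for the logit link, d mu/d eta = mu(1-mu) cancels against V(mu)
            for j in range(q):
                U[j] += ni * (yi - mu) * xi[j]
                for l in range(q):
                    J[j][l] += ni * mu * (1.0 - mu) * xi[j] * xi[l]
        beta = [b + s for b, s in zip(beta, solve(J, U))]
    pearson = sum((yi - 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi)))))
                  ** 2 * ni / ((mu := 1.0 / (1.0 + math.exp(-sum(b * x for b, x in
                  zip(beta, xi))))) * (1.0 - mu))
                  for xi, yi, ni in zip(X, y, trials))
    phi = pearson / (len(y) - q - 1)        # divisor n - q - 1, matching (4)
    return beta, phi

# Simulated beta-binomial data: true beta = (-0.5, 1.0), strong overdispersion.
random.seed(7)
X, y, trials = [], [], []
for i in range(200):
    x1 = i % 2
    pi_true = 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * x1)))
    P = random.betavariate(10 * pi_true, 10 * (1 - pi_true))   # mixing, c = 10
    n_i = 50
    y.append(sum(random.random() < P for _ in range(n_i)) / n_i)
    X.append([1.0, float(x1)])
    trials.append(n_i)

beta_hat, phi_hat = fit_quasi_binomial(X, y, trials)
print(beta_hat, phi_hat)    # beta_hat near (-0.5, 1.0); phi_hat well above 1
```

Apart from the divisor in ϕ̂ (R uses n − q), this mirrors the quasibinomial family of R's glm() used in Section 4: the point estimates β̂ do not depend on ϕ, while the reported standard errors are scaled by ϕ̂^{1/2}.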

3. Semiparametric Model Selection Criteria

In this section we propose the semiparametric model selection criteria SIC and SIC_T. They apply generally, whether the data are discrete or continuous. For convenience, we suppose in the subsequent paragraphs that the data are continuous; if they are discrete, integration is replaced by summation.

3.1. Semiparametric models

We assume that the true probability density function of Y is h(y) and the true distribution function is H(y). We relax the assumption of a complete specification of the probability model, using only a general form of the variance function in terms of the mean, as in GLMs. Now suppose that the statistician assumes that the random variables y_i (i = 1, ..., n) have distribution G(·), density function g(·), mean µ_i(β) and variance ϕV(µ_i(β)). The first and second moments can be combined in the following way:

m(y_i, θ) = m(y_i, β, ϕ) = ( y_i − µ_i(β),  (y_i − µ_i(β))² − ϕV(µ_i(β)) )^T = (m_1i, m_2i)^T.

By assumption, we have E_G[m(y_i, θ)] = ∫ m(y, θ) dG(y) = 0. Let M denote the set of all possible distribution functions, and let

G(θ) = { G ∈ M : ∫ m(y, θ) dG(y) = 0 }.

Define G = ∪_{θ∈Θ} G(θ) as the set of all distribution functions restricted by these moment conditions. The class G is the statistical model we shall consider in this paper.

3.2. Kullback-Leibler information

Extending the idea of AIC, we generalize the data-based model selection approach, again based on the Kullback-Leibler information, which we use to measure the closeness between H(y) and the assumed semiparametric model G(·). Let g vary in the statistical model class G. We consider the constrained minimization problem

v(β, ϕ) = inf_{g∈G} ∫ log{h(y)/g(y)} h(y) dy (7)

subject to ∫ m(y, β, ϕ) g(y) dy = 0 and ∫ g(y) dy = 1.
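The dual form used in the next display follows from a standard Lagrangian argument; a sketch (assuming the regularity needed to interchange differentiation and integration, in the spirit of Kitamura, 2006):

```latex
\text{Lagrangian: } \mathcal{L}(g,\lambda,r) = \int h \log\frac{h}{g}\,dy
  + \lambda\Big(1 - \int g\,dy\Big) - r^{\top}\!\int m(y,\beta,\phi)\, g\, dy .
\text{Pointwise stationarity in } g(y):\;
  -\frac{h}{g} - \lambda - r^{\top}m = 0
  \;\Longrightarrow\; g(y) = \frac{h(y)}{-\lambda - r^{\top}m(y,\beta,\phi)} .
\text{Substituting back gives the finite-dimensional dual objective }
  \lambda + \int \log\big(-\lambda - r^{\top}m(y,\beta,\phi)\big)\,h(y)\,dy
  \;(+\,\text{a constant}).
\text{At the optimum } \lambda = -1\text{, so after the sign change } r \mapsto -r
  \text{ the objective reduces to } \int \log\big(1 + r^{\top}m(y,\beta,\phi)\big)h(y)\,dy .
```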
Using results from convex analysis (e.g., Kitamura, 2006), the solution of the infinite-dimensional optimization problem v(β, ϕ) can be written equivalently as the solution of the following finite-dimensional optimization problem v*(β, ϕ):

v*(β, ϕ) = sup_{λ,r} [ λ + ∫ log(−λ − r^T m(y, β, ϕ)) h(y) dy ]. (8)

Since we are interested only in the maximization of v*(β, ϕ) with respect to β and ϕ, we can ignore the other terms and proceed to define the semiparametric information of the statistical model G as follows:

Definition 1 The semiparametric Kullback-Leibler information for the quasi-likelihood function is defined as

SKL(θ) = sup_{r∈R²} E_H[ρ(y, θ, r)] = sup_{r∈R²} ∫ ρ(y, θ, r) h(y) dy, (9)

where ρ(y, θ, r) = log(1 + r^T m(y, β, ϕ)).

Since the true distribution function H(·) is unknown, we use the empirical distribution of the data to estimate it.

3.3. Regularity conditions

In this paper we use SKL(θ) to select an appropriate semiparametric model, considering both the correctly specified case and the misspecified case. A semiparametric model is called correctly specified if the assumed moments, evaluated at a particular value of the parameter, equal the moments of the underlying true distribution. First, we state the regularity conditions that will be used in the sequel to prove the asymptotic unbiasedness of the proposed information criteria.

C1 Define θ* = (β*^T, ϕ*)^T as the pseudo-true value, such that the semiparametric Kullback-Leibler information SKL(θ) is minimized at θ*.

C2 ρ(y, θ, r) is twice continuously differentiable in θ ∈ Θ and r ∈ R². Note that C2 implies that

r(θ) = arg sup_{r∈R²} ∫ ρ(y, θ, r) h(y) dy (10)

is continuous with respect to θ. Furthermore, from C1 we know that there is a unique r* corresponding to θ*:

r* = arg sup_{r∈R²} ∫ ρ(y, θ*, r) h(y) dy. (11)

C3 The problem inf_{θ∈Θ} SKL(θ) has a unique saddle point solution (θ*, r*), where r* is defined in (11). When the model G is correctly specified, there exists a unique (θ*, r*) satisfying SKL(θ*) = 0.
C4 The following law of large numbers holds:

sup_{θ∈Θ} sup_{r∈R²} | (1/n) Σ_{i=1}^n ρ(y_i, θ, r) − E_H[ρ(y, θ, r)] | = o_p(1).

3.4. The expected information

When we consider SKL(θ) as defined in (9), the minimizer of SKL(θ) can be viewed as the solution of the following saddle point problem:

θ* = arg min_{θ∈Θ} sup_{r∈L(θ)} E_H[ρ(y, θ, r)], (12)

where Θ denotes the parameter space for θ, L(θ) = {r : r^T m(y, θ) ∈ I}, and I is an open interval containing zero. Next we describe the process of solving this saddle point problem. First, the expectation of ρ(y, θ, r) is maximized for given θ, so that θ* satisfies the condition

E_H[ρ_1(y, θ, r) m(y, θ)] = 0,

where ρ_1(y, θ, r) = 1/(1 + r^T m(y, θ)) is the first derivative of ρ with respect to r^T m(y, θ).

C5 Assume that ∫ ρ(y, θ, r) h(y) dy and ∫ ρ_1(y, θ, r) h(y) dy are differentiable under the integral sign.

Second, θ* is the minimizer of the quantity E_H[ρ(y, θ, r)], for a proper r, implying that

E_H[ρ_1(y, θ, r) M(y, θ)^T r(θ)] = 0,

where M(y, θ) = ∂m(y, θ)/∂θ. Now denote the matrix of first derivatives of ρ(y, θ, r) with respect to θ and r by

Υ(θ, r) = ( ρ_1(y, θ, r) M(y, θ)^T r,  ρ_1(y, θ, r) m(y, θ) )^T.

The values (θ*, r(θ*)) must satisfy the following first-order condition:

E_H[Υ(θ*, r(θ*))] = 0. (13)

C6 Define

Q = E_H[ ∂Υ(θ*, r*)/∂(θ, r)^T ],  S = E_H[ Υ(θ*, r*) Υ(θ*, r*)^T ].

Assume that Q and S are finite and that S is invertible.

However, θ* is not computable because it involves an expectation with respect to the true density function h(y). One way forward is to use the value that minimizes the empirical version of the expected semiparametric Kullback-Leibler information:

θ̃ = arg min_{θ∈Θ} { (1/n) Σ_{i=1}^n ρ(y_i, θ, r(θ)) }, (14)

where r(θ) is defined in (10). To approximate r(θ), we use the empirical distribution function in place of the true distribution function H(·) and obtain

r̂(θ) = arg max_{r∈R²} { (1/n) Σ_{i=1}^n ρ(y_i, θ, r) }. (15)

We can show that (θ̃, r̂(θ̃)) is consistent and asymptotically normally distributed along lines similar to Chen et al. (2007); we omit the proof here. However, the calculation of (θ̃, r̂(θ̃)) is unstable and usually difficult. Since the θ̂ defined in Section 2 is also a consistent estimator, it is reasonable to use (θ̂, r̂(θ̂)) instead.
It is easy to show that (θ̂, r̂(θ̂)) also possesses the properties of consistency and asymptotic normality.
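The inner maximization (15) is a smooth two-dimensional concave problem and can be solved by simple gradient ascent with step halving to keep every 1 + r^T m(y_i, θ) positive. The sketch below (our own minimal implementation, taking V(µ) ≡ 1 and normal data purely for illustration) shows the behavior that drives the criteria of Section 3.5: at a correctly specified θ the maximized value is near zero, while misspecification inflates it.

```python
import math, random

def moments(y, mu, phi):
    """m(y, theta) = (y - mu, (y - mu)^2 - phi * V(mu)) with V(mu) = 1 here."""
    return (y - mu, (y - mu) ** 2 - phi)

def sup_rho(ms, steps=1500, lr=0.05):
    """Gradient ascent for r^(theta) = argmax_r (1/n) sum_i log(1 + r'm_i), eq. (15)."""
    r = [0.0, 0.0]
    n = len(ms)
    for _ in range(steps):
        g = [sum(m[j] / (1.0 + r[0] * m[0] + r[1] * m[1]) for m in ms) / n
             for j in (0, 1)]
        step = lr
        cand = [r[0] + step * g[0], r[1] + step * g[1]]
        # step halving keeps every 1 + r'm_i strictly positive
        while step > 1e-12 and any(1.0 + cand[0] * m[0] + cand[1] * m[1] <= 1e-8
                                   for m in ms):
            step /= 2.0
            cand = [r[0] + step * g[0], r[1] + step * g[1]]
        r = cand
    val = sum(math.log(1.0 + r[0] * m[0] + r[1] * m[1]) for m in ms) / n
    return r, val

random.seed(5)
ys = [random.gauss(1.0, 1.0) for _ in range(500)]        # true mean 1, variance 1

_, val_good = sup_rho([moments(y, 1.0, 1.0) for y in ys])   # correct (mu, phi)
_, val_bad = sup_rho([moments(y, 1.5, 1.0) for y in ys])    # mean misspecified

print(val_good, val_bad)   # val_good close to 0, val_bad clearly larger
```

This is the quantity that, summed rather than averaged and penalized for dimension, appears in the criteria defined below.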

Definition 2 In the statistical model G, the expected semiparametric information is defined as

E_H[SKL(θ̂)] = E_H[ ∫ ρ(y, θ̂, r̂(θ̂)) h(y) dy ]. (16)

For later use, we write the components of Υ(θ, r) explicitly:

M(y, θ) = ∂m(y, θ)/∂θ = [ M_1  M_3
                          M_2  M_4 ],

where

M_1 = ∂m_1/∂β = −∂µ_i(β)/∂β,
M_2 = ∂m_2/∂β = 2(µ_i(β) − y) ∂µ_i/∂β − ϕ {∂V(µ_i(β))/∂µ_i} ∂µ_i/∂β,
M_3 = ∂m_1/∂ϕ = 0,
M_4 = ∂m_2/∂ϕ = −V(µ_i(β)).

3.5. The proposed information criteria

Definition 3 Let k be the dimension of θ. We define the following information criterion:

SIC = Σ_{i=1}^n ρ(y_i, θ̂, r̂(θ̂)) + k, (17)

where θ̂ = (β̂^T, ϕ̂)^T, with β̂ the quasi-likelihood estimator and ϕ̂ the estimator based on the usual Pearson residuals. We call this information criterion the semiparametric information criterion, or SIC for short. The proposed criterion SIC has the following property.

Proposition 1 Suppose that the model is correctly specified. Further suppose that (C1)-(C6) hold. Then SIC is an asymptotically unbiased estimator of the expected information.

Proof. Under C4 we know that, as n goes to infinity, (1/n) Σ_{i=1}^n ρ(y_i, θ̂, r̂(θ̂)) converges in probability to E_H[ρ(y, θ̂, r̂(θ̂)) | θ̂]. The expected information can therefore be written as follows:

E_H[SKL(θ̂)] = E_H[ E_H[ρ(y, θ̂, r̂(θ̂)) | θ̂] ]
+ { E_H[ E_H[ρ(y, θ̂, r̂(θ̂)) | θ̂] ] − E_H[ E_H[ρ(y, θ*, r(θ*)) | θ*] ] }
+ { E_H[ E_H[ρ(y, θ*, r(θ*)) | θ*] ] − E_H[ (1/n) Σ_{i=1}^n ρ(y_i, θ*, r(θ*)) ] }
+ { E_H[ (1/n) Σ_{i=1}^n ρ(y_i, θ*, r(θ*)) ] − E_H[ E_H[ρ(y, θ̂, r̂(θ̂)) | θ̂] ] }
= E_H[ E_H[ρ(y, θ̂, r̂(θ̂)) | θ̂] ] + D_1 + D_2 + D_3,

where D_1, D_2 and D_3 denote the terms on the second, third and fourth lines above, respectively. We now compute these quantities. First, consider D_1. By Taylor's theorem together with C5 and C6, expanding ρ(y, θ̂, r̂(θ̂)) at (θ*, r*) up to the second order, we have

ρ(y, θ̂, r̂(θ̂)) ≈ ρ(y, θ*, r*) + ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T ) ∂ρ(y, θ, r)/∂(θ, r) |_{(θ*, r*)}
+ (1/2) ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T ) ∂²ρ(y, θ, r)/∂(θ, r)∂(θ, r)^T |_{(θ*, r*)} ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T )^T.

Taking expectations of both sides with respect to the true distribution function H(y), we obtain

E_H[ ∂ρ(y, θ, r)/∂(θ, r) |_{(θ*, r*)} ] = E_H[Υ(θ*, r*)] = 0.

Thus we deduce that

D_1 = E_H[ E_H[ρ(y, θ̂, r̂(θ̂)) | θ̂] ] − E_H[ E_H[ρ(y, θ*, r(θ*)) | θ*] ]
= (1/2) E_H[ E_H[ ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T ) ∂²ρ(y, θ, r)/∂(θ, r)∂(θ, r)^T |_{(θ*, r*)} ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T )^T ] ] + o_p(1).

Next, we consider a second-order Taylor expansion for D_3, obtaining

ρ(y, θ*, r(θ*)) ≈ ρ(y, θ̂, r̂(θ̂)) + ( (θ* − θ̂)^T, (r(θ*) − r̂(θ̂))^T ) ∂ρ(y, θ, r)/∂(θ, r) |_{(θ̂, r̂(θ̂))}
+ (1/2) ( (θ* − θ̂)^T, (r(θ*) − r̂(θ̂))^T ) ∂²ρ(y, θ, r)/∂(θ, r)∂(θ, r)^T |_{(θ̂, r̂(θ̂))} ( (θ* − θ̂)^T, (r(θ*) − r̂(θ̂))^T )^T.

The consistency of (θ̂, r̂(θ̂)) implies that

E_H[ ∂ρ(y, θ, r)/∂(θ, r) |_{(θ̂, r̂(θ̂))} ] = 0.

These results jointly imply that

D_3 = (1/2) E_H[ E_H[ ( (θ* − θ̂)^T, (r(θ*) − r̂(θ̂))^T ) ∂²ρ(y, θ, r)/∂(θ, r)∂(θ, r)^T |_{(θ̂, r̂(θ̂))} ( (θ* − θ̂)^T, (r(θ*) − r̂(θ̂))^T )^T ] ] + o_p(1).

Finally, we consider D_2. By the law of large numbers, D_2 = 0. In addition, the quadratic forms in both D_1 and D_3 converge to centrally distributed chi-square random variables with k degrees of freedom. This completes the proof.

For a possibly misspecified model, we propose the following criterion for model selection.

Definition 4 Consider a possibly misspecified semiparametric model G as defined in Section 3.1. Let

Ŝ = (1/n) Σ_{i=1}^n Υ_i(θ̂, r̂(θ̂)) Υ_i(θ̂, r̂(θ̂))^T,  Q̂ = (1/n) Σ_{i=1}^n ∂Υ_i(θ, r)/∂(θ, r)^T |_{(θ̂, r̂(θ̂))}.

Then we define the following information criterion:

SIC_T = Σ_{i=1}^n ρ(y_i, θ̂, r̂(θ̂)) − trace(Ŝ Q̂^{-1}). (18)

The proposed criterion SIC_T has the following property.

Proposition 2 Under the regularity conditions (C1)-(C6) listed in Sections 3.3 and 3.4, SIC_T is an asymptotically unbiased estimator of the expected information.

SIC_T can be understood as a semiparametric extension of the Takeuchi information criterion for parametric likelihood analysis (Takeuchi, 1976).

Proof. Under (C1)-(C6) we obtain

(1/n) Σ_{i=1}^n Υ_i(θ*, r*) Υ_i(θ*, r*)^T →_P E_H[ Υ(θ*, r*) Υ(θ*, r*)^T ] = S

and

(1/n) Σ_{i=1}^n ∂Υ_i(θ, r)/∂(θ, r)^T |_{(θ*, r*)} →_P E_H[ ∂Υ(θ*, r*)/∂(θ, r)^T ] = Q.

Furthermore, using the consistency and asymptotic normality of (θ̂, r̂(θ̂)),

√n ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T )^T → N(0, Q^{-1} S Q^{-1}),

we obtain

2n E_H[ E_H[ρ(y, θ̂, r̂(θ̂)) | θ̂] − ρ(y, θ*, r*) ]
= n E_H[ E_H[ ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T ) ∂²ρ(y, θ, r)/∂(θ, r)∂(θ, r)^T |_{(θ*, r*)} ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T )^T ] ]
= trace( E_H[ ∂²ρ(y, θ, r)/∂(θ, r)∂(θ, r)^T |_{(θ*, r*)} ] · n E_H[ ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T )^T ( (θ̂ − θ*)^T, (r̂(θ̂) − r*)^T ) ] )
→_P trace( Q (Q^{-1} S Q^{-1}) ) = trace(S Q^{-1}),

where S and Q can be written explicitly as follows, with ρ_1 = ρ_1(y, θ*, r*) and ρ_2 = −ρ_1²:

S = E_H[ Υ(θ*, r*) Υ(θ*, r*)^T ]
= E_H[ ( ρ_1 M(y, θ*)^T r*, ρ_1 m(y, θ*) )^T ( ρ_1 r*^T M(y, θ*), ρ_1 m(y, θ*)^T ) ]
= E_H[ ρ_1² M(y, θ*)^T r* r*^T M(y, θ*)   ρ_1² M(y, θ*)^T r* m(y, θ*)^T
        ρ_1² m(y, θ*) r*^T M(y, θ*)       ρ_1² m(y, θ*) m(y, θ*)^T ],

Q = E_H[ ∂Υ(θ*, r*)/∂(θ, r)^T ]
= E_H[ ρ_1 {∂(M^T r*)/∂θ^T} + ρ_2 M^T r* r*^T M   ρ_1 M^T + ρ_2 M^T r* m(y, θ*)^T
        ρ_1 M + ρ_2 m(y, θ*) r*^T M               ρ_2 m(y, θ*) m(y, θ*)^T ],

where M = M(y, θ*), and the derivative with respect to θ can be written as

∂(M(y, θ)^T r)/∂θ^T = [ (∂²m_1/∂β∂β^T, ∂²m_2/∂β∂β^T) r   (∂²m_1/∂ϕ∂β, ∂²m_2/∂ϕ∂β) r
                        (∂²m_1/∂β∂ϕ, ∂²m_2/∂β∂ϕ) r       (∂²m_1/∂ϕ², ∂²m_2/∂ϕ²) r ]. (19)

The entries in (19) are calculated as follows:

∂²m_1/∂β∂β^T = −∂²µ_i/∂β∂β^T,  ∂²m_1/∂ϕ∂β = ∂²m_1/∂β∂ϕ = 0,  ∂²m_1/∂ϕ² = 0,
∂²m_2/∂β∂β^T = 2(µ_i(β) − y_i) ∂²µ_i/∂β∂β^T + 2 (∂µ_i/∂β)(∂µ_i/∂β^T) − ϕ ∂{ (∂V(µ_i(β))/∂µ_i) ∂µ_i/∂β }/∂β^T,
∂²m_2/∂ϕ∂β = ∂²m_2/∂β∂ϕ = −(∂V(µ_i(β))/∂µ_i) ∂µ_i/∂β,  ∂²m_2/∂ϕ² = 0.

4. Data Analysis

Crowder (1978, Table 3) considered data on germination of the seed varieties of Orobanche. Orobanche is a parasitic plant growing on the roots of flowering plants. In this experiment, two types of Orobanche seeds (O. aegyptiaca 75 and O. aegyptiaca 73) were tested for germination when given two different root extracts (bean and cucumber). The goal of the experiment was to determine the effect of the root extract on inhibiting the growth of the parasitic plant. These data were also analyzed by Breslow and Clayton (1993) to illustrate the use of generalized linear mixed models for overdispersion. In those studies, the authors did not consider aspects of model selection. We now reanalyze the data by applying the proposed quasi-likelihood-based model selection technique. In the Orobanche data, y_i is the proportion of germinated seeds and n_i the number of seeds in the i-th batch (i = 1, ..., n = 21). Here, we apply the quasi-likelihood method to deal with these overdispersed data.
We suppose that E[y_i] = µ_i = π_i and Var[y_i] = ϕV(µ_i) = ϕπ_i(1 − π_i)/n_i, where ϕ is the overdispersion parameter. Furthermore, suppose that each mean π_i is associated with a set of covariates x_i through a logistic link function, namely logit(π_i) = log(π_i/(1 − π_i)) = X_i^T β. Here X_i = (1, x_1i, x_2i, x_1i x_2i)^T, where x_1i = 1 if the root extract is cucumber and 0 if it is bean, and x_2i = 1 if the seed type is O. aegyptiaca 75 and 0 if it is O. aegyptiaca 73. We have five candidate models:

(A) logit(π_i) = β_0
(B) logit(π_i) = β_0 + β_1 x_1i
(C) logit(π_i) = β_0 + β_2 x_2i
(D) logit(π_i) = β_0 + β_1 x_1i + β_2 x_2i
(E) logit(π_i) = β_0 + β_1 x_1i + β_2 x_2i + β_3 x_1i x_2i

We first used the quasi-binomial likelihood method of the free software R (R Development Core Team, 2012) to estimate the regression coefficients in all candidate models; the results are summarized in Table 1. From Table 1 we see that the standard errors of the parameters are adjusted by the overdispersion factor to a reasonable level. Furthermore, the overdispersion parameter also has an impact on hypothesis testing. Comparing model (E) with model (D) by an F-test, we found that the p value is . The p value is 0.25 when model (B) is compared with model (D). Thus these two pairs of models are not significantly different when the p values are compared with the 5% threshold. On the other hand, both the difference between models (A) and (B) and that between models (C) and (D) are significant by the appropriate F-tests.

Table 1: Summaries of quasi-binomial analysis for the Orobanche data
Models            (A)     (B)     (C)     (D)     (E)
Intercept (sd)    (0.15)  (0.15)  (0.27)  (0.22)  (0.25)
Extracts (sd)             (0.21)          (0.21)  (0.34)
Seeds (sd)                        (0.33)  (0.23)  (0.30)
Interaction (sd)                                  (0.42)
Overdispersion

Next we computed the values of the various information criteria; the results are summarized in Table 2, where AIC_BB denotes the AIC based on the beta-binomial model. As can be seen in Table 2, SIC is smallest for model (B) among all candidate models. SIC ranks the models in the following order: (B) > (A) > (D) > (E) > (C). This result is consistent with the results obtained by Crawley (2005, p. ), who used the quasi-likelihood method to analyze the Orobanche data and applied F-tests to compare the full model (E) with the simpler models.
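The F-tests referred to above compare nested quasi-binomial fits by scaling the change in deviance by its degrees of freedom and by the dispersion estimate of the larger model, as in R's anova(..., test = "F"). The numbers below are hypothetical placeholders (the actual Orobanche deviances are not reproduced here); only the form of the statistic is illustrated.

```python
# Quasi-likelihood F-test for nested models. The deviances and dispersion
# below are HYPOTHETICAL illustration values, not the Orobanche results.
dev_reduced, df_reduced = 39.7, 19     # e.g. a model like (D)
dev_full, df_full = 33.3, 18           # e.g. a model like (E)
phi_hat = 1.86                         # Pearson dispersion estimate, full model

F = ((dev_reduced - dev_full) / (df_reduced - df_full)) / phi_hat
print(round(F, 3))   # refer to the F(df_reduced - df_full, df_full) distribution
```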
He found that there is no compelling evidence that the seed types and the interaction should be kept in the model, and that the minimal adequate model is model (B). On the other hand, SIC_T ranks the models in the following order: (C) > (A) > (B) > (D) >

Table 2: Semiparametric model selection criteria for the Orobanche data
Models   (A)   (B)   (C)   (D)   (E)
SIC
SIC_T
AIC_BB

(E), which is quite different from the ranking obtained using SIC. Even in full likelihood analysis, Takeuchi-type information criteria have been reported to be unstable in some cases (Shibata, 1989). Compared with SIC and SIC_T, AIC_BB selects the most complicated model, (E), as the best one.

5. Simulation Studies

We now examine the performance of SIC and SIC_T numerically. The other criterion considered in the simulation studies is AIC (Akaike, 1973). We computed the frequency with which each information criterion selects each candidate model, following the idea of Greven and Kneib (2010). We generated data {(y_i, n_i), i = 1, ..., n}, where y_i is the proportion of successes in n_i trials. The total sample size is fixed at N = Σ_{i=1}^n n_i. As in the germination data set, we consider the case of two covariates x_1 and x_2. First, we generate the numbers of trials n_i. The data are divided into J clusters. Let m_j be the total number of trials in the j-th cluster, so that Σ_{j=1}^J m_j = N. We generated (m_1, ..., m_J) from the multinomial distribution Multinomial(N, p_N). We further suppose that there are v sets of data in every cluster. Given a fixed number of trials m_j, the numbers of trials n_ij of the sets in cluster j are generated from the multinomial distribution Multinomial(m_j, (q_1, ..., q_v)), where q_1 = ... = q_v = 1/v and m_j = Σ_{i=1}^v n_ij. The next step is to generate the proportions of successes y_i. In order to ensure the presence of overdispersion, we suppose that the data follow a mixture model: n_i y_i | P_i ~ Binomial(n_i, P_i), where P_i is a beta random variable with parameters α_i = cπ_i and β_i = c(1 − π_i), with c = α_i + β_i a constant independent of i.
A simple calculation shows that y_i has mean E[y_i] = π_i and variance

Var[y_i] = π_i(1 − π_i){1 + (n_i − 1)/(c + 1)}/n_i,

which is larger than the nominal binomial variance π_i(1 − π_i)/n_i. The degree of overdispersion is controlled by the positive constant c, a smaller c corresponding to larger overdispersion. Finally, π_i is related to the regression parameters through the logistic link function, namely logit(π_i) = β_0 + β_1 x_1i + β_2 x_2i + β_3 x_1i x_2i. The five candidate models in our simulation studies are the same as those considered in Section 4. We considered three scenarios. In each case we set J = 4, N = 2000, p_N = (1/2, 1/4, 1/8, 1/8), n = (40, 60, 80) and c = (20, 40, 60, 80). The true model is logit(π_i) = β_0 + β_1 x_1i, where β_0 = 0.27 and β_1 = . In each case we generated 2000 replicate data sets. In the following tables, AIC denotes Akaike's information criterion based on the binomial model, while AIC_BB refers to the AIC based on the beta-binomial model.

5.1. Scenario I

In this scenario the covariates x_1i and x_2i are binary variables taking the values 0 and 1. The obtained results are summarized in Table 3. From Table 3 we find that when n = 40 and c = 20, AIC_BB chooses the true model (B) in over 76% of the samples. Both SIC and SIC_T select the true model in over 60% of the samples, and SIC is slightly better than SIC_T. AIC ignores the overdispersion and performs poorly, with a selection frequency of only about 45%. When c increases from 20 to 80, the degree of overdispersion decreases; in the extreme case, as c goes to infinity, the beta-binomial distribution tends to the binomial distribution.
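The generating scheme described above can be sketched as follows (our own minimal implementation, roughly in the style of scenario I; the slope value b1 is a placeholder, since β_1 is elided in the source, and only the covariate x_1 is included for brevity):

```python
import math, random

def make_overdispersed_data(N=2000, p_N=(1/2, 1/4, 1/8, 1/8), v=5, c=20,
                            b0=0.27, b1=0.5, seed=1):
    """Cluster totals m_j ~ Multinomial(N, p_N); within cluster j, v sets with
    n_ij ~ Multinomial(m_j, (1/v, ..., 1/v)); then beta-binomial proportions y_i
    via P_i ~ Beta(c*pi_i, c*(1 - pi_i)).  b1 is a placeholder slope."""
    rng = random.Random(seed)
    J = len(p_N)
    m = [0] * J
    for _ in range(N):                      # N categorical draws give the multinomial
        u, acc, jstar = rng.random(), 0.0, J - 1
        for j, p in enumerate(p_N):
            acc += p
            if u < acc:
                jstar = j
                break
        m[jstar] += 1
    data = []
    for j in range(J):
        n = [0] * v
        for _ in range(m[j]):               # n_ij ~ Multinomial(m_j, uniform)
            n[rng.randrange(v)] += 1
        for n_i in n:
            if n_i == 0:
                continue
            x1 = 1.0 if rng.random() < 0.5 else 0.0
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * x1)))
            P = rng.betavariate(c * pi, c * (1.0 - pi))       # beta mixing step
            y = sum(rng.random() < P for _ in range(n_i)) / n_i
            data.append((y, n_i, x1))
    return data

data = make_overdispersed_data()
print(len(data), sum(n for _, n, _ in data))   # about J*v sets, 2000 trials in total
```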

Table 3: The frequency of models selected by various criteria in scenario I
n   c   Criteria    (A)       (B)        (C)     (D)      (E)
        AIC(%)      0(0)      907(45)    0(0)    452(23)  641(32)
        SIC(%)      657(33)   1329(66)   8(0)    6(0)     0(0)
        SIC_T(%)    690(34)   1206(60)   3(0)    47(2)    54(3)
        AIC_BB(%)   0(0)      1524(76)   0(0)    284(14)  192(10)

        AIC(%)      0(0)      1327(66)   0(0)    350(18)  323(16)
        SIC(%)      209(10)   1778(89)   2(0)    11(1)    0(0)
        SIC_T(%)    217(11)   1610(80)   0(0)    96(5)    77(4)
        AIC_BB(%)   0(0)      1519(76)   0(0)    289(14)  192(10)

        AIC(%)      0(0)      1172(59)   0(0)    396(20)  432(22)
        SIC(%)      137(7)    1844(92)   3(0)    15(1)    1(0)
        SIC_T(%)    149(7)    1541(77)   3(0)    88(4)    219(11)
        AIC_BB(%)   0(0)      1544(77)   0(0)    276(14)  180(9)

        AIC(%)      0(0)      1458(73)   0(0)    341(17)  201(10)
        SIC(%)      14(1)     1981(99)   1(0)    4(0)     0(0)
        SIC_T(%)    12(1)     1444(72)   1(0)    227(11)  316(16)
        AIC_BB(%)   0(0)      1567(78)   0(0)    282(14)  151(8)

In every scenario, when n = 40 and c increases from 40 to 60, the selection frequencies of AIC, SIC and SIC_T become higher, while the behavior of AIC_BB seems stable. A similar tendency is observed in the cases where n = 80 and c = (40, 60). The frequencies of models selected by the various criteria when n = 60 and c = (20, 40, 60, 80) are intermediate between the cases n = 40, c = (20, 80) and n = 80, c = (20, 80); we omit them in all scenarios. When the sample size is n = 80, AIC_BB selects the true model in 77% of the samples. SIC behaves better than AIC_BB in all cases; it chooses the true model in 92% of the samples and achieves the highest rate, 99%, when n = 80 and c = 80. SIC_T behaves better than AIC but a little worse than AIC_BB. As c increases, AIC, SIC and SIC_T improve, while AIC_BB remains stable.

5.2. Scenario II

In this scenario, the covariate x_1i was generated from the standard normal distribution N(0, 1) and x_2i from the uniform distribution U(0, 1). The obtained results are shown in Table 4.
Table 4: The frequency of models selected by various criteria in scenario II (counts over 2000 replications, percentages in parentheses; rows grouped by sample size n and overdispersion constant c)

 n   c   Criteria     (A)       (B)        (C)      (D)       (E)
 40  20  AIC(%)       0(0)      677(34)    0(0)     507(25)   816(41)
         SIC(%)       927(46)   1070(54)   3(0)     0(0)      0(0)
         SIC_T(%)     993(50)   984(49)    3(0)     11(1)     9(0)
         AIC_BB(%)    0(0)      1504(75)   0(0)     290(14)   206(10)
 40  80  AIC(%)       0(0)      1175(59)   0(0)     443(22)   382(19)
         SIC(%)       643(32)   1356(68)   0(0)     1(0)      0(0)
         SIC_T(%)     708(35)   1243(62)   0(0)     29(1)     20(1)
         AIC_BB(%)    0(0)      1474(74)   0(0)     310(16)   216(11)
 80  20  AIC(%)       0(0)      930(46)    0(0)     493(25)   577(29)
         SIC(%)       143(7)    1744(87)   106(5)   4(0)      3(0)
         SIC_T(%)     156(8)    1678(84)   98(5)    44(2)     24(1)
         AIC_BB(%)    0(0)      1530(76)   0(0)     267(13)   203(10)
 80  80  AIC(%)       0(0)      1354(68)   0(0)     365(18)   281(14)
         SIC(%)       35(2)     1936(97)   27(1)    2(0)      0(0)
         SIC_T(%)     42(2)     1758(88)   26(1)    106(5)    68(3)
         AIC_BB(%)    0(0)      1546(77)   0(0)     290(14)   164(8)

In Table 4, when the sample size is n = 40, both SIC and SIC_T perform better than AIC but worse than AIC_BB. When c increases from 20 to 80, the selection rate of AIC_BB stays essentially constant at about 74% for the true model, while the selection rate of SIC increases from 54% to 68% and that of SIC_T increases from 49% to 62%. When the sample size is set to n = 80, AIC_BB performs stably with selection rates of around 76%. Both SIC and SIC_T perform better than AIC_BB in all cases, and SIC behaves more stably than SIC_T.

Scenario III

In this scenario both x_1i and x_2i were generated independently from the standard normal distribution N(0, 1). The results are shown in Table 5.

Table 5: The frequency of models selected by various criteria in scenario III (counts over 2000 replications, percentages in parentheses; rows grouped by sample size n and overdispersion constant c)

 n   c   Criteria     (A)       (B)        (C)      (D)       (E)
 40  20  AIC(%)       0(0)      704(35)    0(0)     526(26)   770(38)
         SIC(%)       923(46)   1077(54)   0(0)     0(0)      0(0)
         SIC_T(%)     988(49)   987(49)    0(0)     9(0)      16(1)
         AIC_BB(%)    0(0)      1501(75)   0(0)     289(14)   210(10)
 40  80  AIC(%)       0(0)      1180(59)   0(0)     396(20)   424(21)
         SIC(%)       642(32)   1357(68)   0(0)     1(0)      0(0)
         SIC_T(%)     703(35)   1253(63)   0(0)     25(1)     19(1)
         AIC_BB(%)    0(0)      1503(75)   0(0)     285(14)   212(11)
 80  20  AIC(%)       0(0)      871(44)    0(0)     470(24)   659(33)
         SIC(%)       175(9)    1766(88)   46(2)    11(1)     2(0)
         SIC_T(%)     185(9)    1707(85)   39(2)    44(2)     25(1)
         AIC_BB(%)    0(0)      1506(75)   0(0)     278(14)   216(11)
 80  80  AIC(%)       0(0)      1314(66)   0(0)     398(20)   288(14)
         SIC(%)       41(2)     1946(97)   10(1)    3(0)      0(0)
         SIC_T(%)     51(3)     1769(88)   9(0)     102(5)    69(3)
         AIC_BB(%)    0(0)      1519(76)   0(0)     322(16)   159(8)

In Table 5, when the sample size is n = 40, both SIC and SIC_T perform better than AIC but worse than AIC_BB. When c increases from 20 to 80, the selection rates of AIC_BB are relatively stable, with the true model selected in more than 75% of the samples; SIC improves somewhat, its selection rate of the true model increasing from 54% to 68%, while that of SIC_T increases from 49% to 63%. When the sample size is n = 80, AIC_BB again performs stably with selection rates of around 76%, and both SIC and SIC_T perform better than AIC_BB in all cases.
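Selection-frequency tables of this kind are produced by repeatedly simulating data and tallying which candidate model each criterion picks. The paper's SIC is not reproduced here; as a simplified stand-in for a quasi-likelihood-based criterion, the Python sketch below uses QAIC, i.e. AIC with the log-likelihood rescaled by an estimated dispersion as in Anderson et al. (1994), and contrasts it with the naive binomial AIC on beta-binomial data in which the covariate has no real effect. The design choices (two candidate models M0 and M1, 30 clusters of 40 trials, π = 0.5, c = 20, 300 replications) are illustrative assumptions, not the paper's settings.

```python
import math, random

rng = random.Random(42)

def simulate(m, n, pi, c):
    # m clusters of n Bernoulli trials; cluster-level success probability
    # p ~ Beta(c*pi, c*(1-pi)) induces beta-binomial overdispersion.
    # First m//2 clusters get x = 0, the rest x = 1; x has NO true effect.
    data = []
    for i in range(m):
        p = rng.betavariate(c * pi, c * (1 - pi))
        y = sum(rng.random() < p for _ in range(n))
        data.append((i >= m // 2, y))
    return data

def loglik(data, n, fit):
    # binomial log-likelihood, up to an additive constant
    ll = 0.0
    for x, y in data:
        p = fit(x)
        ll += y * math.log(p) + (n - y) * math.log(1 - p)
    return ll

aic_picks_m1 = qaic_picks_m1 = 0
R, m, n = 300, 30, 40
for _ in range(R):
    data = simulate(m, n, 0.5, 20.0)
    p_all = sum(y for _, y in data) / (m * n)                  # M0: common pi
    p_grp = [sum(y for x, y in data if x == g) / (m // 2 * n)  # M1: pi by group
             for g in (False, True)]
    ll0 = loglik(data, n, lambda x: p_all)
    ll1 = loglik(data, n, lambda x: p_grp[x])
    # dispersion estimate from the Pearson statistic of the larger model
    chat = sum((y - n * p_grp[x]) ** 2 / (n * p_grp[x] * (1 - p_grp[x]))
               for x, y in data) / (m - 2)
    aic_picks_m1 += (-2 * ll1 + 2 * 2) < (-2 * ll0 + 2 * 1)
    qaic_picks_m1 += (-2 * ll1 / chat + 2 * 2) < (-2 * ll0 / chat + 2 * 1)
```

Since x has no effect, selecting M1 is overfitting; under overdispersion the naive AIC does so far more often than the dispersion-corrected QAIC, mirroring the tendency of AIC to favor the overparametrized models (D) and (E) in the tables above.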

6. Conclusions

In this paper we proposed two information criteria, SIC and SIC_T, for selecting semiparametric models in conjunction with quasi-likelihood methodologies. These model selection techniques complement the usual deviance analyses for generalized linear models. We demonstrated the potential usefulness of the proposed information criteria through both data analysis and simulation studies.

In our simulation studies we generated the data from the beta-binomial model, so that the standard Akaike information criterion could be used. In this sense the comparison is unfair to our proposed methods, since both SIC and SIC_T use only partial information on the first and second moments. Nevertheless, our simulation results show that both SIC and SIC_T are comparable with AIC in terms of the frequency with which the true model is selected.

When trace(Ŝ Q̂⁻¹) = k, the Takeuchi-type information criterion SIC_T reduces to SIC; this occurs when the model is correctly specified. However, our simulation results show that SIC_T may be unstable, for a couple of reasons. One reason is the difficulty of estimating S and Q. The instability of the Takeuchi-type information criterion has been reported even in full-likelihood analyses (Shibata, 1989). The results in Section 5 suggest that both SIC and SIC_T may be used when the sample size is moderately large.

Our simulation results also indicate that two factors affect the performance of SIC and SIC_T: the sample size n and the degree of overdispersion c. When the sample size is fixed and c increases, the frequencies with which both SIC and SIC_T select the true model also increase. When the sample size n increases, both SIC and SIC_T attain higher true-model selection percentages. As often pointed out in the literature, AIC tends to select overparametrized models; this phenomenon is also observed in the results shown in Tables 3 to 5.
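The condition that the trace term equals k under correct specification has a simple one-parameter illustration. The Python sketch below is our illustration of the general Takeuchi-type trace penalty, not the paper's SIC_T, and the Poisson versus negative-binomial setup is an assumption. For an intercept-only Poisson model with log link, the outer-product score estimate divided by the observed information reduces to the sample variance over the mean, so the penalty is close to k = 1 for correctly specified Poisson data and exceeds 1 under overdispersion.

```python
import math, random

rng = random.Random(7)

def rpois(lam):
    # Knuth's Poisson sampler (adequate for moderate lam)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def trace_penalty(y):
    # Intercept-only Poisson fit with log link: theta = log(lambda),
    # MLE lambda_hat = mean(y).  Per-observation score u_i = y_i - lambda_hat,
    # observed information H = lambda_hat, outer-product estimate
    # J = mean(u_i^2); hence trace(J H^{-1}) = sample variance / sample mean.
    lam = sum(y) / len(y)
    J = sum((yi - lam) ** 2 for yi in y) / len(y)
    return J / lam

y_pois = [rpois(5.0) for _ in range(4000)]                       # correctly specified
y_nb = [rpois(rng.gammavariate(2.0, 2.5)) for _ in range(4000)]  # Poisson-gamma mixture

pen_pois = trace_penalty(y_pois)  # close to k = 1
pen_nb = trace_penalty(y_nb)      # well above 1 (theoretical value 17.5/5 = 3.5 here)
```

This makes concrete why estimating the trace term is delicate: the penalty depends on a variance-type quantity whose sampling fluctuation feeds directly into the criterion, one source of the instability of SIC_T noted above.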
AIC based on the inappropriate binomial model is particularly prone to selecting the most complicated model, (E). AIC based on the beta-binomial distribution is clearly better but shows the same tendency. The quasi-likelihood-based model selection criteria SIC and SIC_T, on the other hand, tend to select simpler models. SIC appears more stable than SIC_T in our simulation studies. Because of the relative instability and computational complexity of SIC_T, when both criteria are available, use of SIC is the more conservative choice.

Some problems remain unsolved. For example, the performance of SIC and SIC_T deteriorates when the true model becomes more complicated. We will address these problems in future work.

Acknowledgments

This paper is part of my Ph.D. project. I would like to thank my supervisor, Professor Jinfang Wang, for his kind supervision during these years. I am grateful to the editor and the anonymous referees for their valuable suggestions, which improved the quality of the paper.

REFERENCES

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory, B. N. Petrov and F. Csaki (eds), Budapest: Akademiai Kiado.
Anderson, D. R., Burnham, K. P. and White, G. C. (1994). AIC model selection in overdispersed capture-recapture data. Ecology, 75.
Breslow, N. E. and Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. J. Am. Statist. Ass., 88.

Chen, X., Hong, H. and Shum, M. (2007). Nonparametric likelihood ratio model selection tests between parametric likelihood and moment condition models. Journal of Econometrics, 141.
Crawley, M. J. (2005). Statistics: An Introduction Using R. John Wiley & Sons.
Crowder, M. J. (1978). Beta-binomial ANOVA for proportions. Appl. Statist., 27.
Efron, B. (1986). Double exponential families and their use in generalized linear regression. J. Am. Statist. Ass., 81.
Fitzmaurice, G. M. (1997). Model selection with overdispersed data. Statistician, 46.
Greven, S. and Kneib, T. (2010). On the behaviour of marginal and conditional AIC in linear mixed models. Biometrika, 97.
Hurvich, C. M. and Tsai, C.-L. (1995). Model selection for extended quasi-likelihood models in small samples. Biometrics, 51.
Kitamura, Y. (2006). Empirical likelihood methods in econometrics: theory and practice. Unpublished manuscript, Yale University.
Lawless, J. F. (1987). Negative binomial and mixed Poisson regression. Canadian J. Statist., 15.
McCullagh, P. (1983). Quasi-likelihood functions. Ann. Statist., 11.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. Chapman and Hall.
Moore, D. F. (1986). Asymptotic properties of moment estimators for overdispersed counts and proportions. Biometrika, 73.
Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models. J. R. Statist. Soc. A, 135.
R Development Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Shibata, R. (1989). Statistical aspect of model selection. In From Data to Model, Ed. J. C. Willems, Springer.
Takeuchi, K. (1976). Distribution of informational statistics and criteria for adequacy of models. Mathematical Sciences (in Japanese), 153.
Wedderburn, R. W. M. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika, 61.
Williams, D. A. (1982). Extra-binomial variation in logistic-linear models. Appl. Statist., 31.

(Received: December 26, 2012; Accepted: September 23, 2013)


More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Linear Models in Machine Learning

Linear Models in Machine Learning CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,

More information

QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS

QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS Statistica Sinica 15(05), 1015-1032 QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS Scarlett L. Bellamy 1, Yi Li 2, Xihong

More information

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)

More information

Trends in Human Development Index of European Union

Trends in Human Development Index of European Union Trends in Human Development Index of European Union Department of Statistics, Hacettepe University, Beytepe, Ankara, Turkey spxl@hacettepe.edu.tr, deryacal@hacettepe.edu.tr Abstract: The Human Development

More information

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches

1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department

More information

Generalized linear models

Generalized linear models Generalized linear models Søren Højsgaard Department of Mathematical Sciences Aalborg University, Denmark October 29, 202 Contents Densities for generalized linear models. Mean and variance...............................

More information

A NOTE ON BAYESIAN ESTIMATION FOR THE NEGATIVE-BINOMIAL MODEL. Y. L. Lio

A NOTE ON BAYESIAN ESTIMATION FOR THE NEGATIVE-BINOMIAL MODEL. Y. L. Lio Pliska Stud. Math. Bulgar. 19 (2009), 207 216 STUDIA MATHEMATICA BULGARICA A NOTE ON BAYESIAN ESTIMATION FOR THE NEGATIVE-BINOMIAL MODEL Y. L. Lio The Negative Binomial model, which is generated by a simple

More information

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl

More information

Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in the Niger Delta

Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in the Niger Delta International Journal of Science and Engineering Investigations vol. 7, issue 77, June 2018 ISSN: 2251-8843 Application of Poisson and Negative Binomial Regression Models in Modelling Oil Spill Data in

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

2.2 Classical Regression in the Time Series Context

2.2 Classical Regression in the Time Series Context 48 2 Time Series Regression and Exploratory Data Analysis context, and therefore we include some material on transformations and other techniques useful in exploratory data analysis. 2.2 Classical Regression

More information

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/

Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/ Tento projekt je spolufinancován Evropským sociálním fondem a Státním rozpočtem ČR InoBio CZ.1.07/2.2.00/28.0018 Statistical Analysis in Ecology using R Linear Models/GLM Ing. Daniel Volařík, Ph.D. 13.

More information

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two

More information

Linear Regression Models P8111

Linear Regression Models P8111 Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages

More information

i=1 h n (ˆθ n ) = 0. (2)

i=1 h n (ˆθ n ) = 0. (2) Stat 8112 Lecture Notes Unbiased Estimating Equations Charles J. Geyer April 29, 2012 1 Introduction In this handout we generalize the notion of maximum likelihood estimation to solution of unbiased estimating

More information

Last lecture 1/35. General optimization problems Newton Raphson Fisher scoring Quasi Newton

Last lecture 1/35. General optimization problems Newton Raphson Fisher scoring Quasi Newton EM Algorithm Last lecture 1/35 General optimization problems Newton Raphson Fisher scoring Quasi Newton Nonlinear regression models Gauss-Newton Generalized linear models Iteratively reweighted least squares

More information

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

STA216: Generalized Linear Models. Lecture 1. Review and Introduction STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general

More information

Reconstruction of individual patient data for meta analysis via Bayesian approach

Reconstruction of individual patient data for meta analysis via Bayesian approach Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi

More information

Multinomial Logistic Regression Models

Multinomial Logistic Regression Models Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information