GENERAL REGRESSION MODELS
Jon Wakefield, Stat/Biostat 571 (2009)

We consider the class of Generalized Linear Mixed Models (GLMMs) and non-linear mixed effects models (NLMEMs). In this chapter we will again consider both a conditional approach to modeling, via the introduction of random effects, and a marginal approach using GEEs. Likelihood and Bayesian methods will be used for inference in the conditional approach. First we briefly review generalized linear models and non-linear models for independent data.

Generalized Linear Models

GLMs provide a very useful extension to the linear model class. GLMs specify stochastic and deterministic components: the responses Y_i follow an exponential family, so that the distribution is of the form

    p(y_i | θ_i, α) = exp( {y_i θ_i - b(θ_i)}/α + c(y_i, α) ),    (41)

where θ_i and α are scalars. This is sometimes referred to as a linear or natural exponential family. It is straightforward to show that

    E[Y_i | θ_i, α] = µ_i = b'(θ_i)   and   var(Y_i | θ_i, α) = b''(θ_i) α,

for i = 1, ..., n, with cov(Y_i, Y_j | θ_i, θ_j, α) = 0 for i ≠ j.

The link function g(·) provides the connection between µ = E[Y | θ, α] and the linear predictor xβ, via g(µ) = xβ, where x is a (k+1) × 1 vector of explanatory variables (including a 1 for the intercept) and β is a (k+1) × 1 vector of regression parameters.

If α is known, this is a one-parameter (natural) exponential family model and there is a (k+1)-dimensional sufficient statistic for β. If α is unknown, the distribution may or may not be a two-parameter exponential family model.

Likelihood Inference

We now derive the score vector and information matrix. For an independent sample from the exponential family (41), we have

    l(θ) = Σ_{i=1}^n [ {y_i θ_i - b(θ_i)}/α + c(y_i, α) ] = Σ_{i=1}^n l_i(θ),

where θ = [θ_1(β), ..., θ_n(β)] is the vector of canonical parameters. Using the chain rule we may write

    S(β) = ∂l/∂β = Σ_{i=1}^n (∂l_i/∂θ_i)(dθ_i/dµ_i)(∂µ_i/∂β) = Σ_{i=1}^n [ {Y_i - b'(θ_i)}/α ] V_i^{-1} (∂µ_i/∂β),

where V_i = var(Y_i | β)/α and d²b/dθ_i² = dµ_i/dθ_i = V_i.

Hence

    S(β) = Σ_{i=1}^n (∂µ_i/∂β)^T {Y_i - E[Y_i | µ_i]} / var(Y_i | µ_i) = D^T V^{-1} {Y - µ(β)},    (42)

where D is the n × p matrix with elements ∂µ_i/∂β_j, i = 1, ..., n, j = 1, ..., p, and V is the n × n diagonal matrix with i-th diagonal element var(Y_i | µ_i).

The MLE has asymptotic distribution

    I_n(β)^{1/2} (β̂_n - β) →_d N_p(0, I_p),

where

    I_n(β) = E[S(β) S(β)^T] = D^T V^{-1} D.

In practice we use

    I_n(β̂) = D̂^T V̂^{-1} D̂,

where V̂ and D̂ are V and D evaluated at β̂. Hence an estimator β̂ defined through S(β̂) = 0 will be consistent so long as the mean function is appropriate. The variance of the estimator is appropriately estimated if the second moment is correctly specified.
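Solving S(β̂) = 0 is conveniently done by Fisher scoring, iterating β ← β + I_n(β)^{-1} S(β). Below is a minimal R sketch for a Poisson GLM with log link, on simulated data chosen purely for illustration; for the canonical link this coincides with the iteratively reweighted least squares algorithm used by glm.

# Fisher scoring for a Poisson GLM with log link: solves
# S(beta) = D^T V^{-1} (y - mu) = 0 by iterating beta <- beta + I^{-1} S.
set.seed(1)
n <- 100
x <- cbind(1, runif(n))               # n x 2 design matrix (intercept + covariate)
beta.true <- c(0.5, 1)
y <- rpois(n, exp(x %*% beta.true))   # simulated responses
beta <- c(0, 0)
for (iter in 1:25) {
  mu <- as.vector(exp(x %*% beta))    # mean; for the Poisson, var(Y_i) = mu_i
  D  <- x * mu                        # D has elements dmu_i/dbeta_j = mu_i x_ij
  V  <- diag(mu)                      # diagonal variance matrix
  score <- t(D) %*% solve(V, y - mu)  # S(beta) = D^T V^{-1} (y - mu)
  info  <- t(D) %*% solve(V, D)       # I(beta) = D^T V^{-1} D
  beta  <- as.vector(beta + solve(info, score))
}
cbind(beta, coef(glm(y ~ x[, 2], family = poisson)))  # the two fits should agree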

Nonlinear Regression Models

Consider models of the form

    Y_i = µ_i(β) + ε_i,  i = 1, ..., n,

where µ_i(β) = µ(x_i, β) is nonlinear in x_i, β is assumed to be of dimension k, and E[ε_i | µ_i] = 0, var(ε_i | µ_i) = σ² f(µ_i), with cov(ε_i, ε_j | µ_i, µ_j) = 0. Such models are often used for positive responses, and if such data are modeled on the original scale it is common to find that the variance is of the form f(µ) = µ or f(µ) = µ². An alternative approach that is appropriate in the case of f(µ) = µ² is to log transform the responses and then assume constant errors.

Likelihood Inference

To obtain the likelihood function the probability model for the data must be fully specified. Assume

    Y_i | β, σ² ~ind N{µ_i(β), σ² µ_i(β)^r},  for i = 1, ..., n,

and known r, to give the log-likelihood

    l(β, σ) = -n log σ - (r/2) Σ_{i=1}^n log µ_i(β) - (1/2σ²) Σ_{i=1}^n {Y_i - µ_i(β)}²/µ_i(β)^r.

Differentiation with respect to β and σ yields the score equations

    S_1(β, σ) = -(r/2) Σ_{i=1}^n [1/µ_i(β)] ∂µ_i/∂β + (1/σ²) Σ_{i=1}^n [{Y_i - µ_i(β)}/µ_i(β)^r] ∂µ_i/∂β + (r/2σ²) Σ_{i=1}^n [{Y_i - µ_i(β)}²/µ_i(β)^{r+1}] ∂µ_i/∂β,

    S_2(β, σ) = -n/σ + (1/σ³) Σ_{i=1}^n {Y_i - µ_i(β)}²/µ_i(β)^r.

Notice that we have a pair of quadratic estimating functions here, and if the first two moments are correctly specified then E[S_1] = 0 and E[S_2] = 0.
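As a concrete illustration, the log-likelihood above can be maximized numerically. The R sketch below assumes an illustrative exponential-decay mean function and r = 2 (neither choice is from the notes), with σ log-parameterized to keep it positive:

# ML fit of Y_i ~ N(mu_i(beta), sigma^2 mu_i(beta)^r) with r = 2, for the
# illustrative mean function mu(x, beta) = beta1 * exp(-beta2 * x).
set.seed(2)
x <- seq(0.5, 10, length = 50)
mu.fn <- function(beta, x) beta[1] * exp(-beta[2] * x)
y <- rnorm(50, mu.fn(c(10, 0.4), x), sd = 0.1 * mu.fn(c(10, 0.4), x))

negll <- function(th, r = 2) {
  beta <- th[1:2]; sigma <- exp(th[3])   # log-parameterize sigma > 0
  mu <- mu.fn(beta, x)
  if (any(mu <= 0)) return(Inf)
  # minus the log-likelihood given in the text
  sum(log(sigma) + (r/2) * log(mu) + (y - mu)^2 / (2 * sigma^2 * mu^r))
}
fit <- optim(c(5, 0.1, 0), negll)
c(beta = fit$par[1:2], sigma = exp(fit$par[3]))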

Under the usual regularity conditions,

    I(θ)^{1/2} (θ̂ - θ) →_d N_{k+1}(0, I),

where θ = (β, σ) and I(θ) is Fisher's expected information. In the case of r = 0 we obtain

    l(β, σ) = -n log σ - (1/2σ²) Σ_{i=1}^n {Y_i - µ_i(β)}²
    S_1(β, σ) = (1/σ²) Σ_{i=1}^n {Y_i - µ_i(β)} ∂µ_i/∂β
    S_2(β, σ) = -n/σ + (1/σ³) Σ_{i=1}^n {Y_i - µ_i(β)}²
    I_11 = E[-∂S_1/∂β] = (1/σ²) Σ_{i=1}^n (∂µ_i/∂β)(∂µ_i/∂β)^T
    I_12 = E[-∂S_1/∂σ] = 0^T
    I_21 = E[-∂S_2/∂β] = 0
    I_22 = E[-∂S_2/∂σ] = 2n/σ²

Identifiability

For many nonlinear models identifiability is an issue. Specifically, the same curve may be obtained with different parameter values. For example, consider the sum-of-exponentials model

    µ(x, β) = β_0 exp(-x β_1) + β_2 exp(-x β_3),

where β = (β_0, β_1, β_2, β_3) and β_j > 0, j = 0, 1, 2, 3. The same curve results under the parameter sets (β_0, β_1, β_2, β_3) and (β_2, β_3, β_0, β_1), and so we have non-identifiability. A solution to this problem, and to ensure that the parameters can only take admissible values, is to work with the set

    γ = (log β_0, log(β_3 - β_1), log β_2, log β_1),

which constrains β_3 > β_1 > 0.
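A two-line check of the flip-flop in R, with illustrative parameter values of my choosing:

# The sum-of-exponentials flip-flop: swapped parameter pairs give one curve.
mu <- function(beta, x) beta[1] * exp(-x * beta[2]) + beta[3] * exp(-x * beta[4])
x  <- seq(0, 10, by = 0.1)
b1 <- c(2, 0.3, 5, 1.5)          # (beta0, beta1, beta2, beta3)
b2 <- c(5, 1.5, 2, 0.3)          # the swapped set (beta2, beta3, beta0, beta1)
max(abs(mu(b1, x) - mu(b2, x)))  # 0: the two curves are identical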

Example of a non-linear model: One-compartment open model

Let w_i(x) represent the amount of drug in compartment i, i = 0, 1, at time x. Assume:

    dw_0/dx = -k_a w_0
    dw_1/dx = k_a w_0 - k_e w_1

where k_a is the absorption rate and k_e is the elimination rate. This leads to

    µ(x) = [D k_a / {V (k_a - k_e)}] {exp(-k_e x) - exp(-k_a x)}.

Note: non-identifiability (flip-flop). Assume

    Y_i | β, σ² ~ LogNormal{µ(x_i), σ²}.

This mimics assay precision with a constant coefficient of variation.

Pharmacokinetic Data Analysis

Concentration (y) as a function of time (x), obtained from a new-born baby following the administration of a dose of Theophylline.

[Figure: observed concentration (mg/liter) versus time (hours), with the accompanying data table.]
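The mean function is easy to code, and the flip-flop can be exhibited directly: swapping k_a and k_e while rescaling V leaves the curve unchanged. The dose, volume and rate constants below are illustrative values, not those of the Theophylline data:

# One-compartment open model mean:
# mu(x) = D ka / (V (ka - ke)) * (exp(-ke x) - exp(-ka x))
onecomp <- function(x, D, V, ka, ke) {
  D * ka / (V * (ka - ke)) * (exp(-ke * x) - exp(-ka * x))
}
x <- seq(0, 24, by = 0.25)
plot(x, onecomp(x, D = 25, V = 10, ka = 1.5, ke = 0.1), type = "l",
     xlab = "Time (hours)", ylab = "Concentration")
# Flip-flop: swap ka and ke and rescale V by ke/ka; the curve is unchanged.
max(abs(onecomp(x, 25, 10, ka = 1.5, ke = 0.1) -
        onecomp(x, 25, 10 * 0.1 / 1.5, ka = 0.1, ke = 1.5)))  # ~0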

Generalized Linear Mixed Models

A GLMM is defined by:

1. Random Component: Y_ij | θ_ij, α ~ p(·), where p(·) is a member of the exponential family, that is,

    p(y_ij | θ_ij, α) = exp[ {y_ij θ_ij - b(θ_ij)}/a(α) + c(y_ij, α) ],

for i = 1, ..., m units and j = 1, ..., n_i measurements per unit.

2. Systematic Component: If µ_ij = E[Y_ij | θ_ij, α], then we have a link function g(·), with

    g(µ_ij) = x_ij β + z_ij b_i,

so that we have introduced random effects into the linear predictor.

The above defines the conditional part of the model. The random effects are then assigned a distribution, and in a GLMM this is assumed to be b_i ~iid N(0, D). We also have var(Y_ij | θ_ij, α) = α v(µ_ij).

Marginal Moments

Mean:

    E[Y_ij] = E{E[Y_ij | b_i]} = E[µ_ij] = E_b[g^{-1}(x_ij β + z_ij b_i)].

Variance:

    var(Y_ij) = E[var(Y_ij | b_i)] + var(E[Y_ij | b_i]) = α E_b[v{g^{-1}(x_ij β + z_ij b_i)}] + var_b[g^{-1}(x_ij β + z_ij b_i)].

Covariance:

    cov(Y_ij, Y_ik) = E[cov(Y_ij, Y_ik | b_i)] + cov(E[Y_ij | b_i], E[Y_ik | b_i]) = cov{g^{-1}(x_ij β + z_ij b_i), g^{-1}(x_ik β + z_ik b_i)},

for j ≠ k, due to shared random effects, and cov(Y_ij, Y_lk) = 0 for i ≠ l, as there are no shared random effects.

Example: Log-Linear Regression for Seizure Data

Data on seizures were collected on 59 epileptics. For each patient the number of epileptic seizures was recorded during a baseline period of eight weeks, after which patients were randomized to treatment with the anti-epileptic drug progabide, or to placebo. The number of seizures was then recorded in four consecutive two-week periods. The age of the patient was also available. Figures 18-20 contain summaries.

[Figure 18: Number of seizures for selected individuals over time, placebo group.]

[Figure 19: Number of seizures for selected individuals over time, progabide group.]

[Figure 20: Summaries for seizure data: baseline counts by treatment (0 = placebo, 1 = progabide); average seizures versus time; variance versus mean of seizures; seizures by age (years).]

A model for the seizure data

Let

    Y_ij  = number of seizures on patient i at occasion j,
    t_ij  = observation period on patient i at occasion j,
    x_i1  = 0/1 if patient i was assigned placebo/progabide,
    x_ij2 = 0/1 if j = 0 / j = 1, 2, 3, 4,

with t_ij = 8 if j = 0 and t_ij = 2 if j = 1, 2, 3, 4, for i = 1, ..., 59. The question of primary scientific interest here is whether progabide reduces the number of seizures. A marginal mean model is given by

    E[Y_ij] = t_ij exp(β_0 + β_1 x_i1 + β_2 x_ij2 + β_3 x_i1 x_ij2).

    Group       j = 0 period    j = 1, 2, 3, 4 period
    Placebo     β_0             β_0 + β_2
    Progabide   β_0 + β_1       β_0 + β_1 + β_2 + β_3

    Table 1: Parameter interpretation.

Precise definitions:

- exp(β_0) is the expected number of seizures in the placebo group in time period 0;
- exp(β_1) is the ratio of the expected seizure rate in the progabide group, compared to the placebo group, in time period 0;
- exp(β_2) is the ratio of the expected seizure rate at times j = 1, 2, 3, 4, as compared to j = 0, in the placebo group;
- exp(β_3) is the ratio of the expected seizure rates in the progabide group in the j = 1, 2, 3, 4 period, as compared to the placebo group, in the same period.

Hence exp(β_3) is the parameter of interest. More colloquially:

    β_0  INTERCEPT
    β_1  BASELINE TREATMENT GROUP EFFECT
    β_2  PERIOD EFFECT
    β_3  TREATMENT × PERIOD EFFECT

Mixed Effects Model for Seizure Data

Stage 1: Y_ij | β, b_i ~ind Poisson(µ_ij), with

    g(µ_ij) = log µ_ij = log t_ij + x_ij β + b_i,

where x_ij β = β_0 + β_1 x_i1 + β_2 x_ij2 + β_3 x_i1 x_ij2. Hence

    E[Y_ij | b_i] = µ_ij = t_ij exp(x_ij β + b_i),   var(Y_ij | b_i) = µ_ij.

Stage 2: b_i ~iid N(0, σ²).

The marginal mean is given by E[Y_ij] = t_ij exp(x_ij β + σ²/2), and the marginal median by t_ij exp(x_ij β).

The marginal variance is given by

    var(Y_ij) = E[µ_ij] + var(µ_ij) = E[Y_ij]{1 + E[Y_ij](e^{σ²} - 1)} = E[Y_ij](1 + E[Y_ij] κ),

where κ = e^{σ²} - 1 > 0, illustrating excess-Poisson variation, which increases as σ² increases. For the marginal covariance,

    cov(Y_ij, Y_ik) = cov{t_ij exp(x_ij β + b_i), t_ik exp(x_ik β + b_i)} = t_ij t_ik exp(x_ij β + x_ik β) e^{σ²}(e^{σ²} - 1) = E[Y_ij] E[Y_ik] κ.

Hence for individual i we have variance-covariance matrix

    [ µ_i1 + µ_i1² κ    µ_i1 µ_i2 κ       ...   µ_i1 µ_in_i κ       ]
    [ µ_i2 µ_i1 κ       µ_i2 + µ_i2² κ    ...   µ_i2 µ_in_i κ       ]
    [ ...                                                           ]
    [ µ_in_i µ_i1 κ     µ_in_i µ_i2 κ     ...   µ_in_i + µ_in_i² κ  ],

where κ = e^{σ²} - 1 > 0. A deficiency of this model is that we only have a single parameter (σ²) to control both excess-Poisson variability and dependence.
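These moment formulas are easily checked by simulation. A small sketch, for a single observation with illustrative values of t_ij, x_ij β and σ² (all my choices):

# Monte Carlo check of the marginal moments of the Poisson-lognormal model.
set.seed(3)
t.ij <- 2; xb <- 0.5; sigma2 <- 0.25
b <- rnorm(1e6, 0, sqrt(sigma2))
y <- rpois(1e6, t.ij * exp(xb + b))
kappa <- exp(sigma2) - 1
m <- t.ij * exp(xb + sigma2/2)        # theoretical marginal mean
c(mean(y), m)                         # simulated versus theoretical E[Y_ij]
c(var(y), m * (1 + m * kappa))        # simulated versus theoretical var(Y_ij)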

Likelihood Inference

In general there are two approaches to inference from a likelihood perspective:

1. Carry out conditional inference in order to eliminate the random effects.
2. Make a distributional assumption for b_i, and then carry out likelihood inference (using some form of approximation to evaluate the required integrals).

We first consider the conditional likelihood approach.

Conditional Likelihood

Recall the definition of conditional likelihood. Suppose the distribution of the data may be factored as

    p(y | β, γ) = h(y) p(t_1, t_2 | β, γ) = h(y) p(t_1 | t_2, β) p(t_2 | β, γ),

where we choose to ignore the second term and consider the conditional likelihood

    L_c(β) = p(t_1 | t_2, β) = p(t_1, t_2 | β, γ) / p(t_2 | β, γ).

Maximizing the conditional likelihood yields an estimator β̂_c with the usual properties, for example

    I_c(β)^{1/2} (β̂_c - β) →_d N(0, I),

where I_c(β) is the expected information derived from the conditional likelihood.

Conditional Likelihood for GLMMs

In the context of GLMMs we have

    L_c(β) = ∏_{i=1}^m p(t_1i | t_2i, β) = ∏_{i=1}^m p(t_1i, t_2i | β, b_i) / p(t_2i | β, b_i),

where

    p(t_1i, t_2i | β, b_i) ∝ p(y_i | β, b_i)

and

    p(t_2i | β, b_i) = Σ_{u_i ∈ S_2i} p(u_i, t_2i | β, b_i),

and S_2i is the set of values of y_i such that T_2i = t_2i, a set of disjoint events. The different notation is to emphasize that T_1i takes on values different to t_1i.

For simplicity we assume the canonical link function, g(µ_ij) = θ_ij = x_ij β + z_ij b_i, and assume α = 1. Viewing b_i as fixed effects we have the likelihood

    L(β, b) = exp{ Σ_{i=1}^m Σ_{j=1}^{n_i} [ y_ij x_ij β + y_ij z_ij b_i - b(x_ij β + z_ij b_i) ] },

so that

    t_1 = Σ_{i=1}^m t_1i = Σ_{i=1}^m Σ_{j=1}^{n_i} y_ij x_ij   and   t_2i = Σ_{j=1}^{n_i} y_ij z_ij.

We emphasize that no distribution has been specified for the b_i; they are being viewed as fixed effects.

Conditional Likelihood for the Poisson GLMM

Assume for simplicity that z_ij b_i = b_i, so that we have the random intercepts only model. Also, in an obvious change of notation,

    x_ij β + x_i β_1 + b_i = x_ij β + γ_i,

so that β are the regression parameters associated with covariates that change within an individual. Then

    p(y | β, γ) = ∏_{i=1}^m p(y_i | β, γ_i) = ∏_{i=1}^m exp( -Σ_{j=1}^{n_i} µ_ij + Σ_{j=1}^{n_i} y_ij log µ_ij ) / ∏_{j=1}^{n_i} y_ij!
                = c_1 ∏_{i=1}^m exp( -µ_i+ + y_i+ γ_i + Σ_{j=1}^{n_i} y_ij log{t_ij exp(x_ij β)} ),

where c_1 = 1/∏_i ∏_j y_ij! and µ_i+ = Σ_{j=1}^{n_i} t_ij exp(x_ij β + γ_i).

In this case the distribution of the conditioning statistic is straightforward:

    y_i+ | β, γ_i ~ Poisson(µ_i+),

so that

    p(y_1+, ..., y_m+ | β, γ) = c_2 ∏_{i=1}^m exp( -µ_i+ + y_i+ log µ_i+ )
                              = c_2 ∏_{i=1}^m exp( -µ_i+ + y_i+ γ_i + y_i+ log[ Σ_{j=1}^{n_i} t_ij exp(x_ij β) ] ),

where c_2 = 1/∏_i y_i+!. Hence

    p(y | y_1+, ..., y_m+, β) = p(y | β, γ) / p(y_1+, ..., y_m+ | β, γ),

which is given by

    [ c_1 ∏_{i=1}^m exp( -µ_i+ + y_i+ γ_i + Σ_{j=1}^{n_i} y_ij log{t_ij exp(x_ij β)} ) ] / [ c_2 ∏_{i=1}^m exp( -µ_i+ + y_i+ γ_i + y_i+ log Σ_{j=1}^{n_i} t_ij exp(x_ij β) ) ].

After simplification:

    p(y | y_1+, ..., y_m+, β) = (c_1/c_2) ∏_{i=1}^m [ ∏_{j=1}^{n_i} {t_ij exp(x_ij β)}^{y_ij} ] / [ Σ_{l=1}^{n_i} t_il exp(x_il β) ]^{y_i+}
                              = ∏_{i=1}^m ( y_i+ choose y_i1 ... y_in_i ) ∏_{j=1}^{n_i} [ t_ij exp(x_ij β) / Σ_{l=1}^{n_i} t_il exp(x_il β) ]^{y_ij},

which is a multinomial likelihood (we have conditioned a set of Poisson counts on their total, so obvious!):

    y_i | y_i+, β ~ Mult_{n_i}(y_i+, π_i),

where π_i^T = (π_i1, ..., π_in_i) and

    π_ij = t_ij exp(x_ij β) / Σ_{l=1}^{n_i} t_il exp(x_il β).

Conditional Likelihood for the Seizure Data

Recall:

    Y_ij  = number of seizures on patient i at occasion j,
    t_ij  = observation period on patient i at occasion j,
    x_i1  = 0/1 if patient i was assigned placebo/progabide,
    x_ij2 = 0/1 if j = 0 / j = 1, 2, 3, 4,

with t_ij = 8 if j = 0 and t_ij = 2 if j = 1, 2, 3, 4, i = 1, ..., 59. A log-linear random intercept model is given by

    log E[Y_ij | b_i] = log t_ij + β_0 + β_1 x_i1 + β_2 x_ij2 + β_3 x_i1 x_ij2 + b_i.

Precise definitions:

- exp(β_0) is the expected number of seizures for a typical individual in the placebo group in time period 0;
- exp(β_1) is the ratio of the expected seizure rate in the progabide group, compared to the placebo group, for a typical individual, i.e. one with b_i = 0, in time period 0;
- exp(β_2) is the ratio of the expected seizure rate at times j = 1, 2, 3, 4, as compared to j = 0, for a typical individual in the placebo group;
- exp(β_3) is the ratio of the expected seizure rates in the progabide group in the j = 1, 2, 3, 4 period, as compared to the placebo group, in the same period, for a typical individual.

Hence exp(β_3) is the parameter of interest.

In the conditional likelihood notation:

    log E[Y_ij | γ_i] = log t_ij + γ_i + β_2 x_ij2 + β_3 x_i1 x_ij2,

where γ_i = β_0 + β_1 x_i1 + b_i, so that we cannot estimate β_1, which is not a parameter of primary interest. Since x_i12 = x_i22 = x_i32 = x_i42 = 1 and t_i0 = 8 = Σ_{j=1}^4 t_ij, we effectively have two observation periods, which we label (slightly abusing our previous notation) j = 0, 1. Let Y_i1 = Σ_{j=1}^4 Y_ij.

For the placebo group:

    Y_i1 | Y_i+ ~ind Binomial(Y_i+, π_0), for the 28 placebo patients, with

    π_0 = exp(β_2) / {1 + exp(β_2)}.

For the progabide group:

    Y_i1 | Y_i+ ~ind Binomial(Y_i+, π_1), for the 31 progabide patients, where

    π_1 = exp(β_2 + β_3) / {1 + exp(β_2 + β_3)}.

This model is straightforward to fit in R (with y1 the count over the four treatment periods and y0 the baseline count):

> xcond <- c(rep(0,28),rep(1,31))
> condmod <- glm(cbind(y1,y0)~xcond,family=binomial)
> summary(condmod)

Call:
glm(formula = cbind(y1, y0) ~ xcond, family = binomial)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) [values lost in transcription]      *
xcond       [values lost in transcription]
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: [value lost] on 58 degrees of freedom
Residual deviance: [value lost] on 57 degrees of freedom

Hence the treatment effect is exp(-0.10) = 0.90, so that the rate of seizures is estimated as 10% less in the progabide group, though this change is not statistically significant.

Conditional Likelihood for the Seizure Data

The overall fit of the random intercept model is poor (the residual deviance greatly exceeds its 57 degrees of freedom). One possibility is to extend the model to allow a random slope for the effect of the treatment period x_ij2, i.e. β_2i = β_2 + b_2i, but a conditional likelihood approach for this model will condition away the information relevant for estimation of β_3. We will examine such a model using a mixed effects approach.

Likelihood Inference in the Mixed Effects Model

As with the linear mixed effects model (LMEM), we maximize L(β, α), where α denotes the variance components in D, and

    L(β, α) = ∏_{i=1}^m ∫ p(y_i | β, b_i) p(b_i | α) db_i.

As with the NLMEM, the required integrals are not available in closed form, and so some sort of analytical or numerical approximation is required.

Example: Log-linear Poisson regression GLMM

With a single random effect we have α = σ².

    L(β, α) = ∏_{i=1}^m ∫ [ ∏_{j=1}^{n_i} exp(-µ_ij) µ_ij^{y_ij} / y_ij! ] (2πσ²)^{-1/2} exp( -b_i²/(2σ²) ) db_i
            ∝ ∏_{i=1}^m exp( Σ_{j=1}^{n_i} y_ij x_ij β ) ∫ exp( -e^{b_i} Σ_{j=1}^{n_i} e^{x_ij β} + y_i+ b_i ) (2πσ²)^{-1/2} exp( -b_i²/(2σ²) ) db_i
            = ∏_{i=1}^m exp( Σ_{j=1}^{n_i} y_ij x_ij β ) ∫ h(b_i) exp{-b_i²/(2σ²)} / (2πσ²)^{1/2} db_i,

with h(b_i) = exp( -e^{b_i} Σ_j e^{x_ij β} + y_i+ b_i ): an integral with respect to a normal random variable (which is analytically intractable).

Integration in the GLMM

As with the NLMEM there are a number of possible approaches for integrating out the random effects, including:

- Analytical approximations, including Laplace, and the closely-related penalized quasi-likelihood approach.
- Gaussian quadrature.
- Importance sampling Monte Carlo.

Overview of Integration Techniques

We describe a number of generic integration techniques, in particular:

- Laplace approximation (an analytical approximation).
- Quadrature (numerical integration).
- Importance sampling (a Monte Carlo method).

Before the MCMC revolution these techniques were used in a Bayesian context.

Laplace Approximation

Let

    I = ∫ exp{n g(θ)} dθ

denote a generic integral of interest, and suppose m is the maximum of g(·). We have

    n g(θ) = n Σ_{k=0}^∞ [(θ - m)^k / k!] g^(k)(m),

where g^(k)(m) represents the k-th derivative of g evaluated at m. Hence

    I = ∫ exp{ n Σ_{k=0}^∞ (θ - m)^k g^(k)(m)/k! } dθ
      ≈ e^{n g(m)} ∫ exp{ -(θ - m)² / (2/[-n g^(2)(m)]) } dθ
      = e^{n g(m)} (2πv)^{1/2} n^{-1/2},

where v = -1/g^(2)(m), and we have ignored terms in cubics or greater in the Taylor series.
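As a numerical illustration, take g(θ) = log θ - θ on (0, ∞), for which I = ∫ θ^n e^{-nθ} dθ = Γ(n+1)/n^{n+1} exactly; the choice of g is mine, not from the notes:

# Laplace approximation to I = int exp{n g(theta)} dtheta, g(theta) = log(theta) - theta.
g <- function(th) log(th) - th
n <- 20
m <- optimize(g, c(0.01, 10), maximum = TRUE)$maximum   # mode; here m = 1
v <- -1 / (-1 / m^2)                                    # v = -1/g''(m); g''(th) = -1/th^2
laplace <- exp(n * g(m)) * sqrt(2 * pi * v / n)
exact   <- integrate(function(th) exp(n * g(th)), 0, Inf)$value  # = gamma(n+1)/n^(n+1)
c(laplace = laplace, exact = exact)   # close agreement for moderate n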

Gaussian Quadrature

A general method of integration is provided by quadrature (numerical integration), in which an integral

    I = ∫ f(u) du

is approximated by

    Î = Σ_{i=1}^{n_w} f(u_i) w_i,

for design points u_1, ..., u_{n_w} and weights w_1, ..., w_{n_w}. Different choices of (u_i, w_i) lead to different integration rules. In mixed model applications we have integrals with respect to a normal density; Gauss-Hermite quadrature is designed for problems of this type. Specifically, it provides exact integration of

    ∫ g(u) e^{-u²} du,

where g(·) is a polynomial of degree 2n_w - 1.

The design points are the zeroes of the so-called Hermite polynomials. Specifically, for a rule of n_w points, u_i is the i-th zero of H_{n_w}(u), the Hermite polynomial of degree n_w, and

    w_i = 2^{n_w - 1} n_w! √π / ( n_w² [H_{n_w - 1}(u_i)]² ).

Now suppose θ is two-dimensional and we wish to evaluate

    I = ∫∫ f(θ) dθ = ∫ [ ∫ f(θ_1, θ_2) dθ_2 ] dθ_1 = ∫ f_1(θ_1) dθ_1,

where f_1(θ_1) = ∫ f(θ_1, θ_2) dθ_2. Now form

    Î = Σ_{i=1}^{m_1} w_i f̂_1(θ_1i),   where   f̂_1(θ_1i) = Σ_{j=1}^{m_2} w_j* f(θ_1i, θ_2j)

for a rule {θ_2j, w_j*} in the second dimension. Then we have

    Î = Σ_{i=1}^{m_1} Σ_{j=1}^{m_2} w_i w_j* f(θ_1i, θ_2j),

which is known as the Cartesian product.

Scaling and reparameterization

To implement this method the function must be centered and scaled in some way; for example, we could center and scale by the current estimates of the mean, m, and variance-covariance matrix, V; this is known as adaptive quadrature. We then form X = L^{-1}(θ - m), where L L^T = V, and carry out integration in the space of X. There is no guarantee that the most efficient rule is obtained by scaling in terms of the posterior mean and variance, but we note that the best normal approximation to a density (in terms of Kullback-Leibler divergence) has the same mean and variance.

Gauss-Hermite Code in R

Nodes and weights for n_w = 4 (using the statmod package):

> library(statmod)
> n <- 4
> quad <- gauss.quad(n,kind="hermite")
> quad$nodes
[1] -1.6506801 -0.5246476  0.5246476  1.6506801
> quad$weights
[1] 0.08131284 0.80491409 0.80491409 0.08131284

Nodes and weights for n_w = 5:

> n <- 5
> quad <- gauss.quad(n,kind="hermite")
> quad$nodes
[1] -2.0201829 -0.9585725  0.0000000  0.9585725  2.0201829
> quad$weights
[1] 0.01995324 0.39361932 0.94530872 0.39361932 0.01995324
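A quick check of the rule: for b ~ N(0, σ²), the substitution b = √2 σ u turns E[h(b)] into π^{-1/2} Σ_i w_i h(√2 σ u_i). The sketch below (again assuming statmod) recovers E[e^b] = e^{σ²/2}:

# Use the 5-point Gauss-Hermite rule to approximate E[exp(b)], b ~ N(0, sigma^2).
library(statmod)
sigma <- 0.8
quad  <- gauss.quad(5, kind = "hermite")
approx <- sum(quad$weights * exp(sqrt(2) * sigma * quad$nodes)) / sqrt(pi)
c(approx = approx, exact = exp(sigma^2 / 2))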

Importance Sampling

Rather than deterministically selecting points, we may randomly generate points from some density h(θ). We have

    I = ∫ f(θ) dθ = ∫ [f(θ)/h(θ)] h(θ) dθ = E[w(θ)],

where w(θ) = f(θ)/h(θ). Hence we have the obvious estimator

    Î = (1/m) Σ_{i=1}^m w(θ_i),

where θ_i ~iid h(·). We have E[Î] = I and V = var(Î) = (1/m) var{w(θ)}. From this expression it is clear that a good h(·) produces an approximately constant w(θ).

We may estimate V via

    V̂ = (1/m) [ (1/m) Σ_{i=1}^m f²(θ_i)/h²(θ_i) - Î² ],

and (appealing to the central limit theorem) Î is asymptotically normal, and so a 100(1 - α)% confidence interval is given by Î ± z_{1-α/2} V̂^{1/2}, where z_{1-α/2} is the 1 - α/2 point of an N(0, 1) random variable. Hence the accuracy of the approximation may be directly assessed, providing an advantage over analytical approximations and quadrature methods.

Notes on Importance Sampling

We require an h(·) with heavier tails than the integrand. We can carry out importance sampling with any h, but if the tails are lighter we will have an estimator with infinite variance (and hence an inconsistent procedure). Many suggestions for h have been made, including Student t distributions and mixtures of Student t distributions. Iteration may again be used to obtain an estimator with good properties.
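A small R sketch of the estimator, its variance estimate and the resulting confidence interval, for an integrand of my choosing with known answer (the N(0,1) density times e^{θ/2}, so that I = e^{1/8}), using a heavier-tailed Student t_3 proposal:

# Importance sampling with a t_3 proposal; the exact answer is exp(1/8).
set.seed(4)
m <- 1e5
theta <- rt(m, df = 3)                                  # draws from h
w <- dnorm(theta) * exp(theta / 2) / dt(theta, df = 3)  # w = f/h
I.hat <- mean(w)
V.hat <- var(w) / m                                     # estimated var(I.hat)
c(estimate = I.hat, exact = exp(1/8),
  lower = I.hat - 1.96 * sqrt(V.hat), upper = I.hat + 1.96 * sqrt(V.hat))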

Notes on Implementation

If the number of parameters is small, then numerical integration techniques (e.g. quadrature) are highly efficient in terms of the number of function evaluations required. Hence if, for example, obtaining a point on the likelihood surface is computationally expensive (as occurs if a large simulation is required), then such techniques are preferable to Monte Carlo methods.

The method employed will depend on whether it is for a one-off application, in which case ease of implementation is a consideration, or for a great deal of use, in which case an efficient method may be required.

In general it is difficult to assess the accuracy of Laplace/numerical integration techniques. For simulation methods we note that independent samples are ideal for assessing Monte Carlo error, since standard errors on expectations of interest may be simply calculated.

Evans and Swartz (1995, Statistical Science) provide a good review of integration techniques.

Penalized Quasi-Likelihood

Breslow and Clayton (1993) introduced the method of Penalized Quasi-Likelihood (PQL), which was an attempt to extend quasi-likelihood to GLMMs. One justification of the method is a first-order Laplace approximation. PQL is very poor for binary data, but may be OK for binomial and Poisson data (as long as the counts are not too small).

Within the lme4 package the lmer function may be used to fit GLMMs using MLE/REML; the required integrals can be approximated using penalized quasi-likelihood, Laplace, or adaptive Gaussian quadrature.

GLMMs for the Seizure Data

The PQL standard error for β_1 looks off here (it doesn't tie in with later analyses). The adaptive quadrature option is not available for this model.

> library(lme4) # Need Matrix package version
> lmermod <- lmer(y ~ x+x2+x3+(1|ID)+offset(log(time)),family=poisson,
+                 data=seiz,method="PQL")
> summary(lmermod)
Generalized linear mixed model fit using PQL

Random effects:
 Groups Name        Variance Std.Dev.
 ID     (Intercept) [values lost in transcription]

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) [values lost in transcription] < 2e-16 ***
x           [values lost in transcription]
x2          [values lost in transcription] *
x3          [values lost in transcription]

> lmermod2 <- lmer(y ~ x+x2+x3+(1|ID)+offset(log(time)),family=poisson,
+                  data=seiz,method="Laplace")
> summary(lmermod2)
Generalized linear mixed model fit using Laplace

Formula: y ~ x + x2 + x3 + (1 | ID) + offset(log(time))
   Data: seiz
 Family: poisson(log link)
 AIC  BIC  logLik  deviance: [values lost in transcription]

Random effects:
 Groups Name        Variance Std.Dev.
 ID     (Intercept) [values lost in transcription]
# of obs: 295, groups: ID, 59

Estimated scale (compare to 1) 1.674

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) [values lost in transcription] ***
x           [values lost in transcription]
x2          [values lost in transcription] *
x3          [values lost in transcription]

The Laplace approach gives significantly different (and more reliable) estimates.

Random intercepts and slopes

We may also allow the treatment effect to vary between individuals.

> lmermod4 <- lmer(y ~ x+x2+x3+(1+x2|ID)+offset(log(time)),
+                  family=poisson,data=seiz,method="Laplace")
> summary(lmermod4)
Generalized linear mixed model fit using Laplace

Random effects:
 Groups Name        Variance Std.Dev. Corr
 ID     (Intercept) [values lost in transcription]
        x2          [values lost in transcription]
# of obs: 295, groups: ID, 59

Estimated scale (compare to 1) 1.4377

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) [values lost in transcription] ***
x           [values lost in transcription]
x2          [values lost in transcription]
x3          [values lost in transcription] *

Bayesian Inference for GLMMs

A Bayesian approach to inference for a GLMM adds a prior distribution for β, α to the likelihood L(β, α). Again, a proper prior is required for the matrix D. In general a proper prior is not required for β: the exponential family and linear link lead to a likelihood that is well-behaved.

Priors for β and α in the GLMM: Lognormal Priors

It is convenient to specify lognormal priors for positive parameters θ, since one may specify two quantiles of the distribution and directly solve for the two parameters of the prior. In a GLMM we can often specify priors for more meaningful parameters than elements of β. For example, e^β is the relative risk/rate in a log-linear model, and is the odds ratio in a logistic model.

Suppose we wish to specify a lognormal prior for a generic parameter θ. Denote by LN(µ, σ) the lognormal distribution with E[log θ] = µ and var(log θ) = σ², and let θ_1 and θ_2 be the q_1 and q_2 quantiles of this prior. Then

    µ = [ log(θ_1) z_{q_2} - log(θ_2) z_{q_1} ] / (z_{q_2} - z_{q_1}),   σ = [ log(θ_2) - log(θ_1) ] / (z_{q_2} - z_{q_1}).    (43)

As an example, suppose that for θ we believe there is a 50% chance that the relative risk is less than 1 and a 95% chance that it is less than 5; with q_1 = 0.5, θ_1 = 1.0 and q_2 = 0.95, θ_2 = 5.0, we obtain lognormal parameters µ = 0 and σ = log 5/1.645 = 0.98.
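Solving (43) is a two-liner in R; the small function below reproduces the example:

# Lognormal prior parameters (mu, sigma) from two quantiles, following (43).
lnorm.prior <- function(theta1, q1, theta2, q2) {
  z1 <- qnorm(q1); z2 <- qnorm(q2)
  mu    <- (log(theta1) * z2 - log(theta2) * z1) / (z2 - z1)
  sigma <- (log(theta2) - log(theta1)) / (z2 - z1)
  c(mu = mu, sigma = sigma)
}
lnorm.prior(1, 0.5, 5, 0.95)   # mu = 0, sigma = log(5)/1.645 = 0.98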

Gamma Priors

Consider the random intercepts model with b_i ~iid N(0, σ²). It is not straightforward to specify a prior for σ, which represents the standard deviation of the residuals on the linear predictor scale and is consequently not easy to interpret. We specify a gamma prior, Ga(a, b), for the precision τ = 1/σ², with parameters a, b specified a priori. The choice of a gamma distribution is convenient since it produces a marginal distribution for the residuals in closed form. Specifically, the two-stage model

    b_i | σ ~iid N(0, σ²),   τ = σ^{-2} ~ Ga(a, b),

produces a marginal distribution for b_i which is t_d(0, λ²), a Student's t distribution with d = 2a degrees of freedom, location zero, and scale λ² = b/a.

We now consider a log link, in which case the above is equivalent to the residual relative risks following a log t distribution. We specify the range exp(±R) within which the residual relative risks will lie with probability q, and use the relationship ±t^d_{(1+q)/2} λ = ±R, where t^d_q is the q-th quantile of a Student t random variable with d degrees of freedom, to give

    a = d/2,   b = R² d / [ 2 (t^d_{(1+q)/2})² ].

For example, if we assume a priori that the residual relative risks follow a log Student t distribution with 2 degrees of freedom, and that 95% of these risks fall in the interval (0.5, 2.0), then we obtain the prior Ga(1, 0.026). In terms of σ this results in (2.5%, 97.5%) quantiles of (0.084, 1.01), with prior median 0.19. It is important to assess whether the prior allows all reasonable levels of variability in the residual relative risks; in particular, small values should not be excluded. The prior Ga(0.001, 0.001), which has previously been used (e.g. in the WinBUGS manual), should be avoided for this very reason (it corresponds to relative risks which follow a log Student t distribution with 0.002 degrees of freedom).
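The same recipe in R, together with the implied prior quantiles for σ; this reproduces the Ga(1, 0.026) example:

# Gamma prior Ga(a, b) for tau = 1/sigma^2 from (d, R, q): the residual log
# relative risks follow a t_d, with probability q of lying in (-R, R).
gamma.prior <- function(d, R, q) {
  lambda <- R / qt((1 + q) / 2, df = d)   # scale of the marginal t for b_i
  c(a = d / 2, b = (d / 2) * lambda^2)
}
ab <- gamma.prior(d = 2, R = log(2), q = 0.95)
ab                                        # a = 1, b = 0.026
# Implied (2.5%, 50%, 97.5%) prior quantiles for sigma = tau^{-1/2}:
1 / sqrt(qgamma(c(0.975, 0.5, 0.025), shape = ab["a"], rate = ab["b"]))
# 0.084, 0.19, 1.01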

Implementation

Closed-form inference is unavailable, but MCMC is almost as straightforward as in the linear mixed model case. The joint posterior is

    p(β, W, b | y) ∝ ∏_{i=1}^m { p(y_i | β, b_i) p(b_i | W) } × π(β) π(W).

Suppose we have priors:

    β ~ N_{q+1}(β_0, V_0),   W ~ W_{q+1}(r, R^{-1}).

The conditional distributions for τ and W are unchanged from the linear case. There is no closed-form conditional distribution for β, or for b_i, but a Metropolis-Hastings step can be used (or adaptive rejection sampling can be utilized; the conditional is log-concave).

Integrated Nested Laplace Approximation (INLA)

Recently an approach has emerged that combines Laplace approximations and numerical integration in a very efficient manner; see Rue, Martino and Chopin (2009) for more detail. The method is designed for latent Gaussian models; we describe it in the context of a GLMM.

Suppose Y_ij is of exponential family form: Y_ij | θ_ij, α ~ p(·), where p(·) is a member of the exponential family, that is,

    p(y_ij | θ_ij, α) = exp[ {y_ij θ_ij - b(θ_ij)}/a(α) + c(y_ij, α) ],

for i = 1, ..., m units and j = 1, ..., n_i measurements per unit. Let µ_ij = E[Y_ij | θ_ij, α] with

    g(µ_ij) = η_ij = x_ij β + z_ij b_i,

where b_i ~ N(0, D), and β is assigned a normal prior. We also have priors for α (if not a constant) and D; these priors are non-normal.

Let γ = (b, β) denote the p × 1 vector of Gaussian parameters, with π(γ | α) = N{0, Q(D)^{-1}}, and let α = (α, D) be the hyperparameters, which are not Gaussian. Then

    π(γ, α | y) ∝ π(α) π(γ | α) ∏_i p(y_i | γ, α)
               ∝ π(α) |Q(D)|^{1/2} exp( -(1/2) γ^T Q(D) γ + Σ_i log p(y_i | γ, α) ).

The INLA Algorithm

We wish to obtain the posterior marginals π(γ_i | y) and π(α_i | y). We have

    π(γ_i | y) = ∫ π(γ_i | α, y) π(α | y) dα,

which is evaluated via the approximation

    π̃(γ_i | y) = ∫ π̃(γ_i | α, y) π̃(α | y) dα    (44)
               ≈ Σ_k π̃(γ_i | α_k, y) π̃(α_k | y) Δ_k,    (45)

where π̃(γ_i | α, y) is approximated by Laplace (or other analytical approximations). For π̃(α | y) the mode is located and then the Hessian is approximated, from which a grid of points {α_k} is found that covers the density.

Pros and Cons of INLA

Advantages:

- Quite widely applicable: GLMMs, including temporal and spatial error terms.
- Very fast.
- An R package is available.

Disadvantages:

- Can't do NLMEMs (though it could in principle).
- Difficult to assess when the approximation is failing.
- The R package is new, and so there are wrinkles.

Bayesian Inference for the Seizure Data

We fit various models and begin with a discussion of prior specification. We fit five models to the seizure data.

Model 1: Random intercepts only, π(β) ∝ 1, τ ~ Ga(1, 0.026): corresponds to Student t_2 residuals with 95% of the residual relative risks in (0.5, 2).

Model 2: Random intercepts only, π(β) ∝ 1, τ ~ Ga(2, 1.376): corresponds to Student t_4 residuals and 95% in (0.1, 10).

Model 3: Random effects for the intercept and for x_2.

Model 4: We allow a bivariate Student t distribution for the pair of random effects introduced in Model 3.

Model 5: We introduce measurement error into the model.

WinBUGS code for Model 1:

model {
  for (i in 1:n) {
    for (j in 1:k) {
      Y[i,j] ~ dpois(mu[i,j])
      log(mu[i,j]) <- log(t[j]) + beta0 + beta1*x[i] + beta2*x2[j] +
                      beta3*x[i]*x2[j] + b[i]
    }
    b[i] ~ dnorm(0,tau)
  }
  tau ~ dgamma(1,0.026)
  sigma <- 1/tau
  beta0 ~ dflat()
  beta1 ~ dflat()
  beta2 ~ dflat()
  beta3 ~ dflat()
}

# Data
list(k = 5, n = 59, t = c(8,2,2,2,2), x2 = c(0,1,1,1,1),
     x = c(0,...,0,1,...,1),               # 28 zeros then 31 ones
     Y = structure(.Data = c(...), .Dim = c(59,5)))
# [the full 59 x 5 matrix of seizure counts is corrupted in this transcription]

# Initial estimates
list(b = c(0,0,...,0), beta0 = 0, beta1 = 0, beta2 = 0, beta3 = 0, tau = 1)

Estimates (standard deviations):

[Table 2: Posterior means and standard deviations for the Bayesian analysis of the seizure data, with rows β_0, β_1, β_2, β_3, σ_0, σ_1, ρ, σ_e and columns Models 1-5; σ_0 is the standard deviation of the random intercepts, σ_1 is the standard deviation of the random period effect, ρ is the correlation between these random effects, and σ_e is the standard deviation of the measurement error. The numerical entries are corrupted in this transcription.]

Poisson Model with a nugget effect

Recall that the model

    Y_ij | b_i ~ Poisson(t_ij exp(x_ij β + b_i)),   b_i ~ N(0, σ_0²),

has a single parameter only, σ_0², to allow for both excess-Poisson variability and between-individual variability. In the LMEM we have

    Y_ij = x_ij β + b_i + ε_ij,   b_i ~ N(0, σ_0²),   ε_ij ~ N(0, σ_e²),

with b_i and ε_ij independent.

By analogy we might consider the model:

    Y_ij | b_i, b_ij ~ Poisson(t_ij exp(x_ij β + b_i + b_ij)),
    b_i ~ N(0, σ_0²),   b_ij ~ N(0, σ_e²),

with b_i and b_ij independent. We now have two parameters to allow for between-individual variability, σ_0², and excess-Poisson variability, σ_e². Unfortunately there is no simple marginal interpretation of σ_0 and σ_e:

    E[Y_ij] = t_ij exp(x_ij β + σ_e²/2 + σ_0²/2) = µ_ij
    var(Y_ij) = µ_ij + µ_ij² ( e^{σ_e²} e^{σ_0²} - 1 )
    cov(Y_ij, Y_ik) = t_ij t_ik exp(x_ij β + x_ik β) e^{σ_e²} e^{σ_0²} ( e^{σ_0²} - 1 )

Another possibility would be to start with a negative binomial distribution, and then introduce a random effect, b_i. This reveals the heaven and hell of mixed-effects models: we have a lot of flexibility in the models we can fit, but many formulations that are similar produce different marginal mean and covariance structures, and often there is no obvious right choice.

WinBUGS code for the Poisson-lognormal (nugget) model:

# Model 3 - Poisson lognormal for nugget also
model {
  for (i in 1:n) {
    for (j in 1:k) {
      Y[i,j] ~ dpois(mu[i,j])
      log(mu[i,j]) <- log(t[j]) + beta0 + beta1*x[i] + beta2*x2[j] +
                      beta3*x[i]*x2[j] + b[i] + be[i,j]
      be[i,j] ~ dnorm(0,taue)
    }
    b[i] ~ dnorm(0,tau)
  }
  taue ~ dgamma(1,0.026)
  tau ~ dgamma(1,0.026)
  sigma <- sqrt(1/tau)
  sigmae <- 1/sqrt(taue)
  beta0 ~ dflat()
  beta1 ~ dflat()
  beta2 ~ dflat()
  beta3 ~ dflat()
}

Seizure Data

Patient 49 had counts 151, 102, 65, 72, 63 under progabide: very surprising. In DHLZ, dropping this individual gave a parameter of interest of -0.3. Posterior medians of b_ij for this individual (i = 49, j = 0, 1, 2, 3, 4) are: -.6, .6, .8, .27, .65, .5 [digits lost in transcription].

Conclusions: there is evidence of a statistically significant treatment effect. Under Model 4 the 95% credible interval on β_3 is (-.6, -.28), and under Model 5 it is (-.59, -.3) [some digits lost in transcription].

Seizure Data: INLA implementation

> require(glmmAK); data(epileptic)
> epileptic$Y = epileptic$seizure; epileptic$x = epileptic$trt
> epileptic$x2 = as.integer(epileptic$visit>0)
> epileptic$t = 8-6*epileptic$x2
> epileptic$rand = 1:nrow(epileptic)
> formula = Y ~ x + x2 + I(x*x2) +
+   f(id,model="iid",param=c(1,0.026)) + f(rand,model="iid",param=c(1,0.026))
> inla1 = inla(formula, family="poisson", data=epileptic, offset=I(log(t)))
> summary(inla1)

Fixed effects:
            mean sd 0.025quant 0.975quant kld
x           [values lost in transcription]
x2          [values lost in transcription]
I(x*x2)     [values lost in transcription]
intercept   [values lost in transcription]

Model hyperparameters:
                    mean sd 0.025quant 0.975quant
Precision.for..id   [values lost in transcription]
Precision.for..rand [values lost in transcription]

Generalized Estimating Equations (GEEs)

Liang and Zeger (1986, Biometrika) and Zeger and Liang (1986, Biometrics) considered GLMs with dependence within individuals (in the context of longitudinal data).

Theorem (Liang and Zeger, 1986): the estimator β̂ that satisfies

    G(β, α̂) = Σ_{i=1}^m D_i^T W_i^{-1} (Y_i - µ_i) = 0,

where D_i = ∂µ_i/∂β, W_i = W_i(β, α) is the working covariance model, µ_i = µ_i(β), and α̂ is a consistent estimator of α, is such that

    V_β^{-1/2} (β̂ - β) →_d N(0, I),

where V_β is given by

    ( Σ_{i=1}^m D_i^T W_i^{-1} D_i )^{-1} { Σ_{i=1}^m D_i^T W_i^{-1} cov(Y_i) W_i^{-1} D_i } ( Σ_{i=1}^m D_i^T W_i^{-1} D_i )^{-1}.

In practice an empirical estimator of cov(Y_i) is substituted, to give V̂_β.

Choice of Working Covariance Models

As in the linear case, various assumptions about the form of the working covariance may be made (what is a natural choice?); we write

    W_i = Δ_i^{1/2} R_i(α) Δ_i^{1/2},

where Δ_i = diag{var(Y_i1), ..., var(Y_in_i)} and R_i is a working correlation model: for example, independence, exchangeable, AR(1), unstructured.

For small m the sandwich estimator will have high variability, and so model-based variance estimators may be preferable (but would we trust asymptotic normality if m were small anyway?). Model-based estimators are more efficient if the model is correct.

Published comments:

- Liang and Zeger (1986): little difference when correlation is moderate.
- McDonald (1993): the independence estimator "may be recommended for practical purposes".
- Zhao, Prentice and Self (1992): assuming independence can lead to important losses of efficiency.
- Fitzmaurice, Laird and Rotnitzky (1993): it is important to obtain a close approximation to cov(Y_i) in order to achieve high efficiency.

GEE for the Seizure Data

The log-linear model is given by

    log E[Y_ij] = log µ_ij = log t_ij + β_0 + β_1 x_i1 + β_2 x_ij2 + β_3 x_i1 x_ij2,

with var(Y_ij) = α µ_ij. Recall that β_1 is the baseline comparison of rates, β_2 is the period effect in the placebo group, and β_3 is the treatment × period effect of interest. Both quasi-likelihood and working independence GEE have estimating equation

    G(β, α̂) = Σ_{i=1}^m x_i^T (Y_i - µ_i) = 0,

but differ in the manner in which the standard errors are calculated.

Estimates (standard errors):

           Poisson        Quasi-Lhd      GEE Ind        GEE Exch       GEE AR(1)
    β_0    1.35 (0.034)   1.35 (0.15)    1.35 (0.16)    1.35 (0.16)    1.3 (0.16)
    β_1    0.027 (0.047)  0.027 (0.21)   0.027 (0.22)   0.027 (0.22)   [?] ([?])
    β_2    0.11 (0.047)   0.11 (0.21)    0.11 (0.12)    0.11 (0.12)    [?] ([?])
    β_3    -0.10 (0.065)  -0.10 (0.29)   -0.10 (0.22)   -0.10 (0.22)   -0.13 (0.27)
    α_1    1              19.7           19.4           19.4           [?]
    α_2    -              -              -              0.78           0.89

Table 3: Parameter estimates and standard errors under various models; α_1 is a variance parameter, and α_2 a correlation parameter (cells marked [?] are corrupted in the transcription).

The point estimates under Poisson, quasi-likelihood and GEE working independence will always agree. The Poisson standard errors are clearly much too small. The quasi-likelihood standard errors are increased by √19.7 = 4.4, but do not acknowledge dependence between observations on the same individual (it is as if we have 59 × 5 independent observations). The standard errors of estimated parameters that are associated with time-varying covariates (β_2 and β_3) are reduced under GEE, since within-person comparisons are being made. The coincidence of the estimates and standard errors for independence and exchangeability is a consequence of the balanced design.
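For reference, fits of this kind can be reproduced with, for example, the geepack package (this is my sketch; the notes do not show the fitting code), assuming a long-format data frame seiz with columns y, x1, x2, time and patient identifier ID:

# Working-independence and exchangeable GEE fits for the seizure model.
library(geepack)
gee.ind  <- geeglm(y ~ x1 + x2 + I(x1 * x2) + offset(log(time)),
                   id = ID, family = poisson, corstr = "independence",
                   data = seiz)
gee.exch <- geeglm(y ~ x1 + x2 + I(x1 * x2) + offset(log(time)),
                   id = ID, family = poisson, corstr = "exchangeable",
                   data = seiz)
summary(gee.ind)   # sandwich standard errors allow for within-patient dependence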

Interpretation of Marginal and Conditional Coefficients

In a marginal model (which we consider under GEE), we have

    E[Y | x] = exp(γ_0 + γ_1 x),

in which case e^{γ_1} is the change in the average response when we increase x by 1 unit, in the population under consideration. Under the conditional (mixed effects) model the interpretation of regression coefficients is conditional on the value of the random effect. For the model

    E[Y | x, b] = exp(β_0 + β_1 x + b),  with b ~iid N(0, σ_0²),

the marginal mean is given by:

    E[Y | x] = E_b{E[Y | x, b]} = exp(β_0 + σ_0²/2 + β_1 x).

Hence for the log-linear model e^{β_1} has the same marginal interpretation as e^{γ_1} (the marginal intercept is γ_0 = β_0 + σ_0²/2), though estimation of the latter via GEE produces a consistent estimator in more general circumstances (though there is an efficiency loss if the random effects model is correct).

In the model

    E[Y | x, b] = exp{β_0 + b_0i + (β_1 + b_1i) x_i},

e^{β_1} is the relative risk between two populations with the same b but whose x values differ by one unit, that is:

    exp(β_1) = E[Y | x + 1, b] / E[Y | x, b].

An alternative interpretation is to say that it is the expected change between two typical individuals, that is, individuals with random effects b = 0. With b ~iid N(0, D) we have the marginal mean

    E[Y | x] = exp{ β_0 + D_00/2 + x(β_1 + D_01) + x² D_11/2 },

so that there is no marginal mean interpretation of exp(β_1) (the latter gives the marginal median).

Second Extension to GEE: Connected Estimating Equations, GEE2 (*)

In GEE2, there is a connected set of joint estimating equations for β and α. Such an approach was proposed by Zhao and Prentice (1990), and Prentice and Zhao (1991). This approach is particularly appealing if the variance-covariance model is of interest. To motivate such a pair, consider the following model for a single individual with n independent observations:

    Y_i | β, α ~ind N{µ_i(β), Σ_i(β, α)},

where, for example, we may have Σ_i(β, α) = α µ_i(β)², i = 1, ..., n. We have the log-likelihood

    l(β, α) = -(1/2) Σ_{i=1}^n log Σ_i - (1/2) Σ_{i=1}^n (Y_i - µ_i)²/Σ_i.

(*) GEE1 is the method in which we have a single estimating equation, and a consistent estimator of α.

The score equations are given by

    ∂l/∂β = Σ_{i=1}^n (∂µ_i/∂β)^T (Y_i - µ_i)/Σ_i + (1/2) Σ_{i=1}^n (∂Σ_i/∂β)^T [ (Y_i - µ_i)² - Σ_i ]/Σ_i² = 0    (46)

    ∂l/∂α = (1/2) Σ_{i=1}^n (∂Σ_i/∂α)^T [ (Y_i - µ_i)² - Σ_i ]/Σ_i² = 0    (47)

This pair of quadratic estimating functions is unbiased given correct specification of the first two moments. Note, though, that if the variance model is wrong we are no longer guaranteed a consistent estimator of β; if it is correct, however, there will be a gain in efficiency.

Let

    S_i = (Y_i - µ_i)²,

with, under the model,

    E[S_i] = Σ_i,
    var(S_i) = E[S_i²] - E[S_i]² = 3Σ_i² - Σ_i² = 2Σ_i².

Hence we can rewrite (46) and (47) as

    ∂l/∂β = Σ_{i=1}^n D_i^T V_i^{-1} (Y_i - µ_i) + Σ_{i=1}^n E_i^T W_i^{-1} (S_i - Σ_i),
    ∂l/∂α = Σ_{i=1}^n F_i^T W_i^{-1} (S_i - Σ_i),

where D_i = ∂µ_i/∂β, E_i = ∂Σ_i/∂β and F_i = ∂Σ_i/∂α. This can be compared with the usual estimating equation specification:

    ∂l/∂β = Σ_{i=1}^n D_i^T V_i^{-1} (Y_i - µ_i).

GEE2 Continued

The general form of the estimating equations, in the dependent data setting, is given by

    Σ_{i=1}^m [ D_i 0 ; E_i F_i ]^T [ V_i C_i ; C_i^T W_i ]^{-1} [ Y_i - µ_i ; S_i - Σ_i ] = 0

(block rows separated by semicolons), where D_i = ∂µ_i/∂β, E_i = ∂Σ_i/∂β, F_i = ∂Σ_i/∂α, and we have working variance-covariance structure

    V_i = var(Y_i),   C_i = cov(Y_i, S_i),   W_i = var(S_i).

When C_i = 0 we obtain

    G_1(β, α) = Σ_{i=1}^m D_i^T V_i^{-1} (Y_i - µ_i) + Σ_{i=1}^m E_i^T W_i^{-1} (S_i - Σ_i) = 0,
    G_2(β, α) = Σ_{i=1}^m F_i^T W_i^{-1} (S_i - Σ_i) = 0,

which are the dependent-data version of the normal score equations we obtained earlier. Prentice and Zhao show that these equations arise from a so-called quadratic exponential model:

    p(Y_i | θ_i, λ_i) = k_i exp[ Y_i^T θ_i + S_i^T λ_i + c_i(Y_i) ].

For example, c_i = 0 gives the multivariate normal. For consistency of β̂ we require the models for both Y_i and S_i to be correct; there is increased efficiency if the models are correct.


More information

Introduction (Alex Dmitrienko, Lilly) Web-based training program

Introduction (Alex Dmitrienko, Lilly) Web-based training program Web-based training Introduction (Alex Dmitrienko, Lilly) Web-based training program http://www.amstat.org/sections/sbiop/webinarseries.html Four-part web-based training series Geert Verbeke (Katholieke

More information

Bayesian Regression Linear and Logistic Regression

Bayesian Regression Linear and Logistic Regression When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we

More information

Regression models. Generalized linear models in R. Normal regression models are not always appropriate. Generalized linear models. Examples.

Regression models. Generalized linear models in R. Normal regression models are not always appropriate. Generalized linear models. Examples. Regression models Generalized linear models in R Dr Peter K Dunn http://www.usq.edu.au Department of Mathematics and Computing University of Southern Queensland ASC, July 00 The usual linear regression

More information

Generalized Estimating Equations (gee) for glm type data

Generalized Estimating Equations (gee) for glm type data Generalized Estimating Equations (gee) for glm type data Søren Højsgaard mailto:sorenh@agrsci.dk Biometry Research Unit Danish Institute of Agricultural Sciences January 23, 2006 Printed: January 23, 2006

More information

9 Generalized Linear Models

9 Generalized Linear Models 9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

More information

Discussion of Maximization by Parts in Likelihood Inference

Discussion of Maximization by Parts in Likelihood Inference Discussion of Maximization by Parts in Likelihood Inference David Ruppert School of Operations Research & Industrial Engineering, 225 Rhodes Hall, Cornell University, Ithaca, NY 4853 email: dr24@cornell.edu

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA)

Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA) Estimating the marginal likelihood with Integrated nested Laplace approximation (INLA) arxiv:1611.01450v1 [stat.co] 4 Nov 2016 Aliaksandr Hubin Department of Mathematics, University of Oslo and Geir Storvik

More information

Repeated ordinal measurements: a generalised estimating equation approach

Repeated ordinal measurements: a generalised estimating equation approach Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

ML estimation: Random-intercepts logistic model. and z

ML estimation: Random-intercepts logistic model. and z ML estimation: Random-intercepts logistic model log p ij 1 p = x ijβ + υ i with υ i N(0, συ) 2 ij Standardizing the random effect, θ i = υ i /σ υ, yields log p ij 1 p = x ij β + σ υθ i with θ i N(0, 1)

More information

Statistics & Data Sciences: First Year Prelim Exam May 2018

Statistics & Data Sciences: First Year Prelim Exam May 2018 Statistics & Data Sciences: First Year Prelim Exam May 2018 Instructions: 1. Do not turn this page until instructed to do so. 2. Start each new question on a new sheet of paper. 3. This is a closed book

More information

Modeling Longitudinal Count Data with Excess Zeros and Time-Dependent Covariates: Application to Drug Use

Modeling Longitudinal Count Data with Excess Zeros and Time-Dependent Covariates: Application to Drug Use Modeling Longitudinal Count Data with Excess Zeros and : Application to Drug Use University of Northern Colorado November 17, 2014 Presentation Outline I and Data Issues II Correlated Count Regression

More information

Statistics in Environmental Research (BUC Workshop Series) II Problem sheet - WinBUGS - SOLUTIONS

Statistics in Environmental Research (BUC Workshop Series) II Problem sheet - WinBUGS - SOLUTIONS Statistics in Environmental Research (BUC Workshop Series) II Problem sheet - WinBUGS - SOLUTIONS 1. (a) The posterior mean estimate of α is 14.27, and the posterior mean for the standard deviation of

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

Model comparison and selection

Model comparison and selection BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

Likelihood-Based Methods

Likelihood-Based Methods Likelihood-Based Methods Handbook of Spatial Statistics, Chapter 4 Susheela Singh September 22, 2016 OVERVIEW INTRODUCTION MAXIMUM LIKELIHOOD ESTIMATION (ML) RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML)

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Principles of Statistical Inference Recap of statistical models Statistical inference (frequentist) Parametric vs. semiparametric

More information

Lecture 5: LDA and Logistic Regression

Lecture 5: LDA and Logistic Regression Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant

More information

Proportional hazards regression

Proportional hazards regression Proportional hazards regression Patrick Breheny October 8 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/28 Introduction The model Solving for the MLE Inference Today we will begin discussing regression

More information

PAPER 218 STATISTICAL LEARNING IN PRACTICE

PAPER 218 STATISTICAL LEARNING IN PRACTICE MATHEMATICAL TRIPOS Part III Thursday, 7 June, 2018 9:00 am to 12:00 pm PAPER 218 STATISTICAL LEARNING IN PRACTICE Attempt no more than FOUR questions. There are SIX questions in total. The questions carry

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

Metropolis-Hastings Algorithm

Metropolis-Hastings Algorithm Strength of the Gibbs sampler Metropolis-Hastings Algorithm Easy algorithm to think about. Exploits the factorization properties of the joint probability distribution. No difficult choices to be made to

More information

A Very Brief Summary of Bayesian Inference, and Examples

A Very Brief Summary of Bayesian Inference, and Examples A Very Brief Summary of Bayesian Inference, and Examples Trinity Term 009 Prof Gesine Reinert Our starting point are data x = x 1, x,, x n, which we view as realisations of random variables X 1, X,, X

More information

Lecture 3 Linear random intercept models

Lecture 3 Linear random intercept models Lecture 3 Linear random intercept models Example: Weight of Guinea Pigs Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1 DLZ) The response is measures at n different times, or under

More information

Package HGLMMM for Hierarchical Generalized Linear Models

Package HGLMMM for Hierarchical Generalized Linear Models Package HGLMMM for Hierarchical Generalized Linear Models Marek Molas Emmanuel Lesaffre Erasmus MC Erasmus Universiteit - Rotterdam The Netherlands ERASMUSMC - Biostatistics 20-04-2010 1 / 52 Outline General

More information

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined

More information

SB1a Applied Statistics Lectures 9-10

SB1a Applied Statistics Lectures 9-10 SB1a Applied Statistics Lectures 9-10 Dr Geoff Nicholls Week 5 MT15 - Natural or canonical) exponential families - Generalised Linear Models for data - Fitting GLM s to data MLE s Iteratively Re-weighted

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

Semiparametric Generalized Linear Models

Semiparametric Generalized Linear Models Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student

More information

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science.

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science. Texts in Statistical Science Generalized Linear Mixed Models Modern Concepts, Methods and Applications Walter W. Stroup CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint

More information

Open Problems in Mixed Models

Open Problems in Mixed Models xxiii Determining how to deal with a not positive definite covariance matrix of random effects, D during maximum likelihood estimation algorithms. Several strategies are discussed in Section 2.15. For

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two

More information

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San

More information

Lecture 9 STK3100/4100

Lecture 9 STK3100/4100 Lecture 9 STK3100/4100 27. October 2014 Plan for lecture: 1. Linear mixed models cont. Models accounting for time dependencies (Ch. 6.1) 2. Generalized linear mixed models (GLMM, Ch. 13.1-13.3) Examples

More information

2017 SISG Module 1: Bayesian Statistics for Genetics Lecture 7: Generalized Linear Modeling

2017 SISG Module 1: Bayesian Statistics for Genetics Lecture 7: Generalized Linear Modeling 2017 SISG Module 1: Bayesian Statistics for Genetics Lecture 7: Generalized Linear Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington Outline Introduction and Motivating

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

EM Algorithm II. September 11, 2018

EM Algorithm II. September 11, 2018 EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data

More information

Chapter 4: Generalized Linear Models-II

Chapter 4: Generalized Linear Models-II : Generalized Linear Models-II Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay

More information

General Regression Model

General Regression Model Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical

More information

A short introduction to INLA and R-INLA

A short introduction to INLA and R-INLA A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk

More information

Outline for today. Computation of the likelihood function for GLMMs. Likelihood for generalized linear mixed model

Outline for today. Computation of the likelihood function for GLMMs. Likelihood for generalized linear mixed model Outline for today Computation of the likelihood function for GLMMs asmus Waagepetersen Department of Mathematics Aalborg University Denmark likelihood for GLMM penalized quasi-likelihood estimation Laplace

More information

Linear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.

Linear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks. Linear Mixed Models One-way layout Y = Xβ + Zb + ɛ where X and Z are specified design matrices, β is a vector of fixed effect coefficients, b and ɛ are random, mean zero, Gaussian if needed. Usually think

More information

An R # Statistic for Fixed Effects in the Linear Mixed Model and Extension to the GLMM

An R # Statistic for Fixed Effects in the Linear Mixed Model and Extension to the GLMM An R Statistic for Fixed Effects in the Linear Mixed Model and Extension to the GLMM Lloyd J. Edwards, Ph.D. UNC-CH Department of Biostatistics email: Lloyd_Edwards@unc.edu Presented to the Department

More information

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form Outline Statistical inference for linear mixed models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark general form of linear mixed models examples of analyses using linear mixed

More information

Estimation and Model Selection in Mixed Effects Models Part I. Adeline Samson 1

Estimation and Model Selection in Mixed Effects Models Part I. Adeline Samson 1 Estimation and Model Selection in Mixed Effects Models Part I Adeline Samson 1 1 University Paris Descartes Summer school 2009 - Lipari, Italy These slides are based on Marc Lavielle s slides Outline 1

More information

1/15. Over or under dispersion Problem

1/15. Over or under dispersion Problem 1/15 Over or under dispersion Problem 2/15 Example 1: dogs and owners data set In the dogs and owners example, we had some concerns about the dependence among the measurements from each individual. Let

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Generalized, Linear, and Mixed Models

Generalized, Linear, and Mixed Models Generalized, Linear, and Mixed Models CHARLES E. McCULLOCH SHAYLER.SEARLE Departments of Statistical Science and Biometrics Cornell University A WILEY-INTERSCIENCE PUBLICATION JOHN WILEY & SONS, INC. New

More information

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown. Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)

More information