Stat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions
Linear Mixed Models for Longitudinal Data
Yan Lu, April 2018, week 15
Data structure

                                t_1            t_2            ...   t_{n_i}
  Experimental  1st subject     y_{11}         y_{12}         ...   y_{1,n_1}
                2nd subject     y_{21}         y_{22}         ...   y_{2,n_2}
                ...
                m_1-th subject  y_{m_1,1}      y_{m_1,2}      ...   y_{m_1,n_{m_1}}
  Control       1st subject     y_{m_1+1,1}    y_{m_1+1,2}    ...   y_{m_1+1,n_{m_1+1}}
                2nd subject     y_{m_1+2,1}    y_{m_1+2,2}    ...   y_{m_1+2,n_{m_1+2}}
                ...
                m-th subject    y_{m1}         y_{m2}         ...   y_{m,n_m}
Linear mixed model
$$Y_i = X_i\beta + Z_i b_i + \epsilon_i \quad (1)$$
where
$$\beta_G = \begin{bmatrix}\beta_{G0}\\ \beta_{G1}\end{bmatrix}, \quad \beta_B = \begin{bmatrix}\beta_{B0}\\ \beta_{B1}\end{bmatrix}, \quad \beta = \begin{bmatrix}\beta_G\\ \beta_B\end{bmatrix} = \begin{bmatrix}\beta_{G0}\\ \beta_{G1}\\ \beta_{B0}\\ \beta_{B1}\end{bmatrix}, \quad b_i = \begin{bmatrix}b_{i0}\\ b_{i1}\end{bmatrix}$$
with $b_i \sim N(0, g)$ independent of $\epsilon_i \sim N(0, R_i)$. Therefore
$$Y_i \sim N(X_i\beta,\; Z_i g Z_i' + R_i)$$
$$Y_i = \begin{cases} X_i\beta_G + Z_i b_i + \epsilon_i & \text{child } i \text{ is a girl}\\ X_i\beta_B + Z_i b_i + \epsilon_i & \text{child } i \text{ is a boy}\end{cases}$$
where
$$Z_i = \begin{bmatrix}1 & t_{i1}\\ 1 & t_{i2}\\ \vdots & \vdots\\ 1 & t_{in_i}\end{bmatrix}, \quad X_i = \begin{bmatrix}\delta_i & \delta_i t_{i1} & (1-\delta_i) & (1-\delta_i)t_{i1}\\ \delta_i & \delta_i t_{i2} & (1-\delta_i) & (1-\delta_i)t_{i2}\\ \vdots & & & \vdots\\ \delta_i & \delta_i t_{in_i} & (1-\delta_i) & (1-\delta_i)t_{in_i}\end{bmatrix}$$
with $\delta_i = 1$ for girls and $\delta_i = 0$ for boys.
$Y_i = X_i\beta + Z_i b_i + \epsilon_i$. The parameters of the model are $\beta$ and those in $g$ and $R_i$. The $b_i$ are unknown random variables, while $Z_i$ is fixed and known. The likelihood function is based on the marginal distribution $Y_i \sim N(X_i\beta,\; Z_i g Z_i' + R_i)$, where the responses $Y_1, Y_2, \ldots, Y_m$ for different individuals are assumed independent. This has the same form as the longitudinal models $Y_i \sim N(X_i\beta, \Sigma_i)$, where here $\Sigma_i = Z_i g Z_i' + R_i$.
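To make the marginal formulation concrete, here is a small NumPy sketch for a single subject with a random intercept and slope. All numbers, and the particular choices of $g$ and $R_i$, are hypothetical illustrations, not values from the lecture:

```python
import numpy as np
from scipy.stats import multivariate_normal

# hypothetical random intercept + slope for one subject, 4 visits at t = 0..3
t = np.arange(4.0)
X_i = np.column_stack([np.ones(4), t])    # fixed-effects design
Z_i = X_i                                  # random effects share the design
g = np.array([[1.0, 0.2],
              [0.2, 0.5]])                 # cov of (b_i0, b_i1)
R_i = 0.3 * np.eye(4)                      # within-subject error covariance

V_i = Z_i @ g @ Z_i.T + R_i                # marginal covariance Z_i g Z_i' + R_i

beta = np.array([10.0, 1.5])
y_i = np.array([10.2, 11.9, 13.1, 14.8])
# marginal log-likelihood contribution of Y_i ~ N(X_i beta, V_i)
loglik_i = multivariate_normal(mean=X_i @ beta, cov=V_i).logpdf(y_i)
```

Because the likelihood only involves the marginal covariance $V_i$, the random effects never need to be integrated out numerically in the normal case.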
Thus standard ML and REML can be used for inference on $\beta$, whereas AIC/BIC may be used for selection of appropriate covariance structures.
We can also state model (1) as
$$\begin{bmatrix}Y_1\\ Y_2\\ \vdots\\ Y_m\end{bmatrix} = \begin{bmatrix}X_1\\ X_2\\ \vdots\\ X_m\end{bmatrix}\beta + \begin{bmatrix}Z_1 & & & \\ & Z_2 & & \\ & & \ddots & \\ & & & Z_m\end{bmatrix}\begin{bmatrix}b_1\\ \vdots\\ b_m\end{bmatrix} + \begin{bmatrix}\epsilon_1\\ \epsilon_2\\ \vdots\\ \epsilon_m\end{bmatrix}$$
i.e.,
$$Y = X\beta + Zb + \epsilon \quad (2)$$
where $E(\epsilon) = 0$, $E(b) = 0$ and
$$\mathrm{var}(\epsilon) = R = \begin{bmatrix}R_1 & & & \\ & R_2 & & \\ & & \ddots & \\ & & & R_m\end{bmatrix}$$
and
$$\mathrm{var}(b) = \begin{bmatrix}g & & & \\ & g & & \\ & & \ddots & \\ & & & g\end{bmatrix} = G$$
Note that $G$ and $g$ are different. $R$ and $G$ are block diagonal, which reflects the assumptions that $b_1, b_2, \ldots, b_m$ are mutually independent, and so are $\epsilon_1, \epsilon_2, \ldots, \epsilon_m$. As before, we also assume the $b_i$'s and $\epsilon_i$'s are independent.
Concisely, the linear mixed model for longitudinal data can be written as $Y = X\beta + Zb + \epsilon$ with $b \sim N(0, G)$ independent of $\epsilon \sim N(0, R)$. Then $Y \sim N(X\beta, V)$, where $\mathrm{var}(Y) = V = ZGZ' + R$. $V$ is block-diagonal with $i$th diagonal block $\mathrm{var}(Y_i) = Z_i g Z_i' + R_i$.
Model (1), $Y_i = X_i\beta + Z_i b_i + \epsilon_i$, includes many settings as special cases:
- the standard linear regression model, which excludes $Z_i b_i$ and assumes that $R_i = \sigma^2 I_{n_i}$ (independent errors with constant variance); the same is true for standard ANOVA
- the longitudinal model from week 13, which also excludes $Z_i b_i$
- the random coefficient model, where $X_i\beta$ was $Z_i\beta$
- the split plot, randomized block and one-way random effects models

When we assume $E(b_i) = 0$, we typically are thinking of $b_i$ as subject-specific deviations, which makes sense when the effects in $Z_i$ are contained in $X_i$.
One-way random effects model
$$Y_{ij} = \mu + \alpha_i + \epsilon_{ij}, \quad i = 1, 2, \ldots, m, \quad j = 1, \ldots, n_i \quad (3)$$
$$Y_i = \begin{bmatrix}Y_{i1}\\ Y_{i2}\\ \vdots\\ Y_{in_i}\end{bmatrix}, \quad \epsilon_i = \begin{bmatrix}\epsilon_{i1}\\ \epsilon_{i2}\\ \vdots\\ \epsilon_{in_i}\end{bmatrix}, \quad \alpha = \begin{bmatrix}\alpha_1\\ \alpha_2\\ \vdots\\ \alpha_m\end{bmatrix}, \quad X_i = \begin{bmatrix}1\\ \vdots\\ 1\end{bmatrix}_{n_i \times 1}$$
with $\alpha \sim N(0, \sigma_\alpha^2 I_m)$, $\epsilon \sim N(0, \sigma^2 I_n)$, and $\alpha$ and $\epsilon$ independent. Then $Y_i = X_i\mu + Z_i\alpha_i + \epsilon_i$, where $Z_i = X_i$.
In stacked form, $\beta = \mu$ and
$$X = \begin{bmatrix}X_1\\ X_2\\ \vdots\\ X_m\end{bmatrix}, \quad Z = \begin{bmatrix}X_1 & & & \\ & X_2 & & \\ & & \ddots & \\ & & & X_m\end{bmatrix}, \quad Y = \begin{bmatrix}Y_1\\ Y_2\\ \vdots\\ Y_m\end{bmatrix}, \quad \epsilon = \begin{bmatrix}\epsilon_1\\ \epsilon_2\\ \vdots\\ \epsilon_m\end{bmatrix}$$
where, for balanced data with $n_i = k$, $X$ is $mk \times 1$ and $Z$ is $mk \times m$.
One-way random effects model (3) can be written as $Y = X\beta + Z\alpha + \epsilon$, where $\mu = \beta$, $\alpha = b$, $g = \sigma_\alpha^2$ and $R_i = \sigma_e^2 I_{n_i}$, with $I_{n_i}$ the $n_i \times n_i$ identity matrix and $J_{n_i}$ the $n_i \times n_i$ matrix of 1s. This implies
$$Z_i g Z_i' + R_i = \sigma_\alpha^2 J_{n_i} + \sigma_e^2 I_{n_i}$$
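The compound-symmetry structure $\sigma_\alpha^2 J_{n_i} + \sigma_e^2 I_{n_i}$ can be verified numerically. A minimal sketch, with hypothetical variance components:

```python
import numpy as np

n_i = 5
sigma2_alpha, sigma2_e = 2.0, 1.0
ones = np.ones((n_i, 1))                  # Z_i = 1_{n_i}

# V_i = sigma_alpha^2 J_{n_i} + sigma_e^2 I_{n_i}  (compound symmetry)
V_i = sigma2_alpha * (ones @ ones.T) + sigma2_e * np.eye(n_i)

# every diagonal entry is sigma_alpha^2 + sigma_e^2 = 3,
# every off-diagonal entry is sigma_alpha^2 = 2,
# so the within-subject (intraclass) correlation is 2/3
```

The shared random intercept makes every pair of observations on the same subject equally correlated, which is exactly the exchangeable pattern.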
Best linear unbiased prediction (BLUP)
The expected response for subject $i$ conditional on $b_i$ is $E(Y_i \mid b_i) = X_i\beta + Z_i b_i$. The unconditional expectation, averaging over all subjects, is
$$E(Y_i) = E\{E(Y_i \mid b_i)\} = X_i\beta + Z_i E(b_i) = X_i\beta$$
Given the ML or REML estimator of $\beta$, say $\hat\beta$, we estimate the unconditional (or population-averaged) mean with $X_i\hat\beta$.
How do we estimate (actually predict) the subject-specific $E(Y_i \mid b_i) = X_i\beta + Z_i b_i$? It is reasonable to consider $X_i\hat\beta + Z_i\hat b_i$ for a suitably chosen $\hat b_i$. Statistical theory suggests that the best prediction of the random variable $b_i$ based on $Y_i$ is $E(b_i \mid Y_i)$. Under normality, this can be shown to equal
$$E(b_i \mid Y_i) = g Z_i' V_i^{-1}(Y_i - X_i\beta)$$
This follows because $Y_i \mid b_i \sim N(X_i\beta + Z_i b_i, R_i)$ and $b_i \sim N(0, g)$ imply that $(Y_i, b_i)$ has a multivariate normal distribution. It is well known that $b_i \mid Y_i$ is then normal with
$$E(b_i \mid Y_i) = E(b_i) + \mathrm{cov}(b_i, Y_i)\,\mathrm{var}(Y_i)^{-1}(Y_i - E(Y_i)) = 0 + g Z_i' V_i^{-1}(Y_i - X_i\beta)$$
Here $V_i = \mathrm{var}(Y_i)$ and
$$\mathrm{cov}(b_i, Y_i) = \mathrm{cov}(b_i, X_i\beta + Z_i b_i + \epsilon_i) = \mathrm{cov}(b_i, Z_i b_i) + \mathrm{cov}(b_i, \epsilon_i) = \mathrm{cov}(b_i, b_i)Z_i' = g Z_i'$$
So the result follows.
Best prediction:
$$E(b_i \mid Y_i) = g Z_i' V_i^{-1}(Y_i - X_i\beta)$$
Replacing the unknown parameters $g$, $V_i$ and $\beta$ by their ML or REML estimates gives
$$\hat b_i = \hat g Z_i' \hat V_i^{-1}(Y_i - X_i\hat\beta)$$
which is often called the empirical BLUP (best linear unbiased predictor). Thus the subject-specific trajectory $X_i\beta + Z_i b_i$ is predicted via
$$X_i\hat\beta + Z_i\hat b_i = X_i\hat\beta + Z_i\hat g Z_i'\hat V_i^{-1}(Y_i - X_i\hat\beta) = (I - Z_i\hat g Z_i'\hat V_i^{-1})X_i\hat\beta + Z_i\hat g Z_i'\hat V_i^{-1}Y_i = (I - W_i)X_i\hat\beta + W_i Y_i$$
where $W_i = Z_i\hat g Z_i'\hat V_i^{-1}$.
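The identity between the direct BLUP formula and the weighted-average form can be checked numerically. A minimal sketch with hypothetical estimates (the values of $\hat\beta$, $g$, $R_i$ and $y_i$ below are illustrative, not fitted):

```python
import numpy as np

t = np.arange(4.0)
X_i = np.column_stack([np.ones(4), t])
Z_i = X_i
g = np.array([[1.0, 0.0], [0.0, 0.25]])
R_i = 0.5 * np.eye(4)
V_i = Z_i @ g @ Z_i.T + R_i

beta_hat = np.array([10.0, 1.0])           # pretend this came from ML/REML
y_i = np.array([12.0, 13.5, 14.2, 16.0])

# empirical BLUP: b_hat_i = g Z_i' V_i^{-1} (y_i - X_i beta_hat)
b_hat = g @ Z_i.T @ np.linalg.solve(V_i, y_i - X_i @ beta_hat)

# predicted trajectory two ways: direct, and as (I - W_i) X_i beta_hat + W_i y_i
W_i = Z_i @ g @ Z_i.T @ np.linalg.inv(V_i)
traj_direct = X_i @ beta_hat + Z_i @ b_hat
traj_weighted = (np.eye(4) - W_i) @ (X_i @ beta_hat) + W_i @ y_i
```

The two trajectories agree exactly, showing the shrinkage interpretation: the subject's prediction is pulled between the population mean and the subject's own data.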
If you regard $W_i$ as a weight, then our best prediction is a weighted average of an individual's response $Y_i$ and the estimated mean response $X_i\hat\beta$, averaged over all individuals with the same design matrix $X_i$ as subject $i$. Note that the predictions in the one-way random effects ANOVA given before are special cases of the above results.
Tests on variance components
Suppose we wish to test, in the model
$$Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{0i} + b_{1i}t_{ij} + \epsilon_{ij}, \quad \begin{pmatrix}b_{0i}\\ b_{1i}\end{pmatrix} \sim N\left(0, \begin{bmatrix}\sigma_0^2 & \sigma_{01}\\ \sigma_{10} & \sigma_1^2\end{bmatrix}\right)$$
the hypothesis
$$H_0: \sigma_1^2 = 0, \quad \text{i.e., } b_{1i} = 0 \text{ for all } i$$
In this case $\beta_1$ is the slope of each individual's regression line. This test can be done informally using AIC/BIC (i.e., compare the null model to the alternative model with arbitrary $\sigma_1^2$), or using the LR test
$$LR = -2\log(\hat L_{red}/\hat L_{full})$$
where $\hat L_{red}$ and $\hat L_{full}$ are the maximized likelihoods for the reduced and full models.
Boundary correction
$H_0$ specifies that $\sigma_1^2$ lies on the boundary of the possible values for $\sigma_1^2$ (i.e., $\sigma_1^2 \ge 0$), so the standard theory for LR tests doesn't hold. One can show that under $H_0$,
$$LR \sim 0.5\chi_1^2 + 0.5\chi_2^2$$
i.e., the LR statistic has a null distribution that is a 50:50 mixture of $\chi_1^2$ and $\chi_2^2$ distributions, rather than a single $\chi^2$ distribution. This mixture result holds since we are comparing two nested models, one with one random effect $b_{0i}$, the other with two, $b_{0i}$ and $b_{1i}$. More generally, if we are comparing nested models with $q$ and $q+1$ correlated random effects (which still differ by one random effect), then
$$LR \sim 0.5\chi_q^2 + 0.5\chi_{q+1}^2$$
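The mixture p-value is simple to compute from the two chi-square tail probabilities. A small sketch (the LR value 6.5 is hypothetical):

```python
from scipy.stats import chi2

def boundary_lr_pvalue(lr_stat, q):
    """P-value using the 50:50 mixture 0.5*chi2_q + 0.5*chi2_{q+1},
    for nested models with q and q+1 correlated random effects."""
    return 0.5 * chi2.sf(lr_stat, df=q) + 0.5 * chi2.sf(lr_stat, df=q + 1)

# random intercept only vs random intercept + slope: q = 1
p = boundary_lr_pvalue(6.5, q=1)
```

By construction the mixture p-value lies between the $\chi_q^2$ and $\chi_{q+1}^2$ tail probabilities, so ignoring the boundary and using the larger degrees of freedom gives a conservative test.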
Time-dependent covariates
Mixed models allow for a variety of complications with longitudinal data, for example dropout/missingness. Covariates in the fixed effects can either be static (constant over time), such as baseline measurements, or time-dependent, such as time itself. In some cases, the treatment (or group) one is assigned changes with time. For example, you may be switched to the placebo group if you are having adverse reactions to a drug, or switched to a higher dose of a drug if a given dose is not effective. Some care is needed to ensure proper inferences with these complications.
Generalized estimating equations (GEEs), Liang and Zeger (1986)
- Apply to non-normal data, where a convenient joint (multivariate) probability distribution may not be available.
- The GEE approach bypasses the distribution problem by modeling only the mean function and the correlation function.
- An estimating equation is proposed for estimating the regression effects, and then appropriate statistical theory is applied to generate standard errors and procedures for inference.
Standard longitudinal model for normal responses
The standard model for longitudinal data can be written as $Y = X\beta + \epsilon$ with $\epsilon \sim N(0, \Sigma)$, so $Y \sim N(X\beta, \Sigma)$, where $\Sigma$ is block-diagonal with $i$th diagonal block $\Sigma_i$. We usually write $\Sigma = \sigma^2 R$, with $\mathrm{var}(y_{ij}) = \sigma^2$, $\mathrm{cov}(y_i) = \Sigma_i$ and $\Sigma_i = \sigma^2 R_i$. Equivalently, $y_i \sim N(X_i\beta, \sigma^2 R_i)$.
$$\frac{\partial L}{\partial \beta} = \frac{1}{\sigma^2}X'R^{-1}(y - \mu) = \frac{1}{\sigma^2}\sum_{j=1}^{k} X_j'R_j^{-1}(y_j - \mu_j) = 0 \quad (4)$$
$$\hat\beta = (X'R^{-1}X)^{-1}X'R^{-1}y = \left(\sum_{j=1}^{k} X_j'R_j^{-1}X_j\right)^{-1}\sum_{j=1}^{k} X_j'R_j^{-1}y_j \quad (5)$$
MLEs of $\sigma^2$ and $R$ can be computed, and plugging these into (5) gives the overall MLE of $\beta$.
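Equation (5) is a generalized least squares estimator, and it is a few lines of NumPy. A minimal sketch on simulated data (the design and coefficients below are hypothetical); with $R = I$ it reduces to ordinary least squares:

```python
import numpy as np

def gls_estimate(X, y, R):
    """beta_hat = (X' R^{-1} X)^{-1} X' R^{-1} y, as in equation (5)
    (the scalar sigma^2 cancels out)."""
    RiX = np.linalg.solve(R, X)           # R^{-1} X without forming R^{-1}
    return np.linalg.solve(X.T @ RiX, RiX.T @ y)

# with R = I this reduces to ordinary least squares
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(10), np.arange(10.0)])
y = X @ np.array([2.0, 0.5]) + rng.normal(size=10)
beta_gls = gls_estimate(X, y, np.eye(10))
```

Using `solve` rather than explicitly inverting $R$ is the standard numerically stable choice.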
Generalized linear models
$y_i \stackrel{indep}{\sim} \mathrm{Bin}(n_i, \mu_i)$. A glm has the following form, with some link function $g(\cdot)$:
$$g(\mu_i) = \eta_i = x_i'\beta$$
$$g(\mu_i) = \begin{cases}\log\left(\dfrac{\mu_i}{1-\mu_i}\right) & \text{logit link: logistic regression}\\ \Phi^{-1}(\mu_i) & \text{probit link: probit regression}\\ \log\{-\log(1-\mu_i)\} & \text{complementary log-log link}\end{cases}$$
$$\mu_i = g^{-1}(\eta_i) = \begin{cases}\exp(\eta_i)/(1+\exp(\eta_i)) & \text{logit link}\\ \Phi(\eta_i) & \text{probit link}\\ 1 - \exp\{-\exp(\eta_i)\} & \text{complementary log-log}\end{cases}$$
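The three inverse links can be written directly; a small sketch (function names are my own, not from any particular package):

```python
import numpy as np
from math import erf, sqrt

def inv_logit(eta):
    return np.exp(eta) / (1.0 + np.exp(eta))

def inv_probit(eta):                      # Phi(eta) via the error function
    return 0.5 * (1.0 + erf(eta / sqrt(2.0)))

def inv_cloglog(eta):
    return 1.0 - np.exp(-np.exp(eta))

# each inverse link undoes its link g(mu) = eta, e.g. for mu = 0.3:
mu = 0.3
eta_logit = np.log(mu / (1 - mu))         # logit link
eta_cll = np.log(-np.log(1 - mu))         # complementary log-log link
```

All three map the whole real line into (0, 1); they differ in tail behavior, with the complementary log-log being asymmetric about $\mu = 0.5$.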
General longitudinal model
In the general case, $y \sim N(X\beta, \Sigma)$ and $\hat\beta = (X'\hat\Sigma^{-1}X)^{-1}X'\hat\Sigma^{-1}y$ is the MLE of $\beta$, where $\hat\Sigma$ is the MLE of $\Sigma$. The key to generalizing glms to longitudinal responses is to modify equation (4) to accommodate $g(\mu_{ij}) = X_{ij}'\beta$ instead of $\mu_{ij} = X_{ij}'\beta$ for some link function $g(\cdot)$. If the responses within an individual were independent and satisfied an exponential family model, this would be a glm, and the score function for $\beta$ based solely on $y_i$ would be
$$\frac{\partial l_i}{\partial \beta} = X_i'W_i\Delta_i(y_i - \mu_i)$$
where $\Delta_i = \mathrm{diag}(g'(\mu_{ij}))$, $W_i = \mathrm{diag}\left(\dfrac{w_{ij}}{\phi V(\mu_{ij})g'(\mu_{ij})^2}\right)$, and $w_{ij}$ is the weight for the $j$th observation in $y_i$, i.e., $y_{ij}$.
Assuming responses between individuals are independent, the score function based on the sample is
$$\frac{\partial L}{\partial \beta} = \sum_{j=1}^{k}\frac{\partial l_j}{\partial \beta} = \sum_{j=1}^{k} X_j'W_j\Delta_j(y_j - \mu_j)$$
Recall the score function under normality,
$$0 = \frac{1}{\sigma^2}\sum_{j=1}^{k} X_j'R_j^{-1}(y_j - \mu_j)$$
where $R_j = I_{n_j}$ corresponds to the case where responses within an individual are independent. The idea behind GEEs for glms is to take the score function when responses within an individual are independent,
$$0 = \sum_{j=1}^{k} X_j'W_j\Delta_j(y_j - \mu_j)$$
and replace $W_j\Delta_j$ by a matrix that captures the within-individual correlation structure, but reduces to $W_j\Delta_j$ if responses within an individual are independent.
SAS
$$W_i\Delta_i = \mathrm{diag}\left(\frac{w_{ij}}{\phi V(\mu_{ij})g'(\mu_{ij})^2}\right)\mathrm{diag}\left(g'(\mu_{ij})\right) = \mathrm{diag}\left(\frac{w_{ij}}{\phi V(\mu_{ij})g'(\mu_{ij})}\right) = \mathrm{diag}\left(\frac{1}{g'(\mu_{ij})}\right)\mathrm{diag}\left(\frac{w_{ij}}{\phi V(\mu_{ij})}\right) = \Delta_i^{-1}V_i^{-1}$$
Recall for a glm that $\mathrm{var}(y_{ij}) = \phi V(\mu_{ij})/w_{ij}$, where $\phi$ is the dispersion parameter. So if responses within an individual were independent, then $\mathrm{var}(y_i) = V_i = \mathrm{diag}(\phi V(\mu_{ij})/w_{ij})$, and the score function for $\beta$ for a glm when responses within an individual are independent reduces to
$$\frac{\partial L}{\partial \beta} = \sum_{j=1}^{k} X_j'W_j\Delta_j(y_j - \mu_j) = \sum_{j=1}^{k} X_j'\Delta_j^{-1}V_j^{-1}(y_j - \mu_j) = \sum_{j=1}^{k} D_j'V_j^{-1}(y_j - \mu_j) = 0$$
$D_j$ depends on $X_j$ and the link function, but not the variance function.
GEE extension
$$\frac{\partial L}{\partial \beta} = \sum_{j=1}^{k} D_j'V_j^{-1}(y_j - \mu_j) = 0 \quad (6)$$
The GEE extension replaces $V_i$ as defined above by a covariance matrix for $y_i$ that reflects the correlation structure of a longitudinal response vector:
$$\mathrm{Var}(y_i) = \phi A_i^{1/2}(w_i^*)^{-1/2}R_i(\alpha)(w_i^*)^{-1/2}A_i^{1/2}$$
where $A_i = \mathrm{diag}(V(\mu_{ij}))$ and $w_i^* = \mathrm{diag}(w_{ij})$, so that
$$A_i^{1/2} = \mathrm{diag}\left(\sqrt{V(\mu_{ij})}\right), \quad A_i^{1/2}(w_i^*)^{-1/2} = \mathrm{diag}\left(\sqrt{\frac{V(\mu_{ij})}{w_{ij}}}\right)$$
$R_i(\alpha)$ is an $n_i \times n_i$ correlation matrix for $y_i$ which depends on a set $\alpha$ of correlation parameters. If responses within an individual are independent, $R_i(\alpha) = I_{n_i}$, and thus
$$V_i = \phi A_i^{1/2}(w_i^*)^{-1/2}(w_i^*)^{-1/2}A_i^{1/2} = \mathrm{diag}\left(\frac{\phi V(\mu_{ij})}{w_{ij}}\right)$$
as before.
Several comments
Consider equation (6),
$$\frac{\partial L}{\partial \beta} = \sum_{j=1}^{k} D_j'V_j^{-1}(y_j - \mu_j) = 0$$
where $D_j = D_j(\beta)$, $V_j = V_j(\alpha, \beta, \phi)$ and $\mu_j = \mu_j(\beta)$.
- For a normal response with $V_i = \sigma^2 R_i(\alpha)$, the GEE estimate of $\beta$ is the MLE when $\sigma^2$ and $\alpha$ are estimated by ML (and the REML estimate if $\sigma^2$ and $\alpha$ are estimated by REML).
- GEE estimators are MLEs when $R_i(\alpha) = I_{n_i}$ and responses are from an exponential family distribution.
- The scale parameter $\phi$ is optional (not required for some models).
Choice of $R_i(\alpha)$
- Independent: $R_i(\alpha) = I_{n_i}$
- Exchangeable:
$$R_i(\alpha) = \begin{bmatrix}1 & \alpha & \cdots & \alpha\\ \alpha & 1 & \cdots & \alpha\\ \vdots & & \ddots & \vdots\\ \alpha & \alpha & \cdots & 1\end{bmatrix}$$
- Unstructured: $\mathrm{corr}(y_{ij}, y_{ik}) = \alpha_{jk}$
- AR(1): $\mathrm{corr}(y_{ij}, y_{ik}) = \alpha^{|k-j|}$
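The exchangeable and AR(1) structures are easy to construct for inspection. A small sketch (function names are my own):

```python
import numpy as np

def exchangeable(n, alpha):
    """Exchangeable working correlation: 1 on the diagonal, alpha elsewhere."""
    return (1.0 - alpha) * np.eye(n) + alpha * np.ones((n, n))

def ar1(n, alpha):
    """AR(1) working correlation: corr(y_ij, y_ik) = alpha**|k - j|."""
    idx = np.arange(n)
    return alpha ** np.abs(idx[:, None] - idx[None, :])

R_ex = exchangeable(4, 0.3)
R_ar = ar1(4, 0.5)
```

Exchangeable uses one parameter regardless of $n_i$; AR(1) also uses one but lets the correlation decay with the time separation, which is often more plausible for longitudinal measurements.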
Estimators
We need to estimate $\alpha$ to compute the GEE; take exchangeable $R_i(\alpha)$ as an example. Let $y_i = (y_{i1}, y_{i2}, \ldots, y_{in_i})'$ with $\mathrm{corr}(y_i) = R_i(\alpha)$ exchangeable. Define the scaled Pearson residuals
$$\Gamma_{ij} = \frac{y_{ij} - \mu_{ij}}{\sqrt{\mathrm{var}(y_{ij})}} = \frac{y_{ij} - \mu_{ij}}{\sqrt{\phi V(\mu_{ij})/w_{ij}}}$$
Then $\alpha = \rho_{jk} = \mathrm{corr}(y_{ij}, y_{ik}) = E(\Gamma_{ij}\Gamma_{ik})$, $j \ne k$. If the parameters in $\Gamma_{ij}$ were known, then each product $\Gamma_{ij}\Gamma_{ik}$ would be an unbiased estimator of $\alpha$. There are $n_i(n_i-1)/2$ such pairs for individual $i$. Adding up all pairs over all subjects and dividing by the total number of pairs $N = 0.5\sum_{i=1}^{m} n_i(n_i-1)$ gives an estimate of $\alpha$:
$$\hat\alpha = \frac{1}{N}\sum_{i=1}^{m}\sum_{j<k}\Gamma_{ij}\Gamma_{ik}$$
SAS uses unscaled residuals $e_{ij} = \sqrt{\phi}\,\Gamma_{ij}$, so
$$e_{ij} = \frac{y_{ij} - \mu_{ij}}{\sqrt{V(\mu_{ij})/w_{ij}}}, \quad \hat\alpha = \frac{1}{N\phi}\sum_{i=1}^{m}\sum_{j<k} e_{ij}e_{ik} \quad (7)$$
Note that $\hat\alpha$ depends on $\phi$ and $\beta$ (through $\mu_{ij}$): $\hat\alpha = \hat\alpha(\phi, \beta)$. Since $\mathrm{var}(\Gamma_{ij}) = 1$, we have $E(e_{ij}^2) = \mathrm{var}(e_{ij}) = \phi$, so
$$\hat\phi = \frac{1}{N^*}\sum_{i=1}^{m}\sum_{j=1}^{n_i} e_{ij}^2, \quad N^* = \sum_{i=1}^{m} n_i \quad (8)$$
Since $e_{ij}$ depends on $\beta$, $\hat\phi$ is a function of $\beta$, say $\hat\phi(\beta)$.
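The moment estimators in (7) and (8) can be sketched directly from a vector of unscaled residuals. A minimal implementation (degrees-of-freedom corrections used by some software are omitted, matching the simplified formulas above; the function name is my own):

```python
import numpy as np

def moment_estimates(e, n_per_subject):
    """Moment estimators phi_hat (8) and exchangeable alpha_hat (7) from
    the unscaled residuals e_ij, given as one flat array with each
    subject's n_i residuals in order."""
    e = np.asarray(e, dtype=float)
    groups = np.split(e, np.cumsum(n_per_subject)[:-1])
    phi_hat = np.sum(e ** 2) / sum(n_per_subject)
    n_pairs = sum(ni * (ni - 1) // 2 for ni in n_per_subject)
    cross = sum(np.sum(np.outer(ei, ei)[np.triu_indices(len(ei), 1)])
                for ei in groups)
    alpha_hat = cross / (n_pairs * phi_hat)
    return phi_hat, alpha_hat

# tiny worked check: two subjects with perfectly correlated residuals
phi_hat, alpha_hat = moment_estimates([1.0, 1.0, 1.0, -1.0, -1.0], [3, 2])
# phi_hat = 1.0 and alpha_hat = 1.0 here
```

The upper-triangular cross-products implement the $j < k$ pairs within each subject, and $\hat\alpha$ rescales their average by $\hat\phi$ as in (7).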
Fitting algorithm
1. Get an initial estimate of $\beta$, say $\hat\beta_0$, by solving equation (6) assuming an independence structure (i.e., $R_i(\alpha) = I_{n_i}$); this doesn't require estimates of $\phi$ or $\alpha$.
2. Compute $\hat\mu_{ij} = \mu_{ij}(\hat\beta_0)$ and $\hat e_{ij} = \dfrac{y_{ij} - \hat\mu_{ij}}{\sqrt{V(\hat\mu_{ij})/w_{ij}}}$, and plug into (8) to get $\hat\phi_0 = \hat\phi(\hat\beta_0)$ and into (7), with $\hat\phi_0$, to get $\hat\alpha_0 = \hat\alpha(\hat\phi_0, \hat\beta_0)$.
3. Compute $V_i = V_i(\hat\alpha_0, \hat\beta_0, \hat\phi_0)$ and $D_i = D_i(\hat\beta_0)$ and plug into (6). Solve the GEE score function (6) for $\beta$, giving $\hat\beta^{(1)}$.
4. Repeat steps 2 and 3 with $\hat\beta^{(1)}$; iterate until (hopefully) convergence.
5. Report the final estimates $\hat\beta$, $\hat\alpha$ and $\hat\phi$.

The same idea applies to other correlation structures.
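The whole cycle can be sketched for a logit link with exchangeable working correlation. This is a minimal illustration on simulated data, not a production fitter: it assumes prior weights $w_{ij} = 1$ and balanced clusters, takes one Fisher-scoring step per iteration, and computes no robust (sandwich) standard errors; all names and simulation settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical simulated clustered binary data: m subjects, n visits each
m, n, p = 200, 4, 2
X = rng.normal(size=(m, n, p))
beta_true = np.array([1.0, -0.5])
u = rng.normal(scale=0.7, size=(m, 1))          # shared subject effect -> correlation
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta_true + u))))

def fit_gee_logit(X, y, n_iter=25, tol=1e-8):
    """Minimal GEE sketch: logit link, exchangeable working correlation."""
    m, n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1 / (1 + np.exp(-(X @ beta)))
        # step 2: moment estimates of phi and alpha from unscaled residuals
        e = (y - mu) / np.sqrt(mu * (1 - mu))   # V(mu) = mu(1 - mu)
        phi = np.mean(e ** 2)
        cross = sum(np.sum(np.outer(ei, ei)[np.triu_indices(n, 1)]) for ei in e)
        alpha = cross / (m * n * (n - 1) / 2 * phi)
        R = (1 - alpha) * np.eye(n) + alpha * np.ones((n, n))
        # step 3: one Fisher-scoring update of the GEE score (6)
        score, info = np.zeros(p), np.zeros((p, p))
        for i in range(m):
            a = mu[i] * (1 - mu[i])
            V_i = phi * np.sqrt(np.outer(a, a)) * R   # A^{1/2} R A^{1/2}
            D_i = a[:, None] * X[i]                   # dmu/dbeta' = diag(V(mu)) X_i
            Vi_inv = np.linalg.inv(V_i)
            score += D_i.T @ Vi_inv @ (y[i] - mu[i])
            info += D_i.T @ Vi_inv @ D_i
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:               # step 4: iterate to convergence
            break
    return beta, alpha

beta_hat, alpha_hat = fit_gee_logit(X, y)
```

In practice one would use an established implementation (e.g. a GEE routine in standard statistical software) rather than this sketch, but the loop makes the alternation between updating $(\hat\phi, \hat\alpha)$ and updating $\hat\beta$ explicit.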
The working correlation structure
It is difficult to propose a suitable correlation structure $R_i(\alpha)$ for non-normal longitudinal data. The GEE approach has some good properties for dealing with this problem:
- $\beta$ is consistently estimated even if $R_i(\alpha)$ is incorrectly specified.
- Standard errors and inference procedures are available that account for $R_i(\alpha)$ possibly being misspecified.
- In standard ML-based inference for normally distributed repeated measures, the ML estimate of $\beta$ is also consistent even if $R_i(\alpha)$ is misspecified, but the ML-based standard errors and tests are not robust to misspecification of $R_i(\alpha)$. This makes GEE-based inference attractive even for normal responses.

Because $R_i(\alpha)$ is not necessarily taken seriously, it is called the working correlation structure. But estimators with less variability will be produced by using an $R_i(\alpha)$ that mimics the true correlation.
Distribution of GEE estimators $\hat\beta$
Under suitable regularity conditions, $\hat\beta$ is approximately normally distributed: $\hat\beta \sim N(\beta, \Sigma)$. Given an estimator $\hat\Sigma$ of $\Sigma$, we can construct CIs, tests and inferences.
Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous
More informationSummer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.
Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationModeling the Mean: Response Profiles v. Parametric Curves
Modeling the Mean: Response Profiles v. Parametric Curves Jamie Monogan University of Georgia Escuela de Invierno en Métodos y Análisis de Datos Universidad Católica del Uruguay Jamie Monogan (UGA) Modeling
More informationLecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is
Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y
More informationECON 4160, Autumn term Lecture 1
ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More informationStatistical Methods III Statistics 212. Problem Set 2 - Answer Key
Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423
More informationFigure 36: Respiratory infection versus time for the first 49 children.
y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects
More informationGeneralized Linear Models. Last time: Background & motivation for moving beyond linear
Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered
More informationSTA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random
STA 216: GENERALIZED LINEAR MODELS Lecture 1. Review and Introduction Much of statistics is based on the assumption that random variables are continuous & normally distributed. Normal linear regression
More informationMultinomial Logistic Regression Models
Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word
More information36-720: Linear Mixed Models
36-720: Linear Mixed Models Brian Junker October 8, 2007 Review: Linear Mixed Models (LMM s) Bayesian Analogues Facilities in R Computational Notes Predictors and Residuals Examples [Related to Christensen
More informationWU Weiterbildung. Linear Mixed Models
Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes
More informationSimple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.
Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1
More informationVarious types of likelihood
Various types of likelihood 1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood 2. semi-parametric likelihood, partial likelihood 3. empirical likelihood,
More informationStat 710: Mathematical Statistics Lecture 12
Stat 710: Mathematical Statistics Lecture 12 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 12 Feb 18, 2009 1 / 11 Lecture 12:
More informationGeneralized linear models
Generalized linear models Søren Højsgaard Department of Mathematical Sciences Aalborg University, Denmark October 29, 202 Contents Densities for generalized linear models. Mean and variance...............................
More informationLinear Methods for Prediction
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationGeneralized Linear Models
Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More information9.1 Orthogonal factor model.
36 Chapter 9 Factor Analysis Factor analysis may be viewed as a refinement of the principal component analysis The objective is, like the PC analysis, to describe the relevant variables in study in terms
More informationLinear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52
Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two
More informationStat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010
1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of
More informationMIXED MODELS THE GENERAL MIXED MODEL
MIXED MODELS This chapter introduces best linear unbiased prediction (BLUP), a general method for predicting random effects, while Chapter 27 is concerned with the estimation of variances by restricted
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationStatistical Distribution Assumptions of General Linear Models
Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions
More informationRegression Models - Introduction
Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent
More informationSTAT 525 Fall Final exam. Tuesday December 14, 2010
STAT 525 Fall 2010 Final exam Tuesday December 14, 2010 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will
More informationHigh-Throughput Sequencing Course
High-Throughput Sequencing Course DESeq Model for RNA-Seq Biostatistics and Bioinformatics Summer 2017 Outline Review: Standard linear regression model (e.g., to model gene expression as function of an
More informationAMS-207: Bayesian Statistics
Linear Regression How does a quantity y, vary as a function of another quantity, or vector of quantities x? We are interested in p(y θ, x) under a model in which n observations (x i, y i ) are exchangeable.
More informationRobust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study
Science Journal of Applied Mathematics and Statistics 2014; 2(1): 20-25 Published online February 20, 2014 (http://www.sciencepublishinggroup.com/j/sjams) doi: 10.11648/j.sjams.20140201.13 Robust covariance
More information