The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models
|
|
- Pamela Cummings
- 5 years ago
- Views:
Transcription
1 The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models John M. Neuhaus Charles E. McCulloch Division of Biostatistics University of California, San Francisco
2 Outline Motivation/Background 1 Motivation/Background 2 3
3 Antibotic Rx Study in acute care (Gonzales et al, Academic Emergency Medicine 2006) Study objectives: 1) assess intervention to reduce inappropriate antibiotic prescription rates - Rx for antibiotic-nonresponsive conditions (e.g. acute bronchitis); 2) identify poorly performing providers Design: multiple responses from 720 providers (clusters) Large between-provider variability in RX rates & cluster sizes - borrow strength between providers Popular approach: Use estimated regression coefficients & predicted random effects from GLMMs
4 Generalized Linear Mixed Models Y ij b i, X ij GLM E(Y ij X ij, b i ) = h 1 (β 0 + b i + β 1 X ij ) h( ) link function, e.g. identity, logit, probit i=1,..., m : subjects; j=1,..., n i : repeats per cluster When b G T, Y i1,..., Y ni cond ly indep given b i & b i indep of X m Likelihood: L(β, G T ) = n i i=1 j=1 f (Y ij b, X ij ) dg T (b)
5 Misspecified random effects distributions G T typically unknown & we might assume b G F G T. Often G F = N(0, σb 2 ) (e.g. Stata default) Two general forms of misspecified G T : Incorrect distributional shape Incorrectly assuming b indep of X 1 E(b X) = µ M (X), Neuhaus & McCulloch (2006) 2 var(b X) = µ V (X), Heagerty & Kurland (2001)
6 Worry about correctly specifying the shape of G T? Some say yes - Motivation for more flexible G F, e.g. mixture of normals (Chen et al 2002), nonparametric G F (Lesperance & Kalbfleisch 1992) & specification tests (Tchetgen & Coull 2006) Others show little bias in slopes", ˆβ 1 (Neuhaus et al 1992) Little work on random effects prediction under misspecification
7 Bias with misspecified shape of G? Linear mixed effects model: Y ij b i = β 0 + σ b b i + β 1 X ij + e ij b e, b G T, E(b) = 0, var(b) = 1, var(e) = σ 2 e cov(y i1,..., Y ini ) = V = σ 2 ei + σ 2 b J, indep of G T E{ ˆβ GLS } = E{(X T V 1 X) 1 X T V 1 Y } = β 1, indep of G F Unbiased estimators of slopes, β 1, with misspecified shape
8 Assessing consequences of random effects misspecification Follow theory on inference with misspecified models (e.g. White 1994) - let ξ = (β, θ) True : f T (Y i X i ; ξ) = Fitted : f F (Y i X i ; ξ ) = n i f (Y ij b, X ij ; β) dg T (b; θ) j=1 n i f (Y ij b, X ij ; β ) dg F (b; θ ) j=1 MLE" ˆξ ξ minimizes E X E Y X log {f T (y X; ξ)/f F (y X; ξ )} Kullback-Leibler Divergence
9 Misspecified models Minimizing K-L Divergence wrt ξ = (ξ 1,..., ξ q) yields E X y λ(y X, ξ ) ξ { k n i f (Y ij b, X ij ; β ) dg F (b; θ )}dy = 0 j=1 where λ(y X, ξ ) = f T (Y = y X, ξ)/f F (Y = y X, ξ ) Note: ξ that yield λ(y X, ξ ) = 1 X solve system Can solve for ξ analytically in some simple cases, i.e. match fitted & true densities at all points X
10 Matched pairs, solutions under misspecification For some link functions can find analytic solution with β 1 = β 1, but β 0 β 0 & σ σ Special cases: ˆβ 1 consistent, but ˆβ 0 & ˆσ inconsistent when G F G T
11 Logistic link Motivation/Background Binary matched pairs - when G F generates a wide range of pr(y 1, y 2 ) (e.g. G F = Normal) ˆβ 1 is consistent (& ˆβ 1 ˆβ CML ) (Neuhaus et al 1994) G F unspecified, ˆβ 1 ˆβ CML (Lindsay et al 1991) General X - when true β 1 = 0, β 1 = 0 solves Kullback-Leibler minimizing equations ˆβ 1 consistent Typically cannot solve Kullback-Leibler system analytically
12 Numerical solution of Kullback-Leibler equations Numerically find ξ that minimizes E X E Y X log {f T (y ξ, X)/f F (y ξ, X)} using routines in R Numerically evaluate integrals Numerically minimize Kullback-Leibler divergence Allows evaluation of bias over wide range of parameter values, random effect & covariate distributions
13 Numerical assessment of bias True: logit {pr(y ij = 1 X ij, b i )} = β 0 + σb i + β 1 X ij True G T : b exp(1), shifted to E(b) = 0 Fitted G F : b N(0, 1) X varied within clusters X = (0,.25,.5,.75, 1), n i = 5 Range of parameter values: β 0 : (-3,-1) by 0.1 β 1 : (0.5,2) by 0.1 σ: (0.2,2.5) by 0.1 Numerically solved for β 0, β 1, σ that minimize Kullback-Leibler divergence
14 10 Bias in beta1, True = exponential, Fitted = Normal beta0: 3 beta0: 2 beta0: 1 % Bias sigma: 0.2 sigma: 1 sigma: 2.5 beta True beta1
15 50 40 Bias in sigma, true = exponential, fitted = Normal beta0: 3 beta0: 2 beta0: 1 % Bias beta1: 0.5 beta1: 1 beta1: 1.9 beta True sigma
16 % Bias Bias in beta0, true = exponential, fitted = Normal sigma: 0.2 sigma: 1 sigma: True beta0 beta1: 0.5 beta1: 1 beta1: 1.9 sigma
17 Logistic link - further results Find approximate solution of Kullback-Leibler equations using Taylor series methods (Neuhaus et al 1992) Approximate solution & simulations indicate E( ˆβ 1 ) β 1 when G T misspecified, but biased ˆβ 0 & ˆσ b
18 Antibiotic Rx of acute respiratory infections Study objective: assess intervention to reduce inappropriate antibiotic prescription rates - Rx for antibiotic-nonresponsive conditions (e.g. acute bronchitis) Cluster: visits to a provider for antibiotic-nonresponsive conditions - number of visits varied from 1 to 71 Design: Baseline Y intervention Post-Y Binary outcome: prescribed antibiotics for nonresponsive conditions (yes/no) - measured at each visit Covariates: intervention, time, time*intervention, provider type, illness duration prior to visit
19 Antibiotic Rx study - Fitted models logit pr(y ij = 1 b, X ij ) = β 0 + exp(log σ b ) b + β 1 X ij where b G with E(b) = 0, Var(b) = 1 G 1 = N(0,1) G 2 = Exponential (1) standardized to E(b) = 0 G 3 = Nonparametric, 4 support points (GLLAMM)
20 Antibiotic Rx study - Results Estimated parameters from mixed-effects logistic models with different random effects distributions ˆβ, (SE) G β 0 TREAT TIME TRT*TIME log σ b Normal (0.53) (0.32) (0.32) (0.20) (0.08) Exponential (0.52) (0.31) (0.31) (0.19) (0.09) NP (4) (0.51) (0.32) (0.31) (0.19)
21 for fixed effects estimation 1 Misspecifying shape of G T yields accurate slopes" ˆβ 1 over wide range of β 0, σ 2 b 2 Misspecifying shape of G T can yield biased estimates of β 0 & σb 2 %bias in ˆβ 0 with σb 2 & can be substantial % bias in ˆσ can also be large 3 When interest focuses on slope parameters, misspecifying shape has little effect on bias
22 of random effects Useful method: Predict b by b that minimize E[ b b] 2 = ( b b) 2 f (b, y) dy db, where f (b, y) is the joint density of b & y 1,..., y n Can show: b i = E(b i y i1,..., y ini ) Depends on f (b, y), hence on G Misspecifying G may produce inaccurate b
23 Linear Mixed-Effects Model Y ij = β 0 + b i + β 1 X ij + ɛ ij, b i N(0, σ 2 b ), ɛ ij N(0, σ 2 ɛ ) i = 1,..., m; j = 1,..., n i Estimated BLUP: Estimated b = ˆb i = ˆD i Z T i ˆV 1 i (y i X i ˆβ) ˆD i = Cov(b ˆ i ) ˆV i = var(y ˆ i ) Expression depends on joint normality of b & y Misspecification of distribution of b may produce inaccurate b
24 Theory for Linear Mixed-Effects Model (simple case) Y ij = µ + b i + ɛ ij, i = 1,..., m; j = 1,..., n i b i N(0, σ 2 b ), ɛ ij N(0, σ 2 ɛ ), ɛ ij b i, µ, σ 2 b, σ2 ɛ known Best Linear Unbiased Predictor (BLUP) is b i that minimizes E[( b i b i ) 2 ] Simple calculations show that b i = E[b i y] = σ 2 b σ 2 b + σ2 ɛ /n i (Ȳi µ) = σ 2 b σ 2 b + σ2 ɛ /n i (b i + ɛ i )
25 Theory for LME (cont.) Conditional on b i, the Y ij are independent N(µ + b i, σ 2 ɛ ) σ b i b i N µ b b 2 = σb 2 + b i, σ2 ɛ /n i ( ) 2 σb 2 σ 2 ɛ σb 2 + σ2 ɛ /n i n i Thus, b i is conditionally biased However, since calculations are b i, result does not depend on distribution of b i i.e. conditional bias does not depend on distribution of b i
26 Features of b i As n i, b i = σ 2 b σ 2 b + σ2 ɛ /n i (b i + ɛ i ) σ2 b σ 2 b (b i + 0) = b i b i converges to true value as n i Such asymptotics typically not of interest, rather as m
27 Density of b i What does the density of b i look like? Also, what if b i misspecified to be normal? If n i large, then each b i b i density is approximately correct, irrespective of assumed density What if n i not large, usual case of interest? Then density of b i is convolution of true density of b i with the conditional density of b i given b i
28 Density of b i What does the density of b i look like? Also, what if b i misspecified to be normal? If n i large, then each b i b i density is approximately correct, irrespective of assumed density What if n i not large, usual case of interest? Then density of b i is convolution of true density of b i with the conditional density of b i given b i
29 Density of b i What does the density of b i look like? Also, what if b i misspecified to be normal? If n i large, then each b i b i density is approximately correct, irrespective of assumed density What if n i not large, usual case of interest? Then density of b i is convolution of true density of b i with the conditional density of b i given b i
30 Density of b i What does the density of b i look like? Also, what if b i misspecified to be normal? If n i large, then each b i b i density is approximately correct, irrespective of assumed density What if n i not large, usual case of interest? Then density of b i is convolution of true density of b i with the conditional density of b i given b i
31 Density of b i (cont) Suppose b i Exponential(1), shifted so that E(b i ) = 0 Density of b i (under normal assumption) is 0 exp{ ( b µ b) 2 n i /(2σ 2 ɛ )} exp( b 1) d b Straightforward to evaluate numerically
32 Plot of BLUP Densities for Cluster Size 2 Density Assumed Normal Exponential BLUP
33 Plot of BLUP Densities for Cluster Size 4 Density Assumed Normal Exponential BLUP
34 Plot of BLUP Densities for Cluster Size 6 Density Assumed Normal Exponential BLUP
35 Plot of BLUP Densities for Cluster Size 8 Density Assumed Normal Exponential BLUP
36 Plot of BLUP Densities for Cluster Size 10 Density Assumed Normal Exponential BLUP
37 Plot of BLUP Densities for Cluster Size 16 Density Assumed Normal Exponential BLUP
38 Plot of BLUP Densities for Cluster Size 20 Density Assumed Normal Exponential BLUP
39 BLUP density plot findings Density of BLUPs inherits much of its shape from assumed density Doesn t reflect shape of true density of the random effects until cluster sizes get large
40 Best Predicted values Under LME, with e ij N(0, σ 2 e), & true density f bi (b i ), b i = E[b i Y i ] = b i exp{ n i 2σ 2 ɛ exp{ n i 2σ 2 ɛ (ν i b i ) 2 }f bi (b i )db i (ν i b i ) 2 }f bi (b i )db i where ν i = (Ȳi x i β) Given ν i, compute b i for various true f bi (b i ): Normal, T 3, Exponential (1), Mixture of Normals
41 BP plots Motivation/Background Best Within Cluster Deviation Normal Mixture T w/ 3 d.f. Exponential
42 Order preservation for LME Can show b i ν i > 0 for any f bi (b i ) transformation ν i b i monotone Given n i, BP s under any assumed f ordered based on ν i Let f 1 (b i ), f 2 (b i ) denote assumed random effects densities b i1 (ν i ) & b i2 (ν i ), predictions from each value ν i Order the pairs [ b i1 (ν i ), b i2 (ν i )] by ν i Since b i ν i > 0, pairs also ordered by b i1 (ν i ) & b i2 (ν i ). Thus, all pairs concordant & Kendall s τ between b i1 (ν i ) and b i2 (ν i ) is 1
43 Mean square error of prediction Given σ 2 b, σ2 e, we calculated b assuming b N(0,1) b Exponential b Mixture of two N(0,1) Using expressions for b we calculated MSE = E[( b b) 2 ] for various true f b (b) using numerical integration
44 True Gaussian Random Effects True Exponential Random Effects True Mixture Random Effects MSEP Random Effects SD 0.5 MSEP Random Effects SD 0.5 MSEP Random Effects SD Sample Size Per Cluster Sample Size Per Cluster Sample Size Per Cluster MSEP Random Effects SD 1 MSEP Random Effects SD 1 MSEP Random Effects SD Sample Size Per Cluster Sample Size Per Cluster Sample Size Per Cluster MSEP Random Effects SD 2 MSEP Random Effects SD 2 MSEP Random Effects SD Sample Size Per Cluster Sample Size Per Cluster Sample Size Per Cluster Assumed distributions: Solid line/square=gaussian, dotted line/circle=exponential, dashed line/x=mixture
45 Simulations Motivation/Background Evaluate performance of BP s when all parameters estimated Two true & assumed f (b), i.e. 4 true/assumed settings b N(0, 1) b Tukey(g = 0.5, h = 0.1) Linear mixed effects & mixed effects logistic models Linear predictor: η ij = 2 + b i + x between + x within σb 2 = 1, σ2 e = 1 for LME Calculated MSE ˆ = (1/mK ) K m k=1 i=1 ( b ki b ki ) 2 K = 1000, m = 100, cluster sizes=2, 4, 6, 10, 20, 40
46 Percentiles of N(0,1) & standardized Tukey(0.5,0.1) Percentile N(0,1) Standardized Tukey(0.5,0,1).1% % % % % % % % % % %
47 Continuous Outcome Binary Outcome MSEP True Gaussian MSEP True Gaussian Cluster size Cluster size MSEP True Tukey MSEP True Tukey Cluster size Cluster size Assumed distributions: Solid line/square=gaussian, dotted line/circle=tukey
48 Antibiotic Rx study - Fitted models Objective: identify poorly performing providers (i.e. large predicted random effects) logit pr(y ij = 1 b, X ij ) = β 0 + exp(log σ b ) b + β 1 X ij where b G with E(b) = 0, Var(b) = 1 G 1 = N(0,1) G 2 = Exponential (1) standardized to E(b) = 0 Compute b that maximizes log{f (y i1,..., y ini X i1,..., X ini, β, b i ) g(b i θ)} Fit models using PROC NLMIXED in SAS
49 b.norm b.exp Predicted random effects from normal & exponential Spearman corr=0.99
50 of effects on prediction 1 Predicted values of random effects show modest sensitivity to assumed distributional shape. 2 Distribution shape of BLUPs often not reflective of true random effects distribution. 3 Ranking of predicted values is little affected. 4 Misspecified shape can produce modest increases in MSE of prediction.
51 Motivation/Background Misspecifying the shape of the random effects produces little bias in estimated covariate effects slight deterioration in random effects prediction. Stata default of b N(0, σb 2 ) yields accurate estimates of covariate effects & reasonably precise predicted random effects
Stat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationFor more information about how to cite these materials visit
Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
More informationBiostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE
Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective
More informationRobustness to Parametric Assumptions in Missing Data Models
Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice
More informationType I and type II error under random-effects misspecification in generalized linear mixed models Link Peer-reviewed author version
Type I and type II error under random-effects misspecification in generalized linear mixed models Link Peer-reviewed author version Made available by Hasselt University Library in Document Server@UHasselt
More informationLinear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.
Linear Mixed Models One-way layout Y = Xβ + Zb + ɛ where X and Z are specified design matrices, β is a vector of fixed effect coefficients, b and ɛ are random, mean zero, Gaussian if needed. Usually think
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationFractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling
Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction
More informationBias Variance Trade-off
Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]
More informationStandard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j
Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )
More informationT E C H N I C A L R E P O R T A SANDWICH-ESTIMATOR TEST FOR MISSPECIFICATION IN MIXED-EFFECTS MODELS. LITIERE S., ALONSO A., and G.
T E C H N I C A L R E P O R T 0658 A SANDWICH-ESTIMATOR TEST FOR MISSPECIFICATION IN MIXED-EFFECTS MODELS LITIERE S., ALONSO A., and G. MOLENBERGHS * I A P S T A T I S T I C S N E T W O R K INTERUNIVERSITY
More informationGeneralized Linear Models. Kurt Hornik
Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general
More informationLogistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20
Logistic regression 11 Nov 2010 Logistic regression (EPFL) Applied Statistics 11 Nov 2010 1 / 20 Modeling overview Want to capture important features of the relationship between a (set of) variable(s)
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationMulti-level Models: Idea
Review of 140.656 Review Introduction to multi-level models The two-stage normal-normal model Two-stage linear models with random effects Three-stage linear models Two-stage logistic regression with random
More informationSpatio-Temporal Latent Variable Models: A Potential Waste of Space and Time?
Spatio-Temporal Latent Variable Models: A Potential Waste of Space and Time? Francis K.C. Hui (Australian National University) Nicole Hill (Institute of Marine and Antarctic Studies) A.H. Welsh (Australian
More informationSemiparametric Generalized Linear Models
Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationSTA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random
STA 216: GENERALIZED LINEAR MODELS Lecture 1. Review and Introduction Much of statistics is based on the assumption that random variables are continuous & normally distributed. Normal linear regression
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationSemiparametric Mixed Effects Models with Flexible Random Effects Distribution
Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,
More informationGeneralized Linear Models. Last time: Background & motivation for moving beyond linear
Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationThe impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference
The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference An application to longitudinal modeling Brianna Heggeseth with Nicholas Jewell Department of Statistics
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationPropensity Score Weighting with Multilevel Data
Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative
More informationGeneralized Linear Models Introduction
Generalized Linear Models Introduction Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Linear Models For many problems, standard linear regression approaches don t work. Sometimes,
More informationLecture 5: LDA and Logistic Regression
Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationWeek 2: Review of probability and statistics
Week 2: Review of probability and statistics Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED
More informationBayesian estimation of the discrepancy with misspecified parametric models
Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012
More informationSTA 216, GLM, Lecture 16. October 29, 2007
STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural
More informationSome New Methods for Latent Variable Models and Survival Analysis. Latent-Model Robustness in Structural Measurement Error Models.
Some New Methods for Latent Variable Models and Survival Analysis Marie Davidian Department of Statistics North Carolina State University 1. Introduction Outline 3. Empirically checking latent-model robustness
More informationLecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression
Lecture notes to Chapter 11, Regression with binary dependent variables - probit and logit regression Tore Schweder October 28, 2011 Outline Examples of binary respons variables Probit and logit - examples
More informationDensity estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas
0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity
More informationQUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS
Statistica Sinica 15(05), 1015-1032 QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS Scarlett L. Bellamy 1, Yi Li 2, Xihong
More informationPQL Estimation Biases in Generalized Linear Mixed Models
PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized
More information,..., θ(2),..., θ(n)
Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.
More informationA measurement error model approach to small area estimation
A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion
More informationGeneralized Linear Models
Generalized Linear Models Advanced Methods for Data Analysis (36-402/36-608 Spring 2014 1 Generalized linear models 1.1 Introduction: two regressions So far we ve seen two canonical settings for regression.
More informationTECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study
TECHNICAL REPORT # 59 MAY 2013 Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study Sergey Tarima, Peng He, Tao Wang, Aniko Szabo Division of Biostatistics,
More informationAdvanced Quantitative Methods: maximum likelihood
Advanced Quantitative Methods: Maximum Likelihood University College Dublin 4 March 2014 1 2 3 4 5 6 Outline 1 2 3 4 5 6 of straight lines y = 1 2 x + 2 dy dx = 1 2 of curves y = x 2 4x + 5 of curves y
More informationDiscussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs
Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully
More informationExtending causal inferences from a randomized trial to a target population
Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh
More informationCovariate Balancing Propensity Score for General Treatment Regimes
Covariate Balancing Propensity Score for General Treatment Regimes Kosuke Imai Princeton University October 14, 2014 Talk at the Department of Psychiatry, Columbia University Joint work with Christian
More informationInstructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses
ISQS 5349 Final Spring 2011 Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses 1. (10) What is the definition of a regression model that we have used throughout
More informationPOLI 8501 Introduction to Maximum Likelihood Estimation
POLI 8501 Introduction to Maximum Likelihood Estimation Maximum Likelihood Intuition Consider a model that looks like this: Y i N(µ, σ 2 ) So: E(Y ) = µ V ar(y ) = σ 2 Suppose you have some data on Y,
More informationContinuous Time Survival in Latent Variable Models
Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract
More informationLinear Methods for Prediction
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationAn Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data
An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)
More informationMixed model analysis of censored longitudinal data with flexible random-effects density
Biostatistics (2012), 13, 1, pp. 61 73 doi:10.1093/biostatistics/kxr026 Advance Access publication on September 13, 2011 Mixed model analysis of censored longitudinal data with flexible random-effects
More informationST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples
ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will
More informationIntroduction (Alex Dmitrienko, Lilly) Web-based training program
Web-based training Introduction (Alex Dmitrienko, Lilly) Web-based training program http://www.amstat.org/sections/sbiop/webinarseries.html Four-part web-based training series Geert Verbeke (Katholieke
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationSampling bias in logistic models
Sampling bias in logistic models Department of Statistics University of Chicago University of Wisconsin Oct 24, 2007 www.stat.uchicago.edu/~pmcc/reports/bias.pdf Outline Conventional regression models
More informationProblem Selected Scores
Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected
More informationThe Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen
The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen January 23-24, 2012 Page 1 Part I The Single Level Logit Model: A Review Motivating Example Imagine we are interested in voting
More informationNonparametric Bayes tensor factorizations for big data
Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationPrimal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing
Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal
More informationChapter 1. Linear Regression with One Predictor Variable
Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical
More informationEXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING
EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING DATE AND TIME: August 30, 2018, 14.00 19.00 RESPONSIBLE TEACHER: Niklas Wahlström NUMBER OF PROBLEMS: 5 AIDING MATERIAL: Calculator, mathematical
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationMultivariate Survival Analysis
Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in
More informationLinear Regression With Special Variables
Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:
More informationChapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression
BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between
More informationBIOSTATISTICAL METHODS
BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects
More informationSimple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.
Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1
More informationGeneral Regression Model
Scott S. Emerson, M.D., Ph.D. Department of Biostatistics, University of Washington, Seattle, WA 98195, USA January 5, 2015 Abstract Regression analysis can be viewed as an extension of two sample statistical
More informationWeb-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros
Web-based Supplementary Material for A Two-Part Joint Model for the Analysis of Survival and Longitudinal Binary Data with excess Zeros Dimitris Rizopoulos, 1 Geert Verbeke, 1 Emmanuel Lesaffre 1 and Yves
More informationWU Weiterbildung. Linear Mixed Models
Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes
More informationReview. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis
Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,
More informationLast lecture 1/35. General optimization problems Newton Raphson Fisher scoring Quasi Newton
EM Algorithm Last lecture 1/35 General optimization problems Newton Raphson Fisher scoring Quasi Newton Nonlinear regression models Gauss-Newton Generalized linear models Iteratively reweighted least squares
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth
More informationModeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association for binary responses
Outline Marginal model Examples of marginal model GEE1 Augmented GEE GEE1.5 GEE2 Modeling the scale parameter ϕ A note on modeling correlation of binary responses Using marginal odds ratios to model association
More informationConfounding, mediation and colliding
Confounding, mediation and colliding What types of shared covariates does the sibling comparison design control for? Arvid Sjölander and Johan Zetterqvist Causal effects and confounding A common aim of
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1
MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical
More informationOn the Choice of Parametric Families of Copulas
On the Choice of Parametric Families of Copulas Radu Craiu Department of Statistics University of Toronto Collaborators: Mariana Craiu, University Politehnica, Bucharest Vienna, July 2008 Outline 1 Brief
More informationStatistics 203: Introduction to Regression and Analysis of Variance Course review
Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying
More informationConditional Inference Functions for Mixed-Effects Models with Unspecified Random-Effects Distribution
Conditional Inference Functions for Mixed-Effects Models with Unspecified Random-Effects Distribution Peng WANG, Guei-feng TSAI and Annie QU 1 Abstract In longitudinal studies, mixed-effects models are
More informationSTA216: Generalized Linear Models. Lecture 1. Review and Introduction
STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general
More informationModels for binary data
Faculty of Health Sciences Models for binary data Analysis of repeated measurements 2015 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen 1 / 63 Program for
More informationEstimating the Marginal Odds Ratio in Observational Studies
Estimating the Marginal Odds Ratio in Observational Studies Travis Loux Christiana Drake Department of Statistics University of California, Davis June 20, 2011 Outline The Counterfactual Model Odds Ratios
More information1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches
Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model
More informationLogistic Regression. Seungjin Choi
Logistic Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationChapter 1. Modeling Basics
Chapter 1. Modeling Basics What is a model? Model equation and probability distribution Types of model effects Writing models in matrix form Summary 1 What is a statistical model? A model is a mathematical
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationCausal Inference with General Treatment Regimes: Generalizing the Propensity Score
Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke
More informationStructural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall
1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work
More informationBayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence
Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns
More informationNon-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models
Optimum Design for Mixed Effects Non-Linear and generalized Linear Models Cambridge, August 9-12, 2011 Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics
More information13. October p. 1
Lecture 8 STK3100/4100 Linear mixed models 13. October 2014 Plan for lecture: 1. The lme function in the nlme library 2. Induced correlation structure 3. Marginal models 4. Estimation - ML and REML 5.
More informationUsing Estimating Equations for Spatially Correlated A
Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship
More informationIV estimators and forbidden regressions
Economics 8379 Spring 2016 Ben Williams IV estimators and forbidden regressions Preliminary results Consider the triangular model with first stage given by x i2 = γ 1X i1 + γ 2 Z i + ν i and second stage
More informationGauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA
JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter
More informationOther Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model
Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);
More information