Shared Random Parameter Models for Informative Missing Data
1 Shared Random Parameter Models for Informative Missing Data Dean Follmann NIAID NIAID/NIH p.
2 A General Set-up
Longitudinal data $(Y_{ij}, X_{ij}, R_{ij})$
$Y_{ij}$ = outcome for person $i$ on visit $j$
$R_{ij}$ = 1 if observed, 0 if missing
$D_i = R_{i+}$ = dropout time
$X_{ij}$ = covariate, e.g. time on study $t_{ij}$
P(missing) determined by a coin flip: MCAR. Easy.
P(missing) depends on observed data: MAR. Doable.
P(missing) depends on the unobserved value of the missing data: MNAR. Ambitious.
3 Shared Parameter Model
Assume that each person draws random effects $(b_{0i}, b_{1i})$ from a $N(0, \Sigma)$.
$Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{0i} + b_{1i} t_{ij} + e_{ij}$
Interested in the overall slope $\beta_1$. What if faster decliners tend to drop out?
Simple model: $P(R_{ij} = 0 \mid R_{i,j-1} = 1) = \Phi(\alpha_j + \theta b_{1i})$
At visit $j$, each person decides whether to drop out based on their own coin, whose bias depends on $b_{1i}$.
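The effect of this dropout rule on a naive analysis can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the parameter values, the independence of $b_{0i}$ and $b_{1i}$ (a diagonal $\Sigma$), a constant $\alpha_j = \alpha$, and the function names `ols_slope` and `simulate` are all hypothetical choices, not taken from the slides. It compares the pooled least-squares slope on the complete data with the same slope computed only from visits observed before dropout.

```python
import random
from statistics import NormalDist


def ols_slope(points):
    """Pooled least-squares slope of y on t for a list of (t, y) pairs."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    return sxy / sxx


def simulate(n=5000, visits=5, beta0=0.0, beta1=-1.0,
             alpha=-1.0, theta=-2.0, seed=1):
    """Simulate Y_ij = beta0 + beta1*t + b0 + b1*t + e with probit dropout on b1."""
    rng = random.Random(seed)
    Phi = NormalDist().cdf
    complete, observed = [], []
    for _ in range(n):
        b0 = rng.gauss(0.0, 1.0)   # random intercept (assumed sd 1)
        b1 = rng.gauss(0.0, 0.5)   # random slope (assumed sd 0.5)
        on_study = True
        for t in range(visits):
            y = beta0 + beta1 * t + b0 + b1 * t + rng.gauss(0.0, 0.5)
            complete.append((t, y))
            # Visit 0 is always observed; afterwards the subject flips a
            # coin with P(dropout) = Phi(alpha + theta * b1).
            if t > 0 and on_study and rng.random() < Phi(alpha + theta * b1):
                on_study = False
            if on_study:
                observed.append((t, y))
    return ols_slope(complete), ols_slope(observed)


full_slope, observed_slope = simulate()
print(f"complete-data slope: {full_slope:.3f}")
print(f"observed-data slope: {observed_slope:.3f}")
```

With $\theta < 0$, subjects with steeper declines leave earlier, so the later visits are dominated by slower decliners and the observed-data slope is flatter than the true $\beta_1$.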
4 $b_i$ governs subject $i$'s slope & P(dropout)
[Figure: yearly drop in Y versus probability of dropout, with points labelled h and s.]
5 Data patterns generated from h and s
[Figure: outcome versus years since randomization for subjects labelled h and s.]
6 Culling of patients in a 2 visit trial
[Figure: distribution of slopes (density versus slopes), with the observed ("obs") portions marked.]
7 Likelihood
We can always write
$f(y_i, r_i, b_i) = g(y_i \mid b_i, r_i)\, m(r_i \mid b_i)\, h(b_i)$
Key: $g(y_i \mid b_i, r_i) = g(y_i \mid b_i)$
This allows us to essentially eliminate the density for the missing $y$'s:
$f(y_i^o, r_i) = \int_b \int_{y_i^m} f(y_i, r_i, b)\, dy_i^m\, db = \int_b g(y_i^o \mid b)\, m(r_i \mid b)\, h(b)\, db$
Maximum likelihood requires specialized software and can be difficult if the dimension of $b_i$ is large.
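For a scalar random effect the integral over $b$ can be approximated on a grid. The following is a minimal sketch, assuming a random-intercept outcome model $Y_{ij} = \beta_0 + b_i + e_{ij}$, a constant dropout intercept $\alpha$, and a simple midpoint rule; the function name `observed_likelihood` and all numerical settings are hypothetical and this is not the specialized software the slide refers to.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal: provides pdf and cdf


def observed_likelihood(y_obs, dropped, beta0, sigma, tau, alpha, theta,
                        n_grid=2000):
    """Approximate f(y_obs, r) = integral of g(y_obs|b) m(r|b) h(b) db.

    y_obs   : outcomes observed before dropout (at least one)
    dropped : True if the subject dropped out after the last observed visit
    """
    lo, hi = -8.0 * tau, 8.0 * tau
    step = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        b = lo + (k + 0.5) * step  # midpoint of the k-th grid cell
        # g(y_obs | b): independent normal errors around beta0 + b
        dens_y = 1.0
        for y in y_obs:
            dens_y *= STD.pdf((y - beta0 - b) / sigma) / sigma
        # m(r | b): stayed at each earlier visit, then possibly dropped out
        p_drop = STD.cdf(alpha + theta * b)
        prob_r = (1.0 - p_drop) ** (len(y_obs) - 1)
        if dropped:
            prob_r *= p_drop
        # h(b): N(0, tau^2) random-effect density
        dens_b = STD.pdf(b / tau) / tau
        total += dens_y * prob_r * dens_b * step
    return total


# Sanity check: with theta = 0 dropout is unrelated to b, so the likelihood
# factors into the marginal density of y times the dropout probability.
lik = observed_likelihood([1.2], True, beta0=0.0, sigma=1.0, tau=1.0,
                          alpha=-1.0, theta=0.0)
expected = NormalDist(0.0, sqrt(2.0)).pdf(1.2) * STD.cdf(-1.0)
print(lik, expected)
```

A one-dimensional grid is cheap, but the cost grows quickly with the dimension of $b_i$, which is the difficulty the slide points out.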
8 How MNAR?
MNAR: probability of missingness depends on $y_i^m$.
Let's work on $R$ for a simple model with $b$ scalar.
$P(R = 0 \mid y^m, y^o) = \dfrac{\int P(R = 0 \mid b)\, g(y^m, y^o \mid b)\, h(b)\, db}{\int g(y^m, y^o \mid b)\, h(b)\, db} = \int P(R = 0 \mid b)\, h(b \mid y^o, y^m)\, db$
$h(b \mid y^o, y^m)$ can be viewed as an Empirical Bayes posterior distribution.
Posteriors depend on all the data. Intuitively, $y^m$ should improve the guess about $b$.
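In a normal random-intercept model this posterior average has a closed form, which makes the dependence on $y^m$ explicit. The sketch below assumes $Y_j = \beta_0 + b + e_j$ with $b \sim N(0, \tau^2)$, $e_j \sim N(0, \sigma^2)$ and $P(R = 0 \mid b) = \Phi(\alpha + \theta b)$; it uses the standard identity $E[\Phi(\alpha + \theta b)] = \Phi((\alpha + \theta\mu)/\sqrt{1 + \theta^2 v})$ for $b \sim N(\mu, v)$. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def dropout_prob_given_y(y_all, alpha, theta, beta0=0.0, sigma=1.0, tau=1.0):
    """P(R = 0 | y) = integral of Phi(alpha + theta*b) h(b | y) db."""
    n = len(y_all)
    # Normal-normal posterior of b given all the y's (observed and missing)
    post_var = (sigma ** 2 * tau ** 2) / (sigma ** 2 + n * tau ** 2)
    post_mean = tau ** 2 * sum(y - beta0 for y in y_all) / (sigma ** 2 + n * tau ** 2)
    # Average the probit dropout probability over the posterior of b
    z = (alpha + theta * post_mean) / sqrt(1.0 + theta ** 2 * post_var)
    return NormalDist().cdf(z)


# Same observed value y^o = 0, two different values of the unobserved y^m
p_low = dropout_prob_given_y([0.0, -2.0], alpha=-1.0, theta=-1.0)
p_high = dropout_prob_given_y([0.0, 2.0], alpha=-1.0, theta=-1.0)
print(p_low, p_high)
```

The two probabilities differ even though the observed value is the same, so missingness depends on the unobserved outcome, which is what makes the shared parameter model MNAR.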
9 Choice of covariates
Without missing data, covariate selection requires familiar judgment.
Clinical trials: who cares.
Observational studies: control confounding, what's of interest.
In shared parameter models, covariates impact the selection probabilities.
Suppose $E[Y \mid \text{man}] = 10$, $E[Y \mid \text{woman}] = -10$. Fred's true mean is 10, typical for a man but good overall. Assume $\theta < 0$.
Adjust: Fred's $b_i = 0$, typical P(dropout).
Don't adjust: Fred's $b_i = 10$, lower P(dropout).
10 Choice of covariates: Model
Consider two possible models for the mean response.
(1) $Y_{ij} = \beta_0 + b_{0i} + e_{ij}$
(2) $Y_{ij} = \beta_0 + \beta_1 I(\text{man}) + b_{0i} + e_{ij}$
Want the sickest overall to drop out?
Under (1) use $P(R_{ij} = 0 \mid R_{i,j-1} = 1, b_{0i}) = \Phi(\alpha_j + \theta b_{0i})$
Under (2) use $P(R_{ij} = 0 \mid R_{i,j-1} = 1, b_{0i}) = \Phi(\alpha_j + \theta(\beta_1 I(\text{man}) + b_{0i}))$
11 A Simpler Approach
Unweighted analysis. Suppose each subject has their own mean:
$Y_{ij} = \beta_0 + b_{0i} + e_{ij}$
Then no matter how P(dropout) depends on $b_{0i}$, an unbiased estimate of $\beta_0$ is $\sum_{i=1}^{n} \bar{Y}_i / n$.
"One Subject, One Vote"
Simple fix for a clinical trial: compare two unweighted estimates.
Related to the Within Cluster Resampling approach to correct for informative cluster size.
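A small simulation can illustrate the "one subject, one vote" idea. This is a sketch under assumed settings: the probit dropout rule, the parameter values, the guarantee that every subject is seen at the first visit, and the function name `simulate_means` are all hypothetical. It compares the pooled mean of all observations with the unweighted mean of subject means.

```python
import random
from statistics import NormalDist, fmean


def simulate_means(n=20000, max_visits=5, beta0=5.0,
                   alpha=-0.5, theta=-1.0, seed=2):
    """Random-intercept data with dropout that depends on b_0i."""
    rng = random.Random(seed)
    Phi = NormalDist().cdf
    all_obs, subject_means = [], []
    for _ in range(n):
        b = rng.gauss(0.0, 1.0)
        ys = []
        for j in range(max_visits):
            # First visit is always observed; later visits are lost once
            # the subject drops out with probability Phi(alpha + theta*b).
            if j > 0 and rng.random() < Phi(alpha + theta * b):
                break
            ys.append(beta0 + b + rng.gauss(0.0, 1.0))
        all_obs.extend(ys)
        subject_means.append(fmean(ys))
    pooled = fmean(all_obs)             # every observation gets a vote
    unweighted = fmean(subject_means)   # every subject gets a vote
    return pooled, unweighted


pooled_mean, unweighted_mean = simulate_means()
print(f"pooled mean:     {pooled_mean:.3f}")
print(f"unweighted mean: {unweighted_mean:.3f}")
```

Subjects with large $b_{0i}$ stay longer here and so contribute more observations to the pooled mean, which pulls it away from $\beta_0$; averaging the subject means removes that weighting.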
12 An Inconvenient Truth
Endless two-group study with $m = \infty$ intended observations.
$Y_{ij} = b_{0i} + e_{ij}$ with $b_{0i} \sim N(0, 1)$, $e_{ij} \sim N(0, 100)$.
In group 0, $m_i = 1$ if $b_{0i} < 0$ but $m_i = \infty$ if $b_{0i} > 0$. In group 1, no missing data.
The sample mean of the $\bar{Y}_i$'s has the same expectation in the two groups.
In group 0, the $\bar{Y}_i$'s are like a 50:50 mixture of a $TN^+(0, 1)$ & $N(0, 100)$.
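The price of the unweighted estimate in this extreme example can be seen by simulating the subject means directly. The sketch below assumes that 100 is the error variance (standard deviation 10) and that a subject with endlessly many observations has $\bar{Y}_i = b_{0i}$ exactly; the function name and sample size are hypothetical.

```python
import random
from statistics import fmean, pvariance


def subject_means(n=100000, seed=3):
    """Subject means in the two groups of the 'endless' study."""
    rng = random.Random(seed)
    group0, group1 = [], []
    for _ in range(n):
        b = rng.gauss(0.0, 1.0)
        if b > 0:
            group0.append(b)                        # endless follow-up: mean is b
        else:
            group0.append(b + rng.gauss(0.0, 10.0)) # a single noisy observation
        group1.append(rng.gauss(0.0, 1.0))          # no missing data: mean is b
    return group0, group1


g0, g1 = subject_means()
print(f"group 0: mean {fmean(g0):.3f}, variance {pvariance(g0):.1f}")
print(f"group 1: mean {fmean(g1):.3f}, variance {pvariance(g1):.1f}")
```

Both groups are centred at zero, so the unweighted comparison is unbiased, but the group 0 subject means are far more variable, which is the loss of efficiency the slide warns about.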
13 An inconvenient truth
[Figure: histograms of ybar0 and ybar1, with frequency on the vertical axis.]
14 Approximate Conditional Approach
Numerical integration of the complete likelihood can be hard. Let's try to make things simple by factoring differently:
$f(y_i^o, r_i, b_i) = f(y_i^o \mid b_i, r_i)\, h(b_i \mid r_i)\, m(r_i) = f(y_i^o \mid b_i)\, h(b_i \mid r_i)\, m(r_i)$
$f(y_i^o, r_i) = \int f(y_i^o \mid b_i)\, h(b_i \mid r_i)\, db_i \; m(r_i)$
$m()$ can be estimated by the empirical distribution; the random effects conditional on $r_i$ can be approximated.
15 Approximate Conditional Approach
The twist is to modify the distribution $h(b_i \mid r_i)$, e.g. $b_{0i} = b_{0i}^* + \omega D_i$:
$Y_{ij} = \beta_0 + \beta_1 X_i + b_{0i}^* + \omega D_i + e_{ij} = \beta_0 + \beta_1 X_i + \beta_2 D_i + b_{0i}^* + e_{ij}$
Uncondition at the end:
$E[Y \mid X = 1] - E[Y \mid X = 0] = E[E[Y \mid X = 1, D]] - E[E[Y \mid X = 0, D]] = \beta_1 + \beta_2(\bar{D}_1 - \bar{D}_0)$
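The unconditioning step is simple arithmetic once the conditional model has been fitted. The helper below is a minimal sketch: the function name `unconditioned_effect` and the example numbers are hypothetical, and the coefficients are assumed to come from some separate fit of the conditional linear model.

```python
from statistics import fmean


def unconditioned_effect(beta1, beta2, dropout_times_x1, dropout_times_x0):
    """Return beta1 + beta2 * (mean D in group X=1 minus mean D in group X=0)."""
    d_bar_1 = fmean(dropout_times_x1)
    d_bar_0 = fmean(dropout_times_x0)
    return beta1 + beta2 * (d_bar_1 - d_bar_0)


# Hypothetical fitted coefficients and dropout times, for illustration only
effect = unconditioned_effect(beta1=1.0, beta2=0.5,
                              dropout_times_x1=[4, 6],
                              dropout_times_x0=[2, 4])
print(effect)
```

When the two groups have the same average dropout time the correction term vanishes and the unconditioned effect is just $\beta_1$.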
16 Heckman's model
Suppose $Y$ is the wage of a plumber, and $Y^*$ the perceived utility of being a plumber.
$Y_i = x_i\beta + b_i + e_{i1}$
$Y_i^* = w_i\alpha + b_i + e_{i2}$
with $b_i, e_{i1}, e_{i2}$ independent mean-zero normals with variances $\tau^2, \sigma_1^2, \sigma_2^2$.
People with $Y_i^* > 0$ choose to be plumbers; $Y_i$ is missing for nonplumbers.
Can show that $P(R_i = 1) = P(Y_i^* > 0) = \Phi(w_i\alpha / \sqrt{\tau^2 + \sigma_2^2})$
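17 Heckman's model
Heckman showed $E[b_i \mid R_i = 1] \propto \lambda_i$, where $\lambda_i = \lambda(-w_i\alpha / \sqrt{\tau^2 + \sigma_2^2})$ and $\lambda()$ is the Mills ratio; $\mathrm{var}[b_i \mid R_i = 1] = c$.
Fix up the mean for the observed plumbers ($R_i = 1$):
$Y_i = x_i\beta + \omega\lambda_i + \epsilon_i$
$\omega = \sigma_b^2 / \sqrt{\sigma_2^2 + \sigma_b^2} > 0$, where $\sigma_b^2 = \tau^2$ is the variance of $b_i$: people with a knack for plumbing choose it.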
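The selection correction can be checked numerically. The sketch below assumes the convention $\lambda(a) = \phi(a)/(1 - \Phi(a))$, so that $\lambda(-c) = \phi(c)/\Phi(c)$, and a single fixed value of the linear predictor $w_i\alpha$; the parameter values and the names `mills` and `check_selection_mean` are hypothetical. It compares the simulated mean of $b_i$ among selected subjects with $\omega\,\lambda(-w_i\alpha/\sqrt{\tau^2 + \sigma_2^2})$.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

STD = NormalDist()


def mills(a):
    """Mills ratio in the hazard form phi(a) / (1 - Phi(a)) (assumed convention)."""
    return STD.pdf(a) / (1.0 - STD.cdf(a))


def check_selection_mean(w_alpha=0.5, tau=1.0, sigma2=1.0, n=200000, seed=4):
    """Compare the simulated E[b | Y* > 0] with omega * lambda(-w_alpha / scale)."""
    rng = random.Random(seed)
    selected_b = []
    for _ in range(n):
        b = rng.gauss(0.0, tau)
        y_star = w_alpha + b + rng.gauss(0.0, sigma2)  # perceived utility
        if y_star > 0:                                  # chooses to be a plumber
            selected_b.append(b)
    scale = sqrt(tau ** 2 + sigma2 ** 2)
    omega = tau ** 2 / scale
    return fmean(selected_b), omega * mills(-w_alpha / scale)


simulated, theoretical = check_selection_mean()
print(f"simulated E[b | R=1]:  {simulated:.3f}")
print(f"omega * Mills ratio:   {theoretical:.3f}")
```

The positive value reflects that people with a larger $b_i$ are both better paid as plumbers and more likely to choose plumbing.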
18 IPPB Trial
Intermittent positive pressure breathing versus standard compressor nebulizer therapy.
IPPB = short-term mechanical ventilation to force meds into the lung.
Primary endpoint: rate of change in FEV$_1$.
n = 984, 3 years of follow-up, measured every 3 months, 39% dropout.
19 IPPB group: estimates by dropout time
[Figure, panels A and B: intercept and slope estimates plotted against dropout time (dropout visit).]
20 IPPB Model
Naive model: for each group we fit
$Y_{ij} = \beta_0 + \beta_1 t_j + b_{0i} + b_{1i} t_j + e_{ij}$, with $e_{ij}$ iid $N(0, \sigma^2)$, $b_i$ iid $N(0, \Sigma)$
Shared Parameter Model: as above, but with a connection
$P(R_{ij} = 0 \mid R_{i,j-1} = 1) = \Phi(\alpha + \theta_0 b_{0i} + \theta_1 b_{1i})$
Approximate Conditional Model: given $D_i = R_{i+}$,
$Y_{ij} = \beta_0 + \beta_1 t_j + \beta_2 D_i + \beta_3 t_j D_i + b_{0i} + b_{1i} t_j + e_{ij}$
21 Model Estimates 1
A. Random Effects Model.
[Table: estimate and se of each $\beta$ parameter, for the Standard and IPPB groups.]
22 Model Estimates 2
B. Shared Parameter Model
[Table: estimate and se of the $\beta$, $\alpha$ and $\theta$ parameters, for the Standard and IPPB groups.]
23 Model Estimates 3
C. Conditional Model
[Table: estimate and se of each $\beta$ parameter, for the Standard and IPPB groups.]
24 Summary of IPPB Trial
[Table: estimate and SE for the Naive, Shared, Approx Cond and Unweighted analyses.]
IPPB not helpful beyond standard nebulizer therapy.
Dropout appears related to intercept, not slope.
All analyses similar.
25 Epilepsy Study
Wanted to see if Felbamate reduced seizure frequency.
n = 40 patients titrated off meds, given drug/placebo & followed for 17 days.
11/19 Placebo and 8/21 Felbamate patients dropped out.
Number of seizures recorded each day.
26 Seizure Rates by Dropout
[Figure: average daily seizure frequency versus days in study; points marked o (placebo) and x.]
27 Model
Let $Y_{ij}$ denote the seizure count for patient $i$ on day $j$. Assume Poisson($\lambda_{ij}$).
Seizure rate seemed constant over time. Within each group,
$\log(\lambda_{ij}) = \beta + b_i$ with $b_i \sim N(0, \sigma^2)$
Shared Parameter Model: also assume
$P(R_{ij} = 0 \mid R_{i,j-1} = 1, b_i) = 1/(1 + \exp(-\gamma - \theta b_i))$
Approximate Conditional Model:
$\log(\lambda_{ij}) = \beta + \omega \log(D_i) + b_i$
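Data from the shared parameter version of this model are easy to simulate, which helps build intuition for how seizure rate and dropout become linked. The sketch below is illustrative only: the parameter values, the guarantee that day 1 is always observed, and the function names are hypothetical, and the Poisson sampler uses Knuth's multiplication method rather than anything from the study.

```python
import random
from math import exp
from statistics import fmean


def poisson(rng, lam):
    """Draw a Poisson(lam) count using Knuth's multiplication method."""
    limit, k, prod = exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_seizures(n=2000, days=17, beta=0.0, sigma=0.7,
                      gamma=-3.0, theta=1.0, seed=5):
    """Mean daily seizure rate among dropouts and among completers."""
    rng = random.Random(seed)
    dropout_rates, completer_rates = [], []
    for _ in range(n):
        b = rng.gauss(0.0, sigma)
        lam = exp(beta + b)                              # log(lambda) = beta + b
        p_drop = 1.0 / (1.0 + exp(-gamma - theta * b))   # logistic dropout hazard
        counts = []
        for day in range(days):
            if day > 0 and rng.random() < p_drop:
                break
            counts.append(poisson(rng, lam))
        rate = fmean(counts)
        if len(counts) == days:
            completer_rates.append(rate)
        else:
            dropout_rates.append(rate)
    return fmean(dropout_rates), fmean(completer_rates)


dropout_rate, completer_rate = simulate_seizures()
print(f"mean daily rate, dropouts:   {dropout_rate:.2f}")
print(f"mean daily rate, completers: {completer_rate:.2f}")
```

With $\theta > 0$ in this sketch, patients with more seizures leave sooner, so the patients who remain look healthier than the group as a whole.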
28 Estimates
[Table: Est and se under the Random Effects, Shared Parameter and Conditional models; rows for $\beta$ (two rows), $\sigma_F$, $\sigma_P$, $\omega_F$, $\omega_P$, $\gamma$ (two rows), $\theta_F$, $\theta_P$.]
29 Treatment Effect
For the conditional model, need to uncondition:
$E_F[\log(\hat\lambda)] - E_P[\log(\hat\lambda)] \approx \hat\beta_1 + \hat\omega_F \bar{D}_F - \hat\omega_P \bar{D}_P$
Estimate = -1.58 (p > .05) versus (p < .05) for the shared parameter model?
$\hat\omega_F$ significant & substantial. $\hat\theta_F$ NS & small.
30 Summary
Shared parameter & naive models similar, with a significant effect of Felbamate.
Conditional model: seizures & dropout related in the Felbamate group.
Inconsistency of models required investigation.
Performed simulations: the conditional model is more robust, with better small-sample properties.
31 Extensions
Appealing to embed the shared parameter model in a larger class of models.
Allow perturbations of the common $b_{ic}$:
$Y_{ij} = \beta_0 + \beta_1 t_{ij} + b_{ic} + b_{iy} + e_{ij}$
$P(R_{ij} = 0 \mid R_{i,j-1} = 1) = \Phi(\alpha_j + \theta(b_{ic} + b_{ir}))$
Allow smooth mean functions, more general error distributions, richer random effects distributions.
Tradeoff between flexibility and burden of estimation.
32 Conclusions
Shared parameter models are a form of MNAR. They require unexaminable assumptions.
Model fitting can be involved; the approximate conditional linear model is simpler to fit, but one needs to uncondition.
Extensions/flexible modeling makes sense.
When confronting missing data, it is appealing to try different methods.
33 A Book
To appear in Statistics and Probability Letters, 113 (2016), 16-22. doi 10.1016/j.spl.2016.02.004 A weighted simulation-based estimator for incomplete longitudinal data models Daniel H. Li 1 and Liqun
More informationRobustness to Parametric Assumptions in Missing Data Models
Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice
More informationStat 502X Exam 2 Spring 2014
Stat 502X Exam 2 Spring 2014 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed This exam consists of 12 parts. I'll score it at 10 points per problem/part
More informationSimple Linear Regression for the MPG Data
Simple Linear Regression for the MPG Data 2000 2500 3000 3500 15 20 25 30 35 40 45 Wgt MPG What do we do with the data? y i = MPG of i th car x i = Weight of i th car i =1,...,n n = Sample Size Exploratory
More informationOn Estimating the Relationship between Longitudinal Measurements and Time-to-Event Data Using a Simple Two-Stage Procedure
Biometrics DOI: 10.1111/j.1541-0420.2009.01324.x On Estimating the Relationship between Longitudinal Measurements and Time-to-Event Data Using a Simple Two-Stage Procedure Paul S. Albert 1, and Joanna
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationCalibration Estimation of Semiparametric Copula Models with Data Missing at Random
Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Econometrics Workshop UNC
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 20, 2009, 8:00 am - 2:00 noon Instructions:. You have four hours to answer questions in this examination. 2. You must show
More informationEstimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.
Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description
More informationAnalysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington
Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities
More information1. Simple Linear Regression
1. Simple Linear Regression Suppose that we are interested in the average height of male undergrads at UF. We put each male student s name (population) in a hat and randomly select 100 (sample). Then their
More informationBayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples
Bayesian inference for sample surveys Roderick Little Module : Bayesian models for simple random samples Superpopulation Modeling: Estimating parameters Various principles: least squares, method of moments,
More informationMasters Comprehensive Examination Department of Statistics, University of Florida
Masters Comprehensive Examination Department of Statistics, University of Florida May 6, 003, 8:00 am - :00 noon Instructions: You have four hours to answer questions in this examination You must show
More information