IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM
- Ashley Maxwell
Outline
- 12.1 The causal question
- 12.2 Estimating IP weights via modeling
- 12.3 Stabilized IP weights
- 12.4 Marginal structural models
- 12.5 Effect modification and marginal structural models
- 12.6 Censoring and missing data
12.1 The causal question
- Goal: estimate the effect of smoking cessation ($A$) on weight gain ($Y$) using data from the National Health and Nutrition Examination Survey Data I Epidemiologic Follow-up Study (NHEFS)
- $n = 1566$ smokers with a baseline visit and a follow-up visit 10 years later
- Outcome $Y$: body weight (kg) at 10 years minus weight at baseline
- $A = 1$ if the individual reported quitting smoking before the 10-year visit, $A = 0$ otherwise
- Average weight gain $\hat{E}(Y \mid A = 1) = 4.5$ among quitters, $\hat{E}(Y \mid A = 0) = 2.0$ among non-quitters
- Estimated difference 2.5 (95% CI 1.7, 3.4), p < [value missing in source]
- Causal estimand of interest: $E(Y^{a=1}) - E(Y^{a=0})$
- This is the difference in mean weight gain that would have been observed if all individuals in the population had quit smoking before the follow-up visit versus if none of them had quit
- Because exposure $A$ was not randomly assigned, there may be confounding. Thus we are not willing to assume
  $$E(Y \mid A = 1) - E(Y \mid A = 0) = E(Y^{a=1}) - E(Y^{a=0})$$
- Let $L$ denote the vector of 9 baseline covariates: sex (0 male, 1 female), age (yrs), race (0 white, 1 other), education (5 categories), intensity and duration of smoking (cigarettes/day and yrs of smoking), physical activity in daily life (3 categories), recreational exercise (3 categories), and weight (kg)
12.2 Estimating IP weights via modeling
- Assume conditional exchangeability $Y^a \perp\!\!\!\perp A \mid L$, i.e., the covariates $L$ are sufficient to block all backdoor paths from $A$ to $Y$
- Use IP weighting to estimate $E(Y^{a=1}) - E(Y^{a=0})$
- Recall from Section 2.4 that the IP estimator has the form
  $$\frac{1}{n}\sum_{i=1}^{n} \frac{Y_i A_i}{\Pr[A_i = 1 \mid L_i]} - \frac{1}{n}\sum_{i=1}^{n} \frac{Y_i (1 - A_i)}{\Pr[A_i = 0 \mid L_i]}$$
  or
  $$\frac{1}{n}\sum_{i} W_i Y_i A_i - \frac{1}{n}\sum_{i} W_i Y_i (1 - A_i)$$
  where
  $$W_i = A_i \Pr[A_i = 1 \mid L_i]^{-1} + (1 - A_i)\Pr[A_i = 0 \mid L_i]^{-1}$$
- In a conditionally randomized trial, $W_i$ is a known function of $L_i$
- In an observational setting, the assignment mechanism $\Pr[A_i = a \mid L_i]$ is unknown and needs to be estimated:
  $$\frac{1}{n}\sum_{i} \hat{W}_i Y_i A_i - \frac{1}{n}\sum_{i} \hat{W}_i Y_i (1 - A_i)$$
- If $L$ were low-dimensional (e.g., univariate and binary), we could estimate the assignment mechanism nonparametrically based on sample means
- However, here $L$ is a 9-dimensional covariate, with some covariates taking on more than 2 values and age continuous, so we need to model
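The low-dimensional case can be written out directly. The following is a minimal Python sketch (an illustration, not part of the course materials); the toy data and the function name `ip_estimate_binary_L` are made up for the example. It estimates $\Pr[A = 1 \mid L = l]$ by the stratum-specific sample proportion and plugs the resulting weights into the IP estimator.

```python
import numpy as np

def ip_estimate_binary_L(L, A, Y):
    """Unstabilized IP estimate of E(Y^{a=1}) - E(Y^{a=0}) for a binary covariate L."""
    L = np.asarray(L, dtype=int)
    A = np.asarray(A, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    # Sample proportion treated within each stratum of L (both arms must be present).
    p1_by_stratum = np.array([A[L == l].mean() for l in (0, 1)])
    p1 = p1_by_stratum[L]                      # estimated Pr[A = 1 | L_i]
    W = A / p1 + (1 - A) / (1 - p1)            # estimated IP weights
    return np.sum(W * Y * A) / n - np.sum(W * Y * (1 - A)) / n

# Toy data (hypothetical): 8 individuals, both treatment arms in each stratum.
L = [0, 0, 0, 0, 1, 1, 1, 1]
A = [0, 0, 0, 1, 0, 1, 1, 1]
Y = [1.0, 2.0, 3.0, 6.0, 4.0, 7.0, 8.0, 9.0]
ip_est = ip_estimate_binary_L(L, A, Y)

# With saturated weights the IP estimate coincides with standardization over L.
La, Aa, Ya = np.array(L), np.array(A), np.array(Y)
std_est = sum(
    (Ya[(La == l) & (Aa == 1)].mean() - Ya[(La == l) & (Aa == 0)].mean()) * np.mean(La == l)
    for l in (0, 1)
)
print(ip_est, std_est)
```

With nine covariates, some of them continuous, most strata would be empty or contain a single treatment arm, which is why a parametric model for the treatment is needed.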
- Fit a logistic regression model for $\Pr[A = 1 \mid L]$ on all nine covariates, with linear and quadratic terms for age, weight, intensity and duration of smoking, and no product (interaction) terms between the covariates
- Based on the fitted model compute
  $$\widehat{\Pr}[A = 1 \mid L] = \text{logit}^{-1}(\hat\beta L), \qquad \widehat{\Pr}[A = 0 \mid L] = 1 - \text{logit}^{-1}(\hat\beta L)$$
- Weight by the inverse of these probabilities to create a pseudo-population and fit the linear model (using MLE/LS)
  $$E(Y \mid A) = \theta_0 + \theta_1 A$$
- IPW estimate of the causal effect: $\hat\theta_1 = 3.44$
    # R code from chapter12.r
    # Treatment model: logistic regression of quitting (qsmk) on the nine covariates
    fit <- glm(qsmk ~ as.factor(sex) + as.factor(race) + age + I(age^2) +
                 as.factor(education.code) + smokeintensity + I(smokeintensity^2) +
                 smokeyrs + I(smokeyrs^2) + as.factor(exercise) + as.factor(active) +
                 wt71 + I(wt71^2),
               family = binomial(), data = nhefs0)
    # Estimated probability of the treatment each individual actually received
    p.qsmk.obs <- ifelse(nhefs0$qsmk == 0,
                         1 - predict(fit, type = "response"),
                         predict(fit, type = "response"))
    # Unstabilized IP weights
    nhefs0$w <- 1 / p.qsmk.obs
    # Weighted least squares fit of the outcome model in the pseudo-population
    glm.obj <- glm(wt82_71 ~ qsmk, data = nhefs0, weights = w)
12.3 Stabilized IP weights
- Does weighted LS give the IP estimator from Section 2.4? Actually, no.
- The weighted LS estimator minimizes
  $$\sum_i \hat{W}_i \{Y_i - (\theta_0 + \theta_1 A_i)\}^2$$
  where
  $$\hat{W}_i = A_i \widehat{\Pr}[A_i = 1 \mid L_i]^{-1} + (1 - A_i)\widehat{\Pr}[A_i = 0 \mid L_i]^{-1}$$
- This yields (homework)
  $$\hat\theta_1 = \frac{\sum_i \hat{W}_i Y_i A_i}{\sum_i \hat{W}_i A_i} - \frac{\sum_i \hat{W}_i Y_i (1 - A_i)}{\sum_i \hat{W}_i (1 - A_i)}$$
  which differs slightly from the original IPW estimator
  $$\frac{\sum_i \hat{W}_i Y_i A_i}{n} - \frac{\sum_i \hat{W}_i Y_i (1 - A_i)}{n}$$
- The former is known as the stabilized IPW estimator; the latter as the unstabilized IPW estimator
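The identity between the weighted LS slope and the ratio form can be checked numerically. The sketch below uses simulated data (hypothetical, not NHEFS) and, for simplicity, treats the true $\Pr[A = 1 \mid L]$ as the weights' denominator; the identity is purely algebraic, so it holds for any positive weights.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
L = rng.normal(size=n)
p = 1 / (1 + np.exp(-L))                 # Pr[A = 1 | L], treated as known here
A = rng.binomial(1, p).astype(float)
Y = 2.0 + 3.0 * A + L + rng.normal(size=n)
W = A / p + (1 - A) / (1 - p)

# Weighted least squares for Y ~ 1 + A via the weighted normal equations.
X = np.column_stack([np.ones(n), A])
theta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * Y))

# Ratio form (denominators sum the weights) and the form that divides by n.
hajek = np.sum(W * Y * A) / np.sum(W * A) - np.sum(W * Y * (1 - A)) / np.sum(W * (1 - A))
ht = np.sum(W * Y * A) / n - np.sum(W * Y * (1 - A)) / n

print(theta[1], hajek, ht)   # the first two agree; the third generally differs
```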
- In survey sampling nomenclature, the stabilized IPW estimator is a difference of Hájek estimators and the unstabilized IPW estimator is a difference of Horvitz-Thompson (HT) estimators
- Hájek-type estimators tend to be less variable than HT estimators
- Intuition: if $\Pr[A_i = a \mid L_i]$ is very small, then $\hat{W}_i$ is very large, so unstabilized/HT estimators can be highly variable
- Stabilized/Hájek estimators replace the denominator $n$ with an unbiased estimator of $n$
- The new denominator tends to be large (small) when the numerator is large (small)
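A small Monte Carlo experiment illustrates the variance comparison. This is a sketch under assumed settings (simulated data, known treatment probabilities, an outcome whose mean is far from zero); it is not a general proof, and the size of the gap depends on the data-generating process chosen here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 500
ht_draws, hajek_draws = [], []
for _ in range(reps):
    L = rng.normal(size=n)
    p = 1 / (1 + np.exp(-L))                      # known Pr[A = 1 | L]
    A = rng.binomial(1, p).astype(float)
    Y = 10.0 + 3.0 * A + L + rng.normal(size=n)   # outcome mean far from zero
    W = A / p + (1 - A) / (1 - p)
    # Horvitz-Thompson form: divide by n.
    ht_draws.append(np.sum(W * Y * A) / n - np.sum(W * Y * (1 - A)) / n)
    # Hajek form: divide by the sum of the weights in each arm.
    hajek_draws.append(
        np.sum(W * Y * A) / np.sum(W * A) - np.sum(W * Y * (1 - A)) / np.sum(W * (1 - A))
    )

var_ht, var_hajek = np.var(ht_draws), np.var(hajek_draws)
print(var_ht, var_hajek)
```

In this setup the Hájek-type estimator is unaffected by adding a constant to $Y$, whereas the HT-type estimator is not, which is one way to see where the extra variability comes from.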
- Hernán and Robins (HR) describe
  $$W_i = A_i \frac{1}{\Pr[A_i = 1 \mid L_i]} + (1 - A_i)\frac{1}{\Pr[A_i = 0 \mid L_i]}$$
  as the unstabilized weight, and
  $$W_i = A_i \frac{\Pr[A_i = 1]}{\Pr[A_i = 1 \mid L_i]} + (1 - A_i)\frac{\Pr[A_i = 0]}{\Pr[A_i = 0 \mid L_i]}$$
  as the stabilized weight
- Using either form of $W_i$ in $\hat\theta_1$ is equivalent
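The equivalence holds because, within each treatment arm, the stabilized weight is a constant multiple of the unstabilized one, and that constant cancels in the ratio defining $\hat\theta_1$. A quick numerical check on simulated data (hypothetical; the probabilities `p` stand in for fitted values):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
L = rng.normal(size=n)
p = 1 / (1 + np.exp(-0.5 * L))           # stand-in for estimated Pr[A = 1 | L]
A = rng.binomial(1, p).astype(float)
Y = 1.0 + 2.0 * A + L + rng.normal(size=n)

def theta1(W):
    """Weighted LS slope for Y ~ 1 + A, written as a difference of weighted means."""
    return (np.sum(W * Y * A) / np.sum(W * A)
            - np.sum(W * Y * (1 - A)) / np.sum(W * (1 - A)))

pA1 = A.mean()                                          # marginal Pr[A = 1]
W_unstab = A / p + (1 - A) / (1 - p)
W_stab = A * pA1 / p + (1 - A) * (1 - pA1) / (1 - p)

theta1_unstab, theta1_stab = theta1(W_unstab), theta1(W_stab)
print(theta1_unstab, theta1_stab)
```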
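- What are the large-sample properties of the stabilized IPW estimator $\hat\theta_1$?
- Let $\mu_a = E(Y^a)$ for $a = 0, 1$; $\mu = (\mu_1, \mu_0)$
- Assume for now that the weights $W_i$ are a known function of $L_i$
- Consider the vector estimating equation
  $$\sum_i \psi(Y_i, A_i, L_i, \mu) = \sum_i \begin{pmatrix} W_i A_i (Y_i - \mu_1) \\ W_i (1 - A_i)(Y_i - \mu_0) \end{pmatrix} = 0$$
- Solution:
  $$\hat\mu = \begin{pmatrix} \hat\mu_1 \\ \hat\mu_0 \end{pmatrix} = \begin{pmatrix} \sum_i W_i Y_i A_i / \sum_i W_i A_i \\ \sum_i W_i Y_i (1 - A_i) / \sum_i W_i (1 - A_i) \end{pmatrix}$$
- Note $\hat\mu_1 - \hat\mu_0 = \hat\theta_1$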
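M-Estimators
- Under suitable regularity conditions (Stefanski and Boos, TAS 2002),
  $$\sqrt{n}(\hat\mu - \mu) \to_d N(0, V(\mu)) \quad \text{as } n \to \infty$$
  where
  $$V(\mu) = A(\mu)^{-1} B(\mu) \{A(\mu)^{-1}\}^T$$
  $$A(\mu) = E[-\dot\psi(Y_i, A_i, L_i, \mu)]$$
  $$B(\mu) = E[\psi(Y_i, A_i, L_i, \mu)\,\psi(Y_i, A_i, L_i, \mu)^T]$$
  $$\dot\psi(Y_i, A_i, L_i, \mu) = \partial\psi(Y_i, A_i, L_i, \mu)/\partial\mu^T$$
- The empirical sandwich variance estimator is consistent for $V(\mu)$:
  $$\hat{V} = \hat{A}^{-1} \hat{B} \{\hat{A}^{-1}\}^T$$
  where
  $$\hat{A} = \frac{1}{n}\sum_i -\dot\psi(Y_i, A_i, L_i, \hat\mu), \qquad \hat{B} = \frac{1}{n}\sum_i \psi(Y_i, A_i, L_i, \hat\mu)\,\psi(Y_i, A_i, L_i, \hat\mu)^T$$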
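Sandwich Estimator
- For the problem at hand,
  $$-\dot\psi(Y_i, A_i, L_i, \mu) = \begin{pmatrix} W_i A_i & 0 \\ 0 & W_i (1 - A_i) \end{pmatrix}$$
- From here the calculation of $\hat{A}$ and $\hat{B}$, and therefore $\hat{V}$, is straightforward
- By the delta method,
  $$\hat\mu_1 - \hat\mu_0 \;\dot\sim\; N(\mu_1 - \mu_0,\; n^{-1} g^T \hat{V} g) \quad \text{where } g = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
- For the NHEFS data, $n^{-1} g^T \hat{V} g =$ [value missing in source] (homework?), implying the estimated standard error equals [value missing in source], with Wald 95% CI (2.4, 4.5)
- In practice, we need not compute this by hand; standard software can compute the empirical sandwich variance estimate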
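For readers who want to see the by-hand calculation once, the following Python sketch computes $\hat{A}$, $\hat{B}$, $\hat{V}$ and the delta-method standard error on simulated data (hypothetical, not NHEFS), with the weights treated as known as on this slide. The variable names are chosen for the example only.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
L = rng.normal(size=n)
p = 1 / (1 + np.exp(-L))                 # Pr[A = 1 | L], treated as known
A = rng.binomial(1, p).astype(float)
Y = 2.0 + 3.0 * A + L + rng.normal(size=n)
W = A / p + (1 - A) / (1 - p)

# Solve the estimating equations: Hajek-type weighted means.
mu1 = np.sum(W * Y * A) / np.sum(W * A)
mu0 = np.sum(W * Y * (1 - A)) / np.sum(W * (1 - A))

# psi_i evaluated at mu-hat (n x 2), then the two pieces of the sandwich.
psi = np.column_stack([W * A * (Y - mu1), W * (1 - A) * (Y - mu0)])
A_hat = np.diag([np.mean(W * A), np.mean(W * (1 - A))])   # mean of -d psi / d mu
B_hat = psi.T @ psi / n
A_inv = np.linalg.inv(A_hat)
V_hat = A_inv @ B_hat @ A_inv.T

g = np.array([1.0, -1.0])
est = mu1 - mu0
se = np.sqrt(g @ V_hat @ g / n)          # delta method
ci = (est - 1.96 * se, est + 1.96 * se)  # Wald 95% interval
print(est, se, ci)
```

Because no individual is in both arms, the off-diagonal entries of $\hat{B}$ are exactly zero, so the variance is simply the sum of the two arm-specific terms.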
Sandwich Estimator in SAS

    /* from chapter12.sas */
    proc genmod data= nhefs_w;
      class seqn;
      weight w;
      model wt82_71= qsmk;
      repeated subject=seqn / type=ind;
    run;

    Analysis Of GEE Parameter Estimates
    Empirical Standard Error Estimates

    Parameter   Estimate   Standard Error   95% Confidence Limits   Z   Pr > |Z|
    Intercept                                                            <.0001
    qsmk                                                                 <.0001
12.3 Stabilized IPW Estimators
- But wait! All of that assumed the weights $W_i$ are a known function of $L_i$
- What about the fact that in an observational study we don't know the assignment mechanism and therefore have to estimate the weights?
- E.g., suppose we use logistic regression
  $$\text{logit}(\Pr[A = 1 \mid L]) = \beta L$$
- Then we have the vector estimating equation
  $$\sum_i \psi(Y_i, A_i, L_i, \mu, \beta) = \sum_i \begin{pmatrix} \psi_\beta(A_i, L_i, \beta) \\ W_i A_i (Y_i - \mu_1) \\ W_i (1 - A_i)(Y_i - \mu_0) \end{pmatrix} = 0$$
  where $\psi_\beta()$ is the vector of score equations from the log likelihood corresponding to the logistic regression model
- Can show that when the weights are known,
  $$\sqrt{n}\{(\hat\mu_1 - \hat\mu_0) - (\mu_1 - \mu_0)\} \to_d N(0, \Sigma^*)$$
  where
  $$\Sigma^* = E\left\{ \frac{(Y^1 - \mu_1)^2}{\Pr[A = 1 \mid L]} + \frac{(Y^0 - \mu_0)^2}{\Pr[A = 0 \mid L]} \right\}$$
- Whereas when the weights are estimated (e.g., based on logistic regression), the asymptotic variance equals
  $$\Sigma^* - c \quad \text{where } c \ge 0 \text{ (homework)}$$
- Interesting result: even if $W_i$ is known, it is better to estimate it!
- Unfortunately, a consistent estimator of the asymptotic variance $\Sigma^* - c$ cannot be obtained using standard software. However, if we do use standard software as in the previous slides, the above result indicates we are being conservative
Sketch of proof of the first claim on the previous slide
- First note
  $$A(\mu) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad B(\mu) = E\begin{pmatrix} W^2 A (Y^1 - \mu_1)^2 & 0 \\ 0 & W^2 (1 - A)(Y^0 - \mu_0)^2 \end{pmatrix}$$
- Therefore
  $$\Sigma^* = g^T V(\mu) g = g^T B(\mu) g = E[W^2 A (Y^1 - \mu_1)^2 + W^2 (1 - A)(Y^0 - \mu_0)^2]$$
- Finally note
  $$E[W^2 A (Y^1 - \mu_1)^2] = E\left[ \frac{A (Y^1 - \mu_1)^2}{\Pr[A = 1 \mid L]^2} \right] = E\left[ \frac{(Y^1 - \mu_1)^2}{\Pr[A = 1 \mid L]} \right]$$
  and similarly
  $$E[W^2 (1 - A)(Y^0 - \mu_0)^2] = E\left[ \frac{(Y^0 - \mu_0)^2}{\Pr[A = 0 \mid L]} \right]$$
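The second claim, that estimating the weights does not increase (and typically lowers) the asymptotic variance, can be illustrated by simulation. The sketch below is an assumption-laden example: simulated data in which $L$ strongly predicts $Y$, a correctly specified logistic treatment model fitted by a hand-written Newton-Raphson routine (`fit_logistic` is a helper defined here, not a library function), and a Monte Carlo comparison of the stabilized estimator under known versus estimated weights.

```python
import numpy as np

def fit_logistic(X, a, iters=25):
    """Logistic regression by Newton-Raphson; returns the coefficient vector."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (a - p))
    return beta

def hajek(Y, A, p):
    """Stabilized IPW estimate of mu_1 - mu_0 given Pr[A = 1 | L] = p."""
    W = A / p + (1 - A) / (1 - p)
    return np.sum(W * Y * A) / np.sum(W * A) - np.sum(W * Y * (1 - A)) / np.sum(W * (1 - A))

rng = np.random.default_rng(4)
n, reps = 500, 300
known, estimated = [], []
for _ in range(reps):
    L = rng.normal(size=n)
    p_true = 1 / (1 + np.exp(-0.8 * L))
    A = rng.binomial(1, p_true).astype(float)
    Y = 1.0 * A + 5.0 * L + rng.normal(size=n)     # L strongly predicts Y
    X = np.column_stack([np.ones(n), L])
    p_hat = 1 / (1 + np.exp(-X @ fit_logistic(X, A)))
    known.append(hajek(Y, A, p_true))
    estimated.append(hajek(Y, A, p_hat))

var_known, var_estimated = np.var(known), np.var(estimated)
print(var_known, var_estimated)
```

The gain is largest when the covariates in the treatment model are also strong predictors of the outcome, as they are by construction here.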
12.4 Marginal structural models
- Consider the following model
  $$E[Y^a] = \beta_0 + \beta_1 a$$
- Note the outcome variable is a potential outcome (counterfactual)
- Models for mean counterfactual outcomes are referred to as structural mean models
- It is a marginal structural model because we are modeling the marginal distribution of the counterfactual rather than the joint distribution of $Y^0$ and $Y^1$ (Hernán, Robins, Brumback, Epidemiology 2000); or because the structural mean model does not include any covariates (HR Section 12.4)
- The estimator $\hat\theta_1 = \hat\mu_1 - \hat\mu_0$ from the previous section is consistent and asymptotically normal (CAN) for the parameter $\beta_1 = E(Y^1) - E(Y^0)$ (the causal mean difference)
- Suppose $A$ takes on many values, e.g., number of cigarettes per day in 1982 (year of follow-up visit) minus number of cigarettes per day at baseline
- Each individual has many potential outcomes, e.g., $Y^{a=-25}$ if an individual decreased cigarettes/day by 25
- Consider the MSM
  $$E[Y^a] = \beta_0 + \beta_1 a + \beta_2 a^2$$
- Suppose we are interested in the effect of increasing smoking by 20 cigarettes/day compared to no change:
  $$E[Y^{a=20}] - E[Y^{a=0}] = 400\beta_2 + 20\beta_1$$
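The contrast follows by substituting $a = 20$ and $a = 0$ into the quadratic MSM. Written out as a worked step (added for clarity, using only the model on this slide):

$$
\begin{aligned}
E[Y^{a=20}] - E[Y^{a=0}]
  &= (\beta_0 + 20\,\beta_1 + 20^2\,\beta_2) - (\beta_0 + 0\cdot\beta_1 + 0^2\cdot\beta_2) \\
  &= 20\,\beta_1 + 400\,\beta_2 .
\end{aligned}
$$

The intercept cancels, so the contrast depends only on the linear and quadratic coefficients.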
- As before, we can consistently estimate the parameters of the MSM $(\beta_0, \beta_1, \beta_2)$ using IP weighting
- For continuous $A$, stabilized weights are no longer $\Pr[A = a]/\Pr[A = a \mid L]$, but rather of the form $f(A)/f(A \mid L)$, where $f$ denotes the (conditional) density of $A$ (given $L$)
- E.g., we might assume the usual linear model $A = \alpha L + \varepsilon$ where $\varepsilon \sim N(0, \sigma^2)$ in order to estimate $f(A \mid L)$; similarly for $f(A)$
- Based on the estimated stabilized weights, weight to create a pseudo-population and fit the model
  $$E(Y \mid A = a) = \theta_0 + \theta_1 a + \theta_2 a^2$$
SAS Program 12.4

    /* estimation of denominator of ip weights */
    proc glm data= nhefs_nmv_s
      outstat= ss_den(keep= _source_ _type_ df ss
                      where=(_source_ in ("ERROR") and _type_ in ("ERROR")));
      class exercise active education;
      model smkintensity82_71 = sex race age age*age education
            smokeintensity smokeintensity*smokeintensity
            smokeyrs smokeyrs*smokeyrs exercise active wt71 wt71*wt71 / solution;
      output out= temp_den p= pred;
    data sd_den;
      set ss_den;
      rootmse_n= sqrt(ss/df);   /* residual SD = sqrt(SSE / df) */
      match= 1;
      keep rootmse_n match;
    data est_dens_d;
      merge temp_den sd_den;
      by match;
      dens_den = pdf("NORMAL", smkintensity82_71, pred, rootmse_n);
    proc sort; by seqn;

    /* estimation of numerator of ip weights */
    proc glm data= nhefs_nmv_s
      outstat= ss_num(keep= _source_ _type_ df ss
                      where=(_source_ in ("ERROR") and _type_ in ("ERROR")));
      model smkintensity82_71 = / solution;
      output out= temp_num p= pred;
    data sd_num;
      set ss_num;
      rootmse_n= sqrt(ss/df);
      match= 1;
      keep rootmse_n match;
    data est_dens_n;
      merge temp_num sd_num;
      by match;
      dens_num = pdf("NORMAL", smkintensity82_71, pred, rootmse_n);
    proc sort; by seqn;

    /* stabilized weights f(A) / f(A | L) */
    data nhefs_sw_cont;
      merge est_dens_d est_dens_n;
      by seqn;
      sw_a= dens_num / dens_den;
    /* check the distribution of the weights */
    proc univariate data=nhefs_sw_cont;
      var sw_a;
      id seqn;
    /* weighted fit of the quadratic MSM with robust (sandwich) standard errors */
    proc genmod data= nhefs_sw_cont;
      class seqn;
      weight sw_a;
      model wt82_71= smkintensity82_71 smkintensity82_71*smkintensity82_71;
      estimate "No change" intercept 1 smkintensity82_71 0;
      /* quadratic coefficient 20^2 = 400 added so the contrasts match 20*beta1 + 400*beta2 */
      estimate "Increase smoking by 20 cig/day" intercept 1 smkintensity82_71 20
               smkintensity82_71*smkintensity82_71 400;
      estimate "Effect of increase smoking by 20 cig/day" intercept 0 smkintensity82_71 20
               smkintensity82_71*smkintensity82_71 400;
      repeated subject=seqn / type=ind;
    run;
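The same sequence of steps (normal linear model for the denominator density, intercept-only model for the numerator, ratio of densities, weighted quadratic fit) can be sketched in Python. This is an illustration on simulated data, not a translation validated against NHEFS; the data-generating values and the helper `normal_pdf` are assumptions made for the example.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

rng = np.random.default_rng(5)
n = 5000
L = rng.normal(size=n)
A = 10.0 * (0.5 * L + rng.normal(size=n))          # continuous change in cigarettes/day
Y = 2.0 + 0.2 * A + 0.005 * A**2 + L + rng.normal(size=n)

# Denominator: normal linear model for A given L, residual SD = sqrt(SSE / df).
X = np.column_stack([np.ones(n), L])
alpha = np.linalg.lstsq(X, A, rcond=None)[0]
pred_den = X @ alpha
sd_den = np.sqrt(np.sum((A - pred_den) ** 2) / (n - X.shape[1]))
dens_den = normal_pdf(A, pred_den, sd_den)

# Numerator: intercept-only model for A.
sd_num = np.sqrt(np.sum((A - A.mean()) ** 2) / (n - 1))
dens_num = normal_pdf(A, A.mean(), sd_num)

sw = dens_num / dens_den                           # stabilized weights f(A) / f(A | L)

# Weighted LS fit of the quadratic MSM and the 20-unit contrast.
D = np.column_stack([np.ones(n), A, A**2])
theta = np.linalg.solve(D.T @ (sw[:, None] * D), D.T @ (sw * Y))
effect_20 = 20 * theta[1] + 400 * theta[2]
print(sw.mean(), effect_20)
```

Standard errors are omitted here; in the SAS program they come from the `repeated` statement, which requests the empirical sandwich variance.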
Weight Check
- Note proc univariate is included to check that the weights have mean near 1
- If the models are correctly specified, then the mean should be near 1. Why?
- Consider $A$ binary and stabilized weights:
  $$E(W) = E\left( A\,\frac{\Pr[A = 1]}{\Pr[A = 1 \mid L]} + (1 - A)\,\frac{\Pr[A = 0]}{\Pr[A = 0 \mid L]} \right)$$
  $$= \Pr[A = 1]\,E\left( \frac{A}{\Pr[A = 1 \mid L]} \right) + \Pr[A = 0]\,E\left( \frac{1 - A}{\Pr[A = 0 \mid L]} \right)$$
  $$= \Pr[A = 1]\,E_L\left( E_{A \mid L}\,\frac{A}{\Pr[A = 1 \mid L]} \right) + \Pr[A = 0]\,E_L\left( E_{A \mid L}\,\frac{1 - A}{\Pr[A = 0 \mid L]} \right) = 1$$
- Deviations from 1 indicate model misspecification or possible violations, or near violations, of positivity
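The check is easy to reproduce. In the sketch below (simulated data, hypothetical) the treatment probabilities are estimated nonparametrically within levels of a single discrete covariate; in that saturated case the sample mean of the stabilized weights equals 1 exactly, not just approximately.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 600
L = rng.integers(0, 3, size=n)                     # one discrete covariate, 3 levels
p_true = np.array([0.3, 0.5, 0.7])[L]
A = rng.binomial(1, p_true)

# Saturated (nonparametric) estimates of Pr[A = 1 | L] and Pr[A = 1].
p_cond = np.array([A[L == l].mean() for l in range(3)])[L]
p_marg = A.mean()

# Stabilized weight: marginal over conditional probability of the treatment received.
sw = np.where(A == 1, p_marg / p_cond, (1 - p_marg) / (1 - p_cond))
print(sw.mean())
```

With a parametric treatment model the mean is only approximately 1, so in practice one looks for means close to 1 and for extreme individual weights.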
12.4 Marginal structural models
- What if the outcome $Y$ is dichotomous?
- E.g., suppose $A = 1$ if quit smoking, $A = 0$ otherwise, and $Y = 1$ if dead by 1982, $Y = 0$ if alive
- Marginal structural logistic model
  $$\text{logit}\,\Pr[Y^a = 1] = \alpha_0 + \alpha_1 a$$
  where $\exp(\alpha_1)$ is the causal odds ratio of death for quitting versus not quitting smoking
- Parameters $\alpha_0$ and $\alpha_1$ can be consistently estimated by fitting the logistic model
  $$\text{logit}\,\Pr[Y = 1 \mid A] = \theta_0 + \theta_1 A$$
  to the pseudo-population created by IP weighting
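A weighted logistic fit of this kind can be sketched in a few lines. The example below uses simulated data (hypothetical, not the NHEFS mortality outcome), stabilized weights built from a stand-in for the fitted treatment probabilities, and a small Newton-Raphson routine (`weighted_logistic`, defined here for the example). Because $A$ is binary the model is saturated, so the fitted odds ratio can be cross-checked against the ratio of weighted odds in the two arms.

```python
import numpy as np

def weighted_logistic(X, y, w, iters=30):
    """Weighted logistic regression via Newton-Raphson (weights multiply the score)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1 / (1 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - mu))
        hess = X.T @ (X * (w * mu * (1 - mu))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(7)
n = 2000
L = rng.normal(size=n)
p = 1 / (1 + np.exp(-0.7 * L))                     # stand-in for estimated Pr[A = 1 | L]
A = rng.binomial(1, p).astype(float)
Y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.5 * A + 0.8 * L)))).astype(float)

pA1 = A.mean()
sw = A * pA1 / p + (1 - A) * (1 - pA1) / (1 - p)   # stabilized weights

theta = weighted_logistic(np.column_stack([np.ones(n), A]), Y, sw)
causal_or = np.exp(theta[1])                       # estimated causal odds ratio

# Closed form for the saturated model: ratio of weighted odds in the two arms.
r1 = np.sum(sw * Y * A) / np.sum(sw * A)
r0 = np.sum(sw * Y * (1 - A)) / np.sum(sw * (1 - A))
closed_form_or = (r1 / (1 - r1)) / (r0 / (1 - r0))
print(causal_or, closed_form_or)
```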
12.5 Effect modification and MSM
- Covariates can be included in the MSM to assess effect modification
- E.g., suppose $V$ encodes sex (1 male, 0 female) and consider the MSM
  $$E[Y^a \mid V] = \beta_0 + \beta_1 a + \beta_2 V a + \beta_3 V$$
- Additive effect modification if $\beta_2 \neq 0$
- Consistently estimate the parameters of the MSM by fitting, via weighted LS,
  $$E[Y \mid A, V] = \gamma_0 + \gamma_1 A + \gamma_2 A V + \gamma_3 V$$
- Weights are based on covariates $L$ that include $V$ and any other variables sufficient to ensure exchangeability within levels of $V$
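As an illustration of the weighted LS fit with a product term, the sketch below simulates data with a binary modifier $V$ (hypothetical data; `p` stands in for fitted treatment probabilities from a model that includes $V$). With $A$ and $V$ both binary the model is saturated, so $\hat\gamma_2$ equals a difference of differences of weighted cell means, which the code verifies.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2000
V = rng.binomial(1, 0.5, size=n).astype(float)     # effect modifier (part of L)
L2 = rng.normal(size=n)                            # another confounder
p = 1 / (1 + np.exp(-(0.4 * V + 0.6 * L2 - 0.2)))  # stand-in for estimated Pr[A = 1 | L]
A = rng.binomial(1, p).astype(float)
Y = 1.0 + 2.0 * A + 1.5 * A * V + 0.5 * V + L2 + rng.normal(size=n)
W = A / p + (1 - A) / (1 - p)

# Weighted LS for E[Y | A, V] = g0 + g1*A + g2*A*V + g3*V.
X = np.column_stack([np.ones(n), A, A * V, V])
gamma = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * Y))

def wmean(a, v):
    """Weighted mean of Y in the cell A = a, V = v."""
    cell = (A == a) & (V == v)
    return np.sum(W[cell] * Y[cell]) / np.sum(W[cell])

# Difference of the treatment effects in the two levels of V.
did = (wmean(1, 1) - wmean(0, 1)) - (wmean(1, 0) - wmean(0, 0))
print(gamma[2], did)
```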
12.6 Censoring and missing data
- When estimating the effect of smoking cessation $A$ on weight gain $Y$, we restricted the analysis to the $n = 1566$ individuals with a body weight measurement at the end of follow-up in 1982
- There were, however, 63 additional individuals excluded from the analysis because their weight in 1982 was not known
- Selecting only individuals with non-missing outcome values may introduce selection bias (Chapter 8)
- Let $C = 1$ if body weight is missing (censored), $C = 0$ otherwise
- Unstabilized weights $W$ now equal
  $$A(1 - C)\,\frac{1}{\Pr[C = 0 \mid A, L]\,\Pr[A = 1 \mid L]} + (1 - A)(1 - C)\,\frac{1}{\Pr[C = 0 \mid A, L]\,\Pr[A = 0 \mid L]}$$
  and stabilized weights are adjusted analogously
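The combined weight is the product of a treatment weight and a censoring weight, and it is zero for censored individuals. A minimal sketch, with made-up probabilities standing in for fitted values (the function name `ip_weights_with_censoring` is hypothetical):

```python
import numpy as np

def ip_weights_with_censoring(A, C, pA1, pC0):
    """Unstabilized weights: 1 / (Pr[C=0|A,L] * Pr[A|L]) if uncensored, 0 if censored."""
    A, C, pA1, pC0 = (np.asarray(x, dtype=float) for x in (A, C, pA1, pC0))
    pA = A * pA1 + (1 - A) * (1 - pA1)     # probability of the treatment actually received
    return (1 - C) / (pC0 * pA)

# Hypothetical fitted probabilities for four individuals.
A   = [1, 0, 1, 0]                 # treatment received
C   = [0, 0, 1, 0]                 # 1 = outcome missing (censored)
pA1 = [0.5, 0.25, 0.5, 0.8]        # Pr[A = 1 | L]
pC0 = [0.8, 0.5, 0.9, 1.0]         # Pr[C = 0 | A, L]
W = ip_weights_with_censoring(A, C, pA1, pC0)
print(W)
```

In an analysis, `pA1` would come from a treatment model and `pC0` from a model for being uncensored given treatment and covariates; the outcome model is then fitted to the uncensored individuals with these weights.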
More informationApplication of Item Response Theory Models for Intensive Longitudinal Data
Application of Item Response Theory Models for Intensive Longitudinal Data Don Hedeker, Robin Mermelstein, & Brian Flay University of Illinois at Chicago hedeker@uic.edu Models for Intensive Longitudinal
More informationSupplementary Materials for Residual Balancing: A Method of Constructing Weights for Marginal Structural Models
Supplementary Materials for Residual Balancing: A Method of Constructing Weights for Marginal Structural Models Xiang Zhou Harvard University Geoffrey T. Wodtke University of Toronto March 29, 2019 A.
More informationeappendix: Description of mgformula SAS macro for parametric mediational g-formula
eappendix: Description of mgformula SAS macro for parametric mediational g-formula The implementation of causal mediation analysis with time-varying exposures, mediators, and confounders Introduction The
More informationGeneralized logit models for nominal multinomial responses. Local odds ratios
Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π
More informationAn Introduction to Causal Inference, with Extensions to Longitudinal Data
An Introduction to Causal Inference, with Extensions to Longitudinal Data Tyler VanderWeele Harvard Catalyst Biostatistics Seminar Series November 18, 2009 Plan of Presentation Association and Causation
More informationCorrelation and regression
1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,
More informationSTA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3
STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae
More informationLogistic Regression. Fitting the Logistic Regression Model BAL040-A.A.-10-MAJ
Logistic Regression The goal of a logistic regression analysis is to find the best fitting and most parsimonious, yet biologically reasonable, model to describe the relationship between an outcome (dependent
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationComparative effectiveness of dynamic treatment regimes
Comparative effectiveness of dynamic treatment regimes An application of the parametric g- formula Miguel Hernán Departments of Epidemiology and Biostatistics Harvard School of Public Health www.hsph.harvard.edu/causal
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2003 Paper 140 Comparison of the Inverse Probability of Treatment Weighted (IPTW) Estimator With a Naïve
More informationLecture 7 Time-dependent Covariates in Cox Regression
Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the
More informationPSC 504: Dynamic Causal Inference
PSC 504: Dynamic Causal Inference Matthew Blackwell 4/8/203 e problem Let s go back to a problem that we faced earlier, which is how to estimate causal effects with treatments that vary over time. We could
More informationSTA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).
STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population
More informationSTA216: Generalized Linear Models. Lecture 1. Review and Introduction
STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general
More informationESTIMATE PROP. IMPAIRED PRE- AND POST-INTERVENTION FOR THIN LIQUID SWALLOW TASKS. The SURVEYFREQ Procedure
ESTIMATE PROP. IMPAIRED PRE- AND POST-INTERVENTION FOR THIN LIQUID SWALLOW TASKS 18:58 Sunday, July 26, 2015 1 The SURVEYFREQ Procedure Data Summary Number of Clusters 30 Number of Observations 360 time_cat
More informationPropensity Score Methods for Estimating Causal Effects from Complex Survey Data
Propensity Score Methods for Estimating Causal Effects from Complex Survey Data Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School
More informationPropensity Score Methods for Causal Inference
John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good
More informationTopic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model
Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is
More informationPerson-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data
Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time
More informationLecture 5: ANOVA and Correlation
Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions
More informationApplied Econometrics (QEM)
Applied Econometrics (QEM) The Simple Linear Regression Model based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #2 The Simple
More informationECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam
ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The
More informationSAMPLING BIOS 662. Michael G. Hudgens, Ph.D. mhudgens :55. BIOS Sampling
SAMPLIG BIOS 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-11-14 15:55 BIOS 662 1 Sampling Outline Preliminaries Simple random sampling Population mean Population
More informationST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses
ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities
More informationSection 9c. Propensity scores. Controlling for bias & confounding in observational studies
Section 9c Propensity scores Controlling for bias & confounding in observational studies 1 Logistic regression and propensity scores Consider comparing an outcome in two treatment groups: A vs B. In a
More informationECON Introductory Econometrics. Lecture 17: Experiments
ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.
More informationCIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis
CIMAT Taller de Modelos de Capture y Recaptura 2010 Known Fate urvival Analysis B D BALANCE MODEL implest population model N = λ t+ 1 N t Deeper understanding of dynamics can be gained by identifying variation
More informationGenerating survival data for fitting marginal structural Cox models using Stata Stata Conference in San Diego, California
Generating survival data for fitting marginal structural Cox models using Stata 2012 Stata Conference in San Diego, California Outline Idea of MSM Various weights Fitting MSM in Stata using pooled logistic
More informationJournal of Biostatistics and Epidemiology
Journal of Biostatistics and Epidemiology Methodology Marginal versus conditional causal effects Kazem Mohammad 1, Seyed Saeed Hashemi-Nazari 2, Nasrin Mansournia 3, Mohammad Ali Mansournia 1* 1 Department
More information