Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design
|
|
- Thomas Townsend
- 5 years ago
- Views:
Transcription
1 1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary Thompson) November 13, 2015
2 2 / 32 1 Empirical Likelihood for Two-sample Problems 2 Survey Samples for Observational Studies 3 Concluding Remarks
3 3 / 32 1 Empirical Likelihood for Two-sample Problems 2 Survey Samples for Observational Studies 3 Concluding Remarks
4 4 / 32 (1) Two Independent Samples Two groups: treatment vs control Response Y: Y 1 for treatment and Y 0 for control Sample data on Y: { Y11,, Y 1n1 } Covariates might be involved µ 1 = E(Y 1 ) and µ 0 = E(Y 0 ) and F 1 (t) = P(Y 1 t) and F 0 (t) = P(Y 0 t) S 1 (t) = P(Y 1 > t) = 1 F 1 (t) S 0 (t) = P(Y 0 > t) = 1 F 0 (t) { Y01,, Y 0n0 }
5 5 / 32 (1) Two Independent Samples Inference on treatment effect: θ = µ 1 µ 0 Inference on survival functions (distribution functions): Y 1 is stochastically larger than Y 0 if S 1 (t) > S 0 (t) for t > 0. Empirical likelihood methods for two-sample problems (Wu and Yan, 2012): Two independent samples with no missing values Two independent samples with responses missing at random Two independent survey samples with no missing values
6 6 / 32 (2) Pretest-Posttest Studies A very popular approach in medical and social sciences Measure changes resulting from a treatment or an intervention A version of two-sample problems Design I: Paired comparison (less popular one) Select a random sample of n units from the target population; Measure the response Y on all units BEFORE and AFTER the treatment/intervention.
7 7 / 32 (2) Pretest-Posttest Studies Design II: (commonly used method) A random sample of n units is taken from the target population Measures on certain baseline variables Z are obtained for ALL n individuals (pretest measures) Among the n units, n 1 are randomly selected and assigned to treatment ; Values of the response variable Y are obtained (posttest measures) The other n 0 = n n 1 units are assigned to control ; Values of the response variable Y are also obtained
8 8 / 32 (2) Pretest-Posttest Studies Y 1 : response under treatment Y 0 : response under control The available data (n = n 1 + n 0 ) i 1 2 n 1 n n n Z Z 1 Z 2 Z n1 Z n1 +1 Z n1 +2 Z n Y 1 Y 11 Y 12 Y 1n1 Y 0 Y 0(n1 +1) Y 0(n1 +2) Y 0n Two distinct features of the design: Response Missing-by-Design Availability of baseline information for all n units The Z variables follow the same distributions for both groups (due to randomization)
9 9 / 32 (2) Pretest-Posttest Studies A two-sample problem with unique features Parameter of primary interest: θ = µ 1 µ 0 Test H 0 : θ = 0 vs H 1 : θ 0 (or θ > 0) Test H 0 : F 1 = F 0 vs H 1 : F 1 < F 0 (OR H 0 : S 1 = S 0 vs H 1 : S 1 > S 0 ) Question: How to effectively use the baseline information and the feature of missing-by-design?
10 10 / 32 (3) Two-sample Problems for Observational Studies Baseline information (Z) collected for all n units Each unit is assigned to either treatment or control (Missing-by-Design) Assignments to treatment or control are not randomized: Example 1. Patients self-selection of treatment among two alternative choices. Example 2. Voluntary participation in a school smoking-intervention education program. Example 3. Modes of survey data collection: Web versus telephone interview.
11 11 / 32 An EL Approach to Pretest-Posttest Studies (Huang, Qin and Follmann, JASA, 2008) Parameter of interest: θ = µ 1 µ 0 Find the EL estimators of µ 1 and µ 0 separately Estimate θ by ˆθ = ˆµ 1 ˆµ 0 Estimate the standard error of ˆθ through a bootstrap method Inference on θ = µ 1 µ 0 using (ˆθ θ)/se(ˆθ) Finding ˆµ 1 (and ˆµ 0 ) is the main focus of the HQF paper
12 12 / 32 An Imputation-based Two-Sample EL Approach Why imputation? More efficient use of baseline information! i 1 2 n 1 n n n Z Z 1 Z 2 Z n1 Z n1 +1 Z n1 +2 Z n Y 1 Y 11 Y 12 Y 1n1 Y 0 Y 0(n1 +1) Y 0(n1 +2) Y 0n Regression modelling: Y 1i = Z T i β 1 + ɛ 1i, i = 1,, n, (1) Y 0i = Z T i β 0 + ɛ 0i, i = 1,, n, (2)
13 13 / 32 An Imputation-based Two-Sample EL Approach Let R i = 1 if i is under treatment, R i = 0 otherwise, ˆβ 1 = ˆβ 0 = ( n ) 1 n R i Z i Z T i R i Z i Y 1i, i=1 i=1 ( n ) 1 n (1 R i )Z i Z T i (1 R i )Z i Y 0i i=1 i=1 Regression imputation: Y 1i = Z T i ˆβ 1, i = n 1 + 1,, n Y 0i = Z T i ˆβ 0, i = 1,, n 1
14 14 / 32 Imputation-basede EL Inference on θ = µ 1 µ 0 Two augmented samples after imputation: {Ỹ 1i = R i Y 1i + (1 R i )Y 1i, i = 1,, n} {Ỹ 0i = (1 R i )Y 0i + R i Y 0i, i = 1,, n}. Each sample has an enlarged sample size at n = n 1 + n 0 The imputed samples are no longer independent Inferences are under the assumed regression models (the imputation model)
15 15 / 32 1 Empirical Likelihood for Two-sample Problems 2 Survey Samples for Observational Studies 3 Concluding Remarks
16 16 / 32 Two Independent Surveys Two independent survey samples S 1 and S 0 from the same population Two (possibly different) designs: d 1i, i S 1 ; d 0i, i S 0 d 1i = d 1i k S 1 d 1k and d0i = d 0i k S 0 d 0k Response variables Y 1 and Y 0 ; Survey sample data: } } {(y 1i, z 1i ), i S 1 and {(y 0i, z 0i ), i S 0 Parameter of interest: θ = µ y1 µ y0 Design-based estimator of θ: ˆθ 1 = ˆµ y1 ˆµ y0 = i S1 d 1i y 1i i S0 d 0i y 0i
17 17 / 32 Two-sample Pseudo Empirical Likelihood Two-sample pseudo EL function l(p, q) = 1 d1i log(p 1i ) + 1 d0i log(q 0i ) 2 2 i S 1 i S 0 Normalization constraints i S 1 p 1i = 1 and i S 0 q 0i = 1 Parameter constraint p 1i y 1i q 0i y 0i = θ i S 1 i S 0 Additional constraints on z 1 and z 0 depending on what s available
18 18 / 32 The ITC Four Country Survey (ITC4) The International Tobacco Control Policy Evaluation Project (The ITC Project) ITC Four Country Survey: Waves 1-7 by telephone interview ITC Four Country Survey: Wave 8, respondents chose to complete the survey either by telephone interview or self-administered web survey Total number of respondents at wave 8: 4507 Number of respondents by telephone: 2709 Number of respondents by web: 1798 The research question: Examine the effect of two different modes for data collection Y 1 : response by web (treatment) Y 0 : response by telephone (control)
19 19 / 32 Mode Effect in Survey Data Collection Survey sample S of size n; design weights d i, i = 1,, n. Units i = 1,, n 1 chose treatment Units i = n 1 + 1,, n chose control Let R i = 1 if unit i chose treatment, R i = 0 if unit i chose control, i = 1,, n Baseline information z i observed for all i = 1,, n Response variable missing-by-design: Y 1i observed for i = 1,, n 1 Y 0i observed for i = n 1 + 1,, n Point estimate and confidence interval for θ = µ y1 µ y0
20 20 / 32 Mode Effect Under randomization to treatment and control: E(Y 1 R = 1) E(Y 0 R = 0) = E(Y 1 ) E(Y 0 ) = θ With self-selection of treatment and control: E(Y 1 R = 1) E(Y 1 ), E(Y 0 R = 0) E(Y 0 ) Ignorable treatment assignment (Rosenbaum and Rubin, 1983): (Y 1, Y 0 ) and R are independent given Z Test ignobility using a two-phase sampling technique? (Chen and Kim, 2014)
21 21 / 32 Mode Effect: Propensity Score Adjustment (PSA) Treatment assignment (self-selection) depends only on Z P(R = 1 Y 1, Y 0, Z = z) = P(R = 1 Z = z) = r(z) Fit a feasible model to obtain the propensity scores ˆr i = ˆr(z i ), i = 1,, n Available data {(y 1i, z i ), i = 1,, n 1 } and { } (y 0i, z i ), i = n 1 + 1,, n Point estimator for θ = µ y1 µ y0 using PSA: ˆθ 2 = n 1 i=1 d i ˆr i y 1i n i=n 1 +1 d i 1 ˆr i y 0i
22 22 / 32 Pseudo EL Under Propensity Score Adjustment Design weights d i = P(i S), i = 1,, n; d i = d i / k S d k Pseudo empirical likelihood function and constraints l(p, q) = n 1 i=1 n 1 d i ˆr i log(p i ) + p i = 1, i=1 n 1 p i y 1i i=1 n j=n 1 +1 n j=n 1 +1 n j=n 1 +1 d j 1 ˆr j log(q j ) (3) q j = 1 (4) q j y 0j = θ (5) ˆp(θ) and ˆq(θ): Maximizer of (3) under constraints (4) and (5) The maximum ( pseudo EL ) estimator ˆθ 3 : Maximize l ˆp(θ), ˆq(θ) w.r.t. θ
23 23 / 32 A Simulation Study N = 20, 000; n = 200; Single stage PPS sampling x i1 Bernoulli(0.5); x i2 U[0, 1]; x i exp(1) π i x i3 ; max π i / min π i = 45 Linear models for the responses y i0 and y i1 : y ik = β 0k + β 1k x i1 + β 2k x i2 + β 3k x i3 + ε i, i = 1, 2,..., N Logistic regression model for R i : r i = P(R i = 1 x i ) ( ri ) log = γ 0 + γ 1 x i1 + γ 2 x i2 + γ 3 x i3, 1 r i i = 1, 2,..., N Three point estimators of θ = µ y1 µ y0 : ˆθ1, ˆθ2, ˆθ3 B = 1000 repeated simulation runs
24 24 / 32 A Simulation Study Table : Absolute Relative Bias (ARB, %) and Mean Square Error (MSE) θ = µ y1 µ y0 = 1 θ = µ y1 µ y0 = 2 E(R) ˆθ1 ˆθ2 ˆθ3 ˆθ1 ˆθ2 ˆθ ARB MSE ARB MSE ARB MSE
25 25 / 32 Post-stratification by Propensity Score With self-selection of treatment and control, the distributions of Z R = 1 and Z R = 0 tend to be different Matching by propensity score is an effective way to balance the distribution of Z between the treated and the untreated (Rosenbaum and Rubin, 1983, 1984) Order the units based on fitted propensity scores ˆr (1) ˆr (2) ˆr (n) Form K strata based on suitable cut-off of the propensity scores (Popular choice: K = 5) Units within the same stratum have similar values of propensity scores
26 26 / 32 Post-stratification by Propensity Score Post-stratified samples: S = Q 1 Q K Estimated stratum weights ( ) ( ) Ŵ k = / d i, k = 1,, K Population means µ y1 = i Q k d i i S K W k µ y1k and µ y0 = k=1 K W k µ y0k k=1 Treatment and control groups within Q k = S 1k S 0k : R i = 1 if i S 1k and R i = 0 if i S 0k
27 27 / 32 Post-stratification by Propensity Score For each Q k, the distributions of Z over S 1k and S 0k are approximately the same Balance diagnostics tools are available (Austin, 2008, 2009) The post-stratified estimator of θ: ˆµ y1k = ˆθ = K k=1 Ŵ k ( ˆµ y1k ˆµ y0k ) i S 1k d i y 1i i S 1k d i, ˆµ y0k = i S 0k d i y 0i i S 0k d i Post-stratification provides an effective way of using baseline information on Z
28 28 / 32 Using Z in Pseudo EL Through Additional Constraints The pseudo EL function for the post-stratified samples l = K Ŵ k i S 1k dik log(p ik ) + k=1 k=1 K Ŵ k i S 0k dik log(q ik ) Constraints p ik = 1, q ik = 1, k = 1,, K i S 1k i S 0k K K Ŵ k p ik y 1i Ŵ k q ik y 0i = θ k=1 i S 1k k=1 i S 0k K K Ŵ k p ik z i = Ŵ k q ik z i i S 1k i S 0k k=1 k=1
29 29 / 32 Using Z in Pseudo EL Through Imputation For each Q k = S 1k S 0k : (y 1i, z i ), i S 1k plus z i, i S 0k (y 0i, z i ), i S 0k plus z i, i S 1k Build a model using {(y 1i, z i ), i S 1k } Predict y 1i for i S 0k using z i, i S 0k Build a model using {(y 0i, z i ), i S 0k } Predict y 0i for i S 1k using z i, i S 1k Using the two imputed stratified samples for pseudo EL inference
30 30 / 32 1 Empirical Likelihood for Two-sample Problems 2 Survey Samples for Observational Studies 3 Concluding Remarks
31 31 / 32 Concluding Remarks Confidence intervals or hypothesis tests using the EL ratio statistic have better performances than the conventional normal theory methods Performances of imputation-based EL approach to pretest-posttest studies depend on two crucial conditions: Prediction power of the baseline variables Z Reliability of the model used for imputation Linear regression models are convenient, but kernel regression models can also be used Efficient computational procedures for EL are available for practical implementations We are currently conducting further simulation studies on mode effects in survey data collection
32 32 / 32 Acknowledgement This talk is partially based on Min Chen s PhD dissertation research at the University of Waterloo Chen, M., Wu, C. and Thompson, M.E. (2015). An Imputation Based Empirical Likelihood Approach to Pretest-Posttest Studies. The Canadian Journal of Statistics, 43, Chen, M., Wu, C. and Thompson, M.E. (2015). Mann-Whitney Test with Empirical Likelihood Methods for Pretest-Posttest Studies. Revised for Journal of Nonparametric Statistics. Wu, C. and Yan, Y. (2012). Weighted Empirical Likelihood Inference for Two-sample Problems. Statistics and Its Interface, 5,
Empirical Likelihood Methods for Pretest-Posttest Studies
Empirical Likelihood Methods for Pretest-Posttest Studies by Min Chen A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy in
More informationANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW
SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved
More informationCausal Inference Basics
Causal Inference Basics Sam Lendle October 09, 2013 Observed data, question, counterfactuals Observed data: n i.i.d copies of baseline covariates W, treatment A {0, 1}, and outcome Y. O i = (W i, A i,
More informationAn Introduction to Causal Analysis on Observational Data using Propensity Scores
An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut
More informationSelection on Observables: Propensity Score Matching.
Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017
More informationEmpirical Likelihood Methods for Sample Survey Data: An Overview
AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use
More informationFractional Imputation in Survey Sampling: A Comparative Review
Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical
More informationData Integration for Big Data Analysis for finite population inference
for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation
More informationPropensity Score Methods for Causal Inference
John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good
More informationPropensity Score Weighting with Multilevel Data
Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative
More informationRobustness to Parametric Assumptions in Missing Data Models
Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice
More informationEmpirical Likelihood Inference for Two-Sample Problems
Empirical Likelihood Inference for Two-Sample Problems by Ying Yan A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Statistics
More informationCovariate Balancing Propensity Score for General Treatment Regimes
Covariate Balancing Propensity Score for General Treatment Regimes Kosuke Imai Princeton University October 14, 2014 Talk at the Department of Psychiatry, Columbia University Joint work with Christian
More informationarxiv: v1 [stat.me] 15 May 2011
Working Paper Propensity Score Analysis with Matching Weights Liang Li, Ph.D. arxiv:1105.2917v1 [stat.me] 15 May 2011 Associate Staff of Biostatistics Department of Quantitative Health Sciences, Cleveland
More informationCombining Non-probability and Probability Survey Samples Through Mass Imputation
Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu
More informationNew Developments in Nonresponse Adjustment Methods
New Developments in Nonresponse Adjustment Methods Fannie Cobben January 23, 2009 1 Introduction In this paper, we describe two relatively new techniques to adjust for (unit) nonresponse bias: The sample
More informationRecent Advances in the analysis of missing data with non-ignorable missingness
Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014 1 Introduction 2 Full likelihood-based ML estimation
More informationCausal Inference with General Treatment Regimes: Generalizing the Propensity Score
Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke
More informationMS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari
MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind
More informationAn Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data
An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)
More informationMISSING or INCOMPLETE DATA
MISSING or INCOMPLETE DATA A (fairly) complete review of basic practice Don McLeish and Cyntha Struthers University of Waterloo Dec 5, 2015 Structure of the Workshop Session 1 Common methods for dealing
More informationPropensity Score Analysis with Hierarchical Data
Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational
More informationCausal Inference for Mediation Effects
Causal Inference for Mediation Effects by Jing Zhang B.S., University of Science and Technology of China, 2006 M.S., Brown University, 2008 A Dissertation Submitted in Partial Fulfillment of the Requirements
More informationCombining multiple observational data sources to estimate causal eects
Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,
More informationModification and Improvement of Empirical Likelihood for Missing Response Problem
UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu
More informationMarginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal
Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect
More informationMax. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes
Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter
More informationConstrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources
Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Yi-Hau Chen Institute of Statistical Science, Academia Sinica Joint with Nilanjan
More informationDATA-ADAPTIVE VARIABLE SELECTION FOR
DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department
More informationStatistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach
Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score
More informationPROPENSITY SCORE MATCHING. Walter Leite
PROPENSITY SCORE MATCHING Walter Leite 1 EXAMPLE Question: Does having a job that provides or subsidizes child care increate the length that working mothers breastfeed their children? Treatment: Working
More informationGeneralized Pseudo Empirical Likelihood Inferences for Complex Surveys
The Canadian Journal of Statistics Vol.??, No.?,????, Pages???-??? La revue canadienne de statistique Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys Zhiqiang TAN 1 and Changbao
More informationWhat s New in Econometrics. Lecture 1
What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and
More informationEmpirical Likelihood Methods
Handbook of Statistics, Volume 29 Sample Surveys: Theory, Methods and Inference Empirical Likelihood Methods J.N.K. Rao and Changbao Wu (February 14, 2008, Final Version) 1 Likelihood-based Approaches
More informationEmpirical likelihood methods in missing response problems and causal interference
The University of Toledo The University of Toledo Digital Repository Theses and Dissertations 2017 Empirical likelihood methods in missing response problems and causal interference Kaili Ren University
More informationCausal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk
Causal Inference in Observational Studies with Non-Binary reatments Statistics Section, Imperial College London Joint work with Shandong Zhao and Kosuke Imai Cass Business School, October 2013 Outline
More informationPrimal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing
Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal
More informationSemiparametric Regression Based on Multiple Sources
Semiparametric Regression Based on Multiple Sources Benjamin Kedem Department of Mathematics & Inst. for Systems Research University of Maryland, College Park Give me a place to stand and rest my lever
More informationMISSING or INCOMPLETE DATA
MISSING or INCOMPLETE DATA A (fairly) complete review of basic practice Don McLeish and Cyntha Struthers University of Waterloo Dec 5, 2015 Structure of the Workshop Session 1 Common methods for dealing
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationAdvising on Research Methods: A consultant's companion. Herman J. Ader Gideon J. Mellenbergh with contributions by David J. Hand
Advising on Research Methods: A consultant's companion Herman J. Ader Gideon J. Mellenbergh with contributions by David J. Hand Contents Preface 13 I Preliminaries 19 1 Giving advice on research methods
More informationVariable selection and machine learning methods in causal inference
Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More informationSurvival Analysis for Case-Cohort Studies
Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz
More informationDOUBLY ROBUST NONPARAMETRIC MULTIPLE IMPUTATION FOR IGNORABLE MISSING DATA
Statistica Sinica 22 (2012), 149-172 doi:http://dx.doi.org/10.5705/ss.2010.069 DOUBLY ROBUST NONPARAMETRIC MULTIPLE IMPUTATION FOR IGNORABLE MISSING DATA Qi Long, Chiu-Hsieh Hsu and Yisheng Li Emory University,
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationDifference-in-Differences Methods
Difference-in-Differences Methods Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 1 Introduction: A Motivating Example 2 Identification 3 Estimation and Inference 4 Diagnostics
More informationA comparison of weighted estimators for the population mean. Ye Yang Weighting in surveys group
A comparison of weighted estimators for the population mean Ye Yang Weighting in surveys group Motivation Survey sample in which auxiliary variables are known for the population and an outcome variable
More informationMonte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics
Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Amang S. Sukasih, Mathematica Policy Research, Inc. Donsig Jang, Mathematica Policy Research, Inc. Amang S. Sukasih,
More informationMann-Whitney Test with Adjustments to Pre-treatment Variables for Missing Values and Observational Study
MPRA Munich Personal RePEc Archive Mann-Whitney Test with Adjustments to Pre-treatment Variables for Missing Values and Observational Study Songxi Chen and Jing Qin and Chengyong Tang January 013 Online
More informationSection 9c. Propensity scores. Controlling for bias & confounding in observational studies
Section 9c Propensity scores Controlling for bias & confounding in observational studies 1 Logistic regression and propensity scores Consider comparing an outcome in two treatment groups: A vs B. In a
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More informationApproximation of Survival Function by Taylor Series for General Partly Interval Censored Data
Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationNuisance parameter elimination for proportional likelihood ratio models with nonignorable missingness and random truncation
Biometrika Advance Access published October 24, 202 Biometrika (202), pp. 8 C 202 Biometrika rust Printed in Great Britain doi: 0.093/biomet/ass056 Nuisance parameter elimination for proportional likelihood
More informationSurvival Analysis Math 434 Fall 2011
Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup
More informationJong-Min Kim* and Jon E. Anderson. Statistics Discipline Division of Science and Mathematics University of Minnesota at Morris
Jackknife Variance Estimation for Two Samples after Imputation under Two-Phase Sampling Jong-Min Kim* and Jon E. Anderson jongmink@mrs.umn.edu Statistics Discipline Division of Science and Mathematics
More informationStrategy of Bayesian Propensity. Score Estimation Approach. in Observational Study
Theoretical Mathematics & Applications, vol.2, no.3, 2012, 75-86 ISSN: 1792-9687 (print), 1792-9709 (online) Scienpress Ltd, 2012 Strategy of Bayesian Propensity Score Estimation Approach in Observational
More informationExtending causal inferences from a randomized trial to a target population
Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh
More informationFractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling
Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction
More informationStatistics 135 Fall 2008 Final Exam
Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations
More informationPropensity score modelling in observational studies using dimension reduction methods
University of Colorado, Denver From the SelectedWorks of Debashis Ghosh 2011 Propensity score modelling in observational studies using dimension reduction methods Debashis Ghosh, Penn State University
More informationOutline of GLMs. Definitions
Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density
More informationThe propensity score with continuous treatments
7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.
More informationMaster s Written Examination - Solution
Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2
More informationGuanglei Hong. University of Chicago. Presentation at the 2012 Atlantic Causal Inference Conference
Guanglei Hong University of Chicago Presentation at the 2012 Atlantic Causal Inference Conference 2 Philosophical Question: To be or not to be if treated ; or if untreated Statistics appreciates uncertainties
More informationPropensity Score Methods for Estimating Causal Effects from Complex Survey Data
Propensity Score Methods for Estimating Causal Effects from Complex Survey Data Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School
More informationBalancing Covariates via Propensity Score Weighting
Balancing Covariates via Propensity Score Weighting Kari Lock Morgan Department of Statistics Penn State University klm47@psu.edu Stochastic Modeling and Computational Statistics Seminar October 17, 2014
More informationParametric fractional imputation for missing data analysis
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in
More informationLocal Polynomial Wavelet Regression with Missing at Random
Applied Mathematical Sciences, Vol. 6, 2012, no. 57, 2805-2819 Local Polynomial Wavelet Regression with Missing at Random Alsaidi M. Altaher School of Mathematical Sciences Universiti Sains Malaysia 11800
More informationTopics and Papers for Spring 14 RIT
Eric Slud Feb. 3, 204 Topics and Papers for Spring 4 RIT The general topic of the RIT is inference for parameters of interest, such as population means or nonlinearregression coefficients, in the presence
More informationModel comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationLikelihood-based inference with missing data under missing-at-random
Likelihood-based inference with missing data under missing-at-random Jae-kwang Kim Joint work with Shu Yang Department of Statistics, Iowa State University May 4, 014 Outline 1. Introduction. Parametric
More informationAN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE
Statistica Sinica 24 (2014), 1097-1116 doi:http://dx.doi.org/10.5705/ss.2012.074 AN INSTRUMENTAL VARIABLE APPROACH FOR IDENTIFICATION AND ESTIMATION WITH NONIGNORABLE NONRESPONSE Sheng Wang 1, Jun Shao
More informationLecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions
Econ 513, USC, Department of Economics Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions I Introduction Here we look at a set of complications with the
More informationUNIVERSITÄT POTSDAM Institut für Mathematik
UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam
More informationA note on multiple imputation for general purpose estimation
A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume
More informationPropensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments
Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary reatments Shandong Zhao David A. van Dyk Kosuke Imai July 3, 2018 Abstract Propensity score methods are
More informationPropensity score adjusted method for missing data
Graduate Theses and Dissertations Graduate College 2013 Propensity score adjusted method for missing data Minsun Kim Riddles Iowa State University Follow this and additional works at: http://lib.dr.iastate.edu/etd
More informationTest of Association between Two Ordinal Variables while Adjusting for Covariates
Test of Association between Two Ordinal Variables while Adjusting for Covariates Chun Li, Bryan Shepherd Department of Biostatistics Vanderbilt University May 13, 2009 Examples Amblyopia http://www.medindia.net/
More informationMissing covariate data in matched case-control studies: Do the usual paradigms apply?
Missing covariate data in matched case-control studies: Do the usual paradigms apply? Bryan Langholz USC Department of Preventive Medicine Joint work with Mulugeta Gebregziabher Larry Goldstein Mark Huberman
More informationOUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores
OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS 776 1 15 Outcome regressions and propensity scores Outcome Regression and Propensity Scores ( 15) Outline 15.1 Outcome regression 15.2 Propensity
More informationTwo-phase sampling approach to fractional hot deck imputation
Two-phase sampling approach to fractional hot deck imputation Jongho Im 1, Jae-Kwang Kim 1 and Wayne A. Fuller 1 Abstract Hot deck imputation is popular for handling item nonresponse in survey sampling.
More informationAdvanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1
Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1 Gary King GaryKing.org April 13, 2014 1 c Copyright 2014 Gary King, All Rights Reserved. Gary King ()
More informationSemiparametric Regression Based on Multiple Sources
Semiparametric Regression Based on Multiple Sources Benjamin Kedem Department of Mathematics & Inst. for Systems Research University of Maryland, College Park Give me a place to stand and rest my lever
More informationCausal Sensitivity Analysis for Decision Trees
Causal Sensitivity Analysis for Decision Trees by Chengbo Li A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Computer
More informationHigh Dimensional Propensity Score Estimation via Covariate Balancing
High Dimensional Propensity Score Estimation via Covariate Balancing Kosuke Imai Princeton University Talk at Columbia University May 13, 2017 Joint work with Yang Ning and Sida Peng Kosuke Imai (Princeton)
More informationImplementing Matching Estimators for. Average Treatment Effects in STATA
Implementing Matching Estimators for Average Treatment Effects in STATA Guido W. Imbens - Harvard University West Coast Stata Users Group meeting, Los Angeles October 26th, 2007 General Motivation Estimation
More informationLinear models and their mathematical foundations: Simple linear regression
Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction
More informationIntroduction An approximated EM algorithm Simulation studies Discussion
1 / 33 An Approximated Expectation-Maximization Algorithm for Analysis of Data with Missing Values Gong Tang Department of Biostatistics, GSPH University of Pittsburgh NISS Workshop on Nonignorable Nonresponse
More informationChapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70
Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:
More informationOn the covariate-adjusted estimation for an overall treatment difference with data from a randomized comparative clinical trial
Biostatistics Advance Access published January 30, 2012 Biostatistics (2012), xx, xx, pp. 1 18 doi:10.1093/biostatistics/kxr050 On the covariate-adjusted estimation for an overall treatment difference
More informationProbabilistic Index Models
Probabilistic Index Models Jan De Neve Department of Data Analysis Ghent University M3 Storrs, Conneticut, USA May 23, 2017 Jan.DeNeve@UGent.be 1 / 37 Introduction 2 / 37 Introduction to Probabilistic
More informationIntroduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017
Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent
More informationSampling Techniques. Esra Akdeniz. February 9th, 2016
Sampling Techniques Esra Akdeniz February 9th, 2016 HOW TO DO RESEARCH? Question. Literature research. Hypothesis. Collect data. Analyze data. Interpret and present results. HOW TO DO RESEARCH? Collect
More informationBounds on Causal Effects in Three-Arm Trials with Non-compliance. Jing Cheng Dylan Small
Bounds on Causal Effects in Three-Arm Trials with Non-compliance Jing Cheng Dylan Small Department of Biostatistics and Department of Statistics University of Pennsylvania June 20, 2005 A Three-Arm Randomized
More informationLecture 5: Poisson and logistic regression
Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction
More informationRatio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects
Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects Guanglei Hong University of Chicago, 5736 S. Woodlawn Ave., Chicago, IL 60637 Abstract Decomposing a total causal
More informationCox s proportional hazards model and Cox s partial likelihood
Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.
More informationThe Effective Use of Complete Auxiliary Information From Survey Data
The Effective Use of Complete Auxiliary Information From Survey Data by Changbao Wu B.S., Anhui Laodong University, China, 1982 M.S. Diploma, East China Normal University, 1986 a thesis submitted in partial
More information