Outline. Flexible, optimal matching for observational studies. Optimal matching of two groups Comparing nuclear plants: an illustration

Size: px
Start display at page:

Download "Outline. Flexible, optimal matching for observational studies. Optimal matching of two groups Comparing nuclear plants: an illustration"

Transcription

1 Outle Flexible, optimal matchg for observational studies Optimal matchg of two groups Comparg nuclear plants: an illustration Ben Hansen University of Michigan Generalizations of pair matchg 15 June 2006 The R implementation Existg site Existg site Example: 1:2 matchg by a

2 Existg site Example: 1:2 matchg by a Existg site Example: 1:2 matchg by a Existg site Example: 1:2 matchg by a Existg site Example: 1:2 matchg by a

3 Existg site Example: 1:2 matchg by a Existg site Example: 1:2 matchg by a New and refurbished nuclear plants: discrepancies capacity and year of construction Exist- s g H I J K L M N O P Q R S T U V W X Y Z A B C D E F G Existg site Optimal vs. Greedy matchg By evaluatg potential matches all together rather than sequentially, optimal matchg (blue les) reduces the sum of distances from 82 to 63.

4 Introducg restrictions on who can be matched to whom: calipers In the nuclear plants example, suppose we choose to sist upon a caliper of three years the date of construction. This would forbid five potential matches, dicated below red. Introducg restrictions on who can be matched to whom: calipers With optmatch, matches are forbidden by placg s the distance matrix. Exist- s g H I J K L M N O P Q R S T U V W X Y Z A B C D E F G Exist- s g H I J K L M N O P Q R S T U V W X Y Z A Inf Inf B Inf C D E F 20 Inf 28 Inf G Outle Example # 2: Gender equity study for research scientists 1 Optimal matchg of two groups Comparg nuclear plants: an illustration Generalizations of pair matchg The R implementation and men scientists are to be matched on grant fundg. 1 Discussed Hansen and Klopfer (2005), Hansen (2004)

5 Full Matchg 2 the Gender Equity Sample Similar to matchg with replacement, but creates disjot matched sets better for tests & CIs. In contrast to pair matchg, it fds matches for everyone with a suitable counterpart. In contrast to multiple controls matchg, it doesn t force poor matches. In optmatch, can be combed with structural restrictions. 2 (Rosenbaum, 1991; Hansen and Klopfer, 2005) Full Matchg 2 the Gender Equity Sample Similar to matchg with replacement, but creates disjot matched sets better for tests & CIs. In contrast to pair matchg, it fds matches for everyone with a suitable counterpart. In contrast to multiple controls matchg, it doesn t force poor matches. In optmatch, can be combed with structural restrictions. 2 (Rosenbaum, 1991; Hansen and Klopfer, 2005) Full Matchg 2 the Gender Equity Sample Similar to matchg with replacement, but creates disjot matched sets better for tests & CIs. In contrast to pair matchg, it fds matches for everyone with a suitable counterpart. In contrast to multiple controls matchg, it doesn t force poor matches. In optmatch, can be combed with structural restrictions. 2 (Rosenbaum, 1991; Hansen and Klopfer, 2005) Full Matchg 2 the Gender Equity Sample Similar to matchg with replacement, but creates disjot matched sets better for tests & CIs. In contrast to pair matchg, it fds matches for everyone with a suitable counterpart. In contrast to multiple controls matchg, it doesn t force poor matches. In optmatch, can be combed with structural restrictions. 2 (Rosenbaum, 1991; Hansen and Klopfer, 2005)

6 Outle Optimal matchg of two groups Comparg nuclear plants: an illustration Generalizations of pair matchg Under the hood Full matchg via network flows 3 T C +U 0. 0 C +U [0,1] [0,1]. [0,1] [0,1] Sk The R implementation [0, U 1] C nu [0, Ũ 1] 0 Overflow Arguments to fullmatch() 3 (Hansen and Klopfer, 2005, Fig. 2). Time complexity of the algorithm is O(n 3 log(n max(dist))). Arguments to fullmatch() distance The argument demandg most attention from the user, b/c it defes good matches and because very large distance matrices can tax R s memory limits. A new helper function, makedist(), eases both of these efforts. m.controls, max.controls In propensity matchg, can be important for efficiency see Hansen (2004), Augursky and Kluve (2004). omit.fraction for use matched samplg (as opposed to matchg all or most of a sample). Not needed for gettg rid of subjects without suitable potential matches. If you re not specifically out to reduce the size of the control group, can be ignored. distance The argument demandg most attention from the user, b/c it defes good matches and because very large distance matrices can tax R s memory limits. A new helper function, makedist(), eases both of these efforts. m.controls, max.controls In propensity matchg, can be important for efficiency see Hansen (2004), Augursky and Kluve (2004). omit.fraction for use matched samplg (as opposed to matchg all or most of a sample). Not needed for gettg rid of subjects without suitable potential matches. If you re not specifically out to reduce the size of the control group, can be ignored.

7 Arguments to fullmatch() Summary distance The argument demandg most attention from the user, b/c it defes good matches and because very large distance matrices can tax R s memory limits. A new helper function, makedist(), eases both of these efforts. m.controls, max.controls In propensity matchg, can be important for efficiency see Hansen (2004), Augursky and Kluve (2004). omit.fraction for use matched samplg (as opposed to matchg all or most of a sample). Not needed for gettg rid of subjects without suitable potential matches. If you re not specifically out to reduce the size of the control group, can be ignored. With optmatch, R offers the most comprehensive optimal matchg implementation for statistics. fullmatch() solves optimally such traditional problems as matched samplg, pair matchg, and matchg with k controls. fullmatch() can also solve matchg problems flexibly, and far more effectively, by way of full matchg, with or without structural restrictions (Hansen and Klopfer, 2005; Hansen, 2004). The effort required to code optimal and full matchg algorithms seems to have dissuaded their widespread use. Now that I ve made that effort, perhaps this situation can change! :) Summary Summary With optmatch, R offers the most comprehensive optimal matchg implementation for statistics. fullmatch() solves optimally such traditional problems as matched samplg, pair matchg, and matchg with k controls. fullmatch() can also solve matchg problems flexibly, and far more effectively, by way of full matchg, with or without structural restrictions (Hansen and Klopfer, 2005; Hansen, 2004). The effort required to code optimal and full matchg algorithms seems to have dissuaded their widespread use. Now that I ve made that effort, perhaps this situation can change! :) With optmatch, R offers the most comprehensive optimal matchg implementation for statistics. fullmatch() solves optimally such traditional problems as matched samplg, pair matchg, and matchg with k controls. fullmatch() can also solve matchg problems flexibly, and far more effectively, by way of full matchg, with or without structural restrictions (Hansen and Klopfer, 2005; Hansen, 2004). The effort required to code optimal and full matchg algorithms seems to have dissuaded their widespread use. Now that I ve made that effort, perhaps this situation can change! :)

8 Summary With optmatch, R offers the most comprehensive optimal matchg implementation for statistics. fullmatch() solves optimally such traditional problems as matched samplg, pair matchg, and matchg with k controls. fullmatch() can also solve matchg problems flexibly, and far more effectively, by way of full matchg, with or without structural restrictions (Hansen and Klopfer, 2005; Hansen, 2004). The effort required to code optimal and full matchg algorithms seems to have dissuaded their widespread use. Now that I ve made that effort, perhaps this situation can change! :) Example with propensity scores and stratification prior to matchg >nuclear$pscore <- glm(pr~.-cost, + family=bomial(),data=nuclear)$lear.predictors > pscorediffs <- function(trtvar,data) { + pscr <- data[names(trtvar), pscore ] + abs(outer(pscr[trtvar],pscr[!trtvar], - )) + } > psd2 <- makedist(pr~pt, nuclear, pscorediffs) > fullmatch(psd2) > fullmatch(psd2, m.controls=1, max.controls=3) > fullmatch(psd2, m=1, max=c( 0 =3, 1 =2)) Jake Bowers and my RItools package provides diagnostics... Example with propensity scores and stratification prior to matchg >nuclear$pscore <- glm(pr~.-cost, + family=bomial(),data=nuclear)$lear.predictors > pscorediffs <- function(trtvar,data) { + pscr <- data[names(trtvar), pscore ] + abs(outer(pscr[trtvar],pscr[!trtvar], - )) + } > psd2 <- makedist(pr~pt, nuclear, pscorediffs) > fullmatch(psd2) > fullmatch(psd2, m.controls=1, max.controls=3) > fullmatch(psd2, m=1, max=c( 0 =3, 1 =2)) Jake Bowers and my RItools package provides diagnostics... Example with propensity scores and stratification prior to matchg >nuclear$pscore <- glm(pr~.-cost, + family=bomial(),data=nuclear)$lear.predictors > pscorediffs <- function(trtvar,data) { + pscr <- data[names(trtvar), pscore ] + abs(outer(pscr[trtvar],pscr[!trtvar], - )) + } > psd2 <- makedist(pr~pt, nuclear, pscorediffs) > fullmatch(psd2) > fullmatch(psd2, m.controls=1, max.controls=3) > fullmatch(psd2, m=1, max=c( 0 =3, 1 =2)) Jake Bowers and my RItools package provides diagnostics...

9 Example with propensity scores and stratification prior to matchg >nuclear$pscore <- glm(pr~.-cost, + family=bomial(),data=nuclear)$lear.predictors > pscorediffs <- function(trtvar,data) { + pscr <- data[names(trtvar), pscore ] + abs(outer(pscr[trtvar],pscr[!trtvar], - )) + } > psd2 <- makedist(pr~pt, nuclear, pscorediffs) > fullmatch(psd2) > fullmatch(psd2, m.controls=1, max.controls=3) > fullmatch(psd2, m=1, max=c( 0 =3, 1 =2)) Jake Bowers and my RItools package provides diagnostics... Agresti, A. (2002), Categorical data analysis, John Wiley & Sons. Augursky, B. and Kluve, J. (2004), Assessg the performance of matchg algorithms when selection to treatment is strong, Tech. rep., RWI-Essen. Cox, D. R. and Snell, E. J. (1989), Analysis of bary data, Chapman & Hall Ltd. Hansen, B. B. (2004), Full matchg an observational study of coachg for the SAT, Journal of the American Statistical Association, 99, (2005), OPTMATCH, an add-on package for R. Hansen, B. B. and Klopfer, S. O. (2005), Optimal full matchg and related designs via network flows, Tech. Rep. 416, Statistics Department, University of Michigan. Raudenbush, S. W. and Bryk, A. S. (2002), Hierarchical Lear Models: Applications and Data Analysis Methods, Sage Publications Inc. Rosenbaum, P. R. (1991), A Characterization of Optimal Designs for Observational Studies, Journal of the Royal Statistical Society, 53, (2002a), Attributg effects to treatment matched observational studies, Journal of the American Statistical Association, 97, (2002b), Covariance adjustment randomized experiments and observational studies, Statistical Science, 17, (2002c), Observational Studies, Sprger-Verlag, 2nd ed. Rub, D. B. (1979), Usg Multivariate Matched Samplg and Regression Adjustment to Control Bias Observational Studies, Journal of the American Statistical Association, 74, Smith, H. (1997), Matchg with Multiple Controls to Estimate Treatment Effects Observational Studies, Sociological Methodology, 27, Modes of estimation for treatment effects Preferred Type of outcome mode of ference Categorical Contuous Randomization Agresti (2002), Rosenbaum (2002c), Categorical Data Observational Studies; Analysis; Rosenbaum Rosenbaum (2002b), Covariance (2002a), Atributg adjustment.... effects to treatment... Conditional a Agresti (2002); Cox and ordary OLS b is fe; see Snell (1989), Analysis of also Rub (1979), Usg bary data multivariate matched.... Bayes, esp. Agresti (2002) Smith (1997), Matchg hierarchical with multiple controls... ; lear models Raudenbush and Bryk c (2002), Hierarchical lear models a Uses a fixed effect for each matched set. b i.e., OLS with a fixed effect for each matched set plus treatment effect(s) c Uses a random effect for each matched set.

Flexible, Optimal Matching for Comparative Studies Using the optmatch package

Flexible, Optimal Matching for Comparative Studies Using the optmatch package Flexible, Optimal Matching for Comparative Studies Using the optmatch package Ben Hansen Statistics Department University of Michigan ben.b.hansen@umich.edu http://www.stat.lsa.umich.edu/~bbh 9 August

More information

Optimal full matching and related designs via network flows

Optimal full matching and related designs via network flows Optimal full matching and related designs via network flows Ben B. Hansen and Stephanie Olsen Klopfer submitted January 2005; revised October 2005 Abstract In the matched analysis of an observational study,

More information

Mixed- Model Analysis of Variance. Sohad Murrar & Markus Brauer. University of Wisconsin- Madison. Target Word Count: Actual Word Count: 2755

Mixed- Model Analysis of Variance. Sohad Murrar & Markus Brauer. University of Wisconsin- Madison. Target Word Count: Actual Word Count: 2755 Mixed- Model Analysis of Variance Sohad Murrar & Markus Brauer University of Wisconsin- Madison The SAGE Encyclopedia of Educational Research, Measurement and Evaluation Target Word Count: 3000 - Actual

More information

Hypothesis Testing for Var-Cov Components

Hypothesis Testing for Var-Cov Components Hypothesis Testing for Var-Cov Components When the specification of coefficients as fixed, random or non-randomly varying is considered, a null hypothesis of the form is considered, where Additional output

More information

GLM models and OLS regression

GLM models and OLS regression GLM models and OLS regression Graeme Hutcheson, University of Manchester These lecture notes are based on material published in... Hutcheson, G. D. and Sofroniou, N. (1999). The Multivariate Social Scientist:

More information

Studies. Frank E Harrell Jr. NIH AHRQ Methodologic Challenges in CER 2 December 2010

Studies. Frank E Harrell Jr. NIH AHRQ Methodologic Challenges in CER 2 December 2010 Frank E Harrell Jr Department of Biostatistics Vanderbilt University School of Medice NIH AHRQ Methodologic Challenges CER 2 December 2010 Outle 1 2 3 4 What characterizes situations when causal ference

More information

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved

More information

CRISP: Capture-Recapture Interactive Simulation Package

CRISP: Capture-Recapture Interactive Simulation Package CRISP: Capture-Recapture Interactive Simulation Package George Volichenko Carnegie Mellon University Pittsburgh, PA gvoliche@andrew.cmu.edu December 17, 2012 Contents 1 Executive Summary 1 2 Introduction

More information

LOGISTICS REGRESSION FOR SAMPLE SURVEYS

LOGISTICS REGRESSION FOR SAMPLE SURVEYS 4 LOGISTICS REGRESSION FOR SAMPLE SURVEYS Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-002 4. INTRODUCTION Researchers use sample survey methodology to obtain information

More information

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal

More information

Statistical Models for Management. Instituto Superior de Ciências do Trabalho e da Empresa (ISCTE) Lisbon. February 24 26, 2010

Statistical Models for Management. Instituto Superior de Ciências do Trabalho e da Empresa (ISCTE) Lisbon. February 24 26, 2010 Statistical Models for Management Instituto Superior de Ciências do Trabalho e da Empresa (ISCTE) Lisbon February 24 26, 2010 Graeme Hutcheson, University of Manchester GLM models and OLS regression The

More information

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai Chapter 1 Propensity Score Analysis Concepts and Issues Wei Pan Haiyan Bai Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity score analysis, research using propensity score analysis

More information

Authors and Affiliations: Nianbo Dong University of Missouri 14 Hill Hall, Columbia, MO Phone: (573)

Authors and Affiliations: Nianbo Dong University of Missouri 14 Hill Hall, Columbia, MO Phone: (573) Prognostic Propensity Scores: A Method Accounting for the Correlations of the Covariates with Both the Treatment and the Outcome Variables in Matching and Diagnostics Authors and Affiliations: Nianbo Dong

More information

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14 STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University Frequency 0 2 4 6 8 Quiz 2 Histogram of Quiz2 10 12 14 16 18 20 Quiz2

More information

Exploiting TIMSS and PIRLS combined data: multivariate multilevel modelling of student achievement

Exploiting TIMSS and PIRLS combined data: multivariate multilevel modelling of student achievement Exploiting TIMSS and PIRLS combined data: multivariate multilevel modelling of student achievement Second meeting of the FIRB 2012 project Mixture and latent variable models for causal-inference and analysis

More information

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling

More information

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS 776 1 15 Outcome regressions and propensity scores Outcome Regression and Propensity Scores ( 15) Outline 15.1 Outcome regression 15.2 Propensity

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

(Mis)use of matching techniques

(Mis)use of matching techniques University of Warsaw 5th Polish Stata Users Meeting, Warsaw, 27th November 2017 Research financed under National Science Center, Poland grant 2015/19/B/HS4/03231 Outline Introduction and motivation 1 Introduction

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs

The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs Chad S. Briggs, Kathie Lorentz & Eric Davis Education & Outreach University Housing Southern Illinois

More information

Supplemental material to accompany Preacher and Hayes (2008)

Supplemental material to accompany Preacher and Hayes (2008) Supplemental material to accompany Preacher and Hayes (2008) Kristopher J. Preacher University of Kansas Andrew F. Hayes The Ohio State University The multivariate delta method for deriving the asymptotic

More information

COMBINING ROBUST VARIANCE ESTIMATION WITH MODELS FOR DEPENDENT EFFECT SIZES

COMBINING ROBUST VARIANCE ESTIMATION WITH MODELS FOR DEPENDENT EFFECT SIZES COMBINING ROBUST VARIANCE ESTIMATION WITH MODELS FOR DEPENDENT EFFECT SIZES James E. Pustejovsky, UT Austin j o i n t wo r k w i t h : Beth Tipton, Nor t h we s t e r n Unive r s i t y Ariel Aloe, Unive

More information

Massive Experiments and Observational Studies: A Linearithmic Algorithm for Blocking/Matching/Clustering

Massive Experiments and Observational Studies: A Linearithmic Algorithm for Blocking/Matching/Clustering Massive Experiments and Observational Studies: A Linearithmic Algorithm for Blocking/Matching/Clustering Jasjeet S. Sekhon UC Berkeley June 21, 2016 Jasjeet S. Sekhon (UC Berkeley) Methods for Massive

More information

Introduction to Propensity Score Matching: A Review and Illustration

Introduction to Propensity Score Matching: A Review and Illustration Introduction to Propensity Score Matching: A Review and Illustration Shenyang Guo, Ph.D. School of Social Work University of North Carolina at Chapel Hill January 28, 2005 For Workshop Conducted at the

More information

PROPENSITY SCORE MATCHING. Walter Leite

PROPENSITY SCORE MATCHING. Walter Leite PROPENSITY SCORE MATCHING Walter Leite 1 EXAMPLE Question: Does having a job that provides or subsidizes child care increate the length that working mothers breastfeed their children? Treatment: Working

More information

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure).

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure). 1 Neuendorf Logistic Regression The Model: Y Assumptions: 1. Metric (interval/ratio) data for 2+ IVs, and dichotomous (binomial; 2-value), categorical/nominal data for a single DV... bear in mind that

More information

Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) 2010 R.C. Gardner, Ph.d.

Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) 2010 R.C. Gardner, Ph.d. Research Design: Topic 8 Hierarchical Linear Modeling (Measures within Persons) R.C. Gardner, Ph.d. General Rationale, Purpose, and Applications Linear Growth Models HLM can also be used with repeated

More information

Hierarchical Linear Models. Jeff Gill. University of Florida

Hierarchical Linear Models. Jeff Gill. University of Florida Hierarchical Linear Models Jeff Gill University of Florida I. ESSENTIAL DESCRIPTION OF HIERARCHICAL LINEAR MODELS II. SPECIAL CASES OF THE HLM III. THE GENERAL STRUCTURE OF THE HLM IV. ESTIMATION OF THE

More information

Gov 2002: 5. Matching

Gov 2002: 5. Matching Gov 2002: 5. Matching Matthew Blackwell October 1, 2015 Where are we? Where are we going? Discussed randomized experiments, started talking about observational data. Last week: no unmeasured confounders

More information

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions

More information

CENTERING IN MULTILEVEL MODELS. Consider the situation in which we have m groups of individuals, where

CENTERING IN MULTILEVEL MODELS. Consider the situation in which we have m groups of individuals, where CENTERING IN MULTILEVEL MODELS JAN DE LEEUW ABSTRACT. This is an entry for The Encyclopedia of Statistics in Behavioral Science, to be published by Wiley in 2005. Consider the situation in which we have

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

More information

Plausible Values for Latent Variables Using Mplus

Plausible Values for Latent Variables Using Mplus Plausible Values for Latent Variables Using Mplus Tihomir Asparouhov and Bengt Muthén August 21, 2010 1 1 Introduction Plausible values are imputed values for latent variables. All latent variables can

More information

2005 ICPSR SUMMER PROGRAM REGRESSION ANALYSIS III: ADVANCED METHODS

2005 ICPSR SUMMER PROGRAM REGRESSION ANALYSIS III: ADVANCED METHODS 2005 ICPSR SUMMER PROGRAM REGRESSION ANALYSIS III: ADVANCED METHODS William G. Jacoby Department of Political Science Michigan State University June 27 July 22, 2005 This course will take a modern, data-analytic

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

LINEAR MULTILEVEL MODELS. Data are often hierarchical. By this we mean that data contain information

LINEAR MULTILEVEL MODELS. Data are often hierarchical. By this we mean that data contain information LINEAR MULTILEVEL MODELS JAN DE LEEUW ABSTRACT. This is an entry for The Encyclopedia of Statistics in Behavioral Science, to be published by Wiley in 2005. 1. HIERARCHICAL DATA Data are often hierarchical.

More information

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University

More information

The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates

The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates Eun Sook Kim, Patricia Rodríguez de Gil, Jeffrey D. Kromrey, Rheta E. Lanehart, Aarti Bellara,

More information

Advanced Quantitative Data Analysis

Advanced Quantitative Data Analysis Chapter 24 Advanced Quantitative Data Analysis Daniel Muijs Doing Regression Analysis in SPSS When we want to do regression analysis in SPSS, we have to go through the following steps: 1 As usual, we choose

More information

PACKAGE LMest FOR LATENT MARKOV ANALYSIS

PACKAGE LMest FOR LATENT MARKOV ANALYSIS PACKAGE LMest FOR LATENT MARKOV ANALYSIS OF LONGITUDINAL CATEGORICAL DATA Francesco Bartolucci 1, Silvia Pandofi 1, and Fulvia Pennoni 2 1 Department of Economics, University of Perugia (e-mail: francesco.bartolucci@unipg.it,

More information

University of Michigan School of Public Health

University of Michigan School of Public Health University of Michigan School of Public Health The University of Michigan Department of Biostatistics Working Paper Series Year 003 Paper Weighting Adustments for Unit Nonresponse with Multiple Outcome

More information

ECON 5350 Class Notes Functional Form and Structural Change

ECON 5350 Class Notes Functional Form and Structural Change ECON 5350 Class Notes Functional Form and Structural Change 1 Introduction Although OLS is considered a linear estimator, it does not mean that the relationship between Y and X needs to be linear. In this

More information

Propensity Score Matching

Propensity Score Matching Methods James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 Methods 1 Introduction 2 3 4 Introduction Why Match? 5 Definition Methods and In

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

Using the Cross-Match Test to Appraise Covariate Balance in Matched Pairs

Using the Cross-Match Test to Appraise Covariate Balance in Matched Pairs Using the Cross-Match Test to Appraise Covariate Balance in Matched Pairs Ruth HELLER,PaulR.ROSENBAUM,andDylanS.SMALL Having created a tentative matched design for an observational study, diagnostic checks

More information

Finite Population Sampling and Inference

Finite Population Sampling and Inference Finite Population Sampling and Inference A Prediction Approach RICHARD VALLIANT ALAN H. DORFMAN RICHARD M. ROYALL A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane

More information

Investigating Models with Two or Three Categories

Investigating Models with Two or Three Categories Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

49th European Organization for Quality Congress. Topic: Quality Improvement. Service Reliability in Electrical Distribution Networks

49th European Organization for Quality Congress. Topic: Quality Improvement. Service Reliability in Electrical Distribution Networks 49th European Organization for Quality Congress Topic: Quality Improvement Service Reliability in Electrical Distribution Networks José Mendonça Dias, Rogério Puga Leal and Zulema Lopes Pereira Department

More information

Matching Methods for Observational Microarray Studies

Matching Methods for Observational Microarray Studies Bioinformatics Advance Access published December 19, 2008 Matching Methods for Observational Microarray Studies Ruth Heller 1,, Elisabetta Manduchi 2 and Dylan Small 1 1 Department of Statistics, Wharton

More information

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018 Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018 Schedule and outline 1:00 Introduction and overview 1:15 Quasi-experimental vs. experimental designs

More information

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. The Basic Methodology 2. How Should We View Uncertainty in DD Settings?

More information

Information Theoretic Standardized Logistic Regression Coefficients with Various Coefficients of Determination

Information Theoretic Standardized Logistic Regression Coefficients with Various Coefficients of Determination The Korean Communications in Statistics Vol. 13 No. 1, 2006, pp. 49-60 Information Theoretic Standardized Logistic Regression Coefficients with Various Coefficients of Determination Chong Sun Hong 1) and

More information

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017 Econometrics with Observational Data Introduction and Identification Todd Wagner February 1, 2017 Goals for Course To enable researchers to conduct careful quantitative analyses with existing VA (and non-va)

More information

CompSci Understanding Data: Theory and Applications

CompSci Understanding Data: Theory and Applications CompSci 590.6 Understanding Data: Theory and Applications Lecture 17 Causality in Statistics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu Fall 2015 1 Today s Reading Rubin Journal of the American

More information

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL Intesar N. El-Saeiti Department of Statistics, Faculty of Science, University of Bengahzi-Libya. entesar.el-saeiti@uob.edu.ly

More information

Longitudinal Data Analysis. Michael L. Berbaum Institute for Health Research and Policy University of Illinois at Chicago

Longitudinal Data Analysis. Michael L. Berbaum Institute for Health Research and Policy University of Illinois at Chicago Longitudinal Data Analysis Michael L. Berbaum Institute for Health Research and Policy University of Illinois at Chicago Course description: Longitudinal analysis is the study of short series of observations

More information

Strati cation in Multivariate Modeling

Strati cation in Multivariate Modeling Strati cation in Multivariate Modeling Tihomir Asparouhov Muthen & Muthen Mplus Web Notes: No. 9 Version 2, December 16, 2004 1 The author is thankful to Bengt Muthen for his guidance, to Linda Muthen

More information

ECNS 561 Topics in Multiple Regression Analysis

ECNS 561 Topics in Multiple Regression Analysis ECNS 561 Topics in Multiple Regression Analysis Scaling Data For the simple regression case, we already discussed the effects of changing the units of measurement Nothing different here Coefficients, SEs,

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

A COEFFICIENT OF DETERMINATION FOR LOGISTIC REGRESSION MODELS

A COEFFICIENT OF DETERMINATION FOR LOGISTIC REGRESSION MODELS A COEFFICIENT OF DETEMINATION FO LOGISTIC EGESSION MODELS ENATO MICELI UNIVESITY OF TOINO After a brief presentation of the main extensions of the classical coefficient of determination ( ), a new index

More information

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università

More information

Propensity score modelling in observational studies using dimension reduction methods

Propensity score modelling in observational studies using dimension reduction methods University of Colorado, Denver From the SelectedWorks of Debashis Ghosh 2011 Propensity score modelling in observational studies using dimension reduction methods Debashis Ghosh, Penn State University

More information

Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models

Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models Sonderforschungsbereich 386, Paper 196 (2000) Online

More information

Correspondence Analysis of Longitudinal Data

Correspondence Analysis of Longitudinal Data Correspondence Analysis of Longitudinal Data Mark de Rooij* LEIDEN UNIVERSITY, LEIDEN, NETHERLANDS Peter van der G. M. Heijden UTRECHT UNIVERSITY, UTRECHT, NETHERLANDS *Corresponding author (rooijm@fsw.leidenuniv.nl)

More information

Assessing the relation between language comprehension and performance in general chemistry. Appendices

Assessing the relation between language comprehension and performance in general chemistry. Appendices Assessing the relation between language comprehension and performance in general chemistry Daniel T. Pyburn a, Samuel Pazicni* a, Victor A. Benassi b, and Elizabeth E. Tappin c a Department of Chemistry,

More information

STAT 5500/6500 Conditional Logistic Regression for Matched Pairs

STAT 5500/6500 Conditional Logistic Regression for Matched Pairs STAT 5500/6500 Conditional Logistic Regression for Matched Pairs The data for the tutorial came from support.sas.com, The LOGISTIC Procedure: Conditional Logistic Regression for Matched Pairs Data :: SAS/STAT(R)

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Classification: Linear Discriminant Analysis

Classification: Linear Discriminant Analysis Classification: Linear Discriminant Analysis Discriminant analysis uses sample information about individuals that are known to belong to one of several populations for the purposes of classification. Based

More information

1 Introduction. 2 Example

1 Introduction. 2 Example Statistics: Multilevel modelling Richard Buxton. 2008. Introduction Multilevel modelling is an approach that can be used to handle clustered or grouped data. Suppose we are trying to discover some of the

More information

Summary and discussion of The central role of the propensity score in observational studies for causal effects

Summary and discussion of The central role of the propensity score in observational studies for causal effects Summary and discussion of The central role of the propensity score in observational studies for causal effects Statistics Journal Club, 36-825 Jessica Chemali and Michael Vespe 1 Summary 1.1 Background

More information

Modelling Academic Risks of Students in a Polytechnic System With the Use of Discriminant Analysis

Modelling Academic Risks of Students in a Polytechnic System With the Use of Discriminant Analysis Progress in Applied Mathematics Vol. 6, No., 03, pp. [59-69] DOI: 0.3968/j.pam.955803060.738 ISSN 95-5X [Print] ISSN 95-58 [Online] www.cscanada.net www.cscanada.org Modelling Academic Risks of Students

More information

Longitudinal Analysis. Michael L. Berbaum Institute for Health Research and Policy University of Illinois at Chicago

Longitudinal Analysis. Michael L. Berbaum Institute for Health Research and Policy University of Illinois at Chicago Longitudinal Analysis Michael L. Berbaum Institute for Health Research and Policy University of Illinois at Chicago Course description: Longitudinal analysis is the study of short series of observations

More information

Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded

Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded 1 Background Latent confounder is common in social and behavioral science in which most of cases the selection mechanism

More information

Lab 4, modified 2/25/11; see also Rogosa R-session

Lab 4, modified 2/25/11; see also Rogosa R-session Lab 4, modified 2/25/11; see also Rogosa R-session Stat 209 Lab: Matched Sets in R Lab prepared by Karen Kapur. 1 Motivation 1. Suppose we are trying to measure the effect of a treatment variable on the

More information

Shrinkage Methods: Ridge and Lasso

Shrinkage Methods: Ridge and Lasso Shrinkage Methods: Ridge and Lasso Jonathan Hersh 1 Chapman University, Argyros School of Business hersh@chapman.edu February 27, 2019 J.Hersh (Chapman) Ridge & Lasso February 27, 2019 1 / 43 1 Intro and

More information

Regression Analysis: Exploring relationships between variables. Stat 251

Regression Analysis: Exploring relationships between variables. Stat 251 Regression Analysis: Exploring relationships between variables Stat 251 Introduction Objective of regression analysis is to explore the relationship between two (or more) variables so that information

More information

Consistent Bivariate Distribution

Consistent Bivariate Distribution A Characterization of the Normal Conditional Distributions MATSUNO 79 Therefore, the function ( ) = G( : a/(1 b2)) = N(0, a/(1 b2)) is a solu- tion for the integral equation (10). The constant times of

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

Attributing Effects to A Cluster Randomized Get-Out-The-Vote Campaign

Attributing Effects to A Cluster Randomized Get-Out-The-Vote Campaign Attributing Effects to A Cluster Randomized Get-Out-The-Vote Campaign Technical Report #448, Statistics Dept., University of Michigan Jake Bowers and Ben B. Hansen 1 October 16, 2006 Abstract In a landmark

More information

ONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION

ONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION ONE MORE TIME ABOUT R 2 MEASURES OF FIT IN LOGISTIC REGRESSION Ernest S. Shtatland, Ken Kleinman, Emily M. Cain Harvard Medical School, Harvard Pilgrim Health Care, Boston, MA ABSTRACT In logistic regression,

More information

LCA Distal Stata function users guide (Version 1.0)

LCA Distal Stata function users guide (Version 1.0) LCA Distal Stata function users guide (Version 1.0) Liying Huang John J. Dziak Bethany C. Bray Aaron T. Wagner Stephanie T. Lanza Penn State Copyright 2016, Penn State. All rights reserved. Please send

More information

The Method of Auxiliary Sources for the Thin Films Superimposed on the Dielectric Surface

The Method of Auxiliary Sources for the Thin Films Superimposed on the Dielectric Surface JAE, VOL. 7, NO., 05 JOURNAL OF APPLIED ELECTROMAGNETISM The Method of Auxiliary Sources for the Th Films Superimposed on the Dielectric Surface I. M. Petoev, V. A. Tabatadze, R. S. Zaridze Tbilisi State

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

A Practitioner s Guide to Generalized Linear Models

A Practitioner s Guide to Generalized Linear Models A Practitioners Guide to Generalized Linear Models Background The classical linear models and most of the minimum bias procedures are special cases of generalized linear models (GLMs). GLMs are more technically

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

R-squared for Bayesian regression models

R-squared for Bayesian regression models R-squared for Bayesian regression models Andrew Gelman Ben Goodrich Jonah Gabry Imad Ali 8 Nov 2017 Abstract The usual definition of R 2 (variance of the predicted values divided by the variance of the

More information

Table B1. Full Sample Results OLS/Probit

Table B1. Full Sample Results OLS/Probit Table B1. Full Sample Results OLS/Probit School Propensity Score Fixed Effects Matching (1) (2) (3) (4) I. BMI: Levels School 0.351* 0.196* 0.180* 0.392* Breakfast (0.088) (0.054) (0.060) (0.119) School

More information

Dynamics in Social Networks and Causality

Dynamics in Social Networks and Causality Web Science & Technologies University of Koblenz Landau, Germany Dynamics in Social Networks and Causality JProf. Dr. University Koblenz Landau GESIS Leibniz Institute for the Social Sciences Last Time:

More information

Variable Ratio Matching with Fine Balance in a Study of the Peer Health Exchange

Variable Ratio Matching with Fine Balance in a Study of the Peer Health Exchange This is the peer reviewed version of the following article: Pimentel, S.D., Yoon, F., and Keele, L. (2015). "Variable-ratio matching with fine balance in a study of the Peer Health Exchange." Statistics

More information

Linear Models 1. Isfahan University of Technology Fall Semester, 2014

Linear Models 1. Isfahan University of Technology Fall Semester, 2014 Linear Models 1 Isfahan University of Technology Fall Semester, 2014 References: [1] G. A. F., Seber and A. J. Lee (2003). Linear Regression Analysis (2nd ed.). Hoboken, NJ: Wiley. [2] A. C. Rencher and

More information

Interpreting and using heterogeneous choice & generalized ordered logit models

Interpreting and using heterogeneous choice & generalized ordered logit models Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2

More information

Comparison of Estimators in GLM with Binary Data

Comparison of Estimators in GLM with Binary Data Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 10 11-2014 Comparison of Estimators in GLM with Binary Data D. M. Sakate Shivaji University, Kolhapur, India, dms.stats@gmail.com

More information

Multilevel Regression Mixture Analysis

Multilevel Regression Mixture Analysis Multilevel Regression Mixture Analysis Bengt Muthén and Tihomir Asparouhov Forthcoming in Journal of the Royal Statistical Society, Series A October 3, 2008 1 Abstract A two-level regression mixture model

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

AFFINELY INVARIANT MATCHING METHODS WITH DISCRIMINANT MIXTURES OF PROPORTIONAL ELLIPSOIDALLY SYMMETRIC DISTRIBUTIONS

AFFINELY INVARIANT MATCHING METHODS WITH DISCRIMINANT MIXTURES OF PROPORTIONAL ELLIPSOIDALLY SYMMETRIC DISTRIBUTIONS Submitted to the Annals of Statistics AFFINELY INVARIANT MATCHING METHODS WITH DISCRIMINANT MIXTURES OF PROPORTIONAL ELLIPSOIDALLY SYMMETRIC DISTRIBUTIONS By Donald B. Rubin and Elizabeth A. Stuart Harvard

More information

COMPOSITIONAL IDEAS IN THE BAYESIAN ANALYSIS OF CATEGORICAL DATA WITH APPLICATION TO DOSE FINDING CLINICAL TRIALS

COMPOSITIONAL IDEAS IN THE BAYESIAN ANALYSIS OF CATEGORICAL DATA WITH APPLICATION TO DOSE FINDING CLINICAL TRIALS COMPOSITIONAL IDEAS IN THE BAYESIAN ANALYSIS OF CATEGORICAL DATA WITH APPLICATION TO DOSE FINDING CLINICAL TRIALS M. Gasparini and J. Eisele 2 Politecnico di Torino, Torino, Italy; mauro.gasparini@polito.it

More information