Authors and Affiliations: Nianbo Dong University of Missouri 14 Hill Hall, Columbia, MO Phone: (573)

Similar documents
The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates

Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded

Propensity Score Methods for Causal Inference

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14

An Introduction to Causal Analysis on Observational Data using Propensity Scores

University of Michigan School of Public Health

On the Use of the Bross Formula for Prioritizing Covariates in the High-Dimensional Propensity Score Algorithm

DATA-ADAPTIVE VARIABLE SELECTION FOR

Propensity Score Weighting with Multilevel Data

Estimating the Marginal Odds Ratio in Observational Studies

arxiv: v1 [stat.me] 15 May 2011

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

Tests for the Odds Ratio in a Matched Case-Control Design with a Quantitative X

Propensity Score Analysis with Hierarchical Data

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai

Propensity Score Matching

What s New in Econometrics. Lecture 1

PEARL VS RUBIN (GELMAN)

Flexible, Optimal Matching for Comparative Studies Using the optmatch package

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Targeted Maximum Likelihood Estimation in Safety Analysis

Measuring Social Influence Without Bias

Summary and discussion of The central role of the propensity score in observational studies for causal effects

Analysis of propensity score approaches in difference-in-differences designs

A Note on Bayesian Inference After Multiple Imputation

Covariate Balancing Propensity Score for General Treatment Regimes

Matching for Causal Inference Without Balance Checking

Variable selection and machine learning methods in causal inference

Vector-Based Kernel Weighting: A Simple Estimator for Improving Precision and Bias of Average Treatment Effects in Multiple Treatment Settings

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018

The propensity score with continuous treatments

Propensity Score Matching and Genetic Matching : Monte Carlo Results

Confidence Intervals for the Interaction Odds Ratio in Logistic Regression with Two Binary X s

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

Statistical Practice

PROPENSITY SCORE MATCHING. Walter Leite

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Propensity Score Analysis Using teffects in Stata. SOC 561 Programming for the Social Sciences Hyungjun Suh Apr

Causal Inference Basics

Balancing Covariates via Propensity Score Weighting

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Calculating Effect-Sizes. David B. Wilson, PhD George Mason University

Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies.

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Section 9c. Propensity scores. Controlling for bias & confounding in observational studies

Gov 2002: 5. Matching

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study

Using Estimating Equations for Spatially Correlated A

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Discussion of Papers on the Extensions of Propensity Score

Statistics in medicine

New Developments in Nonresponse Adjustment Methods

Propensity Score for Causal Inference of Multiple and Multivalued Treatments

APPENDIX 1 BASIC STATISTICS. Summarizing Data

SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS

University of Michigan School of Public Health

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

JOINTLY MODELING CONTINUOUS AND BINARY OUTCOMES FOR BOOLEAN OUTCOMES: AN APPLICATION TO MODELING HYPERTENSION

Indirect estimation of a simultaneous limited dependent variable model for patient costs and outcome

Combining multiple observational data sources to estimate causal eects

Strategy of Bayesian Propensity. Score Estimation Approach. in Observational Study

NISS. Technical Report Number 167 June 2007

Causal Analysis in Social Research

Memorial Sloan-Kettering Cancer Center

Passing-Bablok Regression for Method Comparison

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments

Data Integration for Big Data Analysis for finite population inference

Tests for the Odds Ratio of Two Proportions in a 2x2 Cross-Over Design

Propensity Score Methods, Models and Adjustment

Dynamics in Social Networks and Causality

Incorporating published univariable associations in diagnostic and prognostic modeling

A Sampling of IMPACT Research:

In Praise of the Listwise-Deletion Method (Perhaps with Reweighting)

Journal of Biostatistics and Epidemiology

Sensitivity analysis and distributional assumptions

Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s

An Empirical Comparison of Multiple Imputation Approaches for Treating Missing Data in Observational Studies

Causal modelling in Medical Research

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Penalized Spline of Propensity Methods for Missing Data and Causal Inference. Roderick Little

Impact of covariate misclassification on the power and type I error in clinical trials using covariate-adaptive randomization

Integrated approaches for analysis of cluster randomised trials

Ignoring the matching variables in cohort studies - when is it valid, and why?

Exogenous Treatment and Endogenous Factors: Vanishing of Omitted Variable Bias on the Interaction Term

Lecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations)

Stratified Randomized Experiments

Matching with Multiple Control Groups, and Adjusting for Group Differences

Nuisance parameter elimination for proportional likelihood ratio models with nonignorable missingness and random truncation

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

Asymptotic equivalence of paired Hotelling test and conditional logistic regression

Controlling for overlap in matching

e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Transcription:

Prognostic Propensity Scores: A Method Accounting for the Correlations of the Covariates with Both the Treatment and the Outcome Variables in Matching and Diagnostics Authors and Affiliations: Nianbo Dong University of Missouri 14 Hill Hall, Columbia, MO 65211 Phone: (573) 882-1495 dong.nianbo@gmail.com Metin Bulus University of Missouri 14 Hill Hall, Columbia, MO 65211 mbnt9@mail.missouri.edu 1

Prognostic Propensity Scores: A Method Accounting for the Correlations of the Covariates with Both the Treatment and the Outcome Variables in Matching and Diagnostics Background Propensity scores (PS, Rosenbaum & Rubin, 1983) and prognostic scores (PGS, Hansen, 2006, 2008; Wyss et al., 2015) are two summary scores to address confounding for treatment effect estimation. PS focus on capturing the associations of the covariates and the treatment variable and PGS focus on capturing the associations of the covariates and the outcome variable. Because the associations of the covariates with both the treatment and the outcome variables may affect accuracy of treatment effect estimates, combining PS and PGS may have desirable properties. Hansen (2006) combined PS and PGS into a single index using Mahalanobis distance and used this index for full matching. He concluded that full matching on the combined index resulted in smaller bias and variance compared to matching on PS alone. Kelcey (2013) further compared full matching on the combined index proposed by Hansen (2006) to full matching on PS (0.1 caliper size), and full matching on PS within PGS strata (quintiles). Kelcey (2013) concluded that full matching on PS within PGS strata is superior to full matching on the combined index, and full matching on PS alone. A more comprehensive simulation study was conducted by Leacy and Stuart (2014) to compare full matching on the combined index with full matching on PGS within PS calipers (0.2 caliper size), subclassification based on a grid of 5 x 5 subclasses obtained from quintiles of PS and PGS, full matching on PS or PGS alone, and subclassification (quintiles) on PS or PGS. Leacy and Stuart (2014) concluded that full matching on the combined index and full matching on PGS within PS calipers are superior to other methods. Albeit contradictory in their findings with regard to performance of the combined index, it is difficult to draw any definite conclusion since all three studies only have full matching on PS in common as a comparison method. Nevertheless, results are encouraging and warrant further exploration of combining PS and PGS. Furthermore, the diagnostics for PS that commonly rely on covariate-based absolute standardized difference (ASD) or PS-based standardized difference (PSSD), may not apply to PGS or its joint use with PS. The PGS-based standardized difference (PGSSD) was suggested as a measure of balance for both PS and PGS (Stuart, Lee, & Leacy, 2013). However, diagnostics for the joint use of PS and PGS have not been investigated to date. Purpose The purpose is twofold: (1) We propose an alternative method to combine PS and PGS, and compare its performance to regression adjustment, optimal matching on PS, and optimal matching on PGS, and (2) We propose a correlation-based balance measure for diagnostics. Research Design Let a matrix of covariates be denoted as XX, and define PS as XXXX, where αα being a column vector of standardized regression weights to predict treatment status (T). For the same matrix of XX, let PGS be denoted as XXXX, where ββ being a column vector of standardized regression weights to predict the outcome (Y). Then the prognostic propensity score (PGPS) is defined based on redefined weights as PGPS = XX(dddddddd[ααββ TT ]). 2

This combined score (rather than combined index derived from Mahalanobis distance) is used for optimal matching (Ming & Rosenbaum, 2001). We also propose a correlation based covariate balance measure, which can be used to diagnose PS, PGS, and PGPS. Let the correlation between the covariates and the treatment status (TT) be denoted in vector form as cccccccc(xx, TT), the correlation between the covariates and the outcome (YY) in vector form as cccccccc(xx, YY), and number of covariates as NN, then the proposed covariate balance measure is defined as: RR XXXX RR XXXX = [cccccccc(xx, TT) TT cccccccc(xx, YY)]/NN. We conducted Monte Carlo simulation to compare the root mean square error (RMSE) of the treatment effect estimates among several estimation methods: analysis of variance (ANOVA), regression (REG), prognostic score (PGS), propensity score (PS), and prognostic propensity score (PGPS), as well as the doubly robust methods by including covariates in the outcome models (doubly robust prognostic score (DR.PGS), doubly robust prognostic score (DR.PS), and doubly robust prognostic propensity score (DR.PGPS)). Balance measure, RR XXXX RR XXXX, were compared with other balance measures regarding the correlations between the balance measures and bias. Simulation scenarios (n = 576) are adapted from Ali et al. (2014) and Stuart, Lee, and Leacy (2013). See Appendix A for further details. Findings and Discussion This is an ongoing study. We report partial simulation results from 500 replications for each scenario below. Table 1 and Figure 1 present the results of root mean square error (RMSE) of various estimation methods in 576 simulation scenarios. Overall, there is no superiority of PGPS in terms of RMSE. The doubly robust methods (DR.PGS; DR.PS; DR.PGPS) perform similarly better than the methods (PGS; PS; PGPS) without including covariates in the outcome model, and better than regression. Simulation results may guide researchers as to when PGS may be a better option compared to regression adjustment and PS. There are already some studies focusing on these comparisons (Argobast & Ray, 2011; Argobast et al, 2012; Pfeiffer & Riedl, 2015; Schmidt et al, 2016; Wyss et al., 2015; Xu et al., 2016). These studies concluded that PS and PGS have different characteristics that may reduce bias under different circumstances, in particular when treatmentto-control ratio is small. We will identify specific scenarios where some methods are better than the others in the final paper. Finally, this study shows that RR XXXX RR XXXX balance measure have higher correlation with bias and relatively more robust to estimation methods (PS, PGS, and PGPS) (Table 2 and Figure 2). The second best measure, PGSSD is superior to ASD or PSD, confirming Stuart, Lee, and Leacy (2013) findings. Nonetheless, this holds when the outcome model is correctly specified, or when there is no omitted variables in the outcome model. In all other cases, we recommend using RR XXXX RR XXXX balance measure regardless of method because it has the highest correlation when models are miss-specified, and it is more robust to estimation methods. 3

References Ali, M. S., Groenwold, R. H., Pestman, W. R., Belitser, S. V., Roes, K. C., Hoes, A. W.,... & Klungel, O. H. (2014). Propensity score balance measures in pharmacoepidemiology: a simulation study. Pharmacoepidemiology and drug safety, 23(8), 802-811. Arbogast, P. G., & Ray, W. A. (2011). Performance of disease risk scores, propensity scores, and traditional multivariable outcome regression in the presence of multiple confounders. American journal of epidemiology, 174(5), 613-620. Arbogast, P. G., Seeger, J. D., & DEcIDE Methods Center Summary Variable Working Group. (2012). Summary variables in observational research: propensity scores and disease risk scores. Effective health care program research report, (33). Hansen, B. B. (2006). Bias reduction in observational studies via prognosis scores. Technical report #441, University of Michigan Statistics Department. Hansen, B. B. (2008). The prognostic analogue of the propensity score. Biometrika, 95(2), 481-488. Ming, K, & Rosenbaum, P. R. (2001). A note on optimal matching with variable controls using the assignment algorithm. Journal of Computational and Graphical Statistics, 10:455 463. Pfeiffer, R. M., & Riedl, R. (2015). On the use and misuse of scalar scores of confounders in design and analysis of observational studies. Statistics in medicine, 34(18), 2618-2635. Kelcey, B. (2013). Propensity Score Matching within Prognostic Strata. Society for Research on Educational Effectiveness. Leacy, F. P., & Stuart, E. A. (2014). On the joint use of propensity and prognostic scores in estimation of the average treatment effect on the treated: a simulation study. Statistics in medicine, 33(20), 3488-3508. Rosenbaum, P. R. & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41 55 Schmidt, A. F., Klungel, O. H., Groenwold, R. H., & GetReal Consortium. (2016). Adjusting for confounding in early post launch settings: Going beyond logistic regression models. Epidemiology, 27(1), 133-142. Stuart, E. A., Lee, B. K., & Leacy, F. P. (2013). Prognostic score based balance measures can be a useful diagnostic for propensity score methods in comparative effectiveness research. Journal of clinical epidemiology,66(8), S84-S90. Wyss, R., Ellis, A. R., Brookhart, M. A., Jonsson Funk, M., Girman, C. J., Simpson, R. J., & Stürmer, T. (2015). Matching on the disease risk score in comparative effectiveness research of new treatments.pharmacoepidemiology and drug safety, 24(9), 951-961. Xu, S., Shetterly, S., Cook, A. J., Raebel, M. A., Goonesekera, S., Shoaibi, A.,... & Fireman, B. (2016). Evaluation of propensity scores, disease risk scores, and regression in confounder adjustment for the safety of emerging treatment with group sequential monitoring. Pharmacoepidemiology and drug safety, 25(4), 453-461. 4

Tables & Figures Table 1 Root Mean Squared Error of Various Estimation Methods in 576 Simulation Scenarios Sample Estimation LCL UCL Mean Method Mean Mean SD Median Min Max Full ANOVA 0.642 0.626 0.659 0.207 0.564 0.427 1.048 Sample REG 0.202 0.192 0.212 0.121 0.185 0.013 0.513 DR.PGPS 0.164 0.153 0.175 0.134 0.118 0.015 0.533 DR.PGS 0.172 0.161 0.182 0.131 0.123 0.015 0.531 Common DR.PS 0.174 0.164 0.185 0.131 0.128 0.015 0.527 Support PGPS 0.277 0.267 0.286 0.114 0.232 0.117 0.613 Sample PGS 0.259 0.250 0.267 0.105 0.240 0.060 0.594 PS 0.269 0.261 0.277 0.098 0.242 0.134 0.598 Note. ANOVA: Analysis of Variance. REG: Regression. DR.PGPS: Doubly Robust Prognostic Propensity Score. DR.PGS: Doubly Robust Prognostic Score. DR.PS: Doubly Robust Prognostic Score. PGPS: Prognostic Propensity Score. PGS: Prognostic Score. PS: Propensity Score. 5

Table 2 Correlations of Covariate Balance Measure with Bias in 576 Simulation Scenarios Estimation Method Covariate Balance Measure Mean LCL Mean UCL Mean SD Min Max ASD 0.257 0.242 0.273 0.190-0.189 0.646 PGPSSD 0.376 0.361 0.391 0.182-0.024 0.815 PGPS PGSSD 0.573 0.556 0.591 0.212-0.097 0.949 PSSD 0.338 0.316 0.360 0.268-0.311 0.811 RxtRxy 0.731 0.720 0.743 0.137 0.184 0.888 ASD 0.204 0.194 0.215 0.124-0.155 0.603 PGPSSD 0.309 0.295 0.323 0.174-0.109 0.769 PGS PGSSD 0.384 0.361 0.406 0.276-0.250 0.902 PSSD 0.248 0.229 0.266 0.226-0.415 0.775 RxtRxy 0.609 0.599 0.620 0.131 0.164 0.805 ASD 0.438 0.420 0.457 0.226-0.135 0.841 PGPSSD 0.496 0.477 0.516 0.238-0.144 0.904 PS PGSSD 0.547 0.526 0.569 0.263-0.140 0.947 PSSD 0.495 0.474 0.517 0.265-0.099 0.895 RxtRxy 0.617 0.601 0.633 0.197 0.030 0.906 Note. ANOVA: Analysis of Variance. PGPS: Prognostic Propensity Score. PGS: Prognostic Score. PS: Propensity Score. ASD: Absolute Standardized Difference. PSSD: Propensity Score Standardized Difference. PGSSD: Prognostic Score Standardized Difference. PGPSSD: Prognostic Propensity Score Standardized Difference. RxtRxy: Product of correlations between covariate and treatment status, and covariate and outcome. 6

Note. DR.PGPS: Doubly Robust Prognostic Propensity Score. DR.PGS: Doubly Robust Prognostic Score. DR.PS: Doubly Robust Prognostic Score. PGPS: Prognostic Propensity Score. PGS: Prognostic Score. PS: Propensity Score. REG: Regression. Figure 1. Optimal Matching within Common Support Sample 7

PS MS01 MS02 MS03 MS04 MS05 MS06 MS07 MS08 MS09 MS10 MS11 MS12 MS01 MS02 MS03 MS04 MS05 MS06 MS07 MS08 MS09 MS10 MS11 MS12 MS01 MS02 MS03 MS04 MS05 MS06 MS07 MS08 MS09 MS10 MS11 Model Specification MS12 MS01 MS02 MS03 MS04 MS05 MS06 MS07 MS08 MS09 MS10 MS11 MS12 MS01 MS02 MS03 MS04 MS05 MS06 MS07 MS08 MS09 MS10 MS11 MS12 Estimation Method PGS PGPS ASD PGPSSD PGSSD PSSD RxtRxy Covariate Balance Measure Note. PS: Propensity Score. PGS: Prognostic Score. PGPS: Prognostic Propensity Score. ASD: Absolute Standardized Difference. PSSD: Propensity Score Standardized Difference. PGSSD: Prognostic Score Standardized Difference. PGPSSD: Prognostic Propensity Score Standardized Difference. RxtRxy: Product of correlations between covariate and treatment status, and covariate and outcome. Figure 2. Covariate Balance Measure Correlation with Bias 8

Appendix A. Simulation Framework The following figure demonstrates the relationship of covariates and confounders with the treatment and the outcome variables. Unlike Ali et al (2014), X 9 is not in this framework (related to neither treatment nor outcome). We also modified coefficients due to having a continuous outcome instead of a binary outcome. We adapted model specifications from Stuart, Lee and Leacy (2013), and modified them by considering both prognostic score and propensity scores models. X7 and X8 X3 and X6 Treatment Outcome X1, X2, X4 and X5 Figure A1. Association of Covariates and Confounders with Treatment and Outcome True Models: llllllllll(pp tt ) = αα 0 + αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 44 + αα 4 xx 55 + αα 5 xx 7 + αα 6 xx 88 + αα 7 xx 2 xx 44 + αα 8 xx 2 xx 7 + αα 9 xx 7 xx 88 + αα 10 xx 44 xx 55 + αα 11 xx 55 22 + αα 12 xx 88 22 yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 3 + ββ 4 xx 44 + ββ 5 xx 55 + ββ 6 xx 66 + ββ 7 xx 2 xx 44 + ββ 8 xx 3 xx 55 + ββ 9 xx 3 xx 66 + ββ 10 xx 44 xx 55 + ββ 11 xx 55 22 + ββ 12 xx 66 22 Color and Font Codes: X 7, X 8: only treatment X 1, X 2, X 4, X 5: both treatment and outcome X 3, X 6: only outcome X 1, X 2, X 3, X 7: binary {~ Bernoulli (p), where p= (.30,.40,.50,.70)} X 4, X 5, X 6, X 8: continuous {all ~ Normal (0, 1)} In this study, for the true treatment model the intercept value is assigned such that treatment-tocontrol ratio is approximately.30. We further used acceptance-rejection technique and discarded samples having less than.20 and more than.40 treatment-to-control ratio. 9

Table A1. Four Scenarios for True Model Coefficients True Models (TM) True Model Coefficients TM1: Propensity score Main effects in propensity score model (PSM) model and outcome model αα 3, αα 4, αα 6 : log(1.5), log(2), log (3) ffffff cccccccc coefficients are both in αα 1, αα 2, αα 5 : log(1.2), log(1.5), log(2) ffffff bbbbbbbbbbbb increasing order Interactions and quadratic terms in PSM αα 7, αα 8, αα 9, αα 10, αα 11, αα 12 : log(1.1), log(1.2), log(1.3), log(1.4), log(1.5), log (1.6) TM2: Propensity score model coefficients are in increasing order when outcome model coefficients are in decreasing order TM3: Propensity score model coefficients are in decreasing order when outcome model coefficients are in increasing order TM4: Propensity score model and outcome model coefficients are both in decreasing order Main effects in outcome model (OM) γγ: 0 oooo 0.5 oooo 1 ββ 4, ββ 5, ββ 6 : 0.40, 0.70, 1 ffffff cccccccc ββ 1, ββ 2, ββ 3 : 0.20, 0.40, 0.70 ffffff bbbbbbbbbbbb Interactions and quadratic terms in OM ββ 7, ββ 8, ββ 9, ββ 10, ββ 11, ββ 12 : 0.1, 0.2, 0.3, 0.4, 0.5, 0.6 Main effects in propensity score model (PSM) αα 3, αα 4, αα 6 : log(1.5), log(2), log (3) ffffff cccccccc αα 1, αα 2, αα 5 : log(1.2), log(1.5), log(2) ffffff bbbbbbbbbbbb Interactions and quadratic terms in PSM αα 7, αα 8, αα 9, αα 10, αα 11, αα 12 : log(1.1), log(1.2), log(1.3), log(1.4), log(1.5), log (1.6) Main effects in outcome model (OM) γγ: 0 oooo 0.5 oooo 1 ββ 4, ββ 5, ββ 6 : 1, 0.70, 0.40 ffffff cccccccc ββ 1, ββ 2, ββ 3 : 0.70, 0.40, 0.20 ffffff bbbbbbbbbbbb Interactions and quadratic terms in OM ββ 7, ββ 8, ββ 9, ββ 10, ββ 11, ββ 12 : 0.6, 0.5, 0.4, 0.3, 0.2, 0.1 Main effects in propensity score model (PSM) αα 3, αα 4, αα 6 : log(3), log(2), log (1.5) ffffff cccccccc αα 1, αα 2, αα 5 : log(2), log(1.5), log(1.2) ffffff bbbbbbbbbbbb Interactions and quadratic terms in PSM αα 7, αα 8, αα 9, αα 10, αα 11, αα 12 : log(1.6), log(1.5), log(1.4), log(1.3), log(1.2), log (1.1) Main effects in outcome model (OM) γγ: 0 oooo 0.5 oooo 1 ββ 4, ββ 5, ββ 6 : 0.40, 0.70, 1 ffffff cccccccc ββ 1, ββ 2, ββ 3 : 0.20, 0.40, 0.70 ffffff bbbbbbbbbbbb Interactions and quadratic terms in OM ββ 7, ββ 8, ββ 9, ββ 10, ββ 11, ββ 12 : 0.1, 0.2, 0.3, 0.4, 0.5, 0.6 Main effects in propensity score model (PSM) αα 3, αα 4, αα 6 : log(3), log(2), log (1.5) ffffff cccccccc αα 1, αα 2, αα 5 : log(2), log(1.5), log(1.2) ffffff bbbbbbbbbbbb Interactions and quadratic terms in PSM αα 7, αα 8, αα 9, αα 10, αα 11, αα 12 : log(1.6), log(1.5), log(1.4), log(1.3), log(1.2), log (1.1) Main effects in outcome model (OM) γγ: 0 oooo 0.5 oooo 1 ββ 4, ββ 5, ββ 6 : 1, 0.70, 0.40 ffffff cccccccc ββ 1, ββ 2, ββ 3 : 0.70, 0.40, 0.20 ffffff bbbbbbbbbbbb Interactions and quadratic terms in OM ββ 7, ββ 8, ββ 9, ββ 10, ββ 11, ββ 12 : 0.6, 0.5, 0.4, 0.3, 0.2, 0.1 10

Model Specifications (MS) Propensity Score Model (PSM) Prognostic Score Model (PGSM) Outcome Model (OM) MS1: True treatment model PSM1: llllllllll(pp tt ) = αα 0 + αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 4 + αα 4 xx 5 + αα 5 xx 7 + αα 6 xx 8 + αα 7 xx 2 xx 4 + αα 8 xx 2 xx 7 + αα 9 xx 7 xx 8 + αα 10 xx 4 xx 5 + αα 11 xx 5 2 + αα 12 xx 8 2 PGSM1: yy = ξξ 0 + ξξ 1 xx 1 + ξξ 2 xx 2 + ξξ 3 xx 4 + ξξ 4 xx 5 + ξξ 5 xx 7 + ξξ 6 xx 8 + ξξ 7 xx 2 xx 4 + ξξ 8 xx 2 xx 7 + ξξ 9 xx 7 xx 8 + ξξ 10 xx 4 xx 5 + ξξ 11 xx 5 2 + ξξ 12 xx 8 2 OM1: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 4 + ββ 4 xx 5 + ββ 5 xx 7 + ββ 6 xx 8 + ββ 7 xx 2 xx 4 + ββ 8 xx 2 xx 7 + ββ 9 xx 7 xx 8 + ββ 10 xx 4 xx 5 + ββ 11 xx 5 2 + ββ 12 xx 8 2 MS2: True outcome model PSM2: llllllllll(pp tt ) = αα 0 +αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 3 + αα 4 xx 44 + αα 5 xx 55 + αα 6 xx 66 + αα 7 xx 2 xx 44 + αα 8 xx 3 xx 55 + αα 9 xx 3 xx 66 + αα 10 xx 44 xx 55 + αα 11 xx 55 22 + αα 12 xx 66 22 PGSM2: yy = ξξ 0 + ξξ 1 xx 1 + ξξ 2 xx 2 + ξξ 3 xx 3 + ξξ 4 xx 44 + ξξ 5 xx 55 + ξξ 6 xx 66 + ξξ 7 xx 2 xx 44 + ξξ 8 xx 3 xx 55 + ξξ 9 xx 3 xx 66 + ξξ 10 xx 44 xx 55 + ξξ 11 xx 55 22 + ξξ 12 xx 66 22 OM2: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 3 + ββ 4 xx 44 + ββ 5 xx 55 + ββ 6 xx 66 + ββ 7 xx 2 xx 44 + ββ 8 xx 3 xx 55 + ββ 9 xx 3 xx 66 + ββ 10 xx 44 xx 55 + ββ 11 xx 55 22 + ββ 12 xx 66 22 MS3: Main effects for all covariates PSM3: llllllllll(pp tt ) = αα 0 + αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 3 + αα 4 xx 4 + αα 5 xx 5 + αα 6 xx 66 + αα 7 xx 7 + αα 8 xx 8 PGSM3: yy = ξξ 0 + ξξ 1 xx 1 + ξξ 2 xx 2 + ξξ 3 xx 3 + ξξ 4 xx 4 + ξξ 5 xx 5 + ξξ 6 xx 66 + ξξ 7 xx 7 + ξξ 8 xx 8 OM3: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 3 + ββ 4 xx 4 + ββ 5 xx 5 + ββ 6 xx 66 + ββ 7 xx 7 + ββ 8 xx 8 MS4: Main effects for six covariates related to treatment PSM4: llllllllll(pp tt ) = αα 0 + αα 1 xx 1 + αα 2 xx 2 +αα 3 xx 4 + αα 4 xx 5 + αα 5 xx 7 + αα 6 xx 8 PGSM4: yy = ξξ 0 + ξξ 1 xx 1 + ξξ 2 xx 2 +ξξ 3 xx 4 + ξξ 4 xx 5 + ξξ 5 xx 7 + ξξ 6 xx 8 OM4: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 +ββ 3 xx 4 + ββ 4 xx 5 + ββ 5 xx 7 + ββ 6 xx 8 MS5: Main effects for six covariates related to outcome PSM5: llllllllll(pp tt ) = αα 0 +αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 3 + αα 4 xx 44 + αα 5 xx 55 + αα 6 xx 66 PGSM5: yy = ξξ 0 +ξξ 1 xx 1 + ξξ 2 xx 2 + ξξ 3 xx 3 + ξξ 4 xx 44 + ξξ 5 xx 55 + ξξ 6 xx 66 OM5: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 3 + ββ 4 xx 44 + ββ 5 xx 55 + ββ 6 xx 66 11

MS6: Main effects for five covariates related to treatment, omit confounder strongly related to outcome PSM6: llllllllll(pp tt ) = αα 0 + αα 1 xx 1 + αα 2 xx 2 +αα 3 xx 4 + αα 5 xx 7 + αα 6 xx 8 PGSM6: yy = ξξ 0 + ξξ 1 xx 1 + ξξ 2 xx 2 +ξξ 3 xx 4 + ξξ 5 xx 7 + ξξ 6 xx 8 OM6: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 +ββ 3 xx 4 + ββ 5 xx 7 + ββ 6 xx 8 MS7: Main effects for five covariates related to outcome, omit confounder strongly related to treatment PSM7: llllllllll(pp tt ) = αα 0 + αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 3 + αα 4 xx 44 + αα 6 xx 66 PGSM7: yy = ξξ 0 + ξξ 1 xx 1 + ξξ 2 xx 2 + ξξ 3 xx 3 + ξξ 4 xx 44 + ξξ 6 xx 66 OM7: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 3 + ββ 4 xx 44 + ββ 6 xx 66 MS8: Main effects for five covariates related to treatment, omit confounder weakly related to outcome PSM8: llllllllll(pp tt ) = αα 0 + αα 2 xx 2 +αα 3 xx 4 + αα 4 xx 5 + αα 5 xx 7 + αα 6 xx 8 PGSM8: yy = ξξ 0 + ξξ 2 xx 2 +ξξ 3 xx 4 + ξξ 4 xx 5 + ξξ 5 xx 7 + ξξ 6 xx 8 OM8: yy = ββ 0 + γγγγ + ββ 2 xx 2 +ββ 3 xx 4 + ββ 4 xx 5 + ββ 5 xx 7 + ββ 6 xx 8 MS9: Main effects for five covariates related to outcome, omit confounder weakly related to treatment PSM9: llllllllll(pp tt ) = αα 0 + αα 2 xx 2 + αα 3 xx 3 + αα 4 xx 44 + αα 5 xx 55 + αα 6 xx 66 PGSM9: yy = ξξ 0 + ξξ 2 xx 2 + ξξ 3 xx 3 + ξξ 4 xx 44 + ξξ 5 xx 55 + ξξ 6 xx 66 OM9: yy = ββ 0 + γγγγ + ββ 2 xx 2 + ββ 3 xx 3 + ββ 4 xx 44 + ββ 5 xx 55 + ββ 6 xx 66 MS10: Main effects for six covariates related to treatment, add one covariate strongly related to outcome only PSM10: llllllllll(pp tt ) = αα 0 + αα 1 xx 1 + αα 2 xx 2 +αα 3 xx 4 + αα 4 xx 5 + αα 5 xx 7 + αα 6 xx 8 + αα 7 xx 66 PGSM10: yy = ξξ 0 + ξξxx 1 + ξξ 2 xx 2 +ξξ 3 xx 4 + ξξ 4 xx 5 + ξξ 5 xx 7 + ξξxx 8 + ξξ 7 xx 66 OM10: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 +ββ 3 xx 4 + ββ 4 xx 5 + ββ 5 xx 7 + ββ 6 xx 8 + ββ 7 xx 66 MS11: Main effects for six covariates related to outcome, add one covariate strongly related to treatment only PSM11: llllllllll(pp tt ) = αα 0 +αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 3 + αα 4 xx 44 + αα 5 xx 55 + αα 6 xx 66 + αα 7 xx 88 PGSM11: 12

yy = ξξ 0 +ξξ 1 xx 1 + ξξ 2 xx 2 + ξξ 3 xx 3 + ξξ 4 xx 44 + ξξ 5 xx 55 + ξξ 6 xx 66 + ξξ 7 xx 88 OM11: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 3 + ββ 4 xx 44 + ββ 5 xx 55 + ββ 6 xx 66 + ββ 7 xx 88 MS12: Main effects for four covariates related to both treatment and outcome PSM12: llllllllll(pp tt ) = αα 0 +αα 1 xx 1 + αα 2 xx 2 + αα 3 xx 44 + αα 4 xx 55 PGSM12: yy = ξξ 0 +ξξ 1 xx 1 + ξξ 2 xx 2 + ξξ 3 xx 44 + ξξ 4 xx 55 OM12: yy = ββ 0 + γγγγ + ββ 1 xx 1 + ββ 2 xx 2 + ββ 3 xx 44 + ββ 4 xx 55 13

Table A2 Simulation Scenarios and Labels (Example: TM1GAM1N1) True Model (TM) (1) TM1 (2) TM2 (3) TM3 (4) TM4 Treatment Effect (γγ) (1) 0 (2) 0.5 (3) 1 Sample Size (1) 200 (2) 400 (3) 800 (4) 1600 Estimation Method (1) Regression (2) PS (3) PGS (4) PGPS (5) Doubly-robust PS (6) Doubly-robust PGS (7) Doubly-robust PGPS (8) PS within CS (9) PGS within CS (10) PGPS within CS (11) Doubly-robust PS within CS (12) Doubly-robust PGS within CS (13) Doubly-robust PGPS within CS Model Specification (MS) (1) MS1 (2) MS2 (3) MS3 (4) MS4 (5) MS5 (6) MS6 (7) MS7 (8) MS8 (9) MS9 (10) MS10 (11) MS11 (12) MS12 Balance Measures (1) ASD (2) PSSD (3) PGSSD (4) PGPSSD (5) R XTR XY Note. PS: Propensity Score. PGS: Prognostic Score. PGPS: Prognostic Propensity Score. CS: Common Support. ASD: Absolute Standardized Difference. PSSD: Propensity Score Standardized Difference. PGSSD: Prognostic Score Standardized Difference. PGPSSD: Prognostic Propensity Score Standardized Difference. 14