Interrupted Time Series Analysis for Single Series and Comparative Designs: Using Administrative Data for Healthcare Impact Assessment

Similar documents
ITSx: Policy Analysis Using Interrupted Time Series

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Supplemental materials for:

Evaluating Interventions on Drug Utilization: Analysis Methods

Member Level Forecasting Using SAS Enterprise Guide and SAS Forecast Studio

1 Impact Evaluation: Randomized Controlled Trial (RCT)

Formula for the t-test

The ARIMA Procedure: The ARIMA Procedure

Regression Models for Time Trends: A Second Example. INSR 260, Spring 2009 Bob Stine

LDA Midterm Due: 02/21/2005

MODELING INFLATION RATES IN NIGERIA: BOX-JENKINS APPROACH. I. U. Moffat and A. E. David Department of Mathematics & Statistics, University of Uyo, Uyo

Ch3. TRENDS. Time Series Analysis

Suan Sunandha Rajabhat University

Lecture 19 Box-Jenkins Seasonal Models

Time Series I Time Domain Methods

Growth Mixture Model

at least 50 and preferably 100 observations should be available to build a proper model

Available online Journal of Scientific and Engineering Research, 2017, 4(10): Research Article

Statistical Power and Autocorrelation for Short, Comparative Interrupted Time Series Designs with Aggregate Data

Intervention Analysis and Transfer Function Models

Longitudinal Modeling with Logistic Regression

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Conceptual overview: Techniques for establishing causal pathways in programs and policies

TRANSFER FUNCTION MODEL FOR GLOSS PREDICTION OF COATED ALUMINUM USING THE ARIMA PROCEDURE

Time Series Analysis of Currency in Circulation in Nigeria

Measuring community health outcomes: New approaches for public health services research

Poisson regression: Further topics

Chapter 11. Correlation and Regression

Journal of Asian Scientific Research PREDICTION OF FATAL ROAD TRAFFIC CRASHES IN IRAN USING THE BOX-JENKINS TIME SERIES MODEL. Ayad Bahadori Monfared

FORECASTING SUGARCANE PRODUCTION IN INDIA WITH ARIMA MODEL

Using PROC ARIMA in Forecasting the Demand and Utilization of Inpatient Hospital Services

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Lecture 5: Estimation of time series

Development. ECON 8830 Anant Nyshadham

SplineLinear.doc 1 # 9 Last save: Saturday, 9. December 2006

Økonomisk Kandidateksamen 2005(I) Econometrics 2 January 20, 2005

Product Held at Accelerated Stability Conditions. José G. Ramírez, PhD Amgen Global Quality Engineering 6/6/2013

General Linear Model (Chapter 4)

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Normalization of Peak Demand for an Electric Utility using PROC MODEL

LECTURE 9: GENTLE INTRODUCTION TO

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Week 11 Heteroskedasticity and Autocorrelation

arxiv: v1 [stat.me] 5 Nov 2008

Minitab Project Report - Assignment 6

Ross Bettinger, Analytical Consultant, Seattle, WA

Comments on Best Quasi- Experimental Practice

VIII. ANCOVA. A. Introduction

A Re-Introduction to General Linear Models (GLM)

Liang Li, PhD. MD Anderson

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts

Forecasting. Simon Shaw 2005/06 Semester II

Analysing longitudinal data when the visit times are informative

Community Health Needs Assessment through Spatial Regression Modeling

Liang Li, PhD. MD Anderson

Applied Econometrics Lecture 1

Modeling Spatial Relationships Using Regression Analysis. Lauren M. Scott, PhD Lauren Rosenshein Bennett, MS

Regression of Time Series

Geomapping Drive-Time Based Market Areas for DoD TRICARE Beneficiaries

ARIMA Models. Jamie Monogan. January 25, University of Georgia. Jamie Monogan (UGA) ARIMA Models January 25, / 38

De-mystifying random effects models

ECONOMETRIC MODEL WITH QUALITATIVE VARIABLES

Contractual Arrangements between

An Application of the Autoregressive Conditional Poisson (ACP) model

Ph.D. course: Regression models. Introduction. 19 April 2012

Sample Size Calculations

Dummy Variables. Susan Thomas IGIDR, Bombay. 24 November, 2008

Analysis. Components of a Time Series

The Simple Linear Regression Model

Lecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations)

Lecture 7 Time-dependent Covariates in Cox Regression

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status

Trend and Variability Analysis and Forecasting of Wind-Speed in Bangladesh

Forecasting using R. Rob J Hyndman. 3.2 Dynamic regression. Forecasting using R 1

A MACRO-DRIVEN FORECASTING SYSTEM FOR EVALUATING FORECAST MODEL PERFORMANCE

A Robust Interrupted Time Series Model for Analyzing Complex Healthcare Intervention Data

Transmission of Hand, Foot and Mouth Disease and Its Potential Driving

Theoretical and Simulation-guided Exploration of the AR(1) Model

Three-Level Modeling for Factorial Experiments With Experimentally Induced Clustering

The log transformation produces a time series whose variance can be treated as constant over time.

176 Index. G Gradient, 4, 17, 22, 24, 42, 44, 45, 51, 52, 55, 56

Module 1b: Inequalities and inequities in health and health care utilization

Scenario 5: Internet Usage Solution. θ j

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018

Stat 5100 Handout #12.e Notes: ARIMA Models (Unit 7) Key here: after stationary, identify dependence structure (and use for forecasting)

John C. Pickett. David P. Reilly. Robert M. McIntyre

Correlation and Regression

A Comparison of Collaborative Stage with UICC TNM. Darlene Dale Head, Princess Margaret Cancer Registry June 11, 2013

Conducting Multivariate Analyses of Social, Economic, and Political Data

Accounting for Baseline Observations in Randomized Clinical Trials

Algebra I. 60 Higher Mathematics Courses Algebra I

Applied Microeconometrics (L5): Panel Data-Basics

TIME SERIES ANALYSIS AND FORECASTING USING THE STATISTICAL MODEL ARIMA

MATH 2560 C F03 Elementary Statistics I Solutions to Assignment N3

Practical Considerations Surrounding Normality

Practice of SAS Logistic Regression on Binary Pharmacodynamic Data Problems and Solutions. Alan J Xiao, Cognigen Corporation, Buffalo NY

A Comparison of the Forecast Performance of. Double Seasonal ARIMA and Double Seasonal. ARFIMA Models of Electricity Load Demand

INTRODUCTION TO TIME SERIES ANALYSIS. The Simple Moving Average Model

Use of Dummy (Indicator) Variables in Applied Econometrics

Transcription:

Interrupted Time Series Analysis for Single Series and Comparative Designs: Using Administrative Data for Healthcare Impact Assessment Joseph M. Caswell, Ph.D. Lead Analyst Institute for Clinical Evaluative Sciences (ICES) North and Epidemiology, Outcomes & Evaluation Research Health Sciences North Research Institute (HSNRI) Northeast Cancer Centre

Presentation Overview Quasi-experimental research Interrupted time series (ITS) ITS single series with example ITS comparative with example Autocorrelation Adjusting standard errors (SE) in SAS Guide and macro

Quasi-Experimental Research Experimental research: Gold-standard is randomized controlled trial (RCT) i.e., drug trials (random assignment to treatment and placebo groups) Treatment and control groups balanced on baseline measures Can be time-consuming, expensive, and even unethical (i.e., withholding care) Quasi-experimental research: Alternative(s) available when RCT is not an option Quickly implemented, cost-efficient Often times, observational data are already available Various statistical methodologies for observational studies (i.e., propensity scores, ITS, instrumental variables, etc)

Observational Studies Administrative data: Routinely collected by hospitals and other healthcare facilities Great source for conducting observational health studies ICES data holdings: Examples: Ontario Cancer Registry (OCR) Registered Persons Database (RPDB) Discharge Abstract Database (DAD) Ontario Health Insurance Plan (OHIP) Many, many more We can use administrative data to build a study cohort and create various indicator variables

What is ITS Analysis? Increasingly popular quasi-experimental alternative Analysis of time series data (i.e., an outcome measured over time) Comparison before and after an intervention or interruption Particularly useful for assessing impact of policy or some other healthcare initiative

Time Series Caveats Example where simple pre- to post- comparison would be misleading Must control for pre-interruption trend Must also control for any autocorrelation (will get to this later!)

ITS Analysis Can identify different effects: Level change (immediate effect) Slope change (sustained effect) Both

Single Series ITS Analysis Single time series for outcome variable Example: annual rates of influenza, monthly counts of administered chemotherapy, etc Measured before and after some intervention Example: implementing a new hand hygiene regimen, changing policy for use of chemotherapy, etc Are there significant changes in level and/or slope following the intervention?

Some Statistical Methodology ITS analyses use regression-based techniques Added dummy variables for ITS Standard linear regression: y = α + βx + ε where α = intercept, β = coefficient, x = independent variable, ε = residual (error) Single ITS based on segmented linear regression: y = α + β 1 T + β 2 X + β 3 XT + ε where T = time, X = study phase, XT = time after interruption Year Rate T X XT 2001 31.67 1 0 0 2002 30.19 2 0 0 2003 32.44 3 0 0 2004 31.50 4 0 0 2005 29.62 5 0 0 2006 30.18 6 0 0 2007 29.76 7 0 0 2008 29.89 8 0 0 2009 25.42 9 1 1 2010 24.26 10 1 2 2011 25.11 11 1 3 2012 24.07 12 1 4 2013 23.95 13 1 5 2014 22.78 14 1 6 2015 21.12 15 1 7

Single ITS Example What do the results tell us? Fictional example: The town of Squaresville kept records of % population eating candy at least once per day Implemented new candy tax in 2008 (interruption) Data from before and after candy tax analyzed

Single ITS Example Level change: 3.44% RESULTS Parameter Interpretation Estimate Standard Error Probability β 1 Pre- Trend -0.27702 0.1242 0.0475 β 2 Post- Level Change -3.43952 0.8562 0.0020 β 3 Post- Trend Change -0.33083 0.1964 0.1202 β 1 +β 3 Post- Trend -0.60786 - <.0001

Comparative Design ITS Analysis We can strengthen ITS approach by including comparable control series (i.e., no interruption) Outcome measured from two sources (treatment and control) during same time period Were level and/or slope changes of treatment series significantly different from control series? Used far less often compared to single series ITS, even when control series are available

Building on Single Series Method Treatment and control time series are appended Regression equation is expanded: y = α + β 1 T + β 2 X + β 3 XT + β 4 Z + β 5 ZT + β 6 ZX + β 7 ZXT + ε where Z = treatment or control, ZT = time for treatment and 0 for control, ZX = study phase for treatment and 0 for control, ZXT = time after interruption for treatment and 0 for control Year Rate T X TX Z ZT ZX ZTX 2001 31.67 1 0 0 1 1 0 0 2002 30.19 2 0 0 1 2 0 0 2003 32.44 3 0 0 1 3 0 0 2004 31.50 4 0 0 1 4 0 0 2005 29.62 5 0 0 1 5 0 0 2006 30.18 6 0 0 1 6 0 0 2007 29.76 7 0 0 1 7 0 0 2008 29.89 8 0 0 1 8 0 0 2009 25.42 9 1 1 1 9 1 1 2010 24.26 10 1 2 1 10 1 2 2011 25.11 11 1 3 1 11 1 3 2012 24.07 12 1 4 1 12 1 4 2013 23.95 13 1 5 1 13 1 5 2014 22.78 14 1 6 1 14 1 6 2015 21.12 15 1 7 1 15 1 7 2001 30.81 1 0 0 0 0 0 0 2002 30.96 2 0 0 0 0 0 0 2003 31.23 3 0 0 0 0 0 0 2004 28.65 4 0 0 0 0 0 0 2005 29.33 5 0 0 0 0 0 0 2006 29.10 6 0 0 0 0 0 0 2007 30.27 7 0 0 0 0 0 0 2008 28.64 8 0 0 0 0 0 0 2009 27.95 9 1 1 0 0 0 0 2010 29.55 10 1 2 0 0 0 0 2011 29.14 11 1 3 0 0 0 0 2012 26.87 12 1 4 0 0 0 0 2013 28.72 13 1 5 0 0 0 0 2014 27.89 14 1 6 0 0 0 0 2015 29.82 15 1 7 0 0 0 0

Comparative ITS Example What do the results tell us? Fictional example: The town of Squaresville wanted to compare their results to a control series Another nearby town, Sweet Tooth Valley, also kept records of % population eating candy at least once per day Sweet Tooth Valley did not implement a candy tax in 2008 (no interruption) Data sampled at same rate and during same time period as Squaresville time series

Comparative ITS Example Post-Intervention Level difference 2.88% Δslope difference 0.69% RESULTS Parameter Interpretation Estimate Standard Error Probability β 1 Control Pre- Trend -0.28988 0.1398 0.0500 β 2 Control Post- Level -0.56345 0.9632 0.5645 Change β 3 Control Post- Trend 0.356667 0.2210 0.1208 Change β 4 Treatment/Control Pre- 0.724643 0.9981 0.4755 Level Difference β 5 Treatment/Control Pre- 0.012857 0.1977 0.9487 Trend Difference β 6 Treatment/Control Post- -2.87607 1.3622 0.0463 Level Difference β 7 Treatment/Control Change in Slope Difference Pre- to Post- -0.6875 0.3125 0.0386

But Remember Autocorrelation! What is it? An outcome measured at some point in time is correlated with past values of itself The lag order is how far back in time the correlation extends Example: monthly data, seasonal cycle (lag order = 12 autocorrelation) Temperature in January correlated with temperature in January from previous year, etc Image source: NOAA

Autocorrelation How do we test for it? 1. Conduct our ITS regression analysis 2. Obtain residuals (error) Lag = 2 3. Identify residual autocorrelation: Compute autocorrelation functions (ACF) up to specified lag (i.e., monthly data ~12-24) lag 0 always has correlation of 1 series correlated with itself 4. Identify optimal lag order: Check partial ACF, use last significant lag before others drop to non-significant Note: this is a very general explanation (see Box-Jenkins approach for time series) Look for patterns that are guided by theoretical considerations

Adjusting Standard Errors (SE) What can we do when autocorrelation is present? Use more complicated models (i.e., autoregressive or ARIMA models, generalized linear mixed model) Or Use OLS regression as usual, adjust SE

ITS with Adjusted Standard Errors (SE) Newey-West autocorrelation adjusted standard errors Can do this in SAS with proc model after creating ITS dummy variables (T, X, TX): proc model data=dataset_name; run; quit; parms b0 b1 b2 b3; OUTCOME_VAR = b0 + (b1*t) + (b2*x) + (b3*tx); fit OUTCOME_VAR /gmm kernel=(bart,lag+1,0) vardef=n; test b1+b3; Underlined section is for adjusted SE LAG+1 is a positive integer (lag order + 1) SAS documentation: http://support.sas.com/kb/40/098.html

SAS Macro and Guide for ITS I have written a macro to perform ITS analyses in SAS software Based on Stata program by Ariel Linden (2015) Can perform single series or comparative ITS analyses Will create all necessary dummy variables Will adjust for autocorrelation (order needs to be determined before analysis) using Newey- West standard errors Will test β 1 +β 3 for single series analysis (post- slope) Compute residual and predicted values for further model diagnostics Produce table of estimates and plot time series Companion guide also available

SAS Macro and Guide for ITS Guide paper and full macro code: https://www.linkedin.com/pulse/interrupted-time-series-analysis-singlecomparative-designs-caswell-1/ Contact me directly: jcaswell@hsnri.ca joseph.caswell@ices.on.ca

Thanks for listening!