September 25, Abstract

Size: px
Start display at page:

Download "September 25, Abstract"

Transcription

1 On improving Neymanian analysis for 2 K factorial designs with binary outcomes arxiv: v1 [stat.me] 12 Mar 2018 Jiannan Lu 1 1 Analysis and Experimentation, Microsoft Corporation September 25, 2018 Abstract 2 K factorialdesignsarewidely adoptedbystatisticiansand the broaderscientificcommunity. In this short note, under the potential outcomes framework (Neyman, 1923; Rubin, 1974), we adopt the partial identification approach and derive the sharp lower bound of the sampling variance of the estimated factorial effects, which leads to an improved Neymanian variance estimator that mitigates the over-estimation issue suffered by the classic Neymanian variance estimator by Dasgupta et al. (2015). Keywords: Partial identification; potential outcome; randomization; robust inference 1. INTRODUCTION Originally introduced for agricultural research at the famous Rothamsted Experimental Station more than a century ago (Fisher, 1926; Yates and Mather, 1963), randomized controlled factorial designs (Fisher, 1935) have been widely adopted by researchers in social, behavior and biomedical sciences to simultaneously assess the main and interactive causal effects of multiple treatment factors. In applied research, a very frequently encountered scenario is where not only the treatment factors but also the analysis endpoint(s) are binary. For instance, Harris (1974) studied whether Address for correspondence: Jiannan Lu, One Microsoft Way, Redmond, Washington , U.S.A. jiannl@microsoft.com 1

2 gender difference and dressing in high status clothing lead to occurrences of various aggressive behaviors after frustrating incidents. Nair et al. (2008) explored how differentiating the format of computer-aided encouragement program (e.g., a single session vs. multiple occurrences, personalized vs. more general feedback and advice) affected the abstinence from smoking. Stampfer et al. (1985) investigated whether aspirin and β carotene could help prevent cardiovascular mortality. For such studies, to guarantee trustworthy discovery and reporting of causal effects that are scientifically meaningful, it is imperative to adopt an interpretable and robust methodology for estimation and inference. Indeed, as pointed out by Brunner and Puri (2001), the analysis of factorial designs is one of the most important and frequently encountered problems in statistics. During recent years, the potential outcomes framework (Neyman, 1923; Rubin, 1974, 1990) has drawn increasing attractions, because it enjoys clear interpretability (causal effects are defined as comparisons between potential outcomes under different treatments), flexible scopes of interests (i.e., finite-population or super-population), and robustness to model mis-specifications (because potential outcomes are treated as fixed constants). In particular, it is worth mentioning that, the only source of randomness under the potential outcomes framework is the treatment assignment itself, which Rubin (1990, 2008) advocated as the gold standard for causal inference. Realizing this salient feature of the potential outcomes framework, Dasgupta et al. (2015) extended it to 2 K factorial designs, and claimed that the proposed randomization-based approach results in better understanding of the estimands and allows greater flexibility in statistical inference of factorial effects, compared to the commonly used linear model based approach. However, as widely acknowledged by the causal inference community and extensively discussed in the existing literature (e.g., Aronow et al., 2014; Ding and Dasgupta, 2016; Ding, 2017; Imbens and Rubin, 2015, Section 6.5), a long-standing and fundamental challenge faced by the Neymanian framework (both in general and for factorial designs) is that, it over-estimates the sampling variances of the estimated factorial effects, because we cannot jointly observe the potential outcomes under different treatments, and therefore directly identify the strengths of association between them. This missing data problem is sometimes referred to as the fundamental problem of causal inference (e.g., Holland, 1986; Imbens and Rubin, 2015, Section 1.3). Among the numerous proposals that mitigate the variance over-estimation of the Neymanian causal inference framework, one solution that completely preserves the randomization-based fla- 2

3 vor is the partial identification approach (Richardson et al., 2014), which is widely employed by both statisticians (e.g., Cheng and Small, 2006; Zhang and Rubin, 2003; Aronow et al., 2014; Lu et al., 2015; Ding and Dasgupta, 2016) and econometricians (e.g., Fan and Park, 2010). The key idea of partial identification, in the context of factorial designs, is that although we cannot directly identify the sampling variances of the estimated factorial effects, we can derive their sharp lower bounds which are identifiable from observed data, which leads to an improved Neymanian variance estimator that guarantees better performance, regardless of the underlying dependency structure of the potential outcomes (however, the extent of performance improvement depends on the dependency structure). Along this line of research, Robins (1988) and Ding and Dasgupta (2016) solved the problem for treatment-control studies (i.e., 2 1 factorial designs), and in a recent paper Lu (2017) proposed the said improved variance estimator for 2 2 factorial designs. Nevertheless, we still need a unifying framework applicable to general 2 K factorial designs, which to our best knowledge is lacking from the existing literature. From a theoretical perspective, it seems non-trivial to generalize the main results in Lu (2017) to arbitrary 2 K factorial designs, because the complexity of the dependency structure of the potential outcomes grow exponentially as K increases. From a practical perspective, although 2 2 factorial designs seem common in applied research, high-order factorial designs were also frequently employed (Berkowitz and Daniels, 1964; Kim et al., 2008; Yuan et al., 2008) (e.g., to screen a large number of candidate treatment factors). In this paper, we fill this theoretical gap by deriving the desired improved Neymanian variance estimator, for arbitrary 2 K factorial designs. We organize the remainder of the paper as follows. Section 2 reviews Dasgupta et al. (2015) s randomization-based causal inference framework for 2 K factorial designs, with a primary focus on binary outcomes. Section 3 first highlights the variance over-estimation issue that the Neymanian causal inference framework suffers, and presents the improved Neymanian variance estimator that achieves unified bias-correction. Section 4 concludes with a discussion. 3

4 2. NEYMANIAN INFERENCE FOR FACTORIAL DESIGNS 2.1. Factorial designs We adapt some materials from Dasgupta et al. (2015) and Lu (2016a,b) to review the Neymanian causal inference framework for 2 k factorial designs. To maintain consistency, we inherit the set of notations from Lu (2017). Consider K( 1) distinct treatment factors with two-levels -1 (placebo) and 1 (active treatment), resulting a total number of J = 2 K treatment combinations, labelled as z 1,...,z J. Their definitions depend on the J J model matrix H = (h 0,...,h J 1 ) (c.f. Wu and Hamada, 2009), constructed as follows: 1. Let h 0 = 1 J ; 2. For k = 1,...,K, construct h k by letting its first 2 K k entries be -1, the next 2 K k entries be 1, and repeating 2 k 1 times; 3. If K 2, order all subsets of {1,...,K} with at least two elements, first by cardinality and then lexicography. For k = 1,...J 1 K, let σ k be the kth subset and h K+k = l σ k h l, where stands for entry-wise product. Given the resulting design matrix H, the jth row of the corresponding sub-matrix (h 1,...,h K ) is the jth treatment combination z j, for j = 1,...,J. For example, for K = 2, H = h 0 h 1 h 2 h 3 h 4 h 5 h 6 h ,

5 Therefore, the treatment combinations are z 1 = ( 1, 1, 1), z 2 = ( 1, 1,1), z 3 = ( 1,1, 1), z 4 = ( 1,1,1), z 5 = (1, 1, 1), z 6 = ( 1, 1,1), z 7 = (1,1, 1) and z 8 = (1,1,1), respectively Potential outcomes and factorial effects Consider 2 K factorial designs with N( 2 K+1 ) experimental units. Under the Stable Unit Treatment Value Assumption (SUTVA, Rubin, 1980), for all i = 1,...,N, we let Y i (z j ) {0,1} be the binary potential outcome of unit i under treatment z j, and Y i = {Y i (z 1 ),...,Y i (z J )}. To simplify future notations, let D k1,...,k J = N J 1 {Yi (z j )=k j } (k 1,...,k J {0,1}). By definition 1 k 1 = k J =0 D k 1,...,k J = N. Consequently, we can uniquely determine the potential outcomes using the following 2 J dimensional vector: D = (D 0,0,...,0,D 0,0,...,1,...,D 1,1,...,0,D 1,1,...,1 ), where the indices are ordered binary representations of {0,...,J 1}, respectively. Furthermore, for all {j 1,...,j s } {1,...,J}, let N j1,...,j s = N N 1 {Yi (z j1 )=1,...,Y i (z js )=1} = s Y i (z jr ). r=1 For j = 1,...,J, let the average potential outcome for z j is p j = N j /N, and let p = (p 1,...,p J ). For all l = 1,...,J 1, Dasgupta et al. (2015) defined the lth individual- and population-level factorial effects as τ il = 2 (K 1) h l Y i (i = 1,...,N); τ l = 2 (K 1) h lp. (1) 2.3. Factorial effects and randomization-based inference We consider a completely randomized treatment assignment. Let n 1,...,n J be positive constants such that n j = N. For all j = 1,...,J, randomly assign n j 2 units to z j. For all i = 1,...,N, 5

6 we let 1, if unit i is assigned to z j, W i (z j ) = 0, otherwise (j = 1,...,J). The observed outcomes for unit i is therefore Y obs i potential outcome for z j be ˆp j = n obs j /n j, where = J W i(z j )Y i (z j ). Let the average observed n obs j = N W i (z j )Y i (z j ) = i:w i (z j )=1 Y obs i (j = 1,...,J). Denote ˆp = (ˆp 1,..., ˆp J ). The randomization-based estimator for the factorial effects is ˆ τ l = 2 (K 1) h lˆp (l = 1,...,J 1). (2) Motivated by the seminal work of Dasgupta et al. (2015), Lu (2016b) derived the sampling variance of the estimator in (2) as where Var(ˆ τ l ) = S 2 j = (N 1) 1 N 1 2 2(K 1) is the variance of potential outcomes for z j, and Sj/n 2 j 1 N S2 ( τ l ), (3) { Yi (z j ) Ȳ(z 1) } 2 = N N 1 p j(1 p j ) N S 2 ( τ l ) = (N 1) 1 (τ il τ l ) 2 is the variance of the lth individual-level factorial effects among the experimental units. Following Dasgupta et al. (2015), Lu (2016b) derived the Neymanian estimator for the sampling variance (3), by substituting S 2 j with its unbiased estimate s 2 j = (n j 1) 1 N W i (z j ){Y obs i Ȳobs (z j )} 2 = n j n j 1ˆp j(1 ˆp j ), 6

7 and the unidentifiable S 2 ( τ l ) with its obvious lower bound, zero: Var Ney (ˆ τ l ) = 2 2(K 1) s 2 j /n j = 2 2(K 1) ˆp j (1 ˆp j ) n j 1, (4) This estimator is conservative, on average over-estimating the true sampling variance by } E{ VarNey (ˆ τ l ) Var(ˆ τ l ) = S 2 ( τ l )/N. The bias is generally positive, unless strict additivity (Dasgupta et al., 2015; Ding and Dasgupta, 2016; Ding, 2017) holds, i.e., τ il = τ i l for all i i. In other words, all experiment units have identical treatment effects. Several researchers (e.g., LaVange et al., 2005; Rigdon and Hudgens, 2015) pointed out that this condition is too strong in practice, especially for binary outcomes. In cases where the strict additivity does not hold, the estimator in (4) might be too conservative, as acknowledged by Aronow et al. (2014). 3. THE IMPROVED NEYMANIAN VARIANCE ESTIMATOR The core idea of the partial identification approach is to derive a non-zero and identifiable lower bound of S 2 ( τ l ). To achieve this goal, we rely on the following lemmas, which are 2 K versions of the corresponding 2 2 versions in Lu (2017). However, it is worth mentioning that, the original proofs of the 2 2 versions in Lu (2017) relied heavily on the inclusion-exclusion principle and Boole s inequality, and therefore are extremely difficult to be generalized to arbitrary 2 K factorial designs. To partially circumstance this technical difficulty, we adopt a methodology that is much simpler and more intuitive than the one used by Lu (2017). Lemma 1. For all l = 1,...,J 1, let h l = (h 1l,...,h Jl ), and S 2 ( τ l ) = 1 2 2(K 1) (N 1) N j + h lj h lj N jj j j N N 1 τ2 l. 7

8 Proof. The proof largely follows Lu (2017). First, by (1) N τil 2 = N 2 2(K 1) (h l Y i) 2 N = 2 2(K 1) h lj Y i (z j ) N = 2 2(K 1) h 2 lj Y i 2 (z j )+ h lj h lj Y i (z j )Y i (z j ) j j = 2 2(K 1) N h 2 lj 2 Y 2 i (z j)+ j j h lj h lj = 2 2(K 1) N j + h lj h lj N jj. j j N Y i (z j )Y i (z j ) Therefore ( N ) S 2 ( τ l ) = (N 1) 1 τil 2 N τ2 l = 1 2 2(K 1) (N 1) N j + h lj h lj N jj N N 1 τ2 l. j j Lemma 2. For all l = 1,...,J 1, N j + h lj h lj N jj h lj N l j j, (5) and the equality in (5) holds if and only if τ il {τ il +2 (K 1) } = 0 (i = 1,...,N) (6) or τ il {τ il 2 (K 1) } = 0 (i = 1,...,N). (7) 8

9 Proof. To prove that (5), we break it down into two parts: N j + h 1j h 1j N jj h 1j N l (8) j j and N j + h 1j h 1j N jj h 1j N l. (9) j j For the equality in (5) to hold, we only need the equality in either (8) or (9) to hold. We first prove (8), and derive the sufficient and necessary condition for the equality to hold. To simplify notations, denote J l = {j : h lj = 1}, J l+ = {j : h lj = 1}. Simple algebra suggests that (8) is equivalent to N j + j J l N jj + N jj j j J l j j J l+ j J l,j J l+ N jj. (10) To prove (10), for all i = 1,...,N let λ il = j J l Y i (z j ) and λ il+ = j J l+ Y i (z j ), which are two integer constant. Therefore, for i = 1,...,N, it is obvious that (λ il λ il+ )+(λ il λ il+ ) 2 0, or equivalently λ il + λ il (λ il 1) 2 + λ il+(λ il+ 1) 2 λ il λ il+. (11) Note that (11) immediately implies (8), because j J l N j = N λ il, j J l,j J l+ N jj = N λ il λ il+, and N jj = λ ils(λ ils 1) 2 j j J ls (s =,+). Moreover, the equality in (8) holds if and only if the equality in (11) holds for all i = 1,...,N, which is equivalent to (6), because by definition λ il λ il+ = 2 k 1 τ il. Similarly, we can prove (9), and its equality holds if and only if (7) holds. The proof is complete. 9

10 With the help of Lemmas 1 2, alongside with the definition of factorial effect in (1), we can derive the main theoretical result of the paper. Theorem 1. The sharp lower bound for S 2 ( τ l ) is SLB( τ 2 l ) = N N 1 max{2 (K 1) τ l τ l 2,0}. (12) The equality in (12) holds if and only if (6) or (7) holds. Proof. The proof follow from the definition of τ l in (1), and Lemmas 1 2. The lower bound in (12) sharp, in the sense that it is the minimum of all possible values of S 2 ( τ l ) compatible with the marginal distributions of the potential outcomes (i.e., N j for j = 1,...,J). To facilitate a better understanding of Theorem 1, we consider two special cases. First, when J = 1, we have the classic treatment-control studies. In this case, Theorem 1 reduces to the main result of Ding and Dasgupta (2016). Moreover, the condition in (6), which is sufficient and necessary to achieve the lower bound in (12), reduces to Y i (1) Y i ( 1) (i = 1,...,N) which is termed monotonicity by Ding and Dasgupta (2016). Second, when J = 2, Theorem 1 reduces to the main result of Lu (2017). Theorem 1 leads to the improved Neymanian variance estimator Var IN (ˆ τ l ) = Var Ney (ˆ τ l ) Ŝ2 LB ( τ l)/n, (13) where the bias-correlation term Ŝ 2 LB( τ l ) = N N 1 max{2 (K 1) ˆ τ l ˆ τ 2 l,0} is the finite-population analogue of the sharp lower bound presented in (12). This bias-correction term is always non-negative, which leads to a guaranteed improvement of variance estimation, for any observed data. As pointed out by several researchers, the consistency of the point estimate ˆ τ l 10

11 guarantees the performance of the improved Neymanian variance estimator in moderately large samples. For some simulated and empirical examples for 2 2 factorial designs, see Lu (2017). 4. CONCLUDING REMARKS Under the potential outcomes framework, we have proposed an improved Neymanian variance estimator for 2 K factorial designs with binary outcomes. Comparing to the classic variance estimator by Dasgupta et al. (2015), the newly proposed estimator guarantees bias-correction, regardless of the underlying dependency structure of the potential outcomes. The core idea behind the new estimator is the sharp lower bound of the sampling variance of the estimated factorial effects. We point out three directions of future research. First, although we focus on binary outcomes, it would be interesting to generalize the current work to general outcomes (e.g., continuous, time to event). Aronow et al. (2014) achieved this goal for treatment-control studies, but generalizing their results to factorial designs seems a highly non-trivial task. Second, in a recent paper Mukerjee et al. (2017) extended the potential outcomes framework to more complex experimental designs beyond 2 K factorial, and it is possible to study partial identification for more complex designs. Third, we can generalize our results beyond the classic difference-in-means type factorial effects, to more nuance causal estimands. REFERENCES Aronow, P., Green, D. P., and Lee, D. K. (2014). Sharp bounds on the variance in randomized experiments. The Annals of Statistics, 42: Berkowitz, L. and Daniels, L. R. (1964). Affecting the salience of the social responsibility norm: Effects of past help on the response to dependency relationships. Journal of Abnormal and Social Psychology, 68:275. Brunner, E. and Puri, M. L. (2001). Nonparametric methods in factorial designs. Statistical Papers, 42:1 52. Cheng, J. and Small, D. S.(2006). Bounds on causal effects in three-arm trials with non-compliance. Journal of the Royal Statistical Society: Series B, 68:

12 Dasgupta, T., Pillai, N., and Rubin, D. B. (2015). Causal inference from 2 k factorial designs using the potential outcomes model. Journal of the Royal Statistical Society: Series B, 77: Ding, P.(2017). A paradox from randomization-based causal inference(with discussions). Statistical Science, 32: Ding, P. and Dasgupta, T.(2016). A potential tale of two by two tables from completely randomized experiments. Journal of American Statistical Association, 111: Fan, Y. and Park, S. S. (2010). Sharp bounds on the distribution of treatment effects and their statistical inference. Econometric Theory, 26: Fisher, R. A. (1926). The arrangement of field experiments. Journal of the Ministry of Agriculture of Great Britain, 33: Fisher, R. A. (1935). The Design of Experiments. Edinburgh: Oliver and Boyd. Harris, M. B. (1974). Mediators between frustration and aggression in a field experiment. Journal of Experimental Social Psychology, 10: Holland, P. W. (1986). Statistics and causal inference. Journal of American Statistical Association, 81: Imbens, G. and Rubin, D. B.(2015). Causal Inference in Statistics, Social, and Biomedical Sciences: An Introduction. New York: Cambridge University Press. Kim, K., Zhang, M., and Li, X. (2008). Effects of temporal and social distance on consumer evaluations. Journal of Consumer Research, 35: LaVange, L. M., Durham, T. A., and Koch, G. (2005). Randomization-based non-parametric methods for the analysis of multi-centre trials. Statistical Methods in Medical Research, 14: Lu, J. (2016a). Covariate adjustment in randomization-based causal inference for 2 k factorial designs. Statistics & Probability Letters, 119: Lu, J. (2016b). On randomization-based and regression-based inferences for 2 k factorial designs. Statistics & Probability Letters, 112:

13 Lu, J. (2017). Sharpening randomization-based causal inference for 2 2 factorial designs with binary outcomes. Statistical Methods in Medical Research, in press. doi: / Lu, J., Ding, P., and Dasgupta, T.(2015). Treatment effects on ordinal outcomes: Causal estimands and sharp bounds. arxiv preprint: Mukerjee, R., Dasgupta, T., and Rubin, D. B. (2017). Using standard tools from finite population sampling to improve causal inference for complex experiments. Journal of the American Statistical Association, in press. doi: / Nair, V., Strecher, V., Fagerlin, A., Ubel, P., Resnicow, K., Murphy, S., Little, R., Chakraborty, B., and Zhang, A. (2008). Screening experiments and the use of fractional factorial designs in behavioral intervention research. American Journal of Public Health, 98: Neyman, J. S. (1923). On the application of probability theory to agricultural experiments. Essay on principles: Section 9 (reprinted edition). Statistical Science, 5: Richardson, A., Hudgens, M. G., Gilbert, P. B., and Fine, J. P. (2014). Nonparametric bounds and sensitivity analysis of treatment effects. Statistical Science, 29:596. Rigdon, J. and Hudgens, M. G. (2015). Randomization inference for treatment effects on a binary outcome. Statistics in Medicine, 34: Robins, J. M. (1988). Confidence intervals for causal parameters. Statistics in Medicine, 7: Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66: Rubin, D. B. (1980). Comment on Randomized analysis of experimental data: The Fisher randomization test by D. Basu. Journal of American Statistical Association, 75: Rubin, D. B. (1990). Formal mode of statistical inference for causal effects. Journal of Statistical Planning and Inference, 25: Rubin, D. B. (2008). For objective causal inference, design trumps analysis. The Annals of Applied Statistics, 2:

14 Stampfer, M. J., Buring, J. E., Willett, W., Rosner, B., Eberlein, K., and Hennekens, C. H. (1985). The 2 2 factorial design: Its application to a randomized trial of aspirin and us physicians. Statistics in Medicine, 4: Wu, C. F. J. and Hamada, M. S. (2009). Experiments: Planning, Analysis, and Optimization. Wiley: New York. Yates, F. and Mather, K. (1963). Ronald Aylmer Fisher. Biographical Memoirs of Fellows of the Royal Society, 9: Yuan, X., Liu, J., Zeng, G., Shi, J., Tong, J., and Huang, G. (2008). Optimization of conversion of waste rapeseed oil with high FFA to bio-diesel using response surface methodology. Renewable Energy, 33: Zhang, J. L. and Rubin, D. B. (2003). Estimation of causal effects via principal stratification when some outcomes are truncated by death. Journal of Educational and Behavioral Statistics, 28:

arxiv: v1 [stat.me] 3 Feb 2017

arxiv: v1 [stat.me] 3 Feb 2017 On randomization-based causal inference for matched-pair factorial designs arxiv:702.00888v [stat.me] 3 Feb 207 Jiannan Lu and Alex Deng Analysis and Experimentation, Microsoft Corporation August 3, 208

More information

arxiv: v1 [math.st] 28 Feb 2017

arxiv: v1 [math.st] 28 Feb 2017 Bridging Finite and Super Population Causal Inference arxiv:1702.08615v1 [math.st] 28 Feb 2017 Peng Ding, Xinran Li, and Luke W. Miratrix Abstract There are two general views in causal analysis of experimental

More information

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS Donald B. Rubin Harvard University 1 Oxford Street, 7th Floor Cambridge, MA 02138 USA Tel: 617-495-5496; Fax: 617-496-8057 email: rubin@stat.harvard.edu

More information

arxiv: v1 [stat.me] 23 May 2017

arxiv: v1 [stat.me] 23 May 2017 Model-free causal inference of binary experimental data Peng Ding and Luke W. Miratrix arxiv:1705.08526v1 [stat.me] 23 May 27 Abstract For binary experimental data, we discuss randomization-based inferential

More information

arxiv: v1 [stat.me] 13 Nov 2017

arxiv: v1 [stat.me] 13 Nov 2017 Sharpening randomization-based causal inference for 2 2 factorial designs with binary outcomes arxiv:1711.04432v1 [stat.me] 13 Nov 2017 Jiannan Lu 1 1 Analysis and Experimentation, Microsoft Corporation

More information

Causal inference from 2 K factorial designs using potential outcomes

Causal inference from 2 K factorial designs using potential outcomes Causal inference from 2 K factorial designs using potential outcomes Tirthankar Dasgupta Department of Statistics, Harvard University, Cambridge, MA, USA 02138. atesh S. Pillai Department of Statistics,

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes Kosuke Imai Department of Politics Princeton University July 31 2007 Kosuke Imai (Princeton University) Nonignorable

More information

Optimal Blocking by Minimizing the Maximum Within-Block Distance

Optimal Blocking by Minimizing the Maximum Within-Block Distance Optimal Blocking by Minimizing the Maximum Within-Block Distance Michael J. Higgins Jasjeet Sekhon Princeton University University of California at Berkeley November 14, 2013 For the Kansas State University

More information

arxiv: v1 [stat.me] 16 Jun 2016

arxiv: v1 [stat.me] 16 Jun 2016 Causal Inference in Rebuilding and Extending the Recondite Bridge between Finite Population Sampling and Experimental arxiv:1606.05279v1 [stat.me] 16 Jun 2016 Design Rahul Mukerjee *, Tirthankar Dasgupta

More information

Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death

Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death Kosuke Imai First Draft: January 19, 2007 This Draft: August 24, 2007 Abstract Zhang and Rubin 2003) derives

More information

AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M.

AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M. AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS by Stanley A. Lubanski and Peter M. Steiner UNIVERSITY OF WISCONSIN-MADISON 018 Background To make

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

Research Note: A more powerful test statistic for reasoning about interference between units

Research Note: A more powerful test statistic for reasoning about interference between units Research Note: A more powerful test statistic for reasoning about interference between units Jake Bowers Mark Fredrickson Peter M. Aronow August 26, 2015 Abstract Bowers, Fredrickson and Panagopoulos (2012)

More information

arxiv: v1 [stat.me] 8 Jun 2016

arxiv: v1 [stat.me] 8 Jun 2016 Principal Score Methods: Assumptions and Extensions Avi Feller UC Berkeley Fabrizia Mealli Università di Firenze Luke Miratrix Harvard GSE arxiv:1606.02682v1 [stat.me] 8 Jun 2016 June 9, 2016 Abstract

More information

arxiv: v1 [math.st] 7 Jan 2014

arxiv: v1 [math.st] 7 Jan 2014 Three Occurrences of the Hyperbolic-Secant Distribution Peng Ding Department of Statistics, Harvard University, One Oxford Street, Cambridge 02138 MA Email: pengding@fas.harvard.edu arxiv:1401.1267v1 [math.st]

More information

Inference for Average Treatment Effects

Inference for Average Treatment Effects Inference for Average Treatment Effects Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2018 1 / 15 Social

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Technical Track Session I: Causal Inference

Technical Track Session I: Causal Inference Impact Evaluation Technical Track Session I: Causal Inference Human Development Human Network Development Network Middle East and North Africa Region World Bank Institute Spanish Impact Evaluation Fund

More information

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i. Weighting Unconfounded Homework 2 Describe imbalance direction matters STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

More information

Sensitivity checks for the local average treatment effect

Sensitivity checks for the local average treatment effect Sensitivity checks for the local average treatment effect Martin Huber March 13, 2014 University of St. Gallen, Dept. of Economics Abstract: The nonparametric identification of the local average treatment

More information

Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability. June 19, 2017

Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability. June 19, 2017 Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability Alessandra Mattei, Federico Ricciardi and Fabrizia Mealli Department of Statistics, Computer Science, Applications, University

More information

A randomization-based perspective of analysis of variance: a test statistic robust to treatment effect heterogeneity

A randomization-based perspective of analysis of variance: a test statistic robust to treatment effect heterogeneity Biometrika, pp 1 25 C 2012 Biometrika Trust Printed in Great Britain A randomization-based perspective of analysis of variance: a test statistic robust to treatment effect heterogeneity One-way analysis

More information

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Maximilian Kasy Department of Economics, Harvard University 1 / 40 Agenda instrumental variables part I Origins of instrumental

More information

Understanding Ding s Apparent Paradox

Understanding Ding s Apparent Paradox Submitted to Statistical Science Understanding Ding s Apparent Paradox Peter M. Aronow and Molly R. Offer-Westort Yale University 1. INTRODUCTION We are grateful for the opportunity to comment on A Paradox

More information

arxiv: v4 [math.st] 23 Jun 2016

arxiv: v4 [math.st] 23 Jun 2016 A Paradox From Randomization-Based Causal Inference Peng Ding arxiv:1402.0142v4 [math.st] 23 Jun 2016 Abstract Under the potential outcomes framework, causal effects are defined as comparisons between

More information

Technical Track Session I:

Technical Track Session I: Impact Evaluation Technical Track Session I: Click to edit Master title style Causal Inference Damien de Walque Amman, Jordan March 8-12, 2009 Click to edit Master subtitle style Human Development Human

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work

More information

Applied Statistics Lecture Notes

Applied Statistics Lecture Notes Applied Statistics Lecture Notes Kosuke Imai Department of Politics Princeton University February 2, 2008 Making statistical inferences means to learn about what you do not observe, which is called parameters,

More information

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

More information

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata Harald Tauchmann (RWI & CINCH) Rheinisch-Westfälisches Institut für Wirtschaftsforschung (RWI) & CINCH Health

More information

Estimation of the Conditional Variance in Paired Experiments

Estimation of the Conditional Variance in Paired Experiments Estimation of the Conditional Variance in Paired Experiments Alberto Abadie & Guido W. Imbens Harvard University and BER June 008 Abstract In paired randomized experiments units are grouped in pairs, often

More information

Statistical Models for Causal Analysis

Statistical Models for Causal Analysis Statistical Models for Causal Analysis Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Three Modes of Statistical Inference 1. Descriptive Inference: summarizing and exploring

More information

Causal Mechanisms Short Course Part II:

Causal Mechanisms Short Course Part II: Causal Mechanisms Short Course Part II: Analyzing Mechanisms with Experimental and Observational Data Teppei Yamamoto Massachusetts Institute of Technology March 24, 2012 Frontiers in the Analysis of Causal

More information

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp page 1 Lecture 7 The Regression Discontinuity Design fuzzy and sharp page 2 Regression Discontinuity Design () Introduction (1) The design is a quasi-experimental design with the defining characteristic

More information

Potential Outcomes and Causal Inference I

Potential Outcomes and Causal Inference I Potential Outcomes and Causal Inference I Jonathan Wand Polisci 350C Stanford University May 3, 2006 Example A: Get-out-the-Vote (GOTV) Question: Is it possible to increase the likelihood of an individuals

More information

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks Y. Xu, D. Scharfstein, P. Mueller, M. Daniels Johns Hopkins, Johns Hopkins, UT-Austin, UF JSM 2018, Vancouver 1 What are semi-competing

More information

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW SSC Annual Meeting, June 2015 Proceedings of the Survey Methods Section ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW Xichen She and Changbao Wu 1 ABSTRACT Ordinal responses are frequently involved

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions

More information

A Sensitivity Analysis for Missing Outcomes Due to Truncation-by-Death under the Matched-Pairs Design

A Sensitivity Analysis for Missing Outcomes Due to Truncation-by-Death under the Matched-Pairs Design Research Article Statistics Received XXXX www.interscience.wiley.com DOI: 10.1002/sim.0000 A Sensitivity Analysis for Missing Outcomes Due to Truncation-by-Death under the Matched-Pairs Design Kosuke Imai

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Introduction to Statistical Inference Kosuke Imai Princeton University January 31, 2010 Kosuke Imai (Princeton) Introduction to Statistical Inference January 31, 2010 1 / 21 What is Statistics? Statistics

More information

An Introduction to Causal Analysis on Observational Data using Propensity Scores

An Introduction to Causal Analysis on Observational Data using Propensity Scores An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut

More information

Stratified Randomized Experiments

Stratified Randomized Experiments Stratified Randomized Experiments Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Stratified Randomized Experiments Stat186/Gov2002 Fall 2018 1 / 13 Blocking

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Design

An Alternative Assumption to Identify LATE in Regression Discontinuity Design An Alternative Assumption to Identify LATE in Regression Discontinuity Design Yingying Dong University of California Irvine May 2014 Abstract One key assumption Imbens and Angrist (1994) use to identify

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh)

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh) WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, 2016 Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh) Our team! 2 Avi Feller (Berkeley) Jane Furey (Abt Associates)

More information

Bounds on Causal Effects in Three-Arm Trials with Non-compliance. Jing Cheng Dylan Small

Bounds on Causal Effects in Three-Arm Trials with Non-compliance. Jing Cheng Dylan Small Bounds on Causal Effects in Three-Arm Trials with Non-compliance Jing Cheng Dylan Small Department of Biostatistics and Department of Statistics University of Pennsylvania June 20, 2005 A Three-Arm Randomized

More information

A proof of Bell s inequality in quantum mechanics using causal interactions

A proof of Bell s inequality in quantum mechanics using causal interactions A proof of Bell s inequality in quantum mechanics using causal interactions James M. Robins, Tyler J. VanderWeele Departments of Epidemiology and Biostatistics, Harvard School of Public Health Richard

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

Conditional randomization tests of causal effects with interference between units

Conditional randomization tests of causal effects with interference between units Conditional randomization tests of causal effects with interference between units Guillaume Basse 1, Avi Feller 2, and Panos Toulis 3 arxiv:1709.08036v3 [stat.me] 24 Sep 2018 1 UC Berkeley, Dept. of Statistics

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Duke University Medical, Dept. of Biostatistics Joint work

More information

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha January 18, 2010 A2 This appendix has six parts: 1. Proof that ab = c d

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs An Alternative Assumption to Identify LATE in Regression Discontinuity Designs Yingying Dong University of California Irvine September 2014 Abstract One key assumption Imbens and Angrist (1994) use to

More information

Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded

Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded Controlling for latent confounding by confirmatory factor analysis (CFA) Blinded Blinded 1 Background Latent confounder is common in social and behavioral science in which most of cases the selection mechanism

More information

arxiv: v3 [stat.me] 20 Feb 2016

arxiv: v3 [stat.me] 20 Feb 2016 Posterior Predictive p-values with Fisher Randomization Tests in Noncompliance Settings: arxiv:1511.00521v3 [stat.me] 20 Feb 2016 Test Statistics vs Discrepancy Variables Laura Forastiere 1, Fabrizia Mealli

More information

What s New in Econometrics. Lecture 1

What s New in Econometrics. Lecture 1 What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and

More information

Sensitivity analysis and distributional assumptions

Sensitivity analysis and distributional assumptions Sensitivity analysis and distributional assumptions Tyler J. VanderWeele Department of Health Studies, University of Chicago 5841 South Maryland Avenue, MC 2007, Chicago, IL 60637, USA vanderweele@uchicago.edu

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

Principles Underlying Evaluation Estimators

Principles Underlying Evaluation Estimators The Principles Underlying Evaluation Estimators James J. University of Chicago Econ 350, Winter 2019 The Basic Principles Underlying the Identification of the Main Econometric Evaluation Estimators Two

More information

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects Guanglei Hong University of Chicago, 5736 S. Woodlawn Ave., Chicago, IL 60637 Abstract Decomposing a total causal

More information

Small-sample cluster-robust variance estimators for two-stage least squares models

Small-sample cluster-robust variance estimators for two-stage least squares models Small-sample cluster-robust variance estimators for two-stage least squares models ames E. Pustejovsky The University of Texas at Austin Context In randomized field trials of educational interventions,

More information

arxiv: v1 [stat.me] 3 Feb 2016

arxiv: v1 [stat.me] 3 Feb 2016 Principal stratification analysis using principal scores Peng Ding and Jiannan Lu arxiv:602.096v [stat.me] 3 Feb 206 Abstract Practitioners are interested in not only the average causal effect of the treatment

More information

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Peter M. Aronow and Cyrus Samii Forthcoming at Survey Methodology Abstract We consider conservative variance

More information

Simulation-based robust IV inference for lifetime data

Simulation-based robust IV inference for lifetime data Simulation-based robust IV inference for lifetime data Anand Acharya 1 Lynda Khalaf 1 Marcel Voia 1 Myra Yazbeck 2 David Wensley 3 1 Department of Economics Carleton University 2 Department of Economics

More information

Ratio of Vector Lengths as an Indicator of Sample Representativeness

Ratio of Vector Lengths as an Indicator of Sample Representativeness Abstract Ratio of Vector Lengths as an Indicator of Sample Representativeness Hee-Choon Shin National Center for Health Statistics, 3311 Toledo Rd., Hyattsville, MD 20782 The main objective of sampling

More information

Propensity score modelling in observational studies using dimension reduction methods

Propensity score modelling in observational studies using dimension reduction methods University of Colorado, Denver From the SelectedWorks of Debashis Ghosh 2011 Propensity score modelling in observational studies using dimension reduction methods Debashis Ghosh, Penn State University

More information

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i, A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type

More information

Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Causal Interaction in Factorial Experiments: Application to Conjoint Analysis Causal Interaction in Factorial Experiments: Application to Conjoint Analysis Naoki Egami Kosuke Imai Princeton University Talk at the Institute for Mathematics and Its Applications University of Minnesota

More information

Lecture 1 January 18

Lecture 1 January 18 STAT 263/363: Experimental Design Winter 2016/17 Lecture 1 January 18 Lecturer: Art B. Owen Scribe: Julie Zhu Overview Experiments are powerful because you can conclude causality from the results. In most

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

CompSci Understanding Data: Theory and Applications

CompSci Understanding Data: Theory and Applications CompSci 590.6 Understanding Data: Theory and Applications Lecture 17 Causality in Statistics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu Fall 2015 1 Today s Reading Rubin Journal of the American

More information

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018 SRMR in Mplus Tihomir Asparouhov and Bengt Muthén May 2, 2018 1 Introduction In this note we describe the Mplus implementation of the SRMR standardized root mean squared residual) fit index for the models

More information

Supplemental Appendix to "Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs"

Supplemental Appendix to Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs Supplemental Appendix to "Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs" Yingying Dong University of California Irvine February 2018 Abstract This document provides

More information

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data Biometrics 000, 000 000 DOI: 000 000 0000 Discussion of Identifiability and Estimation of Causal Effects in Randomized Trials with Noncompliance and Completely Non-ignorable Missing Data Dylan S. Small

More information

Gov 2002: 4. Observational Studies and Confounding

Gov 2002: 4. Observational Studies and Confounding Gov 2002: 4. Observational Studies and Confounding Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last two weeks: randomized experiments. From here on: observational studies. What

More information

SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN

SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN Ying Liu Division of Biostatistics, Medical College of Wisconsin Yuanjia Wang Department of Biostatistics & Psychiatry,

More information

The propensity score with continuous treatments

The propensity score with continuous treatments 7 The propensity score with continuous treatments Keisuke Hirano and Guido W. Imbens 1 7.1 Introduction Much of the work on propensity score analysis has focused on the case in which the treatment is binary.

More information

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. The Sharp RD Design 3.

More information

Variable selection and machine learning methods in causal inference

Variable selection and machine learning methods in causal inference Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of

More information

Growth Mixture Modeling and Causal Inference. Booil Jo Stanford University

Growth Mixture Modeling and Causal Inference. Booil Jo Stanford University Growth Mixture Modeling and Causal Inference Booil Jo Stanford University booil@stanford.edu Conference on Advances in Longitudinal Methods inthe Socialand and Behavioral Sciences June 17 18, 2010 Center

More information

Bootstrapping Sensitivity Analysis

Bootstrapping Sensitivity Analysis Bootstrapping Sensitivity Analysis Qingyuan Zhao Department of Statistics, The Wharton School University of Pennsylvania May 23, 2018 @ ACIC Based on: Qingyuan Zhao, Dylan S. Small, and Bhaswar B. Bhattacharya.

More information

THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B.

THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B. THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B. RUBIN My perspective on inference for causal effects: In randomized

More information

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND

More information

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN Journal of Biopharmaceutical Statistics, 15: 889 901, 2005 Copyright Taylor & Francis, Inc. ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400500265561 TESTS FOR EQUIVALENCE BASED ON ODDS RATIO

More information

Econometric Causality

Econometric Causality Econometric (2008) International Statistical Review, 76(1):1-27 James J. Heckman Spencer/INET Conference University of Chicago Econometric The econometric approach to causality develops explicit models

More information

IEOR 165 Lecture 7 1 Bias-Variance Tradeoff

IEOR 165 Lecture 7 1 Bias-Variance Tradeoff IEOR 165 Lecture 7 Bias-Variance Tradeoff 1 Bias-Variance Tradeoff Consider the case of parametric regression with β R, and suppose we would like to analyze the error of the estimate ˆβ in comparison to

More information

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Department of Economics, Michigan State University East Lansing, MI 48824-1038, United States (email: carls405@msu.edu)

More information

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL BRENDAN KLINE AND ELIE TAMER Abstract. Randomized trials (RTs) are used to learn about treatment effects. This paper

More information

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census

More information

Robustifying Trial-Derived Treatment Rules to a Target Population

Robustifying Trial-Derived Treatment Rules to a Target Population 1/ 39 Robustifying Trial-Derived Treatment Rules to a Target Population Yingqi Zhao Public Health Sciences Division Fred Hutchinson Cancer Research Center Workshop on Perspectives and Analysis for Personalized

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis. Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

A Distinction between Causal Effects in Structural and Rubin Causal Models

A Distinction between Causal Effects in Structural and Rubin Causal Models A istinction between Causal Effects in Structural and Rubin Causal Models ionissi Aliprantis April 28, 2017 Abstract: Unspecified mediators play different roles in the outcome equations of Structural Causal

More information

Analysis of propensity score approaches in difference-in-differences designs

Analysis of propensity score approaches in difference-in-differences designs Author: Diego A. Luna Bazaldua Institution: Lynch School of Education, Boston College Contact email: diego.lunabazaldua@bc.edu Conference section: Research methods Analysis of propensity score approaches

More information

Transparent Structural Estimation. Matthew Gentzkow Fisher-Schultz Lecture (from work w/ Isaiah Andrews & Jesse M. Shapiro)

Transparent Structural Estimation. Matthew Gentzkow Fisher-Schultz Lecture (from work w/ Isaiah Andrews & Jesse M. Shapiro) Transparent Structural Estimation Matthew Gentzkow Fisher-Schultz Lecture (from work w/ Isaiah Andrews & Jesse M. Shapiro) 1 A hallmark of contemporary applied microeconomics is a conceptual framework

More information

Harvard University. Harvard University Biostatistics Working Paper Series

Harvard University. Harvard University Biostatistics Working Paper Series Harvard University Harvard University Biostatistics Working Paper Series Year 2010 Paper 117 Estimating Causal Effects in Trials Involving Multi-treatment Arms Subject to Non-compliance: A Bayesian Frame-work

More information