Comments on The Role of Large Scale Assessments in Research on Educational Effectiveness and School Development by Eckhard Klieme, Ph.D.

David Kaplan
Department of Educational Psychology

The General Theme

This paper addresses a large number of important issues that go well beyond Eckhard's presentation:

1. The paper presents a very nice and succinct history of international large-scale assessments (ILSAs).
2. The paper raises important issues around problems of causal inference in the context of ILSAs.
3. The paper presents some empirical findings that address what can be learned from ILSAs and how ILSAs can be improved for educational effectiveness studies.

In the time I have, I would like to focus on the issues this paper raises regarding causal inference in ILSAs.

ILSA 2011

Important Features of ILSAs

Among the many uses of ILSAs that Eckhard articulates, an important one is that ILSAs can establish a knowledge base regarding educational effectiveness. We come to understand educational effectiveness by unpacking the inputs, processes, and outputs of the educational system. But inputs, processes, and outputs exist, as Eckhard notes, within contexts. This leads to the CIPO model, which underlies our work on PISA and to which Eckhard and others have made substantial contributions.

Unpacking the CIPO model also requires accounting for the fact that these processes are, by definition, mediators, and hence structural specifications are required. The CIPO model, as Eckhard points out, is also multilevel, and therefore multilevel specifications are required. Finally, the CIPO model implies an underlying dynamic, and this needs to be considered as well.

Issues of Causal Inference

Eckhard then goes on to raise issues regarding the limitations of ILSAs for effectiveness studies, and for drawing causal inferences in particular. Naturally, all data collection and design paradigms have limitations (e.g. the generalizability of RCTs), and those raised by Eckhard are important to note. However, I am not as pessimistic about using ILSAs for causal inference as Eckhard seems to be, and I would like to attend to certain issues that I think might have a solution along the lines of the enhancements to ILSAs that Eckhard proposes. Of course, there is always the chance that I am a naive optimist. But first... my biases.

My orientation is toward a structural approach to causal inference, but I am a pluralist in the sense of the philosopher Nancy Cartwright (assuming I understood her). I believe that there are many ways to get at how interventions might lead to change. I also believe that the limitations of design constrain how we might assess causal effects. Sometimes RCTs are needed; sometimes ILSAs are available and should be exploited.

I also believe that a distinction can be made between studying the effects of causes and studying the causes of effects (Holland, 1986). Whereas the focus in education effectiveness (at least in the US) has been on the former via RCTs, I believe that ILSAs are very well suited for the latter. Despite my pluralism, I accept that the potential outcomes framework of Neyman and Rubin is the strongest conceptual framework for causal inference.

To fully realize the utility of ILSAs for examining the causes of effects, we need to worry about some issues that Eckhard raises, but not others. Focusing on PISA, Eckhard understandably worries that ILSAs can't be used for causal inference because prior achievement is not obtained. However, as PISA is a yield study and not an achievement test, it is unclear what prior yield would mean in this context. Of course, longitudinal data would be ideal, but then yield would indicate a school or classroom effect, changing, perhaps, the meaning and goals of PISA.

Eckhard is also concerned that ANCOVA or propensity score balancing might not be successful because a prior achievement variable is not available. However, as nice as having prior achievement would be, we know (e.g. Steiner et al., 2010) that we can achieve very good balance with a judicious choice of covariates, reliably measured and relevant to the selection mechanism. QEGs need to consider this carefully.

It is this judicious choice of covariates that Eckhard raises as the most important limitation of ILSAs, and I agree. Regardless of whether one wants to use ILSAs to study a particular quasi-experimental effect or (but not in opposition to this) to specify a large multilevel structural model, the choice of covariates and how they are measured is of utmost importance. The issue of how the inputs and processes are measured allows one to consider the counterfactual/potential outcomes theory of causal inference more seriously in the ILSA context.

My view, which I have shared with Eckhard, examines how ILSAs can be used within the counterfactual/potential outcomes framework, rooted in Mackie's (1980) idea of the INUS condition, Woodward's (2003) ideas of manipulability, and Hoover's (e.g. 2001) structural econometric perspective.

The Counterfactual Definition of Causation

To keep things simple, what does it mean to say x causes y? Based on Hume's (second) definition of causality, the counterfactual definition of causation is:

"If x occurs then y occurs, and y would not have occurred if x had not."

This is the operating definition underlying the potential outcomes literature (e.g. Neyman, 1923; Rubin, 1974). The clause "...if x had not" is the counterfactual. Under potential outcomes, we must imagine the possibility of an individual causal effect

$\delta_i = y_i^1 - y_i^0 \qquad (1)$

where $\delta_i$ is the treatment effect for individual $i$, that is, the difference between that individual's outcome under the treatment condition ($y_i^1$) and under the control condition ($y_i^0$).
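The individual causal effect in equation (1) can never be computed from real data, because only one of the two potential outcomes is observed for each individual. The following is a minimal simulation sketch of that point, not an analysis of any ILSA data. The outcome scale (mean 500, SD 100), the effect size of 5 points, and all function and variable names are illustrative assumptions.

```python
import random


def simulate_units(n=2000, true_effect=5.0, seed=0):
    """Generate hypothetical units with BOTH potential outcomes (assumed values)."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(500.0, 100.0)                  # outcome under control
        y1 = y0 + true_effect + rng.gauss(0.0, 10.0)  # outcome under treatment
        treated = rng.random() < 0.5                  # randomized assignment
        units.append((y0, y1, treated))
    return units


def mean(values):
    values = list(values)
    return sum(values) / len(values)


units = simulate_units()

# Individual effects delta_i = y1_i - y0_i are available only because this
# is a simulation in which both potential outcomes were generated.
true_ate = mean(y1 - y0 for y0, y1, _ in units)

# With real data, each unit reveals only the outcome for its own condition,
# so the average effect is estimated from a difference in group means.
treated_mean = mean(y1 for _, y1, t in units if t)
control_mean = mean(y0 for y0, _, t in units if not t)
estimated_ate = treated_mean - control_mean

print(f"average of individual effects: {true_ate:.2f}")
print(f"difference in observed means:  {estimated_ate:.2f}")
```

Under randomized assignment the difference in observed means is an unbiased estimate of the average of the individual effects, although it is much noisier than the (unobservable) average itself.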

Mackie and the INUS Condition

This definition is too strong, because y can occur even if x had not. We need to understand the context under which x occurred.

Let A, B, C, etc., be a list of factors that lead to some effect whenever some conjunction of the factors occurs. For example, take three conditions under which reading development can take place: incoming phonemic awareness, teacher competency, and parental reading activities. A conjunction of events may be ABC or DEF or JKL, etc. This allows for the possibility that ABC might be a cause or DEF might be a cause, etc.

Assume the set is finite. Then the disjunction (ABC or DEF or JKL) is a condition that is both necessary and sufficient for the effect to occur. Each specific conjunction, such as ABC, is sufficient but not necessary for the effect.

Mackie (cont'd)

For example:

- ABC is a minimal sufficient condition insofar as none of its conjuncts is redundant.
- AB is not sufficient for the effect.
- A itself is neither a necessary nor a sufficient condition for the effect.

However, the single factor A is related to the effect in an important fashion, viz.:

"[I]t is an insufficient but non-redundant part of an unnecessary but sufficient condition: it will be convenient to call this... an inus condition." (pg. 62)

If that is confusing, consider calling it a Necessary Element of a Sufficient Set.
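Mackie's definitions can be checked mechanically on the ABC / DEF / JKL example. The sketch below encodes the effect as a disjunction of conjunctions and tests sufficiency, necessity, and the INUS property directly. The function names are hypothetical, and the assumption that these three conjunctions are the complete list of sufficient sets is made only for illustration.

```python
# Assumed complete list of minimal sufficient conjunctions: ABC or DEF or JKL.
SUFFICIENT_SETS = [frozenset("ABC"), frozenset("DEF"), frozenset("JKL")]
ALL_FACTORS = frozenset().union(*SUFFICIENT_SETS)


def effect_occurs(present):
    """The effect occurs when at least one full conjunction is present."""
    present = frozenset(present)
    return any(conj <= present for conj in SUFFICIENT_SETS)


def is_minimal_sufficient(factors):
    """Sufficient, and no conjunct can be dropped without losing sufficiency."""
    factors = frozenset(factors)
    if not effect_occurs(factors):
        return False
    return all(not effect_occurs(factors - {f}) for f in factors)


def is_necessary_factor(factor):
    """A factor is necessary if the effect cannot occur in its absence,
    even when every other factor is present."""
    return not effect_occurs(ALL_FACTORS - {factor})


def is_inus(factor):
    """Insufficient but Non-redundant part of an Unnecessary but Sufficient
    condition."""
    if effect_occurs({factor}):  # the factor alone must be insufficient
        return False
    for conj in SUFFICIENT_SETS:
        if factor not in conj or not is_minimal_sufficient(conj):
            continue
        # The conjunction is unnecessary if the effect can still occur
        # when one of its members is absent.
        if any(effect_occurs(ALL_FACTORS - {f}) for f in conj):
            return True
    return False


print(is_minimal_sufficient("ABC"))  # ABC: sufficient with no redundant part
print(effect_occurs("AB"))           # AB alone does not produce the effect
print(is_necessary_factor("A"))      # DEF or JKL can produce it without A
print(is_inus("A"))                  # A satisfies the INUS definition
```

Because the effect is monotone in the factors (adding a factor never prevents it), checking necessity against the single scenario "everything present except this factor" is enough in this toy setting.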

Mackie (cont'd)

But what about the remaining relevant causes? According to Mackie, these remaining causes are relegated to the causal field, which includes what we recognize as the disturbance term. Hoover argues that the notion of a causal field has to be expanded for Mackie's ideas to be relevant to indeterministic problems. If the parameters of a causal question are truly constant or mostly stable, then they can be relegated to the causal field without too much worry. Prior cycles of PISA, say, would help determine this.

"The causal field is a background of standing conditions and, within the boundaries of validity claimed for the causal relation, must be invariant to exercises of controlling the consequent by means of the particular causal relation (INUS condition) of interest" (Hoover, 2001, pg. 222).

The INUS condition should be an attractive idea to those of us working on the development and application of ILSAs to support causal inferences, because it focuses our attention on some aspect of the causal problem without having to be concerned directly with knowing every minimal sufficient subset of the full cause of the event. In fact, I would argue that a fruitful collaboration of the content area expert groups (REG, MEG, SEG) with the QEG, with focused attention on instantiating INUS causal variables, could lead to ILSAs being used more productively for educational effectiveness studies.

Woodward and the Manipulability Theory of Causation

In many situations, the goal is to change x, i.e. to intervene on x in order to change y. In practice, we may want to engage in model-based simulations involving a hypothetical change in x as a precursor to an experiment, or to gain insights into potential problems when scaling up an intervention. Intervention on x requires that x be measured so as to support a counterfactual proposition.

Recently, a manipulability theory of causation was put forth by Woodward (2003) as an attempt to provide a firmer foundation for causal explanation. For Woodward, a causal explanation is an explanation that provides information for purposes of manipulation and control. An important aspect of Woodward's theory involves clarifying the problem of intervention and invariance.

Invariance

Intervention relates to the notion of an ideal experimental manipulation of a variable, x, with the goal of determining whether this ideal manipulation changes y, and whether no change in y would have occurred without the change in x. For Woodward, a necessary and sufficient condition for the description of a causal relationship is that it be invariant under some appropriate set of interventions. If a purported causal relationship is not invariant under a set of interventions, it is not, in fact, a causal relationship and indeed might only describe a correlation.

Statistically, invariance concerns the extent to which the parameters of the conditional distribution do not change when there are changes in the parameters of the marginal distribution. This leads to the problem of exogeneity.
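The statistical reading of invariance can be illustrated with a small simulation: estimate the slope of y on x before and after the marginal distribution of x is changed from outside, and see whether the conditional parameter moves. This is a sketch under assumed values only. The data-generating model, the unobserved common cause z, and the coefficients `beta` and `gamma` are hypothetical and are not drawn from PISA or from Woodward.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least-squares regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def estimated_slope(beta, gamma, intervene, rng, n=20000):
    """Simulate y = beta*x + gamma*z + noise and regress y on x.

    Without intervention, x depends on the unobserved z. With intervention,
    the marginal distribution of x is set from outside, independently of z.
    """
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        if intervene:
            x = rng.gauss(2.0, 1.5)      # new marginal for x, chosen arbitrarily
        else:
            x = z + rng.gauss(0.0, 1.0)  # x partly driven by z
        y = beta * x + gamma * z + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    return ols_slope(xs, ys)


rng = random.Random(1)

# Case 1: no common cause (gamma = 0). The slope should stay near beta.
invariant_before = estimated_slope(1.0, 0.0, intervene=False, rng=rng)
invariant_after = estimated_slope(1.0, 0.0, intervene=True, rng=rng)

# Case 2: common cause present (gamma = 2). Analytically the observational
# slope is beta + gamma/2 under these assumptions, and it falls back to
# beta once x is set independently of z.
confounded_before = estimated_slope(1.0, 2.0, intervene=False, rng=rng)
confounded_after = estimated_slope(1.0, 2.0, intervene=True, rng=rng)

print(f"no common cause:   {invariant_before:.2f} -> {invariant_after:.2f}")
print(f"with common cause: {confounded_before:.2f} -> {confounded_after:.2f}")
```

In the first case the conditional relationship survives the change in the marginal distribution of x, which is what invariance requires. In the second case the regression slope shifts, signalling that the observational relationship was not a purely causal one.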

How might we conceive of this invariance and exogeneity problem in the context of ILSAs? Over cycles of PISA (for example), we can examine statistical relationships among variables that have been repeatedly measured, say for trend. Examining these relationships with an eye toward major country-level policy shifts, we could then examine whether these relationships have remained somewhat stable. Given the careful nature of PISA sampling (particularly because of its need to preserve trend), any clear change in relationships of interest might signal super-exogenous changes in policy that have led to problems of parameter invariance. Careful linkages with system-level data provided by the OECD could give insights into the nature of these changes.

What does this have to do with ILSAs?

In my view, ILSAs provide the data collection framework to measure variables that can be used to help define the causal field in which interventions take place. This is the C in the CIPO model that Eckhard discusses.

ILSAs can be enhanced to ask specific questions regarding the inputs and processes of schooling. Here, Eckhard and I are in complete agreement, and Eckhard's paper does a nice job of pointing out possible enhancements. However, these questions should be measured in such a way as to support a hypothetical manipulability question of the sort "what if things were different?"

I'm less concerned about outputs. ILSAs such as PISA deal with them very well.

What about

For an analytic framework to support these ideas, I believe that multilevel structural modeling within a Bayesian perspective is most useful. Why?

1. Multilevel SEM (Muthén & Satorra, 1995; Kaplan & Kreisman, 2000) can address mediating and moderating effects within a multilevel framework (à la Levin).
2. The Bayesian perspective on multilevel SEM (Kaplan & Depaoli, in press) allows one to incorporate prior information into model estimation via careful elicitation (e.g. from prior cycles of PISA, consultation with PISA expert groups, etc.).
3. Bayesian multilevel SEM directs us away from some of the internal inconsistencies of null hypothesis testing and toward model selection based on predictive capacity.
4. Bayesian multilevel SEM (and conventional SEM, for that matter) can allow for model-based simulations of counterfactuals within varying context settings. This can direct us toward evolutionary knowledge development within the CIPO framework of schooling.
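The second point, incorporating prior information from earlier cycles, can be illustrated in its simplest form with a conjugate normal update for a single coefficient. This is only a toy sketch of the mechanism, not Bayesian multilevel SEM. The function name and all numbers (a prior centred on 0.30 from a hypothetical earlier cycle, a current estimate of 0.50) are assumptions made for illustration.

```python
import math


def normal_posterior(prior_mean, prior_sd, estimate, std_error):
    """Combine a normal prior with a normally distributed estimate.

    The posterior precision is the sum of the two precisions, and the
    posterior mean is the precision-weighted average of the two means.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / std_error ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * estimate) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd


# Hypothetical values: a prior elicited from an earlier assessment cycle
# and an estimate from the current cycle, both with SD / SE of 0.10.
post_mean, post_sd = normal_posterior(
    prior_mean=0.30, prior_sd=0.10, estimate=0.50, std_error=0.10
)
print(f"posterior mean {post_mean:.3f}, posterior sd {post_sd:.3f}")
```

With equal precisions the posterior mean sits halfway between the prior mean and the current estimate, and the posterior standard deviation is smaller than either input. A more diffuse prior (larger `prior_sd`) would let the current data dominate.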

Conclusion

Although I might not share Eckhard's pessimism regarding the use of ILSAs for causal inference, his paper gives us a great deal to think about and is well worth reading. It is extremely important to consider the enhancements he suggests. But I also believe we have the methodologies and philosophical underpinnings in place to engage in bold modeling efforts. The issues raised in his paper need to be taken very seriously if we are to move forward in developing ILSAs to support an evolutionary knowledge base of research on educational effectiveness and school development in an international context.