

An Ordinal Approach to Decomposing Test Score Gaps

David M. Quinn, University of Southern California
Andrew D. Ho, Harvard Graduate School of Education

Background and Purpose

The estimation of racial/ethnic and socioeconomic test score gaps plays an important role in monitoring educational inequality. Since the Coleman Report (Coleman et al., 1966), researchers have decomposed gaps into within- and between-school portions in order to learn about the potential sources of gap formation and where resources might be targeted to narrow gaps (Hanushek & Rivkin, 2006).

A limitation of existing decomposition methods is that they are parametric; as such, they rely on the interval test scale assumption, which is often questionable and difficult to verify (Reardon, 2008; Domingue, 2014). When the interval scale assumption cannot be verified, any monotonic scale transformation is permissible (Reardon, 2008). This is problematic because analytic results can vary meaningfully across such transformations (Bond & Lang, 2013). Research into how sensitive results are to test score transformations in various settings has grown recently, as has the development of approaches for addressing this concern (Barlevy & Neal, 2012; Bond & Lang, 2013; Briggs & Domingue, 2013; Cunha & Heckman, 2008; Nielsen, 2015; Reardon, 2008).

In this study, we develop and evaluate ordinal methods for decomposing test score gaps. These methods offer a transformation-invariant decomposition that can be applied in any situation in which the interval properties of the outcome cannot be established.
Research Design

Methods

We extend the V-gap framework, a framework for estimating transformation-invariant test score gaps (Ho, 2009; Ho & Reardon, 2012), by applying ordinal techniques inspired by Reardon's (2008) three-part parametric gap decomposition, which establishes:

$\delta = \beta_1(1 - VR) + \beta_1 VR + \beta_2 VR$    (1)

where $\delta$ is the total Black-White gap (as an illustrative example), $\beta_1$ is the mean within-school gap (i.e., from a school fixed effects model), $VR$ is the variance ratio index of segregation, and $\beta_2$ is the effect of school proportion Black on student test scores (see Appendix A for detail).

Our decomposition method involves mapping the empirical cumulative distribution functions (ECDFs) by race within schools to the total within-school ECDF. In our simulations (described below), we take the following approach (assuming a Black/White population):

1) Divide the full sample by q quantiles (in the simulations below, q = 10) and assign each student the number corresponding to the bin in which they fall.
2) Within each school s, find $p^{(all)}_{sb}$, the proportion of students whose scores fall into each bin b.
3) Across all schools and bins, multiply $p^{(all)}_{sb}$ by the total number of Black (White) students sampled in school s to get the (approximate) number of Black (White) students who would be in each bin b if the Black (White) distribution in school s equaled the overall distribution in school s, or $n^{(Black)}_{sb}$, $n^{(White)}_{sb}$.
4) Use $n^{(White)}_{sb}$ and $n^{(Black)}_{sb}$ from (3) as weights in a model estimating V. This V represents the total between-school V gap, the ordinal analogue to Reardon's (2008) total between-school gap. We call this $V_{btwn}$.

We also apply alternative procedures in which we map the Black to the White, or the White to the Black, ECDF within each school in step 2.
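Steps 1 to 4 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`decile_bins`, `v_from_bin_weights`, `v_between`) are hypothetical, and V is computed here by direct pair counting over the weighted bins (ties counted as one half) rather than by the model-based estimation the text describes.

```python
import math
from collections import defaultdict
from statistics import NormalDist


def decile_bins(scores, q=10):
    """Step 1: assign each score a bin 1..q from its rank in the pooled sample.

    Tied scores are split by sort order, which is a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bins = [0] * len(scores)
    for rank, i in enumerate(order):
        bins[i] = rank * q // len(scores) + 1
    return bins


def v_from_bin_weights(w_black, w_white):
    """V = sqrt(2) * Phi^{-1}(P(b > w)) from weighted bin counts."""
    tot_b, tot_w = sum(w_black), sum(w_white)
    p, cum_w = 0.0, 0.0
    for wb, ww in zip(w_black, w_white):
        # A Black student in this bin beats all White students in lower bins
        # and ties (counted as one half) with those in the same bin.
        p += (wb / tot_b) * (cum_w + 0.5 * ww) / tot_w
        cum_w += ww
    return math.sqrt(2) * NormalDist().inv_cdf(p)


def v_between(scores, black, school, q=10):
    """Steps 2 to 4: reweight each school's pooled bin proportions by race counts."""
    bins = decile_bins(scores, q)
    counts = defaultdict(lambda: [0] * q)  # students per school and bin
    n_black, n_white = defaultdict(int), defaultdict(int)
    for b, is_black, s in zip(bins, black, school):
        counts[s][b - 1] += 1
        if is_black:
            n_black[s] += 1
        else:
            n_white[s] += 1
    w_black, w_white = [0.0] * q, [0.0] * q
    for s, row in counts.items():
        n_s = sum(row)
        for k, c in enumerate(row):
            p_sb = c / n_s                    # step 2: p^(all)_sb
            w_black[k] += p_sb * n_black[s]   # step 3: n^(Black)_sb
            w_white[k] += p_sb * n_white[s]   # step 3: n^(White)_sb
    return v_from_bin_weights(w_black, w_white)  # step 4
```

Because both groups receive the same within-school bin proportions, any gap that survives comes only from how the two groups are distributed across schools; when every school has the same racial composition the sketch returns a V of zero.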
After these mappings, the resulting V gaps are ordinal analogues to the residual gap that remains after fitting a school fixed effects model, or what Reardon (2008) calls the unambiguously between-school portion of the gap. However, unlike in Reardon's (2008) parametric framework, our non-parametric approach yields different unambiguously between-school gaps depending on which racial group's distribution is treated as the reference. We call these $V_{btwn}^{(B\ to\ W)}$ and $V_{btwn}^{(W\ to\ B)}$, with superscripts indicating whether the Black ECDF was mapped to the White ECDF within school, or vice versa.

Simulations

We employ a multilevel model-based approach to simulate samples from a population with known values for parametric and non-parametric gaps and gap decompositions (see Appendix B for detail on the model and on how we calculate true population values). For each simulated sample, we:

1) Estimate the overall $V_{total}$ (using Stata's rocfit routine): $V_{total} = \sqrt{2}\,\Phi^{-1}(P(b > w))$, where $\Phi^{-1}$ is the inverse of the standard normal CDF and $P(b > w)$ is the probability that a random Black student will score higher than a random White student. For comparison, we also estimate $V_{total}^{(quant)}$, or V after assigning students quantile scores as described above.
2) Generate new scores for students by mapping ECDFs within schools using the procedures described above.
3) Using the new scores from (2), obtain new V statistics, yielding estimates of the ordinal total between-school gap ($V_{btwn}$) or the ordinal unambiguously between-school gaps ($V_{btwn}^{(B\ to\ W)}$ and $V_{btwn}^{(W\ to\ B)}$).
4) We also present the parametric gap estimates for each simulated data set: $\beta_1$, $\beta_2$, the total parametric population gap ($\delta$), and the total parametric between-school gap ($\delta_{btwn}$).

In Table 1, we present the parameter values we use for $\beta_1$ and $\beta_2$ across simulations, along with the parametric and non-parametric parameter values that correspond to each set of values for $\beta_1$ and $\beta_2$.

Results

In Table 2, we present simulation results for the range of parameter values.
As seen in the top panel, $V$ and $V^{(quant)}$ are unbiased across parameter values (in each simulation set, estimates are never statistically different from the true parameter values). For the ordinal decomposition elements $V_{btwn}$, $V_{btwn}^{(B\ to\ W)}$, and $V_{btwn}^{(W\ to\ B)}$, bias is small but increases slightly as parameter values increase; RMSD (bottom panel) also rises slightly with parameter values. The bias-to-true-value ratio (middle panel) may increase with parameter values for $V_{btwn}^{(B\ to\ W)}$. Bias in $V_{btwn}$ is minimal, however; as a reference point, the bias, bias-to-true-value ratio, and RMSD for $V_{btwn}$ are similar to the respective values for its parametric analogue, $\delta_{btwn}$.
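The V statistic and the bias and RMSD summaries reported above can be illustrated with a short Python sketch. It is only a sketch under assumptions: `v_gap` estimates P(b > w) by counting pairs, which is a nonparametric stand-in for the rocfit-based estimate used in the study, and both function names are hypothetical.

```python
import math
from statistics import NormalDist


def v_gap(black_scores, white_scores):
    """V = sqrt(2) * Phi^{-1}(P(b > w)), with P estimated over all pairs.

    Ties count as one half. Only the ordering of scores matters, so any
    strictly increasing rescaling of the test leaves V unchanged.
    """
    wins = 0.0
    for b in black_scores:
        for w in white_scores:
            if b > w:
                wins += 1.0
            elif b == w:
                wins += 0.5
    p = wins / (len(black_scores) * len(white_scores))
    return math.sqrt(2) * NormalDist().inv_cdf(p)


def bias_and_rmsd(estimates, truth):
    """Mean error and root mean square deviation across simulated samples."""
    errors = [e - truth for e in estimates]
    bias = sum(errors) / len(errors)
    rmsd = math.sqrt(sum(err * err for err in errors) / len(errors))
    return bias, rmsd
```

Applying `v_gap` to raw scores and to, say, their exponentials gives the same value, which is the transformation invariance the ordinal approach relies on.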

References

Barlevy, G., & Neal, D. (2012). Pay for percentile. American Economic Review, 102(5).
Bond, T. N., & Lang, K. (2013). The evolution of the Black-White test score gap in Grades K-3: The fragility of results. Review of Economics and Statistics, 95(5).
Briggs, D. C., & Domingue, B. (2013). The gains from vertical scaling. Journal of Educational and Behavioral Statistics, 38(6). DOI: /
Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F. D., & York, R. (1966). Equality of educational opportunity. Washington, D.C.
Cunha, F., & Heckman, J. J. (2008). Identifying and estimating the technology of cognitive and noncognitive skill formation. Journal of Human Resources, 43.
Domingue, B. (2014). Evaluating the equal-interval hypothesis with test score scales. Psychometrika, 79(1). DOI: /S
Hanushek, E. A., & Rivkin, S. G. (2006). School quality and the Black-White achievement gap (No. w12651). National Bureau of Economic Research.
Ho, A. D. (2009). A nonparametric framework for comparing trends and gaps across tests. Journal of Educational and Behavioral Statistics, 34.
Ho, A. D., & Reardon, S. F. (2012). Estimating achievement gaps from test scores reported in ordinal proficiency categories. Journal of Educational and Behavioral Statistics, 37(4).
Nielsen, E. R. (2015). Achievement estimates and deviations from cardinal comparability. Federal Reserve Working Paper. Retrieved from:
Reardon, S. F. (2008). Thirteen ways of looking at the Black-White test score gap. Working paper, Stanford University. Retrieved from:
Reardon, S. F., & Ho, A. D. (2015). Practical issues in estimating achievement gaps from coarsened data. Journal of Educational and Behavioral Statistics, 40(2).

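Table 1. Parameter Values for Simulations.
Column groups: Determining Parameters ($\beta_1$, $\beta_2$); Parametric ($\delta$, $\delta_{btwn}$, (unambig) $\delta_{btwn}$); Ordinal ($V$, $V_{btwn}$, $V_{btwn}^{(B\ to\ W)}$, $V_{btwn}^{(W\ to\ B)}$).
[Table values not preserved in this transcription.]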

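Table 2. Results from Simulations with Varying Parameter Values.
Column groups: Parameters ($\beta_1$, $\beta_2$, $\delta$, $V$); Simulation Results, Parametric ($\beta_1$, $\beta_2$, $\delta$, $\delta_{btwn}$); Simulation Results, Ordinal ($V_{btwn}$, $V_{btwn}^{(B\ to\ W)}$, $V_{btwn}^{(W\ to\ B)}$, $V^{(quant)}$, $V$).
Panels: Bias; Bias/Parameter Ratio (some entries ND); RMSD.
[Table values not preserved in this transcription.]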

Note. ND = not defined. RMSD = root mean square deviation. Simulated school-level sample size = 500; within-school sample size = 25. Each set of parameter values was run for 5000 simulations; simulated data sets that did not converge were discarded.

Appendix A. Reardon's (2008) Three-part Gap Decomposition

Our method is an ordinal analogue to Reardon's (2008) three-part gap decomposition. Reardon showed that the overall black-white test score gap in the population, $\delta$, could be decomposed as:

$\delta = \beta_1(1 - VR) + \beta_1 VR + \beta_2 VR$    (1)

In (1), $\beta_1$ and $\beta_2$ are estimated from the model (with only white and black students):

$Y_{is} = \beta_0 + \beta_1 Black_{is} + \beta_2 \overline{Black}_s + \varepsilon_{is}$,    (2)

where $Y_{is}$ is the test score of student i in school s, $Black_{is}$ is an indicator that the student is black, and $\overline{Black}_s$ is the proportion of sampled students in school s who are black. $VR$ in (1) is the estimated variance ratio index of segregation, which can be expressed as the difference in school proportion black between the average black student and the average white student: $\overline{Black}_s^{(black)} - \overline{Black}_s^{(white)}$.

Reardon calls the first term on the RHS of (1) the unambiguously within-school gap, the center term the ambiguous gap, and the last term the unambiguously between-school gap. The unambiguously between-school gap is the portion of the gap that would remain if black and white mean performance were equalized within schools without altering the relationship between school proportion black and student test scores. The unambiguously within-school gap is the portion of the gap that would be closed if mean performance between black and white students were equalized within schools, without changing the overall school mean.

In Reardon's (2008) three-part decomposition, anchoring the fitted lines for black and white students to the same y-intercept (without changing their slopes) results in the closure of the total within-school gap, and the remaining black-white difference is the unambiguously between-school gap (or what would be left over if the within-school gap from a school fixed effects model were closed). In the parametric decomposition, the resulting unambiguously between-school gap would be the same if the black fitted line were anchored to the y-intercept for the white fitted line, vice versa, or any other location.
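The decomposition in equations (1) and (2) can be made concrete with a small Python sketch. It is illustrative only: the helper names `ols` and `reardon_decomposition` are hypothetical, and the regression is solved through the normal equations, which is adequate for a toy example but not a recommended numerical method for real data.

```python
def ols(columns, y):
    """Least-squares coefficients via the normal equations (tiny problems only)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y]
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(columns[i], y))] for i in range(k)]
    for i in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [x - f * z for x, z in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def reardon_decomposition(y, black, school):
    """Three-part decomposition from equations (1) and (2)."""
    total, n_black_in = {}, {}
    for b, s in zip(black, school):
        total[s] = total.get(s, 0) + 1
        n_black_in[s] = n_black_in.get(s, 0) + b
    # School proportion black, attached to each student
    prop = [n_black_in[s] / total[s] for s in school]
    n_b = sum(black)
    n_w = len(black) - n_b
    # VR: mean school proportion black for black minus for white students
    vr = (sum(p for p, b in zip(prop, black) if b) / n_b
          - sum(p for p, b in zip(prop, black) if not b) / n_w)
    _, b1, b2 = ols([[1.0] * len(y), [float(b) for b in black], prop], y)
    return {
        "VR": vr,
        "within": b1 * (1 - vr),   # unambiguously within-school
        "ambiguous": b1 * vr,      # ambiguous
        "between": b2 * vr,        # unambiguously between-school
        "total": b1 + b2 * vr,     # delta
    }
```

Because the fitted model includes an intercept and the black indicator, the three parts sum exactly to the raw black-white difference in sample means, which gives a quick check on any implementation.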

Appendix B. Drawing Samples for Simulations

We begin with a school-level population distribution of school proportion black (similar to that found in the ECLS-K:99). We randomly sample 500 values from this distribution, and assign each of these schools true school-by-race test score means using:

$M_s^{(b)} \sim N(\mu = .11 + \beta_1 + \beta_2 P_s(B = 1),\ \sigma = .026)$
$M_s^{(w)} \sim N(\mu = .11 + \beta_2 P_s(B = 1),\ \sigma = .011)$

where $M_s^{(b)}$ represents the true mean in school s for black students and $M_s^{(w)}$ represents the same for white students, $\beta_1$ and $\beta_2$ are as defined in equation 1 above, and $P_s(B = 1)$ is the sampled true school proportion black.

After establishing a random sample of school-by-race means in this way, we draw a random sample of 25 students from each school and assign each a race (where $B_i = 1$ means the student is black and $B_i = 0$ means the student is white), where each race draw is a Bernoulli trial with probability equal to the true school proportion black. Then, for each student i in school s, we draw a test score, conditional on the student's race and school-by-race mean:

$Y_{is} \mid (B_i = 1, M_s^{(b)} = \mu_s^{(b)}) \sim N(\mu_s^{(b)},\ \sigma = .97)$
$Y_{is} \mid (B_i = 0, M_s^{(w)} = \mu_s^{(w)}) \sim N(\mu_s^{(w)},\ \sigma = .89)$

We vary $\beta_1$ and $\beta_2$ by simulation in order to understand how the method performs under different population values.

Calculating Population Values for Ordinal Decomposition

For each value of school probability black in our distribution, school-by-race means are normally distributed. Therefore, assuming equally-sized schools for simplicity, the overall (i.e., across schools) white mean at a given school probability black is $.11 + \beta_2 E(B)$ and the overall black mean is $.11 + \beta_2 E(B) + \beta_1$. By the law of total variance, the overall race-specific variances (conditional on school probability black) are sums of the within-school variances and the variance of the means.
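The sampling scheme described in this appendix (school means, then race draws, then scores) can be sketched in Python as follows. Several pieces are assumptions made for illustration: the Beta(0.3, 1.2) draw is a hypothetical stand-in for the ECLS-K-like distribution of school proportion black, the $\sigma$ values are treated as standard deviations, and the function names are not from the study.

```python
import random


def draw_school_props(n_schools, rng):
    """Hypothetical stand-in for the distribution of school proportion black."""
    return [rng.betavariate(0.3, 1.2) for _ in range(n_schools)]


def simulate_sample(beta1, beta2, props, n_per_school=25, rng=None):
    """Return (school, is_black, score) rows for one simulated sample."""
    rng = rng or random.Random(0)
    rows = []
    for s, p in enumerate(props):
        # True school-by-race means; sigma values treated as SDs (assumption)
        mu_b = rng.gauss(0.11 + beta1 + beta2 * p, 0.026)
        mu_w = rng.gauss(0.11 + beta2 * p, 0.011)
        for _ in range(n_per_school):
            # Race is a Bernoulli draw with the school's true proportion black
            is_black = 1 if rng.random() < p else 0
            if is_black:
                score = rng.gauss(mu_b, 0.97)
            else:
                score = rng.gauss(mu_w, 0.89)
            rows.append((s, is_black, score))
    return rows
```

A full replication would call `simulate_sample` once per simulated data set with 500 sampled school proportions and then apply the gap estimators to the resulting rows.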
With the overall race-specific means and variances at each school probability black, we find the overall population CDF for white students by applying the formula for the CDF of a mixture of normal distributions (see footnote 1):

$F_{white}(x) = \sum_p w_p^{(w)} \Phi\left(\frac{x - \mu_p^{(w)}}{\sqrt{\sigma_p^{2(w)}}}\right)$    (B1)

where p indexes a particular school probability black, $\Phi$ is the normal CDF, $w_p^{(w)}$ is a weight giving the proportion of the total white population that attends schools with school proportion black p, or $w_p^{(w)} = \frac{1 - P(b)_p}{\sum_p (1 - P(b)_p)}$, $\mu_p^{(w)}$ is the true white mean across schools with probability black p, and $\sigma_p^{2(w)}$ is the true white variance across schools with probability black p. Similarly, we find the overall population PDF for black students using the formula for the PDF of a mixture of normals:

$f_{black}(x) = \sum_s w_s^{(b)} \phi(x;\ \mu_s^{(b)}, \sigma_s^{2(b)})$,    (B2)

where $\phi$ is the normal PDF and the weight $w_s^{(b)} = \frac{P(b)_s}{\sum_s P(b)_s}$. This gives a total V of:

$V_{total} = \sqrt{2}\,\Phi^{-1}\left(\int \left(\sum_s w_s^{(w)} \Phi\left(\frac{x - \mu_s^{(w)}}{\sqrt{\sigma_s^{2(w)}}}\right)\right)\left(\sum_s w_s^{(b)} \phi(x;\ \mu_s^{(b)}, \sigma_s^{2(b)})\right) dx\right)$    (B3)

Footnote 1: We used the nor1mix package in R to find all mixture PDFs and CDFs.

To solve for the true value of the total between-school V ($V_{btwn}$), we find the overall black PDF and the overall white CDF when the black and white CDFs at each school probability black are equal to the total CDF at that school probability black. We find the overall black PDF by applying the mixture of normals using weight $\frac{P(B) \cdot P(B)}{2000}$ to each conditional (on school probability black) black distribution (where $P(B)$ is the school probability black, and 2000 is the normalizing constant for the distribution) and $\frac{P(B)(1 - P(B))}{2000}$ to each conditional white distribution. To find the overall white CDF, we apply weights $\frac{(1 - P(B)) \cdot P(B)}{2000}$ to the conditional black parameters and $\frac{(1 - P(B))(1 - P(B))}{2000}$ to the conditional white parameters. This yields, for each racial group, the mixture distribution for the overall population that would result if the within-school distributions for each racial group matched the observed combined (black/white) distribution within school. We then find V as in B3, using these newly weighted overall distributions by race.

Mapping Black Distributions to White Distributions Within Schools and Vice Versa

When mapping the black CDF within school to the white CDF, we keep all white parameters unchanged; we also keep all black parameters unchanged for students in schools with $p(b)_s = 1$. For black students in schools where $p(b)_s \neq 1$, we assign them the parameter values of white students in the same school; weighting their new parameter values by $w_s^{(b)}$, we apply the formulas above and find $V_{btwn}^{(B\ to\ W)}$. Following a similar but reversed procedure to map the white distribution to the black distribution within school, we find $V_{btwn}^{(W\ to\ B)}$.
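The population calculation in (B1) to (B3) can be approximated numerically. The Python sketch below is a hedged illustration rather than the authors' procedure (they report using the nor1mix package in R): it integrates the product of the white mixture CDF and the black mixture PDF with the trapezoid rule over a fixed range, treats the $\sigma$ values from this appendix as standard deviations, and uses hypothetical function names.

```python
import math
from statistics import NormalDist


def mixture_v(black_parts, white_parts, lo=-8.0, hi=8.0, steps=4000):
    """V = sqrt(2) * Phi^{-1}( integral of F_white(x) * f_black(x) dx ).

    Each part is (weight, mean, sd); weights are normalized within group.
    The integration range assumes scores roughly on a standard-normal scale.
    """
    def normalized(parts):
        total = sum(w for w, _, _ in parts)
        return [(w / total, NormalDist(m, sd)) for w, m, sd in parts]

    black, white = normalized(black_parts), normalized(white_parts)
    h = (hi - lo) / steps
    area = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        f_black = sum(w * d.pdf(x) for w, d in black)   # (B2)
        F_white = sum(w * d.cdf(x) for w, d in white)   # (B1)
        edge = 0.5 if i in (0, steps) else 1.0          # trapezoid end weights
        area += edge * f_black * F_white
    return math.sqrt(2) * NormalDist().inv_cdf(area * h)  # (B3)


def population_v(beta1, beta2, props):
    """Total V for equally sized schools with the given proportions black."""
    # Law of total variance: within-school variance plus variance of the means
    sd_b = math.sqrt(0.97 ** 2 + 0.026 ** 2)
    sd_w = math.sqrt(0.89 ** 2 + 0.011 ** 2)
    black = [(p, 0.11 + beta1 + beta2 * p, sd_b) for p in props if p > 0]
    white = [(1 - p, 0.11 + beta2 * p, sd_w) for p in props if p < 1]
    return mixture_v(black, white)
```

With a single normal component per group the result can be checked against the closed form $V = \sqrt{2}\,(\mu_b - \mu_w)/\sqrt{\sigma_b^2 + \sigma_w^2}$, which is a convenient sanity test before moving to full mixtures.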

Modeling Mediation: Causes, Markers, and Mechanisms

Modeling Mediation: Causes, Markers, and Mechanisms Modeling Mediation: Causes, Markers, and Mechanisms Stephen W. Raudenbush University of Chicago Address at the Society for Resesarch on Educational Effectiveness,Washington, DC, March 3, 2011. Many thanks

More information

THE INCOME-ACHIEVEMENT GAP AND ADULT OUTCOME INEQUALITY*

THE INCOME-ACHIEVEMENT GAP AND ADULT OUTCOME INEQUALITY* THE INCOME-ACHIEVEMENT GAP AND ADULT OUTCOME INEQUALITY* ERIC R. NIELSEN THE FEDERAL RESERVE BOARD Abstract. This paper discusses various methods for assessing group differences in academic achievement

More information

Data-analysis and Retrieval Ordinal Classification

Data-analysis and Retrieval Ordinal Classification Data-analysis and Retrieval Ordinal Classification Ad Feelders Universiteit Utrecht Data-analysis and Retrieval 1 / 30 Strongly disagree Ordinal Classification 1 2 3 4 5 0% (0) 10.5% (2) 21.1% (4) 42.1%

More information

STAT Section 2.1: Basic Inference. Basic Definitions

STAT Section 2.1: Basic Inference. Basic Definitions STAT 518 --- Section 2.1: Basic Inference Basic Definitions Population: The collection of all the individuals of interest. This collection may be or even. Sample: A collection of elements of the population.

More information

Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix)

Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix) Additional Material for Estimating the Technology of Cognitive and Noncognitive Skill Formation (Cuttings from the Web Appendix Flavio Cunha The University of Pennsylvania James Heckman The University

More information

Impact Evaluation Technical Workshop:

Impact Evaluation Technical Workshop: Impact Evaluation Technical Workshop: Asian Development Bank Sept 1 3, 2014 Manila, Philippines Session 19(b) Quantile Treatment Effects I. Quantile Treatment Effects Most of the evaluation literature

More information

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University

Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Probability theory and statistical analysis: a review Modeling Uncertainty in the Earth Sciences Jef Caers Stanford University Concepts assumed known Histograms, mean, median, spread, quantiles Probability,

More information

Statistical and psychometric methods for measurement: G Theory, DIF, & Linking

Statistical and psychometric methods for measurement: G Theory, DIF, & Linking Statistical and psychometric methods for measurement: G Theory, DIF, & Linking Andrew Ho, Harvard Graduate School of Education The World Bank, Psychometrics Mini Course 2 Washington, DC. June 27, 2018

More information

Week 1 Quantitative Analysis of Financial Markets Distributions A

Week 1 Quantitative Analysis of Financial Markets Distributions A Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October

More information

PhD/MA Econometrics Examination January 2012 PART A

PhD/MA Econometrics Examination January 2012 PART A PhD/MA Econometrics Examination January 2012 PART A ANSWER ANY TWO QUESTIONS IN THIS SECTION NOTE: (1) The indicator function has the properties: (2) Question 1 Let, [defined as if using the indicator

More information

Lecture 2: CDF and EDF

Lecture 2: CDF and EDF STAT 425: Introduction to Nonparametric Statistics Winter 2018 Instructor: Yen-Chi Chen Lecture 2: CDF and EDF 2.1 CDF: Cumulative Distribution Function For a random variable X, its CDF F () contains all

More information

Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil ( )

Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil ( ) Trends in the Relative Distribution of Wages by Gender and Cohorts in Brazil (1981-2005) Ana Maria Hermeto Camilo de Oliveira Affiliation: CEDEPLAR/UFMG Address: Av. Antônio Carlos, 6627 FACE/UFMG Belo

More information

Using Heteroskedastic Ordered Probit Models to Recover Moments of Continuous Test Score Distributions From Coarsened Data

Using Heteroskedastic Ordered Probit Models to Recover Moments of Continuous Test Score Distributions From Coarsened Data Article Journal of Educational and Behavioral Statistics 2017, Vol. 42, No. 1, pp. 3 45 DOI: 10.3102/1076998616666279 # 2016 AERA. http://jebs.aera.net Using Heteroskedastic Ordered Probit Models to Recover

More information

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248)

AIM HIGH SCHOOL. Curriculum Map W. 12 Mile Road Farmington Hills, MI (248) AIM HIGH SCHOOL Curriculum Map 2923 W. 12 Mile Road Farmington Hills, MI 48334 (248) 702-6922 www.aimhighschool.com COURSE TITLE: Statistics DESCRIPTION OF COURSE: PREREQUISITES: Algebra 2 Students will

More information

Hierarchical Linear Modeling. Lesson Two

Hierarchical Linear Modeling. Lesson Two Hierarchical Linear Modeling Lesson Two Lesson Two Plan Multivariate Multilevel Model I. The Two-Level Multivariate Model II. Examining Residuals III. More Practice in Running HLM I. The Two-Level Multivariate

More information

The Nonparametric Bootstrap

The Nonparametric Bootstrap The Nonparametric Bootstrap The nonparametric bootstrap may involve inferences about a parameter, but we use a nonparametric procedure in approximating the parametric distribution using the ECDF. We use

More information

Bivariate Relationships Between Variables

Bivariate Relationships Between Variables Bivariate Relationships Between Variables BUS 735: Business Decision Making and Research 1 Goals Specific goals: Detect relationships between variables. Be able to prescribe appropriate statistical methods

More information

CHAPTER 4 & 5 Linear Regression with One Regressor. Kazu Matsuda IBEC PHBU 430 Econometrics

CHAPTER 4 & 5 Linear Regression with One Regressor. Kazu Matsuda IBEC PHBU 430 Econometrics CHAPTER 4 & 5 Linear Regression with One Regressor Kazu Matsuda IBEC PHBU 430 Econometrics Introduction Simple linear regression model = Linear model with one independent variable. y = dependent variable

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Statistical and psychometric methods for measurement: Scale development and validation

Statistical and psychometric methods for measurement: Scale development and validation Statistical and psychometric methods for measurement: Scale development and validation Andrew Ho, Harvard Graduate School of Education The World Bank, Psychometrics Mini Course Washington, DC. June 11,

More information

Test Score Measurement, Value-Added Models, and the Black-White Test Score Gap

Test Score Measurement, Value-Added Models, and the Black-White Test Score Gap Test Score Measurement, Value-Added Models, and the Black-White Test Score Gap Jeffrey Penney 1 Department of Economics, Queen s University penneyj@econ.queensu.ca version: November 14, 2014 Research in

More information

All rights reserved. Reproduction of these materials for instructional purposes in public school classrooms in Virginia is permitted.

All rights reserved. Reproduction of these materials for instructional purposes in public school classrooms in Virginia is permitted. Algebra I Copyright 2009 by the Virginia Department of Education P.O. Box 2120 Richmond, Virginia 23218-2120 http://www.doe.virginia.gov All rights reserved. Reproduction of these materials for instructional

More information

Equations and Inequalities

Equations and Inequalities Algebra I SOL Expanded Test Blueprint Summary Table Blue Hyperlinks link to Understanding the Standards and Essential Knowledge, Skills, and Processes Reporting Category Algebra I Standards of Learning

More information

Week 2: Review of probability and statistics

Week 2: Review of probability and statistics Week 2: Review of probability and statistics Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

Chapter 1. The data we first collected was the diameter of all the different colored M&Ms we were given. The diameter is in cm.

Chapter 1. The data we first collected was the diameter of all the different colored M&Ms we were given. The diameter is in cm. + = M&M Experiment Introduction!! In order to achieve a better understanding of chapters 1-9 in our textbook, we have outlined experiments that address the main points present in each of the mentioned

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing & z-test Lecture Set 11 We have a coin and are trying to determine if it is biased or unbiased What should we assume? Why? Flip coin n = 100 times E(Heads) = 50 Why? Assume we count 53 Heads... What could

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review

More information

a table or a graph or an equation.

a table or a graph or an equation. Topic (8) POPULATION DISTRIBUTIONS 8-1 So far: Topic (8) POPULATION DISTRIBUTIONS We ve seen some ways to summarize a set of data, including numerical summaries. We ve heard a little about how to sample

More information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information

ECON 482 / WH Hong Binary or Dummy Variables 1. Qualitative Information 1. Qualitative Information Qualitative Information Up to now, we assume that all the variables has quantitative meaning. But often in empirical work, we must incorporate qualitative factor into regression

More information

Explaining Racial/Ethnic Gaps In Spatial Mismatch: The Primacy of Racial Segregation

Explaining Racial/Ethnic Gaps In Spatial Mismatch: The Primacy of Racial Segregation National Poverty Center Working Paper Series #10-02 February 2010 Explaining Racial/Ethnic Gaps In Spatial Mismatch: The Primacy of Racial Segregation Michael A. Stoll, UCLA and Kenya L. Covington, California

More information

Inference in Normal Regression Model. Dr. Frank Wood

Inference in Normal Regression Model. Dr. Frank Wood Inference in Normal Regression Model Dr. Frank Wood Remember We know that the point estimator of b 1 is b 1 = (Xi X )(Y i Ȳ ) (Xi X ) 2 Last class we derived the sampling distribution of b 1, it being

More information

EFFORT INCENTIVES, ACHIEVEMENT GAPS AND AFFIRMA. AND AFFIRMATIVE ACTION POLICIES (very preliminary)

EFFORT INCENTIVES, ACHIEVEMENT GAPS AND AFFIRMA. AND AFFIRMATIVE ACTION POLICIES (very preliminary) EFFORT INCENTIVES, ACHIEVEMENT GAPS AND AFFIRMATIVE ACTION POLICIES (very preliminary) Brent Hickman Department of Economics, University of Iowa December 22, 2008 Brief Model Outline In this paper, I use

More information

ACHIEVEMENT GAP ESTIMATES AND DEVIATIONS FROM CARDINAL COMPARABILITY

ACHIEVEMENT GAP ESTIMATES AND DEVIATIONS FROM CARDINAL COMPARABILITY ACHIEVEMENT GAP ESTIMATES AND DEVIATIONS FROM CARDINAL COMPARABILITY ERIC R. NIELSEN THE FEDERAL RESERVE BOARD Abstract. This paper assesses the sensitivity of standard empirical methods for measuring

More information

Comparing latent inequality with ordinal health data

Comparing latent inequality with ordinal health data Comparing latent inequality with ordinal health data David M. Kaplan University of Missouri Longhao Zhuo University of Missouri Midwest Econometrics Group October 2018 Dave Kaplan (Missouri) and Longhao

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics T H I R D E D I T I O N Global Edition James H. Stock Harvard University Mark W. Watson Princeton University Boston Columbus Indianapolis New York San Francisco Upper Saddle

More information

Measuring Social Influence Without Bias

Measuring Social Influence Without Bias Measuring Social Influence Without Bias Annie Franco Bobbie NJ Macdonald December 9, 2015 The Problem CS224W: Final Paper How well can statistical models disentangle the effects of social influence from

More information
