Education Production Functions. April 7, 2009


Outline
- Production Functions for Education
- Hanushek Paper
- Card and Krueger
- Tennessee STAR Experiment
- Maimonides' Rule

What do I mean by Production Function?
I mean just the standard thing. In general,
$Y = F(X_1, X_2)$
where $Y$ is output (measured in dollars) and $X_1$ and $X_2$ are inputs. A firm then chooses the level of inputs to maximize profit:
$\pi = F(X_1, X_2) - P_1 X_1 - P_2 X_2$

Is this a useful concept for thinking about education?
Problem 1: Do schools really maximize profit? Maybe not, but it probably doesn't matter. Suppose that $F$ was indeed the true relationship between $X_1$, $X_2$, and $Y$. If we had data on $X_1$, $X_2$, and $Y$ we could learn $F$. Then, since we understand this, we could choose the optimal levels of $X_1$ and $X_2$. Thus knowing $F$ is certainly useful, but...

Problem 2: Is $F$ really knowable? Can we really conceptualize $Y$, $X_1$, and $X_2$? Even if it were all measurable, it's not clear that you and I could agree on what $Y$ actually is. Presumably, even if this isn't perfect, we can learn something from it.

Y
So what is $Y$? Ideally it would be utility. (In fact, with externalities, ideally it would be everybody's utility, not just the guy who gets it.) This is great but not particularly practical.

Other things we might want to get are:
- Wage levels
- Employment
- Job Satisfaction
- Patriotism
- Knowledge of Politics
- Technical Ability
- Creativity
- Intellectual Depth
- Kindness
- Criminal Activity

Some of these things are possible to get, but most aren't. Instead we need to use what we have. Often we don't have long time spans, so we need to use things that are measured while students are in school or shortly thereafter.
- The main one is test scores
- Others are high school graduation and college attendance

How do we use test scores?
With test scores there are multiple ways to run regressions. Let $T_{ig}$ be the test score for child $i$ in grade $g$.
- Levels: $T_{ig} = X_{ig}\beta + u_{ig}$
- Pure gains: $T_{ig} - T_{i,g-1} = X_{ig}\delta + v_{ig}$
- New level conditional on old level: $T_{ig} = \alpha T_{i,g-1} + X_{ig}\gamma + \varepsilon_{ig}$
There are advantages and disadvantages of each.
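To see how the three specifications can disagree, here is a minimal simulation sketch. Everything about the data-generating process is an assumption made for illustration: one input, a true persistence of 0.7, a true input effect of 2.0, and an input that is positively correlated with last year's score. The helper names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(y, x):
    """Residuals from regressing y on x (with an intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

rng = random.Random(0)
n = 20000
ALPHA, GAMMA = 0.7, 2.0  # assumed persistence and assumed input effect

t_prev = [rng.gauss(0, 1) for _ in range(n)]
# Assumption: the input is correlated with last year's score.
x = [0.5 * t + rng.gauss(0, 1) for t in t_prev]
t_now = [ALPHA * t + GAMMA * xi + rng.gauss(0, 1) for t, xi in zip(t_prev, x)]

# Levels: regress this year's score on the input only.
beta_levels = slope(x, t_now)
# Pure gains: regress the change in score on the input.
delta_gains = slope(x, [a - b for a, b in zip(t_now, t_prev)])
# Conditional on old level: partial the lagged score out of both sides
# (Frisch-Waugh), then regress residual on residual.
gamma_value_added = slope(residualize(x, t_prev), residualize(t_now, t_prev))
```

Under these assumed parameters the population coefficients work out to about 2.28 for levels and 1.88 for gains, while the specification that conditions on the old level recovers 2.0. With a different assumed process (for example, true persistence equal to one) the ranking would change, which is one way to read "advantages and disadvantages of each."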

Inputs
Now what are the inputs? Ideally things like:
- Curriculum and how a teacher teaches different subjects
- How much time she spends with each student in the class
- How kids in the class interact
- How classroom materials help kids learn
Ultimately these things are hard to measure.

Whether this is what you want may actually depend on who you are. Suppose you are the head of the school board: you can control the different types of resources that go to schools, but you do not have that much influence over precisely how they are used. In this case we would imagine things like:
- Teacher salaries
- Teacher qualifications (clearly there is an interaction between the first two)
- Number of teachers per child
- Number of administrators
- Money for classroom resources (e.g. books)
- Money for school structures (e.g. gyms)

What Data can we get?
Getting data on all of this is very difficult, but often we can get data on:
- Teacher/student ratio
- Per-pupil expenditure
- Fraction of teachers with a Master's degree
- Teacher experience
- Teacher salary

Hanushek paper in Journal of Economic Literature
Hanushek's paper is mostly about education production functions, but he discusses a number of other things as well. I think this reads very well as a discussion of many of the most important issues in the economics of education literature, so I will go through all of his tables.

He starts with some raw descriptive information about how things have changed over time in the U.S.

Next he summarizes the results from a lot of studies in the following way

Hedges et al. Critique of Hanushek
The way he did it is somewhat controversial and informal. There is a more formal way to combine studies using more sophisticated methods. Hedges, Laine, and Greenwald reanalyze Hanushek's data in a paper entitled "Does Money Matter? A Meta-Analysis of Studies of the Effects of Differential School Inputs on Student Outcomes." I am not an expert on meta-analysis and this is not a technique that is common in economics, so I do not want to get into too much detail. However, I will give the basic flavor.

First, a fact. Suppose that $Y \sim F$, which means that $\Pr(Y \le y) = F(y)$. Now suppose we construct a random variable in the following way: we randomly draw $Y$ from the distribution $F$ and then we construct $F(Y)$. Assume that $F$ is continuous. What is the distribution of $F(Y)$? To see this, let $y_{0.5}$ be the median. By definition,
$0.5 = \Pr(Y \le y_{0.5}) = \Pr(F(Y) \le F(y_{0.5})) = \Pr(F(Y) \le 0.5)$

But this is not just true at the median; it is true anywhere, so that if $y_q$ is the $q$th quantile,
$q = \Pr(Y \le y_q) = \Pr(F(Y) \le F(y_q)) = \Pr(F(Y) \le q)$
but this means that $F(Y)$ is uniform on $[0, 1]$.
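A quick numerical check of this fact, as a sketch: the exponential distribution with rate 1 is an arbitrary choice for $F$ (any continuous distribution would do), and the function names are made up for the example. If $F(Y)$ is uniform, the share of transformed draws at or below $q$ should be close to $q$.

```python
import math
import random

def exponential_cdf(y, rate=1.0):
    """F(y) for an exponential distribution (assumed example of F)."""
    return 1.0 - math.exp(-rate * y)

def share_below(values, q):
    """Fraction of values that are <= q."""
    return sum(v <= q for v in values) / len(values)

rng = random.Random(0)
draws = [rng.expovariate(1.0) for _ in range(20000)]   # Y drawn from F
transformed = [exponential_cdf(y) for y in draws]      # F(Y)

# For a uniform variable, Pr(F(Y) <= q) should be about q.
shares = {q: share_below(transformed, q) for q in (0.1, 0.25, 0.5, 0.9)}
```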

Notice that a p-value is just a special case of this. If our null is $\beta_1 = 0$, then under the null
$\hat{\beta}_1 / se(\hat{\beta}_1) \sim N(0, 1)$
so $\Phi\left(\hat{\beta}_1 / se(\hat{\beta}_1)\right)$ will be uniform. Now suppose we have a bunch of different samples that all test whether $\beta_1 = 0$. As long as the samples are independent, we have a bunch of independent p-values (under the null). This is the basis of a test that combines information across studies. Hedges and coauthors use this basic approach:
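One classical way to turn independent uniform p-values into a single test is Fisher's inverse chi-square method: under the joint null, $-2 \sum_i \ln p_i$ is chi-square with $2k$ degrees of freedom for $k$ studies. The sketch below is only meant to illustrate the idea; I am not claiming it reproduces the exact procedure in the Hedges et al. paper, and the input p-values are made up.

```python
import math

def fisher_combined_pvalue(p_values):
    """Fisher's method for k independent p-values.

    Returns the statistic -2 * sum(log p) and its upper-tail probability
    under a chi-square distribution with 2k degrees of freedom. For even
    degrees of freedom the tail has the closed form
    exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    tail = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))
    return statistic, tail

# Hypothetical example: three studies, none significant at 5% on its own.
statistic, combined_p = fisher_combined_pvalue([0.08, 0.08, 0.08])
```

In this made-up example the combined p-value falls below 0.05 even though no single study does, which is the sense in which combining studies can find evidence that a simple tally of significant results misses.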

Further we might want to get effect sizes. Again if we have a bunch of estimators of β 1 with standard errors, as long as they are independent we can combine them. I will not talk about the details of this.
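For a rough idea of what combining estimates can look like, here is a sketch of the standard fixed-effect (inverse-variance weighted) average. This is a textbook method chosen for illustration, not necessarily what any of the papers here do, and the numbers are invented.

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted average of independent estimates.

    Assumes every study estimates the same underlying coefficient.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical inputs: two studies with equal precision.
pooled, pooled_se = pool_fixed_effect([0.1, 0.3], [0.2, 0.2])
```

More precise studies get more weight, and the pooled standard error is smaller than that of any single study.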

It really isn't so obvious how to put this together.
- Hedges et al. argue that there is real evidence that "money matters."
- Hanushek responds that he is really looking at a different question.
- Their null hypothesis is that there is no study for which money matters.
- His null is that the empirical literature does not reach a strong conclusion.
All of these studies suffer from a big problem. They all essentially regress outcomes on these inputs. But inputs are not randomly assigned.

Card and Krueger used a fixed-effect type approach. They propose the framework
$y_{ijkc} = \delta_{jc} + \mu_{kc} + X_{ijkc}\beta_c + E_{ijkc}(\gamma_{jc} + \rho_{rc}) + \epsilon_{ijkc}$
where
- $i$: individual
- $j$: state of birth
- $c$: cohort
- $k$: current state
- $r$: region
- $X$: observable stuff
- $E$: education
They interpret $\gamma_{jc}$ as a measure of school quality.

They then regress
$\gamma_{jc} = a_j + Q_{jc} b$
where $Q$ picks up measures of quality of education. They estimate the model in two steps. They do more than just this, but let's focus on the main results.

Two more recent papers deal with this in a better way:
- Krueger uses data from an experiment with random assignment
- Angrist and Lavy come up with a creative instrument

Random Assignment
Using notation similar to the Krueger paper we can write
$Y_{is} = a S_s + b F_i + \alpha_s + \varepsilon_i$
where
- $i$ represents student $i$
- $s$ represents school $s$
- $S_s$ is observed characteristics of school $s$
- $F_i$ is observed family background variables of student $i$
- $\alpha_s$ represents unobserved characteristics of school $s$
- $\varepsilon_i$ represents unobserved characteristics of student $i$
Thus the error term is $\alpha_s + \varepsilon_i$.

What do we need for OLS to be consistent? We need
$0 = cov(S_s, \alpha_s + \varepsilon_i) = cov(S_s, \alpha_s) + cov(S_s, \varepsilon_i)$
and
$0 = cov(F_i, \alpha_s + \varepsilon_i) = cov(F_i, \alpha_s) + cov(F_i, \varepsilon_i)$

I am worried about all of these things, but I am particularly worried about
$cov(S_s, \varepsilon_i) \neq 0$
One would think that more highly motivated or richer parents would tend to send their kids to better schools. Thus, does finding that school resources are associated with positive outputs indicate that the school inputs are valuable, or that good kids go to good schools?

Social Experiments
One solution to this problem is random assignment. If we could randomly assign kids to schools, then by construction $S_s$ would be uncorrelated with $\varepsilon_i$. Randomly assigning kids to schools is almost impossible, but randomizing kids to classes within schools is not. A famous experiment in Tennessee did just that.
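The selection worry and the way randomization removes it can be sketched with a small simulation. All numbers are assumptions for illustration: a true resource effect of 1.0, and, in the non-random case, resources that load on the student's unobserved characteristics with weight 0.8. The function names are hypothetical.

```python
import random

def ols_slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(random_assignment, n=20000, seed=0):
    """Return the OLS slope of outcomes on school resources."""
    rng = random.Random(seed)
    true_effect = 1.0  # assumed effect of resources on outcomes
    resources, outcomes = [], []
    for _ in range(n):
        eps = rng.gauss(0, 1)  # unobserved student characteristics
        if random_assignment:
            s = rng.gauss(0, 1)              # resources unrelated to eps
        else:
            s = 0.8 * eps + rng.gauss(0, 1)  # "good kids go to good schools"
        resources.append(s)
        outcomes.append(true_effect * s + eps)
    return ols_slope(resources, outcomes)

slope_sorted = simulate(random_assignment=False)
slope_random = simulate(random_assignment=True)
```

With sorting, the slope overstates the assumed effect because part of what it picks up is the students, not the resources; with random assignment it is close to 1.0.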

The Tennessee Student/Teacher Achievement Ratio (STAR) Experiment
- Began during the 1985-1986 school year in Tennessee
- Kindergarten classes were divided into three types:
  - Small classes: 13-17 students
  - Regular classes: 22-25 students
  - Regular classes with aide: 22-25 students and an aide
- Only schools big enough to allow at least one of each type were eligible
- Students were randomly assigned to classes
- Teachers were randomly assigned to classes
- Kids stay in the same type of class for four years
- Kids are tested at the end of each year
- A total of 11,600 students and 80 schools were used

Unfortunately, people are not like test tubes. They do not necessarily do what the people running the experiment intended them to do (as in the Milwaukee case). We see a number of problems:
- Random assignment after kindergarten was done again for Regular vs. Regular/Aide (although small classes were OK)
- Some kids switched from Small to Regular or vice versa (because of problems with kids or because parents complained)
- Kindergarten was not required, so many students began school in first grade

As in Milwaukee, students leave school (they either move or go to private school).
- This would not be that big a deal if it happened at random
- It is probably not random: parents might be angry that their kid was assigned to a regular classroom
- This is called nonrandom attrition
- Not a small deal: about 1/2 of kids present in kindergarten were not there in at least one subsequent year
Let's look at some of the raw data.

OLS Estimation
How do we use this to estimate the effect? It is not quite as clean as one might like because it is random assignment by class, not random assignment by school. Krueger estimates the following model:
$Y_{ics} = \beta_0 + \beta_1 \text{Small}_{cs} + \beta_2 \text{Reg/A}_{cs} + \beta_3 X_{ics} + \alpha_s + \varepsilon_{ics}$

where now
- $i$ represents student $i$
- $c$ represents class $c$
- $s$ represents school $s$
- $X_{ics}$ is other observed stuff
- $\alpha_s$ represents unobserved characteristics of school $s$
- $\varepsilon_{ics}$ represents unobserved characteristics of student $i$
A major idea here is that $\alpha_s$ allows for a school-specific shock. Thus all of the variation that is used comes from within the school.
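A sketch of what "within the school" means in practice: subtract school means from both the outcome and the class-type indicator, then run OLS on the demeaned data. The simulated design below is purely hypothetical (a single small-class dummy with an assumed effect of 0.2, and schools with better unobservables assumed to have a larger share of small classes) and is there only to show how a school effect can contaminate a pooled comparison.

```python
import random
from collections import defaultdict

def ols_slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_within(values, groups):
    """Subtract each group's mean from its members' values."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

rng = random.Random(0)
TRUE_EFFECT = 0.2  # assumed small-class effect
school, small, score = [], [], []
for s in range(80):
    alpha_s = rng.gauss(0, 1)  # unobserved school effect
    # Assumption for illustration: better schools have more small classes.
    share_small = 0.5 if alpha_s > 0 else 0.2
    for _ in range(200):
        d = 1.0 if rng.random() < share_small else 0.0
        school.append(s)
        small.append(d)
        score.append(TRUE_EFFECT * d + alpha_s + rng.gauss(0, 1))

pooled = ols_slope(small, score)  # ignores the school effect
within = ols_slope(demean_within(small, school), demean_within(score, school))
```

The demeaned regression is numerically the same as including a dummy for every school, so the school effect drops out and only within-school comparisons identify the coefficient.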

Class Size Effect
This still doesn't answer precisely what we are interested in.
- What does "small class" mean? We want the effect of adding one more student; the magnitude here is hard to interpret
- People switched classes and were able to take other things after kindergarten
How can we deal with that? Let's think about the model
$Y_{ics} = \beta_0 + \beta_1 C_{cs} + \beta_2 X_{ics} + \alpha_s + \varepsilon_{ics}$
Can we just run OLS with the experimental sample?

No, $C_{cs}$ is not randomly assigned, for a number of different reasons. This wouldn't really use the experiment. However, we can do IV (somewhat like in the Milwaukee case):
- Use kindergarten assignment as an instrument
- It will be correlated with class size for sure
- It will be uncorrelated with everything else for sure
Thus, it solves both problems!
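Here is a stylized sketch of that IV idea with a single instrument and no covariates, where the IV estimate is just $cov(Z, Y) / cov(Z, C)$. The simulated switching behavior is an assumption for illustration: 30% of kids are initially assigned to small classes, and half of the unassigned kids with high unobservables move into a small class anyway. The assumed effect is -0.2 points per additional student.

```python
import random

def cov(x, y):
    """Sample covariance (divisor n; it cancels in the ratios below)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

rng = random.Random(0)
TRUE_EFFECT = -0.2  # assumed effect of one more student on the score
assigned_small, class_size, score = [], [], []
for _ in range(20000):
    eps = rng.gauss(0, 1)                    # unobserved student characteristics
    z = 1.0 if rng.random() < 0.3 else 0.0   # random initial assignment
    # Assumption: some high-eps kids not assigned to small classes switch in.
    switched = z == 0.0 and eps > 1.0 and rng.random() < 0.5
    in_small = z == 1.0 or switched
    c = (15.0 if in_small else 23.0) + rng.gauss(0, 1)  # actual class size
    assigned_small.append(z)
    class_size.append(c)
    score.append(50.0 + TRUE_EFFECT * c + eps)

ols = cov(class_size, score) / cov(class_size, class_size)
iv = cov(assigned_small, score) / cov(assigned_small, class_size)
```

Because switching is related to the unobservables, OLS on actual class size is too negative in this setup, while the estimate that uses only the randomly assigned part of class size is close to the assumed -0.2.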

Maimonides' Rule
Is there anything we can do if we don't totally trust the experiment? Angrist and Lavy found a very clever way to estimate the effects of class size. Maimonides was a twelfth-century Rabbinic scholar. He interpreted the Talmud in the following way:
"Twenty-five children may be put in charge of one teacher. If the number in the class exceeds twenty-five but is not more than forty, he should have an assistant to help with the instruction. If there are more than forty, two teachers must be appointed."

This rule has had a major impact on education in Israel. They try to follow this rule so that no class has more than 40 kids. But this means that:
- If you have 80 kids in a grade, you have two classes with 40 each
- If you have 81 kids in a grade, you have three classes with 27 each

That sounds like something we can use as an instrument. We can write the rule as
$f_{sc} = \dfrac{e_s}{\mathrm{int}\left(\frac{e_s - 1}{40}\right) + 1}$
where $e_s$ is enrollment in the grade at school $s$ and $\mathrm{int}(\cdot)$ takes the integer part. Ideally we could condition on grades with either 80 or 81 kids. More generally, there are two ways to do this:
- Condition on people close to the cutoff and use $f_{sc}$ as an instrument
- Control for enrollment in a smooth way and use $f_{sc}$ as an instrument
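The rule is easy to check numerically. This sketch computes the predicted class size from enrollment, with the cap of 40 as a parameter; the function name is made up for the example.

```python
def predicted_class_size(enrollment, cap=40):
    """Class size implied by splitting a grade so no class exceeds the cap."""
    n_classes = (enrollment - 1) // cap + 1  # int((e - 1) / cap) + 1
    return enrollment / n_classes

# Predicted class size around the first two cutoffs.
sizes = {e: predicted_class_size(e) for e in (40, 41, 80, 81)}
```

Predicted class size rises one-for-one with enrollment up to 40, drops to 20.5 at 41, climbs back to 40 at 80, and drops to 27 at 81. Those sharp drops at the cutoffs are the variation the instrument exploits.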

To estimate the model we want to use an econometric framework similar to Krueger:
$Y_{ics} = \beta_0 + \beta_1 C_{cs} + \beta_2 X_{ics} + \alpha_s + \varepsilon_{ics}$
Now we can't just put in a school effect because we will lose too much variation, so think of $\alpha_s$ as part of the error term. Their data is a bit different because it is by class rather than by individual, but for this that isn't a big deal. Angrist and Lavy first estimate this model by OLS to show what we would get.

Next, they want to worry about the fact that $C_{cs}$ is correlated with $\alpha_s + \varepsilon_{ics}$. They run instrumental variables using $f_{sc}$ as an instrument.