Advanced Quantitative Research Methodology Lecture Notes: Ecological Inference
Gary King
January 28, 2012
© Copyright 2008 Gary King, All Rights Reserved.

Reading

Gary King. A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data. Princeton University Press, 1997.

Preliminaries

Definition: Ecological inference is the process of using aggregate (i.e., "ecological") data to infer discrete individual-level relationships of interest when individual-level data are not available.

History of the Problem:

1. Ogburn and Goltra (1919), in the very first multivariate statistical analysis of politics in a political science journal, made ecological inferences and recognized the problem. The big issue in 1919: are the newly enfranchised women going to take over the political system? They regressed votes in referenda in Oregon precincts on the percent of women in each precinct. But they worried:

   "It is also theoretically possible to gerrymander the precincts in such a way that there may be a negative correlative even though men and women each distribute their votes 50 to 50 on a given measure..." (Ogburn and Goltra, 1919).

Preliminaries

2. Robinson (1950) clarified the problem, causing:
   (a) several literatures to wither, including studies of local and regional politics through aggregate electoral statistics, in favor of survey research based on national samples.
   (b) the development of a methodological literature devoted to solving the problem.
3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.
2. Goodman (1953, 1959): a statistical solution.
3. For 50 years, no other methods were used in applications.

If you can avoid making ecological inferences, do so!

Some of those who aren't so lucky:

1. Public policy: Applying the Voting Rights Act.
2. History: Who voted for the Nazis?
3. Marketing: What types of people buy your products?
4. Banking: Are banks complying with red-lining laws? Are there areas with certain types of people who might take out loans but have not?
5. Candidates for office: How do good representatives decide what policies they should favor? How can candidates tailor campaign appeals and target voter groups?
6. Sociology: Do the unemployed commit more crimes, or is it just that there are more crimes in unemployed areas?
7. Economics: With some exceptions, most theories are based on assumptions about individuals, but most data are on groups.

If you can avoid making ecological inferences, do so!

Some of those who aren't so lucky:

8. Education: Do students who attend private schools through a voucher system do as well as students who can afford to attend on their own?
9. Atmospheric physics: How can we tell which types of the vehicles actually on the roads emit more carbon dioxide and carbon monoxide?
10. Oceanography: How many marine organisms of a certain type were collected at a given depth, from fishing nets dropped from the surface down through a variety of depths?
11. Epidemiology: Does radon cause lung cancer?
12. Changes in public opinion: How to use repeated independent cross-sectional surveys to measure individual change?

The Problem: The District Level

Race of Voting                Voting Decision
Age Person        Democrat    Republican    No vote      Total
black                 ?            ?            ?        55,054
white                 ?            ?            ?        25,706
Total              19,896       10,936       49,928      80,760

The Ecological Inference Problem at the District-Level: The 1990 Election to the Ohio State House, District 42. The goal is to infer from the marginal entries (each of which is the sum of the corresponding row or column) to the cell entries. (Note information in the bounds.)
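
The information in the bounds can be made concrete with a short calculation. The sketch below applies the standard deterministic bounds for one cell of a two-way table (at least row total plus column total minus grand total, at most the smaller of the row and column totals) to the District 42 margins. The function and variable names are illustrative and do not come from the lecture.

```python
def cell_bounds(row_total, col_total, grand_total):
    """Deterministic bounds on one cell count given only the table margins."""
    lower = max(0, row_total + col_total - grand_total)
    upper = min(row_total, col_total)
    return lower, upper


# Margins from the District 42 table.
rows = {"black": 55_054, "white": 25_706}
cols = {"Democrat": 19_896, "Republican": 10_936, "No vote": 49_928}
total = sum(rows.values())
assert total == sum(cols.values())  # both sets of margins sum to 80,760

bounds = {
    (race, decision): cell_bounds(row_total, col_total, total)
    for race, row_total in rows.items()
    for decision, col_total in cols.items()
}

for (race, decision), (lo, hi) in bounds.items():
    print(f"{race:5s} {decision:10s} between {lo:>6,d} and {hi:>6,d}")
```

Most cells are bounded below only by zero, but the margins alone imply that at least 24,222 of the 55,054 black voting-age residents did not vote, which is the kind of information a regression on the margins ignores.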

The Problem: The Precinct Level

[Table: race of voting-age person (black, white) by voting decision (Democrat, Republican, No vote). All six cell entries are unknown ("?"). The black row total is 221; the remaining marginal totals are not shown here.]

The Ecological Inference Problem at the Precinct-Level: Precinct P in District 42 (1 of 131 in the district). The goal is to infer from the margins of a set of tables like this one to the cell entries in each.

The best we could do, circa 1996

[Table columns: Year; District; Estimated Percent of Blacks Voting for the Democratic Candidate.]

Sample Ecological Inferences: All Ohio State House districts where an African American Democrat ran against a white Republican (Source: Statement of Gordon G. Henderson, presented as an exhibit in federal court, using Goodman's regression). Figures above 100% are logically impossible.

The best we could do, circa 1996: Continued

[Table continued, same columns: Year; District; Estimated Percent of Blacks Voting for the Democratic Candidate.]

What Information Does The New Method Provide?

Goodman's Method: One incorrect number (5 standard deviations outside the deterministic bounds).

The New Method:

[Figure: Non-minority Turnout in New Jersey Cities and Towns.] In contrast to the best existing methods, which provide one (incorrect) number for the entire state, the method offered here gives an accurate estimate of white turnout for all 567 minor civil divisions in the state, a few of which are labeled.

Notation

            Vote            No vote
black    $\beta_i^b$     $1 - \beta_i^b$     $X_i$
white    $\beta_i^w$     $1 - \beta_i^w$     $1 - X_i$
            $T_i$           $1 - T_i$

Notation for Precinct $i$ ($i = 1, \ldots, p$).

Observed variables:
$T_i$ = voter Turnout in precinct $i$
$X_i$ = Black proportion of Voting Age Population in precinct $i$

Unobserved quantities of interest:
$\beta_i^b$ = fraction of blacks who vote in precinct $i$
$\beta_i^w$ = fraction of whites who vote in precinct $i$

Notation

An accounting identity (a fact, not an assumption):

$$T_i = \beta_i^b X_i + \beta_i^w (1 - X_i) = \beta_i^w + (\beta_i^b - \beta_i^w) X_i$$

Goodman's regression: Run a regression of $T_i$ on $X_i$ and $(1 - X_i)$ (no constant term). Coefficients are intended to be:
$B^b$, district-wide black turnout
$B^w$, district-wide white turnout
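
As a minimal sketch of Goodman's regression, the code below solves the two normal equations for a least-squares fit of $T_i$ on $X_i$ and $(1 - X_i)$ with no constant. The data are artificial and assume that both turnout rates are the same in every precinct (0.3 for blacks, 0.6 for whites), which is the one case in which the regression recovers the parameters exactly. The function name and all numbers are made up for illustration.

```python
def goodman_regression(x, t):
    """Least squares of t on x and (1 - x) with no constant; returns (B_b, B_w)."""
    sxx = sum(xi * xi for xi in x)
    sxz = sum(xi * (1 - xi) for xi in x)
    szz = sum((1 - xi) ** 2 for xi in x)
    sxt = sum(xi * ti for xi, ti in zip(x, t))
    szt = sum((1 - xi) * ti for xi, ti in zip(x, t))
    det = sxx * szz - sxz * sxz  # nonzero whenever x takes at least two values
    b_b = (szz * sxt - sxz * szt) / det
    b_w = (sxx * szt - sxz * sxt) / det
    return b_b, b_w


# Artificial precincts: fraction black from 0.05 to 0.95, constant turnout rates.
x = [k / 20 for k in range(1, 20)]
true_b, true_w = 0.3, 0.6
t = [true_b * xi + true_w * (1 - xi) for xi in x]  # the accounting identity

b_b, b_w = goodman_regression(x, t)
print(f"B_b = {b_b:.3f}, B_w = {b_w:.3f}")
```

With constant parameters every point lies exactly on the line, so the fit is perfect; real precinct data do not behave this way.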

Selected Problems with Goodman's Approach

If we follow Goodman's advice, we won't apply the model. If we don't follow Goodman's advice and apply it anyway:

1. We know parameters are not constant.

[Figure: scatterplot with axes $X_i$ and $T_i$. Precincts in Marion County, Indiana: Voter Turnout for the U.S. Senate by Fraction Black.]

Selected Problems with Goodman's Approach

The accounting identity, $T_i = \beta_i^b X_i + \beta_i^w (1 - X_i)$, contains no error other than that due to parameter variation. Thus, all scatter around the regression line is due to parameter variation.

2. Goodman's model does not take into account information from the method of bounds or from massive heteroskedasticity in aggregate data. See the graph.
3. Goodman's regression is biased in the presence of aggregation bias: $C(\beta_i^b, X_i) \neq 0$ or $C(\beta_i^w, X_i) \neq 0$. (True in any regression even if not ecological.)
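
A small artificial example may help show what aggregation bias does. In the sketch below both turnout rates rise with the fraction black ($\beta_i^b = X_i$ and $\beta_i^w = 0.5 X_i$), so both covariances are nonzero. The precincts are assumed to be of equal size, and every number is invented for illustration rather than taken from the lecture.

```python
def goodman_regression(x, t):
    """Least squares of t on x and (1 - x) with no constant; returns (B_b, B_w)."""
    sxx = sum(xi * xi for xi in x)
    sxz = sum(xi * (1 - xi) for xi in x)
    szz = sum((1 - xi) ** 2 for xi in x)
    sxt = sum(xi * ti for xi, ti in zip(x, t))
    szt = sum((1 - xi) * ti for xi, ti in zip(x, t))
    det = sxx * szz - sxz * sxz
    return (szz * sxt - sxz * szt) / det, (sxx * szt - sxz * sxt) / det


# Artificial, equal-sized precincts in which the parameters covary with X_i.
x = [k / 20 for k in range(1, 20)]
beta_b = [xi for xi in x]        # black turnout rises with fraction black
beta_w = [0.5 * xi for xi in x]  # white turnout rises with fraction black too
t = [bb * xi + bw * (1 - xi) for bb, bw, xi in zip(beta_b, beta_w, x)]

b_b, b_w = goodman_regression(x, t)

# True district-wide rates: each precinct weighted by its share of the group.
true_b = sum(bb * xi for bb, xi in zip(beta_b, x)) / sum(x)
true_w = sum(bw * (1 - xi) for bw, xi in zip(beta_w, x)) / sum(1 - xi for xi in x)

print(f"Goodman: B_b = {b_b:.3f}, B_w = {b_w:.3f}")
print(f"Truth:   B_b = {true_b:.3f}, B_w = {true_w:.3f}")
```

In this construction the regression overstates black turnout (about 0.91 against a true 0.65) and returns a negative estimate of white turnout, which is logically impossible in the same way as a figure above 100%.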

Selected Problems with Goodman's Approach

4. We cannot correct for aggregation bias within Goodman's framework.
   (a) The good idea that doesn't work: since the coefficients vary with $X_i$, let's model that explicitly, hence using $X_i$ to control for the covariation.
   (b) More specifically, even if $C(\beta_i^b, X_i) \neq 0$, if we control for $Z_i$ it might be true that $C(\beta_i^b, X_i \mid Z_i) = 0$. And if $Z_i = X_i$, it's true for sure.
   (c) Take Goodman's regression:
       $$E(T_i) = B^b X_i + B^w (1 - X_i)$$
   (d) Let $B^b = \gamma_0 + \gamma_1 X_i$ and $B^w = \theta_0 + \theta_1 X_i$ and substitute:
       $$E(T_i) = (\gamma_0 + \gamma_1 X_i) X_i + (\theta_0 + \theta_1 X_i)(1 - X_i)$$
       $$= \theta_0 + (\gamma_0 + \theta_1 - \theta_0) X_i + (\gamma_1 - \theta_1) X_i^2$$
   (e) Model is not identified: four parameters need to be estimated ($\gamma_0$, $\gamma_1$, $\theta_0$, and $\theta_1$), but only 3 can be estimated ($\theta_0$ and the coefficients in parentheses on $X_i$ and $X_i^2$).
5. If the number of people differs across precincts, Goodman's model is not estimating the correct quantity of interest.
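
The identification failure in step (e) can be checked numerically. The sketch below builds two different parameter sets that share the same three coefficients on $1$, $X_i$, and $X_i^2$, and confirms that they imply the same $E(T_i)$ at every $X_i$ while disagreeing about black turnout. The parameter values and the shift `c` are arbitrary choices made for illustration.

```python
def expected_turnout(x, gamma0, gamma1, theta0, theta1):
    """E(T_i) after substituting B^b = gamma0 + gamma1*x and B^w = theta0 + theta1*x."""
    return (gamma0 + gamma1 * x) * x + (theta0 + theta1 * x) * (1 - x)


def reduced_form(gamma0, gamma1, theta0, theta1):
    """The three estimable coefficients on 1, x, and x**2."""
    return theta0, gamma0 + theta1 - theta0, gamma1 - theta1


params_a = dict(gamma0=0.30, gamma1=0.20, theta0=0.50, theta1=0.10)

# Shift gamma0 up by c and both slopes down by c: the reduced form is unchanged.
c = 0.15
params_b = dict(
    gamma0=params_a["gamma0"] + c,
    gamma1=params_a["gamma1"] - c,
    theta0=params_a["theta0"],
    theta1=params_a["theta1"] - c,
)

grid = [k / 10 for k in range(11)]
max_gap = max(
    abs(expected_turnout(x, **params_a) - expected_turnout(x, **params_b))
    for x in grid
)

# Black turnout at x = 0.5 implied by each parameter set.
black_a = params_a["gamma0"] + params_a["gamma1"] * 0.5
black_b = params_b["gamma0"] + params_b["gamma1"] * 0.5

print("reduced form A:", reduced_form(**params_a))
print("reduced form B:", reduced_form(**params_b))
print("largest difference in E(T):", max_gap)
print("black turnout at x = 0.5:", black_a, "vs", black_b)
```

Because the data only ever reveal the reduced form, no amount of aggregate data can distinguish the two parameter sets.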

The Data

[Figure: A Scattercross Graph of Voter Turnout by Fraction Hispanic, with axes $X_i$ and $T_i$.]

Solve the accounting identity:

$$T_i = \beta_i^w + (\beta_i^b - \beta_i^w) X_i$$

for the unknowns:

$$\beta_i^w = \left(\frac{T_i}{1 - X_i}\right) - \left(\frac{X_i}{1 - X_i}\right) \beta_i^b$$

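The Data: Continued

Precinct 52: $T_{52} = .19$, $X_{52} = .88$

$$\beta_{52}^w = \left(\frac{T_{52}}{1 - X_{52}}\right) - \left(\frac{X_{52}}{1 - X_{52}}\right) \beta_{52}^b = \left(\frac{.19}{1 - .88}\right) - \left(\frac{.88}{1 - .88}\right) \beta_{52}^b = 1.58 - 7.33\,\beta_{52}^b$$

[Figure: plot with axes $\beta_i^b$ and $\beta_i^w$.]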
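
The Precinct 52 arithmetic can be reproduced in a few lines. The sketch below computes the intercept and slope of the line relating $\beta_{52}^w$ to $\beta_{52}^b$, and then the deterministic bounds obtained by requiring both quantities to lie in $[0, 1]$. The function names are illustrative, and the code assumes $0 < X_i < 1$.

```python
def tomography_line(t, x):
    """Intercept and slope of beta_w written as a linear function of beta_b."""
    return t / (1 - x), -x / (1 - x)


def precinct_bounds(t, x):
    """Bounds on (beta_b, beta_w) implied by the identity and the [0, 1] limits."""
    b_lo = max(0.0, (t - (1 - x)) / x)
    b_hi = min(1.0, t / x)
    w_lo = max(0.0, (t - x) / (1 - x))
    w_hi = min(1.0, t / (1 - x))
    return (b_lo, b_hi), (w_lo, w_hi)


t52, x52 = 0.19, 0.88
intercept, slope = tomography_line(t52, x52)
(b_lo, b_hi), (w_lo, w_hi) = precinct_bounds(t52, x52)

print(f"beta_w = {intercept:.2f} + ({slope:.2f}) * beta_b")
print(f"beta_b lies in [{b_lo:.3f}, {b_hi:.3f}]")
print(f"beta_w lies in [{w_lo:.3f}, {w_hi:.3f}]")
```

For this precinct the bounds on $\beta_{52}^b$ are narrow (roughly 0.08 to 0.22) while those on $\beta_{52}^w$ span the whole unit interval, which is what one would expect when 88% of the precinct belongs to one group.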

The Model for Data Without Aggregation Bias, But Robust in its Presence

The Goal: Knowledge of $\beta_i^b$ and $\beta_i^w$ in each precinct.

Begin with the basic accounting identity (not an assumption of linearity):

$$T_i = \beta_i^b X_i + \beta_i^w (1 - X_i)$$

and add three assumptions (in the basic version of the model):

1. $\beta_i^b$ and $\beta_i^w$ are truncated bivariate normal.
   [Figure: three example densities, panels (a), (b), and (c), each with axes $\beta_i^b$ and $\beta_i^w$.]
   (The 5 parameters of this density need to be estimated by forming the likelihood.)

The Model for Data Without Aggregation Bias, But Robust in its Presence

2. No aggregation bias (a priori): $\beta_i^b$ and $\beta_i^w$ are mean independent of $X_i$. This allows a posteriori aggregation bias (i.e., after conditioning on $T_i$).
3. No spatial autocorrelation: $T_i \mid X_i$ are independent over observations.

96 Deriving the Likelihood Function 1. The story of the model is that we learn things in order (a) (As in regression), everything is conditional on X i, which means we learn it first. (b) Then the world draws β b i and β w i from a truncated normal, but we don t get to see them. (c) Finally, we learn T i, which is computed via the accounting identity deterministically: T i = β b i X i + β w i (1 X i ). 2. The random variable is then T (given X ), which is truncated bivarate normal 3. The five parameters of the truncated bivariate normal need to be estimated: ψ = { B b, B w, σ b, σ w, ρ} = { B, Σ} These are on the untruncated scale (and not quantities of interest) since: TN(β b i, β w i B, Σ) = N(β b i, β w i B, Σ) 1(βb i, βw i ) R( B, Σ) Gary King () Ecological Inference 21 / 38

97 Deriving the Likelihood Function 1. The story of the model is that we learn things in order (a) (As in regression), everything is conditional on X i, which means we learn it first. (b) Then the world draws β b i and β w i from a truncated normal, but we don t get to see them. (c) Finally, we learn T i, which is computed via the accounting identity deterministically: T i = β b i X i + β w i (1 X i ). 2. The random variable is then T (given X ), which is truncated bivarate normal 3. The five parameters of the truncated bivariate normal need to be estimated: ψ = { B b, B w, σ b, σ w, ρ} = { B, Σ} These are on the untruncated scale (and not quantities of interest) since: where TN(β b i, β w i B, Σ) = N(β b i, β w i B, Σ) 1(βb i, βw i ) R( B, Σ) Gary King () Ecological Inference 21 / 38

Deriving the Likelihood Function
1. The story of the model is that we learn things in order:
(a) (As in regression), everything is conditional on $X_i$, which means we learn it first.
(b) Then the world draws $\beta_i^b$ and $\beta_i^w$ from a truncated normal, but we don't get to see them.
(c) Finally, we learn $T_i$, which is computed deterministically via the accounting identity: $T_i = \beta_i^b X_i + \beta_i^w (1 - X_i)$.
2. The random variable is then $T$ (given $X$), which is truncated bivariate normal.
3. The five parameters of the truncated bivariate normal need to be estimated:
$$\breve{\psi} = \{\breve{B}^b, \breve{B}^w, \breve{\sigma}_b, \breve{\sigma}_w, \breve{\rho}\} = \{\breve{B}, \breve{\Sigma}\}$$
These are on the untruncated scale (and not quantities of interest) since:
$$\mathrm{TN}(\beta_i^b, \beta_i^w \mid \breve{B}, \breve{\Sigma}) = N(\beta_i^b, \beta_i^w \mid \breve{B}, \breve{\Sigma}) \, \frac{1(\beta_i^b, \beta_i^w)}{R(\breve{B}, \breve{\Sigma})}$$
where $1(\beta_i^b, \beta_i^w)$ equals 1 when both arguments lie in $[0, 1]$ and 0 otherwise, and
$$R(\breve{B}, \breve{\Sigma}) = \int_0^1 \int_0^1 N(\beta^b, \beta^w \mid \breve{B}, \breve{\Sigma}) \, d\beta^b \, d\beta^w \quad \text{(volume above unit square)}$$
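The story in step 1 can be simulated directly. The following is a minimal sketch using only the Python standard library, not the method used in the lecture's software. The function names and the parameter values in the usage lines are hypothetical, and rejection sampling is simply one easy way to draw from the truncated bivariate normal (it can be slow when little of the untruncated mass lies above the unit square).

```python
import math
import random

def draw_truncated_bvn(Bb, Bw, sb, sw, rho, rng):
    """Draw (beta_b, beta_w) from a bivariate normal truncated to the unit square.

    All five parameters are on the untruncated scale. Rejection sampling:
    draw from the untruncated normal and keep the first draw inside [0, 1] x [0, 1].
    """
    while True:
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        beta_b = Bb + sb * z1
        # Build the second coordinate so that its correlation with the first is rho.
        beta_w = Bw + sw * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        if 0.0 <= beta_b <= 1.0 and 0.0 <= beta_w <= 1.0:
            return beta_b, beta_w

def simulate_precinct(X, psi, rng):
    """Follow the story: X is known first, the betas are drawn, then T follows."""
    beta_b, beta_w = draw_truncated_bvn(*psi, rng)
    T = beta_b * X + beta_w * (1.0 - X)  # accounting identity
    return beta_b, beta_w, T

# Hypothetical parameter values, for illustration only.
rng = random.Random(0)
psi = (0.5, 0.5, 0.2, 0.2, 0.3)  # (Bb, Bw, sigma_b, sigma_w, rho)
beta_b, beta_w, T = simulate_precinct(0.4, psi, rng)
```

In real data only `X` and `T` would be observed; `beta_b` and `beta_w` are returned here only so the simulation can be checked.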

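Deriving the Likelihood Function
4. (From simulations of these parameters, we will compute the quantities of interest, $\beta_i^b$ and $\beta_i^w$. Details shortly.)
5. The likelihood:
$$
\begin{aligned}
L(\breve{\psi} \mid T) &\propto \prod_{X_i \in (0,1)} P(T_i \mid \breve{\psi}) \\
&= \prod_{X_i \in (0,1)} \left( \frac{\text{What we observe}}{\text{What we could have observed}} \right) \\
&= \prod_{X_i \in (0,1)} \left( \frac{\text{Area above line segment}}{\text{Volume above square}} \right) \\
&= \prod_{X_i \in (0,1)} \frac{\left( \dfrac{\text{Area above line segment}}{\text{Area above line}} \right) \left( \dfrac{\text{Area above line}}{\text{Volume above plane}} \right)}{\left( \dfrac{\text{Volume above square}}{\text{Volume above plane}} \right)} \\
&= \prod_{X_i \in (0,1)} N(T_i \mid \mu_i, \sigma_i^2) \, \frac{S(\breve{B}, \breve{\Sigma})}{R(\breve{B}, \breve{\Sigma})}
\end{aligned}
$$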

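Deriving the Likelihood Function
where
$$E(T_i \mid X_i) \equiv \mu_i = \breve{B}^b X_i + \breve{B}^w (1 - X_i),$$
$$V(T_i \mid X_i) \equiv \sigma_i^2 = (\breve{\sigma}_w^2) + (2\breve{\sigma}_{bw} - 2\breve{\sigma}_w^2) X_i + (\breve{\sigma}_b^2 + \breve{\sigma}_w^2 - 2\breve{\sigma}_{bw}) X_i^2,$$
$$S(\breve{B}, \breve{\Sigma}) = \int_{\max\left(0, \frac{T_i - (1 - X_i)}{X_i}\right)}^{\min\left(1, \frac{T_i}{X_i}\right)} N\!\left(\beta^b \,\Big|\, \breve{B}^b + \frac{\omega_i}{\sigma_i^2} \epsilon_i, \; \breve{\sigma}_b^2 - \frac{\omega_i^2}{\sigma_i^2}\right) d\beta^b,$$
with $\epsilon_i = T_i - \mu_i$ and $\omega_i = \breve{\sigma}_b^2 X_i + \breve{\sigma}_{bw} (1 - X_i)$, the covariance of $\beta_i^b$ and $T_i$ given $X_i$ on the untruncated scale.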
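The pieces of the likelihood can be assembled numerically. The sketch below is illustrative only: the function names are hypothetical, $\epsilon_i$ is taken to be $T_i - \mu_i$ and $\omega_i$ the untruncated covariance of $\beta_i^b$ and $T_i$, and $R$ is approximated with a simple midpoint rule (an assumption of this sketch, not necessarily how any EI software computes it).

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for CDF evaluations

def moments(X, Bb, Bw, sb, sw, rho):
    """Return mu_i, sigma_i^2, and omega_i = Cov(beta_b, T_i | X_i)."""
    sbw = rho * sb * sw
    mu = Bb * X + Bw * (1.0 - X)
    var = sw**2 + (2 * sbw - 2 * sw**2) * X + (sb**2 + sw**2 - 2 * sbw) * X**2
    omega = sb**2 * X + sbw * (1.0 - X)
    return mu, var, omega

def S_term(T, X, Bb, Bw, sb, sw, rho):
    """Area above the line segment as a share of the area above the whole line."""
    mu, var, omega = moments(X, Bb, Bw, sb, sw, rho)
    cond_mean = Bb + omega / var * (T - mu)
    cond_sd = math.sqrt(sb**2 - omega**2 / var)
    lower = max(0.0, (T - (1.0 - X)) / X)
    upper = min(1.0, T / X)
    return STD.cdf((upper - cond_mean) / cond_sd) - STD.cdf((lower - cond_mean) / cond_sd)

def R_term(Bb, Bw, sb, sw, rho, n=2000):
    """Volume of the untruncated bivariate normal above the unit square.

    Midpoint rule over beta_b; the inner integral over beta_w is a
    difference of two normal CDFs for beta_w given beta_b.
    """
    s_cond = sw * math.sqrt(1.0 - rho**2)
    total = 0.0
    for k in range(n):
        b = (k + 0.5) / n
        m = Bw + rho * sw * (b - Bb) / sb
        inner = STD.cdf((1.0 - m) / s_cond) - STD.cdf((0.0 - m) / s_cond)
        dens = math.exp(-0.5 * ((b - Bb) / sb) ** 2) / (sb * math.sqrt(2.0 * math.pi))
        total += dens * inner / n
    return total

def log_likelihood(data, psi):
    """Sum of log[N(T_i | mu_i, sigma_i^2) * S / R] over precincts with 0 < X_i < 1."""
    R = R_term(*psi)
    ll = 0.0
    for X, T in data:
        if not 0.0 < X < 1.0:
            continue  # the product runs only over X_i strictly inside (0, 1)
        mu, var, _ = moments(X, *psi)
        log_n = -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (T - mu) ** 2 / var
        ll += log_n + math.log(S_term(T, X, *psi)) - math.log(R)
    return ll
```

Here `data` is a list of `(X_i, T_i)` pairs and `psi` is the tuple `(Bb, Bw, sigma_b, sigma_w, rho)`. A maximizer would call `log_likelihood` repeatedly over candidate values of `psi`; for parameter values far from the data, `S_term` can underflow to zero, which this sketch does not guard against.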

Deriving the Likelihood Function
6. A visual version of the likelihood:
[Figure: plot over the unit square with $\beta_i^b$ on the horizontal axis and $\beta_i^w$ on the vertical axis.]

The Truncated Bivariate Normal Distribution's Five Parameters Can be Estimated From Aggregate Data: Intuition
[Figure: six panels, (a) through (f), each plotting $X_i$ and $T_i$.]
Data were randomly generated from the model with parameter values $\breve{B}^b$, $\breve{B}^w$, $\breve{\sigma}_b$, $\breve{\sigma}_w$, and $\breve{\rho}$, shown at the top of each graph. The solid line is the expected value and the dashed lines are at plus and minus one standard deviation.

Another view of how the data change with the model
[Figure: six tomography plots, panels (a) through (f), each with axes $\beta_i^b$ and $\beta_i^w$.]
Observable Implications for Sample Parameter Values. The numbers at the top of each tomography plot are the parameter values for the distribution from which data were randomly generated: $\breve{B}^b$, $\breve{B}^w$, $\breve{\sigma}_b$, $\breve{\sigma}_w$, and $\breve{\rho}$.

Calculating Quantities of Interest: A story of X-rays and tomography machines; then how to do it
Rearranging the basic accounting identity gives $\beta_i^w$ as a linear function of $\beta_i^b$:
$$\beta_i^w = \left( \frac{T_i}{1 - X_i} \right) - \left( \frac{X_i}{1 - X_i} \right) \beta_i^b$$
Thus, knowing $T_i$ and $X_i$ in one precinct narrows the possible values of $\beta_i^b, \beta_i^w$ to one line cut across this figure:
[Figure: A Tomography Plot, with $\beta_i^b$ on the horizontal axis and $\beta_i^w$ on the vertical axis.]
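As a worked illustration of the tomography line, the sketch below computes the line's intercept and slope and the range of $\beta_i^b$ for which both coefficients stay inside the unit interval. The function names are hypothetical; the bounds are the same limits of integration that appear in $S(\cdot)$ in the likelihood.

```python
def tomography_line(T, X):
    """Return (intercept, slope) of beta_w = intercept + slope * beta_b, for 0 < X < 1."""
    if not 0.0 < X < 1.0:
        raise ValueError("X must lie strictly between 0 and 1")
    return T / (1.0 - X), -X / (1.0 - X)

def beta_b_bounds(T, X):
    """Range of beta_b for which both beta_b and beta_w lie in [0, 1]."""
    return max(0.0, (T - (1.0 - X)) / X), min(1.0, T / X)

def beta_w_given(beta_b, T, X):
    """The beta_w implied by a value of beta_b on the precinct's tomography line."""
    intercept, slope = tomography_line(T, X)
    return intercept + slope * beta_b
```

For example, with $T_i = 0.3$ and $X_i = 0.5$ the line is $\beta_i^w = 0.6 - \beta_i^b$, so $\beta_i^b$ is confined to $[0, 0.6]$ and $\beta_i^w$ to the same range, even before any statistical model is applied.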

Calculating Quantities of Interest: A story of X-rays and tomography machines; then how to do it
[Figure: horizontal axis $\beta_i^b$.]

How to Calculate Quantities of Interest
1. Option 1. Simulate only (district-level) aggregate quantities
(a) Algorithm to take one draw of the district-level fraction of blacks who vote:
i. Draw $\breve{\psi}$ from its posterior or sampling density: an asymptotic normal with mean equal to the point estimates and variance equal to the inverse of the negative Hessian at the maximum.
ii. Draw $\beta_i^b$ and $\beta_i^w$ from $\mathrm{TN}(\beta_i^b, \beta_i^w \mid \breve{B}, \breve{\Sigma})$, given the simulated parameters, $\breve{\psi} = \{\breve{B}, \breve{\Sigma}\}$.
iii. Compute the weighted average of the simulated coefficients (weights based on precinct population):
$$B^b = \sum_{i=1}^{p} \frac{N_i^{b+}}{N_+^{b+}} \, \beta_i^b$$
(b) Problem: We only get knowledge of the district-wide aggregate, and it's not robust.
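The three steps of Option 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure as implemented in any EI software: step i draws each parameter from an independent normal (ignoring the off-diagonal terms of the inverse negative Hessian), invalid draws of the standard deviations or correlation are simply redrawn, and step ii uses rejection sampling. All function and argument names are hypothetical.

```python
import math
import random

def draw_psi(psi_hat, psi_se, rng):
    """Step i (simplified): independent normal draws around the point estimates.

    Assumption of this sketch: covariances between the estimates are ignored,
    and draws with a non-positive sigma or |rho| >= 1 are redrawn.
    """
    while True:
        Bb, Bw, sb, sw, rho = (rng.gauss(m, s) for m, s in zip(psi_hat, psi_se))
        if sb > 0.0 and sw > 0.0 and -1.0 < rho < 1.0:
            return Bb, Bw, sb, sw, rho

def draw_betas(psi, rng):
    """Step ii: rejection-sample (beta_b, beta_w) from the truncated bivariate normal."""
    Bb, Bw, sb, sw, rho = psi
    while True:
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        beta_b = Bb + sb * z1
        beta_w = Bw + sw * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        if 0.0 <= beta_b <= 1.0 and 0.0 <= beta_w <= 1.0:
            return beta_b, beta_w

def draw_district_Bb(n_black, psi_hat, psi_se, rng):
    """Steps i to iii: one simulated draw of the district-level fraction B^b.

    n_black holds the precinct counts used as weights (N_i^{b+}).
    """
    psi = draw_psi(psi_hat, psi_se, rng)
    total = float(sum(n_black))
    # One (beta_b, beta_w) draw per precinct; only beta_b enters the weighted average.
    return sum(n / total * draw_betas(psi, rng)[0] for n in n_black)
```

Calling `draw_district_Bb` many times yields simulations of the district-wide aggregate only. Because the precinct draws ignore each precinct's own $T_i$, nothing here uses the tomography lines, which is the weakness noted in (b).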

How to Calculate Quantities of Interest
2. Option 2. Use the knowledge that simulations for observation $i$ must come from its tomography line:
(a) By the story of the model, if we know $T_i$, we learn the entire tomography line (since $X_i$ is known ex ante).
(b) So we will condition on $T_i$ to make a prediction from the tomography line.
(c) We could apply the Option 1 algorithm and use rejection sampling (discard simulations of $\beta_i^b, \beta_i^w$ that are not on the tomography line), but this would take forever.
(d) Alternative algorithm for drawing simulations of $\beta_i^b$ and $\beta_i^w$:
i. Find the expression for $P(\beta_i^b \mid T_i, \breve{\psi})$ analytically, which is a particular truncated univariate normal (see King, 1997: Appendix C).
ii. Draw $\breve{\psi}$ from its posterior or sampling density (the same multivariate normal as always).
iii. Insert the simulation into $P(\beta_i^b \mid T_i, \breve{\psi})$ and draw out one simulated $\beta_i^b$.
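Step iii can be sketched as a draw from a truncated univariate normal. The sketch assumes that $P(\beta_i^b \mid T_i, \breve{\psi})$ is the normal that appears inside $S(\cdot)$ in the likelihood, with mean $\breve{B}^b + (\omega_i / \sigma_i^2)\epsilon_i$ and variance $\breve{\sigma}_b^2 - \omega_i^2 / \sigma_i^2$, truncated to the bounds implied by the tomography line; here $\epsilon_i = T_i - \mu_i$ and $\omega_i$ is the untruncated covariance of $\beta_i^b$ and $T_i$. The function name is hypothetical, `psi` stands for one simulated parameter vector from step ii, and inverse-CDF sampling is just one convenient way to draw from a truncated normal.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal

def draw_beta_b_given_T(T, X, psi, rng):
    """Draw beta_b from its truncated univariate normal given T_i, X_i, and psi."""
    Bb, Bw, sb, sw, rho = psi
    sbw = rho * sb * sw
    mu = Bb * X + Bw * (1.0 - X)
    var = sw**2 + (2 * sbw - 2 * sw**2) * X + (sb**2 + sw**2 - 2 * sbw) * X**2
    omega = sb**2 * X + sbw * (1.0 - X)
    mean = Bb + omega / var * (T - mu)
    sd = math.sqrt(sb**2 - omega**2 / var)
    # Bounds implied by the tomography line and the unit square.
    lower = max(0.0, (T - (1.0 - X)) / X)
    upper = min(1.0, T / X)
    # Inverse-CDF sampling restricted to [lower, upper].
    p_lo = STD.cdf((lower - mean) / sd)
    p_hi = STD.cdf((upper - mean) / sd)
    u = p_lo + rng.random() * (p_hi - p_lo)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf's argument strictly inside (0, 1)
    beta_b = mean + sd * STD.inv_cdf(u)
    return min(max(beta_b, lower), upper)  # guard against rounding at the bounds
```

Every draw lies on the precinct's tomography line segment by construction, so no simulations are discarded. Given a simulated $\beta_i^b$, the matching $\beta_i^w$ follows from the accounting identity as $(T_i - \beta_i^b X_i) / (1 - X_i)$.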


BIG IDEAS. Area of Learning: SOCIAL STUDIES Urban Studies Grade 12. Learning Standards. Curricular Competencies Area of Learning: SOCIAL STUDIES Urban Studies Grade 12 BIG IDEAS Urbanization is a critical force that shapes both human life and the planet. The historical development of cities has been shaped by geographic,

More information

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017

COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 COMS 4721: Machine Learning for Data Science Lecture 10, 2/21/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University FEATURE EXPANSIONS FEATURE EXPANSIONS

More information

Confidence Intervals for the Mean of Non-normal Data Class 23, Jeremy Orloff and Jonathan Bloom

Confidence Intervals for the Mean of Non-normal Data Class 23, Jeremy Orloff and Jonathan Bloom Confidence Intervals for the Mean of Non-normal Data Class 23, 8.05 Jeremy Orloff and Jonathan Bloom Learning Goals. Be able to derive the formula for conservative normal confidence intervals for the proportion

More information

CSSS/STAT/SOC 321 Case-Based Social Statistics I. Levels of Measurement

CSSS/STAT/SOC 321 Case-Based Social Statistics I. Levels of Measurement CSSS/STAT/SOC 321 Case-Based Social Statistics I Levels of Measurement Christopher Adolph Department of Political Science and Center for Statistics and the Social Sciences University of Washington, Seattle

More information

Forecasting the 2012 Presidential Election from History and the Polls

Forecasting the 2012 Presidential Election from History and the Polls Forecasting the 2012 Presidential Election from History and the Polls Drew Linzer Assistant Professor Emory University Department of Political Science Visiting Assistant Professor, 2012-13 Stanford University

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences Random Variables Christopher Adolph Department of Political Science and Center for Statistics and the Social Sciences University

More information

Supplemental Material for Policy Deliberation and Voter Persuasion: Experimental Evidence from an Election in the Philippines

Supplemental Material for Policy Deliberation and Voter Persuasion: Experimental Evidence from an Election in the Philippines Supplemental Material for Policy Deliberation and Voter Persuasion: Experimental Evidence from an Election in the Philippines March 17, 2017 1 Accounting for Deviations in the Randomization Protocol In

More information

review session gov 2000 gov 2000 () review session 1 / 38

review session gov 2000 gov 2000 () review session 1 / 38 review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review

More information

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Introduction to Statistical Inference Kosuke Imai Princeton University January 31, 2010 Kosuke Imai (Princeton) Introduction to Statistical Inference January 31, 2010 1 / 21 What is Statistics? Statistics

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model EPSY 905: Multivariate Analysis Lecture 1 20 January 2016 EPSY 905: Lecture 1 -

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

Statistical Analysis of Causal Mechanisms

Statistical Analysis of Causal Mechanisms Statistical Analysis of Causal Mechanisms Kosuke Imai Princeton University April 13, 2009 Kosuke Imai (Princeton) Causal Mechanisms April 13, 2009 1 / 26 Papers and Software Collaborators: Luke Keele,

More information

