Statistical Analysis of List Experiments

Similar documents
Statistical Analysis of the Item Count Technique

Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments

Eliciting Truthful Answers to Sensitive Survey Questions: New Statistical Methods for List and Endorsement Experiments

Identification and Inference in Causal Mediation Analysis

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Lecture 14: Introduction to Poisson Regression

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

High Dimensional Propensity Score Estimation via Covariate Balancing

The Logit Model: Estimation, Testing and Interpretation

On the Use of Linear Fixed Effects Regression Models for Causal Inference

Chapter 11. Regression with a Binary Dependent Variable

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Stat 5101 Lecture Notes

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Covariate Balancing Propensity Score for General Treatment Regimes

Exam Applied Statistical Regression. Good Luck!

Statistical Analysis of Causal Mechanisms

Statistical Analysis of Causal Mechanisms

Binary Logistic Regression

Statistical Analysis of Causal Mechanisms

Item Response Theory for Conjoint Survey Experiments

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

High-Throughput Sequencing Course

Generalized Linear Models for Non-Normal Data

Chapter 1 Statistical Inference

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Models for Heterogeneous Choices

Chapter 4: Factor Analysis

STA 216, GLM, Lecture 16. October 29, 2007

WISE International Masters

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Statistical Analysis of Causal Mechanisms for Randomized Experiments

EXAMINATION: QUANTITATIVE EMPIRICAL METHODS. Yale University. Department of Political Science

Lecture 10: Introduction to Logistic Regression

Dynamics in Social Networks and Causality

Truncation and Censoring

Classification. Chapter Introduction. 6.2 The Bayes classifier

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

Binary Dependent Variable. Regression with a

regression analysis is a type of inferential statistics which tells us whether relationships between two or more variables exist

Instrumental Variables

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Mixed Models for Longitudinal Ordinal and Nominal Outcomes

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Review of Statistics 101

Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Probability and Statistics Notes

Discussion of Papers on the Extensions of Propensity Score

Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

The Essential Role of Pair Matching in. Cluster-Randomized Experiments. with Application to the Mexican Universal Health Insurance Evaluation

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression

Statistical Distribution Assumptions of General Linear Models

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

STA6938-Logistic Regression Model

CHAPTER 1: BINARY LOGIT MODEL

Hierarchical Linear Modeling. Lesson Two

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics

Psychology 282 Lecture #4 Outline Inferences in SLR

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Investigating Models with Two or Three Categories

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

Modeling the Mean: Response Profiles v. Parametric Curves

A Mixture Modeling Approach for Empirical Testing of Competing Theories

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen

Econometrics Problem Set 10

ECON 4160, Autumn term Lecture 1

Potential Outcomes Model (POM)

Loglikelihood and Confidence Intervals

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Semiparametric Generalized Linear Models

Linear Regression With Special Variables

Math 494: Mathematical Statistics

Semiparametric Regression

9 Generalized Linear Models

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

ECONOMETRICS HONOR S EXAM REVIEW SESSION

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

Final Exam. Name: Solution:

1. Capitalize all surnames and attempt to match with Census list. 3. Split double-barreled names apart, and attempt to match first half of name.

Transcription:

Statistical Analysis of List Experiments Graeme Blair Kosuke Imai Princeton University December 17, 2010 Blair and Imai (Princeton) List Experiments Political Methodology Seminar 1 / 32

Motivation Surveys are used widely in social sciences Validity of surveys depends on the accuracy of self-reports Sensitive questions = social desirability, privacy concerns e.g., racial prejudice, corruptions, fraud, support for militant groups Lies and non-responses How can we elicit truthful answers to sensitive questions? Survey methodology: protect privacy through indirect questioning Statistical methodology: efficiently recover underlying responses Blair and Imai (Princeton) List Experiments Political Methodology Seminar 2 / 32

Project Overview List Experiments Also known as total block response, item count technique, and unmatched count technique Use aggregation to protect privacy An alternative to randomized response technique Little methodological work Goals of this project: 1 Formalize the key identification assumptions 2 Develop methods for multivariate regression analysis 3 Develop a statistical test to detect failures of list experiments 4 Develop methods to adjust for deviations from the assumptions 5 Develop software to implement all suggestions 6 Extend the methods to measure spacial variation of citizens support for militant groups and foreign forces Blair and Imai (Princeton) List Experiments Political Methodology Seminar 3 / 32

Project Reference Papers: 1 Imai. Statistical Inference for the Item Count Technique. 2 Blair and Imai. Statistical Analysis of List Experiments. Software: Blair and Imai. list: Multivariate Statistical Analysis for the Item Count Technique. R package Applications (in Progress): 1 Measuring support for foreign forces and Taliban in Afghanistan (joint with Jason Lyall) 2 Measuring support for insurgent groups in the Niger Delta Project Website: http://imai.princeton.edu/projects/sensitive.html Blair and Imai (Princeton) List Experiments Political Methodology Seminar 4 / 32

The 1991 National Race and Politics Survey Randomize the sample into the treatment and control groups The script for the control group Now I m going to read you three things that sometimes make people angry or upset. After I read all three, just tell me HOW MANY of them upset you. (I don t want to know which ones, just how many.) (1) the federal government increasing the tax on gasoline; (2) professional athletes getting million-dollar-plus salaries; (3) large corporations polluting the environment. How many, if any, of these things upset you? Blair and Imai (Princeton) List Experiments Political Methodology Seminar 5 / 32

The 1991 National Race and Politics Survey Randomize the sample into the treatment and control groups The script for the treatment group Now I m going to read you four things that sometimes make people angry or upset. After I read all four, just tell me HOW MANY of them upset you. (I don t want to know which ones, just how many.) (1) the federal government increasing the tax on gasoline; (2) professional athletes getting million-dollar-plus salaries; (3) large corporations polluting the environment; (4) a black family moving next door to you. How many, if any, of these things upset you? Blair and Imai (Princeton) List Experiments Political Methodology Seminar 6 / 32

Notation and Setup J: number of non-sensitive items N: number of respondents T i : binary treatment indicator (1 = treatment, 0 = control) Potential outcomes notation Z ij (t): potential response to the jth non-sensitive item under treatment status T i = t for j = 1,..., J and t = 0, 1 Z i,j+1 (t): potential response to the sensitive item under treatment status T i = t where Z i,j+1 (0) represents truthful answer Y i (0) = J j=1 Z ij(0): potential response under control condition Y i (1) = J+1 j=1 Z ij(1): potential response under treatment condition Y i = Y i (T i ): observed response Blair and Imai (Princeton) List Experiments Political Methodology Seminar 7 / 32

Identification Assumptions 1 No Design Effect: The inclusion of the sensitive item does not affect answers to non-sensitive items J Z ij (0) = j=1 J Z ij (1) j=1 2 No Liars: Answers about the sensitive item are truthful Z i,j+1 (0) = Z i,j+1 (1) Blair and Imai (Princeton) List Experiments Political Methodology Seminar 8 / 32

Limitations to the Standard Techniques Difference-in-means estimator: ˆτ = average of the treated average of the control Straightforward and unbiased under the above assumptions But, potentially inefficient Difficult to explore multivariate relationship No existing method allows for multivariate regression analysis Blair and Imai (Princeton) List Experiments Political Methodology Seminar 9 / 32

Nonlinear Least Squares (NLS) Estimator Generalize the difference-in-means estimator to a multivariate regression estimator The Model: Y i = f (X i, γ) }{{} + T i g(x i, δ) }{{} sensitive + ɛ i non sensitive X i : covariates f (x, γ): model for non-sensitive items, e.g., J logit 1 (x γ) g(x, δ): model for sensitive item, e.g., logit 1 (x δ) Two-step estimation procedure: 1 Fit f (x, γ) to the control group via NLS and obtain ˆγ 2 Fit g(x, δ) to the treatment group via NLS after subtracting f (X i, ˆγ) from Y i and obtain ˆδ Standard errors via the generalized method of moments With no covariates, it reduces to the difference-in-means estimator Blair and Imai (Princeton) List Experiments Political Methodology Seminar 10 / 32

Extracting More Information from the Data Define a type of each respondent by (Y i (0), Z i,j+1 (0)) Y i (0): total number of yes for non-sensitive items {0, 1,..., J} Z i,j+1 (0): truthful answer to the sensitive item {0, 1} A total of 2 (J + 1) types Example: two non-sensitive items (J = 3) Y i Treatment group Control group 4 (3,1) 3 (2,1) (3,0) (3,1) (3,0) 2 (1,1) (2,0) (2,1) (2,0) 1 (0,1) (1,0) (1,1) (1,0) 0 (0,0) (0,1) (0,0) Joint distribution is identified Blair and Imai (Princeton) List Experiments Political Methodology Seminar 11 / 32

Extracting More Information from the Data Define a type of each respondent by (Y i (0), Z i,j+1 (0)) Y i (0): total number of yes for non-sensitive items {0, 1,..., J} Z i,j+1 (0): truthful answer to the sensitive item {0, 1} A total of 2 (J + 1) types Example: two non-sensitive items (J = 3) Y i Treatment group Control group 4 (3,1) 3 (2,1) (3,0) (3,1) (3,0) 2 (1,1) (2,0) (2,1) (2,0) 1 (0,1) (1,0) (1,1) (1,0) 0 (0,0) (0,1) (0,0) Joint distribution is identified: Pr(type = (y, 1)) = Pr(Y i y T i = 0) Pr(Y i y T i = 1) Blair and Imai (Princeton) List Experiments Political Methodology Seminar 12 / 32

Extracting More Information from the Data Define a type of each respondent by (Y i (0), Z i,j+1 (0)) Y i (0): total number of yes for non-sensitive items {0, 1,..., J} Z i,j+1 (0): truthful answer to the sensitive item {0, 1} A total of 2 (J + 1) types Example: two non-sensitive items (J = 3) Y i Treatment group Control group 4 (3,1) 3 (2,1) (3,0) (3,1) (3,0) 2 (1,1) (2,0) (2,1) (2,0) 1 (0,1) (1,0) (1,1) (1,0) 0 (0,0) (0,1) (0,0) Joint distribution is identified: Pr(type = (y, 1)) = Pr(Y i y T i = 0) Pr(Y i y T i = 1) Pr(type = (y, 0)) = Pr(Y i y T i = 1) Pr(Y i < y T i = 0) Blair and Imai (Princeton) List Experiments Political Methodology Seminar 13 / 32

The Maximum Likelihood (ML) Estimator Model for sensitive item as before: e.g., logistic regression Pr(Z i,j+1 (0) = 1 X i = x) = logit 1 (x δ) Model for non-sensitive item given the response to sensitive item: e.g., binomial or beta-binomial logistic regression Pr(Y i (0) = y X i = x, Z i,j+1 (0) = z) = J logit 1 (x ψ z ) Difficult to maximize the resulting likelihood function Develop the EM algorithm for reliable estimation Blair and Imai (Princeton) List Experiments Political Methodology Seminar 14 / 32

The Likelihood Function g(x, δ) = Pr(Z i,j+1 (0) = 1 X i = x) h z (y; x, ψ z ) = Pr(Y i (0) = y X i = x, Z i,j+1 (0) = z) The likelihood function consists of mixtures: (1 g(x i, δ))h 0 (0; X i, ψ 0 ) g(x i, δ)h 1 (J; X i, ψ 1 ) i J (1,0) i J (1,J+1) J {g(x i, δ)h 1 (y 1; X i, ψ 1 ) + (1 g(x i, δ))h 0 (y; X i, ψ 0 )} y=1 i J (1,y) J {g(x i, δ)h 1 (y; X i, ψ 1 ) + (1 g(x i, δ))h 0 (y; X i, ψ 0 )} y=0 i J (0,y) where J (t, y) represents a set of respondents with (T i, Y i ) = (t, y) Maximizing this function is difficult Blair and Imai (Princeton) List Experiments Political Methodology Seminar 15 / 32

Missing Data Framework and the EM Algorithm Consider Z i,j+1 (0) as missing data For some respondents, Z i,j+1 (0) is observed The complete-data likelihood has a simple form: N i=1 { } g(x i, δ)h 1 (Y i 1; X i, ψ 1 ) T i h 1 (Y i ; X i, ψ 1 ) 1 T Zi,J+1 (0) i {(1 g(x i, δ))h 0 (Y i ; X i, ψ 0 )} 1 Z i,j+1(0) The EM algorithm: only separate optimization of g(x, δ) and h z (y; x, ψ z ) is required weighted logistic regression weighted binomial logistic regression Both can be implemented in standard statistical software Blair and Imai (Princeton) List Experiments Political Methodology Seminar 16 / 32

Empirical Application: Racial Prejudice in the US Kuklinski, Cobb, and Gilens (1997) analyze the 1991 National Race and Politics survey with the difference-in-means estimator Finding: Southern whites are more prejudiced against blacks than non-southern whites no evidence for the New South The limitation acknowledged by the authors: So far our discussion has implicitly assumed that the higher level of prejudice among white southerners results from something uniquely southern, what many would call southern culture. This assumption could be wrong. If white southerners were older, less educated, and the like characteristics normally associated with greater prejudice then demographics would explain the regional difference in racial attitudes, leaving culture as little more than a small and relatively insignificant residual. Need for a multivariate regression analysis Blair and Imai (Princeton) List Experiments Political Methodology Seminar 17 / 32

Results of the Multivariate Analysis Logistic regression model for sensitive item Binomial regression model for non-sensitive item (not shown) Little over-dispersion Likelihood ratio test supports the constrained model Nonlinear Least Maximum Likelihood Squares Constrained Unconstrained Variables est. s.e. est. s.e. est. s.e. Intercept 7.084 3.669 5.508 1.021 6.226 1.045 South 2.490 1.268 1.675 0.559 1.379 0.820 Age 0.026 0.031 0.064 0.016 0.065 0.021 Male 3.096 2.828 0.846 0.494 1.366 0.612 College 0.612 1.029 0.315 0.474 0.182 0.569 The original conclusion is supported Standard errors are smaller for ML estimator Blair and Imai (Princeton) List Experiments Political Methodology Seminar 18 / 32

Estimated Proportion of Prejudiced Whites 0.5 South South South 0.4 Estimated Proportion 0.3 0.2 0.1 Non South Non South Non South 0.0 0.1 Difference in Means Nonlinear Least Squares Maximum Likelihood Regression adjustments and MLE yield more efficient estimates Blair and Imai (Princeton) List Experiments Political Methodology Seminar 19 / 32

Simulation Evidence Bias Root Mean Squared Error Coverage of 90% Confidence Intervals South 0.1 0.0 0.1 0.2 0.3 ML NLS 0.0 0.5 1.0 1.5 2.0 NLS ML 0.80 0.85 0.90 0.95 1.00 ML NLS 1000 2000 3000 4000 5000 1000 2000 3000 4000 5000 1000 2000 3000 4000 5000 Age 0.1 0.0 0.1 0.2 0.3 0.0 0.5 1.0 1.5 2.0 0.80 0.85 0.90 0.95 1.00 1000 2000 3000 4000 5000 1000 2000 3000 4000 5000 1000 2000 3000 4000 5000 Blair and Imai (Princeton) List Experiments Political Methodology Seminar 20 / 32 0

The Design with Multiple Sensitive Items The 1991 National Race and Politics Survey includes another treatment group with the following sensitive item (4) "black leaders asking the government for affirmative action" Use of the same non-sensitive items permits joint-modeling Same assumptions: No Design Effect and No Liars Extension to the design with K sensitive items: h(y; x, ψ) = Pr(Y i (0) = y X i = x) g t (x, y, δ ty ) = Pr(Z i,j+t (0) = 1 Y i (0) = y, X i = x) for each t = 1,..., K The EM algorithm for the ML estimation Blair and Imai (Princeton) List Experiments Political Methodology Seminar 21 / 32

Results of Multivariate Analysis with Joint Modeling Multivariate analysis results: Sensitive Items Non-Sensitive Items Black Family Affirmative Action Variables est. s.e. est. s.e. est. s.e. Intercept 6.886 1.357 4.203 1.137 1.296 0.127 Male 0.782 0.524 0.083 0.407 0.262 0.073 College 0.356 0.494 0.445 0.386 0.543 0.074 Age 0.669 0.169 0.431 0.129 0.025 0.024 South 1.663 0.580 1.332 0.542 0.228 0.088 Y i (0) 0.424 0.309 0.890 0.248 Results are consistent with the No New-South finding Blair and Imai (Princeton) List Experiments Political Methodology Seminar 22 / 32

Estimated Proportion of Prejudiced Whites 0.9 0.8 South Estimated Proportion / Difference in Proportions 0.7 0.6 0.5 0.4 0.3 0.2 0.1 South Non South Difference Non South Difference 0.0 0.1 Black Family Affirmative Action Blair and Imai (Princeton) List Experiments Political Methodology Seminar 23 / 32

The Design with Non-Sensitive Items Asked Directly Motivation: Indirect questioning = loss of information Recoup lost efficiency by asking non-sensitive items directly (for the control group) Corstange (2009) proposes a multivariate regression model based on separate logistic regression for each item (LISTIT) Concern of design effect raised by Flavin and Keane (2010) Likelihood must be based on Poisson-Binomial instead of Binomial Develop the EM algorithm for reliable ML estimation Also extend the NLS estimator to this design Blair and Imai (Princeton) List Experiments Political Methodology Seminar 24 / 32

Simulation Evidence 3 non sensitive items equal probabilities unequal probabilities 4 non sensitive items symmetric probabilities skewed probabilities Bias 0.00 0.10 0.20 LISTIT NLS MLE 0.0 0.1 0.2 0.0 0.1 0.2 0.0 0.1 0.2 500 1000 1500 500 1000 1500 500 1000 1500 500 1000 1500 Root Mean Squared Error 0 2 4 6 LISTIT NLS MLE 0 2 4 6 0 2 4 6 0 2 4 6 500 1000 1500 500 1000 1500 500 1000 1500 500 1000 1500 Coverage of 90% Confidence Intervals 0.90 0.95 1.00 LISTIT NLS MLE 0.90 0.95 1.00 0.90 0.95 1.00 0.90 0.95 1.00 500 1000 1500 500 1000 1500 500 1000 1500 500 1000 1500 Sample Size Sample Size Sample Size Sample Size Blair and Imai (Princeton) List Experiments Political Methodology Seminar 25 / 32

When Can List Experiments Fail? Recall the two assumptions: 1 No Design Effect: The inclusion of the sensitive item does not affect answers to non-sensitive items 2 No Liars: Answers about the sensitive item are truthful Design Effect: Respondents evaluate non-sensitive items relative to sensitive item Lies: Ceiling effect: too many yeses for non-sensitive items Floor effect: too many noes for non-sensitive items Both types of failures are difficult to detect Importance of choosing non-sensitive items Question: Can these failures be addressed statistically? Blair and Imai (Princeton) List Experiments Political Methodology Seminar 26 / 32

Hypothesis Test for Detecting List Experiment Failures Under the null hypothesis of no design effect and no liars, we π 1 = Pr(type = (y, 1)) = Pr(Y i y T i = 0) Pr(Y i y T i = 1) 0 π 0 = Pr(type = (y, 0)) = Pr(Y i y T i = 1) Pr(Y i < y T i = 0) 0 Alternative hypothesis: At least one is negative A multivariate one-sided LR test for each t = 0, 1 ˆλ t = min(ˆπ t π t ) Σ 1 π t (ˆπ t π t ), subject to π t 0, t where ˆλ t follows a mixture of χ 2 Difficult to characterize least favorable values under the joint null Bonferroni correction: Reject the joint null if min(ˆp 0, ˆp 1 ) α/2 Failure to reject the null may arise from the lack of power We use the Generalized Moment Selection to improve power Blair and Imai (Princeton) List Experiments Political Methodology Seminar 27 / 32

Statistical Power of the Proposed Test Expected Number of Yeses to 3 Non sensitive Items E(Yi(0)) 0.75 1.5 2.25 0.1 Statistical Power 0.0 0.25 0.5 0.75 1.0 n=500 n=1000 n=2000 0.0 0.25 0.5 0.75 1.0 n=500 n=1000 n=2000 0.0 0.25 0.5 0.75 1.0 n=500 n=1000 n=2000 0.6 0.3 0.0 0.3 0.6 0.6 0.3 0.0 0.3 0.6 0.6 0.3 0.0 0.3 0.6 Probability of Answering Yes to the Sensitive Item Pr(Zi, J+1(0)) 0.75 0.5 0.25 Statistical Power Statistical Power Statistical Power 0.25 0.5 0.75 0.25 0.5 0.75 0.25 0.5 0.75 0.0 1.0 0.0 1.0 0.0 1.0 0.6 0.3 0.0 0.3 0.6 0.6 0.3 0.0 0.3 0.6 0.0 0.25 0.5 0.75 1.0 0.0 0.25 0.5 0.75 1.0 0.0 0.25 0.5 0.75 1.0 0.6 0.3 0.0 0.3 0.6 0.6 0.3 0.0 0.3 0.6 0.0 0.25 0.5 0.75 1.0 0.0 0.25 0.5 0.75 1.0 0.0 0.25 0.5 0.75 1.0 0.6 0.3 0.0 0.3 0.6 0.6 0.3 0.0 0.3 0.6 1.0 0.6 0.3 0.0 0.3 0.6 0.6 0.3 0.0 0.3 0.6 0.6 0.3 0.0 0.3 0.6 Blair and Imai (Princeton) List Experiments Political Methodology Seminar 28 / 32 1.0 1.0

The Racial Prejudice Data Revisited Let s look at the black family item Did the negative proportion arise by chance? Observed Data Estimated Proportion of Control Treatment Respondent Types Response counts prop. counts prop. ˆπ y0 s.e. ˆπ y1 s.e. 0 8 1.4% 19 3.0% 3.0% 0.7 1.7% 0.8 1 132 22.4 123 19.7 21.4 1.7 1.0 2.4 2 222 37.7 229 36.7 35.7 2.6 2.0 2.8 3 227 38.5 219 35.1 33.1 2.2 5.4 0.9 4 34 5.4 Total 589 624 93.2 6.8 Minimum p-value: 0.022 Fail to reject the null with α = 0.05 For the affirmative action item, p-value is 0.394 Blair and Imai (Princeton) List Experiments Political Methodology Seminar 29 / 32

Modeling Ceiling and Floor Effects Potential liars: Y i Treatment group Control group 4 (3,1) 3 (2,1) (3,0) (3,1) (3,1) (3,0) 2 (1,1) (2,0) (2,1) (2,0) 1 (0,1) (1,0) (1,1) (1,0) 0 (0,0) (0,1) (0,1) (0,0) The above test does not detect these liars so long as there is no design effect Proposed strategy: model ceiling and/or floor effects under an additional assumption Identification assumption: conditional independence between items given covariates ML estimation can be extended to this situation Blair and Imai (Princeton) List Experiments Political Methodology Seminar 30 / 32

Ceiling and Floor Effects for the Affirmative Action Item Both Ceiling Ceiling Effects Alone Floor Effects Alone and Floor Effects Variables est. s.e. est. s.e. est. s.e. Intercept 1.291 0.558 1.251 0.501 1.245 0.502 Age 0.294 0.101 0.314 0.092 0.313 0.092 College 0.345 0.336 0.605 0.298 0.606 0.298 Male 0.038 0.346 0.088 0.300 0.088 0.300 South 1.175 0.480 0.682 0.335 0.681 0.335 Prop. of liars Ceiling 0.0002 0.0017 0.0002 0.0016 Floor 0.0115 0.0000 0.0115 0.0000 Essentially no ceiling and floor effects Main conclusion for the affirmative action item seems robust Blair and Imai (Princeton) List Experiments Political Methodology Seminar 31 / 32

Concluding Remarks List experiments as alternative to randomized response method Advantages: 1 some empirical evidence that list experiments work better 2 easy to use, easy to understand Disadvantages: 1 loss of information = inefficiency 2 difficult to explore multivariate relationship 3 identification assumptions may be violated Our proposed methods partially overcome the difficulties: multivariate regression analysis statistical test for detecting list experiment failures adjusting for ceiling and floor effects The importance of design: choice of non-sensitive items Future work: Applications in Afghanistan and Nigeria Extension to hierarchical models for identifying spatial variation Blair and Imai (Princeton) List Experiments Political Methodology Seminar 32 / 32