
A Simple Way to Calculate Confidence Intervals for Partially Identified Parameters

By Tiemen Woutersen

Draft, September 2006

Comments are welcome: woutersen@jhu.edu, or Johns Hopkins University, Department of Economics, 3400 North Charles Street, Baltimore, MD 21218.

Abstract. This note proposes a new way to calculate confidence intervals for partially identified parameters.

Keywords: Bounds, Partial Identification, Confidence Intervals

1. Introduction

Confidence intervals for point identified parameters have been well studied in econometrics; see, for example, Newey and McFadden (1994) for an overview. More recently, confidence intervals have been developed for partially identified parameters. Horowitz and Manski (1998) develop interval estimates that asymptotically cover the entire identified region with fixed probability. Chernozhukov, Hong and Tamer (2003) extend this approach by formulating the problem of covering the entire identified region as a minimization problem. Imbens and Manski (2004) derive the confidence interval that covers each element of the identified region with fixed probability; this confidence interval is in general shorter than the earlier ones.

In this note, we develop an easier and perhaps more intuitive way to calculate the confidence interval that covers each element of the identified region with fixed probability. We also extend earlier work by allowing different estimators (e.g. a least squares and an instrumental variable estimator) or different datasets to be used to estimate the upper and the lower bound.

An easy way to calculate a confidence interval for a point identified parameter is to estimate the (asymptotic) distribution function of the parameter and then calculate its $\tau$-th and $(1-\tau)$-th quantiles. Similarly, one may be able to approximate the (asymptotic) distribution of both the lower and the upper bound of a scalar parameter. In general, the confidence interval for the lower bound contains smaller values of the parameter than the confidence interval for the upper bound. In this note, we propose to average the two confidence intervals by averaging the distribution functions of the lower and the upper bound. After averaging the distribution functions, the confidence interval is constructed as if for a point identified parameter, i.e. by calculating the $\tau$-th and $(1-\tau)$-th quantiles.

The resulting confidence interval has the same asymptotic coverage properties as the confidence interval of Imbens and Manski (2004) but is much easier to calculate. In particular, this simple procedure can be implemented by applying the bootstrap method to the lower and upper bound, and no correction term is needed to ensure uniform convergence. Imbens and Manski (2004) need a correction term to ensure uniform convergence in case the parameter is point identified, and they also assume that the estimator of the lower bound coincides with the estimator of the upper bound in case of point identification. The simple procedure of averaging the distribution functions does not need a correction term if the estimators of the lower and upper bound coincide: averaging two identical distribution functions still yields the desired distribution function and, in this point identified case, the usual confidence interval for a point identified parameter. Another advantage of the simple procedure is that the estimators of the lower and upper bound can be different (e.g. an ordinary least squares estimator for the upper bound and an instrumental variable estimator for the lower bound, each estimator using a different dataset). Moreover, parameter-by-parameter confidence intervals can be constructed for vector-valued parameters.
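To make the averaging step concrete, the following is a minimal sketch (not code from the paper) of a percentile-bootstrap implementation: pooling an equal number of bootstrap draws of the lower- and upper-bound estimators is numerically the same as averaging their two empirical distribution functions, after which the interval endpoints are the $(1-\alpha)/2$ and $(1+\alpha)/2$ quantiles of the pooled draws. The function name and the NumPy-based interface are illustrative assumptions.

```python
import numpy as np

def averaged_cdf_ci(boot_lower, boot_upper, alpha=0.95):
    """Confidence interval from the averaged bootstrap distribution of the bounds.

    boot_lower, boot_upper : equal-length arrays of bootstrap replications of
        the lower- and upper-bound estimators.
    alpha : intended coverage probability for each element of the identified set.
    """
    boot_lower = np.asarray(boot_lower, dtype=float)
    boot_upper = np.asarray(boot_upper, dtype=float)
    if boot_lower.shape != boot_upper.shape:
        raise ValueError("need the same number of draws for both bounds")
    # Pooling equal numbers of draws averages the two empirical CDFs.
    pooled = np.concatenate([boot_lower, boot_upper])
    lower_end, upper_end = np.quantile(pooled, [(1 - alpha) / 2, (1 + alpha) / 2])
    return lower_end, upper_end
```

With point identification the two sets of draws coincide, and the pooled quantiles reduce to an ordinary percentile-bootstrap interval, mirroring the remark above that no correction term is needed in that case.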

2. Confidence Interval

For now, let the parameter of interest $\theta$ and its lower and upper bound, $\theta_l$ and $\theta_u$, be scalars. The following assumption is identical to Imbens and Manski (2004, Assumption 1(i)).

Assumption 1: There are estimators for the lower and the upper bound, $\hat\theta_l$ and $\hat\theta_u$, that satisfy

$$\sqrt{N}\begin{pmatrix} \hat\theta_l-\theta_l \\ \hat\theta_u-\theta_u \end{pmatrix} \xrightarrow{d} N\left( \begin{pmatrix} 0\\0 \end{pmatrix}, \begin{pmatrix} \sigma_l^2 & \rho\sigma_l\sigma_u \\ \rho\sigma_l\sigma_u & \sigma_u^2 \end{pmatrix} \right)$$

uniformly in $P \in \mathcal{P}$, and there are estimators $\hat\sigma_l$, $\hat\sigma_u$, and $\hat\rho$ for $\sigma_l$, $\sigma_u$, and $\rho$ that converge to their population values uniformly in $P \in \mathcal{P}$ ($\rho$ may be equal to one in absolute value, as in the case where the width of the identification region is known).

Imbens and Manski (2004) need their Assumptions 1(ii)-(iii) so that the estimate of the difference between the lower and upper bound is well behaved. Here, we do not estimate that difference and do not need to make those assumptions. Moreover, we also do not need the assumption that $\theta_l = f(P, \lambda_l)$, where $\lambda_l$ is a quantity that is known only to belong to a specified set. For example, the estimators of the lower and upper bound could be different or could be based on different datasets, without reference to some $\lambda_l$.

Let $CI^\theta_\alpha$ denote the symmetric confidence interval with coverage probability $\alpha$. Let $\Phi(\cdot)$ denote the distribution function of the standard normal distribution. Let $\underline{\theta}$ and $\overline{\theta}$ be chosen such that

$$\Phi\!\left(\frac{\underline{\theta}-\hat\theta_l}{\tilde\sigma_l}\right) + \Phi\!\left(\frac{\underline{\theta}-\hat\theta_u}{\tilde\sigma_u}\right) = 1-\alpha \qquad (1)$$

$$\Phi\!\left(\frac{\overline{\theta}-\hat\theta_l}{\tilde\sigma_l}\right) + \Phi\!\left(\frac{\overline{\theta}-\hat\theta_u}{\tilde\sigma_u}\right) = 1+\alpha \qquad (2)$$

where $\tilde\sigma_l = \sqrt{\hat\sigma_l^2/N}$ and $\tilde\sigma_u = \sqrt{\hat\sigma_u^2/N}$. The confidence interval can then be constructed as

$$CI^\theta_\alpha = [\underline{\theta}, \overline{\theta}]. \qquad (3)$$
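Because $\Phi$ is continuous and strictly increasing, the left-hand side of (1) and (2) increases from 0 to 2 in $\theta$, so each equation has a unique root whenever $0 < \alpha < 1$. The sketch below solves (1) and (2) with a bracketing root-finder under the normal approximation of Assumption 1; the function name and the bracketing rule are illustrative assumptions, not part of the paper.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def ci_from_normal_approx(theta_l_hat, theta_u_hat, sigma_l_hat, sigma_u_hat, n_obs, alpha=0.95):
    """Solve equations (1) and (2) for the interval endpoints [theta_lower, theta_upper]."""
    se_l = sigma_l_hat / np.sqrt(n_obs)   # tilde sigma_l
    se_u = sigma_u_hat / np.sqrt(n_obs)   # tilde sigma_u

    def h(theta):
        # Left-hand side of (1) and (2): strictly increasing in theta, from 0 to 2.
        return norm.cdf((theta - theta_l_hat) / se_l) + norm.cdf((theta - theta_u_hat) / se_u)

    # Brackets far enough in the tails so that h crosses both 1 - alpha and 1 + alpha.
    spread = 10 * max(se_l, se_u) + abs(theta_u_hat - theta_l_hat)
    lo = min(theta_l_hat, theta_u_hat) - spread
    hi = max(theta_l_hat, theta_u_hat) + spread

    theta_lower = brentq(lambda t: h(t) - (1 - alpha), lo, hi)
    theta_upper = brentq(lambda t: h(t) - (1 + alpha), lo, hi)
    return theta_lower, theta_upper

# Example call with illustrative numbers:
# ci_from_normal_approx(0.40, 0.55, sigma_l_hat=1.0, sigma_u_hat=1.2, n_obs=500)
```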

The following theorem gives the uniform coverage result.

Theorem 1: Let Assumption 1 hold and let $\alpha < 1$. Then

$$\inf_{P\in\mathcal{P}} \Pr(\theta \in CI^\theta_\alpha) \geq \alpha.$$

Proof: See appendix.

The following lemma is central to this note.

Lemma 1: Let $Z \sim N(0,1)$, let $\alpha, c \in \mathbb{R}$, and let $0 \leq \alpha \leq 1$. Let $\Phi(\cdot)$ denote the cumulative distribution function of $Z$. Then

$$P\bigl(1-\alpha \leq \Phi(Z)+\Phi(Z+c) \leq 1+\alpha\bigr) = \alpha.$$

It is easy to see that the lemma is true for $c = 0$ and $c \to \pm\infty$. Consider $c = 0$:

$$P\bigl(1-\alpha \leq 2\Phi(Z) \leq 1+\alpha\bigr) = P\!\left(\frac{1-\alpha}{2} \leq U \leq \frac{1+\alpha}{2}\right),$$

where $U$ is uniformly distributed on the interval $[0,1]$, so that

$$P\bigl(1-\alpha \leq 2\Phi(Z) \leq 1+\alpha\bigr) = \frac{1+\alpha}{2} - \frac{1-\alpha}{2} = \alpha.$$

For $c \to \infty$, we have

$$P\bigl(1-\alpha \leq \Phi(Z)+\Phi(Z+c) \leq 1+\alpha\bigr) \to P\bigl(1-\alpha \leq \Phi(Z)+1 \leq 1+\alpha\bigr) = P\bigl(0 \leq \Phi(Z) \leq \alpha\bigr) = \alpha.$$
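The two limiting cases above already suggest that the probability does not depend on $c$. For completeness, here is a short derivation for general $c$ (a sketch under the lemma's assumptions with $0 < \alpha < 1$, not reproduced from the paper's appendix); it uses only the symmetry of the standard normal distribution.

```latex
% Sketch of Lemma 1 for general c; assumes 0 < alpha < 1 so both roots below exist.
\begin{align*}
g(z) &:= \Phi(z) + \Phi(z+c) \quad\text{is continuous and strictly increasing from } 0 \text{ to } 2,\\
g(-z-c) &= \Phi(-z-c) + \Phi(-z) = \bigl(1-\Phi(z+c)\bigr) + \bigl(1-\Phi(z)\bigr) = 2 - g(z).\\
\text{Let } z_1 &\text{ solve } g(z_1)=1-\alpha \text{ and } z_2 \text{ solve } g(z_2)=1+\alpha.
  \text{ Then } g(-z_1-c) = 2-(1-\alpha) = 1+\alpha, \text{ so } z_2 = -z_1 - c.\\
P\bigl(1-\alpha \le g(Z) \le 1+\alpha\bigr) &= \Phi(z_2)-\Phi(z_1) = \Phi(-z_1-c)-\Phi(z_1)
  = 1-\Phi(z_1+c)-\Phi(z_1) = 1-g(z_1) = \alpha.
\end{align*}
```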

For a fixed data generating process, one can often use the bootstrap to calculate $CI^\theta_\alpha$. For example, the bootstrap can be used if the estimators of the lower and upper bound can be written as the sum of an influence function, in the terminology of Newey and McFadden (1994), and a term that converges in probability to zero, i.e.

$$\hat\theta_l = \theta_l + \frac{1}{N}\sum_i g_l(X_i) + o_p(N^{-1/2}), \qquad \hat\theta_u = \theta_u + \frac{1}{N}\sum_i g_u(X_i) + o_p(N^{-1/2})$$

for some $g_l(\cdot)$ and $g_u(\cdot)$, with the $X_i$ independently and identically distributed (Horowitz, 2001, Theorem 2.2). The bootstrap estimates the tail probabilities consistently in this case and can be used to approximate the tail probabilities of $\hat\theta_l$ and $\hat\theta_u$ as well as the average of the two. If, in addition, the assumptions of Theorem 3.1 of Horowitz (2001) hold, then the bootstrap yields an asymptotic refinement over regular first-order asymptotics.

Imbens and Manski (2004) assume that $\theta_l = f(P, \lambda_l)$ and $\theta_u = f(P, \lambda_u)$, where $\lambda_l$ and $\lambda_u$ are the smallest and largest elements of a set. For example, their Section 3 discusses an example in which the value of the dependent variable is observed with probability $p$, while the mean value of the unobserved outcomes cannot be estimated and is assumed to lie between zero and one, $\lambda \in [0,1]$. If $p = 1$, i.e. we observe all outcomes, then the lower and upper bound coincide with the sample average.

Assumption 2: There are estimators for the lower and the upper bound, $\hat\theta_l$ and $\hat\theta_u$, that satisfy

$$\sqrt{N}\begin{pmatrix} \hat\theta_l-\theta_l \\ \hat\theta_u-\theta_u \end{pmatrix} \xrightarrow{d} N\left( \begin{pmatrix} 0\\0 \end{pmatrix}, \begin{pmatrix} \sigma_l^2 & \rho\sigma_l\sigma_u \\ \rho\sigma_l\sigma_u & \sigma_u^2 \end{pmatrix} \right),$$

and there are estimators $\hat\sigma_l$, $\hat\sigma_u$, and $\hat\rho$ for $\sigma_l$, $\sigma_u$, and $\rho$ that converge in probability to their population values ($\rho$ may be equal to one in absolute value, as in the case where the width of the identification region is known).

Theorem 2: Let Assumption 2 hold and let $\alpha < 1$. Then

$$\Pr(\theta \in CI^\theta_\alpha) \geq \alpha.$$

Proof: See appendix.

Theorem 2 allows us to use the bootstrap. In particular, the bootstrap can be used under the same conditions as before, and parameter-by-parameter confidence intervals can be constructed for vector-valued parameters.
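As a concrete illustration of the bootstrap route that Theorem 2 permits, the sketch below simulates the missing-data example just described. The specific values for $p$ and the outcome distribution are illustrative assumptions, as is reading the averaged distribution functions off the pooled bootstrap draws: the bound estimators impute 0 or 1 for the missing outcomes, the nonparametric bootstrap approximates their distributions, and the interval endpoints are the pooled quantiles.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: outcome Y in [0, 1], observed (D = 1) with probability p.
n, p, alpha, n_boot = 500, 0.8, 0.95, 2000
y = rng.beta(2.0, 3.0, size=n)
d = rng.binomial(1, p, size=n)

def bound_estimates(y, d):
    """Estimated bounds on E[Y]: impute 0 (lower) or 1 (upper) for missing outcomes."""
    return np.mean(y * d), np.mean(y * d + (1 - d))

boot_lower = np.empty(n_boot)
boot_upper = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)          # nonparametric bootstrap resample
    boot_lower[b], boot_upper[b] = bound_estimates(y[idx], d[idx])

# Average the two bootstrap distribution functions by pooling the draws,
# then take the (1 - alpha)/2 and (1 + alpha)/2 quantiles as the endpoints.
pooled = np.concatenate([boot_lower, boot_upper])
ci = np.quantile(pooled, [(1 - alpha) / 2, (1 + alpha) / 2])
print("estimated bounds:", bound_estimates(y, d))
print("confidence interval:", ci)
```

If $p = 1$, the two bound estimators coincide and the interval collapses to an ordinary percentile-bootstrap interval for the sample average, as noted in the example above.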

3. Appendix

Lemma 1: Let $Z \sim N(0,1)$, let $\alpha, c \in \mathbb{R}$, and let $0 \leq \alpha \leq 1$. Let $\Phi(\cdot)$ denote the cumulative distribution function of $Z$. Then

$$P\bigl(1-\alpha \leq \Phi(Z)+\Phi(Z+c) \leq 1+\alpha\bigr) = \alpha.$$

Theorem 1: Let Assumption 1 hold and let $\alpha < 1$. Then

$$\inf_{P\in\mathcal{P}} \Pr(\theta \in CI^\theta_\alpha) \geq \alpha.$$

Proof: Note that, by construction, $\underline{\theta} \leq \overline{\theta}$. We first consider $\theta_0 = \theta_l$. For simplicity, we first prove the theorem under the stronger assumption that

$$\sqrt{N}\begin{pmatrix} \hat\theta_l-\theta_l \\ \hat\theta_u-\theta_u \end{pmatrix} \sim N\left( \begin{pmatrix} 0\\0 \end{pmatrix}, \begin{pmatrix} \sigma_l^2 & \rho\sigma_l\sigma_u \\ \rho\sigma_l\sigma_u & \sigma_u^2 \end{pmatrix} \right)$$

for known $\sigma_l$, $\sigma_u$, and $\rho = 1$. Note that $\theta$ would not be an element of the confidence interval if $H(\theta) < 1-\alpha$ or $H(\theta) > 1+\alpha$, where $H(\theta)$ denotes the left-hand side of (1) and (2) evaluated at $\theta$ (with the known $\sigma_l$ and $\sigma_u$ in place of the estimators), i.e.

$$H(\theta) = \Phi\!\left(\frac{\sqrt{N}(\theta-\hat\theta_l)}{\sigma_l}\right) + \Phi\!\left(\frac{\sqrt{N}(\theta-\hat\theta_u)}{\sigma_u}\right).$$

In particular,

$$\Pr(\theta_l \in CI^\theta_\alpha) = P\bigl(1-\alpha \leq H(\theta_l) \leq 1+\alpha\bigr) = P\!\left(1-\alpha \leq \Phi\!\left(\frac{\sqrt{N}(\theta_l-\hat\theta_l)}{\sigma_l}\right) + \Phi\!\left(\frac{\sqrt{N}(\theta_l-\theta_u)}{\sigma_u} + \frac{\sqrt{N}(\theta_u-\hat\theta_u)}{\sigma_u}\right) \leq 1+\alpha\right).$$

For any $N$, $Z = \sqrt{N}(\theta_l-\hat\theta_l)/\sigma_l = \sqrt{N}(\theta_u-\hat\theta_u)/\sigma_u$ has a standard normal distribution. Therefore, Lemma 1 applies, so that

$$\Pr(\theta_l \in CI^\theta_\alpha) = P\!\left(1-\alpha \leq \Phi(Z) + \Phi\!\left(\frac{\sqrt{N}(\theta_l-\theta_u)}{\sigma_u} + Z\right) \leq 1+\alpha\right) = \alpha.$$

Similarly, $\Pr(\theta_u \in CI^\theta_\alpha) = \alpha$. Since $\Pr(\theta_0 \in CI^\theta_\alpha)$ is minimized at $\theta_0 = \theta_l$ or $\theta_0 = \theta_u$, we have $\Pr(\theta \in CI^\theta_\alpha) \geq \alpha$.

Now let Assumption 1 hold and let $\alpha < 1$. We again first consider $\theta_0 = \theta_l$. Then

$$\inf_{P\in\mathcal{P}} \Pr(\theta_l \in CI^\theta_\alpha) = \inf_{P\in\mathcal{P}} P\bigl(1-\alpha \leq H(\theta_l) \leq 1+\alpha\bigr) \geq \alpha.$$

As above, $\Pr(\theta_0 \in CI^\theta_\alpha)$ is minimized at $\theta_0 = \theta_l$ or $\theta_0 = \theta_u$, so that

$$\inf_{P\in\mathcal{P}} \Pr(\theta \in CI^\theta_\alpha) \geq \alpha.$$
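As an illustration of the coverage statement, here is a quick Monte Carlo check (a sketch under an assumed data-generating process in which the width of the identification region is known, so that $\rho = 1$; none of the specific numbers come from the paper). It estimates the empirical coverage of the lower endpoint $\theta_l$, which by the argument above should be approximately $\alpha$.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

rng = np.random.default_rng(1)

# Illustrative DGP: X ~ N(mu, 1); identified set for theta is [mu, mu + width]
# with known width, so theta_l_hat = X_bar and theta_u_hat = X_bar + width (rho = 1).
mu, width, n, alpha, n_mc = 0.0, 0.3, 200, 0.95, 5000
cover_lower = 0

for r in range(n_mc):
    x = rng.normal(mu, 1.0, size=n)
    tl_hat, tu_hat = x.mean(), x.mean() + width
    se = x.std(ddof=1) / np.sqrt(n)        # same standard error for both bounds

    def h(t):
        # Left-hand side of (1) and (2), strictly increasing in t.
        return norm.cdf((t - tl_hat) / se) + norm.cdf((t - tu_hat) / se)

    lo = brentq(lambda t: h(t) - (1 - alpha), tl_hat - 10 * se, tu_hat + 10 * se)
    hi = brentq(lambda t: h(t) - (1 + alpha), tl_hat - 10 * se, tu_hat + 10 * se)
    cover_lower += (lo <= mu <= hi)        # coverage of the endpoint theta_l = mu

print("empirical coverage of theta_l:", cover_lower / n_mc)  # should be close to alpha
```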

References

[1] Chernozhukov, V., H. Hong, and E. Tamer (2003): "Parameter Set Inference in a Class of Econometric Models," unpublished manuscript, Department of Economics, Princeton University.

[2] Horowitz, J. L. (2001): "The Bootstrap," in Handbook of Econometrics, Vol. 5, ed. by J. J. Heckman and E. Leamer. Amsterdam: North-Holland.

[3] Horowitz, J. L., and C. Manski (1998): "Censoring of Outcomes and Regressors due to Survey Nonresponse: Identification and Estimation Using Weights and Imputations," Journal of Econometrics, 84, 37-58.

[4] Imbens, G. W., and C. F. Manski (2004): "Confidence Intervals for Partially Identified Parameters," Econometrica, 72, 1845-1857.

[5] Manski, C. (1990): "Nonparametric Bounds on Treatment Effects," American Economic Review Papers and Proceedings, 80, 319-323.

[6] Manski, C. (2003): Partial Identification of Probability Distributions, New York: Springer-Verlag.

[7] Newey, W. K. (1991): "Uniform Convergence in Probability and Stochastic Equicontinuity," Econometrica, 59, 1161-1167.

[8] Newey, W. K., and D. McFadden (1994): "Large Sample Estimation and Hypothesis Testing," in Handbook of Econometrics, Vol. 4, ed. by R. F. Engle and D. McFadden. Amsterdam: North-Holland.