Estimation and Inference for Set-identified Parameters Using Posterior Lower Probability


Toru Kitagawa
Department of Economics, University College London
28 July 2012

Abstract

In inference for set-identified parameters, Bayesian probability statements about unknown parameters do not coincide, even asymptotically, with frequentists' confidence statements. This paper aims to smooth out this disagreement from a robust Bayes perspective. I show that a class of prior distributions exists, with which the posterior inference statements drawn via the lower envelope (lower probability) of the class of posterior distributions asymptotically agree with frequentist confidence statements for the identified set. With this class of priors, the statistical decision problems, including the point and set estimation of the set-identified parameters, are analyzed under the posterior gamma-minimax criterion.

Keywords: Partial Identification, Bayesian Robustness, Belief Function, Imprecise Probability, Gamma-minimax, Random Set.

t.kitagawa@ucl.ac.uk. I thank Gary Chamberlain, Andrew Chesher, Siddhartha Chib, Larry Epstein, Jean-Pierre Florens, Guido Imbens, Hiro Kaido, Charles Manski, Ulrich Müller, Andriy Norets, Adam Rosen, Kevin Song, and Elie Tamer for their valuable discussions and comments. I also thank the seminar participants at Academia Sinica, Brown, Cornell, Cowles Conference 2011, EC , Harvard/MIT, Midwest Econometrics 2010, Northwestern, NYU, RES Conference 2011, Seoul National University, Simon Fraser University, and UBC for their helpful comments. All remaining errors are mine. Financial support from the ESRC through the ESRC Centre for Microdata Methods and Practice (CeMMAP) (grant number RES ) is gratefully acknowledged.

1 Introduction

In inferring identified parameters in a parametric setup, the Bayesian probability statements about unknown parameters are found to be similar, at least asymptotically, to the frequentist confidence statements about the true value of the parameters. In partial identification analyses initiated by Manski (1989, 1990, 2003, 2007), such asymptotic harmony between the two inference paradigms breaks down (Moon and Schorfheide (2011)). The Bayesian interval estimates for the set-identified parameter are shorter, even asymptotically, than the frequentist ones, and they asymptotically lie inside the frequentist confidence intervals. Frequentists might interpret this phenomenon, Bayesian over-confidence in their inferential statements, as being fictitious. Bayesians, on the other hand, might consider that the frequentist confidence statements, which apparently lack posterior probability interpretation, raise some interpretative difficulty once data are observed.

The primary aim of this paper is to smooth out the disagreement between the two schools of statistical inference by applying the perspective of a robust Bayes inference, where one can incorporate partial prior knowledge into posterior inference. While there is a variety of robust Bayes approaches, this paper focuses on a multiple prior Bayes analysis, where the partial prior knowledge, or the robustness concern against prior misspecification, is modeled with a class of priors (ambiguous belief). The Bayes rule is applied to each prior to form a class of posteriors. The posterior inference procedures considered in this paper operate on the class of posteriors by focusing on their lower and upper envelopes, the so-called posterior lower and upper probabilities.
When the parameters are not identified, the prior distribution of the model parameters can be decomposed into two components: one that can be updated by data (revisable prior knowledge) and one that can never be updated by data (unrevisable prior knowledge). Given that the ultimate goal of partial identification analysis is to establish a "domain of consensus" (Manski (2007)) among the set of assumptions that data are silent about, a natural way to incorporate this agenda into the robust Bayes framework is to design a prior class in such a way that it shares a single prior distribution for the revisable prior knowledge, but allows for arbitrary prior distributions for the unrevisable prior knowledge. Using this prior class as a prior input, this paper derives the posterior lower probability and investigates its analytical property. For an interval-identified parameter case, I also examine whether the inferential statements drawn via the posterior lower probability can asymptotically have any type of valid frequentist coverage probability in the partially identified setting.

Another question this paper examines is, with such a class of priors, how to formulate and solve statistical decision problems including point estimation of the set-identified parameters. I approach this question by adapting the posterior gamma-minimax analysis, which can be seen as a minimax analysis with the multiple posteriors, and demonstrate that the proposed prior class leads to an analytically tractable and numerically solvable formulation of the posterior gamma-minimax decision problem, provided that the identified set for the parameter of interest can be computed for each possible distribution of data.

1.1 Related Literature

Estimation and inference in partially identified models are a growing research area in the field of econometrics. From the frequentist perspective, Horowitz and Manski (2000) construct confidence intervals for an interval identified set. Imbens and Manski (2004) propose uniformly asymptotically valid confidence sets for an interval-identified parameter, which are further extended by Stoye (2009). Chernozhukov, Hong, and Tamer (2007) develop a way to construct asymptotically valid confidence sets for an identified set based on the criterion function approach, which can be applied to a wide range of partially identified models including moment inequality models. In relation to the criterion function approach, the literature on the construction of confidence sets by inverting test statistics includes, but is not limited to, Andrews and Guggenberger (2009), Andrews and Soares (2010), and Romano and Shaikh (2010). From the Bayesian perspective, Neath and Samaniego (1997), Poirier (1998), and Gustafson (2009, 2010) analyze how Bayesian updating performs when a model lacks identification.
Liao and Jiang (2010) conduct a Bayesian inference for moment inequality models, based on the pseudo-likelihood. Moon and Schorfheide (2011) compare the asymptotic properties of frequentist and Bayesian inferences for set-identified models. My robust Bayes analysis is motivated by Moon and Schorfheide's important findings on the asymptotic disagreement between the frequentist and Bayesian inferences. Epstein and Seo (2012) focus on a set-identified model of entry games with multiple equilibria, and provide an axiomatic argument that justifies a single-prior Bayesian inference for a set-identified parameter. The current paper does not intend to provide any normative argument as to whether one should proceed with a single prior or multiple priors in inferring non-identified parameters.

The analysis of lower and upper probabilities originates with Dempster (1966, 1967a, 1967b, 1968), in his fiducial argument of drawing posterior inferences without specifying a prior distribution. Dempster's influence appears in the belief function analysis of Shafer (1976, 1982) and the imprecise probability analysis of Walley (1991). In the context of robust Bayes analysis, the lower and upper probabilities have played important roles in measuring the global sensitivity of the posterior (Berger (1984), Berger and Berliner (1986)) and also in characterizing a class of priors/posteriors (DeRobertis and Hartigan (1981), Wasserman (1989, 1990), and Wasserman and Kadane (1990)). In econometrics, pioneering work using multiple priors was carried out by Chamberlain and Leamer (1976), and Leamer (1982), who obtained the bounds for the posterior mean of the regression coefficients when a prior varies over a certain class. None of these previous studies explicitly considered non-identified models. This paper, in contrast, focuses on non-identified models, and aims to clarify a link between the early idea of the lower and upper probabilities and a recent issue on inferences in set-identified models.

The posterior lower probability to be obtained in this paper is an infinite-order monotone capacity, or equivalently, a containment functional in the random set theory.
Beresteanu and Molinari (2008) and Beresteanu, Molchanov, and Molinari (2012) show the usefulness and wide applicability of the random set theory to a class of partially identified models by viewing observations as random sets, and the estimand (identified set) as its Aumann expectation. They propose an asymptotically valid frequentist inference procedure for the identified set by employing the central limit theorem applicable to the properly defined sum of random sets. Galichon and Henry (2006, 2009) and Beresteanu, Molchanov, and Molinari (2011) propose the use of infinite-order capacities in defining and inferring the identified set in the structural econometric model with multiple equilibria.

The robust Bayes analysis of this paper closely relates to the literature on non-additive measures and random sets, but the way that these theories enter into the analysis differs from these previous works in the following ways. First, the class of models to be considered is assumed to have well-defined likelihood functions, and the lack of identification is modeled in terms of the "data-independent flat regions" of the likelihood. Ambiguity is not explicitly modeled at the level of observations, but instead ambiguity for the parameters is introduced through the absence of prior knowledge on each flat region of the likelihood. Second, I obtain the identified set as random sets, whose probability law is represented by the posterior lower probability. Here, the source of probability that induces the random identified set is the posterior uncertainty for the identifiable parameters, not the sampling probability of the observations. Third, the inferential statements to be proposed in the paper are made conditional on data, and they do not invoke any large-sample approximations.

The decision theoretic analysis in this paper employs the posterior gamma-minimax criterion, which leads to a decision that minimizes the worst case posterior risk over the class of posteriors. The gamma-minimax decision analysis often becomes challenging, both analytically and numerically, and the existing analyses are limited to rather simple parametric models with a certain choice of prior class (Betro and Ruggeri (1992), Chamberlain (2000), and Vidakovic (2000)). The specified prior class, in contrast, offers a general and feasible way to solve the posterior gamma-minimax decision problem, provided that the identified set for the parameter of interest can be computed for each of the identified parameter values. In a recent study by Song (2012), point estimation for an interval-identified parameter from the local asymptotic minimax approach is considered.

1.2 Plan of the Paper

The rest of the paper is organized as follows. In Section 2, the main results of this paper are presented using a simple example of missing data. Section 3 introduces the general framework, where I construct a class of prior distributions that can contain arbitrary unrevisable prior knowledge.
I then derive the posterior lower and upper probabilities. Statistical decision analyses with multiple priors are examined in Section 4. In Section 5, how to construct the posterior credible regions based on the posterior lower probability is discussed and their large-sample behaviors are examined in an interval-identified parameter case. Proofs and lemmas are provided in Appendix A.

2 An Illustration: A Missing Data Example

This section illustrates the main results of the paper using an example of missing data (Manski (1989)). Let $Y \in \{1, 0\}$ be the binary outcome of interest (e.g., a worker is employed or not). Let $W \in \{1, 0\}$ be an indicator of whether $Y$ is observed ($W = 1$) or not ($W = 0$) (i.e., the subject responded or not). Data are given by a random sample of size $n$, $x = \{(Y_i W_i, W_i) : i = 1, \dots, n\}$.

The starting point of the analysis is to specify a parameter vector $\theta \in \Theta$ that pins down the distribution of the data and the parameter of interest. Here, $\theta$ can be specified by a vector of four probability masses: $\theta = (\theta_{yw})$, $\theta_{yw} \equiv \Pr(Y = y, W = w)$, $y = 1, 0$, and $w = 1, 0$. The observed data likelihood for $\theta$ is written as

$$p(x|\theta) = \theta_{11}^{n_{11}} \, \theta_{01}^{n_{01}} \, [\theta_{10} + \theta_{00}]^{n_{mis}},$$

where $n_{11} = \sum_{i=1}^{n} Y_i W_i$, $n_{01} = \sum_{i=1}^{n} (1 - Y_i) W_i$, and $n_{mis} = \sum_{i=1}^{n} (1 - W_i)$. This likelihood function depends on $\theta$ only through the three probability masses, $\phi = (\phi_{11}, \phi_{01}, \phi_{mis}) \equiv (\theta_{11}, \theta_{01}, \theta_{10} + \theta_{00})$, $\phi \in \Phi$, no matter what the observations are, so that the likelihood for $\theta$ has "data-independent" flat regions, which are expressed as a set-valued map of $\phi$,

$$\Gamma(\phi) \equiv \{\theta \in \Theta : \theta_{11} = \phi_{11}, \; \theta_{01} = \phi_{01}, \; \theta_{10} + \theta_{00} = \phi_{mis}\}.$$

The parameter of interest is the mean of $Y$, which is written as a function of $\theta$, $\eta \equiv \Pr(Y = 1) = \theta_{11} + \theta_{10}$. The identified set of $\eta$, $H(\phi)$, as the set-valued map of $\phi$ is defined by the range of $\eta$ when $\theta$ varies over $\Gamma(\phi)$,

$$H(\phi) = [\phi_{11}, \; \phi_{11} + \phi_{mis}],$$

which are the Manski (1989) bounds of $\Pr(Y = 1)$.

The standard Bayes inference for $\eta$ would proceed as follows: specify a prior of $\theta$, update it using the Bayes rule, and marginalize the posterior of $\theta$ to $\eta$. If the likelihood for $\theta$ has data-independent flat regions as represented by $\{\Gamma(\phi) : \phi \in \Phi\}$, then the prior for $\theta$ conditional on $\{\theta \in \Gamma(\phi)\}$ (i.e., belief on how the proportion of missing observations, $\phi_{mis}$, is divided into $\theta_{10}$ and $\theta_{00}$) will never be updated by data. Consequently, the posterior of $\theta$ and possibly that of $\eta$ become sensitive to the specification of such conditional priors of $\theta$ given $\phi$. The robust Bayes procedure considered in this paper aims to make the posterior inference free from such sensitivity concerns by introducing multiple priors for $\theta$.

The way to construct a prior class is as follows. I first specify a single prior $\mu_\phi$ for the identified parameters $\phi$. In view of $\theta$, prior $\mu_\phi$ specifies how much prior belief should be assigned to each flat region $\Gamma(\phi)$ of the likelihood of $\theta$, whereas, depending on ways to allocate the assigned belief over $\theta \in \Gamma(\phi)$ (for each $\phi$), the implied prior for $\theta$ may differ. Therefore, by collecting all the possible ways of allocating the assigned belief over $\{\theta \in \Gamma(\phi)\}$ for each $\phi$, I can construct the following class of prior distributions of $\theta$,

$$\mathcal{M}(\mu_\phi) = \{\mu_\theta : \mu_\theta(\Gamma(B)) = \mu_\phi(B) \text{ for all } B \subset \Phi\},$$

where $\mu_\theta$ denotes a prior distribution for $\theta$. By applying the Bayes rule to each $\mu_\theta \in \mathcal{M}(\mu_\phi)$ and marginalizing each posterior of $\theta$ for $\eta$, I obtain the class of posteriors of $\eta$, $\{F_{\eta|X} : \mu_\theta \in \mathcal{M}(\mu_\phi)\}$. I now summarize the class of posteriors of $\eta$ by its lower envelope (lower probability), $F_{\eta|X*}(D) = \inf_{\mu_\theta \in \mathcal{M}(\mu_\phi)} F_{\eta|X}(D)$, which maps subset $D$ in the parameter space of $\eta$ to $[0, 1]$. In words, the posterior lower probability evaluated at $D$ says that the posterior belief allocated for $\{\eta \in D\}$ is at least $F_{\eta|X*}(D)$, no matter which $\mu_\theta \in \mathcal{M}(\mu_\phi)$ is used. The main theorem of this paper shows that the posterior lower probability satisfies

$$F_{\eta|X*}(D) = F_{\phi|X}(\{\phi : H(\phi) \subset D\}),$$

where $F_{\phi|X}$ denotes the posterior distribution of $\phi$ implied from the prior $\mu_\phi$. The key insight of this equality is that, with prior class $\mathcal{M}(\mu_\phi)$, drawing inference for $\eta$ based on its posterior lower probability is done by analyzing the probability law of random sets $H(\phi)$, $\phi \sim F_{\phi|X}$.

Leaving their formal analysis to the later sections of this paper, I now outline the implementation of the posterior lower probability inference for $\eta$ proposed in this paper.

1. Specify a prior for $\phi$ and update it by the Bayes rule. When a credible prior for $\phi$ is not available, a reasonably "non-informative" prior may be used as long as the posterior of $\phi$ is proper.¹

2. Let $\{\phi_s : s = 1, \dots, S\}$ be random draws of $\phi$ from the posterior of $\phi$.
The mean and median of the posterior lower probability of $\eta$ can be defined via the gamma-minimax decision criterion, and they can be approximated by

$$\arg\min_{a} \frac{1}{S} \sum_{s=1}^{S} \sup_{\eta \in H(\phi_s)} (a - \eta)^2 \quad \text{and} \quad \arg\min_{a} \frac{1}{S} \sum_{s=1}^{S} \sup_{\eta \in H(\phi_s)} |a - \eta|,$$

respectively.

3. The posterior lower credible region of $\eta$ at credibility level $1 - \alpha$, which can be interpreted as a $(1 - \alpha)$-level set of the posterior lower probability of $\eta$, is defined by the smallest interval that contains $H(\phi)$ with posterior probability $1 - \alpha$ (Proposition 5.1 in this paper proposes an algorithm to compute the posterior lower credible region for interval-identified cases).

Under certain regularity conditions that are satisfied in the current missing data example, the posterior lower credible region of $\eta$ asymptotically attains the frequentist coverage probability $1 - \alpha$ for the true identified set $H(\phi_0)$, where $\phi_0$ is the value of $\phi$ corresponding to the sampling distribution of data.

¹ See Kass and Wasserman (1996) for a survey of reasonably non-informative priors.

3 Multiple-prior Analysis and the Lower and Upper Probabilities

3.1 Likelihood and Set Identification: The General Framework

Let $(\mathbf{X}, \mathcal{X})$ and $(\Theta, \mathcal{A})$ be measurable spaces of a sample $X \in \mathbf{X}$ and a parameter vector $\theta \in \Theta$, respectively. The analytical framework of this paper covers both a parametric model $\Theta = \mathbb{R}^d$, $d < \infty$, and a non-parametric model where $\Theta$ is a separable Banach space. The sample size is implicit in the notation. Let $\mu_\theta$ be a marginal probability distribution on the parameter space $(\Theta, \mathcal{A})$, referred to as a prior distribution for $\theta$. Assume that the conditional distribution of $X$ given $\theta$ exists and has the probability density $p(x|\theta)$ at every $\theta \in \Theta$ with respect to a $\sigma$-finite measure on $(\mathbf{X}, \mathcal{X})$.

The parameter vector $\theta$ may consist of parameters that determine the behaviors of the economic agents, as well as those that characterize the distribution of the unobserved heterogeneities in the population. In the context of the missing data or counterfactual causal models, $\theta$ indexes the distribution of the underlying population outcomes or the potential outcomes. In all of these cases, the parameter $\theta$ should be distinguished from the parameters that are solely used to index the sampling distribution of observations. The identification problem of $\theta$ typically arises in this context. If multiple values of $\theta$ generate the same distribution of data, then these values of $\theta$ are observationally equivalent and the identification of $\theta$ fails. In terms of the likelihood function $p(x|\theta)$, the observational equivalence of $\theta$ and $\theta' \neq \theta$ means that the values of the likelihood at $\theta$ and $\theta'$ are equal for every possible sample, i.e., $p(x|\theta) = p(x|\theta')$ for every $x \in \mathbf{X}$ (Rothenberg (1971), Drèze (1974), and Kadane (1974)). I represent the observational equivalence relation of the values of $\theta$ by a many-to-one function $g : (\Theta, \mathcal{A}) \to (\Phi, \mathcal{B})$:

$$g(\theta) = g(\theta') \text{ if and only if } p(x|\theta) = p(x|\theta') \text{ for all } x \in \mathbf{X}.$$

The equivalence relationship partitions the parameter space $\Theta$ into equivalent classes, in each of which the likelihood of $\theta$ is flat, irrespective of observations, and $\phi = g(\theta)$ maps each of these equivalent classes to a point in another parameter space $\Phi$. In the language of structural models in econometrics (Hurwicz (1950), and Koopmans and Reiersol (1950)), $\phi = g(\theta)$ is interpreted as the reduced-form parameter that carries all the information for the structural parameters $\theta$ through the value of the likelihood function. In the literature of Bayesian statistics, $\phi = g(\theta)$ is referred to as the minimally sufficient parameter (sufficient parameter for short), and the range space of $g(\cdot)$, $(\Phi, \mathcal{B})$, is called the sufficient parameter space (Barankin (1960), Dawid (1979), Florens and Mouchart (1977), Picci (1977), and Florens, Mouchart, and Rolin (1990)).² In the presence of sufficient parameters, the likelihood depends on $\theta$ only through the function $g(\theta)$, i.e., there exists a $\mathcal{B}$-measurable function $\hat{p}(x|\cdot)$ such that

$$p(x|\theta) = \hat{p}(x|g(\theta)) \quad \forall x \in \mathbf{X} \text{ and } \theta \in \Theta \tag{3.1}$$

holds (Lemma of Lehmann and Romano (2005)). Denote the inverse image of $g(\cdot)$ by $\Gamma$:

$$\Gamma(\phi) = \{\theta \in \Theta : g(\theta) = \phi\},$$

where $\Gamma(\phi)$ and $\Gamma(\phi')$ for $\phi \neq \phi'$ are disjoint, and $\{\Gamma(\phi) : \phi \in \Phi\}$ constitutes a partition of $\Theta$. I assume $g(\Theta) = \Phi$, so $\Gamma(\phi)$ is non-empty for every $\phi \in \Phi$.³

² Florens and Simoni (2011) provide comprehensive discussions on the relationship between frequentist and Bayesian identification.

³ In an observationally restrictive model, in the sense of Koopmans and Reiersol (1950), $\hat{p}(x|\cdot)$, the likelihood function for the sufficient parameters, is well defined for a domain larger than $g(\Theta)$ (see Example 3.1 in Section 3.2). In this case, the model possesses the falsifiability property, and $\Gamma(\phi)$ can be empty for some $\phi \in \Phi$.

In the set-identified model, the parameter of interest $\eta \in \mathcal{H}$ is a subvector or a transformation of $\theta$ denoted by $\eta = h(\theta)$, $h : (\Theta, \mathcal{A}) \to (\mathcal{H}, \mathcal{D})$. The formal definition of the identified set of $\eta$ is given as follows.

Definition 3.1 (Identified Set of $\eta$)
(i) The identified set of $\eta$ is a set-valued map $H : \Phi \rightrightarrows \mathcal{H}$ defined by the projection of $\Gamma(\phi)$ onto $\mathcal{H}$ through $h(\cdot)$, $H(\phi) \equiv \{h(\theta) : \theta \in \Gamma(\phi)\}$.
(ii) The parameter $\eta = h(\theta)$ is point-identified at $\phi$ if $H(\phi)$ is a singleton, and $\eta$ is set-identified at $\phi$ if $H(\phi)$ is not a singleton.

Note that the identification of $\eta$ is defined in the pre-posterior sense because it is based on the likelihood evaluated at every possible realization of a sample, not only for the observed one.

3.2 Examples

I now provide some examples, in addition to the illustrating example of Section 2, both to illustrate the above concepts and notations, and to provide a concrete focus for the later development.

Example 3.1 (Bounding ATE by Linear Programming) Consider the treatment effect model with noncompliance and a binary instrument $Z \in \{1, 0\}$, as considered in Imbens and Angrist (1994), and Angrist, Imbens, and Rubin (1996). Assume that the treatment status and the outcome of interest are both binary. Let $(W_1, W_0) \in \{1, 0\}^2$ be the potential treatment status in response to the instrument, and $W = Z W_1 + (1 - Z) W_0$ be the observed treatment status. $(Y_1, Y_0) \in \{1, 0\}^2$ is a pair of treated and control outcomes and $Y = W Y_1 + (1 - W) Y_0$ is the observed outcome. Data is a random sample of $(Y_i, W_i, Z_i)$. Following Imbens and Angrist (1994), consider partitioning the population into four subpopulations defined in terms of the potential treatment-selection responses:

$$T_i = \begin{cases} c & \text{if } W_{1i} = 1 \text{ and } W_{0i} = 0: \text{ complier}, \\ at & \text{if } W_{1i} = W_{0i} = 1: \text{ always-taker}, \\ nt & \text{if } W_{1i} = W_{0i} = 0: \text{ never-taker}, \\ d & \text{if } W_{1i} = 0 \text{ and } W_{0i} = 1: \text{ defier}, \end{cases}$$

where $T_i$ is the indicator for the types of selection responses. Assume a randomized instrument, $Z \perp (Y_1, Y_0, W_1, W_0)$. Then, the distribution of observables and the distribution of potential outcomes satisfy the following equalities for $y \in \{1, 0\}$:

$$\begin{aligned}
\Pr(Y = y, W = 1 | Z = 1) &= \Pr(Y_1 = y, T = c) + \Pr(Y_1 = y, T = at), \\
\Pr(Y = y, W = 1 | Z = 0) &= \Pr(Y_1 = y, T = d) + \Pr(Y_1 = y, T = at), \\
\Pr(Y = y, W = 0 | Z = 1) &= \Pr(Y_0 = y, T = d) + \Pr(Y_0 = y, T = nt), \\
\Pr(Y = y, W = 0 | Z = 0) &= \Pr(Y_0 = y, T = c) + \Pr(Y_0 = y, T = nt).
\end{aligned} \tag{3.2}$$

Ignoring the marginal distribution of $Z$, a full parameter vector of the model can be specified by a joint distribution of $(Y_1, Y_0, T)$:

$$\theta = (\Pr(Y_1 = y, Y_0 = y', T = t) : y = 1, 0; \; y' = 1, 0; \; t = c, nt, at, d) \in \Theta,$$

where $\Theta$ is the probability simplex in $\mathbb{R}^{16}$. Let ATE be the parameter of interest:

$$\begin{aligned}
E(Y_1 - Y_0) &= \sum_{t = c, nt, at, d} [\Pr(Y_1 = 1, T = t) - \Pr(Y_0 = 1, T = t)] \\
&= \sum_{t = c, nt, at, d} \sum_{y = 1, 0} [\Pr(Y_1 = 1, Y_0 = y, T = t) - \Pr(Y_1 = y, Y_0 = 1, T = t)] \equiv h(\theta).
\end{aligned}$$

The likelihood conditional on $Z$ depends on $\theta$ only through the distribution of $(Y, W)$ given $Z$, so the sufficient parameter vector consists of eight probability masses:

$$\phi = (\Pr(Y = y, W = w | Z = z) : y = 1, 0; \; w = 1, 0; \; z = 1, 0).$$

The set of equations (3.2) defines $\Gamma(\phi)$, the set of observationally equivalent distributions of $(Y_1, Y_0, T)$, when the data distribution is given at $\phi$. Balke and Pearl (1997) derive the identified set of ATE, $H(\phi) = h(\Gamma(\phi))$, by maximizing or minimizing $h(\theta)$, subject to $\theta \in \Theta$ and constraints (3.2). This optimization can be solved by linear programming and $H(\phi)$ is obtained as a convex interval.

Note that, in this model, special attention is needed for the sufficient parameter space $\Phi$ to ensure that $\Gamma(\phi)$ is non-empty. Pearl (1995) shows that the distribution of data is compatible with the instrument exogeneity condition, $Z \perp (Y_1, Y_0, W_1, W_0)$, if and only if

$$\max_{w} \sum_{y} \max_{z} \{\Pr(Y = y, W = w | Z = z)\} \leq 1. \tag{3.3}$$

This implies that, in order to guarantee $\Gamma(\phi) \neq \emptyset$, a prior distribution for $\phi$ must put probability one on the values of $\phi$ that fulfill (3.3).

Example 3.2 (Linear Moment Inequality Model) Consider the model where the parameter of interest $\eta \in \mathcal{H}$ is characterized by linear moment inequalities,

$$E(m(X) - A\eta) \geq 0,$$

where the parameter space $\mathcal{H}$ is a subset of $\mathbb{R}^L$, $m(X)$ is a $J$-dimensional vector of known functions of an observation, and $A$ is a $J \times L$ known constant matrix. By augmenting the $J$-dimensional parameter $\lambda \in [0, \infty)^J$, these moment inequalities can be written as the $J$ moment equalities,⁴

$$E(m(X) - A\eta - \lambda) = 0.$$

To obtain a likelihood function for the current moment equality model, specify the full parameter vector to be $\theta = (\eta, \lambda) \in \mathcal{H} \times [0, \infty)^J$, and consider the exponentially tilted empirical likelihood for $\theta$ as considered in Schennach (2005). Let $x = (x_1, \dots, x_n)$ be a size $n$ random sample of observations, and define $g(\theta) = A\eta + \lambda$. If the convex hull of $\cup_i \{m(x_i) - g(\theta)\}$ contains the origin, then the exponentially tilted empirical likelihood is written as

$$p(x|\theta) = \prod_{i=1}^{n} w_i(\theta),$$

where

$$w_i(\theta) = \frac{\exp\{\gamma(g(\theta))'(m(x_i) - g(\theta))\}}{\sum_{i=1}^{n} \exp\{\gamma(g(\theta))'(m(x_i) - g(\theta))\}}, \qquad \gamma(g(\theta)) = \arg\min_{\gamma \in \mathbb{R}^J_+} \left\{ \sum_{i=1}^{n} \exp\{\gamma'(m(x_i) - g(\theta))\} \right\}.$$

Thus, the parameter $\theta = (\eta, \lambda)$ enters the likelihood only through $g(\theta) = A\eta + \lambda$. Consequently, I take $\phi = g(\theta)$ to be the sufficient parameters. The identified set for $\theta$ is given by

$$\Gamma(\phi) = \{(\eta, \lambda) \in \mathcal{H} \times [0, \infty)^J : A\eta + \lambda = \phi\}.$$

The coordinate projection of $\Gamma(\phi)$ onto $\mathcal{H}$ yields $H(\phi)$, the identified set for $\eta$ (see Bertsimas and Tsitsiklis (1997, Chap. 2) for an algorithm for projecting a polyhedron).

⁴ I owe the Bayesian formulation of the moment inequality model shown here to Tony Lancaster (personal communication 2006).

3.3 Unrevisable Prior Knowledge and a Class of Priors

Let $\mu_\theta$ be a prior of $\theta$ and $\mu_\phi$ be the marginal probability measure on the sufficient parameter space $(\Phi, \mathcal{B})$ induced by $\mu_\theta$ and $g(\cdot)$:

$$\mu_\phi(B) = \mu_\theta(\Gamma(B)) \text{ for all } B \in \mathcal{B}.$$

Let $x \in \mathbf{X}$ be sampled data. The posterior distribution of $\theta$, denoted by $F_{\theta|X}(\cdot)$, is obtained as

$$F_{\theta|X}(A) = \int_{\Phi} \mu_{\theta|\phi}(A|\phi) \, dF_{\phi|X}(\phi), \quad A \in \mathcal{A}, \tag{3.4}$$

where $\mu_{\theta|\phi}(A|\phi)$ denotes the conditional distribution of $\theta$ given $\phi$, and $F_{\phi|X}(\cdot)$ is the posterior distribution of $\phi$.

The posterior distribution of $\theta$ given in (3.4) shows that the prior distribution for $\theta$ marginalized to $\phi$ can be updated by data, while the conditional prior of $\theta$ given $\phi$ is never updated by the data because the likelihood is flat on $\Gamma(\phi)$ for any realization of the sample. In this sense, the prior information marginalized to the sufficient parameter $\phi$ can be interpreted as the revisable prior knowledge, and the conditional priors of $\theta$ given $\phi$, $\{\mu_{\theta|\phi}(\cdot|\phi) : \phi \in \Phi\}$, can be interpreted as the unrevisable prior knowledge.

If one wants to summarize the posterior uncertainty of $\theta$ in the form of a probability distribution on $(\Theta, \mathcal{A})$, as recommended in the Bayesian paradigm, he needs to have a single prior distribution of $\theta$, which necessarily induces unique unrevisable prior knowledge $\mu_{\theta|\phi}$. If he could justify his choice of $\mu_{\theta|\phi}$ by any credible prior information, the standard Bayesian updating (3.4) would yield a valid posterior distribution of $\theta$. A challenging situation would arise if one is short of a credible prior distribution of $\theta$. In this case, the researcher, who is aware that $\mu_{\theta|\phi}$ will never be updated by data, might feel anxious in implementing the Bayesian inference procedure, because an unconfidently specified $\mu_{\theta|\phi}$ can have a significant influence on the subsequent posterior inference.

The robust Bayes analysis in this paper specifically focuses on such a situation, and introduces ambiguity for the conditional prior $\{\mu_{\theta|\phi}(\cdot|\phi) : \phi \in \Phi\}$ in the form of multiple priors. Specifically, given a prior $\mu_\phi$ on $(\Phi, \mathcal{B})$ specified by the researcher, consider the class of prior distributions of $\theta$ defined by:

$$\mathcal{M}(\mu_\phi) = \{\mu_\theta : \mu_\theta(\Gamma(B)) = \mu_\phi(B) \text{ for every } B \in \mathcal{B}\}.$$

$\mathcal{M}(\mu_\phi)$ consists of prior distributions of $\theta$ whose marginal distribution for the sufficient parameters coincides with the prespecified $\mu_\phi$.⁵ This paper proposes to use $\mathcal{M}(\mu_\phi)$ as a prior input for the posterior analysis, meaning that, while accepting to specify a single prior distribution for the sufficient parameters $\phi$, I leave the conditional priors $\mu_{\theta|\phi}$ unspecified and allow for arbitrary ones as long as $\mu_\theta(\cdot) = \int_{\Phi} \mu_{\theta|\phi}(\cdot|\phi) \, d\mu_\phi$ yields a probability measure on $(\Theta, \mathcal{A})$.⁶

In the subsequent analysis, I shall not discuss how to select $\mu_\phi$, and shall treat $\mu_\phi$ as given. The influence of $\mu_\phi$ on the posterior of $\phi$ will diminish as the sample size increases, so the sensitivity issue of the posterior of $\phi$ is expected to be less severe when the sample size is moderate or large.

⁵ I thank Jean-Pierre Florens for suggesting this representation of the prior class.

⁶ Sufficient parameters are defined by examining the entire model $\{p(x|\theta) : x \in \mathbf{X}, \theta \in \Theta\}$, so that the prior class $\mathcal{M}(\mu_\phi)$ is, by construction, model dependent. This distinguishes the current approach from the standard robust Bayes analysis where a prior class represents the researcher's subjective assessment of his imprecise prior knowledge (Berger (1985)).
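The role of the unrevisable prior knowledge can be illustrated numerically. The following is a minimal sketch, not part of the original analysis, that revisits the missing data example of Section 2. It assumes hypothetical counts $(n_{11}, n_{01}, n_{mis})$, a flat Dirichlet prior for the sufficient parameters, and two arbitrary Beta conditional priors for the share of the missing mass assigned to $\Pr(Y = 1, W = 0)$. The function names and all numerical values are illustrative assumptions.

```python
import random


def draw_phi(counts, rng):
    """One posterior draw of phi = (phi_11, phi_01, phi_mis).

    Assumes a flat Dirichlet(1, 1, 1) prior (an illustrative choice), so the
    posterior is Dirichlet(counts + 1), sampled via normalized gamma draws.
    """
    gammas = [rng.gammavariate(c + 1.0, 1.0) for c in counts]
    total = sum(gammas)
    return [g / total for g in gammas]


def posterior_eta_draws(counts, share_beta, n_draws=5000, seed=0):
    """Posterior draws of (phi, eta) for one conditional prior.

    share_beta holds the Beta parameters of the conditional prior for the
    fraction of phi_mis allocated to Pr(Y = 1, W = 0). The likelihood is flat
    in this fraction, so its posterior draws are simply prior draws.
    """
    rng_phi = random.Random(seed)        # drives the revisable part (phi)
    rng_share = random.Random(seed + 1)  # drives the unrevisable part
    draws = []
    for _ in range(n_draws):
        phi = draw_phi(counts, rng_phi)
        share = rng_share.betavariate(*share_beta)
        eta = phi[0] + share * phi[2]    # eta = Pr(Y = 1)
        draws.append((phi, eta))
    return draws


if __name__ == "__main__":
    counts = (40, 30, 30)  # hypothetical (n_11, n_01, n_mis)
    for beta_params in [(1.0, 9.0), (9.0, 1.0)]:
        draws = posterior_eta_draws(counts, beta_params)
        mean_eta = sum(eta for _, eta in draws) / len(draws)
        print(beta_params, round(mean_eta, 3))
```

Both calls share the same posterior draws of the sufficient parameters, yet the posterior mean of the parameter of interest moves with the conditional prior, which is the sensitivity that the prior class is designed to address.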

3.4 Posterior Lower and Upper Probabilities

The Bayes rule is applied to each prior in $\mathcal{M}(\mu_\phi)$ to generate the class of posterior distributions of $\theta$. Consider summarizing the posterior class by the posterior lower probability $F_{\theta|X*}(\cdot) : \mathcal{A} \to [0, 1]$ and the posterior upper probability $F^{*}_{\theta|X}(\cdot) : \mathcal{A} \to [0, 1]$, defined as

$$F_{\theta|X*}(A) \equiv \inf_{\mu_\theta \in \mathcal{M}(\mu_\phi)} F_{\theta|X}(A), \qquad F^{*}_{\theta|X}(A) \equiv \sup_{\mu_\theta \in \mathcal{M}(\mu_\phi)} F_{\theta|X}(A).$$

Note that the posterior lower probability and the upper probability have a conjugate property, $F_{\theta|X*}(A) = 1 - F^{*}_{\theta|X}(A^c)$, so it suffices to focus on one of them in deriving their analytical form. In order to obtain $F_{\theta|X*}(\cdot)$, the following regularity conditions are assumed.

Condition 3.1
(i) A prior for $\phi$, $\mu_\phi$, is proper and absolutely continuous with respect to a $\sigma$-finite measure on $(\Phi, \mathcal{B})$.
(ii) $g : (\Theta, \mathcal{A}) \to (\Phi, \mathcal{B})$ is measurable and its inverse image $\Gamma(\phi)$ is a closed set in $\Theta$, $\mu_\phi$-almost every $\phi \in \Phi$.
(iii) $h : (\Theta, \mathcal{A}) \to (\mathcal{H}, \mathcal{D})$ is measurable and $H(\phi) = h(\Gamma(\phi))$ is a closed set in $\mathcal{H}$, $\mu_\phi$-almost every $\phi \in \Phi$.

These conditions are imposed for $\Gamma(\phi)$ and $H(\phi)$ to be interpreted as random closed sets induced by a probability measure on $(\Phi, \mathcal{B})$.⁷ The closedness of $\Gamma(\phi)$ and $H(\phi)$ is implied, for instance, by continuity of $g(\cdot)$ and $h(\cdot)$.

⁷ The inference procedure proposed in this paper can be implemented as long as the posterior of $\phi$ is proper. However, how to accommodate an improper prior for $\phi$ in the development of the analytical results is beyond the scope of this paper.

Theorem 3.1 Assume Condition 3.1.
(i) For each $A \in \mathcal{A}$,

$$F_{\theta|X*}(A) = F_{\phi|X}(\{\phi : \Gamma(\phi) \subset A\}), \tag{3.5}$$

$$F^{*}_{\theta|X}(A) = F_{\phi|X}(\{\phi : \Gamma(\phi) \cap A \neq \emptyset\}), \tag{3.6}$$

where $F_{\phi|X}(B)$, $B \in \mathcal{B}$, is the posterior probability measure of $\phi$.
(ii) Define the posterior lower and upper probabilities of $\eta = h(\theta)$ by

$$F_{\eta|X*}(D) \equiv \inf_{\mu_\theta \in \mathcal{M}(\mu_\phi)} F_{\theta|X}(h^{-1}(D)), \qquad F^{*}_{\eta|X}(D) \equiv \sup_{\mu_\theta \in \mathcal{M}(\mu_\phi)} F_{\theta|X}(h^{-1}(D)), \quad \text{for } D \in \mathcal{D}.$$

It holds

$$F_{\eta|X*}(D) = F_{\phi|X}(\{\phi : H(\phi) \subset D\}), \qquad F^{*}_{\eta|X}(D) = F_{\phi|X}(\{\phi : H(\phi) \cap D \neq \emptyset\}).$$

Proof. For a proof of (i), see Appendix A. For a proof of (ii), see equation (3.7).

The expression for $F_{\theta|X*}(A)$ implies that the posterior lower probability on $A$ calculates the probability that the set $\Gamma(\phi)$ is contained in subset $A$ in terms of the posterior probability law of $\phi$. On the other hand, the upper probability is interpreted as the posterior probability that the set $\Gamma(\phi)$ hits subset $A$. The second statement of the theorem provides a procedure for marginalizing the lower and upper probabilities of $\theta$ into those of the parameter of interest $\eta$. The expressions of $F_{\eta|X*}(D)$ and $F^{*}_{\eta|X}(D)$ are simple and easy to interpret: the lower and upper probabilities of $\eta = h(\theta)$ are the containment and hitting probabilities of the random sets obtained by projecting $\Gamma(\phi)$ through $h(\cdot)$. This marginalization rule of the lower probability follows from

$$F_{\eta|X*}(D) = F_{\theta|X*}(h^{-1}(D)) = F_{\phi|X}(\{\phi : \Gamma(\phi) \subset h^{-1}(D)\}) = F_{\phi|X}(\{\phi : H(\phi) \subset D\}). \tag{3.7}$$
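The containment and hitting representations in Theorem 3.1 (ii) lend themselves to simulation. The sketch below is an illustration under stated assumptions rather than the paper's own implementation: it uses the missing data example of Section 2, hypothetical counts, and a flat Dirichlet prior for the sufficient parameters, and it takes $D$ to be a closed interval. The function names are hypothetical.

```python
import random


def identified_set_draws(counts, n_draws=5000, seed=0):
    """Posterior draws of H(phi) = [phi_11, phi_11 + phi_mis].

    Assumes a flat Dirichlet(1, 1, 1) prior on (phi_11, phi_01, phi_mis),
    an illustrative choice; Dirichlet draws are normalized gamma draws.
    """
    rng = random.Random(seed)
    sets = []
    for _ in range(n_draws):
        g11, g01, gmis = (rng.gammavariate(c + 1.0, 1.0) for c in counts)
        total = g11 + g01 + gmis
        sets.append((g11 / total, (g11 + gmis) / total))
    return sets


def lower_upper_probability(sets, d_lo, d_hi):
    """Monte Carlo lower and upper probabilities of D = [d_lo, d_hi].

    Lower probability: share of draws with H(phi) contained in D.
    Upper probability: share of draws with H(phi) intersecting D.
    """
    contained = sum(1 for lo, hi in sets if d_lo <= lo and hi <= d_hi)
    hits = sum(1 for lo, hi in sets if lo <= d_hi and hi >= d_lo)
    return contained / len(sets), hits / len(sets)


if __name__ == "__main__":
    sets = identified_set_draws((40, 30, 30))  # hypothetical counts
    print(lower_upper_probability(sets, 0.3, 0.8))
    print(lower_upper_probability(sets, 0.5, 0.6))
```

For an interval narrower than the typical width of $H(\phi)$, the lower probability is close to zero while the upper probability can be close to one, which shows how far apart the two envelopes can be when the parameter is only set-identified.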

17 Note that, In the standard Bayesian inference, marginalization of the posterior of to is conducted by integrating the posterior probability measure of for, while in the lower probability inference, marginalization for corresponds to projecting random sets () via = h (). This stark contrast between the standard Bayes and the multiple prior robust Bayes inference highlights how the introduction of ambiguity changes the way of eliminating the nuisance parameters in the posterior inference. As is known in the literature (e.g., Huber (1973)), the lower probability of a set of probability measures is a monotone nonadditive measure (capacity). Furthermore, in the current speci cation of the prior class, the representation of the lower probability obtained in Theorem 3.1 implies that the resulting posterior lower and upper probabilities are supermodular and submodular, respectively. Corollary 3.1 Assume Condition 3.1. The posterior lower and upper probabilities of are supermodular and submodular, respectively. For A 1, A 2 2 A subsets in, F jx (A 1 [ A 2 ) + F jx (A 1 \ A 2 ) F jx (A 1 ) + F jx (A 2 ); FjX(A 1 [ A 2 ) + FjX(A 1 \ A 2 ) FjX(A 1 ) + FjX(A 2 ): Also, the posterior lower and upper probabilities of are supermodular and submodular, respectively. For D 1, D 2 2 D subsets in H, F jx (D 1 [ D 2 ) + F jx (D 1 \ D 2 ) F jx (D 1 ) + F jx (D 2 ); FjX(D 1 [ D 2 ) + FjX(D 1 \ D 2 ) FjX(D 1 ) + FjX(D 2 ): The results of Theorem 3.1 (i) can be seen as a special case of Wasserman s (1990) general construction of the posterior lower and upper probabilities. Whereas, one notable di erence from Wasserman s analysis is that, with prior class M, the lower probability of the posterior class becomes an 1-order monotone capacity (a containment functional of random 17

sets). 8 This plays a crucial role in simplifying the gamma-minimax analysis considered in the next section.

4 Posterior Gamma-minimax Analysis for $\eta = h(\theta)$

In the standard Bayesian posterior analysis, a statistical decision problem involving $\eta$ (e.g., point estimation for $\eta$) is straightforward: the optimal action minimizes the posterior risk. When the posterior information of $\eta$ is summarized by a class of posteriors, how should the optimal statistical action be solved? This section studies this problem by adapting the posterior gamma-minimax analysis.

Let $a \in H_a$ be an action, where $H_a$ is an action space. In the case of the point estimation problem for $\eta$, an action is interpreted as reporting a non-randomized point estimate for $\eta$, where action space $H_a$ is a subset of $H$. Given an action $a$ to be taken, and $\eta$ being the true state of nature, a loss function $L(\eta, a) : H \times H_a \to \mathbb{R}_+$ yields the cost to the decision maker of taking action $a$. I assume that the loss function $L(\eta, a)$ is non-negative. If a single prior for $\theta$, $\mu_\theta$, were given, the posterior risk would be defined by

$$\rho(\mu_\theta, a) \equiv \int_H L(\eta, a)\, dF_{\eta|X}(\eta), \tag{4.1}$$

where the first argument in the posterior risk represents the dependence of the posterior of $\eta$ on the specification of the prior for $\theta$. Our analysis involves multiple priors, so the class of posterior risks $\{\rho(\mu_\theta, a) : \mu_\theta \in \mathcal{M}(\mu_\phi)\}$ is considered. The posterior gamma-minimax criterion 9 ranks actions in terms of the worst-case posterior risk (upper posterior risk):

$$\rho^{*}(\mu_\phi, a) \equiv \sup_{\mu_\theta \in \mathcal{M}(\mu_\phi)} \rho(\mu_\theta, a),$$

8 Wasserman (1990, p. 463) posed an open question asking which class of priors can ensure that the posterior lower probability is a containment functional of random sets. Theorem 3.1 provides an answer to his open question in the situation where the model lacks identifiability.

9 In the robust Bayes literature, the class of prior distributions is often denoted by $\Gamma$. This is why it is called the gamma-minimax criterion.
Unfortunately, in the literature of belief functions and lower and upper probabilities, $\Gamma$ often denotes a set-valued mapping that generates the lower and upper probabilities. In this paper, we adopt the latter notational convention, but still refer to the decision criterion as the gamma-minimax criterion.

where the first argument in the upper posterior risk represents the dependence of the prior class on a prior for $\phi$.

Definition 4.1 A posterior gamma-minimax action $a^{*}_{x}$ with respect to prior class $\mathcal{M}(\mu_\phi)$ is an action that minimizes the upper posterior risk, i.e.,

$$\rho^{*}(\mu_\phi, a^{*}_{x}) = \inf_{a \in H_a} \rho^{*}(\mu_\phi, a) = \inf_{a \in H_a} \sup_{\mu_\theta \in \mathcal{M}(\mu_\phi)} \rho(\mu_\theta, a).$$

The gamma-minimax decision approach favors a conservative action that guards against the least favorable prior within the class, and it can be seen as a compromise between the Bayesian decision principle and the minimax decision principle.

The next proposition shows that the upper posterior risk $\rho^{*}(\mu_\phi, a)$ equals the Choquet expected loss with respect to the posterior upper probability.

Proposition 4.1 Under Condition 3.1, the upper posterior risk satisfies

$$\rho^{*}(\mu_\phi, a) = \int L(\eta, a)\, dF^{*}_{\eta|X}(\eta) = \int \sup_{\eta \in H(\phi)} L(\eta, a)\, dF_{\phi|X}(\phi), \tag{4.2}$$

whenever $\int L(\eta, a)\, dF^{*}_{\eta|X}(\eta) < \infty$, where $\int L(\eta, a)\, dF^{*}_{\eta|X}(\eta)$ is the Choquet integral.

Proof. See Appendix A.

The third expression in (4.2) shows that the posterior gamma-minimax criterion is written as the expectation of the worst-case loss function, $\sup_{\eta \in H(\phi)} L(\eta, a)$, with respect to the posterior of $\phi$. The supremum part stems from the ambiguity of $\eta$: given $\phi$, what the researcher knows about $\eta$ is only that it lies within the identified set $H(\phi)$, and, following the minimax principle, the researcher forms the loss by supposing that nature chooses the worst case in response to the action $a$. The expectation in $\phi$, on the other hand, represents the posterior uncertainty of the identified set $H(\phi)$: with a finite number of observations, the identified set of $\eta$ is known with some uncertainty, as summarized by the posterior of $\phi$. The posterior gamma-minimax criterion combines such ambiguity of $\eta$ with the posterior uncertainty of the identified set $H(\phi)$ to yield a single objective function to be minimized. 10

10 The posterior gamma-minimax action $a^{*}_{x}$ can be interpreted as a Bayes action for some posterior distributions in the class.
For instance, in the case of the quadratic loss, the saddle-point argument implies that the gamma-minimax action $a^{*}_{x}$ corresponds to the mean of a posterior distribution (Bayes action) that has maximal posterior variance in the class.
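The third expression in (4.2) lends itself to Monte Carlo evaluation: average the worst-case loss over posterior draws of the sufficient parameters, then minimize over actions. The sketch below is an illustration under stated assumptions rather than the paper's implementation. It assumes a scalar parameter of interest, an interval identified set, the quadratic loss, and a grid search over actions; the function names and the simulated draws are hypothetical. For the quadratic loss the supremum over an interval is attained at one of its end points, which is what the code exploits.

```python
import numpy as np


def upper_posterior_risk(a, eta_l, eta_u):
    """Average worst-case quadratic loss of action a over posterior draws."""
    eta_l = np.asarray(eta_l, dtype=float)
    eta_u = np.asarray(eta_u, dtype=float)
    # sup of (eta - a)^2 over [eta_l, eta_u] is reached at an end point.
    worst_loss = np.maximum((eta_l - a) ** 2, (eta_u - a) ** 2)
    return float(worst_loss.mean())


def gamma_minimax_action(eta_l, eta_u, grid):
    """Grid-search approximation of the posterior gamma-minimax action."""
    risks = np.array([upper_posterior_risk(a, eta_l, eta_u) for a in grid])
    best = int(np.argmin(risks))
    return float(grid[best]), float(risks[best])


# Hypothetical posterior draws of the interval end points, for illustration only.
rng = np.random.default_rng(1)
eta_l = rng.normal(0.0, 0.2, size=2000)
eta_u = eta_l + 1.0 + np.abs(rng.normal(0.0, 0.2, size=2000))
a_hat, risk_hat = gamma_minimax_action(eta_l, eta_u, np.linspace(-1.0, 3.0, 401))
print(a_hat, risk_hat)
```

A grid search is used only for transparency; since the objective is convex in the action under the quadratic loss, a one-dimensional convex optimizer would serve equally well.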

Although a closed-form expression of $a^{*}_{x}$ is not, in general, available, Proposition 4.1 suggests a simple numerical algorithm for approximating $a^{*}_{x}$ using a random sample of $\phi$ from its posterior $F_{\phi|X}$. Let $\{\phi_s\}_{s=1}^{S}$ be $S$ random draws of $\phi$ from posterior $F_{\phi|X}$. Then, $a^{*}_{x}$ can be approximated by

$$\hat{a}^{*}_{x} \equiv \arg\min_{a \in H_a} \frac{1}{S} \sum_{s=1}^{S} \sup_{\eta \in H(\phi_s)} L(\eta, a).$$

Gamma-minimax decisions are usually dynamically inconsistent: an a posteriori optimal gamma-minimax action does not coincide with an unconditional optimal gamma-minimax decision. This is also the case with our prior class, and this will imply that $a^{*}_{x}$ fails to be a Bayes decision with respect to any single prior in the class $\mathcal{M}(\mu_\phi)$. See Appendix B for an example and further discussion.

As an alternative to the posterior gamma-minimax action, the gamma-minimax regret criterion may be considered (Berger (1985, p. 218), and Rios Insua, Ruggeri, and Vidakovic (1995)). Appendix B provides some analytical results of the posterior gamma-minimax regret analysis where the parameter of interest $\eta$ is a scalar and the loss function is quadratic, $L(\eta, a) = (\eta - a)^2$. There, it is shown that the posterior gamma-minimax regret decision can differ from the posterior gamma-minimax decision derived above, but that they converge to the same limit asymptotically.

5 Set Estimation of $\eta$

In the standard Bayesian inference, set estimation is often conducted by reporting the contour sets of the posterior probability density of $\eta$ (highest posterior density region). If the posterior information for $\eta$ is summarized by the lower and upper probabilities, how should we conduct set estimation of $\eta$?

5.1 Posterior Lower Credible Region

For $\alpha \in (0, 1)$, consider a subset $C_{1-\alpha} \subset H$ such that the posterior lower probability $F_{\eta|X*}(C_{1-\alpha})$ is greater than or equal to $1 - \alpha$:

$$F_{\eta|X*}(C_{1-\alpha}) = F_{\phi|X}(H(\phi) \subset C_{1-\alpha}) \geq 1 - \alpha. \tag{5.1}$$

$C_{1-\alpha}$ is interpreted as a set on which the posterior credibility of $\eta$ is at least $1 - \alpha$, no matter which posterior is chosen within the class. If I drop the qualifier "no matter which posterior is chosen within the class" from this statement, I obtain the usual interpretation of the posterior credible region, so $C_{1-\alpha}$ defined in this way seems to be a natural extension of the Bayesian posterior credible regions to those of the posterior lower probability.

Analogous to the Bayesian posterior credible region, there are multiple $C_{1-\alpha}$'s that satisfy (5.1). For instance, given a posterior credible region of $\phi$ with credibility $1 - \alpha$, $B_{1-\alpha}$, the set $C_{1-\alpha} = \bigcup_{\phi \in B_{1-\alpha}} H(\phi)$ considered in the 2009 working paper version of Moon and Schorfheide (2011) satisfies (5.1). In proposing set inference for $\eta$, I resolve the multiplicity issue of set estimates by focusing on the smallest one,

$$C^{*}_{1-\alpha} \equiv \arg\min_{C \in \mathcal{C}} \mathrm{Leb}(C) \quad \text{s.t.} \quad F_{\phi|X}(H(\phi) \subset C) \geq 1 - \alpha, \tag{5.2}$$

where $\mathrm{Leb}(C)$ is the volume of subset $C$ in terms of the Lebesgue measure and $\mathcal{C}$ is a family of subsets in $H$ over which the volume-minimizing lower credible region is searched. I hereafter refer to $C^{*}_{1-\alpha}$ defined in this way as a posterior lower credible region with credibility $1 - \alpha$. Note that focusing on the smallest set estimate has a decision-theoretic justification; $C^{*}_{1-\alpha}$ can be supported as a posterior gamma-minimax action:

$$C^{*}_{1-\alpha} = \arg\min_{C \in \mathcal{C}} \left[ \sup_{\mu_\theta \in \mathcal{M}(\mu_\phi)} \int L(\eta, C)\, dF_{\eta|X} \right]$$

with a loss function that penalizes the volume and non-coverage,

$$L(\eta, C) = \mathrm{Leb}(C) + b(\alpha)\,[1 - 1_C(\eta)],$$

where $b(\alpha)$ is a positive constant that depends on credibility level $1 - \alpha$, and $1_C(\cdot)$ is the indicator function for subset $C$. Here, the loss function is written in terms of parameter $\eta$, so the object of interest is $\eta$, rather than the identified set $H(\phi)$.

Finding $C^{*}_{1-\alpha}$ is challenging if $\eta$ is multi-dimensional and no restriction is placed on the class of subsets $\mathcal{C}$. I therefore restrict our analysis to scalar $\eta$ and constrain $\mathcal{C}$ to the class of closed connected intervals. The next proposition shows how to obtain $C^{*}_{1-\alpha}$.
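Before turning to the proposition, the decision-theoretic justification can be unpacked in one line. The derivation below is a filled-in step rather than a quotation from the paper. It writes $F_{\eta|X}$ for a posterior of the parameter of interest in the class, $F_{\eta|X*}$ for the posterior lower probability, and uses only that the volume term does not depend on $\eta$ and that $b(\alpha) > 0$.

```latex
\begin{align*}
\sup_{\mu_\theta \in \mathcal{M}(\mu_\phi)} \int L(\eta, C)\, dF_{\eta|X}
  &= \mathrm{Leb}(C) + b(\alpha) \sup_{\mu_\theta \in \mathcal{M}(\mu_\phi)} \bigl[ 1 - F_{\eta|X}(C) \bigr] \\
  &= \mathrm{Leb}(C) + b(\alpha) \bigl[ 1 - \inf_{\mu_\theta \in \mathcal{M}(\mu_\phi)} F_{\eta|X}(C) \bigr] \\
  &= \mathrm{Leb}(C) + b(\alpha) \bigl[ 1 - F_{\eta|X*}(C) \bigr].
\end{align*}
```

The upper posterior risk of a set estimate therefore trades its volume against the shortfall of its posterior lower probability from one, with $b(\alpha)$ setting the price of that shortfall.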

Proposition 5.1 Let $d : H \times \mathcal{D} \to \mathbb{R}_+$ measure the distance from $\eta_c \in H$ to set $H(\phi)$ in terms of

$$d(\eta_c, H(\phi)) \equiv \sup_{\eta \in H(\phi)} \{ \| \eta_c - \eta \| \}.$$

For each $\eta_c \in H$, let $r_{1-\alpha}(\eta_c)$ be the $(1 - \alpha)$-th quantile of the distribution of $d(\eta_c, H(\phi))$ induced by the posterior distribution of $\phi$, i.e.,

$$r_{1-\alpha}(\eta_c) \equiv \inf \{ r : F_{\phi|X}(\{\phi : d(\eta_c, H(\phi)) \leq r\}) \geq 1 - \alpha \}.$$

Then, $C^{*}_{1-\alpha}$ is a closed interval centered at $\eta^{*}_{c} = \arg\min_{\eta_c \in H} r_{1-\alpha}(\eta_c)$ with radius $r^{*}_{1-\alpha} = r_{1-\alpha}(\eta^{*}_{c})$.

Proof. See Appendix A.

5.2 Asymptotic Properties of the Posterior Lower Credible Region

In this section, I examine the large-sample behavior of the posterior lower probability in an interval-identified case, where $H(\phi) \subset \mathbb{R}$ is a closed bounded interval for almost all $\phi \in \Phi$. From now on, I make the sample size explicit in the notation: a size $n$ sample $X^n$ is generated from its sampling distribution $P_{X^n|\phi_0}$, where $\phi_0$ denotes the value of the sufficient parameters that corresponds to the true data-generating process. The maximum likelihood estimator for $\phi$ is denoted by $\hat{\phi}$.

Provided that the posterior of $\phi$ is consistent 11 to $\phi_0$ and the set-valued map $H(\phi)$ is continuous at $\phi = \phi_0$ in terms of the Hausdorff metric $d_H$, it can be shown that the random sets $H(\phi)$, represented by the posterior lower probability $F_{\eta|X^n*}(\cdot)$, converge to the true identified set $H(\phi_0)$ in the sense of $\lim_{n \to \infty} F_{\phi|X^n}(\{\phi : d_H(H(\phi), H(\phi_0)) > \epsilon\}) = 0$ for almost

11 Posterior consistency of $\phi$ means that $\lim_{n \to \infty} F_{\phi|X^n}(G) = 1$ for every open neighborhood $G$ of $\phi_0$ for almost every sampling sequence. For finite-dimensional $\phi$, this posterior consistency for $\phi$ is implied by a set of higher-level conditions for the likelihood of $\phi$. We do not list all those conditions here for the sake of brevity. See Section 7.4 of Schervish (1995) for details.

every sampling sequence of $\{x^n\}$. Given such posterior consistency, we analyze the asymptotic coverage property of the lower credible region $C^{*}_{1-\alpha}$.

The following set of regularity conditions is imposed to show the asymptotic correct coverage property of $C^{*}_{1-\alpha}$.

Condition 5.1

(i) The parameter of interest $\eta$ is a scalar and the identified set $H(\phi)$ is $\mu_\phi$-almost surely a non-empty and connected interval, $H(\phi) = [\eta_l(\phi), \eta_u(\phi)]$, $-\infty \leq \eta_l(\phi) \leq \eta_u(\phi) \leq \infty$, and the true identified set $H(\phi_0) = [\eta_l(\phi_0), \eta_u(\phi_0)]$ is a bounded interval.

(ii) For a sequence $a_n \to \infty$, the random variables $\hat{L} = a_n(\eta_l(\phi_0) - \eta_l(\hat{\phi}))$ and $\hat{U} = a_n(\eta_u(\phi_0) - \eta_u(\hat{\phi}))$ converge in distribution to bivariate random variables $(L, U)$, whose cumulative distribution function $J(\cdot)$ on $\mathbb{R}^2$ is Lipschitz continuous and monotonically increasing in the sense of $J(c_l, c_u) < J(c_l + \epsilon, c_u + \epsilon)$ for any $\epsilon > 0$.

(iii) Define the random variables $L_n(\phi) = a_n(\eta_l(\phi) - \eta_l(\hat{\phi}))$ and $U_n(\phi) = a_n(\eta_u(\phi) - \eta_u(\hat{\phi}))$, whose distribution is induced by the posterior distribution of $\phi$ given sample $X^n$. The cumulative distribution function of $(L_n(\phi), U_n(\phi))$ given $X^n$, denoted by $J_n(\cdot)$, is continuous almost surely under $p(x^n|\phi_0)$ for all $n$.

(iv) At each $c \equiv (c_l, c_u) \in \mathbb{R}^2$, the cumulative distribution function of $(L_n(\phi), U_n(\phi))$ given $X^n$, $J_n(c)$, converges in probability under $P_{X^n|\phi_0}$ to $J(c)$.

Conditions 5.1 (ii) and (iv) imply that the estimators for the lower and upper bounds of $H(\phi)$ attain the Bernstein-von Mises property: the sampling distribution of the bound estimators and the posterior distribution of the bounds coincide asymptotically in the sense of Theorem in Schervish (1995).
In the case of finite-dimensional $\phi$, Condition 5.1 (ii) and (iv), with $a_n = \sqrt{n}$ and bivariate normal $(L, U)$, are implied by the following set of assumptions: (a) the regularity of the likelihood of $\phi$ and the asymptotic normality of $\sqrt{n}(\hat{\phi} - \phi_0)$, (b) $\mu_\phi$ puts a positive probability on every open neighborhood of $\phi_0$ and $\mu_\phi$'s density is smooth at $\phi_0$, and (c) the applicability of the delta method to $\eta_l(\cdot)$ and $\eta_u(\cdot)$ at $\phi = \phi_0$ with non-zero first derivatives. 12

12 See Schervish (1995, Section 7.4) for further detail on these assumptions.
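The construction in Proposition 5.1 can be sketched numerically from posterior draws of the bounds of the identified set. The code below is a minimal illustration under assumptions, not the paper's implementation: the parameter of interest is scalar, the identified set is the interval between a lower and an upper bound, candidate centers are searched on a user-supplied grid, and the quantile is the empirical counterpart of the infimum in the proposition. The function name `lower_credible_interval` and all variable names are hypothetical.

```python
import math
import numpy as np


def lower_credible_interval(eta_l, eta_u, alpha, centers):
    """Approximate the volume-minimizing posterior lower credible interval.

    eta_l[s], eta_u[s] are the bounds of the identified set for the s-th
    posterior draw; centers is a grid of candidate interval centers.
    """
    eta_l = np.asarray(eta_l, dtype=float)
    eta_u = np.asarray(eta_u, dtype=float)
    n_draws = eta_l.size
    # Index of the smallest order statistic r with empirical P(d <= r) >= 1 - alpha.
    k = max(int(math.ceil((1.0 - alpha) * n_draws - 1e-9)) - 1, 0)
    best_c, best_r = None, math.inf
    for c in centers:
        # Distance from c to the farthest point of the interval [eta_l, eta_u].
        d = np.maximum(np.abs(c - eta_l), np.abs(c - eta_u))
        r = float(np.sort(d)[k])
        if r < best_r:
            best_c, best_r = float(c), r
    return best_c - best_r, best_c + best_r


# Hypothetical posterior draws of the bounds, for illustration only.
rng = np.random.default_rng(2)
eta_l = rng.normal(0.0, 0.1, size=2000)
eta_u = rng.normal(1.0, 0.1, size=2000)
c_lo, c_hi = lower_credible_interval(eta_l, eta_u, 0.05, np.linspace(-0.5, 1.5, 201))
print(c_lo, c_hi)
```

By construction, at least a fraction $1 - \alpha$ of the simulated identified sets lies inside the returned interval, which mirrors the constraint in (5.2); the grid resolution limits how close the result is to the exact minimizer.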

The next theorem establishes the large-sample coverage property of the posterior lower credible region $C^{*}_{1-\alpha}$.

Theorem 5.1 (Asymptotic Coverage Property) Assume Conditions 3.1 and 5.1. $C^{*}_{1-\alpha}$ can be interpreted as a frequentist confidence interval for the true identified set $H(\phi_0)$ with a pointwise asymptotic coverage probability $(1 - \alpha)$,

$$\lim_{n \to \infty} P_{X^n|\phi_0}(H(\phi_0) \subset C^{*}_{1-\alpha}) = 1 - \alpha.$$

Proof. See Appendix A.

This result shows that, for the interval-identified $\eta$, the posterior lower credible region asymptotically achieves the exact desired frequentist coverage for the identified set (Horowitz and Manski (2000), Chernozhukov, Hong, and Tamer (2007), and Romano and Shaikh (2010)). It is worth noting that the posterior lower credible region $C^{*}_{1-\alpha}$ differs from the confidence intervals for the parameter of interest, as considered in Imbens and Manski (2004) and Stoye (2009); in case $H(\phi_0)$ is an interval, $C^{*}_{1-\alpha}$ will be asymptotically wider than the frequentist confidence interval for $\eta$. This implies that the set of priors $\mathcal{M}(\mu_\phi)$ is too large to interpret $C^{*}_{1-\alpha}$ as the frequentist's confidence interval for $\eta$.

It is also worth noting that the asymptotic coverage probability presented in Theorem 5.1 is in the sense of pointwise asymptotic coverage rather than asymptotic uniform coverage over $\phi_0$. The frequentist literature has stressed the importance of the uniform coverage property of interval estimates in order to ensure that the interval estimates can have an accurate coverage probability in a finite sample situation (Imbens and Manski (2004), Andrews and Guggenberger (2009), Stoye (2009), Romano and Shaikh (2010), Andrews and Soares (2010), among many others). Examining whether or not the posterior lower credible region constructed above can attain a uniformly valid coverage probability for the identified set is beyond the scope of this paper and is left for future research.

We note that Condition 5.1 (iv) is quite a delicate condition, as illustrated by the following counterexample.
Example 5.1 Let the identified set be given by $H(\phi) = [\max\{\phi_1, \phi_2\}, \min\{\phi_3, \phi_4\}]$. This type of bound commonly appears in the intersection bound analysis (Manski (1990)), and has

attracted considerable attention in the literature (Hirano and Porter (2011), Chernozhukov, Lee, and Rosen (2011)). Condition 5.1 (iv) does not hold in this class of models when the true values of the arguments in the minimum or maximum happen to be equal. Let us focus on the lower bound $\eta_l(\phi) = \max\{\phi_1, \phi_2\}$. Assume that the maximum likelihood estimators $\hat{\phi}_1$ and $\hat{\phi}_2$ are independent with $\hat{\phi}_1 \sim N(\phi_{10}, 1/n)$ and $\hat{\phi}_2 \sim N(\phi_{20}, 1/n)$, where $\phi_{10} = \phi_{20}$. As for the posterior of $\phi_1$ and $\phi_2$, assume that $\phi_1 | X^n \sim N(\hat{\phi}_1, 1/n)$ and $\phi_2 | X^n \sim N(\hat{\phi}_2, 1/n)$. In this case, the sampling distribution of $\hat{L} = \sqrt{n}(\eta_l(\phi_0) - \eta_l(\hat{\phi}))$ and the posterior distribution of $L_n(\phi) = \sqrt{n}(\eta_l(\phi) - \eta_l(\hat{\phi}))$ are obtained as

$$\hat{L} \sim \min\{Z_1, Z_2\}, \qquad L_n(\phi) \mid X^n \sim \max\{ Z_1 + |\hat{\Delta}|_{-},\; Z_2 - |\hat{\Delta}|_{+} \},$$

where $(Z_1, Z_2)$ are independent standard normal variables, $\hat{\Delta} = \sqrt{n}(\hat{\phi}_1 - \hat{\phi}_2)$, $|a|_{-} = \min\{0, a\}$, and $|a|_{+} = \max\{0, a\}$. The posterior distribution of $L_n(\phi)$ fails to converge to the sampling distribution of $\hat{L}$ due to the non-vanishing $\hat{\Delta}$. Note that, even when $\hat{\Delta}$ happens to be zero, $L_n(\phi)$'s posterior distribution differs from the sampling distribution of $\hat{L}$. $C^{*}_{1-\alpha}$ will estimate $H(\phi_0)$ with inward bias, and the coverage probability for $H(\phi_0)$ will be lower than the nominal coverage.

A lesson from this example is that, despite the explicit introduction of ambiguity in the form of prior class $\mathcal{M}(\mu_\phi)$ and the decision-theoretic justification behind the construction of $C^{*}_{1-\alpha}$, the robust Bayes posterior inference procedure based on $C^{*}_{1-\alpha}$ does not correct the frequentist bias issue in the intersection bounds analysis. In contrast, the correct coverage will be attained if $C_{1-\alpha} = \bigcup_{\phi \in B_{1-\alpha}} H(\phi)$ is used as a robustified posterior credible region.

6 Concluding Remarks

This paper proposes a framework of a robust Bayes analysis for set-identified models in econometrics. I demonstrate that the posterior lower probability obtained from the prior class $\mathcal{M}(\mu_\phi)$ can be interpreted as the posterior probability law of the identified set (Theorem 3.1).
This robust Bayesian way of generating and interpreting the identified set as an a posteriori random object has not been investigated in the literature. This highlights the

seamless links among partial identification analysis, robust Bayes inference, and random set theory, and offers a unified framework of statistical decision and inference for set-identified parameters from the conditional perspective.

I employ the posterior gamma-minimax criterion to formulate and solve for a statistical decision with multiple posteriors. The objective function of the gamma-minimax criterion integrates the ambiguity associated with the set identification and the posterior uncertainty of the identified set into a single objective function. It leads to a numerically solvable posterior gamma-minimax action, as long as the identified sets $H(\phi)$ can be simulated from the posterior of $\phi$.

The posterior lower probability is a non-additive measure, so one complication of the lower probability inference is that we cannot plot it as we would do for posterior probability densities. To visualize it and conduct set estimation in a decision-theoretically justifiable way, I propose the posterior lower credible region. For an interval-identified parameter, I derive the conditions under which the posterior lower credible region with credibility $(1 - \alpha)$ can be interpreted as an asymptotically valid frequentist confidence interval for the identified set with coverage $(1 - \alpha)$. This claim can be seen as an extension of the celebrated Bernstein-von Mises theorem to the multiple prior Bayesian inference via the lower probability, and exemplifies a situation where robust Bayesians can accomplish a compromise between Bayesian and frequentist inference.

Appendix A Lemmas and Proofs

In this appendix, I first demonstrate that the set-valued mappings $\Gamma(\phi)$ and $H(\phi)$ defined in the main text are closed random sets (measurable and closed set-valued mappings) induced by a probability measure on $(\Phi, \mathcal{B})$.

Lemma A.1 Assume $(\Theta, \mathcal{A})$ and $(\Phi, \mathcal{B})$ are complete separable metric spaces. Under Condition 3.1, $\Gamma(\phi)$ and $H(\phi)$ are random closed sets induced by a probability measure on $(\Phi, \mathcal{B})$,


More information

MCMC Confidence Sets for Identified Sets

MCMC Confidence Sets for Identified Sets MCMC Confidence Sets for Identified Sets Xiaohong Chen Timothy M. Christensen Keith O'Hara Elie Tamer The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP28/16 MCMC Confidence

More information

Partial Identification and Confidence Intervals

Partial Identification and Confidence Intervals Partial Identification and Confidence Intervals Jinyong Hahn Department of Economics, UCLA Geert Ridder Department of Economics, USC September 17, 009 Abstract We consider statistical inference on a single

More information

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z

More information

Generalized instrumental variable models, methods, and applications

Generalized instrumental variable models, methods, and applications Generalized instrumental variable models, methods, and applications Andrew Chesher Adam M. Rosen The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP43/18 Generalized

More information

Estimation under Ambiguity

Estimation under Ambiguity Estimation under Ambiguity Raffaella Giacomini, Toru Kitagawa, and Harald Uhlig This draft: March 2019 Abstract To perform a Bayesian analysis for a set-identified model, two distinct approaches exist;

More information

Control Functions in Nonseparable Simultaneous Equations Models 1

Control Functions in Nonseparable Simultaneous Equations Models 1 Control Functions in Nonseparable Simultaneous Equations Models 1 Richard Blundell 2 UCL & IFS and Rosa L. Matzkin 3 UCLA June 2013 Abstract The control function approach (Heckman and Robb (1985)) in a

More information

An Instrumental Variable Model of Multiple Discrete Choice

An Instrumental Variable Model of Multiple Discrete Choice An Instrumental Variable Model of Multiple Discrete Choice Andrew Chesher y UCL and CeMMAP Adam M. Rosen z UCL and CeMMAP February, 20 Konrad Smolinski x UCL and CeMMAP Abstract This paper studies identi

More information

Simple Estimators for Semiparametric Multinomial Choice Models

Simple Estimators for Semiparametric Multinomial Choice Models Simple Estimators for Semiparametric Multinomial Choice Models James L. Powell and Paul A. Ruud University of California, Berkeley March 2008 Preliminary and Incomplete Comments Welcome Abstract This paper

More information

Counterfactual worlds

Counterfactual worlds Counterfactual worlds Andrew Chesher Adam Rosen The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP22/15 Counterfactual Worlds Andrew Chesher and Adam M. Rosen CeMMAP

More information

Non-parametric Identi cation and Testable Implications of the Roy Model

Non-parametric Identi cation and Testable Implications of the Roy Model Non-parametric Identi cation and Testable Implications of the Roy Model Francisco J. Buera Northwestern University January 26 Abstract This paper studies non-parametric identi cation and the testable implications

More information

Invariant HPD credible sets and MAP estimators

Invariant HPD credible sets and MAP estimators Bayesian Analysis (007), Number 4, pp. 681 69 Invariant HPD credible sets and MAP estimators Pierre Druilhet and Jean-Michel Marin Abstract. MAP estimators and HPD credible sets are often criticized in

More information

Comparison of inferential methods in partially identified models in terms of error in coverage probability

Comparison of inferential methods in partially identified models in terms of error in coverage probability Comparison of inferential methods in partially identified models in terms of error in coverage probability Federico A. Bugni Department of Economics Duke University federico.bugni@duke.edu. September 22,

More information

Sharp identified sets for discrete variable IV models

Sharp identified sets for discrete variable IV models Sharp identified sets for discrete variable IV models Andrew Chesher Konrad Smolinski The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP11/10 Sharp identi ed sets for

More information

A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor

A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z Boston College First

More information

Revisiting independence and stochastic dominance for compound lotteries

Revisiting independence and stochastic dominance for compound lotteries Revisiting independence and stochastic dominance for compound lotteries Alexander Zimper Working Paper Number 97 Department of Economics and Econometrics, University of Johannesburg Revisiting independence

More information

Sharp identification regions in models with convex moment predictions

Sharp identification regions in models with convex moment predictions Sharp identification regions in models with convex moment predictions Arie Beresteanu Ilya Molchanov Francesca Molinari The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper

More information

Solving Extensive Form Games

Solving Extensive Form Games Chapter 8 Solving Extensive Form Games 8.1 The Extensive Form of a Game The extensive form of a game contains the following information: (1) the set of players (2) the order of moves (that is, who moves

More information

Monte Carlo Confidence Sets for Identified Sets

Monte Carlo Confidence Sets for Identified Sets Monte Carlo Confidence Sets for Identified Sets Xiaohong Chen Timothy M. Christensen Elie Tamer First Draft: August 25; st Revision: September 27; 3rd Revision: June 28 Abstract It is generally difficult

More information

Alvaro Rodrigues-Neto Research School of Economics, Australian National University. ANU Working Papers in Economics and Econometrics # 587

Alvaro Rodrigues-Neto Research School of Economics, Australian National University. ANU Working Papers in Economics and Econometrics # 587 Cycles of length two in monotonic models José Alvaro Rodrigues-Neto Research School of Economics, Australian National University ANU Working Papers in Economics and Econometrics # 587 October 20122 JEL:

More information

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z

More information

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007

More information

Learning and Risk Aversion

Learning and Risk Aversion Learning and Risk Aversion Carlos Oyarzun Texas A&M University Rajiv Sarin Texas A&M University January 2007 Abstract Learning describes how behavior changes in response to experience. We consider how

More information

Applications of Subsampling, Hybrid, and Size-Correction Methods

Applications of Subsampling, Hybrid, and Size-Correction Methods Applications of Subsampling, Hybrid, and Size-Correction Methods Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Patrik Guggenberger Department of Economics UCLA November

More information

Measuring robustness

Measuring robustness Measuring robustness 1 Introduction While in the classical approach to statistics one aims at estimates which have desirable properties at an exactly speci ed model, the aim of robust methods is loosely

More information

Random Utility Models, Attention Sets and Status Quo Bias

Random Utility Models, Attention Sets and Status Quo Bias Random Utility Models, Attention Sets and Status Quo Bias Arie Beresteanu and Roee Teper y February, 2012 Abstract We develop a set of practical methods to understand the behavior of individuals when attention

More information

Inference Based on Conditional Moment Inequalities

Inference Based on Conditional Moment Inequalities Inference Based on Conditional Moment Inequalities Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Xiaoxia Shi Department of Economics University of Wisconsin, Madison

More information

What do instrumental variable models deliver with discrete dependent variables?

What do instrumental variable models deliver with discrete dependent variables? What do instrumental variable models deliver with discrete dependent variables? Andrew Chesher Adam Rosen The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP10/13 What

More information

Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap

Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Dale J. Poirier University of California, Irvine September 1, 2008 Abstract This paper

More information

Harvard University. Harvard University Biostatistics Working Paper Series

Harvard University. Harvard University Biostatistics Working Paper Series Harvard University Harvard University Biostatistics Working Paper Series Year 2010 Paper 117 Estimating Causal Effects in Trials Involving Multi-treatment Arms Subject to Non-compliance: A Bayesian Frame-work

More information

Dynamics of Inductive Inference in a Uni ed Model

Dynamics of Inductive Inference in a Uni ed Model Dynamics of Inductive Inference in a Uni ed Model Itzhak Gilboa, Larry Samuelson, and David Schmeidler September 13, 2011 Gilboa, Samuelson, and Schmeidler () Dynamics of Inductive Inference in a Uni ed

More information

Positive Political Theory II David Austen-Smith & Je rey S. Banks

Positive Political Theory II David Austen-Smith & Je rey S. Banks Positive Political Theory II David Austen-Smith & Je rey S. Banks Egregious Errata Positive Political Theory II (University of Michigan Press, 2005) regrettably contains a variety of obscurities and errors,

More information

Identifying Structural E ects in Nonseparable Systems Using Covariates

Identifying Structural E ects in Nonseparable Systems Using Covariates Identifying Structural E ects in Nonseparable Systems Using Covariates Halbert White UC San Diego Karim Chalak Boston College October 16, 2008 Abstract This paper demonstrates the extensive scope of an

More information

Economics 241B Review of Limit Theorems for Sequences of Random Variables

Economics 241B Review of Limit Theorems for Sequences of Random Variables Economics 241B Review of Limit Theorems for Sequences of Random Variables Convergence in Distribution The previous de nitions of convergence focus on the outcome sequences of a random variable. Convergence

More information

The Kuhn-Tucker Problem

The Kuhn-Tucker Problem Natalia Lazzati Mathematics for Economics (Part I) Note 8: Nonlinear Programming - The Kuhn-Tucker Problem Note 8 is based on de la Fuente (2000, Ch. 7) and Simon and Blume (1994, Ch. 18 and 19). The Kuhn-Tucker

More information

Robust inference about partially identified SVARs

Robust inference about partially identified SVARs Robust inference about partially identified SVARs Raffaella Giacomini and Toru Kitagawa This draft: June 2015 Abstract Most empirical applications using partially identified Structural Vector Autoregressions

More information

ECONOMETRICS II (ECO 2401) Victor Aguirregabiria. Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS

ECONOMETRICS II (ECO 2401) Victor Aguirregabiria. Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS ECONOMETRICS II (ECO 2401) Victor Aguirregabiria Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS 1. Introduction and Notation 2. Randomized treatment 3. Conditional independence

More information

Partial Identi cation in Monotone Binary Models: Discrete Regressors and Interval Data.

Partial Identi cation in Monotone Binary Models: Discrete Regressors and Interval Data. Partial Identi cation in Monotone Binary Models: Discrete Regressors and Interval Data. Thierry Magnac Eric Maurin y First version: February 004 This revision: December 006 Abstract We investigate identi

More information

Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com

Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com 1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com Gujarati D. Basic Econometrics, Appendix

More information

GENERIC RESULTS FOR ESTABLISHING THE ASYMPTOTIC SIZE OF CONFIDENCE SETS AND TESTS. Donald W.K. Andrews, Xu Cheng and Patrik Guggenberger.

GENERIC RESULTS FOR ESTABLISHING THE ASYMPTOTIC SIZE OF CONFIDENCE SETS AND TESTS. Donald W.K. Andrews, Xu Cheng and Patrik Guggenberger. GENERIC RESULTS FOR ESTABLISHING THE ASYMPTOTIC SIZE OF CONFIDENCE SETS AND TESTS By Donald W.K. Andrews, Xu Cheng and Patrik Guggenberger August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1813 COWLES

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Week 2 Spring Lecture 3. The Canonical normal means estimation problem (cont.).! (X) = X+ 1 X X, + (X) = X+ 1

Week 2 Spring Lecture 3. The Canonical normal means estimation problem (cont.).! (X) = X+ 1 X X, + (X) = X+ 1 Week 2 Spring 2009 Lecture 3. The Canonical normal means estimation problem (cont.). Shrink toward a common mean. Theorem. Let X N ; 2 I n. Let 0 < C 2 (n 3) (hence n 4). De ne!! (X) = X+ 1 Then C 2 X

More information

Inference for Parameters Defined by Moment Inequalities: A Recommended Moment Selection Procedure. Donald W.K. Andrews and Panle Jia

Inference for Parameters Defined by Moment Inequalities: A Recommended Moment Selection Procedure. Donald W.K. Andrews and Panle Jia Inference for Parameters Defined by Moment Inequalities: A Recommended Moment Selection Procedure By Donald W.K. Andrews and Panle Jia September 2008 Revised August 2011 COWLES FOUNDATION DISCUSSION PAPER

More information

Mean-Variance Utility

Mean-Variance Utility Mean-Variance Utility Yutaka Nakamura University of Tsukuba Graduate School of Systems and Information Engineering Division of Social Systems and Management -- Tennnoudai, Tsukuba, Ibaraki 305-8573, Japan

More information

Lecture 1- The constrained optimization problem

Lecture 1- The constrained optimization problem Lecture 1- The constrained optimization problem The role of optimization in economic theory is important because we assume that individuals are rational. Why constrained optimization? the problem of scarcity.

More information

ON STATISTICAL INFERENCE UNDER ASYMMETRIC LOSS. Abstract. We introduce a wide class of asymmetric loss functions and show how to obtain

ON STATISTICAL INFERENCE UNDER ASYMMETRIC LOSS. Abstract. We introduce a wide class of asymmetric loss functions and show how to obtain ON STATISTICAL INFERENCE UNDER ASYMMETRIC LOSS FUNCTIONS Michael Baron Received: Abstract We introduce a wide class of asymmetric loss functions and show how to obtain asymmetric-type optimal decision

More information

University of Toronto

University of Toronto A Limit Result for the Prior Predictive by Michael Evans Department of Statistics University of Toronto and Gun Ho Jang Department of Statistics University of Toronto Technical Report No. 1004 April 15,

More information

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL Endogeneity and Discrete Outcomes Andrew Chesher Centre for Microdata Methods and Practice, UCL July 5th 2007 Accompanies the presentation Identi cation and Discrete Measurement CeMMAP Launch Conference,

More information

IDENTIFICATION-ROBUST SUBVECTOR INFERENCE. Donald W. K. Andrews. September 2017 Updated September 2017 COWLES FOUNDATION DISCUSSION PAPER NO.

IDENTIFICATION-ROBUST SUBVECTOR INFERENCE. Donald W. K. Andrews. September 2017 Updated September 2017 COWLES FOUNDATION DISCUSSION PAPER NO. IDENTIFICATION-ROBUST SUBVECTOR INFERENCE By Donald W. K. Andrews September 2017 Updated September 2017 COWLES FOUNDATION DISCUSSION PAPER NO. 3005 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS YALE UNIVERSITY

More information

Parametric Inference on Strong Dependence

Parametric Inference on Strong Dependence Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation

More information

Supplemental Material 1 for On Optimal Inference in the Linear IV Model

Supplemental Material 1 for On Optimal Inference in the Linear IV Model Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University

More information

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Maximilian Kasy Department of Economics, Harvard University 1 / 40 Agenda instrumental variables part I Origins of instrumental

More information

Unconditional Quantile Regressions

Unconditional Quantile Regressions Unconditional Regressions Sergio Firpo, Pontifícia Universidade Católica -Rio Nicole M. Fortin, and Thomas Lemieux University of British Columbia October 2006 Comments welcome Abstract We propose a new

More information

Dominance and Admissibility without Priors

Dominance and Admissibility without Priors Dominance and Admissibility without Priors Jörg Stoye Cornell University September 14, 2011 Abstract This note axiomatizes the incomplete preference ordering that reflects statewise dominance with respect

More information

Stochastic dominance with imprecise information

Stochastic dominance with imprecise information Stochastic dominance with imprecise information Ignacio Montes, Enrique Miranda, Susana Montes University of Oviedo, Dep. of Statistics and Operations Research. Abstract Stochastic dominance, which is

More information

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL BRENDAN KLINE AND ELIE TAMER Abstract. Randomized trials (RTs) are used to learn about treatment effects. This paper

More information

Principles Underlying Evaluation Estimators

Principles Underlying Evaluation Estimators The Principles Underlying Evaluation Estimators James J. University of Chicago Econ 350, Winter 2019 The Basic Principles Underlying the Identification of the Main Econometric Evaluation Estimators Two

More information

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL & IFS. Revised April 2nd 2008

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL & IFS. Revised April 2nd 2008 Endogeneity and Discrete Outcomes Andrew Chesher Centre for Microdata Methods and Practice, UCL & IFS Revised April 2nd 2008 Abstract. This paper studies models for discrete outcomes which permit explanatory

More information

Chapter 2. GMM: Estimating Rational Expectations Models

Chapter 2. GMM: Estimating Rational Expectations Models Chapter 2. GMM: Estimating Rational Expectations Models Contents 1 Introduction 1 2 Step 1: Solve the model and obtain Euler equations 2 3 Step 2: Formulate moment restrictions 3 4 Step 3: Estimation and

More information