Estimation and Inference for Set-identified Parameters Using Posterior Lower Probability


Toru Kitagawa
CeMMAP and Department of Economics, UCL

First Draft: September 2010
This Draft: March, 2012

Abstract

In inference about set-identified parameters, it is known that the Bayesian probability statements about the unknown parameters do not coincide, even asymptotically, with the frequentist's confidence statements. This paper aims to smooth out this disagreement from a multiple-prior robust Bayes perspective. We show that there exists a class of prior distributions (ambiguous belief) with which the posterior probability statements drawn via the lower envelope (lower probability) of the posterior class asymptotically agree with the frequentist confidence statements for the identified set. With such a class of priors, we analyze statistical decision problems, including point estimation of the set-identified parameter, by applying the gamma-minimax criterion. From a subjective robust Bayes point of view, the class of priors can serve as a benchmark ambiguous belief: by introspectively assessing it, one can judge whether the frequentist inferential output for the set-identified model is an appealing approximation of one's posterior ambiguous belief.

Keywords: Partial Identification, Bayesian Robustness, Belief Function, Imprecise Probability, Gamma-minimax, Random Set.

JEL Classification: C12, C15, C21.

t.kitagawa@ucl.ac.uk. I thank Andrew Chesher, Siddhartha Chib, Jean-Pierre Florens, Charles Manski, Ulrich Müller, Andriy Norets, Adam Rosen, Kevin Song, and Elie Tamer for valuable discussions and comments. I also thank the seminar participants at Academia Sinica Taiwan, Cowles Summer Conference 2011, EC² Conference 2010, MEG 2010, RES Annual Conference 2011, Simon Fraser University, and the University of British Columbia for helpful comments. All remaining errors are mine.
Financial support from the ESRC through the ESRC Centre for Microdata Methods and Practice (CEMMAP) (grant number RES ) is gratefully acknowledged.

1 Introduction

In the usual situations of inferring identified parameters, the Bayesian probability statements about the unknown parameters are similar, at least asymptotically, to the frequentist confidence statements for the true value of the parameters. In partial identification analysis initiated by Manski (1989, 1990, 2003, 2008), it is known that such asymptotic harmony between the two inference paradigms breaks down (Moon and Schorfheide (2011)). The Bayesian interval estimates for the set-identified parameter are shorter, even asymptotically, than the frequentist ones, and they asymptotically lie strictly inside the frequentist's confidence intervals. Frequentists might interpret this phenomenon as showing that the Bayesian's confidence in his inferential statements is fictitious overconfidence. Bayesians, on the other hand, might reply that the frequentist's confidence statements suffer from an extreme conservatism that lacks any posterior probability justification.

The main aim of this paper is to smooth out this disagreement between the two schools of statistical inference from the perspective of robust Bayes inference: a third paradigm of statistical inference, in which we can incorporate the researcher's partial prior knowledge into posterior inference. Among various robust Bayes approaches, this paper focuses on a multiple-prior Bayes analysis, where the partial prior knowledge, or the robustness concern against prior misspecification, is modeled by a class of priors (ambiguous belief). The Bayes rule is applied to each prior to form the class of posteriors. The posterior inference procedures considered in this paper operate on the class of posteriors constructed in this way by focusing on its lower and upper envelopes, the so-called posterior lower and upper probabilities. Along this robust Bayes approach, this paper considers the following questions.
First, are there any classes of priors with which the probabilistic statements made via the posterior lower probability can approximate the frequentist's confidence statements for set-identified parameters? The answer turns out to be yes. When the parameters are not identified, a prior distribution of the model parameters is decomposed into two components: one that can be updated by data (revisable prior knowledge) and one that can never be updated by data (unrevisable prior knowledge). We show that, if a prior class is designed in such a way that it shares a single prior distribution for the revisable prior knowledge but allows for arbitrary prior distributions for the unrevisable prior knowledge, then, under certain regularity conditions, the posterior interval estimates constructed from the posterior lower probability asymptotically attain the correct frequentist coverage probability for the identified set.

The second question we examine is whether, with such a class of priors, it is possible to formulate and solve statistical decision problems, including point estimation of the set-identified parameter. We approach this question by adapting the gamma-minimax decision analysis, which can be seen as a minimax analysis in the multiple-prior setup. We demonstrate that the proposed prior class leads to an analytically tractable and numerically solvable formulation of the gamma-minimax decision problem, provided that the identified set for the parameter of interest can be computed for each value of the identified parameters.

We believe our analyses are insightful not only from the "objective" Bayesian point of view but also from the subjective robust Bayesian point of view, because the classes of priors that asymptotically match the frequentist's inference can serve as a benchmark ambiguous belief. That is, by subjectively assessing whether the ambiguous belief represented by the benchmark prior class appears reasonable, one can judge whether the frequentist's inferential output is too conservative in view of one's posterior ambiguous belief. On this issue, we argue through an example that some priors in the benchmark prior class appear less appealing than others, so that subjective robust Bayesians would consider the frequentist inference statements for the identified set too uninformative to be interpreted as an approximation of their posterior ambiguous belief.

1.1 Literature Review

Estimation and inference in partially identified models have been a growing area of research in econometrics. From the frequentist perspective, Horowitz and Manski (2000) construct confidence sets for the interval identified set. Imbens and Manski (2004) propose uniformly asymptotically valid confidence sets for an interval-identified parameter. Stoye (2009) extends their analysis to a different set of conditions.
Chernozhukov, Hong, and Tamer (2007) develop a way to construct asymptotically valid confidence sets for an identified set based on the criterion function approach, which can be applied to a wide range of partially identified models, including moment inequality models. Related to the criterion function approach, the literature on constructing confidence sets by inverting test statistics includes Andrews and Guggenberger (2009), Andrews and Soares (2010), and Romano and Shaikh (2010), to list a few. From the Bayesian perspective, Neath and Samaniego (1997), Poirier (1998), and Gustafson (2009, 2010) analyze how Bayesian updating operates when a model lacks identifiability. Liao and Jiang (2010) conduct Bayesian inference for moment inequality models based on the pseudo-likelihood. Moon and Schorfheide (2011) compare the asymptotic properties of frequentist and Bayesian inference for set-identified models, and our robust Bayes analysis is motivated by their important finding on the asymptotic disagreement between them.

The lower and upper probabilities originate from Dempster (1966, 1967a, 1967b, 1968) in his fiducial argument for drawing posterior inferences without specifying a prior distribution. Dempster's analyses have had a large impact on the fields of statistics, artificial intelligence, and decision theory, for example through the belief function analysis of Shafer (1976, 1982) and the imprecise probability analysis of Walley (1991). The lower and upper probabilities have also played a major role in robust Bayes analysis for measuring the global sensitivity of the posterior (Berger (1984), Berger and Berliner (1986)) and for characterizing a class of posteriors (DeRobertis and Hartigan (1981), Wasserman (1989, 1990), and Wasserman and Kadane (1990)). In econometrics, the pioneering work using multiple priors was carried out by Chamberlain and Leamer (1976) and Leamer (1982), who obtained bounds for the posterior mean of the regression coefficients when a prior varies over a certain class. These previous studies do not explicitly consider non-identified models, whereas this paper focuses on non-identified models and aims to clarify the link between the early idea of lower and upper probabilities and the recent issue of inference in set-identified models.

The posterior lower probability that we obtain with our specification of the prior class turns out to be an infinite-order capacity or, equivalently, a containment functional in random set theory (see, e.g., Molchanov (2005)). Beresteanu and Molinari (2008) and Beresteanu, Molchanov, and Molinari (2012) show the usefulness and wide applicability of random set theory to partially identified models by viewing an observation as a random set and the estimand as its Aumann expectation.
Galichon and Henry (2006, 2009) propose the use of infinite-order capacities in defining and inferring the identified set in the framework of incomplete econometric models. Our analysis, despite the difference in the source of probability, can be seen as a robust Bayes analogue of these frequentist inference procedures based on random set theory and capacities.

Our decision-theoretic analysis employs the gamma-minimax criterion (see Berger (1985)), which leads to a decision that minimizes the risk formed under the most pessimistic prior within the class. The gamma-minimax decision problem often becomes challenging both analytically and numerically, and its analysis has been limited to rather simple parametric models with a particular choice of prior class (see Betro and Ruggeri (1992) and Vidakovic (2000), for example). Our specification of the prior class, in contrast, offers a simple solution to the gamma-minimax decision problem, provided

that the identified set for the parameter of interest can be computed for each value of the identified parameters.

1.2 Plan of the Paper

The rest of the paper is organized as follows. In Section 2, we illustrate our main results using a simple example of missing data. In Section 3, a general framework is introduced. We construct a class of prior distributions that can contain arbitrary unrevisable prior knowledge, and derive the posterior lower and upper probabilities. Statistical decision analyses with multiple priors are examined in Section 4. In Section 5, we investigate how to construct the posterior credible region based on the posterior lower probability, and its large-sample behavior is examined in an interval-identified parameter case. Proofs and lemmas are provided in Appendix A.

2 An Illustrating Example: Missing Data

In this section we illustrate the main results of the paper using an example of missing data (Manski (1989)). Let $Y \in \{1, 0\}$ be a binary outcome of interest (e.g., a worker is employed or not) and let $D \in \{1, 0\}$ be an indicator of whether $Y$ is observed ($D = 1$) or not ($D = 0$), i.e., whether the subject responded or not. Data are given by a random sample of size $n$, $x = \{(Y_i D_i, D_i) : i = 1, \dots, n\}$.

The starting point of our analysis is to specify a parameter vector $\theta \in \Theta$ that pins down the distribution of data as well as the parameter of interest. Here, $\theta$ can be specified by a vector of four probability masses, $\theta = (\theta_{yd})$, $\theta_{yd} \equiv \Pr(Y = y, D = d)$, $y = 1, 0$, $d = 1, 0$. The observed data likelihood for $\theta$ is written as

$$p(x|\theta) = \theta_{11}^{n_{11}} \, \theta_{01}^{n_{01}} \, [\theta_{10} + \theta_{00}]^{n_{mis}},$$

where $n_{11} = \sum_{i=1}^{n} Y_i D_i$, $n_{01} = \sum_{i=1}^{n} (1 - Y_i) D_i$, and $n_{mis} = \sum_{i=1}^{n} (1 - D_i)$.
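Clearly, this likelihood function depends on $\theta$ only through the three probability masses, $\phi = (\phi_{11}, \phi_{01}, \phi_{mis}) \equiv (\theta_{11}, \theta_{01}, \theta_{10} + \theta_{00})$, $\phi \in \Phi$, no matter what the observations are, so that the likelihood for $\theta$ has "data-independent" flat regions expressed by a set-valued map of $\phi$,

$$\Gamma(\phi) \equiv \{\theta \in \Theta : \theta_{11} = \phi_{11}, \ \theta_{01} = \phi_{01}, \ \theta_{10} + \theta_{00} = \phi_{mis}\}.$$

The parameter of interest is the mean of $Y$, which is written as a function of $\theta$, $\eta \equiv \Pr(Y = 1) = \theta_{11} + \theta_{10}$. The identified set of $\eta$, $H(\phi)$, as a set-valued map of $\phi$, is defined by the range of $\eta$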

when $\theta$ varies over $\Gamma(\phi)$,

$$H(\phi) = [\phi_{11}, \ \phi_{11} + \phi_{mis}],$$

which are Manski's (1989) bounds for $\Pr(Y = 1)$.

In a standard Bayes inference for $\eta$, we specify a prior of $\theta$, update it by the Bayes rule, and marginalize the posterior of $\theta$ to $\eta$. If the likelihood for $\theta$ has data-independent flat regions, then the prior for $\theta$ conditional on $\{\theta \in \Gamma(\phi)\}$ (i.e., the belief on how the proportion of missing observations $\phi_{mis}$ is divided into $\theta_{10}$ and $\theta_{00}$) will never be updated by data, so that the posteriors of $\theta$ and $\eta$ become sensitive to how it is specified. The robust Bayes procedure considered in this paper aims to make posterior inference free from this sensitivity concern by introducing multiple priors for $\theta$.

The first step in constructing our prior class is to specify a single prior $\mu_\phi$ for the identified parameters $\phi$. In terms of $\theta$, the prior $\mu_\phi$ specifies how much prior belief should be assigned to each flat region $\Gamma(\phi)$ of the likelihood for $\theta$, whereas the implied prior for $\theta$ may differ depending on how the assigned belief is allocated over $\theta \in \Gamma(\phi)$ (for each $\phi$). Therefore, by collecting all the possible ways to allocate the assigned belief over $\{\theta \in \Gamma(\phi) ; \phi \in \Phi\}$, we obtain the following class of prior distributions of $\theta$:

$$\mathcal{M}(\mu_\phi) = \{\mu_\theta : \mu_\theta(\Gamma(B)) = \mu_\phi(B) \text{ for all } B \subset \Phi\}.$$

By applying the Bayes rule to each $\mu_\theta \in \mathcal{M}(\mu_\phi)$ and marginalizing the posterior of $\theta$ to $\eta$, we obtain the class of posteriors of $\eta$, $\{F_{\eta|X} : \mu_\theta \in \mathcal{M}(\mu_\phi)\}$. The class of posteriors is summarized by its lower envelope (lower probability),

$$F_{\eta|X*}(D) = \inf_{\mu_\theta \in \mathcal{M}(\mu_\phi)} F_{\eta|X}(D),$$

which is interpreted as saying that the posterior belief for $\{\eta \in D\}$ is at least $F_{\eta|X*}(D)$ no matter which $\mu_\theta \in \mathcal{M}(\mu_\phi)$ is used (i.e., no matter how prior beliefs are formed for $\theta_{10}$ and $\theta_{00}$ given the proportion of missing observations $\phi_{mis}$). In the main theorem of this paper, we show that

$$F_{\eta|X*}(D) = F_{\phi|X}(\{\phi : H(\phi) \subset D\}),$$

where $F_{\phi|X}$ denotes the posterior distribution of $\phi$ implied by the prior $\mu_\phi$.
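The containment formula for the lower probability suggests a direct Monte Carlo approximation in the missing data example. The following is a minimal sketch, not the paper's implementation: it assumes a Dirichlet(1, 1, 1) prior on the three identified probability masses (so the posterior is Dirichlet with the observed counts added), and the function names `draw_dirichlet` and `lower_upper_probability` are hypothetical. The upper probability is computed from the complementary "hitting" event that the identified interval overlaps the target interval.

```python
import random


def draw_dirichlet(alpha, rng):
    """One draw from Dirichlet(alpha) via normalized Gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]


def lower_upper_probability(n11, n01, n_mis, interval, S=20000, seed=0):
    """Monte Carlo lower/upper posterior probability that Pr(Y=1) lies in [a, b].

    Assumption (not from the paper): a Dirichlet(1, 1, 1) prior on
    (phi_11, phi_01, phi_mis), so the posterior is Dirichlet(1 + counts).
    """
    a, b = interval
    rng = random.Random(seed)
    alpha_post = [1 + n11, 1 + n01, 1 + n_mis]
    contained = 0  # draws with H(phi) inside [a, b]
    hit = 0        # draws with H(phi) overlapping [a, b]
    for _ in range(S):
        phi11, _phi01, phi_mis = draw_dirichlet(alpha_post, rng)
        lo, hi = phi11, phi11 + phi_mis  # identified set H(phi)
        if a <= lo and hi <= b:
            contained += 1
        if lo <= b and hi >= a:
            hit += 1
    return contained / S, hit / S


# Example: 40 observed ones, 40 observed zeros, 20 missing outcomes.
lower, upper = lower_upper_probability(40, 40, 20, (0.3, 0.7))
print(lower, upper)
```

Because containment implies overlap, the returned lower probability never exceeds the upper probability. The gap between the two reflects the part of the posterior belief that depends on the unrevisable allocation of the missing proportion.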
An implication of this result is that the posterior probability law of the identified sets, $H(\phi)$ with $\phi \sim F_{\phi|X}$, can be used to summarize the posterior ambiguous belief (multiple posteriors) for $\eta$ implied by the prior class $\mathcal{M}(\mu_\phi)$, and this fact plays a key role in our development of statistical decision and set inference for $\eta$ based on the posterior lower probability. Leaving the formal analysis to later sections, let us state briefly how to implement the posterior lower probability inference for $\eta$.

1. Specify a prior for $\phi$ and update it by the Bayes rule. When a credible prior for $\phi$ is not available, a reasonably "non-informative" prior may be used (footnote 1).

2. Let $\{\phi_s : s = 1, \dots, S\}$ be random draws of $\phi$ from the posterior of $\phi$. The mean and median of the posterior lower probability of $\eta$ can be defined via the gamma-minimax decision criterion, and they can be approximated by

$$\arg\min_a \frac{1}{S} \sum_{s=1}^{S} \sup_{\eta \in H(\phi_s)} (a - \eta)^2 \quad \text{and} \quad \arg\min_a \frac{1}{S} \sum_{s=1}^{S} \sup_{\eta \in H(\phi_s)} |a - \eta|,$$

respectively.

3. The posterior lower credible region of $\eta$ at credibility level $1 - \alpha$, which can be interpreted as a contour set of the posterior lower probability of $\eta$, is defined by the smallest interval that contains $H(\phi)$ with posterior probability $1 - \alpha$ (Proposition 5.1 below proposes an algorithm to compute it for interval-identified cases).

Under certain regularity conditions, which are satisfied in the current missing data example, the posterior lower credible region of $\eta$ constructed in this way asymptotically attains the frequentist coverage probability $1 - \alpha$ for the true identified set.

3 Multiple-prior Analysis and the Lower and Upper Probabilities

3.1 Likelihood and Set Identification: The General Framework

Let $(\mathbf{X}, \mathcal{X})$ and $(\Theta, \mathcal{A})$ be measurable spaces of a sample $X \in \mathbf{X}$ and a parameter vector $\theta \in \Theta$, respectively. Our analytical framework covers both a parametric model, $\Theta = R^d$, $d < \infty$, and a non-parametric model where $\Theta$ is a separable Banach space. We make the sample size implicit in our notation until the large-sample analysis in Section 5. Let $\mu_\theta$ be a marginal probability distribution on the parameter space $(\Theta, \mathcal{A})$, referred to as a prior distribution for $\theta$. We assume that the conditional distribution of $X$ given $\theta$ has the probability density $p(x|\theta)$ at every $\theta \in \Theta$ with respect to a $\sigma$-finite measure on $(\mathbf{X}, \mathcal{X})$. We call $p(x|\theta)$ the likelihood for $\theta$. The parameter vector $\theta$ may consist of parameters that determine the behaviors of economic agents as well as those that characterize the distribution of unobserved heterogeneities in the population.
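As a concrete illustration using the missing data example of Section 2, the gamma-minimax point estimators described in the implementation steps can be sketched numerically. This is a hedged sketch rather than the paper's algorithm: it again assumes a Dirichlet(1, 1, 1) prior on the identified probability masses, the function names are hypothetical, and the minimization over the action uses a simple ternary search, which relies on the averaged worst-case loss being convex in the action. For an interval-valued identified set and a convex loss, the supremum over the set is attained at one of its endpoints.

```python
import random


def draw_identified_sets(n11, n01, n_mis, S, seed=0):
    """Posterior draws of H(phi) = [phi_11, phi_11 + phi_mis].

    Assumption (not from the paper): Dirichlet(1, 1, 1) prior on phi.
    """
    rng = random.Random(seed)
    alpha = (1 + n11, 1 + n01, 1 + n_mis)
    sets = []
    for _ in range(S):
        g = [rng.gammavariate(a, 1.0) for a in alpha]
        total = sum(g)
        phi11, phi_mis = g[0] / total, g[2] / total
        sets.append((phi11, phi11 + phi_mis))
    return sets


def worst_case_risk(a, sets, loss):
    """Average over draws of the sup of loss(a - eta) over eta in H(phi_s)."""
    # A convex loss attains its sup over an interval at an endpoint.
    return sum(max(loss(a - lo), loss(a - hi)) for lo, hi in sets) / len(sets)


def gamma_minimax_estimate(sets, loss, tol=1e-7):
    """Minimize the averaged worst-case loss over a in [0, 1] by ternary search."""
    left, right = 0.0, 1.0
    while right - left > tol:
        m1 = left + (right - left) / 3
        m2 = right - (right - left) / 3
        if worst_case_risk(m1, sets, loss) < worst_case_risk(m2, sets, loss):
            right = m2
        else:
            left = m1
    return (left + right) / 2


sets = draw_identified_sets(40, 40, 20, S=2000)
mean_type = gamma_minimax_estimate(sets, lambda e: e * e)  # quadratic loss
median_type = gamma_minimax_estimate(sets, abs)            # absolute loss
print(mean_type, median_type)
```

With counts that are symmetric in the observed zeros and ones, both estimates land near the middle of the identified interval, which is what one would expect from a criterion that guards against the worst allocation of the missing outcomes.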
In the context of the missing data or counterfactual causal models, $\theta$ indexes the distribution of the underlying population outcomes or the potential outcomes (see Example 3.1 below). In all of these cases, the parameter $\theta$ should be distinguished from the parameters that are

Footnote 1. See Kass and Wasserman (1996) for a survey of reasonably non-informative priors.

used solely to index the sampling distribution of observations. A problem with the identification of $\theta$ typically arises in this context: if multiple values of $\theta$ can generate the same distribution of data, then we say that these $\theta$'s are observationally equivalent and the identification of $\theta$ fails. In terms of the likelihood function, the observational equivalence of $\theta$ and $\theta' \neq \theta$ means that the values of the likelihood at $\theta$ and $\theta'$ are equal for every possible sample, i.e., $p(x|\theta) = p(x|\theta')$ for every $x \in \mathbf{X}$ (Rothenberg (1971), Drèze (1974), and Kadane (1974)). We represent the observational equivalence relation of $\theta$'s by a many-to-one function $g : (\Theta, \mathcal{A}) \to (\Phi, \mathcal{B})$:

$$g(\theta) = g(\theta') \text{ if and only if } p(x|\theta) = p(x|\theta') \text{ for all } x \in \mathbf{X}.$$

The equivalence relation partitions the parameter space $\Theta$ into equivalence classes, in each of which the likelihood of $\theta$ is flat irrespective of observations, and $\phi = g(\theta)$ maps each of these equivalence classes to a point in another parameter space $\Phi$. In the language of structural models in econometrics (Hurwicz (1950), and Koopmans and Reiersøl (1950)), $\phi = g(\theta)$ is interpreted as the reduced-form parameter that carries all the information for the structural parameters $\theta$ through the value of the likelihood function. In the literature of Bayesian statistics, $\phi = g(\theta)$ is referred to as the minimally sufficient parameter (abbreviated to sufficient parameter), and the range space of $g(\cdot)$, $(\Phi, \mathcal{B})$, is called the sufficient parameter space (Barankin (1960), Dawid (1979), Florens and Mouchart (1977), Picci (1977), and Florens, Mouchart, and Rolin (1990)) (footnote 2). See Florens and Simoni (2011) for comprehensive discussions on the relationship between frequentist and Bayesian identification. In the presence of sufficient parameters, the likelihood depends on $\theta$ only through the function $g(\theta)$, i.e., there exists a $\mathcal{B}$-measurable function $\hat{p}(x|\cdot)$ such that

$$p(x|\theta) = \hat{p}(x|g(\theta)) \quad \forall x \in \mathbf{X} \text{ and } \theta \in \Theta \qquad (3.1)$$

holds (Lemma of Lehmann and Romano (2005)).
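The factorization of the likelihood through the sufficient parameter can be checked numerically in the missing data example of Section 2. The sketch below is illustrative only: the two parameter vectors and all function names are hypothetical, and the tuple ordering of the four probability masses is a choice made here. Both vectors split the same missing proportion differently, so they map to the same sufficient parameter, yield identical likelihood values for any counts, and yet imply different values of $\Pr(Y = 1)$.

```python
import math


def log_likelihood(theta, n11, n01, n_mis):
    """Observed-data log-likelihood; theta = (t11, t10, t01, t00)."""
    t11, t10, t01, t00 = theta
    return (n11 * math.log(t11)
            + n01 * math.log(t01)
            + n_mis * math.log(t10 + t00))


def g(theta):
    """Sufficient parameter phi = (phi_11, phi_01, phi_mis)."""
    t11, t10, t01, t00 = theta
    return (t11, t01, t10 + t00)


def eta(theta):
    """Parameter of interest Pr(Y = 1) = theta_11 + theta_10."""
    return theta[0] + theta[1]


# Two hypothetical parameter values that split phi_mis = 0.25 differently.
theta_a = (0.5, 0.125, 0.25, 0.125)
theta_b = (0.5, 0.0625, 0.25, 0.1875)

print(g(theta_a) == g(theta_b))          # same sufficient parameter
print(eta(theta_a), eta(theta_b))        # different values of Pr(Y = 1)
for counts in [(3, 2, 1), (40, 40, 20), (0, 7, 9)]:
    la = log_likelihood(theta_a, *counts)
    lb = log_likelihood(theta_b, *counts)
    print(counts, math.isclose(la, lb))  # likelihoods agree for every sample
```

Both implied values of $\Pr(Y = 1)$ lie in the interval $[\phi_{11}, \phi_{11} + \phi_{mis}] = [0.5, 0.75]$, which is the identified set at this value of the sufficient parameter.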
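Denote the inverse image of $g(\cdot)$ by $\Gamma$:

$$\Gamma(\phi) = \{\theta \in \Theta : g(\theta) = \phi\},$$

where $\Gamma(\phi)$ and $\Gamma(\phi')$ for $\phi \neq \phi'$ are disjoint, and $\{\Gamma(\phi) ; \phi \in \Phi\}$ constitutes a partition of $\Theta$. We assume that $g(\cdot)$ is onto, $g(\Theta) = \Phi$, so that $\Gamma(\phi)$ is assumed to be non-empty for every $\phi \in \Phi$ (footnote 3).

Footnote 2. The sufficient parameter space is unique up to one-to-one transformations (Picci (1977)).

Footnote 3. In an observationally restrictive model in the sense of Koopmans and Reiersøl (1950), $\hat{p}(x|\cdot)$, i.e., the likelihood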

In the set-identified model, the parameter of interest is typically a subvector or a transformation of $\theta$. We denote the parameters of interest by a measurable transformation of $\theta$, $\eta = h(\theta)$, $h : (\Theta, \mathcal{A}) \to (\mathcal{H}, \mathcal{D})$. We define the identified set of $\eta$ by the projection of $\Gamma(\phi)$ onto $\mathcal{H}$ through $h(\cdot)$.

Definition 3.1 (Identified Set of $\eta$) (i) The identified set of $\eta$ is a set-valued map $H : \Phi \rightrightarrows \mathcal{H}$ defined by

$$H(\phi) \equiv \{h(\theta) : \theta \in \Gamma(\phi)\}.$$

(ii) The parameter $\eta = h(\theta)$ is point-identified at $\phi$ if $H(\phi)$ is a singleton, and $\eta$ is set-identified at $\phi$ if $H(\phi)$ is not a singleton.

Note that identification of $\eta$ is defined in the pre-posterior sense because it is based on the likelihood evaluated at every possible realization of a sample, not only on the observed one. The task of constructing the sharp bounds of $\eta$ in a partially identified model is equivalent to finding the expression of $H(\phi)$.

3.2 Examples

We now give some examples, both to illustrate the above concepts and notation and to provide a concrete focus for later development.

Example 3.1 (Bounding ATE by Linear Programming) Consider the treatment effect model with noncompliance and a binary instrument $Z \in \{1, 0\}$, as considered in Imbens and Angrist (1994) and Angrist, Imbens, and Rubin (1996). Assume that the treatment status and the outcome of interest are both binary. Let $Y \in \mathcal{Y}$ be the observed outcome and $(Y_1, Y_0)$ be the potential outcomes, as defined in Example 2.1. We denote by $(D_1, D_0) \in \{1, 0\}^2$ the potential selection responses to the instrument, with the observed treatment status $D = Z D_1 + (1 - Z) D_0$. The data consist of a random sample of $(Y_i, D_i, Z_i)$.
Following Imbens and Angrist (1994), we partition the population into four subpopulations defined in terms of potential treatment-selection responses:

$$T_i = \begin{cases} c & \text{if } D_{1i} = 1 \text{ and } D_{0i} = 0: \text{complier}, \\ at & \text{if } D_{1i} = D_{0i} = 1: \text{always-taker}, \\ nt & \text{if } D_{1i} = D_{0i} = 0: \text{never-taker}, \\ d & \text{if } D_{1i} = 0 \text{ and } D_{0i} = 1: \text{defier}, \end{cases}$$

Footnote 3 (continued). function for the sufficient parameters, is well defined for a domain larger than $g(\Theta)$; for example, see Example 2.3 in Section 2.2. In this case, the model possesses the refutability property, and $\Gamma(\phi)$ can be empty for some $\phi \in \Phi$. We do not consider such refutable models in this paper.

where $T_i$ is the indicator for the types of selection responses, which are latent since we cannot observe both elements of $(D_{1i}, D_{0i})$. We assume a randomized instrument, $Z \perp (Y_1, Y_0, D_1, D_0)$. Then, the distribution of observables and the distribution of potential outcomes satisfy the following equalities for $y \in \{1, 0\}$:

$$\Pr(Y = y, D = 1 | Z = 1) = \Pr(Y_1 = y, T = c) + \Pr(Y_1 = y, T = at), \qquad (3.2)$$
$$\Pr(Y = y, D = 1 | Z = 0) = \Pr(Y_1 = y, T = d) + \Pr(Y_1 = y, T = at),$$
$$\Pr(Y = y, D = 0 | Z = 1) = \Pr(Y_0 = y, T = d) + \Pr(Y_0 = y, T = nt),$$
$$\Pr(Y = y, D = 0 | Z = 0) = \Pr(Y_0 = y, T = c) + \Pr(Y_0 = y, T = nt).$$

Ignoring the marginal distribution of $Z$, a full parameter vector of the model can be specified by a joint distribution of $(Y_1, Y_0, T)$ that consists of 16 probability masses:

$$\theta = \left( \Pr(Y_1 = y, Y_0 = y', T = t) : y = 1, 0; \ y' = 1, 0; \ t = c, nt, at, d \right).$$

Let ATE be the parameter of interest:

$$E(Y_1 - Y_0) = \sum_{t = c, nt, at, d} \left[ \Pr(Y_1 = 1, T = t) - \Pr(Y_0 = 1, T = t) \right] = \sum_{t = c, nt, at, d} \sum_{y = 1, 0} \left[ \Pr(Y_1 = 1, Y_0 = y, T = t) - \Pr(Y_1 = y, Y_0 = 1, T = t) \right] \equiv h(\theta).$$

The likelihood conditional on $Z$ depends on $\theta$ only through the conditional distribution of $(Y, D)$ given $Z$, so the sufficient parameter vector consists of eight probability masses:

$$\phi = \left( \Pr(Y = y, D = d | Z = z) : y = 1, 0; \ d = 1, 0; \ z = 1, 0 \right).$$

The set of equations given in (3.2) characterizes the observationally equivalent set of distributions of $(Y_1, Y_0, T)$, $\Gamma(\phi)$, when the data distribution is given at $\phi$. Balke and Pearl (1997) derive the identified set of ATE, $H(\phi) = h(\Gamma(\phi))$, by maximizing or minimizing $h(\theta)$ subject to $\theta$ being in the probability simplex and satisfying constraints (3.2).
Since the objective function and the constraints are all linear, this optimization can be solved by linear programming and, consequently, $H(\phi)$ is obtained as a convex interval whose lower and upper bounds are the achieved minimum and maximum of the objective function in the linear optimization (see Balke and Pearl (1997) for a closed-form expression of the bounds and Kitagawa (2009) for an extension to general support of $Y$).

Note that, in this model, special attention is needed for the sufficient parameter space $\Phi$ in order to ensure that the identified set of $\theta$, $\Gamma(\phi)$, is non-empty. Pearl (1995) shows that the distribution of data is compatible with the instrument exogeneity condition, $Z \perp (Y_1, Y_0, D_1, D_0)$, if and only if

$$\max_d \sum_y \max_z \{\Pr(Y = y, D = d | Z = z)\} \leq 1. \qquad (3.3)$$

This implies that, in order to guarantee $\Gamma(\phi) \neq \emptyset$, a prior distribution for $\phi$ must be supported only on the distributions of data that fulfill (3.3).

Example 3.2 (Linear Moment Inequality Model) Consider the model where the parameter of interest $\eta \in \mathcal{H}$ is characterized by linear moment inequalities,

$$E(m(X) - A\eta) \geq 0,$$

where the parameter space $\mathcal{H}$ is a subset of $R^L$, $m(X)$ is a $J$-dimensional vector of known functions of an observation, and $A$ is a $J \times L$ known constant matrix. By augmenting the $J$-dimensional parameter $\lambda \in [0, \infty)^J$, these moment inequalities can be written as the $J$ moment equalities (footnote 4)

$$E(m(X) - A\eta - \lambda) = 0.$$

We let the full parameter vector be $\theta = (\eta, \lambda) \in \mathcal{H} \times [0, \infty)^J$. To obtain a likelihood function for the moment equality model, we employ the exponentially tilted empirical likelihood for $\theta$ that is considered in the Bayesian context in Schennach (2005). Let $x^n$ be a size-$n$ random sample of observations, and let $g(\theta) = A\eta + \lambda$. If the convex hull of $\cup_i \{m(x_i) - g(\theta)\}$ contains the origin, then the exponentially tilted empirical likelihood is written as

$$p(x^n|\theta) = \prod_{i=1}^{n} w_i(\theta),$$

where

$$w_i(\theta) = \frac{\exp\{\gamma(g(\theta))'(m(x_i) - g(\theta))\}}{\sum_{i=1}^{n} \exp\{\gamma(g(\theta))'(m(x_i) - g(\theta))\}}, \qquad \gamma(g(\theta)) = \arg\min_{\gamma \in R^J_+} \left\{ \sum_{i=1}^{n} \exp\{\gamma'(m(x_i) - g(\theta))\} \right\}.$$

Footnote 4. The Bayesian formulation of the moment inequality model shown here is owed to Tony Lancaster, who suggested this to us in

Thus, the parameter $\theta = (\eta, \lambda)$ enters the likelihood only through $g(\theta) = A\eta + \lambda$, so we take $\phi = g(\theta)$ as the sufficient parameters. The identified set for $\theta$ is given by

$$\Gamma(\phi) = \{(\eta, \lambda) \in \mathcal{H} \times [0, \infty)^J : A\eta + \lambda = \phi\}.$$

The coordinate projection of $\Gamma(\phi)$ onto $\mathcal{H}$ gives $H(\phi)$, the identified set for $\eta$ (see, e.g., Bertsimas and Tsitsiklis (1997, Chap. 2) for an algorithm for projecting a polyhedron).

3.3 A Class of Unrevisable Prior Knowledge

Let $\mu_\theta$ be a prior of $\theta$ and $\mu_\phi$ be the marginal probability measure on the sufficient parameter space $(\Phi, \mathcal{B})$ implied by $\mu_\theta$ and $g(\cdot)$, i.e., $\mu_\phi(B) = \mu_\theta(\Gamma(B))$ for all $B \in \mathcal{B}$. Let $x \in \mathbf{X}$ be sampled data, and assume that $\mu_\phi$ has a dominating measure. The posterior distribution of $\theta$, denoted by $F_{\theta|X}(\cdot)$, is then obtained as

$$F_{\theta|X}(A) = \int_\Phi \mu_{\theta|\phi}(A|\phi) \, dF_{\phi|X}(\phi), \quad A \in \mathcal{A}, \qquad (3.4)$$

where $\mu_{\theta|\phi}(A|\phi)$ denotes the conditional distribution of $\theta$ given $\phi$, and $F_{\phi|X}(\cdot)$ is the posterior distribution of $\phi$ (Barankin (1960), Florens and Mouchart (1974), Picci (1977), Dawid (1979), Poirier (1998)).

The posterior distribution of $\theta$ given in (3.4) shows that we are incapable of updating the conditional prior information of $\theta$ given $\phi$, because the likelihood is flat on $\Gamma(\phi)$. Accordingly, we can consider the prior information marginalized to the sufficient parameter, $\mu_\phi$, as the revisable prior knowledge, which we can potentially update by data, and the conditional priors of $\theta$ given $\phi$, $\{\mu_{\theta|\phi}(\cdot|\phi) : \phi \in \Phi\}$, as the unrevisable prior knowledge, which we can never update by data.

If we want to obtain the posterior uncertainty of $\theta$ in the form of a probability distribution on $(\Theta, \mathcal{A})$, as desired in the Bayesian paradigm, we need to have a single prior distribution of $\theta$, which necessarily induces unique unrevisable prior knowledge $\mu_{\theta|\phi}$. If the researcher could justify his choice of $\mu_{\theta|\phi}$ by credible prior information, the standard Bayesian updating (3.4) would yield a valid posterior distribution of $\theta$. A challenging situation arises if he lacks a credible prior opinion on $\mu_{\theta|\phi}$.
In this case, the researcher, who is aware that $\mu_{\theta|\phi}$ will never be updated by data, might feel anxious in implementing the Bayesian inference procedure that requires him to

specify a unique $\mu_{\theta|\phi}$, because an unconfidently specified $\mu_{\theta|\phi}$ can have a significant influence on the subsequent posterior inference. The robust Bayes analysis in this paper focuses on this situation.

Let $\mathcal{M}$ be the set of probability measures on $(\Theta, \mathcal{A})$, and let $\mu_\phi$ be a prior on $(\Phi, \mathcal{B})$ specified by the researcher. Consider the class of prior distributions of $\theta$ defined by

$$\mathcal{M}(\mu_\phi) = \{\mu_\theta \in \mathcal{M} : \mu_\theta(\Gamma(B)) = \mu_\phi(B) \text{ for every } B \in \mathcal{B}\}.$$

$\mathcal{M}(\mu_\phi)$ consists of prior distributions of $\theta$ whose marginal for the sufficient parameters coincides with the prespecified $\mu_\phi$ (footnote 5). That is, we accept specifying a single prior distribution for the sufficient parameters $\phi$, while we leave the conditional priors $\mu_{\theta|\phi}$ unspecified and allow for arbitrary ones as long as $\mu_\theta(\cdot) = \int_\Phi \mu_{\theta|\phi}(\cdot|\phi) \, d\mu_\phi$ is a probability measure on $(\Theta, \mathcal{A})$ (footnote 6).

This prior class may be understood more intuitively if we consider a discrete sufficient parameter space $\Phi = \{\phi_1, \dots, \phi_K\}$. The constraint $\mu_\theta(\Gamma(\phi_k)) = \mu_\phi(\phi_k)$ in the definition of $\mathcal{M}(\mu_\phi)$ says that the prior probability for the sufficient parameter, $\mu_\phi(\phi_k)$ for each $k = 1, \dots, K$, specifies one's belief for $\{\theta \in \Gamma(\phi_k)\}$. On the other hand, the conditional prior $\mu_{\theta|\phi}(\cdot|\phi_k)$ is interpreted as one's prior belief for $\theta$ conditional on knowing that $\theta$ lies in $\Gamma(\phi_k)$. Allowing arbitrary probability distributions for the conditional prior $\mu_{\theta|\phi}(\cdot|\phi_k)$ means that any probability distribution, including a point mass, is allowed as long as it is supported only on $\Gamma(\phi_k)$, and that no constraints are imposed across $\mu_{\theta|\phi}(\cdot|\phi_k)$ and $\mu_{\theta|\phi}(\cdot|\phi_{k'})$, $k \neq k'$. We can therefore express $\mathcal{M}(\mu_\phi)$ as

$$\mathcal{M}(\mu_\phi) = \left\{ \sum_{k=1}^{K} \mu_{\theta|\phi}(\cdot|\phi_k) \, \mu_\phi(\phi_k) : \mu_{\theta|\phi}(\cdot|\phi_k) \text{ are prob. measures for } \theta \text{ supported only on } \Gamma(\phi_k) \right\}.$$

The prior class $\mathcal{M}(\mu_\phi)$ induces the prior class for $\eta = h(\theta)$, which is represented by

$$\mathcal{M}_\eta(\mu_\phi) = \left\{ \sum_{k=1}^{K} \mu_{\eta|\phi}(\cdot|\phi_k) \, \mu_\phi(\phi_k) : \mu_{\eta|\phi}(\cdot|\phi_k) \text{ are prob. measures for } \eta \text{ supported only on } H(\phi_k) \right\}.$$

This representation of $\mathcal{M}_\eta(\mu_\phi)$ offers a heuristic interpretation of the ambiguous belief for $\eta$ implied by $\mathcal{M}(\mu_\phi)$: the researcher has a single prior belief on which identified set $H(\phi_k)$ is true in the population, while he lacks belief on where the true $\eta$ is likely to be, conditional on knowing that $\eta$ lies

Footnote 5. We thank Jean-Pierre Florens for suggesting this way of representing the prior class.

Footnote 6. Sufficient parameters are defined by examining the entire model $\{p(x|\theta) : x \in \mathbf{X}, \theta \in \Theta\}$, so that the prior class $\mathcal{M}(\mu_\phi)$ is by construction model dependent. This distinguishes our approach from the standard robust Bayes analysis, where a prior class is intended to represent the researcher's subjective assessment of his imprecise prior knowledge (Berger (1985)).

in $H(\phi_k)$ for every $k = 1, \dots, K$. That is, the ambiguous prior knowledge for $\theta$ is interpreted as the state of knowledge in which one cannot assess which combination of conditional priors $\{\mu_{\theta|\phi}(\cdot|\phi_k)\}_{k=1}^{K}$ is more credible than other combinations, and one considers every possible combination of conditional priors $\{\mu_{\theta|\phi}(\cdot|\phi_k)\}_{k=1}^{K}$ as plausible. The measurable selection argument in random set theory (see, e.g., Molchanov (2005, Chap. 1, Sec. 2.2)), which is also used in the proof of Theorem 3.1 below, allows us to extend the interpretation of ambiguity stated here to a general space of $\phi$.

Some readers might wonder whether it is sensible to apply some of the existing selection rules for non-informative priors for $\theta$ instead of introducing the class of priors. First of all, Jeffreys's general rule (Jeffreys (1961)), which takes the prior density to be proportional to the square root of the determinant of the Fisher information matrix, is not well defined if the information matrix for $\theta$ is singular at almost every $\theta \in \Theta$, which is the case in many partially identified models, e.g., the missing data example of Section 2. The empirical Bayes-type approach of choosing a prior for $\theta$ (Robbins (1964), Good (1965), Morris (1983), and Berger and Berliner (1986)) also breaks down if the model lacks identification, because the marginal likelihood of data depends only on $\mu_\phi$:

$$\int_\Theta p(x|\theta) \, d\mu_\theta = \int_\Phi \hat{p}(x|\phi) \, d\mu_\phi \equiv m(x|\mu_\phi). \qquad (3.5)$$

It is also worth noting that we cannot obtain the reference prior of Bernardo (1979), because the conditional Kullback-Leibler distance between the posterior density $f_{\theta|X}(\theta)$ and a prior density $d\mu_\theta(\theta)$,

$$\int_\Theta \log\left( \frac{f_{\theta|X}(\theta)}{d\mu_\theta(\theta)} \right) dF_{\theta|X}(\theta) = \int_\Phi \log\left( \frac{\hat{p}(x|\phi)}{m(x|\mu_\phi)} \right) \frac{\hat{p}(x|\phi)}{m(x|\mu_\phi)} \, d\mu_\phi,$$

again depends only on $\mu_\phi$ for every realization of $X$. These prior selection rules are therefore useful for choosing a prior for the sufficient parameters, $\mu_\phi$, and are of no use for selecting $\mu_\theta$ out of $\mathcal{M}(\mu_\phi)$.

In the analysis given below, we shall not discuss how to select $\mu_\phi$, and shall treat $\mu_\phi$ as given.
The influence of $\mu_\phi$ on the posterior of $\phi$ will diminish as the sample size increases, so the sensitivity of the posterior of $\phi$ to the choice of $\mu_\phi$ is expected to be less severe when the sample size is moderate or large.

3.4 Posterior Lower and Upper Probabilities

The prior class $M(\mu_\phi)$ yields the class of posterior distributions of $\theta$. We first consider summarizing the posterior class by the posterior lower probability $F_{\theta|X*}(\cdot)$ and the posterior upper probability $F^{*}_{\theta|X}(\cdot)$, which are defined, for $A \in \mathcal{A}$, as
\[ F_{\theta|X*}(A) \equiv \inf_{\mu_\theta \in M(\mu_\phi)} F_{\theta|X}(A), \qquad F^{*}_{\theta|X}(A) \equiv \sup_{\mu_\theta \in M(\mu_\phi)} F_{\theta|X}(A). \]
Note that the posterior lower probability and the upper probability have a conjugate property, $F_{\theta|X*}(A) = 1 - F^{*}_{\theta|X}(A^{c})$, so it suffices to focus on one of them in deriving their analytical form. For the lower and upper probabilities to be well defined, we assume the following regularity conditions.

Condition 3.1
(i) A prior for $\phi$, $\mu_\phi$, is a proper probability measure on $(\Phi, \mathcal{B})$, and it is absolutely continuous with respect to a $\sigma$-finite measure on $(\Phi, \mathcal{B})$.
(ii) $g : (\Theta, \mathcal{A}) \to (\Phi, \mathcal{B})$ is measurable and its inverse image $\Gamma(\phi)$ is a closed set in $\Theta$, $\mu_\phi$-almost every $\phi \in \Phi$.
(iii) $h : (\Theta, \mathcal{A}) \to (\mathcal{H}, \mathcal{D})$ is measurable and $H(\phi) = h(\Gamma(\phi))$ is a closed set in $\mathcal{H}$, $\mu_\phi$-almost every $\phi \in \Phi$.

These conditions are imposed in order for the identified sets $\Gamma(\phi)$ and $H(\phi)$ to be interpreted as random closed sets in $\Theta$ and $\mathcal{H}$ induced by a probability law on $(\Phi, \mathcal{B})$.$^{7}$ The closedness of $\Gamma(\phi)$ and $H(\phi)$ is implied, for instance, by continuity of $g(\cdot)$ and $h(\cdot)$. Under these conditions, we can guarantee that for each $A \in \mathcal{A}$, $\{\phi : \Gamma(\phi) \cap A = \emptyset\}$ and $\{\phi : \Gamma(\phi) \subset A\}$ are supported by a probability measure on $(\Phi, \mathcal{B})$.

Theorem 3.1 Assume Condition 3.1.
(i) For each $A \in \mathcal{A}$,
\[ F_{\theta|X*}(A) = F_{\phi|X}(\{\phi : \Gamma(\phi) \subset A\}), \tag{3.6} \]
\[ F^{*}_{\theta|X}(A) = F_{\phi|X}(\{\phi : \Gamma(\phi) \cap A \neq \emptyset\}), \tag{3.7} \]
where $F_{\phi|X}(B)$ is the posterior probability measure of $\phi$, $F_{\phi|X}(B) = \int_B f_{\phi|X}(\phi)\, d\phi$ for $B \in \mathcal{B}$.
(ii) Define the posterior lower and upper probabilities of $\eta = h(\theta)$ by, for each $D \in \mathcal{D}$,
\[ F_{\eta|X*}(D) \equiv \inf_{\mu_\theta \in M(\mu_\phi)} F_{\theta|X}(h^{-1}(D)), \qquad F^{*}_{\eta|X}(D) \equiv \sup_{\mu_\theta \in M(\mu_\phi)} F_{\theta|X}(h^{-1}(D)). \]
These are given by
\[ F_{\eta|X*}(D) = F_{\phi|X}(\{\phi : H(\phi) \subset D\}), \qquad F^{*}_{\eta|X}(D) = F_{\phi|X}(\{\phi : H(\phi) \cap D \neq \emptyset\}). \]

Proof. For a proof of (i), see Appendix A. For a proof of (ii), see equation (3.8) below.

The expression for $F_{\theta|X*}(A)$ given above implies that the posterior lower probability on $A$ calculates the probability that the set $\Gamma(\phi)$ is contained in subset $A$ in terms of the posterior probability law of $\phi$. On the other hand, the upper probability is interpreted as the posterior probability that the set $\Gamma(\phi)$ hits subset $A$. Since the probability law of random sets is uniquely characterized by the containment probability, as in (3.6), or the hitting probability, as in (3.7) (the Choquet Theorem), these results show that the posterior lower and upper probabilities for $\theta$ and $\eta$ with our specification of the prior class represent a unique probability law of the identified sets of $\theta$ and $\eta$.

The second statement of the theorem provides a procedure for transforming or marginalizing the lower and upper probabilities of $\theta$ into those of the parameter of interest $\eta$. The expressions of $F_{\eta|X*}(D)$ and $F^{*}_{\eta|X}(D)$ are simple and easy to interpret: the lower and upper probabilities of $\eta = h(\theta)$ are the containment and hitting probabilities of the random sets obtained by projecting $\Gamma(\phi)$ through $h(\cdot)$. This marginalization rule of the lower probability follows from
\[ F_{\eta|X*}(D) = F_{\theta|X*}(h^{-1}(D)) = F_{\phi|X}(\{\phi : \Gamma(\phi) \subset h^{-1}(D)\}) = F_{\phi|X}(\{\phi : H(\phi) \subset D\}), \tag{3.8} \]
and the marginalization rule for the upper probability is obtained similarly.

It is worth noting that the operation of marginalizing the posterior information of $\theta$ to $\eta$ is very different between the lower probability inference and the standard Bayesian inference. In the standard Bayesian inference with a single posterior, marginalization of the posterior of $\theta$ to $\eta$ is done by integrating the posterior probability measure of $\theta$ for $\eta$, while in the lower probability inference, marginalization for $\eta$ corresponds to obtaining the probability law of $H(\phi)$, the random set obtained by projecting $\Gamma(\phi)$ through $\eta = h(\theta)$.

As is well known in the literature, the lower and upper probabilities of a set of probability measures are in general monotone and nonadditive measures (capacities). In our specification of the prior class, the closed-form expressions of Theorem 3.1 assure that the resulting posterior lower and upper probabilities are supermodular and submodular, respectively.

Corollary 3.1 Assume Condition 3.1. The posterior lower and upper probabilities of $\theta$ are supermodular and submodular, respectively. For $A_1, A_2 \in \mathcal{A}$ subsets in $\Theta$,
\[ F_{\theta|X*}(A_1 \cup A_2) + F_{\theta|X*}(A_1 \cap A_2) \geq F_{\theta|X*}(A_1) + F_{\theta|X*}(A_2), \]
\[ F^{*}_{\theta|X}(A_1 \cup A_2) + F^{*}_{\theta|X}(A_1 \cap A_2) \leq F^{*}_{\theta|X}(A_1) + F^{*}_{\theta|X}(A_2). \]
Also, the posterior lower and upper probabilities of $\eta$ are supermodular and submodular, respectively. For $D_1, D_2 \in \mathcal{D}$ subsets in $\mathcal{H}$,
\[ F_{\eta|X*}(D_1 \cup D_2) + F_{\eta|X*}(D_1 \cap D_2) \geq F_{\eta|X*}(D_1) + F_{\eta|X*}(D_2), \]
\[ F^{*}_{\eta|X}(D_1 \cup D_2) + F^{*}_{\eta|X}(D_1 \cap D_2) \leq F^{*}_{\eta|X}(D_1) + F^{*}_{\eta|X}(D_2). \]

The result of Theorem 3.1 (i) can be seen as a special case of Wasserman's (1990) general construction of the posterior lower and upper probabilities. An important difference from Wasserman (1990) is that, with our way of specifying the prior class $M(\mu_\phi)$, we can guarantee that the lower probability of the posterior class is an $\infty$-order monotone capacity (a containment functional of random sets).$^{8}$ The submodularity of the posterior upper probability implied by this result is crucial for simplifying the gamma-minimax analysis of Section 4.$^{9}$

7 The inference procedure proposed in this paper can be implemented as long as the posterior of $\phi$ is proper, while we do not know how to accommodate an improper prior for $\phi$ in our development of the analytical results.

8 Wasserman (1990, p. 463) posed an open question asking which class of priors can assure that the posterior lower probability is a Shafer's belief function (equivalently, a containment functional of random sets). Theorem 3.1 gives an answer to his open question in the situation where the model lacks identifiability.

9 To the best of my knowledge, it is not yet known whether submodularity of the posterior upper probability holds in the general setup of Wasserman (1990). Corollary 3.1 demonstrates that, with our specification of the prior class, we obtain a submodular posterior upper probability.
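The containment and hitting formulas of Theorem 3.1 (ii) lend themselves to a simple Monte Carlo illustration. The sketch below is not taken from the paper: it assumes a scalar parameter of interest whose identified set is an interval, and it uses a hypothetical missing-data setup (cell counts, a flat Dirichlet prior on the cell probabilities, and the function names `draw_identified_sets` and `lower_upper_probability` are all illustrative assumptions). Given posterior draws of the identified set, the lower probability of an interval D is estimated by the share of draws contained in D, and the upper probability by the share of draws that hit D.

```python
import random


def draw_identified_sets(counts, num_draws, seed=0):
    """Draw interval identified sets [l, u] for eta = P(Y = 1).

    Hypothetical setup (an assumption, not from the paper):
    counts = (n_y1_observed, n_y0_observed, n_missing), with a flat
    Dirichlet(1, 1, 1) prior on the three cell probabilities phi.
    """
    rng = random.Random(seed)
    intervals = []
    for _ in range(num_draws):
        # A Dirichlet draw is a vector of independent Gamma draws, normalized.
        gammas = [rng.gammavariate(n + 1.0, 1.0) for n in counts]
        total = sum(gammas)
        p_y1, _p_y0, p_mis = (g / total for g in gammas)
        # Missing outcomes may all be 0 (lower bound) or all be 1 (upper bound).
        intervals.append((p_y1, p_y1 + p_mis))
    return intervals


def lower_upper_probability(intervals, d_lo, d_hi):
    """Estimate the lower and upper probabilities of D = [d_lo, d_hi]."""
    # Containment: the whole identified set lies inside D.
    contained = sum(1 for l, u in intervals if d_lo <= l and u <= d_hi)
    # Hitting: the identified set has a nonempty intersection with D.
    hit = sum(1 for l, u in intervals if l <= d_hi and u >= d_lo)
    return contained / len(intervals), hit / len(intervals)


if __name__ == "__main__":
    sets = draw_identified_sets(counts=(40, 30, 30), num_draws=5000)
    print(lower_upper_probability(sets, 0.3, 0.8))
```

Because every draw that is contained in D also hits D, the estimated lower probability can never exceed the estimated upper probability, which mirrors the ordering of the two capacities.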

4 Point Decision of $\eta = h(\theta)$ with Multiple Priors

In the standard Bayes posterior analysis, the mean or median of a posterior distribution of $\eta$ is often reported as a point estimate of $\eta$. If we summarize the posterior information of $\eta$ in terms of its posterior lower and upper probabilities instead of a posterior distribution, how should we construct a reasonable point estimator for $\eta$?

We consider a setting where a statistical decision is made after observing data. Let $a \in \mathcal{H}_a$ be an action, where $\mathcal{H}_a$ is an action space. We interpret an action as reporting a non-randomized point estimate for $\eta$, so we assume that the action space $\mathcal{H}_a$ is a subset of $\mathcal{H}$. Given an action $a$ to be taken, and $\eta_0$ being the true state of nature, a loss function $L(\eta_0, a) : \mathcal{H} \times \mathcal{H}_a \to \mathbb{R}_+$ yields the cost to the decision maker of taking such an action. We assume that the loss function $L(\cdot, a)$ is non-negative and $\mathcal{D}$-measurable for each $a \in \mathcal{H}_a$.

Given a single prior for $\theta$, $\mu_\theta$, the posterior risk is defined by
\[ \rho(\mu_\theta, a) \equiv \int_{\mathcal{H}} L(\eta, a)\, dF_{\eta|X}(\eta), \tag{4.1} \]
where the first argument in the posterior risk represents the dependence of the posterior of $\eta$ on the specification of the prior for $\theta$. Our analysis involves multiple priors, so the class of posterior risks $\{\rho(\mu_\theta, a) : \mu_\theta \in M(\mu_\phi)\}$ is considered. The posterior gamma-minimax criterion$^{10}$ ranks actions in terms of the worst-case posterior risk (upper posterior risk):
\[ \rho^{*}(\mu_\phi, a) \equiv \sup_{\mu_\theta \in M(\mu_\phi)} \rho(\mu_\theta, a), \]
where the first argument in the upper posterior risk represents the dependence of the prior class on a prior for $\phi$.

Definition 4.1 A posterior gamma-minimax action $a_x^{*}$ with respect to prior class $M(\mu_\phi)$ is an action that minimizes the upper posterior risk, i.e.,
\[ \rho^{*}(\mu_\phi, a_x^{*}) = \inf_{a \in \mathcal{H}_a} \rho^{*}(\mu_\phi, a) = \inf_{a \in \mathcal{H}_a} \sup_{\mu_\theta \in M(\mu_\phi)} \rho(\mu_\theta, a). \]

10 In the robust Bayes literature, the class of prior distributions is often denoted by $\Gamma$, and this is why it is called the (conditional) gamma-minimax criterion. Unfortunately, in the literature of belief functions and lower and upper probabilities, $\Gamma$ often denotes a set-valued mapping that generates the lower and upper probabilities. In this paper, we adopt the latter notational convention, but still refer to the decision criterion as the gamma-minimax criterion.

The gamma-minimax decision approach favors a conservative action that guards against the least favorable prior within the class, and it can be seen as a compromise between the Bayesian decision principle and the minimax decision principle. The next proposition shows that minimizing the upper posterior risk $\rho^{*}(\mu_\phi, a)$ is equivalent to minimizing the Choquet expected loss with respect to the posterior upper probability.

Proposition 4.1 Under Condition 3.1, the upper posterior risk satisfies
\[ \rho^{*}(\mu_\phi, a) = \int L(\eta, a)\, dF^{*}_{\eta|X}(\eta) = \int_\Phi \sup_{\eta \in H(\phi)} L(\eta, a)\, f_{\phi|X}(\phi|x)\, d\phi, \tag{4.2} \]
whenever $\int L(\eta, a)\, dF^{*}_{\eta|X}(\eta) < \infty$. Accordingly, a gamma-minimax decision $a_x^{*}$ with respect to prior class $M(\mu_\phi)$ exists if and only if $E_{\phi|X}\left[\sup_{\eta \in H(\phi)} L(\eta, a)\right]$ has a minimizer in $a \in \mathcal{H}_a$.

Proof. See Appendix A.

The third expression in (4.2) shows that the posterior gamma-minimax criterion is written as the expectation of the worst-case loss function, $\sup_{\eta \in H(\phi)} L(\eta, a)$, with respect to the posterior of $\phi$. The supremum part stems from the ambiguity of $\eta$: given $\phi$, what the researcher knows about $\eta$ is only that it lies within the identified set $H(\phi)$, so the researcher forms the loss by supposing that nature chooses the worst case in response to his action $a$. On the other hand, the expectation in $\phi$ represents the posterior uncertainty of the identified set $H(\phi)$: with a finite number of observations, the identified set of $\eta$ is a random set induced by the posterior of $\phi$. The gamma-minimax criterion with the class of priors $M(\mu_\phi)$ combines such ambiguity of $\eta$ with the posterior uncertainty of the identified set $H(\phi)$ to yield a single objective function to be minimized.

Although a closed-form expression of $a_x^{*}$ is not in general available, this proposition suggests a simple numerical algorithm for approximating $a_x^{*}$ using a random sample of $\phi$ from its posterior $f_{\phi|X}(\phi|x)$. Let $\{\phi_s\}_{s=1}^{S}$ be $S$ random draws of $\phi$ from the posterior $f_{\phi|X}(\phi)$. We shall approximate $a_x^{*}$ by
\[ \hat{a}_x^{*} \equiv \arg\min_{a \in \mathcal{H}_a} \frac{1}{S} \sum_{s=1}^{S} \sup_{\eta \in H(\phi_s)} L(\eta, a). \]
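The simulation-based approximation above can be sketched in a few lines under additional assumptions that are not in the paper: the parameter of interest is scalar, each identified set is an interval with endpoints l and u, the loss is quadratic, and the action space is replaced by a finite grid of candidate actions. Under these assumptions the inner supremum has the closed form max((l - a)^2, (u - a)^2), since a convex function on an interval attains its maximum at an endpoint. The function names are hypothetical.

```python
def worst_case_quadratic_loss(l, u, a):
    """sup over eta in [l, u] of (eta - a)^2, attained at an endpoint."""
    return max((l - a) ** 2, (u - a) ** 2)


def upper_posterior_risk(intervals, a):
    """Monte Carlo average of the worst-case loss over posterior draws."""
    total = sum(worst_case_quadratic_loss(l, u, a) for l, u in intervals)
    return total / len(intervals)


def gamma_minimax_action(intervals, candidates):
    """Return the candidate action with the smallest estimated upper risk.

    intervals: posterior draws of the identified set, as (l, u) pairs.
    candidates: a finite grid standing in for the action space (assumption).
    """
    return min(candidates, key=lambda a: upper_posterior_risk(intervals, a))


if __name__ == "__main__":
    draws = [(0.0, 0.2), (0.8, 1.0)]          # two illustrative draws
    grid = [i / 100 for i in range(101)]      # candidate actions in [0, 1]
    print(gamma_minimax_action(draws, grid))
```

A grid search is used only for transparency. For quadratic loss the objective is convex in the action, so any one-dimensional convex minimizer could replace it.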
The reader may wonder if the posterior gamma-minimax action $a_x^{*}$ can be interpreted as a Bayes action for some posterior distribution in the class. In the case of quadratic loss, the saddle-point argument as in Grünwald and Dawid (2003) shows that the gamma-minimax action $a_x^{*}$ corresponds to the mean of a posterior distribution (Bayes action) that has maximal variance in the class. Such a maximal-variance posterior for $\eta$ may be obtained by allocating the posterior probability at each $\phi$ to some boundary points of $H(\phi)$.$^{11}$

In the presence of multiple priors, it is known that an a posteriori optimal gamma-minimax action as considered here can differ from an a priori optimal gamma-minimax decision (dynamic inconsistency problem, Vidakovic (2000)). Interestingly, with our specification of the prior class, such a dynamic inconsistency problem does not arise, so the gamma-minimax action obtained above is supported as an optimal gamma-minimax decision in the unconditional sense. For details and a proof of this claim, see Appendix B.

As an alternative to the (posterior) gamma-minimax criterion considered above, it is possible to consider the gamma-minimax regret criterion (Berger (1985, p. 218), and Rios Insua, Ruggeri, and Vidakovic (1995)), which is seen as an extension of the minimax regret criterion of Savage (1951) to the multiple-prior Bayes context. In Appendix B, we provide some analytical results of the gamma-minimax regret analysis where the parameter of interest $\eta$ is a scalar and the loss function is quadratic, $L(\eta, a) = (\eta - a)^2$. We demonstrate that the gamma-minimax regret decision can differ from the gamma-minimax decision derived above, but that they coincide asymptotically.

5 Set Estimation of $\eta$

This section discusses how to use the posterior lower probability of $\eta$ to conduct set estimation of $\eta$. In the standard Bayesian inference, set estimation is often done by reporting contour sets of the posterior probability density of $\eta$ (highest posterior density region). If the posterior information for $\eta$ is summarized by the lower and upper probabilities, how should we conduct set estimation of $\eta$?
11 A common criticism of the gamma-minimax analysis is that a posterior that yields the gamma-minimax action as a Bayes action does not appear appealing. If this criticism applies to our context, it would make sense to constrain the prior class, or to seek a more appealing decision criterion. Further discussion related to this point is provided in Section

5.1 Posterior Lower Credible Region

For $\alpha \in (0, 1)$, consider a subset $C_{1-\alpha} \subset \mathcal{H}$ such that the posterior lower probability $F_{\eta|X*}(C_{1-\alpha})$ is greater than or equal to $1 - \alpha$:
\[ F_{\eta|X*}(C_{1-\alpha}) = F_{\phi|X}(H(\phi) \subset C_{1-\alpha}) \geq 1 - \alpha. \tag{5.1} \]
In words, $C_{1-\alpha}$ is interpreted as a set on which the posterior credibility of $\eta$ is at least $1 - \alpha$, no matter which posterior is chosen within the class. If we drop the qualifier "no matter which posterior is chosen within the class" from this statement, we obtain the usual interpretation of the posterior credible region, so $C_{1-\alpha}$ defined in this way seems to be a natural way to extend the Bayesian posterior credible regions to those of the posterior lower probability.

Analogous to the Bayesian posterior credible region, there are also multiple $C_{1-\alpha}$'s that satisfy (5.1). For instance, given a posterior credible region of $\phi$ with credibility $1 - \alpha$, $B_{1-\alpha}$, the set $C_{1-\alpha} = \cup_{\phi \in B_{1-\alpha}} H(\phi)$, suggested in an earlier version of Moon and Schorfheide (2011), satisfies (5.1). In our proposal of set inference for $\eta$, we restrict the multiplicity of set estimates and penalize their volume by focusing on a smallest one among the $C_{1-\alpha}$'s satisfying (5.1),
\[ C^{*}_{1-\alpha} \equiv \arg\min_{C \in \mathcal{C}} \mathrm{Leb}(C) \quad \text{s.t.} \quad F_{\phi|X}(H(\phi) \subset C) \geq 1 - \alpha, \tag{5.2} \]
where $\mathrm{Leb}(C)$ is the volume of subset $C$ in terms of the Lebesgue measure and $\mathcal{C}$ is a family of subsets in $\mathcal{H}$ over which the volume-minimizing lower credible region is searched. Analogous to the highest posterior density region in the standard Bayesian inference, focusing on the smallest set estimate has a decision-theoretic justification in terms of the gamma-minimax criterion.$^{12}$ We refer to $C^{*}_{1-\alpha}$ as a posterior lower credible region with credibility $1 - \alpha$.

12 Focusing on the smallest $C_{1-\alpha}$ has a decision-theoretic justification. We can show that, under mild conditions, $C^{*}_{1-\alpha}$ can be supported as a gamma-minimax action (if it exists),
\[ C^{*}_{1-\alpha} = \arg\min_{C \in \mathcal{C}} \left[ \sup_{\mu_\theta \in M(\mu_\phi)} \int L(\eta, C)\, dF_{\eta|X} \right] \]
with loss function
\[ L(\eta, C) = \mathrm{Leb}(C) + b(\alpha)\left[1 - 1_C(\eta)\right], \]
where $b(\alpha)$ is a positive constant that depends on the credibility level $1 - \alpha$. Note that the loss function is written in terms of the parameter $\eta$, so the object of interest is $\eta$ rather than the identified set $H(\phi)$.

Finding $C^{*}_{1-\alpha}$ is challenging if $\eta$ is multi-dimensional and no restriction is placed on the class of subsets $\mathcal{C}$. In what follows, we restrict our analysis to scalar $\eta$ and constrain $\mathcal{C}$ to the class of closed connected intervals. Let $C_r(\eta_c)$ be a closed interval in $\mathcal{H}$, centered at $\eta_c \in \mathcal{H}$, with width $2r$. Accordingly, the constrained minimization problem of (5.2) is reduced to
\[ \min_{r, \eta_c} r \quad \text{s.t.} \quad F_{\phi|X}(H(\phi) \subset C_r(\eta_c)) \geq 1 - \alpha. \tag{5.3} \]
The next proposition provides an alternative way to formulate this optimization problem, which will be useful for obtaining $C^{*}_{1-\alpha}$ numerically.

Proposition 5.1 Let $d : \mathcal{H} \times \mathcal{D} \to \mathbb{R}_+$ measure the distance from $\eta_c \in \mathcal{H}$ to the set $H(\phi)$ in terms of
\[ d(\eta_c, H(\phi)) \equiv \sup_{\eta \in H(\phi)} \{\|\eta_c - \eta\|\}. \]
For each $\eta_c \in \mathcal{H}$, let $r_{1-\alpha}(\eta_c)$ be the $(1-\alpha)$-th quantile of the distribution of $d(\eta_c, H(\phi))$ induced by the posterior distribution of $\phi$, i.e.,
\[ r_{1-\alpha}(\eta_c) \equiv \inf\left\{ r : F_{\phi|X}(\{\phi : d(\eta_c, H(\phi)) \leq r\}) \geq 1 - \alpha \right\}. \]
Then, the solution of the constrained minimization problem (5.3) is given by $(r^{*}_{1-\alpha}, \eta^{*}_c)$, where $\eta^{*}_c = \arg\min_{\eta_c \in \mathcal{H}} r_{1-\alpha}(\eta_c)$ and $r^{*}_{1-\alpha} = r_{1-\alpha}(\eta^{*}_c)$.

Proof. See Appendix A.

Let $\{\phi_s : s = 1, \ldots, S\}$ be random draws of $\phi$ from its posterior. At each $\eta_c \in \mathcal{H}$, we first calculate $\hat{r}_{1-\alpha}(\eta_c)$, the empirical $(1-\alpha)$-th quantile of $d(\eta_c, H(\phi))$, based on the simulated $d(\eta_c, H(\phi_s))$, $s = 1, \ldots, S$. Provided that $r_{1-\alpha}(\eta_c)$ is continuous in $\eta_c$, the empirical quantile $\hat{r}_{1-\alpha}(\eta_c)$ is a valid approximation of the underlying quantile $r_{1-\alpha}(\eta_c)$, so we can approximate $(r^{*}_{1-\alpha}, \eta^{*}_c)$ by searching for a minimizer and the minimized value of $\hat{r}_{1-\alpha}(\eta_c)$.

5.2 Asymptotic Property of the Posterior Lower Credible Region

This section examines the large-sample behavior of the posterior lower probability in an intervally identified case, i.e., $H(\phi) \subset \mathbb{R}$ is a closed bounded interval for almost all $\phi \in \Phi$. From now on, we make the sample size explicit in our notation: a size $n$ sample $X^n$ is generated from its sampling distribution $P_{X^n|\phi_0}$, where $\phi_0$ denotes the value of the sufficient parameters that corresponds to the true data-generating process. We denote the maximum likelihood estimator for $\phi$ by $\hat{\phi}$. Provided that the posterior of $\phi$ is consistent$^{13}$ to $\phi_0$ and the set-valued map $H(\phi)$ is continuous at $\phi = \phi_0$ in terms of the Hausdorff metric $d_H$, it is straightforward to show that the random sets $H(\phi)$ represented by the posterior lower probability $F_{\eta|X^n*}(\cdot)$ are consistent to the true identified set $H(\phi_0)$ in the sense of
\[ \lim_{n \to \infty} F_{\phi|X^n}(\{\phi : d_H(H(\phi), H(\phi_0)) > \epsilon\}) = 0 \]
for almost every sampling sequence of $\{x^n\}$. Our main focus in what follows is therefore on the asymptotic coverage property of the lower credible region $C^{*}_{1-\alpha}$ constructed in the previous section. Our main analytical result relies on the following set of regularity conditions.

Condition 5.1
(i) The parameter of interest $\eta$ is a scalar and the identified set $H(\phi)$ is $\mu_\phi$-almost surely a nonempty and connected interval, i.e., $H(\phi) = [\eta_l(\phi), \eta_u(\phi)]$, $-\infty \leq \eta_l(\phi) \leq \eta_u(\phi) \leq \infty$. The true identified set $H(\phi_0) = [\eta_l(\phi_0), \eta_u(\phi_0)]$ is a bounded interval.
(ii) For a sequence $a_n \to \infty$, define random variables $L_n(\phi) = a_n\left(\eta_l(\phi) - \eta_l(\hat{\phi})\right)$ and $U_n(\phi) = a_n\left(\eta_u(\phi) - \eta_u(\hat{\phi})\right)$, whose distribution is induced by the posterior distribution of $\phi$ given sample $X^n$. The cumulative distribution function of $(L_n(\phi), U_n(\phi))$ given $X^n$, denoted by $J_n(\cdot)$, is continuous almost surely under $\hat{p}(x^n|\phi_0)$ for all $n$.
(iii) There exist bivariate random variables $(L, U)$, whose cumulative distribution function $J(\cdot)$ on $\mathbb{R}^2$ is continuous, and at each $c \equiv (c_l, c_u) \in \mathbb{R}^2$, the cumulative distribution function of $(L_n(\phi), U_n(\phi))$ given $X^n$, $J_n(c)$, converges in probability under $P_{X^n|\phi_0}$ to $J(c)$.
(iv) The sampling distribution of $\left(a_n\left(\eta_l(\phi_0) - \eta_l(\hat{\phi})\right), a_n\left(\eta_u(\phi_0) - \eta_u(\hat{\phi})\right)\right)$ converges in distribution to $(L, U)$.

Conditions 5.1 (iii) and (iv) state that the estimators for the lower and upper bounds of $H(\phi)$ satisfy the Bernstein-von Mises asymptotic equivalence between the Bayesian estimation and the

13 What we mean by posterior consistency of $\phi$ is that $\lim_{n \to \infty} F_{\phi|X^n}(G) = 1$ for every open neighborhood $G$ of $\phi_0$, for almost every sampling sequence. In the case of finite-dimensional $\phi$, this posterior consistency for $\phi$ is implied by a set of higher-level conditions for the likelihood of $\phi$. We do not list all those conditions here for the sake of brevity. See Section 7.4 of Schervish (1995) for details.
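The simulation procedure that follows Proposition 5.1 in Section 5.1 can be sketched as below. This is an illustrative sketch under assumptions that go beyond the text: the posterior draws of the identified set are supplied as interval endpoints (l, u), the search over centers is restricted to a user-supplied finite grid, and the function names are hypothetical. For an interval identified set, the distance in Proposition 5.1 reduces to max(|c - l|, |c - u|), and the empirical quantile is taken as the smallest simulated distance whose empirical distribution function reaches 1 - alpha.

```python
import math


def empirical_quantile(values, prob):
    """Smallest value whose empirical CDF is at least prob."""
    ordered = sorted(values)
    # The small tolerance guards against floating-point error in prob * n.
    k = max(math.ceil(prob * len(ordered) - 1e-9) - 1, 0)
    return ordered[k]


def lower_credible_interval(intervals, centers, alpha):
    """Approximate the shortest interval containing H(phi) with
    posterior probability at least 1 - alpha.

    intervals: posterior draws of the identified set, as (l, u) pairs.
    centers: finite grid of candidate centers (an assumption of this sketch).
    """
    best_radius, best_center = None, None
    for c in centers:
        # Distance from the center c to the farthest point of each drawn set.
        distances = [max(abs(c - l), abs(c - u)) for l, u in intervals]
        radius = empirical_quantile(distances, 1.0 - alpha)
        if best_radius is None or radius < best_radius:
            best_radius, best_center = radius, c
    return best_center - best_radius, best_center + best_radius


if __name__ == "__main__":
    draws = [(i / 100, i / 100 + 0.3) for i in range(20)]  # illustrative draws
    grid = [i / 100 for i in range(101)]
    print(lower_credible_interval(draws, grid, alpha=0.1))
```

By construction, at least a fraction 1 - alpha of the supplied draws lies inside the returned interval, up to floating-point rounding. How fine the grid of centers needs to be depends on how smoothly the quantile varies with the center.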


More information

An Instrumental Variable Model of Multiple Discrete Choice

An Instrumental Variable Model of Multiple Discrete Choice An Instrumental Variable Model of Multiple Discrete Choice Andrew Chesher y UCL and CeMMAP Adam M. Rosen z UCL and CeMMAP February, 20 Konrad Smolinski x UCL and CeMMAP Abstract This paper studies identi

More information

Instrumental Variable Models for Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice and UCL. Revised November 13th 2008

Instrumental Variable Models for Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice and UCL. Revised November 13th 2008 Instrumental Variable Models for Discrete Outcomes Andrew Chesher Centre for Microdata Methods and Practice and UCL Revised November 13th 2008 Abstract. Single equation instrumental variable models for

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice Wageningen Summer School in Econometrics The Bayesian Approach in Theory and Practice September 2008 Slides for Lecture on Qualitative and Limited Dependent Variable Models Gary Koop, University of Strathclyde

More information

Characterizations of identified sets delivered by structural econometric models

Characterizations of identified sets delivered by structural econometric models Characterizations of identified sets delivered by structural econometric models Andrew Chesher Adam M. Rosen The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP44/16

More information

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Songnian Chen a, Xun Lu a, Xianbo Zhou b and Yahong Zhou c a Department of Economics, Hong Kong University

More information
