SIEVE EXTREMUM ESTIMATION OF TRANSFORMATION MODELS


SIEVE EXTREMUM ESTIMATION OF TRANSFORMATION MODELS

JONG-MYUN MOON

Abstract. This paper studies transformation models T(Y) = X′β + ε with an unknown monotone transformation T. Our focus is on the identification and estimation of β, leaving the specification of T and the distribution of ε nonparametric. We identify β under a new set of conditions; specifically, we demonstrate that identification may be achieved even when the regressor X has bounded support and contains discrete random variables. Our identification is constructive and leads to a sieve extremum estimator. The empirical criterion of our estimator has a U-process structure and therefore does not conform to existing results in the sieve estimation literature. We derive the convergence rate of the estimator and demonstrate its asymptotic normality. For inference, the weighted bootstrap is proved to be consistent. The estimator is simple to implement with standard optimization algorithms. A simulation study provides insight into its finite-sample performance.

Date: October 27, 2014. Affiliation and Contact Information: UCL and CeMMAP, jong-myun.moon@ucl.ac.uk.

1. Introduction

Data transformation is often used in econometric analysis. For example, dependent variables are routinely log-transformed in linear regressions in order to mitigate nonlinearity and heteroskedasticity. This effective but arbitrary technique can be justified if the transformation is included among the model parameters and estimated from the data. The most prominent example of this approach is the influential Box-Cox transformation model (Box and Cox, 1964). Those authors suggested a parametric family of power functions, including the log transformation, as candidate functions for data transformation. There are several variations of this approach, which involve different sets of transformation functions. However, if complex patterns are possible, then a nonparametric approach provides a useful alternative. This paper concerns identification and estimation of regression models with a nonparametric transformation.

Regression models with a transformed dependent variable are called transformation models. Specifically, transformation models are represented by the equation

(1) T(Y) = X′β + ε,

where Y ∈ R and X ∈ R^{d_x} are observed random variables, and ε ∈ R is an unobserved error term. In the model (1), there are three parameters: (i) the regressor coefficient β, (ii) the transformation T and (iii) the error distribution. We consider the case when both T and the error distribution are nonparametric. Horowitz (1996) and Chen (2002) review the literature regarding identification and estimation of model (1). For related models in econometrics, see Matzkin (2007). Following the literature, we assume (i) ε is independent of X and (ii) T is strictly monotone.

There are several applications of the transformation model (1). An important class of transformation models consists of duration models. In labor economics, the study of employment and unemployment durations is an important area of research, and the duration model has been the main vehicle of empirical studies (Kiefer, 1988, Farber, 1999). More recently, unemployment duration is often studied through the labor-market search model (Mortensen and Pissarides, 1999, Rogerson, Shimer, and Wright, 2005), which imposes testable implications on duration models (Eckstein and Van den Berg, 2007). See Meyer (1996), van den Berg and Ridder (1998) and van den Berg (2001) for related work. Also, hedonic models with additive marginal utility and additive marginal production technology, studied by Ekeland, Heckman, and Nesheim (2002, 2004), are closely related to the transformation model (1). Chiappori, Komunjer, and Kristensen (2013) provide an extensive list of applications in different areas.

We contribute to the literature by providing new conditions for identification and by proposing a new estimator for β. The identification exploits two model features: the monotonicity of T and the additive separability of X′β and ε. First, we notice that an ordering is preserved by any monotone transformation. Therefore, if we use only the ordering induced by Y for identification, then the specific form of the transformation T is entirely irrelevant. This is not to say T is not identified; indeed, if β is identified, then the identification of T can be established following Chen (2002). However, in order to identify β, it is enough to consider the ordering induced by Y, as will be demonstrated. Further, an ordering is completely characterized by a binary relation. Therefore, if we are to use the information on the ordering only, it is enough to consider binary comparisons of pairs of observations. The second observation leading to the identification is that the ordering of Y is determined by the linear function X′β + ε. Suppose we have two observations (Y₁, X₁′) and (Y₂, X₂′). Then Y₁ < Y₂ if and only if X₁′β + ε₁ < X₂′β + ε₂. This observation may be summarized by the equality

(2) 1{Y₂ − Y₁ > 0} = 1{−(X₂ − X₁)′β < ε₂ − ε₁},

for the indicator function 1{·}. The relation (2) is similar to a binary choice model. The difference of errors ε₂ − ε₁ plays the role of a random threshold, and the binary outcome of whether the inequality Y₂ − Y₁ > 0 holds is determined by whether the threshold is crossed by the difference of the two single indices, −(X₂ − X₁)′β.

These two observations help us formulate a minimization problem that identifies the model parameter as its unique solution. Our identification result is similar in spirit to the identification of the maximum rank correlation (MRC) estimator of Han (1987). A distinctive feature of our approach is that the cumulative distribution function (cdf) of ε₂ − ε₁, denoted by F₀, is identified along with β. Our identification result is new and provides new identifying conditions. Specifically, we allow the regressor vector X to contain discrete random variables. Further, all continuous regressors may have bounded support. Our key identifying condition is intuitive: we require that the discrete regressors do not dominate the continuous regressors in terms of their relative contribution to the single index X′β. However, regardless of whether this condition is met, the subvector of β corresponding to the continuous regressors is identified.

The identification is constructive in the sense that it suggests a natural estimator. Our estimator is defined as a minimizing solution of an empirical criterion, and the empirical criterion is obtained as a sample analogue of the identifying criterion. We propose to use the method of sieves. Sieves refer to a collection of subsets of the parameter space which approximate the original parameter space increasingly well. Conceptually, a denser sieve is employed as more data are collected. See Chen (2007) for a survey of the literature on sieve estimation.
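To see the invariance argument at work, the following sketch (our own illustrative code; the design, coefficients and logistic error are hypothetical and not taken from the paper) draws data from model (1), generates Y under two different strictly increasing transformations, and checks that the pairwise ordering indicator in (2) is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# hypothetical design: two regressors, beta = (1.0, -0.5), logistic error
beta = np.array([1.0, -0.5])
X = rng.normal(size=(n, 2))
eps = rng.logistic(size=n)
index = X @ beta + eps            # the latent scale X'beta + eps

# Y under two different strictly increasing inverse transformations
Y_a = np.exp(index)               # corresponds to T(y) = log y
Y_b = index + 0.1 * index**3      # another strictly increasing map of the index

# the pairwise ordering indicator 1{Y_2 > Y_1} is identical under both
i, j = rng.integers(0, n, size=(2, 10000))
order_a = Y_a[j] > Y_a[i]
order_b = Y_b[j] > Y_b[i]
assert np.array_equal(order_a, order_b)

# relation (2)-(3): the ordering is driven by the index difference (X_2 - X_1)'beta
dx_index = (X[j] - X[i]) @ beta
print(np.corrcoef(order_a, dx_index)[0, 1])   # clearly positive association
```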

Our estimation procedure involves minimizing an empirical criterion, which is a function of β and F, over a sieve space. As implied by equation (2), the criterion function involves pairwise combinations of observations in its formulation. As such, our empirical criterion has a U-process structure; in other words, it appears as a double summation over every pair of observations. Extremum estimation involving U-processes has been studied by Sherman (1993, 1994) for parametric problems, and that theory is applied to MRC estimation. The MRC criterion function has a U-process structure, and it is a step function of a Euclidean parameter. Our empirical criterion function, on the other hand, is a smooth function of the parameters, which is one advantage of our approach. However, we need to extend the existing literature to a semi-nonparametric problem in order to account for the infinite-dimensional parameter F. To do so, we adopt and modify the existing results on sieve M-estimation by Shen and Wong (1994) and Shen (1997). The main contribution here is to show that the estimator minimizing the U-process can be represented as an approximate M-estimator. We achieve this by approximating the U-process with a more familiar empirical process. The theoretical device used for this task is the U-process maximal inequality; in Appendix B, we present its working form.

We show that the estimator of F₀ converges faster than the n^{1/4} rate in terms of the L₂-norm. The estimator of β converges at the n^{1/2} rate to a normal distribution. Regarding inference on β, because we provide an explicit form of the asymptotic variance, inference can be conducted relying on the asymptotic approximation. A downside of this approach is that the asymptotic covariance matrix has quite a complex form and requires estimation of even more nonparametric objects, such as a conditional expectation. Therefore, we prefer simulation-based methods and suggest a weighted-bootstrap scheme to approximate the finite-sample distribution of the estimator. The consistency of the weighted bootstrap has recently been shown by Ma and Kosorok (2005) and Chen and Pouzo (2009) for sieve M-estimation and the conditional moment model, respectively. We extend these earlier works to the case where the empirical criterion has a U-process structure.

Several strands of literature are related to this paper. First, several papers have proposed to estimate T nonparametrically when a √n-consistent estimator of β is available. As such, these papers and our work are complementary. See Horowitz (1996), Ye and Duan (1997), Klein and Sherman (2002), and Chen (2002). If T is parametrized, then all the model parameters can be estimated jointly, including β and T. Relevant works in this approach include Linton, Sperlich, and Van Keilegom (2008) and Santos (2011) among others. Second, there are rank-based estimators initiated by Han (1987). Other relevant works in this strand include Cavanagh and Sherman (1998), Abrevaya (2003), Khan and Tamer (2007) and Khan, Shin, and Tamer (2011) among others. A common aspect shared by these methods is that

β is identified and estimated without knowledge of T and the error distribution. Third, methods for the single-index model are applicable to the transformation model. Single-index models have been extensively studied in econometrics and statistics since Ichimura (1993); see Horowitz (1998) and Ichimura and Todd (2007) for surveys. In addition, although our estimator is designed specifically for the transformation model, its technical aspect is akin to that of the single-index regression model. This is because the Euclidean parameter β enters the infinite-dimensional parameter F as its argument. Rather unexpectedly, however, few works relate sieve estimation to single-index models; see Ding and Nan (2011) and references therein. These results are not applicable to our problem.¹ Therefore, we develop a suitable asymptotic theory that applies to the single-index problem in the context of sieve estimation.

The remainder of this paper is organized as follows. Section 2 defines the model and establishes the identification. Section 3 defines our estimator and shows its consistency. Section 4 derives the rate of convergence. Section 5 shows the asymptotic normality of the estimator; it also includes the consistency of the weighted bootstrap procedure. Section 6 contains a simulation study. Section 7 discusses possible extensions. Proofs are gathered in the Appendix. Most notation is defined in Section 2 and in Appendix A.1, but inevitably more notation is added throughout the paper.

2. Identification

We define the criterion function that identifies β₀ and F₀ as its minimizing solution. To this end, we need to introduce scale and location normalizations. As a scale normalization, the first component of β₀ is normalized to ±1, and thus β₀ is written as (β₀,₁, β̃₀′)′ for a scalar β₀,₁ such that |β₀,₁| = 1 and some (d_x − 1)-dimensional vector β̃₀. To see why this is necessary, consider T* = cT and ε*_i = cε_i for some positive constant c > 0. Because T* is strictly increasing and ε*_i is not observed, the alternative model T*(Y_i) = X_i′(cβ₀) + ε*_i is observationally equivalent to the original model (1). Therefore, for point identification of β₀, we need to restrict the parameter space for β₀ so that no two admissible points β₁ and β₂ can be related as a constant multiple of each other. There are other ways to achieve the scale normalization. For instance, we could set |β₀| = 1, so that the parameter space for β₀ is the unit sphere in R^{d_x}.

The location normalization is achieved by not allowing a constant term in X. Suppose we had a constant term c₀, and write the model as T(Y_i) = c₀ + X_i′β₀ + ε_i. This can be equivalently

¹ The recent work by Ding and Nan (2011) assumes that the empirical criterion is twice Fréchet differentiable with respect to a certain pseudo-metric; see Ding and Nan (2011). Our empirical criterion is not Fréchet differentiable.

written as T(Y_i) = c* + X_i′β₀ + ε*_i for ε*_i = ε_i + c₀ − c* and any constant c*. As these two models are observationally equivalent, the constant term c₀ or c* is not identified. Notice that we do not impose a location normalization on ε_i; its mean or median is not restricted.

As mentioned in the introduction, our criterion function is motivated by the relation (2). We develop (2) further to obtain the identifying criterion. By taking conditional expectations on both sides of (2), we have

P(ΔY > 0 | X₁, X₂) = P(Δε > −ΔX′β₀ | X₁, X₂) = 1 − F₀(−ΔX′β₀),

where F₀ is the cdf of ε₂ − ε₁ and the notation Δ denotes the difference of two consecutive observations; that is, Δ(·) = (·)₂ − (·)₁. Recall that ε₁ and ε₂ come from an i.i.d. sample, and hence the distribution of ε₁ − ε₂ is equal to the distribution of ε₂ − ε₁. This implies that 1 − F₀(−z) = F₀(z) for any z ∈ R. Then we have the equation

(3) P(ΔY > 0 | X₁, X₂) = F₀(ΔX′β₀).

This relation leads us to define a new criterion. To state it, let us relabel the parameters β and F. Because the first component of β is normalized to ±1, we denote it separately by b ∈ {−1, 1}. Then β̃ is a (d_x − 1)-by-1 vector such that β = (b, β̃′)′. Combining the parameters of interest β̃ and F, we write θ = (β̃, F). We then define a nonlinear least squares criterion implied by the relation (3) as follows: for V_i = (Y_i, X_i′)′,

(4) h(b, θ; V₁, V₂) = {1{ΔY > 0} − F(ΔX′β)}², Q(b, θ) = E[h(b, θ; V₁, V₂)].

We call Q(b, θ) the population criterion. A corresponding empirical criterion is defined in Section 3. Theorem 2.1 below shows that β₀ and F₀ are uniquely identified as the minimizer of the population criterion Q, with the infinite-dimensional parameter F₀ identified on the support of ΔX′β₀.

The following notation is needed. Because we have different conditions for continuous and discrete regressors (see Assumption 2.3), let us divide X_i into a continuous random vector X_{i,c} ∈ R^{d_c} and a discrete random vector X_{i,d} ∈ R^{d_x−d_c}, so that X_i = (X_{i,c}′, X_{i,d}′)′. Divide β₀ into β₀,c and β₀,d accordingly. Similarly, we write ΔX = (ΔX_c′, ΔX_d′)′. The support of a random vector X is denoted by supp X.² Lastly, we denote

N_j = supp Δε ∩ {x′β₀,c + λ_{j−1}′β₀,d : x ∈ supp ΔX_{i,c}} ∩ {x′β₀,c + λ_j′β₀,d : x ∈ supp ΔX_{i,c}},

for constants {λ_j}_{j=0}^{d_x−d_c} and j ∈ {1, …, d_x − d_c}. The notation N_j is used only for identification purposes (see Assumption 2.4).

Assumption 2.1. {Y_i, X_i, ε_i}_{i=1}^n is independent and identically distributed (i.i.d.) and conforms to equation (1). ε_i is continuous and independent of X_i.

² For a random variable X, its support is defined as the smallest closed set B such that P[X ∈ B^c] = 0.

Assumption 2.2. (i) β₀ = (β₀,₁, β̃₀′)′ for β₀,₁ ∈ {−1, 1} and β̃₀ ∈ B, a compact subset of R^{d_x−1}. (ii) 𝓕 is a collection of continuous monotone functions on R. F₀ ∈ 𝓕.

Assumption 2.3. (i) For X_i = (X_{i,c}′, X_{i,d}′)′, X_{i,c} ∈ R^{d_c} is jointly continuous, and X_{i,d} ∈ R^{d_x−d_c} is discrete. There is no constant in X_i. (ii) supp X_i = supp X_{i,c} × supp X_{i,d}. (iii) supp ΔX_d is not contained in a proper linear subspace of R^{d_x−d_c}.

Assumption 2.4. There exists a set of points {λ₀ = 0, λ₁, …, λ_{d_x−d_c}} such that λ_j ∈ supp ΔX_d for j = 1, …, d_x − d_c and {λ₁, …, λ_{d_x−d_c}} are linearly independent. In addition, the set N_j has a non-empty interior for every j = 1, …, d_x − d_c.

Assumption 2.1 is standard in the literature, although the independence of ε_i and X_i could be weakened to conditional median independence; see Khan, Shin and Tamer (2012). We do not consider this possibility. Assumption 2.2 concerns the parameter spaces for β₀ and F₀. Assumption 2.2 (i) restates our scale normalization and restricts the parameter space B to be compact. Assumption 2.2 (ii) defines the parameter space 𝓕 for F₀. Regarding identification, F need not be smooth. However, it is essential that every F ∈ 𝓕 is continuous and monotone. The monotonicity requirement is not needed if the support of ΔX′β₀ is R. Heuristically speaking, this requirement regulates the possible values of the parameter when the support of ΔX′β is not connected due to the existence of discrete regressors.

Assumption 2.3 (i) allows X_i to contain both continuous and discrete regressors. The requirement that X_{i,c} is jointly continuous implies that supp X_c has a non-empty interior or, equivalently, that supp X_c is not included in a proper subspace of R^{d_c}. If this requirement is violated, then supp X_c exhibits multicollinearity. As explained above, a constant term is not allowed. Assumption 2.3 (ii) means that the support of X_{i,c} does not depend on the realization of X_{i,d}. This assumption can be weakened: we may allow the support of X_{i,c} to depend on the value of the discrete regressors X_{i,d}, as long as the support of X_{i,c} conditional on X_{i,d} has a non-empty interior in R^{d_c}. All that is necessary for this generalization is to modify Assumption 2.4 accordingly; the proof of identification remains essentially the same. For simplicity, however, we do not attempt this generalization. Assumption 2.3 (iii) is a requirement on the discrete regressor X_{i,d}, parallel to the requirement that X_{i,c} is jointly continuous.

Assumption 2.4 requires that (i) the contribution of the discrete variables to the single index X′β₀ is not too large relative to that of the continuous variables, and (ii) the variation of the error term is not too small. This assumption concerns the identification of β₀,d, that is, the regression parameters of the discrete regressors. If there is no discrete regressor, therefore, Assumption 2.4 is not needed. Even with discrete regressors, if the support of some regressor is R, then it can be omitted.

Theorem 2.1. Suppose Assumptions 2.1–2.4 hold and define A = B × 𝓕. Let

Q(b₁, θ₁) = min_{b∈{−1,1}, θ∈A} Q(b, θ)

for some b₁ ∈ {−1, 1} and θ₁ = (β̃₁, F₁) ∈ A. Then b₁ = β₀,₁, β̃₁ = β̃₀ and F₁(z) = F₀(z) for any z ∈ supp ΔX′β₀.

The proof of Theorem 2.1 is in Appendix A. Theorem 2.1 establishes the identification of β₀ and F₀. We stress that F₀ is identified only on the support of ΔX′β₀. This fact adds some complication when we study the estimation of β₀ and F₀.

3. Consistency

3.1. Extremum Estimation and Method of Sieves. The identification result of Theorem 2.1 is constructive in the sense that it suggests an extremum estimator. This section defines our estimator and proves its consistency. As there is an infinite-dimensional parameter, consistency will be stated in terms of a particular norm that we define shortly. Before proceeding, let us add one simplification. Henceforth, we assume β₀,₁ is known and its value is 1; this can be accepted without loss of generality, because our estimator of β₀,₁ equals the true value with probability approaching 1. Therefore we let β = (1, β̃′)′ and, further, simplify the notation to Q(θ) = Q(1, θ) and h(θ; ·, ·) = h(1, θ; ·, ·). The sample analogue of the population criterion Q defined in (4) is

(5) Q_n(θ) = (1/(n(n−1))) Σ_{i≠j} h(θ; V_i, V_j).

Let us call Q_n the empirical criterion. It is immediate from the definition that E[Q_n(θ)] = Q(θ). Also, Q_n(θ) is a U-statistic for Q(θ). Viewed as a stochastic process in θ, Q_n induces a U-process, a generalization of a U-statistic; it is a U-process after centering by Q(θ) and scaling by √n. Much of our asymptotic theory relies on U-process theory.³

We minimize Q_n not over A but over a subset of A, called a sieve. Let us denote the collection of sieves by {A_k}. It is required that the sieve A_k approximates the entire parameter space A increasingly accurately as the index k grows. For a given finite sample size, we pick one sieve A_k to use. Conceptually, however, a different sieve A_{k_n} is used as the sample size n changes. The sieve index k_n depends on n and grows to infinity along with the sample size n. Our discussion below relies on abstract assumptions on the sieve spaces {A_k} and the speed of divergence of the sieve index k_n. Because β̃ is finite-dimensional, we may define the sieve A_k as a product of B and 𝓕_k; that is, only the infinite-dimensional 𝓕 is sieved.

³ U-process theory is similar to empirical process theory. For more about U-process theory, see Arcones and Giné (1993), Sherman (1994) and de la Peña and Giné (1999) among others.
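For concreteness, here is a minimal sketch of how the empirical criterion (5) can be evaluated as a double sum over pairs, with the first coefficient fixed at 1 as in the text. The logistic-cdf stand-in for F and the toy design are our own assumptions; the paper's actual sieve for F is the I-spline space described in Section 6.

```python
import numpy as np

def empirical_criterion(beta_tilde, F, Y, X):
    """Q_n(theta) from (5): average of {1{dY > 0} - F(dX'beta)}^2 over ordered
    pairs i != j, with the first coefficient of beta normalized to 1."""
    n = len(Y)
    beta = np.concatenate(([1.0], beta_tilde))
    idx = X @ beta
    dY = Y[:, None] - Y[None, :]          # dY[i, j] = Y_i - Y_j
    dI = idx[:, None] - idx[None, :]      # dI[i, j] = (X_i - X_j)' beta
    loss = (1.0 * (dY > 0) - F(dI)) ** 2
    np.fill_diagonal(loss, 0.0)           # drop the i == j terms
    return loss.sum() / (n * (n - 1))

# Stand-in candidate for F: a logistic cdf with a free scale parameter.
# (Only to make the call runnable; the paper's sieve elements are I-spline cdfs.)
def make_F(scale):
    return lambda z: 1.0 / (1.0 + np.exp(-z / scale))

# toy data from one possible data-generating process: T(y) = log y, Gumbel errors
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([rng.normal(size=n), rng.normal(size=n), rng.integers(0, 2, size=n)])
Y = np.exp(X @ np.array([1.0, 1.0, 1.0]) + rng.gumbel(size=n))
print(empirical_criterion(np.array([1.0, 1.0]), make_F(1.5), Y, X))
```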

Using the sieve A_{k_n} = B × 𝓕_{k_n}, we define the estimator θ̂_n as follows:

(6) θ̂_n ∈ argmin_{θ∈A_{k_n}} Q_n(θ).

We write θ̂_n = (β̂_n, F̂_n) for β̂_n ∈ B and F̂_n ∈ 𝓕_{k_n}. If there are multiple minimizers in (6), any point among them can be chosen as the estimator.

3.2. Consistency. In semi-nonparametric problems there are several candidates for a norm attached to the parameter space, due to the infinite-dimensional nature of the problem. One of the main tasks in studying a semi-nonparametric problem is to find a norm appropriate to the context. In contrast, in parametric problems the Euclidean norm is a natural choice to measure distance. We start by defining a suitable norm in which to state the consistency of the estimator θ̂_n.⁴

When defining norms on 𝓕, an important fact is that F₀ is identified only on the support of ΔX′β₀. Therefore, we first define a norm on 𝓕 as

‖F‖_{𝓕,c} = max{ sup_{z∈supp ΔX′β₀} |F(z)|, sup_{z∈supp ΔX′β₀} |F′(z)| }.

Then define the consistency norm ‖·‖_c on A as ‖θ‖_c = |β̃| + ‖F‖_{𝓕,c}. Also, denote the usual sup-norm by ‖·‖_∞.

We are ready to state the assumptions for consistency. We assume X_i has at least a fourth moment (E|X_i|⁴ < ∞). Also, {V_i = (Y_i, X_i′)′} is always a random sample. These two premises are maintained throughout the paper. We list the other, more substantial assumptions.

Assumption 3.1. (i) The parameter θ₀ is uniquely identified in the sense of Theorem 2.1. (ii) B is a compact subset of R^d with a non-empty interior. β̃₀ is an interior point of B.

Assumption 3.2. (i) For some integer κ ≥ 3, max_{i∈{0,1,…,κ}} sup_{z∈R} |d^i F₀(z)/dz^i| < ∞. (ii) For some constant ω > 0, collect every monotone function F on R such that

max_{i∈{0,1,…,κ}} sup_{z∈R} |d^i{F(z) − F₀(z)}/dz^i| (1 + z²)^{ω/2} ≤ B,

for some positive constant B > 0. The set 𝓕 is the closure of this function class in the norm ‖F‖_{1,∞} = ‖F‖_∞ ∨ ‖F′‖_∞.

Assumption 3.3. There exists a sequence {π_k F₀} such that π_k F₀ ∈ 𝓕_k and

max_{i∈{0,1}} sup_{z∈R} |d^i{π_k F₀(z) − F₀(z)}/dz^i| → 0 as k → ∞.

Assumption 3.1 is standard. The true parameter need not be an interior point for consistency, but this is included for later results. Assumption 3.2 (i) states that F₀ is at least

⁴ Later we add more norms when needed; see the definition (7) and Appendix A.1. In fact, all those norms are only semi-norms. We do not stress this fact.

κ-times differentiable and that its derivatives are uniformly bounded. Assumption 3.2 (ii) defines the set 𝓕. There are several implications. First, by definition, F₀ is an interior point⁵ of 𝓕. Second, the weighting function (1 + z²)^{ω/2} is included to address the case when X_i has unbounded support. The particular form of the weighting function and its technical usage come from Gallant and Nychka (1987). Third, F ∈ 𝓕 need not be a cdf. Recall that F ∈ 𝓕 being continuous and monotone is enough for identification (Assumption 2.2 (ii)). However, it is possible to make 𝓕 include only cdfs. Similarly, knowing that F₀ is symmetric (that is, F₀(z) = 1 − F₀(−z) for any z ∈ R), we may restrict every F ∈ 𝓕 to be symmetric. The asymptotic distribution of θ̂_n is not affected by the choice of 𝓕. Assumption 3.3 specifies the approximation property of the sieves. For consistency, it is enough that the true parameter F₀ is well approximated. We define π_k θ₀ = (β̃₀, π_k F₀). Notice that ‖π_k θ₀ − θ₀‖_c → 0.

Theorem 3.1. Suppose Assumptions 3.1–3.3 hold. Then ‖θ̂_n − θ₀‖_c →_p 0.

The proof of Theorem 3.1 is in the appendix. Notice that the derivative F₀′, as well as F₀, is consistently estimated, uniformly on the support of ΔX′β₀. This result is used to establish the convergence rate of θ̂_n in a weaker norm.

4. Rate of Convergence

This section derives the convergence rate of the estimator θ̂_n. The first step is to define an appropriate norm on A. To this end, we need to show that the population criterion Q induces a norm on the parameter space local to θ₀. We provide a heuristic explanation. Given the consistency result, we can focus on a subset of the parameter space A near θ₀. Consider a local neighborhood of θ₀ in the normed space (A, ‖·‖_c). By the equality (3), it is easy to show that

Q(θ) − Q(θ₀) = E[{F(ΔX′β) − F₀(ΔX′β₀)}²].

Recall that we set β₀,₁ = 1, and as such ΔX′β = ΔX₁ + ΔX̃′β̃ for ΔX_j = X_{2,j} − X_{1,j} and ΔX̃ = [ΔX₂, …, ΔX_{d_x}]′. Applying a Taylor expansion to F(ΔX′β) − F₀(ΔX′β₀), we obtain the following approximate equality:

Q(θ) − Q(θ₀) ≈ E[{F′(ΔX′β₀) ΔX̃′(β̃ − β̃₀) + F(ΔX′β₀) − F₀(ΔX′β₀)}²].

If ‖θ − θ₀‖_c is small, then we may replace F′(ΔX′β₀) by F₀′(ΔX′β₀) in the last expression. This is the reason why the consistency norm ‖·‖_c is chosen to involve the first-order derivative of F.

⁵ Here, we regard 𝓕 as a normed space equipped with ‖·‖_{1,∞}; that is, 𝓕 is the whole set. Note that F₀ is not an interior point of the larger normed space {F : ‖F‖_{1,∞} < ∞} with the same norm.

This heuristic observation motivates us to define the following norm as a measure of the rate of convergence of θ̂_n; define the rate norm ‖·‖_q as

(7) ‖θ‖_q = {E[{F₀′(ΔX′β₀) ΔX̃′β̃ + F(ΔX′β₀)}²]}^{1/2}.

The subscript q is chosen to indicate that the norm is derived from the population criterion Q. Lemma A.15 proves that Q(θ) − Q(θ₀) is locally similar to ‖θ − θ₀‖²_q on an open neighborhood of θ₀ in the normed space (A, ‖·‖_c). In standard parametric problems, a similar relation holds with the Euclidean norm. The rate norm ‖·‖_q is not necessarily an object of interest; however, it turns out that the rate norm ‖·‖_q is equivalent⁶ to the norm |β̃| + ‖F(ΔX′β₀)‖_{L₂(P)} for the usual L₂-norm ‖·‖_{L₂(P)} with respect to the probability measure P. Then, for instance, an upper bound for the rate of |β̂_n − β̃₀| is given by the ‖·‖_q-norm rate.

The following three assumptions, in addition to the assumptions for consistency, are used to derive the rate of convergence.

Assumption 4.1. κω > ω + κ.

Assumption 4.2. There exists a sequence {π_k θ₀ = (β̃₀, F_{0,k}) : k ∈ N, F_{0,k} ∈ 𝓕_k} such that r_n ‖π_{k_n}θ₀ − θ₀‖_q = o(1).

Assumption 4.3. Denote, for ΔX̃ = [ΔX₂, …, ΔX_{d_x}]′,

Σ = E[F₀′(ΔX′β₀)² {ΔX̃ − E[ΔX̃ | ΔX′β₀]}{ΔX̃ − E[ΔX̃ | ΔX′β₀]}′].

The matrix Σ is non-singular.

Assumption 4.1 limits the possible values of the constants κ and ω. Recall that these two constants are used to define the parameter space 𝓕 in Assumption 3.2, and note that the convergence rate r_n is determined by them. Assumption 4.2 states that the sieve approximation error ‖π_{k_n}θ₀ − θ₀‖_q vanishes faster than the convergence rate r_n⁻¹. This requirement is intuitive because the rate of ‖π_{k_n}θ₀ − θ₀‖_q is an upper bound for the rate of ‖θ̂_n − θ₀‖_q. Assumption 4.3 is the key condition for the entire rate calculation. It plays a role similar to the nonsingularity of the Hessian matrix in standard parametric problems. The particular form of the matrix Σ is suggested in the proof of Lemma A.14, which proves the norm equivalence of ‖·‖_q and |β̃| + ‖F(ΔX′β₀)‖_{L₂(P)}. These three assumptions, together with the consistency of θ̂_n in the norm ‖·‖_c, are sufficient for the following result. Recall that the constants κ and ω are defined in Assumption 3.2.

Theorem 4.1 (Rate of Convergence). Suppose Assumptions 3.1–3.3 and 4.1–4.3 hold. Then

r_n ‖θ̂_n − θ₀‖_q = O_p(1),

⁶ Two norms are equivalent if their ratio remains within a fixed range [a, b] for 0 < a < b < ∞, at any point. This equivalence result is proved in Lemma A.14.

for the rate-of-convergence factor r_n = n^{κω/(2κω+κ+ω)}.

The convergence rate for sieve M-estimators is proved by Shen and Wong (1994). A similar result can be found in van der Vaart and Wellner (1996), and we use a proof method similar to theirs. In doing so, we must take into account that the empirical criterion Q_n has a U-process structure. Sherman (1993, 1994) studies a similar problem in parametric settings; our result extends Sherman (1993, 1994) to infinite-dimensional problems with sieve spaces.

To facilitate the asymptotic analysis, we need to decompose the criterion function. Define, for v, v₁, v₂ ∈ R^{1+d_x},

m(θ, v) = E[h(θ; V₁, V₂) | V₁ = v] + E[h(θ; V₁, V₂) | V₂ = v] − Q(θ),
g(θ; v₁, v₂) = h(θ; v₁, v₂) − E[h(θ; V₁, V₂) | V₁ = v₁] − E[h(θ; V₁, V₂) | V₂ = v₂] + Q(θ).

Note that E[m(θ, V₁)] = Q(θ) and E[g(θ; V₁, V₂)] = 0. Moreover, it can be checked that

(8) Q_n(θ) = (1/n) Σ_{i=1}^n m(θ, V_i) + (1/(n(n−1))) Σ_{i≠j} g(θ; V_i, V_j).

The expression (8) is called the Hoeffding decomposition; this is a fundamental result in U-statistic theory. Because E[g(θ; V₁, V₂) | V₁] = E[g(θ; V₁, V₂) | V₂] = 0 for any θ ∈ A, the second term on the right of (8) is called a degenerate U-process. From the last expression, it is clear that the U-process criterion is the sum of a sample-mean process and a degenerate U-process. As such, our proof of Theorem 4.1 can be divided into two parts. First, we show that the degenerate U-process in (8) is asymptotically negligible; this is proved in Lemma A.13 in the appendix. Then we can treat θ̂_n as an M-estimator minimizing the sample mean of m(θ, V_i) with some error, the error coming from the degenerate U-process. Second, we prove the rate of convergence using empirical process theory, similarly to van der Vaart and Wellner (1996).

5. Asymptotic Normality

This section focuses on the asymptotic distribution of β̂_n; recall that β = (1, β̃′)′. The infinite-dimensional parameter F is treated as a nuisance parameter. The first step is to express λ′β̃, for an arbitrary λ ∈ R^d, as a functional of θ. We express this functional as an inner product of θ and a special point v*. The inner product is induced by the norm ‖·‖_q. To define it, let V̄ be the product space of R^d and {F : ‖F(ΔX′β₀)‖_{L₂(P)} < ∞}. For two arbitrary points v, w in V̄, we define

(9) ⟨v, w⟩ = E[{F₀′(ΔX′β₀) ΔX̃′β_v + F_v(ΔX′β₀)}{F₀′(ΔX′β₀) ΔX̃′β_w + F_w(ΔX′β₀)}],

for v = (β_v, F_v) and w = (β_w, F_w). It can easily be verified that the bilinear map ⟨·, ·⟩ is indeed an inner product. Then the special point v* is defined as follows. Let v* = (β*, F*) for

β* = Σ⁻¹λ and F*(z) = −F₀′(z) E[ΔX̃ | ΔX′β₀ = z]′ Σ⁻¹λ.

Assume that v* is in V̄ or, equivalently, assume that ‖F*(ΔX′β₀)‖_{L∞(P)} is finite. By an easy calculation⁷, one can show that

(10) λ′β̃ = ⟨θ, v*⟩.

Therefore, we know the exact expression for the special point v*. Even when its expression is unknown, however, the existence of v* is guaranteed by the Riesz representation theorem if V̄ is a Hilbert space and the map θ ↦ λ′β̃ is bounded and linear. For this reason, v* is often called the Riesz representer. The representation of λ′β̃ as the inner product (10) is instrumental, since it is possible to approximate the inner product by the population criterion. Note that the inner product (9) is equivalently defined by the polarization identity:

(11) 4⟨v, w⟩ = ‖v + w‖²_q − ‖v − w‖²_q.

Therefore, if the two squared norms in (11) are well approximated, so is the inner product. A relevant fact is that the rate norm ‖·‖_q is chosen to approximate the population criterion Q locally around θ₀; see (7) in the previous section. It is therefore foreseeable that λ′β̃ can be expressed using Q. There are technical subtleties in doing so, and more details can be found in the proof of Theorem 5.1.

To obtain the asymptotic normality, the following assumptions are used.

Assumption 5.1. κω > κ + ω.

Assumption 5.2. For π_{k_n}θ₀ = (β̃₀, F_{0,k_n}) defined in Assumption 4.2, ‖F_{0,k_n}(ΔX′β₀) − F₀(ΔX′β₀)‖_{L₂(P)} = o(n^{−2/3}) and ‖F′_{0,k_n}(ΔX′β₀) − F₀′(ΔX′β₀)‖_{L₄(P)} = o(n^{−1/3}).

Assumption 5.3. (i) 𝓕_k ⊂ span{p₁, …, p_k} for all k; (ii) {‖p_j‖_∞}_{j=1}^∞ is uniformly bounded.

Assumption 5.4. Let ξ_j(k) = max_{1≤i≤k} ‖d^j p_i/dz^j‖_∞. Then the following hold: (i) ξ₁(k_n) ∨ ξ₂(k_n) ≲ √k_n ∧ r_n², (ii) k_n r_n^{−3} = o(n^{−1}) and (iii) k_n r_n² n^{−1} = o(1).

Assumption 5.5. Let p^k(z) = (p₁(z), …, p_k(z))′. The smallest eigenvalue of E[p^k(ΔX′β₀) p^k(ΔX′β₀)′] is bounded away from zero uniformly in k ∈ N.

Assumption 5.6. For any λ ∈ R^d, there exists a sequence

{π_{k_n}v* : π_{k_n}v* = (β*, F*_{k_n}), β* ∈ B, F*_{k_n} ∈ span{p₁, …, p_{k_n}}},

⁷ A similar calculation appears in the proof of Lemma A.14.

such that (i) √n r_n⁻¹ ‖π_{k_n}v* − v*‖_q → 0 as n → ∞ and (ii) sup_{n∈N} ‖F*_{k_n}‖_∞ is bounded.

Assumption 5.1 is stronger than Assumption 4.1. For the asymptotic normality of β̂_n, we need ‖θ − θ₀‖²_q to be well approximated by Q(θ) − Q(θ₀) for θ close to θ̂_n. Therefore, if θ̂_n converges faster, the approximation error is smaller. By imposing Assumption 5.1, we achieve a faster convergence rate and hence control the approximation error. Assumption 5.2 demands that the sieve approximation error vanish at a certain rate not only for F₀ but also for its derivative F₀′. Assumption 5.3 limits the sieve spaces that we consider. As mentioned already, we choose 𝓕_k to be finite-dimensional and linear; the functions {p₁, p₂, …} are called basis functions. Assumption 5.4 concerns the smoothness of the basis functions. Note that ξ_j(k) can be regarded as a smoothness measure for the basis functions {p₁, …, p_k}. The role of Assumption 5.4 is to control the convergence of the derivatives of F̂_n. Recall that the convergence rate is stated in terms of the rate norm ‖·‖_q, and that the convergence of ‖θ̂_n − θ₀‖_q does not imply that the derivatives F̂_n′ and F̂_n″ converge in some norm. However, by imposing Assumption 5.4, we can control the convergence rates of ‖F̂_n′ − F₀′‖_∞ and ‖F̂_n″ − F₀″‖_∞ in terms of ‖F̂_n − F₀‖_∞. Assumption 5.5 is used to establish the norm equivalence between ‖F(ΔX′β₀)‖_{L₂(P)} and ‖F(ΔX′β₀)‖_{L∞(P)} for F in 𝓕_k. This is possible because 𝓕_k is a finite-dimensional sieve; recall that in a Euclidean space, L_p-norms are equivalent for 1 ≤ p ≤ ∞. Assumption 5.6 states that the Riesz representer v* can be approximated by a sequence in the sieves to a certain precision.

Before stating the main result, we add one more piece of notation. Define the linear directional derivative of h(θ; ·, ·) in the direction v = (β_v, F_v) ∈ V̄ as

(12) h′(θ; ·, ·)[v] = (d/dt) h(θ + tv; ·, ·)|_{t=0}.

Now we state the main result of this paper.

Theorem 5.1. Suppose Assumptions 3.1–3.3, 4.3 and 5.1–5.6 hold. Then

√n(β̂_n − β̃₀) →_d N(0, Ω),

where the matrix Ω is such that, for any λ ∈ R^d,

λ′Ωλ = E[ h′(θ₀; V₁, V₂)[v*] h′(θ₀; V₁, V₃)[v*] ].

The proof of Theorem 5.1 can be found in Appendix A, and the functional form of h′(θ₀; V₁, V₂)[·] is derived in Lemma A.19. Because we have an explicit expression for v*, it is possible to estimate v* and then the matrix Ω. If Ω is consistently estimated, inference on β̃₀ can be conducted relying on the asymptotic normality result of the above theorem. A downside of this approach is that it involves several nonparametric estimations. For instance,

to estimate v*, the conditional expectation E[ΔX̃ | ΔX′β₀] needs to be estimated. Therefore, a simulation-based method is preferred. Below, we prove the consistency of the weighted bootstrap.

5.1. Weighted Bootstrap. Consider a randomly generated sequence of weights {B_i}_{i=1}^n. We assume E[B_i] = 1 and Var(B_i) = 1. As long as these conditions are met, the weights may have any distribution; possible choices are the discrete uniform distribution on {0, 2} or the normal distribution N(1, 1). Define the weighted empirical criterion

Q*_n(θ) = (1/(n(n−1))) Σ_{i≠j} B_i B_j h(θ; V_i, V_j).

Next, define θ̂*_n to be a point such that θ̂*_n ∈ A_{k_n} and θ̂*_n ∈ argmin_{θ∈A_{k_n}} Q*_n(θ). Also write θ̂*_n = (β̂*_n, F̂*_n). The following theorem proves that the asymptotic distribution of √n(β̂*_n − β̂_n) conditional on the sample {V₁, …, V_n} is the same as the unconditional asymptotic distribution of √n(β̂_n − β̃₀).

Theorem 5.2. Suppose all the conditions of Theorem 5.1 hold. If {B_i}_{i=1}^n is an i.i.d. sequence such that E[B_i] = 1 and Var(B_i) = 1, then for any c ∈ R^d and any n ∈ N,

P[√n(β̂*_n − β̂_n) ≤ c | V₁, …, V_n] = P[√n(β̂_n − β̃₀) ≤ c] + o_p(1).

The bootstrap inference is easy to implement. Fix the distribution of B_i and draw the random weights {B_i}_{i=1}^n. Then estimate θ̂*_n by minimizing the weighted empirical criterion Q*_n. The sieve-size index k_n remains the same as in the original problem. By repeating this procedure, we obtain the empirical distribution of √n(β̂*_n − β̂_n) conditional on {V_i}_{i=1}^n. The quantiles of this empirical distribution can then be used as critical values for inference on √n(β̂_n − β̃₀).

6. Simulation Study

Many duration models are examples of the transformation model. Proportional hazard models and mixed proportional hazard models are all nested in transformation models.⁸ We use these two models to conduct the following simulation study.

⁸ Proportional hazard models assume the error distribution is fixed to be a negative extreme-value distribution, whereas the transformation function (or baseline hazard) remains nonparametric. Mixed proportional hazard models are more general, but still restrictive; for instance, the normal distribution is not allowed as the error distribution (Ridder, 1990).
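A minimal sketch of the weighted-bootstrap recipe just described, under simplifying assumptions of our own: F is parameterized by a single-scale logistic cdf rather than the paper's I-spline sieve, the optimizer is a generic Nelder–Mead call, and the toy design is a stand-in. Only the structure (draw weights with mean and variance one, reweight the pairwise criterion, re-minimize, use quantiles of √n(β̂* − β̂)) mirrors the text.

```python
import numpy as np
from scipy.optimize import minimize

def weighted_criterion(beta_tilde, log_scale, Y, X, W=None):
    """Q*_n: sum over pairs i != j of B_i B_j {1{dY>0} - F(dX'beta)}^2 / (n(n-1)).
    W = None gives the unweighted criterion Q_n.  F is a logistic-cdf stand-in."""
    n = len(Y)
    beta = np.concatenate(([1.0], beta_tilde))
    idx = X @ beta
    dY = Y[:, None] - Y[None, :]
    dI = idx[:, None] - idx[None, :]
    F = 1.0 / (1.0 + np.exp(-dI / np.exp(log_scale)))
    loss = (1.0 * (dY > 0) - F) ** 2
    if W is not None:
        loss = loss * np.outer(W, W)          # bootstrap weights B_i B_j
    np.fill_diagonal(loss, 0.0)
    return loss.sum() / (n * (n - 1))

def fit(Y, X, W=None, start=None):
    d = X.shape[1] - 1                                   # dimension of beta_tilde
    x0 = np.zeros(d + 1) if start is None else start     # (beta_tilde, log_scale)
    obj = lambda p: weighted_criterion(p[:d], p[d], Y, X, W)
    return minimize(obj, x0, method="Nelder-Mead").x[:d]

rng = np.random.default_rng(2)
n = 100
X = np.column_stack([rng.normal(size=n), rng.normal(size=n), rng.integers(0, 2, size=n)])
Y = np.exp(X @ np.array([1.0, 1.0, 1.0]) + rng.gumbel(size=n))   # Design-1-like toy data

beta_hat = fit(Y, X)
draws = []
for _ in range(200):                          # bootstrap replications
    B = rng.choice([0.0, 2.0], size=n)        # weights with E[B] = 1, Var(B) = 1
    draws.append(fit(Y, X, W=B, start=np.concatenate([beta_hat, [0.0]])))
draws = np.sqrt(n) * (np.array(draws) - beta_hat)
print(beta_hat)
print(np.percentile(draws, [2.5, 97.5], axis=0))   # critical values for inference on beta
```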

[Figure 1. CDF of the error term under Designs 1–3.]

We consider three designs. The transformation function T(y) = log y is chosen for data generation. Note, however, that all three estimators are numerically invariant even if Y is transformed by any other monotone function. The data are generated from the equation

log Y = −X₁ − β₁X₂ − β₂X₃ − ε, with (β₁, β₂) = (1, 1).

This specification is shared by all three designs. Further, we fix the distribution of (X₁, X₂, X₃): X₁ and X₂ are standard normal random variables and X₃ is a binary random variable with equal probabilities of being 0 or 1. (X₁, X₂, X₃) are mutually independent. Across the three designs, we vary only the distribution of ε. This is summarized below:

Design 1: ε ~ EV(0, 1);
Design 2: ε =ᵈ log v + u, for v ~ Γ(1, 1) and u ~ EV(0, 1);
Design 3: ε =ᵈ log v + u, for v ~ Γ(3, 3) and u ~ EV(0, 1);

where EV(0, 1) denotes the standard extreme-value distribution with cdf F(z) = exp(−exp(−z)), and Γ(μ, σ) denotes the gamma distribution with mean μ and variance σ². Design 1 conforms to the proportional hazard model. Designs 2 and 3 belong to the mixed proportional hazard, or frailty, model. As the additional random error v follows the gamma distribution, they are also called gamma frailty models.

Finite-sample distributions of several estimators are compared. Let us call the sieve extremum estimator developed in this paper the sieve estimator. We compare the sieve estimator with two other estimators: the Cox estimator for the proportional hazard model and the MRC

estimator of Han (1987). Note that the Cox estimator is mis-specified for Design 2 and Design 3. We still report its results because the Cox model is widely used in empirical research. For each design, we generate samples of size 100 and 300. The parameter β̃ = (β₁, β₂) is then estimated by (i) the Cox estimator, (ii) the sieve estimator, and (iii) the MRC estimator. For the sieve estimator, we vary the dimension of the sieve space over k = 3, 5, 7. The estimation procedure is repeated 500 times, and we report the sample bias and the sample mean squared error (MSE) of the 500 estimates for the five estimators.

To implement the sieve estimator, the sieve 𝓕_k needs to be specified. We choose I-splines as basis functions; Ramsay (1988) explains their construction. What is useful about I-splines is that each basis function is the cdf of some continuous random variable. Therefore, it is easy to tailor 𝓕_k to our purpose of estimating a symmetric cdf. We construct 𝓕_k to contain only symmetric cdfs built from I-spline bases. The dimension of 𝓕_k equals the index k.

The simulation results are summarized in Figures 2–7. In each figure, the left panel shows the bias and the right panel the MSE. Bias1 indicates the bias in estimating β₁; Bias2 is for β₂. MSE1 and MSE2 likewise correspond to β₁ and β₂, respectively. Design 1 provides a good benchmark for our estimator, because the Cox estimator is correctly specified there and has one less infinite-dimensional parameter. Not surprisingly, the Cox estimator shows the smallest MSE; our estimator behaves comparably well. The efficiency loss of our estimator relative to the Cox estimator seems bearable when one considers Designs 2 and 3: the Cox estimator shows a large bias in these mis-specified designs. By contrast, the sieve estimator performs well across all three designs. Compared to the MRC estimator, the sieve estimator shows smaller MSE, especially for the smaller sample size of n = 100. We also notice that the sieve estimator is not sensitive to the different sieve-size indexes k ∈ {3, 5, 7}. In summary, we find that the sieve estimator behaves well, even for a small sample size.

7. Conclusion

The intuition that binary comparisons characterize an ordering is used to identify the transformation model. A new estimator is constructed from the identification result. Its asymptotic distribution is derived, and the bootstrap inference is justified. As technical by-products, we contribute to the literature on sieve estimation by studying a U-process problem and by showing how to handle the single-index structure in a semi-nonparametric problem. Several important extensions are possible. Regarding applications to duration models, we may extend the current method to account for censoring and time-varying regressors. Another direction is to consider competing risks models. We hope to study these extensions in future research.
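One possible way to build the symmetric sieve 𝓕_k described in Section 6 (a sketch under our own assumptions, not the paper's exact construction): integrate B-splines to obtain I-spline-type bases, each a continuous cdf, and then symmetrize them so that every convex combination satisfies F(z) = 1 − F(−z). Any weight vector on the simplex then yields a monotone, symmetric candidate F, so the sieve dimension is controlled by k.

```python
import numpy as np
from scipy.interpolate import BSpline

def symmetric_cdf_sieve(k, support=(-5.0, 5.0), degree=3):
    """Return k basis functions b_1,...,b_k, each a continuous cdf satisfying
    b_j(z) + b_j(-z) = 1, so any convex combination is a symmetric cdf.
    Construction: integrate B-splines on [lo, hi] (an I-spline-type basis),
    normalize each to [0, 1], then symmetrize b(z) = 0.5 * (I(z) + 1 - I(-z))."""
    lo, hi = support
    inner = np.linspace(lo, hi, k - degree + 1)                  # needs k >= degree + 1
    knots = np.concatenate([[lo] * degree, inner, [hi] * degree])
    bases = []
    for j in range(k):
        c = np.zeros(len(knots) - degree - 1)
        c[j] = 1.0
        ispl = BSpline(knots, c, degree).antiderivative()        # nondecreasing on [lo, hi]
        bottom, top = float(ispl(lo)), float(ispl(hi))

        def b(z, ispl=ispl, bottom=bottom, top=top):
            z = np.asarray(z, dtype=float)
            I = np.clip((ispl(np.clip(z, lo, hi)) - bottom) / (top - bottom), 0.0, 1.0)
            I_neg = np.clip((ispl(np.clip(-z, lo, hi)) - bottom) / (top - bottom), 0.0, 1.0)
            return 0.5 * (I + 1.0 - I_neg)

        bases.append(b)
    return bases

# any weights on the simplex give a symmetric, monotone candidate F in the sieve
basis = symmetric_cdf_sieve(k=5)
w = np.full(5, 0.2)
F = lambda z: sum(wj * bj(z) for wj, bj in zip(w, basis))
z = np.linspace(-4, 4, 9)
print(np.max(np.abs(F(z) + F(-z) - 1.0)))   # symmetry check: approximately zero
```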

[Figure 2. Simulation results for Design 1, n = 100. Bars: Cox, Sieve (k = 3, 5, 7), MRC; left panel: Bias1, Bias2; right panel: MSE1, MSE2.]

[Figure 3. Simulation results for Design 1, n = 300. Same layout as Figure 2.]

[Figure 4. Simulation results for Design 2, n = 100. Same layout as Figure 2.]

[Figure 5. Simulation results for Design 2, n = 300. Same layout as Figure 2.]

[Figure 6. Simulation results for Design 3, n = 100. Same layout as Figure 2.]

[Figure 7. Simulation results for Design 3, n = 300. Same layout as Figure 2.]

Appendix A. Proofs

A.1. Notations. We define and use several norms throughout the appendix.

‖·‖_{κ,∞,ω} : ‖F‖_{κ,∞,ω} = max_{0≤i≤κ} sup_{z∈R} |d^i F(z)/dz^i| (1 + z²)^{ω/2}
‖·‖_{κ,∞} : ‖F‖_{κ,∞} = max_{0≤i≤κ} sup_{z∈R} |d^i F(z)/dz^i|
‖·‖_∞ : ‖F‖_∞ = sup_{z∈R} |F(z)|
‖·‖_{L∞(P)} : ‖X‖_{L∞(P)} is the essential supremum of the random variable X
‖·‖_{Lp(P)} : ‖X‖_{Lp(P)} = {E|X|^p}^{1/p} for any integer p ≥ 1
‖·‖_{𝓕,c} : ‖F‖_{𝓕,c} = ‖F(Z₀)‖_{L∞(P)} + ‖F′(Z₀)‖_{L∞(P)} for Z₀ = ΔX′β₀
‖·‖_{e,κ,∞} : ‖θ‖_{e,κ,∞} = |β̃| + ‖F‖_{κ,∞}
‖·‖_{e,∞} : ‖θ‖_{e,∞} = |β̃| + ‖F‖_∞
‖·‖_c : ‖θ‖_c = |β̃| + ‖F‖_{𝓕,c}
‖·‖_q : defined in (7)
‖·‖_{e,Lp} : ‖θ‖_{e,Lp} = |β̃| + ‖F(Z₀)‖_{Lp(P)}

Other notation used in the appendix is gathered in the table below.

Z₀ : a scalar random variable such that Z₀ = ΔX′β₀
a ≲ b : a ≤ Kb for a universal constant K not depending on a or b
a ≍ b : a ≲ b and a ≳ b
N(ε, 𝓕, ‖·‖) : the covering number⁹ of size ε for a set 𝓕 under the norm ‖·‖
N_{[]}(ε, 𝓕, ‖·‖) : the bracketing number of size ε for a function class 𝓕 under the norm ‖·‖
C₁, C₂, … : generic positive constants which do not depend on the context of the proof
κ : the degree of smoothness of 𝓕; see Assumption 3.2
ω : the constant in the weighting function (1 + z²)^{ω/2}; see Assumption 3.2
ξ_j : see Assumption 5.4
δ_n : see Remark A.17

A.2. Proof for Section 2.

Lemma A.1. Suppose Assumptions 2.1–2.4 hold. Suppose β₁ ∈ {−1, 1}, β̃ ∈ B and F ∈ 𝓕. If F(Δx′β) = F₀(Δx′β₀) for any Δx ∈ supp ΔX, then β = β₀ and F(z) = F₀(z) for any z ∈ supp ΔX′β₀.

Proof. Note 0 ∈ supp ΔX_d. Hence, if Δx = (Δx_c′, 0′)′, by Assumption 2.3 (ii) we have

(13) F(Δx_c′β_c) = F₀(Δx_c′β₀,c) for any Δx_c ∈ supp ΔX_c.

As Δε is the difference of two i.i.d. continuous random variables, 0 is an interior point of supp Δε. Regarding supp ΔX_c, the same holds by Assumption 2.3 (i). We know β_c ≠ 0 and β₀,c ≠ 0 since

⁹ See p. 83 of van der Vaart and Wellner (1996) for the precise definition.

|β₁| = |β₀,₁| = 1. Observe that 0 ∈ R is an interior point of both supp ΔX_c′β_c and supp ΔX_c′β₀,c. Therefore we can find an open neighborhood of 0, denoted by N_c ⊂ R, such that

(14) N_c ⊂ supp Δε ∩ supp ΔX_c′β_c ∩ supp ΔX_c′β₀,c.

We first show that F is strictly increasing on N_c. Suppose not. Then find two points Δx₁, Δx₂ with index values in N_c and with the following three properties: (i) Δx₁ and Δx₂ differ only in their first coordinates, say Δx₁,₁ ≠ Δx₂,₁; (ii) Δx₁,₁ < Δx₂,₁; (iii) F(Δx₁′β_c) ≥ F(Δx₂′β_c). Because F₀ is strictly increasing on N_c, F₀(Δx₁′β₀,c) < F₀(Δx₂′β₀,c). Then either F(Δx₁′β_c) ≠ F₀(Δx₁′β₀,c) or F(Δx₂′β_c) ≠ F₀(Δx₂′β₀,c). This contradicts the condition of the lemma. As such, F is strictly increasing on N_c.

Next, we prove β_c = β₀,c. Suppose not. Find two points Δx₁, Δx₂ with index values in N_c such that Δx₁′β_c > Δx₂′β_c and Δx₁′β₀,c < Δx₂′β₀,c. By the strict monotonicity of F and F₀ on N_c, F(Δx₁′β_c) > F(Δx₂′β_c) and F₀(Δx₁′β₀,c) < F₀(Δx₂′β₀,c). We reach a contradiction and conclude β_c = β₀,c. Then by (13), we can infer that F(z) = F₀(z) for any z ∈ supp ΔX_c′β₀,c.

So far, β₀,c is identified and F₀ is identified only on supp ΔX_c′β₀,c. We move on to the identification of β₀,d. To this end, we find the values of {λ_j′β₀,d}_{j=1}^{d_x−d_c}; for the definition of λ_j, see Assumption 2.4. Start with j = 1. By Assumption 2.4, there are two points Δx₁, Δx₂ ∈ supp ΔX_c such that Δx₁′β₀,c = Δx₂′β₀,c + λ₁′β₀,d ∈ N₁. Because F₀ is strictly increasing on N₁ and F = F₀ on N₁, it follows that F₀(Δx₁′β₀,c) ⋛ F₀(z) if Δx₁′β₀,c ⋛ z. In other words,

(15) Δx₁′β₀,c = z if F₀(Δx₁′β₀,c) = F₀(z).

By the condition of the lemma,

(16) F₀(Δx₁′β₀,c) = F(Δx₁′β₀,c) = F₀(Δx₂′β₀,c + λ₁′β₀,d) = F(Δx₂′β₀,c + λ₁′β_d).

From (15) and (16), we see that λ₁′β_d = Δx₁′β₀,c − Δx₂′β₀,c = λ₁′β₀,d and that F(z) = F₀(z) on z ∈ supp ΔX_c′β₀,c ∪ {Δx′β₀,c + λ₁′β₀,d : Δx ∈ supp ΔX_c}. Repeat the same argument for each j to identify the other λ_j′β₀,d. Then we identify {λ_j′β₀,d}_{j=1}^{d_x−d_c}. As the last step, we note that, since {λ₁, …, λ_{d_x−d_c}} are linearly independent, β₀,d ∈ R^{d_x−d_c} is identified. Conclude that β = β₀ and that F(z) = F₀(z) for any z ∈ supp ΔX′β₀. □

Proof of Theorem 2.1. We know P(ΔY > 0 | X₁, X₂) = F₀(ΔX′β₀). By this fact and iterated expectations,

Q(b, θ) = E[E[1{ΔY > 0} | X₁, X₂]{1 − 2F(ΔX′β)} + F(ΔX′β)²]
        = E[F₀(ΔX′β₀){1 − 2F(ΔX′β)} + F(ΔX′β)²].

The last expectation can be simplified to the sum of E[{F(ΔX′β) − F₀(ΔX′β₀)}²] and a constant not depending on the parameters. From this observation, it is obvious that Q(b, θ) is minimized only if F(ΔX′β) = F₀(ΔX′β₀) almost surely. Lemma A.1 proves that if F(ΔX′β) = F₀(ΔX′β₀), then β = β₀ and F(z) = F₀(z) for any z ∈ supp ΔX′β₀. Hence we conclude. □

A.3. Proof for Section 3.

Remark A.2 (The constant B̄). Note that ‖F‖_{κ,∞} is uniformly bounded for any F ∈ 𝓕. By the triangle inequality and Assumption 3.2,

‖F‖_{κ,∞} ≤ ‖F − F₀‖_{κ,∞} + ‖F₀‖_{κ,∞} ≤ B + ‖F₀‖_{κ,∞}.

The second inequality holds because the weighting function is strictly larger than 1. As ‖F₀‖_{κ,∞} is bounded by Assumption 3.2, ‖F‖_{κ,∞} is bounded by the universal constant B + ‖F₀‖_{κ,∞}. We denote B̄ = B + ‖F₀‖_{κ,∞}.

Lemma A.3. Under Assumptions 3.1(ii) and 3.2(ii), for any θ₁, θ₂ ∈ A and v_i = (y_i, x_i′)′ ∈ supp V_i,

(17) |h(θ₁; v₁, v₂) − h(θ₂; v₁, v₂)| ≲ (|x₁| + |x₂| + 1) ‖θ₁ − θ₂‖_{e,∞}.

Proof. We use the notations Δx and Δx̃ below; they are defined analogously to ΔX and ΔX̃. Observe that

|h(θ₁; v₁, v₂) − h(θ₂; v₁, v₂)| = |2·1{Δy > 0} − F₁(Δx′β₁) − F₂(Δx′β₂)| · |F₁(Δx′β₁) − F₂(Δx′β₂)|
(18)                           ≲ |F₁(Δx′β₁) − F₂(Δx′β₂)|,

where the inequality holds by Remark A.2 and the fact that the indicator 1{Δy > 0} is a binary variable. By a Taylor expansion after an obvious rearrangement, |F₁(Δx′β₁) − F₂(Δx′β₂)| is equal to

|F₁′(z*) Δx̃′(β̃₁ − β̃₂) + F₁(Δx′β₂) − F₂(Δx′β₂)|

for some z* between Δx′β₁ and Δx′β₂. Since ‖F₁′‖_∞ ≤ B̄ by Remark A.2, using the Hölder inequality we have

|h(θ₁; v₁, v₂) − h(θ₂; v₁, v₂)| ≲ B̄|Δx̃| |β̃₁ − β̃₂| + |F₁(Δx′β₂) − F₂(Δx′β₂)|
(19)                           ≲ (|x₁| + |x₂| + 1){|β̃₁ − β̃₂| + |F₁(Δx′β₂) − F₂(Δx′β₂)|},

where the second inequality holds because B̄|Δx̃| + 1 ≲ |x₁| + |x₂| + 1. The result (17) follows from (19). □

Lemma A.4. Under Assumptions 3.1(ii) and 3.2(ii),

|Q(θ₁) − Q(θ₂)| ≲ ‖θ₁ − θ₂‖_{e,∞},

for any θ₁, θ₂ ∈ A.

Proof. By Jensen's inequality, |Q(θ₁) − Q(θ₂)| ≤ E|h(θ₁; V₁, V₂) − h(θ₂; V₁, V₂)|. The claim follows by Lemma A.3. □

Lemma A.5. Under Assumptions 3.1–3.2, 𝓕 is compact in the ‖·‖_{1,∞}-norm and A is compact in the ‖·‖_{e,1,∞}-norm.

Proof. We recall Lemma A.4 of Gallant and Nychka (1987); let us call it GN. Let δ = 0, δ₀ = ω, m = 1, m₀ = 1 and k = 1 in the cited lemma. Although one of the conditions there is that 0 < δ < δ₀, it can be learned from the proof that δ can be zero (and indeed can be negative). The set 𝓕 defined in Assumption 3.2 is smaller than the corresponding set in the cited lemma; note that we define 𝓕 as a ‖·‖_{κ,∞,ω}-ball of radius B/2, whereas GN set up 𝓕 as a ball in an L₂-type norm defined similarly to ‖·‖_{κ,∞,ω}. All other conditions of GN are included verbatim in Assumptions 3.1–3.2. Therefore, 𝓕 is relatively compact in the ‖·‖_{1,∞}-norm. Since 𝓕 is closed by Assumption 3.2, 𝓕 is compact in the ‖·‖_{1,∞}-norm. The second claim follows immediately. □

Lemma A.6. Suppose Assumptions 3.1–3.2 hold and let ε > 0 be small enough. Then

log N(ε, A, ‖·‖_{e,∞}) ≲ log(1/ε) + ε^{−γ}, γ = (κ + ω)/(κω).

Proof. The inequality (20) is immediate from the definitions of the covering number and the norm ‖·‖_{e,∞}:

(20) N(ε, A, ‖·‖_{e,∞}) ≤ N(ε/2, B, |·|) · N(ε/2, 𝓕, ‖·‖_∞).

Because B is compact, the ε/2-covering number of B is proportional to ε^{−d}. As such, ignoring constant terms, log N(ε/2, B, |·|) ≲ log(1/ε). Denote C^{κ,ω}_{B/2} = {F : ‖F‖_{κ,∞,ω} ≤ B/2}. By Lemma A.3 of Santos (2012), for some ε₀ > 0, if ε < ε₀ then log N(ε, C^{κ,ω}_{B/2}, ‖·‖_∞) ≲ ε^{−γ}. Since, by Assumption 3.2, {F − F₀ : F ∈ 𝓕} ⊂ C^{κ,ω}_{B/2}, it follows that N(ε, 𝓕, ‖·‖_∞) ≤ N(ε, C^{κ,ω}_{B/2}, ‖·‖_∞). Hence the claim is shown. □

Remark A.7. When we use Lemma A.6, we ignore the fact that it holds only for small ε. This is a harmless simplification.

Lemma A.8. Under Assumptions 3.1–3.3, sup_{θ∈A} |Q_n(θ) − Q(θ)| →_p 0 as n → ∞.

Proof. Let 𝓗 = {h(θ; ·, ·) : θ ∈ A}. By Lemma A.3,

E|h(θ₁; V₁, V₂) − h(θ₂; V₁, V₂)| ≤ K ‖θ₁ − θ₂‖_{e,∞},


More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1815 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS

More information

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS Olivier Scaillet a * This draft: July 2016. Abstract This note shows that adding monotonicity or convexity

More information

Economics 241B Review of Limit Theorems for Sequences of Random Variables

Economics 241B Review of Limit Theorems for Sequences of Random Variables Economics 241B Review of Limit Theorems for Sequences of Random Variables Convergence in Distribution The previous de nitions of convergence focus on the outcome sequences of a random variable. Convergence

More information

Nonlinear Programming (NLP)

Nonlinear Programming (NLP) Natalia Lazzati Mathematics for Economics (Part I) Note 6: Nonlinear Programming - Unconstrained Optimization Note 6 is based on de la Fuente (2000, Ch. 7), Madden (1986, Ch. 3 and 5) and Simon and Blume

More information

Estimating Semi-parametric Panel Multinomial Choice Models

Estimating Semi-parametric Panel Multinomial Choice Models Estimating Semi-parametric Panel Multinomial Choice Models Xiaoxia Shi, Matthew Shum, Wei Song UW-Madison, Caltech, UW-Madison September 15, 2016 1 / 31 Introduction We consider the panel multinomial choice

More information

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy Computation Of Asymptotic Distribution For Semiparametric GMM Estimators Hidehiko Ichimura Graduate School of Public Policy and Graduate School of Economics University of Tokyo A Conference in honor of

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

Mean-Variance Utility

Mean-Variance Utility Mean-Variance Utility Yutaka Nakamura University of Tsukuba Graduate School of Systems and Information Engineering Division of Social Systems and Management -- Tennnoudai, Tsukuba, Ibaraki 305-8573, Japan

More information

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 Instructions: Answer all four (4) questions. Point totals for each question are given in parenthesis; there are 00 points possible. Within

More information

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill November

More information

Estimation and Inference with Weak Identi cation

Estimation and Inference with Weak Identi cation Estimation and Inference with Weak Identi cation Donald W. K. Andrews Cowles Foundation Yale University Xu Cheng Department of Economics University of Pennsylvania First Draft: August, 2007 Revised: March

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han and Robert de Jong January 28, 2002 Abstract This paper considers Closest Moment (CM) estimation with a general distance function, and avoids

More information

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables Likelihood Ratio Based est for the Exogeneity and the Relevance of Instrumental Variables Dukpa Kim y Yoonseok Lee z September [under revision] Abstract his paper develops a test for the exogeneity and

More information

Parametric Inference on Strong Dependence

Parametric Inference on Strong Dependence Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation

More information

ECON2285: Mathematical Economics

ECON2285: Mathematical Economics ECON2285: Mathematical Economics Yulei Luo Economics, HKU September 17, 2018 Luo, Y. (Economics, HKU) ME September 17, 2018 1 / 46 Static Optimization and Extreme Values In this topic, we will study goal

More information

Economics 204 Fall 2011 Problem Set 2 Suggested Solutions

Economics 204 Fall 2011 Problem Set 2 Suggested Solutions Economics 24 Fall 211 Problem Set 2 Suggested Solutions 1. Determine whether the following sets are open, closed, both or neither under the topology induced by the usual metric. (Hint: think about limit

More information

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z

More information

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci

More information

Economics 620, Lecture 18: Nonlinear Models

Economics 620, Lecture 18: Nonlinear Models Economics 620, Lecture 18: Nonlinear Models Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 18: Nonlinear Models 1 / 18 The basic point is that smooth nonlinear

More information

Set, functions and Euclidean space. Seungjin Han

Set, functions and Euclidean space. Seungjin Han Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,

More information

Time is discrete and indexed by t =0; 1;:::;T,whereT<1. An individual is interested in maximizing an objective function given by. tu(x t ;a t ); (0.

Time is discrete and indexed by t =0; 1;:::;T,whereT<1. An individual is interested in maximizing an objective function given by. tu(x t ;a t ); (0. Chapter 0 Discrete Time Dynamic Programming 0.1 The Finite Horizon Case Time is discrete and indexed by t =0; 1;:::;T,whereT

More information

MAXIMUM LIKELIHOOD ESTIMATION AND UNIFORM INFERENCE WITH SPORADIC IDENTIFICATION FAILURE. Donald W. K. Andrews and Xu Cheng.

MAXIMUM LIKELIHOOD ESTIMATION AND UNIFORM INFERENCE WITH SPORADIC IDENTIFICATION FAILURE. Donald W. K. Andrews and Xu Cheng. MAXIMUM LIKELIHOOD ESTIMATION AND UNIFORM INFERENCE WITH SPORADIC IDENTIFICATION FAILURE By Donald W. K. Andrews and Xu Cheng October COWLES FOUNDATION DISCUSSION PAPER NO. 8 COWLES FOUNDATION FOR RESEARCH

More information

The Influence Function of Semiparametric Estimators

The Influence Function of Semiparametric Estimators The Influence Function of Semiparametric Estimators Hidehiko Ichimura University of Tokyo Whitney K. Newey MIT July 2015 Revised January 2017 Abstract There are many economic parameters that depend on

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Cross-fitting and fast remainder rates for semiparametric estimation

Cross-fitting and fast remainder rates for semiparametric estimation Cross-fitting and fast remainder rates for semiparametric estimation Whitney K. Newey James M. Robins The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP41/17 Cross-Fitting

More information

Approximately Most Powerful Tests for Moment Inequalities

Approximately Most Powerful Tests for Moment Inequalities Approximately Most Powerful Tests for Moment Inequalities Richard C. Chiburis Department of Economics, Princeton University September 26, 2008 Abstract The existing literature on testing moment inequalities

More information

A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor

A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor A Note on the Closed-form Identi cation of Regression Models with a Mismeasured Binary Regressor Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z Boston College First

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Risk)

Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Risk) Supplemental Appendix to: Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Ris) Andrew J. Patton Johanna F. Ziegel Rui Chen Due University University of Bern Due University September

More information

Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Risk)

Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Risk) Supplemental Appendix to: Dynamic Semiparametric Models for Expected Shortfall (and Value-at-Ris) Andrew J. Patton Johanna F. Ziegel Rui Chen Due University University of Bern Due University 30 April 08

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider

More information

Robust Solutions to Multi-Objective Linear Programs with Uncertain Data

Robust Solutions to Multi-Objective Linear Programs with Uncertain Data Robust Solutions to Multi-Objective Linear Programs with Uncertain Data M.A. Goberna yz V. Jeyakumar x G. Li x J. Vicente-Pérez x Revised Version: October 1, 2014 Abstract In this paper we examine multi-objective

More information

Microeconomics, Block I Part 1

Microeconomics, Block I Part 1 Microeconomics, Block I Part 1 Piero Gottardi EUI Sept. 26, 2016 Piero Gottardi (EUI) Microeconomics, Block I Part 1 Sept. 26, 2016 1 / 53 Choice Theory Set of alternatives: X, with generic elements x,

More information

University of Toronto

University of Toronto A Limit Result for the Prior Predictive by Michael Evans Department of Statistics University of Toronto and Gun Ho Jang Department of Statistics University of Toronto Technical Report No. 1004 April 15,

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Math 413/513 Chapter 6 (from Friedberg, Insel, & Spence)

Math 413/513 Chapter 6 (from Friedberg, Insel, & Spence) Math 413/513 Chapter 6 (from Friedberg, Insel, & Spence) David Glickenstein December 7, 2015 1 Inner product spaces In this chapter, we will only consider the elds R and C. De nition 1 Let V be a vector

More information

Measuring robustness

Measuring robustness Measuring robustness 1 Introduction While in the classical approach to statistics one aims at estimates which have desirable properties at an exactly speci ed model, the aim of robust methods is loosely

More information

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL

Endogeneity and Discrete Outcomes. Andrew Chesher Centre for Microdata Methods and Practice, UCL Endogeneity and Discrete Outcomes Andrew Chesher Centre for Microdata Methods and Practice, UCL July 5th 2007 Accompanies the presentation Identi cation and Discrete Measurement CeMMAP Launch Conference,

More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 Revised March 2012 COWLES FOUNDATION DISCUSSION PAPER NO. 1815R COWLES FOUNDATION FOR

More information

ECON0702: Mathematical Methods in Economics

ECON0702: Mathematical Methods in Economics ECON0702: Mathematical Methods in Economics Yulei Luo SEF of HKU January 14, 2009 Luo, Y. (SEF of HKU) MME January 14, 2009 1 / 44 Comparative Statics and The Concept of Derivative Comparative Statics

More information

Problem 3. Give an example of a sequence of continuous functions on a compact domain converging pointwise but not uniformly to a continuous function

Problem 3. Give an example of a sequence of continuous functions on a compact domain converging pointwise but not uniformly to a continuous function Problem 3. Give an example of a sequence of continuous functions on a compact domain converging pointwise but not uniformly to a continuous function Solution. If we does not need the pointwise limit of

More information

Identi cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance

Identi cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance Identi cation of Positive Treatment E ects in Randomized Experiments with Non-Compliance Aleksey Tetenov y February 18, 2012 Abstract I derive sharp nonparametric lower bounds on some parameters of the

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

01. Review of metric spaces and point-set topology. 1. Euclidean spaces

01. Review of metric spaces and point-set topology. 1. Euclidean spaces (October 3, 017) 01. Review of metric spaces and point-set topology Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 017-18/01

More information

Alvaro Rodrigues-Neto Research School of Economics, Australian National University. ANU Working Papers in Economics and Econometrics # 587

Alvaro Rodrigues-Neto Research School of Economics, Australian National University. ANU Working Papers in Economics and Econometrics # 587 Cycles of length two in monotonic models José Alvaro Rodrigues-Neto Research School of Economics, Australian National University ANU Working Papers in Economics and Econometrics # 587 October 20122 JEL:

More information

GENERIC RESULTS FOR ESTABLISHING THE ASYMPTOTIC SIZE OF CONFIDENCE SETS AND TESTS. Donald W.K. Andrews, Xu Cheng and Patrik Guggenberger.

GENERIC RESULTS FOR ESTABLISHING THE ASYMPTOTIC SIZE OF CONFIDENCE SETS AND TESTS. Donald W.K. Andrews, Xu Cheng and Patrik Guggenberger. GENERIC RESULTS FOR ESTABLISHING THE ASYMPTOTIC SIZE OF CONFIDENCE SETS AND TESTS By Donald W.K. Andrews, Xu Cheng and Patrik Guggenberger August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1813 COWLES

More information

NUCLEAR NORM PENALIZED ESTIMATION OF INTERACTIVE FIXED EFFECT MODELS. Incomplete and Work in Progress. 1. Introduction

NUCLEAR NORM PENALIZED ESTIMATION OF INTERACTIVE FIXED EFFECT MODELS. Incomplete and Work in Progress. 1. Introduction NUCLEAR NORM PENALIZED ESTIMATION OF IERACTIVE FIXED EFFECT MODELS HYUNGSIK ROGER MOON AND MARTIN WEIDNER Incomplete and Work in Progress. Introduction Interactive fixed effects panel regression models

More information

Applications of Subsampling, Hybrid, and Size-Correction Methods

Applications of Subsampling, Hybrid, and Size-Correction Methods Applications of Subsampling, Hybrid, and Size-Correction Methods Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Patrik Guggenberger Department of Economics UCLA November

More information

The Impact of a Hausman Pretest on the Size of a Hypothesis Test: the Panel Data Case

The Impact of a Hausman Pretest on the Size of a Hypothesis Test: the Panel Data Case The Impact of a Hausman retest on the Size of a Hypothesis Test: the anel Data Case atrik Guggenberger Department of Economics UCLA September 22, 2008 Abstract: The size properties of a two stage test

More information

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator Chapter 6: Endogeneity and Instrumental Variables (IV) estimator Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans December 15, 2013 Christophe Hurlin (University of Orléans)

More information

Nonparametric Welfare Analysis for Discrete Choice

Nonparametric Welfare Analysis for Discrete Choice Nonparametric Welfare Analysis for Discrete Choice Debopam Bhattacharya University of Oxford September 26, 2014. Abstract We consider empirical measurement of exact equivalent/compensating variation resulting

More information

Topological properties

Topological properties CHAPTER 4 Topological properties 1. Connectedness Definitions and examples Basic properties Connected components Connected versus path connected, again 2. Compactness Definition and first examples Topological

More information

Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University

Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University February 7, 2007 2 Contents 1 Metric Spaces 1 1.1 Basic definitions...........................

More information

Nonparametric Estimation of Wages and Labor Force Participation

Nonparametric Estimation of Wages and Labor Force Participation Nonparametric Estimation of Wages Labor Force Participation John Pepper University of Virginia Steven Stern University of Virginia May 5, 000 Preliminary Draft - Comments Welcome Abstract. Model Let y

More information

Notes on Generalized Method of Moments Estimation

Notes on Generalized Method of Moments Estimation Notes on Generalized Method of Moments Estimation c Bronwyn H. Hall March 1996 (revised February 1999) 1. Introduction These notes are a non-technical introduction to the method of estimation popularized

More information

Semiparametric Estimation of Invertible Models

Semiparametric Estimation of Invertible Models Semiparametric Estimation of Invertible Models Andres Santos Department of Economics University of California, San Diego e-mail: a2santos@ucsd.edu July, 2011 Abstract This paper proposes a simple estimator

More information

Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation

Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation Economics 620, Lecture 19: Introduction to Nonparametric and Semiparametric Estimation Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 19: Nonparametric Analysis

More information

Stochastic Demand and Revealed Preference

Stochastic Demand and Revealed Preference Stochastic Demand and Revealed Preference Richard Blundell y Dennis Kristensen z Rosa Matzkin x This version: May 2010 Preliminary Draft Abstract This paper develops new techniques for the estimation and

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

1 The Well Ordering Principle, Induction, and Equivalence Relations

1 The Well Ordering Principle, Induction, and Equivalence Relations 1 The Well Ordering Principle, Induction, and Equivalence Relations The set of natural numbers is the set N = f1; 2; 3; : : :g. (Some authors also include the number 0 in the natural numbers, but number

More information

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3 Index Page 1 Topology 2 1.1 Definition of a topology 2 1.2 Basis (Base) of a topology 2 1.3 The subspace topology & the product topology on X Y 3 1.4 Basic topology concepts: limit points, closed sets,

More information

Stochastic integral. Introduction. Ito integral. References. Appendices Stochastic Calculus I. Geneviève Gauthier.

Stochastic integral. Introduction. Ito integral. References. Appendices Stochastic Calculus I. Geneviève Gauthier. Ito 8-646-8 Calculus I Geneviève Gauthier HEC Montréal Riemann Ito The Ito The theories of stochastic and stochastic di erential equations have initially been developed by Kiyosi Ito around 194 (one of

More information

Stein s method and weak convergence on Wiener space

Stein s method and weak convergence on Wiener space Stein s method and weak convergence on Wiener space Giovanni PECCATI (LSTA Paris VI) January 14, 2008 Main subject: two joint papers with I. Nourdin (Paris VI) Stein s method on Wiener chaos (ArXiv, December

More information

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice Wageningen Summer School in Econometrics The Bayesian Approach in Theory and Practice September 2008 Slides for Lecture on Qualitative and Limited Dependent Variable Models Gary Koop, University of Strathclyde

More information

l(y j ) = 0 for all y j (1)

l(y j ) = 0 for all y j (1) Problem 1. The closed linear span of a subset {y j } of a normed vector space is defined as the intersection of all closed subspaces containing all y j and thus the smallest such subspace. 1 Show that

More information

Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods. MIT , Fall Due: Wednesday, 07 November 2007, 5:00 PM

Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods. MIT , Fall Due: Wednesday, 07 November 2007, 5:00 PM Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods MIT 14.385, Fall 2007 Due: Wednesday, 07 November 2007, 5:00 PM 1 Applied Problems Instructions: The page indications given below give you

More information

CAE Working Paper # A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests. Nicholas M. Kiefer and Timothy J.

CAE Working Paper # A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests. Nicholas M. Kiefer and Timothy J. CAE Working Paper #05-08 A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests by Nicholas M. Kiefer and Timothy J. Vogelsang January 2005. A New Asymptotic Theory for Heteroskedasticity-Autocorrelation

More information

Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models

Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models James J. Heckman and Salvador Navarro The University of Chicago Review of Economics and Statistics 86(1)

More information

A note on L convergence of Neumann series approximation in missing data problems

A note on L convergence of Neumann series approximation in missing data problems A note on L convergence of Neumann series approximation in missing data problems Hua Yun Chen Division of Epidemiology & Biostatistics School of Public Health University of Illinois at Chicago 1603 West

More information

4.3 - Linear Combinations and Independence of Vectors

4.3 - Linear Combinations and Independence of Vectors - Linear Combinations and Independence of Vectors De nitions, Theorems, and Examples De nition 1 A vector v in a vector space V is called a linear combination of the vectors u 1, u,,u k in V if v can be

More information

Near convexity, metric convexity, and convexity

Near convexity, metric convexity, and convexity Near convexity, metric convexity, and convexity Fred Richman Florida Atlantic University Boca Raton, FL 33431 28 February 2005 Abstract It is shown that a subset of a uniformly convex normed space is nearly

More information