Partial Rank Estimation of Transformation Models with General forms of Censoring

Shakeeb Khan (a) and Elie Tamer (b)

(a) Department of Economics, University of Rochester, Rochester NY 14627, USA
(b) Department of Economics, Northwestern University, 2001 Sheridan Rd, Evanston IL 60208, USA

Submitted Version: February 2004
Current Version: March 2005

Abstract

In this paper we propose estimators for the regression coefficients in censored duration models which are distribution free, impose no parametric specification of the baseline hazard function, and can accommodate general forms of censoring. The estimators are shown to have desirable asymptotic properties and Monte Carlo simulations demonstrate good finite sample performance. Among the data features the new estimators can accommodate are covariate dependent censoring, double censoring, heteroskedasticity, and fixed (individual or group specific) effects.

JEL Classification: C13; C31
Keywords: duration analysis, rank correlation, median regression, panel data.

1 Introduction

This paper considers estimation of regression coefficients in censored duration models. Duration models have seen widespread use in empirical work in various areas of economics. This is because many time-to-event variables are of interest to researchers conducting empirical studies in labor economics, development economics, public finance and finance. For example, the time-to-event of interest may be the length of an unemployment spell, the time between purchases of a particular good, time intervals between child births, and insurance claim durations, to name a few. (Van den Berg (2001) surveys the many applications of duration models.)

Acknowledgments: We thank the co-editor Takeshi Amemiya and three referees for extremely useful suggestions (the report of one very diligent referee was particularly helpful in the revision). We also thank J. Abrevaya, S. Chen, H. Hong, B. Honoré, J. Powell as well as seminar participants at various institutions and the European and North American Summer Meetings of the Econometric Society for many helpful comments. Both authors also gratefully acknowledge the National Science Foundation for financial support. Corresponding Author: Department of Economics, Northwestern University, 2001 Sheridan Rd, Evanston IL 60208; tamer@northwestern.edu.

Since the seminal work in Cox (1972, 1975), the most widely used models in duration analysis are the proportional hazards model and its extension, the mixed proportional hazards model, introduced in Lancaster (1979). These models can be represented as monotonic transformation models, where an unknown, monotonic transformation of the dependent variable is a linear function of observed covariates plus an unobserved error term, subject to restrictions that maintain the (mixed) proportional hazards assumption. Relaxing these restrictions gives the Generalized Accelerated Failure Time (GAFT) model introduced in Ridder (1990). The GAFT model in its most basic form is usually expressed as

$$T(y_i) = x_i'\beta_0 + \epsilon_i, \qquad i = 1, 2, \ldots, n \tag{1.1}$$

where $(y_i, x_i')'$ is a $(k+1)$-dimensional observed random vector, with $y_i$ denoting the dependent variable, usually a time to event, and $x_i$ denoting a vector of observed covariates. The random variable $\epsilon_i$ is unobserved and independent of $x_i$, with an unknown distribution. The function $T(\cdot)$ is assumed to be strictly monotonic, but otherwise unspecified. The $k$-dimensional vector $\beta_0$ is unknown, and is often the object of interest to be estimated from a random sample of $n$ observations.

Duration data is often subject to right censoring for a variety of reasons that are usually a consequence of the empirical researcher's observation or data collection plan. For example, an unemployment spell length may be censored because the agent is lost from the sample or because, to control data collection costs, unemployed agents are only followed for a short period of time. If they are still unemployed at the end of this period, their spell length is censored. This paper considers right censoring as the base case and shows how our approach can be extended to cover double censoring (and naturally left censoring).

When the data is subject to censoring the variable $y_i$ is no longer always observed. Instead one observes the pair $(v_i, d_i)$ where $v_i$ is a scalar random variable, and $d_i$ is a binary random variable. We express the right censored transformation model as [1]:

$$T(v_i) = \min(x_i'\beta_0 + \epsilon_i,\; c_i) \tag{1.2}$$

$$d_i = I[x_i'\beta_0 + \epsilon_i \le c_i] \tag{1.3}$$

where $I[\cdot]$ denotes the indicator function, and $c_i$ denotes the random censoring variable. So, here $v_i = y_i$ for uncensored observations, and $T(v_i) = c_i$ otherwise. We note the censoring variable need not always be observed, as would occur in a competing risks type setting (see, e.g., Heckman and Honoré (1990)).

[1] We can also express the censored model in the latent dependent variable framework. Letting $y_i^* = T^{-1}(x_i'\beta_0 + \epsilon_i)$ and $c_i^* = T^{-1}(c_i)$, one observes the covariates and the pair $(y_i, d_i)$ where $y_i = \min(y_i^*, c_i^*)$ and $d_i$ can now be expressed as $I[y_i^* \le c_i^*]$.

The primary aim of this paper is to provide an estimator of $\beta_0$ in the above model with few restrictions on $c_i$. Specifically, we wish to allow for the presence of covariate dependent censoring,

i.e., in the case where $c_i$ can be arbitrarily correlated with $x_i$. This would be in line with the form of censoring allowed for in the Partial Maximum Likelihood Estimator (PMLE) introduced in Cox (1972, 1975), and several other estimators (to be mentioned below) in the duration literature. Outside the proportional hazards framework, covariate dependent censoring also arises in the biostatistics literature on competing risks and survival analysis (even for randomized clinical trials; see Chen, Jin and Ying (2002)).

We motivate the construction of such an estimator in two ways: first, by illustrating the relevance of the censored transformation model in empirical settings, and second, by showing that the problem of (distribution free) estimation of $\beta_0$ in a censored transformation model has not been completely solved.

Turning first to the relevance of the model in empirical work, we note the censored transformation model has become increasingly popular in the applied econometrics literature. This is because economic theory rarely provides guidelines on how to specify functional form relationships among variables, while (1.1) can accommodate many functional relationships used in practice, such as linear, log-linear, or the parametric transformation in Box-Cox models, without suffering from the dimensionality problems encountered when adopting a fully nonparametric approach.

Next, we explain why the problem of estimating $\beta_0$ has not been completely solved despite the extensive literature (both in econometrics and in biostatistics) and much recent research progress. We note that with $T(\cdot)$ known there exist many distribution free estimators for $\beta_0$. Examples include Buckley and James (1979), Koul, Susarla and Van Ryzin (1981), Tsiatis (1990), Ying, Jung and Wei (1995), Yang (1999), Honoré, Khan and Powell (2002) and Portnoy (2003), some of which allow for the distribution of the censoring variable to depend on the covariates. Bijwaard (2001) imposes parametric restrictions on $T(\cdot)$.

For $T(\cdot)$ unknown except for a strict monotonicity assumption, we can divide the existing literature into two groups. One group allows for covariate dependent censoring but requires a known distribution of $\epsilon_i$, and can be inconsistent if this distribution is misspecified. See for example Cox (1975)'s partial maximum likelihood estimator (PMLE), Cuzick (1988), and more recently Chen, Jin and Ying (2002). Cheng, Wei and Ying (1995), Fine, Ying and Wei (1998), and Cai, Wei and Wilcox (2000) are more restrictive in the sense that, in addition to parametrically specifying the distribution of $\epsilon_i$, they do not allow the censoring variable to depend on $x_i$. The other group, of which examples include the important single-index estimators in Han (1987) and Cavanagh and Sherman (1998), does not impose distributional assumptions on $\epsilon_i$, but for consistency requires that the censoring variable be independent of the covariates. However, as mentioned above, this assumption is often too restrictive. Attempting to remedy this problem using conditional Kaplan-Meier methods would require smoothing parameters, trimming procedures, and tail behavior restrictions. In summary, the literature lacks an estimator for $\beta_0$ that is distribution free and permits covariate dependent censoring [2].

[2] If the censoring distribution depends on the regressors through the index $x_i'\beta_0$, we note some single index estimators may be applied, though we consider this too restrictive a condition. We also note that general covariate dependent censoring is permitted in Gorgens and Horowitz (1998) in their estimator of the link function $T(\cdot)$. They do not provide an estimator of $\beta_0$, treating it as known, and only suggest estimating it with an existing single index estimator.

The estimator we provide in this

paper is a partial rank estimator that allows for general forms of censoring, including left, right, and double sided censoring, where the censoring can possibly depend on the regressors. This leads to new identification strategies that are emphasized in the text (and the proofs).

The rest of the paper is organized as follows. In the next section, we introduce an estimator for $\beta_0$ in transformation models with covariate dependent censoring that aims to fill the gap in the literature, and provide conditions for parameter identification. We then show that the estimator is consistent and derive its asymptotic distribution. Sections 3-5 extend our estimator to cases of doubly censored, heteroskedastic, and fixed effect panel data. Section 6 explores the finite sample properties of the new estimators by means of a small scale simulation study. Section 7 concludes by summarizing results and discussing areas for future research. The proofs of the main results are collected in an appendix.

2 Estimation Procedure

Our estimator for $\beta_0$ in the censored GAFT model is motivated by existing rank estimators for $\beta_0$ in uncensored transformation models, specifically Han (1987)'s maximum rank correlation (MRC) estimator [3]. This estimator, like other estimators in the single index literature (e.g. Powell, Stock and Stoker (1989), Ichimura (1993), Cavanagh and Sherman (1998)), is inconsistent when the censoring variable depends on the covariates in an arbitrary way.

[3] A similar rank estimator was introduced in Cavanagh and Sherman (1998). Their Monotone Rank Estimator (MRE) is computationally simpler than the MRC, but also does not allow for covariate dependent censoring.

Before introducing our distribution free estimator that accommodates covariate dependent censoring, we define the vector $y_i = (v_i, d_i)$. To construct a rank regression estimator analogous to Han (1987)'s, we wish to construct a function $f_{ij} \equiv f(y_i, y_j)$ which satisfies the property

$$E\big[I[f_{ij} \ge 0] \mid x_i, x_j\big] \ge E\big[I[f_{ji} \ge 0] \mid x_i, x_j\big] \quad \text{iff} \quad x_i'\beta_0 \ge x_j'\beta_0 \tag{2.1}$$

For the uncensored transformation model, Han (1987) sets $f_{ij} = y_i - y_j$. As mentioned, this will not yield consistent estimates if the censoring depends arbitrarily on $x_i$. For the problem at hand with covariate dependent censoring, we propose an alternative form for $f_{ij}$ that satisfies (2.1). First define the random variables

$$y_{0i} = v_i, \qquad y_{1i} = d_i v_i + (1 - d_i)\cdot(+\infty) \tag{2.2}$$

where by definition we have

$$y_{0i} \le y_i \le y_{1i} \tag{2.3}$$

where $y_i \equiv T^{-1}(x_i'\beta_0 + \epsilon_i)$. We can then define $f_{ij}$, and consequently $I[f_{ij} \ge 0]$, as

$$f_{ij} = y_{1i} - y_{0j} \tag{2.4}$$

$$I[f_{ij} \ge 0] = I[y_{1i} - y_{0j} \ge 0] = (1 - d_i) + d_i\, I[v_i \ge v_j] \tag{2.5}$$

Our choice of $f_{ij}$ is motivated by the following inequalities, which follow from (2.3):

$$T(y_{0i}) \le x_i'\beta_0 + \epsilon_i \le T(y_{1i})$$

Heuristically, by monotonicity of $T(\cdot)$, we should have

$$x_i'\beta \ge x_j'\beta \;\Rightarrow\; P(y_{1i} \ge y_{0j}) \ge \tfrac{1}{2} \tag{2.6}$$

We first show that (2.1) holds for the censored transformation model. Our result is based on the following assumptions.

I1 Let $S_X$ denote the support of $x_i$, and let $X_{uc}$ denote the set $X_{uc} = \{x \in S_X : P(d_i = 1 \mid x_i = x) > 0\}$. Then $X_{uc}$ has positive measure.

I2 The random variable $\epsilon_i$ is distributed independently of the random vector $(c_i, x_i)$.

I3 The first component of $x_i$ has everywhere positive Lebesgue density, conditional on the other components.

Condition I1 requires that the probability of censoring is not equal to one for all $x$. The independence assumption I2 requires that $\epsilon$ be independent of both $x$ and $c$. This assumption is a natural starting point for examining the identification of $\beta$ in this class of models. The independence between $\epsilon$ and $x$ is similar to the independence assumption in the MRC estimator. In Section 4, we relax this assumption and allow for (conditional) heteroskedasticity. This will come at the expense of a stronger (sufficient) point identification condition [4]. Finally, the last assumption, I3, provides a sufficient condition for point identification. This support condition is a widely used identification condition in semiparametric econometric models such as the MRC and the maximum score estimator in Manski (1975, 1985).

[4] On the other hand, it is not generally possible to relax the conditional independence of $c$ and $\epsilon$ without additional assumptions such as exclusion restrictions (or functional form assumptions). This independence assumption is natural and has been widely adopted in the duration literature.

The main result of this section is stated in the next lemma, whose proof is in the appendix.

Lemma 2.1 Under Assumptions I1-I3, (2.1) holds.
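The bounds in (2.2) and the pairwise indicator in (2.5) can be checked numerically. The following is a minimal sketch in plain Python, not code from the paper: the function names and the three-observation toy sample are hypothetical, and `math.inf` stands in for the $+\infty$ assigned to censored observations.

```python
import math

def censoring_bounds(v, d):
    """Bounds from (2.2): y0 = v, and y1 = v if uncensored (d = 1), else +inf."""
    y0 = list(v)
    y1 = [vi if di == 1 else math.inf for vi, di in zip(v, d)]
    return y0, y1

def pair_indicator(v, d):
    """Matrix with (i, j) entry I[y1_i >= y0_j], computed from the bounds."""
    y0, y1 = censoring_bounds(v, d)
    n = len(v)
    return [[int(y1[i] >= y0[j]) for j in range(n)] for i in range(n)]

def pair_indicator_closed_form(v, d):
    """Right-hand side of (2.5): (1 - d_i) + d_i * I[v_i >= v_j]."""
    n = len(v)
    return [[(1 - d[i]) + d[i] * int(v[i] >= v[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical toy sample: the second observation is censored at 2.0.
v = [1.0, 2.0, 3.0]
d = [1, 0, 1]
from_bounds = pair_indicator(v, d)
closed_form = pair_indicator_closed_form(v, d)
```

In this sketch a censored observation $i$ contributes a 1 against every $j$, because its upper bound is infinite, while an uncensored observation is compared through the observed values only. This is the pairwise comparison that Lemma 2.1 orders by the index.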

It is this result which motivates our estimator. Before describing it in detail, we note that the object of interest $\beta_0$ is only identified up to scale, as the function $T(\cdot)$ is unknown. Following convention, we set the first component of the vector $\beta_0$ to 1, express $\beta_0 = (1, \theta_0')'$ and consider estimation of $\theta_0$. Adopting standard notation, for any $\theta \in \Theta$, we let $\beta$ denote $(1, \theta')'$. Our censoring robust rank estimator, which we refer to hereafter as the partial rank estimator (PRE), is of the form:

$$\hat\theta = \arg\max_{\theta \in \Theta} \frac{1}{n(n-1)} \sum_{i \ne j} I[f_{ij} \ge 0]\, I[x_i'\beta \ge x_j'\beta] \tag{2.7}$$

$$\phantom{\hat\theta} = \arg\max_{\theta \in \Theta} \frac{1}{n(n-1)} \sum_{i \ne j} \big(d_i\, I[v_i \ge v_j] + (1 - d_i)\big)\, I[x_i'\beta \ge x_j'\beta] \tag{2.8}$$

where $\Theta$ denotes the parameter space.

Remark 2.1 We note the following particular features of the estimation procedure:

- The above estimator is numerically equivalent to maximizing the objective function

$$\frac{1}{n(n-1)} \sum_{i \ne j} I[y_{0i} \ge y_{1j}]\, I[x_i'\beta \ge x_j'\beta] = \frac{1}{n(n-1)} \sum_{i \ne j} d_i\, I[v_i \le v_j]\, I[x_i'\beta \le x_j'\beta] \tag{2.9}$$

  Expressed this way, a loose analogy can be drawn to the partial maximum likelihood estimator (PMLE) introduced in Cox (1972, 1975). In the PMLE only uncensored observations enter the likelihood function, and for a given such observation, all the observations in its risk set (i.e. observations whose spell length is known not to be less than $i$'s spell length) are used in evaluating the likelihood. Analogously, for the PRE only uncensored observations enter the rank correlation function, and for a given uncensored observation, all the observations in its risk set are used to determine its rank.

- The PRE is numerically equivalent to the MRC in the absence of censoring, as the censoring indicators are always 1.

- The PRE is also numerically equivalent to the MRC, and hence is consistent, in the case of fixed censoring (e.g. $c_i \equiv 0$), which arises often in economic models. The estimator can therefore be applied without change to fixed and randomly censored data. This is in contrast to procedures dividing by Kaplan-Meier estimators of the censoring variable's survivor function (see, e.g., Koul, Susarla and Van Ryzin (1981)), which cannot in general be applied in the fixed censoring case.

- Though the PRE was defined in terms of the right censored model, it can be easily modified for the left censored model. The form of this modification is illustrated in the next section, where we explore the doubly censored model.

We first establish consistency of the PRE. For this we impose the additional conditions

I4 The vectors $z_i = (d_i, v_i, x_i')'$, $i = 1, 2, \ldots, n$, are i.i.d.

I5 $\Theta$ is a compact subset of $R^{k-1}$.

I6 $S_X$ is not contained in any proper linear subspace of $R^k$.

The following theorem, whose proof is left to the appendix, establishes the consistency of the PRE.

Theorem 2.1 Under Assumptions I1-I6,

$$\hat\theta \xrightarrow{p} \theta_0$$

We now establish the limiting distribution theory of the PRE. The arguments are analogous to those used in Sherman (1993) for establishing the asymptotic distribution of the MRC. Our results are based on a set of similar assumptions and we deliberately choose notation to match his as closely as possible. Recalling that $z_i$ denotes the vector $(d_i, v_i, x_i')'$, we define, for a fixed point $z = (d, v, x')'$ and with expectations taken over $z_i$,

$$\tau(z, \theta) = E\big[(d_i\, I[v_i \ge v] + (1 - d_i))\, I[x'\beta \le x_i'\beta]\big] + E\big[(d\, I[v \ge v_i] + (1 - d))\, I[x_i'\beta \le x'\beta]\big]$$

Finally, we let $\mathcal{N}$ denote a neighborhood of $\theta_0$.

A1 $\theta_0$ lies in the interior of $\Theta$, a compact subset of $R^{k-1}$.

A2 For each $z$, the function $\tau(z, \cdot)$ is twice differentiable in a neighborhood of $\theta_0$. Furthermore, the vector of second derivatives of $\tau(z, \cdot)$ satisfies the following Lipschitz condition:

$$\|\nabla_2 \tau(z, \theta) - \nabla_2 \tau(z, \theta_0)\| \le M(z)\, \|\theta - \theta_0\|$$

where $\nabla_2$ denotes the second derivative operator and $M(\cdot)$ denotes an integrable function of $z$.

A3 $E[\|\nabla_1 \tau(z_i, \theta_0)\|^2]$ and $E[\|\nabla_2 \tau(z_i, \theta_0)\|]$ are finite, where $\nabla_1$ denotes the first derivative operator.

A4 $E[\nabla_2 \tau(z_i, \theta_0)]$ is non-singular.

We now state the main theorem, characterizing the asymptotic distribution of the PRE; its proof is left to the appendix.

Theorem 2.2 Under Assumptions I1-I5, A1-A4,

$$\sqrt{n}\,(\hat\theta - \theta_0) \Rightarrow N(0,\; V^{-1} \Delta V^{-1}) \tag{2.10}$$

where $V = E[\nabla_2 \tau(z_i, \theta_0)]/2$ and $\Delta = E[\nabla_1 \tau(z_i, \theta_0)\, \nabla_1 \tau(z_i, \theta_0)']$.
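To make the PRE concrete, the sample objective in (2.8) can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: the function names, the toy data and the candidate list are hypothetical, and the search over a finite list of candidate values is an assumed stand-in for whatever optimizer is used in practice (the objective is a step function of $\theta$, so it has no useful gradient).

```python
def pre_objective(theta, v, d, x):
    """Sample PRE objective (2.8) evaluated at beta = (1, theta)."""
    beta = [1.0] + list(theta)          # scale normalization: first coefficient is 1
    index = [sum(b * xk for b, xk in zip(beta, row)) for row in x]
    n = len(v)
    total = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Weight from (2.5): censored i always counts, uncensored i needs v_i >= v_j.
            weight = d[i] * int(v[i] >= v[j]) + (1 - d[i])
            total += weight * int(index[i] >= index[j])
    return total / (n * (n - 1))

def pre_grid_search(candidates, v, d, x):
    """Return the first candidate theta with the largest objective value."""
    best_theta, best_value = None, -1.0
    for theta in candidates:
        value = pre_objective(theta, v, d, x)
        if value > best_value:
            best_theta, best_value = theta, value
    return best_theta, best_value

# Hypothetical toy sample with k = 2 regressors; the second observation is censored.
v = [1.0, 2.0, 3.0]
d = [1, 0, 1]
x = [[0.0, 1.0], [1.0, 0.0], [2.0, -1.0]]
best_theta, best_value = pre_grid_search([[3.0], [0.0]], v, d, x)
```

The double loop costs $O(n^2)$ per evaluation, which is adequate for a sketch; a candidate list is used here only because it keeps the example deterministic and easy to check by hand.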

We conclude this section with a brief discussion on conducting inference with the PRE. The asymptotic variance matrix can be estimated in a similar fashion to the estimator in Sherman (1993). As is the case with that estimator, the selection of smoothing parameters will be required. Unfortunately, it has not been formally established that the bootstrap is asymptotically valid in this setting; if it were, inference could be conducted without the selection of smoothing parameters. However, other sampling procedures are possible. For one, a recent wild sampling procedure introduced in Jin, Ying, and Wei (2001) appears likely to be applicable (after appropriate modifications) to the problem at hand.

Separately, the PRE can be used to construct model specification tests by comparing its value to those of existing estimators. For example, the PRE may be compared to the MRC to test for the presence of covariate dependent censoring. We can compare the PRE to the relative coefficients obtained from Cox's partial likelihood estimator (PMLE) or those found using the estimator in Ying, Jung and Wei (2002) to test for the presence of unobserved heterogeneity, or more generally, to test for particular distributions of $\epsilon_i$. Also, we can compare the PRE to relative coefficients obtained from the Tsiatis (1990) and/or Ying, Jung and Wei (1995) estimators, to test for particular functional forms of the transformation.

3 Extension I: Doubly Censored Data

Many data sets in both biostatistics and economics are subject to double (i.e. left and right) random censoring. Examples arise when the dependent variable is the age of the individual at which a particular event (e.g. cancerous tumor, change in employment status) occurs, and individuals are regularly and frequently surveyed or tested for an interval of time. If the occurrence of the event is detected on the first survey/test, the dependent variable (age) is left censored, as the recorded value is greater than the actual (latent) value. If no such events have occurred by the last survey/test, the dependent variable is right censored, as the recorded value is exceeded by the actual value.

In the monotonic transformation framework, the doubly censored regression model can be expressed as follows. (1.1) still holds, but the econometrician does not always observe the dependent variable $y_i \equiv T^{-1}(x_i'\beta_0 + \epsilon_i)$. Instead one observes the doubly censored sample, which we can express as the pair $(v_i, d_i)$ where

$$d_i = I[c_{1i} < x_i'\beta_0 + \epsilon_i \le c_{2i}] + 2 \cdot I[x_i'\beta_0 + \epsilon_i \le c_{1i}] + 3 \cdot I[c_{2i} < x_i'\beta_0 + \epsilon_i]$$

$$v_i = I[d_i = 1] \cdot (x_i'\beta_0 + \epsilon_i) + I[d_i = 2]\, c_{1i} + I[d_i = 3]\, c_{2i}$$

where $I[\cdot]$ denotes the usual indicator function, and $c_{1i}, c_{2i}$ denote the left and right censoring variables, whose distributions may depend on the covariates $x_i$ and which satisfy $P(c_{1i} < c_{2i}) = 1$.

For a regression model with double censoring, estimators have been proposed by Zhang and Li (1996) and Ren and Gu (1997), to name a few. Both of these require a linear regression specification and the censoring variables to be independent of the covariates. With $T(\cdot)$ unknown, one can again

perform MRC using $v_i$ as the dependent variable if $x_i$ is independent of $(c_{1i}, c_{2i})$. However, in the doubly censored case the efficiency loss can be very severe due to ignoring the value of $d_i$.

To estimate $\beta_0$ in the general model with $T(\cdot)$ and the distribution of $\epsilon_i$ unknown, as well as covariate dependent censoring, we first define $y_{1i}, y_{0i}$ as

$$y_{1i} = I[d_i < 3]\, v_i + I[d_i = 3] \cdot (+\infty) \tag{3.1}$$

$$y_{0i} = I[d_i \ne 2]\, v_i + I[d_i = 2] \cdot (-\infty) \tag{3.2}$$

and accordingly we may define $f_{ij}$, $I[f_{ij} \ge 0]$ as:

$$f_{ij} = y_{1i} - y_{0j}$$

$$I[f_{ij} \ge 0] = I[d_i = 3] + I[d_j = 2] - I[d_i = 3] \cdot I[d_j = 2] + (I[d_i = 1] + I[d_i = 2]) \cdot (I[d_j = 1] + I[d_j = 3])\, I[v_i \ge v_j]$$

Letting $d_{1i}, d_{2i}, d_{3i}$ denote $I[d_i = 1]$, $I[d_i = 2]$, $I[d_i = 3]$, respectively, we can express the PRE for doubly censored data as:

$$\hat\theta = \arg\max_{\theta \in \Theta} \frac{1}{n(n-1)} \sum_{i \ne j} I[y_{1i} \ge y_{0j}]\, I[x_i'\beta \ge x_j'\beta] \tag{3.3}$$

$$\phantom{\hat\theta} = \arg\max_{\theta \in \Theta} \frac{1}{n(n-1)} \sum_{i \ne j} \big((d_{1i} + d_{2i})(d_{1j} + d_{3j})\, I[v_i \ge v_j] + (d_{3i} + d_{2j} - d_{3i} d_{2j})\big)\, I[x_i'\beta \ge x_j'\beta] \tag{3.4}$$

which, as before, is numerically equivalent to

$$\hat\theta = \arg\max_{\theta \in \Theta} \frac{1}{n(n-1)} \sum_{i \ne j} I[y_{0i} \ge y_{1j}]\, I[x_i'\beta \ge x_j'\beta] \tag{3.5}$$

$$\phantom{\hat\theta} = \arg\max_{\theta \in \Theta} \frac{1}{n(n-1)} \sum_{i \ne j} (1 - I[d_i = 2])(1 - I[d_j = 3])\, I[v_i \ge v_j]\, I[x_i'\beta \ge x_j'\beta] \tag{3.6}$$

We first establish the analogous identification result for the PRE in the doubly censored case. The identification result is again based on Assumptions I1-I3, where it is now understood that the event $d_i = 1$ is defined for doubly censored data. The proof is left to the appendix.

Lemma 3.1 Under assumptions analogous to I1-I3, if either the censoring variables are independent of the regressors or we have the condition

$$P(c_{1i} > c_{2j} \mid x_i, x_j) = P(c_{1j} > c_{2i} \mid x_i, x_j) = 0 \qquad x_i, x_j\text{-a.s.} \tag{3.7}$$

then we have

$$P(y_{0i} \ge y_{1j} \mid x_i, x_j) \ge P(y_{0j} \ge y_{1i} \mid x_i, x_j) \quad \text{iff} \quad x_i'\beta_0 \ge x_j'\beta_0 \tag{3.8}$$
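The doubly censored pair weights can be checked in the same way as in the right censored case. The sketch below, in plain Python, builds the bounds in (3.1)-(3.2) and compares the indicator $I[y_{1i} \ge y_{0j}]$ with its closed form in terms of $d_{1i}, d_{2i}, d_{3i}$ from (3.4). The function names and the four-observation sample are hypothetical and chosen only for illustration.

```python
import math

def double_censoring_bounds(v, d):
    """Bounds from (3.1)-(3.2); d is 1 (uncensored), 2 (left) or 3 (right censored)."""
    y1 = [math.inf if di == 3 else vi for vi, di in zip(v, d)]
    y0 = [-math.inf if di == 2 else vi for vi, di in zip(v, d)]
    return y0, y1

def weight_from_bounds(v, d):
    """Matrix with (i, j) entry I[y1_i >= y0_j]."""
    y0, y1 = double_censoring_bounds(v, d)
    n = len(v)
    return [[int(y1[i] >= y0[j]) for j in range(n)] for i in range(n)]

def weight_closed_form(v, d):
    """Closed form of the same indicator, as in the summand of (3.4)."""
    d1 = [int(di == 1) for di in d]
    d2 = [int(di == 2) for di in d]
    d3 = [int(di == 3) for di in d]
    n = len(v)
    return [[(d1[i] + d2[i]) * (d1[j] + d3[j]) * int(v[i] >= v[j])
             + (d3[i] + d2[j] - d3[i] * d2[j])
             for j in range(n)] for i in range(n)]

# Hypothetical toy sample: one left censored and one right censored observation.
v = [1.0, 2.0, 3.0, 0.5]
d = [1, 2, 3, 1]
w_bounds = weight_from_bounds(v, d)
w_closed = weight_closed_form(v, d)
```

A pair counts automatically whenever $i$ is right censored or $j$ is left censored, since one of the two bounds is then infinite; only the remaining pairs are compared through the observed values.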

In the above lemma, we impose the additional condition (3.7), which is a sufficient condition for point identification. Consistency of the estimator follows by including assumptions analogous to I4-I6. In the appendix, we also provide the asymptotic distribution of (3.6) above.

4 Extension II: Heteroskedastic Data

One of the assumptions that we used above is the independence between the disturbance term $\epsilon_i$ and the covariates $x_i$. This assumption may be overly restrictive; for example, it rules out any form of conditional heteroskedasticity, which is important in some data sets. In this section we relax the independence assumption by assuming only that one of the quantiles of $\epsilon_i$, say the median, is independent of the covariates. Khan (2000) proposed a two step rank estimator for a heteroskedastic transformation model, but did not allow for random censoring. In contrast, Honoré, Khan and Powell (2002) and Portnoy (2004) allow for unknown heteroskedasticity and random censoring, but require the transformation function to be known.

For point identification in models with random covariate dependent censoring, heteroskedasticity and an unknown transformation function, we assume that the random variables $c_i, \epsilon_i$ are statistically independent [5] given $x_i$. Next, we provide identification conditions for the univariate censoring case. Similar arguments can be used to attain point identification results for the double censoring case. The results in the next lemma, whose proof is in the appendix, provide sufficient conditions for regular point identification.

[5] Without this assumption, the identified feature of the model is a set of parameters (as opposed to a unique parameter).

Lemma 4.1 Define the set $\mathcal{X}$ such that

$$\mathcal{X} = \{x : \Pr(c \ge x'\beta_0 \mid x) = 1\}$$

Assume further that $\Pr_x(\mathcal{X}) > 0$. Moreover, the random variable $c$ is such that $\epsilon \perp c \mid x$. Finally, define the random variables $y_{0i} = v_i$ and $y_{1i} = d_i v_i + (1 - d_i) \cdot (+\infty)$. Then we have that

$$\mathrm{Med}(T(y_0) \mid x) = \mathrm{Med}(T(y) \mid x) = \mathrm{Med}(T(y_1) \mid x) = x'\beta_0$$

if and only if $x \in \mathcal{X}$.

The above identification result [6], along with the invariance of medians, suggests an (infeasible) rank estimator based on the conditional medians of $y_{0i}$ and $y_{1i}$. Letting $m_0(x_i), m_1(x_i)$ denote these conditional median functions, we would estimate $\beta_0$ by maximizing the function

$$Q(\beta) = \frac{1}{n(n-1)} \sum_{i \ne j} I[m_1(x_i) \ge m_0(x_j)]\, I[x_i'\beta \ge x_j'\beta] \tag{4.1}$$

[6] The lemma above provides sufficient conditions for regular point identification of $\beta_0$, i.e., conditions that allow for consistent estimation of $\beta_0$ at the parametric rate. Notice that the sufficient condition rules out the case where $x$ and $c$ are jointly normal. One can easily show that this is a case of identification at infinity, where the parameter $\beta_0$ is point identified but can only be estimated at rates slower than $\sqrt{n}$.

To construct a feasible estimation procedure, we replace the unknown median functions in the above estimator with their nonparametric estimators. To construct these estimators, we adopt the local polynomial approach introduced in Chaudhuri (1991); see that paper for a detailed description of the estimator. Here, we simply let $\hat m_0^{\delta_n, p}(x_i)$, $\hat m_1^{\delta_n, p}(x_i)$ denote the local polynomial estimators, where the superscripts denote the bandwidth sequence ($\delta_n$) and the order of polynomial ($p$) used. Conditions on $\delta_n$ and $p$ are stated in the theorem below characterizing the limiting distribution of our estimator of $\beta_0$.

To avoid the technical difficulty of dealing with a smoothing parameter inside an indicator function, we define our heteroskedasticity robust estimator of $\beta_0$, denoted here as $\hat\beta_{ht}$, as follows:

$$\hat\beta_{ht} = \arg\max_{\beta \in B} \frac{1}{n(n-1)} \sum_{i \ne j} K_{h_n}\big(\hat m_1^{\delta_n, p}(x_i) - \hat m_0^{\delta_n, p}(x_j)\big)\, I[x_i'\beta \ge x_j'\beta] \tag{4.2}$$

where $K_{h_n}(\cdot) \equiv K(\cdot / h_n)/h_n$, with $K(\cdot)$ denoting a smooth approximating function to an indicator function (i.e. a cumulative distribution function), and $h_n$ denotes a sequence of positive constants, converging to 0, such that in the limit we have an indicator function. This smoothing technique was introduced in the seminal work of Horowitz (1992). In the appendix, we provide the asymptotic distribution of our estimator in (4.2) and state the regularity conditions required for it to be well behaved.

5 Extension III: Panel Data

As is the case with duration data [7], panel data has received an increasing amount of attention in the econometric literature; see Arellano and Honoré (2001) for a recent survey. In the duration context, a panel data set usually refers to a cross section for which we observe multiple time-to-events, or spells. This may refer to multiple spells by the same individual, or spells for different individuals in a group or family. Empirical examples in the first case include unemployment spells (Heckman and Borjas (1980)), time intervals between child births (Newman and McCulloch (1984)) and car insurance claim durations (Abbring, Chiappori and Pinquet (2003)). Examples in the second case would include survival times of children in a family in a developing country (Sastry (1997), Ridder and Tunalı (1999)), lifetimes of machines grouped by a firm, or unemployment spells grouped by a family or region (e.g. Fitzgerald (1992)).

[7] We thank Bo Honoré for suggesting this extension to us.

In this section we consider estimation of a right censored duration model with fixed effects. As in the previous sections, we allow for general forms of censoring. Of particular interest in the panel data setting is to permit the distribution of the censoring variable to be spell-specific and individual/group specific. The vast existing literature does not address this type of problem. Honoré et al. (2002) allows for random censoring, but requires a linear transformation, and the censoring variables to be distributed independently of the covariates with the same distribution across spells. Extensions of the linear specification can be found in Abrevaya (1999, 2000), which allow for a generalized transformation function, but rule out fixed and/or general random censoring. Other work in the panel duration literature parametrically specifies the distribution of the error terms. Examples include Chamberlain (1985), Honoré (1993), Ridder and Tunalı (1999), Lancaster (2000), Horowitz and Lee (2003) and Lee (2003). Some of these also rule out censoring distributions that vary across spells and/or are independent of covariates [8].

[8] Honoré (1993), Horowitz and Lee (2003) and Lee (2003) do allow for the censoring variable's distribution to depend on the error term, which is not considered here. As mentioned in these papers, dependent censoring can easily occur in multiple spell data. Therefore the independence assumption considered here is better suited for analyzing data with group specific effects.

In the context of multiple spell data, we wish to allow for the distribution of the censoring variable to vary across spells, for one of two reasons. For one, the censoring distribution may depend on time-varying covariates. Also, even if the censoring distribution does not depend on the covariates, and is purely a result of the observation plan, the observation plans may vary across spells. To be precise, we will focus on the following model:

$$T_i(v_{it}) = \min(\alpha_i + x_{it}'\beta_0 + \epsilon_{it},\; c_{it})$$

$$d_{it} = I[\alpha_i + x_{it}'\beta_0 + \epsilon_{it} \le c_{it}]$$

$$i = 1, 2, \ldots, n, \qquad t = 1, 2, \ldots, \tau$$

where here the subscript $i$ denotes an economic agent in a cross section of $n$ observations. In the duration model framework studied in this section, the subscript $t$ does not denote the time period, but one of $\tau$ spells. $T_i(\cdot)$ is an unknown, strictly monotonic function that varies across individuals, $x_{it}$ is a $k$-dimensional vector of covariates, and $c_{it}$ is a censoring variable which is permitted to be random, and whose distribution is permitted to depend on $x_{it}$. The disturbance term $\epsilon_{it}$ is unobserved, and will be assumed to satisfy conditions which will be discussed shortly. The individual specific effect $\alpha_i$ is unobserved and, following the standard fixed effects approach, is permitted to depend on the covariates $x_{it}$ in an arbitrary way. Finally, the $k$-dimensional vector $\beta_0$ is the parameter of interest which we wish to estimate.

Following convention in the fixed-effects literature, we regard $n$ to be large and $\tau$ small, as many time-to-event panel data sets encountered in practice are characterized by a large cross section but few spells. Without loss of generality, we set $\tau = 2$, as this facilitates description of the new estimation procedure, and allow $n \to \infty$. Consequently, we wish to estimate $\beta_0$ from a random sample of the $(4 + 2k) \times 1$ vector

$$(d_{i1}, d_{i2}, v_{i1}, v_{i2}, x_{i1}', x_{i2}')'$$

As in the previous sections, we let $\theta_0$ denote the remaining components of $\beta_0$ after imposing the same scale normalization, and propose an estimator for $\theta_0$. The estimator, denoted here as $\hat\theta_p$,

and referred to hereafter as the censored duration panel (CDP) estimator, is of the form [9]:

$$\hat\theta_p = \arg\max_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} d_{i1}\, I[v_{i1} < v_{i2}]\, I[x_{i1}'\beta(\theta) < x_{i2}'\beta(\theta)] \tag{5.1}$$

where $\Theta$ denotes the parameter space and for each $\theta \in \Theta$, $\beta(\theta) \equiv (1, \theta')'$.

[9] This estimator can be related to, but is distinct from, existing panel data estimators. The indicator functions comparing values of the observed dependent variables and index values across time originate in Manski (1987), and were also used in Abrevaya (1999). Comparing these values across time in a maximum score (Manski (1975, 1985)) type setting intuitively leads to consistent estimation of $\theta_0$ for binary choice models (Manski (1987)) and transformation models with fixed censoring (Abrevaya (1999)). However, this estimation approach by itself will not consistently estimate $\theta_0$ in the presence of random covariate dependent and/or spell specific censoring, as considered here.

We establish the consistency of the estimator; this result is based on conditions which are analogous to those imposed in previous sections. To simplify notation, we will let $\Delta x_i$ denote $x_{i2} - x_{i1}$.

P1 Let $S_{X_t}$ denote the support of $x_{it}$, and let $X_{uct}$ denote the set $X_{uct} = \{x \in S_{X_t} : P(d_{it} = 1 \mid x_{it} = x) > 0\}$. Then $X_{uct}$ has positive measure.

P2 The random variables $\epsilon_{i1}, \epsilon_{i2}$ are identically distributed conditional on the vector $(c_{i1}, x_{i1}, c_{i2}, x_{i2})$, with common support equal to the real line.

P3 The first component of $\Delta x_i$ has everywhere positive Lebesgue density, conditional on the other components.

P4 The vectors $(d_{it}, v_{it}, x_{it}')'$, $i = 1, 2, \ldots, n$, are i.i.d.

P5 $\Theta$ is a compact subset of $R^{k-1}$.

P6 $S_X$, the support of $\Delta x_i$, is not contained in any proper linear subspace of $R^k$.

We can now state the theorem establishing consistency in the panel data setting. The proof is left to the appendix.

Theorem 5.1 Under Assumptions P1-P6, the CDP estimator is consistent:

$$\hat\theta_p \xrightarrow{p} \theta_0 \tag{5.2}$$

Inference on parameters requires limiting distribution theory. We note that since the CDP estimator has the same form as a maximum score estimator, the rate of convergence will be slower than the parametric rate, with a non-Gaussian limiting distribution (Kim and Pollard (1990)), making inference difficult to conduct. We also note that one can adopt a smoothed maximum score approach

(Horowitz (1992)) and attain a faster rate by imposing stronger smoothness conditions. Since this is now a standard exercise and has been applied in many settings, we omit the details here. Finally, we conjecture that the slow rate of convergence is a consequence of the generality of the model and not a deficiency of the proposed estimator. This is because there is no common element in our model which permits averaging over individuals in the cross section. Such averaging would permit attaining the parametric (root-n) rate of convergence for a regression coefficient estimator.

6 Monte Carlo Results

In this section we explore the finite sample properties of the new estimators introduced in this paper by reporting results obtained from a small scale simulation study. Our base design involves two regressors and an additive error term, which we express in the absence of censoring as:

$$T(y_i) = \alpha_0 + x_{1i}\beta_0 + x_{2i} + \epsilon_i$$

where $x_{1i}, x_{2i}$ are distributed as a chi-squared with one degree of freedom and a standard normal, respectively; $\alpha_0, \beta_0$ were set to 1 and $-1$ respectively. We considered 2 functional forms for $T(\cdot)$ and the error distribution, as follows:

1. $T^{-1}(\nu) = \nu$; $\epsilon_i$ a mixture of two normals, centered around $-1$ and 2 respectively.
2. $T^{-1}(\nu) = \exp(\nu)$; $\epsilon_i$ chi-squared, 1 degree of freedom.

We simulated four types of censoring:

1. Covariate dependent right censoring: for the exponential design, the censoring variable was distributed as 2.05 exp(x 2 1i x 2i +x 2i ) and for the linear design it was distributed as x 2 1i x 2i.
2. Covariate independent right censoring: here, for both functional forms, the censoring variable was distributed as a chi-squared random variable with one degree of freedom.
3. Double covariate independent censoring (linear transformation only): the left censoring variable was distributed as the right censoring variable minus 2 times a chi-squared with one degree of freedom, minus 2, and the right censoring variable was distributed as in 2.
4. Double covariate dependent censoring (linear transformation only): the right censoring variable was the same as in 1, and the relationship between the two censoring variables was the same as in 3.

In Tables I through VI we report results for 4 estimators: 1) the PRE; 2) the MRC; 3) the monotone rank estimator (MRE) introduced in Cavanagh and Sherman (1998); 4) the PMLE in Cox (1972, 1975). For each estimator and each design the summary statistics mean bias, median bias, root mean squared error (RMSE) and median absolute deviation (MAD) are reported for 100, 200, and 400 observations, with 401 replications. As there is only one parameter to compute, each rank estimator

was evaluated by means of a grid search of 500 evenly spaced points over the interval $[-5, 5]$. For the PMLE, the intercept and both slope coefficients were evaluated, and the tables report the ratio of the two slope coefficients. Computation of these three values was performed using QNewton in GAUSS, with 10 starting values, which included the true values, least squares estimates, and randomly generated values.

In general, the simulation results are in accordance with the theory. For covariate independent right censoring, all rank estimators perform well in the linear design, and the PRE has the smallest RMSE at all sample sizes. The PMLE performs well even though the error distribution is misspecified, although its bias and RMSE values do not decline with the sample size. In the exponential design, the only estimators that perform adequately are the PRE and the PMLE, with the PMLE again suffering from RMSE values not shrinking with the sample size. The MRC and MRE only perform adequately at a sample size of 400, which is surprising since they are both theoretically consistent for this design.

For covariate dependent right censoring, the results clearly establish the benefits of the PRE over the MRC and MRE. It performs quite well, with bias and RMSE values shrinking at the parametric rate. In complete contrast, the MRC and MRE perform very poorly for both functional forms, with RMSE values in most cases not reducing, and sometimes even increasing, with the sample size. The PMLE's inconsistency (due to the error distribution misspecification) is also apparent, though not as pronounced in the linear design at 400 observations. Its RMSE values are much larger than the PRE's for the smaller sample sizes. For the exponential design, the PMLE performs very poorly at all sample sizes.

For double covariate independent censoring, all rank estimators have RMSEs shrinking at the parametric rate, but the efficiency gains of the PRE are very apparent for both functional form/error distribution pairs. This is due to the fact that the PRE uses more information on the censoring structure than the other two estimators. The PMLE performs poorly at all sample sizes. For covariate dependent double censoring, the results are similar to the one sided covariate dependent censoring case, i.e., only the PRE exhibits root-n consistency and the others are clearly inconsistent.

In summary, the results from our simulation study indicate that the estimators introduced in this paper perform adequately in finite samples, so that they can be applied in empirical settings. The results also show how sensitive the other estimators are to model misspecification.

7 Conclusions and further extensions

In this paper, we introduced new estimators for duration models with general forms of covariate dependent censoring. The new estimators have the attractive properties of being distribution free, requiring no smoothing parameters, and being robust to censoring that depends on the regressors. The estimator is shown to converge at the parametric rate with an asymptotically normal distribution.

Extensions were provided for doubly censored, heteroskedastic, and panel data. A simulation study indicated that the estimators performed well in finite samples, and also illustrated how erroneous existing estimators can be if the censoring variable depends on covariates or the error distribution is misspecified.

The work in this paper suggests areas for future research. We provide two such examples. First, it would be useful to construct an estimator for the function $T(\cdot)$ based on $y_{0i}, y_{1i}$ that modifies the rank estimator of $T(\cdot)$ in Chen (2002) to allow for covariate dependent censoring. Second, another important area for future work would be to formally confirm the conjecture that the proposed panel data estimator attains the fastest rate of convergence possible under the assumptions of the model.

References

[1] Abbring, J.H., P.A. Chiappori, and J. Pinquet (2003), Moral Hazard and Dynamic Insurance Data, Journal of the European Economic Association, forthcoming.
[2] Abrevaya, J. (1999), Leapfrog Estimation of a Fixed-effects Model with Unknown Transformation of the Dependent Variable, Journal of Econometrics, 93.
[3] Abrevaya, J. (2000), Rank Estimation of a Generalized Fixed-effects Regression Model, Journal of Econometrics, 95.
[4] Bijwaard, G.E. (2001), Rank Estimation of Duration Models, Ph.D. Dissertation, Tinbergen Institute.
[5] Buckley, J. and I. James (1979), Linear Regression with Censored Data, Biometrika, 66.
[6] Cai, T., L.J. Wei, and M. Wilcox (2000), Semiparametric Regression Analysis for Clustered Failure Time Data, Biometrika, 87.
[7] Cavanagh, C. and R.P. Sherman (1998), Rank Estimators for Monotonic Index Models, Journal of Econometrics, 84.
[8] Chamberlain, G. (1985), Heterogeneity, Omitted Variable Bias, and Duration Dependence, in Heckman, J.J. and B. Singer, eds., Longitudinal Analysis of Labor Market Data, Cambridge: Cambridge University Press.
[9] Chen, S. (2002), Rank Estimation of Transformation Models, Econometrica, 70.
[10] Chen, J., Z. Jin, and Z. Ying (2002), Semiparametric Analysis of Transformation Models with Censored Data, Biometrika, 89.
[11] Cheng, S.C., L.J. Wei, and Z. Ying (1995), Analysis of Transformation Models with Censored Data, Biometrika, 82.
[12] Cox, D.R. (1972), Regression Models and Life Tables, Journal of the Royal Statistical Society Series B, 34.
[13] Cox, D.R. (1975), Partial Likelihood, Biometrika, 62.
[14] Cuzick, J. (1988), Rank Regression, Annals of Statistics, 16.
[15] Fine, J.P., Z. Ying, and L.J. Wei (1998), On the Linear Transformation Model with Censored Data, Biometrika, 85.
[16] Flinn, C. and J. Heckman (1983), Are Unemployment and Out of the Labor Force Behaviorally Distinct Labor Force States?, Journal of Labor Economics, 1.
[17] Gorgens, T. and J. Horowitz (1998), Semiparametric Estimation of a Censored Regression Model with an Unknown Transformation of the Dependent Variable, Journal of Econometrics, 90(2).
[18] Han, A. (1987), Non Parametric Analysis of a Generalized Regression Model, Journal of Econometrics, 35.
[19] Heckman, J.J. and G.J. Borjas (1980), Does Unemployment Cause Future Unemployment? Definitions, Questions and Answers for a Continuous Time Model of Heterogeneity and State Dependence, Economica, 47.
[20] Honoré, B.E. (1993), Identification Results for Duration Models with Multiple Spells, Review of Economic Studies, 60.
[21] Honoré, B.E., S. Khan, and J.L. Powell (2002), Quantile Regression under Random Censoring, Journal of Econometrics.
[22] Horowitz, J.L. and S. Lee (2003), Semiparametric Estimation of a Panel Data Proportional Hazards Model with Fixed Effects, Journal of Econometrics, forthcoming.
[23] Jin, Z., Z. Ying, and L.J. Wei (2002), A Simple Resampling Method by Perturbing the Minimand, Biometrika, 88.
[24] Kalbfleisch, J.D. and R.L. Prentice (1980), The Statistical Analysis of Failure Time Data, New York: Wiley.
[25] Kaplan, E.L. and P. Meier (1958), Nonparametric Estimation from Incomplete Data, Journal of the American Statistical Association, 53.
[26] Koul, H., V. Susarla, and J. Van Ryzin (1981), Regression Analysis with Randomly Right Censored Data, Annals of Statistics, 9.
[27] Lancaster, T. (2000), The Incidental Parameter Problem since 1948, Journal of Econometrics, 95.
[28] Lee, S. (2003), Estimating Panel Data Duration Models with Censored Data, manuscript, University College London.
[29] Manski, C.F. (1975), Maximum Score Estimation of the Stochastic Utility Model of Choice, Journal of Econometrics, 3.
[30] Manski, C.F. (1985), Semiparametric Analysis of Discrete Response: Asymptotic Properties of Maximum Score Estimation, Journal of Econometrics, 27.
[31] Newey, W.K. and D. McFadden (1994), Estimation and Hypothesis Testing in Large Samples, in Engle, R.F. and D. McFadden, eds., Handbook of Econometrics, Vol. 4, Amsterdam: North-Holland.
[32] Newman, J.L. and C.E. McCulloch (1984), A Hazard Rate Approach to the Timing of Births, Econometrica, 52.
[33] Pakes, A. and D. Pollard (1989), Simulation and the Asymptotics of Optimization Estimators, Econometrica, 57.
[34] Portnoy, S. (2003), Censored Regression Quantiles, Journal of the American Statistical Association, 98.
[35] Ridder, G. (1990), The Non-parametric Identification of Generalized Accelerated Failure-time Models, Review of Economic Studies, 57.
[36] Ridder, G. and I. Tunalı (1999), Stratified Partial Likelihood Estimation, Journal of Econometrics, 92.
[37] Sherman, R.P. (1993), The Limiting Distribution of the Maximum Rank Correlation Estimator, Econometrica, 61.
[38] Sherman, R.P. (1994a), U-Processes in the Analysis of a Generalized Semiparametric Regression Estimator, Econometric Theory, 10.
[39] Sherman, R.P. (1994b), Maximal Inequalities for Degenerate U-Processes with Applications to Optimization Estimators, Annals of Statistics, 22.
[40] Tsiatis, A.A. (1990), Estimating Regression Parameters Using Linear Rank Tests for Censored Data, Annals of Statistics, 18.
[41] Van den Berg, G.J. (2001), Duration Models: Specification, Identification and Multiple Durations, in Heckman, J.J. and E. Leamer, eds., Handbook of Econometrics, Vol. 5, Amsterdam: North-Holland.
[42] Wei, L.J., Z. Ying, and D.Y. Lin (1990), Linear Regression Analysis of Censored Survival Data Based on Rank Tests, Biometrika, 19.
[43] Wang, J.-G. (1987), A Note on the Uniform Consistency of the Kaplan-Meier Estimator, Annals of Statistics, 15.
[44] Yang, S. (1999), Censored Median Regression Using Weighted Empirical Survival and Hazard Functions, Journal of the American Statistical Association, 94.
[45] Ying, Z., S.H. Jung, and L.J. Wei (1995), Survival Analysis with Median Regression Models, Journal of the American Statistical Association, 90.

A Appendix

A.1 Proof of Lemma 2.1

We need to show that

$$\Pr[y_{1i} \le y_{0j} \mid x_i, x_j] \ge \Pr[y_{1j} \le y_{0i} \mid x_i, x_j] \quad \text{iff} \quad x_i'\beta_0 \le x_j'\beta_0 \qquad (A.1)$$

which is equivalent to showing that

$$P(y_{0i} \ge y_{1j} \mid x_i, x_j) \ge P(y_{0j} \ge y_{1i} \mid x_i, x_j) \quad \text{iff} \quad x_i'\beta_0 \ge x_j'\beta_0 \qquad (A.2)$$

For notational convenience, we let $z_i, z_j$ denote $x_i'\beta_0, x_j'\beta_0$ respectively. We first evaluate

$$P(y_{0i} \ge y_{1j}) \qquad (A.3)$$

where we condition on $x_i, x_j$. This probability can be decomposed into the mutually exclusive cases $c_i > c_j$ and $c_i \le c_j$. We first focus on the case where $c_i > c_j$, and evaluate the probability conditional on the censoring values $c_i, c_j$. Note the probability of (A.3) is zero whenever $d_j = 0$, so we can decompose (A.3) as

$$P(y_{0i} \ge y_{1j}) = P(y_{0i} \ge y_{1j}, d_i = 1, d_j = 1) + P(y_{0i} \ge y_{1j}, d_i = 0, d_j = 1) \qquad (A.4)$$

We derive an expression for the first term, which we write here as:

$$P(\epsilon_i \ge \epsilon_j - \Delta z,\ \epsilon_i \le c_i - z_i,\ \epsilon_j \le c_j - z_j)$$

where here, $\Delta z \equiv z_i - z_j$. Recall that we are assuming for now that $c_i > c_j$, so by the independence assumption, we express the above probability as:

$$\int_{-\infty}^{c_j - z_j} \int_{\epsilon_j - \Delta z}^{c_i - z_i} dF(\epsilon_i)\, dF(\epsilon_j) \qquad (A.5)$$

where $F(\cdot)$ denotes the c.d.f. of $\epsilon_i$ and $\epsilon_j$. So (A.5) is

$$F(c_i - z_i) F(c_j - z_j) - \int_{-\infty}^{c_j - z_j} F(\epsilon_j - \Delta z)\, dF(\epsilon_j) \qquad (A.6)$$

Now, turning attention to the second term in (A.4), we express it as:

$$P(\epsilon_j \le c_j - z_j,\ \epsilon_i \ge c_i - z_i,\ \epsilon_j \le c_i - z_j) = P(\epsilon_i \ge c_i - z_i,\ \epsilon_j \le c_j - z_j) \qquad (A.7)$$

where the equality follows from $c_i > c_j$. This is equal to

$$(1 - F(c_i - z_i))\, F(c_j - z_j) \qquad (A.8)$$

Thus we have that, conditioning on $x_i, x_j$, and $c_i > c_j$, (A.4) can be expressed as:

$$\int_{-\infty}^{c_j - z_j} S(\epsilon_j - \Delta z)\, dF(\epsilon_j) \qquad (A.9)$$

where here $S(\cdot) = 1 - F(\cdot)$. We next evaluate $P(y_{0j} \ge y_{1i})$, again conditioning on $x_i, x_j$, $c_i > c_j$. A similar decomposition yields:

$$P(y_{0j} \ge y_{1i}, d_i = 1, d_j = 1) + P(y_{0j} \ge y_{1i}, d_i = 1, d_j = 0) \qquad (A.10)$$

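The first term is:

$$P(\epsilon_i \le \epsilon_j - \Delta z,\ \epsilon_i \le c_i - z_i,\ \epsilon_j \le c_j - z_j) \qquad (A.11)$$

(recall $\Delta z \equiv z_i - z_j$), which we can decompose into the sum of

$$P(\epsilon_i \le \epsilon_j - \Delta z,\ \epsilon_i \le c_j - z_i,\ \epsilon_j \le c_j - z_j) + P(\epsilon_i \le \epsilon_j - \Delta z,\ c_j - z_i \le \epsilon_i \le c_i - z_i,\ \epsilon_j \le c_j - z_j) \qquad (A.12)$$

Note the second term is 0 (since $\epsilon_i \le \epsilon_j - \Delta z$ and $\epsilon_j \le c_j - z_j$ together imply $\epsilon_i \le c_j - z_i$, which contradicts the middle event outside a set of probability zero), and the first term is:

$$\int_{-\infty}^{c_j - z_i} \int_{\epsilon_i + \Delta z}^{c_j - z_j} dF(\epsilon_j)\, dF(\epsilon_i) \qquad (A.13)$$

which is equal to

$$F(c_j - z_j) F(c_j - z_i) - \int_{-\infty}^{c_j - z_i} F(\epsilon_i + \Delta z)\, dF(\epsilon_i)$$

The second term in (A.10) is

$$P(c_j \ge \epsilon_i + z_i,\ \epsilon_j \ge c_j - z_j,\ \epsilon_i \le c_i - z_i) = P(\epsilon_j \ge c_j - z_j,\ \epsilon_i \le c_j - z_i) \qquad (A.14)$$

where the equality follows from $c_i > c_j$. This can be expressed as:

$$(1 - F(c_j - z_j))\, F(c_j - z_i) \qquad (A.15)$$

Therefore, (A.10) is $\int_{-\infty}^{c_j - z_i} S(\epsilon_i + \Delta z)\, dF(\epsilon_i)$, and the difference between (A.4) and (A.10) is

$$\int_{-\infty}^{c_j - z_j} S(\epsilon - \Delta z)\, dF(\epsilon) - \int_{-\infty}^{c_j - z_i} S(\epsilon + \Delta z)\, dF(\epsilon) \qquad (A.16)$$

We note that the above difference has the same sign as $z_i - z_j$, as the integrand is strictly larger for the first term if and only if $z_i > z_j$, as is the range of integration. This establishes the identification result for the case where $c_i > c_j$. For $c_i \le c_j$, using analogous arguments, we find that

$$P(y_{0i} \ge y_{1j}) = \int_{-\infty}^{c_i - z_j} S(\epsilon_i - \Delta z)\, dF(\epsilon_i) \qquad (A.17)$$

and

$$P(y_{0j} \ge y_{1i}) = \int_{-\infty}^{c_i - z_i} S(\epsilon_i + \Delta z)\, dF(\epsilon_i) \qquad (A.18)$$

and the difference in the two integrals has the same sign as $z_i - z_j$ here as well. Thus after integrating over $c_i, c_j$ we can conclude that $P(y_{0i} \ge y_{1j} \mid x_i, x_j) - P(y_{0j} \ge y_{1i} \mid x_i, x_j)$ has the same sign as $x_i'\beta_0 - x_j'\beta_0$.

A.2 Proof of Theorem 2.1

To show consistency it suffices to verify four conditions (see e.g. Newey and McFadden (1994), Theorem 2.1): compactness, uniform convergence, continuity, and identification.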
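The sign result of Lemma 2.1 can also be checked by simulation. The Python sketch below is an illustration only, not part of the paper: it fixes two index values $z_i, z_j$, draws errors and censoring values independently, and estimates $P(y_{0i} \ge y_{1j}) - P(y_{0j} \ge y_{1i})$, which the proof in A.1 shows has the same sign as $z_i - z_j$. It assumes the Section 2 definitions are $y_0 = v$ and $y_1 = v$ if uncensored and $+\infty$ otherwise (consistent with the remark that (A.3) is zero whenever $d_j = 0$). The function name `lemma_gap`, the normal error and censoring distributions, and the number of draws are hypothetical choices.

```python
import numpy as np

def lemma_gap(z_i, z_j, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(y0i >= y1j) - P(y0j >= y1i) for fixed z_i, z_j."""
    rng = np.random.default_rng(seed)
    eps_i = rng.standard_normal(n_draws)
    eps_j = rng.standard_normal(n_draws)
    # Censoring values are independent of the errors. Different locations for
    # i and j mimic censoring that depends on covariates (illustrative choice).
    c_i = 1.5 + rng.standard_normal(n_draws)
    c_j = 0.5 + rng.standard_normal(n_draws)
    t_i, t_j = z_i + eps_i, z_j + eps_j            # latent transformed durations
    v_i, v_j = np.minimum(t_i, c_i), np.minimum(t_j, c_j)
    d_i, d_j = t_i <= c_i, t_j <= c_j              # True when uncensored
    # y1 is +inf when censored, so {y0i >= y1j} = {d_j = 1 and v_i >= v_j}
    p_first = np.mean(d_j & (v_i >= v_j))          # P(y0i >= y1j)
    p_second = np.mean(d_i & (v_j >= v_i))         # P(y0j >= y1i)
    return p_first - p_second
```

Under these assumptions, `lemma_gap(1.0, 0.0)` should be positive, `lemma_gap(0.0, 1.0)` negative, and `lemma_gap(0.0, 0.0)` close to zero, even though the two censoring distributions differ.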

We first turn attention to identification, which will be shown to follow from Lemma 2.1. The sample objective function (2.7) can be written as

$$\frac{1}{n(n-1)} \sum_{i \ne j} I[y_{1i} \le y_{0j}]\, I[x_i'\beta \le x_j'\beta] = \frac{1}{n(n-1)} \sum_{i<j} \left( I[y_{1i} \le y_{0j}]\, I[x_i'\beta \le x_j'\beta] + I[y_{1j} \le y_{0i}]\, I[x_j'\beta \le x_i'\beta] \right)$$

Then, let $Q(\beta)$ denote the limiting objective function:

$$Q(\beta) = E_X\left[ P(y_{1i} \le y_{0j} \mid x_i, x_j)\, I[x_i'\beta \le x_j'\beta] + P(y_{1j} \le y_{0i} \mid x_i, x_j)\, I[x_j'\beta \le x_i'\beta] \right]$$

where $E_X[\cdot]$ denotes the expectation over $x_i, x_j$. We need to show that this is uniquely maximized at $\beta_0$. We have (suppressing the conditioning on $x_i$ and $x_j$):

$$Q(\beta_0) - Q(\beta) = E_X\left[ (P(y_{1i} \le y_{0j}) - P(y_{1j} \le y_{0i})) \left( I[x_i'\beta_0 \le x_j'\beta_0] - I[x_i'\beta \le x_j'\beta] \right) \right] \qquad (A.19)$$

$$= E_X\left[ (P(y_{1i} \le y_{0j}) - P(y_{1j} \le y_{0i})) \left( I[\Delta x'\beta_0 \ge 0 > \Delta x'\beta] - I[\Delta x'\beta \ge 0 > \Delta x'\beta_0] \right) \right]$$

where here $\Delta x \equiv x_j - x_i$. By the previous lemma, the above expectation is non-negative, and trivially equal to 0 when $\beta = \beta_0$ by Assumption I3. We show that for $\beta \ne \beta_0$, the above expectation is strictly positive. Note that $\beta \ne \beta_0$ corresponds to $\theta \ne \theta_0$, since $\beta$ and $\beta_0$ have the same first component of 1. It follows from Assumption I6 that with positive probability, $\Delta x_{(-1)}'\theta \ne \Delta x_{(-1)}'\theta_0$, where here $\Delta x_{(-1)}$ denotes the difference in the $(k-1)$-dimensional vector corresponding to the last $k-1$ components of the regressor vector. By Assumption I3, we can find a subset of $\mathcal{X}_{uc} \times \mathcal{X}_{uc}$ where $I[x_i'\beta_0 > x_j'\beta_0,\ x_i'\beta < x_j'\beta] = 1$ or $I[x_i'\beta_0 < x_j'\beta_0,\ x_i'\beta > x_j'\beta] = 1$, and this subset has positive probability. But from Lemma 2.1, on this subset $P(y_{1i} \le y_{0j} \mid x_i, x_j) - P(y_{1j} \le y_{0i} \mid x_i, x_j)$ is nonzero and has the same sign as the difference of indicators in (A.19), so (A.19) is strictly positive, establishing that the limiting objective function is uniquely maximized at $\beta_0$ and proving identification.

Turning attention to the other three items, we note that compactness holds by Assumption I5. Regarding uniform convergence, we need to show

$$\sup_{\theta \in \Theta} \left| Q_n(\beta) - Q(\beta) \right| \xrightarrow{p} 0 \qquad (A.20)$$

where $Q_n(\beta)$ is the sample objective function defined in (2.9).
(A.20) follows from uniform laws of large numbers for U-statistics with bounded kernel functions satisfying a Euclidean property. This property (with the constant envelope 1) is shown below, so we can apply Corollary 7 in Sherman (1994) to establish (A.20). The continuity condition, that $Q(\beta)$ is continuous at $\beta = \beta_0$, follows from the smoothness of the density of $x_i'\beta_0$, which follows from I3. This establishes consistency.

A.3 Proof of Theorem 2.2

We note that arguments virtually identical to those in Sherman (1993) can be used, as the objective functions of the MRC and the PRE are very similar. The only component of the proof there that does not immediately carry over to the problem at hand is establishing the Euclidean property of the class of functions in the objective function. For the problem at hand, we consider the class of functions:

$$\mathcal{F} = \{ f(\cdot, \cdot, \theta) : \theta \in \Theta \} \qquad (A.21)$$

where for each $(z_1, z_2) \in S \times S$, $\theta \in \Theta$, we can define

$$f(z_1, z_2, \theta) = I[y_{01} \ge y_{12}]\, I[x_1'\beta \ge x_2'\beta] \qquad (A.22)$$

$$= (d_2\, I[v_1 \ge v_2])\, I[x_1'\beta \ge x_2'\beta] \qquad (A.23)$$
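A kernel of the form in (A.22)-(A.23) is simple to evaluate over all ordered pairs of observations. The Python sketch below is an illustration rather than the authors' implementation (the paper reports using GAUSS): it computes a pairwise objective of the form $d_i\, I[v_i < v_j]\, I[x_i'\beta < x_j'\beta]$, mirroring the form of (5.1), and maximizes it over an evenly spaced grid as in the grid search described in Section 6. Strict inequalities are used, which differ from weak ones only on ties. The function names, the ordering $\beta(\theta) = (\theta, 1)'$ for a two-regressor design, and the simulated design in `simulate` (standard normal errors and a censoring rule depending on the first regressor) are assumptions made for this example.

```python
import numpy as np

def pre_objective(beta, d, v, x):
    """Average over ordered pairs i != j of d_i * 1[v_i < v_j] * 1[x_i'b < x_j'b]."""
    n = len(v)
    idx = x @ beta
    v_less = v[:, None] < v[None, :]          # entry (i, j): v_i < v_j
    idx_less = idx[:, None] < idx[None, :]    # entry (i, j): x_i'b < x_j'b
    # Strict inequalities make the diagonal (i == j) contribute zero.
    total = np.sum(d[:, None] * v_less * idx_less)
    return total / (n * (n - 1))

def grid_search(d, v, x, grid):
    """Maximize over theta on a grid, with beta(theta) = (theta, 1) (assumed ordering)."""
    values = np.array([pre_objective(np.array([t, 1.0]), d, v, x) for t in grid])
    # The objective is a step function; argmax returns the first maximizer.
    return grid[int(np.argmax(values))], values

def simulate(n, theta0=-1.0, seed=0):
    """Toy design (assumed): linear transformation, normal errors,
    right censoring that depends on the first regressor."""
    rng = np.random.default_rng(seed)
    x = np.column_stack([rng.chisquare(1, n), rng.standard_normal(n)])
    y = 1.0 + theta0 * x[:, 0] + x[:, 1] + rng.standard_normal(n)
    c = 1.0 + x[:, 0] + rng.standard_normal(n)   # covariate dependent censoring
    v = np.minimum(y, c)                          # observed outcome
    d = (y <= c).astype(float)                    # 1 if uncensored
    return d, v, x
```

For instance, `grid_search(*simulate(400), np.linspace(-5, 5, 500))` returns the grid maximizer together with the objective values at the 500 grid points. Each evaluation costs $O(n^2)$ operations, which is manageable at the sample sizes considered in Section 6.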


More information
