Nonparametric estimation for current status data with competing risks
Nonparametric estimation for current status data with competing risks

Marloes Henriëtte Maathuis

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

University of Washington

2006

Program Authorized to Offer Degree: Statistics
University of Washington Graduate School

This is to certify that I have examined this copy of a doctoral dissertation by Marloes Henriëtte Maathuis and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made.

Co-Chairs of the Supervisory Committee: Piet Groeneboom, Jon A. Wellner

Reading Committee: Piet Groeneboom, Michael G. Hudgens, Jon A. Wellner

Date:
In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this dissertation is allowable only for scholarly purposes, consistent with fair use as prescribed in the U.S. Copyright Law. Requests for copying or reproduction of this dissertation may be referred to Proquest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI , , to whom the author has granted the right to reproduce and sell (a) copies of the manuscript in microform and/or (b) printed copies of the manuscript made from microform.

Signature

Date
University of Washington

Abstract

Nonparametric estimation for current status data with competing risks

Marloes Henriëtte Maathuis

Co-Chairs of the Supervisory Committee:
Professor Piet Groeneboom, Statistics
Professor Jon A. Wellner, Statistics

We study current status data with competing risks. Such data arise naturally in cross-sectional survival studies with several failure causes. Moreover, generalizations of these data arise in HIV vaccine clinical trials. The general framework is as follows. We analyze a system that can fail from $K$ competing risks, where $K \in \mathbb{N}$ is fixed. The random variables of interest are $(X, Y)$, where $X \in \mathbb{R}_+ = (0, \infty)$ is the failure time of the system, and $Y \in \{1, \dots, K\}$ is the corresponding failure cause. However, we cannot observe $(X, Y)$ directly. Rather, we observe the current status of the system at a single random observation time $T \in \mathbb{R}_+$, where $T$ is independent of $(X, Y)$. This means that at time $T$, we observe whether or not failure occurred, and if and only if failure occurred, we also observe the failure cause $Y$. We study nonparametric estimation of the sub-distribution functions $F_{0k}(t) = P(X \le t, Y = k)$, $k = 1, \dots, K$, $t \in \mathbb{R}_+$. We focus on two estimators: the nonparametric maximum likelihood estimator (MLE) and the naive estimator introduced by Jewell, Van der Laan and Henneman (2003). Our main interest is in asymptotic properties of the MLE, and the naive estimator is considered for comparison.
Until now, the asymptotic properties of the MLE have been largely unknown. We resolve this issue by proving its consistency, $n^{1/3}$-rate of convergence, and limiting distribution. The limiting distribution involves a new self-induced limiting process, consisting of the convex minorants of $K$ correlated two-sided Brownian motion processes plus parabolic drifts, plus an additional term involving the difference between the sum of the $K$ drifting Brownian motions and their convex minorants. Various other aspects that we consider include characterizations of the estimators, uniqueness, graph theory, and computational algorithms. Furthermore, we show that both the MLE and the naive estimator are asymptotically efficient for a family of smooth functionals, with $\sqrt{n}$-rate convergence to a normal limit. Finally, we study an extension of the model, where $X$ is subject to interval censoring and $Y$ is a continuous random variable. We show that the MLE is typically inconsistent in this model, and propose a simple method to repair this inconsistency.
TABLE OF CONTENTS

List of Figures
List of Tables

Chapter 1: Introduction
    Motivation and problem description
    Overview of previous work
    Overview of new results and outline of this thesis

Chapter 2: The estimators
    Definition of the estimators
    Censored data perspective
    Graph theory and uniqueness
    Characterizations

Chapter 3: Computation
    Reduction and optimization
    Iterative convex minorant algorithms

Chapter 4: Consistency
    Hellinger consistency
    Local and uniform consistency

Chapter 5: Rate of convergence
    Hellinger rate of convergence
    Asymptotic local minimax lower bound
    Local rate of convergence
    Technical lemmas and proofs

Chapter 6: Limiting distribution
    The limiting distribution of the naive estimator
    The limiting distribution of the MLE
    Technical lemmas and proofs

Chapter 7: A family of smooth functionals
    Information bound calculations
    Asymptotic normality of functionals of the MLE

Chapter 8: Examples
    Menopause data
    Simulations

Chapter 9: An extension: interval censored continuous mark data
    The model and an explicit formula for the MLE
    Inconsistency of the MLE
    Repaired MLE via discretization of marks
    Examples
LIST OF FIGURES

The estimators: Graphical representation of the observed data
Graph theory: Intersection graph for the MLE
Convex minorant characterizations: Plots for the data in Table
Asymptotic local minimax lower bound: The perturbation $F_{nk}$
Local rate: Plot of $v_n(t)$ for various values of $\beta$
Local rate: Example clarifying the proof of Lemma
Limiting distribution: Processes for the naive estimator at $t_0 =$
Limiting distribution: Processes for the naive estimator at $t_0 =$
Limiting distribution: Processes for the MLE at $t_0 =$
Limiting distribution: Processes for the MLE at $t_0 =$
Limiting distribution: Comparison of limiting processes at $t_0 =$
Limiting distribution: Comparison of limiting processes at $t_0 =$
Menopause data: Question of the Health Examination Study
Menopause data: The MLE and the naive estimator
Simulations: The true underlying sub-distribution functions
Simulations: The estimators in a single simulation
Simulations: Pointwise bias
Simulations: Pointwise variance
Simulations: Pointwise mean squared error
Simulations: Pointwise relative efficiency
Simulations: Smooth functionals of the MLE for $t_0 =$
Simulations: Smooth functionals of the naive estimator for $t_0 =$
Simulations: Smooth functionals of the MLE for $t_0 =$
Simulations: Smooth functionals of the naive estimator for $t_0 =$
Continuous mark data: Contour lines for estimates of $F_0(x, y)$
Continuous mark data: Estimates of $F_{0X}(x)$
Continuous mark data: Estimates of $F_0(x_0, y)$
LIST OF TABLES

Censored data perspective: Example data
Censored data perspective: Estimators for the data in Table
Graph theory: Example data
Graph theory: Clique matrix for the data in Table
Convex minorant characterizations: Example data
Simulations: Pointwise bias, variance and MSE at $t =$
Continuous mark data: Summary of the examples
ACKNOWLEDGMENTS

I sincerely thank my advisors, Piet Groeneboom and Jon Wellner, for their mentorship over the past years. Their knowledge, guidance, inspiration and encouragement have been very important to me. I thank Peter Gilbert, Tilmann Gneiting, Peter Hoff and Michael Hudgens for serving on my committee, with special thanks to Michael for suggesting this research problem. I thank Bernard Deconinck for serving as the graduate school representative. I am grateful to the faculty, staff and students in our department for providing a stimulating and supportive research environment. In particular, I thank Fadoua Balabdaoui, Moulinath Banerjee and Hanna Jankowski for helpful discussions. Finally, I want to express my deep gratitude to Steven, my parents, my family and my friends, for their continuous support.
Chapter 1

INTRODUCTION

1.1 Motivation and problem description

The work in this thesis is motivated by recent clinical trials of candidate vaccines against HIV/AIDS. The main purpose of such trials is to determine the overall efficacy of a candidate vaccine. Like many viruses, HIV exhibits significant genotypic and phenotypic variation, so that it can be divided into several subtypes. Therefore, it is also of interest to determine the efficacy of a vaccine against each subtype of the virus. Establishing vaccine efficacy for certain subtypes can warrant vaccination of populations in which the given subtypes are highly prevalent. Furthermore, establishing that the vaccine is efficacious for some subtypes, but not for others, gives important information for possible improvements of the vaccine.

Thus, the variables of interest are the time of infection and the subtype of the infecting virus. These variables cannot be observed directly, because participants in a trial are only tested for the virus at several follow-up times. Since each test indicates whether or not infection happened before the time of the test, the time of infection is interval censored, i.e., only known to lie within a time interval determined by the follow-up times. Since simultaneous infections with several subtypes of a virus are rare, the subtypes are often analyzed as competing risks (see, e.g., Hudgens, Satten and Longini (2001)). Hence, these trials yield interval censored survival data with competing risks.

In this thesis, we analyze current status data with competing risks. Current status censoring is the simplest form of interval censoring, where there is exactly one
observation time for each subject. We study these data for two reasons. First, such data arise naturally in cross-sectional studies with several failure causes. Second, understanding current status data with competing risks is a first step towards understanding the more complicated interval censored data with competing risks that arise in vaccine clinical trials.

We consider the following general framework. We analyze a system that can fail from $K$ competing risks, where $K \in \mathbb{N}$ is fixed. The random variables of interest are $(X, Y)$, where $X \in \mathbb{R}_+ = (0, \infty)$ is the failure time of the system, and $Y \in \{1, \dots, K\}$ is the corresponding failure cause. Due to censoring, we cannot observe $(X, Y)$ directly. Rather, we observe the current status of the system at a single random observation time $T \in \mathbb{R}_+$, where $T$ is independent of $(X, Y)$. Thus, at time $T$ we observe whether or not failure occurred, and if and only if failure occurred, we also observe the failure cause $Y$.

Examples that fit into this framework can be found in reliability and survival analysis. For an example, see the menopause data analyzed by Krailo and Pike (1983), where $X$ is the age at menopause, $Y$ is the cause of menopause (natural or operative), and $T$ is the age at the time of the survey. In cross-sectional HIV studies we think of $X$ as the time of HIV infection, $Y$ as the subtype of the infecting HIV virus, and $T$ as the time of the HIV test. Note that one is free to choose the origin of the time scale. Common choices include the date of birth and the beginning of the study.

Given current status data with competing risks, we consider nonparametric estimation of the sub-distribution functions $F_{0k}(t) = P(X \le t, Y = k)$, $k = 1, \dots, K$. This problem, or close variants thereof, has been studied by Hudgens, Satten and Longini (2001), Jewell, Van der Laan and Henneman (2003), and Jewell and Kalbfleisch (2004). However, there are still many open problems.
In particular, until now, the asymptotic properties of the nonparametric maximum likelihood estimator (MLE) have been largely unknown. In this thesis, we resolve this problem. We prove consistency of the MLE, and derive its rate of convergence and limiting distribution. These asymptotic results form an important step towards making inference about the sub-distribution functions.

The outline of the remainder of this chapter is as follows. In Section 1.2 we give an overview of previous work in this area. In Section 1.3 we give an outline of this thesis, together with a discussion of our main results.

1.2 Overview of previous work

Hudgens, Satten and Longini (2001) study competing risks data subject to interval censoring and truncation. They derive the nonparametric maximum likelihood estimator (MLE) and provide an EM algorithm for its computation. They also introduce an alternative pseudo-likelihood estimator. They apply their methods to data from a cohort of injecting drug users in Thailand, where the event of interest is infection with HIV-1, and the competing risks are HIV-1 subtypes B and E.

Jewell, Van der Laan and Henneman (2003) study current status data with competing risks. They consider some simple parametric models, some ad-hoc nonparametric estimators, and the MLE. They compare these estimators in a simulation study. Furthermore, they apply their methods to data analyzed by Krailo and Pike (1983), where the event of interest is menopause and the competing risks are natural and operative menopause. Finally, the authors discuss results suggesting that the simple ad-hoc estimators might yield fully efficient estimators for smooth functionals of the sub-distribution functions.

Jewell and Kalbfleisch (2004) study maximum likelihood estimation of a series of ordered multinomial parameters. Current status data with competing risks can be viewed as a special case of this setting. The authors focus on the computation of the MLE, and introduce an iterative version of the Pool Adjacent Violators Algorithm.
1.3 Overview of new results and outline of this thesis

We focus on the following two nonparametric estimators for the sub-distribution functions: the MLE $\hat F_n = (\hat F_{n1}, \dots, \hat F_{nK})$, and the naive estimator $\tilde F_n = (\tilde F_{n1}, \dots, \tilde F_{nK})$ introduced by Jewell, Van der Laan and Henneman (2003). (Throughout, the subscript $n$ denotes the sample size.) Our main interest is in asymptotic properties of the MLE, and the naive estimator is considered for comparison.

In Chapter 2 we define the estimators, and discuss the relationship between them. We show that both the MLE and the naive estimator can be viewed as maximum likelihood estimators for censored data. This observation is useful, because it allows us to use readily available theory and computational algorithms. In particular, the naive estimator can be viewed as the maximum likelihood estimator for reduced univariate current status data. Hence, many properties of the naive estimator follow straightforwardly from known results on current status data. The censored data perspective also allows us to use graph theory to study uniqueness properties of the estimators. Finally, we characterize the estimators in terms of necessary and sufficient conditions, in the form of Fenchel characterizations and (self-induced) convex minorant characterizations. These characterizations play a key role in the development of the asymptotic theory, and also lead to computational algorithms.

Computational aspects of the MLE are discussed in Chapter 3. Since there are no explicit formulas available for the MLE, we compute the MLE with an iterative algorithm. We discuss two classes of algorithms and the connections between them. The first class is based on sequential quadratic programming, where each quadratic programming problem is solved using a support reduction algorithm. The second class consists of iterative convex minorant algorithms. We prove convergence of algorithms in both classes. Furthermore, we show that one particular iterative convex minorant algorithm can be viewed as a sequential quadratic programming method that only uses the diagonal elements of the Hessian matrix.

In Chapter 4 we discuss consistency of the estimators. We prove that both estimators are Hellinger consistent, and we use this to derive various forms of local and uniform consistency.

The rate of convergence is discussed in Chapter 5. The Hellinger rate of convergence and the local rate of convergence of the naive estimator are $n^{1/3}$. This follows from known results on current status data without competing risks. For the MLE, we prove that the Hellinger rate of convergence is $n^{1/3}$. Next, we derive a local asymptotic minimax lower bound of $n^{1/3}$, meaning that no estimator can have a better local rate of convergence than $n^{1/3}$, in a minimax sense. We proceed by proving that the local rate of convergence of the MLE is $n^{1/3}$. This result comes as no surprise given the local asymptotic minimax lower bound and the local rate of convergence of the naive estimator. However, the proof of this result turned out to be rather involved, and required new methods. The key idea is to first establish a rate result for $\sum_{k=1}^K \hat F_{nk}$ that holds uniformly on a fixed neighborhood around a point $t_0$, instead of on the usual shrinking neighborhood of order $O(n^{-1/3})$.

In Chapter 6 we discuss the limiting distribution of the estimators. The limiting distribution of the naive estimator is given by the slopes of the convex minorants of $K$ correlated two-sided Brownian motion processes plus parabolic drifts. The limiting distribution of the MLE involves a new self-induced limiting process, consisting of the convex minorants of $K$ correlated two-sided Brownian motion processes plus parabolic drifts, plus an additional term involving the difference between the sum of the $K$ drifting Brownian motion processes and their convex minorants.

In Chapter 7 we consider estimation of smooth functionals. Jewell, Van der Laan and Henneman (2003) suggested that the naive estimator yields asymptotically efficient smooth functionals.
We show that this is indeed the case, and that the same holds for the MLE. In Chapter 8 we apply our methods to real and simulated data. We compare
the MLE and the naive estimator in a simulation study, considering both pointwise estimation and the estimation of smooth functionals. For pointwise estimation, we show that the MLE is superior to the naive estimator in terms of mean squared error, both for small and large sample sizes. For the estimation of smooth functionals, we show that the behavior of the MLE and the naive estimator is similar, and in agreement with the results in Chapter 7.

Finally, in Chapter 9 we consider an extension of the model, where $X$ is subject to interval censoring case $k$, and $Y$ is a continuous random variable. This model is referred to as the interval censored continuous mark model. It is applicable to HIV vaccine clinical trials by letting $X$ be the time of HIV infection, and $Y$ be the viral distance between the infecting HIV virus and the virus present in the vaccine. We derive the limit of the MLE in this model, and show that the MLE is inconsistent in general. We also suggest a simple method for repairing the MLE by discretizing $Y$, an operation that transforms the data to interval censored data with competing risks. We illustrate the behavior of the MLE and the repaired MLE in four examples.
Chapter 2

THE ESTIMATORS

In this chapter we study finite sample properties of the MLE and the naive estimator. In Section 2.1 we formally define the model and the estimators. Since both estimators can be viewed as maximum likelihood estimators for censored data, Section 2.2 provides a general discussion on the MLE for censored data. In Section 2.3 we use a graph theoretic perspective to derive properties of the estimators. Finally, in Section 2.4, we characterize the estimators in terms of necessary and sufficient Fenchel and convex minorant conditions.

2.1 Definition of the estimators

Before we define the MLE and the naive estimator, we introduce some assumptions and notation. Recall that $K \in \mathbb{N}$ denotes the number of competing risks. The variables of interest are $(X, Y)$, where $X \in \mathbb{R}_+$ is the failure time of a system, and $Y \in \{1, \dots, K\}$ is the corresponding failure cause. We do not observe $(X, Y)$ directly. Rather, we observe the system at a random observation time $T \in \mathbb{R}_+$. At this time, we observe whether or not failure occurred, and if and only if failure occurred, we also observe the failure cause $Y$. Our goal is nonparametric estimation of the bivariate distribution function of $(X, Y)$, or equivalently, of the vector of sub-distribution functions $F_0 = (F_{01}, \dots, F_{0K})$, where $F_{0k}(t) = P(X \le t, Y = k)$, $k = 1, \dots, K$. We make the following assumptions:
(a) $T$ is independent of $(X, Y)$;
(b) The system cannot fail from two or more causes at the same time.

Assumption (a) is essential for the development of the theory, and is used in the definitions of the MLE and the naive estimator below. Assumption (b) ensures that the failure cause is well defined. This assumption can always be satisfied by defining simultaneous failure from several causes as a new failure cause. We do not make any other assumptions. In particular, we do not require that all observation times are distinct.

Notation

We denote the observed data by $Z = (T, \Delta)$, where $\Delta = (\Delta_1, \dots, \Delta_{K+1})$ and

$$\Delta_k = 1\{X \le T, Y = k\}, \quad k = 1, \dots, K, \tag{2.1}$$

$$\Delta_{K+1} = 1\{X > T\}. \tag{2.2}$$

Thus, for $k = 1, \dots, K$, $\Delta_k = 1$ if and only if failure happened by time $T$ and was due to cause $k$. Furthermore, $\Delta_{K+1} = 1$ if and only if failure did not happen by time $T$. Note that $\sum_{k=1}^{K+1} \Delta_k = 1$, and hence $\Delta_{K+1} = 1 - \sum_{k=1}^{K} \Delta_k$. A graphical representation of the observed data is given in Figure 2.1.

Let $Z_1, \dots, Z_n$ be $n$ i.i.d. observations of $Z$, where $Z_i = (T_i, \Delta_i)$ and $\Delta_i = (\Delta_{i1}, \dots, \Delta_{i,K+1})$. We call an observation $Z_i$ right censored if $\Delta_{i,K+1} = 1$, and left censored otherwise. Let $T_{(1)}, \dots, T_{(n)}$ be the order statistics of $T_1, \dots, T_n$, where ties are broken arbitrarily after ensuring that left censored observations are ordered before right censored observations. We denote the corresponding $\Delta$-vectors by $\Delta_{(1)}, \dots, \Delta_{(n)}$, where $\Delta_{(i)} = (\Delta_{(i)1}, \dots, \Delta_{(i),K+1})$.
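The encoding (2.1)-(2.2) and the ordering convention for tied observation times can be sketched in a few lines of Python. This is only an illustration: the function names `encode`, `order_observations` and `simulate` are hypothetical, and the exponential failure and observation times and the uniformly distributed failure cause are assumptions made for the example, not part of the model.

```python
import random

def encode(x, y, t, K):
    """Return Delta = (Delta_1, ..., Delta_{K+1}) as in (2.1)-(2.2)."""
    delta = [0] * (K + 1)
    if x <= t:
        delta[y - 1] = 1   # failure by time t, due to cause y
    else:
        delta[K] = 1       # no failure by time t (right censored)
    return tuple(delta)

def order_observations(obs):
    """Sort (t, delta) pairs by t; at tied times, left censored come first."""
    # delta[-1] is Delta_{K+1}: 0 for left censored, 1 for right censored.
    return sorted(obs, key=lambda z: (z[0], z[1][-1]))

def simulate(n, K, seed=0):
    """Draw n observations; T is generated independently of (X, Y)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.expovariate(1.0)   # assumed failure time distribution
        y = rng.randint(1, K)      # assumed failure cause distribution
        t = rng.expovariate(1.0)   # assumed observation time distribution
        data.append((t, encode(x, y, t, K)))
    return order_observations(data)
```

Each simulated $\Delta$-vector has exactly one entry equal to one, in line with $\sum_{k=1}^{K+1} \Delta_k = 1$.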
[Figure 2.1: Graphical representation of the observed data $(T, \Delta)$ in an example with $K = 3$ competing risks. The grey sets indicate the values of $(X, Y)$ that are consistent with $(T, \Delta)$, for each of the four possible values of $\Delta$: $(1, 0, 0, 0)$, $(0, 1, 0, 0)$, $(0, 0, 1, 0)$ and $(0, 0, 0, 1)$.]

Let $e_k$, $k = 1, \dots, K+1$, be the $k$th unit vector in $\mathbb{R}^{K+1}$, and let

$$\mathcal{Z} = \{(t, e_k) : t \in \mathbb{R}_+,\ k = 1, \dots, K+1\}. \tag{2.3}$$

Let $G$ be the distribution of $T$, and let $G_n$ be the empirical distribution of $T_1, \dots, T_n$. Furthermore, let $P_n$ be the empirical distribution of $Z_1, \dots, Z_n$, i.e., for any function $h : \mathcal{Z} \to \mathbb{R}$ we have

$$P_n h(Z) = \int h(z)\, dP_n(z) = \frac{1}{n} \sum_{i=1}^{n} h(Z_i).$$

For vectors $x = (x_1, \dots, x_K) \in \mathbb{R}^K$, we define $x_+ = \sum_{k=1}^{K} x_k$ and $x_{K+1} = 1 - x_+$. For example, we write $\Delta_+ = \sum_{k=1}^{K} \Delta_k$, $F_{0+}(t) = \sum_{k=1}^{K} F_{0k}(t)$ and $F_{0,K+1}(t) = 1 - F_{0+}(t)$. The only exception to the notation $x_{K+1} = 1 - x_+$ is that we do not use it for the naive estimator. The reason for this will become clear when we define the naive estimator.
The MLE

We now define the MLE $\hat F_n = (\hat F_{n1}, \dots, \hat F_{nK})$ for $F_0 = (F_{01}, \dots, F_{0K})$. Note that

$$\Delta \mid T \sim \text{Multinomial}_{K+1}\bigl(1, (F_{01}(T), \dots, F_{0,K+1}(T))\bigr). \tag{2.4}$$

Hence, under $F = (F_1, \dots, F_K)$, the density for a single observation $z = (t, \delta)$ is

$$p_F(z) = \prod_{k=1}^{K+1} F_k(t)^{\delta_k}, \tag{2.5}$$

with respect to the dominating measure $\mu = G \times \#$, where $\#$ is counting measure on $\{e_k : k = 1, \dots, K+1\}$. The corresponding log likelihood, divided by $n$, is

$$l_n(F) = \int \log p_F(u, \delta)\, dP_n(u, \delta) = \int \sum_{k=1}^{K+1} \delta_k \log F_k(u)\, dP_n(u, \delta). \tag{2.6}$$

(In order to efficiently use the empirical process notation, we use the convention of dividing all log likelihoods by $n$.) The MLE, if it exists, is defined by

$$l_n(\hat F_n) = \max_{F \in \mathcal{F}_K} l_n(F), \tag{2.7}$$

where $\mathcal{F}_K$ is the set of all $K$-tuples of sub-distribution functions on $\mathbb{R}_+$ with pointwise sum bounded by one. (Existence of the estimators will follow from Theorem 2.1 ahead.) Note that we can absorb $G$ in the dominating measure $\mu$ because of the assumed independence between $T$ and $(X, Y)$.

The naive estimator

We now define the naive estimator $\tilde F_n = (\tilde F_{n1}, \dots, \tilde F_{n,K+1})$. The naive estimator $\tilde F_{nk}$ can be viewed as the MLE for the reduced current status data $Z_k = (T, \Delta_k)$. To see
this, let $p_{k,F_k}(u, \delta)$ be the marginal density of the reduced current status data $Z_k$:

$$p_{k,F_k}(u, \delta) = F_k(u)^{\delta_k} \{1 - F_k(u)\}^{1 - \delta_k}.$$

Then the naive estimator $\tilde F_{nk}$ maximizes the marginal log likelihood

$$l_{nk}(F_k) = \int \log p_{k,F_k}(u, \delta)\, dP_n(u, \delta) = \int \{\delta_k \log F_k(u) + (1 - \delta_k) \log(1 - F_k(u))\}\, dP_n(u, \delta), \tag{2.8}$$

for $k = 1, \dots, K+1$. Thus, the naive estimators (if they exist) are defined by

$$l_{nk}(\tilde F_{nk}) = \max_{F_k \in \mathcal{F}} l_{nk}(F_k), \quad k = 1, \dots, K, \tag{2.9}$$

$$l_{n,K+1}(\tilde F_{n,K+1}) = \max_{S \in \mathcal{S}} l_{n,K+1}(S), \tag{2.10}$$

where $\mathcal{F}$ is the collection of all sub-distribution functions on $\mathbb{R}_+$, and $\mathcal{S}$ is the collection of all sub-survival functions on $\mathbb{R}_+$. Note that we can omit $G$ in the marginal log likelihood, since $T$ and $(X, Y)$ are independent.

The naive estimator provides two different estimators for the overall failure time distribution $F_{0+}$, namely $\tilde F_{n+} = \sum_{k=1}^{K} \tilde F_{nk}$ and $1 - \tilde F_{n,K+1}$. Since the naive estimator does not require the sum of the sub-distribution functions to be bounded by one, $\tilde F_{n+}$ may exceed one. In contrast, $1 - \tilde F_{n,K+1}$ is always bounded between zero and one. This estimator is simply the MLE for the overall failure time distribution when information on the failure causes is ignored. In general, $\tilde F_{n,K+1} \neq 1 - \tilde F_{n+}$, and we therefore do not use the shorthand notation $x_{K+1} = 1 - x_+$ for the naive estimator.

Comparison of the two estimators

In order to point out the similarities and differences between the MLE and the naive estimator, we give the following alternative but equivalent definition of the naive
estimator. For $F = (F_1, \dots, F_K)$, we define

$$\tilde l_n(F) = \sum_{k=1}^{K} \int \bigl[\delta_k \log F_k(u) + (1 - \delta_k) \log(1 - F_k(u))\bigr]\, dP_n(u, \delta). \tag{2.11}$$

Then the naive estimator $\tilde F_n = (\tilde F_{n1}, \dots, \tilde F_{nK})$ (if it exists) is defined by

$$\tilde l_n(\tilde F_n) = \max_{F \in \mathcal{F}^K} \tilde l_n(F), \tag{2.12}$$

where $\mathcal{F}^K$ is the space of all $K$-tuples of sub-distribution functions on $\mathbb{R}_+$. Comparing this optimization problem with the optimization problem (2.7) for the MLE, we see the following two differences:

(a) The log likelihood (2.6) for the MLE contains a term involving $F_{K+1}(u) = 1 - F_+(u)$, while the log likelihood (2.11) for the naive estimator does not include such a term;
(b) The space $\mathcal{F}_K$ for the MLE includes the constraint that the sum of the sub-distribution functions is bounded by one, while the space $\mathcal{F}^K$ for the naive estimator does not include such a constraint.

Thus, the MLE takes into account the $K$-dimensional system of sub-distribution functions, while the naive estimator ignores this aspect of the problem. In fact, since the sub-distribution functions in optimization problem (2.12) are not related to each other, the optimization problem can be split into the $K$ optimization problems defined in (2.9). Since these optimization problems correspond to the MLE for univariate current status data, both computational results and asymptotic theory follow straightforwardly from known results for current status data (see Groeneboom and Wellner (1992, Part II, Sections 1.1, 4.1 and 5.1)).

The fact that the MLE takes into account the system of sub-distribution functions leads to more complicated computation and asymptotic theory. However, these complications result in a better pointwise behavior of the MLE, as shown in the simulation study in Chapter 8.

2.2 Censored data perspective

From the definitions of the MLE and the naive estimator, we see that both estimators can be viewed as nonparametric maximum likelihood estimators for censored data. Viewing the estimators from this perspective allows us to use readily available computational algorithms and theory for the MLE for censored data.

We consider the following general framework. Let $W$ be a random variable taking values in $\mathcal{W}$. Suppose that $W$ has distribution $F_0$. Our goal is to estimate this distribution. However, we do not observe $W$ directly. Rather, we observe a vector of random sets $D = (D_1, \dots, D_p)$ that form a partition of $\mathcal{W}$, i.e., $\bigcup_{j=1}^{p} D_j = \mathcal{W}$ and $D_j \cap D_k = \emptyset$ for $j \neq k \in \{1, \dots, p\}$. We assume that $D$ is independent of $W$. In principle, we can allow the number of random sets to be random, but for our purposes that is not needed. Furthermore, we observe an indicator vector $\Delta = (\Delta_1, \dots, \Delta_p)$, where $\Delta_j = 1\{W \in D_j\}$, $j = 1, \dots, p$. Thus, we observe a vector $D$ containing a random partition of $\mathcal{W}$, and an indicator vector $\Delta$ indicating which set $R \in \{D_1, \dots, D_p\}$ contains the unobservable $W$. We call the set $R$ an observed set. Using the convention $0 \cdot D_j = \emptyset$, we can write $R = \bigcup_{j=1}^{p} \Delta_j D_j$.

Let $Z_1, \dots, Z_n$ be $n$ i.i.d. copies of $Z = (D, \Delta)$. These data define $n$ i.i.d. observed sets $R_1, \dots, R_n$. Writing the log likelihood in terms of these sets gives

$$l_n(F) = \frac{1}{n} \sum_{i=1}^{n} \log P_F(R_i),$$

where $P_F(R_i)$ denotes the probability mass in $R_i$ under distribution $F$. The maximum
likelihood estimator (if it exists) is defined by

$$l_n(\hat F_n) = \max_{F \in \mathcal{F}} l_n(F), \tag{2.13}$$

where $\mathcal{F}$ is the space of all distribution functions on $\mathcal{W}$. Since $l_n(F)$ is optimized over the function space $\mathcal{F}$, the optimization problem (2.13) is infinite dimensional. However, the number of parameters can be reduced by generalizing the reasoning of Turnbull (1976) for univariate censored data. It follows that the estimators can only assign mass to a finite collection of disjoint sets $A_1, \dots, A_m$, called maximal intersections by Wong and Yu (1999).

In the literature, there are several equivalent definitions of maximal intersections. Wong and Yu (1999) define $A_j$ to be a maximal intersection if and only if it is a finite intersection of the $R_i$'s such that for each $i$, $A_j \cap R_i = \emptyset$ or $A_j \cap R_i = A_j$. Gentleman and Vandal (2002) use a graph theoretic perspective. They show that the maximal intersections correspond to maximal cliques of the intersection graph of the observed sets. We discuss this perspective in detail in the next section. For observed sets that take the form of rectangles in $\mathbb{R}^p$, $p \in \mathbb{N}$, Maathuis (2005) introduces yet another way to view the maximal intersections, using a height map of the observed sets. This height map is a function $h : \mathbb{R}^p \to \{0, 1, \dots\}$, where $h(x)$ is defined as the number of observed sets that overlap at the point $x \in \mathbb{R}^p$. Maathuis (2005) shows that the maximal intersections are exactly the local maxima of the height map of a canonical version of the observed sets. We say that $R'_1, \dots, R'_n$ are a canonical version of $R_1, \dots, R_n$ if the following three properties hold:

(i) $R'_1, \dots, R'_n$ and $R_1, \dots, R_n$ have the same intersection structure, i.e., $R'_i \cap R'_j = \emptyset$ if and only if $R_i \cap R_j = \emptyset$, for all $i, j \in \{1, \dots, n\}$;
(ii) The $x$-coordinates of $R'_1, \dots, R'_n$ are distinct and take values in $\{1, \dots, 2n\}$;
(iii) The $y$-coordinates of $R'_1, \dots, R'_n$ are distinct and take values in $\{1, \dots, 2n\}$.
Thus, any ties that may have been present in $R_1, \dots, R_n$ are resolved in $R'_1, \dots, R'_n$, but in a way that does not affect the intersection structure. For details on the transformation to canonical sets, see Maathuis (2005, Section 2.1).
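For intuition, the height map idea can be tried out in the simplest univariate case, where the observed sets are closed intervals on the real line. The sketch below is an illustration under that assumption and is not the algorithm of Maathuis (2005): it sweeps over the sorted endpoints and reports a maximal intersection wherever a left endpoint is directly followed by a right endpoint, which is where the height map has a local maximum. The function names `maximal_intersections` and `clique_matrix` are hypothetical.

```python
def maximal_intersections(intervals):
    """Maximal intersections of closed intervals [l, r] on the real line."""
    events = []
    for l, r in intervals:
        events.append((l, 0))  # 0 marks a left endpoint: the height goes up
        events.append((r, 1))  # 1 marks a right endpoint: the height goes down
    # At tied coordinates left endpoints sort first, so that closed
    # intervals sharing a single point are treated as intersecting.
    events.sort()
    result = []
    for (a, kind_a), (b, kind_b) in zip(events, events[1:]):
        if kind_a == 0 and kind_b == 1:
            result.append((a, b))  # local maximum of the height map
    return result

def clique_matrix(intervals, maximal):
    """H[i][j] = 1 if maximal intersection A_j is contained in R_i."""
    return [[1 if l <= a and b <= r else 0 for (a, b) in maximal]
            for (l, r) in intervals]
```

For the intervals $[0, 2]$, $[1, 3]$ and $[4, 5]$, the sweep yields the maximal intersections $[1, 2]$ and $[4, 5]$. The tie-breaking rule in the sort plays the role that the canonical version plays above: it resolves ties without changing which sets intersect.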
By generalizing the reasoning of Turnbull (1976), it follows that the MLE is indifferent to the distribution of mass within the maximal intersections. As a result, the MLE is typically not uniquely defined on the maximal intersections. This type of non-uniqueness is called representational non-uniqueness by Gentleman and Vandal (2002). Thus, we can at best hope to determine the probability masses $\alpha_j = P_F(A_j)$, $j = 1, \dots, m$. We let $\alpha = (\alpha_1, \dots, \alpha_m)$ and write the probability mass in an observed set $R_i$ in terms of $\alpha$:

$$P_\alpha(R_i) = \sum_{j=1}^{m} \alpha_j 1\{A_j \subseteq R_i\}. \tag{2.14}$$

Then we can write the log likelihood as

$$l_n(\alpha) = \frac{1}{n} \sum_{i=1}^{n} \log P_\alpha(R_i) = \frac{1}{n} \sum_{i=1}^{n} \log \Bigl( \sum_{j=1}^{m} \alpha_j 1\{A_j \subseteq R_i\} \Bigr). \tag{2.15}$$

Thus, we can think of the computation of the estimators as a two step process. First, in the reduction step, we compute the maximal intersections $A_1, \dots, A_m$. Next, in the optimization step, we solve the optimization problem

$$l_n(\hat\alpha) = \max_{\alpha \in \mathcal{A}} l_n(\alpha), \tag{2.16}$$

where $\mathcal{A} = \{\alpha \in \mathbb{R}^m : \alpha_j \ge 0,\ j = 1, \dots, m,\ 1^T \alpha = 1\}$ and $1$ is the all-one vector in $\mathbb{R}^m$. This optimization problem is an $m$-dimensional convex constrained optimization problem. Existence of the MLE follows directly from standard methods in optimization theory.

Theorem 2.1 The MLE $\hat\alpha$ defined by (2.16) exists.
Proof: Letting $\log(0) = -\infty$, $l_n(\alpha)$ is a continuous extended real valued function on the nonempty compact set $\mathcal{A}$. Hence, the maximum exists by, e.g., Zeidler (1985, Corollary 38.10).

The optimization problem (2.16) may have several solutions. This forms a second source of non-uniqueness for the MLE, called mixture non-uniqueness by Gentleman and Vandal (2002). We will show in Section 2.3 that for current status data with competing risks, both the MLE and the naive estimator are mixture unique. However, we first show how both estimators fit into the censored data framework.

Censored data perspective of the MLE

For the MLE, the variable of interest is $W = (X, Y)$, taking values in the space $\mathcal{W} = \mathbb{R}_+ \times \{1,\dots,K\}$. The observation time $T$ defines a partition of $p = K + 1$ random sets in $\mathcal{W}$:

$$D_k = (0, T] \times \{k\}, \quad k = 1,\dots,K, \tag{2.17}$$
$$D_{K+1} = (T, \infty) \times \{1,\dots,K\}. \tag{2.18}$$

Since there is a one-to-one correspondence between $D = (D_1,\dots,D_{K+1})$ and $T$, the assumption that $T$ is independent of $(X, Y)$ is equivalent to the assumption that $D$ is independent of $(X, Y)$. Furthermore, note that $\Delta_k = 1\{X \le T, Y = k\} = 1\{(X, Y) \in D_k\}$ for $k = 1,\dots,K$, and $\Delta_{K+1} = 1\{X > T\} = 1\{(X, Y) \in D_{K+1}\}$. Hence, the vector $\Delta$ indicates which set contains the unobservable $(X, Y)$, and the observed data $(T, \Delta)$ give exactly the same information as $(D, \Delta)$. The corresponding observed sets are $R = \bigcup_{k=1}^{K+1} \Delta_k D_k$, so that

$$R = \begin{cases} (0, T] \times \{k\} & \text{if } \Delta_k = 1,\ k = 1,\dots,K, \\ (T, \infty) \times \{1,\dots,K\} & \text{if } \Delta_{K+1} = 1. \end{cases} \tag{2.19}$$
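As a small illustration of (2.19), the sketch below maps one observation $(T, \Delta)$ to its observed set. The representation of a set as a triple `(left, right, causes)`, standing for $(\text{left}, \text{right}] \times \text{causes}$ with `math.inf` for an unbounded right end, and the function name are assumptions made for this example only.

```python
import math


def observed_set(t, delta):
    """Observed set R of (2.19) for observation time t and indicators delta.

    `delta` has length K + 1: delta[k - 1] is Delta_k for k = 1, ..., K and
    delta[K] is Delta_{K+1}. Exactly one entry must equal 1.
    """
    K = len(delta) - 1
    if sorted(delta) != [0] * K + [1]:
        raise ValueError("delta must contain exactly one 1 and K zeros")
    if delta[K] == 1:
        # No failure observed by time t: (t, inf) x {1, ..., K}.
        return (t, math.inf, frozenset(range(1, K + 1)))
    k = delta.index(1) + 1
    # Failure from cause k observed by time t: (0, t] x {k}.
    return (0.0, t, frozenset({k}))


print(observed_set(2.0, [0, 1, 0]))  # failure from cause 2 before time 2
print(observed_set(2.0, [0, 0, 1]))  # still at risk at time 2
```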
It follows that we can write the log likelihood (2.6) as $l_n(F) = \frac{1}{n} \sum_{i=1}^{n} \log P_F(R_i)$. The MLE maximizes this expression over all bivariate sub-distribution functions $F$ on $\mathbb{R}_+ \times \{1,\dots,K\}$, or equivalently, over all $K$-tuples of sub-distribution functions $F = (F_1,\dots,F_K)$ with pointwise sum bounded by one.

We now consider the maximal intersections of the observed sets $R_1,\dots,R_n$. Note that the observed sets can take the form $(t, \infty) \times \{1,\dots,K\}$ for some $t \in \mathbb{R}_+$. Such sets are not rectangles in $\mathbb{R}^2$, and hence we cannot directly use the concept of the height map of Maathuis (2005). However, by transforming such sets into $(t, \infty) \times [1, K]$, we do have rectangles in $\mathbb{R}^2$. We can then compute the maximal intersections using the concept of the height map. Afterwards we transform sets of the form $(t, \infty) \times [1, K]$ back to $(t, \infty) \times \{1,\dots,K\}$.

Once we have computed $\hat\alpha$, we obtain $\hat F_{nk}(t)$ by summing the mass in $(0, t] \times \{k\}$, for $k = 1,\dots,K$ and $t \in \mathbb{R}_+$. For each $k \in \{1,\dots,K+1\}$, we call $A$ a maximal intersection for $\hat F_{nk}$ if $A$ is involved in the computation of $\hat F_{nk}$. A precise definition is given below.

Definition 2.2 Let $k \in \{1,\dots,K\}$, and let $\mathcal{R} = \{R_1,\dots,R_n\}$ be the observed sets as defined in (2.19). We call $A$ a maximal intersection for $\hat F_{nk}$ if it is a maximal intersection of $\mathcal{R}$ and $A \cap (\mathbb{R} \times \{k\}) \neq \emptyset$. We call $A$ a maximal intersection for $\hat F_{n+}$ (or equivalently, for $\hat F_{n,K+1}$) if $A$ is a maximal intersection for some $\hat F_{nk}$, $k = 1,\dots,K$.

Note that maximal intersections for $\hat F_{n+}$ are sets in $\mathbb{R}_+ \times \{1,\dots,K\}$, although $\hat F_{n+}$ is a function on $\mathbb{R}_+$.

Recall that we order the observations such that their observation times are nondecreasing, where ties are broken arbitrarily after ensuring that left censored observations are ordered before right censored observations. Hence, if there is an observation $Z_i$ such that $T_i = T_{(n)}$ and $\Delta_{i,K+1} = 1$, then $\Delta_{(n),K+1} = 1$ holds, even if there are other observations with $T_i = T_{(n)}$ and $\Delta_{ik} = 1$ for some $k \in \{1,\dots,K\}$.
This is used in the following lemma, which provides information on the form of the maximal intersections for $\hat F_{nk}$. The lemma follows directly from the idea of the height map.

Lemma 2.3 Let $k \in \{1,\dots,K\}$. Each maximal intersection for $\hat F_{nk}$ satisfies one of the following two conditions:

(i) $A = (T_{(i)}, T_{(j)}] \times \{k\}$, with $i < j$, $\Delta_{(i),K+1} = 1$, $\Delta_{(j)k} = 1$, and $\Delta_{(l),K+1} = \Delta_{(l)k} = 0$ for all $l$ such that $T_{(i)} < T_{(l)} < T_{(j)}$;
(ii) $A = (T_{(n)}, \infty) \times \{1,\dots,K\}$, with $\Delta_{(n),K+1} = 1$.

Moreover, if a set $A$ satisfies one of these conditions, then $A$ is a maximal intersection for $\hat F_{nk}$.

Censored data perspective of the naive estimator

For the naive estimator $\tilde F_{nk}$, we consider the reduced current status data $Z_k = (T, \Delta_k)$. Define the variables

$$W_k = X 1\{Y = k\} + \infty \cdot 1\{Y \neq k\}, \quad k = 1,\dots,K, \qquad W_{K+1} = X,$$

taking values in $\mathcal{W} = \mathbb{R}_+ \cup \{\infty\}$. Note that $F_{0k}(t) = P(W_k \le t)$ for $k = 1,\dots,K$, and $F_{0,K+1}(t) = P(W_{K+1} > t)$. Hence we can take $W_1,\dots,W_{K+1}$ to be our variables of interest. The observation time $T$ defines a partition of $p = 2$ random sets in $\mathcal{W}$:

$$D_1 = (0, T] \quad \text{and} \quad D_2 = (T, \infty]. \tag{2.20}$$

Since there is a one-to-one correspondence between $D = (D_1, D_2)$ and $T$, the assumption that $T$ is independent of $(X, Y)$ is equivalent to the assumption that $D$ is independent of $W_1,\dots,W_{K+1}$.
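The reduction to the variables $W_1,\dots,W_{K+1}$ can be checked numerically. The following sketch is an illustration only (the function name and the test values are hypothetical): it builds the $W_k$ for a given pair $(X, Y)$ and verifies on a small grid that $1\{W_k \le t\} = 1\{X \le t, Y = k\}$ and $1\{W_{K+1} > t\} = 1\{X > t\}$, which is what underlies the identities for $F_{0k}$ and $F_{0,K+1}$ stated above.

```python
import math


def reduced_failure_times(x, y, K):
    """Return [W_1, ..., W_K, W_{K+1}] for failure time x and cause y in 1..K.

    W_k equals x if the failure cause is k and infinity otherwise;
    W_{K+1} equals x regardless of the cause.
    """
    w = [x if y == k else math.inf for k in range(1, K + 1)]
    w.append(x)  # W_{K+1} = X
    return w


# Check the indicator identities on a few (x, y) pairs and time points.
K = 3
for x, y in [(0.7, 1), (2.5, 3), (4.0, 2)]:
    w = reduced_failure_times(x, y, K)
    for t in [0.5, 1.0, 3.0, 5.0]:
        for k in range(1, K + 1):
            assert (w[k - 1] <= t) == (x <= t and y == k)
        assert (w[K] > t) == (x > t)
```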
For $k = 1,\dots,K$, note that $\Delta_k = 1\{X \le T, Y = k\} = 1\{W_k \le T\} = 1\{W_k \in D_1\}$. Hence, the vector $(\Delta_k, 1 - \Delta_k)$ indicates whether $D_1$ or $D_2$ contains the unobservable $W_k$, and the reduced current status data $(T, \Delta_k)$ give exactly the same information as $(D, \Delta_k)$. The corresponding observed sets are $R^{(k)} = \Delta_k D_1 \cup (1 - \Delta_k) D_2$, so that

$$R^{(k)} = \begin{cases} (0, T] & \text{if } \Delta_k = 1, \\ (T, \infty) & \text{if } \Delta_k = 0. \end{cases} \tag{2.21}$$

We can write the log likelihood (2.8) as $l_{nk}(F_k) = \frac{1}{n} \sum_{i=1}^{n} \log P_{F_k}(R_i^{(k)})$. The naive estimator maximizes this expression over all sub-distribution functions $F_k$ on $\mathbb{R}_+$.

For $k = K + 1$, note that $\Delta_{K+1} = 1\{X > T\} = 1\{W_{K+1} \in D_2\}$. Hence, the vector $(1 - \Delta_{K+1}, \Delta_{K+1})$ indicates whether $D_1$ or $D_2$ contains the unobservable $X$, and the reduced current status data $(T, \Delta_{K+1})$ give exactly the same information as $(D, \Delta_{K+1})$. The corresponding observed sets are $R^{(K+1)} = (1 - \Delta_{K+1}) D_1 \cup \Delta_{K+1} D_2$, so that

$$R^{(K+1)} = \begin{cases} (0, T] & \text{if } \Delta_{K+1} = 0, \\ (T, \infty) & \text{if } \Delta_{K+1} = 1. \end{cases} \tag{2.22}$$

We can write the log likelihood (2.8) as $l_{n,K+1}(S) = \frac{1}{n} \sum_{i=1}^{n} \log P_S(R_i^{(K+1)})$. The naive estimator $\tilde F_{n,K+1}$ maximizes this expression over all sub-survival functions $S$ on $\mathbb{R}_+$.

Definition 2.4 For $k = 1,\dots,K+1$, we call $A$ a maximal intersection for $\tilde F_{nk}$ if it is a maximal intersection of the observed sets $R_1^{(k)},\dots,R_n^{(k)}$ as defined in (2.21) and (2.22).

The maximal intersections for the naive estimator are described in Lemmas 2.5 and 2.6. Both lemmas follow directly from the idea of the height map.

Lemma 2.5 Let $k \in \{1,\dots,K\}$. Each maximal intersection $A$ for $\tilde F_{nk}$ satisfies one of the following two conditions:
(i) $A = (T_{(i)}, T_{(j)}]$, with $(T_{(i)}, T_{(j)}) \cap \{T_1,\dots,T_n\} = \emptyset$, $\Delta_{(i)k} = 0$, and $\Delta_{(j)k} = 1$.
(ii) $A = (T_{(n)}, \infty)$, with $\Delta_{(n)k} = 0$.

Moreover, if an interval $A$ satisfies one of these conditions, then it is a maximal intersection for $\tilde F_{nk}$.

Lemma 2.6 Each maximal intersection for $\tilde F_{n,K+1}$ satisfies one of the following two conditions:

(i) $A = (T_{(i)}, T_{(j)}]$, with $(T_{(i)}, T_{(j)}) \cap \{T_1,\dots,T_n\} = \emptyset$, $\Delta_{(i),K+1} = 1$, and $\Delta_{(j),K+1} = 0$.
(ii) $A = (T_{(n)}, \infty)$, with $\Delta_{(n),K+1} = 1$.

Moreover, if an interval $A$ satisfies one of these conditions, then $A$ is a maximal intersection for $\tilde F_{n,K+1}$.

Comparing the maximal intersections for both estimators

Definition 2.7 For any set $A \subseteq \mathbb{R}^2$, we define the x-interval and y-interval of $A$ to be the projections of $A$ on the x-axis and y-axis. Furthermore, we define the lower and upper endpoint of $A$ to be the lower and upper endpoint of its x-interval.

We now compare the maximal intersections for $\hat F_{nk}$ and $\tilde F_{nk}$, for $k \in \{1,\dots,K\}$.

Lemma 2.8 For each $k = 1,\dots,K$, the number of maximal intersections for $\tilde F_{nk}$ is at least as large as the number of maximal intersections for $\hat F_{nk}$. Moreover, each upper endpoint of a maximal intersection for $\hat F_{nk}$ is an upper endpoint of a maximal intersection for $\tilde F_{nk}$.

Proof: Let $A$ be a maximal intersection for $\hat F_{nk}$. We show that there is a maximal intersection for $\tilde F_{nk}$ with the same upper endpoint. Note that $A$ must satisfy one of
the two conditions of Lemma 2.3. First, suppose that $A = (T_{(n)}, \infty) \times \{1,\dots,K\}$ with $\Delta_{(n),K+1} = 1$. Then $\Delta_{(n)k} = 0$, and $(T_{(n)}, \infty)$ is a maximal intersection for $\tilde F_{nk}$ by Lemma 2.5. Next, suppose that $A = (T_{(i)}, T_{(j)}] \times \{k\}$, with $\Delta_{(i),K+1} = 1$, $\Delta_{(j)k} = 1$ and $\Delta_{(l)k} = \Delta_{(l),K+1} = 0$ for all $l$ such that $T_{(i)} < T_{(l)} < T_{(j)}$. Then $\Delta_{(j-1)k} = 0$, and hence $(T_{(j-1)}, T_{(j)}]$ is a maximal intersection for $\tilde F_{nk}$ by Lemma 2.5.

Lemma 2.9 The number of maximal intersections for $\tilde F_{n,K+1}$ is at most as large as the number of maximal intersections for $\hat F_{n,K+1}$. Moreover, the collection of lower endpoints of the maximal intersections for $\tilde F_{n,K+1}$ is identical to the collection of lower endpoints of the maximal intersections for $\hat F_{n,K+1}$. As a result, the number of regions on the x-axis where $\tilde F_{n,K+1}$ can put mass is identical to the number of regions on the x-axis where $\hat F_{n,K+1}$ can put mass. Finally, the union of the maximal intersections for $\tilde F_{n,K+1}$ is contained in the union of the x-intervals of the maximal intersections for $\hat F_{n,K+1}$.

Proof: Let $A$ be a maximal intersection for $\tilde F_{n,K+1}$. We show that there is a maximal intersection for $\hat F_{n,K+1}$ with the same lower endpoint. Note that $A$ must satisfy one of the two conditions of Lemma 2.6. First, suppose that $A = (T_{(i)}, T_{(j)}]$ with $(T_{(i)}, T_{(j)}) \cap \{T_1,\dots,T_n\} = \emptyset$, $\Delta_{(i),K+1} = 1$ and $\Delta_{(j),K+1} = 0$. Since $\Delta_{(j),K+1} = 0$, there must be a $k \in \{1,\dots,K\}$ such that $\Delta_{(j)k} = 1$. But this implies that $(T_{(i)}, T_{(j)}] \times \{k\}$ is a maximal intersection for $\hat F_{nk}$, by Lemma 2.3. Next, suppose that $A = (T_{(n)}, \infty)$ with $\Delta_{(n),K+1} = 1$. Then $(T_{(n)}, \infty) \times \{1,\dots,K\}$ is a maximal intersection for $\hat F_{n1},\dots,\hat F_{nK}$ by Lemma 2.3, and hence it is a maximal intersection for $\hat F_{n,K+1}$ by definition.

Next, let $A$ be a maximal intersection for $\hat F_{n,K+1}$. We show that there is a maximal intersection for $\tilde F_{n,K+1}$ with the same lower endpoint. By definition, it follows that there is a $k \in \{1,\dots,K\}$ so that $A$ is a maximal intersection for $\hat F_{nk}$.
Hence, $A$ must satisfy one of the two conditions of Lemma 2.3. First, suppose that $A = (T_{(i)}, T_{(j)}] \times \{k\}$, with $\Delta_{(i),K+1} = 1$, $\Delta_{(j)k} = 1$ and $\Delta_{(l)k} = \Delta_{(l),K+1} = 0$ for all $l$ such that $T_{(i)} < T_{(l)} < T_{(j)}$. If $S = (T_{(i)}, T_{(j)}) \cap \{T_1,\dots,T_n\} = \emptyset$, then $(T_{(i)}, T_{(j)}]$ is a maximal intersection for $\tilde F_{n,K+1}$ by Lemma 2.6. Otherwise, $(T_{(i)}, \min\{S\}]$ is a maximal intersection for $\tilde F_{n,K+1}$. Next, suppose that $A = (T_{(n)}, \infty) \times \{1,\dots,K\}$ with $\Delta_{(n),K+1} = 1$. Then $(T_{(n)}, \infty)$ is a maximal intersection for $\tilde F_{n,K+1}$ by Lemma 2.6.

The last statement follows by combining the fact that the collections of lower endpoints of the maximal intersections for $\tilde F_{n,K+1}$ and $\hat F_{n,K+1}$ are identical, with the fact that maximal intersections for $\tilde F_{n,K+1}$ cannot contain observation times in their interior (Lemma 2.6).

Remark 2.10 The last statement of Lemma 2.9 has implications for representational non-uniqueness of the estimators. It shows that it is possible that the area in which the MLE $\hat F_{n,K+1}$ suffers from representational non-uniqueness is larger than the area in which $\tilde F_{n,K+1}$ suffers from representational non-uniqueness. This was also noted by Hudgens, Satten and Longini (2001), and partly motivated their pseudo-likelihood estimator. However, note that it can also happen that $\tilde F_{n,K+1}$ is non-unique over a larger area, if many of the maximal intersections for $\hat F_{n,K+1}$ get zero mass. For an example, see Tables 2.1 and 2.2.

Table 2.1: Example data with $K = 2$ competing risks, illustrating that the number of positive maximal intersections for $\tilde F_{n,K+1}$ can be larger than the number of positive maximal intersections for $\hat F_{n,K+1}$.
[Columns: $i$, $t_{(i)}$, $\delta_{(i)1}$, $\delta_{(i)2}$, $\delta_{(i)}$, repeated in two side-by-side panels; the table entries are not preserved.]

Motivated by Remark 2.10, we now consider maximal intersections that get positive mass. We introduce the following terminology:
Definition 2.11 Let $k \in \{1,\dots,K+1\}$. We say that $A$ is a positive maximal intersection for $\hat F_{nk}$ if $A$ is a maximal intersection for $\hat F_{nk}$ and the MLE assigns positive mass to $A$. Similarly, we say that $A$ is a positive maximal intersection for $\tilde F_{nk}$ if $A$ is a maximal intersection for $\tilde F_{nk}$ and $\tilde F_{nk}$ assigns positive mass to $A$.

After reading Lemma 2.9, one may wonder whether the number of positive maximal intersections for $\tilde F_{n,K+1}$ is at most as large as the number of positive maximal intersections for $\hat F_{n,K+1}$. This is indeed often the case in simulations, but not always. A counterexample can be found in Table 2.1. In this example, $\hat F_{n,K+1}$ has four maximal intersections, given in Table 2.2. The naive estimator $\tilde F_{n,K+1}$ has three maximal intersections, with corresponding masses given in Table 2.2. Note that the maximal intersections satisfy the statement in Lemma 2.9. However, there are only two positive maximal intersections for $\hat F_{n,K+1}$, while there are three positive maximal intersections for $\tilde F_{n,K+1}$.

Table 2.2: The estimators for the data in Table 2.1, in terms of their maximal intersections (MIs) and the corresponding probability masses.

    MLE $\hat F_{n,K+1}$            Naive estimator $\tilde F_{n,K+1}$
    MIs             mass            MIs       mass
    (0, 1] × {1}    3/10            (0, 1]    1/3
    (3, 4] × {1}    0               (3, 4]    1/6
    (5, 8] × {1}    0               (5, 6]    1/2
    (5, 6] × {2}    7/10

Graph theory and uniqueness

Gentleman and Vandal (2001), Gentleman and Vandal (2002), Maathuis (2003), and Vandal, Gentleman and Liu (2006) use a graph theoretic perspective to study properties of the maximum likelihood estimator for censored data. Before we apply these methods to our problem, we give an introduction to graph theory. This introduction
is mostly based on Golumbic (1980), and also partly given in Maathuis (2003, Section 3.3).

Introduction to graph theory for censored data

Let $G = (V, E)$ be an undirected graph, where $V$ is a set of vertices, and $E$ is a set of edges. An edge is a collection of two vertices. Two vertices $v$ and $w$ are said to be adjacent in $G$ if there is an edge between $v$ and $w$, i.e., $vw \in E$. We say that two sets of vertices $S_1$ and $S_2$ are adjacent if there is at least one pair of vertices $(v, w)$ such that $v \in S_1$, $w \in S_2$ and $vw \in E$. A subgraph of $G = (V, E)$ is defined to be any graph $G' = (V', E')$ such that $V' \subseteq V$ and $E' \subseteq E$. Given a subset $A \subseteq V$ of vertices, we define the subgraph induced by $A$ to be $G_A = (A, E_A)$, where $E_A = \{xy \in E : x \in A, y \in A\}$.

We call a subset $M \subseteq V$ of vertices a clique if every pair of distinct vertices in $M$ is adjacent. We call $M \subseteq V$ a maximal clique if there is no clique in $G$ that properly contains $M$ as a subset.³ Every finite graph has a finite number of maximal cliques that we denote by $\mathcal{C} = \{C_1,\dots,C_m\}$.

Let $\mathcal{R} = \{R_1,\dots,R_n\}$ be a family of sets. The intersection graph of $\mathcal{R}$ is obtained by representing each set in $\mathcal{R}$ by a vertex, and connecting two vertices by an edge if and only if their corresponding sets intersect. An intersection graph of a collection of intervals on a linearly ordered set is called an interval graph. Alternatively, an undirected graph $G$ is called an interval graph if it can be thought of as an intersection graph of a set of intervals on the real line. Every maximal clique $C_j$ in an intersection graph has a real representation $A_j = \bigcap_{R \in C_j} R$, given by the intersection of the sets that form the maximal clique.

A sequence of vertices $(v_0, v_1,\dots,v_l)$ is called a cycle of length $l + 1$ if $v_{i-1} v_i \in E$ for all $i = 1,\dots,l$ and $v_l v_0 \in E$. A cycle $(v_0,\dots,v_l)$ is called a simple cycle if $v_i \neq v_j$ for $i \neq j$. A simple cycle $(v_0, v_1,\dots,v_l)$ is called chordless if for all $i = 0,\dots,l$, $v_i v_j \in E$ only for $j = (i \pm 1) \bmod (l + 1)$. A graph is called triangulated if it does not contain chordless cycles of length strictly greater than three. Hajós (1957) showed that every interval graph is triangulated.

A clique graph of $\mathcal{R}$ is an intersection graph of the maximal cliques $\mathcal{C}$. Thus, in this graph each vertex represents a maximal clique, and two vertices $C_j$ and $C_k$ are adjacent if and only if $C_j \cap C_k \neq \emptyset$, i.e., if there is at least one set in $\mathcal{R}$ that is an element of both $C_j$ and $C_k$. We define the clique matrix to be a vertices versus maximal cliques incidence matrix. For $n$ observed sets with $m$ maximal cliques, this is an $n \times m$ matrix $H$ with elements $H_{ij} = 1\{A_j \subseteq R_i\}$.⁴

We now return to the maximum likelihood estimator for censored data. Let $\mathcal{R} = \{R_1,\dots,R_n\}$ be the observed sets. Gentleman and Vandal (2001) showed that the maximal intersections $A_1,\dots,A_m$ of $\mathcal{R}$, defined in Section 2.2, are exactly the real representations of the maximal cliques of the intersection graph of $\mathcal{R}$. Hence, we can study the intersection graph to deduce properties of the MLE. In particular, Gentleman and Vandal (2002, Lemma 4) showed that $\hat\alpha$ is unique if the intersection graph is triangulated. An alternative proof can be found in Maathuis (2003, Lemma 3.13). Finally, we can use the clique matrix $H$ to rewrite the optimization problem (2.16). Namely, $P_\alpha(R_i) = (H\alpha)_i$, so that (2.16) becomes

$$l_n(\hat\alpha) = \max_{\alpha \in \mathcal{A}} \frac{1}{n} \sum_{i=1}^{n} \log\bigl((H\alpha)_i\bigr).$$

Graph theoretic aspects and uniqueness of the naive estimator

For $k = 1,\dots,K+1$, let $\mathcal{R}^{(k)} = \{R_1^{(k)},\dots,R_n^{(k)}\}$ be the observed sets for the naive estimator $\tilde F_{nk}$, as defined in (2.21) and (2.22). The following proposition uses the structure of the intersection graph and the form of the maximal intersections to

³ Instead of the terms clique and maximal clique, some authors use the terms complete subgraph and clique.
⁴ Note that our $H$ is the transpose of the incidence matrix defined in Gentleman and Vandal (2002, page 559).
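The clique-matrix form of (2.16), with $P_\alpha(R_i) = (H\alpha)_i$, is convenient for numerical work. The sketch below is an illustration under stated assumptions, not the computational method of the dissertation: it builds $H$ for interval-valued observed sets $(l, r]$ given as `(left, right)` pairs, and then runs a plain EM (self-consistency) iteration, $\alpha_j \leftarrow \alpha_j \cdot \frac{1}{n} \sum_i H_{ij} / (H\alpha)_i$, started from the uniform vector. The function names, the use of NumPy, and the fixed iteration count are all choices made for this example.

```python
import numpy as np


def clique_matrix(observed, maximal):
    """n x m matrix with H[i, j] = 1 if maximal intersection j lies in set i.

    Both arguments are lists of (left, right) pairs standing for half-open
    intervals (left, right]; (a, b] is contained in (l, r] iff l <= a, b <= r.
    """
    H = np.zeros((len(observed), len(maximal)))
    for i, (l, r) in enumerate(observed):
        for j, (a, b) in enumerate(maximal):
            if l <= a and b <= r:
                H[i, j] = 1.0
    return H


def em_masses(H, n_iter=500):
    """EM iteration for maximizing sum_i log((H alpha)_i) over the simplex.

    Assumes every row of H has at least one nonzero entry, which holds when
    the columns are the maximal intersections of the observed sets.
    """
    n, m = H.shape
    alpha = np.full(m, 1.0 / m)      # strictly positive starting point
    for _ in range(n_iter):
        p = H @ alpha                # p[i] = P_alpha(R_i) = (H alpha)_i
        alpha = alpha * (H.T @ (1.0 / p)) / n
    return alpha


# Observed sets (0, 1], (0, 2] and (1.5, inf) have the maximal
# intersections (0, 1] and (1.5, 2].
observed = [(0.0, 1.0), (0.0, 2.0), (1.5, np.inf)]
maximal = [(0.0, 1.0), (1.5, 2.0)]
H = clique_matrix(observed, maximal)
print(H)
print(em_masses(H))
```

Each update keeps the entries of `alpha` nonnegative and summing to one, so the iterates stay in the feasible set $\mathcal{A}$. Convergence can be slow when some maximal intersections receive zero mass, so a convergence check would normally replace the fixed iteration count.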
More informationFair Factorizations of the Complete Multipartite Graph and Related Edge-Colorings. Aras Erzurumluoğlu
Fair Factorizations of the Complete Multipartite Graph and Related Edge-Colorings by Aras Erzurumluoğlu A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the
More informationThe partially monotone tensor spline estimation of joint distribution function with bivariate current status data
University of Iowa Iowa Research Online Theses and Dissertations Summer 00 The partially monotone tensor spline estimation of joint distribution function with bivariate current status data Yuan Wu University
More informationMinimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and
Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data Jeff Dominitz RAND and Charles F. Manski Department of Economics and Institute for Policy Research, Northwestern
More informationLikelihood and Fairness in Multidimensional Item Response Theory
Likelihood and Fairness in Multidimensional Item Response Theory or What I Thought About On My Holidays Giles Hooker and Matthew Finkelman Cornell University, February 27, 2008 Item Response Theory Educational
More informationThe extreme points of symmetric norms on R^2
Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 2008 The extreme points of symmetric norms on R^2 Anchalee Khemphet Iowa State University Follow this and additional
More information1 Directional Derivatives and Differentiability
Wednesday, January 18, 2012 1 Directional Derivatives and Differentiability Let E R N, let f : E R and let x 0 E. Given a direction v R N, let L be the line through x 0 in the direction v, that is, L :=
More informationSTAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationDOUBLY PERIODIC SELF-TRANSLATING SURFACES FOR THE MEAN CURVATURE FLOW
DOUBLY PERIODIC SELF-TRANSLATING SURFACES FOR THE MEAN CURVATURE FLOW XUAN HIEN NGUYEN Abstract. We construct new examples of self-translating surfaces for the mean curvature flow from a periodic configuration
More informationMultistate Modeling and Applications
Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationThe Chromatic Number of Ordered Graphs With Constrained Conflict Graphs
The Chromatic Number of Ordered Graphs With Constrained Conflict Graphs Maria Axenovich and Jonathan Rollin and Torsten Ueckerdt September 3, 016 Abstract An ordered graph G is a graph whose vertex set
More informationEstimation of Conditional Kendall s Tau for Bivariate Interval Censored Data
Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional
More informationParadoxical Results in Multidimensional Item Response Theory
UNC, December 6, 2010 Paradoxical Results in Multidimensional Item Response Theory Giles Hooker and Matthew Finkelman UNC, December 6, 2010 1 / 49 Item Response Theory Educational Testing Traditional model
More informationFULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
More informationConflict-Free Colorings of Rectangles Ranges
Conflict-Free Colorings of Rectangles Ranges Khaled Elbassioni Nabil H. Mustafa Max-Planck-Institut für Informatik, Saarbrücken, Germany felbassio, nmustafag@mpi-sb.mpg.de Abstract. Given the range space
More informationCounting independent sets of a fixed size in graphs with a given minimum degree
Counting independent sets of a fixed size in graphs with a given minimum degree John Engbers David Galvin April 4, 01 Abstract Galvin showed that for all fixed δ and sufficiently large n, the n-vertex
More informationGraph coloring, perfect graphs
Lecture 5 (05.04.2013) Graph coloring, perfect graphs Scribe: Tomasz Kociumaka Lecturer: Marcin Pilipczuk 1 Introduction to graph coloring Definition 1. Let G be a simple undirected graph and k a positive
More informationARE202A, Fall 2005 CONTENTS. 1. Graphical Overview of Optimization Theory (cont) Separating Hyperplanes 1
AREA, Fall 5 LECTURE #: WED, OCT 5, 5 PRINT DATE: OCTOBER 5, 5 (GRAPHICAL) CONTENTS 1. Graphical Overview of Optimization Theory (cont) 1 1.4. Separating Hyperplanes 1 1.5. Constrained Maximization: One
More informationAnalysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates
Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a
More informationSZEMERÉDI S REGULARITY LEMMA FOR MATRICES AND SPARSE GRAPHS
SZEMERÉDI S REGULARITY LEMMA FOR MATRICES AND SPARSE GRAPHS ALEXANDER SCOTT Abstract. Szemerédi s Regularity Lemma is an important tool for analyzing the structure of dense graphs. There are versions of
More informationThe Skorokhod reflection problem for functions with discontinuities (contractive case)
The Skorokhod reflection problem for functions with discontinuities (contractive case) TAKIS KONSTANTOPOULOS Univ. of Texas at Austin Revised March 1999 Abstract Basic properties of the Skorokhod reflection
More informationLikelihood Based Inference for Monotone Response Models
Likelihood Based Inference for Monotone Response Models Moulinath Banerjee University of Michigan September 11, 2006 Abstract The behavior of maximum likelihood estimates (MLEs) and the likelihood ratio
More informationInduced Saturation Number
Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 2012 Induced Saturation Number Jason James Smith Iowa State University Follow this and additional works at: https://lib.dr.iastate.edu/etd
More informationClassical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems
Classical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems Rani M. R, Mohith Jagalmohanan, R. Subashini Binary matrices having simultaneous consecutive
More informationApplication of Time-to-Event Methods in the Assessment of Safety in Clinical Trials
Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Progress, Updates, Problems William Jen Hoe Koh May 9, 2013 Overview Marginal vs Conditional What is TMLE? Key Estimation
More informationConvergence in shape of Steiner symmetrized line segments. Arthur Korneychuk
Convergence in shape of Steiner symmetrized line segments by Arthur Korneychuk A thesis submitted in conformity with the requirements for the degree of Master of Science Graduate Department of Mathematics
More informationModeling and Stability Analysis of a Communication Network System
Modeling and Stability Analysis of a Communication Network System Zvi Retchkiman Königsberg Instituto Politecnico Nacional e-mail: mzvi@cic.ipn.mx Abstract In this work, the modeling and stability problem
More informationConsistency Under Sampling of Exponential Random Graph Models
Consistency Under Sampling of Exponential Random Graph Models Cosma Shalizi and Alessandro Rinaldo Summary by: Elly Kaizar Remember ERGMs (Exponential Random Graph Models) Exponential family models Sufficient
More informationSmall Label Classes in 2-Distinguishing Labelings
Also available at http://amc.imfm.si ISSN 1855-3966 (printed ed.), ISSN 1855-3974 (electronic ed.) ARS MATHEMATICA CONTEMPORANEA 1 (2008) 154 164 Small Label Classes in 2-Distinguishing Labelings Debra
More informationLecture 35: December The fundamental statistical distances
36-705: Intermediate Statistics Fall 207 Lecturer: Siva Balakrishnan Lecture 35: December 4 Today we will discuss distances and metrics between distributions that are useful in statistics. I will be lose
More informationVariational Inference (11/04/13)
STA561: Probabilistic machine learning Variational Inference (11/04/13) Lecturer: Barbara Engelhardt Scribes: Matt Dickenson, Alireza Samany, Tracy Schifeling 1 Introduction In this lecture we will further
More informationModulation of symmetric densities
1 Modulation of symmetric densities 1.1 Motivation This book deals with a formulation for the construction of continuous probability distributions and connected statistical aspects. Before we begin, a
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 7, 04 Reading: See class website Eric Xing @ CMU, 005-04
More informationwith Current Status Data
Estimation and Testing with Current Status Data Jon A. Wellner University of Washington Estimation and Testing p. 1/4 joint work with Moulinath Banerjee, University of Michigan Talk at Université Paul
More informationChapter 9. Non-Parametric Density Function Estimation
9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least
More informationLinear-Time Algorithms for Finding Tucker Submatrices and Lekkerkerker-Boland Subgraphs
Linear-Time Algorithms for Finding Tucker Submatrices and Lekkerkerker-Boland Subgraphs Nathan Lindzey, Ross M. McConnell Colorado State University, Fort Collins CO 80521, USA Abstract. Tucker characterized
More informationHigh dimensional ising model selection using l 1 -regularized logistic regression
High dimensional ising model selection using l 1 -regularized logistic regression 1 Department of Statistics Pennsylvania State University 597 Presentation 2016 1/29 Outline Introduction 1 Introduction
More informationStrongly chordal and chordal bipartite graphs are sandwich monotone
Strongly chordal and chordal bipartite graphs are sandwich monotone Pinar Heggernes Federico Mancini Charis Papadopoulos R. Sritharan Abstract A graph class is sandwich monotone if, for every pair of its
More informationScore Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics
Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee 1 and Jon A. Wellner 2 1 Department of Statistics, Department of Statistics, 439, West
More information