GOODNESS-OF-FIT TEST FOR RANDOMLY CENSORED DATA BASED ON MAXIMUM CORRELATION. Ewa Strzalkowska-Kominiak and Aurea Grané (1)

Size: px
Start display at page:

Download "GOODNESS-OF-FIT TEST FOR RANDOMLY CENSORED DATA BASED ON MAXIMUM CORRELATION. Ewa Strzalkowska-Kominiak and Aurea Grané (1)"

Transcription

1 Working Paper 4-2 Statistics and Econometrics Series (4) July 24 Departamento de Estadística Universidad Carlos III de Madrid Calle Madrid, Getafe (Spain) Fax (34) GOODNESS-OF-FIT TEST FOR RANDOMLY CENSORED DATA BASED ON MAXIMUM CORRELATION Ewa Strzalkowska-Kominiak and Aurea Grané () Abstract In this paper we study the goodness-of-fit test introduced by Fortiana and Grané (23) and Grané (22), in the context of randomly censored data. We construct a new test statistic under general right-censoring, i.e., with unknown censoring distribution, and prove its asymptotic properties. Additionally, we study a special case, when the censoring mechanism follows the well-known Koziol-Green model. We present an extensive simulation study on the empirical power of these two versions of the test statistic. We show the good performance of these statistics in detecting symmetrical alternatives and their advantages over the most famous Pearson-type test proposed by Akritas (988). Finally, we apply our test to the head-and-neck cancer data. Keywords: Goodness-of-fit; Kaplan-Meier estimator; Maximum correlation; Random censoring. AMS classification: 62E2, 62G, 62G2, 62N86. () A. Grané, Statistics Department, Universidad Carlos III de Madrid, C/ Madrid 26, 2893 Getafe (Madrid), Spain, aurea.grane@uc3m.es, corresponding author. E. Strzalkowska-Kominiak, Statistics Department, Universidad Carlos III de Madrid, C/ Madrid 26, 2893 Getafe (Madrid), Spain, ewa.strzalkowska@uc3m.es This work has been partially supported by Spanish grants MTM (Spanish Ministry of Science and Innovation), MTM , ECO (Spanish Ministry of Economy and Competitiveness).

2 Goodness-of-fi test for randomly censored data based on maximum correlation Ewa Strzalkowska-Kominiak Aurea Grané Statistics Department, Universidad Carlos III de Madrid. Abstract In this paper we study the goodness-of-fi test introduced by Fortiana and Grané (23) and Grané (22), in the context of randomly censored data. We construct a new test statistic under general right-censoring, i.e., with unknown censoring distribution, and prove its asymptotic properties. Additionally, we study a special case, when the censoring mechanism follows the well-known Koziol-Green model. We present an extensive simulation study on the empirical power of these two versions of the test statistic. We show the good performance of these statistics in detecting symmetrical alternatives and their advantages over the most famous Pearson-type test proposed by Akritas (988). Finally, we apply our test to the head-and-neck cancer data. Keywords: Goodness-of-fit Kaplan-Meier estimator; Maximum correlation; Random censoring. Introduction In many medical studies one encounters data which are not fully observed and so censored from the right. Let Y,..., Y n be the lifetimes of interest, coming from the continuous distribution function F and C,..., C n the censoring times from the distribution function G. In the context of right-censored data, for every i =,..., n, we observe X i = min(y i, C i ) and δ i = {Yi C i }, A denotes the indicator function being equal to if A is fulfille and otherwise. Even though the unknown distribution function F can be estimated by a well-known product-limit estimator, introduced by Kaplan and Meier (958), very often we may have sufficien information to know the shape Statistics Department, Universidad Carlos III de Madrid, C/ Madrid 26, 2893 Getafe, Spain. s: E. Strzalkowska-Kominiak, ewa.strzalkowska@uc3m.es, A. Grané, aurea.grane@uc3m.es (corresponding author).

3 of the distribution in question. Using such fully parametric models can lead to substantial gain in the efficien y of statistical procedures involving the distribution of the lifetimes F if only the parametric choice of that distribution is correct. This makes the goodness-of-fi test an important statistical tool, when dealing with (right-)censored data. It is clear, however, that the most famous tests for complete data, as Kolmogorov-Smirnov or Cramer-von Mises, are difficul to apply it the presence of censoring since the limit distribution depends on the censoring distribution G. The most recent overview of this kind of tests with randomly censored data is given by Balakrishnan et al (24). Some more classical approaches are to fin in Koziol and Green (976) and Akritas (988). While the firs one is based on the assumption that the distribution function G follows the so called Koziol-Green model and hence is more restrictive, the second one is a χ 2 test applied to general random censoring. For this reason, the Pearson-type goodness-of-fi test proposed by Akritas (988) is so far the best option for randomly censored data with unknown censoring distribution. Nevertheless, it requires a partition of the observations to the cells and an adequate choice of number of classes since the power of the test may vary depending on the degrees of freedom. In this work we propose a new goodness-of-fi test based on maximum correlation coefficient which has a normal limiting distribution and hence it is straightforward to apply. For this, we firs introduce the maximum correlation in a more general set-up. Let Y and Y 2 be two random variables with finit second order moments, joint cumulative distribution function (cdf) H and marginals F and F 2, respectively. Recall, that the Hoeffding representation of the correlation coefficien is given by ρ(f, F 2 ) = (H(x, y) F (x)f 2 (y))dxdy, σ σ 2 R 2 σ i denotes the standard deviation of Y i. Furthermore, the maximum correlation of the pair of random variables (Y, Y 2 ) is define as the correlation coefficien ρ + (F, F 2 ) corresponding to the bivariate cdf H + (x, y) = min(f (x), F 2 (y)), the upper Fréchet bound of H(x, y). The cdf H + (x, y) is a singular distribution, having support on the one-dimensional set {(x, y) R 2 : F (x) = F 2 (y)}, and the maximum correlation coefficien is given by ρ + (F, F 2 ) = ( F (p)f2 (p)dp µ µ 2 ), () σ σ 2 F i is the inverse of F i and µ i is the the mean of Y i. This maximum correlation, ρ + (F, F 2 ), is a measure of agreement between F and F 2, since ρ + = if and only if F = F 2 up to a scale and location change. In particular, Cuadras and Fortiana (993) proposed the statistic based on ρ + (F, F ) as a measure of goodness of fi of an iid sample Y,..., Y n with cdf F = F, to a given distribution F. The goodness-of-fi test based on maximum correlation was further studied by Fortiana and Grané (23) and Grané (22). As in the two latter publications, the present paper is devoted to testing uniformity, i.e. F = F U, a [, ] uniform distribution. As shown by Fortiana and Grané (23) the asymptotic approximation of 2

4 ρ + (F, F U ) is available, but convergence to its limiting law is rather slow. This led to definin Q = which equals one is F = F U. σ /2 ρ + (F, F U ) = 6 x(2f (x) )F (dx), (2) The goal of this paper is to study a test statistic based on Q when Y,..., Y n may not be fully observed and so censored from the right by censoring times C,..., C n. More precisely, we wish to test the hypothesis H : F = F U, F U is the cdf of a [, ] uniform random variable, based on the sample (X i, δ i ) i=,...,n, X i = min(y i, C i ), with X i [, ]. Nevertheless, our approach is not restricted to testing uniformity. We can also consider a more general null hypothesis F since, under H : F = F, we have that the transformed random variable F (Y ) follows a [, ] uniform distribution. That is, Ỹ = F (Y ) F U under the null hypothesis. Setting C = F (C) and since {Ỹi C i } = {Y i C i }, leads us to testing uniformity based on the iid sample ( X, δ ),..., ( X n, δ n ), X i = min(ỹi, C i ) and δ i = { Ỹ i C i }. Hence, testing for uniformity is equivalent to testing for a fully specifie continuous distribution. Even though it seems that we can extend the work of Fortiana and Grané (23) setting Q n = 6 x(2f n(x) )F n (dx), F n denotes the Kaplan-Meier estimator for censored data, it is far from being true. In contrast to the empirical distribution under completely observed data, the Kaplan-Meier estimator is biased (see Stute (994), for details). In Section 2 we show that such Plug- In estimator suffers from bias of this product-limit estimator and hence E(Q n ) = under H does not hold. To avoid this problem we propose to re-write Q in such a way, that it can be estimated by degenerated U-statistics. This leads to significan bias (and variance) reduction. In Section 3 we prove the asymptotic normality of the proposed estimator and in Section 4 we present our new goodness-of-fi test. In Section 5 we present an extensive simulation study. Finally, in Section 6 we adapt the test statistic to the case of composite null hypothesis and apply our test to the head-and-neck cancer data from Haghighi and Nikulin (24). 2 Test statistic In this section we propose our new statistic to test the goodness of fi for randomly censored data based on the modifie maximum correlation coefficient Recall that, under H : F = F U, the quantity Q = σ /2 ρ + (F, F U ) = 6 equals one. Hence in the following we prefer to work with Q = Q = 6 x(2f (x) )F (dx) x(2f (x) )F (dx) (3) 3

5 which equals zero if H is true. First, we defin a Plug-In estimator of Q by replacing F in (3) with the well-known Kaplan-Meier estimator. We obtain F n is define as follows Q n = 6 F n (x) = X i x x(2f n (x) )F n (dx), (4) [ ] δ i n k=. (5) {X k X i } It turns out, that under the null hypothesis and for finit samples the Plug-In estimator Q n suffers from significan bias and its convergence to the limiting distribution is very slow. To solve this problem, we propose to estimate Q with a U-statistic. For this, note that, if F is a continuous cdf and supp(f ) [, ], then Hence Q = = 2 (6x(2F (x) ) 2F (x))f (dx) = F (x)f (dx) =. [(6x 2)F (x) 6x( F (x))]f (dx) [(6x 2) {y x} 6x {y>x} ]F (dx)f (dy). (6) Now we may replace the unknown quantities by their estimators. For this we introduce the jumps of the Kaplan-Meier estimator by setting w in = F n (X i ) F n (X i ), F n (x ) is the left-continuous version of F n (x). Finally, the estimator of Q is given by Q n = w in w jn h(x i, X j ), (7) i= j i h(x, x 2 ) = (6x 2) {x2 x } 6x {x2 >x }. To illustrate the advantages of using Q n over the Plug-In estimator Q n, in panel (a) of Figure, we present the bias and variance of those estimators under the null hypothesis, that is, when the data comes from the [, ] uniform distribution. Additionally, in panels (b)-(c) of Figure, we compare the kernel density estimators of the standardized versions of Q n and Q n to that of the standard normal distribution. The standardization is done using the estimated asymptotic variances, discussed later on. Clearly, the U-statistic, Q n, exhibits much smaller bias (and variance) and its standardized version fit nicely the standard normal distribution for all the considered censoring rates. 4

6 Figure : Comparison between Q n and the Plug-In estimator Q n: Estimated bias (variance) based on 5 trials and kkernel densities for n = 2. (b) Kernel density of standardized Q n (a) Estimated bias (variance) % censoring Q n Q n n = (.8) -. (.47) n =.5 (.32) -.4 (.22) 2% censoring Q n Q n n = 5. (.97) -.69 (.63) n =.47 (.74) -.26 (.26) 3% censoring Q n Q n n = (.483) -.22 (.69) n = (.24) -.9 (.33) Normal % 2 % 3 % (c) Kernel density of standardized Q n Normal % 2 % 3 % Asymptotic properties In this section we study the asymptotic properties of our test statistic Q n. First we consider Q n under general censoring mechanism, that is, without assuming any shape of the distribution function G(x) = P (C x), which is the cdf of the censoring times. Then we apply the results to the special case of Koziol-Green model. Recall that F (x) = P (Y x) is the cdf of the lifetimes of interest. We need following assumptions: A : A2 : F (du) G(u) < ϕ(u) C /2 (u)f (du) < ϕ is a score function, C(x) = x Theorem. Under A and A2, we have n( Qn Q ) N (, σ 2 ), σ 2 = [ ϕ 2 (x) 2 G(x) F (dx) ϕ(x)f (dx)] G(dy) and F is continuous with support in [, ]. ( G(y)) 2 ( F (y)) 5 [ x ] 2 ( F (x))g(dx) ϕ(y)f (dy) ( H(x)) 2

7 and ϕ(x) = 2xF (x) 6x 2 2 x yf (dy) + 6 yf (dy). Proof. In view of (7), we can write Q n in the following way Q n = h(x, y)f n (dx)f n (dy), (8) h(x, y) = (6x 2) {y<x} 6x {y>x}. In the fis step of the proof we write Q n as a sum of four terms as follows Q n = Q + Q 2n + Q 3n + Q 4n, Q = Q 2n = Q 3n = Q 4n = h(x, y)f (dx)f (dy) h(x, y)(f n (dx) F (dx))f (dy) h(x, y)(f n (dy) F (dy))f (dx) h(x, y)(f n (dx) F (dx))(f n (dy) F (dy)). By (6) and since F is continuous, we have that Q = Q. As to Q 2n + Q 3n, we obtain ϕ(x) = Q 2n + Q 3n = h(x, y)f (dx) + = 2xF (x) 6x 2 2 ϕ(x)(f n (dx) F (dx)), x h(x, y)f (dy) yf (dy) + 6 yf (dy). It remains to show that Q 4n = o P (n /2 ). For this, set τ H = inf{t : H(t) = }, H(t) = P(X t) is the cdf of the observed sample. Since supp(f ) [, ] and G fulfill assumption A, we have that τ H =. Moreover, by definitio of h(x, y), we can show that Q 4n = 2 x(f n (x) F (x))(f n (dy) F (dy)) 2(F n () F ()) 2 =: Q a 4n + Q b 4n. 6

8 Now, we may consider the two terms, Q a 4n and Q b 4n, separately. According to Theorem 2 (7) in Ying (989) and under A, the process n(f n F ) converges weakly to a Brownian process. See, also equation () in Wellner (27). More precisely, n(fn F ) ( F )B(C), in D[, τ H], D[, τ H] denotes the Skorohod space. Furthermore, since F is continuous and D is a set of uniformly bounded functions, we have that n(f n F ) D with probability exceeding ε for every ε >. Additionally, x [, ] and sup x [,τ H] F n (x) F (x) almost surely. Hence, using Theorem 2. in Rao (962) with g(x) = n(f n (x) F (x))x, we obtain n Qa 4n = 2 g(x)(f n (dy) F (dy)) = o P (). Additionally, under A, n Q b 4n = o P (). Finally, we obtain the following representation Q n = Q + ϕ(x)(f n (dx) F (dx)) + o P (n /2 ). The asymptotic normality is now a direct consequence of Stute (995). More precisely, under A and A2, we obtain This completes the proof. Consequently, we have n ϕ(x)(f n (dx) F (dx)) N (, σ 2 ). Corollary. Under H, A and A2, we have n Qn N (, σ 2 ). It is to see, that the variance under H would not simplify since it does depend on the distribution function of the censoring times G, which is unknown. Nevertheless, under the Koziol-Green model, we have an explicit expression for σ 2. First, recall that G follows a Koziol-Green model if G(x) = ( F (x)) β, β is an unknown parameter. We can see, however, that p := P (Y > C) = β β + and p = ( G(x))F (dx). Hence β can be easily estimated using Kaplan-Meier estimators for F and G. Finally, it is easy to check that the assumptions A and A2 are fulfille under the Koziol-Green model with β (, ), that is, if the censoring is not heavier than 5%, which is a very reasonable assumption. So, as a consequence of Corollary, we get the following result. 7

9 Corollary 2. Under Koziol-Green model with β (, ) we have that, under H, n Qn N (, σ 2 ), σ 2 = β 4 + 4β 3 7β β 24 (β )(β 2)(β 3)(β 4)(β 5). 4 Goodness-of-fi test Once the test statistic is proposed and its limiting distribution is established, we are in the position to defin the goodness-of-fi test. For this we estimate the asymptotic variance σ 2 using the Plug- In principle. That is, we replace the unknown quantities with its estimators. First, we defin distribution function of the observed times H(x) = P (X x) and set H n (x) = n n i= {X i x} as its empirical counterpart. Moreover, let and H (x) = P(X x, δ = ) = H (x) = P(X x, δ = ) = x x ( F (u))g(du) ( G(u))F (du) be the subdistributions related with the observed censored and uncensored lifetimes. Their estimators are define as follows and Hence σ 2 n = n n i= H n(x) = n {Xi x}( δ i ) i= H n(x) = n {Xi x}δ i. i= [ ϕ 2 n(x i ) ( G n (X i )) δ 2 i n i= δ i ( H n (X i )) 2 ϕ n (x) = 2xF n (x) 6x 2 2 n and G n is a Kaplan-Meier estimator given by i= G n (x) = X i x [ n j= i= ] 2 ϕ n (X i ) G n (X i ) δ i ϕ n (X j ) G n (X j ) δ j {Xj X i } X i δ i G n (X i ) {X i x} + 6 n [ ] δ i n k=. {X k X i } i= ] 2, X i δ i G n (X i ). the 8

10 Finally, we set T n := n Qn σ 2 n. (9) We reject H at level α if T n Φ (α/2) or T n Φ ( α/2), Φ is the inverse of the standard normal cdf. Additionally, under the Koziol-Green model and in view of Corollary 2, we defin T n := n Qn ˆσ 2, () and ˆσ 2 = ˆβ ˆβ 3 7 ˆβ ˆβ 24 ( ˆβ )( ˆβ 2)( ˆβ 3)( ˆβ 4)( ˆβ 5) ( ˆβ = ( G n (x))f n (dx)). As before we reject H at level α if T n Φ (α/2) or T n Φ ( α/2). 5 Simulation study Here we conduct an extensive simulation study to show the behavior of our test. In the following subsection we consider only the null hypothesis, while in Subsection 5.2 we include the power study under different alternatives. In both subsections we compare our method with the Pearsontype goodness-of-fi test proposed by Akritas (988). Following the notation of the previous section, we denote by T n and Tn our test statistics for the general censoring and under the Koziol-Green model, respectively. See, equations (9) and () for details. Moreover, we denote by A (nc) the χ 2 test proposed by Akritas (988), nc denotes the number of cells. 5. Null hypothesis In this section we present the results of the proposed methods under the null hypothesis and at 5% significanc level. As mentioned before, we consider our tests T n and Tn, together with the test presented by Akritas (988). Following the latter work, we consider nc = 2 and nc = 5 and denote these tests by A (2) and A (5), respectively. The results are based on 5 trials. As it is to see in the Table, our test T n and the one from Akritas hold very well the significanc level. As to the one based on Koziol-Green model, it holds the 5% level when censoring is low but for more than 2% of missing data, the variance σ 2 does not captures the variability of our Q n correctly and hence the significanc level is slightly overestimated for heavy censoring. 9

11 Table : Null Hypothesis % censoring 2% censoring 3% censoring T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn n = n = n = Power study To study the power of our test we consider two different models: Model : To test the uniformity (H : F = F U ) we choose three parametric families of alternative probability distributions with support on [, ]. (a) Lehmann alternatives, F θ (x) = x θ, x, θ > ; for θ = we have F θ = F U. (b) compressed uniform alternatives, F θ (x) = x θ 2θ, θ x θ, θ /2; and for θ = we have F θ = F U. (c) centered distributions having a U-shaped density for θ (, ), or wedge-shaped density for θ > F θ (x) = for θ = we have F θ = F U. { 2 (2x)θ, x /2 (2( 2 x))θ, /2 x Model 2: An exponentiality test (with parameter λ = ), the alternatives are Weibull distributions with parameters and θ. More precisely, F θ (x) = e xθ, θ = gives us the exponential distribution of the null hypothesis. Additionally, the censoring variable, C, is generated under the Koziol-Green model. That is, G(x) = ( F (x)) β, β = p and p = P(X > C) is the censoring level. p In the following figure and tables we present the power study at a 5% significanc level. Panels (a)- (c3) of Figure 2 contain the power of the test for Model and panels (d)-(d3) of Figure 2 contains the power under Model 2, for different sample sizes (n = 5,, 2) and one censoring level of 2%. All those figure are based on 2 trials. Moreover, Tables 2 are based on 5 trials and show the power under alternatives for two different sample sizes (n =, 2), censoring levels p =.,.2,.3 and different values of the parameters θ. In particular, for Model 2, we choose

12 θ = + Hn.5 and H { 4, 2, 2, 4}. Both, tables and figures include a comparison to the Pearson-type test proposed by Akritas (988). As before, we use the number of cells (nc) equal to 2 and (a) Model a), n = 5 (a2) Model a), n = (a3) Model a), n = (b) Model b), n = 5 (b2) Model b), n = (b3) Model b), n = (c) Model c), n = 5 (c2) Model c), n = (c3) Model c), n = (d) Model 2, n = 5 (d2) Model 2, n = (d3) Model 2, n = 2 Figure 2: Power study for Model (a c3) and Model 2 (d d3) for three different sample sizes and censoring rate p =.2. T n (solid line), A (5) (dashed line) and A (2) (dash-dotted line). Concerning the uniformity test (Model ), it is clear that for alternatives b) and c) our test outperforms

13 the one proposed by Akritas. Additionally, our test does not depend on the number of cells and the choice of cell boundaries. The influenc of the number of cells is made obvious in panels (a)-(c3) of Figure 2. While A (2) gives better results than A (5) in alternative a), the opposite can be observed in alternatives b) and c). Regarding the exponentiality test (Model 2), we get better results than the competitive test of Akritas (988) when the alternative is Weibull with parameter θ >. For θ <, our test reaches the high power of the Pearson-type test for big sample sizes. Notice, however, that in Model 2 and for all parameters θ, the test statistic under the Koziol-Green model, Tn, gives very good results independently on the sample size. Table 2: Power study for Model and Model 2. Model, Alterative a) n = p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn θ = θ = θ = n = 2 p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn θ = θ = θ = Model, Alterative b) n = p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn θ = θ = θ = n = 2 p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn θ = θ = θ = Model, Alterative c) n = p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn θ = θ = θ = n = 2 p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn θ = θ = θ = Model 2. Power study for θ = + Hn.5 n = p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn H = H = H = H = n = 2 p =. p =.2 p =.3 T n A (5) A (2) Tn T n A (5) A (2) Tn T n A (5) A (2) Tn H = H = H = H =

14 6 Further extensions and Application 6. Composite Null Hypothesis So far, our test T n has been designed to test a fully specifie null hypothesis. It does strongly base on the fact that the transformed lifetime F (X) is [, ] uniformly distributed under H : F = F. In this section we consider a more general case, when the distribution function to be tested depends on an unknown parameter λ. Let now consider the following null hypothesis H : F {F λ : λ R d }. In this case firs we need to estimate the parameter λ using, e.g., a maximum-likelihood estimator ˆλ. Clearly, if F λ is twice differentiable in λ and the estimator ˆλ is n consistent, by the Taylor expansion we have that Fˆλ(X) = U + O P (n /2 ), U = F λ (X) U[, ] under the null hypothesis H. The test statistic Q n should still admit a normal limit but the error term enters the variance of our test statistic and hence the asymptotic variance given in Theorem is no longer valid. Even though the theoretical properties of our test in the case of such a composite hypothesis are beyond the scope of this paper, to test this kind of hypothesis we propose a modifie test with jacknife estimator of the variance, which does take into account the estimation of parameters and works very well in practice. We proceed as follows:. Based on the sample X,..., X n, fin the maximum-likelihood estimator (MLE) ˆλ. 2. Defin the pseudo-values X i = Fˆλ(X i ) for i =,..., n. 3. Based on the sample X,..., X n, compute the test statistic Q n define in (7). 4. Compute the jacknife estimator of the variance following the steps For every i =,..., n, choose the subsample X,..., X i, X i+,..., X n and compute the MLE ˆλ ( i). Defin the pseudo-values X j = Fˆλ( i)(x j ) for j =,..., i, i +,..., n. Based on the the sample X,..., X i, X i+,..., X n, compute the test statistic Q ( i) n. Set Q n = n 5. Defin the test statistic n i= Q ( i) n. nv n ( Q n ) = (n ) J n := ( i= n Qn nv n ( Q. n ) Q ( i) n Q n ) 2, 3

15 6. Reject H, if J n Φ (α/2) or J n Φ ( α/2). To check the behavior of this new jacknife-test, J n, we study the hypothesis H : F {exp(λ) : λ (, )}, the alternatives come from the Weibull distribution. Our simulated sample comes from exp(λ = ) but λ is estimated using Maximum-Likelihood method. In Figure 3 we compare out test T n from Section 4 with the test based on J n Figure 3: Power study: J n (solid line) and T n (dashed line), n = 5 (left), n = (middle) and n = 2 (right), p = Real Data Example We illustrate the use of our test on the head-and-neck cancer data from Haghighi and Nikulin (24). These authors fitte the Power Generalized Weibull distribution F (x, σ, v, γ) to the data. We use here the truncated data set with Survival Times (in months) smaller than 2. It was motivated by the boxplot in Figure 4. This gives us 44 observations with around % censoring rate. We perform a goodness-of-fi test for the before-mentioned Power Generalized Weibull distribution F a (x, σ, v, γ) = F (x, σ, v, γ). Additionally, we also consider the Weibull distribution F b (x, σ, v) = F (x, σ, v, ) and the Exponential distribution F c (x, σ) = F (x, σ,, ), F (x, σ, v, γ) = exp ( ( + (x/σ) v ) /γ). First, we fitte the parameters using MLE under random censoring obtaining the estimators (ˆσ, ˆv, ˆγ) and the following distributions F a (x, 4.63,.82,.9), F b (x,.44, 8.45) and F c (x, 8.33). Then we applied our test J n and obtained the following p-values: p a =.86, p b =.88 and p c =. for Power Generalized Weibull, Weibull and Exponential, respectively. Hence, the results of the test confir what it is to see in Figure 4, that both Power Generalized Weibull and Weibull fi the data very well while the Exponential distribution is not adequate to describe the head-and-neck cancer data. 4

16 F a (x, ˆσ, ˆv, ˆγ) F b (x, ˆσ, ˆv) F c (x, ˆσ) χ 2

17 [2] Balakrishnan, N., Chimitova E., Vedernikova, M. (24) An empirical analysis of some nonparametric goodness-of-fi tests for censored data Communications in Statistics - Simulation and Computation, DOI:.8/ [3] Cuadras, C. M., Fortiana, J.(993) Continuous metric scaling and prediction Multivariate Analysis, Future Directions, vol. 2 (eds C. M. Cuadras and C. R. Rao), pp Amsterdam: North-Holland. [4] Fortiana, J., Grané, A. (23) Goodness-of-Fit Tests Based on Maximum Correlations and Their Orthogonal Decompositions J. Royal Statistical Society. Series B 65, pp [5] Grané, A. (22) Exact goodness-of-fi tests for censored data Ann. Inst. Stat. Math. 64, pp [6] Haghighi, F., Nikulin, H. (24) A chi-squared test for power generalized Weibull family for the head-and-neck cancer censored data Zapiski Naucznych Sjeminarov 3, pp [7] Kaplan, E. L., Meier, P. (958) Nonparametric estimation from incomplete observations Journal of the American Statistical Association 53, pp [8] Koziol J.A., Green S.B. (976) A Cramer-von Mises Statistic for Randomly Censored Data Biometrika 63, pp [9] Rao, R.R. (962) Relations Between Weak and Uniform Convergence of Measures with Applications Annals of Mathematical Statistics 33, pp [] Stute, W.(994) The Bias of Kaplan-Meier Integrals Scandinavian Journal of Statistics 2, pp [] Stute, W.(995) The Central Limit Theorem Under Random Censorship Annals of Statistics 23, pp [2] Wellner, J.A. (27) On an exponential bound for the Kaplan-Meier estimator Lifetime Data Analysis 3, pp [3] Ying, Z. (989) A note on the asymptotic properties of the product-limit estimator on the whole line Statistics & Probability Letters 7, pp

Exact goodness-of-fit tests for censored data

Exact goodness-of-fit tests for censored data Ann Inst Stat Math ) 64:87 3 DOI.7/s463--356-y Exact goodness-of-fit tests for censored data Aurea Grané Received: February / Revised: 5 November / Published online: 7 April The Institute of Statistical

More information

Exact goodness-of-fit tests for censored data

Exact goodness-of-fit tests for censored data Exact goodness-of-fit tests for censored data Aurea Grané Statistics Department. Universidad Carlos III de Madrid. Abstract The statistic introduced in Fortiana and Grané (23, Journal of the Royal Statistical

More information

Investigation of goodness-of-fit test statistic distributions by random censored samples

Investigation of goodness-of-fit test statistic distributions by random censored samples d samples Investigation of goodness-of-fit test statistic distributions by random censored samples Novosibirsk State Technical University November 22, 2010 d samples Outline 1 Nonparametric goodness-of-fit

More information

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters Communications for Statistical Applications and Methods 2017, Vol. 24, No. 5, 519 531 https://doi.org/10.5351/csam.2017.24.5.519 Print ISSN 2287-7843 / Online ISSN 2383-4757 Goodness-of-fit tests for randomly

More information

UNIVERSITÄT POTSDAM Institut für Mathematik

UNIVERSITÄT POTSDAM Institut für Mathematik UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam

More information

The Central Limit Theorem Under Random Truncation

The Central Limit Theorem Under Random Truncation The Central Limit Theorem Under Random Truncation WINFRIED STUTE and JANE-LING WANG Mathematical Institute, University of Giessen, Arndtstr., D-3539 Giessen, Germany. winfried.stute@math.uni-giessen.de

More information

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississippi University, MS38677 K-sample location test, Koziol-Green

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that Lecture 28 28.1 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X 1,..., X n with some unknown distribution and we would like to test the hypothesis that is equal to a particular distribution

More information

A scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations

A scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations Journal of Statistical Planning and Inference 18 (22) 85 97 www.elsevier.com/locate/jspi A scale-free goodness-of-t statistic for the exponential distribution based on maximum correlations J. Fortiana,

More information

1 Degree distributions and data

1 Degree distributions and data 1 Degree distributions and data A great deal of effort is often spent trying to identify what functional form best describes the degree distribution of a network, particularly the upper tail of that distribution.

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

Goodness-of-fit tests for the cure rate in a mixture cure model

Goodness-of-fit tests for the cure rate in a mixture cure model Biometrika (217), 13, 1, pp. 1 7 Printed in Great Britain Advance Access publication on 31 July 216 Goodness-of-fit tests for the cure rate in a mixture cure model BY U.U. MÜLLER Department of Statistics,

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests Biometrika (2014),,, pp. 1 13 C 2014 Biometrika Trust Printed in Great Britain Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests BY M. ZHOU Department of Statistics, University

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Chanseok Park 1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634 Short Title: Smooth

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

Likelihood ratio confidence bands in nonparametric regression with censored data

Likelihood ratio confidence bands in nonparametric regression with censored data Likelihood ratio confidence bands in nonparametric regression with censored data Gang Li University of California at Los Angeles Department of Biostatistics Ingrid Van Keilegom Eindhoven University of

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

Multivariate Statistics

Multivariate Statistics Multivariate Statistics Chapter 2: Multivariate distributions and inference Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2016/2017 Master in Mathematical

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Computational treatment of the error distribution in nonparametric regression with right-censored and selection-biased data

Computational treatment of the error distribution in nonparametric regression with right-censored and selection-biased data Computational treatment of the error distribution in nonparametric regression with right-censored and selection-biased data Géraldine Laurent 1 and Cédric Heuchenne 2 1 QuantOM, HEC-Management School of

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

2 Functions of random variables

2 Functions of random variables 2 Functions of random variables A basic statistical model for sample data is a collection of random variables X 1,..., X n. The data are summarised in terms of certain sample statistics, calculated as

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

Bivariate Paired Numerical Data

Bivariate Paired Numerical Data Bivariate Paired Numerical Data Pearson s correlation, Spearman s ρ and Kendall s τ, tests of independence University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring STAT 6385 Survey of Nonparametric Statistics Order Statistics, EDF and Censoring Quantile Function A quantile (or a percentile) of a distribution is that value of X such that a specific percentage of the

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

Estimation, Inference, and Hypothesis Testing

Estimation, Inference, and Hypothesis Testing Chapter 2 Estimation, Inference, and Hypothesis Testing Note: The primary reference for these notes is Ch. 7 and 8 of Casella & Berger 2. This text may be challenging if new to this topic and Ch. 7 of

More information

Lecture 21: Convergence of transformations and generating a random variable

Lecture 21: Convergence of transformations and generating a random variable Lecture 21: Convergence of transformations and generating a random variable If Z n converges to Z in some sense, we often need to check whether h(z n ) converges to h(z ) in the same sense. Continuous

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

Cramér-Type Moderate Deviation Theorems for Two-Sample Studentized (Self-normalized) U-Statistics. Wen-Xin Zhou

Cramér-Type Moderate Deviation Theorems for Two-Sample Studentized (Self-normalized) U-Statistics. Wen-Xin Zhou Cramér-Type Moderate Deviation Theorems for Two-Sample Studentized (Self-normalized) U-Statistics Wen-Xin Zhou Department of Mathematics and Statistics University of Melbourne Joint work with Prof. Qi-Man

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1, Economics 520 Lecture Note 9: Hypothesis Testing via the Neyman-Pearson Lemma CB 8., 8.3.-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let s return to the hypothesis testing problem

More information

Supplementary Materials for Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach

Supplementary Materials for Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach Supplementary Materials for Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach Part A: Figures and tables Figure 2: An illustration of the sampling procedure to generate a surrogate

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information

Smooth simultaneous confidence bands for cumulative distribution functions

Smooth simultaneous confidence bands for cumulative distribution functions Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang

More information

On the Comparison of Fisher Information of the Weibull and GE Distributions

On the Comparison of Fisher Information of the Weibull and GE Distributions On the Comparison of Fisher Information of the Weibull and GE Distributions Rameshwar D. Gupta Debasis Kundu Abstract In this paper we consider the Fisher information matrices of the generalized exponential

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

Goodness-of-fit test for the Cox Proportional Hazard Model

Goodness-of-fit test for the Cox Proportional Hazard Model Goodness-of-fit test for the Cox Proportional Hazard Model Rui Cui rcui@eco.uc3m.es Department of Economics, UC3M Abstract In this paper, we develop new goodness-of-fit tests for the Cox proportional hazard

More information

An Analysis of Record Statistics based on an Exponentiated Gumbel Model

An Analysis of Record Statistics based on an Exponentiated Gumbel Model Communications for Statistical Applications and Methods 2013, Vol. 20, No. 5, 405 416 DOI: http://dx.doi.org/10.5351/csam.2013.20.5.405 An Analysis of Record Statistics based on an Exponentiated Gumbel

More information

One-Sample Numerical Data

One-Sample Numerical Data One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

STAT 450: Final Examination Version 1. Richard Lockhart 16 December 2002

STAT 450: Final Examination Version 1. Richard Lockhart 16 December 2002 Name: Last Name 1, First Name 1 Stdnt # StudentNumber1 STAT 450: Final Examination Version 1 Richard Lockhart 16 December 2002 Instructions: This is an open book exam. You may use notes, books and a calculator.

More information

Goodness-of-Fit Testing with Empirical Copulas

Goodness-of-Fit Testing with Empirical Copulas with Empirical Copulas Sami Umut Can John Einmahl Roger Laeven Department of Econometrics and Operations Research Tilburg University January 19, 2011 with Empirical Copulas Overview of Copulas with Empirical

More information

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach By Shiqing Ling Department of Mathematics Hong Kong University of Science and Technology Let {y t : t = 0, ±1, ±2,

More information

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Mai Zhou 1 University of Kentucky, Lexington, KY 40506 USA Summary. Empirical likelihood ratio method (Thomas and Grunkmier

More information

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018 Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from

More information

Mathematics Qualifying Examination January 2015 STAT Mathematical Statistics

Mathematics Qualifying Examination January 2015 STAT Mathematical Statistics Mathematics Qualifying Examination January 2015 STAT 52800 - Mathematical Statistics NOTE: Answer all questions completely and justify your derivations and steps. A calculator and statistical tables (normal,

More information

Approximate Self Consistency for Middle-Censored Data

Approximate Self Consistency for Middle-Censored Data Approximate Self Consistency for Middle-Censored Data by S. Rao Jammalamadaka Department of Statistics & Applied Probability, University of California, Santa Barbara, CA 93106, USA. and Srikanth K. Iyer

More information

Asymptotic statistics using the Functional Delta Method

Asymptotic statistics using the Functional Delta Method Quantiles, Order Statistics and L-Statsitics TU Kaiserslautern 15. Februar 2015 Motivation Functional The delta method introduced in chapter 3 is an useful technique to turn the weak convergence of random

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Answer Key for STAT 200B HW No. 8

Answer Key for STAT 200B HW No. 8 Answer Key for STAT 200B HW No. 8 May 8, 2007 Problem 3.42 p. 708 The values of Ȳ for x 00, 0, 20, 30 are 5/40, 0, 20/50, and, respectively. From Corollary 3.5 it follows that MLE exists i G is identiable

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

The Delta Method and Applications

The Delta Method and Applications Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Weighted Marshall-Olkin Bivariate Exponential Distribution

Weighted Marshall-Olkin Bivariate Exponential Distribution Weighted Marshall-Olkin Bivariate Exponential Distribution Ahad Jamalizadeh & Debasis Kundu Abstract Recently Gupta and Kundu [9] introduced a new class of weighted exponential distributions, and it can

More information

Stat 710: Mathematical Statistics Lecture 31

Stat 710: Mathematical Statistics Lecture 31 Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:

More information

Conditional densities, mass functions, and expectations

Conditional densities, mass functions, and expectations Conditional densities, mass functions, and expectations Jason Swanson April 22, 27 1 Discrete random variables Suppose that X is a discrete random variable with range {x 1, x 2, x 3,...}, and that Y is

More information

Testing Exponentiality by comparing the Empirical Distribution Function of the Normalized Spacings with that of the Original Data

Testing Exponentiality by comparing the Empirical Distribution Function of the Normalized Spacings with that of the Original Data Testing Exponentiality by comparing the Empirical Distribution Function of the Normalized Spacings with that of the Original Data S.Rao Jammalamadaka Department of Statistics and Applied Probability University

More information

f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models

f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models IEEE Transactions on Information Theory, vol.58, no.2, pp.708 720, 2012. 1 f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models Takafumi Kanamori Nagoya University,

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

A NEW CLASS OF SKEW-NORMAL DISTRIBUTIONS

A NEW CLASS OF SKEW-NORMAL DISTRIBUTIONS A NEW CLASS OF SKEW-NORMAL DISTRIBUTIONS Reinaldo B. Arellano-Valle Héctor W. Gómez Fernando A. Quintana November, 2003 Abstract We introduce a new family of asymmetric normal distributions that contains

More information

Spline Density Estimation and Inference with Model-Based Penalities

Spline Density Estimation and Inference with Model-Based Penalities Spline Density Estimation and Inference with Model-Based Penalities December 7, 016 Abstract In this paper we propose model-based penalties for smoothing spline density estimation and inference. These

More information

Review and continuation from last week Properties of MLEs

Review and continuation from last week Properties of MLEs Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that

More information

Asymptotic Statistics-III. Changliang Zou

Asymptotic Statistics-III. Changliang Zou Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth

More information

Tobit and Interval Censored Regression Model

Tobit and Interval Censored Regression Model Global Journal of Pure and Applied Mathematics. ISSN 0973-768 Volume 2, Number (206), pp. 98-994 Research India Publications http://www.ripublication.com Tobit and Interval Censored Regression Model Raidani

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

Nonparametric Function Estimation with Infinite-Order Kernels

Nonparametric Function Estimation with Infinite-Order Kernels Nonparametric Function Estimation with Infinite-Order Kernels Arthur Berg Department of Statistics, University of Florida March 15, 2008 Kernel Density Estimation (IID Case) Let X 1,..., X n iid density

More information

Lecture 2: CDF and EDF

Lecture 2: CDF and EDF STAT 425: Introduction to Nonparametric Statistics Winter 2018 Instructor: Yen-Chi Chen Lecture 2: CDF and EDF 2.1 CDF: Cumulative Distribution Function For a random variable X, its CDF F () contains all

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

Week 9 The Central Limit Theorem and Estimation Concepts

Week 9 The Central Limit Theorem and Estimation Concepts Week 9 and Estimation Concepts Week 9 and Estimation Concepts Week 9 Objectives 1 The Law of Large Numbers and the concept of consistency of averages are introduced. The condition of existence of the population

More information

JEREMY TAYLOR S CONTRIBUTIONS TO TRANSFORMATION MODEL

JEREMY TAYLOR S CONTRIBUTIONS TO TRANSFORMATION MODEL 1 / 25 JEREMY TAYLOR S CONTRIBUTIONS TO TRANSFORMATION MODELS DEPT. OF STATISTICS, UNIV. WISCONSIN, MADISON BIOMEDICAL STATISTICAL MODELING. CELEBRATION OF JEREMY TAYLOR S OF 60TH BIRTHDAY. UNIVERSITY

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Lehrstuhl für Statistik und Ökonometrie. Diskussionspapier 87 / Some critical remarks on Zhang s gamma test for independence

Lehrstuhl für Statistik und Ökonometrie. Diskussionspapier 87 / Some critical remarks on Zhang s gamma test for independence Lehrstuhl für Statistik und Ökonometrie Diskussionspapier 87 / 2011 Some critical remarks on Zhang s gamma test for independence Ingo Klein Fabian Tinkl Lange Gasse 20 D-90403 Nürnberg Some critical remarks

More information

TUTORIAL 8 SOLUTIONS #

TUTORIAL 8 SOLUTIONS # TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level

More information

Lecture 32: Asymptotic confidence sets and likelihoods

Lecture 32: Asymptotic confidence sets and likelihoods Lecture 32: Asymptotic confidence sets and likelihoods Asymptotic criterion In some problems, especially in nonparametric problems, it is difficult to find a reasonable confidence set with a given confidence

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Chapter 2: Resampling Maarten Jansen

Chapter 2: Resampling Maarten Jansen Chapter 2: Resampling Maarten Jansen Randomization tests Randomized experiment random assignment of sample subjects to groups Example: medical experiment with control group n 1 subjects for true medicine,

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the

More information

Standardizing the Empirical Distribution Function Yields the Chi-Square Statistic

Standardizing the Empirical Distribution Function Yields the Chi-Square Statistic Standardizing the Empirical Distribution Function Yields the Chi-Square Statistic Andrew Barron Department of Statistics, Yale University Mirta Benšić Department of Mathematics, University of Osijek and

More information

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter

More information

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

More information