A Note on Weighted Least Square Distribution Fitting and Full Standardization of the Empirical Distribution Function


Andrew R. Barron, Mirta Benšić and Kristian Sabo

Abstract

The relationship between the norm square of the standardized cumulative distribution and the chi-square statistic is examined using the form of the covariance matrix as well as the projection perspective. This investigation enables us to give uncorrelated components of the chi-square statistic and to interpret these components as innovations standardizing the cumulative distribution values. It also enables us to discuss a difference in large sample properties for estimators that minimize distances between empirical and theoretical distributions evaluated at a fixed and at a random set of points.

Keywords: weighted least squares; minimum chi-square; empirical distribution; distribution fitting

This work was supported by the Croatian Science Foundation through research grant IP

1 Introduction

Let $X_1, X_2, \ldots, X_n$ be independent real valued random variables with distribution function $F$. Let $F_n$ be the empirical distribution function

$$F_n(t) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \le t\}}$$

and let $\sqrt{n}(F_n(t) - F(t))$, $t \in T$, be the centered empirical process evaluated at a set of points $T \subseteq \mathbb{R}$. It is familiar that when $F$ is an hypothesized distribution and $T = \mathbb{R}$ the maximum of the absolute value of this empirical process corresponds to the Kolmogorov-Smirnov test statistic, the average square corresponds to the Cramér-von Mises test statistic, and the average square with marginal standardization using the variance $F(t)(1 - F(t))$ produces the Anderson-Darling statistic (average with the distribution $F$) (see Anderson and Darling (1952)). The covariance of the empirical distribution function takes the form $\frac{1}{n}F(t)(1 - F(s))$ for $t \le s$, so the scaled process $\sqrt{n}(F_n - F)$ has covariance $F(t)(1 - F(s))$. For finite $T$ let $V$ denote the corresponding symmetric covariance matrix of the column vector $\sqrt{n}(F_n - F)$ with entries $\sqrt{n}(F_n(t) - F(t))$, $t \in T$.

Finite $T$ counterparts to the Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling statistics have been considered in Henze (1996) and Choulakian et al. (1994). In particular, a finite $T$ counterpart to the Anderson-Darling statistic is $n(F_n - F)^\tau (\mathrm{Diag}(V))^{-1}(F_n - F)$, which uses only the diagonal entries of $V$. Here we focus on the complete standardization of the empirical distribution restricted to $T = \{t_1, \ldots, t_k\}$, leading to the squared distance

$$n(F_n - F)^\tau V^{-1}(F_n - F) \tag{1}$$

and to estimation procedures that minimize it. The motivation, familiar from regression, is that the complete standardization produces more efficient estimators. Although such estimators are usually named weighted least squares (e.g. Swain et al. (1988)) or generalized least squares (e.g. Benšić (2014), Benšić (2015)), the tridiagonal form of the matrix $V^{-1}$ (see e.g. Barrett and Feinsilver (1978), Barrett (1979)) puts them in the minimum chi-square context (see e.g. Hartley and Pfaffenberger (1972)). Indeed, the norm square of the standardized empirical distribution given in expression (1) is in fact equal to the chi-square statistic

$$n\sum_{A \in \pi}\frac{(P_n(A) - P(A))^2}{P(A)}, \tag{2}$$

where $\pi$ is the partition of $\mathbb{R}$ into the $k+1$ intervals $A$ formed by consecutive values of $T = \{t_1, \ldots, t_k\}$, namely $A_1 = (-\infty, t_1]$, $A_2 = (t_1, t_2]$, ..., $A_k = (t_{k-1}, t_k]$ and $A_{k+1} = (t_k, \infty)$. Here $P_n(A_j) = F_n(t_j) - F_n(t_{j-1}) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \in A_j\}}$ and $P(A_j) = F(t_j) - F(t_{j-1})$, with $F(-\infty) = F_n(-\infty) = 0$ and $F(\infty) = F_n(\infty) = 1$.

We provide a simple explicit standardization. Indeed (1) and (2) are shown to be equal to the sum of squares

$$\sum_{j=1}^{k} Z_j^2 \tag{3}$$

of particular uncorrelated, zero mean and unit variance random variables $Z_j$ which are proportional to $F_n(t_{j+1})F(t_j) - F(t_{j+1})F_n(t_j)$. As we shall show, the equivalence of the forms (1), (2) and (3) holds also for the case of random sets $T$ with cut-points based on empirical quantiles.

In this note we address the relationship between the standardized cumulative distribution and the chi-square statistic by using the tridiagonal form of the matrix $V^{-1}$ as well as from the projection perspective. This enables us to give the uncorrelated components of the chi-square statistic and to discuss asymptotic properties of the estimators that minimize (1) for random and fixed choices of points in the finite set $T$.

In Section 2 we explain the framework used in this note and address two ways in which the relationship between the standardized cumulative distribution and the chi-square statistic can be seen. In Section 3 we present uncorrelated components of the chi-square statistic and interpret these components as innovations standardizing the cumulative distribution values. In the last section we discuss a difference in large sample properties for estimators that minimize (1) for fixed and random choices of points in $T$.

Some of these estimators can be interpreted as regression procedures. Such procedures, based on discrepancies between the empirical distribution function (or its transformation) and its theoretical counterpart, are often used for estimating distributional parameters. Examples include research concerning parametrized distributions for which the maximum likelihood estimate sometimes does not exist. We can find them in some textbooks (see e.g. Johnson et al. (1994) and Rinne (2009)) as well as in scientific papers which discuss and compare different estimation methods, especially in reliability and survival analysis (e.g. Torres (2014), Kundu and Raqab (2005), Benšić (2014), Dey (2014) and Bdair (2012)).

As these procedures are mainly applied to continuous distributions, the set of points $T$ at which the empirical and theoretical distributions are evaluated is obviously very important. It is natural to set $T$ to be random using empirical quantiles. That leads us to the distribution of uniform order statistics and, in the case of the distance (1), to the conventional weighted least squares estimator that seeks to minimize the distance between the vector of uniformized order statistics and the corresponding vector of expected values, proposed by Swain et al. (1988). However, it was mentioned in Swain et al. (1988), based on practice, that this method, based on order statistics, failed to achieve the quality that had been expected, and they suggested a different weighting matrix in Johnson's translation system. In contrast, the fixed choice of $T$ leads us directly to the classical Pearson minimum chi-square estimator, for which best asymptotically normal (BAN) distribution properties are well known (see e.g. Hsiao (2006) for its BAN properties and see also Amemiya (1976), Berkson (1949), Berkson (1980), Bhapkar (1966), Fisher (1924), Taylor (1953) for more about minimum chi-square estimation). However, the fixed choice is more naturally made with discrete distributions than with continuous ones. At the end of the last section we give an iterative procedure which does produce a BAN estimator through the minimization of (1) with random $T$, based on order statistics, and which can be naturally applied to continuous distributions.

2 Common Framework

Let $r_1, r_2, \ldots, r_{k+1}$ be random variables with sum 1, let $\rho_1, \rho_2, \ldots, \rho_{k+1}$ be their expectations, and let

$$R_j = \sum_{i=1}^{j} r_i \quad\text{and}\quad \mathcal{R}_j = \sum_{i=1}^{j}\rho_i$$

be their cumulative sums. We are interested in the differences $R_j - \mathcal{R}_j$. Suppose that there is a constant $c = c_n$ such that

$$\mathrm{Cov}(R_j, R_l) = \frac{1}{c}\,\mathcal{R}_j(1 - \mathcal{R}_l) = \frac{1}{c}\,V_{jl} \tag{4}$$

for $j \le l$. Let $R = (R_1, \ldots, R_k)^\tau$ and $\mathcal{R} = (\mathcal{R}_1, \ldots, \mathcal{R}_k)^\tau$. We highlight the relationship between $(R - \mathcal{R})^\tau V^{-1}(R - \mathcal{R})$ and $\sum_{j=1}^{k+1}\frac{(r_j - \rho_j)^2}{\rho_j}$, the structure of the inverse $V^{-1}$, and the construction of a version of $B(R - \mathcal{R})$ with uncorrelated entries, where $V^{-1} = B^\tau B$.

We have the following cases for $X_1, \ldots, X_n$ i.i.d. with distribution function $F$.

Case 1: With fixed $t_1 < \cdots < t_k$ and $t_0 = -\infty$, $t_{k+1} = \infty$ we set

$$R_j = F_n(t_j) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \le t_j\}}$$

with expectations $\mathcal{R}_j = F(t_j)$. These have increments $r_j = P_n(A_j) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \in A_j\}}$ and $\rho_j = P(A_j) = F(t_j) - F(t_{j-1})$ with intervals $A_j = (t_{j-1}, t_j]$. Now the covariance is $1/n$ times the covariance in a single draw, so (4) holds with $c = n$.

Case 2: With fixed $1 \le n_1 < n_2 < \cdots < n_k \le n$ and order statistics $X_{(n_1)} \le X_{(n_2)} \le \cdots \le X_{(n_k)}$ we set $t_j = X_{(n_j)}$ and

$$R_j = F(X_{(n_j)})$$

with expectation $\mathcal{R}_j = n_j/(n+1)$. These have increments $r_j = P(A_j)$ and $\rho_j = (n_j - n_{j-1})/(n+1)$. Now, when $F$ is continuous the joint distribution of the $R_j$ is the Dirichlet distribution of uniform quantiles and (4) holds for $c = n + 2$.

Note that in both cases we examine distribution properties of $R_j - \mathcal{R}_j$, which is $F_n(t_j) - F(t_j)$ in Case 1 and $F(t_j) - F_n(t_j)\,n/(n+1)$ in Case 2. Thus, the difference $R - \mathcal{R}$ is a vector of centered cumulative distributions. In Case 1 it is the centering of the empirical distribution at $t_1, \ldots, t_k$, and in Case 2 it is the centering of the hypothesized distribution function evaluated at the quantiles $X_{(n_1)}, X_{(n_2)}, \ldots, X_{(n_k)}$.

2.1 Relationship between the standardized cumulative distribution and the chi-square statistic

We have two approaches to appreciating the relationship between the standardized cumulative distribution and the chi-square statistic. The first one uses the tridiagonal form of $V^{-1}$. Namely, the triangle property (see Barrett and Feinsilver (1978)) of the covariance matrix

$$V = \begin{pmatrix}
\mathcal{R}_1(1-\mathcal{R}_1) & \mathcal{R}_1(1-\mathcal{R}_2) & \cdots & \mathcal{R}_1(1-\mathcal{R}_k)\\
\mathcal{R}_1(1-\mathcal{R}_2) & \mathcal{R}_2(1-\mathcal{R}_2) & \cdots & \mathcal{R}_2(1-\mathcal{R}_k)\\
\vdots & \vdots & \ddots & \vdots\\
\mathcal{R}_1(1-\mathcal{R}_k) & \mathcal{R}_2(1-\mathcal{R}_k) & \cdots & \mathcal{R}_k(1-\mathcal{R}_k)
\end{pmatrix}$$

and the general characterization of the inverses of positive definite symmetric tridiagonal matrices (see Barrett and Feinsilver (1978) and Barrett (1979)) enable us to express the inverse $V^{-1}$ in the following form:

$$V^{-1} = \begin{pmatrix}
\frac{1}{\rho_1}+\frac{1}{\rho_2} & -\frac{1}{\rho_2} & 0 & \cdots & 0\\
-\frac{1}{\rho_2} & \frac{1}{\rho_2}+\frac{1}{\rho_3} & -\frac{1}{\rho_3} & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
0 & \cdots & -\frac{1}{\rho_{k-1}} & \frac{1}{\rho_{k-1}}+\frac{1}{\rho_k} & -\frac{1}{\rho_k}\\
0 & \cdots & 0 & -\frac{1}{\rho_k} & \frac{1}{\rho_k}+\frac{1}{\rho_{k+1}}
\end{pmatrix}$$
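The tridiagonal form of $V^{-1}$ can be checked numerically without inverting anything: it suffices to multiply it by $V$ and compare with the identity. The following is a minimal sketch using only the Python standard library; the function names and the example cell probabilities `rho` are illustrative assumptions, not taken from the paper.

```python
def covariance_matrix(rho):
    """V[j][l] = min(R_j, R_l) * (1 - max(R_j, R_l)) for the cumulative sums R of rho."""
    k = len(rho) - 1
    cumulative, total = [], 0.0
    for p in rho[:k]:
        total += p
        cumulative.append(total)
    return [[min(cumulative[j], cumulative[l]) * (1.0 - max(cumulative[j], cumulative[l]))
             for l in range(k)] for j in range(k)]


def tridiagonal_inverse(rho):
    """Claimed inverse: diagonal 1/rho_j + 1/rho_{j+1}, off-diagonal -1/rho_{j+1}."""
    k = len(rho) - 1
    t = [[0.0] * k for _ in range(k)]
    for j in range(k):
        t[j][j] = 1.0 / rho[j] + 1.0 / rho[j + 1]
        if j + 1 < k:
            t[j][j + 1] = t[j + 1][j] = -1.0 / rho[j + 1]
    return t


def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    size = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(size)) for j in range(size)]
            for i in range(size)]


# Hypothetical expectations rho_1, ..., rho_{k+1} (they must sum to 1); here k = 4.
rho = [0.1, 0.2, 0.3, 0.25, 0.15]
product = matmul(covariance_matrix(rho), tridiagonal_inverse(rho))
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(len(product)) for j in range(len(product)))
print(max_error)  # should be at rounding-error level
```

Any other positive vector `rho` summing to 1 can be substituted; the product should remain the identity up to floating-point error.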

Also, as a consequence of the QR decomposition of a symmetric tridiagonal matrix (see e.g. Bar-On et al. (1997)) we can easily see that $V^{-1} = B^\tau B$, where

$$B = \begin{pmatrix}
-\frac{\mathcal{R}_2}{\sqrt{\mathcal{R}_1\mathcal{R}_2\rho_2}} & \frac{\mathcal{R}_1}{\sqrt{\mathcal{R}_1\mathcal{R}_2\rho_2}} & 0 & \cdots & 0\\
0 & -\frac{\mathcal{R}_3}{\sqrt{\mathcal{R}_2\mathcal{R}_3\rho_3}} & \frac{\mathcal{R}_2}{\sqrt{\mathcal{R}_2\mathcal{R}_3\rho_3}} & \cdots & 0\\
\vdots & & \ddots & \ddots & \vdots\\
0 & \cdots & 0 & -\frac{\mathcal{R}_k}{\sqrt{\mathcal{R}_{k-1}\mathcal{R}_k\rho_k}} & \frac{\mathcal{R}_{k-1}}{\sqrt{\mathcal{R}_{k-1}\mathcal{R}_k\rho_k}}\\
0 & \cdots & 0 & 0 & -\frac{1}{\sqrt{\mathcal{R}_k\rho_{k+1}}}
\end{pmatrix}.$$

Consequently, we have

$$(R - \mathcal{R})^\tau V^{-1}(R - \mathcal{R}) = \sum_{j=1}^{k+1}\frac{(r_j - \rho_j)^2}{\rho_j}. \tag{5}$$

Equation (5) can also be reached from projection properties. Namely, there is, of course, an invertible linear relationship between the cumulative $R_j$ and individual $r_j$ values via $R_j = \sum_{i=1}^{j} r_i$ and $r_j = R_j - R_{j-1}$, $j = 1, 2, \ldots, k+1$. Accordingly, we will have the same norm-squares $c_n(R - \mathcal{R})^\tau V^{-1}(R - \mathcal{R})$ and $c_n(r - \rho)^\tau C^{-1}(r - \rho)$ for standardized versions of the vectors $R$ and $r$, where $C/c_n$ is the covariance matrix of the vector $r$ with $C_{i,j} = \rho_i 1_{\{i=j\}} - \rho_i\rho_j$. These forms use vectors of length $k$, because the value $r_{k+1} = 1 - \sum_{j=1}^{k} r_j$ is linearly determined from the others. It is known (and easily checked) that the matrix $C^{-1}$ has entries $(C^{-1})_{i,j} = \frac{1}{\rho_i}1_{\{i=j\}} + \frac{1}{\rho_{k+1}}$ for $i, j = 1, 2, \ldots, k$ (matching the Fisher information of the multinomial), and one finds from this form that $(r - \rho)^\tau C^{-1}(r - \rho)$ is algebraically the same as $\sum_{j=1}^{k+1}\frac{(r_j - \rho_j)^2}{\rho_j}$, as stated in Neyman (1949). So this is another way to see (5).

Furthermore, using suitable orthogonal vectors one can see how the chi-square statistic arises as the norm square of the fully standardized cumulative distributions. The chi-square value $\sum_{j=1}^{k+1}\frac{(r_j - \rho_j)^2}{\rho_j}$ is the norm square $\|\xi - u\|^2$ of the difference between the vector with entries $\xi_j = r_j/\sqrt{\rho_j}$ and the unit vector $u$ with entries $\sqrt{\rho_j}$, for $j = 1, \ldots, k+1$. Here we examine the geometry of the situation in $\mathbb{R}^{k+1}$. The projection of $\xi$ in the direction of this unit vector has length $\xi^\tau u = \sum_{j=1}^{k+1}\frac{r_j}{\sqrt{\rho_j}}\sqrt{\rho_j}$, which equals 1. Accordingly, if $q_1, q_2, \ldots, q_k$ and $q_{k+1} = u$ are orthonormal vectors, then the chi-square value is the squared length of the projection of $\xi$ onto the space orthogonal to $u$, spanned by $q_1, \ldots, q_k$. So it is given by $\sum_{j=1}^{k} Z_j^2$ where $Z_j = \xi^\tau q_j$, $j = 1, 2, \ldots, k$, or equivalently $Z_j = (\xi - u)^\tau q_j$.

This sort of analysis is familiar in linear regression theory. A difference here is that the entries of $\xi$ are not uncorrelated. Nevertheless, the covariance $E(\xi - u)(\xi - u)^\tau$ reduces to $\frac{1}{c_n}[I - uu^\tau]$, since it has entries

$$E\,\frac{(r_j - \rho_j)(r_l - \rho_l)}{\sqrt{\rho_j\rho_l}} = \frac{1}{c_n}\,\frac{\rho_j 1_{\{j=l\}} - \rho_j\rho_l}{\sqrt{\rho_j\rho_l}},$$

which simplifies to $\frac{1}{c_n}\left(1_{\{j=l\}} - \sqrt{\rho_j}\sqrt{\rho_l}\right)$. Accordingly, $EZ_jZ_l = E\,q_j^\tau(\xi - u)(\xi - u)^\tau q_l = \frac{1}{c_n}q_j^\tau(I - uu^\tau)q_l$ is $\frac{1}{c_n}q_j^\tau q_l$, equal to 0 for $j \ne l$. Thus the $Z_j$ are indeed uncorrelated and have constant variance $1/c_n$. This is a standard way in which we know that the chi-square statistic with $k+1$ cells is a sum of $k$ uncorrelated and standardized variables (cf. Cramér (1946)).

3 A convenient choice of orthogonal vectors

Here we wish to benefit from an explicit choice of the orthonormal vectors $q_1, \ldots, q_k$ orthogonal to $q_{k+1} = u$. We are motivated in this by the analysis in Stigler (1984). For an i.i.d. sample $Y_1, \ldots, Y_n$ from $N(\mu, \sigma^2)$ the statistic $\sum_{j=1}^{n}(Y_j - \bar{Y}_n)^2$ is the sum of squares $\sum_{j=2}^{n}(Y_j - \bar{Y}_{j-1})^2\,\frac{j-1}{j}$ of the independent $N(0, \sigma^2)$ innovations (also known as standardized prediction errors)

$$Z_j = \frac{Y_j - \bar{Y}_{j-1}}{\sqrt{1 + \frac{1}{j-1}}}$$

and, accordingly, this sum of squares is explicitly $\sigma^2$ times a chi-square random variable with $n - 1$ degrees of freedom. These innovations decorrelate the vector of $(Y_i - \bar{Y}_n)$ using $q_j$ like those below, with $\rho_i$ replaced by $1/n$. According to Stigler (1984) and Kruskal (1946), analysis of this type originates with Helmert (1876) (cf. Rao (1973), pp. 182-183).

The analogous choice for our setting is to let $Z_j = \xi^\tau q_j$, where $q_1, \ldots, q_k, q_{k+1} \in \mathbb{R}^{k+1}$ are the normalizations of the following orthogonal vectors (the columns of the matrix) in $\mathbb{R}^{k+1}$:

$$\begin{pmatrix}
-\sqrt{\rho_1} & -\sqrt{\rho_1} & -\sqrt{\rho_1} & \cdots & -\sqrt{\rho_1} & \sqrt{\rho_1}\\
\frac{\mathcal{R}_1}{\sqrt{\rho_2}} & -\sqrt{\rho_2} & -\sqrt{\rho_2} & \cdots & -\sqrt{\rho_2} & \sqrt{\rho_2}\\
0 & \frac{\mathcal{R}_2}{\sqrt{\rho_3}} & -\sqrt{\rho_3} & \cdots & -\sqrt{\rho_3} & \sqrt{\rho_3}\\
0 & 0 & \frac{\mathcal{R}_3}{\sqrt{\rho_4}} & \cdots & -\sqrt{\rho_4} & \sqrt{\rho_4}\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & -\sqrt{\rho_k} & \sqrt{\rho_k}\\
0 & 0 & 0 & \cdots & \frac{\mathcal{R}_k}{\sqrt{\rho_{k+1}}} & \sqrt{\rho_{k+1}}
\end{pmatrix}$$

Essentially the same choices of orthogonal $q_j$ for the determination of uncorrelated components $Z_j$ of $\xi - u$ are found in Irwin (1949). See also Irwin (1942), as well as Lancaster (1949) and Lancaster (1965), where the matrix from Irwin (1949) is explained as a particular member of a class of generalizations of the Helmert matrix.

The norm of the $j$-th such column, for $j = 1, \ldots, k$, equals $\sqrt{\mathcal{R}_j + \frac{\mathcal{R}_j^2}{\rho_{j+1}}}$, which is $\sqrt{\frac{\mathcal{R}_j\mathcal{R}_{j+1}}{\rho_{j+1}}}$, so that, for $j = 1, \ldots, k$,

$$q_j = \frac{1}{\sqrt{\mathcal{R}_j\mathcal{R}_{j+1}/\rho_{j+1}}}\left[-\sqrt{\rho_1}, \ldots, -\sqrt{\rho_j},\ \frac{\mathcal{R}_j}{\sqrt{\rho_{j+1}}},\ 0, \ldots, 0\right]^\tau$$

and $Z_j = \xi^\tau q_j$ with $\xi_i = r_i/\sqrt{\rho_i}$ becomes

$$Z_j = \frac{-r_1 - \cdots - r_j + r_{j+1}\frac{\mathcal{R}_j}{\rho_{j+1}}}{\sqrt{\mathcal{R}_j\mathcal{R}_{j+1}/\rho_{j+1}}},$$

which is

$$Z_j = \frac{r_{j+1}\mathcal{R}_j - R_j\rho_{j+1}}{\sqrt{\mathcal{R}_j\mathcal{R}_{j+1}\rho_{j+1}}}$$

or, equivalently, for $j = 1, 2, \ldots, k$,

$$Z_j = \frac{R_{j+1}\mathcal{R}_j - R_j\mathcal{R}_{j+1}}{\sqrt{\mathcal{R}_j\mathcal{R}_{j+1}\rho_{j+1}}},$$

which are the innovations of the cumulative values $R_{j+1}$ (the standardized error of linear prediction of $R_{j+1}$ using $R_1, \ldots, R_j$). As a consequence of the above properties of the $q_j$, these $Z_j$ are mean zero, orthogonal, and of constant variance $1/c_n$. Each of these facts can be checked directly using $ER_j = \mathcal{R}_j$ and the specified form of the covariance $\mathrm{Cov}(R_j, R_l) = \frac{1}{c_n}[\min(\mathcal{R}_j, \mathcal{R}_l) - \mathcal{R}_j\mathcal{R}_l]$.

To summarize this section, let us point out that we find an explicit standardization

$$Z_j = \frac{F_n(t_{j+1})F(t_j) - F_n(t_j)F(t_{j+1})}{c_{n,j}}, \quad j = 1, \ldots, k. \tag{6}$$

These random variables have mean 0 and variance 1 (with $c_{n,j} = \sqrt{F(t_j)F(t_{j+1})P(A_{j+1})/n}$) and they are uncorrelated for $j = 1, \ldots, k$. Moreover, the sum of squares $\sum_{j=1}^{k} Z_j^2$ is precisely equal to the statistics given in expressions (1) and (2). It corresponds to a bidiagonal Cholesky decomposition $B^\tau B$ of the inverse covariance matrix of $F_n - F$ (which is $nV^{-1}$ in the scaling of Section 2), with $B$ given by $-F(t_{j+1})/c_{n,j}$ for the $(j, j)$ entries, $F(t_j)/c_{n,j}$ for the $(j, j+1)$ entries and 0 otherwise, yielding the vector $Z = B(F_n - F)$, where $F = (F(t_1), \ldots, F(t_k))^\tau$, as a full standardization of the vector $F_n = (F_n(t_1), \ldots, F_n(t_k))^\tau$. The $Z_j$ may also be written as

$$Z_j = \frac{P_n(A_{j+1})F(t_j) - F_n(t_j)P(A_{j+1})}{c_{n,j}}, \tag{7}$$

so its marginal distribution (with an hypothesized $F$) comes from the trinomial distribution of $(nF_n(t_j), nP_n(A_{j+1}))$. These uncorrelated $Z_j$, though not independent, suggest approximating the distribution of $\sum_j Z_j^2$ by the convolution of the distributions of the $Z_j^2$ rather than by the asymptotic chi-square. Nevertheless, when $t_1, \ldots, t_k$ are fixed, it is clear by the multivariate central limit theorem (for the standardized sum of the i.i.d. random variables comprising $P_n(A_{j+1})$ and $F_n(t_j)$ from (7)) that the joint distribution of $Z = (Z_1, \ldots, Z_k)^\tau$ is asymptotically Normal$(0, I)$, providing a direct path to the asymptotic chi-square($k$) distribution of the statistic given equivalently in (1), (2) and (3).

4 Large sample estimation properties

The results of the previous sections will be used to discuss the asymptotic efficiency of weighted least squares estimators related to Case 1 and Case 2 of Section 2. Let us consider an i.i.d. random sample $X_1, \ldots, X_n$ with distribution function from a parametric family $F_\theta$, $\theta \in \Theta \subseteq \mathbb{R}^p$, points $t_1 < \cdots < t_k$, and let $R_n = R$ and $\mathcal{R}_n = \mathcal{R}$ be as in Section 2. The vector $R_n - \mathcal{R}_n$, which we denote by $(R_n - \mathcal{R}_n)(\theta)$, can be considered as a vector depending on the data and the parameter. Let $\theta_0$ denote the true parameter value. If $(R_n - \mathcal{R}_n)(\theta_0)$ converges to zero in probability $P_{\theta_0}$, we may use the weighted least squares procedure for parameter estimation, so that we minimize

$$Q_n(\theta) = (R_n - \mathcal{R}_n)^\tau(\theta)\,V^{-1}\,(R_n - \mathcal{R}_n)(\theta)$$

for $\theta \in \Theta$. Here $V$ is the covariance matrix of $R_n - \mathcal{R}_n$ evaluated at the true value $\theta_0$ or at a consistent estimator of the true value. Both cases from Section 2, i.e. fixed and random $t_1, \ldots, t_k$ considered in the estimation context, fulfill this requirement.

Indeed, for Case 2 (random $t_1 < \cdots < t_k$, $t_j = X_{(n_j)}$; it coincides with the estimator proposed in Swain et al. (1988)) we have

$$R_n(\theta) = [F_\theta(X_{(n_1)}), \ldots, F_\theta(X_{(n_k)})]^\tau, \qquad \mathcal{R}_n = [F_n(X_{(n_1)}), \ldots, F_n(X_{(n_k)})]^\tau\,\frac{n}{n+1}.$$

Here only $R_n(\theta)$ depends on $\theta$, and $F_{\theta_0}(X_{(n_j)})$ has a Beta$(n_j, n + 1 - n_j)$ distribution with $E_{\theta_0}[F_{\theta_0}(X_{(n_j)})] = \frac{n_j}{n+1} = F_n(X_{(n_j)})\frac{n}{n+1}$, so that

$$\mathcal{R}_n = \left[\frac{n_1}{n+1}, \ldots, \frac{n_k}{n+1}\right]^\tau.$$

For Case 1 (fixed $t_1 < \cdots < t_k$; it coincides with the estimator considered in Benšić (2015)) we have

$$R_n = [F_n(t_1), \ldots, F_n(t_k)]^\tau, \quad F_n(x) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \le x\}}, \qquad \mathcal{R}_n(\theta) = E_\theta R_n = [F_\theta(t_1), \ldots, F_\theta(t_k)]^\tau.$$

Here only the expectation $\mathcal{R}_n(\theta)$ depends on $\theta$. In both cases, the convergence in probability $P_{\theta_0}$ of $(R_n - \mathcal{R}_n)(\theta_0)$ to zero is a consequence of the form of the variances of the mean zero differences, which are $(1/n)\mathcal{R}_j(1 - \mathcal{R}_j)$ in Case 1 and $(1/(n+2))\mathcal{R}_j(1 - \mathcal{R}_j)$ in Case 2.
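For fixed cut points (Case 1), the standardized distance can be evaluated in three ways that should agree: the quadratic form (1) with the tridiagonal $V^{-1}$, the chi-square statistic (2), and the sum of squared innovations (6). The sketch below compares them on a small made-up sample; the sample, the cut points, the uniform hypothesized distribution and all function names are illustrative assumptions rather than anything specified in the paper.

```python
import math


def ecdf(sample, t):
    """Empirical distribution function F_n(t)."""
    return sum(1 for x in sample if x <= t) / len(sample)


def three_forms(sample, cuts, cdf):
    """Return (quadratic form (1), chi-square (2), sum of squared innovations (6))."""
    n, k = len(sample), len(cuts)
    # Cumulative values padded with F(-inf) = 0 and F(+inf) = 1.
    fn = [0.0] + [ecdf(sample, t) for t in cuts] + [1.0]
    f = [0.0] + [cdf(t) for t in cuts] + [1.0]
    pn = [fn[j] - fn[j - 1] for j in range(1, k + 2)]  # P_n(A_1), ..., P_n(A_{k+1})
    p = [f[j] - f[j - 1] for j in range(1, k + 2)]     # P(A_1), ..., P(A_{k+1})
    d = [fn[j] - f[j] for j in range(1, k + 1)]        # F_n(t_j) - F(t_j)

    # (1): n * d' V^{-1} d, using the tridiagonal entries of V^{-1}.
    quad = sum((1.0 / p[j] + 1.0 / p[j + 1]) * d[j] ** 2 for j in range(k))
    quad -= 2.0 * sum(d[j] * d[j + 1] / p[j + 1] for j in range(k - 1))
    quad *= n

    # (2): Pearson chi-square over the k + 1 cells.
    chi2 = n * sum((a - b) ** 2 / b for a, b in zip(pn, p))

    # (6): innovations Z_j; p[j] holds P(A_{j+1}) because the list is 0-based.
    z = []
    for j in range(1, k + 1):
        c = math.sqrt(f[j] * f[j + 1] * p[j] / n)
        z.append((fn[j + 1] * f[j] - fn[j] * f[j + 1]) / c)
    return quad, chi2, sum(v * v for v in z)


# Hypothetical data and cut points; the hypothesized F is uniform on [0, 1].
sample = [0.05, 0.12, 0.31, 0.33, 0.48, 0.52, 0.70, 0.77, 0.91, 0.95]
cuts = [0.25, 0.5, 0.75]
quadratic_form, chi_square, sum_of_squares = three_forms(sample, cuts, lambda t: t)
print(quadratic_form, chi_square, sum_of_squares)
```

Replacing the lambda with a parametric `cdf` turns any of the three returned values into the Case 1 objective as a function of the parameter, under the convention that the weights are evaluated at the same parameter value.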

However, there is a difference in the analysis of estimation properties for the two mentioned cases. Let us discuss them separately.

Case 1. For fixed $t_1 < \cdots < t_k$, $F_\theta = (F_\theta(t_1), \ldots, F_\theta(t_k))^\tau$ and $F_n = (F_n(t_1), \ldots, F_n(t_k))^\tau$ we can express the functional $Q_{n,V}(\theta)$ as

$$Q_{n,V}(\theta) = (F_n - F_\theta)^\tau V^{-1}(F_n - F_\theta).$$

For the complete standardization we should use the matrix $V_\theta = \mathrm{Var}_\theta[F_n(t_1) - F_\theta(t_1), \ldots, F_n(t_k) - F_\theta(t_k)]^\tau$, which depends on the parameter. The results summarized in Section 2 then guarantee that minimizing the functional $Q_n(\theta) = Q_{n,V_\theta}(\theta)$ leads to the classical Pearson minimum chi-square estimator (see e.g. Hsiao (2006) for its best asymptotically normal (BAN) distribution properties and see also Amemiya (1976), Berkson (1949), Berkson (1980), Bhapkar (1966), Fisher (1924), Taylor (1953) for more about minimum chi-square estimation).

This estimation procedure can also be set in the framework of the generalized method of moments (GMM). Indeed, if in the functional $Q_{n,V}(\theta)$ we use a fixed $V$, or we use $V_{\tilde\theta}$ instead of $V_\theta$, where $\tilde\theta$ is a consistent estimator of the true parameter value, then, as shown in Benšić (2015), this estimation procedure can be seen as a GMM procedure. Let $\hat\theta_{k,n}$ denote the estimator obtained by minimization of the functional $Q_{n,V_{\tilde\theta}}(\theta)$. Refining the notation from Section 2,

$$A_i = (t_{i-1}, t_i],\ i = 1, \ldots, k, \qquad A_{k+1} = (t_k, \infty),$$
$$P_n(A_i) = F_n(t_i) - F_n(t_{i-1}), \qquad \tilde{P}(A_i) = F_{\tilde\theta}(t_i) - F_{\tilde\theta}(t_{i-1}),$$
$$P(A_i; \theta) = F_\theta(t_i) - F_\theta(t_{i-1}), \qquad P_0(A_i) = F_{\theta_0}(t_i) - F_{\theta_0}(t_{i-1}),$$

from the tridiagonal form of the weighting matrix we see that

$$\hat\theta_{k,n} = \operatorname*{argmin}_{\theta \in \Theta}\sum_{i=1}^{k+1}\frac{(P_n(A_i) - P(A_i; \theta))^2}{\tilde{P}(A_i)}. \tag{8}$$

If the classical assumptions of the generalized method of moments theory are fulfilled (see e.g. Newey and McFadden (1994), Harris and Mátyás (1999)), it is shown in Benšić (2015) that $\lim_n[n\,\mathrm{Var}(\hat\theta_{k,n})]$ has inverse $G_0^\tau V_0^{-1}G_0$, where $G_0$ and $V_0$ are, respectively, the matrix $\frac{\partial}{\partial\theta^\tau}[F_\theta(t_1), \ldots, F_\theta(t_k)]^\tau$ and the covariance matrix of $\sqrt{n}[F_n(t_1), \ldots, F_n(t_k)]^\tau$, both evaluated at the true parameter value $\theta_0$. And again, the tridiagonal form of $V_0^{-1}$ enables us to simplify this limit. Indeed, if the model is regular, let

$$S(x) = \left.\frac{\partial}{\partial\theta}\log f(x, \theta)\right|_{\theta_0}$$

be the population score function evaluated at the true parameter value. Now we have

$$G_0 = \begin{pmatrix}[E_{\theta_0}(S\,1_{(-\infty, t_1]})]^\tau\\ \vdots\\ [E_{\theta_0}(S\,1_{(-\infty, t_k]})]^\tau\end{pmatrix}$$

and

$$G_0^\tau V_0^{-1}G_0 = \sum_{i=1}^{k+1}\frac{1}{P_0(A_i)}\int_{t_{i-1}}^{t_i}S(x)f(x; \theta_0)\,dx\int_{t_{i-1}}^{t_i}S^\tau(x)f(x; \theta_0)\,dx$$
$$= \sum_{i=1}^{k+1}P_0(A_i)\,\frac{\int_{t_{i-1}}^{t_i}S(x)f(x; \theta_0)\,dx}{P_0(A_i)}\,\frac{\int_{t_{i-1}}^{t_i}S^\tau(x)f(x; \theta_0)\,dx}{P_0(A_i)}$$
$$= \sum_{i=1}^{k+1}P_0(A_i)\,E_{\theta_0}[S(X) \mid A_i]\,E_{\theta_0}[S(X) \mid A_i]^\tau.$$

This can be interpreted as a Riemann-Stieltjes discretization of the Fisher information, which arises in the limit of large $k$.

Let us note the similarity of $\hat\theta_{k,n}$ and the minimum chi-square estimator. From (8) we see that they differ only in the denominator, so we can interpret $\hat\theta_{k,n}$ as a modified minimum chi-square estimator. It is well known that various minimum chi-square estimators are in fact generalized least squares (see e.g. Amemiya (1976), Harris and Kanji (1983), Hsiao (2006)) and BAN estimators. Likewise, the norm square of the standardized empirical distribution has been known to also provide a generalized least squares estimator. Here we give a clear and simple way to summarize these findings through the complete standardization of the empirical distribution.

Case 2. In this case we have $t_j = X_{(n_j)}$, so that the value $F_n(t_j) = n_j/n$ is predetermined. The random part $F_\theta(X_{(n_j)})$ of $Q_n(\theta)$ depends on the parameter. But the matrix $V$ (which should be used for complete standardization) does not depend on $\theta$ and can be computed from the covariances of uniform order statistics. Now, the results summarized in Section 2 (see also Swain et al. (1988)) enable us to represent the minimizer of the functional $Q_n(\theta)$ as

$$\hat\theta = \operatorname*{argmin}_{\theta \in \Theta}\sum_{i=1}^{k+1}\frac{\left(\left(F_\theta(X_{(n_i)}) - F_\theta(X_{(n_{i-1})})\right) - \frac{n_i - n_{i-1}}{n+1}\right)^2}{\frac{n_i - n_{i-1}}{n+1}}, \tag{9}$$

with the conventions $n_0 = 0$, $n_{k+1} = n + 1$, $F_\theta(X_{(n_0)}) = 0$ and $F_\theta(X_{(n_{k+1})}) = 1$.

As mentioned in Swain et al. (1988), page 276, based on practice, there is a weight matrix which yields fits to empirical CDFs that are usually superior in many respects to the estimator (9). Nowadays, this can be explained in view of the generalized spacing estimator (GSE) (see Ghosh and Rao Jammalamadaka (2001), Cheng and Amin (1983), Ranneby (1984)). To discuss this, let us suppose for simplicity that all data are different and $k = n$, so that the estimator can be easily recognized as a GSE. Namely, if $n_i - n_{i-1} = 1$ then

$$Q_n(\theta) = (n+2)(n+1)\sum_{i=1}^{n+1}\left(F_\theta(X_{(n_i)}) - F_\theta(X_{(n_{i-1})}) - \frac{1}{n+1}\right)^2.$$

Since the spacings sum to 1, we obviously have

$$\hat\theta_n = \operatorname*{argmin}_{\theta \in \Theta}\sum_{i=1}^{n+1}\left(F_\theta(X_{(n_i)}) - F_\theta(X_{(n_{i-1})})\right)^2 = \operatorname*{argmin}_{\theta \in \Theta}\sum_{i=1}^{n+1}h\left(F_\theta(X_{(n_i)}) - F_\theta(X_{(n_{i-1})})\right), \tag{10}$$

where $h(t) = t^2$. A detailed discussion of the conditions for consistency and asymptotic normality of this type of estimator can be found in Ghosh and Rao Jammalamadaka (2001). If we apply these results with $h(t) = t^2$, it turns out that $\hat\theta_n$ lacks BAN distribution properties. To illustrate this, let us suppose, for simplicity, that $\theta$ is a scalar. Theorem 3.2 from Ghosh and Rao Jammalamadaka (2001) gives a necessary and sufficient condition on $h$ to generate a GSE with minimum variance for a given class of functions which includes $h(t) = t^2$. It is stated there that the asymptotic variance of a GSE is minimized with $h(t) = a\log t + bt + c$, where $a$, $b$ and $c$ are constants. Based on the results formulated in Theorem 3.1 of the same paper, it is also possible to calculate the asymptotic variance of the GSE for a given function $h$ under some regularity conditions on the population density. Thus, for $h(t) = t^2$ the expression (9) in Theorem 3.1 of Ghosh and Rao Jammalamadaka (2001) equals 2, which means that the asymptotic variance of our estimator (under mild conditions on the population density) is $\frac{2}{I(\theta_0)}$, where $I(\theta_0)$ denotes the Fisher information. So, for these cases $\hat\theta_n$ from (10) is not BAN. It is only 50% efficient asymptotically.

However, it is possible to reach the BAN distribution property for Case 2 and $k = n$ through an iterative procedure which modifies the denominator in (9) in each step:

1. Let $Q_n(\theta, \theta') = \sum_{i=1}^{n+1}\dfrac{\left(F_\theta(X_{(i)}) - F_\theta(X_{(i-1)}) - \frac{1}{n+1}\right)^2}{F_{\theta'}(X_{(i)}) - F_{\theta'}(X_{(i-1)})}$.
2. Let $\tilde\theta$ be a consistent estimator of the true $\theta$.
3. $\theta_1 = \tilde\theta$, $\theta_{j+1} = \operatorname*{argmin}_\theta Q_n(\theta, \theta_j)$, $j = 1, 2, \ldots$

To show this, let us denote $F_\theta = [F_\theta(X_{(1)}), \ldots, F_\theta(X_{(n)})]^\tau$, $e = [1, 2, \ldots, n]^\tau \in \mathbb{R}^n$ (so that $\frac{1}{n+1}e$ is the vector of expected values $i/(n+1)$), $G_\theta = \frac{\partial}{\partial\theta^\tau}[F_\theta(X_{(1)}), \ldots, F_\theta(X_{(n)})]^\tau$, and

$$V_\theta = \begin{pmatrix}
F_\theta(X_{(1)})(1 - F_\theta(X_{(1)})) & F_\theta(X_{(1)})(1 - F_\theta(X_{(2)})) & \cdots & F_\theta(X_{(1)})(1 - F_\theta(X_{(n)}))\\
F_\theta(X_{(1)})(1 - F_\theta(X_{(2)})) & F_\theta(X_{(2)})(1 - F_\theta(X_{(2)})) & \cdots & F_\theta(X_{(2)})(1 - F_\theta(X_{(n)}))\\
\vdots & \vdots & \ddots & \vdots\\
F_\theta(X_{(1)})(1 - F_\theta(X_{(n)})) & F_\theta(X_{(2)})(1 - F_\theta(X_{(n)})) & \cdots & F_\theta(X_{(n)})(1 - F_\theta(X_{(n)}))
\end{pmatrix}.$$

As in the Gauss-Newton method for nonlinear least squares, here we consider the quadratic approximation

$$\theta \mapsto \hat{Q}_n(\theta, \theta_j) = \left(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \frac{1}{n+1}e\right)^\tau V_{\theta_j}^{-1}\left(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \frac{1}{n+1}e\right)$$

of the function

$$\theta \mapsto Q_n(\theta, \theta_j) = \left(F_\theta - \frac{1}{n+1}e\right)^\tau V_{\theta_j}^{-1}\left(F_\theta - \frac{1}{n+1}e\right)$$

about the point $\theta_j$.
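Minimizing this quadratic approximation is a linear weighted least squares problem in $\theta - \theta_j$, so each iteration can be carried out in closed form. The following is a minimal sketch for a scalar parameter, assuming (purely for illustration) the exponential model $F_\theta(x) = 1 - e^{-\theta x}$; the model, the function names and the stopping rule are assumptions, not part of the paper. It uses the tridiagonal form of $V_{\theta_j}^{-1}$, so the matrix products reduce to sums over the $n + 1$ spacings.

```python
import math


def spacings_and_slopes(theta, xs):
    """Spacings s_i = F(x_(i)) - F(x_(i-1)) and their derivatives with respect to theta,
    for the assumed exponential model F_theta(x) = 1 - exp(-theta * x)."""
    xs = sorted(xs)
    f = [0.0] + [1.0 - math.exp(-theta * x) for x in xs] + [1.0]
    g = [0.0] + [x * math.exp(-theta * x) for x in xs] + [0.0]
    s = [f[i] - f[i - 1] for i in range(1, len(f))]
    dg = [g[i] - g[i - 1] for i in range(1, len(g))]
    return s, dg


def gauss_newton_step(theta, xs):
    """One step theta + (G'V^{-1}G)^{-1} G'V^{-1}(e/(n+1) - F), written with spacings."""
    s, dg = spacings_and_slopes(theta, xs)
    target = 1.0 / len(s)  # each spacing has expectation 1/(n+1)
    numerator = sum(d * (target - si) / si for d, si in zip(dg, s))
    denominator = sum(d * d / si for d, si in zip(dg, s))
    return theta + numerator / denominator


def log_spacing_score(theta, xs):
    """Derivative of the sum of log spacings with respect to theta."""
    s, dg = spacings_and_slopes(theta, xs)
    return sum(d / si for d, si in zip(dg, s))


def iterate(xs, theta0, max_iter=50, tol=1e-12):
    """Repeat the step until it stalls; convergence is assumed, not guaranteed."""
    theta = theta0
    for _ in range(max_iter):
        new_theta = gauss_newton_step(theta, xs)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


# Hypothetical sample; the start value 1 / mean is the exponential maximum likelihood estimate.
data = [0.2, 0.5, 0.9, 1.4, 2.3]
theta_hat = iterate(data, theta0=len(data) / sum(data))
print(theta_hat, log_spacing_score(theta_hat, data))
```

If the iteration settles, the printed score should be close to zero, which is the stationarity condition of the log-spacing criterion. For other models only `spacings_and_slopes` would need to change.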

Instead of the nonlinear optimization problem $\min_\theta Q_n(\theta, \theta_j)$, in every iteration we solve the simpler problem $\min_\theta \hat{Q}_n(\theta, \theta_j)$, which has an explicit solution. The corresponding iterative procedure is then given by

$$\theta_{j+1} = \operatorname*{argmin}_\theta\left(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \frac{1}{n+1}e\right)^\tau V_{\theta_j}^{-1}\left(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \frac{1}{n+1}e\right),$$

or explicitly

$$\theta_{j+1} = \theta_j + \left(G_{\theta_j}^\tau V_{\theta_j}^{-1}G_{\theta_j}\right)^{-1}G_{\theta_j}^\tau V_{\theta_j}^{-1}\left(\frac{1}{n+1}e - F_{\theta_j}\right), \quad j = 1, 2, \ldots \tag{11}$$

where $\frac{1}{n+1}e$ is the vector of expected values $i/(n+1)$, $i = 1, \ldots, n$. If we suppose that the sequence $(\theta_j)$ is convergent, then

$$G_{\theta_j}^\tau V_{\theta_j}^{-1}\left(\frac{1}{n+1}e - F_{\theta_j}\right) \to 0,$$

i.e. the limit of the sequence $(\theta_j)$ is a solution of the equation

$$G_\theta^\tau V_\theta^{-1}\left(\frac{1}{n+1}e - F_\theta\right) = 0. \tag{12}$$

Let us consider the function

$$S(\theta) = \sum_{i=1}^{n+1}h\left(F_\theta(X_{(i)}) - F_\theta(X_{(i-1)})\right),$$

where $h(t) = \log t$. Note that, by the tridiagonal form of $V_\theta^{-1}$,

$$S'(\theta) = (n+1)\,G_\theta^\tau V_\theta^{-1}\left(\frac{1}{n+1}e - F_\theta\right),$$

i.e. the condition $S'(\theta) = 0$ is exactly the same as equation (12). Thus we have shown the following: if the sequence $(\theta_j)$ given by (11) is convergent, then it converges to a stationary point of the function

$$\theta \mapsto \sum_{i=1}^{n+1}h\left(F_\theta(X_{(n_i)}) - F_\theta(X_{(n_{i-1})})\right),$$

where $h(t) = \log t$. Here, $Q_n(\theta, \theta')$ is algebraically the same functional as the one described in Case 1 if we intentionally choose the fixed $t_j$ to be the same as $x_{(n_j)}$ and behave as if we were in Case 1.

5 Conclusion

In previous work (Benšić (2014)) it has been shown by simulations that fully standardizing the cumulative distribution produces estimators that are superior to those that minimize the Cramér-von Mises and Anderson-Darling statistics. Now, as a result of the presented perspective, it is easy to understand that this amounts to advocating minimum chi-square estimators as superior to estimators based on the minimum distance between (unstandardized) cumulative distributions.

We gave here a common framework in which, for both fixed $t_1, \ldots, t_k$ and quantiles $t_i = X_{(n_i)}$, the form of the covariance of $(F_n(t_i) - F(t_i),\ i = 1, \ldots, k)$ assures a simple relationship to the chi-square statistic. However, we caution that, when using all the empirical quantiles ($k = n$, $n_i = i$, $t_i = X_{(i)}$), the standardized $(F(X_{(i)}) - \frac{i}{n+1},\ i = 1, \ldots, n)$ is not shown to have an effective norm square for estimation, being only 50% efficient. A modified chi-square like formulation is given for the empirical quantiles that is fully efficient.

As noted in Section 3, the fully standardized cumulative distribution statistic $Z = (Z_1, \ldots, Z_k)$ is asymptotically Normal$(0, I)$. Thus the asymptotic distribution of $Z$ does not depend on the hypothesized distribution $F$ (that is, it is asymptotically distribution free), unlike the vector of $k + 1$ components $\sqrt{n}(P_n(A) - P(A))/\sqrt{P(A)}$, whose (asymptotic) distribution depends in particular on the vector of components $\sqrt{P(A)}$ to which it is orthogonal. As $Z$ is asymptotically distribution free, it is akin to the test statistic components studied in Khmaladze (2013). A difference is that there the objective is to provide a class of such asymptotically distribution free statistics for discrete settings, whereas our objective is to clarify understanding of the fully standardized cumulative distribution for improved efficiency of estimation.

References

Amemiya, T. (1976) The Maximum Likelihood, the Minimum Chi-Square and the Nonlinear Weighted Least-Squares Estimator in the General Qualitative Response Model, Journal of the American Statistical Association 71.

Anderson, T. W. and D. A. Darling (1952) Asymptotic Theory of Certain Goodness of Fit Criteria Based on Stochastic Processes, The Annals of Mathematical Statistics 23.

Bar-On, I., B. Codenotti, M. Leoncini (1997) A Fast Parallel Cholesky Decomposition Algorithm for Tridiagonal Symmetric Matrices, SIAM J. Matrix Anal. Appl. 18.

Barrett, W. W., P. J. Feinsilver (1978) Gaussian families and a theorem on patterned matrices, J. Appl. Prob. 15.

Barrett, W. W. (1979) A Theorem on Inverses of Tridiagonal Matrices, Linear Algebra and its Applications 27, 211-217.

Bdair, O. M. (2012) Different methods of estimation for Marshall Olkin exponential distribution, Journal of Applied Statistical Science.

Benšić, M. (2014) Fitting distribution to data by a generalized nonlinear least squares method, Communications in Statistics - Simulation and Computation 43.

Benšić, M. (2015) Properties of the generalized nonlinear least squares method applied for fitting distribution to data, Discussiones Mathematicae Probability and Statistics 35, 75-94.

Berkson, J. (1949) Minimum χ² and maximum likelihood solution in terms of a linear transform, with particular reference to bio-assay, Journal of the American Statistical Association 44.

Berkson, J. (1980) Minimum chi-square, not maximum likelihood!, The Annals of Statistics 8.

Bhapkar, V. P. (1966) A Note on the Equivalence of Two Test Criteria for Hypotheses in Categorical Data, Journal of the American Statistical Association 61.

Cheng, R. C. H. and N. A. K. Amin (1983) Estimating parameters in continuous univariate distributions with a shifted origin, J. Roy. Statist. Soc. Ser. B 45.

Choulakian, V., R. A. Lockhart, and M. A. Stephens (1994) Cramér-von Mises statistics for discrete distributions, Canad. J. Statist. 22.

Cramér, H. (1946) Mathematical Methods of Statistics, Princeton University Press.

Dey, S. (2014) Two-parameter Rayleigh distribution: Different methods of estimation, American Journal of Mathematical and Management Sciences.

Fisher, R. A. (1924) The Conditions Under Which χ² Measures the Discrepancy Between Observation and Hypothesis, Journal of the Royal Statistical Society 87.

Ghosh, K. and S. Rao Jammalamadaka (2001) A general estimation method using spacings, Journal of Statistical Planning and Inference 93, 71-82.

Harris, D. and L. Mátyás (1999) Introduction to the generalized method of moments estimation, in: Mátyás, L. (Ed.), Generalized Method of Moments Estimation, Cambridge University Press, Cambridge, 3-30.

Harris, R. R. and G. K. Kanji (1983) On the use of minimum chi-square estimation, The Statistician 23.

Hartley, H. O., R. C. Pfaffenberger (1972) Quadratic forms in order statistics used as goodness-of-fit criteria, Biometrika 59.

Helmert, F. R. (1876) Die Genauigkeit der Formel von Peters zur Berechnung des wahrscheinlichen Beobachtungsfehlers directer Beobachtungen gleicher Genauigkeit, Astronomische Nachrichten 88, columns 113-132.

Henze, N. (1996) Empirical-distribution-function goodness-of-fit tests for discrete models, Canad. J. Statist. 24, 81-93.

Hsiao, C. (2006) Minimum Chi-Square, Encyclopedia of Statistical Sciences 7, John Wiley & Sons.

Irwin, J. O. (1942) On the Distribution of a Weighted Estimate of Variance and on Analysis of Variance in Certain Cases of Unequal Weighting, Journal of the Royal Statistical Society 105, 115-118.

Irwin, J. O. (1949) A Note on the Subdivision of χ² into Components, Biometrika 36.

Johnson, N. L., S. Kotz, N. Balakrishnan (1994) Continuous Univariate Distributions, Vol. 1, J. Wiley & Sons, Inc., New York-Singapore.

Kantar, Y. M. (2015) Generalized least squares and weighted least squares estimation methods for distributional parameters, REVSTAT - Statistical Journal 13.

Khmaladze, E. (2013) Note on distribution free testing for discrete distributions, Ann. Statist. 41.

Kruskal, W. H. (1946) Helmert's Distribution, American Mathematical Monthly 53.

Kundu, D., M. Z. Raqab (2005) Generalized Rayleigh distribution: different methods of estimations, Computational Statistics & Data Analysis 49.

Lancaster, H. O. (1949) The Derivation and Partition of χ² in Certain Discrete Distributions, Biometrika 36, 117-129.

Lancaster, H. O. (1965) The Helmert Matrices, The American Mathematical Monthly 72, 4-12.

Newey, W. K. and D. McFadden (1994) Large Sample Estimation and Hypothesis Testing, in Engle, R. and D. McFadden, eds., Handbook of Econometrics, Vol. 4, New York: North Holland.

Neyman, J. (1949) Contribution to the theory of the χ² test, in Proc. Berkeley Symp. Math. Stat. Prob.

Ranneby, B. (1984) The maximum spacing method: an estimation method related to the maximum likelihood method, Scand. J. Statist. 11, 93-112.

Rao, C. R. (1973) Linear Statistical Inference and Its Applications (2nd ed.), New York: John Wiley.

Rinne, H. (2009) The Weibull Distribution: A Handbook, Taylor & Francis Group.

Stigler, S. M. (1984) Kruskal's Proof of the Joint Distribution of X̄ and s², The American Statistician 38.

Swain, J., S. Venkatraman, J. Wilson (1988) Least-squares estimation of distribution function in Johnson's translation system, J. Statist. Comput. Simulation 29.

Taylor, W. F. (1953) Distance functions and regular best asymptotically normal estimates, The Annals of Mathematical Statistics 24.

Torres, F. J. (2014) Estimation of parameters of the shifted Gompertz distribution using least squares, maximum likelihood and moment methods, Journal of Computational and Applied Mathematics 255.

Standardizing the Empirical Distribution Function Yields the Chi-Square Statistic

Standardizing the Empirical Distribution Function Yields the Chi-Square Statistic Standardizing the Empirical Distribution Function Yields the Chi-Square Statistic Andrew Barron Department of Statistics, Yale University Mirta Benšić Department of Mathematics, University of Osijek and

More information

PROPERTIES OF THE GENERALIZED NONLINEAR LEAST SQUARES METHOD APPLIED FOR FITTING DISTRIBUTION TO DATA

PROPERTIES OF THE GENERALIZED NONLINEAR LEAST SQUARES METHOD APPLIED FOR FITTING DISTRIBUTION TO DATA Discussiones Mathematicae Probability and Statistics 35 (2015) 75 94 doi:10.7151/dmps.1172 PROPERTIES OF THE GENERALIZED NONLINEAR LEAST SQUARES METHOD APPLIED FOR FITTING DISTRIBUTION TO DATA Mirta Benšić

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

Multivariate Distributions

Multivariate Distributions Copyright Cosma Rohilla Shalizi; do not distribute without permission updates at http://www.stat.cmu.edu/~cshalizi/adafaepov/ Appendix E Multivariate Distributions E.1 Review of Definitions Let s review

More information

Investigation of goodness-of-fit test statistic distributions by random censored samples

Investigation of goodness-of-fit test statistic distributions by random censored samples d samples Investigation of goodness-of-fit test statistic distributions by random censored samples Novosibirsk State Technical University November 22, 2010 d samples Outline 1 Nonparametric goodness-of-fit

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Hypothesis testing:power, test statistic CMS:

Hypothesis testing:power, test statistic CMS: Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland

Irr. Statistical Methods in Experimental Physics. 2nd Edition. Frederick James. World Scientific. CERN, Switzerland Frederick James CERN, Switzerland Statistical Methods in Experimental Physics 2nd Edition r i Irr 1- r ri Ibn World Scientific NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI CONTENTS

More information

Lecture 3 September 1

Lecture 3 September 1 STAT 383C: Statistical Modeling I Fall 2016 Lecture 3 September 1 Lecturer: Purnamrita Sarkar Scribe: Giorgio Paulon, Carlos Zanini Disclaimer: These scribe notes have been slightly proofread and may have

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

Exact goodness-of-fit tests for censored data

Exact goodness-of-fit tests for censored data Exact goodness-of-fit tests for censored data Aurea Grané Statistics Department. Universidad Carlos III de Madrid. Abstract The statistic introduced in Fortiana and Grané (23, Journal of the Royal Statistical

More information

Submitted to the Brazilian Journal of Probability and Statistics

Submitted to the Brazilian Journal of Probability and Statistics Submitted to the Brazilian Journal of Probability and Statistics Multivariate normal approximation of the maximum likelihood estimator via the delta method Andreas Anastasiou a and Robert E. Gaunt b a

More information

DA Freedman Notes on the MLE Fall 2003

DA Freedman Notes on the MLE Fall 2003 DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar

More information

The Goodness-of-fit Test for Gumbel Distribution: A Comparative Study

The Goodness-of-fit Test for Gumbel Distribution: A Comparative Study MATEMATIKA, 2012, Volume 28, Number 1, 35 48 c Department of Mathematics, UTM. The Goodness-of-fit Test for Gumbel Distribution: A Comparative Study 1 Nahdiya Zainal Abidin, 2 Mohd Bakri Adam and 3 Habshah

More information

component risk analysis

component risk analysis 273: Urban Systems Modeling Lec. 3 component risk analysis instructor: Matteo Pozzi 273: Urban Systems Modeling Lec. 3 component reliability outline risk analysis for components uncertain demand and uncertain

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

ENTROPY-BASED GOODNESS OF FIT TEST FOR A COMPOSITE HYPOTHESIS

ENTROPY-BASED GOODNESS OF FIT TEST FOR A COMPOSITE HYPOTHESIS Bull. Korean Math. Soc. 53 (2016), No. 2, pp. 351 363 http://dx.doi.org/10.4134/bkms.2016.53.2.351 ENTROPY-BASED GOODNESS OF FIT TEST FOR A COMPOSITE HYPOTHESIS Sangyeol Lee Abstract. In this paper, we

More information

Estimation of parametric functions in Downton s bivariate exponential distribution

Estimation of parametric functions in Downton s bivariate exponential distribution Estimation of parametric functions in Downton s bivariate exponential distribution George Iliopoulos Department of Mathematics University of the Aegean 83200 Karlovasi, Samos, Greece e-mail: geh@aegean.gr

More information

Goodness of Fit Test and Test of Independence by Entropy

Goodness of Fit Test and Test of Independence by Entropy Journal of Mathematical Extension Vol. 3, No. 2 (2009), 43-59 Goodness of Fit Test and Test of Independence by Entropy M. Sharifdoost Islamic Azad University Science & Research Branch, Tehran N. Nematollahi

More information

LECTURE 18: NONLINEAR MODELS

LECTURE 18: NONLINEAR MODELS LECTURE 18: NONLINEAR MODELS The basic point is that smooth nonlinear models look like linear models locally. Models linear in parameters are no problem even if they are nonlinear in variables. For example:

More information

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible

More information

Research Article The Laplace Likelihood Ratio Test for Heteroscedasticity

Research Article The Laplace Likelihood Ratio Test for Heteroscedasticity International Mathematics and Mathematical Sciences Volume 2011, Article ID 249564, 7 pages doi:10.1155/2011/249564 Research Article The Laplace Likelihood Ratio Test for Heteroscedasticity J. Martin van

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

1 Appendix A: Matrix Algebra

1 Appendix A: Matrix Algebra Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix

More information

A TEST OF FIT FOR THE GENERALIZED PARETO DISTRIBUTION BASED ON TRANSFORMS

A TEST OF FIT FOR THE GENERALIZED PARETO DISTRIBUTION BASED ON TRANSFORMS A TEST OF FIT FOR THE GENERALIZED PARETO DISTRIBUTION BASED ON TRANSFORMS Dimitrios Konstantinides, Simos G. Meintanis Department of Statistics and Acturial Science, University of the Aegean, Karlovassi,

More information

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality. 88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal

More information

An Introduction to Multivariate Statistical Analysis

An Introduction to Multivariate Statistical Analysis An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents

More information

Table of selected values

Table of selected values Table of selected values Most statistical textbooks list t distribution tables. Nowadays, the better way to a fully precise critical t value or a cumulative probability is the statistical function implemented

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS. Abstract

COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS. Abstract Far East J. Theo. Stat. 0() (006), 179-196 COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS Department of Statistics University of Manitoba Winnipeg, Manitoba, Canada R3T

More information

Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data

Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data International Mathematical Forum, 3, 2008, no. 33, 1643-1654 Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data A. Al-khedhairi Department of Statistics and O.R. Faculty

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY

COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY Annales Univ Sci Budapest, Sect Comp 45 2016) 45 55 COMPETING RISKS WEIBULL MODEL: PARAMETER ESTIMATES AND THEIR ACCURACY Ágnes M Kovács Budapest, Hungary) Howard M Taylor Newark, DE, USA) Communicated

More information

Mohsen Pourahmadi. 1. A sampling theorem for multivariate stationary processes. J. of Multivariate Analysis, Vol. 13, No. 1 (1983),

Mohsen Pourahmadi. 1. A sampling theorem for multivariate stationary processes. J. of Multivariate Analysis, Vol. 13, No. 1 (1983), Mohsen Pourahmadi PUBLICATIONS Books and Editorial Activities: 1. Foundations of Time Series Analysis and Prediction Theory, John Wiley, 2001. 2. Computing Science and Statistics, 31, 2000, the Proceedings

More information

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters

Goodness-of-fit tests for randomly censored Weibull distributions with estimated parameters Communications for Statistical Applications and Methods 2017, Vol. 24, No. 5, 519 531 https://doi.org/10.5351/csam.2017.24.5.519 Print ISSN 2287-7843 / Online ISSN 2383-4757 Goodness-of-fit tests for randomly

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Applied Probability and Stochastic Processes

Applied Probability and Stochastic Processes Applied Probability and Stochastic Processes In Engineering and Physical Sciences MICHEL K. OCHI University of Florida A Wiley-Interscience Publication JOHN WILEY & SONS New York - Chichester Brisbane

More information

Stat 710: Mathematical Statistics Lecture 31

Stat 710: Mathematical Statistics Lecture 31 Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

On corrections of classical multivariate tests for high-dimensional data

On corrections of classical multivariate tests for high-dimensional data On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics

More information

Tests Concerning Equicorrelation Matrices with Grouped Normal Data

Tests Concerning Equicorrelation Matrices with Grouped Normal Data Tests Concerning Equicorrelation Matrices with Grouped Normal Data Paramjit S Gill Department of Mathematics and Statistics Okanagan University College Kelowna, BC, Canada, VV V7 pgill@oucbcca Sarath G

More information

Burr Type X Distribution: Revisited

Burr Type X Distribution: Revisited Burr Type X Distribution: Revisited Mohammad Z. Raqab 1 Debasis Kundu Abstract In this paper, we consider the two-parameter Burr-Type X distribution. We observe several interesting properties of this distribution.

More information

KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS

KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Bull. Korean Math. Soc. 5 (24), No. 3, pp. 7 76 http://dx.doi.org/34/bkms.24.5.3.7 KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Yicheng Hong and Sungchul Lee Abstract. The limiting

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Estimation for Mean and Standard Deviation of Normal Distribution under Type II Censoring

Estimation for Mean and Standard Deviation of Normal Distribution under Type II Censoring Communications for Statistical Applications and Methods 2014, Vol. 21, No. 6, 529 538 DOI: http://dx.doi.org/10.5351/csam.2014.21.6.529 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation for Mean

More information

i=1 h n (ˆθ n ) = 0. (2)

i=1 h n (ˆθ n ) = 0. (2) Stat 8112 Lecture Notes Unbiased Estimating Equations Charles J. Geyer April 29, 2012 1 Introduction In this handout we generalize the notion of maximum likelihood estimation to solution of unbiased estimating

More information

Sample size determination for logistic regression: A simulation study

Sample size determination for logistic regression: A simulation study Sample size determination for logistic regression: A simulation study Stephen Bush School of Mathematical Sciences, University of Technology Sydney, PO Box 123 Broadway NSW 2007, Australia Abstract This

More information

IDENTIFIABILITY OF THE MULTIVARIATE NORMAL BY THE MAXIMUM AND THE MINIMUM

IDENTIFIABILITY OF THE MULTIVARIATE NORMAL BY THE MAXIMUM AND THE MINIMUM Surveys in Mathematics and its Applications ISSN 842-6298 (electronic), 843-7265 (print) Volume 5 (200), 3 320 IDENTIFIABILITY OF THE MULTIVARIATE NORMAL BY THE MAXIMUM AND THE MINIMUM Arunava Mukherjea

More information

IDL Advanced Math & Stats Module

IDL Advanced Math & Stats Module IDL Advanced Math & Stats Module Regression List of Routines and Functions Multiple Linear Regression IMSL_REGRESSORS IMSL_MULTIREGRESS IMSL_MULTIPREDICT Generates regressors for a general linear model.

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1998 Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ Lawrence D. Brown University

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples Bayesian inference for sample surveys Roderick Little Module : Bayesian models for simple random samples Superpopulation Modeling: Estimating parameters Various principles: least squares, method of moments,

More information

Exact goodness-of-fit tests for censored data

Exact goodness-of-fit tests for censored data Ann Inst Stat Math ) 64:87 3 DOI.7/s463--356-y Exact goodness-of-fit tests for censored data Aurea Grané Received: February / Revised: 5 November / Published online: 7 April The Institute of Statistical

More information

Exact Solutions for a Two-Parameter Rayleigh Distribution

Exact Solutions for a Two-Parameter Rayleigh Distribution Global Journal of Pure and Applied Mathematics. ISSN 0973-1768 Volume 13, Number 11 (2017), pp. 8039 8051 Research India Publications http://www.ripublication.com/gjpam.htm Exact Solutions for a Two-Parameter

More information

Multivariate GARCH models.

Multivariate GARCH models. Multivariate GARCH models. Financial market volatility moves together over time across assets and markets. Recognizing this commonality through a multivariate modeling framework leads to obvious gains

More information

To appear in The American Statistician vol. 61 (2007) pp

To appear in The American Statistician vol. 61 (2007) pp How Can the Score Test Be Inconsistent? David A Freedman ABSTRACT: The score test can be inconsistent because at the MLE under the null hypothesis the observed information matrix generates negative variance

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

On the Comparison of Fisher Information of the Weibull and GE Distributions

On the Comparison of Fisher Information of the Weibull and GE Distributions On the Comparison of Fisher Information of the Weibull and GE Distributions Rameshwar D. Gupta Debasis Kundu Abstract In this paper we consider the Fisher information matrices of the generalized exponential

More information

Econ 583 Final Exam Fall 2008

Econ 583 Final Exam Fall 2008 Econ 583 Final Exam Fall 2008 Eric Zivot December 11, 2008 Exam is due at 9:00 am in my office on Friday, December 12. 1 Maximum Likelihood Estimation and Asymptotic Theory Let X 1,...,X n be iid random

More information

Moments of the Reliability, R = P(Y<X), As a Random Variable

Moments of the Reliability, R = P(Y<X), As a Random Variable International Journal of Computational Engineering Research Vol, 03 Issue, 8 Moments of the Reliability, R = P(Y

More information

Estimation of the Bivariate Generalized. Lomax Distribution Parameters. Based on Censored Samples

Estimation of the Bivariate Generalized. Lomax Distribution Parameters. Based on Censored Samples Int. J. Contemp. Math. Sciences, Vol. 9, 2014, no. 6, 257-267 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ijcms.2014.4329 Estimation of the Bivariate Generalized Lomax Distribution Parameters

More information

SIMULATED POWER OF SOME DISCRETE GOODNESS- OF-FIT TEST STATISTICS FOR TESTING THE NULL HYPOTHESIS OF A ZIG-ZAG DISTRIBUTION

SIMULATED POWER OF SOME DISCRETE GOODNESS- OF-FIT TEST STATISTICS FOR TESTING THE NULL HYPOTHESIS OF A ZIG-ZAG DISTRIBUTION Far East Journal of Theoretical Statistics Volume 28, Number 2, 2009, Pages 57-7 This paper is available online at http://www.pphmj.com 2009 Pushpa Publishing House SIMULATED POWER OF SOME DISCRETE GOODNESS-

More information

Fitting the generalized Pareto distribution to data using maximum goodness-of-fit estimators

Fitting the generalized Pareto distribution to data using maximum goodness-of-fit estimators Computational Statistics & Data Analysis 51 (26) 94 917 www.elsevier.com/locate/csda Fitting the generalized Pareto distribution to data using maximum goodness-of-fit estimators Alberto Luceño E.T.S. de

More information

Chapter 31 Application of Nonparametric Goodness-of-Fit Tests for Composite Hypotheses in Case of Unknown Distributions of Statistics

Chapter 31 Application of Nonparametric Goodness-of-Fit Tests for Composite Hypotheses in Case of Unknown Distributions of Statistics Chapter Application of Nonparametric Goodness-of-Fit Tests for Composite Hypotheses in Case of Unknown Distributions of Statistics Boris Yu. Lemeshko, Alisa A. Gorbunova, Stanislav B. Lemeshko, and Andrey

More information

Statistic Distribution Models for Some Nonparametric Goodness-of-Fit Tests in Testing Composite Hypotheses

Statistic Distribution Models for Some Nonparametric Goodness-of-Fit Tests in Testing Composite Hypotheses Communications in Statistics - Theory and Methods ISSN: 36-926 (Print) 532-45X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta2 Statistic Distribution Models for Some Nonparametric Goodness-of-Fit

More information

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Applied Mathematical Sciences, Vol. 4, 2010, no. 62, 3083-3093 Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Julia Bondarenko Helmut-Schmidt University Hamburg University

More information

1 One-way analysis of variance

1 One-way analysis of variance LIST OF FORMULAS (Version from 21. November 2014) STK2120 1 One-way analysis of variance Assume X ij = µ+α i +ɛ ij ; j = 1, 2,..., J i ; i = 1, 2,..., I ; where ɛ ij -s are independent and N(0, σ 2 ) distributed.

More information

Negative Multinomial Model and Cancer. Incidence

Negative Multinomial Model and Cancer. Incidence Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence S. Lahiri & Sunil K. Dhar Department of Mathematical Sciences, CAMS New Jersey Institute of Technology, Newar,

More information

Synthesis of Gaussian and non-gaussian stationary time series using circulant matrix embedding

Synthesis of Gaussian and non-gaussian stationary time series using circulant matrix embedding Synthesis of Gaussian and non-gaussian stationary time series using circulant matrix embedding Vladas Pipiras University of North Carolina at Chapel Hill UNC Graduate Seminar, November 10, 2010 (joint

More information

Likelihood and p-value functions in the composite likelihood context

Likelihood and p-value functions in the composite likelihood context Likelihood and p-value functions in the composite likelihood context D.A.S. Fraser and N. Reid Department of Statistical Sciences University of Toronto November 19, 2016 Abstract The need for combining

More information

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it

More information

On the smallest eigenvalues of covariance matrices of multivariate spatial processes

On the smallest eigenvalues of covariance matrices of multivariate spatial processes On the smallest eigenvalues of covariance matrices of multivariate spatial processes François Bachoc, Reinhard Furrer Toulouse Mathematics Institute, University Paul Sabatier, France Institute of Mathematics

More information

High-dimensional asymptotic expansions for the distributions of canonical correlations

High-dimensional asymptotic expansions for the distributions of canonical correlations Journal of Multivariate Analysis 100 2009) 231 242 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva High-dimensional asymptotic

More information

Understanding Ding s Apparent Paradox

Understanding Ding s Apparent Paradox Submitted to Statistical Science Understanding Ding s Apparent Paradox Peter M. Aronow and Molly R. Offer-Westort Yale University 1. INTRODUCTION We are grateful for the opportunity to comment on A Paradox

More information

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department

More information

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test.

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test. Economics 52 Econometrics Professor N.M. Kiefer LECTURE 1: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING NEYMAN-PEARSON LEMMA: Lesson: Good tests are based on the likelihood ratio. The proof is easy in the

More information

Statistical Signal Processing Detection, Estimation, and Time Series Analysis

Statistical Signal Processing Detection, Estimation, and Time Series Analysis Statistical Signal Processing Detection, Estimation, and Time Series Analysis Louis L. Scharf University of Colorado at Boulder with Cedric Demeure collaborating on Chapters 10 and 11 A TT ADDISON-WESLEY

More information
