Standardizing the Empirical Distribution Function Yields the Chi-Square Statistic
Andrew Barron, Department of Statistics, Yale University
Mirta Benšić, Department of Mathematics, University of Osijek
Kristian Sabo, Department of Mathematics, University of Osijek

July 2016

Abstract

Standardizing the empirical distribution function yields a statistic with norm square that matches the chi-square test statistic. To show this one may use the covariance matrix of the empirical distribution which, at any finite set of points, is shown to have an inverse which is tridiagonal. Moreover, a representation of the inverse is given which is a product of bidiagonal matrices, corresponding to a representation of the standardization of the empirical distribution via a linear combination of values at two consecutive points. These properties are discussed also in the context of minimum distance estimation.

Keywords: Generalized least squares, minimum distance estimation, covariance of empirical distribution, covariance of quantiles, bidiagonal Cholesky factors, Helmert matrix
1 Introduction

Let $X_1, X_2, \ldots, X_n$ be independent real-valued random variables with distribution function $F$. Let $F_n$ be the empirical distribution function $F_n(t) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \le t\}}$ and let $\sqrt{n}\,(F_n(t) - F(t))$, $t \in T$, be the centered empirical process evaluated at a set of points $T \subseteq \mathbb{R}$. It is familiar that when $F$ is an hypothesized distribution and $T = \mathbb{R}$, the maximum of the absolute value of this empirical process corresponds to the Kolmogorov-Smirnov test statistic, the average square corresponds to the Cramér-von Mises test statistic, and the average square with marginal standardization using the variance equal to $F(t)(1-F(t))$ produces the Anderson-Darling statistic (averaging with respect to the distribution $F$); see Anderson and Darling (1952). The covariance of the empirical process takes the form $F(t)(1-F(s))$ for $t \le s$. For finite $T$ let $V$ denote the corresponding symmetric covariance matrix of the column vector $\sqrt{n}(F_n - F)$ with entries $\sqrt{n}(F_n(t) - F(t))$, $t \in T$. A finite-$T$ counterpart to the Anderson-Darling statistic is $n(F_n - F)^\tau (\mathrm{Diag}(V))^{-1}(F_n - F)$, which uses only the diagonal entries of $V$. Complete standardization of the empirical distribution restricted to $T$ has been put forward in Benšić (2014), leading to the distance

$$ n\,(F_n - F)^\tau V^{-1} (F_n - F), \qquad (1) $$

which is there analysed as a generalized least squares criterion for minimum distance parameter estimation. It fits also in the framework of the generalized method of moments (Benšić (2015)). The motivation, familiar from regression, is that the complete standardization produces more efficient estimators.

The purpose of the present work is to show statistical simplifications in the generalized least squares criterion. In particular, we show that the expression (1) is precisely equal to the chi-square test statistic
$$ n \sum_{A \in \pi} \frac{(P_n(A) - P(A))^2}{P(A)}, \qquad (2) $$

where $\pi$ is the partition of $\mathbb{R}$ into the $k+1$ intervals $A$ formed by the consecutive values $T = \{t_1, \ldots, t_k\}$, namely $A_1 = (-\infty, t_1]$, $A_2 = (t_1, t_2]$, \ldots, $A_k = (t_{k-1}, t_k]$ and $A_{k+1} = (t_k, \infty)$. Here $P_n(A_j) = F_n(t_j) - F_n(t_{j-1}) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \in A_j\}}$ and $P(A_j) = F(t_j) - F(t_{j-1})$, with $F(-\infty) = F_n(-\infty) = 0$ and $F(\infty) = F_n(\infty) = 1$.

Moreover, we show that $V^{-1}$ takes a tridiagonal form, with $1/P(A_j) + 1/P(A_{j+1})$ for the $(j,j)$ entries on the diagonal, $-1/P(A_j)$ for the $(j-1,j)$ entries, $-1/P(A_{j+1})$ for the $(j+1,j)$ entries, and $0$ otherwise.

We find an explicit standardization

$$ Z_j = \frac{F_n(t_{j+1})F(t_j) - F_n(t_j)F(t_{j+1})}{c_{n,j}}, \qquad j = 1, \ldots, k. $$

These random variables have mean $0$ and variance $1$ (with $c_{n,j} = \sqrt{F(t_j)F(t_{j+1})P(A_{j+1})/n}$) and they are uncorrelated for $j = 1, \ldots, k$. Moreover, the sum of squares

$$ \sum_{j=1}^{k} Z_j^2 \qquad (3) $$

is precisely equal to the statistic given in expressions (1) and (2). It corresponds to a bidiagonal Cholesky decomposition of $V^{-1}$ as $B^\tau B$, with $B$ given by $F(t_{j+1})/c_{n,j}$ for the $(j,j)$ entries, $-F(t_j)/c_{n,j}$ for the $(j,j+1)$ entries, and $0$ otherwise, yielding the vector $Z = B(F_n - F)$, where $F = (F(t_1), \ldots, F(t_k))^\tau$, as a full standardization of the vector $F_n = (F_n(t_1), \ldots, F_n(t_k))^\tau$. The $Z_j$ may also be written as

$$ Z_j = \frac{P_n(A_{j+1})F(t_j) - F_n(t_j)P(A_{j+1})}{c_{n,j}}, \qquad (4) $$

so its marginal distribution (with an hypothesized $F$) comes from the trinomial distribution of $(nF_n(t_j),\, nP_n(A_{j+1}))$.
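The equality of (1), (2) and (3) is easy to confirm numerically. Below is a minimal NumPy sketch (not part of the paper); the standard normal hypothesis, the sample size and the cut points are arbitrary choices for illustration only.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)                   # sample from the hypothesized F = standard normal
t = np.array([-1.0, -0.3, 0.4, 1.2])     # fixed points t_1 < ... < t_k

F = norm.cdf(t)                                      # F(t_j)
Fn = np.array([(x <= tj).mean() for tj in t])        # F_n(t_j)

# covariance matrix V with entries F(t_j)(1 - F(t_l)) for j <= l
V = np.minimum.outer(F, F) * (1 - np.maximum.outer(F, F))

# (1): norm square of the fully standardized empirical distribution
stat1 = n * (Fn - F) @ np.linalg.solve(V, Fn - F)

# (2): chi-square statistic over the k+1 cells A_1, ..., A_{k+1}
P = np.diff(np.concatenate(([0.0], F, [1.0])))       # P(A_j)
Pn = np.diff(np.concatenate(([0.0], Fn, [1.0])))     # P_n(A_j)
stat2 = n * np.sum((Pn - P) ** 2 / P)

# (3): sum of squares of the explicit standardization Z_j
Fp = np.append(F, 1.0)                               # F(t_{j+1}), with F(t_{k+1}) = 1
Fnp = np.append(Fn, 1.0)                             # F_n(t_{j+1})
c = np.sqrt(F * Fp[1:] * P[1:] / n)                  # c_{n,j}
Z = (Fnp[1:] * F - Fn * Fp[1:]) / c
stat3 = np.sum(Z ** 2)

print(stat1, stat2, stat3)                           # agree up to rounding error
```

The agreement is exact, not merely asymptotic; only floating-point rounding separates the three numbers.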
These uncorrelated $Z_j$, though not independent, suggest approximation to the distribution of $\sum_j Z_j^2$ from convolution of the distributions of the $Z_j^2$ rather than the asymptotic chi-square. Nevertheless, when $t_1, \ldots, t_k$ are fixed, it is clear by the multivariate central limit theorem (for the standardized sums of the iid random variables comprising $P_n(A_{j+1})$ and $F_n(t_j)$ from (4)) that the joint distribution of $Z = (Z_1, \ldots, Z_k)^\tau$ is asymptotically Normal$(0, I)$, providing a direct path to the asymptotic chi-square($k$) distribution of the statistic given equivalently in (1), (2), (3).

There are natural fixed and random choices for the points $t_1, \ldots, t_k$. A natural fixed choice is to use $k$ quantiles of a reference distribution. If $F$ is an hypothesized continuous distribution, such quantiles can be chosen so that $P(A_j) = F(t_j) - F(t_{j-1}) = 1/(k+1)$. A natural random choice is to use empirical quantiles $t_j = X_{(n_j)}$ with $n_1 < n_2 < \cdots < n_k \le n$. If $k+1$ divides $n$, the $n_j$ may equal $jn/(k+1)$, for $j = 1, \ldots, k$. With empirical quantiles it is the $F(t_j) = F(X_{(n_j)})$ that are random, having the same distribution as uniform order statistics with means $\bar R_j = n_j/(n+1)$ and covariances $V_{j,l}/(n+2)$, where $V_{j,l} = \bar R_j(1 - \bar R_l)$ for $j \le l$. Once again we find that $(F - \bar R)^\tau V^{-1}(F - \bar R)$, where $\bar R = (\bar R_1, \ldots, \bar R_k)^\tau$, takes the form of a chi-square test statistic, and the story is much the same. The $Z_j$ are now multiples of $\bar R_{j+1} F(X_{(n_j)}) - \bar R_j F(X_{(n_{j+1})})$, which are again mean $0$, variance $1$ and uncorrelated. The main difference in this case is that their exact distribution comes from the Dirichlet distribution of $(F(X_{(n_j)}),\, F(X_{(n_{j+1})}) - F(X_{(n_j)}))$ rather than from the multinomial.

The form of the $V^{-1}$ with bidiagonal decomposition $B^\tau B$ and the representation of the norm square of the standardized empirical distribution (1) as the chi-square test statistic (2) provide simplified interpretation, simplified computation and simplified statistical analysis. For interpretation, we see that when choosing between tests based on the cumulative distribution (like the Anderson-Darling test) and tests based on counts in disjoint cells, the choice largely depends on whether one wants the benefit of the more complete standardization leading to the chi-square test.
For computation, we see that generalized least squares simplifies due to the tridiagonal form. As for simplification of statistical analysis, consider the case of a parametric family of distribution functions $F_\theta$ with a real parameter $\theta$. The generalized least squares procedure picks $\hat\theta = \hat\theta_{k,n}$ to minimize $(F_n - F_\theta)^\tau V^{-1}(F_n - F_\theta)$, where $V$ is the covariance matrix evaluated at a consistent estimator of the true $\theta_0$. For a fixed set of points $t_1, \ldots, t_k$ it is known (Benšić (2015)) that $\lim_n [n\,\mathrm{Var}(\hat\theta_{k,n})]$ has reciprocal $G^\tau V_{\theta_0}^{-1} G$, where $G$ is the vector $\frac{\partial}{\partial\theta} F_\theta$ evaluated at the true parameter value $\theta = \theta_0$. Using the tridiagonal inverse we show that this $G^\tau V_{\theta_0}^{-1} G$ simplifies to

$$ \sum_{A \in \pi} P_{\theta_0}(A)\, \big(E_{\theta_0}[S(X) \mid A]\big)^2, \qquad (5) $$

where $S(X) = \frac{\partial}{\partial\theta}\log f(X \mid \theta)$ is the score function evaluated at $\theta = \theta_0$, which we interpret as a Riemann-Stieltjes discretization of the Fisher information $E[S^2(X)]$. This Fisher information arises in the limit of large $k$.

2 Common Framework

Let $r_1, r_2, \ldots, r_{k+1}$ be random variables with sum $1$, let $\rho_1, \rho_2, \ldots, \rho_{k+1}$ be their expectations, and let

$$ R_j = \sum_{i=1}^{j} r_i \quad \text{and} \quad \bar R_j = \sum_{i=1}^{j} \rho_i $$

be their cumulative sums. We are interested in the differences $R_j - \bar R_j$. Suppose that there is a constant $c = c_n$ such that

$$ \mathrm{Cov}(R_j, R_l) = c\, \bar R_j (1 - \bar R_l) = c\, V_{j,l} \qquad (6) $$

for $j \le l$.
Let $R = (R_1, \ldots, R_k)^\tau$ and $\bar R = (\bar R_1, \ldots, \bar R_k)^\tau$. We explore the relationship between $(R - \bar R)^\tau V^{-1}(R - \bar R)$ and $\sum_{j=1}^{k+1} (r_j - \rho_j)^2/\rho_j$, the structure of the inverse $V^{-1}$, as well as the construction of a version of $B(R - \bar R)$ with uncorrelated entries and $V^{-1} = B^\tau B$.

We have the following cases for $X_1, \ldots, X_n$ iid with distribution function $F$.

Case 1: With fixed $t_1 < \cdots < t_k$ and $t_0 = -\infty$, $t_{k+1} = \infty$, we set

$$ R_j = F_n(t_j) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \le t_j\}} $$

with expectations $\bar R_j = F(t_j)$. These have increments $r_j = P_n(A_j) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \in A_j\}}$ and $\rho_j = P(A_j) = F(t_j) - F(t_{j-1})$, with intervals $A_j = (t_{j-1}, t_j]$. Now the covariance is $1/n$ times the covariance in a single draw, so (6) holds with $c = 1/n$.

Case 2: With fixed $n_1 < n_2 < \cdots < n_k \le n$ and order statistics $X_{(n_1)} \le X_{(n_2)} \le \cdots \le X_{(n_k)}$, we set $t_j = X_{(n_j)}$ and $R_j = F(X_{(n_j)})$ with expectation $\bar R_j = n_j/(n+1)$. These have increments $r_j = P(A_j)$ and $\rho_j = (n_j - n_{j-1})/(n+1)$. Now, when $F$ is continuous, the joint distribution of the $R_j$ is the Dirichlet distribution of uniform quantiles and (6) holds with $c = 1/(n+2)$.

Note that in both cases we examine distributional properties of $R_j - \bar R_j$, which is $F_n(t_j) - F(t_j)$ in Case 1 and $F(X_{(n_j)}) - F_n(X_{(n_j)})\, n/(n+1)$ in Case 2. Thus, the difference $R - \bar R$ is a vector of centered cumulative distributions. In Case 1 it is the centering of the empirical distribution at $t_1, \ldots, t_k$, and in Case 2 it is the centering of the hypothesized distribution function evaluated at the quantiles $X_{(n_1)}, X_{(n_2)}, \ldots, X_{(n_k)}$.
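As a quick sanity check on Case 2, the covariance form (6) with $c = 1/(n+2)$ can be verified by simulation. The following sketch (illustrative only; the sample size, the selected order statistics and the number of replications are arbitrary) compares the Monte Carlo covariance of the uniform order statistics with $\bar R_j(1-\bar R_l)/(n+2)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 20, 200_000
nj = np.array([5, 10, 15])                  # indices n_1 < n_2 < n_3
Rbar = nj / (n + 1)                         # expectations of R_j = F(X_{(n_j)})

# for continuous F, R_j = F(X_{(n_j)}) is distributed as the n_j-th uniform order statistic
U = np.sort(rng.uniform(size=(reps, n)), axis=1)
R = U[:, nj - 1]

cov_mc = np.cov(R, rowvar=False)
cov_th = np.minimum.outer(Rbar, Rbar) * (1 - np.maximum.outer(Rbar, Rbar)) / (n + 2)

print(np.round(cov_mc, 5))
print(np.round(cov_th, 5))                  # the two matrices should agree closely
```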
3 Tridiagonal $V^{-1}$ and its bidiagonal decomposition

We have two approaches to appreciating the relationship between the standardized cumulative distribution and the chi-square statistic. In this section, we use elementary matrix calculations to show that the inverse of the matrix $V$ has a special tridiagonal structure, to derive its bidiagonal decomposition, and to obtain the following identity:

$$ (R - \bar R)^\tau V^{-1} (R - \bar R) = \sum_{j=1}^{k+1} \frac{(r_j - \rho_j)^2}{\rho_j}. $$

In the next section we revisit the matter from the geometrical perspective of orthogonal projection.

Lemma 1. If

$$ V = \begin{bmatrix}
\bar R_1(1-\bar R_1) & \bar R_1(1-\bar R_2) & \cdots & \bar R_1(1-\bar R_k) \\
\bar R_1(1-\bar R_2) & \bar R_2(1-\bar R_2) & \cdots & \bar R_2(1-\bar R_k) \\
\vdots & \vdots & \ddots & \vdots \\
\bar R_1(1-\bar R_k) & \bar R_2(1-\bar R_k) & \cdots & \bar R_k(1-\bar R_k)
\end{bmatrix}, $$

then

$$ V^{-1} = \begin{bmatrix}
\frac{1}{\rho_1}+\frac{1}{\rho_2} & -\frac{1}{\rho_2} & & & \\
-\frac{1}{\rho_2} & \frac{1}{\rho_2}+\frac{1}{\rho_3} & -\frac{1}{\rho_3} & & \\
 & \ddots & \ddots & \ddots & \\
 & & -\frac{1}{\rho_{k-1}} & \frac{1}{\rho_{k-1}}+\frac{1}{\rho_k} & -\frac{1}{\rho_k} \\
 & & & -\frac{1}{\rho_k} & \frac{1}{\rho_k}+\frac{1}{\rho_{k+1}}
\end{bmatrix}. $$
Moreover, $V^{-1} = B^\tau B$, where

$$ B = \begin{bmatrix}
\frac{\bar R_2}{\sqrt{\bar R_1\bar R_2\rho_2}} & -\frac{\bar R_1}{\sqrt{\bar R_1\bar R_2\rho_2}} & & & \\
 & \frac{\bar R_3}{\sqrt{\bar R_2\bar R_3\rho_3}} & -\frac{\bar R_2}{\sqrt{\bar R_2\bar R_3\rho_3}} & & \\
 & & \ddots & \ddots & \\
 & & & \frac{\bar R_k}{\sqrt{\bar R_{k-1}\bar R_k\rho_k}} & -\frac{\bar R_{k-1}}{\sqrt{\bar R_{k-1}\bar R_k\rho_k}} \\
 & & & & \frac{1}{\sqrt{\bar R_k\rho_{k+1}}}
\end{bmatrix}. $$

Proof. Firstly, let us show that $V V^{-1} = V^{-1} V = I$. Since $V^{-1}$ is symmetric, it is enough to show $V V^{-1} = I$. In order to do this, note that for $j = 1, \ldots, k$ we have, with the conventions $\bar R_0 = 0$ and $\bar R_{k+1} = 1$,

$$ (V V^{-1})_{j,j} = \sum_{s=1}^{k} V_{j,s} (V^{-1})_{s,j}
 = -\frac{\bar R_{j-1}(1-\bar R_j)}{\rho_j} + \bar R_j(1-\bar R_j)\Big(\frac{1}{\rho_j}+\frac{1}{\rho_{j+1}}\Big) - \frac{\bar R_j(1-\bar R_{j+1})}{\rho_{j+1}} = 1. $$

Similarly, for $1 \le j < l \le k$, we have

$$ (V V^{-1})_{j,l} = \sum_{s=1}^{k} V_{j,s} (V^{-1})_{s,l}
 = -\frac{\bar R_j(1-\bar R_{l-1})}{\rho_l} + \bar R_j(1-\bar R_l)\Big(\frac{1}{\rho_l}+\frac{1}{\rho_{l+1}}\Big) - \frac{\bar R_j(1-\bar R_{l+1})}{\rho_{l+1}} = 0. $$

It remains to show $B^\tau B = V^{-1}$. For $j = 1, \ldots, k$ it follows that

$$ (B^\tau B)_{j,j} = \frac{\bar R_{j-1}}{\bar R_j \rho_j} + \frac{\bar R_{j+1}}{\bar R_j \rho_{j+1}} = \frac{1}{\rho_j} + \frac{1}{\rho_{j+1}}, $$

and for $j = 1, \ldots, k-1$ we have

$$ (B^\tau B)_{j,j+1} = -\frac{\bar R_{j+1}\bar R_j}{\bar R_j\bar R_{j+1}\rho_{j+1}} = -\frac{1}{\rho_{j+1}}. $$
Finally, for $l \ge j+2$, it follows that $(B^\tau B)_{j,l} = 0$. Since both of the matrices $B^\tau B$ and $V^{-1}$ are symmetric, the identity $B^\tau B = V^{-1}$ holds. $\square$

Corollary 1.

$$ (R - \bar R)^\tau V^{-1} (R - \bar R) = \sum_{j=1}^{k+1} \frac{(r_j - \rho_j)^2}{\rho_j}. $$

Proof. Note that $V^{-1}$ can be written in the following way (with $\bar R_0 = 0$ and $\bar R_{k+1} = 1$):

$$ V^{-1} = \begin{bmatrix}
\frac{1}{\bar R_1}+\frac{1}{\bar R_2-\bar R_1} & -\frac{1}{\bar R_2-\bar R_1} & & \\
-\frac{1}{\bar R_2-\bar R_1} & \frac{1}{\bar R_2-\bar R_1}+\frac{1}{\bar R_3-\bar R_2} & -\frac{1}{\bar R_3-\bar R_2} & \\
 & \ddots & \ddots & \ddots \\
 & & -\frac{1}{\bar R_k-\bar R_{k-1}} & \frac{1}{\bar R_k-\bar R_{k-1}}+\frac{1}{1-\bar R_k}
\end{bmatrix}. $$

Consequently, we obtain

$$ (R - \bar R)^\tau V^{-1} (R - \bar R) = \frac{(R_1 - \bar R_1)^2}{\bar R_1} + \frac{(R_k - \bar R_k)^2}{1 - \bar R_k} + \sum_{j=2}^{k} \frac{(R_j - \bar R_j - R_{j-1} + \bar R_{j-1})^2}{\bar R_j - \bar R_{j-1}} = \sum_{j=1}^{k+1} \frac{(r_j - \rho_j)^2}{\rho_j}. \qquad \square $$
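Lemma 1 and Corollary 1 are also easy to confirm numerically. The NumPy sketch below (illustrative; the cell probabilities are arbitrary) builds $V$, checks the tridiagonal form of $V^{-1}$ and the factorization $V^{-1} = B^\tau B$, and verifies the identity of Corollary 1 for a random vector of increments.

```python
import numpy as np

rho = np.array([0.1, 0.25, 0.3, 0.2, 0.15])    # rho_1, ..., rho_{k+1}, summing to 1
k = len(rho) - 1
Rbar = np.cumsum(rho)[:k]                       # Rbar_1, ..., Rbar_k

# V with entries Rbar_j (1 - Rbar_l) for j <= l
V = np.minimum.outer(Rbar, Rbar) * (1 - np.maximum.outer(Rbar, Rbar))

# tridiagonal matrix claimed to be V^{-1}
Vinv = (np.diag(1/rho[:k] + 1/rho[1:k+1])
        - np.diag(1/rho[1:k], 1) - np.diag(1/rho[1:k], -1))
print(np.allclose(V @ Vinv, np.eye(k)))         # True

# bidiagonal factor B with B[j,j] = Rbar_{j+1}/sqrt(Rbar_j Rbar_{j+1} rho_{j+1})
# and B[j,j+1] = -Rbar_j/sqrt(Rbar_j Rbar_{j+1} rho_{j+1}); here Rbar_{k+1} = 1
Rext = np.append(Rbar, 1.0)
s = np.sqrt(Rext[:k] * Rext[1:] * rho[1:])
B = np.diag(Rext[1:] / s) - np.diag(Rext[:k-1] / s[:k-1], 1)
print(np.allclose(B.T @ B, Vinv))               # True

# Corollary 1: the quadratic form equals the chi-square-type sum over k+1 cells
r = np.random.default_rng(2).dirichlet(50 * rho)   # random increments summing to 1
R = np.cumsum(r)[:k]
lhs = (R - Rbar) @ Vinv @ (R - Rbar)
rhs = np.sum((r - rho) ** 2 / rho)
print(np.isclose(lhs, rhs))                     # True
```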
4 Projection properties

There is, of course, an invertible linear relationship between the cumulative $R_j$ and the individual $r_j$ values via

$$ R_j = \sum_{i=1}^{j} r_i \quad \text{and} \quad r_j = R_j - R_{j-1}, \qquad j = 1, 2, \ldots, k+1. $$

Accordingly, we will have the same norm-squares $c_n^{-1}(R - \bar R)^\tau V^{-1}(R - \bar R)$ and $c_n^{-1}(r - \rho)^\tau C^{-1}(r - \rho)$ for the standardized versions of the vectors $R$ and $r$, where $c_n C$ is the covariance matrix of the vector $r$, with $C_{i,j} = \rho_i 1_{\{i=j\}} - \rho_i\rho_j$. These forms use the vectors of length $k$, because the value $r_{k+1} = 1 - \sum_{j=1}^{k} r_j$ is linearly determined from the others. It is known (and easily checked) that the matrix $C^{-1}$ has entries $(C^{-1})_{i,j} = \rho_i^{-1} 1_{\{i=j\}} + \rho_{k+1}^{-1}$ for $i, j = 1, 2, \ldots, k$ (matching the Fisher information of the multinomial), and one finds from this form that $(r - \rho)^\tau C^{-1}(r - \rho)$ is algebraically the same as $\sum_{j=1}^{k+1}(r_j - \rho_j)^2/\rho_j$, as stated in Neyman (1949). So this is another way to see the equivalence of expressions (1) and (2).

Furthermore, using suitable orthogonal vectors one can see how the chi-square statistic (2) arises as the norm square of the fully standardised cumulative distributions (3). The chi-square value $\sum_{j=1}^{k+1}(r_j - \rho_j)^2/\rho_j$ is the norm square $\|\xi - u\|^2$ of the difference between the vector with entries $\xi_j = r_j/\sqrt{\rho_j}$ and the unit vector $u$ with entries $\sqrt{\rho_j}$, for $j = 1, \ldots, k+1$. Here we examine the geometry of the situation in $\mathbb{R}^{k+1}$. The projection of $\xi$ in the direction of this unit vector has length $\xi^\tau u = \sum_{j=1}^{k+1}(r_j/\sqrt{\rho_j})\sqrt{\rho_j}$ equal to $1$. Accordingly, if $q_1, q_2, \ldots, q_k$ and $q_{k+1} = u$ are orthonormal vectors, then the chi-square value is the squared length of the projection of $\xi$ onto the space orthogonal to $u$, spanned by $q_1, \ldots, q_k$. So it is given by $\sum_{j=1}^{k} Z_j^2$ where $Z_j = \xi^\tau q_j$, $j = 1, 2, \ldots, k$, or equivalently $Z_j = (\xi - u)^\tau q_j$.
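A small numerical illustration of this geometry (not from the paper; the probabilities and counts are arbitrary): with $\xi_j = r_j/\sqrt{\rho_j}$ and $u_j = \sqrt{\rho_j}$, the projection of $\xi$ along $u$ has length $1$, $\|\xi - u\|^2$ reproduces the chi-square value, and so does the $k$-dimensional form based on $C^{-1}$.

```python
import numpy as np

rho = np.array([0.2, 0.3, 0.1, 0.4])            # rho_1, ..., rho_{k+1}
counts = np.array([12, 19, 6, 23])              # multinomial cell counts
r = counts / counts.sum()                       # observed relative frequencies

xi, u = r / np.sqrt(rho), np.sqrt(rho)
chi2 = np.sum((r - rho) ** 2 / rho)

print(np.isclose(xi @ u, 1.0))                  # projection along u has length 1
print(np.isclose(np.sum((xi - u) ** 2), chi2))  # ||xi - u||^2 is the chi-square value

# the same value from the k-dimensional form with C^{-1}
k = len(rho) - 1
Cinv = np.diag(1 / rho[:k]) + 1 / rho[k]
print(np.isclose((r[:k] - rho[:k]) @ Cinv @ (r[:k] - rho[:k]), chi2))
```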
This sort of analysis is familiar in linear regression theory. A difference here is that the entries of $\xi$ are not uncorrelated. Nevertheless, the covariance $E(\xi - u)(\xi - u)^\tau$ reduces to $c_n[I - uu^\tau]$, since it has entries

$$ \frac{E\big[(r_j - \rho_j)(r_l - \rho_l)\big]}{\sqrt{\rho_j\rho_l}} = c_n \frac{\rho_j 1_{\{j=l\}} - \rho_j\rho_l}{\sqrt{\rho_j\rho_l}}, $$

which simplifies to $c_n(1_{\{j=l\}} - \sqrt{\rho_j}\sqrt{\rho_l})$. Accordingly,

$$ E\, Z_j Z_l = E\, q_j^\tau(\xi - u)(\xi - u)^\tau q_l = c_n\, q_j^\tau(I - uu^\tau)q_l = c_n\, q_j^\tau q_l, $$

which is equal to $0$ for $j \ne l$. Thus the $Z_j$ are indeed uncorrelated and have constant variance $c_n$. This is a standard way in which we know that the chi-square statistic with $k+1$ cells is a sum of $k$ uncorrelated and standardized variables (cf. Cramér (1946)).

5 A convenient choice of orthogonal vectors

Here we wish to benefit from an explicit choice of the orthonormal vectors $q_1, \ldots, q_k$ orthogonal to $q_{k+1} = u$. We are motivated in this by the analysis in Stigler (1984). For an iid sample $Y_1, \ldots, Y_n$ from $N(\mu, \sigma^2)$ the statistic

$$ \sum_{j=2}^{n} (Y_j - \bar Y_{j-1})^2\, \frac{j-1}{j} = \sum_{j=1}^{n} (Y_j - \bar Y_n)^2 $$

is the sum of squares of the independent $N(0, \sigma^2)$ innovations (also known as standardized prediction errors) $Z_j = (Y_j - \bar Y_{j-1})/\sqrt{1 + \frac{1}{j-1}}$ and, accordingly, this sum of squares is explicitly $\sigma^2$ times a chi-square random variable with $n-1$ degrees of freedom. These innovations decorrelate the vector of the $Y_i - \bar Y_n$ using $q_j$ like those below, with $\rho_i$ replaced by $1/n$. According to Stigler (1984) and Kruskal (1946), analysis of this type originates with Helmert (1876) (cf. Rao (1973), pp. 182-183).

The analogous choice for our setting is to let $Z_j = \xi^\tau q_j$, where $q_1, \ldots, q_k, q_{k+1}$ are
the normalization of the following orthogonal vectors in $\mathbb{R}^{k+1}$:

$$ \begin{bmatrix}
\sqrt{\rho_1} & \sqrt{\rho_1} & \sqrt{\rho_1} & \cdots & \sqrt{\rho_1} & \sqrt{\rho_1} \\
-\frac{\bar R_1}{\sqrt{\rho_2}} & \sqrt{\rho_2} & \sqrt{\rho_2} & \cdots & \sqrt{\rho_2} & \sqrt{\rho_2} \\
0 & -\frac{\bar R_2}{\sqrt{\rho_3}} & \sqrt{\rho_3} & \cdots & \sqrt{\rho_3} & \sqrt{\rho_3} \\
0 & 0 & -\frac{\bar R_3}{\sqrt{\rho_4}} & \cdots & \sqrt{\rho_4} & \sqrt{\rho_4} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -\frac{\bar R_k}{\sqrt{\rho_{k+1}}} & \sqrt{\rho_{k+1}}
\end{bmatrix}. $$

Essentially the same choices of orthogonal $q_j$ for the determination of uncorrelated components $Z_j$ of $\xi - u$ are found in Irwin (1949). See also Irwin (1942), as well as Lancaster (1949) and Lancaster (1965), where the matrix from Irwin (1949) is explained as a particular member of a class of generalizations of the Helmert matrix.

The norm of the $j$-th such column for $j = 1, \ldots, k$ equals $\sqrt{\bar R_j + \bar R_j^2/\rho_{j+1}}$, which is $\sqrt{\bar R_j \bar R_{j+1}/\rho_{j+1}}$, so that, for $j = 1, \ldots, k$,

$$ q_j = \sqrt{\frac{\rho_{j+1}}{\bar R_j \bar R_{j+1}}}\, \Big[\sqrt{\rho_1}, \ldots, \sqrt{\rho_j},\; -\frac{\bar R_j}{\sqrt{\rho_{j+1}}},\; 0, \ldots, 0\Big]^\tau, $$

and $Z_j = \xi^\tau q_j$ with $\xi_i = r_i/\sqrt{\rho_i}$ becomes

$$ Z_j = \Big(r_1 + \cdots + r_j - \frac{\bar R_j\, r_{j+1}}{\rho_{j+1}}\Big)\sqrt{\frac{\rho_{j+1}}{\bar R_j \bar R_{j+1}}}, $$

which is

$$ Z_j = \frac{\rho_{j+1} R_j - \bar R_j\, r_{j+1}}{\sqrt{\bar R_j \bar R_{j+1}\,\rho_{j+1}}} $$
or, equivalently, for $j = 1, 2, \ldots, k$,

$$ Z_j = \frac{\bar R_{j+1} R_j - \bar R_j R_{j+1}}{\sqrt{\bar R_j \bar R_{j+1}\,\rho_{j+1}}}, $$

which are the innovations of the cumulative values $R_{j+1}$ (the standardized errors of linear prediction of $R_{j+1}$ using $R_1, \ldots, R_j$). As a consequence of the above properties of the $q_j$, these $Z_j$ are mean zero, orthogonal, and of constant variance $c_n$. Each of these facts can also be checked directly using $E R_j = \bar R_j$ and the specified form of the covariance $\mathrm{Cov}(R_j, R_l) = c_n[\min(\bar R_j, \bar R_l) - \bar R_j \bar R_l]$.

To summarize this section, we have presented uncorrelated components $Z_j$ of the chi-square statistic. Moreover, we have provided an interpretation of these components as innovations standardizing the cumulative distribution values. It provides a specific proof of the equivalence of expressions (1), (2) and (3).
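To make the construction concrete, the following sketch (illustrative; the cell probabilities and sample size are arbitrary) builds the generalized Helmert columns, confirms that after normalization they are orthonormal, and checks by simulation that in Case 1 the resulting $Z_j$ are essentially uncorrelated with variance close to $c_n = 1/n$.

```python
import numpy as np

rho = np.array([0.15, 0.35, 0.2, 0.3])        # rho_1, ..., rho_{k+1}
kp1 = len(rho); k = kp1 - 1
Rbar = np.cumsum(rho)                          # Rbar_1, ..., Rbar_{k+1} = 1

# column j (j = 1, ..., k): sqrt(rho_i) in rows i <= j, -Rbar_j/sqrt(rho_{j+1}) in row j+1,
# zeros below; column k+1 is the unit vector u with entries sqrt(rho_i)
Q = np.zeros((kp1, kp1))
for j in range(k):
    Q[: j + 1, j] = np.sqrt(rho[: j + 1])
    Q[j + 1, j] = -Rbar[j] / np.sqrt(rho[j + 1])
    Q[:, j] /= np.sqrt(Rbar[j] * Rbar[j + 1] / rho[j + 1])   # divide by the column norm
Q[:, k] = np.sqrt(rho)
print(np.allclose(Q.T @ Q, np.eye(kp1)))       # True: normalized columns are orthonormal

# Case 1 Monte Carlo: r = multinomial cell frequencies, xi_i = r_i / sqrt(rho_i)
rng = np.random.default_rng(3)
n, reps = 400, 50_000
counts = rng.multinomial(n, rho, size=reps)
xi = counts / n / np.sqrt(rho)
Z = xi @ Q[:, :k]                              # Z_j = xi' q_j in each replication
print(np.round(Z.mean(axis=0), 3))             # approximately zero means
print(np.round(n * np.cov(Z, rowvar=False), 2))   # approximately the identity: variance c_n = 1/n
```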
6 Large sample estimation properties

The results of the previous sections will be used to discuss the asymptotic efficiency of certain minimum distance estimators related to Case 1 and Case 2 of Section 2. Let us consider an iid random sample $X_1, \ldots, X_n$ with distribution function from a parametric family $F_\theta$, $\theta \in \Theta \subseteq \mathbb{R}^p$, points $t_1 < \cdots < t_k$, and let $R_n = R$ and $\bar R_n = \bar R$ be as in Section 2. The vector $R_n - \bar R_n$, which we denote by $(R_n - \bar R_n)(\theta)$, can be considered as a vector depending on the data and the parameter. Let $\theta_0$ denote the true parameter value. If $(R_n - \bar R_n)(\theta_0)$ converges to zero in probability $P_{\theta_0}$, we may use the generalized least squares procedure for parameter estimation, so that we minimize

$$ Q_n(\theta) = (R_n - \bar R_n)^\tau(\theta)\, V^{-1}\, (R_n - \bar R_n)(\theta) $$

for $\theta \in \Theta$. Here $V$ is the covariance matrix of $R_n - \bar R_n$ evaluated at the true value $\theta_0$ or at a consistent estimator of the true value.

Both cases from Section 2, i.e. fixed and random $t_1, \ldots, t_k$, considered in the estimation context, fulfill this requirement. Indeed, for Case 1 (fixed $t_1 < \cdots < t_k$) we have

$$ R_n = [F_n(t_1), \ldots, F_n(t_k)]^\tau, \qquad F_n(x) = \frac{1}{n}\sum_{i=1}^{n} 1_{\{X_i \le x\}}, $$
$$ \bar R_n(\theta) = E_\theta R_n = [F_\theta(t_1), \ldots, F_\theta(t_k)]^\tau. $$

Here only the expectation $\bar R_n(\theta)$ depends on $\theta$. For Case 2 (random $t_1 < \cdots < t_k$, $t_j = X_{(n_j)}$) we have

$$ R_n(\theta) = [F_\theta(X_{(n_1)}), \ldots, F_\theta(X_{(n_k)})]^\tau, \qquad \bar R_n = [F_n(X_{(n_1)}), \ldots, F_n(X_{(n_k)})]^\tau\, \frac{n}{n+1}. $$

Now only the $R_n(\theta)$ depends on $\theta$. Here $F_{\theta_0}(X_{(n_j)})$ has a Beta$(n_j,\, n+1-n_j)$ distribution,

$$ E_{\theta_0}\big[F_{\theta_0}(X_{(n_j)})\big] = \frac{n_j}{n+1} = F_n(X_{(n_j)})\,\frac{n}{n+1}, $$

so that

$$ \bar R_n = \Big[\frac{n_1}{n+1}, \ldots, \frac{n_k}{n+1}\Big]^\tau. $$

In both cases, the convergence in probability $P_{\theta_0}$ of $(R_n - \bar R_n)(\theta_0)$ to zero is a consequence of the form of the variances of the mean zero differences, which are $(1/n)\,\bar R_j(1-\bar R_j)$ in Case 1 and $(1/(n+2))\,\bar R_j(1-\bar R_j)$ in Case 2. However, there is a difference in the analysis of the estimation properties for the two mentioned cases. Let us discuss them separately.

Case 1. For fixed $t_1 < \cdots < t_k$ we can express the functional $Q_{n,V}(\theta)$ as

$$ Q_{n,V}(\theta) = [F_n(t_1) - F_\theta(t_1), \ldots, F_n(t_k) - F_\theta(t_k)]\, V^{-1}\, [F_n(t_1) - F_\theta(t_1), \ldots, F_n(t_k) - F_\theta(t_k)]^\tau. $$

For the complete standardization we should use the matrix

$$ V_\theta = \mathrm{Var}_\theta\big[F_n(t_1) - F_\theta(t_1), \ldots, F_n(t_k) - F_\theta(t_k)\big]^\tau $$

that depends on the parameter.
Results derived in Sections 3 and 4 then guarantee that minimizing the functional $Q_n(\theta) = Q_{n,V_\theta}(\theta)$ leads to the classical Pearson minimum chi-square estimator (see e.g. Hsiao (2006) for its best asymptotically normal (BAN) distribution properties, and see also Amemiya (1976), Berkson (1949), Berkson (1980), Bhapkar (1966), Fisher (1924), Taylor (1953) for more about minimum chi-square estimation). However, this estimation procedure can also be set in the framework of the generalized method of moments (GMM). Indeed, if we use a fixed $V$, or we use $V_{\tilde\theta}$ where $\tilde\theta$ is a consistent estimator of the true parameter value, instead of $V_\theta$ in the functional $Q_{n,V}(\theta)$, then, as shown in Benšić (2015), this estimation procedure can be seen as a GMM procedure.

Let $\hat\theta_{k,n}$ denote the estimator obtained by minimization of the functional $Q_{n,V_{\tilde\theta}}(\theta)$. Refining the notation from Section 2:

$$ A_i = (t_{i-1}, t_i],\; i = 1, \ldots, k, \qquad A_{k+1} = (t_k, \infty), $$
$$ P_n(A_i) = F_n(t_i) - F_n(t_{i-1}), \qquad \tilde P(A_i) = F_{\tilde\theta}(t_i) - F_{\tilde\theta}(t_{i-1}), $$
$$ P(A_i; \theta) = F_\theta(t_i) - F_\theta(t_{i-1}), \qquad P_0(A_i) = F_{\theta_0}(t_i) - F_{\theta_0}(t_{i-1}), $$

analogously to the results of the previous sections we see that

$$ \hat\theta_{k,n} = \operatorname*{argmin}_{\theta \in \Theta} \sum_{i=1}^{k+1} \frac{\big(P_n(A_i) - P(A_i; \theta)\big)^2}{\tilde P(A_i)}. \qquad (7) $$

If the classical assumptions of generalized method of moments theory are fulfilled (see e.g. Newey and McFadden (1994), Harris and Matyas (1999)), it is shown in Benšić (2015) that $\lim_n [n\,\mathrm{Var}(\hat\theta_{k,n})]$ has inverse $G_0^\tau V_0^{-1} G_0$, where $G_0$ and $V_0$ are, respectively, the matrices
$[\frac{\partial}{\partial\theta}F_\theta(t_1), \ldots, \frac{\partial}{\partial\theta}F_\theta(t_k)]^\tau$ and the covariance matrix of $[F_n(t_1), \ldots, F_n(t_k)]^\tau$, both evaluated at the true parameter value $\theta_0$. Using the tridiagonal form for the inverse of $V_0$ we can simplify this limit. Indeed, if the model is regular, let $S(x) = \frac{\partial}{\partial\theta}\log f(x; \theta)\big|_{\theta_0}$ be the population score function evaluated at the true parameter value. Now we have

$$ G_0 = \begin{bmatrix} E_{\theta_0}\big(S\, 1_{(-\infty, t_1]}\big)^\tau \\ \vdots \\ E_{\theta_0}\big(S\, 1_{(-\infty, t_k]}\big)^\tau \end{bmatrix} $$

and

$$ G_0^\tau V_0^{-1} G_0 = \sum_{i=1}^{k+1} \frac{1}{P_0(A_i)} \int_{t_{i-1}}^{t_i} S(x) f(x; \theta_0)\, dx \int_{t_{i-1}}^{t_i} S^\tau(x) f(x; \theta_0)\, dx = \sum_{i=1}^{k+1} P_0(A_i)\, E_{\theta_0}[S(X) \mid A_i]\, E_{\theta_0}[S(X) \mid A_i]^\tau. $$

This can be interpreted as a Riemann-Stieltjes discretization of the Fisher information, which arises in the limit of large $k$.

Let us note the similarity of $\hat\theta_{k,n}$ and the minimum chi-square estimator. From (7) we see that they differ only in the denominator, so we can interpret $\hat\theta_{k,n}$ as a modified minimum chi-square estimator. It is well known that various minimum chi-square estimators are in fact generalized least squares (see e.g. Amemiya (1976), Harris and Kanji (1983), Hsiao (2006)) and BAN estimators. Likewise, the norm square of the standardized empirical distribution has been known to also provide a generalized least squares estimator. What was not recognized is that minimizing the norm square of the fully standardized empirical distribution is the same as minimizing chi-square.
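To see how (5) approaches the Fisher information, consider a hypothetical one-parameter model, say the exponential density $f(x;\theta) = \theta e^{-\theta x}$ on $(0,\infty)$, for which $E[S^2(X)] = 1/\theta^2$. The sketch below (not from the paper; the parameter value and cell counts are arbitrary) evaluates $\sum_{A\in\pi} P_{\theta_0}(A)\,(E_{\theta_0}[S(X)\mid A])^2$ on $k+1$ equal-probability cells, using the identity $P_{\theta_0}(A)\,E_{\theta_0}[S(X)\mid A] = \partial P_\theta(A)/\partial\theta|_{\theta_0}$, and compares it with $1/\theta_0^2$ as $k$ grows.

```python
import numpy as np

theta0 = 1.5                 # hypothetical true parameter; Fisher information is 1/theta0**2

def discretized_info(k):
    # cut points at the j/(k+1) quantiles of Exp(theta0), giving k+1 equal-probability cells
    q = np.arange(1, k + 1) / (k + 1)
    t = -np.log(1 - q) / theta0
    a = np.concatenate(([0.0], t))                          # left endpoints of A_1, ..., A_{k+1}
    expa = np.exp(-theta0 * a)
    expb = np.concatenate((np.exp(-theta0 * t), [0.0]))     # right-endpoint terms; 0 for A_{k+1}
    P = expa - expb                                         # P_0(A_j)
    dP = -a * expa + np.concatenate((t * np.exp(-theta0 * t), [0.0]))   # dP_theta(A_j)/dtheta
    return np.sum(dP ** 2 / P)                              # = sum_j P_0(A_j) E[S|A_j]^2

for k in (1, 4, 16, 64, 256):
    print(k, round(discretized_info(k), 5), round(1 / theta0 ** 2, 5))
```

The discretized information stays below the full Fisher information and approaches $1/\theta_0^2$ as $k$ grows, in line with the interpretation of (5) as a Riemann-Stieltjes discretization of $E[S^2(X)]$.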
Case 2. In this case we have $t_j = X_{(n_j)}$, so that the value $F_n(t_j) = n_j/n$ is predetermined. The random part $F_\theta(X_{(n_j)})$ of $Q_n(\theta)$ depends on the parameter, but the matrix $V$ (which should be used for the complete standardization) does not depend on $\theta$ and can be computed from the covariances of uniform order statistics. Now, the results from Sections 3 and 4 enable us to represent the minimizer of the functional $Q_n(\theta)$ as

$$ \hat\theta = \operatorname*{argmin}_{\theta \in \Theta} \sum_{i=1}^{k+1} \frac{\Big(\big(F_\theta(X_{(n_i)}) - F_\theta(X_{(n_{i-1})})\big) - \frac{n_i - n_{i-1}}{n+1}\Big)^2}{\frac{n_i - n_{i-1}}{n+1}}. \qquad (8) $$

To discuss the asymptotic properties of this estimator, let us suppose that all data are different and $k = n$, so that the estimator can easily be recognized as a generalized spacing estimator (GSE) (see Ghosh and Rao Jammalamadaka (2001), Cheng and Amin (1983), Ranneby (1984)). Namely, if $n_i - n_{i-1} = 1$ then

$$ Q_n(\theta) = (n+2)(n+1) \sum_{i=1}^{n+1} \Big(F_\theta(X_{(i)}) - F_\theta(X_{(i-1)}) - \frac{1}{n+1}\Big)^2. $$

Obviously,

$$ \hat\theta_n = \operatorname*{argmin}_{\theta \in \Theta} \sum_{i=1}^{n+1} h\big(F_\theta(X_{(i)}) - F_\theta(X_{(i-1)})\big), \qquad (9) $$

where $h(t) = t^2$. A detailed discussion of the conditions for consistency and asymptotic normality for this type of estimator can be found in Ghosh and Rao Jammalamadaka (2001). If we apply these results with $h(t) = t^2$, it comes out that we face a lack of the BAN distribution property for $\hat\theta_n$. To illustrate this, let us suppose, for simplicity, that $\theta$ is a scalar. Theorem 3.2 from Ghosh and Rao Jammalamadaka (2001) gives a necessary and sufficient condition on $h$ to generate a GSE with minimum variance within a given class of functions which includes $h(t) = t^2$. It is stated there that the asymptotic variance of a GSE is minimized with $h(t) = a\log t + bt + c$, where $a$, $b$ and $c$ are constants. Based on the results formulated in Theorem 3.1 from the same paper, it is also possible to
calculate the asymptotic variance of the GSE for a given function $h$ under some regularity conditions on the population density. Thus, for $h(t) = t^2$ the expression (9) of Theorem 3.1 from Ghosh and Rao Jammalamadaka (2001) equals $2$, which means that the asymptotic variance of our estimator (under mild conditions on the population density) is $2/I(\theta_0)$, where $I(\theta_0)$ denotes the Fisher information. So, for these cases $\hat\theta_n$ from (9) is not BAN; it is only 50% efficient asymptotically.

However, it is possible to reach the BAN distribution property for Case 2 and $k = n$ through an iterative procedure which includes a modification of the denominator in (8) in each step:

1. Let
$$ Q_n(\theta, \theta') = \sum_{i=1}^{n+1} \frac{\big(F_\theta(X_{(i)}) - F_\theta(X_{(i-1)}) - \frac{1}{n+1}\big)^2}{F_{\theta'}(X_{(i)}) - F_{\theta'}(X_{(i-1)})}. $$
2. Let $\tilde\theta$ be a consistent estimator of the true $\theta$.
3. Set $\theta_1 = \tilde\theta$ and $\theta_{j+1} = \operatorname*{argmin}_\theta Q_n(\theta, \theta_j)$, $j = 1, 2, \ldots$

To show this, let us denote $F_\theta = [F_\theta(X_{(1)}), \ldots, F_\theta(X_{(n)})]^\tau$, $e = [1, 2, \ldots, n]^\tau \in \mathbb{R}^n$, $G_\theta = [\frac{\partial}{\partial\theta}F_\theta(X_{(1)}), \ldots, \frac{\partial}{\partial\theta}F_\theta(X_{(n)})]^\tau$, and

$$ V_\theta = \begin{bmatrix}
F_\theta(X_{(1)})(1-F_\theta(X_{(1)})) & F_\theta(X_{(1)})(1-F_\theta(X_{(2)})) & \cdots & F_\theta(X_{(1)})(1-F_\theta(X_{(n)})) \\
F_\theta(X_{(1)})(1-F_\theta(X_{(2)})) & F_\theta(X_{(2)})(1-F_\theta(X_{(2)})) & \cdots & F_\theta(X_{(2)})(1-F_\theta(X_{(n)})) \\
\vdots & \vdots & \ddots & \vdots \\
F_\theta(X_{(1)})(1-F_\theta(X_{(n)})) & F_\theta(X_{(2)})(1-F_\theta(X_{(n)})) & \cdots & F_\theta(X_{(n)})(1-F_\theta(X_{(n)}))
\end{bmatrix}, $$

so that $\frac{1}{n+1}e = \bar R_n$ is the vector of means $[\frac{1}{n+1}, \ldots, \frac{n}{n+1}]^\tau$.
As in the Gauss-Newton method for nonlinear least squares, here we consider the quadratic approximation

$$ \theta \mapsto \hat Q_n(\theta, \theta_j) = \Big(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \tfrac{1}{n+1}e\Big)^\tau V_{\theta_j}^{-1}\Big(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \tfrac{1}{n+1}e\Big) $$

of the function $\theta \mapsto Q_n(\theta, \theta_j) = \big(F_\theta - \tfrac{1}{n+1}e\big)^\tau V_{\theta_j}^{-1}\big(F_\theta - \tfrac{1}{n+1}e\big)$ about the point $\theta_j$. Instead of the nonlinear optimization problem $\min_\theta Q_n(\theta, \theta_j)$, in every iteration we solve the simple problem $\min_\theta \hat Q_n(\theta, \theta_j)$, which has an explicit solution. Then the corresponding iterative procedure is given by

$$ \theta_{j+1} = \operatorname*{argmin}_{\theta} \Big(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \tfrac{1}{n+1}e\Big)^\tau V_{\theta_j}^{-1}\Big(F_{\theta_j} + G_{\theta_j}(\theta - \theta_j) - \tfrac{1}{n+1}e\Big), $$

or explicitly

$$ \theta_{j+1} = \theta_j + \big(G_{\theta_j}^\tau V_{\theta_j}^{-1} G_{\theta_j}\big)^{-1} G_{\theta_j}^\tau V_{\theta_j}^{-1}\Big(\tfrac{1}{n+1}e - F_{\theta_j}\Big), \qquad j = 1, 2, \ldots \qquad (10) $$

If we suppose that the sequence $(\theta_j)$ is convergent, then

$$ G_{\theta_j}^\tau V_{\theta_j}^{-1}\Big(\tfrac{1}{n+1}e - F_{\theta_j}\Big) \to 0, $$

i.e. the limit of the sequence $(\theta_j)$ is the solution of the equation

$$ G_\theta^\tau V_\theta^{-1}\Big(\tfrac{1}{n+1}e - F_\theta\Big) = 0. \qquad (11) $$
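A minimal sketch of iteration (10) for a hypothetical scalar model (an exponential distribution with rate $\theta$ fitted to simulated exponential data; the data, starting value and stopping rule are arbitrary choices for illustration, and $V_\theta$, $G_\theta$, $e$ are as defined above):

```python
import numpy as np

rng = np.random.default_rng(4)
theta_true = 2.0
x = np.sort(rng.exponential(1 / theta_true, size=100))   # ordered sample X_(1) <= ... <= X_(n)
n = len(x)
target = np.arange(1, n + 1) / (n + 1)                    # e/(n+1)

F = lambda th: 1 - np.exp(-th * x)                        # F_theta at the order statistics
G = lambda th: x * np.exp(-th * x)                        # dF_theta/dtheta at the order statistics

def V(th):
    f = F(th)
    return np.minimum.outer(f, f) * (1 - np.maximum.outer(f, f))

theta = 1 / x.mean()                                      # a consistent starting value (here the MLE)
for _ in range(50):
    g, resid = G(theta), target - F(theta)
    w = np.linalg.solve(V(theta), g)                      # V_theta^{-1} G_theta
    step = (w @ resid) / (w @ g)                          # update (10) for scalar theta
    theta += step
    if abs(step) < 1e-10:
        break

print(theta)    # if the iteration converges, the limit solves G' V^{-1} (e/(n+1) - F) = 0
```

If the iteration converges, its limit is a stationary point of the log-spacing criterion discussed immediately below.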
Let us consider the function

$$ S(\theta) = \sum_{i=1}^{n+1} h\big(F_\theta(X_{(i)}) - F_\theta(X_{(i-1)})\big), $$

where $h(t) = \log t$. Note that

$$ S'(\theta) = (n+1)\, G_\theta^\tau V_\theta^{-1}\Big(\tfrac{1}{n+1}e - F_\theta\Big), $$

i.e. the condition $S'(\theta) = 0$ is exactly the same as equation (11). Finally, we have shown the following: if the sequence $(\theta_j)$ given by (10) is convergent, then it converges to a stationary point of the function $\theta \mapsto \sum_{i=1}^{n+1} h\big(F_\theta(X_{(i)}) - F_\theta(X_{(i-1)})\big)$, where $h(t) = \log t$. Here, $Q_n(\theta, \theta')$ is algebraically the same functional as the one described in Case 1 if we intentionally choose the fixed $t_j$ to be the same as $x_{(n_j)}$ and behave as if we are in Case 1.

7 Conclusion

In previous work, Benšić (2015), it has been shown by simulations that fully standardizing the cumulative distribution produces estimators that are superior to those that minimize the Cramér-von Mises and Anderson-Darling statistics. Now, as a result of the present work, we understand that this amounts to advocacy of minimum chi-square estimators as superior to estimators based on the minimum distance between (unstandardized) cumulative distributions. We have seen here that for both fixed $t_1, \ldots, t_k$ and quantiles $t_i = X_{(n_i)}$ the form of the covariance of $(F_n(t_i) - F(t_i),\; i = 1, \ldots, k)$ permits simple standardization and a simple relationship to the chi-square statistic. For fixed $t_1, \ldots, t_k$ we also see clearly the chi-square($k$) asymptotic distribution. However, we caution that using all the empirical quantiles ($k = n$, $n_i = i$, $t_i = X_{(i)}$), the standardized $(F(X_{(i)}) - \frac{i}{n+1},\; i = 1, \ldots, n)$ is not shown to have an effective norm square for estimation, being only 50% efficient. A modified chi-square-like formulation is given for the empirical quantiles that is fully efficient.

References

Amemiya, T. (1976). The Maximum Likelihood, the Minimum Chi-Square and the Nonlinear Weighted Least-Squares Estimator in the General Qualitative Response Model, Journal of the American Statistical Association 71.
Anderson, T. W. and D. A. Darling (1952). Asymptotic Theory of Certain "Goodness of Fit" Criteria Based on Stochastic Processes, The Annals of Mathematical Statistics 23.

Benšić, M. (2015). Properties of the generalized nonlinear least squares method applied for fitting distribution to data, Discussiones Mathematicae Probability and Statistics 35.

Benšić, M. (2014). Fitting distribution to data by a generalized nonlinear least squares method, Communications in Statistics - Simulation and Computation 43.

Berkson, J. (1949). Minimum χ² and maximum likelihood solution in terms of a linear transform, with particular reference to bio-assay, Journal of the American Statistical Association 44.

Berkson, J. (1980). Minimum chi-square, not maximum likelihood!, The Annals of Statistics 8.

Bhapkar, V. P. (1966). A Note on the Equivalence of Two Test Criteria for Hypotheses in Categorical Data, Journal of the American Statistical Association 61.

Cheng, R. C. H. and N. A. K. Amin (1983). Estimating parameters in continuous univariate distributions with a shifted origin, Journal of the Royal Statistical Society, Series B 45.

Cramér, H. (1946). Mathematical Methods of Statistics, Princeton University Press.

Fisher, R. A. (1924). The Conditions Under Which χ² Measures the Discrepancy Between Observation and Hypothesis, Journal of the Royal Statistical Society 87.
Ghosh, K. and S. Rao Jammalamadaka (2001). A general estimation method using spacings, Journal of Statistical Planning and Inference 93.

Harris, D. and L. Matyas (1999). Introduction to the generalized method of moments estimation, in: Matyas, L. (Ed.), Generalized Method of Moments Estimation, Cambridge University Press, Cambridge.

Harris, R. R. and G. K. Kanji (1983). On the use of minimum chi-square estimation, The Statistician 32.

Helmert, F. R. (1876). Die Genauigkeit der Formel von Peters zur Berechnung des wahrscheinlichen Beobachtungsfehlers directer Beobachtungen gleicher Genauigkeit, Astronomische Nachrichten 88.

Hsiao, C. (2006). Minimum Chi-Square, Encyclopedia of Statistical Sciences 7, John Wiley & Sons.

Irwin, J. O. (1942). On the Distribution of a Weighted Estimate of Variance and on Analysis of Variance in Certain Cases of Unequal Weighting, Journal of the Royal Statistical Society 105.

Irwin, J. O. (1949). A Note on the Subdivision of χ² into Components, Biometrika 36.

Kruskal, W. H. (1946). Helmert's Distribution, American Mathematical Monthly 53.

Lancaster, H. O. (1949). The Derivation and Partition of χ² in Certain Discrete Distributions, Biometrika 36.
Lancaster, H. O. (1965). The Helmert Matrices, The American Mathematical Monthly 72.

Newey, W. K. and D. McFadden (1994). Large Sample Estimation and Hypothesis Testing, in: Engle, R. and D. McFadden (Eds.), Handbook of Econometrics, Vol. 4, New York: North Holland.

Neyman, J. (1949). Contribution to the theory of the χ² test, in: Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability.

Ranneby, B. (1984). The maximum spacing method: an estimation method related to the maximum likelihood method, Scandinavian Journal of Statistics 11.

Rao, C. R. (1973). Linear Statistical Inference and Its Applications (2nd ed.), New York: John Wiley.

Stigler, S. M. (1984). Kruskal's Proof of the Joint Distribution of X̄ and s², The American Statistician 38.

Taylor, W. F. (1953). Distance functions and regular best asymptotically normal estimates, The Annals of Mathematical Statistics 24.