Tube formula approach to testing multivariate normality and testing uniformity on the sphere
Akimichi Takemura (University of Tokyo) and Satoshi Kuriki (Institute of Statistical Mathematics)
December 11, 2010

References
This talk is based on the following two papers:
Kuriki and Takemura (2008). The tube method for the moment index in projection pursuit. Journal of Statistical Planning and Inference, 138, No. 9, 2749-2762.
Kuriki and Takemura (2004). Tail probabilities of the limiting null distributions of the Anderson-Stephens statistics. Journal of Multivariate Analysis, 89, 261-291.

Contents
1 Tube formula approximation to maximum type test statistics
2 Projection pursuit index and testing multivariate normality
3 Anderson-Stephens statistic for testing uniformity on sphere
4 Summary

Tube formula approximation to maximum type test statistics
Tube formula: some historical background
Jacob Steiner already had Steiner's formula (1840) for the volume of a tube of a convex set.
Minkowski defined mixed volumes.
Hotelling (1939) derived the tube formula for a one-dimensional curve, and H. Weyl immediately generalized it to general dimension.
Hotelling's motivation was a nonlinear regression problem.
Revival of the tube formula in statistics around 1990 (Knowles-Siegmund (1989), J. Sun (1991, 93) and many others).

Euler characteristic method (independent development)
The Euler characteristic heuristic was initiated by R. J. Adler for approximating the distribution of the maximum of a random field (Adler-Hasofer (1976), Adler's book (1981)).
This method has been vigorously developed by Adler and Keith Worsley.
Some important foundational work was done by Jonathan Taylor (2001 thesis).
A standard textbook now is Random Fields and Geometry by Adler and Taylor, 2007, Springer.

Two methods are equivalent
Around 2000, Kuriki and I were attending a talk by Keith Worsley at ISM (Institute of Statistical Mathematics, Tokyo, Japan) and were surprised that he was doing the same computations as we were.
Takemura and Kuriki (2002) proved the equivalence of these two methods by using Morse's theorem (for the finite dimensional case).
The tube method can be understood as a finite dimensional specialization of the Euler characteristic method.
I should also mention that the abstract tube of Naiman and Wynn is a discrete analog of the tube formula.

Canonical form of tube formula
Let $z = (z_1, \ldots, z_n)^\top \sim N_n(0, I_n)$.
Let $M \subset S^{n-1}$ be a $C^2$-submanifold of dimension $d = \dim M$ with piecewise smooth boundaries.
Let
\[ Z(u) = u^\top z = \sum_{i=1}^{n} u_i z_i, \qquad u = (u_1, \ldots, u_n)^\top \in M. \]
Also consider the standardized random field
\[ Y(u) = u^\top z / \|z\|, \qquad u \in M, \quad \|z\| = \sqrt{z^\top z}. \]

Canonical form of tube formula
We want to evaluate the distributions of maxima, corresponding to maximum type test statistics:
\[ T = \max_{u \in M} Z(u), \qquad U = \max_{u \in M} Y(u). \]
The tube method gives an approximation of the tail probabilities
\[ P(T \ge x), \quad x \to \infty, \qquad \text{and} \qquad P(U \ge x), \quad x \uparrow 1. \]

Spherical tube and its volume
Evaluation of the distribution reduces to the evaluation of the volume of a spherical tube around $M$.
[Figure: Spherical tube $M_\theta$ of radius $\theta$ around $M$ on $S^{n-1}$.]

Spherical tube and its volume
Let
\[ M_\theta = \Bigl\{ v \in S^{n-1} \Bigm| \min_{u \in M} \cos^{-1}(u^\top v) \le \theta \Bigr\} \]
denote the tube around $M$ with radius $\theta$.
Let $\mathrm{Vol}(M_\theta)$ denote the $(n-1)$-dimensional spherical volume of $M_\theta$. By definition,
\[ P\Bigl( \max_{u \in M} Y(u) \ge \cos\theta \Bigr) = \mathrm{Vol}(M_\theta) / \Omega_n, \]
where
\[ \Omega_n = \mathrm{Vol}(S^{n-1}) = \frac{2\pi^{n/2}}{\Gamma(n/2)}, \]
and $B_{a,b}(\cdot)$ denotes the upper probability of the beta distribution with parameters $(a, b)$.
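
The identity between the tail of $\max Y(u)$ and a normalized tube volume can be checked numerically in the simplest case. The sketch below assumes $n = 3$ and takes $M$ to be the single point $e_1$ (an illustrative choice, not from the talk), so that the tube is a spherical cap with area fraction $(1 - \cos\theta)/2$. The function names are hypothetical.

```python
import math
import random


def cap_fraction(theta):
    """Exact fraction of the sphere S^2 within geodesic distance theta of a point."""
    # Cap area is 2*pi*(1 - cos(theta)); the whole sphere has area 4*pi.
    return (1.0 - math.cos(theta)) / 2.0


def simulate_cap_probability(theta, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(Y(u) >= cos(theta)) for M = {e_1}, n = 3."""
    rng = random.Random(seed)
    threshold = math.cos(theta)
    hits = 0
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in z))
        # Y(e_1) = z_1 / ||z|| is the first coordinate of a uniform point on S^2.
        if z[0] / norm >= threshold:
            hits += 1
    return hits / n_draws
```

Comparing `simulate_cap_probability(0.7)` with `cap_fraction(0.7)` should show agreement up to Monte Carlo error.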

Tube formula for the volume of a spherical tube
Tube formula: for $\theta$ smaller than the critical radius,
\[ \mathrm{Vol}(M_\theta) = \Omega_n \Bigl\{ w_{d+1} B_{\frac{d+1}{2}, \frac{n-d-1}{2}}(\cos^2\theta) + w_d B_{\frac{d}{2}, \frac{n-d}{2}}(\cos^2\theta) + \cdots + w_1 B_{\frac{1}{2}, \frac{n-1}{2}}(\cos^2\theta) \Bigr\}, \]
where $w_1, \ldots, w_{d+1}$ are geometric invariants of $M$, which can be evaluated by differential geometric methods. In particular,
\[ w_{d+1} = \mathrm{Vol}(M) / \Omega_{d+1}, \qquad w_d = \mathrm{Vol}(\partial M) / (2\Omega_d). \]
We omit the explanation of the critical radius in this talk.

Tail probability for T (non-standardized maximum)
For $T = \max_{u \in M} Z(u)$ we need to integrate the tube formula with respect to $\|z\|$. Carrying out this integration gives
\[ P\Bigl( \max_{u \in M} Z(u) \ge x \Bigr) = w_{d+1} \bar{G}_{d+1}(x^2) + w_d \bar{G}_d(x^2) + \cdots + w_1 \bar{G}_1(x^2) + O\bigl( \bar{G}_n(x^2 (1 + \tan^2\theta_c)) \bigr), \]
where $\bar{G}_a(\cdot)$ is the upper probability of the $\chi^2$ distribution with $a$ degrees of freedom and $\theta_c$ is the critical radius.
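
A small worked case may help fix ideas. The sketch below assumes $M$ is a great-circle arc of length $L \le \pi$ (an illustrative choice, not from the talk), so that $d = 1$, $w_2 = L/(2\pi)$, and the two endpoints give $w_1 = 1/2$. It uses $\bar{G}_2(x^2) = e^{-x^2/2}$ and $\bar{G}_1(x^2) = \mathrm{erfc}(x/\sqrt{2})$. All function names are hypothetical.

```python
import math
import random


def tube_tail_arc(x, length):
    """Two-term tube approximation of P(max Z(u) >= x) for an arc, x >= 0."""
    w2 = length / (2.0 * math.pi)        # Vol(M) / Omega_2
    w1 = 0.5                             # contribution of the two endpoints
    g2 = math.exp(-x * x / 2.0)          # upper tail of chi-square, 2 d.f., at x^2
    g1 = math.erfc(x / math.sqrt(2.0))   # upper tail of chi-square, 1 d.f., at x^2
    return w2 * g2 + w1 * g1


def max_over_arc(z1, z2, length):
    """max over t in [0, length] of z1*cos(t) + z2*sin(t)."""
    phi = math.atan2(z2, z1) % (2.0 * math.pi)
    if phi <= length:
        return math.hypot(z1, z2)        # the maximizing direction lies on the arc
    # Otherwise the maximum is attained at one of the two endpoints.
    return max(z1, z1 * math.cos(length) + z2 * math.sin(length))


def simulate_tail_arc(x, length, n_draws=200_000, seed=1):
    """Monte Carlo estimate; only the two coordinates spanning the arc matter."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if max_over_arc(z1, z2, length) >= x:
            hits += 1
    return hits / n_draws
```

For example, `tube_tail_arc(2.0, 2.0)` and `simulate_tail_arc(2.0, 2.0)` are expected to agree closely for this simple arc.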

Projection pursuit index and testing multivariate normality
Projection pursuit index by Jones and Sibson
$x_t \in R^q$, $t = 1, \ldots, n$: observation vectors.
$S^{q-1}$: the unit sphere in $R^q$.
$z_t = h^\top x_t$: projection of $x_t$ onto the direction $h$.
Projection pursuit looks for the direction $h$ such that the projected data $z_1, \ldots, z_n$ do not look normally distributed.
$I_n(h)$: projection pursuit index, which measures non-normality of the projected data.
Maximize $I_n(h)$ over $h \in S^{q-1}$.

Projection pursuit index by Jones and Sibson
The null hypothesis: $H_0$: $x_t \sim N_q(\mu, \Sigma)$, i.i.d.
$K_{k,n}(h)$: the $k$th sample cumulant of the projected data $z_1, \ldots, z_n$.
Skewness and kurtosis:
$B_{1,n}(h) = K_{3,n}(h) / K_{2,n}(h)^{3/2}$: the sample skewness
$B_{2,n}(h) = K_{4,n}(h) / K_{2,n}(h)^{2}$: the sample kurtosis
Jones-Sibson index:
\[ I_n(h) = \frac{n}{6} B_{1,n}(h)^2 + \frac{n}{24} B_{2,n}(h)^2 \]
What is the asymptotic distribution of $\max_{h \in S^{q-1}} I_n(h)$? This can be solved by the tube formula.
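
The index can be computed directly from projected data. The sketch below is a minimal illustration that assumes the sample cumulants are the simple moment-based versions ($K_2 = m_2$, $K_3 = m_3$, $K_4 = m_4 - 3m_2^2$ with central moments $m_k$); the talk does not specify which sample cumulant estimator is used, and unbiased k-statistics would give slightly different values. The crude grid search for $q = 2$ and all function names are likewise assumptions for illustration.

```python
import math


def jones_sibson_index(data, h):
    """Jones-Sibson index I_n(h) of the data projected onto the direction h."""
    n = len(data)
    z = [sum(hi * xi for hi, xi in zip(h, x)) for x in data]
    mean = sum(z) / n
    # Central moments of the projected data.
    m2 = sum((v - mean) ** 2 for v in z) / n
    m3 = sum((v - mean) ** 3 for v in z) / n
    m4 = sum((v - mean) ** 4 for v in z) / n
    # Assumption: moment-based sample cumulants.
    k2, k3, k4 = m2, m3, m4 - 3.0 * m2 ** 2
    skewness = k3 / k2 ** 1.5
    kurtosis = k4 / k2 ** 2
    return n / 6.0 * skewness ** 2 + n / 24.0 * kurtosis ** 2


def max_index_2d(data, n_grid=360):
    """Crude maximization of I_n(h) over a grid of directions in the plane."""
    best = -1.0
    for i in range(n_grid):
        angle = math.pi * i / n_grid  # h and -h give the same index
        h = (math.cos(angle), math.sin(angle))
        best = max(best, jones_sibson_index(data, h))
    return best
```

A grid search is only practical for small $q$; in higher dimensions a numerical optimizer over the sphere would be needed.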

Projection pursuit index by Jones and Sibson
Theorem 1 (Asymptotic distribution of the random field)
Let $\xi_1 \in R^{q^3}$, $\xi_2 \in R^{q^4}$ be random vectors consisting of independent standard normal random variables. For a unit vector $h \in S^{q-1}$, let
\[ Z_1(h) = (h \otimes h \otimes h)^\top \xi_1, \qquad Z_2(h) = (h \otimes h \otimes h \otimes h)^\top \xi_2, \]
where $\otimes$ denotes the Kronecker product. Under the null hypothesis $H_0$ of multivariate normality, as $n \to \infty$, $\max_{h \in S^{q-1}} I_n(h)$ converges in distribution to $\max_{h \in S^{q-1}} I(h)$, where
\[ I(h) = Z_1(h)^2 + Z_2(h)^2. \]
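
Because $\xi_1$ and $\xi_2$ have identity covariance, the covariance functions of the limiting fields are inner products of Kronecker powers, and these reduce to $(h^\top g)^3$ and $(h^\top g)^4$. The sketch below checks this identity numerically and shows how $Z_1(h)$ and $Z_2(h)$ can be evaluated for a given draw of $\xi$; the helper names are hypothetical.

```python
import math


def kron(a, b):
    """Kronecker product of two vectors given as lists."""
    return [x * y for x in a for y in b]


def kron_power(h, k):
    """k-fold Kronecker product of h with itself (length q**k)."""
    out = [1.0]
    for _ in range(k):
        out = kron(out, h)
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Normalize v to unit Euclidean length."""
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


def limit_field(h, k, xi):
    """Z_1(h) for k = 3 or Z_2(h) for k = 4, given a standard normal vector xi."""
    return dot(kron_power(h, k), xi)
```

In particular each field has unit variance, since the Kronecker power of a unit vector is again a unit vector.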

Projection pursuit index by Jones and Sibson
Theorem 2 (Asymptotic distribution of Jones-Sibson index)
As $c \to \infty$,
\[ P\Bigl( \max_{h \in S^{q-1}} I(h) \ge c^2 \Bigr) = \sum_{e=0,\ e:\text{even}}^{q} \kappa_e \frac{\Gamma\bigl(\frac{q+1-e}{2}\bigr)}{2^{1+e/2} \pi^{(q+1)/2}} \bar{G}_{q+1-e}(c^2) (1 + o(1)), \]
where
\[ \kappa_e = \Omega_q \frac{(-3)^{e/2} (q-1)!}{(q-e)!} \sum_{j=0}^{e/2} \frac{(q-e-2j)}{(e/2-j)!\, j!} (-2)^j E_{(q-1-e)/2-j}, \]
\[ E_k = \int_{-\pi/2}^{\pi/2} (3\cos^2\theta + 4\sin^2\theta)^k \, d\theta. \]
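
The expansion in Theorem 2 is straightforward to evaluate numerically. The sketch below is one way to do so, using the constants as read from the theorem (in particular the signs in $\kappa_e$, which should be checked against Kuriki and Takemura (2008)). The midpoint rule for $E_k$, the closed-form chi-square tail for integer degrees of freedom, and all function names are choices made here for illustration.

```python
import math


def chi2_sf(y, k):
    """Upper tail probability of the chi-square distribution, integer k >= 1."""
    if k % 2 == 0:
        term, total = 1.0, 1.0
        for j in range(1, k // 2):
            term *= (y / 2.0) / j
            total += term
        return math.exp(-y / 2.0) * total
    total = math.erfc(math.sqrt(y / 2.0))
    term = math.sqrt(2.0 * y / math.pi) * math.exp(-y / 2.0)
    for j in range(1, (k - 1) // 2 + 1):
        total += term
        term *= y / (2 * j + 1)
    return total


def E(k, n_steps=2000):
    """Midpoint rule for the integral of (3cos^2 + 4sin^2)^k over [-pi/2, pi/2]."""
    step = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        t = -math.pi / 2.0 + (i + 0.5) * step
        total += (3.0 * math.cos(t) ** 2 + 4.0 * math.sin(t) ** 2) ** k
    return total * step


def kappa(q, e):
    """kappa_e as read from Theorem 2 (e even, 0 <= e <= q)."""
    omega_q = 2.0 * math.pi ** (q / 2.0) / math.gamma(q / 2.0)
    prefactor = (omega_q * (-3.0) ** (e // 2)
                 * math.factorial(q - 1) / math.factorial(q - e))
    total = 0.0
    for j in range(e // 2 + 1):
        total += ((q - e - 2 * j)
                  / (math.factorial(e // 2 - j) * math.factorial(j))
                  * (-2.0) ** j * E((q - 1 - e) / 2.0 - j))
    return prefactor * total


def tube_weights(q):
    """Coefficient multiplying the chi-square tail with q+1-e d.f., for even e."""
    return {e: kappa(q, e) * math.gamma((q + 1 - e) / 2.0)
               / (2.0 ** (1 + e / 2.0) * math.pi ** ((q + 1) / 2.0))
            for e in range(0, q + 1, 2)}


def tail_approx(c, q):
    """Tube approximation of P(max I(h) >= c^2), dropping the (1 + o(1)) factor."""
    return sum(w * chi2_sf(c * c, q + 1 - e) for e, w in tube_weights(q).items())
```

As a sanity check, for $q = 1$ the index reduces to $\xi_1^2 + \xi_2^2$, and `tail_approx(c, 1)` returns $e^{-c^2/2}$, the exact $\chi^2_2$ tail.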

Projection pursuit index by Jones and Sibson
Note that we first let $n \to \infty$ and then evaluate the tail probability as $c \to \infty$.
[Figure: Tail probability of the limiting distribution ($n = \infty$, solid line) and its approximation by the tube method (dotted line). Horizontal axis: $x$; vertical axis: tail probability.]

Projection pursuit index by Jones and Sibson
[Figure: Tail probabilities of finite sample distributions ($n = 300, 1000, 3000, \infty$). Horizontal axis: $x$; vertical axis: tail probability.]

Anderson-Stephens statistic for testing uniformity on sphere
Anderson-Stephens statistic
$H_0$: $x_t$, $t = 1, \ldots, n$, are i.i.d. uniform vectors on $S^{q-1}$.
For $h \in S^{q-1}$ let
\[ S(h) = \frac{1}{n} \sum_{t=1}^{n} (h^\top x_t)^2. \]
Let
\[ S_{\max} = \max_{h \in S^{q-1}} S(h), \qquad S_{\min} = \min_{h \in S^{q-1}} S(h). \]
Anderson-Stephens test: reject $H_0$ if $S_{\max} \ge c$ or $S_{\min} \le c$, for appropriate critical values $c$.
We also propose to use
\[ S_{\mathrm{range}} = S_{\max} - S_{\min}. \]

Anderson-Stephens statistic
$S_{\max}$ and $S_{\min}$ are the largest and smallest eigenvalues $\lambda_1(Q)$ and $\lambda_q(Q)$ of the $q \times q$ matrix
\[ Q = \frac{1}{n} \sum_{t=1}^{n} x_t x_t^\top. \]
Let $A$ be a $q \times q$ symmetric matrix having the multivariate symmetric normal distribution, i.e., $a_{ii} \sim N(0, 1)$, $a_{ij} \sim N(0, 1/2)$ for $i < j$, all mutually independent.
The limiting null distribution of the eigenvalues of $\sqrt{n}(Q - I_q/q)$ is given by the distribution of the eigenvalues of
\[ \sqrt{\frac{2}{q(q+2)}} \Bigl( A - \frac{\mathrm{tr}(A)}{q} I_q \Bigr), \]
so the tube formula works again.
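
The eigenvalue characterization can be illustrated in the smallest case. The sketch below assumes $q = 2$ (data on the unit circle), where the eigenvalues of a symmetric $2 \times 2$ matrix have a closed form, and compares them with a brute-force maximization of $S(h)$ over a grid of directions. The restriction to $q = 2$ and the function names are assumptions made to keep the example free of linear algebra libraries.

```python
import math
import random


def sample_uniform_circle(n, rng):
    """n i.i.d. uniform points on the unit circle S^1."""
    points = []
    for _ in range(n):
        a = rng.uniform(0.0, 2.0 * math.pi)
        points.append((math.cos(a), math.sin(a)))
    return points


def moment_matrix(xs):
    """Entries (q11, q12, q22) of Q = (1/n) * sum of x_t x_t^T."""
    n = len(xs)
    q11 = sum(x * x for x, _ in xs) / n
    q12 = sum(x * y for x, y in xs) / n
    q22 = sum(y * y for _, y in xs) / n
    return q11, q12, q22


def eigenvalues_sym2(q11, q12, q22):
    """Largest and smallest eigenvalue of a symmetric 2 x 2 matrix."""
    center = (q11 + q22) / 2.0
    radius = math.hypot((q11 - q22) / 2.0, q12)
    return center + radius, center - radius


def S(h, xs):
    """S(h) = (1/n) * sum of (h^T x_t)^2."""
    return sum((h[0] * x + h[1] * y) ** 2 for x, y in xs) / len(xs)


def grid_extremes(xs, n_grid=720):
    """Brute-force S_max and S_min over a grid of directions on the half circle."""
    values = []
    for i in range(n_grid):
        a = math.pi * i / n_grid  # h and -h give the same S(h)
        values.append(S((math.cos(a), math.sin(a)), xs))
    return max(values), min(values)
```

Since every $x_t$ has unit length, the trace of $Q$ equals 1, so the two eigenvalues sum to 1 and $S_{\mathrm{range}}$ is determined by $S_{\max}$ alone when $q = 2$.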

Anderson-Stephens statistic
Some further adjustment of constants is needed:
Lemma 3
As $n \to \infty$, the null distributions of both $\sqrt{n}(S_{\max} - 1/q)$ and $-\sqrt{n}(S_{\min} - 1/q)$ converge to the distribution of $\sqrt{2(q-1)/(q^2(q+2))}\, T_1$, where
\[ T_1 = \lambda_1(B) \quad \text{with} \quad B = \sqrt{\frac{q}{q-1}} \Bigl( A - \frac{\mathrm{tr}(A)}{q} I_q \Bigr). \tag{1} \]
The null distribution of $\sqrt{n}(S_{\max} - S_{\min})$ converges to the distribution of $\bigl(2/\sqrt{q(q+2)}\bigr) T_2$, where
\[ T_2 = \frac{1}{\sqrt{2}} \bigl( \lambda_1(A) - \lambda_q(A) \bigr). \tag{2} \]

Anderson-Stephens statistic
Theorem 4
When $q \ge 3$, the asymptotic expansion of the upper tail probability of $T_1 = \lambda_1(B)$ is given by
\[ P(T_1 \ge x) = \sum_{e=0,\ e:\text{even}}^{q-1} w_{q-e} \bar{G}_{q-e}(x^2) (1 + o(1)), \qquad x \to \infty, \tag{3} \]
where
\[ w_{q-e} = \frac{1}{2} \Bigl( \frac{2q}{q-1} \Bigr)^{(q-1)/2} \Bigl( -\frac{q+1}{2q} \Bigr)^{e/2} \frac{\Gamma\bigl(\frac{q+1}{2}\bigr)}{\Gamma\bigl(\frac{q-e+1}{2}\bigr) \bigl(\frac{e}{2}\bigr)!}. \tag{4} \]
When $q = 2$,
\[ P(T_1 \ge x) = \bar{G}_2(x^2), \qquad x \ge 0. \]
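
The weights in (4) and the resulting approximation (3) can be evaluated with a few lines of code. The sketch below uses the formula as read from (3) and (4), drops the $(1 + o(1))$ factor, and computes the chi-square upper tail in closed form for integer degrees of freedom; the function names are hypothetical.

```python
import math


def chi2_sf(y, k):
    """Upper tail probability of the chi-square distribution, integer k >= 1."""
    if k % 2 == 0:
        term, total = 1.0, 1.0
        for j in range(1, k // 2):
            term *= (y / 2.0) / j
            total += term
        return math.exp(-y / 2.0) * total
    total = math.erfc(math.sqrt(y / 2.0))
    term = math.sqrt(2.0 * y / math.pi) * math.exp(-y / 2.0)
    for j in range(1, (k - 1) // 2 + 1):
        total += term
        term *= y / (2 * j + 1)
    return total


def weight(q, e):
    """w_{q-e} from equation (4), for even e with 0 <= e <= q - 1."""
    half_e = e // 2
    return (0.5
            * (2.0 * q / (q - 1)) ** ((q - 1) / 2.0)
            * (-(q + 1) / (2.0 * q)) ** half_e
            * math.gamma((q + 1) / 2.0)
            / (math.gamma((q - e + 1) / 2.0) * math.factorial(half_e)))


def tail_T1(x, q):
    """Tube approximation of P(T_1 >= x) from equation (3)."""
    return sum(weight(q, e) * chi2_sf(x * x, q - e) for e in range(0, q, 2))
```

For $q = 2$ the only term is $e = 0$ with weight 1, so `tail_T1(x, 2)` reproduces the exact result $\bar{G}_2(x^2) = e^{-x^2/2}$ stated in the theorem.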

Anderson-Stephens statistic
[Figure: Tail probabilities of $S_{\max}$ when $q = 3$: finite sample distributions ($n = 10, 100, 1000$), the limiting distribution, and the approximation by the tube method. Horizontal axis: $x$; vertical axis: tail probability.]

Summary
We gave a brief introduction to the tube method.
We applied the method to testing multivariate normality based on the projection pursuit index and to testing uniformity on the unit sphere.
We presented numerical examples to show that the tube formula gives a good approximation to the tail probability of the test statistics.