Tube formula approach to testing multivariate normality and testing uniformity on the sphere

Akimichi Takemura (University of Tokyo) and Satoshi Kuriki (Institute of Statistical Mathematics)
December 11, 2010

References
This talk is based on the following two papers:
Kuriki and Takemura (2008). The tube method for the moment index in projection pursuit. Journal of Statistical Planning and Inference, 138, No. 9, 2749-2762.
Kuriki and Takemura (2004). Tail probabilities of the limiting null distributions of the Anderson-Stephens statistics. Journal of Multivariate Analysis, 89, 261-291.

Contents
1. Tube formula approximation to maximum type test statistics
2. Projection pursuit index and testing multivariate normality
3. Anderson-Stephens statistic for testing uniformity on sphere
4. Summary

Tube formula approximation to maximum type test statistics
Tube formula: some historical background
Jakob Steiner already had Steiner's formula (1840) for the volume of a tube around a convex set.
Minkowski defined mixed volumes.
Hotelling (1939) derived the tube formula for a one-dimensional curve, and H. Weyl immediately generalized it to general dimension.
Hotelling's motivation was a nonlinear regression problem.
The tube formula was revived in statistics around 1990 (Knowles-Siegmund (1989), J. Sun (1991, 1993), and many others).

Euler characteristic method (independent development)
The Euler characteristic heuristic was initiated by R. J. Adler for approximating the distribution of the maximum of a random field (Adler-Hasofer (1976), Adler's book (1981)).
This method has been vigorously developed by Adler and Keith Worsley.
Some important foundational work was done by Jonathan Taylor (2001 thesis).
A standard textbook now is Random Fields and Geometry by Adler and Taylor (2007, Springer).

The two methods are equivalent
Around 2000, Kuriki and I were attending a talk by Keith Worsley at ISM (Institute of Statistical Mathematics, Tokyo, Japan) and were surprised that he was doing the same computations as we were.
Takemura and Kuriki (2002) proved the equivalence of the two methods by using Morse's theorem (for the finite-dimensional case).
The tube method can be understood as a finite-dimensional specialization of the Euler characteristic method.
I should also mention that the abstract tube of Naiman and Wynn is a discrete analog of the tube formula.

Canonical form of tube formula
Let $z = (z_1, \dots, z_n)^\top \sim N_n(0, I_n)$.
Let $M \subset S^{n-1}$ be a $C^2$-submanifold of dimension $d = \dim M$ with piecewise smooth boundary.
Let
$$Z(u) = u^\top z = \sum_{i=1}^{n} u_i z_i, \qquad u = (u_1, \dots, u_n)^\top \in M.$$
Also consider the standardized random field
$$Y(u) = u^\top z / \|z\|, \qquad u \in M, \qquad \|z\| = \sqrt{z^\top z}.$$

Canonical form of tube formula
We want to evaluate the distributions of the maxima, which correspond to maximum type test statistics:
$$T = \max_{u \in M} Z(u), \qquad U = \max_{u \in M} Y(u).$$
The tube method gives an approximation of the tail probabilities $P(T \ge x)$ as $x \to \infty$, and $P(U \ge x)$ as $x \uparrow 1$.

Spherical tube and its volume
Evaluating the distribution reduces to evaluating the volume of a spherical tube around $M$.
Figure: Spherical tube around $M$ (the tube $M_\theta$ of radius $\theta$ on $S^{n-1}$).

Spherical tube and its volume
Let
$$M_\theta = \left\{ v \in S^{n-1} : \min_{u \in M} \cos^{-1}(u^\top v) \le \theta \right\}$$
denote the tube around $M$ with radius $\theta$. Let $\mathrm{Vol}(M_\theta)$ denote the $(n-1)$-dimensional spherical volume of $M_\theta$. By definition
$$P\left( \max_{u \in M} Y(u) \ge \cos\theta \right) = \mathrm{Vol}(M_\theta)/\Omega_n,$$
where
$$\Omega_n = \mathrm{Vol}(S^{n-1}) = \frac{2\pi^{n/2}}{\Gamma(n/2)}.$$
In what follows, $B_{a,b}(\cdot)$ denotes the upper probability of the beta distribution with parameters $(a, b)$.
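The identity between the probability and the volume ratio can be checked in the simplest case. The following is a minimal sketch, assuming $n = 3$ and $M$ equal to a single point, so that $M_\theta$ is a spherical cap of area $2\pi(1 - \cos\theta)$; the function names are illustrative and do not come from the talk.

```python
import math
import random


def sphere_volume(n: int) -> float:
    """Omega_n: the (n-1)-dimensional volume of the unit sphere S^{n-1}."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)


def cap_probability_mc(theta: float, n_draws: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(Y(u) >= cos(theta)) for the single point u = e_1 in S^2."""
    rng = random.Random(seed)
    threshold = math.cos(theta)
    hits = 0
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in z))
        if z[0] / norm >= threshold:  # Y(u) = u'z / ||z|| with u = e_1
            hits += 1
    return hits / n_draws


theta = 0.6
cap_area = 2.0 * math.pi * (1.0 - math.cos(theta))  # Vol(M_theta) for a point in S^2
exact = cap_area / sphere_volume(3)
estimate = cap_probability_mc(theta)
print(f"Vol(M_theta)/Omega_3 = {exact:.4f}, Monte Carlo = {estimate:.4f}")
```

The two printed numbers should agree up to Monte Carlo error, which is of order $10^{-3}$ with the default number of draws.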

Tube formula for the volume of a spherical tube
Tube formula: for $\theta$ smaller than the critical radius,
$$\mathrm{Vol}(M_\theta) = \Omega_n \left\{ w_{d+1} B_{\frac{d+1}{2}, \frac{n-d-1}{2}}(\cos^2\theta) + w_d B_{\frac{d}{2}, \frac{n-d}{2}}(\cos^2\theta) + \cdots + w_1 B_{\frac{1}{2}, \frac{n-1}{2}}(\cos^2\theta) \right\},$$
where $w_1, \dots, w_{d+1}$ are geometric invariants of $M$, which can be evaluated by differential geometric methods. In particular
$$w_{d+1} = \mathrm{Vol}(M)/\Omega_{d+1}, \qquad w_d = \mathrm{Vol}(\partial M)/(2\Omega_d).$$
We omit the explanation of the critical radius in this talk.
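A small worked case may make the weights concrete. The sketch below assumes $n = 3$ and takes $M$ to be a great-circle arc of length $L \le \pi$ in $S^2$ (so $d = 1$). The interior of the arc gives $w_2 = L/\Omega_2 = L/(2\pi)$, and each of the two endpoints contributes half a spherical cap, which gives $w_1 = 1/2$. For these parameters the beta upper probabilities have closed forms, and the result is compared with a Monte Carlo estimate. The arc, the function names, and the parameter values are illustrative choices, not taken from the talk.

```python
import math
import random


def arc_tube_fraction(arc_length: float, theta: float) -> float:
    """Tube formula Vol(M_theta)/Omega_3 for a great-circle arc in S^2, 0 <= theta < pi/2."""
    w2 = arc_length / (2.0 * math.pi)  # Vol(M) / Omega_2
    w1 = 0.5                           # two endpoints, each giving half a cap
    c2 = math.cos(theta) ** 2
    upper_beta_1_half = math.sqrt(1.0 - c2)  # upper prob. of Beta(1, 1/2) at c2
    upper_beta_half_1 = 1.0 - math.sqrt(c2)  # upper prob. of Beta(1/2, 1) at c2
    return w2 * upper_beta_1_half + w1 * upper_beta_half_1


def arc_tube_fraction_mc(arc_length: float, theta: float,
                         n_draws: int = 100_000, seed: int = 1) -> float:
    """Fraction of uniform points on S^2 within angle theta of the arc
    {(cos p, sin p, 0): 0 <= p <= arc_length}."""
    rng = random.Random(seed)
    threshold = math.cos(theta)
    end_x, end_y = math.cos(arc_length), math.sin(arc_length)
    hits = 0
    for _ in range(n_draws):
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        x, y = x / norm, y / norm
        alpha = math.atan2(y, x) % (2.0 * math.pi)
        if alpha <= arc_length:
            best = math.hypot(x, y)                  # maximum attained inside the arc
        else:
            best = max(x, x * end_x + y * end_y)     # maximum attained at an endpoint
        if best >= threshold:
            hits += 1
    return hits / n_draws


arc_length, theta = math.pi / 2.0, 0.5
formula = arc_tube_fraction(arc_length, theta)
simulated = arc_tube_fraction_mc(arc_length, theta)
print(f"tube formula = {formula:.4f}, Monte Carlo = {simulated:.4f}")
```

For this arc the formula reduces to $(L/2\pi)\sin\theta + (1 - \cos\theta)/2$, which is the area of a band of half-width $\theta$ along the arc plus one full cap, divided by $4\pi$.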

Tail probability for T (non-standardized maximum)
For $T = \max_{u \in M} Z(u)$ we need to integrate the tube formula with respect to $\|z\|$. Integrating over $\|z\|$ we have
$$P\left( \max_{u \in M} Z(u) \ge x \right) = w_{d+1} \bar{G}_{d+1}(x^2) + w_d \bar{G}_d(x^2) + \cdots + w_1 \bar{G}_1(x^2) + O\left( \bar{G}_n\left( x^2 (1 + \tan^2\theta_c) \right) \right),$$
where $\bar{G}_a(\cdot)$ is the upper probability of the $\chi^2$ distribution with $a$ degrees of freedom and $\theta_c$ is the critical radius.
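The chi-square expansion can be evaluated with a few lines of code. This is a minimal sketch, assuming integer degrees of freedom and, as an illustrative choice of $M$ not taken from the talk, a great-circle arc of length $L \le \pi$, for which the weights are $w_2 = L/(2\pi)$ and $w_1 = 1/2$ (two endpoints, half a cap each). The helper names are hypothetical.

```python
import math
import random


def chi2_upper(dof: int, x: float) -> float:
    """Upper tail probability of the chi-square distribution (integer dof, x >= 0)."""
    if dof % 2 == 0:
        a, tail = 2, math.exp(-x / 2.0)
    else:
        a, tail = 1, math.erfc(math.sqrt(x / 2.0))
    while a < dof:
        # Recurrence: G_{a+2}(x) = G_a(x) + (x/2)^{a/2} exp(-x/2) / Gamma(a/2 + 1)
        tail += (x / 2.0) ** (a / 2.0) * math.exp(-x / 2.0) / math.gamma(a / 2.0 + 1.0)
        a += 2
    return tail


def tube_tail(weights: dict, x: float) -> float:
    """Sum of w_k * G_k(x^2) over the supplied weights {k: w_k}."""
    return sum(w * chi2_upper(k, x * x) for k, w in weights.items())


def arc_max_tail_mc(arc_length: float, x: float,
                    n_draws: int = 100_000, seed: int = 2) -> float:
    """Monte Carlo estimate of P(T >= x) for the arc {(cos p, sin p, 0, ...)}.
    Only the two coordinates of z in the plane of the arc affect T."""
    rng = random.Random(seed)
    end_x, end_y = math.cos(arc_length), math.sin(arc_length)
    hits = 0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        alpha = math.atan2(z2, z1) % (2.0 * math.pi)
        if alpha <= arc_length:
            t = math.hypot(z1, z2)
        else:
            t = max(z1, z1 * end_x + z2 * end_y)
        if t >= x:
            hits += 1
    return hits / n_draws


arc_length, x = math.pi / 2.0, 2.0
weights = {2: arc_length / (2.0 * math.pi), 1: 0.5}
approx = tube_tail(weights, x)
simulated = arc_max_tail_mc(arc_length, x)
print(f"tube approximation = {approx:.4f}, Monte Carlo = {simulated:.4f}")
```

The recurrence for $\bar{G}_a$ keeps the example free of external dependencies; a statistics library would normally be used for this step.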

Projection pursuit index and testing multivariate normality
Projection pursuit index by Jones and Sibson
$x_t \in R^q$, $t = 1, \dots, n$: observation vectors.
$S^{q-1}$: the unit sphere in $R^q$.
$z_t = h^\top x_t$: projection of $x_t$ onto the direction $h$.
Projection pursuit looks for the direction $h$ such that the projected data $z_1, \dots, z_n$ do not look normally distributed.
$I_n(h)$: projection pursuit index, which measures the non-normality of the projected data.
Maximize $I_n(h)$ over $h \in S^{q-1}$.

Projection pursuit index by Jones and Sibson
The null hypothesis: $H_0: x_t \sim N_q(\mu, \Sigma)$, i.i.d.
$K_{k,n}(h)$: the $k$th sample cumulant of the projected data $z_1, \dots, z_n$.
Skewness and kurtosis:
$B_{1,n}(h) = K_{3,n}(h)/K_{2,n}(h)^{3/2}$: the sample skewness.
$B_{2,n}(h) = K_{4,n}(h)/K_{2,n}(h)^{2}$: the sample kurtosis.
Jones-Sibson index:
$$I_n(h) = \frac{n}{6} B_{1,n}(h)^2 + \frac{n}{24} B_{2,n}(h)^2.$$
What is the asymptotic distribution of $\max_{h \in S^{q-1}} I_n(h)$? This can be solved by the tube formula.
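The index is straightforward to compute for a given direction. The sketch below is one possible reading, assuming the sample cumulants are the plain moment-based versions ($K_2 = m_2$, $K_3 = m_3$, $K_4 = m_4 - 3m_2^2$ with central moments $m_k$); the talk does not say which estimator is used, and unbiased k-statistics would be an equally plausible choice. The grid search over directions is written for $q = 2$ only, and all function names are illustrative.

```python
import math


def jones_sibson_index(data, h):
    """I_n(h) = n/6 * B1^2 + n/24 * B2^2 for data projected onto direction h.
    Assumes moment-based sample cumulants and a nonzero projected variance."""
    n = len(data)
    z = [sum(hi * xi for hi, xi in zip(h, x)) for x in data]
    mean = sum(z) / n
    m2 = sum((v - mean) ** 2 for v in z) / n
    m3 = sum((v - mean) ** 3 for v in z) / n
    m4 = sum((v - mean) ** 4 for v in z) / n
    k2, k3, k4 = m2, m3, m4 - 3.0 * m2 ** 2
    b1 = k3 / k2 ** 1.5   # sample skewness
    b2 = k4 / k2 ** 2     # sample (excess) kurtosis
    return n / 6.0 * b1 ** 2 + n / 24.0 * b2 ** 2


def max_index_2d(data, n_grid: int = 360) -> float:
    """Grid maximization of I_n(h) over directions in the plane (q = 2).
    Half a circle suffices because I_n(h) = I_n(-h)."""
    best = 0.0
    for i in range(n_grid):
        angle = math.pi * i / n_grid
        h = (math.cos(angle), math.sin(angle))
        best = max(best, jones_sibson_index(data, h))
    return best


sample = [(-1.0, 0.5), (1.0, -0.3), (-1.0, -0.2), (1.0, 0.1), (0.0, 2.0), (0.3, -1.1)]
print(jones_sibson_index(sample, (1.0, 0.0)), max_index_2d(sample))
```

A grid search is only practical in low dimension; for larger $q$ a numerical optimizer over $S^{q-1}$ would be needed.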

Projection pursuit index by Jones and Sibson
Theorem 1 (Asymptotic distribution of the random field)
Let $\xi_1 \in R^{q^3}$, $\xi_2 \in R^{q^4}$ be random vectors consisting of independent standard normal random variables. For a unit vector $h \in S^{q-1}$, let
$$Z_1(h) = (h \otimes h \otimes h)^\top \xi_1, \qquad Z_2(h) = (h \otimes h \otimes h \otimes h)^\top \xi_2,$$
where $\otimes$ denotes the Kronecker product. Under the null hypothesis $H_0$ of multivariate normality, as $n \to \infty$, $\max_{h \in S^{q-1}} I_n(h)$ converges in distribution to $\max_{h \in S^{q-1}} I(h)$, where
$$I(h) = Z_1(h)^2 + Z_2(h)^2.$$
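The limiting field in Theorem 1 can be simulated directly from its definition. The following is a minimal sketch for $q = 2$, with vectors stored as plain Python lists; the helper names and the grid over directions are illustrative assumptions rather than anything prescribed by the theorem.

```python
import math
import random


def kron(a, b):
    """Kronecker product of two vectors given as lists."""
    return [x * y for x in a for y in b]


def kron_power(h, k: int):
    """k-fold Kronecker power of h, a vector of length q^k."""
    out = [1.0]
    for _ in range(k):
        out = kron(out, h)
    return out


def limiting_index(h, xi1, xi2) -> float:
    """I(h) = Z_1(h)^2 + Z_2(h)^2 with Z_1 = (h x h x h)' xi1 and Z_2 = (h x h x h x h)' xi2."""
    z1 = sum(c * e for c, e in zip(kron_power(h, 3), xi1))
    z2 = sum(c * e for c, e in zip(kron_power(h, 4), xi2))
    return z1 ** 2 + z2 ** 2


# One draw of max_h I(h) for q = 2, maximizing over a grid of directions.
q = 2
rng = random.Random(3)
xi1 = [rng.gauss(0.0, 1.0) for _ in range(q ** 3)]
xi2 = [rng.gauss(0.0, 1.0) for _ in range(q ** 4)]
grid = [(math.cos(math.pi * i / 180), math.sin(math.pi * i / 180)) for i in range(180)]
print(max(limiting_index(h, xi1, xi2) for h in grid))
```

Because $\|h \otimes h \otimes h\| = \|h\|^3 = 1$ for a unit vector, each of $Z_1(h)$ and $Z_2(h)$ is standard normal for fixed $h$, so $I(h)$ is marginally $\chi^2$ with 2 degrees of freedom.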

Projection pursuit index by Jones and Sibson
Theorem 2 (Asymptotic distribution of Jones-Sibson index)
As $c \to \infty$,
$$P\left( \max_{h \in S^{q-1}} I(h) \ge c^2 \right) = \sum_{e=0,\ e:\mathrm{even}}^{q} \kappa_e \, (-3)^{e/2} \, \frac{\Gamma\left(\frac{q+1-e}{2}\right)}{2^{1+e/2} \pi^{(q+1)/2}} \, \bar{G}_{q+1-e}(c^2) \, (1 + o(1)),$$
where
$$\kappa_e = \Omega_q \frac{(q-1)!}{(q-e)!} \sum_{j=0}^{e/2} \frac{(q-e-2j)}{(e/2-j)! \, j!} (-2)^j E_{(q-1-e)/2-j},$$
$$E_k = \int_{-\pi/2}^{\pi/2} (3\cos^2\theta + 4\sin^2\theta)^k \, d\theta.$$
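As a partial numerical illustration, the sketch below evaluates only the leading ($e = 0$) coefficient of the expansion, that is, the multiplier of $\bar{G}_{q+1}(c^2)$, which reduces to $\Omega_q E_{(q-1)/2} \Gamma((q+1)/2) / (2\pi^{(q+1)/2})$. The integral $E_k$ is approximated with Simpson's rule. The function names and the choice of quadrature are assumptions made for this example, and the higher-order terms are deliberately left out.

```python
import math


def sphere_volume(n: int) -> float:
    """Omega_n = 2 pi^{n/2} / Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)


def e_integral(k: float, n_steps: int = 2000) -> float:
    """Simpson's rule for E_k = int_{-pi/2}^{pi/2} (3 cos^2 t + 4 sin^2 t)^k dt.
    n_steps must be even."""
    a, b = -math.pi / 2.0, math.pi / 2.0
    step = (b - a) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        t = a + i * step
        f = (3.0 * math.cos(t) ** 2 + 4.0 * math.sin(t) ** 2) ** k
        weight = 1 if i in (0, n_steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * step / 3.0


def leading_weight(q: int) -> float:
    """Coefficient of G_{q+1}(c^2): the e = 0 term only."""
    kappa0 = sphere_volume(q) * e_integral((q - 1) / 2.0)
    return kappa0 * math.gamma((q + 1) / 2.0) / (2.0 * math.pi ** ((q + 1) / 2.0))


for q in (1, 2, 3):
    print(q, leading_weight(q))
```

For $q = 1$ the coefficient equals 1, which agrees with the fact that $I(h) = \xi_1^2 + \xi_2^2$ is then exactly $\chi^2$ with 2 degrees of freedom.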

Projection pursuit index by Jones and Sibson
Note that we first let $n \to \infty$ and then take the tail probability as $c \to \infty$.
Figure: Tail probability of the limiting distribution ($n = \infty$, solid line) and its approximation by the tube method (dotted line). Horizontal axis: $x$; vertical axis: tail probability.

Projection pursuit index by Jones and Sibson
Figure: Tail probabilities of finite sample distributions ($n = 300, 1000, 3000, \infty$). Horizontal axis: $x$; vertical axis: tail probability.

Anderson-Stephens statistic for testing uniformity on sphere
Anderson-Stephens statistic
$H_0$: $x_t$, $t = 1, \dots, n$, are i.i.d. uniform vectors on $S^{q-1}$.
For $h \in S^{q-1}$ let
$$S(h) = \frac{1}{n} \sum_{t=1}^{n} (h^\top x_t)^2.$$
Let $S_{\max} = \max_{h \in S^{q-1}} S(h)$ and $S_{\min} = \min_{h \in S^{q-1}} S(h)$.
Anderson-Stephens test: reject $H_0$ if $S_{\max} \ge c$, or if $S_{\min} \le c$, with the critical value $c$ chosen separately for each statistic.
We also propose to use $S_{\mathrm{range}} = S_{\max} - S_{\min}$.

Anderson-Stephens statistic
$S_{\max}$ and $S_{\min}$ are the largest and smallest eigenvalues $\lambda_1(Q)$ and $\lambda_q(Q)$ of the $q \times q$ matrix
$$Q = \frac{1}{n} \sum_{t=1}^{n} x_t x_t^\top.$$
Let $A$ be a $q \times q$ symmetric matrix having the multivariate symmetric normal distribution, i.e., $a_{ii} \sim N(0, 1)$ and $a_{ij} \sim N(0, 1/2)$ for $i < j$, all mutually independent.
The limiting null distribution of the eigenvalues of $\sqrt{n}(Q - I_q/q)$ is given by the distribution of the eigenvalues of
$$\sqrt{\frac{2}{q(q+2)}} \left( A - \frac{\mathrm{tr}(A)}{q} I_q \right),$$
so the tube formula works again.
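The eigenvalue characterization can be illustrated on the circle. This is a minimal sketch for $q = 2$ only, where the eigenvalues of a symmetric $2 \times 2$ matrix have a closed form; the matrix $Q$ is formed as the average of outer products of the observations, so that $S(h) = h^\top Q h$. The function names are illustrative, and for general $q$ a numerical eigenvalue routine would be needed instead.

```python
import math


def s_of_h(points, h) -> float:
    """S(h) = (1/n) * sum_t (h' x_t)^2 for points on the unit circle."""
    return sum((h[0] * x + h[1] * y) ** 2 for x, y in points) / len(points)


def s_max_min(points):
    """Largest and smallest eigenvalues of Q = (1/n) sum_t x_t x_t' for q = 2."""
    n = len(points)
    a = sum(x * x for x, _ in points) / n   # Q[0][0]
    b = sum(x * y for x, y in points) / n   # Q[0][1] = Q[1][0]
    c = sum(y * y for _, y in points) / n   # Q[1][1]
    center = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return center + radius, center - radius


angles = [0.1, 0.4, 1.3, 2.0, 2.2, 3.0, 4.1, 5.5]
points = [(math.cos(t), math.sin(t)) for t in angles]
s_max, s_min = s_max_min(points)

# Brute-force check: maximize and minimize S(h) over a fine grid of directions.
grid = [(math.cos(math.pi * i / 3600), math.sin(math.pi * i / 3600)) for i in range(3600)]
values = [s_of_h(points, h) for h in grid]
print(s_max, max(values), s_min, min(values), s_max - s_min)
```

Since every $x_t$ has unit length, $\mathrm{tr}(Q) = 1$, so for $q = 2$ the two statistics satisfy $S_{\max} + S_{\min} = 1$.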

Anderson-Stephens statistic
We need some further adjustment of constants:
Lemma 3
As $n \to \infty$, the null distributions of both $\sqrt{n}(S_{\max} - 1/q)$ and $-\sqrt{n}(S_{\min} - 1/q)$ converge to the distribution of $\sqrt{2(q-1)/(q^2(q+2))} \, T_1$, where
$$T_1 = \lambda_1(B) \quad \text{with} \quad B = \sqrt{\frac{q}{q-1}} \left( A - \frac{\mathrm{tr}(A)}{q} I_q \right). \qquad (1)$$
The null distribution of $\sqrt{n}(S_{\max} - S_{\min})$ converges to the distribution of $\left( 2/\sqrt{q(q+2)} \right) T_2$, where
$$T_2 = \frac{1}{\sqrt{2}} \left( \lambda_1(A) - \lambda_q(A) \right). \qquad (2)$$

Anderson-Stephens statistic
Theorem 4
When $q \ge 3$, the asymptotic expansion of the upper tail probability of $T_1 = \lambda_1(B)$ is given by
$$P(T_1 \ge x) = \sum_{e=0,\ e:\mathrm{even}}^{q-1} w_{q-e} \, \bar{G}_{q-e}(x^2) \, (1 + o(1)), \qquad x \to \infty, \qquad (3)$$
where
$$w_{q-e} = \frac{1}{2} \left( \frac{2q}{q-1} \right)^{(q-1)/2} \left( -\frac{q+1}{2q} \right)^{e/2} \frac{\Gamma\left(\frac{q+1}{2}\right)}{\Gamma\left(\frac{q-e+1}{2}\right) \left(\frac{e}{2}\right)!}. \qquad (4)$$
When $q = 2$,
$$P(T_1 \ge x) = \bar{G}_2(x^2), \qquad x \ge 0.$$
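The weights in (4) and the resulting tail approximation (3) can be tabulated with a short script. The sketch below reads the base of the $e/2$ power in (4) as $-(q+1)/(2q)$ and drops the $(1 + o(1))$ factor; the chi-square upper probability is computed by a recurrence for integer degrees of freedom so that no external library is needed. All function names are illustrative.

```python
import math


def chi2_upper(dof: int, x: float) -> float:
    """Upper tail probability of the chi-square distribution (integer dof, x >= 0)."""
    if dof % 2 == 0:
        a, tail = 2, math.exp(-x / 2.0)
    else:
        a, tail = 1, math.erfc(math.sqrt(x / 2.0))
    while a < dof:
        tail += (x / 2.0) ** (a / 2.0) * math.exp(-x / 2.0) / math.gamma(a / 2.0 + 1.0)
        a += 2
    return tail


def tube_weights(q: int) -> dict:
    """Weights {q - e: w_{q-e}} for even e with 0 <= e <= q - 1, following (4)."""
    lead = 0.5 * (2.0 * q / (q - 1.0)) ** ((q - 1) / 2.0)
    weights = {}
    for e in range(0, q, 2):
        ratio = (-(q + 1.0) / (2.0 * q)) ** (e // 2)
        weights[q - e] = (lead * ratio * math.gamma((q + 1) / 2.0)
                          / (math.gamma((q - e + 1) / 2.0) * math.factorial(e // 2)))
    return weights


def t1_tail_approx(q: int, x: float) -> float:
    """Tube approximation to P(T_1 >= x), meant for large x."""
    return sum(w * chi2_upper(k, x * x) for k, w in tube_weights(q).items())


print(tube_weights(3))
for x in (2.0, 3.0, 4.0):
    print(x, t1_tail_approx(3, x))
```

For $q = 3$ the weights are $w_3 = 3/2$ and $w_1 = -1$, so the approximation is $\frac{3}{2}\bar{G}_3(x^2) - \bar{G}_1(x^2)$; for $q = 2$ the same function returns the single weight $w_2 = 1$, consistent with the exact statement for $q = 2$.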

Anderson-Stephens statistic
Figure: Tail probabilities of $S_{\max}$ when $q = 3$: the limiting distribution, finite samples with $n = 10, 100, 1000$, and the approximation by the tube method. Horizontal axis: $x$; vertical axis: tail probability.

Summary
We gave a brief introduction to the tube method.
We applied the method to testing multivariate normality based on the projection pursuit index and to testing uniformity on the unit sphere.
We presented numerical examples to show that the tube formula gives a good approximation to the tail probability of the test statistics.