Anderson-Darling Type Goodness-of-fit Statistic Based on a Multifold Integrated Empirical Distribution Function

S. Kuriki (Inst. Stat. Math., Tokyo) and H.-K. Hwang (Academia Sinica)
Bernoulli Society Satellite Meeting to ISI2013, Wed 4 Sept 2013, Univ Tokyo

Outline
I. Anderson-Darling statistic and its extension
II. Limiting null distribution
III. Moment generating function
IV. Statistical power
V. Extension of Watson's statistic
Summary

I. Anderson-Darling statistic and its extension

Goodness-of-fit tests
$X_1, \ldots, X_n$: i.i.d. sequence from cdf $F$.
Goodness-of-fit test ($G$ is a given cdf):
\[ H_0 : F = G \quad \text{vs.} \quad H_1 : F \neq G \]
When $G$ is continuous, we can assume $G(x) = x$ (i.e., $\mathrm{Unif}(0,1)$) WLOG.
Empirical distribution function:
\[ F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(X_i \le x) \]
The test statistic is defined as a measure of discrepancy between $F_n(x)$ and $G(x) = x$.
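As a small illustration of the empirical distribution function defined above, the following sketch builds $F_n$ from a sample. The helper name `make_edf` is hypothetical and not part of the slides.

```python
from bisect import bisect_right


def make_edf(sample):
    """Return F_n, the empirical distribution function of `sample`."""
    xs = sorted(sample)
    n = len(xs)

    def edf(x):
        # bisect_right counts the observations X_i with X_i <= x
        return bisect_right(xs, x) / n

    return edf


F_n = make_edf([0.12, 0.35, 0.35, 0.80])
print(F_n(0.1), F_n(0.35), F_n(0.9))
```

Sorting once and using a binary search keeps each evaluation of $F_n(x)$ at $O(\log n)$.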

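Goodness-of-fit tests (cont)
We focus on two integral-type test statistics.
Anderson-Darling (1952) statistic:
\[ A_n = n \int_0^1 \frac{(F_n(x) - x)^2}{x(1-x)}\, dx \]
Watson (1961) statistic (for testing uniformity on the unit sphere in $\mathbb{R}^2$):
\[ U_n = n \int_0^1 (F_n(x) - x)^2\, dx - n \Bigl\{ \int_0^1 (F_n(x) - x)\, dx \Bigr\}^2 \]
Limiting null distributions: Let $\xi_1, \xi_2, \ldots$ be an i.i.d. sequence from $N(0,1)$. As $n \to \infty$,
\[ A_n \xrightarrow{d} \sum_{k=1}^{\infty} \frac{1}{k(k+1)}\, \xi_k^2, \qquad U_n \xrightarrow{d} \sum_{k=1}^{\infty} \frac{1}{(2\pi k)^2}\, \bigl(\xi_{2k-1}^2 + \xi_{2k}^2\bigr) \]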
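Both integrals can be evaluated exactly from the order statistics. The sketch below uses the standard textbook computing formulas for the Anderson-Darling and Watson statistics; these formulas and the function names are not taken from the slides, and the sample is assumed to lie strictly inside (0, 1).

```python
import math


def anderson_darling(sample):
    """A_n = -n - (1/n) * sum_i (2i-1) [log X_(i) + log(1 - X_(n+1-i))]."""
    xs = sorted(sample)
    n = len(xs)
    total = sum(
        (2 * i - 1) * (math.log(xs[i - 1]) + math.log(1.0 - xs[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


def watson_u2(sample):
    """U_n = W_n^2 - n (mean - 1/2)^2, with W_n^2 the Cramer-von Mises statistic."""
    xs = sorted(sample)
    n = len(xs)
    w2 = 1.0 / (12 * n) + sum(
        (xs[i - 1] - (2 * i - 1) / (2 * n)) ** 2 for i in range(1, n + 1)
    )
    mean = sum(xs) / n
    return w2 - n * (mean - 0.5) ** 2


data = [0.1, 0.4, 0.45, 0.9]
print(anderson_darling(data), watson_u2(data))
```

Watson's statistic does not depend on the choice of origin on the circle, so shifting every observation by the same amount modulo 1 should leave `watson_u2` unchanged up to rounding.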

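A closer look at Anderson-Darling
Anderson-Darling statistic:
\[ A_n = \int_0^1 \frac{B_n(x)^2}{x(1-x)}\, dx, \quad \text{where } B_n(x) = \sqrt{n}\,(F_n(x) - x). \]
Here,
\[ B_n(x) = \sqrt{n} \int_0^1 h^{[0]}(t;x)\, dF_n(t), \qquad h^{[0]}(t;x) = \mathbf{1}(t \le x) - x. \]
[Figure: graph of $h^{[0]}(t;x)$ against $t$, with $x$ marked on the horizontal axis.]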

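An extension to Anderson-Darling
To propose a new class of test statistics, we replace $h^{[0]}(\cdot\,;x)$ with a different type of function $h^{[m]}(\cdot\,;x)$.
Note first that $h^{[0]}(\cdot\,;x)$ is piecewise constant s.t.
\[ \int_0^1 h^{[0]}(t;x)\, dt = 0. \]
Define $h^{[1]}(\cdot\,;x)$ to be continuous and piecewise linear s.t.
\[ \int_0^1 h^{[1]}(t;x)\,(at + b)\, dt = 0, \quad \forall a, b. \]
[Figure: graph of $h^{[1]}(\cdot\,;x)$ against $t$, with $x$ marked on the horizontal axis.]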

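An extension to Anderson-Darling (cont)
General form of $h^{[m]}(t;x)$:
\[ h^{[m]}(t;x) = \frac{1}{m!}(x-t)^m\, \mathbf{1}(t \le x) - \sum_{k=0}^{m} \int_0^x \frac{1}{m!}(x-u)^m L_k(u)\, du \; L_k(t), \]
where $L_k(\cdot)$ is the Legendre polynomial of degree $k$.
[Figure: graph of $h^{[2]}(\cdot\,;x)$ against $t$, with $x$ marked on the horizontal axis.]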
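The general form above is the residual of $(x-t)_+^m/m!$ after projection onto polynomials of degree at most $m$. The sketch below evaluates it numerically, assuming that $L_k$ is normalized to be orthonormal on $[0,1]$, i.e. $L_k(t) = \sqrt{2k+1}\,P_k(2t-1)$; this normalization is inferred from the case $m = 0$ and is not stated on the slide. The names `legendre01`, `simpson` and `h_m` are hypothetical.

```python
import math


def legendre01(k, t):
    """Legendre polynomial of degree k, orthonormal on [0, 1] (assumed normalization)."""
    y = 2.0 * t - 1.0
    if k == 0:
        return 1.0
    p_prev, p = 1.0, y
    for j in range(1, k):
        # Bonnet recurrence: (j+1) P_{j+1} = (2j+1) y P_j - j P_{j-1}
        p_prev, p = p, ((2 * j + 1) * y * p - j * p_prev) / (j + 1)
    return math.sqrt(2 * k + 1) * p


def simpson(f, a, b, panels=200):
    """Composite Simpson rule on [a, b]; `panels` must be even."""
    if a == b:
        return 0.0
    h = (b - a) / panels
    s = f(a) + f(b)
    for i in range(1, panels):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0


def h_m(m, t, x):
    """h^[m](t; x): truncated power function minus its projection on degree <= m."""
    fact = math.factorial(m)
    trunc = (x - t) ** m / fact if t <= x else 0.0
    proj = 0.0
    for k in range(m + 1):
        coef = simpson(lambda u: (x - u) ** m / fact * legendre01(k, u), 0.0, x)
        proj += coef * legendre01(k, t)
    return trunc - proj


print(h_m(0, 0.2, 0.6), h_m(0, 0.8, 0.6))  # 1(t <= x) - x on either side of x
```

For $m = 0$ this reduces to $\mathbf{1}(t \le x) - x$, and for $m = 1$ the function integrates to zero against any $at + b$, matching the defining property on the previous slide.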

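An extension to Anderson-Darling (cont)
We propose an extension of Anderson-Darling:
\[ A_n^{[m]} = \int_0^1 \frac{B_n^{[m]}(x)^2}{\{x(1-x)\}^{m+1}}\, dx, \]
where
\[ B_n^{[m]}(x) = \sqrt{n} \int_0^1 h^{[m]}(t;x)\, dF_n(t) = \sqrt{n} \Bigl\{ \int\!\cdots\!\int_{x > x_m > \cdots > x_1 > 0} F_n(x_1)\, dx_1 \cdots dx_m - \sum_{k=0}^{m} \int\!\cdots\!\int_{x > x_m > \cdots > x_0 > 0} L_k(x_0)\, dx_0 \cdots dx_m \int_0^1 L_k(t)\, dF_n(t) \Bigr\} \]
(the first term is the $m$-fold integral of the empirical distribution function).
$A_n^{[m]}$ is well-defined (the integral exists) whenever $X_i \in (0,1)$.
$A_n^{[0]}$ is the original Anderson-Darling statistic.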

II. Limiting null distribution

Main results: Limiting null distribution
$W(\cdot)$: the Wiener process on $[0,1]$.
$B(x) = W(x) - xW(1)$: Brownian bridge.
Let $B_n^{[m]}(x) = \int_0^1 h^{[m]}(t;x)\, dB_n(t)$. We can prove that as $n \to \infty$,
\[ B_n^{[m]}(\cdot) \xrightarrow{d} B^{[m]}(\cdot) \text{ in } L^2, \quad \text{where } B^{[m]}(x) = \int_0^1 h^{[m]}(t;x)\, dB(t), \]
and hence (by continuous mapping)
\[ A_n^{[m]} = \int_0^1 \frac{B_n^{[m]}(x)^2}{\{x(1-x)\}^{m+1}}\, dx \xrightarrow{d} A^{[m]} := \int_0^1 \frac{B^{[m]}(x)^2}{\{x(1-x)\}^{m+1}}\, dx. \]
We will examine these limiting distributions $B^{[m]}(\cdot)$ and $A^{[m]}$.

Main results: Limiting null distribution (cont)
Theorem (Karhunen-Loève expansion)
\[ \frac{B^{[m]}(x)}{\{x(1-x)\}^{(m+1)/2}} = \sum_{k=m+1}^{\infty} \frac{(k-m-1)!}{(k+m+1)!}\, L_k^{(m+1)}(x)\, \xi_k \]
(uniformly in $x$, with prob. 1), where
\[ \xi_k = \int_0^1 L_k(t)\, dB(t), \quad \text{i.i.d. } N(0,1), \]
and $L_k^{(m+1)}$ is the associated Legendre function.
Corollary (Limiting null distribution of $A_n^{[m]}$)
\[ A^{[m]} = \int_0^1 \frac{B^{[m]}(x)^2}{\{x(1-x)\}^{m+1}}\, dx = \sum_{k=m+1}^{\infty} \frac{(k-m-1)!}{(k+m+1)!}\, \xi_k^2, \qquad \xi_k^2 \sim \chi^2(1) \text{ i.i.d.} \]
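Since each $\xi_k^2$ has mean 1, the corollary implies that the mean of $A^{[m]}$ is the sum of the weights. The sketch below sums the weights numerically and compares the result with $1/\{(2m+1)\,(2m+1)!\}$, a closed form that follows from a standard telescoping identity and is not stated on the slides. The function names are hypothetical.

```python
import math


def weight(m, k):
    """(k-m-1)! / (k+m+1)!, written as 1 / [(k-m)(k-m+1)...(k+m+1)]."""
    return 1.0 / math.prod(range(k - m, k + m + 2))


def mean_limit(m, terms=2000):
    """Truncated sum of the weights, i.e. an approximation of E[A^[m]]."""
    return sum(weight(m, k) for k in range(m + 1, m + 1 + terms))


def mean_closed(m):
    """Telescoping closed form 1 / ((2m+1) * (2m+1)!) (not from the slides)."""
    return 1.0 / ((2 * m + 1) * math.factorial(2 * m + 1))


for m in (0, 1, 2):
    print(m, mean_limit(m), mean_closed(m))
```

For $m = 0$ the truncated sum converges slowly (the tail after $K$ terms is about $1/K$), while for $m \ge 1$ the weights decay fast enough that a few hundred terms suffice.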

III. Moment generating function

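Moment generating function
The moment generating function (Laplace transform) of
\[ A^{[m]} = \sum_{k=1}^{\infty} \frac{1}{\lambda_k}\, \xi_k^2, \qquad \lambda_k = k(k+1)\cdots(k+2m+1), \]
is
\[ E\bigl[e^{sA^{[m]}}\bigr] = \prod_{k=1}^{\infty} \Bigl(1 - \frac{2s}{\lambda_k}\Bigr)^{-1/2}. \]
Theorem. Let $x_j(s)$ ($j = 0, 1, \ldots, 2m+1$) be the solutions of $\lambda_x - 2s = 0$, i.e.,
\[ x(x+1)\cdots(x+2m+1) - 2s = 0. \]
Then
\[ E\bigl[e^{sA^{[m]}}\bigr] = \prod_{j=0}^{2m+1} \sqrt{\frac{\Gamma(1 - x_j(s))}{j!}}. \]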

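Moment generating function ($m = 0$)
When $m = 0$ (Anderson-Darling), $\lambda_k = k(k+1)$.
The equation $x(x+1) - 2s = 0$ has the solutions
\[ x_0(s),\ x_1(s) = \frac{-1 \pm \sqrt{1 + 8s}}{2}. \]
Hence,
\[ E\bigl[e^{sA^{[0]}}\bigr] = \sqrt{\Gamma(1 - x_0(s))\,\Gamma(1 - x_1(s))} = \sqrt{\frac{-2\pi s}{\cos\bigl(\frac{\pi}{2}\sqrt{1 + 8s}\bigr)}} \]
(Anderson and Darling, 1952).
Euler's reflection formula $\Gamma(z)\Gamma(1-z) = \pi / \sin(\pi z)$ is used.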
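The three expressions for the $m = 0$ moment generating function (infinite product, gamma functions, cosine form) can be compared numerically. The sketch below does this for real $s$ in $(-1/8, 1)$, $s \neq 0$, where the roots are real and `math.gamma` applies; the restriction to this range and the function names are choices made for the illustration only.

```python
import math


def mgf_ad_product(s, terms=200000):
    """Truncated infinite product  prod_k (1 - 2s/(k(k+1)))^(-1/2)."""
    log_mgf = 0.0
    for k in range(1, terms + 1):
        log_mgf -= 0.5 * math.log1p(-2.0 * s / (k * (k + 1)))
    return math.exp(log_mgf)


def mgf_ad_gamma(s):
    """sqrt(Gamma(1 - x0) Gamma(1 - x1)) with x0, x1 the roots of x(x+1) = 2s."""
    r = math.sqrt(1.0 + 8.0 * s)
    x0, x1 = (-1.0 + r) / 2.0, (-1.0 - r) / 2.0
    return math.sqrt(math.gamma(1.0 - x0) * math.gamma(1.0 - x1))


def mgf_ad_closed(s):
    """sqrt(-2 pi s / cos((pi/2) sqrt(1 + 8s))), valid here for s != 0."""
    return math.sqrt(-2.0 * math.pi * s / math.cos(0.5 * math.pi * math.sqrt(1.0 + 8.0 * s)))


for s in (-0.1, 0.3):
    print(s, mgf_ad_product(s), mgf_ad_gamma(s), mgf_ad_closed(s))
```

The product form needs many terms because its tail decays only like $s/K$, which is one practical reason a closed form is convenient.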

Moment generating function ($m = 1$)
When $m = 1$, $\lambda_k = k(k+1)(k+2)(k+3)$.
The equation $x(x+1)(x+2)(x+3) - 2s = 0$ has an explicit solution, because by letting $x = y - 3/2$,
\[ \text{LHS} = (y - 3/2)(y - 1/2)(y + 1/2)(y + 3/2) - 2s = \{y^2 - (3/2)^2\}\{y^2 - (1/2)^2\} - 2s \]
is a quadratic equation in $y^2 = (x + 3/2)^2$. As a result,
\[ x_0(s),\ x_1(s),\ x_2(s),\ x_3(s) = \frac{1}{2}\Bigl(-3 \pm \sqrt{5 \pm 4\sqrt{1 + 2s}}\Bigr), \]
\[ E\bigl[e^{sA^{[1]}}\bigr] = \frac{\pi s}{\sqrt{3 \cos\bigl(\frac{\pi}{2}\sqrt{5 - 4\sqrt{1 + 2s}}\bigr) \cos\bigl(\frac{\pi}{2}\sqrt{5 + 4\sqrt{1 + 2s}}\bigr)}}. \]
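The $m = 1$ formulas can be checked in the same way. The sketch below restricts $s$ to $0 < s \le 9/32$, where all four roots are real, so that only real arithmetic is needed; that restriction and the function names are assumptions of the illustration.

```python
import math


def roots_m1(s):
    """The four roots (-3 +/- sqrt(5 +/- 4 sqrt(1 + 2s))) / 2 of x(x+1)(x+2)(x+3) = 2s."""
    inner = math.sqrt(1.0 + 2.0 * s)
    return [
        0.5 * (-3.0 + sign_a * math.sqrt(5.0 + sign_b * 4.0 * inner))
        for sign_a in (1.0, -1.0)
        for sign_b in (1.0, -1.0)
    ]


def mgf_m1_gamma(s):
    """prod_{j=0}^{3} sqrt(Gamma(1 - x_j) / j!)."""
    prod = 1.0
    for j, x in enumerate(roots_m1(s)):
        prod *= math.gamma(1.0 - x) / math.factorial(j)
    return math.sqrt(prod)


def mgf_m1_closed(s):
    """pi s / sqrt(3 cos(.) cos(.)), evaluated for 0 < s <= 9/32."""
    inner = math.sqrt(1.0 + 2.0 * s)
    c_minus = math.cos(0.5 * math.pi * math.sqrt(5.0 - 4.0 * inner))
    c_plus = math.cos(0.5 * math.pi * math.sqrt(5.0 + 4.0 * inner))
    return math.pi * s / math.sqrt(3.0 * c_minus * c_plus)


def mgf_m1_product(s, terms=2000):
    """Truncated product over lambda_k = k(k+1)(k+2)(k+3)."""
    log_mgf = 0.0
    for k in range(1, terms + 1):
        log_mgf -= 0.5 * math.log1p(-2.0 * s / (k * (k + 1) * (k + 2) * (k + 3)))
    return math.exp(log_mgf)


s = 0.2
print(mgf_m1_product(s), mgf_m1_gamma(s), mgf_m1_closed(s))
```

Outside this range the roots become complex, and a complex gamma function (not available in the Python standard library) would be needed.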

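Moment generating function ($m = 2$)
When $m = 2$,
\[ E\bigl[e^{sA^{[2]}}\bigr] = \frac{(\pi s)^{3/2}}{\sqrt{-4320 \cos(\pi\sqrt{\eta_1}) \cosh(\pi\sqrt{\eta_2}) \cosh(\pi\sqrt{\eta_3})}}, \]
where
\[ \eta = \sqrt[3]{27s + 80 + 3\sqrt{81s^2 + 480s - 1728}} \]
and
\[ \eta_1 = \frac{4\eta^2 + 35\eta + 112}{12\eta}, \quad \eta_2 = \frac{4e^{\pi i/3}\eta^2 - 35\eta + 112e^{-\pi i/3}}{12\eta}, \quad \eta_3 = \frac{4e^{-\pi i/3}\eta^2 - 35\eta + 112e^{\pi i/3}}{12\eta}. \]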

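Calculation of upper prob.
The finite representation of the moment generating function is useful in numerical calculation:
\[ P\bigl(A^{[m]} > x\bigr) = \frac{1}{\pi} \sum_{k=1}^{\infty} (-1)^{k-1} \int_{\lambda_{2k-1}/2}^{\lambda_{2k}/2} \frac{1}{s}\, e^{-xs}\, \Bigl| E\bigl[e^{sA^{[m]}}\bigr] \Bigr|\, ds \]
(Smirnov-Slepian technique, see Slepian (1958)). On each interval the moment generating function is evaluated through its closed-form expression, and its modulus is taken.
[Figure: upper probability $P(A^{[m]} > x)$ plotted against $x$ for one fixed value of $m$; vertical axis "Upper Prob." from 0 to 1, horizontal axis $x$ from 0 to 3.]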
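As a rough cross-check of such upper probabilities, the limiting variable can also be simulated directly from its series representation. The sketch below is a plain Monte Carlo approximation, not the Smirnov-Slepian integral above: it truncates the series at `terms` and replaces the discarded tail by its mean. All names and default values are hypothetical.

```python
import math
import random


def eigenvalues(m, terms):
    """lambda_k = k(k+1)...(k+2m+1) for k = 1..terms."""
    return [math.prod(range(k, k + 2 * m + 2)) for k in range(1, terms + 1)]


def upper_prob_mc(m, x, draws=4000, terms=100, seed=0):
    """Monte Carlo estimate of P(A^[m] > x) from the truncated series."""
    rng = random.Random(seed)
    lam = eigenvalues(m, terms)
    # Mean of the discarded tail: full mean 1/((2m+1)(2m+1)!) minus the kept part.
    tail_mean = 1.0 / ((2 * m + 1) * math.factorial(2 * m + 1)) - sum(1.0 / l for l in lam)
    hits = 0
    for _ in range(draws):
        a = tail_mean + sum(rng.gauss(0.0, 1.0) ** 2 / l for l in lam)
        if a > x:
            hits += 1
    return hits / draws


print(upper_prob_mc(0, 1.0), upper_prob_mc(0, 2.5))
```

The Monte Carlo error is of order $1/\sqrt{\text{draws}}$, so this is only suitable for sanity checks; accurate tail probabilities call for the integral representation.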

IV. Statistical power

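Statistical power
We have the sample counterpart of the KL-expansion:
\[ \frac{B_n^{[m]}(x)}{\{x(1-x)\}^{(m+1)/2}} = \sum_{k \ge m+1} \frac{(k-m-1)!}{(k+m+1)!}\, L_k^{(m+1)}(x)\, \hat{\xi}_k, \]
where
\[ \hat{\xi}_k = \int_0^1 L_k(x)\, dB_n(x) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} L_k(X_i), \quad k \ge 1. \]
The (extended) Anderson-Darling statistics are also written in terms of the $\hat{\xi}_k$'s as
\[ A_n^{[0]} = \sum_{k \ge 1} \frac{1}{k(k+1)}\, \hat{\xi}_k^2 = \frac{1}{2}\hat{\xi}_1^2 + \frac{1}{6}\hat{\xi}_2^2 + \frac{1}{12}\hat{\xi}_3^2 + \cdots \]
\[ A_n^{[1]} = \sum_{k \ge 2} \frac{1}{(k-1)k(k+1)(k+2)}\, \hat{\xi}_k^2 = \frac{1}{24}\hat{\xi}_2^2 + \frac{1}{120}\hat{\xi}_3^2 + \cdots \]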
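The component form gives a direct, if slowly converging, way to compute the extended statistics from data. The sketch below truncates the series at `kmax` and assumes the orthonormal normalization $L_k(t) = \sqrt{2k+1}\,P_k(2t-1)$ on $[0,1]$, which is inferred rather than stated on the slides. The function names are hypothetical.

```python
import math


def legendre01_upto(kmax, t):
    """Values L_0(t), ..., L_kmax(t) of the orthonormal Legendre polynomials on [0, 1]."""
    y = 2.0 * t - 1.0
    p = [1.0, y]
    for j in range(1, kmax):
        p.append(((2 * j + 1) * y * p[j] - j * p[j - 1]) / (j + 1))
    return [math.sqrt(2 * k + 1) * p[k] for k in range(kmax + 1)]


def xi_hat(sample, kmax):
    """Components xi_hat_k = n^(-1/2) * sum_i L_k(X_i), for k = 0..kmax."""
    n = len(sample)
    sums = [0.0] * (kmax + 1)
    for x in sample:
        for k, value in enumerate(legendre01_upto(kmax, x)):
            sums[k] += value
    return [s / math.sqrt(n) for s in sums]


def extended_ad(sample, m, kmax=2000):
    """Truncated series for A_n^[m]: sum_k (k-m-1)!/(k+m+1)! * xi_hat_k^2."""
    xi = xi_hat(sample, kmax)
    return sum(
        xi[k] ** 2 / math.prod(range(k - m, k + m + 2)) for k in range(m + 1, kmax + 1)
    )


data = [0.15, 0.3, 0.5, 0.62, 0.9]
print(extended_ad(data, 0), extended_ad(data, 1))
```

For $m = 0$ the truncated sum approaches the usual Anderson-Darling value from below with an error of order $1/k_{\max}$; for $m \ge 1$ the weights decay much faster, so far fewer terms are needed.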

Statistical power (cont)
First two components:
\[ \hat{\xi}_1 = \sqrt{12n}\; m_1, \qquad \hat{\xi}_2 = 6\sqrt{5n}\,\bigl(m_2 - 1/12\bigr), \]
where $m_k = \frac{1}{n}\sum_{i=1}^{n} (X_i - 1/2)^k$ (the sample $k$th moment around $1/2$).
$A_n^{[0]} = \hat{\xi}_1^2/2 + \cdots$ has much power for the mean-shift alternative, and
$A_n^{[1]} = \hat{\xi}_2^2/24 + \cdots$ has much power for the dispersion-change alternative.
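The moment expressions for the first two components can be checked against the definition $\hat{\xi}_k = n^{-1/2}\sum_i L_k(X_i)$, using $L_1(x) = \sqrt{3}(2x-1)$ and $L_2(x) = \sqrt{5}(6x^2 - 6x + 1)$ (the orthonormal normalization on $[0,1]$, assumed here). The function names are hypothetical.

```python
import math


def components_from_moments(sample):
    """xi_hat_1 and xi_hat_2 via the sample moments around 1/2."""
    n = len(sample)
    m1 = sum(x - 0.5 for x in sample) / n
    m2 = sum((x - 0.5) ** 2 for x in sample) / n
    return math.sqrt(12 * n) * m1, 6 * math.sqrt(5 * n) * (m2 - 1.0 / 12)


def components_from_legendre(sample):
    """xi_hat_1 and xi_hat_2 via n^(-1/2) * sum_i L_k(X_i)."""
    n = len(sample)
    l1 = sum(math.sqrt(3) * (2 * x - 1) for x in sample)
    l2 = sum(math.sqrt(5) * (6 * x * x - 6 * x + 1) for x in sample)
    return l1 / math.sqrt(n), l2 / math.sqrt(n)


data = [0.15, 0.3, 0.5, 0.62, 0.9]
print(components_from_moments(data))
print(components_from_legendre(data))
```

The first component is a scaled sample mean and the second a scaled sample variance around $1/2$, which is why the leading terms of $A_n^{[0]}$ and $A_n^{[1]}$ respond to location and dispersion changes respectively.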

V. Extension of Watson's statistic

Watson's statistic and its extension
Similar extensions are possible for Watson's statistic. Let
\[ U_n^{[m]} = \int_0^1 C_n^{[m]}(x)^2\, dx, \qquad C_n^{[m]}(x) = \sqrt{n} \int_0^1 h^{[m]}(t;x)\, dF_n(t), \]
where, for Watson's statistic,
\[ h^{[m]}(t;x) = \frac{(t-x)^m}{m!}\, \mathbf{1}(t \le x) + \frac{1}{(m+1)!}\, b_{m+1}(t - x). \]
$b_m(y)$ is the Bernoulli polynomial, which satisfies
\[ b_m(y+1) = b_m(y) + m y^{m-1}. \]
$U_n^{[0]}$ is the original Watson statistic.
$U_n^{[1]}$ is proposed by Henze and Nikitin (2002).

Watson's statistic and its extension (cont)
[Figures: graphs of $h^{[0]}(\cdot\,;x)$, $h^{[1]}(\cdot\,;x)$ and $h^{[2]}(\cdot\,;x)$ against $t$, with $x$ marked on the horizontal axis.]

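Limiting null distribution
Let $C_n^{[m]}(\cdot) \xrightarrow{d} C^{[m]}(\cdot)$ and $U_n^{[m]} \xrightarrow{d} U^{[m]}$.
KL-expansion of $C^{[m]}(x)$:
\[ C^{[m]}(x) = \sum_{k=1}^{\infty} \frac{\sqrt{2}}{(2k\pi)^{m+1}} \Bigl\{ l_{2k-1}^{[m]}(x)\, \xi_{2k-1} + l_{2k}^{[m]}(x)\, \xi_{2k} \Bigr\}, \]
where
\[ l_{2k-1}^{[m]}(x) = \sin\Bigl(2k\pi x - \frac{m+1}{2}\pi\Bigr), \qquad l_{2k}^{[m]}(x) = \cos\Bigl(2k\pi x - \frac{m+1}{2}\pi\Bigr), \]
\[ \xi_{2k-1} = \sqrt{2} \int_0^1 \sin(2k\pi x)\, dB(x), \qquad \xi_{2k} = \sqrt{2} \int_0^1 \cos(2k\pi x)\, dB(x). \]
Consequently,
\[ U^{[m]} = \sum_{k=1}^{\infty} \frac{1}{(2k\pi)^{2(m+1)}} \bigl\{ \xi_{2k-1}^2 + \xi_{2k}^2 \bigr\}. \]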
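By analogy with the Anderson-Darling case, a sample version can be sketched by replacing $B$ with $B_n$ in the Fourier components, which gives $\hat{\xi}_{2k-1} = \sqrt{2/n}\sum_i \sin(2k\pi X_i)$ and $\hat{\xi}_{2k} = \sqrt{2/n}\sum_i \cos(2k\pi X_i)$. This sample form is an assumption of the illustration (the slides state only the limiting expansion), and the function name is hypothetical.

```python
import math


def watson_components(sample, m, kmax=5000):
    """Truncated series sum_k (2 k pi)^(-2(m+1)) (xi_hat_{2k-1}^2 + xi_hat_{2k}^2)."""
    n = len(sample)
    total = 0.0
    for k in range(1, kmax + 1):
        w = 2.0 * math.pi * k
        s = sum(math.sin(w * x) for x in sample)
        c = sum(math.cos(w * x) for x in sample)
        # (2/n)(s^2 + c^2) equals xi_hat_{2k-1}^2 + xi_hat_{2k}^2
        total += (2.0 / n) * (s * s + c * c) / w ** (2 * (m + 1))
    return total


data = [0.1, 0.4, 0.45, 0.9, 0.62]
print(watson_components(data, 0), watson_components(data, 1))
```

For $m = 0$ the result should agree with the usual computing formula for Watson's statistic up to the truncation error, and each term depends on the data only through $s^2 + c^2$, which makes the invariance under rotation of the circle visible.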

Summary
We proposed a class of extended Anderson-Darling statistics $A_n^{[m]}$ ($m \ge 1$) based on the $m$-fold integrated empirical distribution function.
The limiting null distribution $A^{[m]}$ is explicitly derived as a weighted infinite sum of chi-square random variables.
We provided the moment generating function of $A^{[m]}$ without using an infinite product.
The same type of extension is possible for Watson's statistic $U_n$.
Acknowledgment: The authors thank Y. Nishiyama of ISM.

References
Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes. Ann. Math. Statist., 23 (2), 193-212.
Henze, N. and Nikitin, Ya. Yu. (2002). Watson-type goodness-of-fit tests based on the integrated empirical process. Math. Methods Statist., 11 (2), 183-202.
Slepian, D. (1958). Fluctuations of random noise power. Bell Syst. Tech. J., 37, 163-184.
Watson, G. S. (1961). Goodness-of-fit tests on a circle. Biometrika, 48 (1,2), 109-114.