Poisson Approximation for Two Scan Statistics with Rates of Convergence


Xiao Fang (joint work with David Siegmund)
National University of Singapore
May 28, 2015

Outline
- The first scan statistic
- The second scan statistic
- Other scan statistics

A statistical testing problem

Let $X_1, \dots, X_n$ be an independent sequence of random variables. We want to test the hypothesis
$$H_0: X_1, \dots, X_n \sim F_{\theta_0}(\cdot)$$
against the alternative
$$H_1: \text{for some } i < j,\quad X_{i+1}, \dots, X_j \sim F_{\theta_1}(\cdot),\quad X_1, \dots, X_i, X_{j+1}, \dots, X_n \sim F_{\theta_0}(\cdot).$$
Here $i$ and $j$ are called change-points; they are not specified in the alternative hypothesis. $\theta_0$ may be given, or may need to be estimated. $\theta_1$ may be given, or may be a nuisance parameter.

The first scan statistic

If $j - i = t$ is given and $F_{\theta_0}(\cdot)$ and $F_{\theta_1}(\cdot)$ have different mean values, a natural statistic is
$$M_{n;t} = \max_{1 \le i \le n-t+1} T_i, \qquad T_i = X_i + \dots + X_{i+t-1}.$$
We are interested in its p-value: assuming $X_1, \dots, X_n \sim F_{\theta_0}(\cdot)$,
$$P(M_{n;t} \ge b) = P\Big(\max_{1 \le i \le n-t+1} T_i \ge b\Big) = \,?$$
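As a concrete reference point, this p-value can always be estimated by brute-force simulation. A minimal Monte Carlo sketch for the standard normal case (the parameter values are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

def scan_statistic(x, t):
    """M_{n;t}: maximum sum over all windows of length t, via cumulative sums."""
    s = np.concatenate(([0.0], np.cumsum(x)))
    return np.max(s[t:] - s[:-t])

def p_value_mc(n, t, b, n_sims=10_000):
    """Monte Carlo estimate of P(M_{n;t} >= b) under H0: X_i ~ N(0, 1)."""
    hits = sum(scan_statistic(rng.standard_normal(n), t) >= b
               for _ in range(n_sims))
    return hits / n_sims

print(p_value_mc(n=1000, t=50, b=0.5 * 50))  # compare with Example 1 below
```

Such simulation becomes expensive for small p-values, which is exactly why accurate analytic approximations with explicit error bounds are useful.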

Known results

Let $Y_i = I(T_i \ge b)$. Then
$$\Big\{\max_{1 \le i \le n-t+1} T_i \ge b\Big\} = \Big\{\sum_{i=1}^{n-t+1} Y_i \ge 1\Big\}.$$
Dembo and Karlin (1992) proved that if $t$ is fixed and $b, n \to \infty$, plus mild conditions on $F_{\theta_0}(\cdot)$, then
$$P(M_{n;t} \ge b) = P\Big(\sum_{i=1}^{n-t+1} Y_i \ge 1\Big) \to 1 - e^{-\lambda},$$
where $\lambda = (n-t+1)E(Y_1)$. The mild conditions on $F_{\theta_0}(\cdot)$ ensure that $P(Y_{i+1} = 1 \mid Y_i = 1) \to 0$, i.e., that exceedances do not clump.

$t \to \infty$: If $X_i \sim \text{Bernoulli}(p)$ and $b$ is an integer, Arratia, Gordon and Waterman (1990) proved that
$$\Big| P(M_{n;t} \ge b) - (1 - e^{-\lambda}) \Big| \le C\Big(e^{-ct} + \frac{t}{n}\Big)(\lambda \wedge 1) \qquad (1)$$
where $\lambda = (n-t+1)\, P(T_1 = b)\big(\tfrac{b}{t} - p\big)$.

Haiman (2007) derived more accurate approximations using the distribution function of $Z_k := \max\{T_1, \dots, T_{kt+1}\}$ for $k = 1$ and $2$. However, the distribution functions of $Z_k$ for $k = 1$ and $2$ are known only for Bernoulli and Poisson random variables. Our objective is to extend (1) to other random variables.

Preparation for the main result

Let $\mu_0 = E(X_1)$. We assume $b = at$ where $a > \mu_0$, so that
$$P\Big(\max_{1 \le i \le n-t+1} T_i \ge b\Big) = P\Big(\max_{1 \le i \le n-t+1} \frac{X_i + \dots + X_{i+t-1}}{t} \ge a\Big).$$
We assume the distribution of $X_1$ can be imbedded in an exponential family of distributions
$$dF_\theta(x) = e^{\theta x - \Psi(\theta)}\, dF(x), \qquad \theta \in \Theta. \qquad (2)$$
It is known that $F_\theta$ has mean $\Psi'(\theta)$ and variance $\Psi''(\theta)$. Assume $\theta_0 = 0$, i.e., $X_1 \sim F$, and that there exists $\theta_a \in \Theta^o$ such that $\Psi'(\theta_a) = a$.

Example: $X_1 \sim N(0, 1)$, $\Psi(\theta) = \theta^2/2$, $\theta_a = a$, $F_{\theta_a} = N(a, 1)$.
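When $\Psi'(\theta_a) = a$ has no convenient closed form, the tilting parameter can be found numerically. A small sketch for the Bernoulli($p$) case, where $\Psi(\theta) = \log(1 - p + pe^\theta)$ and the root is also available in closed form as a check (the values of $p$ and $a$ are illustrative):

```python
import numpy as np
from scipy.optimize import brentq

def theta_a_bernoulli(p, a):
    """Solve Psi'(theta) = a for Bernoulli(p), Psi(theta) = log(1 - p + p e^theta)."""
    psi_prime = lambda th: p * np.exp(th) / (1 - p + p * np.exp(th))
    return brentq(lambda th: psi_prime(th) - a, -50.0, 50.0)

p, a = 0.1, 0.4
# closed form: exp(theta_a) = a (1 - p) / (p (1 - a))
print(theta_a_bernoulli(p, a), np.log(a * (1 - p) / (p * (1 - a))))
```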

Assumption (2) is used in two places:
1. To obtain an accurate approximation to the marginal probability $P(T_1 \ge at)$ by a change of measure.
2. The local limit theorem of Diaconis and Freedman (1988):
$$d_{TV}\big(\mathcal{L}(X_1, \dots, X_m \mid T_1 = at),\ \mathcal{L}(X_1^a, \dots, X_m^a)\big) \le \frac{Cm}{t},$$
where $X_1^a, \dots, X_m^a$ are i.i.d. and $X_1^a \sim F_{\theta_a}$.

Let $D_k = \sum_{i=1}^k (X_i^a - X_i)$ and $\sigma_a^2 = \Psi''(\theta_a)$.

Theorem

Under assumption (2), for some constant $C$ depending only on the exponential family (2), $\mu_0$, and $a$, we have
$$\Big| P(M_{n;t} \ge at) - (1 - e^{-\lambda}) \Big| \le C\Big(\frac{(\log t)^2}{t} + \frac{\log t \vee \log(n-t)}{n-t}\Big)(\lambda \wedge 1),$$
where, if $X_1$ is nonlattice (plus mild conditions),
$$\lambda = (n-t+1)\, \frac{e^{-[a\theta_a - \Psi(\theta_a)]t}}{\theta_a \sigma_a (2\pi t)^{1/2}}\, \exp\Big[-\sum_{k=1}^\infty \frac{1}{k} E\big(e^{-\theta_a D_k^+}\big)\Big],$$
and if $X_1$ is integer-valued with span 1,
$$\lambda = (n-t+1)\, \frac{e^{-[a\theta_a - \Psi(\theta_a)]t}\, e^{-\theta_a(\lceil at \rceil - at)}}{(1 - e^{-\theta_a})\, \sigma_a (2\pi t)^{1/2}}\, \exp\Big[-\sum_{k=1}^\infty \frac{1}{k} E\big(e^{-\theta_a D_k^+}\big)\Big].$$

Remarks:
- We do not have an explicit expression for the constant $C$.
- The relative error $\to 0$ if $t, n - t \to \infty$.
- Let $g(x) = E e^{ixD_1}$ and $\xi(x) = \log\{1/[1 - g(x)]\}$. Woodroofe (1979) proved that, for the nonlattice case,
$$\sum_{k=1}^\infty \frac{1}{k} E\big(e^{-\theta_a D_k^+}\big) = -\log[(a - \mu_0)\theta_a] - \frac{1}{\pi}\int_0^\infty \frac{\theta_a^2\,\big[\mathcal{I}\xi(x) - \frac{\pi}{2}\big]}{x(\theta_a^2 + x^2)}\, dx + \frac{1}{\pi}\int_0^\infty \frac{\theta_a\,\big\{\mathcal{R}\xi(x) + \log[(a - \mu_0)x]\big\}}{\theta_a^2 + x^2}\, dx,$$
where $\mathcal{R}$ and $\mathcal{I}$ denote real and imaginary parts. Tu and Siegmund (1999) proved that, for the arithmetic case,
$$\sum_{k=1}^\infty \frac{1}{k} E\big(e^{-\theta_a D_k^+}\big) = -\log(a - \mu_0) + \frac{1}{2\pi}\int_0^{2\pi} \Big\{ \frac{\xi(x)\, e^{-\theta_a - ix}}{1 - e^{-\theta_a - ix}} + \frac{\xi(x) + \log[(a - \mu_0)(1 - e^{ix})]}{1 - e^{ix}} \Big\}\, dx.$$

Example 1: Normal distribution.

n     t    a    p_1     p_2
1000  50   0.2  0.9315  0.9594
1000  50   0.4  0.2429  0.2624
1000  50   0.5  0.0331  0.0334
2000  50   0.5  0.0668  0.0672
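The theorem's $\lambda$ is explicitly computable in the normal case: $\theta_a = a$, $\sigma_a = 1$, $a\theta_a - \Psi(\theta_a) = a^2/2$, and $D_k \sim N(ka, 2k)$, for which a direct calculation gives $E\, e^{-\theta_a D_k^+} = 2\Phi(-a\sqrt{k/2})$ (this closed form is our own side computation, not from the slides). A sketch that evaluates $1 - e^{-\lambda}$ at the table's parameter settings for comparison:

```python
import numpy as np
from scipy.stats import norm

def lam_normal(n, t, a, kmax=20_000):
    """lambda from the theorem for X_i ~ N(0,1): theta_a = a, sigma_a = 1,
    and E exp(-theta_a D_k^+) = 2 Phi(-a sqrt(k/2)) since D_k ~ N(ka, 2k)."""
    k = np.arange(1, kmax + 1)
    series = np.sum(2.0 * norm.cdf(-a * np.sqrt(k / 2.0)) / k)
    return ((n - t + 1) * np.exp(-a**2 * t / 2.0)
            / (a * np.sqrt(2.0 * np.pi * t)) * np.exp(-series))

for n, t, a in [(1000, 50, 0.2), (1000, 50, 0.4), (1000, 50, 0.5), (2000, 50, 0.5)]:
    print(n, t, a, 1.0 - np.exp(-lam_normal(n, t, a)))
```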

Example 2: Bernoulli distribution.

n      t    μ_0  a      p_1       p_2
7680   30   0.1  11/30  0.14097   0.14021
7680   30   0.1  0.4    0.029614  0.029387
15360  30   0.1  0.4    0.058458  0.058003

Sketch of proof

Let $m = C(\log t \vee \log(n-t))$. Let
$$Y_i = I\big(T_i \ge at;\ T_{i+1} < T_i, \dots, T_{i+m} < T_i;\ T_{i-1} < T_i, \dots, T_{i-m} < T_i\big)$$
and
$$W = \sum_{i=1}^{n-t+1} Y_i, \qquad \lambda_1 = EW = (n-t+1)\, E Y_1.$$
Then $P(M_{n;t} \ge at) \approx P(W \ge 1)$, and from the Poisson approximation theorem of Arratia, Goldstein and Gordon (1990) we have
$$\big| P(W \ge 1) - (1 - e^{-\lambda_1}) \big| \le C\Big(\frac{1}{t} + \frac{1}{n-t}\Big)(\lambda \wedge 1).$$

Approximating $\lambda_1$ by $\lambda$:
$$E Y_1 = P\big(T_1 \ge at;\ T_2 < T_1, \dots, T_{1+m} < T_1;\ T_0 < T_1, \dots, T_{1-m} < T_1\big) \approx P(T_1 \ge at)\, P^2\big(T_1 - T_2 > 0, \dots, T_1 - T_{1+m} > 0 \,\big|\, T_1 \ge at\big).$$
Note that $T_1 - T_2 = X_1 - X_{t+1}$ and that, given $T_1 \ge at$, approximately $X_1 \sim F_{\theta_a}$ while $X_{t+1} \sim F$. Thus $\{T_1 - T_2 > 0\} \approx \{D_1 > 0\}$, where $D_1 = X_1^a - X_1$. Similarly, $\{T_1 - T_{k+1} > 0\} \approx \{D_k > 0\}$ with $D_k = \sum_{i=1}^k (X_i^a - X_i)$. Therefore,
$$E Y_1 \approx P(T_1 \ge at)\, P^2(D_k > 0,\ k = 1, 2, \dots).$$
Recall that
$$\lambda = (n-t+1)\, \frac{e^{-[a\theta_a - \Psi(\theta_a)]t}}{\theta_a \sigma_a (2\pi t)^{1/2}}\, \exp\Big[-\sum_{k=1}^\infty \frac{1}{k} E\big(e^{-\theta_a D_k^+}\big)\Big].$$

Corollary

Let $X_1, \dots, X_n$ be i.i.d. random variables with distribution function $F$ that can be imbedded in an exponential family, as in (2). Let $EX_1 = \mu_0$. Assume $X_1$ is integer-valued with span 1. Suppose $a = \sup\{x : p_x := P(X_1 = x) > 0\}$ is finite, and let $b = at$. Then, with constants $C$ and $c$ depending only on $p_a$,
$$\big| P(M_{n;t} \ge b) - (1 - e^{-\lambda}) \big| \le C(\lambda \wedge 1)\, e^{-ct},$$
where $\lambda = (n-t)\, p_a^t (1 - p_a) + p_a^t$.
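Here a window can reach $b = at$ only if all $t$ observations equal the maximal value $a$, so $\lambda$ essentially counts clumps of $t$ consecutive maximal values. A quick sanity check of the corollary's $\lambda$ against direct simulation for Bernoulli data, where $a = 1$ and $p_a = p$ (parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def corollary_approx(n, t, p_a):
    """1 - exp(-lambda) with lambda = (n - t) p_a^t (1 - p_a) + p_a^t."""
    lam = (n - t) * p_a**t * (1 - p_a) + p_a**t
    return 1.0 - np.exp(-lam)

def mc_run_probability(n, t, p, n_sims=20_000):
    """Monte Carlo: P(some t consecutive ones) among n Bernoulli(p) trials."""
    hits = 0
    for _ in range(n_sims):
        x = rng.random(n) < p
        s = np.concatenate(([0], np.cumsum(x)))
        hits += np.max(s[t:] - s[:-t]) == t
    return hits / n_sims

print(corollary_approx(500, 5, 0.3), mc_run_probability(500, 5, 0.3))
```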

The second scan statistic

Recall that we want to test $H_0: X_1, \dots, X_n \sim F_{\theta_0}(\cdot)$ against the alternative
$$H_1: \text{for some } i < j,\quad X_{i+1}, \dots, X_j \sim F_{\theta_1}(\cdot),\quad X_1, \dots, X_i, X_{j+1}, \dots, X_n \sim F_{\theta_0}(\cdot).$$
Now assume $j - i$ is not given, and $F_{\theta_0}$ and $F_{\theta_1}$ are from the same exponential family of distributions
$$dF_\theta(x) = e^{\theta x - \Psi(\theta)}\, dF(x), \qquad \theta \in \Theta.$$
Then the log likelihood ratio statistic is
$$\max_{0 \le i < j \le n} (\theta_1 - \theta_0) \sum_{k=i+1}^{j} \Big(X_k - \frac{\Psi(\theta_1) - \Psi(\theta_0)}{\theta_1 - \theta_0}\Big).$$

It reduces to the following problem: Let $X_1, \dots, X_n$ be independent, identically distributed random variables with $EX_1 = \mu_0 < 0$. Let $S_0 = 0$ and $S_i = \sum_{j=1}^i X_j$ for $1 \le i \le n$. We are interested in the distribution of
$$M_n := \max_{0 \le i < j \le n} (S_j - S_i).$$
Iglehart (1972) observed that $M_n$ can be interpreted as the maximum waiting time of the first $n$ customers in a single-server queue. Karlin, Dembo and Kawabata (1990) discussed genomic applications.
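For simulation purposes, $M_n$ does not require an $O(n^2)$ double maximum: the best $i$ for a given $j$ is the running minimum of the partial sums, so one pass suffices. A minimal sketch (the drift value is illustrative):

```python
import numpy as np

def max_segment_sum(x):
    """M_n = max_{0 <= i < j <= n} (S_j - S_i) in O(n):
    for each j, subtract the running minimum of S_0, ..., S_{j-1}."""
    s = np.concatenate(([0.0], np.cumsum(x)))
    running_min = np.minimum.accumulate(s)[:-1]   # min over S_0, ..., S_{j-1}
    return np.max(s[1:] - running_min)

rng = np.random.default_rng(2)
print(max_segment_sum(rng.standard_normal(10_000) - 0.5))  # mu_0 = -0.5
```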

The limiting distribution was derived by Iglehart (1972): Assume the distribution of $X_1$ can be imbedded in an exponential family of distributions
$$dF_\theta(x) = e^{\theta x - \Psi(\theta)}\, dF(x), \qquad \theta \in \Theta.$$
Assume $EX_1 = \Psi'(0) = \mu_0 < 0$ and that there exists a positive $\theta_1 \in \Theta$ such that $\Psi'(\theta_1) = \mu_1$ and $\Psi(\theta_1) = 0$. When $X_1$ is nonlattice, we have
$$\lim_{n \to \infty} P\Big(M_n \ge \frac{\log n}{\theta_1} + x\Big) = 1 - \exp\big(-K e^{-\theta_1 x}\big)$$
for a positive constant $K$.

Theorem

Let $h(b) > 0$ be any function such that $h(b) \to \infty$ and $h(b) = O(b^{1/2})$ as $b \to \infty$. Suppose $n - b/\mu_1 > b^{1/2} h(b)$. We have
$$\big| P(M_n \ge b) - (1 - e^{-\lambda}) \big| \le C(\lambda \wedge 1)\Big\{ \big(1 + b/h^2(b)\big)\, e^{-c h^2(b)} + \frac{b^{1/2} h(b)}{n - b/\mu_1} \Big\},$$
where, if $X_1$ is nonlattice (plus mild conditions),
$$\lambda = \Big(n - \frac{b}{\mu_1}\Big)\, \theta_1 \mu_1\, e^{-\theta_1 b}\, \exp\Big(-2\sum_{k=1}^\infty \frac{1}{k} E_{\theta_1} e^{-\theta_1 S_k^+}\Big),$$
and if $X_1$ is integer-valued with span 1 and $b$ is an integer,
$$\lambda = \Big(n - \frac{b}{\mu_1}\Big)\, (1 - e^{-\theta_1})\, \mu_1\, e^{-\theta_1 b}\, \exp\Big(-2\sum_{k=1}^\infty \frac{1}{k} E_{\theta_1} e^{-\theta_1 S_k^+}\Big).$$

Remarks:
- By choosing $h(b) = b^{1/2}$, we get
$$\big| P(M_n \ge b) - (1 - e^{-\lambda}) \big| \le C\lambda\Big\{ e^{-cb} + \frac{b}{n} \Big\}.$$
- By choosing $h(b) = C(\log b)^{1/2}$ with large enough $C$, we can see that the relative error in the Poisson approximation goes to zero under the conditions
$$b \to \infty, \qquad C(b \log b)^{1/2} \le n - b/\mu_1 = O(e^{\theta_1 b}),$$
where $n - b/\mu_1 = O(e^{\theta_1 b})$ ensures that $\lambda$ is bounded.
- For the smaller range (in which case $\lambda \to 0$)
$$b \to \infty, \qquad \delta b \le n - b/\mu_1 = o\big(e^{\frac{1}{2}\theta_1 b}\big) \text{ for some } \delta > 0,$$
Siegmund (1988) obtained more accurate estimates by a technique different from ours.

Let $G(z) = \sum_{k=0}^\infty p_k z^k + \sum_{k=1}^\infty q_k z^{-k}$, where $p_k = P(X_1 = k)$ and $q_k = P(X_1 = -k)$, and let $z_0$ denote the unique root $> 1$ of $G(z) = 1$. For the case $p_k = 0$ for $k > 1$, using the notation $Q(z) = \sum_k q_k z^k$, one can show for large values of $n$ and $b$ that
$$\lambda \approx n z_0^{-b}\, \big\{ [Q(1) - Q(z_0^{-1})] - (1 - z_0^{-1})\, z_0^{-1}\, Q'(z_0^{-1}) \big\}.$$
For the case $q_k = 0$ for $k > 1$,
$$\lambda \approx n z_0^{-b}\, (1 - z_0^{-1})\, z_0^{-1}\, G'(1)^2 / G'(z_0).$$
In particular, if $q_1 = q$ and $p_1 = p$, where $p + q = 1$, both of these results specialize to
$$\lambda \approx n (p/q)^b (q - p)^2 / q.$$
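For the simple random walk this limiting $\lambda$ is easy to evaluate and to check by simulation, reusing the one-pass computation of $M_n$ from above (the parameters $n$, $b$, $p$ here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def lam_simple_walk(n, b, p):
    """lambda ~ n (p/q)^b (q - p)^2 / q for P(X=1)=p, P(X=-1)=q=1-p, p < q."""
    q = 1.0 - p
    return n * (p / q) ** b * (q - p) ** 2 / q

def mc_tail(n, b, p, n_sims=5_000):
    """Monte Carlo estimate of P(M_n >= b) for the simple random walk."""
    hits = 0
    for _ in range(n_sims):
        x = np.where(rng.random(n) < p, 1, -1)
        s = np.concatenate(([0], np.cumsum(x)))
        hits += (s[1:] - np.minimum.accumulate(s)[:-1]).max() >= b
    return hits / n_sims

n, b, p = 5_000, 12, 0.3
print(1.0 - np.exp(-lam_simple_walk(n, b, p)), mc_tail(n, b, p))
```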

Sketch of proof (for the case $h(b) = b^{1/2}$)

Recall $S_i = \sum_{k=1}^i X_k$. Define $T_b := \inf\{n \ge 1 : S_n \notin [0, b)\}$. For a positive integer $m$, let $\omega_m^+$ be the $m$-shifted sample path of $\omega := \{X_1, \dots, X_n\}$. Let $t = \frac{b}{\mu_1} + b$ and $m = cb$ such that $m < t$. For $1 \le i \le n - t$, let
$$Y_i = I\big(S_i < S_{i-j},\ 1 \le j \le m;\ T_b(\omega_i^+) \le t,\ S_{T_b(\omega_i^+)} \ge b\big).$$
That is, $Y_i$ is the indicator of the event that the sequence $\{S_1, \dots, S_n\}$ reaches a local minimum at $i$ and the $i$-shifted sequence exits the interval $[0, b)$ within time $t$, with first exit at or above $b$. Let
$$W = \sum_{i=1}^{n-t} Y_i.$$

Sketch of proof (cont.)
- $P(M_n \ge b) \approx P(W \ge 1)$.
- $\big| P(W \ge 1) - (1 - e^{-\lambda_1}) \big| \le C\lambda e^{-cb}$.
- $\lambda_1 = (n-t)\, EY_1 \approx (n-t)\, P(\tau_0 = \infty)\, P(S_{T_b} \ge b)$, where $\tau_0 := \inf\{n \ge 1 : S_n \ge 0\}$.
- $\lambda_1 \approx \lambda$.

Other statistics

Recall again that we want to test $H_0: X_1, \dots, X_n \sim F_{\theta_0}(\cdot)$ against the alternative
$$H_1: \text{for some } i < j,\quad X_{i+1}, \dots, X_j \sim F_{\theta_1}(\cdot),\quad X_1, \dots, X_i, X_{j+1}, \dots, X_n \sim F_{\theta_0}(\cdot).$$
1. If $\theta_0$ is not given, we need to consider $P(M_{n;t} \ge b \mid S_n)$ and $P(M_n \ge b \mid S_n)$.

2. If $\theta_0$ is given but $\theta_1$ is a nuisance parameter, then the log likelihood ratio statistic is
$$\max_{0 \le i < j \le n}\ \max_{\theta}\ \big[\theta(S_j - S_i) - (j - i)\Psi(\theta)\big].$$
For the normal distribution, it reduces to
$$\max_{0 \le i < j \le n} \frac{(S_j - S_i)^2}{2(j - i)}.$$
The limiting distribution is known only for the normal distribution and for $n \asymp b^2$ [Siegmund and Venkatraman (1995)].

3. Frick, Munk and Sieling (2014) proposed the following multiscale statistic:
$$\max_{0 \le i < j \le n} \Big\{ \frac{|S_j - S_i|}{\sqrt{j - i}} - \sqrt{2 \log \frac{n}{j - i}} \Big\}.$$
The penalty term $\sqrt{2 \log(n/(j-i))}$ was first studied in Dümbgen and Spokoiny (2001) and is motivated by Lévy's modulus of continuity theorem.
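A straightforward evaluation of this statistic, quadratic-time overall but vectorized over each window length (illustrative null data):

```python
import numpy as np

def multiscale_stat(x):
    """max over i < j of |S_j - S_i| / sqrt(j - i) - sqrt(2 log(n / (j - i)))."""
    n = len(x)
    s = np.concatenate(([0.0], np.cumsum(x)))
    best = -np.inf
    for d in range(1, n + 1):                      # d = j - i, the window length
        z = np.abs(s[d:] - s[:-d]) / np.sqrt(d)    # standardized window sums
        best = max(best, z.max() - np.sqrt(2.0 * np.log(n / d)))
    return best

rng = np.random.default_rng(4)
print(multiscale_stat(rng.standard_normal(500)))
```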

4. Let $X_1, \dots, X_m$ be an independent sequence of Gaussian random variables with means $EX_i = \mu_i$ and variance 1. We are interested in testing the null hypothesis $H_0: \mu_1 = \dots = \mu_m$ against the alternative hypothesis that there exist $1 \le \tau_1 < \dots < \tau_K \le m - 1$ such that
$$H_1:\ \mu_1 = \dots = \mu_{\tau_1} \ne \mu_{\tau_1 + 1} = \dots = \mu_{\tau_2} \ne \dots = \mu_{\tau_K} \ne \mu_{\tau_K + 1} = \dots = \mu_m.$$

4. (cont.) If $K = 1$, the log likelihood ratio statistic is
$$\max_{1 \le t \le m-1} \frac{\Big| \frac{S_m - S_t}{m - t} - \frac{S_t}{t} \Big|}{\sqrt{\frac{1}{t} + \frac{1}{m - t}}}.$$
If $K > 1$, an appropriate statistic is
$$\max_{0 \le i < j < k \le m} \frac{\Big| \frac{S_j - S_i}{j - i} - \frac{S_k - S_j}{k - j} \Big|}{\sqrt{\frac{1}{j - i} + \frac{1}{k - j}}}.$$
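The $K = 1$ statistic is just the standardized difference of the two segment means, maximized over the split point. A compact sketch (the simulated single-change-point data are illustrative):

```python
import numpy as np

def split_stat(x):
    """K = 1 statistic: max over t of |mean(x[t:]) - mean(x[:t])|
    divided by sqrt(1/t + 1/(m - t))."""
    m = len(x)
    s = np.cumsum(x)
    t = np.arange(1, m)
    diff = (s[-1] - s[t - 1]) / (m - t) - s[t - 1] / t
    return np.max(np.abs(diff) / np.sqrt(1.0 / t + 1.0 / (m - t)))

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(1, 1, 100)])
print(split_stat(x))
```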

Thank you!