Scan Statistics: Theory and Applications

Similar documents
Approximations for two-dimensional discrete scan statistics in some dependent models

A Markov chain approach to quality control

Approximations for Multidimensional Discrete Scan Statistics

Combinations. April 12, 2006

The Markov Chain Imbedding Technique

CSE 312 Final Review: Section AA

On lower and upper bounds for probabilities of unions and the Borel Cantelli lemma

Susceptible-Infective-Removed Epidemics and Erdős-Rényi random

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Mathematical Methods for Computer Science

Waiting time distributions of simple and compound patterns in a sequence of r-th order Markov dependent multi-state trials

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes

40.530: Statistics. Professor Chen Zehua. Singapore University of Design and Technology

Introduction to probability theory

A PROBABILISTIC MODEL FOR CASH FLOW

Homework 1 Solutions ECEn 670, Fall 2013

Lecture 5. 1 Chung-Fuchs Theorem. Tel Aviv University Spring 2011

On the estimation of the entropy rate of finite Markov chains

Poisson Approximation for Two Scan Statistics with Rates of Convergence

Modern Discrete Probability Branching processes

On monotonicity of expected values of some run-related distributions

An FKG equality with applications to random environments

WAITING-TIME DISTRIBUTION FOR THE r th OCCURRENCE OF A COMPOUND PATTERN IN HIGHER-ORDER MARKOVIAN SEQUENCES

Modelling claim exceedances over thresholds

The symmetry in the martingale inequality

Evgeny Spodarev WIAS, Berlin. Limit theorems for excursion sets of stationary random fields

Stochastic Processes

P i [B k ] = lim. n=1 p(n) ii <. n=1. V i :=

Week 2. Review of Probability, Random Variables and Univariate Distributions

Limit theorems for dependent regularly varying functions of Markov chains

Markov Chain Model for ALOHA protocol

Randomized Load Balancing:The Power of 2 Choices

7 To solve numerically the equation of motion, we use the velocity Verlet or leap frog algorithm. _ V i n = F i n m i (F.5) For time step, we approxim

Modèles de dépendance entre temps inter-sinistres et montants de sinistre en théorie de la ruine

Lower Tail Probabilities and Related Problems

PAIRS OF SUCCESSES IN BERNOULLI TRIALS AND A NEW n-estimator FOR THE BINOMIAL DISTRIBUTION

A MARTINGALE APPROACH TO SCAN STATISTICS

Spring 2012 Math 541B Exam 1

Chapter 5. Chapter 5 sections

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006

The asymptotic number of prefix normal words

Solution: The process is a compound Poisson Process with E[N (t)] = λt/p by Wald's equation.

A MULTIGRID ALGORITHM FOR. Richard E. Ewing and Jian Shen. Institute for Scientic Computation. Texas A&M University. College Station, Texas SUMMARY

1 Exercises for lecture 1

There is no isolated interface edge in very supercritical percolation

ON COMPOUND POISSON POPULATION MODELS

n(1 p i ) n 1 p i = 1 3 i=1 E(X i p = p i )P(p = p i ) = 1 3 p i = n 3 (p 1 + p 2 + p 3 ). p i i=1 P(X i = 1 p = p i )P(p = p i ) = p1+p2+p3

THE QUEEN S UNIVERSITY OF BELFAST

UPPER DEVIATIONS FOR SPLIT TIMES OF BRANCHING PROCESSES

Markov Chains. X(t) is a Markov Process if, for arbitrary times t 1 < t 2 <... < t k < t k+1. If X(t) is discrete-valued. If X(t) is continuous-valued

MATH 564/STAT 555 Applied Stochastic Processes Homework 2, September 18, 2015 Due September 30, 2015

A Probability Primer. A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes.

Karaliopoulou Margarita 1. Introduction

Discrete distribution. Fitting probability models to frequency data. Hypotheses for! 2 test. ! 2 Goodness-of-fit test

Optional Stopping Theorem Let X be a martingale and T be a stopping time such

Problem Points S C O R E Total: 120

Ruin Probabilities of a Discrete-time Multi-risk Model

Stability and Rare Events in Stochastic Models Sergey Foss Heriot-Watt University, Edinburgh and Institute of Mathematics, Novosibirsk

Convergence at first and second order of some approximations of stochastic integrals

PITMAN S 2M X THEOREM FOR SKIP-FREE RANDOM WALKS WITH MARKOVIAN INCREMENTS

Selected Exercises on Expectations and Some Probability Inequalities

Poisson INAR processes with serial and seasonal correlation

PROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

Probability and Measure

THE NUMBER OF SQUARE ISLANDS ON A RECTANGULAR SEA

PRIME GENERATING LUCAS SEQUENCES

Chapter 2: Markov Chains and Queues in Discrete Time

Tom Salisbury

Math Review Sheet, Fall 2008

Solutions to Problem Set 4

Max stable processes: representations, ergodic properties and some statistical applications

Imprecise Software Reliability

4 Branching Processes

A COMPOUND POISSON APPROXIMATION INEQUALITY

Meixner matrix ensembles

Applying the Benjamini Hochberg procedure to a set of generalized p-values

Bounds for reliability of IFRA coherent systems using signatures

Fundamental Tools - Probability Theory IV

Math-Stat-491-Fall2014-Notes-V

The Distribution of Partially Exchangeable Random Variables

The Distribution of Mixing Times in Markov Chains

Chapter 6. Hypothesis Tests Lecture 20: UMP tests and Neyman-Pearson lemma

Asymptotic properties of the maximum likelihood estimator for a ballistic random walk in a random environment

= P{X 0. = i} (1) If the MC has stationary transition probabilities then, = i} = P{X n+1

Contents. 4 Arithmetic and Unique Factorization in Integral Domains. 4.1 Euclidean Domains and Principal Ideal Domains

Stochastic process. X, a series of random variables indexed by t

Average Reward Parameters

Geometric Inference for Probability distributions

Chapter 1. Sets and probability. 1.3 Probability space

Simultaneous drift conditions for Adaptive Markov Chain Monte Carlo algorithms

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Practical conditions on Markov chains for weak convergence of tail empirical processes

A log-scale limit theorem for one-dimensional random walks in random environments

SOME THEORY AND PRACTICE OF STATISTICS by Howard G. Tucker CHAPTER 2. UNIVARIATE DISTRIBUTIONS

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4

Gwinnett has an amazing story to share, spanning 200 years from 1818 until today.

Write your Registration Number, Test Centre, Test Code and the Number of this booklet in the appropriate places on the answer sheet.

Birth-death chain. X n. X k:n k,n 2 N k apple n. X k: L 2 N. x 1:n := x 1,...,x n. E n+1 ( x 1:n )=E n+1 ( x n ), 8x 1:n 2 X n.

Consensus formation in the Deuant model

arxiv: v1 [math.pr] 9 Apr 2015 Dalibor Volný Laboratoire de Mathématiques Raphaël Salem, UMR 6085, Université de Rouen, France

Transcription:

Scan Statistics: Theory and Applications Alexandru Am rioarei Laboratoire de Mathématiques Paul Painlevé Département de Probabilités et Statistique Université de Lille 1, INRIA Modal Team Seminaire de Probabilité et Statistique Laboratoire de Mathématiques Paul Painlevé 5 March, 2014, Lille A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 1 / 61

Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 2 / 61

Introduction Example Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 3 / 61

Introduction Example Epidemiology example 1 year JFMAMJJASONDJFMAMJJASONDJFMAMJJASONDJFMAMJJASONDJFMAMJJASOND 1991 1992 1993 1994 1995 Observation of disease cases over time: N = 19 cases over a period of T = 5 years Observation The epidemiologist notes a one year period (from April 93 - through April 94) with 8 cases: 42%! Question Given 19 cases over 5 years, how unusual is it to have a 1 year period containing as many as 8 cases? A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 4 / 61

Introduction Example The answer: A First approach A First approach: X = the number of cases falling in [April 93, April 94] X Bin(19, 0.2) P(X 8) = 0.023 Conclusion: atypical situation! But: it is not the answer to our question: the one year period is not xed but identied after the scanning process! A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 5 / 61

Introduction Example The answer: Correct approach The scan statistics: S = the maximum number of cases over any continuous one year period in [0, T ] Thus, P(S 8) = 0.379 gives the answer to the epidemiologist question. Conclusion: no unusual situation! A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 6 / 61

One Dimensional Scan Statistics Denitions and Notations Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 7 / 61

One Dimensional Scan Statistics Denitions and Notations Introducing the Scan Statistics Model Let T 1 be a positive integer and X 1, X 2,..., X T1 a sequence of i.i.d. r.v.'s. For m 1 T 1 integer, consider the moving sums Y i1 = i 1 +m 1 1 j=i 1 The discrete one dimensional scan statistics Problem S m1 (T 1 ) = X j max Y i 1. 1 i 1 T 1 m 1 +1 Approximate the distribution of one dimensional scan statistic P (S m1 (T 1 ) k). Used for testing the null hypotheses of randomness against the alternative hypothesis of clustering. A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 8 / 61

One Dimensional Scan Statistics Denitions and Notations Related Statistics Let X 1,..., X T1 be a sequence of i.i.d. 0 1 Bernoulli of parameter p W m1,k - the waiting time until we rst observe at least k successes in a window of size m 1 P ( W m1,k T 1 ) = P (Sm1 (T 1 ) k) D T1 (k) - the length of the smallest window that contains at least k successes L T1 P (D T1 (k) m 1 ) = P (S m1 (T 1 ) k) - the length of the longest success run P (L T1 m 1 ) = P (S m1 (T 1 ) m 1 ) = P (S m1 (T 1 ) = m 1 ) A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 9 / 61

One Dimensional Scan Statistics Denitions and Notations Literature A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 10 / 61

One Dimensional Scan Statistics Exact Formula by MCIT (Bernoulli Case) Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 11 / 61

One Dimensional Scan Statistics Exact Formula by MCIT (Bernoulli Case) Approach Fu (2001) applied the Markov Chain Imbedding Technique to nd the distribution of binary scan statistics. Main Idea Express the distribution of the S m1 (T 1 ) in terms of the waiting time distribution of a special compound pattern dene for 0 k m 1 m1 {}}{ F m 1,k = {Λ i Λ 1 = 1... 1, Λ }{{} 2 = 10 1... 1,..., Λ }{{} l = 1... 1 0... 01} }{{} k k 1 k 1 Fm 1,k the compound pattern Λ = l i=1 Λ i, Λ i F m 1,k m1 k1 ( k 2 + j ) = j j=0 P(S m 1 (T 1) < k) = P(W (Λ) T 1 + 1). P(S m 1 (T 1) < k) = ξn T 1 1, where ξ = (1, 0,..., 0) A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 12 / 61

One Dimensional Scan Statistics Exact Formula by MCIT (Bernoulli Case) Example Consider the i.i.d. two-state sequence (X i ) i {1,2,...,T1 } with p = P(X 1 = 1) and q = P(X 1 = 0). A realisation for T 1 = 20 00101011101101010110 For k = 3 and m 1 = 4 F 4,3 = {Λ 1 = 111, Λ 2 = 1011, Λ 3 = 1101} The state space the principal matrix: N = 0 q p 0 0 0 0 0 q p 0 0 0 0 0 0 0 q p 0 0 0 q 0 0 0 p 0 0 0 0 0 0 0 q 0 0 0 q 0 0 0 0 q 0 0 0 0 0 Ω = {, 0, 1, 10, 11, 101, 110, α 1, α 2, α 3 } A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 13 / 61

One Dimensional Scan Statistics Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 14 / 61

One Dimensional Scan Statistics Naus Product Type Approximation Considering that T 1 = L 1 m 1, Naus (1982) gave the following product type approximation ( ) T 1 Q3m 1 P (S m1 (T 1 ) k) Q 2 m 1 2m1 Q 2m, 1 where Q 2m1 = P (S m1 (2m 1 ) k) and Q 3m1 = P (S m1 (3m 1 ) k). Idea of proof: P (S m 1 (T 1) k) = P(E 1 )P(E 2 E 1 ) P ( E L 1 where for j {1, 2,..., L 1}, { } E j = max Y i 1 k, (j 1)m1+1 i1 jm1 and use the Markov-like approximation (exchangeability) l 1 P E l E i P (E l E l 1 ) P (E 2 E 1 ) P (E i=1 1 ) L 2 i=1 E i ) = Q 3m1. Q 2m 1, A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 15 / 61

One Dimensional Scan Statistics Product Type Approximation for Bernoulli For Bernoulli r.v.'s, Naus (1982) derived exact formulas for Q 2m1 and Q 3m1 : Q 2m 1 = F 2 (k; m 1, p) kb(k + 1; m 1, p)f (k 1; m 1, p) + m 1 pb(k + 1; m 1, p)f (k 2, m 1 1), Q 3m 1 = F 3 (k; m 1, p) A 1 + A 2 + A 3 A 4, where A 1 = 2b(k + 1; m 1, p)f (k; m 1, p)[kf (k 1; m 1, p) m 1 pf (k 2; m 1 1, p)], A 2 = 0.5b 2 (k + 1; m 1, p) [k(k 1)F (k 2; m 1, p) 2(k 1)m 1 F (k 3; m 1 1, p) +m 1 (m 1 1)p 2 F (k 4; m 1 2, p) ], k A 3 = b(2(k + 1) r; m 1, p)f 2 (r 1; m 1, p), r=1 k A 4 = b(2(k + 1) r; m 1, p)b(r + 1; m 1, p)[rf (r 1; m 1, p) m 1 pf (r 2; m 1 1, p)]. r=2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 16 / 61

One Dimensional Scan Statistics Product Type Approximation for Binomial and Poisson If X i Bin(n, p) or X i Pois(λ), we have the approximation P (S m 1 (T 1) k) Q 2m 1 Q 2m 1 1 ( Q3m 1 Q 2m 1 ( Q2m 1 ) T 1 m 1 2, T 1 3m 1 Q 2m 1 1 ) T 1 2m1+1, T 1 2m 1 where Q 2m1 1, Q 2m1 and Q 3m1 are computed by Karwe and Naus (1996) recurrence. Karwe Naus algorithm for Q 2m 1 1 and Q2m 1 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 17 / 61

One Dimensional Scan Statistics Bounds Glaz and Naus (1991) developed a variety of tight bounds: Lower Bounds Q 2m 1 P (S m 1 (T 1) k) [ ], T 1 + Q2m 1 1 Q2m T 1 2m 1 1 2m1 1 Q 2m 1 1Q2m 1 Q 3m 1 [ 1 + Q 2m 1 1 Q2m 1 Q 3m 1 1 ] T 1 3m1, T 1 3m 1 Upper Bounds P (S m 1 (T 1) k) Q 2m 1 [1 Q 2m1 1 + Q 2m 1 ]T 1 2m1, T 1 2m 1 Q 3m 1 [1 Q 2m1 1 + Q 2m 1 ]T 1 3m1, T 1 3m 1 The values Q 2m1 1, Q 2m1, Q 3m1 1, Q 3m1 are computed using Karwe and Naus (1996) algorithm. A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 18 / 61

One Dimensional Scan Statistics Haiman Type Approximation: Main Observation Haiman proposed in 2000 a dierent approach Main Observation The scan statistic r.v. can be viewed as a maximum of a sequence of 1-dependent stationary r.v.. The idea: discrete and continuous one dimensional scan statistic: Haiman (2000,2007) discrete and continuous two dimensional scan statistic: Haiman and Preda (2002,2006) discrete three dimensional scan statistic: Am rioarei and Preda (2013) A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 19 / 61

One Dimensional Scan Statistics Haiman Type Approximation: Writing the Scan as an Extreme of 1-Dependent R.V.'s Let T 1 = L 1 m 1 positive integer Dene for j {1, 2,..., L 1 1} Z j = (Z j ) j is 1-dependent and stationary max (j 1)m 1 +1 i 1 jm 1 +1 Y i 1 {}}{ X 1, X 2,..., X m 1, X m m1+1,..., X 2m 1 }{{}, X2m1+1,..., X 3m 1, X 3m1+1,..., X 4m 1 }{{} Z1 Z2 Z3 Observe S m1 (T 1 ) = max 1 j L 1 1 Z j A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 20 / 61

One Dimensional Scan Statistics Haiman Type Approximation: Main Tool Let (Z j ) j 1 be a strictly stationary 1-dependent sequence of r.v.'s and let q m = q m (x) = P(max(Z 1,..., Z m ) x), with x < sup{u P(Z 1 u) < 1}. Theorem (Haiman 1999) For any x such that P(Z 1 > x) = 1 q 1 0.025 and any integer m > 3, q 6(q 1 q 2 ) 2 + 4q 3 3q 4 m (1 + q 1 q 2 + q 3 q 4 + 2q1 2 + 3q2 2 5q 1q 2 ) m H 1 (1 q 1 ) 3, q 2q 1 q 2 m [1 + q 1 q 2 + 2(q 1 q 2 ) 2 ] m H 2 (1 q 1 ) 2, H 1 = 561 + 88m[1 + 124m(1 q 1) 3 ] H 1 = 9 + 561(1 q 1) + 3.3m[1 + 4.7m(1 q 1 ) 2 ]. A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 21 / 61

One Dimensional Scan Statistics Haiman Type Approximation: Improvement Main Theorem (Amarioarei 2012) For x such that P(Z 1 > x) = 1 q 1 α < 0.1 and m > 3 we have q 6(q 1 q 2 ) 2 + 4q 3 3q 4 m (1 + q 1 q 2 + q 3 q 4 + 2q1 2 + 3q2 2 5q 1q 2 ) m 1(1 q 1 ) 3, q 2q 1 q 2 m [1 + q 1 q 2 + 2(q 1 q 2 ) 2 ] m 2(1 q 1 ) 2, 1 = 1 (α, q 1, m) = Γ(α) + mk(α) [ 2 = me(α, q 1, m) = m Increased range of applicability 1 + 3 m + K(α)(1 q 1) + Γ(α)(1 q 1) m Sharp bounds values (ex. α = 0.025: 561 145 and 88 17.5) Selected values for K(α) and Γ(α) A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 22 / 61 ].

One Dimensional Scan Statistics Dierence between the results: 1 q 1 = 0.025 9 8 2 H 2 Value of the coefficients 7 6 5 4 3 2 1 0 50 100 150 200 250 300 350 400 450 500 Length of the sequence (n) A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 23 / 61

One Dimensional Scan Statistics Haiman Type Approximation: Result T 1 Observe that Q 2m1 = P(Z 1 k) Q 3m1 = P(Z 1 k, Z 2 k) X 1 Xm1 X m1 +1 X 2m1 X 2m1 +1 m 1 2m 1 Q 2m1 X 3m1 X T1 If 1 Q 2m1 0.1 the approximation 3m 1 Q 3m1 P(S k) 2Q2m 1 Q 3m 1 [1+Q2m 1 Q 3m 1 +2(Q 2m 1 Q 3m 1) 2]L 1 1 where S = S m1 (T 1 ) Approximation error, about (L 1 1)E(α, L 1 1) (1 Q 2m 1 )2 with E(α, α, m) = E(α, m). Q 2m1 and Q 3m1 are evaluated using the previous approaches A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 24 / 61

One Dimensional Scan Statistics Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 25 / 61

One Dimensional Scan Statistics Comparison between methods Table 1 : n = 1, p = 0.005, m 1 = 10, T 1 = 1000, It App = 10 4 k Exact Glaz and Naus Haiman Approximation Lower Upper Product type Approximation Error Bound Bound 1 0.810209 0.810216 0.810404 0.001111 0.809903 0.810439 2 0.995764 0.995764 0.995764 3 10 7 0.995764 0.995764 3 0.999950 0.999950 0.999950 4 10 11 0.999950 0.999950 Table 2 : n = 5, p = 0.05, m 1 = 25, T 1 = 500, It App = 10 4, It Sim = 10 3 k ˆP(S k) Glaz and Naus Haiman Total Lower Upper Product type Approximation Error Bound Bound 13 0.712750 0.705787 0.714699 0.039308 0.697431 0.706948 14 0.867498 0.862184 0.865029 0.012502 0.859543 0.862407 15 0.946912 0.943329 0.946177 0.004169 0.942552 0.943362 16 0.980230 0.978959 0.979822 0.001354 0.978733 0.978963 17 0.993486 0.992821 0.993134 0.000433 0.992756 0.992822 18 0.997802 0.997726 0.997849 0.000127 0.997708 0.997726 19 0.999362 0.999327 0.999358 3 10 5 0.999322 0.999327 20 0.999819 0.999813 0.999825 9 10 6 0.999812 0.999813 21 0.999954 0.999951 0.999953 2 10 6 0.999951 0.999951 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 26 / 61

Two Dimensional Scan Statistics Model Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 27 / 61

Two Dimensional Scan Statistics Model Introducing the Model Let T 1, T 2 be positive integers T 2.. j. 1 X i,j T 1 T 2 Rectangular region R = [0, T 1 ] [0, T 2 ] (X ij ) 1 i T1 1 j T 2 i.i.d. integer r.v.'s Bernoulli(B(1, p)) Binomial(B(n, p)) Poisson(P(λ)) X ij number of observed events in the elementary subregion r ij = [i 1, i] [j 1, j] 1 i T 1 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 28 / 61

Two Dimensional Scan Statistics Model Dening the Scan Statistic Let m 1, m 2 be positive integers T 1 Dene for 1 i j T j m j + 1, Y i1 i 2 = i 1 +m 1 1 i 2 +m 2 1 X ij i=i 1 j=i 2 m 2 T 2 The two dimensional scan statistic, m 1 S m1,m 2 (T 1, T 2 ) = max Y i1 i 2 1 i 1 T 1 m 1 +1 1 i 2 T 2 m 2 +1 Used for testing the null hypotheses of randomness against the alternative hypothesis of clustering A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 29 / 61

Two Dimensional Scan Statistics Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 30 / 61

Two Dimensional Scan Statistics Product Type Approximation Bernoulli Case Boutsikas and Koutras (2000) using Markov Chain Imbedding approach proposed the approximation Where, P (S m 1,m2 (T 1, T 2 ) k) Q(m 1,m2) (T 1 m 1 1)(T 2 m 2 1) Q(m1+1,m2+1) (T 1 m 1 )(T 2 m 2 ) Q(m1,m2+1) (T 1 m 1 1)(T 2 m 2 ) Q(m1+1,m2) (T 1 m 1 )(T 2 m 2 1) Q(m1, m2) = F (k; m1m2, p) k Q(m1 + 1, m2) = F 2 (k s; m2, p) b (s; (m1 1)m2, p) s=0 k k 1 Q(m1 + 1, m2 + 1) = b(s1; m1 1, p)b(s2; m1 1, p)b(t1; m2 1, p) s 1,s 2 =0 t 1,t 2 =0 i 1,i 2,i 3,i 4 =0 i b(t2; m2 1, p)p j (1 p) 4 i j F (u; (m 1 1)(m2 1), p) u = min {k s1 t1 i1, k s2 t1 i2, k s1 t2 i3, k s2 t2 i4} ( ) n b(s; n, p) = p s (1 p) n s s s F (s; n, p) = b(i; n, p) i =0 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 31 / 61

Two Dimensional Scan Statistics Bounds for the Bernoulli Case The following bounds were established by Boutsikas and Koutras (2003) Lower Bound LB = (1 Q 1 ) (T 1 m1)(t2 m2) (1 Q 2 ) T 1 m1 (1 Q 3 ) T 2 m2 (1 Q 4 ) Upper Bound ( UB = (1 Q1) 1 q (m 1 1)(3m 2 2)+(2m 1 1)(m 2 1) (T 1 Q1) m 1 1)(T 2 m 2 1) ( 1 q m 1 (m 2 1) ) T 2 Q1 m 2 1 ( 1 q (m 1 1)(2m 2 1)+(m 1 1)(m 2 1) T 1 Q1) m 1 1 ( 1 q (m 1 1)(2m 2 1)+m 1 (m 2 1) ) T 1 Q2 m 1 ( 1 q (m 1 1)(3m 2 2)+m 1 (m 2 1)+(m 1 1)(m 2 1) T 2 Q3) m 2 ( 1 q (m 1 1)(2m 2 1)+m 1 (m 2 1) ) Q4. Where X ij B(p), q = 1 p and Q1 = F c k+1,m 1 m 2 qm2 F c k+1,(m 1 1)m 2 qm1 F c k+1,m 1 (m 2 1) + qm 1 +m 2 1 F c k+1,(m 1 1)(m 2 1), Q2 = F c k+1,m 1 m 2 qm2 F c k+1,(m 1 1)m 2, Q 3 = F c k+1,m 1 m 2 qm1 F c k+1,m 1 (m 2 1), Q 4 = F c k+1,m 1 m 2 Fi c,m = 1 F (i 1; m, p). A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 32 / 61

Two Dimensional Scan Statistics Product Type Approximation Binomial and Poisson For X ij Bin(n, p) or X ij Pois(λ), Chen and Glaz (2009) proposed the product type approximation Q(m (T 1,2m2 1) 1 m1 1)(T2 2m2) Q(m1+1,m2) (T 1 m 1 )(T 2 m 2 1) Q(m1,2m2) (T 1 m 1 1)(T 2 2m 2 +1) P (S m 1,m2 (T 1, T 2 ) k) Q(m 1+1,m2+1) (T 1 m 1 )(T 2 m 2 ) Where, Q(m 1, 2m 2 1) = P (S m1,m 2 (m 1, 2m 2 1) k) Q(m 1, 2m 2 ) = P (S m1,m 2 (m 1, 2m 2 ) k) To compute the unknown variables we use Q(m 1, 2m 2 1) and Q(m 1, 2m 2 ) - adaptation of Karwe and Naus algorithm Q(m 1 + 1, m 2 ) and Q(m 1 + 1, m 2 + 1) - conditioning Formulas for Q(m1 + 1, m2) and Q(m1 + 1, m2 + 1) A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 33 / 61

Two Dimensional Scan Statistics Lower Bound for Binomial and Poisson Glaz and Chen (1996) gave a lower bound applying Hoover Bonferroni type inequality of order r 3, P ( S m 1,m 2 (T 1, T2) k ) T 1 m 1 +1 = P i 1 =1 T 1 m 1 +1 i 1 =1 T 1 m 1 +1 i 1 =1 T 2 m 2 +1 T 2 m 2 i 2 =1 r 1 l =2 A i i 2 =1 1,i 2 T 1 m 1 +1 i 1 =1 T 2 m 2 +1 i 2 =1 ) P (A T 1 i 1,i 2 m 1 Ai 1,i 2 +1 T 2 m 2 +1 l i 2 =1 Where A i1,i 2 = {Y i1,i 2 k} and for r = 4, Lower Bound i 1 =1 ( ) P A i 1,i 2 ) P (A i 1,1 Ai 1 +1,1 ) P (A i 1,i 2 Aci 1,i 2 +1... Aci 1,i 2 +l 1 Ai 1,i 2 +l P ( S m 1,m 2 (T 1, T2) k ) (T1 m1) (Q(m1 + 1, m2) 2Q(m1, m2)) (T1 m1 + 1)(T2 m2 3) Q(m1, m2 + 2) + (T1 m1 + 1)(T2 m2 2)Q(m1, m2 + 3). Q(m 1 + 1, m 2 ), Q(m 1, m 2 ), Q(m 1, m 2 + 2), Q(m 1, m 2 + 3) - Karwe and Naus algorithm (variant) A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 34 / 61

Two Dimensional Scan Statistics Upper Bound for Binomial and Poisson For the upper bound we adapt the inequality of Kuai et al. (2000) to the two dimensional framework: where P ( S m 1,m 2 We have T 1 m 1 +1 (T 1, T2) k) = 1 P i 1 =1 T 1 m 1 +1 T 2 m 2 +1 [ 1 Σ(i1, i2) = i 1 =1 T 1 m 1 +1 j 1 =1 i 2 =1 T 2 m 2 +1 j 2 =1 ( ) P A i 1,i 2 Aj 1,j = 2 ) P (Y i 1,i 2 n, Yj 1,j 2 n = Z = T 2 m 2 +1 A i i 2 =1 1,i 2 θ i 1,i 2 P(Ai 1,i 2 )2 Σ(i1, i2) + (1 θ i 1,i 2 )P(Ai 1,i 2 ) + (1 θi 1,i 2 )P(Ai 1,i 2 ) 2 ] Σ(i1, i2) θ i 1,i 2 P(Ai 1,i 2 ) ( ) P A i 1,i 2 Aj 1,j and θ 2 i 1,i 2 = Σ(i 1,i 2 ) P(A i 1,i 2 ) Σ(i 1,i 2 ) P(A i 1,i 2 ). { [1 Q(m 1, m2)] 2, if i1 j1 m1 or i2 ) j2 m2, 1 2Q(m1, m2) + P (Y i 1,i 2 n, Yj 1,j 2 n, otherwise n P(Z = k)p(y i 1,i 2 Z n k)2, k=0 (i 1 +m 1 1) (j 1 +m 1 1) s=i 1 j 1 (i 2 +m 2 1) (j 2 +m 2 1) t=i 2 j 2 X s,t. A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 35 / 61

Two Dimensional Scan Statistics Haiman Type Approximation and Error Bounds: Writing the Scan as an Extreme of 1-Dependent R.V.'s Let L j = T j m j, j {1, 2} positive integers Dene for l {1, 2,..., L 2 1} Z l = max Y i1 i 2 1 i 1 (L 1 1)m 1 +1 (l 1)m 2 +1 i 2 lm 2 +1 (Z l ) l is 1-dependent and stationary Observe S m1,m 2 (T 1, T 2 ) = max 1 k L 2 1 Z k T 1 T 2 m 2 m 1 Z 3 Z 1 Z 2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 36 / 61

Two Dimensional Scan Statistics Haiman Type Approximation and Error Bounds: First Step Approximation Using Main Theorem we obtain Dene Q 2 = P(Z 1 k) Q 3 = P(Z 1 k, Z 2 k) If 1 Q 2 α 1 < 0.1 the (rst) approximation T 2 m 2 m 1 T 1 R P(S k) 2Q2 Q3 [1+Q2 Q3+2(Q2 Q3) 2 ] L 2 1 T1 T1 where S = S m1,m 2 (T 1, T 2 ) Approximation error (L 2 1)E(α 1, L 2 1)(1 Q 2 ) 2 T2 m1 Q 2 2m2 T2 m1 Q 3 3m2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 37 / 61

Two Dimensional Scan Statistics Haiman Type Approximation and Error Bounds: Second Step Approximation Q 2 : For s {1, 2,..., L 1 1} Z (2) s = max (s 1)m1+1 i1 sm1+1 1 i2 m2+1 Y i 1i2 ( ) Q 2 = P max Z (2) s k 1 s L1 1 Dene Q22 = P(Z (2) 1 k) Q 32 = P(Z (2) 1 k, Z (2) 2 k) Approximation (1 Q 22 α 2 ) Q 2 2Q22 Q32 [1+Q22 Q32+2(Q22 Q32) 2 ] L 1 1 (L 1 1)E(α 2, L 1 1)(1 Q 22 ) 2 Q 3 : For s {1, 2,..., L 1 1} Z (3) s = max (s 1)m1+1 i1 sm1+1 1 i2 2m2+1 Y i 1i2 ( ) Q 3 = P max Z (3) s k 1 l L1 1 Dene Q23 = P(Z (3) 1 k) Q 33 = P(Z (3) 1 k, Z (3) 2 k) Approximation (1 Q 23 α 2 ) Q 3 2Q23 Q33 [1+Q23 Q33+2(Q23 Q33) 2 ] L 1 1 (L 1 1)E(α 2, L 1 1)(1 Q 23 ) 2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 38 / 61

Two Dimensional Scan Statistics Haiman Type Approximation and Error Bounds: Illustration of the Approximation Process T1 T2 R m2 T1 m1 T1 T2 T2 Q 2 2m2 Q 3 3m2 m1 m1 T1 T1 T1 T1 T2 T2 T2 T2 2m1 Q 22 2m2 3m1 Q 32 2m2 Q 23 2m1 3m2 3m1 Q 33 3m2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 39 / 61

Two Dimensional Scan Statistics Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 40 / 61

Two Dimensional Scan Statistics Comparison between methods Table 3 : n = 1, p = 0.005, m 1 = m 2 = 6, T 1 = T 2 = 30, It App = 10 3, It Sim = 10 3 k ˆP(S k) Glaz and Naus Haiman Total Lower Upper Product type Approximation Error(App+Sim) Bound Bound 2 0.915903 0.914013 0.920211 0.041483 0.901935 0.945623 3 0.994292 0.994395 0.994578 0.000803 0.993785 0.996638 4 0.999747 0.999757 0.999760 2 10 5 0.999737 0.999858 5 0.999992 0.999992 0.999992 7 10 7 0.999992 0.999995 Table 4 : n = 5, p = 0.002, m 1 = 5, m 2 = 10, T 1 = 50, T 2 = 80, It App = 10 4, It Sim = 10 3 k ˆP(S k) Glaz and Naus Haiman Total Lower Upper Product type Approximation Error(App+Sim) Bound Bound 4 0.894654 0.873256 0.893724 0.037136 0.803422 0.944318 5 0.988003 0.986249 0.988144 0.002125 0.981418 0.993451 6 0.998963 0.998847 0.998963 0.000152 0.998543 0.999401 7 0.999926 0.999919 0.999925 9 10 6 0.999903 0.999955 8 0.999995 0.999995 0.999995 5 10 7 0.999994 0.999997 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 41 / 61

Three Dimensional Scan Statistics Model and Approximations Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 42 / 61

Three Dimensional Scan Statistics Model and Approximations Introducing the Model T 3 X ijk T 2 T 1 Let T 1, T 2, T 3 be positive integers Rectangular region R = [0, T 1 ] [0, T 2 ] [0, T 3 ] ( ) Xijk 1 i T 1 i.i.d. integer r.v.'s 1 j T 2 1 k T 3 Bernoulli(B(1, p)) Binomial(B(n, p)) Poisson(P(λ)) X ijk number of observed events in the elementary subregion r ijk = [i 1, i] [j 1, j] [k 1, k] A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 43 / 61

Three Dimensional Scan Statistics Model and Approximations Dening the Scan Statistic Let m 1, m 2, m 3 be positive integers Dene for 1 i j T j m j + 1, Y i1 i 2 i 3 = i 1 +m 1 1 i 2 +m 2 1 i 3 +m 3 1 X ijk i=i 1 j=i 2 k=i 3 T 3 m 3 Y i1 i 2 i 3 The three dimensional scan statistic, S m1,m 2,m 3 = max 1 i 1 T 1 m 1 +1 1 i 2 T 2 m 2 +1 1 i 3 T 3 m 3 +1 Y i1 i 2 i 3. m 2 T T 1 2 m 1 Used for testing the null hypotheses of randomness against the alternative hypothesis of clustering A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 44 / 61

Three Dimensional Scan Statistics Model and Approximations Product Type Approximation Glaz et al. proposed in the product type approximation P ( S m 1,m 2,m 3 k) Q(m1 + 1, m2 + 1, m3 + 1) (T 1 m 1 )(T 2 m 2 )(T 3 m 3 ) Q(m1 + 1, m2, m3) (T 1 m 1 )(T 2 m 2 1)(T 3 m 3 1) Q(m1, m2, m3) (T 1 m 1 1)(T 2 m 2 1)(T 3 m 3 1) Q(m1 + 1, m2 + 1, m3) (T 1 m 1 )(T 2 m 2 )(T 3 m 3 1) Q(m1, m2 + 1, m3) (T 1 m 1 1)(T 2 m 2 )(T 3 m 3 1) Q(m1, m2, m3 + 1) (T 1 m 1 1)(T 2 m 2 1)(T 3 m 3 1) Q(m1 + 1, m2, m3 + 1) (T 1 m 1)(T 2 m 2 1)(T 3 m 3) Q(m1, m2 + 1, m3 + 1) (T 1 m 1 1)(T 2 m 2 )(T 3 m 3) Where, Q(N 1, N 2, N 3 ) = P (S m1,m 2,m 3 (N 1, N 2, N 3 ) k) The approximation also works for binomial and Poisson distribution Three Poisson Type Approximation A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 45 / 61

Three Dimensional Scan Statistics Model and Approximations Haiman Type Approximation: Writing the Scan as an Extreme of 1-Dependent R.V.'s m3 m2 m1 Let L j = T j m j, j {1, 2, 3} positive integers Dene for l {1, 2,..., L 3 1} Z l = max Y i1 i 2 i 3 1 i 1 (L 1 1)m 1 +1 1 i 2 (L 2 1)m 2 +1 (l 1)m 3 +1 i 3 lm 3 +1 Z3 Z2 Z1 T 3 (Z l ) l is 1-dependent and stationary Observe S m1,m 2,m 3 = max 1 l L 3 1 Z l T 2 T 1 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 46 / 61

Three Dimensional Scan Statistics Model and Approximations Haiman Type Approximation: First Step Approximation 2m3 R T 3 T 2 T 1 Q 2 Q 3 T 2 T 1 3m3 T 2 T 1 Dene Q 2 = P(Z 1 k) Q 3 = P(Z 1 k, Z 2 k) If 1 Q 2 α 1 < 0.1 the (rst) approximation P(S k) 2Q2 Q3 [1+Q2 Q3+2(Q2 Q3) 2 ] L 3 1 where S = S m1,m 2,m 3 (T 1, T 2, T 3 ) Approximation error (L 3 1)E(α 1, L 3 1)(1 Q 2 ) 2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 47 / 61

Three Dimensional Scan Statistics Model and Approximations Haiman Type Approximation: Approximation for Q 2 and Q 3 Q 2 : For l {1, 2,..., L 2 1} Z (2) = max Y l i 1i2i3 1 i1 (L1 1)m1+1 (l 1)m2+1 i2 lm2+1 1 i3 m3+1 ( ) Q 2 = P max Z (2) n 1 l L2 1 l Dene Q22 = P(Z (2) 1 n) Q 32 = P(Z (2) 1 n, Z (2) 2 n) Approximation (1 q 22 α 2 ) Q 2 2Q22 Q32 [1+Q22 Q32+2(Q22 Q32) 2 ] L 2 1 (L 2 1)E(α 2, L 2 1)(1 Q 22 ) 2 Q 3 : For l {1, 2,..., L 2 1} Z (3) = max Y l i 1i2i3 1 i1 (L1 1)m1+1 (l 1)m2+1 i2 lm2+1 1 i3 2m3+1 ( ) Q 3 = P max Z (3) n 1 l L2 1 l Dene Q23 = P(Z (3) 1 n) Q 33 = P(Z (3) 1 n, Z (3) 2 n) Approximation (1 q 23 α 2 ) Q 3 2Q23 Q33 [1+Q23 Q33+2(Q23 Q33) 2 ] L 2 1 (L 2 1)E(α 2, L 2 1)(1 Q 23 ) 2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 48 / 61

Three Dimensional Scan Statistics Model and Approximations Haiman Type Approximation: Last Step (Approximating Q ts ) Applying again the second part of the Main Theorem... For s, t {2, 3} and j {1, 2,..., L 1 1} dene Observe Z (ts) j = max (j 1)m1+1 i1 jm1+1 1 i2 (t 1)m2+1 1 i3 (s 1)m3+1 Y i 1i2i3 ( ) Q ts = P max Z (ts) n 1 j L1 1 j ( Dene for r, s, t {2, 3}, Q rts = P Q ts r 1 j=1 ) j n} {Z (ts) If 1 Q 2ts α 3 then the approximation and the error 2Q2ts Q3ts (L [1+Q2ts Q3ts +2(Q2ts Q3ts ) 2 ] L 1 1 1 1)E(α 3, L 1 1)(1 Q 2ts ) 2 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 49 / 61

Three Dimensional Scan Statistics Model and Approximations An Illustration of the Approximation Chain (Q 2 ) Q 2 2m3 Q 22 T2 T1 Q 32 2m3 2m3 2m2 T1 3m2 T1 Q 222 Q322 Q 232 Q 332 2m3 2m3 2m3 2m3 2m2 2m1 2m2 3m1 3m2 2m1 3m2 3m1 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 50 / 61

Three Dimensional Scan Statistics Outline 1 Introduction Example 2 One Dimensional Scan Statistics Denitions and Notations Exact Formula by MCIT (Bernoulli Case) 3 Two Dimensional Scan Statistics Model 4 Three Dimensional Scan Statistics Model and Approximations 5 References A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 51 / 61

Three Dimensional Scan Statistics Bernoulli Case Comparing with existing results: Table 5 : n = 1, p = 0.00005, m 1 = m 2 = m 3 = 5, T 1 = T 2 = T 3 = 60, It App = 10 5 k ˆP(S k) Glaz et al. Our Approximation Simulation Total Product type Approximation Error Error Error 1 0.841806 0.841424 0.851076 0.011849 0.064889 0.076738 2 0.999119 0.999142 0.999192 0.000000 0.000170 0.000170 3 0.999997 0.999998 0.999997 0.000000 3 10 7 3 10 7 Table 6 : n = 1, p = 0.0001, m 1 = m 2 = m 3 = 5, T 1 = T 2 = T 3 = 60, It App = 10 5 k ˆP(S k) Glaz et al. Our Approximation Simulation Total Product type Approximation Error Error Error 2 0.993294 0.993241 0.993192 0.000010 0.001367 0.001377 3 0.999963 0.999964 0.999963 0.000000 0.000005 0.000005 4 0.999999 0.999999 0.999999 0.000000 2 10 9 2 10 9 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 52 / 61

References Glaz, J., Naus, J., Wallenstein, S.: Scan statistic. Springer (2001). Glaz, J., Pozdnyakov, V., Wallenstein, S.: Scan statistic: Methods and Applications. Birkhauser (2009). Amarioarei, A.: Approximation for the distribution of extremes of one dependent stationary sequences of random variables, arxiv:1211.5456v1 (submitted) Amarioarei, A., Preda, C.: Approximation for the distribution of three dimensional discrete scan statistic, Methodol Comput Appl Probab (2013) DOI:10.1007/s11009-013-9382-3. Amarioarei, A., Preda, C.: Approximation for two dimensional discrete scan statistics in some block-factor type dependent models, arxiv:1401.2822 (submitted) Boutsikas, M.V., Koutras, M.: Reliability approximations for Markov chain imbeddable systems. Methodol Comput Appl Probab 2 (2000), 393412. A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 53 / 61

References Boutsikas, M. and Koutras, M. Bounds for the distribution of two dimensional binary scan statistics, Probability in the Engineering and Information Sciences, 17, 509525, 2003. Chen, J. and Glaz, J. Two-dimensional discrete scan statistics, Statistics and Probability Letters 31, 5968, 1996. Glaz, J., Guerriero, M., Sen, R.: Approximations for three dimensional scan statistic. Methodol Comput Appl Probab 12 (2010), 731747. Haiman, G.: First passage time for some stationary sequence. Stoch Proc Appl 80 (1999), 231248. Haiman, G.: Estimating the distribution of scan statistics with high precision. Extremes 3 (2000), 349361. Haiman, G., Preda, C.: A new method for estimating the distribution of scan statistics for a two-dimensional Poisson process. Methodol Comput Appl Probab 4 (2002), 393407. A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 54 / 61

References Haiman, G., Preda, C.: Estimation for the distribution of two-dimensional scan statistics. Methodol Comput Appl Probab 8 (2006), 373381. Haiman, G.: Estimating the distribution of one-dimensional discrete scan statistics viewed as extremes of 1-dependent stationary sequences. J. Stat Plan Infer 137 (2007), 821828. Kuai, H., Alajaji, F., Takahara, G.: A lower bound on the probability of a nite union of events. Discrete Mathematics 215 (2000), 147158. A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 55 / 61

Block-Factor Type Model Introducing the Model Let 1 c s Ts, s {1, 2} integers ( ) Xij i.i.d. r.v.'s 1 i T 1 1 j T 2 conguration matrix in (i, j) C (i,j) = ( C (i,j) (k, l) ) 1 k c 2 1 l c 1 C (i,j) (k, l) = Xi+l 1,j+c2 k transformation Π : M c2,c 1 (R) R X i,j+c 2 1.... X i,j... X i,j Π Xi +c 1 1,j+c 2 1. X i +c 1 1,j Dene the block-factor model, T 1 = T1 c 1 + 1, T 2 = T2 c 2 + 1 X i,j = Π ( C (i,j) ), 1 i T 1 1 j T 2 Return A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 56 / 61

Block-Factor Type Model Model: case c 1 = c 2 = 3 To simplify the presentation we consider c 1 = c 2 = 3 X i,j+2 X i+1,j+2 X i+2,j+2 X i,j+1 X i+1,j+1 X i+2,j+1 T 2 X i,j X i+1,j X i+2,j X i,j T 2. j. 1 Π. j. 1 1 i T 1 1 i T 1 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 57 / 61

Block-Factor Type Model Model: case c 1 = c 2 = 3 To simplify the presentation we consider c 1 = c 2 = 3 X i,j+2 X i+1,j+2 X i+2,j+2 X i,j+1 X i+1,j+1 X i+2,j+1 T 2 X i,j X i+1,j X i+2,j X i,j T 2. j. 1 Π. j. 1 1 i T 1 1 i T 1 A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 57 / 61

Block-Factor Type Model Example: A game of minesweeper Model: X i,j B(p) (presence, absence of a mine) number of neighboring mines T (C (i,j) ) = (s,t) {0,1,2} 2 (s,t) (1,1) X i,j = Π ( C (i,j) ) X i+s,j+t Return A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 58 / 61

Block-Factor Type Model Computation of Q(m 1 + 1, m 2 ) and Q(m 1 + 1, m 2 + 1) Consider X ij B(n, p) and the notation Y j 1,j2 i1,i2 = j1 i1=1 i2=1 j2 X ij, P ( S m 1,m 2 (m 1 + 1, m2) k ) k (m 1 1)m 2 n = P 2 ( Y 1,m ) ( 2 1,1 k y P Y m 1,m ) 2 2,1 = y y=0 P ( S m 1,m 2 (m 1 + 1, m2 + 1) k ) = k (m 1 1)(m 2 1)n y 1 =0 [k y 1 y 2 y 3 ] (m 1 1)n y 4 =0 ( P Y 1,m 2 +1 ) 1,m 2 +1 a 2 ( P Y m 1,m ) 2 2,2 = y1 P ( P Y m 1,1 ) 2,1 = y4 P (k y 1 ) (m 2 1)n y 2 =0 (k y 1 ) (m 2 1)n y 3 =0 [k y 1 y 2 y 3 ] (m 1 1)n ( P Y m 1 +1,1 ( P Y1,1 1,1 ) a 1 y 5 =0 ) ( m 1 +1,1 a 3 P Y m 1 +1,m 2 +1 ) m 1 +1,m 2 +1 a 4 ) ( P Y m 1 +1,m ) m 2 1 +1,2 = y3 ) ( Y 1,m 2 1,2 = y2 ( Y m 1,m 2 +1 2,m 2 +1 = y5 a1 = k y1 y2 y4, a2 = k y1 y2 y5, a3 = k y1 y3 y4, a4 = k y1 y3 y5, Return A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 59 / 61

Block-Factor Type Model Karwe Naus recursive methods Dene We have the recurrence relations b k 2(m) (y) = P (Sm(2m) k, Y m+1(m) = y) f (y) = P(X1 = y) Q2m k = P (Sm(2m) k) k b k 2(1) (y) = f (j) f (y) j=0 y b k 2(m) (y) = η=0 ν=0 k y+η k Q2m k = b k 2(m) (y) y=0 k Q2m 1 k = f (x)q k x 2(m 1) x=0 b k ν (y η)f (ν)f (η) 2(m 1) Return A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 60 / 61

Block-Factor Type Model Selected Values for K(α) and Γ(α) α K(α) Γ(α) 0.1 38.63 480.69 0.05 21.28 180.53 0.025 17.56 145.20 0.01 15.92 131.43 Table 7 : Selected values for K(α) and Γ(α) Return A. Am rioarei (Lab. P. Painlevé) Scan Statistics Lille 2014 61 / 61