An Algorithm of a Longest of Runs Test for Very Long Sequences of Bernoulli Trials


Alexander I. KOZYNCHENKO
Faculty of Science, Technology, and Media, Mid Sweden University, SE-85170, Sundsvall, Sweden
alexander_kozynchenko@yahoo.se

Abstract

A new algorithm for computing the statistics of a longest of runs test is proposed for the case of equal probability Bernoulli trials processes. The algorithm is founded on the analysis of the event tree diagram, which has shown the role of Fibonacci numbers of higher orders in counting the number of outcomes of interest in the sample space. The proof by induction is given. Compared to the classical combinatorial formulas, the proposed algorithm provides error-free exact probabilities and makes possible the processing of very long binomial data sets, up to $n \approx 10^3$, on contemporary computers.

Keywords: runs tests, longest run, Bernoulli trials, Fibonacci numbers, computing algorithms

Mathematics Subject Classifications: 62G10; 62-04; 11B39

1. Introduction

Distribution-free tests for randomness of sample data play an important role in much statistical inference and are relevant to many applications in sociology, biology, psychology, engineering etc., including such particular problems as regression and curve fitting. There is a great body of literature on the subject, of which the books by Siegel and Castellan, [1], and Sprent, [2], are worthy of mention. This paper is concerned with the computational aspects of an important distribution-free runs test, namely, the longest of runs test of randomness applied to long patterns of binomial trials. Amongst the publications on runs tests, it is worth mentioning the paper by Mood, [3], and the monograph by Bradley, [4], who gave a detailed treatment of runs tests, as well as a survey of the work done in the 1940s-1960s by Olmstead, Mosteller, Grant, Burr and Cane, et al.

The existing runs tests are based on either numbers of runs or lengths of runs. The total-number-of-runs test provides both exact combinatorial formulas and asymptotic ones under the assumption of normally distributed statistics for large samples (see, e.g., [4]). However, for the longest of runs test there is no such asymptotic theory, and we have to use the exact combinatorial formulas. So, the question arises as to whether those formulas are applicable for computing the statistics on contemporary computers in the case of large samples, or whether it is necessary to derive a more adequate theory applicable to processing long samples.

2. Analysis of the classical combinatorial formulas for the longest of runs test

The conventional approach to deriving a general formula for the probability $P_{?,s}$ of obtaining at least one run of length $s$ or greater among either the 0s or the 1s was described in [3]. It is to be noted that "either" includes the possibility of both. The approach is based on the formula for the probability of a sum of compatible random events:

$$P_{?,s} = P_{1,s} + P_{2,s} - P_{1\,\mathrm{and}\,2,\,s}, \tag{1}$$

where the subscripts 1, 2, and ? refer to runs of 1s, of 0s, and of an unspecified type of element containing the run, respectively; $P_{1,s}$ is the probability of obtaining at least one run of length $s$ or greater among the 1s (whether or not one also occurs among the 0s); $P_{2,s}$ is the probability of obtaining at least one such run among the 0s (whether or not one also occurs among the 1s); $P_{1\,\mathrm{and}\,2,\,s}$ is the probability of obtaining at least one such run among both the 1s and the 0s.

Suppose that a sequence of $n$ trials contains $n_1$ 1s and $n_2$ 0s, $n_1 + n_2 = n$. In this case, the probabilities can be computed by the following combinatorial formulas:

$$P_{1,s} = \frac{1}{\binom{n}{n_1}} \sum_{i=1}^{\lfloor n_1/s \rfloor} (-1)^{i+1} \binom{n_2+1}{i} \binom{n-is}{n_2}, \tag{2}$$

$$P_{2,s} = \frac{1}{\binom{n}{n_1}} \sum_{i=1}^{\lfloor n_2/s \rfloor} (-1)^{i+1} \binom{n_1+1}{i} \binom{n-is}{n_1}, \tag{3}$$

and

$$P_{1\,\mathrm{and}\,2,\,s} = \frac{1}{\binom{n}{n_1}} \sum_{r=1}^{n_1-s+1} \left[ \sum_{i=1}^{\lfloor (n_1-r)/(s-1) \rfloor} (-1)^{i+1} \binom{r}{i} \binom{n_1-1-i(s-1)}{r-1} \right] \left[ \sum_{i=1}^{\lfloor (n_2-r+1)/(s-1) \rfloor} (-1)^{i+1} \binom{r-1}{i} \binom{n_2-1-i(s-1)}{r-2} + 2 \sum_{i=1}^{\lfloor (n_2-r)/(s-1) \rfloor} (-1)^{i+1} \binom{r}{i} \binom{n_2-1-i(s-1)}{r-1} + \sum_{i=1}^{\lfloor (n_2-r-1)/(s-1) \rfloor} (-1)^{i+1} \binom{r+1}{i} \binom{n_2-1-i(s-1)}{r} \right]. \tag{4}$$

These run formulas take $n_1$ and $n_2$ as given. But in the case of Bernoulli trials, when 1 and 0 are mutually exclusive outcomes with probabilities $p$ and $q$ respectively of occurrence on a single trial, it would be convenient to eliminate the parameters $n_1$ and $n_2$. The extension where $n_1$ and $n_2$ are not fixed, so that the probability is completely irrespective of $n_1$ and $n_2$ and depends only on $n$ and $p$, is described in [3]. The compound probability is obtained by taking the product of the binomial probability $\binom{n}{n_1} p^{n_1} q^{n_2}$ and the probability computed by (1)-(4). The sum of that product over all possible values of $n_1$ gives the sought-for probability:

$$P_{?,s}(n, p) = \sum_{n_1=0}^{n} \binom{n}{n_1} p^{n_1} q^{\,n-n_1} P_{?,s}(n_1, n_2). \tag{5}$$

Evidently, it is worthwhile developing a computing algorithm in order to check the correctness and to evaluate the performance of this formula. The author has created a C++ program that computes the probability $P_{?,s}(n, p)$ using the formulas (1)-(5). The code is placed in Appendix A. A number of computing tests have been carried out, and the analysis of the results has revealed two drawbacks of the formula (5).

First of all, it gives a systematic error that manifests itself in the expression (4). Let us consider, for instance, the case of $n = 8$, $s = 4$, $p = 0.5$. The computations by the formula (5) give the probability $P_{?,s}(n, p) = 0.375$ and the number of outcomes of interest $N = 96$ (the number of all possible outcomes equals $2^8 = 256$). However, the correct values are 0.3671875 and 94, respectively. The reason is the formula (4), which gives the erroneous zero value for the probability $P_{1\,\mathrm{and}\,2,\,s}$, whereas the correct value is $2^{-7}$, which corresponds to the two outcomes of interest available: 00001111 and 11110000.

In order to check the classical formulas further, the author developed a brute-force algorithm based on the breadth-first search technique. It gives the exact solutions, but has the exponential computation time $t = O(2^n)$ and therefore cannot be applied to samples of length $n > [\ldots]$. The comparison of the results obtained by this algorithm with those of the classical formulas discloses that the classical formulas give a regular positive error when $s \le 0.5n$. This evidently confirms the fallacy of the formula (4), since it is the paths with two or more runs of length $s$ to which the formula (4) is applied.

Secondly, the computing tests have revealed an upper limit on the length of binomial sequences that can be processed on contemporary PCs equipped, e.g., with the AMD Athlon 64 X2 Dual Core processor 4600+. This limit amounts to $n = [\ldots]$ for $s < [\ldots]$ and $p = 0.5$. Such a restriction does not actually allow processing very long binomial sequences where the adequate power of runs tests could be attained.
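The brute-force check mentioned above can be illustrated with a short sketch. This is not the author's breadth-first search program, which the paper does not reproduce; it is a minimal stand-in that assumes a plain enumeration of all $2^n$ binary sequences encoded as bit masks. The function names `longestRun` and `countBruteForce` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Length of the longest run of identical symbols among the n low bits of seq.
// Bit t of seq is read as the outcome (0 or 1) of trial t+1.
int longestRun(std::uint64_t seq, int n) {
    assert(n >= 1 && n <= 62);
    int best = 1;  // a single trial is a run of length 1
    int cur = 1;
    for (int t = 1; t < n; ++t) {
        bool same = ((seq >> t) & 1u) == ((seq >> (t - 1)) & 1u);
        cur = same ? cur + 1 : 1;
        if (cur > best) best = cur;
    }
    return best;
}

// Number of binary sequences of length n containing at least one run of
// length s or greater among either the 0s or the 1s.
// Cost is O(n * 2^n), so this is usable only for short sequences.
std::uint64_t countBruteForce(int n, int s) {
    assert(n >= 1 && n <= 30);
    std::uint64_t count = 0;
    const std::uint64_t total = std::uint64_t(1) << n;
    for (std::uint64_t seq = 0; seq < total; ++seq) {
        if (longestRun(seq, n) >= s) ++count;
    }
    return count;
}
```

For $n = 8$ and $s = 4$ the call `countBruteForce(8, 4)` yields 94, that is, a probability of $94/256 = 0.3671875$ for $p = 0.5$, in agreement with the correct values quoted in the text.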

3. Description of the proposed algorithm using the Fibonacci numbers, its proof, and performance

The longest of runs test of a Bernoulli trials process can be analysed using a binary tree diagram as the standard technique of representing the sample space and counting probabilities, [5]. The paths with outcomes of interest contain at least one run of length $s$ or greater. They are all indicated in Fig. 1, where solid and dotted lines mean success (or, say, 1) and failure (or 0) of a chance experiment, correspondingly.

[Fig. 1. The half of an event tree diagram showing the number of the runs of length $s = 4$ in the sequence of Bernoulli trials of length $n = 8$. Horizontal axis: number of a trial, $j = 1, \ldots, 8$. Node labels: A, B, C, D, E, F. Bottom row: Fibonacci numbers of order $s - 1$.]

Some paths containing a run of length $s$ at their ends are shown completely from the root to a leaf (e.g., ABDF), whereas the others are pooled into clusters of paths having both an initial common explicit part that ends with a run of length $s$ and a subsequent arbitrary sub-tree (see, e.g., clusters ABC or ABDE).
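The pooling of paths into clusters can be made concrete with a small counting sketch. It is an illustration only, not part of the paper's program: it assumes that the half-tree is the set of paths whose first trial is 1, and it counts, for every trial number $j$, the distinct explicit prefixes of length $j$ whose first run of length $s$ is completed exactly at trial $j$. The name `firstRunCounts` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// counts[j] = number of prefixes of length j (first trial fixed to 1) in which
// a run of length s is completed for the first time exactly at trial j.
// Each such prefix is the explicit part of one cluster; the cluster itself
// contains 2^(n-j) complete paths.
std::vector<std::uint64_t> firstRunCounts(int n, int s) {
    assert(s >= 2 && s <= n && n <= 24);
    std::vector<std::uint64_t> counts(n + 1, 0);
    for (int j = s; j <= n; ++j) {
        const std::uint64_t variants = std::uint64_t(1) << (j - 1);
        for (std::uint64_t rest = 0; rest < variants; ++rest) {
            std::uint64_t prefix = (rest << 1) | 1u;  // bit 0 is trial 1, fixed to 1
            int cur = 1;       // length of the current run
            int firstHit = 0;  // trial at which a run first reaches length s
            for (int t = 1; t < j && firstHit == 0; ++t) {
                bool same = ((prefix >> t) & 1u) == ((prefix >> (t - 1)) & 1u);
                cur = same ? cur + 1 : 1;
                if (cur >= s) firstHit = t + 1;  // trials are numbered from 1
            }
            if (firstHit == j) ++counts[j];
        }
    }
    return counts;
}
```

For $n = 8$ and $s = 4$ the counts at trials $j = 4, \ldots, 8$ come out as 1, 1, 2, 4, 7. Weighting each count by the $2^{n-j}$ paths of its cluster and doubling for the other half of the tree gives $2(16 + 8 + 8 + 8 + 7) = 94$ outcomes of interest.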

We will consider the particular and most important case of a Bernoulli trials process with equal probabilities of successes and failures on a chance experiment, $p = q = 0.5$. Here, the probabilities of all outcomes are the same, being equal to $0.5^n$ for $n$ Bernoulli trials. Hence, in order to compute the probability $P_{?,s}$ of obtaining at least one run of length $s$ or greater among either the 0s or the 1s, we need to calculate the number of outcomes of interest. In the case $n = 8$, $s = 4$ depicted in Fig. 1, this number can be evaluated as follows:

$$N_{?,4}(8, p = 0.5) = 2\left(1 \cdot 2^4 + 1 \cdot 2^3 + 2 \cdot 2^2 + 4 \cdot 2^1 + 7 \cdot 2^0\right) = 94, \tag{6}$$

where the first term, $1 \cdot 2^4$, corresponds to the cluster ABC, the second one, $1 \cdot 2^3$, corresponds to ABDE, and so on until the term $7 \cdot 2^0$ that relates to the individual paths not including sub-trees, such as ABDF. As we can see, the factors 1, 1, 2, 4, 7 form a part without zeros of the sequence of Fibonacci numbers of the 3rd order. Having analysed the tree diagrams for other $(n, s)$ cases, the general formula for arbitrary $n$, $s < n$ is derived:

$$N_{?,s}(n, p = 0.5) = 2\left(F_{1,s-1}\, 2^{n-s} + F_{2,s-1}\, 2^{n-s-1} + \ldots + F_{i+1,s-1}\, 2^{n-s-i} + \ldots + F_{n-s+1,s-1}\, 2^{0}\right), \tag{7}$$

where $F_{i+1,s-1}$ is the $(i+1)$th Fibonacci number of order $s-1$. So, the formula for the sought-for probability is derived from (7) by dividing it by the number of all possible outcomes $2^n$:

$$P_{?,s}(n, p = 0.5) = \frac{N_{?,s}(n, p = 0.5)}{2^{n}} = 2 \sum_{i=0}^{n-s} \frac{F_{i+1,s-1}}{2^{\,s+i}}. \tag{8}$$

The formula (7) can be proved by mathematical induction:

1. The basis: the formula (7) holds when $n = s$. Indeed, in this case (7) gives us the correct number, two, of paths of length $s$ that contain a run of length $s$:

$$N_{?,s}(s, p = 0.5) = 2 F_{1,s-1} = 2.$$

2. The inductive step: suppose that the formula (7) holds for some $n$. We need to prove that the formula (7) also holds when $n + 1$ is substituted for $n$. Let us write down the formula (7) for $n + 1$:

$$N_{?,s}(n+1, p = 0.5) = 2\, N_{?,s}(n, p = 0.5) + 2 F_{n-s+2,s-1}.$$

The first summand of this expression gives the number of those outcomes of interest for $n + 1$ binary trials which are generated from the ends of all paths existing at the $n$th level of a tree diagram. These outcomes are represented by clusters of paths at the $(n+1)$st level (see Fig. 1). The second summand gives the number of the single outcomes of interest appearing at the $(n+1)$st level. These new outcomes belong to the paths having only one run of length $s$, which is situated at the end of the path. These terminating runs originate at the paths having the same feature (only one run of length $s$, at the end of the path) at levels $n, n-1, \ldots, n-s+2$. The total number of these generating paths equals the sum of the numbers at levels $n, n-1, \ldots, n-s+2$. As we can see from the formula (7) for the $n$th level written using the Horner scheme,

$$N_{?,s}(n, p = 0.5) = 2\left(F_{n-s+1,s-1} + 2\left(F_{n-s,s-1} + 2\left(F_{n-s-1,s-1} + \ldots + 2\left(F_{3,s-1} + 2\left(F_{2,s-1} + 2 F_{1,s-1}\right)\right)\ldots\right)\right)\right),$$

the abovementioned numbers are equal to the Fibonacci numbers $F_{n-2s+2+j,\,s-1}$, $j = 1, \ldots, s-1$. This means that the factor $F_{n-s+2,s-1}$ of the second summand equals the sum of the Fibonacci numbers $\sum_{j=1}^{s-1} F_{n-2s+2+j,\,s-1}$ and, therefore, is indeed a Fibonacci number of order $s-1$ by definition. That is, the inductive step is proven.

The analysis of the computational performance of the formula (8) has been carried out using the C++ code given in Appendix B. First of all, the program calculates the related Fibonacci numbers, which are placed into the array fib. In so doing, the computing algorithm takes into account the inner structure of a Fibonacci numbers sequence, which contains an initial sub-sequence of numbers being powers of 2. Several Fibonacci numbers sequences are listed in Table A; in the original table the initial sub-sequences are marked by a grey background colour.

Table A. Fibonacci numbers $F_{i+1,s-1}$ of order $s-1$

    s    i = 0   1   2   3   4    5    6    7    8    9   10
    3        1   1   2   3   5    8   13   21   34   55   89
    4        1   1   2   4   7   13   24   44   81  149  274
    5        1   1   2   4   8   15   29   56  108  208  401
    6        1   1   2   4   8   16   31   61  120  236  464
    7        1   1   2   4   8   16   32   63  125  248  492
    8        1   1   2   4   8   16   32   64  127  253  504
    9        1   1   2   4   8   16   32   64  128  255  509
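The counting formula (7) and the rows of Table A can be checked with exact integer arithmetic for moderate $n$. The following is a minimal sketch, not the Appendix B program. It assumes the indexing in which the first two Fibonacci numbers of any order $k \ge 2$ equal 1 and every later number is the sum of up to $k$ preceding ones; the names `fibHigherOrder` and `countByFormula` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// First `count` Fibonacci numbers of the given order: f[0] = 1 and every
// later term is the sum of the preceding `order` terms (fewer near the start).
// With order 3 this gives 1, 1, 2, 4, 7, 13, ...
std::vector<std::uint64_t> fibHigherOrder(int order, int count) {
    assert(order >= 1 && count >= 1);
    std::vector<std::uint64_t> f(count, 0);
    f[0] = 1;
    for (int i = 1; i < count; ++i) {
        for (int j = 1; j <= order && j <= i; ++j) f[i] += f[i - j];
    }
    return f;
}

// Number of binary sequences of length n with at least one run of length s or
// greater among either symbol: 2 * sum over i of f[i] * 2^(n-s-i).
std::uint64_t countByFormula(int n, int s) {
    assert(s >= 2 && s <= n && n <= 62);  // keeps every term below 2^63
    std::vector<std::uint64_t> f = fibHigherOrder(s - 1, n - s + 1);
    std::uint64_t half = 0;  // count for one half of the event tree
    for (int i = 0; i <= n - s; ++i) half += f[i] << (n - s - i);
    return 2 * half;
}
```

For example, `countByFormula(8, 4)` evaluates $2(16 + 8 + 8 + 8 + 7) = 94$, and `fibHigherOrder(3, 11)` reproduces the row $s = 4$ of Table A. Integer arithmetic is used here only to make the check exact; it limits the sketch to $n \le 62$.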

The second part of the program computes the probability $P_{?,s}(n, p = 0.5)$ by the formula (8). The algorithm and code are able to make calculations on contemporary PCs for very long Bernoulli trials sequences, up to $n \approx 10^3$. The results obtained are depicted in Fig. 2 and can be used to test for randomness of a pattern of Bernoulli trials with $p = q = 0.5$ (null hypothesis).

[Fig. 2. Probability $P_{?,s}(n, p = 0.5)$ of obtaining at least one run of length $s$ or greater among either the 0s or the 1s. Horizontal axis: size of the sequence of Bernoulli trials.]

For example, let us consider a sequence of 0s and 1s of length $n = 600$ containing a run of length $s = 14$, and test the null hypothesis at the significance level $\alpha = 0.05$. Consulting Fig. 2, one can find that the null hypothesis should be rejected. If we increase the number of trials up to 1000, the chance probability that a sequence of Bernoulli trials with $p = q = 0.5$ would contain a run of 14 or more consecutive either 0s or 1s is about 0.06, so the null hypothesis cannot be rejected at the given significance level.

The Fibonacci numbers of higher orders are used in other Bernoulli trials related problems, such as coin tossing (see, e.g., [6]), where the probability that no runs of $k$ consecutive tails will occur in $n$ coin tosses is given by $F^{(k)}_{n+2}/2^n$, where $F^{(k)}_{l}$ is a Fibonacci $k$-step number ($k$th order).

4. Summary

In the paper, a new powerful approach to the longest of runs test is described, which can effectively replace the classical combinatorial formulas in the particular, but important, case of equal probabilities Bernoulli trials processes. This approach is based on a thorough analysis of the event tree diagram, which suggested deriving a concise formula for the probability of obtaining at least one run of length $s$ or greater among either the 0s or the 1s. The derived formula extensively uses the Fibonacci numbers of higher orders. The formula proves to be capable of processing very long dichotomous sequences, up to $n \approx 10^3$ as compared to $n = [\ldots]$ for the classical combinatorial approach. The correctness of the results obtained was checked by a breadth-first search algorithm, and complete coincidence has been shown. The side result of the paper lies in revealing a regular error inherent in the classical combinatorial algorithm in some cases.

5. Acknowledgements

The author would like to thank Prof. Wej-M Huang for his comments and suggestions that led to improvements in the paper.

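Appendix A

    //The C++ code developed for computing the statistics of the
    //classical longest of runs test:
    #include<iostream>
    #include<cmath>
    #include<iomanip>
    using namespace std;

    double Factorial(int n)
    {
        double t; t = 1;
        for(int i = 1; i <= n; ++i) t *= i;
        return t;
    }

    //binomial coefficient, equal to zero when n < r
    double C(int n, int r)
    {
        if(n < r) return 0;
        double t; t = 1;
        for(int i = n; i >= n-r+1; --i) t *= i;
        return t/Factorial(r);
    }

    double Prob(int n, int s, double p = 0.5, double q = 0.5)
    {
        double prob = 0; int i;
        for(int n1 = 0; n1 <= n; ++n1)
        {
            double prob1 = 0, prob2 = 0, prob12 = 0;
            for(i = 1; i <= n1/s; ++i)
                prob1 += pow(-1.0, i+1)*C(n-n1+1, i)*C(n-i*s, n-n1);
            for(i = 1; i <= (n-n1)/s; ++i)
                prob2 += pow(-1.0, i+1)*C(n1+1, i)*C(n-i*s, n1);
            for(int r = 1; r <= n1-s+1; ++r)
            {
                double a1 = 0, a2 = 0, a3 = 0, a4 = 0;
                for(i = 1; i <= (n1-r)/(s-1); ++i)
                    a1 += pow(-1.0, i+1)*C(r, i)*C(n1-1-i*(s-1), r-1);
                for(i = 1; i <= (n-n1-r+1)/(s-1); ++i)
                    a2 += pow(-1.0, i+1)*C(r-1, i)*C(n-n1-1-i*(s-1), r-2);
                for(i = 1; i <= (n-n1-r)/(s-1); ++i)
                    a3 += pow(-1.0, i+1)*C(r, i)*C(n-n1-1-i*(s-1), r-1);
                for(i = 1; i <= (n-n1-r-1)/(s-1); ++i)
                    a4 += pow(-1.0, i+1)*C(r+1, i)*C(n-n1-1-i*(s-1), r);
                prob12 += a1*(a2 + 2*a3 + a4);
            }
            prob += (prob1 + prob2 - prob12)*pow(p, n1)*pow(q, n-n1);
        }
        return prob;
    }

    int main()
    {
        int n = 8, s = 4;
        double prob = Prob(n, s);
        cout.setf(ios::fixed);
        cout << "n = " << n << " " << "s = " << s << endl <<"count = "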

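        <<setw(6)<<setprecision(0)<<prob*pow(2.0, n)<< endl << "prob14 = "
        << setw(14)<<setprecision(14)<<prob<<endl;
        return 0;
    }

Appendix B

    //The C++ code for the proposed longest of runs test algorithm
    //using the Fibonacci numbers:
    #include<iostream>
    #include<cmath>
    #include<iomanip>
    #include<algorithm>
    using namespace std;

    double RunsFib(int n, int s)
    {
        //the run length must satisfy 2 <= s <= n; check before allocating
        if(s < 2 || s > n) { cout << "error" << endl; return -1; }
        //fib[i] holds the (i+1)th Fibonacci number of order s-1
        double* fib = new double[n-s+1];
        int i;
        for(i = 0; i <= n-s; ++i) fib[i] = 0;
        double p = 0;
        fib[0] = 1;
        if(n-s >= 1) fib[1] = 1;
        //initial sub-sequence of powers of 2
        for(i = 2; i < s-1 && i < n-s+1; ++i) fib[i] = pow(2.0, i-1);
        //every further number is the sum of the s-1 preceding ones
        for(i = max(s-1, 2); i <= n-s; ++i)
            for(int j = 0; j < s-1; ++j) fib[i] += fib[i-j-1];
        //formula 8
        for(i = 0; i <= n-s; ++i) p += fib[i]*pow(0.5, s+i);
        p *= 2;
        cout.setf(ios::fixed);
        cout << "n = " << n << " s = " << s << endl;
        cout << " prob14 = " << setprecision(14) << p << endl;
        delete [] fib;
        return p;
    }

    int main()
    {
        int n = 8, s = 4;
        RunsFib(n, s);
        return 0;
    }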
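The Appendix B routine keeps the Fibonacci numbers themselves in doubles, so they grow roughly like $2^i$. As a hedged alternative sketch (an assumption of this note, not the author's method), the same sum can be accumulated with the scaled terms $g_i = F_{i+1,s-1}/2^i$, which never exceed 1 and satisfy $g_i = \sum_{j=1}^{\min(i,\,s-1)} g_{i-j}/2^j$. The sketch also wraps the test decision described in the paper: the null hypothesis of randomness is rejected when the probability of a longest run at least as long as the observed one falls below $\alpha$. The names `runProbability` and `rejectRandomness`, and the values $n = 600$, $n = 1000$, $s = 14$ used below, are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Probability, for p = q = 0.5, of at least one run of length s or greater
// among either symbol in n trials, computed as 2^(1-s) * sum of g[i], where
// g[i] is the (i+1)th Fibonacci number of order s-1 divided by 2^i.
double runProbability(int n, int s) {
    assert(s >= 2 && s <= n);
    std::vector<double> g(n - s + 1, 0.0);
    g[0] = 1.0;
    double sum = g[0];
    for (int i = 1; i <= n - s; ++i) {
        double w = 0.5;  // weight 2^(-j) of the j-th preceding term
        for (int j = 1; j <= s - 1 && j <= i; ++j) {
            g[i] += g[i - j] * w;
            w *= 0.5;
        }
        sum += g[i];
    }
    return sum * std::ldexp(1.0, 1 - s);  // multiply by 2^(1-s)
}

// Decision rule of the longest of runs test at significance level alpha.
bool rejectRandomness(int n, int observedLongestRun, double alpha) {
    return runProbability(n, observedLongestRun) < alpha;
}
```

With these definitions `runProbability(8, 4)` agrees with $94/256 = 0.3671875$. For a longest run of 14, `rejectRandomness(600, 14, 0.05)` is true while `rejectRandomness(1000, 14, 0.05)` is false, because the probability for 1000 trials is close to 0.06. The scaling is a design choice that trades the explicit powers-of-2 initialisation of Appendix B for terms bounded by 1.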

References

[1] Siegel, S., Castellan, N.J., Jr., 1988, Nonparametric Statistics for the Behavioral Sciences, 2nd ed. New York: McGraw-Hill.
[2] Sprent, P., 1993, Applied Nonparametric Statistical Methods, 2nd ed. London: Chapman & Hall.
[3] Mood, A.M., 1940, The Distribution Theory of Runs, Annals of Mathematical Statistics, 11, 367-392.
[4] Bradley, J.V., 1968, Distribution-Free Statistical Tests. Englewood Cliffs, New Jersey: Prentice Hall.
[5] Grinstead, C.M., Snell, J.L., 1997, Introduction to Probability, 2nd rev. ed. American Mathematical Society.
[6] http://mathworld.wolfram.com/Fibonaccin-StepNumber.html