Analysis of Boyer and Moore s MJRTY algorithm

Similar documents
Easter bracelets for years

Methylation-associated PHOX2B gene silencing is a rare event in human neuroblastoma.

Smart Bolometer: Toward Monolithic Bolometer with Smart Functions

From Unstructured 3D Point Clouds to Structured Knowledge - A Semantics Approach

On size, radius and minimum degree

A new simple recursive algorithm for finding prime numbers using Rosser s theorem

b-chromatic number of cacti

Case report on the article Water nanoelectrolysis: A simple model, Journal of Applied Physics (2017) 122,

Evolution of the cooperation and consequences of a decrease in plant diversity on the root symbiont diversity

Dispersion relation results for VCS at JLab

Completeness of the Tree System for Propositional Classical Logic

Passerelle entre les arts : la sculpture sonore

Thomas Lugand. To cite this version: HAL Id: tel

On infinite permutations

Vibro-acoustic simulation of a car window

Can we reduce health inequalities? An analysis of the English strategy ( )

Full-order observers for linear systems with unknown inputs

L institution sportive : rêve et illusion

On Symmetric Norm Inequalities And Hermitian Block-Matrices

3D-CE5.h: Merge candidate list for disparity compensated prediction

Soundness of the System of Semantic Trees for Classical Logic based on Fitting and Smullyan

A Simple Proof of P versus NP

Two-step centered spatio-temporal auto-logistic regression model

On the longest path in a recursively partitionable graph

On Symmetric Norm Inequalities And Hermitian Block-Matrices

Multiple sensor fault detection in heat exchanger system

Solution to Sylvester equation associated to linear descriptor systems

Parallel Repetition of entangled games on the uniform distribution

Negative results on acyclic improper colorings

The Windy Postman Problem on Series-Parallel Graphs

On Newton-Raphson iteration for multiplicative inverses modulo prime powers

Exact Comparison of Quadratic Irrationals

A note on the computation of the fraction of smallest denominator in between two irreducible fractions

Unbiased minimum variance estimation for systems with unknown exogenous inputs

Hook lengths and shifted parts of partitions

A remark on a theorem of A. E. Ingham.

A simple kinetic equation of swarm formation: blow up and global existence

A new approach of the concept of prime number

Posterior Covariance vs. Analysis Error Covariance in Data Assimilation

Numerical Modeling of Eddy Current Nondestructive Evaluation of Ferromagnetic Tubes via an Integral. Equation Approach

AC Transport Losses Calculation in a Bi-2223 Current Lead Using Thermal Coupling With an Analytical Formula

The FLRW cosmological model revisited: relation of the local time with th e local curvature and consequences on the Heisenberg uncertainty principle

Factorisation of RSA-704 with CADO-NFS

Reduced Models (and control) of in-situ decontamination of large water resources

Impedance Transmission Conditions for the Electric Potential across a Highly Conductive Casing

The Accelerated Euclidean Algorithm

approximation results for the Traveling Salesman and related Problems

Comment on: Sadi Carnot on Carnot s theorem.

Impairment of Shooting Performance by Background Complexity and Motion

Best linear unbiased prediction when error vector is correlated with other random vectors in the model

Territorial Intelligence and Innovation for the Socio-Ecological Transition

Impulse response measurement of ultrasonic transducers

Learning an Adaptive Dictionary Structure for Efficient Image Sparse Coding

FORMAL TREATMENT OF RADIATION FIELD FLUCTUATIONS IN VACUUM

On the Griesmer bound for nonlinear codes

Axiom of infinity and construction of N

On path partitions of the divisor graph

Sound intensity as a function of sound insulation partition

Solubility prediction of weak electrolyte mixtures

A non-commutative algorithm for multiplying (7 7) matrices using 250 multiplications

Stochastic invariances and Lamperti transformations for Stochastic Processes

Determination of absorption characteristic of materials on basis of sound intensity measurement

New estimates for the div-curl-grad operators and elliptic problems with L1-data in the half-space

RHEOLOGICAL INTERPRETATION OF RAYLEIGH DAMPING

There are infinitely many twin primes 30n+11 and 30n+13, 30n+17 and 30n+19, 30n+29 and 30n+31

Exogenous input estimation in Electronic Power Steering (EPS) systems

The Zenith Passage of the Sun in the Plan of Brasilia

On one class of permutation polynomials over finite fields of characteristic two *

Spatial representativeness of an air quality monitoring station. Application to NO2 in urban areas

Water Vapour Effects in Mass Measurement

A note on the acyclic 3-choosability of some planar graphs

Comments on the method of harmonic balance

Basic concepts and models in continuum damage mechanics

On Politics and Argumentation

Differential approximation results for the Steiner tree problem

LAWS OF CRYSTAL-FIELD DISORDERNESS OF Ln3+ IONS IN INSULATING LASER CRYSTALS

A Study of the Regular Pentagon with a Classic Geometric Approach

The Learner s Dictionary and the Sciences:

STATISTICAL ENERGY ANALYSIS: CORRELATION BETWEEN DIFFUSE FIELD AND ENERGY EQUIPARTITION

Diurnal variation of tropospheric temperature at a tropical station

On Poincare-Wirtinger inequalities in spaces of functions of bounded variation

Quantum efficiency and metastable lifetime measurements in ruby ( Cr 3+ : Al2O3) via lock-in rate-window photothermal radiometry

Towards an active anechoic room

All Associated Stirling Numbers are Arithmetical Triangles

Trench IGBT failure mechanisms evolution with temperature and gate resistance under various short-circuit conditions

Theoretical calculation of the power of wind turbine or tidal turbine

The beam-gas method for luminosity measurement at LHCb

Climbing discrepancy search for flowshop and jobshop scheduling with time-lags

Numerical modification of atmospheric models to include the feedback of oceanic currents on air-sea fluxes in ocean-atmosphere coupled models

Question order experimental constraints on quantum-like models of judgement

Cutwidth and degeneracy of graphs

Prompt Photon Production in p-a Collisions at LHC and the Extraction of Gluon Shadowing

Fast Computation of Moore-Penrose Inverse Matrices

Using multitable techniques for assessing Phytoplankton Structure and Succession in the Reservoir Marne (Seine Catchment Area, France)

Solving the neutron slowing down equation

On The Exact Solution of Newell-Whitehead-Segel Equation Using the Homotopy Perturbation Method

Unfolding the Skorohod reflection of a semimartingale

Some diophantine problems concerning equal sums of integers and their cubes

Entropies and fractal dimensions

RENORMALISATION ON THE PENROSE LATTICE

Transcription:

Analysis of Boyer and Moore s MJRTY algorithm Laurent Alonso, Edward M. Reingold To cite this version: Laurent Alonso, Edward M. Reingold. Analysis of Boyer and Moore s MJRTY algorithm. Information Processing Letters, Elsevier, 2013, 113, pp.495-497. <10.1016/.ipl.2013.04.005>. <hal-00926106> HAL Id: hal-00926106 https://hal.inria.fr/hal-00926106 Submitted on 9 Jan 2014 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Analysis of Boyer and Moore s MJRTY Algorithm Laurent Alonso Edward M. Reingold April 11, 2013 Abstract: Given a set of n elements each of which is either red or blue, Boyer and Moore s MJRTY algorithm uses pairwise equal/not equal color comparisons to determine the maority color. We analyze the average behavior of their algorithm, proving that if all 2 n possible inputs are equally liely, the average number of color comparisons used is n 2n/π + O1 with variance π 2n/π 2n/π +O1. Key words: Algorithm analysis, maority problem Given a set {x 1,x 2,...,x n }, each element of which is colored either red or blue, we must determine an element of the maority color by maing equal/not equal color comparisons x u : x v ; when n is even, we must report that there is no maority if there are equal numbers of each color. How many such questions are necessary and sufficient? It is easy to obtain an algorithm using at most n νn questions, where νn denotes the number of 1-bits in the binary representation of n; furthermore, n νn is a lower bound on the number of questions needed see [?] and [?]. In [?], the average case was investigated: Assuming all 2 n distinct colorings of the n elements are equally probable, 2n 3 8n 9π +Ologn comparisons are necessary and sufficient in the average case to determine the maority. In this note we analyze the average performance of an inferior, but historically important maority algorithm, that of Boyer and Moore [?], first described in 1980 as an example of the logic supported by the Boyer-Moore Theorem Prover 1971; see [?]. The algorithm, Algorithm 1, is subtle in that it wors correctly even when there are arbitarily many colors as long as the set is nown to have an element that occurs more than half the time in this multi-color case, if the set is not nown to have a maority element, a second pass over all n INRIA-Lorraine and LORIA, Université Henri Poincaré-Nancy I, BP 239, 54506 Vandoeuvre-lès-Nancy, France. Email: Laurent.Alonso@loria.fr Department of Computer Science, Illinois Institute of Technology, 10 West 31st Street, Chicago, Illinois 60616-2987, USA. Email: reingold@iit.edu 1

elements is necessary. It is Algorithm 1, called MJRTY in [?], that led to an intensive study of the matter; see [?, sec. 5.8] and [?] for historical details. Algorithm 1 Boyer and Moore s MJRTY algorithm [?]. 1: c 0 2: for i 1 to n do 3: // if c = 0 there are equal numbers of red/blue items among 4: // x 1,...,x i 1 ; otherwise there are c more of the color of x 5: if c = 0 then 6: i 7: c 1 8: else if x i = x then 9: c c+1 10: else 11: c c 1 12: end if 13: end for 14: if c = 0 then 15: No maority 16: else 17: Maority is color of x 18: end if In the two-color case, the number of color comparisons in line 8 of MJRTY is equal to n less the number of times the algorithm finds c = 0 in line 5, so we focus on counting that number of times. But c = 0 at line 5 can happen only when i is odd, which happens at least once when i = 1 and at most n/2 times. It happens when i = 2+1 if x 1,...,x 2 contains copies of each of the two colors, which happens with probability 2 /2 2 if all 2 n two-colorings of x 1,...,x n are equally probable. Let m = n/2 ; the average number of times that c = 0 in line 5 is thus =0 2 /2 2, 1 partial sums of the central binomial coefficients, divided by corresponding powers of 4. The generating function for the central binomial coefficients is Cz = 1/ 1 4z, so the generating function for the partial sums in 1 is Cz/4/1 z = 2C z/4; hence =0 2 /2 2 = m 2m m /2 2, 2 where m 2m m are called Apéry numbers [?, A005430]. By Stirling s formula, the righthand side of 2 is 2n/π + O1 for m = n/2. It follows that in the two-color case Boyer and Moore s MJRTY algorithm uses at least n/2 and at 2

most n 1 color comparisons, and an average of n 2n/π O1 3 color comparisons. We now compute the variance. Let Sn be the set of all possible 2 n twocolor input sequences to MJRTY. For I Sn and 1 i n, let { 1 if c = 0 in line 5 at iteration i for input I, c i I = 0 otherwise, and let S i n Sn be the set of input sequences I such that c = 0 at iteration i for input I; and hence c i I = 1 for all I S i n. With this notation, our derivation of the average 3 is n n c i I PrI = n n c i I PrI The variance is n = 2 n c i I PrI = n 2n/π O1. 4 n 2 c i I PrI n n c i I PrI n 2 c i I PrI, 5 because Varx y = Varx + Vary 2CoVarx,y and we have x = n which is constant. Thus we need the value of n 2 c i I PrI = 2 n I n n c i c I. which, because PrI = 1/ Sn = 2 n, and by distributivity, = 2 n n n c i I c I, so changing the order of summation, = 2 n n = 2 n n I S in n c i I c I, n c I 2 3

because c i I is 1 if I S i n and 0 otherwise, n = 2 n S i n, 6 where S i n is the total number of times that c = 0 at line 5 over all inputs I S i n. Note that for m 1, in the input sequence x 1,...,x 2m, the two possible choices for the color of x 2m do not affect the number of times that c = 0 at line 5 because x 1,...,x 2 must contain a maority color, hence c 0 in line 5 for i = 2m. Thus we need only compute S i n for odd n because S i 2m = 2 S i 2. We can view an arbitrary odd-length input sequence x 1,...,x n S 2+1 n, n = 2, as being composed of two contiguous subsequences: a even-length front part x 1,...,x 2, in which there is no maority color, and an odd-length rear part x 2+1,...,x n. These subsequences are independent, so we can count the number times c = 0 in each separately. First, consider the possible front parts, sequences x 1,...,x 2 with no maority color. The total number of times that c = 0 at line 5 over all such sequences is 1 f = =0 2 2 = =0 2 2 2, 7 because if c = 0 happens at line 5 for i = 2 + 1 with <, neither of the subsequences: x 1,...,x 2 and x 2+1,...,x 2 which is not empty contains a maority color. Observing that the sum in 7 is a convolution of the central binomial numbers with themselves, the generating function for this sum is Cz 2 = 1/1 4z and hence 2 f = 4. Now consider the rear part, an arbitrary odd-length input sequence x 1,..., x 2+1. The total number of times that c = 0 in line 5 over all such sequences is r = =0 2 2 2 2 +1 = 2 2+1 /2 2, =0 because c = 0 at line 5 for i = 2 +1 if x 1,...,x 2 contains elements of each color and the remaining 2 +1 elements are colored arbitrarily, evaluated as in 2. 2 +2 = +1, +1 4

Finally, because each possible front word occurs 2 2 2 times as first part of a word in S 2+1 2, and each rear word occurs 2 times, we obtain 2 S 2+1 2 = 2 2 2 f + r. Therefore, S 2+1 2 = 2 2 2 2 2 and so we have 2 2 2m+1 /2 2 + S i 2 = 2 2m+1 = =0 1 =0 =0 +2 2m+1 n 1 =0 2 m S 2+1 2 2 /2 2 2 m 2m m 2m 2 m,. We can simplify the second sum as in 2; for the third sum, we note that it is a convolution, so that CzzC 2z z = 1 4z 2 = z d 2dz 2 2 2m+1 1 = 1 4z 2 2 2 z, 2m S i 2 = 2 2m S i 2m = 2m 1 2m m 4 m. 8 We have m = n/2, so putting together 2, 5, 6, and 8, Stirling s formula gives the variance of the number of color comparisons in line 8 of MJRTY, 2m 1 2m /4 m m m 2m m /2 2 2 = π 2 π n 2n/π +O1. 5