A Note on the Distribution of the Number of Prime Factors of the Integers


Aravind Srinivasan ¹
Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742.

An earlier version of this work appeared in the Proc. Hawaii International Conference on Statistics, Mathematics, and Related Fields, 2005.
Email address: srin@cs.umd.edu (Aravind Srinivasan). URL: http://www.cs.umd.edu/~srin (Aravind Srinivasan).
¹ This research was done in parts at: (i) Cornell University, Ithaca, NY (supported in part by an IBM Graduate Fellowship), (ii) Institute for Advanced Study, Princeton, NJ (supported in part by grant 93-6-6 of the Alfred P. Sloan Foundation), (iii) DIMACS Center, Rutgers University, Piscataway, NJ (supported in part by NSF-STC91-19999 and by support from the N.J. Commission on Science and Technology), and (iv) the University of Maryland (supported in part by NSF Award CCR-0208005).
Preprint submitted to Elsevier, 17 October 2008.

Abstract

The Chernoff-Hoeffding bounds are fundamental probabilistic tools. An elementary approach is presented to obtain a Chernoff-type upper-tail bound for the number of prime factors of a random integer in $\{1, 2, \ldots, n\}$. The method illustrates tail bounds in negatively-correlated settings.

Key words: Chernoff bounds, probabilistic number theory, primes, tail bounds, dependent random variables, randomized algorithms

1 Introduction

Large-deviation bounds such as the Chernoff-Hoeffding bounds are of much use in randomized algorithms and probabilistic analysis. Hence, it is valuable to understand how such bounds can be extended to situations where the conditions of these bounds, as presented in their standard versions, do not hold: the most important such condition is independence. Here, we show one such extension, to the classical problem of the distribution of the number of prime factors of integers.

For any positive integer $N$, let $\nu(N)$ denote the number of prime factors of $N$, ignoring multiplicities. Let $\ln x$ denote the natural logarithm of $x$, as usual. It is known that the average value of $\nu(i)$, for $i \in [n] = \{1, 2, \ldots, n\}$, is

\[ \mu \doteq \ln\ln n + O(1) \pm O(\ln^{-2} n), \]

for sufficiently large $n$. See also the discussion in Alon & Spencer [1]. We are interested in seeing if there is a significant fraction of integers in $[n]$ for which $\nu(i)$ deviates largely from $\mu$. Formally, Hardy & Ramanujan [4] showed that for any function $\omega(n)$ with $\lim_{n \to \infty} \omega(n) = \infty$,

\[ \frac{\bigl|\{\, i \in [n] : \nu(i) \ge \ln\ln n + \omega(n)\sqrt{\ln\ln n} \,\}\bigr|}{n} = o(1), \tag{1} \]

where the $o(1)$ term goes to zero as $n$ increases. Their proof was fairly complicated. Turán [9] gave a very elegant and short proof of this result; his proof is as follows. Let $E[Z]$ and $Var[Z]$ denote the expected value and variance of a random variable $Z$, respectively. Define $P_n$ to be the set of primes in $[n]$. For a randomly picked $x \in [n]$, define, for every prime $p \in P_n$, $X_p$ to be 1 if $p$ divides $x$, and 0 otherwise. Clearly, $\nu(x) = \sum_{p \in P_n} X_p$. Hence,

\[ \mu = E[\nu(x)] = \sum_{p \in P_n} E[X_p] = \sum_{p \in P_n} \frac{\lfloor n/p \rfloor}{n}, \]

and thus,

\[ \mu \le \mu_n \doteq \sum_{p \in P_n} \frac{1}{p} = \ln\ln n + 0.261\ldots \pm O(\ln^{-2} n), \]

where the last equality follows from Mertens' theorem (see, for instance, Rosser & Schoenfeld [7]). By Chebyshev's inequality,

\[ \Pr\bigl(|\nu(x) - \mu| \ge \lambda\bigr) \le \frac{Var[\nu(x)]}{\lambda^2}, \]

and by obtaining good upper bounds on the variances $Var[X_p]$ and the covariances $Cov[X_p, X_q]$, Turán obtains his result that

\[ \Pr\bigl(|\nu(x) - \mu| \ge \lambda\bigr) \le O\!\left(\frac{\ln\ln n}{\lambda^2}\right), \tag{2} \]

which, in particular, implies (1). Erdős & Kac [3] show that as $n \to \infty$, the tail of $\nu(x)$ (and of any function from a fairly broad class of functions of $x$) approaches that of the corresponding normal distribution, i.e., that if $\omega$ is real and if $K_n = \{\, i \in [n] : \nu(i) \ge \ln\ln n + \omega\sqrt{2\ln\ln n} \,\}$, then

\[ \lim_{n \to \infty} \frac{|K_n|}{n} = \pi^{-1/2} \int_{\omega}^{\infty} e^{-u^2}\, du. \tag{3} \]

We strengthen the upper-tail part of (2) by showing that for any $n$ and any parameter $\delta > 0$,

\[ \Pr\bigl(\nu(x) \ge \mu_n(1+\delta)\bigr) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu_n}. \]

In contrast with (3), we get a bound for every $n$; thus, for instance, we get a concrete bound for deviations that are of an order of magnitude more than the standard deviation. We point out that strong upper- and lower-tail bounds are known using non-probabilistic methods [6]. The goal of this note is to show that a simple probabilistic approach suffices to derive exponential upper-tail bounds here. We also hope that the method and result may be of pedagogic use in showing the strength of probabilistic methods, and in the study of tail bounds for (negatively) correlated random variables.

2 Large Deviation Bounds

We first quickly review some salient features of the work of Schmidt, Siegel & Srinivasan [8].

2.1 Chernoff-Hoeffding type bounds in non-independent scenarios

The basic idea used in the Chernoff-Hoeffding (henceforth CH) bounds is as follows [2,5]. Given random variables (henceforth r.v.'s) $X_1, X_2, \ldots, X_n$, we want to upper bound the upper tail probability $\Pr(X \ge a)$, where $X \doteq \sum_{i=1}^{n} X_i$, $\mu \doteq E[X]$, $a = \mu(1+\delta)$ and $\delta > 0$. For any fixed $t > 0$,

\[ \Pr(X \ge a) = \Pr\bigl(e^{tX} \ge e^{at}\bigr) \le \frac{E[e^{tX}]}{e^{at}}; \]

by computing an upper bound $u(t)$ on $E[e^{tX}]$ and minimizing $u(t)/e^{at}$ over $t > 0$, we can upper bound $\Pr(X \ge a)$. Suppose $X_i \in \{0, 1\}$ for each $i$, a commonly occurring case. In this case, a commonly used such bound (for independent $X_i$) is

\[ \Pr\bigl(X \ge \mu(1+\delta)\bigr) \le F(\mu, \delta) \doteq \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \tag{4} \]

(see, for example, [1]).

One basic idea of [8] when $X_i \in \{0, 1\}$ is as follows. Suppose we define, for $z = (z_1, z_2, \ldots, z_n) \in \mathbb{R}^n$, a family of functions $S_j(z)$, $j = 0, 1, \ldots, n$, where $S_0(z) \equiv 1$, and for $1 \le j \le n$,

\[ S_j(z) \doteq \sum_{1 \le i_1 < i_2 < \cdots < i_j \le n} z_{i_1} z_{i_2} \cdots z_{i_j}. \]

Then, for any $t > 0$, there exist non-negative reals $a_0, a_1, \ldots, a_n$ such that $e^{tX} \equiv \sum_{i=0}^{n} a_i S_i(X_1, X_2, \ldots, X_n)$. So, we may consider functions of the form

\[ \sum_{i=0}^{n} y_i S_i(X_1, X_2, \ldots, X_n), \]

where $y_0, y_1, \ldots, y_n \ge 0$, instead of restricting ourselves to those of the form $e^{tX}$, for some $t > 0$. For any $y = (y_0, y_1, \ldots, y_n) \in \mathbb{R}^{n+1}_{+}$, define $f_y(X_1, X_2, \ldots, X_n) \doteq \sum_{i=0}^{n} y_i S_i(X_1, X_2, \ldots, X_n)$. Then, it is easy to see that

\[ \Pr(X \ge a) = \Pr\!\left(f_y(X_1, \ldots, X_n) \ge \sum_{i=0}^{a} y_i \binom{a}{i}\right) \le \frac{E[f_y(X_1, \ldots, X_n)]}{\sum_{i=0}^{a} y_i \binom{a}{i}}. \]

So, the goal now is to minimize this upper bound over $(y_0, y_1, \ldots, y_n) \in \mathbb{R}^{n+1}_{+}$. Assuming that the $X_i$'s are independent, it is shown in [8] that the optimum for the upper tail occurs roughly when: $y_i = 1$ if $i = \lceil \mu\delta \rceil$, and $y_i = 0$ otherwise. We can summarize this discussion by

Theorem 2.1 ([8]) Let bits $X_1, X_2, \ldots, X_n$ be random with $X = \sum_i X_i$, and let $\mu = E[X]$, $k = \lceil \mu\delta \rceil$. Then for any $\delta > 0$,

\[ \Pr\bigl(X \ge \mu(1+\delta)\bigr) \le \frac{E[S_k(X_1, X_2, \ldots, X_n)]}{\binom{\mu(1+\delta)}{k}}. \]

If the $X_i$'s are independent, then this is at most

\[ \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}. \]

2.2 Tail Bounds for $\nu(x)$

Returning to our original scenario, let $n$ be our given integer. For a randomly picked $x \in [n]$, let $X_p$ be 1 if $p$ divides $x$, and 0 otherwise. As stated earlier, $\nu(x) = \sum_{p \in P_n} X_p$. Let $\{\hat{X}_p \mid p \in P_n\}$ be a set of independent binary random variables with $\Pr(\hat{X}_p = 1) = 1/p$. For any $r$ and any set of distinct primes $p_{i_1}, p_{i_2}, \ldots, p_{i_r}$, note that

\[ E\Bigl[\prod_{j=1}^{r} X_{p_{i_j}}\Bigr] = \Pr\Bigl(\bigwedge_{j=1}^{r} \bigl(p_{i_j} \mid x\bigr)\Bigr) = \frac{\bigl\lfloor n / \prod_{j=1}^{r} p_{i_j} \bigr\rfloor}{n} \le \frac{1}{\prod_{j=1}^{r} p_{i_j}} = E\Bigl[\prod_{j=1}^{r} \hat{X}_{p_{i_j}}\Bigr]. \tag{5} \]

Summing (5) over all $k$-element subsets of $P_n$ shows that $E[S_k]$ for the $X_p$ is at most $E[S_k]$ for the $\hat{X}_p$, for every $k$; the $\hat{X}_p$ are independent, and their sum has mean $\mu_n$. Thus we get

Theorem 2.2 For any $n \ge 2$ and for any $\delta > 0$,

\[ \Pr\bigl(\nu(x) \ge \mu_n(1+\delta)\bigr) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu_n}, \]

just by invoking Theorem 2.1.
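As a sanity check on Theorem 2.2, the following minimal Python sketch compares the exact upper tail of $\nu(x)$, for $x$ uniform in $[n]$, with the Chernoff-type bound $F(\mu_n, \delta)$. It is an illustration only and not part of the original note: the function names (`distinct_prime_factor_counts`, `chernoff_bound`, `compare_tail`, `product_expectations`) and the choice of $n$ and $\delta$ values are assumptions made here. The last helper evaluates both sides of inequality (5) for a given set of distinct primes.

```python
import math


def distinct_prime_factor_counts(n):
    """Return (nu, primes), where nu[i] is the number of distinct primes dividing i."""
    nu = [0] * (n + 1)
    primes = []
    for p in range(2, n + 1):
        if nu[p] == 0:  # no smaller prime divides p, so p is prime
            primes.append(p)
            for m in range(p, n + 1, p):
                nu[m] += 1
    return nu, primes


def chernoff_bound(mu, delta):
    """F(mu, delta) = (e^delta / (1 + delta)^(1 + delta))^mu, evaluated via logarithms."""
    return math.exp(mu * (delta - (1 + delta) * math.log1p(delta)))


def compare_tail(n, deltas):
    """Return mu_n and a list of (delta, exact tail fraction, bound) triples."""
    nu, primes = distinct_prime_factor_counts(n)
    mu_n = sum(1.0 / p for p in primes)  # mu_n = sum of 1/p over primes p <= n
    rows = []
    for delta in deltas:
        threshold = mu_n * (1 + delta)
        tail = sum(1 for x in range(1, n + 1) if nu[x] >= threshold) / n
        rows.append((delta, tail, chernoff_bound(mu_n, delta)))
    return mu_n, rows


def product_expectations(n, prime_subset):
    """Both sides of (5): floor(n / prod p) / n for the X_p, and 1 / prod p for the independent copies."""
    q = math.prod(prime_subset)
    return (n // q) / n, 1.0 / q


if __name__ == "__main__":
    mu_n, rows = compare_tail(100_000, [0.25, 0.5, 1.0, 2.0])
    print(f"mu_n = {mu_n:.4f}")
    for delta, tail, bound in rows:
        print(f"delta = {delta}: exact tail = {tail:.6f}, bound = {bound:.6f}")
```

The sieve counts distinct prime divisors directly, so the tail fraction is exact rather than sampled; the comparison is therefore a check of the inequality itself for the chosen $n$, not a statistical estimate.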

3 Variants

Why does our approach not work directly for the lower tail of $\nu(x)$ also? The reason is that a direct negative-correlation result analogous to (5) does not appear to hold. It would be interesting to see if good lower-tail bounds also can be obtained by a short proof; as in [3,9], it may be possible to make quantitative use of the fact that the $\{X_p\}$ are all almost independent. It is known that counting the prime divisors including multiplicity changes the functions a little [6], and it would be worth considering short (probabilistic) proofs for the tail behavior of this function also. More generally, can we concretely exploit the near-independence properties of additive number-theoretic functions [3]?

Acknowledgments. I thank Noga Alon, Eric Bach, Carl Pomerance, Christian Scheideler and Joel Spencer for valuable discussions & suggestions.

References

[1] N. Alon and J. Spencer. The Probabilistic Method, Second Edition. John Wiley & Sons, Inc., 2000.
[2] H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23, 493-509 (1952).
[3] P. Erdős and M. Kac. The Gaussian law of errors in the theory of additive number theoretic functions. American Journal of Mathematics, 62, 738-742 (1940).
[4] G. H. Hardy and S. Ramanujan. The normal number of prime factors of a number n. Quarterly J. Math., 48, 76-92 (1917).
[5] W. Hoeffding. Probability inequalities for sums of bounded random variables. American Statistical Association Journal, 58, 13-30 (1963).
[6] C. Pomerance. On the distribution of round numbers. Number Theory Proceedings, Ootacamund, India, 1984. K. Alladi, ed., Lecture Notes in Math. 1122, 173-200 (1985).
[7] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois Journal of Mathematics, 6, 64-94 (1962).
[8] J. P. Schmidt, A. Siegel, and A. Srinivasan. Chernoff-Hoeffding bounds for applications with limited independence. SIAM Journal on Discrete Mathematics, 8, 223-250 (1995).
[9] P. Turán. On a theorem of Hardy and Ramanujan. Journal of the London Mathematics Society, 9, 274-276 (1934).