DISTRIBUTIONAL CONVERGENCE FOR THE NUMBER OF SYMBOL COMPARISONS USED BY QUICKSELECT (DRAFT, NOT FOR DISTRIBUTION)


JAMES ALLEN FILL AND TAKÉHIKO NAKAMA

Abstract. When the search algorithm QuickSelect compares keys during its execution in order to find a key of target rank, it must operate on the keys' representations or internal structures, which were ignored by the previous studies that quantified the execution cost for the algorithm in terms of the number of required key comparisons. In this paper, we analyze running costs for the algorithm that take into account not only the number of key comparisons but also the cost of each key comparison. We suppose that keys are represented as sequences of symbols generated by various probabilistic sources and that QuickSelect operates on individual symbols in order to find the target key. We identify limiting distributions for the costs and derive integral and series expressions for the expectations of the limiting distributions. These expressions are used to recapture previously obtained results on the number of key comparisons required by the algorithm.

1. Introduction and Summary

QuickSelect, introduced by Hoare [8] in 1961 and also known as Find or "Hoare's selection algorithm", is a simple search algorithm widely used for finding a key (an object drawn from a linearly ordered set) of target rank in a file of keys. We briefly review the operation of the algorithm. Suppose that there are $n$ keys (we will suppose that these are all distinct) and that the target rank is $m$, where $1 \le m \le n$. QuickSelect $\equiv$ QuickSelect$(n, m)$ chooses a uniformly random key, called the pivot, and compares each other key to it. This determines the rank $j$ (say) of the pivot. If $j = m$, then the algorithm returns the pivot key and terminates. If $j > m$, then QuickSelect is applied recursively to find the key of rank $m$ in the set of $j - 1$ keys found to be smaller than the pivot. If $j < m$, then QuickSelect is applied recursively to find the key of rank $m - j$ in the set of $n - j$ keys larger than the pivot.
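The recursion just described can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function name `quickselect`, the iterative (rather than recursive) formulation, and the `rng` parameter are assumptions made here, and the counter tracks key comparisons only.

```python
import random


def quickselect(keys, m, rng=random):
    """Return (key of rank m, number of key comparisons) for distinct keys.

    Ranks are 1-based, so m = 1 asks for the smallest key.
    The names and the iterative form are illustrative assumptions.
    """
    keys = list(keys)
    if not 1 <= m <= len(keys):
        raise ValueError("target rank m must satisfy 1 <= m <= n")
    comparisons = 0
    while True:
        pivot_index = rng.randrange(len(keys))  # uniformly random pivot
        pivot = keys[pivot_index]
        smaller, larger = [], []
        for index, key in enumerate(keys):
            if index == pivot_index:
                continue
            comparisons += 1  # one key comparison against the pivot
            (smaller if key < pivot else larger).append(key)
        j = len(smaller) + 1  # rank of the pivot in the current file
        if j == m:
            return pivot, comparisons
        if j > m:
            keys = smaller  # target has rank m among the j - 1 smaller keys
        else:
            keys, m = larger, m - j  # target has rank m - j among larger keys
```

Calling `quickselect(keys, 1)` corresponds to what the paper later calls QuickMin, and `quickselect(keys, len(keys))` to QuickMax.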
Date: last revised July 2. Research supported by NSF grant DMS-XXXXXXX and by the Acheson J. Duncan Fund for the Advancement of Research in Statistics.

Many studies have examined this algorithm to quantify its execution costs (a non-exhaustive list of references is Knuth [10]; Mahmoud, Modarres, and Smythe [12]; Prodinger [15]; Grübel and Rösler [7]; Lent and Mahmoud [11]; Grübel [6]; Mahmoud and Smythe [13]; Devroye [3]; Hwang and Tsai [9]; Fill and Nakama [5]; and Vallée, Clément, Fill, and Flajolet [20]); and all of them except for Fill and Nakama [5] and Vallée et al. [20] have conducted the quantification with regard to the number of key comparisons required by the algorithm to achieve its task. As a result, most of the theoretical results on the complexity of QuickSelect are about expectations or distributions for the number of required key comparisons.

However, one can reasonably argue that analyses of QuickSelect in terms of the number of key comparisons cannot fully quantify its complexity. For instance, if keys are represented as binary strings, then individual bits of the strings must be compared in order for QuickSelect to complete its task, and results obtained by analyzing the algorithm with respect to the number of bit comparisons required to find a target key more accurately reflect actual execution costs. (We will consider bit comparisons as an example of symbol comparisons.) When QuickSelect (or any other algorithm) compares keys during its execution, it must operate on the keys' representations or internal structures, so these should not be ignored in fully characterizing the performance of the algorithm. Also, symbol-complexity analysis allows us to compare key-based algorithms such as QuickSelect and QuickSort with digital algorithms such as those utilizing digital search trees.

Fill and Janson [4] pioneered symbol-complexity analysis by analyzing the expected number of bit comparisons required by QuickSort. They assumed that the algorithm is applied to keys that are i.i.d. (independent and identically distributed) from the uniform distribution over $(0, 1)$ and represented (via their binary expansions) as binary strings, and that the algorithm operates on individual bits in order to do comparisons and find the target key. They found that the expected number of bit comparisons required by QuickSort to sort $n$ keys is asymptotically equivalent to $n (\ln n)(\lg n)$ (where $\lg$ denotes binary logarithm), whereas the lead-order term of the expected number of key comparisons is $2 n \ln n$, smaller by a factor of order $\log n$. In their Section 6 they also considered i.i.d. keys drawn from other distributions with density on $(0, 1)$.

By closely following [4], Fill and Nakama [5] studied the expected number of bit comparisons required by QuickSelect. More precisely, they treated the case of i.i.d. uniform keys represented as binary strings and produced exact expressions for the expected number of bit comparisons by QuickSelect$(n, m)$ for general $n$ and $m$. Their asymptotic results were limited to the algorithms QuickMin, QuickMax, and QuickRand. Here QuickMin refers to QuickSelect applied to find the smallest key, i.e., to QuickSelect$(n, m)$ with $m = 1$; and QuickMax similarly refers to QuickSelect$(n, m)$ with $m = n$. QuickRand is the algorithm that results from taking $m$ to be uniformly distributed over $\{1, 2, \dots, n\}$. They showed that the expected number of bit comparisons required by QuickMin or QuickMax is asymptotically linear in $n$, with a lead-order coefficient that they evaluated numerically. Thus in these cases the expected number of bit comparisons is asymptotically larger than that of key comparisons required to complete the same task only by a constant factor, since the expectation for key comparisons is asymptotically $2n$. Fill and Nakama [5] also found that the expected number of bit comparisons required by QuickRand is asymptotically linear in $n$, as it is for key comparisons (where the slope is 3).

Vallée et al. [20] extended the average-case analyses of [4] and [5] to keys represented by sequences of general symbols generated by any of a wide variety of sources that include memoryless, Markov, and other dynamical sources. They broadly extended the results of [5] in another direction as well by treating QuickQuant$(n, \alpha)$ for general $\alpha \in [0, 1]$, not just QuickMin, QuickMax, and QuickRand. Here the algorithm QuickQuant$(n, \alpha)$ (for "Quick Quantile") refers to QuickSelect$(n, m_n)$ with $m_n / n \to \alpha$. Roughly summarized, Vallée et al. showed that if symbols are generated by a suitably nice source, then the expected number of symbol comparisons in processing a file of $n$ keys is of order $n \log^2 n$ for QuickSort and, for any $\alpha$, of order $n$ for QuickQuant$(n, \alpha)$. (For example, all memoryless sources are suitably nice.) For a more detailed discussion of sources and the results of Vallée et al. [20] for QuickQuant, see Section 2.

The main purpose of this paper is to extend the average-case analysis of Vallée et al. [20] by establishing limiting distributions for the number of symbol comparisons. To our knowledge the present paper is the first to establish a limiting distribution for the number of symbol comparisons required by any key-based algorithm. Our elementary approach allows us to handle rather general kinds of cost for comparing two keys, and in particular to recover in a rather direct way known results about the number of key comparisons. There is no disadvantage to allowing general costs, since our results rely on at most broad limitations on the nature of the cost.

Outline of the paper. We shall be concerned primarily with QuickQuant $\equiv$ QuickQuant$(n, \alpha)$, which is what we call the algorithm QuickSelect when applied to find the key of rank $m_n$ in a file of size $n$, where we are given $0 \le \alpha \le 1$ and a sequence $(m_n)$ such that $m_n / n \to \alpha$. It turns out to be convenient mathematically to analyze a close cousin to QuickQuant introduced by Vallée et al. [20], namely, QuickVal, and then treat QuickQuant by comparison.
So, after a careful description of the probabilistic models used to govern the generation of keys in Section 2.1, a review of known results about key and symbol comparisons in Section 2.2, and a description of QuickVal in Section 2.3, in Section 3 we establish limiting-distribution results for QuickVal (whose main theorems are Theorem 3.1 and Theorem 3.4) and then move on to QuickQuant in Section 4 (which contains Theorem 5.1, the main theorem of this paper). We are currently in the process of extending our analysis to investigate limiting distributions for general costs required by QuickSort.

Remark 1.1. Although the contraction method has been used in finding limiting distributions for the number of key comparisons required by recursive algorithms such as QuickSort (e.g., Rösler [17], Rösler and Rüschendorf [18]), our analysis does not depend on it. In examining convergence for the number of key comparisons used by QuickQuant, Grübel and Rösler [7] mentioned that they did not use the contraction method due to the parameter that represents target rank. (However, they did engage in contraction arguments to characterize the limiting distribution.) Interestingly, Mahmoud et al. [12] succeeded in establishing a fixed-point equation to identify the limiting distributions of the normalized numbers of key comparisons required by QuickRand, QuickMin, and QuickMax. Régnier [16] used martingales to show convergence for the number of key comparisons required by QuickSort.

2. Background and preliminaries

2.1. Probabilistic source models for the keys. In this subsection we describe what is meant by a probabilistic source, our model for how the i.i.d. keys are generated, using the terminology and notation of Vallée et al. [20].

Let $\Sigma$ denote a totally ordered alphabet (set of symbols), assumed to be isomorphic either to $\{0, \dots, r - 1\}$ for some finite $r$ or to the full set of nonnegative integers, in either case with the natural order; a word is then an element of $\Sigma^\infty$, i.e., an infinite sequence (or "string") of symbols. We will follow the customary practice of denoting a word $w = (w_1, w_2, \dots)$ more simply by $w_1 w_2 \cdots$.

We will use the word "prefix" in two closely related ways. First, the symbol strings belonging to $\Sigma^k$ are called prefixes of length $k$, and so $\Sigma^* := \bigcup_{0 \le k < \infty} \Sigma^k$ denotes the set of all prefixes of any nonnegative finite length. Second, if $w = w_1 w_2 \cdots$ is a word, then we will call

$$w(k) := w_1 w_2 \cdots w_k \in \Sigma^k \tag{2.1}$$

its prefix of length $k$.

Lexicographic order is the linear order (to be denoted in the strict sense by $\prec$ and in the weak sense by $\preceq$) on the set of words specified by declaring that $w \prec w'$ if (and only if) for some $0 \le k < \infty$ the prefixes of $w$ and $w'$ of length $k$ are equal but $w_{k+1} < w'_{k+1}$. We denote the cost of determining $w \prec w'$ when comparing distinct words $w$ and $w'$ by $c(w, w')$; we will always assume that the function $c$ is symmetric and nonnegative.

Example 2.1. Here is an example of a natural class of cost functions. Start with nonnegative symmetric functions $c_i : \Sigma \times \Sigma \to [0, \infty)$, $i = 1, 2, \dots$, modeling the cost of comparing symbols in the respective $i$th positions of two words. This allows for the symbol-comparison costs to depend both on the positions of the symbols in the words and on the symbols themselves. Then, for comparisons of distinct words, define

$$c(w, w') := \sum_{i=1}^{k+1} c_i(w_i, w'_i) = \sum_{i=1}^{k} c_i(w_i, w_i) + c_{k+1}(w_{k+1}, w'_{k+1}),$$

where $k$ is the length of the longest common prefix of $w$ and $w'$. If $c_i \equiv \delta_{i_0, i}$ (independent of the symbols being compared) for given positive integer $i_0$, then $c$ is the cost used in counting comparisons of symbols in position $i_0$; in particular, if $i_0 = 1$ then $c \equiv 1$ is the cost used in counting key comparisons. On the other hand, if $c_i \equiv 1$ for all $i$, then $c \equiv k + 1$ is the cost used in counting symbol comparisons.
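As an illustration of Example 2.1, the following Python sketch evaluates the cost of comparing two distinct words from finite prefixes that are long enough to contain the first differing position. The names `comparison_cost`, `symbol_cost`, and `key_comparison_cost` are hypothetical, introduced here only for illustration; the default symbol cost 1 gives the symbol-comparison count $k + 1$.

```python
def comparison_cost(w, v, symbol_cost=lambda i, a, b: 1):
    """Cost c(w, v) of comparing two distinct words, given as finite prefixes.

    symbol_cost(i, a, b) plays the role of c_i(a, b) with 1-based position i.
    All names here are illustrative assumptions, not notation from the paper.
    """
    total = 0
    for i, (a, b) in enumerate(zip(w, v), start=1):
        total += symbol_cost(i, a, b)
        if a != b:  # position k + 1: the comparison is decided here
            return total
    raise ValueError("prefixes agree; supply longer prefixes of the two words")


def key_comparison_cost(i, a, b):
    """The choice c_i = delta_{1,i}: every key comparison costs exactly 1."""
    return 1 if i == 1 else 0
```

For the prefixes `"0110"` and `"0101"` the longest common prefix has length 2, so the symbol-comparison cost is 3, while the key-comparison cost is 1.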
A probabilistic source is simply a stochastic process $W = W_1 W_2 \cdots$ with state space $\Sigma$ (endowed with its total $\sigma$-field) or, equivalently, a random variable $W$ taking values in $\Sigma^\infty$ (with the product $\sigma$-field). According to Kolmogorov's consistency criterion, the distributions $\mu$ of such processes are in one-to-one correspondence with consistent specifications of finite-dimensional marginals, that is, of the probabilities

$$p_w := \mu(\{w_1 \cdots w_k\} \times \Sigma^\infty), \qquad w = w_1 w_2 \cdots w_k \in \Sigma^*.$$

Here the fundamental probability $p_w$ is the probability that a word drawn from $\mu$ has $w_1 \cdots w_k$ as its length-$k$ prefix.

Because the analysis of QuickSelect is significantly more complicated when its input keys are not all distinct, we will restrict attention to probabilistic sources with continuous distributions $\mu$. Expressed equivalently in terms of fundamental probabilities, our continuity assumption is that for any $w = w_1 w_2 \cdots \in \Sigma^\infty$ we have $p_{w(k)} \to 0$ as $k \to \infty$, recalling the prefix notation (2.1).

Example 2.2. We present a few classical examples of sources. For more examples, and for further discussion, see Section 3 of [20].

(a) In computer science jargon, a memoryless source is one with $W_1, W_2, \dots$ i.i.d. Then the fundamental probabilities $p_w$ have the product form

$$p_w = p_{w_1} p_{w_2} \cdots p_{w_k}, \qquad w = w_1 w_2 \cdots w_k \in \Sigma^*.$$

(b) A Markov source is one for which $W_1 W_2 \cdots$ is a Markov chain.

(c) An intermittent source over the finite alphabet $\Sigma = \{0, \dots, r - 1\}$ is defined by specifying the conditional distributions $\mathcal{L}(W_j \mid W_1, \dots, W_{j-1})$ in a way that pays special attention to a particular symbol $\sigma$. The source is said to be intermittent of exponent $\gamma > 0$ with respect to $\sigma$ if $\mathcal{L}(W_j \mid W_1, \dots, W_{j-1})$ depends only on the maximum value $k$ such that the last $k$ symbols in the prefix $W_1 \cdots W_{j-1}$ are all $\sigma$ and (i) is the uniform distribution on $\Sigma$, if $k = 0$; and (ii) if $1 \le k \le j - 1$, assigns mass $[k/(k+1)]^\gamma$ to $\sigma$ and distributes the remaining mass uniformly over the remaining elements of $\Sigma$.

We next present an equivalent description of probabilistic sources (with a corresponding equivalent condition for continuity) that will prove convenient because it allows us to treat all sources within a uniform framework. If $M$ is any measurable mapping from $(0, 1)$ (with its Borel $\sigma$-field) into $\Sigma^\infty$ and $U$ is distributed unif$(0, 1)$, then $M(U)$ is a probabilistic source. Conversely, given any probability measure $\mu$ on $\Sigma^\infty$ there exists a monotone measurable mapping $M$ such that $M(U)$ has distribution $\mu$ when $U \sim$ unif$(0, 1)$; here (weakly) monotone means that $M(t) \preceq M(u)$ whenever $t \le u$. Indeed, if $F$ is the distribution function

$$F(w) := \mu\{w' \in \Sigma^\infty : w' \preceq w\}, \qquad w \in \Sigma^\infty,$$

for $\mu$, then we can always use the inverse probability transform

$$M(u) := \inf\{w \in \Sigma^\infty : u \le F(w)\}, \qquad u \in (0, 1),$$

for $M$. The measure $\mu$ is continuous if and only if this $M$ is strictly monotone.

So henceforth we will assume that our keys are generated as $M(U_1), \dots, M(U_n)$, where $M : (0, 1) \to \Sigma^\infty$ is strictly monotone and $U_1, \dots, U_n$ (we will call these the seeds of the keys) are i.i.d. unif$(0, 1)$. Given a specification of costs $c(w, w')$ in comparing words, we can now define a source-specific notion of cost by setting

$$\beta(u, t) := c(M(u), M(t)).$$
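To make the seed-based cost $\beta(u, t) = c(M(u), M(t))$ concrete, here is a minimal Python sketch for one particular source: the memoryless binary source with equally likely symbols, for which $M(u)$ can be taken to be the binary expansion of $u$ (the setting of uniform keys represented as binary strings mentioned in the introduction). The function names and the `max_symbols` cutoff are assumptions made for this illustration; dyadic seeds, which have two binary expansions, are handled by using the terminating expansion and form a set of probability zero.

```python
def binary_symbols(u, k):
    """First k symbols of the binary expansion of a seed u in (0, 1)."""
    bits = []
    for _ in range(k):
        u *= 2
        bit = int(u)  # next binary digit, 0 or 1
        bits.append(bit)
        u -= bit
    return bits


def beta_symb(u, t, max_symbols=64):
    """Number of symbol comparisons needed to compare the words of two seeds.

    Assumes the uniform binary source; returns k + 1, where k is the length
    of the longest common prefix of the two binary expansions.
    """
    for k in range(max_symbols):
        u, t = 2 * u, 2 * t
        bit_u, bit_t = int(u), int(t)
        if bit_u != bit_t:
            return k + 1
        u, t = u - bit_u, t - bit_t
    raise ValueError("seeds agree on the first max_symbols symbols")
```

For instance, the seeds 0.25 and 0.375 have expansions beginning 010 and 011, so their words share a prefix of length 2 and three symbol comparisons are needed.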
In our main application, $\beta_{\mathrm{symb}}(u, t)$ represents the number of symbol comparisons required to compare words with seeds $u$ and $t$.

The following associated terminology and notation from [20] will also prove useful. For each prefix $w \in \Sigma^*$, we let $I_w = (a_w, b_w)$ denote the interval that contains all seeds whose corresponding words begin with $w$ and $\mu_w := (a_w + b_w)/2$ its midpoint. We call $I_w$ the fundamental interval associated with $w$. (There is no need to be fussy as to whether the interval is open or closed or half-open, because the probability that a random seed $U$ takes any particular value is 0. Also, we always assume that $a_w < b_w$, since the case that $a_w = b_w$ will not concern us.) The fundamental probability $p_w$ can be expressed as $b_w - a_w$. The fundamental triangle of prefix $w$, denoted by $T_w$, is the triangular region

$$T_w := \{(u, t) : a_w < u < t < b_w\},$$

and when $w$ is the empty prefix we denote this triangle by $T$:

$$T := \{(u, t) : 0 < u < t < 1\}.$$

For some of our results, the quantity

$$\pi_k := \max\{p_w : w \in \Sigma^k\} \tag{2.2}$$

will play an important role. The following definition of a Π-tamed probabilistic source is taken (with slight modification) from [20]:

Definition 2.3. Let $0 < \gamma < \infty$ and $0 < A < \infty$. We say that the source is Π-tamed (with parameters $\gamma$ and $A$) if the sequence $(\pi_k)$ at (2.2) satisfies

$$\pi_k \le A (k + 1)^{-\gamma} \quad \text{for every } k \ge 0.$$

Observe that a Π-tamed source is always continuous. There is a related condition for cost functions $\beta$ that will be assumed (for suitable values of the parameters) in some of our results:

Definition 2.4. Let $0 < \epsilon < \infty$ and $0 < c < \infty$. We say that the symmetric cost function $\beta \ge 0$ is tamed (with parameters $\epsilon$ and $c$) if

$$\beta(u, t) \le c (t - u)^{-\epsilon} \quad \text{for all } (u, t) \in T.$$

We say that $\beta$ is $\epsilon$-tamed if it is tamed with parameters $\epsilon$ and $c$ for some $c$.

We leave it to the reader to make the simple verification that a source is Π-tamed with parameters $\gamma$ and $A$ if and only if $\beta_{\mathrm{symb}}$ is tamed with parameters $\epsilon = 1/\gamma$ and $c = A^{1/\gamma}$.

Remark 2.5. (a) Many common sources have geometric decrease in $\pi_k$ (call these "g-tamed") and so for any $\gamma$ are Π-tamed with parameters $\gamma$ and $A$ for suitably chosen $A \equiv A_\gamma$ [equivalently, the symbol-comparisons cost $\beta_{\mathrm{symb}}$ is $\epsilon$-tamed for any $\epsilon$; in fact, if $\pi_k \le b^{-k}$ for every $k$, then $\beta_{\mathrm{symb}}(u, t) \le 1 + \log_b \frac{1}{t - u}$ for all $(u, t) \in T$]. For example, a memoryless source satisfies $\pi_k = p_{\max}^k$, where

$$p_{\max} := \sup_{w \in \Sigma^1} p_w$$

satisfies $p_{\max} < 1$ except in the highly degenerate case of an essentially single-symbol alphabet. We also have $\pi_k \le p_{\max}^k$ for any Markov source, where now $p_{\max}$ is the supremum of all one-step transition probabilities, and so such a source is g-tamed provided $p_{\max} < 1$. Expanding dynamical sources are also g-tamed.

(b) For an intermittent source as in Example 2.2, for all large $k$ the maximum probability $\pi_k$ is attained by the word $\sigma^k$ and equals

$$\pi_k = r^{-1} k^{-\gamma}.$$

Intermittent sources are therefore examples of Π-tamed sources for which $\pi_k$ decays at a truly inverse-polynomial rate, not an exponential rate as in the case of g-tamed sources.

2.2. Known results for the numbers of key and symbol comparisons. In this subsection we give for QuickSelect an abbreviated review of what is already known about the distribution of the number of key comparisons ($\beta \equiv 1$ in our notation) and (from Vallée et al. [20]) about the expected number of symbol comparisons ($\beta = \beta_{\mathrm{symb}}$). To our knowledge, no other cost functions have previously been considered, nor has there been any treatment of the full distribution of the number of symbol comparisons.

Let $K_{n,m}$ denote the number of key comparisons required by the algorithm to find a key of rank $m$ in a file of $n$ keys (with $1 \le m \le n$). Thus $K_{n,1}$ and $K_{n,n}$ represent the key comparison costs required by QuickMin and QuickMax, respectively. (Clearly $K_{n,1} \stackrel{\mathcal{L}}{=} K_{n,n}$.) It has been shown (see Mahmoud et al. [12], Hwang and Tsai [9]) that as $n \to \infty$, $K_{n,1}/n$ converges in law to the Dickman distribution, which can be described as the distribution of the perpetuity

$$1 + \sum_{k \ge 1} U_1 \cdots U_k,$$

where $U_k$ are i.i.d. uniform$(0, 1)$.

Mahmoud et al. [12] established a fixed-point equation for the limiting distribution of the normalized (by dividing by $n$) number of key comparisons required by QuickRand and also explicitly identified this limiting distribution.

By using process-convergence techniques, Grübel and Rösler [7, Theorem 8] identified, for each $0 \le \alpha < 1$, a nondegenerate random variable $K(\alpha)$ to which $K_{n, \lfloor \alpha n \rfloor + 1}/n$ converges in distribution; see also the fixed-point equation in their Theorem 10, and Grübel [6], who used a Markov chain approach and characterized the limiting distribution in his Theorem 3. Earlier, Devroye [2] had shown that

$$\sup_{n \ge 1} \max_{1 \le m \le n} P(K_{n,m} \ge t n) \le C \rho^t$$

for any $\rho > 3/4$ and some $C \equiv C(\rho)$. Concerning moments, Grübel and Rösler [7, Theorem 11] showed that

$$E\,K(\alpha) = 2[1 - \alpha \ln \alpha - (1 - \alpha) \ln(1 - \alpha)]$$

and Paulsen [14] calculated higher-order moments of $K(\alpha)$. Grübel [6, end of Section 2] proved convergence of the moments for finite $n$ to the corresponding moments of the limiting $K(\alpha)$.

Prior to the present paper, only expectations have been studied for the number of symbol comparisons for QuickQuant.
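The formula for $E\,K(\alpha)$ quoted above can be checked numerically against two facts stated in the introduction: the expected number of key comparisons is asymptotically $2n$ for QuickMin (the case $\alpha = 0$), and the slope for QuickRand, which averages over a uniformly random target rank, is 3. The Python sketch below does this; the function names and the midpoint-rule averaging are choices made here for illustration.

```python
import math


def mean_limit_key_comparisons(alpha):
    """E K(alpha) = 2[1 - a ln a - (1 - a) ln(1 - a)], with 0 ln 0 := 0."""

    def x_ln_x(x):
        return 0.0 if x == 0.0 else x * math.log(x)

    return 2.0 * (1.0 - x_ln_x(alpha) - x_ln_x(1.0 - alpha))


def average_over_alpha(steps=20000):
    """Midpoint-rule average of E K(alpha) over alpha in (0, 1)."""
    h = 1.0 / steps
    return h * sum(mean_limit_key_comparisons((i + 0.5) * h) for i in range(steps))
```

Since $\int_0^1 \alpha \ln \alpha \, d\alpha = -1/4$, the exact average is $2(1 + 1/4 + 1/4) = 3$, and the value at $\alpha = 0$ is 2, in agreement with the slopes quoted earlier.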
The current state of knowledge is summarized by part (i) of Theorem 2 in Vallée et al. [20] (see also their accompanying Figures 1–3); we refer the reader to [20] for the other parts of the theorem, which routinely specialize part (i) to QuickMin, QuickMax, and QuickRand. To review their result we need the notation and terminology of Section 2.1 and a bit more. Using the non-standard abbreviations $y^+ := (1/2) + y$ and $y^- := (1/2) - y$ and the convention $0 \ln 0 := 0$, we define

$$H(y) := \begin{cases} -(y^+ \ln y^+ + y^- \ln y^-), & \text{if } 0 \le y < 1/2, \\ y^- (\ln y^+ - \ln |y^-|), & \text{if } y \ge 1/2, \end{cases}$$

and then set $L(y) := 2[1 + H(y)]$. According to Theorem 2(i) in [20], for any Π-tamed source the mean number of symbol comparisons for QuickQuant$(n, \alpha)$ is asymptotically $\rho n + O(n^{1 - \delta})$ for some $\delta > 0$. Here $\rho \equiv \rho(\alpha)$ and $\delta$ both depend on the probabilistic source, with

$$\rho := \sum_{w \in \Sigma^*} p_w \, L\!\left( \left| \frac{\alpha - \mu_w}{p_w} \right| \right). \tag{2.3}$$

They derive (2.3) by first proving the equality

$$\rho = 2 \int\!\!\int_T \beta(u, t) \, [(\alpha \vee t) - (\alpha \wedge u)]^{-1} \, du \, dt \tag{2.4}$$

for Π-tamed sources.

2.3. QuickQuant and QuickVal. Let $S_n^Q \equiv S_n^Q(\alpha)$ denote the total cost required by QuickQuant$(n, \alpha)$. To prove convergence of $S_n^Q / n$ (in suitable senses to be made precise later), we exploit an idea introduced by Vallée et al. [20] and begin with the study of a related algorithm, called QuickVal $\equiv$ QuickVal$(n, \alpha)$, which we now describe. QuickVal is admittedly somewhat artificial and inefficient; it is important to keep in mind that we study it mainly as an aid to studying QuickQuant.

Having generated $n$ seeds and then $n$ keys $M_1, \dots, M_n$ (say) using our probabilistic source, QuickVal is a recursive randomized algorithm to find the rank of the additional word $M(\alpha)$ in the set $\{M_1, \dots, M_n, M(\alpha)\}$; thus, while QuickQuant finds the value of the $\alpha$-quantile in the sample of keys, QuickVal dually finds the rank of the population $\alpha$-quantile in the augmented set. First, QuickVal selects a pivot uniformly at random from the set of keys $\{M_1, \dots, M_n\}$ and finds the rank of the pivot by (a) comparing the pivot with each of the other keys (we will count these comparisons) and (b) comparing the pivot with $M(\alpha)$ (we will find it convenient not to count the cost of this comparison in the total cost). With probability one, the pivot key will differ from the word $M(\alpha)$. If $M(\alpha)$ is smaller than the pivot key, then the algorithm operates recursively on the set of keys smaller than the pivot and determines the rank of the word $M(\alpha)$ in the set $\mathcal{M}_{\mathrm{smaller}} \cup \{M(\alpha)\}$, where $\mathcal{M}_{\mathrm{smaller}}$ denotes the set of keys smaller than the pivot. Similarly, if $M(\alpha)$ is greater than the pivot key, then the algorithm operates recursively on the set of keys larger than the pivot [together with the word $M(\alpha)$]. Eventually the set of words on which the algorithm operates reduces to the singleton $\{M(\alpha)\}$, and the algorithm terminates.
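The description of QuickVal above can be sketched in Python as follows. Because the map $M$ is strictly monotone, comparing words is equivalent to comparing seeds, so this sketch works directly with seeds and charges a cost `beta(u, pivot)` for each counted comparison. The function name `quickval`, its signature, and the iterative formulation are assumptions made for this illustration; the default cost 1 counts key comparisons.

```python
import random


def quickval(seeds, alpha, beta=lambda u, t: 1, rng=random):
    """Return (rank of M(alpha) in the augmented set, total counted cost).

    seeds: distinct seeds in (0, 1), none equal to alpha.
    beta(u, t): cost of comparing the words with seeds u and t.
    All names are illustrative assumptions, not the paper's code.
    """
    seeds = list(seeds)
    total_cost = 0
    keys_below = 0  # keys already known to be smaller than M(alpha)
    while seeds:
        pivot_index = rng.randrange(len(seeds))  # uniformly random pivot
        pivot = seeds[pivot_index]
        smaller, larger = [], []
        for index, u in enumerate(seeds):
            if index == pivot_index:
                continue
            total_cost += beta(u, pivot)  # counted: key versus pivot
            (smaller if u < pivot else larger).append(u)
        # The comparison of the pivot with M(alpha) is deliberately not counted.
        if alpha < pivot:
            seeds = smaller
        else:
            keys_below += len(smaller) + 1
            seeds = larger
    return keys_below + 1, total_cost
```

When the loop ends, only the word $M(\alpha)$ remains, and its rank is one more than the number of keys found to lie below it.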
Notice that the operation of QuickVal is quite close to that of QuickQuant, for the same value of $\alpha$; we expect running costs of the two algorithms to be close, since when $n$ is large the rank of $M(\alpha)$ in $\{M_1, \dots, M_n, M(\alpha)\}$ should be close (in relative error terms) to $\alpha n$. In fact, we will show that if $S_n^V \equiv S_n^V(\alpha)$ denotes the total cost of executing QuickVal$(n, \alpha)$, then $S_n^Q / n$ and $S_n^V / n$ have the same limiting distribution, assuming only that the cost function $\beta$ is $\epsilon$-tamed for suitably small $\epsilon$. More strongly, we will show that when all the random variables $S_1^Q, S_2^Q, \dots$ and $S_1^V, S_2^V, \dots$ are strategically defined on a common probability space, then $S_n^Q / n$ and $S_n^V / n$ both converge in $L^p$ to a common limit for $1 \le p < \infty$.

3. Analysis of QuickVal

Following some preliminaries in Section 3.1, in Section 3.2 we show that for $1 \le p < \infty$, a suitably defined $S_n^V / n$ converges in $L^p$ to a certain random variable $S$ (defined at the end of Section 3.1) provided only that $E\,S < \infty$. We also show that, when the cost function is suitably tamed, $S_n^V / n$ converges almost surely to $S$; see Theorem 3.4 in Section 3.3. We derive an integral expression for $E\,S$ valid for a completely general cost function $\beta$ in Section 3.4 and use it to compute the expectation when $\beta \equiv 1$. In Section 3.5, we focus on $E\,S$ with $\beta = \beta_{\mathrm{symb}}$ and derive a series expression for the expectation. Few comparisons of results obtained here with the known results reviewed in Section 2.2 are made in the present section; most such comparisons are deferred to (the first paragraph of) Section 4, where the previously-studied algorithm of greater interest, QuickQuant, is treated.

3.1. Preliminaries. Our goal is to establish a limit, in various senses, for the ratio of the total cost required by QuickVal when applied to a file of $n$ keys to $n$. It will be both natural and convenient to define all these total costs, one for each value of $n$, in terms of a single infinite sequence $(U_i)_{i \ge 1}$ of seeds that are i.i.d. uniform$(0, 1)$. Indeed, let $L_0 := 0$ and $R_0 := 1$. For $k \ge 1$, inductively define

$$\tau_k := \inf\{i : L_{k-1} < U_i < R_{k-1}\}, \tag{3.1}$$

$$L_k := \mathbf{1}(U_{\tau_k} < \alpha)\, U_{\tau_k} + \mathbf{1}(U_{\tau_k} > \alpha)\, L_{k-1}, \tag{3.2}$$

$$R_k := \mathbf{1}(U_{\tau_k} < \alpha)\, R_{k-1} + \mathbf{1}(U_{\tau_k} > \alpha)\, U_{\tau_k}, \tag{3.3}$$

$$S_{n,k} := \sum_{i:\, \tau_k < i \le n} \mathbf{1}(L_{k-1} < U_i < R_{k-1})\, \beta(U_i, U_{\tau_k}). \tag{3.4}$$

(Note that $S_{n,k}$ vanishes if $\tau_k \ge n$.) We then claim that, for each $n$,

$$S_n^V := \sum_{k \ge 1} S_{n,k} \tag{3.5}$$

has the distribution of the total cost required by QuickVal$(n, \alpha)$.

We offer some explanation here. For each $k \ge 1$, the random interval $(L_{k-1}, R_{k-1})$ (whose length decreases monotonically in $k$) contains both the target seed $\alpha$ and the seed $U_{\tau_k}$ corresponding to the $k$th pivot; the interval contains precisely those seed values still under consideration after $k - 1$ pivots have been performed. The only difference between how we have defined $S_n^V$ and how it is usually defined is that we have chosen the initial pivot seed to be the first seed rather than a random one, and have made this same change recursively. But our change is permissible because of the following basic probabilistic fact: If $U_1, \dots, U_N, M$ are independent random variables with $U_1, \dots, U_N$ i.i.d. uniform$(0, 1)$ and $M$ uniformly distributed on $\{1, \dots, N\}$, then $U_M$, like $U_1$, is distributed uniform$(0, 1)$.
Thus the conditional distribution of $U_{\tau_k}$ given $(L_{k-1}, R_{k-1})$ is uniform$(L_{k-1}, R_{k-1})$.

We illustrate our notation for the first two pivots. First, $\tau_1 = 1$; that is, the seed of the first pivot is the uniform$(0, 1)$ random variable $U_1$. After that, if $\alpha < U_1$ then the seed $U_{\tau_2}$ of the second pivot is chosen as the first seed falling in $(0, U_1)$, while if $\alpha > U_1$ then $U_{\tau_2}$ is the first seed falling in $(U_1, 1)$.

We note that if $\alpha = 0$ (which means that we are dealing with the total cost required by QuickMin), then the first of these two cases is always the one that applies and so for every $k \ge 1$ we have $L_k = 0$ and $R_k = U_{\tau_k}$; we then have that $U_{\tau_k}$ is just the $k$th record low value among $U_1, U_2, \dots$.

In order to describe the limit of $S_n^V / n$, we let

$$I(t, x, y) := \int_x^y \beta(u, t)\, du, \qquad I_k := I(U_{\tau_k}, L_{k-1}, R_{k-1}), \tag{3.6}$$

$$S := \sum_{k \ge 1} I_k. \tag{3.7}$$
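The construction (3.1)–(3.7) can be traced on a finite list of seeds with the following Python sketch. It returns $S_n^V$ together with the partial sum of the $I_k$ over those pivots that appear among the first $n$ seeds, which is only a truncation of the infinite sum $S$. The function name, the 0-based indexing, and the `integral` callback (which must return $I(t, x, y)$ for the chosen $\beta$) are assumptions made here; the defaults correspond to key comparisons, where $I(t, x, y) = y - x$.

```python
def quickval_costs(seeds, alpha,
                   beta=lambda u, t: 1,
                   integral=lambda t, x, y: y - x):
    """Return (S_n^V, partial sum of I_k) built from the first n seeds.

    Follows (3.1)-(3.7): the k-th pivot is the first seed falling in
    (L_{k-1}, R_{k-1}); integral(t, x, y) must equal the integral of
    beta(u, t) over u in (x, y). Names are illustrative assumptions.
    """
    n = len(seeds)
    left, right = 0.0, 1.0  # L_0 and R_0
    s_v, s_partial = 0.0, 0.0
    start = 0
    while True:
        # tau_k (0-based here): first index whose seed lies in the interval
        tau = next((i for i in range(start, n) if left < seeds[i] < right), None)
        if tau is None:
            break
        pivot = seeds[tau]
        # S_{n,k}: later seeds in the interval, each compared with the pivot
        s_v += sum(beta(seeds[i], pivot)
                   for i in range(tau + 1, n) if left < seeds[i] < right)
        s_partial += integral(pivot, left, right)  # I_k
        if pivot < alpha:
            left = pivot   # (3.2): L_k = pivot seed, R_k unchanged
        else:
            right = pivot  # (3.3): R_k = pivot seed, L_k unchanged
        start = tau + 1
    return s_v, s_partial
```

For the seeds 0.5, 0.2, 0.8, 0.3, 0.1 with $\alpha = 0.25$, the pivots are 0.5, 0.2, and 0.3, the counted comparisons number $4 + 2 + 0 = 6$, and the partial sum of the $I_k$ is $1 + 0.5 + 0.3 = 1.8$. For a long list of i.i.d. uniform seeds, one would expect the first returned value divided by $n$ to be close to the second.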

Notice that in the case $\beta \equiv 1$ of key comparisons we have $I(t, x, y) \equiv y - x$ and so $I_k = R_{k-1} - L_{k-1}$. In Section 3.2 we show that for $1 \le p < \infty$, $S_n^V / n$ converges in $L^p$ to $S$ as $n \to \infty$ under proper technical conditions. Under a stronger assumption, we will also prove almost sure convergence in Section 3.3.

3.2. Convergence of $S_n^V / n$ in $L^p$ for $1 \le p < \infty$. Theorem 3.1 is our main result concerning QuickVal. To state the result, we need the following notation, extending that of (3.6):

$$I_p(t, x, y) := \int_x^y \beta^p(u, t)\, du, \qquad I_{p,k} := I_p(U_{\tau_k}, L_{k-1}, R_{k-1}). \tag{3.8}$$

Theorem 3.1. If $1 \le p < \infty$ and

$$\sum_{k \ge 1} (E\, I_{p,k})^{1/p} < \infty, \tag{3.9}$$

then $S_n^V / n$ converges in $L^p$ (and therefore also in probability and in distribution) to $S$ as $n \to \infty$.

Remark 3.2. For $p = 1$, notice that the assumption of Theorem 3.1 only requires that $E\,S < \infty$, which is equivalent to the assertion that $\sum_{k \ge 1} E\, I_k < \infty$.

Proof. We use $\|\cdot\|_p$ to denote $L^p$-norm. As background, we recall that the $L^p$ law of large numbers ($L^p$ LLN) states that for $1 \le p < \infty$ and i.i.d. random variables $\xi_1, \xi_2, \dots$ with finite $L^p$-norm, the sample means $\bar{\xi}_n = n^{-1} \sum_{i=1}^n \xi_i$ converge in $L^p$ to the expectation. To prove this, we may assume with no loss of generality that the expectation is 0, and then the bound

$$P(|\bar{\xi}_n|^p > c) \le c^{-1} E\, |\bar{\xi}_n|^p \le c^{-1} \|\xi_1\|_p^p$$

following from Markov's inequality and the triangle inequality for $L^p$-norm shows that the sequence $(|\bar{\xi}_n|^p)$ is uniformly integrable. So the $L^p$ LLN follows from the better-known strong law of large numbers.

Returning to the setting of the theorem, fix $k$. Conditionally given the quadruple $C_k = (L_{k-1}, R_{k-1}, \tau_k, U_{\tau_k})$, the random variables $U_i$ with $i > \tau_k$ are i.i.d. uniform$(0, 1)$. By the $L^p$ LLN we have [using the convention $0/0 = 0$ for $S_{n,k}/(n - \tau_k)$ when $n = \tau_k$]

$$E\left[ \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \,\Big|\, C_k \right] \xrightarrow{\text{a.s.}} 0 \quad \text{as } n \to \infty \tag{3.10}$$

since, with $U$ uniformly distributed and independent of all the $U_i$'s,

$$E[\mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta(U, U_{\tau_k}) \mid C_k] = I_k. \tag{3.11}$$

For our conditional application of the $L^p$ LLN in (3.10), it is sufficient to assume only that the probabilistic source and the cost function $\beta \ge 0$ are such that $I_{p,k}$ is a.s. finite, and this clearly holds by (3.9).
Our next goal is to show that the left side of (3.10) is dominated by a single random variable (depending on the fixed value of $k$) with finite expectation, and then we will apply the dominated convergence theorem. For every $n$, using the convexity of $x^p$ for $x > 0$ we obtain

$$E\left[ \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \,\Big|\, C_k \right] \le 2^{p-1} \left( E\left[ \left( \frac{S_{n,k}}{n - \tau_k} \right)^p \,\Big|\, C_k \right] + I_k^p \right).$$

We claim that each of the two terms multiplying $2^{p-1}$ on the right here is bounded by $I_{p,k}$. First, using the triangle inequality for conditional $L^p$-norm given $C_k$, the fact that the random variables summed to obtain $S_{n,k}$ are conditionally i.i.d. given $C_k$, and the definition (3.8) of $I_{p,k}$, we can bound the $p$th root of the first term by

$$\left\{ E\left[ \left( \frac{S_{n,k}}{n - \tau_k} \right)^p \,\Big|\, C_k \right] \right\}^{1/p} \le \frac{1}{n - \tau_k} \sum_{i:\, \tau_k < i \le n} \left\{ E\left[ \mathbf{1}(L_{k-1} < U_i < R_{k-1})\, \beta^p(U_i, U_{\tau_k}) \mid C_k \right] \right\}^{1/p} = \left\{ E\left[ \mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta^p(U, U_{\tau_k}) \mid C_k \right] \right\}^{1/p} = I_{p,k}^{1/p} \tag{3.12}$$

with $U$ as at (3.11). For the second term we observe that $[I_k / (R_{k-1} - L_{k-1})]^p$ is the $p$th power of the absolute value of a uniform average and so is bounded by the corresponding uniform average of absolute values of $p$th powers, namely, $I_{p,k} / (R_{k-1} - L_{k-1})$; thus

$$I_k^p \le (R_{k-1} - L_{k-1})^{p-1} I_{p,k} \le I_{p,k}. \tag{3.13}$$

So we conclude that

$$E\left[ \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \,\Big|\, C_k \right] \le 2^p I_{p,k}.$$

Thus it follows from $E\, I_{p,k} < \infty$ [which follows from (3.9)] and the dominated convergence theorem that

$$E \left| \frac{S_{n,k}}{n - \tau_k} - I_k \right|^p \to 0 \quad \text{as } n \to \infty. \tag{3.14}$$

Next, we will show from (3.14) that, for each $k$,

$$E \left| \frac{S_{n,k}}{n} - I_k \right|^p \to 0 \quad \text{as } n \to \infty \tag{3.15}$$

by proving that

$$d_{n,k} \equiv d_{p,n,k} := E \left| \frac{S_{n,k}}{n} - \frac{S_{n,k}}{n - \tau_k} \right|^p = E\left[ \left( \frac{\tau_k}{n} \right)^p \left( \frac{S_{n,k}}{n - \tau_k} \right)^p \right]$$

vanishes in the limit as $n \to \infty$. Indeed, the corresponding conditional expectation given $C_k$ is

$$\mathbf{1}(\tau_k < n) \left( \frac{\tau_k}{n} \right)^p E\left[ \left( \frac{S_{n,k}}{n - \tau_k} \right)^p \,\Big|\, C_k \right] \le \mathbf{1}(\tau_k < n) \left( \frac{\tau_k}{n} \right)^p I_{p,k},$$

recalling the inequality (3.12). So again using $E\, I_{p,k} < \infty$ and applying the dominated convergence theorem we find that $d_{n,k} \to 0$, as desired.

Finally, we show that $S_n^V / n$ converges to $S$ in $L^p$. Since we have termwise $L^p$-convergence of $S_n^V / n$ to $S$ by (3.15), the triangle inequality for $L^p$-norm and the dominated convergence theorem for sums imply that $S_n^V / n$ converges in $L^p$ to $S$ provided we can find a summable sequence $b_k$ such that

$$\max\left\{ \sup_{n \ge 1} \left\| \frac{S_{n,k}}{n} \right\|_p, \ \|I_k\|_p \right\} \le b_k.$$

But, for any $n \ge 1$, we have [by taking $p$th powers in (3.12), then taking expectations, then taking $p$th roots]

$$\left\| \frac{S_{n,k}}{n} \right\|_p \le \left\| \frac{S_{n,k}}{n - \tau_k} \right\|_p \le (E\, I_{p,k})^{1/p}.$$

Further, $\|I_k\|_p \le (E\, I_{p,k})^{1/p}$ follows from (3.13). Finally, $b_k := (E\, I_{p,k})^{1/p}$ is assumed to be summable. Thus $S_n^V / n$ converges to $S$ in $L^p$.

Remark 3.3. Letting $K_n$ denote the number of key comparisons required by QuickVal$(n, \alpha)$, we find from Theorem 3.1 with $\beta \equiv 1$ that $K_n / n$ converges in $L^p$ ($1 \le p < \infty$) to

$$K := \sum_{k=0}^{\infty} (R_k - L_k).$$

(In Section 3.4, we will explicitly show the required condition that $E\,K < \infty$; see Remark 3.9.) Suppose $\alpha = 0$; then the number of key comparisons $K_n$ for QuickVal$(n, \alpha)$ is the same as for QuickMin. In this case Theorem 3.1 gives

$$\frac{K_n}{n} \xrightarrow{L^p} K = 1 + \sum_{k \ge 1} U_{\tau_k} \tag{3.16}$$

for $1 \le p < \infty$. The limiting random variable $K$ has mean 2 and the same so-called Dickman distribution as the perpetuity

$$1 + \sum_{k \ge 1} U_1 \cdots U_k. \tag{3.17}$$

That (3.16)–(3.17) holds is well known (e.g., Mahmoud et al. [12], Hwang and Tsai [9]).

3.3. Almost Sure Convergence of $S_n^V / n$. Under a tameness assumption, we can also show that $S_n^V / n$ converges to $S$ almost surely. (Recall Definition 2.4.)

Theorem 3.4. Suppose that the cost $\beta$ is $\epsilon$-tamed for some $\epsilon < 1/4$. Then $S_n^V / n$ defined at (3.5) converges to $S$ almost surely.

Before proving this theorem, we establish three lemmas bounding various quantities of interest.

Lemma 3.5. For any $p > 0$ and $k \ge 1$, we have

$$E (R_k - L_k)^p \le \left( \frac{2 - 2^{-p}}{p + 1} \right)^k.$$

Here note that for all $p > 0$ we have

$$0 < \frac{2 - 2^{-p}}{p + 1} < 1. \tag{3.18}$$
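The constant appearing in Lemma 3.5 and in (3.18) equals $E[\max\{U, 1 - U\}^p]$ for $U$ uniform on $(0, 1)$, which is the rescaled form of the quantity bounded in the proof of the lemma. The short Python sketch below evaluates the constant and compares it with a midpoint-rule approximation of that expectation; the function names and the quadrature are choices made here for illustration only.

```python
def contraction_factor(p):
    """The constant (2 - 2^{-p}) / (p + 1) from Lemma 3.5."""
    return (2.0 - 2.0 ** (-p)) / (p + 1.0)


def mean_max_power(p, steps=10000):
    """Midpoint-rule approximation of E[max(U, 1 - U)^p], U uniform(0, 1)."""
    h = 1.0 / steps
    midpoints = ((i + 0.5) * h for i in range(steps))
    return h * sum(max(x, 1.0 - x) ** p for x in midpoints)
```

For example, `contraction_factor(1.0)` is 3/4, so by the lemma the expected length of the interval $(L_k, R_k)$ is at most $(3/4)^k$.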

Proof. Fix $p > 0$ and $k \ge 1$. Since $R_0 - L_0 = 1$, it is sufficient to prove that

$$E[(R_k - L_k)^p \mid L_{k-1}, R_{k-1}] \le \frac{2 - 2^{-p}}{p + 1} (R_{k-1} - L_{k-1})^p.$$

Condition on $(L_{k-1}, R_{k-1})$; then with $U$ uniformly distributed over $(L_{k-1}, R_{k-1})$ we have the stochastic inequality

$$R_k - L_k \le_{\mathrm{st}} \max\{U - L_{k-1}, R_{k-1} - U\}.$$

Thus for $L_{k-1} \le R_{k-1}$, with $A_{k-1} := (L_{k-1} + R_{k-1})/2$, we have

$$E[(R_k - L_k)^p \mid L_{k-1}, R_{k-1}] \le E[(\max\{U - L_{k-1}, R_{k-1} - U\})^p \mid L_{k-1}, R_{k-1}] = (R_{k-1} - L_{k-1})^{-1} \left[ \int_{L_{k-1}}^{A_{k-1}} (R_{k-1} - u)^p\, du + \int_{A_{k-1}}^{R_{k-1}} (u - L_{k-1})^p\, du \right] = \frac{2 - 2^{-p}}{p + 1} (R_{k-1} - L_{k-1})^p,$$

as desired.

Lemma 3.6. Suppose that the cost $\beta$ is tamed with parameters $\epsilon$ and $c$. Then for any interval $(a, b) \subseteq (0, 1)$, any $t \in (a, b)$, and any $0 \le q < 1/\epsilon$, we have

$$\int_a^b \beta^q(u, t)\, du \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon} (b - a)^{1 - q\epsilon}.$$

Proof. Using the tameness assumption, integration immediately gives

$$\int_a^b \beta^q(u, t)\, du \le \frac{c^q}{1 - q\epsilon} \left[ (t - a)^{1 - q\epsilon} + (b - t)^{1 - q\epsilon} \right].$$

The lemma now follows from the concavity of $x^{1 - q\epsilon}$ for $x > 0$.

The next lemma is a simple consequence of the preceding two.

Lemma 3.7. Suppose that the cost $\beta$ is tamed with parameters $\epsilon < 1$ and $c$. Then for any $k \ge 1$ and any $q > 0$, we have

$$E\, I_k^q \le \left( \frac{2^\epsilon c}{1 - \epsilon} \right)^q \left( \frac{2 - 2^{-q(1 - \epsilon)}}{q(1 - \epsilon) + 1} \right)^{k-1},$$

and so $\sum_k E\, I_k^q < \infty$ geometrically quickly.

Proof. Recalling

$$I_k = \int_{L_{k-1}}^{R_{k-1}} \beta(u, U_{\tau_k})\, du,$$

we find from Lemma 3.6 that

$$I_k \le \frac{2^\epsilon c}{1 - \epsilon} (R_{k-1} - L_{k-1})^{1 - \epsilon}.$$

By application of Lemma 3.5 we thus obtain the desired bound on $E\, I_k^q$. The series-convergence assertion follows from the observation (3.18).

Now we prove Theorem 3.4.

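Proof of Theorem 3.4. Clearly it suffices to show that
\[ \frac{S_n^V}{n} - \frac{\widetilde{S}_n}{n} \xrightarrow{\mathrm{a.s.}} 0 \tag{3.19} \]
and
\[ \frac{\widetilde{S}_n}{n} - S \xrightarrow{\mathrm{a.s.}} 0, \tag{3.20} \]
where
\[ \widetilde{S}_n := \sum_{k \ge 1} (n - \tau_k)^+ I_k. \]
We tackle (3.20) first and then (3.19).

By the monotone convergence theorem, $\widetilde{S}_n / n \uparrow S$ almost surely. But from Lemma 3.7 (using only $\epsilon < 1$) we have
\[ E\, S = \sum_{k \ge 1} E\, I_k < \infty, \]
which implies that $S < \infty$ almost surely. Hence (3.20) follows.

Our proof of (3.19) both is inspired by and follows along the same lines as the fourth-moment proof of the strong law of large numbers described in Ross [19, Chapter 8]; as in that proof, we prefer easy calculations involving fourth moments to more difficult ones involving tail probabilities, perhaps at the expense that the value 1/4 in the statement of Theorem 3.4 could be raised by more sophisticated arguments.

For (3.19) it suffices to show that, for any $\delta > 0$,
\[ P\Big( \Big| \frac{S_n^V}{n} - \frac{\widetilde{S}_n}{n} \Big| > \delta \ \text{i.o.} \Big) = 0, \]
for which it is sufficient by the first Borel–Cantelli lemma and Markov's inequality to show that
\[ \sum_{n \ge 1} E\Big( \frac{S_n^V}{n} - \frac{\widetilde{S}_n}{n} \Big)^4 < \infty. \tag{3.21} \]
Here, by the triangle inequality for the $L^4$ norm,
\[ E\Big( \frac{S_n^V}{n} - \frac{\widetilde{S}_n}{n} \Big)^4 \le \Big[ \sum_{k \ge 1} \Big\| \frac{S_{n,k}}{n} - \frac{(n - \tau_k)^+}{n} I_k \Big\|_4 \Big]^4 = \Big[ \sum_{k \ge 1} \Big\| \frac{(n - \tau_k)^+}{n} \Big( \frac{S_{n,k}}{n - \tau_k} - I_k \Big) \Big\|_4 \Big]^4, \tag{3.22} \]
where we again use the convention $0/0 = 0$ for $S_{n,k}/(n - \tau_k)$ when $n = \tau_k$.

As in the proof of Theorem 3.1, we let $C_k$ denote the quadruple $(L_{k-1}, R_{k-1}, \tau_k, U_{\tau_k})$. Also we define
\[ \tilde{I}_k := \mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta(U, U_{\tau_k}) \]
and
\[ M_m(k) := E[(\tilde{I}_k - I_k)^m \mid C_k], \]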

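where $U$ is unif$(0, 1)$ and independent of $C_k$. Then routine calculation (see Ross [19, Section 8.4]) shows that
\[ E\Big[ \frac{(n - \tau_k)^+}{n} \Big( \frac{S_{n,k}}{n - \tau_k} - I_k \Big) \Big]^4 = E\, E\Big[ \Big\{ \frac{(n - \tau_k)^+}{n} \Big( \frac{S_{n,k}}{n - \tau_k} - I_k \Big) \Big\}^4 \,\Big|\, C_k \Big] \]
\[ = E\Big\{ \frac{[(n - \tau_k)^+]^4}{n^4} \cdot \frac{(n - \tau_k)^+ M_4(k) + 3 (n - \tau_k)^+ (n - \tau_k - 1)^+ M_2^2(k)}{[(n - \tau_k)^+]^4} \Big\} \]
\[ \le E\big\{ n^{-4} [\, n M_4(k) + 3 n (n - 1) M_4(k) \,] \big\} \le 3 n^{-2}\, E\, M_4(k), \tag{3.23} \]
where the first inequality holds because $M_4(k) \ge M_2^2(k)$. We will show that $E\, M_4(k)$ decays geometrically and then use that fact to prove (3.21). Since $(a - b)^4 \le 8(a^4 + b^4)$ for any real $a$ and $b$, we have
\[ M_4(k) \le 8 \big( E[\tilde{I}_k^4 \mid C_k] + I_k^4 \big). \tag{3.24} \]
First, using Lemma 3.7 we find (using only $\epsilon < 1$) that $E\, I_k^4 < \infty$ decays geometrically:
\[ E\, I_k^4 \le \Big( \frac{2^{\epsilon} c}{1 - \epsilon} \Big)^4 \Big( \frac{2 - 2^{-4(1-\epsilon)}}{5 - 4\epsilon} \Big)^{k-1}. \tag{3.25} \]
Now we analyze, in similar fashion, $E[\tilde{I}_k^4 \mid C_k]$ in (3.24). Using the assumption $0 < \epsilon < 1/4$ and Lemma 3.6 we find
\[ E[\tilde{I}_k^4 \mid C_k] \le \frac{2^{4\epsilon} c^4}{1 - 4\epsilon}\, (R_{k-1} - L_{k-1})^{1 - 4\epsilon}. \]
Applying Lemma 3.5 thus gives the geometric decay
\[ E\, \tilde{I}_k^4 \le \frac{2^{4\epsilon} c^4}{1 - 4\epsilon} \Big( \frac{2 - 2^{-(1 - 4\epsilon)}}{2 - 4\epsilon} \Big)^{k-1}. \tag{3.26} \]
Therefore, it follows from (3.22)–(3.23) and (3.25)–(3.26) that (3.21) holds:
\[ \sum_{n \ge 1} E\Big( \frac{S_n^V}{n} - \frac{\widetilde{S}_n}{n} \Big)^4 \le 3 \sum_{n \ge 1} n^{-2} \Big[ \sum_{k \ge 1} (E\, M_4(k))^{1/4} \Big]^4 < \infty. \]
This completes the proof of Theorem 3.4.

3.4. Computation of $E\,S$: an integral expression. In this section we derive the following simple double-integral expression for $E\,S$ in terms of the cost function $\beta$.

Theorem 3.8. For any symmetric cost function $\beta \ge 0$ we have
\[ E\, S = 2 \int_{0 < u < t < 1} \beta(u, t)\, [(\alpha \vee t) - (\alpha \wedge u)]^{-1} \, du \, dt. \]

Proof. Recall that $E\, S = \sum_{k \ge 1} E\, I_k$, where
\[ I_k = \int_{L_{k-1}}^{R_{k-1}} \beta(u, U_{\tau_k}) \, du. \]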

Recall also that, for each $k$, the conditional distribution of $U_{\tau_k}$ given $L_{k-1}$ and $R_{k-1}$ is uniform$(L_{k-1}, R_{k-1})$. Thus
\[ E\, I_k = E\, (R_{k-1} - L_{k-1})^{-1} \int_{L_{k-1}}^{R_{k-1}} \int_{L_{k-1}}^{R_{k-1}} \beta(u, w) \, dw \, du \]
\[ = \int_{0 < w, u < 1} \beta(w, u)\, E\big[ (R_{k-1} - L_{k-1})^{-1}\, \mathbf{1}(L_{k-1} < u, w < R_{k-1}) \big] \, dw \, du \]
\[ = 2 \int_{0 < w < u < 1} \beta(w, u) \int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, \mathbf{1}(x < w < u < y)\, P(L_{k-1} \in dx,\, R_{k-1} \in dy) \, dw \, du. \]
Hence
\[ E\, S = 2 \int_{0 < w < u < 1} \beta(w, u) \int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, \mathbf{1}(x < w < u < y)\, \nu(dx, dy) \, dw \, du, \tag{3.27} \]
where $\nu$ is the measure
\[ \nu(dx, dy) := \sum_{k \ge 0} P(L_k \in dx,\, R_k \in dy). \tag{3.28} \]
As established in the Appendix in Proposition A.1, one has the tractable expression
\[ \nu(dx, dy) = \delta_0(dx)\, \delta_1(dy) + (1 - x)^{-1} dx\, \delta_1(dy) + \delta_0(dx)\, y^{-1} dy + 2 (y - x)^{-2} \, dx \, dy. \]
Using this last expression, routine calculation shows that, for $0 < w < u < 1$,
\[ \int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, \mathbf{1}(x < w < u < y)\, \nu(dx, dy) = [(\alpha \vee u) - (\alpha \wedge w)]^{-1}. \tag{3.29} \]
Substitute (3.29) into (3.27) to complete the proof of the theorem.

Remark 3.9. We now let $\beta \equiv 1$ and use Theorem 3.8 to analyze the expectation of the number $K_n$ of key comparisons required by QuickVal$(n, \alpha)$. Then the expected value in Theorem 3.8 is
\[ 2 \int_{0 < u < t < 1} [(\alpha \vee t) - (\alpha \wedge u)]^{-1} \, du \, dt = 2 [1 - \alpha \ln \alpha - (1 - \alpha) \ln(1 - \alpha)] < \infty. \tag{3.30} \]
It follows by (3.30) that for $\alpha = 0$ we have $\lim_n E\, K_n / n = 2$, which is well known since $K_n$ in this case represents the number of key comparisons required by QuickMin applied to a file of $n$ keys (e.g., Mahmoud et al. [12]). Thus we are now able to conclude that for any $\alpha$ ($0 \le \alpha \le 1$), $E\, K_n / n$ converges to the simple constant in (3.30). Also notice that we have verified the hypothesis of Theorem 3.1 for $p = 1$ (see also 3.2) by (3.30), as we promised in Remark 3.3 that we would.
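As an informal sanity check on (3.30), one can simulate the limiting random variable $K = \sum_{k \ge 0} (R_k - L_k)$ of Remark 3.3 directly. The Python sketch below is illustrative only and is not part of the paper: the function names, the truncation tolerance, and the seed are arbitrary choices, and the update rule (a pivot drawn uniformly from $(L_{k-1}, R_{k-1})$ replaces $L$ if it falls below $\alpha$ and replaces $R$ otherwise) is our reading of the QuickVal recursion, consistent with the conditional distribution used in the Appendix.

```python
import math
import random


def limit_constant(alpha):
    """Right-hand side of (3.30), using the convention 0 * ln 0 = 0."""
    def xlogx(x):
        return x * math.log(x) if x > 0.0 else 0.0
    return 2.0 * (1.0 - xlogx(alpha) - xlogx(1.0 - alpha))


def sample_K(alpha, rng, tol=1e-12):
    """One draw of K = sum_{k>=0} (R_k - L_k), truncated once R_k - L_k <= tol."""
    lo, hi = 0.0, 1.0
    total = 0.0
    while hi - lo > tol:
        total += hi - lo                   # contributes R_k - L_k
        u = lo + (hi - lo) * rng.random()  # pivot seed, uniform on (lo, hi)
        if u < alpha:
            lo = u                         # pivot below alpha: raise lower bound
        else:
            hi = u                         # pivot above alpha: lower upper bound
    return total


def estimate_mean_K(alpha, draws=20000, seed=12345):
    """Monte Carlo estimate of E K for a given alpha (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return sum(sample_K(alpha, rng) for _ in range(draws)) / draws


if __name__ == "__main__":
    for alpha in (0.0, 0.3, 0.5):
        print(alpha, limit_constant(alpha), estimate_mean_K(alpha))
```

For $\alpha = 0$ the closed form equals 2, matching the mean of the Dickman-type limit in (3.16); for $\alpha = 1/2$ it equals $2 + 2 \ln 2$. The simulated means should be close to these values up to Monte Carlo error.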

3.5. Computation of $E\,S$: a series expression. We now restrict to the cost function $\beta_{\mathrm{symb}}$ and use Theorem 3.8 to derive a series expression for $E\,S$. In the notation of Section 2.1, we have
\[ \tfrac{1}{2}\, E\, S = \sum_{w \in \Sigma^*} \int_{T_w} [(\alpha \vee t) - (\alpha \wedge u)]^{-1} \, du \, dt, \]
which is easily obtained by noting that for $u < t$ we have
\[ \beta(u, t) = \sum_{w \in \Sigma^*} \mathbf{1}(a_w < u < t < b_w). \tag{3.31} \]
Define
\[ J(w) := \int_{T_w} [(\alpha \vee t) - (\alpha \wedge u)]^{-1} \, du \, dt. \]
Then routine calculation shows that
\[ J(w) = \tfrac{1}{2}\, p_w\, L\Big( \frac{|\alpha - \mu_w|}{p_w} \Big). \]
Thus
\[ E\, S = \sum_{w \in \Sigma^*} p_w\, L\Big( \frac{|\alpha - \mu_w|}{p_w} \Big). \tag{3.32} \]

4. Analysis of QuickQuant

Following some preliminaries in Section 4.1, in Section 5 we show that a suitably defined $S_n^Q/n$ converges in $L^p$ to $S$ for $1 \le p < \infty$ provided that the cost function $\beta$ is $\epsilon$-tamed with $\epsilon < 1/p$; hence $S_n^Q/n$ and $S_n^V/n$ have the same limiting distribution provided only that the cost function $\beta$ is $\epsilon$-tamed for suitably small $\epsilon$. Granting that result for the moment, we can now relate three of the results obtained in Section 3 to previously known results reviewed in Section 2.2. From Remark 3.3 we recover the result of [7, Theorem 8] (in a cosmetically different, but equivalent, form; compare [6, Theorem 3]) for the limiting distribution of the number of key comparisons, and from Remark 3.9 we recover first-moment information for the same. Finally, recalling that $L^1$-convergence implies convergence of means, from (3.32) we recover at least the lead-order term in the asymptotics of [20] discussed at (2.3).

4.1. Preliminaries. We will closely follow the framework described in Section 3 for the analysis of QuickVal and construct a random variable, call it $S_n^Q$, that has the distribution of the total cost required by QuickQuant when applied to a file of $n$ keys. Our goal is to show that, under suitable technical conditions, $S_n^Q/n$ converges in $L^p$ to $S$ defined at (3.7).

Again, we define $S_n^Q$ in terms of an infinite sequence $(U_i)_{i \ge 1}$ of seeds that are i.i.d. uniform$(0, 1)$. Let $m_n$ (with $m_n/n \to \alpha$) denote our target rank for QuickQuant. Let $\tau_k(n)$ denote the index of the seed that corresponds to the $k$th pivot. As in Section 3.1 we will set the first pivot index $\tau_1(n)$ to 1 rather than to a randomly chosen integer from $\{1, \ldots, n\}$. For $k \ge 1$, we will use $L_{k-1}(n)$ and $R_{k-1}(n)$, as defined below, to denote the lower and upper bounds, respectively, of seeds of words that are eligible to be compared with the $k$th pivot. [Notice that $\tau_k(n)$, $L_k(n)$, and $R_k(n)$ are analogous to $\tau_k$, $L_k$, and $R_k$ defined in Section 3.1; see (3.1)–(3.3).] Hence we let $L_0(n) := 0$ and $R_0(n) := 1$, and for $k \ge 1$ we inductively define
\[ \tau_k(n) := \inf\{ i \le n : L_{k-1}(n) < U_i < R_{k-1}(n) \}, \]

and
\[ L_k(n) := \mathbf{1}(\mathrm{pivrank}_k(n) \le m_n)\, U_{\tau_k(n)} + \mathbf{1}(\mathrm{pivrank}_k(n) > m_n)\, L_{k-1}(n), \]
\[ R_k(n) := \mathbf{1}(\mathrm{pivrank}_k(n) \ge m_n)\, U_{\tau_k(n)} + \mathbf{1}(\mathrm{pivrank}_k(n) < m_n)\, R_{k-1}(n) \]
if $\tau_k(n) < \infty$ but
\[ (L_k(n), R_k(n)) := (L_{k-1}(n), R_{k-1}(n)) \]
if $\tau_k(n) = \infty$. Here $\mathrm{pivrank}_k(n)$ denotes the rank of the $k$th pivot seed $U_{\tau_k(n)}$ if $\tau_k(n) < \infty$ and $m_n$ otherwise. Recall that the infimum of the empty set is $\infty$; hence $\tau_k(n) = \infty$ if and only if $L_{k-1}(n) = R_{k-1}(n)$. Using this notation, let
\[ S^Q_{n,k} := \sum_{i:\ \tau_k(n) < i \le n} \mathbf{1}(L_{k-1}(n) < U_i < R_{k-1}(n))\, \beta(U_i, U_{\tau_k(n)}) \]
be the total cost of all comparisons (for the first $n$ keys) with the $k$th pivot key. Then
\[ S_n^Q := \sum_{k \ge 1} S^Q_{n,k} \tag{4.1} \]
has the distribution of the total cost required by QuickQuant. Notice that the expression (4.1) is analogous to (3.5). In fact, we will prove the $L^p$-convergence of $S_n^Q/n$ to $S$ by comparing the corresponding expressions for QuickVal and QuickQuant.

5. Convergence of $S_n^Q/n$ in $L^p$ for $1 \le p < \infty$

The following is our main theorem regarding QuickQuant.

Theorem 5.1. Let $1 \le p < \infty$. Suppose that the cost function $\beta$ is $\epsilon$-tamed with $\epsilon < 1/p$. Then $S_n^Q/n$ converges in $L^p$ to $S$.

Remark 5.2. Professor Fill's comments: Note that as $p$ increases, getting $L^p$-convergence requires the increasingly stronger condition $\epsilon < 1/p$. (I don't think that's surprising.) So we have convergence of moments of *all* orders provided the source is $\gamma$-tamed for *every* $\gamma > 0$; for example, if $\pi_k$ decreases geometrically quickly in $k$, as is true for memoryless and most Markov sources.

The proof of Theorem 5.1 will make use of the following analogue of Lemma 3.5, whose proof is essentially the same and therefore omitted.

Lemma 5.3. For any $p > 0$ and $k \ge 1$ and $n \ge 1$, we have
\[ E(R_k(n) - L_k(n))^p \le \Big( \frac{2 - 2^{-p}}{p + 1} \Big)^{k}. \]

Proof of Theorem 5.1. Part of our strategy in proving this theorem is to compare QuickQuant with QuickVal. Hence we will frequently refer to the notation established in Section 3.1 for the analysis of QuickVal.

For each $k$, observe that as $n \to \infty$ we have
\[ \tau_k(n) \xrightarrow{\mathrm{a.s.}} \tau_k, \qquad U_{\tau_k(n)} \xrightarrow{\mathrm{a.s.}} U_{\tau_k}, \qquad L_k(n) \xrightarrow{\mathrm{a.s.}} L_k, \qquad R_k(n) \xrightarrow{\mathrm{a.s.}} R_k, \]

where $\tau_k$, $L_k$, and $R_k$ are defined in Section 3.1 [see (3.1)–(3.3)]. (In fact, in each of these four cases of convergence, the left-hand side almost surely becomes equal to its limit for all sufficiently large $n$.) Thus for each $k \ge 1$ we have
\[ \frac{S^Q_{n,k}}{n} - \frac{S_{n,k}}{n} \xrightarrow{\mathrm{a.s.}} 0, \tag{5.1} \]
where $S_{n,k}$ is defined at (3.4); indeed, again the difference almost surely vanishes for all sufficiently large $n$. In proving Theorem 3.1, we showed [at (3.15)] that
\[ \frac{S_{n,k}}{n} \xrightarrow{L^p} I_k, \]
where $I_k$ is defined at (3.6), and it is somewhat easier (by means of conditional application of the strong law of large numbers, rather than the $L^p$ law of large numbers, together with Fubini's theorem) to show that
\[ \frac{S_{n,k}}{n} \xrightarrow{\mathrm{a.s.}} I_k. \tag{5.2} \]
Combining (5.1) and (5.2), for each $k \ge 1$ we have
\[ \frac{S^Q_{n,k}}{n} \xrightarrow{\mathrm{a.s.}} I_k. \tag{5.3} \]
What we want to show is that
\[ \frac{S_n^Q}{n} = \sum_{k \ge 1} \frac{S^Q_{n,k}}{n} \xrightarrow{L^p} \sum_{k \ge 1} I_k = S. \tag{5.4} \]
Choose any sequence $(a_k)$ of positive numbers summing to 1, and let $A$ be the probability measure on the positive integers with this probability mass function. Then, once again using the fact that the $p$th power of the absolute value of an average is bounded by the average of $p$th powers of absolute values,
\[ \Big| \frac{S_n^Q}{n} - S \Big|^p = \Big| \sum_{k \ge 1} a_k \cdot a_k^{-1} \Big( \frac{S^Q_{n,k}}{n} - I_k \Big) \Big|^p \le \sum_{k \ge 1} a_k \cdot a_k^{-p} \Big| \frac{S^Q_{n,k}}{n} - I_k \Big|^p. \]
So for (5.4) it suffices to prove that, with respect to the product probability $P \times A$, as $n \to \infty$ the sequence
\[ a_k^{-p} \Big| \frac{S^Q_{n,k}}{n} - I_k \Big|^p \]
converges in $L^1$ to 0. What we know from (5.3) is that the sequence converges almost surely with respect to $P \times A$. Now almost sure convergence together with boundedness in $L^{1+\delta}$ are, for any $\delta > 0$, sufficient for convergence in $L^1$ because the boundedness condition implies uniform integrability (e.g., Chung [1, Exercise 4.5.8]). Thus our proof is reduced to showing that, for some $q > p$, the sequence
\[ \sum_{k \ge 1} a_k^{1-q}\, E \Big| \frac{S^Q_{n,k}}{n} - I_k \Big|^q \]

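is bounded in $n$, for a suitably chosen probability mass function $(a_k)$. Indeed, by convexity of $q$th power,
\[ 2^{1-q} \sum_{k \ge 1} a_k^{1-q}\, E \Big| \frac{S^Q_{n,k}}{n} - I_k \Big|^q \le \sum_{k \ge 1} a_k^{1-q}\, E \Big| \frac{S^Q_{n,k}}{n} \Big|^q + \sum_{k \ge 1} a_k^{1-q}\, E\, I_k^q, \tag{5.5} \]
and we will show that each sum on the right-hand side of (5.5) is bounded in order to prove the theorem. The value of $q$ that we use can be any satisfying $\epsilon < 1/q < 1/p$.

First we recall from Lemma 3.7 that
\[ E\, I_k^q \le \Big( \frac{2^{\epsilon} c}{1 - \epsilon} \Big)^{q} \Big( \frac{2 - 2^{-q(1-\epsilon)}}{q(1-\epsilon) + 1} \Big)^{k-1}, \quad k \ge 1, \tag{5.6} \]
with geometric decay. Thus the second sum on the right in (5.5) is finite if the cost is $\epsilon$-tamed with $\epsilon < 1$ and the sequence $(a_k)$ is suitably chosen not to decay too quickly.

Next we analyze $E |S^Q_{n,k}/n|^q$ for the first sum on the right in (5.5). Let
\[ \nu_{k-1}(n) := \big| \{ i : L_{k-1}(n) < U_i < R_{k-1}(n),\ \tau_k(n) < i \le n \} \big|. \]
Until further notice our calculations are done only over the event $\{\nu_{k-1}(n) > 0\}$. Then, bounding the $q$th power of the absolute value of an average by the average of $q$th powers of absolute values,
\[ \Big| \frac{S^Q_{n,k}}{n} \Big|^q = \Big| \frac{1}{\nu_{k-1}(n)} \sum_{i:\ L_{k-1}(n) < U_i < R_{k-1}(n)} \mathbf{1}(\tau_k(n) < i \le n)\, \beta(U_i, U_{\tau_k(n)}) \Big|^q \Big( \frac{\nu_{k-1}(n)}{n} \Big)^q \]
\[ \le \frac{1}{\nu_{k-1}(n)} \sum_{i:\ L_{k-1}(n) < U_i < R_{k-1}(n)} \mathbf{1}(\tau_k(n) < i \le n)\, \beta^q(U_i, U_{\tau_k(n)}) \Big( \frac{\nu_{k-1}(n)}{n} \Big)^q. \tag{5.7} \]
Let $D_k(n)$ denote the quintuple $(L_{k-1}(n), R_{k-1}(n), \tau_k(n), U_{\tau_k(n)}, \nu_{k-1}(n))$, and notice that, conditionally given $D_k(n)$, the $\nu_{k-1}(n)$ values $U_i$ appearing in (5.7) are i.i.d. unif$(L_{k-1}(n), R_{k-1}(n))$. Using (5.7), we bound the conditional expectation of $|S^Q_{n,k}/n|^q$ given $D_k(n)$. We have
\[ E\Big[ \Big| \frac{S^Q_{n,k}}{n} \Big|^q \,\Big|\, D_k(n) \Big] \le [R_{k-1}(n) - L_{k-1}(n)]^{-1} \int_{L_{k-1}(n)}^{R_{k-1}(n)} \beta^q(u, U_{\tau_k(n)}) \, du\ \Big( \frac{\nu_{k-1}(n)}{n} \Big)^q. \tag{5.8} \]
Under $\epsilon$-tameness of $\beta$ with $\epsilon < 1/q$, we find from Lemma 3.6 that
\[ \int_{L_{k-1}(n)}^{R_{k-1}(n)} \beta^q(u, U_{\tau_k(n)}) \, du \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, [R_{k-1}(n) - L_{k-1}(n)]^{1 - q\epsilon}. \tag{5.9} \]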

From (5.8)–(5.9), it follows that if $\epsilon < 1/q$, then
\[ E\Big[ \Big| \frac{S^Q_{n,k}}{n} \Big|^q \,\Big|\, D_k(n) \Big] \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, [R_{k-1}(n) - L_{k-1}(n)]^{q - q\epsilon} \Big( \frac{\nu_{k-1}(n)}{n\, (R_{k-1}(n) - L_{k-1}(n))} \Big)^q. \]
Until this point we have worked only over the event $\{\nu_{k-1}(n) > 0\}$, but now we enlarge our scope to the event $\{L_{k-1}(n) < R_{k-1}(n)\}$ and note that the preceding inequality holds there, as well. Next notice that, conditionally given the triple $\widetilde{D}_k(n) := (L_{k-1}(n), R_{k-1}(n), \tau_k(n))$, the values $U_i$ with $\tau_k(n) < i \le n$ are i.i.d. unif$(0, 1)$, and so the number of them falling in the interval $(L_{k-1}(n), R_{k-1}(n))$ is distributed binomial$(m, t)$ with $m = n - \tau_k(n)$ and $t = R_{k-1}(n) - L_{k-1}(n)$, and so has (representing a binomial as a sum of independent Bernoulli random variables and applying the triangle inequality for $L^q$) moment of order $q$ bounded by $m^q t$. Thus
\[ E\Big[ \Big( \frac{\nu_{k-1}(n)}{n\, (R_{k-1}(n) - L_{k-1}(n))} \Big)^q \,\Big|\, \widetilde{D}_k(n) \Big] \le [R_{k-1}(n) - L_{k-1}(n)]^{1 - q}, \]
so that
\[ E\Big[ \Big| \frac{S^Q_{n,k}}{n} \Big|^q \,\Big|\, \widetilde{D}_k(n) \Big] \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, [R_{k-1}(n) - L_{k-1}(n)]^{1 - q\epsilon}. \]
Since this inequality holds even when $L_{k-1}(n) = R_{k-1}(n)$, we can take expectations to conclude
\[ E \Big| \frac{S^Q_{n,k}}{n} \Big|^q \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon}\, E[R_{k-1}(n) - L_{k-1}(n)]^{1 - q\epsilon} \le \frac{2^{q\epsilon} c^q}{1 - q\epsilon} \Big( \frac{2 - 2^{-(1 - q\epsilon)}}{2 - q\epsilon} \Big)^{k-1}, \tag{5.10} \]
where at the second inequality we have employed Lemma 5.3.

From (5.6) and (5.10) we see that we can choose $(a_k)$ to be the geometric distribution $a_k = (1 - \theta)\theta^{k-1}$, $k \ge 1$, with
\[ \frac{2 - 2^{-q(1-\epsilon)}}{q(1-\epsilon) + 1} < \theta < 1. \]
We then conclude that
\[ \sum_{k \ge 1} a_k^{1-q}\, E \Big| \frac{S^Q_{n,k}}{n} - I_k \Big|^q \]
is bounded in $n$, and therefore that $S_n^Q/n$ converges to $S$ in $L^p$, if the cost function is $\epsilon$-tamed with $\epsilon < 1/p$.

A. Appendix: A tractable expression for the measure $\nu$

The purpose of this appendix is to prove the following proposition used in the computation of $E\,S$ in Section 3.4.

Proposition A.1. With $(L_k, R_k)$ defined at (3.2)–(3.3) as the interval of values eligible to be compared with the $k$th pivot chosen by QuickVal, and with
\[ \nu(dx, dy) := \sum_{k \ge 0} P(L_k \in dx,\, R_k \in dy) \]

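as defined at (3.28), we have
\[ \nu(dx, dy) = \delta_0(dx)\, \delta_1(dy) + (1 - x)^{-1} dx\, \delta_1(dy) + \delta_0(dx)\, y^{-1} dy + 2 (y - x)^{-2} \, dx \, dy. \]

Proof. To begin, since $L_0 := 0$ and $R_0 := 1$ we have
\[ P(L_0 \in dx,\, R_0 \in dy) = \delta_0(dx)\, \delta_1(dy), \tag{A.1} \]
where $\delta_z$ denotes the probability measure concentrated at $z$. Now assume $k \ge 1$. If $0 \le \lambda < \alpha < \rho \le 1$, then
\[ P(L_k \in dx,\, R_k \in dy \mid L_{k-1} = \lambda,\, R_{k-1} = \rho) = \delta_\rho(dy)\, \mathbf{1}(\lambda < x < \alpha)\, (\rho - \lambda)^{-1} dx + \delta_\lambda(dx)\, \mathbf{1}(\alpha < y < \rho)\, (\rho - \lambda)^{-1} dy. \]
Hence
\[ P(L_k \in dx,\, R_k \in dy) = \int \big[ \delta_\rho(dy)\, \mathbf{1}(\lambda < x < \alpha)\, (\rho - \lambda)^{-1} dx + \delta_\lambda(dx)\, \mathbf{1}(\alpha < y < \rho)\, (\rho - \lambda)^{-1} dy \big]\, P(L_{k-1} \in d\lambda,\, R_{k-1} \in d\rho). \tag{A.2} \]
We can infer [and inductively prove using (A.2)] that, for $k \ge 1$,
\[ P(L_k \in dx,\, R_k \in dy) = \delta_1(dy)\, f_k(x)\, dx + \delta_0(dx)\, g_k(y)\, dy + h_k(x, y)\, dx\, dy, \tag{A.3} \]
where
\[ f_1(x) = \mathbf{1}(0 \le x < \alpha), \qquad g_1(y) = \mathbf{1}(\alpha < y \le 1), \qquad h_1(x, y) = 0, \]
and, for $k \ge 2$,
\[ f_k(x) = \mathbf{1}(0 \le x < \alpha) \int \mathbf{1}(0 \le \lambda < x)\, (1 - \lambda)^{-1} f_{k-1}(\lambda) \, d\lambda, \tag{A.4} \]
\[ g_k(y) = \mathbf{1}(\alpha < y \le 1) \int \mathbf{1}(y < \rho \le 1)\, \rho^{-1} g_{k-1}(\rho) \, d\rho, \tag{A.5} \]
\[ h_k(x, y) = \mathbf{1}(0 \le x < \alpha < y \le 1) \Big[ (1 - x)^{-1} f_{k-1}(x) + y^{-1} g_{k-1}(y) + \int \mathbf{1}(0 \le \lambda < x)\, (y - \lambda)^{-1} h_{k-1}(\lambda, y) \, d\lambda + \int \mathbf{1}(y < \rho \le 1)\, (\rho - x)^{-1} h_{k-1}(x, \rho) \, d\rho \Big]. \tag{A.6} \]
Henceforth suppose $0 \le x < \alpha < y \le 1$. From (A.5) we obtain
\[ g_k(y) = \frac{(-\ln y)^{k-1}}{(k-1)!}, \quad k \ge 1, \tag{A.7} \]
whence
\[ \sum_{k \ge 1} g_k(y) = y^{-1}. \tag{A.8} \]
By recognizing symmetry between (A.4) and (A.5), we also find
\[ f_k(x) = \frac{[-\ln(1 - x)]^{k-1}}{(k-1)!}, \quad k \ge 1, \tag{A.9} \]
and so
\[ \sum_{k \ge 1} f_k(x) = (1 - x)^{-1}. \tag{A.10} \]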
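The sums (A.8) and (A.10) are exponential series in $-\ln y$ and $-\ln(1 - x)$, respectively, and this is easy to confirm numerically. The short Python sketch below is illustrative only (the function names and the number of terms are arbitrary choices, not part of the paper); it sums the terms given by (A.7) and (A.9) and compares the partial sums with $y^{-1}$ and $(1 - x)^{-1}$.

```python
import math


def g(k, y):
    """g_k(y) from (A.7), for alpha < y <= 1 and k >= 1."""
    return (-math.log(y)) ** (k - 1) / math.factorial(k - 1)


def f(k, x):
    """f_k(x) from (A.9), for 0 <= x < alpha and k >= 1."""
    return (-math.log(1.0 - x)) ** (k - 1) / math.factorial(k - 1)


def partial_sum(term, arg, terms=60):
    """Sum term(k, arg) over k = 1, ..., terms."""
    return sum(term(k, arg) for k in range(1, terms + 1))


if __name__ == "__main__":
    y, x = 0.2, 0.7
    print(partial_sum(g, y), 1.0 / y)          # compare with (A.8)
    print(partial_sum(f, x), 1.0 / (1.0 - x))  # compare with (A.10)
```

Sixty terms are far more than needed for arguments bounded away from 0 and 1; closer to those endpoints the logarithms grow and more terms would be required for the same accuracy.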

In order to compute $\sum_{k \ge 1} h_k(x, y)$, we consider the generating function
\[ H(x, y, z) := \sum_{k \ge 1} h_k(x, y)\, z^k. \tag{A.11} \]
From (A.6),
\[ H(x, y, z) = z \Big[ (1 - x)^{-1} \sum_{k \ge 1} f_k(x)\, z^k + y^{-1} \sum_{k \ge 1} g_k(y)\, z^k + \int_0^x (y - \lambda)^{-1} H(\lambda, y, z) \, d\lambda + \int_y^1 (\rho - x)^{-1} H(x, \rho, z) \, d\rho \Big]. \tag{A.12} \]
Using this integral equation, we will show via a series of lemmas culminating in Lemma A.10 that
\[ H(x, y) := H(x, y, 1) = \sum_{k \ge 1} h_k(x, y) \tag{A.13} \]
equals $2 (y - x)^{-2}$. Combining equations (A.3), (A.8), (A.10), and (A.13), we obtain the desired expression for $\nu$.

Throughout the remainder of this appendix, whenever we refer to $H(x, y)$ we tacitly suppose that $0 \le x < \alpha < y \le 1$.

Lemma A.2. $H(x, y) < \infty$ almost everywhere.

Proof. We revisit Remarks 3.3 and 3.9 and consider the number of key comparisons required by QuickVal$(n, \alpha)$. As shown at (3.30), we have $E\, S < \infty$ in this case. On the other hand, with $\beta \equiv 1$, from (3.27), (A.1), (A.3), and (A.8)–(A.10), we have
\[ E\, S = 2 \int_{0 < w < u < 1} \Big[ 1 + \int_{0 \le x < \alpha} (1 - x)^{-2}\, \mathbf{1}(x < w) \, dx + \int_{\alpha < y \le 1} y^{-2}\, \mathbf{1}(y > u) \, dy + \int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, \mathbf{1}(x < w < u < y)\, H(x, y) \, dx \, dy \Big] \, dw \, du. \]
Thus $H(x, y) < \infty$ almost everywhere.

The next lemma establishes monotonicity properties of $H(x, y)$.

Lemma A.3. $H(x, y)$ is increasing in $x$ and decreasing in $y$.

Proof. For each $k \ge 1$, we see from (A.9) that $f_k(x)$ is increasing in $x$ and from (A.7) that $g_k(y)$ is decreasing in $y$. Since $h_1 \equiv 0$, it follows by induction on $k$ from (A.6) that $h_k(x, y)$ is increasing in $x$ and decreasing in $y$ for each $k$. Thus $H(x, y) = \sum_{k \ge 1} h_k(x, y)$ enjoys the same monotonicity properties.

Lemma A.4. $H(x, y) < \infty$ for all $x$ and $y$.

Proof. This is immediate from Lemmas A.2–A.3.

Lemma A.5. The generating function $H(x, y, z)$ at (A.11) is (with $h_0 :\equiv 0$) the unique power-series solution $\widetilde{H}(x, y, z) = \sum_{k \ge 0} \widetilde{h}_k(x, y)\, z^k$ (in $0 \le z \le 1$) to the integral equation (A.12) such that $0 \le \widetilde{h}_k(x, y) \le h_k(x, y)$ for all $k$, $x$, $y$.

Proof. We have already seen that $H$ is such a solution. Conversely, if $\widetilde{H}$ is such a solution, then equating coefficients of $z^k$ in the integral equation [which is valid because we know by Lemma A.4 that $H(x, y, z)$, and hence also $\widetilde{H}(x, y, z)$, is finite for $0 \le z \le 1$] we find that the functions $\widetilde{h}_k(x, y)$ satisfy $\widetilde{h}_k \equiv 0$ for $k = 0, 1$ and the recurrence relation (A.6) for $k \ge 2$. It then follows by induction that $\widetilde{h}_k(x, y) = h_k(x, y)$ for all $k$, $x$, $y$.

Next we let $H_0(x, y, z) :\equiv 0$ and, for $0 \le z \le 1$, inductively define $H_n(x, y, z)$ by applying successive substitutions to the integral equation (A.12); that is, for each $n \ge 1$ we define
\[ H_n(x, y, z) := z \Big[ (1 - x)^{-1} \sum_{k \ge 1} f_k(x)\, z^k + y^{-1} \sum_{k \ge 1} g_k(y)\, z^k + \int_0^x (y - \lambda)^{-1} H_{n-1}(\lambda, y, z) \, d\lambda + \int_y^1 (\rho - x)^{-1} H_{n-1}(x, \rho, z) \, d\rho \Big]. \tag{A.14} \]
Let $[z^k]\, H_n(x, y, z)$ denote the coefficient of $z^k$ in $H_n(x, y, z)$.

Lemma A.6. For each $k \ge 1$, $[z^k]\, H_n(x, y, z)$ is nondecreasing in $n \ge 0$.

Proof. The inequality $[z^k]\, H_n(x, y, z) \ge [z^k]\, H_{n-1}(x, y, z)$ is proved easily by induction on $n \ge 1$.

According to the next lemma, $H$ dominates each $H_n$.

Lemma A.7. For all $n \ge 0$ and $k \ge 1$ we have
\[ 0 \le [z^k]\, H_n(x, y, z) \le h_k(x, y). \tag{A.15} \]

Proof. Lemma A.6 establishes the first inequality, and the second is proved easily by induction on $n$.

Lemmas A.5–A.7 lead to the following lemma:

Lemma A.8. For $0 \le x < \alpha < y \le 1$ and $0 \le z \le 1$ we have $H_n(x, y, z) \uparrow H(x, y, z)$ as $n \uparrow \infty$.

Proof. Recalling Lemmas A.6–A.7, define $\widetilde{H}(x, y, z)$ to be the power series in $z$ with coefficient of $z^k$ equal to $\widetilde{h}_k(x, y) := \lim_{n \uparrow \infty} [z^k]\, H_n(x, y, z)$, which satisfies $0 \le \widetilde{h}_k(x, y) \le h_k(x, y)$. On the other hand, $\widetilde{H}$ satisfies the integral equation (A.12) by applying the monotone convergence theorem to (A.14). Thus it follows from Lemma A.5 that $\widetilde{H} = H$. Finally, another application of the monotone convergence theorem shows that $H(x, y, z) = \lim_{n \uparrow \infty} H_n(x, y, z)$.

Our next lemma, when combined with the preceding one, immediately leads to inequality in one direction in (A.13).

Lemma A.9. For $0 \le x < \alpha < y \le 1$ and all $n \ge 0$, $H_n(x, y, 1) \le 2 (y - x)^{-2}$.

Proof. We will prove this lemma by induction on $n$, starting with $H_0(x, y) = 0 \le 2 (y - x)^{-2}$.


lim za n n = z lim a n n. Lecture 6 Sequeces ad Series Defiitio 1 By a sequece i a set A, we mea a mappig f : N A. It is customary to deote a sequece f by {s } where, s := f(). A sequece {z } of (complex) umbers is said to be coverget

More information
