Outline for Today. A simple and lightning fast hash table implementation. Why the degree of independence matters.

Transcription

1 Linear Probing

2 Outline for Today. Linear Probing Hashing: a simple and lightning fast hash table implementation. Analyzing Linear Probing: why the degree of independence matters. Fourth Moment Bounds: another approach for estimating frequencies.

3 Hashing Strategies. All hash table implementations need to address what happens when collisions occur. Common strategies: Closed addressing: store all elements with hash collisions in a secondary data structure (linked list, BST, etc.). Perfect hashing: choose hash functions to ensure that collisions don't happen, and rehash or move elements when they do. Open addressing: allow elements to leak out from their preferred position and spill over into other positions. Linear probing is an example of open addressing. We'll see a type of perfect hashing (cuckoo hashing) on Thursday.

4 Linear Probing. Linear probing is a simple open-addressing hashing strategy. To insert an element x, compute h(x) and try to place x there. If that spot is occupied, keep moving through the array, wrapping around at the end, until a free spot is found.
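
The insertion rule can be sketched in a few lines of Python. This is only an illustration: the function name `insert`, the use of `None` for a free slot, and the identity hash `h` are assumptions made for the example, not something the slides prescribe.

```python
def insert(table, key, h):
    """Insert key into an open-addressed table; None marks a free slot."""
    m = len(table)
    start = h(key) % m
    for step in range(m):
        slot = (start + step) % m          # wrap around at the end of the array
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot                    # report where the key ended up
    raise RuntimeError("table is full")


table = [None] * 8
h = lambda k: k                            # hypothetical hash: identity, reduced mod 8
print([insert(table, k, h) for k in (3, 11, 19, 7, 15)])   # prints [3, 4, 5, 7, 0]
```

Keys 3, 11, and 19 all have home slot 3, so they spill into slots 4 and 5; key 15 has home slot 7, which is taken, so it wraps around to slot 0.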

7 Linear Probing. To look up an element x, compute h(x) and start looking there. Move around the ring until either the element is found or a blank spot is detected. (We'll assume the load factor prohibits us from inserting so many elements that there are no free spaces.)
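
A lookup follows the same probe sequence as an insertion. The sketch below assumes a table stored as a Python list with `None` for blank slots and a hypothetical identity hash; both are choices made for the example only.

```python
def lookup(table, key, h):
    """Return the slot holding key, or None if key is not in the table."""
    m = len(table)
    start = h(key) % m
    for step in range(m):
        slot = (start + step) % m
        if table[slot] is None:            # blank spot: key cannot be further along
            return None
        if table[slot] == key:
            return slot
    return None                            # went all the way around a full table


# A table in which 3, 11, 19 share home slot 3 and 7, 15 share home slot 7 (h(k) = k mod 8).
table = [15, None, None, 3, 11, 19, None, 7]
h = lambda k: k
print(lookup(table, 19, h), lookup(table, 15, h), lookup(table, 27, h))   # prints 5 0 None
```

The search for 27 starts at slot 3, walks past 3, 11, and 19, and stops at the blank slot 6.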

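11 Linear Probing. Deletions are a bit trickier than in chained hashing. We cannot just do a search and remove the element where we find it. Why? Clearing the slot can leave a blank in the middle of a run of occupied slots, so a later lookup for an element stored past that blank would stop early and wrongly report the element missing.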

14 Linear Probing. Deletions are often implemented using tombstones. When removing an element, mark that the cell is empty and was previously occupied. When doing a lookup, don't stop at a tombstone. Instead, keep the search going. You need to watch out for wraparounds. When inserting, feel free to replace any tombstone you encounter.
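
The tombstone rules above can be sketched as a small Python class. The class name `LinearProbingSet`, the sentinel `TOMBSTONE`, and the method names are assumptions made for this illustration; a production table would also resize and periodically clear out tombstones, which this sketch omits.

```python
EMPTY = None               # never-used slot (so None itself cannot be stored as a key)
TOMBSTONE = object()       # sentinel: empty now, but previously occupied


class LinearProbingSet:
    def __init__(self, capacity, h=hash):
        self.slots = [EMPTY] * capacity
        self.h = h

    def _probe(self, key):
        m = len(self.slots)
        start = self.h(key) % m
        for step in range(m):              # at most one full trip around the ring
            yield (start + step) % m

    def contains(self, key):
        for i in self._probe(key):
            if self.slots[i] is EMPTY:     # a true blank ends the search
                return False
            if self.slots[i] is not TOMBSTONE and self.slots[i] == key:
                return True                # tombstones are skipped, not stopped at
        return False

    def insert(self, key):
        if self.contains(key):
            return
        for i in self._probe(key):
            if self.slots[i] is EMPTY or self.slots[i] is TOMBSTONE:
                self.slots[i] = key        # reuse the first blank or tombstone
                return
        raise RuntimeError("table is full")

    def remove(self, key):
        for i in self._probe(key):
            if self.slots[i] is EMPTY:
                return False
            if self.slots[i] is not TOMBSTONE and self.slots[i] == key:
                self.slots[i] = TOMBSTONE
                return True
        return False
```

The `_probe` generator visits each slot at most once, which is one simple way to handle the wraparound case the slide warns about.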

17 Linear Probing in Practice. In practice, linear probing is one of the fastest general-purpose hashing strategies available. This is surprising: it was originally invented in 1954! It's pretty amazing that it still holds up so well. Why is this? Low memory overhead: just need an array and a hash function. Excellent locality: when collisions occur, we only search in adjacent locations in the array. Great cache performance: a combination of the above two factors.

18 The Weakness. Linear probing exhibits severe performance degradations when the load factor gets high. The number of collisions tends to grow as a function of the number of existing collisions. This is called primary clustering.
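
One way to see primary clustering is to simulate it. The sketch below is an illustration under a strong assumption: every key gets a uniformly random home slot (an idealized, truly random hash). The function name and parameters are made up for the example.

```python
import random


def average_insert_probes(m, load, seed=0):
    """Fill m slots up to the given load factor using uniformly random home
    slots, and return the mean number of slots examined per insertion."""
    assert 0 < load < 1                    # keep at least one free slot
    rng = random.Random(seed)
    occupied = [False] * m
    n = int(load * m)
    total = 0
    for _ in range(n):
        slot = rng.randrange(m)            # idealized random hash value
        probes = 1
        while occupied[slot]:              # walk forward until a free slot appears
            slot = (slot + 1) % m
            probes += 1
        occupied[slot] = True
        total += probes
    return total / n


for load in (0.1, 0.5, 0.9):
    print(load, average_insert_probes(4096, load))
```

The printed averages should climb sharply as the load factor approaches 1, since long runs are more likely to be hit and grow by one each time they are.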

19 So how fast is linear probing?

20 Time-Out for Announcements!

21 Final Project Topics. Final project topics have been assigned, and we're really excited to see what you end up making! We recommend that you make slow and steady progress on the project over the next couple of weeks. We'll work out a presentation schedule in a week or so.

22 Problem Sets. Problem Set Four is due this Thursday at 2:30PM. Have questions? Stop by office hours or ask on Piazza! We're working on grading PS3 right now and will try to get it back to you soon. PS5 will go out on Thursday and will be due one week from this Thursday. And that's it!

23 Later This Week. Keith will be out of town through the end of the week. Rafa and Mitchell will be covering Keith's office hours at the regular time (2PM to 4PM) in the Huang Basement. Sam will be giving Thursday's lecture on cuckoo hashing (super interesting stuff!)

25 GTGTC Exec Applications. Girls Teaching Girls to Code (GTGTC) is looking for people to serve on next year's executive committee. This is an excellent program that's been around for years. It's a great way to make an impact. Interested? Apply here by this Sunday.

26 Back to CS166!

27 Analyzing Linear Probing

28 You probably saw an analysis of chained hash tables in CS161. What makes linear probing different, interesting, or noteworthy?

29 Why Linear Probing is Different. In chained hashing, collisions only occur when two values have exactly the same hash code. In linear probing, collisions can occur between elements with entirely different hash codes. To analyze linear probing, we need to know more than just how many elements collide with us. (Figure callout: the lookup time here is huge even though this key only directly collides with one other.)

30 Some Brief History. In 1954, Gene Amdahl, Elaine McGraw, and Arthur Samuel invented linear probing as a subroutine for an assembler. In 1962, Don Knuth, in his first ever analysis of an algorithm, proved that linear probing takes expected time $O(1)$ for lookups if the hash function is truly random ($n$-wise independence). In 1995, Schmidt and Siegel proved that $O(\log n)$-independent hash functions guarantee fast performance for linear probing, but noted that such hash functions either take a long time to evaluate or require a lot of space. In 2006, Anna Pagh et al. proved that 5-independent hash functions give expected constant-time lookups. (This is the analysis we'll see today.) These hash functions can be stored in $O(1)$ space and evaluated in $O(1)$ time. In 2007, Mitzenmacher and Vadhan proved that 2-independence will give expected $O(1)$-time lookups, assuming there's some measure of randomness in the keys. In 2010, Pătrașcu and Thorup proved that 5-independence is the minimum independence needed for adversarially-chosen keys.

31 The Analysis!

32 For simplicity, let's assume a load factor of $\alpha = 1/3$. A region of size $m$ is a consecutive set of $m$ locations in the hash table. An element $x$ hashes to region $R$ if $h(x) \in R$, though $x$ may not be placed in $R$. In expectation, a region of size $2^s$ should have at most $\frac{1}{3} \cdot 2^s$ elements hash to it. It would be very unlucky if a region had twice as many elements in it as expected. A region of size $2^s$ is overloaded if at least $\frac{2}{3} \cdot 2^s$ elements hash to it. Intuition: if an element ends up far from its home location, then some large region near its home has to be overloaded. (Figure callout: this element is far from home.)
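
The definition of an overloaded region can be made concrete with a small check. In this sketch, `homes` is a hypothetical list of home slots $h(x)$ for the stored keys and a region is given by its starting slot; both are conventions chosen for the example.

```python
from fractions import Fraction

ALPHA = Fraction(1, 3)                     # load factor assumed in the analysis


def hashed_into(homes, start, size, m):
    """Count keys whose home slot lies in the region [start, start + size) mod m."""
    return sum(1 for hx in homes if (hx - start) % m < size)


def is_overloaded(homes, start, s, m):
    """A region of size 2**s is overloaded if at least (2/3) * 2**s keys hash to it,
    i.e. at least twice the expected count alpha * 2**s."""
    size = 2 ** s
    return hashed_into(homes, start, size, m) >= 2 * ALPHA * size


m = 24
homes = [5, 5, 6, 6, 7, 8, 20, 21]         # 8 keys in 24 slots: load factor 1/3
print(is_overloaded(homes, 4, 3, m))       # slots 4..11 receive 6 keys >= 16/3 -> True
print(is_overloaded(homes, 16, 3, m))      # slots 16..23 receive 2 keys -> False
```

Note that the count is over home slots, not over where the keys finally land, matching the remark that $x$ may hash to $R$ without being placed in $R$.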

33 Theorem: The probability that an element $x_a$ ends up between $2^s$ and $2^{s+1}$ steps from its home location is upper-bounded by $c \cdot \Pr[\text{the region of size } 2^s \text{ centered on } h(x_a) \text{ is overloaded}]$ for some fixed constant $c$ independent of $s$. Proof: Set up some cleverly-chosen ranges over the hash table and use the pigeonhole principle. See Thorup's lecture notes.

34 Analyzing the Runtime. The cost of looking up some key $x_a$ is bounded from above by the length of the run containing $x_a$. The expected cost of performing a lookup is therefore at most $O(1) \cdot \sum_{s=0}^{\log n} 2^{s+1} \cdot \Pr[x_a \text{ is between } 2^s \text{ and } 2^{s+1} \text{ spots from home}]$. The previous theorem tells us that this cost is at most $O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[\text{the region of size } 2^s \text{ centered on } h(x_a) \text{ is overloaded}]$. If we can determine the probability that a region of size $2^s$ is overloaded, we'll have a bound on the expected lookup cost for $x_a$.

35 Overloaded Regions. Recall: a region is a contiguous span of table slots, and we've chosen $\alpha = 1/3$. An overloaded region has at least $\frac{2}{3} \cdot 2^s$ elements in it. Let the random variable $B_s$ represent the number of keys that hash into the block of size $2^s$ centered on $h(x_a)$. We want to know $\Pr[B_s \ge \frac{2}{3} \cdot 2^s]$. Assuming our hash functions are at least 2-independent, we have $E[B_s] = \frac{1}{3} \cdot 2^s$. Then the above quantity is equivalent to $\Pr[B_s \ge 2E[B_s]]$, and looking up an element takes, in expectation, time $O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[B_s \ge 2E[B_s]]$.

36 Concentration Inequalities. The expression $\Pr[B_s \ge 2E[B_s]]$ seems like a perfect case to try to use a concentration bound, like we did last Thursday. Knowing nothing about $B_s$ other than the fact that it's nonnegative, we could start off by trying to use Markov's inequality: $\Pr[X \ge c] \le E[X] / c$. Using what we have: $\Pr[B_s \ge 2E[B_s]] \le E[B_s] / (2E[B_s]) = 1/2$. That's a pretty weak bound. What does that do to our analysis?

37 A Runtime Bound. The expected cost of looking up $x_a$ in a linear probing table is $O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[B_s \ge 2E[B_s]]$. Assuming 2-independent hashing, this is
$$O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[B_s \ge 2E[B_s]] \;\le\; O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \tfrac{1}{2} \;=\; O(1) \cdot \sum_{s=0}^{\log n} 2^{s-1} \;=\; O(n).$$
This bound is not at all useful. We're going to need to do better than this!

38 Concentration Inequalities. This analysis used Markov's inequality without any additional knowledge about $B_s$. $B_s$ is the number of elements that hash into the block of size $2^s$ near $h(x_a)$. What does that tell us? Let $X_{is}$ be an indicator variable that's 1 if $x_i$ hashes into the region of size $2^s$ centered on $h(x_a)$ and 0 otherwise. Then we can write $B_s = \sum_{i=1}^{n} X_{is}$. Notice that $E[B_s] = E[\sum_{i=1}^{n} X_{is}] = \sum_{i=1}^{n} E[X_{is}]$.

39 Chernoff Bounds. Last time, we saw the Chernoff bound, which says that if $X \sim \mathrm{Binom}(n, p)$ and $p < 1/2$, then $\Pr[X > n/2] < e^{-n(1/2 - p)^2 / (2p)}$. We just saw that our variable $B_s$ is the sum of a number of Bernoulli variables $X_{is}$, so it seems like we might be able to apply Chernoff bounds here. Problem: these $X_{is}$ variables are not independent of one another! We know $h$ is $k$-independent and we know what $h(x_a)$ is. So any other group of $k-1$ hashes are independent, but not all of them. Therefore, $B_s$ is not binomially distributed, so we can't use a Chernoff bound.

40 Chebyshev's Inequality. The last remaining bound that we used last time was Chebyshev's inequality, which states that $\Pr[|X - E[X]| \ge c] \le \mathrm{Var}[X] / c^2$. If we can determine $\mathrm{Var}[B_s]$, then we can try using Chebyshev's inequality to bound the probability that $B_s$ is too large.

41 The Variance.
$$\begin{aligned}
\mathrm{Var}[B_s] &= \mathrm{Var}\Big[\sum_{i=1}^{n} X_{is}\Big] \\
&= \sum_{i=1}^{n} \mathrm{Var}[X_{is}] \\
&\le \sum_{i=1}^{n} E[X_{is}^2] \\
&= \sum_{i=1}^{n} E[X_{is}] \\
&= E\Big[\sum_{i=1}^{n} X_{is}\Big] = E[B_s]
\end{aligned}$$
Second line: assume, going forward, that the $X_{is}$'s are pairwise independent. We're already conditioning on knowing $h(x_a)$. This means that we need our hash function to be at least 3-independent from this point onward. Third line: a standard technique we saw last time, use the fact that $\mathrm{Var}[Z] \le E[Z^2]$. Fourth line: another standard technique we saw last time, if $Z$ is an indicator variable, then $Z^2 = Z$. More generally: if $X$ is a sum of pairwise independent indicator variables, then $\mathrm{Var}[X] \le E[X]$.
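
As a sanity check on the claim that pairwise independence is enough for the variance to add, here is a tiny exact computation. The choice of variables is an assumption made for the example: $X_1$ and $X_2$ are fair bits and $X_3 = X_1 \oplus X_2$, which are pairwise independent but not mutually independent.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes (X1, X2, X3) with X3 = X1 XOR X2.
outcomes = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]


def expectation(f):
    """Exact expectation of f(outcome) under the uniform distribution on outcomes."""
    return sum(Fraction(f(o)) for o in outcomes) / len(outcomes)


mean = expectation(sum)                                        # E[X1 + X2 + X3]
variance = expectation(lambda o: (sum(o) - mean) ** 2)         # Var[X1 + X2 + X3]
sum_of_variances = 3 * Fraction(1, 4)                          # each fair bit has variance 1/4

print(mean, variance, sum_of_variances)                        # prints 3/2 3/4 3/4
```

The variance of the sum equals the sum of the variances even though the three bits are not mutually independent, and it stays below the mean as the slide's inequality says.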

45 Using Chebyshev. We want to know $\Pr[B_s \ge 2E[B_s]] = \Pr[B_s - E[B_s] \ge E[B_s]]$. Using Chebyshev's inequality:
$$\Pr[B_s - E[B_s] \ge E[B_s]] \;\le\; \Pr[|B_s - E[B_s]| \ge E[B_s]] \;\le\; \frac{\mathrm{Var}[B_s]}{E[B_s]^2} \;\le\; \frac{E[B_s]}{E[B_s]^2} \;=\; \frac{1}{E[B_s]} \;=\; 3 \cdot 2^{-s}.$$

46 A Better Bound. The expected cost of looking up $x_a$ in a linear probing table is $O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[B_s \ge 2E[B_s]]$. Assuming 3-independent hashing, this is
$$O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[B_s \ge 2E[B_s]] \;\le\; O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot 3 \cdot 2^{-s} \;=\; O(1) \cdot \sum_{s=0}^{\log n} 3 \;=\; O(\log n).$$
Theorem: This runtime bound is tight (there's an adversarial choice of a 3-independent hash function that degrades the runtime to this level).

47 Why This Works. Key idea: increasing the degree of independence lets us control the variance of the distribution. With 2-independent hashing, we use one degree of independence to condition on knowing where some specific key lands. At that point, we only have one more degree of independence, which is not enough to control the variance! With 3-independent hashing, we use one degree of independence to condition on knowing where the key lands. We can then use the two remaining degrees of independence to control the variance and use Chebyshev's inequality. Small increases to the independence of a hash function can dramatically tighten concentration bounds.

48 Question: If we increase the degree of independence further, can we constrain the spread of the elements in a way that improves our runtime? (This is the theory version of "can we do better?")

49 Generalizing Variance. The variance of a random variable $X$ is defined as $\mathrm{Var}[X] = E[(X - E[X])^2]$. We can generalize this to higher exponents. The fourth central moment of $X$, denoted $\mathrm{4th}[X]$, is defined as $\mathrm{4th}[X] = E[(X - E[X])^4]$. Like the variance, $\mathrm{4th}[X]$ measures how likely we are to get far away from $E[X]$. Because of the fourth-power term, $\mathrm{4th}[X]$ is much more sensitive to outliers.

50 Generalizing Chebyshev. The fourth moment inequality states that $\Pr[|X - E[X]| \ge c] \le \mathrm{4th}[X] / c^4$. Proof: Let $X$ be a random variable. Then $\Pr[|X - E[X]| \ge c] = \Pr[(X - E[X])^4 \ge c^4]$. Let $Y = (X - E[X])^4$. Notice that $E[Y] = E[(X - E[X])^4] = \mathrm{4th}[X]$, so via Markov's inequality, we have $\Pr[|X - E[X]| \ge c] = \Pr[Y \ge c^4] \le E[Y] / c^4 = \mathrm{4th}[X] / c^4$. Good question to ponder: why doesn't this work for the third central moment, where $\mathrm{3rd}[X] = E[(X - E[X])^3]$?

51 Generalizing Indicator Variance. Theorem: If $X$ is an indicator variable for the event $\mathcal{E}$, then $\mathrm{4th}[X] \le E[X]$. Proof: $X$ takes on value 1 with probability $\Pr[\mathcal{E}]$ and 0 with probability $1 - \Pr[\mathcal{E}]$. Therefore, we have
$$\begin{aligned}
\mathrm{4th}[X] &= E[(X - E[X])^4] \\
&= (1 - \Pr[\mathcal{E}])^4 \Pr[\mathcal{E}] + \Pr[\mathcal{E}]^4 (1 - \Pr[\mathcal{E}]) \\
&\le (1 - \Pr[\mathcal{E}]^3) \Pr[\mathcal{E}] + \Pr[\mathcal{E}]^4 \\
&= \Pr[\mathcal{E}] - \Pr[\mathcal{E}]^4 + \Pr[\mathcal{E}]^4 \\
&= \Pr[\mathcal{E}] = E[X].
\end{aligned}$$
Read this on your own time, it's cute!
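
The indicator bound is easy to spot-check numerically. The sketch below evaluates the fourth central moment of an indicator straight from the definition using exact rationals; the function name and the grid of probabilities are choices made for this example.

```python
from fractions import Fraction


def fourth_central_moment_indicator(p):
    """4th[X] for an indicator X with Pr[X = 1] = p, straight from the definition:
    X - E[X] equals 1 - p with probability p and -p with probability 1 - p."""
    return (1 - p) ** 4 * p + p ** 4 * (1 - p)


for k in range(11):
    p = Fraction(k, 10)
    assert fourth_central_moment_indicator(p) <= p     # 4th[X] <= E[X] = p

print(fourth_central_moment_indicator(Fraction(1, 2)))  # prints 1/16
```

A finite grid is of course not a proof; it only confirms that the algebra on the slide behaves as claimed at those points.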

52 Updating our Analysis. For linear probing, we're ultimately interested in bounding $\Pr[B_s \ge 2E[B_s]]$ in the case where $B_s$ represents the number of elements hitting a particular block. Using 2-independent hashing, the best bound we could use was Markov's inequality, which gave an extremely weak bound. Using 3-independent hashing, we could use Chebyshev's inequality, which gave an inverse exponential bound. Question: If we use stronger hash functions, can we tighten this bound using the fourth moment inequality?

53 What is $\mathrm{4th}[B_s]$?

54 The Limits of Our Generalization. There's a lovely little expression for $\mathrm{Var}[X]$: $\mathrm{Var}[X] = E[X^2] - E[X]^2$. That's because
$$\begin{aligned}
\mathrm{Var}[X] &= E[(X - E[X])^2] \\
&= E[X^2 - 2X E[X] + E[X]^2] \\
&= E[X^2] - 2E[X]E[X] + E[X]^2 \\
&= E[X^2] - 2E[X]^2 + E[X]^2 \\
&= E[X^2] - E[X]^2.
\end{aligned}$$
We can try this for fourth moments, but, well, um...
$$\begin{aligned}
\mathrm{4th}[X] &= E[(X - E[X])^4] \\
&= E[X^4 - 4X^3 E[X] + 6X^2 E[X]^2 - 4X E[X]^3 + E[X]^4] \\
&= E[X^4] - 4E[X]E[X^3] + 6E[X]^2 E[X^2] - 4E[X]E[X]^3 + E[X]^4 \\
&= E[X^4] - 4E[X]E[X^3] + 6E[X]^2 E[X^2] - 3E[X]^4
\end{aligned}$$
= ¯\_(ツ)_/¯

55 The Fourth Moment. Let's see if we can bound $\mathrm{4th}[B_s]$.
$$\begin{aligned}
\mathrm{4th}[B_s] &= E[(B_s - E[B_s])^4] \\
&= E\Big[\Big(\sum_{i=1}^{n} X_{is} - E\Big[\sum_{i=1}^{n} X_{is}\Big]\Big)^4\Big] \\
&= E\Big[\Big(\sum_{i=1}^{n} (X_{is} - E[X_{is}])\Big)^4\Big] \\
&= E\Big[\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} (X_{is} - E[X_{is}])(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])\Big] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} E[(X_{is} - E[X_{is}])(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])]
\end{aligned}$$
So now we just need to simplify this expression.

56 Increasing our Independence. We now have this lovely expression: $\mathrm{4th}[B_s] = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} E[(X_{is} - E[X_{is}])(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])]$. Recall: if our hash function is $k$-independent, then we've already used one degree of independence conditioning on knowing where $h(x_a)$ is. That leaves us with $k-1$ degrees of independence. Let's suppose we're using a 5-independent hash function, meaning that, once we condition on $h(x_a)$, any four other hash values are independent of one another. This allows us to dramatically simplify this expression.
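
The slides do not say how to build a 5-independent hash function. One standard construction, offered here as an assumption rather than as the lecture's method, is a random polynomial of degree 4 over a prime field, reduced to the table size. The prime `P`, the function name `make_poly_hash`, and the parameter names are all hypothetical, and the final `% m` makes the output only approximately uniform over slots.

```python
import random

P = (1 << 61) - 1          # the Mersenne prime 2**61 - 1; keys are assumed to lie in [0, P)


def make_poly_hash(k, m, rng=random):
    """Draw h(x) = (c_0 x^(k-1) + c_1 x^(k-2) + ... + c_(k-1) mod P) mod m
    with k coefficients chosen uniformly at random from [0, P)."""
    coeffs = [rng.randrange(P) for _ in range(k)]

    def h(x):
        acc = 0
        for c in coeffs:                   # Horner's rule: k multiply-add steps
            acc = (acc * x + c) % P
        return acc % m

    return h


h = make_poly_hash(5, 1024, random.Random(166))   # k = 5 for the analysis in this lecture
print(all(0 <= h(x) < 1024 for x in range(1000)))  # prints True
```

For fixed $k = 5$ the function stores five coefficients and does five multiply-add steps per evaluation, which is consistent with the $O(1)$ space and $O(1)$ time mentioned in the history slide.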

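57 Exploring this Summation. The terms of this summation might sometimes range over the same variables at the same time: $\mathrm{4th}[B_s] = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} E[(X_{is} - E[X_{is}])(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])]$. Claim: Any term in the above summation where $X_{is}$ is a different random variable than $X_{js}$, $X_{ks}$, and $X_{ls}$ is zero. Proof: Suppose that $X_{is}$ is a different random variable from the others. Then since $X_{is}$, $X_{js}$, $X_{ks}$, and $X_{ls}$ are independent, we have
$$\begin{aligned}
& E[(X_{is} - E[X_{is}])(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])] \\
&= E[X_{is} - E[X_{is}]] \cdot E[(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])] \\
&= 0 \cdot E[(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])] \\
&= 0.
\end{aligned}$$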

58 Exploring this Summation. The terms of this summation might sometimes range over the same variables at the same time: $\mathrm{4th}[B_s] = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} E[(X_{is} - E[X_{is}])(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])]$. Claim: Every term in this sum is zero except for the following: terms where $i = j = k = l$, and terms where two of $i$, $j$, $k$, and $l$ refer to one value and the other two of $i$, $j$, $k$, and $l$ refer to another. Proof: If a variable appears exactly one time, then by our previous logic the term evaluates to zero. If a variable appears exactly three times, then the other variable appears exactly once and the term evaluates to zero. That leaves behind the two remaining cases here. The sum is therefore
$$\sum_{i=1}^{n} E[(X_{is} - E[X_{is}])^4] + \binom{4}{2} \sum_{i=1}^{n} \sum_{j=i+1}^{n} E[(X_{is} - E[X_{is}])^2 (X_{js} - E[X_{js}])^2].$$
In the second term, the factor $\binom{4}{2}$ answers "which of $i$, $j$, $k$, and $l$ refer to the first value?", the outer sum answers "what's the first value?", and the inner sum answers "what's the second?" (it must be different than the first!). We use $i$ and $j$ as our summation variables, since that's easier to read.

62
$$\begin{aligned}
\mathrm{4th}[B_s] &= \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} E[(X_{is} - E[X_{is}])(X_{js} - E[X_{js}])(X_{ks} - E[X_{ks}])(X_{ls} - E[X_{ls}])] \\
&= \sum_{i=1}^{n} E[(X_{is} - E[X_{is}])^4] + \binom{4}{2} \sum_{i=1}^{n}\sum_{j=i+1}^{n} E[(X_{is} - E[X_{is}])^2 (X_{js} - E[X_{js}])^2] \\
&= \sum_{i=1}^{n} E[(X_{is} - E[X_{is}])^4] + 6 \sum_{i=1}^{n}\sum_{j=i+1}^{n} E[(X_{is} - E[X_{is}])^2] \, E[(X_{js} - E[X_{js}])^2] \\
&= \sum_{i=1}^{n} \mathrm{4th}[X_{is}] + 6 \sum_{i=1}^{n}\sum_{j=i+1}^{n} \mathrm{Var}[X_{is}] \, \mathrm{Var}[X_{js}] \\
&\le \sum_{i=1}^{n} \mathrm{4th}[X_{is}] + 3 \sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{Var}[X_{is}] \, \mathrm{Var}[X_{js}] \\
&= \sum_{i=1}^{n} \mathrm{4th}[X_{is}] + 3 \Big(\sum_{i=1}^{n} \mathrm{Var}[X_{is}]\Big)^2 \\
&= \sum_{i=1}^{n} \mathrm{4th}[X_{is}] + 3 \, \mathrm{Var}[B_s]^2 \\
&\le \sum_{i=1}^{n} E[X_{is}] + 3 E[B_s]^2 \\
&= E[B_s] + 3 E[B_s]^2 \\
&\le 4 E[B_s]^2
\end{aligned}$$
Notes on the steps. Line 3: since $h$ is 5-independent and we're conditioning on just knowing one hash location ($h(x_a)$), these are independent random variables, so the expectation of the product splits. Line 4: the first term is the definition of the fourth central moment, and each factor in the second term is the definition of variance. Line 5: $6 = 3 \cdot 2$, and doubling the sum over $j > i$ gives the sum over $j \ne i$; adding the nonnegative terms with $j = i$ can only increase it. Line 7: $\sum_{i=1}^{n} \mathrm{Var}[X_{is}] = \mathrm{Var}[\sum_{i=1}^{n} X_{is}] = \mathrm{Var}[B_s]$. Line 8: if $X$ is an indicator, then $\mathrm{4th}[X] \le E[X]$, and we know from our 3-independence analysis that $\mathrm{Var}[B_s] \le E[B_s]$. Line 10: this holds as long as $E[B_s] \ge 1$, which we can assume if we're talking about sufficiently large regions.

69 The Net Result. We've just shown that $\mathrm{4th}[B_s] \le 4 \cdot E[B_s]^2$. Phew! That was crazy. But at least we now have a bound on the fourth moment, which lets us use the fourth moment inequality!
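
To see the bound in action, one can compute the fourth central moment exactly in the easiest special case. The sketch below assumes fully independent indicators, so that the count is $\mathrm{Binomial}(n, p)$; that is a stronger assumption than the 5-independence used in the lecture and is made only so the moment can be summed directly. The function name and the values $n = 12$, $p = 1/3$ are illustrative.

```python
from fractions import Fraction
from math import comb


def central_moment(n, p, r):
    """r-th central moment of Binomial(n, p), summed directly over the pmf."""
    mean = n * p
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) * (k - mean) ** r
               for k in range(n + 1))


n, p = 12, Fraction(1, 3)                  # expected count E[B] = 4
mean = n * p
fourth = central_moment(n, p, 4)
print(fourth, mean + 3 * mean ** 2, 4 * mean ** 2)   # prints 184/9 52 64
assert fourth <= mean + 3 * mean ** 2 <= 4 * mean ** 2
```

Here the exact fourth moment, about 20.4, sits comfortably below both $E[B] + 3E[B]^2 = 52$ and $4E[B]^2 = 64$.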

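70 Fourth Moments for Victory. Using the fourth moment inequality:
$$\Pr[B_s \ge 2E[B_s]] = \Pr[B_s - E[B_s] \ge E[B_s]] \;\le\; \frac{\mathrm{4th}[B_s]}{E[B_s]^4} \;\le\; \frac{4 E[B_s]^2}{E[B_s]^4} \;=\; \frac{4}{E[B_s]^2} \;=\; \frac{4}{(\frac{1}{3} \cdot 2^s)^2} \;=\; 36 \cdot 2^{-2s}.$$
Notice that this is exponentially better than our previous bound!

71 A Strong Runtime Bound. The expected cost of looking up $x_a$ in a linear probing table is $O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[B_s \ge 2E[B_s]]$. Assuming 5-independent hashing, this is
$$O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot \Pr[B_s \ge 2E[B_s]] \;\le\; O(1) \cdot \sum_{s=0}^{\log n} 2^{s} \cdot 36 \cdot 2^{-2s} \;=\; O(1) \cdot \sum_{s=0}^{\log n} 36 \cdot 2^{-s} \;=\; O(1).$$
We've finally obtained an $O(1)$ bound on the cost of operations in a linear probing hash table, provided that we use 5-independent hashing!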
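
The three runtime sums can be compared side by side by plugging each tail bound into the same summation. The helper name `lookup_cost_sum` and the choice $\log n = 10$ are assumptions for the example, and the hidden $O(1)$ factor is ignored.

```python
from fractions import Fraction


def lookup_cost_sum(log_n, tail_bound):
    """Evaluate sum_{s=0}^{log n} 2^s * tail_bound(s), ignoring the O(1) factor."""
    return sum(2 ** s * tail_bound(s) for s in range(log_n + 1))


markov = lambda s: Fraction(1, 2)            # 2-independence: Pr <= 1/2
chebyshev = lambda s: Fraction(3, 2 ** s)    # 3-independence: Pr <= 3 * 2^-s
fourth = lambda s: Fraction(36, 4 ** s)      # 5-independence: Pr <= 36 * 2^-2s

log_n = 10                                   # i.e. n = 1024
print(lookup_cost_sum(log_n, markov))        # prints 2047/2  (grows like n)
print(lookup_cost_sum(log_n, chebyshev))     # prints 33      (grows like log n)
print(lookup_cost_sum(log_n, fourth))        # prints 18423/256 (stays below 72 for every n)
```

For small $s$ the Chebyshev and fourth-moment expressions exceed 1 and so say nothing as probabilities; what matters for the asymptotics is how the terms behave as $s$ grows.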

72 What Just Happened? With one degree of independence, we could obtain the expected value and use that to bound the probability with Markov's inequality. Using two degrees of independence, we could obtain the variance and use that to bound the probability with Chebyshev's inequality. Using four degrees of independence, we could obtain the fourth central moment and use that to bound the probability with the fourth moment bound. Increasing the strength of a hash function allows us to obtain more central moments and, therefore, to tighten our bound more than might initially be suspected.

73 More to Explore. Mitzenmacher and Vadhan's paper "Why Simple Hash Functions Work" provides a fundamentally different strategy for analyzing linear probing. Pătrașcu and Thorup's paper on the lower bound for 5-independence gives a glimpse of how you'd argue that these bounds can't be improved.

74 Next Time. Cuckoo Hashing: hashing with worst-case O(1) lookups! The Cuckoo Graph: random graphs for Fun and Profit.


More information

Lecture 4: Unique-SAT, Parity-SAT, and Approximate Counting

Lecture 4: Unique-SAT, Parity-SAT, and Approximate Counting Advaced Complexity Theory Sprig 206 Lecture 4: Uique-SAT, Parity-SAT, ad Approximate Coutig Prof. Daa Moshkovitz Scribe: Aoymous Studet Scribe Date: Fall 202 Overview I this lecture we begi talkig about

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

Math 155 (Lecture 3)

Math 155 (Lecture 3) Math 55 (Lecture 3) September 8, I this lecture, we ll cosider the aswer to oe of the most basic coutig problems i combiatorics Questio How may ways are there to choose a -elemet subset of the set {,,,

More information

The Random Walk For Dummies

The Random Walk For Dummies The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oe-dimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli

More information

Output Analysis and Run-Length Control

Output Analysis and Run-Length Control IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information
