MARKING A BINARY TREE: PROBABILISTIC ANALYSIS OF A RANDOMIZED ALGORITHM
XIANG LI

Abstract. This paper centers on the analysis of a specific randomized algorithm, a basic random process that marks a binary tree, in light of concepts and techniques from probability theory. The first part of the paper rests on an assumption that significantly simplifies the problem and yields the expected number of time steps required by this algorithm; the essential part of that solution is a coupon-collector model that mainly makes use of the geometric probability distribution. The rest of the paper verifies the legitimacy of the assumption with a balls-into-bins model and the Poisson distribution.

Contents
1. Presenting the Problem: Marking a Binary Tree
2. Random Variables, Probability and Expectation
3. The Expectation of Total Time Steps Assuming the Existence of a Bottleneck
4. The Balls-into-bins Model and Poisson Distribution
5. Verifying the Existence of a Bottleneck
6. Conclusion
Acknowledgments
References

1. Presenting the Problem: Marking a Binary Tree

The study of a random process often involves understanding its patterns or mechanisms at a higher level, with formal mathematical proofs developed to support that understanding. A basic random process of marking a binary tree is presented below, and we are particularly interested in the number of time steps required to mark the entire tree.

Consider a complete binary tree with N = 2^n - 1 nodes and depth n. For example, Figure 1 is a binary tree of depth 5. For a particular node, its parent is the node directly connected to it one level above. Its sibling is the node on the same level that shares the same parent. Its two children, if it has any at all, are the two nodes directly connected to it one level below.

Date: August 6, 2018.
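The parent/sibling/children relationships above have a convenient concrete encoding: number the nodes 1 through N level by level, so that node k has children 2k and 2k + 1. The helper functions below sketch this layout; the numbering scheme is an illustrative assumption, not one fixed by the paper.

```python
def parent(k):
    """Parent of node k (k > 1) in the level-by-level numbering."""
    return k // 2

def sibling(k):
    """Sibling of node k (k > 1): left children are even, right children odd."""
    return k + 1 if k % 2 == 0 else k - 1

def children(k, N):
    """Children of node k, or () if k is a leaf of a tree with N nodes."""
    return (2 * k, 2 * k + 1) if 2 * k + 1 <= N else ()

# In a depth-5 tree there are N = 2**5 - 1 = 31 nodes; node 6's relatives:
print(parent(6), sibling(6), children(6, 31))  # 3 7 (12, 13)
```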
Figure 1. A Complete Binary Tree of Depth 5 and Relationships between Nodes

Initially, all nodes are unmarked, and our ultimate goal is to mark the entire tree by the process we shall describe. First we number each node with a unique identifying number in the range {1, 2, ..., N}. At every time step, a number chosen uniformly at random from {1, 2, ..., N} is generated and sent as a signal to mark the node with that identifying number. After the sent node is marked, an infecting process is immediately invoked:

1. If a node and its sibling are marked, its parent is marked.
2. If a node and its parent are marked, its sibling is marked.

The marking rules are always applied recursively as far as possible before the next node is sent. For example, in Figure 2 the marked nodes are filled in. The arrival of the node labeled X allows the remainder of the nodes to be marked as the rules are applied in the sequence of nodes 1, 2, 3.

Figure 2. The Infection Caused by the Arrival of the Marked Node X

Throughout the analysis of this process, the leaf nodes, which are the nodes at the very bottom level, are of particular interest to us. Moreover, we will frequently treat two leaf nodes that are siblings as a pair. The number of pairs of leaf node siblings will be denoted by N_2, which is equal to 2^(n-2). Before diving into the task, we shall review some basic concepts of probability theory.

2. Random Variables, Probability and Expectation

Definition 2.1. A random variable X on a sample space Ω is a real-valued function on Ω, denoted by X : Ω → R.
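A short simulation of the process just described gives an empirical handle on the number of time steps. The sketch below is an illustration, not code from the paper; it assumes the level-by-level numbering in which node k has children 2k and 2k + 1.

```python
import random

def steps_to_mark(n, rng):
    """Run the marking process once on a complete binary tree with
    N = 2**n - 1 nodes (node k has children 2k and 2k + 1) and return
    the number of signals sent until every node is marked."""
    N = 2 ** n - 1
    marked = [False] * (N + 1)  # index 0 unused
    remaining = N

    def mark(v):
        nonlocal remaining
        if marked[v]:
            return
        marked[v] = True
        remaining -= 1
        queue = [v]
        while queue:  # apply the two infection rules as far as possible
            u = queue.pop()
            if u > 1:
                p, s = u // 2, (u + 1 if u % 2 == 0 else u - 1)
                if marked[s] and not marked[p]:    # u and sibling -> parent
                    marked[p] = True; remaining -= 1; queue.append(p)
                if marked[p] and not marked[s]:    # u and parent -> sibling
                    marked[s] = True; remaining -= 1; queue.append(s)
            l, r = 2 * u, 2 * u + 1
            if r <= N and marked[l] != marked[r]:  # u and one child -> other child
                w = r if marked[l] else l
                marked[w] = True; remaining -= 1; queue.append(w)

    steps = 0
    while remaining:
        steps += 1
        mark(rng.randrange(1, N + 1))
    return steps

rng = random.Random(2018)
runs = [steps_to_mark(6, rng) for _ in range(100)]
print(min(runs), sum(runs) / len(runs), max(runs))
```

Averaging `steps_to_mark` over many runs at several depths n gives an empirical estimate of the expected number of time steps that Section 3 computes analytically.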
Definition 2.2. A probability function is any function Pr: F → R that satisfies the following conditions:
1. for any event E, 0 ≤ Pr(E) ≤ 1;
2. for the sample space Ω, Pr(Ω) = 1; and
3. for any finite or countably infinite sequence of pairwise mutually disjoint events E_1, E_2, E_3, ...,
Pr(∪_i E_i) = Σ_i Pr(E_i).

Definition 2.3. The expectation of a discrete random variable X, denoted by E(X), is given by E(X) = Σ_i i · Pr(X = i), where the summation is over all values in the range of X.

Theorem 2.4 (Linearity of expectation). For any finite collection of discrete random variables X_1, X_2, ..., X_n with finite expectations,
E(X_1 + X_2 + ... + X_n) = E(X_1) + E(X_2) + ... + E(X_n).

Proof. We can first prove the case of two random variables and derive from it the general case by induction. Let X and Y be two random variables. Then
E(X + Y) = Σ_x Σ_y (x + y) Pr(X = x, Y = y)
= Σ_x Σ_y x Pr(X = x, Y = y) + Σ_x Σ_y y Pr(X = x, Y = y)
= Σ_x x Σ_y Pr(X = x, Y = y) + Σ_y y Σ_x Pr(X = x, Y = y)
= Σ_x x Pr(X = x) + Σ_y y Pr(Y = y)
= E(X) + E(Y),
where the last equality follows directly from the definition of expectation and all the summations are over the ranges of the corresponding random variables.

Definition 2.5. A geometric random variable X with parameter p is given by the probability distribution Pr(X = n) = (1 - p)^(n-1) p, where n takes on positive integer values.

Remark 2.6. It is easily verified that the geometric probability distribution satisfies the three properties in Definition 2.2. Now we turn to computing the expectation of a geometric random variable.
Lemma 2.7. Let X be a discrete random variable that takes on only nonnegative integer values. Then
E(X) = Σ_{i≥1} Pr(X ≥ i).

Proof.
Σ_{i≥1} Pr(X ≥ i) = Σ_{i≥1} Σ_{j≥i} Pr(X = j) = Σ_{j≥1} j Pr(X = j) = E(X).
The second equality exchanges the order of summation, which is justified because all the terms being summed are nonnegative.

Theorem 2.8. The expectation of a geometric random variable X with parameter p is given by E(X) = 1/p.

Proof. By Definition 2.5,
Pr(X ≥ i) = Σ_{j≥i} (1 - p)^(j-1) p = (1 - p)^(i-1).
Hence, by Lemma 2.7,
E(X) = Σ_{i≥1} Pr(X ≥ i) = Σ_{i≥1} (1 - p)^(i-1) = 1/p.

We shall proceed by making use of the definitions and results presented so far to model the random process of marking trees from Section 1 in the language of probability theory.

3. The Expectation of Total Time Steps Assuming the Existence of a Bottleneck

For those with some background in computer programming, it is easy to write a simulation program and have the program print out the sequence of identifiers sent. After a few trials, one finds that the last identifier is almost always a leaf node, one of the 2^(n-1) nodes at the bottom level. Such behavior of the random process is no surprise if we think about what happens at the bottom level during the marking process. Since a leaf node has no children to infect it, it can only be marked when either the node itself or its sibling (the adjacent node sharing the same parent) is sent directly as an identifier. Therefore, a necessary condition for marking the entire tree is that at least one node of each pair of leaf siblings is sent as an identifier; in other words, we must reach every pair of leaf nodes. In the rest of the paper, we will first assume that with high probability the number of steps required to mark the entire tree is equal to the number required to mark, directly or indirectly, all the leaf nodes, and compute the expected number of identifiers sent under that assumption. Then we will verify the legitimacy of the assumption.
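Theorem 2.8 is easy to check numerically before it is put to work below. The sketch that follows is an illustration, not from the paper: it samples geometric random variables by repeating Bernoulli(p) trials until the first success and compares the empirical mean with 1/p.

```python
import random

def sample_geometric(p, rng):
    """Number of independent Bernoulli(p) trials up to and including the
    first success, so Pr(X = n) = (1 - p)**(n - 1) * p as in Definition 2.5."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

rng = random.Random(1)
p = 0.25
samples = [sample_geometric(p, rng) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 1/p = 4.0
```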
Theorem 3.1 (A Coupon-collector Problem). The expected number of identifiers sent in order to mark at least one node of each pair of leaf siblings is (N/2) ln N_2 + Θ(N).

Proof. A close analogy of this process is the classic coupon-collector model. Our goal is to collect N_2 distinct coupons, the pairs of leaf siblings. At each unit of time, an identifier is sent that may or may not belong to a pair of leaf nodes not yet reached, which is analogous to opening a box that may or may not contain a coupon not yet collected.

Let X be the number of identifiers sent until at least one node of every pair of leaf siblings is marked. Our goal is to determine E(X). Gaining insight from the coupon-collector model, we let X_i be the number of identifiers sent while exactly i - 1 pairs of leaf nodes have been reached; in other words, X_i identifiers are sent to go from reaching i - 1 pairs to reaching i pairs. It follows that X = X_1 + X_2 + ... + X_{N_2}.

When exactly i - 1 pairs of leaf nodes have been reached, each time an identifier is sent, the probability of reaching a new pair is
p_i = 2(N_2 - (i - 1))/N = 2(N_2 - i + 1)/N.
Therefore Pr(X_i = r) = (1 - p_i)^(r-1) p_i for an arbitrary positive integer r, so by Definition 2.5, X_i is a geometric random variable with parameter p_i. Applying Theorem 2.8, we have E(X_i) = 1/p_i. Then, using Theorem 2.4, the linearity of expectation, we have
E(X) = E(X_1) + ... + E(X_{N_2}) = Σ_{i=1}^{N_2} N/(2(N_2 - i + 1)) = (N/2) Σ_{i=1}^{N_2} 1/i.
As we will show in Lemma 3.3, the summation Σ_{i=1}^{N_2} 1/i is equal to ln N_2 + Θ(1). Therefore, the expected number of time steps needed to mark at least one node of each pair of leaf siblings is (N/2) ln N_2 + Θ(N).

Definition 3.2 (The Big-Θ Notation). Let f and g both be real-valued functions defined on some unbounded subset of the positive real numbers. Then f(x) = Θ(g(x)) as x → ∞ if and only if there exist positive real numbers M_1, M_2 and a real number x_0 such that
M_1 g(x) ≤ f(x) ≤ M_2 g(x) for all x ≥ x_0.

Lemma 3.3. The following holds for all n ∈ N:
Σ_{i=1}^{n} 1/i = ln n + Θ(1).

Proof.
According to the definition of the natural logarithm, and since 1/x ≤ 1/⌊x⌋ on [1, n], we have
ln n = ∫_1^n (1/x) dx ≤ Σ_{i=1}^{n} 1/i
and, since 1/x ≥ 1/⌈x⌉ on [1, n],
Σ_{i=1}^{n} 1/i = 1 + Σ_{i=2}^{n} 1/i ≤ 1 + ∫_1^n (1/x) dx = ln n + 1.
Since ln n ≤ Σ_{i=1}^{n} 1/i ≤ ln n + 1, by Definition 3.2 we have Σ_{i=1}^{n} 1/i = ln n + Θ(1).

Before we verify our assumption that the entire process can be reduced to marking the leaf nodes, a new model, the balls-into-bins model, is to be introduced, along with several relevant techniques of analysis.

4. The Balls-into-bins Model and Poisson Distribution

The balls-into-bins model refers to the process of placing m balls into n bins uniformly at random. This is similar to our task of marking the tree: m signals are distributed uniformly at random among N nodes. We are interested in the probability that a particular bin holds exactly r balls. We start by counting the number of ways to choose which r of the m balls land in our bin: there are (m choose r) ways. For each such choice, the probability that the given r balls fall into our bin and no other balls do is (1/n)^r (1 - 1/n)^(m-r). Hence, the probability that our bin contains exactly r balls is
(m choose r) (1/n)^r (1 - 1/n)^(m-r) = (m(m-1)···(m-r+1)/r!) (1/n)^r (1 - 1/n)^(m-r) ≈ ((m/n)^r / r!) (1 - 1/n)^m ≈ ((m/n)^r / r!) e^(-m/n).

Definition 4.1. A Poisson random variable X with parameter µ is given by the probability distribution
Pr(X = r) = (µ^r / r!) e^(-µ), where r takes on nonnegative integer values.

Remark 4.2. The Poisson probability distribution satisfies the three properties in Definition 2.2. In particular, using the Taylor expansion of e^x, we can verify that
Σ_{r=0}^{∞} Pr(X = r) = Σ_{r=0}^{∞} (µ^r / r!) e^(-µ) = e^(-µ) Σ_{r=0}^{∞} µ^r / r! = e^(-µ) e^µ = 1.

Definition 4.3. Let X, Y be two random variables. X and Y are independent if and only if for all possible i, j we have Pr((X = i) ∩ (Y = j)) = Pr(X = i) · Pr(Y = j).

Theorem 4.4. Let X_1, X_2, ..., X_n be independent Poisson random variables with parameters µ_1, µ_2, ..., µ_n respectively. Then Y = X_1 + X_2 + ... + X_n is a Poisson random variable with parameter µ = µ_1 + µ_2 + ... + µ_n.
Proof. Let X, Y be two independent Poisson random variables with parameters µ_1, µ_2 respectively. Then
Pr(X + Y = r) = Σ_{j=0}^{r} Pr((X = j) ∩ (Y = r - j)) = Σ_{j=0}^{r} Pr(X = j) Pr(Y = r - j)
= Σ_{j=0}^{r} (µ_1^j e^(-µ_1)/j!) (µ_2^(r-j) e^(-µ_2)/(r - j)!)
= (e^(-(µ_1+µ_2))/r!) Σ_{j=0}^{r} (r choose j) µ_1^j µ_2^(r-j)
= (e^(-(µ_1+µ_2))/r!) (µ_1 + µ_2)^r,
where the last equality uses the binomial theorem. The more general case regarding the sum of n variables can be proven by induction.

5. Verifying the Existence of a Bottleneck

A full discussion of the Poisson approximation is left out here. Its main conclusion is that when we are concerned with events of sufficiently extreme probability, throwing m balls into n bins yields roughly the same distribution as assigning each bin a number of balls that is Poisson distributed with parameter µ = m/n; in particular, the loads in all bins are independent under the Poisson approximation.

Let us first pay attention to the level of the tree at depth n - 1. Given that at least one node of each pair of leaf siblings is marked, if all the nodes on the (n - 1)-level are also marked, then the entire tree is marked, because nodes on the (n - 1)-level can infect the other nodes upwards. Hence, when at least one node of each pair of leaf siblings is marked,
Pr(the tree is completely marked) = Pr(the (n - 1)-level is completely marked) = Π_i (1 - Pr(node i at depth n - 1 is not marked)),
where the product form uses the independence of the loads under the Poisson approximation. Therefore, we just need to bound the probability that a particular node i on the (n - 1)-level of the tree is not marked. Let m be the total number of signals sent and X_i be the number of signals sent to node i. According to our previous analysis of the balls-into-bins model, X_i is approximately a Poisson random variable with probability distribution
Pr(X_i = r) = ((m/N)^r / r!) e^(-m/N).
For the node i, let X_p, X_s, X_c1, X_c2 be the numbers of signals sent to its parent, sibling, first child and second child respectively. Three pieces of information can be gained from an unmarked node i:
1. X_i = 0;
2. X_p = 0 or X_s = 0;
3. X_c1 = 0 or X_c2 = 0.
Note that these three conditions combined are a necessary but not sufficient condition for node i to be unmarked, because a zero value of X_p does not necessarily make i's parent unmarked. Such a simplification is valid here because we are only interested in an upper bound on the probability that i is unmarked.

Under the Poisson approximation, X_i, X_p, X_s, X_c1, X_c2 are all independent Poisson random variables, and we can therefore invoke Theorem 4.4. Without loss of generality, let us consider the case where Y = X_i + X_p + X_c1 = 0. By Theorem 4.4, Y is again a Poisson random variable with parameter 3m/N, so by Definition 4.1,
Pr(Y = 0) = e^(-3m/N).
Applying the union bound over the four possible combinations of one variable from {X_p, X_s} and one from {X_c1, X_c2}, we have
Pr(node i is not marked) ≤ 4 e^(-3m/N),
and consequently
Pr(the tree is completely marked | every pair of leaf siblings is reached) ≥ (1 - 4 e^(-3m/N))^(N_2).
When m takes on the value (N/2) ln N_2 + Θ(N), our previous result for the expected number of signals needed, we have e^(-3m/N) = Θ(N_2^(-3/2)), so the bound becomes (1 - Θ(N_2^(-3/2)))^(N_2). For sufficiently large N_2 this quantity is close to 1, and it has limit 1 as N_2 approaches infinity. Therefore, we can conclude that when the total number of time steps taken is around (N/2) ln N_2, with high probability marking the nodes above the leaves is completed before marking each pair of leaf nodes; that is, marking the leaves is with high probability the bottleneck of the entire process.

It is worth pointing out that here we are only concerned with the situation where the total number of time steps is (N/2) ln N_2 + Θ(N) because, as we will show next, for N sufficiently large the number of signals required to reach all pairs of leaf nodes stays, with high probability, very close to (N/2) ln N_2.

Theorem 5.1. If X is the number of signals sent before every pair of leaf nodes is reached, then for any constant c we have
lim_{N_2 → ∞} Pr(X > (N/2)(ln N_2 + c)) = 1 - e^(-e^(-c)).
Empirically, this means the random variable X concentrates around its expectation (N/2) ln N_2 + Θ(N). After plugging in particular values of c, we can see, for instance, that for c = 4 the probability that X exceeds (N/2) ln N_2 + 2N is less than 2%.

Proof. Again we assume the Poisson approximation is appropriate in this case.
Under the Poisson approximation, we may let the number of signals sent to each pair of leaf nodes be a Poisson random variable with parameter µ = ln N_2 + c; since each signal hits a given pair with probability 2/N, the corresponding expected total number of signals sent is (N/2)(ln N_2 + c). By Definition 4.1, for a particular pair of leaf nodes and the corresponding signal count X',
Pr(X' = 0) = e^(-µ) = e^(-c)/N_2.
Since the pairs are independent under the Poisson approximation, the probability that every pair of leaf nodes receives at least one signal is
(1 - e^(-c)/N_2)^(N_2) ≈ e^(-e^(-c));
that is, the probability that not all pairs of leaf nodes are reached when (N/2)(ln N_2 + c) signals have been sent converges to 1 - e^(-e^(-c)).
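The limit used in this proof, (1 - e^(-c)/N_2)^(N_2) → e^(-e^(-c)) as N_2 → ∞, can be checked numerically; the sketch below is an illustration with a few arbitrary values of c.

```python
import math

# Compare (1 - e^(-c)/n2)**n2 against its limit e^(-e^(-c)) for growing n2.
for c in (-1.0, 0.0, 1.0, 4.0):
    limit = math.exp(-math.exp(-c))
    for n2 in (10**2, 10**4, 10**6):
        approx = (1 - math.exp(-c) / n2) ** n2
        print(f"c={c:+.1f}  N_2={n2:>7}  approx={approx:.6f}  limit={limit:.6f}")
```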
6. Conclusion

At this point, we can conclude that the expected number of time steps needed in total is equal to (N/2) ln N_2 + Θ(N), and that with high probability the total number of signals required stays very close to (N/2) ln N_2.

Acknowledgments

I would like to thank Professor Greg Lawler for first introducing me to probability theory and stochastic processes during the REU program. It is a pleasure to thank my mentor, Kevin Casto, for recommending reading materials and giving me insight into the topic of my paper. I would also like to thank Daniil Rudenko and Peter May for organizing the REU program. Your efforts are much appreciated.

References

[1] Michael Mitzenmacher, Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
More information9.1 Branching Process
9.1 Branching Process Imagine the following stochastic process called branching process. A unisexual universe Initially there is one live organism and no dead ones. At each time unit, we select one of
More information14.1 Finding frequent elements in stream
Chapter 14 Streaming Data Model 14.1 Finding frequent elements in stream A very useful statistics for many applications is to keep track of elements that occur more frequently. It can come in many flavours
More informationDiscrete Mathematics for CS Spring 2006 Vazirani Lecture 22
CS 70 Discrete Mathematics for CS Spring 2006 Vazirani Lecture 22 Random Variables and Expectation Question: The homeworks of 20 students are collected in, randomly shuffled and returned to the students.
More informationChapter Generating Functions
Chapter 8.1.1-8.1.2. Generating Functions Prof. Tesler Math 184A Fall 2017 Prof. Tesler Ch. 8. Generating Functions Math 184A / Fall 2017 1 / 63 Ordinary Generating Functions (OGF) Let a n (n = 0, 1,...)
More informationDiscrete Mathematics and Probability Theory Fall 2015 Note 20. A Brief Introduction to Continuous Probability
CS 7 Discrete Mathematics and Probability Theory Fall 215 Note 2 A Brief Introduction to Continuous Probability Up to now we have focused exclusively on discrete probability spaces Ω, where the number
More informationChapter 2 Metric Spaces
Chapter 2 Metric Spaces The purpose of this chapter is to present a summary of some basic properties of metric and topological spaces that play an important role in the main body of the book. 2.1 Metrics
More informationMA22S3 Summary Sheet: Ordinary Differential Equations
MA22S3 Summary Sheet: Ordinary Differential Equations December 14, 2017 Kreyszig s textbook is a suitable guide for this part of the module. Contents 1 Terminology 1 2 First order separable 2 2.1 Separable
More informationQuick review on Discrete Random Variables
STAT/MATH 395 A - PROBABILITY II UW Winter Quarter 2017 Néhémy Lim Quick review on Discrete Random Variables Notations. Z = {..., 2, 1, 0, 1, 2,...}, set of all integers; N = {0, 1, 2,...}, set of natural
More informationFIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS
FIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS MOUMANTI PODDER 1. First order theory on G(n, p) We start with a very simple property of G(n,
More informationINTRODUCTION TO MARKOV CHAINS AND MARKOV CHAIN MIXING
INTRODUCTION TO MARKOV CHAINS AND MARKOV CHAIN MIXING ERIC SHANG Abstract. This paper provides an introduction to Markov chains and their basic classifications and interesting properties. After establishing
More informationProblem 1: (Chernoff Bounds via Negative Dependence - from MU Ex 5.15)
Problem 1: Chernoff Bounds via Negative Dependence - from MU Ex 5.15) While deriving lower bounds on the load of the maximum loaded bin when n balls are thrown in n bins, we saw the use of negative dependence.
More informationMATH 556: PROBABILITY PRIMER
MATH 6: PROBABILITY PRIMER 1 DEFINITIONS, TERMINOLOGY, NOTATION 1.1 EVENTS AND THE SAMPLE SPACE Definition 1.1 An experiment is a one-off or repeatable process or procedure for which (a there is a well-defined
More informationMAS113 Introduction to Probability and Statistics. Proofs of theorems
MAS113 Introduction to Probability and Statistics Proofs of theorems Theorem 1 De Morgan s Laws) See MAS110 Theorem 2 M1 By definition, B and A \ B are disjoint, and their union is A So, because m is a
More informationHandout 5. α a1 a n. }, where. xi if a i = 1 1 if a i = 0.
Notes on Complexity Theory Last updated: October, 2005 Jonathan Katz Handout 5 1 An Improved Upper-Bound on Circuit Size Here we show the result promised in the previous lecture regarding an upper-bound
More informationTaylor and Maclaurin Series
Taylor and Maclaurin Series MATH 211, Calculus II J. Robert Buchanan Department of Mathematics Spring 2018 Background We have seen that some power series converge. When they do, we can think of them as
More informationChase Joyner. 901 Homework 1. September 15, 2017
Chase Joyner 901 Homework 1 September 15, 2017 Problem 7 Suppose there are different types of coupons available when buying cereal; each box contains one coupon and the collector is seeking to collect
More informationStatistics for Economists. Lectures 3 & 4
Statistics for Economists Lectures 3 & 4 Asrat Temesgen Stockholm University 1 CHAPTER 2- Discrete Distributions 2.1. Random variables of the Discrete Type Definition 2.1.1: Given a random experiment with
More informationTail Inequalities Randomized Algorithms. Sariel Har-Peled. December 20, 2002
Tail Inequalities 497 - Randomized Algorithms Sariel Har-Peled December 0, 00 Wir mssen wissen, wir werden wissen (We must know, we shall know) David Hilbert 1 Tail Inequalities 1.1 The Chernoff Bound
More informationLecture 5: Expectation
Lecture 5: Expectation 1. Expectations for random variables 1.1 Expectations for simple random variables 1.2 Expectations for bounded random variables 1.3 Expectations for general random variables 1.4
More informationNotes on Discrete Probability
Columbia University Handout 3 W4231: Analysis of Algorithms September 21, 1999 Professor Luca Trevisan Notes on Discrete Probability The following notes cover, mostly without proofs, the basic notions
More informationDiscrete Random Variables
Chapter 5 Discrete Random Variables Suppose that an experiment and a sample space are given. A random variable is a real-valued function of the outcome of the experiment. In other words, the random variable
More informationLEBESGUE INTEGRATION. Introduction
LEBESGUE INTEGATION EYE SJAMAA Supplementary notes Math 414, Spring 25 Introduction The following heuristic argument is at the basis of the denition of the Lebesgue integral. This argument will be imprecise,
More informationCS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis
CS155: Probability and Computing: Randomized Algorithms and Probabilistic Analysis Eli Upfal Eli Upfal@brown.edu Office: 319 TA s: Lorenzo De Stefani and Sorin Vatasoiu cs155tas@cs.brown.edu It is remarkable
More informationLecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018
CS17 Integrated Introduction to Computer Science Klein Contents Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018 1 Tree definitions 1 2 Analysis of mergesort using a binary tree 1 3 Analysis of
More information6.1 Occupancy Problem
15-859(M): Randomized Algorithms Lecturer: Anupam Gupta Topic: Occupancy Problems and Hashing Date: Sep 9 Scribe: Runting Shi 6.1 Occupancy Problem Bins and Balls Throw n balls into n bins at random. 1.
More informationA Generalization of Wigner s Law
A Generalization of Wigner s Law Inna Zakharevich June 2, 2005 Abstract We present a generalization of Wigner s semicircle law: we consider a sequence of probability distributions (p, p 2,... ), with mean
More informationCS Data Structures and Algorithm Analysis
CS 483 - Data Structures and Algorithm Analysis Lecture VII: Chapter 6, part 2 R. Paul Wiegand George Mason University, Department of Computer Science March 22, 2006 Outline 1 Balanced Trees 2 Heaps &
More information1 Stat 605. Homework I. Due Feb. 1, 2011
The first part is homework which you need to turn in. The second part is exercises that will not be graded, but you need to turn it in together with the take-home final exam. 1 Stat 605. Homework I. Due
More informationReview of Probabilities and Basic Statistics
Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to
More informationSTAT 7032 Probability Spring Wlodek Bryc
STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,
More informationLecture 4. P r[x > ce[x]] 1/c. = ap r[x = a] + a>ce[x] P r[x = a]
U.C. Berkeley CS273: Parallel and Distributed Theory Lecture 4 Professor Satish Rao September 7, 2010 Lecturer: Satish Rao Last revised September 13, 2010 Lecture 4 1 Deviation bounds. Deviation bounds
More information1 INFO Sep 05
Events A 1,...A n are said to be mutually independent if for all subsets S {1,..., n}, p( i S A i ) = p(a i ). (For example, flip a coin N times, then the events {A i = i th flip is heads} are mutually
More informationCSC 2429 Approaches to the P vs. NP Question and Related Complexity Questions Lecture 2: Switching Lemma, AC 0 Circuit Lower Bounds
CSC 2429 Approaches to the P vs. NP Question and Related Complexity Questions Lecture 2: Switching Lemma, AC 0 Circuit Lower Bounds Lecturer: Toniann Pitassi Scribe: Robert Robere Winter 2014 1 Switching
More informationAsymptotic Analysis and Recurrences
Appendix A Asymptotic Analysis and Recurrences A.1 Overview We discuss the notion of asymptotic analysis and introduce O, Ω, Θ, and o notation. We then turn to the topic of recurrences, discussing several
More informationX = X X n, + X 2
CS 70 Discrete Mathematics for CS Fall 2003 Wagner Lecture 22 Variance Question: At each time step, I flip a fair coin. If it comes up Heads, I walk one step to the right; if it comes up Tails, I walk
More informationCHAPTER 8: EXPLORING R
CHAPTER 8: EXPLORING R LECTURE NOTES FOR MATH 378 (CSUSM, SPRING 2009). WAYNE AITKEN In the previous chapter we discussed the need for a complete ordered field. The field Q is not complete, so we constructed
More informationNotes by Zvi Rosen. Thanks to Alyssa Palfreyman for supplements.
Lecture: Hélène Barcelo Analytic Combinatorics ECCO 202, Bogotá Notes by Zvi Rosen. Thanks to Alyssa Palfreyman for supplements.. Tuesday, June 2, 202 Combinatorics is the study of finite structures that
More informationEstimates for probabilities of independent events and infinite series
Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences
More information