MARKING A BINARY TREE: PROBABILISTIC ANALYSIS OF A RANDOMIZED ALGORITHM

XIANG LI

Abstract. This paper centers on the analysis of a specific randomized algorithm, a basic random process that marks a binary tree, in light of concepts and techniques from probability theory. The first part of the paper is based on an assumption that significantly simplifies the problem and thus yields the expected number of time steps required by this algorithm. The essential part of the solution is a coupon-collector model that mainly makes use of the geometric probability distribution. The rest of the paper verifies the legitimacy of our assumption with a balls-into-bins model and the Poisson distribution.

Contents

1. Presenting the Problem: Marking a Binary Tree
2. Random Variables, Probability and Expectation
3. The Expectation of Total Time Steps Assuming the Existence of a Bottleneck
4. The Balls-into-bins Model and Poisson Distribution
5. Verifying the Existence of a Bottleneck
6. Conclusion
Acknowledgments
References

1. Presenting the Problem: Marking a Binary Tree

The study of a random process often involves understanding its patterns or mechanisms at a higher level, with formal mathematical proofs developed to support that understanding. A basic random process of marking a binary tree is presented below; we are particularly interested in the number of time steps required to mark the entire tree.

Consider a complete binary tree with $N = 2^n - 1$ nodes and depth $n$. For example, Figure 1 is a binary tree of depth 5. For a particular node, its parent is the node directly connected to it one level above. Its sibling is the node on the same level that shares the same parent. Its two children, if it has any at all, are the two nodes directly connected to it one level below.

Date: August 6, 2018.

[Figure 1. A Complete Binary Tree of Depth 5 and Relationships between Nodes]

Initially, all nodes are unmarked, and our ultimate goal is to mark the entire tree with the process we shall describe. First we number each node with a unique identifying number in the range $\{1, 2, \ldots, N\}$. At every time step, a number chosen uniformly at random from $\{1, 2, \ldots, N\}$ is generated and sent as a signal to mark the node with that identifying number. After the sent node is marked, an infecting process is immediately invoked:

(1) If a node and its sibling are both marked, their parent is marked.
(2) If a node and its parent are both marked, the node's sibling is marked.

The marking rules are always applied recursively as much as possible before the next node is sent. For example, in Figure 2, the marked nodes are filled in. The arrival of the signal for the node labeled X allows you to mark the remainder of the nodes by applying the marking rules to nodes 1, 2, 3 in sequence.

[Figure 2. The Infection Caused by the Arrival of the Marked Node X]

Throughout the analysis of this process, the leaf nodes, the nodes at the very bottom level, are of particular interest to us. Moreover, we will frequently treat two leaf nodes that are siblings as a pair. The number of pairs of leaf-node siblings will be denoted by $N'$, which is equal to $2^{n-2} = (N+1)/4$.
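The process above is easy to simulate. The following is a minimal Python sketch, not taken from the paper; the function name mark_tree and the heap-style array layout are our own choices. It returns both the number of time steps and the last identifier sent, which will be useful in Section 3.

```python
import random

def mark_tree(n, seed=None):
    """Simulate marking a complete binary tree of depth n.

    Nodes are stored in heap order: node 1 is the root and node p has
    children 2p and 2p+1, so nodes 2**(n-1) .. 2**n - 1 are the leaves.
    Returns (steps, last): the number of signals sent until the whole
    tree is marked, and the last identifier sent.
    """
    rng = random.Random(seed)
    N = 2**n - 1
    marked = [False] * (N + 1)            # index 0 unused

    def infect():
        # Apply the two marking rules until a fixed point is reached.
        changed = True
        while changed:
            changed = False
            for p in range(1, 2**(n - 1)):          # internal nodes
                c1, c2 = 2 * p, 2 * p + 1
                if marked[c1] and marked[c2] and not marked[p]:
                    marked[p] = True                # rule (1)
                    changed = True
                if marked[p] and marked[c1] != marked[c2]:
                    marked[c1] = marked[c2] = True  # rule (2)
                    changed = True

    steps, last = 0, None
    while not all(marked[1:]):
        steps += 1
        last = rng.randint(1, N)          # uniform signal
        marked[last] = True
        infect()
    return steps, last
```

For instance, mark_tree(5) runs the process once on the depth-5 tree of Figure 1.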

Before diving into the task, we shall review some basic concepts of probability theory.

2. Random Variables, Probability and Expectation

Definition 2.1. A random variable on a sample space $\Omega$ is a real-valued function on $\Omega$, denoted by $X : \Omega \to \mathbb{R}$.

Definition 2.2. A probability function is any function $\Pr : \mathcal{F} \to \mathbb{R}$ that satisfies the following conditions:
(1) for any event $E$, $0 \le \Pr(E) \le 1$;
(2) for the sample space $\Omega$, $\Pr(\Omega) = 1$; and
(3) for any finite or countably infinite sequence of pairwise mutually disjoint events $E_1, E_2, E_3, \ldots$,
$$\Pr\Big(\bigcup_i E_i\Big) = \sum_i \Pr(E_i).$$

Definition 2.3. The expectation of a discrete random variable $X$, denoted by $E[X]$, is given by
$$E[X] = \sum_i i \Pr(X = i),$$
where the summation is over all values in the range of $X$.

Theorem 2.4 (Linearity of expectation). For any finite collection of discrete random variables $X_1, X_2, \ldots, X_n$ with finite expectations,
$$E\Big(\sum_{i=1}^n X_i\Big) = \sum_{i=1}^n E(X_i).$$

Proof. We can first prove the case of two random variables and derive from it the general case by induction. Let $X$ and $Y$ be two random variables. Then
$$E(X+Y) = \sum_x \sum_y (x+y)\Pr(X=x, Y=y)$$
$$= \sum_x \sum_y x\Pr(X=x, Y=y) + \sum_x \sum_y y\Pr(X=x, Y=y)$$
$$= \sum_x x \sum_y \Pr(X=x, Y=y) + \sum_y y \sum_x \Pr(X=x, Y=y)$$
$$= \sum_x x\Pr(X=x) + \sum_y y\Pr(Y=y) = E(X) + E(Y),$$
where the last equality directly follows from the definition of expectation and all the summations are over the ranges of the corresponding random variables.

Definition 2.5. A geometric random variable $X$ with parameter $p$ is given by the probability distribution
$$\Pr(X = n) = (1-p)^{n-1} p,$$
where $n$ takes on positive integer values.

Remark 2.6. It can be easily verified that the geometric probability distribution satisfies the three properties in Definition 2.2.

Now we turn to computing the expectation of a geometric random variable.

Lemma 2.7. Let $X$ be a discrete random variable that takes on only nonnegative integer values. Then
$$E(X) = \sum_{i=1}^{\infty} \Pr(X \ge i).$$

Proof.
$$\sum_{i=1}^{\infty} \Pr(X \ge i) = \sum_{i=1}^{\infty} \sum_{j=i}^{\infty} \Pr(X = j) = \sum_{j=1}^{\infty} \sum_{i=1}^{j} \Pr(X = j) = \sum_{j=1}^{\infty} j \Pr(X = j) = E(X).$$
The second equality, the exchange of the order of summation, is justified because all the terms being summed are nonnegative.

Theorem 2.8. The expectation of a geometric random variable $X$ with parameter $p$ is given by $E(X) = 1/p$.

Proof. By Definition 2.5,
$$\Pr(X \ge i) = \sum_{j=i}^{\infty} (1-p)^{j-1} p = (1-p)^{i-1}.$$
Hence, by Lemma 2.7,
$$E(X) = \sum_{i=1}^{\infty} \Pr(X \ge i) = \sum_{i=1}^{\infty} (1-p)^{i-1} = \frac{1}{p}.$$
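As a quick sanity check of Theorem 2.8 (our own illustration, not part of the paper), one can sample geometric variables and compare the empirical mean with $1/p$:

```python
import random

def geometric_sample(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while rng.random() >= p:   # failure: keep trying
        n += 1
    return n

rng = random.Random(0)
p = 0.25
samples = [geometric_sample(p, rng) for _ in range(100_000)]
print(sum(samples) / len(samples), 1 / p)   # both close to 4.0
```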

We shall proceed by making use of the definitions and results presented so far to model the random process of marking trees from Section 1 in the light of probability theory.

3. The Expectation of Total Time Steps Assuming the Existence of a Bottleneck

For those with some background in computer programming, it is easy to write a simulation program and have it print out the sequence of identifiers sent. After a few trials, one finds that the last identifier is almost always a leaf node, one of the $2^{n-1}$ nodes at the bottom level (as the check below confirms). Such behavior of the random process is no surprise if we think about what is going on at the bottom level during the marking process. In order for a leaf node to be marked, since it does not have any children to infect it, either the node itself or its sibling (the adjacent node sharing the same parent) must be sent directly as an identifier. Therefore, a necessary condition for marking the entire tree is that at least one of each pair of siblings at the leaves is sent as an identifier. In other words, we must reach every one of the $N'$ pairs of leaf nodes.

In the rest of the paper, we will first assume that with high probability the number of steps required for marking the entire tree equals the number required for marking, directly or indirectly, all the leaf nodes, and we will compute the expected number of identifiers sent under this assumption. Then we will verify the legitimacy of the assumption.
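This observation can be reproduced with the mark_tree sketch from Section 1 (again our own illustration, and it assumes that sketch is in scope): the last identifier sent is a leaf in nearly every run.

```python
# Fraction of runs whose final identifier is a leaf node.
n, runs = 5, 1000
leaf_start = 2**(n - 1)          # first leaf index in heap order
leaf_last = sum(1 for s in range(runs)
                if mark_tree(n, seed=s)[1] >= leaf_start)
print(leaf_last / runs)          # typically very close to 1
```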

Theorem 3.1 (A coupon-collector problem). The expected number of identifiers sent in order to mark at least one of each pair of siblings at the leaves is $\frac{N}{2}\ln N' + \Theta(N)$.

Proof. A close analogy of this process is the classic coupon-collector model. Our goal is to collect $N'$ distinct pairs of siblings, the coupons. At each unit of time, you send an identifier that may or may not belong to a pair of leaf nodes you have not reached yet, which is analogous to opening a box that may or may not contain a coupon you have not collected.

Let $X$ be the number of identifiers sent until at least one node of every pair of leaf siblings is marked. Our goal is to determine $E(X)$. Gaining insight from the coupon-collector model, we let $X_i$ be the number of identifiers sent while exactly $i-1$ pairs of leaf nodes have been reached. In other words, you send $X_i$ identifiers to go from reaching $i-1$ pairs to reaching $i$ pairs. It follows that
$$X = \sum_{i=1}^{N'} X_i.$$
When exactly $i-1$ pairs of leaf nodes have been reached, each time an identifier is sent, the probability of reaching a new pair is
$$p_i = \frac{2\,[N' - (i-1)]}{N} = \frac{N - 4i + 5}{2N}.$$
Therefore $\Pr(X_i = r) = (1 - p_i)^{r-1} p_i$, where $r$ is an arbitrary positive integer. By Definition 2.5, $X_i$ is a geometric random variable with parameter $p_i$. Applying Theorem 2.8, we have $E(X_i) = 1/p_i$. Then, using Theorem 2.4, the linearity of expectation, we have
$$E(X) = E\Big(\sum_{i=1}^{N'} X_i\Big) = \sum_{i=1}^{N'} E(X_i) = \sum_{i=1}^{N'} \frac{2N}{N - 4i + 5} = \frac{N}{2}\sum_{i=1}^{N'} \frac{1}{N' - i + 1} = \frac{N}{2}\sum_{i=1}^{N'} \frac{1}{i}.$$
As we will show in Lemma 3.3, the last summation is equal to $\ln N' + \Theta(1)$. Therefore, the expected number of time steps needed to mark at least one of each pair of siblings at the leaves is $\frac{N}{2}\ln N' + \Theta(N)$.

Definition 3.2 (The big-$\Theta$ notation). Let $f$ and $g$ both be real-valued functions defined on some unbounded subset of the positive real numbers. We write $f(x) = \Theta(g(x))$ as $x \to \infty$ if and only if there exist positive real numbers $M_1, M_2$ and a real number $x_0$ such that
$$M_1\, g(x) \le f(x) \le M_2\, g(x) \quad \text{for all } x \ge x_0.$$

Lemma 3.3. The following holds for all $n \in \mathbb{N}$:
$$\sum_{i=1}^{n} \frac{1}{i} = \ln n + \Theta(1).$$

Proof. From the integral definition of the natural logarithm we have both
$$\ln n = \int_1^n \frac{1}{x}\,dx \le \sum_{i=1}^{n} \frac{1}{i}$$
and
$$\sum_{i=1}^{n} \frac{1}{i} = 1 + \sum_{i=2}^{n} \frac{1}{i} \le 1 + \int_1^n \frac{1}{x}\,dx = \ln n + 1.$$
Since $\ln n \le \sum_{i=1}^n 1/i \le \ln n + 1$, by Definition 3.2 we have $\sum_{i=1}^n 1/i = \ln n + \Theta(1)$.
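The exact sum in Theorem 3.1 is easy to evaluate numerically and to compare against the $\frac{N}{2}\ln N'$ approximation (our own check, under the paper's convention $N = 2^n - 1$, $N' = (N+1)/4$):

```python
import math

for n in (5, 10, 15, 20):
    N = 2**n - 1
    Np = (N + 1) // 4                        # number of leaf pairs N'
    exact = (N / 2) * sum(1 / i for i in range(1, Np + 1))
    approx = (N / 2) * math.log(Np)
    print(n, round(exact), round(approx), round(exact - approx))
    # the gap grows linearly in N, matching the Theta(N) term
```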

Before we verify our assumption that the entire process reduces to marking the leaf nodes, a new model, the balls-into-bins model, is to be introduced, along with several relevant techniques of analysis.

4. The Balls-into-bins Model and Poisson Distribution

The balls-into-bins model refers to the process of randomly and uniformly placing $m$ balls into $n$ bins. This is similar to our task of marking the tree: $m$ signals are distributed randomly and uniformly among $N$ nodes. We are interested in the probability that a particular bin has $r$ balls. We start by counting the number of different configurations of the $r$ balls selected from a total of $m$ balls: there are $\binom{m}{r}$ ways to select the $r$ balls. For each configuration, the probability that all of the given $r$ balls fall into our bin and no other balls do is $(\frac{1}{n})^r (1-\frac{1}{n})^{m-r}$. Hence, the probability that our bin contains exactly $r$ balls is equal to
$$\binom{m}{r}\Big(\frac{1}{n}\Big)^r \Big(1-\frac{1}{n}\Big)^{m-r} = \frac{m(m-1)\cdots(m-r+1)}{r!\,n^r}\Big(1-\frac{1}{n}\Big)^{m-r} \approx \frac{(m/n)^r}{r!}\Big(1-\frac{1}{n}\Big)^{m} \approx \frac{(m/n)^r}{r!}\,e^{-m/n}.$$

Definition 4.1. A Poisson random variable $X$ with parameter $\mu$ is given by the probability distribution
$$\Pr(X = r) = \frac{\mu^r e^{-\mu}}{r!}.$$

Remark 4.2. The probability distribution of a Poisson variable satisfies the three properties in Definition 2.2. In particular, we can verify that
$$\sum_{r=0}^{\infty} \Pr(X = r) = \sum_{r=0}^{\infty} \frac{\mu^r e^{-\mu}}{r!} = e^{-\mu}\sum_{r=0}^{\infty} \frac{\mu^r}{r!} = e^{-\mu} e^{\mu} = 1,$$
using the Taylor expansion of $e^x$.

Definition 4.3. Let $X, Y$ be two random variables. $X$ and $Y$ are independent if and only if for all possible $i, j$ we have
$$\Pr((X = i) \cap (Y = j)) = \Pr(X = i) \cdot \Pr(Y = j).$$

Theorem 4.4. Let $X_1, X_2, \ldots, X_n$ be independent Poisson random variables with parameters $\mu_1, \mu_2, \ldots, \mu_n$ respectively. Then $Y = X_1 + X_2 + \cdots + X_n$ is a Poisson random variable with parameter $\mu = \mu_1 + \mu_2 + \cdots + \mu_n$.

Proof. Let $X, Y$ be two Poisson random variables with parameters $\mu_1, \mu_2$ respectively. Then
$$\Pr(X+Y = r) = \sum_{j=0}^{r} \Pr((X = j) \cap (Y = r-j)) = \sum_{j=0}^{r} \Pr(X = j)\Pr(Y = r-j)$$
$$= \sum_{j=0}^{r} \frac{\mu_1^j e^{-\mu_1}}{j!}\cdot\frac{\mu_2^{r-j} e^{-\mu_2}}{(r-j)!} = \frac{e^{-(\mu_1+\mu_2)}}{r!}\sum_{j=0}^{r}\binom{r}{j}\mu_1^j \mu_2^{r-j} = \frac{e^{-(\mu_1+\mu_2)}(\mu_1+\mu_2)^r}{r!},$$
where the last equality uses the binomial theorem. The more general case regarding the sum of $n$ variables can be proven by induction.
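The Poisson approximation of a single bin's load, derived at the start of this section, can be checked numerically (our own illustration): for $m$ balls in $n$ bins, the exact binomial probabilities are close to the Poisson probabilities with $\mu = m/n$.

```python
import math

def binom_load(m, n, r):
    """Exact probability that a given bin holds r of the m balls."""
    return math.comb(m, r) * (1 / n)**r * (1 - 1 / n)**(m - r)

def poisson_pmf(mu, r):
    return mu**r * math.exp(-mu) / math.factorial(r)

m, n = 2000, 500                    # mu = m/n = 4
for r in range(8):
    print(r, round(binom_load(m, n, r), 5), round(poisson_pmf(m / n, r), 5))
```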

5. Verifying the Existence of a Bottleneck

A detailed discussion of the Poisson approximation is left out here. Its main conclusion is that when we are concerned with events of sufficiently extreme probability, throwing $m$ balls into $n$ bins yields roughly the same distribution as assigning each bin a number of balls that is Poisson distributed with $\mu = m/n$; moreover, the loads of the bins are independent under the Poisson approximation.

Let us first pay attention to the level of depth $n-1$ of the tree. Given that at least one node of each pair of leaf siblings is marked, if all the nodes on level $n-1$ are also marked, then the entire tree is marked, because the nodes on level $n-1$ can infect the rest of the tree upwards. Hence, when at least one node of each pair of leaf siblings is marked,
$$\Pr(\text{the tree is completely marked}) = \Pr(\text{level } n-1 \text{ is completely marked}) = \big(1 - \Pr(\text{node } i \text{ on level } n-1 \text{ is not marked, for an arbitrary } i)\big)^{2^{n-2}},$$
where the last equality uses the independence granted by the Poisson approximation. Therefore, we just need to bound the probability that a particular node $i$ on level $n-1$ of the tree is not marked.

Let $m$ be the total number of signals sent and $X_i$ the number of signals sent to node $i$. According to our previous analysis of the balls-into-bins model, $X_i$ is approximately a Poisson variable with distribution
$$\Pr(X_i = r) = \frac{(m/N)^r e^{-m/N}}{r!}.$$
For node $i$, let $X_p, X_s, X_{c_1}, X_{c_2}$ be the numbers of signals sent to its parent, sibling, first child and second child respectively. Three pieces of information can be gained from an unmarked node $i$:
(1) $X_i = 0$;
(2) $X_p = 0$ or $X_s = 0$;
(3) $X_{c_1} = 0$ or $X_{c_2} = 0$.
Note that these three conditions combined are a necessary but not sufficient condition for node $i$ to be unmarked, because a zero value of $X_p$ does not necessarily make $i$'s parent unmarked. The simplification is valid here because we are only interested in an upper bound on the probability that $i$ is unmarked. Under the Poisson approximation, $X_i, X_p, X_s, X_{c_1}, X_{c_2}$ are all independent Poisson random variables, and we can therefore invoke Theorem 4.4.

Without loss of generality, let us consider the combination $Y = X_i + X_p + X_{c_1}$. By Theorem 4.4, $Y$ is again a Poisson random variable, with parameter $\mu = 3m/N$, so by Definition 4.1,
$$\Pr(Y = 0) = e^{-3m/N}.$$
Applying the union bound over the four possible combinations among $X_p, X_s$ and $X_{c_1}, X_{c_2}$, we have
$$\Pr(\text{node } i \text{ is not marked}) \le 4e^{-3m/N},$$
and hence
$$\Pr(\text{the tree is completely marked} \mid \text{every pair of leaf siblings is reached}) \ge \big(1 - 4e^{-3m/N}\big)^{2^{n-2}}.$$
When $m$ takes on the value $\frac{N}{2}\ln N'$, the leading term of our earlier result for the expected number of signals needed, we have $e^{-3m/N} = N'^{-3/2}$ and therefore
$$\Pr(\text{the tree is completely marked} \mid \text{every pair of leaf siblings is reached}) \ge \Big(1 - \frac{4}{N'^{3/2}}\Big)^{N'},$$
where $N' = (N+1)/4$. For sufficiently large $N'$, $(1 - 4/N'^{3/2})^{N'}$ is close to 1, and it tends to 1 as $N'$ approaches infinity. Therefore, we can conclude that when the total number of time steps taken is around $\frac{N}{2}\ln N'$, with high probability marking the nodes above the leaves is completed before marking each pair of leaf nodes; that is, marking the leaves is with high probability the bottleneck of the entire process.

It is worth pointing out that we are only concerned here with the situation where the total number of time steps is about $\frac{N}{2}\ln N' + \Theta(N)$ because, as we show next using the balls-into-bins model, for sufficiently large $N$ the number of signals required to reach all pairs of leaf nodes stays very close to $\frac{N}{2}\ln N'$ with high probability.

Theorem 5.1. Let $Y$ be the number of signals sent before every pair of leaf nodes is reached. Then for any constant $c$ we have
$$\lim_{N\to\infty} \Pr\Big(Y > \frac{N}{2}\ln N' + \frac{cN}{2}\Big) = 1 - e^{-e^{-c}}.$$
Empirically, this means the random variable $Y$ concentrates around its expectation $\frac{N}{2}\ln N' + \Theta(N)$. Plugging in values of $c$, we can see, for instance, that the probability that $Y > \frac{N}{2}\ln N' + 2N$ is, in the limit, less than 2%, since $1 - e^{-e^{-4}} \approx 0.018$.

Proof. Again we assume the Poisson approximation is appropriate in this case. Under the Poisson approximation, when $m = \frac{N}{2}(\ln N' + c)$ signals have been sent, we can let the number of signals sent to each pair of leaf nodes be a Poisson random variable with parameter
$$\mu = \frac{2m}{N} = \ln N' + c.$$
By Definition 4.1, for a particular pair of leaf nodes and the corresponding count $X$,
$$\Pr(X = 0) = e^{-\mu} = \frac{e^{-c}}{N'}.$$
Since the pairs are independent under the Poisson approximation, the probability that every pair of leaf nodes receives at least one signal is
$$\Big(1 - \frac{e^{-c}}{N'}\Big)^{N'} \longrightarrow e^{-e^{-c}};$$
that is, the probability that not all pairs of leaf nodes are reached when $\frac{N}{2}\ln N' + \frac{cN}{2}$ signals have been sent approaches $1 - e^{-e^{-c}}$.
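Both limits above are easy to evaluate numerically (our own check): the bottleneck bound $(1 - 4/N'^{3/2})^{N'}$ approaches 1, and the limiting tail probability $1 - e^{-e^{-c}}$ falls off quickly in $c$.

```python
import math

# Bottleneck bound from Section 5: probability the upper tree is done
# by the time every leaf pair is reached, for growing N' = (N+1)/4.
for n in (5, 10, 15, 20):
    Np = 2**(n - 2)
    print(n, (1 - 4 / Np**1.5)**Np)      # tends to 1 as n grows

# Limiting tail of Theorem 5.1: Pr(Y exceeds (N/2)ln N' by cN/2).
for c in (0, 1, 2, 4):
    print(c, 1 - math.exp(-math.exp(-c)))   # c = 4 gives about 0.018
```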

6. Conclusion

At this point, we can conclude that the expected number of time steps needed in total is equal to $\frac{N}{2}\ln N' + \Theta(N)$, and that with high probability the total number of signals required stays very close to $\frac{N}{2}\ln N'$.

Acknowledgments

I would like to thank Professor Greg Lawler for first introducing me to probability theory and stochastic processes during the REU program. It is a pleasure to thank my mentor, Kevin Casto, for recommending reading materials and giving me insight into the topic of my paper. I would also like to thank Daniil Rudenko and Peter May for organizing the REU program. Your efforts are much appreciated.

References

[1] Michael Mitzenmacher, Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
