Course Notes, Part II: Probabilistic Combinatorics and Algorithms

J. A. Verstraete
Department of Mathematics
University of California San Diego
9500 Gilman Drive, La Jolla, California

2 Basic probabilistic inequalities

The probability spaces we deal with will generally be discrete. The reader is referred to Feller for a complete and formal background of probability theory, and to Williams for a shorter but still complete text.

2.1 Probability and Expectation

Recall that a probability space is a triple $(\Omega, \mathcal{F}, P)$, where $\Omega$ is a set, $\mathcal{F}$ is a family of subsets of $\Omega$ containing $\emptyset$, closed under complementation, and closed under countable unions, and $P : \mathcal{F} \to [0,1]$ is a countably additive function on $\mathcal{F}$ with $P(\emptyset) = 0$ and $P(\Omega) = 1$. The elements of $\Omega$ are called sample points and the sets in $\mathcal{F}$ are called events. If $\Omega$ is finite, then $P$ is determined completely by its value on each $\omega \in \Omega$. A random variable is a real-valued function $X : \Omega \to \mathbb{R}$ such that the inverse image $X^{-1}$ maps Borel subsets of $\mathbb{R}$ (that is, sets which consist of unions and intersections of countably many half-closed intervals $(a, b]$) to events in $\mathcal{F}$.

If $(\Omega, \mathcal{F}, P)$ is a probability space and $A \in \mathcal{F}$ is any event, then we write "$A$ a.s." (almost surely) instead of $P(A) = 1$. For a sequence of events $A_1, A_2, \ldots \in \mathcal{F}$, we write "$A_n$ a.a.s." to mean $\lim_{n \to \infty} P(A_n) = 1$; this says $A_n$ occurs asymptotically almost surely.

2.1.1 Expectation

In the following definitions, integrals are taken to denote Lebesgue integrals. For the purposes we have in mind, most of the integrals will be Riemann integrals, or even finite sums. Let $(\Omega, \mathcal{F}, P)$ be a probability space. The expectation (or first moment) of a random variable $X$, when it exists, is defined by

$$E(X) = \int_\Omega X(\omega) \, dP(\omega).$$

Let $F(x)$ denote $P(X \le x)$ for each $x \in \mathbb{R}$. This is the cumulative distribution function of $X$, and its derivative, when it exists, is called the probability density function of $X$. In practice, if $f$ is the density function of $X$ and $X$ has range $R \subseteq \mathbb{R}$, then

$$E(X) = \int_R x f(x) \, dx.$$

We will assume henceforth that when we write an expression involving $E(X)$, the mean of $X$ exists. In the last section, we used the fact that the expectation is a linear operator, together with the fact that for any random variable $X$, there is a point $\omega \in \Omega$ such that $X(\omega) \ge E(X)$ and a point $\omega' \in \Omega$ such that $X(\omega') \le E(X)$. This simple notion is very useful in general when applying the probabilistic method.

In general, the $r$th moment of $X$ is the expectation of $X^r$, and the variance of $X$ is $\mathrm{var}(X) = E(X^2) - E(X)^2$. The standard deviation of $X$ is $\sqrt{\mathrm{var}(X)}$, and we often write $\mathrm{var}(X) = \sigma^2$ and the standard deviation as $\sigma$. Amongst other things, we will use these as parameters to measure the concentration of a random variable $X$. One of the crucial properties of expectation we shall use is that it is a linear operator: if $X_1, X_2, \ldots, X_n$ are random variables, then

$$E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n).$$

It is not true in general, however, that $E(XY) = E(X)E(Y)$.
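The unconditional validity of linearity, and the failure of $E(XY) = E(X)E(Y)$ under dependence, are easy to see by simulation. The following is a minimal sketch (not part of the original notes) taking $Y = X$ for a uniform $\pm 1$ variable $X$.

```python
import random

# A minimal sketch: linearity of expectation holds regardless of dependence,
# but E(XY) = E(X)E(Y) can fail for dependent variables.
random.seed(0)
trials = 100_000

sum_x = sum_xy = sum_x_plus_y = 0.0
for _ in range(trials):
    x = random.choice([-1, 1])   # uniform on {-1, +1}
    y = x                        # Y = X, maximally dependent on X
    sum_x += x
    sum_xy += x * y
    sum_x_plus_y += x + y

ex = sum_x / trials
# Linearity: E(X + Y) = E(X) + E(Y), both approximately 0.
print(sum_x_plus_y / trials, 2 * ex)
# But E(XY) = E(X^2) = 1, while E(X)E(Y) is approximately 0.
print(sum_xy / trials, ex * ex)
```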

2.1.2 Independence

Events $E_1, E_2, \ldots, E_n \in \mathcal{F}$ are said to be independent if for every subset $I \subseteq [n]$,

$$P\Big(\bigcap_{i \in I} E_i\Big) = \prod_{i \in I} P(E_i).$$

Random variables $X_1, X_2, \ldots, X_n$ are independent if the events $E_i = \{X_i \le x_i\}$ are independent for all choices of $x_1, \ldots, x_n$. There are many important theorems regarding independent events and random variables. For example, if $X_1, X_2, \ldots, X_n$ are independent random variables, then there are many things one can say about their sum $S_n = \sum_{i=1}^n X_i$, especially as $n$ tends to infinity. A special case of the central limit theorem is that if $P(X_i = 1) = P(X_i = -1) = \frac12$, then $n^{-1/2} S_n$ tends in distribution to a standard Gaussian. One of the basic facts concerning independent random variables $X_1, X_2, \ldots, X_n$ is that

$$E(X_1 X_2 \cdots X_n) = \prod_{i=1}^n E(X_i).$$

The reader should consult Williams for a succinct exposition of the notions and theorems concerning independence.

2.1.3 Conditional Expectation

For the probability space $(\Omega, \mathcal{F}, P)$ and a set $B \in \mathcal{F}$ of non-zero measure, the probability of $A \in \mathcal{F}$ given $B$ is $P(A \cap B)/P(B)$, which is written $P(A \mid B)$. We read this as "the conditional probability of $A$ given $B$" or, simply, "the probability of $A$ given $B$". This defines a probability measure $P_B$ on $(\Omega, \mathcal{F})$, which allows us to define a random variable from a given random variable $X$ by considering $X$ as a function on $(\Omega, \mathcal{F}, P_B)$. This random variable is denoted $X \mid B$. While it is true that for disjoint events $A$ and $B$, $P(A \cup B) = P(A) + P(B)$, the reader should easily come up with examples where $P(C \mid A \cup B) \ne P(C \mid A) + P(C \mid B)$. The expectation of this random variable is called the conditional expectation of $X$ given $B$, written $E(X \mid B)$. Furthermore, if $Y$ is a random variable, then $E(X \mid Y = y)$ is just $\sum_x x\,P(X = x \mid Y = y)$, the sum being over the range of $X$. If we do not specify the value of $Y$, then one obtains a random variable called the conditional expectation of $X$ given $Y$, which is denoted $E(X \mid Y)$. One of the main properties of this random variable is that its expectation can be computed from the formula

$$E(X) = E(E(X \mid Y)).$$

This is sometimes called the tower property of conditional expectation, and is fundamental to the definition of martingales. We will return to this important notion in greater depth at a later stage.

2.1.4 Basic Inequalities

We have already discussed some basic combinatorial inequalities involving binomial coefficients in Part I. There are a number of useful inequalities concerning probability and expectation of random variables. Perhaps the most commonly useful inequality is the Cauchy–Schwarz inequality, which, in its simplest form, states that $E(X)^2 \le E(X^2)$ for any random variable $X$. This is a special case of a much more general inequality known as Jensen's inequality: let $f$ be a convex function on an open interval $I \subseteq \mathbb{R}$, and let $X$ be a random variable such that $P(X \in I) = 1$ and $E(f(X))$ and $E(|X|)$ are both finite. Then

$$E(f(X)) \ge f(E(X)).$$
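Jensen's inequality is easy to see numerically. The following is a minimal sketch (not part of the original notes) with the convex function $f(x) = e^x$ and $X$ uniform on $(-1, 1)$.

```python
import random, math

# A minimal sketch: Jensen's inequality E(f(X)) >= f(E(X)) for convex f.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(100_000)]

mean = sum(xs) / len(xs)
e_fx = sum(math.exp(x) for x in xs) / len(xs)

print("E(f(X)) =", e_fx)            # about sinh(1) = 1.1752...
print("f(E(X)) =", math.exp(mean))  # about e^0 = 1
```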

A second important inequality is Hölder's inequality, which can be deduced from Jensen's inequality: let $p, q \ge 1$ be real numbers with $1/p + 1/q = 1$. Then

$$E(XY) \le E(X^p)^{1/p} E(Y^q)^{1/q}$$

when $X$ and $Y$ are non-negative random variables such that $E(X)$, $E(Y)$ and $E(XY)$ are all finite. This inequality is fundamental not only in probability theory, but also in functional analysis (e.g. duality of normed spaces).

We will also require some inequalities for real numbers. The Taylor series of a single-variable function $f$ about zero, when it exists, is defined by

$$\sum_{k=0}^\infty f^{(k)}(0) \frac{x^k}{k!}.$$

One can truncate the Taylor series at some term to obtain an approximation to the function $f$. This idea allows us to obtain several useful inequalities concerning real numbers. Two very familiar Taylor series are

$$e^x = \sum_{k=0}^\infty \frac{x^k}{k!} \qquad \text{and} \qquad \ln(1+x) = \sum_{k=0}^\infty \frac{(-1)^k x^{k+1}}{k+1},$$

where the second is valid only for $-1 < x \le 1$. Using these series, it is fairly straightforward to prove some of the following inequalities for positive real numbers $x_i$ (the proofs are left as exercises):

1. $\prod_{i=1}^n (1 - x_i) \le \exp\left(-\sum_{i=1}^n x_i\right)$
2. $\left(\prod_{i=1}^n x_i\right)^{1/n} \le \frac{1}{n} \sum_{i=1}^n x_i$

It is always useful to remember

$$\Big(1 - \frac{1}{n}\Big)^n < \frac{1}{e} < \Big(1 - \frac{1}{n}\Big)^{n-1}.$$

More inequalities may be deduced by converting sums to integrals. We know that a convergent Riemann integral is the limiting value of a Riemann sum. Thus, for example,

$$\int_0^1 f(t) \, dt = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^n f(k/n)$$

for continuous bounded functions $f$ on $[0,1]$. Another trick is to convert the sum directly to an integral. For instance, if $f_k^+$ denotes the maximum value of $f$ on the interval $[k, k+1]$ and $f_k^-$ the minimum value, then clearly

$$\sum_{k=1}^{n-1} f_k^- \le \int_1^n f(t) \, dt \le \sum_{k=1}^{n-1} f_k^+.$$

Another thing to note is that sums over subsets frequently arise: for example one might recognise the identity

$$\prod_{i=1}^n (1 + x_i) - 1 = \sum_{\emptyset \ne S \subseteq [n]} \prod_{i \in S} x_i,$$

where the sum is over all non-empty subsets $S$ of $[n]$. The expression on the left is much more manageable to estimate using preceding inequalities. But suppose the sum on the right is only over subsets of $[n]$ of size $k$. Then the product on the left is still an upper bound for that sum, but often not a good one. To fix this we introduce a weight $\alpha > 0$ as follows:

$$\sum_{|S| = k} \prod_{i \in S} x_i = \alpha^{-k} \sum_{|S| = k} \prod_{i \in S} (\alpha x_i) \le \alpha^{-k} \prod_{i=1}^n (1 + \alpha x_i).$$

Having an estimate for the product then allows us to minimize the result over $\alpha$. With a bit of luck, an appropriate choice of $\alpha$ will give a good estimate. There are many more analytic techniques for evaluating sums, but we will not give them here. The reader may wish to consult some texts on generating functions.
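The effect of the weight $\alpha$ is easy to see numerically. The following is a minimal sketch (not part of the original notes) comparing the size-$k$ subset sum with the bound $\alpha^{-k}\prod_i(1 + \alpha x_i)$ at $\alpha = 1$ and at a numerically optimized $\alpha$; the values of the $x_i$ are arbitrary illustrative choices.

```python
import itertools, math

# A minimal sketch: the weighting trick for size-k subset sums.
x = [0.3, 0.1, 0.25, 0.05, 0.2, 0.15]
k = 3

# Exact value: the k-th elementary symmetric sum of the x_i.
e_k = sum(math.prod(s) for s in itertools.combinations(x, k))

def bound(alpha):
    return alpha ** (-k) * math.prod(1 + alpha * xi for xi in x)

# Crude one-dimensional search over the weight alpha > 0.
best = min(bound(a / 100) for a in range(1, 2000))

print(e_k, bound(1.0), best)  # the optimized bound is much tighter than alpha = 1
```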

5 over subsets of [n] of size k. Then the product on the left is still an upper bound for that sum, but often not a good one. To fix this we introduce a weight α > 0 as follows: x i = α k (αx i ) α k S =k i S S =k i S n (1 + αx i ). Having an estimate for the product then allows us to minimize the result over α. With a bit of luck, an appropriate choice of α will give a good estimate. There are many more analytic techniques for evaluating sums, but we will not give them here. The reader may wish to consult some texts on generating functions Classical Distributions In most of our work, the same probability distributions will recur. These are, principally, the binomial distribution, the Poisson distribution, and the normal distribution (or Gaussian distribution. These are respectively defined by x ( ) n P (X x) = p t (1 p) n t t P (X x) = P (X x) = t=0 x t=0 x e λ λ t t! 1 2πσ e 1 2σ 2 (t µ)2 One should recall the meaning of the parameters in each of these distributions, for example the rate of the Poisson process is the parameter λ featuring in the second distribution function above, and the mean and variance of a Gaussian random variable with the distribution in the third line are µ and σ respectively. Other useful distributions in our work are the geometric distribution, the negative binomial distribution, the exponential distribution and the hypergeometric distribution. A short account of the basic facts concerning these distributions, presented in a way that is tailored for the material to follow, may be found in Bollobás book on random graphs. The moment generating function of a random variable X is denoted by M X (t) and defined by M X (t) = E(e tx ). The probability generating function of a random variable X is denoted by G X (t) and defined by G X (t) = E(t X ). The names assigned to these functions are natural, in the sense, for example, that the rth derivative of M X (t) evaluated at zero is precisely the rth moment of X. Finally, the characteristic function ϕ X of X is the complex-valued function defined by ϕ X (t) = E(e itx ) = e itx df X (x) where F X is the cumulative distribution function of X. A fundamental theorem here is Lévy s convergence theorem: Theorem 1 Let ϕ n be the characteristic function of a distribution function F n, for n N, and suppose that ϕ(t) = lim n ϕ n (t) exists and is continuous for any real number t. Then there is a distribution function F of which ϕ is the characteristic function. One of the consequences of this theorem is the famous central limit theorem, which we state in the next chapter. 4 R dt.

2.2 Markov's and Chebyshev's Inequalities

In the present chapter, we start to develop some tools from probability theory which, although remaining simple, allow us to increase the breadth of applicability of the probabilistic method. The main theme is that of concentration: in many situations, one is required to know not only the expectation of a random variable, but also how far the random variable deviates from its expectation. Therefore most of the inequalities we develop will be collected under the title of concentration inequalities.

Two inequalities which are applicable regardless of the distribution of our random variable are Markov's and Chebyshev's inequalities. Based only on the variance and the mean of a random variable $X$, Markov's and Chebyshev's inequalities tell us something about the concentration of $X$ around its mean. Throughout what follows, $\sigma^2$ denotes the variance of a random variable $X$ and $\sigma$ denotes the standard deviation of $X$.

Markov's inequality follows from the simple fact that if $X$ is a non-negative random variable and $\lambda > 0$, then $X \ge \lambda I_{X \ge \lambda}$, so

$$\lambda P(X \ge \lambda) = E(\lambda I_{X \ge \lambda}) \le E(X).$$

Here $I_{X \ge \lambda}$ denotes the indicator function of the event $X \ge \lambda$: $I_{X \ge \lambda}(\omega) = 0$ if $X(\omega) < \lambda$ and $I_{X \ge \lambda}(\omega) = 1$ if $X(\omega) \ge \lambda$. Therefore we obtain Markov's inequality:

$$P(X \ge \lambda E(X)) \le \frac{1}{\lambda}.$$

To obtain Chebyshev's inequality, replace $X$ with the non-negative random variable $(X - E(X))^2$. Then we obtain

$$P(|X - E(X)| \ge \lambda\sigma) \le \frac{1}{\lambda^2}.$$

If $E(X) = \mu \ne 0$, then taking $\lambda = \mu/\sigma$ (and noting that $X = 0$ implies $|X - \mu| \ge \mu$) this reduces to

$$P(X = 0) \le \frac{\sigma^2}{\mu^2}.$$

Markov's and Chebyshev's inequalities have many applications in combinatorics and elsewhere, due to their generality. Later we will see that for the distributions we are interested in, much stronger concentration inequalities can be found.
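The following minimal sketch (not part of the original notes) compares both inequalities with the empirical tails of a binomial random variable; the bounds are valid but, as expected, far from tight here.

```python
import random

# A minimal sketch: Markov's and Chebyshev's bounds versus empirical tails
# of a Binomial(n, 1/2) random variable.
random.seed(1)
n, trials = 100, 20_000
mu, sigma = n / 2, (n / 4) ** 0.5            # mean and standard deviation

samples = [sum(random.random() < 0.5 for _ in range(n)) for _ in range(trials)]

lam = 1.2                                    # Markov: P(X >= lam*mu) <= 1/lam
tail = sum(x >= lam * mu for x in samples) / trials
print("P(X >= 1.2 mu) =", tail, "  Markov bound:", 1 / lam)

k = 2.0                          # Chebyshev: P(|X - mu| >= k*sigma) <= 1/k^2
dev = sum(abs(x - mu) >= k * sigma for x in samples) / trials
print("P(|X - mu| >= 2 sigma) =", dev, "  Chebyshev bound:", 1 / k ** 2)
```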

2.2.1 Subset sums

We give a simple application of Chebyshev's inequality in combinatorial number theory. Although it does not give much better than a straightforward counting argument, it is how the inequality is used that should be retained: at first impression there is no probability in sight. The basic problem is this: what is the largest size of a subset $A$ of $[n]$ such that no two non-empty subsets of $A$ have the same sum? Since there are $2^k$ subset sums in a set of size $k$, all of them integers between $0$ and $nk$, we must have $2^k \le nk + 1$, giving the upper bound $k \le \log_2 n + \log_2\log_2 n + 1$. A modest improvement of the second-order term is given by Chebyshev's inequality:

Theorem 2. The maximum number of elements of $[n]$ that can be chosen so that all subsets have distinct sums is between $\log_2 n$ and $\log_2 n + \frac12 \log_2\log_2 n + 3$.

Proof. Since the sequence of powers of two in $[n]$ has distinct subset sums, the lower bound is proved. We can do better than counting using second moments. Let $A = \{x_1, x_2, \ldots, x_k\}$ be a subset of $[n]$ in which all subset sums are distinct, and assign a random weight $\varepsilon_i \in \{0,1\}$ to $x_i$, uniformly and independently for all $i \in [k]$. If $X = \sum_{i=1}^k \varepsilon_i x_i$, then

$$E(X) = \frac12 \sum_{i=1}^k x_i \qquad \text{and} \qquad \mathrm{var}(X) = \frac14 \sum_{i=1}^k x_i^2.$$

This last sum is clearly at most $kn^2/4$, since each $x_i$ is an element of $[n]$; in particular $\sigma \le n\sqrt{k}/2$. Take a real number $\lambda > 1$. By Chebyshev's inequality,

$$P\big(|X - E(X)| < \lambda n\sqrt{k}/2\big) \ge 1 - \frac{1}{\lambda^2}.$$

Now (this is a key point in the proof) the probability $P(X = x)$ is either zero or $2^{-k}$, since no pair of distinct subsets of $A$ has the same sum, by assumption. As the interval $(E(X) - \lambda n\sqrt{k}/2, \, E(X) + \lambda n\sqrt{k}/2)$ contains fewer than $\lambda n\sqrt{k} + 1$ integers, we get

$$P\big(|X - E(X)| < \lambda n\sqrt{k}/2\big) \le 2^{-k}\big(\lambda n\sqrt{k} + 1\big).$$

So if $\lambda = 2$, then $2^{k-2} \le n\sqrt{k}$, which gives the required bound on $k$. $\square$

The problem of determining whether there is a constant $c$ so that any subset of $[n]$ whose subset sums are distinct has size at most $\log_2 n + c$ is one of Erdős' oldest problems, dating back to the 1930s.
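As a quick illustration (a sketch, not part of the original notes), one can test small sets for the distinct-subset-sum property by brute force; the set $\{3,5,6,7\} \subseteq [7]$ shows that powers of two are not optimal.

```python
from itertools import chain, combinations

# A minimal sketch: test whether all non-empty subsets of a set have
# distinct sums.
def distinct_subset_sums(a):
    sums = [sum(s) for s in chain.from_iterable(
        combinations(a, r) for r in range(1, len(a) + 1))]
    return len(sums) == len(set(sums))

print(distinct_subset_sums([1, 2, 4, 8]))  # powers of two: True
print(distinct_subset_sums([3, 5, 6, 7]))  # True: four elements of [7],
                                           # one more than {1, 2, 4}
print(distinct_subset_sums([2, 3, 4, 5]))  # False: 2 + 5 = 3 + 4
```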

2.2.2 Large Chromatic Number and Girth

The chromatic number of a graph $G$ is the minimum number of colours which can be assigned to the vertices of $G$ so that no two adjacent vertices are assigned the same colour. For example, the complete graph on $n$ vertices has chromatic number $n$, and bipartite graphs have chromatic number at most two. The famous Four Colour Theorem asserts that planar graphs have chromatic number at most four. The chromatic number of a graph is a notoriously hard parameter to deal with. It seems to depend globally on the structure of the graph, and the following result of Erdős shows that even if a graph is locally very sparse, it can still have high chromatic number. The proof of this result is a slightly more subtle alteration of the construction of dense graphs of large girth in Part I.

Theorem 3. For every pair of numbers $g, k$, there is a graph of chromatic number at least $k$ in which every cycle has length greater than $g$.

Proof. We select each edge of the complete graph on $n$ vertices independently with probability $p = n^{\gamma-1}$, where $0 < \gamma < 1/g$, to obtain a random graph $G$. We let $n \to \infty$ throughout the proof. The probability that $(v_1, v_2, \ldots, v_l, v_1)$ is a cycle in $G$ is clearly $p^l$, so the expected number of cycles $X$ of length at most $g$ in $G$ is exactly

$$E(X) = \sum_{l=3}^{g} \frac{n(n-1)(n-2)\cdots(n-l+1)}{2l}\, p^l \le \sum_{l=3}^{g} \frac{n^{\gamma l}}{2l} = o(n),$$

since $\gamma g < 1$. So by Markov's inequality, $P(X > n/2) \to 0$. Now let $a = \lceil 3p^{-1}\ln n \rceil$ and let $Y$ denote the number of sets of $a$ vertices of $G$ with no edges between them (such a set is called an independent set or stable set of $G$). Then

$$E(Y) = \binom{n}{a}(1-p)^{\binom{a}{2}} \to 0,$$

so Markov's inequality shows $P(Y \ge 1) \to 0$. Therefore, for large $n$, there exists a specific graph $G$ for which $X \le n/2$ and $Y = 0$. Now remove from $G$ one vertex of every cycle of length at most $g$, to get a graph $H$ with at least $\frac12 n$ vertices, with no cycle of length at most $g$, and with no stable set of size $a$. Now we make the following observation: if a graph on $m$ vertices has no stable set of size $a$, then its chromatic number is more than $m/a$. This follows from the fact that the vertices of any one colour form a stable set, so each colour class has fewer than $a$ vertices, and the number of colours must therefore be more than $m/a$. Applying this observation in $H$, we see that the chromatic number of $H$ is at least

$$\frac{n}{2a} \ge \frac{n^\gamma}{6\ln n}.$$

If $n$ is large enough, this is as large as we wish. $\square$

Explicit constructions of graphs of large chromatic number and girth (the girth of a graph containing a cycle is the length of its shortest cycle) were first given by Lovász. Since then, many other constructions have been found; in particular, certain Ramanujan graphs of Lubotzky, Phillips and Sarnak give, for arbitrary $k$, $n$-vertex graphs of girth $\Theta(\log n)$ and chromatic number larger than $k$. All this indicates the difficulty in dealing with colouring: even if the graph is locally a tree (the case for graphs of large girth), the chromatic number may still be large. Many other results show that the chromatic number of a graph appears to be a global property.

2.3 The Chernoff Bound

One of the most fundamental theorems in probability is the central limit theorem. It is a statement about the convergence in distribution of sums of many independent random variables to a Gaussian or normal distribution, based on mild assumptions on the moments of the random variables. There are many versions of this theorem, of which we will state only one. The theorem is as follows; here $Y$ denotes a standard Gaussian random variable, and we write "$X_i$ i.i.d." as shorthand for "the random variables $X_i$ are independently and identically distributed".

Theorem 4. If $X_1, X_2, \ldots, X_n$ are independent random variables with means $\mu_i$ and bounded variances $\sigma_i^2$, then with $S_n = \sum_{i=1}^n X_i$, $\mu_n = \sum_{i=1}^n \mu_i$ and $\sigma_n^2 = \sum_{i=1}^n \sigma_i^2$,

$$\frac{S_n - \mu_n}{\sigma_n} \xrightarrow{d} Y.$$

See Feller's probability theory book or Kallenberg's introduction to probability for more general versions, the precise hypotheses (a Lindeberg-type condition is needed in this generality), and proofs of this theorem. The standard proof of the central limit theorem involves an application of Lévy's convergence theorem. We shall use the central limit theorem (actually a minor modification of it) to prove the Erdős–Kac theorem on prime divisors in a later section.

Chebyshev's inequality gives a polynomial bound on the probability that a random variable is a certain number of standard deviations away from its expectation. The central limit theorem sometimes provides an exponentially small bound for these so-called tail events or large deviations. For example, when $\lambda$ is fixed, we have

$$P(|S_n - \mu_n| \ge \lambda\sigma_n) \to P(|Y| \ge \lambda) \le e^{-\frac12\lambda^2}.$$
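The following minimal sketch (not part of the original notes) illustrates this Gaussian tail bound for sums of independent $\pm 1$ variables; here $\mu_n = 0$ and $\sigma_n = \sqrt{n}$.

```python
import random, math

# A minimal sketch: the tail P(|S_n| >= lam * sqrt(n)) for a sum of n
# independent +-1 variables, against the Gaussian bound e^{-lam^2/2}.
random.seed(2)
n, trials, lam = 100, 20_000, 2.0

count = 0
for _ in range(trials):
    s = sum(random.choice((-1, 1)) for _ in range(n))
    if abs(s) >= lam * math.sqrt(n):
        count += 1

print("empirical tail:", count / trials)           # about 0.046
print("bound e^{-lam^2/2}:", math.exp(-lam * lam / 2))  # about 0.135
```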

This is fine when $\lambda$ is constant, but when $\lambda$ depends on $n$, then the quality of convergence to normality becomes important. It is certainly possible to find random variables where the convergence in distribution is very slow. However, when the $X_i$ are independent with Bernoulli distributions, then $X = \sum X_i$ has a binomial distribution (when the means are equal), and a bound can be obtained for all $n$, which is called the Chernoff bound (1952):

Theorem 5. Let $X_i$, $i = 1, 2, \ldots, n$, be independent random variables with Bernoulli distributions with means $p_i$, $i \in [n]$, and let $p = \frac{1}{n}\sum_{i=1}^n p_i$ and $h \ge 0$. Then

$$P(S_n > (p+h)n) \le e^{-h^2 n/2a}, \qquad P(S_n < (p-h)n) \le e^{-h^2 n/2b},$$

where $S_n = X_1 + X_2 + \cdots + X_n$, $a$ is the maximum of $\alpha(1-\alpha)$ for $p \le \alpha \le p+h$, and $b$ is the maximum of $\beta(1-\beta)$ for $p-h \le \beta \le p$. In particular, if $X$ has binomial distribution with probability $p$ and mean $pn$, then for any $0 < \varepsilon < 1$,

$$P(|X - pn| > \varepsilon pn) < 2e^{-\varepsilon^2 pn/3}.$$

Proof. Let $t, \gamma > 0$ be real numbers, let $S = S_n$, and write

$$P(S \ge \gamma n) = P(e^{tS} \ge e^{t\gamma n}) \le E(e^{tS})\, e^{-t\gamma n} = M_S(t)\, e^{-t\gamma n}.$$

In the second step we used Markov's inequality. Recall that the moment generating function of a random variable $X$ is $M_X(t) := E(e^{tX})$. Now the moment generating function $M_S(t)$ is exactly the product of the moment generating functions $M_{X_i}(t)$, since the $X_i$ are independent random variables. Since $M_{X_i}(t) = 1 - p_i + p_i e^t$, we have

$$P(S \ge \gamma n) \le e^{-t\gamma n} \prod_{i=1}^n (1 - p_i + p_i e^t) \le e^{-t\gamma n} \big((1-p) + pe^t\big)^n.$$

In the last step we used the arithmetic mean-geometric mean inequality. Minimizing this expression over $t \in [0, \infty)$, we obtain a minimum of

$$\Big(\frac{p}{\gamma}\Big)^{\gamma n} \Big(\frac{1-p}{1-\gamma}\Big)^{(1-\gamma)n} = e^{-I(\gamma) n} \qquad \text{when} \qquad e^t = \frac{\gamma(1-p)}{p(1-\gamma)},$$

where $I(\gamma) = \gamma \ln\frac{\gamma}{p} + (1-\gamma)\ln\frac{1-\gamma}{1-p}$. Now we put $\gamma = p + h$ and use some simple estimates from first-year calculus to get the result. The second statement of concentration is proved similarly (or by considering complementary events). To get the last statement of the theorem, take $\gamma = (1+\varepsilon)p$ and $\gamma = (1-\varepsilon)p$ in the bound $e^{-I(\gamma)n}$, note that more first-year calculus gives $I((1+\varepsilon)p) \ge \varepsilon^2 p/3$ and $I((1-\varepsilon)p) \ge \varepsilon^2 p/2$ for $0 < \varepsilon < 1$, and add the two bounds together. $\square$

There are many other forms of the Chernoff bound. In Assignment 2, you are asked to prove the following inequality: if $X$ is a sum of independent random variables $X_i$ where $|X_i| \le 1$ and $E(X_i) = 0$, then for $0 \le \lambda \le 2\sigma$,

$$P(|X| > \lambda\sigma) \le 2e^{-\frac14\lambda^2},$$

where $\sigma$ is the standard deviation of $X$. In the next few subsections we give some applications of the Chernoff bound.
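Before turning to the applications, here is a small numerical sanity check (not part of the original notes) of the optimization step in the proof: the stated choice of $t$ minimizes $e^{-t\gamma}(1 - p + pe^t)$, and the minimum equals $e^{-I(\gamma)}$.

```python
import math

# A minimal sketch: e^t = gamma(1-p)/(p(1-gamma)) minimizes
# f(t) = e^{-t*gamma} (1 - p + p e^t), the per-variable Chernoff factor.
p, gamma = 0.3, 0.45

def f(t):
    return math.exp(-t * gamma) * (1 - p + p * math.exp(t))

t_star = math.log(gamma * (1 - p) / (p * (1 - gamma)))
grid_min = min(f(k / 1000) for k in range(0, 5000))   # search t in [0, 5]

print(f(t_star), grid_min)   # agree to the grid resolution

# f(t_star) equals e^{-I(gamma)} with I the relative entropy:
I = gamma * math.log(gamma / p) + (1 - gamma) * math.log((1 - gamma) / (1 - p))
print(math.exp(-I))
```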

2.3.1 Triangles in Random Graphs

A very natural instance of the binomial distribution comes from random graphs. We consider the sample space $\Omega_n$ of all graphs on $n$ vertices, with probability measure

$$P_p(G) = p^{e(G)} (1-p)^{\binom{n}{2} - e(G)}.$$

Since we can view $\Omega_n$ as a product of $\binom{n}{2}$ copies of $\{0,1\}$ (one probability space for each edge), the probability measure $P_p$ is a product probability measure. In words, edges of a random graph $\omega \in \Omega_n$ appear independently with probability $p$. For simplicity, we refer to a random graph as $G_{n,p}$ if it is taken from the sample space $\Omega_n$ with probability measure $P_p$. So the number of edges in a random graph has a binomial distribution, and we may apply the Chernoff bound. When we get to the section on random graphs, we'll look in detail at their structure. For now, as an example of the Chernoff bound, we give another structural property of random graphs: every pair of vertices has roughly the same number of common neighbours.

Theorem 6. Let $\Delta(u,v)$ be the number of triangles in $G_{n,p}$ containing an edge $\{u,v\} \in E(G_{n,p})$. If $p^2 n/\log n \to \infty$, then a.a.s.

$$\Delta(u,v) \sim p^2 n \quad \text{for every edge } \{u,v\}.$$

Proof. It is sufficient to prove that a.a.s. every pair $\{u,v\}$ of vertices of $G_{n,p}$ has codegree $d(u,v) \sim p^2 n$, where the codegree of $u$ and $v$ is the number of common neighbours of $u$ and $v$ (vertices adjacent to both $u$ and $v$). Let $\mu = p^2(n-2)$ and let $X = d(u,v)$ for a fixed pair of vertices $\{u,v\}$. Note that $\mu = E(X)$. For $w \notin \{u,v\}$, let $X_w = 1$ if $uw, vw \in G_{n,p}$ and $X_w = 0$ otherwise. Then $X_w$ has Bernoulli distribution with probability $p^2$. More importantly, the random variables $X_w$ are independent. Now $X = \sum_{w \ne u,v} X_w$, so we can apply the Chernoff bound. If $0 < \varepsilon < 1$ and $n$ is large enough, then

$$P(|X - \mu| > \varepsilon\mu) < 2e^{-\varepsilon^2\mu/3} < \frac{1}{n^3}.$$

Here we used the assumption $p^2 n/\log n \to \infty$. It follows that the expected number of pairs $u, v$ with $|X - \mu| > \varepsilon\mu$ is less than $\frac{1}{n}$, which, by Markov's inequality, proves the theorem. $\square$

We wrote the statement of the theorem in fairly succinct form. Another way to write it is as follows (it should be clear why we chose the succinct form!): for all real numbers $\delta, \varepsilon > 0$, there exist positive integers $M = M(\varepsilon, \delta)$ and $N = N(\varepsilon, \delta)$ such that for every integer $n > N$, if $p : \mathbb{N} \to [0,1]$ is a function satisfying $p(n)^2 > M(\log n)/n$, then

$$P\big(\forall u, v \in G_{n,p} : |d(u,v) - p^2 n| < \delta p^2 n\big) > 1 - \varepsilon.$$
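The following minimal sketch (not part of the original notes) samples one graph $G_{n,p}$ and computes all codegrees directly; the chosen $n$ and $p$ are illustrative values with $p^2$ well above $(\log n)/n$.

```python
import random

# A minimal sketch: codegrees in G(n, p) concentrate around p^2 * n.
random.seed(3)
n, p = 300, 0.3                  # p^2 * n = 27, while log(n)/n is about 0.019

adj = [[False] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < p:
            adj[i][j] = adj[j][i] = True

codegrees = [sum(adj[i][w] and adj[j][w] for w in range(n))
             for i in range(n) for j in range(i + 1, n)]

print("p^2 n =", p * p * n)
print("min/max codegree over all pairs:", min(codegrees), max(codegrees))
```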

2.3.2 Upsets in Tournaments

We consider another striking example of an application of the Chernoff bound, concerning tournaments. Let $\sigma$ be a permutation of the $n$ letters $\{1, 2, \ldots, n\}$ and let $T$ be a tournament with players $\{1, 2, \ldots, n\}$. The permutation tells us which players should beat which, i.e. it is a ranking of the players. Accordingly, the game involving players $i$ and $j$ is called an upset if $i$ beats $j$ and $\sigma(i) < \sigma(j)$. The question is: in a given tournament, is there a ranking of the players that assures few upsets? For example, can we make sure that at most $\frac13$ of the games are upsets? It turns out that there are tournaments for which every ranking fails in this regard, as proved by Erdős and Moon (1963).

Theorem 7. There exists a tournament $T$ such that, for every ranking $\sigma$, the difference $D(\sigma)$ between the number of non-upsets and the number of upsets is at most $2n^{3/2}(\log n)^{1/2}$.

Proof. We may assume $n \ge 3$. We take a random tournament, in which the winner of each game is decided by an independent fair coin flip; then, for a fixed ranking $\sigma$ of $\{1, 2, \ldots, n\}$, each game is an upset with probability $\frac12$, independently of all other games. We have $D(\sigma) = \sum_e X_e$, where $X_e = -1$ if $e$ represents an upset and $X_e = 1$ otherwise. If we rescale the random variables $X_e$ using $Y_e = \frac12(X_e + 1)$, then the sum of the $Y_e$'s satisfies the requirements of Theorem 5 with $N = \binom{n}{2}$ variables of mean $\frac12$; applying that theorem with $hN = n^{3/2}(\log n)^{1/2}$ and $a \le \frac14$ gives

$$P\big(D(\sigma) > 2n^{3/2}(\log n)^{1/2}\big) \le e^{-2h^2 N} = e^{-4n^2\log n/(n-1)} < n^{-(n-1)} < \frac{1}{n!},$$

where the last inequality holds since $n! < n^{n-1}$ for $n \ge 3$. Since there are $n!$ possible rankings of our tournament, the expected number of rankings for which $D(\sigma) > 2n^{3/2}(\log n)^{1/2}$ is less than one. Therefore there is a tournament for which every ranking $\sigma$ produces $D(\sigma) \le 2n^{3/2}(\log n)^{1/2}$. $\square$

De la Vega showed, using more advanced techniques, that we can find tournaments of $n$ players such that $D(\sigma) = O(n^{3/2})$ for every ranking $\sigma$, and this is best possible apart from the implicit constant.

2.3.3 Max Cut

A cut in a graph $G$ is the set of edges of $G$ with one end in each part of a partition $(X, Y)$ of the vertex set of $G$. It is not hard to show that in every graph $G$, there is a cut with at least $\frac12 e(G)$ edges. Indeed, suppose $G$ has $2n$ vertices, and let $e(X,Y)$ be the number of edges in a cut $(X,Y)$ where $|X| = |Y| = n$. Each edge of $G$ is separated by exactly $\binom{2n-2}{n-1}$ of the $\frac12\binom{2n}{n}$ equipartitions, so

$$\sum_{(X,Y)} e(X,Y) = \binom{2n-2}{n-1}\, e(G),$$

and there exists an equipartition $(X,Y)$ for which

$$e(X,Y) \ge \frac{2\binom{2n-2}{n-1}}{\binom{2n}{n}}\, e(G) = \frac{n}{2n-1}\, e(G),$$

which implies $e(X,Y) \ge \frac12 e(G)$ for some $(X,Y)$. A slightly weaker result is obtained by choosing a set $X$ of vertices at random, including each vertex of $G$ independently with probability $\frac12$. Then the expected number of edges between $X$ and $Y = V(G) \setminus X$ is $\frac12 e(G)$, so there must be an $X$ for which $e(X,Y) \ge \frac12 e(G)$.
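The random argument in the last paragraph is easily simulated. The following minimal sketch (not part of the original notes) cuts a random graph with a uniformly random bipartition; the ratio of cut edges to all edges comes out close to $\frac12$.

```python
import random

# A minimal sketch: a uniformly random bipartition cuts about half of the
# edges of a graph (here a sample of G(n, p)).
random.seed(5)
n, p = 200, 0.1
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if random.random() < p]

side = [random.random() < 0.5 for _ in range(n)]   # independent coin flips
cut = sum(side[i] != side[j] for i, j in edges)

print("edges:", len(edges), " cut:", cut, " ratio:", cut / len(edges))
```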

A natural question is whether we can always find substantially larger cuts in a graph. We give a very natural application of the Chernoff bound to show that this is not the case.

Theorem 8. For all $\varepsilon > 0$, there exist graphs on $2n$ vertices of average degree $d \ge (12\log 2)/\varepsilon^2$ such that for every equipartition of the vertices into two classes $X$ and $Y$, the number of edges between $X$ and $Y$ is at least $(1-\varepsilon)\frac{dn}{2}$ and at most $(1+\varepsilon)\frac{dn}{2}$.

Proof. We take a random graph: let $p = \frac{d}{2n}$ and let $G$ be a graph in which each edge of $K_{2n}$ is present independently with probability $p$. For a given equipartition $(X,Y)$ of the vertices of $G$, the number of edges between $X$ and $Y$ is a sum of $n^2$ independent Bernoulli variables, with expectation $pn^2 = \frac{dn}{2}$. The Chernoff bound provides us with the necessary concentration: the probability that the number of edges between $X$ and $Y$ is less than $(1-\varepsilon)\frac{dn}{2}$ or more than $(1+\varepsilon)\frac{dn}{2}$ is at most $2e^{-\varepsilon^2 dn/6}$. So the expected number of equipartitions $(X,Y)$ with fewer than $(1-\varepsilon)\frac{dn}{2}$ or more than $(1+\varepsilon)\frac{dn}{2}$ edges between $X$ and $Y$ is at most

$$\frac12\binom{2n}{n} \cdot 2e^{-\varepsilon^2 dn/6} < e^{n\log 4 - \varepsilon^2 dn/6}.$$

Here we used that $\binom{2n}{n} < 2^{2n}$. Since $d \ge (12\log 2)/\varepsilon^2$, this is less than one. Therefore there exists a graph on $2n$ vertices for which every equipartition has between $(1-\varepsilon)\frac{dn}{2}$ and $(1+\varepsilon)\frac{dn}{2}$ edges. $\square$

The max cut problem is one of the central problems in combinatorial optimization. While it is NP-hard to determine the exact size of a largest cut in a graph, one may ask for efficient algorithms which give a good approximation to the maximum cut. One of the most important results in approximation algorithms, due to Goemans and Williamson, states that the maximum cut can be approximated to a factor of about eighty-eight percent. A recent result of Khot, Kindler, Mossel and O'Donnell shows that this is best possible, based on a controversial conjecture called the unique games conjecture.
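The following minimal sketch (not part of the original notes) illustrates the concentration behind Theorem 8: sampling many random equipartitions of a random graph, all observed cut sizes stay close to $dn/2$. A simulation can of course only sample equipartitions, not check all of them.

```python
import random

# A minimal sketch: cut sizes of random equipartitions of G(2n, p) with
# p = d/(2n) concentrate around dn/2.
random.seed(7)
n, d = 100, 40
p = d / (2 * n)

edges = [(i, j) for i in range(2 * n) for j in range(i + 1, 2 * n)
         if random.random() < p]

verts = list(range(2 * n))
cuts = []
for _ in range(1000):                 # sample 1000 random equipartitions
    random.shuffle(verts)
    x = set(verts[:n])
    cuts.append(sum((i in x) != (j in x) for i, j in edges))

print("dn/2 =", d * n // 2)
print("min/max sampled cut:", min(cuts), max(cuts))
```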

2.4 Classical Bases and the Borel–Cantelli Lemma

A set $A$ of non-negative integers is called a basis of order $k$ for a set $S \subseteq \mathbb{Z}$ if every element of $S$ may be written as a sum of $k$ elements of $A$. There are some very famous open problems concerning bases: for example, the Goldbach conjecture is that the set of all primes forms a basis of order two for the even integers greater than two. Fermat's Last Theorem (proved by Wiles) states that the set of $k$th powers is not a basis of order two for itself when $k > 2$. On the positive side, Vinogradov showed that every sufficiently large odd integer is a sum of three primes. A very general result on bases uses the following definition of density: the Schnirelmann density of a set $A \subseteq \mathbb{Z}$ is given by

$$\sigma(A) = \inf_{n \ge 1} \frac{|A \cap [n]|}{n}.$$

A fundamental theorem in number theory states that if $\sigma(A) > 0$ and $0 \in A$, then $A$ is a basis of order $k$ for the non-negative integers, for some positive integer $k$. Schnirelmann density is discussed at length in the book of Halberstam and Roth on sequences.

It is of interest in number theory to find bases which do not contain many elements. Some early examples include the set of triangular numbers (Gauss's theorem), the set of squares (Lagrange's theorem) and the set of $k$th powers, proved by Hilbert (this is known generally as Waring's problem). It is easy to see that a basis of order $k$ cannot contain too few elements of $[n]$: at most $|A|^k$ integers are a sum of $k$ elements of $A$, and since every integer in $[n]$ is a sum of $k$ elements of $A$, $|A| \ge n^{1/k}$.

Erdős and Turán conjectured (1941) that there exists no basis $A$ of order $k > 1$ such that every integer may be written as a sum of $k$ elements of $A$ in at most a constant number of ways. This is particular to the integers; indeed, in the cyclic group $\mathbb{Z}_{q^2+q+1}$, when $q$ is a prime power, one can construct perfect difference sets: a set $A$ such that every non-zero $x \in \mathbb{Z}_{q^2+q+1}$ can be written in exactly one way as a difference of two elements of $A$.

In this subsection, we are interested in finding a basis $A$ of order two which has the property that every integer in $[n]$ can be written in only a few ways as a sum of two elements of $A$ (such an $A$ is sometimes called a thin basis). We need the following lemma, known in probability theory as the first Borel–Cantelli lemma.

2.4.1 The Borel–Cantelli Lemma

For events $A_1, A_2, \ldots$, the event that $A_n$ occurs infinitely often, written sometimes as $\{A_n \text{ i.o.}\}$ or $\limsup A_n$, is defined by

$$\{A_n \text{ i.o.}\} := \bigcap_{i \ge 1} \bigcup_{j \ge i} A_j.$$

Lemma 9. Let $A_1, A_2, \ldots$ be a sequence of events with $\sum_{i \ge 1} P(A_i)$ finite. Then the probability that $A_n$ occurs infinitely often is zero.

Proof. For any positive integer $i$,

$$P(A_n \text{ i.o.}) \le P\Big(\bigcup_{j \ge i} A_j\Big) \le \sum_{j \ge i} P(A_j).$$

Since the sum of the $P(A_i)$ is finite, the right-hand side tends to zero as $i \to \infty$. $\square$

2.4.2 Back to bases

The following technical lemma is needed:

Lemma 10. Let $f(x) = \big(\frac{\log x}{x}\big)^{1/2}$ on $[n]$ and let $S_n = \sum f(x)f(y)$, where the sum is over all pairs $(x,y)$ with $x + y = n$ and $x, y \in [n]$. Then, as $n$ tends to infinity, $S_n \sim \pi \log n$.

Proof. Let $N = n/2$ and $t = x/N$. By symmetry, we may write

$$S_n = 2\sum_{0 \le x < N} f(N+x)f(N-x) \sim \frac{2\log n}{N} \sum_{0 \le x < N} \frac{1}{(1-t^2)^{1/2}},$$

since $f(N+x)f(N-x) = \sqrt{\log(N+x)\log(N-x)}/\sqrt{N^2 - x^2}$ and $\log(N \pm x) \sim \log n$ for all but a negligible proportion of the terms. Since $t = x/N$, the right-hand side is a Riemann sum, which converges to an integral:

$$\frac{S_n}{\log n} \to 2\int_0^1 \frac{dt}{(1-t^2)^{1/2}} = \pi.$$

This completes the proof. $\square$
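Lemma 10 is easy to check numerically. The following minimal sketch (not part of the original notes) computes $S_n$ directly and compares with $\pi$; the convergence is slow, as one expects from the logarithmic error terms.

```python
import math

# A minimal sketch: S_n = sum over x + y = n of f(x) f(y), with
# f(x) = sqrt(log(x)/x), satisfies S_n ~ pi * log(n).
def f(x):
    return math.sqrt(math.log(x) / x)

for n in (10**3, 10**4, 10**5, 10**6):
    s = sum(f(x) * f(n - x) for x in range(1, n))  # f(1) = 0, harmless
    print(n, s / math.log(n))                      # approaches pi = 3.14159...
```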

Theorem 11. There exists a set $A \subseteq \mathbb{Z}^+$ such that every sufficiently large $n \in \mathbb{Z}^+$ can be represented in $c_n \log n$ ways as a sum of two elements of $A$, where $8 \le c_n \le 24$.

Proof. Let $f(x)$ be the function defined in the last lemma. Construct a random set $A \subseteq \mathbb{Z}^+$ by taking $x \in A$ independently with probability $af(x)$ (which is at most one for all $x$ large enough), where the constant $a$ is to be determined. For all positive integers $x$ and $y$ such that $x + y = n$, let $I_{xy}$ be the indicator that $x, y \in A$, and let $A(n)$ be the number of ways of writing $n$ as a sum of two elements of $A$. Then the preceding lemma applies to give

$$a(n) := E(A(n)) = \sum_{x+y=n} E(I_{xy}) \sim \pi a^2 \log n.$$

Now the indicator variables $I_{xy}$ are independent, since we restricted to pairs $(x,y)$ with $x + y = n$, and distinct such pairs involve disjoint pairs of elements. So by the Chernoff bound,

$$P\big(|A(n) - a(n)| > \tfrac12 a(n)\big) < 2e^{-a(n)/12} \le n^{-4/3}$$

when $a^2 = 16/\pi$ and $n$ is large enough. Since $\sum_n n^{-4/3}$ converges, the first Borel–Cantelli lemma implies that the probability that $|A(n) - a(n)| > \frac12 a(n)$ occurs infinitely often is zero. Therefore there exists a set $A \subseteq \mathbb{Z}^+$ such that $|A(n) - a(n)| \le \frac12 a(n)$ for all sufficiently large $n$. Since $a(n) \sim \pi a^2 \log n = 16\log n$, we are done. $\square$

Erdős and Tetali extended this result to bases of order $k$: there exists a basis $A$ of order $k$ for the integers such that every integer $n$ can be represented as a sum of $k$ elements of $A$ in $\Theta(\log n)$ ways. The key difference between this and the proof above is that, unfortunately, the random variables corresponding to $I_{xy}$ are no longer independent.
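The construction in Theorem 11 can be simulated directly. The following minimal sketch (not part of the original notes) samples the random set $A$ with $a^2 = 16/\pi$, capping $af(x)$ at one for small $x$ (an implementation detail), and counts unordered representations, which should be close to $a(n)/2 \approx 8\log n$.

```python
import math, random

# A minimal sketch: sample A with P(x in A) = a*f(x), f(x) = sqrt(log(x)/x)
# and a^2 = 16/pi, then count representations of n as x + (n - x), x <= n/2.
random.seed(6)
a = math.sqrt(16 / math.pi)
N = 200_000

in_A = [False] * (N + 1)
for x in range(2, N + 1):
    prob = min(1.0, a * math.sqrt(math.log(x) / x))  # cap needed for small x
    in_A[x] = random.random() < prob

for n in (50_000, 100_000, 200_000):
    reps = sum(in_A[x] and in_A[n - x] for x in range(2, n // 2 + 1))
    print(n, "A(n) =", reps, "  8 log n =", round(8 * math.log(n)))
```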


Appendix B: Inequalities Involving Random Variables and Their Expectations Chapter Fourteen Appendix B: Inequalities Involving Random Variables and Their Expectations In this appendix we present specific properties of the expectation (additional to just the integral of measurable

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

1 Independent increments

1 Independent increments Tel Aviv University, 2008 Brownian motion 1 1 Independent increments 1a Three convolution semigroups........... 1 1b Independent increments.............. 2 1c Continuous time................... 3 1d Bad

More information

µ X (A) = P ( X 1 (A) )

µ X (A) = P ( X 1 (A) ) 1 STOCHASTIC PROCESSES This appendix provides a very basic introduction to the language of probability theory and stochastic processes. We assume the reader is familiar with the general measure and integration

More information

1.1 Szemerédi s Regularity Lemma

1.1 Szemerédi s Regularity Lemma 8 - Szemerédi s Regularity Lemma Jacques Verstraëte jacques@ucsd.edu 1 Introduction Szemerédi s Regularity Lemma [18] tells us that every graph can be partitioned into a constant number of sets of vertices

More information

Lecture 7: February 6

Lecture 7: February 6 CS271 Randomness & Computation Spring 2018 Instructor: Alistair Sinclair Lecture 7: February 6 Disclaimer: These notes have not been subjected to the usual scrutiny accorded to formal publications. They

More information

We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events.

We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events. 1 Probability 1.1 Probability spaces We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events. Definition 1.1.

More information

Preliminaries. Probability space

Preliminaries. Probability space Preliminaries This section revises some parts of Core A Probability, which are essential for this course, and lists some other mathematical facts to be used (without proof) in the following. Probability

More information

Disjoint Subgraphs in Sparse Graphs 1

Disjoint Subgraphs in Sparse Graphs 1 Disjoint Subgraphs in Sparse Graphs 1 Jacques Verstraëte Department of Pure Mathematics and Mathematical Statistics Centre for Mathematical Sciences Wilberforce Road Cambridge CB3 OWB, UK jbav2@dpmms.cam.ac.uk

More information

Asymptotic Statistics-III. Changliang Zou

Asymptotic Statistics-III. Changliang Zou Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (

More information

n px p x (1 p) n x. p x n(n 1)... (n x + 1) x!

n px p x (1 p) n x. p x n(n 1)... (n x + 1) x! Lectures 3-4 jacques@ucsd.edu 7. Classical discrete distributions D. The Poisson Distribution. If a coin with heads probability p is flipped independently n times, then the number of heads is Bin(n, p)

More information