APM 421 Probability Theory: Discrete Random Variables. Jay Taylor (ASU), Fall 2013.

1 APM 421 Probability Theory: Discrete Random Variables. Jay Taylor, Fall 2013.

2 Outline

1 Motivation
2 Infinite Sets and Cardinality
3 Countable Additivity
4 Discrete Random Variables
5 Probability Generating Functions
6 Geometric and related distributions
7 Poisson distribution
8 Fluctuation Tests
9 Poisson Processes

3 Motivation

Distributions on Infinite Spaces

Example: Suppose that a coin with probability p = 1/2 of landing on heads is repeatedly flipped, and let N be the number of flips that land on tails before we get the first heads. Assuming that the flips are independent of one another, we expect that

P(N = k) = (1/2)^{k+1} for any integer k ≥ 0,

since the event {N = k} occurs if and only if the first k flips all land on tails and the (k+1)-st toss lands on heads. This suggests that we can define N to be a random variable with values in the set of natural numbers N = {0, 1, ...} and distribution given by the above formula. In particular, notice that

∑_{k=0}^∞ P(N = k) = ∑_{k=0}^∞ (1/2)^{k+1} = 1,

which suggests that P(N < ∞) = 1.
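A quick way to convince yourself of this formula is by simulation. The following Python sketch (illustrative only; the helper name is made up) estimates P(N = k) from repeated coin-flip experiments and compares it with (1/2)^{k+1}:

```python
import random
from collections import Counter

def tails_before_first_heads(rng=random):
    """Flip a fair coin until it lands heads; return the number of tails seen."""
    n = 0
    while rng.random() >= 0.5:  # rng.random() < 0.5 counts as heads
        n += 1
    return n

trials = 100_000
counts = Counter(tails_before_first_heads() for _ in range(trials))
for k in range(6):
    print(k, counts[k] / trials, (1 / 2) ** (k + 1))
```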

4 Motivation

Unfortunately, our current formulation of probability does not allow us to define such a random variable N. The problem is that at the moment we only require probability distributions to be finitely additive, which means that while we can calculate the probabilities of finite sets, such as

P(N ≤ 4) = ∑_{k=0}^{4} P(N = k) = ∑_{k=0}^{4} (1/2)^{k+1} = 31/32,

we cannot similarly conclude that

P(N is even) = ∑_{k=0}^∞ P(N = 2k) = ∑_{k=0}^∞ (1/2)^{2k+1},

since

{N is even} = ∪_{k=0}^∞ {N = 2k}

expresses the event that N is even as a disjoint union of infinitely many sets, and finite additivity tells us nothing in this case.

5 Motivation

One way to address this problem is to define a distribution ν on N by requiring

ν(A) = ∑_{k∈A} (1/2)^{k+1}

for every subset A ⊆ N. Clearly, ν is coherent:

- ν(A) ≥ 0 for every A ⊆ N;
- ν(N) = ∑_{k≥0} ν({k}) = ∑_{k=0}^∞ (1/2)^{k+1} = 1;
- ν is finitely additive, since given any pair of disjoint subsets A, B ⊆ N,

ν(A ∪ B) = ∑_{k∈A∪B} (1/2)^{k+1} = ∑_{k∈A} (1/2)^{k+1} + ∑_{k∈B} (1/2)^{k+1} = ν(A) + ν(B).

However, ν is not the only coherent distribution on N which assigns probability (1/2)^{k+1} to each singleton set {k}. In fact, there are infinitely many such distributions and we don't yet know which one (if any) of these should be chosen as the distribution of N.

6 Motivation

The previous example demonstrated that finite additivity on its own may not be strong enough to uniquely define the probabilities of infinite sets. Our next example will reveal an even more serious defect: although sure loss cannot occur if we use a coherent probability distribution to place bets on a finite sample space, this is not true for infinite sample spaces.

Example: Let N = {0, 1, 2, ...} denote the set of natural numbers, and for each m ≥ 1 and k = 0, ..., m−1, let R_{m,k} be the set

R_{m,k} = {k + nm : n ≥ 0} = {k, k + m, k + 2m, k + 3m, ...}.

The sets R_{m,k} are called residue classes mod m. For example, R_{2,0} is the set of non-negative even integers, while R_{2,1} is the set of non-negative odd integers. It can be shown that there exists a coherent distribution µ on N with the following properties:

- µ(R_{m,k}) = 1/m for all m ≥ 1 and k = 0, ..., m−1;
- µ(B) = 0 for any finite subset B ⊆ N. In particular, µ({n}) = 0 for every n ≥ 0.

7 Motivation

Since R_{1,0} = N, it follows that

1 = µ(N) = µ(∪_{n≥0} {n}) ≠ ∑_{n≥0} µ({n}) = 0,

but this is no contradiction since we only require µ to be finitely additive. We will say that µ is a uniform distribution on the natural numbers.

Although µ is coherent, it has some unsavory properties. Let A be an event to which you assign probability P(A) = 1/2 and let X be the indicator variable for A, i.e., X = 1 if A occurs and X = 0 if A^c occurs. We will define a second random variable Y with values in N as follows. Given any subset B ⊆ N, define

P(Y ∈ B | X) = µ(B) if X = 0, and ν(B) if X = 1,

where ν is the distribution defined in the first example and µ is the uniform distribution defined in this example.

8 Motivation

Since Y has been defined in terms of X, we can use the law of total probability to calculate the probabilities of events of the form {Y ∈ B}. For example, observe that for every n ≥ 0,

P(Y = n) = P(Y = n | X = 0) P(X = 0) + P(Y = n | X = 1) P(X = 1)
         = µ({n}) · (1/2) + ν({n}) · (1/2)
         = 0 + (1/2)^{n+1} · (1/2)
         = (1/2)^{n+2},

since µ assigns probability 0 to every finite subset of N.

9 Motivation

Likewise, we can use Bayes' formula to calculate the conditional distribution of X given Y, e.g.,

P(X = 1 | Y = n) = P(X = 1, Y = n) / P(Y = n)
                 = P(Y = n | X = 1) P(X = 1) / P(Y = n)
                 = (1/2)^{n+1} (1/2) / (1/2)^{n+2}
                 = 1,

which holds for every n ≥ 0. In other words, although X is equally likely to be equal to 1 or 0, as soon as we learn the value of Y we can immediately deduce that X = 1, no matter what value Y assumes.

10 Motivation

These observations also lead to the following consequences for wagers on the event A. In the absence of any information about Y, we are willing to pay $0.50 for a $1 bet that A^c (the complement of A) will occur. However, if we subsequently learn the value of Y, then the value of our $1 wager on A^c immediately becomes $0, since we are then certain that A will occur. This example illustrates a phenomenon known as dynamic sure loss: by gaining information we guarantee that we will lose money.

Notice that we cannot escape this quandary by re-assigning the unconditional probability of A to be 1, since in the absence of information about Y, A is as likely to occur as it is not to occur. Rather, the problem is that coherence is not a strong enough condition on distributions on infinite spaces to avoid certain forms of sure loss. Fortunately, we can avoid these dilemmas by requiring that probability distributions on infinite spaces satisfy a stronger set of conditions.

11 Infinite Sets and Cardinality

Interlude: Infinite Sets and Cardinality

Before we can begin to extend our theory to sets with infinitely many elements, we need to take a closer look at some of the properties of infinite sets. We begin by addressing the following question: what do we mean when we say that two sets, A and B, have the same number of elements?

This is easy when A and B are finite. For example, if

A = {apples, oranges, pears},  B = {87, J, c},

then since A and B each contain three elements, it is clear that they both have the same number of elements. In other words, we count the number of elements in each set and check whether these numbers are equal.

12 Infinite Sets and Cardinality

To extend this concept further, we need to take a closer look at counting. When we count the number of elements in a set X and decide that this number is n, what we are doing is creating a function Φ from the set {1, 2, ..., n} into the set X that is both one-to-one and onto:

- Φ is one-to-one if no two distinct elements are assigned the same value by Φ, i.e., if i ≠ j, then Φ(i) ≠ Φ(j);
- Φ is onto if every element in the range is the image of an element in the domain, i.e., for every x ∈ X, there is an element i such that Φ(i) = x.

A function Φ that is both one-to-one and onto is said to be bijective. In general, there may be many bijections Φ between {1, 2, ..., n} and X, but one way to select such a function is to label the elements of X = {x_1, ..., x_n} and then define Φ(i) = x_i for i = 1, ..., n.

13 Infinite Sets and Cardinality

This way of thinking about counting can also be applied to pairs of sets whose sizes are being compared. Specifically, if X and Y both have n elements, then there are bijective functions Φ^(X) and Φ^(Y) from {1, ..., n} onto X and Y, respectively. However, this means that Ψ = Φ^(Y) ∘ (Φ^(X))^{-1} is a bijective function from X onto Y. In fact, the converse is also true: if X and Y are finite and there is a bijection between X and Y, then they have the same number of elements.

For example, a bijection can be constructed between the sets A and B as follows:

apples ↔ 1 ↔ 87
oranges ↔ 2 ↔ J
pears ↔ 3 ↔ c

which gives us the mapping Ψ : A → B with Ψ(apples) = 87, Ψ(oranges) = J, and Ψ(pears) = c.

14 Infinite Sets and Cardinality

These observations lead us to the following definition.

Definition. We say that two sets, X and Y, have the same cardinality, written |X| = |Y|, if there exists a bijective function Φ between X and Y. In contrast, we say that the cardinality of X is less than the cardinality of Y, written |X| < |Y|, if X and Y do not have the same cardinality and there is a subset D ⊆ Y such that X and D have the same cardinality.

Remarks:

- Interpretation: Cardinality provides us with a way to compare the sizes of different sets. Sets that have the same cardinality have, in some sense, the same number of elements, even if that number is infinite.
- Cardinality is not the only way to measure the size of an infinite set, but it is one of the most basic notions insofar as it does not require additional structure on the set. Other more specialized notions of size include Lebesgue measure, Hausdorff dimension, and capacity.

15 Infinite Sets and Cardinality

Given any two finite sets A and B, we can show that either both sets have the same number of elements or one of the two sets has fewer elements than the other. This is a consequence of the fact that the counting numbers are well-ordered: given positive integers n and m, either n = m or n < m or n > m. However, things are not so straightforward when we turn to infinite sets. In fact, the following two statements are equivalent in the sense that each one implies the other:

- Law of Trichotomy: Given any two sets X and Y, either |X| = |Y| or |X| < |Y| or |X| > |Y|.
- Axiom of Choice: Given any collection of distinct, non-empty sets S_α, α ∈ A, there exists a set C which contains exactly one element from each of the sets S_α.

Although the axiom of choice is accepted by many mathematicians as one of the fundamental axioms of set theory, it leads to a number of odd results such as the Banach-Tarski paradox, which asserts that it is possible to dissect a three-dimensional ball into a finite number of pieces which can then be reassembled into two disjoint balls, each having the same volume as the original.

16 Infinite Sets and Cardinality

One of the stranger properties of cardinality is that two sets X and Y can have the same cardinality even when X is a proper subset of Y.

Example: The positive integers Z+ are a proper subset of the natural numbers N, but the mapping Φ(n) = n + 1 is a bijection from N onto Z+, and so both sets have the same cardinality.

Example: The even natural numbers 2N are a proper subset of the natural numbers N, but the mapping Φ(n) = 2n is a bijection from N onto 2N, and so these sets also have the same cardinality.

Hilbert's Paradox of the Grand Hotel: Suppose that a hotel contains an infinite number of rooms, numbered 1, 2, 3, ..., and that all of the rooms are occupied. A new guest arrives and seeks accommodation. To make a place for them, the hotel moves each guest from their current room to the room with the next higher number, e.g., the person in room 1 moves to room 2, the person in room 2 moves to room 3, and so forth. The new guest is then moved into room 1. In this way, the hotel is able to accommodate new arrivals even when there are no vacancies.

17 Infinite Sets and Cardinality

Sets that have the same cardinality as the natural numbers or one of its subsets play an especially important role in probability theory.

Definition. A set X is said to be countable if X is either finite or has the same cardinality as the natural numbers N. In the latter case, we say that X is countably infinite. If X is neither finite nor countably infinite, then X is said to be uncountable.

The following are examples of countable sets:

- The natural numbers N = {0, 1, 2, ...};
- The positive integers Z+ = {1, 2, ...};
- The integers Z = {0, ±1, ±2, ...};
- The rational numbers Q = {p/q : p, q ∈ Z, q ≠ 0};
- Any countable union of countable sets, i.e., if A_i is countable for every i ≥ 1, then the union A = ∪_i A_i is also countable.

18 Infinite Sets and Cardinality

As the following theorem shows, not all infinite sets are countably infinite.

Theorem. The set [0, 1] is uncountable.

Proof: We prove this by contradiction, using a clever method invented by Georg Cantor that has come to be known as the Cantor diagonalization argument. If [0, 1] is countable, then there is a bijection Φ between Z+ and [0, 1]. Each of the numbers Φ(n) ∈ [0, 1] has a decimal expansion which can be written as

Φ(n) = 0.c_{n1} c_{n2} c_{n3} ...

Let x ∈ [0, 1] be the number with decimal expansion x = 0.x_1 x_2 x_3 ..., where x_n = 2 whenever c_{nn} ≠ 2 and x_n = 1 whenever c_{nn} = 2. I claim that there is no integer n > 0 such that Φ(n) = x.

19 Infinite Sets and Cardinality

Indeed, if there is an n > 0 such that Φ(n) = x, then we can write the decimal expansion of x in two ways:

x = 0.x_1 x_2 x_3 ... = 0.c_{n1} c_{n2} c_{n3} ...

However, since decimal expansions that do not end in a repeating series of all 0's or all 9's are unique, it must be the case that x_i = c_{ni} for all i ≥ 1. In particular, x_n = c_{nn}, which is a contradiction since we chose each x_i so that x_i ≠ c_{ii}. This shows that no bijection can exist between Z+ and [0, 1], which in turn implies that [0, 1] is uncountably infinite.

Remarks:

- Since Z+ has the same cardinality as the set D = {n^{-1} : n ≥ 1} ⊆ [0, 1], it follows that the cardinality of [0, 1] is strictly larger than that of Z+. In other words, some infinite sets are bigger than others.
- It can be shown that any interval [a, b] or (a, b) with a < b is uncountably infinite. In particular, the real numbers R are uncountable, as are all of the Euclidean spaces R^n.
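The diagonal rule used in the proof is mechanical enough to illustrate with code. The Python sketch below uses a finite list of digit rows standing in for a purported enumeration (an illustration only, since a real enumeration would be infinite) and constructs digits of a number that differs from the n-th listed expansion in its n-th digit:

```python
def diagonal_digits(rows):
    """Return digits x_1, x_2, ... with x_n = 2 if c_nn != 2 else 1,
    so that x differs from the n-th listed expansion in digit n."""
    return [2 if rows[n][n] != 2 else 1 for n in range(len(rows))]

rows = [
    [1, 4, 1, 5, 9],  # 0.14159...
    [5, 2, 0, 0, 0],  # 0.52000...
    [3, 3, 3, 3, 3],  # 0.33333...
]
print(diagonal_digits(rows))  # [2, 1, 2] -> x = 0.212... differs from every row
```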

20 Countable Additivity

To avoid the kinds of difficulties exposed by the examples given at the beginning of these slides, we will require that probability distributions on infinite sets satisfy the following additional condition.

Definition. Let S be a set and let P(S) be the collection of all subsets of S, i.e., the power set of S. A function µ : P(S) → R is said to be countably additive if for any countably infinite collection of disjoint sets A_1, A_2, ... in P(S) we have

µ(∪_{i=1}^∞ A_i) = ∑_{i=1}^∞ µ(A_i).

Notice that if µ is countably additive and µ(∅) is finite, then µ(∅) = 0. Indeed, if we take A_i = ∅ for all i ≥ 1, then

µ(∅) = ∑_{i=1}^∞ µ(∅),

which is possible for a finite value only if µ(∅) = 0.

21 Countable Additivity

Theorem. Let S be a countably infinite set and suppose that P : P(S) → [0, 1] is a countably additive function with P(S) = 1. Then P is coherent.

Proof: To show that P is coherent, we need only show that it is finitely additive. Suppose that A_1, ..., A_n is a finite collection of disjoint subsets of S, and define A_{n+k} = ∅ for every k ≥ 1. Then A_1, A_2, ... extended in this fashion is a countably infinite sequence of disjoint subsets of S, and by countable additivity we know that

P(∪_{i=1}^n A_i) = P(∪_{i=1}^∞ A_i) = ∑_{i=1}^∞ P(A_i) = ∑_{i=1}^n P(A_i),

which shows that P is finitely additive.

22 Countable Additivity

In view of the previous theorem, we will adopt the following definition for probability distributions on countably infinite sets.

Definition. Let S = {s_1, s_2, ...} be a countably infinite set. A probability distribution on S is a function P : P(S) → [0, 1] which satisfies the following conditions:

1. P(A) ≥ 0 for every subset A ⊆ S;
2. P(S) = 1;
3. P is countably additive, i.e., if A_1, A_2, ... is a countable sequence of disjoint subsets of S, then

P(∪_{i=1}^∞ A_i) = ∑_{i=1}^∞ P(A_i).

23 Countable Additivity

Because every subset of a countably infinite set is either finite or countably infinite, every probability distribution on a countably infinite set is uniquely determined by the probabilities that it assigns to the individual members of the set.

Theorem. Suppose that P is a probability distribution on a countably infinite set S = {s_1, s_2, ...}. Then, for any subset A ⊆ S, we have

P(A) = ∑_{s∈A} P({s}).

Proof: The result follows from the countable additivity of P and the fact that A can be expressed as a countable disjoint union of the singleton sets containing the elements of A:

A = ∪_{s∈A} {s}.

24 Countable Additivity

In particular, this identity leads to an easy method for constructing probability distributions on countably infinite sets.

Theorem. Let S = {s_1, s_2, ...} be a countably infinite set and suppose that p_1, p_2, ... is a sequence of non-negative numbers that sums to 1:

∑_{i=1}^∞ p_i = 1.

If P : P(S) → [0, 1] is defined by

P(A) = ∑_{s_i ∈ A} p_i,

then P is a probability distribution on S.
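This construction is easy to mirror in code, at least for finite subsets. The sketch below (a minimal illustration; the function names are made up, and the coin-tossing pmf is truncated at 20 terms, so the tiny tail mass 2^{-20} is ignored) builds P(A) = ∑_{s∈A} p_s from a pmf given as a dictionary:

```python
from fractions import Fraction

def make_distribution(pmf):
    """Return P with P(A) = sum of pmf over A, for a pmf given as a dict
    {outcome: probability} whose values sum to (approximately) 1."""
    def P(A):
        return sum(pmf.get(s, Fraction(0)) for s in A)
    return P

# Truncated pmf of the coin-tossing variable N: p_n = (1/2)^(n+1).
pmf = {n: Fraction(1, 2 ** (n + 1)) for n in range(20)}
P = make_distribution(pmf)
print(P({0, 2, 4, 6}))             # 85/128
print(float(P(range(0, 20, 2))))   # ≈ 2/3, the probability that N is even
```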

25 Countable Additivity

Proof: It is clear from the definition that P(A) ≥ 0 for every subset A ⊆ S, and also that

P(S) = ∑_{s_i ∈ S} p_i = ∑_{i=1}^∞ p_i = 1.

Furthermore, if A_1, A_2, ... is a countably infinite sequence of disjoint subsets of S, then

P(∪_{k=1}^∞ A_k) = ∑_{s_i ∈ ∪_k A_k} p_i = ∑_{k=1}^∞ ∑_{s_i ∈ A_k} p_i = ∑_{k=1}^∞ P(A_k),

which shows that P is countably additive.

26 Discrete Random Variables

Having defined probability distributions on countably infinite spaces, we can also extend our definition of a random variable to include variables which can take on countably infinitely many possible values. The following definition is key.

Definition. A discrete random variable is a random quantity X which takes values in a countable set S = {x_1, x_2, ...}. (Here S can be finite or countably infinite.) In this case, the distribution of X is the probability distribution on S defined by

P(A) = P(X ∈ A) for any subset A ⊆ S.

Furthermore, the probability mass function of X is the function p : S → [0, 1] defined by

p(x) = P(X = x) for any x ∈ S.

27 Discrete Random Variables

Example: For each n ≥ 0, let p_n = 2^{-(n+1)}. Since

∑_{n=0}^∞ p_n = ∑_{n=0}^∞ 2^{-(n+1)} = 1,

we can define a probability distribution on the natural numbers N = {0, 1, 2, ...} by setting

P(A) = ∑_{n∈A} p_n.

With this machinery in place, we can also formally define a random variable N which is equal to the number of tails obtained before the first heads when a fair coin is tossed repeatedly. In this case, the probability mass function of N is the function p : N → [0, 1] defined by p(n) = 2^{-(n+1)}.

28 Discrete Random Variables

Example: There is no uniform distribution on the natural numbers. Indeed, a distribution on a set S is said to be uniform if every element of S has the same probability. Thus, if P were uniform on N, then there would exist a non-negative number c ≥ 0 such that for every n ≥ 0,

c = p_n = P({n}).

However, since N is equal to the countably infinite disjoint union of the singleton sets {n} and we know that P(N) = 1 for any probability distribution on N, the countable additivity of P implies that

1 = P(N) = ∑_{n=0}^∞ p_n = ∑_{n=0}^∞ c,

and the right-hand side is either 0 if c = 0 or ∞ if c > 0. In either case we have a contradiction, and so P cannot be uniform on N.

29 Discrete Random Variables

Theorem. Suppose that X is a discrete random variable with values in the countable set S and let p : S → [0, 1] be the probability mass function of X. Then

∑_{x∈S} p(x) = 1

and

P(X ∈ A) = ∑_{x∈A} p(x)

for any subset A ⊆ S.

Exercise: Prove this theorem.

30 Discrete Random Variables

Previously we defined the expected value of a random variable X with finitely many possible values S = {x_1, ..., x_n} to be the weighted sum of these values:

E[X] = ∑_{i=1}^n P(X = x_i) x_i.

Although we would like to be able to extend this definition to random variables with countably infinitely many possible values, the following example shows that this is not entirely straightforward.

Example: Let X be a random variable with values in the integers Z = {0, ±1, ±2, ...} and the following probability mass function p : Z → [0, 1]:

p(n) = P(X = n) = 0 if n = 0, and 1/(C n^2) if n ≠ 0.

31 Discrete Random Variables

The constant C included in the definition of the probability mass function of X is said to be a normalizing constant and must be chosen so that the probabilities sum to 1:

1 = ∑_{n∈Z} p(n) = (2/C) ∑_{n=1}^∞ 1/n^2 = (2/C)(π^2/6) = π^2/(3C).

Thus C = π^2/3, and so X is a properly defined discrete random variable. Now suppose that we define the expectation of X to be

E[X] = ∑_{n∈Z} P(X = n) n = (1/C) ∑_{n≠0} n/n^2 = (1/C) ∑_{n≠0} 1/n.

32 Discrete Random Variables

Unfortunately, the last expression is ambiguous, since its value depends on the order in which the terms are included in the sum. For example, if we first add the positive terms and then the negative terms, then we obtain the difference

∑_{n≠0} 1/n ?= ∑_{n≥1} 1/n − ∑_{n≥1} 1/n = ∞ − ∞,

which is undefined. Alternatively, if we group the terms by absolute value and sum in order of increasing magnitude, then we obtain

∑_{n≥1} (1/n + 1/(−n)) = ∑_{n≥1} 0 = 0.

In fact, given any real number x ∈ R, it is possible to order the terms in this series so that the sum is equal to x. This shows that our previous definition of the expectation cannot be automatically extended to variables that take on infinitely many values, since the infinite series might not even exist or, if it does, might depend on the order in which we list the possible values.

33 Discrete Random Variables

Interlude: Infinite Series

We begin by recalling what it means for a sequence of real numbers to converge to a limit.

Definition. A sequence of real numbers (x_n : n ≥ 1) is said to converge to the limit x ∈ R, written x = lim_{n→∞} x_n, if for every ε > 0 there exists a positive integer N_ε such that for every n ≥ N_ε we have |x − x_n| < ε.

Example: The sequence (1/n : n ≥ 1) converges to the limit x = 0, since for any n ≥ N_ε = 1 + ⌈ε^{-1}⌉ we have

|0 − x_n| = |0 − 1/n| = 1/n ≤ 1/(1 + ⌈ε^{-1}⌉) < 1/ε^{-1} = ε.

34 Discrete Random Variables

Although any sequence of real numbers (x_n : n ≥ 1) can be assembled into a formal series ∑_{n=1}^∞ x_n, the previous example shows that this sum is not always uniquely defined. For this reason, we need to pick out a special class of series that can be summed.

Definition. An infinite series consisting of the terms (x_n : n ≥ 1) is said to be convergent if the sequence of partial sums s_n = x_1 + ... + x_n is convergent, i.e., if the limit

∑_{i=1}^∞ x_i = lim_{n→∞} ∑_{i=1}^n x_i

exists.

35 Discrete Random Variables

Our example also revealed that the value, or even the existence, of the limit of an infinite series may depend on the order of appearance of the terms in that sequence. This is unacceptable if we wish to use infinite series to define expectations of random variables with countably infinitely many values, since the order in which we list these values is completely arbitrary. Fortunately, there is a large class of infinite series that do not suffer from this ambiguity.

Definition. An infinite series consisting of the terms (x_n : n ≥ 1) is said to be absolutely convergent if the series with terms (|x_n| : n ≥ 1) is convergent, i.e., if the limit

∑_{n=1}^∞ |x_n| = lim_{n→∞} ∑_{i=1}^n |x_i|

exists. If the series formed from (x_n : n ≥ 1) is convergent but not absolutely convergent, then we say that it is conditionally convergent.

36 Discrete Random Variables

Example: Consider the alternating sequence with terms x_n = (−1)^{n+1}/n. The corresponding series is convergent, with limit

ln(2) = lim_{n→∞} ∑_{k=1}^n x_k,

but it is only conditionally convergent, since the limit

lim_{n→∞} ∑_{k=1}^n |x_k| = ∑_{k=1}^∞ 1/k = ∞

is infinite.

37 Discrete Random Variables

There is a profound difference between absolutely and conditionally convergent series that has a direct impact on our ability to define expectations. This is highlighted by the next two theorems.

Theorem. Suppose that (x_n : n ≥ 1) are the terms of an absolutely convergent series and let (y_n : n ≥ 1) be a rearrangement of these terms. Then (y_n : n ≥ 1) is also absolutely convergent and the limit of the series does not depend on the order in which we sum the terms:

∑_{n=1}^∞ x_n = ∑_{n=1}^∞ y_n.

In particular, if (a_n : n ≥ 1) is the sequence of non-negative values in (x_n : n ≥ 1) and (−b_n : n ≥ 1) is the sequence of negative values, both listed in order of appearance, then the series ∑ a_n and ∑ b_n are both absolutely convergent and

∑_{n=1}^∞ x_n = ∑_{n≥1} a_n − ∑_{n≥1} b_n.

38 Discrete Random Variables

The previous theorem states that absolutely convergent series are well-behaved in the sense that their limits do not depend on the order in which we sum their terms. The next theorem shows that absolute convergence is also a necessary condition for this to be true.

Theorem. Suppose that the series ∑ x_n is conditionally convergent and let y ∈ R be a real number. Then there is a rearrangement of the sequence (x_n : n ≥ 1), say (y_n : n ≥ 1), such that the series ∑ y_n converges to y, i.e.,

y = lim_{n→∞} ∑_{k=1}^n y_k.

In other words, a conditionally convergent series can be rearranged so that it converges to any limit that we like.
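The standard proof of this theorem is a greedy construction, and it can be watched in action numerically. The Python sketch below (an illustration with arbitrary targets, applied to the conditionally convergent alternating harmonic series) takes positive terms while the running sum is below the target and negative terms while it is above; the partial sums then approach any target we choose:

```python
import math

def rearranged_partial_sum(target, n_terms=10_000):
    """Greedy rearrangement of sum (-1)^(n+1)/n: take positive terms
    1, 1/3, 1/5, ... while below the target, negative terms
    -1/2, -1/4, ... while above. Partial sums converge to the target."""
    pos, neg = 1, 2  # next odd (positive) and even (negative) denominators
    total = 0.0
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / pos
            pos += 2
        else:
            total -= 1.0 / neg
            neg += 2
    return total

for y in (0.0, math.log(2), math.pi):
    print(f"target {y:.5f} -> partial sum {rearranged_partial_sum(y):.5f}")
```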

39 Discrete Random Variables

These last two theorems lead us to the following definition of the expectation of a random variable that takes on countably infinitely many values.

Definition. Suppose that X is a random variable with values in a countably infinite set S = {x_1, x_2, ...} ⊆ R and let p(x_i) = p_i be the probability mass function of X. Then the expectation of X is defined to be the quantity

E[X] = ∑_{k=1}^∞ p_k x_k = lim_{n→∞} ∑_{k=1}^n p_k x_k,

provided that the series ∑ p_k x_k is absolutely convergent, i.e., provided that

E[|X|] = ∑_{k=1}^∞ p_k |x_k| = lim_{n→∞} ∑_{k=1}^n p_k |x_k| < ∞.

If this condition is not satisfied, then we say that the expectation of X does not exist.

40 Discrete Random Variables

Example: Let N be the random variable with values in the natural numbers N = {0, 1, 2, ...} and probability mass function p(n) = 2^{-(n+1)}. Since N only takes on non-negative values, we need only check that the series ∑_{n≥0} p_n n is convergent. However, this is a consequence of the following calculation:

∑_{n=0}^∞ 2^{-(n+1)} n = ∑_{n=0}^∞ 2^{-(n+1)} ∑_{k=1}^n 1
                      = ∑_{k=1}^∞ ∑_{n=k}^∞ 2^{-(n+1)}
                      = ∑_{k=1}^∞ 2^{-k}
                      = 1.

Thus the expectation of N exists and is equal to E[N] = 1.
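For a quick numerical sanity check of this value, one can truncate the series; the one-liner below (a sketch, truncating at 200 terms, by which point the tail is negligible) reproduces E[N] = 1 up to floating-point error:

```python
# Truncated check that E[N] = sum_{n>=0} n * 2^-(n+1) equals 1.
print(sum(n * 2 ** -(n + 1) for n in range(200)))  # -> 0.999999...
```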

41 Discrete Random Variables

Most of the results that we proved about expectations of random variables taking at most finitely many values extend to expectations of random variables taking countably infinitely many values. Here I will prove one such result and state several others (see the text for proofs).

Theorem. Suppose that X is a random variable having an expectation. Let k and b be constants and define Y = kX + b. Then Y has an expectation and E[Y] = kE[X] + b.

Proof: Suppose that X takes values in the set S = {x_1, x_2, ...} and let p_i = P(X = x_i). Then

E[|Y|] = ∑_{i=1}^∞ p_i |k x_i + b|
       ≤ ∑_{i=1}^∞ p_i (|k| |x_i| + |b|)
       = |k| ∑_{i=1}^∞ p_i |x_i| + |b| ∑_{i=1}^∞ p_i
       = |k| E[|X|] + |b| < ∞.

42 Discrete Random Variables

This calculation shows that E[Y] exists. Furthermore, its value is

E[Y] = ∑_{i=1}^∞ p_i (k x_i + b)
     = lim_{n→∞} ∑_{i=1}^n p_i (k x_i + b)
     = lim_{n→∞} ( k ∑_{i=1}^n p_i x_i + b ∑_{i=1}^n p_i )
     = k lim_{n→∞} ∑_{i=1}^n p_i x_i + b lim_{n→∞} ∑_{i=1}^n p_i
     = k ∑_{i=1}^∞ p_i x_i + b ∑_{i=1}^∞ p_i
     = kE[X] + b.

43 Discrete Random Variables

The remaining properties are stated as theorems.

Theorem.
1. Suppose that X and Y are random variables and that the expectations E[X] and E[Y] both exist. Then the expectation of X + Y exists and is equal to E[X + Y] = E[X] + E[Y].
2. In general, suppose that the expectations of the random variables X_1, ..., X_n exist and let c_1, ..., c_n be constants. Then the expectation of the variable c_1 X_1 + ... + c_n X_n exists and is equal to

E[∑_{i=1}^n c_i X_i] = ∑_{i=1}^n c_i E[X_i].

In other words, expectations remain linear even when extended to random variables taking countably infinitely many values.

44 Discrete Random Variables

Theorem. Suppose that X is a random variable taking countably many values in the set S = {x_1, x_2, ...} and let g : S → R. Then Y = g(X) has expectation

E[Y] = ∑_{i=1}^∞ P(X = x_i) g(x_i),

provided that E[|Y|] < ∞.

Remark: The existence of E[X] is not enough to guarantee the existence of E[g(X)]. For example, if P(N = n) = 2^{-(n+1)}, then we know that E[N] = 1 exists. However, if g(n) = (−2)^n, then

E[|g(N)|] = ∑_{n=0}^∞ 2^{-(n+1)} |(−2)^n| = ∑_{n=0}^∞ 1/2 = ∞,

and so E[g(N)] does not exist.

45 Discrete Random Variables

Theorem. Let X and Y be random variables taking at most countably many values and suppose that E[X] and E[X | Y = y_i] exist for all possible values y_i of Y. Then the random variable E[X | Y] has an expectation and

E[X] = E[E[X | Y]].

This result is sometimes known as the law of iterated expectations.

Theorem. Let X and Y be independent random variables and suppose that the expectations of g(X) and h(Y) exist, where g and h are functions. Then g(X) and h(Y) are independent random variables and the expectation of g(X)h(Y) exists and is equal to

E[g(X)h(Y)] = E[g(X)] E[h(Y)].

46 Probability Generating Functions

Probability Generating Functions

Definition. Let X be a random variable with values in the natural numbers N = {0, 1, ...}. The probability generating function of X is the function ψ_X defined by

ψ_X(t) = E[t^X] = ∑_{n=0}^∞ P(X = n) t^n

for those values of t ∈ R such that the series on the right-hand side converges. The set of all such t is called the domain of ψ_X.

Remarks: The probability generating function is an alternative way of encoding information about the distribution of a random variable. Our aim is to learn about the distribution by studying the properties of the probability generating function.

47 Probability Generating Functions

Example: If X is a binomial random variable with parameters n and p, then the probability generating function of X is

ψ_X(t) = E[t^X] = ∑_{k=0}^n P(X = k) t^k
       = ∑_{k=0}^n C(n,k) p^k (1−p)^{n−k} t^k
       = ∑_{k=0}^n C(n,k) (pt)^k (1−p)^{n−k}
       = (pt + 1 − p)^n,

and the domain of ψ_X is the entire real line.
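The closed form is easy to spot-check numerically. The Python sketch below (illustrative only, with arbitrary parameter values) sums the series termwise and compares it with (pt + 1 − p)^n:

```python
from math import comb

def binomial_pgf_series(t, n, p):
    """psi_X(t) = sum_k P(X = k) t^k for X ~ Binomial(n, p), summed termwise."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * t**k for k in range(n + 1))

n, p, t = 10, 0.3, 0.7
print(binomial_pgf_series(t, n, p))  # termwise sum
print((p * t + 1 - p) ** n)          # closed form (pt + 1 - p)^n
```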

48 Probability Generating Functions

Because the probability generating function is defined by a power series expansion, its domain is at least as large as the radius of convergence of that series. Recall that the radius of convergence of a power series φ(x) = ∑ c_n x^n is the largest number ρ such that the series converges for all x with |x| < ρ:

ρ = sup{ r > 0 : ∑_{n=0}^∞ |c_n| x^n < ∞ whenever |x| ≤ r }.

There are many methods that can be used to determine the radius of convergence of a power series, but one of these is the so-called ratio test.

Theorem. Let φ(x) = ∑_n c_n x^n be a power series and suppose that the limit

ρ = lim_{n→∞} |c_n / c_{n+1}|

exists. Then ρ is the radius of convergence of this power series.

49 Probability Generating Functions

Example: Let X be a natural number-valued random variable with distribution

P(X = n) = 0 if n = 0, and (6/π²) n^{-2} if n ≥ 1,

and let ψ_X be the probability generating function of X:

ψ_X(t) = (6/π²) ∑_{n=1}^∞ t^n / n².

To apply the ratio test, we calculate the limit

ρ = lim_{n→∞} |c_n / c_{n+1}| = lim_{n→∞} (n+1)²/n² = 1,

which shows that the radius of convergence is 1. Since ψ_X(1) < ∞ and |ψ_X(−1)| < ∞, the domain of ψ_X is [−1, 1].

50 Probability Generating Functions

An important property of probability generating functions is that they uniquely determine the distribution of a random variable.

Theorem. Suppose that X and Y are natural number-valued random variables with identical probability generating functions, i.e.,

ψ_X(t) = E[t^X] = E[t^Y] = ψ_Y(t),

and both functions have the same domain. Then X and Y have the same distribution, i.e.,

P(X = n) = P(Y = n) for every n ≥ 0.

Remark: Notice that the theorem only asserts that X and Y have the same distribution, not that X = Y. In such cases we say that X and Y are identical in distribution and we write X =_d Y.

51 Probability Generating Functions

Proof: Because the radius of convergence of ψ_X and ψ_Y is greater than or equal to 1, we can perform the following differentiations:

(d^n/dt^n) ψ_X(t) |_{t=0} = (d^n/dt^n) ( ∑_{k=0}^∞ P(X = k) t^k ) |_{t=0}
                         = ∑_{k=0}^∞ P(X = k) (d^n/dt^n) t^k |_{t=0}
                         = ∑_{k=n}^∞ P(X = k) (k!/(k−n)!) t^{k−n} |_{t=0}
                         = n! P(X = n).

Similarly, (d^n/dt^n) ψ_Y(t) |_{t=0} = n! P(Y = n), but since ψ_X = ψ_Y, the two functions have identical derivatives of all orders, and so P(X = n) = P(Y = n).
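The identity P(X = n) = ψ_X^{(n)}(0)/n! can be checked symbolically. The sketch below assumes the sympy library is available and uses arbitrary parameter values; it recovers the binomial pmf from the PGF (pt + 1 − p)^n by repeated differentiation at t = 0:

```python
import sympy as sp

t = sp.symbols('t')
n, p = 4, sp.Rational(1, 3)
psi = (p * t + 1 - p) ** n  # PGF of Binomial(4, 1/3)

# Recover P(X = k) = psi^(k)(0) / k! for each k and compare with the pmf.
for k in range(n + 1):
    recovered = sp.diff(psi, t, k).subs(t, 0) / sp.factorial(k)
    exact = sp.binomial(n, k) * p**k * (1 - p)**(n - k)
    print(k, sp.simplify(recovered), sp.simplify(exact))
```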

52 Probability Generating Functions

Probability generating functions of sums of independent random variables are particularly well behaved.

Theorem. Suppose that X_1, ..., X_n are independent natural number-valued random variables with probability generating functions ψ_{X_1}, ..., ψ_{X_n}. Then the probability generating function of X = X_1 + ... + X_n is

ψ_X(t) = ∏_{i=1}^n ψ_{X_i}(t),

and the domain of ψ_X is the intersection of the domains of the functions ψ_{X_1}, ..., ψ_{X_n}.

Proof: Since X_1, ..., X_n are independent, so are the random variables t^{X_1}, ..., t^{X_n} for every value of t. Consequently,

ψ_X(t) = E[t^{X_1 + ... + X_n}] = E[∏_{i=1}^n t^{X_i}] = ∏_{i=1}^n E[t^{X_i}] = ∏_{i=1}^n ψ_{X_i}(t),

provided that t is contained in the domain of each of the functions ψ_{X_i}.

53 Probability Generating Functions

Example: Let X_1, ..., X_n be independent Bernoulli random variables, each with parameter p, and let X = X_1 + ... + X_n. Since X_1, ..., X_n all have the same probability generating function ψ_{X_i}(t) = 1 − p + pt, it follows that the probability generating function of X is

ψ_X(t) = ∏_{i=1}^n ψ_{X_i}(t) = (1 − p + pt)^n.

Since this is the same probability generating function that we found for a binomial random variable with parameters n and p, it follows that X is itself a binomial random variable with these parameters.

54 Probability Generating Functions

Probability generating functions can also be used to calculate the mean and the variance of a random variable.

Theorem. Suppose that X is a random variable with probability generating function ψ_X and assume that the radius of convergence of ψ_X is greater than 1. Then the mean and the variance of X are equal to

E[X] = ψ'_X(1),
Var(X) = ψ''_X(1) + ψ'_X(1) − (ψ'_X(1))².

55 Probability Generating Functions

Proof: Provided that the radius of convergence is greater than 1, we can differentiate inside the series:

ψ'_X(1) = ∑_{k=0}^∞ P(X = k) (d/dt) t^k |_{t=1}
        = ∑_{k=1}^∞ P(X = k) k t^{k−1} |_{t=1}
        = ∑_{k=0}^∞ P(X = k) k
        = E[X].

56 Probability Generating Functions

Similarly,

ψ''_X(1) = ∑_{k=0}^∞ P(X = k) (d²/dt²) t^k |_{t=1}
         = ∑_{k=2}^∞ P(X = k) k(k−1)
         = E[X²] − E[X],

which shows that

Var(X) = E[X²] − E[X]² = ψ''_X(1) + ψ'_X(1) − (ψ'_X(1))².

57 Probability Generating Functions

Example: If X is a binomial random variable with parameters n and p, then the probability generating function of X is ψ_X(t) = (1 − p + pt)^n, which has derivatives ψ'(1) = np and ψ''(1) = n(n−1)p². Consequently,

E[X] = np and Var(X) = ψ''(1) + ψ'(1) − (ψ'(1))² = n(n−1)p² + np − n²p² = np(1 − p).

These results agree with those that we previously calculated through more direct means.
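The same derivative bookkeeping can be delegated to a computer algebra system. The sketch below assumes sympy is available and fixes n = 10 for concreteness; it recovers np and np(1 − p) directly from the PGF:

```python
import sympy as sp

t, p = sp.symbols('t p')
n = 10
psi = (1 - p + p * t) ** n  # PGF of a Binomial(10, p) random variable

d1 = sp.diff(psi, t).subs(t, 1)     # psi'(1)
d2 = sp.diff(psi, t, 2).subs(t, 1)  # psi''(1)
print(sp.simplify(d1))               # 10*p, i.e. n*p
print(sp.factor(d2 + d1 - d1**2))    # -10*p*(p - 1), i.e. n*p*(1 - p)
```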

58 Geometric and related distributions

The Geometric Distribution

Suppose that a series of independent trials is performed and that each trial has probability p of failing. If X is the number of successes that occur before the first failure, then

P(X = n) = (1 − p)^n p for n ≥ 0.

This distribution is important enough to have its own name.

Definition. A random variable X with values in the natural numbers is said to have the geometric distribution with parameter p, written X ~ Geometric(p), if the probability mass function of X is

P(X = n) = (1 − p)^n p.

59 Geometric and related distributions

Exercise: Suppose that X ~ Geometric(p). Find the probability generating function ψ_X(t) = E[t^X] and use this to calculate the mean and the variance of X.

60 Geometric and related distributions

Exercise: Suppose that X ~ Geometric(p). Find the probability generating function ψ_X(t) = E[t^X] and use this to calculate the mean and the variance of X.

Solution: The probability generating function of X is

ψ_X(t) = ∑_{n=0}^∞ p(1−p)^n t^n = p / (1 − t(1−p)).

Since

ψ'(t) = p(1−p) / (1 − t(1−p))² and ψ''(t) = 2p(1−p)² / (1 − t(1−p))³,

it follows that ψ'(1) = (1−p)/p and ψ''(1) = 2(1−p)²/p². Therefore

E[X] = (1−p)/p and
Var(X) = 2(1−p)²/p² + (1−p)/p − ((1−p)/p)² = (1−p)/p².
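These moments can be checked against a simulation. The Python sketch below (illustrative only; p = 0.25 is an arbitrary choice, and the sampler name is made up) draws geometric samples by running trials until the first failure and compares the empirical mean and variance with (1−p)/p = 3 and (1−p)/p² = 12:

```python
import random
from statistics import mean, variance

def geometric_sample(p, rng=random):
    """Number of successes before the first failure, failure probability p."""
    n = 0
    while rng.random() >= p:  # rng.random() < p counts as a failure
        n += 1
    return n

p = 0.25
draws = [geometric_sample(p) for _ in range(200_000)]
print(mean(draws), (1 - p) / p)         # ≈ 3.0
print(variance(draws), (1 - p) / p**2)  # ≈ 12.0
```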

61 Geometric and related distributions

The geometric distribution can be used to model random lifespans when the probability of death or failure per unit time is constant. In fact, the geometric distribution is said to be memoryless, because the probability of survival over a period [t, t+s] depends only on the duration of the period and not on its starting time:

P(X > t + s | X > t) = P(X > t + s) / P(X > t) = (1−p)^{t+s} / (1−p)^t = (1−p)^s = P(X > s).

In other words, knowing that an individual has survived until time t makes it neither more nor less probable that they will survive for an additional s units of time.

62 Geometric and related distributions

Example: Suppose that we believe that the lifespan of an electronic component can be modeled by a geometric distribution with an unknown parameter p and that we measure the lifespans of m copies of the component in order to estimate p. Let X_1 = x_1, ..., X_m = x_m be the observed lifespans and assume that these are independent. The likelihood function for p given the data D = (x_1, ..., x_m) is

L(p | D) ≡ P_p(X_1 = x_1, ..., X_m = x_m) = ∏_{i=1}^m P_p(X_i = x_i) = ∏_{i=1}^m p(1−p)^{x_i} = p^m (1−p)^x,

where x = x_1 + ... + x_m. The notation P_p was used above to indicate that we are calculating the probability of the data under the assumption that the parameter of the geometric distribution is p.

63 Geometric and related distributions

The likelihood function tells us how the probability of the data varies with our choice of the parameter p. One way to select a point estimate of p is to choose the value of p that maximizes the probability of the data. This estimate is called the maximum likelihood estimate of p and can be found by maximizing the function L(p | D). To this end, we differentiate L(p | D) with respect to p and set the result equal to 0:

0 = (d/dp) L(p | D) = (d/dp) (p^m (1−p)^x)
  = m p^{m−1} (1−p)^x − x p^m (1−p)^{x−1}
  = p^m (1−p)^x ( m/p − x/(1−p) ).

Solving for p shows that the maximum likelihood estimate of p is

p̂_ML = m / (m + x).
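The closed-form estimate can be verified against a brute-force maximization of the log-likelihood. The Python sketch below simulates lifespans under an assumed true value p = 0.2 (an arbitrary choice, as are the seed and sample size) and compares m/(m + x) with a crude grid search:

```python
import math, random

def geometric_sample(p, rng=random):
    """Number of successes before the first failure, failure probability p."""
    n = 0
    while rng.random() >= p:
        n += 1
    return n

random.seed(1)
data = [geometric_sample(0.2) for _ in range(50)]
m, x = len(data), sum(data)

def log_likelihood(p):
    # log L(p | D) = m log p + x log(1 - p)
    return m * math.log(p) + x * math.log(1 - p)

grid = [k / 10_000 for k in range(1, 10_000)]
p_grid = max(grid, key=log_likelihood)
print(m / (m + x), p_grid)  # the two estimates should agree to ~1e-4
```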

64 Geometric and related distributions

If we use the geometric distribution to model random lifespans, then we are implicitly assuming that a single failure is sufficient to cause death or system collapse. However, many systems are robust in the sense that multiple independent components must fail for death to result. To model the lifespan of such a system we will introduce a more general class of distributions.

Definition. A random variable X with values in the natural numbers is said to have the negative binomial distribution with parameters r ≥ 1 and p ∈ [0, 1], written X ~ NB(r, p), if the probability mass function of X is

P(X = n) = C(n+r−1, n) p^r (1−p)^n.

Remark: The negative binomial distribution with parameters r = 1 and p ∈ [0, 1] is just the geometric distribution with parameter p.

65 Geometric and related distributions

The negative binomial distribution arises in the following way. Suppose that a sequence of independent trials is performed and that each trial has probability p of failure. If X is the number of successes that occur before the r-th failure, then X ~ NB(r, p). To verify this claim, observe that the event {X = n} occurs if the first n + r − 1 trials result in r − 1 failures and n successes, and the (n+r)-th trial results in a failure. However, the probability of n successes in the first n + r − 1 trials is given by the binomial probability

P(n successes in the first n + r − 1 trials) = C(n+r−1, n) p^{r−1} (1−p)^n.

Furthermore, since the outcome of the (n+r)-th trial is independent of the first n + r − 1 trials, it follows that

P(X = n) = C(n+r−1, n) p^{r−1} (1−p)^n · p = C(n+r−1, n) p^r (1−p)^n,

which is the negative binomial distribution.

66 Geometric and related distributions

Suppose that X_1, ..., X_r are independent geometric random variables with parameter p and let X = X_1 + ... + X_r. Then X ~ NB(r, p). Indeed, if we interpret X_1 as the number of successes that occur before the first failure, X_2 as the number of successes that occur between the first and second failures, etc., then it is clear that X is the number of successes that occur until the cumulative number of failures is equal to r.

This observation makes it easy to calculate the probability generating function of X:

ψ_X(t) = E[t^X] = ∏_{i=1}^r ψ_{X_i}(t) = ( p / (1 − t(1−p)) )^r.

Furthermore, by differentiating ψ_X(t), we can calculate the mean and the variance of X, which are

E[X] = r(1−p)/p and Var(X) = r(1−p)/p².
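The sum-of-geometrics representation makes the NB(r, p) pmf easy to check by simulation. The Python sketch below (arbitrary parameter values; the sampler is the same made-up helper as before) adds r independent geometric draws and compares the empirical frequencies with C(n+r−1, n) p^r (1−p)^n:

```python
import random
from math import comb
from collections import Counter

def geometric_sample(p, rng=random):
    n = 0
    while rng.random() >= p:  # rng.random() < p counts as a failure
        n += 1
    return n

r, p, trials = 3, 0.4, 100_000
counts = Counter(sum(geometric_sample(p) for _ in range(r)) for _ in range(trials))
for n in range(6):
    exact = comb(n + r - 1, n) * p**r * (1 - p)**n
    print(n, counts[n] / trials, round(exact, 4))
```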

67 Poisson distribution

The Poisson Distribution

Suppose that we perform a large number of independent trials, say n ≥ 100, and that the probability of a success on any one trial is small, say p_n = λ/n ≪ 1. If we let X^(n) denote the total number of successes that occur in all n trials, then X^(n) ~ Binomial(n, p_n) and therefore

P(X^(n) = k) = C(n,k) p_n^k (1 − p_n)^{n−k}
             = [n!/((n−k)! k!)] (λ/n)^k (1 − λ/n)^{n−k}
             = [n!/(n^k (n−k)!)] (λ^k/k!) (1 − λ/n)^n (1 − λ/n)^{−k}
             = [n(n−1)(n−2)···(n−k+1)/n^k] (λ^k/k!) (1 − λ/n)^n (1 − λ/n)^{−k}.

68 Poisson distribution

Notice that three of the terms in the last line depend on n and that these converge to finite limits as n → ∞:

lim_{n→∞} n(n−1)(n−2)···(n−k+1)/n^k = 1,
lim_{n→∞} (1 − λ/n)^{−k} = 1,
lim_{n→∞} (1 − λ/n)^n = e^{−λ}.

Consequently, for every integer k ≥ 0, the probabilities P(X^(n) = k) converge to a limit as n → ∞, which is

lim_{n→∞} P(X^(n) = k) = e^{−λ} λ^k / k!.
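The speed of this convergence can be seen numerically. The sketch below (arbitrary choices λ = 2 and k = 3) prints the binomial probability P(X^(n) = k) for increasing n alongside the Poisson limit:

```python
from math import comb, exp, factorial

lam, k = 2.0, 3
for n in (10, 100, 1_000, 10_000):
    p = lam / n
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, round(binom, 6))
print("Poisson limit:", round(exp(-lam) * lam**k / factorial(k), 6))
```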

69 Poisson distribution

To verify the third limit on the preceding page, recall that the Taylor series for log(1 + x) is

log(1 + x) = ∑_{n=1}^∞ (−1)^{n+1} x^n / n,

which converges as long as x ∈ (−1, 1]. In particular, when |x| ≪ 1, we can write log(1 + x) = x + O(x²), where O(x²) stands for a remainder term that is bounded by a constant times x². Therefore,

lim_{n→∞} log (1 − λ/n)^n = lim_{n→∞} n log(1 − λ/n) = lim_{n→∞} n (−λ/n + O(n^{−2})) = −λ.

However, since e^x is continuous on (−∞, ∞), we can exponentiate both sides of this identity to obtain

lim_{n→∞} (1 − λ/n)^n = e^{−λ}.

70 Poisson distribution

Furthermore, the limiting values of the probabilities sum to 1 when k is allowed to range over all of the natural numbers:

∑_{k=0}^∞ e^{−λ} λ^k / k! = e^{−λ} ∑_{k=0}^∞ λ^k / k! = e^{−λ} e^{λ} = 1.

These observations motivate the following definition:

Definition. A random variable X is said to have the Poisson distribution with parameter λ ≥ 0 if X takes values in the non-negative integers with probability mass function

p_X(k) = P(X = k) = e^{−λ} λ^k / k!.

In this case we write X ~ Poisson(λ).

Remark: The Poisson distribution takes its name from that of the 19th-century French mathematician Siméon Denis Poisson (1781-1840).

71 Poisson distribution

The probability generating function of the Poisson distribution is

ψ_X(t) = E[t^X] = ∑_{k=0}^∞ p_X(k) t^k = e^{−λ} ∑_{k=0}^∞ (λt)^k / k! = e^{−λ} e^{λt} = e^{λ(t−1)}.

Differentiating twice with respect to t gives

ψ'_X(t) = λ e^{λ(t−1)}, ψ''_X(t) = λ² e^{λ(t−1)},

and we then find

E[X] = ψ'_X(1) = λ,
Var(X) = ψ''_X(1) + ψ'_X(1) − (ψ'_X(1))² = λ² + λ − λ² = λ.

Thus λ is equal to both the mean and the variance of the Poisson distribution.

72 Poisson distribution

It has long been recognized that Poisson distributions provide a surprisingly accurate model for the statistics of a large number of seemingly unrelated phenomena. Some examples include:

- the number of misprints per page of a book;
- the number of wrong telephone numbers dialed in a day;
- the number of customers entering a post office per day;
- the number of mutations that occur when a genome is replicated;
- the number of α-particles discharged per day from a 14C source;
- the number of major earthquakes per year;
- the number of Prussian soldiers killed per year by being kicked by a horse.

73 Poisson distribution

For example, even the number of vacancies per year on the US Supreme Court is reasonably well modeled by a Poisson distribution:

[Table: for each number of vacancies x, the observed and expected probabilities and the observed and expected numbers of years with x vacancies; the numerical entries were lost in transcription. Data from Cole (2010) compared with a Poisson distribution with λ = 0.5.]

74 Poisson distribution

Since these phenomena are generated by very different physical and biological processes, the fact that they share similar statistical properties cannot be explained by the specific mechanisms that operate in each instance. Instead, the widespread emergence of the Poisson distribution appears to be a consequence of the following more general mathematical result, which is commonly known as the Law of Rare Events.

Theorem. For each n ≥ 1, let X_1^(n), ..., X_n^(n) be a collection of independent Bernoulli random variables, each with success probability p_n = λ/n, and let X^(n) = X_1^(n) + ... + X_n^(n) be the total number of successes in these n trials. Then

lim_{n→∞} P(X^(n) = k) = e^{−λ} λ^k / k!.

Interpretation: When n is large, the probability of success p_n = λ/n is small, and so each success is a rare event. However, since there are many trials, there is a non-negligible probability of having at least one success, and the distribution of the total number of successes is approximately Poisson with parameter λ.

75 Poisson distribution

We previously proved the law of rare events by directly calculating the limits of the probabilities P(X^(n) = k) and showing that these coincide with the probabilities given by a Poisson distribution. However, we can also prove this result with the help of probability generating functions. The following theorem provides the essential tool.

Theorem. For each n ≥ 1, let X^(n) be a non-negative integer-valued random variable with probability generating function ψ_n(t), and suppose that these functions converge pointwise on the interval (−1, 1), i.e., the limit

ψ(t) = lim_{n→∞} ψ_n(t)

exists for all t ∈ (−1, 1). Then ψ(t) is the probability generating function of a random variable X with values in the natural numbers and

lim_{n→∞} P(X^(n) = k) = P(X = k) for every integer k ≥ 0.

76 Poisson distribution

Proof of the Law of Rare Events: Since X^(n) ~ Binomial(n, p_n), we know that the probability generating function of X^(n) is

ψ_n(t) = E[t^{X^(n)}] = (1 − λ/n + λt/n)^n.

However, the pointwise limit of these functions as n tends to infinity is the function

ψ(t) = lim_{n→∞} (1 − λ/n + λt/n)^n = e^{λ(t−1)},

and convergence occurs over the entire real line, i.e., for all values of t. Since ψ(t) is the probability generating function of the Poisson distribution with parameter λ, it follows that the probabilities P(X^(n) = k) converge to those of this Poisson distribution.

77 Poisson distribution

Probability generating functions can also be used to prove the following theorem.

Theorem. Suppose that X_1, ..., X_n is a collection of independent Poisson-distributed random variables with parameters λ_1, ..., λ_n, respectively, and let X = X_1 + ... + X_n. Then X is Poisson-distributed with parameter λ_1 + ... + λ_n.

78 Poisson distribution

Theorem. Suppose that X_1, ..., X_n is a collection of independent Poisson-distributed random variables with parameters λ_1, ..., λ_n, respectively, and let X = X_1 + ... + X_n. Then X is Poisson-distributed with parameter λ_1 + ... + λ_n.

Proof: If ψ_i(t) = e^{λ_i(t−1)} is the probability generating function of X_i, then because the X_i are independent, we know that the probability generating function of X is

ψ_X(t) = ∏_{i=1}^n ψ_i(t) = ∏_{i=1}^n e^{λ_i(t−1)} = e^{(t−1) ∑_{i=1}^n λ_i}.

Since this is also the probability generating function of a Poisson-distributed random variable with parameter λ_1 + ... + λ_n, it follows that this is the distribution of X.
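This additivity is also easy to see in simulation. The Python sketch below (arbitrary parameter values; the sampler uses Knuth's standard product method, a choice of convenience rather than anything from the slides) adds independent Poisson(1.5) and Poisson(2.5) draws and compares the empirical frequencies with the Poisson(4) pmf:

```python
import random
from math import exp, factorial
from collections import Counter

def poisson_sample(lam, rng=random):
    """Knuth's product method for sampling Poisson(lam)."""
    L = exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= L:
            return k
        k += 1

lam1, lam2, trials = 1.5, 2.5, 100_000
counts = Counter(poisson_sample(lam1) + poisson_sample(lam2) for _ in range(trials))
lam = lam1 + lam2
for k in range(7):
    print(k, counts[k] / trials, round(exp(-lam) * lam**k / factorial(k), 4))
```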

79 Fluctuation Tests

Fluctuation tests and the origin of adaptive mutations

One of the classic experiments of molecular genetics is the fluctuation test, which was developed by Salvador Luria and Max Delbrück in 1943 to investigate the origins of adaptive mutations. An adaptive mutation is one that increases the fitness (e.g., survival, fecundity) of an individual that carries that mutation. At the time, the molecular processes underpinning heredity and mutation were unknown (e.g., the structure of DNA was only described in 1953). There were two prevailing hypotheses explaining the origins of adaptive mutations: the spontaneous mutation hypothesis and the induced mutation hypothesis.

- According to the spontaneous mutation hypothesis, adaptive mutations occur by chance, irrespective of the environmental conditions.
- According to the induced mutation hypothesis, adaptive mutations are directly induced by the environmental conditions in which they will be favored.

80 Fluctuation Tests

Luria and Delbrück developed an experimental system based on the bacterium Escherichia coli along with a virus (T1 bacteriophage) that infects it. When T1 phage is added to a culture of E. coli, most of the bacteria are killed, but a few resistant cells may survive and give rise to resistant colonies that can be seen on the surface of a petri dish. This shows that resistance to T1 phage is a trait that varies across E. coli bacteria and which is heritable, i.e., the descendants of resistant bacteria are usually themselves resistant.

[Images omitted; sources: Wikipedia and Madeleine Price Ball.]

81 Fluctuation Tests

The experiment carried out by Luria and Delbrück consisted of the following steps:

1. An E. coli culture was initiated from a single T1-susceptible cell and allowed to grow to a population containing millions of bacteria.
2. Several small samples were taken from this culture and spread on agar plates that had also been inoculated with the T1 phage. These plates were left for a period after which the number of resistant colonies on each plate was counted.
3. The procedures described in steps 1-2 were repeated several times, using independently established E. coli cultures, and the resulting data were used to estimate both the mean and the variance of the number of resistant colonies arising in each culture.
4. If R_ij is the number of resistant colonies observed on the j-th plate inoculated with bacteria from the i-th culture, then the mean and the variance of the number of resistant colonies can be estimated by

R̄_i = (1/5) ∑_{j=1}^5 R_ij and V_i = (1/4) ∑_{j=1}^5 (R_ij − R̄_i)².

82 Fluctuation Tests

Luria and Delbrück argued that the spontaneous and induced mutation hypotheses could be distinguished in the following manner.

- If mutations are induced, then these will only appear after the bacteria are exposed to the phage. In this case, the number of resistant colonies will be approximately Poisson distributed and the variance will be approximately equal to the mean.
- If mutations are spontaneous, then the number of resistant colonies depends on the timing of the mutation relative to the expansion of the culture. In this case, the variance will be much greater than the mean.

[Figure omitted; source: Wikipedia.]

The law of rare events explains why the number of resistant colonies is expected to be Poisson distributed under the induced mutation hypothesis: although there are a large number of cells that can independently mutate to the resistance phenotype, the probability of mutation was known to be low. A toy simulation contrasting the two hypotheses is sketched below.
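The following Python sketch is a deliberately simplified model with made-up parameter values (2^14 cells, mutation probability 2·10^{-4}, 14 generations of doubling); it ignores double mutations and cell death, and it is meant only to exhibit the variance-to-mean contrast, not to reproduce Luria and Delbrück's analysis:

```python
import random
from statistics import mean, variance

def resistant_induced(n_cells, mu, rng=random):
    """Induced hypothesis: each plated cell independently mutates on
    exposure to the phage with probability mu (approximately Poisson)."""
    return sum(rng.random() < mu for _ in range(n_cells))

def resistant_spontaneous(generations, mu, rng=random):
    """Spontaneous hypothesis (simplified): the culture doubles each
    generation; each division mutates with probability mu, and a mutant
    arising at generation t leaves 2^(generations - t - 1) resistant
    descendants in the final culture."""
    resistant = 0
    for t in range(generations):
        divisions = 2 ** t
        mutants = sum(rng.random() < mu for _ in range(divisions))
        resistant += mutants * 2 ** (generations - t - 1)
    return resistant

random.seed(2)
induced = [resistant_induced(2 ** 14, 2e-4) for _ in range(200)]
spont = [resistant_spontaneous(14, 2e-4) for _ in range(200)]
print("induced:     mean", mean(induced), "variance", round(variance(induced), 1))
print("spontaneous: mean", mean(spont), "variance", round(variance(spont), 1))
```

Early-arising mutations produce occasional "jackpot" cultures under the spontaneous model, which is exactly what inflates the variance relative to the mean.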


More information

APM 421 Probability Theory Probability Notes. Jay Taylor Fall Jay Taylor (ASU) Fall / 62

APM 421 Probability Theory Probability Notes. Jay Taylor Fall Jay Taylor (ASU) Fall / 62 APM 421 Probability Theory Probability Notes Jay Taylor Fall 2013 Jay Taylor (ASU) Fall 2013 1 / 62 Motivation Scientific Determinism Scientific determinism holds that we can exactly predict how a system

More information

Lecture 4: Probability, Proof Techniques, Method of Induction Lecturer: Lale Özkahya

Lecture 4: Probability, Proof Techniques, Method of Induction Lecturer: Lale Özkahya BBM 205 Discrete Mathematics Hacettepe University http://web.cs.hacettepe.edu.tr/ bbm205 Lecture 4: Probability, Proof Techniques, Method of Induction Lecturer: Lale Özkahya Resources: Kenneth Rosen, Discrete

More information

Cardinality of Sets. P. Danziger

Cardinality of Sets. P. Danziger MTH 34-76 Cardinality of Sets P Danziger Cardinal vs Ordinal Numbers If we look closely at our notions of number we will see that in fact we have two different ways of conceiving of numbers The first is

More information

To Infinity and Beyond

To Infinity and Beyond University of Waterloo How do we count things? Suppose we have two bags filled with candy. In one bag we have blue candy and in the other bag we have red candy. How can we determine which bag has more

More information

1.1. MEASURES AND INTEGRALS

1.1. MEASURES AND INTEGRALS CHAPTER 1: MEASURE THEORY In this chapter we define the notion of measure µ on a space, construct integrals on this space, and establish their basic properties under limits. The measure µ(e) will be defined

More information

Topic 3: The Expectation of a Random Variable

Topic 3: The Expectation of a Random Variable Topic 3: The Expectation of a Random Variable Course 003, 2017 Page 0 Expectation of a discrete random variable Definition (Expectation of a discrete r.v.): The expected value (also called the expectation

More information

Measures and Measure Spaces

Measures and Measure Spaces Chapter 2 Measures and Measure Spaces In summarizing the flaws of the Riemann integral we can focus on two main points: 1) Many nice functions are not Riemann integrable. 2) The Riemann integral does not

More information

Chapter 5. Chapter 5 sections

Chapter 5. Chapter 5 sections 1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Probability and Measure

Probability and Measure Probability and Measure Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Convergence of Random Variables 1. Convergence Concepts 1.1. Convergence of Real

More information

2. AXIOMATIC PROBABILITY

2. AXIOMATIC PROBABILITY IA Probability Lent Term 2. AXIOMATIC PROBABILITY 2. The axioms The formulation for classical probability in which all outcomes or points in the sample space are equally likely is too restrictive to develop

More information

Construction of a general measure structure

Construction of a general measure structure Chapter 4 Construction of a general measure structure We turn to the development of general measure theory. The ingredients are a set describing the universe of points, a class of measurable subsets along

More information

Basic Probability. Introduction

Basic Probability. Introduction Basic Probability Introduction The world is an uncertain place. Making predictions about something as seemingly mundane as tomorrow s weather, for example, is actually quite a difficult task. Even with

More information

Solution. 1 Solutions of Homework 1. 2 Homework 2. Sangchul Lee. February 19, Problem 1.2

Solution. 1 Solutions of Homework 1. 2 Homework 2. Sangchul Lee. February 19, Problem 1.2 Solution Sangchul Lee February 19, 2018 1 Solutions of Homework 1 Problem 1.2 Let A and B be nonempty subsets of R + :: {x R : x > 0} which are bounded above. Let us define C = {xy : x A and y B} Show

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Fundamental Tools - Probability Theory II

Fundamental Tools - Probability Theory II Fundamental Tools - Probability Theory II MSc Financial Mathematics The University of Warwick September 29, 2015 MSc Financial Mathematics Fundamental Tools - Probability Theory II 1 / 22 Measurable random

More information

MAT1000 ASSIGNMENT 1. a k 3 k. x =

MAT1000 ASSIGNMENT 1. a k 3 k. x = MAT1000 ASSIGNMENT 1 VITALY KUZNETSOV Question 1 (Exercise 2 on page 37). Tne Cantor set C can also be described in terms of ternary expansions. (a) Every number in [0, 1] has a ternary expansion x = a

More information

p. 4-1 Random Variables

p. 4-1 Random Variables Random Variables A Motivating Example Experiment: Sample k students without replacement from the population of all n students (labeled as 1, 2,, n, respectively) in our class. = {all combinations} = {{i

More information

3 Continuous Random Variables

3 Continuous Random Variables Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random

More information

SETS AND FUNCTIONS JOSHUA BALLEW

SETS AND FUNCTIONS JOSHUA BALLEW SETS AND FUNCTIONS JOSHUA BALLEW 1. Sets As a review, we begin by considering a naive look at set theory. For our purposes, we define a set as a collection of objects. Except for certain sets like N, Z,

More information

Mathematics 220 Workshop Cardinality. Some harder problems on cardinality.

Mathematics 220 Workshop Cardinality. Some harder problems on cardinality. Some harder problems on cardinality. These are two series of problems with specific goals: the first goal is to prove that the cardinality of the set of irrational numbers is continuum, and the second

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Introduction to Proofs in Analysis updated December 5, 2016 By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION Purpose. These notes intend to introduce four main notions from

More information

Lectures on Elementary Probability. William G. Faris

Lectures on Elementary Probability. William G. Faris Lectures on Elementary Probability William G. Faris February 22, 2002 2 Contents 1 Combinatorics 5 1.1 Factorials and binomial coefficients................. 5 1.2 Sampling with replacement.....................

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14 Introduction One of the key properties of coin flips is independence: if you flip a fair coin ten times and get ten

More information

Lecture 3. Discrete Random Variables

Lecture 3. Discrete Random Variables Math 408 - Mathematical Statistics Lecture 3. Discrete Random Variables January 23, 2013 Konstantin Zuev (USC) Math 408, Lecture 3 January 23, 2013 1 / 14 Agenda Random Variable: Motivation and Definition

More information

1 INFO Sep 05

1 INFO Sep 05 Events A 1,...A n are said to be mutually independent if for all subsets S {1,..., n}, p( i S A i ) = p(a i ). (For example, flip a coin N times, then the events {A i = i th flip is heads} are mutually

More information

Measure and integration

Measure and integration Chapter 5 Measure and integration In calculus you have learned how to calculate the size of different kinds of sets: the length of a curve, the area of a region or a surface, the volume or mass of a solid.

More information

ADVANCED CALCULUS - MTH433 LECTURE 4 - FINITE AND INFINITE SETS

ADVANCED CALCULUS - MTH433 LECTURE 4 - FINITE AND INFINITE SETS ADVANCED CALCULUS - MTH433 LECTURE 4 - FINITE AND INFINITE SETS 1. Cardinal number of a set The cardinal number (or simply cardinal) of a set is a generalization of the concept of the number of elements

More information

In N we can do addition, but in order to do subtraction we need to extend N to the integers

In N we can do addition, but in order to do subtraction we need to extend N to the integers Chapter The Real Numbers.. Some Preliminaries Discussion: The Irrationality of 2. We begin with the natural numbers N = {, 2, 3, }. In N we can do addition, but in order to do subtraction we need to extend

More information

Tom Salisbury

Tom Salisbury MATH 2030 3.00MW Elementary Probability Course Notes Part V: Independence of Random Variables, Law of Large Numbers, Central Limit Theorem, Poisson distribution Geometric & Exponential distributions Tom

More information

Axiomatic set theory. Chapter Why axiomatic set theory?

Axiomatic set theory. Chapter Why axiomatic set theory? Chapter 1 Axiomatic set theory 1.1 Why axiomatic set theory? Essentially all mathematical theories deal with sets in one way or another. In most cases, however, the use of set theory is limited to its

More information

Problem Set 2: Solutions Math 201A: Fall 2016

Problem Set 2: Solutions Math 201A: Fall 2016 Problem Set 2: s Math 201A: Fall 2016 Problem 1. (a) Prove that a closed subset of a complete metric space is complete. (b) Prove that a closed subset of a compact metric space is compact. (c) Prove that

More information

Suppose that you have three coins. Coin A is fair, coin B shows heads with probability 0.6 and coin C shows heads with probability 0.8.

Suppose that you have three coins. Coin A is fair, coin B shows heads with probability 0.6 and coin C shows heads with probability 0.8. Suppose that you have three coins. Coin A is fair, coin B shows heads with probability 0.6 and coin C shows heads with probability 0.8. Coin A is flipped until a head appears, then coin B is flipped until

More information

Module 1. Probability

Module 1. Probability Module 1 Probability 1. Introduction In our daily life we come across many processes whose nature cannot be predicted in advance. Such processes are referred to as random processes. The only way to derive

More information

Continuous Probability Spaces

Continuous Probability Spaces Continuous Probability Spaces Ω is not countable. Outcomes can be any real number or part of an interval of R, e.g. heights, weights and lifetimes. Can not assign probabilities to each outcome and add

More information

CITS2211 Discrete Structures (2017) Cardinality and Countability

CITS2211 Discrete Structures (2017) Cardinality and Countability CITS2211 Discrete Structures (2017) Cardinality and Countability Highlights What is cardinality? Is it the same as size? Types of cardinality and infinite sets Reading Sections 45 and 81 84 of Mathematics

More information

Introduction to Probability

Introduction to Probability Introduction to Probability Salvatore Pace September 2, 208 Introduction In a frequentist interpretation of probability, a probability measure P (A) says that if I do something N times, I should see event

More information

CS 125 Section #10 (Un)decidability and Probability November 1, 2016

CS 125 Section #10 (Un)decidability and Probability November 1, 2016 CS 125 Section #10 (Un)decidability and Probability November 1, 2016 1 Countability Recall that a set S is countable (either finite or countably infinite) if and only if there exists a surjective mapping

More information

Lecture 6: The Pigeonhole Principle and Probability Spaces

Lecture 6: The Pigeonhole Principle and Probability Spaces Lecture 6: The Pigeonhole Principle and Probability Spaces Anup Rao January 17, 2018 We discuss the pigeonhole principle and probability spaces. Pigeonhole Principle The pigeonhole principle is an extremely

More information

Probability Review. Gonzalo Mateos

Probability Review. Gonzalo Mateos Probability Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ September 11, 2018 Introduction

More information

n if n is even. f (n)=

n if n is even. f (n)= 6 2. PROBABILITY 4. Countable and uncountable Definition 32. An set Ω is said to be finite if there is an n N and a bijection from Ω onto [n]. An infinite set Ω is said to be countable if there is a bijection

More information

Notes on ordinals and cardinals

Notes on ordinals and cardinals Notes on ordinals and cardinals Reed Solomon 1 Background Terminology We will use the following notation for the common number systems: N = {0, 1, 2,...} = the natural numbers Z = {..., 2, 1, 0, 1, 2,...}

More information

Math Bootcamp 2012 Miscellaneous

Math Bootcamp 2012 Miscellaneous Math Bootcamp 202 Miscellaneous Factorial, combination and permutation The factorial of a positive integer n denoted by n!, is the product of all positive integers less than or equal to n. Define 0! =.

More information

We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events.

We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events. 1 Probability 1.1 Probability spaces We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events. Definition 1.1.

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

POL502: Foundations. Kosuke Imai Department of Politics, Princeton University. October 10, 2005

POL502: Foundations. Kosuke Imai Department of Politics, Princeton University. October 10, 2005 POL502: Foundations Kosuke Imai Department of Politics, Princeton University October 10, 2005 Our first task is to develop the foundations that are necessary for the materials covered in this course. 1

More information

BINOMIAL DISTRIBUTION

BINOMIAL DISTRIBUTION BINOMIAL DISTRIBUTION The binomial distribution is a particular type of discrete pmf. It describes random variables which satisfy the following conditions: 1 You perform n identical experiments (called

More information

Countability. 1 Motivation. 2 Counting

Countability. 1 Motivation. 2 Counting Countability 1 Motivation In topology as well as other areas of mathematics, we deal with a lot of infinite sets. However, as we will gradually discover, some infinite sets are bigger than others. Countably

More information

Poisson approximations

Poisson approximations Chapter 9 Poisson approximations 9.1 Overview The Binn, p) can be thought of as the distribution of a sum of independent indicator random variables X 1 + + X n, with {X i = 1} denoting a head on the ith

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 27

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 27 CS 70 Discrete Mathematics for CS Spring 007 Luca Trevisan Lecture 7 Infinity and Countability Consider a function f that maps elements of a set A (called the domain of f ) to elements of set B (called

More information

Introduction to Real Analysis Alternative Chapter 1

Introduction to Real Analysis Alternative Chapter 1 Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces

More information

Relationship between probability set function and random variable - 2 -

Relationship between probability set function and random variable - 2 - 2.0 Random Variables A rat is selected at random from a cage and its sex is determined. The set of possible outcomes is female and male. Thus outcome space is S = {female, male} = {F, M}. If we let X be

More information

Discrete Distributions

Discrete Distributions Discrete Distributions STA 281 Fall 2011 1 Introduction Previously we defined a random variable to be an experiment with numerical outcomes. Often different random variables are related in that they have

More information

CHAPTER 8: EXPLORING R

CHAPTER 8: EXPLORING R CHAPTER 8: EXPLORING R LECTURE NOTES FOR MATH 378 (CSUSM, SPRING 2009). WAYNE AITKEN In the previous chapter we discussed the need for a complete ordered field. The field Q is not complete, so we constructed

More information

Probability. Lecture Notes. Adolfo J. Rumbos

Probability. Lecture Notes. Adolfo J. Rumbos Probability Lecture Notes Adolfo J. Rumbos October 20, 204 2 Contents Introduction 5. An example from statistical inference................ 5 2 Probability Spaces 9 2. Sample Spaces and σ fields.....................

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

In N we can do addition, but in order to do subtraction we need to extend N to the integers

In N we can do addition, but in order to do subtraction we need to extend N to the integers Chapter 1 The Real Numbers 1.1. Some Preliminaries Discussion: The Irrationality of 2. We begin with the natural numbers N = {1, 2, 3, }. In N we can do addition, but in order to do subtraction we need

More information

Lecture 2: Probability and Distributions

Lecture 2: Probability and Distributions Lecture 2: Probability and Distributions Ani Manichaikul amanicha@jhsph.edu 17 April 2007 1 / 65 Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions info

More information

MAT 271E Probability and Statistics

MAT 271E Probability and Statistics MAT 71E Probability and Statistics Spring 013 Instructor : Class Meets : Office Hours : Textbook : Supp. Text : İlker Bayram EEB 1103 ibayram@itu.edu.tr 13.30 1.30, Wednesday EEB 5303 10.00 1.00, Wednesday

More information

ECS 120 Lesson 18 Decidable Problems, the Halting Problem

ECS 120 Lesson 18 Decidable Problems, the Halting Problem ECS 120 Lesson 18 Decidable Problems, the Halting Problem Oliver Kreylos Friday, May 11th, 2001 In the last lecture, we had a look at a problem that we claimed was not solvable by an algorithm the problem

More information

MORE ON CONTINUOUS FUNCTIONS AND SETS

MORE ON CONTINUOUS FUNCTIONS AND SETS Chapter 6 MORE ON CONTINUOUS FUNCTIONS AND SETS This chapter can be considered enrichment material containing also several more advanced topics and may be skipped in its entirety. You can proceed directly

More information

A Readable Introduction to Real Mathematics

A Readable Introduction to Real Mathematics Solutions to selected problems in the book A Readable Introduction to Real Mathematics D. Rosenthal, D. Rosenthal, P. Rosenthal Chapter 10: Sizes of Infinite Sets 1. Show that the set of all polynomials

More information

Things to remember when learning probability distributions:

Things to remember when learning probability distributions: SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions

More information

Algorithms: Lecture 2

Algorithms: Lecture 2 1 Algorithms: Lecture 2 Basic Structures: Sets, Functions, Sequences, and Sums Jinwoo Kim jwkim@jjay.cuny.edu 2.1 Sets 2 1 2.1 Sets 3 2.1 Sets 4 2 2.1 Sets 5 2.1 Sets 6 3 2.1 Sets 7 2.2 Set Operations

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2) 14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

More information

RANDOM WALKS AND THE PROBABILITY OF RETURNING HOME

RANDOM WALKS AND THE PROBABILITY OF RETURNING HOME RANDOM WALKS AND THE PROBABILITY OF RETURNING HOME ELIZABETH G. OMBRELLARO Abstract. This paper is expository in nature. It intuitively explains, using a geometrical and measure theory perspective, why

More information

Slides 8: Statistical Models in Simulation

Slides 8: Statistical Models in Simulation Slides 8: Statistical Models in Simulation Purpose and Overview The world the model-builder sees is probabilistic rather than deterministic: Some statistical model might well describe the variations. An

More information

Notes Week 2 Chapter 3 Probability WEEK 2 page 1

Notes Week 2 Chapter 3 Probability WEEK 2 page 1 Notes Week 2 Chapter 3 Probability WEEK 2 page 1 The sample space of an experiment, sometimes denoted S or in probability theory, is the set that consists of all possible elementary outcomes of that experiment

More information

APM 541: Stochastic Modelling in Biology Probability Notes. Jay Taylor Fall Jay Taylor (ASU) APM 541 Fall / 77

APM 541: Stochastic Modelling in Biology Probability Notes. Jay Taylor Fall Jay Taylor (ASU) APM 541 Fall / 77 APM 541: Stochastic Modelling in Biology Probability Notes Jay Taylor Fall 2013 Jay Taylor (ASU) APM 541 Fall 2013 1 / 77 Outline Outline 1 Motivation 2 Probability and Uncertainty 3 Conditional Probability

More information

Lebesgue Measure on R n

Lebesgue Measure on R n CHAPTER 2 Lebesgue Measure on R n Our goal is to construct a notion of the volume, or Lebesgue measure, of rather general subsets of R n that reduces to the usual volume of elementary geometrical sets

More information

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Random variables. DS GA 1002 Probability and Statistics for Data Science. Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities

More information

CME 106: Review Probability theory

CME 106: Review Probability theory : Probability theory Sven Schmit April 3, 2015 1 Overview In the first half of the course, we covered topics from probability theory. The difference between statistics and probability theory is the following:

More information