MATH/STATS 425 : Introduction to Probability. Boaz Slomka


These notes are not proofread and may contain typos and errors. Last update: April 10, 2018.


LECTURE 1

Counting (1.2)

Basic multiplication principle: if there are $m$ possible outcomes in experiment 1 and $n$ possible outcomes in experiment 2, then there is a total of $m \cdot n$ possible outcomes (this works for more than two experiments).

Example 1.1. How many words with 5 letters are there? $26^5 \approx 12$ million. (Actually, probably only about 10,000 of them are real words.) In general: in a $k$-letter alphabet, there are $k^n$ $n$-letter words.

Example 1.2. We roll three different dice (6-sided); how many possible outcomes are there? $6^3 = 216$.

Example 1.3. One die, rolled three times (order of the results is not recorded, so e.g. 1-2-2 and 2-1-2 are considered the same outcome); how many possible outcomes now? A bit more difficult... (Answer: 56.)

Permutations (1.3)

Example 1.4. In how many ways can one arrange 8 people in a line? $8 \cdot 7 \cdot 6 \cdots 1 = 8! = 40{,}320$. Each arrangement is also called a permutation. In general: the number of permutations (ways to order) of $n$ different objects is $n!$.

Example 1.5. How many ways are there to arrange 4 couples in a line (each couple standing next to each other)? There are $4! = 24$ possible ways to order the different couples (as if each couple were one object), and $2^4 = 16$ ways to order each pair between themselves. Total: $24 \cdot 16 = 384$.

General permutations

Next, we would like to count arrangements of objects when some of the objects are indistinguishable:

Example 1.6. How many arrangements of 5 identical red balloons, 3 identical blue balloons, and 2 identical green balloons are there?

Suppose first that all the balloons are distinguishable, for example by imagining that the red balloons are labeled $R_1, \dots, R_5$, the blue balloons are labeled $B_1, B_2, B_3$, and the green balloons are labeled $G_1, G_2$. Then there are $10!$ arrangements of the 10 balloons. Now, consider any specific arrangement, e.g. RRRRRGGBBB. With our imaginary labels, among the $10!$ permutations we have actually counted many arrangements that correspond to this same initial arrangement of RRRRRGGBBB, including $R_1R_2R_3R_4R_5G_1G_2B_1B_2B_3$ and $R_2R_1R_4R_5R_3G_2G_1B_3B_2B_1$. The exact number of labeled arrangements corresponding to RRRRRGGBBB (or to any other initial arrangement of the balloons) is $5!\,3!\,2!$, because there are $5!$ permutations of the red balloons among themselves, $3!$ permutations of the blue balloons among themselves, and $2!$ permutations of the green balloons among themselves. Therefore, we divide by that number to get a total of
$$\frac{10!}{5!\,3!\,2!} = 2520$$
possible arrangements.

In general: the number of possible arrangements in a line of $n$ objects ($n_1$ of type 1, $n_2$ of type 2, ..., $n_r$ of type $r$, where objects of each type are identical and $n_1 + n_2 + \dots + n_r = n$) is
$$\frac{n!}{n_1!\,n_2! \cdots n_r!}.$$
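A minimal Python sketch can confirm this count, comparing the formula with brute-force enumeration of the distinct orderings of the multiset:

```python
import math
from itertools import permutations

# Formula: 10! / (5! 3! 2!)
formula = math.factorial(10) // (math.factorial(5) * math.factorial(3) * math.factorial(2))

# Brute force: number of distinct orderings of the multiset RRRRRBBBGG
brute = len(set(permutations("RRRRRBBBGG")))

print(formula, brute)  # both print 2520
```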

LECTURE 2

Binomial Coefficients (1.4)

Example 2.1. How many combinations of 6 different courses out of 10 can one choose (order is not important)?

Solution. There are $10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5$ possibilities to choose 6 courses with order. Since each selection is counted $6!$ times (the number of permutations of 6 elements), we divide by $6!$ to get
$$\frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5}{6!} = \frac{10!}{6!\,4!} = 210.$$
Note: this is the same as choosing which 4 courses not to take out of 10.

In general: the number of $k$-element subsets of an $n$-set (or ways to pick $k$ different objects out of $n$, where order is not important) is
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
("$n$ choose $k$"). Note that $\binom{n}{k} = \binom{n}{n-k}$. We also define $\binom{n}{0} = \binom{n}{n} = 1$ (recall $0! = 1$), and $\binom{n}{r} = 0$ if $r < 0$ or $r > n$.

Example 2.2. The Detroit Pistons have 19 players. How many teams (of 5 players) can they possibly form? $\binom{19}{5}$.

Exercise 2.3. In how many ways can one arrange 10 balloons and 5 ribbons in a line?

Solution. There are $10 + 5 = 15$ decorations, and we choose 5 spots for the ribbons. In total, there are $\binom{15}{5} = 3003$ ways.

Exercise 2.4. We have 10 balloons and 5 ribbons. In how many ways can one arrange them in a line so that no two ribbons are next to each other?

Solution. We put spaces between (and around) the balloons, so that each ribbon can take one space:

There are 11 spaces and 5 ribbons, hence a total of $\binom{11}{5} = 462$ ways.

Example 2.5. (see 1.6, Proposition 6.2) How many nonnegative integer solutions $a_1, \dots, a_r \ge 0$ are there to the equation $a_1 + \dots + a_r = n$?

Solution. Same idea as the previous example: count orderings of $n$ units of 1 and $r - 1$ imaginary separators (separating $r$ sets of 1's). For example, for $n = 6$, $r = 5$, the string $1\,1 \mid 1 \mid\ \mid 1\,1\,1$ corresponds to the solution $a_1 = 2$, $a_2 = 1$, $a_3 = 0$, $a_4 = 3$, $a_5 = 0$. Therefore, there are $\binom{n+r-1}{r-1}$ such arrangements, corresponding to $\binom{n+r-1}{r-1}$ solutions.

Theorem 2.6 (The binomial theorem).
$$(x+y)^n = \sum_{k=0}^n \binom{n}{k} x^k y^{n-k}.$$

Proof. A purely algebraic proof goes by induction. We will present a combinatorial proof: write $(x+y)^n = (x+y) \cdots (x+y)$ and expand. We now have a long sum (actually, of $2^n$ terms), where each term in the sum is a product of $n$ factors (consisting of $x$'s and $y$'s). For example:
$$(x+y)^2 = (x+y)(x+y) = xx + xy + yx + yy = x^2 + 2xy + y^2,$$
$$(x+y)^3 = (x+y)(x+y)(x+y) = xxx + xxy + \dots + yyy = x^3 + 3x^2y + 3xy^2 + y^3.$$
To obtain the final, shorter sum, we combine like terms, i.e. terms with the same number of $x$ factors and the same number of $y$ factors (such as $xy$ and $yx$ into $2xy$ for $n = 2$, or $xxy$, $xyx$, and $yxx$ into $3x^2y$ for $n = 3$). How many like terms are equivalent to $x^k y^{n-k}$? One visual way to count them is by assigning $n$ different colors to the $n$ pairs of parentheses $(x+y)$. Then, when expanded, the like terms equivalent to $x^k y^{n-k}$ are formed by multiplying $k$ $x$'s of different colors and $n-k$ $y$'s of the remaining colors. The number of such terms is exactly the number of subsets of $k$ colors (to choose for the $x$ factors) out of $n$ possible colors. This number is $\binom{n}{k}$, and therefore the term $x^k y^{n-k}$ has the coefficient $\binom{n}{k}$ in front of it.

Corollary 2.7. One has
$$2^n = \sum_{k=0}^n \binom{n}{k}.$$

Proof. Two proofs:

1. Follows from the binomial theorem, by setting $x = y = 1$.
2. Another interpretation: the RHS and the LHS both count the total number of subsets of a set with $n$ elements, only in two different ways. Indeed, LHS: each element of a subset has 2 possibilities, to be or not to be (in the subset), hence $2 \cdot 2 \cdots 2 = 2^n$ possibilities. RHS: this time we sum (over $k$) the number of $k$-subsets of an $n$-set, which is $\binom{n}{k}$.

Multinomial coefficients (1.5)

Example 2.8. You own 12 cars, of which 5 are identical Ferraris, 3 are identical Bentleys, 2 are identical Lamborghinis and 2 are identical Ford Focuses. In how many ways can you arrange your cars for an exhibition (in one line)?

We already saw that the number of ways to order $n$ items of $r$ types, where $n_1, n_2, \dots, n_r$ are the numbers of items of each type, is $\frac{n!}{n_1!\,n_2! \cdots n_r!}$. We use the notation $\binom{n}{n_1, n_2, \dots, n_r}$ for this expression (also known as the multinomial coefficient). In this example, there are $\binom{12}{5,3,2,2} = \frac{12!}{5!\,3!\,2!\,2!}$ possible arrangements.

Another solution for the above problem: there are $\binom{12}{5}$ possibilities to place the 5 Ferraris; then $\binom{7}{3}$ possibilities to place the 3 Bentleys; then $\binom{4}{2}$ possibilities to place the 2 Lamborghinis; and finally $\binom{2}{2}$ possibilities to place the 2 Fords. By the multiplication principle, there are $\binom{12}{5}\binom{7}{3}\binom{4}{2}\binom{2}{2}$ possible arrangements. It is not hard to check that, indeed, $\binom{12}{5}\binom{7}{3}\binom{4}{2}\binom{2}{2} = \frac{12!}{5!\,3!\,2!\,2!}$.

Theorem 2.9 (The multinomial theorem).
$$(x_1 + x_2 + \dots + x_r)^n = \sum_{k_1 + \dots + k_r = n} \binom{n}{k_1, \dots, k_r} x_1^{k_1} x_2^{k_2} \cdots x_r^{k_r},$$
where the sum is over all $r$-tuples of nonnegative integers $k_1, \dots, k_r$ such that $k_1 + \dots + k_r = n$.

Proof. Similar to the binomial theorem.
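A short Python check of the two counts in Example 2.8 (a minimal sketch using only the standard library):

```python
from math import comb, factorial

# Multinomial coefficient (12 choose 5,3,2,2), computed two ways
direct = factorial(12) // (factorial(5) * factorial(3) * factorial(2) * factorial(2))
stepwise = comb(12, 5) * comb(7, 3) * comb(4, 2) * comb(2, 2)
print(direct, stepwise)  # both print 166320
```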


LECTURE 3

Sample space and events (2.2)

How do we define probability formally? This is not so simple; it was only done rigorously in the 1930s. We consider experiments with unpredictable outcomes:

Example 3.1. (1) Flipping two coins. (2) Rolling a die. (3) The temperature tomorrow at noon. (4) The result of the Super Bowl.

The set of all possible outcomes of an experiment forms its sample space:

Definition. Sample space: $S = \{\text{all possible outcomes}\}$.

Example 3.2. (1) Flipping two coins (order is important): $S = \{HH, HT, TH, TT\}$. (2) Rolling a die: $S = \{1, 2, \dots, 6\}$. (3) The temperature tomorrow at noon: $S = [60, 90]$ °F.

We are interested in understanding the likelihood that certain events will occur. Question: what is an event?

Definition. Any subset $E$ of the sample space $S$ is called an event.

Example 3.3. (1) Flipping T exactly one time: $E = \{HT, TH\}$. (2) Rolling an odd number: $E = \{1, 3, 5\}$. (3) The temperature at noon tomorrow is at least 75 °F: $E = [75, 90]$ °F.

Remark 3.4. Note that each experiment has exactly one outcome, but many different events can occur simultaneously. For example, if the outcome of one die roll is 5, then the events {at least 4} and {an odd number} both occur.

Operations on events (and relations between events)

Let $E, F$ be events in a sample space $S$. Then:

$E \subseteq F$: if $E$ occurs, then $F$ occurs. Example: $E = \{1\} \subseteq F = \{\text{an odd number}\}$.

$E = F$ (same event): $E \subseteq F$ and $F \subseteq E$; $E$ and $F$ both occur, or both don't occur. Example: $E = \{1, 3, 5\}$, $F = \{\text{an odd number}\}$.

Intersection $E \cap F = EF$: all outcomes that are both in $E$ and in $F$; $E \cap F$ occurs iff both $E$ and $F$ occur at the same time. Example: $E = \{\text{an even number}\}$, $F = \{\text{an odd number}\}$: $E \cap F = \emptyset$, never happens.

Union $E \cup F$: all outcomes that are either in $E$ or in $F$ (or both); $E \cup F$ occurs iff $E$ or $F$ occurs (or both). Example: $E = \{\text{an even number}\}$, $F = \{\text{an odd number}\}$: $E \cup F = \{1, 2, \dots, 6\} = S$, always happens.

Complement $E^c$: all outcomes not in $E$; $E^c$ occurs iff $E$ doesn't. Example: $E = \{\text{an even number}\}$, $F = \{\text{an odd number}\}$: $E = F^c$, $E^c = F$.

Difference $E \setminus F = E \cap F^c$: all outcomes in $E$ and not in $F$; $E \setminus F$ occurs iff $E$ occurs and $F$ doesn't. Example: $E = \{1, 3, 4, 6\}$, $F = \{\text{an even number}\}$: $E \setminus F = \{1, 3\}$.

(Each of these operations can be visualized with a Venn diagram.)

Basic Properties. Let $E, F, G$ be events. Then
(1) Commutative laws: $E \cup F = F \cup E$ and $E \cap F = F \cap E$.
(2) Associative laws: $(E \cup F) \cup G = E \cup (F \cup G)$ and $(E \cap F) \cap G = E \cap (F \cap G)$. In particular, this means that $E \cup F \cup G$ (respectively $E \cap F \cap G$) is well defined, since the order in which we take the operations does not matter.
(3) Distributive laws: $(E \cup F) \cap G = (E \cap G) \cup (F \cap G)$ and $(E \cap F) \cup G = (E \cup G) \cap (F \cup G)$.

Definition (Operations on multiple events). Let $E_1, E_2, \dots, E_n$ be events (possibly infinitely many).

Intersection: $\bigcap_{i=1}^n E_i = E_1 \cap E_2 \cap \dots \cap E_n$. Meaning: $\bigcap_{i=1}^n E_i$ occurs if and only if all of the events $E_i$ occur.

Union: $\bigcup_{i=1}^n E_i = E_1 \cup E_2 \cup \dots \cup E_n$. Meaning: $\bigcup_{i=1}^n E_i$ occurs if and only if at least one of the events $E_i$ occurs.

Theorem 3.5 (De Morgan's laws). Let $E, F$ and $E_1, E_2, \dots$ be events. Then
(a) $(E \cap F)^c = E^c \cup F^c$ and (b) $(E \cup F)^c = E^c \cap F^c$;
more generally, (a) $\left(\bigcap_{i=1}^n E_i\right)^c = \bigcup_{i=1}^n E_i^c$ and (b) $\left(\bigcup_{i=1}^n E_i\right)^c = \bigcap_{i=1}^n E_i^c$.

Proof. We prove only (a) (the other cases are similar), by drawing the Venn diagrams of the RHS and the LHS and showing that we end up with the same subset. [Venn diagrams omitted.]

Formal proof (optional): $x \in (E \cap F)^c$ if and only if $x \notin E \cap F$. Moreover, $x \notin E \cap F$ if and only if $x \notin E$ or $x \notin F$. Equivalently, $x \in E^c$ or $x \in F^c$ (or both), which means that $x \in E^c \cup F^c$.

Example 3.6. $S = \{1, 2, \dots, 6\}$, $E = \{\text{an even number}\}$, $F = \{\text{at least 4}\}$; then
$$E^c = \{\text{an odd number}\}, \quad F^c = \{\text{at most 3}\},$$

and
$$E \cap F = \{\text{even and at least 4}\} = \{4, 6\},$$
$$E \cup F = \{\text{even or at least 4}\} = \{2, 4, 5, 6\},$$
$$(E \cap F)^c = E^c \cup F^c = \{\text{an odd number or at most 3}\} = \{1, 2, 3, 5\},$$
$$(E \cup F)^c = E^c \cap F^c = \{\text{an odd number and at most 3}\} = \{1, 3\}.$$
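De Morgan's laws are easy to check mechanically with Python's set operations; here is a minimal sketch for Example 3.6:

```python
S = set(range(1, 7))
E = {2, 4, 6}          # an even number
F = {4, 5, 6}          # at least 4

Ec, Fc = S - E, S - F  # complements
assert S - (E & F) == Ec | Fc  # (E ∩ F)^c = E^c ∪ F^c
assert S - (E | F) == Ec & Fc  # (E ∪ F)^c = E^c ∩ F^c
print(S - (E & F), S - (E | F))  # {1, 2, 3, 5} {1, 3}
```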

LECTURE 4

Probability Spaces (2.3)

Recall: $S$ is the sample space (the set of all possible outcomes).

Definition. Events $E_1, E_2, \dots, E_n$ in $S$ are called mutually exclusive (or disjoint) if $E_j \cap E_k = \emptyset$ for $j \ne k$. The same definition works for infinitely many events.

Definition. A probability function on $S$ is an assignment of a number $P(E)$ to each event $E$, satisfying the following axioms:
(A1) $0 \le P(E) \le 1$.
(A2) $P(S) = 1$.
(A3) If $E_1, E_2, \dots$ are mutually exclusive events, then
$$P\left(\bigcup_{j=1}^\infty E_j\right) = \sum_{j=1}^\infty P(E_j).$$

Example 4.1. Flip a fair coin twice (order important). In this case, the sample space is $S = \{HH, HT, TH, TT\}$. What is the probability of getting tails exactly once, that is, $P(\{HT, TH\})$?

Common mistake: "we have three options, 2 tails, 1 tail or 0 tails, so the probability is 1/3."

"Fair coin" means that all outcomes are equally likely, that is, $P(\{HH\}) = P(\{TH\}) = P(\{HT\}) = P(\{TT\})$. Since the events $\{HH\}, \{HT\}, \{TH\}, \{TT\}$ are disjoint, axioms (A2) and (A3) imply that
$$1 = P(S) = P(\{HH\}) + P(\{HT\}) + P(\{TH\}) + P(\{TT\}) = 4\,P(\{HH\}),$$
and therefore $P(\{HH\}) = P(\{HT\}) = P(\{TH\}) = P(\{TT\}) = \frac14$. By axiom (A3):
$$P(\{HT, TH\}) = P(\{HT\}) + P(\{TH\}) = \frac24 = \frac12.$$

Example 4.2. We roll two fair dice. What is $P(\text{sum of the two dice is at least 11})$?

$S = \{(1,1), (1,2), (2,1), \dots, (5,6), (6,5), (6,6)\}$, or $S = \{1, 2, \dots, 6\}^2 = \{1, \dots, 6\} \times \{1, \dots, 6\}$. Since the dice are fair, for any $i, j \in \{1, \dots, 6\}$, $P((i,j)) = \frac1{36}$. We want to find $P(\text{sum of the two dice is at least 11}) = P(\{(5,6), (6,5), (6,6)\})$. We have
$$P(E) = P((5,6)) + P((6,5)) + P((6,6)) = \frac3{36} = \frac1{12}.$$

Properties of probability (2.4)

Proposition 4.3. $P(E^c) = 1 - P(E)$.

Proof. We have $S = E \cup E^c$. Since $E$ and $E^c$ are disjoint, it follows by axioms (A2) and (A3) that $1 = P(E) + P(E^c)$.

Example 4.4. Roll two dice. What is $P(\text{sum of the two dice is at most 10})$?

Solution. $P(\text{roll at most 10}) = 1 - P(\text{roll at least 11}) = 1 - \frac1{12} = \frac{11}{12}$.

Proposition 4.5. $E \subseteq F \implies P(E) \le P(F)$.

Proof. $F = E \cup (F \setminus E)$ (mutually exclusive), so $P(F) = P(E) + P(F \setminus E) \ge P(E)$ (axioms A1 and A3).

Proposition 4.6 (Inclusion-exclusion principle, basic case). $P(E \cup F) = P(E) + P(F) - P(E \cap F)$.

Proof. First, note that the events $E \setminus F$, $F \setminus E$, and $E \cap F$ are mutually exclusive, and hence (convince yourselves, using a Venn diagram):
(1) $P(E) = P(E \setminus F) + P(EF)$,
(2) $P(F) = P(F \setminus E) + P(EF)$, and
(3) $P(E \cup F) = P(E \setminus F) + P(F \setminus E) + P(EF)$.

Therefore, we have that
$$P(E) + P(F) - P(EF) = \big(P(E \setminus F) + P(EF)\big) + \big(P(F \setminus E) + P(EF)\big) - P(EF) \quad \text{[by (1) and (2)]}$$
$$= P(E \setminus F) + P(F \setminus E) + P(EF) = P(E \cup F) \quad \text{[by (3)]},$$
as claimed.

Proposition 4.7 (Inclusion-exclusion principle for 3 events). For any events $E_1, E_2, E_3$,
$$P(E_1 \cup E_2 \cup E_3) = P(E_1) + P(E_2) + P(E_3) - P(E_1E_2) - P(E_1E_3) - P(E_2E_3) + P(E_1E_2E_3).$$

Proof. Similar to the basic case.

The general inclusion-exclusion principle:
$$P(E_1 \cup \dots \cup E_n) = \sum_{i=1}^n P(E_i) - \sum_{i_1 < i_2} P(E_{i_1}E_{i_2}) + \sum_{i_1 < i_2 < i_3} P(E_{i_1}E_{i_2}E_{i_3}) - \dots + (-1)^{n+1} P(E_1 \cdots E_n),$$
where $\sum_{i_1 < i_2 < \dots < i_k}$ means that we sum over all possible $k$ different, ordered indices.

Example 4.8. Suppose Alice's probabilities of being accepted at UM and MSU are 0.35 and 0.9, respectively, and the probability of being accepted at both is 0.32. What is the probability that she is accepted at neither?

Solution. Let $A, B$ be the events that Alice is accepted at UM and MSU, respectively. Then
$$P((A \cup B)^c) = 1 - P(A \cup B) = 1 - \big(P(A) + P(B) - P(AB)\big) = 1 - (0.35 + 0.9 - 0.32) = 0.07.$$

Exercise 4.9. There are three people with three different hats. If each person takes a random hat from the coatroom, what is the probability that at least one person gets their own hat?

Solution. Let $E_i$ denote the event that person $i$ gets their own hat. The event that someone gets their hat back is $E_1 \cup E_2 \cup E_3$. Notice that $P(E_1) = P(E_2) = P(E_3) = \frac13$ (each person is equally likely to take any of the 3 possible hats, out of which only one is theirs, hence $\frac13$). Also note that if two people take their own hats, then the third person must also take their own hat (only their hat is left). In other words, $E_1E_2 = E_1E_3 = E_2E_3 = E_1E_2E_3$.

Therefore, $P(E_1E_2) = P(E_1E_3) = P(E_2E_3) = P(E_1E_2E_3) = \frac16$ (6 equally likely assignments of hats, out of which exactly one gives each person their own hat). By the inclusion-exclusion principle:
$$P(E_1 \cup E_2 \cup E_3) = 3 \cdot \frac13 - 3 \cdot \frac16 + \frac16 = \frac23.$$

Challenge: same question with $n$ people. (What happens when $n$ is very large, i.e. $n \to \infty$?) Answer: $1 - e^{-1}$.
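The 3-person answer is small enough to verify by enumerating all hat assignments; a minimal Python sketch:

```python
from itertools import permutations
from fractions import Fraction

# P(at least one of 3 people gets their own hat), over all 3! equally likely assignments
perms = list(permutations(range(3)))
hits = sum(any(p[i] == i for i in range(3)) for p in perms)
print(Fraction(hits, len(perms)))  # 2/3, matching inclusion-exclusion
```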

LECTURE 5

Sample spaces with equally likely outcomes (2.5)

So far, in all our examples, the outcomes of an experiment were all equally likely. Such sample spaces satisfy the following assumptions:
(1) The sample space is finite (so the outcomes can be labeled by numbers): $S = \{1, 2, \dots, N\}$.
(2) The elementary outcomes are equally likely: $P(\{1\}) = \dots = P(\{N\}) = \frac1N$.

Therefore, the probability of any event $E = \{i_1, i_2, \dots, i_k\}$ is
$$P(E) = \frac{\#\text{ of outcomes in } E}{\#\text{ of outcomes in } S} = \frac{|E|}{|S|}.$$

Example 5.1. We roll a die 3 times. What is the probability that it lands on 4 exactly once?

Solution. We choose the sample space $S$ of ordered results of three rolls, that is, $S = \{(1,1,1), (1,1,2), (1,2,1), (2,1,1), (1,1,3), \dots\}$. We have $|S| = 6^3$, where each outcome is equally likely. If $E$ is the event that the die lands on 4 exactly once, then $|E| = 3 \cdot 5^2$ (3 possibilities to choose in which roll the die lands on 4, and then $5^2$ possibilities for the remaining rolls). Thus $P(E) = \frac{3 \cdot 5^2}{6^3}$.

Remark 5.2. We could choose another sample space, namely, the set of unordered results, where each outcome indicates only the number of times the die has landed on each one of its sides: $S = \{\{1,1,1\}, \{1,1,2\}, \{1,1,3\}, \dots\}$. In this case, we have $|S| = \binom83 = 56$ (see Lecture 1, and the solution of Homework 1), and $|E| = 5 + \binom52 = 15$ (one roll lands on 4, and either the two remaining rolls are the same, which gives 5 possibilities, or they are different, which gives $\binom52 = 10$; combined we get 15). However, $P(E) \ne \frac{15}{56}$, because in this sample space the outcomes are not equally likely! For example, $P(\text{lands three times on 1}) \ne P(\text{lands once on 1, once on 2, and once on 3})$. Therefore, in this case one cannot use the formula $P(E) = \frac{|E|}{|S|}$.
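Both the count in Example 5.1 and the warning in Remark 5.2 can be checked by enumerating the ordered sample space; a minimal Python sketch:

```python
from itertools import product
from fractions import Fraction

outcomes = list(product(range(1, 7), repeat=3))  # ordered rolls, all equally likely
exactly_one_four = sum(o.count(4) == 1 for o in outcomes)
print(Fraction(exactly_one_four, len(outcomes)))  # 75/216 = 25/72, not 15/56
```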

Exercise 5.3 (Lottery). Suppose that to win the lottery we have to pick 6 correct (and different) numbers out of 49 possible choices. What is the probability that we pick 5 correct numbers and one incorrect one?

Solution. The sample space $S$ consists of all possible subsets of 6 numbers, from the possible 49 choices. Therefore, $|S| = \binom{49}{6}$. We want to find $P(E)$, where $E$ = {5 correct numbers, 1 incorrect}, and all choices of 6 numbers are equally likely. To find $|E|$, we need to choose 5 correct numbers out of the 6 possible correct numbers, for which there are $\binom65$ combinations, and then choose 1 incorrect number out of the possible $43 = 49 - 6$ incorrect numbers, for which there are $\binom{43}1$ combinations. Thus, $|E| = \binom65\binom{43}1$. Finally,
$$P(E) = \frac{\binom65\binom{43}1}{\binom{49}6} = \frac{258}{13{,}983{,}816} \approx 1.8 \cdot 10^{-5}$$
(small, but much more likely than picking the correct 6, whose probability is $\frac1{\binom{49}6} \approx \frac1{14\text{ million}}$).

Example 5.4. If $n$ people are in a room, what is the probability that none of them was born on July 21st? (Assume birthdays are equally likely, and there are 365 days in a year.)
$$S = \{1, \dots, 365\} \times \dots \times \{1, \dots, 365\} = \{1, \dots, 365\}^n, \quad |S| = 365^n;$$
$$E = \{\text{not July 21st}\}^n, \quad |E| = 364^n; \quad P(E) = \left(\frac{364}{365}\right)^n.$$

Exercise 5.5. In a group of $n$ people, what is the probability that no two of them share a birthday? (Assume 365 days, equally likely.)

Solution. Clearly, if $n > 365$, then the probability is 0 (more people than birthdays). If $n \le 365$, the problem is similar to the elevator problem: let $E$ denote the event that no two share a birthday. Then, the number of outcomes in $E$ is $|E| = 365 \cdot 364 \cdots (365 - (n-1))$, and
$$P(E) = \frac{|E|}{|S|} = \frac{365 \cdot 364 \cdots (366 - n)}{365^n}.$$

Remark 5.6. One can check that already for $n = 23$, $P(E)$ becomes smaller than 0.5, which means that in a group of 23 people, it is likely that 2 people will share a birthday. With 50 people, the probability that two share a birthday already exceeds 0.95! (See the short computation at the end of this lecture.)

Exercise 5.7. A market study produces the following results:
(1) 80% of respondents drink coffee or tea or both;
(2) 60% of respondents drink coffee (they may also drink tea);
(3) 30% of respondents drink both coffee and tea.
What is the percentage of respondents drinking tea?

Solution. Pick a random respondent from {all respondents}; that is, all respondents are equally likely to be picked. Define $C$ = {drinks coffee}, $T$ = {drinks tea}. Then $C \cup T$ = {drinks coffee or tea}, $C \cap T$ = {drinks coffee and tea}. According to the given information, we have
$$P(C) = \frac{|C|}{|S|} = 0.6, \quad P(C \cup T) = \frac{|C \cup T|}{|S|} = 0.8, \quad P(C \cap T) = \frac{|C \cap T|}{|S|} = 0.3.$$
By the inclusion-exclusion principle, we have
$$P(C \cup T) = P(C) + P(T) - P(C \cap T) \implies 0.8 = 0.6 + P(T) - 0.3.$$
Therefore $P(T) = 0.5$, or 50%.
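The figures quoted in Remark 5.6 follow from a one-line product; here is a minimal Python sketch:

```python
# Birthday problem: P(no two of n people share a birthday)
def p_no_match(n: int) -> float:
    p = 1.0
    for k in range(n):
        p *= (365 - k) / 365
    return p

print(p_no_match(23))      # ~0.493 < 0.5
print(1 - p_no_match(50))  # ~0.970 > 0.95
```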


LECTURE 6

More nontrivial examples (2.5)

Exercise 6.1 (Quality control). A shipment has 20 items. 5 random items are tested for quality; if one or more defective items are detected, the shipment is rejected. If the shipment is known to have $k$ defective items, what is the probability that it is rejected?

Solution. The sample space is $S$ = {all possible samples of 5 items chosen from the 20}. So, $|S| = \binom{20}5$. The event in question is $E$ = {all samples of 5 items with at least one defective item}. It is easier to consider the complement event: $E^c$ = {all samples consisting of only good items}. To compute $|E^c|$ we count the number of subsets of 5 items, out of the possible $20 - k$ good items. That is, $|E^c| = \binom{20-k}5$. Therefore,
$$P(E) = 1 - P(E^c) = 1 - \frac{\binom{20-k}5}{\binom{20}5}.$$
For example, if $k = 5$, then $P(E) \approx 0.81$, while if $k = 2$, then $P(E) \approx 0.45$.

Exercise 6.2 (Quality control, continued). Now consider testing pixels of LCD screens with a million pixels. If a screen with 10 or more defective pixels is called a bad screen, find a sample size $n$ of pixels to be tested, so that we can guarantee at least a 90% chance of detecting a bad screen (detect = finding one or more defective pixels).

Solution. Suppose we have a bad screen with exactly 10 defective pixels (if we guarantee 90% detection of this bad screen, then we will clearly have at least a 90% chance of detecting bad screens with more than 10 defective pixels). From the solution of the previous problem, the probability of not finding any defective pixel in a sample of $n$ pixels from the screen is
$$P(E^c) = \frac{\binom{999{,}990}n}{\binom{1{,}000{,}000}n} = \frac{\frac{999{,}990!}{n!\,(999{,}990-n)!}}{\frac{1{,}000{,}000!}{n!\,(1{,}000{,}000-n)!}} = \frac{999{,}990 \cdot 999{,}989 \cdots (999{,}990 - (n-1))}{1{,}000{,}000 \cdot 999{,}999 \cdots (1{,}000{,}000 - (n-1))}.$$
We need to find $n$ such that $P(E^c) < 0.1$ (so that the probability $P(E)$ of finding defective pixels in the sample is at least 0.9). Note that
$$P(E^c) = \frac{999{,}990}{1{,}000{,}000} \cdot \frac{999{,}989}{999{,}999} \cdots \frac{999{,}990 - (n-1)}{1{,}000{,}000 - (n-1)} \le \left(\frac{999{,}990}{1{,}000{,}000}\right)^n,$$

and hence it suffices to find $n$ such that
$$\left(\frac{999{,}990}{1{,}000{,}000}\right)^n \le 0.1 \iff n \ln\left(\frac{999{,}990}{1{,}000{,}000}\right) \le \ln(0.1) \iff n \ge \frac{\ln(10)}{-\ln\left(\frac{999{,}990}{1{,}000{,}000}\right)} \approx 230{,}257.4.$$
Therefore, sampling $n = 230{,}258$ pixels guarantees at least a 90% chance of detecting bad screens.

Exercise 6.3 (GMAT practice problem). A fair coin is tossed 10 times. What is the probability that at least two consecutive heads appear?

Solution. We have $S$ = {all results of 10 coin flips, order important} = $\{H, T\}^{10}$, and hence $|S| = 2^{10} = 1024$. Denote the event that two consecutive heads appear by $E$, and consider
$$E^c = \{\text{no heads appear next to each other}\}.$$
We'll see two methods to find $|E^c|$:

Method 1: Put the tails in a row with spaces between them (and at the two ends), and count the ways to place heads in the possible spaces (similar to the balloons and ribbons examples):

# heads   # tails   # spaces   # ways
0         10        11         $\binom{11}0 = 1$
1         9         10         $\binom{10}1 = 10$
2         8         9          $\binom92 = 36$
3         7         8          $\binom83 = 56$
4         6         7          $\binom74 = 35$
5         5         6          $\binom65 = 6$

Note that it is not possible to have more than 5 heads (without two of them being adjacent), since then there would be more heads than spaces.

In total, there are $1 + 10 + 36 + 56 + 35 + 6 = 144$ combinations. Therefore,
$$P(E) = 1 - P(E^c) = 1 - \frac{|E^c|}{|S|} = 1 - \frac{144}{1024} \approx 0.86.$$

Method 2: Denote by $A_n$ the number of ways to order $n$ coins such that no two consecutive heads appear. Observe that:
(1) If the first coin is T, then there are $A_{n-1}$ ways to order the remaining tosses.
(2) If the first coin is H, then the second one must be T, and there are $A_{n-2}$ ways to order the remaining tosses.
In total: $A_n = A_{n-1} + A_{n-2}$. Since $A_1 = 2$ and $A_2 = 3$ (easy to check), one can iteratively find $A_3, A_4, \dots, A_{10}$, or $A_n$ for any $n$.

Remark 6.4. The numbers $A_1, A_2, \dots$ are also known as Fibonacci numbers.
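Method 2's recursion is a natural candidate for a few lines of code; the sketch below also double-checks the count against brute force over all $2^{10}$ sequences:

```python
from itertools import product

def a(n: int) -> int:
    """A_n = A_{n-1} + A_{n-2}, with A_1 = 2 and A_2 = 3."""
    prev, cur = 2, 3
    if n == 1:
        return prev
    for _ in range(n - 2):
        prev, cur = cur, prev + cur
    return cur

# Brute force over all 2^10 flip sequences
brute = sum("HH" not in "".join(s) for s in product("HT", repeat=10))
print(a(10), brute, 1 - a(10) / 2**10)  # 144 144 0.859375
```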


LECTURE 7

Conditional probability (3.1, 3.2)

Conditional probability is used in situations where partial information concerning the result of an experiment is available. We will later see that even with no partial information, the concept of conditional probability can be very helpful in computing probabilities.

Definition. Let $E, F$ be events, and suppose $P(F) > 0$. The conditional probability that $E$ occurs given that $F$ has occurred is
$$P(E \mid F) := \frac{P(E \cap F)}{P(F)}.$$
Note that, equivalently, $P(EF) = P(E \mid F)\,P(F)$.

Idea: now $F$ becomes the new sample space, so we normalize by $P(F)$.

Example 7.1. The ELISA test (mid-1980s) was used to screen donated blood for the presence of HIV. Among the people who are given this test, we can expect the following distribution (rows: $A_1$ = test positive, $A_2$ = test negative; columns: $B_1$ = HIV+, $B_2$ = HIV−; with row and column totals). Find $P(A_1 \mid B_2)$ and $P(B_1 \mid A_1)$.

[The numerical entries of the table were lost in extraction.]

Solution. $P(A_1 \mid B_2)$ is the proportion, among HIV− people, of those who test positive. Meaning: given that a person is HIV−, it is the probability that they will be tested positive. Similarly, one computes $P(B_1 \mid A_1) \approx 0.065$, which means that only 6.5% of the people who are tested positive are actually HIV+.

Example 7.2. A family has two children. (a) Given that at least one child is a boy, what is the probability that both children are boys? (b) Given that the older child is a boy, what is the probability that both children are boys?

Solution. The sample space is $S = \{BB, BG, GB, GG\}$ (listing the older child first). We assume all outcomes are equally likely. Let $E$ be the event that both children are boys.

(a) Let $F_1 = \{BB, BG, GB\}$ be the event that at least one child is a boy. Thus,
$$P(F_1) = \frac{|\{BB, BG, GB\}|}{|S|} = \frac34.$$
Hence,
$$P(E \mid F_1) = \frac{P(E \cap F_1)}{P(F_1)} = \frac{P(BB)}{P(F_1)} = \frac{1/4}{3/4} = \frac13.$$
(b) Let $F_2$ be the event that the older child is a boy. Then $F_2 = \{BB, BG\}$, and hence $P(F_2) = \frac24 = \frac12$. Therefore,
$$P(E \mid F_2) = \frac{P(E \cap F_2)}{P(F_2)} = \frac{P(BB)}{P(F_2)} = \frac{1/4}{1/2} = \frac12.$$

Exercise 7.3. Three cards are randomly selected without replacement from an ordinary deck of 52 cards. Compute the conditional probability that the third card selected is a spade, given that the 1st and 2nd cards were spades.

Solution. One can easily convince oneself, without the definition of conditional probability, that the probability is $\frac{11}{50}$ (why?). Using the definition of conditional probability: define $F$ = {1st and 2nd are spades} and $E$ = {third is a spade}. We have that
$$P(E \cap F) = P(\text{all three are spades}) = \frac{\binom{13}3}{\binom{52}3}.$$
$P(F)$ is the probability of getting 2 spades by selecting two cards, which is simply $\frac{\binom{13}2}{\binom{52}2}$. The conditional probability is thus
$$P(E \mid F) = \frac{\binom{13}3 / \binom{52}3}{\binom{13}2 / \binom{52}2} = \frac{11}{50}.$$

Exercise 7.4. The numbers 1, 2, 3, 4, 5, 6 are randomly written on the sides of a blank six-sided die. What is the probability that the sum of the numbers on each pair of opposite sides of the die is equal to 7?

Solution. Define $E_{16}$ = {1 and 6 on opposite sides}, and similarly define $E_{25}$, $E_{34}$. Note that $P(E_{16}E_{25}E_{34}) = P(E_{16}E_{25})$, since once 1 & 6 and 2 & 5 are on opposite sides, 3 & 4 must also be on opposite sides. So
$$P(E_{16}E_{25}E_{34}) = P(E_{16}E_{25}) = P(E_{16})\,P(E_{25} \mid E_{16}).$$

It is not hard to see that $P(E_{16}) = \frac15$. Indeed, put 1 on any side of the die; the opposite side is equally likely to show any number among $\{2, 3, 4, 5, 6\}$. Thus, the probability is $\frac15$. Moreover, to compute $P(E_{25} \mid E_{16})$, note that 2 can be anywhere among the remaining sides, and the side opposite to it is then equally likely to show any number among $\{3, 4, 5\}$; hence $P(E_{25} \mid E_{16}) = \frac13$. Therefore,
$$P(E_{16}E_{25}E_{34}) = P(E_{16}E_{25}) = P(E_{16})\,P(E_{25} \mid E_{16}) = \frac15 \cdot \frac13 = \frac1{15}.$$

A generalization of $P(EF) = P(E)\,P(F \mid E)$ is the following multiplication rule:

Proposition 7.5. We have:
$$P(E_1 E_2 \cdots E_n) = P(E_1)\,P(E_2 \mid E_1) \cdots P(E_n \mid E_1 \cdots E_{n-1}).$$

Proof. We simply write out the definition of conditional probability to get
$$P(E_1)\,P(E_2 \mid E_1) \cdots P(E_n \mid E_1 \cdots E_{n-1}) = P(E_1)\,\frac{P(E_1E_2)}{P(E_1)}\,\frac{P(E_1E_2E_3)}{P(E_1E_2)} \cdots \frac{P(E_1E_2 \cdots E_n)}{P(E_1 \cdots E_{n-1})} = P(E_1 \cdots E_n),$$
since the product telescopes.
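The $\frac1{15}$ in Exercise 7.4 can be confirmed by enumerating all $6!$ labelings of the faces; a minimal Python sketch, with faces paired as opposite in positions (0,1), (2,3), (4,5):

```python
from itertools import permutations
from fractions import Fraction

good = total = 0
for lab in permutations(range(1, 7)):  # all ways to write 1..6 on the six faces
    total += 1
    good += all(lab[i] + lab[j] == 7 for i, j in [(0, 1), (2, 3), (4, 5)])
print(Fraction(good, total))  # 1/15
```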


LECTURE 8

LTP (3.3)

Law of total probability (basic case). We have:
$$P(E) = P(E \mid F)\,P(F) + P(E \mid F^c)\,P(F^c).$$
The proof is straightforward: by the definition of conditional probability, we know that $P(E \mid F)\,P(F) = P(E \cap F)$ and $P(E \mid F^c)\,P(F^c) = P(E \cap F^c)$, hence the right-hand side is equal to $P(E \cap F) + P(E \cap F^c)$. Since $S = F \cup F^c$, the disjoint union of $EF$ and $EF^c$ is equal to $E$, which means that $P(E) = P(E \cap F) + P(E \cap F^c)$ (draw a Venn diagram to convince yourselves).

Theorem 8.1 (general LTP). Let $F_1, F_2, \dots, F_N$ be disjoint events such that
$$S = \bigcup_{k=1}^N F_k.$$
Then for any event $E$,
$$P(E) = \sum_{k=1}^N P(E \mid F_k)\,P(F_k).$$

Remark 8.2. The LTP also works for infinitely many events $F_1, F_2, \dots$.

Example 8.3. What percentage of Delta flights arrive on time, if:
(1) 70% of flights depart on time (same as: 30% depart late);
(2) 80% of flights that depart on time arrive on time;
(3) 90% of flights that depart late arrive late (same as: 10% of them arrive on time)?

We select a random flight, and define the events $D$ = {departs on time} and $A$ = {arrives on time}. In terms of $A$ and $D$, the given information states that
$$P(D) = 0.7, \quad P(D^c) = 0.3, \quad P(A \mid D) = 0.8, \quad P(A^c \mid D^c) = 0.9, \quad P(A \mid D^c) = 0.1.$$

By the LTP, it follows that
$$P(A) = P(A \mid D)\,P(D) + P(A \mid D^c)\,P(D^c) = 0.8 \cdot 0.7 + 0.1 \cdot 0.3 = 0.59.$$
Therefore, the final answer is: 59%.

Bayes' formula (3.3)

Idea: to relate $P(E \mid F)$ with $P(F \mid E)$. How? We know that $P(F \mid E)\,P(E) = P(E \cap F) = P(E \mid F)\,P(F)$, which implies:

Theorem 8.4 (Bayes' formula). If $P(E), P(F) > 0$, then
$$P(F \mid E) = \frac{P(E \mid F)\,P(F)}{P(E)} \overset{\text{LTP}}{=} \frac{P(E \mid F)\,P(F)}{P(E \mid F)\,P(F) + P(E \mid F^c)\,P(F^c)}.$$

Example 8.5. A certain disease affects 1% of the population. 5% of the non-ill population will be tested positive, while 10% of the ill will be tested negative. What is the probability that a person tested positive is actually sick with this disease?

Solution. Define $D$ = {has disease}, $T$ = {tests positive}. We want to find $P(D \mid T)$. The probability that a non-ill person is tested positive: $P(T \mid D^c) = 0.05$. The probability that an ill person is tested negative: $P(T^c \mid D) = 0.1$. Using Bayes' formula,
$$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid D^c)\,P(D^c)} = \frac{0.9 \cdot 0.01}{0.9 \cdot 0.01 + 0.05 \cdot 0.99} \approx 0.15,$$
which is very small. Intuition: the disease is rare, so almost all positive tests are of non-ill people. For example, among 2000 people, about 20 are ill, and 1980 are not. Among the 20 ill, 18 will be tested positive, and among the healthy, 99 will be tested positive, so the probability is about $\frac{18}{18 + 99} \approx 0.15$.

In some cases, conditional probability is useful, even if partial information is not specified in the problem:

Exercise 8.6. There are 15 tennis balls in a box, only nine of which are new balls. In the first round, three of the balls are randomly selected, played with, and then returned to the box. Later, in the second round, another three balls are randomly selected from the box. Find the probability that all three balls in the second selection are new.

Solution. Let $E$ denote the event that all three balls in the second round are new. For $i = 0, 1, 2, 3$, let $F_i$ denote the event that $i$ new balls have been selected in the first round. Note that

$F_i, F_j$ are disjoint for $i \ne j$, and satisfy $S = \bigcup_{i=0}^3 F_i$. We now use the LTP to write
$$P(E) = \sum_{i=0}^3 P(E \mid F_i)\,P(F_i).$$
Thus, the solution is divided into a few simpler parts, namely, the computations of $P(E \mid F_i)$ and $P(F_i)$:
$$P(F_i) = \frac{\binom9i\binom6{3-i}}{\binom{15}3}$$
(choosing $i$ new balls from the 9 new balls, and $3 - i$ old balls from the 6 old balls).
The probability $P(E \mid F_i)$ is the probability that all three balls in the second selection are new, where given $F_i$ there are $9 - i$ new balls (and $6 + i$ used balls). Therefore,
$$P(E \mid F_i) = \frac{\binom{9-i}3}{\binom{15}3}.$$
Plugging back into the LTP formula, we get
$$P(E) = \sum_{i=0}^3 \frac{\binom{9-i}3}{\binom{15}3} \cdot \frac{\binom9i\binom6{3-i}}{\binom{15}3} = \frac1{\binom{15}3^2} \sum_{i=0}^3 \binom{9-i}3\binom9i\binom6{3-i} \approx 0.089.$$
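The final sum is easy to evaluate with `math.comb`; a minimal Python sketch:

```python
from math import comb

c = comb(15, 3)
p = sum(comb(9 - i, 3) / c                 # P(E | F_i)
        * comb(9, i) * comb(6, 3 - i) / c  # P(F_i)
        for i in range(4))
print(p)  # ~0.0893
```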


LECTURE 9

Independent events (3.4)

Intuitive definition: knowing that $E$ has occurred doesn't affect the probability that $F$ occurs, and vice versa:
$$P(E \mid F) = P(E), \quad \text{or} \quad P(F \mid E) = P(F).$$

Example 9.1. Draw a card from a standard deck of 52 cards:
$$P(\text{king} \mid \text{heart}) = P(\text{king}) = \frac1{13} \quad \text{and} \quad P(\text{heart} \mid \text{king}) = P(\text{heart}) = \frac14.$$

Note that both $P(E \mid F) = P(E)$ and $P(F \mid E) = P(F)$ are equivalent to $P(E)\,P(F) = P(E \cap F)$, since, for example,
$$P(E) = P(E \mid F) \iff P(E) = \frac{P(E \cap F)}{P(F)} \iff P(E)\,P(F) = P(E \cap F).$$
For this reason we define:

Definition 9.2. Events $E, F$ are called independent if $P(E)\,P(F) = P(E \cap F)$.

Exercise 9.3. Bob and Alice do not know each other, but both need to take an advanced math course next semester. The probability that Alice will take MATH 425 is 0.8. On the other hand, Bob has decided that a coin flip will determine whether he takes MATH 425 or not. What is the probability that both of them will be taking MATH 425?

Solution. By independence (they do not know each other),
$$P(\{\text{Alice takes 425}\} \cap \{\text{Bob takes 425}\}) = P(\{\text{Alice takes 425}\})\,P(\{\text{Bob takes 425}\}) = 0.8 \cdot 0.5 = 0.4.$$

We may also define independence for any sequence of events: we say that $E_1, E_2, E_3$ are independent if $P(E_1E_2E_3) = P(E_1)\,P(E_2)\,P(E_3)$ and every pair of events is independent: $P(E_iE_j) = P(E_i)\,P(E_j)$ for all $i \ne j$.

We say that $E_1, \dots, E_4$ are independent if $P(E_1E_2E_3E_4) = P(E_1)\,P(E_2)\,P(E_3)\,P(E_4)$ and every collection of 3 events among $\{E_1, \dots, E_4\}$ is independent. Iteratively, we say that $E_1, \dots, E_n$ are independent if $P(E_1E_2 \cdots E_n) = P(E_1) \cdots P(E_n)$ and every collection of $n - 1$ events among $\{E_1, \dots, E_n\}$ is independent.

Remark 9.4. Sometimes we know only that the pairs are independent; this is called pairwise independence, and it is a weaker property that does not imply independence. For example:

Example 9.5. We roll 2 fair dice. Let
$E_1$ = {first die shows 4}, $E_2$ = {second die shows 4}, $E_3$ = {sum of the dice is 7}.
It is easy to verify that $E_1, E_2$ are independent, $E_1, E_3$ are independent, and $E_2, E_3$ are independent. However, $E_1, E_2, E_3$ are not independent.

Exercise 9.6 (Ross, Example 4g). A system composed of $n$ separate components is said to be a parallel system if it functions when at least one of the components functions. For such a system, if component $i$, which is independent of the other components, functions with probability $p_i$, $i = 1, \dots, n$, what is the probability that the system functions?

Solution. Let $A_i$ denote the event that component $i$ functions. Then
$$P(\text{system functions}) = 1 - P(\text{system does not function}) = 1 - P\left(\bigcap_{i=1}^n A_i^c\right).$$
By independence, we have that
$$P(\text{system functions}) = 1 - \prod_{i=1}^n (1 - p_i).$$
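A minimal Python sketch checking the parallel-system formula against brute-force enumeration of component states (the reliabilities 0.9, 0.8, 0.7 are made-up values for illustration):

```python
from itertools import product

ps = [0.9, 0.8, 0.7]  # assumed component reliabilities

closed = 1.0
for p in ps:
    closed *= 1 - p
closed = 1 - closed  # 1 - prod(1 - p_i)

# Brute force: total probability of all configurations with at least one working component
brute = 0.0
for states in product([0, 1], repeat=len(ps)):
    prob = 1.0
    for s, p in zip(states, ps):
        prob *= p if s else 1 - p
    if any(states):
        brute += prob
print(closed, brute)  # both 0.994
```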

Exercise 9.7. Show that if $A, B$ are independent, then $A^c, B$ are independent (which also implies that $A, B^c$ are independent and $A^c, B^c$ are independent).

Solution. We have that
$$P(B) = P(A^cB) + P(AB) \overset{\text{indep.}}{=} P(A^cB) + P(A)\,P(B),$$
which implies
$$P(A^cB) = (1 - P(A))\,P(B) = P(A^c)\,P(B).$$

Exercise 9.8. Let $A, B, C, D$ be independent. Show that $A \cup B$ and $C \cup D$ are also independent.

Solution. From the discussion above, we know $A^c, B^c, C^c, D^c$ are independent. Therefore, $A^c \cap B^c$ and $C^c \cap D^c$ are also independent (why?). This implies that $A \cup B = (A^c \cap B^c)^c$ and $C \cup D = (C^c \cap D^c)^c$ are also independent.

Applying the ideas from the previous two exercises leads to the following general properties of independence. Let $E_1, E_2, \dots$ be independent events. Then:
(1) Replacing some of the $E_i$ by $E_i^c$ preserves independence.
(2) Intersections of different events among $\{E_1, \dots, E_n\}$ are independent. For example, $E_1 \cap E_2$ and $E_3 \cap E_4 \cap E_5$ are independent.
(3) Unions of different events among $\{E_1, \dots, E_n\}$ are independent. For example, $E_1 \cup E_2$ and $E_3 \cup E_4$ are independent.
(4) Any combinations of the above (where each event $E_i$ appears at most once) are independent. For example, $((E_1 \cup E_2) \cap E_3^c)^c$, $E_4 \cup E_5$, and $E_6$ are independent.

Exercise 9.9. We want to drive from A to C through B. There are two roads connecting A to B and two other roads connecting B to C. Because of road construction, each of the four roads may be closed with probability $p$, independently of the other roads. What is the probability that we can reach C?

Solution. Define the events $R_1, R_2$ = {road 1 (resp. 2) from A to B is open} and $R_3, R_4$ = {road 3 (resp. 4) from B to C is open}. Then
$$P(\text{can reach C}) = P(\text{can go from A to B and then from B to C}) = P((R_1 \cup R_2) \cap (R_3 \cup R_4)).$$
By the previous proposition, we have
$$P((R_1 \cup R_2) \cap (R_3 \cup R_4)) = P(R_1 \cup R_2)\,P(R_3 \cup R_4) = \big(1 - P((R_1 \cup R_2)^c)\big)\big(1 - P((R_3 \cup R_4)^c)\big)$$
$$= \big(1 - P(R_1^c R_2^c)\big)\big(1 - P(R_3^c R_4^c)\big) = \big(1 - P(R_1^c)\,P(R_2^c)\big)\big(1 - P(R_3^c)\,P(R_4^c)\big) = \left(1 - p^2\right)^2.$$
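A Monte Carlo sanity check of the answer $(1 - p^2)^2$; the value $p = 0.3$, the seed, and the trial count are arbitrary choices for the sketch:

```python
import random

p, trials = 0.3, 100_000
random.seed(0)
hits = 0
for _ in range(trials):
    open_ = [random.random() > p for _ in range(4)]  # each road closed with prob. p
    hits += (open_[0] or open_[1]) and (open_[2] or open_[3])
print(hits / trials, (1 - p**2) ** 2)  # both ~0.828
```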

LECTURE 10

Series of independent events (3.4)

Exercise. Independent trials consisting of rolling a pair of fair dice are performed (indefinitely). What is the probability that an outcome of 5 appears before an outcome of 7, when the outcome of a roll is the sum of the dice?

Solution. Since, on any roll, 5 occurs with probability 4/36, and 7 with probability 6/36, it seems intuitive that the odds that a 5 appears before a 7 should be 6 to 4 against. The probability should then be 4/10. Indeed, one can solve the problem directly, by defining the event $E_n$ that neither 5 nor 7 occurs in the first $n - 1$ trials, and 5 occurs in the $n$th trial. Then $P\left(\bigcup_{n=1}^\infty E_n\right) = \sum_n P(E_n)$ is what we are looking for, which can be computed using independence (try!).

Instead, we use conditional probability. Let $E$ denote the event that 5 (that is, the sum 5) appears before 7. Let $F_1$ denote the event that the outcome of the first roll is 5, let $F_2$ denote the event that the outcome of the first roll is 7, and let $F_3$ denote the event that the outcome of the first roll is neither 5 nor 7. By the LTP we have that
$$P(E) = P(E \mid F_1)\,P(F_1) + P(E \mid F_2)\,P(F_2) + P(E \mid F_3)\,P(F_3).$$
Notice that $P(E \mid F_1) = 1$ and $P(E \mid F_2) = 0$, and that, by independence, $P(E \mid F_3) = P(E)$. Moreover, an easy computation shows that $P(F_1) = \frac4{36}$, $P(F_2) = \frac6{36}$, and $P(F_3) = \frac{26}{36}$. Thus,
$$P(E) = \frac4{36} + \frac{26}{36}\,P(E) \implies \frac{10}{36}\,P(E) = \frac4{36} \implies P(E) = \frac4{10} = \frac25.$$

At home: similar arguments show that if $E$ and $F$ are mutually exclusive events of an experiment, then, when independent trials of the experiment are performed, the event $E$ will occur before the event $F$ with probability
$$\frac{P(E)}{P(E) + P(F)}.$$
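A short simulation of the 5-before-7 experiment (seed and trial count are arbitrary):

```python
import random

random.seed(1)
wins, trials = 0, 100_000
for _ in range(trials):
    while True:
        s = random.randint(1, 6) + random.randint(1, 6)
        if s == 5:
            wins += 1
            break
        if s == 7:
            break
print(wins / trials)  # ~0.4 = 2/5
```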

Conditional probability as a probability (3.5)

Conditional probabilities satisfy all the axioms of a probability function. Namely, let $F$ be an event with $P(F) > 0$. Then $P(\,\cdot \mid F)$ is a probability function, satisfying $0 \le P(E \mid F) \le 1$, $P(S \mid F) = 1$, and
$$P\left(\bigcup_{i=1}^n E_i \,\Big|\, F\right) = \sum_{i=1}^n P(E_i \mid F)$$
for any disjoint events $E_1, E_2, \dots, E_n$. This means that all the results/formulas we have seen so far also hold for conditional probabilities. For example, the inclusion-exclusion relation:
$$P(E_1 \cup E_2 \mid F) = P(E_1 \mid F) + P(E_2 \mid F) - P(E_1E_2 \mid F).$$
Another useful relation is the LTP for conditional probabilities. Denote $P_F(E) := P(E \mid F)$. Then, for any event $G$ we have
(4) $P_F(E) = P_F(E \mid G)\,P_F(G) + P_F(E \mid G^c)\,P_F(G^c)$.
To understand what this formula means in terms of the original probability $P$, note that
$$P_F(E \mid G) = \frac{P_F(E \cap G)}{P_F(G)} = \frac{P(EG \mid F)}{P(G \mid F)} = \frac{P(EGF)/P(F)}{P(FG)/P(F)} = \frac{P(EFG)}{P(FG)} = P(E \mid FG),$$
which means that (4) is equivalent to
$$P(E \mid F) = P(E \mid FG)\,P(G \mid F) + P(E \mid FG^c)\,P(G^c \mid F).$$

Exercise. One at a time, we turn over cards from a shuffled standard deck of 52 cards. Given that the first card is a queen, what is the probability that the second card is the queen of spades?

Solution. Let $E$ denote the event that the second card is the queen of spades, let $F$ denote the event that the first card is a queen, and let $G$ denote the event that the first card is the queen of spades. Then
$$P(E \mid F) = P(E \mid FG)\,P(G \mid F) + P(E \mid FG^c)\,P(G^c \mid F).$$
Clearly, we have that $P(E \mid FG) = 0$ (given that the first card is the queen of spades, the second card cannot be the queen of spades). Moreover, $P(E \mid FG^c) = \frac1{51}$ (there are 51 cards left, including the queen of spades), and $P(G^c \mid F) = \frac34$ (3 possible queens out of 4). Thus,
$$P(E \mid F) = 0 \cdot \frac14 + \frac1{51} \cdot \frac34 = \frac1{68}.$$

Conditional independence.

Definition. We say that $E_1$ and $E_2$ are conditionally independent given $F$ if $P_F(E_1 \mid E_2) = P_F(E_1)$ or, equivalently, $P_F(E_1E_2) = P_F(E_1)\,P_F(E_2)$. In terms of $P$: $E_1$ and $E_2$ are conditionally independent given $F$ if $P(E_1 \mid E_2F) = P(E_1 \mid F)$ or, equivalently, if $P(E_1E_2 \mid F) = P(E_1 \mid F)\,P(E_2 \mid F)$.

Exercise (Ross, Example 5a). The probability that a policyholder is an accident-prone person is 0.3. During any given year, an accident-prone person will have an accident with probability 0.4 (independently of other years), whereas a person who is not prone to accidents will have an accident with probability 0.2 (again, independently of other years). What is the conditional probability that a new policyholder will have an accident in their second year of policy ownership, given that they had an accident in the first year?

Solution. Let $A$ be the event that the policyholder is accident prone, and let $A_1$ (resp. $A_2$) be the event that the policyholder has had an accident in the 1st (resp. 2nd) year. We want to find $P(A_2 \mid A_1)$. We condition on whether the policyholder is accident prone or not:
$$P(A_2 \mid A_1) = P(A_2 \mid AA_1)\,P(A \mid A_1) + P(A_2 \mid A^cA_1)\,P(A^c \mid A_1).$$
By Bayes' formula, we have
$$P(A \mid A_1) = \frac{P(A_1 \mid A)\,P(A)}{P(A_1 \mid A)\,P(A) + P(A_1 \mid A^c)\,P(A^c)} = \frac{0.4 \cdot 0.3}{0.4 \cdot 0.3 + 0.2 \cdot 0.7} = \frac{0.12}{0.26} = \frac6{13},$$
and hence $P(A^c \mid A_1) = \frac7{13}$. By conditional independence, we have $P(A_2 \mid AA_1) = P(A_2 \mid A) = 0.4$ and $P(A_2 \mid A^cA_1) = P(A_2 \mid A^c) = 0.2$. Thus,
$$P(A_2 \mid A_1) = 0.4 \cdot \frac6{13} + 0.2 \cdot \frac7{13} = \frac{3.8}{13} \approx 0.29.$$


LECTURE 11

Random Variables (4.1)

Definition. A random variable (r.v.) $X$ is a real-valued function on the sample space $S$.

Example. (a) We toss a fair coin 3 times. Consider the following assignment of numbers to each one of the outcomes in the sample space:
$$HHH \mapsto 3,\ HHT \mapsto 2,\ HTH \mapsto 2,\ HTT \mapsto 1,\ THH \mapsto 2,\ THT \mapsto 1,\ TTH \mapsto 1,\ TTT \mapsto 0.$$
This assignment is the r.v. describing the number of heads tossed.

Example. (b) Roll a fair die once. Then $S = \{1, 2, 3, 4, 5, 6\}$. Let $E$ = {we roll an even number}. Consider the function $X : S \to \mathbb{R}$ defined by
$$X(s) = \begin{cases} 1, & s \in E \\ 0, & s \notin E. \end{cases}$$
This function is the r.v. indicating whether $E$ has occurred or not.

More examples: (c) the age of a randomly chosen person from a group; (d) the delay time of a flight; (e) a molecule's velocity at a given moment.

Notation:
$$P(X = x) = P(\{s \in S : X(s) = x\}), \quad P(X \in A) = P(\{s \in S : X(s) \in A\}),$$
and similarly $P(X < a)$, $P(a < X \le b)$, etc.

Examples: Consider example (a) again.
(1) $P(X = 0) = P(TTT) = \frac18$.
(2) $P(X \in [2, 3]) = P(\{\text{tossed at least two heads}\}) = \frac12$.
(3) $P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = \frac78$.

Discrete r.v.s (4.2)

Definition. A r.v. $X$ is called discrete if $X$ takes on a finite or countable number of values $x_1, x_2, \dots$.

Examples (a), (b) and (c) above describe discrete r.v.s, but the other examples do not.

The probability mass function

Definition (pmf). Let $X$ be a discrete r.v. The probability mass function (pmf) of $X$ is defined as
$$p_X(x) = P(X = x), \quad x \in \mathbb{R}.$$

Properties: Suppose the range of $X$ is $I = \{x_1, x_2, \dots\}$. Then the following holds: if $A \subseteq I$ then
$$P(X \in A) = P\left(\bigcup_{x_i \in A} \{X = x_i\}\right) = \sum_{x_i \in A} p_X(x_i).$$
In particular: $\sum_{x_i \in I} p_X(x_i) = 1$.

Example. Back to example (a): $X$ = # heads tossed. The pmf of $X$ is:
$$p_X(0) = \frac18, \quad p_X(1) = \frac38, \quad p_X(2) = \frac38, \quad p_X(3) = \frac18.$$

Exercise. $n$ tosses of a fair coin; $X$ = # heads in the $n$ coin flips. Find the pmf of $X$.

Solution. We have $p(k) = \binom nk \left(\frac12\right)^n$ for $k = 0, 1, 2, \dots, n$. (Here $\binom nk$ is the number of ways to choose which $k$ tosses out of $n$ are heads, and $1/2^n$ is the probability of each such outcome.)

Cumulative distribution function (CDF)

Definition. The cdf (cumulative distribution function) of a r.v. $X$ is the function $F : \mathbb{R} \to [0, 1]$ defined by
$$F(x) = P(X \le x) = P(\{s : X(s) \le x\}).$$

Properties: The following holds.
(1) $F$ is non-decreasing.
(2) $\lim_{x \to -\infty} F(x) = 0$, $\lim_{x \to \infty} F(x) = 1$.

Example. Again, 3 coin flips (example (a) above). The CDF is:
$$F(x) = \begin{cases} 0, & x < 0 \\ 1/8, & 0 \le x < 1 \\ 1/2, & 1 \le x < 2 \\ 7/8, & 2 \le x < 3 \\ 1, & 3 \le x. \end{cases}$$

Connection between cdf and pmf

Given one of the functions (pmf or cdf), one can calculate the other function:

pmf → cdf: If one knows the pmf of a r.v., then the cdf can be recovered by:
$$F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i) = \sum_{x_i \le x} p_X(x_i).$$

cdf → pmf: Similarly, given the cdf of $X$, one can recover the pmf of $X$ by observing that each point $x_i$ where $F$ jumps (a discontinuity point) is a point for which $p_X(x_i) \ne 0$. The value $p_X(x_i)$ is exactly the height of the jump.

Exercise. Let $X$ be a r.v. with cdf
$$F(x) = \begin{cases} 0, & x < -1 \\ \frac13, & -1 \le x < 1 \\ 1, & x \ge 1. \end{cases}$$
Find the pmf of $X$. Can you come up with an underlying story that explains the sample space?

Solution. There are two jumps, one at $-1$ and one at $1$. The jump at $-1$ is $F(-1) - F(-1^-) = \frac13$, and the jump at $1$ is $F(1) - F(1^-) = \frac23$. We thus have $p_X(-1) = 1/3$, $p_X(1) = 2/3$, and $p_X(x) = 0$ for $x \ne -1, 1$. A suitable story for this r.v. can be: tossing a biased coin.

Remark. The pmf and the cdf of $X$ are also referred to as the distribution of $X$.

Exercise (if time permits). The probability mass function of a random variable $X$ is given by $p(i) = c\,\frac{\lambda^i}{i!}$, $i = 0, 1, 2, \dots$, where $\lambda$ is some positive value. Find (a) $P(X = 0)$ and (b) $P(X > 2)$.

Solution. We have $1 = \sum_{i=0}^\infty p(i) = c \sum_{i=0}^\infty \frac{\lambda^i}{i!} = c\,e^\lambda$, and hence $c = e^{-\lambda}$. Therefore,
$$P(X = 0) = p(0) = e^{-\lambda}$$
and
$$P(X > 2) = 1 - P(X \le 2) = 1 - P(X = 0) - P(X = 1) - P(X = 2) = 1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}2\right).$$
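The pmf/cdf correspondence is mechanical; a minimal Python sketch for the 3-coin-flip example:

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x):
    # F(x) = sum of the pmf over values <= x
    return sum(p for xi, p in pmf.items() if xi <= x)

print([cdf(x) for x in range(4)])  # [1/8, 1/2, 7/8, 1]

# Recover the pmf from the jumps of the cdf
recovered = {x: cdf(x) - cdf(x - 1) for x in range(4)}
print(recovered == pmf)  # True
```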

LECTURE 12

Review of practice problems on Ch. 1-3 before Exam 1.


LECTURE 13

Expected Value (4.3)

We start with an example, followed by the definition.

Example. The number of apartments in Ann Arbor, by number of bedrooms, is given in the following table:

Bedrooms:   0       1       2       3       4
Apts:       1,200   2,572   3,307   3,198   1,806

The average number of bedrooms in AA is thus:
$$\frac{\text{total \# of bedrooms}}{\text{total \# of apts}} = \frac{0 \cdot 1200 + 1 \cdot 2572 + 2 \cdot 3307 + 3 \cdot 3198 + 4 \cdot 1806}{12{,}083} = \frac{26{,}004}{12{,}083} \approx 2.15.$$

Probabilistic view: The experiment: select an apartment at random. Define the r.v. $X$ = the number of bedrooms in the apartment. The pmf of $X$ is
$$p_X(0) = \frac{1200}{12{,}083}, \quad p_X(1) = \frac{2572}{12{,}083}, \quad p_X(2) = \frac{3307}{12{,}083}, \quad p_X(3) = \frac{3198}{12{,}083}, \quad p_X(4) = \frac{1806}{12{,}083}.$$
The average above can be written in terms of $p_X$ as:
$$2.15 \approx 0\,p_X(0) + 1\,p_X(1) + 2\,p_X(2) + 3\,p_X(3) + 4\,p_X(4).$$
This leads to the following definition:

Definition. Let $X$ be a discrete r.v. with range $I = \{x_1, x_2, \dots\}$. The expected value of $X$ is then:
$$E(X) = \sum_{x \in I} x\,p(x) = \sum_k x_k\,p(x_k).$$

Remark. If the range of $X$ is $\{x_1, x_2, \dots, x_N\}$, and all the values of $X$ are equally likely, then $p(x_i) = \frac1N$ and
$$E(X) = \frac1N \sum_{i=1}^N x_i = \text{the usual average}.$$
In general, $E(X)$ is the weighted average which puts more weight on likelier values.

Example. Toss a fair coin 3 times. Let $X$ = # heads. Then
$$E(X) = 0 \cdot \frac18 + 1 \cdot \frac38 + 2 \cdot \frac38 + 3 \cdot \frac18 = 1.5.$$

Exercise (Lottery). To win the lottery one needs to guess 6 out of 49 numbers (order irrelevant). Here's a table describing the prizes:

# correct    Prize
6            $1,200,000
5            $800
4            $35
3 or less    $0

Is it advantageous to buy a 14¢ lottery ticket?

Solution. We compute the prize money's expected value. First, we find the pmf:
$$p(1{,}200{,}000) = \frac1{\binom{49}6}, \quad p(800) = \frac{\binom65\binom{43}1}{\binom{49}6}, \quad p(35) = \frac{\binom64\binom{43}2}{\binom{49}6},$$
$$p(0) = 1 - p(35) - p(800) - p(1{,}200{,}000).$$
Now, we have
$$E(X) = 1{,}200{,}000\,p(1{,}200{,}000) + 800\,p(800) + 35\,p(35) + 0\,p(0) \approx \$0.13.$$
Therefore, on average, we lose a cent! (Note that this does not mean that people are not willing to take the risk of losing a cent or two, in the hopes of winning $1 million...)

Properties of Expectation

Expectation has the following basic and very useful properties:
(1) $E(X_1 + X_2 + \dots + X_n) = E(X_1) + \dots + E(X_n)$ for any r.v.s $X_1, \dots, X_n$ (we will prove it in the future).
(2) $E(aX) = a\,E(X)$ for a r.v. $X$, and $a \in \mathbb{R}$.
(3) $E(b) = b$ for a constant r.v. $X \equiv b$, $b \in \mathbb{R}$.
(4) $E(X + c) = E(X) + c$ for any r.v. $X$ and constant $c \in \mathbb{R}$.
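Returning to the lottery exercise, the expected prize is quick to compute exactly; a minimal Python sketch:

```python
from math import comb

c = comb(49, 6)
pmf = {
    1_200_000: 1 / c,
    800: comb(6, 5) * comb(43, 1) / c,
    35: comb(6, 4) * comb(43, 2) / c,
}
expected_prize = sum(prize * p for prize, p in pmf.items())
print(expected_prize)  # ~0.134 dollars: a 14-cent ticket loses about a cent on average
```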

Example (Group testing). We need to test the blood of $n$ people for a rare disease (syphilis, in men drafted during WWII); $p = P(\text{positive})$ (assume that a person has the disease independently of other people).

Method 1: test all people individually. This means we always run exactly $n$ tests.

Method 2: divide the $n$ people into groups of $k$ people, mix the blood samples of each group, and test each mixed sample (group testing). If a test is negative, then no more tests are needed for that group (meaning: only 1 test was needed). If a test is positive, we test all $k$ people in the group individually (meaning: $k + 1$ tests were needed).

Question: what is the expected number of tests in Method 2?

Solution. Divide the $n$ people into $n/k$ groups of $k$ (assume that $k$ divides $n$). Define $X = X_1 + X_2 + \dots + X_{n/k}$, where $X_i$ = # of tests needed for group $i$, with possible values 1 and $k + 1$. We have:
$$P(X_i = 1) = P(\text{all } k \text{ people in group } i \text{ do not have the disease}) = (1 - p)^k$$
and
$$P(X_i = k + 1) = 1 - (1 - p)^k.$$
Thus,
$$E(X_i) = 1 \cdot (1 - p)^k + (k + 1)\left(1 - (1 - p)^k\right),$$
and therefore
$$E(X) = \sum_{i=1}^{n/k} E(X_i) = \frac nk \left((1 - p)^k + (k + 1)\left(1 - (1 - p)^k\right)\right).$$
For example, if $n = 100{,}000$ and $p = 10^{-4}$ (on average, 10 are sick), then for $k = 120$ we get $E(X) \approx 2026$ (compare to Method 1, where the number of tests is 100,000!).

Expected value of a function of a r.v. (4.4)

Example. A box contains 11 disks with radii 1, 2, ..., 11 inches. One disk is chosen at random. What is the expected area $A$ of the chosen disk?

We know that the area of a disk is $A = \pi R^2$, where $R$ = the radius of the disk is a r.v. with
$$E(R) = \frac{1 + 2 + \dots + 11}{11} = 6.$$

Note: $E(A) \ne \pi \cdot 6^2$ in². Since $A$ takes the values $\pi \cdot 1^2, \dots, \pi \cdot 11^2$, each with probability $\frac1{11}$, it follows that
$$E(A) = \frac{\pi \cdot 1^2 + \pi \cdot 2^2 + \dots + \pi \cdot 11^2}{11} = 46\pi \text{ in}^2.$$
In general, the following formula holds:

Proposition. Let $X$ be a discrete r.v. with pmf $p(x)$. Then for any function $g(\cdot)$,
$$E[g(X)] = \sum_x g(x)\,p(x).$$
(Again, note that in general we don't have $E(g(X)) = g(E(X))$.)

Proof. The function $g(X)$ is a r.v. whose values are the distinct numbers among $g(x_1), g(x_2), \dots$ (where $x_1, x_2, \dots$ are the possible values of $X$). If $g$ is one-to-one, the claim is immediate, since then $P(g(X) = g(x_k)) = p(x_k)$. In general, grouping the values of $X$ according to the value of $g$,
$$E[g(X)] = \sum_k g(x_k)\,P(g(X) = g(x_k)) = \sum_k g(x_k) \sum_{j : g(x_j) = g(x_k)} P(X = x_j) = \sum_j g(x_j)\,p(x_j),$$
where the outer sum runs over the distinct values $g(x_k)$, and each $x_j$ appears in exactly one of the inner sums.
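A minimal Python sketch for the disk example, illustrating that $E[g(X)] \ne g(E(X))$ in general:

```python
from fractions import Fraction
import math

radii = range(1, 12)                           # 11 equally likely disks
ER = Fraction(sum(radii), 11)                  # E(R) = 6
ER2 = Fraction(sum(r * r for r in radii), 11)  # E(R^2) = 46

print(math.pi * ER2)    # E(A) = 46*pi ~ 144.5
print(math.pi * ER**2)  # pi*E(R)^2 = 36*pi ~ 113.1: a different number
```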

LECTURE 14

Review of Exam #1.


LECTURE 15

Variance (4.5)

Example. Compare the r.v.s
$$X_a = \begin{cases} +a & \text{with prob. } \frac12 \\ -a & \text{with prob. } \frac12 \end{cases}$$
for different values of $a \ge 0$ (draw a picture). For very small $a$ (even $a = 0$), the values are always very close to $E(X_a) = 0$. However, for large values of $a$, they are always very far from $E(X_a)$.

The following notion of the variance (and standard deviation) of a r.v. $X$ gives an indication of how the values of $X$ are spread around $E(X)$, on average:

Definition. Let $X$ be a r.v. with mean (= expected value) $\mu$. The variance of $X$ is
$$\mathrm{Var}(X) = E\left[(X - \mu)^2\right].$$
The standard deviation of $X$ is $\sigma_X = \sqrt{\mathrm{Var}(X)}$.

Remark. Note that $\sigma_X$ has the same units as $X$ while $\mathrm{Var}(X)$ does not. Typically, one expects to get a result in the range $E(X) \pm \sigma_X$.

(Optional) Another view of $\mathrm{Var}(X)$: suppose one wants to replace $X$ by a best constant (non-random) number $y$. One way to measure what "best" means is to minimize the expected squared difference (squared, to avoid cancellations with negative signs):
$$d(y) = E\left((X - y)^2\right).$$
By differentiating with respect to $y$, one can show that the minimizer is $y = E(X)$, and $\mathrm{Var}(X) = d(E(X))$ is the minimum expected squared difference possible.

Often (but not always; see the above example!), it will be more convenient to use the following formula:

Proposition. We have
$$\mathrm{Var}(X) = E\left(X^2 - 2\mu X + \mu^2\right) = E(X^2) - 2\mu\,E(X) + \mu^2 = E(X^2) - E(X)^2.$$

Example. Compare the following two investment options:
Stock A: 3% return with prob. 0.8; 2% loss with prob. 0.2.
Stock B: 5% return with prob. 0.6; 2% loss with prob. 0.4.
Expected profit in %:
$$E(X_A) = 3 \cdot 0.8 - 2 \cdot 0.2 = 2\% \quad \text{and} \quad E(X_B) = 5 \cdot 0.6 - 2 \cdot 0.4 = 2.2\%.$$
However, let us also compare the variances (or the risk, in this case):
$$E(X_A^2) = 9 \cdot 0.8 + 4 \cdot 0.2 = 8 \quad \text{and} \quad E(X_B^2) = 25 \cdot 0.6 + 4 \cdot 0.4 = 16.6.$$
Therefore,
$$\mathrm{Var}(X_A) = E(X_A^2) - E(X_A)^2 = 8 - 4 = 4, \quad \sigma_{X_A} = \sqrt{\mathrm{Var}(X_A)} = 2\%,$$
and
$$\mathrm{Var}(X_B) = 16.6 - 4.84 = 11.76, \quad \sigma_{X_B} = \sqrt{\mathrm{Var}(X_B)} \approx 3.4\%.$$
So, typically, the profit from A is 2 ± 2% and from B, 2.2 ± 3.4%. Therefore, stock A seems safer.

Properties of variance

Proposition. Let $X$ be a r.v. and let $a, b \in \mathbb{R}$. Then
(1) $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$;
(2) $\mathrm{Var}(X + b) = \mathrm{Var}(X)$;
combined: $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$.

Proof. (1) We have
$$\mathrm{Var}(aX) = E\left((aX - E(aX))^2\right) = E\left((aX - a\,E(X))^2\right) = E\left(a^2(X - E(X))^2\right) = a^2\,E\left((X - E(X))^2\right) = a^2\,\mathrm{Var}(X).$$
(2) A shift by a constant does not change how the values are spread around the mean. Indeed,
$$\mathrm{Var}(X + b) = E\left((X + b - E(X + b))^2\right) = E\left((X + b - E(X) - b)^2\right) = E\left((X - E(X))^2\right) = \mathrm{Var}(X).$$

Exercise (The matching problem). $N$ assignments are returned to $N$ students at random.
(1) How many students, on average, will get their own assignment?
(2) What is the standard deviation of the number of students who will get their own HW back?

Solution. (1) Method of indicators: Let $X$ = # of students who got their own HW back. We write $X = \sum_{i=1}^N X_i$, where
$$X_i = \begin{cases} 1, & \text{student } i \text{ gets his/her own HW} \\ 0, & \text{otherwise}. \end{cases}$$
Now, we use the additivity of expected value to get
$$E(X) = \sum_{i=1}^N E(X_i).$$
For each $i$ we have
$$E(X_i) = 1 \cdot P(X_i = 1) + 0 \cdot P(X_i = 0) = P(X_i = 1) = \frac1N.$$
Therefore, $E(X) = N \cdot \frac1N = 1$.

(2) We need to calculate the standard deviation of $X$. We have
$$E(X^2) = E\left(\left(\sum_{i=1}^N X_i\right)^2\right) = E\left(\sum_{i=1}^N X_i^2 + \sum_{i \ne j} X_iX_j\right) = \sum_{i=1}^N E(X_i^2) + \sum_{i \ne j} E(X_iX_j).$$
Note that $E(X_i^2) = E(X_i) = \frac1N$ (the square of an indicator is itself), and
$$E(X_iX_j) = 1 \cdot P(X_iX_j = 1) + 0 \cdot P(X_iX_j = 0) = P(X_i = 1 \text{ and } X_j = 1)$$
$$= P(\text{both students } i \text{ and } j \text{ get their own HW back}) = \frac1N \cdot \frac1{N-1}.$$
Therefore, since there are $N(N-1)$ ordered pairs $i \ne j$,
$$E(X^2) = N \cdot \frac1N + N(N-1) \cdot \frac1{N(N-1)} = 2,$$
and hence
$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = 2 - 1 = 1, \quad \sigma_X = \sqrt{\mathrm{Var}(X)} = 1.$$
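A Monte Carlo check that both the mean and the standard deviation of the number of matches are 1 (the value $N = 10$, the seed, and the trial count are arbitrary):

```python
import random
import statistics

random.seed(0)
N, trials = 10, 100_000
counts = []
for _ in range(trials):
    hw = list(range(N))
    random.shuffle(hw)  # random return of the assignments
    counts.append(sum(hw[i] == i for i in range(N)))
print(statistics.mean(counts), statistics.stdev(counts))  # both ~1
```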

Exercise (Optional, if time permits). A fair die is rolled, and you win (or lose) money according to the following rule: if $X$ is the number on which the die lands, then you win/lose $2X - 3$ dollars. Find the expected value and standard deviation of your winnings.

Solution. Since $E(X) = \frac16(1 + 2 + \dots + 6) = \frac72$, the expected winnings are:
$$E(2X - 3) = 2\,E(X) - 3 = 7 - 3 = 4.$$
For the standard deviation, we first calculate
$$E(X^2) = \frac16\left(1^2 + 2^2 + \dots + 6^2\right) = \frac{91}6.$$
Hence,
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{91}6 - \left(\frac72\right)^2 = \frac{35}{12}.$$
The variance of our winnings is $\mathrm{Var}(2X - 3) = 4\,\mathrm{Var}(X) = \frac{35}3$, and the standard deviation is $\sigma_{2X-3} = \sqrt{\frac{35}3} \approx 3.4$.

LECTURE 16

Bernoulli and Binomial distributions (4.6)

Bernoulli r.v.s

Consider an experiment with two outcomes, e.g. "success" and "failure", where $P(\text{success}) = p$ and $P(\text{failure}) = 1 - p$. Let $X$ be the indicator of success:
$$X = \begin{cases} 1, & \text{with prob. } p \\ 0, & \text{with prob. } 1 - p. \end{cases}$$
Then $X$ is called a Bernoulli r.v. with parameter $p$. Formally:

Definition. A r.v. $X$ is said to be a Bernoulli r.v. with parameter $p$ if its pmf is given by $p_X(1) = p$, $p_X(0) = 1 - p$. Notation: $X \sim \mathrm{Bern}(p)$.

The expectation of a Bernoulli r.v.: $E(X) = 1 \cdot p + 0 \cdot (1 - p) = p$.
The variance of a Bernoulli r.v.: $\mathrm{Var}(X) = E(X^2) - E(X)^2 = p - p^2 = p(1 - p)$.

Binomial r.v.s

Now consider an experiment consisting of $n$ independent Bernoulli trials, each having success with probability $p$ and failure with probability $1 - p$, independently of other trials. Let the r.v. $X$ be the number of successes. The pmf of $X$ is then:
$$p_X(k) = P(X = k) = P(k \text{ successes in } n \text{ trials}) = \binom nk p^k (1 - p)^{n-k},$$
where $\binom nk$ stands for the # of possible arrangements of the successes and failures, and $p^k(1-p)^{n-k}$ stands for the probability of each arrangement. Formally:

Definition. A r.v. $X$ is said to be a binomial r.v. with parameters $n, p$ if
$$p_X(k) = \binom nk p^k (1 - p)^{n-k}, \quad k = 0, 1, \dots, n.$$
Notation: $X \sim \mathrm{Bin}(n, p)$.

We have: $E(X) = np$, $\mathrm{Var}(X) = np(1 - p)$ (see the exercise below).

Example. Jack hits his target 70% of the time. What is the probability that he hits his target in at least 8 of 10 shots?

Solution. $X$ = # of hits. Assuming that Jack's shots are independent, $X \sim \mathrm{Bin}(10, 0.7)$. Then
$$P(X \ge 8) = p_X(8) + p_X(9) + p_X(10) = \binom{10}8 0.7^8\,0.3^2 + \binom{10}9 0.7^9\,0.3 + \binom{10}{10} 0.7^{10} \approx 0.38.$$

Exercise. Compute the expectation and variance of a binomial r.v.

Solution. We present two solutions:
1. Direct computation: by definition, we have
$$E(X) = \sum_{k=0}^n k\,p_X(k) = \sum_{k=1}^n k\,p_X(k) = \sum_{k=1}^n k \binom nk p^k (1 - p)^{n-k}.$$
We use the identity $k\binom nk = n\binom{n-1}{k-1}$ to obtain
$$E(X) = n \sum_{k=1}^n \binom{n-1}{k-1} p^k (1 - p)^{n-k} = np \sum_{k=1}^n \binom{n-1}{k-1} p^{k-1} (1 - p)^{(n-1)-(k-1)}.$$
We next change variables by setting $j = k - 1$. Hence,
$$E(X) = np \sum_{j=0}^{n-1} \binom{n-1}j p^j (1 - p)^{n-1-j}.$$
We claim that $\sum_{j=0}^{n-1} \binom{n-1}j p^j (1 - p)^{n-1-j} = 1$. Indeed, note that if $Y \sim \mathrm{Bin}(n-1, p)$ then the pmf of $Y$ is $p_Y(j) = \binom{n-1}j p^j (1 - p)^{n-1-j}$, for $j = 0, \dots, n-1$. Therefore,
$$1 = \sum_{j=0}^{n-1} p_Y(j) = \sum_{j=0}^{n-1} \binom{n-1}j p^j (1 - p)^{n-1-j},$$
which implies that $E(X) = np$.

Remark. The same method can be used to find $E(X^2)$ and $\mathrm{Var}(X)$ (see Ross).

2. Method of indicators: We write $X = \sum_{i=1}^n X_i$, where
$$X_i = \begin{cases} 1, & \text{success at trial } i \\ 0, & \text{failure at trial } i. \end{cases}$$
Note that $X_i \sim \mathrm{Bern}(p)$. Therefore, $E(X) = \sum_{i=1}^n E(X_i) = np$. Moreover,

$$E(X^2) = E\left(\left(\sum_{i=1}^n X_i\right)^2\right) = E\left(\sum_{i=1}^n X_i^2 + \sum_{i \ne j} X_iX_j\right) = \sum_{i=1}^n E(X_i^2) + \sum_{i \ne j} E(X_iX_j),$$
where $E(X_i^2) = E(X_i) = p$ and, by independence,
$$E(X_iX_j) = 1 \cdot P(X_iX_j = 1) + 0 \cdot P(X_iX_j = 0) = P(X_i = 1 \text{ and } X_j = 1) = p^2.$$
Hence $E(X^2) = np + n(n-1)p^2$, which implies that
$$\mathrm{Var}(X) = np + n(n-1)p^2 - (np)^2 = np(1 - p).$$

Exercise (overbooking). Flight A: 10 tickets sold, the plane has 9 seats. Flight B: 20 tickets sold, the plane has 18 seats. Passengers show up with probability 0.9 each, independently. Which flight is more likely to get overbooked?

Solution. We have:
$X_A$ = # passengers who show up for flight A; $X_B$ = # passengers who show up for flight B.
Note that $X_A \sim \mathrm{Bin}(10, 0.9)$ and $X_B \sim \mathrm{Bin}(20, 0.9)$, and hence
$$P(\text{A overbooked}) = P(X_A = 10) = 0.9^{10} \approx 0.35,$$
$$P(\text{B overbooked}) = P(X_B = 19) + P(X_B = 20) = \binom{20}{19} 0.9^{19}\,0.1 + 0.9^{20} \approx 0.39.$$
Therefore, B is more likely to get overbooked.

Exercise. Let $X \sim \mathrm{Bin}(n, p)$, where $p$ is a parameter. For what value of $p$ does $X$ have maximum variance?

Solution. We know that $\mathrm{Var}(X) = np(1 - p)$. We look for the critical points of the variance:
$$\frac{d}{dp}\big(np(1 - p)\big) = n - 2np = 0,$$
and find that the maximum of $np(1 - p)$ is attained at $p = \frac12$.
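Returning to the overbooking exercise, the two probabilities are one-liners with the binomial pmf; a minimal Python sketch:

```python
from math import comb

def binom_pmf(n: int, p: float, k: int) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.9
print(binom_pmf(10, p, 10))                         # ~0.349: flight A overbooked
print(binom_pmf(20, p, 19) + binom_pmf(20, p, 20))  # ~0.392: flight B overbooked
```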


LECTURE 17

Poisson distribution (4.7)

Recall: $X \sim \mathrm{Bin}(n, p)$, where $X$ is the number of successes in $n$ independent trials and $p$ is the probability of success at each trial. The pmf of $X$ is: $p(k) = \binom nk p^k (1 - p)^{n-k}$, $k = 0, 1, \dots, n$. Also: $E(X) = np$, $\mathrm{Var}(X) = np(1 - p)$.

Example. In a class of 40 students, on average 2 students are sick. What is the probability that 4 students are sick?

Solution. $X \sim \mathrm{Bin}(40, p)$, and $E(X) = 40p = 2 \implies p = 1/20$. Therefore,
$$p(4) = \binom{40}4 \left(\frac1{20}\right)^4 \left(\frac{19}{20}\right)^{36} \approx 0.09.$$

An approximation to the binomial distribution

The pmf $p(k)$ of $\mathrm{Bin}(n, p)$ is sometimes difficult to compute, especially for large $n$. We will approximate the pmf of $\mathrm{Bin}(n, p)$ under the following assumptions:
(1) $n$ is large ($n \to \infty$);
(2) $p$ is small, or successes are rare ($p \to 0$);
(3) $np$ is of moderate size: $p = \frac\lambda n$ for a constant $\lambda$, or $np \to \lambda$.

Let us rewrite the pmf of $X$:
$$p(k) = \frac{n!}{k!\,(n-k)!} \left(\frac\lambda n\right)^k \left(1 - \frac\lambda n\right)^{n-k} = \frac{\lambda^k}{k!} \cdot \frac{n(n-1)(n-2)\cdots(n-k+1)}{n^k} \cdot \left(1 - \frac\lambda n\right)^n \left(1 - \frac\lambda n\right)^{-k}.$$
Note that under the above assumptions, as $n \to \infty$ we have, for any fixed $k$:
$$\frac{n(n-1)(n-2)\cdots(n-k+1)}{n^k} \to 1, \quad \left(1 - \frac\lambda n\right)^n \to e^{-\lambda}, \quad \text{and} \quad \left(1 - \frac\lambda n\right)^{-k} \to 1.$$

Combining the above, we get
$$p(k) \approx \frac{\lambda^k}{k!} e^{-\lambda}.$$
This function defines the Poisson distribution:

Definition (Poisson distribution). A r.v. $X$ has the Poisson distribution with parameter $\lambda > 0$ if the pmf of $X$ is given by:
$$p(k) = \frac{\lambda^k}{k!} e^{-\lambda}, \quad k = 0, 1, 2, 3, \dots$$
Notation: $X \sim \mathrm{Pois}(\lambda)$.

Remark. The Poisson distribution was introduced by Poisson in 1837, along with applications in criminal lawsuits (jury decisions). However, it had not attracted much attention until a little book by Bortkiewicz was published in 1898, which included a strange example: deaths by horse kicks in the Prussian army. Using the Poisson distribution, such rare events were shown to be quite regular and predictable, which helped to popularize this distribution.

Example. Let us use the Poisson approximation for our previous example (sick students): here $X \sim \mathrm{Bin}(40, 1/20) \approx \mathrm{Pois}(2)$, and hence
$$P(X = 4) \approx \frac{2^4}{4!} e^{-2} \approx 0.09$$
(compare to the exact answer above).

Remark. One may use the Poisson approximation to the binomial distribution, that is, $\mathrm{Bin}(n, p) \approx \mathrm{Pois}(np)$, whenever the number of independent trials is large, $p$ is small, and $\lambda = np$ is of moderate size.

As a consequence of the fact that a Poisson r.v. is the limit of binomial r.v.s, we get:

Corollary. For $X \sim \mathrm{Pois}(\lambda)$, we have that $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$. Indeed, the idea is that the pmf of $X$ is the limit of the pmfs of the binomial r.v.s $X_n \sim \mathrm{Bin}(n, \frac\lambda n)$, and hence one expects to have $E(X) \approx np = \lambda$ and $\mathrm{Var}(X) \approx np(1 - p) = \lambda(1 - p) \to \lambda$ (since $p \to 0$). One can also verify these facts directly (try!).

Remark. We know we must have $1 = \sum_{k=0}^\infty \frac{\lambda^k}{k!} e^{-\lambda}$ (probability axiom). This is equivalent to
$$e^\lambda = \sum_{k=0}^\infty \frac{\lambda^k}{k!},$$
which is a known identity (the Taylor series of the exponential function). This is a probabilistic proof of this identity.
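A minimal Python sketch comparing the exact $\mathrm{Bin}(40, 1/20)$ pmf with its $\mathrm{Pois}(2)$ approximation:

```python
from math import comb, exp, factorial

n, p = 40, 1 / 20
lam = n * p  # Poisson parameter = 2
for k in range(6):
    binom = comb(n, k) * p**k * (1 - p) ** (n - k)
    pois = lam**k / factorial(k) * exp(-lam)
    print(k, round(binom, 4), round(pois, 4))  # e.g. k=4: 0.0901 vs 0.0902
```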

Other Poisson distribution models

Besides approximating the binomial distribution, the Poisson distribution is an appropriate model in various situations, and has numerous applications. For example:

(1) The number of accidents on a section of a highway on a given day.
(2) The number of floods at a river over a century.
(3) The number of $\alpha$ particles emitted from a radioactive source during a fixed period of time.

Poisson Process. Often, we consider the number $N(t)$ of occurrences of an event in an interval, say $[0, t]$, where the average number of occurrences is proportional to the length of the interval, that is, $E(N(t)) = \lambda t$. This means that, for each $t$, $N(t) \sim Pois(\lambda t)$. The family of r.v.s $\{N(t)\}$ is called a Poisson process with rate $\lambda$.

Exercise. Births of twins in a certain city are described by a Poisson process with the constant rate of 1.2 births per year.
(a) What is the probability that more than two twin births will occur during the year 2017?
(b) What is the probability that no twin births will occur during the next five years?
(c) If we learn that there was at least one birth of twins during the year 2015, what is the conditional probability that there were no twin births during the first half of that year?

Solution. (a) Let $X$ be the number of twin births during 2017. Then $X \sim Pois(1.2)$, and hence

$P(X > 2) = 1 - P(X \le 2) = 1 - e^{-1.2}\left(1 + 1.2 + \frac{1.2^2}{2}\right) \approx 0.1205.$

(b) Let $Y$ be the number of twin births in the next 5 years. Then $Y \sim Pois(6)$, and hence $P(Y = 0) = e^{-6}$.

(c) Let $N_{[0,\frac{1}{2}]}$, $N_{[\frac{1}{2},1]}$, $N_{[0,1]}$ respectively be the number of twin births during the 1st half of 2015, the 2nd half of 2015, and the entire year of 2015. Then

$P\big(N_{[0,\frac{1}{2}]} = 0 \mid N_{[0,1]} \ge 1\big) = \frac{P\big(N_{[0,\frac{1}{2}]} = 0 \text{ and } N_{[0,1]} \ge 1\big)}{P\big(N_{[0,1]} \ge 1\big)} = \frac{P\big(N_{[0,\frac{1}{2}]} = 0\big)\, P\big(N_{[\frac{1}{2},1]} \ge 1\big)}{P\big(N_{[0,1]} \ge 1\big)} = \frac{e^{-0.6}\,\big(1 - e^{-0.6}\big)}{1 - e^{-1.2}}.$
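The three answers can be evaluated with a short Python sketch (assuming scipy; the snippet just plugs in the formulas above):

```python
# Numerical values for the twin-births exercise (rate 1.2 births/year).
import math
from scipy.stats import poisson

a = 1 - poisson.cdf(2, mu=1.2)  # (a) P(X > 2) for X ~ Pois(1.2), ~0.1205
b = math.exp(-6)                # (b) P(Y = 0) for Y ~ Pois(6), ~0.0025
# (c) conditional probability from the last display
c = math.exp(-0.6) * (1 - math.exp(-0.6)) / (1 - math.exp(-1.2))
print(a, b, c)
```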

Example. The Poisson approximation to the binomial r.v. is very good. In fact, it remains pretty good even when the trials are not independent, provided that their dependence is weak. For example, recall the problem in which $n$ people randomly select $n$ hats (each hat belongs to exactly one person), and let $X$ be the number of people who select their own hat. Then $X = X_1 + \dots + X_n$, where $X_i$ is the indicator of the event that person $i$ selects his own hat. Clearly, we have that $P(X_i = 1) = \frac{1}{n}$ and $P(X_i = 1 \mid X_j = 1) = \frac{1}{n-1}$ for $i \neq j$. Thus, we see that $X_1, \dots, X_n$ are not independent, but for large $n$ their dependence appears to be weak. Let us estimate the probability that at least one person selects his own hat: we approximately have that $X \approx Pois(1)$, and hence $P(X > 0) \approx 1 - e^{-1}$. This is exactly the same result we previously got as $n \to \infty$ (see the Challenge problem at the end of Exercise 4.9 from Lecture 4).
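One can also check this dependent-trials approximation by simulation; the following Monte Carlo sketch (assuming numpy) estimates $P(X > 0)$ for random hat assignments:

```python
# Monte Carlo estimate of P(at least one person gets their own hat).
import numpy as np

rng = np.random.default_rng(0)
n, trials = 50, 100_000
hits = 0
for _ in range(trials):
    perm = rng.permutation(n)         # random hat assignment
    if np.any(perm == np.arange(n)):  # at least one fixed point
        hits += 1
print(hits / trials, 1 - np.exp(-1))  # both ~0.632
```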

LECTURE 18

The Geometric r.v. (4.8)

Consider an infinite sequence of independent trials, with probability $p$ of success at each trial. Let

$X$ = the trial number of the first success.

The pmf of $X$ is

$p_X(k) = P(X = k) = (1-p)^{k-1} p, \quad k = 1, 2, \dots,$

where $(1-p)^{k-1}$ is the probability that the first $k-1$ trials are failures, and $p$ is the probability that the $k$th trial is a success.

Definition. A r.v. $X$ is called a geometric r.v. with parameter $p$ if the pmf of $X$ is $p_X(k) = (1-p)^{k-1} p$, where $k = 1, 2, \dots$. Notation: $X \sim G(p)$.

Moments: $E(X) = \frac{1}{p}$ (see below), $Var(X) = \frac{1-p}{p^2}$ (see Ross). In the future, we will compute these moments using the notion of conditional expectation.

Remark. We have $\sum_{k=1}^{\infty} p_X(k) = \sum_{k=1}^{\infty} (1-p)^{k-1} p = 1$, which is equivalent to $\sum_{k=1}^{\infty} (1-p)^{k-1} = \sum_{k=0}^{\infty} (1-p)^k = \frac{1}{p}$ (the well-known geometric series).

Exercise. Show that the expectation of $X \sim G(p)$ is $E(X) = \frac{1}{p}$.

Solution. By definition,

$E(X) = \sum_{k=1}^{\infty} k (1-p)^{k-1} p = \sum_{k=1}^{\infty} (k-1)(1-p)^{k-1} p + \sum_{k=1}^{\infty} (1-p)^{k-1} p$

$= [\text{change index: } l = k-1] = \sum_{l=0}^{\infty} l (1-p)^{l} p + 1 = (1-p) \sum_{l=1}^{\infty} l (1-p)^{l-1} p + 1 = (1-p)\, E(X) + 1.$

We thus have that $E(X) = (1-p)E(X) + 1$, which yields $E(X) = \frac{1}{p}$.
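A truncated sum of the defining series gives a quick numerical confirmation (plain Python, no extra libraries; a sketch):

```python
# Check E(X) = 1/p for X ~ G(p) by truncating the defining series.
p = 0.3
approx_E = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 10_000))
print(approx_E, 1 / p)  # both ~3.3333
```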

Exercise. A miner is trapped in a mine with three doors. One door leads outside through a tunnel, in 2 hours. The other two doors are connected by another tunnel, which one can walk through in 3 hours. The miner is equally likely to choose any of the three doors. If he finds himself back in the mine, he again chooses one of the 3 doors at random (the miner is tired and disoriented, which makes him forget which doors he had chosen before). What is the expected time it takes the miner to escape the mine?

Solution. The number of trials it takes until the door leading outside is chosen is $X \sim G(\frac{1}{3})$. Therefore $E(X) = 3$. This means that the average time it takes to reach safety is $3 + 3 + 2 = 8$ hours (three hours for each of the first two choices of a connected door, on average, and another two hours in the tunnel leading outside). More formally, the r.v. representing the time passing until reaching safety is $T = 3(X - 1) + 2$, where $X - 1$ is the # of times one of the connected doors is chosen. Thus, $E(T) = 3(E(X) - 1) + 2 = 3 \cdot 2 + 2 = 8$.

Exercise. Let $X \sim G(p)$. Find the probability that $X$ is even.

Solution. One may use a direct computation of $\sum_{i=1}^{\infty} P(X = 2i)$ (try!). We can also use conditional probability as follows. Let $E$ denote the event that $X$ is even. Then

$P(E) = P(E \mid X = 1)\, p + P(E \mid X > 1)(1-p) = P(E \mid X > 1)(1-p).$

To understand $P(E \mid X > 1)$, let $Y$ be the trial of the first success, counting from the second trial onward. Note that $Y \sim G(p)$. Also note that given $X > 1$, $X$ is even if and only if $Y$ is odd (if we don't know that $X > 1$, then it may happen, for example, that $X = 1$ and $Y = 1$, i.e., the first and second trials are both successes). Thus, we have

$P(E \mid X > 1) = P(Y \text{ is odd}) = 1 - P(Y \text{ is even}).$

Since $Y \sim G(p)$, it follows that $P(Y \text{ is even}) = P(X \text{ is even}) = P(E)$. Therefore, $P(E) = (1 - P(E))(1-p)$, which yields $P(E) = \frac{1-p}{2-p}$.

Other discrete distributions (4.8)

The negative Binomial r.v. Similar to the geometric r.v. This time, we are interested in the trial number of the $r$th success. The pmf:

$p_X(r + k) = \binom{k + r - 1}{k} p^r (1-p)^k, \quad k = 0, 1, \dots$

Explanation: $\binom{k+r-1}{k}$ is the number of possibilities for $k$ failures among the first $k + r - 1$ trials (the $(r+k)$th trial is the $r$th success), and $p^r (1-p)^k$ is the probability of each such possibility.
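The even-probability formula can be verified by the direct summation mentioned at the start of the solution; a short Python sketch:

```python
# Check P(X even) = (1-p)/(2-p) for X ~ G(p): sum P(X = 2i) over i >= 1.
p = 0.4
direct = sum((1 - p) ** (2 * i - 1) * p for i in range(1, 5_000))
print(direct, (1 - p) / (2 - p))  # both 0.375
```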

Notation: $X \sim NB(r, p)$.

Moments: $E(X) = \frac{r}{p}$, $Var(X) = \frac{r(1-p)}{p^2}$ (see Ross for a direct computation). One can write $X = X_1 + \dots + X_r$, where the $X_i \sim G(p)$ are independent. This provides another way to compute $E(X)$ and $Var(X)$.

Hypergeometric r.v. (Definition by example) An urn contains $N$ balls, of which $D$ are green and $N - D$ are red. We take a sample of $n$ balls. Let $X$ be the number of green balls in the sample. The pmf:

$p_X(k) = \frac{\binom{D}{k} \binom{N-D}{n-k}}{\binom{N}{n}},$

where $\binom{D}{k}$ counts $k$ greens out of $D$, $\binom{N-D}{n-k}$ counts $n-k$ reds out of $N-D$, and $\binom{N}{n}$ counts samples of $n$ balls out of $N$.

Notation: $X \sim HG(N, D, n)$.

Range of $X$: We have $X \ge 0$ and $X \ge n - (N - D)$ (sample size minus the # of red balls), so the smallest possible value of $X$ is $\max(0,\, n - N + D)$. We have $X \le D$ (total number of green balls) and $X \le n$ (sample size), so the greatest possible value of $X$ is $\min(D, n)$.

Moments of $X$: $E(X) = n\frac{D}{N}$, $Var(X) = n\,\frac{D}{N}\left(1 - \frac{D}{N}\right)\frac{N-n}{N-1}$.

One can calculate these using the indicator method (try!):

option 1: define $X_j = 1$ if the $j$th ball is green and $0$ otherwise, $j = 1, \dots, n$; then $X = \sum X_j$;
option 2: define $Y_i = 1$ if the $i$th green ball is selected and $0$ otherwise, $i = 1, \dots, D$; then $X = \sum Y_i$.

Remark. Read Ross 4.8 for more details and examples.
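The moment formulas can be cross-checked against scipy's hypergeometric distribution; note that scipy's parameters $(M, n, N)$ correspond to our $(N, D, n)$ (a sketch, assuming scipy):

```python
# Compare the hypergeometric moment formulas with scipy's built-ins.
from scipy.stats import hypergeom

N, D, n = 50, 20, 10
rv = hypergeom(M=N, n=D, N=n)  # scipy's (M, n, N) = our (N, D, n)
print(rv.mean(), n * D / N)                                 # both 4.0
print(rv.var(), n * (D/N) * (1 - D/N) * (N - n) / (N - 1))  # both ~1.959
```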


LECTURE 19

Continuous distributions (5.1)

Example. Discrete r.v.s are not able to model certain situations, such as:

- the lifetime of equipment
- the travel time between points (in a city, or on a highway)
- the payoff of an enterprise, etc.

In a continuous model, $P(X = x) = 0$, so a pmf would not make sense. Instead we have the pdf (probability density function):

Definition. A r.v. $X$ is said to be continuous if there exists a function $f(x) \ge 0$ defined on $\mathbb{R}$ such that

$P(X \in B) = \int_B f(x)\,dx$ for each $B \subseteq \mathbb{R}$.

The function $f$ is called the pdf (probability density function), or just the density, of $X$.

Figure 1. Example of a pdf. Here the area is $P(X \in [a, b])$.

Remark. We have:

- For $B = [a, b]$, $P(a \le X \le b) = \int_a^b f(x)\,dx$.
- For any $a \in \mathbb{R}$, $P(X = a) = \int_a^a f(x)\,dx = 0$.
- By the probability axiom: $\int_{-\infty}^{\infty} f(x)\,dx = P(-\infty < X < \infty) = 1$.

Example. Suppose that $X$ is a continuous random variable whose probability density function is given by

$f(x) = C(4x - 2x^2)$ for $0 < x < 2$, and $f(x) = 0$ otherwise.

(1) What is the value of $C$?
(2) Find $P(X > 1)$.

Solution. (1) We set up the equation

$1 = \int_{-\infty}^{\infty} f(x)\,dx = C \int_0^2 (4x - 2x^2)\,dx = C\left[2x^2 - \frac{2x^3}{3}\right]_0^2 = C \cdot \frac{8}{3}.$

Hence $C = \frac{3}{8}$.

(2) We have

$P(X > 1) = \int_1^{\infty} f(x)\,dx = \frac{3}{8}\int_1^2 (4x - 2x^2)\,dx = \frac{1}{2}.$

The cdf of a continuous distribution

As in the discrete case, we define

$F(a) = P(X \le a) = \int_{-\infty}^a f(x)\,dx.$

Furthermore, we have that

$P(a \le X \le b) = P(X \le b) - P(X < a) = F(b) - F(a).$

Connection to the pdf of $X$: by the fundamental theorem of calculus, $F'(a) = f(a)$.

Summary:

pdf: $f(x) \ge 0$, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, $P(a \le X \le b) = \int_a^b f(x)\,dx$, $f(a) = F'(a)$.
cdf: $F(-\infty) = 0$, $F(\infty) = 1$, $F$ is nondecreasing, $P(a \le X \le b) = F(b) - F(a)$, $F(a) = \int_{-\infty}^a f(x)\,dx$.
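Both answers can be confirmed by numerical integration (a sketch, assuming scipy):

```python
# Verify C = 3/8 and P(X > 1) = 1/2 for f(x) = C(4x - 2x^2) on (0, 2).
from scipy.integrate import quad

g = lambda x: 4 * x - 2 * x ** 2
total, _ = quad(g, 0, 2)                        # 8/3, so C = 3/8
prob, _ = quad(lambda x: (3 / 8) * g(x), 1, 2)  # P(X > 1)
print((3 / 8) * total, prob)                    # 1.0 and 0.5
```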

Expectation & Variance of continuous r.v. (5.2)

The expected value of a discrete r.v. was defined as $E(X) = \sum_x x\, p_X(x)$. In the continuous case, the sum is replaced by an integral, and the pmf is replaced by the pdf:

Definition. For a r.v. $X$ with pdf $f(x)$, the expected value is

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$

Example. Find $E(X)$, where $X$ is a continuous random variable whose probability density function is given by $f(x) = e^{-x}$ for $x > 0$, and $f(x) = 0$ for $x < 0$.

Solution. Integrating by parts ($u = x$, $v' = e^{-x}$), we have that

$E(X) = \int_0^{\infty} x e^{-x}\,dx = \left[-x e^{-x}\right]_0^{\infty} + \int_0^{\infty} e^{-x}\,dx = 1.$

Exercise. The concentration of alcohol in your blood $t$ hours after drinking is $e^{-t}$. The concentration is measured at a random time $X$, whose pdf is given by $f(x) = 1/2$ for $3 < x < 5$, and $f(x) = 0$ otherwise. Find the pdf of $Y = e^{-X}$ (the measured concentration level of alcohol), and the expected concentration of alcohol in the blood.

Solution. Note that

$F_Y(y) = P(Y \le y) = P\big(e^{-X} \le y\big) = P(-X \le \ln y) = P(X \ge -\ln y) = \int_{-\ln y}^{\infty} f(x)\,dx.$

For $e^{-5} < y < e^{-3}$ (i.e., $3 < -\ln y < 5$) this equals $\int_{-\ln y}^{5} \frac{1}{2}\,dx = \frac{1}{2}(5 + \ln y)$; it equals $0$ for $y \le e^{-5}$ and $1$ for $y \ge e^{-3}$. Therefore,

$f_Y(y) = F_Y'(y) = \frac{1}{2y}$ for $e^{-5} < y < e^{-3}$, and $f_Y(y) = 0$ otherwise.
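The expectation in the first example can also be checked numerically (a sketch, assuming numpy and scipy):

```python
# Verify E(X) = 1 for the density f(x) = e^{-x}, x > 0.
import numpy as np
from scipy.integrate import quad

E, _ = quad(lambda x: x * np.exp(-x), 0, np.inf)
print(E)  # 1.0 (up to numerical error)
```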

The expected concentration in blood is thus

$E(Y) = \int_{e^{-5}}^{e^{-3}} y \cdot \frac{1}{2y}\,dy = \frac{1}{2}\left(e^{-3} - e^{-5}\right) \approx 0.0215.$

In fact, there is a way to compute the expected value of $Y$ as a function of $X$ without finding $f_Y$. Namely, we can use the following proposition:

Proposition. (Expectation of a function of a r.v.) For any function $g(x)$,

$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$

Compare to $E[g(X)] = \sum_x g(x)\, p(x)$ for a discrete r.v. $X$.

Exercise. Compute the expected concentration of alcohol in blood from the previous exercise, using the above proposition.

Solution. We have that $Y = e^{-X}$, and hence

$E\big(e^{-X}\big) = \int_{\mathbb{R}} e^{-x} f(x)\,dx = \frac{1}{2}\int_3^5 e^{-x}\,dx = \frac{1}{2}\left(e^{-3} - e^{-5}\right).$

Definition. As in the discrete case, the variance of a continuous r.v. is $Var(X) = E\big((X - E(X))^2\big) = E(X^2) - (E(X))^2$.

Exercise. Find the variance and standard deviation of $Y$ from the previous example.

Solution. We have $Y^2 = e^{-2X}$, and hence

$E(Y^2) = \int_{-\infty}^{\infty} e^{-2x} f(x)\,dx = \frac{1}{2}\int_3^5 e^{-2x}\,dx = \frac{1}{4}\left(e^{-6} - e^{-10}\right).$

Therefore,

$Var(Y) = \frac{1}{4}\left(e^{-6} - e^{-10}\right) - \frac{1}{4}\left(e^{-3} - e^{-5}\right)^2 \approx 0.000145.$

The standard deviation is $\sigma_Y \approx 0.012$.

Remark. The properties we established for $E(X)$ and $Var(X)$ also hold in the continuous case. That is,

$E(aX + b) = aE(X) + b, \quad Var(aX + b) = a^2\, Var(X).$
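Both the change-of-variables computation and the LOTUS computation can be sanity-checked in Python (a sketch, assuming numpy and scipy):

```python
# E(e^{-X}) for X ~ U(3, 5): LOTUS integral vs. a Monte Carlo average.
import numpy as np
from scipy.integrate import quad

lotus, _ = quad(lambda x: np.exp(-x) * 0.5, 3, 5)  # (e^-3 - e^-5)/2
x = np.random.default_rng(1).uniform(3, 5, 1_000_000)
mc = np.exp(-x).mean()
print(lotus, mc)  # both ~0.0215
```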

LECTURE 20

Uniform distributions (5.3)

Definition. We say that a r.v. $X$ has the uniform distribution on $(\alpha, \beta)$ if $X$ has pdf

$f(x) = \frac{1}{\beta - \alpha}$ for $x \in (\alpha, \beta)$, and $f(x) = 0$ otherwise.

Notation: $X \sim U(\alpha, \beta)$. Loosely speaking: $X$ is equally likely to take any value in $[\alpha, \beta]$.

Why $\frac{1}{\beta - \alpha}$? Because one must have $1 = \int_{-\infty}^{\infty} f(x)\,dx = \int_{\alpha}^{\beta} \frac{1}{\beta - \alpha}\,dx$.

Exercise. Suppose $X$ has a uniform distribution on $(3, 8)$.
(1) Find $P(X > 5)$.
(2) Find $P(2.5 \le X \le 7.5)$.

Solution. The pdf of $X$ is $f(x) = \frac{1}{5}$ for $3 < x < 8$ (and $0$ otherwise), and hence:

(1) $P(X > 5) = \int_5^8 \frac{1}{5}\,dx = \frac{3}{5}$.
(2) $P(2.5 \le X \le 7.5) = \int_3^{7.5} \frac{1}{5}\,dx = \frac{4.5}{5} = 0.9$.

The cdf of a uniform r.v. Let $X \sim U(\alpha, \beta)$. By definition, the cdf of $X$ is

$F(x) = P(X \le x) = 0$ for $x < \alpha$; $\ \frac{x - \alpha}{\beta - \alpha}$ for $\alpha < x < \beta$; $\ 1$ for $x > \beta$.
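The same probabilities can be computed with scipy's uniform distribution; note that scipy parametrizes $U(\mathrm{loc}, \mathrm{loc} + \mathrm{scale})$ (a sketch):

```python
# Check the uniform probabilities: U(3, 8) is uniform(loc=3, scale=5) in scipy.
from scipy.stats import uniform

X = uniform(loc=3, scale=5)
print(1 - X.cdf(5))             # P(X > 5) = 3/5
print(X.cdf(7.5) - X.cdf(2.5))  # P(2.5 <= X <= 7.5) = 0.9
```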

In particular, if $X \sim U(0, 1)$, then $F(x) = 0$ for $x \le 0$; $\ x$ for $0 < x < 1$; $\ 1$ for $x \ge 1$.

Example. Let $X \sim U(0, 1)$. Then

$E(X) = \int_0^1 x \cdot 1\,dx = \left[\frac{x^2}{2}\right]_0^1 = \frac{1}{2}, \qquad E(X^2) = \int_{\mathbb{R}} x^2 f(x)\,dx = \int_0^1 x^2\,dx = \frac{1}{3},$

and hence

$Var(X) = E(X^2) - (E(X))^2 = \frac{1}{3} - \left(\frac{1}{2}\right)^2 = \frac{1}{12}.$

Exercise. Let $X \sim U(\alpha, \beta)$. Find $E(X)$ and $Var(X)$.

Solution. Two ways:

Method 1: Let $Y = \frac{X - \alpha}{\beta - \alpha}$. Then $Y \sim U(0, 1)$, and hence

$\frac{1}{2} = E(Y) = E\left(\frac{X - \alpha}{\beta - \alpha}\right) = \frac{E(X) - \alpha}{\beta - \alpha} \implies E(X) = \frac{\beta - \alpha}{2} + \alpha = \frac{\alpha + \beta}{2},$

$\frac{1}{12} = Var(Y) = Var\left(\frac{X - \alpha}{\beta - \alpha}\right) = \frac{Var(X)}{(\beta - \alpha)^2} \implies Var(X) = \frac{(\beta - \alpha)^2}{12}.$

Method 2: Direct computation. We have

$E(X) = \int_{\alpha}^{\beta} \frac{x}{\beta - \alpha}\,dx = \frac{\beta^2 - \alpha^2}{2(\beta - \alpha)} = \frac{\alpha + \beta}{2},$

and hence

$E(X^2) = \int_{\alpha}^{\beta} \frac{x^2}{\beta - \alpha}\,dx = \frac{\beta^3 - \alpha^3}{3(\beta - \alpha)} = \frac{\beta^2 + \alpha\beta + \alpha^2}{3},$

$Var(X) = \frac{\beta^2 + \alpha\beta + \alpha^2}{3} - \left(\frac{\alpha + \beta}{2}\right)^2 = \dots = \frac{(\beta - \alpha)^2}{12}.$

Exercise. (Bus stop problem) Buses arrive at the bus stop every 30 minutes, at 1:00, 1:30, 2:00, etc. Matt arrives at the bus stop at a random time, uniformly distributed between 2 and 3. What is the distribution of his waiting time? Expected wait? Standard deviation?

Solution. Denote Matt's arrival time by $X \sim U(2, 3)$, and his waiting time by $W$. Since $X$ is uniform, we have that $f_X(x) = 1$ for $2 < x < 3$ (and $0$ otherwise). In particular, note that $P(a < X < b) = \int_a^b 1\,dx = b - a$ for any $2 < a < b < 3$.

Next, we find the cdf of Matt's waiting time $W$: if $t \ge \frac{1}{2}$, we clearly have that $F_W(t) = P(W \le t) = 1$ (Matt will wait at most half an hour for a bus). If $0 < t < \frac{1}{2}$, then

$F_W(t) = P(W \le t) = P\left(\tfrac{5}{2} - t \le X \le \tfrac{5}{2}\right) + P(3 - t \le X \le 3) = [2.5 - (2.5 - t)] + [3 - (3 - t)] = 2t.$

Therefore, $F_W(t) = 1$ for $t \ge \frac{1}{2}$; $\ 2t$ for $0 < t < \frac{1}{2}$; $\ 0$ for $t \le 0$, which means that $W \sim U\big(0, \frac{1}{2}\big)$. Thus

$E(W) = \frac{1}{4}, \quad Var(W) = \frac{(1/2)^2}{12} = \frac{1}{48} \implies \sigma_W = \frac{1}{4\sqrt{3}} \text{ hours} \approx 8.7 \text{ min}.$

So the typical waiting time is about $15 \pm 9$ minutes.

Geometric Probability. The following is a multidimensional generalization of $U(\alpha, \beta)$. Given a set $S \subseteq \mathbb{R}^2$ (or $\mathbb{R}^3$, or $\mathbb{R}^n$), the probability that a uniformly distributed random point $X$ in $S$ belongs to a subset $A \subseteq S$ is

$P(X \in A) = \frac{\mathrm{Area}(A)}{\mathrm{Area}(S)} \quad \left(\text{or } \frac{\mathrm{Vol}(A)}{\mathrm{Vol}(S)}\right).$
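A simulation makes the conclusion $W \sim U(0, \frac{1}{2})$ easy to believe (a sketch, assuming numpy):

```python
# Simulate Matt's waiting time: arrival X ~ U(2, 3), buses at 2:00, 2:30, 3:00.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(2, 3, 1_000_000)
w = np.where(x <= 2.5, 2.5 - x, 3.0 - x)  # wait until the next bus
print(w.mean(), w.std())  # ~0.25 h and ~0.144 h, i.e. about 15 and 8.7 minutes
```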

Exercise. A certain city has the shape of a disk with radius $R$. A family builds their home at a random point in the city, uniformly distributed.
(1) What is the probability that the distance from their home to the city center is less than $r$ miles?
(2) Find the expected distance from their home to the city center, and its variance.

Solution. Let $X$ be the distance from their home to the city center. Then

$P(X \le r) = \frac{\pi r^2}{\pi R^2} = \left(\frac{r}{R}\right)^2.$

This means that the pdf of $X$ is $f(r) = \frac{2r}{R^2}$ for $0 < r < R$, and $0$ otherwise. Therefore,

$E(X) = \int_0^R r \cdot \frac{2r}{R^2}\,dr = \frac{2}{3}R,$

and

$Var(X) = \int_0^R r^2 \cdot \frac{2r}{R^2}\,dr - \left(\frac{2}{3}R\right)^2 = \frac{R^2}{2} - \frac{4R^2}{9} = \frac{R^2}{18}.$
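A Monte Carlo check of both answers, sampling uniform points in the disk by rejection (a sketch, assuming numpy):

```python
# Estimate E(X) and Var(X) for the distance X of a uniform point in a disk.
import numpy as np

rng = np.random.default_rng(3)
R = 1.0
pts = rng.uniform(-R, R, size=(2_000_000, 2))
d = np.hypot(pts[:, 0], pts[:, 1])
d = d[d <= R]               # keep only points that fall inside the disk
print(d.mean(), 2 * R / 3)  # both ~0.667
print(d.var(), R ** 2 / 18) # both ~0.0556
```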

LECTURE 21

The normal distribution (5.4)

The standard normal distribution.

Definition. A r.v. $X$ has the standard normal distribution if the pdf of $X$ is

$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad -\infty < x < \infty.$

Notation: $X \sim N(0, 1)$.

Remark. Gauss used normal distributions to model his observations in astronomy. Therefore, normal distributions are often referred to as Gaussian distributions.

Why $\frac{1}{\sqrt{2\pi}}$? As usual, it is the normalizing constant:

Proposition. $f(x)$ above is indeed a pdf, i.e., $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

Proof. We need to show that

$I := \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}.$

Trick: square the integral,

$I^2 = \left(\int_{-\infty}^{\infty} e^{-x^2/2}\,dx\right)\left(\int_{-\infty}^{\infty} e^{-y^2/2}\,dy\right) = \iint_{\mathbb{R}^2} e^{-\frac{x^2 + y^2}{2}}\,dx\,dy.$

We now pass to polar coordinates: $x = r\cos\theta$, $y = r\sin\theta$, $dx\,dy = r\,dr\,d\theta$, so

$I^2 = \int_0^{2\pi}\!\!\int_0^{\infty} e^{-r^2/2}\, r\,dr\,d\theta = 2\pi \int_0^{\infty} r e^{-r^2/2}\,dr = \left[u = \tfrac{r^2}{2},\ du = r\,dr\right] = 2\pi \int_0^{\infty} e^{-u}\,du = 2\pi \left[-e^{-u}\right]_0^{\infty} = 2\pi.$
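The Gaussian integral can also be confirmed numerically (a sketch, assuming numpy and scipy):

```python
# Numerical check that the integral of e^{-x^2/2} over the real line is sqrt(2*pi).
import numpy as np
from scipy.integrate import quad

I, _ = quad(lambda x: np.exp(-x ** 2 / 2), -np.inf, np.inf)
print(I, np.sqrt(2 * np.pi))  # both ~2.5066
```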

Exercise. Let $X \sim N(0, 1)$. Show that $E(X) = 0$ and $Var(X) = 1$.

Proof. By definition,

$E(X) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \underbrace{x e^{-x^2/2}}_{\text{odd function}}\,dx = 0.$

Moreover, integrating by parts ($u = x$, $v' = x e^{-x^2/2}$, so $u' = 1$, $v = -e^{-x^2/2}$),

$Var(X) = E(X^2) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\left(\left[-x e^{-x^2/2}\right]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-x^2/2}\,dx\right) = 0 + 1 = 1.$

The cdf of $N(0, 1)$. The cdf of a standard normal r.v. is denoted by $\phi(a)$. We have

$\phi(a) = P(X \le a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^a e^{-x^2/2}\,dx, \quad a \in \mathbb{R}.$

The function $\phi(x)$ cannot be expressed in terms of elementary functions, such as $\sin x$, $\cos x$, $e^x$, $\ln x$, $x^n$, etc. It is a special function, tabulated on p. 201 of Ross.

Remark. By the symmetry of the density of $X \sim N(0, 1)$, one has $\phi(-x) = 1 - \phi(x)$.

Figure 1. By symmetry of the density of $N(0, 1)$, $\phi(-z) = 1 - \phi(z)$.

Example. Let $X \sim N(0, 1)$. Find $P(|X| \le 2)$.

Solution. We have

$P(-2 \le X \le 2) = \phi(2) - \phi(-2) = 2\phi(2) - 1 \approx 0.9544.$

In this example, you see how high the probability of a small interval around the origin is. This is a demonstration of the fast tail decay of $f(x)$.
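In practice one can replace the table by a library call; for instance (a sketch, assuming scipy):

```python
# P(|X| <= 2) for X ~ N(0, 1), computed from the cdf instead of the table.
from scipy.stats import norm

print(2 * norm.cdf(2) - 1)  # ~0.9545
```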

General normal distributions. Let $Z \sim N(0, 1)$, and consider the r.v. $X = \mu + \sigma Z$, where $\mu \in \mathbb{R}$ and $\sigma > 0$. Then the cdf of $X$ is

$F(x) = P(X \le x) = P(\mu + \sigma Z \le x) = P\left(Z \le \frac{x - \mu}{\sigma}\right) = \phi\left(\frac{x - \mu}{\sigma}\right).$

By the chain rule, the pdf of $X$ is

$f(x) = F'(x) = \phi'\left(\frac{x - \mu}{\sigma}\right) \cdot \frac{1}{\sigma} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}.$

The expected value and variance are:

$E(X) = E(\mu + \sigma Z) = \mu + \sigma \underbrace{E(Z)}_{=0} = \mu, \qquad Var(X) = Var(\mu + \sigma Z) = \sigma^2\, Var(Z) = \sigma^2.$

This leads us to the following definition:

Definition. A r.v. $X$ has the normal distribution with parameters $\mu, \sigma$ if $X$ has pdf

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}.$

Notation: $X \sim N(\mu, \sigma^2)$. We have $\frac{X - \mu}{\sigma} \sim N(0, 1)$, and

$F_X(x) = P(X \le x) = P\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) = \phi\left(\frac{x - \mu}{\sigma}\right).$

Meaning of the parameters: $\mu = E(X)$ is the mean value of $X$, and $\sigma = \sigma_X$ is the standard deviation of $X$. Different values of $\mu$ correspond to horizontal shifts of the density function, and different values of $\sigma$ make the density's graph narrower/broader (see figure below).
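Standardization is exactly what statistical libraries do internally; the identity $F_X(x) = \phi\big(\frac{x - \mu}{\sigma}\big)$ can be seen directly (a sketch, assuming scipy):

```python
# P(X <= x) for X ~ N(mu, sigma^2) equals phi((x - mu)/sigma).
from scipy.stats import norm

mu, sigma, x = 10, 2, 13
print(norm.cdf(x, loc=mu, scale=sigma))  # ~0.9332
print(norm.cdf((x - mu) / sigma))        # same value via standardization
```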
