Basic Statistics for SGPE Students Part II: Probability theory 1


Basic Statistics for SGPE Students, Part II: Probability theory [1]
Mark Mitchell (mark.mitchell@ed.ac.uk), Nicolai Vitt (n.vitt@ed.ac.uk)
University of Edinburgh, September 2016
[1] Thanks to Achim Ahrens, Anna Babloyan and Erkal Ersoy for creating these slides and allowing us to use them.

Outline
1. Descriptive statistics: Sample statistics (mean, variance, percentiles); Graphs (box plot, histogram); Data transformations (log transformation, unit of measure); Correlation vs. Causation
2. Probability theory: Conditional probabilities and independence; Bayes theorem
3. Probability distributions: Discrete and continuous probability functions; Probability density function & cumulative distribution function; Binomial, Poisson and Normal distribution; E[X] and V[X]
4. Statistical inference: Population vs. sample; Law of large numbers; Central limit theorem; Confidence intervals; Hypothesis testing and p-values
1 / 35

Probability Example II.1 A fair coin is tossed three times. Sample space and event: The (mutually exclusive and exhaustive) list of possible outcomes of an experiment is known as the sample space and is denoted S. An event E is a single outcome or group of outcomes in the sample space; that is, E is a subset of S. In this example, S = {HHH, THH, HTH, HHT, HTT, THT, TTH, TTT}, where H and T denote head and tail. Suppose we are interested in the event "at least two heads". The corresponding subset is E = {HHH, THH, HTH, HHT}. What is the probability of the event E? 2 / 35

Probability Let's take a step back: what is probability? Classical interpretation (Jacob Bernoulli, Pierre-Simon Laplace): If outcomes are equally likely, they must have the same probability. For example, when a coin is tossed, there are two possible outcomes: head and tail. More generally, if there are n equally likely outcomes, then the probability of each outcome is 1/n. Frequency interpretation: The probability that a specific outcome of a process will be obtained is the relative frequency with which that outcome would be obtained if the process were repeated a large number of times under the same conditions. As we make more and more tosses, the proportion of tosses that produce head approaches 0.5. We say that 0.5 is the probability of head. [Figure: relative frequency of heads in two trials plotted against the number of tosses (0 to 100), converging towards 0.5.] 3 / 35
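
The convergence shown in the figure is easy to illustrate with a short simulation. The following is a minimal Python sketch (the seed and number of tosses are arbitrary choices, not from the slides): it tosses a fair coin repeatedly and prints the running relative frequency of heads.

```python
# Minimal sketch: the frequency interpretation of probability.
# Toss a fair coin and track the running proportion of heads.
import random

random.seed(1)  # arbitrary seed so the run is reproducible

n_tosses = 100
heads = 0
for t in range(1, n_tosses + 1):
    heads += random.random() < 0.5  # True (1) counts as a head
    if t in (10, 50, 100):
        print(f"after {t:3d} tosses: relative frequency of heads = {heads / t:.3f}")
```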

Probability Let's take a step back: what is probability? Subjective interpretation (Bayesian approach): The probability that a person assigns to a possible outcome represents his own judgement (based on the person's beliefs and information). Another person, who may have different beliefs or different information, may assign a different probability to the same outcome. Distinction between prior and posterior beliefs. Thinking about randomness: "[Carl Friedrich] Gauss's conversation turned to chance, the enemy of all knowledge, and the thing he had always wished to overcome. Viewed from up close, one could detect the infinite fineness of the web of causality behind every event. Step back and larger patterns appeared: Freedom and Chance were a question of distance, a point of view. Did he understand? Sort of, said Eugen wearily, looking at his pocket watch." from Measuring the World by Daniel Kehlmann 4 / 35

Probability Properties of probability. Rule 1: For any event A, 0 ≤ P(A) ≤ 1. Furthermore, P(S) = 1. Rule 2 (Complement rule): A^c denotes the complement of event A. P(A^c) = 1 − P(A). Rule 3 (Multiplication rule): Two events A and B are independent of each other if and only if P(AB) = P(A and B) = P(A ∩ B) = P(A)P(B). [Each rule is illustrated with a Venn diagram.] 5 / 35

Probability Properties of probability. Rule 4 (Addition rule): If two events A and B are mutually exclusive, then P(A or B) = P(A ∪ B) = P(A) + P(B). Rule 5: If event B is a subset of event A, then P(B) ≤ P(A). 6 / 35

Probability What is the probability of E? Example II.1 A fair coin is tossed three times. S = {HHH, THH, HTH, HHT, HTT, THT, TTH, TTT}, E = {HHH, THH, HTH, HHT}. What is P(E)? First, note that because the coin is fair, P(H) = P(T) = 1/2. Second, since each toss is independent of the previous one, we can use Rule 3 (multiplication rule): P(HHH) = P(H)P(H)P(H) = 1/2 × 1/2 × 1/2 = 1/8, and following the same reasoning, P(THH) = P(HTH) = P(HHT) = 1/8. Third, using Rule 4 (addition rule), P(E) = P(HHH) + P(THH) + P(HTH) + P(HHT) = 4/8 = 1/2. 7 / 35
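
The same answer can be obtained by brute-force enumeration. A minimal Python sketch, assuming we simply list the eight equally likely outcomes and count those with at least two heads:

```python
# Sketch: enumerate the sample space of three tosses of a fair coin and
# compute P(at least two heads) by counting equally likely outcomes.
from itertools import product

sample_space = list(product("HT", repeat=3))        # 8 equally likely outcomes
event = [o for o in sample_space if o.count("H") >= 2]
print(len(event) / len(sample_space))               # 0.5, matching P(E) = 1/2
```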

Probability Generalised addition rule. Example II.2 A fair six-sided die is rolled. The sample space is given by S = {1, 2, 3, 4, 5, 6}. Let E1 be the event "obtain 3 or 4" and let E2 denote the event "smaller than 4". Thus, E1 = {3, 4} and E2 = {1, 2, 3}. It is immediately clear that P(E1) = 2/6 and P(E2) = 3/6. But what is the probability that either E1 or E2 occurs? That is, what is P(E1 ∪ E2)? Since E1 and E2 are not mutually exclusive, we cannot apply Rule 4 (addition rule). But we can generalise Rule 4. 8 / 35

Probability Generalised addition rule. Rule 4' (General addition rule): For any two events A and B, P(A or B) = P(A ∪ B) = P(A) + P(B) − P(AB). Note that if A and B are mutually exclusive, P(AB) = 0. Therefore, Rule 4 is a special case of Rule 4'. Applying Rule 4', we get P(E1 ∪ E2) = P(E1) + P(E2) − P(E1E2) = [P(3) + P(4)] + [P(1) + P(2) + P(3)] − P(3) = 1/6 + 1/6 + 1/6 + 1/6 + 1/6 − 1/6 = 4/6. 9 / 35
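
As a quick sanity check of Rule 4', the die example can also be verified by counting outcomes directly; the set-based framing below is a sketch, not anything from the slides:

```python
# Sketch: verify the general addition rule on the die example by direct counting.
S = {1, 2, 3, 4, 5, 6}
E1, E2 = {3, 4}, {1, 2, 3}

p = lambda A: len(A) / len(S)          # equally likely outcomes
print(p(E1 | E2))                      # 0.666..., i.e. 4/6
print(p(E1) + p(E2) - p(E1 & E2))      # same value via Rule 4'
```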

Conditional probability Example II.3 Suppose that, on any particular day, Anna is either in a good mood (A) or in a bad mood (A^c). Also, on any particular day, the sun is either shining (B) or not (B^c). Anna's mood depends on the weather, such that she is more likely to be in a good mood when the sun is shining. [Venn diagram: event A inside the sample space S.] The blue area A, which represents the probability that Anna is in a good mood, is rather small compared to the full rectangle (about 35%). In general, it is more likely that Anna is in a bad mood. 10 / 35

Conditional probability Example II.3 Suppose that, on any particular day, Anna is either in a good mood (A) or in a bad mood (A^c). Also, on any particular day, the sun is either shining (B) or not (B^c). Anna's mood depends on the weather, such that she is more likely to be in a good mood when the sun is shining. [Venn diagram: regions AB^c, AB and A^cB inside S.] This graph shows both events, A and B, and their overlap. 11 / 35

Conditional probability Example II.3 Suppose that, on any particular day, Anna is either in a good mood (A) or in a bad mood (A^c). Also, on any particular day, the sun is either shining (B) or not (B^c). Anna's mood depends on the weather, such that she is more likely to be in a good mood when the sun is shining. [Venn diagram: only the circle B, split into AB and A^cB.] Now, suppose the sun is shining. We can discard the remaining sample space and focus on B. The area AB takes up most of the area in the circle. That is, given that B occurred, it is more likely that Anna is in a good mood, although in general she is more often in a bad mood. 12 / 35

Conditional probability Rule 3' (General multiplication rule): If A and B are any two events and P(B) > 0, then P(AB) = P(A)P(B|A) = P(B)P(A|B). P(A|B) is the conditional probability of the event A given that the event B has occurred. Conditional probability: From Rule 3' follows the definition of conditional probability, P(A|B) = P(AB)/P(B). Note that, if A and B are independent, then P(A|B) = P(A)P(B)/P(B) = P(A). Thus, Rule 3 is a special case of Rule 3'. 13 / 35

Conditional probability Example II.4 The following table contains counts (in thousands) of persons aged 25 and older, classified by educational attainment and employment status:

Education | Employed | Unemployed | Not in labor force | Total
Did not finish high school | 11,521 | 886 | 14,226 | 26,633
High school degree | 36,857 | 1,682 | 22,834 | 61,373
Some college | 34,612 | 1,275 | 13,944 | 49,831
Bachelor's degree or higher | 43,182 | 892 | 12,546 | 56,620
Total | 126,172 | 4,735 | 63,550 | 194,457

Is employment status independent of educational attainment? Suppose we randomly draw a person from the population. What is the probability that the person is employed? P(employed) = 126,172/194,457 = 0.6488. Now, suppose we randomly draw another person and are given the information that the person did not finish high school. What is the probability that the person is employed given that the person did not finish high school? P(employed | did not finish high school) = 11,521/26,633 = 0.4326. 14 / 35
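
These probabilities can be recomputed directly from the counts. A sketch in Python, where the dictionary layout is my own choice and the figures are the counts (in thousands) from the table:

```python
# Sketch: P(employed) and P(employed | did not finish high school) from raw counts.
counts = {
    "did not finish high school":  {"employed": 11_521, "unemployed": 886,   "not in labor force": 14_226},
    "high school degree":          {"employed": 36_857, "unemployed": 1_682, "not in labor force": 22_834},
    "some college":                {"employed": 34_612, "unemployed": 1_275, "not in labor force": 13_944},
    "bachelor's degree or higher": {"employed": 43_182, "unemployed": 892,   "not in labor force": 12_546},
}
total = sum(sum(row.values()) for row in counts.values())      # 194,457
employed = sum(row["employed"] for row in counts.values())     # 126,172

print(employed / total)                                        # P(employed) ~ 0.6488
row = counts["did not finish high school"]
print(row["employed"] / sum(row.values()))                     # P(employed | no high school) ~ 0.4326
```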

Conditional probability We can display the relationship between education and employment in a probability table.

Education | Employed | Unemployed | Not in labor force | Total
Did not finish high school | 0.05925 | 0.00456 | 0.07316 | 0.13696
High school degree | 0.18954 | 0.00865 | 0.11742 | 0.31561
Some college | 0.17800 | 0.00656 | 0.07171 | 0.25626
Bachelor's degree or higher | 0.22206 | 0.00459 | 0.06452 | 0.29117
Total | 0.64884 | 0.02435 | 0.32681 | 1.00000

The probabilities in the central enclosed rectangle are joint probabilities. For example,
P(no high school and unemployed) = P(unemp.) P(no high school | unemp.) = (4,735/194,457) × (886/4,735)
= P(no high school) P(unemp. | no high school) = (26,633/194,457) × (886/26,633)
= 886/194,457 = 0.00456. 15 / 35

Conditional probability Using the same probability table: the probabilities in the right-most column and the bottom row are called marginal probabilities. For example, P(High school degree) = 61,373/194,457 = 0.31561. 16 / 35

Conditional probability Using the same probability table: note that, under independence, we would have P(no high school and employed) = P(employed) P(no high school) = 0.64884 × 0.13696 = 0.08887 ≠ 0.05925, which indicates that educational attainment and employment are not independent. 17 / 35
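
The independence check amounts to comparing the joint probability with the product of the two marginals; a minimal sketch using the counts from the table:

```python
# Sketch: under independence the joint probability would equal the product
# of the marginals; comparing the two reproduces the check on the slide.
total = 194_457
p_no_hs = 26_633 / total              # marginal: did not finish high school
p_emp = 126_172 / total               # marginal: employed
p_joint = 11_521 / total              # joint: no high school AND employed

print(p_emp * p_no_hs)                # ~0.0889 (what independence would imply)
print(p_joint)                        # ~0.0593 -> not independent
```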

Conditional probability Example II.5 Ms Smith, Ms Brown and Ms Thomson want to spend a day in Edinburgh, but cannot agree on what to do. They decide to vote. Each person can choose between theatre (T) and cinema (C). Ms Smith and Ms Thomson decide independently but Ms Brown is affected by Ms Thomson. The probabilities can be summarised as follows: P(Thomson = T) = 0.2; P(Brown = T Thomson = T) = 0.8; P(Brown = T Thomson = C) = 0.05; P(Smith = T) = 0.8. What is the probability that the majority (i.e. at least two) will vote in favour of theatre? 18 / 35
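
One way to answer this is to enumerate all eight voting profiles and add up the probabilities of those with at least two votes for theatre. The sketch below uses only the probabilities given in the example; the enumeration approach itself is not from the slides.

```python
# Sketch: enumerate all voting outcomes and sum the probabilities of those
# with at least two votes for theatre (T). Brown's vote depends on Thomson's.
p_majority = 0.0
for thomson in ("T", "C"):
    p_t = 0.2 if thomson == "T" else 0.8
    for brown in ("T", "C"):
        p_brown_t = 0.8 if thomson == "T" else 0.05       # P(Brown = T | Thomson)
        p_b = p_brown_t if brown == "T" else 1 - p_brown_t
        for smith in ("T", "C"):
            p_s = 0.8 if smith == "T" else 0.2
            if (thomson, brown, smith).count("T") >= 2:
                p_majority += p_t * p_b * p_s
print(p_majority)   # 0.224 with the probabilities given above
```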

Independence versus disjointness Recall that P(A and B) = P(A)P(B) holds if and only if A and B are independent. Furthermore, P(A and B) = 0 holds if and only if the events A and B are disjoint or mutually exclusive. Therefore, if A and B are nontrivial events (i.e. P(A) and P(B) are nonzero), then they cannot be both independent and mutually exclusive. Remark: Independent and disjoint do not mean the same thing! Disjointness means that A and B cannot occur at the same time. Independence means that the occurrence of A has no influence on the probability that B happens, and vice versa. 19 / 35

Bayes theorem Derivation From Rule 3':
P(A|B) = P(AB)/P(B)   (1)
P(B|A) = P(AB)/P(A)   (2)
We can rewrite (2) as P(B|A)P(A) = P(AB) and substitute the expression into (1) to get
P(A|B) = P(B|A)P(A)/P(B).   (3)
Furthermore, P(B) = P(BA) + P(BA^c), and from (2), P(B) = P(B|A)P(A) + P(B|A^c)P(A^c). Therefore, we can write (3) as
P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A^c)P(A^c)]. 20 / 35

Bayes theorem Bayes theorem For any two events A and B with 0 < P(A) < 1 and 0 < P(B) < 1,
P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A^c)P(A^c)].
Bayes theorem provides a simple rule for computing the conditional probability of the event A given B from the conditional probability of B given A (and the unconditional probability of A). 21 / 35

Bayes theorem Example II.6 Suppose you have three coins in a box. Two of them are fair and the other one is counterfeit and always lands heads. Thus, if you randomly pick one coin, there is a 1/3 chance that the coin is counterfeit; i.e. P(counterfeit) = 1/3. P(counterfeit) is the prior (or unconditional) probability. Now, you toss the randomly picked coin three times and get three heads. We are interested in the (posterior) probability that the coin is counterfeit conditional on observing three heads. That is,
P(counterfeit | HHH) = P(HHH | counterfeit)P(counterfeit) / [P(HHH | counterfeit)P(counterfeit) + P(HHH | fair)P(fair)].
We know from above that P(counterfeit) = 1/3, P(fair) = 2/3, P(HHH | counterfeit) = 1 and P(HHH | fair) = 1/8. Thus,
P(counterfeit | HHH) = (1 × 1/3) / (1 × 1/3 + 1/8 × 2/3) = 4/5. 22 / 35
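
The same posterior can be computed in a couple of lines; the exact-fraction arithmetic below is a sketch, with the prior and the likelihoods taken from the example:

```python
# Sketch: Bayes theorem for the counterfeit-coin example, using exact fractions.
from fractions import Fraction

prior_counterfeit = Fraction(1, 3)
prior_fair = Fraction(2, 3)
lik_counterfeit = Fraction(1)          # P(HHH | counterfeit)
lik_fair = Fraction(1, 8)              # P(HHH | fair) = (1/2)^3

posterior = (lik_counterfeit * prior_counterfeit) / (
    lik_counterfeit * prior_counterfeit + lik_fair * prior_fair
)
print(posterior)                       # 4/5
```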

Bayes theorem Example II.7 Suppose a test for an illegal drug correctly identifies drug users 90% of the time and gives a positive reading for non-drug users only 1% of the time. 1 person in a thousand in the population is a drug user. Timmy tests positive, indicating that he is a drug user. How likely is it that Timmy is actually a drug user? We are looking for
P(user | pos.) = P(pos. | user)P(user) / [P(pos. | user)P(user) + P(pos. | non-user)P(non-user)].
From the text above, we know that P(user) = 0.001, P(non-user) = 0.999, P(pos. | user) = 0.9 and P(pos. | non-user) = 0.01. Therefore,
P(user | pos.) = (0.9 × 0.001) / (0.9 × 0.001 + 0.01 × 0.999) ≈ 0.083. 23 / 35
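
The base-rate effect behind this number can also be seen by simulation. A rough sketch (sample size and seed are arbitrary choices): draw a large synthetic population, apply the stated error rates, and look at the share of positives who are actually users.

```python
# Sketch: simulate the drug-test example. With a base rate of 1 in 1,000,
# most positives come from the large pool of non-users, so P(user | positive)
# stays low despite the test being fairly reliable.
import random

random.seed(2)
n = 1_000_000
true_positives = false_positives = 0
for _ in range(n):
    user = random.random() < 0.001
    positive = random.random() < (0.9 if user else 0.01)
    if positive:
        if user:
            true_positives += 1
        else:
            false_positives += 1
print(true_positives / (true_positives + false_positives))   # roughly 0.08
```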

Bayes theorem Example II.7 Suppose a test for an illegal drug correctly identifies drug users 90% of the time and gives a positive reading for non-drug users only 1% of the time. 1 person in a thousand in the population is a drug user. Timmy tests positive, indicating that he is a drug user. How likely is it that Timmy is actually a drug user? The prior (unconditional) probability that Timmy is a drug user is P(user) = 0.001. Based on the information from the test, we update the prior probability of 0.001 upwards to a posterior probability of about 0.083. This probability is surprisingly low. Despite the positive test result and despite the test being quite reliable, it is more likely that Timmy is not a drug user than that he is a drug user! 24 / 35

Bayes theorem Example II.7 Suppose a test for an illegal drug correctly identifies drug users 90% of the time and gives a positive reading for non-drug users only 1% of the time. 1 person in a thousand in the population is a drug user. Timmy tests positive, indicating that he is a drug user. How likely is it that Timmy is actually a drug user? We can display the relationship between test results and drug consumption in a probability table:

Drug user? | Positive test | Negative test | Total
Non-user | 0.0099 | 0.9891 | 0.999
User | 0.0009 | 0.0001 | 0.001
Total | 0.0108 | 0.9892 | 1.000

P(user and positive) = P(user) P(pos. | user) = 0.001 × 0.9 = 0.0009
P(non-user and positive) = P(non-user) P(pos. | non-user) = 0.999 × 0.01 = 0.00999 25 / 35

Monty Hall problem You are in a game show. There are three doors. Behind one door is a car; behind the other two doors are goats. You pick one door (here door 1). The game host opens another door which has a goat behind it (here door 2). The game host then gives you the chance to switch to the other closed door (here door 3). Should you stick with your door or switch? Does it matter? 26 / 35

Monty Hall problem The answer seems obvious: it should not make a difference. There are two doors left, so the probability of winning should be 0.5, independent of how you decide. However, this reasoning is wrong! To see why, let's list all nine different cases and see which strategy is more successful. 27 / 35

Monty Hall problem Suppose you picked door 1; the car may be behind door 1, 2 or 3:

Car behind | Stick | Switch
Door 1 | WIN | LOSE
Door 2 | LOSE | WIN
Door 3 | LOSE | WIN

If we switch, we have a 2/3 chance of winning! (Watch video) 28 / 35

Monty Hall problem Now suppose you picked door 2:

Car behind | Stick | Switch
Door 1 | LOSE | WIN
Door 2 | WIN | LOSE
Door 3 | LOSE | WIN

If we switch, we have a 2/3 chance of winning! (Watch video) 28 / 35

Monty Hall problem Finally, suppose you picked door 3:

Car behind | Stick | Switch
Door 1 | LOSE | WIN
Door 2 | LOSE | WIN
Door 3 | WIN | LOSE

Across all nine cases, switching wins in six and sticking wins in only three: if we switch, we have a 2/3 chance of winning! (Watch video) 28 / 35
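
A short simulation confirms the 2/3 figure. The sketch below assumes, as in the problem description, that the host always opens a door with a goat that is different from the contestant's pick:

```python
# Sketch: simulate the Monty Hall game and compare sticking with switching.
import random

random.seed(3)
n = 100_000
wins_stick = wins_switch = 0
for _ in range(n):
    car = random.randrange(3)
    pick = random.randrange(3)
    # the host opens a door that is neither the pick nor the car
    opened = next(d for d in range(3) if d != pick and d != car)
    switched = next(d for d in range(3) if d != pick and d != opened)
    wins_stick += (pick == car)
    wins_switch += (switched == car)
print(wins_stick / n, wins_switch / n)   # roughly 1/3 and 2/3
```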

The Birthday Problem Example II.8 Suppose there is a group of k (2 ≤ k ≤ 365) people. What is the probability that at least two people in the group share the same birthday (i.e. year of birth does not matter)? Ignore February 29 and assume that each of the 365 days of a year is equally likely to be the birthday of any person and that birthdays of the group members are unrelated (no twins). It turns out that it is easier to start with the question: what is the probability that no one in the group shares a birthday? Note that P(at least two share a birthday) = 1 − P(no one shares a birthday). Let's start with k = 2. Given that the first person has her birthday on any arbitrary day of the year, the probability that the second person does not have the same birthday is 364/365. 29 / 35

The Birthday Problem Example II.8 Suppose there is a group of k (2 ≤ k ≤ 365) people. What is the probability that at least two people in the group share the same birthday (i.e. year of birth does not matter)? Ignore February 29 and assume that each of the 365 days of a year is equally likely to be the birthday of any person and that birthdays of the group members are unrelated (no twins). k = 3: the probability that three persons do not share the same birthday is (364/365) × (363/365). And, in general, for k people it is (364 × 363 × 362 × ... × (365 − k + 1)) / 365^(k−1). 30 / 35

The Birthday Problem Example II.8 Suppose there is a group of k (2 ≤ k ≤ 365) people. What is the probability that at least two people in the group share the same birthday (i.e. year of birth does not matter)? Ignore February 29 and assume that each of the 365 days of a year is equally likely to be the birthday of any person and that birthdays of the group members are unrelated (no twins). Note that
n(n−1)...(n−k+1) = [n(n−1)...(n−k+1) × (n−k)(n−k−1)...1] / [(n−k)(n−k−1)...1] = n!/(n−k)!,
where n! = n(n−1)...1 and 0! = 1. Thus, we can write the above as
P(no one shares a birthday) = 365! / ((365−k)! × 365^k),
and the solution is
P(at least two share a birthday) = 1 − 365! / ((365−k)! × 365^k). 31 / 35

The Birthday Problem Example II.8 Suppose there is a group of k (2 ≤ k ≤ 365) people. What is the probability that at least two people in the group share the same birthday (i.e. year of birth does not matter)? Ignore February 29 and assume that each of the 365 days of a year is equally likely to be the birthday of any person and that birthdays of the group members are unrelated (no twins). The table shows the probability p that at least two people in a group of k people will have the same birthday.

k | p
5 | 0.027
10 | 0.117
15 | 0.253
20 | 0.411
22 | 0.476
23 | 0.507
25 | 0.569
30 | 0.706
40 | 0.891
50 | 0.970
60 | 0.994

(Watch video) 32 / 35
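
The table can be reproduced with the closed-form expression derived above; a minimal sketch that evaluates 1 − 365!/((365−k)! × 365^k) as a running product (to avoid huge factorials):

```python
# Sketch: P(at least two of k people share a birthday), computed as
# 1 minus the running product (365/365)(364/365)...((365-k+1)/365).
def p_shared(k):
    p_none = 1.0
    for i in range(k):
        p_none *= (365 - i) / 365
    return 1 - p_none

for k in (5, 10, 15, 20, 22, 23, 25, 30, 40, 50, 60):
    print(k, round(p_shared(k), 3))
```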

Sampling with replacement The birthday problem is an example of sampling with replacement. Sampling with replacement (stylised example): A box contains n balls numbered 1, ..., n. First, one ball is selected at random from the box and its number noted. This ball is then put back in the box and another ball is selected. Thus, it is possible that the same ball is selected again. This process is called sampling with replacement. It is assumed that each of the n balls is equally likely to be selected at each stage and that the selections are independent of each other. Suppose we pick k balls. There are in total n^k different outcomes. The probability assigned to each outcome is 1/n^k. 33 / 35

Sampling without replacement Example II.9 Suppose we have a box of 6 books and we randomly arrange the books on a shelf. What is the probability that, by chance, the books are ordered alphabetically? There are 6 × 5 × 4 × 3 × 2 × 1 = 6! = 720 distinct ways of arranging 6 books, but only one order is alphabetically correct. Thus, p = 1/720. More generally: Permutations Suppose that k cards are to be selected and removed from a deck of n cards without replacement. Each possible distinct outcome is called a permutation. The total number of permutations is
P_{n,k} = n(n−1)...(n−k+1) = n!/(n−k)!,
where a! = a(a−1)(a−2)...1 and 0! = 1. 34 / 35
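
A small sketch of the permutation count and the book example (the helper function name is my own, not from the slides):

```python
# Sketch: the permutation count P_{n,k} = n!/(n-k)! and the 6-book example.
from math import factorial

def permutations_count(n, k):
    return factorial(n) // factorial(n - k)

print(permutations_count(6, 6))        # 720 ways to arrange 6 books
print(1 / permutations_count(6, 6))    # ~0.00139, probability of the alphabetical order
```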

Summary Frequentist approach: The probability of an outcome is the relative frequency with which that outcome would be obtained if the experiment were repeated a large number of times. Independence and disjointness are not the same! If two events A and B are mutually exclusive (or disjoint), then P(AB) = 0. If two events are independent, then the occurrence of A has no influence on the probability that B occurs, and vice versa. Bayes theorem provides a rule for computing the conditional probability of the event A given B from the conditional probability of B given A. It is the building block of Bayesian econometrics. 35 / 35