Probability and Discrete Distributions

Similar documents
Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Binomial and Poisson Probability Distributions

Lecture 2: Probability and Distributions

3.2 Probability Rules

Math 140 Introductory Statistics

Math 140 Introductory Statistics

STT 315 Problem Set #3

Probability and Probability Distributions. Dr. Mohammed Alahmed

Chapter 7 Wednesday, May 26th

Conditional Probability

Why should you care?? Intellectual curiosity. Gambling. Mathematically the same as the ESP decision problem we discussed in Week 4.

1 Probability Theory. 1.1 Introduction

Topic -2. Probability. Larson & Farber, Elementary Statistics: Picturing the World, 3e 1

Topic 2 Probability. Basic probability Conditional probability and independence Bayes rule Basic reliability

( ) P A B : Probability of A given B. Probability that A happens

Discrete probability distributions

STAT 201 Chapter 5. Probability

Announcements. Lecture 5: Probability. Dangling threads from last week: Mean vs. median. Dangling threads from last week: Sampling bias

Probability deals with modeling of random phenomena (phenomena or experiments whose outcomes may vary)

Basic Concepts of Probability

3 PROBABILITY TOPICS

7.1 What is it and why should we care?

C Homework Set 3 Spring The file CDCLifeTable is available from the course web site. It s in the M folder; the direct link is

Chapter 4 Probability

Chapter 6. Probability

STA Module 4 Probability Concepts. Rev.F08 1

Today we ll discuss ways to learn how to think about events that are influenced by chance.

Intro to Probability Day 3 (Compound events & their probabilities)

3. DISCRETE PROBABILITY DISTRIBUTIONS

Discrete Distributions

DISCRETE RANDOM VARIABLES EXCEL LAB #3

a. The sample space consists of all pairs of outcomes:

What Is a Sampling Distribution? DISTINGUISH between a parameter and a statistic

Math 243 Section 3.1 Introduction to Probability Lab

Probability Experiments, Trials, Outcomes, Sample Spaces Example 1 Example 2

EPE / EDP 557 Homework 7

Lecture 1: Probability Fundamentals

Probability 5-4 The Multiplication Rules and Conditional Probability

Chapter 26: Comparing Counts (Chi Square)

4. Suppose that we roll two die and let X be equal to the maximum of the two rolls. Find P (X {1, 3, 5}) and draw the PMF for X.

Experiment 1: The Same or Not The Same?

ELEG 3143 Probability & Stochastic Process Ch. 1 Probability

Joint, Conditional, & Marginal Probabilities

Probability and Statistics Notes

UNIT 5 ~ Probability: What Are the Chances? 1

What is a random variable

P (A) = P (B) = P (C) = P (D) =

Part 3: Parametric Models

Homework (due Wed, Oct 27) Chapter 7: #17, 27, 28 Announcements: Midterm exams keys on web. (For a few hours the answer to MC#1 was incorrect on

What is Probability? Probability. Sample Spaces and Events. Simple Event

Examples of frequentist probability include games of chance, sample surveys, and randomized experiments. We will focus on frequentist probability sinc

Probability P{E} Example Consider. Find P{HH}. simultaneously. = # ways E occurs

SS257a Midterm Exam Monday Oct 27 th 2008, 6:30-9:30 PM Talbot College 342 and 343. You may use simple, non-programmable scientific calculators.

PROBABILITY.

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 2 MATH00040 SEMESTER / Probability

Single Maths B: Introduction to Probability

Basic Concepts of Probability

5.3 Conditional Probability and Independence

Conditional Probability, Independence and Bayes Theorem Class 3, Jeremy Orloff and Jonathan Bloom

REPEATED TRIALS. p(e 1 ) p(e 2 )... p(e k )

2. AXIOMATIC PROBABILITY

MAT Mathematics in Today's World

Selected problems from past exams and a few additional problems

HYPERGEOMETRIC and NEGATIVE HYPERGEOMETIC DISTRIBUTIONS

Probability- describes the pattern of chance outcomes

Announcements. Topics: To Do:

2.4. Conditional Probability

Discrete Distributions

Statistical Experiment A statistical experiment is any process by which measurements are obtained.

Chapter 2 Class Notes

Chapter 15. General Probability Rules /42

Basic Statistics and Probability Chapter 3: Probability

Lecture 10: Bayes' Theorem, Expected Value and Variance Lecturer: Lale Özkahya

Outline Conditional Probability The Law of Total Probability and Bayes Theorem Independent Events. Week 4 Classical Probability, Part II

Discrete Probability. Chemistry & Physics. Medicine

MATH 19B FINAL EXAM PROBABILITY REVIEW PROBLEMS SPRING, 2010

CSE 103 Homework 8: Solutions November 30, var(x) = np(1 p) = P r( X ) 0.95 P r( X ) 0.

Experiment 0 ~ Introduction to Statistics and Excel Tutorial. Introduction to Statistics, Error and Measurement

Last few slides from last time

Week 2. Section Texas A& M University. Department of Mathematics Texas A& M University, College Station 22 January-24 January 2019

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

Using Probability to do Statistics.

Chapter 14. From Randomness to Probability. Copyright 2012, 2008, 2005 Pearson Education, Inc.

CS 361: Probability & Statistics

Slide 1 Math 1520, Lecture 21

Superiority by a Margin Tests for One Proportion

STEP Support Programme. Statistics STEP Questions

CIVL Why are we studying probability and statistics? Learning Objectives. Basic Laws and Axioms of Probability

2.6 Tools for Counting sample points

Independence 1 2 P(H) = 1 4. On the other hand = P(F ) =

*Karle Laska s Sections: There is no class tomorrow and Friday! Have a good weekend! Scores will be posted in Compass early Friday morning

Chapter 5 : Probability. Exercise Sheet. SHilal. 1 P a g e

Intermediate Math Circles November 8, 2017 Probability II

Relative Photometry with data from the Peter van de Kamp Observatory D. Cohen and E. Jensen (v.1.0 October 19, 2014)

Machine Learning

STAT:5100 (22S:193) Statistical Inference I

Probability Notes (A) , Fall 2010

Chapter 7: Section 7-1 Probability Theory and Counting Principles

Intro to Probability Day 4 (Compound events & their probabilities)

STAT 516: Basic Probability and its Applications

Transcription:

AMS 7L LAB #3 Fall, 2007 Objectives: Probability and Discrete Distributions 1. To explore relative frequency and the Law of Large Numbers 2. To practice the basic rules of probability 3. To work with the binomial and Poisson distributions in JMP Getting Started: Log onto your machine and start JMP. There are no datasets to download for today. Part I. Relative Frequency Suppose a random experiment is repeated many times, for example, a fair coin is flipped 1000 times. The outcome from a single flip of the coin is either Heads or Tails. If we repeat the experiment, the number of times each outcome is observed, i.e., the number of Heads and the number of Tails, is called the frequency of that outcome. The relative frequency of an outcome is the proportion of times that outcome is observed. For example, the relative frequency of Heads is the frequency of heads divided by the total number of flips. By the Law of Large Numbers, as the number of repetitions of an experiment increases, the relative frequency of each outcome approaches the probability of that outcome. In this lab you will use computer simulation to find the relative frequency of an outcome and to show how, as the number of repetitions of the experiment increases, the relative frequency stabilizes to a number between 0 and 1. To illustrate the law of large numbers and the relative frequency definition of probability we will use JMP to simulate the flipping of a coin a number of times. A probability model for this experiment, i.e., flipping a coin multiple times, is called a binomial model. From the JMP Starter window, choose New Data Table. We will generate 10 coin flips, one in each row. To get this started, first we need to tell JMP how many coins to flip. To do this, double-click on the header for Column 1 (at the top of the first column of the data table) to bring up the dialog box (if that doesn t work, you can also right-click and select Column Info...). Near the bottom, where it has a box with 0 after N Rows, replace the 0 with 10 and hit OK. This will give you a column of dots, which represent missing values. What we ve done is tell JMP how many random draws we are going to want. Now right-click on the header and choose Formula, which will pop up a new dialog box. On the top, just right of center is a menu (titled Functions (grouped)) with a scroll bar. Scroll down a bit and click on Random, which pops up more choices, and click on Random Binomial. It should put Random Binomial(n, p) in the big box, and the n should be surrounded by a red box, meaning that JMP is asking you to specify how many coins are flipped for each row. Since we re just putting one coin flip in each row, type the number 1 and hit Enter. Now click on the p to move the red box to be around it. We want each coin to have probability 0.5 of coming up head, so type in 0.5 and hit Enter. Now hit OK and JMP will fill in the random flips and return you to the data table. You should now see 0 s (tails) and 1 s (heads) in the second column. In this case it is easy enough to compute the relative frequency by hand. But you can also get it by going back to the JMP Starter window, going to the Basic menu, and choosing Distribution. Since all entries are 0 or 1, the mean is equal to the relative frequency of 1 s, so you can just scroll down to the mean. 1

Question #1 What is the relative frequency of heads? Question #2 What proportion of Heads did you expect? Are you surprised at the proportion of Heads you observed? We want to observe the law of large numbers taking effect by increasing the number of flips of the coin. Redo your simulation above separately (create a new data table each time, and remember to tell JMP how many rows you want before you go to the formula window) for (i) n = 100 flips, (ii) n = 1000 flips and (iii) n = 10000 flips. Please fill in the following table: Question #3 Number of Flips n = 10 n = 100 n = 1000 n = 10000 Relative Frequency of Heads Question #4 The true probability of observing a Head based on this simulation is 0.50. Describe what happens to the relative frequency of the occurrence of a Head as the number of flips increases from 10 to 10000. Question #5 If you were to look at someone else s simulation, would you expect the results for n=10 to be about the same as yours? How about the results for n=10000? Part II. Probability Questions 6-14 regard a sample of 100 college students that was asked what their favorite burger chain is. The following results were obtained: 2

Burger King McDonald s In-N-Out Other Male 10 12 24 9 Female 24 8 7 6 Question #6 If one person is randomly selected from this sample, what is the probability that they prefer In-N-Out? Question #7 If one person is randomly selected, what is the probability that they are female and prefer In-N-Out? Question #8 If one person is randomly selected, what is the probability that they are female or prefer In-N-Out? Question #9 If one person is randomly selected, what is the probability that they prefer In-N-Out given that they are female? Question #10 If a randomly selected person that prefers In-N-Out is selected, what is the probability that they are female? Question #11 Do you get the same number for #9 and #10? Why or why not? Question #12 Are choice of burger chain and gender independent or dependent? Why? Question #13 Are preference for Burger King and for McDonald s mutually exclusive? Question #14 Are preference for Burger King and for McDonald s independent? 3

A newspaper article reported Finnish men are complaining of sexual harassment in the work place. A sample of male and female Finnish workers consisted of 48% men. Among all the workers in the sample, 30% reported being sexually harassed in the workplace. Furthermore, 26% of the workers reported being male and being sexually harassed. Question #15 Note that the probability a random worker reports being harassed is equal to the probability that they are male and report being harassed plus the probability that they are female and report being harassed. What is the probability that they are female and report being harassed? Question #16 What is the probability that they are male and do not report being harassed? Question #17 What is the probability that they are male given that they report being harassed? Now we ll practice an application of Bayes Theorem. Recall that Bayes Theorem says: P(A B) = P(B A)P(A) P(B A)P(A) + P(B nota)p(nota) The ELISA test was an early test used to screen blood donations for antibodies to HIV. A study (Weiss et al. 1985) found that the conditional probability that a person would test positive given they have HIV was 0.977 and the conditional probability that a person would test negative given they did not have HIV was 0.926. The World Almanac gives an estimate of the probability of a person in North America having HIV of 0.0026. Question #18 Suppose a random person is tested and they test positive. What is the conditional probability that this person has HIV given that they test positive? 4

Question #19 Are you surprised by your answer? What implications does this have for policies of mandatory testing? Note that this phenomenon of large and unexpected changes in conditional probabilities is not unusual, particularly when dealing with rare events. What is happening is that the number of false positives is much larger than the number of true positives. Question #20 Suppose 10,000 random people are tested. How many of them do you expect to actually have HIV? Question #21 Of those with HIV, how many do you expect to test positive? Question #22 Of those without HIV, how many do you expect to test positive? Question #23 Do your answers to #21 and #22 make sense in light of your answer to #18? Of course in practice, people who test positive would be given a follow-up test. Also, more expensive but more accurate tests have subsequently been developed. But the implications on mandatory testing still apply. For rare diseases, it really only makes sense to test in subgroups that are at high risk. Otherwise you get more false positives than true positives. Part III. Binomial Random Variables Recall that the characteristics of a binomial random variable are: 1. The experiment consists of a fixed number (n) trials. 2. The trials are independent. 3. There are only two possible outcomes on each trial, which for convenience we call success and failure. 4. The probability of a success (p), is the same for each observation. 5. If X is a binomial random variable then its mean is E[X] = µ X = np and its standard deviation is σ X = np(1 p). 5

For the next three questions of the following examples, decide whether X is a binomial random variable. State why or why not. Question #23 The pool of potential jurors for a murder case contains 100 persons chosen at random from the adult residents of a large city. Each person in the pool is asked whether he or she opposes the death penalty; X is the number who say Yes. Question #24 Suppose a jury of 12 people is chosen from the above pool, and this jury hears a case and discusses the verdict; X is the number who think the defendant is guilty. Question #25 A drug is known to be 60% effective in curing a certain disease; that is, the probability is 0.60 that a person with the disease given the drug will be cured. Suppose 30 people with the disease are chosen at random and given the drug. Let X be the number of people among the 30 who are cured. Question #26 Using the information from the previous question, find the expected number of people cured, i.e., E(X), and the standard deviation of X. Question #27 If 25 of the 30 people were cured, would this be unusual? Question #28 Use JMP to generate 100 replications of draws from a binomial with n = 30 and p =.6 (create a new data sheet, fill in the first column with 100 rows of dots, then use the formula menuto put in a random binomial with n = 30 and p = 0.6). What are the observed mean and standard deviation of your 100 random replications? (Use Basic Distributions to get these values form JMP.) Are they close to what you computed in #26? Question #29 How many of your 100 replications were 25 or higher? Is this consistent with your answer to #27? 6

Finally, let s practice with the Poisson distribution. Recall that the Poisson distribution is used to model counts of events that happen independently over time or space. Also recall that the standard deviation of a Poisson is the square root of the mean. Suppose a pet clinic gets an average of six cats per day to be spayed, and they only have the resources (material supplies and volunteer time) to spay a maximum of eight cats on any given day. Can we find the probability that more cats will show up on a day than they can spay? We could work with the theoretical distribution, but instead, let s estimate it using relative frequencies in JMP. We ll simulate 1000 days of arrivals of cats. Each day will be drawn from a Poisson random variable with mean 6. Create a new data table (from File on the JMP Starter window). Doubleclick on the header for Column 1 (at the top of the first column of the data table) to bring up the dialog box. Near the bottom, where it has a box with 0 after N Rows, replace the 0 with 1000 and hit OK. Now right-click on the header and choose Formula. You should get the familiar formula dialog box. Scroll down to choose Random from the Functions (grouped) menu, and then choose Random Poisson. The red box should now be around lambda. Type in 6 and hit Enter. Now click on OK. You should now have 1000 random Poisson draws in your table. Question #30 What are the theoretical expected value and standard deviation for the number of cats showing up at the clinic on a given day? Question #31 Use JMP to find the mean and standard deviation from your simulated draws. How close are they to the theoretical values? Question #32 To get the counts by value, it is easiest to first tell JMP to pretend that the variable is ordinal instead of continuous. In the data table window, in the middle box on the left, there should be a blue triangle to the left of Column 1 (or whatever you have titled the column). Click on that triangle and change it to Ordinal, which should change the icon to grey steps. Now make a new Basic Distribution analysis, which should give you counts. What fraction of your simulation has values of 9 or higher? What do you estimate the probability is that more cats will arrive than can be handled on a given day? Quit JMP and please remember to Log Off. 7