Binomial random variable
Toss a coin with probability p of Heads n times.
X: number of Heads in n tosses.
X is a binomial random variable with parameters n and p. X is Bin(n, p).
A random variable X that counts the number of successes in n independent Bernoulli trials is called a binomial random variable. Its two parameters are:
n, the number of trials
p, the probability of success in a single trial
There are n trials.
Each trial has only two outcomes, Success or Failure.
The trials are independent.
The probability of success is p in every trial.
Then X = number of successes is Bin(n, p).
Sampling with and without replacement
Suppose a sample is drawn from a large dichotomous population, and let X be the number of successes in the sample. If the sample is drawn without replacement, the trials are not independent, so X is not exactly binomial. However, if the sample size is small relative to the population size, binomial probabilities provide a good approximation, and in practice X is modeled as binomial.
$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$ for x = 0, 1, ..., n
$E(X) = np$, $V(X) = np(1-p)$, $\text{s.d.}(X) = \sqrt{np(1-p)}$
Calculating probabilities: from a table or from software.
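As a rough check of the binomial formulas, the sketch below builds the probability function directly and compares the mean and variance obtained by summing over all values with np and np(1 - p). The values n = 10 and p = 0.4 are illustrative assumptions, and `binom_pmf` is a helper name introduced only for this example.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Illustrative parameters (assumed, not taken from the slide).
n, p = 10, 0.4

probs = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                                   # should be 1
mean = sum(x * q for x, q in enumerate(probs))       # should equal n*p
var = sum((x - mean) ** 2 * q for x, q in enumerate(probs))  # should equal n*p*(1-p)

print(total)
print(mean, n * p)
print(var, n * p * (1 - p))
print(sqrt(var))
```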
Binomial table
Problem 4.4
Use the table to find the following probabilities:
1. P(x = 2) for n = 10, p = .4
2. P(x 5) for n = 15, p = .6
3. P(x > 1) for n = 5, p = .1
4. P(x 10) for n = 15, p = .9
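Software can stand in for the table. The sketch below computes items 1 and 3 only, since those are the two items whose relation signs are fully stated; `binom_pmf` and `binom_cdf` are helper names assumed for this example.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x), the cumulative value a binomial table reports."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Item 1: P(x = 2) for n = 10, p = .4
p_item1 = binom_pmf(2, 10, 0.4)

# Item 3: P(x > 1) = 1 - P(x <= 1) for n = 5, p = .1
p_item3 = 1 - binom_cdf(1, 5, 0.1)

print(round(p_item1, 3))
print(round(p_item3, 3))
```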
tophat 4.54
Problem 4.5
Calculate µ, σ², and σ for the following binomial random variables:
1. n = 25, p = .5
2. n = 80, p = .2
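A minimal sketch of the arithmetic for Problem 4.5, using µ = np, σ² = np(1 - p), and σ = √(np(1 - p)). The function name `binom_summary` is an assumption made for this example.

```python
from math import sqrt

def binom_summary(n, p):
    """Return (mean, variance, standard deviation) of Bin(n, p)."""
    mu = n * p
    var = n * p * (1 - p)
    return mu, var, sqrt(var)

mu1, var1, sd1 = binom_summary(25, 0.5)
mu2, var2, sd2 = binom_summary(80, 0.2)

print(mu1, var1, sd1)
print(mu2, var2, round(sd2, 3))
```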
Problem 4.48
Among guests in a hotel, 66% were aware of its Green program, and among those who were aware of the program, 72% participated in it. Let x be the number of guests in a random sample of 15 who were aware of the Green program and participated in it.
Problem 4.48
Explain why x is approximately a binomial random variable.
n identical trials: Although the trials are not exactly identical, they are close. Taking a sample of size n = 15 from a very large population results in trials that are essentially identical.
Two possible outcomes: Each hotel guest either is aware of and participates in the conservation efforts or is not. S = hotel guest is aware of and participates in conservation efforts.
P(S) remains the same from trial to trial: If we sample without replacement, P(S) changes slightly from trial to trial. However, the differences are extremely small and are essentially 0.
Trials are independent: Again, although the trials are not exactly independent, they are very close.
The random variable x = number of hotel guests, in n = 15 trials, who are aware of and participate in conservation efforts.
Thus, x is very close to being a binomial random variable.
Problem 4.48 (continued)
Determine p.
Problem 4.48
Define the following events:
A: hotel guest is aware of the conservation program
B: hotel guest participates in conservation efforts
Then $p = P(A \cap B) = P(A)\,P(B \mid A) = .66(.72) = .4752$.
Assume p = .4 and find the probability that x is at least 10. From the table, $P(x \ge 10) = 1 - P(x \le 9) = 1 - .966 = .034$.
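The two steps can be sketched in code: first the multiplication rule for p, then the upper-tail probability with p rounded to .4 as the slide does. The helper `binom_cdf` is an assumed name; it plays the role of the cumulative binomial table.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Multiplication rule: P(A and B) = P(A) * P(B | A)
p_aware = 0.66
p_participate_given_aware = 0.72
p = p_aware * p_participate_given_aware

# Tail probability with p rounded to .4, matching the table lookup on the slide.
n = 15
p_at_least_10 = 1 - binom_cdf(9, n, 0.4)

print(round(p, 4))
print(round(p_at_least_10, 3))
```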
Problem 4.58
The engineers forecast that 10% of all Denver bridges will have ratings of 4 or below.
Find the probability that, in a random sample of 10 bridges, at least 3 will have ratings of 4 or below.
We have a binomial with n = 10, p = .1.
From the tables, $P(x \ge 3) = 1 - P(x \le 2) = 1 - .930 = .070$.
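The same tail probability can be checked without the table. This is a small sketch with an assumed helper name, `binom_cdf`.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p = 10, 0.1
p_at_most_2 = binom_cdf(2, n, p)
p_at_least_3 = 1 - p_at_most_2   # complement rule

print(round(p_at_most_2, 3))
print(round(p_at_least_3, 3))
```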
Problem 4.58
If you actually observe that x ≥ 3, what would you infer?
Since the probability of seeing at least 3 bridges out of 10 with ratings of 4 or less is so small (about .07), we can conclude that the forecast that 10% of all major Denver bridges will have ratings of 4 or less in 2020 is too low. The true percentage is probably more than 10%.
Problem 4.58
You have purchased 5 million switches, and your supplier has guaranteed that there will be no more than .1% defectives. You randomly sample 500 switches and find 4 defectives. Do you think the supplier has complied with the guarantee?
Assuming the supplier's claim is true,
$\mu = np = 500(.001) = .5$
$\sigma = \sqrt{500(.001)(.999)} = .707$
Problem 4.58
If the supplier's claim is true, we would expect to find only .5 defective switches in a sample of size 500, so it is not likely that we would find 4. Based on the sample, the guarantee is probably inaccurate.
The z-value of the observed result is $z = (4 - .5)/.707 = 4.95$. This is an unusually large z-score.
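The mean, standard deviation, and z-score for the switch example can be reproduced as follows. This is only a sketch of the arithmetic, with variable names chosen for this example.

```python
from math import sqrt

n = 500          # sample size
p = 0.001        # guaranteed maximum defect rate (.1%)
observed = 4     # defectives found in the sample

mu = n * p
sigma = sqrt(n * p * (1 - p))
z = (observed - mu) / sigma   # how many standard deviations above the mean

print(mu)
print(round(sigma, 3))
print(round(z, 2))
```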
Poisson
A Poisson random variable takes values x = 0, 1, 2, .... It has one parameter, λ > 0. X is a Poisson(λ) variable if
$P(X = x) = e^{-\lambda} \dfrac{\lambda^x}{x!}$ for x = 0, 1, 2, ...
$E(X) = \lambda$, $V(X) = \lambda$
Poisson distribution
Why study Poisson?
If X is binomial(n, p) with n large and p small, then with λ = np,
$P(X = x) \approx e^{-\lambda} \dfrac{\lambda^x}{x!}$
The Poisson process is a very important topic in probability theory.
[Figure: two panels plotting pbinom(x, 500, p = 0.1) and ppois(x, 50) against x, for x from 0 to 80]
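The comparison of the two cumulative distributions can be sketched in code. The values n = 500 and p = 0.1 (so λ = 50) are the ones that appear in the figure labels; the function names are assumptions for this example, and the Poisson terms are computed on the log scale only to avoid very large intermediate numbers.

```python
from math import comb, exp, lgamma, log

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def pois_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam); each term is exp(-lam) * lam**k / k!."""
    return sum(exp(-lam + k * log(lam) - lgamma(k + 1)) for k in range(x + 1))

n, p = 500, 0.1
lam = n * p

# Largest vertical gap between the two cumulative curves over the plotted range.
max_gap = max(abs(binom_cdf(x, n, p) - pois_cdf(x, lam)) for x in range(81))

for x in (40, 50, 60):
    print(x, round(binom_cdf(x, n, p), 3), round(pois_cdf(x, lam), 3))
print(max_gap)
```

Because p = 0.1 is not especially small, some gap between the curves is expected; repeating the comparison with a smaller p and the same λ should shrink it.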
Poisson distribution
How do we calculate probabilities for a Poisson random variable?
First, figure out the mean, λ, from the problem.
Then use tables or software (there is no table in the text).
Poisson Distribution Example
Customers arrive at a rate of 72 per hour. What is the probability of 4 customers arriving in 3 minutes?
Poisson Distribution Solution
72 per hour = 1.2 per minute = 3.6 per 3-minute interval, so λ = 3.6.
$p(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$
$p(4) = \dfrac{3.6^4 e^{-3.6}}{4!} = .1912$
Poisson Probability Table (Portion): Cumulative Probabilities

  λ \ x     0    ...     3      4    ...     9
   .02    .980
    :       :            :      :            :
   3.4    .033   ...   .558   .744   ...   .997
   3.6    .027   ...   .515   .706   ...   .996
   3.8    .022   ...   .473   .668   ...   .994
    :       :            :      :            :

$p(4) = P(x \le 4) - P(x \le 3) = .706 - .515 = .191$
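Both routes to p(4), the direct formula and the difference of two cumulative table entries, can be checked with a short sketch. The helper names `pois_pmf` and `pois_cdf` are assumptions for this example.

```python
from math import exp, factorial

def pois_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def pois_cdf(x, lam):
    """P(X <= x), the cumulative value a Poisson table reports."""
    return sum(pois_pmf(k, lam) for k in range(x + 1))

# 72 customers per hour, converted to a 3-minute interval.
lam = 72 / 60 * 3

p4 = pois_pmf(4, lam)                                  # direct formula
p4_from_cdf = pois_cdf(4, lam) - pois_cdf(3, lam)      # table-style difference

print(round(lam, 1))
print(round(p4, 4))
print(round(pois_cdf(4, lam), 3), round(pois_cdf(3, lam), 3))
```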
Problem 4.70
Over the last ten years the average number of bank failures per year was 45. Assume that X, the number of bank failures per year, follows a Poisson distribution.
Find E(X) and s.d.(X).
$E(X) = 45$, $\text{s.d.}(X) = \sqrt{45} = 6.71$
In 2011, 360 banks failed. How far does this value lie above the mean?
$z = \dfrac{360 - 45}{6.71} \approx 47$
In 2010, 65 banks failed. Find P(X ≤ 65).
From the table, P(X ≤ 65) = .998.
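The bank-failure calculations can be reproduced with software in place of a table. The sketch below assumes λ = 45 as stated in the problem; `pois_cdf` is a helper name introduced for this example, and it works on the log scale to keep the terms numerically stable.

```python
from math import exp, lgamma, log, sqrt

def pois_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam); each term is exp(-lam) * lam**k / k!."""
    return sum(exp(-lam + k * log(lam) - lgamma(k + 1)) for k in range(x + 1))

lam = 45                      # average failures per year
sd = sqrt(lam)                # Poisson: variance equals the mean

z_2011 = (360 - lam) / sd     # standard deviations above the mean
p_le_65 = pois_cdf(65, lam)   # P(X <= 65)

print(round(sd, 2))
print(round(z_2011, 1))
print(round(p_le_65, 3))
```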
Hypergeometric Random Variable
The experiment consists of randomly drawing n elements without replacement from a set of N elements, r of which are S's (for success) and (N - r) of which are F's (for failure). The hypergeometric random variable x is the number of S's in the draw of n elements.
$p(x) = \dfrac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$
$\mu = \dfrac{nr}{N}$
$\sigma^2 = \dfrac{r(N-r)\,n(N-n)}{N^2(N-1)}$
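A small sketch can confirm that the mean and variance formulas agree with a direct sum over the probability function. The values N = 10, r = 4, n = 3 are hypothetical, chosen only to keep the numbers easy to check by hand, and `hypergeom_pmf` is an assumed helper name.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(x successes) when drawing n of N elements without replacement, r of which are successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Hypothetical population: N = 10 elements, r = 4 successes, sample of n = 3.
N, r, n = 10, 4, 3

# x cannot exceed r or n, and n - x cannot exceed the N - r failures.
support = range(max(0, n - (N - r)), min(n, r) + 1)
probs = {x: hypergeom_pmf(x, N, r, n) for x in support}

mean = sum(x * q for x, q in probs.items())
var = sum((x - mean) ** 2 * q for x, q in probs.items())

mean_formula = n * r / N
var_formula = r * (N - r) * n * (N - n) / (N**2 * (N - 1))

print(probs)
print(mean, mean_formula)
print(var, var_formula)
```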
Sampling with and without replacement
Suppose a sample is drawn from a large dichotomous population, and let X be the number of successes in the sample. If the sample is drawn without replacement, then X is hypergeometric. However, if the sample size is small relative to the population size, binomial probabilities provide a good approximation, and in practice X is modeled as binomial.
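To see how close the two models are when the sample is small relative to the population, the sketch below compares hypergeometric and binomial probabilities term by term. The population size N = 10000 with r = 4000 successes and the sample size n = 15 are hypothetical values, and both function names are assumptions for this example.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """Sampling without replacement: P(x successes in a sample of n)."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """Sampling with replacement (or independent trials): P(x successes in n trials)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical large population and small sample.
N, r, n = 10000, 4000, 15
p = r / N

max_gap = max(abs(hypergeom_pmf(x, N, r, n) - binom_pmf(x, n, p)) for x in range(n + 1))

for x in (4, 6, 8):
    print(x, round(hypergeom_pmf(x, N, r, n), 4), round(binom_pmf(x, n, p), 4))
print(max_gap)
```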
Continuous random variables
probability density, mean, s.d.
Normal distribution
A random variable X is continuous if P(X = x) = 0 for all x
Uniform random variable
Suppose X is a number picked at random from the interval [0, 1].
P(X = x) = 0 for all x.
But the probability that X falls in an interval within [0, 1] is equal to the length of the interval, and is nonzero.
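The "probability equals length" rule can be written as a tiny function. This is a sketch; the name `uniform01_prob` is an assumption, and any part of the interval that falls outside [0, 1] is given probability 0.

```python
def uniform01_prob(a, b):
    """P(a <= X <= b) for X picked at random from [0, 1]."""
    lo = max(a, 0.0)            # clip the interval to [0, 1]
    hi = min(b, 1.0)
    return max(hi - lo, 0.0)    # length of the overlap, never negative

print(uniform01_prob(0.2, 0.5))   # an interval: probability is its length
print(uniform01_prob(0.3, 0.3))   # a single point: probability 0
print(uniform01_prob(-1.0, 2.0))  # covers all of [0, 1]: probability 1
```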
Probabilities involving X can be modeled by the following picture, which identifies probability with area under a function.
[Figure: a function plotted with horizontal axis from -1.0 to 2.0 and vertical axis from 0.0 to 2.0]
Continuous Probability Density Function
The graphical form of the probability distribution for a continuous random variable x is a smooth curve.
Density curves
A density curve is a mathematical model of a distribution.
The total area under the curve, by definition, is equal to 1, or 100%.
The area under the curve over a range of values is the probability that an observation falls in that range.
[Figure: histogram of a sample with the smoothed density curve that theoretically describes the population]
Density curves come in any imaginable shape. Some are well known mathematically and others aren't.
Our interest is in a special type of density called the Normal density, or Normal distribution.