Special distributions August 22, 2017 STAT 101 Class 4 Slide 1
Outline of Topics
1 Motivation
2 Bernoulli and binomial
3 Poisson
4 Uniform
5 Exponential
6 Normal
What do distributions tell us?
Financial crisis data: '82 Mexican, '84 S&L, '87 Black Monday, '91 Commercial RE, '97 Asian, '98 LTCM, '00 Dotcom, '07 Subprime, '12 Euro?
(1) How many crises (X) will occur in the next decade?
(2) When (X) will the next crisis occur?
Both are unknown: each is a random variable, and many types of random variables arise in nature
X is a count in (1) and a time in (2)
Distributions are used to describe the behavior of a random variable
Different distributions are needed for different situations
P(X) or f(x) (the PDF) tells us the likelihood of different values of X
E(X) tells us the average value of X
var(X) tells us how much X is likely to deviate from E(X)
Bernoulli trials [James (aka Jacob) Bernoulli, 1654-1705]
Examples
(1) Outcomes in tosses of a coin (H vs. T)
(2) Outcomes in a series of similar investments (Success vs. Failure)
(3) Outcomes in giving a new treatment to a series of patients (Cured vs. Not cured)
If we consider each toss, each investment, and each treated patient a trial, call the outcome of interest a success, and make the following assumptions:
The outcomes of the trials are independent of one another
P(success) = p, 0 < p < 1, is the same for every trial
then the outcome of each trial is called a Bernoulli random variable. The probability distribution of a Bernoulli random variable is P(success) = p, P(failure) = 1 - P(success) = 1 - p. The sequence of outcomes is called a Bernoulli sequence.
The Binomial distribution
We are often interested in the number of successes in a Bernoulli sequence. For example, what is the chance of X successes when n patients are treated?
The number of successes, X, in a Bernoulli sequence of n trials with P(success) = p has a Binomial distribution with parameters n and p. Sometimes we write X ~ Bin(n, p) for short.
PDF
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, k = 0, 1, ..., n
Explanation The probability of k successes and (n - k) failures equals (3) x (1) x (2), where
(1) p^k: succeeds k times, each with probability p
(2) (1-p)^{n-k}: fails (n - k) times, each with probability 1 - p
(3) \binom{n}{k}: there are \binom{n}{k} possible orderings of k successes and (n - k) failures
Mean and variance
E(X) = np, var(X) = np(1-p)
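The binomial PDF can be checked numerically with a short sketch using only Python's standard library (the helper name `binom_pmf` is ours, not from any package):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# A valid PDF: the probabilities over k = 0, ..., n must sum to 1
n, p = 10, 0.7
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
print(total)                # ~1.0
print(binom_pmf(2, n, p))   # ~0.0014
```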
Expectation and variance
E(X) = 0 \cdot P(X = 0) + 1 \cdot P(X = 1) + ... + n \cdot P(X = n)
     = \sum_{k=0}^{n} k P(X = k)
     = \sum_{k=0}^{n} k \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}
     = np, after some algebra
var(X) = E[(X - np)^2]
       = \sum_{k=0}^{n} (k - np)^2 P(X = k)
       = np(1-p), after some algebra
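The "after some algebra" steps can be confirmed numerically by carrying out the sums directly, as a sketch (stdlib only; `binom_pmf` is a helper we define here):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.7
# E(X) = sum over k of k * P(X = k)
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
# var(X) = sum over k of (k - np)^2 * P(X = k)
var = sum((k - n * p)**2 * binom_pmf(k, n, p) for k in range(n + 1))
print(mean, var)  # ~7.0 and ~2.1, i.e., np and np(1-p), up to rounding
```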
Uses of expectation and variance
Suppose we have two Binomial distributions

P(X = k)    k = 0    k = 1                k = 2                              ...  k = n
Bin(n, .3)  (.7)^n   n(.3)(.7)^{n-1}      \frac{n(n-1)}{2}(.3)^2(.7)^{n-2}   ...  (.3)^n
Bin(n, .4)  (.6)^n   n(.4)(.6)^{n-1}      \frac{n(n-1)}{2}(.4)^2(.6)^{n-2}   ...  (.4)^n

It is difficult to compare the characteristics of the distributions based on the PDF

                      Bin(n, .3)     Bin(n, .4)
E(X) = np             .3n         <  .4n
var(X) = np(1-p)      .21n        <  .24n

So on average, there are more successes from Bin(n, .4); however, the outcome from Bin(n, .4) is less predictable since its variance is higher
E(X) and var(X) provide simple and useful characterizations of X
Example
Consider the number of successes, X, in 10 patients given a treatment with success probability 0.7 and independent outcomes. Then X ~ Bin(10, .7).
(a) What is the probability that 2 out of 10 are successes?
P(X = 2) = \binom{10}{2} (.7)^2 (1-.7)^{10-2} \approx .0014
(b) What is the probability of seeing no more than 2 successes?
P(X \le 2) = \sum_{k=0}^{2} \binom{10}{k} (.7)^k (1-.7)^{10-k}
           = \binom{10}{0} (.7)^0 (.3)^{10} + \binom{10}{1} (.7)^1 (.3)^9 + \binom{10}{2} (.7)^2 (.3)^8 \approx .0016
(c) What is the expected number of successes in 10 patients?
E(X) = np = 10 \times .7 = 7
(d) What is the variance?
var(X) = np(1-p) = 10 \times .7 \times .3 = 2.1
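Parts (a) and (b) can be reproduced in a few lines (a sketch using only the standard library; `binom_pmf` is our own helper):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.7
p2 = binom_pmf(2, n, p)                             # part (a): P(X = 2)
p_le2 = sum(binom_pmf(k, n, p) for k in range(3))   # part (b): P(X <= 2)
print(round(p2, 4), round(p_le2, 4))  # 0.0014 0.0016
```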
What are parameters?
Let X ~ Bin(n, p); then n and p are called parameters. Different parameter values allow us to use the same probability distribution (e.g., Binomial) to describe different situations that share a common thread: n trials with independent outcomes and a constant success probability p.
Examples
Number of heads in 3 tosses of a coin, X ~ Bin(3, .5)
Number of 6's in 3 rolls of a die, X ~ Bin(3, 1/6)
Number of successes in 4 investments each with p = .4, X ~ Bin(4, .4)

P(X = k)        k = 0      k = 1            k = 2            k = 3         k = 4
n = 3, p = .5   (.5)^3     3(.5)^3          3(.5)^3          (.5)^3        -
n = 3, p = 1/6  (5/6)^3    3(1/6)(5/6)^2    3(1/6)^2(5/6)    (1/6)^3       -
n = 4, p = .4   (.6)^4     4(.4)(.6)^3      6(.4)^2(.6)^2    4(.4)^3(.6)   (.4)^4
Poisson distribution (Siméon-Denis Poisson, 1781-1840)
The Poisson distribution describes the number of events, X, occurring in a fixed unit of time or space, when events occur independently and at a constant average rate, λ. We write X ~ Poisson(λ) for short.
Example If financial crises occur independently of each other with an average of 5 per decade, then the number of financial crises, X, in the next decade may follow a Poisson(5) distribution.
PDF
P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}, k = 0, 1, 2, ...
Mean and variance
E(X) = var(X) = λ
Sum of independent Poisson random variables
If X and Y are independent, and X ~ Poisson(λ), Y ~ Poisson(µ), then X + Y ~ Poisson(λ + µ)
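Both the PDF and the additivity property can be checked numerically. A minimal sketch (stdlib only; `pois_pmf` is our own helper), verifying that the convolution of Poisson(2) and Poisson(3) equals Poisson(5):

```python
from math import exp, factorial

def pois_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): lam^k e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)

# If X ~ Poisson(2) and Y ~ Poisson(3) are independent, then for each k
# P(X + Y = k) = sum_j P(X = j) P(Y = k - j), the Poisson(5) probability.
for k in range(8):
    conv = sum(pois_pmf(j, 2) * pois_pmf(k - j, 3) for j in range(k + 1))
    assert abs(conv - pois_pmf(k, 5)) < 1e-12

print(round(pois_pmf(0, 5), 4))  # P(no crises next decade) = e^(-5), ~0.0067
```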
Poisson approximation to the binomial distribution
In a unit of time, X ~ Poisson(λ); what is P(X = k)?
Split the unit of time into a large number n of intervals, each so short that it contains either one event, with small probability p = λ/n, or no events, so that the total number of events Y ~ Bin(n, p = λ/n):
X: average number of events = λ
Y: average number of events = np = n(λ/n) = λ
P(Y = k) = \binom{n}{k} p^k (1-p)^{n-k}
         = \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k}
         \to \frac{\lambda^k}{k!} e^{-\lambda} = P(X = k), as n \to \infty
Poisson approximation Example
Suppose, every day, there is a constant probability p = 0.0001 that an earthquake will occur. Then the number of days with an earthquake, X, in 10 years is Bin(3650, 0.0001), which is well approximated by Poisson(λ = np = 0.365).

k                                   0         1         2         3         4         5
\binom{n}{k} p^k (1-p)^{n-k}        0.694184  0.253402  0.046238  0.005623  0.000513  0.000037
\frac{\lambda^k}{k!} e^{-\lambda}   0.694197  0.253382  0.046242  0.005626  0.000513  0.000037

Why use the Poisson approximation?
(1) \binom{n}{k} for large n was very difficult to evaluate before computers
(2) Suppose the average number of fish you can catch from a stream is 5 a day. How many fish will you catch tomorrow? We do NOT know n or p, but we know np = 5 = λ
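The agreement in the table above can be reproduced directly, as a sketch (stdlib only):

```python
from math import comb, exp, factorial

n, p = 3650, 0.0001
lam = n * p  # 0.365

for k in range(6):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    pois = lam**k * exp(-lam) / factorial(k)
    # The two rows of the table agree to about 4-5 decimal places
    assert abs(binom - pois) < 5e-5

print("Poisson approximation holds for k = 0..5")
```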
Uniform distribution
A continuous uniform distribution is used for situations where X is equally likely to assume any value in an interval [a, b]. We write X ~ U(a, b). There are two parameters, a and b.
PDF and CDF
f(x) = \frac{1}{b-a} if a \le x \le b, and 0 otherwise
F(x) = 0 if x < a; \frac{x-a}{b-a} if a \le x \le b; 1 if x > b
Mean and variance
E(X) = \frac{a+b}{2}, var(X) = \frac{(b-a)^2}{12}
[Figure: density of U(a, b), a horizontal line at height 1/(b-a) between a and b]
Example
If I asked one of you to give me a number between 0 and 100, what is the chance that the number, X, will be between 1 and 15?
We can assume X ~ U(0, 100), so a = 0, b = 100.
P(1 < X \le 15) = P(X \le 15) - P(X \le 1) = F(15) - F(1) = \frac{15-0}{100} - \frac{1-0}{100} = \frac{14}{100}
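The same calculation as a minimal sketch (`uniform_cdf` is our own helper implementing the CDF above):

```python
def uniform_cdf(x, a, b):
    """F(x) for X ~ U(a, b): 0 below a, (x-a)/(b-a) on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# P(1 < X <= 15) = F(15) - F(1) for X ~ U(0, 100)
prob = uniform_cdf(15, 0, 100) - uniform_cdf(1, 0, 100)
print(prob)  # ~0.14
```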
Exponential Distribution
If we are interested in the time T to an event (e.g., financial crisis, pop quiz, death), then the exponential distribution may be used. The exponential distribution has one parameter, λ > 0. We write T ~ Exp(λ).
PDF and CDF
f(t) = \lambda e^{-\lambda t} for t > 0, and 0 for t \le 0
F(t) = 1 - e^{-\lambda t} for t > 0, and 0 for t \le 0
Mean and variance
E(T) = \frac{1}{\lambda}, var(T) = \frac{1}{\lambda^2}
[Figure: density of Exp(λ), decaying exponentially from height λ at t = 0]
Connection to the Poisson process
[Figure: a timeline with crises occurring at times T_1, T_1 + T_2, T_1 + T_2 + T_3, ...; X_t counts the crises up to time t]
Number of crises in 1 unit of time: X_1 ~ Poisson(λ)
Number of crises in t units of time: X_t ~ Poisson(tλ)
X_t, as t varies, is a Poisson process
The times between crises, T_1, T_2, ..., are independent Exp(λ) random variables:
F(t) = P(T_1 \le t) = 1 - P(T_1 > t)
     = 1 - P(1st event takes longer than t)
     = 1 - P(no events in time 0 to t)
     = 1 - P(X_t = 0)
     = 1 - \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = 1 - e^{-\lambda t} if t \ge 0
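This connection can be illustrated by simulation: draw exponential gaps between events and count how many events land in one unit of time. A sketch with a fixed seed (stdlib only; the function name and trial count are our choices):

```python
import random
from math import exp

random.seed(42)
lam = 5.0  # average number of events per unit of time

def events_in_unit_time(lam):
    """Accumulate Exp(lam) gaps until total time exceeds 1; return the count."""
    t, count = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            return count
        count += 1

counts = [events_in_unit_time(lam) for _ in range(5000)]
mean_count = sum(counts) / len(counts)
frac_zero = sum(c == 0 for c in counts) / len(counts)
print(mean_count)            # close to lam = 5, as Poisson(5) predicts
print(frac_zero, exp(-lam))  # both near P(X_1 = 0) = e^(-5)
```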
Example
If the number of financial crises, X, in the next decade follows a Poisson(λ = 5) distribution, then the time to the next crisis T ~ Exp(λ = 5), where the unit of time is a decade.
The probability the next crisis will be in this decade is
P(T \le 1) = F(1) = 1 - e^{-5(1)} \approx 0.993
and the expected time to the next crisis is E(T) = \frac{1}{5} decade, or 2 years
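The same two numbers, computed directly (a minimal sketch using the standard library):

```python
from math import exp

lam = 5  # crises per decade, so time is measured in decades
p_this_decade = 1 - exp(-lam * 1)     # F(1) = 1 - e^(-5)
expected_wait_years = (1 / lam) * 10  # E(T) = 1/lam decades, in years
print(round(p_this_decade, 3))  # 0.993
print(expected_wait_years)      # 2.0
```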
Memoryless property of the exponential distribution
Suppose we have already waited time s for an event. What is the chance we have to wait for another time t?
P(T > s + t | T > s) = \frac{P(T > s + t, T > s)}{P(T > s)}
                     = \frac{P(T > s + t)}{P(T > s)}
                     = \frac{1 - P(T \le s + t)}{1 - P(T \le s)}
                     = \frac{1 - (1 - e^{-\lambda(s+t)})}{1 - (1 - e^{-\lambda s})}
                     = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}}
                     = e^{-\lambda t} = P(T > t) if t \ge 0
The expression e^{-\lambda t} does NOT depend on s. Therefore, the exponential distribution "forgets" that we have already waited for time s; this is called the memoryless property.
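A quick numerical check that the conditional probability really is the same for every waiting time s (a sketch; the values of λ, t, and s are arbitrary choices):

```python
from math import exp

lam, t = 0.8, 1.5

def surv(x, lam):
    """Survival function P(T > x) = e^(-lam x) for T ~ Exp(lam)."""
    return exp(-lam * x)

# P(T > s + t | T > s) for several different waiting times s
for s in (0.0, 1.0, 5.0, 20.0):
    cond = surv(s + t, lam) / surv(s, lam)
    assert abs(cond - surv(t, lam)) < 1e-9  # always equals P(T > t)

print("P(T > s + t | T > s) = P(T > t) for every s tried")
```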
Normal (Gaussian) distribution (Carl Friedrich Gauss, 1777-1855)
The normal distribution was introduced as a distribution for measurement errors. The idea is, if we are asked to make a guess of something (e.g., the height of a building, the age of a person, etc.), then the error distribution will probably look like a bell-shaped histogram centered at 0.
[Figure: histogram of measurement errors from -60 to 60, symmetric and peaked at 0]
Who was Carl Friedrich Gauss?
"Ask her to wait a moment - I am almost done." Carl Friedrich Gauss (1777-1855), while working, when informed that his wife was dying [in Men of Mathematics (1937) by E. T. Bell].
Normal (Gaussian) distribution
The Normal distribution has two parameters, the mean, µ, and the variance, σ^2, where -\infty < µ < \infty and σ^2 > 0. We write X ~ N(µ, σ^2) (sometimes as N(µ, σ)).
PDF and CDF
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/(2\sigma^2)}, -\infty < x < \infty
F(x) has no closed form and must be approximated by a computer.
Mean and variance
E(X) = µ, var(X) = σ^2
Empirical rules
68% of the values are within ±1 SD (σ) of µ
95% of the values are within ±2 SD (σ) of µ
99.7% of the values are within ±3 SD (σ) of µ
[Figure: the bell-shaped density of N(µ, σ^2), marked at µ ± σ, µ ± 2σ, µ ± 3σ]
Empirical rules: Illustration
Adult height ~ N(µ = 1.6, σ^2 = 0.1^2): 95% of heights fall between 1.4 and 1.8 (µ ± 2σ)
Investment return ~ N(µ = 0, σ^2 = 3^2): 95% of returns fall between -6 and 6 (µ ± 2σ)
For all normal distributions, the same percentage of the population falls within µ ± kσ
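The 68-95-99.7 rule can be verified with the standard normal CDF, which the standard library's error function gives exactly: Φ(z) = (1 + erf(z/√2))/2. A minimal sketch (`norm_cdf` is our own helper):

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1 + erf(z / sqrt(2)))

for k in (1, 2, 3):
    within = norm_cdf(k) - norm_cdf(-k)  # P(mu - k*sigma < X < mu + k*sigma)
    print(k, round(within, 3))  # ~0.683, 0.954, 0.997: the 68-95-99.7 rule
```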
Z-score
For X ~ N(µ, σ^2), the Z-score is
Z = \frac{X - \mu}{\sigma}
Z = 0 means X = µ; Z > 0 means X > µ; Z < 0 means X < µ. The magnitude of Z gives the distance of X from µ in units of σ.
Example (Adult height, cont'd): X = 1.5, µ = 1.6, σ = 0.1, then
Z = \frac{1.5 - 1.6}{0.1} = -1 (sign: below the mean; magnitude: 1 SD away)
Given Z, we can find X by X = Zσ + µ, and vice versa
Z simply re-expresses X, so there is no difference between using Z or X
Z ~ N(0, 1), the standard normal distribution
Probability calculations under a normal distribution
Example (Adult height, cont'd): P(X > 1.75) = ?
z = \frac{1.75 - 1.6}{0.1} = 1.5
P(X > 1.75) = P\left(Z > \frac{1.75 - 1.6}{0.1}\right) = P(Z > 1.5)
The proportion taller than 1.75 m equals the proportion more than 1.5 SD above average
P(Z > z) for all interesting values of z is given in tables (e.g., Table A1)
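The same standardize-then-look-up step, computed instead of looked up (a sketch; `norm_cdf` is our own helper built on the standard library's `erf`):

```python
from math import erf, sqrt

def norm_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 1.6, 0.1
z = (1.75 - mu) / sigma     # standardize: z = 1.5
p_taller = 1 - norm_cdf(z)  # P(X > 1.75) = P(Z > 1.5)
print(round(z, 1), round(p_taller, 4))  # 1.5 0.0668, matching Table A1
```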
Use a normal table to calculate probabilities
Table A1. Areas under the standard normal curve beyond z*, i.e., the shaded area P(Z > z*) to the right of z*. The row gives z* to one decimal; the column gives the second decimal.

z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.1  0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
0.2  0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
0.3  0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
0.4  0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
0.5  0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
0.6  0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
...
Example: P(Z > 1.5)
Look up row 1.5, column 0.00:

z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.1  0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
...
1.4  0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
1.5  0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
1.6  0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
1.7  0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
1.8  0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
...

P(Z > 1.5) = 0.0668
Example: P(Z < -0.84)
By the symmetry of the standard normal curve, P(Z < -0.84) = P(Z > 0.84). Look up row 0.8, column 0.04:

z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.1  0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
...
0.7  0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
0.8  0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
0.9  0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
...

P(Z < -0.84) = P(Z > 0.84) = 0.2005
Example: P(1.2 < Z < 1.89)

z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
...
1.2  0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
1.3  0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
1.4  0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
1.5  0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
1.6  0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
1.7  0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
1.8  0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
...

P(1.2 < Z < 1.89) = P(Z > 1.2) - P(Z > 1.89) = 0.1151 - 0.0294 = 0.0857
Example: X ~ N(µ = 15, σ^2 = 16), 85th percentile = ?
The 85th percentile of Z leaves area 0.85 to the left and 0.15 in the upper tail. Search the body of Table A1 for 0.1500:

z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
...
0.7  0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
0.8  0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
0.9  0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
1.0  0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
1.1  0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
1.2  0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
...

85th percentile of Z \approx 1.04
85th percentile of X = Zσ + µ \approx 1.04(4) + 15 = 19.16
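Without a table, the percentile can be found by inverting the CDF numerically. A sketch using bisection (the helper names `norm_cdf` and `norm_ppf` are our own; the small difference from 19.16 comes from the table rounding z to 1.04):

```python
from math import erf, sqrt

def norm_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

def norm_ppf(q, lo=-10.0, hi=10.0):
    """Invert the standard normal CDF by bisection (Phi is increasing)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if norm_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mu, sigma = 15, 4      # X ~ N(15, 16), so sigma = sqrt(16) = 4
z85 = norm_ppf(0.85)   # ~1.0364, vs. the table's rounded 1.04
x85 = z85 * sigma + mu
print(round(z85, 4), round(x85, 2))  # 1.0364 19.15
```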