Business Statistics PROBABILITY DISTRIBUTIONS
CONTENTS
  Probability distribution functions (discrete)
  Characteristics of a discrete distribution
  Example: uniform (discrete) distribution
  Example: Bernoulli distribution
  Example: binomial distribution
  Probability density functions (continuous)
  Characteristics of a continuous distribution
  Example: uniform (continuous) distribution
  Example: normal (or Gaussian) distribution
  Example: standard normal distribution
  Back to the normal distribution
  Approximations to distributions
  Old exam question
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
A sample space is called discrete when its elements can be counted
We will code the elements of a discrete sample space $S$ as $\{1, 2, 3, \dots, n\}$ or $\{0, 1, 2, \dots, n-1\}$
Examples
  die: $x \in \{1,2,3,4,5,6\}$, so $S = \{1,2,3,4,5,6\}$
  coin: $x \in \{0,1\}$
  number of broken TV sets: $x \in \{0,1,2,\dots\}$
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
Distribution function $P(x) = P(X = x)$
  the probability that the (discrete) random variable $X$ assumes the value $x$
  alternative notation: $P_X(x)$
Note our convention:
  capital letters ($X$) for random variables
  lowercase letters ($x$) for values
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
Example die:
  $P(x) = \begin{cases} 1/6 & \text{if } x \in \{1,2,3,4,5,6\} \\ 0 & \text{otherwise} \end{cases}$
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
Example: flipping a coin 3 times
  sample space $S = \{HHH, HHT, HTH, \dots, TTT\}$
  define the random variable $X$ = number of heads
  distribution function
  $P(x) = \begin{cases} 1/8 & \text{if } x = 0 \\ 3/8 & \text{if } x = 1 \\ 3/8 & \text{if } x = 2 \\ 1/8 & \text{if } x = 3 \\ 0 & \text{otherwise} \end{cases}$
  or: $P_X(0) = 1/8$, $P_X(1) = 3/8$, $P_X(2) = 3/8$, $P_X(3) = 1/8$
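The distribution of the number of heads can be checked by brute force. The sketch below is not part of the slides; it assumes a fair coin, so that all 8 outcomes are equally likely, and the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 2**3 equally likely outcomes of three coin flips (assumes a fair coin).
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# X = number of heads in each outcome; count how often each value occurs.
counts = Counter(outcome.count("H") for outcome in outcomes)

# P(x) = (number of outcomes with x heads) / (total number of outcomes).
pdf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in pdf.items():
    print(f"P(X = {x}) = {p}")
```

Using exact fractions rather than floats keeps the result directly comparable with the 1/8, 3/8, 3/8, 1/8 values on the slide.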
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
$P(x)$ is a (discrete) probability distribution function (pdf or PDF)
  $P(x) = P(X = x)$ expresses the probability that $X = x$
A random variable $X$ that is distributed with pdf $P$ is written as $X \sim P$
Some properties of the pdf:
  $0 \le P(x) \le 1$: a probability is always between 0 and 1
  $\sum_{x \in S} P(x) = 1$: the probabilities of all elementary outcomes add up to 1
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
A pdf may have one or more parameters to denote a collection of different but similar pdfs
Example: a regular die with $m$ faces
  $P(X = x; m) = P_X(x; m) = P(x; m) = \frac{1}{m}$ (for $x = 1, \dots, m$)
  $X \sim P(m)$
  [Figure: dice with m = 4, m = 6, m = 8, m = 12, m = 20 faces]
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
In addition to the (discrete) probability distribution function (pdf)
  $P(X = x) = P_X(x) = P(x)$
we define the (discrete) cumulative distribution function (cdf or CDF)
  $F(x) = F_X(x) = P(X \le x)$
and therefore
  $F(x) = \sum_{k=-\infty}^{x} P(X = k) = \sum_{k=-\infty}^{x} P(k)$
Depending on how we count, you may also start at $k = 0$ or $k = 1$
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
Example die: $P(X = 2) = \frac{1}{6}$, but $P(X \le 2) = P(X = 1) + P(X = 2) = \frac{1}{3}$
Some properties of the cdf:
  $F(-\infty) = 0$ and $F(\infty) = 1$
  monotonically increasing (never decreasing)
PROBABILITY DISTRIBUTION FUNCTIONS (DISCRETE)
[Figure: a discrete pdf and the corresponding cdf]
CHARACTERISTICS OF A DISCRETE DISTRIBUTION
Expected value of $X$
  $E(X) = \sum_{i=1}^{N} x_i P(X = x_i) = \sum_{i=1}^{N} x_i P(x_i)$
Example die with $P(1) = P(2) = \dots = P(6) = \frac{1}{6}$
  $E(X) = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} = \frac{7}{2} = 3\tfrac{1}{2}$
Interpretation: mean (average)
  alternative notation: $\mu$ or $\mu_X$, so $E(X) = \mu_X$
Note the difference between $\mu$ and the sample mean $\bar{x}$
  e.g., rolling a specific die $n = 100$ times may return a mean $\bar{x} = 3.72$ or $3.43$
  while $\mu = 7/2$, always (property of the die, property of the "population")
CHARACTERISTICS OF A DISCRETE DISTRIBUTION
Variance
  $\mathrm{var}(X) = \sum_{i=1}^{N} \left(x_i - E(X)\right)^2 P(x_i)$
Interpretation: dispersion
  alternative notation: $\sigma^2$ or $\sigma_X^2$ or $V(X)$, so $\mathrm{var}(X) = \sigma_X^2$
Note the difference between $\sigma^2$ and the sample variance $s^2$
  e.g., rolling a specific die 100 times may return a variance $s^2 = 2.86$ or $3.04$
  while $\sigma^2 = \frac{35}{12}$, always (property of the die, property of the "population")
And of course: standard deviation $\sigma_X = \sqrt{\mathrm{var}(X)}$
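The two definitions can be evaluated directly for the fair die. This is a small sketch added for illustration, assuming the pdf P(x) = 1/6 for x = 1, ..., 6; the names are not from the slides.

```python
from fractions import Fraction

# pdf of a fair die: P(x) = 1/6 for x = 1..6 (assumption: the die is fair).
faces = range(1, 7)
P = {x: Fraction(1, 6) for x in faces}

# E(X) = sum of x_i * P(x_i)
mean = sum(x * P[x] for x in faces)

# var(X) = sum of (x_i - E(X))^2 * P(x_i)
variance = sum((x - mean) ** 2 * P[x] for x in faces)

# The standard deviation is the square root of the variance.
std_dev = float(variance) ** 0.5

print(mean, variance, round(std_dev, 3))
```

These are the population values; a sample mean or sample variance from actual rolls will fluctuate around them.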
CHARACTERISTICS OF A DISCRETE DISTRIBUTION
Transformation rules of random variables $X$ and $Y$
For means:
  $E(k + X) = k + E(X)$
  $E(aX) = a\,E(X)$
  $E(X + Y) = E(X) + E(Y)$
For variances:
  $\mathrm{var}(k + X) = \mathrm{var}(X)$
  $\mathrm{var}(aX) = a^2\,\mathrm{var}(X)$
  if $X$ and $Y$ are independent: $\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y)$
EXAMPLE: UNIFORM DISTRIBUTION
Generalization of a fair die: uniform discrete distribution
  equal probability of integer outcomes between $a$ and $b$
  conditions: $a, b \in \mathbb{Z}$, $a < b$
  zero probability elsewhere
pdf: $P(x; a, b) = \begin{cases} \frac{1}{b - a + 1} & x \in \mathbb{Z} \text{ and } x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$
Examples:
  coin: $a = 0$, $b = 1$
  die: $a = 1$, $b = 6$
Random variable: $X \sim U(a, b)$
EXAMPLE: UNIFORM DISTRIBUTION
Example: randomly choose a number between 1 and 100 with equal probability and denote it by $X$
  random variable: $X \sim U(1, 100)$
  pdf: $P(x) = P(X = x) = \frac{1}{100}$ ($x \in \{1, 2, \dots, 100\}$)
  cdf: $F(x) = P(X \le x) = \frac{x}{100}$ ($x \in \{1, 2, \dots, 100\}$)
  expected value: $E(X) = 50\tfrac{1}{2}$
  variance: $\mathrm{var}(X) = \frac{9999}{12} = 833.25$
Sample ($n = 1000$):
  values (e.g.): 45, 96, 33, 7, 44, 96, 20, ...
  mean: $\bar{x} = 50.92$ (e.g.)
  variance: $s_x^2 = 823.25$ (e.g.)
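The contrast between population values and sample values can be reproduced with a short simulation. This sketch is an illustration only: the seed is arbitrary, the variance uses the standard closed form ((b - a + 1)^2 - 1)/12 for a discrete uniform distribution (which agrees with the 9999/12 above), and the sample figures it prints will differ from the 50.92 and 823.25 quoted on the slide.

```python
import random
from fractions import Fraction
from statistics import mean, variance

a, b = 1, 100
n_values = b - a + 1

# Population characteristics of U(a, b) on the integers a..b.
mu = Fraction(a + b, 2)                    # (a + b) / 2
sigma2 = Fraction(n_values**2 - 1, 12)     # ((b - a + 1)^2 - 1) / 12

# Draw a sample of n = 1000 values; the seed is arbitrary and only makes the run repeatable.
rng = random.Random(2024)
sample = [rng.randint(a, b) for _ in range(1000)]

sample_mean = mean(sample)       # x-bar, varies from sample to sample
sample_var = variance(sample)    # s^2 with the n - 1 denominator

print(float(mu), float(sigma2))
print(sample_mean, sample_var)
```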
EXAMPLE: BERNOULLI DISTRIBUTION
Bernoulli experiment
  random experiment with 2 discrete outcomes (coin type)
  head, true, success, female: $X = 1$
  tail, false, fail, male: $X = 0$
Bernoulli distribution
Examples:
  winning a prize in a lottery (buying one ticket)
  your luggage arrives in time at a destination
Probability of success is the parameter $\pi$ (with $0 \le \pi \le 1$)
  $P(1) = P(X = 1) = \pi$
  $P(0) = P(X = 0) = 1 - \pi$
Random variable: $X \sim \mathrm{Bernoulli}(\pi)$ or $X \sim \mathrm{alt}(\pi)$
EXAMPLE: BERNOULLI DISTRIBUTION
Expected value $E(X) = \pi$ (obviously!)
Variance $\mathrm{var}(X) = \pi(1 - \pi)$
  variance zero when $\pi = 0$ or $\pi = 1$ (obviously!)
  variance maximal when $\pi = 1 - \pi = \frac{1}{2}$ (obviously!)
pdf: $p(x; \pi) = \begin{cases} \pi & \text{if } x = 1 \\ 1 - \pi & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases}$
cdf: (not so interesting)
EXAMPLE: BINOMIAL DISTRIBUTION
Repeating a Bernoulli experiment $n$ times
  $X$ is the total number of successes
  $P(X = x)$ is the probability of $x$ successes in the sample
  $X = X_1 + X_2 + \dots + X_n$, where $X_i$ is the outcome of Bernoulli experiment number $i = 1, 2, \dots, n$
$X$ has a binomial distribution
EXAMPLE: BINOMIAL DISTRIBUTION
Examples
  flip a coin 10 times: $X$ is the number of heads up
  roll 100 dice: $X$ is the number of sixes
  produce 1000 TV sets: $X$ is the number of broken sets
What is important?
  the number of repetitions ($n$)
  the probability of success ($\pi$) per item
  the constancy of $\pi$
  the independence of the experiments
EXAMPLE: BINOMIAL DISTRIBUTION
Expected value $E(X) = n\pi$ (obviously!)
Variance $\mathrm{var}(X) = n\pi(1 - \pi)$
  minimum (0) when $\pi = 0$ or $\pi = 1$ (obviously!)
  maximum for given $n$ when $\pi = 1 - \pi = \frac{1}{2}$ (obviously!)
pdf: $p(x; n, \pi) = \frac{n!}{x!\,(n - x)!}\,\pi^x (1 - \pi)^{n - x}$ ($x \in \{0, 1, 2, \dots, n\}$)
cdf: $F(x; n, \pi) = \sum_{k=0}^{x} p(k; n, \pi)$
Random variable: $X \sim \mathrm{bin}(n, \pi)$ or $X \sim \mathrm{binom}(n, \pi)$
Recall the factorial function: $5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120$
EXAMPLE: BINOMIAL DISTRIBUTION
Example: roll 10 dice: what is the distribution of $X$ = number of sixes?
What is the probability model?
  you repeat an experiment 10 times ($n = 10$)
  with a probability $\pi = \frac{1}{6}$ of success
  and a probability $1 - \pi = \frac{5}{6}$ of failure per experiment
What is the probability distribution?
  $X \sim \mathrm{bin}(10, \frac{1}{6})$
  where the random variable $X$ represents the total number of sixes
  so $X$ is not the outcome of a roll of the die!
$E(X) = 10 \cdot \frac{1}{6} = 1\tfrac{2}{3}$, so we expect on average $1\tfrac{2}{3}$ sixes in 10 rolls
$\mathrm{var}(X) = 10 \cdot \frac{1}{6} \cdot \frac{5}{6} = \frac{25}{18}$
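The shortcut formulas for the binomial mean and variance can be checked against the general definitions of E(X) and var(X) by summing over the whole pdf. A minimal sketch for the 10-dice example, using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 10
pi_success = Fraction(1, 6)   # probability of a six on one roll

# Binomial pdf: p(x; n, pi) = C(n, x) * pi^x * (1 - pi)^(n - x)
pdf = {x: comb(n, x) * pi_success**x * (1 - pi_success) ** (n - x)
       for x in range(n + 1)}

# General definitions applied to this pdf.
mean = sum(x * p for x, p in pdf.items())
variance = sum((x - mean) ** 2 * p for x, p in pdf.items())

# Shortcut formulas from the slide.
mean_formula = n * pi_success
variance_formula = n * pi_success * (1 - pi_success)

print(mean, mean_formula)          # both 5/3
print(variance, variance_formula)  # both 25/18
```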
EXAMPLE: BINOMIAL DISTRIBUTION
Calculating pdf and cdf values
Example: binomial distribution with $n = 8$, $\pi = 0.5$
  what is $P(3) = P(X = 3)$ (pdf)?
  what is $F(3) = P(X \le 3)$ (cdf)?
Different methods:
  using the formula
  using a table
  using Excel
  using online calculators
EXAMPLE: BINOMIAL DISTRIBUTION
pdf solution 1: using the formula
  $P(3; 8, 0.5) = \frac{8!}{3!\,(8 - 3)!}\, 0.5^3 (1 - 0.5)^{8 - 3} = 0.2188$
or
  $P(3; 8, 0.5) = \binom{8}{3}\, 0.5^3 (1 - 0.5)^{8 - 3} = 0.2188$
using the binomial coefficient $\binom{n}{k} = C_k^n = \frac{n!}{k!\,(n - k)!}$
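The formula translates almost literally into code. The helper name `binom_pmf` below is an illustrative choice, not something defined in the slides; `math.comb` computes the binomial coefficient.

```python
from math import comb

def binom_pmf(x, n, pi_success):
    """P(X = x) for X ~ bin(n, pi_success), straight from the formula."""
    return comb(n, x) * pi_success**x * (1 - pi_success) ** (n - x)

p3 = binom_pmf(3, 8, 0.5)
print(f"{p3:.5f}")  # 0.21875, which the table rounds to 0.2188
```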
EXAMPLE: BINOMIAL DISTRIBUTION
pdf solution 2: using the table in Appendix A
  $P(3; 8, 0.50) = 0.2188$
EXAMPLE: BINOMIAL DISTRIBUTION
pdf solution 3: using Excel (or similar software)
  BINOM.DIST(3; 8; 0.5; FALSE) = 0.21875
EXAMPLE: BINOMIAL DISTRIBUTION pdf solution 4: using an online calculator
EXAMPLE: BINOMIAL DISTRIBUTION
At the exam: tables (solution 2)
But: how to do the cdf?
  with calculator or Excel, OK
  and with formula or table?
Use the definition: $F(x) = P(X \le x) = \sum_{k=0}^{x} P(X = k)$
  $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
  use the table, four times
EXAMPLE: BINOMIAL DISTRIBUTION
Example
  $F(3; 8, 0.50) = 0.0039 + 0.0313 + 0.1094 + 0.2188 \approx 0.363$
Note that this table gives a pdf, not a cdf
EXAMPLE: BINOMIAL DISTRIBUTION
Note that the cdf is $F(x) = P(X \le x)$
How to find $P(X < x)$?
  use $P(X \le x - 1) = F(x - 1)$
How to find $P(X > x)$?
  use $1 - P(X \le x) = 1 - F(x)$
Etc.
EXAMPLE: BINOMIAL DISTRIBUTION
Use such rules to use the (pdf) table efficiently ($n = 8$)
  $P(X \le 7) = F(7) = P(0) + P(1) + \dots + P(7)$
Much easier:
  $P(X \le 7) = F(7) = 1 - P(8)$
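The cdf rules for the n = 8, π = 0.5 example can be sketched as follows. The function names are hypothetical helpers; the cdf is built by summing pdf terms, mirroring the "use the table several times" approach.

```python
from math import comb

def binom_pmf(k, n, pi_success):
    """P(X = k) for X ~ bin(n, pi_success)."""
    return comb(n, k) * pi_success**k * (1 - pi_success) ** (n - k)

def binom_cdf(x, n, pi_success):
    """F(x) = P(X <= x), obtained by adding the pdf terms for k = 0..x."""
    return sum(binom_pmf(k, n, pi_success) for k in range(x + 1))

n, pi_success = 8, 0.5

f3 = binom_cdf(3, n, pi_success)               # P(X <= 3): four pdf terms
p_less_than_3 = binom_cdf(2, n, pi_success)    # P(X < 3) = F(2)
p_more_than_3 = 1 - f3                         # P(X > 3) = 1 - F(3)

f7_direct = binom_cdf(7, n, pi_success)        # eight pdf terms
f7_complement = 1 - binom_pmf(8, n, pi_success)  # one pdf term, same answer

print(f3, p_less_than_3, p_more_than_3)
print(f7_direct, f7_complement)
```

The complement route matters most with tables: one lookup instead of eight.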
EXAMPLE: BINOMIAL DISTRIBUTION Example: Context: on average, 20% of the emergency room patients at Greenwood General Hospital lack health insurance In a random sample of 4 patients, what is the probability that at least 2 will be uninsured?
EXAMPLE: BINOMIAL DISTRIBUTION
Binomial model (patient is uninsured or not, $\pi_{\text{uninsured}} = 0.20$)
$X$ is the number of uninsured patients in the sample
  $P(X \ge 2) = P(X = 2) + P(X = 3) + P(X = 4) = 0.1536 + 0.0256 + 0.0016 = 0.1808$
Note that this table gives a pdf, not a cdf
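The same answer follows from the complement rule, which needs only two pdf terms. A brief sketch under the slide's model (n = 4, π = 0.20, independent patients); the names are illustrative.

```python
from math import comb

n, pi_uninsured = 4, 0.20

# pdf values P(X = 0), ..., P(X = 4) for X ~ bin(4, 0.20)
pdf = [comb(n, x) * pi_uninsured**x * (1 - pi_uninsured) ** (n - x)
       for x in range(n + 1)]

# Direct: add the terms for x = 2, 3, 4.
p_direct = sum(pdf[2:])

# Complement: P(X >= 2) = 1 - P(X = 0) - P(X = 1).
p_complement = 1 - pdf[0] - pdf[1]

print([round(p, 4) for p in pdf])
print(round(p_direct, 4), round(p_complement, 4))
```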
PROBABILITY DENSITY FUNCTION (CONTINUOUS)
Discrete distributions
  probability distribution function (pdf): $P(x) = P(X = x)$
  probability of obtaining the value $x$
Continuous distributions
  the probability of obtaining the value $x$ is 0
  define the probability density function (pdf) $f(x)$:
  $P(a \le X \le b) = \int_a^b f(x)\,dx$
  probability of obtaining a value between $a$ and $b$
Compare with the probability distribution function (pdf) $P(X = x)$ for the discrete case
[Figure: the red curve is the pdf $f(x)$; the integral is the grey area under the pdf]
PROBABILITY DENSITY FUNCTION (CONTINUOUS)
So pdf refers to two distinct but related things:
  probability distribution function $P(x)$ (discrete case)
  probability density function $f(x)$ (continuous case)
Note also that the dimensions are different
  $P$ is a dimensionless probability
  example: if $X$ is in kg, the discrete pdf $P(x)$ is dimensionless, while the continuous pdf $f(x)$ is in 1/kg
  because $f(x)\,dx$ should be dimensionless, and $dx$ is in kg
PROBABILITY DENSITY FUNCTION (CONTINUOUS)
In addition to the probability density function
  $f(x) = f_X(x)$
we define the cumulative distribution function (cdf or CDF)
  $F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$
Some properties of the cdf:
  $F(-\infty) = 0$ and $F(\infty) = 1$
  monotonically increasing (never decreasing)
Compare with $F(x) = P(X \le x) = \sum_{k=-\infty}^{x} P(X = k)$ for the discrete case
PROBABILITY DENSITY FUNCTION (CONTINUOUS)
pdf: $P(70 \le X \le 75) = \int_{70}^{75} f(x)\,dx$
cdf: $P(70 \le X \le 75) = F(75) - F(70)$
CHARACTERISTICS OF A CONTINUOUS DISTRIBUTION
Expected value
  $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Example: let $f(x) = 1$ for $x \in [0, 1]$
  $E(X) = \int_0^1 x\,dx = \left[\tfrac{1}{2} x^2\right]_0^1 = \tfrac{1}{2}$
Interpretation: mean (average)
  alternative notation for $E(X)$: $\mu$ or $\mu_X$
Compare with $E(X) = \sum_{i=1}^{n} x_i P(x_i)$ for the discrete case
CHARACTERISTICS OF A CONTINUOUS DISTRIBUTION
Variance
  $\mathrm{var}(X) = \int_{-\infty}^{\infty} \left(x - E(X)\right)^2 f(x)\,dx$
Interpretation: dispersion
  alternative notation for $\mathrm{var}(X)$: $\sigma^2$ or $\sigma_X^2$
Compare with $\mathrm{var}(X) = \sum_{i=1}^{n} \left(x_i - E(X)\right)^2 P(x_i)$ for the discrete case
EXAMPLE: UNIFORM (CONTINUOUS) DISTRIBUTION
Analogy with the uniform discrete distribution: uniform continuous distribution
  equal density for all outcomes between $a$ and $b$
  condition: $a < b$
  zero probability elsewhere
pdf: $f(x; a, b) = \begin{cases} \frac{1}{b - a} & x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$
  or easier: $f(x; a, b) = \frac{1}{b - a}$ ($x \in [a, b]$)
Examples:
  standard uniform deviate: $a = 0$, $b = 1$
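EXAMPLE: UNIFORM (CONTINUOUS) DISTRIBUTION
Example: let $X$ be the exam grade of a randomly selected student
  assume a uniform distribution: $X \sim U(1, 10)$
  what is $P(X \ge 6.5)$?
Solution
  use $P(X \ge 6.5) = 1 - P(X < 6.5) = 1 - P(X \le 6.5)$
  cdf: $P(X \le x) = F(x) = \int_{-\infty}^{x} f(y)\,dy$
  uniform continuous with $a = 1$ and $b = 10$
  pdf: $f(x) = \frac{1}{9}$ ($x \in [1, 10]$)
  cdf: $P(X \le x) = \int_1^x \frac{1}{9}\,dy = \frac{1}{9}(x - 1)$
  answer: $P(X \ge 6.5) = 1 - \frac{1}{9}(6.5 - 1) = \frac{3.5}{9} \approx 0.389$
  or: area of the black rectangle
For a continuous distribution $P(X < x) = P(X \le x)$ because $P(X = x) = 0$
[Figure: pdf of $U(1, 10)$ with height 1/9; $P(X \ge 6.5)$ is the black area between 6.5 and 10]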
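The grade example reduces to one evaluation of the uniform cdf. In this sketch `uniform_cdf` is a hypothetical helper that implements F(x) = (x - a)/(b - a) on [a, b], with 0 below a and 1 above b.

```python
def uniform_cdf(x, a, b):
    """F(x) = P(X <= x) for X ~ U(a, b) on the continuous interval [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 1, 10

# P(X >= 6.5) = 1 - P(X <= 6.5); for a continuous X, "<" and "<=" give the same value.
p_at_least = 1 - uniform_cdf(6.5, a, b)

# Same number as the rectangle area: width (10 - 6.5) times height 1/(b - a).
area = (b - 6.5) * (1 / (b - a))

print(round(p_at_least, 4), round(area, 4))
```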
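EXAMPLE: UNIFORM (CONTINUOUS) DISTRIBUTION
Expected value $E(X) = \frac{a + b}{2}$
Variance $\mathrm{var}(X) = \frac{(b - a)^2}{12}$
  ($= \int_a^b \left(x - \frac{a + b}{2}\right)^2 \frac{1}{b - a}\,dx = \frac{(b - a)^2}{12}$)
pdf $f(x) = \frac{1}{b - a}$
cdf $F(x) = \frac{x - a}{b - a}$
Random variable $X \sim U(a, b)$ or $X \sim \mathrm{hom}(0, \theta)$ or $X \sim \mathrm{hom}(\theta)$ etc.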
EXAMPLE: NORMAL (OR GAUSSIAN) DISTRIBUTION
pdf $f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
  Now, $\pi = 3.1415...$ (the number, not a success probability)
cdf $F(x) = \int_{-\infty}^{x} f(y; \mu, \sigma)\,dy = \;???$
Expected value $E(X) = \mu$
Variance $\mathrm{var}(X) = \sigma^2$
Random variable $X \sim N(\mu, \sigma)$ or $X \sim N(\mu, \sigma^2)$
Remember the notation $\mu$ for expected value and $\sigma^2$ for variance. So here $\mu = \mu$ and $\sigma^2 = \sigma^2$. This is no coincidence!
In a concrete case, indicate the parameter's symbol: $N(12, \sigma = 2)$ or $N(12, \sigma^2 = 4)$
EXAMPLE: NORMAL (OR GAUSSIAN) DISTRIBUTION
Some characteristics
  range: $x \in (-\infty, \infty)$
  pdf has its maximum at $x = \mu$
  pdf is symmetric around $x = \mu$
  not too interesting for $x < \mu - 3\sigma$ and for $x > \mu + 3\sigma$
EXAMPLE: STANDARD NORMAL DISTRIBUTION
Normal distribution with $\mu = 0$ and $\sigma = 1$
  so a 0-parameter distribution: standard normal
pdf $f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$
cdf $F(x) = \int_{-\infty}^{x} f(y)\,dy = \;??? = \Phi(x)$
  with $\Phi(-\infty) = 0$, $\Phi(\infty) = 1$, $\Phi(0) = 0.5$, $\frac{d\Phi(x)}{dx} = f(x)$
  Remember the trick: if you don't know something, just give it a name
Expected value $E(X) = 0$
Variance $\mathrm{var}(X) = 1$
Random variable $X \sim N(0, 1)$, we often write $Z \sim N(0, 1)$
EXAMPLE: STANDARD NORMAL DISTRIBUTION
Important because any normally distributed variable can be standardized to the standard normal distribution
Methods for determining the values of $\Phi(x)$:
  using a table (different types!)
  using Excel
  using a graphical calculator
EXAMPLE: STANDARD NORMAL DISTRIBUTION
Calculating the value of the cdf with a table
  $P(Z \le 1.36) = \Phi(1.36)$
  table C-2 (p.768): $P(Z \le 1.36) = 0.9131$
EXAMPLE: STANDARD NORMAL DISTRIBUTION
Calculating the value of the cdf with Excel
  $\Phi(1.36) = P(Z \le 1.36) = 0.913085038$
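Outside Excel, Φ can be evaluated through the error function, using the standard identity Φ(z) = ½(1 + erf(z/√2)). This identity is not stated in the slides; the sketch below only shows that it reproduces the table and Excel values. The function name `phi` is an illustrative choice.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf via the error function: 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1 + erf(z / sqrt(2)))

value = phi(1.36)
print(f"{value:.9f}")   # compare with the Excel value 0.913085038
print(f"{value:.4f}")   # compare with the table value 0.9131
```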
EXAMPLE: STANDARD NORMAL DISTRIBUTION
Note that the cdf is $P(Z \le x)$
How to find $P(Z < x)$?
  use $P(Z \le x)$ (why?)
How to find $P(Z > x)$?
  use $1 - P(Z \le x)$ (why?)
  or use $P(Z > x) = P(Z < -x)$ (why?)
How to find $P(Z \ge x)$?
  is easy now...
How to find $P(x \le Z \le y)$?
  use $P(Z \le y) - P(Z \le x)$
Etc.
Shown here for the standard normal, but this applies to any continuous distribution (except the rule $P(Z > x) = P(Z < -x)$, which relies on symmetry around 0)
EXAMPLE: STANDARD NORMAL DISTRIBUTION
Inverse lookup
  $P(Z \le x) = \Phi(x) = 0.90$
  table C-2 (p.768): $x \approx 1.28$
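BACK TO THE NORMAL DISTRIBUTION
Note: $X \sim N(\mu, \sigma^2) \Rightarrow X - \mu \sim N(0, \sigma^2) \Rightarrow \frac{X - \mu}{\sigma} \sim N(0, 1)$
Standardization
  $x \to z = \frac{x - \mu}{\sigma}$ and $X \to Z = \frac{X - \mu}{\sigma}$
If $X \sim N(\mu, \sigma^2)$, how to determine $P(X \le x)$?
  $P(X \le x) = P(X - \mu \le x - \mu) = P\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) = P\left(Z \le \frac{x - \mu}{\sigma}\right)$
Example: suppose $X \sim N(180, \sigma^2 = 25)$
  $P(X \le 190) = P\left(Z \le \frac{190 - 180}{5}\right) = P(Z \le 2) = 0.9772$
  $P(X \le x) = 0.90 = P\left(Z \le \frac{x - 180}{5}\right) \Rightarrow \frac{x - 180}{5} = 1.28 \Rightarrow x = 186.4$
This is our way of doing normalcdf and invnorm if you don't have a graphical calculator!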
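Both directions of the example (probability from a value, value from a probability) can be reproduced with Python's `statistics.NormalDist`, once by standardizing by hand and once directly. This is a sketch for checking table work, not a method prescribed by the slides.

```python
from statistics import NormalDist

mu, sigma = 180, 5            # X ~ N(180, sigma^2 = 25)
Z = NormalDist()              # standard normal, mu = 0 and sigma = 1
X = NormalDist(mu=mu, sigma=sigma)

# Forward: P(X <= 190) via standardization z = (x - mu) / sigma.
z = (190 - mu) / sigma
p_standardized = Z.cdf(z)     # Phi(2)
p_direct = X.cdf(190)         # same probability without standardizing

# Inverse: find x with P(X <= x) = 0.90, i.e. x = mu + sigma * z_0.90.
z_90 = Z.inv_cdf(0.90)        # about 1.28, as in the table
x_90 = mu + sigma * z_90
x_90_direct = X.inv_cdf(0.90)

print(round(p_standardized, 4), round(p_direct, 4))
print(round(z_90, 2), round(x_90, 1), round(x_90_direct, 1))
```

The unrounded quantile differs slightly from 186.4 because the table value 1.28 is itself rounded.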
BACK TO THE NORMAL DISTRIBUTION
Finding $P(X \le x)$ with $X \sim N(\mu, \sigma^2)$
  standardizing + table of the standard normal distribution (at the exam!)
  Excel
  graphical calculator
BACK TO THE NORMAL DISTRIBUTION
What is normal about the normal distribution?
  it has quite a weird pdf formula
  and an even weirder cdf formula
But
  it is unimodal
  it is symmetric
  very often empirical distributions "look" normal
  a quantity is approximately normal if it is influenced by many additive factors, none of which is dominating
  several statistics (mean, sum, ...) are normally distributed
You'll learn that soon when we discuss the Central Limit Theorem
PROPERTIES OF THE NORMAL DISTRIBUTION
Scaling
  If $X \sim N(\mu_X, \sigma_X^2)$ then $aX + b \sim N(a\mu_X + b, a^2\sigma_X^2)$
Additivity
  If $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$ and $X$, $Y$ independent, then $X + Y \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$
[Figure: pdf of $X$ and pdf of $0.825X + 11$]
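Both rules can be tried out with `statistics.NormalDist`, which supports multiplication and addition with constants and addition of two instances (treated as independent). The parameters below (X with mean 100 and variance 16, Y with mean 50 and variance 9) are made-up values for illustration; only the transformation 0.825X + 11 is taken from the figure.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration.
X = NormalDist(mu=100, sigma=4)   # variance 16
Y = NormalDist(mu=50, sigma=3)    # variance 9

# Scaling: aX + b ~ N(a*mu + b, a^2 * sigma^2), so the stdev becomes |a| * sigma.
a, b = 0.825, 11
scaled = X * a + b

# Additivity for independent X and Y: means add, variances add.
total = X + Y

print(scaled.mean, scaled.stdev)    # close to 93.5 and 3.3
print(total.mean, total.variance)   # close to 150 and 25
```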
APPROXIMATIONS TO DISTRIBUTIONS
Sometimes, we can approximate a "difficult" distribution by a simpler one
Important case: binomial → normal
  example 1: flipping a coin ($\pi = 0.50$, $X$ = #heads) very often
APPROXIMATIONS TO DISTRIBUTIONS
But also when $\pi \ne 0.50$
  example 2: flipping a biased coin ($\pi = 0.30$, $X$ = #heads) very often
  [Figure panels: $n = 10$, $\pi = .30$; $n = 20$, $\pi = .30$; $n = 40$, $\pi = .30$]
APPROXIMATIONS TO DISTRIBUTIONS
binomial → normal
  $\mathrm{bin}(n, \pi) \to N(\mu, \sigma^2)$
  using $\mu = \;???$ and $\sigma^2 = \;???$
We know that when $X \sim \mathrm{bin}(n, \pi)$
  $E(X) = n\pi$
  $\mathrm{var}(X) = n\pi(1 - \pi)$
So, replace
  $\mu = n\pi$
  $\sigma^2 = n\pi(1 - \pi)$
So, $\mathrm{bin}(n, \pi) \approx N(n\pi, n\pi(1 - \pi))$
  rule: allowed when $n\pi \ge 5$ and $n(1 - \pi) \ge 5$
  The book says 10 instead of 5
APPROXIMATIONS TO DISTRIBUTIONS
Example binomial → normal
  roll a die $n = 900$ times
  study the occurrence of sixes (so $\pi = \frac{1}{6}$)
  what is the probability of no more than 170 sixes?
Exact: $P_{\mathrm{bin}(n=900;\,\pi=1/6)}(X \le 170) = \;?$
Two problems:
  need to add 171 pdf terms ($P(X = 0)$ until $P(X = 170)$)
  $900!$ gives an ERROR
Approximation: $P_{N(\mu=150;\,\sigma^2=125)}(X \le 170) = P_Z\left(Z \le \frac{170 - 150}{\sqrt{125}}\right) = \Phi_Z(1.7888) \approx 0.9631$
APPROXIMATIONS TO DISTRIBUTIONS
Now take $X \sim \mathrm{bin}(18, 0.5)$
In a binomial context $P(X \le 11) = P(X < 12)$
But in a normal context $P(X \le 11) = P(X < 11)$
So, take care about using integers
Safest: go half-way: $P(X \le 11.5) = P(X < 11.5)$
This is the continuity correction
APPROXIMATIONS TO DISTRIBUTIONS
The intuitive notion of the continuity correction when approximating a discrete distribution by a continuous distribution
  $P_{\mathrm{bin}}(X \le 7) \approx P_N(X \le 7\tfrac{1}{2})$
  $P_{\mathrm{bin}}(X \ge 7) \approx P_N(X \ge 6\tfrac{1}{2})$
APPROXIMATIONS TO DISTRIBUTIONS
Improving the previous result
without continuity correction
  $P_{\mathrm{bin}(n=900;\,\pi=1/6)}(X \le 170) \approx P_{N(\mu=150;\,\sigma^2=125)}(X \le 170) = P_Z\left(Z \le \frac{170 - 150}{\sqrt{125}}\right) = \Phi_Z(1.788) \approx 0.9631$
with continuity correction
  $P_{\mathrm{bin}(n=900;\,\pi=1/6)}(X \le 170) \approx P_{N(\mu=150;\,\sigma^2=125)}(X \le 170.5) = P_Z\left(Z \le \frac{170.5 - 150}{\sqrt{125}}\right) = \Phi_Z(1.833) \approx 0.9664$
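With software, the "exact" binomial probability is within reach after all, so the two normal approximations can be put next to it. The sketch below avoids the overflow of 900! by working with logarithms of factorials (`math.lgamma`), which is one possible workaround and not something the slides prescribe; all names are illustrative.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def binom_log_pmf(k, n, p):
    """log P(X = k) for X ~ bin(n, p); lgamma(m + 1) equals log(m!)."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_coeff + k * log(p) + (n - k) * log(1 - p)

n, p, x = 900, 1 / 6, 170

# "Exact" binomial cdf (up to floating-point error): add the 171 pdf terms for k = 0..170.
exact = sum(exp(binom_log_pmf(k, n, p)) for k in range(x + 1))

# Normal approximation with mu = n*p = 150 and sigma^2 = n*p*(1 - p) = 125.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
without_cc = approx.cdf(x)        # P_N(X <= 170)
with_cc = approx.cdf(x + 0.5)     # P_N(X <= 170.5), continuity correction

print(round(exact, 4), round(without_cc, 4), round(with_cc, 4))
```

The value with continuity correction may differ from the slide in the last digit, because the slide rounds z to 1.83 before the table lookup.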
OLD EXAM QUESTION 30 June 2014, Q1d
OLD EXAM QUESTION 30 June 2014, Q1f