ST004 Week 8
Probability Distributions
Mathematical models for distributions
The language of distributions
Expected Values and Variances
Law of Large Numbers
ST004 0 Week 8
Probability Distributions
Random Variable: Y, a name
Two lists: (i) possible values y; (ii) associated probabilities
Tabulated, or defined by a formula
Discrete or continuous
Expected Value of Y (or of a function g( ) of Y)
Weighted average of the possible values y (or g(y))
Summation or integral
Prob Dists: cdf, pmf, pdf
Two lists: possibilities and probabilities
Probs can be given via the cdf, the cumulative probability distribution function, $\Pr(Y \le y)$
OR, if discrete (ie step-function cdf), by the pmf, the probability mass function, $\Pr(Y = y)$
if continuous (smooth cdf), by the pdf, the probability density function, $\frac{d}{dy}\Pr(Y \le y)$
Why Math Models?
Infinitely long run; thought expts
Insight/Reasoning: conditional prob; approximations
MC algorithms: transforms; properties; efficiency; approximations
Systems: neither uniformly random nor indep
Methods: counting (equally likely); calculus (continuous); probability rules (events, conditional prob)
Random Variables and Probability Distributions
Random variable Y:
  Outcome of an expt
  Transformation of one or more calls to RAND()
  Often numeric; can be a pair (or a set) of numbers
Probability Dist:
  Two lists: possibilities and probabilities
  Lists and functions can be explicitly tabulated or defined by some mathematical formula
Distribution?
  Replicate sims, transform and summarise
  OR directly from the transform, via probability
  Alt, by approximation
Discrete Random Variables and Probability Distributions

Poss y   Pr(Y = y)   Pr(Y <= y)
1        0.1667      0.1667
2        0.1667      0.3333
3        0.1667      0.5
4        0.1667      0.6667
5        0.1667      0.8333
6        0.1667      1

Formal math statement for cdf:
$$F_Y(y) = \begin{cases} 0 & y < 1 \\ \mathrm{int}(y)/6 & 1 \le y \le 6 \\ 1 & y > 6 \end{cases}$$
[Plots: Pr(Y = y) against y; Pr(Y <= y) against y]
Sum of k dice via Recursion
Numbers of combinations giving each total $S_k$:

Total   k=1   k=2   k=3   k=4
1       1     0     0     0
2       1     1     0     0
3       1     2     1     0
4       1     3     3     1
5       1     4     6     4
6       1     5     10    10
7       0     6     15    20
8       0     5     21    35
9       0     4     25    56
...

Density: $f(y) \propto \exp\left(-\frac{(y - 7k/2)^2}{2 \cdot 35k/12}\right)$
Adding stuff; adding indep stuff
Normal distribution
Central Limit Theorem
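The recursion for the number of combinations can be sketched in a few lines. This is a minimal illustration in Python rather than a spreadsheet; the function names `sum_counts` and `sum_pmf` are hypothetical, and fair six-sided dice are assumed.

```python
def sum_counts(k, faces=6):
    """Number of ordered outcomes of k dice giving each total, by recursion on k."""
    counts = {0: 1}  # zero dice: exactly one way to score a total of 0
    for _ in range(k):
        nxt = {}
        for total, ways in counts.items():
            # adding one more die shifts each total by 1..faces
            for score in range(1, faces + 1):
                nxt[total + score] = nxt.get(total + score, 0) + ways
        counts = nxt
    return counts


def sum_pmf(k, faces=6):
    """Pr(S_k = s): counts divided by the faces**k equally likely outcomes."""
    n_outcomes = faces ** k
    return {s: c / n_outcomes for s, c in sum_counts(k, faces).items()}


for k in (1, 2, 3, 4):
    counts = sum_counts(k)
    print(k, [counts.get(total, 0) for total in range(1, 10)])
```

Each printed list gives the counts for totals 1 to 9, one list per value of k.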
Sum k Dice
To simulate $S_k$ with one call to RAND()
Exact transform
  Recursion gives probs p
  LOOKUP RAND() with cum probs
Approximate transform
  Note $E[S_k] = 7k/2$; $\mathrm{Var}[S_k] = 35k/12$ (from sim or, preferably, theory)
  $Y$ = NORM.S.INV(RAND())
  $S_k \approx \mathrm{ROUND}\left(7k/2 + \sqrt{35k/12}\; Y\right)$
To compute: eg probs, long run avg, based on $S_k$
  Skip simulation stage if poss
Adding stuff: similar
  Electoral College votes for Obama
  Hard-drives available for Google
  Class Pension Age, given 0+
  Random Walk
  Convergence rate, LLN (avgs)
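Both one-call transforms can be sketched as follows. This is an illustrative Python version, not the spreadsheet itself: `random.Random.random()` stands in for RAND(), `statistics.NormalDist().inv_cdf` for NORM.S.INV, and `bisect` for LOOKUP. All function names are hypothetical.

```python
import bisect
import itertools
import math
import random
from statistics import NormalDist


def cumulative_probs(k, faces=6):
    """Possible totals of k dice and their cumulative probabilities (by recursion)."""
    pmf = {0: 1.0}
    for _ in range(k):
        nxt = {}
        for total, p in pmf.items():
            for score in range(1, faces + 1):
                nxt[total + score] = nxt.get(total + score, 0.0) + p / faces
        pmf = nxt
    totals = sorted(pmf)
    cum = list(itertools.accumulate(pmf[t] for t in totals))
    return totals, cum


def draw_exact(totals, cum, u):
    """Exact transform: first total whose cumulative probability exceeds u."""
    idx = bisect.bisect_right(cum, u)
    return totals[min(idx, len(totals) - 1)]  # guard against rounding in cum[-1]


def draw_normal_approx(k, u):
    """Approximate transform: ROUND(7k/2 + sqrt(35k/12) * z), z the normal quantile of u."""
    z = NormalDist().inv_cdf(u)
    return round(7 * k / 2 + math.sqrt(35 * k / 12) * z)


def uniform_open(rng):
    """A U(0,1) draw that is never exactly 0, so the normal quantile is defined."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


rng = random.Random(2004)
k = 10
totals, cum = cumulative_probs(k)
exact = [draw_exact(totals, cum, uniform_open(rng)) for _ in range(20000)]
approx = [draw_normal_approx(k, uniform_open(rng)) for _ in range(20000)]
print(sum(exact) / len(exact), sum(approx) / len(approx))
```

Both long-run averages should sit close to 7k/2. The approximate version can occasionally return a total outside the possible range k to 6k, which the exact lookup never does.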
Mean k Dice: Density Plot
[Table: density (probability scaled so that area = 1) of the mean of k dice at each possible value x, for k = 2 and k = 10]
Scaling: area = 1
$f(x) \propto \exp\left(-\frac{(x - 7/2)^2}{2 \cdot 35/(12k)}\right)$
Averaging stuff
Mean of 10 less variable than mean of 2
St Dev of Sampling Dist of Means $\propto 1/\sqrt{k}$
Normal distribution, CLT
Density plot: scaled/smoothed rel freqs
Theory later
Density Estimate
Freq table: bins $x_0, x_1, \ldots, x_i, \ldots$ and corresponding relative frequencies
$k$th bin corresponds to $(x_{k-1}, x_k)$, ie (Start, End)
Density estimate for the bin = Rel Freq / (End - Start)
Smoothed rel freq: estimate of pdf $f(x)$
Cumulative Probs
Simulation
  Direct: k calls to RAND, k LOOKUPs
  Via exact cdf: 1 call to RAND, 1 LOOKUP
  Via approx cdf + NORM.S.INV(RAND())
Cts Normal Approx:
$$\Pr\left(\frac{S_k - 7k/2}{\sqrt{35k/12}} \le s\right) \approx \int_{x \le s} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx = \text{NORM.S.DIST}(s, \text{TRUE})$$
x is the variable of integration (alt: y, v, t, z, ...)
Theory later

Inverting: y = norm.s.dist(x), and norm.s.inv(y) returns x

x       y = norm.s.dist(x)   norm.s.inv(y)
-2.5    0.00621              -2.5
-2.0    0.02275              -2.0
-1.5    0.06681              -1.5
-1.0    0.15866              -1.0
-0.5    0.30854              -0.5
 0.0    0.50000               0.0
 0.5    0.69146               0.5
 1.0    0.84134               1.0
 1.5    0.93319               1.5
 2.0    0.97725               2.0
 2.5    0.99379               2.5

Simpler but equivalent case: y = x^3, and y^(1/3) returns x

x       y = x^3    y^(1/3)
-2.5    -15.625    -2.50
-2.0     -8.000    -2.00
-1.5     -3.375    -1.50
-1.0     -1.000    -1.00
-0.5     -0.125    -0.50
 0.0      0.000     0.00
 0.5      0.125     0.50
 1.0      1.000     1.00
 1.5      3.375     1.50
Max k Dice

Cum probs $\Pr(M_k \le i)$
i    k=1     k=2     k=3     k=4
0    0       0       0       0
1    0.167   0.028   0.005   0.001
2    0.333   0.111   0.037   0.012
3    0.500   0.250   0.125   0.063
4    0.667   0.444   0.296   0.198
5    0.833   0.694   0.579   0.482
6    1.000   1.000   1.000   1.000

Probs $\Pr(M_k = i)$
i    k=1     k=2     k=3     k=4
1    0.167   0.028   0.005   0.001
2    0.167   0.083   0.032   0.012
3    0.167   0.139   0.088   0.050
4    0.167   0.194   0.171   0.135
5    0.167   0.250   0.282   0.285
6    0.167   0.306   0.421   0.518

E[M_k]      3.50    4.47    4.96    5.24
E[(M_k)^2]  15.2    22.0    25.9    28.4
Var         2.92    1.97    1.31    0.91
SD          1.71    1.40    1.14    0.95

Event identity: $M_k \le i \iff$ all k scores $\le i$
Hence prob calculation:
$$\Pr(M_k \le i) = \left(\frac{i}{6}\right)^k, \qquad \Pr(M_k = i) = \left(\frac{i}{6}\right)^k - \left(\frac{i-1}{6}\right)^k$$
Not Normal dist
Closely related to: counting stuff; time to (# rolls before) "score > i"
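The event identity gives the whole distribution of the maximum without simulation. A minimal Python sketch, assuming fair six-sided dice (the function names are hypothetical):

```python
def max_cdf(k, i, faces=6):
    """Pr(M_k <= i): all k scores must be <= i."""
    return (i / faces) ** k


def max_pmf(k, faces=6):
    """Pr(M_k = i) as the difference of successive cdf values."""
    return {i: max_cdf(k, i, faces) - max_cdf(k, i - 1, faces)
            for i in range(1, faces + 1)}


def mean_and_variance(pmf):
    """Expected value and variance via Var = E[M^2] - (E[M])^2."""
    mean = sum(i * p for i, p in pmf.items())
    mean_sq = sum(i * i * p for i, p in pmf.items())
    return mean, mean_sq - mean ** 2


for k in (1, 2, 3, 4):
    mean, var = mean_and_variance(max_pmf(k))
    print(k, round(mean, 2), round(var, 2), round(var ** 0.5, 2))
```

Each printed row gives k, then the expected value, variance and SD of the maximum.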
Max k Dice
As $k \to \infty$: $E[M_k] \to 6$, $SD[M_k] \to 0$
Why? Dice score constrained $\le 6$
Not Law of Large Numbers
cdf
If Y is numeric and scalar
Most general form: cumulative probability distribution function, cdf
Poss values y: eg $y = 1, 2, \ldots, K$; eg $0 \le y \le 1$; eg $y > 0$
Probs: $\Pr(Y \le y) = F(y)$
eg Y = Dice score: $F(y) = y/6$; poss Y are $y = 1, \ldots, 6$
(equiv S = Dice score: $F(s) = s/6$; poss S are $s = 1, \ldots, 6$)
[Plot: Pr(Y <= y) against y]
eg Y = RAND(): $F_Y(y) = y$; poss Y are $0 \le y \le 1$
X = 4Y + 0 F X (x)= 0.5(x-0) poss X are 0 x 4
cdf and prob mass function, pmf
$$\Pr(y_1 < Y \le y_2) = \Pr(Y \le y_2) - \Pr(Y \le y_1) = F(y_2) - F(y_1)$$
Y = Dice Score:
$$\Pr(2 < Y \le 5) = \Pr(Y \le 5) - \Pr(Y \le 2) = \frac{5}{6} - \frac{2}{6} = \frac{3}{6}$$
$$\Pr(Y = 2) = \Pr(1 < Y \le 2) = \Pr(Y \le 2) - \Pr(Y \le 1) = \frac{1}{6}$$
$p_Y(y)$: prob mass function for Y
Lists $\Pr(Y = y) = p_Y(y)$ for each poss value y of Y
or eg $p_S(s)$: prob mass function for S
x, s index for X, S; could equally use i, j, ...
If no ambiguity, use simpler notation $p(x)$, $p(s)$
[Plot: Pr(Y <= y) against y; the pmf values are the risers of the step-function cdf]
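Linear Functions and Expected Values
Special case: $S = g(Y) = aY + b$; $(a, b)$ constants
$$E[S] = \sum_{\text{all } y} g(y)\Pr(Y = y) = \sum_{\text{all } y} (ay + b)\Pr(Y = y) = a\sum_{\text{all } y} y\Pr(Y = y) + b\sum_{\text{all } y}\Pr(Y = y) = aE[Y] + b$$
$$\mathrm{Var}[S] = \sum_{\text{all } y} \left(ay + b - E[aY + b]\right)^2 \Pr(Y = y) = \sum_{\text{all } y} \left(ay - aE[Y]\right)^2 \Pr(Y = y) = a^2 \sum_{\text{all } y} \left(y - E[Y]\right)^2 \Pr(Y = y) = a^2\,\mathrm{Var}[Y]$$
Could use, instead of $\Pr(Y = y)$: $\Pr(Y = i)$, $p_Y(y)$, $p_Y(i)$, or even $p(y)$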
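Linear Functions and Expected Values
Special case: $S = g(Y) = aY + b$; $(a, b)$ constants
$$E[S] = \sum_{\text{all } y} g(y)\Pr(Y = y) = \sum_{\text{all } y} (ay + b)\Pr(Y = y) = a\sum_{\text{all } y} y\Pr(Y = y) + b\sum_{\text{all } y}\Pr(Y = y) = aE[Y] + b$$
$$\mathrm{Var}[S] = \sum_{\text{all } y} \left(ay + b - E[aY + b]\right)^2 \Pr(Y = y) = a^2 \sum_{\text{all } y} \left(y - E[Y]\right)^2 \Pr(Y = y) = a^2\,\mathrm{Var}[Y]$$
$SD[aS] = |a|\, SD[S]$
X = score, single dice: $E[X] = 7/2$, $E[2X + 3] = 10$; $\mathrm{Var}[X] = 35/12$, $\mathrm{Var}[2X + 3] = 35/3$
C = Temp in °C, F = Temp in °F: $E[F] = \frac{9}{5}E[C] + 32$; $SD[F] = \frac{9}{5}SD[C]$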
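The linear rules can be confirmed exactly for a single die by computing the weighted averages directly. A minimal sketch using exact fractions; the helper names `expected` and `variance` are hypothetical, and a = 2, b = 3 are taken as the illustrative constants.

```python
from fractions import Fraction


def expected(pmf, g=lambda y: y):
    """E[g(Y)]: weighted average of g(y) over the possible values y."""
    return sum(g(y) * p for y, p in pmf.items())


def variance(pmf, g=lambda y: y):
    """Var[g(Y)]: expected squared deviation of g(Y) from its expected value."""
    mean = expected(pmf, g)
    return expected(pmf, lambda y: (g(y) - mean) ** 2)


die = {y: Fraction(1, 6) for y in range(1, 7)}
a, b = 2, 3

print(expected(die), variance(die))
print(expected(die, lambda y: a * y + b), a * expected(die) + b)
print(variance(die, lambda y: a * y + b), a ** 2 * variance(die))
```

The last two lines each print the same quantity computed two ways: directly from the definition, and via the linear rule.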
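Variance: Useful Formula
Special case: $S = (Y - E[Y])^2$, with $E[Y]$ a constant; $p_y = \Pr(Y = y)$
$$\mathrm{Var}[Y] = E[S] = \sum_{\text{all } y} (y - E[Y])^2 p_y = \sum_{\text{all } y} y^2 p_y - (E[Y])^2 = E[Y^2] - (E[Y])^2$$
Compare: Sample Var = Sample Avg of $y^2$ - (Sample Avg of $y$)$^2$
Confirm for $S_3$, $M_3$: Exp Val & Var; Sample Avg & Var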
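Law of Large Numbers, Sample Size
Note $E[S_k] = 7k/2$; $\mathrm{Var}[S_k] = 35k/12$
Linear function:
$$E\left[\frac{S_k}{k}\right] = \frac{7}{2}; \quad \mathrm{Var}\left[\frac{S_k}{k}\right] = \frac{35}{12k}; \quad SD\left[\frac{S_k}{k}\right] = \sqrt{\frac{35}{12k}}$$
Law of Large Numbers, convergence:
$$E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{7}{2}; \quad SD\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \sqrt{\frac{35}{12n}}$$
Avg of n $\to$ const at rate $1/\sqrt{n}$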
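The $1/\sqrt{n}$ rate can be checked by simulation: quadrupling n should roughly halve the SD of the sample mean. A minimal sketch, assuming fair dice and Python's `random` module in place of RAND(); `sd_of_sample_means` is a hypothetical helper.

```python
import math
import random
import statistics


def sd_of_sample_means(n, reps, rng):
    """SD, across reps replicates, of the mean score of n dice."""
    means = [sum(rng.randint(1, 6) for _ in range(n)) / n for _ in range(reps)]
    return statistics.pstdev(means)


rng = random.Random(8)
for n in (10, 40, 160):
    simulated = sd_of_sample_means(n, 2000, rng)
    theory = math.sqrt(35 / (12 * n))
    print(n, round(simulated, 3), round(theory, 3))
```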
cdf F X (x) ;X=cts transf(rand()) Some simple cases t( ) non-decreasing function X t Y Y X Y X Y) eg ( ) 4 0; ; ln( A B C F Pr X Pr 4Y 0 Pr Y 0.5 0.5 A A F x Pr X x Pr Y x 0 x 0 A A 4 4 F x Pr X x Y x Y x x 0 x B B F x Pr X x ln Y x Y e x ex; x0 C C 0x 4 ; ;
Prob density function, pdf Some simple cdf/pdf Pr X Pr X Pr X A A A df x Pr x X x B d x ; B f x dx dx x B f x ; 0 x B df Pr x X x C C dx 0 0 4 4 4 x x d ex C x; 0 f x e x e x dx f C x Cts: cdf is smooth pdf is slope cdf is area under pdf
Continuous Probability Distributions
Defined by math functions
cdf $F_Y(y) = \Pr(Y \le y)$; $F(y)$ OK if not ambiguous; continuous, increasing
pdf $f(y)$ = rate of change of $F(y)$
$F(y)$ = indefinite integral of $f(y)$
Prob is area under $f(y)$
Uniform Probability Dist
Y ~ U(0,1): Y equally likely to have any value in (0,1)
pdf $f(y)$: probability density function = rate of change of cdf $F(y)$
cdf $F(y)$ = area under pdf to left of y
cdf: eg $\Pr(Y \le 0.6) = 0.6$; in general
$$F(y) = \Pr(Y \le y) = \begin{cases} 0 & y \le 0 \\ y & 0 \le y \le 1 \\ 1 & y \ge 1 \end{cases}$$
pdf:
$$f(y) = \frac{dF(y)}{dy} = \begin{cases} 0 & y < 0 \\ 1 & 0 \le y \le 1 \\ 0 & y > 1 \end{cases}$$
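Uniform(0,1) Probability Dist
Y ~ U(0,1): Y equally likely to have any value in (0,1)
[Plots: cdf F(y), height 0.6 at y = 0.6, slope 1; pdf f(y), area 0.6 to the left of y = 0.6; flat, symmetric]
cdf:
$$F(y) = \Pr(Y \le y) = \begin{cases} 0 & y \le 0 \\ y & 0 \le y \le 1 \\ 1 & y \ge 1 \end{cases}$$
pdf:
$$f(y) = \frac{dF(y)}{dy} = \begin{cases} 0 & y < 0 \\ 1 & 0 \le y \le 1 \\ 0 & y > 1 \end{cases}$$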
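Generating Y~U(a,b)
U = RAND()
To generate Y uniform in (-1, +3):
$Y = -1 + 4U = g(U)$
$U = \frac{1}{4}(Y + 1) = g^{-1}(Y)$
$\Pr(Y \le y) = \Pr\left(U \le \frac{1}{4}(y + 1)\right) = \frac{1}{4}(y + 1)$
cdf $F(y) = \frac{1}{4}(y + 1)$; $-1 \le y \le 3$
pdf $f(y) = \frac{d}{dy}\,\frac{1}{4}(y + 1) = \frac{1}{4}$; $-1 \le y \le 3$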
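The transform and its cdf can be checked against each other by simulation. A minimal sketch, with `rng.random()` standing in for RAND(); the function names are hypothetical.

```python
import random


def uniform_ab(a, b, u):
    """Transform a U(0,1) value u into a value uniform on (a, b)."""
    return a + (b - a) * u


def uniform_ab_cdf(a, b, y):
    """cdf of U(a, b): 0 below a, linear on (a, b), 1 above b."""
    if y <= a:
        return 0.0
    if y >= b:
        return 1.0
    return (y - a) / (b - a)


rng = random.Random(25)
sample = [uniform_ab(-1, 3, rng.random()) for _ in range(20000)]
for y in (0.0, 1.0, 2.5):
    proportion = sum(v <= y for v in sample) / len(sample)
    print(y, round(proportion, 3), uniform_ab_cdf(-1, 3, y))
```

Each row compares the simulated proportion of values at or below y with the cdf value (y + 1)/4.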
Exponential Prob Dist
Y ~ Exp(para = 1)
$Y \ge 0$, likely to be near 0
pdf $f(y) = e^{-y}$
cdf $F(y) = 1 - e^{-y}$
Skew, asymmetric
[Plots: cdf F(y), with Ht = 0.86 marked; pdf f(y), with Area = 0.86 marked]
Show by transform: $Y = -\ln(\text{RAND()})$
Normal Probability Dist
Y ~ N(0,1): Y likely to be "near" 0
Bell-shaped, symmetric
pdf $f(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}$
cdf $F(y) = \Pr(Y \le y) = \int_{-\infty}^{y} f(u)\, du$ = NORM.S.DIST(y, TRUE)
transform: Y = NORM.S.INV(RAND())
[Plots: cdf F(y) = Pr(Y <= y), with Ht 0.84 = Pr(Y <= 1) marked; pdf f(y), with Area 0.84 marked]

x       y = norm.s.dist(x)   norm.s.inv(y)
-2.5    0.00621              -2.5
-2.0    0.02275              -2.0
-1.5    0.06681              -1.5
-1.0    0.15866              -1.0
-0.5    0.30854              -0.5
 0.0    0.50000               0.0
 0.5    0.69146               0.5
 1.0    0.84134               1.0
 1.5    0.93319               1.5
 2.0    0.97725               2.0
 2.5    0.99379               2.5
Simulating Cts Random Variables
Defined System
  explicitly defined output var Y
  implicitly defined dist for output, eg F(y)
  RAND() -> System -> Output: Y = t(RAND())
Defined Dist F(y)
  seek transformation Y = t(RAND())
  easy if know (code for) $F^{-1}(y)$
  See Tijms, 3rd ed, Sec 0.5
  can be challenging, esp if y is multivariate
Expected Vals for Cts Random Vars
Discrete case: $E[Y] = \sum_{\text{all } y} y \Pr(Y = y)$
Continuous case: $\Pr(Y = y) \to \Pr(Y \in (y, y + dy)) = f(y)\,dy$; $f(y) = \frac{dF(y)}{dy}$
$$E[Y] = \int_{\text{all } y} y f(y)\, dy$$
As previously:
$E[aY + b] = aE[Y] + b$
$\mathrm{Var}[aY + b] = a^2\,\mathrm{Var}[Y]$
$\mathrm{Var}[Y] = E[Y^2] - (E[Y])^2$
Expected Vals for Cts Random Vars
Continuous case: X ~ Uniform(0,1); pdf $f(x) = 1$; x in (0,1)
$E[X] = \int_0^1 x f(x)\, dx = \int_0^1 x\, dx$
$E[X^2] = \int_0^1 x^2 f(x)\, dx$; $\mathrm{Var}[X] = E[X^2] - (E[X])^2$
$E[3X + 2]$; $E[X^k]$, $k > 0$; $E[e^X]$ = ?

Simulation, 10000 replicates (values shown to 3 decimals):

Rep            X=RAND()   3X+2    X^2     X^3     exp(X)
Long Run Avg   0.497      3.490   0.330   0.247   1.713
Var            0.083      0.75    0.089   0.080   0.24
1              0.498      3.495   0.248   0.124   1.646
2              0.308      2.923   0.095   0.029   1.360
3              0.670      4.010   0.449   0.301   1.954
4              0.222      2.667   0.049   0.011   1.249
...
9999           0.077      2.230   0.006   0.000   1.080
10000          0.845      4.536   0.714   0.604   2.328
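The long-run averages can be set beside the values obtained by calculus. A minimal sketch in Python, with `rng.random()` standing in for RAND(). Unlike the spreadsheet layout, each transform here uses its own 10000 draws. The helper `long_run_average` and the `cases` dictionary are hypothetical names, and the exact values are the integrals of each function over (0,1).

```python
import math
import random


def long_run_average(g, n, rng):
    """Average of g(X) over n draws of X ~ U(0,1)."""
    return sum(g(rng.random()) for _ in range(n)) / n


# name: (transform g, exact E[g(X)] from integrating g over (0, 1))
cases = {
    "X": (lambda x: x, 1 / 2),
    "3X+2": (lambda x: 3 * x + 2, 7 / 2),
    "X^2": (lambda x: x ** 2, 1 / 3),
    "X^3": (lambda x: x ** 3, 1 / 4),
    "exp(X)": (math.exp, math.e - 1),
}

rng = random.Random(2004)
results = {name: long_run_average(g, 10000, rng) for name, (g, _) in cases.items()}
for name, avg in results.items():
    print(name, round(avg, 3), "exact", round(cases[name][1], 3))
```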
Expected Vals for Cts Random Vars
Continuous case: X ~ Exp(1); pdf $f(x) = e^{-x}$; $x \ge 0$
$$E[X] = \int_0^\infty x f(x)\, dx = \int_0^\infty x e^{-x}\, dx = \left[-x e^{-x}\right]_0^\infty + \int_0^\infty e^{-x}\, dx = 1$$
$E[X^2] = \int_0^\infty x^2 e^{-x}\, dx$; $\mathrm{Var}[X] = E[X^2] - (E[X])^2$
$Y = 3X$: $E[Y]$, $\mathrm{Var}[Y]$, pdf $f_Y(y)$

Simulation, 10000 replicates:

                   Y = -LN(1-RAND())   3Y      Y = -LN(RAND())
Long Run Average   0.993               2.978   1.006
Variance           1.01                9.08    0.967

[Table: replicates 1-7, 9999 and 10000, with columns rand(), Y = -LN(1-RAND()), 3Y, Y = -LN(RAND())]
Examples / Homework
Simulation AND Thought Expt
Discrete: state prob dist, cdf/pmf, for
  avg of scores dice
  3 + min(scores dice)
  # 6s when roll 3 dice
  # rolls of single die before first 6
Examples / Homework
Simulation AND Thought Expt
Continuous: Y = RAND() ~ U(0,1)
Give cdf and pdf, and sketch both, for
  X = 3Y + 4
  X = 3Y + 4
  $X = e^Y$
  $X = 3e^Y + 4$
Also simulate, and form/plot the ecdf by using ranks
Examples / Homework
Simulation AND Thought Expt
The Rayleigh density and cdf are as below. Sketch. Use the sketches to show how these functions can be used to compute probabilities. Propose a transform of Y = RAND(). Use calculus to determine its expected value, and confirm by simulation.
$$\text{pdf } f(r) = r\, e^{-r^2/2}; \quad F(r) = 1 - e^{-r^2/2}; \quad r \ge 0$$
This Rayleigh model arises as the distance from A to B when the North/South and East/West distances are each indep Normally distributed, ie NORM.S.INV(RAND()). This suggests an alternative method for generating values from this distribution. Confirm.
Additional Material
Aside: Inverting transform X=t(RAND()) t( ) non-decreasing eg X t( Y) 4Y 0; X Y ; X ln( Y) A B C X x Y y where t( y) x X 4Y 0 Y 0.5 A X x Y x Y x B B X x ln Y x Y ex
#Rolls before score > i

Infinitely Long Run   score > 1   score > 2   score > 5
E[K]                  1.2         1.5         5.848
E[K^2]                1.68        3           60.41
Var[K]                0.24        0.75        26.21

Prob > k rolls (roll until score > 1, > 2, > 5)
k     > 1     > 2     > 5
1     0.167   0.333   0.833
2     0.028   0.111   0.694
3     0.005   0.037   0.579
4     0.001   0.012   0.482
5     0.000   0.004   0.402
6     0.000   0.001   0.335
...
29    0.000   0.000   0.005
30    0.000   0.000   0.004

Prob = k rolls (roll until score > 1, > 2, > 5)
k     > 1     > 2     > 5
1     0.833   0.667   0.167
2     0.139   0.222   0.139
3     0.023   0.074   0.116
4     0.004   0.025   0.096
5     0.001   0.008   0.080
6     0.000   0.003   0.067
...
29    0.000   0.000   0.001
30    0.000   0.000   0.001

Geometric Dist
Poss values k = 1, 2, 3, ...
Probs $\Pr(K = k) = p(1 - p)^{k-1}$
$$E[K] = \sum_{k} k \Pr(K = k) = \frac{1}{p}$$
#Rolls before score > 5; cts approx

k     Prob > k rolls   Prob <= k rolls   Prob <= k rolls
      before > 5       (exact)           (exp approx)
0     1.000            0.000             0.000
1     0.833            0.167             0.154
2     0.694            0.306             0.283
3     0.579            0.421             0.393
4     0.482            0.518             0.487
5     0.402            0.598             0.565
6     0.335            0.665             0.632
7     0.279            0.721             0.689
8     0.233            0.767             0.736
9     0.194            0.806             0.777

Exponential approx:
$$\left(\frac{5}{6}\right)^6 = \left(1 - \frac{1}{6}\right)^6 = 0.335 \approx e^{-1} = 0.368$$
$$\Pr(K > k) = \left(\frac{5}{6}\right)^k = \left[\left(1 - \frac{1}{6}\right)^6\right]^{k/6} \approx e^{-k/6}$$
#Rolls before score > 5; cts approx
cdf: $\Pr(K \le k) \approx 1 - e^{-k/6}$
1 Generate U = RAND()
2 Solve $U = 1 - e^{-K/6}$
3 Return K = -6 LN(1 - U) (inv cdf)
4 Here, return ROUNDDOWN(K)

Simulation
  Exact: form exact cdf; call RAND(); LOOKUP
  Approx: via log transform of RAND()

Inverting: y = 1 - exp(-x), and -LN(1 - y) returns x

x      y = 1 - exp(-x)   -LN(1-y)
0.5    0.39347           0.5
1      0.63212           1
1.5    0.77687           1.5
2      0.86466           2
2.5    0.91792           2.5
3      0.95021           3
3.5    0.96980           3.5
4      0.98168           4
4.5    0.98889           4.5
5      0.99326           5

But: stat properties of 1 - U are the same as stat properties of U
Simpler: return K = -6 LN(U)
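The exact simulation and the inverse-cdf approximation can be compared side by side. A minimal Python sketch, with `rng.random()` standing in for RAND(); the function names are hypothetical, and K is taken to count rolls up to and including the first score above 5.

```python
import math
import random


def rolls_exact(rng):
    """Roll a fair die until the score exceeds 5; return the number of rolls."""
    k = 1
    while rng.randint(1, 6) <= 5:
        k += 1
    return k


def rolls_approx(u):
    """Inverse of the approximate cdf 1 - exp(-k/6), applied to a U(0,1) value u."""
    return -6.0 * math.log(1.0 - u)  # 1 - u lies in (0, 1] when u is in [0, 1)


rng = random.Random(39)
n = 20000
exact = [rolls_exact(rng) for _ in range(n)]
approx = [rolls_approx(rng.random()) for _ in range(n)]

for k in (1, 3, 6, 9):
    p_exact = sum(v <= k for v in exact) / n
    p_approx = sum(v <= k for v in approx) / n
    print(k, round(p_exact, 3), round(p_approx, 3))
```

Applying `math.floor` to the continuous value gives the rounded-down integer of step 4. The printed proportions use the continuous value itself, so they track the exp approx cdf.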
Alt: Using ranks instead of bins
Summarising System Life without bins
For each replicate: System Life, its rank, and prop <= this value = rank/n
To join the dots in EXCEL, data must be sorted ascending
[Table: 20 replicates of System Life with rank and prop <= this value, in original order and sorted ascending]
[Plot: rank/n against System Life]
ecdf.xlxs
Continuous case
Empirical cdf
Summarising System Life without bins; n = 10000
ecdf = rank/n = prop(<= this value)
Lines, and thus sorting, not needed when n is large
[Table: first 28 replicates of System Life with ecdf = rank/n]
[Plots: ecdf = prop(<= this value) against System Life]
Continuous case
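The rank-based ecdf needs no bins: each value is paired with the proportion of the sample at or below it. A minimal Python sketch; `ecdf_by_rank` is a hypothetical helper, and the short list of lives is made up purely for illustration.

```python
import bisect


def ecdf_by_rank(values):
    """For each value, in its original order, the proportion of values <= it (rank/n)."""
    ordered = sorted(values)
    n = len(values)
    # bisect_right counts how many sorted values are <= v, ie the rank of v
    return [bisect.bisect_right(ordered, v) / n for v in values]


lives = [3.0, 1.0, 2.0, 5.0]  # made-up illustrative data
for value, prop in zip(lives, ecdf_by_rank(lives)):
    print(value, prop)
```

Plotting the (value, rank/n) pairs as points gives the ecdf directly. Sorting is only needed if the points are to be joined by lines.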
Law of Large Numbers; counter-example
[Table and plot: X = RAND(), 1/X, and the running avg of 1/X, for the first 13 draws]
$$E\left[\frac{1}{X}\right] = \int_0^1 \frac{1}{x}\, dx = \Big[\ln(x)\Big]_0^1, \text{ ill defined}$$
Alt, the dist of $X^{-1}$, X ~ U(0,1), is such that it can generate very large values with non-trivial probability.
The Law of Large Numbers does not always apply
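The failure to settle can be seen by tracking the running average of 1/X over many draws. A minimal Python sketch; `running_average` is a hypothetical helper, and `1.0 - rng.random()` is used as the U(0,1) draw so that division by zero cannot occur.

```python
import random
from itertools import accumulate


def running_average(values):
    """Average of the first 1, 2, ..., n values."""
    return [total / (i + 1) for i, total in enumerate(accumulate(values))]


rng = random.Random(42)
inverse = [1.0 / (1.0 - rng.random()) for _ in range(10000)]
avg = running_average(inverse)

for n in (10, 100, 1000, 10000):
    print(n, round(avg[n - 1], 3))
print("largest single 1/X:", round(max(inverse), 1))
```

A single very small X produces a jump in the running average that later draws only slowly dilute, so different seeds give visibly different paths rather than convergence to a constant.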