SDS 321: Introduction to Probability and Statistics

Lecture 23: Continuous random variables: Inequalities, CLT
Purnamrita Sarkar
Department of Statistics and Data Science
The University of Texas at Austin
www.cs.cmu.edu/~psarkar/teaching

Roadmap

- Examples of Markov and Chebyshev
- Weak law of large numbers and CLT
- Normal approximation to Binomial


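Markov's inequality: Example

You have 20 independent Poisson(1) random variables $X_1, \dots, X_{20}$. Use the Markov inequality to bound $P\left(\sum_{i=1}^{20} X_i \ge 15\right)$.

$$P\left(\sum_i X_i \ge 15\right) \le \frac{E\left[\sum_i X_i\right]}{15} = \frac{20}{15} = \frac{4}{3}$$

How useful is this? Not very: the bound exceeds 1, so it says nothing that we did not already know about a probability.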
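To see how loose the Markov bound is here, it can be compared with the exact tail probability. The sketch below assumes only the standard fact that a sum of independent Poisson(1) variables is Poisson(20); the helper names `poisson_pmf` and `poisson_upper_tail` are made up for this illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for N ~ Poisson(lam), evaluated in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_upper_tail(a: int, lam: float) -> float:
    """P(N >= a), computed as 1 - P(N <= a - 1)."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(a))


n_vars = 20             # number of Poisson(1) variables
threshold = 15          # the event is {sum >= 15}
lam_sum = n_vars * 1.0  # a sum of independent Poisson(1) variables is Poisson(20)

markov_bound = lam_sum / threshold  # E[sum] / 15 = 4/3
exact_tail = poisson_upper_tail(threshold, lam_sum)

print(f"Markov bound on P(sum >= 15): {markov_bound:.4f}")
print(f"Exact P(sum >= 15):           {exact_tail:.4f}")
```

The exact value is a genuine probability below 1, while the Markov bound is not.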


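Chebyshev's inequality: Example

You have n independent Poisson(1) random variables $X_1, \dots, X_n$, with sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Use the Chebyshev inequality to bound $P(|\bar{X} - 1| \ge 1)$.

$$P(|\bar{X} - 1| \ge 1) \le \frac{\mathrm{var}(\bar{X})}{1^2} = \frac{1}{n} = \begin{cases} 1/10 & \text{when } n = 10 \\ 1/100 & \text{when } n = 100 \end{cases}$$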
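The Chebyshev bound 1/n can be checked against the exact probability. This is a minimal sketch that uses the fact that n times the sample mean of n independent Poisson(1) variables is Poisson(n); the function names are illustrative, not from the lecture.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for N ~ Poisson(lam), evaluated in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def chebyshev_bound(n: int) -> float:
    """var(sample mean) / 1^2 = 1/n for Poisson(1) variables."""
    return 1.0 / n


def exact_prob(n: int) -> float:
    """P(|mean - 1| >= 1) = P(S = 0) + P(S >= 2n), where S ~ Poisson(n)."""
    p_zero = poisson_pmf(0, n)
    # The upper tail is an infinite sum; beyond k = 2n each term is less than
    # half the previous one, so 200 terms are more than enough.
    p_upper = sum(poisson_pmf(k, n) for k in range(2 * n, 2 * n + 200))
    return p_zero + p_upper


for n in (10, 100):
    print(f"n = {n:3d}: Chebyshev bound = {chebyshev_bound(n):.4f}, "
          f"exact = {exact_prob(n):.3e}")
```

The bound is valid for every n but can be far from tight.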

Weak law of large numbers

The WLLN basically states that the sample mean of a large number of random variables is very close to the true mean with high probability.

Consider a sequence of i.i.d. random variables $X_1, \dots, X_n$ with mean $\mu$ and variance $\sigma^2$.

Let $M_n = \frac{X_1 + \dots + X_n}{n}$.

$$E[M_n] = \frac{E[X_1] + \dots + E[X_n]}{n} = \mu$$

Illustration

Consider the mean of n independent Poisson(1) random variables. For each n, we plot the distribution of the average.
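The concentration of the average can also be seen numerically, without a plot. The sketch below computes, for a few values of n, the exact probability that the average of n independent Poisson(1) variables is at least 0.25 away from the true mean 1. The tolerance 0.25 and the chosen values of n are assumptions made for this illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for N ~ Poisson(lam), evaluated in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def prob_far_from_mean(n: int, eps: float) -> float:
    """P(|M_n - 1| >= eps), where n * M_n ~ Poisson(n)."""
    # Sum the pmf over the totals k whose average k/n is within eps of 1.
    close = sum(poisson_pmf(k, n) for k in range(2 * n + 1)
                if abs(k - n) < eps * n)
    return 1.0 - close


eps = 0.25
sample_sizes = [4, 16, 64, 256]
far_probs = [prob_far_from_mean(n, eps) for n in sample_sizes]

for n, p in zip(sample_sizes, far_probs):
    print(f"n = {n:3d}: P(|M_n - 1| >= {eps}) = {p:.5f}")
```

The printed probabilities shrink as n grows, which is the behavior the weak law of large numbers describes.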

Can we say more? Central Limit Theorem

Turns out that not only can you say that the sample mean is close to the true mean, you can actually predict its distribution using the famous Central Limit Theorem.

Consider a sequence of i.i.d. random variables $X_1, \dots, X_n$ with mean $\mu$ and variance $\sigma^2$.

Let $\bar{X} = \frac{X_1 + \dots + X_n}{n}$. Remember $E[\bar{X}] = \mu$ and $\mathrm{var}(\bar{X}) = \sigma^2/n$.

Standardize $\bar{X}$ to get $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.

As n gets bigger, $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ behaves more and more like a Normal(0, 1) random variable.

$$P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z\right) \approx \Phi(z)$$
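As a rough numerical check of this statement, the sketch below compares the exact distribution function of the standardized mean of n independent Poisson(1) variables (so $\mu = 1$ and $\sigma = 1$) with $\Phi(z)$. The choice of Poisson(1), of z = 1, and of the sample sizes is an assumption made for illustration.

```python
import math
from statistics import NormalDist


def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for N ~ Poisson(lam), evaluated in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def standardized_mean_cdf(n: int, z: float) -> float:
    """P((mean - 1) / (1 / sqrt(n)) < z) for n i.i.d. Poisson(1) variables.

    The event is the same as {S < n + z * sqrt(n)} with S ~ Poisson(n).
    """
    cutoff = n + z * math.sqrt(n)
    return sum(poisson_pmf(k, n) for k in range(math.ceil(cutoff)) if k < cutoff)


z = 1.0
phi_z = NormalDist().cdf(z)  # standard normal CDF at z

for n in (5, 50, 500):
    exact = standardized_mean_cdf(n, z)
    print(f"n = {n:3d}: exact = {exact:.4f}, Phi(z) = {phi_z:.4f}, "
          f"gap = {abs(exact - phi_z):.4f}")
```

Because the Poisson distribution is discrete, the gap does not shrink perfectly smoothly, but it is small for large n.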

Can we say more? Central Limit Theorem

[Figure. Courtesy: Tamara Broderick]

You bet!


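Example

You have 20 independent Poisson(1) random variables $X_1, \dots, X_{20}$, with sample mean $\bar{X}$. Use the CLT to approximate $P\left(\sum_{i=1}^{20} X_i \ge 15\right)$.

$$P\left(\sum_i X_i \ge 15\right) = P\left(\sum_i X_i - 20 \ge -5\right) = P\left(\frac{\bar{X} - 1}{1/\sqrt{20}} \ge -0.25\sqrt{20}\right) \approx P(Z \ge -1.12) \approx 0.87$$

How useful is this? Better than Markov.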
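The standardization in this example can be reproduced in a few lines. This is a sketch using Python's `statistics.NormalDist` for the standard normal CDF; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n_vars, threshold = 20, 15
mu, sigma = 1.0, 1.0  # mean and standard deviation of a Poisson(1) variable

# Standardize the event {sum >= 15}: (sum - n * mu) / (sigma * sqrt(n)) >= z
z = (threshold - n_vars * mu) / (sigma * math.sqrt(n_vars))
clt_estimate = 1.0 - NormalDist().cdf(z)  # P(Z >= z) for Z ~ Normal(0, 1)
markov_bound = n_vars * mu / threshold    # E[sum] / 15, for comparison

print(f"z = {z:.3f}")
print(f"CLT estimate of P(sum >= 15): {clt_estimate:.3f}")
print(f"Markov bound on P(sum >= 15): {markov_bound:.3f}")
```

Note that the CLT gives an approximation rather than a guaranteed bound, whereas Markov's inequality gives a guaranteed but here uninformative bound.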

Example

An astronomer is interested in measuring the distance, in light-years, from his observatory to a distant star. Although the astronomer has a measuring technique, he knows that, because of changing atmospheric conditions and normal error, each time a measurement is made it will not yield the exact distance, but merely an estimate. As a result, the astronomer plans to make a series of measurements and then use the average value of these measurements as his estimated value of the actual distance.

If the astronomer believes that the values of the measurements are independent and identically distributed random variables having a common mean d (the actual distance) and a common variance of 4 (light-years), how many measurements need he make to be 95% sure that his estimated distance is accurate to within ±0.5 light-years?
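The slide states the problem without a solution. One way to approach it, assuming the CLT approximation is accurate enough for the average of the measurements, is to require $P(|\bar{X} - d| \le 0.5) \approx 2\Phi\left(\frac{0.5\sqrt{n}}{2}\right) - 1 \ge 0.95$ and solve for n. The sketch below carries out that calculation; the variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma = math.sqrt(4.0)  # standard deviation of one measurement (variance 4)
tolerance = 0.5         # required accuracy, in light-years
confidence = 0.95       # required probability of being within the tolerance

# Two-sided requirement: 2 * Phi(tolerance * sqrt(n) / sigma) - 1 >= confidence,
# i.e. tolerance * sqrt(n) / sigma >= z, where Phi(z) = (1 + confidence) / 2.
z = NormalDist().inv_cdf((1.0 + confidence) / 2.0)
n_required = math.ceil((z * sigma / tolerance) ** 2)

print(f"z = {z:.4f}")
print(f"Measurements needed (CLT approximation): {n_required}")
```

Under these assumptions the calculation gives n = 62. A guarantee that does not rely on the normal approximation, for example via Chebyshev's inequality, would require more measurements.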

Normal Approximation to Binomial

The probability of selling an umbrella is 0.5 on a rainy day. If there are 400 umbrellas in the store, what's the probability that the owner will sell at least 180?

Let X be the total number of umbrellas sold. $X \sim \text{Binomial}(400, 0.5)$. We want $P(X \ge 180)$. Crazy calculations.

But can we approximate the distribution of $X/n$? $X/n = \left(\sum_i Y_i\right)/n$, where $E[Y_i] = 0.5$ and $\mathrm{var}(Y_i) = 0.25$.

Sure! The CLT tells us that for large n,

$$\frac{X/400 - 0.5}{\sqrt{0.25/400}} \approx N(0, 1)$$

So

$$P(X \ge 180) = P\left(\frac{X - 200}{\sqrt{100}} \ge -2\right) \approx P(Z \ge -2) = 1 - \Phi(-2) \approx 0.977$$
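The "crazy calculations" are easy for a computer, so the normal approximation can be checked against the exact binomial tail. This is a sketch using only the standard library; no continuity correction is applied, to match the calculation above.

```python
import math
from statistics import NormalDist

n, p, k = 400, 0.5, 180  # Binomial(400, 0.5); event {X >= 180}

# Exact tail: sum the binomial pmf from k to n.
exact = sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Normal approximation with matching mean n*p and variance n*p*(1-p).
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approx = 1.0 - approx_dist.cdf(k)  # equals 1 - Phi(-2)

print(f"Exact P(X >= 180):    {exact:.4f}")
print(f"Normal approximation: {approx:.4f}")
```

The two values agree closely, which is why the approximation is convenient when exact binomial sums are impractical by hand.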

Frequentist Statistics

- The parameter(s) $\theta$ is fixed and unknown.
- Data is generated through the likelihood function $p(X; \theta)$ (if discrete) or $f(X; \theta)$ (if continuous).
- Now we will be dealing with multiple candidate models, one for each value of $\theta$.
- We will use $E_\theta[h(X)]$ to denote the expectation of the random variable $h(X)$ as a function of the parameter $\theta$.

Problems we will look at

- Parameter estimation: We want to estimate unknown parameters from data.
- Maximum likelihood estimation (Section 9.1): Select the parameter that makes the observed data most likely, i.e., maximize the probability of obtaining the data at hand.
- Hypothesis testing: An unknown parameter takes a finite number of values. One wants to find the best hypothesis based on the data.
- Significance testing: Given a hypothesis, figure out the rejection region and reject the hypothesis if the observation falls within this region.
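As a small preview of maximum likelihood estimation, the sketch below picks the Poisson rate that makes a data set most likely by searching over a grid of candidate values. The data are hypothetical, invented purely for illustration, and the grid search is just one simple way to maximize the likelihood, not a method taken from the lecture.

```python
import math

# Hypothetical count data, invented purely for illustration.
data = [0, 2, 1, 1, 3, 0, 1]


def poisson_log_likelihood(theta: float, xs: list) -> float:
    """Log of the probability of observing xs under i.i.d. Poisson(theta)."""
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1) for x in xs)


# Candidate parameter values 0.01, 0.02, ..., 5.00.
grid = [i / 100 for i in range(1, 501)]
theta_hat = max(grid, key=lambda t: poisson_log_likelihood(t, data))
sample_mean = sum(data) / len(data)

print(f"Grid-search maximum likelihood estimate: {theta_hat:.2f}")
print(f"Sample mean:                             {sample_mean:.4f}")
```

For the Poisson model the likelihood is maximized at the sample mean, so the grid estimate lands on the grid point nearest to it.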