Probability and Statistics

Probability and Statistics. Prof. Zheng Zheng

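Random Variable

A finite single-valued function $X(\cdot)$ that maps the set of all experimental outcomes $\Omega$ into the set of real numbers $R$ is a r.v. if the set $\{\xi \mid X(\xi) \le x\}$ is an event ($\in F$) for every $x$ in $R$, and the probabilities of the events $\{X = +\infty\}$ and $\{X = -\infty\}$ are 0.

Equivalently, $X$ is a r.v. if $X^{-1}(B) \in F$, where $B$ represents semi-infinite intervals of the form $\{-\infty < x \le a\}$ and all other sets that can be constructed from these sets by performing the set operations of union, intersection and negation any number of times.

[Figure: $X$ maps $A = X^{-1}(B)$ to $B \subset R$.]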

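If $X$ is a r.v., then $\{a < X \le b\}$ is also an event. Since $\{X \le a\}$ and $\{X \le b\}$ are events, $\{X \le a\}^c = \{X > a\}$ is an event. Thus
$$ \{X > a\} \cap \{X \le b\} = \{a < X \le b\} $$
is an event, and
$$ \bigcap_{n=1}^{\infty} \left\{ a - \tfrac{1}{n} < X \le a \right\} = \{X = a\} $$
is also an event.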

Probability Distribution Function (PDF)

Denote
$$ P\{\xi \mid X(\xi) \le x\} = F_X(x) \ge 0. $$
$F_X(x)$ is said to be the Probability Distribution Function (PDF) associated with the r.v. $X$. The subscript identifies the r.v.

If $g(x)$ is a PDF, then it is nondecreasing and right-continuous, i.e.,
(i) $g(+\infty) = 1$, $g(-\infty) = 0$,
(ii) if $x_1 < x_2$, then $g(x_1) \le g(x_2)$,
(iii) $g(x^+) = g(x)$ for all $x$.

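From the earlier definition of $F_X(x)$, we have:

(i) $F_X(+\infty) = P\{\xi \mid X(\xi) \le +\infty\} = P(\Omega) = 1$ and $F_X(-\infty) = P\{\xi \mid X(\xi) \le -\infty\} = P(\phi) = 0$.

(ii) If $x_1 < x_2$, then $(-\infty, x_1) \subset (-\infty, x_2)$. Consequently the event $\{\xi \mid X(\xi) \le x_1\} \subset \{\xi \mid X(\xi) \le x_2\}$, since $X(\xi) \le x_1$ implies $X(\xi) \le x_2$. As a result
$$ F_X(x_1) = P(X(\xi) \le x_1) \le P(X(\xi) \le x_2) = F_X(x_2), $$
implying that the probability distribution function is nonnegative and monotone nondecreasing.

(iii) Let $x < x_n < x_{n-1} < \cdots < x_2 < x_1$, and consider the event $A_k = \{\xi \mid x < X(\xi) \le x_k\}$. Since
$$ \{x < X(\xi) \le x_k\} \cup \{X(\xi) \le x\} = \{X(\xi) \le x_k\}, $$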

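using the mutually exclusive property of events we get
$$ P(A_k) = P(x < X(\xi) \le x_k) = F_X(x_k) - F_X(x). $$
But $\cdots \subset A_{k+1} \subset A_k \subset A_{k-1} \subset \cdots$, and hence
$$ \lim_{k \to \infty} A_k = \bigcap_{k=1}^{\infty} A_k = \phi \quad \text{and hence} \quad \lim_{k \to \infty} P(A_k) = 0. $$
Thus
$$ \lim_{k \to \infty} P(A_k) = \lim_{k \to \infty} F_X(x_k) - F_X(x) = 0. $$
But $\lim_{k \to \infty} x_k = x^+$, the right limit of $x$, and hence
$$ F_X(x^+) = F_X(x), $$
i.e., $F_X(x)$ is right-continuous, justifying all properties of a distribution function.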

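Additional Properties of a PDF

(iv) If $F_X(x_0) = 0$ for some $x_0$, then $F_X(x) = 0$ for $x \le x_0$. This follows since $F_X(x_0) = P(X(\xi) \le x_0) = 0$ means that $\{X(\xi) \le x_0\}$ has zero probability, and for any $x \le x_0$ the event $\{X(\xi) \le x\}$ is a subset of it.

(v) $P\{X(\xi) > x\} = 1 - F_X(x)$. We have $\{X(\xi) \le x\} \cup \{X(\xi) > x\} = \Omega$, and since the two events are mutually exclusive, the above follows.

(vi) $P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1)$, $x_2 > x_1$. The events $\{X(\xi) \le x_1\}$ and $\{x_1 < X(\xi) \le x_2\}$ are mutually exclusive and their union represents the event $\{X(\xi) \le x_2\}$.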

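(vii) $P\{X(\xi) = x\} = F_X(x) - F_X(x^-)$.

Let $x_1 = x - \varepsilon$, $\varepsilon > 0$, and $x_2 = x$. From (vi),
$$ \lim_{\varepsilon \to 0} P\{x - \varepsilon < X(\xi) \le x\} = F_X(x) - \lim_{\varepsilon \to 0} F_X(x - \varepsilon), $$
or
$$ P\{X(\xi) = x\} = F_X(x) - F_X(x^-). $$
According to (iii), $F_X(x_0^+)$, the limit of $F_X(x)$ as $x \to x_0$ from the right, always exists and equals $F_X(x_0)$. However, the left limit value $F_X(x_0^-)$ need not equal $F_X(x_0)$. Thus $F_X(x)$ need not be continuous from the left. At a discontinuity point of the distribution, the left and right limits are different, and from the above
$$ P\{X(\xi) = x_0\} = F_X(x_0) - F_X(x_0^-) > 0. $$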

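Thus the only discontinuities of a distribution function $F_X(x)$ are of the jump type, and they occur at points $x_0$ where $F_X(x_0) - F_X(x_0^-) > 0$. These points can always be enumerated as a sequence, and moreover they are at most countable in number.

Example 1: $X$ is a r.v. such that $X(\xi) = c$, $\xi \in \Omega$. Find $F_X(x)$.

Solution: For $x < c$, $\{X(\xi) \le x\} = \{\phi\}$, so that $F_X(x) = 0$, and for $x \ge c$, $\{X(\xi) \le x\} = \Omega$, so that $F_X(x) = 1$.

[Fig. 1: $F_X(x)$ is a unit step at $x = c$.]

Example 2: Toss a coin, $\Omega = \{H, T\}$. Suppose the r.v. $X$ is such that $X(T) = 0$, $X(H) = 1$. Find $F_X(x)$.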

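Solution:
For $x < 0$, $\{X(\xi) \le x\} = \{\phi\}$, so that $F_X(x) = 0$.
For $0 \le x < 1$, $\{X(\xi) \le x\} = \{T\}$, so that $F_X(x) = P\{T\} = 1 - p$.
For $x \ge 1$, $\{X(\xi) \le x\} = \{H, T\} = \Omega$, so that $F_X(x) = 1$.

[Fig. 2: $F_X(x)$ steps from 0 to $q = 1 - p$ at $x = 0$ and from $q$ to 1 at $x = 1$.]

$X$ is said to be a continuous-type r.v. if its distribution function $F_X(x)$ is continuous. In that case $F_X(x^-) = F_X(x)$ for all $x$, and we get $P\{X = x\} = 0$.

If $F_X(x)$ is constant except for a finite number of jump discontinuities (piecewise constant; step-type), then $X$ is said to be a discrete-type r.v. If $x_i$ is such a discontinuity point, then
$$ p_i = P\{X = x_i\} = F_X(x_i) - F_X(x_i^-). $$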

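From Fig. 1, at the point of discontinuity we get
$$ P\{X = c\} = F_X(c) - F_X(c^-) = 1 - 0 = 1, $$
and from Fig. 2, with $q = 1 - p = P\{T\}$,
$$ P\{X = 0\} = F_X(0) - F_X(0^-) = q - 0 = q. $$

Example 3: A fair coin is tossed twice, and let the r.v. $X$ represent the number of heads. Find $F_X(x)$.

Solution: In this case $\Omega = \{HH, HT, TH, TT\}$, and $X(HH) = 2$, $X(HT) = 1$, $X(TH) = 1$, $X(TT) = 0$.
For $x < 0$, $\{X(\xi) \le x\} = \phi$, so $F_X(x) = 0$.
For $0 \le x < 1$, $\{X(\xi) \le x\} = \{TT\}$, so $F_X(x) = P\{TT\} = P(T)P(T) = 1/4$.
For $1 \le x < 2$, $\{X(\xi) \le x\} = \{TT, HT, TH\}$, so $F_X(x) = P\{TT, HT, TH\} = 3/4$.
For $x \ge 2$, $\{X(\xi) \le x\} = \Omega$, so $F_X(x) = 1$.

[Fig. 3: $F_X(x)$ steps to 1/4 at $x = 0$, to 3/4 at $x = 1$, and to 1 at $x = 2$.]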
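The step-type distribution function of Example 3 can be rebuilt from the point masses. The following is a minimal sketch, not part of the original slides; the helper name `cdf_from_pmf` is hypothetical, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def cdf_from_pmf(pmf):
    """Return F(x) = P(X <= x) for a discrete r.v. given as {value: probability}."""
    points = sorted(pmf.items())

    def F(x):
        # Add up the masses located at or to the left of x.
        return sum((p for v, p in points if v <= x), Fraction(0))

    return F

# Number of heads in two tosses of a fair coin (Example 3).
pmf_heads = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
F = cdf_from_pmf(pmf_heads)

for x in (-0.5, 0, 0.5, 1, 1.5, 2, 3):
    print(x, F(x))

# The jump at x = 1 is the mass P{X = 1} = F(1) - F(1-).
jump_at_1 = F(1) - F(0.999)
print("P{X = 1} =", jump_at_1)
```

Evaluating `F` just to the left of a jump point stands in for the left limit $F_X(x^-)$, which works here because the masses sit at the integers.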

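From Fig. 3,
$$ P\{X = 1\} = F_X(1) - F_X(1^-) = 3/4 - 1/4 = 1/2. $$

Probability density function (p.d.f.)

The derivative of the distribution function $F_X(x)$ is called the probability density function $f_X(x)$ of the r.v. $X$. Thus
$$ f_X(x) = \frac{dF_X(x)}{dx}. $$
Since
$$ \frac{dF_X(x)}{dx} = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \ge 0, $$
from the monotone-nondecreasing nature of $F_X(x)$,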

it follows that $f_X(x) \ge 0$ for all $x$. $f_X(x)$ will be a continuous function if $X$ is a continuous-type r.v. However, if $X$ is a discrete-type r.v. as in the above, then its p.d.f. has the general form (Fig. 5)
$$ f_X(x) = \sum_i p_i\, \delta(x - x_i), $$
where $x_i$ represent the jump-discontinuity points in $F_X(x)$. As Fig. 5 shows, $f_X(x)$ represents a collection of positive discrete masses, and it is known as the probability mass function (p.m.f.) in the discrete case.

[Fig. 5: impulses of height $p_i$ at the points $x_i$.]

We also obtain by integration
$$ F_X(x) = \int_{-\infty}^{x} f_X(u)\, du. $$
Since $F_X(+\infty) = 1$, it yields
$$ \int_{-\infty}^{+\infty} f_X(x)\, dx = 1, $$

which justifies its name as the density function. Further, we also get (Fig. 6b)
$$ P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\, dx. $$
Thus the area under $f_X(x)$ in the interval $(x_1, x_2)$ represents the probability.

[Fig. 6: (a) $F_X(x)$; (b) $f_X(x)$, with the area between $x_1$ and $x_2$ shaded.]

Often, r.v.s are referred to by their specific density functions, both in the continuous and discrete cases, and in what follows we shall list a number of them in each category.

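Continuous-type random variables

1. Normal (Gaussian): $X$ is said to be a normal or Gaussian r.v. if
$$ f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}. $$
This is a bell-shaped curve, symmetric around the parameter $\mu$, and its distribution function is given by
$$ F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(y-\mu)^2/2\sigma^2}\, dy = G\!\left(\frac{x-\mu}{\sigma}\right), $$
where
$$ G(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy $$
is often tabulated. Since $f_X(x)$ depends on two parameters $\mu$ and $\sigma^2$, the notation $X \sim N(\mu, \sigma^2)$ is often used.

[Fig. 7: bell-shaped density $f_X(x)$ centered at $\mu$.]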

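2. Uniform: $X \sim U(a, b)$, $a < b$, if (Fig. 8)
$$ f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b, \\ 0, & \text{otherwise.} \end{cases} $$

3. Exponential: $X \sim \varepsilon(\lambda)$ if (Fig. 9)
$$ f_X(x) = \begin{cases} \dfrac{1}{\lambda}\, e^{-x/\lambda}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases} $$

[Fig. 8: uniform density of height $1/(b-a)$ on $(a, b)$. Fig. 9: exponential density.]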

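4. Gamma: $X \sim G(\alpha, \beta)$ if ($\alpha > 0$, $\beta > 0$) (Fig. 10)
$$ f_X(x) = \begin{cases} \dfrac{x^{\alpha-1}}{\Gamma(\alpha)\,\beta^{\alpha}}\, e^{-x/\beta}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases} $$
If $\alpha = n$ is an integer, $\Gamma(n) = (n-1)!$.

5. Beta: $X \sim \beta(a, b)$ if ($a > 0$, $b > 0$) (Fig. 11)
$$ f_X(x) = \begin{cases} \dfrac{1}{\beta(a, b)}\, x^{a-1} (1-x)^{b-1}, & 0 < x < 1, \\ 0, & \text{otherwise,} \end{cases} $$
where the Beta function $\beta(a, b)$ is defined as
$$ \beta(a, b) = \int_0^1 u^{a-1} (1-u)^{b-1}\, du. $$

[Fig. 10: Gamma density. Fig. 11: Beta density on $(0, 1)$.]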

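6. Chi-Square: $X \sim \chi^2(n)$ if (Fig. 12)
$$ f_X(x) = \begin{cases} \dfrac{1}{2^{n/2}\,\Gamma(n/2)}\, x^{n/2-1} e^{-x/2}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases} $$
Note that $\chi^2(n)$ is the same as Gamma$(n/2, 2)$.

7. Rayleigh: $X \sim R(\sigma^2)$ if (Fig. 13)
$$ f_X(x) = \begin{cases} \dfrac{x}{\sigma^2}\, e^{-x^2/2\sigma^2}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases} $$

[Fig. 12: Chi-square density. Fig. 13: Rayleigh density.]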
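As a sanity check on the densities listed above, the sketch below confirms numerically that the chi-square density coincides with the Gamma$(n/2, 2)$ density and that a few of the densities integrate to approximately 1. It is an illustration only: the function names and the midpoint-rule helper `integrate` are assumptions made here, and the upper integration limits are chosen so that the neglected tails are negligible.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma density x^(alpha-1) e^(-x/beta) / (Gamma(alpha) beta^alpha), x >= 0."""
    if x < 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

def chi_square_pdf(x, n):
    """Chi-square density with n degrees of freedom, x >= 0."""
    if x < 0:
        return 0.0
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))

def rayleigh_pdf(x, sigma2):
    """Rayleigh density (x / sigma^2) e^(-x^2 / (2 sigma^2)), x >= 0."""
    if x < 0:
        return 0.0
    return x / sigma2 * math.exp(-x * x / (2 * sigma2))

def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# chi-square(n) versus Gamma(n/2, 2) at a few sample points
same_density = all(
    math.isclose(chi_square_pdf(x, 3), gamma_pdf(x, 1.5, 2.0))
    for x in (0.5, 1.0, 3.0, 7.0)
)
area_chi2 = integrate(lambda x: chi_square_pdf(x, 4), 0.0, 60.0)
area_rayleigh = integrate(lambda x: rayleigh_pdf(x, 2.0), 0.0, 40.0)
print(same_density, area_chi2, area_rayleigh)
```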

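Discrete-type random variables

1. Bernoulli: $X$ takes the values $(0, 1)$, and
$$ P(X = 0) = q, \qquad P(X = 1) = p. $$

2. Binomial: $X \sim B(n, p)$ if (Fig. 17)
$$ P(X = k) = \binom{n}{k} p^k q^{n-k}, \qquad k = 0, 1, 2, \ldots, n. $$

3. Poisson: $X \sim P(\lambda)$ if (Fig. 18)
$$ P(X = k) = e^{-\lambda}\, \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \ldots, \infty. $$

[Fig. 17: Binomial p.m.f. Fig. 18: Poisson p.m.f.]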

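4. Hypergeometric:
$$ P(X = k) = \frac{\binom{m}{k} \binom{N-m}{n-k}}{\binom{N}{n}}, \qquad \max(0,\, m + n - N) \le k \le \min(m, n). $$

5. Geometric: $X \sim g(p)$ if
$$ P(X = k) = p\, q^k, \qquad k = 0, 1, 2, \ldots, \infty, \quad q = 1 - p. $$

6. Negative Binomial: $X \sim NB(r, p)$ if
$$ P(X = k) = \binom{k-1}{r-1} p^r q^{k-r}, \qquad k = r, r+1, \ldots $$

7. Discrete-Uniform:
$$ P(X = k) = \frac{1}{N}, \qquad k = 1, 2, \ldots, N. $$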
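The probability mass functions above translate directly into code. The following sketch, with function names chosen here for illustration, checks that each p.m.f. sums to 1 over its support; the infinite supports are truncated at an arbitrary point where the remaining tail is negligible.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def hypergeometric_pmf(k, N, m, n):
    """k marked items in a sample of n drawn without replacement from N items, m of them marked."""
    return math.comb(m, k) * math.comb(N - m, n - k) / math.comb(N, n)

def geometric_pmf(k, p):
    return p * (1 - p) ** k

def negative_binomial_pmf(k, r, p):
    """Form listed above, with support k = r, r+1, ..."""
    return math.comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)

totals = {
    "binomial": sum(binomial_pmf(k, 10, 0.3) for k in range(11)),
    "poisson": sum(poisson_pmf(k, 3.0) for k in range(100)),
    # support: max(0, m + n - N) <= k <= min(m, n), here 0..5
    "hypergeometric": sum(hypergeometric_pmf(k, 20, 7, 5) for k in range(6)),
    "geometric": sum(geometric_pmf(k, 0.4) for k in range(300)),
    "negative binomial": sum(negative_binomial_pmf(k, 3, 0.4) for k in range(3, 300)),
}
for name, total in totals.items():
    print(f"{name}: {total:.12f}")
```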

Polya's distribution

Polya's distribution includes both the binomial and the hypergeometric distributions as special cases. A box contains $a$ white balls and $b$ black balls. A ball is drawn at random, and it is replaced along with $c$ balls of the same color. If $X$ represents the number of white balls drawn in $n$ such draws, $X = 0, 1, 2, \ldots, n$, find the probability mass function of $X$.

Solution: Consider the specific sequence of draws where $k$ white balls are first drawn, followed by $n - k$ black balls. The probability of drawing $k$ successive white balls is given by
$$ p_W = \frac{a}{a+b} \cdot \frac{a+c}{a+b+c} \cdot \frac{a+2c}{a+b+2c} \cdots \frac{a+(k-1)c}{a+b+(k-1)c}. $$
Similarly the probability of drawing $k$ white balls

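followed by $n - k$ black balls is given by
$$ p_k = p_W \cdot \frac{b}{a+b+kc} \cdot \frac{b+c}{a+b+(k+1)c} \cdots \frac{b+(n-k-1)c}{a+b+(n-1)c} = \prod_{i=0}^{k-1} \frac{a+ic}{a+b+ic} \prod_{j=0}^{n-k-1} \frac{b+jc}{a+b+(j+k)c}. $$
Interestingly, $p_k$ above also represents the probability of drawing $k$ white balls and $(n - k)$ black balls in any other specific order (i.e., the same set of numerator and denominator terms contributes to all other sequences as well). But there are $\binom{n}{k}$ such distinct mutually exclusive sequences, and summing over all of them, we obtain the Polya distribution (probability of getting $k$ white balls in $n$ draws) to be
$$ P(X = k) = \binom{n}{k} p_k = \binom{n}{k} \prod_{i=0}^{k-1} \frac{a+ic}{a+b+ic} \prod_{j=0}^{n-k-1} \frac{b+jc}{a+b+(j+k)c}, \qquad k = 0, 1, 2, \ldots, n. $$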

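Both the binomial distribution and the hypergeometric distribution are special cases of it. For example, if the draws are done with replacement, then $c = 0$ and it simplifies to the binomial distribution
$$ P(X = k) = \binom{n}{k} p^k q^{n-k}, \qquad k = 0, 1, 2, \ldots, n, $$
where
$$ p = \frac{a}{a+b}, \qquad q = \frac{b}{a+b} = 1 - p. $$
Similarly, if the draws are conducted without replacement, then $c = -1$, and it gives
$$ P(X = k) = \binom{n}{k}\, \frac{a(a-1)(a-2)\cdots(a-k+1)}{(a+b)(a+b-1)\cdots(a+b-k+1)} \cdot \frac{b(b-1)\cdots(b-n+k+1)}{(a+b-k)(a+b-k-1)\cdots(a+b-n+1)} $$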

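In factorial form, the case $c = -1$ reads
$$ P(X = k) = \frac{n!}{k!\,(n-k)!} \cdot \frac{a!\,(a+b-k)!}{(a-k)!\,(a+b)!} \cdot \frac{b!\,(a+b-n)!}{(b-n+k)!\,(a+b-k)!} = \frac{\binom{a}{k} \binom{b}{n-k}}{\binom{a+b}{n}}, $$
which represents the hypergeometric distribution.

Finally, $c = +1$ gives (replacements are doubled)
$$ P(X = k) = \binom{n}{k}\, \frac{(a+k-1)!\,(a+b-1)!}{(a-1)!\,(a+b+k-1)!} \cdot \frac{(b+n-k-1)!\,(a+b+k-1)!}{(b-1)!\,(a+b+n-1)!} = \frac{\binom{a+k-1}{k} \binom{b+n-k-1}{n-k}}{\binom{a+b+n-1}{n}}. $$
We shall refer to it as Polya's $+1$ distribution. The general Polya distribution has been used to study the spread of contagious diseases (epidemic modeling).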
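The product formula and its three special cases can be checked with exact arithmetic. This is a minimal sketch under the urn model described above; the function name `polya_pmf` and the sample values $a = 5$, $b = 4$, $n = 4$ are chosen here for illustration only.

```python
import math
from fractions import Fraction

def polya_pmf(k, n, a, b, c):
    """P(X = k): k white balls in n draws from an urn with a white and b black
    balls, where each drawn ball is returned together with c balls of its color."""
    prob = Fraction(math.comb(n, k))
    for i in range(k):            # k white draws
        prob *= Fraction(a + i * c, a + b + i * c)
    for j in range(n - k):        # n - k black draws
        prob *= Fraction(b + j * c, a + b + (j + k) * c)
    return prob

a, b, n = 5, 4, 4
p = Fraction(a, a + b)

for k in range(n + 1):
    binomial = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
    hypergeometric = Fraction(math.comb(a, k) * math.comb(b, n - k), math.comb(a + b, n))
    polya_plus_one = Fraction(
        math.comb(a + k - 1, k) * math.comb(b + n - k - 1, n - k),
        math.comb(a + b + n - 1, n),
    )
    print(k,
          polya_pmf(k, n, a, b, 0) == binomial,          # c = 0
          polya_pmf(k, n, a, b, -1) == hypergeometric,   # c = -1
          polya_pmf(k, n, a, b, 1) == polya_plus_one)    # c = +1
```

For $c = -1$ the sketch assumes $n \le a + b$, so that every denominator stays positive.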

Binomial Random Variable Approximations

Let $X$ represent a Binomial r.v.; then
$$ P(k_1 \le X \le k_2) = \sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k}. $$
Since the binomial coefficient $\binom{n}{k} = \dfrac{n!}{(n-k)!\,k!}$ grows quite rapidly with $n$, it is difficult to compute it for large $n$. In this context, two approximations are extremely useful.

The Normal Approximation (DeMoivre-Laplace Theorem)

Suppose $n \to \infty$ with $p$ held fixed. Then for $k$ in the $\sqrt{npq}$ neighborhood of $np$, we can approximate

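$$ \binom{n}{k} p^k q^{n-k} \simeq \frac{1}{\sqrt{2\pi npq}}\, e^{-(k-np)^2/(2npq)}. $$
Thus if $k_1$ and $k_2$ are within or around the neighborhood of the interval $\left(np - \sqrt{npq},\; np + \sqrt{npq}\right)$, we can approximate the summation by an integration. In that case it reduces to
$$ P(k_1 \le X \le k_2) = \int_{k_1}^{k_2} \frac{1}{\sqrt{2\pi npq}}\, e^{-(x-np)^2/(2npq)}\, dx = \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy, $$
where
$$ x_1 = \frac{k_1 - np}{\sqrt{npq}}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}}. $$
We can express it in terms of the normalized integral
$$ \mathrm{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_0^x e^{-y^2/2}\, dy = -\mathrm{erf}(-x), $$
which has been tabulated extensively (see the table of erf values).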

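For example, if $x_1$ and $x_2$ are both positive, we obtain
$$ P(k_1 \le X \le k_2) = \mathrm{erf}(x_2) - \mathrm{erf}(x_1). $$

Example 1: A fair coin is tossed 5,000 times. Find the probability that the number of heads is between 2,475 and 2,525.

Solution: We need $P(2{,}475 \le X \le 2{,}525)$. Here $n$ is large, so we can use the normal approximation. In this case $p = 1/2$, so that $np = 2{,}500$ and $\sqrt{npq} \simeq 35$. Since $np - \sqrt{npq} \simeq 2{,}465$ and $np + \sqrt{npq} \simeq 2{,}535$, the approximation is valid for $k_1 = 2{,}475$ and $k_2 = 2{,}525$. Thus
$$ P(k_1 \le X \le k_2) = \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy. $$
Here
$$ x_1 = \frac{k_1 - np}{\sqrt{npq}} = -\frac{5}{7}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}} = \frac{5}{7}. $$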

Table: $\mathrm{erf}(x) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_0^x e^{-y^2/2}\, dy = G(x) - \dfrac{1}{2}$

   x     erf(x)  |    x     erf(x)  |    x     erf(x)  |    x     erf(x)
  0.05  0.01994  |   0.80  0.28814  |   1.55  0.43943  |   2.30  0.48928
  0.10  0.03983  |   0.85  0.30234  |   1.60  0.44520  |   2.35  0.49061
  0.15  0.05962  |   0.90  0.31594  |   1.65  0.45053  |   2.40  0.49180
  0.20  0.07926  |   0.95  0.32894  |   1.70  0.45543  |   2.45  0.49286
  0.25  0.09871  |   1.00  0.34134  |   1.75  0.45994  |   2.50  0.49379
  0.30  0.11791  |   1.05  0.35314  |   1.80  0.46407  |   2.55  0.49461
  0.35  0.13683  |   1.10  0.36433  |   1.85  0.46784  |   2.60  0.49534
  0.40  0.15542  |   1.15  0.37493  |   1.90  0.47128  |   2.65  0.49597
  0.45  0.17364  |   1.20  0.38493  |   1.95  0.47441  |   2.70  0.49653
  0.50  0.19146  |   1.25  0.39435  |   2.00  0.47726  |   2.75  0.49702
  0.55  0.20884  |   1.30  0.40320  |   2.05  0.47982  |   2.80  0.49744
  0.60  0.22575  |   1.35  0.41149  |   2.10  0.48214  |   2.85  0.49781
  0.65  0.24215  |   1.40  0.41924  |   2.15  0.48422  |   2.90  0.49813
  0.70  0.25804  |   1.45  0.42647  |   2.20  0.48610  |   2.95  0.49841
  0.75  0.27337  |   1.50  0.43319  |   2.25  0.48778  |   3.00  0.49865
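The normalized integral tabulated here differs from the error function in most software libraries. Assuming Python's `math.erf` (the standard error function), the slide's $\mathrm{erf}(x)$ equals `0.5 * math.erf(x / sqrt(2))`. The sketch below, with the hypothetical name `erf_table`, reproduces a few table entries and compares the normal approximation for the 5,000-toss coin example with the exact binomial sum. It uses the exact $\sqrt{npq}$ rather than the rounded value 35, so the numbers differ slightly from a hand calculation.

```python
import math

def erf_table(x):
    """Normalized integral (1/sqrt(2*pi)) * int_0^x exp(-y^2/2) dy = G(x) - 1/2."""
    return 0.5 * math.erf(x / math.sqrt(2))

for x in (0.05, 0.70, 1.00, 3.00):
    print(f"erf({x:.2f}) = {erf_table(x):.5f}")

# Coin example: n = 5000 fair tosses, heads between 2475 and 2525.
n, p = 5000, 0.5
k1, k2 = 2475, 2525

# Exact probability with integer arithmetic (valid because p = 1/2).
exact = sum(math.comb(n, k) for k in range(k1, k2 + 1)) / 2 ** n

# Normal (DeMoivre-Laplace) approximation.
sigma = math.sqrt(n * p * (1 - p))
x1 = (k1 - n * p) / sigma
x2 = (k2 - n * p) / sigma
approx = erf_table(x2) - erf_table(x1)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```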

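Since $x_1 < 0$, from panel (b) of the figure, the above probability is given by
$$ P(2{,}475 \le X \le 2{,}525) = \mathrm{erf}(x_2) - \mathrm{erf}(x_1) = \mathrm{erf}(x_2) + \mathrm{erf}(|x_1|) = 2\,\mathrm{erf}\!\left(\tfrac{5}{7}\right) \simeq 0.516, $$
where we have used the table value $\mathrm{erf}(0.7) \simeq 0.258$.

[Figure: the density $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ with the area between $x_1$ and $x_2$ marked; (a) $x_1 > 0$, $x_2 > 0$; (b) $x_1 < 0$, $x_2 > 0$.]

The Poisson Approximation

As we have mentioned earlier, for large $n$, the Gaussian approximation of a binomial r.v. is valid only if $p$ is fixed, i.e., only if $np \gg 1$ and $npq \gg 1$. What if $np$ is small, or if it does not increase with $n$?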

Obviously that is the case if, for example, $p \to 0$ as $n \to \infty$ such that $np = \lambda$ is a fixed number.

Many random phenomena in nature in fact follow this pattern. The total number of calls on a telephone line, claims in an insurance company, etc. tend to follow this type of behavior.

Consider random arrivals such as telephone calls over a line. Let $n$ represent the total number of calls in the interval $(0, T)$. From our experience, as $T \to \infty$ we have $n \to \infty$, so that we may assume $n = \mu T$. Consider a small interval of duration $\Delta$ as in the figure. If there is only a single call coming in, the probability $p$ of that single call occurring in that interval must depend on its relative size with respect to $T$.

[Figure: time axis from $0$ to $T$ with the call arrivals and a small interval of duration $\Delta$ marked.]

Hence we may assume $p = \Delta / T$. Note that $p \to 0$ as $T \to \infty$. However, in this case
$$ np = \mu T \cdot \frac{\Delta}{T} = \mu \Delta = \lambda $$
is a constant, and the normal approximation is invalid here.

Suppose the interval $\Delta$ in the figure is of interest to us. A call inside that interval is a "success" (H), whereas one outside is a "failure" (T). This is equivalent to the coin-tossing situation, and hence the probability $P_n(k)$ of obtaining $k$ calls (in any order) in an interval of duration $\Delta$ is given by the binomial p.m.f. Thus
$$ P_n(k) = \frac{n!}{(n-k)!\,k!}\, p^k (1-p)^{n-k}, $$
and here as $n \to \infty$, $p \to 0$ such that $np = \lambda$. It is easy to obtain an excellent approximation in that situation. To see this, rewrite it as

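$$ P_n(k) = \frac{n(n-1)\cdots(n-k+1)}{n^k} \cdot \frac{(np)^k}{k!} \left(1 - \frac{np}{n}\right)^{n-k} = \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \frac{\lambda^k}{k!}\, \frac{(1 - \lambda/n)^n}{(1 - \lambda/n)^k}. $$
Thus
$$ \lim_{n \to \infty,\; p \to 0,\; np = \lambda} P_n(k) = \frac{\lambda^k}{k!}\, e^{-\lambda}, $$
since the finite product $\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right)$ as well as $\left(1 - \frac{\lambda}{n}\right)^k$ tend to unity as $n \to \infty$, and
$$ \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^n = e^{-\lambda}. $$
The right side of the limit represents the Poisson p.m.f., and the Poisson approximation to the binomial r.v. is valid in situations where the binomial r.v. parameters $n$ and $p$ diverge to two extremes ($n \to \infty$, $p \to 0$) such that their product $np$ is a constant.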
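The limit can be observed numerically by holding $np = \lambda$ fixed and letting $n$ grow. The sketch below is an illustration only; the values $\lambda = 2$ and $k = 3$ are arbitrary choices made here.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam, k = 2.0, 3
target = poisson_pmf(k, lam)

errors = []
for n in (10, 100, 1000, 10000):
    p = lam / n                      # keep np = lambda fixed while n grows
    value = binomial_pmf(k, n, p)
    errors.append(abs(value - target))
    print(f"n = {n:6d}  binomial = {value:.6f}  Poisson = {target:.6f}")
```

In this run the gap between the two p.m.f. values shrinks with each tenfold increase in $n$, as the derivation suggests.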

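Example 2: Winning a Lottery. Suppose two million lottery tickets are issued with 100 winning tickets among them. (a) If a person purchases 100 tickets, what is the probability of winning? (b) How many tickets should one buy to be 95% confident of having a winning ticket?

Solution: The probability of buying a winning ticket is
$$ p = \frac{\text{No. of winning tickets}}{\text{Total no. of tickets}} = \frac{100}{2 \times 10^6} = 5 \times 10^{-5}. $$
Here $n = 100$, and the number of winning tickets $X$ in the $n$ purchased tickets has an approximate Poisson distribution with parameter $\lambda = np = 100 \times 5 \times 10^{-5} = 0.005$. Thus
$$ P(X = k) = e^{-\lambda}\, \frac{\lambda^k}{k!}, $$
and

(a) Probability of winning $= P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-\lambda} \simeq 0.005$.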

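(b) In this case we need $P(X \ge 1) \ge 0.95$.
$$ P(X \ge 1) = 1 - e^{-\lambda} \ge 0.95 \quad \text{implies} \quad \lambda \ge \ln 20 \simeq 3. $$
But $\lambda = np = n \times 5 \times 10^{-5} \ge 3$, or $n \ge 60{,}000$. Thus one needs to buy about 60,000 tickets to be 95% confident of having a winning ticket!

Example 3: A spacecraft has $n = 10^5$ components. The probability of any one component being defective is $2 \times 10^{-5}$ ($p \to 0$). The mission will be in danger if five or more components become defective. Find the probability of such an event.

Solution: Here $n$ is large and $p$ is small, and hence the Poisson approximation is valid. Thus $np = \lambda = 100{,}000 \times 2 \times 10^{-5} = 2$, and the desired probability is given by
$$ P(X \ge 5) = 1 - P(X \le 4) = 1 - \sum_{k=0}^{4} e^{-\lambda}\, \frac{\lambda^k}{k!} = 1 - e^{-2}\left(1 + 2 + 2 + \frac{4}{3} + \frac{2}{3}\right) \simeq 0.0527. $$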
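The arithmetic in the lottery and spacecraft examples can be reproduced in a few lines. This is a sketch under the Poisson approximation used above; the variable names are chosen here, and the ticket count is computed from $\ln 20$ directly rather than from the rounded value 3, so it comes out slightly below 60,000.

```python
import math

# Lottery: 100 winning tickets among two million.
p_win = 100 / 2_000_000                 # probability that a single ticket wins

# (a) 100 tickets purchased: lambda = n * p, P(X >= 1) = 1 - exp(-lambda)
lam_lottery = 100 * p_win
prob_win = 1 - math.exp(-lam_lottery)

# (b) smallest n with 1 - exp(-n * p) >= 0.95, i.e. n * p >= ln 20
tickets_needed = math.ceil(math.log(20) / p_win)

# Spacecraft: n = 100,000 components, each defective with probability 2e-5.
lam_craft = 100_000 * 2e-5
prob_danger = 1 - sum(
    math.exp(-lam_craft) * lam_craft ** k / math.factorial(k) for k in range(5)
)

print(f"P(win with 100 tickets) = {prob_win:.6f}")
print(f"tickets for 95% confidence = {tickets_needed}")
print(f"P(5 or more defective)  = {prob_danger:.4f}")
```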