Quick Review of Probability
Berlin Chen
Department of Computer Science & Information Engineering
National Taiwan Normal University
References:
1. W. Navidi. Statistics for Engineers and Scientists. Chapter 2 & Teaching Material.
2. D. P. Bertsekas, J. N. Tsitsiklis. Introduction to Probability.

Sample Statistics and Population Parameters
[Figure: a sample is drawn from the population; statistics computed from the sample are used for inference about the population parameters.]
Statistics-Berlin Chen

Basic Ideas
Definition: An experiment is a process that results in an outcome that cannot be predicted in advance with certainty.
Examples:
Rolling a die
Tossing a coin
Weighing the contents of a box of cereal
Definition: The set of all possible outcomes of an experiment is called the sample space for the experiment.
Examples:
For rolling a fair die, the sample space is {1, 2, 3, 4, 5, 6}.
For a coin toss, the sample space is {heads, tails}.
For weighing a cereal box, the sample space is $(0, \infty)$; a more reasonable sample space is (12, 20) for a 16 oz. box (with an infinite number of outcomes).

More Terminology
Definition: A subset of a sample space is called an event.
The empty set Ø is an event.
The entire sample space is also an event.
A given event is said to have occurred if the outcome of the experiment is one of the outcomes in the event. For example, if a die comes up 2, the events {2, 4, 6} and {1, 2, 3} have both occurred, along with every other event that contains the outcome 2.

Combining Events
The union of two events A and B, denoted $A \cup B$, is the set of outcomes that belong either to A, to B, or to both.
In words, $A \cup B$ means "A or B". So the event "A or B" occurs whenever either A or B (or both) occurs.
Example: Let A = {1, 2, 3} and B = {2, 3, 4}. Then $A \cup B$ = {1, 2, 3, 4}.

Intersections
The intersection of two events A and B, denoted $A \cap B$, is the set of outcomes that belong to A and to B.
In words, $A \cap B$ means "A and B". Thus the event "A and B" occurs whenever both A and B occur.
Example: Let A = {1, 2, 3} and B = {2, 3, 4}. Then $A \cap B$ = {2, 3}.

Complements
The complement of an event A, denoted $A^c$, is the set of outcomes that do not belong to A.
In words, $A^c$ means "not A". Thus the event "not A" occurs whenever A does not occur.
Example: Consider rolling a fair six-sided die. Let A be the event "rolling a six" = {6}. Then $A^c$ = "not rolling a six" = {1, 2, 3, 4, 5}.

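The union, intersection, and complement examples can be checked with Python's built-in set type. This is a minimal sketch; the variable names are illustrative and the sample space is assumed to be one roll of a six-sided die.

```python
# Sample space for one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

A = {1, 2, 3}
B = {2, 3, 4}
six = {6}

union = A | B                          # outcomes in A, in B, or in both
intersection = A & B                   # outcomes in both A and B
complement_six = sample_space - six    # outcomes that are not a six

print(sorted(union))           # [1, 2, 3, 4]
print(sorted(intersection))    # [2, 3]
print(sorted(complement_six))  # [1, 2, 3, 4, 5]
```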
Mutually Exclusive Events
Definition: The events A and B are said to be mutually exclusive if they have no outcomes in common.
More generally, a collection of events $A_1, A_2, \ldots, A_n$ is said to be mutually exclusive if no two of them have any outcomes in common.
Sometimes mutually exclusive events are referred to as disjoint events.

Example
When you flip a coin, you cannot have the coin come up both heads and tails.
The following Venn diagram illustrates mutually exclusive events.
[Figure: Venn diagram of mutually exclusive events]

Probabilities
Definition: Each event in the sample space has a probability of occurring. Intuitively, the probability is a quantitative measure of how likely the event is to occur.
Given any experiment and any event A:
The expression P(A) denotes the probability that the event A occurs.
P(A) is the proportion of times that the event A would occur in the long run, if the experiment were to be repeated over and over again.

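The long-run interpretation of P(A) can be illustrated with a small simulation. This is only a sketch: the fair six-sided die, the event A = "rolling a six", the seed, and the number of repetitions are all assumptions chosen for illustration.

```python
import random

random.seed(0)        # arbitrary seed, used only for reproducibility
n_trials = 100_000    # assumed number of repetitions of the experiment

# Count how often the event A = {6} occurs over many rolls.
hits = sum(1 for _ in range(n_trials) if random.randint(1, 6) == 6)
relative_frequency = hits / n_trials

# The relative frequency should be close to P(A) = 1/6.
print(relative_frequency)
```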
Axioms of Probability
1. Let S be a sample space. Then $P(S) = 1$.
2. For any event A, $0 \le P(A) \le 1$.
3. If A and B are mutually exclusive events, then $P(A \cup B) = P(A) + P(B)$.
More generally, if $A_1, A_2, \ldots$ are mutually exclusive events, then $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$

A Few Useful Things
For any event A, $P(A^c) = 1 - P(A)$.
Let Ø denote the empty set. Then $P(Ø) = 0$.
If A is an event, and $A = \{E_1, E_2, \ldots, E_n\}$ (and $E_1, E_2, \ldots, E_n$ are mutually exclusive), then $P(A) = P(E_1) + P(E_2) + \cdots + P(E_n)$.
Addition Rule (for when A and B are not mutually exclusive): $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

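The complement rule and the addition rule can be verified on a small example. The sketch below assumes a fair six-sided die with equally likely outcomes; the helper `prob` is a hypothetical name introduced here.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) for a fair die: favorable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2, 3}
B = {2, 3, 4}

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)  # 2/3 2/3

# Complement rule: P(A^c) = 1 - P(A)
p_not_a = prob(sample_space - A)
print(p_not_a == 1 - prob(A))  # True
```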
Conditional Probability and Independence
Definition: A probability that is based on a part of the sample space is called a conditional probability.
E.g., calculate the probability of an event given that the outcomes from a certain part of the sample space occur.
Let A and B be events with $P(B) \neq 0$. The conditional probability of A given B is
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
[Figure: Venn diagram]

More Definitions
Definition: Two events A and B are independent if the probability of each event remains the same whether or not the other occurs.
If $P(A) \neq 0$ and $P(B) \neq 0$, then A and B are independent if $P(B \mid A) = P(B)$ or, equivalently, $P(A \mid B) = P(A)$.
If either $P(A) = 0$ or $P(B) = 0$, then A and B are independent.
[Figure: diagram of events A and B posing the question "Are A and B independent?"]

The Multiplication (Chain) Rule
If A and B are two events and $P(B) \neq 0$, then $P(A \cap B) = P(B)\,P(A \mid B)$.
If A and B are two events and $P(A) \neq 0$, then $P(A \cap B) = P(A)\,P(B \mid A)$.
If $P(A) \neq 0$ and $P(B) \neq 0$, then both of the above hold.
If A and B are two independent events, then $P(A \cap B) = P(A)\,P(B)$.
This result can be extended to more than two events.

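Conditional probability and the product test for independence can be sketched on a fair die. The events and helper names below are hypothetical, chosen so that one pair of events is independent and another is not.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) when all six outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B); requires P(B) != 0."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """Check the product rule P(A and B) = P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_most_three = {1, 2, 3}

print(cond_prob(even, at_most_four))     # 1/2, the same as P(even)
print(independent(even, at_most_four))   # True
print(independent(even, at_most_three))  # False
```

Knowing that the roll is at most four leaves the chance of an even number unchanged, while knowing that it is at most three lowers it to 1/3.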
Law of Total Probability
If $A_1, \ldots, A_n$ are mutually exclusive and exhaustive events, and B is any event, then
$P(B) = P(A_1 \cap B) + \cdots + P(A_n \cap B)$
Exhaustive events: the union of the events covers the sample space, $S = A_1 \cup A_2 \cup \cdots \cup A_n$.
Or equivalently, if $P(A_i) \neq 0$ for each $A_i$,
$P(B) = P(B \mid A_1)P(A_1) + \cdots + P(B \mid A_n)P(A_n)$

Example
Customers who purchase a certain make of car can order an engine in any of three sizes. Of all the cars sold, 45% have the smallest engine, 35% have a medium-sized engine, and 20% have the largest. Of cars with the smallest engine, 10% fail an emissions test within two years of purchase, while 12% of those with the medium size and 15% of those with the largest engine fail. What is the probability that a randomly chosen car will fail an emissions test within two years?

Solution
Let B denote the event that a car fails an emissions test within two years. Let $A_1$ denote the event that a car has a small engine, $A_2$ the event that a car has a medium-size engine, and $A_3$ the event that a car has a large engine. Then $P(A_1) = 0.45$, $P(A_2) = 0.35$, and $P(A_3) = 0.20$. Also, $P(B \mid A_1) = 0.10$, $P(B \mid A_2) = 0.12$, and $P(B \mid A_3) = 0.15$. By the law of total probability,
$P(B) = P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) + P(B \mid A_3)P(A_3) = 0.10(0.45) + 0.12(0.35) + 0.15(0.20) = 0.117$

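The same calculation can be written as a short script. This is a sketch that takes the engine shares as 45%, 35%, and 20% and the failure rates as 10%, 12%, and 15%, which are the values that reproduce the stated answer of 0.117; the dictionary names are illustrative.

```python
# P(A_i): share of cars with each engine size.
priors = {"small": 0.45, "medium": 0.35, "large": 0.20}

# P(B | A_i): probability of failing the emissions test for each size.
fail_given_size = {"small": 0.10, "medium": 0.12, "large": 0.15}

# Law of total probability: P(B) = sum over i of P(B | A_i) P(A_i).
p_fail = sum(fail_given_size[size] * priors[size] for size in priors)

print(round(p_fail, 3))  # 0.117
```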
Bayes' Rule
Let $A_1, \ldots, A_n$ be mutually exclusive and exhaustive events, with $P(A_i) \neq 0$ for each $A_i$. Let B be any event with $P(B) \neq 0$. Then
$P(A_k \mid B) = \dfrac{P(A_k \cap B)}{P(B)} = \dfrac{P(B \mid A_k)\,P(A_k)}{\sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)}$

Example
The proportion of people in a given community who have a certain disease (D) is 0.005. A test is available to diagnose the disease. If a person has the disease, the probability that the test will produce a positive signal (+) is 0.99. If a person does not have the disease, the probability that the test will produce a positive signal is 0.01. If a person tests positive, what is the probability that the person actually has the disease?

Solution
Let D represent the event that a person actually has the disease.
Let + represent the event that the test gives a positive signal.
We wish to find $P(D \mid +)$.
We know $P(D) = 0.005$, $P(+ \mid D) = 0.99$, and $P(+ \mid D^c) = 0.01$.
Using Bayes' rule,
$P(D \mid +) = \dfrac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid D^c)\,P(D^c)} = \dfrac{0.99(0.005)}{0.99(0.005) + 0.01(0.995)} = 0.332$

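The Bayes' rule computation for the diagnostic test can be reproduced directly. The function name `posterior` and its parameter names are hypothetical; the numbers are those given in the example.

```python
def posterior(prior, p_pos_given_d, p_pos_given_not_d):
    """P(D | +) via Bayes' rule with the two-event partition {D, D^c}."""
    numerator = p_pos_given_d * prior
    # Law of total probability gives P(+) in the denominator.
    denominator = numerator + p_pos_given_not_d * (1.0 - prior)
    return numerator / denominator

p_disease_given_positive = posterior(0.005, 0.99, 0.01)
print(round(p_disease_given_positive, 3))  # 0.332
```

Even with an accurate test, the posterior probability stays near one third because the disease is rare, so most positive results come from the much larger healthy group.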
Random Variables
Definition: A random variable assigns a numerical value to each outcome in a sample space.
We can say a random variable is a real-valued function of the experimental outcome.
Definition: A random variable is discrete if its possible values form a discrete set.

Example
The number of flaws in a 1-inch length of copper wire manufactured by a certain process varies from wire to wire. Overall, 48% of the wires produced have no flaws, 39% have one flaw, 12% have two flaws, and 1% have three flaws. Let X be the number of flaws in a randomly selected piece of wire.
Then P(X = 0) = 0.48, P(X = 1) = 0.39, P(X = 2) = 0.12, and P(X = 3) = 0.01.
The list of possible values 0, 1, 2, and 3, along with the probabilities of each, provides a complete description of the population from which X was drawn.

Probability Mass Function
The description of the possible values of X and the probabilities of each has a name: the probability mass function.
Definition: The probability mass function (denoted as pmf) of a discrete random variable X is the function $p(x) = P(X = x)$.
The probability mass function is sometimes called the probability distribution.

Cumulative Distribution Function
The probability mass function specifies the probability that a random variable is equal to a given value.
A function called the cumulative distribution function (cdf) specifies the probability that a random variable is less than or equal to a given value.
The cumulative distribution function of the random variable X is the function $F(x) = P(X \le x)$.

Example
Recall the example of the number of flaws in a randomly chosen piece of wire. The following is the pmf: P(X = 0) = 0.48, P(X = 1) = 0.39, P(X = 2) = 0.12, and P(X = 3) = 0.01.
For any value x, we compute F(x) by summing the probabilities of all the possible values of X that are less than or equal to x.
F(0) = P(X ≤ 0) = 0.48
F(1) = P(X ≤ 1) = 0.48 + 0.39 = 0.87
F(2) = P(X ≤ 2) = 0.48 + 0.39 + 0.12 = 0.99
F(3) = P(X ≤ 3) = 0.48 + 0.39 + 0.12 + 0.01 = 1

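The cdf values can be computed from the pmf with a short function. This sketch uses the wire-flaw probabilities 0.48, 0.39, 0.12, and 0.01 for 0, 1, 2, and 3 flaws, which sum to 1; the names `pmf` and `cdf` are illustrative.

```python
# pmf of the number of flaws X in a randomly chosen piece of wire.
pmf = {0: 0.48, 1: 0.39, 2: 0.12, 3: 0.01}

def cdf(x):
    """F(x) = P(X <= x): sum p(t) over all possible values t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

for x in sorted(pmf):
    print(x, round(cdf(x), 2))
```

The function is defined for every real x, so `cdf(1.5)` equals F(1) and `cdf(-1)` is 0.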
More on Discrete Random Variables
Let X be a discrete random variable. Then
The probability mass function (pmf) of X is the function $p(x) = P(X = x)$.
The cumulative distribution function (cdf) of X is the function $F(x) = P(X \le x)$.
$F(x) = \sum_{t \le x} p(t) = \sum_{t \le x} P(X = t)$
$\sum_{x} p(x) = \sum_{x} P(X = x) = 1$, where the sum is over all the possible values of X.

Mean and Variance for Discrete Random Variables
The mean (or expected value) of X is given by
$\mu_X = \sum_{x} x\,P(X = x)$, also denoted as $E[X]$,
where the sum is over all possible values of X.
The variance of X is given by
$\sigma_X^2 = \sum_{x} (x - \mu_X)^2\,P(X = x)$, also denoted as $E[(X - \mu_X)^2]$
$\sigma_X^2 = \sum_{x} x^2\,P(X = x) - \mu_X^2$, also denoted as $E[X^2] - (E[X])^2$
The standard deviation is the square root of the variance.
Mean, variance, and standard deviation provide summary information for a random variable (probability distribution).

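The mean and both forms of the variance can be evaluated for the wire-flaw example. This is a sketch assuming the pmf values 0.48, 0.39, 0.12, and 0.01 for 0, 1, 2, and 3 flaws; variable names are illustrative.

```python
from math import sqrt

pmf = {0: 0.48, 1: 0.39, 2: 0.12, 3: 0.01}

# Mean: sum of x * P(X = x).
mean = sum(x * p for x, p in pmf.items())

# Variance, definition form: sum of (x - mean)^2 * P(X = x).
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Variance, shortcut form: E[X^2] - (E[X])^2.
variance_alt = sum(x * x * p for x, p in pmf.items()) - mean ** 2

std_dev = sqrt(variance)

print(round(mean, 4), round(variance, 4))  # 0.66 0.5244
```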
The Probability Histogram
When the possible values of a discrete random variable are evenly spaced, the probability mass function can be represented by a histogram, with rectangles centered at the possible values of the random variable.
The area of the rectangle centered at a value x is equal to P(X = x).
Such a histogram is called a probability histogram, because the areas represent probabilities.

Example
The following is a probability histogram for the example with the number of flaws in a randomly chosen piece of wire: P(X = 0) = 0.48, P(X = 1) = 0.39, P(X = 2) = 0.12, and P(X = 3) = 0.01.
[Figure 2.8: probability histogram for the number of flaws]

Continuous Random Variables
A random variable is continuous if its probabilities are given by areas under a curve.
The curve is called a probability density function (pdf) for the random variable. Sometimes the pdf is called the probability distribution.
Let X be a continuous random variable with probability density function f(x). Then
$\int_{-\infty}^{\infty} f(x)\,dx = 1$

Computing Probabilities
Let X be a continuous random variable with probability density function f(x). Let a and b be any two numbers, with a < b. Then
$P(a \le X \le b) = P(a \le X < b) = P(a < X \le b) = \int_{a}^{b} f(x)\,dx$
In addition,
$P(X \le a) = P(X < a) = \int_{-\infty}^{a} f(x)\,dx$
$P(X \ge a) = P(X > a) = \int_{a}^{\infty} f(x)\,dx$

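More on Continuous Random Variables
Let X be a continuous random variable with probability density function f(x). The cumulative distribution function (cdf) of X is the function
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
The mean of X is given by
$\mu_X = \int_{-\infty}^{\infty} x\,f(x)\,dx$, also denoted as $E[X]$
The variance of X is given by
$\sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu_X)^2 f(x)\,dx$, also denoted as $E[(X - \mu_X)^2]$
$\sigma_X^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu_X^2$, also denoted as $E[X^2] - (E[X])^2$
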
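These integrals can be approximated numerically. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (not taken from the slides) and uses a simple midpoint rule, so the results are approximations of the exact values 2/3 for the mean and 1/18 for the variance.

```python
def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x):
    """Assumed pdf for illustration: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

total = integrate(f, 0.0, 1.0)                       # should be 1
mean = integrate(lambda x: x * f(x), 0.0, 1.0)       # about 2/3
variance = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, 1.0)
variance_alt = integrate(lambda x: x * x * f(x), 0.0, 1.0) - mean ** 2

print(round(total, 6), round(mean, 6), round(variance, 6))
```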
Median and Percentiles
Let X be a continuous random variable with probability density function f(x) and cumulative distribution function F(x).
The median of X is the point $x_m$ that solves the equation
$F(x_m) = P(X \le x_m) = \int_{-\infty}^{x_m} f(x)\,dx = 0.5$
If p is any number between 0 and 100, the pth percentile is the point $x_p$ that solves the equation
$F(x_p) = P(X \le x_p) = \int_{-\infty}^{x_p} f(x)\,dx = p/100$
The median is the 50th percentile.

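A percentile can be found by solving F(x) = p/100 numerically. This sketch assumes a hypothetical density f(x) = 2x on [0, 1], whose cdf is F(x) = x^2 on that interval, and uses bisection; the function names are illustrative.

```python
def cdf(x):
    """F(x) for the assumed pdf f(x) = 2x on [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x * x

def percentile(p, lo=0.0, hi=1.0, tol=1e-10):
    """Solve F(x) = p/100 by bisection; F must be increasing on [lo, hi]."""
    target = p / 100.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

median = percentile(50)   # exact value is sqrt(0.5)
p90 = percentile(90)      # exact value is sqrt(0.9)
print(round(median, 4), round(p90, 4))
```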
Linear Functions of Random Variables
If X is a random variable, and a and b are constants, then
$\mu_{aX+b} = a\mu_X + b$
$\sigma_{aX+b}^2 = a^2 \sigma_X^2$
$\sigma_{aX+b} = |a|\,\sigma_X$

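The rules for aX + b can be checked exactly on a discrete example. This sketch reuses the wire-flaw pmf (0.48, 0.39, 0.12, 0.01 for 0 to 3 flaws) and picks a = -2 and b = 5 arbitrarily; the negative a shows why the standard deviation scales by |a|.

```python
from fractions import Fraction

pmf_x = {0: Fraction(48, 100), 1: Fraction(39, 100),
         2: Fraction(12, 100), 3: Fraction(1, 100)}
a, b = -2, 5  # arbitrary constants chosen for illustration

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * p for v, p in pmf.items())

# pmf of Y = aX + b: each value x maps to a*x + b with the same probability.
pmf_y = {a * x + b: p for x, p in pmf_x.items()}

mean_x, var_x = mean(pmf_x), variance(pmf_x)
mean_y, var_y = mean(pmf_y), variance(pmf_y)

print(mean_y == a * mean_x + b)   # True
print(var_y == a ** 2 * var_x)    # True
```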
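More Linear Functions
If X and Y are random variables, and a and b are constants, then
$\mu_{aX+bY} = \mu_{aX} + \mu_{bY} = a\mu_X + b\mu_Y$
More generally, if $X_1, \ldots, X_n$ are random variables and $c_1, \ldots, c_n$ are constants, then the mean of the linear combination $c_1X_1 + \cdots + c_nX_n$ is given by
$\mu_{c_1X_1 + c_2X_2 + \cdots + c_nX_n} = c_1\mu_{X_1} + c_2\mu_{X_2} + \cdots + c_n\mu_{X_n}$
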
Two Independent Random Variables
If X and Y are independent random variables, and S and T are sets of numbers, then
$P(X \in S \text{ and } Y \in T) = P(X \in S)\,P(Y \in T)$
More generally, if $X_1, \ldots, X_n$ are independent random variables, and $S_1, \ldots, S_n$ are sets, then
$P(X_1 \in S_1, X_2 \in S_2, \ldots, X_n \in S_n) = P(X_1 \in S_1)\,P(X_2 \in S_2) \cdots P(X_n \in S_n)$

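Variance Properties
If $X_1, \ldots, X_n$ are independent random variables, then the variance of the sum $X_1 + \cdots + X_n$ is given by
$\sigma_{X_1 + X_2 + \cdots + X_n}^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 + \cdots + \sigma_{X_n}^2$
If $X_1, \ldots, X_n$ are independent random variables and $c_1, \ldots, c_n$ are constants, then the variance of the linear combination $c_1X_1 + \cdots + c_nX_n$ is given by
$\sigma_{c_1X_1 + c_2X_2 + \cdots + c_nX_n}^2 = c_1^2\sigma_{X_1}^2 + c_2^2\sigma_{X_2}^2 + \cdots + c_n^2\sigma_{X_n}^2$
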
More Variance Properties
If X and Y are independent random variables with variances $\sigma_X^2$ and $\sigma_Y^2$, then the variance of the sum X + Y is
$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$
The variance of the difference X − Y is
$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2$

Independence and Simple Random Samples
Definition: If $X_1, \ldots, X_n$ is a simple random sample, then $X_1, \ldots, X_n$ may be treated as independent random variables, all from the same population.
Phrased another way, $X_1, \ldots, X_n$ are independent and identically distributed (i.i.d.).

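Properties of $\bar{X}$ (1/4)
If $X_1, \ldots, X_n$ is a simple random sample from a population with mean $\mu$ and variance $\sigma^2$, then the sample mean $\bar{X}$ is a random variable with
mean of sample mean: $\mu_{\bar{X}} = \mu$
variance of sample mean: $\sigma_{\bar{X}}^2 = \dfrac{\sigma^2}{n}$
The standard deviation of $\bar{X}$ is $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$.
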
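Properties of $\bar{X}$ (2/4)
[Figure: three simple random samples of size n drawn from a population with parameters $(\mu, \sigma^2)$: sample 1 (37, 40, 35, ..., 39) with sample mean $\bar{x}_1 = 37.8$; sample 2 (41, 38, 42, ..., 38.5) with sample mean $\bar{x}_2 = 40.2$; sample 3 (37.5, 38, 42, ..., 40.2) with sample mean $\bar{x}_3 = 38.6$.]
$X_1, X_2, \ldots, X_n$ are i.i.d. and follow the same distribution.
The sample mean can be viewed as a random variable $\bar{X}$ with values $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k, \ldots$, and can be represented as
$\bar{X} = \dfrac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right)$
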
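Properties of $\bar{X}$ (3/4)
$\mu_{\bar{X}} = E\left[\tfrac{1}{n}X_1 + \cdots + \tfrac{1}{n}X_n\right] = \tfrac{1}{n}\mu_{X_1} + \cdots + \tfrac{1}{n}\mu_{X_n} = \tfrac{1}{n}(n\mu) = \mu$
($X_1, \ldots, X_n$ are i.i.d. and follow the same distribution with mean $\mu$)
$\sigma_{\bar{X}}^2 = \sigma_{\frac{1}{n}X_1 + \cdots + \frac{1}{n}X_n}^2 = \tfrac{1}{n^2}\sigma_{X_1}^2 + \cdots + \tfrac{1}{n^2}\sigma_{X_n}^2$ ($X_1, \ldots, X_n$ are independent)
$= \tfrac{1}{n^2}(n\sigma^2) = \dfrac{\sigma^2}{n}$ ($X_1, \ldots, X_n$ are identically distributed, following the same distribution with variance $\sigma^2$)
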
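The two results for the sample mean can be confirmed exactly by enumerating every possible sample from a small discrete population. This is a sketch under stated assumptions: the population is the wire-flaw pmf (0.48, 0.39, 0.12, 0.01 for 0 to 3 flaws) and the sample size n = 3 is arbitrary.

```python
from fractions import Fraction
from itertools import product
from math import prod

pmf = {0: Fraction(48, 100), 1: Fraction(39, 100),
       2: Fraction(12, 100), 3: Fraction(1, 100)}
n = 3  # assumed sample size

# Population mean and variance.
mu = sum(x * p for x, p in pmf.items())
sigma2 = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Enumerate every possible i.i.d. sample (x_1, ..., x_n).
mean_xbar = Fraction(0)
second_moment_xbar = Fraction(0)
for sample in product(pmf, repeat=n):
    p_sample = prod(pmf[x] for x in sample)  # independence: multiply
    xbar = Fraction(sum(sample), n)
    mean_xbar += xbar * p_sample
    second_moment_xbar += xbar ** 2 * p_sample

var_xbar = second_moment_xbar - mean_xbar ** 2

print(mean_xbar == mu)          # True
print(var_xbar == sigma2 / n)   # True
```

Exact fractions are used so that the equalities hold without floating-point tolerance.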
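Properties of $\bar{X}$ (4/4)
[Figure: distribution of the sample mean, centered at the mean of the sample mean $\mu_{\bar{X}}$ (equal to the population mean $\mu$), with individual sample means $\bar{x}_i$ and $\bar{x}_j$ marked.]
The spread of the sample mean is determined by the variance of the sample mean $\sigma_{\bar{X}}^2$ (equal to $\sigma^2/n$, where $\sigma^2$ is the population variance).
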
Jointly Distributed Random Variables
If X and Y are jointly discrete random variables:
The joint probability mass function of X and Y is the function
$p(x, y) = P(X = x \text{ and } Y = y)$
The marginal probability mass functions of X and Y can be obtained from the joint probability mass function as follows:
$p_X(x) = P(X = x) = \sum_{y} p(x, y)$
$p_Y(y) = P(Y = y) = \sum_{x} p(x, y)$
where the sums are taken over all the possible values of Y and of X, respectively (marginalization).
The joint probability mass function has the property that
$\sum_{x}\sum_{y} p(x, y) = 1$
where the sum is taken over all the possible values of X and Y.

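Marginalization amounts to summing the joint table over the other variable. The joint pmf below is a hypothetical table invented for illustration, not data from the slides.

```python
from collections import defaultdict
from fractions import Fraction

tenth = Fraction(1, 10)

# Hypothetical joint pmf p(x, y) for X in {0, 1} and Y in {0, 1, 2}.
joint = {
    (0, 0): 1 * tenth, (0, 1): 2 * tenth, (0, 2): 1 * tenth,
    (1, 0): 3 * tenth, (1, 1): 2 * tenth, (1, 2): 1 * tenth,
}

p_x = defaultdict(Fraction)  # marginal pmf of X: sum over y
p_y = defaultdict(Fraction)  # marginal pmf of Y: sum over x
for (x, y), p in joint.items():
    p_x[x] += p
    p_y[y] += p

print(dict(p_x))  # p_X(0) = 2/5, p_X(1) = 3/5
print(dict(p_y))  # p_Y(0) = 2/5, p_Y(1) = 2/5, p_Y(2) = 1/5
```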
Jointly Continuous Random Variables
If X and Y are jointly continuous random variables, with joint probability density function f(x, y), and a < b, c < d, then
$P(a \le X \le b \text{ and } c \le Y \le d) = \int_{a}^{b}\int_{c}^{d} f(x, y)\,dy\,dx$
The joint probability density function has the property that
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx = 1$

Marginals of X and Y
If X and Y are jointly continuous with joint probability density function f(x, y), then the marginal probability density functions of X and Y are given, respectively, by
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$
Such a process is called marginalization.

More Than Two Random Variables
If the random variables $X_1, \ldots, X_n$ are jointly discrete, the joint probability mass function is
$p(x_1, \ldots, x_n) = P(X_1 = x_1, \ldots, X_n = x_n)$
If the random variables $X_1, \ldots, X_n$ are jointly continuous, they have a joint probability density function $f(x_1, x_2, \ldots, x_n)$, where
$P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_n}^{b_n} \cdots \int_{a_1}^{b_1} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$
for any constants $a_1 \le b_1, \ldots, a_n \le b_n$.

Means of Functions of Random Variables (1/2)
If the random variables $X_1, \ldots, X_n$ are jointly discrete, the joint probability mass function is
$p(x_1, \ldots, x_n) = P(X_1 = x_1, \ldots, X_n = x_n)$
If the random variables $X_1, \ldots, X_n$ are jointly continuous, they have a joint probability density function $f(x_1, x_2, \ldots, x_n)$, where
$P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_n}^{b_n} \cdots \int_{a_1}^{b_1} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$
for any constants $a_1 \le b_1, \ldots, a_n \le b_n$.

Means of Functions of Random Variables (2/2)
Let X be a random variable, and let h(X) be a function of X. Then:
If X is discrete with probability mass function p(x), the mean of h(X) is given by
$\mu_{h(X)} = \sum_{x} h(x)\,p(x)$, also denoted as $E[h(X)]$,
where the sum is taken over all the possible values of X.
If X is continuous with probability density function f(x), the mean of h(X) is given by
$\mu_{h(X)} = \int_{-\infty}^{\infty} h(x)\,f(x)\,dx$, also denoted as $E[h(X)]$

Functions of Joint Random Variables
If X and Y are jointly distributed random variables, and h(X, Y) is a function of X and Y, then
If X and Y are jointly discrete with joint probability mass function p(x, y),
$\mu_{h(X,Y)} = \sum_{x}\sum_{y} h(x, y)\,p(x, y)$
where the sum is taken over all possible values of X and Y.
If X and Y are jointly continuous with joint probability density function f(x, y),
$\mu_{h(X,Y)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x, y)\,f(x, y)\,dx\,dy$

Discrete Conditional Distributions
Let X and Y be jointly discrete random variables, with joint probability mass function p(x, y). Let $p_X(x)$ denote the marginal probability mass function of X and let x be any number for which $p_X(x) > 0$.
The conditional probability mass function of Y given X = x is
$p_{Y|X}(y \mid x) = \dfrac{p(x, y)}{p_X(x)}$
Note that for any particular values of x and y, the value of $p_{Y|X}(y \mid x)$ is just the conditional probability $P(Y = y \mid X = x)$.

Continuous Conditional Distributions
Let X and Y be jointly continuous random variables, with joint probability density function f(x, y). Let $f_X(x)$ denote the marginal density function of X and let x be any number for which $f_X(x) > 0$.
The conditional probability density function of Y given X = x is
$f_{Y|X}(y \mid x) = \dfrac{f(x, y)}{f_X(x)}$

Conditional Expectation
Expectation is another term for mean.
A conditional expectation is an expectation, or mean, calculated using the conditional probability mass function or conditional probability density function.
The conditional expectation of Y given X = x is denoted by $E(Y \mid X = x)$ or $\mu_{Y|X=x}$.

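For jointly discrete variables, the conditional pmf and the conditional expectation follow directly from the joint table. The joint pmf and the function names below are hypothetical, introduced only to illustrate the definitions.

```python
from fractions import Fraction

tenth = Fraction(1, 10)

# Hypothetical joint pmf p(x, y) for X in {0, 1} and Y in {0, 1, 2}.
joint = {
    (0, 0): 1 * tenth, (0, 1): 2 * tenth, (0, 2): 1 * tenth,
    (1, 0): 3 * tenth, (1, 1): 2 * tenth, (1, 2): 1 * tenth,
}

def marginal_x(x):
    """p_X(x): sum the joint pmf over all y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def cond_pmf_y_given_x(y, x):
    """p_{Y|X}(y | x) = p(x, y) / p_X(x), defined only when p_X(x) > 0."""
    px = marginal_x(x)
    if px == 0:
        raise ValueError("conditional pmf is undefined when p_X(x) = 0")
    return joint.get((x, y), Fraction(0)) / px

def cond_expectation_y_given_x(x):
    """E(Y | X = x): mean of Y under the conditional pmf."""
    y_values = {y for (_, y) in joint}
    return sum(y * cond_pmf_y_given_x(y, x) for y in y_values)

print(cond_pmf_y_given_x(1, 0))       # 1/2
print(cond_expectation_y_given_x(0))  # 1
print(cond_expectation_y_given_x(1))  # 2/3
```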
Independence (1/2)
Random variables $X_1, \ldots, X_n$ are independent, provided that:
If $X_1, \ldots, X_n$ are jointly discrete, the joint probability mass function is equal to the product of the marginals:
$p(x_1, \ldots, x_n) = p_{X_1}(x_1) \cdots p_{X_n}(x_n)$
If $X_1, \ldots, X_n$ are jointly continuous, the joint probability density function is equal to the product of the marginals:
$f(x_1, \ldots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n)$

Independence (2/2)
If X and Y are independent random variables, then:
If X and Y are jointly discrete, and x is a value for which $p_X(x) > 0$, then $p_{Y|X}(y \mid x) = p_Y(y)$.
If X and Y are jointly continuous, and x is a value for which $f_X(x) > 0$, then $f_{Y|X}(y \mid x) = f_Y(y)$.

Covariance
Let X and Y be random variables with means $\mu_X$ and $\mu_Y$.
The covariance of X and Y is
$\mathrm{Cov}(X, Y) = \mu_{(X - \mu_X)(Y - \mu_Y)}$
An alternative formula is
$\mathrm{Cov}(X, Y) = \mu_{XY} - \mu_X\mu_Y$

Correlation
Let X and Y be jointly distributed random variables with standard deviations $\sigma_X$ and $\sigma_Y$.
The correlation between X and Y, also called the correlation coefficient, is denoted $\rho_{X,Y}$ and is given by
$\rho_{X,Y} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X\sigma_Y}$
For any two random variables X and Y,
$-1 \le \rho_{X,Y} \le 1$

Covariance, Correlation, and Independence
If $\mathrm{Cov}(X, Y) = \rho_{X,Y} = 0$, then X and Y are said to be uncorrelated.
If X and Y are independent, then X and Y are uncorrelated.
It is mathematically possible for X and Y to be uncorrelated without being independent. This rarely occurs in practice.

Example
The pair of random variables (X, Y) takes the values (1, 0), (0, 1), (−1, 0), and (0, −1), each with probability 1/4.
Thus, the marginal pmfs of X and Y are symmetric around 0, and E[X] = E[Y] = 0.
Furthermore, for all possible value pairs (x, y), either x or y is equal to 0, which implies that XY = 0 and E[XY] = 0. Therefore, cov(X, Y) = E[(X − E[X])(Y − E[Y])] = 0, and X and Y are uncorrelated.
However, X and Y are not independent since, for example, a nonzero value of X fixes the value of Y to zero.

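This example can be verified numerically. The sketch takes the four equally likely points as (1, 0), (0, 1), (-1, 0), and (0, -1), computes the covariance, and then tests the product rule for independence at one point; variable names are illustrative.

```python
from fractions import Fraction

quarter = Fraction(1, 4)
joint = {(1, 0): quarter, (0, 1): quarter, (-1, 0): quarter, (0, -1): quarter}

e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y  # alternative covariance formula

# Independence would require p(x, y) = p_X(x) p_Y(y) at every point.
p_x1 = sum(p for (x, _), p in joint.items() if x == 1)  # P(X = 1)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)  # P(Y = 0)
product_rule_holds = joint[(1, 0)] == p_x1 * p_y0

print(cov)                 # 0
print(product_rule_holds)  # False
```

The covariance is zero, yet P(X = 1, Y = 0) = 1/4 differs from P(X = 1)P(Y = 0) = 1/8, so the variables are uncorrelated but dependent.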
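Variance of a Linear Combination of Random Variables (1/2)
If $X_1, \ldots, X_n$ are random variables and $c_1, \ldots, c_n$ are constants, then
$\mu_{c_1X_1 + \cdots + c_nX_n} = c_1\mu_{X_1} + \cdots + c_n\mu_{X_n}$
$\sigma_{c_1X_1 + \cdots + c_nX_n}^2 = c_1^2\sigma_{X_1}^2 + \cdots + c_n^2\sigma_{X_n}^2 + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} c_i c_j\,\mathrm{Cov}(X_i, X_j)$
For the case of two random variables,
$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\,\mathrm{Cov}(X, Y)$
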
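The two-variable formula can be checked against a direct computation of the variance of X + Y. The joint pmf below is a hypothetical table chosen so that X and Y are correlated; it does not come from the slides.

```python
from fractions import Fraction

tenth = Fraction(1, 10)

# Hypothetical joint pmf p(x, y) for X in {0, 1} and Y in {0, 1, 2}.
joint = {
    (0, 0): 1 * tenth, (0, 1): 2 * tenth, (0, 2): 1 * tenth,
    (1, 0): 3 * tenth, (1, 1): 2 * tenth, (1, 2): 1 * tenth,
}

def expect(h):
    """E[h(X, Y)] = sum over (x, y) of h(x, y) p(x, y)."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: x * y) - mu_x * mu_y

# Variance of the sum computed directly from its definition.
mu_sum = expect(lambda x, y: x + y)
var_sum_direct = expect(lambda x, y: (x + y - mu_sum) ** 2)

# Variance of the sum from the formula with the covariance term.
var_sum_formula = var_x + var_y + 2 * cov_xy

print(var_sum_direct, var_sum_formula)  # 16/25 16/25
```

Here the covariance is negative, so the variance of the sum is smaller than the sum of the variances.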
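Variance of a Linear Combination of Random Variables (2/2)
If $X_1, \ldots, X_n$ are independent random variables and $c_1, \ldots, c_n$ are constants, then
$\sigma_{c_1X_1 + \cdots + c_nX_n}^2 = c_1^2\sigma_{X_1}^2 + \cdots + c_n^2\sigma_{X_n}^2$
In particular,
$\sigma_{X_1 + \cdots + X_n}^2 = \sigma_{X_1}^2 + \cdots + \sigma_{X_n}^2$
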
Summary (1/2)
Probability and axioms (and rules)
Counting techniques
Conditional probability
Independence
Random variables: discrete and continuous
Probability mass functions

Summary (2/2)
Probability density functions
Cumulative distribution functions
Means and variances for random variables
Linear functions of random variables
Mean and variance of a sample mean
Jointly distributed random variables