A PROBABILITY PRIMER


CARLETON COLLEGE
A PROBABILITY PRIMER
SCOTT BIERMAN
(Do not quote without permission)

INTRODUCTION

The field of probability and statistics provides an organizing framework for systematically thinking about randomness. Since many decisions regarding how scarce resources are allocated are made when there is uncertainty about the consequences of those decisions, having the help of an organized way of thinking about this uncertainty can be extremely valuable. An introduction to this area is the point of the handout.

FUNDAMENTAL DEFINITIONS

As with all disciplines, probability and statistics has its own language. We begin with a few definitions that are indispensable.

Definition: Sample space: a list of all possible outcomes.

Example: You buy a stock today. The sample space of stock prices tomorrow consists of all possible stock prices tomorrow (this would be approximated by all non-negative real numbers).

Definition: Event: A set of possible outcomes.

Example: You buy a stock today. The event that the price of the stock goes up by at least 10% overnight consists of all prices that are at least 10% higher than the price you paid for the stock.

Suppose that we let N(A) be the number of times we observe the event A occurring in N trials, while N represents the number of trials. Then the relative frequency of the event A occurring is defined as N(A)/N.

Definition: Relative Frequency: The relative frequency of event A is the proportion of times that event A occurs in N trials.

This immediately brings us to the definition of the probability of an event occurring.

Definition: Probability: The probability of event A occurring is the limit of the relative frequency of event A occurring as the number of trials approaches infinity. We will write the probability of event A occurring as P(A). From this definition it follows that probabilities of events must be between 0 and 1 (inclusive).

SOME USEFUL EVENTS

Some events are mutually exclusive. Suppose we are considering two events. These two events are mutually exclusive if it is impossible for the same trial to result in both events.

Definition: Mutually Exclusive: Events A and B are mutually exclusive if and only if

P(A or B) = P(A) + P(B)

This should be read as: The events A and B are mutually exclusive if the probability of event A or event B occurring equals the probability of A occurring plus the probability of B occurring.
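The frequency definition of probability above can be illustrated with a short simulation. This is a minimal Python sketch (not part of the original handout): for the event A = "a die roll is 1 or 2", the relative frequency N(A)/N should settle near P(A) = 1/3 as N grows.

```python
import random

# Estimate P(A) by relative frequency, where A = "a fair die shows 1 or 2".
# As the number of trials N grows, N(A)/N approaches P(A) = 1/3.
random.seed(0)

for n_trials in (100, 10_000, 1_000_000):
    n_a = sum(1 for _ in range(n_trials) if random.randint(1, 6) <= 2)
    print(n_trials, n_a / n_trials)
```

With a million trials the printed relative frequency lands very close to 0.333.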

Example of two mutually exclusive events: One die is rolled. Event A is rolling a 1 or 2. Event B is rolling a 4 or 5. The probability of event A is 1/3, the probability of event B is 1/3, and the probability of A or B is 2/3.

Example of two non-mutually exclusive events: One die is rolled. Event A is rolling a 1 or 2. Event B is rolling a 2 or 3. The probability of event A is 1/3, the probability of event B is 1/3, but the probability of A or B is not 2/3. A or B will only occur when a 1, 2, or 3 is rolled. That means that the probability of event A or B occurring is 1/2, not 2/3.

Some events, when taken together, must occur. Collections of events, at least one of which must occur, are called collectively exhaustive.

Definition: Collectively Exhaustive: The events A or B or C are collectively exhaustive if

P(A or B or C) = 1

Example of collectively exhaustive events: One die is rolled. Event A is rolling a number less than 5. Event B is rolling a number greater than 3. Any roll of a die will satisfy either A or B (in fact, a roll of 4 satisfies both).

Some collections of events are mutually exclusive and collectively exhaustive.

Example of events that are mutually exclusive and collectively exhaustive: You buy a stock today. The event A is that the price of the stock goes up tomorrow. The event B is that the price of the stock goes down tomorrow. Event C is that the price of the stock does not change. The terms mutually exclusive and collectively exhaustive are used so frequently that you can expect to hear them in everyday conversation.

THE BASIC ADDITION PROPERTY OF PROBABILITY

Suppose there are two events, A and B. In many instances we will be interested in calculating the probability of event A or event B occurring. We have already seen how to do this for mutually exclusive events, but we have also seen that simply adding the probabilities of the two events does not work for non-mutually exclusive events.
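The two die examples can be checked by enumerating the six equally likely faces. A quick sketch (not from the handout): for disjoint events the simple sum P(A) + P(B) works; for overlapping events it double-counts the overlap.

```python
from fractions import Fraction

# Enumerate the six equally likely faces of one die.
p = Fraction(1, 6)
faces = range(1, 7)

# Mutually exclusive: A = {1, 2}, B = {4, 5}.
p_or = sum(p for f in faces if f in (1, 2) or f in (4, 5))
assert p_or == Fraction(1, 3) + Fraction(1, 3)   # = 2/3: the simple sum works

# Not mutually exclusive: A = {1, 2}, B = {2, 3} overlap at the face 2.
p_or = sum(p for f in faces if f in (1, 2) or f in (2, 3))
assert p_or == Fraction(1, 2)                    # 1/2, not 2/3
```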

Think of points contained within the rectangle below as the sample space for some random variable.

[Venn diagram: a rectangle (the sample space) containing two overlapping regions, A (yellow, with a green overlap) and B (blue, with the same green overlap); the overlap is labeled "A and B".]

The points contained within the yellow and green areas we will call event A. The points contained within the blue and green areas we will call event B. One point is randomly selected from the sample space. If the selection of any point from the sample space is equally likely, then the magnitude of the yellow and green areas relative to the total area is the probability of event A occurring, and the magnitude of the blue and green areas relative to the total area is the probability of event B occurring. The probability of either A or B is the area of all colored regions relative to the total area. But notice that P(A or B) does not equal P(A) plus P(B), because this would count the green area twice. This means

P(A or B) = P(A) + P(B) − P(A and B)

This can also be written

P(A and B) = P(A) + P(B) − P(A or B)

So, if two events are mutually exclusive, then

P(A and B) = 0

This means there is no intersection between these events.

CONDITIONAL PROBABILITY

In many cases you want to know the probability of some event given the occurrence of some other event. The probability of tomorrow being rainy (without imposing any conditions) is likely different than the probability that tomorrow is rainy given that it is May.

Using the Venn diagram again, consider the probability of B occurring.

[Venn diagram repeated: overlapping regions A and B, with the overlap labeled "A and B".]

Without any conditions, the probability that event B occurs will equal the size of the blue and green areas relative to the area of the entire rectangle. However, if we ask: What is the probability of event B, conditional on event A occurring? we get a different answer. Now the relevant sample space consists only of points within the yellow and green areas. The

probability of event B occurring, given that event A has occurred, equals the green area relative to the yellow and green areas. This principle can be generalized. The probability of event B occurring conditional on event A equals the probability of events A and B occurring divided by the probability of A occurring:

P(B|A) = P(A and B) / P(A)

Example: Suppose a die is rolled. Event A is that the roll takes a value less than 4. Event B is that the roll is an odd number. What is the probability of the roll being an odd number given that event A has occurred? We know that event A will occur when a 1, 2, or 3 is rolled. Event B will occur when a 1, 3, or 5 is rolled. So of the three values that encompass event A, two of them are associated with event B. So the probability of event B occurring given event A is two-thirds.

Now use the formula. Events A and B both occur when a 1 is thrown or when a 3 is thrown. The probability of one of these happening is P(A and B) = 1/3. Event A occurs when a 1, 2, or 3 is rolled. The probability of event A is P(A) = 1/2. So

P(B|A) = P(A and B) / P(A) = (1/3) / (1/2) = 2/3

You will find that the formula for conditional probability is very useful. It can also be rewritten:

P(A and B) = P(A) · P(B|A)

INDEPENDENT EVENTS

In a casual sense, two events are independent if knowledge that one event has occurred does not cause you to adjust the probability of the other event occurring. More formally,

Definition: Two events, A and B, are independent if

P(A|B) = P(A)

We have seen that by definition

P(A|B) = P(A and B) / P(B)

This means that if two events are independent:

P(A and B) / P(B) = P(A)

So,

P(A and B) = P(A) · P(B)

Example: Suppose I roll a die and you roll a die. Event A is that my roll is a 3. Event B is that your roll is a 3. The probability of my roll being a 3 if your roll is a 3 is 1/6. But this is exactly the same as the probability of my roll being a 3 having no information about your roll. This means

P(A|B) = P(A) = 1/6

So, the probability of both our rolls being a 3 can be calculated:

P(A and B) = P(A) · P(B) = (1/6)(1/6) = 1/36

BAYES THEOREM (THE SIMPLE VERSION)

Consider two events A and B. These two events give rise to two other events: not A and not B. These will be denoted by ~A and ~B. Notice that the events B and ~B are mutually exclusive and collectively exhaustive. So, of course, are A and ~A.
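Before moving on, the two-dice independence example above can be verified by brute enumeration. A sketch under the handout's setup (not part of the original): all 36 ordered pairs of rolls are equally likely, and the joint probability factors into the product of the two marginal probabilities.

```python
from fractions import Fraction
from itertools import product

# Two independent dice: A = "my roll is 3", B = "your roll is 3".
# Enumerate the 36 equally likely ordered pairs of rolls.
p = Fraction(1, 36)
pairs = list(product(range(1, 7), repeat=2))

p_a = sum(p for mine, yours in pairs if mine == 3)                  # 1/6
p_b = sum(p for mine, yours in pairs if yours == 3)                 # 1/6
p_both = sum(p for mine, yours in pairs if mine == 3 and yours == 3)

# Independence: P(A and B) = P(A) * P(B).
assert p_both == p_a * p_b == Fraction(1, 36)
```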

The question Bayes asked was: How does the probability of B change if we know whether or not A has occurred? Or: How does the observation of whether or not A has occurred cause you to update your probabilistic assessment of the likelihood of B occurring? In short, we want a helpful expression for P(B|A).

We already have the definition of P(B|A):

P(B|A) = P(A and B) / P(A)

And this means

P(A and B) = P(A) · P(B|A).

But it must also be true that

P(A|B) = P(A and B) / P(B)

and

P(A and B) = P(B) · P(A|B).

So,

P(B|A) = P(B) · P(A|B) / P(A)

Since B and ~B are mutually exclusive and collectively exhaustive:

P(A) = P(A and B) + P(A and ~B)

So,

P(A) = P(B) · P(A|B) + P(~B) · P(A|~B)

P(B|A) = P(B) · P(A|B) / [ P(B) · P(A|B) + P(~B) · P(A|~B) ]

This is Bayes theorem. It is nothing more than the definition of conditional probability applied a couple of times and a little bit of algebraic cleverness. The importance of Bayes theorem is that the informational requirements to calculate P(B|A) from Bayes theorem are different than those required by the definition of P(B|A).

AN EXAMPLE OF HOW BAYES THEOREM CAN BE USED

All people at a firm are tested for a medical condition (HIV, for example). Suppose you have the following information about this medical condition and a laboratory test for this condition:

The chance of a random draw from the population having the medical condition is 1/1000.
The chance of a false positive test result from the lab test is 1/100.
The chance of a false negative test result from the lab test is 1/500.

Without the test you rationally believe your chances of having the medical condition are 1/1000. An important question to someone just tested is: if the test comes back positive, what are the chances that you have the medical condition? To answer this question, we will formalize the information provided above. Define the following events:

A: You have the medical condition.
~A: You do not have the medical condition.
B: You have a positive test result.
~B: You have a negative test result.

Since 1 out of every 1000 people have the medical condition: P(A) = 1/1000. Since there are false positive results in 1 out of every 100 lab tests: P(B|~A) = 1/100. And, since there are false negative results in 1 out of every 500 lab tests: P(~B|A) = 1/500.

But we can go further. The data provided also allow us to calculate the following. Since 999 out of every 1000 people do not have the medical condition: P(~A) = 999/1000. Since the test is correctly negative for 99 out of every 100 people without the condition: P(~B|~A) = 99/100. And, since the test is correctly positive for 499 out of every 500 people with the condition: P(B|A) = 499/500.

Remember that the person having just received the results of her lab test is interested in calculating the probability of having the medical condition having just heard that she has a positive lab test result. In terms of our notation above, she wants to calculate P(A|B). Bayes theorem tells us:

P(A|B) = P(A) · P(B|A) / [ P(A) · P(B|A) + P(~A) · P(B|~A) ]
       = (1/1000)(499/500) / [ (1/1000)(499/500) + (999/1000)(1/100) ]
       ≈ 0.0908 = 9.08%

Having just tested positive for the medical condition there is (only) a 9.08% chance that she actually has the condition.
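The arithmetic in this Bayes calculation is easy to mistype, so here is a minimal Python check using the handout's three given rates (nothing else is assumed):

```python
# Bayes theorem with the handout's numbers:
# P(A) = 1/1000, false positive rate P(B|~A) = 1/100,
# false negative rate P(~B|A) = 1/500, so P(B|A) = 499/500.
p_a = 1 / 1000
p_b_given_not_a = 1 / 100
p_b_given_a = 1 - 1 / 500

# Total probability of a positive test, then the posterior.
p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a
p_a_given_b = p_a * p_b_given_a / p_b

print(round(p_a_given_b, 4))   # 0.0908
```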

Is there some intuition behind this result? Suppose there are 1,000,000 people randomly chosen from the population, all of whom are tested for the medical condition. We would expect 1000 of them to have the medical condition. Of the 1000 who have the condition, 2 will not show a positive test result. This is what a false negative lab test means. But we also know that 998 of the people who have the medical condition will correctly get a positive test result. Of the 999,000 who do not have the condition, 9,990 will receive a positive test result. This is what a false positive means. So, a total of 998 + 9,990 = 10,988 people receive a positive test result. But of this group only 998 actually have the condition. So, the probability of having the condition if you test positive is 998/10,988, or 0.0908. Pretty close.

PROBABILITY DISTRIBUTIONS

Fundamental to probability distributions are random variables.

Definition: A random variable is a rule that assigns a number to each possible outcome in a chance experiment.

We will start with discrete random variables (as opposed to continuous random variables).

Example: The sum of two dice.

A probability distribution simply maps the probability of the random variable taking on every possible value to each of those values. In the example above, the probability distribution is:

RV   2     3     4     5     6     7     8     9     10    11    12
P    1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
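This distribution can be built by counting the 36 equally likely outcomes of the two dice. A short sketch (not from the handout):

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Count how many of the 36 ordered outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dist = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, p in dist.items():
    print(s, p)          # 2 1/36, 3 1/18, ..., 7 1/6, ..., 12 1/36

# Any discrete probability distribution must sum to one.
assert sum(dist.values()) == 1
```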

This can also be plotted. Probability distributions associated with discrete random variables must sum (over all possible values of the random variable) to one.

[Bar chart: the relative frequency of each sum of two dice, rising from 1/36 at a sum of 2 up to 6/36 at a sum of 7, and falling back to 1/36 at a sum of 12.]

Now consider a random variable that is continuous. For example, the hourly flow of oil from a well is a random variable that is continuous within some boundaries. Since the probability of any specific number being chosen is infinitely small, it is more useful to think about the probability of a random number falling within a range of values. For example, it is easy to ask a spreadsheet program such as Excel to randomly select a number between 0 and 50. The computer will then be asked to select any real number between 0 and 50, all of which are equally likely to be drawn. The chances you will get any

particular value exactly is zero, but the chances you will get a value between 0 and 10 is 20% (1 in 5). As with discrete random variables, it is also possible, but more complicated, to put a continuous probability distribution into a diagram.

Recall that with a discrete probability distribution:

Σ_i p(x_i) = 1

where x_i is the ith value that the random variable can take on, p(x_i) is the probability the random variable takes on that particular value, and n is the number of possible random values. For a continuous probability distribution,

∫_a^b f(x) dx

is the probability of the random variable, x, falling between the values of a and b, for all a and b.

CUMULATIVE DISTRIBUTION (DENSITY) FUNCTIONS

Related to probability distribution functions are cumulative probability distribution functions. In short, they relate the value of a random variable with the probability of the random variable being less than or equal to that value. For a discrete random variable, the cumulative distribution function is

F(x_j) = Σ_{i ≤ j} p(x_i) for all j.

For a continuous random variable, it is

F(a) = ∫_{−∞}^{a} f(x) dx.

THE EXPECTED VALUE OF A RANDOM VARIABLE

It is often helpful to have a measure of the central tendency of a random variable. There are several of these that people use regularly; the mean value of a random variable, the median value of a random variable, and the mode of a random variable are the three most important. The one that we will pay the most attention to here is the mean value of a random variable. It is sometimes called the expected value of the random variable.

Definition: The Expected Value of a (discrete) random variable is the sum of all the possible values that the random variable can take on, each weighted by its probability of occurring:

E(x) = Σ_i x_i p(x_i)

Definition: The Expected Value of a (continuous) random variable is

E(x) = ∫ x f(x) dx
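The discrete definition can be applied directly to the two-dice distribution tabulated earlier. A quick sketch (not in the handout), which gives the familiar mean of 7:

```python
from fractions import Fraction

# E(x) = sum of x_i * p(x_i), for x = the sum of two dice.
# The counts out of 36 come from enumerating the 36 outcomes.
counts = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}
expected = sum(Fraction(x * c, 36) for x, c in counts.items())
print(expected)   # 7
```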

E() is often called an expectations operator and has a variety of properties that are worth knowing something about. In general, as we functionally transform a random variable, call it g(x), we define

E(g(x)) = Σ_i g(x_i) p(x_i)

Therefore,

E(ax) = aE(x)

Proof: Let g(x) = ax. Then

E(ax) = Σ_i (a x_i) p(x_i) = a Σ_i x_i p(x_i) = aE(x)

E(x + a) = E(x) + a

Proof: Let g(x) = x + a. Then

E(x + a) = Σ_i (x_i + a) p(x_i)

E(x + a) = Σ_i x_i p(x_i) + a Σ_i p(x_i) = E(x) + a, since Σ_i p(x_i) = 1.

E(ax + b) = aE(x) + b

Proof: Let g(x) = ax + b. Then

E(ax + b) = Σ_i (a x_i + b) p(x_i) = a Σ_i x_i p(x_i) + b Σ_i p(x_i) = aE(x) + b

E((ax + b)²) = a²E(x²) + 2abE(x) + b²

Proof: Let g(x) = (ax + b)². Then

E((ax + b)²) = E(a²x² + 2abx + b²) = a²E(x²) + 2abE(x) + b²

THE VARIANCE AND STANDARD DEVIATION OF A RANDOM VARIABLE

Another useful way of describing a random variable is to find a measure of its dispersion around the mean value. Commonly the variance or the standard deviation of a random variable is used toward this end.
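The operator properties above can be spot-checked numerically. A sketch (the constants a = 2, b = 5 are illustrative choices, not from the handout), using one fair die with E(x) = 3.5:

```python
# Spot-check E(ax) = aE(x), E(x + a) = E(x) + a, and E(ax + b) = aE(x) + b.
faces = [1, 2, 3, 4, 5, 6]
p = 1 / 6

def E(g):
    # Expectation of a transformed random variable: sum of g(x_i) * p(x_i).
    return sum(g(x) * p for x in faces)

e_x = E(lambda x: x)   # 3.5
assert abs(E(lambda x: 2 * x) - 2 * e_x) < 1e-9
assert abs(E(lambda x: x + 5) - (e_x + 5)) < 1e-9
assert abs(E(lambda x: 2 * x + 5) - (2 * e_x + 5)) < 1e-9
```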

Definition: The variance of a (discrete) random variable is

V(x) = E((x − E(x))²)

or,

V(x) = Σ_i (x_i − E(x))² p(x_i).

Definition: The standard deviation of a (discrete) random variable is

σ = sqrt( E((x − E(x))²) )

or,

σ = sqrt( Σ_i (x_i − E(x))² p(x_i) ).

There is an alternative way of writing variance that is worth deriving and remembering.

V(x) = E((x − E(x))²)
V(x) = E(x² − 2xE(x) + E(x)²)
V(x) = E(x²) − 2E(x)E(x) + E(x)²
V(x) = E(x²) − 2E(x)² + E(x)²
V(x) = E(x²) − E(x)²
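The shortcut formula just derived can be checked against the direct definition. A minimal sketch (not from the handout), again using one fair die:

```python
from fractions import Fraction

# Check V(x) = E(x^2) - E(x)^2 against the definition E((x - E(x))^2).
faces = range(1, 7)
p = Fraction(1, 6)

e_x = sum(x * p for x in faces)                        # 7/2
var_direct = sum((x - e_x) ** 2 * p for x in faces)    # definition
var_shortcut = sum(x**2 * p for x in faces) - e_x**2   # E(x^2) - E(x)^2

assert var_direct == var_shortcut == Fraction(35, 12)
```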

JOINT PROBABILITY DISTRIBUTIONS

There are many instances where we are interested in the relationship between two (or more) random variables. For example, if I own shares of IBM stock and shares of Apple stock, I will certainly be interested in the movement of both stock prices in the future. The degree to which they are likely to go up or down together will be important to me. A joint probability distribution of two random variables identifies the probability of any pair of outcomes occurring together. The notation p(x = 3, y = 3) should be read "the probability that the random variable x takes on a value of three and that the random variable y takes on a value of three."

Suppose we have a joint probability distribution as shown by the following table. The random variable x takes on values of 1, 2, or 3. The random variable y takes on values of 1, 2, 3, or 4. The number in each cell of the table refers to the joint probability that x equals the value in the associated row and that y equals the value in the associated column.

         y = 1   y = 2   y = 3   y = 4
x = 1    0.25    0.10    0.05    0.05
x = 2    0.10    0.05    0.00    0.10
x = 3    0.20    0.00    0.10    0.00

If this is a well-defined joint probability distribution, the numbers in the cells are required to sum to 1. Or,

Σ_i Σ_j p(x_i, y_j) = 1

MARGINAL DISTRIBUTIONS

Joint probability distributions give rise to a plethora of baby distributions. A class of these is known as marginal distributions. Marginal distributions tell you the probability that one random variable takes on any of its values regardless of the value of the other random variable. For example, the probability that y takes on a value of 1 is equal to the sum of p(1, 1), p(2, 1), and p(3, 1). This equals 0.55. Similarly, the probability that y takes on a value of 2 is equal to the sum of p(1, 2), p(2, 2), and p(3, 2). This equals 0.15. Again, the probability that y takes on a value of 3 is equal to the sum of p(1, 3), p(2, 3), and p(3, 3). This equals 0.15. You can confirm that the probability that y takes on a value of 4 is also 0.15.

In terms of the joint probability table, to find the marginal distribution of y, we simply sum all the rows for every column. The marginal distribution of y is highlighted below.

         y = 1   y = 2   y = 3   y = 4
x = 1    0.25    0.10    0.05    0.05
x = 2    0.10    0.05    0.00    0.10
x = 3    0.20    0.00    0.10    0.00
Marginal probability
distribution of y
         0.55    0.15    0.15    0.15

The same principle applies to calculate the marginal distribution of x. This is shown below.

         y = 1   y = 2   y = 3   y = 4   Marginal probability distribution of x
x = 1    0.25    0.10    0.05    0.05    0.45
x = 2    0.10    0.05    0.00    0.10    0.25
x = 3    0.20    0.00    0.10    0.00    0.30

CONDITIONAL DISTRIBUTIONS

Marginal distributions come in handy in the calculation of conditional distributions. Suppose we want to know the probability that x takes on a specific value of 3, conditional on y being equal to 1. We already know from the joint probability distribution function that p(3, 1) = 0.20, and we know from the marginal distribution that P(y = 1) = 0.55. Finally, remember that the definition of a conditional probability is P(B|A) = P(A and B)/P(A). Therefore,

P(x = 3 | y = 1) = p(3, 1) / P(y = 1) = 0.20/0.55.

For every value of y, there is a conditional probability distribution function for x. All four of these are shown in the table below.

         given y = 1   given y = 2   given y = 3   given y = 4
x = 1    0.25/0.55     0.10/0.15     0.05/0.15     0.05/0.15
x = 2    0.10/0.55     0.05/0.15     0.00/0.15     0.10/0.15
x = 3    0.20/0.55     0.00/0.15     0.10/0.15     0.00/0.15

Each column is the conditional distribution of x given the indicated value of y.

Of course, there are similar conditional distributions for y associated with the three different values of x. These are shown below; each row is the conditional distribution of y given the indicated value of x.

         y = 1       y = 2       y = 3       y = 4
x = 1    0.25/0.45   0.10/0.45   0.05/0.45   0.05/0.45
x = 2    0.10/0.25   0.05/0.25   0.00/0.25   0.10/0.25
x = 3    0.20/0.30   0.00/0.30   0.10/0.30   0.00/0.30
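The marginal and conditional calculations of the last two sections can be reproduced mechanically from the joint table. A sketch (not part of the handout) using exact fractions to avoid rounding noise:

```python
from fractions import Fraction as F

# The joint table: rows x = 1, 2, 3; columns y = 1, 2, 3, 4.
joint = {
    (1, 1): F(25, 100), (1, 2): F(10, 100), (1, 3): F(5, 100),  (1, 4): F(5, 100),
    (2, 1): F(10, 100), (2, 2): F(5, 100),  (2, 3): F(0, 100),  (2, 4): F(10, 100),
    (3, 1): F(20, 100), (3, 2): F(0, 100),  (3, 3): F(10, 100), (3, 4): F(0, 100),
}

# Marginals: sum over the other variable.
marg_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (1, 2, 3)}
marg_y = {y: sum(p for (_, yi), p in joint.items() if yi == y) for y in (1, 2, 3, 4)}

assert marg_x == {1: F(45, 100), 2: F(25, 100), 3: F(30, 100)}
assert marg_y == {1: F(55, 100), 2: F(15, 100), 3: F(15, 100), 4: F(15, 100)}

# A conditional probability: P(x = 3 | y = 1) = p(3, 1) / P(y = 1) = 0.20/0.55.
assert joint[(3, 1)] / marg_y[1] == F(20, 55)
```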

INDEPENDENCE AGAIN

If for any pair of random variables the values in the conditional probability cells do not differ from the unconditional probability, then the random variables are independent.

An example of independence: We assign a value of 1 when a flipped coin comes up heads and a value of 0 when it comes up tails. The random variable x is the sum of the flips of two coins. The random variable y is the sum of the flips of two different coins. The joint probability function for x and y is shown below, as are the marginal probability functions. You should be able to verify this.

         y = 0    y = 1    y = 2    Marginal probability distribution of x
x = 0    0.0625   0.125    0.0625   0.25
x = 1    0.125    0.25     0.125    0.50
x = 2    0.0625   0.125    0.0625   0.25
Marginal probability
distribution of y
         0.25     0.50     0.25

From this we can calculate the conditional probabilities for x given y.

         y = 0   y = 1   y = 2   Marginal probability distribution of x
x = 0    0.25    0.25    0.25    0.25
x = 1    0.50    0.50    0.50    0.50
x = 2    0.25    0.25    0.25    0.25
Marginal probability
distribution of y
         0.25    0.50    0.25

Notice that as you read across any row the values are exactly the same (including the value of the marginal probability distribution of x). This means the conditional distribution of x given y is equal to the marginal probability of x:

P(x|y) = P(x)

This is the definition of independence. Hence, x and y are independent random variables. We could also check this in the other direction. Find the conditional probabilities for y given x.

         y = 0   y = 1   y = 2   Marginal probability distribution of x
x = 0    0.25    0.50    0.25    0.25
x = 1    0.25    0.50    0.25    0.50
x = 2    0.25    0.50    0.25    0.25
Marginal probability
distribution of y
         0.25    0.50    0.25

As you read down any column the values are exactly the same (including the value of the marginal probability distribution of y). This means the conditional distribution of y given x is equal to the marginal probability of y:

P(y|x) = P(y)

CONDITIONAL EXPECTED VALUE

One of the most important tools in empirical economics is the conditional expected value. All regression analysis is based on this concept. Let's return to the joint probability distribution function we saw earlier. This is repeated in the table below.

         y = 1   y = 2   y = 3   y = 4
x = 1    0.25    0.10    0.05    0.05
x = 2    0.10    0.05    0.00    0.10
x = 3    0.20    0.00    0.10    0.00

Remember that the conditional distribution of x given y is the following:

         given y = 1   given y = 2   given y = 3   given y = 4
x = 1    0.25/0.55     0.10/0.15     0.05/0.15     0.05/0.15
x = 2    0.10/0.55     0.05/0.15     0.00/0.15     0.10/0.15
x = 3    0.20/0.55     0.00/0.15     0.10/0.15     0.00/0.15

Once you understand how to calculate an expected value and can derive a conditional distribution function, there is no great trick to calculating the conditional expected value of a random variable. What, for example, is the expected value of x conditional on y being equal to 1? Conditional on y being equal to 1 we know that x will equal 1 with a probability of 0.25/0.55; we know that x will equal 2 with a probability of 0.10/0.55; and, finally, that x will equal 3 with a probability of 0.20/0.55. This means

E(x | y = 1) = 1(0.25/0.55) + 2(0.10/0.55) + 3(0.20/0.55) = 1.909.

This same type of calculation can be carried out for all other values of y.

E(x | y = 2) = 1(0.10/0.15) + 2(0.05/0.15) + 3(0.00/0.15) = 1.333

E(x | y = 3) = 1(0.05/0.15) + 2(0.00/0.15) + 3(0.10/0.15) = 2.333

E(x | y = 4) = 1(0.05/0.15) + 2(0.10/0.15) + 3(0.00/0.15) = 1.667

The relationship between y and the conditional expected value of x is shown below.

[Chart: the conditional expected value of x plotted against y = 1, 2, 3, 4, taking the values 1.909, 1.333, 2.333, and 1.667.]

If this conditional expectation were linear we would have a linear regression function, something very near and dear to Professor Kanazawa's heart.

COVARIANCE

The next to last concept we will pay attention to in this handout measures the degree to which two random variables move with each other or against each other. This is often captured by the covariance of the random variables (another important measure that captures the same concept is the correlation coefficient).

Definition: The Covariance of two random variables, x and y, is

Cov(x, y) = Σ_i Σ_j (x_i − E(x)) (y_j − E(y)) p(x_i, y_j)

or,

Cov(x, y) = E( (x − E(x)) (y − E(y)) )

With enough patience you ought to be able to show that if the joint probability function for two random variables is given by

         y = 1   y = 2   y = 3   y = 4
x = 1    0.25    0.10    0.05    0.05
x = 2    0.10    0.05    0.00    0.10
x = 3    0.20    0.00    0.10    0.00

then the covariance between x and y is equal to −0.015. The fact that this number is negative (and small) suggests that the two random variables tend, weakly, to move in opposite directions.

You will find that the following manipulation of the definition of covariance is useful.

Cov(x, y) = E( (x − E(x)) (y − E(y)) )
Cov(x, y) = E( xy − xE(y) − yE(x) + E(x)E(y) )
Cov(x, y) = E(xy) − E(x)E(y) − E(x)E(y) + E(x)E(y)
Cov(x, y) = E(xy) − E(x)E(y)
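The "enough patience" calculation can be delegated to a few lines of Python. This sketch (not from the handout) applies the shortcut Cov(x, y) = E(xy) − E(x)E(y) to the joint table above; E(x) = 1.85, E(y) = 1.90, and E(xy) = 3.50.

```python
from fractions import Fraction as F

# The joint table: rows x = 1, 2, 3; columns y = 1, 2, 3, 4.
joint = {
    (1, 1): F(25, 100), (1, 2): F(10, 100), (1, 3): F(5, 100),  (1, 4): F(5, 100),
    (2, 1): F(10, 100), (2, 2): F(5, 100),  (2, 3): F(0, 100),  (2, 4): F(10, 100),
    (3, 1): F(20, 100), (3, 2): F(0, 100),  (3, 3): F(10, 100), (3, 4): F(0, 100),
}

e_x = sum(x * p for (x, _), p in joint.items())        # 1.85
e_y = sum(y * p for (_, y), p in joint.items())        # 1.90
e_xy = sum(x * y * p for (x, y), p in joint.items())   # 3.50

cov = e_xy - e_x * e_y
print(float(cov))   # -0.015
```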

CORRELATION COEFFICIENT

The correlation coefficient between two random variables is linked, as you might imagine, closely to the covariance of those two random variables.

Definition: The Correlation Coefficient between two random variables is

ρ = Cov(x, y) / sqrt( V(x) V(y) )

It turns out that ρ must lie between −1 and 1 and is a measure of the degree of linear association between x and y. If for no other reason than that it is unit-free, it is easier to use than the covariance as a measure of how x and y move together.
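To close the loop, here is a sketch (not part of the handout) that computes ρ for the same joint table, confirming that it is unit-free and lies between −1 and 1:

```python
from fractions import Fraction as F
import math

# The joint table once more: rows x = 1, 2, 3; columns y = 1, 2, 3, 4.
joint = {
    (1, 1): F(25, 100), (1, 2): F(10, 100), (1, 3): F(5, 100),  (1, 4): F(5, 100),
    (2, 1): F(10, 100), (2, 2): F(5, 100),  (2, 3): F(0, 100),  (2, 4): F(10, 100),
    (3, 1): F(20, 100), (3, 2): F(0, 100),  (3, 3): F(10, 100), (3, 4): F(0, 100),
}

e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
var_x = sum((x - e_x) ** 2 * p for (x, _), p in joint.items())
var_y = sum((y - e_y) ** 2 * p for (_, y), p in joint.items())
cov = sum((x - e_x) * (y - e_y) * p for (x, y), p in joint.items())

# rho = Cov(x, y) / sqrt(V(x) V(y)); unit-free, between -1 and 1.
rho = float(cov) / math.sqrt(float(var_x) * float(var_y))
print(rho)   # ≈ -0.0155, a very weak negative linear association
```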