Contents

Random variables
Measures of central tendency and variability (means and variances)
Joint density functions and independence
Measures of association (covariance and correlation)
Interesting result
Conditional distributions
Law of iterated expectations
The Normal distribution
Appendix I: Correlation, independence and linear relationships
Appendix II: Covariance and independence

1. Random variables
   a. Random variables: X, Y, Z take on different values with different probabilities; the convention is to use capital letters for random variables and lower-case letters for realized values. So, for instance, X is a random variable, and x, or x1, x2, and x3, would be specific realized values of X.
   b. (Probability) density functions (pdfs): describe the distribution of the random variable (the probability that the random variable takes on different values) and are used to determine probabilities.
      i. Discrete random variable (e.g. Binomial distribution): takes on a finite or countably infinite set of values with positive probability. Density function: f(x_j) = P(X = x_j) >= 0 and Σ_j f(x_j) = 1 (note the sigma notation).
      ii. Continuous random variable (e.g. Normal distribution). Density function: f(x) >= 0 and ∫ f(x) dx = 1.
      iii. Use the density functions to determine the probabilities:
         1. Discrete: P(a < X <= b) = Σ_{a < x <= b} P(X = x) = Σ_{a < x <= b} f(x)
         2. Continuous: P(a < X <= b) = ∫_a^b f(x) dx
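The discrete definitions above can be checked directly in a few lines of Python. This is a sketch of my own (the Binomial(4, 0.5) example and the interval (1, 3] are illustrative choices, not from the notes): build the pmf, confirm it is a valid density, and compute an interval probability by summing f(x).

```python
from math import comb

# Binomial(n, p) pmf: f(x_j) = P(X = x_j), a discrete density function.
# The parameters n = 4, p = 0.5 are illustrative only.
n, p = 4, 0.5
f = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# A valid discrete density: every f(x_j) >= 0 and the values sum to 1.
total = sum(f.values())

# P(a < X <= b) is the sum of f(x) over a < x <= b.
a, b = 1, 3
prob = sum(f[x] for x in f if a < x <= b)  # P(1 < X <= 3) = f(2) + f(3)
```

The same summation pattern works for any discrete density; for continuous densities the sum becomes an integral.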
   c. Examples of random variables:
      i. Uniform[a, b]: f(x) = 1/(b − a) for x in [a, b], and is 0 otherwise.
      ii. Standard Normal, N(0, 1): f(x) = (1/√(2π)) e^(−x²/2).

2. Measures of central tendency and variability
   a. Expectation/mean (measure of central tendency): E(X) = µ_X.
      i. The average value of X (observed with a large number of random samples from the distribution).
      ii. A weighted average of the different values of X (weight the values by their respective probabilities).
         1. Discrete: E(X) = µ_X = Σ x P(X = x) = Σ x f(x)
         2. Continuous: E(X) = µ_X = ∫ x f(x) dx
      iii. Properties:
         1. Linear operator: E(aX + b) = aE(X) + b
            a. Extends to many random variables: E(Σ_i a_i X_i) = Σ_i E(a_i X_i) = Σ_i a_i E(X_i) = Σ_i a_i µ_i
         2. And for some function g(.), E(g(X)) = Σ g(x) f(x), or ∫ g(x) f(x) dx for a continuous distribution.
   b. Variance (measure of variability, or dispersion around the mean): Var(X) = σ_X².
      i. The average squared deviation of X from its mean (observed with a large number of random samples from the distribution).
      ii. A weighted average of the different squared deviations of X from its mean (weight the squared deviations by their respective probabilities).
         1. Discrete: Var(X) = σ_X² = E(X − µ_X)² = Σ (x − µ_X)² P(X = x) = Σ (x − µ_X)² f(x)
         2. Continuous: Var(X) = σ_X² = E(X − µ_X)² = ∫ (x − µ_X)² f(x) dx
      iii. E(X − µ_X)² = E(X²) − µ_X².
      iv. Properties: not a linear operator: Var(aX + b) = a² Var(X).
      v. Standard deviation (StdDev): σ_X, the positive square root of the variance.
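The weighted-average definitions of the mean and variance, and the linear-operator properties above, can be verified with a small discrete distribution. A sketch of mine, using a fair six-sided die (my example, not from the notes):

```python
# Mean and variance of a discrete random variable as weighted averages.
# Example distribution: a fair six-sided die (illustrative choice).
outcomes = [1, 2, 3, 4, 5, 6]
f = {x: 1 / 6 for x in outcomes}

mu = sum(x * f[x] for x in outcomes)                        # E(X) = sum x f(x)
var = sum((x - mu) ** 2 * f[x] for x in outcomes)           # Var(X) = E(X - mu)^2
var_shortcut = sum(x**2 * f[x] for x in outcomes) - mu**2   # E(X^2) - mu^2

# Linear-operator properties: E(aX + b) = aE(X) + b, Var(aX + b) = a^2 Var(X).
a, b = 2, 3
mu_t = sum((a * x + b) * f[x] for x in outcomes)
var_t = sum((a * x + b - mu_t) ** 2 * f[x] for x in outcomes)
```

Note that `var` and `var_shortcut` agree, which is exactly the identity E(X − µ)² = E(X²) − µ² from item 2.b.iii.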
      vi. Linear operator: if a > 0, then StdDev(aX + b) = √(Var(aX + b)) = √(a² Var(X)) = a StdDev(X) = a σ_X.
   c. Standardizing random variables (z-scores): Z = (X − µ_X)/σ_X (has mean zero and unit variance).
      i. Mean: E(Z) = (1/σ_X)(E(X) − µ_X) = 0
      ii. Variance: Var(Z) = (1/σ_X²) Var(X) = 1

3. Joint density functions
   a. Consider X and Y, two random variables (e.g. people are randomly drawn from a population and their heights and weights are recorded).
   b. If discrete, then the joint density is defined by f_{X,Y}(x, y) = P(X = x & Y = y).
   c. Note that P(X = x) = f_X(x) = Σ_y P(X = x & Y = y) = Σ_y f_{X,Y}(x, y).
      i. So, the marginal density P(X = x) = f_X(x) is the sum over the joint densities f_{X,Y}(x, y).
   d. Here is an example.
      i. In the following table, the random variable X takes on three values (x1, x2 and x3), and Y takes on two (y1 and y2). The figures in the box are the joint probabilities, f_{X,Y}(x, y) = P(X = x & Y = y). And so, for example, f(x1, y1) = P(X = x1 & Y = y1) = .2.
      ii. And the marginal probabilities can be recovered from the joint probabilities by just summing across the rows and columns. So, for example, P(X = x1) = f_X(x1) = Σ_j P(X = x1 & Y = y_j) = f(x1, y1) + f(x1, y2) = .2 + .2 = .4.

                   y1       y2
         x1       0.2      0.2      P(x1) = 0.4
         x2       0.1      0.3      P(x2) = 0.4
         x3       0.1      0.1      P(x3) = 0.2
                P(y1)    P(y2)
                = 0.4    = 0.6
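The marginal-from-joint computation in the table above is just a row/column sum, which the following sketch carries out for the notes' 3x2 example:

```python
# Recovering marginal densities from a joint density table:
# f_X(x) = sum over y of f_{X,Y}(x, y), and likewise for f_Y(y).
# The probabilities are the 3x2 example from the notes.
joint = {('x1', 'y1'): 0.2, ('x1', 'y2'): 0.2,
         ('x2', 'y1'): 0.1, ('x2', 'y2'): 0.3,
         ('x3', 'y1'): 0.1, ('x3', 'y2'): 0.1}

f_x = {}  # marginal density of X
f_y = {}  # marginal density of Y
for (x, y), p in joint.items():
    f_x[x] = f_x.get(x, 0) + p
    f_y[y] = f_y.get(y, 0) + p
```

Both marginals sum to 1, as any valid density must.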
   e. Independence
      i. f_{X,Y}(x, y) = P(X = x)P(Y = y) = f_X(x) f_Y(y) for all values (x, y) of X and Y: the joint density function is the product of the marginal densities (applies to discrete and continuous distributions).
      ii. X and Y in the previous example are not independent, since, for example: f_{X,Y}(x1, y1) = .2, while P(X = x1) P(Y = y1) = f_X(x1) f_Y(y1) = (.4)(.6)... no: (.4)(.4) = .16.
      iii. We can extend to many independent random variables: f(x1, x2, ..., xn) = P(X1 = x1, X2 = x2, ..., Xn = xn) = f_1(x1) f_2(x2) ... f_n(xn).
      iv. Not independent means dependent.

4. Measures of association
   a. Consider two random variables, X and Y.
   b. Covariance: Cov(X, Y) = σ_XY = E(X − µ_X)(Y − µ_Y) = Σ (x − µ_X)(y − µ_Y) f(x, y)
   c. Some examples: X and Y both have mean 0 in the following examples. On the left, most of the data are in quadrants I and III, where (x − µ_X)(y − µ_Y) > 0, and so when you sum those products you get a positive covariance. Most of the action on the right is in quadrants II and IV, where (x − µ_X)(y − µ_Y) < 0, and so those products sum to a negative covariance.
   d. Properties:
      i. Cov(X, Y) = E(XY) − µ_X µ_Y
      ii. Note that Cov(X, X) = E(X − µ_X)(X − µ_X) = Var(X)
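The independence test in 3.e.i is a check over every cell of the joint table: the joint probability must equal the product of the marginals everywhere. A sketch applying it to the notes' 3x2 example (the marginals .4/.4/.2 and .4/.6 are read off that table):

```python
# Independence check: X and Y are independent iff
# f(x, y) = f_X(x) * f_Y(y) for ALL cells of the joint table.
joint = {('x1', 'y1'): 0.2, ('x1', 'y2'): 0.2,
         ('x2', 'y1'): 0.1, ('x2', 'y2'): 0.3,
         ('x3', 'y1'): 0.1, ('x3', 'y2'): 0.1}
f_x = {'x1': 0.4, 'x2': 0.4, 'x3': 0.2}   # marginals from the table
f_y = {'y1': 0.4, 'y2': 0.6}

independent = all(abs(joint[(x, y)] - f_x[x] * f_y[y]) < 1e-12
                  for x in f_x for y in f_y)
# e.g. f(x1, y1) = 0.2 but f_X(x1) * f_Y(y1) = 0.16, so the check fails.
```

A single failing cell is enough to establish dependence; independence requires the product rule to hold in every cell.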
      iii. Measures the extent to which there is a linear relationship between X and Y.
      iv. If Cov(X, Y) > 0 then, as illustrated above, X and Y tend to move together in a positive direction, so that increases in X are on average associated with increases in Y; and if the covariance is negative, then they tend to move in opposite directions.
      v. If X and Y are independent, then Cov(X, Y) = 0. The opposite need not hold: σ_XY = 0 does not necessarily imply independence; it could just mean that there is a highly non-linear relationship between X and Y.
         1. Here is an example of X & Y having zero covariance, but not being independent (Y = X², with X taking the values −1, 0 and 1, each with probability 1/3, so E(X) = 0 and E(Y) = 0.67):

            Joint & marginal densities        Covariance contributions
                      Y=0     Y=1    f_X(x)   (x − µ_X)(y − µ_Y) f(x, y)
            X = −1     0      0.33    0.33    (−1)(0.33)(0.33) = −0.11
            X =  0    0.33     0      0.33    (0)(−0.67)(0.33) =  0
            X =  1     0      0.33    0.33    (1)(0.33)(0.33)  =  0.11
            f_Y(y)    0.33    0.67             Cov(X, Y) = 0.0000

            So Cov(X, Y) = 0 even though X and Y are clearly dependent: e.g. f(−1, 0) = 0, while f_X(−1) f_Y(0) = (0.33)(0.33).
      vi. Cov(a + bX, c + dY) = bd Cov(X, Y)
      vii. |Cov(X, Y)| <= σ_X σ_Y: the magnitude of the covariance is never greater than the product of the magnitudes of the standard deviations (this is an instance of the Cauchy-Schwarz inequality).
   e. Variances of sums of random variables
      i. Var(X + Y) = σ_X² + 2 Cov(X, Y) + σ_Y²
      ii. More generally: Var(a1 X1 + a2 X2) = a1² σ1² + 2 a1 a2 Cov(X1, X2) + a2² σ2²
      iii. So if Cov(X, Y) = 0 (so that X and Y are uncorrelated), then Var(X + Y) = Var(X) + Var(Y) (the variance of the sum is the sum of the variances).
      iv. And even more generally:
         1. Var(Σ_i a_i X_i) = Σ_i Σ_j a_i a_j Cov(X_i, X_j); note that when i = j, the term is a_i² Cov(X_i, X_i) = a_i² σ_i².
         2. If the X_i's are pairwise uncorrelated, then Cov(X_i, X_j) = 0 when i ≠ j, and so in this case, Var(Σ_i a_i X_i) = Σ_i Σ_j a_i a_j Cov(X_i, X_j) = Σ_i a_i² Cov(X_i, X_i) = Σ_i a_i² σ_i².
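The zero-covariance-but-dependent example discussed above, and the variance-of-a-sum identity, can both be verified in code. This sketch uses my reading of the example in the notes (Y = X² with X uniform on {−1, 0, 1}):

```python
# X in {-1, 0, 1} with probability 1/3 each, and Y = X^2.
xs = [-1, 0, 1]
p = 1 / 3
mu_x = sum(x * p for x in xs)                 # E(X) = 0
mu_y = sum(x**2 * p for x in xs)              # E(Y) = 2/3
cov = sum((x - mu_x) * (x**2 - mu_y) * p for x in xs)   # Cov(X, Y)

# Variance-of-a-sum identity: Var(X + Y) = Var(X) + 2 Cov(X, Y) + Var(Y).
var_x = sum((x - mu_x) ** 2 * p for x in xs)
var_y = sum((x**2 - mu_y) ** 2 * p for x in xs)
var_sum = sum((x + x**2 - (mu_x + mu_y)) ** 2 * p for x in xs)
```

Here `cov` comes out exactly zero, so `var_sum` equals `var_x + var_y`, even though Y is a deterministic function of X.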
      v. In short: if the X_i's are pairwise uncorrelated, then the variance of the sum is the sum of the variances.
   f. Correlation: Corr(X, Y) = ρ_XY = Cov(X, Y) / (StdDev(X) StdDev(Y)) = σ_XY / (σ_X σ_Y)
      i. Properties:
         1. −1 <= ρ_XY <= 1. And similar to above:
         2. If Cov(X, Y) = 0, then ρ_XY = 0. And if X and Y are independent, then they are uncorrelated and ρ_XY = 0.
         3. ρ_XY captures the extent to which there is a linear relationship between X and Y, which is similar to, though not the same as, the extent to which they move together.
         4. If Y = aX + b, then Cov(X, Y) = a Var(X) = a σ_X², and so Corr(X, Y) = ρ_XY = a σ_X² / (σ_X |a| σ_X) = +1 or −1 depending on the sign of a; and so if X and Y are linearly related they have a correlation of +1 or −1.
         5. Corr(aX + b, cY + d) = Corr(X, Y) if ac > 0, and Corr(aX + b, cY + d) = −Corr(X, Y) if ac < 0. So linear transformations of random variables may affect the sign of the correlation, but not the magnitude.

5. Interesting result
   a. Suppose that the random variable Y is a linear function of another random variable X plus an additive random error U, which is uncorrelated with X. Then:
      i. Y = a + bX + U, where X, Y and U are all random variables and Cov(X, U) = 0.
      ii. Cov(X, Y) = Cov(X, a + bX + U) = Cov(X, a) + b Cov(X, X) + Cov(X, U)
      iii. Since Cov(X, a) = Cov(X, U) = 0, Cov(X, Y) = b Cov(X, X) = b Var(X)
      iv. So b = Cov(X, Y) / Var(X), or ρ_XY = Corr(X, Y) = Cov(X, Y) / (StdDev(X) StdDev(Y)) = b σ_X / σ_Y.
      v. This is a relationship that will haunt you throughout the semester.
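The "interesting result" can be checked numerically: build Y = a + bX + U from an X and a U that are independent (hence uncorrelated), compute the population covariance over the product distribution, and recover b as Cov(X, Y)/Var(X). The discrete distributions below are illustrative choices of mine:

```python
from math import sqrt
from itertools import product

xs = {0: 0.25, 1: 0.5, 2: 0.25}     # distribution of X (my choice)
us = {-1: 0.5, 1: 0.5}              # distribution of U, independent of X
a, b = 1.0, 2.0

# Enumerate the joint distribution of (X, Y) with Y = a + bX + U.
pts = [(x, a + b * x + u, px * pu)
       for (x, px), (u, pu) in product(xs.items(), us.items())]

def mean(g):
    """Population expectation of g(X, Y)."""
    return sum(g(x, y) * p for x, y, p in pts)

mu_x, mu_y = mean(lambda x, y: x), mean(lambda x, y: y)
cov_xy = mean(lambda x, y: (x - mu_x) * (y - mu_y))
var_x = mean(lambda x, y: (x - mu_x) ** 2)
var_y = mean(lambda x, y: (y - mu_y) ** 2)

b_hat = cov_xy / var_x                          # recovers the slope b
rho = cov_xy / (sqrt(var_x) * sqrt(var_y))      # equals b * sigma_X / sigma_Y
```

Because U is independent of X, Cov(X, U) = 0 drops out of the covariance exactly as in the derivation, and `b_hat` reproduces the true slope.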
6. Conditional distributions
   a. Recall the definition of conditional probabilities: P(A|B) = P(A ∩ B) / P(B), which might suggest that P(Y = y | X = x) = P(Y = y & X = x) / P(X = x).
   b. If discrete, then f_{Y|X}(y|x) = P(Y = y | X = x) = f_{X,Y}(x, y) / f_X(x); the same formula applies to continuous distributions.
      i. Dividing by f_X(x) effectively scales up the joint densities and ensures that you have a valid density function, since
         ∫ f_{Y|X}(y|x) dy = ∫ [f_{X,Y}(x, y) / f_X(x)] dy = (1/f_X(x)) ∫ f_{X,Y}(x, y) dy = f_X(x) / f_X(x) = 1.
   c. If X and Y are independent then the conditional distributions and marginal distributions are the same: f_{Y|X}(y|x) = f_Y(y) and f_{X|Y}(x|y) = f_X(x).
      i. In words: if X and Y are independent, then knowing the particular value of X tells you nothing new about Y, and vice-versa.
   d. Conditional expectations and variances
      i. The expected value of Y conditional on X being a certain value; as the value of X changes, the conditional expectation of Y given X = x may also change.
         1. E(Y | X = x) = Σ_j y_j P(Y = y_j | X = x) = Σ_j y_j f(y_j | x)
      ii. If X and Y are independent, then E(Y | X = x) = E(Y): knowing the value of X doesn't change the expected value of Y.
      iii. Conditional variances are similarly defined, as the expected squared deviation from the conditional mean:
         1. Var(Y | X = x) = E([Y − E(Y | x)]² | x) = E(Y² | x) − (E(Y | x))²
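The conditional-density formula f(y|x) = f(x, y)/f_X(x) and the conditional expectation E(Y | X = x) can be computed directly from a joint table. A sketch using the notes' 3x2 joint probabilities, with the (hypothetical, my own choice) numeric values y1 = 0 and y2 = 1 attached so that an expectation can be taken:

```python
# Joint table from the notes, with y1 -> 0 and y2 -> 1 (my numeric labels).
joint = {('x1', 0): 0.2, ('x1', 1): 0.2,
         ('x2', 0): 0.1, ('x2', 1): 0.3,
         ('x3', 0): 0.1, ('x3', 1): 0.1}

def f_cond(x):
    """Conditional pmf of Y given X = x: f(y | x) = f(x, y) / f_X(x)."""
    fx = sum(p for (xi, y), p in joint.items() if xi == x)   # marginal f_X(x)
    return {y: p / fx for (xi, y), p in joint.items() if xi == x}

# E(Y | X = x) = sum of y * f(y | x); it varies with x here,
# which is itself evidence that X and Y are not independent.
cond_means = {x: sum(y * q for y, q in f_cond(x).items())
              for x in ('x1', 'x2', 'x3')}
```

Each conditional pmf sums to 1 (the point of dividing by f_X(x)), and the conditional mean shifts from 0.5 to 0.75 as x moves from x1 to x2.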
7. Law of Iterated Expectations
   a. E[g(X, Y)] = E_X{E[g(x, Y) | x]}, since E[g(X, Y)] = Σ_x Σ_j g(x, y_j) P(X = x & Y = y_j) and
      i. E_X{E[g(x, Y) | x]} = Σ_x [Σ_j g(x, y_j) P(y_j | x)] P(x)
      ii. = Σ_x Σ_j g(x, y_j) P(y_j | x) P(x) = Σ_x Σ_j g(x, y_j) P(X = x & Y = y_j)
   b. This obviously holds for continuous random variables as well.
   c. Why is this so useful? In many cases, we will show that E_X[g(x, Y) | x] = k for some constant k, so that conditional on x (or the x's), the expected value of g(x, Y) is some constant k. And because that expectation is always k, for any x, the overall expectation of g(X, Y) must be k as well: E[g(X, Y)] = k.
   d. For example: we will show that under certain assumptions, and conditional on the x's, the OLS estimator is an unbiased estimator, so that its expectation, conditional on the x's, is in fact the true parameter value. But since this holds for any set of x's, it must also be true overall. And so in this case, we can just say that the OLS estimator is an unbiased estimator, and drop the "conditional on the x's".

8. The Normal distribution
   a. Standard Normal (Gaussian): N(0, 1).
   b. If X is N(µ, σ²), then X has mean µ and variance σ², and Z = (X − µ)/σ is N(0, 1) (the Standard Normal distribution).
   c. Properties:
      i. If X is N(µ, σ²), then aX + b is N(aµ + b, a²σ²).
      ii. If X and Y are independent with the same distribution N(µ, σ²), then X + Y is N(2µ, 2σ²).
         1. This implies that, for independent draws, N(µ, σ²) + N(µ, σ²) = N(2µ, 2σ²).
      iii. More generally, assume that n random variables (X1, ..., Xn) are independently and identically distributed N(µ, σ²). Then Σ_i X_i is N(nµ, nσ²) and X̄ = (1/n) Σ_i X_i is N(µ, σ²/n).
      iv. X̄ is a specific form of the more general weighted average W = Σ_i α_i X_i, where 0 <= α_i <= 1 for all i and Σ_i α_i = 1.
         1. W will have mean Σ_i α_i µ = µ Σ_i α_i = µ and variance Σ_i α_i² σ², and will be Normally distributed.

9. Appendix I - Correlation and Linear Relationships: ρ_XY when Y = β0 + β1 X
   a. Linear implies a correlation of +1 or −1. Suppose that Y = β0 + β1 X and β1 ≠ 0.
      i. Then cov(X, Y) = cov(X, β0 + β1 X) = E((X − µ_X)(β0 + β1 X − β0 − β1 µ_X)) = β1 E((X − µ_X)²) = β1 var(X).
      ii. And since var(Y) = E((β0 + β1 X − β0 − β1 µ_X)²) = β1² E((X − µ_X)²) = β1² var(X), the correlation of X and Y is:
         ρ_XY = cov(X, Y) / √(var(X) var(Y)) = β1 var(X) / √(var(X) β1² var(X)) = +1 or −1, depending on the sign of β1 ≠ 0.
   b. Non-linear implies a correlation that is not +1 or −1; here is an example:
      i. Suppose that Y = β0 + β1 X + U, where µ_U = 0 and cov(X, U) = 0, but var(U) = σ_U² ≠ 0 (so we don't have a perfectly linear relationship between X and Y).
      ii. Then cov(X, Y) = cov(X, β0 + β1 X + U) = E((X − µ_X)(β0 + β1 X + U − β0 − β1 µ_X)) = β1 E((X − µ_X)²) + cov(U, X) = β1 var(X).
      iii. And since var(Y) = E((β0 + β1 X + U − β0 − β1 µ_X)²) = β1² var(X) + 2 β1 cov(U, X) + var(U) = β1² var(X) + σ_U², the correlation of X and Y is:
         ρ_XY = β1 var(X) / √(var(X)(β1² var(X) + σ_U²)).
      iv. Since σ_U² ≠ 0, the denominator will be larger in magnitude than the numerator, and so |ρ_XY| < 1.
      v. Notice that if σ_U² = 0, then we have a linear relationship, and as above ρ_XY = +1 or −1.
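Appendix I's two cases can be contrasted numerically: with var(U) = 0 the correlation is exactly ±1, and with var(U) > 0 it is strictly inside (−1, 1). The distributions below are illustrative choices of mine (β1 is negative to show the −1 case):

```python
from math import sqrt
from itertools import product

b0, b1 = 1.0, -2.0
xs = {-1: 0.5, 1: 0.5}            # X: +/-1 with equal probability (my choice)
us = {-0.5: 0.5, 0.5: 0.5}        # U: mean 0, independent of X (my choice)

def corr(u_dist):
    """Population correlation of X and Y = b0 + b1*X + U."""
    pts = [(x, b0 + b1 * x + u, px * pu)
           for (x, px), (u, pu) in product(xs.items(), u_dist.items())]
    mean = lambda g: sum(g(x, y) * p for x, y, p in pts)
    mx, my = mean(lambda x, y: x), mean(lambda x, y: y)
    c = mean(lambda x, y: (x - mx) * (y - my))
    vx = mean(lambda x, y: (x - mx) ** 2)
    vy = mean(lambda x, y: (y - my) ** 2)
    return c / (sqrt(vx) * sqrt(vy))

rho_exact = corr({0.0: 1.0})   # var(U) = 0: perfectly linear, rho = -1
rho_noisy = corr(us)           # var(U) > 0: |rho| strictly less than 1
```

Increasing var(U) pushes the denominator √(var(X)(β1² var(X) + σ_U²)) up and drives |ρ| further below 1, exactly as the appendix derives.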
10. Appendix II: Covariance and independence
    a. Two joint distributions for X in {−1, −0.5, 0, 0.5, 1} and Y in {0, 0.25, 1}. In the first, Y = X², so X and Y are dependent; in the second, each joint probability is the product of the marginals, so X and Y are independent. The covariance is zero in both cases.

       Not independent (Y = X²):
                     Y=0    Y=0.25    Y=1    marginal for X
       X = −1         0%      0%      20%        20%
       X = −0.5       0%     20%       0%        20%
       X = 0         20%      0%       0%        20%
       X = 0.5        0%     20%       0%        20%
       X = 1          0%      0%      20%        20%
       marginal      20%     40%      40%
       for Y

       Independent:
                     Y=0    Y=0.25    Y=1    marginal for X
       X = −1         4%      8%       8%        20%
       X = −0.5       4%      8%       8%        20%
       X = 0          4%      8%       8%        20%
       X = 0.5        4%      8%       8%        20%
       X = 1          4%      8%       8%        20%
       marginal      20%     40%      40%
       for Y

    b. Covariance calculation for the dependent (Y = X²) case, with µ_X = 0 and µ_Y = 0.5:

       prob     x       y      x − µ_X    y − µ_Y    product
       20%     −1       1        −1         0.5       −0.5
       20%     −0.5     0.25     −0.5      −0.25       0.125
       20%      0       0         0        −0.5        0
       20%      0.5     0.25      0.5      −0.25      −0.125
       20%      1       1         1         0.5        0.5

       Cov(X, Y) = 0.2 × (−0.5 + 0.125 + 0 − 0.125 + 0.5) = 0, even though X and Y are not independent: e.g. f(−1, 0) = 0%, while f_X(−1) f_Y(0) = (20%)(20%) = 4%.
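The Appendix II calculation in code, reproducing the covariance table above for the dependent (Y = X²) case:

```python
# X uniform on {-1, -0.5, 0, 0.5, 1} with Y = X^2: zero covariance
# despite Y being a deterministic function of X.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
p = 0.2
mu_x = sum(x * p for x in xs)                       # 0
mu_y = sum(x**2 * p for x in xs)                    # 0.5
cov = sum((x - mu_x) * (x**2 - mu_y) * p for x in xs)

# Dependence: the joint probability P(X = -1 & Y = 0) is 0,
# but the product of the marginals is 0.2 * 0.2 = 0.04.
joint_cell = 0.0
product_of_marginals = 0.2 * 0.2
```

The symmetry of X around zero is what kills the covariance: each negative product (x − µ_X)(y − µ_Y) is exactly offset by its mirror image.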