
Math 525: Lecture 5
January 18, 2018

1 Series (review)

Definition 1.1. A sequence $(a_n)_n \subset \mathbb{R}$ converges to a point $L \in \mathbb{R}$ (written $a_n \to L$ or $\lim_{n \to \infty} a_n = L$) if for each $\epsilon > 0$, we can find $N$ such that $|a_n - L| < \epsilon$ for all $n \geq N$. If the sequence does not converge to any point in $\mathbb{R}$, we say it diverges.

In our case, we always use "converge" to mean "converges to a point in $\mathbb{R}$". Depending on the context, sometimes people will be talking about convergence in other spaces (e.g., if $a_n \to \infty$, one might say the sequence converges to a point in $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$: this is a perfectly valid use of the terminology).

Definition 1.2. Let $(a_n)_n \subset \mathbb{R}$ be a sequence. The... We say the series $\sum_n a_n$ converges (written $\sum_n a_n < \infty$) if the sequence of partial sums $(s_N)_{N=1}^{\infty}$ defined by $s_N = \sum_{n=1}^{N} a_n$ converges. Otherwise, we say it diverges. In the convergent case, we define
\[ \sum_n a_n = \lim_{N \to \infty} s_N. \]
The series $\sum_n a_n$ converges absolutely if the series $\sum_n |a_n|$ converges.

Proposition 1.3. If a series converges absolutely to $L \geq 0$, the series converges to a number in $[-L, L]$.

Proof. First, note that if $\sum_n |a_n| = L$ and $\sum_n a_n$ converges, then
\[ -L = -\sum_n |a_n| \leq \sum_n a_n \leq \sum_n |a_n| = L, \]
so it remains to show that $\sum_n a_n$ converges. The remainder of the proof requires knowledge of Cauchy sequences (if you are not familiar with them, you can safely skip this proof). Suppose $\sum_n a_n$ converges absolutely. Define $s_N = \sum_{n=1}^{N} a_n$ and $S_N = \sum_{n=1}^{N} |a_n|$. Then, for $N > M$,
\[ |s_N - s_M| = |a_{M+1} + a_{M+2} + \cdots + a_N| \leq |a_{M+1}| + |a_{M+2}| + \cdots + |a_N| = S_N - S_M. \]
Since $(S_N)_N$ is convergent, it is a Cauchy sequence. From the above, we see that $(s_N)_N$ is a Cauchy sequence and hence convergent.

Rearranging the terms in a series may change its value. However, in some cases, we can safely rearrange the terms in a series.

Proposition 1.4. If a series is made up only of positive terms, it can be rearranged without changing its sum.
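To see concretely why rearrangement needs care, here is a minimal numerical sketch. It is an illustration only and not part of the lecture. It uses the alternating harmonic series $1 - 1/2 + 1/3 - \cdots$, which converges but not absolutely, and a standard rearrangement that takes two positive terms followed by one negative term. The function names `alternating_harmonic` and `rearranged` are hypothetical, chosen for this example.

```python
import math

def alternating_harmonic(num_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in its usual order."""
    return sum((-1) ** (n + 1) / n for n in range(1, num_terms + 1))

def rearranged(num_blocks):
    """Partial sum of the same terms in a different order.

    Block k (k = 1, 2, ...) contributes two positive terms and one
    negative term: 1/(4k-3) + 1/(4k-1) - 1/(2k).
    """
    return sum(1 / (4 * k - 3) + 1 / (4 * k - 1) - 1 / (2 * k)
               for k in range(1, num_blocks + 1))

usual = alternating_harmonic(100_000)
reordered = rearranged(100_000)

# The usual order approaches log(2); the rearrangement approaches (3/2) log(2).
print(usual, math.log(2))
print(reordered, 1.5 * math.log(2))
```

Both sums use exactly the same terms, yet the limits differ. This does not contradict Proposition 1.4, since the terms here are not all positive.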

2 Expectation (discrete case)

In this lecture, we define expectations for discrete random variables. Handling the discrete case separately serves the purpose of building our intuition of expectations before we handle the more difficult case of non-discrete random variables.

Recall that for a discrete random variable $X$, we can find a countable set $\{x_n\}_n \subset \mathbb{R}$ such that $\sum_n P(\{X = x_n\}) = 1$. Note that this does not necessarily imply that the range of $X$ is $\{x_n\}_n$ (remember, random variables are functions from $\Omega$ to $\mathbb{R}$). However, we can define a new random variable, call it $Y$, as follows:
\[ Y(\omega) = \sum_n x_n I_{\{X = x_n\}}(\omega). \]
Note that
\[ P(\{Y = x_n\}) = P(\{ I_{\{X = x_n\}} = 1 \}) = P(\{X = x_n\}), \]
and hence the random variables $X$ and $Y$ are, for all intents and purposes, identical. Therefore, for the remainder, we will always assume without loss of generality that a discrete random variable $X$ has the form
\[ X(\omega) = \sum_n x_n I_{\Lambda_n}(\omega) \]
for some partition $\Lambda_1, \Lambda_2, \ldots$ of the sample space (i.e., $\Omega = \Lambda_1 \cup \Lambda_2 \cup \cdots$ and $\Lambda_i \cap \Lambda_j = \emptyset$ whenever $i \neq j$).

Definition 2.1. A discrete random variable $X$ is integrable if
\[ \sum_n |x_n| P(\Lambda_n) < \infty. \]

Definition 2.2. The expectation of an integrable discrete random variable $X$ is
\[ E X = \sum_n x_n P(\Lambda_n). \]

Example 2.3. Toss a coin $N \geq 1$ times. Let $X_N$ be the number of heads. If the probability of heads is $p$, the expectation of $X_N$ is
\[ E X_N = \sum_{n=0}^{N} n P(\{X_N = n\}) = \sum_{n=0}^{N} n \binom{N}{n} p^n (1-p)^{N-n} = pN. \]
That is, you are expected to see $pN$ heads on average. If the coin is fair, for example, $pN = N/2$ (half of the tosses should, on average, be heads).

Note, in particular, that we only define the expectation of random variables that are integrable, and integrability has to do with absolute convergence.
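The closed form in Example 2.3 can be checked directly from Definition 2.2. The following is a minimal sketch, not part of the lecture. The helper names `binomial_pmf` and `expectation` and the values of `N` and `p` are arbitrary choices made for this illustration. Exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(N, p):
    """Return [P(X_N = 0), ..., P(X_N = N)] for N tosses with heads probability p."""
    return [comb(N, n) * p**n * (1 - p) ** (N - n) for n in range(N + 1)]

def expectation(values, probs):
    """Sum of x_n * P(Lambda_n); the sum is finite here, so integrability is automatic."""
    return sum(x * q for x, q in zip(values, probs))

N, p = 10, Fraction(1, 3)          # illustrative values only
pmf = binomial_pmf(N, p)
mean = expectation(range(N + 1), pmf)

print(sum(pmf))   # total probability
print(mean, p * N)  # the sum from the definition and the closed form pN
```

With a finite support the absolute-convergence requirement is trivially met. For a countably infinite support, one would first have to check that the sum of $|x_n| P(\Lambda_n)$ is finite.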

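Remark 2.4. We have, so far, ignored a technical issue. Earlier, we characterized a random variable in terms of the sets $\Lambda_1, \Lambda_2, \ldots$ However, the choice of these sets is not unique. For example, the constant random variable $X(\omega) = 1$ can be written in many ways. Two possibilities are
\[ X(\omega) = 1 \cdot I_{\Omega}(\omega) \quad \text{and} \quad X(\omega) = 1 \cdot I_{\Lambda}(\omega) + 1 \cdot I_{\Lambda^c}(\omega), \]
where $\Lambda$ is any subset of the sample space $\Omega$. Since the definition of expectation depends on a particular choice of the sets $\Lambda_1, \Lambda_2, \ldots$, it is not clear that the expectation will remain the same if we change our choice of $\Lambda_1, \Lambda_2, \ldots$ This technicality is handled on page 55 of Walsh, John B. Knowing the odds: an introduction to probability. Vol. 139. American Mathematical Soc., 2012.

Example 2.5. Let $X$ be a nonnegative integer-valued random variable (i.e., $\sum_{n \geq 0} P(\{X = n\}) = 1$). We assume $X$ has the form $X(\omega) = \sum_{n \geq 0} n I_{\{X = n\}}(\omega)$. If $X$ is integrable,
\[
\begin{aligned}
E[X] &= \sum_{n \geq 0} n P(\{X = n\}) \\
&= 0 \cdot P(\{X = 0\}) + 1 \cdot P(\{X = 1\}) + 2 \cdot P(\{X = 2\}) + \cdots \\
&= \big(P(\{X = 1\}) + P(\{X = 2\}) + \cdots\big) + \big(P(\{X = 2\}) + P(\{X = 3\}) + \cdots\big) + \cdots \\
&= \sum_{n \geq 1} P(\{X \geq n\}).
\end{aligned}
\]

Proposition 2.6. Let $X$ and $Y$ be discrete random variables and $a, b \in \mathbb{R}$. Then,

1. If $X$ and $Y$ are integrable, so is $aX + bY$ and $E[aX + bY] = aEX + bEY$.
2. If $|X| \leq |Y|$ and $Y$ is integrable, then $X$ is integrable.
3. If $X$ and $Y$ are integrable and $X \leq Y$, then $EX \leq EY$.
4. If $X$ is integrable, $|EX| \leq E|X|$.

Proof. Recall that we can partition $\Omega$ into events $\Lambda^X_1, \Lambda^X_2, \ldots$ on which $X$ is constant. We can do the same for $Y$, obtaining $\Lambda^Y_1, \Lambda^Y_2, \ldots$ This allows us to define $\Lambda_{ij} = \Lambda^X_i \cap \Lambda^Y_j$, on which both $X$ and $Y$ are constant. Since $(\Lambda_{ij})_{i,j}$ is a countable sequence, let's relabel it $(\Lambda_n)_n$ and take $X = x_n$ and $Y = y_n$ on $\Lambda_n$.

1. Suppose $X$ and $Y$ are integrable. Then,
\[
\begin{aligned}
\sum_n |a x_n + b y_n| P(\Lambda_n) &\leq |a| \sum_n |x_n| P(\Lambda_n) + |b| \sum_n |y_n| P(\Lambda_n) \\
&= |a| \sum_n |x_n| P(\Lambda^X_n) + |b| \sum_n |y_n| P(\Lambda^Y_n) \\
&= |a| E[|X|] + |b| E[|Y|],
\end{aligned}
\]
where, with a slight abuse of notation, $x_n$ and $y_n$ in the second line denote the values of $X$ on $\Lambda^X_n$ and of $Y$ on $\Lambda^Y_n$. Hence $aX + bY$ is integrable. Repeating almost the exact same computation as above without the absolute value signs yields the desired result.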

2. This follows from $\sum_n |x_n| P(\Lambda_n) \leq \sum_n |y_n| P(\Lambda_n)$.
3. Exercise.
4. Take $Y = |X|$ in (3), applied to both $X$ and $-X$.

Most importantly, the above proposition tells us that the expectation is a linear function. That is, let $\mathcal{X}$ be the set of all integrable discrete random variables. Define $T \colon \mathcal{X} \to \mathbb{R}$ as the mapping from a random variable to its expectation: $T(X) = EX$. Then, $T$ is a linear function (i.e., $T(aX + bY) = aT(X) + bT(Y)$).

As an example of expectations, we now introduce the probability generating function of a discrete random variable. We point out that our treatment is a bit cavalier for the time being, but we will come back to generating functions in a more principled manner. Before we move to this example, let's give a simple definition:

Definition 2.7. The probability mass function (PMF) of a discrete random variable $X$ is $p \colon \mathbb{R} \to [0,1]$ defined by
\[ p(x) = \begin{cases} P(\Lambda_n) & \text{if } x = x_n \\ 0 & \text{otherwise.} \end{cases} \]

Example 2.8. Let $X$ be a nonnegative integer-valued random variable. Define $G$, the probability generating function of $X$, by
\[ G(t) = E\left[t^X\right]. \]
Ignoring the integrability of $t^X$,
\[ G(t) = \sum_{n=0}^{\infty} p(n) t^n = p(0) + \sum_{n=1}^{\infty} p(n) t^n, \]
where $p$ is the probability mass function of $X$. Being a power series, $G$ has a radius of convergence $0 \leq R \leq \infty$ which characterizes which values of $t$ it converges for (i.e., it converges for $|t| < R$ and diverges for $|t| > R$). Since
\[ G(1) = \sum_{n=0}^{\infty} p(n) 1^n = \sum_{n=0}^{\infty} p(n) = 1, \]
we know that the radius of convergence must be at least one (i.e., $R \geq 1$). Furthermore,
\[ G(0) = p(0) + \sum_{n=1}^{\infty} p(n) 0^n = p(0). \]
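As a quick check of the two identities $G(1) = 1$ and $G(0) = p(0)$, the sketch below evaluates the generating function of a binomial PMF. This is an illustration only. The helper name `pgf`, the list representation of the PMF, and the values of `N` and `p` are assumptions made for this example. The last comparison uses the standard closed form $(1 - p + pt)^N$ for the binomial generating function, which is not derived in these notes.

```python
from fractions import Fraction
from math import comb

def pgf(pmf, t):
    """Evaluate G(t) = sum_n p(n) t^n for a PMF given as the list [p(0), p(1), ...]."""
    # Python evaluates t**0 as 1 even when t == 0, matching the convention 0^0 = 1.
    return sum(q * t**n for n, q in enumerate(pmf))

N, p = 4, Fraction(1, 4)           # illustrative values only
pmf = [comb(N, n) * p**n * (1 - p) ** (N - n) for n in range(N + 1)]

g_at_one = pgf(pmf, Fraction(1))
g_at_zero = pgf(pmf, Fraction(0))
g_at_half = pgf(pmf, Fraction(1, 2))

print(g_at_one)             # total probability
print(g_at_zero, pmf[0])    # G(0) and p(0)
print(g_at_half, (1 - p + p * Fraction(1, 2)) ** N)
```

Since this PMF has finite support, $G$ is a polynomial and the radius of convergence is infinite, so no convergence question arises in the sketch.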

Now, if we take derivatives of $G$ (for values of $t$ inside the radius of convergence), we get
\[
\begin{aligned}
G'(t) &= \sum_{n=1}^{\infty} n \, p(n) t^{n-1} \\
G''(t) &= \sum_{n=2}^{\infty} n(n-1) \, p(n) t^{n-2} \\
&\;\;\vdots \\
G^{(k)}(t) &= \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1) \, p(n) t^{n-k}.
\end{aligned}
\]
We conclude that
\[ p(n) = \frac{1}{n!} G^{(n)}(0) \quad \text{for } n = 1, 2, \ldots \]

In the previous example, we wrote $E[t^X]$ even though $t^X$ was not necessarily integrable for arbitrary $t$. We will often perform this abuse of notation by writing $EY$ for any random variable $Y$ with the implicit understanding that $EY$ is only well-defined when $Y$ is integrable.

Proposition 2.9. Let $f \colon \mathbb{R} \to \mathbb{R}$ and $X$ be a discrete random variable with probability mass function $p$ and support $\{x_n\}_n$. Then, $Y = f \circ X$ is integrable if and only if
\[ \sum_n |f(x_n)| \, p(x_n) < \infty, \]
in which case
\[ E[f(X)] = \sum_n f(x_n) \, p(x_n). \]

3 Variance

Definition 3.1. Let $X$ be a discrete random variable. Its variance is defined as
\[ \operatorname{Var} X = E\left[(X - EX)^2\right] \]
(there is an implicit assumption about integrability in the definition of variance). Its standard deviation is $\sqrt{\operatorname{Var} X}$.

Once again interpreting $E$ as an average, it is clear from the definition that variance is a measure of how far the random variable is from the expectation on average. Note also that
\[
\begin{aligned}
\operatorname{Var} X &= E\left[(X - EX)^2\right] \\
&= E\left[X^2 - 2X \, EX + (EX)^2\right] \\
&= E\left[X^2\right] - 2 \, EX \, EX + (EX)^2 \\
&= E\left[X^2\right] - 2(EX)^2 + (EX)^2 \\
&= E\left[X^2\right] - (EX)^2,
\end{aligned}
\]
which gives us a useful formula for the variance of a random variable. We will see next class that $E X^2$ is referred to as the second raw moment, and another name for the variance is the second central moment.
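The two expressions for the variance can be compared numerically using the formula for $E[f(X)]$ in Proposition 2.9. The sketch below is an illustration under stated assumptions: the random variable is a fair six-sided die (an example not taken from the notes), and the helper name `expect` and the dictionary representation of the PMF are hypothetical.

```python
from fractions import Fraction

def expect(f, support, pmf):
    """E[f(X)] = sum_n f(x_n) p(x_n) for a finite support."""
    return sum(f(x) * pmf[x] for x in support)

# Assumed example: a fair six-sided die.
support = range(1, 7)
pmf = {x: Fraction(1, 6) for x in support}

mean = expect(lambda x: x, support, pmf)
var_definition = expect(lambda x: (x - mean) ** 2, support, pmf)   # E[(X - EX)^2]
var_shortcut = expect(lambda x: x**2, support, pmf) - mean**2      # E[X^2] - (EX)^2

print(mean, var_definition, var_shortcut)
```

Both routes give the same exact rational value. The second one needs only the first two raw moments, which is often the more convenient computation.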

Example 3.2. Toss a coin $N \geq 1$ times. Let $X_N$ be the number of heads. Remember, we computed $E X_N = pN$. Therefore, to get $\operatorname{Var} X_N$, it is sufficient to compute $E[X_N^2]$:
\[ E\left[X_N^2\right] = \sum_{n=0}^{N} n^2 P(\{X_N = n\}) = \sum_{n=0}^{N} n^2 \binom{N}{n} p^n (1-p)^{N-n} = p\big((N-1)Np + N\big). \]
Hence $\operatorname{Var} X_N = p\big((N-1)Np + N\big) - (pN)^2 = Np(1-p)$.

Exercise 3.3. Let $X$ be a random variable with finite variance. Then $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$.

Exercise 3.4. Let $X$ be a discrete random variable. Show that if $X^2$ is integrable, then $X$ is integrable (i.e., it is not possible to have a random variable with finite variance but infinite expectation).

Remark 3.5. For those familiar with measure theory, the above is an immediate consequence of the deeper fact that for a finite measure space, $L^q \subset L^p$ for $1 \leq p \leq q$.
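Returning to Example 3.2 and Exercise 3.3, the following sketch checks the closed form for $E[X_N^2]$ and the scaling rule for the variance on one concrete case. It is a numerical sanity check, not a proof of the exercise. The helper name `moments` and the values of `N`, `p`, `a`, and `b` are arbitrary choices made for this illustration.

```python
from fractions import Fraction
from math import comb

def moments(values, probs):
    """Return (E[X], E[X^2], Var X) for a finitely supported random variable."""
    mean = sum(x * q for x, q in zip(values, probs))
    second = sum(x * x * q for x, q in zip(values, probs))
    return mean, second, second - mean**2

N, p = 6, Fraction(2, 5)           # illustrative values only
a, b = 3, -7                       # illustrative affine map x -> a*x + b

values = list(range(N + 1))
probs = [comb(N, n) * p**n * (1 - p) ** (N - n) for n in values]

mean, second, var = moments(values, probs)
# a*X + b takes the value a*x + b with the same probability as X takes x.
_, _, var_affine = moments([a * x + b for x in values], probs)

print(second, p * ((N - 1) * N * p + N))   # E[X_N^2] against the closed form
print(var, N * p * (1 - p))                # Var X_N against Np(1 - p)
print(var_affine, a * a * var)             # Var(aX + b) against a^2 Var X
```

The shift `b` drops out because it moves the values and the mean by the same amount, while the factor `a` is squared along with the deviations.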
