Math 525: Lecture 5. January 18, 2018


1 Series (review)

Definition 1.1. A sequence $(a_n)_n \subset \mathbb{R}$ converges to a point $L \in \mathbb{R}$ (written $a_n \to L$ or $\lim_{n \to \infty} a_n = L$) if for each $\epsilon > 0$, we can find $N$ such that $|a_n - L| < \epsilon$ for all $n \geq N$. If the sequence does not converge to any point in $\mathbb{R}$, we say it diverges.

In our case, we always use "converge" to mean "converges to a point in $\mathbb{R}$". Depending on the context, sometimes people will be talking about convergence in other spaces (e.g., if $a_n \to \infty$, one might say the sequence converges to a point in $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$: this is a perfectly valid use of the terminology).

Definition 1.2. Let $(a_n)_n \subset \mathbb{R}$ be a sequence. We say the series $\sum_n a_n$ converges (written $\sum_n a_n < \infty$) if the sequence of partial sums $(s_N)_{N=1}^{\infty}$ defined by $s_N = \sum_{n=1}^{N} a_n$ converges. Otherwise, we say it diverges. In the convergent case, we define
\[ \sum_n a_n = \lim_{N \to \infty} s_N. \]
The series $\sum_n a_n$ converges absolutely if the series $\sum_n |a_n|$ converges.

Proposition 1.3. If a series converges absolutely to $L \geq 0$, the series converges to a number in $[-L, L]$.

Proof. First, note that if $\sum_n |a_n| \leq L$, then
\[ -L \leq -\sum_n |a_n| \leq \sum_n a_n \leq \sum_n |a_n| \leq L. \]
The remainder of the proof requires knowledge of Cauchy sequences (if you are not familiar with them, you can safely skip this proof). Suppose $\sum_n a_n$ converges absolutely. Define $s_N = \sum_{n=1}^{N} a_n$ and $S_N = \sum_{n=1}^{N} |a_n|$. Then, for $N > M$,
\[ |s_N - s_M| = |a_{M+1} + a_{M+2} + \cdots + a_N| \leq |a_{M+1}| + |a_{M+2}| + \cdots + |a_N| = S_N - S_M. \]
Since $(S_N)_N$ is convergent, it is a Cauchy sequence. From the above, we see that $(s_N)_N$ is a Cauchy sequence and hence convergent.

Rearranging the terms in a series may change its value. However, in some cases, we can safely rearrange the terms in a series.

Proposition 1.4. If a series is made up only of positive terms, it can be rearranged without changing its sum.
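The warning that rearranging a series may change its value can be seen numerically. The sketch below is not from the lecture: it uses the alternating harmonic series, which converges but not absolutely, and a standard rearrangement (one positive term followed by two negative terms) whose sum is known to be half the original. The function names are illustrative choices.

```python
import math


def partial_sums(terms):
    """Return the list of partial sums s_1, s_2, ... of the given terms."""
    sums, total = [], 0.0
    for a in terms:
        total += a
        sums.append(total)
    return sums


def alternating_harmonic(num_terms):
    """First num_terms terms of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return [(-1) ** (n + 1) / n for n in range(1, num_terms + 1)]


def rearranged_alternating_harmonic(num_blocks):
    """Same terms, reordered as 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...

    Each block holds one positive term followed by two negative terms.
    """
    terms = []
    for k in range(1, num_blocks + 1):
        terms.extend([1 / (2 * k - 1), -1 / (4 * k - 2), -1 / (4 * k)])
    return terms


# Last partial sum of each ordering, next to the known limit.
original = partial_sums(alternating_harmonic(300_000))[-1]
rearranged = partial_sums(rearranged_alternating_harmonic(100_000))[-1]
print(original, math.log(2))
print(rearranged, math.log(2) / 2)
```

Both orderings use exactly the same terms, yet the partial sums settle near different values. This cannot happen for a series of positive terms, which is the content of Proposition 1.4.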

2 Expectation (discrete case)

In this lecture, we define expectations for discrete random variables. Handling the discrete case separately serves the purpose of building our intuition of expectations before we handle the more difficult case of non-discrete random variables.

Recall that for a discrete random variable $X$, we can find a countable set $\{x_n\}_n \subset \mathbb{R}$ such that $\sum_n P(\{X = x_n\}) = 1$. Note that this does not necessarily imply that the range of $X$ is $\{x_n\}_n$ (remember, random variables are functions from $\Omega$ to $\mathbb{R}$). However, we can define a new random variable, call it $Y$, as follows:
\[ Y(\omega) = \sum_n x_n I_{\{X = x_n\}}(\omega). \]
Note that
\[ P(\{Y = x_n\}) = P(\{ I_{\{X = x_n\}} = 1 \}) = P(\{X = x_n\}), \]
and hence the random variables $X$ and $Y$ are, for all intents and purposes, identical. Therefore, for the remainder, we will always assume without loss of generality that a discrete random variable $X$ has the form
\[ X(\omega) = \sum_n x_n I_{\Lambda_n}(\omega) \]
for some partition $\Lambda_1, \Lambda_2, \ldots$ of the sample space (i.e., $\Omega = \Lambda_1 \cup \Lambda_2 \cup \cdots$ and $\Lambda_i \cap \Lambda_j = \emptyset$ whenever $i \neq j$).

Definition 2.1. A discrete random variable $X$ is integrable if $\sum_n |x_n| P(\Lambda_n) < \infty$.

Definition 2.2. The expectation of an integrable discrete random variable $X$ is
\[ EX = \sum_n x_n P(\Lambda_n). \]

Example 2.3. Toss a coin $N \geq 1$ times. Let $X_N$ be the number of heads. If the probability of heads is $p$, the expectation of $X_N$ is
\[ EX_N = \sum_{n=0}^{N} n P(\{X_N = n\}) = \sum_{n=0}^{N} n \binom{N}{n} p^n (1-p)^{N-n} = pN. \]
That is, you are expected to see $pN$ heads on average. If the coin is fair, for example, $pN = N/2$ (half of the tosses should, on average, be heads).

Note, in particular, that we only define the expectation of random variables that are integrable, and integrability has to do with absolute convergence.
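The closed form $EX_N = pN$ in Example 2.3 can be checked by evaluating the defining sum directly. The following is a minimal sketch, assuming an arbitrary choice of $N = 10$ and $p = 0.3$; the helper names `binomial_pmf` and `expectation` are illustrative and not part of the lecture.

```python
from math import comb


def binomial_pmf(N, p):
    """Return [P(X_N = 0), ..., P(X_N = N)] for N tosses with heads probability p."""
    return [comb(N, n) * p**n * (1 - p) ** (N - n) for n in range(N + 1)]


def expectation(values, probs):
    """Sum of x_n * P(Lambda_n); with finitely many values, integrability is automatic."""
    return sum(x * q for x, q in zip(values, probs))


# Assumed parameters, chosen only for illustration.
N, p = 10, 0.3
pmf = binomial_pmf(N, p)
mean = expectation(range(N + 1), pmf)
print(mean, p * N)
```

Because the support is finite here, absolute convergence is not an issue. For a random variable with infinitely many values, the sum would first have to be checked for absolute convergence, as Definition 2.1 requires.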

Remark 2.4. We have, so far, ignored a technical issue. Earlier, we characterized a random variable in terms of the sets $\Lambda_1, \Lambda_2, \ldots$ However, the choice of these sets is not unique. For example, the constant random variable $X(\omega) = 1$ can be written in many ways. Two possibilities are
\[ X(\omega) = 1 \cdot I_{\Omega}(\omega) \quad \text{and} \quad X(\omega) = 1 \cdot I_{\Lambda}(\omega) + 1 \cdot I_{\Lambda^c}(\omega), \]
where $\Lambda$ is any subset of the sample space $\Omega$. Since the definition of expectation depends on a particular choice of the sets $\Lambda_1, \Lambda_2, \ldots$, it is not clear that the expectation will remain the same if we change our choice of $\Lambda_1, \Lambda_2, \ldots$ This technicality is handled on page 55 of Walsh, John B. Knowing the Odds: An Introduction to Probability. Vol. 139. American Mathematical Soc., 2012.

Example 2.5. Let $X$ be a nonnegative integer-valued random variable (i.e., $\sum_{n \geq 0} P(\{X = n\}) = 1$). We assume $X$ has the form $X(\omega) = \sum_{n \geq 0} n I_{\{X = n\}}(\omega)$. If $X$ is integrable,
\[
\begin{aligned}
E[X] &= \sum_{n \geq 0} n P(\{X = n\}) \\
&= 0 \cdot P(\{X = 0\}) + 1 \cdot P(\{X = 1\}) + 2 \cdot P(\{X = 2\}) + \cdots \\
&= (P(\{X = 1\}) + P(\{X = 2\}) + \cdots) + (P(\{X = 2\}) + P(\{X = 3\}) + \cdots) + \cdots \\
&= \sum_{n \geq 1} P(\{X \geq n\}).
\end{aligned}
\]

Proposition 2.6. Let $X$ and $Y$ be discrete random variables and $a, b \in \mathbb{R}$. Then,

1. If $X$ and $Y$ are integrable, so is $aX + bY$ and $E[aX + bY] = aEX + bEY$.
2. If $|X| \leq |Y|$ and $Y$ is integrable, then $X$ is integrable.
3. If $X$ and $Y$ are integrable and $X \leq Y$, then $EX \leq EY$.
4. If $X$ is integrable, $|EX| \leq E|X|$.

Proof. Recall that we can partition $\Omega$ into events $\Lambda^X_1, \Lambda^X_2, \ldots$ on which $X$ is constant. We can do the same for $Y$, obtaining $\Lambda^Y_1, \Lambda^Y_2, \ldots$ This allows us to define $\Lambda_{ij} = \Lambda^X_i \cap \Lambda^Y_j$, on which both $X$ and $Y$ are constant. Since $(\Lambda_{ij})_{i,j}$ is a countable collection, let's relabel it $(\Lambda_n)_n$ and take $X = x_n$ and $Y = y_n$ on $\Lambda_n$.

1. Suppose $X$ and $Y$ are integrable. Then,
\[
\begin{aligned}
\sum_n |a x_n + b y_n| P(\Lambda_n) &\leq |a| \sum_n |x_n| P(\Lambda_n) + |b| \sum_n |y_n| P(\Lambda_n) \\
&= |a| \sum_i |x_i| P(\Lambda^X_i) + |b| \sum_j |y_j| P(\Lambda^Y_j) \\
&= |a| E[|X|] + |b| E[|Y|],
\end{aligned}
\]
and hence $aX + bY$ is integrable. Repeating almost the exact same computation as above without the absolute value signs yields the desired result.

2. This follows from $\sum_n |x_n| P(\Lambda_n) \leq \sum_n |y_n| P(\Lambda_n)$.

3. Exercise.

4. Take $Y = |X|$ in (3) to get $EX \leq E|X|$; applying (3) to $-X \leq |X|$ gives $-EX \leq E|X|$ as well.

Most importantly, the above proposition tells us that the expectation is a linear function. That is, let $\mathcal{X}$ be the set of all integrable discrete random variables. Define $T \colon \mathcal{X} \to \mathbb{R}$ as the mapping from a random variable to its expectation: $T(X) = EX$. Then, $T$ is a linear function (i.e., $T(aX + bY) = aT(X) + bT(Y)$).

As an example of expectations, we now introduce the probability generating function of a discrete random variable. We point out that our treatment is a bit cavalier for the time being, but we will come back to generating functions in a more principled manner. Before we move to this example, let's give a simple definition:

Definition 2.7. The probability mass function (PMF) of a discrete random variable $X$ is $p \colon \mathbb{R} \to [0, 1]$ defined by
\[
p(x) = \begin{cases} P(\Lambda_n) & \text{if } x = x_n, \\ 0 & \text{otherwise.} \end{cases}
\]

Example 2.8. Let $X$ be a nonnegative integer-valued random variable. Define $G$, the probability generating function of $X$, by
\[ G(t) = E[t^X]. \]
Ignoring the integrability of $t^X$,
\[ G(t) = \sum_{n=0}^{\infty} p(n) t^n = p(0) + \sum_{n=1}^{\infty} p(n) t^n, \]
where $p$ is the probability mass function of $X$. Being a power series, $G$ has a radius of convergence $0 \leq R \leq \infty$ which characterizes the values of $t$ for which it converges (i.e., it converges for $|t| < R$ and diverges for $|t| > R$). Since
\[ G(1) = \sum_{n=0}^{\infty} p(n) 1^n = \sum_{n=0}^{\infty} p(n) = 1, \]
we know that the radius of convergence must be at least one (i.e., $R \geq 1$). Furthermore,
\[ G(0) = p(0) + \sum_{n=1}^{\infty} p(n) 0^n = p(0). \]
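When $X$ takes only finitely many values, $G$ is a polynomial and the two evaluations above can be checked exactly. The sketch below assumes a hypothetical PMF on $\{0, 1, 2, 3\}$ (not taken from the lecture) and stores $p(n)$ at index $n$ of a list; the name `pgf` is an illustrative choice.

```python
from fractions import Fraction


def pgf(pmf, t):
    """Evaluate G(t) = sum_n p(n) t^n by Horner's rule; pmf[n] holds p(n)."""
    result = 0
    for coefficient in reversed(pmf):
        result = result * t + coefficient
    return result


# Hypothetical PMF on {0, 1, 2, 3}, used only for illustration.
pmf = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# G(1) sums the PMF, G(0) picks out p(0), and any other t gives E[t^X].
print(pgf(pmf, 1), pgf(pmf, 0), pgf(pmf, Fraction(1, 2)))
```

Exact rational arithmetic is used so that $G(1) = 1$ and $G(0) = p(0)$ hold as equalities rather than up to rounding.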

Now, if we take derivatives of $G$ (for values of $t$ inside the radius of convergence), we get
\[
\begin{aligned}
G'(t) &= \sum_{n=1}^{\infty} n p(n) t^{n-1}, \\
G''(t) &= \sum_{n=2}^{\infty} n(n-1) p(n) t^{n-2}, \\
&\;\;\vdots \\
G^{(k)}(t) &= \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1) p(n) t^{n-k}.
\end{aligned}
\]
We conclude that
\[ p(n) = \frac{1}{n!} G^{(n)}(0) \quad \text{for } n = 1, 2, \ldots \]

In the previous example, we wrote $E[t^X]$ even though $t^X$ was not necessarily integrable for arbitrary $t$. We will often perform this abuse of notation by writing $EY$ for any random variable $Y$, with the implicit understanding that $EY$ is only well-defined when $Y$ is integrable.

Proposition 2.9. Let $f \colon \mathbb{R} \to \mathbb{R}$ and $X$ be a discrete random variable with probability mass function $p$ and support $\{x_n\}_n$. Then, $Y = f \circ X$ is integrable if and only if
\[ \sum_n |f(x_n)| p(x_n) < \infty, \]
in which case
\[ E[f(X)] = \sum_n f(x_n) p(x_n). \]

3 Variance

Definition 3.1. Let $X$ be a discrete random variable. Its variance is defined as
\[ \operatorname{Var} X = E[(X - EX)^2] \]
(there is an implicit assumption about integrability in the definition of variance). Its standard deviation is $\sqrt{\operatorname{Var} X}$.

Once again interpreting $E$ as an average, it is clear from the definition that variance measures how far the random variable is from its expectation on average. Note also that
\[
\begin{aligned}
\operatorname{Var} X &= E[(X - EX)^2] \\
&= E[X^2 - 2X \, EX + (EX)^2] \\
&= E[X^2] - 2 \, EX \, EX + (EX)^2 \\
&= E[X^2] - 2(EX)^2 + (EX)^2 \\
&= E[X^2] - (EX)^2,
\end{aligned}
\]
which gives us a useful formula for the variance of a random variable. We will see next class that $E[X^2]$ is referred to as the second raw moment, and another name for the variance is the second central moment.
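Proposition 2.9 turns the variance into a finite sum whenever the support is finite, which makes it easy to compare the definition with the shortcut formula. The following sketch assumes a hypothetical example, one roll of a fair six-sided die, and represents the PMF as a dictionary from values to probabilities; neither choice comes from the lecture.

```python
from fractions import Fraction


def expectation(pmf, f=lambda x: x):
    """E[f(X)] = sum_n f(x_n) p(x_n) for a finite-support PMF given as {x_n: p(x_n)}."""
    return sum(f(x) * q for x, q in pmf.items())


# Hypothetical example: one roll of a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(die)
# Variance from the definition E[(X - EX)^2] ...
var_by_definition = expectation(die, lambda x: (x - mean) ** 2)
# ... and from the shortcut E[X^2] - (EX)^2.
var_by_formula = expectation(die, lambda x: x**2) - mean**2
print(mean, var_by_definition, var_by_formula)
```

The two variance values agree exactly because the arithmetic is done with rationals. With floating-point probabilities the shortcut formula can lose precision when $E[X^2]$ and $(EX)^2$ are both large and close together.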

Example 3.2. Toss a coin $N \geq 1$ times. Let $X_N$ be the number of heads. Remember, we computed $EX_N = pN$. Therefore, to get $\operatorname{Var} X_N$, it is sufficient to compute $E[X_N^2]$:
\[ E[X_N^2] = \sum_{n=0}^{N} n^2 P(\{X_N = n\}) = \sum_{n=0}^{N} n^2 \binom{N}{n} p^n (1-p)^{N-n} = p((N-1)Np + N). \]

Exercise 3.3. Let $X$ be a random variable with finite variance and $a, b \in \mathbb{R}$. Then $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$.

Exercise 3.4. Let $X$ be a discrete random variable. Show that if $X^2$ is integrable, then $X$ is integrable (i.e., it is not possible to have a random variable with finite variance but infinite expectation).

Remark 3.5. For those familiar with measure theory, the above is an immediate consequence of the deeper fact that for a finite measure space, $L^q \subset L^p$ for $1 \leq p \leq q$.
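The second moment in Example 3.2 and the scaling rule in Exercise 3.3 can both be checked exactly for a particular coin. This is a sketch under assumed parameters ($N = 8$, $p = 1/3$, $a = 3$, $b = -5$, all chosen arbitrarily); it is a sanity check for one case, not a proof of either statement, and the helper names are illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(N, p):
    """PMF of the number of heads in N tosses, as {n: P(X_N = n)}."""
    return {n: comb(N, n) * p**n * (1 - p) ** (N - n) for n in range(N + 1)}


def expectation(pmf, f=lambda x: x):
    """E[f(X)] for a finite-support PMF given as {value: probability}."""
    return sum(f(x) * q for x, q in pmf.items())


def variance(pmf):
    """Var X = E[(X - EX)^2], computed from the definition."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)


# Assumed parameters, chosen only for illustration.
N, p = 8, Fraction(1, 3)
a, b = 3, -5

pmf = binomial_pmf(N, p)
second_moment = expectation(pmf, lambda x: x**2)
closed_form = p * ((N - 1) * N * p + N)

# PMF of aX + b: the same probabilities attached to the transformed values.
affine_pmf = {a * x + b: q for x, q in pmf.items()}

print(second_moment, closed_form)
print(variance(affine_pmf), a**2 * variance(pmf))
```

Building `affine_pmf` by relabelling values works here because $x \mapsto ax + b$ is one-to-one for $a \neq 0$; for $a = 0$ the values would collide and their probabilities would have to be added.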