The Strong Law of Large Numbers


Lecture 9: The Strong Law of Large Numbers

Reading: Grimmett–Stirzaker 7.2; David Williams, "Probability with Martingales", 7.2
Further reading: Grimmett–Stirzaker 7.1, 7.3–7.5

With the Convergence Theorem (Theorem 54) and the Ergodic Theorem (Theorem 55) we have two very different statements of convergence of something to a stationary distribution. We are looking at a recurrent Markov chain $(X_t)_{t \ge 0}$, i.e. one that visits every state at arbitrarily large times, so clearly $X_t$ itself does not converge as $t \to \infty$. In this lecture we look more closely at the different types of convergence and develop methods to show the so-called almost sure convergence, of which the statement of the Ergodic Theorem is an example.

9.1 Modes of convergence

Definition 59. Let $X_n$, $n \ge 1$, and $X$ be random variables. Then we define

1. $X_n \to X$ in probability, if for all $\varepsilon > 0$, $P(|X_n - X| > \varepsilon) \to 0$ as $n \to \infty$.
2. $X_n \to X$ in distribution, if $P(X_n \le x) \to P(X \le x)$ as $n \to \infty$, for all $x \in \mathbb{R}$ at which $x \mapsto P(X \le x)$ is continuous.
3. $X_n \to X$ in $L^1$, if $E(|X_n|) < \infty$ for all $n \ge 1$ and $E(|X_n - X|) \to 0$ as $n \to \infty$.
4. $X_n \to X$ almost surely (a.s.), if $P(X_n \to X$ as $n \to \infty) = 1$.

Almost sure convergence is the notion that we will study in more detail here. It helps to consider random variables as functions $X_n \colon \Omega \to \mathbb{R}$ on a sample space $\Omega$, or at least as functions of a common, typically infinite, family of independent random variables. What is different here from previous parts of the course (except for the Ergodic Theorem, which we have yet to inspect more thoroughly) is that we want to calculate probabilities that fundamentally depend on an infinite number of random variables. So far, we have been able to revert to events depending on only finitely many random variables by conditioning. This will not work here.
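To see Definition 59(1) in action numerically, here is a minimal Monte Carlo sketch, assuming the illustrative choice $X_n = X + Z_n/n$ with $X \sim U(0,1)$ and $Z_n \sim N(0,1)$ independent (all parameters arbitrary): the estimated probabilities $P(|X_n - X| > \varepsilon)$ shrink towards zero as $n$ grows, as convergence in probability requires.

```python
# Monte Carlo sketch: estimate P(|X_n - X| > eps) for X_n = X + Z_n / n,
# with X ~ Uniform(0,1) and Z_n ~ N(0,1) independent (illustrative choices).
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05
num_samples = 100_000

for n in [1, 10, 100, 1000]:
    x = rng.uniform(0.0, 1.0, size=num_samples)      # the limit X
    x_n = x + rng.standard_normal(num_samples) / n    # the approximating X_n
    prob = np.mean(np.abs(x_n - x) > eps)              # estimate of P(|X_n - X| > eps)
    print(f"n = {n:5d}:  P(|X_n - X| > {eps}) ~ {prob:.4f}")
```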

Let us start by recalling the definition of convergence of sequences: as $n \to \infty$,

$x_n \to x \iff \forall m \ge 1\ \exists n_m \ge 1\ \forall n \ge n_m : |x_n - x| < 1/m$.

If we want to consider all sequences $(x_n)_{n \ge 1}$ of possible values of the random variables $(X_n)_{n \ge 1}$, then $n_m = \inf\{k \ge 1 : \forall n \ge k,\ |x_n - x| < 1/m\} \in \mathbb{N} \cup \{\infty\}$ will vary as a function of the sequence $(x_n)_{n \ge 1}$, and so it will become a random variable

$N_m = \inf\{k \ge 1 : \forall n \ge k,\ |X_n - X| < 1/m\} \in \mathbb{N} \cup \{\infty\}$

as a function of $(X_n)_{n \ge 1}$. This definition of $N_m$ permits us to write

$P(X_n \to X) = P(\forall m \ge 1 : N_m < \infty)$.

This will occasionally help when we are given almost sure convergence, but it is not much use when we want to prove almost sure convergence. To prove almost sure convergence, we can transform as follows:

$P(X_n \to X) = P(\forall m \ge 1\ \exists N \ge 1\ \forall n \ge N : |X_n - X| < 1/m) = 1$
$\iff P(\exists m \ge 1\ \forall N \ge 1\ \exists n \ge N : |X_n - X| \ge 1/m) = 0$.

We are used to events such as $A_{m,n} = \{|X_n - X| \ge 1/m\}$, and we understand events as subsets of $\Omega$, or loosely identify this event as the set of all $((x_k)_{k \ge 1}, x)$ for which $|x_n - x| \ge 1/m$. This is useful, because we can now translate $\exists m \ge 1\ \forall N \ge 1\ \exists n \ge N$ into set operations and write

$P\Big(\bigcup_{m \ge 1} \bigcap_{N \ge 1} \bigcup_{n \ge N} A_{m,n}\Big) = 0$.

This event can only have zero probability if all events $\bigcap_{N \ge 1} \bigcup_{n \ge N} A_{m,n}$, $m \ge 1$, have zero probability (formally, this follows from the sigma-additivity of the measure $P$). The Borel–Cantelli lemma will give a criterion for this.

Proposition 60. The following implications hold:

$X_n \to X$ almost surely $\implies$ $X_n \to X$ in probability $\implies$ $X_n \to X$ in distribution;
$X_n \to X$ in $L^1$ $\implies$ $X_n \to X$ in probability;
$X_n \to X$ in $L^1$ $\implies$ $E(X_n) \to E(X)$.

No other implications hold in general.

Proof: Most of this is Part A material. Some counterexamples are on Assignment 5. It remains to prove that almost sure convergence implies convergence in probability. Suppose $X_n \to X$ almost surely; then the above considerations yield $P(\forall m \ge 1 : N_m < \infty) = 1$, i.e. $P(N_k < \infty) \ge P(\forall m \ge 1 : N_m < \infty) = 1$ for all $k \ge 1$. Now fix $\varepsilon > 0$ and choose $m \ge 1$ such that $1/m < \varepsilon$. Then clearly $|X_n - X| > \varepsilon > 1/m$ implies $N_m > n$, so that

$P(|X_n - X| > \varepsilon) \le P(N_m > n) \to P(N_m = \infty) = 0$, as $n \to \infty$,

for any $\varepsilon > 0$. Therefore $X_n \to X$ in probability. $\square$
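The gap between convergence in probability and almost sure convergence can be explored by simulation. A standard illustrative choice (a sketch, with arbitrary parameters) is $X_n = 1_{A_n}$ for independent events $A_n$ with $P(A_n) = 1/n$: then $X_n \to 0$ in probability, but by the second Borel–Cantelli lemma of Section 9.4 below $X_n = 1$ infinitely often almost surely, so $X_n$ does not converge almost surely. On a single simulated path, the last index at which $X_n = 1$ keeps growing with the horizon, hinting that the ones never stop.

```python
# Sketch: one path of X_n = 1 with probability 1/n (independently).
# X_n -> 0 in probability, but ones occur at ever larger indices.
import numpy as np

rng = np.random.default_rng(1)
N = 10**6
n = np.arange(1, N + 1)
x = rng.random(N) < 1.0 / n                    # X_n = 1 with probability 1/n
ones = n[x]                                    # indices where X_n = 1
for horizon in [10**3, 10**4, 10**5, 10**6]:
    hits = ones[ones <= horizon]
    last_one = int(hits[-1]) if hits.size else 0
    print(f"up to n = {horizon:8d}: last index with X_n = 1 is {last_one}")
```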

9.2 The first Borel–Cantelli lemma

Let us now work on a sample space $\Omega$. It is safe to think of $\Omega = \mathbb{R}^{\mathbb{N}} \times \mathbb{R}$ and $\omega \in \Omega$ as $\omega = ((x_n)_{n \ge 1}, x)$, the set of possible outcomes for an infinite family of random variables (and a limiting variable). The Borel–Cantelli lemmas are useful to prove almost sure results. In particular, limiting results often require certain events to happen infinitely often (i.o.) or only a finite number of times. Logically, this can be expressed as follows. Consider events $A_n \subseteq \Omega$, $n \ge 1$. Then

$\omega \in \{A_n \text{ i.o.}\} \iff \forall n \ge 1\ \exists m \ge n : \omega \in A_m \iff \omega \in \bigcap_{n \ge 1} \bigcup_{m \ge n} A_m$.

Lemma 61 (Borel–Cantelli, first lemma). Let $A = \bigcap_{n \ge 1} \bigcup_{m \ge n} A_m$ be the event that infinitely many of the events $A_n$ occur. Then

$\sum_{n \ge 1} P(A_n) < \infty \implies P(A) = 0$.

Proof: We have that $A \subseteq \bigcup_{m \ge n} A_m$ for all $n \ge 1$, and so

$P(A) \le P\Big(\bigcup_{m \ge n} A_m\Big) \le \sum_{m \ge n} P(A_m) \to 0$ as $n \to \infty$,

whenever $\sum_{n \ge 1} P(A_n) < \infty$. $\square$

9.3 The Strong Law of Large Numbers

Theorem 62. Let $(X_n)_{n \ge 1}$ be a sequence of independent and identically distributed (iid) random variables with $E(X_1^4) < \infty$ and $E(X_1) = \mu$. Then

$\dfrac{S_n}{n} := \dfrac{1}{n} \sum_{i=1}^n X_i \to \mu$ almost surely.

Fact 63. Theorem 62 remains valid without the assumption $E(X_1^4) < \infty$, just assuming $E(|X_1|) < \infty$.

The proof of the general result is hard, but under the extra moment condition $E(X_1^4) < \infty$ there is a nice proof.

Lemma 64. In the situation of Theorem 62, there is a constant $K < \infty$ such that for all $n \ge 1$

$E((S_n - n\mu)^4) \le K n^2$.

Proof: Let $Z_k = X_k - \mu$ and $T_n = Z_1 + \ldots + Z_n = S_n - n\mu$. Then

$E(T_n^4) = E\Big(\Big(\sum_{i=1}^n Z_i\Big)^4\Big) = n E(Z_1^4) + 3n(n-1) E(Z_1^2 Z_2^2) \le K n^2$

by expanding the fourth power and noting that most terms vanish, such as $E(Z_1 Z_2^3) = E(Z_1) E(Z_2^3) = 0$. Here $K$ was chosen appropriately, say $K = 4 \max\{E(Z_1^4), (E(Z_1^2))^2\}$. $\square$

Proof of Theorem 62: By the lemma,

$E\Big(\Big(\dfrac{S_n}{n} - \mu\Big)^4\Big) \le K n^{-2}$.

Now, by Tonelli's theorem,

$E\Big(\sum_{n \ge 1} \Big(\dfrac{S_n}{n} - \mu\Big)^4\Big) = \sum_{n \ge 1} E\Big(\Big(\dfrac{S_n}{n} - \mu\Big)^4\Big) < \infty$.

But if a series converges, the underlying sequence converges to zero, and so

$\sum_{n \ge 1} \Big(\dfrac{S_n}{n} - \mu\Big)^4 < \infty$ a.s. $\implies \Big(\dfrac{S_n}{n} - \mu\Big)^4 \to 0$ almost surely $\implies \dfrac{S_n}{n} \to \mu$ almost surely. $\square$

This proof did not use the Borel–Cantelli lemma, but we can also conclude by the Borel–Cantelli lemma:

Proof of Theorem 62: We know by Markov's inequality that

$P\Big(\Big|\dfrac{S_n}{n} - \mu\Big| \ge n^{-\gamma}\Big) \le \dfrac{E((S_n/n - \mu)^4)}{n^{-4\gamma}} \le K n^{-2+4\gamma}$.

Define, for $\gamma \in (0, 1/4)$,

$A_n = \Big\{\Big|\dfrac{S_n}{n} - \mu\Big| \ge n^{-\gamma}\Big\}$;

then $\sum_{n \ge 1} P(A_n) < \infty$, so $P(A) = 0$ by the first Borel–Cantelli lemma, where $A = \bigcap_{n \ge 1} \bigcup_{m \ge n} A_m$. But now, the event $A^c$ happens if and only if

$\exists N \ge 1\ \forall n \ge N : \Big|\dfrac{S_n}{n} - \mu\Big| < n^{-\gamma}$,

and this implies $\dfrac{S_n}{n} \to \mu$. $\square$
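A quick simulation sketch of Theorem 62, assuming iid $\mathrm{Exp}(2)$ variables (an arbitrary choice with all moments finite, so the theorem applies with $\mu = 1/2$): each simulated path of running means $S_n/n$ settles near $\mu$.

```python
# Sketch: running means S_n/n of iid Exponential(rate=2) variables (mu = 0.5).
import numpy as np

rng = np.random.default_rng(2)
mu, n_max = 0.5, 100_000

for path in range(3):
    x = rng.exponential(scale=mu, size=n_max)               # iid Exp(2) sample
    running_mean = np.cumsum(x) / np.arange(1, n_max + 1)    # S_n / n
    checkpoints = {k: round(float(running_mean[k - 1]), 4) for k in (10**3, 10**4, 10**5)}
    print(f"path {path}: S_n/n at selected n = {checkpoints}")
```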

9.4 The second Borel–Cantelli lemma

We won't need the second Borel–Cantelli lemma in this course, but we include it for completeness.

Lemma 65 (Borel–Cantelli, second lemma). Let $A = \bigcap_{n \ge 1} \bigcup_{m \ge n} A_m$ be the event that infinitely many of the events $A_n$ occur. Then

$\sum_{n \ge 1} P(A_n) = \infty$ and $(A_n)_{n \ge 1}$ independent $\implies P(A) = 1$.

Proof: The conclusion is equivalent to $P(A^c) = 0$. By de Morgan's laws,

$A^c = \bigcup_{n \ge 1} \bigcap_{m \ge n} A_m^c$.

However,

$P\Big(\bigcap_{m \ge n} A_m^c\Big) = \lim_{r \to \infty} P\Big(\bigcap_{m=n}^r A_m^c\Big) = \prod_{m \ge n} (1 - P(A_m)) \le \prod_{m \ge n} \exp(-P(A_m)) = \exp\Big(-\sum_{m \ge n} P(A_m)\Big) = 0$

whenever $\sum_{n \ge 1} P(A_n) = \infty$. Thus $P(A^c) = \lim_{n \to \infty} P\big(\bigcap_{m \ge n} A_m^c\big) = 0$. $\square$

As a technical detail: to justify some of the limiting probabilities, we use continuity of $P$ along increasing and decreasing sequences of events, which follows from the sigma-additivity of $P$, cf. Grimmett–Stirzaker, Lemma 1.3.(5).

9.5 Examples

Example 66 (Arrival times in the Poisson process). A Poisson process has independent and identically distributed inter-arrival times $(Z_n)_{n \ge 0}$ with $Z_n \sim \mathrm{Exp}(\lambda)$. We denoted the partial sums (arrival times) by $T_n = Z_0 + \ldots + Z_{n-1}$. The Strong Law of Large Numbers yields

$\dfrac{T_n}{n} \to \dfrac{1}{\lambda}$ almost surely, as $n \to \infty$.
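A quick numerical check of Example 66, assuming an arbitrary rate $\lambda = 3$: the simulated arrival times satisfy $T_n/n \approx 1/\lambda$ for large $n$.

```python
# Sketch: T_n / n for the arrival times of a rate-3 Poisson process.
import numpy as np

rng = np.random.default_rng(4)
lam, n = 3.0, 10**6
t = np.cumsum(rng.exponential(scale=1.0 / lam, size=n))   # arrival times T_1, ..., T_n
for k in (10**3, 10**4, 10**5, 10**6):
    print(f"T_n/n at n = {k:8d}: {t[k - 1] / k:.5f}   (1/lambda = {1/lam:.5f})")
```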

Example 67 (Return times of Markov chains). For a positive-recurrent discrete-time Markov chain $(M_n)_{n \ge 0}$ we denoted by

$N_i = N_i^{(1)} = \inf\{n > 0 : M_n = i\}, \qquad N_i^{(m+1)} = \inf\{n > N_i^{(m)} : M_n = i\}, \quad m \in \mathbb{N},$

the successive return times to $i$. By the strong Markov property, the random variables $N_i^{(m+1)} - N_i^{(m)}$, $m \ge 1$, are independent and identically distributed. If we define $N_i^{(0)} = 0$ and start from $i$, then this holds for $m \ge 0$. The Strong Law of Large Numbers yields

$\dfrac{N_i^{(m)}}{m} \to E_i(N_i)$ almost surely, as $m \to \infty$.

Similarly, in continuous time, for

$H_i = H_i^{(1)} = \inf\{t \ge T_1 : X_t = i\}, \qquad H_i^{(m)} = T_{N_i^{(m)}}, \quad m \in \mathbb{N},$

we get

$\dfrac{H_i^{(m)}}{m} \to E_i(H_i) = m_i$ almost surely, as $m \to \infty$.

Example 68 (Empirical distributions). If $(Y_n)_{n \ge 1}$ is an infinite sample (independent and identically distributed random variables) from a discrete distribution $\nu$ on $S$, then the random variables $B_n^{(i)} = 1_{\{Y_n = i\}}$, $n \ge 1$, are also independent and identically distributed for each fixed $i \in S$, as functions of independent variables. The Strong Law of Large Numbers yields

$\nu_i^{(n)} = \dfrac{\#\{k = 1, \ldots, n : Y_k = i\}}{n} = \dfrac{B_1^{(i)} + \ldots + B_n^{(i)}}{n} \to E(B_1^{(i)}) = P(Y_1 = i) = \nu_i$

almost surely, as $n \to \infty$. The probability mass function $\nu^{(n)}$ is called the empirical distribution. It lists relative frequencies in the sample and, for a specific realisation, can serve as an approximation of the true distribution. In applications in statistics, it is the sample distribution associated with a population distribution. The result that empirical distributions converge to the true distribution is true uniformly in $i$ and in higher generality; it is usually referred to as the Glivenko–Cantelli theorem.

Remark 69 (Discrete ergodic theorem). If $(M_n)_{n \ge 0}$ is a positive-recurrent discrete-time Markov chain, the Ergodic Theorem is a statement very similar to the example of empirical distributions:

$\dfrac{\#\{k = 0, \ldots, n-1 : M_k = i\}}{n} \to P_\eta(M_0 = i) = \eta_i$ almost surely, as $n \to \infty$,

for a stationary distribution $\eta$, but of course the $M_n$, $n \ge 0$, are not independent (in general). Therefore, we need to work a bit harder to deduce the Ergodic Theorem from the Strong Law of Large Numbers.
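A small sketch of Example 68, assuming an arbitrary distribution $\nu$ on $S = \{0, 1, 2\}$: the empirical distribution of a simulated iid sample approaches $\nu$ as the sample size grows.

```python
# Sketch: empirical distribution of an iid sample from nu = (0.2, 0.5, 0.3).
import numpy as np

rng = np.random.default_rng(5)
nu = np.array([0.2, 0.5, 0.3])                  # true distribution on {0, 1, 2}

for n in [100, 10_000, 1_000_000]:
    sample = rng.choice(len(nu), size=n, p=nu)
    empirical = np.bincount(sample, minlength=len(nu)) / n
    print(f"n = {n:8d}: empirical = {np.round(empirical, 4)}, true = {nu}")
```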

Lecture 10: Renewal processes and equations

10.1 Motivation and definition

Reading: Grimmett–Stirzaker 10.1–10.2; Ross 7.1–7.3

So far, the topic has been continuous-time Markov chains, and we've introduced them as discrete-time Markov chains with exponential holding times. In this setting we have a theory very much similar to the discrete-time theory, with independence of future and past given the present (Markov property), transition probabilities, invariant distributions, class structure, convergence to equilibrium, ergodic theorem, time reversal, detailed balance etc. A few odd features can occur, mainly due to explosion. These parallels are due to the exponential holding times and their lack-of-memory property, which is the key to the Markov property in continuous time. In practice, this assumption is often not reasonable.

Example 70. Suppose that you count the changes of batteries for an electrical device. Given that the battery has been in use for time $t$, is its residual lifetime distributed as its total lifetime? We would assume this if we were modelling with a Poisson process. We may wish to replace the exponential distribution by other distributions, e.g. one that cannot take arbitrarily large values or, for other applications, one that can produce clustering effects (many short holding times separated by significantly longer ones).

We started the discussion of continuous-time Markov chains with birth processes as generalised Poisson processes. Similarly, we start here by generalising the Poisson process to have non-exponential but independent identically distributed inter-arrival times.

Definition 71. Let $(Z_n)_{n \ge 0}$ be a sequence of independent identically distributed positive random variables and $T_n = \sum_{k=0}^{n-1} Z_k$, $n \ge 1$, the partial sums. Then the process $X = (X_t)_{t \ge 0}$ defined by $X_t = \#\{n \ge 1 : T_n \le t\}$ is called a renewal process. The common distribution of $Z_n$, $n \ge 0$, is called the inter-arrival distribution.
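A minimal simulation sketch of Definition 71, assuming $\mathrm{Gamma}(2,1)$ inter-arrival times (an arbitrary non-exponential choice with mean $\mu = 2$): for large $t$ the counts $X_t$ are close to $t/\mu$, anticipating the asymptotic results later in this lecture.

```python
# Sketch: X_t = #{n >= 1 : T_n <= t} for Gamma(2,1) inter-arrival times.
import numpy as np

rng = np.random.default_rng(6)

def renewal_count(t, rng, shape=2.0, scale=1.0, block=10_000):
    """Count renewals up to time t by accumulating inter-arrival times in blocks."""
    total, count = 0.0, 0
    while True:
        z = rng.gamma(shape, scale, size=block)      # iid inter-arrival times
        arrivals = total + np.cumsum(z)
        if arrivals[-1] > t:                          # time t is reached within this block
            return count + int(np.searchsorted(arrivals, t, side="right"))
        count += block
        total = arrivals[-1]

for t in [10, 100, 1000]:
    print(f"t = {t:5d}:  X_t = {renewal_count(t, rng):5d}   (t/mu = {t / 2.0})")
```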

Example 72. If $(Y_t)_{t \ge 0}$ is a continuous-time Markov chain with $Y_0 = i$, then $Z_n = H_i^{(n+1)} - H_i^{(n)}$, the times between successive returns to $i$ by $Y$, are independent and identically distributed (by the strong Markov property). The associated counting process $X_t = \#\{n \ge 1 : H_i^{(n)} \le t\}$ counting the visits to $i$ is thus a renewal process.

10.2 The renewal function

Definition 73. The function $m(t) := E(X_t)$ is called the renewal function.

It plays an important role in renewal theory. Remember that for $Z_n \sim \mathrm{Exp}(\lambda)$ we had $X_t \sim \mathrm{Poi}(\lambda t)$ and in particular $m(t) = E(X_t) = \lambda t$. To calculate the renewal function for general renewal processes, we should investigate the distribution of $X_t$. Note that, as for birth processes,

$X_t = k \iff T_k \le t < T_{k+1}$,

so that we can express

$P(X_t = k) = P(T_k \le t < T_{k+1}) = P(T_k \le t) - P(T_{k+1} \le t)$

in terms of the distributions of $T_k = Z_0 + \ldots + Z_{k-1}$, $k \ge 1$. Recall that for two independent continuous random variables $S$ and $T$ with densities $f$ and $g$, the random variable $S + T$ has density

$(f * g)(u) = \int_{-\infty}^{\infty} f(u - t) g(t)\, dt, \quad u \in \mathbb{R}$,

the convolution (product) of $f$ and $g$, and if $S \ge 0$ and $T \ge 0$, then

$(f * g)(u) = \int_0^u f(u - t) g(t)\, dt, \quad u \ge 0$.

It is not hard to check that the convolution product is symmetric, associative and distributes over sums of functions. While the first two of these properties translate as $S + T = T + S$ and $(S + T) + U = S + (T + U)$ for associated random variables, the third property has no such meaning, since sums of densities are no longer probability densities. However, the definition of the convolution product makes sense for general nonnegative integrable functions, and we will meet other relevant examples soon. We can define convolution powers $f^{*(1)} = f$ and $f^{*(k+1)} = f * f^{*(k)}$, $k \ge 1$. Then

$P(T_k \le t) = \int_0^t f_{T_k}(s)\, ds = \int_0^t f^{*(k)}(s)\, ds$,

if $Z_n$, $n \ge 0$, are continuous with density $f$.
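The convolution powers $f^{*(k)}$ can be approximated numerically. Here is a small discretisation sketch, assuming $\mathrm{Gamma}(2,1)$ inter-arrival times and an arbitrary grid: since then $T_k \sim \mathrm{Gamma}(2k, 1)$, the approximation of $P(T_k \le t) = \int_0^t f^{*(k)}(s)\,ds$ can be compared with the exact Erlang distribution function.

```python
# Sketch: discrete convolution powers of f(s) = s e^{-s} (Gamma(2,1) density),
# used to approximate P(T_k <= t) = int_0^t f^{*(k)}(s) ds.
import numpy as np
from math import exp, factorial

dt = 0.01
grid = np.arange(0.0, 60.0, dt)
f = grid * np.exp(-grid)                         # Gamma(2,1) density

def convolve_density(g, h, dt):
    """Left-point approximation of (g*h)(u) = int_0^u g(u-s) h(s) ds on the grid."""
    return np.convolve(g, h)[:len(g)] * dt

t = 10.0
i_t = int(round(t / dt))
fk = f.copy()                                    # f^{*(1)}
for k in range(1, 4):
    numeric = np.sum(fk[:i_t]) * dt              # approximates P(T_k <= t)
    # T_k ~ Gamma(2k,1), so P(T_k <= t) = 1 - e^{-t} * sum_{j<2k} t^j / j!
    exact = 1.0 - exp(-t) * sum(t**j / factorial(j) for j in range(2 * k))
    print(f"k = {k}: numeric {numeric:.4f},  exact {exact:.4f}")
    fk = convolve_density(fk, f, dt)             # build f^{*(k+1)}
```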

Proposition 74. Let $X$ be a renewal process with inter-arrival density $f$. Then $m(t) = E(X_t)$ is differentiable in the weak sense that it is the integral function of

$m'(s) := \sum_{k=1}^{\infty} f^{*(k)}(s)$.

Lemma 75. Let $X$ be an $\mathbb{N}$-valued random variable. Then $E(X) = \sum_{k \ge 1} P(X \ge k)$.

Proof: We use Tonelli's Theorem:

$\sum_{k \ge 1} P(X \ge k) = \sum_{k \ge 1} \sum_{j \ge k} P(X = j) = \sum_{j \ge 1} \sum_{k=1}^{j} P(X = j) = \sum_{j \ge 1} j P(X = j) = E(X)$. $\square$

Proof of Proposition 74: Let us integrate $\sum_{k=1}^{\infty} f^{*(k)}(s)$ using Tonelli's Theorem:

$\int_0^t \sum_{k=1}^{\infty} f^{*(k)}(s)\, ds = \sum_{k=1}^{\infty} \int_0^t f^{*(k)}(s)\, ds = \sum_{k=1}^{\infty} P(T_k \le t) = \sum_{k=1}^{\infty} P(X_t \ge k) = E(X_t) = m(t)$. $\square$

10.3 The renewal equation

For continuous-time Markov chains, conditioning on the first transition time was a powerful tool. We can do this here and get what is called the renewal equation.

Proposition 76. Let $X$ be a renewal process with inter-arrival density $f$. Then $m(t) = E(X_t)$ is the unique (locally bounded) solution of

$m(t) = F(t) + \int_0^t m(t - s) f(s)\, ds$, i.e. $m = F + f * m$,

where $F(t) = \int_0^t f(s)\, ds = P(Z_1 \le t)$.

Proof: Conditioning on the first arrival will involve the process $\widetilde{X}_u = X_{T_1 + u}$, $u \ge 0$. Note that $\widetilde{X}_0 = 1$ and that $\widetilde{X}_u - 1$ is a renewal process with inter-arrival times $\widetilde{Z}_n = Z_{n+1}$, $n \ge 0$, independent of $T_1$. Therefore

$E(X_t) = \int_0^{\infty} f(s) E(X_t \mid T_1 = s)\, ds = \int_0^t f(s) E(\widetilde{X}_{t-s})\, ds = \int_0^t f(s) (1 + m(t - s))\, ds = F(t) + \int_0^t f(s) m(t - s)\, ds$.

For uniqueness, suppose that also $l = F + f * l$; then $\alpha = l - m$ is locally bounded and satisfies $\alpha = f * \alpha = \alpha * f$. Iteration gives $\alpha = \alpha * f^{*(k)}$ for all $k \ge 1$ and summing over $k$ gives for the right-hand side something finite:

$\sum_{k \ge 1} (\alpha * f^{*(k)})(t) = \Big(\alpha * \sum_{k \ge 1} f^{*(k)}\Big)(t) = (\alpha * m')(t) = \int_0^t \alpha(t - s) m'(s)\, ds \le \Big(\sup_{u \in [0, t]} |\alpha(u)|\Big) m(t) < \infty$,

but the left-hand side is infinite unless $\alpha(t) = 0$. Therefore $l(t) = m(t)$ for all $t \ge 0$. $\square$

Example 77. We can express $m$ as follows: $m = F + F * \sum_{k \ge 1} f^{*(k)}$. Indeed, we check that $l = F + F * \sum_{k \ge 1} f^{*(k)}$ satisfies the renewal equation:

$F + f * l = F + F * f + F * \sum_{j \ge 2} f^{*(j)} = F + F * \sum_{k \ge 1} f^{*(k)} = l$,

just using properties of the convolution product. By Proposition 76, $l = m$.

Unlike Poisson processes, general renewal processes do not have a linear renewal function, but it will be asymptotically linear (Elementary Renewal Theorem, as we will see). In fact, renewal functions are in one-to-one correspondence with inter-arrival distributions; we do not prove this, but it should not be too surprising given that $m = F + f * m$ is almost symmetric in $f$ and $m$. Unlike for the Poisson process, increments of general renewal processes are neither stationary (unless we change the distribution of $Z_0$ in a clever way, as we will see) nor independent. Some of the important results in renewal theory are asymptotic results. These asymptotic results will, in particular, allow us to prove the Ergodic Theorem for Markov chains.
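The renewal equation also lends itself to direct numerical solution. Here is a small discretisation sketch, assuming $\mathrm{Gamma}(2,1)$ inter-arrival times ($\mu = 2$) and a simple left-point quadrature on an arbitrary grid: the computed $m(t)$ is compared with $t/\mu$, illustrating the asymptotically linear behaviour mentioned above.

```python
# Sketch: solve m = F + f * m on a grid by a left-point Riemann-sum recursion,
# for Gamma(2,1) inter-arrival times (mu = 2).
import numpy as np

dt = 0.01
grid = np.arange(0.0, 30.0 + dt, dt)
f = grid * np.exp(-grid)                        # density f(s) = s e^{-s}
F = 1.0 - np.exp(-grid) * (1.0 + grid)          # distribution function F(t)

m = np.zeros_like(grid)
for i in range(1, len(grid)):
    # m(t_i) ~ F(t_i) + sum_{j <= i} m(t_i - s_j) f(s_j) dt
    m[i] = F[i] + np.dot(m[i::-1], f[:i + 1]) * dt

for t in (5.0, 10.0, 20.0, 30.0):
    i = int(round(t / dt))
    print(f"t = {t:5.1f}:  m(t) ~ {m[i]:.3f},   t/mu = {t / 2.0:.3f}")
```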

10.4 Strong Law and Central Limit Theorem of renewal theory

Theorem 78 (Strong Law of renewal theory). Let $X$ be a renewal process with mean inter-arrival time $\mu \in (0, \infty)$. Then

$\dfrac{X_t}{t} \to \dfrac{1}{\mu}$ almost surely, as $t \to \infty$.

Proof: Note that $X$ is constant on $[T_n, T_{n+1})$ for all $n \ge 0$, and therefore constant on $[T_{X_t}, T_{X_t+1})$. Therefore, as soon as $X_t > 0$,

$\dfrac{T_{X_t}}{X_t} \le \dfrac{t}{X_t} < \dfrac{T_{X_t+1}}{X_t} = \dfrac{T_{X_t+1}}{X_t + 1} \cdot \dfrac{X_t + 1}{X_t}$.

Now $P(X_t \to \infty) = 1$, since $X_t \le n$ for all $t \ge 0$ would imply $T_{n+1} = \infty$, which is absurd, since $T_{n+1} = Z_0 + \ldots + Z_n$ is a finite sum of finite random variables. Therefore, we conclude from the Strong Law of Large Numbers for $T_n$ that

$\dfrac{T_{X_t}}{X_t} \to \mu$ almost surely, as $t \to \infty$.

Therefore, if $X_t \to \infty$ and $T_n/n \to \mu$, then

$\mu \leftarrow \dfrac{T_{X_t}}{X_t} \le \dfrac{t}{X_t} \le \dfrac{T_{X_t+1}}{X_t + 1} \cdot \dfrac{X_t + 1}{X_t} \to \mu$ as $t \to \infty$,

but this means $P(X_t/t \to 1/\mu) \ge P(X_t \to \infty,\ T_n/n \to \mu) = 1$, as required. $\square$

Try to do this proof for convergence in probability. The nasty $\varepsilon$ expressions are not very useful in this context, and the proof is very much harder. But we can now deduce a corresponding Weak Law of renewal theory, because almost sure convergence implies convergence in probability. We also have a Central Limit Theorem:

Theorem 79 (Central Limit Theorem of renewal theory). Let $X = (X_t)_{t \ge 0}$ be a renewal process whose inter-arrival times $(Y_n)_{n \ge 0}$ satisfy $0 < \sigma^2 = \mathrm{Var}(Y_1) < \infty$ and $\mu = E(Y_1)$. Then

$\dfrac{X_t - t/\mu}{\sqrt{t \sigma^2 / \mu^3}} \to N(0, 1)$ in distribution, as $t \to \infty$.

The proof is not difficult and is left as an exercise on Assignment 5.

10.5 The elementary renewal theorem

Theorem 80. Let $X$ be a renewal process with mean inter-arrival time $\mu$ and $m(t) = E(X_t)$. Then

$\dfrac{m(t)}{t} = \dfrac{E(X_t)}{t} \to \dfrac{1}{\mu}$ as $t \to \infty$.

Note that this does not follow easily from the Strong Law of renewal theory, since almost sure convergence does not imply convergence of means (cf. Proposition 60, see also the counterexample on Assignment 5). In fact, the proof is longer and not examinable; we start with a lemma.

Lemma 81. For a renewal process $X$ with arrival times $(T_n)_{n \ge 1}$, we have

$E(T_{X_t+1}) = \mu (m(t) + 1)$, where $m(t) = E(X_t)$, $\mu = E(T_1)$.
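A Monte Carlo sketch of Theorem 80, assuming $\mathrm{Gamma}(2,1)$ inter-arrival times ($\mu = 2$) and an arbitrary number of replications: $m(t) = E(X_t)$ is estimated by averaging simulated counts, and $m(t)/t$ is compared with $1/\mu$.

```python
# Sketch: Monte Carlo estimate of m(t)/t for Gamma(2,1) inter-arrival times.
import numpy as np

rng = np.random.default_rng(7)
reps = 20_000

for t in (10.0, 50.0, 200.0):
    n_draw = int(t) + 50                        # comfortably more draws than renewals by time t
    z = rng.gamma(2.0, 1.0, size=(reps, n_draw))
    counts = (np.cumsum(z, axis=1) <= t).sum(axis=1)   # X_t on each path
    print(f"t = {t:6.1f}:  m(t)/t ~ {counts.mean() / t:.4f}   (1/mu = 0.5)")
```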

This ought to be true, because $T_{X_t+1}$ is the sum of $X_t + 1$ inter-arrival times, each with mean $\mu$. Taking expectations, we should get $m(t) + 1$ times $\mu$. However, if we condition on $X_t$, we have to know the distribution of the residual inter-arrival time after $t$, and without the lack-of-memory property we are stuck.

Proof: Let us do a one-step analysis on the quantity of interest $g(t) = E(T_{X_t+1})$:

$g(t) = \int_0^{\infty} E(T_{X_t+1} \mid T_1 = s) f(s)\, ds = \int_0^t (s + g(t - s)) f(s)\, ds + \int_t^{\infty} s f(s)\, ds = \mu + (g * f)(t)$.

This is almost the renewal equation. In fact, $g_1(t) = g(t)/\mu - 1$ satisfies the renewal equation

$g_1(t) = \dfrac{1}{\mu} \int_0^t g(t - s) f(s)\, ds = \int_0^t (g_1(t - s) + 1) f(s)\, ds = F(t) + (g_1 * f)(t)$,

and, by Proposition 76, $g_1(t) = m(t)$, i.e. $g(t) = \mu (1 + m(t))$, as required. $\square$

Proof of Theorem 80: Clearly $t < E(T_{X_t+1}) = \mu (m(t) + 1)$ gives the lower bound

$\liminf_{t \to \infty} \dfrac{m(t)}{t} \ge \dfrac{1}{\mu}$.

For the upper bound we use a truncation argument and introduce

$\widetilde{Z}_j = Z_j \wedge a = \begin{cases} Z_j & \text{if } Z_j < a, \\ a & \text{if } Z_j \ge a, \end{cases}$

with associated renewal process $\widetilde{X}$. Since $\widetilde{Z}_j \le Z_j$ for all $j$, we have $\widetilde{X}_t \ge X_t$ for all $t$, hence $\widetilde{m}(t) \ge m(t)$. Putting things together, we get from the lemma again

$t \ge E(\widetilde{T}_{\widetilde{X}_t}) = E(\widetilde{T}_{\widetilde{X}_t+1}) - E(\widetilde{Z}_{\widetilde{X}_t+1}) = \widetilde{\mu} (\widetilde{m}(t) + 1) - E(\widetilde{Z}_{\widetilde{X}_t+1}) \ge \widetilde{\mu} (m(t) + 1) - a$.

Therefore

$\dfrac{m(t)}{t} \le \dfrac{1}{\widetilde{\mu}} + \dfrac{a - \widetilde{\mu}}{t \widetilde{\mu}}$, so that $\limsup_{t \to \infty} \dfrac{m(t)}{t} \le \dfrac{1}{\widetilde{\mu}}$.

Now $\widetilde{\mu} = E(\widetilde{Z}_1) = E(Z_1 \wedge a) \to E(Z_1) = \mu$ as $a \to \infty$ (by monotone convergence). Therefore $\limsup_{t \to \infty} m(t)/t \le 1/\mu$. $\square$

Note that truncation was necessary to get $E(\widetilde{Z}_{\widetilde{X}_t+1}) \le a$. It would have been enough if we had $E(Z_{X_t+1}) = E(Z_1) = \mu$, but this is not true. Look at the Poisson process as an example. We know that the residual lifetime already has mean $\mu = 1/\lambda$, but there is also the part of $Z_{X_t+1}$ before time $t$. We will explore this in Lecture 11 when we discuss residual lifetimes in renewal theory.
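The closing remark can be illustrated by a short simulation, assuming a rate-$1$ Poisson process and an arbitrary time horizon: the inter-arrival interval $Z_{X_t+1}$ straddling a fixed large time $t$ has mean close to $2/\lambda$ rather than $1/\lambda$, because longer intervals are more likely to cover $t$; this length-biasing is exactly why $E(Z_{X_t+1}) \ne E(Z_1)$.

```python
# Sketch: mean length of the inter-arrival interval covering time t,
# for a rate-1 Poisson process (E(Z_1) = 1, but the covering interval has mean ~ 2).
import numpy as np

rng = np.random.default_rng(8)
lam, t, reps = 1.0, 100.0, 20_000
lengths = np.empty(reps)
for r in range(reps):
    arrivals = np.cumsum(rng.exponential(1.0 / lam, size=int(3 * lam * t)))
    k = int(np.searchsorted(arrivals, t, side="right"))   # k = X_t
    left = arrivals[k - 1] if k > 0 else 0.0               # T_{X_t}
    lengths[r] = arrivals[k] - left                        # Z_{X_t+1} = T_{X_t+1} - T_{X_t}
print(f"E(Z_(X_t+1)) ~ {lengths.mean():.3f}   (E(Z_1) = {1.0 / lam:.3f})")
```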