I forgot to mention last time: in the Ito formula for two standard processes, putting
\[ dX_t = a_t\,dt + b_t\,dB_t, \qquad dY_t = \alpha_t\,dt + \beta_t\,dB_t, \]
and taking $f(x,y) = xy$, one has $f_x = y$, $f_y = x$, $f_{xx} = f_{yy} = 0$, and $f_{xy} = 1$. Thus we get the product rule
\[ X_t Y_t - X_0 Y_0 = \int_0^t Y_s\,dX_s + \int_0^t X_s\,dY_s + \int_0^t b_s \beta_s\,ds. \]
If $b_s \beta_s \equiv 0$ then we obtain the familiar product rule from ordinary calculus.

Back to quadratic variation:

Theorem 0.1. If $(X_t)$ is a standard process with the integral representation
\[ X_t = X_0 + \int_0^t a_s\,ds + \int_0^t b_s\,dB_s, \qquad 0 \le t \le T, \]
then the quadratic variation of $(X_t)$ exists and it is given by
\[ \langle X \rangle_t = \int_0^t b_s^2\,ds \qquad \text{for } 0 \le t \le T. \]

From this we can derive the quadratic covariation, as mentioned in last week's notes. If $X_t$ and $Y_t$ are standard processes with
\[ dX_t = a_t\,dt + b_t\,dB_t, \qquad dY_t = \alpha_t\,dt + \beta_t\,dB_t, \]
then, writing a partition $\pi$ of $[0,t]$ as $0 = t_0 \le t_1 \le \cdots \le t_n = t$, we have
\[ \lim_{\mu(\pi) \to 0} \sum_{i=1}^n (X_{t_i} - X_{t_{i-1}})(Y_{t_i} - Y_{t_{i-1}}) = \int_0^t b_s \beta_s\,ds \quad \text{in probability}. \]
This can be shown using a polarization trick. If we write the above sum as $Q_\pi(X_t, Y_t)$, then you can check:
\[ Q_\pi(X_t, Y_t) = \tfrac{1}{4} Q_\pi(X_t + Y_t) - \tfrac{1}{4} Q_\pi(X_t - Y_t), \]
where for a process $(Z_t)$, we have set $Q_\pi(Z_t) = \sum_{i=1}^n (Z_{t_i} - Z_{t_{i-1}})^2$. As $\mu(\pi) \to 0$, this converges in probability to
\[ \tfrac{1}{4} \int_0^t (b_s + \beta_s)^2\,ds - \tfrac{1}{4} \int_0^t (b_s - \beta_s)^2\,ds = \int_0^t b_s \beta_s\,ds. \]
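The covariation formula is easy to sanity-check numerically (this sketch is not part of the notes). Below, the constant coefficients $a = 1$, $b = 2$, $\alpha = -1$, $\beta = 3$ are arbitrary choices, so $X_t = t + 2B_t$, $Y_t = -t + 3B_t$, and the predicted limit is $\int_0^1 b\beta\,ds = 6$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 200_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)   # Brownian increments on a fine partition

# X_t = t + 2 B_t and Y_t = -t + 3 B_t, i.e. a = 1, b = 2, alpha = -1, beta = 3
dX = 1.0 * dt + 2.0 * dB
dY = -1.0 * dt + 3.0 * dB

# Q_pi(X_T, Y_T): sum of products of increments over the partition
Q = float(np.sum(dX * dY))

# Predicted quadratic covariation: int_0^1 b beta ds = 2 * 3 * 1 = 6
print(Q)
```

The drift contributions to the increment products are of order $\mu(\pi)$ and wash out, exactly as the theorem predicts.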
Proof of quadratic variation theorem. The proof will be in several steps.

Step 1. We begin with the first term and show:

Proposition 0.2. If $a_s$ is measurable, adapted, and $\int_0^T |a_s|\,ds < \infty$ a.s., then the quadratic variation of the process $A_t = \int_0^t a_s\,ds$ exists and is equal to $0$.

Proof. Let $t \in [0,T]$ and let $(\pi_n)$ be a sequence of partitions of $[0,t]$ with $\mu(\pi_n) \to 0$. Since $u \mapsto \int_0^u |a_s|\,ds$ is continuous (a.s.), it is uniformly continuous on $[0,t]$. So for each $\epsilon > 0$ there is a (random) $\delta = \delta(\omega) > 0$ such that for $u, v$ both in $[0,t]$ with $|u - v| \le \delta$, one has $\int_u^v |a_s|\,ds \le \epsilon$. Now for any $\pi = \{0 = t_0 \le \cdots \le t_n = t\}$ with $\mu(\pi) \le \delta$, we have
\[ \sum_{i=1}^n (A_{t_i} - A_{t_{i-1}})^2 \le \max_{1 \le i \le n} \left| \int_{t_{i-1}}^{t_i} a_s\,ds \right| \sum_{i=1}^n \left| \int_{t_{i-1}}^{t_i} a_s\,ds \right| \le \epsilon \int_0^t |a_s|\,ds. \]
As $\epsilon$ was arbitrary, this means that $Q_{\pi_n}(A_t) \to 0$ a.s. for each fixed $t$, so we certainly have convergence in probability.

Step 2. Preliminary bounds on quadratic variation for bounded martingales. Here we give some bounds that do not directly imply convergence, but will be helpful.

Lemma 0.3. If $(Z_s)_{s \in [0,t]}$ is a continuous martingale such that $|Z_s| \le B$ for all $s$ a.s., then for some $C > 0$,
\[ \sup_\pi E\,Q_\pi^2(Z_t) \le C\,E(Z_t - Z_0)^2 < \infty \]
and
\[ \lim_{\mu(\pi) \to 0} E \sum_{i=1}^n (Z_{t_i} - Z_{t_{i-1}})^4 = 0. \]
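Proposition 0.2 can be illustrated numerically (a sketch, not from the notes): for a bounded-variation path such as $A_t = \int_0^t \cos(s)\,ds = \sin(t)$ (an arbitrary choice of $a_s$), the sum of squared increments on a uniform partition shrinks like the mesh:

```python
import numpy as np

# Pathwise sum of squared increments of A_t = int_0^t cos(s) ds = sin(t)
# over a uniform n-point partition of [0, T].
def qv(n, T=1.0):
    t = np.linspace(0.0, T, n + 1)
    A = np.sin(t)
    return float(np.sum(np.diff(A) ** 2))

for n in (10, 100, 1000):
    print(n, qv(n))   # decreases roughly like 1/n
```

Each squared increment is of order $(\Delta t)^2$ and there are $n \sim 1/\Delta t$ of them, which is exactly the mechanism in the proof.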
Proof. Write
\[ Q_\pi^2(Z_t) = \sum_{i=1}^n (Z_{t_i} - Z_{t_{i-1}})^4 + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^n (Z_{t_i} - Z_{t_{i-1}})^2 (Z_{t_j} - Z_{t_{j-1}})^2. \]
When we take the expectation of this, the second term is handled by conditioning on $\mathcal F_{t_i}$. Note that if we write $\Delta_j = Z_{t_j} - Z_{t_{j-1}}$, then the $\Delta_j$'s are uncorrelated given $\mathcal F_{t_i}$ (if $j \ge i+1$); this is a version of the familiar statement that martingale increments are uncorrelated, and so
\[ E\big[ (Z_t - Z_{t_i})^2 \mid \mathcal F_{t_i} \big] = E\Big[ \Big( \sum_{j=i+1}^n \Delta_j \Big)^2 \Big| \mathcal F_{t_i} \Big] = \sum_{j=i+1}^n E\big[ (Z_{t_j} - Z_{t_{j-1}})^2 \mid \mathcal F_{t_i} \big] \le 4B^2. \]
Therefore
\[ E\,Q_\pi^2(Z_t) \le 4B^2 \sum_{i=1}^n E(Z_{t_i} - Z_{t_{i-1}})^2 + 8B^2\,E \sum_{i=1}^{n-1} (Z_{t_i} - Z_{t_{i-1}})^2 \le 12B^2\,E(Z_t - Z_0)^2. \]
This establishes the first inequality (with $C = 12B^2$). For the second, we again use the modulus of continuity:
\[ \omega(\epsilon) = \sup\{ |Z_u - Z_v| : u, v \le t \text{ and } |u - v| \le \epsilon \}, \]
which is bounded by $2B$ and satisfies $\lim_{\epsilon \to 0} \omega(\epsilon) = 0$ (as $Z_s$ is uniformly continuous). Thus
\[ E \sum_{i=1}^n (Z_{t_i} - Z_{t_{i-1}})^4 \le E\big[ \omega^2(\mu(\pi))\,Q_\pi(Z_t) \big] \le \| \omega^2(\mu(\pi)) \|_2\, \| Q_\pi(Z_t) \|_2. \]
By the first part, $\|Q_\pi(Z_t)\|_2$ is bounded uniformly in $\pi$. Also the DCT implies that $\|\omega^2(\mu(\pi))\|_2 \to 0$ as $\mu(\pi) \to 0$.

Step 3. Proof in the bounded case.

Proposition 0.4. If $b_s$ is measurable, adapted, and satisfies $\int_0^T b_s^2\,ds \le C$ a.s., then the quadratic variation of $Z_t = \int_0^t b_s\,dB_s$ (for $0 \le t \le T$) exists and equals $\int_0^t b_s^2\,ds$ a.s. for each fixed $t \in [0,T]$.

Proof. The proof for BM consisted of analyzing
\[ Q_\pi(B_t) - t = \sum_{i=1}^n \big[ (B_{t_i} - B_{t_{i-1}})^2 - (t_i - t_{i-1}) \big], \]
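The second conclusion of Lemma 0.3 can be seen numerically (a rough sketch, not from the notes; it uses plain BM on $[0,1]$ as the martingale rather than a bounded one). For BM on a uniform $n$-point partition, $E \sum_i (B_{t_i} - B_{t_{i-1}})^4 = 3T^2/n$, which vanishes with the mesh:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 1.0

# Monte Carlo estimate of E sum_i (B_{t_i} - B_{t_{i-1}})^4 on a uniform
# n-point partition of [0, T]; for BM this expectation is 3 T^2 / n.
def fourth_power_sum(n, reps=200):
    dB = rng.normal(0.0, np.sqrt(T / n), size=(reps, n))
    return float(np.mean(np.sum(dB ** 4, axis=1)))

for n in (10, 100, 1000):
    print(n, fourth_power_sum(n))   # shrinks as the mesh goes to 0
```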
and showing that this quantity, in addition to being mean zero, has second moment going to zero, so it converges to zero in probability. Taking the same approach, put
\[ A_\pi = Q_\pi(Z_t) - \int_0^t b_s^2\,ds = \sum_{i=1}^n \left[ (Z_{t_i} - Z_{t_{i-1}})^2 - \int_{t_{i-1}}^{t_i} b_s^2\,ds \right]. \]
Writing $\Delta_i$ for the $i$-th summand on the right, you can check that once again the $\Delta_i$'s are mean zero and uncorrelated (you have to use the conditional Ito isometry here), and so
\[ E A_\pi^2 = \sum_{i=1}^n E \Delta_i^2. \]
Therefore, using $(x+y)^2 \le 2x^2 + 2y^2$ and the last step,
\[ \limsup_{\mu(\pi) \to 0} E A_\pi^2 \le \limsup_{\mu(\pi) \to 0} \left[ 2 \sum_{i=1}^n E(Z_{t_i} - Z_{t_{i-1}})^4 + 2 \sum_{i=1}^n E\left( \int_{t_{i-1}}^{t_i} b_s^2\,ds \right)^2 \right] = \limsup_{\mu(\pi) \to 0}\, 2 \sum_{i=1}^n E\left( \int_{t_{i-1}}^{t_i} b_s^2\,ds \right)^2. \]
The proof that this last term is $0$ is then identical to that given in Step 1. Indeed, we take the usual modulus of continuity
\[ \omega(\delta) = \sup\left\{ \int_u^v b_s^2\,ds : 0 \le u \le v \le T \text{ and } |u - v| \le \delta \right\}. \]
Then
\[ \sum_{i=1}^n \left( \int_{t_{i-1}}^{t_i} b_s^2\,ds \right)^2 \le \omega(\mu(\pi)) \int_0^t b_s^2\,ds \le \omega(\mu(\pi))\,C. \]
Since $\omega(\delta)$ is bounded and approaches $0$ as $\delta \to 0$, the BCT implies the expectation of this converges to $0$. Since we have shown that $E A_\pi^2 \to 0$, $A_\pi \to 0$ in $L^2$ and thus in probability. This means the quadratic variation of $Z_t$ is $\int_0^t b_s^2\,ds$.

Step 4. Localizing to get the general result. We define a stopping time $\tau_M$ as the minimal value of $u$ such that $u = T$ or any of the following three holds:
\[ \int_0^u |a_s|\,ds \ge M, \qquad \int_0^u b_s^2\,ds \ge M, \qquad \left| \int_0^u b_s\,dB_s \right| \ge M. \]
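Here is a quick numerical check of Proposition 0.4 (not part of the notes), with the arbitrary deterministic choice $b_s = s$, for which the predicted quadratic variation at $T = 1$ is $\int_0^1 s^2\,ds = 1/3$:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 100_000
dt = T / n
s = np.linspace(0.0, T, n, endpoint=False)      # left partition endpoints
dB = rng.normal(0.0, np.sqrt(dt), n)

dZ = s * dB                  # increments of Z_t = int_0^t s dB_s, i.e. b_s = s
Q = float(np.sum(dZ ** 2))   # Q_pi(Z_T) on this fine partition

print(Q)   # close to int_0^1 s^2 ds = 1/3
```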
This is a stopping time, since all three processes above are continuous and adapted. Then
\[ P\left( \left| Q_\pi(X_t) - \int_0^t b_s^2\,ds \right| \ge \epsilon \right) \le P\left( \left| Q_\pi(X_t) - \int_0^t b_s^2\,ds \right| \ge \epsilon,\ t \le \tau_M \right) + P(t > \tau_M) \]
\[ = P\left( \left| Q_\pi(X_{t \wedge \tau_M}) - \int_0^t b_s^2 1_{\{s \le \tau_M\}}\,ds \right| \ge \epsilon,\ t \le \tau_M \right) + P(t > \tau_M) \le P\left( \left| Q_\pi(X_{t \wedge \tau_M}) - \int_0^t b_s^2 1_{\{s \le \tau_M\}}\,ds \right| \ge \epsilon \right) + P(t > \tau_M). \]
Once again, if $M$ is large, then the conditions of $X_t$ being a standard process imply that $\tau_M = T$ with probability close to $1$, so the second probability can be made as small as we want. Therefore we must show that
\[ Q_\pi(X_{t \wedge \tau_M}) - \int_0^t b_s^2 1_{\{s \le \tau_M\}}\,ds \to 0 \quad \text{in probability.} \tag{1} \]
To compute this quadratic variation, we just use the tools we have developed so far. First, write
\[ X_{t \wedge \tau_M} = X_0 + \int_0^t \tilde a_s\,ds + \int_0^t \tilde b_s\,dB_s, \qquad \text{where } \tilde a_s = a_s 1_{\{s \le \tau_M\}} \text{ and } \tilde b_s = b_s 1_{\{s \le \tau_M\}}. \]
(This is by the corollary to the persistence of identity result.) Write $A_t$ and $C_t$ for these two integrals. Since $\tilde a_s$ satisfies the conditions of Step 1, $Q_\pi(A_t) \to 0$ in probability as $\mu(\pi) \to 0$. Furthermore, $\tilde b_s$ satisfies the conditions of Step 3, so
\[ Q_\pi(C_t) \to \int_0^t \tilde b_s^2\,ds = \int_0^t b_s^2 1_{\{s \le \tau_M\}}\,ds \quad \text{in probability.} \]
Last,
\[ Q_\pi(X_{t \wedge \tau_M}) = Q_\pi(A_t) + Q_\pi(C_t) + 2 \sum_{i=1}^n \left( \int_{t_{i-1}}^{t_i} \tilde a_s\,ds \right) \left( \int_{t_{i-1}}^{t_i} \tilde b_s\,dB_s \right), \]
and the last term by Cauchy-Schwarz is bounded by $2 Q_\pi^{1/2}(A_t)\,Q_\pi^{1/2}(C_t)$, which converges to $0$. Therefore
\[ Q_\pi(X_{t \wedge \tau_M}) - \int_0^t b_s^2 1_{\{s \le \tau_M\}}\,ds \to 0 \quad \text{in probability.} \]

1 Stochastic differential equations

Here we will talk just a little bit about SDEs. A simple SDE will be of the form
\[ dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dB_t \quad \text{with } X_0 = 0. \]
The term $\mu(t, X_t)$ is the drift term and the term $\sigma(t, X_t)$ is the diffusion term. This is because in a small amount of time $\Delta t$, $\mu(t, X_t)$ gives a deterministic push, or drift, to the process of size $\mu(t, X_t)\,\Delta t$, and $\sigma(t, X_t)$ gives a (random) increment of BM of variance roughly $\sigma^2(t, X_t)\,\Delta t$. There are many issues related to SDEs and we will only touch on very few of them.
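The drift/diffusion reading of the SDE suggests the standard Euler-Maruyama discretization (a numerical scheme not covered in the notes): step by $\mu\,\Delta t$ plus an $N(0, \sigma^2 \Delta t)$ kick. A minimal sketch, with hypothetical linear coefficients chosen only for illustration:

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, T, n, rng):
    """One approximate sample path of dX = mu(t, X) dt + sigma(t, X) dB."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    t = 0.0
    for i in range(n):
        dB = rng.normal(0.0, np.sqrt(dt))          # N(0, dt) Brownian kick
        x[i + 1] = x[i] + mu(t, x[i]) * dt + sigma(t, x[i]) * dB
        t += dt
    return x

# Hypothetical coefficients mu(t,x) = 0.5 x, sigma(t,x) = 0.2 x, started at 1
rng = np.random.default_rng(3)
path = euler_maruyama(lambda t, x: 0.5 * x, lambda t, x: 0.2 * x, 1.0, 1.0, 1000, rng)
print(path[0], path[-1])
```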
1.1 Examples

Geometric BM. First, let's see some simple derivations, before we address existence and uniqueness of solutions. Take a very simple SDE
\[ dX_t = \mu X_t\,dt + \sigma X_t\,dB_t \quad \text{with } X_0 = x_0 > 0. \]
In this equation, when $\mu > 0$, whenever $X_t > 0$ the drift pushes us up to be more positive, and whenever $X_t < 0$ the drift pushes us down to be more negative. In short, we are pushed away from the origin. This would give us exponential growth if there were no diffusion term, which gives us random pushes.

We first guess a solution of the type $X_t = f(t, B_t)$ and apply Ito to get
\[ dX_t = \left( f_t(t, B_t) + \tfrac{1}{2} f_{xx}(t, B_t) \right) dt + f_x(t, B_t)\,dB_t. \]
If we equate the coefficients, we obtain the equations
\[ \mu f(t, x) = f_t(t, x) + \tfrac{1}{2} f_{xx}(t, x), \qquad \sigma f(t, x) = f_x(t, x). \]
For a fixed $t$, the second equation is of the type $f' = \sigma f$, so its general solution is $f(t, x) = C_t e^{\sigma x}$, or rewritten as $f(t, x) = \exp(\sigma x + g(t))$. Plugging this into the first one, we obtain
\[ \mu \exp(\sigma x + g(t)) = g'(t) \exp(\sigma x + g(t)) + \tfrac{\sigma^2}{2} \exp(\sigma x + g(t)), \]
or $\mu = g'(t) + \tfrac{\sigma^2}{2}$, or $g(t) = (\mu - \tfrac{\sigma^2}{2}) t + C$. In other words, one has $f(t, x) = C \exp(\sigma x + (\mu - \tfrac{\sigma^2}{2}) t)$. Noting that $C = f(0, 0) = x_0$, we obtain
\[ X_t = x_0 \exp\left( \sigma B_t + \Big( \mu - \frac{\sigma^2}{2} \Big) t \right). \]
This process is called geometric BM. A couple of notes:

1. If $\mu = 0$ then we obtain the familiar exponential martingale.
2. For any $t$, one has
\[ E X_t = x_0 \exp\Big( \big( \mu - \tfrac{\sigma^2}{2} \big) t \Big)\,E e^{\sigma B_t} = x_0 e^{\mu t}, \]
which goes to infinity exponentially. However, if we rewrite $X_t$ as
\[ X_t = x_0 \exp\left( t \left( \sigma \frac{B_t}{t} + \mu - \frac{\sigma^2}{2} \right) \right), \]
then we see that, since $B_t / t \to 0$ a.s.,
\[ X_t \to 0 \text{ a.s. if } \mu < \tfrac{\sigma^2}{2}, \qquad X_t \to \infty \text{ a.s. if } \mu > \tfrac{\sigma^2}{2}, \qquad X_t \text{ is recurrent on } (0, \infty) \text{ if } \mu = \tfrac{\sigma^2}{2} \text{ and } x_0 > 0. \]
In the first case, the drift term is not large enough to overcome the diffusion term.

Ornstein-Uhlenbeck (OU) process. Now consider, for $\alpha, \sigma > 0$,
\[ dX_t = -\alpha X_t\,dt + \sigma\,dB_t, \qquad X_0 = x_0. \]
In this case, the drift term pushes us toward $0$, but unlike in geometric BM, where the process in some cases converges to $0$, the size of the diffusion is independent of $X_t$ (it is always $\sigma$). So we would expect the diffusion term to kick us away from $0$ often, but not for very long, and then we are pulled back to zero.

Steele says it is reasonable to think that $X_t$ is a Gaussian process, and since we have considered $\int_0^t f(s)\,dB_s$ for $f$ deterministic as Gaussian processes, it is reasonable to try to solve it with such processes. His proposed solution type is
\[ X_t = a(t) \left[ x_0 + \int_0^t b(s)\,dB_s \right], \]
where $a, b$ are differentiable, deterministic functions. This is a Gaussian process, and is written as $X_t = f(t, Y_t)$, where $f(t, x) = a(t) x_0 + a(t) x$ and $dY_t = b(t)\,dB_t$. Then
\[ dX_t = df(t, Y_t) = f_t(t, Y_t)\,dt + f_x(t, Y_t)\,dY_t + \tfrac{1}{2} f_{xx}(t, Y_t)\,(dY_t)^2. \]
Since $f_t = a'(t) x_0 + a'(t) x$, $f_x = a(t)$, and $f_{xx} = 0$, we obtain
\[ dX_t = \big( a'(t) x_0 + a'(t) Y_t \big)\,dt + a(t)\,dY_t = \left( a'(t) x_0 + a'(t) \int_0^t b(s)\,dB_s \right) dt + a(t) b(t)\,dB_t = \frac{a'(t)}{a(t)} X_t\,dt + a(t) b(t)\,dB_t \quad \text{with } X_0 = x_0,
\]
so long as $a(0) = 1$ and $a(t) > 0$ for all $t$. For OU, we want
\[ \frac{a'(t)}{a(t)} = -\alpha \qquad \text{and} \qquad a(t) b(t) = \sigma. \]
Therefore $a(t) = C e^{-\alpha t}$, and with $a(0) = 1$, we obtain $a(t) = e^{-\alpha t}$. The second equation becomes $b(t) = \sigma e^{\alpha t}$. Therefore we obtain a solution for OU of
\[ X_t = e^{-\alpha t} \left[ x_0 + \sigma \int_0^t e^{\alpha s}\,dB_s \right] = x_0 e^{-\alpha t} + \sigma \int_0^t e^{-\alpha(t-s)}\,dB_s. \]
Note:

1. For any $t$, $E X_t = x_0 e^{-\alpha t}$.

2. For any $t$,
\[ \operatorname{Var} X_t = \sigma^2 \int_0^t e^{-2\alpha(t-s)}\,ds = \frac{\sigma^2}{2\alpha} \big[ 1 - e^{-2\alpha t} \big]. \]

3. For any $t$, $X_t$ is Gaussian (since it is a deterministic function plus an integral of a deterministic function), whose mean converges to $0$ and variance converges to $\frac{\sigma^2}{2\alpha}$. Thus $X_t$ converges in distribution to a mean zero Gaussian with this variance. In fact, if we start not at a fixed $x_0$, but at a random $X_0$ distributed with this Gaussian distribution, then the process will always have this same distribution (it is invariant in time).

Brownian bridge. One definition of the BB is $X_t = B_t - t B_1$ for $t \in [0, 1]$. Graphically, we run BM until time 1, look at its endpoint, draw a straight line from $0$ to this endpoint, and subtract the line from the graph of BM. This forces the BM to hit $0$ at time 1. A different characterization is
\[ dX_t = -\frac{X_t}{1 - t}\,dt + dB_t \quad \text{with } X_0 = 0. \]
Here we have random drift back to $0$ with magnitude $1/(1-t)$ times the current value. This is reasonable since for the BB, we need to get from the current value $X_t$ back to zero in time $1 - t$. Using the same representation for solutions as in the OU case, we obtain
\[ \frac{a'(t)}{a(t)} = -\frac{1}{1 - t} \qquad \text{and} \qquad a(t) b(t) = 1. \]
Thus $a(t) = 1 - t$ and $b(t) = 1/(1-t)$. This means
\[ X_t = (1 - t) \int_0^t \frac{1}{1 - s}\,dB_s \quad \text{for } t \in [0, 1) \]
is a solution. This is an integral of a deterministic function so again is Gaussian. Also for each $t \in [0, 1)$, the integrand is in $\mathcal H^2[0, t]$, so $E X_t = 0$. The covariance (as we computed before for such integrals) is, for $s \le t < 1$,
\[ \operatorname{Cov}(X_t, X_s) = (1 - t)(1 - s) \int_0^s \frac{1}{(1 - u)^2}\,du = (1 - t)(1 - s) \frac{s}{1 - s} = s(1 - t). \]
You can check this is also the covariance of the BB according to the first definition.
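The covariance identity $\operatorname{Cov}(X_s, X_t) = s(1 - t)$ can be checked for the first definition $X_t = B_t - t B_1$ by Monte Carlo (a sketch, not in the notes), here with the arbitrary choice $s = 0.3$, $t = 0.7$, so the predicted value is $0.3 \cdot 0.3 = 0.09$:

```python
import numpy as np

rng = np.random.default_rng(4)
reps = 400_000
s, t = 0.3, 0.7     # arbitrary times with s <= t < 1

# Sample (B_s, B_t, B_1) from independent Gaussian increments
d1 = rng.normal(0.0, np.sqrt(s), reps)
d2 = rng.normal(0.0, np.sqrt(t - s), reps)
d3 = rng.normal(0.0, np.sqrt(1.0 - t), reps)
Bs = d1
Bt = d1 + d2
B1 = Bt + d3

# Brownian bridge via the first definition X_u = B_u - u B_1
Xs = Bs - s * B1
Xt = Bt - t * B1

cov = float(np.mean(Xs * Xt))   # E X_s = E X_t = 0, so this estimates the covariance
print(cov)   # close to s (1 - t) = 0.09
```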
1.2 Basic existence and uniqueness

As in ODE, we get existence and uniqueness under a Lipschitz condition.

Theorem 1.1. Let $T > 0$ and consider the SDE
\[ dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dB_t, \qquad X_0 = x_0. \]
Suppose $\mu$ and $\sigma$ are measurable functions that satisfy
\[ |\mu(t, x) - \mu(t, y)| + |\sigma(t, x) - \sigma(t, y)| \le K |x - y| \]
and
\[ |\mu(t, x)| + |\sigma(t, x)| \le K(1 + |x|). \]
Then there exists a continuous adapted $(X_t)_{t \in [0,T]}$ with the properties:

1. ($L^2$-bounded) $\sup_{0 \le t \le T} E X_t^2 < \infty$.

2. (Satisfies SDE) a.s., for all $t \in [0, T]$,
\[ X_t = x_0 + \int_0^t \mu(s, X_s)\,ds + \int_0^t \sigma(s, X_s)\,dB_s. \]

If $X_t$ and $Y_t$ are adapted, continuous, and satisfy 1 and 2, then a.s., $X_t = Y_t$ for all $t \in [0, T]$.

This is a strong type of existence result. There is another notion, called a weak solution, which we will not discuss here. You can see that some type of regularity condition on the coefficients is necessary by looking at ODEs. In other words, if we take $\sigma(t, x) \equiv 0$, we just get an ODE, and there are standard examples showing that if Lipschitz-type conditions on $\mu(t, x)$ are abandoned, then solutions may not exist.

Proof. Uniqueness. Suppose that $(X_t)$ and $(Y_t)$ are solutions. Then
\[ X_t - Y_t = \int_0^t \big( \mu(s, X_s) - \mu(s, Y_s) \big)\,ds + \int_0^t \big( \sigma(s, X_s) - \sigma(s, Y_s) \big)\,dB_s. \]
Now,
\[ E(X_t - Y_t)^2 \le 2 E\left( \int_0^t (\mu(s, X_s) - \mu(s, Y_s))\,ds \right)^2 + 2 E\left( \int_0^t (\sigma(s, X_s) - \sigma(s, Y_s))\,dB_s \right)^2. \]
The first term is handled with Cauchy-Schwarz:
\[ E\left( \int_0^t 1_{[0,t]}(s)\,(\mu(s, X_s) - \mu(s, Y_s))\,ds \right)^2 \le E\left[ \int_0^t ds \int_0^t (\mu(s, X_s) - \mu(s, Y_s))^2\,ds \right] = t\,E \int_0^t (\mu(s, X_s) - \mu(s, Y_s))^2\,ds. \]
The second term is equal by the Ito isometry to
\[ E \int_0^t (\sigma(s, X_s) - \sigma(s, Y_s))^2\,ds. \]
(We can apply the isometry since $X$ and $Y$ are $L^2$-bounded and $\sigma$ is Lipschitz, so the integrands are in $\mathcal H^2[0, T]$.) Applying the Lipschitz condition, we get, for the constant $C = 2K^2(T + 1)$,
\[ E(X_t - Y_t)^2 \le C \int_0^t E(X_s - Y_s)^2\,ds < \infty. \]
Putting $g(t) = E(X_t - Y_t)^2$, we obtain
\[ g(t) \le C \int_0^t g(s)\,ds \quad \text{for } t \in [0, T]. \]
By iterating this inequality, we will find that $g$ is identically zero. Indeed, putting $M = \sup\{g(t) : t \in [0, T]\}$, we get $g(t) \le CMt$. Plugging this estimate back into the integral $n$ times, we obtain
\[ g(t) \le M C^n \frac{t^n}{n!} \to 0. \]
This means $g(t) = 0$ for all $t \in [0, T]$, and so for any fixed $t$, $X_t = Y_t$ a.s. But since $X$ and $Y$ are continuous, this means a.s., $X_t = Y_t$ for all $t \in [0, T]$.
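The iteration at the end of the proof can be visualized numerically (a sketch, not part of the notes): applying the map $g \mapsto C \int_0^t g(s)\,ds$ repeatedly to the constant function $g \equiv M$ drives the bound down like $M C^k t^k / k!$, with $C = 2$, $M = 1$, $T = 1$ as arbitrary illustrative choices:

```python
import numpy as np
from math import factorial

# Apply the map g -> C * int_0^t g(s) ds (trapezoid rule on a grid) repeatedly,
# starting from the constant function g = M, and compare with M C^k T^k / k!.
C, M, T, n = 2.0, 1.0, 1.0, 2000
dt = T / n
g = np.full(n + 1, M)

for k in range(1, 6):
    g = C * np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * dt)))
    print(k, g[-1], M * C**k * T**k / factorial(k))   # the two columns agree
```

Since the factorial beats any exponential, the iterated bound forces $g \equiv 0$.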