Some Terminology and Concepts that We will Use, But Not Emphasize (Section 6.2)


Statistical analysis is based on probability theory. The fundamental object in probability theory is a probability space, (Ω, F, P):

Ω: the sample space, or universe of discourse; a typical element is ω ∈ Ω
F: a σ-field on Ω; a collection of subsets of Ω
P: a probability measure on (Ω, F)

A random variable is a real-valued function defined on Ω. I use upper-case Latin letters to represent random variables (usually!). To emphasize that a random variable is a function, we sometimes write something like X(ω), where ω ∈ Ω. (Tsay writes X(η).)

Stochastic Processes

A stochastic process, or a time series, is a sequence of random variables in which the index is time, {X_t}.

We can also think of a sequence of probability spaces, {(Ω_t, F_t, P_t)}. This may not be very useful; everything is changing.

Another type of sequence of probability spaces is {(Ω, F_t, P_t)}. This still may not be very useful, but it may be useful if F_s ⊆ F_t for s < t. Such a nondecreasing sequence of σ-fields is called a filtration. This is what we will usually assume.

Martingales

If {X_t, F_t} is a sequence such that {F_t} is a filtration and

  E(X_{t+1} | F_t) = X_t,

then {X_t, F_t} is called a martingale. Thus, a martingale is a zero-drift stochastic process. For a martingale X_0, X_1, ... with X_0 = x_0, we have E(X_{t+1} − X_t) = 0 and E(X_T) = x_0.
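The zero-drift property E(X_T) = x_0 can be checked by simulation. A minimal Monte Carlo sketch (the symmetric ±1 random walk is a standard example of a martingale, not one taken from these slides; the path length and sample size are illustrative):

```python
import random

random.seed(42)

# Symmetric +/-1 random walk started at x_0 = 0: a discrete-time martingale,
# so E(X_T) should equal x_0 for any fixed T.
T = 10
n_paths = 100_000
total = 0.0
for _ in range(n_paths):
    x = 0
    for _ in range(T):
        x += random.choice((-1, 1))   # fair step: mean-zero increment
    total += x
mean_XT = total / n_paths   # Monte Carlo estimate of E(X_T); should be near 0
```

With 100,000 paths the standard error of the estimate is about √T/√n ≈ 0.01, so the sample mean should sit very close to 0.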

Probability Models in Continuous Time

We often observe a stochastic process {X_t} at regularly-spaced fixed points in time, in which case we just map time to some set of the integers and denote the terms in the process as X_1, X_2, ..., or maybe ..., X_{−1}, X_0, X_1, .... Even though we may not be able to observe a process in continuous time, it is easy to think of the index as continuous. When this is the case, we often write the random variable as X(t) instead of X_t.

Probability Models in Continuous Time

Instead of X_t or X(t), we might also write the sequence of random variables as X(ω_t). In this case, the model for a stochastic process begins with a sample space Ω over which there is a sampling sequence (ω_1, ω_2, ...), or more generally, in continuous time, ω(t). Each sequence yields a path or trajectory, and the sample space for the stochastic process becomes the set of paths.

Stochastic Processes in Continuous Time

Our approach to the analysis of stochastic processes in continuous time will involve the usual consideration of finite changes. The backshift operator B will not be used. We will use the difference operator Δ, but its meaning will change to correspond to its use in calculus.

ΔX_t means some small change in X_t (and there may be some ambiguity about forward change or backward change that will need to be resolved).

Δt means some small change in t, and it is usually associated with a ΔX_t (so if directions are an issue, both are in the same direction).

Brownian Motion or Wiener Processes

Suppose in a sequence W_0, W_1, ..., the distribution of W_{t+1} − W_t is normal with mean 0 and standard deviation 1. What is the distribution of W_{t+2} − W_t? It is normal with mean 0 and standard deviation √2. If we allow continuous time, what is the distribution of W_{t+0.5} − W_t? It must be normal with mean 0 and standard deviation √0.5. More generally, the distribution of the change over a time interval Δt has a standard deviation of √Δt.

This kind of process, with the Markovian property and with a normal distribution of the changes, leads to a Brownian motion or a Wiener process.

Wiener Processes

We start with two properties:

The change ΔW_t during a small period of time Δt is given by ΔW_t = Z√Δt, where Z is a realization of a variable with a N(0,1) distribution.

The values of ΔW_t for any two short, nonoverlapping intervals of time Δt are independent.

Wiener Processes over Longer Periods of Time

Now consider N time periods, and let T = N Δt. We have

  W(T) − W(0) = Σ_{i=1}^N Z_i √Δt.

Wiener Processes

As in ordinary calculus, we consider the limit as Δt → 0, and we have the differential equation

  dW = Z √dt.

This is a Brownian motion or a Wiener process. Terminology: sometimes it is dW that is called the Brownian motion or the Wiener process, and sometimes it is just W. The fact that we had √Δt in the original equation has important implications that we will return to later.

We can use the Wiener process to develop a generalized Wiener process:

  dS = µ dt + σ dW,

where µ and σ are constants: the drift and the volatility.
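The discretized construction can be checked numerically. A minimal simulation sketch (all parameter values are illustrative): sum increments ΔW = Z√Δt to get W(T), then verify that Var(W(T)) ≈ T and that the generalized Wiener process dS = µ dt + σ dW has mean S(0) + µT:

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretized Wiener paths: each increment is Z * sqrt(dt) with Z ~ N(0,1).
T, n_steps, n_paths = 1.0, 250, 20_000
dt = T / n_steps
Z = rng.standard_normal((n_paths, n_steps))
W_T = (Z * np.sqrt(dt)).sum(axis=1)      # W(T) - W(0) for each path

# Generalized Wiener process dS = mu dt + sigma dW (mu, sigma illustrative).
mu, sigma, S0 = 0.5, 0.2, 1.0
S_T = S0 + mu * T + sigma * W_T

var_WT = W_T.var()      # should be close to T = 1
mean_ST = S_T.mean()    # should be close to S0 + mu*T = 1.5
```

Refining the grid (larger n_steps) leaves these summary statistics unchanged, which is the point of the limiting construction.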

An Aside: Moments of a Normal Random Variable

The moment-generating function (MGF) for the random variable X, if it exists, is ψ_X(t) = E(e^{tX}). An important use of the MGF is in computing the moments. For the random variable X with MGF ψ_X(t), we have E(X^r) = ψ_X^{(r)}(0).

An Aside: Moments of a Normal Random Variable

The moment-generating function exists for many common distributions; in particular, for the N(µ, σ²) distribution we have

  ψ_X(t) = e^{µt + σ²t²/2}, for −∞ < t < ∞.

Hence, we get the raw moments as

  E(X)  = ψ_X^{(1)}(0) = µ
  E(X²) = ψ_X^{(2)}(0) = µ² + σ²
  E(X³) = ψ_X^{(3)}(0) = µ³ + 3µσ²
  E(X⁴) = ψ_X^{(4)}(0) = Exercise.
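The MGF-derived moments can be sanity-checked by Monte Carlo. A quick sketch (µ = σ = 1 chosen purely for illustration), comparing sample raw moments of N(µ, σ²) draws against the formulas above:

```python
import numpy as np

rng = np.random.default_rng(123)

# Sample raw moments of X ~ N(mu, sigma^2); compare with the MGF derivatives
# at 0: mu, mu^2 + sigma^2, mu^3 + 3*mu*sigma^2 (here 1, 2, and 4).
mu, sigma = 1.0, 1.0
X = mu + sigma * rng.standard_normal(400_000)

m1 = X.mean()          # ~ mu
m2 = (X**2).mean()     # ~ mu^2 + sigma^2
m3 = (X**3).mean()     # ~ mu^3 + 3*mu*sigma^2
```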

Additional Asides

Expectation is a linear operator; hence it is easy to determine expectations of linear transformations of variables. If

  Y = aX + b,

then

  E(Y) = aE(X) + b,
  E(Y²) = E((aX + b)²) = a²E(X²) + 2abE(X) + b²,

etc.

We can form relationships among expectations of powers and variances; for example:

  E(X²) = V(X) + (E(X))²,
  E(X⁴) = V(X²) + (E(X²))²,

etc.

Additional Asides (Continued)

For a standard normal random variable Z, that is, Z ~ N(0,1), we have Z² ~ χ²(1). Now a χ²(ν) is a gamma(ν/2, 2). The expectation of a gamma(α, β) is αβ and the variance is αβ². (These are easy relationships to remember.) From this, we have E(Z²) = 1 and V(Z²) = 2.

If X ~ N(µ, σ²), its relationship with a standard normal random variable Z is X = σZ + µ. This gives X² = σ²Z² + 2µσZ + µ². Hence, using the relationships above, we have

  E(X⁴) = V(X²) + (E(X²))²
        = σ⁴V(Z²) + 4µ²σ²V(Z) + (σ²E(Z²) + 2µσE(Z) + µ²)²
        = 2σ⁴ + 4µ²σ² + (σ² + µ²)²
        = 3σ⁴ + 6µ²σ² + µ⁴

(which you get using different methods in the suggested exercise).

Properties of the Discrete Process Underlying the Wiener Process

With ΔW = Z√Δt, we immediately have, from the previous results for the normal distribution,

  E(ΔW) = 0
  E((ΔW)²) = V(ΔW) + (E(ΔW))² = Δt
  E((ΔW)³) = 0
  E((ΔW)⁴) = V((ΔW)²) + (E((ΔW)²))² = 3(Δt)²

Because of independence, for Δ_iW and Δ_jW representing changes in two nonoverlapping intervals of time,

  E((Δ_iW)(Δ_jW)) = cov(Δ_iW, Δ_jW) = 0.
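These moment formulas follow directly from E(Z) = E(Z³) = 0, E(Z²) = 1, and E(Z⁴) = 3. A small sketch translating them into code (the function name and the example value Δt = 0.01 are just for illustration):

```python
# Moments of Delta W = Z * sqrt(dt) with Z ~ N(0,1), computed from the
# standard-normal moments E(Z)=0, E(Z^2)=1, E(Z^3)=0, E(Z^4)=3.
def dW_moments(dt):
    EdW  = 0.0            # E(Z) * sqrt(dt)
    EdW2 = dt             # E(Z^2) * dt
    EdW3 = 0.0            # E(Z^3) * dt^(3/2)
    EdW4 = 3 * dt**2      # E(Z^4) * dt^2
    return EdW, EdW2, EdW3, EdW4

m = dW_moments(0.01)      # (0, 0.01, 0, 3e-4)
```

The key observation used later is that E((ΔW)²) is of order Δt while its variance is of order (Δt)².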

The Wiener process is a random variable; that is, it is a real-valued mapping from a sample space Ω. We sometimes use the notation W(ω) to emphasize this fact. The Wiener process is also a function in continuous time. We sometimes use the notation W(t, ω) or W_t(ω) to emphasize the time dependency. Most of the time we drop the ω. Also, sometimes we write W_t instead of W(t). All of these notations are equivalent.

Continuing the Definition of a Wiener Process

There are two additional properties of a Wiener process or Brownian motion that we need in order to have a useful model: we need an initial value, and we need it to be continuous in time.

Because the Wiener process is a random variable, the values it takes are those of a function at some point in the underlying sample space, Ω. Therefore, when we speak of W(t) at some t, we must speak in terms of probabilities of values or ranges of values. When we speak of a particular value, the most we can say is that the value occurs almost surely. Almost surely means with probability 1.

Two Additional Properties of a Wiener Process That Are Almost Sure

We assume W(t) = 0 almost surely at t = 0. We assume W(t) is almost surely continuous in t. These two properties, together with the limiting forms of the two properties given at the beginning, define a Wiener process or Brownian motion. (There is a theorem due to Kolmogorov that states that, given the first three properties, there exists a version that is almost surely continuous in t.)

Some Asides: Almost Everywhere

In a measure space, a statement holds almost everywhere if it holds over the full space except possibly on a set of measure 0. In that case, we write the statement followed by "a.e.", or, if the measure is a probability measure, we usually use "a.s.".

For example, consider the function f_n(x) = n I_{[0,1/n]}(x) for n = 1, 2, ... and x ∈ IR, with the usual measure (Lebesgue). Then lim_{n→∞} f_n(x) = 0 for any x ≠ 0. So we have lim_{n→∞} f_n(x) = 0 a.e.

This is also a good example to show the effect of interchanging a limit and an integral (which is often defined in terms of a limit). Because of the property above,

  ∫ lim_{n→∞} f_n(x) dx = 0.

But because ∫ f_n(x) dx = 1 for every n = 1, 2, ...,

  lim_{n→∞} ∫ f_n(x) dx = 1.
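The failure of the limit/integral interchange can be seen numerically. A small sketch (grid size and sample points are illustrative) evaluating f_n pointwise and approximating its integral with a midpoint Riemann sum on [0, 1]:

```python
# f_n(x) = n * I_[0, 1/n](x): pointwise limit 0 for x > 0, yet every
# integral equals 1.
def f(n, x):
    return float(n) if 0.0 <= x <= 1.0 / n else 0.0

h = 1e-5
def integral(n):
    # midpoint Riemann sum of f_n over [0, 1]
    return sum(f(n, (k + 0.5) * h) for k in range(100_000)) * h

pointwise = [f(n, 0.3) for n in (1, 2, 10, 100)]   # 0 once n > 1/0.3
integrals = [integral(n) for n in (1, 10, 1000)]   # each close to 1
```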

Almost Surely

If the measure is a probability measure, instead of "almost everywhere" or "a.e." we use the term "almost surely" or "a.s.". Almost surely means with probability 1. If X ~ U(0,1), what is the probability that X = 0.5? It is 0. So we could say X ∈ [0, 0.5) ∪ (0.5, 1] a.s.

The term is most often used for a type of convergence of a sequence of random variables, {X_t}. We say {X_t} converges to Y almost surely, and write

  X_t → Y a.s.,

if

  lim_{t→∞} X_t = Y a.s.

Here, Y could be a random variable, or it may be a fixed number.

Some Asides: Convergence

If {X_t} converges to Y almost surely, we also write X_t →^{a.s.} Y. Convergence almost surely is also called strong convergence. There are several kinds of weaker convergence. The most common is convergence in probability. We say {X_t} converges to Y in probability, and write

  X_t →^p Y,

if for any fixed ɛ > 0,

  lim_{t→∞} Pr(|X_t − Y| > ɛ) = 0.

We also have convergence in r-th moment, X_t →^{Lr} Y:

  lim_{t→∞} E(|X_t − Y|^r) = 0.

For r = 2 we also write ms-lim_{t→∞} X_t = Y.

Now, ... back to Wiener processes.

Properties of a Wiener Process

From the definition, we can see immediately that

the Wiener process is Markovian;

the Wiener process is a martingale.

Generalized Wiener Processes

A Wiener process or Brownian motion is a model for changes. It models diffusion. If the process drifts over time (in a constant manner), we can add a term for the drift, a dt. More generally, a model for the state of a process that has both a Brownian diffusion and a drift is a generalized Wiener process:

  dS = a dt + b dW,

where a and b are constants. A generalized Wiener process is a type of a more general drift-diffusion process. While the expected value of the Wiener process at any time is 0, the expected value of the state S is not necessarily 0. Likewise, the variance is affected by b. Both the expected value and the variance of S are functions of time.

Properties of a Wiener Process One of the most interesting properties of a Wiener process is that it is not differentiable. (We also say that its first variation is infinite.) It is infinitely wiggly. 24

Variation in Normal Processes at Shorter Time Intervals

[Three panels of simulated paths W(t) plotted against t from 0 to 100, each on a vertical scale of roughly −2 to 2.]

An Aside: Continuity and Differentiability

We consider three successively stronger types of continuity, and one modification of the strong type.

Definition. Let f be a real-valued function whose domain includes a set D ⊆ IR^d. We say that f is uniformly continuous over D if, given ɛ > 0, there exists a δ > 0 such that for all x, y ∈ D with ‖x − y‖ < δ, |f(x) − f(y)| < ɛ.

Continuity is a point-wise property, while uniform continuity is a property for all points in some given set.

An Aside: Continuity and Differentiability

Example: The function f(x) = 1/x is continuous on ]0, ∞[, but is not uniformly continuous over that interval. This function is, however, uniformly continuous over any closed and bounded subinterval of ]0, ∞[. The Heine-Cantor theorem, in fact, states that any function that is continuous over a compact set is uniformly continuous over that set.

If {x_n} is a Cauchy sequence in the domain of a uniformly continuous function f, then {f(x_n)} is also a Cauchy sequence. If a function f is uniformly continuous over a finite interval ]a, b[, then f is bounded over ]a, b[.

An Aside: Continuity and Differentiability (Continued)

Definition. Let f be a real-valued function defined on [a, b] (its domain may be larger). We say that f is absolutely continuous on [a, b] if, given ɛ > 0, there exists a δ such that for every finite collection of nonoverlapping open intervals ]x_i, y_i[ ⊆ [a, b] with Σ_{i=1}^n |x_i − y_i| < δ,

  Σ_{i=1}^n |f(x_i) − f(y_i)| < ɛ.

If f is absolutely continuous over D, it is uniformly continuous on D, but the converse is not true.

An Aside: Continuity and Differentiability (Continued)

Example: The Cantor function, defined over the interval [0, 1], is an example of a function that is continuous everywhere, and hence uniformly continuous on that compact set, but not absolutely continuous. The Cantor function takes different values over the different intervals used in the construction of the Cantor set. The Cantor set is ∩_{i=1}^∞ C_i, where

  C_1 = [0, 1/3] ∪ [2/3, 1],
  C_2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1],

and so on.

Example Continued: The Cantor Function

Let f_0(x) = x, and then for n = 0, 1, ..., let

  f_{n+1}(x) = 0.5 f_n(3x)                 for 0 ≤ x < 1/3,
  f_{n+1}(x) = 0.5                         for 1/3 ≤ x < 2/3,
  f_{n+1}(x) = 0.5 + 0.5 f_n(3(x − 2/3))   for 2/3 ≤ x ≤ 1.

The Cantor function is f(x) = lim_{n→∞} f_n(x). The Cantor function has a derivative of 0 almost everywhere, but has no derivative at any member of the Cantor set.
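The iterative construction above translates directly into a recursion. A minimal sketch, truncating at a finite depth (60 levels is far beyond double precision, so the truncation error is negligible at the points checked):

```python
# Finite-depth approximation of the Cantor function via the slide's recursion.
def cantor(x, depth=60):
    if depth == 0:
        return x                      # f_0(x) = x
    if x < 1/3:
        return 0.5 * cantor(3 * x, depth - 1)
    if x < 2/3:
        return 0.5                    # flat middle third
    return 0.5 + 0.5 * cantor(3 * (x - 2/3), depth - 1)

# cantor(1/4) = 1/3 is a classical exact value of the Cantor function.
values = [cantor(x) for x in (0.0, 0.25, 0.5, 1.0)]
```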

An Aside: Continuity and Differentiability (Continued) An absolutely continuous function is of bounded variation; it has a derivative almost everywhere; and if the derivative is 0 a.e., the function is constant. 31

An Aside: Continuity and Differentiability (Continued)

A slightly stronger form of continuity is Lipschitz continuity. It places an explicit bound on the amount by which the function can change.

Definition. Let f be a real-valued function whose domain is an interval D ⊆ IR^d. We say that f is Lipschitz-continuous if there exists γ such that for any y_1, y_2 ∈ D with y_1 ≠ y_2,

  |f(y_1) − f(y_2)| ≤ γ ‖y_1 − y_2‖.

The smallest γ for which the inequality holds is called the Lipschitz constant.

An Aside: Continuity and Differentiability (Continued)

Every Lipschitz-continuous function is absolutely continuous. Lipschitz continuity plays an important role in nonparametric function estimation.

The graph of a scalar-valued Lipschitz-continuous function f over D ⊆ IR has the interesting geometric property that the entire graph of f(x) lies between the lines y = f(c) ± γ(x − c) for any c ∈ D.

Example: The function f(x) = √x for x ∈ [0, 1] is absolutely continuous on [0, 1], but is not Lipschitz-continuous on that set. (The problem with Lipschitz continuity occurs at x = 0.)

An Aside: Continuity and Differentiability (Continued)

Finally, a slight modification of Lipschitz continuity yields another form of continuity called uniform Lipschitz continuity of order α, or Hölder continuity of order α.

Definition. Let f be a real-valued function whose domain is an interval D ⊆ IR^d. We say that f is Hölder-continuous of order α, where α > 0, if there exists γ such that for any y_1, y_2 ∈ D,

  |f(y_1) − f(y_2)| ≤ γ ‖y_1 − y_2‖^α.

An Aside: Continuity and Differentiability (Continued)

Depending on α, Hölder continuity may be stronger or weaker than Lipschitz continuity (α = 1 is exactly Lipschitz continuity). For α < 1, Hölder continuity does not guarantee differentiability anywhere, whereas Lipschitz continuity does guarantee it except on a set of measure 0 (Rademacher's theorem).

An Aside: Continuity and Differentiability (Continued) Continuity has to do with how function values change as the function argument changes. A continuous function does not have abrupt changes. Differentiability is a related concept that has to do with the rate of change. 36

An Aside: Continuity and Differentiability (Continued)

Definition. Let x be a point in IR and let f be a real-valued function defined in an open neighborhood of x. We say that f is differentiable at the point x if the limit

  lim_{h→0} (f(x + h) − f(x)) / h

exists. If the limit exists, it is called the derivative of f at the point x. Wherever it exists, the derivative is a function, and we often denote it as f′(x).

An Aside: Continuity and Differentiability (Continued)

Differentiability obviously depends on continuity, but does continuity guarantee differentiability?

Example: The Weierstrass function, defined over the interval [−2, 2], is an example of a function that is continuous everywhere but differentiable nowhere. The Weierstrass function is

  f(x) = Σ_{n=0}^∞ a^n cos(b^n π x),

where 0 < a < 1 and b is a positive odd integer such that ab > 1 + 3π/2. Graph this function. (It's self-similar.)
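The series is easy to evaluate by truncation. A small sketch (a = 0.5 and b = 13 are one illustrative choice satisfying the conditions; at x = 0 every cosine is 1, so the sum telescopes to the geometric series 1/(1 − a) = 2):

```python
import math

# Truncated Weierstrass function.  a = 0.5, b = 13 satisfies 0 < a < 1,
# b a positive odd integer, and ab = 6.5 > 1 + 3*pi/2 (about 5.712).
a, b, N = 0.5, 13, 60

def weierstrass(x):
    return sum(a**n * math.cos(b**n * x * math.pi) for n in range(N))

w0 = weierstrass(0.0)   # geometric series: sum of a^n = 1/(1 - a) = 2
```

Evaluating the truncated sum on a fine grid (and zooming in) is how one would "graph this function" and see the self-similarity; note that for large n the argument b^n x π overwhelms double precision away from x = 0, so plots should truncate the series well before roundoff dominates.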

An Aside: Continuity and Differentiability (Continued)

The Weierstrass function shows that continuity, even uniform continuity on a compact set, is not sufficient to guarantee differentiability anywhere. (The Weierstrass function is Hölder-continuous for all orders α < 1.) Among the forms of continuity we have discussed, absolute continuity is the weakest that guarantees differentiability almost everywhere. Even Lipschitz continuity does not guarantee differentiability at every point; for example, f(x) = |x| is Lipschitz-continuous over [−a, a], but it is not differentiable at x = 0.

An Aside: Variation of Functionals

The variation of a functional is a measure of its rate of change. It is similar in concept to an integral of a derivative of a function. For studying variation, we will be interested only in functions from the interval [0, T] to IR.

To define the variation of a general function f : [0, T] → IR, we form N intervals with partition points 0 = t_0 ≤ t_1 ≤ ... ≤ t_N = T. The intervals are not necessarily of equal length, so we define Δ as the maximum length of any interval; that is, Δ = max(t_i − t_{i−1}). Now, we denote the p-th variation of f as V_p(f) and define it as

  V_p(f) = lim_{Δ→0} Σ_{i=1}^N |f(t_i) − f(t_{i−1})|^p.

(Notice that Δ → 0 implies N → ∞.)

First Variation of Functionals

With equal intervals Δt, for the first variation we can write

  V_1(f) = lim_{Δt→0} Σ_{i=1}^N |f(t_i) − f(t_{i−1})|
         = lim_{N→∞} Σ_{i=0}^{N−1} (|f(t_i + Δt) − f(t_i)| / Δt) Δt,

from which we can see that for a differentiable function f : [0, T] → IR,

  V_1(f) = ∫_0^T |df/dt| dt.

The notation FV(f) is sometimes used instead of V_1(f).

Second Variation of Functionals

Again with equal intervals Δt, for the second variation we can write

  V_2(f) = lim_{Δt→0} Σ_{i=1}^N (f(t_i) − f(t_{i−1}))²
         = lim_{Δt→0} Δt Σ_{i=0}^{N−1} ((f(t_i + Δt) − f(t_i)) / Δt)² Δt.

For a differentiable function f : [0, T] → IR, we have

  V_2(f) = lim_{Δt→0} Δt ∫_0^T |df/dt|² dt.

The integrand is bounded; therefore this limit is 0, and we conclude that the second variation of a differentiable function is 0.

*** discuss roughness ***

Variation of Stochastic Functionals

If X is a stochastic functional, then V_p(X) is also stochastic. If it converges to a deterministic quantity, the nature of the convergence must be considered.

First and Second Variation of a Wiener Process

Two important properties of a Wiener process on [0, T] are:

V_2(W) = T a.s., which from the properties of second variation implies that W(t) is not differentiable;

V_1(W) = ∞ a.s.

Notice that because W is a random variable, we must temper our statements with a phrase about the probability or expected value.

Since the second variation is nonzero, W cannot be differentiable. But also, because of the continuity of W in t, it is easy to see that the first variation diverges if the second variation converges to a finite value. This is because

  Σ_{n=0}^{N−1} (W(t_{n+1}) − W(t_n))² ≤ sup_n |W(t_{n+1}) − W(t_n)| · Σ_{n=0}^{N−1} |W(t_{n+1}) − W(t_n)|.

In the limit, the term on the left is T > 0, while on the right the sup goes to 0 (by continuity) and the sum is V_1(W); therefore V_1(W) = ∞.
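The contrast between a differentiable function and a Wiener path can be seen on a discretized grid. A simulation sketch (grid size and the choice of sin(t) as the smooth function are illustrative): the second variation of the smooth function is tiny, while the Wiener path has second variation near T and a huge first variation:

```python
import numpy as np

rng = np.random.default_rng(7)

# Discretized first and second variation on an equally spaced grid.
def variations(f_vals):
    d = np.abs(np.diff(f_vals))
    return d.sum(), (d**2).sum()

T, n = 1.0, 200_000
t = np.linspace(0.0, T, n + 1)

# Differentiable function: V_1 finite (here sin(1), since sin is increasing
# on [0, 1]), V_2 -> 0 as the mesh is refined.
V1_smooth, V2_smooth = variations(np.sin(t))

# Simulated Wiener path: V_2 close to T, V_1 very large (diverges as n grows).
W = np.concatenate(([0.0], np.cumsum(rng.standard_normal(n) * np.sqrt(T / n))))
V1_W, V2_W = variations(W)
```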

Properties of Differentials

Although ΔW and dW are random variables, the product dW dW is deterministic. We can see this by considering the stochastic process (ΔW)². We have seen that V((ΔW)²) = 2(Δt)², so the variance of this process is 2(Δt)²; that is, as Δt → 0, the variance of this process goes to 0 faster than Δt, namely as (Δt)².

Also, as we have seen, E((ΔW)²) = Δt, and so (ΔW)² goes to Δt at the same rate as Δt → 0. That is,

  (ΔW)(ΔW) → Δt a.s. as Δt → 0.

Properties of Differentials (Continued)

The convergence of (ΔW)(ΔW) to Δt as Δt → 0 yields

  dW dW = dt.

(This equality is almost sure.) But dt is a deterministic quantity. This is one of the most remarkable facts about a Wiener process.

Multidimensional Wiener Processes

If we have two Wiener processes W_1 and W_2, with V(dW_1) = V(dW_2) = dt and cov(dW_1, dW_2) = ρ dt (that is, corr(dW_1, dW_2) = ρ), then by a similar argument as before we have dW_1 dW_2 = ρ dt, almost surely. Again, this is deterministic.

The results of course extend to any vector of Wiener processes (W_1, ..., W_d). If (W_1, ..., W_d) arise from ΔW_i = X_i √Δt, where the vector of Xs has a multivariate normal distribution with mean 0 and variance-covariance matrix Σ, then the variance-covariance matrix of (dW_1, ..., dW_d) is Σ dt, which is deterministic.

Multidimensional Wiener Processes

Note my notation for vectors! (discussed in class)

Starting with (Z_1, ..., Z_d) i.i.d. N(0,1) and forming the Wiener processes B = (B_1, ..., B_d) beginning with ΔB_i = Z_i √Δt, we can form a vector of Wiener processes W = (W_1, ..., W_d) with variance-covariance matrix Σ dt for dW = (dW_1, ..., dW_d) by the transformation

  W = Σ^{1/2} B,

or equivalently by

  W = Σ_C B,

where Σ_C is a Cholesky factor of Σ, that is, Σ_C^T Σ_C = Σ.

Recall, for a fixed matrix A,

  V(AY) = A^T V(Y) A,   ***notation

so from above, for example,

  V(dW) = Σ_C^T V(dB) Σ_C = Σ_C^T diag(dt) Σ_C = Σ dt.
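The Cholesky construction is easy to check by simulation. A sketch for d = 2 (ρ, Δt, and the sample size are illustrative; note that numpy's `cholesky` returns the lower-triangular factor L with L Lᵀ = Σ, the transpose of the Σ_C convention on the slide, so the multiplication is arranged accordingly):

```python
import numpy as np

rng = np.random.default_rng(11)

# Correlated Wiener increments from i.i.d. normal increments via a
# Cholesky factor of Sigma.
rho, dt, n = 0.6, 0.001, 200_000
Sigma = np.array([[1.0, rho], [rho, 1.0]])
L = np.linalg.cholesky(Sigma)            # lower triangular, L @ L.T == Sigma

dB = rng.standard_normal((n, 2)) * np.sqrt(dt)   # independent increments
dW = dB @ L.T                                    # correlated increments

sample_cov = np.cov(dW.T)                # should be close to Sigma * dt
sample_corr = sample_cov[0, 1] / np.sqrt(sample_cov[0, 0] * sample_cov[1, 1])
```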

Stochastic Integrals With Respect to Wiener Processes

Stochastic differentials such as dW naturally lead us to consider integrals with respect to stochastic differentials, that is, stochastic integrals. If W is a Wiener process on [0, T], we may be interested in an integral of the form

  ∫_0^T g(Y(t), t) dW,

where Y(t) is a stochastic process (that is, Y is a random variable) and g is some function.

The problem with developing a definition of this integral following the same steps as in the definition of a Riemann integral, that is, as a limit of sequences of sums of areas of rectangles, is that because the sides of these rectangles, Y and dW, are random variables, there are different kinds of convergence of a limit. Also, the convergence of the products depends on where Y(t) is evaluated.

Ito Integral

We begin developing a definition of

  ∫_0^T g(Y(t), t) dW

by considering how the Riemann integral is defined, in terms of the sums

  I_n(t) = Σ_{i=0}^{n−1} g(Y(τ_i), τ_i)(W(t_{i+1}) − W(t_i)),

where 0 = t_0 ≤ τ_0 ≤ t_1 ≤ τ_1 ≤ ... ≤ τ_{n−1} ≤ t_n = T. As in the Riemann case, we will define the integral in terms of a limit as the mesh size goes to 0.

First, the existence depends on a finite expectation that is similar to a variance. We assume

  E(∫_0^T g(Y(t), t)² dt) < ∞.

Ito Integral

As mentioned before, the convergence must be qualified because the terms being summed are random variables; furthermore (although it is not obvious!), the convergence depends on where τ_i is in the interval [t_i, t_{i+1}].

The first choice in the definition of the Ito stochastic integral is to take τ_i = t_i. Other choices, such as choosing τ_i to be at the midpoint of the interval, lead to different types of stochastic integrals.

Next is the definition of the type of convergence. In the Ito stochastic integral, the convergence is in mean square, that is, L_2 convergence.

Definition of the Ito Integral

With the two choices we have made, we take

  I_n(t) = Σ_{i=0}^{n−1} g(Y(t_i), t_i)(W(t_{i+1}) − W(t_i)),

and the Ito integral is defined as

  I(t) = ms-lim_{n→∞} I_n(t).

This integral based on a Wiener process is used throughout financial analysis. Note that this integral is a random variable; in fact, it is a stochastic process. This is because of the fact that the differentials are from a Wiener process. Also, because the integral is defined by a Wiener process, it is a martingale.
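The left-endpoint choice can be made concrete with the classic example g(Y(t), t) = W(t), i.e. the integral of W dW. For any discretized path the algebraic identity Σ W(t_i)ΔW_i = W(T)²/2 − (1/2) Σ (ΔW_i)² holds exactly, and since Σ (ΔW_i)² → T, the left-endpoint sums converge to W(T)²/2 − T/2 rather than the naive W(T)²/2. A sketch (path length illustrative):

```python
import random, math

random.seed(3)

# One discretized Wiener path on [0, T].
T, n = 1.0, 100_000
dt = T / n
W = [0.0]
for _ in range(n):
    W.append(W[-1] + random.gauss(0.0, 1.0) * math.sqrt(dt))

# Left-endpoint (Ito) sum for the integral of W dW.
left_sum = sum(W[i] * (W[i + 1] - W[i]) for i in range(n))
# Quadratic variation of the path; close to T.
quad_var = sum((W[i + 1] - W[i]) ** 2 for i in range(n))

# Exact summation-by-parts identity for any path.
identity_gap = abs(left_sum - (W[-1] ** 2 / 2 - quad_var / 2))
# The Ito value the sums converge to.
ito_value = W[-1] ** 2 / 2 - T / 2
```

Replacing W(t_i) by W(t_{i+1}) in the sum changes the result by exactly the quadratic variation, which is why the evaluation point matters in a way it never does for Riemann integrals.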

Ito Processes

An Ito process is a generalized Wiener process dX = a dt + b dW in which the parameters a and b are functions of the underlying variable X and of time t (of course, X is also a function of t). The functions a and b must be measurable with respect to the filtration generated by W(t), that is, with respect to the sequence of smallest σ-fields with respect to which W(t) is measurable. (This is expressed more simply by saying a and b are adapted to the filtration generated by W(t).)

The Ito process is of the form

  dX(t) = a(X(t), t) dt + b(X(t), t) dW.

Ito Processes

The Ito integral (or any other stochastic integral) gives us a solution to this stochastic differential equation:

  X(T) = X(0) + ∫_0^T a(X(t), t) dt + ∫_0^T b(X(t), t) dW(t).

(The differential in the first integral is deterministic, although the integrand is stochastic. The second integral, however, is a stochastic integral. Other definitions of this integral would require modifications in the interpretation of properties of the Ito process.)

We are often interested in multidimensional Ito processes. Their second-order properties (variances and covariances) behave very similarly to those of Wiener processes, which we discussed earlier.
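The discretized form of this solution is the Euler-Maruyama scheme: step X forward with a(X, t)Δt + b(X, t)ΔW. A minimal sketch, using the Ornstein-Uhlenbeck choice a(x, t) = −θx, b(x, t) = σ as an illustrative example (not from the slides), for which E(X(t)) = X(0)e^{−θt} gives something to check:

```python
import numpy as np

rng = np.random.default_rng(21)

# Euler-Maruyama for dX = a(X,t) dt + b(X,t) dW with the illustrative
# Ornstein-Uhlenbeck coefficients a(x,t) = -theta*x, b(x,t) = sigma.
theta, sigma, X0 = 1.0, 0.5, 1.0
T, n_steps, n_paths = 1.0, 1_000, 20_000
dt = T / n_steps

X = np.full(n_paths, X0)
for _ in range(n_steps):
    dW = rng.standard_normal(n_paths) * np.sqrt(dt)
    X = X + (-theta * X) * dt + sigma * dW

mean_X1 = X.mean()   # should be near X0 * exp(-theta * T) = exp(-1)
```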

A Model for the Price of a Financial Asset in Continuous Time

In applications, both the drift component and the diffusion component are rates that depend on the magnitude of the current state. If we factor that value out of the Ito process defined above and let µ and σ represent the adjusted functions a and b, we have

  dX(t)/X(t) = µ(X(t), t) dt + σ(X(t), t) dW.

This Ito process is widely used in financial applications. In this form, µ(·) is called the drift, and the diffusion component σ(·) is called the volatility.

Geometric Brownian Motion

The Ito process would be much easier to work with if µ(·) and σ(·) did not depend on the value of the state; that is, if we use the model

  dS(t)/S(t) = µ(t) dt + σ(t) dW,

where I have switched to S(t) because I'm thinking of the price of a stock. The Ito process would be even easier to work with if µ(·) and σ(·) were constant; that is, if we just use the model

  dS(t)/S(t) = µ dt + σ dW.

This model is called a geometric Brownian motion, and it is widely used in modeling prices of various financial assets.

More on Prices of Financial Assets

For our initial applications we will use the simple geometric Brownian motion model. What is the meaning of the parameters in the model? How would you estimate them for a given security? Obviously, we can't work with dt, so how long is Δt?

The differential is equivalent to continuous compounding. Recall that if an amount A is invested for n periods at a per-period rate R that is compounded m times per period, the terminal value is

  A(1 + R/m)^{nm}.

The limit of the terminal value as m → ∞ is A e^{Rn}. In practice, Δt can be taken as one year. The continuous compounding formula can be used to adjust.

We estimate µ and σ from the historical rates of return: the mean and the standard deviation, respectively. It turns out that the main one of interest is σ, and it is not obvious how best to estimate it. Here's a widely-used formula for the annualized volatility, given N periods each of length Δt (measured in years), with closing prices S_0, S_1, ..., S_N and r_i = log(S_i / S_{i−1}):

  σ̂ = √( (1/Δt) · (1/(N−1)) Σ_{i=1}^N (r_i − r̄)² ).
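The estimator is simple to apply. A sketch on synthetic data (the parameter values and the use of ten years of daily closes are illustrative): generate geometric-Brownian-motion closing prices with a known σ, then recover it from the log returns:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic daily closing prices from geometric Brownian motion with
# known annualized volatility sigma (mu, sigma, N illustrative).
mu, sigma, dt, N = 0.08, 0.20, 1.0 / 252, 2_520   # ten years of daily closes
z = rng.standard_normal(N)
log_returns = (mu - sigma**2 / 2) * dt + sigma * np.sqrt(dt) * z
S = 100.0 * np.exp(np.cumsum(log_returns))        # closing prices S_1..S_N

# The slide's estimator: r_i = log(S_i / S_{i-1}), sample variance scaled
# by 1/dt, then the square root.
r = np.diff(np.log(np.concatenate(([100.0], S))))
sigma_hat = np.sqrt(r.var(ddof=1) / dt)           # should be near 0.20
```

The 1/Δt factor is what annualizes the per-period variance of the log returns.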

The Geometric Brownian Motion Drift-Diffusion Model

A simple form of this model,

  dS(t) = µS(t) dt + σS(t) dW(t),

in which µ and σ are constants, leads to the Black-Scholes theory of options pricing. This model is called geometric Brownian motion (from the use of "geometric" to refer to series with multiplicative changes, as opposed to "arithmetic" series, which have additive changes).

Geometric Brownian Motion

Note that as a model for the rate of return dS(t)/S(t), geometric Brownian motion is similar to other common statistical models:

  dS(t)/S(t) = µ dt + σ dW(t),

or

  response = systematic component + random error.

Without the stochastic component, the differential equation has the simple solution S(t) = c e^{µt}, from which we get the formula for continuous compounding at a rate µ.

An Intuitive Examination of Geometric Brownian Motion in Prices

What rate of growth do we expect for S in the geometric Brownian motion model

  dS(t)/S(t) = µ dt + σ dW(t)?

Should it be µ, because that is the rate for the systematic component and the expected value of the random component is 0?

Consider a rate of change σ that is equally likely to be positive or negative. What is the effect on a given quantity if there is an uptick of σ followed by a downtick of equal magnitude (or a downtick followed by an uptick)? The result for the two periods is −σ². (This comes from the multiplication of the given quantity by (1 + σ)(1 − σ) = 1 − σ².) The average over the two periods is −σ²/2. The stochastic component reduces the expected rate of µ by σ²/2. This is the price of risk.
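This µ − σ²/2 effect shows up directly in simulation. A sketch (parameter values illustrative) using the multiplicative update S ← S(1 + µΔt + σ√Δt Z): the average growth rate of log S comes out near µ − σ²/2, not µ:

```python
import numpy as np

rng = np.random.default_rng(9)

# Multiplicative price updates; the sample mean of log S(T)/S(0) per unit
# time should be close to mu - sigma^2/2 (here 0.10 - 0.045 = 0.055).
mu, sigma = 0.10, 0.30
T, n_steps, n_paths = 1.0, 500, 20_000
dt = T / n_steps

S = np.ones(n_paths)
for _ in range(n_steps):
    Z = rng.standard_normal(n_paths)
    S *= 1.0 + mu * dt + sigma * np.sqrt(dt) * Z

growth = np.log(S).mean() / T   # expected log-growth rate
```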